I work at Spotify on backend infrastructure. In this context, infrastructure is the shared plumbing and platform on which various Spotify systems run. In particular, I work on an open source tool called Helios. This project and, more importantly, my team are pretty awesome.
What is Helios?
Helios is a Docker orchestration framework. This means it’s a tool used to manage your Docker containers across a fleet of hosts.
Why did Spotify create Helios?
Let’s say you have 20 hosts, and you want to run a Docker container named “hello-world” on each of them. You’ve installed the Docker daemon on each. Now you SSH into each of them and run
docker run hello-world
Doing this 20 times is tedious.
So you use Cluster SSH or Fabric and run it once. That works.
Your application grows and you need environment variables, exposed ports, and mounted volumes. Your Docker command becomes longer:
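As an illustration only, such a command might look like the sketch below. The image name, container name, environment variable, port, and volume path are all made up for this example. The script prints the command instead of running it, so you can inspect it on a machine without Docker.

```shell
#!/bin/sh
# Hypothetical values for illustration only; none come from a real deployment.
IMAGE="example/my-service:1.0"

# Print the full docker run command for the service.
#   -d      run the container in the background
#   --name  give the container a fixed name
#   -e      set an environment variable inside the container
#   -p      publish a port (host:container)
#   -v      mount a host directory as a volume (host:container)
run_cmd() {
  echo docker run -d \
    --name my-service \
    -e "APP_ENV=production" \
    -p 8080:8080 \
    -v /var/log/my-service:/logs \
    "$IMAGE"
}

run_cmd
```

Dropping the echo turns the dry run into the real thing.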
You need to remember this long command, so you save it to a file in your code repository. This works OK.
You notice your Docker containers sometimes crash when you start them. You run
watch 'docker ps'
to see which containers have crashed on which hosts. You have to tail the logs on those hosts to figure out what went wrong. Hm, this is becoming hard to manage.
One day, one of your hosts restarts. You don’t notice that the container is no longer running on that host until several days later. Maybe it’s time to think of a better solution.
These and many other reasons are why Spotify created Helios. Helios makes it easy to deploy to multiple hosts, keeps track of which containers are running where, and restarts containers if they crash.
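To give a feel for the workflow, here is a minimal sketch using the Helios command-line client. It assumes the helios CLI is installed and already configured to reach a Helios master, and that the listed hosts run the Helios agent. The job name, image, and host names are hypothetical. Setting HELIOS="echo helios" makes the function print the commands instead of running them.

```shell
#!/bin/sh
# Hypothetical job, image, and hosts for illustration only.
JOB="hello-world:v1"
IMAGE="hello-world"
HOSTS="host1.example.com host2.example.com"

# Command used to invoke the Helios CLI.
# Override with HELIOS="echo helios" for a dry run.
HELIOS="${HELIOS:-helios}"

deploy_everywhere() {
  # Register the job (a name:version pair plus the Docker image to run).
  $HELIOS create "$JOB" "$IMAGE"
  # Ask Helios to run the job on every listed host.
  # $HOSTS is left unquoted on purpose so each host becomes its own argument.
  $HELIOS deploy "$JOB" $HOSTS
  # Show the state of deployed jobs across hosts.
  $HELIOS status
}
```

Calling deploy_everywhere replaces the SSH-into-every-host routine with three commands run from one place.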