Using a service registry or a key-value store to store service state
We'll continue using Docker Flow Proxy as a playground to explore some of the mechanisms and decisions we might make when dealing with stateful services. Please note that, in this chapter, we are concentrating on services with a relatively small state. We'll explore other use cases in the chapters that follow.
Imagine that the proxy does not use Consul to store data and that we do not use volumes. What would happen if we were to scale it up? The new instances would be out of sync. Their state would be the same as the initial state of the first instance we created. In other words, there would be no state, even though the instances that were already running had changed over time and generated data.
That is where Consul comes into play. Every time an instance of the proxy receives a request that results in a change of its state, it propagates that change to the other instances as well as to Consul. On the other hand, the first action the proxy performs when initialized is to query Consul and create its configuration from the data stored there.
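Such a state change is, in essence, a write to Consul's key-value store through its HTTP API. As a rough sketch (the key mirrors one of the entries we are about to list, but the exact requests the proxy issues internally may differ), a write looks as follows:
curl -X PUT -d "/demo" \
"http://$(docker-machine ip swarm-1):8500/v1/kv/\
docker-flow/go-demo/path"
Consul responds with true when the write succeeds.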
We can observe the state stored in Consul by sending a request for all the data with keys starting with docker-flow:
curl "http://$(docker-machine ip swarm-1):8500/v1/kv/\
docker-flow?recurse"
A part of the output is as follows:
[
    ...
    {
        "LockIndex": 0,
        "Key": "docker-flow/go-demo/path",
        "Flags": 0,
        "Value": "L2RlbW8=",
        "CreateIndex": 233,
        "ModifyIndex": 245
    },
    ...
    {
        "LockIndex": 0,
        "Key": "docker-flow/go-demo/port",
        "Flags": 0,
        "Value": "ODA4MA==",
        "CreateIndex": 231,
        "ModifyIndex": 243
    },
    ...
]
The preceding output shows that the path and the port we specified when we reconfigured the proxy for the go-demo service are stored in Consul. If we instruct the Swarm manager to scale the proxy service, new instances will be created. Those instances will query Consul and use the information to generate their configurations.
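Please note that Consul stores values base64-encoded, which is why the Value fields in the output are not human-readable. If you want to verify their contents, you can decode them (GNU base64 shown; on macOS the decode flag may be -D):
echo "L2RlbW8=" | base64 -d
echo "ODA4MA==" | base64 -d
The decoded values are /demo and 8080, matching the path and the port we used when we reconfigured the proxy for the go-demo service.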
Let's give it a try:
docker service scale proxy=6
We increased the number of instances from three to six.
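If you'd like to confirm that all six replicas are up, you can list the tasks of the service:
docker service ps proxy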
Let's take a sneak peek into instance number six:
NODE=$(docker service ps proxy | grep "proxy.6" | awk '{print $4}')
eval $(docker-machine env $NODE)
ID=$(docker ps | grep "proxy.6" | awk '{print $1}')
docker exec -it $ID cat /cfg/haproxy.cfg
A part of the output of the exec command is as follows:
frontend services
    bind *:80
    bind *:443
    mode http

backend go-demo-be8080
    mode http
    server go-demo go-demo:8080
As you can see, the new instance retrieved all the information from Consul. As a result, its state became the same as the state of any other proxy instance running inside the cluster.
If we destroy an instance, the result will, again, be the same. Swarm will detect that an instance crashed and schedule a new one. The new instance will repeat the same process of querying Consul and creating the same state as the other instances:
docker rm -f $(docker ps \
| grep proxy.6 \
| awk '{print $1}')
We should wait for a few moments until Swarm detects the failure and creates a new instance.
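To follow Swarm's progress, we can list the tasks that are supposed to be running; the replacement for proxy.6 will appear once it is scheduled:
docker service ps -f desired-state=running proxy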
Once it's running, we can take a look at the configuration of the new instance. It will be the same as before:
NODE=$(docker service ps \
-f desired-state=running proxy \
| grep "proxy.6" \
| awk '{print $4}')
eval $(docker-machine env $NODE)
ID=$(docker ps | grep "proxy.6" | awk '{print $1}')
docker exec -it $ID cat /cfg/haproxy.cfg
The explanation of Docker Flow Proxy's inner workings is mostly for educational purposes. I wanted to show you one of the possible solutions when dealing with stateful services. The methods we discussed are applicable only when the state is relatively small. When it is bigger, as is the case with databases, we should employ different mechanisms to accomplish the same goals.
If we go one level higher, the primary requirements, or prerequisites, when running stateful services inside a cluster are as follows:
- Ability to synchronize the state across all instances of the service.
- Ability to recover the state during initialization.
If we manage to fulfill those two requirements, we are on the right path towards solving one of the major bottlenecks when operating stateful services inside the cluster.