Using service registry to store the state
Now that we have the Consul instances set up, let us explore how to exploit them to our own benefit. We'll study the design of the Docker Flow Proxy as a way to demonstrate some of the challenges and solutions you might want to apply to your own services.
Let us create the proxy network and the service:
eval $(docker-machine env swarm-1)
docker network create --driver overlay proxy
docker service create --name proxy \
-p 80:80 \
-p 443:443 \
-p 8080:8080 \
--network proxy \
-e MODE=swarm \
--replicas 3 \
-e CONSUL_ADDRESS="$(docker-machine ip swarm-1):8500,\
$(docker-machine ip swarm-2):8500,\
$(docker-machine ip swarm-3):8500" \
vfarcic/docker-flow-proxy
The command we used to create the proxy service is slightly different from the ones we ran before. Namely, it now contains the CONSUL_ADDRESS environment variable with the comma-separated addresses of all three Consul instances. The proxy is built so that it tries the first address and, if it does not respond, moves on to the next one, and so on. That way, as long as at least one Consul instance is running, the proxy will be able to fetch and put data. We would not need such a list if Consul could run as a Swarm service. In that case, all we'd need to do is put both inside the same network and use the service name as the address.
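Just to illustrate how much simpler that would be, such a setup could be reduced to something along the following lines. Please treat it as a purely hypothetical sketch: the consul service, its arguments, and the CONSUL_ADDRESS value are assumptions, and the commands should not be run.
# Hypothetical sketch only; Consul cannot yet run as a Swarm service,
# so please do not run these commands.
docker service create --name consul \
--network proxy \
consul agent -server -bootstrap-expect 1 -client 0.0.0.0

docker service create --name proxy \
-p 80:80 \
-p 443:443 \
-p 8080:8080 \
--network proxy \
-e MODE=swarm \
--replicas 3 \
-e CONSUL_ADDRESS=consul:8500 \
vfarcic/docker-flow-proxy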
Unfortunately, Consul cannot yet run as a Swarm service, so we are forced to specify all the addresses. Refer to the following diagram:
Before we proceed, we should make sure that all instances of the proxy are running:
docker service ps proxy
Please wait until the current state of all the instances is set to Running.
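If you prefer not to rerun that command by hand, a rough convenience loop like the one below re-prints the status every few seconds. Stop it with Ctrl+C once all the replicas report Running:
# Re-print the proxy status every five seconds; interrupt with Ctrl+C
# once every replica's current state is Running.
while true; do
    docker service ps proxy
    sleep 5
done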
Let's create the go-demo service. It will act as a catalyst for a discussion around challenges we might face with a scaled reverse proxy:
docker network create --driver overlay go-demo
docker service create --name go-demo-db \
--network go-demo \
mongo:3.2.10
docker service create --name go-demo \
-e DB=go-demo-db \
--network go-demo \
--network proxy \
vfarcic/go-demo:1.0
There's no reason to explain the commands in detail. They are the same as those we've run in the previous chapters.
Please wait until the current state of the go-demo service is Running. Feel free to use the docker service ps go-demo command to check the status.
If we were to repeat the process we used in Chapter 3, Docker Swarm Networking and Reverse Proxy, the request to reconfigure the proxy would be as follows (please do not run it):
curl "$(docker-machine ip swarm-1):8080/v1/\
proxy/reconfigure?serviceName=go-demo&servicePath=/demo&port=8080"
We would send a reconfigure request to the proxy service. Can you guess what the result would be?
A user sends a request to reconfigure the proxy. The request is picked up by the routing mesh, load balanced across all the instances of the proxy, and forwarded to one of them. Since the proxy uses Consul to store its configuration, that instance sends the new data to one of the Consul instances which, in turn, synchronizes it across all the others.
As a result, we have proxy instances with different states. The one that received the request is reconfigured to use the go-demo service. The other two are still oblivious to it. If we try to ping the go-demo service through the proxy, we will get mixed responses. One out of three times, the response would be status 200. The rest of the time, we would get 404 Not Found:
We would experience a similar result if we scaled MongoDB. The routing mesh would load balance requests across all the instances, and their states would start to diverge. With MongoDB, we could solve the problem with replica sets, the mechanism that replicates data across all the DB instances. HAProxy, however, does not have such a feature, so I had to add it myself.
The correct request to reconfigure the proxy running multiple instances is as follows:
curl "$(docker-machine ip swarm-1):8080/v1/\
docker-flow-proxy/reconfigure \
serviceName=go-demo&servicePath=/demo&port=8080&distribute=true"
Please note the new parameter distribute=true. When specified, the proxy will accept the request, reconfigure itself, and resend the request to all other instances:
That way, the proxy implements a mechanism similar to replica sets in MongoDB. A change to one of the instances is propagated to all others.
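Heavily simplified, the idea behind the propagation is sketched below. This is not the actual implementation: PEER_IPS is only a placeholder for the addresses of the other replicas, which the proxy somehow has to discover first (we will come back to that question shortly), and the distribute=false parameter is an assumption used to indicate that the forwarded requests are not distributed any further.
# Rough, hypothetical sketch of the distribution idea: the replica that
# received the request re-sends it to each of its peers. PEER_IPS is a
# placeholder for the peer addresses; it is not set by anything above.
for PEER_IP in $PEER_IPS; do
    curl "$PEER_IP:8080/v1/docker-flow-proxy/reconfigure?\
serviceName=go-demo&servicePath=/demo&port=8080&distribute=false"
done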
Let us confirm that it indeed works as expected:
curl -i "$(docker-machine ip swarm-1)/demo/hello"
The output is as follows:
HTTP/1.1 200 OK
Date: Fri, 09 Sep 2016 16:04:05 GMT
Content-Length: 14
Content-Type: text/plain; charset=utf-8
hello, world!
The response is 200, meaning that the go-demo service received the request forwarded by the proxy service. Since the routing mesh is in play, the request entered the system, was load balanced, and was resent to one of the proxy instances. The proxy instance that received the request evaluated the path and decided that it should go to the go-demo service. As a result, the request was resent to the go-demo network, load balanced again, and forwarded to one of the go-demo instances. In other words, any of the proxy and go-demo instances could have received the request. If the proxy state were not synchronized across all the instances, two out of three requests would fail.
Feel free to repeat the curl -i $(docker-machine ip swarm-1)/demo/hello command. The result should always be the same.
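If you'd rather not repeat it manually, a quick loop such as the following one (only a convenience sketch) prints the status code of ten consecutive requests. Every one of them should be 200, regardless of which proxy replica handles it:
# Send ten requests through the routing mesh and print only the status codes.
for i in 1 2 3 4 5 6 7 8 9 10; do
    curl -s -o /dev/null -w "%{http_code}\n" \
        "$(docker-machine ip swarm-1)/demo/hello"
done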
We can double-check that the configuration is indeed synchronized by taking a peek into one of the containers.
Let's take a look at, let's say, proxy instance number three.
The first thing we should do is find out the node the instance is running in:
NODE=$(docker service ps proxy | grep "proxy.3" | awk '{print $4}')
We listed all the proxy service processes (docker service ps proxy), filtered the result for the third instance (grep "proxy.3"), and returned the name of the node stored in the fourth column of the output (awk '{print $4}'). The result was stored in the environment variable NODE.
Now that we know which server the instance is running on, we can enter the container and display the contents of the configuration file:
eval $(docker-machine env $NODE)
ID=$(docker ps | grep "proxy.3" | awk '{print $1}')
We changed the Docker client to point to that node. Then we listed all the running processes (docker ps), narrowed the result down to the third proxy instance (grep "proxy.3"), and output the container ID stored in the first column (awk '{print $1}'). The result was stored in the environment variable ID.
With the client pointing to the correct node and the ID stored as the environment variable ID, we can, finally, enter the container and display the configuration:
docker exec -it $ID cat /cfg/haproxy.cfg
The relevant part of the output is as follows:
frontend services
    bind *:80
    bind *:443
    mode http
    acl url_go-demo8080 path_beg /demo
    use_backend go-demo-be8080 if url_go-demo8080

backend go-demo-be8080
    mode http
    server go-demo go-demo:8080
As you can see, the third instance of the proxy is indeed configured correctly with the go-demo service. Feel free to repeat the process with the other two instances; the result should be exactly the same, proving that the synchronization works.
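Repeating those steps by hand for each replica quickly gets tedious, so here is a rough sketch that loops over all three. It makes the same assumptions as the manual steps above, namely that each replica (proxy.1 through proxy.3) appears in the docker ps output of the node it runs on:
# For each proxy replica: find its node, point the Docker client at it,
# grab the container ID, and print the go-demo related configuration lines.
for i in 1 2 3; do
    eval $(docker-machine env swarm-1)
    NODE=$(docker service ps proxy | grep "proxy.$i" | awk '{print $4}')
    eval $(docker-machine env $NODE)
    ID=$(docker ps | grep "proxy.$i" | awk '{print $1}')
    docker exec $ID cat /cfg/haproxy.cfg | grep "go-demo"
done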
How was it done? How did the proxy instance that received the request discover the IPs of all the other instances? After all, there is no Registrator that would provide the IPs to Consul, and we cannot access Swarm's internal service discovery API.