Skip to main content

Microservices - Distributed tracing with Spring Cloud Sleuth and Zipkin


Microservices are very flexible. We can have multiple microservices for each domain and interact as necessary. But it comes with a price. It becomes very complex when the number of microservices grow.
Imagine a situation where you found a bug or slowness in the system. How do you find the root cause by examinig logs?

  • Collect all the logs from related microservices.
  • Pick the starting microservice and find a clue there using some id (userid, businessid, etc).
  • Pick the next microservice and check whether the previous information are there.
  • Keep going until you find which microservice has the bug.

I have followed that practise in one of my previous projects. It is very difficult and takes a lot of time to track an issue.
This is why we need to use distributed tracing in microservices. One place where we can go and see the entire trace.

It helps us by;
  • Asign unique id (correlation id) to all request.
  • Pass unique id across all the microservices automatically.
  • Record time information.
  • Log service name, unique id, span id.
  • Aggregate log data from multiple microservices into single source.

Spring Cloud Sleuth

Spring Cloud Sleuth implements a distributed tracing solution for Spring Cloud. We can capture data simply using logs or send data to a collector service like Zipkin. 
Just by adding the library into our project Spring Cloud Sleuth can;
  • Add correlation id to all request if it doesn't exist.
  • Pass the id with outbound call.
  • Add correlation information to Spring's Mapped Diagnostic Context (MDC) which internally use SL4J and Logback implementations.
  • If the collector service is configured, it can pass the log information to it.

Adding Spring Cloud Sleuth

This is very simple. We need to update our pom.xml files to include the Sleuth dependency. Let's update api-gateway, service-a and service-b pom files with this.

Now re-start applications and visit http://localhost:7060/api/service-a/

Look at the logs. In service-a you will be able to see;
2019-06-23 14:45:41.024  INFO [service-a,50937d2183890546,fc6079712896add8,false] 15445 --- [io-7000-exec-10] c.s.s.controller.MessageController       : get message

In service-b
2019-06-23 14:45:41.033  INFO [service-b,50937d2183890546,260506a7161eca33,false] 15654 --- [io-7005-exec-10] c.s.s.controller.MessageController       : serving message from b

You can see in logs it has [service_name, traceId, spanId, exportable] format. Both logs have same correlation id printed. Exportable is false because we haven't added our log tracing server yet.
Let's add it.

Zipkin Server

Zipkin is a distributed tracing system. It helps gather timing data needed to troubleshoot latency problems in microservice architectures.

We used Sleuth to add tracing information in our logs. Now we are going to use Zipkin to visualize it.

Zipkin server as a Docker container

We can find the docker command from the official site to run Zipkin server as a docker container.
docker container run -d -p 9411:9411 openzipkin/zipkin

Now our Zipkin server is available at http://localhost:9411/zipkin/

Let's add Zipkin dependency to api-gateway, service-a and service-b.


Then we need to specify where to send our tracing data. Update application.yml in api-gateway, service-a and service-b as below.
    name: service-a
    baseUrl: http://localhost:9411/
      probability: 1.0

Re-start applications and visit http://localhost:7060/api/service-a/
You will be able see exportable true this time.

Now visit http://localhost:9411/zipkin/ and click on 'Firnd Traces'. You will be able see tracing information and click on it.

Now if you click on a service name it will give you more information like below.

That is it for now. You have your base tracing module to play with.