Welcome to another instalment of Optis Tech Talks, where our employees explain technologies they are passionate about. Building on our previous talks about reactive programming and Infrastructure as Code, we will tell a bit more about distributed logging for microservice architectures.
Stop us if you’ve heard this story before in one of our previous Tech Talks: the monolithic way of working no longer cuts it. These legacy applications are gradually being replaced by microservices to shorten deploy times and increase agility. This is a great idea for your organisation, at least as long as everything works correctly. Unfortunately, Murphy’s law still exists.
When things inevitably go wrong, your team now needs to debug a complex interconnected network of microservices instead of a single application. Each service has its own logs, making analytics and reporting that much harder. You should make sure that your support engineers aren’t wasting valuable time and resources on debugging when they could be innovating instead.
After all, the advantages of microservices are only apparent when they are employed correctly. We’ve explained how to do so in our previous Tech Talks, so be sure to check out our posts on Reactive Programming and Infrastructure as Code if you haven’t done so already. To expand on those concepts, today we will investigate the logging aspect.
At its core, distributed logging brings the centralised logging aspect of a monolithic application to architectures based on asynchronous communication.
A dedicated tool will analyse, format and correlate the logging streams from each microservice. It then centralises these logs in one convenient overview, making debugging and analytics a lot easier. Tracing makes it possible to follow a single user and their activities across services when debugging, saving valuable time that would otherwise be spent manually scrolling through extensive log files.
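To make that correlation possible, each service typically emits structured logs that carry a shared correlation ID. Here is a minimal sketch in plain Python of what such a setup could look like; the JSON field names and the "order-service" logger are illustrative assumptions, not a specific tool's format:

```python
import json
import logging
import uuid

# Each log record is emitted as one JSON line carrying a correlation ID,
# so a central tool (Datadog, the ELK Stack, ...) can group all records
# belonging to the same request, even across services.
class JsonFormatter(logging.Formatter):
    def format(self, record):
        return json.dumps({
            "timestamp": self.formatTime(record),
            "service": record.name,
            "level": record.levelname,
            "correlation_id": getattr(record, "correlation_id", None),
            "message": record.getMessage(),
        })

handler = logging.StreamHandler()
handler.setFormatter(JsonFormatter())
logger = logging.getLogger("order-service")  # hypothetical service name
logger.addHandler(handler)
logger.setLevel(logging.INFO)

# The same ID travels with the request from service to service
# (for example via an HTTP header), tying their logs together.
correlation_id = str(uuid.uuid4())
logger.info("order received", extra={"correlation_id": correlation_id})
logger.info("payment authorised", extra={"correlation_id": correlation_id})
```

Because every line is self-describing JSON, the central tool can index and filter on `correlation_id` without any per-service parsing rules.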
In that sense, distributed logging eases some difficulties inherent in reactive architectures. It can, therefore, help those architectures adhere to the resilience principle of reactive programming. As you might recall from our introductory article, reactive programming is built around elasticity and resilience. Effective debugging and diagnostics are essential not only to building that resilience, but to continually improving it.
There’s a whole slew of logging tools out there, depending on your specific needs. One aspect in which they differ is their level of management. Datadog, for example, is fully managed, requiring no server management and very little in the way of configuration. On the other end of this spectrum, we have open-source tools which require self-hosting: Zipkin for distributed tracing, Grafana for dashboards, and the combination of Elasticsearch, Logstash, and Kibana, better known as the Elastic (ELK) Stack, a name that reminds us of another tenet of reactive programming.
Some cloud services also offer logging and tracing capabilities as add-ons; examples include Amazon CloudWatch and Azure Monitor’s Application Insights. At Optis, we often choose Datadog because it is cloud-agnostic and therefore supports hybrid cloud infrastructures. That said, there is a lot of AWS expertise at Optis as a whole, so for projects that run entirely on AWS, CloudWatch might be a better fit.
In the end, it mainly depends on whether you use a hybrid cloud infrastructure, and the degree to which you want to avoid vendor lock-in. Depending on your choice of tool, you will use either its proprietary format or the OpenTelemetry standard. We will go more in-depth on this standard and its benefits in the next part of this Tech Talk.
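As a small taste of what that standard involves: OpenTelemetry propagates trace context between services using the W3C Trace Context `traceparent` HTTP header. Below is a stdlib-only sketch of building and parsing such a header; in practice an OpenTelemetry SDK handles this for you, so the helper functions here are purely illustrative:

```python
import re
import secrets

# The W3C Trace Context "traceparent" header, which OpenTelemetry uses to
# carry a trace across service boundaries, has four dash-separated parts:
# version "00", a 16-byte trace ID, an 8-byte parent span ID, and flags.
def make_traceparent(sampled: bool = True) -> str:
    trace_id = secrets.token_hex(16)  # 32 hex chars, shared by the whole trace
    span_id = secrets.token_hex(8)    # 16 hex chars, identifies this span
    flags = "01" if sampled else "00"
    return f"00-{trace_id}-{span_id}-{flags}"

def parse_traceparent(header: str):
    # Returns (trace_id, span_id, sampled) or None for a malformed header.
    m = re.fullmatch(r"00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})", header)
    if not m:
        return None
    trace_id, span_id, flags = m.groups()
    return trace_id, span_id, int(flags, 16) & 1 == 1
```

Each service parses the incoming header, starts its own span under the same trace ID, and forwards a new header downstream, which is how a central tool can stitch one request's journey back together.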
Distributed logging is key when working with microservices because of their fragmented character. You can implement logging in a variety of ways, depending on your cloud infrastructure’s needs. Regardless of the tool used, it will lead to a single place for all of your logging and analytics.