Practical Observability in Action: Monitoring Distributed Systems with Elastic APM

In today’s fast-paced digital landscape, ensuring the reliability, performance, and health of distributed systems is crucial. Modern architectures, like microservices, introduce complexities that require more than just traditional monitoring. Enter observability — a practice that goes beyond metrics to provide actionable insights into a system’s internal states.

This guide offers a comprehensive introduction to observability and monitoring in production-grade systems, focusing on hands-on implementation with a distributed Node.js application using Elastic APM, Elasticsearch, and Kibana.

What is Observability and Monitoring?

Observability refers to the capability of understanding the internal state of a system by analyzing its outputs — logs, metrics, and traces. It enables teams to:

  • Diagnose performance issues across distributed systems.

  • Trace workflows spanning multiple services.

  • Anticipate and prevent system failures.

Monitoring, on the other hand, involves collecting specific metrics and checking them against predefined thresholds to confirm that a system operates as expected. Together, they help teams:

  • Proactively identify and fix issues.

  • Enhance system reliability and uptime.

  • Deliver a seamless user experience.
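To make the "predefined thresholds" side of monitoring concrete, here is a minimal sketch of a latency check. The helper names (`percentile`, `checkLatency`), the sample values, and the 500 ms limit are all assumptions made up for illustration; they are not part of Elastic APM.

```javascript
// Hypothetical helper: nearest-rank percentile of a list of latencies (ms).
function percentile(values, p) {
  if (values.length === 0) throw new Error("no samples");
  const sorted = [...values].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.max(rank, 1) - 1];
}

// Monitoring-style check: compare one metric (p95 latency) with a fixed threshold.
function checkLatency(samplesMs, thresholdMs) {
  const p95 = percentile(samplesMs, 95);
  return { p95, thresholdMs, healthy: p95 <= thresholdMs };
}

// Illustrative samples only: one slow request pushes p95 over the limit.
const result = checkLatency([120, 140, 135, 2100, 150], 500);
console.log(result);
```

A check like this tells you that something is slow. Observability data (traces and logs) is what tells you why.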

Why Observability Matters in Distributed Systems

Modern distributed systems, such as those based on microservices, are inherently complex:

  • Services interact across networks.

  • Failures can cascade unpredictably.

  • Debugging involves multiple moving parts.

Observability provides:

  1. End-to-End Tracing: Understand how requests flow through the system.

  2. Contextual Insights: Combine logs, metrics, and traces for a complete picture.

  3. Faster Troubleshooting: Pinpoint issues efficiently without guesswork.
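End-to-end tracing works because every service passes a shared trace identifier to the next one. The Elastic APM agents handle this for you, so you never need to write it yourself. The sketch below only illustrates the idea, using the W3C Trace Context `traceparent` header format. The function names are hypothetical.

```javascript
const crypto = require("crypto");

// Start a new trace: 16-byte trace id, 8-byte span id, flags "01" (sampled).
function newTraceparent() {
  const traceId = crypto.randomBytes(16).toString("hex");
  const spanId = crypto.randomBytes(8).toString("hex");
  return `00-${traceId}-${spanId}-01`;
}

// Parse "version-traceId-parentId-flags"; returns null for malformed input.
function parseTraceparent(header) {
  const m = /^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$/.exec(header);
  if (!m) return null;
  return { version: m[1], traceId: m[2], parentId: m[3], flags: m[4] };
}

// Continue a trace in a downstream call: keep the trace id, mint a new span id.
function childTraceparent(incoming) {
  const parsed = parseTraceparent(incoming);
  if (!parsed) return newTraceparent();
  const spanId = crypto.randomBytes(8).toString("hex");
  return `00-${parsed.traceId}-${spanId}-${parsed.flags}`;
}

const root = newTraceparent();
const child = childTraceparent(root);
console.log({ root, child });
```

Because the trace id stays the same across hops, a backend can stitch the spans from different services into a single trace.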

Hands-On Guide: Observability in a Distributed Node.js System

Let’s build a distributed system with two Node.js microservices, implement observability, and visualize traces using Elastic APM, Elasticsearch, and Kibana.

1. Prerequisites

To follow along, ensure you have:

  • Node.js installed.

  • Elasticsearch, Kibana, and Elastic APM Server installed.

  • A basic understanding of REST APIs.

2. Designing the Distributed System

We will create two services:

  1. Service A: Exposes two APIs. One of them calls an API in Service B and processes the response.

  2. Service B: Exposes two APIs that perform dummy processing and return results.

3. Setting Up Elastic APM, Elasticsearch, and Kibana

  1. Install Elasticsearch, Kibana, and APM Server: Follow the official installation guide for each component.

4. Start Elasticsearch, Kibana, and APM Server:

# Start Elasticsearch
./bin/elasticsearch

# Start Kibana
./bin/kibana

# Start APM Server
./bin/apm-server -e

5. Verify the Setup: Confirm that Elasticsearch, APM Server, and Kibana are all running.

5.1 Access Elasticsearch at [http://localhost:9200](http://localhost:9200)

5.2 Access APM Server at [http://localhost:8200](http://localhost:8200)

5.3 Access Kibana at [http://localhost:5601](http://localhost:5601)

6. Setting Up Service A with Elastic APM

6.1 Initialize Service A:

mkdir service-a
cd service-a
npm init -y

6.2 Install Dependencies:

npm install express axios elastic-apm-node

6.3 Integrate Elastic APM:

6.3.1 Create index.js:

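// Start the APM agent before requiring any other module so it can instrument them.
const apm = require("elastic-apm-node").start({
  serviceName: "service-a",
  serverUrl: "http://127.0.0.1:8200",
  environment: "development",
});

const express = require("express");
const axios = require("axios");

const app = express();

// Calls Service B, so the resulting trace spans both services.
app.get("/api-a1", async (req, res) => {
  try {
    const response = await axios.get("http://localhost:4000/api-b1");
    res.send(`Service A API-1 completed! Response: ${response.data}`);
  } catch (err) {
    // Report the failure to APM and answer the client instead of leaving the request hanging.
    apm.captureError(err);
    res.status(502).send("Service A API-1 failed: could not reach Service B");
  }
});

// Simulates 1.5 seconds of local work.
app.get("/api-a2", (req, res) => {
  setTimeout(() => res.send("Service A API-2 completed!"), 1500);
});

app.listen(3000, () => console.log("Service A running on port 3000"));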
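The agent instruments Express routes and outgoing HTTP calls automatically, but you may also want to time a specific piece of your own logic. Here is a minimal sketch of a wrapper around the agent's `startSpan` and `captureError` calls. The `withSpan` helper and the span name used in the usage comment are hypothetical. The agent object is passed in as a parameter, so the helper works with whatever `start()` returned in your service.

```javascript
// Wraps an async function in a custom span.
// `apm` is the agent object returned by require("elastic-apm-node").start(...).
async function withSpan(apm, name, type, fn) {
  // startSpan returns null when there is no active transaction to attach to.
  const span = apm.startSpan(name, type);
  try {
    return await fn();
  } catch (err) {
    // Record the error in APM, then let the caller handle it as usual.
    apm.captureError(err);
    throw err;
  } finally {
    if (span) span.end();
  }
}

// Possible usage inside Service A (assumes `app` and `apm` from index.js):
// app.get("/api-a2", async (req, res) => {
//   const text = await withSpan(apm, "build-response", "app", async () => {
//     return "Service A API-2 completed!";
//   });
//   res.send(text);
// });
```

The custom span should then appear in the trace timeline next to the automatically captured ones, which helps when a slow request spends its time in your own code rather than in a downstream call.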

7. Setting Up Service B with Elastic APM

7.1 Initialize Service B:

mkdir service-b
cd service-b
npm init -y

7.2 Install Dependencies:

npm install express elastic-apm-node

7.3 Create index.js:

// Start the APM agent before requiring any other module so it can instrument them.
const apm = require("elastic-apm-node").start({
  serviceName: "service-b",
  serverUrl: "http://127.0.0.1:8200",
  environment: "development",
});

const express = require("express");

const app = express();

// Simulates 2 seconds of processing; Service A calls this endpoint.
app.get("/api-b1", (req, res) => {
  setTimeout(() => res.send("Service B API-1 completed!"), 2000);
});

// Simulates 1 second of processing.
app.get("/api-b2", (req, res) => {
  setTimeout(() => res.send("Service B API-2 completed!"), 1000);
});

app.listen(4000, () => console.log("Service B running on port 4000"));
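Both services hard-code the same agent settings. One option is to build the configuration from environment variables so the same code can run in different environments. The `apmConfigFromEnv` helper below is a hypothetical sketch. The agent also reads `ELASTIC_APM_SERVICE_NAME`, `ELASTIC_APM_SERVER_URL`, and `ELASTIC_APM_ENVIRONMENT` by itself, so the helper mainly makes the fallback values explicit.

```javascript
// Builds the options object for the APM agent, falling back to the values
// used in this guide when a variable is not set.
function apmConfigFromEnv(defaultServiceName, env = process.env) {
  return {
    serviceName: env.ELASTIC_APM_SERVICE_NAME || defaultServiceName,
    serverUrl: env.ELASTIC_APM_SERVER_URL || "http://127.0.0.1:8200",
    environment: env.ELASTIC_APM_ENVIRONMENT || env.NODE_ENV || "development",
  };
}

console.log(apmConfigFromEnv("service-b"));

// Possible usage at the top of index.js:
// const apm = require("elastic-apm-node").start(apmConfigFromEnv("service-b"));
```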

8. Viewing Traces in Kibana

8.1 Generate Traffic: Start both services by running node index.js in each service directory, then send requests using a tool like curl or Postman:

curl http://localhost:3000/api-a1

curl http://localhost:3000/api-a2

8.2 Access APM in Kibana:

  • Go to the APM section in Kibana.

  • View the service map to see how requests flow between Service A and Service B.

  • Explore traces for detailed information on request timings and potential bottlenecks.

Benefits of Observability in Production Systems

  1. Proactive Issue Detection: Spot anomalies before they escalate into major problems.

  2. Improved Collaboration: Shared insights enable teams to work together more effectively.

  3. Optimized Performance: Understand bottlenecks and optimize resource usage.

  4. Enhanced User Experience: Ensure seamless interactions by minimizing downtime and latency.

Conclusion

Observability is not just a tool; it’s a mindset. By instrumenting your distributed systems with Elastic APM, Elasticsearch, and Kibana, you can trace requests across services, find bottlenecks faster, and improve reliability. Use this guide as a starting point for building an observability setup for your own production systems.

Start your observability journey today and unlock the full potential of your distributed systems!
