Posts Tagged observability

OpenObserve: A High-Performance, Modern Observability Platform

Good morning, everyone! Dimitri Bellini here, and welcome back to Quadrata, my channel dedicated to the open-source world and the IT topics I love. As you know, I’m a big fan of my friend Zabbix, but it’s crucial to keep our eyes on the horizon, understand where the world is moving, and explore new solutions that meet the demands of our customers and the community.

That’s why today, I want to introduce you to a solution I’ve had the pleasure of getting to know: OpenObserve. It’s another powerful tool in the observability space, but it approaches the task in a refreshingly different way.

What is OpenObserve and Why Should You Care?

OpenObserve is a cloud-native, open-source observability platform designed to be a unified backend for your logs, metrics, and traces. Think of it as a lightweight yet powerful alternative to heavyweights like Elasticsearch, Splunk, or Datadog. It tackles a key challenge many of us face: consolidating different monitoring tools into a single, cohesive platform.

Instead of juggling separate tools like Prometheus for metrics, Loki for logs, and Jaeger for traces, OpenObserve brings everything under one roof. This unified approach simplifies your workflow and provides a single pane of glass to view the health of your entire infrastructure.

The Game-Changing Features

What really caught my attention are the core functionalities that make OpenObserve stand out:

  • Massive Cost Reduction: This is a big one. By using a specific format called Parquet and a stateless architecture that leverages object storage (like S3, MinIO, or even a local disk), OpenObserve can drastically reduce storage costs. They claim it can be up to 140 times lower than Elasticsearch! For anyone managing hundreds of gigabytes of data per day, this is a revolutionary benefit.
  • Blazing-Fast Performance: The entire engine is written in Rust. We’ve heard a lot about Rust, especially in the Linux kernel world, and for good reason. It’s an incredibly optimized and efficient language. This means OpenObserve can ingest a massive amount of data with a significantly lower memory and CPU footprint compared to Java-based solutions.
  • Simplified Querying: If you’re comfortable with SQL, you’ll feel right at home. OpenObserve allows you to query your logs using standard SQL-based syntax, which dramatically lowers the learning curve. For metrics, it also supports PromQL, giving you the best of both worlds.
  • Native OpenTelemetry Support: It seamlessly integrates with OpenTelemetry, the emerging standard for collecting traces and metrics. This makes it incredibly easy to instrument your applications, whether they’re written in Go, Python, or another language, and start sending data to OpenObserve.
  • Real-time Alerting: Right from the UI, you can define alerts based on log patterns or metric thresholds, similar to what you might do in Prometheus.
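To give a feel for that SQL-over-logs style before installing anything, here is a small sketch that uses Python’s built-in SQLite in place of OpenObserve itself; the table and field names (logs, level, message) are invented for illustration, but the query is the same kind of thing you would type into the OpenObserve search bar:

```python
import sqlite3

# Stand-in for a log stream: a tiny in-memory table of log events.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE logs (timestamp TEXT, level TEXT, message TEXT)")
conn.executemany(
    "INSERT INTO logs VALUES (?, ?, ?)",
    [("2024-05-01T10:00:00", "INFO", "service started"),
     ("2024-05-01T10:00:05", "ERROR", "db connection refused"),
     ("2024-05-01T10:00:09", "ERROR", "retry failed"),
     ("2024-05-01T10:00:12", "INFO", "recovered")],
)

# Plain SQL instead of a bespoke query language: filter and sort the errors.
rows = conn.execute(
    "SELECT timestamp, message FROM logs WHERE level = 'ERROR' ORDER BY timestamp"
).fetchall()
for ts, msg in rows:
    print(ts, msg)
```

If you already know SQL, there is essentially nothing new to learn here, which is exactly the point.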

Under the Hood: The Technology Stack

I always believe it’s fundamental to understand the components of a solution to appreciate its engineering. OpenObserve is built on a stack of impressive open-source technologies:

  • Rust: The core language, providing memory safety and high performance.
  • Apache Arrow DataFusion: A powerful query engine that enables the SQL support on top of Parquet files.
  • Apache Parquet: A columnar storage format developed by the Apache Foundation that allows for incredible data compression and efficient querying.
  • NATS: A lightweight and high-performance messaging system used for communication and coordination between nodes in a clustered setup.
  • Vue.js: The framework used to build the modern and reactive web interface.
  • SQLite / PostgreSQL: SQLite is used for metadata in simple, standalone deployments, while PostgreSQL is recommended for robust, high-availability production environments.
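Why does a columnar format like Parquet compress so much better than row-oriented storage? Log data tends to have a few highly repetitive columns (level, message template) next to a high-entropy one (timestamp). Grouping each column together puts the repeats side by side, which any compressor exploits. This toy sketch with synthetic log rows and stdlib zlib (not Parquet’s actual encoding, which does far better with dictionary and run-length encoding) shows the effect:

```python
import zlib

# 20,000 synthetic log rows: incrementing timestamps, but very repetitive
# level and message columns -- typical of real log data.
n = 20000
timestamps = [f"2024-05-01T10:{i // 60 % 60:02d}:{i % 60:02d}" for i in range(n)]
levels = [("INFO", "WARN", "ERROR")[i % 3] for i in range(n)]
messages = [f"request handled by worker {i % 5}" for i in range(n)]

# Row-oriented layout: one line per event, columns interleaved.
row_blob = "\n".join(
    f"{t},{l},{m}" for t, l, m in zip(timestamps, levels, messages)
).encode()

# Column-oriented layout (the Parquet idea): each column stored contiguously,
# so identical values sit next to each other and compress far better.
col_blob = "\n".join(
    ["\n".join(timestamps), "\n".join(levels), "\n".join(messages)]
).encode()

row_size = len(zlib.compress(row_blob))
col_size = len(zlib.compress(col_blob))
print(f"row-oriented: {row_size} bytes, column-oriented: {col_size} bytes")
```

Columnar layout also means a query that touches only one or two fields can skip reading the rest of the data entirely, which is where much of the query speed comes from.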

Getting Started with OpenObserve

One of the best parts is how easy it is to get started. For testing and simple use cases, you just need Docker. The architecture is straightforward: collectors like Fluent Bit, Vector, or OpenTelemetry send data to your OpenObserve container, which writes to a local disk. This simple setup can already handle an impressive ingestion rate of over 2TB of data per day on a single machine.

For high-availability (HA) production environments, the architecture scales out using Kubernetes, with distinct roles for routers, ingesters, queriers, and more, all coordinated by NATS and backed by object storage.

A Quick Tutorial: Installation with Docker

You can get a test environment running in minutes. It’s as simple as running a single Docker command. Here is the command I used, which you can customize with your own user and password:


docker run -d --name openobserve \
-p 5080:5080 \
-e ZO_ROOT_USER_EMAIL="admin@example.com" \
-e ZO_ROOT_USER_PASSWORD="Complexpass#123" \
-v /opt/openobserve-data:/data \
public.ecr.aws/zinclabs/openobserve:latest

I manage my containers with a tool that simplifies deployment, where I just fill in the image, ports, environment variables, and volume. It’s incredibly straightforward!

A Look at the Dashboard and Final Thoughts

Once you log in, you’re greeted with a clean dashboard showing key stats like ingested events and storage size. The “Data Sources” section is fantastic, providing you with ready-to-use instructions for ingesting data from Kubernetes, Linux, Windows, various databases, and more. This makes the initial setup a breeze.

The log exploration interface will feel familiar to anyone who has used Splunk, with powerful SQL-based querying and on-the-fly filtering. You can visualize metrics, build custom dashboards, analyze application traces with service maps, and even dive into real user monitoring.

What truly impressed me, however, is their licensing model. For self-hosted deployments, you can use the full enterprise version for free for up to 200GB of data ingestion per day. This includes features like single sign-on (SSO) and role-based access control (RBAC). This is a brilliant move that allows smaller teams and environments to leverage the full power of the platform without a cost barrier. A big round of applause to the OpenObserve team for that!

Conclusion: Keep a Close Eye on This One

So, is OpenObserve an interesting solution? Absolutely. It’s a project to watch closely. It takes a smart approach: a lightweight, nimble solution (no lumbering pachyderms here) built with exciting technologies like Rust and Parquet. It seems to have a finesse that sets it apart from the many other open-source observability tools out there.

I encourage you to take a look at it. The project is moving fast, and it offers a compelling combination of performance, cost-efficiency, and user-friendliness.

That’s all for today! Let me know your thoughts in the comments below. Do you find these all-in-one observability solutions useful? I’d love to hear from you.

A greeting from Dimitri, see you next week!


Don’t forget to like this video and subscribe to my channel for more open-source content:

My YouTube Channel: Quadrata

Join the conversation on Telegram: Zabbix Italia

Zabbix 8.0 Alpha1 Is Here: A First Look at the Future of Monitoring

Good morning, everyone! Dimitri Bellini here, back with another episode on Quadrata, my channel dedicated to the world of open source and IT. This week, we’re diving into something I know many of you have been eagerly anticipating: the first alpha release of Zabbix 8.0!

This is a major Long-Term Support (LTS) release, and the roadmap is packed with exciting features that promise to reshape how we approach monitoring and observability. So, let’s explore what’s new, what’s coming, and what you can already get your hands on.

The Vision for Zabbix 8.0: A Focus on Observability

Before we get into the specifics of the alpha, it’s worth looking at the grand vision for Zabbix 8.0. The development is heavily focused on expanding into the realm of full-fledged observability. This means more than just collecting metrics; it’s about gaining deeper insights into our systems.

Key areas of development include:

  • OpenTelemetry and Log Ingestion: A huge step forward will be the native handling of OpenTelemetry data and enhanced log ingestion. This requires a robust backend, and Zabbix is exploring solutions like ClickHouse or OpenSearch to manage the massive amount of JSON-structured data that comes with it.
  • Event Correlation: A feature I’m personally very excited about is the advanced event correlation engine. This will be a game-changer for reducing message entropy and alert noise, allowing us to pinpoint root causes more effectively.
  • Enhanced Network Monitoring: We’re also seeing a big push in network monitoring, with support for data streaming via NetFlow and sFlow, tying directly into the broader observability goals.

What’s New in the First Alpha Release?

While the full vision will unfold over the coming months, the first alpha already delivers some fantastic quality-of-life improvements and new functionalities. Here are the highlights that stood out to me.

Finally! Inherited Tags in Latest Data

This is a big one. For a long time, the “Latest Data” page has been a source of frustration because it didn’t inherit tags from templates. If you’ve been in the Zabbix world for a while, you know that since the removal of “Applications,” filtering data for a specific component, like MySQL, became a bit of a chore. The community has been vocal about this, and I’m thrilled to say Zabbix has listened. Now, tags from your templates are visible directly in the Latest Data view, making it incredibly easy to filter and segregate items from the OS versus a specific application.

Streamlined SAML Authentication

For anyone working in an enterprise environment, managing SAML certificates could be a bit “rustic.” You had to manually place certificate files into the server’s file system. Zabbix 8.0 introduces a much more professional solution: you can now upload and manage SAML certificates directly through the web interface under Administration -> Authentication. This is a small but significant change that simplifies setup, reduces errors, and makes the whole process much cleaner.

New and Improved Templates

Zabbix continues to deliver excellent out-of-the-box templates. This release brings new additions for networking gear from Aruba, Cisco, StormShield, and Vyatta. Furthermore, the Proxmox template has received a much-needed overhaul. With the recent shifts in the virtualization landscape (looking at you, VMware), many are turning to Proxmox, and it’s great to see Zabbix providing a more modern, robust template for it. There are also improvements to the MySQL template, specifically around replication monitoring, and native monitoring support for Ceph storage, which is heavily used in Proxmox environments.

A Fresh Coat of Paint: New Font and UI Tweaks

Zabbix has moved on from the trusty Arial font to a new, more refined typeface. While the difference is subtle, it gives the interface a slightly more modern and elegant feel. You might not notice it at first glance, but it’s part of a continuous effort to improve the user experience.

New Visualization Power: The Scatterplot Widget

Version 8.0 introduces a brand-new dashboard widget: the scatterplot. This might not be for every use case, but it’s incredibly powerful for visualizing the relationship between two different metrics across multiple hosts. For example, you could plot the signal-to-noise ratio for dozens of access points, allowing you to instantly identify outliers and potential issues. It’s a fantastic tool for spotting correlations and anomalies that would be lost in a standard time-series graph.

Under the Hood Improvements

There are also some important changes that improve stability and performance, particularly in large-scale environments:

  • Improved Event Cleanup: When a trigger is deleted, the associated events are now cleaned up immediately, rather than waiting for the housekeeper process.
  • Smarter Proxy Throttling: The logic for how proxies send data to the server when the history cache is under pressure has been revised. This helps prevent data storms from proxies overwhelming the Zabbix server and avoids getting stuck in loops, which could happen in large installations with heavy log monitoring.

What’s Next? Alpha 2 and the Road to Release

The journey to Zabbix 8.0 LTS, expected around mid-2026, is just beginning. Work is already underway for Alpha 2, which is slated to introduce the JSON item type and the ClickHouse backend support—foundational pieces for the observability features we discussed. These additions will be critical for handling streaming data from OpenTelemetry and other sources, truly pushing Zabbix into the next era of monitoring.

I am incredibly excited to see these features develop and to test how they transform our ability to monitor complex, application-centric environments.

What Are You Most Excited About?

That’s a wrap for my first look at Zabbix 8.0 Alpha! From my perspective, the moves toward observability and better event correlation are the most exciting developments. But I want to hear from you!

What features are you most looking forward to? Is it the OpenTelemetry integration, the advanced event correlation, or perhaps the network topology improvements? Let me know in the comments below!

And if you’re not already part of our community, I invite you to join the conversation.

Thanks for tuning in. A big greeting from me, Dimitri, and I’ll see you next week. Bye everyone!

SigNoz: A Powerful Open Source APM and Observability Tool

Diving Deep into SigNoz: A Powerful Open Source APM and Observability Tool

Good morning everyone, I’m Dimitri Bellini, and welcome back to Quadrata, the channel where we explore the fascinating world of open source and IT. As I always say, I hope you enjoy these videos, and if you haven’t already, please consider subscribing and hitting that like button if you find the content valuable!

While Zabbix always holds a special place in our hearts for monitoring, today I want to introduce something different. I’ve been getting requests from customers about how to monitor their applications, and for that, you typically need an Application Performance Monitor (APM), or as it’s sometimes fancily called, an “Observability Tool.”

Introducing SigNoz: Your Open Source Observability Hub

The tool I’m excited to share with you today is called SigNoz. It’s an open-source solution designed for comprehensive observability, which means it helps you monitor metrics, traces (the calls made within your application), and even logs. This last part is a key feature of SigNoz, as it aims to incorporate everything you might need to keep a close eye on your applications.

One of its core strengths is that it’s built natively on OpenTelemetry. OpenTelemetry is becoming an industry standard for collecting telemetry data (metrics, traces, logs) from your applications and transmitting it to a backend like SigNoz. We’ll touch on the advantages of this later.

Why Consider SigNoz?

SigNoz positions itself as an open-source alternative to paid, proprietary solutions like Datadog or New Relic, which can be quite expensive. Of course, choosing open source isn’t just about avoiding costs; it’s also about flexibility and community. For home labs, small projects, or even just for learning, SigNoz can be incredibly useful.

Key Features of SigNoz

  • Application Performance Monitoring (APM): Out-of-the-box, you get crucial metrics like P99 latency, error rates, requests per second, all neatly presented in dashboards.
  • Distributed Tracing: This allows you to follow the path of a request as it travels through your application, helping you pinpoint bottlenecks and errors.
  • Log Management: A relatively recent but powerful addition, SigNoz can ingest logs, allowing you to search and analyze them, similar to tools like Graylog (though perhaps with fewer advanced log-specific features for now).
  • Metrics and Dashboards: SigNoz provides a user-friendly interface with customizable dashboards and widgets.
  • Alerting: You can set up alerts, much like triggers in Zabbix, to get notified via various channels when something goes wrong.

Under the Hood: The Architecture of SigNoz

Understanding how SigNoz is built is fundamental to appreciating its capabilities:

  • OpenTelemetry: As mentioned, this is the core component for collecting and transmitting data from your applications.
  • ClickHouse: This is the database SigNoz uses. ClickHouse is an open-source, column-oriented database management system that’s incredibly efficient for handling and querying millions of data points very quickly. It also supports high availability and horizontal scaling even in its open-source version, which isn’t always the case with other databases.
  • SigNoz UI: The web interface that allows you to visualize and interact with the data collected by OpenTelemetry and stored in ClickHouse.

For those wanting to try it out at home, you can easily get this all running with Docker.

The Power of OpenTelemetry

OpenTelemetry is a game-changer. It’s becoming a de facto standard, with even tools like Dynatrace now able to use OpenTelemetry as a data source. The community around it is very active, making it a solid foundation for a product like SigNoz.

Key advantages of OpenTelemetry include:

  • Standardization: It provides a consistent way to instrument applications.
  • Libraries and Agents: It offers out-of-the-box libraries and agents for most major programming languages, simplifying instrumentation.
  • Auto-Instrumentation (Monkey Patching): Theoretically, OpenTelemetry can automatically inject the necessary code into your application to capture telemetry data without you needing to modify your application’s source code significantly. You just invoke your application with certain environment parameters. I say “theoretically” because while I tried it with one of my Python applications, I couldn’t get it to trace anything. Let me know in the comments if you’d like a dedicated video on this; I’m curious to dig deeper into why it didn’t work for me straight away!

Getting Started: Installing SigNoz with Docker and a Demo App

For my initial tests, I used a demo application suggested by the SigNoz team. Here’s a rundown of how you can get started with a standalone Docker setup:

1. Install SigNoz

It’s straightforward:

  1. Clone the SigNoz repository: git clone https://github.com/SigNoz/signoz.git (or the relevant path from their docs).
  2. Navigate into the directory and run Docker Compose. This will pull up four containers:

    • SigNoz OTel Collector (OpenTelemetry Collector): Gathers data from OpenTelemetry agents.
    • SigNoz Query Service/Frontend: The graphical interface.
    • ClickHouse Server: The database.
    • ZooKeeper: Coordinates the ClickHouse instances (a role similar to etcd’s).

You can usually find the exact commands in the official SigNoz documentation under the Docker deployment section.

2. Set Up the Sample FastAPI Application

To see SigNoz in action, I used their “Sample FastAPI App”:

  1. Clone the demo app repository: (You’ll find this on the SigNoz GitHub or documentation).
  2. Create a Python 3 virtual environment: It’s always good practice to isolate dependencies.
    python3 -m venv .venv
    source .venv/bin/activate

  3. Install dependencies:
    pip install -r requirements.txt

  4. Install OpenTelemetry components for auto-instrumentation:
    pip install opentelemetry-distro opentelemetry-exporter-otlp

  5. Bootstrap OpenTelemetry (optional, for auto-instrumentation):
    opentelemetry-bootstrap --action=install

    This attempts to find requirements for your specific application.

  6. Launch the application with OpenTelemetry instrumentation:

    You’ll need to set a few environment variables:

    • OTEL_RESOURCE_ATTRIBUTES: e.g., service.name=MyFastAPIApp (This name will appear in SigNoz).
    • OTEL_EXPORTER_OTLP_ENDPOINT: The address of your SigNoz collector (e.g., http://localhost:4317 if running locally).
    • OTEL_TRACES_EXPORTER: Set to otlp.
    • OTEL_EXPORTER_OTLP_PROTOCOL: Can be grpc or http/protobuf.

    Then, run your application using the opentelemetry-instrument command:

    OTEL_RESOURCE_ATTRIBUTES=service.name=FastApp OTEL_EXPORTER_OTLP_ENDPOINT="http://:4317" OTEL_TRACES_EXPORTER=otlp OTEL_EXPORTER_OTLP_PROTOCOL=grpc opentelemetry-instrument uvicorn main:app --host 0.0.0.0 --port 8000

    (Replace with the actual IP where SigNoz is running).
    The opentelemetry-instrument part is what attempts the “monkey patching” or auto-instrumentation. The application itself (uvicorn main:app...) starts as it normally would.

A Quick Look at SigNoz in Action

Once the demo app was running and sending data, I could see traces appearing in my terminal (thanks to console exporter settings). To generate some load, I used Locust with a simple configuration to hit the app’s HTTP endpoint. This simulated about 10 users.
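Locust is what I used, but the idea of generating concurrent load is simple enough to sketch with the Python standard library alone. This toy version spins up a throwaway local HTTP server (standing in for the demo app, so it runs anywhere) and hits it from 10 concurrent “users”:

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

# Throwaway "Hello World" endpoint standing in for the FastAPI demo app.
class Hello(BaseHTTPRequestHandler):
    def do_GET(self):
        self.send_response(200)
        self.end_headers()
        self.wfile.write(b"Hello World")

    def log_message(self, *args):  # keep the terminal quiet
        pass

server = ThreadingHTTPServer(("127.0.0.1", 0), Hello)  # port 0 = pick a free port
threading.Thread(target=server.serve_forever, daemon=True).start()
port = server.server_address[1]

# Each "user" issues 5 sequential requests; list.append is thread-safe in CPython.
results = []
def user(requests_per_user):
    for _ in range(requests_per_user):
        with urllib.request.urlopen(f"http://127.0.0.1:{port}/") as resp:
            results.append(resp.status)

threads = [threading.Thread(target=user, args=(5,)) for _ in range(10)]
for t in threads:
    t.start()
for t in threads:
    t.join()
server.shutdown()

print(f"{len(results)} requests completed")
```

Against an instrumented app, each of those requests would show up as a trace in SigNoz; for anything beyond a quick smoke test, a real tool like Locust gives you ramp-up profiles and latency percentiles.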

Navigating to the SigNoz UI (typically on port 3301, though some Docker setups expose the frontend on 8080; the collector itself listens on 4317/4318), the dashboard immediately showed my “FastApp” service. Clicking on it revealed:

  • Latency, request rate, and error rate graphs.
  • A list of endpoints called.

Drilling down into the traces, I could see individual requests. For this simple “Hello World” app, the trace was trivial, just showing the HTTP request. However, if the application were more complex, accessing a database, for example, OpenTelemetry could trace those interactions too, showing you the queries and time taken. This is where it gets really interesting for debugging and performance analysis.

The SigNoz interface felt responsive and well-designed. I was quite impressed with how smoothly it all worked.

Final Thoughts and What’s Next

I have to say, SigNoz seems like a very capable and well-put-together tool. It’s definitely worth trying out, especially if you’re looking for an open-source observability solution.

I plan to test it further with a more complex application, perhaps one involving a database, to see how it handles more intricate call graphs and to really gauge if it can be a strong contender against established players for more demanding scenarios.

It’s also interesting to note that Zabbix has APM features on its roadmap, potentially for version 8. So, the landscape is always evolving! But for now, SigNoz is a noteworthy project, especially for those interested in comprehensive observability that includes metrics, traces, AND logs in one package. This log management capability could make it a simpler alternative to setting up a separate, more complex logging stack for many use cases, particularly in home labs or smaller environments.

So, what do you think? Have you tried SigNoz or other APM tools? Let me know in the comments below! If there’s interest, I can certainly make more videos exploring its features or trying out more complex scenarios.

Thanks for watching, and I’ll see you next week. A greeting from me, Dimitri!

Stay Connected with Quadrata:

📺 Subscribe to Quadrata on YouTube

💬 Join the Zabbix Italia Telegram Channel (Also great for general monitoring discussions!)
