Unlock Powerful Visualizations: Exploring Zabbix 7.0 Dashboards

Good morning everyone! Dimitri Bellini here, back with another episode on Quadrata, my channel dedicated to the world of open source and IT. Today, we’re diving back into our favorite monitoring tool, Zabbix, but focusing on an area we haven’t explored much recently: **data visualization and dashboarding**, especially with the exciting improvements in Zabbix 7.0.

For a long time, many of us (myself included!) might have leaned towards Grafana for sophisticated dashboards, and rightly so – it set a high standard. However, Zabbix has been working hard, taking inspiration from the best, and Zabbix 7.0 introduces some fantastic new widgets and capabilities that significantly boost its native dashboarding power, pushing towards better observability of our collected metrics.

Why Zabbix Dashboards Now Deserve Your Attention

Zabbix 7.0 marks a significant step forward in visualization. The web interface’s dashboard section has received substantial upgrades, introducing new widgets that make creating informative and visually appealing dashboards easier than ever. Forget needing a separate tool for basic visualization; Zabbix now offers compelling options right out of the box.

Some of the key additions in 7.0 include:

  • Gauge Widgets: For clear, immediate visualization of single metrics against thresholds.
  • Pie Chart / Donut Widgets: Classic ways to represent proportions.
  • Honeycomb Widget: Excellent for compactly displaying the status of many items (like host availability).
  • Host Navigator & Item Navigator: Powerful tools for creating dynamic, interactive dashboards where you can drill down into specific hosts and their metrics.
  • Item Value Widget: Displays a single metric’s value with trend indication.

Building a Dynamic Dashboard in Zabbix 7.0: A Walkthrough

In the video, I demonstrated how to leverage some of these new features. Let’s recap the steps to build a more dynamic and insightful dashboard:

Step 1: Creating Your Dashboard

It all starts in the Zabbix interface under Dashboards -> All dashboards. Click “Create dashboard”, give it a meaningful name (I used “test per video”), and you’ll be presented with an empty canvas, ready for your first widget.

Step 2: Adding the Powerful Graph Widget

The standard graph widget, while not brand new, has become incredibly flexible.

  • Host/Item Selection: You can use wildcards (*) for both hosts (e.g., Linux server*) and items (e.g., Bits received*) to aggregate data from multiple sources onto a single graph.
  • Aggregation: Easily aggregate data over time intervals (e.g., show the sum or average traffic every 3 minutes).
  • Stacking: Use the “Stacked” option combined with aggregation to visualize total resource usage (like total bandwidth across multiple servers).
  • Multiple Datasets: Add multiple datasets (like ‘Bits received’ and ‘Bits sent’) to the same graph for comprehensive views.
  • Customization: Control line thickness, fill transparency, handling of missing data, axis limits (e.g., setting a max bandwidth), legend display, and even overlay trigger information or working hours.

This allows for creating dense, informative graphs showing trends across groups of systems or interfaces.

Step 3: Introducing Interactivity with Navigators

This is where Zabbix 7.0 dashboards get really dynamic!

Host Navigator Setup

Add the “Host Navigator” widget. Configure it to target a specific host group (e.g., Linux Servers). You can further filter by host status (enabled/disabled), maintenance status, or tags. This widget provides a clickable list of hosts.

Item Navigator Setup

Next, add the “Item Navigator” widget. The key here is to link it to the Host Navigator:

  • In the “Host” selection, choose “From widget” and select your Host Navigator widget.
  • Specify the host group again.
  • Use “Item tags” to filter the list of items shown (e.g., show only items with the tag component having the value network).
  • Use “Group by” (e.g., group by the component tag) to organize the items logically within the navigator. (Note: In the video, I noticed a slight confusion where the UI might label tag value filtering as tag name, something to keep an eye on).

Now, clicking a host in the Host Navigator filters the items shown in the Item Navigator – the first step towards interactive drill-down!

Step 4: Visualizing Single Metrics (Gauge & Item Value)

With the navigators set up, we can add widgets that react to our selections:

Gauge Widget

Add a “Gauge” widget. Configure its “Item” setting to inherit “From widget” -> “Item Navigator”. Now, when you select an item in the Item Navigator (after selecting a host), this gauge will automatically display that metric’s latest value. Customize it with:

  • Min/Max values and units (e.g., %, BPS).
  • Thresholds (defining ranges for Green, Yellow, Red) for instant visual feedback.
  • Appearance options (angles, decimals).

Item Value Widget

Similarly, add an “Item Value” widget, also inheriting its item from the “Item Navigator”. This provides a simple text display of the value, often with a trend indicator (up/down arrow). You can customize:

  • Font size and units.
  • Whether to show the timestamp.
  • Thresholds that can change the background color of the widget for high visibility.

Step 5: Monitoring Multiple Hosts with the Honeycomb Widget

The “Honeycomb” widget is fantastic for a compact overview of many similar items across multiple hosts.

  • Configure it to target a host group (e.g., Linux Servers).
  • Select a specific item pattern relevant to status (e.g., agent.ping).
  • Set thresholds (e.g., Red if value is 0, Green if value is 1) to color-code each host’s icon based on the item value.

This gives you an immediate “at-a-glance” view of the health or status of all hosts in the group regarding that specific metric. The icons automatically resize to fit the widget space.

Putting It All Together

By combining these widgets – graphs for trends, navigators for interactivity, and gauges/item values/honeycombs for specific states – you can build truly powerful and informative dashboards directly within Zabbix 7.0. The ability to dynamically filter and drill down without leaving the dashboard is a massive improvement.

Join the Conversation!

So, that’s a first look at the enhanced dashboarding capabilities in Zabbix 7.0. There’s definitely a lot to explore, and these new tools significantly improve how we can visualize our monitoring data.

What do you think? Have you tried the new Zabbix 7.0 dashboards? Are there specific widgets or features you’d like me to cover in more detail? Let me know in the comments below!

If you found this useful, please give the video a thumbs up and consider subscribing to the Quadrata YouTube channel for more content on open source and IT.

And don’t forget to join the conversation in the ZabbixItalia Telegram Channel – it’s a great place to ask questions and share knowledge with fellow Zabbix users.

Thanks for reading, and I’ll see you in the next one!

– Dimitri Bellini

Zabbix 7.0 Synthetic Monitoring: A Game Changer for Web Performance Testing

Good morning everyone, I’m Dimitri Bellini, and welcome back to Quadrata! This is my channel dedicated to the open-source world and the IT topics I’m passionate about – and hopefully, you are too.

In today’s episode, we’re diving back into our friend Zabbix, specifically version 7.0, to explore a feature that I genuinely believe is a game-changer in many contexts: Synthetic Monitoring. This powerful capability allows us to simulate and test complex user interaction scenarios on our websites, corporate pages, and web applications. Ready to see how it works? Let’s jump in!

What Exactly is Synthetic Monitoring in Zabbix 7.0?

As I briefly mentioned, synthetic monitoring is a method used to simulate how a real user interacts with a web page or application. This isn’t just about checking if a page loads; it’s about mimicking multi-step journeys – like logging in, navigating through menus, filling out forms, or completing a checkout process.

While this concept might seem straightforward, having it seamlessly integrated into a monitoring solution like Zabbix is incredibly valuable and not always a given in other tools.

The Key Ingredients: Zabbix & Selenium

To make this magic happen, we need a couple of key components:

  • Zabbix Server or Proxy (Version 7.0+): This is our central monitoring hub.
  • Selenium: This is the engine that drives the browser simulation. I strongly recommend running Selenium within a Docker container, ideally on a machine separate from your Zabbix server for better resource management.

Selenium is a well-established framework (known for decades!) that allows us to automate browsers. One of its strengths is the ability to test using different browsers like Chrome, Edge, Firefox, and even Safari, ensuring your site works consistently across platforms. Zabbix interacts with Selenium via the WebDriver API, which essentially acts as a remote control for the browser, allowing us to send commands without writing complex, browser-specific code.

For our purposes, we’ll focus on the simpler Selenium Standalone setup, specifically using the Chrome browser container, as Zabbix currently has the most robust support for it.

How the Architecture Works

The setup is quite logical:

  1. Your Zabbix Server (or Zabbix Proxy) needs to know where the Selenium WebDriver is running.
  2. Zabbix communicates with the Selenium container (typically on port 4444) using the WebDriver protocol.
  3. Selenium receives instructions from Zabbix, executes them in a real browser instance (running inside the container), and sends the results back.

If you need to scale your synthetic monitoring checks, using Zabbix Proxies is an excellent approach. You can dedicate specific proxies to handle checks for different environments.

The Selenium Docker container also provides useful endpoints:

  • Port 4444: Besides being the WebDriver endpoint, it often hosts a web UI to view the status and current sessions.
  • Port 7900 (often): Provides a VNC/web interface to visually watch the browser automation in real-time – fantastic for debugging!

Setting Up Your Environment

Getting started involves a couple of configuration steps:

1. Zabbix Configuration

You’ll need to edit your zabbix_server.conf or zabbix_proxy.conf file and set these parameters:

  • WebDriverURL=http://<selenium-host>:4444/wd/hub (replace <selenium-host> with the actual IP/DNS of your Selenium host)
  • StartBrowserPollers=1 (Start with 1 and increase based on workload)

Remember to restart your Zabbix server or proxy after making these changes.

2. Installing the Selenium Docker Container

Running the Selenium Standalone Chrome container is straightforward using Docker. Here’s a typical command:

docker run -d -p 4444:4444 -p 7900:7900 --shm-size="2g" --name selenium-chrome selenium/standalone-chrome:latest

  • -d: Run in detached mode.
  • -p 4444:4444: Map the WebDriver port.
  • -p 7900:7900: Map the VNC/debug view port.
  • --shm-size="2g": Allocate shared memory (important for browser stability, especially Chrome).
  • --name selenium-chrome: Give the container a recognizable name.
  • selenium/standalone-chrome:latest: The Docker image to use. You can specify older versions if needed.

Crafting Your Monitoring Scripts with JavaScript

The heart of synthetic monitoring in Zabbix lies in JavaScript. Zabbix utilizes its familiar JavaScript engine, now enhanced with a new built-in object: browser.

This browser object provides methods to interact with the web page via Selenium:

  • browser.Navigate('https://your-target-url.com'): Opens a specific URL.
  • browser.FindElement(by, target): Locates an element on the page. The by parameter can be methods like browser.By.linkText('Click Me'), browser.By.tagName('button'), browser.By.xpath('//div[@id="login"]'), browser.By.cssSelector('.submit-button').
  • element.Click(): Clicks on a previously found element.
  • browser.CollectPerfEntries(): Gathers performance metrics and, crucially, takes a screenshot of the current page state.

The output of these scripts is a JSON object containing performance data (response times, status codes, page weight) and the screenshot encoded in Base64.
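
To make the structure of such a script concrete, here is a minimal sketch of a two-step Browser item script, in the spirit of the stock “Website by Browser” template. The URLs are placeholders, and the exact method names and signatures (especially around element lookups and clicks) should be double-checked against the Zabbix 7.0 Browser object documentation:

// Minimal sketch of a Zabbix 7.0 Browser item script (placeholder URLs)
var browser = new Browser(Browser.chromeOptions());

try {
    // Step 1: open the home page and record metrics plus a screenshot
    browser.navigate("https://example.com");
    browser.collectPerfEntries("home page");

    // Step 2: open a second page (for example, a login or products page)
    browser.navigate("https://example.com/products");
    browser.collectPerfEntries("products page");

    // Element lookups and clicks (browser.findElement(...), element.click())
    // would be added here to simulate true multi-step interactions.
}
finally {
    // Return the collected performance data and Base64 screenshots as JSON
    return JSON.stringify(browser.getResult());
}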

Testing Your Scripts

You can test your JavaScript code before deploying it in Zabbix:

  • Zabbix Web Interface: The item configuration page has a “Test” button.
  • zabbix_js Command-Line Tool: Useful for quick checks and debugging.

    zabbix_js --webdriver http://<selenium-host>:4444/wd/hub -i your_script.js -p 'dummy_input' -t 60

    (Remember to provide an input parameter -p even if your script doesn’t use it, and set a reasonable timeout -t. Piping the output to jq (| jq .) makes the JSON readable.)

Visualizing the Results in Zabbix

Once your main “Browser” item is collecting data (including the Base64 screenshot), you can extract specific pieces of information using:

  • Dependent Items: Create items that depend on your main Browser item.
  • Preprocessing Steps: Use JSONPath preprocessing rules within the dependent items to pull out specific metrics (e.g., $.steps[0].responseTime).
  • Binary Item Type: Zabbix 7.0 introduces a binary item type specifically designed to handle data like Base64 encoded images.

This allows you to not only graph performance metrics but also display the actual screenshots captured during the check directly in your Zabbix dashboards using the new Item History widget. Seeing a visual snapshot of the page, especially when an error occurs, is incredibly powerful for troubleshooting!

Looking Ahead

Synthetic monitoring in Zabbix 7.0 is a fantastic addition, opening up new possibilities for ensuring application availability and performance from an end-user perspective.

Here at Quadrata, we’re actively exploring ways to make creating these JavaScript scenarios even easier for everyone. We might even have something to share at the Zabbix Summit this year, so stay tuned!

I hope this overview gives you a solid start with Zabbix 7’s synthetic monitoring. It’s a feature with immense potential.

What do you think? Have you tried it yet, or do you have specific use cases in mind? Let me know in the comments below!

If you found this video helpful, please give it a thumbs up and consider subscribing to the Quadrata channel for more open-source and IT content.

Don’t forget to join our Italian Zabbix community on Telegram: ZabbixItalia!

Thanks for watching, and see you next week. Bye everyone!

Zabbix 7.0 LTS is Almost Here: A Deep Dive into the Next Generation of Monitoring

Good morning everyone, Dimitri Bellini here from the Quadrata channel! It’s fantastic to have you back on my corner of the internet dedicated to the world of open source and IT. First off, a huge thank you – we’ve recently crossed the 800 subscriber mark, which is amazing! My goal is to hit 1000 this year, so if you haven’t already, please consider subscribing!

This week, we’re diving into something truly exciting: the upcoming release of Zabbix 7.0. It’s just around the corner, and while we wait for the final release, the Release Candidate 1 already packs all the features we expect to see. So, let’s explore what makes this version so special.

Why Zabbix 7.0 LTS is a Game Changer

Zabbix 7.0 isn’t just another update; it’s an LTS (Long Term Support) version. This is crucial because LTS releases are designed for stability and longevity, typically receiving support for three years (extendable to five for security). This makes them the ideal choice for production environments where reliability and long-term planning are paramount. Updating complex systems isn’t always easy, so an LTS version provides that much-needed stability.

The previous LTS, version 6.0, was released back in February 2022. While the usual cycle is about 1.5 years between LTS versions, 7.0 took a bit longer. This extended development time was necessary to incorporate significant changes, both under the hood and in terms of user-facing features. Trust me, the wait seems worth it!

Don’t panic if you’re on 6.0 – full support continues until February 28, 2025. You have ample time to plan your migration. In fact, I often suggest waiting for a few minor point releases (like 7.0.3 or 7.0.5) to let the dust settle before deploying in critical environments.

Bridging the Gap: Key Enhancements Since Zabbix 6.0

Since many users stick with LTS versions in production and might have skipped the intermediate 6.2 and 6.4 releases, I want to cover the major improvements introduced between 6.0 and 7.0. There’s a lot to unpack!

Performance, Scalability, and Architecture Boosts

  • Automatic Configuration Sync (Server-Proxy-Agent): This is huge! Previously, configuration changes on the GUI could take time (often a minute or more by default) to propagate through the server, proxies, and agents due to database polling cycles. Now, changes trigger near-instantaneous updates across the chain using a differential sync mechanism. This means faster deployments of new hosts and checks, and less load as only the *changes* are pushed.
  • SNMP Bulk Monitoring: Instead of multiple individual `get` requests, Zabbix can now group SNMP requests, reducing load on both the Zabbix server and the monitored devices.
  • Asynchronous Polling: Passive checks (SNMP, HTTP, Zabbix agent passive) are now asynchronous. The server sends the request and moves on, processing the data as it arrives. This significantly improves scalability, especially for large environments.
  • Proxy High Availability (HA) and Load Balancing: Finally! You can group proxies together for automatic failover and load balancing. If a proxy in the group fails, its assigned hosts are automatically redistributed to other available proxies in the group. We’ll look at a demo of this shortly!
  • Customizable Item Timeouts: The previous global 30-second timeout limit is gone. Now you can set timeouts per item (up to 10 minutes), providing flexibility for checks that naturally take longer.
  • Proxy Backward Compatibility: Upgrading is now less stressful. A Zabbix 7.0 server can work with 6.0 proxies in an “outdated” state. They’ll continue sending data and executing remote commands, giving you time to upgrade the proxies without a complete “big bang” cutover. Configuration updates for hosts on outdated proxies won’t work, however.

Enhanced Integrations and User Management

  • LDAP/AD Just-in-Time (JIT) User Provisioning: A highly requested feature for enterprise environments. Zabbix can now automatically create and update user accounts based on your Active Directory or LDAP server upon their first login. You can map attributes like email and phone numbers, and even assign Zabbix roles based on AD/LDAP groups. Plus, support for multiple LDAP servers is included.
  • Expanded Vault Support: Alongside HashiCorp Vault, Zabbix 7.0 now integrates with CyberArk for external secret management.
  • Real-time Data Streaming: Push metrics and events efficiently to external systems like Kafka or Splunk. Crucially, you can filter what gets sent based on tags (e.g., send only high-priority events, or only network-related metrics), allowing for sophisticated data routing to different data lakes or analysis tools.

Improved Monitoring & User Experience

  • Advanced Web Monitoring with Selenium: This is potentially revolutionary for Zabbix. The web monitoring capabilities have been rebuilt using Selenium, allowing you to simulate real user interactions (clicking buttons, filling forms) and monitor user experience, page performance, and functionality directly within Zabbix.
  • Manual Problem Suppression: Acknowledge known issues or problems occurring during maintenance windows by suppressing them temporarily (indefinitely or for a defined period). They’ll reappear if the issue persists after the suppression window.
  • Clearer Agent Availability: The agent availability status is now more intuitive, incorporating a heartbeat mechanism to clearly show if an agent (active or passive) is truly up and running. The corresponding dashboard widget has also been revamped.
  • UI and Template Upgrades: Continuous improvements to the graphical interface, new widgets (like improved pie charts/gauges), and significantly enhanced templates, especially for VMware, AWS, Azure, and Google Cloud monitoring.
  • Host Prototype Functionality Extension: More flexibility in managing discovered hosts, including adding templates and tags more easily.

Hands-On: Exploring Proxy High Availability (Demo Recap)

I set up a quick Docker environment with a Zabbix 7.0 RC1 server and three proxies to test the new Proxy HA feature. Here’s a summary of how it works:

  1. Create a Proxy Group: Under `Administration -> Proxy groups`, define a group (e.g., “MyHAProxies”). Set the `Failover period` (how quickly hosts move after a proxy failure – I used 1 minute for the demo) and the `Minimum available proxies` (the minimum number needed for the group to be considered operational).
  2. Assign Proxies to the Group: Edit each proxy you want in the group (`Administration -> Proxies`) and assign it to the created Proxy Group. You also need to specify the proxy address for active agents, as agent configuration changes with this feature.
  3. Assign Hosts to the Proxy Group: When configuring a host, instead of selecting a specific proxy, select the Proxy Group. Zabbix automatically assigns the host to the least loaded proxy within that group initially.
  4. Simulate Failure: I stopped one of the proxy containers.
  5. Observe Failover: After the configured failover period (1 minute), Zabbix detected the proxy was down. The hosts monitored by that proxy were automatically redistributed among the remaining two active proxies in the group. Crucially, checking the `Latest data` for a moved host showed minimal interruption in data collection.
  6. Recovery: When I restarted the stopped proxy, Zabbix detected it coming back online. After a stabilisation period (to avoid flapping), it automatically started rebalancing the hosts back across all three proxies.
  7. Monitoring the Group: A new internal item `zabbix[proxy_group,,state]` allows you to monitor the health of the proxy group itself (e.g., online, degraded). This is essential for alerting! (Note: I’ve asked the Zabbix team to add this to the default server health template).

This feature significantly enhances the resilience of your Zabbix monitoring infrastructure, especially in distributed environments.

Planning Your Upgrade to Zabbix 7.0

Upgrading requires careful planning. Here’s a checklist:

  • Read the Release Notes: Understand all the new features and changes.
  • Check Requirements: Ensure your OS, database, and library versions meet the new requirements for Zabbix 7.0. You might need to upgrade your underlying infrastructure (e.g., moving from RHEL 7 to RHEL 8 or 9).
  • Review Breaking Changes & Known Issues: The official Zabbix documentation has sections dedicated to these. Pay close attention to avoid surprises.
  • CRITICAL: Primary Keys: Zabbix 6.0 introduced optional primary keys for history tables. If you upgraded from 5.x to 6.x and didn’t manually add them (it could be a slow process), **now is the time**. While Zabbix 7.0 *might* run without them, future versions *will require* them. Adding primary keys also provides a significant performance boost (often 20%+). Factor this database maintenance into your upgrade plan if needed.
  • Follow the Official Upgrade Procedure: Stick to the step-by-step guide in the Zabbix documentation for your specific environment.
  • Backup Everything: Before you start, ensure you have reliable backups of your Zabbix database and configuration files.

Conclusion: Get Ready for Zabbix 7.0!

Zabbix 7.0 LTS is shaping up to be a monumental release. The focus on scalability, high availability, usability, and deeper integration makes it incredibly compelling for both existing users and those considering Zabbix for their monitoring needs. Features like Proxy HA, LDAP JIT provisioning, and the new Selenium-based web monitoring are truly exciting developments.

I’ll definitely be exploring the new web monitoring capabilities in more detail once the documentation is fully available, so stay tuned for that!

What feature in Zabbix 7.0 are you most excited about? Let me know in the comments below!

If you found this overview helpful, please give the video a thumbs up, share it, and subscribe to Quadrata to help us reach that 1000 subscriber goal!

Also, feel free to join the discussion on the ZabbixItalia Telegram Channel.

Thanks for watching/reading, have a great week, and see you next time!

– Dimitri Bellini

Visualizing Your Infrastructure: A Deep Dive into Zabbix Maps

Good morning everyone, Dimitri Bellini here, and welcome back to Quadrata – your go-to channel for open source and IT solutions! Today, I want to dive into a feature of our good friend Zabbix that we haven’t explored much yet: Zabbix Maps.

Honestly, I was recently working on some maps, and while it might not always be the most glamorous part of Zabbix, it sparked an idea: why not share what these maps are truly capable of, and perhaps more importantly, what they aren’t?

What Zabbix Maps REALLY Are: Your Digital Synoptic Panel

Think of Zabbix Maps as the modern, digital equivalent of those old-school synoptic panels with blinking lights. They provide a powerful graphical way to represent your infrastructure and its status directly within Zabbix. Here’s what you can achieve:

  • Real-time Host Status: Instantly see the overall health of your hosts based on whether they have active problems.
  • Real-time Event Representation: Visualize specific problems (triggers) directly on the map. Imagine a specific light turning red only when a critical service fails.
  • Real-time Item Metrics: Display actual data values (like temperature, traffic throughput, user counts) directly on your map, making data much more intuitive and visually appealing.

The core idea is to create a custom graphical overview tailored to your specific infrastructure, giving you an immediate understanding of what’s happening at a glance.

Clearing Up Misconceptions: What Zabbix Maps Are NOT

It’s crucial to understand the limitations to use maps effectively. Often, people hope Zabbix Maps will automatically generate network topology diagrams.

  • They are NOT Automatic Network Topology Maps: While you *could* manually build something resembling a network diagram, Zabbix doesn’t automatically discover devices and map their connections (who’s plugged into which switch port, etc.). Tools that attempt this often rely on protocols like Cisco’s CDP or the standard LLDP (both usually SNMP-based), which aren’t universally available across all devices. Furthermore, in large environments (think thousands of hosts and hundreds of switches), automatically generated topology maps quickly become an unreadable mess of tiny icons and overlapping lines. They might look cool initially but offer little practical value day-to-day.
  • They are NOT Application Performance Monitoring (APM) Relationship Maps (Yet!): Zabbix Maps don’t currently visualize the intricate relationships and data flows between different application components in the way dedicated APM tools do. While Zabbix is heading towards APM capabilities, the current map function isn’t designed for that specific purpose.

For the nitty-gritty details, I always recommend checking the official Zabbix documentation – it’s an invaluable resource.

Building Blocks of a Zabbix Map

When constructing your map, you have several element types at your disposal:

  • Host: Represents a monitored device. Its appearance can change based on problem severity.
  • Trigger: Represents a specific problem condition. You can link an icon’s appearance directly to a trigger’s state.
  • Map: Allows you to create nested maps. The icon for a sub-map can reflect the most severe status of the elements within it – great for drilling down!
  • Image: Use custom background images or icons to make your map visually informative and appealing.
  • Host Group: Automatically display all hosts belonging to a specific group within a defined area on the map.
  • Shape: Geometric shapes (rectangles, ellipses) that can be used for layout, grouping, or, importantly, displaying text and real-time data.
  • Link: Lines connecting elements. These can change color or style based on a trigger’s status, often used to represent connectivity or dependencies.

Zabbix also provides visual cues like highlighting elements with problems or showing small triangles to indicate a recent status change, helping you focus on what needs attention.

Bringing Maps to Life with Real-Time Data

One of the most powerful features is embedding live data directly onto your map. Instead of just seeing if a server is “up” or “down,” you can see its current CPU load, network traffic, or application-specific metrics.

This is typically done using Shapes and a specific syntax within the shape’s label. In Zabbix 6.x and later, the syntax looks something like this:

{?last(/Your Host Name/your.item.key)}

This tells Zabbix to display the last received value for the item your.item.key on the host named Your Host Name. You can add descriptive text around it, like:

CPU Load: {?last(/MyWebServer/system.cpu.load[,avg1])}

Zabbix is smart enough to often apply the correct unit (like Bps, %, °C) automatically if it’s defined in the item configuration.

Let’s Build a Simple Map (Quick Guide)

Here’s a condensed walkthrough based on what I demonstrated in the video (using Zabbix 6.4):

  1. Navigate to Maps: Go to Monitoring -> Maps.
  2. Create New Map: Click “Create map”. Give it a name (e.g., “YouTube Test”), set dimensions, and optionally choose a background image.

    • Tip: You can upload custom icons and background images under Administration -> General -> Images. I uploaded custom red/green icons and a background for the demo.

  3. Configure Map Properties: Decide on options like “Icon highlighting” (the colored border around problematic hosts) and “Mark elements on trigger status change” (the triangles for recent changes). You can also filter problems by severity or hide labels if needed. Click “Add”.
  4. Enter Constructor Mode: Open your newly created map and click “Constructor”.
  5. Add a Trigger-Based Icon:

    • Click “Add element” (defaults to a server icon).
    • Click the new element. Change “Type” to “Trigger”.
    • Under “Icons”, select your custom “green” icon for the “Default” state and your “red” icon for the “Problem” state.
    • Click “Add” next to “Triggers” and select the specific trigger you want this icon to react to.
    • Click “Apply”. Position the icon on your map.

  6. Add Real-Time Data Display:

    • Click “Add element” and select “Shape” (e.g., Rectangle).
    • Click the new shape. In the “Label” field, enter your data syntax, e.g., Temp: {?last(/quadrata-test-host/test.item)} (replace with your actual host and item key).
    • Customize font size, remove the border (set Border width to 0), etc.
    • Click “Apply”. Position the shape.
    • Important: In the constructor toolbar, toggle “Expand macros” ON to see the live data instead of the syntax string.

  7. Refine and Save: Adjust element positions (you might want to turn off “Snap to grid” for finer control). Remove default labels if they clutter the view (Map Properties -> Map element label type -> Nothing). Click “Update” to save your changes.

Testing with `zabbix_sender`

A fantastic tool for testing maps (especially with trapper items) is the zabbix_sender command-line utility. It lets you manually push data to Zabbix items.

Install the `zabbix-sender` package if you don’t have it. The basic syntax is:

zabbix_sender -z <zabbix-server> -s <host-name> -k <item-key> -o <value>

For example:

zabbix_sender -z 192.168.1.100 -s quadrata-test-host -k test.item -o 25

Sending a value that crosses a trigger threshold will change your trigger-linked icon on the map. Sending a different value will update the real-time data display.

Wrapping Up

So, there you have it – a look into Zabbix Maps. They aren’t magic topology generators, but they are incredibly flexible and powerful tools for creating meaningful, real-time visual dashboards of your infrastructure’s health and performance. By combining different elements, custom icons, backgrounds, and live data, you can build truly informative synoptic views.

Don’t be afraid to experiment! Start simple and gradually add complexity as you get comfortable.

What are your thoughts on Zabbix Maps? Have you created any cool visualizations? Share your experiences or ask questions in the comments below!

If you found this helpful, please give the video a thumbs up, share it, and subscribe to Quadrata for more content on Zabbix and open source solutions.

Also, feel free to join the conversation in the Zabbix Italia Telegram channel – it’s a great community!

Thanks for reading, and I’ll see you in the next post!

– Dimitri Bellini

Unlock Your Documents Potential with Ragflow: An Open-Source RAG Powerhouse

Good morning, everyone! Dimitri Bellini here, back on the Quadrata channel – your spot for diving into the exciting world of open source and IT tech. Today, we’re tackling a topic some of you have asked about: advanced solutions for interacting with your own documents using AI.

I sometimes wait to showcase software until it’s a bit more polished, and today, I’m excited to introduce a particularly interesting one: Ragflow.

What is RAG and Why Should You Care?

We’re diving back into the world of RAG solutions – Retrieval-Augmented Generation. It sounds complex, but the core idea is simple and incredibly useful: using your *own* documents (manuals, reports, notes, anything on your disk) as a private knowledge base for an AI.

Instead of relying solely on the general knowledge (and potential inaccuracies) of large language models (LLMs), RAG lets you get highly relevant, context-specific answers based on *your* information. This is a practical, powerful use case for AI, moving beyond generic queries to solve specific problems using local data.

Introducing Ragflow: A Powerful Open-Source RAG Solution

Ragflow (find it on GitHub!) stands out from other RAG tools I’ve explored. It’s not just a basic framework; it’s shaping up to be a comprehensive, business-oriented platform. Here’s why it caught my eye:

  • Open Source: Freely available and community-driven.
  • Complete Solution: Offers a wide range of features out-of-the-box.
  • Collaboration Ready: Designed for teams to work on shared knowledge bases.
  • Easy Installation: Uses Docker Compose for a smooth setup.
  • Local First: Integrates seamlessly with local LLM providers like Ollama (which I use).
  • Rapid Development: The team is actively adding features and improvements.
  • Advanced Techniques: Incorporates methods like Self-RAG and Raptor for better accuracy.
  • API Access: Allows integration with other applications.

Diving Deeper: How Ragflow Enhances RAG

Ragflow isn’t just about basic document splitting and embedding. It employs sophisticated techniques:

  • Intelligent Document Analysis: It doesn’t just grab text. Ragflow performs OCR and analyzes document structure (understanding tables in Excel, layouts in presentations, etc.) based on predefined templates. This leads to much better comprehension and more accurate answers.
  • Self-RAG: A framework designed to improve the quality and factuality of the LLM’s responses, reducing the chances of the AI “inventing” answers (hallucinations) when it doesn’t know something.
  • Raptor: This technique focuses on the document processing phase. For long, complex documents, Raptor builds a hierarchical summary or tree of concepts *before* chunking and embedding. This helps the AI maintain context and understand the overall topic better.

These aren’t trivial features; they represent significant steps towards making RAG systems more reliable and useful.

Getting Started: Installing Ragflow (Step-by-Step)

Installation is straightforward thanks to Docker Compose. Here’s how I got it running:

  1. Clone the Repository (Important Tip!): Use the `--branch` flag to specify a stable release version. This saved me some trouble during testing. Replace `release-branch-name` with the desired version (e.g., `0.7.0`).

    git clone --branch release-branch-name https://github.com/infiniflow/ragflow.git

  2. Navigate to the Docker Directory:

    cd ragflow/docker

  3. Make the Entrypoint Script Executable:

    chmod +x entrypoint.sh

  4. Start the Services: This will pull the necessary images (including Ragflow, MySQL, Redis, MinIO, Elasticsearch) and start the containers.

    docker-compose up -d

    Note: Be patient! The Docker images, especially the main Ragflow one, can be quite large (around 9GB in my tests), so ensure you have enough disk space.

Once everything is up, you can access the web interface (usually at `http://localhost:80` or check the Docker Compose file/logs for the exact port).

A Look Inside: Configuring and Using Ragflow

The web interface is clean and divided into key sections: Knowledge Base, Chat, File Manager, and Settings.

Setting Up Your AI Models (Ollama Example)

First, you need to tell Ragflow which AI models to use. Go to your profile settings -> Model Providers.

  • Click “Add Model”.
  • Select “Ollama”.
  • Choose the model type: “Chat” (for generating responses) or “Embedding” (for analyzing documents). You’ll likely need one of each.
  • Enter the **exact** model name as it appears in your Ollama list (e.g., `mistral:latest`, `nomic-embed-text:latest`).
  • Provide the Base URL for your Ollama instance (e.g., `http://your-ollama-ip:11434`).
  • Save the model. Repeat for your embedding model if it’s different. I used `nomic-embed-text` for embeddings and `weasel-lm-7b-v1-q5_k_m` (a fine-tuned model) for chat in my tests.

Creating and Populating a Knowledge Base (Crucial Settings)

This is where your documents live.

  • Create a new Knowledge Base and give it a name.
  • Before Uploading: Go into the KB settings. This is critical! Define:

    • Language: The primary language of your documents.
    • Chunking Method: How documents are split. Ragflow offers templates like “General”, “Presentation”, “Manual”, “Q&A”, “Excel”, “Resume”. Choose the one that best fits your content. I used “Presentation” for my Zabbix slides.
    • Embedding Model: Select the Ollama embedding model you configured earlier.
    • Raptor: Enable this for potentially better context handling on complex docs.

  • Upload Documents: Now you can upload files or entire directories.
  • Parse Documents: Click the “Parse” button next to each uploaded document. Ragflow will process it using the settings you defined (OCR, chunking, embedding, Raptor analysis). You can monitor the progress.

Building Your Chat Assistant

This connects your chat model to your knowledge base.

  • Create a new Assistant.
  • Give it a name and optionally an avatar.
  • Important: Set an “Empty Response” message (e.g., “I couldn’t find information on that in the provided documents.”). This prevents the AI from making things up.
  • Add a welcome message.
  • Enable “Show Citation”.
  • Link Knowledge Base: Select the KB you created.
  • Prompt Engine: Review the system prompt. The default is usually quite good, instructing the AI to answer based *only* on the documents.
  • Model Setting: Select the Ollama chat model you configured. Choose a “Work Mode” like “Precise” to encourage focused answers.
  • (Optional) Re-ranking Model: I skipped this in version 0.7 due to some issues, but it’s a feature to watch.
  • Confirm and save.

Putting Ragflow to the Test (Zabbix Example)

I loaded my Zabbix presentation slides and asked the assistant some questions:

  • Explaining Zabbix log file fields.
  • Identifying programming languages used in Zabbix components.
  • Differentiating between Zabbix Agent Passive and Active modes.
  • Describing the Zabbix data collection flow.

The results were genuinely impressive! Ragflow provided accurate, detailed answers, citing the specific slides it drew information from. There was only one minor point where I wasn’t entirely sure if the answer was fully grounded in the text or slightly inferred, but overall, the accuracy and relevance were excellent, especially considering it was analyzing presentation slides.

Integrating Ragflow with Other Tools via API

A standout feature is the built-in API. For each assistant you create, you can generate an API key. This allows external applications to query that specific assistant and its associated knowledge base programmatically – fantastic for building custom integrations.

Final Thoughts and Why Ragflow Stands Out

Ragflow is a compelling RAG solution. Its focus on accurate document analysis, integration of advanced techniques like Self-RAG and Raptor, ease of use via Docker and Ollama, and the inclusion of collaboration and API features make it feel like a mature, well-thought-out product, despite being relatively new.

While it’s still evolving (as seen with the re-ranking feature I encountered), it’s already incredibly capable and provides a robust platform for anyone serious about leveraging their own documents with AI.

What do you think? Have you tried Ragflow or other RAG solutions? What are your favourite use cases for chatting with your own documents?

Let me know in the comments below! I’m always keen to hear your experiences and suggestions for tools to explore.

Don’t forget to give this video a thumbs up if you found it helpful, and subscribe to the Quadrata channel for more open-source tech deep dives.

Also, if you’re interested in Zabbix, join our friendly community on Telegram: Zabbix Italia.

Thanks for watching, and see you next week!

– Dimitri Bellini

Deep Dive into Zabbix Database Monitoring with ODBC

Good morning everyone, and welcome back to my channel, Quadrata! I’m Dimitri Bellini, and this week, we’re diving deep into Zabbix again. Why? Because many of you in our fantastic Zabbix Italia Telegram channel asked for a closer look at database monitoring with Zabbix.

Zabbix offers a few ways to monitor databases – using Zabbix Agent 2, User Parameters, and ODBC. While Agent 2 is great for standard metrics and User Parameters offer script-based flexibility (which I personally enjoy!), today we’re focusing on what I consider the most elegant and powerful method for many scenarios: **ODBC (Open Database Connectivity)**.

What is ODBC and Why Use It with Zabbix?

ODBC is a standard API (Application Programming Interface) born years ago, designed to allow applications to communicate with various databases without needing to know the specifics of each database system. Think of it as a universal translator for databases.

Here’s the basic idea:

  • Your application (in our case, Zabbix Server or Proxy) talks to the ODBC layer using standard SQL commands.
  • The ODBC layer uses specific **drivers** for each database type (MySQL, Oracle, Postgres, etc.).
  • These drivers handle the native communication with the target database.

This decoupling means Zabbix doesn’t need built-in drivers for every database, making it more flexible. On Linux systems, the common implementation is UnixODBC.

Setting Up ODBC for Zabbix

Installation (Linux Example)

Getting UnixODBC installed is usually straightforward as it’s often in standard repositories. For Red Hat-based systems (like Rocky Linux, AlmaLinux), you’d typically run:

dnf install unixodbc unixodbc-devel

(The `devel` package might not always be necessary, but can be helpful).

After installing UnixODBC, you need the specific ODBC driver for the database you want to monitor.

Crucial Driver Note for MySQL/MariaDB

Important! If you’re monitoring MySQL or MariaDB with Zabbix, you currently need to use the MariaDB ODBC connector. Due to licensing complexities, Zabbix Server/Proxy binaries are often compiled against the MariaDB libraries.

Install it like this (on Red Hat-based systems):

dnf install MariaDB-connector-odbc

While the official MySQL ODBC driver might work fine with command-line tools like `isql`, Zabbix itself likely won’t be able to use it directly. Stick with the MariaDB connector for compatibility.

Configuration Files: odbcinst.ini and odbc.ini

UnixODBC typically uses two main configuration files:

  • odbcinst.ini: Defines the available drivers, giving them an alias and pointing to the driver library file.
  • odbc.ini: Defines Data Source Names (DSNs). A DSN is a pre-configured connection profile containing the driver alias, server address, port, user, password, and database name.
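
As a concrete sketch, here is what a minimal pair of files for the MariaDB connector discussed above might look like; the driver library path, DSN name, and credentials are placeholders to adapt to your environment:

# /etc/odbcinst.ini: registers the driver under the alias "MariaDB"
# (the library path may differ depending on your distribution)
[MariaDB]
Description = MariaDB ODBC connector
Driver      = /usr/lib64/libmaodbc.so

# /etc/odbc.ini: defines a DSN that Zabbix or isql can reference by name
[zabbix-db]
Description = Zabbix backend database
Driver      = MariaDB
Server      = 127.0.0.1
Port        = 3306
Database    = zabbix
User        = zbx_monitor
Password    = secret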

Two Approaches to Connection: DSN vs. Connection String

1. The DSN Method (Using odbc.ini)

You can define all your connections in odbc.ini and then simply refer to the DSN name in Zabbix. You can test this setup from the command line using the isql utility:

isql your_dsn_name

If it connects successfully, you’ll get a prompt, confirming your ODBC setup works.

However, I personally find this method less flexible for Zabbix. Managing static entries in odbc.ini for potentially hundreds of databases can become cumbersome.

2. The Connection String Method (My Preferred Way!)

Instead of relying on odbc.ini for connection details, you can provide all the necessary information directly within Zabbix using a **connection string**. This bypasses the need for DSN entries in odbc.ini (though odbcinst.ini is still needed to define the driver itself).

You can test this approach from the command line too:

isql -k "Driver=YourDriverAlias;Server=your_db_host;Port=3306;User=your_user;Password=your_password;Database=your_db;"

(Replace the placeholders with your actual details. The exact parameters might vary slightly depending on the driver).

This method offers much greater flexibility, especially when combined with Zabbix templates and macros, as we’ll see.

Configuring Zabbix for ODBC Monitoring

Creating the Zabbix Item

To monitor a database using ODBC in Zabbix:

  1. Create a new item for your host.
  2. Set the **Type** to Database monitor.
  3. The **Key** format is crucial: db.odbc.select[<unique description>,<dsn>].

    • <unique description>: A unique name for this specific check within the host (e.g., `mysql_version`, `user_count`). This ensures the key is unique.
    • <dsn>: Here you either put your DSN name (if using odbc.ini) OR the full connection string (if bypassing odbc.ini).

  4. If using the connection string method, you leave the DSN part empty but include the connection string within the key’s parameters, often enclosed in quotes if needed, or directly if simple. *Correction from video explanation: The Zabbix key structure is slightly different. It’s `db.odbc.select[unique_name,dsn]` or `db.odbc.select[unique_name,,connection_string]`. Notice the double comma when omitting the DSN.*
  5. The most important field is the **SQL query** field. This is where you put the actual SQL query you want Zabbix to execute.
  6. You can optionally provide Username and Password in the dedicated fields, which might override or complement details in the DSN/connection string depending on the driver and configuration.
  7. Set the **Type of information** based on the expected query result (Numeric, Text, etc.).

Example with Connection String

Here’s how an item key might look using the connection string method (notice the empty DSN parameter indicated by the double comma):

db.odbc.select[CountZabbixTables,, "Driver=MariaDB;Server=127.0.0.1;Port=3306;Database=zabbix;User={$ZABBIX_DB_USER};Password={$ZABBIX_DB_PASSWORD};"]

And in the **SQL query** field for this item, I might put:

SELECT count(*) FROM information_schema.tables WHERE table_schema = 'zabbix';

Leveraging User Macros for Flexibility

The real power of the connection string method shines when you use Zabbix User Macros (like {$ZABBIX_DB_USER}, {$ZABBIX_DB_PASSWORD}, {$DB_HOST}, {$DB_NAME}) within the string. This allows you to create generic templates and customize the connection details per host via macros – incredibly useful for large or complex environments!

ODBC for Custom Queries vs. Agent 2 for System Health

It’s important to understand the typical use cases:

  • Standard Zabbix Templates (Agent 2 or ODBC): These usually focus on monitoring the *health and performance* of the database system itself (e.g., queries per second, buffer usage, connection counts, uptime).
  • Manual ODBC Items (like we’re discussing): This method is **perfect for running custom SQL queries**. Need to check the number of rows in a specific table? Verify if a critical configuration value exists? Confirm application data is being populated correctly? ODBC monitoring configured this way is your go-to solution.

While you *could* potentially use Agent 2 with User Parameters and scripts for custom queries, ODBC often provides a cleaner, more centralized, and integrated way to achieve this directly from the Zabbix Server or Proxy.

Putting It All Together: A Practical Example

In the video, I demonstrated creating an item to count the number of tables in my Zabbix database using the connection string method. The key steps were:

  1. Define the item with Type `Database monitor`.
  2. Construct the key using `db.odbc.select` and the connection string.
  3. Enter the `SELECT count(*)…` query in the SQL query field.
  4. Use the fantastic **Test** button in Zabbix! This lets you immediately check if the connection works and the query returns the expected data (in my case, 173 tables) without waiting for the item’s update interval.

This confirms the connection from Zabbix to the database via ODBC is working correctly.

Troubleshooting Tips

If things don’t work right away (which happens!), follow these steps:

  • Test Connectivity First: Always use the `isql` command-line tool (either with the DSN or the `-k “connection_string”`) to verify basic ODBC connectivity *before* configuring Zabbix.
  • Check Logs: Examine the Zabbix Server or Zabbix Proxy logs for detailed error messages related to database monitors.
  • Consult Documentation: The official Zabbix documentation has a dedicated section on ODBC monitoring.
  • Verify Driver Path: Ensure the driver path specified in `odbcinst.ini` is correct.
  • Permissions: Make sure the database user configured for Zabbix has the necessary permissions to connect and execute the query.
  • Take a Breath: Rushing leads to mistakes. Double-check configurations, read errors carefully, and approach it methodically.

Conclusion and Next Steps

ODBC monitoring in Zabbix is a highly flexible and powerful tool, especially when you need to go beyond standard system metrics and execute custom SQL queries to validate data or check specific application states. While the initial setup requires careful attention to drivers and connection details, the connection string method combined with user macros offers excellent scalability.

What are your experiences with Zabbix database monitoring? Do you prefer ODBC, Agent 2, or User Parameters? Share your thoughts and questions in the comments below!

If you found this helpful, please give the video a thumbs up and subscribe to Quadrata for more Zabbix content. Don’t forget to join our Zabbix Italia Telegram channel!

Stay tuned – with Zabbix 6.4 just around the corner, I’ll likely be covering the “What’s New” very soon!

Thanks again for watching and reading. Happy monitoring!

Ciao,
Dimitri Bellini

PFSense vs. OPNsense: Choosing Your Open Source Firewall Champion

Good morning everyone! Dimitri Bellini here, back with you on Quadrata, my channel dedicated to the fascinating world of open source and IT. You might notice a new backdrop today – still working on the aesthetics, but the content is what matters!

Today, we’re shifting gears slightly from our usual Zabbix discussions to explore another critical area in the open-source landscape: firewall solutions. I’m not just talking about the basic firewall tools within a Linux distro, but dedicated distributions that create powerful network appliances.

Why Dedicated Open Source Firewalls?

Why opt for something like PFSense or OPNsense instead of simpler solutions? Well, whether you’re running a home lab, a small office, or even looking for robust solutions without an enterprise price tag, these distributions offer incredible value. They are designed to be:

  • Robust and Reliable: Often running for years without issues, especially on dedicated hardware (like those small, multi-port appliances you often see).
  • Feature-Rich: Going beyond basic packet filtering to offer comprehensive network management.
  • Versatile: Capable of handling routing, complex NAT scenarios, port forwarding, secure VPN access (site-to-site or remote user), intrusion detection, content filtering, and much more.

Essentially, if you need to manage traffic between your LAN and the internet, securely expose services, connect different locations, or simply have granular control over your network, these tools are invaluable. You can test them easily on a VM using VMware, VirtualBox, or my personal favourite, Proxmox, or deploy them on dedicated, often fanless, hardware appliances.

Meet the Contenders: PFSense and OPNsense

Two names dominate the open-source firewall scene today: PFSense and OPNsense. You’ve likely heard of PFSense; it’s been a stalwart in this space for decades. OPNsense (or OpenSense, as some say) is a newer, but rapidly growing, alternative.

Interestingly, both projects share common ancestry, tracing their roots back to the venerable M0n0wall project. PFSense emerged as an effort to expand M0n0wall’s capabilities. Later, around 2015, as M0n0wall’s development ceased, OPNsense was forked, partly from PFSense code, by developers aiming for a more modern approach and a fully open-source path, perhaps diverging from the direction PFSense was taking with its commercial backing.

PFSense: The Established Powerhouse

PFSense, particularly in its Community Edition (CE), offers a vast array of features:

  • Core Networking: Stateful Packet Inspection, robust NAT capabilities, advanced routing.
  • VPN Support: Includes OpenVPN, IPsec, and WireGuard (WireGuard often requires installing a package).
  • Security Features: Intrusion Detection Systems (IDS/IPS via packages like Suricata or Snort), Captive Portal, Anti-Lockout rules.
  • Management: A comprehensive web GUI for configuration, monitoring, and High Availability (HA) setups using CARP.
  • Extensibility: A package manager allows adding functionality like PFBlockerNG (IP/DNS blocking), Zabbix agents/proxies, and more.

PFSense is backed by the company NetGate, which offers commercial support and hardware appliances running PFSense Plus. While based on the community version, PFSense Plus is positioned as a separate product with potentially faster updates, additional features (especially related to performance and hardware offloading), and professional support. This is a key distinction: PFSense CE updates might sometimes lag behind the Plus version.

The user interface, while powerful, is often described as more traditional or “retro” compared to OPNsense. It’s built on a FreeBSD base (currently FreeBSD 14 for the latest CE at the time of recording).

OPNsense: The Modern, Fully Open Challenger

OPNsense aims for a similar feature set but with a strong emphasis on usability, security, and a truly open-source model:

  • Core Networking: All the essentials like Stateful Packet Inspection, NAT, Routing, VLAN support, GeoIP blocking.
  • VPN Support: Also features OpenVPN, IPsec, and WireGuard integrated.
  • Security Enhancements: Notably includes built-in Two-Factor Authentication (2FA) support, which is a great plus for securing the firewall itself. It also has strong reporting and traffic visualization tools (Insight).
  • Management: Features a modern, clean, and arguably more intuitive web GUI. High Availability is also supported.
  • Extensibility: Offers plugins for various services, including Zabbix agents/proxies, Let’s Encrypt certificate management, etc.

The biggest philosophical difference lies in its licensing and development model. OPNsense is fully open source under a BSD license. While there’s a “Business Edition” offering professional support (from Deciso B.V., the company heavily involved), the software itself is identical to the free version. There are no feature differences or separate “plus” tiers. Updates tend to be more frequent, often released on a fixed schedule (e.g., twice a year for major releases, with patches in between).

It’s also based on FreeBSD (currently FreeBSD 13.x at the time of recording, though this changes).

Key Differences Summarized

  • Licensing Model: PFSense has CE (free) and Plus (commercial, tied to NetGate hardware or subscription). OPNsense is fully open source, with optional paid support that doesn’t unlock extra features.
  • User Interface: OPNsense generally offers a more modern and potentially user-friendly GUI. PFSense has a more traditional, albeit very functional, interface.
  • Development & Updates: OPNsense follows a more predictable release schedule and is community-driven (with corporate backing for infrastructure/support). PFSense CE updates can sometimes lag behind the commercial Plus version driven by NetGate.
  • Out-of-the-Box Features: OPNsense includes things like 2FA and enhanced reporting built-in. PFSense might require packages for some equivalent functionalities (like WireGuard initially).
  • Commercial Backing: PFSense is directly backed and developed by NetGate. OPNsense has backing from Deciso for support/infrastructure but development is more community-focused.

Which One Should You Choose?

Both are excellent, mature, and highly capable solutions. The choice often comes down to your priorities:

  • Choose OPNsense if: You prioritize a modern UI, a predictable release cycle, built-in features like 2FA, and a strictly open-source philosophy without tiered versions. It’s often recommended for newcomers due to its interface.
  • Choose PFSense (CE or Plus) if: You’re comfortable with its traditional interface, need features potentially exclusive to or better optimized in the Plus version (especially with NetGate hardware), or prefer the ecosystem and support structure provided by NetGate. For business deployments where guaranteed support and optimized hardware/software integration are key, buying a NetGate appliance with PFSense Plus is a very compelling option – you get the hardware, software license, and support in one package, often at a reasonable price point.

Personally, while I love the open-source nature of OPNsense, I also understand the need for companies like NetGate to fund development. If I were deploying for a client needing robust support, the NetGate appliance route with PFSense Plus offers significant peace of mind.

Wrapping Up

Both PFSense and OPNsense empower you to build incredibly powerful network security and management solutions using open-source software. They run on standard hardware or VMs, offer extensive features, and have active communities.

I encourage you to download both and try them out in a virtual environment to see which interface and workflow you prefer. They are fantastic tools for learning and for securing your networks.

I hope you found this comparison useful! What are your experiences with PFSense or OPNsense? Which one do you prefer and why? Let me know in the comments below!

If you enjoyed this video and post, please give it a thumbs up, share it, and consider subscribing to Quadrata for more content on open source and IT.

And don’t forget, for Zabbix discussions, join our community on the ZabbixItalia Telegram Channel!

Thanks for watching (and reading!), and I’ll see you in the next one. Bye from Dimitri!

Read More
Automate Smarter, Not Harder: Exploring N8n for AI-Powered Workflows

Automate Smarter, Not Harder: Exploring N8n for AI-Powered Workflows

Good morning everyone! Dimitri Bellini here, back on Quadrata, my channel where we dive into the fascinating world of open source and IT. As I always say, I hope you find these topics as exciting as I do!

This week, we’re venturing back into the realm of artificial intelligence, but with a twist. We’ll be looking at an incredibly interesting, user-friendly, and – you guessed it – open-source tool called N8n (pronounced “N-eight-N”). While we’ve explored similar solutions before, N8n stands out with its vibrant community and powerful capabilities, especially its recent AI enhancements.

What is N8n and Why Should You Care?

At its core, N8n is a Workflow Automation Tool. It wasn’t born solely for AI; its primary goal is to help you automate sequences of tasks, connecting different applications and services together. Think of it as a visual way to build bridges between the tools you use every day.

Why opt for a tool like N8n instead of just writing scripts in Python or another language? The key advantage lies in maintainability and clarity. While scripts work, revisiting them months later often requires deciphering complex code. N8n uses a graphical user interface (GUI) with logical blocks. This visual approach makes workflows much easier to understand, debug, and modify, even long after you’ve created them. For me, especially for complex or evolving processes, this visual clarity is a huge plus.

The best part? You can install it right on your own hardware or servers, keeping your data and processes in-house.

Key Functionalities of N8n

N8n packs a punch when it comes to features:

  • Visual Workflow Builder: Create complex automation sequences graphically using a web-based GUI. Drag, drop, and connect nodes to define your logic.
  • Extensive Integrations: It boasts a vast library of pre-built integrations for countless applications and services (think Google Suite, Microsoft tools, databases, communication platforms, and much more).
  • Customizable Nodes: If a pre-built integration doesn’t exist, you can create custom nodes, for example, to execute your own Python code within a workflow.
  • AI Agent Integration: This is where it gets really exciting for us! N8n now includes dedicated modules (built using Langchain) to seamlessly integrate AI models, including self-hosted ones like those managed by Ollama.
  • Data Manipulation: N8n isn’t just about triggering actions. It allows you to transform, filter, merge, split, and enrich data as it flows through your workflow, enabling sophisticated data processing.
  • Strong Community & Templates: Starting from scratch can be daunting. N8n has a fantastic community that shares workflow templates. These are invaluable for learning and getting started quickly.

Getting Started: Installation with Docker

My preferred method for running N8n, especially for testing and home use, is using Docker and Docker Compose. It’s clean, contained, and easy to manage. While you *can* install it using npm, Docker keeps things tidy.

  1. Use Docker Compose: I started with the official Docker Compose setup provided on the N8n GitHub repository. This typically includes N8n itself and a Postgres database for backend storage (though SQLite is built-in for simpler setups).
  2. Configure Environment: Modify the .env file to set up database credentials and any other necessary parameters.
  3. Launch: Run docker-compose up -d to start the containers.
  4. Access: You should then be able to access the N8n web interface, usually at http://localhost:5678. You’ll need to create an initial user account.
  5. Connect AI (Optional but Recommended): Have your Ollama instance running if you plan to use local Large Language Models (LLMs).
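
If it helps to see those steps in one place, here is a minimal sketch of a compose file for a quick local test. The image name, port, and volume path are the official defaults; the timezone value and host directory are just examples, and the official file on the N8n GitHub repository adds Postgres and more environment variables:

# docker-compose.yml: minimal N8n test setup (sketch)
services:
  n8n:
    image: docker.n8n.io/n8nio/n8n
    restart: unless-stopped
    ports:
      - '5678:5678'                    # web UI at http://localhost:5678
    environment:
      - GENERIC_TIMEZONE=Europe/Rome   # example value, set your own
    volumes:
      - ./n8n_data:/home/node/.n8n     # persists credentials and workflows

Bring it up with docker-compose up -d, then open the web interface and create the initial user as described above.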

N8n in Action: Some Examples

Let’s look at a few examples I demonstrated in the video to give you a feel for how N8n works:

Example 1: The AI Calculator

This was a simple workflow designed to show the basic AI Agent block.

  • It takes a mathematical question (e.g., “4 plus 5”).
  • Uses the AI Agent node configured with an Ollama model (like Mistral) and Postgres for memory (to remember conversation context).
  • The “tool” in this case is a simple calculator function.
  • The AI understands the request, uses the tool to get the result (9), and then formulates a natural language response (“The answer is 9”).
  • The execution log is fantastic here, showing step-by-step how the input flows through chat memory, the LLM, the tool, and back to the LLM for the final output.

Example 2: AI Web Agent with SERP API

This workflow demonstrated fetching external data and using AI to process it:

  • It used the SERP API tool (requiring an API key) to perform a web search (e.g., “latest news about Zabbix”).
  • The search results were passed to the first AI Agent (using Ollama) for initial processing/summarization.
  • Crucially, I showed how to pass the output of one node as input to the next using N8n’s expression syntax ({{ $json.output }} or similar).
  • A second AI Agent node was added with a specific prompt: “You are a very good AI agent specialized in blog writing.” This agent took the summarized web content and structured it into a blog post format.
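
As a rough illustration (the exact field name depends on what your previous node actually outputs), the second agent’s prompt can embed the first agent’s result with an expression like this:

You are a very good AI agent specialized in blog writing.
Write a blog post based on this research: {{ $json.output }}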

Example 3: Simple Web Scraper

This showed basic web scraping without external APIs:

  • Used the built-in HTTP Request node to fetch content from specific web pages.
  • Applied filtering and data manipulation nodes to limit the number of pages and extract relevant text content (cleaning HTML).
  • Passed the cleaned text to Ollama for summarization.
  • The visual execution flow clearly showed each step turning green as it completed successfully.

I also briefly mentioned a much more complex potential workflow involving document processing (PDFs, text files), using Qdrant as a vector database and Mistral for creating embeddings to build a Retrieval-Augmented Generation (RAG) system, showcasing the scalability of N8n.

Conclusion: Your Automation Powerhouse

N8n is a remarkably powerful and flexible tool for anyone looking to automate tasks, whether simple or complex. Its visual approach makes automation accessible, while its deep integration capabilities, including first-class support for AI models via tools like Ollama, open up a world of possibilities.

Being open-source and self-hostable gives you complete control over your workflows and data. Whether you’re automating IT processes, integrating marketing tools, processing data, or experimenting with AI, N8n provides a robust platform to build upon.

What do you think? Have you tried N8n or other workflow automation tools? What kind of tasks would you love to automate using AI?

Let me know your thoughts, suggestions, and experiences in the comments below! Your feedback is incredibly valuable.

If you found this useful, please consider sharing it and subscribing to my YouTube channel, Quadrata, for more content on open source and IT.

Thanks for reading, and see you in the next one!

– Dimitri Bellini

Read More
Simplify Your Web Services: An Introduction to Nginx Proxy Manager

Simplify Your Web Services: An Introduction to Nginx Proxy Manager

Good morning everyone, Dimitri Bellini here from Quadrata! While my channel often dives deep into the world of Zabbix and open-source monitoring, today I want to shift gears slightly. I’ve realized that some foundational concepts, while powerful, aren’t always common knowledge, and sharing them can be incredibly helpful.

This week, we’re exploring Nginx Proxy Manager. It’s not a revolutionary concept in itself, but it’s a tool that significantly simplifies managing access to your web services, especially when dealing with HTTPS and multiple applications behind a single IP address.

What Exactly is Nginx Proxy Manager?

At its core, Nginx Proxy Manager is a reverse proxy built on top of the popular Nginx web server. But what makes it special? It packages several essential functionalities into one easy-to-manage solution, accessible via a clean web interface.

Here are its main characteristics:

  • Reverse Proxy Functionality: It acts as an intermediary, allowing you to securely expose multiple internal web services (like your Zabbix frontend, internal wikis, etc.) to the internet using potentially just one public IP address. Instead of exposing your services directly, the proxy handles incoming requests and forwards them appropriately.
  • Free SSL Certificates with Let’s Encrypt: It seamlessly integrates with Let’s Encrypt, enabling you to obtain and, crucially, automatically renew free, trusted SSL/TLS certificates for your domains. This makes setting up HTTPS incredibly straightforward.
  • User-Friendly Web Interface: This is a huge plus! While configuring Nginx via text files is powerful (Infrastructure as Code!), it can be complex and time-consuming, especially if you don’t do it often. The web UI simplifies creating proxy hosts, managing certificates, and viewing logs, making it accessible even if you’re not an Nginx expert. Remembering complex configurations months later becomes much easier!
  • Docker-Based: It runs as a Docker container, bundling all dependencies (Nginx, Certbot for Let’s Encrypt, the web UI) together. This makes installation, updates, and management very convenient.

Understanding Reverse Proxies (and why they’re not Forward Proxies)

It’s important to distinguish a reverse proxy from the traditional “forward” proxy many of us remember from the early internet days. A forward proxy sits between users on a network and the *external* internet, often used for caching or filtering outbound requests.

A reverse proxy does the opposite. It sits in front of your *internal* web servers and manages incoming requests from the *external* internet. When someone types zbx1.yourdomain.com, the request hits the reverse proxy first. The proxy then looks at the requested domain and forwards the traffic to the correct internal server (e.g., the machine hosting your Zabbix web GUI).

This is essential if you have only one public IP but want to host multiple websites or services using standard HTTPS (port 443).

The Crucial Role of DNS and Let’s Encrypt

DNS: Directing Traffic

How does a user’s browser know where to find your reverse proxy? Through DNS! You need to configure your public DNS records (usually on your domain registrar’s platform or DNS provider) so that the domain names you want to expose (e.g., zbx1.yourdomain.com, wiki.yourdomain.com) point to the public IP address of your Nginx Proxy Manager server. This is typically done using:

  • A Record: Points a domain directly to an IPv4 address.
  • CNAME Record: Points a domain to another domain name (often more flexible). For example, zbx1.yourdomain.com could be a CNAME pointing to proxy.yourdomain.com, which then has an A record pointing to your public IP.
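
For illustration, the corresponding zone records could look roughly like this (203.0.113.10 is a placeholder documentation address, and the hostnames are the examples used above):

proxy.yourdomain.com.   3600  IN  A      203.0.113.10
zbx1.yourdomain.com.    3600  IN  CNAME  proxy.yourdomain.com.
wiki.yourdomain.com.    3600  IN  CNAME  proxy.yourdomain.com.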

Without correct DNS setup, requests will never reach your proxy.

Let’s Encrypt: Free and Automated SSL

Let’s Encrypt is a non-profit Certificate Authority that provides free, domain-validated SSL/TLS certificates. Before Let’s Encrypt, obtaining trusted certificates often involved significant cost and manual processes. Let’s Encrypt has democratized HTTPS, making it easy and free for everyone.

The main “catch” is that their certificates have a shorter validity period (e.g., 90 days). This is where Nginx Proxy Manager shines – it handles the initial domain validation (“challenge”) and the periodic, automatic renewal process, ensuring your sites remain secure without manual intervention.

Getting Started: Installation via Docker Compose

Installing Nginx Proxy Manager is straightforward using Docker Compose. Here’s a basic docker-compose.yml file similar to the one I use:


version: '3.8'
services:
  app:
    image: 'jc21/nginx-proxy-manager:latest'
    restart: unless-stopped
    ports:
      # Public HTTP port (Let's Encrypt HTTP-01 challenges arrive here)
      - '80:80'
      # Public HTTPS port
      - '443:443'
      # Admin web UI port (access this in your browser)
      - '81:81'
    volumes:
      - ./data:/data
      - ./letsencrypt:/etc/letsencrypt
    # For production, consider using MySQL/MariaDB instead of the default SQLite
    # environment:
    #   DB_MYSQL_HOST: "db"
    #   DB_MYSQL_PORT: 3306
    #   DB_MYSQL_USER: "npm"
    #   DB_MYSQL_PASSWORD: "your_password"
    #   DB_MYSQL_NAME: "npm"
    # depends_on:
    #   - db

  # Uncomment this section if using MySQL/MariaDB
  # db:
  #   image: 'jc21/mariadb-aria:latest'
  #   restart: unless-stopped
  #   environment:
  #     MYSQL_ROOT_PASSWORD: 'your_root_password'
  #     MYSQL_DATABASE: 'npm'
  #     MYSQL_USER: 'npm'
  #     MYSQL_PASSWORD: 'your_password'
  #   volumes:
  #     - ./data/mysql:/var/lib/mysql

Key Ports Explained:

  • 80:80: Maps external port 80 to the container’s port 80. Port 80 is needed externally for Let’s Encrypt HTTP-01 challenges.
  • 443:443: Maps external port 443 (standard HTTPS) to the container’s port 443. This is where your proxied traffic will arrive.
  • 81:81: Maps external port 81 to the container’s port 81. You’ll access the Nginx Proxy Manager admin interface via http://your_server_ip:81.

Volumes:

  • ./data:/data: Stores configuration data (using SQLite by default).
  • ./letsencrypt:/etc/letsencrypt: Stores your SSL certificates.

To start it, simply navigate to the directory containing your docker-compose.yml file and run:

docker-compose up -d

Note: For environments with many sites, the official documentation recommends using MySQL or MariaDB instead of the default SQLite for better performance.

Configuring a Proxy Host: A Quick Walkthrough

Once the container is running, access the web UI (http://your_server_ip:81). The default login credentials are usually admin@example.com / changeme (you’ll be prompted to change these immediately).

From the dashboard, you’ll see options like Proxy Hosts, Redirection Hosts, Streams (for TCP/UDP forwarding), and 404 Hosts.

To expose an internal service (like Zabbix):

  1. Go to Hosts -> Proxy Hosts.
  2. Click Add Proxy Host.
  3. Details Tab:

    • Domain Names: Enter the public domain name(s) you configured in DNS (e.g., zbx1.quadrata.it).
    • Scheme: Select the protocol your *internal* service uses (usually http for Zabbix web UI).
    • Forward Hostname / IP: Enter the internal IP address of your Zabbix server (e.g., 192.168.1.100).
    • Forward Port: Enter the internal port your service listens on (e.g., 80 for Zabbix web UI).
    • Enable options like Block Common Exploits and Websockets Support if needed.

  4. SSL Tab:

    • Select Request a new SSL Certificate from the dropdown.
    • Enable Force SSL (redirects HTTP to HTTPS).
    • Enable HTTP/2 Support.
    • Enter your email address (for Let’s Encrypt notifications).
    • Agree to the Let’s Encrypt Terms of Service.

  5. Click Save.

Nginx Proxy Manager will now attempt to obtain the certificate from Let’s Encrypt using the domain you provided. If successful, the entry will show as green/online. You should now be able to access your internal Zabbix interface securely via https://zbx1.quadrata.it!

There’s also an Advanced tab where you can add custom Nginx configuration snippets for more complex scenarios, which is incredibly useful.

Wrapping Up

Nginx Proxy Manager is a fantastic tool that bundles complex functionalities like reverse proxying and SSL certificate management into an easy-to-use package. It lowers the barrier to entry for securely exposing web services and makes ongoing management much simpler, especially with its automated certificate renewals and clear web interface.

Whether you’re managing home lab services, small business applications, or just experimenting, I highly recommend giving it a try. It saves time, enhances security, and simplifies your infrastructure.

What are your thoughts on Nginx Proxy Manager? Have you used it or similar tools? Let me know in the comments below!

If you found this helpful, consider subscribing to the Quadrata YouTube channel for more content on open-source solutions and IT topics.

And don’t forget, if you have Zabbix questions, join our growing community on the Zabbix Italia Telegram Channel!

Thanks for reading, and I hope to see you in the next video!

– Dimitri Bellini

Read More
Taming the SNMP Beast: Custom Monitoring with Zabbix Discovery

Taming the SNMP Beast: Custom Monitoring with Zabbix Discovery

Good morning everyone, and welcome back! It’s Dimitri Bellini here on Quadrata, your channel for open-source tech insights. Today, we’re diving back into the world of Zabbix, specifically tackling what many consider a ‘black beast’: SNMP monitoring.

SNMP (Simple Network Management Protocol) is incredibly common for monitoring network devices like routers and switches (think Mikrotik, Cisco), but it’s also used for applications and servers running SNMP daemons. Zabbix comes packed with pre-built templates for many SNMP devices, which makes life easy – apply the template, set a few parameters, and data starts flowing. But what happens when you have a device with no template, or you only have the manufacturer’s MIB file? That’s when things get trickier, and you need to build your monitoring from scratch.

In this post, I want to share my approach to creating new SNMP monitoring templates or discovering SNMP data when starting from zero. Let’s demystify this process together!

Understanding the SNMP Basics: MIBs and OIDs

Before jumping into Zabbix, we need to understand what we *can* monitor. This information lives in MIB (Management Information Base) files provided by the device vendor. These files define the structure of manageable data using OIDs (Object Identifiers) – unique numerical addresses for specific metrics or pieces of information.

Your Essential Toolkit: MIB Browser and snmpwalk

To explore these MIBs and test SNMP communication, I rely on a couple of key tools:

  • MIB Browser: I often use the iReasoning MIB Browser. It’s free, multi-platform (Java-based), and lets you load MIB files visually. You can navigate the OID tree, see descriptions, data types, and even potential values (which helps later with Zabbix Value Maps). For example, you can find the OID for interface operational status (ifOperStatus) and see that ‘1’ means ‘up’, ‘2’ means ‘down’, etc.
  • snmpwalk: This command-line utility (part of standard SNMP tools on Linux) lets you query a device directly. It’s crucial for verifying that the device responds and seeing the actual data returned for a specific OID.

Finding Your Way with OIDs

Let’s say we want to monitor network interfaces on a device (like the pfSense appliance I use in the video). Using the MIB browser, we find the OID for interface descriptions, often IF-MIB::ifDescr. We can then test this with snmpwalk:

snmpwalk -v2c -c public 192.168.1.1 IF-MIB::ifDescr

(Replace public with your device’s SNMP community string and 192.168.1.1 with its IP address. We’re using SNMP v2c here for simplicity, though v3 offers better security).

This command might return something like:


IF-MIB::ifDescr.1 = STRING: enc0
IF-MIB::ifDescr.2 = STRING: ovpns1
...

Sometimes, especially when Zabbix might not have the MIB loaded, it’s easier to work with the full numerical OID. Use the -On flag:

snmpwalk -v2c -c public -On 192.168.1.1 1.3.6.1.2.1.2.2.1.2

This will output the full numerical OIDs, like .1.3.6.1.2.1.2.2.1.2.1, .1.3.6.1.2.1.2.2.1.2.2, etc.

The Power of SNMP Indexes

Notice the numbers at the end of the OIDs (.1, .2)? These are **indexes**. SNMP often organizes data in tables. Think of it like a spreadsheet: each row represents an instance (like a specific interface or disk), identified by its index. Different columns represent different metrics (like description, status, speed, octets in/out) for that instance.

So, ifDescr.1 is the description for interface index 1, and ifOperStatus.1 (OID: .1.3.6.1.2.1.2.2.1.8.1) would be the operational status for that *same* interface index 1. This index is the key to correlating different pieces of information about the same logical entity.
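
You can see the correlation directly with another walk. The values below are illustrative, but the OID layout is exactly this: the trailing number is the same interface index as in the ifDescr walk.

snmpwalk -v2c -c public -On 192.168.1.1 1.3.6.1.2.1.2.2.1.8
.1.3.6.1.2.1.2.2.1.8.1 = INTEGER: 1
.1.3.6.1.2.1.2.2.1.8.2 = INTEGER: 1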

Automating with Zabbix Low-Level Discovery (LLD)

Manually creating an item in Zabbix for every single interface and every metric (status, traffic in, traffic out…) is tedious and static. If a new interface appears, you have to add it manually. This is where Zabbix’s Low-Level Discovery (LLD) shines for SNMP.

LLD allows Zabbix to automatically find entities (like interfaces, disks, processors) based on SNMP indexes and then create items, triggers, and graphs for them using prototypes.

Setting Up Your Discovery Rule

Let’s create a discovery rule for network interfaces:

  1. Go to Configuration -> Templates -> Your Template -> Discovery rules -> Create discovery rule.
  2. Name: Something descriptive, e.g., “Network Interface Discovery”.
  3. Type: SNMP agent.
  4. Key: A unique key you define, e.g., net.if.discovery.
  5. SNMP OID: This is the core. Use the Zabbix discovery syntax: discovery[{#MACRO_NAME1}, OID1, {#MACRO_NAME2}, OID2, ...].

    • Zabbix automatically provides {#SNMPINDEX} representing the index found.
    • We define custom macros to capture specific values. For interface names, we can use {#IFNAME}.

    So, to discover interface names based on their index, the OID field would look like this:
    discovery[{#IFNAME}, 1.3.6.1.2.1.2.2.1.2]
    (Using the numerical OID for ifDescr).

  6. Configure other settings like update interval as needed.

Zabbix will periodically run this rule, perform an SNMP walk on the specified OID (ifDescr), and generate a list mapping each {#SNMPINDEX} to its corresponding {#IFNAME} value.

Pro Tip: Use the “Test” button in the discovery rule configuration! It’s incredibly helpful to see the raw data Zabbix gets and the JSON output with your macros populated before saving.
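
For reference, the JSON produced by the test looks roughly like this (the interface names here are the ones from the earlier snmpwalk):

[
    { "{#SNMPINDEX}": "1", "{#IFNAME}": "enc0" },
    { "{#SNMPINDEX}": "2", "{#IFNAME}": "ovpns1" }
]

Each object becomes one discovered entity, and the macros are what the item prototypes below will consume.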

Creating Dynamic Items with Prototypes

Now that Zabbix can discover the interfaces, we need to tell it *what* to monitor for each one using Item Prototypes:

  1. Within your discovery rule, go to the “Item prototypes” tab -> Create item prototype.
  2. Name: Use the macros found by discovery for dynamic naming, e.g., Interface {#IFNAME}: Operational Status.
  3. Type: SNMP agent.
  4. Key: Must be unique per host. Use a macro to ensure this, e.g., net.if.status[{#SNMPINDEX}].
  5. SNMP OID: Specify the OID for the metric you want, appending the index macro. For operational status (ifOperStatus, numerical OID .1.3.6.1.2.1.2.2.1.8), use: 1.3.6.1.2.1.2.2.1.8.{#SNMPINDEX}. Zabbix will automatically replace {#SNMPINDEX} with the correct index (1, 2, 3…) for each discovered interface.
  6. Type of information: Numeric (unsigned) for status codes.
  7. Units: N/A for status.
  8. Value mapping: Select or create a value map that translates the numerical status (1, 2, 3…) into human-readable text (Up, Down, Testing…). This uses the information we found earlier in the MIB browser.
  9. Configure other settings like update interval, history storage, etc.

Once saved, Zabbix will use this prototype to create an actual item for each interface discovered by the LLD rule. If a new interface appears on the device, Zabbix will discover it and automatically create the corresponding status item!

A Practical Example: Monitoring Disk Storage

We can apply the same logic to other SNMP data, like disk storage. In the video, I showed discovering disk types and capacities on my pfSense box.

Discovering Disk Types with Preprocessing

I created another discovery rule targeting the OID for storage types (e.g., from the HOST-RESOURCES-MIB or a vendor-specific MIB). This OID often returns numbers (like 3 for Hard Disk, 5 for Optical Disk).

To make the discovered macro more readable (e.g., {#DISKTYPE}), I used Zabbix’s **Preprocessing** feature within the discovery rule itself:

  • Add a preprocessing step of type “Replace”.
  • Find `^5$` (regex for the number 5) and replace with `Optical Disk`.
  • Add another step to find `^3$` and replace with `Hard Disk`.

Now, the {#DISKTYPE} macro will contain “Hard Disk” or “Optical Disk” instead of just a number.

Monitoring Disk Capacity with Unit Conversion

Then, I created an item prototype for disk capacity:

  • Name: `Disk {#DISKTYPE}: Capacity`
  • Key: `storage.capacity[{#SNMPINDEX}]`
  • SNMP OID: `[OID_for_storage_size].{#SNMPINDEX}`
  • Units: `B` (Bytes)
  • Preprocessing (in the Item Prototype): The SNMP device reported capacity in Kilobytes (or sometimes in allocation units * block size). To normalize it to Bytes, I added a “Custom multiplier” preprocessing step with a value of `1024`.
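
To make the conversion concrete: if the OID returns, say, 52428800 for a disk (a value in kilobytes), the custom multiplier turns it into 53,687,091,200 bytes, which Zabbix then displays as 50 GB thanks to the B unit.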

Putting It All Together

By combining MIB exploration, `snmpwalk` testing, Zabbix LLD rules with custom macros, and item prototypes with appropriate OIDs and preprocessing, you can build powerful, dynamic SNMP monitoring for almost any device, even without off-the-shelf templates.

It might seem a bit daunting initially, especially understanding the OID structure and LLD syntax, but once you grasp the concept of indexes and macros, it becomes quite manageable. The key is to break it down: find the OIDs, test them, set up discovery, and then define what data to collect via prototypes.


I hope this walkthrough helps demystify custom SNMP monitoring in Zabbix! It’s a powerful skill to have when dealing with diverse infrastructure.

What are your biggest challenges with SNMP monitoring? Have you built custom LLD rules? Share your experiences, questions, or tips in the comments below! I’ll do my best to answer any doubts you might have.

And if you have more Zabbix questions, feel free to join the Italian Zabbix community on Telegram: Zabbix Italia.

If you found this post helpful, please give the original video a ‘Like’ on YouTube, share this post, and subscribe to Quadrata for more open-source and Zabbix content.

Thanks for reading, and see you next week!

– Dimitri Bellini

Read More
Building a Bulletproof PostgreSQL Cluster with Patroni, etcd, and PGBackrest

Building a Bulletproof PostgreSQL Cluster: My Go-To High Availability Setup

Good morning everyone! Dimitri Bellini here, back on Quadrata, my channel dedicated to the open-source world and the IT topics I love – and hopefully, you do too!

Thanks for tuning in each week. If you haven’t already, please hit that subscribe button and give this video a thumbs up – it really helps!

Today, we’re diving into a crucial topic for anyone running important applications, especially (but not only!) those using Zabbix: database resilience and performance. Databases are often the heart of our applications, but they can also be the source of major headaches – slow queries, crashes, data loss. Ensuring your database is robust and performs well is fundamental.

Why PostgreSQL and This Specific Architecture?

A few years back, we made a strategic decision to shift from MySQL to PostgreSQL. Why? Several reasons:

  • The community and development activity around Postgres seemed much more vibrant.
  • It felt like a more “serious,” robust database, even if maybe a bit more complex to configure initially compared to MySQL’s out-of-the-box readiness.
  • For applications like Zabbix, which heavily utilize the database, especially in complex setups, having a reliable and performant backend is non-negotiable. Avoiding database disasters and recovery nightmares is paramount!

The architecture I’m showcasing today isn’t just for Zabbix; it’s a solid foundation for many applications needing high availability. We have clients using this exact setup for various purposes.

The Core Components

The solution we’ve settled on combines several powerful open-source tools:

  • PostgreSQL: The core relational database.
  • Patroni: A fantastic template for creating a High Availability (HA) PostgreSQL cluster. It manages the Postgres instances and orchestrates failover.
  • etcd: A distributed, reliable key-value store. Patroni uses etcd for coordination and sharing state information between cluster nodes, ensuring consensus.
  • PGBackrest: A reliable, feature-rich backup and restore solution specifically designed for PostgreSQL.
  • HAProxy (Optional but Recommended): A load balancer to direct application traffic to the current primary node seamlessly.

How It Fits Together

Imagine a setup like this:

  • Multiple PostgreSQL Nodes: Typically, at least two nodes running PostgreSQL instances.
  • Patroni Control: Patroni runs on these nodes, monitoring the health of Postgres and managing roles (leader/replica).
  • etcd Cluster: An etcd cluster (minimum 3 nodes for quorum – one can even be the backup server) stores the cluster state. Patroni instances consult etcd to know the current leader and overall cluster health.
  • PGBackrest Node: Often one of the etcd nodes also serves as the PGBackrest repository server, storing backups and Write-Ahead Logs (WALs) for point-in-time recovery. Backups can be stored locally or, even better, pushed to an S3-compatible object store.
  • Load Balancer: HAProxy (or similar) sits in front, checking an HTTP endpoint provided by Patroni on each node to determine which one is the current leader (primary) and directs all write traffic there.

This creates an active-standby (or active-passive) cluster. Your application connects to a single endpoint (the balancer), completely unaware of which physical node is currently active. HAProxy handles the redirection automatically during a switchover or failover.
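
To give an idea of the balancer side, here is a minimal sketch along the lines of the template shipped with the Patroni documentation. It assumes Patroni’s default REST API port 8008 and the two demo node IPs used later in this post:

listen postgres_primary
    bind *:5000
    mode tcp
    option httpchk GET /primary
    http-check expect status 200
    default-server inter 3s fall 3 rise 2 on-marked-down shutdown-sessions
    server node1 10.0.0.1:5432 maxconn 100 check port 8008
    server node2 10.0.0.2:5432 maxconn 100 check port 8008

The application always connects to port 5000 on the balancer; HAProxy marks as “up” only the node whose Patroni REST API answers 200 on /primary, so writes always land on the current leader.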

Key Advantages of This Approach

  • True High Availability: Provides a really bulletproof active-standby solution.
  • Easy Balancer Integration: Uses simple HTTP checks, avoiding the complexities of virtual IPs (VIPs) and Layer 2 network requirements often seen in traditional clustering (like Corosync/Pacemaker), making it great for modern Layer 3 or cloud environments.
  • “Simple” Configuration (Relatively!): Once you grasp the concepts, configuration is largely centralized in a single YAML file per node (patroni.yml).
  • Highly Resilient & Automated: Handles node failures, switchovers, and even node reintegration automatically.
  • Powerful Backup & Recovery: PGBackrest makes backups and, crucially, Point-in-Time Recovery (PITR) straightforward (again, “straightforward” for those familiar with database recovery!).
  • 100% Open Source: No licensing costs or vendor lock-in. Test it, deploy it freely.
  • Enterprise Ready & Supportable: These are mature projects. For production environments needing formal support, companies like Cybertec PostgreSQL (no affiliation, just an example we partner with) offer commercial support for this stack. We at Quadrata can also assist with first-level support and implementation.

In my opinion, this architecture brings PostgreSQL very close to the robustness you might expect from expensive proprietary solutions like Oracle RAC, but using entirely open-source components.

Let’s See It In Action: A Practical Demo

Talk is cheap, right? Let’s walk through some common management tasks and failure scenarios. In my lab, I have three minimal VMs (2 vCPU, 4GB RAM, 50GB disk): two for PostgreSQL/Patroni (node1, node2) and one for PGBackrest/etcd (backup-node). Remember, 3 nodes is the minimum for a reliable etcd quorum.

1. Checking Cluster Status

The primary command-line tool is patronictl. Let’s see the cluster members:

$ patronictl -c /etc/patroni/patroni.yml list
+ Cluster: my_cluster (73...) ---+----+-----------+
| Member | Host | Role | State | TL | Lag in MB |
+--------+---------+--------+---------+----+-----------+
| node1 | 10.0.0.1| Leader | running | 10 | |
| node2 | 10.0.0.2| Replica| running | 10 | 0 |
+--------+---------+--------+---------+----+-----------+

Here, node1 is the current Leader (primary), and node2 is a Replica, perfectly in sync (Lag 0 MB) on the same timeline (TL 10).

2. Performing a Manual Switchover

Need to do maintenance on the primary? Let’s gracefully switch roles:

$ patronictl -c /etc/patroni/patroni.yml switchover
Current cluster leader is node1
Available candidates for switchover:
1. node2
Select candidate from list [1]: 1
When should the switchover take place (e.g. 2023-10-27T10:00:00+00:00) [now]: now
Are you sure you want to switchover cluster 'my_cluster', leader 'node1' to member 'node2'? [y/N]: y
Successfully switched over to "node2"
... (Check status again) ...
+ Cluster: my_cluster (73...) ---+----+-----------+
| Member | Host | Role | State | TL | Lag in MB |
+--------+---------+--------+---------+----+-----------+
| node1 | 10.0.0.1| Replica| running | 11 | 0 |
| node2 | 10.0.0.2| Leader | running | 11 | |
+--------+---------+--------+---------+----+-----------+

Patroni handled demoting the old leader, promoting the replica, and ensuring the old leader started following the new one. Notice the timeline (TL) incremented.

3. Simulating a Primary Node Failure

What if the primary node just dies? Let’s stop Patroni on node2 (the current leader):

# systemctl stop patroni (on node2)

Now, check the status from node1:

$ patronictl -c /etc/patroni/patroni.yml list
+ Cluster: my_cluster (73...) ---+----+-----------+
| Member | Host | Role | State | TL | Lag in MB |
+--------+---------+--------+---------+----+-----------+
| node1 | 10.0.0.1| Leader | running | 12 | |
| node2 | 10.0.0.2| | stopped | | unknown |
+--------+---------+--------+---------+----+-----------+

Patroni automatically detected the failure and promoted node1 to Leader. When node2 comes back online (systemctl start patroni), Patroni will automatically reintegrate it as a replica.

4. Recovering a Destroyed Node

What if a node is completely lost? Data disk corrupted, VM deleted? Let’s simulate this on node2 (assuming node1 is currently the leader):

# systemctl stop patroni (on node2)
# rm -rf /var/lib/patroni/data # Or wherever your PG data directory is
# systemctl start patroni (on node2)

Watching the Patroni logs on node2, you’ll see it detects it has no data and initiates a `pg_basebackup` (or uses PGBackrest if configured) from the current leader (node1) to rebuild itself from scratch. Checking patronictl list shows its state transitioning through `creating replica` to `running` as a replica again, all automatically!

5. Point-in-Time Recovery (PITR) – The Real Lifesaver!

This is why I made the video! Recently, a bad deployment caused data corruption. We needed to restore to a state just *before* the incident. Here’s how PGBackrest and Patroni help.

Scenario: I accidentally deleted all records from a critical table.

psql> SELECT COUNT(*) FROM my_table; -- Shows 1000 rows
psql> DELETE FROM my_table;
psql> SELECT COUNT(*) FROM my_table; -- Shows 0 rows! Disaster!

Recovery Steps:

  1. STOP PATRONI EVERYWHERE: This is critical. We need to prevent Patroni from interfering while we manipulate the database state manually.

    # systemctl stop patroni (on ALL nodes: node1, node2)

  2. Identify Target Time/Backup: Use PGBackrest to find the backup and approximate time *before* the data loss.

    $ pgbackrest --stanza=my_stanza info 
    ... (Find the latest FULL backup timestamp, e.g., '2023-10-27 11:30:00') ...

  3. Perform Restore on the (Ex-)Leader Node: Go to the node that *was* the leader (let’s say node1). Run the restore command, specifying the target time. The `--delta` option is efficient because it only restores files that have changed.

    $ pgbackrest --stanza=my_stanza --delta --type=time --target="2023-10-27 11:30:00" --target-action=pause restore

    (Note: `--target-action=pause` or `promote` might be needed depending on your exact recovery goal. For simplicity here, let’s assume we want to stop recovery at that point. Check the PGBackrest docs for specifics; the video used a slightly different target specification, based on the backup label.)

    Correction based on Video: The video demonstrated restoring to the end time of a specific full backup. A more typical PITR might use `--type=time` and a specific timestamp like `YYYY-MM-DD HH:MM:SS`. Let’s assume we used the backup label as shown in the video’s logic:

    $ pgbackrest --stanza=my_stanza --delta --set=20231027-xxxxxxF --type=default --target-action=promote restore

    (Replace `20231027-xxxxxxF` with your actual backup label. Using `--target-action=promote` tells Postgres to finish recovery and promote itself as soon as the target is reached.)

  4. Start Postgres Manually (on the restored node): Start the database *without* Patroni first.

    # pg_ctl -D /var/lib/patroni/data start

    PostgreSQL will perform recovery using the restored files and WAL archives up to the specified target. Because we used `--target-action=promote`, it should finish recovery and be ready. If we had used `pause`, we would need `pg_ctl promote`.

  5. Verify Data: Connect via `psql` and check if your data is back!

    psql> SELECT COUNT(*) FROM my_table; -- Should show 1000 rows again!

  6. Restart Patroni: Now that the database is in the desired state, start Patroni on the restored node first, then on the other nodes.

    # systemctl start patroni (on node1)
    # systemctl start patroni (on node2)

    Patroni on `node1` will see a valid database and assert leadership in etcd. Patroni on `node2` will detect that it has diverged (or has no data, if we wiped it) and automatically re-sync from the now-restored leader (`node1`).

As you saw, we recovered from a potential disaster relatively quickly because the architecture and tools are designed for this.

Final Thoughts

Setting up this entire stack isn’t trivial – it requires understanding each component. That’s why I didn’t do a full step-by-step configuration in the video (it would be too long!). But I hope showing you *how it works* and its capabilities demonstrates *why* we chose this architecture.

It provides automation, resilience, and recovery options that are crucial for critical systems. Having an organized setup like this, combined with good documentation (please, write down your procedures!), turns stressful recovery scenarios into manageable tasks.

What do you think? Is PostgreSQL with Patroni something you’d consider? Are there comparable HA solutions in the MySQL/MariaDB world you think are as robust or easy to manage? Let me know your thoughts in the comments below!

Don’t forget to check out the Quadrata YouTube channel for more open-source and IT content, and join the discussion on the Zabbix Italia Telegram channel!

That’s all for this episode. A big greeting from me, Dimitri, and see you in the next one. Bye everyone!

Read More
Unlocking Zabbix Proxies: Monitoring Remote Networks Like a Pro

Unlocking Zabbix Proxies: Monitoring Remote Networks Like a Pro

Hey everyone, Dimitri Bellini here, back with another episode on Quadrata (my YouTube channel, @quadrata)! This week, we’re diving deep into Zabbix proxies. I’ve been getting a lot of questions about how these things work, especially when it comes to discoveries and monitoring devices in remote networks. So, let’s get into it!

What is a Zabbix Proxy and Why Do You Need It?

Think of a Zabbix proxy as your monitoring agent in a segregated area. It’s a powerful tool within the Zabbix ecosystem that allows us to:

  • Monitor segregated areas or remote branches: It handles all the checks the Zabbix server would, but closer to the source.
  • Scale horizontally: It can offload work from your main Zabbix server in larger deployments.
  • Reduce bandwidth usage: It collects data locally and transmits it to the Zabbix server in a single, often compressed, transaction.
  • Simplify firewall configurations: You only need to configure one TCP port.
  • Buffer data during connectivity issues: The proxy stores collected data and forwards it when the connection to the Zabbix server is restored.

What Can a Zabbix Proxy Do?

A Zabbix proxy is surprisingly versatile. It can perform almost all the checks your Zabbix server can:

  • SNMP monitoring
  • IPMI checks
  • Zabbix agent monitoring
  • REST API checks
  • Remote command execution for auto-remediation

Key Improvements in Zabbix 7.0

The latest version of Zabbix (7.0) brings some significant enhancements to Zabbix proxies, including:

  • High Availability: Proxies can now be grouped so that hosts fail over automatically between them, improving overall stability and resilience.
  • Automatic Load Distribution: The Zabbix server intelligently distributes hosts across proxies based on various factors.

Configuring a Zabbix Proxy with SQLite

For smaller setups, SQLite is a fantastic option. Here’s the basic configuration:

  1. Modify zabbix_proxy.conf:

    • Set the Server directive to the IP or DNS name of your Zabbix server.
    • Define the Hostname. This is crucial and must match the proxy name in the Zabbix web interface.
    • Set DBName to the path and filename for your SQLite database (e.g., /var/lib/zabbix/proxy.db). Remember, this is a *file path*, not a database name.

  2. Configure Zabbix Agents: Point the Server or ServerActive directives in your agent configurations to the proxy’s IP address, not the Zabbix server’s.
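
Putting those directives together, a minimal sketch might look like this (the server name, proxy name, and IP are placeholders for your own values):

# /etc/zabbix/zabbix_proxy.conf (SQLite-backed proxy, illustrative values)
Server=zabbix.example.com          # IP or DNS of the Zabbix server
Hostname=branch-office-proxy       # must match the proxy name in the web UI
DBName=/var/lib/zabbix/proxy.db    # SQLite file path, created automatically

# /etc/zabbix/zabbix_agentd.conf on each monitored host
Server=10.10.0.5                   # proxy IP, for passive checks
ServerActive=10.10.0.5             # proxy IP, for active checks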

Remember to always consult the official Zabbix documentation for the most up-to-date and comprehensive information!

Zabbix Proxy Discovery in Action

Now, let’s talk about automatic host discovery using a Zabbix proxy. Here’s how I set it up:

  1. Create a Discovery Rule: In the Zabbix web interface, go to Data Collection -> Discovery and create a new rule.

    • Give it a descriptive name.
    • Set Discovery by to Proxy and select your proxy.
    • Define the IP range to scan. You can specify multiple ranges separated by commas.
    • Adjust the Update interval. Start with something reasonable (like an hour) to avoid network flooding. You can temporarily lower it for testing, but remember to change it back!
    • Configure the Checks. I used ICMP ping, SNMP (to get the system name), and Zabbix agent checks (system.hostname, system.uname).
    • Define Device unique criteria, typically IP address.
    • Specify Hostname and Visible name (I usually use the Zabbix agent’s hostname).

  2. Check Discovery Results: Go to Monitoring -> Discovery to see what the proxy has found.

Pro Tip: Debugging Discovery Issues with Runtime Commands

If you’re not seeing results immediately, don’t panic! Instead of guessing, SSH into your Zabbix proxy server and use the Zabbix proxy binary’s runtime commands:

zabbix_proxy -R help

This will show you available commands. The key one for debugging discovery is:

zabbix_proxy -R loglevel_increase="discovery manager"

This increases the logging level for the discovery manager process, providing much more verbose output in the Zabbix proxy’s log file. This is invaluable for troubleshooting!

Automating Host Onboarding with Discovery Actions

The real magic happens when you automate the process of adding discovered hosts. This is done through Configuration -> Actions -> Discovery actions.

  1. Enable the Default “Autodiscovery Linux Servers” Action (or create your own):

    • The key conditions are:

      • Application equals Discovery (meaning something was discovered).
      • Received value like Linux. This checks if the Zabbix agent’s system.uname value contains “Linux”.

    • The key operations are:

      • Create a host.
      • Add the host to the “Linux servers” host group.
      • (Crucially!) Link a template (e.g., “Template OS Linux by Zabbix agent”).

You can create more sophisticated actions based on other discovered properties, like SNMP data, allowing you to automatically assign appropriate templates based on device type (e.g., Cisco routers, HP printers).

Wrapping Up

While my live demo didn’t go *exactly* as planned (as is the way with live demos!), I hope this has given you a solid understanding of how Zabbix proxies work and how to use them effectively for monitoring remote networks. The key takeaways are understanding the configuration, using discovery rules effectively, and leveraging discovery actions to automate host onboarding.

If you found this helpful, give me a thumbs up! If you have any questions, drop them in the comments below. Also, be sure to join the ZabbixItalia Telegram channel (ZabbixItalia) for more Zabbix discussions. I can’t always answer everything immediately, but I’ll do my best to help. Thanks for watching, and I’ll see you next week on Quadrata!

Read More
AI for Coding: A Revolution or Just a Buzzword?

AI for Coding: A Revolution or Just a Buzzword?

Hello everyone, Dimitri Bellini here, and welcome back to my channel, Quadrata! It’s always a pleasure to share my thoughts on the open-source world and IT. If you haven’t already, please give this a like and subscribe to the channel. In this episode, I’m diving into a hot topic: artificial intelligence for coding. Is it truly the game-changer many claim it to be, or is it just another overhyped buzzword? Let’s find out.

The Promise of AI in Coding

The idea that AI can help us write code is incredibly appealing. Coding, whether it’s in Python or any other language, isn’t always straightforward. It involves working with libraries, understanding complex concepts, and debugging. So, the prospect of AI assistance to generate scripts or entire software is definitely something that excites many people, including me!

However, there’s a catch. Accessing these AI coding tools often comes at a cost. Many platforms require a subscription, or you need to pay for API usage, like with OpenAI’s ChatGPT. And, of course, you’ll need a computer, but the bulk of the processing is increasingly cloud-based.

Personally, I’ve experimented with AI for tasks like creating widgets in Zabbix and tuning parameters in Python scripts. The results? Mixed. Sometimes AI does a decent job, but other times, it falls short.

Popular AI Coding Tools

Let’s look at some of the popular tools in the AI coding space:

    • Cursor: One of the most well-known, Cursor is essentially a fork of Visual Studio Code. It provides a suite of AI models (OpenAI, Anthropic, Google) for a subscription fee, starting at around $20 per month. The pricing model, based on tokens, can be a bit complex. Initially focused on code creation, Cursor now seems to emphasize code suggestion and autocompletion.
    • Windsurf Editor: Another VS Code fork, Windsurf also integrates API calls to major AI models. It’s priced slightly lower, around $15 per month. Like Cursor, the actual cost can vary based on token usage.
    • Cline and Roocode: These are open-source VS Code extensions. Roocode is actually a fork of Cline. While they offer the advantage of being free, you’ll need to manage your subscriptions with AI providers separately. This approach can be cost-effective, especially if you want to use local AI engines.
    • Bolt DIY: Similar to Bolt.new, Bolt DIY is an open-source platform focused on code generation. While it can be useful for small tasks, I have doubts about its effectiveness for more complex projects. The hosted Bolt.new service, by contrast, comes with a subscription fee of around $20 per month, and the token allocation for AI models isn’t very clear.

In my own testing, I used the trial version of Windsurf. I attempted to create a widget for Zabbix and modify a Python script. In just two days, I exhausted the available credits. This highlights the importance of carefully evaluating the cost-effectiveness of these tools.

The Concept of AI Agents and Tools

To improve the output from AI, the concept of using specialized AI agents has emerged. Instead of giving an AI model a broad task, breaking it down into smaller, specialized tasks can lead to more efficient and sensible results.

This is where “tools” or “function calling” comes in. These techniques allow AI engines to use external tools. For example, if an AI model’s dataset is limited to 2023, it won’t be able to provide real-time information like today’s flight details. However, with tools, the AI can be instructed to use an external script (e.g., in Python) to fetch the information from the internet and then process the output.

This capability extends the functionality of AI models, enabling them to, for example, pull code snippets from documentation or connect to APIs.

Challenges and the Model Context Protocol (MCP)

Despite the promise, there are challenges. Not all AI models support tools or function calling, and even those that do may have different formats. This is where the Model Context Protocol (MCP) comes in.

Introduced by Anthropic, the company behind Claude, MCP aims to standardize communication between different tools and AI models. Think of it like a USB hub for AI. It provides a standard way for AI to discover available tools, understand their functions, and invoke them. This standardization could simplify development and reduce the complexity of integrating various services.

The MCP server, which can be hosted in your private cloud, exposes an API to allow AI or MCP clients to discover available tools and their capabilities. It also provides a standardized method for invoking these tools, addressing the current inconsistencies between AI models.
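
To visualize the idea: MCP messages are JSON-RPC 2.0 under the hood, and a client first asks the server which tools exist, then invokes one. The sketch below is illustrative, and the tool name and arguments are hypothetical:

{ "jsonrpc": "2.0", "id": 1, "method": "tools/list" }

{ "jsonrpc": "2.0", "id": 2, "method": "tools/call",
  "params": { "name": "search_flights", "arguments": { "date": "today", "route": "FCO-AMS" } } }

The server replies with the tool’s output in a standard envelope, so the AI model doesn’t need to know anything vendor-specific about the service behind it.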

The Road Ahead

Despite these advancements, AI for coding still faces challenges. AI models often struggle to interpret the output from tools and use them effectively to produce satisfactory results. We are still in the early stages of this technology.

There are also concerns about the complexity introduced by MCP, such as the need for a server component and potential security issues like the lack of encryption. It’s a balancing act between the benefits and the added complexities.

Personally, I don’t believe AI is ready to handle serious coding tasks independently. However, it can be incredibly useful for simplifying repetitive tasks, like translations, text improvements, and reformatting. AI is excellent at repetitive tasks. While I may not be using it to its fullest potential, it certainly makes my daily tasks easier.

The future of AI in coding is promising, especially with the development of smaller, more efficient models that can run locally. Models like the recent 24-billion-parameter ones, which claim capabilities comparable to DeepSeek R1 while requiring only about 20GB of RAM, are a step in the right direction. If we can continue to refine these models, AI could become an even more integral part of our coding workflow.

Let’s Discuss!

I’m eager to hear your thoughts on AI for coding. Please share your experiences and opinions in the comments below. Let’s learn from each other! You can also join the conversation on the ZabbixItalia Telegram Channel.

Thank you for joining me today. This is Dimitri Bellini, and I’ll see you next week. Bye everyone!

Visit my channel: Quadrata

Join the Telegram Channel: ZabbixItalia

Read More
Linux to Mac OS: A Tech Dilemma?

The Great Debate: Am I Ditching Linux for Mac OS? – Dimitri Bellini from Quadrata

Hey everyone, Dimitri Bellini here from Quadrata! This week, I’ve been wrestling with a decision that might surprise some of you: I’m seriously considering switching from Linux to Mac OS for my daily driver laptop. Yes, you read that right!

Why the Switch? A Long-Time Linux User’s Perspective

Now, before you brand me a heretic, let me explain. I’ve been a Linux user since the early 2000s – think Gentoo, Debian, Caldera, the early Fedora days. I’ve navigated the complexities, the driver issues, the customization rabbit holes. And I’ve loved it! But times change, and so do my needs.

The Linux Love Affair: A History

  • Early Days: Gentoo, Debian, Caldera
  • Mid-2000s Onward: Fedora (Desktop), CentOS/Rocky Linux/Alma Linux (Servers)
  • Current: Fedora on ThinkPad

My journey with Linux has been one of constant learning and problem-solving. From tweaking icons to optimizing desktops, I enjoyed the process of making Linux my own. But nowadays, I value different things.

The Allure of Mac OS: Productivity and Performance

The main reason I’m considering the switch is the need for a more reliable, out-of-the-box experience. My current ThinkPad T14S (Ryzen 7, 16GB RAM, 1TB SSD) is starting to show its age, especially during these hot summer months. I’m experiencing:

  • Thermal Throttling: Performance slowdowns when multitasking.
  • Webcall Issues: Cracking voice during video conferences.
  • Hardware Optimization: Ongoing driver challenges.

The new MacBook Air M4 is tempting. Here’s a quick comparison:

ThinkPad T14S (Current) vs. MacBook Air M4 (Potential)

  • Processor: AMD Ryzen 7 (X86) vs. Apple M4 (ARM)
  • Cores: 8 Cores (ThinkPad) vs 10 Cores (MacBook Air)
  • RAM: 16GB (ThinkPad) vs. 16GB (MacBook Air)
  • Storage: 1TB (ThinkPad) vs. 256GB base, with larger options only at purchase (MacBook Air)
  • Ports: USB-A, USB-C, HDMI (ThinkPad) vs. 2x USB-C (MacBook Air)
  • Price: my upgraded ThinkPad vs. an entry-level MacBook Air (similarly specced new ThinkPads cost far more)

The thermal efficiency of the M4 chip is particularly appealing. The promise of 7-12 hours of battery life compared to my current 1-2 hours is a game-changer. Plus, Mac OS is a Unix-based system, which makes the move a little less daunting.

The Pros and Cons: A Balanced View

Mac OS Pros:

  • Thermal Efficiency: Runs much cooler and longer.
  • Software Ecosystem: Access to popular professional applications.
  • Build Quality: Aluminum build feels premium.
  • Unix Based: Familiar Unix underpinnings ease the move from Linux.

Mac OS Cons:

  • Virtualization Challenges: ARM architecture requires ARM-compiled VMs.
  • Expandability: Limited upgrade options.
  • Ecosystem: Can be expensive and less flexible.

The Virtualization Hurdle

One of my biggest concerns is virtualization. Switching to an ARM-based system means I’ll need ARM builds of Linux and Windows for my virtual machines. While most Linux distros offer ARM versions, older Windows software might not run seamlessly. I’m also worried about the cost of virtualization software: I’m used to free solutions, but on Mac OS I might need a paid product like Parallels.

It’s also worth noting how the Mac OS software ecosystem works: many applications are free or paid downloads from the App Store or from other sources, unlike the Linux ecosystem, where pretty much everything arrives through a software repository.

The Decision Looms: What Do I Do?

Ultimately, I’m at a crossroads. I need a reliable tool for productivity, not just a platform for endless tinkering. The MacBook Air M4 offers that promise, but the ecosystem concerns and virtualization hurdles are giving me pause. I also need to be mindful of costs for the required applications.

I want a solution that makes my daily work life simpler. I also want to distance myself from the “commercial scum” that parts of the Linux world are turning into, with Red Hat pushing everyone toward paid tiers.

Let’s Talk!

So, what do you think? Should I make the jump to Mac OS? Are any of you fellow Linux users who’ve made the switch (or switched back)? Let me know your thoughts in the comments below! Don’t forget to give this post a thumbs up and subscribe to my channel Quadrata for more open-source and IT discussions. Also, check out the ZabbixItalia Telegram Channel for awesome community discussions.

Demystifying AI in Zabbix: Can AI Correlate Events?

Demystifying AI in Zabbix: Can AI Really Correlate Events?

Good morning, everyone! Dimitri Bellini here, back with you on Quadrata, my YouTube channel dedicated to the open-source world and the IT topics I’m passionate about. This week, I wanted to tackle a question that I, and many members of the Zabbix community, get asked all the time: Why doesn’t Zabbix have more built-in AI?

It seems like every monitoring product out there is touting its AI capabilities, promising to solve all your problems with a touch of magic. But is it all hype? My colleagues and I have been digging deep into this, exploring whether an AI engine can truly correlate events within Zabbix and make our lives easier. This blog post, based on my recent video, will walk you through our thought process.

The AI Conundrum: Monitoring Tools and Artificial Intelligence

Let’s be honest: integrating AI into a monitoring tool isn’t a walk in the park. It requires time, patience, and a willingness to experiment with different technologies. More importantly, it demands a good dose of introspection to understand how all the pieces of your monitoring setup fit together. But why even bother?

Anyone who’s managed a complex IT environment knows the struggle. You can be bombarded with hundreds, even thousands, of alerts every single day. Identifying the root cause and prioritizing issues becomes a monumental task, even for seasoned experts. Severity levels help, but they often fall short.

Understanding the Challenges

Zabbix gives us a wealth of metrics – CPU usage, memory consumption, disk space, and more. We typically use these to create triggers and set alarm thresholds. However, these metrics, on their own, often don’t provide enough context when a problem arises. Here are some key challenges we face:

  • Limited Metadata: Event information and metadata, like host details, aren’t always comprehensive enough. We often need to manually enrich this data.
  • Lack of Visibility: Monitoring teams often lack a complete picture of what’s happening across the entire organization. They might not know the specific applications running on a host or the impact of a host failure on the broader ecosystem.
  • Siloed Information: In larger enterprises, different departments (e.g., operating systems, databases, networks) might operate in silos, hindering the ability to connect the dots.
  • Zabbix Context: While Zabbix excels at collecting metrics and generating events, it doesn’t automatically discover application dependencies. Creating custom solutions to address this is possible but can be complex.

Our Goals: Event Correlation and Noise Reduction

Our primary goal is to improve event correlation using AI. We want to:

  • Link related events together.
  • Reduce background noise by filtering out less important alerts.
  • Identify the true root cause of problems, even when buried beneath a mountain of alerts.

Possible AI Solutions for Zabbix

So, what tools can we leverage? Here are some solutions we considered:

  • Time Correlation: Analyzing the sequence of events within a specific timeframe to identify relationships.
  • Host and Host Group Proximity: Identifying correlations based on the physical or logical proximity of hosts and host groups.
  • Semantic Similarities: Analyzing the names of triggers, tags, and hosts to find connections based on their meaning.
  • Severity and Tag Patterns: Identifying correlations based on event severity and patterns in tags.
  • Metric Pattern Analysis: Analyzing how metrics evolve over time to identify patterns associated with specific problems.

Leveraging scikit-learn

One promising solution we explored involves using scikit-learn, an open-source machine learning library. Our proposed pipeline looks like this (a minimal code sketch follows the list):

  1. Event Processing: Collect events from our Zabbix server using streaming capabilities.
  2. Encoding Events: Use machine learning techniques to vectorize and transform events into a usable format.
  3. Cluster Creation: Apply algorithms like DBSCAN to create clusters of related events (e.g., network problems, operating system problems).
  4. Merging Clusters: Merge clusters based on identified correlations.
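
To make steps 2 and 3 a bit more concrete, here is a minimal sketch, assuming events have already been exported from Zabbix as plain dictionaries (the field names and sample problem names are invented for illustration). It encodes the problem names with TF-IDF and lets DBSCAN group similar events; the eps and min_samples values are only starting points and would need tuning per environment.

```python
# Minimal sketch of the "encode + cluster" steps, assuming events were
# already exported from Zabbix as dictionaries (fields are illustrative).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import DBSCAN

events = [
    {"eventid": "1001", "name": "Interface eth0 down on router-01"},
    {"eventid": "1002", "name": "Unreachable by ICMP ping on host-b"},
    {"eventid": "1003", "name": "High CPU utilization on db-01"},
    {"eventid": "1004", "name": "Interface eth1 down on router-01"},
]

# Step 2: encode problem names as TF-IDF vectors
texts = [e["name"] for e in events]
vectors = TfidfVectorizer().fit_transform(texts)

# Step 3: cluster similar events; parameters need tuning for real data
labels = DBSCAN(eps=0.7, min_samples=1, metric="cosine").fit_predict(vectors)

for event, label in zip(events, labels):
    print(label, event["eventid"], event["name"])
```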

A Simple Example

Imagine a scenario where a router interface goes down and host B becomes unreachable. It’s highly likely that the router issue is the root cause, and host B’s unreachability is a consequence.

Implementation Steps

To implement this solution, we suggest a phased approach (a small sketch of the first two steps follows the list):

  1. Temporal Regrouping: Start by grouping events based on their timing.
  2. Host and Group Context: Add context by incorporating host and host group information.
  3. Semantic Analysis: Include semantic analysis of problem names to identify connections.
  4. Tagging: Enrich events with tags to define roles and provide additional information.
  5. Iterative Feedback: Gather feedback from users to fine-tune the system and improve its accuracy.
  6. Scaling Considerations: Optimize data ingestion and temporal window size based on Zabbix load.
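
As a rough illustration of the first two steps (temporal regrouping plus host-group context), the snippet below buckets events into fixed time windows and then splits each window by host group. The event fields are invented for illustration, and the window size would have to follow your update intervals and Zabbix load.

```python
# Rough sketch of temporal regrouping plus host-group context.
# Event fields (clock, host, groups) are illustrative, not a Zabbix schema.
from collections import defaultdict

WINDOW = 180  # seconds; should reflect item update intervals and server load

events = [
    {"clock": 1700000000, "host": "router-01", "groups": ["Network"], "name": "Interface eth0 down"},
    {"clock": 1700000042, "host": "host-b", "groups": ["Linux Servers"], "name": "Unreachable by ICMP"},
    {"clock": 1700000400, "host": "db-01", "groups": ["Databases"], "name": "DB instance down"},
]

buckets = defaultdict(list)
for event in events:
    window_start = event["clock"] - (event["clock"] % WINDOW)  # temporal regrouping
    for group in event["groups"]:                              # host-group context
        buckets[(window_start, group)].append(event)

for (window_start, group), grouped in buckets.items():
    print(window_start, group, [e["name"] for e in grouped])
```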

Improvements Using Existing Zabbix Features

We can also leverage existing Zabbix features (see the API sketch after the list):

  • Trigger Dependencies: Utilize trigger dependencies to define static relationships.
  • Low-Level Discovery: Use low-level discovery to gather detailed information about network interfaces and connected devices.
  • Enriched Tagging: Encourage users to add more informative tags to events.
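
The raw material for all of this is easy to pull out of Zabbix. The sketch below, assuming a recent Zabbix version that accepts API tokens via the Authorization header (the URL and token are placeholders), uses the standard problem.get method to fetch current problems together with their tags, which is exactly what a correlation pipeline would consume.

```python
# Minimal sketch: fetch current problems with their tags via the Zabbix API.
# URL and token are placeholders; error handling is omitted for brevity.
import requests

ZABBIX_URL = "https://zabbix.example.com/api_jsonrpc.php"  # illustrative
API_TOKEN = "replace-with-your-api-token"                  # illustrative

payload = {
    "jsonrpc": "2.0",
    "method": "problem.get",
    "params": {
        "output": ["eventid", "name", "severity", "clock"],
        "selectTags": "extend",   # include the tags we want to correlate on
        "recent": True,
        "sortfield": ["eventid"],
    },
    "id": 1,
}

response = requests.post(
    ZABBIX_URL,
    json=payload,
    headers={"Authorization": f"Bearer {API_TOKEN}"},  # token auth (Zabbix 6.4+)
)

for problem in response.json().get("result", []):
    tags = {t["tag"]: t["value"] for t in problem.get("tags", [])}
    print(problem["clock"], problem["name"], tags)
```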

The Reality Check: It’s Not So Simple

While the theory sounds great, real-world testing revealed significant challenges. The timing of events in Zabbix can be inconsistent due to update intervals and threshold configurations. This can create temporary discrepancies and make accurate correlation difficult.

Consider this scenario:

  • File system full
  • CRM down
  • DB instance down
  • Unreachable host

A human might intuitively understand that a full file system could cause a database instance to fail, which in turn could bring down a CRM application. However, a machine learning algorithm might struggle to make these connections without additional context.

Exploring Large Language Models (LLMs)

To address these limitations, we explored using Large Language Models (LLMs). LLMs have the potential to understand event descriptions and make connections based on their inherent knowledge. For example, an LLM might know that a CRM system typically relies on a database, which in turn requires a file system.

However, even with LLMs, challenges remain. Identifying the root cause versus the symptoms can be tricky, and LLMs might not always accurately correlate events. Additionally, using high-end LLMs in the cloud can be expensive, while local models might not provide sufficient accuracy.
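
To give an idea of what this looks like in practice, here is a heavily simplified sketch, assuming an OpenAI-compatible chat completions endpoint (for example a locally hosted model); the URL, model name, and prompt are purely illustrative, and the answer still needs human validation.

```python
# Sketch: ask an LLM to pick a likely root cause from a batch of events.
# Assumes an OpenAI-compatible /v1/chat/completions endpoint (e.g. a local model).
import requests

API_URL = "http://localhost:11434/v1/chat/completions"  # illustrative local endpoint
MODEL = "llama3"                                         # illustrative model name

events = [
    "File system / is full on db-01",
    "DB instance down on db-01",
    "CRM application unreachable on crm-01",
    "Host crm-01 unreachable by ICMP",
]

prompt = (
    "These Zabbix problems occurred within the same five-minute window:\n"
    + "\n".join(f"- {e}" for e in events)
    + "\nWhich single problem is most likely the root cause, and why? Answer briefly."
)

response = requests.post(API_URL, json={
    "model": MODEL,
    "messages": [{"role": "user", "content": prompt}],
    "temperature": 0,  # keep the answer as deterministic as possible
})

print(response.json()["choices"][0]["message"]["content"])
```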

Conclusion: The Complex Reality of AI in Monitoring

In conclusion, integrating AI into Zabbix for event correlation is a complex challenge. A one-size-fits-all solution is unlikely to be effective. Tailoring the solution to the specific needs of each client is crucial. While LLMs offer promise, the cost and complexity of using them effectively remain significant concerns.

We’re continuing to explore this topic and welcome your thoughts and ideas!

Let’s Discuss!

What are your thoughts on using AI in monitoring? Have you had any success with similar approaches? Share your insights in the comments below or join the conversation on the ZabbixItalia Telegram Channel! Let’s collaborate and find new directions for our reasoning.

Thanks for watching! See you next week!

Bye from Dimitri!

Watch the original video: Quadrata YouTube Channel
