
Mastering SNMP in Zabbix: A Deep Dive into Modern Monitoring Techniques

Good morning, everyone! It’s Dimitri Bellini, and welcome back to my channel, Quadrata, where we explore the fascinating world of open source and IT. This week, I’m revisiting a topic that frequently comes up in my work: SNMP monitoring with Zabbix. There have been some significant updates in recent Zabbix versions, especially regarding how SNMP is handled, so I wanted to share a recap and dive into these new features.

If you enjoy this content, don’t forget to subscribe to the channel or give this video a thumbs up. Your support means a lot!

What Exactly is SNMP? A Quick Refresher

SNMP stands for Simple Network Management Protocol. It’s an internet standard protocol designed for collecting and organizing information about managed devices on IP networks. Think printers, switches, routers, servers, and even more specialized hardware. Essentially, it allows us to query these devices for valuable operational data.

Why Bother with SNMP?

You might wonder why we still rely on such an “old” protocol. The answer is simple:

  • Ubiquity: Almost every network-enabled device supports SNMP out of the box.
  • Simplicity (in concept): It provides a standardized way to access a wealth of internal device information without needing custom agents for every device type.

SNMP Fundamentals You Need to Know

Before diving into Zabbix specifics, let’s cover some SNMP basics:

  • Protocol Type: SNMP primarily uses UDP. This means it’s connectionless, so you can’t simply test reachability the way you would a TCP service (for example, with Telnet), which can make troubleshooting connectivity a bit tricky.
  • Components:

    • Manager: This is the entity that requests information. In our case, it’s the Zabbix Server or Zabbix Proxy.
    • Agent: This is the software running on the managed device that listens for requests from the manager and sends back the requested data.

  • Versions:

    • SNMPv1: The original, very basic.
    • SNMPv2c: The most commonly used version. It introduced improvements like the “GetBulk” operation and enhanced error handling. “c” stands for community-based.
    • SNMPv3: Offers significant security enhancements, including encryption and authentication. It’s more complex to configure but essential for secure environments.

  • Key Operations:

    • GET: Retrieves the value of a specific OID (Object Identifier).
    • GETNEXT: Retrieves the value of the OID following the one specified – useful for “walking” a MIB tree.
    • SET: (Rarely used in Zabbix for monitoring) Allows modification of a device’s configuration parameter via SNMP.
    • GETBULK: (Available in SNMPv2c and v3) Allows the manager to request a large block of data with a single request, making it much more efficient than multiple GET or GETNEXT requests. This is key for modern Zabbix performance!

The `GETBULK` operation is particularly important. Imagine querying a switch with 100 interfaces, and for each interface, you want 10 metrics. Without bulk requests, Zabbix would make 1000 individual requests. This can flood the device and cause its SNMP process to consume excessive CPU, especially on devices with less powerful processors. `GETBULK` significantly reduces this overhead.
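
If you want to see the difference from the command line, the net-snmp tools make it easy to compare. A quick sketch (the IP address and community string are placeholders for your own device):

# GETNEXT-based walk: one request/response per value retrieved
snmpwalk -v2c -c public 192.0.2.10 IF-MIB::ifTable

# GETBULK-based walk: up to 50 values per request/response
snmpbulkwalk -v2c -c public -Cr50 192.0.2.10 IF-MIB::ifTable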

Understanding OIDs and MIBs

You’ll constantly hear about OIDs and MIBs when working with SNMP.

  • OID (Object Identifier): This is a unique, numeric address that identifies a specific piece of information on a managed device. It’s like a path in a hierarchical tree structure. For example, an OID might point to a specific network interface’s operational status or its traffic counter.
  • MIB (Management Information Base): A MIB is essentially a database or a “dictionary” that describes the OIDs available on a device. It maps human-readable names (like `ifDescr` for interface description) to their numeric OID counterparts and provides information about the data type, access rights (read-only, read-write), and meaning of the data. MIBs can be standard (e.g., IF-MIB for network interfaces, common across vendors) or vendor-specific.

To navigate and understand MIBs and OIDs, I highly recommend using a MIB browser. A great free tool is the iReasoning MIB Browser. It’s a Java application that you can download and run without extensive installation. You can load MIB files (often downloadable from vendor websites or found via Google) into it, visually explore the OID tree, see the numeric OID for a human-readable name, and get descriptions of what each OID represents.

For example, in a MIB browser, you might find that `ifOperStatus` (interface operational status) returns an integer. The MIB will tell you that `1` means “up,” `2` means “down,” `3` means “testing,” etc. This information is crucial for creating value mappings in Zabbix to display human-friendly statuses.
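
If you prefer the command line to a graphical MIB browser, net-snmp’s `snmptranslate` gives you the same name-to-OID mapping, assuming the relevant MIB files are installed locally:

# human-readable name to numeric OID
snmptranslate -On IF-MIB::ifOperStatus
.1.3.6.1.2.1.2.2.1.8

# full definition, including the integer-to-status mapping and the description
snmptranslate -Td IF-MIB::ifOperStatus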

SNMP Monitoring in Zabbix: The Evolution

Zabbix has supported SNMP for a long time, but the way we implement it has evolved, especially with recent versions.

The “Classic” Approach (Pre-Zabbix 6.4)

Traditionally, SNMP monitoring in Zabbix involved:

  • SNMP Agent Items: You’d create an item of type “SNMP agent” and provide the specific numeric OID (or its textual representation if the MIB was installed on the Zabbix server/proxy) in the “SNMP OID” field.
  • Discovery Rules: For discovering multiple instances (like network interfaces), you’d use a discovery rule, again specifying OIDs. The “SNMP OID” field would often look like `discovery[{#SNMPVALUE},oid1,{#IFDESCR},oid2,…]`, with each OID populating a Low-Level Discovery (LLD) macro.
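
For reference, a classic interface discovery rule would carry something like this in its “SNMP OID” field (the standard IF-MIB OIDs are shown purely as an illustration):

discovery[{#IFDESCR},1.3.6.1.2.1.2.2.1.2,{#IFOPERSTATUS},1.3.6.1.2.1.2.2.1.8]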

Limitations of the Classic Approach:

  • No True Bulk Requests: Even if “Use bulk requests” was checked in the host interface settings, it was more of an optimization for multiple *items* rather than fetching multiple OID values for a *single* item or discovery rule efficiently. Each OID in a discovery rule often meant a separate query.
  • Synchronous Polling: Each SNMP check would typically occupy a poller process until it completed.
  • Potential Device Overload: As mentioned earlier, many individual requests could strain the monitored device.

The Modern Approach with Zabbix 6.4+

Zabbix 6.4 brought a significant game-changer with new SNMP item types:

  • `snmp.get[OID]`: For fetching a single OID value.
  • `snmp.walk[OID1,OID2,…]`: This is the star of the show! It allows you to “walk” one or more OID branches and retrieve all underlying data in a single operation. The output is a large text block containing all the fetched OID-value pairs.
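
A couple of illustrative keys, using well-known standard OIDs (sysUpTime and the IF-MIB interface columns):

snmp.get[1.3.6.1.2.1.1.3.0]                          // sysUpTime: a single value
snmp.walk[1.3.6.1.2.1.2.2.1.2,1.3.6.1.2.1.2.2.1.8]   // walks ifDescr and ifOperStatus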

Key Benefits of the `snmp.walk` Approach:

  • True Bulk SNMP Requests: The `snmp.walk` item inherently uses bulk SNMP operations (for SNMPv2c and v3), making data collection much more efficient.
  • Asynchronous Polling Support: These new item types work with Zabbix’s asynchronous polling capabilities, meaning a poller can initiate many requests without waiting for each one to complete, freeing up pollers for other tasks.
  • Reduced Load on Monitored Devices: Fewer, more efficient requests mean less stress on your network devices.
  • Master/Dependent Item Architecture: The `snmp.walk` item is typically used as a “master item.” It collects a large chunk of data once. Then, multiple “dependent items” (including discovery rules and item prototypes) parse the required information from this master item’s output without making additional SNMP requests.

Implementing the Modern SNMP Approach in Zabbix

Let’s break down how to set this up:

1. Configure the SNMP Interface on the Host

In Zabbix, when configuring a host for SNMP monitoring:

  • Add an SNMP interface.
  • Specify the IP address or DNS name.
  • Choose the SNMP version (v1, v2c, or v3). For v2c, you’ll need the community string (e.g., “public” or whatever your devices are configured with). For v3, you’ll configure the security name, level, protocols, and passphrases.
  • Max repetitions: This setting (the default is often 10) applies to `snmp.walk` items and controls how many “repeats” are requested in a single SNMP GETBULK PDU. It influences how much data is retrieved per underlying bulk request.
  • Use combined requests: This is the *old* “Use bulk requests” checkbox. When using the new `snmp.walk` items, it’s generally not needed and can sometimes interfere. I usually recommend unchecking it if you’re fully embracing the `snmp.walk` methodology; the `snmp.walk` item itself handles the efficient bulk retrieval.

2. Create the Master `snmp.walk` Item

This item will fetch all the data you need for a set of related metrics or a discovery process.

  • Type: SNMP agent
  • Key: `snmp.walk[oid.branch.1, oid.branch.2, …]`

    Example: `snmp.walk[IF-MIB::ifDescr, IF-MIB::ifOperStatus, IF-MIB::ifAdminStatus]` or using numeric OIDs.

  • Type of information: Text (as it returns a large block of text).
  • Set an appropriate update interval.

This item will collect data like:


IF-MIB::ifDescr.1 = STRING: lo
IF-MIB::ifDescr.2 = STRING: eth0
IF-MIB::ifOperStatus.1 = INTEGER: up(1)
IF-MIB::ifOperStatus.2 = INTEGER: down(2)
...and so on for all OIDs specified in the key.

3. Create a Discovery Rule (Dependent on the Master Item)

If you need to discover multiple instances (e.g., network interfaces, storage volumes):

  • Type: Dependent item
  • Master item: Select the `snmp.walk` master item created above.
  • Preprocessing Steps: This is where the magic happens!

    • Add a preprocessing step: SNMP walk to JSON.

      • Parameters: This is where you define your LLD macros and map them to the OIDs from the `snmp.walk` output.


        {#IFDESCR} => IF-MIB::ifDescr
        {#IFOPERSTATUS} => IF-MIB::ifOperStatus
        {#IFADMINSTATUS} => IF-MIB::ifAdminStatus
        // or using numeric OIDs:
        {#IFDESCR} => .1.3.6.1.2.1.2.2.1.2
        {#IFOPERSTATUS} => .1.3.6.1.2.1.2.2.1.8

      • This step transforms the flat text output of `snmp.walk` into a JSON structure that Zabbix LLD can understand. It uses the SNMP index (the number after the last dot in the OID, e.g., `.1`, `.2`) to group related values for each discovered instance. Zabbix automatically makes `{#SNMPINDEX}` available.

The `SNMP walk to JSON` preprocessor will generate JSON like this, which LLD uses to create items based on your prototypes:


[
  { "{#SNMPINDEX}": "1", "{#IFDESCR}": "lo",   "{#IFOPERSTATUS}": "1", ... },
  { "{#SNMPINDEX}": "2", "{#IFDESCR}": "eth0", "{#IFOPERSTATUS}": "2", ... }
]

4. Create Item Prototypes (Dependent on the Master Item)

Within your discovery rule, you’ll create item prototypes:

  • Type: Dependent item
  • Master item: Select the same `snmp.walk` master item.
  • Key: Give it a unique key, often incorporating LLD macros, e.g., `if.operstatus[{#IFDESCR}]`
  • Preprocessing Steps:

    • Add a preprocessing step: SNMP walk value.

      • Parameters: Specify the OID whose value you want to extract for this specific item prototype, using `{#SNMPINDEX}` to get the value for the correct discovered instance.


        IF-MIB::ifOperStatus.{#SNMPINDEX}
        // or numeric:
        .1.3.6.1.2.1.2.2.1.8.{#SNMPINDEX}

    • Add other necessary preprocessing steps (e.g., “Change per second,” “Multiply by 8” to convert octet counters to bits per second, custom scripts, value mapping).

For static items (not discovered) that should also use the data from the `snmp.walk` master item, create them as dependent items directly under the host, again using the “SNMP walk value” preprocessing step, but specify the full OID including the static index (e.g., `IF-MIB::ifOperStatus.1` if you always want the status of the interface with SNMP index 1).
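
A minimal sketch of that static case (the dependent item key is just an invented name):

Master item key:    snmp.walk[IF-MIB::ifOperStatus]
Dependent item key: interface1.operstatus
Preprocessing:      SNMP walk value => IF-MIB::ifOperStatus.1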

Practical Tips & Troubleshooting

  • Use `snmpwalk` Command-Line Tool: Before configuring in Zabbix, test your OIDs and community strings from your Zabbix server or proxy using the `snmpwalk` command (part of `net-snmp-utils` or similar packages on Linux).

    Example: `snmpwalk -v2c -c public your_device_ip IF-MIB::interfaces`

    Use the `-On` option (`snmpwalk -v2c -c public -On your_device_ip .1.3.6.1.2.1.2`) to see numeric OIDs, which can be very helpful.

  • Check Zabbix Server/Proxy Logs: If things aren’t working, the logs are your best friend. Increase the debug level if necessary (see the sketch after this list).
  • Consult Zabbix Documentation: The official documentation is a valuable resource for item key syntax and preprocessing options.
  • Test Preprocessing Steps: Zabbix allows you to test your preprocessing steps. For dependent items, you can copy the output of the master item and paste it as input for testing the dependent item’s preprocessing. This is invaluable for debugging `SNMP walk to JSON` and `SNMP walk value`.
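
For the debug-level tip above, here’s a small sketch of the runtime commands (run on the Zabbix server itself; the same -R options work for zabbix_proxy):

# raise poller verbosity one level (repeat to go higher, the maximum is 5/trace)
zabbix_server -R log_level_increase=poller

# and bring it back down when you’re done
zabbix_server -R log_level_decrease=poller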

Wrapping Up

The introduction of `snmp.walk` and the refined approach to SNMP in Zabbix 6.4+ is a massive improvement. It leads to more efficient polling, less load on your monitored devices, and a more streamlined configuration once you grasp the master/dependent item concept with preprocessing.

While it might seem a bit complex initially, especially the preprocessing steps, the benefits in performance and scalability are well worth the learning curve. Many of the newer official Zabbix templates are already being converted to use this `snmp.walk` method, but always check, as some older ones might still use the classic approach.

That’s all for today! I hope this deep dive into modern SNMP monitoring with Zabbix has been helpful. I got a bit long, but there was a lot to cover!


What are your experiences with SNMP in Zabbix? Have you tried the new `snmp.walk` items? Let me know in the comments below!

Don’t forget to check out my YouTube channel for more content:

Quadrata on YouTube

And join the Zabbix Italia community on Telegram:

ZabbixItalia Telegram Channel

See you next week, perhaps talking about something other than Zabbix for a change! Bye everyone, from Dimitri Bellini.

Deep Dive into Zabbix Low-Level Discovery & the Game-Changing 7.4 Update

Good morning, everyone, and welcome back to Quadrata! This is Dimitri Bellini, and on this channel, we explore the fascinating world of open source and IT. I’m thrilled you’re here, and if you enjoy my content, please give this video a like and subscribe if you haven’t already!

I apologize for missing last week; work had me on the move. But I’m back, and with the recent release of Zabbix 7.4, I thought it was the perfect time to revisit a powerful feature: Low-Level Discovery (LLD). There’s an interesting new function in 7.4 that I want to discuss, but first, let’s get a solid understanding of what LLD is all about.

What Exactly is Zabbix Low-Level Discovery?

Low-Level Discovery is a fantastic Zabbix feature that automates the creation of items, triggers, and graphs. Think back to the “old days” – or perhaps your current reality if you’re not using LLD yet. Manually creating monitoring items for every CPU core, every file system, every network interface on every host… it’s a painstaking and error-prone process, especially in dynamic environments.

Imagine:

  • A new mount point is added to a server. If you forget to add it to Zabbix, you won’t get alerts if it fills up. Disaster!
  • A network switch with 100 ports. Manually configuring monitoring for each one? A recipe for headaches.

LLD, introduced way back in Zabbix 2.0, came to our rescue. It allows Zabbix to automatically discover resources on a host or device and create the necessary monitoring entities based on predefined prototypes.

Why Do We Need LLD?

  • Eliminate Manual Toil: Say goodbye to the tedious task of manually creating items, triggers, and graphs.
  • Dynamic Environments: Automatically adapt to changes like new virtual machines, extended filesystems, or added network ports.
  • Consistency: Ensures that all similar resources are monitored in the same way.
  • Accuracy: Reduces the risk of human error and forgotten resources.

How Does Low-Level Discovery Work?

The core principle is quite straightforward:

  1. Discovery Rule: You define a discovery rule on a host or template. This rule specifies how Zabbix should find the resources.
  2. Data Retrieval: Zabbix (or a Zabbix proxy) queries the target (e.g., a Zabbix agent, an SNMP device, an HTTP API) for a list of discoverable resources.
  3. JSON Formatted Data: The target returns the data in a specific JSON format. This JSON typically contains an array of objects, where each object represents a discovered resource and includes key-value pairs. A common format uses macros like {#FSNAME} for a filesystem name or {#IFNAME} for an interface name.


    {
      "data": [
        { "{#FSNAME}": "/",        "{#FSTYPE}": "ext4" },
        { "{#FSNAME}": "/boot",    "{#FSTYPE}": "ext4" },
        { "{#FSNAME}": "/var/log", "{#FSTYPE}": "xfs" }
      ]
    }

  4. Prototype Creation: Based on the received JSON data, Zabbix uses prototypes (item prototypes, trigger prototypes, graph prototypes, and even host prototypes) to automatically create actual items, triggers, etc., for each discovered resource. For example, if an item prototype uses {#FSNAME} in its key, Zabbix will create a unique item for each filesystem name returned by the discovery rule.

The beauty of this is its continuous nature. Zabbix periodically re-runs the discovery rule, automatically creating entities for new resources and, importantly, managing resources that are no longer found.
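
To make the prototype idea from step 4 concrete, an item prototype for the filesystem discovery above could look roughly like this (the key is the standard Zabbix agent key; the name and units are just one sensible choice):

Name:  Space utilization on {#FSNAME}
Key:   vfs.fs.size[{#FSNAME},pused]
Type:  Zabbix agent
Units: %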

Out-of-the-Box vs. Custom Discoveries

Zabbix comes with several built-in LLD rules, often found in default templates:

  • File systems: Automatically discovers mounted file systems (e.g., on Linux and Windows).
  • Network interfaces: Discovers network interfaces.
  • SNMP OIDs: Discovers resources via SNMP.
  • Others like JMX, ODBC, Windows services, and host interfaces.

But what if you need to discover something specific to your custom application or a unique device? That’s where custom LLD shines. Zabbix is incredibly flexible, allowing almost any item type to become a source for discovery:

  • Zabbix agent (system.run[]): Execute a script on the agent that outputs the required JSON.
  • External checks: Similar to agent scripts but executed on the Zabbix server/proxy.
  • HTTP agent: Perfect for querying REST APIs that return lists of resources.
  • JavaScript items: Allows for complex logic, multiple API calls, and data manipulation before outputting the JSON.
  • SNMP agent: For custom SNMP OID discovery.

The key is that your custom script or check must output data in the LLD JSON format Zabbix expects.
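
As a small, hedged example of what such a custom check could look like: a shell script that discovers application instance directories and prints LLD JSON (the path, macro name, and item key are all invented; wiring it up via a UserParameter is just one option, system.run[] or an external check would work the same way):

#!/bin/sh
# Emit Zabbix LLD JSON: one object per directory under /opt/myapp/instances
printf '{"data":['
sep=""
for d in /opt/myapp/instances/*/; do
    name=$(basename "$d")
    printf '%s{"{#INSTANCE}":"%s"}' "$sep" "$name"
    sep=","
done
printf ']}\n'

On the agent side, this could be exposed with a line like `UserParameter=myapp.instance.discovery,/usr/local/bin/myapp_lld.sh`, and that key then becomes the item key of the discovery rule.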

Configuring a Discovery Rule: Key Components

When you set up a discovery rule, you’ll encounter several important configuration tabs:

  • Discovery rule (main tab): Define the item type (e.g., Zabbix agent, HTTP agent), key, update interval, etc. This is where you also configure how Zabbix handles “lost” resources.
  • Preprocessing: Crucial for custom discoveries! You can take the raw output from your discovery item and transform it. For example, convert CSV to JSON, use regular expressions, or apply JSONPath to extract specific parts of a complex JSON.
  • LLD macros: Here, you map the keys from your discovery JSON (e.g., {#FSNAME}) to JSONPath expressions that tell Zabbix where to find the corresponding values in the JSON output from the preprocessing step (see the example after this list).
  • Filters: Include or exclude discovered resources based on regular expressions matching LLD macro values.
  • Overrides: A more advanced feature allowing you to change specific attributes (like item status, severity of triggers, tags) for discovered objects that match certain criteria.
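
For the LLD macros tab mentioned above, the mapping is simply macro-to-JSONPath. For instance, if a (hypothetical) API returns objects like `{"fsname": "/", "fstype": "ext4"}`, the tab would contain:

{#FSNAME}  =>  $.fsname
{#FSTYPE}  =>  $.fstype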

Managing Lost Resources: A Welcome Improvement

A critical aspect of LLD is how it handles resources that were previously discovered but are no longer present. For a long time, we had the “Keep lost resources period” setting. If a resource disappeared, Zabbix would keep its associated items, triggers, etc., for a specified duration (e.g., 7 days) before deleting them. During this period, the items would often go into an unsupported state as Zabbix tried to query non-existent resources, creating noise.

Starting with Zabbix 7.0, a much smarter option was introduced: “Disable lost resources.” Now, you can configure Zabbix to immediately (or after a period) disable items for lost resources. This is fantastic because:

  • It stops Zabbix from trying to poll non-existent resources, reducing load and unsupported item noise.
  • The historical data for these items is preserved until they are eventually deleted (if configured to do so via “Keep lost resources period”).
  • If the resource reappears, the items can be automatically re-enabled.

You can use these two settings in combination: for example, disable immediately but delete after 7 days. This offers great flexibility and a cleaner monitoring environment.

Prototypes: The Blueprints for Monitoring

Once LLD discovers resources, it needs templates to create the actual monitoring entities. These are called prototypes:

  • Item prototypes: Define how items should be created for each discovered resource. You use LLD macros (e.g., {#FSNAME}) in the item name, key, etc.
  • Trigger prototypes: Define how triggers should be created.
  • Graph prototypes: Define how graphs should be created.
  • Host prototypes: This is a particularly powerful one, allowing LLD to create *new hosts* in Zabbix based on discovered entities (e.g., discovering VMs from a hypervisor).
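
And to pair with an item prototype like the filesystem example shown earlier, a trigger prototype might look like this (the template name and the 90% threshold are assumptions, not anything prescribed by Zabbix):

Name:       {#FSNAME}: disk space is critically low
Expression: last(/Linux by Zabbix agent/vfs.fs.size[{#FSNAME},pused])>90
Severity:   High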

The Big News in Zabbix 7.4: Nested Host Prototypes!

Host prototypes have been around for a while, evolving significantly from Zabbix 6.0 to 7.0, gaining features like customizable interfaces, tags, and macro assignments for the discovered hosts. However, there was a significant limitation: a template assigned to a host created by a host prototype could not, itself, contain another host prototype to discover further hosts. In essence, nested host discovery wasn’t supported.

Imagine trying to monitor a virtualized environment:

  1. You discover your vCenter.
  2. You want the vCenter discovery to create host objects for each ESXi hypervisor. (Possible with host prototypes).
  3. Then, you want each discovered ESXi hypervisor host (using its assigned template) to discover all the VMs running on it and create host objects for those VMs. (This was the roadblock!).

With Zabbix 7.4, this limitation is GONE! Zabbix now supports nested host prototypes. This means a template applied to a discovered host *can* indeed contain its own host prototype rules, enabling multi-level, chained discoveries. This is a game-changer for complex environments like Kubernetes, container platforms, or any scenario with layered applications.

A Quick Look at How It Works (Conceptual Demo)

In the video, I demonstrated this with a custom LLD setup:

  1. Initial Discovery: I used a simple system.run item that read a CSV file. This CSV contained information about “parent” entities (simulating, say, hypervisors).
  2. Preprocessing: A “CSV to JSON” preprocessing step converted this data into the LLD JSON format.
  3. LLD Macros: I defined LLD macros like {#HOST} and {#HOSTGROUP}.
  4. Host Prototype (Level 1): A host prototype used these macros to create new hosts in Zabbix and assign them a specific template (let’s call it “Template A”).
  5. The Change in 7.4:

    • In Zabbix 7.0 (and earlier): If “Template A” itself contained a host prototype (e.g., to discover “child” entities like VMs), that nested host prototype would simply not appear or function on the hosts created by the Level 1 discovery. The Zabbix documentation even explicitly stated this limitation.
    • In Zabbix 7.4: “Template A” *can* now have its own discovery rules and host prototypes. So, when the Level 1 discovery creates a host and assigns “Template A”, “Template A” can then kick off its *own* LLD process to discover and create further hosts (Level 2).

This allows for a much more dynamic and hierarchical approach to discovering and monitoring complex infrastructures automatically.
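
To make the demo a bit more tangible, here’s roughly what the data looked like at each stage (the host and group names are invented for the example):

host,hostgroup
hv-01,Hypervisors
hv-02,Hypervisors

After the “CSV to JSON” preprocessing step (with the header row option enabled), Zabbix sees an array like `[{"host":"hv-01","hostgroup":"Hypervisors"}, …]`, the LLD macros tab maps {#HOST} to `$.host` and {#HOSTGROUP} to `$.hostgroup`, and the host prototype uses those macros for the new host’s name and host group.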

Conclusion: Embrace the Automation!

Low-Level Discovery is an indispensable Zabbix feature for anyone serious about efficient and comprehensive monitoring. It saves incredible amounts of time, reduces errors, and keeps your monitoring setup in sync with your ever-changing IT landscape.

The introduction of “Disable lost resources” in Zabbix 7.0 was a great step forward, and now, with nested host prototypes in Zabbix 7.4, the power and flexibility of LLD have reached new heights. This opens up possibilities for automating the discovery of deeply layered applications and infrastructure in a way that wasn’t easily achievable before.

I encourage you to explore LLD in your Zabbix environment. Start with the out-of-the-box discoveries, and then don’t be afraid to dive into custom LLDs to tailor Zabbix perfectly to your needs.

What are your thoughts on Low-Level Discovery or this new Zabbix 7.4 feature? Are there any specific LLD scenarios you’d like me to cover in a future video? Let me know in the comments below! Your feedback is always appreciated.

Thanks for watching, and I’ll see you next week!

All the best,
Dimitri Bellini


Connect with me and the community:

Quadrata on YouTube

ZabbixItalia Telegram Channel