Showing You the Ropes with the Broken Hosts App for Splunk

A quick introduction to implementing the Broken Hosts App for Splunk, along with the basic configuration to get you started with this tool.


  • Conner Reis
  • Aug 17, 2020
  • Tested on Splunk Version: 8.0.5 (Broken Hosts App is compatible with 8.0, 7.3, 7.2, and 7.1)

Introduction

Since starting my journey here at Hurricane Labs, I have had the chance to dive into the benefits the Broken Hosts App for Splunk has to offer. Nothing is more important than knowing your data is coming into Splunk, and the Broken Hosts App takes on that monitoring role, letting you know if anything disrupts the flow of data.

In this tutorial, I will go over the implementation steps for Hurricane Labs’ Broken Hosts App for Splunk and get you started with the basic configurations to improve your Splunk experience using this app.

Installation

Let’s go ahead and get the Broken Hosts App installed on our Splunk server. You can find this app over on Splunkbase, via the Broken Hosts App for Splunk link.

Once you have the Broken Hosts App downloaded, head over to your Splunk server. In the top left corner, choose the “Apps” dropdown and select “Manage Apps.”

This will bring you to the Apps Management page of your Splunk server. From there, choose the “Install app from file” option, select the Broken Hosts App file you just downloaded, and choose “Upload.”

Just like that, you are one step closer to better monitoring of your data!

Broken Hosts App

Once installed, you can head over to the Broken Hosts App, where you are presented with your very own Broken Hosts dashboard. This dashboard contains four main parts:

  • Broken Hosts–Hosts that have not sent data to Splunk for too long
  • Future Hosts–Hosts that have data from the future
  • Broken Hosts Event Types–Eventtypes used by the Broken Hosts app
  • Lookup Suppressed Items–Items suppressed by the Broken Hosts Lookup

In this post, we will mainly be taking a look at the “Broken Hosts” panel, which will give us a breakdown of the current hosts that are, well, broken. This basically means that the host has failed to send in some type of data within a configured amount of time.

In the picture below we can see multiple entries for the host WIN-AT43IJOE0PM failing to send in different types of wineventlog and perfmon events:

The Broken Hosts panel will give you a breakdown of all that is going on in your environment with hosts that are failing to send data to Splunk. This makes monitoring your hosts that much simpler.

We can also confirm this broken host by running a simple search and noticing that no data has come in for about 2 hours:

The Broken Hosts App works by comparing the time since the last event (in seconds) to a configured expected amount of time (4 hours by default). When the time since the last event exceeds the configured threshold, Broken Hosts will alert and the entry will show up within the panel.
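To make that comparison concrete, here is a minimal Python sketch of the idea. It is only an illustration of the logic described above, not the app's actual search: the function name `is_broken` and its parameters are hypothetical, and timestamps are assumed to be epoch seconds.

```python
DEFAULT_LATE_SECONDS = 4 * 60 * 60  # the 4-hour default mentioned above


def is_broken(last_event_epoch, now_epoch, late_seconds=DEFAULT_LATE_SECONDS):
    """Return True when the time since the last event exceeds the allowed lateness."""
    seconds_since_last_event = now_epoch - last_event_epoch
    return seconds_since_last_event > late_seconds


# Fixed "current time" so the example is reproducible (hypothetical value).
now = 1_600_000_000
two_hours_ago = now - 2 * 60 * 60

print(is_broken(two_hours_ago, now))                     # False: within the 4-hour default
print(is_broken(two_hours_ago, now, late_seconds=3600))  # True: exceeds a 1-hour tune
```

With the default threshold, a host that has been quiet for 2 hours is not yet flagged. With a 1-hour tune like the one used in this demonstration, the same host shows up as broken.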

For a more in-depth explanation of this process, please check out the following post: Breakdown of the Hurricane Labs Broken Hosts App for Splunk

These tuning entries can be configured however best suits your needs. For the purpose of this demonstration, the tune was set fairly aggressively to alert after 1 hour of no events. I will cover tuning and suppressions next.

Tuning and Suppression

As mentioned above, Broken Hosts can be configured to your liking. This is usually done by creating new tunes to better accommodate your data, or suppressions, which stop alerting on specific data. Tuning and suppression entries can be added, edited, or removed by going into the “Configure Broken Hosts Lookup” tab within the Broken Hosts App.

Here we can see a number of tunes set for different sourcetypes coming from the host WIN-AT43IJOE0PM. These are currently tuned to alert after 3600 seconds (1 hour). Creating a new tune is as easy as choosing the green box labeled “Add New Suppression.”

We are brought to the New Suppression window. Here we can start filling out the requested fields:

  • Index–The index for the data that you would like to match.
  • Sourcetype–The sourcetype for the data that you would like to match.
  • Host–The host for the data that you would like to match.
  • Late Seconds–The amount of time (in seconds) that the index/sourcetype/host combination is allowed to be late before it alerts.
  • Suppress Until–Alerts for the index/sourcetype/host combination will be suppressed until this date.
  • Add Contacts–The email address where you would like the alert to be sent.
  • Comments–Any comments that you would like to add. This is not used in the alert.

Wildcards can be used within the index/sourcetype/host fields. This can make it easier to create a tune for all combinations of each field.

In this example, we used the wildcard wineventlog* for the sourcetype, which covers all three of the previous tunes and alerts after 21600 seconds (6 hours) of no events. It is always good practice to remove any older, superseded tuning entries from the Broken Hosts Lookup to keep results accurate.
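The effect of a wildcard tune can be sketched in a few lines of Python. This is an approximation under stated assumptions, not how the app performs its lookup: the tuple layout, the `late_seconds_for` helper, the index `main`, and the sourcetype names `wineventlog:security` and `perfmon:cpu` are all hypothetical, and Python's `fnmatchcase` is used as a stand-in for Splunk's `*` wildcard matching (it also treats `?` and `[...]` specially, which Splunk does not).

```python
from fnmatch import fnmatchcase

DEFAULT_LATE_SECONDS = 14400  # 4 hours, the default mentioned earlier

# Hypothetical lookup rows: (index pattern, sourcetype pattern, host pattern, late seconds)
TUNES = [
    ("*", "wineventlog*", "WIN-AT43IJOE0PM", 21600),  # 6 hours
]


def late_seconds_for(index, sourcetype, host, tunes=TUNES):
    """Return the late-seconds value of the first tune whose patterns all match."""
    for index_pat, sourcetype_pat, host_pat, late_seconds in tunes:
        if (
            fnmatchcase(index, index_pat)
            and fnmatchcase(sourcetype, sourcetype_pat)
            and fnmatchcase(host, host_pat)
        ):
            return late_seconds
    return DEFAULT_LATE_SECONDS  # no tune matched, fall back to the default


print(late_seconds_for("main", "wineventlog:security", "WIN-AT43IJOE0PM"))  # 21600
print(late_seconds_for("main", "perfmon:cpu", "WIN-AT43IJOE0PM"))           # 14400
```

A single wildcard row covers every sourcetype beginning with wineventlog for that host, while anything else falls back to the default threshold.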

Suppressions work exactly the same way as tuning. The only difference is that Late Seconds is set to “0,” which means always suppress. The suppression can be made permanent by entering “0” for Suppress Until, or you can pick a specific date and time to suppress until. This is a good choice if a host has been decommissioned, is going to be down for maintenance, or if you just don’t care whether data is coming in. We are not here to judge how you handle your Broken Hosts! In the picture below, we can see a basic permanent suppression being made for host WIN-AT43IJOE0PM.

Since we have added a permanent suppression on the host which covers any type of data coming in from that host, we can go back to our Broken Hosts dashboard and see that there are currently no more Broken Host alerts for any of the wineventlog or perfmon events.
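The suppression rules described in this section can be summarized in a short Python sketch. It follows only the description in this post, not the app's source: the `is_suppressed` helper and its parameters are hypothetical, and the behavior once a Suppress Until date has passed (alerts resume) is an assumption.

```python
from datetime import datetime


def is_suppressed(late_seconds, suppress_until, now):
    """Decide whether alerts for a lookup entry are currently suppressed.

    late_seconds: 0 marks the entry as a suppression rather than a tune.
    suppress_until: 0 for a permanent suppression, or a datetime when it ends.
    """
    if late_seconds != 0:
        return False  # a regular tune: the normal lateness check applies
    if suppress_until == 0:
        return True  # permanent suppression
    return now < suppress_until  # assumption: alerts resume after this date


now = datetime(2020, 8, 17, 12, 0)  # fixed time so the example is reproducible

print(is_suppressed(0, 0, now))                      # True: permanent suppression
print(is_suppressed(0, datetime(2020, 8, 18), now))  # True: still inside the window
print(is_suppressed(0, datetime(2020, 8, 16), now))  # False: window has passed
print(is_suppressed(3600, 0, now))                   # False: this is a tune
```

The permanent suppression created for WIN-AT43IJOE0PM corresponds to the first case, which is why its entries no longer appear on the dashboard.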

Conclusion

As you can see, the Broken Hosts App for Splunk can be a powerful tool when used within a Splunk environment. This tool can keep you on top of your data inputs, making sure data is coming in without issue.

I hope this post has helped show the benefits of the Broken Hosts App and allows you to better monitor your data. Happy Splunking!



