Have you ever had a search fail because the data isn't there? The Broken Hosts App for Splunk alerts you when hosts stop sending data into Splunk.
Features:
* The amount of time before alerting is tunable per host, sourcetype, and/or index.
* Alert suppressions can be customized permanently or on a temporary basis.
* Alerts can be sent to different email addresses based on host, sourcetype, and/or index.
Nov. 14, 2016
The main part of this app is a saved search that finds the last time a log was received for each index/sourcetype/host combination and alerts when that data is later than "expected".
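To make the idea concrete, here is a minimal Python sketch of the same logic outside Splunk. It is not the app's saved search; the function names `last_seen` and `late_keys`, the tuple layout of the events, and the single `late_secs` threshold are all assumptions made for illustration.

```python
from typing import Dict, Iterable, List, Tuple

# (index, sourcetype, host): the grouping described above
Key = Tuple[str, str, str]


def last_seen(events: Iterable[Tuple[str, str, str, float]]) -> Dict[Key, float]:
    """Return the most recent timestamp (epoch seconds) seen for each key.

    Each event is a hypothetical (index, sourcetype, host, timestamp) tuple.
    """
    latest: Dict[Key, float] = {}
    for index, sourcetype, host, ts in events:
        key = (index, sourcetype, host)
        if ts > latest.get(key, float("-inf")):
            latest[key] = ts
    return latest


def late_keys(latest: Dict[Key, float], now: float, late_secs: float) -> List[Key]:
    """Return the keys whose newest event is more than late_secs old."""
    return [key for key, ts in latest.items() if now - ts > late_secs]
```

With one threshold for everything, a host that last reported 750 seconds ago is flagged when `late_secs` is 600, while one that reported 100 seconds ago is not. The app refines this with per-item thresholds from the lookup table.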
A few defaults are defined by macros, which the app setup configures.
The contact and "late seconds" can be configured for different indexes/sourcetypes/hosts in the "expectedTime" lookup table (the Lookup Editor app is really helpful, since it allows you to edit the lookup table from within Splunk).
The search runs every 30 minutes, and will wait 1 hour before re-alerting for the same items.
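The re-alert wait can be pictured with a small Python sketch. This only illustrates the behavior described above, not how Splunk implements it; the `AlertThrottle` class and its method names are hypothetical.

```python
from typing import Dict, Hashable

REALERT_WAIT_SECS = 3600  # the 1 hour re-alert wait described above


class AlertThrottle:
    """Remember when each item last alerted and suppress repeats within the wait."""

    def __init__(self, wait_secs: float = REALERT_WAIT_SECS) -> None:
        self.wait_secs = wait_secs
        self._last_alert: Dict[Hashable, float] = {}

    def should_alert(self, item: Hashable, now: float) -> bool:
        """Return True and record the alert unless this item alerted too recently."""
        last = self._last_alert.get(item)
        if last is not None and now - last < self.wait_secs:
            return False
        self._last_alert[item] = now
        return True
```

With a 30 minute schedule, an item that alerts on one run is skipped on the next run and becomes eligible again on the run after that, provided it is still late.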
Each line of the lookup table has several columns. The first three (index, sourcetype, host) are used to select which data you are adjusting settings for. These are case-insensitive and wildcard enabled fields.
The next column is "lateSecs". This is the number of "late seconds" for this host: the amount of time that a host can be late before an alert is sent.
The fifth column is "suppressUntil". This allows you to temporarily suppress the alerts.
The next column is "contact". This allows you to send the alerts for different items to different email addresses.
The final column is "comments". This is a non-functional column that is intended to help remember why a line was set a certain way.
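How these columns combine can be sketched in Python. This is an illustration under stated assumptions, not the app's actual search: the `Row` class, `find_row`, `should_alert`, and the `default_late_secs` parameter are hypothetical names, "suppressUntil" is assumed to be an epoch timestamp (the real format is not specified here), and `fnmatchcase` is used as a stand-in for Splunk's wildcard matching. Rows are tried from the top and the first match wins.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase
from typing import List, Optional


@dataclass
class Row:
    """One line of a hypothetical in-memory copy of the expectedTime lookup."""
    index: str
    sourcetype: str
    host: str
    late_secs: int
    suppress_until: Optional[float] = None  # assumed epoch seconds
    contact: str = ""
    comments: str = ""  # non-functional, kept for the human editor


def _matches(pattern: str, value: str) -> bool:
    # Case-insensitive wildcard match. fnmatchcase also treats "?" and "[...]"
    # as special, which may differ from Splunk's own wildcard rules.
    return fnmatchcase(value.lower(), pattern.lower())


def find_row(rows: List[Row], index: str, sourcetype: str, host: str) -> Optional[Row]:
    """Return the first row, from the top, whose three selectors all match."""
    for row in rows:
        if (_matches(row.index, index)
                and _matches(row.sourcetype, sourcetype)
                and _matches(row.host, host)):
            return row
    return None


def should_alert(rows: List[Row], index: str, sourcetype: str, host: str,
                 last_seen: float, now: float, default_late_secs: int) -> bool:
    """Decide whether an item is late enough to alert on and not suppressed."""
    row = find_row(rows, index, sourcetype, host)
    if row is not None and row.suppress_until is not None and now < row.suppress_until:
        return False  # temporarily suppressed
    late_secs = row.late_secs if row is not None else default_late_secs
    return now - last_seen > late_secs
```

Because the first match wins, a specific row such as `Row("main", "syslog", "WEB*", 7200)` only takes effect if it appears above a catch-all row such as `Row("*", "*", "*", 600)`.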
Because the lookup table is searched from the top down and Splunk takes the first match, it is recommended to put the lookup table entries in the following order:
The Broken Hosts dashboard can be used to get a visual picture of the current status of hosts.
The "Broken Hosts" panel shows all hosts that are not reporting in time.
The "Future Hosts" panel shows all hosts that are reporting timestamps from the future.
These panels allow you to quickly update the "expectedTime" lookup table to suppress a host from monitoring. Clicking on "Suppress" next to an item removes it from the dashboard and alerts by adding it to the "expectedTime" lookup table.
The "Suppressed Items" panel shows the current contents of the "expectedTime" lookup table.