Splunk Baseline Props Part 2: Building a Smart Foundation

This blog includes tips for validating the integrity of your data and keeping performance costs down while searching. Part 2 will focus on LINE BREAKING.



In Part 1 of this blog we talked about three key elements of events in Splunk, more specifically about capturing the correct timestamp of an event. Splunk is a technology built to analyze data, usually on a linear timeline, and if your timestamps are not accurate, neither are your results. From a technical perspective, separating events correctly is just as important as timestamping them. Splunk will automatically try to separate events, assigning the appropriate timestamp to each. This automation works fairly well, but some data feeds need additional configuration. That’s why 3 of the 6 baseline props configurations revolve solely around LINE BREAKING.

LINE BREAKING - EXCUSE ME? SPLUNK DOESN’T DO THAT ITSELF?

While Splunk will automatically try to separate your events for you, the auto-parsing technology can sometimes be wrong. Not all data sources follow the same logging format, and that may cause issues down the line. Taking a look at the default configs below might persuade you to handle this at the time of initial data onboarding, providing a sound format for your data source to follow when it comes to separating events. Whether or not your data already looks great and is split correctly, this initial work can save a tremendous amount of processing overhead and broken-data headaches.

SHOULD_LINEMERGE = [TRUE|FALSE]

This setting controls whether or not Splunk should combine several lines of data into a single event (as with multi-line JSON or XML data). The default for this setting is “true”, meaning “yes, merge my lines”. In my experience, roughly 90% of the data sources I have onboarded did not require line merging. When I realized how easy it was to go back and remove any chance of inaccurate data on these critical data sources, I took advantage - because no one wants to deal with broken events - not you, not me, and definitely not your boss.

Think about it: if Splunk’s default is to attempt to merge multiple lines of data whenever it doesn’t see specific text to separate events on, you’re going to get dirty data (multiple events in one). If you know your data is “one event per line”, why not set Splunk straight from the beginning, so it spends less time working out where the actual line break is? If this does need to be set to “true”, check Splunk’s props.conf documentation for details on the other settings used in line merging.
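If you are not sure whether a source really is “one event per line”, a quick check over a sample file can help before you commit to SHOULD_LINEMERGE = false. The sketch below is only an illustration in Python, not anything Splunk runs: the timestamp pattern and the helper name is_one_event_per_line are assumptions made for this example, and the pattern only fits syslog-style timestamps such as “Mar 21 10:04:54”.

```python
import re

# Assumed pattern for a syslog-style timestamp ("Mar 21 10:04:54") at line start.
TIMESTAMP_AT_START = re.compile(r"^[A-Z][a-z]{2} [ \d]\d \d{2}:\d{2}:\d{2}")


def is_one_event_per_line(sample: str) -> bool:
    """Return True if every non-empty line in the sample starts with a timestamp."""
    lines = [line for line in sample.splitlines() if line.strip()]
    return bool(lines) and all(TIMESTAMP_AT_START.match(line) for line in lines)


single_line = (
    "Mar 21 10:04:54  ProxySG: 500000 Dynamic categorization error\n"
    "Mar 22 10:05:36  ProxySG: 500000 Dynamic categorization error\n"
)
multi_line = (
    "Mar 21 10:04:54 app: request failed\n"
    "    at handler (line 12)\n"
    "Mar 21 10:04:55 app: request ok\n"
)

print(is_one_event_per_line(single_line))  # True
print(is_one_event_per_line(multi_line))   # False
```

If the check fails on a representative sample, the source probably has multi-line events and needs either line merging or a more specific LINE_BREAKER.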

LINE_BREAKER = <REGULAR EXPRESSION>

This attribute specifies a regex that determines how the raw text stream is broken into initial events. Usually, the break will be a new line, optionally followed by a timestamp. The regex must contain a capturing group: wherever it matches, Splunk starts a new event and discards the text matched by the first group. Again, this will definitely help in avoiding broken or run-together events. If you’re dealing with events that span multiple lines (like XML events), setting SHOULD_LINEMERGE to “false” and providing an appropriate LINE_BREAKER regex will break your events correctly. The setting defaults to a new line, or more specifically, the regex:

([\r\n]+)
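To get a feel for what a LINE_BREAKER regex does, it can help to simulate it outside of Splunk. The Python sketch below is a simplified model, not Splunk’s implementation: it breaks the stream wherever the regex matches and throws away the text matched by the first capturing group. The helper name break_events and the timestamp-aware regex are assumptions made for this example.

```python
import re
from typing import List


def break_events(raw: str, line_breaker: str) -> List[str]:
    """Split raw text into events, discarding what the first capture group matched."""
    events = []
    start = 0
    for match in re.finditer(line_breaker, raw):
        events.append(raw[start:match.start(1)])  # text before the group ends the event
        start = match.end(1)                      # text after the group starts the next one
    events.append(raw[start:])
    return [event for event in events if event]   # drop empty leftovers


# The default breaker: every run of newline characters separates two events.
print(break_events("a\nb\r\n\r\nc\n", r"([\r\n]+)"))  # ['a', 'b', 'c']

# An assumed breaker for multi-line data: only break before a new timestamp.
BEFORE_TIMESTAMP = r"([\r\n]+)(?=[A-Z][a-z]{2} [ \d]\d \d{2}:\d{2}:\d{2})"
raw = (
    "Mar 21 10:04:54 app: request failed\n"
    "    at handler (line 12)\n"
    "Mar 21 10:04:55 app: request ok"
)
for event in break_events(raw, BEFORE_TIMESTAMP):
    print(repr(event))
```

With the second regex, the indented continuation line stays attached to the event above it, because the newline in front of it is not followed by a timestamp.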

TRUNCATE = <NON-NEGATIVE INTEGER>

This setting was built to give a ceiling value for how many bytes an event can be. The default here is 10,000 bytes. Many events need far less than that. Usually, I’ve seen events somewhere in the 500 byte range, but all data is different. Using a large sample set of data and knowing the typical and, more importantly, the longest length of your events in bytes can help you determine what this value should be for your data source. It is important to note that the TRUNCATE value exists partially to protect your instance from a memory overload. Setting the value too low will cut off the end of your events, but setting it too high (or unlimited) can introduce resource issues around your server’s memory consumption.
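One way to put a number on this is to measure a sample of events in bytes (not characters, since multi-byte characters count for more) and leave some headroom above the longest one. The Python sketch below is just one possible approach: the helper names and the headroom factor of 2 are assumptions made for this example, not a Splunk recommendation.

```python
from typing import List


def event_byte_lengths(events: List[str]) -> List[int]:
    """Return the UTF-8 encoded size of each event in bytes."""
    return [len(event.encode("utf-8")) for event in events]


def suggest_truncate(events: List[str], headroom: float = 2.0) -> int:
    """Hypothetical helper: longest observed event size times a safety factor."""
    longest = max(event_byte_lengths(events))
    return int(longest * headroom)


sample_events = [
    "Mar 21 10:04:54  ProxySG: 500000 Dynamic categorization error",
    "Mar 22 10:05:36  ProxySG: 500000 Dynamic categorization error: retry",
]
print(event_byte_lengths(sample_events))
print(suggest_truncate(sample_events))
```

The bigger and more representative the sample, the more you can trust the result. A handful of events will not show you the occasional oversized one.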

EXAMPLE - KNOW WHAT TO DO WITH YOUR LINE BREAKERS IN REAL-TIME

Now that we’ve gone through what these props.conf attributes are used for, let’s see how they are actually applied. For this exercise, we’ll look at the same proxy log we used in Part 1 of this blog:

Mar 21 10:04:54  ProxySG: 500000 Dynamic categorization error: unexpected response code 500 from service(0) SEVERE_ERROR myapi_api.cpp 100

Mar 22 10:05:36  ProxySG: 500000 Dynamic categorization error: unexpected response code 500 from service(0) SEVERE_ERROR myapi_api.cpp 100

Mar 23 10:07:11  ProxySG: 500000 Dynamic categorization error: unexpected response code 500 from service(0) SEVERE_ERROR myapi_api.cpp 100

Note: Each event is one long line of text, despite what the width of your monitor is showing you.

The human eye can notice that there are three different events here, separated by a timestamp (seeing as they happened at different times on different days).

The first thing you want to do is verify whether events span multiple lines or just one. In this case, we have a separate event on each line. As each event is its own line in the log file, we do *not* want Splunk to attempt to merge lines. This will also stop Splunk from spending resources on figuring out whether it should merge lines or not, reducing load at index-time. While resource consumption may not seem like an overly important issue with just a few data sources, trust that it is. As you add more and more data to Splunk, the last thing you want is a bottleneck that you could have avoided.

SHOULD_LINEMERGE = false

When onboarding data, we want to find a line breaker in the logs to separate events with. Seeing as each event is a new line, we can use the “new line” regex value for a line breaker regex. Splunk uses the “new line” value for a line breaker by default, but I am going to include this in my custom props TA for this data source just to be consistent and keep everything together:

LINE_BREAKER = ([\r\n]+)

When it comes to the “truncate” attribute, we need to be careful. If you’re hitting the default limit of 10k bytes, you may have garbage data. If you know that your events are legitimately above 10k, you can raise the limit, or set it to “0” to shut it off entirely, keeping in mind the memory risk of unlimited event sizes. In our case, we have relatively small events (each sample line is under 150 bytes), so we’re going to tune this limit down:

TRUNCATE = 256

Between Parts 1 and 2 of this post, we now have a full “baseline props” configuration we can use efficiently for this data source. This is going to be a major help for how much load your instance has to manage, combing through tons of events in Splunk, and building useful and correct dashboards for your organization.

Here’s what it looks like all together:

[proxy_console_syslog]
TIME_PREFIX = ^
TIME_FORMAT = %b %d %H:%M:%S
MAX_TIMESTAMP_LOOKAHEAD = 15
SHOULD_LINEMERGE = false
LINE_BREAKER = ([\r\n]+)
TRUNCATE = 256
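
Before deploying a stanza like this, you can sanity-check it against a sample line. The Python sketch below is an approximation and not how Splunk evaluates props.conf: it reads the stanza with Python’s configparser, takes the first MAX_TIMESTAMP_LOOKAHEAD characters after TIME_PREFIX, and parses them with TIME_FORMAT. The helper name extract_timestamp and the assumed_year value are assumptions made for this example (the log carries no year, so one has to be supplied for parsing).

```python
import configparser
import re
from datetime import datetime
from typing import Optional

PROPS = r"""
[proxy_console_syslog]
TIME_PREFIX = ^
TIME_FORMAT = %b %d %H:%M:%S
MAX_TIMESTAMP_LOOKAHEAD = 15
SHOULD_LINEMERGE = false
LINE_BREAKER = ([\r\n]+)
TRUNCATE = 256
"""

SAMPLE = (
    "Mar 21 10:04:54  ProxySG: 500000 Dynamic categorization error: unexpected "
    "response code 500 from service(0) SEVERE_ERROR myapi_api.cpp 100"
)


def extract_timestamp(event: str, stanza, assumed_year: int = 2024) -> Optional[datetime]:
    """Parse the timestamp found in the lookahead window after TIME_PREFIX."""
    prefix = re.match(stanza["TIME_PREFIX"], event)
    if prefix is None:
        return None
    lookahead = int(stanza["MAX_TIMESTAMP_LOOKAHEAD"])
    window = event[prefix.end():prefix.end() + lookahead]
    # The year is a placeholder assumption; the log line itself has none.
    return datetime.strptime(f"{assumed_year} {window}", "%Y " + stanza["TIME_FORMAT"])


# interpolation=None keeps "%b" literal; optionxform=str keeps the keys uppercase.
parser = configparser.ConfigParser(interpolation=None)
parser.optionxform = str
parser.read_string(PROPS)
stanza = parser["proxy_console_syslog"]

timestamp = extract_timestamp(SAMPLE, stanza)
fits = len(SAMPLE.encode("utf-8")) <= int(stanza["TRUNCATE"])
print(timestamp.strftime("%b %d %H:%M:%S"), fits)
```

If the timestamp fails to parse or the sample does not fit under TRUNCATE, that is a hint to revisit the stanza before any data gets indexed with it.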

Baseline props configurations for effectively organizing and parsing data need to happen before your data is indexed, which translates to “put these configurations on the first Splunk Enterprise server your data hits,” whether that be a heavy forwarder or an indexer. Keep in mind that creating new apps on your Splunk instance requires a Splunk restart.

Enjoy your clean data!