DNS Entropy Hunting and You

Sometimes your DNS logs tell a story; you just need to listen a little more closely to hear it. In this post, I discuss some Splunk queries from the SANS whitepaper Using Splunk to Detect DNS Tunneling, and how they can be tuned to provide actionable results in the real world.

To DNS: The root of, and solution to, all of our networking problems.

Introduction

Recently, I’ve been doing more threat hunting related activity. In the event you’ve had other things going on in your life and have no idea what threat hunting is, in a nutshell, it’s taking a proactive, human-oriented approach to searching for previously undiscovered threats to your enterprise network.

We say it is proactive and human-oriented because a lot of the time it’s a human being asking a question, or picking a stack of data to query (or as I call it, ‘poking it with a stick’), and seeing if anything nasty appears. We run these queries manually either because the query cannot easily be automated, or because we cannot reasonably expect a security control to enumerate all forms of bad in the stack of data we’re digging or “hunting” through.

DNS-Based Threat Hunting

That brings me to my current focus: DNS-based threat hunting, and in particular entropy and statistical analysis of DNS queries.

Before I continue, I’d like to thank SANS for making their reading room and the papers of their researchers freely available for practically anyone to download. This post is more or less just me trying to see if the techniques mentioned in Using Splunk to Detect DNS Tunneling are still viable, three years after the original document was published.

Three years may not seem like such a long time, but you have to consider: APT/intrusion set techniques and tradecraft become red team penetration tester techniques, and then become IOCs and/or hunting queries for the blue team to wipe off the face of the earth. What I’m trying to say is that attacker tradecraft is polymorphic by its very nature (having to adapt to find a way past competent defense), and detection methods have to adapt just as quickly to keep spotting them.

SANS Whitepaper Discussion

Before moving on to my results and conclusions, let's talk about the paper itself (re: Using Splunk to Detect DNS Tunneling). This paper, which was written by Steve Jaworski, covers using Splunk to detect DNS tunneling attacks and techniques.

As it turns out, if you have DNS logs from a DNS server (such as BIND, Infoblox, Microsoft DNS, etc.) and/or some form of passive DNS collection gathering queries off the wire (e.g. Suricata, Bro/Zeek, Splunk’s Stream app configured to monitor DNS queries, etc.), and you have the Splunk URL Toolbox app, you can look at your queries and detect queries to high entropy domains. These are domains that were likely generated by a computer algorithm, known as a DGA (Domain Generation Algorithm). Additionally, you can compute the length of each queried domain name, compare it to the average, and return queries to abnormally long domain names.

The paper also discusses the use of statistical analysis to determine times in which there is an abnormally high number of DNS queries (e.g. detecting spikes in DNS traffic during abnormal times), an abnormally large number of a particular type of DNS query (e.g. seeing an unusually high number of TXT DNS requests), or analyzing firewall logs to determine what external DNS servers systems are attempting to reach that are not recursive resolvers for your enterprise network (e.g. rogue resolvers, DNS tunneling endpoints, etc.).

Steve also goes through the process of showing readers how to reproduce his lab environment and results, recording his lab configuration in painstaking detail. It is impressive research and shows incredible dedication to be sure.

My Hunting and Splunk Experiences

Onward to my experiences. In my line of work, I am blessed with being able to test hunting/Splunk queries on a wide variety of customer networks, deployments, and datasets. So far, though, I’ve only tried these queries out on a handful of different networks.

All of this being said, my experiences may not be yours and you may have better (or worse) luck utilizing these queries for your own hunting purposes.


Cloud services have made some of these Splunk queries a little more unwieldy, but given time and enough filtering of benign results, they still have promise.

Visualization to Detect Higher Than Average Number of DNS Queries

By and large, my results were inconclusive. How so? Well, let's start with the easier of the queries to pick apart, the Splunk queries in which we are just summing up the number of DNS queries (in total and by type). The following is a Splunk query that will count the number of DNS queries for the time period you are searching, and timechart it.

sourcetype="stream:dns" | timechart span=10s count(_raw) as Count

The purpose of this query is to determine times of the day in which there is a noticeable uptick in DNS queries being made. This could be indicative of DNS tunneling, or other malware infections, and probably warrants investigation.

Issue 1: Scale

The problem with attempting to use this query in an enterprise network is that Steve’s lab environment, impressive though it may be, is nowhere near the scale of an enterprise network, so you have much more data to sift through. So much so that the original query in Steve’s paper, which sets the span option to 1s, results in timechart failing to run because there is too much data to plot for the default query time period of 24 hours. I modified the command to use a span of 10s, but that may or may not be optimal.

I’m not a Splunk expert, I just play one on TV. Also, enterprise networks vary in size and complexity so your mileage will definitely vary there.

Issue 2: Baselines

The other issue you run into is that this type of query assumes you have baselines for what is considered a normal number of DNS queries at various times of the day, which is almost never the case. So for this query to be of any practical use in the real world, you need to run it multiple times over multiple days (or multiples of whatever time period you ran the query for) to establish that baseline, then compare and contrast.

The first time you run this query you might see a humongous spike at say, 8am, or 1pm, or 5pm. These are normally times in which people are browsing the internet before work, at lunch, or at the end of the workday.

Are they normal? We have no idea, because we need a baseline or multiple results to compare and contrast to. What about huge spikes that occur after work hours? Are they necessarily bad? No idea. It could be automated patching or other maintenance jobs running or it could be bad traffic. We have no frame of reference without a proper baseline.
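To make the baseline idea a bit more concrete, here is a minimal Python sketch of what "compare and contrast" could look like once you have exported per-hour query counts for a few days. This is not from Steve's paper. The function names, the use of a median, the 2x ratio, and the sample counts are all my own assumptions for illustration.

```python
from collections import defaultdict
from statistics import median


def hourly_baseline(history):
    """Map hour of day -> median query count across previous days.

    history is a list of days; each day is a dict of {hour: query_count}.
    """
    by_hour = defaultdict(list)
    for day in history:
        for hour, count in day.items():
            by_hour[hour].append(count)
    return {hour: median(counts) for hour, counts in by_hour.items()}


def unusual_hours(today, baseline, ratio=2.0):
    """Return hours where today's count exceeds ratio x the baseline median."""
    return sorted(
        hour
        for hour, count in today.items()
        if hour in baseline and count > ratio * baseline[hour]
    )


# Hypothetical counts: a busy 8am and 1pm, and a normally quiet 7pm.
history = [
    {8: 9000, 13: 7000, 19: 400},
    {8: 9500, 13: 7200, 19: 450},
    {8: 8800, 13: 6900, 19: 420},
]
today = {8: 9100, 13: 7100, 19: 3000}
print(unusual_hours(today, hourly_baseline(history)))
```

With these made-up numbers, the big 8am spike is not flagged because it is big every day, while the much smaller 7pm bump is flagged because it is far above what 7pm usually looks like. That is the whole point of having a baseline.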

Detection of DNS Queries with High Levels of Entropy (indication of DGA)

Here is the other statistical query from Steve’s research paper I utilized:

sourcetype=stream:dns | stats count(query_type{}) as count by query_type{} | sort -count

The purpose of this query is to show you a breakdown count of DNS queries by type (e.g. A, AAAA, NULL, TXT, etc.) for a particular time frame. The goal would be to detect malicious DNS and/or DNS tunneling by noticing a large spike in uncommon DNS query type(s).

This query has the same problems as the previous query mentioned in that it assumes you have baselines established and you know what looks abnormal on your network. Without a baseline, you have no frame of reference to refer to. On the plus side however, these two queries can be used in conjunction pretty effectively.

As an example, let's say you ran the previous query and you noticed a spike of DNS queries at 7pm, well after everyone normally leaves the office. You could then choose to run our query above with a focused timeframe of say, 6:50pm to 7:10pm to give you a breakdown of DNS queries by type for that timeframe to see if anything stands out. A large number of MX queries when there is nobody around to be sending email would be pretty fishy and warrant further investigation, right?

Tuning for Better Results

Now that I’ve talked about some of the statistical analysis queries, let’s look at one of the queries that analyzes DNS queries and sorts them by an entropy threshold:

sourcetype="stream:dns" | eval utlist = "custom" | lookup ut_shannon_lookup word as query | search ut_shannon > 2.5 

The goal of this query is to analyze DNS queries using the URL Toolbox Shannon entropy calculator to determine a given query’s entropy score. From there, we display results with an entropy score greater than 2.5.

The higher the entropy score, the more likely a given DNS domain was algorithmically generated. The algorithms that produce these computer-generated domain names are referred to as Domain Generation Algorithms (DGAs), and some (a lot) of malware strains use DGAs as either a primary or fallback method for reaching attacker-controlled command and control servers to receive instructions/commands.
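If you are wondering what an entropy score actually measures, here is a small Python sketch of character-level Shannon entropy. I am assuming this is roughly the calculation behind ut_shannon, but I have not checked the URL Toolbox source, so treat it as an illustration of the concept and not a reimplementation of the app. The sample domain names are made up.

```python
import math
from collections import Counter


def shannon_entropy(text: str) -> float:
    """Return the character-level Shannon entropy of text, in bits per character."""
    if not text:
        return 0.0
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


# Hypothetical sample queries, not taken from real logs.
samples = ["aaaa.com", "google.com", "x7f3kq9zt2vbw8m1.example.net"]
for name in samples:
    print(f"{name}: {shannon_entropy(name):.2f}")
```

The more distinct characters a name uses, and the more evenly it uses them, the higher the score. A name made of one repeated letter scores zero, and a long random-looking label scores much higher than a short dictionary word.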

With instant gratification and the bottom line being a huge driving force in IT departments worldwide, cloud services have become extremely popular in recent years. Most major websites utilize cloud services to some extent; if not to host something then for redundancy via CDNs (content delivery networks).

The problem we run into in attempting to hunt for malicious domain names by their Shannon Entropy score is that a lot of these cloud services are generating domain names that have a high entropy score as well. This causes the query above to end up with a massive amount of benign results.

In addition to algorithmically generated names being “the new normal” for cloud services, something the original research paper didn’t account for was the sheer volume of data that would need to be sifted through. The number of DNS queries a given network makes on a daily basis varies greatly depending on a number of factors, but to say that you’d have to sift through billions of DNS queries for a single 24 hour period is not at all outlandish. This is a huge amount of data for a single analyst to sift through.

The only viable solution I can see for making this query usable is to reduce the amount of data the query returns. You can do this by reducing the time frame for a query (e.g. looking at 8 hour chunks of DNS data as opposed to an entire 24 hours of DNS data at a time), and/or filtering out the domains, subdomains, or TLDs you do not want to see results from.

Here is a modified query you might want to consider instead:

sourcetype="stream:dns" | eval utlist = "custom" | lookup ut_shannon_lookup word as query | search ut_shannon > 2.5 AND NOT query IN ([list of comma separated domains, TLDs, or substrings that you do not want to see results for]) | dedup query | table query ut_shannon | sort  - ut_shannon

The query above differs in that we have a portion in which we can provide a comma separated list of domains, TLDs, and/or substrings we can add to the query to filter what results we get back. Additionally, we use the dedup function to reduce the number of duplicate results we get back.

Let's look at an example:

sourcetype="stream:dns" | eval utlist = "custom" | lookup ut_shannon_lookup word as query | search ut_shannon > 2.5 AND NOT query IN ( *.LOCAL, *.localdomain, *.corporate_domain, *.cloudflare.* ) | dedup query | table query ut_shannon | sort  - ut_shannon

This query filters out all results that end in .LOCAL, .localdomain, or .corporate_domain, as well as queries containing “.cloudflare.”. It can easily be modified to ONLY look at queries from a particular domain or containing a particular string by removing the “NOT” modifier from the query and inputting the domains/subdomains/substrings you want to search for.
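For anyone who thinks better in code than in SPL, the filter-then-dedup logic of that example can be sketched in Python like this. It is only an approximation: I am using fnmatch-style wildcards and lowercasing everything on the assumption that Splunk's search matching is case-insensitive, and the function name and sample queries are hypothetical.

```python
from fnmatch import fnmatchcase


def filter_queries(queries, exclude_patterns):
    """Drop queries matching any wildcard pattern, then de-duplicate in order."""
    seen = set()
    kept = []
    for query in queries:
        name = query.lower()
        if any(fnmatchcase(name, pat.lower()) for pat in exclude_patterns):
            continue  # benign or internal name we do not want to review
        if name not in seen:
            seen.add(name)
            kept.append(name)
    return kept


exclude = ["*.LOCAL", "*.localdomain", "*.corporate_domain", "*.cloudflare.*"]
queries = [
    "printer.LOCAL",
    "cdn.cloudflare.net",
    "x7f3kq9zt2.example.net",
    "x7f3kq9zt2.example.net",
]
print(filter_queries(queries, exclude))
```

Inverting the logic, as described above, would just mean keeping the names that match a pattern instead of skipping them.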

While the modified queries and query ideas I have presented will result in some of the data not being analyzed, they make the queries usable, a bit more scalable, and allow for filtering out benign data that analysts aren’t interested in. Given the choice between a query that returns unmanageable amounts of data, and one that returns something analysts can sift through in a more reasonable amount of time, I decided that the lesser of two evils is usable data. This leads us to our next query, which is somewhat related.

Detection of DNS Queries with Abnormally Long Length (3 or more times the average length)

sourcetype="stream:dns" | eval qlen=len(query) | eventstats avg(qlen) as avg stdev(qlen) as stdev | where qlen>(stdev*3) | stats count by qlen stdev avg query

The purpose of this query is to collect DNS queries, compute the length of each one, calculate the average and standard deviation of those lengths, and return any query longer than three times the standard deviation. Note that the where clause compares against stdev*3 alone, which is not the same thing as three times the average length, or as the average plus three standard deviations. Much like the previous query measuring the entropy of DNS queries, this Splunk query suffers from the fact that obnoxiously long domain names for cloud services and CDNs are the new normal for internet traffic, so you end up sifting through a sheer volume of benign results.
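To show how much the choice of threshold matters, here is a small Python sketch that applies the same "longer than stdev*3" test as the query above, with an optional variant that adds the average first. The variant is a common outlier convention that I am bringing in for comparison; it is not what the paper's query does. The function name and the sample data are made up.

```python
from statistics import mean, stdev


def flag_long_queries(queries, k=3.0, add_mean=False):
    """Return queries whose length exceeds k*stdev (or mean + k*stdev)."""
    lengths = [len(q) for q in queries]
    threshold = k * stdev(lengths)  # sample standard deviation
    if add_mean:
        threshold += mean(lengths)
    return [q for q in queries if len(q) > threshold]


# Hypothetical data: 99 ordinary 24-character names and one 70-character name.
queries = ["a" * 20 + ".com"] * 99 + ["b" * 66 + ".com"]

print(len(flag_long_queries(queries)))                 # stdev*3 only
print(len(flag_long_queries(queries, add_mean=True)))  # mean + stdev*3
```

With this made-up data, the stdev*3 test flags all 100 names, because when lengths cluster tightly the threshold ends up shorter than even an ordinary name. Adding the average first flags only the 70-character name. If your results from the Splunk query look like "everything", this is one likely reason.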

Much like the query before it, while it has weaknesses due to the make-up of internet traffic today looking like a complete and utter mess by default (old man shaking fist at cloud), you can limit the scope of the mess you are looking at, by reducing the timeframe in which you are searching for data and/or adding in a filter clause like so:

Tuning for Better Results

sourcetype="stream:dns" AND NOT query IN ([list of comma separated domains, TLDs, or substrings that you do not want to see results for]) | eval qlen=len(query) | eventstats avg(qlen) as avg stdev(qlen) as stdev | where qlen>(stdev*3) | stats count by qlen stdev avg query

This allows you to filter out particular domains, subdomains, and/or TLDs from the query. Additionally, just like the previous query, we can invert the logic by removing the “NOT” operator and choose to focus only on domains containing a given string and/or from a given TLD.

That brings us to a final query that I’ve modified.

Destination DNS Servers in the “Outside” Firewall Zone

sourcetype="[firewall sourcetype goes here]" dest_port=53 action=allowed dest_zone=outside | stats count by dest_ip | sort -count | search count>100

The purpose of this query is to gather the IP addresses to which your firewalls have allowed traffic on port 53, TCP or UDP (remember that DNS uses both). We count how many times the firewall has allowed communication to each IP address, sort by that count, and show all results with a count greater than 100.

Due to the nature of DNS tunneling, you’ll easily see hundreds or thousands of DNS requests for a DNS tunneling endpoint.

Tuning for Better Results

Of course, you’ll probably also see hundreds or thousands of hits for DNS requests to your company’s primary DNS resolvers. And if your firewall zones are… unique, in that the “outside” firewall zone (or whatever the zone for your network’s perimeter is named) includes DNS traffic to internal IP addresses, those will show up in the results as well.

This can be resolved by modifying the query ever so slightly:

sourcetype="[firewall sourcetype goes here]" dest_port=53 action=allowed dest_zone=outside AND NOT dest_ip IN ([list of comma-separated IP addresses you don't wanna see here]) | stats count by dest_ip | sort -count | search count>100

You may also want to change the portion “count>100” to a larger number, depending on how large your network is, to filter out some of the results. This list of IP addresses could be exported and fed to DNS enumeration tools and/or threat intelligence sources to determine whether or not the communication was potentially malicious.
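If you end up exporting the firewall events and working on them outside of Splunk, the same count, exclude, and threshold logic is easy to sketch in Python. The field names mirror the ones in the query above, but the function name, the event dictionaries, and the IP addresses (taken from the reserved documentation ranges) are hypothetical.

```python
from collections import Counter


def busy_external_resolvers(events, known_resolvers, min_count=100):
    """Count allowed port-53 events per destination IP, skipping known resolvers."""
    counts = Counter(
        e["dest_ip"]
        for e in events
        if e["dest_port"] == 53
        and e["action"] == "allowed"
        and e["dest_ip"] not in known_resolvers
    )
    # most_common() returns (ip, count) pairs sorted by count, highest first.
    return [(ip, n) for ip, n in counts.most_common() if n > min_count]


def make_events(dest_ip, n):
    """Build n identical hypothetical firewall events for one destination."""
    return [{"dest_ip": dest_ip, "dest_port": 53, "action": "allowed"}] * n


events = (
    make_events("192.0.2.53", 500)      # the company's own resolver
    + make_events("203.0.113.10", 150)  # unknown destination, lots of traffic
    + make_events("198.51.100.7", 20)   # unknown destination, little traffic
)
print(busy_external_resolvers(events, known_resolvers={"192.0.2.53"}))
```

Here the company resolver is excluded up front, the low-volume destination falls under the threshold, and only the one chatty unknown destination is left to investigate.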

Until Next Time

This is all I have for now. I’d like to offer my thanks to SANS and Steve Jaworski for writing, producing, and hosting the original work my research in this post was based off of, and making that report freely available for security analysts everywhere.


