Detecting DNS Tunneling with Zeek, Passive DNS, and Python
DNS is the perfect covert channel for attackers because it is almost always allowed outbound. Tunneling tools encode data in subdomain labels and trick resolvers into carrying payloads through normal lookups. If you can observe enough DNS metadata and compute a few simple statistics, you can spot most tunnel traffic without deep packet inspection.
This post walks through a practical detection workflow using Zeek logs and a small Python analysis script. The goal is to produce signals you can use in a lab or a small enterprise network: high entropy subdomains, high NXDOMAIN ratios, long label lengths, and unusually high unique query counts per client.
Capture DNS data with Zeek
Zeek provides a rich dns.log that includes query names, response codes, TTLs, and the transport protocol. Enable DNS logging in local.zeek if you have not already, and use JSON logs for easier parsing.
@load base/protocols/dns
redef LogAscii::use_json = T;
If your sensor sees too much traffic, apply a BPF filter to keep only DNS. This lowers disk use and makes analysis faster.
sudo zeek -i eth1 "port 53" local.zeek
Zeek will write JSON lines to dns.log. Each entry includes query, rcode, qtype_name, id.orig_h, and id.resp_h at a minimum, which is enough to build a detector.
What tunneling looks like in metadata
Classic DNS tunnels have several measurable properties:
- Long labels (often 40+ characters) and long total query names.
- High entropy in labels due to base32 or base64 encoding.
- High ratio of NXDOMAIN responses when the tunnel server only responds to valid chunks.
- Very high unique subdomain counts for a single base domain.
- Unusual query types such as TXT or NULL when the tunnel is carrying data.
No single signal is conclusive. The strength comes from combining them with a threshold and a baseline from your normal DNS activity.
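As a sketch, the combination can be a simple additive score. The function and thresholds below are illustrative starting points taken from the signals listed above, not tuned values:

```python
def tunnel_score(avg_entropy, nxd_ratio, unique_subdomains, avg_label_len):
    """Combine weak DNS-tunneling signals into a 0-4 score.
    Thresholds are illustrative starting points, not tuned values."""
    score = 0
    if avg_entropy > 4.0:        # base32/base64 labels approach ~5 bits/char
        score += 1
    if nxd_ratio > 0.3:          # tunnel servers often NXDOMAIN invalid chunks
        score += 1
    if unique_subdomains > 50:   # heavy subdomain churn under one base domain
        score += 1
    if avg_label_len > 40:       # long encoded labels
        score += 1
    return score
```

A host that trips three of the four signals deserves a manual look even if no single value is extreme.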
Python analysis: entropy and uniqueness
The script below ingests Zeek JSON, extracts the leftmost label, and computes Shannon entropy. It also tracks unique subdomains per base domain for each client. This is not a full detector, but it will surface the outliers you should investigate.
import json
import math
from collections import defaultdict

def shannon_entropy(s):
    """Shannon entropy of a string, in bits per character."""
    if not s:
        return 0.0
    freq = {c: s.count(c) for c in set(s)}
    return -sum((n / len(s)) * math.log2(n / len(s)) for n in freq.values())

# Per (client, base domain): query count, unique leftmost labels,
# NXDOMAIN count, and a running entropy sum.
stats = defaultdict(lambda: {"count": 0, "unique": set(), "nxd": 0, "entropy_sum": 0.0})

with open("/opt/zeek/logs/current/dns.log") as f:
    for line in f:
        try:
            rec = json.loads(line)
        except json.JSONDecodeError:
            continue
        query = rec.get("query")
        if not query:
            continue
        labels = query.split(".")
        if len(labels) < 2:
            continue
        # Naive base domain: last two labels. Good enough for a lab;
        # use a public-suffix list in production.
        base = ".".join(labels[-2:])
        left = labels[0]
        client = rec.get("id.orig_h")
        key = (client, base)
        stats[key]["count"] += 1
        stats[key]["unique"].add(left)
        # Zeek's JSON dns.log carries the textual response code in rcode_name.
        if rec.get("rcode_name") == "NXDOMAIN":
            stats[key]["nxd"] += 1
        stats[key]["entropy_sum"] += shannon_entropy(left)

for (client, base), s in stats.items():
    unique = len(s["unique"])
    avg_entropy = s["entropy_sum"] / max(s["count"], 1)
    nxd_ratio = s["nxd"] / max(s["count"], 1)
    if unique > 50 and avg_entropy > 4.0 and nxd_ratio > 0.3:
        print(client, base, s["count"], unique, f"{avg_entropy:.2f}", f"{nxd_ratio:.2f}")
This produces a short list of client and base-domain pairs that have anomalous behavior. In a lab you can easily trigger it with tools like iodine or dnscat2 and observe the values spike.
Use passive DNS for baselining
Entropy is useful, but baselining is better. Track how many unique subdomains you normally see for common providers. CDNs and tracking domains can produce high counts, so build a small allowlist based on passive DNS observations. A week of data is usually enough to avoid most false positives.
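A minimal sketch of the allowlist step, assuming you have accumulated per-day (base domain, unique subdomain count) observations from passive DNS; the `build_allowlist` helper and its threshold are hypothetical:

```python
from collections import Counter

def build_allowlist(observations, threshold=500):
    """observations: iterable of (base_domain, unique_subdomain_count),
    e.g. one entry per domain per day over a week of passive DNS data.
    Domains that are consistently noisy (CDNs, trackers) get allowlisted
    so they stop triggering the unique-subdomain signal."""
    totals = Counter()
    for domain, count in observations:
        totals[domain] += count
    return {d for d, n in totals.items() if n > threshold}
```

Feed the allowlist back into the detector by skipping any base domain it contains before scoring.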
A quick method is to export distinct base domains with their unique subdomain counts and then manually tag the known noisy domains.
jq -r '.query' /opt/zeek/logs/current/dns.log | awk -F. 'NF>=2{print $(NF-1)"."$NF}' | sort | uniq -c | sort -nr | head
Detection enrichment
You can add context to reduce noise:
- Compare client hostnames to see if a single workstation is the source.
- Correlate with conn.log to see if the host also talks to rare IPs.
- Add GeoIP to responses to flag resolver changes.
- Look for TXT payloads larger than 200 bytes.
All of these can be done in Python or a simple OpenSearch pipeline. The goal is to build a chain of evidence instead of relying on a single signal.
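The conn.log correlation above can be sketched as a simple rarity check. Field names follow Zeek's JSON conn.log (id.orig_h, id.resp_h); the rarity cutoff is an assumption to tune for your network size:

```python
import json
from collections import defaultdict

def rare_destinations(conn_log_lines, rarity_cutoff=2):
    """Flag destination IPs contacted by very few distinct clients.
    A DNS-tunneling host often also talks to an IP nobody else touches."""
    clients_per_dest = defaultdict(set)
    for line in conn_log_lines:
        rec = json.loads(line)
        clients_per_dest[rec["id.resp_h"]].add(rec["id.orig_h"])
    return {ip for ip, clients in clients_per_dest.items()
            if len(clients) <= rarity_cutoff}
```

Intersect the flagged IPs with the clients your DNS detector surfaced to build the chain of evidence.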
Evasion patterns and countermeasures
Attackers can make tunnels look more normal by spreading queries across time and domains. Low-and-slow tunneling reduces unique subdomains per minute, which can slip below naive thresholds. Counter this by tracking unique subdomains per hour and looking for entropy outliers even at low volume.
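The hourly tracking can be sketched as a bucketing step, assuming you have already extracted (timestamp, client, base domain, leftmost label) tuples from dns.log (Zeek's ts field is epoch seconds):

```python
from collections import defaultdict

def hourly_unique(records):
    """records: iterable of (ts_epoch, client, base_domain, left_label).
    Bucketing unique labels per hour catches low-and-slow tunnels that
    stay under per-minute thresholds."""
    buckets = defaultdict(set)
    for ts, client, base, label in records:
        hour = int(ts // 3600)
        buckets[(client, base, hour)].add(label)
    return {k: len(v) for k, v in buckets.items()}
```

Pair the hourly counts with the per-label entropy check so a slow tunnel with few, highly random labels still stands out.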
DNS-over-HTTPS (DoH) and DNS-over-TLS (DoT) can hide queries from your sensor if you only monitor port 53. In a lab, block or monitor known DoH resolvers and track HTTPS connections to those endpoints. If you can see SNI and JA3 fingerprints, you can at least detect which hosts are using DoH and flag unusual destinations.
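A sketch of the SNI check against Zeek's ssl.log (which records the TLS server_name in JSON output); the resolver shortlist here is a small illustrative sample you would extend for your own environment:

```python
import json

# Illustrative shortlist of public DoH resolver hostnames; extend as needed.
DOH_HOSTNAMES = {"dns.google", "cloudflare-dns.com", "dns.quad9.net"}

def doh_clients(ssl_log_lines):
    """Scan Zeek ssl.log JSON lines for TLS connections whose SNI matches
    a known DoH resolver, returning the client IPs involved."""
    hits = set()
    for line in ssl_log_lines:
        rec = json.loads(line)
        if rec.get("server_name") in DOH_HOSTNAMES:
            hits.add(rec.get("id.orig_h"))
    return hits
```

This will not see queries inside the encrypted channel, but it tells you which hosts have moved their DNS out of your view.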
Threshold tuning and metrics
Pick thresholds that reflect your lab. If you have 10 hosts, a single machine generating 200 unique subdomains in an hour is suspicious. In a larger network, that number might be normal for CDN-heavy applications. Track baseline distributions and store them so you can adjust thresholds without guesswork.
Focus on a few metrics: average label length, average entropy, NXDOMAIN ratio, and unique subdomains per base domain. These metrics are stable across environments and are easy to compute. If you capture them daily, you can also detect gradual shifts that might indicate a new tunneling channel.
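One way to turn stored baseline distributions into thresholds is a percentile cut; `percentile_threshold` below is a hypothetical helper, not a method from the post's pipeline:

```python
def percentile_threshold(values, pct=0.99):
    """Pick the value at the given percentile of a baseline distribution,
    so thresholds come from observed data instead of guesswork."""
    ordered = sorted(values)
    idx = min(int(len(ordered) * pct), len(ordered) - 1)
    return ordered[idx]
```

Recompute the cut after each baseline refresh; alerting on anything above the 99th percentile of last week's unique-subdomain counts adapts automatically as the network changes.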
Validation in a lab
Generate tunneling traffic in your lab and confirm the signal. For example, use dnscat2 to create random subdomains and then query them from a test host. Make sure your script surfaces that host within a few minutes.
If you do not have a tunnel tool available, you can simulate the pattern with a quick Python loop that sends long random labels to a test domain you control.
import random
import string
import subprocess

def rand_label(n=50):
    """Random lowercase-alphanumeric label of length n, mimicking an
    encoded tunnel chunk."""
    alphabet = string.ascii_lowercase + string.digits
    return "".join(random.choice(alphabet) for _ in range(n))

# Fire 200 high-entropy queries at a test domain you control.
for _ in range(200):
    q = rand_label() + ".lab.test"
    subprocess.run(["dig", "+short", q], stdout=subprocess.DEVNULL)
Operational considerations
DNS tunneling detection is about trend analysis. A single host with 20 random queries can be normal. The same host with 2,000 unique subdomains and high entropy is not. Keep thresholds flexible and revisit them after software updates or new lab services.
Most importantly, keep your detection simple and fast. A small Python script running as a cron job is often more useful than a complex pipeline that no one maintains. Start small, verify in the lab, and scale only when the signal is consistent.