SANS Penetration Testing

EQL Threat Hunting

Authored by Joshua Wright

The Event Query Language (EQL) is a standardized query language (similar to SQL) to evaluate Windows events. Written by Ross Wolf, EQL is an amazing tool to normalize Windows log events for consistent access and query.

In practice, EQL is most effective when working with Windows Event Log and Sysmon logging data as part of your threat hunting tactics. In this article I'll demonstrate some ways to get started with EQL to assess the tactics of an attacker from a compromised system.

Installing EQL

EQL works equally well on Windows, Linux, and macOS, and requires Python. You can install EQL with pip3 install eql, or build it from the GitHub repository.

Alternatively, you can download and run Slingshot Linux, where EQL is already installed and ready to go!

Getting Started

EQL works best with Sysmon logs, converted to JSON format. From a system where you're using a Sysmon configuration to capture detailed system events, download the EQL scrape-events.ps1 PowerShell script. Import it and write the Sysmon data as a JSON file, as shown here:

# Import the functions provided within scrape-events
Import-Module .\scrape-events.ps1

# Get all the Sysmon logs from Windows Event Logs
Get-WinEvent -filterhashtable @{logname="Microsoft-Windows-Sysmon/Operational"} `
-Oldest | Get-EventProps | ConvertTo-Json | Out-File -Encoding ASCII `
-FilePath my-sysmon-data.json

Note that in this example I've broken up this long command into multiple lines with a backtick at the end of each line per the PowerShell convention. If you type this on one long line, omit the backticks.

EQL includes two important utilities: eql and eqllib:

  • eql is a command line tool to interrogate your data
  • eqllib is a command line tool to format your data in a consistent manner

Working from the my-sysmon-data.json file written by PowerShell, convert the Sysmon-structured data to the EQL schema using eqllib:

$ eqllib convert-data my-sysmon-data.json -s "Microsoft Sysmon" querydata.json

The querydata.json file will be your data source for interrogation with EQL.

Get the Demo Files

To follow the examples in this article, download the sample data files, unzip, and change to the eql-data-samples directory.

slingshot@slingshot:~$ wget
slingshot@slingshot:~$ unzip -q
slingshot@slingshot:~$ cd eql-data-samples
slingshot $

Threat Hunting: regsvr32.exe

To use EQL to search through the eqllib-normalized JSON files, you will craft SQL-like queries using this syntax:

eql query -f <filename> "<event type> where <condition>"

This is best shown in examples. Let's start with the file querydata.json, looking for any instances where the DLL registration utility regsvr32 is run:

slingshot $ eql query -f querydata.json "process where process_name = 'regsvr32.exe'"
{"command_line": "\"C:\\Windows\\syswow64\\regsvr32.exe\" /s .\\meterpreter.dll", "event_type": "process", "logon_id": 180388, "parent_process_name": "powershell.exe", "parent_process_path": "C:\\Windows\\System32\\WindowsPowerShell\\v1.0\\powershell.exe", "pid": 5208, "ppid": 1500, "process_name": "regsvr32.exe", "process_path": "C:\\Windows\\SysWOW64\\regsvr32.exe", "subtype": "create", "timestamp": 132039702277510000, "unique_pid": "{AC6A4E42-07B3-5CF4-0000-0010719C1D00}", "unique_ppid": "{AC6A4E42-064B-5CF4-0000-00106FB21900}", "user": "SEC504STUDENT\\Sec504", "user_domain": "SEC504STUDENT", "user_name": "Sec504"}
{"event_type": "process", "pid": 5208, "process_name": "regsvr32.exe", "process_path": "C:\\Windows\\SysWOW64\\regsvr32.exe", "subtype": "terminate", "timestamp": 132039702279730000, "unique_pid": "{AC6A4E42-07B3-5CF4-0000-0010719C1D00}"}

This is ... less than beautiful. If you haven't already, install jq to pretty-print the output data:

slingshot $ sudo apt-get install -y jq
slingshot $ eql query -f querydata.json "process where process_name = 'regsvr32.exe'" | jq
{
  "command_line": "\"C:\\Windows\\syswow64\\regsvr32.exe\" /s .\\meterpreter.dll",
  "event_type": "process",
  "logon_id": 180388,
  "parent_process_name": "powershell.exe",
  "parent_process_path": "C:\\Windows\\System32\\WindowsPowerShell\\v1.0\\powershell.exe",
  "pid": 5208,
  "ppid": 1500,
  "process_name": "regsvr32.exe",
  "process_path": "C:\\Windows\\SysWOW64\\regsvr32.exe",
  "subtype": "create",
  "timestamp": 132039702277510000,
  "unique_pid": "{AC6A4E42-07B3-5CF4-0000-0010719C1D00}",
  "unique_ppid": "{AC6A4E42-064B-5CF4-0000-00106FB21900}",
  "user": "SEC504STUDENT\\Sec504",
  "user_domain": "SEC504STUDENT",
  "user_name": "Sec504"
}
{
  "event_type": "process",
  "pid": 5208,
  "process_name": "regsvr32.exe",
  "process_path": "C:\\Windows\\SysWOW64\\regsvr32.exe",
  "subtype": "terminate",
  "timestamp": 132039702279730000,
  "unique_pid": "{AC6A4E42-07B3-5CF4-0000-0010719C1D00}"
}

This is much more useful output! Here we see that someone launched regsvr32.exe from a PowerShell session, passing the command line parameters /s .\meterpreter.dll. Probably not good news if this is a box you rely on.

We can break down the arguments in this query as shown:

  • eql query: Run the eql command, execute a query
  • -f querydata.json: Read from the specified file
  • "process where process_name = 'regsvr32.exe'": The EQL query for the process event
  • | jq: Send the JSON results to the jq utility to print output nicely

The query syntax process where process_name = ... is used often with EQL. The initial keyword process indicates that we are querying the process data in the normalized JSON. Other keywords for interrogation include file, network, registry, and image_load.
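To make the shape of that query concrete, here is a minimal Python sketch of what process where process_name = 'regsvr32.exe' asks for: keep only events of one type, then keep those whose field matches. This is a mental model, not how EQL is implemented. The query helper is a hypothetical name, the case-insensitive comparison is an assumption, and the events are abridged from the output above plus one network event added for contrast.

```python
import json

# Two process events abridged from the query output above, plus one
# network event added for contrast (illustrative only).
RAW_EVENTS = """\
{"event_type": "process", "pid": 5208, "process_name": "regsvr32.exe", "subtype": "create"}
{"event_type": "process", "pid": 5208, "process_name": "regsvr32.exe", "subtype": "terminate"}
{"event_type": "network", "pid": 4, "process_name": "System", "subtype": "outgoing"}
"""


def query(lines, event_type, field, value):
    """Yield events of one type whose field equals value (case ignored).

    Hypothetical helper: it models '<event type> where <field> = <value>'.
    """
    for line in lines:
        if not line.strip():
            continue  # skip blank lines between JSON records
        event = json.loads(line)
        if event.get("event_type") != event_type:
            continue
        if str(event.get(field, "")).lower() == value.lower():
            yield event


matches = list(query(RAW_EVENTS.splitlines(), "process", "process_name", "regsvr32.exe"))
for event in matches:
    print(event["subtype"], event["pid"])
```

Swapping the first argument for "network" would select the other record type, which is the role the leading keyword plays in an EQL query.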

Threat Hunting: ntdsutil

An attacker with privileged access to a Windows Domain Controller can use ntdsutil to create an accessible backup of the domain password hashes. Not a good time for the security of the Windows Domain. For this example, we can reference the T1003-CredentialDumping-ntdsutil_eql.json file:

slingshot $ eql query -f T1003-CredentialDumping-ntdsutil_eql.json \
'process where process_name == "ntdsutil.exe"
and command_line == "*create*"
and command_line == "*ifm*"' | jq
{
  "command_line": "\"C:\\Windows\\system32\\ntdsutil.exe\" \"ac i ntds\" ifm \"create full c:\\hive\" q q",
  "event_type": "process",
  "logon_id": 301152,
  "parent_process_name": "powershell.exe",
  "parent_process_path": "C:\\Windows\\System32\\WindowsPowerShell\\v1.0\\powershell.exe",
  "pid": 5680,
  "ppid": 628,
  "process_name": "ntdsutil.exe",
  "process_path": "C:\\Windows\\System32\\ntdsutil.exe",
  "subtype": "create",
  "timestamp": 132046718142390000,
  "unique_pid": "{8a215c30-bc46-5cfe-0000-0010ae451200}",
  "unique_ppid": "{8a215c30-b80d-5cfe-0000-0010e96a0d00}",
  "user": "Wardrobe99\\Administrator",
  "user_domain": "Wardrobe99",
  "user_name": "Administrator"
}

Note that in this example I've broken up this long command into multiple lines. The backslash at the end of the first line continues the shell command; inside the single-quoted query the line breaks need no backslash, because the shell would pass a backslash there through to EQL literally. If you type this on one long line, omit the backslash.

Here we see another example of using the process keyword to search for instances of the ntdsutil process. This by itself is probably enough to warrant further investigation, but we can confirm it further by also checking the command line for create and ifm using wildcard (*) matching.

Ntdsutil can also be invoked without arguments and run interactively, eliminating the command line detail. Don't rely on the presence of the additional command line arguments to indicate suspicious ntdsutil use.
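The wildcard comparisons in the ntdsutil query can be modeled in a few lines of Python. This is only a sketch: wildcard_equals is a hypothetical helper, the case-insensitive behavior is an assumption, and Python's fnmatch also gives ? and [ ] special meaning, which the query in this article does not rely on. It also shows why an interactive ntdsutil session slips past the command line checks.

```python
from fnmatch import fnmatchcase


def wildcard_equals(value, pattern):
    """Case-insensitive match where * stands for any run of characters.

    Hypothetical helper that approximates: field == "*text*"
    """
    return fnmatchcase(value.lower(), pattern.lower())


# Command line from the query output above, with the JSON escaping removed.
command_line = r'"C:\Windows\system32\ntdsutil.exe" "ac i ntds" ifm "create full c:\hive" q q'

# Both wildcard conditions from the query must hold.
is_ifm_backup = (
    wildcard_equals(command_line, "*create*")
    and wildcard_equals(command_line, "*ifm*")
)
print(is_ifm_backup)

# An interactive session has no arguments, so the same checks find nothing.
interactive = r"C:\Windows\system32\ntdsutil.exe"
interactive_flagged = wildcard_equals(interactive, "*ifm*")
print(interactive_flagged)
```

The first check succeeds and the second does not, which is why the process name on its own is the safer indicator.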

Anomalous Command Lines

Instead of looking for specific processes, we can also use EQL to look for anomalous behavior. For example, we can search for very long command lines, often used to pass encoded PowerShell scripts to bypass Set-ExecutionPolicy restrictions:

slingshot $ eql query -f normalized-rta.json \
'process where length(command_line) > 200 and not process_name in ("chrome.exe", "ngen.exe")' \
| jq "{process_name,parent_process_name}"
{
  "process_name": "cvtres.exe",
  "parent_process_name": "csc.exe"
}
{
  "process_name": "cvtres.exe",
  "parent_process_name": "csc.exe"
}
{
  "process_name": "powershell.exe",
  "parent_process_name": "python.exe"
}

Notice here how we can call EQL's length function to calculate the length of the command line. I've also added a second and not process_name in clause to eliminate common return values where long command lines are normal and not necessarily indicative of an attack. Finally, I added some jq syntax to display only the process_name and parent_process_name values for each response event.

The return values here aren't that exciting, though we see three events in the log that have a command line longer than 200 characters. Let's modify the jq syntax to get the detail from the command_line member:

slingshot $ eql query -f normalized-rta.json 'process where length(command_line) > 200 and not process_name in ("chrome.exe", "ngen.exe")' | jq "{process_name,parent_process_name,command_line}"
{
  "process_name": "cvtres.exe",
  "parent_process_name": "csc.exe",
  "command_line": "C:\\Windows\\Microsoft.NET\\Framework\\v4.0.30319\\cvtres.exe /NOLOGO /READONLY /MACHINE:IX86 \"/OUT:C:\\Users\\alice\\AppData\\Local\\Temp\\RES5673.tmp\" \"c:\\Users\\alice\\AppData\\Local\\Temp\\eexr0kqp\\CSCE6E4328451414E5C89B772D1F2FFE5F8.TMP\""
}
{
  "process_name": "cvtres.exe",
  "parent_process_name": "csc.exe",
  "command_line": "C:\\Windows\\Microsoft.NET\\Framework\\v4.0.30319\\cvtres.exe /NOLOGO /READONLY /MACHINE:IX86 \"/OUT:C:\\Users\\alice\\AppData\\Local\\Temp\\RES575D.tmp\" \"c:\\Users\\alice\\AppData\\Local\\Temp\\5gcbnfh4\\CSC48F3F5A831E04AC989C6D1D0A2C1DE4D.TMP\""
}
{
  "process_name": "powershell.exe",
  "parent_process_name": "python.exe",
}

EQL reveals a suspicious PowerShell command that we would want to investigate further!
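The filter used in this section combines a length test with an exclusion list. A rough Python equivalent looks like the sketch below. The events here are invented for illustration (they are not from normalized-rta.json), is_anomalous is a hypothetical name, and the lowercase comparison is an assumption about how process names are matched.

```python
# Process names whose long command lines are considered normal,
# taken from the query in this section.
EXCLUDED = {"chrome.exe", "ngen.exe"}
MAX_LEN = 200

# Hypothetical events: names and command lines are invented for illustration.
events = [
    {"process_name": "chrome.exe", "command_line": "chrome.exe " + "--flag " * 40},
    {"process_name": "powershell.exe", "command_line": "powershell.exe -enc " + "A" * 300},
    {"process_name": "cmd.exe", "command_line": "cmd.exe /c dir"},
]


def is_anomalous(event):
    """Model: length(command_line) > 200 and not process_name in (...)."""
    too_long = len(event.get("command_line", "")) > MAX_LEN
    excluded = event.get("process_name", "").lower() in EXCLUDED
    return too_long and not excluded


hits = [e["process_name"] for e in events if is_anomalous(e)]
print(hits)
```

Only the long PowerShell command line survives: the Chrome event is long but excluded, and the cmd.exe event is short. Tuning the exclusion list to your own environment is what keeps a query like this from drowning you in normal activity.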

Data Exploration

Apart from process information, the EQL-normalized Sysmon logs can reveal additional attributes about the system for interrogation. You can get a summary of the available data by asking:

slingshot $ eql query -f normalized-rta.json "any where true | count event_type"
{"count": 87, "key": "network", "percent": 0.0027321546336714505}
{"count": 240, "key": "file", "percent": 0.007536978299783312}
{"count": 574, "key": "process", "percent": 0.018025939766981754}
{"count": 9811, "key": "image_load", "percent": 0.30810539207989196}
{"count": 21131, "key": "registry", "percent": 0.6635995352196715}

Here I used EQL to retrieve events of any type; the where true is necessary because every EQL query needs a condition, and true matches every record. The event_type member gives the type of each event, and the count pipe tallies how many events share each value.
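If it helps to see what that tally is doing, here is a small Python sketch that rebuilds the same summary with collections.Counter. The per-type totals are copied from the output above; expanding them back into one value per event is just a device to have something to count, and the ascending sort mirrors the order shown in the output rather than any documented guarantee.

```python
from collections import Counter

# Per-type totals copied from the query output above, expanded back into
# one event_type value per event so the counting step can be shown.
totals = {"network": 87, "file": 240, "process": 574, "image_load": 9811, "registry": 21131}
event_types = [name for name, n in totals.items() for _ in range(n)]

counts = Counter(event_types)
total = sum(counts.values())

# One row per distinct value, sorted ascending by count as in the output above.
summary = [
    {"count": n, "key": key, "percent": n / total}
    for key, n in sorted(counts.items(), key=lambda item: item[1])
]
for row in summary:
    print(row)
```

Note that percent is a fraction of all events, so registry activity at roughly 0.66 means about two thirds of the log, not 0.66 percent.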

The object members in each record will be different for process vs. network (you wouldn't expect destination_port to be a member in the process event, for example). You can read about the data structures in the EQL documentation, or you can ask EQL to tell you what it knows:

slingshot $ eql query -f normalized-rta.json "network where true | head 1" | jq
{
  "destination_address": "",
  "destination_port": "445",
  "event_type": "network",
  "pid": 4,
  "process_name": "System",
  "process_path": "System",
  "protocol": "tcp",
  "source_address": "",
  "source_port": "50456",
  "subtype": "outgoing",
  "timestamp": 131883575711730000,
  "unique_pid": "{9C977984-B294-5C05-0000-0010EB030000}",
  "user_domain": "NT AUTHORITY",
  "user_name": "SYSTEM"
}

EQL supports the head function (note that this is part of the EQL query, not a command-line argument) to limit the number of events returned by a query. In the output we see that the network event type reveals information about the system including source and destination addresses and ports, protocol information, the process ID, domain, and user information.

We can put this information to good use, summarizing the destination port information that the system is using:

slingshot $ eql query -f normalized-rta.json "network where subtype = 'outgoing' | \
count destination_port | sort count"

{"count": 1, "key": "137", "percent": 0.023255813953488372}
{"count": 1, "key": "138", "percent": 0.023255813953488372}
{"count": 1, "key": "53155", "percent": 0.023255813953488372}
{"count": 1, "key": "53159", "percent": 0.023255813953488372}
{"count": 1, "key": "53355", "percent": 0.023255813953488372}
{"count": 1, "key": "80", "percent": 0.023255813953488372}
{"count": 2, "key": "139", "percent": 0.046511627906976744}
{"count": 4, "key": "445", "percent": 0.09302325581395349}
{"count": 4, "key": "49667", "percent": 0.09302325581395349}
{"count": 5, "key": "49669", "percent": 0.11627906976744186}
{"count": 9, "key": "135", "percent": 0.20930232558139536}
{"count": 13, "key": "8000", "percent": 0.3023255813953488}

This output reveals that nearly a third of the activity is destined to TCP/8000. We can investigate this further, identifying any port 8000 activity where the process name is an executable file:

slingshot $ eql query -f normalized-rta.json "network where process_name = '*.exe' \
and destination_port = '8000'" \
| jq "{process_path,user,timestamp,destination_port}"
{
  "process_path": "C:\\Windows\\System32\\mshta.exe",
  "user": "RTA-DESKTOP\\alice",
  "timestamp": 131883576881820000,
  "destination_port": "8000"
}
{
  "process_path": "C:\\Windows\\System32\\msiexec.exe",
  "timestamp": 131883577020100000,
  "destination_port": "8000"
}
{
  "process_path": "C:\\Windows\\System32\\msiexec.exe",
  "timestamp": 131883577024280000,
  "destination_port": "8000"
}
{
  "process_path": "C:\\Windows\\System32\\rundll32.exe",
  "user": "RTA-DESKTOP\\alice",
  "timestamp": 131883577304160000,
  "destination_port": "8000"
}

Some suspicious activity is going on here. First, the Microsoft HTA execution utility (mshta.exe) is launched by Alice, then msiexec is used twice as system in quick succession, followed less than half a minute later by Alice running rundll32.exe. Definitely worth investigation.
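The timestamp values in these events are easier to reason about once converted to wall-clock time. The sketch below assumes they are Windows FILETIME values (100-nanosecond ticks since January 1, 1601 UTC), which is consistent with the sample data but is my assumption here; filetime_to_datetime is a hypothetical helper name. It converts the four port 8000 events and prints the gap between each one and the previous.

```python
from datetime import datetime, timedelta, timezone

# Assumption: timestamps are Windows FILETIME values, i.e. counts of
# 100-nanosecond ticks since 1601-01-01 00:00:00 UTC.
FILETIME_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)


def filetime_to_datetime(filetime):
    """Convert FILETIME ticks to a timezone-aware UTC datetime."""
    return FILETIME_EPOCH + timedelta(microseconds=filetime // 10)


# (process, timestamp) pairs copied from the port 8000 output above.
events = [
    ("mshta.exe", 131883576881820000),
    ("msiexec.exe", 131883577020100000),
    ("msiexec.exe", 131883577024280000),
    ("rundll32.exe", 131883577304160000),
]

previous = None
gaps = []
for name, stamp in events:
    when = filetime_to_datetime(stamp)
    # Seconds elapsed since the previous event (0.0 for the first one).
    gap = (when - previous).total_seconds() if previous is not None else 0.0
    gaps.append(gap)
    print(f"{when.isoformat()}  {name:<13} +{gap:.3f}s")
    previous = when
```

Under that assumption, the first msiexec connection follows mshta by about 13.8 seconds, the second follows 0.4 seconds later, and rundll32 arrives about 28 seconds after that, which is the kind of tight sequencing that suggests scripted activity rather than a person clicking around.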


EQL is a powerful tool, with a lot of significant benefits for defenders to leverage when threat hunting. While the search syntax can be a little confusing at first, with a little practice it becomes second nature, making it possible to explore logging data in a consistent, simple format with minimal fuss.

Got a fantastic query you use for threat hunting with EQL? Please let me know! Until then, use the sample data files, and explore the secrets hidden in Sysmon logs with EQL.


Posted December 11, 2019 at 3:43 PM

Ross Wolf

Thanks for the post!
Just wanted to add a note that there's also an interactive shell that can be used to hopefully make the learning experience easier. Personally, I can find JQ to be daunting and regularly struggle with it, so I put together an interactive tool that has all of the things that I thought were lacking: a REPL-like loop, syntax highlighting, custom tables, tab complete, etc. It can also recognize your schema.
There's a guide and gif posted with EQL's documentation:
