A colleague asked me which threat intelligence tools I use for data manipulation on a daily basis. He had stumbled upon a massive CSV file that had been submitted to a popular sharing site, and he had received a tip that there was some interesting data in it for a client he works with. The trouble was that the file was several gigabytes in size, and every attempt to open it in Microsoft Excel had failed. I recommended that he use tools like ‘cat’ and ‘grep’ to look at the file from the command line. This got me thinking. On a day-to-day basis, I often find myself reaching for command-line tools, even more so than traditional applications. It also seemed like these command-line tools were sometimes forgotten by intel analysts. As a result, I have put together some of the handiest command-line tools every threat intelligence analyst should be familiar with.
The grep Family
One of the best threat intelligence tools that should be in every analyst’s repertoire is grep. The name comes from the ed editor command g/re/p, short for “globally search for a regular expression and print.” Grep, along with its variants, which we will get to, is a tool that puts searching front and center.
What can it search for?
Almost anything you are looking for in a file! This could include a word, a line, several lines or more, and if you pipe a directory listing into it, file names as well. But its versatility begins to shine when you consider the kinds of patterns it can use. It is capable of searching not only for literal strings but also for regular expressions (regex). This capability is especially handy if you are looking for information that follows a recognizable format, like credentials, domains, or IP ranges.
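To make that concrete, here is a minimal sketch of a literal search and a regex search. The file name and log lines are made up for illustration, and the IPv4 pattern is deliberately loose (it does not check that each octet is 255 or less). The -o flag is not part of POSIX, but both GNU and BSD grep support it.

```shell
# Hypothetical sample data standing in for a much larger log or dump file.
workdir=$(mktemp -d)
cat > "$workdir/sample.log" <<'EOF'
2021-03-01 login ok user=alice src=10.0.0.5
2021-03-01 login fail user=bob src=192.168.1.20
2021-03-02 dns query evil.example.com
2021-03-02 login ok user=carol src=10.0.0.7
EOF

# Literal string search: print every line that contains "login fail".
grep 'login fail' "$workdir/sample.log"

# Regex search: -E enables extended regular expressions and -o prints
# only the part of each line that matched (a loose IPv4 pattern here).
ips=$(grep -Eo '([0-9]{1,3}\.){3}[0-9]{1,3}' "$workdir/sample.log")
printf '%s\n' "$ips"

# -c counts matching lines instead of printing them.
ten_net=$(grep -c 'src=10\.' "$workdir/sample.log")
echo "lines from 10.x sources: $ten_net"

rm -rf "$workdir"
```

On a multi-gigabyte file the commands are the same; only the file name changes.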
The Rest of the grep Family
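The fun, however, doesn’t stop with grep itself. The grep command is the best known, but there are a variety of specialized grep commands. Note that recent GNU grep releases mark egrep and fgrep as obsolescent in favor of grep -E and grep -F, so the flag forms are the safer choice in scripts.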
- egrep – egrep is equivalent to grep -E and stands for “extended grep.” It treats the search pattern as an extended regular expression, which means it interprets characters like “+” and “?” as operators, whereas a normal grep command treats them as literal characters.
- fgrep – this is another “shortcut” version of grep (in this case grep -F), where the F stands for “fixed strings.” fgrep does away with the regular expression functions of grep and searches only for literal strings. While this might sound counterintuitive, the payoff is that it can be considerably faster. This may not seem like a big deal when searching files with only a few thousand lines, but when dealing with millions or billions of lines, it definitely shows its worth.
- rgrep – rgrep is a tool that I find myself reaching for all the time. Like grep and egrep, it can search files for strings and regular expressions. On systems that ship it, it is equivalent to grep -r: it recursively searches through directories and all the files in them. If you have ever had to search through git repos for specific information, rgrep is a time saver!
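The differences between the variants are easiest to see side by side. The sketch below uses the flag forms (grep -E, grep -F, grep -r) rather than the egrep, fgrep, and rgrep names, since the flags are available wherever GNU or BSD grep is installed. The directory layout and file contents are hypothetical.

```shell
# Hypothetical files created only for this demonstration.
workdir=$(mktemp -d)
mkdir -p "$workdir/repo/conf"
printf 'password=hunter2\napi_key=abc123\n' > "$workdir/repo/conf/app.ini"
printf 'a+b\naab\nab\n' > "$workdir/patterns.txt"

# Extended regex (egrep): "+" means "one or more of the preceding item",
# so 'a+b' matches "aab" and "ab" but not the literal text "a+b".
ere_hits=$(grep -E -c 'a+b' "$workdir/patterns.txt")

# Fixed strings (fgrep): the pattern is taken literally,
# so only the line containing the characters "a+b" matches.
fixed_hits=$(grep -F -c 'a+b' "$workdir/patterns.txt")

# Recursive search (rgrep): -r walks the directory tree and
# -l lists only the names of files that contain a match.
rec_hits=$(grep -r -l 'api_key' "$workdir/repo")

echo "extended: $ere_hits  fixed: $fixed_hits"
echo "files containing api_key: $rec_hits"

rm -rf "$workdir"
```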
Awk is Your Friend
Another of the best threat intelligence tools available is a little tool called awk. Awk is very valuable if you find yourself manipulating separated values. Whether it is comma, semicolon, space or any other type of delimiter, awk is your friend. Awk looks at a file line-by-line and can then parse each line into individual fields. This allows you to choose which columns you want to print. It also allows you to rearrange the order of them, or output them in a completely different format.
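As a rough sketch of what that looks like in practice, the commands below pick, reorder, and filter columns from a small CSV. The file name, column names, and indicators are invented for the example. Splitting on a bare comma like this assumes no field contains a quoted, embedded comma.

```shell
# Hypothetical indicator list; the columns are an assumption for this example.
workdir=$(mktemp -d)
cat > "$workdir/iocs.csv" <<'EOF'
indicator,type,first_seen
evil.example.com,domain,2021-03-01
203.0.113.9,ip,2021-03-02
bad.example.net,domain,2021-03-05
EOF

# -F sets the field separator. NR > 1 skips the header row.
# Print column 2, then column 1, separated by a tab.
reordered=$(awk -F',' 'NR > 1 { print $2 "\t" $1 }' "$workdir/iocs.csv")
printf '%s\n' "$reordered"

# Print the first column only for rows whose second column is "domain".
domains=$(awk -F',' '$2 == "domain" { print $1 }' "$workdir/iocs.csv")
printf '%s\n' "$domains"

rm -rf "$workdir"
```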
However, awk is more than just a command-line version of Excel. Its real power is that it can split text into fields that were never laid out as columns in the first place. For instance, you could use awk to pull only the username or domain from an email address. To do this, specify the “@” symbol as the delimiter. Another use is pulling out domains, subdomains, and top-level domains from URLs in different log files.
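The email case can be sketched in a couple of lines. The addresses are placeholders, and the per-domain tally at the end is an extra illustration rather than something you have to do.

```shell
# Hypothetical list of email addresses, one per line.
workdir=$(mktemp -d)
cat > "$workdir/emails.txt" <<'EOF'
alice@example.com
bob@mail.example.org
carol@example.com
EOF

# Use "@" as the delimiter: field 1 is the username, field 2 is the domain.
users=$(awk -F'@' '{ print $1 }' "$workdir/emails.txt")
printf '%s\n' "$users"

# Tally how many addresses belong to each domain, most common first.
domain_counts=$(awk -F'@' '{ count[$2]++ } END { for (d in count) print count[d], d }' \
  "$workdir/emails.txt" | sort -rn)
printf '%s\n' "$domain_counts"

rm -rf "$workdir"
```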
If you need to manipulate big lists of data, you need to start using awk!
You sed it!
Another of the best threat intelligence tools available for the command line is a little powerhouse called sed. The Stream Editor (or sed) is a tool that allows you to parse and manipulate text. In layman’s terms, this thing is a souped-up version of find-and-replace. As you may have guessed, however, it is a lot more flexible!
Like most of the tools we have looked at, sed allows you to use regular expressions as well as literal strings. This means that if you need to manipulate data that varies from line to line, sed will be able to handle the job. Sed also allows you to not just find data but also remove it. This capability is a huge help when an intel analyst has to wade through various log files.
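Here is a small sketch of all three ideas: deleting lines, replacing a literal string, and replacing a regex match. The log format is made up, and the -E flag (extended regex) assumes GNU or BSD sed. By default sed writes to standard output and leaves the original file untouched.

```shell
# Hypothetical log file with noise lines mixed in.
workdir=$(mktemp -d)
cat > "$workdir/access.log" <<'EOF'
DEBUG heartbeat ok
GET http://evil.example.com/payload from 10.0.0.5
DEBUG heartbeat ok
GET http://bad.example.net/dropper from 10.0.0.7
EOF

# Remove data: delete every line that starts with "DEBUG".
cleaned=$(sed '/^DEBUG/d' "$workdir/access.log")
printf '%s\n' "$cleaned"

# Chain several edits with -e:
#   1. drop the DEBUG lines,
#   2. literal replace: "http:" becomes "hxxp:" so the URL is not clickable,
#   3. regex replace: anything shaped like an IPv4 address becomes "<ip>".
redacted=$(sed -E \
  -e '/^DEBUG/d' \
  -e 's/http:/hxxp:/' \
  -e 's/([0-9]{1,3}\.){3}[0-9]{1,3}/<ip>/g' \
  "$workdir/access.log")
printf '%s\n' "$redacted"

rm -rf "$workdir"
```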
Now, it should be noted that sed does have a bit of a steep learning curve. Its syntax can be a bit intimidating, and I will admit I have a cheat sheet I use for some of the more involved commands. However, the value sed provides to threat intelligence analysts is undeniable!
No Strings Attached
What list about the best threat intelligence tools would be complete without strings?
Most folks who have delved into the command line once or twice will know that you can look at almost any file, and most will be familiar with using a tool like cat to print a file to the screen. Most will also know that binary files (and even document files) will often return something that looks like it is from The Matrix. This is because many bytes within the file are non-printable characters. This is where a tool like strings shines!
Strings will take any file and print only the human-readable strings to the screen. This will allow you to see very useful information about any file, and especially things like binaries. Some examples include comments or the text of pop-up messages contained in the code. Another use is to look for references to DLL files the binary uses, which may reveal some interesting functions. This can also be useful when looking at ransomware, as the ransom note is often in the actual binary.
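A quick way to try this without touching real malware is to build a tiny fake “binary” by hand. Everything in the sketch below is invented for the demonstration, and it assumes the strings utility (part of binutils on most Linux systems) is installed. By default strings only reports runs of at least four printable characters.

```shell
# Build a hypothetical binary blob: readable text separated by
# non-printable bytes (written here as octal escapes).
workdir=$(mktemp -d)
printf 'MZ\001\002\003kernel32.dll\000\004\005Your files have been encrypted\000\377' \
  > "$workdir/sample.bin"

# Print every run of printable characters (minimum length 4 by default),
# so the two-character "MZ" marker is skipped.
found=$(strings "$workdir/sample.bin")
printf '%s\n' "$found"

# Combine with grep to pull out just the DLL references.
dlls=$(strings "$workdir/sample.bin" | grep -i '\.dll')
printf '%s\n' "$dlls"

rm -rf "$workdir"
```

The same pipeline works on a real sample; just be sure you are handling it in an isolated analysis environment.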
Strings remains one of the most useful tools for security researchers and intelligence analysts.
The security space has some amazing graphical tools. We even covered some in a previous blog, here. However, threat intelligence analysts shouldn’t only focus on the GUI. Indeed, many command line tools are even more powerful and more flexible than their GUI cousins!