Earlier this week, I posted some tips for doing Quick and Dirty PA Firewall Log Analysis.
After I posted this, I got a very common question, and I wanted to answer it here.
The question is something along the lines of “When I use the techniques from your post, the outputs of the commands are column separated data. I need them to be CSV to use with my (tool/SEIM/Aunt Gracie/whatever). How can I convert them?” Sound familiar?
OK, so how do we accomplish this feat of at the command line without all of the workarounds that people posted, and without EVER loading Excel? Thankfully we can use awk again for this.
We can use:
- awk ‘BEGIN { OFS = “,”} ; {print $1,$2,$3}’
- Basically, take an input of column data, and print out the columns we want (can be any, in this case I want the first 3 columns), and make the outputs comma delimited.
- We can just append this to our other command stacks with another pipe (|) to get our output CSV
- Example: cat log.csv | awk ‘BEGIN { FS = “,”} ; {print $8,$9}’ | sort -n | uniq -c | sort -n -r | awk ‘BEGIN { OFS = “,”} ; {print $1,$2,$3}’
- In this example, the source IP and destination IP will be analyzed, and the reduced to unique pairs, along with the number of times that that pair is duplicated in the input log (I use this as a “hit rate” as I described earlier
- A common question, why do I ask for two columns in the first awk and then ask for three columns in the second awk?
- The answer of course, is that the first awk prints the unique pairs, but it also adds a column of the “hit rate”, so to get the output appropriately, I need all three fields.
So, once again, get to know awk. It is your friend.:)
PS – Yes, I know, there are hundreds of other ways to get this same data, in the same format, using other command line text processing tools. Many may even be less redundant than the commands above. BUT, this is how I did it. I think it makes it easy for people to get started and play with the data. Post your ways to Twitter or share with the community. Exploration is awesome, so it will encourage users to play more. Cool! Hit me on Twitter if you wanna share some or talk more about this approach (@lbhuston).
Thanks for reading!