Last Quick and Dirty Log Tip for the Week

OK, so this week I posted two other blog posts about doing quick and dirty log analysis and some of the techniques I use. This one also covers converting column logs to CSV.

After the great response, I wanted to drop one last tip for the week. 

Several folks asked me about re-sorting and processing the column-based data in different ways and to achieve different analytical views. 

Let me re-introduce you to my friend and yours, sort.

In this case, instead of using the sort -n -r like before (numeric sort, reverse order), we can use:

  • sort -k# -n input_file (where # is the number of the column you’d like to sort by and the input file is the name of the file to sort)
    • You can use this inline by leveraging the pipe (|) again – i.e.: cat input.txt | sort -k3 -n (this types the input file and sends it to sort for sorting on the third column in numeric order) (-r would of course, reverse it…)
    • You can write the output of this to a file with redirects “> filename.txt”, i.e.: cat input.txt | sort -k3 -n -r > output.txt
      • You could also use “>>” as the redirect in order to create a file if it doesn’t exist OR append to a file if it does exist… i.e..:  cat input.txt | sort -k3 -n -r >> appended_output.txt

That’s it! It’s been a fun week sharing some simple command line processing tips for log files. Drop me a line on Twitter (@lbhuston) and let me know what you used them for, or which ones are your favorite. As always, thanks and have a great weekend! 

Quick And Dirty Log Analysis Followup

Earlier this week, I posted some tips for doing Quick and Dirty PA Firewall Log Analysis.

After I posted this, I got a very common question, and I wanted to answer it here.

The question is something along the lines of “When I use the techniques from your post, the outputs of the commands are column separated data. I need them to be CSV to use with my (tool/SEIM/Aunt Gracie/whatever). How can I convert them?” Sound familiar?

OK, so how do we accomplish this feat of at the command line without all of the workarounds that people posted, and without EVER loading Excel? Thankfully we can use awk again for this.

We can use:

  • awk ‘BEGIN { OFS = “,”} ; {print $1,$2,$3}’
    • Basically, take an input of column data, and print out the columns we want (can be any, in this case I want the first 3 columns), and make the outputs comma delimited.
    • We can just append this to our other command stacks with another pipe (|) to get our output CSV
  • Example: cat log.csv | awk ‘BEGIN { FS = “,”} ; {print $8,$9}’ | sort -n | uniq -c | sort -n -r | awk ‘BEGIN { OFS = “,”} ; {print $1,$2,$3}’
    • In this example, the source IP and destination IP will be analyzed, and the reduced to unique pairs, along with the number of times that that pair is duplicated in the input log (I use this as a “hit rate” as I described earlier
      • A common question, why do I ask for two columns in the first awk and then ask for three columns in the second awk?
        • The answer of course, is that the first awk prints the unique pairs, but it also adds a column of the “hit rate”, so to get the output appropriately, I need all three fields.

So, once again, get to know awk. It is your friend.:)

PS – Yes, I know, there are hundreds of other ways to get this same data, in the same format, using other command line text processing tools. Many may even be less redundant than the commands above. BUT, this is how I did it. I think it makes it easy for people to get started and play with the data. Post your ways to Twitter or share with the community. Exploration is awesome, so it will encourage users to play more. Cool! Hit me on Twitter if you wanna share some or talk more about this approach (@lbhuston).

Thanks for reading!

Quick & Dirty Palo Alto Log Analysis

OK, so I needed to do some quick and dirty traffic analysis on Palo Alto text logs for a project I was working on. The Palo Alto is great and their console tools are nice. Panorama is not too shabby. But, when I need quick and dirty analysis and want to play with data, I dig into the logs. 
That said, for my quick analysis, I needed to analyze a bunch of text logs and model the traffic flows. To do that, I used simple command line text processing in Unix (Mac OS, but with tweaks also works in Linux, etc.)
I am sharing some of my notes and some of the useful command lines to help others who might be facing a similar need.
First, for my project, I made use of the following field #’s in the text analysis, pulled from the log header for sequence:
  • $8 (source IP) 
  • $9 (dest IP)
  • $26 (dest port)
  • $15 (AppID)
  • $32 (bytes)
Once, I knew the fields that corresponded to values I wanted to study, I started using the core power of command line text processing. And in this case, the power I needed was:
  • cat
  • grep
    • Including, the ever useful grep -v (inverse grep, show me the lines that don’t match my pattern)
  • awk
    • particularly: awk ‘BEGIN { FS = “,”} ; {print $x, $y}’ which prints specific columns in CSV files 
  • sort
    • sort -n (numeric sort)
    • sort -r (reverse sort, descending)
  • uniq
    • uniq -c (count the numbers of duplicates, used for determining “hit rates” or frequency, etc.)
Of course, to learn more about these commands, simply man (command name) and read the details. 😃 
OK, so I will get you started, here are a few of the more useful command lines I used for my quick and dirty analysis:
  • cat log.csv | awk ‘BEGIN { FS = “,”} ; {print $8,$9,$26}’ | sort | uniq -c | sort -n -r > hitrate_by_rate.txt
    • this one produces a list of Source IP/Dest IP/Dest Port unique combinations, sorted in descending order by the number of times they appear in the log
  • cat log.csv | awk ‘BEGIN { FS = “,”} ; {print $8,$9}’ | sort -n | uniq -c | sort -n -r > uniqpairs_by_hitrate.txt
    • this one produces a list of the uniq Source & Destination IP addresses, in descending order by how many times they talk to each other in the log (note that their reversed pairings will be separate, if they are present – that is if A talks to B, there will be an entry for that, but if B initiates conversations with A, that will be a separate line in this data set)
  • cat log.csv | awk ‘BEGIN { FS = “,”} ; {print $15}’ | sort | uniq -c | sort -n -r > appID_by_hitrate.txt
    • this one uses the same exact techniques, but now we are looking at what applications have been identified by the firewall, in descending order by number of times that application identifier appears in the log
Again, these are simple examples, but you can tweak and expand as you need. This trivial approach to command line text analysis certainly helps with logs and traffic data. You can use those same commands to do a wondrous amount of textual analysis and processing. Learn them, live them, love them. 😃 
If you have questions, or want to share some of the ways you use those commands, please drop us a line on Twitter (@microsolved) or hit me up personally for other ideas (@lbhuston). As always, thanks for reading and stay safe out there! 

Daily Log Monitoring and Increased Third Party Security Responsibilities: Here They Come!

For years now we at MSI have extoled the security benefits of daily log monitoring and reciprocal security practices between primary and third party entities present on computer networks. It is constantly being proven true that security incidents could be prevented, or at least quickly detected, if system logs were properly monitored and interpreted. It is also true that many serious information security incidents are the result of cyber criminals compromising third party service provider systems to gain indirect access to private networks. 

I think that most large-network CISOs are well aware of these facts. So why aren’t these common security practices right now? The problem is that implementing effective log monitoring and third party security practices is plagued with difficulties. In fact, implementation has proven to be so difficult that organizations would rather suffer the security consequences than put these security controls in place. After all, it is cheaper and easier – usually – unless you are one of the companies that get pwned! Right now, organizations are gambling that they won’t be among the unfortunate – like Target. A fools’ paradise at best! 

But there are higher concerns in play here than mere money and efficiency. What really is at stake is the privacy and security of all the system users – which one way or another means each and every one of us. None of us likes to know our private financial or medical or personal information has been exposed to public scrutiny or compromise, not to mention identity theft and ruined credit ratings. And what about utilities and manufacturing concerns? Failure to implement the best security measures among power concerns, for example, can easily lead to real disasters and even loss of human life. Which all means that it behooves us to implement controls like effective monitoring and vendor security management. There is no doubt about it. Sooner or later we are going to have to bite the bullet. 

Unfortunately, private concerns are not going to change without prodding. That is where private and governmental regulatory bodies are going to come into play. They are going to have to force us to implement better information security. And it looks like one of the first steps in this process is being taken by the PCI Security Standards Council. Topics for their special interest group projects in 2015 are going to be daily log monitoring and shared security responsibilities for third party service providers.

That means that all those organizations out there that foster the use of or process credit cards are going to see new requirements in these fields in the next couple of years. Undoubtedly similar requirements for increased security measures will be seen in the governmental levels as well. So why wait until the last minute? If you start now implementing not only effective monitoring and 3rd party security, but other “best practices” security measures, it will be much less painful and more cost effective for you. You will also be helping us all by coming up with new ways to practically and effectively detect security incidents through system monitoring. How about increasing the use of low noise anomaly detectors such as honey pots? What about concentrating more on monitoring information leaving the network than what comes in? How about breaking massive networks into smaller parts that are easier monitor and secure? What ideas can you come up with to explore?

This post written by John Davis.

Remember, Log Analysis is Important, Especially Now

Remember, during the holiday season, attacks tend to increase and so do compromises. With vacations and staff parties, monitoring the logs and investigating anomalies can quickly get forgotten. Please make sure you remain vigilant during this time and pay close attention to logs during and just after holiday breaks.

As always, thanks for reading and we wish you a safe and happy holiday season!

Monitoring: an Absolute Necessity (but a Dirty Word Nonetheless)

There is no easier way to shut down the interest of a network security or IT administrator than to say the word “monitoring”. You can just mention the word and their faces fall as if a rancid odor had suddenly entered the room! And I can’t say that I blame them. Most organizations do not recognize the true necessity of monitoring, and so do not provide proper budgeting and staffing for the function. As a result, already fully tasked (and often times inadequately prepared) IT or security personnel are tasked with the job. This not only leads to resentment, but also virtually guarantees that the job is will not be performed effectively.

And when I say human monitoring is necessary if you want to achieve any type of real information security, I mean it is NECESSARY! You can have network security appliances, third party firewall monitoring, anti-virus packages, email security software, and a host of other network security mechanisms in place and it will all be for naught if real (and properly trained) human beings are not monitoring the output. Why waste all the time, money and effort you have put into your information security program by not going that last step? It’s like building a high and impenetrable wall around a fortress but leaving the last ten percent of it unbuilt because it was just too much trouble! Here are a few tips for effective security monitoring:

  • Properly illustrate the necessity for human monitoring to management, business and IT personnel; make them understand the urgency of the need. Make a logical case for the function. Tell them real-world stories about other organizations that have failed to monitor and the consequences that they suffered as a result. If you can’t accomplish this step, the rest will never fall in line.
  • Ensure that personnel assigned to monitoring tasks of all kinds are properly trained in the function; make sure they know what to look for and how to deal with what they find.
  • Automate the logging and monitoring function as much as possible. The process is difficult enough without having to perform tedious tasks that a machine or application can easily do.
  • Ensure that you have log aggregation in place, and also ensure that other network security tool output is centralized and combined with logging data. Real world cyber-attacks are often very hard to spot. Correlating events from different tools and processes can make these attacks much more apparent. 
  • Ensure that all personnel associated with information security communicate with each other. It’s difficult to effectively detect and stop attacks if the right hand doesn’t know what the left hand is doing.
  • Ensure that logging is turned on for everything on the network that is capable of it. Attacks often start on client side machines.
  • Don’t just monitor technical outputs from machines and programs, monitor access rights and the overall security program as well:
  • Monitor access accounts of all kinds on a regular basis (at least every 90 days is recommended). Ensure that user accounts are current and that users are only allocated access rights on the system that they need to perform their jobs. Ensure that you monitor third party access to the system to this same level.
  • Pay special attention to administrative level accounts. Restrict administrative access to as few personnel as possible. Configure the system to notify proper security and IT personnel when a new administrative account is added to the network. This could be a sign that a hack is in progress.
  • Regularly monitor policies and procedures to ensure that they are effective and meet the security goals of the organization. This should be a regular part of business continuity testing and review.
Thanks to John Davis for writing this post.

Ask The Experts: Too Much Data

Q: “I have massive amounts of log files I have to dig through every day. I have tried a full blown SEIM, but can’t get it to work right or my management to support it with budget. Right now I have Windows logs, firewall logs and AV logs going to a syslog server. That gives me a huge set of text files every day. How can I make sense of all that text? What tools and processes do you suggest? What should I be looking for? HELP!!!!”


Adam Hostetler answered with:


I would say give OSSEC a try. It’s a free log analyzer/SEIM. It doesn’t

have a GUI with100 different dashboards and graphs, it’s all cli and

e-mail based (though there is a simple web interface for it also). It is

easy to write rules for, and it has default rules for many things,

except for your AV. You can write simple rules for that, especially if

you are just looking for items AV caught. It does take some tuning, as

with all analysis tools, but isn’t difficult after learning how OSSEC

works. If you want to step it up a bit, you can feed OSSEC alerts into

Splunk where you can trend alerts, or create other rules and reports in it.


Bill Hagestad added:


First things first – don’t be or feel overwhelmed – log files are what they are much disparate data from a variety of resources that need reviewing sooner rather than later.


Rather than looking at another new set to tools or the latest software gizmo the trade rags might suggest based on the flair of the month, try a much different and more effective approach to the potential threat surface to your network and enterprise information network.


First take a look at what resources need to be protected in order of importance to your business. Once you have prioritized these assets then begin to  determine what is the minimum level of acceptable risk you can assign to each resource you have just prioritized.


Next, make two columns on a either a piece of paper or a white board. In one column list your resources in order of protection requirements, i.e.; servers with customer data, servers with intellectual property, so and so forth. In a column to the right of the first assets list plug in your varying assigned levels of risk. Soon you will see what areas/assets within your organization/enterprise you should pay the most attention to in terms of threat mitigation.


After you have taken the steps to determine your own self- assessment of risk contact MicroSolved for both a vulnerability assessment and penetration test to provide additional objective perspective on threats to your IT infrastructure and commercial enterprise. 


Finally, Jim Klun weighed in with: 


You are way ahead of the game by just having a central log repository.  You can go to one server and look back in time to the point where you expect a security incident.


And what you have – Windows logs, firewall logs, and AV – is fantastic.  Make sure all your apps are logging as well ( logon success, logon failure).

Too often I have seen apps attacked and all I had in syslog was OS events that showed nothing.


Adam’s suggestion, OSSEC, is the way to go to keep cost down… but don’t just install and hope for the best.

You will have to tweak the OSSEC rules and come up with what works.


Here’s the rub: there is no substitute for knowing your logs – in their raw format, not pre-digested by a commercial SIEM or OSSEC.


That can seem overwhelming. And to that, some Unix commands and regular expressions are your friend.




zcat auth.log | grep ssh | egrep -i ‘failed|accepted’




Jul  4 16:32:16 dmz-server01 sshd[8786]: Failed password for user02 from port 38143 ssh2

Jul  4 16:33:53 dmz-server01 sshd[8786]: Accepted password for user01 from port 38143 ssh2

Jul  4 16:36:05 dmz-server01 sshd[9010]: Accepted password for user01 from port 38315 ssh2

Jul  5 01:04:00 dmz-server01 sshd[9308]: Accepted password for user01 from port 60351 ssh2

Jul  5 08:21:58 dmz-server01 sshd[9802]: Accepted password for user01 from port 51436 ssh2

Jul  6 10:21:52 dmz-server01 sshd[21912]: Accepted password for user01 from port 36486 ssh2

Jul  6 13:43:10 dmz-server01 sshd[31701]: Accepted password for user01 from port 34703 ssh2

Jun 26 11:21:02 dmz-server01 sshd[31950]: Accepted password for user01 from port 37209 ssh2



Instead of miles of gibberish the log gets reduced to passed/fail authentication attempts.


You can spend an hour with each log source ( firewall, AV, etc) and quickly pare them down to whats interesting.


Then make SURE your OSSEC  rules cover what you want to see.

If that does not work – cron a script to parse the logs of interest using your regular expression expertise and have an email sent to you when something goes awry.


Revisist the logs manually periodically – they will change. New stuff will happen.  Only a human can catch that.


Take a look at:


The site lists a number of tools that may be useful


John Davis added:


You voice one of the biggest problems we see in information security programs: monitoring! People tell us that they don’t have the proper tools and, especially, they don’t have the manpower to perform effective logging and monitoring. And what they are saying is true, but unfortunately doesn’t let them out from having to do it. If you have peoples financial data, health data (HIPAA) or credit card information (PCI) you are bound by regulation or mandate to properly monitor your environment – and that means management processes, equipment, vulnerabilities and software as well as logs and tool outputs. The basic problem here is that most organizations don’t have any dedicated information security personnel at all, or the team they have isn’t adequate for the work load. Money is tight and employees are expensive so it is very difficult for senior management to justify the expenditure – paying a third party to monitor firewall logs is cheaper. But for real security there is no substitute for actual humans in the security loop – they simply cannot be replaced by technology. Unfortunately, I feel the only answer to your problem is for government and industry to realize this truth and mandate dedicated security personnel in organizations that process protected data.


As always, thanks for reading and if you have a question for the experts, either leave it in the comments, email us or drop us a line on Twitter at (@lbhuston). 

March Touchdown Task: Check the Firewall Logs

This month’s Touchdown Task is to help you with detection and response. For March, we suggest you do a quick controls review on your firewall logs. Here’s some questions to begin with:

  • Are you tracking the proper amount of data?
  • Are the logs archived properly?
  • Do you have IP addresses instead of DNS names in the logs?
  • Are the time and date settings on the logs correct?
  • Is everything working as expected?

Undertaking a different quick and dirty Touchdown Task each month helps increase vigilance without huge amounts of impact on schedules and resources. Thanks for reading!

Splunk 4 Review

For this weeks tool review, we’re looking at Splunk. Splunk is a log collection engine at heart, but it’s really more than that. Think of it as search engine for your IT infrastructure. Splunk will actually collect and index anything you can throw at it, and this is what made me want to explore it.

Setting up your Splunk server is easy, there’s installers for every major OS. Run the installer and visit the web front end, and you are in business. Set up any collection sources you need, I started off with syslog. I started a listener in Splunk, and then forwarded my sources to Splunk (I used syslog-ng for this). Splunk will also easily do WMI polling, monitoring local files, change monitoring, or run scripts to generate any data you want. Some data sources require running Splunk as an agent, but it goes easy on system resources as the GUI is turned off. Installing agents is exactly the same process — you just disable the GUI when you’re finished setting up; however you can still control Splunk through the command line.

Splunk can also run addons, in the form of apps. These are plugins that are designed to take and display certain information. There are quite a few, provided both by the Splunk team and also some created by third parties. I found the system monitoring tools to be very helpful. There are scripts for both Windows and Unix. In this instance, it does require running clients on the system. There are also apps designed for Blue Coat, Cisco Security and more.

In my time using Splunk, I’ve found it to be a great tool for watching logs for security issues (brute forcing ssh accounts for example), it was also useful in fine tuning my egress filtering, as I could instantly see what was being blocked by the firewall, and of course the system monitoring aspects are useful. It could find a home in any organization, and it plays nice with other tools or could happily be your main log aggregation system.

Splunk comes in two flavors, free and professional. There’s not a great difference between them. The biggest difference is that with the free version Splunk is limited to 500MB of indexing per day, which proves to be more than enough for most small businesses, and testing for larger environments. Stepping up to the professional version is a lot easier on the pockets than might be expected, only about $3,000.

Data Visualization Tools in Security

I have been playing with a few data visualization tools and doing some on the fly firewall log analysis. Mostly just basic plots and stuff so far.

These tools make analysis a pretty cool process. I can see where it would be useful with a very large data set.

I have been reading a new book about it, stay tuned for a review. In the meantime, there are a lot of data visualization tools out there, but I have been playing with the InspireData from Inspiration Software.  Check them out if you want to see what I am talking about.

Have you tried using visualization tools for log analysis? If so, leave me a comment about your experiences and the tools you use.