Last Quick and Dirty Log Tip for the Week

OK, so this week I posted two other blog posts about doing quick and dirty log analysis and some of the techniques I use; one of them also covers converting column-based logs to CSV.

After the great response, I wanted to drop one last tip for the week. 

Several folks asked me about re-sorting and processing the column-based data in different ways and to achieve different analytical views. 

Let me re-introduce you to my friend and yours, sort.

In this case, instead of using sort -n -r as before (numeric sort, reverse order), we can use:

  • sort -k# -n input_file (where # is the number of the column you’d like to sort by, and input_file is the name of the file to sort)
    • You can use this inline by leveraging the pipe (|) again – i.e.: cat input.txt | sort -k3 -n (this types the input file and sends it to sort for numeric sorting on the third column; -r would, of course, reverse it…)
    • You can write the output of this to a file with the redirect "> filename.txt", i.e.: cat input.txt | sort -k3 -n -r > output.txt
      • You could also use ">>" as the redirect in order to create a file if it doesn’t exist OR append to a file if it does exist, i.e.: cat input.txt | sort -k3 -n -r >> appended_output.txt
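To make the column sort concrete, here is a tiny self-contained run (the file name and its contents are invented for illustration):

```shell
# Build a small sample file: host, port, hit count (space-separated columns).
printf 'alpha 443 12\nbeta 80 7\ngamma 22 99\n' > sample.txt

# Sort numerically on the third column, descending, and write the result out.
sort -k3 -n -r sample.txt > sorted.txt

# gamma (99 hits) now comes first, beta (7 hits) last.
cat sorted.txt
```

The same redirect tricks from above apply: swap `>` for `>>` to append instead of overwrite.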

That’s it! It’s been a fun week sharing some simple command line processing tips for log files. Drop me a line on Twitter (@lbhuston) and let me know what you used them for, or which ones are your favorite. As always, thanks and have a great weekend! 

Quick And Dirty Log Analysis Followup

Earlier this week, I posted some tips for doing Quick and Dirty PA Firewall Log Analysis.

After I posted this, I got a very common question, and I wanted to answer it here.

The question is something along the lines of “When I use the techniques from your post, the outputs of the commands are column separated data. I need them to be CSV to use with my (tool/SIEM/Aunt Gracie/whatever). How can I convert them?” Sound familiar?

OK, so how do we accomplish this feat at the command line, without all of the workarounds that people posted, and without EVER loading Excel? Thankfully, we can use awk again for this.

We can use:

  • awk 'BEGIN { OFS = ","} ; {print $1,$2,$3}'
    • Basically, take an input of column data, and print out the columns we want (can be any; in this case I want the first 3 columns), and make the output comma delimited.
    • We can just append this to our other command stacks with another pipe (|) to get our output CSV
  • Example: cat log.csv | awk 'BEGIN { FS = ","} ; {print $8,$9}' | sort -n | uniq -c | sort -n -r | awk 'BEGIN { OFS = ","} ; {print $1,$2,$3}'
    • In this example, the source IP and destination IP will be analyzed, then reduced to unique pairs, along with the number of times that each pair appears in the input log (I use this as a “hit rate”, as I described earlier)
      • A common question: why do I ask for two columns in the first awk, but three columns in the second awk?
        • The answer, of course, is that the first awk prints the unique IP pairs, and uniq -c then prepends a column with the “hit rate” count, so to format the output appropriately, I need all three fields.
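Here is a small self-contained sketch of just the conversion step (the counts and IPs are made up, standing in for real uniq -c output):

```shell
# uniq -c style input: count, source IP, dest IP (whitespace-separated columns).
# awk's default input separator handles the leading spaces; OFS makes the output CSV.
printf ' 42 10.0.0.1 10.0.0.9\n 17 10.0.0.2 10.0.0.9\n' |
  awk 'BEGIN { OFS = "," } ; { print $1, $2, $3 }'
```

This emits `42,10.0.0.1,10.0.0.9` and `17,10.0.0.2,10.0.0.9` – ready for whatever tool needs CSV.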

So, once again, get to know awk. It is your friend. :)

PS – Yes, I know, there are hundreds of other ways to get this same data, in the same format, using other command line text processing tools. Many may even be less redundant than the commands above. BUT, this is how I did it, and I think it makes it easy for people to get started and play with the data. Post your own approaches to Twitter or share them with the community – exploration is awesome, and it encourages folks to play with the data even more. Cool! Hit me on Twitter if you wanna share some or talk more about this approach (@lbhuston).

Thanks for reading!

Quick & Dirty Palo Alto Log Analysis

OK, so I needed to do some quick and dirty traffic analysis on Palo Alto text logs for a project I was working on. The Palo Alto is great and their console tools are nice. Panorama is not too shabby. But, when I need quick and dirty analysis and want to play with data, I dig into the logs. 
 
That said, for my quick analysis, I needed to analyze a bunch of text logs and model the traffic flows. To do that, I used simple command line text processing in Unix (Mac OS, but with tweaks this also works in Linux, etc.).
 
I am sharing some of my notes and some of the useful command lines to help others who might be facing a similar need.
 
First, for my project, I made use of the following field #’s in the text analysis, pulled from the log header for sequence:
  • $8 (source IP) 
  • $9 (dest IP)
  • $26 (dest port)
  • $15 (AppID)
  • $32 (bytes)
 
Once I knew the fields that corresponded to values I wanted to study, I started using the core power of command line text processing. And in this case, the power I needed was:
  • cat
  • grep
    • Including the ever-useful grep -v (inverse grep: show me the lines that don’t match my pattern)
  • awk
    • particularly: awk 'BEGIN { FS = ","} ; {print $x, $y}' which prints specific columns from CSV input
  • sort
    • sort -n (numeric sort)
    • sort -r (reverse sort, descending)
  • uniq
    • uniq -c (count the numbers of duplicates, used for determining “hit rates” or frequency, etc.)
 
Of course, to learn more about these commands, simply man (command name) and read the details. 😃 
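Since grep -v doesn’t appear in the examples below, here is a quick sketch of how it fits in (the file and the “trusted” address are invented): it lets you drop known-good traffic before you start counting.

```shell
# Invented traffic log: source IP and port, one line per connection.
printf '10.0.0.1,80\n192.168.1.5,443\n10.0.0.1,22\n' > traffic.txt

# Keep only the lines that do NOT mention the trusted host,
# so the noise doesn't skew the hit-rate counts downstream.
grep -v '192.168.1.5' traffic.txt
```

You can drop this into any of the pipelines below, right after the cat, to pre-filter the log.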
 
OK, so I will get you started, here are a few of the more useful command lines I used for my quick and dirty analysis:
  • cat log.csv | awk 'BEGIN { FS = ","} ; {print $8,$9,$26}' | sort | uniq -c | sort -n -r > hitrate_by_rate.txt
    • this one produces a list of Source IP/Dest IP/Dest Port unique combinations, sorted in descending order by the number of times they appear in the log
  • cat log.csv | awk 'BEGIN { FS = ","} ; {print $8,$9}' | sort -n | uniq -c | sort -n -r > uniqpairs_by_hitrate.txt
    • this one produces a list of the unique Source & Destination IP pairs, in descending order by how many times they talk to each other in the log (note that reversed pairings will be separate, if present – that is, if A talks to B, there will be an entry for that, but if B initiates conversations with A, that will be a separate line in this data set)
  • cat log.csv | awk 'BEGIN { FS = ","} ; {print $15}' | sort | uniq -c | sort -n -r > appID_by_hitrate.txt
    • this one uses the exact same techniques, but now we are looking at which applications the firewall has identified, in descending order by the number of times that application identifier appears in the log
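To see the “hit rate” idea end to end, here is the unique-pairs pipeline run against a tiny made-up CSV (field positions shortened to $1 and $2 for the sample; real Palo Alto logs use $8 and $9):

```shell
# Invented mini log: source IP, dest IP, comma-separated, one flow per line.
printf '10.0.0.1,10.0.0.9\n10.0.0.1,10.0.0.9\n10.0.0.2,10.0.0.9\n' > mini.csv

# Same stack as above: pick columns, group duplicates, count, rank descending.
cat mini.csv | awk 'BEGIN { FS = "," } ; { print $1, $2 }' | sort | uniq -c | sort -n -r
```

The top line shows the busiest pair with its hit count prepended by uniq -c, which is exactly the shape the CSV-conversion trick in the follow-up post consumes.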
 
Again, these are simple examples, but you can tweak and expand as you need. This trivial approach to command line text analysis certainly helps with logs and traffic data. You can use those same commands to do a wondrous amount of textual analysis and processing. Learn them, live them, love them. 😃 
 
If you have questions, or want to share some of the ways you use those commands, please drop us a line on Twitter (@microsolved) or hit me up personally for other ideas (@lbhuston). As always, thanks for reading and stay safe out there! 

An Exercise to Increase IT/OT Engagement & Cooperation

Just a quick thought on an exercise to increase the cooperation, trust and engagement between traditional IT and OT (operational technology – ICS/SCADA) teams. It likely applies to just about any two technical teams, including IT and development, etc.

Here’s the idea: Host a Hack-a-thon!

It might look something like this:

  • Invest in some abundant kits of LittleBits. These are like Legos with electronics, mechanical circuits and even Arduino/Cloud controllers built in. Easy, safe, smart and fun!
  • Put all of the technical staff in a room together for a day. Physically together. Ban all cell phones, calls, emails, etc. for the day – get people to engage – cater in meals so they can eat together and develop rapport
  • Split the folks into two or more teams of equal size, mixing IT and OT team members (each team will need both skill sets – digital and mechanical knowledge).
  • Create a mission – over the next 8 hours, each team will compete to see who can use their LittleBits set to design, program and prototype a solution to a significant problem faced in their everyday work environments.
  • Provide a prize for 1st and 2nd place team. Reach deep – really motivate them!
  • Let the teams go through the process of discussing their challenges to find the right problem, then have them draw out their proposed solution.
  • After lunch, have the teams discuss the problems they chose and their suggested fix. Then have them build it with the LittleBits. 
  • Right before the end of the day, have a judging and award the prizes.

Then, 30 days later, have a conference call with the group. Have them again discuss the challenges they face together, and see if common solutions emerge. If so, implement them.

Do this a couple times a year, maybe using something like Legos, Raspberry Pis, Arduinos or just whiteboards and markers. Let them have fun, vent their frustrations and actively engage with one another. The results will likely astound you.

How does your company further IT/OT engagement? Let us know on Twitter (@microsolved) or drop me a line personally (@lbhuston). Thanks for reading! 

Hurricane Matthew Should Remind You to Check Your DR/BC Plans

The news is full of tragedy from Hurricane Matthew at the moment, and our heart goes out to those being impacted by the storm and its aftermath.

This storm is a powerful hit on much of the Southeastern US, and should serve as a poignant reminder to practice, review and triple check your organization’s DR and BC plans. You should review your processes and procedures yearly, with an update at least quarterly and any time major changes to your operations or environment occur. Most organizations seem to practice these events quarterly, or at least twice a year, often using one full test per year and tabletop exercises for the rest. 

This seems to be an effective cycle and approach. 

We hope that everyone stays safe from the hurricane and we are hoping for minimal impacts, but we also hope that organizations take a look at their plans and give them a once over. You never know when you just might need to be better prepared.

Yahoo Claims of Nation State Attackers are Refuted

A security vendor claims that the Yahoo breach was performed by criminals and not a nation state.

This is yet more evidence that in many cases, focusing on the who is the wrong approach. Instead of trying to identify a specific set of attacker identities, organizations should focus on the what and how. This is far more productive, in most cases.

If, down the road, as a part of recovery, the who matters to some extent (for example, if you are trying to establish a loss impact, or if you are trying to create economic defenses against the conversion of your stolen data), then you might focus on the who at that point. But even then, performing a spectrum analysis of potential attackers, based on risk assessment, is far more likely to produce results that are meaningful for your efforts. 

Attribution is often very difficult and can be quite misleading. Effective incident response should clearly focus on the what and how, so as to best minimize impacts and ensure mitigation. Clues accumulated around the who at this stage should be archived for later analysis during recovery. Obviously, this data should be handled and stored carefully, but nonetheless, in nearly every case that data shouldn’t derail or delay the investigation and mitigation work.

How does your organization handle the who evidence in an incident? Let us know on Twitter (@microsolved) and we will share the high points in a future post.

Password Breach Mining is a Major Threat on the Horizon

Just a quick note today to get you thinking about a very big issue that is just over the security horizon.

As machine learning capabilities grow rapidly and mass storage pricing drops to close to zero, we will see a collision that will easily benefit common criminals. That is, they will begin to apply machine learning correlation and prediction capabilities to breach data – particularly passwords, in my opinion.

Millions of passwords are often breached at a time these days. Compiling these stolen passwords is quite easy, and with each added set, tracking and tracing individual users and their password selection patterns becomes trivial. Learning systems could be used to turn that raw data into insights about particular users’ habits. For example, if a user continually creates passwords based on a season and a number (ex: Summer16), and several breaches show that same pattern as being associated with that particular user (ex: Summer16 on one site, Autumn12 on another, and so on…), then criminals can use prediction algorithms to create a custom dictionary to target that user. The dictionary set will be concise and is likely to be highly effective.
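As a toy illustration of how easy this kind of pattern hunting already is, even without machine learning (the password list below is invented), a single grep can flag the season-plus-number habit described above:

```shell
# Invented breach dump: one leaked password per line.
printf 'Summer16\nhunter2\nAutumn12\ncorrecthorse\nWinter19\n' > dump.txt

# Flag passwords that match the season + two-digit pattern.
grep -E '^(Spring|Summer|Autumn|Fall|Winter)[0-9]{2}$' dump.txt
```

Three of the five invented passwords match; a learning system would go further and tie those matches back to a specific user across multiple breach sets.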

Hopefully, we have been teaching users not to use the same password in multiple locations – but a quick review of breach data sets shows that these patterns are common. I believe such predictable patterns may well become the next evolution of bad password choices.

Now might be the time to add this to your awareness programs. Talk to users about password randomization, password vaults and the impacts that machine learning and AI are likely to have on crime. If we can change user behavior today, we may be able to prevent the breaches of tomorrow!

From Dark Net Research to Real World Safety Issue

On a recent engagement by the MSI Intelligence team, our client had us researching the dark net to discover threats against their global brands. This is a normal and methodology-driven process for the team and the TigerTrax™ platform has been optimized for this work for several years.

We’ve seen plenty of physical threats against clients before. In particular, our threat intelligence and brand monitoring services for professional sports teams have identified several significant threats of violence in the last few years. Unfortunately, this is much more common for high visibility brands and organizations than you might otherwise assume.

In this particular instance, conversations were flagged by TigerTrax from underground forums that were discussing physical attacks against the particular brand. The descriptions were detailed, politically motivated and threatened harm to employees and potentially the public. We immediately reported the issue and provided the captured data to the client. The client reviewed the conversations and correlated them with other physical security occurrences that had been reported by their employees. In today’s world, such threats require vigilant attention and a rapid response.

In this case, the client was able to turn our identified data into insights by using it to gain context from their internal security issue reporting system. From those insights, they were able to quickly launch an awareness campaign for their employees in the areas identified, report the issue to localized law enforcement and invest in additional fire and safety controls for their locations. We may never know if these efforts were truly effective, but if they prevented even a single occurrence of violence or saved a single human life, then that is a strong victory.

Security is often about working against things so that they don’t happen – making it abstract, sometimes frustrating and difficult to explain to some audiences. But, when you can act on binary data as intelligence and use it to prevent violence in the kinetic world, that is the highest of security goals! That is the reason we built TigerTrax and offer the types of intelligence services we do to mature organizations. We believe that insights like these can make a difference and we are proud to help our clients achieve them.

3 Reasons You Need Customized Threat Intelligence

Many clients have been asking us about our customized threat intelligence services and how to best use the data that we can provide.

1. Using HoneyPoint™, we can deploy fake systems and applications, both internally and in key external situations, that let you generate real-time, organization-specific indicators of compromise (IoC) data – including a wide variety of threat-source information for blacklisting, baseline metrics that make it easy to measure up-to-the-moment changes in the level of threat activity against your organization, and a wide variety of scenarios for application and attack-surface hardening.

2. Our SilentTiger™ passive assessments can give you a wider lens for vulnerability assessment visibility than your perimeter alone. They can be used to assess, either as a single instance or on an ongoing basis, the security posture of locations where your brand extends to business partners, cloud providers, supply chain vendors, critical-dependency APIs and data flows, and other systems well beyond your perimeter. Since the testing is passive, you don’t need permission, contract language or control of the systems being assessed. You get the data in a stable, familiar format – very similar to vulnerability scanning reports – or via customized data feeds into your SIEM/GRC/ticketing tools or the like. This means you can be vigilant against more attack surfaces without more effort and more resources.

3. Our customized TigerTrax™ Targeted Threat Intelligence (TTI) offerings can be used for brand specific monitoring around the world, answering specific research questions based on industry / geographic / demographic / psychographic profiles or even products / patents or economic threat research. If you want to know how your brand is being perceived, discussed or threatened around the world, this service can provide that either as a one-time deliverable, or as an ongoing periodic service. If you want our intelligence analysts to look at industry trends, fraud, underground economics, changing activist or attacker tactics and the way they collide with your industry or organization – this is the service that can provide that data to you in a clear and concise manner that lets you take real-world actions.

We have been offering many of these services to select clients for the last several years. Only recently have we decided to offer them to our wider client and reader base. If you’d like to learn how others are using the data or how they are actively hardening their environments and operations based on real-world data and trends, let us know. We’d love to discuss it with you!