This time I interviewed Aaron Bedra about his newest creation ~ RepSheet. Check it out here:
Aaron’s Bio:
Aaron is the Application Security Lead at Braintree Payments. He is the co-author of Programming Clojure, 2nd Edition as well as a frequent contributor to the Clojure language. He is also the creator of Repsheet, a reputation based intelligence and security tool for web applications.
Question #1: You created a tool called Repsheet that takes a reputational approach to web application security. How does it work and why is it important to approach the problem differently than traditional web application firewalling?
I built Repsheet after finding lots of gaps in traditional web application security. Simply put, it is a web server module that records data about requests, and either blocks traffic or notifies downstream applications of what is going on. It also has a backend to process information over time and outside the request cycle, and a visualization component that lets you see the current state of the world. If you break down the different critical pieces that are involved in protecting a web application, you will find several parts:
* Solid and secure programming practices
* Identity and access management
* Visibility (what’s happening right now)
* Response (make the bad actors go away)
* HELP!!!! (DDoS and other upstream based ideas)
* A way to manage all of the information in a usable way
This is a pretty big list. There are certainly some things on this list that I haven’t mentioned as well (crypto management, etc), but this covers the high level. Coordinating all of this can be difficult. There are a lot of tools out there that help with pieces of this, but don’t really help solve the problem at large.
The other problem I have is that although I think having a WAF is important, I don’t necessarily believe in using it to block traffic. There are just too many false positives and things that can go wrong. I want to be certain about a situation before I act aggressively towards it. This being the case, I decided to start by simply making a system that records activity and listens to ModSecurity. It stores what has happened and provides an interface that lets the user manually act based on the information. You can think of it as a half baked SIEM.
That alone actually proved to be useful, but there are many more things I wanted to do with it. The issue was doing so in a manner that didn’t add overhead to the request. This is when I created the Repsheet backend. It takes in the recorded information and acts on it based on additional observation. This can be done in any form and it is completely pluggable. If you have other systems that detect bad behavior, you can plug them into Repsheet to help manage bad actors.
The visualization component gives you the detailed and granular view of offenses in progress, and gives you the power to blacklist with the click of a button. There is also a global view that lets you see patterns of data based on GeoIP information. This has proven to be extremely useful in detecting localized botnet behavior.
So, with all of this, I am now able to manage the bottom part of my list. One of the pieces that was recently added was upstream integration with Cloudflare, where the backend will automatically blacklist via the Cloudflare API, so any actors that trigger blacklisting will be dealt with by upstream resources. This helps shed attack traffic in a meaningful way.
The piece that was left unanswered is the top part of my list. I don’t want to automate good programming practices. That is a culture thing. You can, of course, use automated tools to help make it better, but you need to buy in. The identity and access management piece was still interesting to me, though. Once I realized that I already had data on bad actors, I saw a way to start to integrate this data that I was using in a defensive manner all the way down to the application layer itself. It became obvious that with a little more effort, I could start to create situations where security controls were dynamic based on what I know or don’t know about an actor. This is where the idea of increased security and decreased friction really set it and I saw Repsheet become more than just a tool for defending web applications.
All of Repsheet is open sourced with a friendly license. You can find it on Github at:
There are multiple projects that represent the different layers that Repsheet offers. There is also a brochureware site at http://getrepsheet.com that will soon include tutorial information and additional implementation examples.
Question #2: What is the future of reputational interactions with users? How far do you see reputational interaction going in an enterprise environment?
For me, the future of reputation based tooling is not strictly bound to defending against attacks. I think once the tooling matures and we start to understand how to derive intent from behavior, we can start to create much more dynamic security for our applications. If we compare web security maturity to the state of web application techniques, we would be sitting right around the late 90s. I’m not strictly talking about our approach to preventing breaches (although we haven’t progressed much there either), I’m talking about the static nature of security and the impact it has on the users of our systems. For me the holy grail is an increase in security and a decrease in friction.
A very common example is the captcha. Why do we always show it? Shouldn’t we be able to conditionally show it based on what we know or don’t know about an actor? Going deeper, why do we force users to log in? Why can’t we provide a more seamless experience if we have enough information about devices, IP address history, behavior, etc? There has to be a way to have our security be as dynamic as our applications have become. I don’t think this is an easy problem to solve, but I do think that the companies that do this will be the ones that succeed in the future.
Tools like Repsheet aim to provide this information so that we can help defend against attacks, but also build up the knowledge needed to move toward this kind of dynamic security. Repsheet is by no means there yet, but I am focusing a lot of attention on trying to derive intent through behavior and make these types of ideas easier to accomplish.
Question #3: What are the challenges of using something like Repsheet? Do you think it’s a fit for all web sites or only specific content?
I would like to say yes, but realistically I would say no. The first group that this doesn’t make sense for are sites without a lot of exposure or potential loss. If you have nothing to protect, then there is no reason to go through the trouble of setting up these kinds of systems. They basically become a part of your application infrastructure and it takes dedicated time to make them work properly. Along those lines, static sites with no users and no real security restrictions don’t necessarily see the full benefit. That being said, there is still a benefit from visibility into what is going on from a security standpoint and can help spot events in progress or even pending attacks. I have seen lots of interesting things since I started deploying Repsheet, even botnets sizing up a site before launching an attack. Now that I have seen that, I have started to turn it into an early warning system of sorts to help prepare.
The target audience for Repsheet are companies that have already done the web security basics and want to take the next step forward. A full Repsheet deployment involves WAF and GeoIP based tools as well as changes to the application under the hood. All of this requires time and people to make it work properly, so it is a significant investment. That being said, the benefits of visibility, response to attacks, and dynamic security are a huge advantage. Like every good investment into infrastructure, it can set a company apart from others if done properly.
Thanks to Aaron for his work and for spending time with us! Check him out on Twitter, @abedra, for more great insights!