Anytime something bad happens, such as a major security incident, it is tempting to point fingers and blame the engineers and leadership at Capital One. It is also tempting (although sometimes useful) to point out specific steps the victim organization should have taken.
The truth is any one of us could make the same or similar mistakes. And it only takes one small misstep to allow a crafty attacker in the front door. That is why I lean toward a more holistic approach when commenting on events like the Capital One data breach.
To summarize the dozens of articles describing how the breach occurred: former AWS engineer Paige Thompson allegedly used a server-side request forgery (SSRF) bug to bypass the ModSecurity open source web application firewall to access the AWS metadata service, obtain credentials, and eventually send requests to get sensitive data from file stores because of overly permissive access permissions. Whew!
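To make that chain concrete, here is a minimal sketch of why an unchecked "fetch this URL for me" feature is so dangerous on an EC2 instance. The names are hypothetical and a stubbed HTTP function stands in for the real network; the point is that the attacker's URL becomes a server-side request to the metadata service:

```python
# The path the EC2 instance metadata service uses to hand out temporary
# credentials for the instance's IAM role:
METADATA_URL = "http://169.254.169.254/latest/meta-data/iam/security-credentials/"

def fetch_url(url, http_get):
    # Vulnerable pattern: the server fetches whatever URL the client
    # supplies, with no check on the destination. (http_get is injected
    # here so the sketch runs without a network.)
    return http_get(url)

def fake_metadata_service(url):
    # Stand-in for the real metadata service, for demonstration only.
    if url.startswith("http://169.254.169.254/"):
        return '{"AccessKeyId": "ASIA...", "SecretAccessKey": "..."}'
    return "<html>normal page</html>"

# Through the vulnerable endpoint, the attacker's "URL to fetch" becomes a
# server-side request to the metadata service, leaking role credentials:
leaked = fetch_url(METADATA_URL, fake_metadata_service)
```

Because the request originates from the server itself, it carries the server's network position and trust, which is exactly what makes SSRF such an effective pivot.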
While security can never be 100% guaranteed, there are some things to review if you want to make sure you’re not the next Capital One. Large-scale breaches like this one are a good time to pause long enough to consider what went wrong.
For us to improve our security practices, as individual organizations and as an industry, it’s not about placing blame but about the lessons we can learn from this particular incident. Below are four areas to review at your organization while the breach is still fresh in our minds:
Surrounding this incident, there are three areas where security leadership may have been a factor:
Because cloud computing is a relatively new and radically different model, we often see a general lack of understanding around AWS and other platforms.
In recent years, the term “cloud” has been used for anything hosted at somebody else’s data center. Its overuse clearly demonstrates a lack of understanding surrounding cloud platforms. When an application is created in or migrated to AWS, it’s not just a simple matter of moving to another data center.
We have seen organizations try to make a direct move from a traditional data center to AWS. And it just isn’t that simple. Before running a system in AWS, everybody needs to understand what AWS is and how the system will interact with its various services.
Typically, IT projects get pushed through quickly, with a security review being tacked on, often as an afterthought. This is where we as security professionals need more representation in all areas of the organization.
If you don’t have a design review team for configuring external-facing systems, get one now, and include a security professional on it. Every team should have a person who is familiar with current security controls and can act as an advisor.
When new systems are implemented, security leadership should commission a comprehensive risk assessment of the platform. At that point, the risk assessment team would dig into all areas of risk and would most likely review access rules for servers such as the ModSecurity firewall server. The risk assessment would have recommended removing the firewall server’s access to the S3 buckets; no firewall server should ever have that kind of access.
One of the key risk indicators for any system is access controls around the application, operating system, and underlying infrastructure. Asking the simple question “What, within AWS, does this external-facing (and therefore higher risk) server that hosts our firewall have access to?” could have saved the company a lot of headaches by preventing the data exfiltration.
The specifics around the Capital One incident should spark at least one action across the security industry: double- and triple-checking access roles and account privileges to make sure no overly permissive rules exist.
Unfortunately for Capital One, this is security 101. So it just looks bad.
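Checks like this are easy to automate. As a minimal sketch, here is the kind of scan that flags overly permissive statements; the policy follows AWS’s IAM JSON document shape, but the role and statement contents are made up for illustration:

```python
def find_overly_permissive(policy):
    """Return Allow statements with wildcard actions or resources."""
    findings = []
    for stmt in policy.get("Statement", []):
        if stmt.get("Effect") != "Allow":
            continue
        actions = stmt.get("Action", [])
        if isinstance(actions, str):
            actions = [actions]
        resources = stmt.get("Resource", [])
        if isinstance(resources, str):
            resources = [resources]
        # Flag service-wide action wildcards (e.g. "s3:*") and "any resource".
        if any(a == "*" or a.endswith(":*") for a in actions) or "*" in resources:
            findings.append(stmt)
    return findings

# Hypothetical policy attached to a WAF server's role:
waf_role_policy = {
    "Statement": [
        {"Effect": "Allow", "Action": "s3:*", "Resource": "*"},  # far too broad
        {"Effect": "Allow", "Action": "logs:PutLogEvents",
         "Resource": "arn:aws:logs:us-east-1:123456789012:*"},   # narrowly scoped
    ]
}

flagged = find_overly_permissive(waf_role_policy)
```

Run against every role on a schedule, a scan like this turns “did anyone check the access rules?” from an annual audit question into a routine alert.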
However, as an industry, we are stuck on the crummy concept of auditing security controls once a year. That worked for the accounting industry 30 years ago — which is where our annual security audit format came from — but doesn’t work for cybersecurity, and never has.
If you aren’t continuously testing key controls and measuring key performance indicators throughout your security program, you should strongly consider doing so.
The beauty here is that continuous testing isn’t a monumental effort, or even a significant change in the way security programs are operated. Somebody at your organization is already, hopefully, operating security controls throughout the year. If you remind them to capture some evidence of control effectiveness while they are operating the control, you can monitor the controls without burdening IT people. If a control isn’t operating effectively, such as regular access reviews (e.g. looking for overly permissive rule sets), you’ll know within days or weeks, when something can be done about it quickly.
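The evidence-capture idea above can be as simple as a register of controls and dates. This sketch (control names and frequencies are hypothetical) flags any control whose newest evidence is older than its testing frequency, so a lapsed access review surfaces in days rather than at next year’s audit:

```python
from datetime import date, timedelta

def overdue_controls(controls, today):
    """Return names of controls whose newest evidence is older than the
    control's testing frequency."""
    return [
        name for name, info in controls.items()
        if today - info["last_evidence"] > timedelta(days=info["frequency_days"])
    ]

# Hypothetical control register with the date evidence was last captured:
register = {
    "quarterly access review": {"last_evidence": date(2019, 1, 15),
                                "frequency_days": 90},
    "monthly log review":      {"last_evidence": date(2019, 7, 20),
                                "frequency_days": 30},
}

stale = overdue_controls(register, date(2019, 8, 1))
```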
Regardless of where an application resides, whether it be AWS, Azure, DigitalOcean, or a server room in the basement, basic application security principles must be followed.
Ensure any code you create has security requirements included up front. If you are using an agile development approach, this can easily be done with security-relevant user stories in pre-production sprints. Including security requirements early in the process helps encourage development teams to learn and pay attention to security during code writing, rather than sifting through lines of code after the fact.
Once code is created, application code reviews need to be performed regularly as well. It is very likely a code review would catch SSRF vulnerabilities.
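What a reviewer should look for is any outbound-fetch feature without a destination check. As a hedged sketch (the host names are illustrative, and production code would also resolve hostnames and re-check the resulting IPs to prevent DNS rebinding), a basic guard looks like this:

```python
from urllib.parse import urlparse
import ipaddress

def is_safe_destination(url, allowed_hosts):
    """Allow only explicitly approved hosts, and reject private, loopback,
    or link-local literal IPs such as the AWS metadata service."""
    host = urlparse(url).hostname
    if host is None or host not in allowed_hosts:
        return False
    try:
        ip = ipaddress.ip_address(host)
        if ip.is_private or ip.is_link_local or ip.is_loopback:
            return False
    except ValueError:
        pass  # a hostname, not a literal IP; the allowlist already vetted it
    return True
```

In a review, any server-side fetch that skips a check like this, or builds the URL from unvalidated user input, should be treated as a finding.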
Red Team activities like penetration testing and web application security testing should be performed to identify issues like SSRF. The fact that Capital One’s security testing did not find the issue is somewhat concerning, but not entirely surprising.
We see a lot of penetration tests that aren’t really legitimate pentests. It’s tough to say for certain this happened at Capital One prior to the incident, but the testing scope should include SSRF as well as cross-site scripting, user input validation, and other common attack methods.
Once code is in place and tested, you’ll want to monitor data access (in AWS, this means access to S3 buckets) and other important activities such as API calls, using a service like AWS CloudTrail.
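Here is a minimal sketch of that kind of monitoring. The field names follow CloudTrail’s JSON event schema, but the bucket and role names are hypothetical; the idea is to flag object reads on sensitive buckets by roles that should never read data, such as a WAF server’s role:

```python
def suspicious_s3_reads(events, data_buckets, forbidden_roles):
    """Flag S3 GetObject events on sensitive buckets by forbidden roles."""
    hits = []
    for event in events:
        if event.get("eventSource") != "s3.amazonaws.com":
            continue
        if event.get("eventName") != "GetObject":
            continue
        bucket = event.get("requestParameters", {}).get("bucketName")
        actor = event.get("userIdentity", {}).get("arn", "")
        if bucket in data_buckets and any(r in actor for r in forbidden_roles):
            hits.append(event)
    return hits

# Hypothetical CloudTrail-style records:
sample_events = [
    {"eventSource": "s3.amazonaws.com", "eventName": "GetObject",
     "requestParameters": {"bucketName": "customer-data"},
     "userIdentity": {"arn": "arn:aws:sts::123456789012:assumed-role/waf-server/i-0abc"}},
    {"eventSource": "s3.amazonaws.com", "eventName": "GetObject",
     "requestParameters": {"bucketName": "public-assets"},
     "userIdentity": {"arn": "arn:aws:sts::123456789012:assumed-role/app-server/i-0def"}},
]

alerts = suspicious_s3_reads(sample_events, {"customer-data"}, {"waf-server"})
```

An alert like this firing on a firewall server reading a data bucket is exactly the kind of signal that could have shortened the window between intrusion and detection.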
In Capital One’s case, the incident occurred in their AWS account. What we see more often across our financial institution clients is the institution’s vendors hosting their applications in AWS.
In a broad sense, the Capital One breach highlights a major concern of ours. The concern is not that vendors are moving to AWS or AWS itself. The concern is that vendors frequently try to pass off the AWS security audit report (commonly in the form of a SOC 1 or 2) as a complete security program. This recent incident highlights how the AWS infrastructure is only a small slice of what should be reviewed during vendor due diligence.
To fully understand a vendor’s security posture and assess the risk introduced by the vendor’s solution, you can’t just look at the AWS SOC report. You also need to understand the vendor’s own security program: how the application was built and tested, how access is controlled, and how activity is monitored on top of the AWS infrastructure.
I hope this information is helpful! To give credit where it is due, my research began, as it often does, at KrebsOnSecurity. My co-author on this article is the smartest AWS security person I know of: Steven Lattin.
About the Author
Randy Lindberg has 18+ years of experience in information security. He is the founder and CEO of Rivial Data Security and partners with Quantivate to offer IT risk management and cybersecurity services.