Security in DevOps (DevSecOps)
Security is a key part of DevOps. But how does a team know it's secure? Is it ever really possible to deliver a completely secure service?
Unfortunately, the answer is no. DevSecOps is a continuous and ongoing effort that requires the attention of everyone on the team. While the job is never truly done, the practices teams employ to prevent and handle breaches produce systems that are as secure and resilient as possible.
"Fundamentally, if somebody wants to get in, they're getting in...accept that. What we tell clients is: number one, you're in the fight, whether you thought you were or not. Number two, you almost certainly are penetrated." -- Michael Hayden, Former Director of NSA and CIA
The security conversation
Teams that don't have a formal DevSecOps strategy are encouraged to begin the planning as soon as possible. At first there may be some resistance from team members who don't fully appreciate the threats that exist. Others may not feel that the team is equipped to face the problem and that any special investment would be a wasteful distraction from shipping features. However, it's necessary to begin the conversation to build consensus as to the nature of the risks, how the team can mitigate them, and whether the team needs resources they don't currently have.
Expect skeptics to bring some common arguments, such as:
- How real is the threat? Teams often don't appreciate the potential value of the services and data they're charged with protecting.
- Our team is good, right? A security discussion may be perceived as doubt in the team's ability to build a secure system.
- I don't think that's possible. This is a common argument from junior engineers. Those with experience usually know better.
- We've never been breached. But how do you know? How would you know?
- Endless debates about value. DevSecOps is a serious commitment that may be perceived as a distraction from core feature work. While the security investment should be balanced with other needs, it can't be ignored.
The mindset shift
The shift to a DevSecOps culture means thinking not only about preventing breaches, but also about assuming they will happen.
Security strategy components
There are many techniques that can be applied in the quest for more secure systems.
|Preventing breaches||Assuming breaches|
|Threat models||War game exercises|
|Code reviews||Central security monitors|
|Security testing||Live site penetration tests|
|Security development lifecycle (SDL)|
Every team should already have at least some practices in place for preventing breaches. Writing secure code has become more of a default, and there are many free and commercial tools to aid in static analysis and other security testing features.
However, many teams lack a strategy around dealing with a world in which they assume they will be breached at some point. This can be a hard thing to admit, especially when having difficult conversations with management. The most important thing to focus on is that practicing techniques that assume breaches helps the team answer questions about their security on their own time, so they don't have to figure it all out during a real security emergency.
Common questions the team needs to think through:
- How will we detect an attack?
- How will we respond if there is an attack or penetration?
- How will we recover from an attack, such as when data has been leaked or tampered with?
Key DevSecOps practices
There are several common DevSecOps practices that apply to virtually any team.
First, teams should focus on improving their mean time to detection and mean time to recovery. These are metrics that indicate how long it takes to detect a breach and how long it takes to recover, respectively. They can be tracked through ongoing live site testing of security response plans. When evaluating potential policies, improving these metrics should be an important consideration.
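As a rough illustration of how these two metrics could be tracked, the sketch below averages detection and recovery times over a list of incidents, whether real or from exercises. The `Incident` record and its field names are hypothetical, and measuring recovery from the moment of detection is an assumption; some teams measure it from the start of the breach instead.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Incident:
    """One breach, real or simulated. Field names are hypothetical."""
    started: datetime    # when the attacker first got in
    detected: datetime   # when the team noticed
    recovered: datetime  # when the service was back in a trusted state


def mean_duration(durations: list[timedelta]) -> timedelta:
    """Average a list of durations; raises on an empty list."""
    if not durations:
        raise ValueError("need at least one incident")
    return sum(durations, timedelta()) / len(durations)


def mean_time_to_detect(incidents: list[Incident]) -> timedelta:
    return mean_duration([i.detected - i.started for i in incidents])


def mean_time_to_recover(incidents: list[Incident]) -> timedelta:
    # Assumption: recovery time is measured from detection, not from the start.
    return mean_duration([i.recovered - i.detected for i in incidents])


# Made-up sample data for illustration only.
incidents = [
    Incident(datetime(2024, 1, 1, 9), datetime(2024, 1, 1, 13), datetime(2024, 1, 1, 19)),
    Incident(datetime(2024, 2, 1, 9), datetime(2024, 2, 1, 11), datetime(2024, 2, 1, 13)),
]
print("MTTD:", mean_time_to_detect(incidents))
print("MTTR:", mean_time_to_recover(incidents))
```

Recomputing these values after each exercise gives the team a simple trend line to compare against when evaluating a proposed policy.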
Teams should also practice defense in depth. When a breach happens, it often results in the attacker getting access to internal networks and everything they have to offer. While it would be ideal to stop them before it gets that far, a policy of assuming breaches would drive teams to minimize their exposure from an attacker who has already gotten in.
Finally, teams should perform periodic post-breach assessments of the practices and environments. After a breach has been resolved, the team should evaluate the performance of the policies, as well as their own adherence to them. This serves to not only ensure the policies are effective, but also that the team is actually following them. Every breach, whether real or practiced, should be seen as an opportunity to improve.
Strategies for mitigating threats
The list of potential threats to a system is so substantial that it's not possible to enumerate everything. Some security holes are due to issues in dependencies like operating systems and libraries, so keeping them up-to-date is critical. Others are due to bugs in system code that require careful analysis to find and fix. Poor secret management is the cause of many breaches, as is social engineering. It's a good practice to think about the different kinds of security holes and what they mean to the system.
Consider a scenario where an attacker has gained access to a developer's credentials. What can they do?
|Can they send emails?||Phish colleagues|
|Can they access other machines?||Log on, mimikatz, repeat|
|Can they modify source?||Inject code|
|Can they modify the build/release process?||Inject code, run scripts|
|Can they access a test environment?||If a production environment takes a dependency on the test environment, exploit it|
|Can they access the production environment?||So many options...|
How can the blue team defend against this?
- Store secrets in protected vaults
- Remove local admin accounts
- Restrict SAMR
- Credential Guard
- Remove dual-homed servers
- Separate subscriptions
- Multi-factor authentication
- Privileged access workstations
- Detect with ATP & Azure Security Center
All secrets must be stored in a protected vault. Secrets include:
- Passwords, keys, and tokens
- Storage account keys
- Credentials used in shared non-production environments, too
Use a hierarchy of vaults to eliminate the duplication of secrets. Also consider how and when secrets are accessed. Some are used at deploy-time when building environment configurations, whereas others are accessed at run-time. Deploy-time secrets typically require a new deployment in order to pick up new settings, whereas run-time secrets are accessed when needed and can be updated at any time.
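The difference between deploy-time and run-time secrets can be sketched in a few lines. Everything here is hypothetical: `InMemoryVault` is a stand-in for a real vault client, and the class and secret names are invented for illustration. It is not the API of any particular vault product.

```python
class InMemoryVault:
    """Hypothetical stand-in for a real secret vault client."""

    def __init__(self, secrets: dict[str, str]) -> None:
        self._secrets = dict(secrets)

    def get(self, name: str) -> str:
        return self._secrets[name]

    def rotate(self, name: str, value: str) -> None:
        self._secrets[name] = value


class DeployTimeConfig:
    """Reads the secret once, when the environment configuration is built."""

    def __init__(self, vault: InMemoryVault, name: str) -> None:
        self.value = vault.get(name)  # frozen until the next deployment


class RunTimeConfig:
    """Asks the vault every time the secret is needed."""

    def __init__(self, vault: InMemoryVault, name: str) -> None:
        self._vault = vault
        self._name = name

    @property
    def value(self) -> str:
        return self._vault.get(self._name)


vault = InMemoryVault({"db-password": "old"})
deployed = DeployTimeConfig(vault, "db-password")
live = RunTimeConfig(vault, "db-password")

vault.rotate("db-password", "new")
print(deployed.value)  # still the value captured at deploy time
print(live.value)      # the rotated value
```

After a rotation, the deploy-time configuration keeps the old value until a new deployment rebuilds it, while the run-time configuration sees the change on its next read. That difference matters when planning how quickly a leaked secret can be replaced.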
- Azure Security Center is great for generic infrastructure alerts, such as for malware, suspicious processes, etc.
- Source code analysis tools for static application security testing (SAST).
- GitHub Advanced Security for analysis and monitoring of repos.
- mimikatz extracts passwords, keys, pin codes, tickets, and more from the memory of lsass.exe, the Local Security Authority Subsystem Service on Windows. It only requires administrative access to the machine, or an account with the debug privilege enabled.
- BloodHound builds a graph of the relationships within an Active Directory environment. The red team can use it to find attack vectors that would otherwise be hard to spot quickly.
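To make the graph idea concrete, here is a minimal sketch of how a chain of individually small access grants can add up to a path into production, starting from the stolen developer credentials in the scenario above. The node names and edges are invented for illustration, and the code is a plain breadth-first search. It does not use BloodHound's data model or API.

```python
from collections import deque

# Hypothetical directed edges: whoever controls the key can reach each value.
ACCESS = {
    "dev-credentials": ["dev-workstation", "source-repo"],
    "dev-workstation": ["build-agent"],
    "source-repo": ["build-agent"],
    "build-agent": ["test-environment", "release-pipeline"],
    "release-pipeline": ["production"],
    "test-environment": [],
    "production": [],
}


def shortest_attack_path(graph, start, target):
    """Breadth-first search; returns the shortest list of hops, or None."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == target:
            return path
        for nxt in graph.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None


print(shortest_attack_path(ACCESS, "dev-credentials", "production"))
```

Removing a single edge, for example the build agent's access to the release pipeline, and rerunning the search is a quick way for the blue team to check whether a proposed mitigation actually cuts the path.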
War game exercises
A common practice at Microsoft is to engage in war game exercises. These are security testing events where two teams are tasked with testing the security and policies of a system.
The red team takes on the role of an attacker. They attempt to model real-world attacks in order to find gaps in the security strategy. If they can exploit any, they also demonstrate the potential impact of their breaches.
The blue team takes on the role of the DevOps team. They test their ability to detect and respond to the red team's attacks. This helps to enhance situational awareness and measure the readiness & impact of the DevSecOps strategy.
Evolving a war games strategy
One of the reasons war games are so effective in hardening security is that they motivate the red team to find and exploit issues. It'll probably be a lot easier than expected early on. Teams that haven't actively tried to attack their own systems are generally unaware of the size and quantity of security holes available to attackers. This may be demoralizing to the blue team at first since they'll get run over repeatedly. Fortunately, the system and practices should evolve over time such that the blue team consistently wins.
Preparing for war games
Before starting war games, the team should take care of any issues they can find through a security pass. This is a great exercise to perform before attempting an attack because it will provide a baseline experience for everyone to compare with after the first exploit is found later on. It's good to start off by identifying vulnerabilities through a manual code review and static analysis tools.
Red and blue teams should be organized by specialty. The goal is to build the most capable teams for each side in order to execute as effectively as possible.
The red team should include some security-minded engineers and developers deeply familiar with the code. It's also helpful to augment the team with a penetration testing specialist, if possible. If there are no specialists in-house, many companies provide this service along with mentoring.
The blue team should be made up of ops-minded engineers who have a deep understanding of the systems and logging available. They have the best chance of detecting and addressing suspicious behavior.
Running early war games
Expect the red team to be effective in the early war games. They should be able to succeed through fairly simple attacks, such as by finding poorly protected secrets, SQL injection, and successful phishing campaigns. Take plenty of time between rounds to apply fixes and feedback on policies. This will vary by organization, but you don't want to start the next round until everyone is confident that the previous round has been mined for all it's worth.
Ongoing war games
After a few rounds, the red team will need to rely on more sophisticated techniques, such as cross-site scripting (XSS), deserialization exploits, and engineering system vulnerabilities. It will also help to bring in outside security experts in areas like Active Directory in order to pursue more obscure exploits. By this time, the blue team should not only have a hardened platform to defend, but should also be using comprehensive, centralized logging for post-breach forensics.
"Defenders think in lists. Attackers think in graphs. As long as this is true, attackers win." -- John Lambert (MSTIC)
Over time, the red team will take much longer to reach objectives. When they do, it will often require discovering and chaining multiple vulnerabilities just to have a limited impact. With real-time monitoring tools in place, the blue team should start to catch them as the attacks happen.
War games shouldn't be a free-for-all. It's important to recognize that the goal is to produce a more effective system run by a more effective team.
Code of conduct
Here is a sample code of conduct used by Microsoft:
- Both the red and blue teams will do no harm. If the potential to cause damage is significant, it should be documented and addressed.
- The red team should not compromise more than needed to capture target assets.
- Common sense rules apply to physical attacks. While the red team is encouraged to be creative with non-technical attacks, such as social engineering, they shouldn't print fake badges, harass people, etc.
- If a social engineering attack is successful, don't disclose the name of the person who was compromised. The lesson can be shared without alienating or embarrassing the team member everyone needs to continue to work with.
Rules of engagement
Here are sample rules of engagement used by Microsoft:
- Do not impact availability of any system.
- Do not access external customer data.
- Do not significantly weaken in-place security protections on any service.
- Do not intentionally perform destructive actions against any resources.
- Safeguard credentials, vulnerabilities, and other critical information obtained.
Any security risks or lessons learned should be documented in a backlog of repair items. Teams should define a service level agreement (SLA) for how quickly security risks will be addressed. Severe risks should be addressed as soon as possible, whereas minor issues may have a two-sprint deadline.
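One way to keep such an SLA visible is to compute a due date for each repair item from its severity. The sketch below is only an illustration: the 14-day sprint length, the one-day window for severe items, and the `RepairItem` fields are all assumptions. Only the two-sprint deadline for minor issues comes from the text above.

```python
from dataclasses import dataclass
from datetime import date, timedelta

SPRINT = timedelta(days=14)  # assumed sprint length

# Assumed SLA windows; only the two-sprint figure comes from the text.
SLA = {
    "severe": timedelta(days=1),
    "minor": 2 * SPRINT,
}


@dataclass
class RepairItem:
    title: str
    severity: str
    found: date

    @property
    def due(self) -> date:
        return self.found + SLA[self.severity]


def overdue(items: list[RepairItem], today: date) -> list[RepairItem]:
    """Items past their SLA date, oldest due date first."""
    return sorted((i for i in items if i.due < today), key=lambda i: i.due)


# Made-up backlog entries for illustration only.
backlog = [
    RepairItem("Secret committed to repo", "severe", date(2024, 3, 1)),
    RepairItem("Verbose error page", "minor", date(2024, 3, 1)),
]
for item in overdue(backlog, today=date(2024, 3, 10)):
    print(item.title, "was due", item.due)
```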
A report should be presented to the entire organization with lessons learned and vulnerabilities found. It's a learning opportunity for everyone, so make the most of it.
Lessons learned at Microsoft
Microsoft regularly practices war games and has learned a lot of lessons along the way.
- War games are a really effective way to change DevSecOps culture and keep security top-of-mind.
- Phishing attacks are very effective for attackers and should not be underestimated. The impact can be contained by limiting production access and requiring two-factor authentication.
- Control of the engineering system leads to control of everything. Be sure to strictly control access to the build/release agent, queue, pool, and definition.
- Practice defense in depth to make it harder for attackers. Every boundary they have to breach slows them down and offers another opportunity to catch them.
- Don't ever cross trust realms. Production should never trust anything in test.