The blast radius metaphor is useful for communicating the gravity and potential impact of decisions to business and management stakeholders. Put simply, blast radius is the worst-case scenario for something going wrong: if this blew up, what would be impacted?
Blast radius considerations should be part of your risk management strategy, so that you can quickly highlight the most critical components in your cloud environment. Keeping this in mind can make it easier to meet compliance requirements efficiently, with less wasted effort and extra work. In this piece we will explore the concept of blast radius as it applies to hyper-scale cloud environments, such as AWS.
What is Blast Radius?
Take the example of a developer and an administrator in an environment. A developer will need to make changes to their development environment, and likely runs the risk of breaking it if something goes wrong. An administrator is responsible for the entire organization, and has access to more resources than just the developer’s environment, so the potential impact of their changes is much greater if something were to go wrong. Confusingly, sometimes the developer and the administrator are the same person, just operating in a different role.
The term blast radius doesn’t distinguish between good or bad events. It applies not only to malicious actions and actors, such as those using compromised keys or other hacks, but also to misconfigurations and mistakes that well-meaning (and legitimate) users of the system might make. In the spirit of the timeless motto “hope for the best, plan for the worst,” it’s a useful way to explore and prepare for challenging scenarios. It helps you and your team identify the things you should care about, and what the impacts of your actions (or lack thereof) might be in an undesirable scenario.
Cloud vs On-Premises
With more businesses deploying and running business-critical systems in the cloud, protecting and operating those systems at scale means securing them against the most likely failure scenarios. When it comes to your cloud-based systems, nothing is going to literally explode, and if it does, it’s the vendor’s problem, not yours! Since cloud providers like AWS take care of the physical dangers, you’re left to manage the software-based dangers to your business environment.
In on-premises environments, as long as you secured physical access to the data center, you limited how much damage could be done. Barring extreme natural disasters or catastrophic hardware failure, no incident could make your servers or disks disappear.
While cloud environments don’t have many of the physical and capacity limitations of on-premises environments, your software-defined resources can be destroyed, making recovery of business systems hard, if not impossible. This makes understanding the full blast radius of your changes a requirement, and not just a “nice to have.”
Evolution of the AWS Cloud Boundary
In AWS, the first and most common blast radius is the account boundary. AWS accounts are like containers for your cloud resources and data. Because of this, businesses used multiple accounts to ensure that any mistake or compromise couldn’t impact the entire organization. This approach was streamlined with the release of AWS Organizations, because it made it easier to manage and automate multiple accounts.
Identity Is the Boundary
Even in a multi-account environment, the default blast radius is still the AWS account, but now centralized identity providers enable access to multiple accounts. This means that the effective blast radius for a compromised identity is now larger than just a single account.
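One way to make this concrete is to treat the effective blast radius of an identity as the set of accounts it can reach. The sketch below is illustrative only: the identities, account names, role names, and the `ASSIGNMENTS` list are hypothetical, and in a real environment this data would come from your identity provider rather than a hard-coded list.

```python
from collections import defaultdict

# Hypothetical assignments of (identity, account, role). These values are
# made up for illustration and do not come from any AWS API.
ASSIGNMENTS = [
    ("alice", "dev", "Developer"),
    ("alice", "staging", "ReadOnly"),
    ("bob", "dev", "Developer"),
    ("carol", "dev", "Administrator"),
    ("carol", "staging", "Administrator"),
    ("carol", "prod", "Administrator"),
]


def accounts_reachable(assignments):
    """Map each identity to the set of accounts it can access."""
    reach = defaultdict(set)
    for identity, account, _role in assignments:
        reach[identity].add(account)
    return dict(reach)


def rank_by_blast_radius(assignments):
    """Return identities sorted by number of reachable accounts, largest first."""
    reach = accounts_reachable(assignments)
    # Ties are broken alphabetically so the ordering is deterministic.
    return sorted(reach, key=lambda identity: (-len(reach[identity]), identity))


if __name__ == "__main__":
    reach = accounts_reachable(ASSIGNMENTS)
    for identity in rank_by_blast_radius(ASSIGNMENTS):
        print(identity, sorted(reach[identity]))
```

Counting accounts is a deliberately crude measure. A fuller model would also weigh the role held in each account, since an administrator role in one account may matter more than read-only access to several.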
This doesn’t mean you shouldn’t use an identity provider! A centralized source of truth for all identities that can access your environment is still more secure than the old, account-based IAM users that were best practice years ago. It does mean that you need to change how you think about blast radius, and focus on the identities in your environment.
Compliance and Risk
As with most things security-related, your environment is only as strong as your weakest link. This is one of the reasons why compliance frameworks are usually interested in the same kind of considerations that go into managing blast radius. It’s all about risk management. A poorly managed blast radius can lead to negative outcomes for your business, such as:
- Increased vulnerability to attacks
- Potential for widespread data breaches
- Compliance and regulatory risks
This is especially important in the context of potentially expensive cloud services, where abuse can hurt both your organization’s reputation and its wallet. For example, just as crypto mining attacks using stolen compute were popular in recent years, GenAI services are increasingly being targeted by attackers. Removing access to these services for identities that don’t need them mitigates a whole swathe of attack vectors.
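As a rough illustration of removing access to a service, the snippet below builds a deny statement in the standard IAM JSON policy format. It assumes Amazon Bedrock (`bedrock:*`) is the GenAI service in question. The function name `deny_service_policy` and the `Sid` value are hypothetical, and how you attach the resulting document (for example as a service control policy) depends on your environment.

```python
import json


def deny_service_policy(service_prefixes):
    """Build an IAM-style policy document denying all actions of the given services."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyUnneededServices",  # hypothetical statement ID
                "Effect": "Deny",
                # One wildcard action per service prefix, e.g. "bedrock:*".
                "Action": [f"{prefix}:*" for prefix in service_prefixes],
                "Resource": "*",
            }
        ],
    }


# Assumption: Bedrock stands in for "GenAI services" here.
policy = deny_service_policy(["bedrock"])
print(json.dumps(policy, indent=2))
```

A blanket deny like this only suits identities or accounts with no business case for the service. Where some identities do need access, the statement would need conditions or a narrower attachment point.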
Overprivileged identities that have too much access to data are an easy example of increased blast radius. If there’s no business case for an identity having access to data or a service, then those permissions should be removed.
Being aware of and managing your blast radius makes it easier to meet compliance requirements, because it limits the scope that compliance controls have to apply to. Once you’ve identified your blast radius in a particular context, you know what the maximum scope of any incident is. In the context of your cloud identities, if an identity needs to access PII or credit card information, it should be subject to additional controls and auditing.
Best Practices for Limiting Blast Radius
Unfortunately there’s no magic solution for limiting blast radius. Change is required to grow and succeed as a business, and all change has some degree of risk associated with it.
Some of the practices you can use to proactively manage and limit your blast radius are:
- Continuous monitoring and alerting
- Regular access reviews
- Training and awareness
- Just-in-time (JIT) access provisioning
Monitoring and alerting is a great first step towards security, because it makes interesting and concerning events in your environment visible.
Once you’ve got a record of the actions taking place in your environment, you can start the process of reviewing them, especially in the case of identity access permissions. This process is a key part of following the principle of least privilege, and can help you reach outcomes like zero standing privileges (ZSP).
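A basic access review can be sketched as a comparison between what each identity has been granted and what your activity records show it actually used. The example below is a minimal sketch under stated assumptions: the `GRANTED` and `USED` data are hypothetical, and in practice they would be derived from your policies and audit logs.

```python
# Hypothetical review inputs: permissions granted to each identity, and
# permissions observed in use during the review window.
GRANTED = {
    "alice": ["s3:GetObject", "s3:PutObject", "ec2:TerminateInstances"],
    "bob": ["s3:GetObject"],
}
USED = {
    "alice": ["s3:GetObject"],
    "bob": ["s3:GetObject"],
}


def unused_permissions(granted, used):
    """Return permissions that were granted but never observed in use."""
    return sorted(set(granted) - set(used))


def review(granted_by_identity, used_by_identity):
    """Return a mapping of identity to permissions that are candidates for removal."""
    findings = {}
    for identity, granted in granted_by_identity.items():
        unused = unused_permissions(granted, used_by_identity.get(identity, []))
        if unused:
            findings[identity] = unused
    return findings


if __name__ == "__main__":
    for identity, perms in review(GRANTED, USED).items():
        print(f"{identity}: consider removing {', '.join(perms)}")
```

Unused permissions are candidates for removal rather than automatic deletions. Some access is used rarely but legitimately, so the review window and a human check both matter.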
Investing in the human capital of your organization, through training, awareness, and understanding of concepts like blast radius, makes it easier for risks to be managed as an organization, rather than in bits and pieces.
For services like access management, access can be limited not only by scope, but also by time. JIT access provisioning gives you a limited blast radius when it comes to the time that access is available to an identity.
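The time dimension can be illustrated with a grant that carries its own expiry. This is a simplified sketch and not how any particular product implements JIT access: the `Grant` class, the `request_access` function, and the role name are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class Grant:
    """A time-bound access grant (hypothetical model for illustration)."""
    identity: str
    role: str
    expires_at: datetime

    def is_active(self, now):
        """A grant is active only strictly before its expiry time."""
        return now < self.expires_at


def request_access(identity, role, duration, now):
    """Issue a grant that expires `duration` after `now`."""
    return Grant(identity=identity, role=role, expires_at=now + duration)


if __name__ == "__main__":
    start = datetime.now(timezone.utc)
    grant = request_access("alice", "prod-admin", timedelta(hours=1), start)
    print(grant.is_active(start + timedelta(minutes=30)))
    print(grant.is_active(start + timedelta(hours=2)))
```

Passing `now` in explicitly, instead of reading the clock inside `is_active`, keeps the expiry logic easy to test. With a one-hour grant, a compromised identity holds elevated access for at most an hour rather than indefinitely.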
What's Your Blast Radius?
If you’re responsible for systems and applications in the cloud, you should be able to answer the question “what’s your blast radius?” quickly and easily. Not only that, you should be happy with the answer, because that’s what you’ve signed up to protect and operate. If you’re not, you should prioritize fixing it now, before it’s too late, because that’s how you prepare for the worst, while still hoping for the best.
Worried about your access to AWS and GCP? Common Fate can help with that by minimizing your access management blast radius with JIT provisioning. Book a demo to find out how, or take a free trial today.