NEWS & RESOURCES /
BLOG

Cloud Disaster Recovery vs. Ransomware: The Cyber-Resilience Gap

Why traditional cloud disaster recovery falls short against ransomware, and what a complete cyber-resilience posture requires.

Iftah Goldschmidt
Security Researcher
Cloud Disaster Recovery vs. Ransomware: The Cyber-Resilience Gap

TL;DR

  • The threat model has changed. Cloud DR was built to recover from passive failure. Ransomware is an adversary that has been inside the environment for weeks, mapping your recovery so it fails when you need it.
  • The gap is measurable. In Veeam's 2026 Data Trust and Resilience Report, 90% of organizations said they could recover from a cyber incident. Fewer than 1 in 3 ransomware victims actually did.
  • The familiar cloud safety net was built for a different problem. Replication isn't a backup. IAM is global. The features that look like resilience to an architect look like an attack surface to an adversary.
  • The audit floor has shifted. NIST CSF 2.0, ISO 27001 A.5.30, and DORA Article 11 all expect tested cyber recovery now. Cyber-resilient recovery and traditional DR are complementary, and most organizations have only built one.

Most cloud DR plans solve a problem from the data center era. They assume disruption is passive: hardware breaks, the network drops, occasionally a beaver chews through the fiber and uses it for a dam. Recovery is fast when the architecture was designed for it. That kind of disruption still happens. It isn’t what most likely takes a cloud environment down today.

The disruption you should actually be planning to recover from arrived three weeks ago. It has been mapping your backup infrastructure, your IAM topology, and your recovery procedures, and it’s waiting until you can’t recover before encrypting anything. That’s a different threat model. Most recovery programs are still built for the first one.

DR is built to recover from the wrong disaster

The vast majority of organizations have a formal disaster recovery or business continuity plan, and most include IT disaster recovery as a core component. The plans exist. They’re audited. They’re funded. The problem is what they were built to recover from.

Traditional DR was built for one kind of disruption: physical or environmental. Data center offline, regional power failure, hardware breakdown. Those disruptions still happen, in every environment. They're just no longer the most likely cause of downtime for a cloud or hybrid workload. Most DR programs are still optimized for them anyway.

Cloud infrastructure shifts the disruption surface, but it doesn’t remove it. AWS, Azure, and GCP give you the primitives (multi-AZ, cross-region replication, automated failover), but whether your workload actually gets resilience from them depends entirely on how it was architected. The May 2026 AWS US-EAST-1 thermal event illustrated the point at scale: organizations with multi-AZ deployments still lost hours of availability because multi-zone failure hadn't been designed for. Multi-AZ deployment isn't the same as multi-AZ resilience. In most environments those primitives are wired together inconsistently, configured to defaults that aren’t appropriate for the workload, or skipped for cost reasons. The most common cloud disruption today isn’t a failure of the cloud.

Cyber threats operate on different mechanics entirely. Modern ransomware doesn’t surface as a single dramatic event; the encryption is the last step. By the time the operator activates, they’ve usually been inside the environment for weeks, deliberately mapping backup infrastructure, IAM topology, and recovery procedures so that none of it works when you need it. Mandiant’s M-Trends 2026 report calls out this systemic shift, with ransomware operators in 2025 actively targeting backup infrastructure, identity services, and virtualization management planes. An event most teams imagined as “an outage” shows up looking nothing like one. Recovery doesn’t restart in 15 minutes. It moves through a forensics process, an insurance review, and a multi-week rebuild.

Unlike physical disasters, the cyber attack surface evolves continuously. New techniques emerge, vulnerabilities are discovered in software you didn’t write, third-party components introduce risk you don’t control. Cyber risk is dynamic. Preparedness has to account for threats that don’t yet exist. And it's not only ransomware. An autonomous AI agent acting outside its expected scope can produce the same outcome at a different speed; a single Terraform command from a misconfigured AI agent can wipe a production database and every snapshot, and the recovery posture decides whether that's an incident or an outage.

Preparedness has not kept pace with what’s actually happening. In Veeam’s 2026 Data Trust and Resilience Report, 90% of organizations expressed confidence in their ability to recover from a cyber incident; fewer than one in three ransomware victims actually did. NIST CSF 2.0’s Recover function, ISO 22301, and DORA Article 11 have all moved tested cyber recovery from best practice to board-level expectation. The audit floor has shifted underneath teams that haven’t updated their programs.

The gap: most cloud DR programs are still built to recover from passive, physical failure, not from the adversaries actually targeting cloud and hybrid workloads today.

The false sense of security: replication is not a backup

Multi-region replication, redundant storage, and high-availability configurations are effective for keeping workloads available through infrastructure failure. They were not designed to recover from an adversary, and they don’t. An adversary attacks the recovery infrastructure itself.

Replication propagates corruption. If ransomware encrypts data in your primary region, the encrypted data replicates to your secondary region. Both copies are now compromised. There are documented cases where organizations lost every recovery point this way: ransomware spread faster than anyone could respond, and what teams thought of as a “backup” turned out to be just another copy of the damage.

IAM is global and complex. In most cloud environments, identity and access management spans every region within an account, with permissions distributed across many roles and services. A compromised administrator credential doesn’t just affect one region. It can be used to access, modify, or delete resources everywhere, including backups and snapshots. Regional redundancy offers no protection when the attacker operates at the identity layer.

Availability vs. integrity. Traditional DR provides availability against accidents. It does not provide resilience against adversaries. That distinction matters, because organizations relying solely on replication and failover may discover, during an actual attack, that their safety net was never designed for this scenario.

The core issue is that traditional DR treats a cyberattack like any other outage. But an outage is passive. An attacker is active, and a skilled adversary will go after backups first, knowing that without them, the victim has no leverage.

Cyber-resilient recovery (and why it works)

Recovering from a cyber attack requires a different mindset than recovering from an outage. The premise has to be that an adversary is actively trying to undermine your recovery. Cyber-resilient recovery is designed with that scenario in mind. Five controls do most of the heavy lifting:

  • Immutable backups stored on write-once-read-many (WORM) storage or vault locks cannot be altered or deleted, even by an administrator. Even with privileged access, an attacker cannot purge or encrypt them. This is increasingly a hard requirement from cyber-insurance carriers.
  • Multi-factor and multi-person authorization for critical backup operations means a single compromised credential is not enough. Vault deletions or retention changes require an additional verification step.
  • Least-privilege access controls segment who can interact with backups. Application servers, for example, should not be able to delete backup snapshots. Smaller blast radius, smaller compromise.
  • Isolated and offsite backups create an “air gap”: copies in a separate account or even a different cloud provider that the primary environment cannot reach. If your primary is breached, the isolated backup remains untouched.
  • Backup monitoring and anomaly detection catch attacks in progress: mass deletion of backup, unexpected policy changes, or spikes in backup failures. Early detection gives administrators a chance to intervene before damage is irreversible.

These controls exist specifically because cyber incidents follow a different pattern than infrastructure failures. In a datacenter outage, you need a copy of your data and/or infrastructure somewhere else. In a ransomware or control-plane attack, you need a copy that an attacker cannot reach or modify. The reverse is also true: immutability, MFA, and strict access controls do not help during a flood or regional outage. These are complementary layers, not substitutes. A complete resilience strategy requires both.

Implementing them is not straightforward. Knowing which controls to apply, to which resources, and in what configuration requires continuous visibility across your environment. These protections also come at a cost and need to be optimized carefully: over-protecting low-priority resources wastes budget, while under-protecting critical ones creates risk. Environments change constantly. New workloads spin up, new AI systems are integrated, accounts merge or split, services come online and get deprecated. The optimal backup and DR strategy needs to evolve with them. Managing all of this manually, across accounts and providers, is where most organizations struggle.

Resiliency Posture Management: making the gap visible

Pulling these threads together, what most cloud teams need is not another backup tool. It’s a way to see their resilience posture across both traditional disasters and cyber threats at once. We’ve been calling this category Resiliency Posture Management, and it’s the discipline behind Gambit.

The core idea is straightforward: continuously evaluate every workload, every account, every region across two questions in parallel. Is this ready for an outage? Is this ready for an adversary, human or autonomous? Most environments score very differently on those two questions, and almost no organization has clear visibility into the gap between them.

In our research across cloud environments, the patterns we see most often are:

  • Critical Resources with no backup or replication configured at all
  • Backups that exist but are not immutable
  • Missing cross-region or cross-account copies
  • Retention policies that leave gaps in recovery points

Each finding maps to a resilience score covering both dimensions. An organization can have strong geo-redundancy (high traditional DR score) and weak resilience against insider or adversarial deletion (low cyber-resilience score), and most teams discover this asymmetry only when they look at both side by side. That visibility is what lets them prioritize the right fixes: enabling object lock on a specific S3 bucket, tightening IAM on a particular role, or configuring isolated backup copies for the workloads that actually warrant them.

The shift this enables is from recovery (assuming backups exist and hoping they’ll work when needed) to proactive resilience: a posture that’s verified, scored, and tracked the way any other security domain is. We think this is the next frontier for cloud security, and it’s why we built Gambit.

For teams working through these questions in their own cloud environments, more on how Gambit approaches digital resilience posture is at gambit.security.

FAQ

You ask? We answer

Is replication the same as a backup?

No. Replication keeps a copy of your data in another region or zone, which protects against infrastructure failure. But replication propagates corruption: if your primary data is encrypted by ransomware, the replicated copy is too. A true backup is independent: ideally immutable, isolated, and protected by separate access controls.

What’s the difference between a data-plane attack and a control-plane attack?

A data-plane attack targets the data itself, encrypting or corrupting files on servers. Standard backups recover from these. A control-plane attack targets the backup system, using compromised admin credentials to delete, encrypt, or otherwise destroy your backups. Cyber-resilient controls (immutability, MFA on backup operations, least-privilege, isolation) are what protect against the second.

Does cloud disaster recovery protect against ransomware?

Partially, and inconsistently. Cloud DR is built to recover from infrastructure failures: outages, hardware breaks, regional incidents. It is not, by default, built to recover from an adversary who can reach your backup or admin layer. Cyber resilience is the missing layer most organizations need to add.

What is an immutable backup?

A backup stored on write-once-read-many (WORM) storage or protected by a vault lock that prevents alteration or deletion for a defined retention period, even by a privileged administrator. Immutable backups are increasingly required by cyber-insurance carriers as a condition of ransomware coverage.

What is cyber resilience in cloud computing?

Cyber resilience is the ability to maintain or quickly restore operations during and after a cyber attack. It goes beyond traditional disaster recovery by assuming an active, targeted adversary, and adds controls (immutable backups, isolated copies, multi-person authorization, anomaly detection) designed specifically to keep recovery options available even when an attacker has privileged access.

How is cyber recovery different from traditional disaster recovery?

Traditional DR is built for accidental, passive failures: outages, hardware breaks, environmental damage. Cyber recovery is built for an active adversary trying to defeat your recovery options. The two are complementary; most modern recovery programs need both.

Other Blogs

blog
April 10, 2026

A Single Operator, Two AI Platforms, Nine Government Agencies: The Full Technical Report

Eyal Sela
Director of Threat Intelligence
Read More
blog
February 24, 2026

Prevention has lost its edge. Resilience is the winning play.

Curtis Simpson
Chief Strategy Officer
Read More