The company had backups. That was the sentence everyone repeated during the first hour of the ransomware incident. Then the restore started. The newest backup was encrypted because the backup share was reachable from the domain. The older backup restored, but nobody knew the database password. The file server came back, but the accounting application needed a license key stored in the mailbox that was also encrypted. The business did not have a backup problem. It had a recovery problem.
Ransomware recovery drills exist to find those failures before an attacker does. A backup is only useful if you can restore the right data, into a trusted environment, fast enough for the business to survive, with confidence that the restored data is clean.
Define The Recovery Target
Before testing tools, define what must come back first. For many small businesses, the priority is identity, email, accounting, customer records, file shares, website, and line-of-business applications. Each system needs a recovery time objective and a recovery point objective.
Example recovery targets:
Identity provider: RTO 4 hours, RPO 24 hours
Email: RTO 8 hours, RPO 24 hours
Accounting: RTO 8 hours, RPO 4 hours
File shares: RTO 24 hours, RPO 24 hours
Website: RTO 12 hours, RPO 24 hours
CRM: RTO 24 hours, RPO 24 hours
If the business cannot tolerate losing more than four hours of accounting data, the accounting system must be backed up at least every four hours. If the business cannot operate without email, you need an emergency communication plan that does not depend on the compromised tenant.
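One way to keep the schedule honest is to compare each system's backup interval against its RPO. The sketch below is illustrative only: the system keys, the RPO_HOURS table, and the rpo_violations helper are hypothetical names, and it assumes worst-case data loss is roughly equal to the backup interval.

```python
# RPO per system in hours, copied from the example targets (keys are hypothetical).
RPO_HOURS = {
    "identity": 24,
    "email": 24,
    "accounting": 4,
    "file_shares": 24,
    "website": 24,
    "crm": 24,
}


def rpo_violations(rpo_hours, backup_interval_hours):
    """Return the systems whose backup interval cannot meet their RPO.

    A system with no known backup interval is treated as a violation,
    because an unknown schedule cannot be shown to meet any target.
    """
    violations = []
    for system, rpo in rpo_hours.items():
        interval = backup_interval_hours.get(system)
        if interval is None or interval > rpo:
            violations.append(system)
    return sorted(violations)


if __name__ == "__main__":
    nightly = {system: 24 for system in RPO_HOURS}
    print(rpo_violations(RPO_HOURS, nightly))
```

With a nightly schedule for everything, only the accounting system is flagged, which matches the four-hour example above.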
Use The 3-2-1-1-0 Backup Pattern
The classic 3-2-1 rule still helps: three copies, two media types, one offsite. Modern ransomware adds two more requirements: one immutable or offline copy, and zero restore errors when tested. Immutability matters because attackers often target backups before triggering encryption.
3-2-1-1-0 backup pattern:
3 copies of important data
2 different storage types
1 copy offsite
1 copy immutable or offline
0 restore errors during testing
Immutability can be object lock in cloud storage, immutable backup repositories, snapshots protected by separate admin credentials, or offline media. The key is that a compromised domain admin should not be able to delete every recovery point.
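The pattern is simple enough to check mechanically against an inventory of copies. The following is a minimal sketch, not a vendor feature: BackupCopy and check_32110 are hypothetical names, and it assumes the inventory list includes the production copy as one of the three.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BackupCopy:
    name: str
    media: str                  # e.g. "disk", "object-storage", "tape"
    offsite: bool
    immutable_or_offline: bool


def check_32110(copies, restore_errors):
    """Return a rule -> pass/fail mapping for the 3-2-1-1-0 pattern."""
    return {
        "3 copies": len(copies) >= 3,
        "2 storage types": len({c.media for c in copies}) >= 2,
        "1 offsite": any(c.offsite for c in copies),
        "1 immutable or offline": any(c.immutable_or_offline for c in copies),
        "0 restore errors": restore_errors == 0,
    }


if __name__ == "__main__":
    inventory = [
        BackupCopy("production", "disk", offsite=False, immutable_or_offline=False),
        BackupCopy("local-repo", "disk", offsite=False, immutable_or_offline=False),
        BackupCopy("cloud-lock", "object-storage", offsite=True, immutable_or_offline=True),
    ]
    for rule, ok in check_32110(inventory, restore_errors=0).items():
        print(f"{rule}: {'pass' if ok else 'FAIL'}")
```

The restore_errors value has to come from an actual restore test; a check like this only confirms that the inventory has the right shape.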
Run A Clean-Room Restore
A real drill restores into an isolated environment, not back over production. Build a small clean-room network with no trust relationship to production. Restore one critical system, verify it starts, validate data integrity, and scan before reconnecting anything.
Clean-room restore flow:
1. Create isolated network
2. Restore identity or local admin access
3. Restore application server
4. Restore database or file data
5. Patch and scan restored systems
6. Validate application behavior
7. Record restore time and errors
8. Destroy test environment or preserve evidence as needed
For cloud SaaS, a clean-room restore may mean restoring a subset of data to a test tenant, exporting critical records, or validating vendor restore procedures. Do not assume the vendor can restore exactly what you need until you test it.
Measure What Actually Matters
The most important metric is not backup job success. Backup software can report success while the application restore fails. Measure restore time, data loss, missing dependencies, manual steps, and who had the credentials needed to complete the recovery.
Recovery drill scorecard:
System tested:
Backup selected:
Restore start time:
Restore complete time:
Data timestamp:
Data loss:
Credentials available: yes/no
Application validation: pass/fail
Malware scan: pass/fail
Missing documentation:
Owner for fixes:
Retest date:
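The time-based scorecard fields can be derived from timestamps rather than estimated by hand. This is a rough sketch under stated assumptions: DrillResult and its field names are hypothetical, incident_time is the simulated moment of the attack, and restore_time covers only the restore itself, so real downtime would also include detection and decision time.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class DrillResult:
    system: str
    restore_start: datetime
    restore_complete: datetime
    data_timestamp: datetime    # newest record found in the restored data
    incident_time: datetime     # simulated moment of the attack (assumption)
    rto: timedelta
    rpo: timedelta

    @property
    def restore_time(self) -> timedelta:
        return self.restore_complete - self.restore_start

    @property
    def data_loss(self) -> timedelta:
        return self.incident_time - self.data_timestamp

    def met_rto(self) -> bool:
        return self.restore_time <= self.rto

    def met_rpo(self) -> bool:
        return self.data_loss <= self.rpo


if __name__ == "__main__":
    result = DrillResult(
        system="accounting",
        restore_start=datetime(2024, 3, 1, 9, 0),
        restore_complete=datetime(2024, 3, 1, 15, 30),
        data_timestamp=datetime(2024, 3, 1, 2, 0),
        incident_time=datetime(2024, 3, 1, 8, 0),
        rto=timedelta(hours=8),
        rpo=timedelta(hours=4),
    )
    print(result.restore_time, result.met_rto())
    print(result.data_loss, result.met_rpo())
```

In this made-up drill the restore finishes inside the eight-hour RTO, but the restored data is six hours old against a four-hour RPO, which is the kind of gap a backup job success report would not reveal.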
Tabletop The Human Decisions
Technical restore tests are only half the drill. Run a tabletop exercise for decision making. Who can declare an incident? Who contacts cyber insurance? Who talks to customers? Who approves taking systems offline? Who calls the outside incident response firm? Who has authority if the CEO is unavailable?
Tabletop injects:
- EDR reports encryption on three servers
- Backup console login fails
- Finance asks whether payroll will run tomorrow
- A customer asks if their data was stolen
- The attacker posts a ransom note with a deadline
- The domain controller cannot be trusted
- The only backup admin is on vacation
Write down the answers. Ambiguity during a ransomware incident wastes the most expensive minutes you have.
Protect The Recovery System
Backup infrastructure should have separate admin accounts, MFA, restricted network access, logging, and alerts for deletion, retention changes, repository changes, and mass restore operations. If the backup console uses the same identity provider that may be compromised, define emergency access in advance.
Backup security alerts:
- Backup job disabled
- Retention policy reduced
- Repository deleted or detached
- Immutable setting changed
- New backup administrator added
- Large backup deletion
- Failed login burst against backup console
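If the backup platform can export audit events, the alert list above can be expressed as a small set of rules. The sketch below assumes a made-up event format: the event type names, the dictionary fields, and the thresholds are all hypothetical and would need mapping to whatever the real product logs. It also ignores time windows for brevity, so a real login-burst rule would count failures per interval.

```python
from collections import Counter

# Event types that always raise an alert (names are hypothetical).
ALWAYS_ALERT = {
    "job_disabled",
    "repository_deleted",
    "repository_detached",
    "immutability_changed",
    "admin_added",
}


def backup_alerts(events, login_burst=5, mass_delete=10):
    """Return alert names, in order, for a list of audit event dicts."""
    alerts = []
    failed_logins = Counter()
    for event in events:
        kind = event["type"]
        if kind in ALWAYS_ALERT:
            alerts.append(kind)
        elif kind == "retention_changed":
            # Only a reduction is suspicious; extending retention is fine.
            if event["new_days"] < event["old_days"]:
                alerts.append("retention_reduced")
        elif kind == "backups_deleted":
            if event["count"] >= mass_delete:
                alerts.append("large_deletion")
        elif kind == "login_failed":
            failed_logins[event["source"]] += 1
            # Alert once, when a source first reaches the threshold.
            if failed_logins[event["source"]] == login_burst:
                alerts.append("failed_login_burst")
    return alerts


if __name__ == "__main__":
    sample = [
        {"type": "retention_changed", "old_days": 30, "new_days": 7},
        {"type": "backups_deleted", "count": 3},
        {"type": "admin_added"},
    ]
    print(backup_alerts(sample))
```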
How Often To Drill
Run a small restore test monthly, a critical application restore quarterly, and a full ransomware tabletop at least twice a year. After major infrastructure changes, acquisitions, cloud migrations, or backup platform changes, run an extra drill.
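A cadence like this is easy to let slip, so it can help to track it in code or in a scheduled reminder. The sketch below is one possible approach: the drill names and the day counts used to approximate monthly, quarterly, and twice a year are assumptions.

```python
from datetime import date, timedelta

# Approximate intervals for the cadence described above (day counts are assumptions).
DRILL_INTERVALS = {
    "small_restore_test": timedelta(days=31),
    "critical_app_restore": timedelta(days=92),
    "ransomware_tabletop": timedelta(days=183),
}


def overdue_drills(last_run, today):
    """Return drills never run or last run longer ago than their interval."""
    overdue = []
    for drill, interval in DRILL_INTERVALS.items():
        last = last_run.get(drill)
        if last is None or today - last > interval:
            overdue.append(drill)
    return overdue


if __name__ == "__main__":
    history = {
        "small_restore_test": date(2024, 6, 15),
        "critical_app_restore": date(2024, 1, 10),
    }
    print(overdue_drills(history, today=date(2024, 7, 1)))
```

Extra drills after major infrastructure changes still need a human trigger; a date check only covers the routine schedule.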
The result should be a short recovery runbook with screenshots, credentials location, vendor contacts, restore order, validation commands, and known problems. The runbook is not finished until someone other than the author can use it successfully.
Further Reading
Useful official references include NIST SP 1800-11 on recovering from ransomware and other destructive events, NIST small business ransomware guidance, and NCCoE material on data integrity recovery.