If your Umbraco site drives leads, revenue or public services, downtime is more than an inconvenience. This playbook shows how to set clear recovery objectives, choose the right architecture, and run the monitoring, testing and SLAs that keep your platform online. It also explains how Growcreate puts recovery into practice for enterprise teams in the UK.
Start with the numbers that matter
Before diagrams or tools, agree two targets:
- Recovery Point Objective (RPO) – the maximum data loss you can tolerate, expressed as how far back in time you may have to restore to after an outage.
- Recovery Time Objective (RTO) – the time window to restore service before the disruption becomes unacceptable.
If you process personal data, build breach response into your plan. UK GDPR expects certain personal data breaches to be reported to the ICO within 72 hours of becoming aware, where feasible.
Typical objectives by site type
| Site type | Example RPO | Example RTO | Notes | 
|---|---|---|---|
| Corporate marketing site | 30 – 60 minutes | 1 – 4 hours | Prioritise database and media snapshots | 
| Public sector information portal | 15 – 30 minutes | 1 – 2 hours | Add zone redundancy and synthetic monitoring | 
| Finance or healthcare service | 0 – 15 minutes | < 1 hour | Active‑active or fast failover across regions | 
Use these as a starting point, then set impact‑driven targets in your business impact analysis (BIA) and test them.
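One way to make these targets testable is to record them as data and compare every incident or rehearsal against them. The sketch below is illustrative only: `RecoveryTargets` and `assess_incident` are hypothetical names, the timestamps are invented, and the targets are the upper bounds from the corporate marketing row above.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class RecoveryTargets:
    rpo: timedelta  # maximum tolerable data loss, measured as time
    rto: timedelta  # maximum tolerable time to restore service


def assess_incident(targets, last_good_restore_point, outage_start, service_restored):
    """Compare one incident (or rehearsal) against the agreed targets."""
    data_loss = outage_start - last_good_restore_point
    downtime = service_restored - outage_start
    return {
        "data_loss": data_loss,
        "downtime": downtime,
        "rpo_met": data_loss <= targets.rpo,
        "rto_met": downtime <= targets.rto,
    }


# Hypothetical corporate marketing site: RPO 60 minutes, RTO 4 hours.
targets = RecoveryTargets(rpo=timedelta(minutes=60), rto=timedelta(hours=4))
result = assess_incident(
    targets,
    last_good_restore_point=datetime(2025, 1, 10, 9, 40),
    outage_start=datetime(2025, 1, 10, 10, 0),
    service_restored=datetime(2025, 1, 10, 12, 30),
)
print(result["rpo_met"], result["rto_met"])  # True True
```

Here the restore point is 20 minutes old and service returns after 2.5 hours, so both targets are met. Recording rehearsals in the same shape gives you evidence you can show auditors.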
Choose an architecture that protects revenue and reputation
The right pattern depends on compliance, traffic and budget. Here are proven approaches for Umbraco on Azure and Umbraco Cloud.
Single region with availability zones
Deploy Umbraco on App Service with Azure SQL across availability zones in one region. With the May 2025 update, zone‑redundant App Service can achieve a 99.99% SLA on two instances.
Good for: Most UK organisations that need high availability without multi‑region complexity.
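It helps to translate an SLA percentage into a downtime budget the business can picture. The following is a minimal arithmetic sketch; `downtime_budget_minutes` is a hypothetical helper, the 30‑day month is an assumption, and the 99.9% figure is included only for comparison, not as a quoted Azure SLA.

```python
def downtime_budget_minutes(sla_percent: float, period_days: float = 30) -> float:
    """Minutes of downtime permitted by an availability percentage over a period."""
    total_minutes = period_days * 24 * 60
    return total_minutes * (1 - sla_percent / 100)


# 99.99% over an assumed 30-day month.
print(round(downtime_budget_minutes(99.99), 2))  # 4.32

# A comparison figure only (99.9%), to show how quickly the budget grows.
print(round(downtime_budget_minutes(99.9), 2))  # 43.2
```

A 99.99% target leaves a little over four minutes of downtime per month, which is a useful sanity check against the RTO you have agreed.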
Active‑passive multi‑region failover
Run production in Region A and maintain warm standby in Region B. Use Azure Traffic Manager priority routing for automatic DNS failover when health probes fail. Pair with Azure Front Door health probes for origin health and fast rerouting.
Good for: Regulated teams needing geographic resilience with controlled cost.
Active‑active multi‑region
Run mirrored stamps in two or more regions. Front your origins with Azure Front Door or Traffic Manager, and design for each region to absorb full load if the other fails. This model is described in the Azure Well‑Architected disaster recovery guide.
Good for: Mission‑critical platforms with strict RTO and near‑zero RPO.
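The requirement that each region can absorb full load if another fails can be checked with simple capacity arithmetic. This is a sketch under stated assumptions: `survives_single_region_loss` is a hypothetical helper, and the peak load and per‑region capacities (in requests per second) are invented figures you would replace with load test results.

```python
def survives_single_region_loss(peak_load: float, region_capacity: dict) -> dict:
    """For each region, report whether the remaining regions can carry peak load if it fails."""
    total = sum(region_capacity.values())
    return {
        region: (total - capacity) >= peak_load
        for region, capacity in region_capacity.items()
    }


# Invented figures: 800 requests/second at peak, two unevenly sized regions.
check = survives_single_region_loss(800, {"uksouth": 1000, "ukwest": 600})
print(check)  # {'uksouth': False, 'ukwest': True}
```

In this example, losing the larger region leaves only 600 requests per second of capacity against a peak of 800, so the design is active‑active in name only until the smaller region is scaled up.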
Umbraco Cloud
Umbraco Cloud provides built‑in backup and restore options: 35‑day point‑in‑time database restore, 30‑day filesystem snapshots and 35‑day blob storage snapshots for disaster recovery.
Good for: Teams that want managed Umbraco with predictable cost and solid DR defaults.
Backups that actually meet your RPO
Your backups and replicas are your RPO in practice. Align schedules to editing cadence and release activity.
- Azure SQL automatic backups: weekly full backups, differential backups every 12 or 24 hours and transaction log backups roughly every 10 minutes enable point‑in‑time restore within a configurable 1 – 35 day retention.
- Long‑term retention: extend SQL backup retention up to 10 years for compliance.
- Media safety: use Azure Blob snapshots or enable blob versioning for point‑in‑time recovery of media files.
- Umbraco Cloud: database PITR for 35 days plus filesystem and blob snapshots for DR as above.
If editors publish hourly, a daily database backup means you accept a day’s rework on restore. Tighten the schedule or document the risk and live with it.
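The trade‑off can be put into numbers. The sketch below treats the gap between restore points as the worst‑case data loss and estimates how many publish events could fall into that gap. `backup_gap_report` is a hypothetical helper, and the intervals are illustrative rather than recommended settings.

```python
import math
from datetime import timedelta


def backup_gap_report(backup_interval: timedelta, publish_interval: timedelta, rpo: timedelta) -> dict:
    """Estimate exposure when restore points are `backup_interval` apart."""
    # Approximate upper bound on publish events that could be lost in one gap.
    publishes_at_risk = math.ceil(backup_interval / publish_interval)
    return {
        "worst_case_loss": backup_interval,
        "publishes_at_risk": publishes_at_risk,
        "meets_rpo": backup_interval <= rpo,
    }


# Editors publish hourly, the RPO is one hour, backups run daily.
daily = backup_gap_report(timedelta(days=1), timedelta(hours=1), rpo=timedelta(hours=1))
print(daily["publishes_at_risk"], daily["meets_rpo"])  # 24 False

# Same site with an illustrative 15-minute restore-point interval.
frequent = backup_gap_report(timedelta(minutes=15), timedelta(hours=1), rpo=timedelta(hours=1))
print(frequent["publishes_at_risk"], frequent["meets_rpo"])  # 1 True
```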
Monitoring that spots trouble before users do
Monitoring should validate uptime targets, not just collect logs.
- Synthetic availability tests from multiple locations alert on failures and slow responses.
- Azure Service Health alerts notify your team about platform incidents and planned maintenance that may affect your regions or services.
- Azure Front Door and Traffic Manager health probe logs explain why an origin is marked unhealthy, which speeds diagnosis.
- Align your logging approach with recognised best practice so you can investigate quickly and meet audit needs.
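To illustrate what a synthetic availability test does, here is a minimal standard‑library sketch of a single probe that classifies a page as ok, slow or down. It is not how Application Insights or Front Door implement their checks. `probe`, `http_status` and the thresholds are assumptions for illustration, and the demonstration uses a stubbed fetcher so it runs without network access.

```python
import time
import urllib.request


def http_status(url: str, timeout: float) -> int:
    """Fetch the URL and return its HTTP status (urllib raises on network errors and 4xx/5xx)."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.status


def probe(url: str, fetch=http_status, timeout: float = 10.0, slow_after: float = 2.0) -> dict:
    """Run one availability check and classify the result."""
    started = time.monotonic()
    try:
        status = fetch(url, timeout)
    except Exception as exc:  # timeouts, DNS failures, HTTP errors
        return {"url": url, "state": "down", "detail": repr(exc)}
    elapsed = time.monotonic() - started
    if status != 200:
        return {"url": url, "state": "down", "detail": f"HTTP {status}"}
    state = "slow" if elapsed > slow_after else "ok"
    return {"url": url, "state": state, "detail": f"{elapsed:.3f}s"}


# Stubbed fetcher: always answers 200 immediately, so no request is sent.
demo = probe("https://example.org/", fetch=lambda url, timeout: 200)
print(demo["state"])  # ok
```

In real use you would call `probe(url)` with the default fetcher from several locations on a schedule and alert on repeated "down" or "slow" results rather than on a single failure.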
 
SLAs that mean something to the business
SLA language should link to RTO, RPO and comms, not just uptime.
- With zone‑redundant App Service on two instances, Microsoft states a 99.99% SLA for App Service availability.
- Traffic Manager provides automatic DNS failover based on endpoint health, supporting availability targets during regional incidents.
Clarify who declares a disaster, how you fail over, how you fail back and the communication windows for stakeholders, regulators and customers. For personal data breaches, your plan should reference the ICO's 72‑hour expectation to assess and report where required.
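The 72‑hour window is easy to lose track of during an incident, so it can help to compute the deadline the moment a breach is identified. A minimal sketch, assuming the clock starts when your team becomes aware; `notification_deadline` and `hours_remaining` are hypothetical helpers and the timestamps are invented. This is an aid to tracking time, not legal advice on whether a breach is reportable.

```python
from datetime import datetime, timedelta, timezone

ICO_WINDOW = timedelta(hours=72)


def notification_deadline(became_aware_at: datetime) -> datetime:
    """Latest time to notify, counted from when the organisation became aware."""
    return became_aware_at + ICO_WINDOW


def hours_remaining(became_aware_at: datetime, now: datetime) -> float:
    """Hours left before the deadline (negative once it has passed)."""
    return (notification_deadline(became_aware_at) - now).total_seconds() / 3600


aware = datetime(2025, 3, 3, 9, 0, tzinfo=timezone.utc)
now = datetime(2025, 3, 4, 15, 0, tzinfo=timezone.utc)
print(notification_deadline(aware).isoformat())  # 2025-03-06T09:00:00+00:00
print(hours_remaining(aware, now))  # 42.0
```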
How Growcreate operationalises recovery for Umbraco
We run and recover Umbraco platforms for organisations that cannot afford to guess. Here’s our approach.
Architecture that fits your risk
Azure environments engineered for Umbraco hosting on App Service and Azure SQL, with options for zone redundancy, Front Door and multi‑region failover where the risk warrants it.
Umbraco Cloud where speed and predictable cost make sense, with clear backup and restore capabilities.
Recovery runbooks
Documented failover and failback steps covering database restore, media recovery and DNS or Front Door routing, aligned to your RTO and RPO. Tested and refined on a schedule you can show auditors.
Backups you can bank on
Azure SQL PITR configured to your editorial cadence, with optional long‑term retention for compliance.
Blob snapshots or versioning for media.
Monitoring and alerting that proves SLAs
Application Insights availability tests, Front Door and Traffic Manager health signals, plus Service Health alerts into your incident channels.
SLA‑backed support
24/7 incident response with agreed objectives. As a Platinum Umbraco Partner with Azure‑native engineering, we build platforms designed for uptime and safe releases.
Public sector procurement ready
We are listed on G‑Cloud 14 for Umbraco CMS and managed cloud services.
Test like outages are real
Paper plans don’t recover sites. Exercise them.
- Run table‑top exercises and live failovers to validate roles, runbooks and comms. Use findings to close gaps with time‑bound actions.
- For multi‑region setups, rehearse Traffic Manager or Front Door failover and measure time to serve healthy endpoints.
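Measuring time to serve healthy endpoints during a rehearsal can be as simple as logging probe results and reading the recovery time off the log. The sketch below assumes an ordered list of `(timestamp, healthy)` samples from whatever probe you use; `measured_recovery_time` is a hypothetical helper and the samples are invented.

```python
from datetime import datetime, timedelta
from typing import Optional


def measured_recovery_time(samples) -> Optional[timedelta]:
    """Time from the first failed probe to the next healthy one, or None if not observed."""
    first_failure = None
    for timestamp, healthy in samples:
        if not healthy and first_failure is None:
            first_failure = timestamp
        elif healthy and first_failure is not None:
            return timestamp - first_failure
    return None


# Invented rehearsal log: probes every 30 seconds around a forced failover.
start = datetime(2025, 6, 1, 10, 0, 0)
log = [
    (start, True),
    (start + timedelta(seconds=30), False),
    (start + timedelta(seconds=60), False),
    (start + timedelta(seconds=90), False),
    (start + timedelta(seconds=120), True),
]
print(measured_recovery_time(log))  # 0:01:30
```

The result is only as precise as the probe interval, so a 30‑second cadence means the true recovery time could be up to 30 seconds shorter or longer. Compare the measured value with your RTO and record it with the test evidence.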
A simple quarterly test plan
| Quarter | Scenario | What to validate | 
|---|---|---|
| Q1 | Point‑in‑time database restore to staging | Editors lose no more than target RPO. Sign‑off steps are clear | 
| Q2 | Front Door or Traffic Manager failover to secondary region | DNS and health probes behave as expected. CDN and cache headers don’t block recovery | 
| Q3 | Media rollback via blob versioning or snapshot | Correct assets restored without collateral loss | 
| Q4 | Full runbook exercise with stakeholder comms | Roles, comms timings and third‑party contacts work under pressure | 
Governance and compliance pointers
- Keep a breach response checklist next to your DR plan. The ICO expects assessment and, where required, notification within 72 hours.
- Document invocation criteria, decision‑makers and external notifications. Log evidence of tests and improvements.
What good looks like in practice
- RPO and RTO written into your SLA with monitoring that proves them.
- Backups verified by restore, not just by green ticks.
- Health‑based failover that routes around trouble without manual heroics.
- Tests that surface weak points before incidents do.
Ready to tighten your recovery posture?
If you want an honest view of where you stand today, book a free hosting audit and we’ll map your current set‑up to the right recovery options for your risk, budget and targets.
- Explore Umbraco hosting by Growcreate
- Book a free hosting audit
