At Nexstor, we have designed and implemented Veeam Backup & Replication across a wide range of environments:
- Single-server schools with a handful of VMs and a tight budget.
- Financial institutions with hundreds of terabytes, strict regulatory requirements, and zero tolerance for data loss.
- Manufacturing businesses with awkward legacy infrastructure bolted onto modern virtualisation platforms.
- Law firms where a partner’s email archive is apparently worth more than the building.
Every time, the fundamentals are the same regardless of scale. The mistakes are also largely the same, just more expensive at the top end.
This isn’t a walkthrough of the Veeam UI. It’s the architecture and configuration thinking that makes the difference between a backup environment that holds up under pressure and one that fails you at the worst possible moment.
Before Anything Else: The 3-2-1-1-0 Rule
If you’re still designing around plain 3-2-1 (three copies, two media types, one off-site), you’re a version behind. The updated standard is 3-2-1-1-0, and the two additions matter:
- 3 copies of your data
- 2 different storage media types
- 1 off-site copy
- 1 immutable or air-gapped copy: a copy that cannot be modified or deleted, even by a compromised admin account
- 0 errors on verified recovery: backup completion is not the same as backup validity
We use this as the baseline conversation with every client before we touch a single configuration setting. Whether it’s a primary school with a single Hyper-V host or a multi-site enterprise with a dedicated DR facility, this framework applies. The implementation looks different; the principle doesn’t.
The zero-errors point is particularly important and chronically under-actioned. SureBackup, Veeam’s automated recovery verification, should be running in every environment we deploy. We’ve seen ‘successful’ backup jobs that would have produced unbootable VMs on restore: the job completed, the data was useless, and nobody knew until we tested.
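To make that baseline conversation concrete, here is a minimal Python sketch that scores a set of copies against each part of the rule. The `BackupCopy` fields and the `check_32110` helper are illustrative names of our own, not anything Veeam exposes, and we are assuming the production data counts as one of the three copies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BackupCopy:
    """One copy of the data. All field names are illustrative."""
    name: str
    media: str       # e.g. "disk", "object", "tape"
    offsite: bool
    immutable: bool  # immutable or air-gapped


def check_32110(copies: list[BackupCopy], verified_errors: int) -> dict[str, bool]:
    """Return a pass/fail result for each part of the 3-2-1-1-0 rule."""
    return {
        "3 copies": len(copies) >= 3,
        "2 media types": len({c.media for c in copies}) >= 2,
        "1 off-site": any(c.offsite for c in copies),
        "1 immutable or air-gapped": any(c.immutable for c in copies),
        "0 verification errors": verified_errors == 0,
    }


# Hypothetical small environment: production plus two backup copies.
copies = [
    BackupCopy("production", "disk", offsite=False, immutable=False),
    BackupCopy("local repository", "disk", offsite=False, immutable=False),
    BackupCopy("cloud repository", "object", offsite=True, immutable=True),
]
for rule, ok in check_32110(copies, verified_errors=0).items():
    print(f"{rule}: {'pass' if ok else 'FAIL'}")
```

The `verified_errors` input is meant to come from recovery verification results, not from job completion status, which is the whole point of the final zero.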
Backup Server Architecture: Get This Wrong and Everything Else Suffers
The backup server is the management plane for your entire Veeam environment. It doesn’t do the heavy lifting (that’s the proxies), but it orchestrates everything: job scheduling, repository management, replication coordination, licensing.
Our standard recommendation is a dedicated physical server, isolated from the production environment it’s protecting.
In smaller environments, we often get pushback on this. “Can’t we just run it as a VM?” Sometimes the answer is yes, but only if you understand the failure scenario you’re accepting. If your VMware or Hyper-V cluster is the thing that’s gone down, and your backup server lives inside it, you’ve lost your management plane at exactly the moment you need it most.
For enterprise clients, we often recommend deploying the backup server on a small physical box, nothing elaborate, sitting outside the main cluster. For mid-market environments where physical isn’t practical, we virtualise it but replicate it to an independent host or the DR site. For small clients, we scope the risk and make a conscious decision rather than just defaulting to the convenient option.
One thing we will always recommend regardless of environment size: back up the Veeam configuration database. It’s a built-in feature. Veeam can export the entire configuration to a file and send it to a repository. I’ve used this to rebuild a backup server from scratch in under an hour after a hardware failure. Without it, that’s a day’s work.
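A configuration backup only helps if a recent one exists somewhere you can reach. As a rough sketch, a monitoring script could check the age of the newest export in the target folder. We are assuming here that the exports are files with a `.bco` extension in a single directory; the function names and the 24-hour threshold are our own choices for illustration.

```python
import time
from pathlib import Path
from typing import Optional


def latest_config_backup(directory: Path) -> Optional[Path]:
    """Return the most recently modified .bco file in directory, or None."""
    candidates = list(directory.glob("*.bco"))
    return max(candidates, key=lambda p: p.stat().st_mtime, default=None)


def is_stale(backup: Optional[Path], max_age_hours: float,
             now: Optional[float] = None) -> bool:
    """True if there is no config backup, or the newest is older than max_age_hours."""
    if backup is None:
        return True
    now = time.time() if now is None else now
    return (now - backup.stat().st_mtime) > max_age_hours * 3600


if __name__ == "__main__":
    # Hypothetical path; point this at wherever the exports actually land.
    newest = latest_config_backup(Path("."))
    print("stale" if is_stale(newest, max_age_hours=24) else "fresh")
```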
Backup Proxies: This Component is Critically Under-Specified in Most Environments
The proxy is where your data actually gets processed. It connects to the source host, reads the VM data via whichever transport mode is appropriate, compresses and deduplicates it, and moves it to the repository. The backup server just tells it what to do and when. Getting proxy architecture right is where the real performance engineering happens.
Transport Mode Selection
Transport mode selection matters more than most people realise. In VMware environments, Hot Add is usually the right choice for virtual proxies: it accesses VMDK data directly via the SCSI bus rather than over the network, which is significantly faster and reduces ESXi host CPU load. Direct SAN is the fastest option if you have FC or iSCSI connectivity between the proxy and storage, and it’s what I’d specify for large enterprise environments with dedicated SAN infrastructure. NBD (Network Block Device) is the fallback. It works everywhere, but it’s the slowest, and you’ll notice it at scale.
For Hyper-V environments, the On-Host proxy model, where the proxy role runs directly on the Hyper-V host, is often overlooked but gives you the best performance for smaller environments. Off-Host processing with a dedicated proxy server scales better for larger deployments.
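The selection logic above can be written down as a small decision function. This is a sketch of the reasoning in this section only, not of how Veeam chooses a mode automatically; the function name, the parameters and the `large_deployment` flag are all illustrative.

```python
def pick_transport_mode(platform: str, proxy_is_virtual: bool = False,
                        proxy_has_san_access: bool = False,
                        large_deployment: bool = False) -> str:
    """Suggest a transport or processing mode following the rules of thumb above."""
    if platform == "vmware":
        if proxy_has_san_access:
            return "Direct SAN"   # fastest where FC/iSCSI connectivity exists
        if proxy_is_virtual:
            return "Hot Add"      # usual choice for virtual proxies
        return "NBD"              # works everywhere, slowest
    if platform == "hyperv":
        # On-Host suits smaller environments; Off-Host scales better.
        return "Off-Host" if large_deployment else "On-Host"
    raise ValueError(f"unknown platform: {platform!r}")


print(pick_transport_mode("vmware", proxy_is_virtual=True))
print(pick_transport_mode("hyperv", large_deployment=True))
```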
Proxy Sizing in Practice
By default, each proxy task processes one VM disk at a time. So if you have a proxy with 8 CPU cores allocated, Veeam will default to 4 concurrent tasks (the rule of thumb is one task per 2 cores, though this is tunable). Work backwards from your backup window. If you need to back up 200 VMs in a 6-hour window, multiply the VM count by the average backup duration per VM and divide by the length of the window. The result is the number of concurrent tasks you need across all your proxies.
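Here is that arithmetic as a short Python sketch. The 20-minute average duration is an assumption picked purely for illustration, and treating each VM as a single task is a simplification, since a VM with several disks can occupy several tasks.

```python
import math


def required_concurrent_tasks(vm_count: int, avg_minutes_per_vm: float,
                              window_hours: float) -> int:
    """Concurrent tasks needed to fit all VMs into the backup window."""
    return math.ceil(vm_count * avg_minutes_per_vm / (window_hours * 60))


def proxy_cores(tasks: int, cores_per_task: int = 2) -> int:
    """CPU cores needed at the one-task-per-two-cores rule of thumb."""
    return tasks * cores_per_task


# 200 VMs, 6-hour window, assumed 20 minutes per VM on average.
tasks = required_concurrent_tasks(200, 20, 6)
cores = proxy_cores(tasks)
print(f"{tasks} concurrent tasks, {cores} proxy cores in total")
# prints "12 concurrent tasks, 24 proxy cores in total"
```

Those 24 cores can be spread across several proxies, which is usually preferable to one large proxy because it removes a single point of failure.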
For clients with remote branch offices, I always put a local proxy at the branch. Pulling uncompressed VM data across a WAN link to a central proxy before deduplication is painful and unnecessary. A local proxy compresses and deduplicates before transmission, which typically reduces WAN traffic by 60-80%.
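To see what that reduction means for a branch link, a quick estimate helps. The sketch below assumes a 100 GB nightly change and a 100 Mbps link, both made-up figures, uses decimal units, and ignores protocol overhead and contention.

```python
def wan_transfer_hours(changed_gb: float, reduction: float, link_mbps: float) -> float:
    """Hours to send changed data after a fractional reduction (0.6 means 60% less)."""
    if not 0 <= reduction < 1:
        raise ValueError("reduction must be in [0, 1)")
    sent_gb = changed_gb * (1 - reduction)
    seconds = sent_gb * 8 * 1000 / link_mbps  # GB -> megabits, decimal units
    return seconds / 3600


# Compare no local proxy with the 60% and 80% ends of the range quoted above.
for reduction in (0.0, 0.6, 0.8):
    hours = wan_transfer_hours(100, reduction, 100)
    print(f"{reduction:.0%} reduction: {hours:.2f} h")
```

On those assumptions the transfer drops from a little over two hours to well under one, which is often the difference between fitting the backup window and missing it.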
Backup Repositories: Where Architecture Decisions Have Long-Term Consequences
Repository design is the area where decisions made on day one are hardest to change later. I’ve inherited environments where everything was crammed into a single CIFS share on a NAS and the organisation had been adding VMs for three years. Unpicking that is not fun.
Filesystem choice first. If you’re deploying a Windows-based repository, use ReFS. Veeam’s fast clone technology uses ReFS block cloning to create synthetic full backups without copying data, so it’s dramatically faster and uses a fraction of the storage overhead. On Linux, XFS with reflink enabled gives you equivalent benefits. This isn’t a ‘nice-to-have’; it’s standard practice on any environment we deploy.
Immutability is now a baseline requirement, not an advanced feature. The Linux hardened repository, a Veeam-managed Linux server with immutability enabled, means that backup files within the immutability window cannot be deleted or modified by anyone, including a Veeam administrator with full credentials. That matters enormously in a ransomware scenario where the attacker has compromised admin accounts. We design this into environments ranging from small accounting firms all the way up to financial services clients with strict regulatory obligations. The implementation complexity is low; the protection it provides is significant.
A few things to be precise about when configuring immutability: the immutability period needs to be long enough to cover your retention requirements with margin, the Linux server needs to be hardened correctly (single-purpose, SSH key auth only, no shared credentials), and you need a documented process for what happens when a restore point needs to be deleted legitimately. Get those details right upfront.
Scale-Out Backup Repository (SOBR) for larger environments. SOBR lets you aggregate multiple repository extents into a single logical pool and define policies for how data moves between tiers. I use it in most mid-market and enterprise deployments. It gives you the flexibility to add capacity without restructuring your job configuration, and the capacity tier integration with object storage (S3-compatible targets, Azure Blob, etc.) makes long-term retention economical without filling up expensive primary storage.
Keep backup and production infrastructure separate. Your backup repository should be on storage that is physically or logically independent of your production environment. The failure domain of your backups should not overlap with the failure domain of your production systems.
Job Design: The Detail That Determines Your Recovery Options
How you structure backup jobs determines what you can recover, how quickly, and with what granularity. It also determines whether your backup window is manageable or a constant source of alerts.
Separate jobs by recovery tier. Not all VMs are equal. Your tier-one systems (core databases, domain controllers, critical application servers) should be in dedicated jobs with shorter RPOs, more aggressive retention, and possibly a separate replication job running alongside backup. Tier-two and tier-three systems can tolerate longer RPOs and simpler configurations. Mixing them all together means you end up over-protecting low-priority VMs and under-configuring the ones that matter.
Isolate large VMs. A 4TB file server in a job with twenty 50GB application servers will delay those smaller VMs every single time. I put any VM over 1TB in its own job, sized appropriately, so it can’t hold up the rest of the environment.
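Those two rules, group by tier and isolate anything over 1TB, are easy to apply to an inventory export. The sketch below is a planning aid only; the `VM` fields, the job naming scheme and the sample inventory are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass

LARGE_VM_GB = 1024  # the 1TB isolation threshold described above


@dataclass(frozen=True)
class VM:
    name: str
    tier: int        # 1 = most critical
    used_gb: float


def plan_jobs(vms: list[VM]) -> dict[str, list[str]]:
    """Group VMs into one job per tier, with each large VM in its own job."""
    jobs: dict[str, list[str]] = defaultdict(list)
    for vm in vms:
        if vm.used_gb > LARGE_VM_GB:
            jobs[f"large-{vm.name}"].append(vm.name)
        else:
            jobs[f"tier{vm.tier}"].append(vm.name)
    return dict(jobs)


inventory = [
    VM("dc01", 1, 80),
    VM("sql01", 1, 600),
    VM("app01", 2, 50),
    VM("fileserver", 2, 4096),
]
for job, members in plan_jobs(inventory).items():
    print(job, members)
```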
Application-aware processing is not optional for stateful workloads. For SQL Server, Exchange, SharePoint, and Oracle, application-aware processing is what gives you a genuinely consistent backup state rather than just a crash-consistent snapshot. For SQL specifically, it also handles transaction log truncation. Without this, you’ll find log drives filling up and causing application issues. I enable it on every stateful workload, every time.
Incremental forever with scheduled synthetic fulls is the right approach for most environments. Active full backups, where Veeam re-reads the entire source VM, are expensive in I/O and time. Synthetic fulls achieve the same result by merging existing incremental chains on the repository side, with minimal impact on production. I typically schedule these weekly, off-peak, and stagger them across job groups so they don’t all hit the repository simultaneously.
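Staggering can be as simple as dealing job groups out across a handful of off-peak slots in round-robin order. A minimal sketch, with made-up group names and slots:

```python
from itertools import cycle


def stagger(job_groups: list[str], slots: list[str]) -> dict[str, str]:
    """Assign each job group a synthetic-full slot, round-robin over the slots."""
    if not slots:
        raise ValueError("at least one slot is required")
    return dict(zip(job_groups, cycle(slots)))


# Hypothetical groups and weekend off-peak slots.
schedule = stagger(
    ["tier1", "tier2", "tier3", "large-fileserver"],
    ["Sat 20:00", "Sat 23:00", "Sun 02:00"],
)
for group, slot in schedule.items():
    print(f"{group}: {slot}")
```

With more groups than slots, some will share a slot, so it is worth pairing a heavy group with a light one rather than relying on the order alone.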
Changed Block Tracking (CBT) in VMware environments. Make sure it’s enabled and healthy on your VMs. CBT tells Veeam exactly which blocks have changed since the last backup, making incrementals fast and efficient. Occasionally CBT gets into a bad state and incremental jobs start running much longer than expected. This is one of the first things I check when a client reports unexpectedly large or slow incrementals.
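One way to catch this early is to compare each incremental against the recent norm for that job. The sketch below flags any run more than three times the median of the runs before it. The factor, the minimum history and the sample sizes are arbitrary assumptions, and a flagged run is only a prompt to investigate, not proof that CBT is at fault.

```python
from statistics import median


def suspicious_incrementals(sizes_gb: list[float], factor: float = 3.0,
                            min_history: int = 3) -> list[int]:
    """Return indexes of runs larger than factor x the median of all earlier runs."""
    flagged = []
    for i in range(min_history, len(sizes_gb)):
        baseline = median(sizes_gb[:i])
        if sizes_gb[i] > factor * baseline:
            flagged.append(i)
    return flagged


# Hypothetical nightly incremental sizes in GB for one job.
history = [12, 10, 11, 13, 95, 12]
print(suspicious_incrementals(history))  # prints [4]
```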
Veeam Cloud Connect: Seamless and Reliable Off-Site Backup
Getting a copy of your backups off-site is non-negotiable. A fire, flood, or ransomware attack that hits your primary site takes local backups with it. The question is how you do it without creating a complex, brittle architecture that requires constant maintenance.
Veeam Cloud Connect is the cleanest solution I’ve found. Rather than building VPN tunnels, managing firewall rules, and maintaining off-site infrastructure yourself, you connect to a service provider’s cloud repository directly from the Veeam console. The connection is TLS-encrypted, the data is deduplicated before transmission, and the service provider manages the underlying platform.
From a design perspective, Cloud Connect also supports DRaaS (Disaster Recovery as a Service). Rather than just storing backup data off-site, you can replicate VMs to the cloud and fail over to them in the event of a site outage. The recovery time difference is material. Restoring from backup to new infrastructure might take four to eight hours. Failing over to an existing replica can be done in minutes.
Nexstor operates Veeam Cloud Connect services from UK-based private cloud infrastructure, purpose-built for this workload rather than carved out of a hyperscaler environment.
Testing Recovery: The Part Everyone Skips
SureBackup should be scheduled and running automatically. Not manually triggered before an audit. Automatically, on a schedule, with results being monitored. This is configured on every deployment. It boots the backed-up VM in an isolated network, runs heartbeat and ping tests, can run custom scripts to test application-layer health, and reports pass or fail. If a backup that completed successfully wouldn’t actually produce a bootable VM, you want to know that today, not during an incident.
We recommend at least one full DR exercise per year for any environment with formal recovery objectives. That means actually failing over to the DR environment, working through the recovery runbook end to end, timing each step, and documenting what didn’t work as planned. Something unexpected always comes up, whether it’s DNS, licensing, application interdependencies or network routing. Better to find it in a controlled exercise on a Tuesday afternoon than at 3am during an actual incident.
Sizing Reference: The Numbers Used in Practice
For Repositories
- Total used data (not provisioned – actual used storage across all VMs)
- Daily change rate – typically 3–5% for general workloads, 10–15% for active databases
- Retention period – number of restore points × average daily change data
- Expected deduplication ratio – 2:1 to 4:1 depending on data type; databases deduplicate poorly, general file data much better
- Add 20–30% headroom
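Put together, the repository inputs above give a first-pass capacity estimate. The sketch below assumes one full backup plus one incremental per retained restore point, applies a single reduction ratio to everything, and ignores the extra space of periodic fulls (which fast clone on ReFS or XFS keeps small). The example figures are illustrative, not a recommendation.

```python
def repository_size_tb(used_tb: float, daily_change: float, restore_points: int,
                       reduction_ratio: float, headroom: float) -> float:
    """Estimate repository capacity in TB.

    daily_change and headroom are fractions (0.05 = 5%);
    reduction_ratio is the N in an N:1 data reduction ratio.
    """
    full = used_tb / reduction_ratio
    incrementals = used_tb * daily_change * restore_points / reduction_ratio
    return (full + incrementals) * (1 + headroom)


# 20 TB used, 5% daily change, 30 restore points, 2:1 reduction, 25% headroom.
size = repository_size_tb(20, 0.05, 30, 2, 0.25)
print(f"{size:.2f} TB")  # prints "31.25 TB"
```

Run it once with the pessimistic end of each range and once with the optimistic end; the gap between the two is a useful indicator of how much the sizing depends on assumptions you should measure.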
For Proxies
- Backup window duration
- Total VM count and average backup duration per VM
- Required concurrent task count = (VM count × avg duration) ÷ backup window
- Proxy CPU cores = concurrent tasks × 2 (default ratio, tunable)
For the Backup Server
- 4 cores and 8GB RAM handles most environments up to a few hundred VMs
- The PostgreSQL database (from V12 onwards) benefits from fast local storage – don’t put it on slow spinning disk
What To Prioritise in Any Environment Review
When we walk into an existing Veeam environment for the first time, here’s what we check:
- Is SureBackup running? If not, you don’t actually know if your backups are valid.
- Is there an immutable copy? If not, a ransomware attack or compromised admin account could wipe your backup history.
- Where is the backup server? If it lives inside the environment it’s protecting, that’s a risk to understand and address.
- Is there an off-site copy? Local-only backup is not a complete strategy.
- When was the last DR test? If no one can answer this confidently, it’s overdue.
- Is the Veeam configuration backed up? If the backup server itself needs rebuilding, how long would that take?
These aren’t exotic concerns, but environments that have all six covered are genuinely well-protected. Environments that are missing two or three of them are one bad day away from a serious problem.