I recently worked on performance tuning for a backup solution with the following components:
- EMC Avamar as the backup server at DC1 and DC2;
- Data Domain as the backup target storage at DC1 and DC2; and
- Backup datasets are replicated between DC1 and DC2.
The backup client is an Avamar NDMP node at each site. Avamar can back up the 50 TB of Isilon CIFS shares in 2 hours through NDMP. However, replicating the daily backup to the peer DC can take more than 24 hours.
1. Basic Replication Performance Tuning
Basic tuning includes checking the following items:
- Check the Data Domain replication IP interface; use 10Gb if available;
- Check Data Domain CPU/Memory/Disk utilization;
- Check Avamar Dataset setting;
- Use Avamar Replication Group Filter to replicate selected save sets instead of all; and
- Check the Avamar Replication Group setting “maximum concurrent processes”. Each client can start 1 process, so this setting helps when the replication group contains multiple clients. The maximum value is 8.
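The effect of “maximum concurrent processes” can be sketched with a little arithmetic. This is only an illustration of the rule stated above (one process per client, setting capped at 8); the function name and the lower bound of 1 are assumptions made for the example, not part of Avamar.

```python
MAX_CONCURRENT_PROCESSES = 8  # upper limit quoted in the text


def effective_processes(clients_in_group: int, max_concurrent: int) -> int:
    """Replication processes that can run at once, assuming one process per client."""
    if clients_in_group < 0:
        raise ValueError("clients_in_group must be non-negative")
    # The lower bound of 1 is an assumption for this sketch.
    if not 1 <= max_concurrent <= MAX_CONCURRENT_PROCESSES:
        raise ValueError("max_concurrent must be between 1 and 8")
    # The setting cannot create more processes than there are clients.
    return min(clients_in_group, max_concurrent)


single_client = effective_processes(1, 8)   # one client: the setting cannot help
many_clients = effective_processes(12, 8)   # twelve clients: capped by the setting
```

Under these assumptions, a replication group with a single client never runs more than one process, whatever the setting is.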
2. Advanced Replication Performance Tuning
For Avamar-controlled Data Domain replication, there are two major replication concepts in Avamar 7.0 and above.
2.1 Automated Multistream Replication
From Avamar 7.0, Automated Multistream Replication (AMS) is supported. All Data Domain clients on Avamar v7 and DDOS v5.3 and later automatically use AMS.
The key feature of AMS is that it uses multiple parallel streams to process and send backups to the replication target. By default, each client job uses 6 simultaneous streams in the background, as seen on the Data Domain servers. If you have 10 clients in your environment, you should see 6*10=60 replication sessions on Data Domain.
The following flag, set in the advanced settings of the Avamar replication group, changes the number of streams for each client:
- [avtar]ddr-repl-max-parallel-streams = # of streams to be allocated (1~29)
When making this change, make sure the total number of replication sessions does not exceed the maximum supported by the Data Domain. Use the following Data Domain command to identify the maximum supported sessions:
- ddboost streams show active
For a single Avamar client with large datasets, increasing the number of streams greatly improves replication, provided the Data Domain can support enough replication sessions and the DC link has enough bandwidth. However, using the maximum number of sessions does not necessarily give the maximum performance: the Data Domain may run into disk contention, which degrades performance.
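The session budget described in this section can be checked with a short calculation before changing the flag. This is a minimal sketch: the function names are made up for the example, and max_supported is a placeholder that you would read manually from the Data Domain output, not a value taken from any real system.

```python
DEFAULT_STREAMS_PER_CLIENT = 6  # AMS default quoted in the text


def total_replication_sessions(clients: int,
                               streams_per_client: int = DEFAULT_STREAMS_PER_CLIENT) -> int:
    """Sessions expected on the Data Domain: clients multiplied by streams per client."""
    if clients < 0:
        raise ValueError("clients must be non-negative")
    if not 1 <= streams_per_client <= 29:  # range given for the flag in the text
        raise ValueError("streams_per_client must be between 1 and 29")
    return clients * streams_per_client


def fits_stream_limit(clients: int, streams_per_client: int, max_supported: int) -> bool:
    """True if the planned sessions stay within the limit reported by the Data Domain."""
    return total_replication_sessions(clients, streams_per_client) <= max_supported


default_sessions = total_replication_sessions(10)               # 10 clients x 6 streams
single_client_ok = fits_stream_limit(1, 28, max_supported=60)   # 60 is a placeholder limit
ten_clients_ok = fits_stream_limit(10, 28, max_supported=60)    # 280 sessions planned
```

A check like this only guards against exceeding the session limit. As noted above, staying under the limit does not guarantee the best throughput, so the stream count is still worth testing at a few values.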
2.2 Virtual Synthetic Replication (VSR)
VSR was introduced in Avamar version 7.1 and DDOS version 5.5. It leverages the previously replicated backup so that only the changes need to be scanned and sent to the target Data Domain. Once those changes have been sent, the target Data Domain can virtually synthesize a new full backup from the previous full backup and the changes.
VSR dramatically reduces the data scanning during replication compared with AMS, which still requires a full scan. Therefore, VSR is ideal for clients with a low change rate.
VSR has several requirements; the full list is in EMC KB479374 on the EMC support website. The major requirement is:
- The “base” backup, which is the immediately preceding backup of that dataset, must exist on the replication destination.
Be aware that VSR runs a single session per client dataset, which might limit replication performance.
The user can choose between AMS and VSR with the following flag in the advanced settings of the Avamar replication group:
- [avtar]ddr-repl-method-control=N
Setting “N” to 0 forces AMS/NCR.
Setting “N” to 10 forces VSR.
By default, “N” is set to 5.
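The three values of “N” can be captured in a small lookup, which is handy when generating flag lines for several replication groups. This sketch covers only the values described in this post; the helper names are made up for the example, and the behaviour of other values of N is deliberately left unspecified.

```python
# Values of N described in this post; anything else is not covered here.
DOCUMENTED_METHODS = {
    0: "force AMS/NCR",
    5: "default",
    10: "force VSR",
}


def describe_method_control(n: int) -> str:
    """Return the documented meaning of N, or a note that it is not described."""
    return DOCUMENTED_METHODS.get(n, "not described in this post")


def method_control_flag(n: int) -> str:
    """Format the flag line as it is entered in the replication group settings."""
    return f"[avtar]ddr-repl-method-control={n}"


print(method_control_flag(0), "->", describe_method_control(0))
```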
3. Performance Tuning Result
In my case, the Avamar NDMP client is a single client, so the “maximum concurrent processes” option provides no help. I also tried VSR, but found it hard, over the long term, to meet the VSR requirement that the partial backups be replicated in the correct order. Therefore, the solution was to add the following flags to the Avamar replication group containing the NDMP client:
[avtar]ddr-repl-method-control=0 (forces AMS)
[avtar]ddr-repl-max-parallel-streams=28 (28 sessions for AMS replication)
After the change was made, replication of the 50 TB NDMP daily backup finishes in 18 hours, compared with 36 hours before tuning.
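The before and after figures translate into average rates as follows. This is a rough sketch that treats the 50 TB as the logical size of the daily backup; the amount actually sent over the link may differ because of deduplication, which is not measured here.

```python
def average_rate_tb_per_hour(size_tb: float, hours: float) -> float:
    """Average logical replication rate in TB per hour."""
    if hours <= 0:
        raise ValueError("hours must be positive")
    return size_tb / hours


BACKUP_SIZE_TB = 50  # daily NDMP backup size from the text

before = average_rate_tb_per_hour(BACKUP_SIZE_TB, 36)  # before tuning
after = average_rate_tb_per_hour(BACKUP_SIZE_TB, 18)   # after tuning
speedup = after / before

print(f"before: {before:.2f} TB/h, after: {after:.2f} TB/h, speedup: {speedup:.1f}x")
# prints "before: 1.39 TB/h, after: 2.78 TB/h, speedup: 2.0x"
```

At 18 hours, the replication of a daily backup fits within a 24-hour cycle, which was not the case at 36 hours.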