Comparing the deduplication ratios of different backup vendors is a fundamentally invalid way of comparing a product's ability to deduplicate data.

The reason deduplication ratios from different vendors cannot be compared directly with one another is that inputs, methods, and reporting are all different from product to product. This fact has been true since the very beginning of deduplication, but it is now more true than ever. In this blog, we'll take a deeper look into why that is, and describe the proper way to compare vendors' deduplication capabilities.

The concept of deduplication ratios was born in the early days of target deduplication. You purchased a product like a Data Domain appliance and sent hundreds of terabytes of backups to an NFS mount, after which the appliance would deduplicate the data. You compared the volume of data sent by the backup product to the amount of disk used on the appliance, and that ratio was used to justify this new type of product.

Even in those early days, however, you couldn't compare the advertised deduplication ratios of different products, because you had no idea how each vendor created that number. The biggest reason was that you had no idea what type of backups each vendor sent to their appliance, or what change rates they introduced after each backup - if any. A vendor that wanted to make its deduplication ratio look better could simply perform a full backup every time with no change rate. Perform 100 full backups with no change and you have a 100:1 dedupe ratio!

Even if a vendor attempts to mimic a real production environment - with a mix of structured and unstructured data and a reasonable amount of change - it will not match your environment. You have a different mixture of structured and unstructured data and a different change rate.
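The 100:1 figure above is simple arithmetic: the ratio is bytes sent divided by bytes actually stored. Here is a minimal sketch of that calculation, assuming fixed-size chunking and SHA-256 fingerprints for illustration (real appliances typically use variable-size chunking and their own hashing schemes):

```python
import hashlib
import os

def dedupe_ratio(backups, chunk_size=4096):
    """Return (bytes_sent, bytes_stored, ratio) for a sequence of backup images.

    Simulates a target dedupe appliance: each image is split into fixed-size
    chunks, and only chunks with a previously unseen fingerprint consume disk.
    """
    bytes_sent = 0
    bytes_stored = 0
    seen = set()
    for image in backups:
        bytes_sent += len(image)
        for i in range(0, len(image), chunk_size):
            chunk = image[i:i + chunk_size]
            digest = hashlib.sha256(chunk).digest()
            if digest not in seen:       # only new chunks hit the disk
                seen.add(digest)
                bytes_stored += len(chunk)
    return bytes_sent, bytes_stored, bytes_sent / bytes_stored

# 100 identical full backups with a 0% change rate -> the marketing number:
full = os.urandom(1024 * 1024)           # one 1 MiB "backup image"
sent, stored, ratio = dedupe_ratio([full] * 100)
print(f"{ratio:.0f}:1")                  # 100:1

# With a 100% change rate, the same appliance stores nearly everything:
changed = [os.urandom(1024 * 1024) for _ in range(100)]
_, _, ratio2 = dedupe_ratio(changed)
print(f"{ratio2:.0f}:1")                 # ~1:1
```

The point of the sketch is that the same appliance reports wildly different ratios depending entirely on what is fed to it, which is exactly why one vendor's advertised number cannot be compared with another's.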