Setting the Record Straight on Backup
Or should I say, ‘Setting the Record Straight on Backing Up Optimized Data’? Carter discusses on this blog the myriad ways to perform backups on optimized data. (His blog actually reads more like a white paper explaining how backup needs to be configured to work with his product.) One of the ways Carter describes is backup via NDMP, which he says “… is the most complicated.” The funny thing is that this is the way that 90% of enterprises back up their NAS data. The other scenarios are either not stated quite correctly or are, again, designed to lead users to believe their solution is ‘simple’ when it really adds complexity. (I’ll let the backup community debate that; I have been in backup for 10+ years and I know this won’t go over well with them, nor do I want to waste too much blog space.) Finally, the last scenario they discuss isn’t backup, it’s replication, but I’ll address that too. Let’s address these one at a time.
First, Carter mentions that in some scenarios the data needs to be rehydrated in order to back it up. Rehydrating data may not require the array to have the physical capacity to store the data before it is backed up, but the array will need the CPU resources, I/O resources, bandwidth and time to rehydrate the data in order to back it up. Carter goes on to say that this situation is “ugly, but not that ugly”. I will tell you that any time you put more resource requirements on the systems that do backups, you’re running the risk that backups won’t get done. One of the greatest challenges in IT is backup. Backup administrators run into backup window problems all the time. Data is growing, not shrinking; having to do more work on more data in order to protect it is a recipe for failure. In my previous comments I may have incorrectly stated that you need more disk space to do the backups, but I did correctly state that the array will require more system resources. And where do these resources come from? When the system is idle?
When is your storage array idle? Now, what if all you had to do was, well, nothing? Storwize sits in front of primary storage and stores your data, compressed, in real time, with no performance impact and while preserving the envelope of the data file. Then, when it comes time to back up, the backup administrator does absolutely nothing different from what he/she did yesterday. The same shares are backed up, with the same clients, and all the work is done by the Storwize appliance; there is no load on the filer. The next question is whether Storwize can keep up with the backup stream, and the answer is YES. As you saw in the Wikibon CORE blog, our time to compress is on the order of milliseconds, and the time to decompress is even less. (I should also mention one thing Carter failed to mention: in order for backups to come off their system ‘transparently’ you need a software agent on the client. Who wants to manage more clients?)



