Process vs. Technology


The hardest thing to change inside IT is not technology, it is process!  I say this because all too often a technology is available that provides a far superior solution to a complex IT problem, yet it does not fit into your existing business process.  Need proof?  Let's take data protection as an example.  Did you know that VTLs (virtual tape libraries) and data deduplication technologies came out at the same point in history, 10 years ago?  Which technology had faster market adoption?  VTLs, of course, because implementing them didn't cause a major disruption in processes.

Let's take a look at a simple backup environment.  We won't worry about archiving or compliance for the moment, just operational backup and recovery.  Today's backup has a number of complexities.  Some data sets have weekly full backups and daily incremental backups.  Other data sets sit under applications that, for faster recovery and simplicity, require daily full backups.  Once the backups are done, someone has to check the backup logs to confirm that every system was successfully protected; without that check there is no real data protection reliability.  Next, backup tapes are either created (if it is a disk based backup) or taken from the library, and then moved to a transportable box, hopefully a secure box.  Finally, a third party vendor comes to pick up the tapes and take them off site for safe-keeping.  Additionally, if the data is backed up using encryption, then the encryption keys are also kept off site for security purposes.
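To see why some applications insist on daily fulls, here is a minimal sketch of how many backup sets a restore has to touch under a weekly full plus daily incremental schedule. The function name `restore_chain` and the choice of Sunday as the full backup day are assumptions made purely for illustration; they do not come from any particular backup product.

```python
from datetime import date, timedelta

# Assumption for illustration: the weekly full backup runs on Sunday.
# date.weekday() numbers Monday as 0 and Sunday as 6.
FULL_BACKUP_WEEKDAY = 6


def restore_chain(target_day, full_weekday=FULL_BACKUP_WEEKDAY):
    """Return the backup sets needed to restore data as of target_day.

    The chain is the most recent full backup followed by every daily
    incremental taken after it, up to and including target_day.
    """
    days_since_full = (target_day.weekday() - full_weekday) % 7
    last_full = target_day - timedelta(days=days_since_full)
    chain = [("full", last_full)]
    for offset in range(1, days_since_full + 1):
        chain.append(("incremental", last_full + timedelta(days=offset)))
    return chain


# A Wednesday restore needs the Sunday full plus three incrementals.
wednesday_chain = restore_chain(date(2024, 1, 10))
# A restore on the day of the full needs only that one backup set.
sunday_chain = restore_chain(date(2024, 1, 7))
```

With daily fulls the chain is always one set long, which is the "faster recovery and simplicity" trade-off described above, paid for with more data moved every night.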

 Customers face these standard backup challenges:

1) Backups take too long and cannot meet backup windows as a result of too much data.

2) Backups fail due to poorly configured (networked) backup environments.

3) Backups at remote offices are 'unreliable'. (They don't follow the best practices set in the data center.)

a. No one with the appropriate skill set is available to monitor these backups.

b. No one with the appropriate skill set is available to troubleshoot these backups.

c. No one with the appropriate skill set is available to perform data recovery.

4) New applications / processes cause additional challenges: does this application need incremental backups or full backups, and what is the RPO / RTO?

5) Managing backup tapes is too difficult and costly.

However, the reality is that in this particular IT shop, no one has ever been fired for data loss. Each time there is a recovery request, data is recovered.  It may not be the absolute most recent data, or it may take 48 hours to recover, but eventually, the data is recovered. The question is, have everyone's business objectives been met? Chances are the answer is "no", but when the issue of what it would cost to meet everyone's needs comes up, there is usually no money in the budget for 'backup' and it's right back to the same old way of doing things. Backup is not really strategic to a business (unless of course you're in the business of providing backup solutions to customers); it is more of an insurance policy. There is no doubt you need it, but you want it for the lowest possible price, you hope you never have to call on it, and when you do, you had better get good service.

Maybe that is why EMC is now the GEICO of data protection.

That aside, when there is money in the budget, it usually comes in small doses, so backup administrators have to make the biggest impact in the 'easiest' way possible. This means implementing something that allows them to meet most of their challenges and doesn't:

1) Change process because they already have run books established for data recovery and because everyone is already trained on the existing technology.

2) Change configuration because they have already invested a great deal of time and money to sort out their issues with the existing products.

3) Cost a lot of money.

That usually means augmenting the existing backup software with something that gains some efficiencies on the back end, because they already have significant investments in their backup software. This was one of the main reasons for the success of VTLs (virtual tape libraries). It is far easier to unplug the slow, serial tape library and replace it with fast, parallel disk. The backup administrator gets all the advantages of disk and doesn't have to change a single process, except for maybe adding a step of cloning the data from the disk that looks exactly like tape to an actual tape in order to send the data offsite. This is also why companies with target deduplication devices became so popular so quickly. When VTL was struggling to solve backup data capacity issues, deduplication became the next popular thing.  The big issue was plugging into the existing infrastructure without disruption.  If I have to change too much about my process, I can't 'afford' to make it work.

The trouble is that backup administrators are at an inflection point. They can no longer continue to use the same old technology at the front of the backup process and meet the needs of the business. We are at a time when new technologies such as source based deduplication can really have a significant impact on a number of the backup challenges. The problem is that it runs straight into the reason IT doesn't want to change technology: it forces a change to the process. For example, out come the traditional backup agents and new ones are put into place. Since data is no longer stored in tape format, new processes must be used for getting tape offsite. When backup administrators hear this, they tend to shy away from it. It costs money and it changes processes right when they had all the original processes figured out.  It is only now that source based deduplication solutions have gained significant momentum, as they are really solving a number of the key data protection challenges for more than 70% of the data in most data centers.

  • Remote offices can now experience the same set of data protection best practices that are used in the data center. (Keep in mind that IT is accountable for 100% of the data created in the corporation, local or remote.  This is good peace of mind.)

  • VMware environments tend to ruin a TCO when using traditional backup applications. Leveraging source based deduplication can improve your TCO and ROI.
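For readers who have not seen how source based deduplication differs from backing up to a target device, the following is a minimal sketch of the general idea: the client hashes its data in chunks and sends only the chunks the backup server does not already hold. Everything here is an assumption for illustration, including the `BackupTarget` class, the `source_backup` function, and the tiny fixed 4-byte chunk size; real products use their own chunking schemes and protocols.

```python
import hashlib

# Assumption: tiny fixed-size chunks so the example is easy to follow.
CHUNK_SIZE = 4


def split_chunks(data, size=CHUNK_SIZE):
    """Split data into fixed-size chunks (the last one may be shorter)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


class BackupTarget:
    """Hypothetical backup server that stores each unique chunk once."""

    def __init__(self):
        self.store = {}  # chunk digest -> chunk bytes

    def missing(self, digests):
        """Return the set of digests the server does not hold yet."""
        return {d for d in digests if d not in self.store}

    def put(self, digest, payload):
        self.store[digest] = payload


def source_backup(data, target):
    """Hash at the client and send only unseen chunks.

    Returns the recipe (ordered digests needed to rebuild the data)
    and the number of payload bytes actually sent to the target.
    """
    pieces = split_chunks(data)
    digests = [hashlib.sha256(p).hexdigest() for p in pieces]
    needed = target.missing(digests)
    bytes_sent = 0
    for digest, piece in zip(digests, pieces):
        if digest in needed:
            target.put(digest, piece)
            needed.discard(digest)  # send a repeated chunk only once
            bytes_sent += len(piece)
    return digests, bytes_sent


def restore(recipe, target):
    """Rebuild the original data from its recipe."""
    return b"".join(target.store[d] for d in recipe)


target = BackupTarget()
monday = b"AAAABBBBCCCCDDDD"
tuesday = b"AAAABBBBXXXXDDDD"  # one chunk changed since Monday

recipe_mon, sent_mon = source_backup(monday, target)
recipe_tue, sent_tue = source_backup(tuesday, target)
```

In this toy run the first backup sends all 16 bytes and the second sends only the 4 bytes that changed. The same sketch also hints at the limits: if nearly every chunk changes between backups, nearly every chunk is sent again and the savings disappear.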

This is not to say that source based deduplication is the savior of the backup world. It is not. There are places where source based deduplication technologies are not the best fit. Very large environments with very high change rates and little duplicate data don't tend to be good fits. However, if you attack the places that are a good fit for source based deduplication, you will create relief in your backup environment at the target and that will be good for everyone.  It is time to take backup, beyond.

Posted by Steve Kenniston

About the Author

Steve Kenniston - The Storage Alchemist.

Comments (1)

  1. Glenn Grabowski says:

    Great stuff, Steve. Couldn’t agree more.
    While I agree we’re at an inflection point where data growth and traditional backup methods can’t ensure a business’s objectives are met, we’re also at another inflection point. Economic woes have caused IT shops to trim their talent and budgets while trying to grow whatever business they still have. Backup technology is often, as you say, put on the back burner. Enter data de-dupe. It looks tempting because it “reduces the storage footprint” in the datacenter, etc. However, some de-dupe technology is too expensive on paper, so a “free feature” on an array/filer is often chosen as a “quick fix”, or band-aid. My standpoint is that de-dupe technology is only part of the solution and is, on the surface, a band-aid. It has its place but it doesn’t attack the root of the problem: poor data management. The same process that Steve speaks of that makes change difficult also ignores how data is managed during its lifecycle. Whether we use source-based or target-based de-dupe SW/HW, we’re not addressing how data is truly managed. The easy way out is to ignore the issue for a rainy day and address it later. In the meantime, IT shops will buy the most storage they can afford (with or without de-dupe “built in”) and try to tread water.
    Now, I’m a realist. I know data lifecycle management has been bandied about for a decade (or more) with few winning vendors delivering the ROI they promised, which leads me to the second inflection point I brought up earlier. With dwindling budgets and staff, now is the time to either:
    1. invest a little in data management tools, or a services engagement, and decide what information even belongs in the datasets customers struggle to back up nightly, or,
    2. use some good old fashioned elbow grease to analyze your data.

    The goal would be to reduce your primary dataset FIRST. Then, decide if you even need de-dupe. Again, I’m a realist. Archiving software isn’t perfect and isn’t simple to deploy but once you “get to know” your data, you can use elbow grease again, if necessary, to move the old data off the primary dataset.

    Like Steve says, it’s process vs. technology. Getting a grasp of your data is a process and technological challenge. However, I think it’s the key regardless of the type of data de-dupe eventually chosen.
