Tag: "Recovery"

Data Protection, Retention and Archive Starts with Data Value


 It feels good to open up the blogging again to new topics, especially ones I am intimately familiar with.  (But have no fear, there will be references to primary storage optimization / compression.)

This weekend I had an interesting conversation with my Dad.  We were discussing backup.  My dad basically runs IT for the State of Maine, which uses CommVault backup software.  So I posed the question to him: “What would it take for you to rip out CommVault and replace it with another solution?”  He thought about it for a moment and replied, “I wouldn’t.”  His answer came down to a couple of reasons.

First was the expense.  It’s not just about buying the new software; it would also mean training people to run it, throwing away the massive investment they have in their existing product, and converting all the years of backup tapes created with one product to the new one.  This is one of the biggest things vendors forget when trying to sell a customer on their backup software.

Second was the fact that, feature for feature, the top 5 traditional backup software products are not really that different from one another.  Sure, some products have features that others don’t, and some products have features that work better than others, but in reality the delta is so small and the workarounds are so simple that it doesn’t really matter.  Unless you’re replacing traditional backup software with an evolutionary source based data deduplication software (which is only applicable for some environments), there is no advantage to switching software.

The challenge is this: if data protection is still one of the biggest and most expensive pain points within IT, how do the problems get resolved when the software controlling it all is too costly to replace?


IBM Day 1 – It’s Official


Between time off with the family this summer and all the work required to get done between 'signing' a deal to be acquired and 'closing' a deal to get acquired, the blog has been a bit slow.  But I am here now to tell you it is official.  Storwize is now Storwize, an IBM company.

As for myself, I am looking forward to the work of integrating the Storwize technology into the IBM Storage portfolio.  The Storwize group will live under the STG organization under Brian Truskowski.  There is a new groundswell taking hold at IBM these days all around storage efficiency.  To get a better understanding, please have a look at the blog of my new colleague, Tony Pearson, which discusses storage efficiency.  My job now will be to evangelize how IT needs to look at the ways all of the available storage "services" (clones, snapshots, thin provisioning, replication, compression, deduplication, etc.) can help create an overall storage solution that reduces overall $/TB, not only on capital expense but also on operational expense.

Let's face it, data growth isn't slowing down and there is never a one size fits all solution for storage.  The great part about being a part of IBM now is that we have all the tools to pick from to architect a data storage solution, end to end, that allows customers to reduce their overall $/TB for both primary and secondary storage and makes that storage much more efficient for the end user.

This is going to be an exciting time.  I am also anxious to continue the Storage Alchemist blog.  EMC, under the guidance of Polly Pearson and Chuck Hollis, taught me that social media is great, but social media done right, in a collaborative and thoughtful way, can drive influence.  I join some of the best bloggers around from IBM.  (I have added Tony's "Inside System Storage" - it is a great read.)


Enterprise Data Protection at the Edge


What does that really mean?  When I worked for Veritas, back in 1998, we acquired a company based out of Canada called TeleBackup that backed up desktops / laptops.  In 1999 Veritas acquired Seagate and the Backup Exec product, which also had a desktop / laptop option.  These products were meant to eventually be integrated into the main backup applications but never were.  Additionally, a lot of that software was given away (hard to make a business on that) and, for the most part, lived on a shelf somewhere and was never installed.

In 2004 I worked for Connected Corporate (acquired by Iron Mountain), whose sole business was desktop / laptop backup.  (In fact, from 2000 to 2004 I worked as an analyst for ESG covering all the vendors in the backup space and used the Connected product to back up my work laptop – and it actually saved my hide once.)  While the company executed a successful exit, the business was (and probably still is) only about a $20M to $40M business.

Why do I bring this up?  There is a new reality in IT these days.  I have said it before: IT is accountable for 100% of the data created in any company, including the data stored on desktops / laptops.  This means that IT not only has to provide a location to store this data, but also needs to provide tools to protect this information and ensure that it is highly recoverable, both for business productivity and for corporate and legal governance.  As a result, desktop / laptop backup is now gaining a lot more visibility in the enterprise.

However, desktop / laptop data protection is one of those areas in IT that is just a nuisance because it seems like it should be an easy problem to solve, but there are so many moving parts to it that it ends up falling by the wayside.

A successful desktop / laptop backup technology needs three very specific capabilities:

  • Integrate seamlessly with the existing backup solution in the enterprise

Architecting for Recovery


Here is a shocker for you: backup IS a science.  Good backup administrators / architects are worth their weight in gold.  CIOs just wish backup would go away.  Backup costs money, it’s not strategic, it chews up manpower, and when it is 'running' (successfully or not) no one really pays attention to it.  But when it fails, or more likely when you need to restore data and can't, someone can lose their job.  So backup is VERY important, it is a science, and architecting a backup environment correctly takes time, skill, money and someone who knows what they are doing.

Good backup administrators architect for recovery, not for backup.  Prove it, you say.  Okay, question: “Why do backup administrators do full backups of Exchange every night?”  Answer: because it is way easier and much faster to perform a one step full recovery for Exchange than it is to lay down the weekly full and apply the incrementals.  Since mail is considered a “critical application” in the enterprise these days, and down time is critical for this application, good backup administrators architect for the least amount of downtime for the application.  This also applies to databases.  Ninety-five percent of all databases are actually snapped for quick recovery, and I would also bet that a full backup is performed on them (or the snap) every evening.
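To put rough numbers on the Exchange example, here is a minimal sketch that compares a one step restore from a nightly full against laying down a weekly full and applying the incrementals.  The mail store size, daily change, throughput and per-step overhead are all hypothetical values chosen for illustration, not measurements from any product.

```python
from dataclasses import dataclass, field


@dataclass
class RestorePlan:
    """A restore expressed as the backup images that must be laid down, in order."""
    name: str
    image_sizes_gb: list = field(default_factory=list)


def restore_hours(plan, throughput_gb_per_hour, per_step_overhead_hours):
    """Estimate restore time: data transfer plus a fixed overhead per image
    (mounting media, cataloguing, replaying changes). Both rates are assumptions."""
    transfer = sum(plan.image_sizes_gb) / throughput_gb_per_hour
    overhead = per_step_overhead_hours * len(plan.image_sizes_gb)
    return transfer + overhead


# Hypothetical 500 GB mail store with 25 GB of daily change,
# failing five days after the weekly full.
nightly_full = RestorePlan("nightly full", [500])
weekly_plus_incrementals = RestorePlan("weekly full + 5 incrementals", [500] + [25] * 5)

for plan in (nightly_full, weekly_plus_incrementals):
    hours = restore_hours(plan, throughput_gb_per_hour=200, per_step_overhead_hours=0.5)
    print(f"{plan.name}: {len(plan.image_sizes_gb)} step(s), roughly {hours:.1f} hours")
```

The point of the sketch is the shape of the result rather than the exact figures: every extra image in the chain adds both data and handling time to the recovery, which is why nightly fulls are attractive for applications where downtime is critical.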

Recovery is a primary driver of any good backup architecture but lately I have been hearing a great deal of talk around ‘backup consolidation’.  The reality is, there is no ‘one size fits all’ when it comes to backup software or hardware.  Consolidating backup software may make your environment easier to manage, but does it provide you the tools/technology you need to maximize your data protection objectives in your environment?  Consolidating backup targets (tape / disk) may yield fewer devices to manage, but what happens to your overall backup and recovery performance when doing so?  While new technologies may help fine-tune the science side of backup, they still need an artist’s touch.


The Side Effects of Backup on Server Virtualization


Server virtualization has changed the IT landscape dramatically.  It has become a magic potion curing a number of ills in the physical server world such as low individual CPU utilization and excess use of space, power and cooling in the data center.  However, like all potions that cure what ails you, there can be side effects.  You need to be careful of what the Witch Doctor orders.

When I speak with customers who have aggressively implemented a virtual server infrastructure, 9 out of 10 will tell me that they underestimated the effect that virtualization would have on their backups and backup process.  When backup is not considered during the virtual server assessment and planning process, it can actually make virtualization less of the magic potion they had hoped for.  So what is the issue?  Backup is a virtualization bottleneck, and without addressing it, you may not be able to obtain the server consolidation ratios you had been expecting, which can have a negative effect on your virtual server TCO and ROI.

This is a timely discussion as VMworld has just concluded.  VMware users flocked to VMworld looking for best practices when it comes to implementing virtual server technology.  Because virtualization allows IT to reduce the overall physical hardware infrastructure, users will be looking at how to maximize their server consolidation ratios (get as many virtual servers on a physical server as they can and still provide good application performance).

I often hear that companies assess their environments by looking at the production applications on their physical servers, identifying their workloads and translating that into some consolidation ratio of physical servers to virtual servers.  I also hear, from these same customers, that backup was never taken into consideration during the assessment phase when trying to identify the best possible consolidation ratios.  These customers implement their new virtual server environments, install the backup agent they had previously been using for physical server backups, attempt to back up their virtual servers, and find that they can only protect 50% to 60% of the new environment.  Why?


A Data Protection Reference Architecture – The Final Chapter


The Architecture

This ‘architecture’ diagram, as you can see, is not a typical architecture diagram, but hopefully it can be used to align your business and business objectives with the technologies that are available and can best be applied to solve your issues, helping you balance cost, complexity and compliance.

This diagram can also be used to do a couple of other things.  It can help you begin to classify your data and align your data to your business objectives.  It also lets you begin to identify which data or data services in your environment may be more important to you than others and, based on this, helps you choose areas you may want to outsource or move to the cloud.

As you can tell, there really is not one solution for meeting all your data protection needs.  The challenge comes with managing multiple solutions in an effort to meet your business objectives.  While there are only a few technologies available that allow you to manage your environment across all your RPOs and RTOs, it is important that I point out EMC’s NetWorker is able to do this, centralizing your data protection infrastructure for ease of management.  It allows you to manage traditional backup, source based deduplicated backup with Avamar, CDP with RecoverPoint, as well as the EMC disk libraries and tape where the data is stored.  Now, I am not saying that NetWorker solves all of your data protection challenges, nor am I suggesting that replacing one traditional backup technology with another is the right answer.  What I am saying is that if you’re looking to have all the feature functionality required to meet all your business objectives and you want easier management, NetWorker is one avenue to get you there.  Additionally, the underlying image of the triangle represents data protection management.  Putting all the new technology in place is one thing; managing it, and ensuring you are now meeting your business needs, is another.  EMC's Data Protection Advisor can help here as well.


A Data Protection Reference Architecture – Part 4


Business Critical Applications

The tip of the triangle focuses on the applications (or data) that drive your business.  These are the applications that, should they go down for any length of time, cost you money.  The recovery of this information, in the event of a ‘disaster’, needs to be very fast (RTO in minutes) and the data can’t be very ‘old’ when it is recovered (short RPO, less than 24 hours).  Typically, the technologies used for these types of applications are replication (synchronous or asynchronous) or continuous data protection (CDP).  These technologies ensure that recovery at the alternate location is instant (or near instant) and / or give users the ability to pick the point in time they want to recover to, in order to ensure no data loss and to bring up the applications as quickly and accurately as possible.  This category, much like the rest of them, carries the same disclaimer: 'one size (product) does not fit all'.  The value of the data in this tier, and the risk to the business if this data is unavailable, drive the technology and spend in this part of the triangle.  Keep in mind that the right technology (don't choose CDP if you need an active remote file system) gives you the best recovery (RPO) for your business needs and can keep you on the Road to Recovery.
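One way to picture how the triangle lines data up with technology is a simple lookup from recovery objectives to a tier.  The sketch below is only an illustration: the function name and the cut-off values (60 minutes for a 'critical' RTO, 8 hours for disk versus tape, 180 idle days for archive) are assumptions made for the example, and real thresholds come from your own business objectives.

```python
def suggest_protection(rto_minutes, rpo_hours, idle_days=0,
                       stale_after_days=180, critical_rto_minutes=60,
                       disk_rto_hours=8):
    """Map recovery objectives to a tier of the triangle.

    All thresholds are hypothetical defaults, not recommendations.
    """
    # Base of the triangle: data that has not been touched in a long time.
    if idle_days >= stale_after_days:
        return "archive"
    # Tip of the triangle: RTO in minutes and an RPO shorter than 24 hours.
    if rto_minutes <= critical_rto_minutes and rpo_hours < 24:
        return "replication or CDP"
    # The middle: a 24 hour RPO is adequate, and the RTO decides the target.
    if rto_minutes <= disk_rto_hours * 60:
        return "backup to disk"
    return "backup to tape"


# Hypothetical workloads: (name, RTO in minutes, RPO in hours, days since last access)
workloads = [
    ("order database", 15, 1, 0),
    ("branch file server", 240, 24, 2),
    ("engineering scratch space", 2880, 24, 10),
    ("closed project share", 240, 24, 400),
]

for name, rto, rpo, idle in workloads:
    print(f"{name}: {suggest_protection(rto, rpo, idle_days=idle)}")
```

A lookup like this is deliberately crude.  Its value is in forcing an RTO and RPO to be written down for each data set before any product discussion starts.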


A Data Protection Reference Architecture – Part 3


The 'Fat Middle'

In the 'fat middle' of the triangle, as I stated last week, there are a number of ways to protect information.  I have chosen to break apart the middle into two categories.  The reality is, this is meant to be used as a tool for helping you lay out a strategy, so your boxes could be based on capacity and could end up in different areas of the triangle depending upon your business needs.  The thing to keep in mind is that it's not about your environment matching these boxes exactly; it's about making sure that all of the critical data that requires backup with a 24 hour RPO is protected.  You then align the data value in the box with the technology that 1) solves the challenge and 2) fits best in your environment.

SMB / ROBO

First, let me clarify my terminology.  ROBO is remote office / branch office and SMB is small to medium business.  If we think about the business needs that are most important in this arena, they are:

1)      Low cost

2)      Simplicity (one tool)

3)      24 hour RPO is adequate

Small and medium businesses, as well as remote offices, need a robust data protection solution that allows them to meet their backup windows and that has the ability to recover data that is not any older than 24 hours (RPO).  The RTO drives whether the backup target is disk or tape.   Faster recoveries come from disk.  Another thing to keep in mind is that there isn’t usually a lot of technical expertise at these sites so the backup application needs to be very simple to manage.

Backup appliances or appliance-like backup technologies tend to work very well in these environments.  A self-contained, disk-based backup appliance with the ability to replicate efficiently to another site for disaster protection is a great solution for sites like these.


A Data Protection Reference Architecture – Part 2


Archive

The most fundamental part of developing a good data protection architecture starts at the base of the triangle with Archive.  Archive is often an overlooked component of data protection - it’s not just for regulated businesses anymore.  Archive essentially gives users 100% data deduplication efficiency.  What I mean by this is that you have the ability to remove ‘stale’ data (and by 'stale' I don't mean unimportant data, I just mean data that is not accessed frequently) completely from your backup stream so you don’t continue to back it up.  Let’s face it: the two most important commodities in backup are time and capacity, and the two are interdependent.  The more capacity you have, the longer it takes to back up and the more money it costs to store.  The longer it takes you to back up, the less likely you are to be meeting your business objectives.  Data capacities aren’t shrinking, they are growing.  According to the latest IDC data, capacity is growing at a staggering pace of 65% year over year, and the digital pack rat in all of us is too afraid to get rid of anything, compromising backup windows and hence the business.  By archiving data that hasn’t been touched in some period of time and removing it from the backup stream, you can relieve some of the pressure on your backups and possibly not have to make any significant changes to your backup infrastructure.
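The arithmetic behind this is easy to sketch.  The example below uses the 65% year over year growth figure quoted above; the starting capacity, backup throughput and the fraction of data that is stale are hypothetical numbers picked only to show the effect.

```python
def projected_capacity_tb(current_tb, annual_growth, years):
    """Compound growth: capacity after the given number of years."""
    return current_tb * (1 + annual_growth) ** years


def backup_hours(capacity_tb, throughput_tb_per_hour, archived_fraction=0.0):
    """Time to back up whatever is left in the backup stream after archiving."""
    return capacity_tb * (1 - archived_fraction) / throughput_tb_per_hour


# Assumptions for illustration: 10 TB today, 2 TB/hour sustained backup
# throughput, and 40% of the data stale enough to archive.
for year in range(4):
    capacity = projected_capacity_tb(10, 0.65, year)
    without_archive = backup_hours(capacity, 2)
    with_archive = backup_hours(capacity, 2, archived_fraction=0.4)
    print(f"year {year}: {capacity:.1f} TB, "
          f"{without_archive:.1f} h without archive, {with_archive:.1f} h with archive")
```

Under these assumptions the full backup roughly quadruples in length within three years, and taking stale data out of the stream buys back about a year of growth without touching the backup infrastructure.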

Also, you don’t have to back up to a special purpose device or appliance for archive.  You can archive data to any file system.  Keep in mind, however, that you want to archive to a platform that can keep costs low.  Remember, this data is not unimportant, just not highly used.  Take into account your RTO and store the data on the most cost effective platform possible that also aligns to the business objectives.  This may be tape, it may be optical or it may be disk.  If it is disk, you want to store it on disk that is optimized for this type of data: optimized for capacity (deduplication, compression, single instancing), with low power and cooling costs, able to replicate for availability, and highly reliable.  You will also want to make sure that it is integrated to some extent with an application that lets you find the data pretty quickly when you need it and puts you further down the Road to Recovery.
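As a starting point for finding archive candidates on an ordinary file system, a small script can list files that have not been read or modified within some period.  This is a minimal sketch, not an archiving tool: the function name and the one year threshold are assumptions, and access times are not always reliable (some file systems are mounted so that they are updated rarely or never), so the script takes the later of the access and modification times.

```python
import time
from pathlib import Path

SECONDS_PER_DAY = 86400


def find_stale_files(root, max_idle_days, now=None):
    """Return files under root whose last access AND last modification
    are both older than max_idle_days. Nothing is moved or deleted."""
    now = time.time() if now is None else now
    cutoff = now - max_idle_days * SECONDS_PER_DAY
    stale = []
    for path in Path(root).rglob("*"):
        if not path.is_file():
            continue
        info = path.stat()
        last_used = max(info.st_atime, info.st_mtime)
        if last_used < cutoff:
            stale.append(path)
    return sorted(stale)


if __name__ == "__main__":
    # Hypothetical usage: report candidates under the current directory
    # that have been idle for more than a year.
    for candidate in find_stale_files(".", max_idle_days=365):
        print(candidate)
```

A report like this only tells you what could leave the backup stream.  Where the data goes, and how users find it again, is still the platform and application decision described above.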


Storage Switzerland


One of the more thoughtful analysts in the industry, in my opinion, is George Crump from Storage Switzerland.  (I like the name, and George is as independent as you can get in this business.)  Yesterday I had the pleasure of briefing George on EMC's Data Protection Vision.  I like talking with George for a couple of reasons.  First, he gets it.  What does that mean?  Read his material.  He is genuinely trying to educate IT folks on what is really important in the data center and how to address these challenges.  Next, he keeps the 'pay for', 'vendor spin' to a minimum.  George works hard to just talk about the facts of a product or industry and about how products can help, without selling.

The reality is, we live in a great technological time.  The problem with IT is that only 50% of the problems are technology related.  The other 50% is psychological.  IT can't just implement new technology because it's cool or even because it really does solve a problem.  Sometimes new technology is too expensive to implement, or the solution that is currently in place had a three year amortization and you're only two years into your product life.  Or, more importantly, the new technology may be the greatest technology at the right price but it doesn't fit into the current IT priorities.  These are all things IT needs to work through when considering whether or not to invest in new technology.

The other thing George and I spoke about was the fact that it gets difficult to be 'strategic' in IT, especially given certain economic times.  A lot of times IT just needs a Band-Aid or quick fix to move on to more important issues that really drive the business.  I talk about this a lot, especially when it comes to backup.  Let's face it, it may not be what we all want to hear, but backup is not strategic to most environments.  The applications that drive the business are most important.  Backup is about risk mitigation and information availability if everything else fails.  Right, 'if everything else fails', and IT typically invests in technology on the front end in an effort to have as little failure as possible.  Meaning, IT doesn't just buy JBOD with no RAID if they think the environment shouldn't be put at that kind of risk.  So IT is already investing in some risk management up front, which drives the spend on the back end for data protection.
