Tag: "EMC"

Confessions of an ex-EMC Blogger


It is an interesting time we live in.  In a world where high-tech meets social networking, things can run on the hairy edge of information leakage or brand management, especially in a public company.  However, during 2008 and 2009, when big companies were trying to figure out what to do within the ‘social media’ fray, I was working at EMC, and EMC did a fantastic job of embracing social media and using it to drive a number of very positive initiatives.  So much so that I believe in August of last year they won an award (or were at least publicly recognized) for their use of social media.  I have to commend Polly Pearson for this.  One would think that driving a brand with no fewer than 20 bloggers (probably more), among them the likes of Chuck Hollis, Barry Burke and StorageZilla, all of whom tweet as well, would take quite a bit of corralling.  Interestingly though, it didn’t.  The main reason: trust.

Each person at the company who blogged took that ‘role’ very seriously.  Each person I knew who blogged wanted to not only be the top EMC blogger, but the top blogger in their respective area of expertise.  EMC bloggers are very smart people and have a desire to be the best at what they do.  EMC bloggers have driven some of the most authentic and original blogs with great thought leadership in the storage industry.  It is because of the desire to deliver great quality content that they lived by a set of rules that anyone who worked for a public company would adhere to.

1)      Don’t divulge any company secrets – which is a part of your employee agreement anyway

2)      Don’t say things that are untrue or could get you in trouble in the future

3)      Deliver great content

And if there was ever a question, there were always folks internally you could bounce your thoughts and ideas off of before posting.  It was these reasons, as well as trust, that propelled EMC to the top of the high-tech social media ladder.


Storage Tiers – Take 3


 I find myself in a true quandary.  First, I have true admiration for my good friend and fellow blogger 3Par Farley and never feel comfortable being on the other side of the coin from him.  Second, I find myself agreeing, to a degree, with Jon Toigo (who still uses crazy permalinks and considers Novell a serious storage player.  What is up with that?).

I’m sure by now most of you have read about the recent furor over Tom Georgens’ comments about the future of storage tiering.  A number of folks who have ‘tiering’ technology reacted with disdain (see a list on Storagerap).  Some wondered how a storage visionary like Tom could turn his back on technology that helps people save money in storage.  Some even suggested that this is just marketing to overcome a deficiency in the NetApp product line.  However, one applauded Tom for understanding how the real world deploys storage.  All good points, but I have my own theory on storage tiering...

I want to come right out and say I think that storage tiering is an incredibly smart concept.  (Now that that is off the table…) I would also say that, much like the prediction that tape is ‘dead’ (I guess Data Domain didn’t get that memo), storage tiering can’t be dead because, in reality, it was never actually alive, nor do I think it will be for a very long time.  Let’s look at the facts:

First, HSM never really went anywhere.  There is no mass adoption of HSM technology.  Second, tiering is not a technology issue.  Humans are lazy.  What do I mean?  HSM / tiering, or whatever you want to call it, depends on policy.  IT can’t get any two groups in a company to agree on anything other than that storage is too expensive.  When I speak to well-respected people in the IT ‘real world’ (my dad), they tell me it is too difficult to get organizations to agree on when data can be archived in order to save money (and that is what this is all about, really).  Finally, IT processes get in the way of a good tiering strategy.  Getting data to go one way is easy: move data to cheaper and cheaper tiers of storage until it vanishes.  Try getting it back.  That takes a lot of management tools and integration and costs just as much as doing nothing.


The Myths about Compression and Data Deduplication


 How many of you have heard that compression and deduplication just don’t belong together?  Like oil and water.  I know from experience: when I worked for EMC, the Avamar sales reps and the Data Domain sales reps would tell customers who had encrypted or compressed primary data that the best thing to do was to uncompress it, in order to get the savings in their backups that deduplication promises.

This is wrong on a number of levels.  First, telling a customer not to compress primary storage data just to get downstream benefits is, by its sheer nature, counterintuitive.  Second, if the customer has already changed their processes to accommodate compressed primary data, then the deduplication backup vendor is asking them to change those processes yet again.  Not to mention that it costs the customer more money in primary storage and undermines the decision the customer made to compress the data in the first place.  If you really want to insult your customer, tell them the decision they made to save money was a bad one.  Finally, all data deduplication technologies utilize LZ compression on their data ‘chunks’ to further reduce their data size, and then use this added compression benefit to talk about their deduplication ratios.

The reality is, with traditional compression implementations, the effects of deduplication are not significantly realized.  The reason is how traditional compression writes the files it compresses.  If a file is changed, then from the point of the change through the rest of the file, the new compressed file is essentially a new file.  When deduplication (even variable block deduplication) looks at this file and finds the initial changed blocks, the rest of the file will also be different and the deduplication ratios will be significantly reduced.  (Essentially it turns the highly effective ‘variable block’ deduplication into ‘fixed block’ deduplication, and research shows that fixed block deduplication is 3 to 5 times less efficient than variable block deduplication.  Now that you’ve spent all that money on an expensive variable block solution, are you really getting the benefits?)
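To make the point concrete, here is a minimal sketch, not any vendor's implementation.  It assumes simple fixed-size chunking with SHA-256 fingerprints (real products use more sophisticated, often variable-size, chunking) and uses Python's zlib as a stand-in for "traditional" compression.  The file contents, the chunk size and the helper names (`chunk_hashes`, `shared_fraction`) are made up for illustration.  It overwrites a single byte in a file and compares how many chunks a deduplicating target would already have, with and without compression in front of it.

```python
import hashlib
import random
import zlib

CHUNK = 4096  # fixed chunk size in bytes (an arbitrary choice for this sketch)


def chunk_hashes(data: bytes) -> list[str]:
    """Split data into fixed-size chunks and return the SHA-256 digest of each."""
    return [hashlib.sha256(data[i:i + CHUNK]).hexdigest()
            for i in range(0, len(data), CHUNK)]


def shared_fraction(old: bytes, new: bytes) -> float:
    """Fraction of chunks in `new` that already exist somewhere in `old`."""
    known = set(chunk_hashes(old))
    new_hashes = chunk_hashes(new)
    return sum(h in known for h in new_hashes) / len(new_hashes)


# Build a compressible, deterministic test file from a small made-up vocabulary.
rng = random.Random(42)
words = ["backup", "restore", "tape", "disk", "archive", "tier", "dedupe", "block"]
original = " ".join(rng.choice(words) for _ in range(60_000)).encode("ascii")

# Overwrite a single byte near the start of the file (an in-place edit).
edited = bytearray(original)
edited[100] ^= 0x01
edited = bytes(edited)

raw_ratio = shared_fraction(original, edited)
comp_ratio = shared_fraction(zlib.compress(original), zlib.compress(edited))

print(f"uncompressed: {raw_ratio:.1%} of chunks already stored")
print(f"compressed:   {comp_ratio:.1%} of chunks already stored")
```

On the uncompressed copies only the one chunk containing the edit is new.  On the compressed copies the output typically diverges from the edit onward, so far fewer chunks (often none) line up with what is already stored.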


Enterprise Data Protection at the Edge


What does that really mean?  When I worked for Veritas, back in 1998 we acquired a company based out of Canada called TeleBackup that backed up desktops / laptops.  In 1999 Veritas acquired Seagate Software's storage management business and with it the Backup Exec product, which also had a desktop / laptop option.  These products were meant to eventually be integrated into the main backup applications but never were.  Additionally, a lot of that software was given away (hard to make a business on that) and, for the most part, lived on a shelf somewhere and was never installed.

In 2004 I worked for Connected Corporate (acquired by Iron Mountain), whose sole business was desktop / laptop backup.  (In fact, from 2000 to 2004 I worked as an analyst for ESG covering all the vendors in the backup space and used the Connected product to back up my work laptop, and it actually saved my hide once.)  While the company executed a successful exit, the business was (and probably still is) only about a $20M to $40M business.

Why do I bring this up?  There is a new reality in IT these days.  I have said it before: IT is accountable for 100% of the data created in any company, including data stored on desktops / laptops.  This means that not only does IT have to provide a location to store this data, it also needs to provide tools to protect this information and ensure that it is highly recoverable, both for business productivity purposes and for corporate and legal governance.  This means that desktop / laptop backup is now gaining a lot more visibility in the enterprise.

However, desktop / laptop data protection is one of those areas in IT that is just a nuisance because it seems like it should be an easy problem to solve, but there are so many moving parts to it that it ends up falling by the wayside.

A successful desktop / laptop backup technology needs three very specific capabilities:

  • Integrate seamlessly with the existing backup solution in the enterprise

Comprehensive Capacity Optimization – Deduplication 2.0


Technology is great isn't it?  When someone thinks they have a new idea on the same old technology foundation they call it "X 2.0".  I have been watching the banter between analysts and vendors (specifically NTAP’s Dr. Dedupe and Permabit’s CEO Tom Cook) on the topic of Deduplication 2.0 and it is my belief that the proverbial boat is being missed (since we are using water analogies).  I have been watching these guys hash it out for the past few weeks and decided I have to jump in.  I find the real value to these conversations is the value to the end user.  At the end of the day, it doesn't really matter who 'coined' or 'invented' a term (like deduplication 2.0) but what does matter is if  the term actually helps describe a technology and how that technology can be leveraged to make things better in the data center.  We should focus on the implications of this new generation of deduplication - ‘deduplication 2.0’.

In May I delivered a presentation to a number of EMC customers on the topic of Data Deduplication 2.0 - Comprehensive Capacity Optimization.  The point of my presentation was simple (and keep in mind this was before the Data Domain acquisition): there are a number of capacity optimization technologies/capabilities available to customers today.  Originally these deduplication technologies were used primarily for backup purposes, but slowly deduplication is making its way into primary storage.  Deduplication in primary storage makes a lot of sense FOR DATA THAT IS STATIC.  Why only static data?  Static data is data that isn't used frequently (that doesn't mean it's not important, it simply is not accessed often); because access to this data is infrequent, its performance requirements are lower than those of active data.  Remember: nothing in IT is free.  If I deduplicate data, then in order to use it I must ‘rehydrate’ it, and that has a performance implication, so I want to be careful where I deduplicate data so as not to inhibit performance on production data.
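For readers who have not looked under the hood, the sketch below shows roughly where that ‘rehydration’ cost comes from.  It is a toy model under stated assumptions, not how any shipping product works: the `DedupStore` class, its fixed 4 KB chunks and the sample data are all hypothetical.  Each unique chunk is stored once, each file becomes a list of chunk fingerprints, and every read has to look those fingerprints up again before the application sees its bytes.

```python
import hashlib


class DedupStore:
    """Toy chunk store: unique chunks are kept once, files are lists of chunk hashes."""

    def __init__(self, chunk_size: int = 4096):
        self.chunk_size = chunk_size
        self.chunks: dict[str, bytes] = {}       # hash -> chunk bytes, stored once
        self.recipes: dict[str, list[str]] = {}  # file name -> ordered chunk hashes

    def write(self, name: str, data: bytes) -> None:
        recipe = []
        for i in range(0, len(data), self.chunk_size):
            chunk = data[i:i + self.chunk_size]
            digest = hashlib.sha256(chunk).hexdigest()
            self.chunks.setdefault(digest, chunk)  # only new chunks consume space
            recipe.append(digest)
        self.recipes[name] = recipe

    def read(self, name: str) -> bytes:
        # "Rehydration": one index lookup per chunk before the original
        # bytes can be handed back to the application.
        return b"".join(self.chunks[d] for d in self.recipes[name])

    def stored_bytes(self) -> int:
        return sum(len(c) for c in self.chunks.values())


store = DedupStore()
report_v1 = b"A" * 8192 + b"B" * 4096
report_v2 = b"A" * 8192 + b"C" * 4096  # shares its leading data with v1
store.write("report_v1", report_v1)
store.write("report_v2", report_v2)

logical = len(report_v1) + len(report_v2)
print(f"logical bytes: {logical}, physically stored: {store.stored_bytes()}")
```

The write path saves capacity; the read path pays for it with extra lookups, which is tolerable for static data and much less so for busy production data.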


A Data Protection Reference Architecture – The Final Chapter


The Architecture

This ‘architecture’ diagram, as you can see, is not a typical architecture diagram, but hopefully it can be used to align your business and business objectives with the technologies that are available and can best be applied to solve your issues, helping to balance cost, complexity and compliance.

This diagram can also be used to do a couple of other things.  It can help you begin to classify your data and align your data to your business objectives.  It also lets you begin to identify which data or data services in your environment may be more important to you than others and, based on this, helps you choose areas you may want to outsource or move to the cloud.

As you can tell, there really is not one solution for meeting all your data protection needs.  The challenge comes with managing multiple solutions in an effort to meet your business objectives.  While there are only a few technologies available that allow you to manage your environment across all your RPOs and RTOs, it is important that I point out that EMC’s NetWorker is able to do this, centralizing your data protection infrastructure for ease of management.  It allows you to manage traditional backup, source-based deduplicated backup with Avamar, CDP with RecoverPoint, as well as the EMC disk libraries and tape where the data is stored.  Now, I am not saying that NetWorker solves all of your data protection challenges, nor am I suggesting that replacing one traditional backup technology with another is the right answer, but what I am saying is that if you’re looking to have all the functionality required to meet all your business objectives and you want easier management, NetWorker is one avenue to get you there.  Additionally, the underlying image of the triangle represents data protection management.  Putting all the new technology in place is one thing; managing it, and ensuring you are now meeting your business needs, is another.  EMC's Data Protection Advisor can help here as well.


Storage Switzerland


One of the more thoughtful analysts in the industry, in my opinion, is George Crump from Storage Switzerland.  (I like the name and George is as independent as you can get in this business.)  Yesterday I had the pleasure of briefing George on EMC's Data Protection Vision.  I like talking with George for a couple of reasons.  First, he gets it.  What does that mean?  Read his material.  He is genuinely trying to educate IT folks on what is really important in the data center and how to address these challenges.  Next, he keeps the 'pay for', 'vendor spin' to a minimum.  George works hard to just talk about the facts of a product or industry and talk about how products can help without selling.  The reality is, we live in a great technological time.  The problem with IT is that only 50% of the problems are technology related.  The other 50% is psychological.  IT can't just implement new technology because it's cool or even because it really does solve a problem.  Sometimes new technology is too expensive to implement, or the solution that is currently in place had a three year amortization and you're only two years into your product life.  Or, more importantly, the new technology may be the greatest technology at the right price but it doesn't fit into the current IT priorities.  These are all things IT needs to work through when considering whether or not to invest in new technology.  The other thing George and I spoke about was the fact that it gets difficult to be 'strategic' in IT, especially given certain economic times.  A lot of times IT just needs a Band-Aid or quick fix to move on to more important issues that really drive the business.  I talk about this a lot, especially when it comes to backup.  Let's face it, it may not be what we all want to hear, but backup is not strategic to most environments.  The applications that drive the business are most important.  Backup is about risk mitigation and information availability if everything else fails.  Right, 'if everything else fails', and IT typically invests in technology in the front end in an effort to have as little failure as possible.  Meaning, IT doesn't just buy JBOD with no RAID if they think the environment shouldn't be put at that kind of risk.  So IT is already investing in some risk management up front, which drives the spend on the back end for data protection.


No More Tiers / Tears


The great thing about blogging and independence is that we can post things that add value that we want to share as long as we give the proper recognition.  One of my colleagues, Mike Dutch from the CTO office of SSG and long time SNIA member had some thoughts as it pertained to storage tiering that were insightful  so together we decided to share this post.  I hope you enjoy it.

I'm guessing that many people define a storage tier by its particular storage technology (like SATA). While this may be a useful working definition it obscures the essential notion of what a storage tier really is and leads to confusion when a new technology like data deduplication comes around.  A precise definition may also lead to some interesting innovations if we were to take a slightly different path.

Should deduplicated storage be considered a storage tier?  I would say “no” and here's why: a technology such as deduplication can span, and optimize across, all tiers.

A storage tier is storage space that has availability, performance, and cost characteristics different enough from other storage tiers as to economically justify the movement of data between it and other storage tiers based on the importance (value, performance need etc…) of the data. While storage tiers are often thought of as being tied to a particular type of hardware, e.g., Flash, FC, SAS, SATA, VTL, PTL, COM (Computer Output Microfiche), or even paper, this is not necessarily the case. For example, highly available cloud or network-based virtual disks could leverage multiple technologies within their single tier.  Since a variety of technologies can be used to provide a particular storage service level, you should not think of a specific technology as a specific storage tier, but should instead evaluate what technology, or combination of technologies, would deliver the availability-performance-cost point that you need for a given tier.  "SATA" is not a storage tier, it just happens to be one "technology-set" that can deliver for a single storage tier.
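One way to picture this definition is to describe each tier only by its service level and let placement follow from what the data needs.  The sketch below is purely illustrative: the `StorageTier` fields, the tier names and every number in `TIERS` are hypothetical placeholders, not measurements or recommendations, and the hardware behind each tier is deliberately left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StorageTier:
    name: str
    availability: float  # fraction of time the data is reachable
    latency_ms: float    # typical access latency
    cost_per_gb: float   # relative cost, arbitrary units


# Hypothetical service levels; the numbers are placeholders, not measurements.
TIERS = [
    StorageTier("tier-1", availability=0.9999, latency_ms=1.0, cost_per_gb=10.0),
    StorageTier("tier-2", availability=0.999, latency_ms=10.0, cost_per_gb=3.0),
    StorageTier("tier-3", availability=0.99, latency_ms=5000.0, cost_per_gb=0.5),
]


def cheapest_tier(min_availability: float, max_latency_ms: float) -> StorageTier:
    """Return the lowest-cost tier that still meets the data's requirements."""
    candidates = [t for t in TIERS
                  if t.availability >= min_availability
                  and t.latency_ms <= max_latency_ms]
    if not candidates:
        raise ValueError("no tier satisfies the requested service level")
    return min(candidates, key=lambda t: t.cost_per_gb)


# Data that needs "three nines" and sub-50 ms access.
choice = cheapest_tier(min_availability=0.999, max_latency_ms=50.0)
print(choice.name)
```

Nothing in the selection logic mentions SATA, Flash or tape; any technology set that hits the availability-performance-cost point could sit behind a given tier.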


A Data Protection Tribute to Michael Jackson


I was walking through the data center the other day when I heard one of my colleagues, MJ, “Scream”, “I wish I had some ‘Morphine’”.  Well, I have to say I was “Speechless”.  I walked over to where MJ was standing, near the tape library, and when I asked him what was wrong, he replied, “There was another backup tape ‘Jam’.”  MJ told me he had been “Working Day and Night” on a major backup problem and he was now bouncing “Off The Wall”.  He told me he was sick of dealing with traditional backup tools and just wanted to get rid of tape.  I told MJ that it was “Human Nature” to feel “Bad” in a time like this but I also told him, “You Are Not Alone”.  I said, “MJ, ‘Keep The Faith’, we all ‘Remember The Time’ when backups ran like a ‘Speed Demon’ and were ‘Unbreakable’, but that is ‘HIStory’; tape isn’t that fast any more given the amount of data we now have.”  I also told him, “We are Backup Administrators, we are ‘Invincible’ and ‘Heaven Can Wait’ for us, and while we may not have our issue fixed at the ‘Break Of Dawn’, we would ‘Come Together’ to ‘Heal The World’, or at least the datacenter” (I chuckled).  I proceeded to tell him about a revolutionary new backup concept utilizing source-based deduplication technology.  It’s “PYT”, a pretty young thing, but more importantly it’s here to stay.  EMC offers it with a product called Avamar, the most efficient variable block, source-based deduplication technology on the market that:

  • Helps to eliminate tape altogether
  • Is perfect for VMware environments
  • Protects remote offices most efficiently
  • Stems the tide of data growth on NAS platforms

What Happened in Vegas, Stayed in Vegas


Well, until now.  This is an interesting story about archiving and how it could have, but didn't, help a friend of mine.

Often, when speaking with customers, I talk to them about the 4 fundamental principles of data protection:

  1. Assess
  2. Archive
  3. Backup
  4. Manage

The assessment phase is a multi-dimensional phase.  It's about people, process and technology.  As with most things, the technology piece is the easy piece.  EMC has tools that allow us to scan file systems, databases and email systems and that report back a litany of information including but not limited to:

  • Number of files
  • Age of files
  • Volume of data
  • Owner of the data
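To give a feel for the file system part of such a scan, here is a rough sketch.  It is not EMC's tooling, just a hypothetical `assess` function built on Python's standard `os.walk` and `os.stat`.  It gathers the four items listed above, with two stated assumptions: file age is taken from the last-modified time, and the owner is reported as the numeric user id.

```python
import os
import time
from collections import Counter


def assess(root: str) -> dict:
    """Walk a directory tree and summarize file count, volume, age, and owners."""
    now = time.time()
    file_count = 0
    total_bytes = 0
    oldest_days = 0.0
    bytes_by_owner = Counter()  # keyed by numeric user id (st_uid)

    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                info = os.stat(path)
            except OSError:
                continue  # skip files that vanish or cannot be read
            file_count += 1
            total_bytes += info.st_size
            # Age is measured from the last modification time (an assumption).
            age_days = max(0.0, (now - info.st_mtime) / 86400)
            oldest_days = max(oldest_days, age_days)
            bytes_by_owner[info.st_uid] += info.st_size

    return {
        "file_count": file_count,
        "total_bytes": total_bytes,
        "oldest_days": oldest_days,
        "bytes_by_owner": dict(bytes_by_owner),
    }
```

Calling `assess("/some/share")` (the path is a placeholder) returns a dictionary that could feed the conversation with line of business managers about what the data is actually worth.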

Once EMC passes the information to the customer about their data, the real hard work begins.  Armed with the information, IT now has to go and speak to line of business managers in order to determine the value of the data, and how data of a specific value needs to be managed and protected.  The problem is line of business managers want everything saved forever, until IT tells them what the bill would be.  IT begins to describe the different 'classes' of service capabilities and line of business managers, who don't really care about the details (not because they don't care, they are just too busy), finally say "Just give me the highest level of protection I can get for the least amount of money."  IT now does the best they can to align their perceived value of the data, to the most appropriate backup and archive capabilities they have.

Now, in Vegas, I think we can all agree that the video surveillance has a ton of value to the stakeholders of the hotels and casinos.  Given the amount of debauchery that takes place in Vegas and the amount of money that is 'rolling' around, it is important to 'know what is going on' and to make sure all situations can be handled as efficiently as possible.  This is where video surveillance comes into play, and the more you 'save' on high speed disk, the easier it is to get to the truth or solve the mystery.
