Category: Data Deduplication

Storage Efficiency Spotlight at VMworld


VMworld Live 2011
Via: Wikibon


Efficiency vs. Optimization


“Storage Efficiency” has become a big topic over the past 12 months.  There are a number of new technologies that have come out in the last few years that are helping to deal with storage growth.  We all know that data is the root of the decisions that drive business today.  The more data you have, hopefully, the better decisions you can make to drive your business to success.  The question is, “what is the value (and hence the cost) of the infrastructure to create that success?”  What we do know is that the ability to put more data in a highly efficient footprint can give your company a competitive edge.  There are five technologies that can help an IT organization create an efficient storage infrastructure.  These are:

 

1)      Tiering

2)      Virtualization

3)      Thin Provisioning

4)      Compression

5)      Deduplication

It is also important to point out that there are some semantics when talking about storage efficiency, specifically between efficiency and optimization technologies.  I think it is useful to attempt to define these as they lead us to picking the right solutions for what we are trying to accomplish.  For the purpose of this post, efficiency will relate to making existing capacity more useful and optimization will mean making more capacity out of existing capacity.

Using these definitions, technologies such as Tiering, Virtualization and Thin Provisioning are efficiency technologies.  These technologies help to utilize the existing capacity that you have.

Tiering is technology that is used on about 10% of your data or less.  It is used to move data that requires higher performance to flash storage.  Good tiering technology analyzes data access patterns and moves the most active data to the highest performing disk.  It doesn’t really change the amount of physical capacity that is required; it just changes what type of capacity is required and allows IT to make sure data is operating as fast and efficiently as possible.
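To make the tiering idea concrete, here is a minimal sketch of picking the most active data for flash. The function name, the per-block access counts and the 10% default are illustrative assumptions, not how any particular array implements tiering.

```python
def select_flash_tier(access_counts, hot_fraction=0.10):
    """Return the IDs of the most frequently accessed blocks.

    access_counts: dict mapping a (hypothetical) block ID to its recent access count.
    hot_fraction: share of blocks to place on flash (assumed ~10%, per the text above).
    """
    if not access_counts:
        return set()
    # Always promote at least one block so tiny inputs still get a hot tier.
    hot_count = max(1, int(len(access_counts) * hot_fraction))
    # Rank blocks from most to least active; ties are broken by block ID.
    ranked = sorted(access_counts, key=lambda b: (-access_counts[b], b))
    return set(ranked[:hot_count])


# Twenty blocks whose access count equals their index: blk19 is the busiest.
counts = {f"blk{i}": i for i in range(20)}
flash = select_flash_tier(counts)
print(sorted(flash))
```

Note that the total number of blocks never changes here; only their placement does, which is why tiering counts as an efficiency technology rather than an optimization technology.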


Data Protection, Retention and Archive Starts with Data Value


 It feels good to open up the blogging again to new topics, especially ones I am intimately familiar with.  (But have no fear, there will be references to primary storage optimization / compression.)

This weekend I had an interesting conversation with my Dad.  We were discussing backup.  My dad basically runs IT for the State of Maine.  The State of Maine uses CommVault backup software.  So I posed the question to him, “What would it take for you to rip out CommVault and replace it with another solution?”  He thought about it for a moment and replied, “I wouldn’t.”  His answer came down to a couple of reasons.

First was the expense.  It’s not just about buying the new software.  It would also mean training people to run the new software, throwing away the massive investment they have in their existing product, and converting all the years of backup tapes created with one software to the new software.  This is one of the biggest things vendors forget when trying to sell a customer on their backup software.

Second was the fact that, feature for feature, the top 5 traditional backup software products are not really that different from one another.  Sure, I agree that some products have features that others don’t, and other products have features that work better than others, but in reality the delta is so small and the workarounds are so simple that it doesn’t really matter.  Unless you’re replacing traditional backup software with an evolutionary source-based data deduplication software (which is only applicable for some environments), there is no advantage to switching software.

The challenge is this: if Data Protection is still one of the biggest and most expensive pain points within IT, how do the problems get resolved when the software controlling it all is too costly to replace?


Real-time Compression “Meets Minimum”


IBM's Ed Walsh, Director of Storage Efficiency, sits down with Steve Duplessie, Founder of ESG, to talk about how IBM Real-time Compression sets the bar for doing storage optimization in NAS. At the end of the day, if you can do compression in real time without sacrificing performance or the transparency of the implementation, then why wouldn't you, given the savings you can get over traditional compression?

We all know compression is not new, and it is becoming a standard feature in a number of storage systems. The issue is that each of these technologies has a significant impact on performance: both primary storage performance and the performance of all the back-end operations such as backups, replication, etc.

IBM's Real-time Compression doesn't have any of these limitations - listen to Ed to hear more.


Key Competitive Advantages to IBM Real-time Compression


It still baffles me that so much information is available for people to learn about any topic and yet it goes unused.  Many times people just rely on the information provided by their employer (which in many cases is just competitive FUD).  This video was the result of reading an email between IBM and one of their key partners on the competitive knowledge of each other's products.


Storage Alchemist Video Update #2


See how data deduplication and IBM Real-time Compression work hand in hand.


LinkedIn Storage Discussion on Storage Efficiency


Great conversation on LinkedIn about deduplication and compression for storage efficiency in the Data Storage Professionals Group.  Help the storage community answer this question:

Does anyone have any experience with NAS de-duplication at the filesystem level, like NetApp's? Does it really work? Any concerns/limitations?


Top 10 Reasons Real-time Compression Provides Extraordinary Storage Efficiency


Over the past few weeks I have witnessed the proverbial mudslinging that takes place in the blogosphere when marketing feathers are ruffled.  Most recently I was reading Rich Anderson of The StorageSavvy Blog.  The article was "Compression better than Dedup?  NetApp Confirms!"

I have to agree with Rich on many fronts.  First, "When all you have is a hammer, everything is a nail."  Rich points out that vendors have to sell "what's in the bag," so it is conceivable that all problems look like they can be solved with their solution.  If you look back over the last few years, NTAP has always had a "me too" reputation.  Whatever the industry has, they have one too and it's better.  For the last few years, while competing against Storwize, they have pulled the EMC tactic of trying to stall a market by saying, "We have optimization for primary storage with deduplication."  The reality is, you can't use it in real time, it is a resource hog, and, as Rich mentions, the only use case it works well on in primary storage is VMware (and that is ONLY IF the customer stores their data outside the .vmdk file; otherwise compression is much better).  Now that NTAP has compression, their story has changed to saying that compression on primary storage is better for most use cases.  Duh!  The folks at Storwize (now IBM Real-time Compression) have been saying that for years.  Why?  Deduplication is great for repetitive data sets, i.e. backup, not primary storage.  There just isn't that much repetitive data in primary storage.  Again, NTAP is trying to stall the market by saying they have "in-line" compression for primary storage.  Sorry guys, not good enough.  In-line is NOT Real-time.  Rich also points out that the key characteristics of storage for customers are capacity and performance.  Patrick Rogers of NTAP has said publicly that compression WILL indeed impact performance and that they even have a tool that will tell you how much performance will be impacted.  While NTAP may say compression is "free", we all know nothing worth having in life is free; you get what you pay for.
If you need more performance to do compression, you are going to have to perform a major upgrade to your filer just to be able to perform compression, let alone try to do compression in real time.  No real savings there.
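To illustrate the difference between repetitive (backup-like) and non-repetitive (primary-like) data, here is a rough sketch comparing fixed-block deduplication with plain zlib compression. Everything in it is an assumption made for illustration: the 4 KB block size, the synthetic spreadsheet-like rows, and the modelling of "backup" as five identical full copies. It is not how NetApp's or IBM's products work internally.

```python
import hashlib
import zlib

BLOCK_SIZE = 4096  # hypothetical fixed block size for the dedup sketch


def dedup_size(data, block_size=BLOCK_SIZE):
    """Bytes stored if every distinct fixed-size block is kept only once."""
    unique = {}
    for start in range(0, len(data), block_size):
        block = data[start:start + block_size]
        # Identical blocks hash to the same key and are stored a single time.
        unique[hashlib.sha256(block).digest()] = len(block)
    return sum(unique.values())


def compressed_size(data):
    """Bytes stored after general-purpose (zlib/DEFLATE) compression."""
    return len(zlib.compress(data))


# Primary-like data: spreadsheet-style rows, every block different but compressible.
rows = b"".join(b"row,%d,amount,%d\n" % (i, i * 3) for i in range(4000))
primary_like = rows[: (len(rows) // BLOCK_SIZE) * BLOCK_SIZE]  # align to blocks

# Backup-like data: five full copies of the same unchanged data set.
backup_like = primary_like * 5

for name, data in (("primary-like", primary_like), ("backup-like", backup_like)):
    print(name, len(data), dedup_size(data), compressed_size(data))
```

In this toy setup deduplication finds nothing to remove in the primary-like data, because no two blocks are identical, while compression still shrinks it. On the backup-like data, deduplication collapses the five copies down to one.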


The Storage Network


With the impending name change to the "Storwize" product, the marketing folks at the old "Storwize" are at it again with their "viral video" campaign.  Not sure how many of you have seen the movie or even the trailer for "The Social Network", which grossed $23M in the US, bringing it to #1 at the box office last week.  It's a story of a guy who started in college with an idea and turned it into something big.  Much like Storwize: an idea that started with only a few in Israel and has now been acquired by IBM for multi millions of dollars and will become a key part of IBM's overall "Storage Efficiency" strategy.  This new trailer, "The Storage Network", highlights too many realities of today's data management issues.  Hope you enjoy it.

Video created by MediaBoss Studios

(BTW: In case you didn't get it, Storwize is now IBM Real-time Compression!)


Disk Elasticity and Storage Efficiency


Storage is elastic.  How do I know, you ask?  Yesterday I visited a customer who is using the Storwize product to do Real-time Compression on their primary storage.  The customer is Allianz, and they have been using the product for over a year.  They see 75% compression on their users' home directory data.  To give you an idea, Allianz is an insurance company and generates TONS of spreadsheets, 14TB worth of spreadsheets (okay, not all 14TB is spreadsheets but you get the picture).
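As a back-of-the-envelope illustration of what 75% compression means, the sketch below assumes the 14TB figure is the logical (uncompressed) size and that "75% compression" means a 75% reduction in stored bytes. Both are assumptions, as the post does not spell them out, and the function names are made up for this example.

```python
def physical_tb(logical_tb, savings):
    """Capacity actually consumed after a fractional reduction in size."""
    return logical_tb * (1.0 - savings)


def compression_ratio(savings):
    """Express a fractional saving as an N:1 ratio."""
    return 1.0 / (1.0 - savings)


logical = 14.0   # TB of home directory data (assumed to be the logical size)
savings = 0.75   # 75% compression (assumed to mean a 75% reduction)

print(physical_tb(logical, savings))   # TB on disk
print(compression_ratio(savings))      # N in "N:1"
```

Under those assumptions, 14TB of data occupies 3.5TB of disk, which is a 4:1 ratio.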

Prior to purchasing the Storwize technology, Allianz didn’t have great data management practices.  Users store data in their home directories and there is really no discipline around deleting or cleaning up files, so data just grows.  Additionally, storage isn’t really budgeted for.  IT overall is, but at the storage level they just purchase some when they need some.

Again, prior to the Storwize technology, Allianz had their primary storage and a backup to tape at their local site.  They then replicated the data to their remote site and also performed a backup to tape.

Allianz has an overall IT mission to reduce spend by 10% per year.  The thing to think about is that this 10% could come from a lot of places including data management.

Once the Storwize technology was installed the first things they saw were:

  • 75% capacity optimization
  • Better data management capabilities through Storwize reporting
  • The ability to keep more data on line and available for faster recoveries
  • No change in any of their existing storage processes