Tag: "Permabit"

Storage Efficiency Panel – SNW 2011 Fall


Yesterday I was on a panel at SNW in Orlando, Florida.  The panel was hosted by Dave Vellante, founder of Wikibon and always a great host for these kinds of things.  On the panel were Larry Freeman of NetApp, Craig Nunes of HP (formerly 3PAR), Jered Floyd, CTO and founder of Permabit, and myself from IBM (formerly Storwize).

Some interesting data came out of this panel.  It was a well-attended session, with probably over 150 people in the audience, and Dave is VERY good about asking the audience questions.  Let me start by making sure we all know where each panelist sits at the “storage efficiency table.”

  • Larry Freeman is from NetApp – they claim, and I believe them, that they have 10 storage efficiency technologies that are embedded into WAFL
  • Craig Nunes’s main focus on the panel was ‘zero reclamation’ to optimize storage
  • I have a Real-time Compression drum I am beating
  • Jered Floyd focuses on data deduplication

Here are the questions Dave put to the audience and the responses he got (percentages are close estimates):

  • How many people use deduplication / compression in their storage?  About 60% responded that they use one or both of these technologies in their environment.
  • How do users deploy these technologies, embedded or appliance?  100% of the 60% said “embedded.”
  • Which storage vendor provided these technologies?  100% of the 60% said NTAP.
  • What is the number 1 issue with the embedded solution that keeps it from being more widely adopted?  Performance was the answer.  They all believed that for 70% of their applications the embedded solution was “good enough,” but for the 30% where performance is critical it couldn’t do the job.
  • Why are more appliances not deployed to solve the performance issues?  The response was that customers didn’t want to have to manage multiple solutions in their environment doing the same thing.

A Blueprint for Primary Storage Optimization


During the past three to four months the storage industry has seen a spike in the number of reports, white papers and news articles surrounding the evolution of a primary storage technology: capacity optimization (it is 2010’s Hottest Storage Technology).

This technology is getting a lot of ‘air play’ these days because it is so critical to controlling the growth and cost of storage.  In 2010 the EMC-sponsored IDC report The Digital Universe Decade - Are You Ready? was released, and it stated that:

  • In 2009, amid the “Great Recession,” the amount of digital information grew 62% over 2008 to 800 billion gigabytes (0.8 Zettabytes).
  • The amount of digital information created annually will grow by a factor of 44 from 2009 to 2020…
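
To put those two bullets in perspective, here is a quick back-of-the-envelope calculation.  It is only a sketch: the inputs are the figures quoted above, and the assumption that growth is spread evenly across the eleven years is mine, not IDC’s.

```python
# Figures quoted from the IDC report above.
GB_2009 = 800e9            # 800 billion gigabytes of digital information in 2009
GROWTH_OVER_2008 = 0.62    # 2009 grew 62% over 2008
GROWTH_FACTOR = 44         # projected growth factor from 2009 to 2020
YEARS = 2020 - 2009        # 11 years

# Convert gigabytes to zettabytes (1 GB = 1e9 bytes, 1 ZB = 1e21 bytes).
zb_2009 = GB_2009 * 1e9 / 1e21

# The 2008 figure implied by "grew 62% over 2008".
zb_2008 = zb_2009 / (1 + GROWTH_OVER_2008)

# The 2020 figure implied by a factor of 44.
zb_2020 = zb_2009 * GROWTH_FACTOR

# ASSUMPTION: growth compounds at a constant annual rate over the period.
annual_rate = GROWTH_FACTOR ** (1 / YEARS) - 1

print(f"2008: {zb_2008:.2f} ZB")
print(f"2009: {zb_2009:.2f} ZB")
print(f"2020: {zb_2020:.1f} ZB")
print(f"Implied compound annual growth: {annual_rate:.0%}")
```

In other words, a factor of 44 over eleven years works out to roughly 35 zettabytes in 2020, or growth of a little over 40% every single year if it were spread evenly.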

The folks at Wikibon also released an infographic that shows the true scale of the data explosion.

Information Explosion & Cloud Storage
Via: Wikibon

When you combine storage capacity (and the footprint it takes up) with the power it takes to run and cool it, as well as the human resources it takes to manage it, you soon realize we cannot keep ‘just adding more cheap disk’ to meet storage demands.  High-tech companies with high-tech labs are also telling IT that ‘they are out of tricks’ when it comes to continuing to deliver disk drives that double in capacity every 18 months.  It is for these reasons that primary storage optimization technologies have stepped into the ‘limelight’: they serve as a means to help control the growth of primary storage, including the footprint, power, cooling and manpower required to manage it.

However, as we all know in IT, no two environments are the same, and what may be good for one may not be good for another.  When looking at primary storage optimization, there are a number of available technologies and ways to deploy them, and the key question is which is right for ‘my’ environment.


Marketing, FUD and Doing What You Do Best


Rather than leave a lengthy comment on Tom Cook’s blog post from Friday, Compression and Dedupe: Business Value and Data Safety (and from a marketing perspective, Fridays are bad days to post blogs, especially in the summer), I thought I would respond here (this may get lengthy, as Tom made a number of points which I need to comment on).

The first thing I do want to say is that when doing technical marketing, the proper strategy is to go on offense rather than play defense.  However, given the amount of FUD that Tom put in his latest blog post, I have to defend compression to some degree.

Now, I think we can all agree that data compression and data deduplication are two technologies that can complement one another very well.  Avamar (EMC) deduplicates the data at the source and then compresses the data before sending it to the Avamar Data Store gaining tremendous efficiency in network utilization.  ProtecTIER (IBM) compresses the data once it is deduplicated at the target device before it stores the data.  Other solutions also combine compression and data deduplication.

I’d like to comment on some key points Tom made in his piece where he is just blatantly wrong:

1)      Compression identifies redundant data across a very small window, usually 64 KB. – While this may be true for other compression technologies, it is not true for Storwize.  Storwize performs compression where the initial window is not fixed in size at all; it is the resultant write that is fixed in size.  This size is also specifically mapped to the I/O pattern of the data being written.  The goal is for Storwize to do all the work it needs to on a particular file or LUN in 1 I/O, and it is for this reason that Storwize has no performance penalty.
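
To make the “variable input window, fixed-size output” idea concrete, here is a minimal sketch.  It is NOT the Storwize algorithm; it is a toy built on Python’s standard zlib module, and the names OUTPUT_BLOCK, STEP and pack_fixed_output are hypothetical ones I made up for illustration.  It keeps growing the input window for as long as the compressed result still fits in one fixed-size write.

```python
import random
import zlib

OUTPUT_BLOCK = 4096  # hypothetical fixed size of each compressed write, in bytes
STEP = 1024          # hypothetical granularity for growing the input window;
                     # assumed small enough that one step always fits in a block

def pack_fixed_output(data: bytes, block: int = OUTPUT_BLOCK, step: int = STEP):
    """Split data into variable-sized input windows whose compressed
    form fits in `block` bytes.

    Returns a list of (input_length, compressed_bytes) tuples.
    """
    out = []
    pos = 0
    while pos < len(data):
        # Start with one step of input, then grow the window while the
        # compressed result still fits in a single fixed-size output block.
        end = min(pos + step, len(data))
        best_end, best = end, zlib.compress(data[pos:end])
        while end < len(data):
            end = min(end + step, len(data))
            candidate = zlib.compress(data[pos:end])
            if len(candidate) > block:
                break
            best_end, best = end, candidate
        out.append((best_end - pos, best))
        pos = best_end
    return out

# Synthetic demo data: highly repetitive text followed by incompressible bytes.
rng = random.Random(0)
data = b"the quick brown fox " * 2000 + bytes(rng.getrandbits(8) for _ in range(8192))

windows = pack_fixed_output(data)
for input_len, compressed in windows:
    print(f"input window {input_len:6d} bytes -> {len(compressed):4d} compressed bytes")
```

The input window stretches over the repetitive text and shrinks over the random bytes, while every compressed write stays at or under the same fixed size.  Recompressing the window on every step is wasteful; it is done here only to keep the sketch short.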

2)      Compression produces data reduction rates at most 2X for most data types. – Seems Tom needs a lesson in the most common answer in IT – “IT DEPENDS”.  Data compression ratios are 100% tied to the data type.  For a true indication of data compression ratios see Figure 1.
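
If you want to see “it depends” for yourself, a few lines of Python will do it.  This is only an illustration with synthetic samples and the standard zlib module; it is not the data behind Figure 1, and real-world ratios will vary with the compressor and the data set.

```python
import random
import zlib

def ratio(data: bytes) -> float:
    """Return original size divided by zlib-compressed size."""
    return len(data) / len(zlib.compress(data))

rng = random.Random(0)
samples = {
    # Log-style text with a lot of repetition.
    "repetitive text": b"2011-10-12 INFO request served in 12 ms\n" * 2500,
    # Random bytes, standing in for already-compressed or encrypted data.
    "random bytes": bytes(rng.getrandbits(8) for _ in range(100_000)),
}

for name, data in samples.items():
    print(f"{name}: {ratio(data):.1f}x")
```

Both samples are the same size, yet the repetitive text shrinks dramatically while the random bytes do not shrink at all.  A blanket “at most 2X” tells you nothing until you know what the data is.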


Gravity Applies to Everyone!


There was an interesting announcement today regarding Permabit, which is now providing primary storage optimization through OEMs and having its solution embedded into the storage system.  This further drives home the point of where capacity optimization should live.  I do have a couple of questions, however:

1)      What is the performance like?  I see phrases such as “High Performance Data Optimization Software” but I don’t see any performance metrics, such as ‘no performance degradation’ for customers utilizing the solution, or testing metrics from their ‘partners’ (as it probably isn’t in production yet).  That brings up another question:

2)      Why were none of the ‘design win’ partners quoted in this announcement?

3)      Rehydration – Mr. Floyd states:

Permabit's Floyd claims Albireo can maintain data integrity because data written to disk isn't altered, and the reduction takes place out of the data path. When parallel processing is used, deduped data doesn't have to be rehydrated when it's accessed.

The question is: if it doesn’t need to be rehydrated, then how does the application read it?  I can only assume that Mr. Floyd means the data doesn’t have to be rehydrated on disk, which is fine.  The question then becomes: a) how does the application know what the data is? (Ocarina uses an agent to help them understand the data, but this is another thing to manage) and b) what is the performance of the system when it looks up all of the hash keys to reassemble the data on the fly, and how many more storage resources will this consume?
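
To show what I mean by looking up hash keys on the read path, here is a generic sketch of a hash-based deduplicating store.  To be clear, this is NOT how Albireo works internally (I have no visibility into that); the class DedupStore, the 4 KB chunk size and the use of SHA-256 are all assumptions made purely for illustration.

```python
import hashlib

CHUNK = 4096  # hypothetical fixed chunk size, in bytes

class DedupStore:
    """Toy content-addressed store: each unique chunk is kept once."""

    def __init__(self):
        self.chunks = {}   # digest -> chunk bytes (each unique chunk stored once)
        self.files = {}    # file name -> ordered list of digests (the "recipe")
        self.lookups = 0   # number of index lookups performed by reads

    def write(self, name: str, data: bytes) -> None:
        recipe = []
        for i in range(0, len(data), CHUNK):
            piece = data[i:i + CHUNK]
            digest = hashlib.sha256(piece).digest()
            # Store the chunk only if this digest has not been seen before.
            self.chunks.setdefault(digest, piece)
            recipe.append(digest)
        self.files[name] = recipe

    def read(self, name: str) -> bytes:
        # The data is reassembled on the fly: one index lookup per chunk.
        parts = []
        for digest in self.files[name]:
            self.lookups += 1
            parts.append(self.chunks[digest])
        return b"".join(parts)

store = DedupStore()
payload = b"A" * CHUNK * 3 + b"B" * CHUNK   # four chunks, only two unique
store.write("example.dat", payload)

restored = store.read("example.dat")
print("unique chunks stored:", len(store.chunks))
print("lookups needed for one read:", store.lookups)
print("read matches original:", restored == payload)
```

Even in this toy, the application gets its data back only after one lookup per chunk.  On a real system those lookups cost CPU, memory and possibly extra I/O, which is exactly the resource question I am asking.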

4)      Back to performance – Permabit states:

When done inline, data will flow to the Albireo library before going to disk. Post-process deduplication will write data to disk first, then scan and eliminate duplicated data. The parallel option sends data to disk while still in memory, and applies updates the same way as post-processing without having to read data off disk. Each method has different amounts of latency and reduction efficiencies.
