Gravity Applies to Everyone!
There was an interesting article regarding Permabit, which is now providing primary storage optimization through OEMs and having its solution embedded into the storage system. This further drives home the point of . I do have a couple of questions, however:
1) What is the performance like? I see phrases such as “High Performance Data Optimization Software” but don’t see any performance metrics – such as ‘no performance degradation’ for customers utilizing the solution. Or testing metrics from their ‘partners’ (as it probably isn’t in production yet) – which brings up another question:
2) Why were none of the ‘design win’ partners quoted in this announcement?
3) Rehydration – Mr. Floyd states:
Permabit's Floyd claims Albireo can maintain data integrity because data written to disk isn't altered, and the reduction takes place out of the data path. When parallel processing is used, deduped data doesn't have to be rehydrated when it's accessed.
The question is – if it doesn’t need to be rehydrated, then how does the application read it? I can only assume that Mr. Floyd means the data doesn’t have to be rehydrated on disk, which is fine. The questions then become: a) How does the application know what the data is? (Ocarina uses an agent to help them understand the data, but this is another thing to manage.) and b) What is the performance of the system when it looks up all of the hash keys to reassemble the data on the fly, and how much more storage resources will this consume?
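To make the read-path concern concrete, here is a minimal sketch of a deduplicated store in Python. It is not Permabit's Albireo API or design; the `DedupStore` class, the 4-byte chunk size, and the use of SHA-256 as the hash key are all assumptions chosen for illustration. The point is only that data can stay deduplicated on disk while every read still pays for one hash-key lookup per chunk.

```python
import hashlib

CHUNK_SIZE = 4  # hypothetical, tiny chunk size so the example is easy to follow


class DedupStore:
    """Toy content-addressed store (illustrative only, not a vendor API)."""

    def __init__(self):
        self.chunks = {}   # hash key -> unique chunk bytes (kept once "on disk")
        self.recipes = {}  # file name -> ordered list of hash keys
        self.lookups = 0   # hash-key lookups performed while reading

    def write(self, name, data):
        keys = []
        for i in range(0, len(data), CHUNK_SIZE):
            chunk = data[i:i + CHUNK_SIZE]
            key = hashlib.sha256(chunk).hexdigest()
            # Store the chunk only the first time this key is seen.
            self.chunks.setdefault(key, chunk)
            keys.append(key)
        self.recipes[name] = keys

    def read(self, name):
        # Nothing is rehydrated on disk; the data is reassembled on the fly,
        # at the cost of one lookup per chunk in the file.
        parts = []
        for key in self.recipes[name]:
            self.lookups += 1
            parts.append(self.chunks[key])
        return b"".join(parts)


store = DedupStore()
payload = b"ABCDABCDABCDWXYZ"   # four chunks, only two of them unique
store.write("file.bin", payload)
restored = store.read("file.bin")
unique_chunks = len(store.chunks)
lookups = store.lookups
```

In this toy case the application gets its original 16 bytes back, two unique chunks are stored instead of four, and the read costs four lookups. How expensive those lookups are in a real system is exactly the metric I would like to see published.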
4) Back to performance – Permabit states:
When done inline, data will flow to the Albireo library before going to disk. Post-processing will write data to disk first, then scan and eliminate duplicated data. The parallel option sends data to disk while still in memory, and applies updates the same way as post-processing without having to read data off disk. Each method has different amounts of latency and reduction efficiencies.
Here the question is: what is the difference between ‘inline’ and ‘parallel’? Additionally, if you review the description of how parallel deduplication works, it seems as if it does not optimize writes and probably writes over blocks that were recently added and are redundant. This does not save on write activity the way compression does (by writing less to disk).
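Here is how I read the three modes, sketched as a rough Python model. This is my interpretation of the quoted description, not Permabit's implementation; the function names, the counters, and the sample blocks are all hypothetical. It simply counts how many blocks each mode writes to disk and how many it has to read back later.

```python
import hashlib


def digest(block):
    # SHA-256 stands in for whatever hash key a real product would use.
    return hashlib.sha256(block).hexdigest()


def inline(blocks):
    # Hash before writing: only blocks with an unseen key reach the disk.
    index, writes = set(), 0
    for block in blocks:
        key = digest(block)
        if key not in index:
            index.add(key)
            writes += 1
    return {"writes": writes, "rereads": 0, "stored": len(index)}


def post_process(blocks):
    # Write everything first, then scan the disk later to find duplicates.
    writes = len(blocks)
    index, rereads = set(), 0
    for block in blocks:      # the later scan reads each block back
        rereads += 1
        index.add(digest(block))
    return {"writes": writes, "rereads": rereads, "stored": len(index)}


def parallel(blocks):
    # Write everything, but hash the copy still in memory, so no read-back.
    writes = len(blocks)
    index = {digest(block) for block in blocks}
    return {"writes": writes, "rereads": 0, "stored": len(index)}


blocks = [b"aaaa", b"bbbb", b"aaaa", b"aaaa", b"cccc"]  # 5 blocks, 3 unique
inline_result = inline(blocks)
post_result = post_process(blocks)
parallel_result = parallel(blocks)
```

Under these assumptions all three modes end up storing three unique blocks, but only inline reduces the initial writes (three instead of five). Parallel avoids the read-back that post-processing needs, yet it still writes every block, which is the write-activity concern raised above.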
I commend Permabit for being very up front about the performance issues that come with doing something very complex. Deduplication for storage is very difficult. To do it inline and in real-time is even more of a challenge. As Floyd says, "It's been amazing to witness how fast deduplication went from a 'science experiment' to mainstream in the backup use case, and primary storage -- while perhaps not being adopted at the same rate -- is being considered more and more," he said. "Performance will get better and become less and less of an issue as most of the algorithms are limited by CPU, which is getting very inexpensive. But even in cases where memory and disk spindles play a role in performance, those issues are increasingly getting more cost-effective to overcome." I couldn’t agree more.
The one thing that customers should consider today, however, is primary storage compression, in real time, without any performance degradation. This technology is available today from Storwize. Today, as an appliance in front of NAS environments, customers are saving anywhere from 50% to 90% of their capacity without performance degradation and without needing to change anything about their environment, including application, networking, storage or downstream processes such as snapshots, replication or backup.
Technology is evolutionary. The first step starts today with Storwize. To see more about the Storwize solution or to spend 15 minutes to learn how to save 50% or more of your storage capacity go to .


Steve – great post. I am interested in your thoughts as to how this technology will ultimately get to market. It would appear on some level that primary storage de-dupe/compression is a revenue killer for the big storage firms. Yes, data grows, and it seems to not be stopping any time soon (sort of like oil spills), but it's got to be a hit to revenue for companies like EMC etc. when the reality hits that you may have already purchased all the storage capacity you need for a year or so.
Are you seeing channel partners/resellers take this on in a big way? Or are they also fearful of how it might affect their business?
Clearly a win for the end user. Are the vendors you work with fans or are they getting in the way?
Hey Jeff – GREAT to hear from you! Thanks for the comment. Couple of key things to consider:
1) Storage is elastic. We saw when data deduplication came out that vendors started saying – “oh, now I am going to sell less xxx” – the reality is, customers started keeping more stuff on line and because they could now efficiently replicate data – they kept twice as much stuff on line. Same is / will be true with primary storage optimization. I can now keep more snapshots, I can replicate more efficiently, I can have more clones etc… so storage needs won’t shrink, storage will just be used in different ways.
2) If vendors don’t choose to ‘get on the optimization horse’ they will die. The technology is so compelling that customers are asking for it, and if the vendors don’t have a solution then they will no longer stay the incumbents. They will all need a primary storage optimization solution.
3) From a partner / reseller perspective – I have seen three avenues:
a) A number of VARs that have strategic relationships with large companies actually don’t sell the storage direct. For example, at a large on-line auction site in San Jose, CA (guess who) they buy all of their NTAP storage direct. However the VAR in the account, who really understands the customer needs – knows the customer doesn’t want to buy any more disk drives if they can help it. Primary storage optimization – if it can be sold as an extension (appliance) to the environment – is actually a great way for the VAR to cut the storage spend and get more $ for stuff the VAR can offer.
b) In cases where the VAR does sell the storage, the answer is, if they can have the customer spend less on infrastructure (low margin stuff) and make more money on the higher margin stuff – Software / Services – then why not get that benefit?
c) Some VARs aren’t really “V”ARs and they will die, because if they don’t add the “V”alue – and customers are looking for what is best for them, not the reseller – then they won’t last too long.
Hope that helps.
Thanks again Jeff! Stay well!
Steve