Tag: "Virtualization"

Defining Big Data


Tuesday night I attended an event, storagefest II 2012, hosted by Valhalla Partners.  The event was a dinner with a group of storage experts from all vectors of the storage industry.  There were customers of storage technologies, VCs with investments in storage, entrepreneurs (folks from storage startups), industry insiders (analysts) and folks from storage companies that have been acquired by large companies.  The goal of the event also had multiple vectors, specific to each "group" that attended.

VCs attend to hear what customers have to say about the state of the storage industry and what they should be investing in or if the storage startups they have invested in are doing the right things.  They also listen to people who have had successful exits and the advice they may have for running a successful storage business.

Customers attend to hear what is new in the storage business and to share their experiences and challenges within their infrastructure, and what they are looking for from their storage technologies and new companies.

Entrepreneurs attend to lend their advice, to see what is new and share ideas.

Industry insiders attend to learn more about customer challenges, who has the best chance at solving these challenges, how the industry is shaping up and to report on the event.

Large company attendees, people who have had successful exits into a large company, are typically in influential roles in their new company.  They go to learn how the industry is evolving and what new technologies are out there that they may want to add to the portfolio of the larger company.  It is also a good chance to listen to customers discuss what they are looking for from the next generation of storage technologies.

I set all of that up so you can understand the players and the mix of people at the event.


Storage Efficiency Panel – SNW 2011 Fall


Yesterday I was on a panel at SNW in Orlando, Florida.  The panel was hosted by Dave Vellante, founder of Wikibon and always a great host for these kinds of things.  On the panel were Larry Freeman of NetApp, Craig Nunes of HP (formerly 3PAR), Jered Floyd, CTO and founder of Permabit, and myself, from IBM (formerly Storwize).

Some interesting data came out of this panel.  It was a well-attended session, with probably over 150 people in the audience, and Dave is VERY good about asking the audience questions.  Let me start by making sure we all know where each panelist sits at the “storage efficiency table”.

  • Larry Freeman is from NetApp, which claims (and I believe them) to have 10 storage efficiency technologies embedded into WAFL
  • Craig Nunes's main focus on the panel was ‘zero reclamation’ to optimize storage
  • I have a Real-time Compression drum I am beating
  • Jered Floyd focuses on data deduplication

Here are some questions and answers Dave got when speaking to the audience:

(Audience responses are close estimates.)

  • How many people use deduplication or compression in their storage?  About 60% said they use one or both of these technologies in their environment.
  • How do users deploy these technologies, embedded or appliance?  100% of the 60% said "embedded".
  • Which storage vendor provided these technologies?  100% of the 60% said NetApp (NTAP).
  • What is the number one issue with the embedded solution that keeps it from being more widely adopted?  Performance.  They all believed that for 70% of their applications the embedded solution was “good enough”, but for the 30% where performance is critical it couldn’t do the job.
  • Why are more appliances not deployed to solve the performance issues?  Customers didn’t want to have to manage multiple solutions in their environment doing the same thing.

Virtual Disk Storage


History truly does repeat itself.  We are talking about the history of data storage.  Every once in a while a new technology comes along that requires a new way to think about infrastructure.  Notice I said “infrastructure”.  I’d like to paint two analogies:

Analogy 1: RAID – Prior to RAID, users stored their data on disk and, if they could afford it, backed that data up to have a protected copy.  When RAID came out, users were able to store their data on multiple disks appearing as one device.  The benefits were increased data reliability and better performance.  This new technology fundamentally changed how disk was sold, but the questions stayed the same:

  1. How much capacity do you need?
  2. What type of performance does your application require?

The sales rep’s point of view changed.  There were a number of new considerations that needed to be taken into account.  First, the age-old question: “Will I sell less storage ‘stuff’?”  Remember, the person selling the disk at the time was probably also selling the backup tape and software to protect that information.  If the disks are more reliable, maybe the customer won’t need as much tape?  Second, when the capacity question came up, the seller also needed to know what type of RAID the customer wanted, to ensure they sold them enough drives.  It was no longer as simple as taking the capacity requirement and dividing it by the drive capacity of the day.  Now, depending upon the RAID level, there was a new set of math that needed to be done.  Third was the notion of performance: more spindles meant more performance.  So once the capacity equation was solved, you also needed to know the I/O requirements to make sure the right number of drives was sold to cover both capacity and performance.
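To make that "new set of math" concrete, here is a minimal sketch in Python.  Everything in it is an illustrative assumption rather than a vendor sizing tool: the function names, the simplified overhead rules (mirroring doubles the drive count, RAID 5 and RAID 6 add one or two parity drives per group), the group size of 8, and the sample numbers.  It also ignores hot spares and RAID write penalties.

```python
def ceil_div(a, b):
    """Integer division rounded up (you cannot sell a fraction of a drive)."""
    return -(-a // b)


def drives_for_capacity(usable_gb, drive_gb, raid_level, group_size=8):
    """Drives needed to deliver usable_gb under simplified RAID overhead rules."""
    data_drives = ceil_div(usable_gb, drive_gb)
    if raid_level == "none":
        # The pre-RAID math: capacity divided by drive size.
        return data_drives
    if raid_level == "raid1":
        # Mirroring: every data drive gets a mirror partner.
        return 2 * data_drives
    if raid_level in ("raid5", "raid6"):
        # One (RAID 5) or two (RAID 6) drives' worth of parity per group.
        parity = 1 if raid_level == "raid5" else 2
        groups = ceil_div(data_drives, group_size - parity)
        return data_drives + groups * parity
    raise ValueError(f"unsupported RAID level: {raid_level}")


def drives_for_iops(required_iops, iops_per_drive):
    """Spindles needed to meet the I/O requirement (write penalty ignored)."""
    return ceil_div(required_iops, iops_per_drive)


def drives_to_sell(usable_gb, drive_gb, raid_level, required_iops, iops_per_drive):
    """The larger of the capacity answer and the performance answer."""
    return max(
        drives_for_capacity(usable_gb, drive_gb, raid_level),
        drives_for_iops(required_iops, iops_per_drive),
    )


# Hypothetical customer: 70 GB usable on 9 GB drives, 1,500 IOPS at 100 IOPS per drive.
for level in ("none", "raid1", "raid5", "raid6"):
    print(level, drives_for_capacity(70, 9, level))
print("raid5 with the I/O requirement:", drives_to_sell(70, 9, "raid5", 1500, 100))
```

In this made-up case the capacity answer for RAID 5 is 10 drives, but the I/O requirement pushes the order to 15, which is exactly the kind of shift in the conversation described above.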


Efficiency vs. Optimization


“Storage Efficiency” has become a big topic over the past 12 months.  There are a number of new technologies that have come out in the last few years that are helping to deal with storage growth.  We all know that data is the root of the decisions that drive business today.  The more data you have, hopefully, the better decisions you can make to drive your business to success.  The question is, “what is the value (and hence the cost) of the infrastructure to create that success?”  What we do know is that the ability to put more data in a highly efficient footprint can give your company a competitive edge.  There are five technologies that can help an IT organization create an efficient storage infrastructure.  These are:

 

  1. Tiering
  2. Virtualization
  3. Thin Provisioning
  4. Compression
  5. Deduplication

It is also important to point out a semantic distinction when talking about storage efficiency, specifically between efficiency technologies and optimization technologies.  I think it is useful to attempt to define these, as the definitions lead us to picking the right solutions for what we are trying to accomplish.  For the purpose of this post, efficiency will relate to making existing capacity more useful, and optimization will mean making more capacity out of existing capacity.

Using these definitions, technologies such as Tiering, Virtualization and Thin Provisioning are efficiency technologies.  These technologies help to utilize the existing capacity that you have.

Tiering is a technology that is used on about 10% of your data or less.  It is used to move data that requires higher performance to flash storage.  Good tiering technology analyzes data access patterns and moves the most active data to the highest-performing disk.  It doesn’t really change the amount of physical capacity that is required; it just changes what type of capacity is required and allows IT to make sure data is operating as fast and efficiently as possible.
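As a rough illustration of that idea, the Python sketch below ranks extents by how often they were accessed and places the hottest 10% on flash.  The function name, the extent IDs, the access counts and the fixed 10% cutoff are all assumptions made up for the example; real tiering products use their own heuristics and move data over time.

```python
def plan_tiering(access_counts, flash_fraction=0.10):
    """Map each extent to "flash" or "disk" based on how often it was accessed.

    access_counts: dict of extent id -> number of accesses in the sample window.
    flash_fraction: share of extents allowed on the flash tier (assumed 10%).
    """
    flash_slots = int(len(access_counts) * flash_fraction)
    # Most-accessed extents first; ties broken by extent id so the plan is repeatable.
    ranked = sorted(access_counts, key=lambda ext: (-access_counts[ext], ext))
    hot = set(ranked[:flash_slots])
    return {ext: ("flash" if ext in hot else "disk") for ext in access_counts}


# Hypothetical access statistics for ten extents.
sample = {
    "ext0": 12, "ext1": 950, "ext2": 3, "ext3": 40, "ext4": 7,
    "ext5": 0, "ext6": 18, "ext7": 220, "ext8": 5, "ext9": 1,
}
placement = plan_tiering(sample)
print(placement)
```

Note that the total number of extents never changes here.  Only the tier each one lives on does, which is why tiering counts as an efficiency technology under the definitions above.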


Server + Storage Optimization = Datacenter Utopia


Matt Prigge had a really great article on his InfoWorld Data Explosion blog called "VMware vSphere raises the bar -- again".  In the piece Matt makes two really important points.

1.  VMware has taken the world by storm over the past few years.  A technology that can lower both CapEx and OpEx costs and ease the burden of administration is a great thing for the data center.  And,

2.  With all the advantages of virtual server optimization, storage administration is a big issue.

vSphere has done a lot to help with the issues of storage administration (specifically storage performance for virtual servers), but that is only a part of the challenge.  Customers consistently tell us that by developing a virtualized server environment, their storage requirements have grown by as much as 4x.  The savings that have been realized by server virtualization are soon eclipsed by the need for more storage.  This is one of the reasons it has taken a while for server virtualization to really take off in production.  In talking to customers, virtualizing a lab or test environment, where data can be deleted without worry once it is 'used', is one thing; production, where the data needs to be kept for a long time, is where the issues start.

Now, with all the hype around primary storage optimization, end users can couple the benefits of server virtualization with primary storage optimization to maximize their ROI in the datacenter.  The important thing to remember is that, just as server virtualization didn't force customers to sacrifice anything in terms of performance, availability, process and supportability, you need to look for the same thing from a storage optimization solution.

The valuable features added to vSphere around SIOC combined with the optimization capabilities from Storwize can allow IT to maximize storage performance, maximize their existing storage resources and not affect data integrity or data availability.  There is a new white paper on the combined solution of VMware and Storwize that outlines how VMware and Storwize can provide customers with the maximum ROI in the datacenter.


A Blueprint for Primary Storage Optimization


During the past three to four months the storage industry has seen a spike in the number of reports, white papers and news articles surrounding the evolution of a primary storage technology: capacity optimization (it is 2010’s Hottest Storage Technology).

This technology is getting a lot of ‘air play’ these days because it is so critical to helping control the growth and costs of storage.  In 2010 the EMC-sponsored IDC report The Digital Universe Decade - Are You Ready? was released and stated that:

  • In 2009, amid the “Great Recession,” the amount of digital information grew 62% over 2008 to 800 billion gigabytes (0.8 Zettabytes).
  • The amount of digital information created annually will grow by a factor of 44 from 2009 to 2020…
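To put those two figures in perspective, here is a quick back-of-the-envelope calculation in Python.  It assumes decimal units (1 GB = 10^9 bytes, 1 ZB = 10^21 bytes), which is what the report's own 800 billion GB = 0.8 ZB equivalence implies, and it assumes growth compounds evenly over the 11 years, which is my simplification and not a claim from the report.

```python
# Figures quoted from the IDC report above.
GIGABYTES_2009 = 800 * 10**9   # 800 billion gigabytes
GROWTH_FACTOR = 44             # growth from 2009 to 2020
YEARS = 2020 - 2009

# Decimal storage units (assumption stated in the text).
BYTES_PER_GB = 10**9
BYTES_PER_ZB = 10**21

zettabytes_2009 = GIGABYTES_2009 * BYTES_PER_GB / BYTES_PER_ZB
zettabytes_2020 = zettabytes_2009 * GROWTH_FACTOR

# Constant yearly growth rate that compounds to 44x over 11 years.
annual_growth = GROWTH_FACTOR ** (1 / YEARS) - 1

print(f"2009: {zettabytes_2009:.1f} ZB")
print(f"2020: {zettabytes_2020:.1f} ZB")
print(f"Implied annual growth: {annual_growth:.0%}")
```

Under those assumptions, 0.8 ZB in 2009 becomes about 35 ZB in 2020, which works out to roughly 41% growth every single year.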

The folks at Wikibon also released an infographic that exposes the true explosion of data.

Information Explosion & Cloud Storage
Via: Wikibon

When you combine storage capacity (and the footprint it takes up) with the power it takes to run and cool it, as well as the human resources it takes to manage it, you soon realize we cannot keep ‘just adding more cheap disk’ in an effort to manage storage demands.  High-tech companies with high-tech labs are also telling IT that ‘they are out of tricks’ when it comes to continuing to deliver disk drives that double in capacity every 18 months.  It is for these reasons that primary storage optimization technologies have stepped into the ‘limelight’: they serve as a means to help control the growth of primary storage, including the footprint, power, cooling and manpower required to manage it.

However, as we all know in IT, no two environments are the same and what may be good for one may not be good for another.  When looking at primary storage optimization there seem to be a number of available technologies and ways to deploy these technologies and the key question is what is right for ‘my’ environment.


Compression 101 for CFOs


CFOs have an incredibly hard job when it comes to helping IT manage a budget.  Let’s face it, there have been books written (like ‘Does IT Matter?’ by Carr) that discuss the value of all those blinking lights in the data center.

The reality is that some of those blinking lights do matter and others are a financial sinkhole.  Over the past 3 years storage has crept up to be one of the higher cost items in the data center, and storage is a lot like death and taxes: it just IS.  It is really the applications that drive revenue for your company, and these applications just keep generating data which in 45 days will most likely be obsolete, or at least 90% of it will be.  The trick is knowing which 90%, and because no one can really tell, you have to keep all of it.

Now let’s switch to technology for a moment.  For sure CFOs have heard all the technology buzzwords around IT.  Vendors today realize that they have to meet high ROI / TCO demands in order to effectively sell to customers, especially in the storage world.  One of these technologies is data deduplication.  On the surface (just by the nature of its name) it seems like the de facto standard for all storage growth problems: just ‘deduplicate’ your data and all your storage issues go away.  Well, I am here to tell you ‘Don’t Get Duped by Dedupe’.  It may be the fancy new technology word for storage vendors, but when all you have is a hammer, everything looks like a nail.

What I mean by this is that just because ‘deduplication’ is today’s storage buzzword, it is not a solution for all data growth challenges, especially for primary storage.  Compression, especially when done right (in real time and with random access), is the best solution for stemming the tide of primary storage growth.


The Side Effects of Backup on Server Virtualization


Server virtualization has changed the IT landscape dramatically.  It has become a magic potion curing a number of ills in the physical server world such as low individual CPU utilization and excess use of space, power and cooling in the data center.  However, like all potions that cure what ails you, there can be side effects.  You need to be careful of what the Witch Doctor orders.

When I speak with customers who have aggressively implemented a virtual server infrastructure, 9 out of 10 will tell me that they underestimated the effect that virtualization would have on their backups and backup process.  When backup is not considered during the virtual server assessment and planning process, it can make virtualization less of the magic potion they had hoped for.  So what is the issue?  Backup is a virtualization bottleneck, and without addressing it, you may not be able to obtain the server consolidation ratios you had been expecting, which can have a negative effect on your virtual server TCO and ROI.

This is a timely discussion as VMworld has just concluded.  VMware users flocked to VMworld looking for best practices when it comes to implementing virtual server technology.  Because virtualization allows IT to reduce the overall physical hardware infrastructure, users will be looking at how to maximize their server consolidation ratios (get as many virtual servers on a physical server as they can and still provide good application performance).

I often hear that companies assess their environments by looking at the production applications on their physical servers, identifying their workloads and translating that into some consolidation ratio of physical servers to virtual servers.  I also hear, from these same customers, that backup was never taken into consideration during the assessment phase when trying to identify the best possible consolidation ratios.  These customers implement their new virtual server environments, install the backup agent they had previously been using for physical server backups, attempt to back up their virtual servers, and find that they are only able to protect 50% to 60% of the new environment.  Why?


A Data Protection Tribute to Michael Jackson


I was walking through the data center the other day when I heard one of my colleagues, MJ, “Scream”, “I wish I had some ‘Morphine’”.  Well, I have to say I was “Speechless”.  I walked over to where MJ was standing, near the tape library, and when I asked him what was wrong, he replied, “There was another backup tape ‘Jam’.”  MJ told me he had been “Working Day and Night” on a major backup problem and he was now bouncing “Off The Wall”.  He told me he was sick of dealing with traditional backup tools and just wanted to get rid of tape.  I told MJ that it was “Human Nature” to feel “Bad” in a time like this, but I also told him, “You Are Not Alone”.  I said, “MJ, ‘Keep The Faith’.  We all ‘Remember The Time’ when backups ran like a ‘Speed Demon’ and were ‘Unbreakable’, but that is ‘HIStory’; tape isn’t that fast any more given the amount of data we now have.”  I also told him, “We are Backup Administrators, we are ‘Invincible’ and ‘Heaven Can Wait’ for us, and while we may not have our issue fixed at the ‘Break Of Dawn’, we will ‘Come Together’ to ‘Heal The World’, or at least the datacenter” (I chuckled).  I proceeded to tell him about a revolutionary new backup concept utilizing source-based deduplication technology.  It’s “PYT”, a pretty young thing, but more importantly it’s here to stay.  EMC offers it with a product called Avamar, the most efficient variable-block, source-based deduplication technology on the market that:

  • Helps to eliminate tape altogether
  • Is perfect for VMware environments
  • Protects remote offices most efficiently
  • Stems the tide of data growth on NAS platforms