Storage Benchmarking and Formula 1

By Nigel Poulton | February 28, 2011

After spending goodness knows how long branding SPC benchmarking as the biggest waste of time since soap operas, EMC have made an about-turn and signed up for some SPC benchmarketing goodness.

Prior to joining the SPC, EMC were vocal about such benchmarks being based on artificial workloads that do not accurately match any one customer’s profile, so what’s the point?  I see where they’re coming from, but don’t personally feel that this renders such benchmarking entirely useless.  Here’s why:

In my experience, many large enterprises have their own in-house simulated workloads that they use to bench storage arrays.  Even these customer-created benchmarks don’t exactly mimic their respective production environments.  However, they do provide a common baseline against which multiple arrays can be measured, and are therefore of some value.  Ditto for the various SPC and SPECsfs benchmarks.  They’re not perfect, but they’re not useless either.

In my opinion, arrays like VMAX, VSP, DS8, 3PAR, XIV, CLARiiON… very often find themselves on the receiving end of just about every I/O profile conceived by man.  Such arrays are often deployed to meet ~80% of an organisation’s block storage requirements, with the other 20% often finding its way to more niche architectures such as TMS RamSan, Isilon, FusionIO, Exadata…  I’d love to know if you disagree.

I’d be willing to bet that most of the above-mentioned arrays tackle a mix ’n’ match of all things structured and semi-structured (MS Exchange, Oracle, SQL Server, Siebel, SharePoint, etc.) on a daily basis at countless organisations across the planet.

If I am right, then I postulate that just about any artificial workload can form the basis of a half-decent benchmarking workload.  If an array is going to get a bit of everything in production (and that bit of everything today may well be different to the bit of everything tomorrow), then why not give it a bit of everything in test – right?
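To make that concrete, here’s a toy sketch of what I mean by a “bit of everything” workload.  To be clear, this is emphatically not SPC-1 or SPECsfs2008 (the real specs are far more rigorous); the file size, block sizes, and read/write mixes below are assumptions made up purely for illustration.

```python
# Toy "bit of everything" I/O workload -- an illustrative sketch only,
# NOT SPC-1 or SPECsfs2008. All sizes and mixes below are made-up assumptions.
import os
import random
import time

FILE_PATH = "bench.dat"          # hypothetical scratch file
FILE_SIZE = 256 * 1024 * 1024    # 256 MiB test area
RUNTIME_SECS = 10                # how long to run the mixed workload

# (block size in bytes, fraction of ops that are reads) -- assumed profiles
PROFILES = [(4096, 0.7), (8192, 0.5), (65536, 0.3)]

# Lay down the scratch file once, filled with incompressible random data.
if not os.path.exists(FILE_PATH) or os.path.getsize(FILE_PATH) < FILE_SIZE:
    with open(FILE_PATH, "wb") as f:
        f.write(os.urandom(FILE_SIZE))

ops, total_latency = 0, 0.0
with open(FILE_PATH, "r+b") as f:
    deadline = time.monotonic() + RUNTIME_SECS
    while time.monotonic() < deadline:
        block, read_fraction = random.choice(PROFILES)  # pick a profile per op
        f.seek(random.randrange(0, FILE_SIZE - block, block))
        start = time.perf_counter()
        if random.random() < read_fraction:
            f.read(block)                       # random read
        else:
            f.write(os.urandom(block))          # random overwrite
            f.flush()
            os.fsync(f.fileno())                # push the write to the device
        total_latency += time.perf_counter() - start
        ops += 1

print(f"{ops / RUNTIME_SECS:.0f} IOPS, "
      f"avg latency {1000 * total_latency / ops:.2f} ms")
```

In anger you’d reach for something like fio or vdbench against a raw device rather than a buffered file, but the principle is the same: an artificial mixed profile still gives you a common baseline against which to compare arrays.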

With the above in mind, the problem with SPC, SPECsfs, and any other benchmark for that matter, lies not so much in the artificial nature of the workload as it does in the hilariously ridiculous configurations the vendors submit!  One only needs to look at the VNX configuration EMC submitted for its February 2011 NFS SPECsfs2008 benchmark: 4 x somehow-combined VNX5700s with over 400 SSD drives!  Amazing results that show the capabilities of the architecture at one extreme, but pleeeaaze!

Storage Formula 1

Much like Formula 1 racing, where the Renault car and engine that are submitted for the race bear absolutely zero resemblance to the Renault that my wife drives, the configurations submitted by the vendors bear zero resemblance to the configurations they sell to customers.  One configuration is for pure sport, whereas the other is for customers to buy.

However, a vital difference between Formula 1 racing and storage vendor benchmarking is that one of the two is interesting and exciting, whereas the other is… well… not.  I’ll leave it to you to decide which is which 😉

So… good on EMC for joining and starting to play ball.  But how about having some balls and submitting realistic configurations that your customers regularly buy?  While the same goes for everyone else, I’d expect the 800lb gorilla to have the biggest balls, which is why I call on EMC to set the tone – even the top EMC marketing folks agree that things need to change.  At least that way it will be of some value to your customers, and not just a stage for donning your Speedos and flexing your steroid-induced IOPS and latencies.

UPDATE 29/02/2011 – If vendors feel they absolutely must submit ridiculous configurations to the likes of SPC and SPECsfs in order to achieve large numbers, why not also submit realistic everyday configurations as well?  That way you get to brag about your big numbers, but also provide customers with something useful.  Obviously, if you’re not confident about your realistic configurations and have something to hide, then we understand why you don’t do this 😉  See here for where I got the idea.

Prestigious Awards

On a different but somewhat similar note, HDS recently announced that its flagship product, the Hitachi VSP, had received a supposedly prestigious iF Product Design Award.  Personally I’d never heard of the award, but don’t let that detract from it. 

According to Hu Yoshida, CTO of HDS, the award takes into consideration things such as “… environmental sustainability, architectural design quality, and overall functionality”.

All well and good, but was the VSP up against any of its peers?  If it was, fair enough.  If not, then what is the real value?

Conclusion

When it comes to meaningful benchmarking and valuable design awards, we are so close, yet so far.

Just my thoughts, appreciate any comments.

10 thoughts on “Storage Benchmarking and Formula 1”

  1. JulieHG

    Not everyone submits the ridiculous.  BlueArc (my company) focuses on max IOPS with the least # of drives.  Although I'm loath to highlight NTAP, they usually do the same, highlighting the best IOPS/$ figure that they can drive for the benchmark (old GX submission aside that is).  SPECsfs2008 is pretty stringent on the workload, and actually is more intensive than most user environments, but you can bet that the vendor product is going to deliver what is posted or better.

    Personally, I think EMC has decided to join the party in order to discredit the benchmarks as much as possible, hence the ridiculous configurations that they've submitted.  Why else would they publish a result which highlights how grossly inefficient they are with SSD?

  2. Dimitris Krekoukias

    D from NetApp here…

    Yes, indeed we do try to get the highest benchmark result with the smallest possible configuration instead of using crazy setups nobody will ever buy.

    I agree with Julie that EMC is trying to show that the benchmarks are ridiculous and not to be taken seriously, but all they’ve accomplished so far IMO is to show that THEIR specific config is unrealistic, not other vendors’.

    Full analysis here: http://bit.ly/eWKBFZ

    Thx

    D

  3. Nigel Poulton Post author

    Julie, Dimitris,

    Thanks for commenting and also disclosing your vendor relationships.

    I think your comments about EMC’s potential motives are interesting 😀

    Nigel

  4. Chuck Hollis

    Hi Nigel
    Nice post, but I disagree.  The SPEC is a pure performance benchmark, plain and simple.  There is no cost metric associated with it whatsoever. 
    Other benchmarks have cost metrics; this one doesn't.
    Our thinking was simple: show people what the technology is capable of doing, and nothing more.  Sure, not everyone can justify 400 flash drives, but it's nice to know that the VNX is capable of delivering that sort of throughput if and when needed.
    The notion of "realistic configs" is a bit strained, if you think about it.  As you point out, all benchmark tests of this ilk are entirely synthetic.  There is no implied relationship between a SPEC benchmark and what you're likely to see in your production environment.
    Personally, I love to see all the whining going on from various sources.  The EMC engineering team turned in a smokin' number on the SPEC.  I would think if other vendors could turn in big numbers, they would too.
    Take it for what it is — a demonstration of sheer performance. 
    — Chuck (from EMC!)

  5. IvanE

    > the configurations submitted by the vendors bear zero resemblance to the configurations they sell to customers
     
    Nigel, this seems like an overgeneralization from my point of view.  Most vendors do submit max or close-to-max configs, but they do sell those as well.

    If you look at all submissions, it might turn out that ridiculous configs are more the exception than the rule.

  6. Mike Horwath

    I love the analogy, though I can't comment on whether what you say is true or not.
    Benchmarks are benchmarks and should always be taken with a grain of salt, and backed up by a good test in your own environment.

  7. Dimitris Krekoukias

    @Chuck –

    I understand you have a marketing position, but let’s clarify again: it’s not “the VNX is capable of delivering that sort of throughput” –

    rather:

    4x VNXs, unaware of each other, are capable of delivering that sort of throughput.

    It’s an important distinction.

    You see, any collection of boxes not bound by something will scale linearly.

    It’s kinda like saying “one truck can carry 10 tons, 10 trucks can carry 100 tons”.

    That holds true unless you place a restriction like “but they have to be on the bridge at the same time”.

    Suddenly, the bridge becomes the bottleneck.

    You benchmarked 4 trucks.

    D

  8. Val Bercovici

    Some bombastic vendors treat the science of benchmarking as a drag race, or as merely turning left while bumping into each other, while others with true market momentum prefer a more demanding test of driver and machine under globally variable circumstances.
    To each his own I guess 🙂
    My prediction is you will NEVER see an EMC benchmark with the value-added features they try to sell, such as FAST, dedupe/compression, virtual pools for thin provisioning, snapshots, or RAID6 for disk failure protection and efficiency.
    The silence from most if not all NetApp competitors on the performance impact of features storage customers actually value is… deafening.

  9. Pingback: Technology Short Take #12: Storage Edition - blog.scottlowe.org - The weblog of an IT pro specializing in virtualization, storage, and servers

  10. Stephen

    Nigel,

    I had to chuckle when reading your first paragraph.  It seems that EMC's official stance on anything is that it is ridiculous until they do it.  It's certainly not the first time I've run into that.
    -Stephen
