All-flash and Garbage

By | March 19, 2014

So there’s a ton of buzz in the industry around flash storage, and in particular all-flash arrays. Warranted buzz in my opinion, as I think a lot of all-flash arrays are about to start chomping their way through the infrastructure lunch that’s traditionally only been on the menu for big iron storage arrays like Symmetrix and even NetApp.

But that’s a topic that’s been discussed over and over again pretty much everywhere.  I’ve got a technical question that’s interesting me at the moment……

I’ve had a few discussions with a few all-flash vendors.  And to be honest, a lot of them are doing a lot of things the same way.  Not everything.  But a lot of things.  Anyway….. one thing that some are doing differently is garbage collection.  Oh yeah!  That fascinating subject of flash garbage collection that’s guaranteed to be at the top of your list of things to tell the next guy or girl you’re trying to impress :-S  But seriously, a pretty interesting topic.  And one that, if you believe some vendors, makes a heck of a difference…… in system performance!
It goes a bit like this…….

System Level Garbage Collection

In the past I was brainwashed into thinking this was the one true way of doing garbage collection. Well… probably not brainwashed, probably just that I couldn’t be bothered thinking too much about the other ways of doing it. Anyway…

With system level garbage collection, you basically turn off any native garbage collection that the SSD flash drives are capable of, and do it all in the array’s firmware. This requires close collaboration with the SSD vendor so that you get a firmware version on the drives that allows you to do things like disable garbage collection. It then has the potential benefit that the system (the all-flash array) performs garbage collection in a way that’s optimised for how the array works and lays down data on the flash drives. But on the downside, it has the potential to contend with user workload for CPU, memory, bandwidth etc…
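If you want to picture what a garbage collection pass actually involves, here’s a minimal Python sketch. It assumes a textbook greedy policy: pick the erase block with the fewest still-valid pages, copy those pages somewhere else, then erase the block. The names `Block`, `pick_victim` and `collect` are made up for illustration, and none of this is any vendor’s actual firmware.

```python
from dataclasses import dataclass


@dataclass
class Block:
    """One erase block (hypothetical model, not a real drive structure)."""
    block_id: int
    pages: int      # pages per erase block
    valid: int = 0  # pages still holding live data


def pick_victim(blocks):
    """Greedy policy: the block with the fewest live pages is cheapest to clean."""
    return min(blocks, key=lambda b: b.valid)


def collect(blocks, spare):
    """Clean one block: relocate its live pages into `spare`, then erase it.

    Returns (victim block id, pages copied). The copied pages are the extra
    write work that can compete with user I/O.
    """
    victim = pick_victim(blocks)
    copied = victim.valid
    if copied > spare.pages - spare.valid:
        raise ValueError("spare block cannot hold the victim's live pages")
    spare.valid += copied  # relocate live data
    victim.valid = 0       # models the erase
    return victim.block_id, copied


blocks = [Block(0, 64, 60), Block(1, 64, 5), Block(2, 64, 30)]
spare = Block(3, 64)
print(collect(blocks, spare))  # block 1 has the fewest live pages
```

The point of the sketch is the cost: every live page that has to be copied is a write the user never asked for. Whether that copying is scheduled by the array or by the drive is exactly the difference being discussed here.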

I think one of the reasons I took this as the de facto and optimal modus operandi for flash in a storage array was because back in the day, when the traditional array vendors were hurriedly shoe-horning flash into their aging 20-year-old architectures, they pretty much had to do a load of custom work with the flash drives to make them work in a system that was designed entirely around spinning disk. BTW, how ugly do those solutions look these days!

Basically if the vendors had just taken an SLC flash drive and lashed it into one of their aging systems, it could probably have chewed its way through the entire internal I/O capability of the array – starving every other disk drive in the array of resources…..  After all, on some legacy arrays, a single flash drive had the potential to chew through the entire bandwidth and IOPS of the back-end loop/bus/stack it was on.

Oh and of course they had to make changes in the array-level code so that the way the array wrote data to its internal drives didn’t wear out the flash cells before you could say “Enterprise Flash Drive (EFD)”. 

Anyway, until today I just took it as a given that the best way to do garbage collection was at the system/array level.

Drive Level Garbage Collection

As with most things in technology – there is another way!

So apparently, as SSD/flash drives have matured over time, so has the firmware and smarts they ship with.  Part of those smarts is the garbage collection.

The idea of the drive doing the garbage collection is about as far away from rocket science as it gets (though I have to admit, from my uninformed position, rocket science seems like a pretty simple upward thrust vs the pull of gravity equation, but what would I know).  Anyway, the drives already have smarts that take care of garbage collection, and outsourcing the garbage collection to the drive follows the popular technology model of offloading tasks to components as far down the stack as possible – think VMware VAAI and MS ODX etc….

Also, disk drives these days are mini computers in their own right: they’ve got storage, a controller with a processor running software/firmware, and some form of network interface. So why not offload garbage collection down the stack?

Sounds simple, right? And I do see its merits.

But… doesn’t it put you at the mercy of the drive vendor and the quality of their firmware? And isn’t their firmware targeted at the consumer market, where load and demands are different from enterprise use cases? What happens when they report a bug that could result in data loss and tell you to upgrade to the latest drive firmware ASAP, but XYZ vendor hasn’t qualified it yet? And don’t give me the “At XYZ vendor we do rigorous testing before we ship anything to customers in our arrays” line. Bugs still get through that kind of testing…

Does Any of This Matter?

OK but should anyone care?

Well…….. according to the folks at EMC XtremIO, everyone thinking of buying an all-flash storage array should care.

Why?

Well…… according to the folks at EMC XtremIO, some of their competition – who do system level garbage collection – see significant performance drops when servicing production workloads at the same time as performing so called background garbage collection.

Now I can’t verify this. But I know this much: if I were considering buying an all-flash storage array, I’m damned sure I’d be filling it to nearly full, putting it to the sword from an I/O perspective, and sitting there waiting to see if performance tanked during a garbage collection run on the array.
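If you do run that kind of test, here’s a rough way to spot the dips afterwards. It’s a minimal Python sketch, assuming you’ve already collected one IOPS sample per second from whatever load generator you use. The function name `find_dips`, the window size and the 30% threshold are made-up illustration values, not anything from a vendor or benchmark tool.

```python
from statistics import median


def find_dips(iops, window=60, drop=0.30):
    """Return (index, sample, baseline) for samples well below the recent median.

    iops   -- list of per-second IOPS samples (assumed input format)
    window -- how many preceding samples form the baseline
    drop   -- fraction below baseline that counts as a dip (0.30 = 30%)
    """
    dips = []
    for i in range(window, len(iops)):
        baseline = median(iops[i - window:i])
        if iops[i] < baseline * (1 - drop):
            dips.append((i, iops[i], baseline))
    return dips


# Made-up samples: steady 100k IOPS with a single one-second dip.
samples = [100_000] * 10 + [55_000] + [100_000] * 5
print(find_dips(samples, window=5))
```

Using the median of the preceding window as the baseline means one bad second doesn’t drag the baseline down with it. If the flagged samples line up with garbage collection runs on the array, you’ve got your answer.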

Worth keeping in mind if you’re looking for a new all-flash array.

 

5 thoughts on “All-flash and Garbage”

  1. Fazal Majid

    I did some OLTP benchmarking with second-generation (MLC) FusionIO drives in 2008 or so, and the transactions per second graph would dip every 30 seconds. This was caused by garbage collection they ran from inside their driver. They improved the situation since, but it's still a factor.

    Modern SSD drives (and even hard drives for that matter) are complex virtualized systems. I wouldn't be surprised if a modern SSD controller had the same level of complexity as early versions of NetApp's WAFL. Wear leveling, bad block detection, P/E cycles, compression, encryption and even deduplication in the case of SandForce. Sadly, it's almost impossible to get any real information on the characteristics of SSD controllers, and SKUs are replaced too quickly to earn any significant field operational histories that can be used for future deployments.

  2. Pingback: All-flash and Garbage

  3. Vaughn Stewart

    Nigel,

    Falling performance as flash nears capacity (aka the write cliff) has been addressed by the storage industry. At a high level, high performance flash devices (be it an SSD or an array) are configured with reserve flash capacity that ensures performance as the 'published' capacity of the device is met. The flash industry refers to this reserve as over provisioning (OP). OP provides performance benefits by allowing processes like garbage collection (GC) to operate without impacting write operations. OP also increases flash reliability, as the additional capacity results in fewer writes per cell on average.

    When it comes to All-Flash Arrays some vendors leverage OP in the SSD (aka HW OP). This method hides OP from the system; however, OP is easy to identify as these SSDs are usually referred to as 'enterprise class' or eMLC.

    As an example: A 400GB Hitachi eMLC SSD (HUSML4040ASS600) is actually 624GB of raw NAND. This SSD is referred to as being 56% over provisioned.

    It's interesting to note that the same NAND flash is used in MLC and eMLC drives.

    Other AFA vendors like Pure Storage implement OP at the system level. In the Purity O.E. FlashCare ensures optimal performance and data reduction by leveraging the GC process to apply deeper compression algorithms, validate data checksum and RAID parity, etc. These capabilities increase the value of the FlashArray and frankly aren't possible with HW level OP. Note: the raw flash capacity and the FlashCare reserve can be displayed in the GUI. Pure Storage guarantees 100% performance with the system at 100% system capacity, with a failed controller and with 2 SSDs pulled.

    As for your post. OP is a universal design in all high performance flash devices. I find it either ignorant or intellectually dishonest for anyone to suggest that system capacity and raw NAND capacity are one and the same.

    Go to http://purestorageguy.com and check out the Flash Bits series. It contains a wealth of deep tech on the FlashArray and Purity Operating Environment from Pure Storage.

  4. Pingback: All-flash and Garbage | Storage CH Blog

  5. Gabriel

    Nigel, sorry to put something here that's totally unrelated to the article, but since the article about your book doesn't allow comments, I'll say this here. I've been checking Amazon every few months for the past year, waiting for a book for the CompTIA Storage+ exam, so it was a pleasant surprise to see a book authored by you and, even better, in the "Real World Skills" format. Thanks for this! The storage industry has one of the biggest learning curves and it's really hard to find useful information in just one place. I can't wait to get my hands on that book!
