Today Pure Storage, known for their All-Flash Storage arrays, announced their next Big Thing™, DirectFlash, as well as a new FlashArray model. DirectFlash is a combination of NVMe hardware and the software to manage it (more details on that below). The new array model, the FlashArray//X, uses exclusively DirectFlash as the storage medium.
A Brief Flash Primer
“Flash” refers to silicon-based memory chips used for storage. It tends to come in one of two form factors.
The first is Solid State Drives (SSD). These typically have the same dimensions as hard-disk drives (HDD) and connect via either SATA or (more typically) SAS interfaces.
The second is Non-Volatile Memory Express (NVMe). These use (not surprisingly) non-volatile memory chips to store data. The “Express” in the name indicates that it connects via PCIe, which not only offers higher bandwidth than SATA or SAS, but is also both physically and logically closer to the storage controllers, making NVMe faster and higher bandwidth than SSD. (Strictly speaking, NVMe devices are solid-state drives too; in this post, “SSD” means the SAS- or SATA-attached kind.) NVMe is often quoted as being around 5X faster than SSD.
I’ll describe the new offerings from Pure below.
DirectFlash is the new technology offering from Pure Storage. The naming is confusing, because Pure uses the name “DirectFlash” to refer to both the hardware modules and to the software that manages them.
Normally, that wouldn’t be confusing at all — if anything, it would make life easier to be able to refer to hardware and the software that runs it by the same name because they’re a single product.
In DirectFlash’s case, however, it gets more complicated because the software doesn’t run on the hardware — they’re completely separate (which turns out to be the point, and a large part of what makes it work so well), although you can’t run the hardware without the software.
To attempt to avoid confusion, the hardware itself is referred to as the “DirectFlash Module” and the software as “DirectFlash Software”. Think of it this way:
DirectFlash == DirectFlash Module(s) + DirectFlash Software
and it will make more sense.
Not too surprisingly, the DirectFlash Module looks a lot like a PCIe card. It has very densely packed storage.
The DirectFlash Module is 100% NVMe. There are no SSDs involved, and — most significantly — no controllers on the Module itself.
No controllers means that no software whatsoever runs on the DirectFlash Module. None. This allows the storage system to access 100% of the raw storage on the Module.
This is significant because it’s very different from how SSDs work. Each SSD contains an onboard controller. This controller handles things like data placement, garbage collection, and wear-leveling for the Flash within that SSD. SSDs typically contain some “extra” Flash capacity that is hidden from the array. For example, if you disassembled a 100GB SSD, you might find that it actually contains 120GB of Flash capacity. The extra space is managed by the controller for the tasks mentioned above.
This means two things:
- The storage array never really has access to 100% of the capacity within an SSD.
- SSDs have an extra “layer” in the data path, meaning data to be written gets processed by the array, and then passed to the SSD’s controller where it is processed again before it is written to its destination.
DirectFlash Modules make 100% of their NVMe capacity available to the array. The lack of an onboard controller makes for a shorter data path. Data to be written gets processed by the array, and then is written to its destination on the Module.
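To put a number on the hidden-capacity point, here is a minimal sketch of the arithmetic using the 100GB/120GB SSD example from above. The function name `usable_fraction` is my own, and the 120GB module is a made-up size chosen only so the two cases are comparable; none of this comes from Pure.

```python
def usable_fraction(raw_gb, exposed_gb):
    """Fraction of a device's raw flash that the array can address directly."""
    if raw_gb <= 0 or not 0 <= exposed_gb <= raw_gb:
        raise ValueError("need raw_gb > 0 and 0 <= exposed_gb <= raw_gb")
    return exposed_gb / raw_gb

# SSD from the example above: 120GB of raw flash, 100GB exposed to the array.
ssd_fraction = usable_fraction(raw_gb=120, exposed_gb=100)

# Hypothetical controller-less module of the same raw size: everything exposed.
module_fraction = usable_fraction(raw_gb=120, exposed_gb=120)

print(f"SSD exposes {ssd_fraction:.1%} of its raw flash")
print(f"Module exposes {module_fraction:.1%} of its raw flash")
```

Garbage collection and wear-leveling still need working space either way. The difference is who gets to decide how that space is used: the drive's own controller, or the array.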
Before I describe the DirectFlash Software, the picture below shows what the DirectFlash Module looks like. You can see the memory chips, as well as the snazzy orange heatsink with the Pure logo.
DirectFlash Modules will come in one of three capacities: 2.2, 9.1, and 18.3TB raw NVMe Flash. The Modules will be sold in “Chassis Capacity Packs” consisting of 10 modules.
Now, obviously, hardware doesn’t just run itself. That’s where the DirectFlash Software comes in.
DirectFlash Software is a software module (yes, they use the word “module” to indicate the hardware, but still call the software a “software module” — I said the naming was confusing) that runs as part of Pure Storage’s Purity Operating Environment.
The DirectFlash Software allows the storage array to manage all of the NVMe Flash in the array globally, where “globally” means “within this particular array”. (Yes, I get it, it’s even more naming confusion. The choice of the term “global” was Pure’s, not mine.)
This “Global Flash Management” is logically divided into three areas:
- Adaptive I/O Control
This includes all I/O scheduling. It also allows for Flash-level Quality of Service (QoS) controls.
- Smart Endurance
This includes space allocation and wear-leveling. Garbage collection gets grouped in here as well.
- Predictive Resiliency
This includes block-level telemetry, management of bad blocks, and data encryption (all data on Pure Storage arrays is stored using built-in data-at-rest encryption).
Moving management of these functions from the drive or module level (like with SSDs) to the array level allows for increased levels of performance, density, and efficiency.
Wear-leveling doesn’t need to be handled within each individual module, but can be controlled across all the modules. Data placement decisions don’t need to be made within every module, but can be made across the “pool” of all the modules. The elimination of the extra layer improves performance (above and beyond the gains already made by using PCIe instead of SAS). Access to all the NVMe capacity improves efficiency.
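To illustrate why pooling helps, here is a toy simulation of wear-leveling done per module versus across the whole pool. This is my own sketch, not Pure's algorithm: the policies, the 8:1 skewed workload, and names like `simulate` and `wear_spread` are all assumptions made for illustration.

```python
import random

def simulate(policy, modules=4, blocks_per_module=8, writes=1000, seed=0):
    """Return per-block wear counts after a skewed write workload."""
    rng = random.Random(seed)
    wear = [[0] * blocks_per_module for _ in range(modules)]
    # Skewed workload: module 0 is addressed 8x as often as each other module.
    weights = [8] + [1] * (modules - 1)
    for _ in range(writes):
        addressed = rng.choices(range(modules), weights=weights)[0]
        if policy == "per_module":
            # Each module can only level wear across its own blocks.
            m = addressed
        elif policy == "global":
            # The array may place the write on any module in the pool.
            m = min(range(modules), key=lambda i: min(wear[i]))
        else:
            raise ValueError(f"unknown policy: {policy}")
        b = wear[m].index(min(wear[m]))  # least-worn block in that module
        wear[m][b] += 1
    return wear

def wear_spread(wear):
    """Gap between the most-worn and least-worn block in the pool."""
    counts = [c for module in wear for c in module]
    return max(counts) - min(counts)

per_module = simulate("per_module")
pooled = simulate("global")
print("per-module wear spread:", wear_spread(per_module))
print("pooled wear spread:    ", wear_spread(pooled))
```

With per-module leveling, the busy module wears out long before the others. With pooled leveling, every block in the array ends up within one write of every other block, no matter how lopsided the workload is.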
If you’re familiar with Pure Storage, you’re already aware of the FlashArray//m. The //m line uses SSD for storage.
The FlashArray//X is a familiar-looking chassis, but instead of SSDs it uses DirectFlash Modules for storage. All in all, from the outside, the //X line looks remarkably like the //m line, so, naturally, Pure decided to use a new bezel to distinguish between the two.
Since, for some reason, folks get excited about bezels these days, you can see a picture of the FlashArray//X bezel below.
(Back in my day we didn’t care about no fancy-schmancy bezels. Until increased cooling requirements necessitated their use, we’d leave the bezels off of installed systems. We wanted to see the cool stuff inside our systems. But, somehow today when a new model array is announced, somebody in the crowd always asks “What’s the bezel look like?”)
The FlashArray//X is “NVMe over Fabric (NVMeF) Ready”, meaning it won’t support NVMeF at GA, but look for it in future releases.
When the FlashArray//X enters General Availability, Pure Storage customers with existing FlashArray//m systems will be able to non-disruptively upgrade them to FlashArray//X systems.
I’ll say that again, because I think this is huge.
There is a non-disruptive upgrade path from the FlashArray//m (using SSD) to the FlashArray//X (using NVMe in the DirectFlash Modules).
I know what you’re thinking. It’s something like “Wait, but the //m uses SSD over SAS and the //X uses NVMe over PCIe — how is a non-disruptive upgrade from one to the other possible? Wouldn’t you need to swap the chassis?”
No. The FlashArray//m and FlashArray//X use the exact same chassis. The non-disruptive upgrade works because Pure built them with both SAS and PCIe from the beginning (which in this case means when the FlashArray//m was first introduced in 2015). The presence of the PCIe/NVMe connectivity was never really emphasized because, before DirectFlash, Pure wasn’t making use of it.
This advance planning enables the non-disruptive upgrade from //m to //X, and fits right in with Pure’s Evergreen Storage model.
Specifications for the FlashArray//X70 are listed in the table below.
NOTE: “Effective Capacity” calculations take into account both overhead and data reduction. Overhead (things that take away from effective capacity) includes: HA, RAID, and metadata. Data Reduction (things that improve effective capacity) includes: inline data deduplication, compression, and pattern removal. Pure Storage calculates average data reduction at a 5:1 ratio. Obviously, if a customer’s data set can be reduced at a greater than 5:1 ratio, that customer’s effective capacity would be higher.
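The note above can be sketched as a simple calculation. This is a rough model of my own, assuming overhead can be treated as a single fraction of raw capacity and that data reduction then multiplies what is left; Pure's actual calculation may differ. The 100TB raw figure and the 25% overhead are made-up inputs, not Pure's numbers.

```python
def effective_capacity_tb(raw_tb, overhead_fraction, reduction_ratio=5.0):
    """Estimate effective capacity from raw capacity.

    overhead_fraction: share of raw capacity consumed by HA, RAID, and
                       metadata (hypothetical value, supplied by the caller).
    reduction_ratio:   data reduction ratio, e.g. 5.0 for 5:1.
    """
    if raw_tb < 0:
        raise ValueError("raw_tb must be non-negative")
    if not 0 <= overhead_fraction < 1:
        raise ValueError("overhead_fraction must be in [0, 1)")
    if reduction_ratio <= 0:
        raise ValueError("reduction_ratio must be positive")
    usable_tb = raw_tb * (1.0 - overhead_fraction)  # what is left after overhead
    return usable_tb * reduction_ratio              # what reduction buys back

# Made-up inputs: 100TB raw, 25% overhead.
at_5_to_1 = effective_capacity_tb(100, 0.25, reduction_ratio=5.0)
at_6_to_1 = effective_capacity_tb(100, 0.25, reduction_ratio=6.0)
print(f"5:1 reduction -> {at_5_to_1:.0f}TB effective")
print(f"6:1 reduction -> {at_6_to_1:.0f}TB effective")
```

The second call shows the point made in the note: the same hardware yields more effective capacity for a data set that reduces better than 5:1.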
FlashArray//X with 2.2 or 9.1TB DirectFlash Modules is available to order today. It will start shipping as a Directed Availability release in May or June of this year.
General Availability of FlashArray//X, which will support the 2.2, 9.1, and 18.3TB DirectFlash Modules, is expected to begin in the August to October timeframe.
Non-disruptive upgrades from existing FlashArray//m systems to FlashArray//X are also expected to be in General Availability in the August to October timeframe.
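The date ranges above were converted from fiscal-quarter listings. Here is a rough sketch of that kind of conversion, assuming a fiscal year that starts in February and is named for the calendar year in which it ends. Both of those are my assumptions for illustration, as are the function names and the “Q3 FY18” example input.

```python
import calendar

def fiscal_quarter_months(fiscal_year, quarter, start_month=2):
    """Return ((year, month), (year, month)) for a fiscal quarter's first and last month.

    Assumes the fiscal year is named for the calendar year in which it ends.
    """
    if quarter not in (1, 2, 3, 4):
        raise ValueError("quarter must be 1-4")
    if not 1 <= start_month <= 12:
        raise ValueError("start_month must be 1-12")
    # A fiscal year starting in January coincides with the calendar year;
    # otherwise it starts in the previous calendar year.
    start_year = fiscal_year if start_month == 1 else fiscal_year - 1
    first = (start_month - 1) + 3 * (quarter - 1)  # months since January of start_year
    last = first + 2

    def to_year_month(index):
        return (start_year + index // 12, index % 12 + 1)

    return to_year_month(first), to_year_month(last)

def describe(fiscal_year, quarter):
    (y1, m1), (y2, m2) = fiscal_quarter_months(fiscal_year, quarter)
    return f"{calendar.month_name[m1]} {y1} to {calendar.month_name[m2]} {y2}"

# Hypothetical listing of "Q3 FY18" under the assumptions above.
print(describe(2018, 3))
```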
- First and foremost, I’m impressed. There’s a lot of good stuff in this announcement.
- The foresight to have included the (at the time, unused) PCIe/NVMe built-in to the FlashArray//m chassis from the start speaks well of Pure Storage and how much thought has gone into their product lines.
- Pure Storage marketing is going to go on about DirectFlash being the “first” or “only” “software-defined hardware module”, or something like that. Ignore that — it’s just confusing. Instead, think about how separating the hardware and the management of that hardware from each other improves both performance and efficiency. That’s the important part.
- That said, the naming around DirectFlash is confusing. Is it hardware? Is it software? It turns out it’s both, but they’re separate.
- That ability to non-disruptively upgrade a FlashArray//m to a FlashArray//X is simply beautiful.
- Pure Storage has done confusing stuff by choosing a Fiscal Year that runs February through January. In order to give you the Availability information, I had to convert from “FY18” quarter listings to date ranges listed the way actual human beings talk about them. (You’re welcome.)
- There’s an often-talked-about concept around storage these days, referred to as either “storage density” or “capacity density”. The FlashArray//X70 has one of the highest capacity densities around: 1PB effective capacity in 3RU (using the 18.3TB DirectFlash Modules), for a Capacity Density of 333TB/RU. That’s pretty impressive.
- Given the movement from disk to Flash and from SSD to NVMe, I believe we’ll soon be talking about a new storage concept that I’m going to call Performance Density. Performance Density will be measured as IOPS per RU. I’d tell you what the FlashArray//X70’s Performance Density is, but Pure hasn’t given me IOPS numbers. I’m guessing it’s fairly high, though.
- I predict we’re going to see a lot of Pure’s competition throw a lot of FUD at this one…
- I predict that there may be some initial customer resistance on this — until they see it proven on actual Production workloads. At that point I predict resistance will melt away and nearly all FlashArray//m customers will start making plans to upgrade to FlashArray//X ASAP.
- DirectFlash validates the predicted trend for high-performance, low-latency storage — the move from SSD to NVMe.
- I’m looking forward to the future announcement of NVMe over Fabric.
- I’d really love to get my hands on one of these to run it through its paces. If anyone has any advice on how I might convince Pure Storage to donate one to the GeekFluent Home Lab Project, please reach out and let me know…
- Official Pure Storage Press Release
- Updated FlashArray Data Sheet (includes both //m and //X)