I’ve been involved with Qumulo since they came out of stealth approximately 18 months ago — I actually helped sell a Qumulo cluster to one of our customers while Qumulo was still operating in stealth mode — so it will come as no surprise that I’m a fan.
Qumulo has existed as a company for four years now. They currently have 100 customers running Qumulo Core on more than 60PB of raw disk. Qumulo uses an Agile Development process, updating their software every two weeks. Because of this, 70% of Qumulo clusters are running software that was released during the current quarter. The majority of their feature upgrades and tweaks are based on direct customer feedback, which has led to over 50% of their quarterly sales coming in as reorders from existing customers.
What’s New in Qumulo Core 2.5
Version 2.5 is Qumulo’s first major release since Qumulo Core 2.0. It adds a few pieces of enhanced functionality, including:
- Snapshots
- More configuration options for erasure coding
- Throughput Analytics
- Metadata storage moved to SSD
Snapshots
This is a feature Qumulo customers have been asking for. Snapshots in Qumulo Core 2.5 are pointer-based and use redirect-on-write. There is no limit on the number of snapshots, either per volume or in total. Snapshots also have API support, allowing them to be managed by another system or process.
Currently, Qumulo snapshots are read-only. Support for writable snapshots (sometimes called “clones”) is on the roadmap, but with no projected delivery date attached to it at the moment.
There is also currently no way within Qumulo Core to schedule snapshots. That said, Qumulo does provide a work-around: using the snapshot API, customers can write a script that runs from cron on one of their servers and makes API calls to take snapshots. In effect, the customer outsources the scheduling to that server. Scheduled snapshots are on the roadmap, but also without a projected delivery date at the moment.
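A minimal sketch of that cron-driven work-around might look like the following. The endpoint path, port, and bearer-token auth here are my own illustrative assumptions, not Qumulo's documented REST API; only the pattern — a small script cron can run that makes a snapshot call — is the point.

```python
"""Cron-driven snapshot script (sketch).

The /v1/snapshots/ path, port 8000, and token handling are assumptions
for illustration -- consult Qumulo's actual REST API documentation.
"""
import json
import urllib.request

CLUSTER = "https://qumulo.example.com:8000"  # assumed cluster address
TOKEN = "session-token-from-login"           # assumed auth token

def build_snapshot_request(cluster, source_path):
    # Request construction is separated from I/O so it is easy to test.
    url = f"{cluster}/v1/snapshots/"         # hypothetical endpoint
    body = json.dumps({"source_path": source_path}).encode()
    return urllib.request.Request(
        url,
        data=body,
        headers={
            "Authorization": f"Bearer {TOKEN}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

def take_snapshot(source_path="/"):
    """Ask the cluster to snapshot the given path and return its reply."""
    req = build_snapshot_request(CLUSTER, source_path)
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)

# From cron you might schedule, e.g. hourly:
#   0 * * * *  /usr/bin/python3 snap.py
# where snap.py simply calls take_snapshot("/").
```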
Erasure Coding
Qumulo added erasure coding in version 2.0 of Qumulo Core. Prior to that, customers could only use RAID 10, meaning a Qumulo cluster had a maximum storage efficiency of 50%. In 2.0, Qumulo used a 6/2 scheme, although I'm used to thinking of that as a 4+2 layout. In this scheme, data is striped over six drives, and that set of six can suffer the failure of two of those drives and still serve data. This resulted in a maximum storage efficiency of 66%.
In Qumulo Core 2.5, data stripes can be wider, using a 10/2 (or 8+2) scheme: data is striped over ten drives, and the stripe can still tolerate the failure of two of those drives. This offers a maximum storage efficiency of 80%.
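The efficiency figures above follow directly from the stripe geometry — usable capacity is just the data drives divided by the total drives in the stripe:

```python
def storage_efficiency(data_drives, parity_drives):
    """Fraction of raw capacity usable for data in a d+p stripe."""
    return data_drives / (data_drives + parity_drives)

# RAID 10 mirrors every block: one data copy per two drives.
assert storage_efficiency(1, 1) == 0.5             # 50%
# 4+2 erasure coding (Qumulo Core 2.0).
assert round(storage_efficiency(4, 2), 2) == 0.67  # ~66%
# 8+2 erasure coding (new in Qumulo Core 2.5).
assert storage_efficiency(8, 2) == 0.8             # 80%
```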
There are, however, three “gotchas” with the new erasure coding customers need to be aware of:
- Customers will need a minimum of six Qumulo nodes in a cluster in order to use the 8+2 scheme. This is because the data is spread across a larger number of drives, and you don't want too many of a stripe's drives to sit within a single node.
- There is currently no in-place conversion available. This means if your Qumulo cluster is currently using the 4+2 erasure coding scheme (or the RAID 10 scheme), and you upgrade to Qumulo Core 2.5, there is no way to convert to the 8+2 erasure coding scheme — you will continue to use the previous, less-efficient scheme.
In-place conversion (the ability to change protection schemes without needing to migrate the data off of — and then back on to — the cluster) is on Qumulo’s roadmap. There is currently no projected release date for this feature.
- The data protection scheme must be selected at install time. As of Qumulo Core 2.5, there is no way to change schemes without a complete reinstall. The in-place conversion feature described above will fix this when it becomes available.
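On the six-node minimum in the first gotcha: Qumulo hasn't published its exact placement rule, but the constraint is easy to reason about. If a 10-drive stripe is spread as evenly as possible across the nodes, no node may hold more stripe members than the parity can cover, or a single node failure would break the stripe. The function below is my own sketch of that reasoning, not Qumulo's algorithm:

```python
import math

def min_nodes_for_stripe(stripe_width, parity):
    """Smallest node count at which a whole-node failure cannot take
    out more stripe members than the parity drives can cover, assuming
    the stripe's drives are spread as evenly as possible across nodes."""
    n = 1
    while math.ceil(stripe_width / n) > parity:
        n += 1
    return n

# 8+2: a 10-drive stripe with 2-drive fault tolerance.
print(min_nodes_for_stripe(10, 2))  # -> 5
```

Note that this naive model only demands five nodes; Qumulo's stated minimum is six, presumably to leave headroom for rebuilds — that last part is my speculation.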
Throughput Analytics
One of the best features of Qumulo Core has always been the insight it offers into the data stored on the cluster. It's been very easy to track which clients, users, and groups have been using how much capacity.
Qumulo Core 2.5 adds throughput insights to those analytics, making it possible to determine which clients are doing IOPS against which files and directories. Beyond IOPS, it can also report throughput, for example, that a given client is reading data from a given directory at so many MB/s. This gives customers greater insight into how their storage is being used.
At this time, Qumulo's analytics are real-time only; they do not offer historical trending information. There is a scripting work-around for this as well: a script can issue API calls to the cluster at regular intervals to extract the current information, then store it in a database to build up a history.
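That polling script could be sketched as below. The `/v1/analytics/throughput` endpoint and its response shape (a list of objects with `client`, `path`, and `mbps` fields) are assumptions of mine for illustration; the real API will differ, but the poll-and-store pattern is the point.

```python
"""Poll current throughput analytics and store them for trending (sketch)."""
import json
import sqlite3
import time
import urllib.request

DB = sqlite3.connect("qumulo_history.db")  # local history database
DB.execute("""CREATE TABLE IF NOT EXISTS throughput
              (ts REAL, client TEXT, path TEXT, mbps REAL)""")

def fetch_current_throughput(cluster="https://qumulo.example.com:8000"):
    # Hypothetical endpoint returning the current per-client throughput.
    with urllib.request.urlopen(f"{cluster}/v1/analytics/throughput") as r:
        return json.load(r)

def record_sample(entries, now=None):
    """Append one polling sample's entries to the history table."""
    now = time.time() if now is None else now
    DB.executemany(
        "INSERT INTO throughput VALUES (?, ?, ?, ?)",
        [(now, e["client"], e["path"], e["mbps"]) for e in entries],
    )
    DB.commit()

# Run from cron every few minutes:
#   record_sample(fetch_current_throughput())
# then query the throughput table for historical trending.
```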
Metadata on SSD
All Qumulo nodes use hybrid storage — a mix of SSD and HDD. They use a Flash-first model for handling data to offer the best performance possible.
Qumulo Core 2.5 makes the change to store all metadata on SSD. This one simple change provides a 25X performance increase for metadata operations. (No, I didn’t drop a decimal point. That is twenty-five times faster metadata operations.)
Now, some of you are thinking, “Yeah, but that’s just metadata. So what?”
I’ll tell you so what. In a typical NFS environment, anywhere from 50 to ~80% of all NFS operations are GetAttribute (GETATTR) calls. (To get file size, creation date, last modified date, contents of a directory, etc.) All this information is metadata.
Additionally, all of the analytics information that Qumulo Core provides is essentially metadata as well.
Qumulo Core on HPE Apollo Servers
Qumulo’s value-add has always been in their Qumulo Core software, but it’s only been available running on the hardware appliances purchased from Qumulo.
Until now, that is. Qumulo has also announced the first entry on their Hardware Compatibility List (HCL). This first entry is the HPE Apollo 4200 Gen9 Server.
Qumulo assures me that support for more HPE servers is on the roadmap, and that more vendors' servers are on the support roadmap as well. (I pressed, but they declined to disclose who those other vendors might be, citing NDAs with them.)
Going with this option will push customers into a dual-provider support model: support for Qumulo Core issues will be provided by Qumulo, and support for any HPE hardware issues will be provided by HPE. For issues where it's not initially clear whether the cause is software or hardware, customers are advised to call Qumulo first to help make that determination.
Qumulo Core 2.5 entered General Availability on 15 November.
Qumulo Core 2.5 on HPE Apollo 4200 Servers also entered General Availability on 15 November.
My thoughts on the Qumulo announcements are below.
- Snapshots are a must-have feature for enterprise storage. I wish Qumulo had been able to release this functionality sooner, but that makes it no less welcome now.
- I'm hoping Qumulo will follow up ASAP with the ability to perform scheduled snapshots without requiring a script on a separate server.
- While less urgent than scheduled snaps, hopefully writable snapshots will follow shortly thereafter.
- A more efficient erasure coding protection scheme is a great addition to Qumulo’s value proposition, but it really needs the in-place conversion functionality in order to be truly useful. At the moment the new 8+2 scheme can only be used on newly-deployed clusters. With more than 50% of Qumulo’s revenue coming from existing customers, I think the in-place conversion feature is essential for Qumulo.
- I believe Qumulo is too focused on its existing customer base, which may keep it from growing faster. While expanding your sales to current customers is a good thing, getting new customers is even better.
- If I were in charge at Qumulo, the Development team would be given three immediate highest-priority goals:
1. Scheduling of snapshots
2. Remote replication
3. Data-in-place conversion of protection schemes
- Qumulo's alliance with HPE has been going on for a while. If Qumulo wants to build a reputation as a hardware-agnostic, software-defined storage company, it needs to get other vendors' servers onto its HCL soon.
Those are my thoughts, but I’d love to hear yours. Add your thoughts on Qumulo Core 2.5 or Qumulo in general in the comments below.