Today, storage startup Qumulo came out of stealth mode. My company, Accunet Solutions, was involved in one of the first Qumulo sales and implementations. Our customer was invited to attend Qumulo’s launch event, and, as the SE involved with the deal, I also scored an invite.
Qumulo offers data-aware scale-out NAS that is remarkably simple to install and manage. I’ll go through details on both the company and the product below.
The folks behind Qumulo have a long history in storage in general, and scale-out in particular. The names of the company’s three co-founders, Peter Godman, Neal Fachan, and Aaron Passey, might be familiar. They were the primary inventors of the Isilon scale-out NAS platform. (Isilon was acquired by EMC in 2010 for $2.25 billion.) Isilon’s founder, Sujal Patel, joined Qumulo’s Board of Directors earlier this year.
Qumulo is headquartered in Seattle, WA, the same city that housed the old Isilon HQ. To date, Qumulo has raised $67 million in investor funding.
I’ll skip over the usual background-setting lecture that every storage vendor seems to want to give lately about how much data growth we’ve seen over the years and expect to see in the next few years (Hint: it’s a LOT). I’ll sum it up like this:
More and more people and organizations are creating more and more data. They want (or, in some cases, are required by law) to hold onto it longer. They want faster access to their data from a wider variety of devices.
In short, management of data has become a difficult problem. Management of metadata (the data about the data) is also becoming difficult. For typical NFS storage usage, anywhere from 50 to 80% of user transactions are metadata queries, not actual reads and writes.
Sound hard to believe? Think about it. You open a folder, the contents are displayed (all metadata). You open up a subfolder, and then one under that (both metadata requests). You then look at the “last modified” times to determine which is the most-recent version of the file you want (another metadata request) before opening the file (a read request).
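To make that concrete, here is a rough Python sketch of the "find the newest version and open it" routine described above, with a counter for each kind of request. The function name `read_newest_file` and the `counts` dictionary are made up for illustration, and a local temp folder stands in for an NFS share, so treat the tally as a way to see the pattern rather than a measurement of any real workload.

```python
import os
import tempfile

def read_newest_file(folder, counts):
    """List a folder, pick the most recently modified file, and read it.

    `counts` is a plain dict tallying metadata requests vs. data reads.
    """
    with os.scandir(folder) as entries:       # listing the folder: metadata
        counts["metadata"] += 1
        files = [entry for entry in entries if entry.is_file()]

    newest = None
    newest_mtime = float("-inf")
    for entry in files:
        mtime = entry.stat().st_mtime         # "last modified" check: metadata
        counts["metadata"] += 1
        if mtime > newest_mtime:
            newest, newest_mtime = entry.path, mtime

    if newest is None:
        return b""                            # empty folder: nothing to read
    with open(newest, "rb") as fh:            # the only real data read
        counts["data"] += 1
        return fh.read()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        for i, name in enumerate(["report_v1.txt", "report_v2.txt", "report_v3.txt"]):
            path = os.path.join(root, name)
            with open(path, "w") as fh:
                fh.write(name)
            os.utime(path, (1000 + i, 1000 + i))   # force distinct mtimes
        counts = {"metadata": 0, "data": 0}
        read_newest_file(root, counts)
        print(counts)
```

Even in this tiny three-file folder, the metadata lookups outnumber the single data read, and the gap only widens as folders get bigger.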
Providing rapid access to the metadata becomes a priority for storage. Sounds simple enough. Now, imagine that in the example above, each folder mentioned contains more than a million files and/or subfolders. If the storage system needs to look up those listings each and every time, the mere act of opening a folder will slow to a crawl.
Other metadata issues become important. Administrators often want to be able to quickly determine how the storage is being used. Obviously, this is useful for returning regular reports to their management and users, but what about when a user reports a performance issue? Now the admin needs that information in real time, right now.
Most storage systems allow the admin to fairly quickly determine which client connections are responsible for what percentage of the current workload. What’s more difficult to determine — and what some storage systems can’t give an administrator any sort of easy access to — is which specific users are currently reading or writing which specific files. Other real-time info admins would often want access to includes:
- Which user(s) just added additional data to the system
- Which user(s) or group(s) have gone over quota recently
- Who owns what percent of the data in a specific folder (or in one of its subfolders)
- What specific files are currently being written to
Again, imagine the “million files per folder” scenario. Now imagine the storage system needs to walk through the entire folder (or worse — the entire filesystem) to produce the report on any of the above queries. By the time the admin has the data, it’s no longer real-time.
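For a sense of what that tree walk looks like, here is a minimal Python sketch of a "who owns how much" report built the slow way. The function name `bytes_by_owner` is my own, and it runs against a local directory rather than any particular storage system. The point is simply that every file has to be visited before the report can be returned.

```python
import os
import tempfile
from collections import Counter

def bytes_by_owner(root):
    """Walk every file under `root` and total the bytes per owning uid.

    Returns (totals, files_visited). The work grows with the number of
    files in the tree, which is what makes this approach slow at scale.
    """
    totals = Counter()
    files_visited = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            info = os.lstat(os.path.join(dirpath, name))  # one metadata lookup per file
            totals[info.st_uid] += info.st_size
            files_visited += 1
    return totals, files_visited


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        os.makedirs(os.path.join(root, "projects", "film"))
        with open(os.path.join(root, "projects", "film", "clip.bin"), "wb") as fh:
            fh.write(b"\0" * 1024)
        totals, visited = bytes_by_owner(root)
        print(dict(totals), visited)
```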
Qumulo’s product, Qumulo Core, solves this problem by updating analytic metadata on the fly. When a user adds a new file to a folder, that folder’s info (metadata) is updated (who owns how much data, what files are open, etc.). Now, when the Qumulo admin wants to perform any of the queries given above, the information is immediately available. No tree-walk needed. No waiting for a report to be generated. The info is available right now, and will continue to be updated in real time.
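The general idea of updating aggregates at write time can be sketched in a few lines of Python. To be clear, this is a toy in-memory model of my own, not Qumulo's implementation: the class `AggregateTree` and its methods are hypothetical names, and a real filesystem also has to deal with deletes, renames, concurrency, and persistence.

```python
from collections import Counter
from pathlib import PurePosixPath

class AggregateTree:
    """Toy model: every directory keeps running totals for its whole subtree."""

    def __init__(self):
        self.bytes_by_owner = {}     # directory path -> Counter(owner -> bytes)
        self.file_count = Counter()  # directory path -> number of files below it

    def add_file(self, path, owner, size):
        # Update each ancestor directory when the file is written.
        # Cost is proportional to the path depth, not to the file count.
        for parent in PurePosixPath(path).parents:
            key = str(parent)
            self.bytes_by_owner.setdefault(key, Counter())[owner] += size
            self.file_count[key] += 1

    def usage(self, directory):
        # Answering the query is a lookup; no tree walk is needed.
        return dict(self.bytes_by_owner.get(directory, {}))

    def owner_share(self, directory, owner):
        totals = self.bytes_by_owner.get(directory, Counter())
        total = sum(totals.values())
        return totals[owner] / total if total else 0.0


if __name__ == "__main__":
    tree = AggregateTree()
    tree.add_file("/projects/film/a.mov", "alice", 300)
    tree.add_file("/projects/film/b.mov", "bob", 100)
    tree.add_file("/projects/audio/c.wav", "alice", 100)
    print(tree.usage("/projects"))
    print(tree.owner_share("/projects", "alice"))
```

The trade-off in a design like this is a little extra work on every write in exchange for queries that no longer depend on how many files sit under a folder.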
Qumulo doesn’t just make the analytic information available in real time; it does so through a slick, intuitive, and easy-to-navigate interface. The system map in particular is visually very impressive.
Don’t take my word for it. Check out the demo of the real-time analytics in the video below:
The Qumulo Core software that makes all of this possible is available in two ways. Customers can purchase hardware nodes from Qumulo, or they can license the software to run on their own validated hardware. I’m going to focus on the Qumulo hardware for the rest of this write-up (for information on acquiring the software-only version, contact your Qumulo reseller), but what I discuss below (outside of the hardware specs, obviously) applies to either method.
A Qumulo cluster requires a minimum of 4 nodes and, today, can scale up to 1,000 nodes. Communication between the nodes in a cluster is done over the same 10GbE interfaces that the nodes use to serve data to clients, so no additional back-end network is required.
Today, Qumulo offers only a single type of hardware node, the Q0626. This node has a 1U form factor and provides 25.6TB of raw storage capacity. Qumulo has hinted that larger, denser nodes are on the roadmap.
A Qumulo cluster provides a single global namespace and a single filesystem. It is able to serve data to NFS and SMB clients simultaneously.
The Qumulo Q0626 specifications can be found in the table below:
Both the Qumulo Q0626 and the software-only version of the Qumulo Core product are available now. Entry pricing for a four-node cluster of Q0626 units is approximately $50,000; such a cluster provides roughly 100TB of raw storage capacity.
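For anyone who wants to check the math, the raw-capacity figures fall straight out of the numbers quoted in this post. The short Python sketch below uses only those numbers; the per-terabyte price is my own back-of-the-envelope derivation from the approximate entry price, not a quoted figure, and the function names are just for illustration.

```python
NODE_RAW_TB = 25.6         # raw capacity per Q0626 node, from this post
MIN_NODES = 4              # minimum cluster size, from this post
MAX_NODES = 1000           # stated scale limit, from this post
ENTRY_PRICE_USD = 50_000   # approximate entry price, from this post

def raw_capacity_tb(nodes):
    """Raw (not usable) capacity of a cluster built only from Q0626 nodes."""
    return nodes * NODE_RAW_TB

def price_per_raw_tb(price_usd, nodes):
    """Derived estimate: price divided by raw capacity."""
    return price_usd / raw_capacity_tb(nodes)


if __name__ == "__main__":
    print(f"Entry cluster: {raw_capacity_tb(MIN_NODES):.1f} TB raw")
    print(f"Largest cluster: {raw_capacity_tb(MAX_NODES) / 1000:.1f} PB raw")
    print(f"Entry price per raw TB: ${price_per_raw_tb(ENTRY_PRICE_USD, MIN_NODES):.2f}")
```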