Clusters are the heart of Materialize, the engines that make operational workloads go. Before you can ingest data from a source, maintain complex queries in real time, or sink out your changes, you have to size and create a cluster: an isolated pool of compute resources dedicated to your workloads.

Today we’re excited to announce a few improvements to Materialize’s cluster sizings, including new names, new sizes, and more oomph.

From T-shirts to engines

First, we’ve created a clearer naming system for cluster sizes. Until today, we’ve used “T-shirt” sizes, qualitative names like 2xsmall or medium that map to some amount of credit cost and behind-the-scenes compute resources.

After working hands-on with customers and prospects over the last year, we found a few wrinkles in this naming system. First, the T-shirt names gave no intuitive sense of how compute resources scale between sizes: we knew a medium was bigger than a 2xsmall, but by how much? If a workload was using 35% of a large, what size could it safely downsize to?

We also wanted to make it easier for users to understand the relationship between a cluster’s cost and its compute resources. A cluster’s credit cost has always been tied directly to its compute resources, but the T-shirt size names provided no insight into this relationship.

As a result, we’re deprecating the T-shirt sizes and introducing new cluster size names based on credit cost (specifically in “centicredits”, or cc).

Converting to the new names is easy: a 3xsmall cluster used to cost 0.25 credits/hour, and voilà, it’s now a 25cc cluster. An xlarge used to cost 16 credits/hour, so it’s now a 1600cc cluster. [1]

These names give a more intuitive mapping to relative size: how much more compute does an 800cc have than a 200cc? 4x! How much larger is a 1600cc than a 100cc? 16x! [2]

[1] For those unfamiliar with scooters, motorcycles, or Mario Kart, a 25cc engine is pretty teeny whereas a 1600cc is quite large.

[2] Note that these ratios aren’t always exact between cluster sizes for many deep technical reasons, but they’re a close approximation for how to think about relative scale. When we aren’t able to get an exact linear relationship, we always round up in favor of the customer and offer more compute per credit.
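The conversion above is just arithmetic: the new name is the old credits-per-hour rate multiplied by 100, suffixed with “cc”. A minimal sketch (legacy names and credit rates taken from the size table at the end of this post):

```python
# Credits per hour for each legacy T-shirt size, per the table below.
LEGACY_CREDITS_PER_HOUR = {
    "3xsmall": 0.25, "2xsmall": 0.5, "xsmall": 1, "small": 2,
    "medium": 4, "large": 8, "xlarge": 16, "2xlarge": 32,
    "3xlarge": 64, "4xlarge": 128, "5xlarge": 256, "6xlarge": 512,
}

def new_name(legacy_size: str) -> str:
    """Return the centicredit-based name: credits/hour * 100, plus 'cc'."""
    credits = LEGACY_CREDITS_PER_HOUR[legacy_size]
    return f"{int(credits * 100)}cc"

print(new_name("3xsmall"))  # 25cc
print(new_name("xlarge"))   # 1600cc
```

As footnote [2] notes, the cc numbers are a close (rounded-in-your-favor) approximation of relative compute, so treating 1600cc as 16x a 100cc is the right mental model even if the underlying hardware ratios aren’t exactly linear.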

Disk-enabled clusters

Second, in this new cluster sizing scheme, customers get disk-enabled clusters with spill-to-disk capacity. Yup, that’s right: Materialize processes workloads in memory, but as needs arise, it automatically offloads processing to disk, seamlessly handling key spaces that are larger than memory. This lets customers process larger datasets than memory alone would permit, with graceful degradation and reliable operation instead of running into memory constraints.

Intermediate sizes

Last up: new cluster size options! When we first drafted our cluster sizes, each T-shirt size offered roughly 2x the compute resources of the size below it. This gave customers great flexibility to right-size clusters for smaller workloads, but as a workload scaled, especially beyond the capacity of a medium, jumping to the next size up could mean a large jump in cost.

Therefore, our last improvement to the cluster sizing system is the addition of two new cluster sizes: 600cc and 1200cc. These fit between the former medium/large and large/xlarge sizes, respectively. This smooths out the sizing curve, giving customers more opportunities to right-size clusters running workloads of all sizes.

Conclusion

In total, here are our revamped cluster size offerings. We’re excited to offer this new and improved set of names, sizes, and spill-to-disk capabilities to power your Materialize workloads.

Size     Former Size  Credits (per hour)
25cc     3xsmall      0.25
50cc     2xsmall      0.5
100cc    xsmall       1
200cc    small        2
400cc    medium       4
600cc    (new)        6
800cc    large        8
1200cc   (new)        12
1600cc   xlarge       16
3200cc   2xlarge      32
6400cc   3xlarge      64
12800cc  4xlarge      128
25600cc  5xlarge      256
51200cc  6xlarge      512
