Autoscaling

This page explains how autoscaling works. Before you read this page, you should be familiar with the Overview of Bigtable and Instances, clusters, and nodes.

In Bigtable, instances are containers for clusters, which are location-specific resources that handle requests. Each cluster has one or more nodes, which are compute resources used to manage your data. When you create a cluster in an instance, you choose either manual node allocation or autoscaling.

With manual node allocation, the number of nodes in the cluster remains constant until you change it. When autoscaling is enabled, Bigtable continuously monitors the cluster and automatically adjusts the number of nodes in the cluster when necessary. Autoscaling works on both HDD and SSD clusters, in all Bigtable regions.

You can configure autoscaling in the Google Cloud console, using gcloud, or using the Cloud Bigtable client library for Java.

When to use autoscaling

We recommend that you enable autoscaling in most cases. The benefits of autoscaling include the following:

Costs - Autoscaling can help you optimize costs because Bigtable reduces the number of nodes in your cluster whenever possible. This can help you avoid over-provisioning.
Performance - Autoscaling lets Bigtable automatically add nodes to a cluster when a workload changes or there is an increase in data storage requirements. This helps maintain workload performance objectives by ensuring that the cluster has enough nodes to meet the target CPU utilization and storage requirements.
Automation - Autoscaling reduces management complexity. You don't need to monitor and scale the cluster size manually or write an application to do these tasks, because the Bigtable service handles them for you.

Autoscaling alone might not work well for the following workload types. Even though Bigtable quickly adds nodes when traffic increases, it can take time to rebalance data across the additional nodes:

Bursty traffic
Sudden batch workloads

If your spikes in usage are predictable or regularly scheduled, you can use autoscaling and adjust the settings before the planned bursts. See Delay while nodes rebalance for details.

How autoscaling works

Autoscaling is the process of automatically changing the size of a cluster by adding or removing nodes. When autoscaling is enabled and your cluster's workload or storage needs fluctuate, Bigtable either scales up, adding nodes to the cluster, or scales down, removing nodes from the cluster.

Bigtable autoscaling determines the number of nodes required, based on the following dimensions:

CPU utilization target
Storage utilization target
Minimum number of nodes
Maximum number of nodes

Each scaling dimension generates a recommended node count, and Bigtable automatically uses the highest one. This means, for example, that if your cluster needs 10 nodes to meet your storage utilization target but 12 to meet your CPU utilization target, Bigtable scales the cluster to 12 nodes.
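The selection rule above can be sketched in a few lines of Python. This is a minimal illustration, not Bigtable's implementation. The function name pick_node_count and its parameters are hypothetical, the per-dimension recommendations are taken as inputs rather than computed, and treating the minimum and maximum number of nodes as bounds on the result is an assumption based on the parameter descriptions.

```python
def pick_node_count(cpu_nodes: int, storage_nodes: int,
                    min_nodes: int, max_nodes: int) -> int:
    """Return the node count a cluster would be scaled to.

    cpu_nodes and storage_nodes are the node counts needed to meet the CPU
    and storage utilization targets. They are hypothetical inputs: how
    Bigtable derives them is not modeled here.
    """
    # The highest per-dimension recommendation wins.
    needed = max(cpu_nodes, storage_nodes)
    # Assumption: the configured minimum and maximum bound the result.
    return max(min_nodes, min(needed, max_nodes))


# Example from the text: storage needs 10 nodes, CPU needs 12 nodes.
print(pick_node_count(cpu_nodes=12, storage_nodes=10, min_nodes=2, max_nodes=20))
```

With the values from the example, the sketch returns 12, because the CPU recommendation is the higher of the two and falls within the configured bounds.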

As the number of nodes changes, Bigtable continuously optimizes the storage, rebalancing data across the nodes, to ensure that traffic is spread evenly and no node is overloaded.

After a cluster is scaled up, Bigtable automatically rebalances data across the nodes in your cluster for optimal performance. All requests continue to reach the cluster while scaling and rebalancing are in progress. See Scaling limitations for more information.

If a cluster has scaled up to its maximum number of nodes and the CPU utilization target is exceeded, requests might have high latency or fail. If a cluster has scaled up to its maximum number of nodes and the storage utilization limit is exceeded, write requests will fail. See Storage per node for more details on storage limits.

When a node is added to a small cluster, such as a one-node cluster, you might observe a temporary increase in latency as the cluster rebalances. This is because a single additional node is a large proportional change: adding one node to a one-node cluster doubles its size. Similarly, if a cluster decreases in size from two nodes to one node, you might observe a temporary increase in latency.

When a cluster is scaled down, nodes are removed at a slower rate than they are added when scaling up, to prevent any impact on latency. See Scaling limitations for more details.

Autoscaling parameters

When you create or edit a cluster and choose autoscaling, you define the values for the CPU utilization target, the minimum number of nodes, and the maximum number of nodes. You can either configure the storage utilization target or leave it at the default, which is 50% (2.5 TB per node for SSD and 8 TB per node for HDD).

Parameter	Description
CPU utilization target	A percentage of the cluster's CPU capacity. Can be from 10% to 80%. When a cluster's CPU utilization exceeds the target that you have set, Bigtable immediately adds nodes to the cluster. When CPU utilization is substantially lower than the target, Bigtable removes nodes. For guidance, see Determine the CPU utilization target.
Minimum number of nodes	The lowest number of nodes that Bigtable will scale the cluster down to. If 2x node scaling is enabled, this must be an even number. This value must be greater than zero and can't be lower than 10% of the value you set for the maximum number of nodes. For example, if the maximum number of nodes is 40, the minimum number of nodes must be at least 4. The 10% requirement is a hard limit. For guidance, see Determine minimum number of nodes.
Maximum number of nodes	The highest number of nodes that you want to let the cluster scale up to. If 2x node scaling is enabled, this must be an even number. This value must be greater than zero and equal to or greater than the minimum number of nodes. The value can't be more than 10 times the number that you choose for the minimum number of nodes. The 10x requirement is a hard limit. For guidance, see Determine the maximum number of nodes.
Storage utilization target	The maximum number of terabytes per node that you can store in SSD or HDD clusters before Bigtable scales up. This target ensures that you always have enough nodes to handle fluctuations in the amount of data that you store. For more information, see Determine the storage utilization target. This target doesn't include the infrequent access tier.
Combined usage of SSD and infrequent access	The maximum number of terabytes per node that you can store in SSD and infrequent access clusters before Bigtable scales up. This target ensures that you always have enough nodes to handle fluctuations in the amount of data that you store. For more information, see the Tiered storage and autoscaling section of this document.
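The constraints in the table can be checked before you submit a configuration. The following Python sketch is an illustration only. The function validate_autoscaling_params and its parameter names are hypothetical and are not part of any Bigtable client library. It encodes the rules stated above: the 10% to 80% CPU target range, node counts greater than zero, the 10x ratio between minimum and maximum, and even node counts when 2x node scaling is enabled.

```python
from typing import List


def validate_autoscaling_params(min_nodes: int, max_nodes: int,
                                cpu_target_percent: int,
                                two_x_node_scaling: bool = False) -> List[str]:
    """Return a list of rule violations; an empty list means the values pass."""
    errors = []
    if not 10 <= cpu_target_percent <= 80:
        errors.append("CPU utilization target must be from 10% to 80%")
    if min_nodes <= 0 or max_nodes <= 0:
        errors.append("minimum and maximum number of nodes must be greater than zero")
    if max_nodes < min_nodes:
        errors.append("maximum number of nodes must be at least the minimum")
    # Equivalent to: the minimum can't be lower than 10% of the maximum.
    if max_nodes > 10 * min_nodes:
        errors.append("maximum number of nodes can't exceed 10 times the minimum")
    if two_x_node_scaling and (min_nodes % 2 != 0 or max_nodes % 2 != 0):
        errors.append("with 2x node scaling, both node counts must be even")
    return errors


# A maximum of 40 requires a minimum of at least 4.
print(validate_autoscaling_params(min_nodes=4, max_nodes=40, cpu_target_percent=60))
print(validate_autoscaling_params(min_nodes=3, max_nodes=40, cpu_target_percent=60))
```

The first call returns an empty list. The second reports the 10x violation, because a minimum of 3 is lower than 10% of a maximum of 40.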

Configure autoscaling

This section describes how to choose your autoscaling parameters. After you set your initial values, monitor your cluster and adjust the numbers if necessary.

Determine the CPU utilization target

Base the CPU utilization target on your unique workload. The optimal target for your cluster depends on the latency and throughput requirements of your workload. For more information, see Plan your Bigtable capacity.

In general, if you observe unacceptably high latency, you should lower the CPU utilization target.

Determine the storage utilization target

If your application is latency-sensitive, keep storage utilization below 60%. If your application is not latency-sensitive, you can choose a storage utilization target of 70% or more. For more information, see Plan your Bigtable capacity.

For autoscaling, storage utilization is expressed as the number of bytes of storage per node rather than as a percentage. The storage utilization target is specified per node but is applied to the entire cluster. The capacity limits for nodes are 5 TB per node for SSD storage and 16 TB per node for HDD storage.

The following table shows the per-node storage amounts that correspond to typical storage utilization target percentages. The Google Cloud console accepts the value in TB per node, and the gcloud CLI, API, and Cloud Bigtable client libraries accept an integer value in GiB per node. The GiB values in the table are calculated at 1,024 GiB per TB.

Percentage	SSD	HDD
80%	4 TB or 4,096 GiB	12.8 TB or 13,107 GiB
70%	3.5 TB or 3,584 GiB	11.2 TB or 11,468 GiB
60%	3 TB or 3,072 GiB	9.6 TB or 9,830 GiB
50%	2.5 TB or 2,560 GiB	8 TB or 8,192 GiB
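For percentages that the table doesn't list, the GiB value can be derived from the per-node capacity limits. The following Python sketch reproduces the table rows. The function name storage_target_gib is hypothetical, and rounding down to a whole GiB is an assumption inferred from the HDD values in the table (for example, 12.8 TB is listed as 13,107 GiB).

```python
# Per-node capacity limits from this page, in TB.
NODE_CAPACITY_TB = {"SSD": 5, "HDD": 16}
GIB_PER_TB = 1024  # conversion factor used by the table above


def storage_target_gib(percent: int, storage_type: str) -> int:
    """Convert a storage utilization percentage to an integer GiB-per-node value."""
    capacity_tb = NODE_CAPACITY_TB[storage_type]
    # Integer arithmetic avoids floating-point error; // rounds down (assumption).
    return percent * capacity_tb * GIB_PER_TB // 100


for percent in (80, 70, 60, 50):
    print(percent, storage_target_gib(percent, "SSD"), storage_target_gib(percent, "HDD"))
```

The resulting integer is the form of value that the gcloud CLI, API, and client libraries accept for the storage utilization target.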

Tiered storage and autoscaling

Tiered storage (Preview) doesn't change the SSD autoscaling behavior described in the Determine the storage utilization target section of this document. When you enable infrequent access as part of tiered storage, autoscaling additionally makes sure that the combined SSD and infrequent access storage doesn't exceed the limit of 32 TB per node. When the limit is reached, Bigtable scales up automatically.

For example, on an SSD cluster, if you set a storage utilization target of 2.5 TB (50%) per node, and your infrequent access usage is high enough to push the combined SSD and infrequent access storage over the limit of 32 TB per node, Bigtable adds nodes. This happens even if your SSD usage remains within the 50% target.

The following table helps you understand how autoscaling recommends a node count based on both the SSD usage and the infrequent access usage:

Scenario	Storage utilization target	Utilization percentage	SSD usage	Infrequent access usage	Combined SSD and infrequent access storage	Recommended node count
SSD usage is within the target range and there is no infrequent access usage.	5 TB	100%	Less than 5 TB	0 TB	Less than 5 TB	1
SSD usage exceeds the storage per node limit.	5 TB	100%	6 TB	0 TB	6 TB	2
SSD usage and infrequent access usage are within the tiered storage limit.	5 TB	100%	5 TB	27 TB	32 TB	1
Tiered storage usage exceeds the tiered storage limit.	5 TB	100%	5 TB	28 TB	33 TB	2
SSD usage almost exceeds the SSD usage target, and there is no infrequent access usage.	3 TB	60%	3 TB	0 TB	3 TB	1
SSD usage almost exceeds the SSD usage target, and tiered usage almost exceeds the tiered storage limit.	3 TB	60%	3 TB	29 TB	32 TB	1
SSD usage exceeds SSD storage target, and there is no infrequent access usage.	2.5 TB	50%	4 TB	0 TB	4 TB	2
Tiered usage exceeds the tiered storage limit.	2.5 TB	50%	2 TB	31 TB	33 TB	2
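One reading that is consistent with every row of the table is that the storage-based recommendation is the larger of two counts: the nodes needed to keep SSD usage within the storage utilization target, and the nodes needed to keep combined usage within 32 TB per node. The following Python sketch expresses that reading. It is an assumption drawn from the table rather than a documented formula, the function name recommended_storage_nodes is hypothetical, and it ignores the CPU dimension and the configured minimum and maximum number of nodes.

```python
import math

TIERED_LIMIT_TB_PER_NODE = 32  # combined SSD and infrequent access limit


def recommended_storage_nodes(ssd_tb: float, infrequent_access_tb: float,
                              ssd_target_tb: float) -> int:
    """Storage-based node recommendation under the assumptions stated above."""
    # Nodes needed to keep SSD usage within the per-node target.
    ssd_nodes = math.ceil(ssd_tb / ssd_target_tb)
    # Nodes needed to keep combined usage within the tiered storage limit.
    combined_nodes = math.ceil(
        (ssd_tb + infrequent_access_tb) / TIERED_LIMIT_TB_PER_NODE)
    # A cluster always has at least one node.
    return max(1, ssd_nodes, combined_nodes)


# Last row of the table: 2 TB SSD, 31 TB infrequent access, 2.5 TB target.
print(recommended_storage_nodes(2, 31, 2.5))
```

In the last row, SSD usage alone would need only one node, but the combined 33 TB exceeds the limit for a single node, so the recommendation is 2.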

For more information about tiered storage, see Tiered storage overview.

Determine the maximum number of nodes

The value that you choose as the maximum number of nodes should be the number of nodes that the cluster needs to handle your workload's heaviest traffic, even if you don't expect to reach that volume most of the time. Bigtable never scales up to more nodes than it needs. You can also think of this number as the highest number of nodes that you are willing to pay for. For details on accepted values, see Autoscaling parameters.