Configuration and Instance Selection
Cloud Tasks has a flexible configuration system. Options can be specified in a YAML-format configuration file or on the command line, and very few options are required for basic use.
The configuration file supports global options for system configuration, options for selecting compute instances and running jobs, and provider-specific options including authentication and job options that can override the other options.
A configuration file has the following structure (all sections are optional):
[Global options]
run:
[Run options]
aws:
[AWS-specific options]
[AWS-specific run options]
gcp:
[GCP-specific options]
[GCP-specific run options]
Global Options
The available global options are:
provider: The cloud provider to use (select one ofawsorgcp)
The provider option must be specified either in the configuration file or on the
command line. In addition to detemining which cloud provider to contact, it is used to
determine which provider-specific options in the configuration file are relevant.
Run Options
Run options come in several flavors:
Options to select a compute instance type
Generally speaking, within the constraints provided, the system will attempt to use the instance type with the lowest cost per vCPU with the maximum number of vCPUs per instance. This results in needing the fewest instances to get the job done, since each instance can do maximal work; this may or may not be an appropriate choice for your workload (for example, having a large number of vCPUs, and thus simultaneosly running tasks, may result in the tasks being throttled by the network or disk bandwidth). With no constraints, the system will tend to choose the cheapest (and probably worst-performing) instance type with the least memory, the least disk space, and the slowest disk type. Thus, while no constraints are required, it is recommended to specify at least some minimal constraints to avoid selecting the worst possible instance type.
If you need specific performance, specify the instance types you are willing to accept as
a regular expression. For example, to allow all GCP “N2” instances, specify
instance_types: "^n2-.*". This will still give the system freedom to choose the best
instance type within that family given the other constraints. Alternatively, you can
specify cpu_family, min_cpu_rank, or max_cpu_rank if you don’t want to look up
the specific instance types that are relevant to your needs. For example, min_cpu_rank:
21 will specify a fast processor (Intel Sapphire Rapids or better). Note that it is
quite possible to over-constrain the system such that no instance types meet the
requirements.
Many attributes can be specified in multiple ways. For example, the minimum amount of
memory can be specified using min_total_memory, min_memory_per_cpu, or
min_memory_per_task. Multiple constraints can be specified for the same attribute and
the system will use the most-constraining value.
To get a list of the available instance types and their attributes, including number of
vCPUs, amount of memory, CPU family and performance rank, price, etc. you can use the
list_instance_types command line command. Include the --detail option to see
all available attributes.
General Constraints
architecture: The architecture to use; valid values areX86_64andARM64(defaults toX86_64)cpu_family: The CPU family to use, for exampleIntel Cascade LakeorAMD Genoa.min_cpu_rank: The minimum CPU performance rank to use (0 is the slowest)max_cpu_rank: The maximum CPU performance rank to use (0 is the slowest)instance_types: A single instance type or list of instance types to use; instance types are specified using Python-style regular expressions (if no anchor character like^or$is specified, the given string will match any part of the instance type name)
CPU
min_cpu: The minimum number of vCPUs per instancemax_cpu: The maximum number of vCPUs per instanceDerived from instance task information (the number of CPUs = cpus_per_task * tasks_per_instance)
cpus_per_task: The number of vCPUs per task (defaults to 1)min_tasks_per_instance: The minimum number of tasks per instancemax_tasks_per_instance: The maximum number of tasks per instance
Memory
min_total_memory: The minimum amount of memory in GB per instancemax_total_memory: The maximum amount of memory in GB per instancePer-CPU constraints
min_memory_per_cpu: The minimum amount of memory in GB per vCPUmax_memory_per_cpu: The maximum amount of memory in GB per vCPU
Per-task constraints (these are the same as the per-CPU constraints and simply use the
cpus_per_taskvalue as a conversion factor)cpus_per_task: The number of vCPUs per task (defaults to 1)min_memory_per_task: The minimum amount of memory in GB per taskmax_memory_per_task: The maximum amount of memory in GB per task
SSD Storage
Some instance types have additional local SSD storage in addition to whatever volume is mounted as the boot disk and these constraints apply to them. By specifying a minimum SSD size you are also constraining the instance type to those that have an extra SSD attached.
min_local_ssd: The minimum amount of local extra SSD storage in GB per instancemax_local_ssd: The maximum amount of local extra SSD storage in GB per instancePer-CPU constraints - the total amount of storage will be the sum of the base size and the product of the number of vCPUs and the per-CPU amount; the base size is optional, and defaults to 0
local_ssd_base_size: The amount of local extra SSD storage in GB present before allocating additional space per vCPUmin_local_ssd_per_cpu: The minimum amount of local extra SSD storage in GB per vCPUmax_local_ssd_per_cpu: The maximum amount of local extra SSD storage in GB per vCPU
Per-task constraints (these are the same as the per-CPU constraints and simply use the
cpus_per_taskvalue as a conversion factor)cpus_per_task: The number of vCPUs per task (defaults to 1)local_ssd_base_size: The amount of local extra SSD storage in GB present before allocating additional space per taskmin_local_ssd_per_task: The minimum amount of local extra SSD storage in GB per taskmax_local_ssd_per_task: The maximum amount of local extra SSD storage in GB per task
Boot Disk
The boot disk size and type is configurable at instance creation time and is not an intrinsic property of a provider’s instance type. As such, there are no “constraints” on the boot disk size. Instead, there are simply ways to specify the size and type of the boot disk you want.
The boot disk size can either be a single absolute value:
total_boot_disk_size: The size of the boot disk in GB (defaults to 10 GB)
or a per-CPU value:
boot_disk_base_size: The amount of boot disk in GB present before allocating additional space per vCPU (defaults to 0)boot_disk_per_cpu: The amount of boot disk in GB per vCPU (defaults to 0)
or a per-task value:
cpus_per_task: The number of vCPUs per task (defaults to 1)boot_disk_base_size: The amount of boot disk in GB present before allocating additional space per task (defaults to 0)boot_disk_per_task: The amount of boot disk in GB per task (defaults to 0)
If more than one size is specified, the maximum of the values will be used. If no values are specified, a default appropriate to the provider will be used.
The boot disk type is provider-specific and can be a single type or a list of types:
boot_disk_types: The type(s) of the boot disk to allow (defaults to all available types for the provider)
Finally, some boot disk types require additional configuration:
boot_disk_iops: For any boot disk type that supports it, the number of provisioned IOPS to request; this is an absolute value and is not scaled by the number of vCPUs or tasksboot_disk_throughput: For any boot disk type that supports it, the number of provisioned throughput in MB/s to request; this is an absolute value and is not scaled by the number of vCPUs or tasks
Options to constrain the number of instances
Generally speaking, the system will attempt to use the maximum number of instances allowed
based on the various max_ constraints, and then will verify that the min_
constraints have not been violated. Note that it is quite possible to over-constrain the
system such that no number of instances meet the requirements. As with the instance type
constraints, no constraints are required, but it is recommended to specify at least some
minimal constraints so that you can maintain control over the size of your instance pool
and the resulting costs. By default, the maximum number of instances is set to 10 to avoid
excessive instance pool sizes, and the maximum price is set to $10 per hour to avoid
runaway costs, but these can be overridden by specifying different values.
Note that depending on the provider and your account setup, you may have quotas for the creation of specific instance types, and Cloud Tasks may attempt to violate these quotas if you do not give it sufficient constraints.
min_instances: The minimum number of instances to use (defaults to 1)max_instances: The maximum number of instances to use (defaults to 10)min_total_cpus: The minimum total number of vCPUs to usemax_total_cpus: The maximum total number of vCPUs to usecpus_per_task: The number of vCPUs per task (defaults to 1); this is also used to configure the worker process to limit the number of tasks that can be run simultaneously on a single instancemin_tasks_per_instance: The minimum number of tasks per instancemax_tasks_per_instance: The maximum number of tasks per instancemin_simultaneous_tasks: The minimum number of tasks to run simultaneouslymax_simultaneous_tasks: The maximum number of tasks to run simultaneouslymin_total_price_per_hour: The minimum total price per hour to usemax_total_price_per_hour: The maximum total price per hour to use (defaults to 10)
Options to specify the type of VM
use_spot: Use spot instances instead of on-demand instances; spot instances are cheaper but may be terminated by the cloud provider with little notice and should only be used for fault-tolerant jobs
Options to specify the boot process
A startup script must be specified when creating new instances. It can be specified either directly inline in the configuration file, or by providing a path to a file containing the startup script. Either one can be used, but not both.
startup_script: The startup script to use (this can not be overridden from the command line because it is assumed that any startup script would be too long to pass as a command line argument)startup_script_file: The path to a file containing the startup script
image: The image to use for the VM. If no image is specified, the default image for the provider will be used. This is most commonly the latest release of Ubuntu 24.04 LTS.
Options to specify the worker and manage_pool processes
scaling_check_interval: The interval in seconds to check for scaling opportunities (defaults to 60)instance_termination_delay: The delay in seconds to wait before terminating instances once the task queue is empty (defaults to 60); this should be set to a value much greater thanmax_runtimeto avoid terminating instances that are still working on tasks.max_runtime: The maximum runtime for a task in seconds (defaults to 60); this is used to set the retry timeout in the task queue such that any task that takes longer than this is assumed to have had an internal error and should be set to a value significantly greater than the longest runtime expected for a taskretry_on_exit: If True, tasks will be retried if the worker exits prematurely, e.g. due to a crashretry_on_exception: If True, tasks will be retried if the user function raises an unhandled exceptionretry_on_timeout: If True, tasks will be retried if they exceed the maximum runtime specified bymax_runtime
Provider-Specific Options
The available provider-specific options are:
All providers
job_id: The ID of the job to run; required for all queue and job-related operationsqueue_name: The name of the task queue to use, derived from job ID if not specified; only use this in special circumstancesregion: The region to use, required for most operations; will be derived from the zone if not specifiedzone: The zone to use; if not specified, all zones in the region will be usedexactly_once_queue: If True, task messages and events are guaranteed to be delivered exactly once to any recipient. If False (the default), messages will be delivered at least once, but could be delivered multiple times. The specific implications of this flag are provider-specific.
AWS
access_key: The access key to usesecret_key: The secret key to use
GCP
project_id: The ID of the project to use; required for most operationscredentials_file: The path to a file containing the credentials to use; if not specified, the default credentials will be usedservice_account: The service account to use; required for worker processes on cloud-based instances to have access to system resources
In addition, all run options can be specified in a provider-specific section, in which case they will override the global run options, if any.
Command Line Overrides
You can specify or override any configuration value from the command line unless otherwise noted.
Simple replace any _ character with -:
python -m cloud_tasks run \
--config config.yaml \
--task-file tasks.json \
--provider aws \ # Specify/override provider setting
--min-cpu 8 \ # Specify/override min_cpu setting
--min-memory-per-cpu 16 \ # Specify/override min_memory_per_cpu setting
--total-boot-disk-size 100 \ # Specify/override total_boot_disk_size setting
--image ami-0123456789abcdef0 \ # Specify/override image setting
--job-id my-processing-job \ # Specify/override job_id setting
--instance-types t3- m5- # Specify/override instance_types and
# restrict to t3 and m5 instance families
Note
The priority of settings is: Command Line > Provider-Specific Config > Global Run Config > System Defaults
You will be notified when overrides occur. For example:
run:
min_cpu: 2
gcp:
min_cpu: 8
2025-06-03 14:04:55.668 - cloud_tasks.common.config - WARNING - Overriding run.min_cpu=2 with gcp.min_cpu=8
or:
$ cloud_tasks manage_pool --config config.yml --min-cpu 16
2025-06-03 14:04:33.848 - cloud_tasks.common.config - WARNING - Overloading run.min_cpu=2 with CLI=16
Examples
The Simplest Configuration
For GCP, the simplest configuration useable for all functions consists of a provider name, a job ID, a project ID, a region, and a startup script.
provider: gcp
gcp:
job_id: my-processing-job
project_id: my-project-id
region: us-central1
startup_script: |
#!/bin/bash
echo "Hello, world!"
$ cloud_tasks manage_pool --config config.yaml
Given the lack of
configuration options to constrain the instance type,
the system will select the e2-highcpu-32 instance type. This is the lowest-memory
version of GCP’s most economical instance type, costing $0.02475/vCPU/hour as of this
writing. It selects the 32-vCPU version, which is the maximum number of vCPUs available in
a single instance for the e2 family, because the cost of the boot disk (which is
per-instance instead of per-vCPU) is amortized over the greatest number of vCPUs. However,
the lack of
configuration options to constain the number of instances
means the system will create the default maximum number of instances, 10,
which will result in the creation of 320 vCPUs and a burn rate of $7.92/hour, which may be
more than required depending on the actual workload. Note that in addition to the
default maximum number of instances being 10, the default maximum total price per hour is
$10.00, which is designed to limit the user’s exposure to a high burn rate without explicitly
asking for it.
With the exception of the startup script, this could also be specified entirely on the command line:
gcp:
startup_script: |
#!/bin/bash
echo "Hello, world!"
$ cloud_tasks manage_pool \
--config config.yaml \
--provider gcp \
--job-id my-processing-job \
--project-id my-project-id \
--region us-central1
If the startup script was present in a file, no configuration file would be needed at all:
$ cloud_tasks manage_pool \
--provider gcp \
--job-id my-processing-job \
--project-id my-project-id \
--region us-central1 \
--startup-script-file startup.sh
Constraining the Instance Type and Containing Costs
This example uses more sophisticated constraints to limit the instance types and number of
instances to use. First, we want to use slightly higher-performance processors and choose
the n series using a balanced persistent boot disk. We want to limit instance types to
those that have at least 8 but not more than 40 vCPUs; we might choose these numbers to
balance parallelism with the network and disk bandwidth available on a single instance. At
the same time, we know that our tasks are themselves parallel internally, and require 4
vCPUs per task for optimal performance. They also require memory of at least 32 GB per
task. Finally, since we have a large number of tasks to process but our task code is still
experimental, we are concerned about starting too many instances at once and thus having a
high burn rate in case something goes wrong and we want to stop the job in the middle when
we detect a problem. We set limits of 20 instances total, 100 simultaneous tasks, and a
burn rate of $15.00 per hour. Whichever of these is most constraining will determine the
total number of instances that will be started.
provider: gcp
gcp:
job_id: my-processing-job
project_id: rfrench
region: us-central1
instance_types: ["^n2-.*", "^n3-.*", "^n4-.*"]
min_cpu: 8
max_cpu: 40
cpus_per_task: 4
min_memory_per_task: 32
max_instances: 20
max_simultaneous_tasks: 100
max_total_price_per_hour: 15.00
boot_disk_types: pd-balanced
startup_script: |
#!/bin/bash
echo "Hello, world!"
In this case, the system starts by looking at all available n2-, n3-, and n4-
instance types that meet our vCPU and memory constraints while minimizing price per vCPU.
This results in the selection of n4-highmem-32 as the optimal instance type with the
lowest cost of $0.0622/vCPU/hour while supporting the most vCPUs in a single instance.
For the number of instances, the system starts with the maximum allowed, 20. However, with
a maximum of 100 simultaneous tasks, 32 vCPUs, and 4 vCPUs per task, this is reduced to 12.
Finally, at a cost of $1.99/hour for each instance, the price limit of $15.00 per hour
sets the final number of instances to 7 for a total cost of $13.93/hour.