Configuration

Chiltepin uses YAML configuration files to describe the resources available during workflow execution. Each resource in the file describes a pool of computational resources to which tasks can be submitted.

Overview

Resources can be:

Local: Your laptop or workstation
HPC systems: Accessed through job schedulers like Slurm or PBS Pro
Remote systems: Accessed through Globus Compute endpoints

Each resource is given a name in the configuration file and represents a pool of nodes and/or cores on which tasks can be executed.

Basic Structure

A Chiltepin configuration file contains one or more named resource definitions:

resource-name-1:
  provider: "slurm"
  partition: "compute"
  cores_per_node: 4
  nodes_per_block: 2
  # ... additional options

resource-name-2:
  endpoint: "uuid-of-globus-compute-endpoint"
  mpi: True
  max_mpi_apps: 4
  # ... additional options

Understanding Resource Configuration

Three key options (provider, endpoint, and mpi) determine how resources are accessed and used:

Provider: How Resources Are Acquired

The provider option specifies how computational resources for running your workflow are obtained:

"localhost" - Use CPU resources on the current machine
"slurm" - Obtain a pool of resources via a Slurm scheduler pilot job
"pbspro" - Obtain a pool of resources via a PBS Pro scheduler pilot job

When using HPC providers (Slurm or PBS Pro), you can specify scheduler-specific options like partition, queue, account, and walltime.

Endpoint: Remote Resource Access

The endpoint option specifies a Globus Compute endpoint UUID for accessing remote resources:

remote-hpc:
  endpoint: "12345678-1234-1234-1234-123456789abc"
  mpi: True
  provider: "slurm"
  partition: "gpu"

When an endpoint is specified, tasks are sent to the remote system via Globus Compute. All other configuration options (provider, mpi, etc.) describe the resource pool on the remote system. They are passed to the endpoint’s configuration template, which Chiltepin creates automatically when the endpoint is configured.
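The split described above, with the endpoint UUID on one side and the pool-describing options on the other, can be sketched in plain Python. This is an illustration only, not Chiltepin's implementation: `split_endpoint_options` is a hypothetical helper, and checking the UUID format with the standard `uuid` module is an assumption about what a careful caller might do.

```python
import uuid


def split_endpoint_options(resource):
    """Separate the endpoint UUID from the options describing the remote pool.

    Hypothetical helper for illustration; not part of Chiltepin.
    """
    options = dict(resource)  # copy so the caller's dict is left untouched
    endpoint = options.pop("endpoint", None)
    if endpoint is not None:
        # uuid.UUID raises ValueError if the string is not a well-formed UUID
        endpoint = str(uuid.UUID(endpoint))
    return endpoint, options


remote_hpc = {
    "endpoint": "12345678-1234-1234-1234-123456789abc",
    "mpi": True,
    "provider": "slurm",
    "partition": "gpu",
}
endpoint_id, pool_options = split_endpoint_options(remote_hpc)
print(endpoint_id)
print(pool_options)
```

A resource without an endpoint key yields `None` for the UUID, which matches the idea that such a resource is reached locally.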

MPI: Support for Parallel Applications

The mpi option indicates whether the resource pool supports MPI (Message Passing Interface) applications:

mpi-resource:
  mpi: True
  max_mpi_apps: 4
  provider: "slurm"
  nodes_per_block: 8

When mpi is set to True, the resource is configured to run parallel MPI applications. Use max_mpi_apps to limit how many MPI applications can run concurrently.

Resource Types

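Chiltepin determines the resource type automatically from the options you set. Remote access is independent of the other two types: a remote resource is still either an MPI resource or a high-throughput resource.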

Remote Resources: Use Globus Compute to access remote systems. Specified by providing an endpoint UUID.
MPI Resources: Run parallel MPI applications on HPC systems (local or remote). Specified by setting mpi: True.
High-Throughput Resources: Run many independent tasks concurrently. This is the default when mpi is not specified (or is set to False), whether the resources are local or remote.
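The rules above can be read as two independent questions: is there an endpoint, and is mpi enabled? The following is a minimal sketch of that reading. `classify_resource` is a hypothetical name, and the code only mirrors the rules as stated here rather than Chiltepin's internal logic.

```python
def classify_resource(resource):
    """Return (access, kind) for one resource definition.

    Hypothetical helper: access is "remote" when an endpoint is given,
    kind is "mpi" when mpi is True and "high-throughput" otherwise.
    """
    access = "remote" if "endpoint" in resource else "local"
    kind = "mpi" if resource.get("mpi", False) else "high-throughput"
    return access, kind


# Resource definitions written as the dictionaries a YAML parser would produce
slurm_pool = {"provider": "slurm", "partition": "compute"}
remote_pool = {"endpoint": "uuid-of-globus-compute-endpoint", "mpi": True}

print(classify_resource(slurm_pool))   # ('local', 'high-throughput')
print(classify_resource(remote_pool))  # ('remote', 'mpi')
```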

Configuration Options

Common Options

These options apply to all resource types:

Option	Type	Default	Description
`mpi`	boolean	`False`	Enable MPI support for parallel applications
`provider`	string	`"localhost"`	How to acquire resources: `"localhost"`, `"slurm"`, or `"pbspro"`
`init_blocks`	integer	`0`	Number of resource blocks to provision at startup
`min_blocks`	integer	`0`	Minimum number of resource blocks to maintain
`max_blocks`	integer	`1`	Maximum number of resource blocks allowed
`environment`	list	`[]`	Shell commands to run before executing tasks (e.g., module loads)

MPI-Specific Options

When mpi is set to True:

Option	Type	Default	Description
`max_mpi_apps`	integer	`1`	Maximum number of concurrent MPI applications
`mpi_launcher`	string	`"srun"` (Slurm) or `"mpiexec"`	MPI launcher command to use

HPC Provider Options

When provider is "slurm" or "pbspro":

Option	Type	Default	Description
`cores_per_node`	integer	`1`	Number of cores per compute node (ignored for MPI resources)
`nodes_per_block`	integer	`1`	Number of nodes per block/job
`exclusive`	boolean	`True`	Request exclusive node allocation (Slurm only)
`partition`	string	None	Scheduler partition to use (Slurm only)
`queue`	string	None	QOS (Slurm) or queue name (PBS Pro)
`account`	string	None	Account/project to charge for resources
`walltime`	string	`"00:10:00"`	Maximum walltime for jobs (HH:MM:SS)
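Because walltime is a plain HH:MM:SS string, a malformed value may only surface once the scheduler rejects the job. A small check before submission can catch this earlier. The sketch below is an assumption about how one might validate the value: `walltime_to_seconds` is a hypothetical helper, and the rule that minutes and seconds stay below 60 is this sketch's choice, not a documented Chiltepin requirement.

```python
import re

# Hours may have any number of digits; minutes and seconds must be 00-59.
_WALLTIME = re.compile(r"(\d+):([0-5]\d):([0-5]\d)")


def walltime_to_seconds(walltime):
    """Convert an HH:MM:SS walltime string to seconds (hypothetical helper)."""
    match = _WALLTIME.fullmatch(walltime)
    if match is None:
        raise ValueError(f"walltime must look like HH:MM:SS, got {walltime!r}")
    hours, minutes, seconds = (int(part) for part in match.groups())
    return hours * 3600 + minutes * 60 + seconds


print(walltime_to_seconds("00:10:00"))  # 600
print(walltime_to_seconds("04:00:00"))  # 14400
```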

High-Throughput Resource Options

For non-MPI resources:

Option	Type	Default	Description
`cores_per_worker`	integer	`1`	Number of cores per worker process
`max_workers_per_node`	integer	Auto	Maximum workers per node (auto-calculated if not specified)
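How max_workers_per_node is auto-calculated is not spelled out here. One plausible reading, labeled as an assumption, is that the number of workers on a node is the number of whole workers that fit into the node's cores, optionally capped by max_workers_per_node. The sketch below illustrates that arithmetic with a hypothetical `workers_per_node` function; the real calculation may differ.

```python
def workers_per_node(cores_per_node, cores_per_worker=1, max_workers_per_node=None):
    """Estimate how many workers fit on one node.

    Assumption for illustration: workers = cores // cores_per_worker,
    capped by max_workers_per_node when that option is set.
    """
    if cores_per_worker < 1:
        raise ValueError("cores_per_worker must be at least 1")
    fit = cores_per_node // cores_per_worker
    if max_workers_per_node is None:
        return fit
    return min(fit, max_workers_per_node)


print(workers_per_node(128))                          # 128
print(workers_per_node(64, cores_per_worker=4))       # 16
print(workers_per_node(64, max_workers_per_node=3))   # 3
```

Under this reading, raising cores_per_worker is the way to give each task more cores, while max_workers_per_node limits concurrency for reasons other than core count, such as memory.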

Remote Resource Options

When endpoint is specified:

Option	Type	Default	Description
`endpoint`	string	Required	UUID of the Globus Compute endpoint

Note

All other options (provider, mpi, cores_per_node, etc.) are passed to the endpoint’s configuration template that Chiltepin creates automatically when endpoints are configured.
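To see which values a sparse resource definition ends up with, it can help to merge it with the defaults from the tables above. The following sketch does that for the common and HPC provider options only. The constant names and `with_defaults` are hypothetical, and the merge order (explicit settings override defaults) is an assumption about the intended behavior, not a description of Chiltepin's code.

```python
import copy

# Defaults copied from the option tables in this section.
COMMON_DEFAULTS = {
    "mpi": False,
    "provider": "localhost",
    "init_blocks": 0,
    "min_blocks": 0,
    "max_blocks": 1,
    "environment": [],
}
HPC_DEFAULTS = {
    "cores_per_node": 1,
    "nodes_per_block": 1,
    "queue": None,
    "account": None,
    "walltime": "00:10:00",
}
SLURM_ONLY_DEFAULTS = {"exclusive": True, "partition": None}


def with_defaults(resource):
    """Return a copy of `resource` with documented defaults filled in."""
    merged = copy.deepcopy(COMMON_DEFAULTS)
    provider = resource.get("provider", COMMON_DEFAULTS["provider"])
    if provider in ("slurm", "pbspro"):
        merged.update(copy.deepcopy(HPC_DEFAULTS))
    if provider == "slurm":
        merged.update(SLURM_ONLY_DEFAULTS)
    merged.update(resource)  # explicit settings win over defaults
    return merged


pbs = with_defaults({"provider": "pbspro", "queue": "normal"})
print(pbs["walltime"])    # 00:10:00
print(pbs["max_blocks"])  # 1
```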

Example Configurations

Local Execution

Simple local resource for testing:

local:
  provider: "localhost"
  init_blocks: 1
  max_blocks: 1

Slurm HPC System

Single-node computation:

compute:
  provider: "slurm"
  cores_per_node: 128
  nodes_per_block: 1
  partition: "compute"
  account: "myproject"
  walltime: "01:00:00"
  environment:
    - "module load python/3.11"
    - "module load gcc/11.2"

Multi-node MPI:

mpi:
  mpi: True
  max_mpi_apps: 2
  mpi_launcher: "srun"
  provider: "slurm"
  cores_per_node: 128
  nodes_per_block: 4
  exclusive: True
  partition: "compute"
  account: "myproject"
  walltime: "02:00:00"
  environment:
    - "module load openmpi/4.1"
    - "export MPIF90=$MPIF90"

PBS Pro System

pbs-compute:
  provider: "pbspro"
  cores_per_node: 36
  nodes_per_block: 2
  queue: "normal"
  account: "MYACCT123"
  walltime: "00:30:00"
  environment:
    - "module load intel/2021"

Remote Resource via Globus Compute

remote-mpi:
  endpoint: "12345678-1234-1234-1234-123456789abc"
  mpi: True
  max_mpi_apps: 4
  provider: "slurm"
  cores_per_node: 128
  nodes_per_block: 8
  partition: "gpu"
  account: "myproject"
  walltime: "04:00:00"
  environment:
    - "module load cuda/11.8"
    - "module load openmpi/4.1-cuda"

Multiple Resources

Combine multiple resource types in one file:

# Local service tasks
service:
  provider: "localhost"
  max_blocks: 1
  max_workers_per_node: 3

# Local HPC compute tasks
compute:
  provider: "slurm"
  cores_per_node: 64
  nodes_per_block: 10
  partition: "standard"
  account: "myproject"
  walltime: "01:00:00"

# Remote MPI tasks via Globus Compute
remote-mpi:
  endpoint: "12345678-1234-1234-1234-123456789abc"
  mpi: True
  max_mpi_apps: 2
  provider: "slurm"
  partition: "standard"
  account: "myproject"
  nodes_per_block: 16
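When several resources share one file, it is useful to know the largest allocation each can request. Since max_blocks limits the number of blocks and nodes_per_block sets the size of each block, their product bounds the node count. The sketch below applies that arithmetic to the three resources above, written as the dictionaries a YAML parser would produce. `peak_nodes` is a hypothetical helper, and it treats the localhost resource as a single node, which is an assumption.

```python
def peak_nodes(resource):
    """Upper bound on nodes a resource can hold: max_blocks * nodes_per_block."""
    max_blocks = resource.get("max_blocks", 1)            # documented default: 1
    nodes_per_block = resource.get("nodes_per_block", 1)  # documented default: 1
    return max_blocks * nodes_per_block


resources = {
    "service": {"provider": "localhost", "max_blocks": 1, "max_workers_per_node": 3},
    "compute": {"provider": "slurm", "cores_per_node": 64, "nodes_per_block": 10},
    "remote-mpi": {
        "endpoint": "12345678-1234-1234-1234-123456789abc",
        "mpi": True,
        "provider": "slurm",
        "nodes_per_block": 16,
    },
}

peak = {name: peak_nodes(resource) for name, resource in resources.items()}
print(peak)  # {'service': 1, 'compute': 10, 'remote-mpi': 16}
```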

Environment Configuration

The environment option accepts a list of shell commands that are executed before running tasks. This is commonly used for:

Loading environment modules
Setting environment variables
Activating virtual environments
Exporting paths

resource-name:
  environment:
    - "module purge"
    - "module load python/3.11 gcc/11.2 openmpi/4.1"
    - "export MY_VAR=value"
    - "source /path/to/venv/bin/activate"
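The order of the commands matters: in the example above, module purge has to run before the module load line. The sketch below shows one way to picture the list as a single shell preamble. How Chiltepin actually passes these commands to the worker is not described here, so joining them with newlines is an assumption, and `environment_preamble` is a hypothetical helper. The type check guards against a common YAML slip, writing a single string instead of a list.

```python
def environment_preamble(commands):
    """Join environment commands into one shell snippet, preserving order."""
    if isinstance(commands, str):
        raise TypeError("environment must be a list of commands, not a single string")
    return "\n".join(commands)


environment = [
    "module purge",
    "module load python/3.11 gcc/11.2 openmpi/4.1",
    "export MY_VAR=value",
    "source /path/to/venv/bin/activate",
]
print(environment_preamble(environment))
```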

Tip

Use YAML anchors to share common environment setup across multiple resources:

common_env: &common_env
  - "module load python/3.11"
  - "export PYTHONPATH=/my/path:$PYTHONPATH"

resource1:
  environment: *common_env

resource2:
  environment: *common_env
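One detail of the anchor example is worth illustrating: a YAML parser resolves the aliases, but the key that holds the anchor (common_env) still appears as a top-level entry next to the real resources, and its value is a list rather than a mapping. How Chiltepin treats such an entry is not stated here. A cautious approach, sketched below as an assumption, is to pick out the entries that are mappings and name them explicitly when loading. The dictionary is written by hand to mirror what a parser would typically produce.

```python
# Hand-written equivalent of the parsed YAML above; both resources
# refer to the same list, as they would after alias resolution.
common_env = [
    "module load python/3.11",
    "export PYTHONPATH=/my/path:$PYTHONPATH",
]
parsed = {
    "common_env": common_env,
    "resource1": {"environment": common_env},
    "resource2": {"environment": common_env},
}

# Keep only entries that look like resource definitions (mappings).
resource_names = [name for name, value in parsed.items() if isinstance(value, dict)]
print(resource_names)  # ['resource1', 'resource2']
```

The resulting list of names could then be used to select resources explicitly rather than loading every top-level key.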

Loading Configurations

Parse and Load

import chiltepin.configure
import parsl

# Parse YAML configuration file
config_dict = chiltepin.configure.parse_file("my_config.yaml")

# Create Parsl configuration
parsl_config = chiltepin.configure.load(
    config_dict,
    include=["compute", "mpi"],  # Only load specific resources
    run_dir="./runinfo"           # Directory for Parsl runtime files
)

# Initialize Parsl with configuration
parsl.load(parsl_config)

The include parameter lets you selectively load only specific resources from your configuration file. If omitted, all resources are loaded.
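The effect of include can be pictured as a filter over the parsed dictionary. The sketch below is not the implementation of chiltepin.configure.load. `select_resources` is a hypothetical stand-in, and raising an error for unknown names is this sketch's choice rather than documented behavior.

```python
def select_resources(config_dict, include=None):
    """Return the subset of resources named in `include` (all if None)."""
    if include is None:
        return dict(config_dict)
    missing = [name for name in include if name not in config_dict]
    if missing:
        raise KeyError(f"unknown resources: {missing}")
    return {name: config_dict[name] for name in include}


config_dict = {
    "service": {"provider": "localhost"},
    "compute": {"provider": "slurm"},
    "mpi": {"provider": "slurm", "mpi": True},
}

print(list(select_resources(config_dict, include=["compute", "mpi"])))  # ['compute', 'mpi']
print(list(select_resources(config_dict)))  # ['service', 'compute', 'mpi']
```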

Configuration Best Practices

Start Small: Begin with short walltimes and small resource requests while testing
Use Anchors: Share common configuration blocks (like environment) using YAML anchors
Resource Limits: Set appropriate min_blocks and max_blocks to control scaling
Environment Modules: Always include necessary module loads in the environment section
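The resource-limits advice can be backed by a quick consistency check before a workflow starts. The sketch below encodes a few common-sense rules (no negative counts, min_blocks and init_blocks not above max_blocks). These rules and the `check_block_limits` helper are assumptions for illustration; Chiltepin may validate differently or not at all.

```python
def check_block_limits(resource):
    """Return a list of problems found in a resource's block settings."""
    min_blocks = resource.get("min_blocks", 0)    # documented default: 0
    init_blocks = resource.get("init_blocks", 0)  # documented default: 0
    max_blocks = resource.get("max_blocks", 1)    # documented default: 1
    problems = []
    if min(min_blocks, init_blocks, max_blocks) < 0:
        problems.append("block counts must not be negative")
    if min_blocks > max_blocks:
        problems.append("min_blocks exceeds max_blocks")
    if init_blocks > max_blocks:
        problems.append("init_blocks exceeds max_blocks")
    return problems


print(check_block_limits({"init_blocks": 1, "max_blocks": 1}))  # []
print(check_block_limits({"min_blocks": 2}))  # ['min_blocks exceeds max_blocks']
```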