Quick Start
This guide walks you through a complete Chiltepin workflow, from setting up an endpoint to submitting tasks.
Overview
Chiltepin is a collection of tools for implementing distributed exascale numerical weather prediction workflows using Parsl and Globus Compute.
Warning
This collection of resources is not intended for use in operational production environments, and is for research purposes only.
Prerequisites
Before starting, ensure you have:
Installed Chiltepin (see Installation)
Access to an HPC system (or use local execution for testing)
A Globus account and a web browser for Globus authentication
Complete Workflow Example
This example demonstrates the full workflow: configure an endpoint, start it, and submit tasks.
Step 1: Authenticate
First, log in to Globus services. This should be done on the machine where you want to run tasks:
$ chiltepin login
This opens a browser for authentication or, if one is not available, provides a URL to complete the authentication manually. Follow the prompts to authorize Chiltepin.
Step 2: Configure an Endpoint
Create a new Globus Compute endpoint to which you will submit tasks. This should be done on the machine where you want to run tasks:
$ chiltepin endpoint configure my-endpoint
This creates the endpoint configuration in ~/.globus_compute/my-endpoint/.
Step 3: Start the Endpoint
Launch the endpoint:
$ chiltepin endpoint start my-endpoint
The endpoint will register with Globus Compute and begin accepting tasks.
Step 4: Get the Endpoint UUID
Retrieve your endpoint’s UUID:
$ chiltepin endpoint list
Example output:
my-endpoint a1b2c3d4-1234-5678-90ab-cdef12345678 Running
Note the UUID (a1b2c3d4-1234-5678-90ab-cdef12345678) for the next step.
Step 5: Create a Configuration File
Create my_config.yaml with your endpoint UUID:
# Local resource for small tasks
local:
provider: "localhost"
init_blocks: 1
max_blocks: 1
# Remote endpoint for HPC tasks
remote:
endpoint: "a1b2c3d4-1234-5678-90ab-cdef12345678" # Use your UUID
provider: "slurm"
cores_per_node: 4
nodes_per_block: 1
partition: "compute"
account: "myproject"
walltime: "00:30:00"
environment:
- "module load python/3.11"
Replace the endpoint UUID with your actual UUID from Step 4.
Step 6: Write Your Workflow
Create my_workflow.py:
import parsl
import chiltepin.configure
from chiltepin.tasks import bash_task, python_task
# Define tasks
@python_task
def hello_local():
import platform
return f"Hello from {platform.node()}"
@bash_task
def hello_remote():
return "hostname"
@python_task
def compute_task(n):
"""Simple computation task"""
result = sum(i**2 for i in range(n))
return result
if __name__ == "__main__":
# Load configuration
config_dict = chiltepin.configure.parse_file("my_config.yaml")
parsl_config = chiltepin.configure.load(
config_dict,
include=["local", "remote"],
run_dir="./runinfo"
)
with parsl.load(parsl_config):
# Run local task on "local" resource
local_future = hello_local(executor="local")
# Run remote bash task on "remote" resource (returns exit code: 0 = success)
remote_future = hello_remote(executor="remote")
# Run multiple compute tasks on "remote" resource
futures = [compute_task(i, executor="remote") for i in range(1, 5)]
# Get the results
print(f"Local: {local_future.result()}")
print(f"Remote exit code: {remote_future.result()}")
print(f"Computation results: {[f.result() for f in futures]}")
print("All tasks completed!")
Step 7: Run Your Workflow
Execute the workflow:
$ python my_workflow.py
Expected output:
Local: Hello from my-laptop.local
Remote exit code: 0
Computation results: [0, 1, 5, 14]
All tasks completed!
Step 8: Stop the Endpoint
When finished:
$ chiltepin endpoint stop my-endpoint
Note
Endpoints automatically scale down resources after idle periods, so manual stopping is optional.
Local-Only Quickstart
For testing without an HPC system, use local execution:
Configuration File (local_config.yaml)
local:
provider: "localhost"
init_blocks: 1
max_blocks: 1
Simple Workflow (simple_workflow.py)
import parsl
import chiltepin.configure
from chiltepin.tasks import bash_task, python_task
# Define tasks
@python_task
def multiply(a, b):
return a * b
@bash_task
def system_info():
return "echo 'Task completed successfully'"
if __name__ == "__main__":
# Load configuration
config_dict = chiltepin.configure.parse_file("local_config.yaml")
parsl_config = chiltepin.configure.load(config_dict, run_dir="./runinfo")
with parsl.load(parsl_config):
result = multiply(6, 7, executor="local").result()
print(f"6 * 7 = {result}")
exit_code = system_info(executor="local").result()
print(f"Bash task exit code: {exit_code}")
Run it:
$ python simple_workflow.py
Working with MPI Tasks
Chiltepin supports MPI applications on HPC systems:
Configuration (mpi_config.yaml)
mpi-resource-name:
endpoint: "your-endpoint-uuid"
mpi: True
max_mpi_apps: 2
mpi_launcher: "srun"
provider: "slurm"
cores_per_node: 128
nodes_per_block: 4
partition: "compute"
account: "myproject"
walltime: "01:00:00"
environment:
- "module load openmpi/4.1"
- "export MPIF90=$MPIF90"
MPI Workflow
import parsl
import chiltepin.configure
from chiltepin.tasks import bash_task
@bash_task
def compile_mpi():
return "$MPIF90 -o mpi_app mpi_app.f90"
@bash_task
def run_mpi(ranks=4):
return f"srun -n {ranks} ./mpi_app"
if __name__ == "__main__":
config_dict = chiltepin.configure.parse_file("mpi_config.yaml")
parsl_config = chiltepin.configure.load(config_dict, run_dir="./runinfo")
with parsl.load(parsl_config):
# Compile MPI application on the MPI resource (returns exit code)
compile_result = compile_mpi(executor="mpi-resource-name").result()
print(f"Compilation exit code: {compile_result}")
# Run with different rank counts on the MPI resource
results = []
for ranks in [4, 8, 16]:
future = run_mpi(ranks, executor="mpi-resource-name")
results.append(future.result())
for i, result in enumerate(results, 1):
print(f"Run {i} exit code: {result}")
Key Concepts
Resources
Resources define where and how tasks run:
Local: Runs on the current machine
HPC: Submits jobs to schedulers (Slurm, PBS Pro)
Globus Compute: Runs on remote endpoints
See Configuration for detailed resource configuration options.
Task Decorators
Chiltepin provides three task decorators to define workflow tasks:
@python_task: Execute Python functions@bash_task: Execute shell commands (returns exit code)@join_task: Coordinate multiple tasks without blocking
When calling a task, use the executor parameter to specify which resource to use:
@python_task
def my_task():
return "result"
# Specify which resource to use
result = my_task(executor="compute").result()
The executor value must match a resource name from your configuration file.
See also
For comprehensive documentation on defining and using tasks, including advanced patterns, error handling, and best practices, see Tasks.
Configuration Loading
The include parameter selects specific resources to load from the configuration:
# Load only specific resources
parsl_config = chiltepin.configure.load(
config_dict,
include=["local", "compute"], # Only these resources
run_dir="./runinfo"
)
If include is omitted, all resources in the configuration are loaded.
Directory Structure
After running workflows, you’ll see:
.
├── my_config.yaml # Configuration file
├── my_workflow.py # Workflow script
└── runinfo/ # Parsl runtime directory
├── 000/ # Run directory
│ ├── local/ # Local resource files
│ ├── remote/ # Remote resource files
│ └── submit_scripts/ # Job submission scripts
└── parsl.log # Parsl log file
The runinfo directory contains execution logs, job scripts, and task outputs.
Troubleshooting
Tasks Not Running
Verify endpoint is running:
chiltepin endpoint listCheck you’re using the correct endpoint UUID
Review logs in
runinfo/directoryCheck endpoint logs:
~/.globus_compute/my-endpoint/endpoint.log
Authentication Expired
$ chiltepin logout
$ chiltepin login
Configuration Errors
Validate your YAML syntax:
import yaml
with open("my_config.yaml") as f:
config = yaml.safe_load(f)
print(config)
Resource Limits
If jobs fail to start:
Check partition/queue names
Verify account/project is valid
Confirm node/core requests are within limits
Machine may be busy and resource pool job may be pending or may be full
Next Steps
Comprehensive task documentation: Tasks
Detailed configuration options: Configuration
Endpoint management: Endpoint Management
Run the test suite: Testing
Set up Docker environment: Docker Container
Explore the API: API Reference