Logo by Nathan Houston

Groundhog 🌤️🦫

Iterative HPC function development. As many "first tries" as you need.

Groundhog makes it easy to run, tweak, and re-run Python functions on HPC clusters via Globus Compute using simple decorators.

Groundhog automatically manages remote environments (powered by uv) — just update Python versions or dependencies in your script, no SSH needed.


The Problem

Iterative development on HPC clusters is slow and frustrating for a couple of reasons.

First, your local environment is probably very different from the remote environment where you want to run your code (which is itself probably very different from any other cluster where you may want it to run). This means you need to manually maintain multiple Python virtual environments and keep them in sync.

Second, queue times are long. Because you don't yet know whether your code works, you do more local-only development and delay remote testing as long as possible. When you finally submit the job, it is demoralizing to watch it fail immediately with No module named 'numpy' because you forgot to update the remote environment.

The code-iteration loop and environment-iteration loop are completely independent, but both loops must be simultaneously perfect for a successful submission.

So not only does every iteration cost queue time, but you're also constantly context-switching between thinking about code vs its environment and thinking about local vs remote state. Here's a graphical representation:

Remote
  Code:
      if torch.cuda.is_available():
          ...
      else:
          ...
  Environment:
      conda install pytorch pytorch-cuda -c pytorch -c nvidia

Local
  Code:
      if torch.cuda.is_available():
          ...
      else:
          ...
  Environment:
      pip install torch --index-url https://download.pytorch.org/whl/cpu

The Solution

Groundhog couples your code with its environment in a single file using PEP 723 inline metadata. Change your code, change your dependencies, change your Python version, rerun, and Groundhog rebuilds the environment you requested on the remote node automatically. You don't have to manage any state on the remote machine, so you're iterating on your environment and code in the same loop, all from the comfort of your laptop (no SSH necessary).

Look! There's only one context:

Remote
  Code + Environment:
      hog run my_script.py
      | my_function | 4c421664-8a37-48f5-8739-13f5428d0c4b | success | 2.8s (exec: 1.1s) | ☀️🦫

Quick Example

# /// script
# requires-python = ">=3.12"
# dependencies = ["numpy", "scipy"]
#
# [tool.hog.anvil]
# endpoint = "5aafb4c1-27b2-40d8-a038-a0277611868f"
# account = "my-account"
# ///

import groundhog_hpc as hog

@hog.function(endpoint="anvil")
def analyze_data(data: list[float]) -> dict:
    """Run analysis on the HPC cluster."""
    import numpy as np
    from scipy import stats

    return {
        "mean": float(np.mean(data)),
        "std": float(np.std(data)),
        "skew": float(stats.skew(data))
    }

@hog.harness()
def main():
    # .remote() sends to HPC, waits for result
    result = analyze_data.remote([1.0, 2.0, 3.0, 4.0, 5.0])
    print(f"Analysis complete: {result}")

Run with:

hog run analysis.py

What Makes Groundhog Different?

Environment and code stay coupled: Change your Python version or dependencies by editing the PEP 723 block in your script. The remote environment rebuilds automatically (if necessary) on the next run.

Globus Compute under the hood: Built on Globus Compute for robust, secure HPC job submission.

No endpoint restarts needed: Because each remote function runs in its own isolated subprocess (managed by uv), you can iterate on environments without restarting the Globus Compute endpoint (or thinking about which Python version it's running).

Works everywhere Python works: Call functions from scripts, REPLs, notebooks, or orchestrator harnesses.
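The subprocess isolation described above can be pictured with a small sketch. It assumes only that uv is installed and relies on its documented `uv run --script` command, which resolves a script's PEP 723 metadata into its own environment before running it. The helper names uv_script_command and run_isolated are hypothetical, and this is not a description of Groundhog's actual internals.

```python
import shutil
import subprocess


def uv_script_command(script_path: str, *args: str) -> list[str]:
    """Build a command that runs a PEP 723 script in a uv-managed environment."""
    return ["uv", "run", "--script", script_path, *args]


def run_isolated(script_path: str, *args: str) -> subprocess.CompletedProcess:
    """Run the script in a child process, separate from the caller's interpreter.

    Hypothetical helper: the parent process never imports the script's
    dependencies, so its own Python version and packages do not matter.
    """
    if shutil.which("uv") is None:
        raise RuntimeError("uv is not installed or not on PATH")
    return subprocess.run(
        uv_script_command(script_path, *args),
        capture_output=True,
        text=True,
        check=False,  # let the caller inspect returncode and stderr
    )


if __name__ == "__main__":
    # Print the command rather than executing it, so this runs without uv.
    print(uv_script_command("analysis.py"))
```

Since the environment is resolved per invocation in the child process, a long-lived parent such as an endpoint does not need to restart when the script's dependencies change.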


Next Steps

  • Get Started
    Install Groundhog and run your first function
    Quickstart →

  • See Examples
    Learn from examples of common patterns
    Examples →

  • Learn Concepts
    Understand functions, harnesses, PEP 723, and remote execution
    Concepts →