sparkrun

One command to rule them all

Launch, manage, and stop inference workloads on one or more NVIDIA DGX Spark systems — no Slurm, no Kubernetes, no fuss.

sparkrun is a unified CLI for running LLM inference on DGX Spark. Point it at your hosts, pick a recipe, and go. It handles container orchestration, InfiniBand/RDMA detection, model distribution, and multi-node tensor parallelism across your Spark cluster automatically.

sparkrun does not need to run on a member of the cluster. You can coordinate one or more DGX Sparks from any Linux machine with SSH access.

Full documentation at sparkrun.dev · Browse recipes on Spark Arena

Install

# uv is preferred mechanism for managing python environments
# To install uv:
curl -LsSf https://astral.sh/uv/install.sh | sh

# automatic installation via uvx (manages virtual environment and
# creates alias in your shell, sets up autocomplete too!)
uvx sparkrun setup install

Alternative: manual pip install

pip install sparkrun
# or
uv pip install sparkrun

With a manual install you will need to run sparkrun setup completion separately for tab completion.

Quick Start

# Save your hosts once
sparkrun cluster create mylab --hosts 192.168.11.13,192.168.11.14 -d "My DGX Spark lab"
sparkrun cluster set-default mylab

# Set up passwordless SSH mesh
sparkrun setup ssh

# Run an inference workload
sparkrun run qwen3-1.7b-vllm

# Multi-node tensor parallel (TP maps to node count on DGX Spark)
sparkrun run qwen3-1.7b-vllm --tp 2

# Override settings on the fly
sparkrun run qwen3-1.7b-vllm --port 9000 --gpu-mem 0.8
sparkrun run qwen3-1.7b-vllm --tp 2 -o max_model_len=8192

# GGUF quantized models via llama.cpp
sparkrun run qwen3-1.7b-llama-cpp

sparkrun launches jobs in the background (detached containers) and follows logs. Ctrl+C detaches from logs — it never kills your inference job. Your model keeps serving.

# Re-attach to logs
sparkrun logs qwen3-1.7b-vllm

# Stop a workload
sparkrun stop qwen3-1.7b-vllm

# Check cluster status
sparkrun status

Browse and inspect recipes

# List all available recipes
sparkrun list

# Search by name or model
sparkrun search qwen3

# Inspect a recipe with VRAM estimation
sparkrun show nemotron3-nano-30b-nvfp4-vllm

The VRAM estimator auto-detects model architecture from HuggingFace and tells you whether your configuration fits within DGX Spark's 128 GB unified memory before you launch.

Spark Arena

Spark Arena is the community recipe hub for DGX Spark. Browse tested recipes, benchmark results, and one-click launch configs — then run them directly with sparkrun.

# Run a recipe from Spark Arena using its short link
sparkrun run @spark-arena/<recipe-id>

# Inspect a Spark Arena recipe with VRAM estimation
sparkrun show @spark-arena/<recipe-id>

# Save a Spark Arena recipe locally for customization
sparkrun show @spark-arena/<recipe-id> --save my-recipe.yaml

Spark Arena recipes are also available through sparkrun's default registries — sparkrun list includes recipes from Spark Arena automatically. Visit spark-arena.com to browse the full catalog, view benchmark results, and copy recipe links.

Custom registries

sparkrun supports additional git-based registries for private or community recipes.

sparkrun registry list
sparkrun registry add https://github.com/myorg/spark-recipes.git
sparkrun registry update

See the registries documentation for details.

Supported Runtimes

Runtime	Engine	Multi-Node
vllm	vLLM	Distributed (default) or Ray
sglang	SGLang	Native distributed
llama-cpp	llama.cpp	Solo (experimental RPC multi-node)
eugr-vllm	vLLM via eugr	Ray (with custom builds/mods)

Each DGX Spark has one GPU with 128 GB unified memory, so tensor parallelism maps directly to node count: --tp 2 means 2 hosts. See the runtime documentation for details.

Prerequisites

SSH: Passwordless SSH from your control machine to every cluster node. Run sparkrun setup ssh for easy setup.
Docker: The SSH user must be in the docker group on every node (sudo usermod -aG docker "$USER").

See the getting started guide for full setup instructions.

Recipes

A recipe is a YAML file describing an inference workload — model, container image, runtime, and defaults:

model: nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B-NVFP4
runtime: vllm
container: scitrera/dgx-spark-vllm:0.16.0-t5

defaults:
  port: 8000
  tensor_parallel: 1
  gpu_memory_utilization: 0.8
  max_model_len: 200000
  served_model_name: nemotron3-30b-a3b

Any default can be overridden at launch time with -o key=value or dedicated flags like --port, --tp, --gpu-mem. See RECIPES.md for the full recipe format specification.

CLI Reference

Run sparkrun <command> --help for full options on any command. Detailed documentation for each command is available at sparkrun.dev/cli.

Workload commands

Command	Description
`sparkrun run <recipe>`	Launch an inference workload
`sparkrun stop [recipe]`	Stop a running workload (or all with `--all`)
`sparkrun logs <recipe>`	Re-attach to workload logs
`sparkrun status`	Show running containers and cluster status
`sparkrun benchmark <recipe>`	Run → benchmark → stop (auto flow); use `--profile` for registry benchmark profiles, `--skip-run` to benchmark already-running instance, `--no-stop` to keep running after benchmark

Common options: --hosts / -H, --cluster, --tp, --port, --gpu-mem, -o key=value, --dry-run.

Recipe commands

Command	Description
`sparkrun list [query]`	List available recipes
`sparkrun show <recipe>`	Show recipe details + VRAM estimate
`sparkrun search <query>`	Search recipes by name/model/description
`sparkrun recipe validate <recipe>`	Validate a recipe file
`sparkrun recipe vram <recipe>`	Estimate VRAM usage

Registry commands

Command	Description
`sparkrun registry list`	List configured registries
`sparkrun registry add <url>`	Add a registry from a repo manifest
`sparkrun registry remove <name>`	Remove a registry
`sparkrun registry enable/disable <name>`	Toggle a registry
`sparkrun registry update [name]`	Update registries from git
`sparkrun registry list-benchmark-profiles`	List benchmark profiles across registries
`sparkrun registry show-benchmark-profile <profile>`	Show detailed benchmark profile info

Cluster commands

Command	Description
`sparkrun cluster create <name>`	Create a named cluster
`sparkrun cluster list`	List all saved clusters
`sparkrun cluster show <name>`	Show cluster details
`sparkrun cluster set-default <name>`	Set the default cluster
`sparkrun cluster default`	Show the current default cluster
`sparkrun cluster unset-default`	Remove the default cluster setting
`sparkrun cluster update <name>`	Update an existing cluster's hosts, description, or SSH user
`sparkrun cluster delete <name>`	Delete a saved cluster
`sparkrun cluster status`	Show running workloads on a cluster

Tune commands

Command	Description
`sparkrun tune sglang <recipe>`	Tune SGLang fused MoE Triton kernels
`sparkrun tune vllm <recipe>`	Tune vLLM fused MoE Triton kernels

Setup commands

Command	Description
`sparkrun setup install`	Install sparkrun as a uv tool + tab-completion + registry sync
`sparkrun setup update`	Update sparkrun + registries
`sparkrun setup ssh`	Set up passwordless SSH mesh across hosts
`sparkrun setup completion`	Install shell tab-completion
`sparkrun setup cx7`	Configure ConnectX-7 NICs
`sparkrun setup fix-permissions`	Fix root-owned HF cache files
`sparkrun setup clear-cache`	Drop Linux page cache on cluster hosts

About

sparkrun provides a unified tool for running inference on DGX Spark systems without Slurm or Kubernetes coordination. It is intended to be donated to a future community organization.

License

Apache License 2.0 — see LICENSE for details.

Name		Name	Last commit message	Last commit date
Latest commit History 313 Commits
.claude-plugin		.claude-plugin
.github/workflows		.github/workflows
scripts		scripts
sparkrun-cc-plugin		sparkrun-cc-plugin
src/sparkrun		src/sparkrun
tests		tests
website		website
.gitignore		.gitignore
CLAUDE.md		CLAUDE.md
LICENSE		LICENSE
README.md		README.md
RECIPES.md		RECIPES.md
pyproject.toml		pyproject.toml
versions.yaml		versions.yaml

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

sparkrun

Install

Quick Start

Browse and inspect recipes

Spark Arena

Custom registries

Supported Runtimes

Prerequisites

Recipes

CLI Reference

Workload commands

Recipe commands

Registry commands

Cluster commands

Tune commands

Setup commands

About

License

About

Uh oh!

Releases 13

Contributors 6

Languages

Folders and files

Latest commit

History

Repository files navigation

sparkrun

Install

Quick Start

Browse and inspect recipes

Spark Arena

Custom registries

Supported Runtimes

Prerequisites

Recipes

CLI Reference

Workload commands

Recipe commands

Registry commands

Cluster commands

Tune commands

Setup commands

About

License

About

Topics

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases 13

Contributors 6

Languages