pmetal cluster
Run multi-Mac cluster operations with mDNS discovery, fabric classification, Thunderbolt-first ring formation, all-reduce benchmarks, and distributed train/serve forwarding.
The distributed feature is enabled by default for the stock pmetal binary.
Subcommands
| Subcommand | Description |
|---|---|
| up | Advertise this node, discover peers, form a ring, and hold the connection open |
| status | Print local interfaces, discovered peers, and fabric classification |
| bench | Run all-reduce throughput benchmarks across the ring |
| pipeline-bench | Run the pipeline activation transport harness |
| train | Wrapper placeholder; use pmetal train --distributed-auto directly |
| serve | Wrapper placeholder pending per-architecture partial-layer execution |
Example
Run on every Mac:

    pmetal cluster status
    pmetal cluster up

Then benchmark or train:

    pmetal cluster bench --mb 64 --iters 10
    pmetal cluster pipeline-bench --tokens 16 --layers 32

    pmetal train \
      --model Qwen/Qwen3-0.6B \
      --dataset train.jsonl \
      --distributed-auto \
      --compression-strategy fp16

Common Parameters
| Parameter | Default | Description |
|---|---|---|
| --discovery-port | 52415 | mDNS/libp2p discovery port |
| --gradient-port | 52416 | Gradient exchange port |
| --activation-port | 52417 | Pipeline activation port |
| --result-port | 52418 | Pipeline result-loopback port |
| --timeout | 60 | Discovery timeout in seconds |
| --min-peers | 1 | Minimum peers before proceeding |
| --json | false | Emit JSON where supported |
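
The four port parameters default to consecutive values, and each role needs its own port if you override them. The sketch below is a hypothetical pre-flight helper, not part of pmetal: the function names `check_ports` and `cluster_up_cmd` are invented for illustration, the 1024 lower bound is a choice made for this sketch, and it assumes the `up` subcommand accepts the port flags listed in the table. It only validates the values and prints the resulting command line; it does not run pmetal.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of pmetal: validate port overrides before launch.

# check_ports DISCOVERY GRADIENT ACTIVATION RESULT
# Succeeds only if all four values are distinct integers in 1024..65535
# (the unprivileged range; this bound is an assumption of the sketch).
check_ports() {
  [ "$#" -eq 4 ] || return 1
  local p seen=" "
  for p in "$@"; do
    case "$p" in
      ''|*[!0-9]*) return 1 ;;   # empty or contains a non-digit
    esac
    [ "$p" -ge 1024 ] && [ "$p" -le 65535 ] || return 1
    case "$seen" in
      *" $p "*) return 1 ;;      # same port given for two roles
    esac
    seen="$seen$p "
  done
  return 0
}

# cluster_up_cmd DISCOVERY GRADIENT ACTIVATION RESULT
# Prints (does not execute) the command line. Assumes `up` accepts
# the port flags from the Common Parameters table.
cluster_up_cmd() {
  check_ports "$@" || { echo "invalid or duplicate ports" >&2; return 1; }
  echo "pmetal cluster up --discovery-port $1 --gradient-port $2 --activation-port $3 --result-port $4"
}

# Print the command line for the documented default ports.
cluster_up_cmd 52415 52416 52417 52418
```

Printing the command instead of executing it keeps the check safe to run on a machine where pmetal is not installed; pipe the output to `sh` or copy it once the values look right.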
Fabric Preference
PMetal classifies local interfaces and prefers Thunderbolt over Ethernet over Wi-Fi when forming the distributed ring. If a faster fabric disappears during a job, distributed components can fall back to available paths.
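
The preference order can be illustrated with a small shell sketch. This is not PMetal's classifier: the function `pick_fabric` and the labels `thunderbolt`, `ethernet`, and `wifi` are hypothetical names, used only to show how a fixed ranking selects the best available path and falls back when the top choice is no longer present.

```shell
#!/usr/bin/env bash
# Illustration only: hypothetical labels and function, not PMetal's implementation.

# pick_fabric FABRIC...
# Prints the highest-ranked fabric among the arguments
# (thunderbolt > ethernet > wifi); fails if none is recognised.
pick_fabric() {
  local want have
  for want in thunderbolt ethernet wifi; do
    for have in "$@"; do
      if [ "$have" = "$want" ]; then
        echo "$want"
        return 0
      fi
    done
  done
  return 1
}

pick_fabric wifi thunderbolt ethernet   # all three fabrics available
pick_fabric wifi ethernet               # Thunderbolt link gone: fall back
```

The outer loop walks the ranking and the inner loop checks availability, so the order of the arguments does not affect the result.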
See Also
- pmetal train: Distributed training flags
- Hardware Support: UltraFusion and fabric context