pmetal bench

Benchmark PMetal’s training throughput, generation speed, and FFI overhead on your hardware.

Subcommands

bench

Benchmark training throughput (tokens/second, step time).

pmetal bench --model Qwen/Qwen3-0.6B --batch-size 4
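
To see how throughput changes with batch size, the command above can be wrapped in a small loop. This is a minimal sketch, not part of PMetal: `sweep_batch_sizes` is a hypothetical helper, and the `PMETAL` variable is this sketch's own convention for swapping in another executable (for example `echo`, to preview the commands). It uses only the `--model` and `--batch-size` flags shown above.

```shell
# Hypothetical helper (not shipped with pmetal): run the training
# benchmark once per batch size, using only the flags shown above.
# Usage: sweep_batch_sizes MODEL BATCH_SIZE...
sweep_batch_sizes() {
  model="$1"
  shift
  for bs in "$@"; do
    echo "== batch size ${bs} =="
    # PMETAL is an assumption of this sketch: set PMETAL=echo to print
    # each command instead of running it.
    "${PMETAL:-pmetal}" bench --model "$model" --batch-size "$bs"
  done
}
```

For example, `sweep_batch_sizes Qwen/Qwen3-0.6B 1 2 4 8` runs four benchmarks back to back; prefix the call with `PMETAL=echo` for a dry run.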

bench-gen

Benchmark the generation loop: tokens per second, time to first token, and decode latency.

pmetal bench-gen --model Qwen/Qwen3-0.6B --prompt "Hello" --max-tokens 100
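
When comparing runs, it can help to keep a copy of each benchmark's output. The sketch below assumes nothing about the output format; it only pipes whatever `pmetal bench-gen` prints through `tee`. The `bench_gen_logged` helper and the `LOG_DIR` and `PMETAL` variables are hypothetical conventions of this example, not PMetal options.

```shell
# Hypothetical helper (not shipped with pmetal): run the generation
# benchmark and save its output to a per-run log file.
# Usage: bench_gen_logged MAX_TOKENS
bench_gen_logged() {
  max_tokens="$1"
  # LOG_DIR is an assumption of this sketch; it defaults to ./bench-logs.
  log_dir="${LOG_DIR:-bench-logs}"
  mkdir -p "$log_dir"
  # PMETAL is also an assumption: set PMETAL=echo to preview the command.
  "${PMETAL:-pmetal}" bench-gen --model Qwen/Qwen3-0.6B --prompt "Hello" \
    --max-tokens "$max_tokens" | tee "$log_dir/bench-gen-${max_tokens}.log"
}
```

Calling `bench_gen_logged 100` reproduces the command above and writes its output to `bench-logs/bench-gen-100.log`.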

bench-ffi

Benchmark FFI overhead between Rust and Metal/MLX.

pmetal bench-ffi

See Also

Hardware Support — Hardware capabilities
Kernel Tuning — Per-tier optimizations