Getting Started

Get up and running with PMetal in minutes — install, train your first model, and run inference.

PMetal is a complete machine learning platform for Apple Silicon — from low-level Metal GPU kernels and Apple Neural Engine integration to high-level training APIs, a terminal TUI, and a full desktop GUI.

Prerequisites

macOS on Apple Silicon (M1 or later)
Xcode Command Line Tools: xcode-select --install

For building from source, also install:

Rust 1.86+ via rustup
Metal Toolchain: xcodebuild -downloadComponent MetalToolchain
CMake: brew install cmake
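
To confirm the prerequisites before you build, a short script can probe for each one. The following is a minimal sketch, not part of PMetal: the helper names (`is_apple_silicon`, `rust_is_new_enough`, `check_prerequisites`) are hypothetical. It relies only on `xcode-select -p`, `rustc --version`, and `cmake --version`, and it does not check the Metal Toolchain.

```python
import platform
import re
import subprocess
from typing import Optional

MIN_RUST = (1, 86)  # minimum Rust version listed in the prerequisites


def is_apple_silicon(system: str, machine: str) -> bool:
    """True for macOS ("Darwin") on arm64. A Python run under Rosetta reports x86_64."""
    return system == "Darwin" and machine == "arm64"


def parse_rustc_version(output: str) -> Optional[tuple]:
    """Extract (major, minor) from `rustc --version` output such as 'rustc 1.86.0 (...)'."""
    match = re.search(r"rustc (\d+)\.(\d+)", output)
    return (int(match.group(1)), int(match.group(2))) if match else None


def rust_is_new_enough(output: str, minimum: tuple = MIN_RUST) -> bool:
    """Compare the parsed version against the minimum as (major, minor) tuples."""
    version = parse_rustc_version(output)
    return version is not None and version >= minimum


def command_output(argv: list) -> Optional[str]:
    """Return stdout if the command exists and exits with status 0, else None."""
    try:
        done = subprocess.run(argv, capture_output=True, text=True)
    except OSError:  # binary not found or not executable
        return None
    return done.stdout if done.returncode == 0 else None


def check_prerequisites() -> dict:
    """Map each prerequisite name to True (found) or False (missing)."""
    rustc = command_output(["rustc", "--version"])
    return {
        "Apple Silicon macOS": is_apple_silicon(platform.system(), platform.machine()),
        # `xcode-select -p` prints the active developer directory, or fails if none is set.
        "Xcode Command Line Tools": command_output(["xcode-select", "-p"]) is not None,
        "Rust 1.86+": rustc is not None and rust_is_new_enough(rustc),
        "CMake": command_output(["cmake", "--version"]) is not None,
    }


if __name__ == "__main__":
    for name, ok in check_prerequisites().items():
        print(f"{'ok' if ok else 'MISSING':8}{name}")
```

Only the first two entries matter if you install a prebuilt binary; Rust and CMake are needed only for source builds.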

Quick Install

# Option 1: Prebuilt binary
curl -fsSL https://github.com/Epistates/pmetal/releases/latest/download/pmetal-aarch64-apple-darwin.tar.gz | tar xz
sudo mv pmetal /usr/local/bin/

# Option 2: Install from crates.io
cargo install pmetal

See Installation for all options including optional serving, MCP, source builds, and GUI setup.

Your First Training Run

Fine-tune a model with LoRA in one command:

pmetal train \
  --model Qwen/Qwen3-0.6B \
  --dataset train.jsonl \
  --output ./output \
  --lora-r 16 --batch-size 4 --learning-rate 2e-4

PMetal automatically downloads the model from the Hugging Face Hub, detects your hardware capabilities, and tunes kernel parameters for your specific chip.
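
Before starting a long run, it can help to sanity-check the file you pass to `--dataset`. The `.jsonl` extension conventionally means JSON Lines, with one JSON value per line. The record schema PMetal expects is not stated on this page, so the sketch below assumes only that each record is a JSON object and checks nothing about field names. The helper `find_bad_lines` is hypothetical, not part of PMetal.

```python
import json


def find_bad_lines(lines):
    """Return (line_number, reason) pairs for lines that are not JSON objects.

    `lines` is any iterable of strings, such as an open text file.
    """
    problems = []
    for number, line in enumerate(lines, start=1):
        if not line.strip():
            # Blank lines are skipped here; whether PMetal accepts them is not stated.
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError as err:
            problems.append((number, f"invalid JSON: {err.msg}"))
            continue
        if not isinstance(record, dict):
            # Assumption: training records are JSON objects, not arrays or scalars.
            problems.append((number, "not a JSON object"))
    return problems
```

To check a file, open it and pass the handle, for example `find_bad_lines(open("train.jsonl", encoding="utf-8"))`. An empty list means every non-blank line parsed as a JSON object.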

Run Inference

Chat with your fine-tuned model by loading the LoRA weights on top of the base model:

pmetal infer \
  --model Qwen/Qwen3-0.6B \
  --lora ./output/lora_weights.safetensors \
  --prompt "Explain quantum entanglement" \
  --chat
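
If you want to try several prompts against the same adapter, you can drive the command above from a script. This is a rough sketch using only the flags shown on this page. It assumes `pmetal infer` writes its response to stdout and then exits, which this page does not confirm. The helpers `build_infer_command` and `run_prompts` are hypothetical.

```python
import subprocess


def build_infer_command(model, lora, prompt):
    """Assemble the argv for `pmetal infer` using the same flags as the example command."""
    return [
        "pmetal", "infer",
        "--model", model,
        "--lora", str(lora),
        "--prompt", prompt,  # passed as one argv element, so no shell quoting is needed
        "--chat",
    ]


def run_prompts(model, lora, prompts):
    """Run `pmetal infer` once per prompt and return the captured stdout of each run.

    Assumes the command is non-interactive; check=True raises if it exits non-zero.
    """
    outputs = []
    for prompt in prompts:
        done = subprocess.run(
            build_infer_command(model, lora, prompt),
            capture_output=True,
            text=True,
            check=True,
        )
        outputs.append(done.stdout)
    return outputs
```

Passing the arguments as a list rather than a single shell string keeps prompts that contain quotes or spaces intact.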

Use the SDK

Integrate PMetal into your own Rust applications through the facade crate and re-exported modules:

use pmetal::data::Tokenizer;
use pmetal::hub::resolve_model_path;
use pmetal::models::DynamicModel;

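// Run this inside an async function that returns a Result:
// the snippet uses both `.await` and the `?` operator.

// Resolve the model ID to a local path.
let model_path = resolve_model_path("Qwen/Qwen3-0.6B", None, None).await?;
// Load the tokenizer and the model from that path.
let tokenizer = Tokenizer::from_model_dir(&model_path)?;
let model = DynamicModel::load(&model_path)?;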

See Rust Facade for complete inference and training examples.

Or from Python:

import pmetal

result = pmetal.finetune(
    "Qwen/Qwen3-0.6B",
    "train.jsonl",
    lora_r=16,
    learning_rate=2e-4,
    epochs=3,
)

Explore Further

CLI Reference — Training, serving, pretraining, clustering, quantization, and model operations
Rust Facade — Re-exported Rust modules and examples
Python SDK — PyO3 bindings
Training Methods — SFT, DPO, GRPO, distillation, and more
Supported Models — All supported architectures
Hardware — Chip detection and kernel tuning