Supported Models

PMetal supports a wide range of model architectures. Models are loaded from HuggingFace Hub or local safetensors with automatic architecture detection.

All models below work with the CLI (pmetal infer), TUI, GUI, and SDK.

| Family | Architecture | Variants | model_type values |
|---|---|---|---|
| Llama | Llama | 2, 3, 3.1, 3.2, 3.3 | llama, llama3 |
| Llama 4 | Llama4 | Scout, Maverick | llama4 |
| Qwen 2 | Qwen2 | 2, 2.5 | qwen2, qwen2_5 |
| Qwen 3 | Qwen3 | 3 | qwen3 |
| Qwen 3 MoE | Qwen3MoE | 3-MoE | qwen3_moe |
| Qwen 3.5 | Qwen3Next | 3.5 (Next) | qwen3_next, qwen3_5 |
| DeepSeek | DeepSeek | V3, V3.2, V3.2-Speciale | deepseek, deepseek_v3 |
| Mistral | Mistral | 7B, Mixtral 8×7B | mistral, mixtral |
| Gemma | Gemma | 2, 3 | gemma, gemma2, gemma3 |
| Phi 3 | Phi | 3, 3.5 | phi, phi3 |
| Phi 4 | Phi4 | 4 | phi4 |
| Cohere | Cohere | Command R | cohere, command_r |
| Granite | Granite | 3.0, 3.1, Hybrid MoE | granite, granitehybrid |
| NemotronH | NemotronH | Hybrid (Mamba+Attention) | nemotron_h |
| StarCoder2 | StarCoder2 | 3B, 7B, 15B | starcoder2 |
| RecurrentGemma | RecurrentGemma | Griffin | recurrentgemma, griffin |
| Jamba | Jamba | 1.5 | jamba |
| Flux | Flux | 1-dev, 1-schnell | flux |

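The mapping from model_type values to architecture families listed above can be sketched as a plain string match. This is only an illustration of the table: the `Family` enum and the `family_for_model_type` function are hypothetical names introduced here, not PMetal's actual types or its DynamicModel dispatcher, and matching is assumed to be exact and case-sensitive.

```rust
/// Architecture families from the table above.
/// NOTE: hypothetical illustration, not a type from PMetal.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Family {
    Llama,
    Llama4,
    Qwen2,
    Qwen3,
    Qwen3MoE,
    Qwen3Next,
    DeepSeek,
    Mistral,
    Gemma,
    Phi,
    Phi4,
    Cohere,
    Granite,
    NemotronH,
    StarCoder2,
    RecurrentGemma,
    Jamba,
    Flux,
}

/// Map a `model_type` string to its family, following the table above.
/// Returns `None` for any value the table does not list.
pub fn family_for_model_type(model_type: &str) -> Option<Family> {
    let family = match model_type {
        "llama" | "llama3" => Family::Llama,
        "llama4" => Family::Llama4,
        "qwen2" | "qwen2_5" => Family::Qwen2,
        "qwen3" => Family::Qwen3,
        "qwen3_moe" => Family::Qwen3MoE,
        "qwen3_next" | "qwen3_5" => Family::Qwen3Next,
        "deepseek" | "deepseek_v3" => Family::DeepSeek,
        "mistral" | "mixtral" => Family::Mistral,
        "gemma" | "gemma2" | "gemma3" => Family::Gemma,
        "phi" | "phi3" => Family::Phi,
        "phi4" => Family::Phi4,
        "cohere" | "command_r" => Family::Cohere,
        "granite" | "granitehybrid" => Family::Granite,
        "nemotron_h" => Family::NemotronH,
        "starcoder2" => Family::StarCoder2,
        "recurrentgemma" | "griffin" => Family::RecurrentGemma,
        "jamba" => Family::Jamba,
        "flux" => Family::Flux,
        // Anything else is not covered by the table.
        _ => return None,
    };
    Some(family)
}
```

Under these assumptions, `family_for_model_type("qwen2_5")` yields `Some(Family::Qwen2)`, while a value that is absent from the table, such as `"gpt_oss"`, yields `None`.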
| Architecture | LoRA | QLoRA | Notes |
|---|---|---|---|
| Llama | Yes | Yes | Covers Llama 2–3.3. Gradient checkpointing supported. |
| Qwen 2 | Yes | | Uses Qwen3 LoRA implementation internally. |
| Qwen 3 | Yes | Yes | Gradient checkpointing supported. |
| Qwen 3.5 (Next) | Yes | | Hybrid architecture with nested text_config. |
| Gemma | Yes | Yes | GeGLU activation, special RMSNorm. |
| Mistral | Yes | Yes | Sliding window attention support. |
| Phi 3 | Yes | | Partial RoPE, fused gate_up projection. |

Architectures not listed in the table above (Llama 4, Qwen 3 MoE, DeepSeek, Cohere, Granite, NemotronH, Phi 4, StarCoder2, RecurrentGemma, Jamba) support inference only.
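The fine-tuning support described above can be summarized as a small lookup. This is a sketch under stated assumptions: `FineTuning` and `fine_tuning_support` are hypothetical names, not part of PMetal, and rows that show a single "Yes" are assumed to mean LoRA support with QLoRA left unconfirmed.

```rust
/// Fine-tuning support level as read from the table and note above.
/// NOTE: hypothetical illustration, not a type from PMetal.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum FineTuning {
    /// Both the LoRA and QLoRA columns are marked "Yes".
    LoraAndQlora,
    /// Only one "Yes" appears in the row; assumed to be the LoRA column.
    LoraOnly,
    /// Named in the note as supporting inference only.
    InferenceOnly,
}

/// Look up support by the architecture name used in the table.
/// Returns `None` for names that neither the table nor the note mentions.
pub fn fine_tuning_support(architecture: &str) -> Option<FineTuning> {
    match architecture {
        "Llama" | "Qwen 3" | "Gemma" | "Mistral" => Some(FineTuning::LoraAndQlora),
        "Qwen 2" | "Qwen 3.5 (Next)" | "Phi 3" => Some(FineTuning::LoraOnly),
        "Llama 4" | "Qwen 3 MoE" | "DeepSeek" | "Cohere" | "Granite" | "NemotronH"
        | "Phi 4" | "StarCoder2" | "RecurrentGemma" | "Jamba" => {
            Some(FineTuning::InferenceOnly)
        }
        // Not mentioned in either place, so no claim is made.
        _ => None,
    }
}
```

Returning `None` for unmentioned names avoids guessing: Flux, for example, appears in neither the table nor the inference-only list, so the sketch makes no statement about it.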

Architecture Modules (Not Yet in Dispatcher)

The following architectures are implemented in pmetal-models but are not yet registered in the DynamicModel dispatcher:

| Family | Module | Notes |
|---|---|---|
| GPT-OSS | gpt_oss | MoE with Top-4 sigmoid routing, 20B/120B |
| Pixtral | pixtral | 12B vision-language |
| Qwen2-VL | qwen2_vl | 2B, 7B vision-language |
| MLlama | mllama | Llama 3.2-Vision |
| CLIP | clip | ViT-L/14 vision encoder |
| Whisper | whisper | Base–Large speech models |
| T5 | t5 | Encoder-decoder architecture |

These can be used directly via their Rust types (e.g., pmetal_models::architectures::gpt_oss::GptOssForCausalLM).