Supported Models
Model-family support status for inference, embeddings, LoRA/QLoRA training, and direct architecture modules in PMetal.
PMetal supports a wide range of model architectures. Models are loaded from the Hugging Face Hub or from local safetensors files, and the architecture is detected automatically.
Inference Support
All causal language models below work with the CLI (pmetal infer), TUI, GUI, and SDK.
| Family | Architecture | Variants | model_type values |
|---|---|---|---|
| Llama | Llama | 2, 3, 3.1, 3.2, 3.3 | llama, llama3 |
| Llama 4 | Llama4 | Scout, Maverick | llama4 |
| Qwen 2 | Qwen2 | 2, 2.5 | qwen2, qwen2_5 |
| Qwen 3 | Qwen3 | 3 | qwen3 |
| Qwen 3 MoE | Qwen3MoE | 3-MoE | qwen3_moe |
| Qwen 3.5 | Qwen3Next | 3.5 (Next) | qwen3_next, qwen3_5 |
| DeepSeek | DeepSeek | V3, V3.2, V3.2-Speciale | deepseek, deepseek_v3 |
| Mistral | Mistral | 7B, Mixtral 8×7B | mistral, mixtral |
| Gemma | Gemma | 2, 3 | gemma, gemma2, gemma3 |
| Phi 3 | Phi | 3, 3.5 | phi, phi3 |
| Phi 4 | Phi4 | 4 | phi4 |
| Cohere | Cohere | Command R | cohere, command_r |
| Granite | Granite | 3.0, 3.1, Hybrid MoE | granite, granitehybrid |
| NemotronH | NemotronH | Hybrid (Mamba+Attention) | nemotron_h |
| GPT-OSS | GptOss | 20B, 120B | gpt_oss, gpt-oss |
| Gemma 4 | Gemma4 | 4 | gemma4, gemma4_text |
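
The model_type column above can be read as a lookup from a model's declared type to the architecture that handles it. The following is a minimal sketch of that lookup, transcribed from the table. The function name architecture_for is hypothetical and this is not PMetal's actual dispatcher code; it only illustrates how the listed aliases collapse onto one architecture each.

```rust
/// Hypothetical helper (not part of PMetal): maps a `model_type` string to
/// the architecture name listed in the inference table above.
/// Returns `None` for any value the table does not list.
fn architecture_for(model_type: &str) -> Option<&'static str> {
    let arch = match model_type {
        "llama" | "llama3" => "Llama",
        "llama4" => "Llama4",
        "qwen2" | "qwen2_5" => "Qwen2",
        "qwen3" => "Qwen3",
        "qwen3_moe" => "Qwen3MoE",
        "qwen3_next" | "qwen3_5" => "Qwen3Next",
        "deepseek" | "deepseek_v3" => "DeepSeek",
        "mistral" | "mixtral" => "Mistral",
        "gemma" | "gemma2" | "gemma3" => "Gemma",
        "phi" | "phi3" => "Phi",
        "phi4" => "Phi4",
        "cohere" | "command_r" => "Cohere",
        "granite" | "granitehybrid" => "Granite",
        "nemotron_h" => "NemotronH",
        // Both spellings appear in the table.
        "gpt_oss" | "gpt-oss" => "GptOss",
        "gemma4" | "gemma4_text" => "Gemma4",
        _ => return None,
    };
    Some(arch)
}

fn main() {
    // "pixtral" is deliberately included: it is not in the inference table.
    for model_type in ["qwen2_5", "gpt-oss", "pixtral"] {
        match architecture_for(model_type) {
            Some(arch) => println!("{model_type} -> {arch}"),
            None => println!("{model_type} -> not in the inference table"),
        }
    }
}
```

Returning an Option rather than panicking keeps the unlisted case explicit, which matters here because several families on this page have modules but are not reachable through automatic detection.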
Embedding / Encoder Models
| Family | Architecture | Variants | model_type values |
|---|---|---|---|
| BERT | Bert | BERT, RoBERTa, DistilBERT, XLM-RoBERTa | bert, roberta, distilbert, xlm-roberta, xlm_roberta |
LoRA / QLoRA Training Support
| Architecture | LoRA | QLoRA | Notes |
|---|---|---|---|
| Llama | Yes | Yes | Covers Llama 2–3.3. Gradient checkpointing supported. |
| Llama 4 | Yes | Yes | Scout/Maverick support via DynamicLoraModel. |
| Qwen 2 | Yes | Yes | Uses Qwen3 LoRA implementation internally. |
| Qwen 3 | Yes | Yes | Gradient checkpointing supported. |
| Qwen 3 MoE | Yes | Yes | Sparse MoE support. |
| Qwen 3.5 (Next) | Yes | Yes | Hybrid architecture with nested text_config. |
| Gemma | Yes | Yes | GeGLU activation, special RMSNorm. |
| Gemma 4 | Yes | Yes | Multimodal-era Gemma text path. |
| Mistral | Yes | Yes | Sliding window attention support. |
| Phi 3/4 | Yes | Yes | Partial RoPE, fused gate_up projection. |
| DeepSeek | Yes | Yes | V3-family support. |
| Cohere | Yes | Yes | Command R support. |
| Granite | Yes | Yes | Dense and hybrid variants. |
| NemotronH | Yes | Yes | Hybrid architecture support. |
| GPT-OSS | Yes | Yes | MoE variants. |
Architecture Modules (Not Yet in Dispatcher)
These families have implementations in pmetal-models but are not yet wired into the DynamicModel dispatcher:
| Family | Module | Notes |
|---|---|---|
| Pixtral | pixtral | 12B vision-language |
| Qwen2-VL | qwen2_vl | 2B, 7B vision-language |
| MLlama | mllama | Llama 3.2-Vision |
| CLIP | clip | ViT-L/14 vision encoder |
| Whisper | whisper | Base–Large speech models |
| T5 | t5 | Encoder-decoder architecture |
Because they are not in the dispatcher, these modules are used directly via their Rust types (e.g., pmetal_models::architectures::pixtral::Pixtral).
Diffusion Models
| Family | Variants | Status |
|---|---|---|
| Flux | 1-dev, 1-schnell | Dispatcher + pipeline implemented |
See Also
- Model Merging: merge strategies
- Quantization: GGUF quantization