Phi-4-mini-instruct-w4a4-fp4

This is a quantized version of Microsoft's Phi-4-mini-instruct model, compressed to the NVFP4 format with 4-bit weights and 4-bit activations.

Model Description

Phi-4-mini-instruct is a smaller variant of the Phi-4 model, designed for instruction-following tasks. The quantized version retains much of the original model's capabilities while significantly reducing its size and computational requirements.
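
The checkpoint is stored in the compressed-tensors format produced by LLM Compressor, so it can typically be served with an inference engine that understands NVFP4, such as a recent vLLM build on FP4-capable hardware. The snippet below is a minimal sketch under those assumptions; the prompt and sampling settings are illustrative only.

```python
# Minimal inference sketch. Assumes a recent vLLM release with
# compressed-tensors / NVFP4 support and a GPU with FP4 kernels.
from vllm import LLM, SamplingParams

llm = LLM(model="pebeto/Phi-4-mini-instruct-w4a4-fp4")

sampling = SamplingParams(temperature=0.7, top_p=0.95, max_tokens=256)
prompts = ["Explain the trade-offs of 4-bit quantization in one paragraph."]

for output in llm.generate(prompts, sampling):
    print(output.outputs[0].text)
```

Note that NVFP4 activation kernels generally require GPUs with native FP4 support (for example, NVIDIA Blackwell-class hardware).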

Quantization Details

  • Quantization Format: NVFP4
  • Weight Precision: 4-bit
  • Activation Precision: 4-bit
  • Quantization Tool: LLM Compressor (see the sketch below this list)
  • Model Size: ~3B parameters
  • Checkpoint Format: Safetensors (tensor dtypes: F32, BF16, F8_E4M3, U8)
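
For reference, a one-shot NVFP4 recipe along the following lines is the typical way such a checkpoint is produced with LLM Compressor. This is a sketch, not a record of the exact recipe used here: the calibration dataset, sample count, and ignore list are assumptions.

```python
# Sketch of a one-shot NVFP4 (w4a4) quantization run with LLM Compressor.
# Calibration dataset, sample count, and ignore list are assumptions and may
# differ from whatever was actually used to produce this checkpoint.
from transformers import AutoModelForCausalLM, AutoTokenizer

from llmcompressor import oneshot
from llmcompressor.modifiers.quantization import QuantizationModifier

MODEL_ID = "microsoft/Phi-4-mini-instruct"

model = AutoModelForCausalLM.from_pretrained(MODEL_ID, torch_dtype="auto")
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)

# Quantize Linear layers to 4-bit FP4 weights and activations (NVFP4),
# keeping the output head in higher precision.
recipe = QuantizationModifier(targets="Linear", scheme="NVFP4", ignore=["lm_head"])

# A small calibration set is used to fit the activation scales.
oneshot(
    model=model,
    recipe=recipe,
    dataset="open_platypus",
    num_calibration_samples=64,
    max_seq_length=2048,
)

# Save in the compressed-tensors format expected by inference engines.
SAVE_DIR = "Phi-4-mini-instruct-w4a4-fp4"
model.save_pretrained(SAVE_DIR, save_compressed=True)
tokenizer.save_pretrained(SAVE_DIR)
```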