Shrinking Transformers for Production: ONNX Export + Dynamic Quantization
10 February 2025 · 3 mins
ONNX, Quantization, Model Optimization, DistilBERT, Inference