📚Academy

QLoRA & Quantization.

Fine-tune 70B models on a single GPU by combining LoRA with 4-bit quantization.

After this lesson you'll know

  • How quantization reduces model memory footprint without destroying quality
  • The QLoRA technique: 4-bit NormalFloat, double quantization, and paged optimizers
  • Hands-on QLoRA training with bitsandbytes and PEFT
  • When to use QLoRA vs standard LoRA and the quality tradeoffs

Quantization Fundamentals

Model weights are stored as floating-point numbers. The precision of those numbers directly determines memory usage:

```
Precision   Bits per weight   Memory for a 7B model
FP32        32                28 GB
FP16/BF16   16                14 GB
INT8        8                 7 GB
INT4/NF4    4                 3.5 GB
```

Quantization converts weights from higher to lower precision. The challenge is doing this without destroying the model's capability.

**Naive quantization** maps floating-point values to integer bins uniformly. This works poorly because model weights follow a roughly normal distribution -- most values cluster near zero, with a few extreme outliers. Uniform binning wastes precision on the sparsely populated tails and crushes important distinctions near zero.

**NormalFloat4 (NF4)** -- the quantization format used in QLoRA -- solves this by creating non-uniform bins that match the normal distribution of the weights: more bins near zero (where most weights live), fewer bins in the tails.

```
Uniform INT4:     [-8, -7, -6, -5, -4, -3, -2, -1, 0, 1, 2, 3, 4, 5, 6, 7]
NF4 (normalized): [-1.0, -0.70, -0.53, -0.39, -0.28, -0.18, -0.09, 0.0,
                    0.08, 0.16, 0.25, 0.34, 0.44, 0.56, 0.72, 1.0]
```

NF4 has higher density near zero, where weight values cluster, preserving more information in the region that matters most.

The result: NF4 quantization from 16-bit to 4-bit typically loses less than 1% accuracy on benchmarks, while reducing memory by 4x.
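You can see the difference between the two codebooks directly. The sketch below (plain Python; the NF4 levels are rounded from the QLoRA paper's lookup table, and the sample size and weight scale are arbitrary illustrative choices) quantizes normally distributed "weights" with both codebooks and compares the reconstruction error:

```python
import random

# 16-level NF4 codebook: quantiles of a standard normal, normalized to
# [-1, 1] (values rounded from the QLoRA paper's lookup table)
NF4_LEVELS = [-1.0, -0.6962, -0.5251, -0.3949, -0.2844, -0.1848, -0.0911, 0.0,
              0.0796, 0.1609, 0.2461, 0.3379, 0.4407, 0.5626, 0.7230, 1.0]
# 16 evenly spaced levels over the same range, for comparison
UNIFORM_LEVELS = [-1.0 + 2 * i / 15 for i in range(16)]

def quantize(values, levels):
    """Absmax-normalize, snap each value to its nearest level, rescale back."""
    scale = max(abs(v) for v in values)
    return [min(levels, key=lambda lv: abs(v / scale - lv)) * scale
            for v in values]

random.seed(0)
# Roughly normal weights with a small standard deviation, as in real layers
weights = [random.gauss(0.0, 0.02) for _ in range(10_000)]

for name, levels in [("uniform INT4", UNIFORM_LEVELS), ("NF4", NF4_LEVELS)]:
    dequant = quantize(weights, levels)
    mae = sum(abs(w - d) for w, d in zip(weights, dequant)) / len(weights)
    print(f"{name:>12}: mean abs error = {mae:.6f}")
```

Because most samples land near zero, where NF4 packs its levels densely, NF4's mean absolute error comes out lower than the uniform codebook's despite using the same 4 bits per value.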
Quantization is a compression technique, not a quality improvement. You trade memory for a small accuracy loss. The magic of QLoRA is that LoRA training recovers most of this lost accuracy on your specific task.
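The full recipe -- NF4 base weights, double quantization, and trainable LoRA adapters on top -- can be sketched with the bitsandbytes integration in `transformers` plus PEFT. This is a minimal illustration under stated assumptions, not the lesson's exact code: the model name and LoRA hyperparameters are placeholders, and loading a real 70B checkpoint requires the weights and GPU memory to match.

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

# 4-bit NF4 quantization with double quantization, as in the QLoRA paper
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",              # NormalFloat4 codebook
    bnb_4bit_use_double_quant=True,         # also quantize the quant constants
    bnb_4bit_compute_dtype=torch.bfloat16,  # matmuls run in bf16
)

# Placeholder model name -- substitute the checkpoint you are fine-tuning
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-70b-hf",
    quantization_config=bnb_config,
    device_map="auto",
)
model = prepare_model_for_kbit_training(model)

# LoRA adapters stay in 16-bit and are the only trainable parameters
lora_config = LoraConfig(
    r=16, lora_alpha=32, lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
```

Training then proceeds with a standard `Trainer`; the paper's paged optimizer corresponds to setting `optim="paged_adamw_32bit"` in `TrainingArguments`.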
Built with soul — likeone.ai