Knowledge Distillation on CIFAR-10

CS229 final project exploring knowledge distillation as a model compression technique. The project runs a series of ablation experiments on teacher-student architectures, using the CIFAR-10 benchmark to measure how key hyperparameters affect student generalization.
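For context, the core distillation objective combines a temperature-scaled soft-label term with the usual hard-label cross-entropy. Below is a minimal PyTorch sketch of that combined loss; the function name `distillation_loss` and the default values `T=4.0` and `alpha=0.9` are illustrative assumptions, not the project's tuned settings.

```python
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, targets, T=4.0, alpha=0.9):
    # Soften both distributions with temperature T; the T**2 factor keeps
    # gradient magnitudes comparable across temperatures (Hinton et al., 2015).
    soft_loss = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * (T ** 2)
    # Standard cross-entropy against the ground-truth labels.
    hard_loss = F.cross_entropy(student_logits, targets)
    # alpha weights the soft (teacher) term against the hard (label) term.
    return alpha * soft_loss + (1.0 - alpha) * hard_loss
```

The temperature `T` and mixing weight `alpha` correspond to the "temperature scaling" and "loss weighting" hyperparameters ablated in the experiments.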

Key topics: temperature scaling · intermediate feature alignment · loss weighting · model compression
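The "intermediate feature alignment" topic refers to matching internal activations of the student to those of the teacher, in the style of FitNets-type hint losses. A minimal sketch follows; the module name, channel counts, and the 1x1 projection are assumptions for illustration, since teacher and student widths typically differ.

```python
import torch.nn as nn
import torch.nn.functional as F

class FeatureAlignmentLoss(nn.Module):
    """MSE between a student feature map and a (detached) teacher feature map."""

    def __init__(self, student_channels=64, teacher_channels=256):
        super().__init__()
        # 1x1 conv projects student channels up to the teacher's width
        # when the two architectures disagree (channel counts hypothetical).
        self.proj = nn.Conv2d(student_channels, teacher_channels, kernel_size=1)

    def forward(self, student_feat, teacher_feat):
        # Detach the teacher so only the student (and projection) get gradients.
        return F.mse_loss(self.proj(student_feat), teacher_feat.detach())
```

In practice this term is added to the distillation loss above with its own weight, which is one of the loss-weighting knobs an ablation can sweep.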

Tech stack: Python · PyTorch · CIFAR-10

GitHub Repository