Knowledge Distillation on CIFAR-10
CS229 final project exploring knowledge distillation as a model compression technique. A series of ablation experiments on teacher-student architectures, trained on the CIFAR-10 benchmark, examines how key hyperparameters affect student generalization.
Key topics: temperature scaling · intermediate feature alignment · loss weighting · model compression
Tech stack: Python · PyTorch · CIFAR-10
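The topics above combine in the standard Hinton-style distillation objective: a KL term between temperature-softened teacher and student logits, blended with the ordinary cross-entropy on hard labels by a weighting factor. A minimal sketch in PyTorch is below; the function name and the default temperature and weight values are illustrative, not the project's actual settings.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, targets, T=4.0, alpha=0.9):
    # Illustrative KD loss: alpha * soft (KL) term + (1 - alpha) * hard (CE) term.
    # The KL term is scaled by T^2 so its gradient magnitude stays comparable
    # as the temperature changes.
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)
    hard = F.cross_entropy(student_logits, targets)
    return alpha * soft + (1 - alpha) * hard

# Usage with random logits for a CIFAR-10-sized batch (10 classes)
student = torch.randn(8, 10)
teacher = torch.randn(8, 10)
labels = torch.randint(0, 10, (8,))
loss = distillation_loss(student, teacher, labels)
```

Sweeping `T` and `alpha` in a loss of this shape is what the loss-weighting and temperature-scaling ablations vary.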
