project 2 | Nick Chermak

Modern mobile devices offer impressive camera capabilities, but real-time image enhancement still relies heavily on cloud processing, which raises concerns about privacy, latency, and battery usage. This project addresses these issues by adapting the SwinIR model for single-image super-resolution (SISR) to run entirely on-device on the iPhone 16 Pro Max. The goal was a fast, energy-efficient, and privacy-preserving image-enhancement solution built on a compressed, transformer-based model.

We began with the SwinIR model and explored several compression techniques to fit the constraints of mobile hardware. I led the transition from PyTorch-based quantization to CoreML-compatible workflows, using mixed-precision and FP16 strategies suited to Apple’s GPU and Neural Engine. I implemented structured pruning of attention heads, ranking heads by the L1 norm of their weights, and combined it with CoreML-based quantization to produce a hardware-optimized model for iOS. The pipeline was benchmarked on latency, FLOPs, energy usage, and image-quality metrics such as PSNR.
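
As a rough illustration of two steps mentioned above, the sketch below ranks attention heads by the L1 norm of their weights and computes PSNR between a reference and a restored image. It is a minimal, dependency-free sketch and not the project's actual code. The function names (head_l1_scores, heads_to_keep, psnr), the assumption that each head owns a contiguous block of rows in a projection matrix, and the prune_ratio parameter are all hypothetical choices made for the example.

```python
import math


def head_l1_scores(weight, num_heads):
    """Score each attention head by the L1 norm of its slice of a weight matrix.

    `weight` is a list of rows (out_features x in_features). This sketch ASSUMES
    rows are grouped contiguously by head; a real model may lay them out differently.
    """
    rows_per_head, remainder = divmod(len(weight), num_heads)
    if remainder:
        raise ValueError("number of rows must be divisible by num_heads")
    scores = []
    for h in range(num_heads):
        block = weight[h * rows_per_head:(h + 1) * rows_per_head]
        # L1 norm: sum of absolute values of every weight owned by this head.
        scores.append(sum(abs(w) for row in block for w in row))
    return scores


def heads_to_keep(scores, prune_ratio):
    """Drop the lowest-scoring fraction of heads; return surviving indices in order.

    `prune_ratio` is a hypothetical knob, not a value taken from the project.
    """
    num_prune = int(len(scores) * prune_ratio)
    ranked = sorted(range(len(scores)), key=lambda h: scores[h])  # weakest first
    return sorted(ranked[num_prune:])


def psnr(reference, restored, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel sequences."""
    if len(reference) != len(restored):
        raise ValueError("inputs must have the same number of pixels")
    mse = sum((a - b) ** 2 for a, b in zip(reference, restored)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)
```

With a toy 4x2 weight matrix split into two heads, head_l1_scores returns one score per head, and heads_to_keep(scores, 0.5) keeps the higher-scoring head. In a real pipeline the surviving indices would then be used to slice the attention projections before export; that step is omitted here because it depends on the model's layout.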