Hey, I'm Pranay 👋
Building scalable AI inference systems & city-scale surveillance platforms.
Currently at Masterworks, Hyderabad.
I'm a Computer Vision & Machine Learning Engineer with a deep focus on building production-grade AI systems that operate at scale. From managing 4,000+ camera feeds to reducing computational overhead, I engineer solutions that are both technically rigorous and practically impactful.
A deep dive into why Triton Inference Server has become a go-to choice for production ML inference: real bottlenecks, benchmarks, and the engineering trade-offs companies face when scaling model serving.
A hands-on exploration of Apple's MLX framework: how it unlocks near-native GPU performance on M-series chips, how its unified-memory model works, and where it fits in the ML ecosystem.
Whether it's an exciting AI opportunity, a collaboration, or just a chat about inference systems, my inbox is always open.
pranaysaha61@gmail.com