AI Inference Performance Engineer

Building faster inference through latency benchmarks, runtime validation, and GPU-aware systems

CUDA • C++ • PyTorch • Model Serving • Latency Benchmarking • Runtime Systems

FEATURED WORK

Inference Runtime Regression Gate

CI validation gate that benchmarks baseline vs candidate inference runs and blocks meaningful p95 latency regressions before release.

Python • PyTorch • GitHub Actions
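
The core check behind a gate like this can be sketched in a few lines. This is a minimal illustration, not the project's actual code: the function names, the nearest-rank percentile method, and the 5% tolerance are all assumptions chosen for the example.

```python
def percentile(samples, pct):
    """Nearest-rank percentile for an integer pct in 1..100."""
    if not samples:
        raise ValueError("samples must be non-empty")
    ordered = sorted(samples)
    # Integer ceiling of pct * n / 100 avoids floating-point rounding surprises.
    rank = (pct * len(ordered) + 99) // 100
    return ordered[max(rank, 1) - 1]


def p95_regressed(baseline_ms, candidate_ms, tolerance=0.05):
    """Return True when candidate p95 exceeds baseline p95 by more than tolerance.

    tolerance is a hypothetical threshold (5% here), not a value from the project.
    """
    base = percentile(baseline_ms, 95)
    cand = percentile(candidate_ms, 95)
    return cand > base * (1 + tolerance)


if __name__ == "__main__":
    baseline = [float(x) for x in range(1, 101)]   # stand-in latency samples in ms
    candidate = [x * 1.2 for x in baseline]        # simulated 20% slowdown
    # A CI job could exit non-zero here to block the release.
    print("regression" if p95_regressed(baseline, candidate) else "ok")
```

In a CI job, the boolean would typically map to the process exit code so that the workflow step fails when the candidate regresses.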

Matching Engine

Deterministic C++ limit order book with price-time priority, partial fills, cancellations, and throughput benchmarking.

C++ • Order Book • Low Latency
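
Price-time priority, partial fills, and cancellations can be illustrated with a small sketch. The project itself is written in C++; the Python below only shows the matching rules, and the `OrderBook` class, its method names, and the lazy-cancellation approach are assumptions made for the example.

```python
import heapq
from itertools import count


class OrderBook:
    """Toy limit order book: best price first, then earliest arrival."""

    def __init__(self):
        self._bids = []      # heap of (-price, seq, order_id, qty)
        self._asks = []      # heap of (price, seq, order_id, qty)
        self._seq = count()  # arrival counter gives deterministic time priority
        self._live = set()   # ids of resting orders that can still trade

    def submit(self, order_id, side, price, qty):
        """Match a limit order; return fills as (maker_id, price, qty)."""
        fills = []
        is_buy = side == "buy"
        own, opposite = (self._bids, self._asks) if is_buy else (self._asks, self._bids)
        while qty > 0 and opposite:
            key, seq, maker_id, maker_qty = opposite[0]
            if maker_id not in self._live:   # cancelled earlier, drop lazily
                heapq.heappop(opposite)
                continue
            maker_price = key if is_buy else -key
            crosses = price >= maker_price if is_buy else price <= maker_price
            if not crosses:
                break
            traded = min(qty, maker_qty)
            fills.append((maker_id, maker_price, traded))
            qty -= traded
            if traded == maker_qty:          # maker fully filled
                heapq.heappop(opposite)
                self._live.discard(maker_id)
            else:                            # partial fill keeps queue position
                heapq.heapreplace(opposite, (key, seq, maker_id, maker_qty - traded))
        if qty > 0:                          # rest the unfilled remainder
            key = -price if is_buy else price
            heapq.heappush(own, (key, next(self._seq), order_id, qty))
            self._live.add(order_id)
        return fills

    def cancel(self, order_id):
        """Cancel a resting order; return True if it was live."""
        if order_id in self._live:
            self._live.remove(order_id)
            return True
        return False
```

Because ordering depends only on price and an arrival counter, replaying the same order stream produces the same fills, which is the sense of "deterministic" assumed here.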

Lock-Free Ring Buffer

Lock-free SPSC ring buffer with tuned wait strategies, benchmarked for throughput and latency.

C++ • Lock-Free • Low Latency
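
The index protocol behind a single-producer, single-consumer ring can be sketched as follows. This is only an illustration of the head/tail ownership rule: Python cannot express the atomic loads and stores with acquire/release ordering that a real C++ implementation would need, and the `SpscRing` class and its power-of-two capacity requirement are assumptions for the example.

```python
class SpscRing:
    """Index logic of an SPSC ring buffer (not actually lock-free in Python)."""

    def __init__(self, capacity):
        if capacity < 1 or capacity & (capacity - 1):
            raise ValueError("capacity must be a power of two")
        self._mask = capacity - 1
        self._slots = [None] * capacity
        self._head = 0  # next index to read; advanced only by the consumer
        self._tail = 0  # next index to write; advanced only by the producer

    def try_push(self, item):
        """Producer side: return False instead of blocking when full."""
        if self._tail - self._head > self._mask:   # tail - head == capacity
            return False
        self._slots[self._tail & self._mask] = item
        self._tail += 1   # publish only after the slot has been written
        return True

    def try_pop(self):
        """Consumer side: return (True, item), or (False, None) when empty."""
        if self._head == self._tail:
            return False, None
        item = self._slots[self._head & self._mask]
        self._head += 1   # release the slot only after it has been read
        return True, item
```

A wait strategy would sit on top of `try_push` and `try_pop`, for example spinning, yielding, or sleeping when they report full or empty.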

Real-Time UAV Simulator

Real-time drone simulation with deterministic physics and stable control systems.

Unreal Engine • Physics Simulation • Control Systems

Performance Inspector

Real-time profiling tool for identifying runtime bottlenecks and performance regressions.

Roblox Studio • Tooling • Performance

ScratchFX

GPU shader-based reveal system optimized for real-time rendering.

Unity • HLSL • Rendering

Static

A first-person psychological horror game built with dynamic rule-based gameplay systems.

Unreal Engine • C++ • Gameplay Systems

Stuffed Animal Nightmares

A stylized stealth-horror prototype featuring AI-driven enemy behavior and player evasion mechanics.

Unreal Engine • AI • Stealth Gameplay

FlagStorm

A multiplayer capture-the-flag prototype with networked gameplay systems and team-based objectives.

Roblox • Luau • Networking • Gameplay Systems

EXPERIENCE

Real-Time Simulation Systems at Boeing

Built real-time simulation systems in Unity and C++ that model aircraft electrical systems and faults.
Designed modular, state-driven architectures and optimized CPU, memory, and rendering performance.

C++ Integration • Performance Optimization

Infrastructure at Prisms VR

Built TeamCity PR gates across 3 Unity repositories, blocking failed builds before merge.
Automated package sync, lockfile regeneration, CI triage, and release validation workflows.

TeamCity • CI Validation • Release Automation

Let’s Make Inference Faster

surajm99@outlook.com
