I am a Senior Software Engineer who builds reliable, high-performance systems at the intersection of distributed backends, applied AI/ML, and polished product surfaces. My work spans the full stack, from low-latency services and data pipelines to type-safe frontends, with a consistent focus on correctness, observability, and developer experience.
- 🧠 AI / ML Engineering — production LLM systems, RAG, retrieval, evaluation, model serving, and inference optimization.
- 🏗️ Full-Stack Development — designing APIs, data models, and UIs that hold up under real-world scale.
- ⚙️ Product Engineering Mindset — I ship outcomes, not tickets; I measure impact, not output.
- 🔐 Engineering Rigor — testing, security, and operational excellence are non-negotiable defaults.
Open To → Senior / Staff SWE roles · AI/ML Engineering · Platform & Backend · Technical co-founder · High-impact open source.
🟣 Atlas — Distributed AI Inference Platform
A horizontally-scalable inference platform that serves LLM and embedding workloads with sub-100ms p99 latency, dynamic batching, and tenant-aware autoscaling.
| Aspect | Detail |
|---|---|
| Stack | Go · Python · gRPC · Kubernetes · Redis · PostgreSQL |
| Scale | 12k+ requests/sec sustained · 40+ GPU nodes |
| Performance | p99 latency under 95ms · 3.2× throughput via dynamic batching |
| Security | mTLS service mesh · per-tenant isolation · signed audit log |
| Impact | Cut inference cost by 41% · powers 9 internal product teams |
| Repository | github.com/thecelestialmismatch/atlas |
Engineered the request scheduler and KV-cache reuse layer; introduced speculative batching that lifted GPU utilization from 54% to 88% without breaching SLOs.
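To illustrate dynamic batching in general terms, here is a simplified sketch. It is not the Atlas scheduler itself, and `Request`, `form_batches`, `max_batch_size`, and `max_wait_ms` are hypothetical names chosen for this example. The idea is to close a batch when it is full or when the oldest queued request has waited longer than a latency budget.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Request:
    """Hypothetical inference request: arrival time in ms plus a payload."""
    arrival_ms: float
    payload: str


def form_batches(
    requests: Iterable[Request],
    max_batch_size: int,
    max_wait_ms: float,
) -> List[List[Request]]:
    """Group requests into batches, offline and deterministically.

    A batch is closed when it already holds max_batch_size requests, or when
    the next request arrived more than max_wait_ms after the batch's first
    request. A real scheduler would do this online with timers.
    """
    batches: List[List[Request]] = []
    current: List[Request] = []
    for req in sorted(requests, key=lambda r: r.arrival_ms):
        full = len(current) >= max_batch_size
        expired = bool(current) and req.arrival_ms - current[0].arrival_ms > max_wait_ms
        if current and (full or expired):
            batches.append(current)
            current = []
        current.append(req)
    if current:
        batches.append(current)
    return batches


if __name__ == "__main__":
    reqs = [Request(t, f"prompt-{i}") for i, t in enumerate([0, 1, 2, 3, 20])]
    for batch in form_batches(reqs, max_batch_size=3, max_wait_ms=5):
        print([r.payload for r in batch])
```

The two knobs trade throughput against tail latency: a larger batch size improves GPU utilization, while a shorter wait window bounds how long any single request can be held back.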
🟣 Helix — RAG Knowledge Engine
A production retrieval-augmented generation engine with hybrid search, contextual re-ranking, and a continuous evaluation harness for grounded, citation-backed answers.
| Aspect | Detail |
|---|---|
| Stack | TypeScript · Python (FastAPI) · pgvector · OpenAI · Next.js |
| Scale | 8M+ indexed documents · 1.5k concurrent sessions |
| Performance | 220ms median retrieval latency · 94% answer-grounding rate |
| Security | Row-level ACLs · PII redaction · encrypted vector store |
| Impact | Deflected 38% of support tickets · NPS up 22 points |
| Repository | github.com/thecelestialmismatch/helix |
Designed the hybrid BM25 + dense retrieval pipeline and the LLM-as-judge eval loop that gates every release on grounding and faithfulness metrics.
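One common way to merge a BM25 ranking with a dense-vector ranking is reciprocal rank fusion (RRF). The sketch below assumes RRF purely for illustration; the fusion method actually used in Helix is not stated here, and `reciprocal_rank_fusion` and the document IDs are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def reciprocal_rank_fusion(rankings: Sequence[Sequence[str]], k: int = 60) -> List[str]:
    """Fuse several ranked lists of document IDs into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k=60 is a conventional default for RRF.
    Ties are broken by document ID so the output is deterministic.
    """
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


if __name__ == "__main__":
    bm25_hits = ["a", "b", "c"]    # hypothetical lexical (BM25) results
    dense_hits = ["b", "d", "a"]   # hypothetical embedding-search results
    print(reciprocal_rank_fusion([bm25_hits, dense_hits]))
```

RRF uses only rank positions, so BM25 scores and cosine similarities never need to be normalized onto a shared scale before fusing.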
🟣 Forge — Full-Stack SaaS Framework
An opinionated, type-safe SaaS starter — auth, billing, multi-tenancy, and observability wired end-to-end with zero-config deploys.
| Aspect | Detail |
|---|---|
| Stack | Next.js · tRPC · Prisma · Stripe · PostgreSQL · AWS |
| Scale | Battle-tested across 5 shipped products |
| Performance | Lighthouse score of 98 · edge-cached · <1s cold start |
| Security | RBAC · CSRF/XSS hardening · audited dependency chain |
| Impact | Cut new-product time-to-launch from weeks to days |
| Repository | github.com/thecelestialmismatch/forge |
Authored the multi-tenant data layer and CI/CD templates; the framework is now the default starting point for greenfield products on the team.
Senior Software Engineer · Acme Technologies
Jan 2023 — Present
Lead engineer on the AI platform team, owning architecture for inference, retrieval, and developer tooling consumed across the organization.
- Architected and shipped a distributed inference platform serving 12k+ RPS at sub-100ms p99.
- Drove a 41% reduction in compute cost through batching, quantization, and autoscaling policy.
- Mentored 4 engineers; established the team's testing, review, and on-call standards.
Software Engineer · Nimbus Labs
Jun 2021 — Dec 2022
Built core backend services and data pipelines for a high-growth analytics product.
- Delivered a real-time streaming pipeline processing 400M+ events/day with exactly-once semantics.
- Reduced API latency 60% by redesigning the query layer and adding multi-tier caching.
- Shipped the customer-facing dashboard from prototype to GA.
| Recognition | Details |
|---|---|
| 🏆 Hackathon Winner | 1st place out of 200+ teams in a national AI engineering challenge |
| ⭐ Open Source | 5k+ cumulative stars across published repositories |
| 📈 Performance Award | Top-tier impact rating, two consecutive cycles |
| 🎤 Speaker | Conference talks on RAG architecture & inference scaling |
| 📝 Author | Technical articles with 100k+ aggregate reads |
learning:
- distributed systems internals (consensus, replication)
- advanced retrieval & agentic LLM architectures
building:
- an open-source RAG evaluation toolkit
- low-latency inference primitives for edge deployment
exploring:
- Rust for high-performance services
- vector database internals & ANN indexing
open_to:
- Senior / Staff Software Engineer roles
- AI / ML Engineering positions
- high-impact open-source collaboration