GitHub - thecelestialmismatch/Gaurav-Rai

⟡ About

I am a Senior Software Engineer who builds reliable, high-performance systems where distributed backends, applied AI/ML, and polished product surfaces meet. My work spans the full stack — from low-latency services and data pipelines to type-safe frontends — with a relentless focus on correctness, observability, and developer experience.

🧠 AI / ML Engineering — production LLM systems, RAG, retrieval, evaluation, model serving, and inference optimization.
🏗️ Full-Stack Development — designing APIs, data models, and UIs that hold up under real-world scale.
⚙️ Product Engineering Mindset — I ship outcomes, not tickets; I measure impact, not output.
🔐 Engineering Rigor — testing, security, and operational excellence are non-negotiable defaults.

Open To → Senior / Staff SWE roles · AI/ML Engineering · Platform & Backend · Technical co-founder · High-impact open source.

⟡ Tech Stack

Languages

Frontend

Backend & Databases

Cloud, DevOps & Tooling

⟡ AI / ML Expertise

Domain	Details
Large Language Models	Prompt engineering, fine-tuning, function calling, agentic orchestration
Retrieval-Augmented Generation	Vector search, hybrid retrieval, re-ranking, chunking & eval pipelines
MLOps & Model Serving	Inference optimization, quantization, GPU scheduling, latency tuning
Deep Learning	Transformers, CNNs, sequence modeling, transfer learning
Data Engineering	Streaming ETL, feature stores, batch + real-time pipelines
Evaluation & Safety	LLM-as-judge, regression suites, guardrails, red-teaming

⟡ Featured Projects

🟣 Atlas — Distributed AI Inference Platform

A horizontally-scalable inference platform that serves LLM and embedding workloads with sub-100ms p99 latency, dynamic batching, and tenant-aware autoscaling.

Aspect	Detail
Stack	Go · Python · gRPC · Kubernetes · Redis · PostgreSQL
Scale	12k+ requests/sec sustained · 40+ GPU nodes
Performance	p99 latency under 95ms · 3.2× throughput via dynamic batching
Security	mTLS service mesh · per-tenant isolation · signed audit log
Impact	Cut inference cost by 41% · powers 9 internal product teams
Repository	github.com/thecelestialmismatch/atlas

Engineered the request scheduler and KV-cache reuse layer; introduced speculative batching that lifted GPU utilization from 54% to 88% without breaching SLOs.
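To make the dynamic batching idea above concrete, here is a minimal sketch of a size-or-deadline batching policy. It is an illustration under stated assumptions, not Atlas's actual scheduler: `Request`, `form_batches`, `max_batch_size`, and `max_wait_ms` are hypothetical names, and a real inference scheduler would run asynchronously and likely account for token counts and tenant priorities.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Request:
    """Hypothetical inference request; only the arrival time matters here."""
    request_id: str
    arrival_ms: float


def form_batches(
    requests: List[Request], max_batch_size: int, max_wait_ms: float
) -> List[List[Request]]:
    """Group requests into batches using a size-or-deadline policy.

    A batch is closed when it reaches max_batch_size, or when the next
    request arrives more than max_wait_ms after the batch's first request.
    """
    if max_batch_size < 1:
        raise ValueError("max_batch_size must be at least 1")

    batches: List[List[Request]] = []
    current: List[Request] = []
    for req in sorted(requests, key=lambda r: r.arrival_ms):
        # Deadline rule: the oldest queued request has waited long enough.
        if current and req.arrival_ms - current[0].arrival_ms > max_wait_ms:
            batches.append(current)
            current = []
        current.append(req)
        # Size rule: the batch is full, so dispatch it immediately.
        if len(current) == max_batch_size:
            batches.append(current)
            current = []
    if current:
        batches.append(current)
    return batches


if __name__ == "__main__":
    arrivals = [0, 1, 2, 3, 20, 21]
    reqs = [Request(f"r{i}", t) for i, t in enumerate(arrivals)]
    for batch in form_batches(reqs, max_batch_size=3, max_wait_ms=5):
        print([r.request_id for r in batch])
```

The two knobs trade latency for throughput: a larger `max_batch_size` tends to improve accelerator utilization, while a smaller `max_wait_ms` bounds how long any single request can sit in the queue.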

🟣 Helix — RAG Knowledge Engine

A production retrieval-augmented generation engine with hybrid search, contextual re-ranking, and a continuous evaluation harness for grounded, citation-backed answers.

Aspect	Detail
Stack	TypeScript · Python (FastAPI) · pgvector · OpenAI · Next.js
Scale	8M+ indexed documents · 1.5k concurrent sessions
Performance	220ms median retrieval · 94% answer-grounding rate
Security	Row-level ACLs · PII redaction · encrypted vector store
Impact	38% deflection of support tickets · NPS +22
Repository	github.com/thecelestialmismatch/helix

Designed the hybrid BM25 + dense retrieval pipeline and the LLM-as-judge eval loop that gates every release on grounding and faithfulness metrics.
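One common way to combine a BM25 ranking with a dense-vector ranking is reciprocal rank fusion (RRF). The sketch below shows that technique as an assumption about how such a hybrid pipeline could merge results; the text above does not say which fusion method Helix uses. `reciprocal_rank_fusion` and the document IDs are hypothetical, and `k = 60` is simply a widely used default.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def reciprocal_rank_fusion(
    rankings: Sequence[Sequence[str]], k: int = 60
) -> List[str]:
    """Fuse several ranked lists of document IDs into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    where rank starts at 1. Ties are broken by document ID for determinism.
    """
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


if __name__ == "__main__":
    bm25_ranking = ["a", "b", "c"]    # hypothetical lexical (BM25) results
    dense_ranking = ["b", "d", "a"]   # hypothetical dense-vector results
    print(reciprocal_rank_fusion([bm25_ranking, dense_ranking]))
    # prints ['b', 'a', 'd', 'c']
```

RRF only needs ranks, not raw scores, so it sidesteps the problem of BM25 scores and cosine similarities living on different scales. A re-ranker can then be applied to the fused top results.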

🟣 Forge — Full-Stack SaaS Framework

An opinionated, type-safe SaaS starter — auth, billing, multi-tenancy, and observability wired end-to-end with zero-config deploys.

Aspect	Detail
Stack	Next.js · tRPC · Prisma · Stripe · PostgreSQL · AWS
Scale	Battle-tested across 5 shipped products
Performance	Lighthouse score of 98 · edge-cached · <1s cold start
Security	RBAC · CSRF/XSS hardening · audited dependency chain
Impact	Cut new-product time-to-launch from weeks to days
Repository	github.com/thecelestialmismatch/forge

Authored the multi-tenant data layer and CI/CD templates; the framework is now the default starting point for greenfield products on the team.

⟡ Experience

Senior Software Engineer · Acme Technologies · Jan 2023 – Present

Lead engineer on the AI platform team, owning architecture for inference, retrieval, and developer tooling consumed across the organization.

Architected and shipped a distributed inference platform serving 12k+ RPS at sub-100ms p99.
Drove a 41% reduction in compute cost through batching, quantization, and autoscaling policy.
Mentored 4 engineers; established the team's testing, review, and on-call standards.

Software Engineer · Nimbus Labs · Jun 2021 – Dec 2022

Built core backend services and data pipelines for a high-growth analytics product.

Delivered a real-time streaming pipeline processing 400M+ events/day with exactly-once semantics.
Reduced API latency 60% by redesigning the query layer and adding multi-tier caching.
Shipped the customer-facing dashboard from prototype to GA.
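The multi-tier caching mentioned in the list above can be pictured as a small in-process cache in front of a larger shared cache, with the slow query layer as the last resort. The sketch below is a simplified illustration under assumptions, not the production design: `TwoTierCache` and its parameters are hypothetical, the shared tier is a plain dict standing in for something like a networked cache, and expiry and invalidation are left out.

```python
from collections import OrderedDict
from typing import Any, Callable, Dict, Hashable


class TwoTierCache:
    """Read-through cache: L1 (small, in-process LRU) -> L2 (shared) -> loader."""

    def __init__(self, loader: Callable[[Hashable], Any], l1_capacity: int = 128):
        if l1_capacity < 1:
            raise ValueError("l1_capacity must be at least 1")
        self._loader = loader                       # slow source of truth
        self._l1: "OrderedDict[Hashable, Any]" = OrderedDict()
        self._l2: Dict[Hashable, Any] = {}          # stand-in for a shared cache
        self._l1_capacity = l1_capacity
        self.stats = {"l1_hits": 0, "l2_hits": 0, "misses": 0}

    def get(self, key: Hashable) -> Any:
        if key in self._l1:
            self._l1.move_to_end(key)               # mark as most recently used
            self.stats["l1_hits"] += 1
            return self._l1[key]
        if key in self._l2:
            self.stats["l2_hits"] += 1
            value = self._l2[key]
        else:
            self.stats["misses"] += 1
            value = self._loader(key)               # fall through to the source
            self._l2[key] = value
        self._promote(key, value)
        return value

    def _promote(self, key: Hashable, value: Any) -> None:
        """Insert into L1, evicting the least recently used entry if full."""
        self._l1[key] = value
        self._l1.move_to_end(key)
        if len(self._l1) > self._l1_capacity:
            self._l1.popitem(last=False)


if __name__ == "__main__":
    cache = TwoTierCache(loader=lambda key: f"row-for-{key}", l1_capacity=1)
    for key in ["a", "a", "b", "a"]:
        cache.get(key)
    print(cache.stats)
```

With `l1_capacity=1`, the access pattern `a, a, b, a` produces two loader calls, one L1 hit, and one L2 hit, which shows how the second tier absorbs evictions from the first.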

⟡ Achievements

Recognition	Details
🏆 Hackathon Winner	1st place / 200+ teams — national AI engineering challenge
⭐ Open Source	5k+ cumulative stars across published repositories
📈 Performance Award	Top-tier impact rating, two consecutive cycles
🎤 Speaker	Conference talks on RAG architecture & inference scaling
📝 Author	Technical articles with 100k+ aggregate reads

⟡ Certifications

AWS

Oracle

NPTEL

Cisco

⟡ Current Focus

learning:
  - distributed systems internals (consensus, replication)
  - advanced retrieval & agentic LLM architectures
building:
  - an open-source RAG evaluation toolkit
  - low-latency inference primitives for edge deployment
exploring:
  - Rust for high-performance services
  - vector database internals & ANN indexing
open_to:
  - Senior / Staff Software Engineer roles
  - AI / ML Engineering positions
  - high-impact open-source collaboration

⟡ Connect

"Engineering is the discipline of turning constraints into leverage."
