thecelestialmismatch/Gaurav-Rai

⟡ About

I am a Senior Software Engineer who builds reliable, high-performance systems where distributed backends, applied AI/ML, and polished product surfaces meet. My work spans the full stack — from low-latency services and data pipelines to type-safe frontends — with a relentless focus on correctness, observability, and developer experience.

  • 🧠 AI / ML Engineering — production LLM systems, RAG, retrieval, evaluation, model serving, and inference optimization.
  • 🏗️ Full-Stack Development — designing APIs, data models, and UIs that hold up under real-world scale.
  • ⚙️ Product Engineering Mindset — I ship outcomes, not tickets; I measure impact, not output.
  • 🔐 Engineering Rigor — testing, security, and operational excellence are non-negotiable defaults.

Open To → Senior / Staff SWE roles · AI/ML Engineering · Platform & Backend · Technical co-founder · High-impact open source.


⟡ Tech Stack

Languages

TypeScript · Python · Go · Rust · Java · SQL

Frontend

React · Next.js · Tailwind · Redux · Framer Motion

Backend & Databases

Node.js · FastAPI · GraphQL · PostgreSQL · Redis · MongoDB

Cloud, DevOps & Tooling

AWS · Docker · Kubernetes · Terraform · GitHub Actions


⟡ AI / ML Expertise

| Domain | Details |
| --- | --- |
| Large Language Models | Prompt engineering, fine-tuning, function calling, agentic orchestration |
| Retrieval-Augmented Generation | Vector search, hybrid retrieval, re-ranking, chunking & eval pipelines |
| MLOps & Model Serving | Inference optimization, quantization, GPU scheduling, latency tuning |
| Deep Learning | Transformers, CNNs, sequence modeling, transfer learning |
| Data Engineering | Streaming ETL, feature stores, batch + real-time pipelines |
| Evaluation & Safety | LLM-as-judge, regression suites, guardrails, red-teaming |

⟡ Featured Projects

🟣 Atlas — Distributed AI Inference Platform

A horizontally scalable inference platform that serves LLM and embedding workloads with sub-100ms p99 latency, dynamic batching, and tenant-aware autoscaling.

| Aspect | Detail |
| --- | --- |
| Stack | Go · Python · gRPC · Kubernetes · Redis · PostgreSQL |
| Scale | 12k+ requests/sec sustained · 40+ GPU nodes |
| Performance | p99 latency under 95ms · 3.2× throughput via dynamic batching |
| Security | mTLS service mesh · per-tenant isolation · signed audit log |
| Impact | Cut inference cost by 41% · powers 9 internal product teams |
| Repository | github.com/thecelestialmismatch/atlas |

Engineered the request scheduler and KV-cache reuse layer; introduced speculative batching that lifted GPU utilization from 54% to 88% without breaching SLOs.
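As a rough illustration of the dynamic batching idea mentioned above, the sketch below waits for one request, then keeps collecting until the batch is full or a short time budget runs out. It is not Atlas's scheduler: the function name `collect_batch` and the parameters `max_batch_size` and `max_wait_s` are hypothetical, and an in-process `queue.Queue` stands in for whatever request transport the real platform uses.

```python
import queue
import time
from typing import Any, List


def collect_batch(q: "queue.Queue[Any]", max_batch_size: int, max_wait_s: float) -> List[Any]:
    """Hypothetical helper: gather up to max_batch_size requests.

    Blocks until the first request arrives, then waits at most
    max_wait_s (measured from that first arrival) for more.
    """
    batch = [q.get()]  # block until at least one request exists
    deadline = time.monotonic() + max_wait_s
    while len(batch) < max_batch_size:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            break  # time budget spent: ship a partial batch
        try:
            batch.append(q.get(timeout=remaining))
        except queue.Empty:
            break  # nothing else arrived within the budget
    return batch
```

The trade-off sits in `max_wait_s`: a larger value tends to produce fuller batches and higher throughput, at the cost of added latency for the first request in each batch.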

🟣 Helix — RAG Knowledge Engine

A production retrieval-augmented generation engine with hybrid search, contextual re-ranking, and a continuous evaluation harness for grounded, citation-backed answers.

| Aspect | Detail |
| --- | --- |
| Stack | TypeScript · FastAPI · pgvector · OpenAI · Next.js |
| Scale | 8M+ indexed documents · 1.5k concurrent sessions |
| Performance | 220ms median retrieval · 94% answer-grounding rate |
| Security | Row-level ACLs · PII redaction · encrypted vector store |
| Impact | 38% deflection of support tickets · NPS +22 |
| Repository | github.com/thecelestialmismatch/helix |

Designed the hybrid BM25 + dense retrieval pipeline and the LLM-as-judge eval loop that gates every release on grounding and faithfulness metrics.
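One common way to combine a BM25 ranking with a dense-vector ranking is reciprocal rank fusion (RRF). The sketch below assumes that approach purely for illustration; the README does not say which fusion method Helix uses. The function name and the document IDs are hypothetical, and `k = 60` is just the conventional default.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def reciprocal_rank_fusion(rankings: Sequence[Sequence[str]], k: int = 60) -> List[str]:
    """Fuse several best-first rankings of document IDs into one list.

    Each document scores sum(1 / (k + rank)) over the rankings it appears in,
    so documents ranked well by more than one retriever rise to the top.
    """
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by descending score; break ties by ID for deterministic output.
    return sorted(scores, key=lambda d: (-scores[d], d))


bm25_hits = ["a", "b", "c"]    # hypothetical lexical results, best first
dense_hits = ["b", "c", "d"]   # hypothetical embedding results, best first
fused = reciprocal_rank_fusion([bm25_hits, dense_hits])
print(fused)  # ['b', 'c', 'a', 'd']
```

RRF uses only rank positions, so it avoids having to normalize BM25 scores against cosine similarities, which live on different scales.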

🟣 Forge — Full-Stack SaaS Framework

An opinionated, type-safe SaaS starter — auth, billing, multi-tenancy, and observability wired end-to-end with zero-config deploys.

| Aspect | Detail |
| --- | --- |
| Stack | Next.js · tRPC · Prisma · Stripe · PostgreSQL · AWS |
| Scale | Battle-tested across 5 shipped products |
| Performance | 98 Lighthouse · edge-cached · <1s cold start |
| Security | RBAC · CSRF/XSS hardening · audited dependency chain |
| Impact | Cut new-product time-to-launch from weeks to days |
| Repository | github.com/thecelestialmismatch/forge |

Authored the multi-tenant data layer and CI/CD templates; the framework is now the default starting point for greenfield products on the team.


⟡ Experience

Senior Software Engineer · Acme Technologies · Jan 2023 – Present

Lead engineer on the AI platform team, owning architecture for inference, retrieval, and developer tooling consumed across the organization.

  • Architected and shipped a distributed inference platform serving 12k+ RPS at sub-100ms p99.
  • Drove a 41% reduction in compute cost through batching, quantization, and autoscaling policy.
  • Mentored 4 engineers; established the team's testing, review, and on-call standards.

Go · Python · Kubernetes · AWS


Software Engineer · Nimbus Labs · Jun 2021 – Dec 2022

Built core backend services and data pipelines for a high-growth analytics product.

  • Delivered a real-time streaming pipeline processing 400M+ events/day with exactly-once semantics.
  • Reduced API latency 60% by redesigning the query layer and adding multi-tier caching.
  • Shipped the customer-facing dashboard from prototype to GA.

TypeScript · Node.js · PostgreSQL · Redis
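The multi-tier caching mentioned in this role can be pictured as a read-through lookup: check a fast in-process tier, then a shared tier, and only then run the expensive query. The sketch below is an assumption-laden illustration, not the Nimbus Labs code. `TwoTierCache` and `load` are hypothetical names, a plain dict stands in for a shared cache such as Redis, and eviction and expiry are left out.

```python
from typing import Callable, Dict


class TwoTierCache:
    """Hypothetical read-through cache with a local and a shared tier."""

    def __init__(self, shared: Dict[str, str], load: Callable[[str], str]) -> None:
        self.local: Dict[str, str] = {}  # tier 1: per-process memory
        self.shared = shared             # tier 2: stand-in for a shared cache
        self.load = load                 # slow path, e.g. a database query

    def get(self, key: str) -> str:
        if key in self.local:
            return self.local[key]
        value = self.shared.get(key)
        if value is None:
            value = self.load(key)       # miss in both tiers: do the slow work
            self.shared[key] = value
        self.local[key] = value          # promote into the fast tier
        return value
```

A production version would also need size limits, TTLs, and an invalidation path so the tiers do not serve stale results.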


⟡ Achievements

| Recognition | Details |
| --- | --- |
| 🏆 Hackathon Winner | 1st place out of 200+ teams, national AI engineering challenge |
| Open Source | 5k+ cumulative stars across published repositories |
| 📈 Performance Award | Top-tier impact rating, two consecutive cycles |
| 🎤 Speaker | Conference talks on RAG architecture & inference scaling |
| 📝 Author | Technical articles with 100k+ aggregate reads |

⟡ Certifications

AWS

AWS SAA · AWS ML

Oracle

OCI

NPTEL

NPTEL DSA · NPTEL ML

Cisco

Cisco


⟡ Current Focus

learning:
  - distributed systems internals (consensus, replication)
  - advanced retrieval & agentic LLM architectures
building:
  - an open-source RAG evaluation toolkit
  - low-latency inference primitives for edge deployment
exploring:
  - Rust for high-performance services
  - vector database internals & ANN indexing
open_to:
  - Senior / Staff Software Engineer roles
  - AI / ML Engineering positions
  - high-impact open-source collaboration

"Engineering is the discipline of turning constraints into leverage."
