I build production web scraping platforms that deliver structured data at scale
Python | Playwright | FastAPI
Production news platform indexing 100,000+ articles per week from 1,000+ news sources (including BBC, Reuters, The Guardian, and TechCrunch), updated hourly.
4-tier RSS fallback → Playwright scraper (paywalls) → hourly APScheduler → full-text search + REST API
FastAPI · PostgreSQL · Playwright · Next.js · APScheduler
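
The fallback chain in the news pipeline can be sketched roughly as below. This is a minimal illustration under stated assumptions, not the platform's actual code: the helper `fetch_with_fallback`, the tier names, and the stub fetchers are all hypothetical, and real tiers would make network calls instead of returning canned values.

```python
from typing import Callable, Optional

# A fetcher takes a source URL and returns article content, or None if it found nothing.
Fetcher = Callable[[str], Optional[str]]


def fetch_with_fallback(url: str, tiers: list[tuple[str, Fetcher]]) -> tuple[str, str]:
    """Try each tier in order; return (tier_name, content) from the first that succeeds."""
    errors = []
    for name, fetcher in tiers:
        try:
            content = fetcher(url)
        except Exception as exc:  # one failing tier must not stop the chain
            errors.append(f"{name}: {exc}")
            continue
        if content:
            return name, content
        errors.append(f"{name}: empty result")
    raise RuntimeError(f"all tiers failed for {url}: {errors}")


# Hypothetical stand-ins for real tiers (names are illustrative only).
def primary_feed(url: str) -> Optional[str]:
    raise TimeoutError("feed timed out")


def alternate_feed(url: str) -> Optional[str]:
    return None  # feed reachable but empty


def browser_scrape(url: str) -> Optional[str]:
    return "<article>full text</article>"


tiers = [
    ("primary_feed", primary_feed),
    ("alternate_feed", alternate_feed),
    ("browser_scrape", browser_scrape),
]
tier, content = fetch_with_fallback("https://example.com/news", tiers)
print(tier, len(content))
```

Ordering the tiers from cheapest to most expensive keeps the browser-based scraper as a last resort, which is the usual reason for putting feeds first.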
OSINT dataset marketplace: 150+ premium datasets across 75+ platforms (Reddit, YouTube, Facebook, Telegram, etc.), enriched with sentiment and topic analysis and updated daily.
Pre-processed: sentiment scores, topic tags, engagement signals. Drop-in ready for Python/Tableau/LLMs.
FastAPI · PostgreSQL · Next.js · Paddle · AWS S3
Production Goodreads scraper → structured CSV/JSON datasets (quotes, authors, tags).
BeautifulSoup · Pagination · Data validation · Multi-page
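
The validation and export step of a scraper like this might look like the sketch below. It assumes the HTML has already been parsed into plain dicts, one list per page; the `Quote` dataclass, the helper names, and the sample records are hypothetical placeholders, not the project's real schema or data.

```python
import csv
import io
import json
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class Quote:
    text: str
    author: str
    tags: list[str]


def validate(raw: dict) -> Optional[Quote]:
    """Return a cleaned Quote, or None when a required field is missing."""
    text = (raw.get("text") or "").strip()
    author = (raw.get("author") or "").strip()
    if not text or not author:
        return None
    # Normalise tags: trim, lowercase, de-duplicate, sort for stable output.
    tags = sorted({t.strip().lower() for t in raw.get("tags", []) if t.strip()})
    return Quote(text=text, author=author, tags=tags)


def clean(pages: list[list[dict]]) -> list[Quote]:
    """Validate every record on every page and drop duplicates across pages."""
    seen, out = set(), []
    for page in pages:
        for raw in page:
            quote = validate(raw)
            if quote is None or (quote.text, quote.author) in seen:
                continue
            seen.add((quote.text, quote.author))
            out.append(quote)
    return out


def to_csv(quotes: list[Quote]) -> str:
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=["text", "author", "tags"])
    writer.writeheader()
    for q in quotes:
        writer.writerow({"text": q.text, "author": q.author, "tags": "|".join(q.tags)})
    return buf.getvalue()


def to_json(quotes: list[Quote]) -> str:
    return json.dumps([asdict(q) for q in quotes], ensure_ascii=False, indent=2)


# Placeholder records standing in for two scraped pages.
pages = [
    [{"text": " Example quote ", "author": "Example Author", "tags": ["Life", "life ", "humor"]},
     {"text": "", "author": "Nobody", "tags": []}],
    [{"text": "Example quote", "author": "Example Author", "tags": ["life"]}],
]
quotes = clean(pages)
print(to_csv(quotes))
print(to_json(quotes))
```

Joining tags with `|` in the CSV is one arbitrary choice for a multi-valued column; the JSON export keeps them as a list.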
FinPull (Upcoming)
Financial data pipeline pulling OHLCV, earnings, P/E ratios, and analyst ratings into structured datasets.
Playwright + yfinance → FastAPI → PostgreSQL → REST API + Next.js dashboard
Playwright · yfinance · FastAPI · PostgreSQL · Next.js
Target: Traders · Analysts · Portfolio dashboards
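
Since FinPull is still upcoming, the following is only a sketch of how a raw OHLCV row could be normalised and a P/E ratio derived before storage. The `Bar` dataclass, `parse_bar`, `pe_ratio`, and the sample values are assumptions made for illustration; nothing here calls yfinance or reflects the final schema.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Bar:
    symbol: str
    date: str
    open: float
    high: float
    low: float
    close: float
    volume: int


def parse_bar(symbol: str, raw: dict) -> Bar:
    """Coerce a raw row (strings or numbers) into a typed, sanity-checked Bar."""
    bar = Bar(
        symbol=symbol.upper(),
        date=str(raw["date"]),
        open=float(raw["open"]),
        high=float(raw["high"]),
        low=float(raw["low"]),
        close=float(raw["close"]),
        volume=int(raw["volume"]),
    )
    # Open and close must both sit inside the [low, high] range.
    if not (bar.low <= min(bar.open, bar.close) and max(bar.open, bar.close) <= bar.high):
        raise ValueError(f"inconsistent OHLC values for {bar.symbol} on {bar.date}")
    if bar.volume < 0:
        raise ValueError("volume cannot be negative")
    return bar


def pe_ratio(price: float, eps: float) -> Optional[float]:
    """Price divided by earnings per share; None when EPS is zero or negative."""
    if eps <= 0:
        return None
    return round(price / eps, 2)


# Made-up sample row, not real market data.
bar = parse_bar("demo", {"date": "2025-01-02", "open": "10.0", "high": "12.5",
                         "low": "9.5", "close": "12.0", "volume": "1000"})
print(bar.symbol, pe_ratio(bar.close, eps=1.5))
```

Returning `None` for non-positive EPS is a design choice in this sketch; a dashboard could render it as "N/A" instead of showing a negative ratio.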
| Layer | Tools |
|---|---|
| Scraping | Playwright · BeautifulSoup · asyncio · Proxy rotation |
| Data | Pandas · NumPy · Parquet/JSONL exports |
| Backend | FastAPI · PostgreSQL · APScheduler · JWT |
| Frontend | Next.js 15 · Tailwind · TypeScript |
| Infra | Railway · Vercel · Supabase · Docker |
freeCodeCamp Certified: Responsive Web Design (Mar 2024) · Scientific Computing with Python (Nov 2025)
Hire Me → Fiverr
Custom scrapers · ETL pipelines · Data platforms · REST APIs