Heshan Sanjuka hexsyro

Hi, I'm Heshan 👨‍💻

I build production web scraping platforms that deliver structured data at scale.

Python | Playwright | FastAPI

Featured Projects

Pulse Aggregator

Production news platform indexing 100,000+ articles per week from 1,000+ news sources (BBC, Reuters, Guardian, TechCrunch), updated hourly.

4-tier RSS fallback → Playwright scraper (paywalls) → hourly APScheduler → full-text search + REST API

FastAPI PostgreSQL Playwright Next.js APScheduler
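
The tiered fallback described for Pulse Aggregator can be sketched roughly as below. This is a minimal illustration, not the project's actual code: the README does not list the individual tiers, so `rss_feed`, `rss_mirror`, and `browser_scrape` are hypothetical stand-ins, and the real pipeline would presumably call an RSS parser and Playwright where these stubs return canned values.

```python
from typing import Callable, Optional, Sequence

# A fetcher takes a URL and returns article text, or None if it found nothing.
Fetcher = Callable[[str], Optional[str]]


def fetch_with_fallback(url: str, fetchers: Sequence[Fetcher]) -> Optional[str]:
    """Try each fetcher in order and return the first non-empty result."""
    for fetch in fetchers:
        try:
            result = fetch(url)
        except Exception:
            # A failing tier should not abort the chain; move on to the next one.
            continue
        if result:
            return result
    return None  # every tier failed or returned nothing


# Hypothetical tiers, ordered from cheapest to most expensive.
def rss_feed(url: str) -> Optional[str]:
    raise ConnectionError("feed unavailable")


def rss_mirror(url: str) -> Optional[str]:
    return None  # feed reachable, but no entry for this article


def browser_scrape(url: str) -> Optional[str]:
    return f"<article>content from {url}</article>"


article = fetch_with_fallback(
    "https://example.com/story",
    [rss_feed, rss_mirror, browser_scrape],
)
```

Ordering the tiers from cheapest to most expensive means the headless browser only runs when the lighter feed-based tiers come back empty.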

Social Intel

OSINT dataset marketplace: 150+ premium datasets across 75+ platforms (Reddit, YouTube, Facebook, Telegram, and others), enriched with sentiment and topic analysis and updated daily.

Pre-processed: sentiment scores, topic tags, engagement signals. Drop-in ready for Python/Tableau/LLMs.

FastAPI PostgreSQL Next.js Paddle AWS S3

GoodQuote Scraper

Production Goodreads scraper → structured CSV/JSON datasets (quotes, authors, tags).

BeautifulSoup Pagination Data validation Multi-page
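
As an illustration of the validation and export steps listed for GoodQuote Scraper, here is a small sketch that assumes the scraper has already produced raw dictionaries with `text`, `author`, and `tags` keys. The `Quote` class, the `validate` rules, and the `|` tag separator are assumptions made for this example, not the project's actual schema.

```python
import csv
import io
import json
from dataclasses import asdict, dataclass, field
from typing import List, Optional


@dataclass
class Quote:
    text: str
    author: str
    tags: List[str] = field(default_factory=list)


def validate(raw: dict) -> Optional[Quote]:
    """Return a cleaned Quote, or None if a required field is missing."""
    text = (raw.get("text") or "").strip()
    author = (raw.get("author") or "").strip()
    if not text or not author:
        return None
    # Normalise tags: trim, lowercase, drop blanks and duplicates.
    tags = sorted({t.strip().lower() for t in (raw.get("tags") or []) if t.strip()})
    return Quote(text, author, tags)


def to_json(quotes: List[Quote]) -> str:
    return json.dumps([asdict(q) for q in quotes], ensure_ascii=False)


def to_csv(quotes: List[Quote]) -> str:
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["text", "author", "tags"])
    for q in quotes:
        writer.writerow([q.text, q.author, "|".join(q.tags)])
    return buf.getvalue()


# Placeholder rows standing in for scraped pages.
rows = [
    {"text": " Sample quote. ", "author": "Sample Author", "tags": ["Life", "life ", ""]},
    {"text": "", "author": "Nobody"},  # rejected: empty text
]
quotes = [q for q in map(validate, rows) if q is not None]
```

Validating before export keeps malformed rows out of both output formats, so the CSV and JSON files stay consistent with each other.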

FinPull (Upcoming)

Financial data pipeline pulling OHLCV, earnings, P/E ratios, and analyst ratings into structured datasets.

Playwright + yfinance → FastAPI → PostgreSQL → REST API + Next.js dashboard

Playwright yfinance FastAPI PostgreSQL Next.js
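
Since FinPull is still upcoming, the following is only a guess at what the "structured datasets" step might involve: parsing one raw OHLCV row into a typed record and rejecting rows that are internally inconsistent. The `Bar` class, the `parse_bar` function, the input keys, and the sample values are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Bar:
    symbol: str
    day: date
    open: float
    high: float
    low: float
    close: float
    volume: int


def parse_bar(symbol: str, raw: dict) -> Bar:
    """Convert a raw row of strings into a Bar, raising ValueError if it is inconsistent."""
    bar = Bar(
        symbol=symbol.upper(),
        day=date.fromisoformat(raw["date"]),
        open=float(raw["open"]),
        high=float(raw["high"]),
        low=float(raw["low"]),
        close=float(raw["close"]),
        volume=int(raw["volume"]),
    )
    # Open and close must both lie within the day's low-high range.
    if not (bar.low <= min(bar.open, bar.close) and max(bar.open, bar.close) <= bar.high):
        raise ValueError(f"inconsistent OHLC values for {bar.symbol} on {bar.day}")
    if bar.volume < 0:
        raise ValueError(f"negative volume for {bar.symbol} on {bar.day}")
    return bar


# Made-up sample row, not real market data.
bar = parse_bar(
    "abc",
    {"date": "2024-01-02", "open": "10", "high": "12", "low": "9.5", "close": "11", "volume": "1000"},
)
```

Rejecting inconsistent rows at parse time keeps bad data from reaching the database and the dashboard.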

Target: Traders · Analysts · Portfolio dashboards

Production Tech Stack

Layer	Tools
Scraping	Playwright · BeautifulSoup · Asyncio · Proxy rotation
Data	Pandas · NumPy · Parquet/JSONL exports
Backend	FastAPI · PostgreSQL · APScheduler · JWT
Frontend	Next.js 15 · Tailwind · TypeScript
Infra	Railway · Vercel · Supabase · Docker

freeCodeCamp Certified: Responsive Web Design (Mar 2024) · Scientific Computing with Python (Nov 2025)

Hire Me → Fiverr

Custom scrapers · ETL pipelines · Data platforms · REST APIs
