Xerox is a UI-first local website cloner.
It runs locally, opens a browser UI, supports page and crawl modes, shows live logs, keeps local history, and exports discovered links.
macOS/Linux:
git clone https://github.com/legitsdev/Xerox.git
cd Xerox
./install.sh
./xerox
Windows (PowerShell):
git clone https://github.com/legitsdev/Xerox.git
cd Xerox
.\install.ps1
.\xerox.ps1
Windows (cmd):
git clone https://github.com/legitsdev/Xerox.git
cd Xerox
install.bat
xerox.bat
SSH:
git clone git@github.com:legitsdev/Xerox.git
Modes:
- page: clone one page and the assets needed to reproduce it locally
- crawl: start from one URL, follow internal navigation, and rewrite local links between saved pages
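To make the crawl description concrete, here is a minimal sketch of how internal links could be detected and rewritten to relative local paths. It is an illustration only, not Xerox's actual code: the function names, the same-host rule, and the `index.html` layout are all assumptions.

```python
from posixpath import relpath
from urllib.parse import urljoin, urlsplit


def is_internal(start_url: str, link: str) -> bool:
    """Assumed rule: a link is internal if it resolves to the start URL's host."""
    target = urlsplit(urljoin(start_url, link))
    same_host = target.netloc == urlsplit(start_url).netloc
    return target.scheme in ("http", "https") and same_host


def local_path(url: str) -> str:
    """Map a URL to a hypothetical file path inside the saved site folder."""
    path = urlsplit(url).path
    if path == "" or path.endswith("/"):
        path += "index.html"
    elif "." not in path.rsplit("/", 1)[-1]:
        # No file extension: store the page as <path>/index.html
        path += "/index.html"
    return path.lstrip("/")


def rewrite_link(page_url: str, link: str) -> str:
    """Return a relative local link for internal targets, else the link unchanged."""
    if not is_internal(page_url, link):
        return link
    target = local_path(urljoin(page_url, link))
    page_dir = local_path(page_url).rpartition("/")[0] or "."
    return relpath(target, page_dir)
```

With these assumptions, a link to `/about` found on `https://example.com/docs/intro` would become `../../about/index.html`, while links to other hosts are left untouched.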
The installer:
- picks a working Python 3.10+ interpreter
- creates a repo-local .venv
- installs dependencies
- installs Playwright Chromium
- runs a smoke check
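The interpreter selection step could look roughly like the sketch below. The real installers are shell, PowerShell, and batch scripts, so this Python version is only an illustration of the idea; the candidate list and helper names are hypothetical, and only the `XEROX_PYTHON` override and the 3.10 minimum come from this README.

```python
import os
import shutil
import subprocess

MIN_VERSION = (3, 10)
# Hypothetical search order; the real installer may probe different names.
CANDIDATES = ("python3.13", "python3.12", "python3.11", "python3.10", "python3", "python")


def parse_version(text: str) -> tuple[int, int]:
    """Parse output such as 'Python 3.12.4' into (3, 12)."""
    major, minor = text.strip().split()[-1].split(".")[:2]
    return int(major), int(minor)


def interpreter_version(executable: str):
    """Ask an interpreter for its version; return None if it cannot be run."""
    try:
        result = subprocess.run(
            [executable, "--version"], capture_output=True, text=True, timeout=10
        )
        return parse_version(result.stdout or result.stderr)
    except (OSError, subprocess.TimeoutExpired, ValueError, IndexError):
        return None


def pick_python():
    """Return the first usable interpreter path, honoring XEROX_PYTHON if set."""
    override = os.environ.get("XEROX_PYTHON")
    names = [override] if override else list(CANDIDATES)
    for name in names:
        path = shutil.which(name)
        if path is None:
            continue
        version = interpreter_version(path)
        if version is not None and version >= MIN_VERSION:
            return path
    return None
```

A `None` result corresponds to the case where Python 3.10+ has to be installed before rerunning the installer.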
The launcher:
- starts the local web UI
- uses 127.0.0.1:4173 or the next free port
- runs from the repo-local .venv without manual activation
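The "4173 or the next free port" behavior can be sketched by trying to bind successive ports until one succeeds. This is an assumed approach, not taken from the Xerox source; `find_free_port` and the `attempts` limit are hypothetical.

```python
import socket


def find_free_port(host: str = "127.0.0.1", start: int = 4173, attempts: int = 50) -> int:
    """Return the first port at or above `start` that can be bound on `host`."""
    for port in range(start, start + attempts):
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            try:
                sock.bind((host, port))
            except OSError:
                # Port is already in use (or not permitted); try the next one.
                continue
            return port
    raise RuntimeError(f"no free port in {start}..{start + attempts - 1}")
```

A probe like this is inherently racy, since another process can take the port between the check and the real server start, so a launcher would normally still handle a bind failure when it starts the UI.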
Main flow:
./install.sh
./xerox
Secondary entrypoint:
python -m xerox --no-open
Editable install:
pip install -e .
Typical locations:
- macOS: ~/Library/Application Support/xerox
- Linux: ~/.local/share/xerox
- Windows: %APPDATA%\xerox
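For scripts that need to find this directory, the mapping above can be expressed as a small helper. This is a sketch based only on the typical locations listed here; `data_dir` is a hypothetical name, the `AppData/Roaming` fallback is an assumption, and Xerox itself may resolve the path differently.

```python
import os
import sys
from pathlib import Path


def data_dir(platform: str = sys.platform, env=None, home=None) -> Path:
    """Return the typical Xerox data directory for the given platform."""
    env = os.environ if env is None else env
    home = Path.home() if home is None else Path(home)
    if platform == "darwin":
        return home / "Library" / "Application Support" / "xerox"
    if platform.startswith("win"):
        appdata = env.get("APPDATA")
        # Assumed fallback when APPDATA is not set.
        base = Path(appdata) if appdata else home / "AppData" / "Roaming"
        return base / "xerox"
    # Linux and other Unix-like systems
    return home / ".local" / "share" / "xerox"
```

Calling `data_dir()` with no arguments uses the current platform, environment, and home directory.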
Each job stores:
- cloned site files
- site_report.txt
- result.json
- job.log
- found_links.txt
If the installer cannot find a suitable Python, install Python 3.10+ and rerun the installer.
If you want a specific interpreter on macOS/Linux:
XEROX_PYTHON=/path/to/python3.12 ./install.sh
If the Playwright Chromium install failed, rerun the installer.
Manual recovery:
./.venv/bin/python -m playwright install chromium
Windows:
.\.venv\Scripts\python.exe -m playwright install chromium
If Playwright reports missing OS libraries, install the packages it requests and rerun ./install.sh.
- Some sites use anti-bot or challenge pages. Cloning them is not guaranteed to work.
- crawl is conservative and does not submit forms or authenticate.
MIT
