epatel/python_pinch

pinch

Retrieve a single file from inside a ZIP archive — over the network — without downloading the whole thing.

A Python port of flutter_pinch. A ZIP archive keeps its index (the Central Directory) at the end of the file, and each Central Directory entry records the byte offset of that file's Local File Header, which is followed directly by the file's data. pinch exploits this with HTTP Range requests:

  1. Fetch only the tail of the archive to find the End of Central Directory.
  2. Fetch only the Central Directory to list every entry.
  3. For one chosen entry, fetch only its Local File Header + compressed bytes and inflate them locally.
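The three steps above can be sketched with the standard library alone. This is an illustrative sketch, not pinch's actual implementation: `read_range` is a hypothetical stand-in that slices an in-memory archive instead of issuing an HTTP Range request, and the function names are assumptions made for this example. It assumes ASCII/UTF-8 entry names and no ZIP64 or encryption.

```python
import io
import struct
import zipfile
import zlib

def build_demo_zip() -> bytes:
    """Build a small archive in memory to stand in for the remote file."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("a.txt", "hello " * 100)
        zf.writestr("b.txt", "pinch")
    return buf.getvalue()

ARCHIVE = build_demo_zip()

def read_range(start: int, end: int) -> bytes:
    """Hypothetical stand-in for a GET with 'Range: bytes=start-end' (inclusive)."""
    return ARCHIVE[start:end + 1]

def find_central_directory(total_size: int):
    """Step 1: read the tail and parse the End of Central Directory record."""
    tail_len = min(total_size, 22 + 0xFFFF)  # fixed 22-byte record + longest comment
    tail = read_range(total_size - tail_len, total_size - 1)
    pos = tail.rfind(b"PK\x05\x06")  # EOCD signature, searched from the end
    if pos < 0:
        raise ValueError("End of Central Directory not found")
    fields = struct.unpack_from("<4s4H2IH", tail, pos)
    _, _, _, _, count, cd_size, cd_offset, _ = fields
    return count, cd_size, cd_offset

def list_entries(count: int, cd_size: int, cd_offset: int) -> dict:
    """Step 2: read the Central Directory; map name -> (method, size, offset)."""
    cd = read_range(cd_offset, cd_offset + cd_size - 1)
    entries, pos = {}, 0
    for _ in range(count):
        f = struct.unpack_from("<4s6H3I5H2I", cd, pos)  # 46-byte fixed part
        method, csize = f[4], f[8]
        name_len, extra_len, comment_len, offset = f[10], f[11], f[12], f[16]
        name = cd[pos + 46:pos + 46 + name_len].decode("utf-8")
        entries[name] = (method, csize, offset)
        pos += 46 + name_len + extra_len + comment_len
    return entries

def fetch_entry(method: int, csize: int, offset: int) -> bytes:
    """Step 3: read the Local File Header, then the data, and inflate it."""
    header = read_range(offset, offset + 29)  # 30-byte fixed part
    name_len, extra_len = struct.unpack_from("<2H", header, 26)
    data_start = offset + 30 + name_len + extra_len
    data = read_range(data_start, data_start + csize - 1)
    if method == 0:  # stored: bytes are already the file content
        return data
    if method == 8:  # deflate: raw stream, so use negative wbits
        return zlib.decompress(data, -15)
    raise ValueError(f"unsupported compression method {method}")

entries = list_entries(*find_central_directory(len(ARCHIVE)))
print(fetch_entry(*entries["b.txt"]))
```

The local header is read separately because its name and extra-field lengths can differ from the ones recorded in the Central Directory, so the data offset cannot be computed from the directory entry alone.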

Extracting one small file from a multi-megabyte archive typically transfers only a few kilobytes.

  • Zero runtime dependencies — standard library only (urllib, zlib, struct).
  • Graceful fallback when a server ignores Range (downloads the full archive, still correct).
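One way such a fallback can work is to inspect the response status: 206 means the server honored the Range header, while 200 means it sent the whole file and the caller has to slice out the requested bytes. The helper names below (`slice_body`, `fetch_range`) are hypothetical and are not taken from pinch's source; this is a minimal sketch of the idea using `urllib`.

```python
import urllib.request

def slice_body(status: int, body: bytes, start: int, end: int) -> bytes:
    """Return bytes start..end (inclusive) regardless of Range support."""
    if status == 206:
        # Partial Content: the body is already the requested slice.
        return body
    if status == 200:
        # Range was ignored: the body is the full archive, so slice locally.
        return body[start:end + 1]
    raise ValueError(f"unexpected HTTP status {status}")

def fetch_range(url: str, start: int, end: int) -> bytes:
    """Request an inclusive byte range and tolerate servers that ignore it."""
    request = urllib.request.Request(
        url, headers={"Range": f"bytes={start}-{end}"}
    )
    with urllib.request.urlopen(request) as response:
        return slice_body(response.status, response.read(), start, end)
```

Either way the caller receives the same bytes; only the amount transferred differs.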

Install

pip install -e .

Library usage

from pinch import Pinch, find_by_path

pinch = Pinch()
entries = pinch.fetch_directory("https://example.com/archive.zip")
for entry in entries:
    print(entry.filepath, entry.uncompressed_size)

entry = find_by_path(entries, "path/inside/archive.txt")
result = pinch.fetch_file(entry)
print(result.data)                 # bytes
print(pinch.bytes_transferred)     # how little we downloaded

Example CLI

# List the files in a remote ZIP (a PyPI wheel is a ZIP):
python -m pinch_example https://example.com/archive.zip

# Extract one file to stdout:
python -m pinch_example https://example.com/archive.zip path/inside/archive.txt

# Extract a binary file to disk:
python -m pinch_example https://example.com/archive.zip image.png -o image.png

Test

pip install -e ".[test]"
pytest

The suite builds ZIPs in memory with the stdlib zipfile module and serves them from a local Range-capable HTTP server, round-tripping both stored and deflated entries (and verifying the Range-ignored fallback path).
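A test fixture along these lines could look like the sketch below. It is not the project's actual test code: `build_zip`, `RangeHandler`, and `start_server` are hypothetical names, and the handler only understands the simple `bytes=start-end` form without validating unsatisfiable ranges.

```python
import io
import re
import threading
import urllib.request
import zipfile
from http.server import BaseHTTPRequestHandler, HTTPServer

def build_zip() -> bytes:
    """Build an archive in memory with one stored and one deflated entry."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("stored.txt", "plain", compress_type=zipfile.ZIP_STORED)
        zf.writestr("deflated.txt", "abc" * 200,
                    compress_type=zipfile.ZIP_DEFLATED)
    return buf.getvalue()

PAYLOAD = build_zip()

class RangeHandler(BaseHTTPRequestHandler):
    """Serve PAYLOAD for any path, honoring 'Range: bytes=start-end'."""

    def do_GET(self):
        match = re.fullmatch(r"bytes=(\d+)-(\d+)",
                             self.headers.get("Range", ""))
        if match:
            start = int(match[1])
            end = min(int(match[2]), len(PAYLOAD) - 1)
            body = PAYLOAD[start:end + 1]
            self.send_response(206)
            self.send_header("Content-Range",
                             f"bytes {start}-{end}/{len(PAYLOAD)}")
        else:
            # No (or unrecognized) Range header: send the whole archive.
            body = PAYLOAD
            self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep test output quiet

def start_server() -> HTTPServer:
    """Start the server on a free local port in a background thread."""
    server = HTTPServer(("127.0.0.1", 0), RangeHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

A test would call `start_server()`, build a URL from `server.server_address`, run the client against it, and call `server.shutdown()` afterwards. Dropping the Range branch from the handler gives a server for the fallback path.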

Scope

pinch targets the common case, like the original flutter_pinch. ZIP64 archives (larger than 4 GB or with more than 65,535 entries) and encrypted entries are not yet supported.
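Both unsupported cases can be recognized from fields the ZIP format already defines: a ZIP64 archive stores sentinel values (0xFFFF or 0xFFFFFFFF) in the End of Central Directory record, and an encrypted entry sets bit 0 of its general-purpose flags. The function names below are hypothetical and are not part of pinch's API; this is a sketch of how a caller might detect these cases up front.

```python
def needs_zip64(total_entries: int, cd_size: int, cd_offset: int) -> bool:
    """True when EOCD fields hold the ZIP64 sentinel values.

    In a ZIP64 archive the real values live in a separate ZIP64 record,
    and the classic 16/32-bit fields are filled with all-ones.
    """
    return (
        total_entries == 0xFFFF
        or cd_size == 0xFFFFFFFF
        or cd_offset == 0xFFFFFFFF
    )

def is_encrypted(general_purpose_flags: int) -> bool:
    """True when bit 0 of an entry's general-purpose flags is set."""
    return bool(general_purpose_flags & 0x0001)

# Example values: an ordinary archive and a UTF-8-named, unencrypted entry.
print(needs_zip64(3, 150, 1024))
print(is_encrypted(0x0800))
```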

Documentation for AI agents

See CLAUDE.md (slim overview), docs/CLAUDE.full.md (full algorithm + module map), and project-plan.md.
