llmleaf

llmleaf is an LLM proxy. It sits in front of multiple LLM providers, each with a slightly different API, and exposes them through a single API surface.

Goals:

fast
efficient
light-weight
extensible

Features

One stable endpoint in front of every provider — consumers speak OpenAI, OpenRouter, or Anthropic dialects; llmleaf maps them to one internal model and back.
Streaming-first (SSE); a non-streaming response is just a collected stream.
Modalities: chat, embeddings, text-to-speech, speech-to-text, realtime (WebSocket), batch jobs.
Per-model fallback chains with node-local, health-aware switchover — no consensus or shared state, so N nodes run behind a plain load balancer.
Opt-in per request/provider: Anthropic prompt caching, a unified thinking/reasoning-effort ladder.
Auth via HTTP-Basic key tokens (optional OAuth2/JWT); identity, limits, and usage ride an outbound control plane (pull verdicts, push usage). Fully operable from the config file alone.
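
The "collected stream" idea from the list above can be illustrated from the client side. This is a minimal sketch, not llmleaf's internal code: it assumes the stream uses OpenAI-style chat chunks (`choices[].delta.content`, terminated by `data: [DONE]`), and the function name `collect_stream` is hypothetical.

```python
import json
from typing import Iterable


def collect_stream(sse_lines: Iterable[str]) -> dict:
    """Fold OpenAI-style chat SSE lines into one non-streaming message."""
    parts = []
    finish_reason = None
    for line in sse_lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators, comments, and other SSE fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # OpenAI-style end-of-stream sentinel
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            delta = choice.get("delta", {})
            if delta.get("content"):
                parts.append(delta["content"])
            if choice.get("finish_reason"):
                finish_reason = choice["finish_reason"]
    return {
        "role": "assistant",
        "content": "".join(parts),
        "finish_reason": finish_reason,
    }


# Hand-written sample chunks (illustrative, not captured from llmleaf).
example = [
    'data: {"choices":[{"delta":{"role":"assistant","content":"Hel"}}]}',
    "",
    'data: {"choices":[{"delta":{"content":"lo"},"finish_reason":"stop"}]}',
    "",
    "data: [DONE]",
]
result = collect_stream(example)
```

Treating the non-streaming response as a fold over the stream means both modes share one code path; only the final assembly step differs.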

Supported providers

Native dialects: Anthropic, Google Gemini, Vertex AI, Cohere, Ollama, LM Studio.
OpenAI-wire family: OpenAI, OpenRouter, Requesty, Groq, DeepSeek, xAI (Grok), Mistral, Together, Fireworks, Perplexity, Cerebras, Z.AI (GLM), Moonshot (Kimi), Azure OpenAI.
Built-in: an echo provider for local testing.

Quick start

# Run with the embedded dev config (echo provider, key `local-dev:s3cret`)
cargo run -p llmleaf

# …or point at your own config
cargo run -p llmleaf -- llmleaf.toml

Copy llmleaf.example.toml, fill in provider credentials (use env:VAR indirection — secrets never live in the file), and pass it as the argument. Container image: docker buildx bake image (listens on :8080). Send a request:

curl localhost:8080/v1/chat/completions \
  -H "Authorization: Bearer $(printf 'local-dev:s3cret' | base64)" \
  -d '{"model":"demo","messages":[{"role":"user","content":"hi"}]}'
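
The curl request above can be sketched in Python with only the standard library. The helper name `build_chat_request` is hypothetical, and the explicit `Content-Type: application/json` header is an addition that is not in the curl command (curl's `-d` sends a form-encoded content type by default); whether llmleaf requires it is an assumption.

```python
import base64
import json
import urllib.request


def build_chat_request(base_url: str, key: str, model: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) a chat completion request for a llmleaf instance."""
    # Same token shape as the curl example: base64 of the "id:secret" key.
    token = base64.b64encode(key.encode("utf-8")).decode("ascii")
    body = json.dumps(
        {"model": model, "messages": [{"role": "user", "content": prompt}]}
    ).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=body,
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",  # assumption, see lead-in
        },
        method="POST",
    )


req = build_chat_request("http://localhost:8080", "local-dev:s3cret", "demo", "hi")

# To actually send it (requires a running llmleaf instance):
# with urllib.request.urlopen(req) as resp:
#     print(resp.read().decode("utf-8"))
```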

See llmleaf.example.toml for the full configuration surface (providers, routes, keys, control plane).

API surface

Consumer endpoints (OpenAI-compatible unless noted):

Endpoint	Purpose
`POST /v1/chat/completions`	Chat (SSE streaming)
`POST /v1/messages`	Anthropic Messages dialect
`POST /v1/embeddings`	Embeddings
`POST /v1/audio/speech`, `GET /v1/audio/voices`	Text-to-speech
`POST /v1/audio/transcriptions`	Speech-to-text
`GET /v1/realtime`	OpenAI Realtime (WebSocket)
`POST /v1/batches`, `GET /v1/batches/{id}[/results]`	Batch jobs
`GET /v1/models`, `GET /v1/openapi.json`, `GET /healthz`	Discovery & health

Read-only admin (optional token): GET /admin/routes, /admin/health, /admin/keys. Official client SDKs for 6 languages live in clients/.

Architecture

Two strictly separated planes. The core (data plane) is the proxy; the control plane is reached only outbound — the core pulls identity/verdicts and pushes usage, never the reverse. See SOUL.md for the full design constitution.

flowchart LR
  Cons["Consumers<br/>OpenAI · OpenRouter · Anthropic"] --> Surf["Compat surfaces"]
  subgraph Core["llmleaf core — data plane"]
    direction LR
    Surf --> Auth["authenticate"] --> In["map in"] --> Route["route + fallback"] --> Stream["stream"] --> Out["map out"] --> Ev["emit events"]
  end
  Route --> Prov["Providers<br/>compiled-in traits · WASM plugins"]
  Prov --> Up["LLM providers"]
  Ctrl[["Control plane (outbound)"]]
  Auth -. "pull identity / verdicts" .-> Ctrl
  Ev -. "push usage" .-> Ctrl
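
The "route + fallback" step in the diagram can be pictured with a small sketch. It is an illustration of node-local, health-aware switchover in general, not llmleaf's actual routing code: the class `FallbackChain`, the cooldown-based health rule, and the provider names are all assumptions made for the example.

```python
import time
from typing import Callable, Dict, List


class FallbackChain:
    """Try providers in order, skipping ones that failed recently.

    All health state lives in this object (one per node); nothing is shared
    between nodes, so no coordination is needed.
    """

    def __init__(self, providers: List[str], cooldown_s: float = 30.0,
                 clock: Callable[[], float] = time.monotonic):
        self.providers = providers
        self.cooldown_s = cooldown_s  # hypothetical health rule: fixed cooldown
        self.clock = clock
        self.unhealthy_until: Dict[str, float] = {}

    def healthy(self, name: str) -> bool:
        return self.clock() >= self.unhealthy_until.get(name, float("-inf"))

    def call(self, send: Callable[[str], str]) -> str:
        last_error = None
        for name in self.providers:
            if not self.healthy(name):
                continue  # still cooling down after an earlier failure
            try:
                return send(name)
            except Exception as exc:
                # Mark the provider unhealthy locally and move down the chain.
                self.unhealthy_until[name] = self.clock() + self.cooldown_s
                last_error = exc
        raise RuntimeError("all providers in the chain are unavailable") from last_error


# Demo with a fake clock and a primary provider that always fails.
now = [0.0]
calls = []


def send(name: str) -> str:
    calls.append(name)
    if name == "primary":
        raise ConnectionError("primary is down")
    return f"ok from {name}"


chain = FallbackChain(["primary", "secondary"], cooldown_s=30.0, clock=lambda: now[0])
first = chain.call(send)   # tries primary, fails, falls back to secondary
second = chain.call(send)  # primary is cooling down, goes straight to secondary
```

Because each node decides from its own observations, two nodes may briefly disagree about a provider's health; that is the trade-off for needing no consensus or shared state.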

License

llmleaf is free software licensed under the GNU Lesser General Public License, version 3 or later (LGPL-3.0-or-later). The full text is in COPYING.LESSER (the LGPLv3 terms) together with COPYING (the GPLv3 it builds on).

The clients are licensed under the MIT and Apache-2.0 licenses.
