diff --git a/IA.md b/IA.md
new file mode 100644
index 0000000..4be54ce
--- /dev/null
+++ b/IA.md
@@ -0,0 +1,192 @@
+# Docs V2 — Information Architecture & migration checklist
+
+Living checklist for the docs overhaul. Tracked in Linear under
+[CIP-3307](https://linear.app/cipherstash/issue/CIP-3307); the full IA rationale
+(design principles, audience doors, correctness strategy) lives in
+`CipherStash docs IA v1.md` in the content repo. **Tick items here as they land
+on the `v2` branch.** Legend: `[ ]` todo · `[x]` done · 🚧 stub exists · ⛔ blocked
+on a product decision (see CIP-3307 checklist).
+
+## How this branch works
+
+- New IA lives in `content/docs`, served from the site root (`/docs/
/…`).
+- The legacy tree (`content/stack`) is served alongside it at `/docs/stack/…`
+  until every section migrates, then deleted (CIP-3335).
+- The full legacy→v2 redirect map is `v2-redirects.mjs`, gated behind
+  `ENABLE_V2_REDIRECTS=1` (flipped on at merge). `bun run validate-redirects`
+  enforces that every legacy page has a mapping.
+- Frontmatter facets (`type`, `components`, `audience`, `integration`,
+  `verifiedAgainst`, `reviewBy`) are defined in `source.config.ts` (`v2docs`).
+- **Moving a page** = move the file into `content/docs`, update its facets,
+  fix inbound links, confirm its `v2-redirects.mjs` entry, tick it here.
+
+## URL conventions
+
+Lowercase, hyphens, no trailing slashes, no version numbers in paths.
+Integrations are **flat** (no category segment). Error pages (future, miette)
+live at `/docs/errors/` — permanent, never restructured (CIP-3338).
+
+---
+
+## Get started — CIP-3327
+
+- [x] Section scaffold 🚧
+- [ ] `/get-started/what-is-cipherstash` — mental model, components map, audience router
+- [ ] `/get-started/quickstart` — rewritten on EQL v3 (fixes `cs_match_v1`, broken scaffold imports)
+- [ ] `/get-started/choose-your-stack` — static matrix v1 (platform × ORM × auth)
+- [ ] `/get-started/examples` — runnable example apps index
+- [ ] `/docs` landing page 🚧 — now `content/docs/index.mdx` rendered inside the docs
+  nav (the old standalone `(home)` route is deleted; recoverable from git history).
+  CIP-3327 refines the content (what-is + audience router)
+
+## Integrations — CIP-3328 (Supabase), CIP-3330 (auth), CIP-3336 (rest)
+
+- [x] Section scaffold 🚧 (index + supabase stub with facet exemplar)
+- [ ] `/integrations` index — category grid w/ setup badges
+- [ ] `/integrations/supabase` — flagship tutorial (CIP-3328)
+- [ ] `/integrations/supabase/database`
+- [ ] `/integrations/supabase/auth`
+- [ ] `/integrations/supabase/dashboard-experience` — Table Editor, expose eql schema
+- [ ] ⛔ `/integrations/supabase/edge-functions` — pending Deno/FFI answer
+- [ ] ⛔ `/integrations/supabase/realtime` — pending product verification
+- [ ] `/integrations/drizzle` — merge the two divergent Drizzle pages
+- [ ] `/integrations/prisma-next`
+- [ ] `/integrations/aws/rds-aurora` — Proxy path
+- [ ] `/integrations/aws/dynamodb`
+- [ ] `/integrations/clerk`
+- [ ] `/integrations/auth0` — end-to-end example (Clerk parity)
+- [ ] `/integrations/okta` — end-to-end example (Clerk parity)
+- [ ] `/integrations/nextjs`
+- [ ] `/integrations/typescript` — thin router to Stack SDK reference
+- [ ] `/integrations/serverless` — Vercel/Lambda, bundling, CS_CONFIG_PATH
+- [ ] `/integrations/docker`
+- [ ] ⛔ `/integrations/edge-workers` — pending Deno/workerd answer
+
+## Concepts — CIP-3333 (searchable-encryption), others per section tickets
+
+- [x] Section scaffold 🚧
+- [ ] `/concepts/privacy-first-design`
+- [ ] `/concepts/application-level-encryption` — vs TDE/pgcrypto/RLS
+- [ ] `/concepts/searchable-encryption` — REWRITE with honest leakage model (canonical leakage page)
+- [ ] `/concepts/eql` — the typed-column model (declare capability in the schema)
+- [ ] `/concepts/key-management` — per-value keys, rotation, crypto-shredding
+- [ ] `/concepts/identity-aware-encryption` — lock contexts, CTS (CIP-3330)
+- [ ] `/concepts/threat-modelling`
+
+## Comparisons — CIP-3333
+
+- [x] Section scaffold 🚧
+- [ ] `/compare/aws-kms` (port)
+- [ ] `/compare/fhe` (port)
+- [ ] `/compare/rls-and-tde` (new — expand the Supabase-listing RLS contrast)
+- [ ] `/compare/hashicorp-vault` (in flight on `docs/vault-comparison` branch — land there or here, then port)
+
+## Guides
+
+- [x] Section scaffold 🚧 (development, migration, deployment, troubleshooting)
+- [ ] `/guides/development/local-setup` — profiles, device auth, workspaces, keys
+- [ ] `/guides/development/schema-design` — which encrypted type/variant per column (CIP-3327)
+- [ ] `/guides/development/testing-and-ci` (port deploy/testing)
+- [ ] `/guides/development/team-onboarding` (port)
+- [ ] `/guides/migration/encrypt-existing-data` — the backfill guide, runnable (CIP-3329)
+- [ ] ⛔ `/guides/migration/upgrading-from-eql-v2` — REQUIRED; mechanics pending product answer (CIP-3329)
+- [ ] `/guides/migration/adopting-incrementally` (CIP-3329)
+- [ ] `/guides/migration/key-rotation-operations`
+- [ ] `/guides/deployment/going-to-production` (port)
+- [ ] `/guides/deployment/serverless-and-bundling` (merge bundling + sst)
+- [ ] `/guides/deployment/proxy-deployment` (merge proxy Docker + aws-ecs)
+- [ ] `/guides/troubleshooting` index — symptom-based router
+- [ ] `/guides/troubleshooting/query-performance` — seq-scan diagnosis, typed-operand gotcha
+- [ ] `/guides/troubleshooting/runtime-errors`
+- [ ] `/guides/troubleshooting/cli` (port)
+- [ ] `/guides/troubleshooting/proxy` (port)
+
+## Architecture & security — CIP-3331, CIP-3332 (compliance)
+
+- [x] Section scaffold 🚧
+- [ ] `/security/architecture` — ONE reconciled ZeroKMS mechanism story (kills the 3 conflicting accounts)
+- [ ] `/security/zerokms`
+- [ ] `/security/cts` — auth layer architecture (CIP-3330)
+- [ ] `/security/stack-sdk`
+- [ ] `/security/proxy`
+- [ ] `/security/threat-scenarios`
+- [ ] ⛔ `/security/availability-and-continuity` — DR (port) + SLA + exit story; pending SLA answer
+- [ ] ⛔ `/security/audit-logging` — pending retention answer
+- [ ]
⛔ `/security/key-ownership` — BYOK/self-hosted; pending product answer
+- [ ] `/security/compliance` index — framework mapping (port, good)
+- [ ] `/security/compliance/hipaa` — BAA scope, §164.312 mapping (CIP-3332)
+- [ ] `/security/compliance/soc2` — verify Type II report exists
+- [ ] `/security/compliance/gdpr`
+
+## Solutions
+
+- [x] Section scaffold 🚧
+- [ ] `/solutions/protecting-pii` (new)
+- [ ] `/solutions/healthcare-hipaa` (new; pairs with compliance/hipaa)
+- [ ] `/solutions/ai-and-rag` (port use-cases/ai-rag)
+- [ ] `/solutions/data-residency` (port)
+- [ ] `/solutions/provable-access` (port)
+
+## Reference
+
+- [x] Section scaffold 🚧 (eql, stack, auth, cli, proxy, workspace)
+- **EQL (v3 rewrite — CIP-3326; Tailwind-shaped: install → core concepts → type
+  categories → indexes → query patterns). Anti-drift rule: shared mechanics
+  (typed operands, blockers, envelope, variant model, ORE-equality) live ONLY in
+  core-concepts — category/query pages link, never restate:**
+- [x] `/reference/eql` — install (single SQL file, permissions split, dbdev, Docker)
+- [x] `/reference/eql/core-concepts` — variant model, payload anatomy (absorbs
+  cipher-cell), typed-operand rule, fail-loud blockers, term leakage pointer
+- [x] `/reference/eql/numbers` — int*/float*/numeric
+- [x] `/reference/eql/dates-and-times` — date/timestamp (same traits as numbers,
+  distinct semantics)
+- [x] `/reference/eql/text` — all six text variants; owns the no-LIKE treatment
+- [x] `/reference/eql/json` — ste_vec + sv payload shape + containment/path queries
+- [x] `/reference/eql/booleans` — storage-only variants (bool has only that one)
+- [x] `/reference/eql/indexes` — functional indexes on extractors; Supabase-compatible
+- [x] `/reference/eql/filtering` — =, IN, ranges, token match, containment
+- [x] `/reference/eql/sorting` — ORDER BY, extractor sort-key form, pagination
+- [x] `/reference/eql/grouping-and-aggregates` — GROUP
BY/DISTINCT, min/max, no SUM/AVG
+- [x] `/reference/eql/joins` — equijoins, the same-keyset constraint
+- [ ] ⛔ `/reference/eql/query-performance` — port the EQL repo performance guide once
+  rewritten for v3 upstream (v3 branch folded it into database-indexes.md; verify
+  nothing from the v2 guide on main was lost) — see CIP-3351
+- **Stack SDK:**
+- [ ] `/reference/stack` — client + configuration (port encryption/* pages)
+- [ ] `/reference/stack/schema`
+- [ ] `/reference/stack/encrypt-decrypt` (+ bulk, models)
+- [ ] `/reference/stack/supabase` — THE canonical `encryptedSupabase` page, ONE signature (CIP-3328)
+- [ ] `/reference/stack/drizzle-operators`
+- [ ] `/reference/stack/errors` — port error-handling; miette catalog later (CIP-3338)
+- [ ] `/reference/stack/upgrading-from-protect` (retitled package-rename guide)
+- **Auth (CIP-3330):**
+- [ ] `/reference/auth/lock-contexts`
+- [ ] `/reference/auth/cts-tokens`
+- [ ] `/reference/auth/oidc-configuration`
+- [ ] `/reference/auth/access-keys` (+ clients)
+- **CLI / Proxy / Workspace:**
+- [ ] `/reference/cli/*` (port 9 pages)
+- [ ] `/reference/proxy/*` (configuration, message-flow, multitenant, errors)
+- [ ] `/reference/workspace/billing` + `/members` + `/configuration`
+- **Cross-cutting:**
+- [ ] `/reference/benchmarks` — listing numbers + methodology (CIP-3334)
+- [ ] `/reference/agent-skills` (port; expand per CIP-3339)
+- [ ] `/reference/glossary` (port)
+- [ ] Repoint `scripts/generate-docs.ts` TypeDoc output → `content/docs/reference/stack`
+
+## Infrastructure / final pass
+
+- [x] `v2` branch + this checklist
+- [x] `v2docs` collection + facet schema (`source.config.ts`)
+- [x] Root catch-all routes (`src/app/[...slug]`), llms.mdx mirror, sitemap/llms.txt include v2
+- [x] `v2-redirects.mjs` (flag-gated) + `validate-redirects` gate in prebuild
+- [x] `/quickstart` vanity redirect
+- [ ] OG images for v2 pages (route only covers legacy tree)
+- [ ] Correctness CI: snippet
type-checking, SQL-vs-EQL-Docker, terminology lint (CIP-3337)
+- [ ] llms.txt curation + Cloudflare AI crawl policy + md-degradation check (CIP-3339)
+- [ ] ⛔ EQL 3.0.0 release alignment (CIP-3352, blocks CIP-3335) — the EQL reference
+  documents the release as decided, ahead of the eql_v3 branch: payload `v: 3`,
+  OPE SEM specifier, Docker tag `:17-3.0.0`, `version()` output, schema files.
+  Each must land upstream or be walked back in the docs before merge
+- [ ] Flip `ENABLE_V2_REDIRECTS=1`, delete `content/stack` + `/stack` routes + legacy loader (CIP-3335)
+- [ ] Consistency sweep + Supabase listing v3 revision (CIP-3335)
diff --git a/content/docs/compare/index.mdx b/content/docs/compare/index.mdx
new file mode 100644
index 0000000..5c34ec1
--- /dev/null
+++ b/content/docs/compare/index.mdx
@@ -0,0 +1,9 @@
+---
+title: Comparisons
+description: "How CipherStash compares to other approaches to protecting data."
+type: concept
+---
+
+This section is being built as part of the docs V2 overhaul ([CIP-3307](https://linear.app/cipherstash/issue/CIP-3307)). Track progress in [IA.md](https://github.com/cipherstash/docs/blob/v2/IA.md).
+
+Until it lands, current documentation lives in the [existing docs](/stack).
diff --git a/content/docs/compare/meta.json b/content/docs/compare/meta.json
new file mode 100644
index 0000000..76e9696
--- /dev/null
+++ b/content/docs/compare/meta.json
@@ -0,0 +1,5 @@
+{
+  "title": "Comparisons",
+  "icon": "Scale",
+  "pages": ["..."]
+}
diff --git a/content/docs/concepts/index.mdx b/content/docs/concepts/index.mdx
new file mode 100644
index 0000000..3d36567
--- /dev/null
+++ b/content/docs/concepts/index.mdx
@@ -0,0 +1,9 @@
+---
+title: Concepts
+description: "How CipherStash works and how to think about searchable encryption, keys, and identity."
+type: concept
+---
+
+This section is being built as part of the docs V2 overhaul ([CIP-3307](https://linear.app/cipherstash/issue/CIP-3307)).
Track progress in [IA.md](https://github.com/cipherstash/docs/blob/v2/IA.md).
+
+Until it lands, current documentation lives in the [existing docs](/stack).
diff --git a/content/docs/concepts/meta.json b/content/docs/concepts/meta.json
new file mode 100644
index 0000000..521f756
--- /dev/null
+++ b/content/docs/concepts/meta.json
@@ -0,0 +1,5 @@
+{
+  "title": "Concepts",
+  "icon": "Lightbulb",
+  "pages": ["..."]
+}
diff --git a/content/docs/concepts/searchable-encryption.mdx b/content/docs/concepts/searchable-encryption.mdx
new file mode 100644
index 0000000..689d1fd
--- /dev/null
+++ b/content/docs/concepts/searchable-encryption.mdx
@@ -0,0 +1,9 @@
+---
+title: Searchable encryption
+description: "How querying encrypted data works, and exactly what each index term reveals."
+type: concept
+---
+
+This page is being rewritten as part of the docs V2 overhaul ([CIP-3333](https://linear.app/cipherstash/issue/CIP-3333)). Track progress in [IA.md](https://github.com/cipherstash/docs/blob/v2/IA.md).
+
+Until it lands, the current version lives in the [existing docs](/stack/cipherstash/encryption/searchable-encryption).
diff --git a/content/docs/get-started/index.mdx b/content/docs/get-started/index.mdx
new file mode 100644
index 0000000..a92e606
--- /dev/null
+++ b/content/docs/get-started/index.mdx
@@ -0,0 +1,9 @@
+---
+title: Get started
+description: "What CipherStash is, a 10-minute quickstart, and how to choose your integration path."
+type: tutorial
+---
+
+This section is being built as part of the docs V2 overhaul ([CIP-3307](https://linear.app/cipherstash/issue/CIP-3307)). Track progress in [IA.md](https://github.com/cipherstash/docs/blob/v2/IA.md).
+
+Until it lands, current documentation lives in the [existing docs](/stack).
diff --git a/content/docs/get-started/meta.json b/content/docs/get-started/meta.json
new file mode 100644
index 0000000..3f92dab
--- /dev/null
+++ b/content/docs/get-started/meta.json
@@ -0,0 +1,5 @@
+{
+  "title": "Get started",
+  "icon": "Rocket",
+  "pages": ["..."]
+}
diff --git a/content/docs/guides/deployment/index.mdx b/content/docs/guides/deployment/index.mdx
new file mode 100644
index 0000000..f269e42
--- /dev/null
+++ b/content/docs/guides/deployment/index.mdx
@@ -0,0 +1,8 @@
+---
+title: Deployment
+description: "Deployment documentation — being built as part of the docs V2 overhaul."
+---
+
+This section is being built as part of the docs V2 overhaul ([CIP-3307](https://linear.app/cipherstash/issue/CIP-3307)). Track progress in [IA.md](https://github.com/cipherstash/docs/blob/v2/IA.md).
+
+Until it lands, current documentation lives in the [existing docs](/stack).
diff --git a/content/docs/guides/deployment/meta.json b/content/docs/guides/deployment/meta.json
new file mode 100644
index 0000000..e2ffe90
--- /dev/null
+++ b/content/docs/guides/deployment/meta.json
@@ -0,0 +1,4 @@
+{
+  "title": "Deployment",
+  "pages": ["..."]
+}
diff --git a/content/docs/guides/development/index.mdx b/content/docs/guides/development/index.mdx
new file mode 100644
index 0000000..3ac286f
--- /dev/null
+++ b/content/docs/guides/development/index.mdx
@@ -0,0 +1,8 @@
+---
+title: Development
+description: "Development documentation — being built as part of the docs V2 overhaul."
+---
+
+This section is being built as part of the docs V2 overhaul ([CIP-3307](https://linear.app/cipherstash/issue/CIP-3307)). Track progress in [IA.md](https://github.com/cipherstash/docs/blob/v2/IA.md).
+
+Until it lands, current documentation lives in the [existing docs](/stack).
diff --git a/content/docs/guides/development/meta.json b/content/docs/guides/development/meta.json
new file mode 100644
index 0000000..203f9c9
--- /dev/null
+++ b/content/docs/guides/development/meta.json
@@ -0,0 +1,4 @@
+{
+  "title": "Development",
+  "pages": ["..."]
+}
diff --git a/content/docs/guides/index.mdx b/content/docs/guides/index.mdx
new file mode 100644
index 0000000..8d6647e
--- /dev/null
+++ b/content/docs/guides/index.mdx
@@ -0,0 +1,9 @@
+---
+title: Guides
+description: "Task-oriented guides: development workflow, data migration, deployment, and troubleshooting."
+type: guide
+---
+
+This section is being built as part of the docs V2 overhaul ([CIP-3307](https://linear.app/cipherstash/issue/CIP-3307)). Track progress in [IA.md](https://github.com/cipherstash/docs/blob/v2/IA.md).
+
+Until it lands, current documentation lives in the [existing docs](/stack).
diff --git a/content/docs/guides/meta.json b/content/docs/guides/meta.json
new file mode 100644
index 0000000..498dd61
--- /dev/null
+++ b/content/docs/guides/meta.json
@@ -0,0 +1,5 @@
+{
+  "title": "Guides",
+  "icon": "Wrench",
+  "pages": ["..."]
+}
diff --git a/content/docs/guides/migration/index.mdx b/content/docs/guides/migration/index.mdx
new file mode 100644
index 0000000..cd728e5
--- /dev/null
+++ b/content/docs/guides/migration/index.mdx
@@ -0,0 +1,8 @@
+---
+title: Data migration
+description: "Data migration documentation — being built as part of the docs V2 overhaul."
+---
+
+This section is being built as part of the docs V2 overhaul ([CIP-3307](https://linear.app/cipherstash/issue/CIP-3307)). Track progress in [IA.md](https://github.com/cipherstash/docs/blob/v2/IA.md).
+
+Until it lands, current documentation lives in the [existing docs](/stack).
diff --git a/content/docs/guides/migration/meta.json b/content/docs/guides/migration/meta.json
new file mode 100644
index 0000000..941c504
--- /dev/null
+++ b/content/docs/guides/migration/meta.json
@@ -0,0 +1,4 @@
+{
+  "title": "Data migration",
+  "pages": ["..."]
+}
diff --git a/content/docs/guides/troubleshooting/index.mdx b/content/docs/guides/troubleshooting/index.mdx
new file mode 100644
index 0000000..d049ef4
--- /dev/null
+++ b/content/docs/guides/troubleshooting/index.mdx
@@ -0,0 +1,8 @@
+---
+title: Troubleshooting
+description: "Troubleshooting documentation — being built as part of the docs V2 overhaul."
+---
+
+This section is being built as part of the docs V2 overhaul ([CIP-3307](https://linear.app/cipherstash/issue/CIP-3307)). Track progress in [IA.md](https://github.com/cipherstash/docs/blob/v2/IA.md).
+
+Until it lands, current documentation lives in the [existing docs](/stack).
diff --git a/content/docs/guides/troubleshooting/meta.json b/content/docs/guides/troubleshooting/meta.json
new file mode 100644
index 0000000..82c3c83
--- /dev/null
+++ b/content/docs/guides/troubleshooting/meta.json
@@ -0,0 +1,4 @@
+{
+  "title": "Troubleshooting",
+  "pages": ["..."]
+}
diff --git a/content/docs/guides/troubleshooting/query-performance.mdx b/content/docs/guides/troubleshooting/query-performance.mdx
new file mode 100644
index 0000000..a128a95
--- /dev/null
+++ b/content/docs/guides/troubleshooting/query-performance.mdx
@@ -0,0 +1,9 @@
+---
+title: Query performance
+description: "Diagnosing and fixing slow queries on encrypted columns."
+type: guide
+---
+
+This page is being built as part of the docs V2 overhaul ([CIP-3351](https://linear.app/cipherstash/issue/CIP-3351)). Track progress in [IA.md](https://github.com/cipherstash/docs/blob/v2/IA.md).
+
+Until it lands, the EQL reference covers the essentials: [Indexes](/reference/eql/indexes) walks through `EXPLAIN` verification and large-table index builds, and [Sorting](/reference/eql/sorting) covers extractor-form sort keys.
diff --git a/content/docs/index.mdx b/content/docs/index.mdx
new file mode 100644
index 0000000..a2f2d22
--- /dev/null
+++ b/content/docs/index.mdx
@@ -0,0 +1,38 @@
+---
+title: CipherStash Docs
+seoTitle: CipherStash Docs — Searchable encryption for Postgres
+description: "Searchable field-level encryption, identity-bound keys, and cryptographic audit trails — built into your existing Postgres stack."
+type: concept
+audience: [developer, cto, ciso]
+---
+
+CipherStash encrypts your data at the field level. Every value gets its own
+key, bound to an identity — and the ciphertext stays queryable in Postgres.
+A breach, a compromised agent, a curious insider: they all see ciphertext
+with no key.
+
+## Start here
+
+
+
+
+
+
+
+
+## Browse the docs
+
+
+
+
+
+
+
+
+
+
+## AI-ready documentation
+
+Every page is available as clean markdown: append `.mdx` to any page URL, or
+fetch the whole corpus via [llms.txt](/llms.txt) and
+[llms-full.txt](/llms-full.txt).
diff --git a/content/docs/integrations/index.mdx b/content/docs/integrations/index.mdx
new file mode 100644
index 0000000..6148388
--- /dev/null
+++ b/content/docs/integrations/index.mdx
@@ -0,0 +1,9 @@
+---
+title: Integrations
+description: "Set up CipherStash with your platform, ORM, framework, auth provider, and runtime."
+type: tutorial
+---
+
+This section is being built as part of the docs V2 overhaul ([CIP-3307](https://linear.app/cipherstash/issue/CIP-3307)). Track progress in [IA.md](https://github.com/cipherstash/docs/blob/v2/IA.md).
+
+Until it lands, current documentation lives in the [existing docs](/stack).
diff --git a/content/docs/integrations/meta.json b/content/docs/integrations/meta.json
new file mode 100644
index 0000000..13995d5
--- /dev/null
+++ b/content/docs/integrations/meta.json
@@ -0,0 +1,5 @@
+{
+  "title": "Integrations",
+  "icon": "Blocks",
+  "pages": ["..."]
+}
diff --git a/content/docs/integrations/supabase/index.mdx b/content/docs/integrations/supabase/index.mdx
new file mode 100644
index 0000000..86e2ef6
--- /dev/null
+++ b/content/docs/integrations/supabase/index.mdx
@@ -0,0 +1,20 @@
+---
+title: Supabase
+description: "Searchable, application-level encryption for your Supabase project — encrypt in your app, query in Postgres."
+type: tutorial
+components: [encryption, eql, auth]
+audience: [developer]
+integration:
+  category: platform
+  setup: dashboard-required
+  pairsWith: [drizzle, prisma-next, clerk, nextjs]
+---
+
+CipherStash adds application-level encryption to your Supabase project:
+sensitive fields are encrypted in your application before they reach Postgres,
+and stay queryable with the same Supabase.js calls you already use.
+
+This page is being rebuilt as part of the docs V2 overhaul
+([CIP-3328](https://linear.app/cipherstash/issue/CIP-3328)). Until it lands,
+the current Supabase integration guide lives at
+[CipherStash + Supabase](/stack/cipherstash/supabase).
diff --git a/content/docs/integrations/supabase/meta.json b/content/docs/integrations/supabase/meta.json
new file mode 100644
index 0000000..b4690bf
--- /dev/null
+++ b/content/docs/integrations/supabase/meta.json
@@ -0,0 +1,5 @@
+{
+  "title": "Supabase",
+  "icon": "Supabase",
+  "pages": ["..."]
+}
diff --git a/content/docs/meta.json b/content/docs/meta.json
new file mode 100644
index 0000000..74f486e
--- /dev/null
+++ b/content/docs/meta.json
@@ -0,0 +1,13 @@
+{
+  "pages": [
+    "index",
+    "get-started",
+    "integrations",
+    "concepts",
+    "compare",
+    "guides",
+    "security",
+    "solutions",
+    "reference"
+  ]
+}
diff --git a/content/docs/reference/auth/index.mdx b/content/docs/reference/auth/index.mdx
new file mode 100644
index 0000000..9bea074
--- /dev/null
+++ b/content/docs/reference/auth/index.mdx
@@ -0,0 +1,8 @@
+---
+title: Auth
+description: "Auth documentation — being built as part of the docs V2 overhaul."
+---
+
+This section is being built as part of the docs V2 overhaul ([CIP-3307](https://linear.app/cipherstash/issue/CIP-3307)). Track progress in [IA.md](https://github.com/cipherstash/docs/blob/v2/IA.md).
+
+Until it lands, current documentation lives in the [existing docs](/stack).
diff --git a/content/docs/reference/auth/meta.json b/content/docs/reference/auth/meta.json
new file mode 100644
index 0000000..d801d12
--- /dev/null
+++ b/content/docs/reference/auth/meta.json
@@ -0,0 +1,4 @@
+{
+  "title": "Auth",
+  "pages": ["..."]
+}
diff --git a/content/docs/reference/cli/index.mdx b/content/docs/reference/cli/index.mdx
new file mode 100644
index 0000000..8eeffd3
--- /dev/null
+++ b/content/docs/reference/cli/index.mdx
@@ -0,0 +1,8 @@
+---
+title: CLI
+description: "CLI documentation — being built as part of the docs V2 overhaul."
+---
+
+This section is being built as part of the docs V2 overhaul ([CIP-3307](https://linear.app/cipherstash/issue/CIP-3307)). Track progress in [IA.md](https://github.com/cipherstash/docs/blob/v2/IA.md).
+
+Until it lands, current documentation lives in the [existing docs](/stack).
diff --git a/content/docs/reference/cli/meta.json b/content/docs/reference/cli/meta.json
new file mode 100644
index 0000000..0a67892
--- /dev/null
+++ b/content/docs/reference/cli/meta.json
@@ -0,0 +1,4 @@
+{
+  "title": "CLI",
+  "pages": ["..."]
+}
diff --git a/content/docs/reference/eql/booleans.mdx b/content/docs/reference/eql/booleans.mdx
new file mode 100644
index 0000000..5403fc9
--- /dev/null
+++ b/content/docs/reference/eql/booleans.mdx
@@ -0,0 +1,62 @@
+---
+title: Booleans
+description: "Encrypted booleans are storage-only by design: eql_v3.bool stores and decrypts, carries no index terms, and blocks every comparison."
+type: reference
+components: [eql]
+verifiedAgainst:
+  eql: "3.0.0"
+---
+
+Every scalar type has a storage-only variant — for `bool` it's the only one. EQL ships `eql_v3.bool` and nothing else: there is no `bool_eq` and no `bool_ord`. An encrypted boolean column can be stored, decrypted, and null-checked; it cannot be filtered, sorted, grouped, or joined on.
+
+## Why there are no query variants
+
+A two-value column has too little cardinality for any searchable index to be safe. An equality term over `true` / `false` would partition the table into two visible buckets — leaking the value distribution (and, with any outside knowledge, the values themselves) outright. Rather than ship an index term that can't keep its promise, EQL omits the query variants entirely. See [Searchable encryption](/concepts/searchable-encryption) for the general analysis of what index terms reveal.
+
+## What works, what raises
+
+`eql_v3.bool` follows the bare-variant contract described in [Core concepts](/reference/eql/core-concepts#variants-declare-capability): it carries no index terms, so `IS NULL` / `IS NOT NULL` are the only predicates that work.
Every comparison operator routes to a blocker and raises — the [fail-loud behavior](/reference/eql/core-concepts#unsupported-operations-fail-loudly) shared by all encrypted variants:
+
+```sql
+-- ❌ Raises: operator = is not supported for eql_v3.bool
+SELECT * FROM users WHERE is_active = $1::eql_v3.bool;
+
+-- ✅ Works: NULL columns are not encrypted
+SELECT * FROM users WHERE is_active IS NOT NULL;
+```
+
+## Filter client-side
+
+Query on other columns, decrypt the boolean in your application, and filter there:
+
+```sql
+CREATE TABLE users (
+  id         bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
+  email      eql_v3.text_eq,        -- exact lookup
+  created_at eql_v3.timestamp_ord,  -- range queries, ORDER BY
+  is_active  eql_v3.bool            -- storage only (by design)
+);
+```
+
+```sql
+-- Narrow the result set with the columns that do carry index terms…
+SELECT id, email, is_active FROM users
+WHERE created_at >= $1::eql_v3.timestamp_ord;
+-- …then decrypt is_active in the client and filter on the plaintext.
+```
+
+The [Stack SDK](/reference/stack) and [CipherStash Proxy](/reference/proxy) decrypt the payload back to a plain boolean on read, so the client-side filter is an ordinary `if`.
+
+If a boolean genuinely needs to be a server-side predicate, that is a data-modelling signal: consider whether the flag is actually sensitive. A non-sensitive flag can stay a plain PostgreSQL `boolean` column alongside your encrypted columns.
+
+## Storing without querying
+
+`bool` is the forced case of a pattern available to every scalar type: the bare variant `eql_v3.` (for example `eql_v3.int4`, `eql_v3.text`, `eql_v3.timestamp`) is storage-and-decryption only. It carries no index terms, and every comparison operator raises — use it for columns you only ever store and decrypt, so the database holds no searchable material for them at all.
+
+For every type other than `bool`, storage-only is a choice you can walk back.
If you later need to query, retype the column as a query variant — or, if the payloads already carry the needed term (the client decides which terms travel in the payload), cast at the call site:
+
+```sql
+SELECT * FROM readings WHERE value::eql_v3.int4_ord > $1::eql_v3.int4_ord;
+```
+
+The variant families and what each one enables are covered in [Core concepts](/reference/eql/core-concepts); the per-type specifics live in [Numbers](/reference/eql/numbers), [Dates & times](/reference/eql/dates-and-times), and [Text](/reference/eql/text).
diff --git a/content/docs/reference/eql/core-concepts.mdx b/content/docs/reference/eql/core-concepts.mdx
new file mode 100644
index 0000000..427bc7b
--- /dev/null
+++ b/content/docs/reference/eql/core-concepts.mdx
@@ -0,0 +1,152 @@
+---
+title: Core concepts
+description: "The model behind every EQL page: domain variants that declare capability, the encrypted payload envelope, the typed-operand rule, and fail-loud blockers."
+type: reference
+components: [eql]
+verifiedAgainst:
+  eql: "3.0.0"
+---
+
+Everything in the EQL reference builds on four ideas: columns are typed as **domain variants** that declare what they can do, every value is a **`jsonb` payload** carrying encrypted index terms, **operands must be typed** for the encrypted operators to resolve, and anything a column can't do **fails loudly** instead of returning wrong rows. This page is the canonical home for all four — the per-type and per-query pages link back here rather than restating them.
+
+## Variants declare capability
+
+EQL ships its searchable-encryption surface as PostgreSQL **domains in the `eql_v3` schema**, all backed by `jsonb`. Each scalar type generates a *family* of domain variants, and the variant you type a column as fixes its query capability. Each domain carries a `CHECK` constraint that validates the encrypted payload on insert, so a malformed or wrong-version value is rejected at write time rather than surfacing at query time.
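The write-time check described in the paragraph above can be pictured with a minimal sketch in plain PostgreSQL. Everything named `demo` here is hypothetical: the `demo` schema, the `demo.encrypted_payload` domain, and the `demo.notes` table are assumptions made for illustration, and this is not the real `eql_v3` domain definition, which this page does not show. The sketch only mirrors the three envelope keys (`v`, `i`, `c`) and the version rule listed under "The envelope" on this page.

```sql
-- Hypothetical illustration only: NOT the real eql_v3 domain definition.
CREATE SCHEMA IF NOT EXISTS demo;

-- A jsonb-backed domain whose CHECK validates the envelope on every write.
CREATE DOMAIN demo.encrypted_payload AS jsonb
  CHECK (
    VALUE ?& ARRAY['v', 'i', 'c']          -- all three envelope keys are present
    AND VALUE -> 'v' = '3'::jsonb          -- payload version must be 3
    AND (VALUE -> 'i') ?& ARRAY['t', 'c']  -- ident names both table and column
  );

CREATE TABLE demo.notes (
  id   bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
  body demo.encrypted_payload
);

-- Accepted: the envelope is complete and the version matches.
INSERT INTO demo.notes (body)
VALUES ('{"v": 3, "i": {"t": "notes", "c": "body"}, "c": "opaque-ciphertext"}');

-- Rejected at write time: version 2 violates the domain CHECK.
INSERT INTO demo.notes (body)
VALUES ('{"v": 2, "i": {"t": "notes", "c": "body"}, "c": "opaque-ciphertext"}');
```

Because the rule lives on the domain rather than on a table, every column typed as that domain inherits it, which is the property the paragraph relies on. A `NULL` value passes a domain `CHECK` in PostgreSQL, so nullable columns still accept `NULL` under this sketch.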
+
+There is no database-side configuration table. Earlier EQL versions tracked encryption config in the database (`config_add_table`, `config_add_column`, and friends) — those are gone in v3. The searchable surface of a column is fixed by the domain variant you type it as, and which index terms travel in a value's payload is decided by the encryption client (the [Stack SDK](/reference/stack) or [CipherStash Proxy](/reference/proxy)). The domain makes the matching operators resolve; the term in the payload is what makes them answer.
+
+For any scalar type ``, the family looks like this:
+
+| Domain variant | Capability |
+| --- | --- |
+| `eql_v3.` | Storage and decryption only. |
+| `eql_v3._eq` | Equality: `=`, `<>`, `IN`, `GROUP BY`, `DISTINCT`, equijoins. |
+| `eql_v3._ord` | Comparisons (`<` … `>=`), `BETWEEN`, `ORDER BY`, `MIN` / `MAX` — plus equality. |
+| `eql_v3._ord_ore` | As `_ord`, with the ORE mechanism pinned — see [SEM specifiers](#sem-specifiers). |
+| `eql_v3.text_match` (text only) | Free-text token containment: `@>` / `<@`. |
+| `eql_v3.text_search` (text only) | Equality + ordering + token containment. |
+
+Two things worth calling out:
+
+- **The bare variant blocks everything.** `eql_v3.` carries no index term. Querying it with any comparison operator raises an "operator not supported" exception. Use it for columns you only ever store and decrypt — [Booleans](/reference/eql/booleans) covers this pattern in full.
+- **Which index term backs each capability** is an implementation detail of the payload — covered in [Anatomy of an encrypted value](#anatomy-of-an-encrypted-value) below.
+
+### SEM specifiers
+
+A trailing mechanism suffix — the `_ore` in `_ord_ore` — is a **SEM specifier**: it pins *which* searchable-encryption mechanism implements the capability, rather than just declaring the capability itself.
+ +- `eql_v3.<type>_ord` declares *orderable* and leaves the mechanism to EQL's default — currently ORE (order-revealing encryption). +- `eql_v3.<type>_ord_ore` declares *orderable via ORE, explicitly*. Today the two are byte-identical surfaces backed by the same term. + +The distinction earns its keep as mechanisms multiply: the EQL v3 release adds an **OPE** (order-preserving encryption) specifier for every orderable type — including `text` — at which point pinning a specifier documents and freezes a column's mechanism choice, while unspecified variants track the default. Each type page lists its available specifiers under an "SEM specifiers" heading. + +Declaring a table is just typing each column as the variant it needs: + +```sql +CREATE TABLE users ( + id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY, + email eql_v3.text_eq, -- equality only + salary eql_v3.int4_ord, -- equality + range + ORDER BY + created_at eql_v3.timestamp_ord +); +``` + +Every scalar type — `int2`, `int4`, `int8`, `numeric`, `float4`, `float8`, `date`, `timestamp`, `text`, and `bool` in EQL 3.0.0 — ships some subset of this family. The per-category pages list exactly which variants each type has and how to choose between them: [Numbers](/reference/eql/numbers), [Dates & times](/reference/eql/dates-and-times), [Text](/reference/eql/text), and [Booleans](/reference/eql/booleans). Encrypted JSON documents use a separate domain, `eql_v3.json`, with its own operator surface — see [JSON](/reference/eql/json). + +## Anatomy of an encrypted value + +Every EQL encrypted value is a `jsonb` payload with a shared envelope plus the index terms that make it queryable. Earlier CipherStash docs called this format the **CipherCell** — this section is the current definition of the same structure. + +Payloads are **produced** by the encryption clients — the [Stack SDK](/reference/stack) and [CipherStash Proxy](/reference/proxy) — and **consumed** by EQL's operators and functions inside Postgres. 
EQL never sees plaintext: it validates, stores, and compares these payloads; it cannot produce or decrypt them. The division is strict: the clients never rely on the database for key material. + +### The envelope + +Every payload carries three envelope keys. Each `eql_v3` domain's `CHECK` constraint requires them, so a value missing any of these is rejected at write time: + +| Key | Contents | Notes | +| --- | --- | --- | +| `v` | The EQL version | `3` — the payload version matches the EQL major version. The domain `CHECK`s assert it and raise on any other value. | +| `i` | Ident: `{"t": "<table>", "c": "<column>"}` | Binds the ciphertext to the table and column it was encrypted for. Both keys required. | +| `c` | Ciphertext | The opaque, non-deterministic encrypted blob (mp_base85-encoded). Never used in comparisons. | + + +Payloads produced by EQL v2 clients carried `v: 2`; from 3.0.0 the payload version and the EQL version move together. + + +A `k` discriminator (`"ct"` for a scalar ciphertext, `"sv"` for a JSON document) also appears on payloads emitted by the clients, distinguishing the two top-level shapes. + +### Index-term keys + +Alongside the envelope, a payload carries the index terms for its column's capability. 
Each key is backed by a SEM (searchable encrypted metadata) type in the `eql_v3` schema: + +| Key | SEM type | Wire shape | Enables | Reveals | +| --- | --- | --- | --- | --- | +| `hm` | `eql_v3.hmac_256` (domain over `text`) | Hex string (HMAC-SHA-256) | `=`, `<>` on `_eq` and `text_search` domains | Whether two values are equal β€” nothing else | +| `ob` | `eql_v3.ore_block_256` (composite: array of `bytea` block terms) | Array of hex-encoded ORE blocks (block count varies by scalar width) | `<`, `<=`, `>`, `>=`, `ORDER BY` on `_ord` / `_ord_ore` domains β€” and `=` / `<>`, since ORE comparison collapses to equality | The relative order of two values | +| `bf` | `eql_v3.bloom_filter` (domain over `smallint[]`) | Array of set bit positions (**signed** 16-bit β€” large filters emit negative positions) | `@>` / `<@` token containment on `_match` domains | Probabilistic token overlap between values | + +The capability is encoded as **required keys**: the payload for an `eql_v3.text_eq` column must carry `hm`; an `eql_v3.int4_ord` payload must carry `ob` (and only `ob`); a `text_match` payload must carry `bf`; a `text_search` payload carries all three. A payload missing its term key fails the domain `CHECK` β€” and fails to deserialize in the client bindings. 
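
The required-keys rule can be modelled outside the database. The following is a minimal Python sketch of the kind of check a domain `CHECK` performs, not EQL's implementation: `ENVELOPE_KEYS`, `REQUIRED_TERMS`, and `validate_payload` are hypothetical names, the real constraint runs as SQL inside Postgres, and the sketch only tests key presence (it does not check term shapes or enforce the "and only `ob`" rule).

```python
# Hypothetical model of the envelope and required-term rules described above.
ENVELOPE_KEYS = ("v", "i", "c")

# Required index-term keys for a few variants (illustrative subset only).
REQUIRED_TERMS = {
    "text_eq": ("hm",),
    "int4_ord": ("ob",),
    "text_match": ("bf",),
    "text_search": ("hm", "ob", "bf"),
}


def validate_payload(variant: str, payload: dict) -> list[str]:
    """Return a list of problems; an empty list means the payload passes."""
    problems = []
    for key in ENVELOPE_KEYS:
        if key not in payload:
            problems.append(f"missing envelope key {key!r}")
    if payload.get("v") != 3:
        problems.append("payload version must be 3")
    ident = payload.get("i")
    if not isinstance(ident, dict) or not {"t", "c"} <= set(ident):
        problems.append("ident must carry both 't' and 'c'")
    for term in REQUIRED_TERMS[variant]:
        if term not in payload:
            problems.append(f"missing index term {term!r}")
    return problems
```

Returning a list of problems rather than raising on the first one is a choice made for the sketch; the database constraint simply rejects the write.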
+ +A scalar payload for an `eql_v3.text_search` column (lookup + ordering + free-text match, so all three terms are required): + +```json +{ + "v": 3, + "i": { "t": "users", "c": "email" }, + "c": "mBbKmsMM%bK#QQOx1yLDBHyD...", + "hm": "9c8ec1d2f9932b979b1bf3f09f8a4e2f6a41f8de2f0c8b7a52e1f5c3d4b6a790", + "ob": ["7a1fd0c2...", "d24c9be1...", "03fa66b8..."], + "bf": [42, 1290, -8113, 30201] +} +``` + +- `v`, `i`, `c` β€” the envelope +- `hm` β€” equality term: `WHERE email = $1` compares this +- `ob` β€” ordering term: `ORDER BY` and range comparisons walk these blocks +- `bf` β€” bloom-filter term: `@>` token containment tests these bit positions + +Encrypted JSON documents use a different payload shape β€” an `sv` array with one encrypted entry per path in the document instead of a root ciphertext β€” defined in [JSON](/reference/eql/json). + +### Machine-readable schemas + +The [EQL repository](https://github.com/cipherstash/encrypt-query-language) publishes the format as JSON Schema in two places: + +- **`crates/eql-bindings/schema/`** β€” one schema per scalar domain (`$id`s under `https://schemas.cipherstash.com/eql/v3/`), generated from the canonical Rust wire types in the `eql-bindings` crate. TypeScript bindings are generated from the same definitions, so every producer and consumer shares one source of truth. +- **`docs/reference/schema/`** β€” full-payload schemas covering both the scalar and `sv` document shapes. These files are still named for the v2.x payload releases (`eql-payload-v2.2.schema.json`, `eql-payload-v2.3.schema.json`); the v2.3 schema describes the document shape, with the payload version field moving to `3` alongside the EQL 3.0.0 release. + +## The typed-operand rule + +The `eql_v3` domains are backed by `jsonb`. When an operand has no known type β€” a bare string literal, an untyped parameter β€” PostgreSQL reduces the domain to its `jsonb` base type and resolves the **native `jsonb` operator** instead of the encrypted one. 
The query doesn't fail; it silently returns native `jsonb` semantics, which are meaningless for encrypted payloads. + +```sql +-- ❌ Wrong: untyped parameter. PostgreSQL falls back to the native jsonb `=`, +-- which compares raw payloads β€” syntactically valid, semantically meaningless. +SELECT * FROM users WHERE email = $1; + +-- βœ… Right: typed operand β€” the encrypted `=` resolves. +SELECT * FROM users WHERE email = $1::eql_v3.text_eq; +``` + +Always type the operand: a typed parameter (`$1::eql_v3.text_eq`) or an explicit cast (`'…'::eql_v3.int4_ord`). The [Stack SDK](/reference/stack) and [CipherStash Proxy](/reference/proxy) type bound parameters automatically β€” raw SQL must do it by hand. + +This is the one place where a mistake is *silent*. Everything else fails loudly: + +## Unsupported operations fail loudly + +Unsupported operators are not silent no-ops. Every operator that a variant doesn't support is still *defined* β€” it routes to a blocker function that raises an `operator … is not supported` exception. A mis-typed query fails loudly instead of silently returning wrong results: + +```sql +-- salary is eql_v3.int8_eq (equality only) +SELECT * FROM users WHERE salary > $1::eql_v3.int8_eq; +-- ERROR: operator > is not supported for eql_v3.int8_eq +``` + +A `NULL` operand still raises β€” the blockers are deliberately not `STRICT`, so PostgreSQL can't skip the check. (A SQL `NULL` column value is not encrypted, so `IS NULL` / `IS NOT NULL` themselves always work, on every variant.) + +`LIKE` and `ILIKE` are blocked on **every** encrypted variant β€” pattern matching is meaningless on ciphertext. Encrypted text matching is bloom-filter token containment instead; [Text](/reference/eql/text) covers it. + +One equality subtlety follows from the term table above: on `_ord` / `_ord_ore` columns, `=` and `<>` compare the **ORE (`ob`) term** β€” ORE comparison collapses to equality β€” so `_ord` payloads carry no `hm` term at all. 
On `_eq` and `text_search` columns, equality compares the HMAC (`hm`) term. + +## What the terms reveal + +Every index term a value carries is extra material stored in the database, and each term class reveals defined structure to an observer who can read the stored payloads: equality terms reveal *value repetition* (which rows share a value), ORE terms reveal *ordering* (which of two values is larger), and bloom terms reveal *probabilistic token overlap*. None of them reveal the plaintext — but you should only carry the terms you actually query on. The full analysis of what each term does and doesn't leak is in [Searchable encryption](/concepts/searchable-encryption). diff --git a/content/docs/reference/eql/dates-and-times.mdx b/content/docs/reference/eql/dates-and-times.mdx new file mode 100644 index 0000000..3af2b9c --- /dev/null +++ b/content/docs/reference/eql/dates-and-times.mdx @@ -0,0 +1,151 @@ +--- +title: Dates & times +description: "The complete reference for encrypted date and timestamp columns: the domain variants, the ORE-backed payload, and time-window, newest-first, and MIN/MAX queries." +type: reference +components: [eql] +verifiedAgainst: + eql: "3.0.0" +--- + +`date` and `timestamp` columns carry the same capabilities as [encrypted numbers](/reference/eql/numbers) — equality, ranges, ordering, `MIN` / `MAX` — but the queries they serve are temporal: time windows, newest-first listings, retention cutoffs, "when did this last happen". + +## Variants + +Both types generate the same `jsonb`-backed domain variants. The generic form: + +| Domain variant | Capability | +| --- | --- | +| `eql_v3.<type>` | Storage and decryption only. | +| `eql_v3.<type>_eq` | Equality: `=`, `<>`, `IN`, `GROUP BY`, `DISTINCT`, equijoins. | +| `eql_v3.<type>_ord` | Comparisons, `BETWEEN`, `ORDER BY`, `MIN` / `MAX` — plus equality. | +| `eql_v3.<type>_ord_ore` | As `_ord`, with the ORE mechanism pinned — see [SEM specifiers](#sem-specifiers). 
| + +And every concrete domain this page covers: + +| Type | Variants | +| --- | --- | +| `date` | `eql_v3.date` · `eql_v3.date_eq` · `eql_v3.date_ord` · `eql_v3.date_ord_ore` | +| `timestamp` | `eql_v3.timestamp` · `eql_v3.timestamp_eq` · `eql_v3.timestamp_ord` · `eql_v3.timestamp_ord_ore` | + +Time columns are nearly always ranged and sorted, so `_ord` is the usual choice. Declare only the capability you query on — each capability stores extra searchable material with defined leakage (see [Searchable encryption](/concepts/searchable-encryption)), and the variant model itself is covered in [Core concepts](/reference/eql/core-concepts). + +### Example + +An audit-events table where the timestamps drive time-window queries and sorting: + +```sql +CREATE TABLE audit_events ( + id bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY, + occurred_at eql_v3.timestamp_ord, -- time windows, newest-first, MIN/MAX + review_due eql_v3.date_ord, -- range filters + sealed_on eql_v3.date -- store and decrypt only +); +``` + +### SEM specifiers + +Both types take the same mechanism specifiers on their orderable variant (the concept is defined in [Core concepts](/reference/eql/core-concepts#sem-specifiers)): + +| Specifier | Meaning | +| --- | --- | +| `_ord` | Orderable, using EQL's default mechanism (currently ORE). | +| `_ord_ore` | Orderable via ORE, pinned explicitly. | + +The EQL v3 release adds an OPE specifier for every orderable type; unspecified `_ord` columns keep tracking the default. + +## Payload + +A value for an `_ord` column carries the shared envelope keys (`v`, `i`, `c` — see [Core concepts](/reference/eql/core-concepts)) plus the `ob` ordering term. 
Here is a payload for the `eql_v3.timestamp_ord` `occurred_at` column: + +```json +{ + "v": 3, + "i": { "t": "audit_events", "c": "occurred_at" }, + "c": "mBbKmsMM%bK#QQOx1yLDBHyD...", + "ob": [ + "7a1fd0c2...", "d24c9be1...", "03fa66b8...", "91b7e04d...", + "5c28aa19...", "e6f3071c...", "48d92ab5...", "0b64cf37...", + "2ce8b1f4...", "a90d57e2...", "6f13c8ba...", "d4720e95..." + ] +} +``` + +- **`ob` is the only index term.** An `_ord` payload carries no `hm`: equality on `_ord` variants compares ORE terms, which collapse to equality — see [Core concepts](/reference/eql/core-concepts). +- **The `ob` block count varies with the plaintext width** — `timestamp` values carry 12 blocks. + +## Operators + +| SQL operator | `eql_v3.<type>` | `_eq` | `_ord` / `_ord_ore` | +| --- | :---: | :---: | :---: | +| `=` / `<>` | ❌ | ✅ | ✅ | +| `<` `<=` `>` `>=` | ❌ | ❌ | ✅ | +| `BETWEEN` (desugars to `>=` and `<=`) | ❌ | ❌ | ✅ | +| `IN` (desugars to `=`) | ❌ | ✅ | ✅ | +| `GROUP BY` / `DISTINCT` | ❌ | ✅ | ✅ | +| `ORDER BY` | ❌ | ❌ | ✅ | +| `IS NULL` / `IS NOT NULL` | ✅ | ✅ | ✅ | + +Blocked *operator* cells raise an `operator … is not supported` exception — they never silently return wrong rows. `ORDER BY` is the one blocked cell that doesn't raise: it isn't an operator, so sorting a variant without an ordering term runs — but the order is meaningless (see [Sorting](/reference/eql/sorting)). Operands must be typed (`$1::eql_v3.timestamp_ord`), or PostgreSQL resolves the native `jsonb` operator instead of the encrypted one. Both rules are covered in [Core concepts](/reference/eql/core-concepts). + +## Functions + +Every operator has a function form, for managed platforms that disallow custom operators — same typed arguments, identical resolution. 
The `MIN` / `MAX` aggregates only exist as functions: + +| Function | Equivalent | Available on | +| --- | --- | --- | +| `eql_v3.eq(a, b)` / `eql_v3.neq(a, b)` | `=` / `<>` | `_eq`, `_ord` / `_ord_ore` | +| `eql_v3.lt` / `lte` / `gt` / `gte` | `<` `<=` `>` `>=` | `_ord` / `_ord_ore` | +| `eql_v3.min(col)` / `eql_v3.max(col)` | aggregate `MIN` / `MAX` | `_ord` / `_ord_ore` | + +## Example queries + +### Time window + +```sql +SELECT * FROM audit_events +WHERE occurred_at BETWEEN $1::eql_v3.timestamp_ord AND $2::eql_v3.timestamp_ord; + +SELECT * FROM audit_events +WHERE review_due BETWEEN $1::eql_v3.date_ord AND $2::eql_v3.date_ord; +``` + +### Retention cutoff + +```sql +SELECT id FROM audit_events +WHERE occurred_at < $1::eql_v3.timestamp_ord; +``` + +### Newest-first listing + +Write the sort key in extractor form to stream rows out of the index already ordered β€” at large row counts this is the difference between seconds and milliseconds (see [Sorting](/reference/eql/sorting)): + +```sql +SELECT * FROM audit_events +WHERE occurred_at >= $1::eql_v3.timestamp_ord +ORDER BY eql_v3.ord_term(occurred_at) DESC +LIMIT 10; +``` + +### First and last event + +```sql +SELECT eql_v3.min(occurred_at), eql_v3.max(occurred_at) FROM audit_events; +``` + +## Where to next + + + + The same capabilities on int, float, and numeric columns. + + + Btree recipes on `eql_v3.ord_term` for range, ORDER BY, and MIN/MAX. + + + Why the extractor-form sort key matters, and how to verify with EXPLAIN. + + + WHERE-clause patterns across all encrypted types. + + diff --git a/content/docs/reference/eql/filtering.mdx b/content/docs/reference/eql/filtering.mdx new file mode 100644 index 0000000..9fe45a2 --- /dev/null +++ b/content/docs/reference/eql/filtering.mdx @@ -0,0 +1,124 @@ +--- +title: Filtering +description: "WHERE-clause patterns on encrypted columns: equality, IN lists, ranges and BETWEEN, text token matching, JSON containment, and combining encrypted and plaintext predicates." 
+type: reference +components: [eql] +verifiedAgainst: + eql: "3.0.0" +--- + +Every filter below is ordinary SQL β€” the encrypted operators resolve from the column's domain variant, and a functional index on the matching term extractor serves the predicate. One rule applies throughout: **operands must be typed** (`$1::eql_v3.text_eq`, not a bare literal), or PostgreSQL falls through to native `jsonb` semantics. See [Core concepts](/reference/eql/core-concepts) for the typed-operand rule and how unsupported operators fail loudly instead of returning wrong rows. + +## Equality: `=` and `<>` + +Works on `_eq` and `_ord` / `_ord_ore` variants of every scalar, and on `text_search`: + +```sql +SELECT * FROM users WHERE email = $1::eql_v3.text_eq; +SELECT * FROM users WHERE tax_id <> $1::eql_v3.text_eq; +``` + +On `_eq` and `text_search` columns equality compares the HMAC (`hm`) term. On `_ord` variants there is no `hm` β€” equality compares the ORE (`ob`) term, which collapses to equality, so `_ord` columns get `=` and `<>` for free. See [Core concepts](/reference/eql/core-concepts) for the mechanism. + +```sql +-- salary is eql_v3.int8_ord: equality works without an hm term +SELECT * FROM users WHERE salary = $1::eql_v3.int8_ord; +``` + +Bare storage-only variants (`eql_v3.text`, `eql_v3.int4`, …) block every comparison β€” see the type pages for what each variant supports: [Numbers](/reference/eql/numbers), [Dates & times](/reference/eql/dates-and-times), [Text](/reference/eql/text), [Booleans](/reference/eql/booleans). + +## `IN` lists + +`IN` desugars to `=`, so it needs the same equality-capable variants. Each list element is a separately encrypted, typed operand: + +```sql +SELECT * FROM users +WHERE email IN ($1::eql_v3.text_eq, $2::eql_v3.text_eq, $3::eql_v3.text_eq); +``` + +There is no way to encrypt a list as one value β€” the client encrypts each element and binds it as its own parameter. 
`IN (subquery)` also works, subject to the same-keyset rule covered in [Joins](/reference/eql/joins). + +## Ranges and `BETWEEN` + +`<`, `<=`, `>`, `>=` work on `_ord` / `_ord_ore` variants and `text_search` β€” the variants carrying an ORE (`ob`) term: + +```sql +SELECT * FROM users WHERE salary >= $1::eql_v3.int8_ord; + +-- BETWEEN desugars to >= and <= +SELECT * FROM users +WHERE created_at BETWEEN $1::eql_v3.timestamp_ord AND $2::eql_v3.timestamp_ord; +``` + +Half-open ranges compose the same way: + +```sql +SELECT * FROM events +WHERE occurred_at >= $1::eql_v3.timestamp_ord + AND occurred_at < $2::eql_v3.timestamp_ord; +``` + +## Text token matching: `@>` + +There is no `LIKE` on encrypted columns β€” encrypted free-text matching is bloom-filter token containment via `@>` on a `text_match` or `text_search` column: + +```sql +SELECT * FROM users WHERE name @> $1::eql_v3.text_match; +``` + +The client encrypts the search term into a bloom-filter query value; matching is probabilistic (false positives possible, false negatives not). For the full no-`LIKE` story and match-term tuning, see [Text](/reference/eql/text). + +## JSON containment and path filters + +Encrypted JSON documents (`eql_v3.json`) filter by containment and path existence: + +```sql +-- Does the document contain this (encrypted) structure? +SELECT * FROM orders WHERE metadata @> $1::eql_v3.ste_vec_query; + +-- Does this path exist in the document? +SELECT * FROM orders WHERE eql_v3.jsonb_path_exists(metadata, 'region_selector'); + +-- Equality on an extracted leaf +SELECT * FROM orders +WHERE metadata -> 'email_selector'::text = $1::eql_v3.ste_vec_entry; +``` + +Field access is by selector hash, not plaintext path. The full JSON surface β€” containment, field access, path queries, and range filters on extracted leaves β€” is in [JSON](/reference/eql/json). 
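
To make the "false positives possible, false negatives not" property of the `@>` text token match above concrete, here is a toy Python model of bloom-filter containment. Everything in it is an assumption made for illustration: the trigram tokenizer, the filter size, the hash count, and the unkeyed SHA-256 hashing are not EQL's parameters, and the real terms are produced by the encryption client with key material.

```python
import hashlib

FILTER_BITS = 2048  # illustrative size, not EQL's actual parameter
HASHES = 3          # illustrative number of hash functions


def tokens(text: str, n: int = 3) -> set[str]:
    """Split text into lowercase n-grams (an assumed tokenizer)."""
    s = text.lower()
    return {s[i:i + n] for i in range(len(s) - n + 1)}


def bloom_bits(text: str) -> set[int]:
    """Collect the bit positions set by every token of the text."""
    bits = set()
    for tok in tokens(text):
        for k in range(HASHES):
            digest = hashlib.sha256(f"{k}:{tok}".encode()).digest()
            bits.add(int.from_bytes(digest[:4], "big") % FILTER_BITS)
    return bits


def contains(value_bits: set[int], query_bits: set[int]) -> bool:
    """Model of value @> query: every query bit must be set in the value."""
    return query_bits <= value_bits
```

Because every token of the query also occurs in a matching value, its bits are always a subset of the value's bits, so a true match can never be missed. Unrelated tokens can still collide on the same bit positions, which is where false positives come from.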
+ +## Combining predicates + +Encrypted predicates compose with `AND`, `OR`, `NOT`, and parentheses like any other predicate β€” and plaintext columns filter normally alongside encrypted ones in the same `WHERE` clause: + +```sql +SELECT * FROM users +WHERE status = 'active' -- plaintext column, native operator + AND created_at >= $1::eql_v3.timestamp_ord -- encrypted range + AND (email = $2::eql_v3.text_eq -- encrypted equality + OR name @> $3::eql_v3.text_match); -- encrypted token match +``` + +The planner treats each encrypted predicate independently, so it can combine an index on a plaintext column with a functional index on an encrypted one (bitmap-AND, or whichever plan is cheapest). + +## `IS NULL` and `IS NOT NULL` + +A SQL `NULL` column value is never encrypted β€” there is no payload to encrypt β€” so null checks work on **every** variant, including storage-only ones: + +```sql +SELECT * FROM users WHERE tax_id IS NULL; +SELECT * FROM users WHERE tax_id IS NOT NULL; +``` + +Don't confuse this with a JSON `null` *inside* an encrypted document, which is an encrypted value like any other β€” see [JSON](/reference/eql/json). + +## Shape summary + +| Filter shape | Operators | Works on | Index | +| --- | --- | --- | --- | +| Equality | `=` `<>` `IN` | `_eq`, `_ord` / `_ord_ore`, `text_search` | hash (or btree) on `eql_v3.eq_term` β€” btree on `eql_v3.ord_term` for `_ord` | +| Range | `<` `<=` `>` `>=` `BETWEEN` | `_ord` / `_ord_ore`, `text_search` | btree on `eql_v3.ord_term` | +| Text token match | `@>` `<@` | `text_match`, `text_search` | GIN on `eql_v3.match_term` | +| JSON containment | `@>` `<@` | `eql_v3.json` | GIN on `eql_v3.to_ste_vec_query(col)::jsonb` | +| Null check | `IS NULL` / `IS NOT NULL` | every variant | β€” | + +Every one of these has a full index recipe β€” which method, which extractor, and how to confirm the index engages with `EXPLAIN` β€” in [Indexes](/reference/eql/indexes). 
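
When writing raw SQL by hand, one way to keep the typed-operand rule mechanical is to generate the placeholders rather than type them. The following is a minimal Python sketch under that assumption; `typed_placeholders` is a hypothetical helper, not part of the Stack SDK or CipherStash Proxy, which type bound parameters for you.

```python
def typed_placeholders(count: int, variant: str, start: int = 1) -> str:
    """Build '$1::eql_v3.<variant>, $2::eql_v3.<variant>, ...' for an IN list."""
    if count < 1:
        raise ValueError("an IN list needs at least one operand")
    return ", ".join(
        f"${n}::eql_v3.{variant}" for n in range(start, start + count)
    )


# Each element is still encrypted separately by the client and bound
# as its own parameter; the helper only writes the casts.
sql = f"SELECT * FROM users WHERE email IN ({typed_placeholders(3, 'text_eq')})"
```

The `start` argument lets the list follow other numbered parameters in the same statement.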
diff --git a/content/docs/reference/eql/grouping-and-aggregates.mdx b/content/docs/reference/eql/grouping-and-aggregates.mdx new file mode 100644 index 0000000..544a91c --- /dev/null +++ b/content/docs/reference/eql/grouping-and-aggregates.mdx @@ -0,0 +1,104 @@ +--- +title: Grouping & aggregates +description: "GROUP BY, DISTINCT, COUNT, and eql_v3.min/max on encrypted columns β€” why to group on the extractor, and why SUM and AVG stay client-side." +type: reference +components: [eql] +verifiedAgainst: + eql: "3.0.0" +--- + +Grouping and deduplication need an equality term, so they work on the same variants as `=`: `_eq`, `_ord` / `_ord_ore`, and `text_search`. `MIN` / `MAX` need an ordering term (`_ord` / `_ord_ore`, `text_search`). Arithmetic aggregates don't work at all β€” that's the last section. As everywhere, operands and call-site casts must be typed; see [Core concepts](/reference/eql/core-concepts). + +## `GROUP BY` and `DISTINCT` + +Both work in natural form on equality-capable variants: + +```sql +SELECT email, COUNT(*) FROM logins GROUP BY email; +SELECT DISTINCT email FROM logins; +``` + +Grouping compares equality terms, so rows encrypting the same plaintext land in the same group β€” but the group key that comes back is ciphertext. Decrypt it in the client if you need to display it. + +## Group on the extractor + +For anything beyond small tables, group on the equality-term extractor instead of the raw column: + +```sql +SELECT eql_v3.eq_term(email) AS email_term, COUNT(*) + FROM logins + GROUP BY eql_v3.eq_term(email); +``` + +The reason is planner economics. `GROUP BY email` uses the entire encrypted payload β€” 1–2 KB per row β€” as the hash key. Postgres estimates a hash table far larger than the default `work_mem` and falls back to a disk-spilling `GroupAggregate`. The extractor key is a small deterministic term: the hash table fits in `work_mem` and the planner picks `HashAggregate` reliably. 
If an ORM forces the raw-column form, raising `work_mem` is the rescue knob — but the extractor form is the intended design. The same reasoning, from the index-tuning angle, is in [Indexes](/reference/eql/indexes). + +Note the trade-off: grouping on `eq_term` returns the *term*, not the encrypted value — fine for counting, but the term itself can't be decrypted. If you need the group key's plaintext, join the grouped result back to the table on the term to recover a representative encrypted value, then decrypt that in the client. + +## `COUNT` and `COUNT(DISTINCT)` + +Plain `COUNT(col)` counts non-`NULL` rows — it never compares values, so it works on **any** variant, including storage-only ones: + +```sql +SELECT COUNT(tax_id) FROM users; -- works even on bare eql_v3.text +``` + +`COUNT(DISTINCT col)` deduplicates, so it needs an equality-capable variant — and the same extractor advice applies: + +```sql +SELECT COUNT(DISTINCT eql_v3.eq_term(email)) FROM logins; +``` + +## `MIN` and `MAX`: `eql_v3.min` / `eql_v3.max` + +EQL ships `min` / `max` aggregates per ord-capable variant of every scalar type. The input type selects the aggregate, and the return type matches the input: + +```sql +eql_v3.min(eql_v3.<type>_ord) RETURNS eql_v3.<type>_ord +eql_v3.max(eql_v3.<type>_ord) RETURNS eql_v3.<type>_ord +eql_v3.min(eql_v3.<type>_ord_ore) RETURNS eql_v3.<type>_ord_ore +eql_v3.max(eql_v3.<type>_ord_ore) RETURNS eql_v3.<type>_ord_ore +``` + +Comparison routes through the variant's `<` / `>` operator on the ORE term — no decryption happens in the database, and the result is an encrypted value the client decrypts. `NULL` inputs are skipped; an all-`NULL` input set returns `NULL`, matching native aggregate semantics. 
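
The `NULL` handling described above can be modelled in a few lines of Python. This is a sketch of the semantics only, not of how EQL implements the aggregate: `encrypted_min` is a hypothetical name, and the caller-supplied `less_than` comparator stands in for the variant's `<` operator, which compares ORE terms without seeing plaintext.

```python
from typing import Callable, Iterable, Optional, TypeVar

T = TypeVar("T")


def encrypted_min(
    values: Iterable[Optional[T]],
    less_than: Callable[[T, T], bool],
) -> Optional[T]:
    """Fold values with an opaque comparator, skipping None like SQL NULL."""
    smallest: Optional[T] = None
    for value in values:
        if value is None:
            continue  # NULL inputs are skipped
        if smallest is None or less_than(value, smallest):
            smallest = value
    return smallest  # stays None when every input was NULL
```

The fold never inspects a value directly; it only asks the comparator which of two values is smaller, which is why an ordering term is enough for `MIN` and `MAX`.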
+ +```sql +SELECT eql_v3.min(salary) FROM users; +SELECT eql_v3.max(salary) FROM users WHERE department = 'engineering'; + +-- Combined with grouping +SELECT eql_v3.eq_term(department_code) AS dept, eql_v3.max(salary) + FROM users + GROUP BY eql_v3.eq_term(department_code); +``` + +If the column is generic `jsonb` rather than a domain, cast to the right variant at the call site so overload resolution can pick the aggregate: + +```sql +SELECT eql_v3.min(salary_jsonb::eql_v3.int8_ord) FROM users; +``` + +A btree on `eql_v3.ord_term(col)` serves `MIN` / `MAX` β€” the [Indexes](/reference/eql/indexes) page has the recipe. + +## No `SUM`, no `AVG` + + +**`SUM`, `AVG`, and every other arithmetic aggregate are unsupported** on encrypted columns β€” they would require homomorphic encryption, which EQL does not do. `MIN` / `MAX` work because they only need *comparison*, which the ORE term provides. For sums and averages, select the rows (or `MIN`/`MAX`/`COUNT` server-side to narrow them) and aggregate client-side after decryption. + + +## Grouping on extracted JSON leaves + +Leaves inside an encrypted JSON document group the same way β€” extract the entry by selector, then group on its equality term: + +```sql +SELECT eql_v3.eq_term(metadata -> 'region_selector'::text) AS region, COUNT(*) + FROM orders + GROUP BY eql_v3.eq_term(metadata -> 'region_selector'::text); +``` + +`eql_v3.eq_term` reads whichever term the entry carries, so this works on every JSON node type. String and Number leaves also support `eql_v3.min` / `eql_v3.max` via their CLLW ORE term. Selectors and node capabilities are in [JSON](/reference/eql/json). + +## Where to go next + +- [Indexes](/reference/eql/indexes) β€” the hash/btree recipes that back these shapes, and the full `work_mem` / `HashAggregate` story. +- [Joins](/reference/eql/joins) β€” equality terms across tables, and the same-keyset rule. +- [Filtering](/reference/eql/filtering) β€” the `WHERE` shapes that feed these aggregates. 
diff --git a/content/docs/reference/eql/index.mdx b/content/docs/reference/eql/index.mdx new file mode 100644 index 0000000..6b9ce9e --- /dev/null +++ b/content/docs/reference/eql/index.mdx @@ -0,0 +1,144 @@ +--- +title: EQL +description: "Encrypt Query Language (EQL) installs encrypted column types and operators into Postgres as plain SQL β€” encryption itself happens in your client." +type: reference +components: [eql] +verifiedAgainst: + eql: "3.0.0" +--- + +Encrypt Query Language (EQL) is a set of types, operators, and functions for storing and querying encrypted data in PostgreSQL. It installs as a single plain-SQL script β€” no extension packaging, no superuser, no operator classes β€” so it runs on Supabase, RDS, Cloud SQL, and self-hosted Postgres alike. + +EQL itself never encrypts anything. Encryption and decryption happen in the client, using the [Stack SDK](/reference/stack) or [CipherStash Proxy](/reference/proxy). EQL provides the database-side surface those clients query against: encrypted column types, the operators that compare them, and the term-extractor functions that make indexes work. + +Every encrypted column is a `jsonb`-backed domain type in the `eql_v3` schema, and the domain variant you choose declares what the column can do β€” the full model is in [Core concepts](/reference/eql/core-concepts). + +## Install + + + + +### Download the install script + +Each [GitHub release](https://github.com/cipherstash/encrypt-query-language/releases) publishes a versioned `cipherstash-encrypt.sql`: + +```sh +curl -sLo cipherstash-encrypt.sql https://github.com/cipherstash/encrypt-query-language/releases/latest/download/cipherstash-encrypt.sql +``` + + + + +### Run it against each database + +```sh +psql -f cipherstash-encrypt.sql +``` + +The script installs the `eql_v3` schema with all domain types, operators, functions, and aggregates. 
It is idempotent: re-running it upgrades the `eql_v3` surface in place and won't remove anything you've built on top of it. To upgrade, download the latest script and run it again. + + + + +### Verify + +```sql +SELECT eql_v3.version(); +-- '3.0.0' +``` + + + + + +`DROP SCHEMA eql_v3 CASCADE` drops every column typed as an `eql_v3` domain. The domain types live in the schema, and your columns depend on them. + + +### dbdev + +EQL is also published to [dbdev](https://database.dev/cipherstash/eql). The dbdev release can lag behind GitHub releases, so prefer the install script when you need the latest version. + +### Docker for local development + +Run a Postgres image with EQL pre-installed: + +```sh +docker run --rm -p 5432:5432 -e POSTGRES_PASSWORD=postgres \ + ghcr.io/cipherstash/postgres-eql:17 +``` + +EQL installs automatically on first boot. Images are available for PostgreSQL 14–17 (`:14` through `:17`), and you can pin a specific EQL version with a suffixed tag (for example `:17-3.0.0`). + +## Permissions + +Installing EQL and running queries against it need different privileges. A common production pattern splits them across two users. + +**Migration user** — installs EQL and adds encrypted columns during migrations: + +```sql +GRANT CREATE ON DATABASE your_database TO your_migration_user; +GRANT CREATE ON SCHEMA public TO your_migration_user; +-- PostgreSQL has no grantable ALTER privilege on tables: ALTER TABLE requires ownership, +-- so the migration user must own the tables it alters (or be a member of the owning role). +``` + +`CREATE ON DATABASE` lets the migration user create the `eql_v3` schema and its types, and `CREATE ON SCHEMA` lets it create objects in `public`. Adding encrypted columns (typed as `eql_v3` domains, with their `CHECK` constraints) to existing tables is an `ALTER TABLE`, which PostgreSQL allows only for the table's owner or a member of the owning role. 
+ +**Runtime user** β€” the application's day-to-day access: + +```sql +-- EQL schema usage (resolves the encrypted operators / extractors) +GRANT USAGE ON SCHEMA eql_v3 TO your_app_user; +GRANT EXECUTE ON ALL FUNCTIONS IN SCHEMA eql_v3 TO your_app_user; + +-- User table access (normal application permissions) +GRANT SELECT, INSERT, UPDATE, DELETE ON TABLE your_tables TO your_app_user; +``` + +Schema changes β€” adding or removing encrypted columns β€” always go through the migration user. + +## Managed Postgres and Supabase + +EQL v3 is designed to install without superuser. There are no custom operator classes (which managed platforms typically block), no `postgresql.conf` changes, and no separate Supabase build β€” the single install script is the same artefact everywhere. Indexing works through ordinary functional indexes over EQL's term-extractor functions, which any user who can `CREATE INDEX` can build. See the [Supabase integration](/integrations/supabase) for platform-specific setup. + +## Understand + + + + Domain variants, the encrypted payload, typed operands, and fail-loud blockers β€” the model every other page assumes. + + + Encrypted integers, floats, and numerics. + + + Encrypted dates and timestamps: time windows, newest-first, retention cutoffs. + + + Encrypted text: equality, ordering, and free-text token matching β€” and why there is no `LIKE`. + + + Encrypted JSON documents: containment, field access, and GIN indexing. + + + Storage-only by design: why encrypted booleans carry no index terms. + + + Functional-index recipes over the term extractors, and what it takes for an index to engage. + + + +## Use + + + + `WHERE` clauses on encrypted columns: equality, ranges, and text containment. + + + `ORDER BY` on encrypted columns, and how to keep the sort in the index. + + + `GROUP BY`, `DISTINCT`, `COUNT`, and the `MIN` / `MAX` aggregates. + + + Equijoins on encrypted columns and the same-keyset rule. 
+ + diff --git a/content/docs/reference/eql/indexes.mdx b/content/docs/reference/eql/indexes.mdx new file mode 100644 index 0000000..ff4b1fe --- /dev/null +++ b/content/docs/reference/eql/indexes.mdx @@ -0,0 +1,182 @@ +--- +title: Indexes +description: "Create Postgres indexes on encrypted columns using functional indexes over EQL's term-extractor functions." +type: reference +components: [eql] +verifiedAgainst: + eql: "3.0.0" +--- + +EQL indexes are ordinary PostgreSQL functional indexes over **term-extractor functions** β€” never an index or operator class on the column itself. Each extractor returns a small per-row index term whose return type already carries a default operator class: + +| Extractor | Index method | Term | Capability | +| --- | --- | --- | --- | +| `eql_v3.eq_term(col)` | `hash` (or `btree`) | `hm` (HMAC-256) | equality | +| `eql_v3.ord_term(col)` | `btree` | `ob` (ORE block) | range, `ORDER BY`, `MIN` / `MAX` | +| `eql_v3.match_term(col)` | `gin` | `bf` (bloom filter) | text containment | + +The extractors are inlinable SQL functions, so the planner rewrites a bare-form predicate into the same expression the index was built on. You don't rewrite queries to use the index: + +```sql +SELECT * FROM users WHERE email = $1::eql_v3.text_eq; +-- planner inlines `=` to: eql_v3.eq_term(email) = eql_v3.eq_term($1) +-- Index Cond on USING hash (eql_v3.eq_term(email)) +``` + + +EQL v3 deliberately ships no operator class for encrypted columns. Operators resolve against the domain's `jsonb` base type, so an opclass on the column would bypass the encrypted surface. Always index through the extractor. 
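The `bf` (bloom filter) term in the extractor table can be pictured with a toy model. The sketch below is an illustration in Python, not EQL's implementation: the trigram tokenizer, the 256-bit filter width, the keyed BLAKE2b hash, and the names `bloom_term` and `may_contain` are all assumptions made for the example. It shows why containment reduces to a bit-subset test, and why a bloom term can report false positives but never false negatives.

```python
import hashlib

BITS = 256  # toy filter width; an assumption, not an EQL parameter


def trigrams(text):
    """Split lowercased text into overlapping 3-character tokens."""
    text = text.lower()
    return {text[i:i + 3] for i in range(len(text) - 2)}


def bloom_term(text, key=b"hypothetical-key"):
    """Set one bit per token, using a keyed hash so tokens are not publicly hashable."""
    bits = 0
    for token in trigrams(text):
        digest = hashlib.blake2b(token.encode("utf-8"), key=key, digest_size=8).digest()
        bits |= 1 << (int.from_bytes(digest, "big") % BITS)
    return bits


def may_contain(row_term, query_term):
    """Containment test: every bit set in the query must also be set in the row."""
    return row_term & query_term == query_term


row = bloom_term("alice johnson")
print(may_contain(row, bloom_term("johns")))    # substring tokens are all present
print(may_contain(row, bloom_term("zebra")))    # usually False; bit collisions can cause a false positive
```

Because a match only proves that the bits overlap, a real client would still confirm candidate rows after decryption.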
+ + +## Index recipes + +Type the column as the domain variant that carries the term (see [Core concepts](/reference/eql/core-concepts) for the variant model, and the per-type pages for specifics), then index the matching extractor: + +```sql +-- Equality: hash index on eq_term +-- (columns typed eql_v3._eq or text_search; equality on _ord columns +-- compares ORE terms, so the btree on ord_term below serves it) +CREATE INDEX users_email_eq + ON users USING hash (eql_v3.eq_term(email)); + +-- Range / ordering: btree index on ord_term +-- (columns typed eql_v3._ord or _ord_ore) +CREATE INDEX users_created_at_ord + ON users USING btree (eql_v3.ord_term(created_at)); + +-- Text match: GIN index on match_term +-- (columns typed eql_v3.text_match or text_search) +CREATE INDEX users_name_match + ON users USING gin (eql_v3.match_term(name)); + +ANALYZE users; +``` + +Run `ANALYZE` after every index build. `CREATE INDEX` on an expression gathers no statistics for that expression β€” without `ANALYZE`, the planner has no histogram for `eql_v3.eq_term(email)` and can misjudge the index it just built. + +Create indexes when the table has a significant number of rows (typically more than 1,000) and you query the column with the matching operator. Drop indexes for capabilities you no longer query β€” duplicate indexes compete for cache and slow writes. + +## Requirements for an index to engage + +All three must hold: + +1. **The value carries the required term.** Equality needs `hm`, range needs `ob`, containment needs `bf`. Which terms travel in a value's payload is decided by the encryption client β€” a value with only a bloom term will not drive an equality index. +2. **The index was built after the data carried the term.** If you change which terms a column's values carry, recreate the index. +3. 
**The query operand is typed.** A typed parameter (`$1`, which CipherStash Proxy supplies) or an explicit cast resolves the encrypted operator; a bare `jsonb` literal falls through to native `jsonb` semantics and skips the index entirely:
+
+```sql
+-- ✓ resolves the encrypted operator → uses the index
+WHERE email = $1::eql_v3.text_eq;
+WHERE email = $1;  -- only when the client (Stack SDK / Proxy) binds $1 typed
+
+-- ✗ falls through to native jsonb semantics
+WHERE email = '{"hm":"abc"}'::jsonb;
+```
+
+## Query shapes
+
+### Equality
+
+```sql
+SELECT * FROM users WHERE email = $1::eql_v3.text_eq;
+-- Index Scan using users_email_eq
+--   Index Cond: (eql_v3.eq_term(email) = eql_v3.eq_term($1))
+```
+
+### Range and ORDER BY
+
+The `<`, `<=`, `>`, `>=` operators inline to comparisons on `eql_v3.ord_term`, so natural-form range predicates match the btree:
+
+```sql
+SELECT * FROM users WHERE created_at < $1::eql_v3.timestamp_ord;
+```
+
+`ORDER BY` needs care. The planner inlines operators in *predicates* but does not rewrite *sort keys*: `ORDER BY created_at` uses the index for the `WHERE` clause but still adds a `Sort` node, which scales linearly with the rows passing the filter. To stream rows out of the btree already ordered, write the sort key in extractor form:
+
+```sql
+SELECT * FROM users
+  WHERE created_at < $1::eql_v3.timestamp_ord
+  ORDER BY eql_v3.ord_term(created_at) DESC
+  LIMIT 10;
+```
+
+ORE terms are order-preserving, so this sorts identically to the natural form; it just lets the index do the ordering. At large row counts this is the difference between seconds and milliseconds.
+
+If you `SELECT col::jsonb ... ORDER BY col`, Postgres folds the cast into the scan and uses `(col)::jsonb` as the sort key, which matches no index. Project the column raw, or write the sort key as `eql_v3.ord_term(col)`, which sidesteps this entirely.
+ + +### GROUP BY and DISTINCT + +Group on the extractor, not the raw column: + +```sql +SELECT eql_v3.eq_term(email), count(*) + FROM users + GROUP BY eql_v3.eq_term(email); +``` + +`GROUP BY email` uses the entire encrypted payload (1–2 KB per row) as the hash key; Postgres estimates a hash table far larger than the default `work_mem` and falls back to a disk-spilling `GroupAggregate`. The extractor key is a small deterministic term, so the hash table fits in `work_mem` and the planner picks `HashAggregate` reliably. If an ORM forces the raw-column form, raising `work_mem` is the rescue knob β€” but the extractor form is the design. + +## Encrypted JSON + +Containment (`@>` / `<@`) on `eql_v3.json` document columns uses a GIN index over `eql_v3.to_ste_vec_query(col)::jsonb`, and field-level equality and ordering have their own extractor recipes. See [JSON](/reference/eql/json). + +## Verify with EXPLAIN + +The first move on a slow query is `EXPLAIN (COSTS OFF)`: + +- **`Index Scan using `** β€” the functional index is engaged. +- **`Index Cond:` referencing the extractor** (`eql_v3.eq_term(...)`, `eql_v3.ord_term(...)`) β€” the inlined predicate matched the index. +- **`Seq Scan`** β€” no index used. Check the three requirements above. +- **`Filter:` showing the raw operator** β€” inlining did not happen. Usual causes: a pinned `search_path` on a customised function, or the planner judging another plan cheaper. +- **`Sort` node above an Index Scan** β€” natural-form `ORDER BY`. Switch the sort key to `eql_v3.ord_term(col)` to eliminate it. + +Once the plan looks right, repeat with `EXPLAIN ANALYZE` to measure actual timings. For a full diagnosis walkthrough, see [query performance troubleshooting](/guides/troubleshooting/query-performance). + +## Building indexes on large tables + +Index *build* time is a separate axis from query time β€” a functional index that queries in a millisecond can take hours to `CREATE` on a large table. 
+ +**Raise `maintenance_work_mem`.** `CREATE INDEX` draws on `maintenance_work_mem` (default 64 MB β€” far too small for a multi-million-row build). It's the single highest-leverage knob: + +```sql +SET maintenance_work_mem = '2GB'; +CREATE INDEX users_email_eq ON users USING btree (eql_v3.eq_term(email)); +``` + +**Prefer `btree` over `hash` for equality on large tables.** A btree build sorts then bulk-loads with sequential writes and can parallelise; a hash build scatters rows to random buckets and degrades to random I/O once the index outgrows cache β€” it cannot parallelise. A btree on `eql_v3.eq_term(col)` serves `=` exactly as well as a hash index, with no query-side cost. Hash is fine up to mid-six-figure row counts. + +**Expect a de-TOAST floor.** A functional index over a large encrypted column de-TOASTs the whole stored value once per row to evaluate the extractor. This cost is identical across access methods and sets the build's floor rate. Index builds are also I/O-heavy in a way queries are not β€” containerised Postgres on a virtualised filesystem (Docker Desktop on macOS, notably) pays a steep penalty, so run large builds on native storage. + +**Watch the build.** From a second session while `CREATE INDEX` runs: + +```sql +SELECT phase, tuples_done, tuples_total, + round(100.0 * tuples_done / nullif(tuples_total, 0), 1) AS pct +FROM pg_stat_progress_create_index; +``` + +A steady `tuples_done` rate is healthy. A rate that decays over time is the cache/memory wall β€” raise `maintenance_work_mem`, and if it's a hash index, rebuild it as a btree. + +## Why this works on managed Postgres + +Everything above is a functional index over an `IMMUTABLE` SQL function β€” no operator class on a column, no superuser, no `postgresql.conf` changes. Managed platforms that block custom operator classes (Supabase among them) run these recipes unchanged, so the indexing model is identical on Supabase, RDS, Cloud SQL, and self-hosted Postgres. 
See the [Supabase integration](/integrations/supabase). + +## Troubleshooting + +**Index not being used:** + +1. Verify the value carries the term: + + ```sql + SELECT email::jsonb ? 'hm' AS has_hmac, + email::jsonb ? 'ob' AS has_ore_block, + email::jsonb ? 'bf' AS has_bloom + FROM users LIMIT 1; + ``` + +2. Verify the operand is typed (`$1::eql_v3.text_eq`, not `$1::jsonb`). +3. Recreate the index if the column's terms changed after it was built. +4. Run `ANALYZE`. Very small tables may still choose a sequential scan β€” that's correct. + +**`=` returns zero rows on a populated column:** equality requires the term its variant compares β€” `hm` on `_eq` / `text_search`, `ob` on `_ord` variants. Type the column as an equality-capable variant and confirm the encryption client is emitting that term. diff --git a/content/docs/reference/eql/joins.mdx b/content/docs/reference/eql/joins.mdx new file mode 100644 index 0000000..fdcc0e4 --- /dev/null +++ b/content/docs/reference/eql/joins.mdx @@ -0,0 +1,112 @@ +--- +title: Joins +description: "Equijoins on encrypted columns: the same-keyset and matching-variant constraint, IN (subquery) and set operations, a worked example, and how to diagnose a join that returns nothing." +type: reference +components: [eql] +verifiedAgainst: + eql: "3.0.0" +--- + +Equijoins work on equality-capable variants (`_eq`, `_ord` / `_ord_ore`, `text_search`) β€” the join condition is just encrypted equality. But there is one constraint that has no plaintext equivalent, and it is the single thing to internalize on this page: + + +**Both sides of the join must be encrypted with the same keyset and typed as a matching variant.** Encrypted equality compares deterministic index terms, and those terms are derived from the encryption keys. Two columns encrypted under different keysets produce different terms for the *same plaintext* β€” their terms can **never** match, and the join returns no rows. 
This is not an error the database can detect: the query is valid, the plan is fine, the result is simply empty. + + +"Matching variant" means both sides compare the same term kind: `_eq` with `_eq` (or `text_search`, which carries an `hm` term too) compares HMAC terms; `_ord` with `_ord` compares ORE terms. An `_eq` column can't join an `_ord` column β€” one side has no `hm`, the other no `ob`, and the equality operator between mismatched variants doesn't resolve. See [Core concepts](/reference/eql/core-concepts) for the term model. + +## Equijoin + +```sql +SELECT u.*, o.total +FROM users u +JOIN orders o ON u.email = o.customer_email; +-- both columns eql_v3.text_eq, encrypted with the same keyset +``` + +No typed-operand cast is needed here β€” both operands are encrypted columns, so their domain types resolve the encrypted operator directly. All join types (`INNER`, `LEFT`, `RIGHT`, `FULL`) work; `LEFT JOIN` null-extension behaves normally because SQL `NULL`s are not encrypted. + +Index both sides for anything beyond small tables β€” a hash (or btree) index on `eql_v3.eq_term(col)` on each column. Recipes are in [Indexes](/reference/eql/indexes). + +## `IN (subquery)` and set operations + +Both follow the same rule, because both compare equality terms across two column sources: + +```sql +-- IN (subquery): users.email and orders.customer_email must share a keyset +SELECT * FROM users +WHERE email IN (SELECT customer_email FROM orders WHERE flagged); + +-- Set-operation dedup: UNION / INTERSECT / EXCEPT dedupe by equality term +SELECT email FROM users +UNION +SELECT customer_email FROM orders; +``` + +If the two columns are under different keysets, `IN (subquery)` matches nothing, `INTERSECT` is empty, `EXCEPT` returns everything, and `UNION` never merges duplicates β€” all silently. 
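The silent failure described above can be reproduced with a toy model. The sketch below is written in Python and stands in for the idea only: it assumes an equality term behaves like an HMAC-SHA-256 of the plaintext under a key from the keyset, and the keyset bytes, sample emails, and the names `eq_term` and `set_operations` are hypothetical. It is not how the Stack SDK or Proxy derive terms in detail.

```python
import hashlib
import hmac


def eq_term(keyset, plaintext):
    """Toy equality term: HMAC-SHA-256 of the plaintext under a keyset's key."""
    return hmac.new(keyset, plaintext.encode("utf-8"), hashlib.sha256).hexdigest()


# Hypothetical keysets and data, for illustration only.
KEYSET_A = b"keyset-a-secret"
KEYSET_B = b"keyset-b-secret"
USER_EMAILS = ["ada@example.com", "bob@example.com"]
ORDER_EMAILS = ["ada@example.com", "cy@example.com"]


def set_operations(users_key, orders_key):
    """Run INTERSECT / EXCEPT / UNION over terms only, as the database would."""
    users = {eq_term(users_key, email) for email in USER_EMAILS}
    orders = {eq_term(orders_key, email) for email in ORDER_EMAILS}
    return {
        "intersect": len(users & orders),
        "except": len(users - orders),
        "union": len(users | orders),
    }


same = set_operations(KEYSET_A, KEYSET_A)
cross = set_operations(KEYSET_A, KEYSET_B)
print("same keyset: ", same)
print("cross keyset:", cross)
```

With a shared keyset the one overlapping email is found (intersect 1, except 1, union 3). Under two keysets the same plaintext yields unrelated terms, so the counts become intersect 0, except 2, union 4, and nothing in the computation signals that anything went wrong.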
+ +## Worked example + +Two tables sharing an encrypted customer identifier, both columns typed `eql_v3.text_eq` and encrypted by the same client configuration (same keyset): + +```sql +CREATE TABLE users ( + id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY, + email eql_v3.text_eq +); + +CREATE TABLE orders ( + id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY, + customer_email eql_v3.text_eq, + total BIGINT NOT NULL +); + +CREATE INDEX users_email_eq ON users USING hash (eql_v3.eq_term(email)); +CREATE INDEX orders_cust_eq ON orders USING hash (eql_v3.eq_term(customer_email)); +ANALYZE users; ANALYZE orders; +``` + +Orders per user, filtered by an encrypted lookup on one side: + +```sql +SELECT u.id, COUNT(o.id) AS order_count +FROM users u +LEFT JOIN orders o ON u.email = o.customer_email +WHERE u.email = $1::eql_v3.text_eq +GROUP BY u.id; +``` + +The `WHERE` engages the hash index on `users`; the join condition engages the one on `orders`. The grouping key here is a plaintext `id`, so no extractor is needed β€” grouping on encrypted columns is covered in [Grouping & aggregates](/reference/eql/grouping-and-aggregates). + +## Anti-pattern: joining across keysets + +The failure mode is quiet. A join across keysets doesn't raise, doesn't warn, and produces a plan that looks healthy β€” the terms just never match, so it behaves exactly like a join where no rows happen to correlate: + +```sql +-- users encrypted by service A's keyset, partners by service B's: +SELECT * FROM users u JOIN partners p ON u.email = p.contact_email; +-- 0 rows. Always. Even when the plaintext emails overlap. +``` + +To diagnose a join that returns fewer rows than expected (or none): + +1. **Check the variants.** Both columns must be equality-capable and compare the same term kind. A blocked operator raises loudly, so if the query *runs*, the variants at least resolve β€” but confirm they compare the same term (`hm` vs `ob`). +2. 
**Compare terms for a known-matching pair.** Take one row from each table that you know holds the same plaintext and compare their equality terms: + + ```sql + SELECT eql_v3.eq_term(u.email) = eql_v3.eq_term(p.contact_email) AS terms_match + FROM users u, partners p + WHERE u.id = 42 AND p.id = 7; -- rows known to share a plaintext value + ``` + + `false` for plaintext-identical values means the terms were derived under different keysets (or different client configurations) β€” no SQL will make them join. +3. **Fix it at the encryption layer.** Configure both columns under the same keyset in the [Stack SDK](/reference/stack) or [CipherStash Proxy](/reference/proxy) and re-encrypt one side. Cross-keyset correlation otherwise has to happen in the client, after decryption. + +Treat shared keysets as part of your schema design: columns you intend to join are a unit, the same way a foreign key pair is. + +## Where to go next + +- [Filtering](/reference/eql/filtering) β€” the equality and `IN` shapes joins are built from. +- [Grouping & aggregates](/reference/eql/grouping-and-aggregates) β€” grouping joined results on encrypted keys. +- [Indexes](/reference/eql/indexes) β€” equality index recipes for both sides of a join. +- [Core concepts](/reference/eql/core-concepts) β€” index terms, variants, and why determinism makes joins possible at all. diff --git a/content/docs/reference/eql/json.mdx b/content/docs/reference/eql/json.mdx new file mode 100644 index 0000000..93eb920 --- /dev/null +++ b/content/docs/reference/eql/json.mdx @@ -0,0 +1,268 @@ +--- +title: JSON +description: "The complete reference for encrypted JSON documents with eql_v3.json β€” the ste_vec payload shape, containment, field access, and path queries over ciphertext, with the native jsonb operators that don't apply blocked outright." 
+type: reference +components: [eql] +verifiedAgainst: + eql: "3.0.0" +--- + +`eql_v3.json` is EQL's encrypted JSON document type, built on structured encryption (**ste_vec**). The document is encrypted as a vector of encrypted entries β€” one entry per path inside the document β€” and every path is queryable without decryption: containment, field and array access, and equality or range comparisons on extracted leaves. + +Like every EQL type, `eql_v3.json` holds ciphertext the database can't read. Encryption, decryption, and selector generation happen in the client β€” the [Stack SDK](/reference/stack) or [CipherStash Proxy](/reference/proxy). See [Searchable encryption](/concepts/searchable-encryption) for how querying ciphertext works at all. + +## The types + +Three `jsonb`-backed domains make up the encrypted JSON surface: + +| Type | What it is | +| --- | --- | +| `eql_v3.json` | The column type. An encrypted document envelope carrying an `sv` array β€” one encrypted entry per path in the document. | +| `eql_v3.ste_vec_entry` | A single entry from the vector: a selector, a ciphertext, and exactly one index term. This is what `->` returns. | +| `eql_v3.ste_vec_query` | A containment needle: entries with selectors and index terms but **no ciphertext**. This is what you cast a `@>` operand to. | + +## Payload shape + +An encrypted JSON document uses a different payload shape from the scalar types: the standard envelope keys are present (`v`, `i`, plus the `k: "sv"` discriminator β€” envelope anatomy is covered in [Core concepts](/reference/eql/core-concepts)), but there is no root ciphertext. Instead, an `sv` array carries one encrypted entry per path in the document. Each entry has: + +| Key | Contents | +| --- | --- | +| `s` | Selector β€” a deterministic hash of the JSON path. Required; entry matching compares selectors first. | +| `c` | Ciphertext for the node at that path. 
| +| `hm` **or** `oc` | Exactly one, never both β€” the domain `CHECK` enforces the exclusivity. `hm` (HMAC-256) on Boolean/`null` leaves and Object/Array roots; `oc` (CLLW ORE, backed by `eql_v3.ore_cllw`) on String/Number leaves. | +| `a` | Optional array marker β€” `true` when the selector points at an array context. | + +The decoded `oc` value starts with a domain-tag byte (`0x00` numeric, `0x01` string) followed by the CLLW ciphertext, so numeric and string values in one column keep a consistent total order. Earlier payload versions split this into two fields β€” `ocf` (fixed-width, numeric) and `ocv` (variable-width, string) β€” which consolidated into the single `oc` key; the tag byte now carries the distinction. + +A document payload for an `eql_v3.json` column: + +```json +{ + "v": 3, + "k": "sv", + "i": { "t": "orders", "c": "metadata" }, + "sv": [ + { "s": "2517068c0d1f9d4d41d2c666211f785e", "c": "mBbKmM...", "hm": "b0e0..." }, + { "s": "f510853a4ab9d4f75f51a533ac264c5d", "c": "mBbKmQ...", "oc": "01a3f2..." }, + { "s": "33743aed3ae636f6bf05cff11ac4b519", "c": "mBbKmR...", "oc": "004e19..." } + ] +} +``` + +- First entry: an object root β€” `hm` only, equality/containment +- Second entry: a string leaf β€” `oc` starting with tag `01` +- Third entry: a numeric leaf β€” `oc` starting with tag `00` + +A containment **query** payload (`eql_v3.ste_vec_query`) has the same `sv` shape but its entries carry no `c` β€” containment matches selectors and index terms, never ciphertexts. This is the needle the client builds for a `@>` query: + +```json +{ + "sv": [ + { "s": "f510853a4ab9d4f75f51a533ac264c5d", "oc": "01a3f2..." } + ] +} +``` + +## Storing encrypted JSON + +Type the column as `eql_v3.json`: + +```sql +CREATE TABLE orders ( + id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY, + metadata eql_v3.json +); +``` + +There is no database-side configuration step. 
Which index terms a document carries is decided by the encryption client; typing the column as `eql_v3.json` is what makes the encrypted operators and functions resolve. The domain's `CHECK` constraint validates the payload shape on insert, so malformed values are rejected at write time. + +Insert and read through the Stack SDK or Proxy, which encrypt the document into the ste_vec payload on write and decrypt it on read. + +## What each node type supports + +During encryption, the client flattens the document: each unique path gets a deterministic **selector** hash, and each node gets an entry in the `sv` vector carrying index terms for its JSON type: + +| JSON node type | Index term | Equality (`=`, `<>`, `GROUP BY`) | Ordering (`<` … `>=`, `MIN`/`MAX`) | +| --- | --- | --- | --- | +| Object | `hm` (HMAC-256) | Yes | No | +| Array | `hm` (HMAC-256) | Yes | No | +| Boolean / JSON `null` | `hm` (HMAC-256) | Yes | No | +| String | `oc` (CLLW ORE, string domain) | Yes | Yes | +| Number | `oc` (CLLW ORE, numeric domain) | Yes | Yes | + +Each entry carries exactly one of `hm` or `oc` β€” the domain `CHECK` enforces the exclusivity. `hm` is a deterministic hash, so it supports equality only. `oc` is a CLLW ORE term that reveals ordering and, being deterministic, collapses to equality on matching selectors β€” `eql_v3.eq_term` reads whichever term an entry carries, so equality works uniformly across all node types. + +JSON `null` here means a `null` literal *inside* the document. A SQL `NULL` column value is not encrypted at all. + +## Blocked native jsonb operators + +These native PostgreSQL `jsonb` operators are **blocked** on `eql_v3.json`. 
They raise an error rather than silently running plaintext-jsonb semantics against the encrypted payload: + +- Key/path existence: `?`, `?|`, `?&`, `@?`, `@@` +- Path extraction: `#>`, `#>>` +- Mutation: `-`, `#-`, `||` +- Root-document comparison: `=`, `<>`, `<`, `<=`, `>`, `>=` + +Use containment (`@>` / `<@`), field access (`->` / `->>`), or the `eql_v3.jsonb_path_*` functions instead. There is no server-side mutation of an encrypted document β€” updates re-encrypt in the client. + + +**Operands must be typed** (`doc -> 'email'::text`, not `doc -> 'email'`) β€” an untyped operand resolves the native `jsonb` operator, bypassing both the encrypted operator and the blockers. See [Core concepts](/reference/eql/core-concepts). + + +## Containment: `@>` and `<@` + +`@>` tests whether the encrypted document contains a structure; `<@` is the reverse. Build the needle with the client and cast it to `eql_v3.ste_vec_query` (a typed `eql_v3.json` or `eql_v3.ste_vec_entry` operand also works): + +```sql +SELECT * FROM orders +WHERE metadata @> $1::eql_v3.ste_vec_query; +``` + +This is the encrypted equivalent of the plaintext `metadata @> '{"customer": {"tier": "premium"}}'`: containment checks that every encrypted term in the needle exists in the document's `sv` vector. `eql_v3.to_ste_vec_query(doc)` converts a stored document into the needle shape, and `eql_v3.ste_vec_contains(a, b)` is the function form backing `@>`. + +For large tables, back containment with a GIN index. The typed `@>` overload inlines to a native `jsonb @>` over `eql_v3.to_ste_vec_query(col)::jsonb`, so a GIN index on that same expression engages: + +```sql +CREATE INDEX orders_metadata_gin + ON orders USING gin (eql_v3.to_ste_vec_query(metadata)::jsonb jsonb_path_ops); +ANALYZE orders; +``` + +See [Indexes](/reference/eql/indexes) for the full recipes. 
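Containment over a vector of entries can be modelled in a few lines. The sketch below is a toy in Python under loud assumptions: it indexes scalar leaves only, uses a truncated HMAC for both the selector and the term, and carries no ciphertext. The key, the path syntax, and the names `flatten`, `ste_vec`, and `contains` are hypothetical and do not reflect EQL's actual payload encoding.

```python
import hashlib
import hmac
import json

KEY = b"hypothetical-keyset"  # stand-in for client-side key material


def mac(label, value):
    """Deterministic keyed hash, truncated for readability."""
    message = f"{label}:{value}".encode("utf-8")
    return hmac.new(KEY, message, hashlib.sha256).hexdigest()[:16]


def flatten(node, path="$"):
    """Yield (path, serialized leaf) pairs for every scalar leaf in a JSON-like value."""
    if isinstance(node, dict):
        for key, child in node.items():
            yield from flatten(child, f"{path}.{key}")
    elif isinstance(node, list):
        for child in node:
            yield from flatten(child, f"{path}[]")
    else:
        yield path, json.dumps(node)


def ste_vec(doc):
    """Toy vector: one (selector, term) pair per leaf."""
    return {(mac("selector", path), mac("term", leaf)) for path, leaf in flatten(doc)}


def contains(document, needle):
    """Toy `@>`: every entry in the needle must appear in the document's vector."""
    return ste_vec(needle) <= ste_vec(document)


ORDER = {"customer": {"tier": "premium", "region": "apac"}, "items": ["sku-1042", "sku-2210"]}
print(contains(ORDER, {"customer": {"tier": "premium"}}))
print(contains(ORDER, {"customer": {"tier": "basic"}}))
```

The first check succeeds and the second fails, and in neither case does the comparison touch a plaintext value: only selector and term pairs are compared, which is the property that lets a GIN index serve the lookup.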
+ +## Field access: `->` and `->>` + +Fields are addressed by **selector hash** β€” the deterministic identifier the client emits for a JSON path during encryption β€” not a plaintext path string like `$.customer.tier`. + +```sql +-- Field access by selector (returns eql_v3.ste_vec_entry) +SELECT metadata -> 'selector_hash'::text FROM orders; + +-- The entry serialized as text (ciphertext JSON, not decrypted plaintext) +SELECT metadata ->> 'selector_hash'::text FROM orders; + +-- Array element by 0-based index +SELECT metadata -> 0 FROM orders; +``` + +The extracted `eql_v3.ste_vec_entry` is itself comparable: + +- `=` / `<>` resolve via `eql_v3.eq_term` β€” works on every node type +- `<` / `<=` / `>` / `>=` resolve via `eql_v3.ore_cllw` β€” String and Number leaves only +- `MIN` / `MAX` over an extracted ordered leaf use the `eql_v3.min` / `eql_v3.max` aggregates + +```sql +-- Equality on an extracted leaf +SELECT * FROM orders +WHERE metadata -> 'email_selector'::text = $1::eql_v3.ste_vec_entry; + +-- Group by an extracted leaf's equality term +SELECT eql_v3.eq_term(metadata -> 'region_selector'::text) AS region, COUNT(*) +FROM orders +GROUP BY eql_v3.eq_term(metadata -> 'region_selector'::text); +``` + +A hash index on `eql_v3.eq_term(col -> ''::text)` engages the equality lookup; a btree on `eql_v3.ore_cllw(...)` engages range and `ORDER BY`. See [Indexes](/reference/eql/indexes). + +## Path queries and array helpers + +The function forms take the same selector hashes: + +```sql +-- All entries matching a selector +SELECT eql_v3.jsonb_path_query(metadata, 'selector_hash') FROM orders; + +-- First match only +SELECT eql_v3.jsonb_path_query_first(metadata, 'selector_hash') FROM orders; + +-- Does the selector exist in this document? 
+SELECT eql_v3.jsonb_path_exists(metadata, 'selector_hash') FROM orders; +``` + +For encrypted array nodes: + +```sql +SELECT eql_v3.jsonb_array_length(metadata -> 'items_selector'::text) FROM orders; +SELECT eql_v3.jsonb_array_elements(metadata -> 'items_selector'::text) FROM orders; +SELECT eql_v3.jsonb_array_elements_text(metadata -> 'items_selector'::text) FROM orders; +``` + +`jsonb_array_elements` yields encrypted entries; `jsonb_array_elements_text` yields each element as ciphertext text. + +## Worked example + +An `orders` table with an encrypted `metadata` document. The plaintext your application works with: + +```json +{ + "customer": { + "tier": "premium", + "region": "apac" + }, + "items": ["sku-1042", "sku-2210"] +} +``` + +The client encrypts this into a ste_vec payload with selectors for `$`, `$.customer`, `$.customer.tier`, `$.customer.region`, `$.items`, and each array element β€” every path becomes queryable. + + + + +### Create the table and insert + +```sql +CREATE TABLE orders ( + id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY, + metadata eql_v3.json +); + +INSERT INTO orders (metadata) VALUES ($1); +-- $1 is the encrypted ste_vec payload produced by the Stack SDK or Proxy +``` + + + + +### Query by containment + +Find premium orders. The client encrypts the needle `{"customer": {"tier": "premium"}}` into a `ste_vec_query`: + +```sql +SELECT id FROM orders +WHERE metadata @> $1::eql_v3.ste_vec_query; +``` + +Add the GIN index from above once the table grows. + + + + +### Query by path + +Count orders per region, grouping on the encrypted leaf β€” the database never sees `"apac"`: + +```sql +SELECT eql_v3.eq_term(metadata -> 'region_selector'::text) AS region, COUNT(*) +FROM orders +WHERE eql_v3.jsonb_path_exists(metadata, 'region_selector') +GROUP BY 1; +``` + +The rows come back as ciphertext; decrypt them in the client. 
+
+## Where to next
+
+  The envelope anatomy, typed-operand rule, and fail-loud behavior shared by every EQL type.
+
+  GIN containment and field-level functional index recipes.
+
+  WHERE-clause patterns across all encrypted types.
+
diff --git a/content/docs/reference/eql/meta.json b/content/docs/reference/eql/meta.json
new file mode 100644
index 0000000..f4268ce
--- /dev/null
+++ b/content/docs/reference/eql/meta.json
@@ -0,0 +1,19 @@
+{
+  "title": "EQL",
+  "pages": [
+    "core-concepts",
+    "---Types---",
+    "numbers",
+    "dates-and-times",
+    "text",
+    "json",
+    "booleans",
+    "---Indexes---",
+    "indexes",
+    "---Queries---",
+    "filtering",
+    "sorting",
+    "grouping-and-aggregates",
+    "joins"
+  ]
+}
diff --git a/content/docs/reference/eql/numbers.mdx b/content/docs/reference/eql/numbers.mdx
new file mode 100644
index 0000000..2d9a2a0
--- /dev/null
+++ b/content/docs/reference/eql/numbers.mdx
@@ -0,0 +1,161 @@
+---
+title: Numbers
+description: "The complete reference for encrypted numeric columns: the int, float, and numeric domain variants, the ORE-backed payload they carry, and range, ORDER BY, and MIN/MAX queries."
+type: reference
+components: [eql]
+verifiedAgainst:
+  eql: "3.0.0"
+---
+
+Six numeric types share one identical query surface: `int2`, `int4`, `int8`, `float4`, `float8`, and `numeric`. These are the columns you filter by range, sort, and take a `MIN` / `MAX` over: salaries, totals, rates, quantities.
+
+Date and time columns have the same capabilities but their own semantics; see [Dates & times](/reference/eql/dates-and-times). There is no free-text matching for numeric types: `_match` and `_search` are [text-only variants](/reference/eql/text).
+
+## Variants
+
+Each numeric type generates the same `jsonb`-backed domain variants. The generic form:
+
+| Domain variant | Capability |
+| --- | --- |
+| `eql_v3.` | Storage and decryption only. |
+| `eql_v3._eq` | Equality: `=`, `<>`, `IN`, `GROUP BY`, `DISTINCT`, equijoins.
| +| `eql_v3._ord` | Comparisons, `BETWEEN`, `ORDER BY`, `MIN` / `MAX` β€” plus equality. | +| `eql_v3._ord_ore` | As `_ord`, with the ORE mechanism pinned β€” see [SEM specifiers](#sem-specifiers). | + +And every concrete domain this page covers: + +| Type | Variants | +| --- | --- | +| `int2` | `eql_v3.int2` Β· `eql_v3.int2_eq` Β· `eql_v3.int2_ord` Β· `eql_v3.int2_ord_ore` | +| `int4` | `eql_v3.int4` Β· `eql_v3.int4_eq` Β· `eql_v3.int4_ord` Β· `eql_v3.int4_ord_ore` | +| `int8` | `eql_v3.int8` Β· `eql_v3.int8_eq` Β· `eql_v3.int8_ord` Β· `eql_v3.int8_ord_ore` | +| `float4` | `eql_v3.float4` Β· `eql_v3.float4_eq` Β· `eql_v3.float4_ord` Β· `eql_v3.float4_ord_ore` | +| `float8` | `eql_v3.float8` Β· `eql_v3.float8_eq` Β· `eql_v3.float8_ord` Β· `eql_v3.float8_ord_ore` | +| `numeric` | `eql_v3.numeric` Β· `eql_v3.numeric_eq` Β· `eql_v3.numeric_ord` Β· `eql_v3.numeric_ord_ore` | + +Declare only the capability you query on β€” each capability stores extra searchable material with defined leakage (see [Searchable encryption](/concepts/searchable-encryption)), and the variant model itself is covered in [Core concepts](/reference/eql/core-concepts). + +### Example + +A payroll table mixing the variants by how each column is queried: + +```sql +CREATE TABLE employees ( + id bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY, + salary eql_v3.int8_ord, -- range queries, ORDER BY, MIN/MAX + tax_rate eql_v3.numeric_eq, -- exact lookup only + net_worth eql_v3.numeric -- store and decrypt only, never queried +); +``` + +### SEM specifiers + +All six types take the same mechanism specifiers on their orderable variant (the concept is defined in [Core concepts](/reference/eql/core-concepts#sem-specifiers)): + +| Specifier | Meaning | +| --- | --- | +| `_ord` | Orderable, using EQL's default mechanism (currently ORE). | +| `_ord_ore` | Orderable via ORE, pinned explicitly. 
|
+
+The EQL v3 release adds an OPE specifier for every orderable type; unspecified `_ord` columns keep tracking the default.
+
+## Payload
+
+A value for an `_ord` column carries the shared envelope keys (`v`, `i`, `c`; see [Core concepts](/reference/eql/core-concepts)) plus the `ob` ordering term. Here is a payload for the `eql_v3.int8_ord` `salary` column:
+
+```json
+{
+  "v": 3,
+  "i": { "t": "employees", "c": "salary" },
+  "c": "mBbKmsMM%bK#QQOx1yLDBHyD...",
+  "ob": [
+    "7a1fd0c2...", "d24c9be1...", "03fa66b8...", "91b7e04d...",
+    "5c28aa19...", "e6f3071c...", "48d92ab5...", "0b64cf37..."
+  ]
+}
+```
+
+- **`ob` is the only index term.** An `_ord` payload carries no `hm`: equality on `_ord` variants compares ORE terms, which collapse to equality; see [Core concepts](/reference/eql/core-concepts). Only `_eq` payloads carry `hm` (a single hex HMAC-SHA-256 string) instead of `ob`.
+- **The `ob` block count varies with the plaintext width**: 8 blocks for the int types, 14 for `numeric`.
+
+## Operators
+
+| SQL operator | `eql_v3.` | `_eq` | `_ord` / `_ord_ore` |
+| --- | :---: | :---: | :---: |
+| `=` / `<>` | ❌ | ✅ | ✅ |
+| `<` `<=` `>` `>=` | ❌ | ❌ | ✅ |
+| `BETWEEN` (desugars to `>=` and `<=`) | ❌ | ❌ | ✅ |
+| `IN` (desugars to `=`) | ❌ | ✅ | ✅ |
+| `GROUP BY` / `DISTINCT` | ❌ | ✅ | ✅ |
+| `ORDER BY` | ❌ | ❌ | ✅ |
+| `IS NULL` / `IS NOT NULL` | ✅ | ✅ | ✅ |
+
+Blocked *operator* cells raise an `operator … is not supported` exception; they never silently return wrong rows. `ORDER BY` is the one blocked cell that doesn't raise: it isn't an operator, so sorting a variant without an ordering term runs, but the order is meaningless (see [Sorting](/reference/eql/sorting)). Operands must be typed (`$1::eql_v3.int8_ord`), or PostgreSQL resolves the native `jsonb` operator instead of the encrypted one. Both rules are covered in [Core concepts](/reference/eql/core-concepts).
+ +## Functions + +Every operator has a function form, for managed platforms that disallow custom operators β€” same typed arguments, identical resolution. The `MIN` / `MAX` aggregates only exist as functions: + +| Function | Equivalent | Available on | +| --- | --- | --- | +| `eql_v3.eq(a, b)` / `eql_v3.neq(a, b)` | `=` / `<>` | `_eq`, `_ord` / `_ord_ore` | +| `eql_v3.lt` / `lte` / `gt` / `gte` | `<` `<=` `>` `>=` | `_ord` / `_ord_ore` | +| `eql_v3.min(col)` / `eql_v3.max(col)` | aggregate `MIN` / `MAX` | `_ord` / `_ord_ore` | + +**`SUM`, `AVG`, and other arithmetic aggregates are not supported** on encrypted columns β€” they would require homomorphic encryption. `MIN` / `MAX` work because they only need comparison; for sums and averages, decrypt at the application boundary and aggregate client-side. + +## Example queries + +### Range filter + +```sql +SELECT * FROM employees +WHERE salary >= $1::eql_v3.int8_ord; + +SELECT * FROM employees +WHERE salary BETWEEN $1::eql_v3.int8_ord AND $2::eql_v3.int8_ord; +``` + +### MIN and MAX + +`eql_v3.min` / `eql_v3.max` compare ORE terms β€” no decryption happens in the database, and the encrypted result decrypts in the client. `NULL` inputs are skipped; an all-`NULL` input set returns `NULL`: + +```sql +SELECT eql_v3.min(salary) FROM employees; +SELECT eql_v3.max(salary) FROM employees; +``` + +### Sorted listing + +Write the sort key in extractor form to stream rows out of the index already ordered (see [Sorting](/reference/eql/sorting) for why): + +```sql +SELECT * FROM employees +ORDER BY eql_v3.ord_term(salary) DESC +LIMIT 10; +``` + +### Cast at the call site + +On a generic `jsonb` column whose payloads already carry the `ob` term, cast to the right domain in the query: + +```sql +SELECT eql_v3.min(salary_jsonb::eql_v3.int8_ord) FROM employees; +``` + +## Where to next + + + + The same capabilities on date and timestamp columns. + + + Btree recipes on `eql_v3.ord_term` for range, ORDER BY, and MIN/MAX. 
+ + + WHERE-clause patterns across all encrypted types. + + + GROUP BY, DISTINCT, and the aggregate surface on encrypted columns. + + diff --git a/content/docs/reference/eql/sorting.mdx b/content/docs/reference/eql/sorting.mdx new file mode 100644 index 0000000..eb34f12 --- /dev/null +++ b/content/docs/reference/eql/sorting.mdx @@ -0,0 +1,96 @@ +--- +title: Sorting +description: "ORDER BY on encrypted columns: which variants sort, when to write the sort key in extractor form, keyset pagination, and the ::jsonb projection trap." +type: reference +components: [eql] +verifiedAgainst: + eql: "3.0.0" +--- + +`ORDER BY` on an encrypted column needs an ORE ordering term: it works on `_ord` / `_ord_ore` variants of every scalar and on `text_search`. ORE terms are order-revealing: comparing two terms gives the same result as comparing their plaintexts, so the database sorts ciphertext in exactly the order the plaintext would sort — without decrypting anything. Which variants carry the term is covered in [Numbers](/reference/eql/numbers), [Dates & times](/reference/eql/dates-and-times), and [Text](/reference/eql/text); the variant model itself is in [Core concepts](/reference/eql/core-concepts). + +Sorting a variant *without* an ORE term (`_eq`, `text_match`, bare storage variants) won't raise — but the order is meaningless. Type the column as an `_ord` variant when ordering matters. + +## Bare form vs extractor form + +Both of these sort correctly: + +```sql +-- Bare form +SELECT * FROM users ORDER BY created_at DESC; + +-- Extractor form +SELECT * FROM users ORDER BY eql_v3.ord_term(created_at) DESC; +``` + +The difference is the plan. The planner inlines encrypted operators in *predicates*, so a `WHERE created_at < $1` matches a btree on `eql_v3.ord_term(created_at)` without rewriting — but it does **not** rewrite *sort keys*. Bare `ORDER BY created_at` therefore adds a `Sort` node above the scan, and that sort's cost scales linearly with the rows passing the filter. 
+ +Writing the sort key in extractor form makes it textually match the index expression, so rows stream out of the btree already ordered β€” no `Sort` node at all: + +```sql +CREATE INDEX users_created_at_ord + ON users USING btree (eql_v3.ord_term(created_at)); +ANALYZE users; + +SELECT * FROM users + WHERE created_at < $1::eql_v3.timestamp_ord + ORDER BY eql_v3.ord_term(created_at) DESC + LIMIT 10; +-- Index Scan Backward using users_created_at_ord β€” no Sort node +``` + +At large row counts this is the difference between seconds and milliseconds, and it matters most for `LIMIT` queries: with a `Sort` node, Postgres must sort *every* matching row before it can return the top 10; streaming from the index, it stops after 10. + +Rule of thumb: bare form is fine for small result sets or when no ordering index exists; any hot query with `ORDER BY ... LIMIT` should use the extractor form. Confirm with `EXPLAIN (COSTS OFF)` β€” a `Sort` node above an `Index Scan` means the sort key didn't match the index. Full plan-reading guidance is in [Indexes](/reference/eql/indexes). + +## `ASC`, `DESC`, and `NULLS` + +`ASC` / `DESC` behave normally β€” a btree serves both directions (backward scans handle `DESC`). SQL `NULL` column values are not encrypted, so `NULLS FIRST` / `NULLS LAST` also behave normally: + +```sql +SELECT * FROM users +ORDER BY eql_v3.ord_term(last_login) DESC NULLS LAST; +``` + +## Keyset pagination + +`OFFSET` pagination degrades on encrypted columns the same way it does on plaintext ones β€” every page re-sorts and discards the rows before the offset. 
Keyset (cursor) pagination composes an encrypted range filter with an extractor-form sort: + +```sql +-- Page 1 +SELECT id, email, created_at FROM users + ORDER BY eql_v3.ord_term(created_at) DESC + LIMIT 20; + +-- Next page: pass the last row's created_at back, re-encrypted as the cursor +SELECT id, email, created_at FROM users + WHERE created_at < $1::eql_v3.timestamp_ord + ORDER BY eql_v3.ord_term(created_at) DESC + LIMIT 20; +``` + +Both the filter and the sort ride the same btree on `eql_v3.ord_term(created_at)`, so every page is an index scan that stops after 20 rows. The client re-encrypts the cursor value for the next request β€” the database only ever sees ciphertext. + +## The `::jsonb` projection trap + + +If you project the column with a cast and sort on it β€” `SELECT col::jsonb ... ORDER BY col` β€” Postgres folds the cast into the scan and uses `(col)::jsonb` as the sort key, which matches no index. Project the column raw and let the client decode it, or write the sort key as `eql_v3.ord_term(col)`, which sidesteps the problem entirely. + + +## Sorting extracted JSON leaves + +String and Number leaves inside an encrypted JSON document carry a CLLW ORE term, so they sort too β€” the extractor is `eql_v3.ore_cllw` on the extracted entry: + +```sql +SELECT * FROM orders +ORDER BY eql_v3.ore_cllw(metadata -> 'total_selector'::text) DESC +LIMIT 10; +``` + +A btree on the same `eql_v3.ore_cllw(...)` expression streams this ordered, exactly like `ord_term` on a scalar column. Selectors, node types, and which leaves are orderable are covered in [JSON](/reference/eql/json). + +## Where to go next + +- [Indexes](/reference/eql/indexes) β€” the btree recipe behind every sort on this page, plus `EXPLAIN` verification and large-table build guidance. +- [Filtering](/reference/eql/filtering) β€” the range predicates that pair with these sorts. +- [Grouping & aggregates](/reference/eql/grouping-and-aggregates) β€” `MIN` / `MAX`, which use the same ordering term. 
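The keyset pagination pattern from this page can be sketched client-side as a small query builder. This is a minimal illustration, not an SDK API: `keysetPage` and `KeysetQuery` are hypothetical names, the cursor is assumed to be an already-encrypted payload produced by the client, and the table and column arguments are assumed to be trusted identifiers because the sketch does no quoting.

```typescript
// Hypothetical helper for illustration; not part of EQL or the Stack SDK.
interface KeysetQuery {
  text: string;     // SQL text with a typed placeholder when a cursor is given
  values: string[]; // bind values: the encrypted cursor payload, if any
}

// Build one page of an extractor-form, descending keyset query.
// `table` and `column` must be trusted identifiers: nothing is escaped here.
function keysetPage(
  table: string,
  column: string,
  domain: string,
  pageSize: number,
  encryptedCursor?: string,
): KeysetQuery {
  // The sort key is written in extractor form so it matches a btree on
  // eql_v3.ord_term(column) and no Sort node is needed.
  const sortKey = `eql_v3.ord_term(${column})`;
  const where =
    encryptedCursor === undefined ? "" : ` WHERE ${column} < $1::${domain}`;
  return {
    text: `SELECT * FROM ${table}${where} ORDER BY ${sortKey} DESC LIMIT ${pageSize}`,
    values: encryptedCursor === undefined ? [] : [encryptedCursor],
  };
}
```

Calling it without a cursor yields the first page; passing the last row's re-encrypted sort value yields the next one, with the filter and the sort both served by the same index expression.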
diff --git a/content/docs/reference/eql/text.mdx b/content/docs/reference/eql/text.mdx new file mode 100644 index 0000000..f08eafc --- /dev/null +++ b/content/docs/reference/eql/text.mdx @@ -0,0 +1,183 @@ +--- +title: Text +description: "The complete reference for encrypted text columns: all six text domain variants, the multi-term payload, why LIKE is gone everywhere, and bloom-filter token containment as the encrypted free-text match." +type: reference +components: [eql] +verifiedAgainst: + eql: "3.0.0" +--- + +Text is the richest encrypted scalar. Beyond the four variants every scalar type gets, `text` adds two of its own: `text_match` for encrypted free-text matching, and `text_search` for columns you need to look up, sort, *and* search. Emails, names, tax IDs, addresses — this page is the full surface for all of them. + +## Variants + +All six are `jsonb`-backed domains. Which one you declare fixes the column's query capability — the variant model itself is covered in [Core concepts](/reference/eql/core-concepts): + +| Domain variant | Capability | +| --- | --- | +| `eql_v3.text` | Storage and decryption only. | +| `eql_v3.text_eq` | Equality: `=`, `<>`, `IN`, `GROUP BY`, `DISTINCT`, equijoins. | +| `eql_v3.text_ord` | Comparisons, `BETWEEN`, `ORDER BY`, `MIN` / `MAX` — plus equality. | +| `eql_v3.text_ord_ore` | As `text_ord`, with the ORE mechanism pinned — see [SEM specifiers](#sem-specifiers). | +| `eql_v3.text_match` | Free-text token containment: `@>` / `<@`. | +| `eql_v3.text_search` | Equality + ordering + token containment. | + +Declare only the capabilities you query on — each capability stores extra searchable material with defined leakage (see [Searchable encryption](/concepts/searchable-encryption)). 
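One way to apply the "declare only what you query on" rule is to derive the variant from the capabilities a column needs. The sketch below is a hypothetical schema-design helper (`pickTextVariant` and `TextNeeds` are not EQL names) that returns the narrowest variant from the table above covering the requested capabilities.

```typescript
// Hypothetical helper for schema design; not part of EQL.
interface TextNeeds {
  equality?: boolean; // =, <>, IN, GROUP BY, DISTINCT
  ordering?: boolean; // comparisons, BETWEEN, ORDER BY, MIN / MAX
  match?: boolean;    // free-text token containment (@> / <@)
}

// Return the narrowest text variant from the Variants table that covers
// every requested capability.
function pickTextVariant(needs: TextNeeds): string {
  const { equality = false, ordering = false, match = false } = needs;
  // Per the table, text_search is the only variant that combines token
  // containment with equality or ordering.
  if (match && (equality || ordering)) return "eql_v3.text_search";
  if (match) return "eql_v3.text_match";
  // Ordering variants also answer equality, so they cover both cases.
  if (ordering) return "eql_v3.text_ord";
  if (equality) return "eql_v3.text_eq";
  return "eql_v3.text";
}
```

Note that a column needing only equality plus matching still lands on `text_search`, which also stores an ordering term; that extra term is part of the trade-off to weigh against the leakage guidance linked above.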
+ +### Example + +A users table mixing the variants by how each column is queried: + +```sql +CREATE TABLE users ( + id bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY, + email eql_v3.text_search, -- lookup, sort, and free-text match + name eql_v3.text_match, -- free-text match only + tax_id eql_v3.text_eq, -- exact lookup only + notes eql_v3.text -- store and decrypt only +); +``` + +### SEM specifiers + +Text takes the same mechanism specifiers as the other orderable types (the concept is defined in [Core concepts](/reference/eql/core-concepts#sem-specifiers)): + +| Specifier | Meaning | +| --- | --- | +| `_ord` | Orderable, using EQL's default mechanism (currently ORE). | +| `_ord_ore` | Orderable via ORE, pinned explicitly. | + +The EQL v3 release adds an OPE specifier for every orderable type — including `text` — so lexicographic ordering can be pinned to either mechanism; unspecified `_ord` columns keep tracking the default. + +## Payload + +A value for a `text_search` column carries the shared envelope keys (`v`, `i`, `c` — see [Core concepts](/reference/eql/core-concepts)) plus all three index terms: + +```json +{ + "v": 3, + "i": { "t": "users", "c": "email" }, + "c": "mBbKmsMM%bK#QQOx1yLDBHyD...", + "hm": "9c8ec1d2f9932b979b1bf3f09f8a4e2f6a41f8de2f0c8b7a52e1f5c3d4b6a790", + "ob": ["7a1fd0c2...", "d24c9be1...", "03fa66b8..."], + "bf": [42, 1290, -8113, 30201] +} +``` + +- `hm` — equality term: `WHERE email = $1` compares this +- `ob` — ordering term: `ORDER BY` and range comparisons walk these blocks +- `bf` — bloom-filter term: `@>` token containment tests these bit positions + +The narrower variants carry only their own term: a `text_eq` payload carries `hm` only, `text_match` carries `bf` only, and `text_ord` / `text_ord_ore` carry `ob` only (no `hm` — equality on `_ord` variants compares ORE terms, see [Core concepts](/reference/eql/core-concepts)). A payload missing its variant's required term fails the domain `CHECK` at write time. 
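The term-per-variant rule above can be mirrored in a client-side check that runs before a write. This is a rough sketch under stated assumptions: `REQUIRED_TERMS` and `missingTerms` are hypothetical names, and the function only tests for the presence of each index term. The real domain `CHECK` lives in the database and may validate more than this (envelope keys, value shapes, and so on).

```typescript
// Hypothetical pre-write check; it approximates, and does not replace,
// the domain CHECK that EQL enforces in the database.
type TextVariant =
  | "text" | "text_eq" | "text_ord" | "text_ord_ore"
  | "text_match" | "text_search";
type IndexTerm = "hm" | "ob" | "bf";

// Index terms each variant's payload carries, per the paragraph above.
const REQUIRED_TERMS: Record<TextVariant, IndexTerm[]> = {
  text: [],
  text_eq: ["hm"],
  text_ord: ["ob"],
  text_ord_ore: ["ob"],
  text_match: ["bf"],
  text_search: ["hm", "ob", "bf"],
};

// Return the index terms the payload lacks for the given variant.
function missingTerms(
  variant: TextVariant,
  payload: Record<string, unknown>,
): IndexTerm[] {
  return REQUIRED_TERMS[variant].filter((term) => !(term in payload));
}
```

An empty result means the payload carries every term the variant needs; a non-empty one names what to fix before the write is attempted.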
+ +**`bf` positions are signed**: EQL stores the filter as PostgreSQL `smallint[]`, and filters sized above 32768 emit upper-half bit positions as *negative* signed values. Consumers must use a signed 16-bit integer type. + +## Operators + +| SQL operator | `eql_v3.text` | `text_eq` | `text_ord` / `text_ord_ore` | `text_match` | `text_search` | +| --- | :---: | :---: | :---: | :---: | :---: | +| `=` / `<>` | ❌ | ✅ | ✅ | ❌ | ✅ | +| `<` `<=` `>` `>=` | ❌ | ❌ | ✅ | ❌ | ✅ | +| `@>` / `<@` | ❌ | ❌ | ❌ | ✅ | ✅ | +| `LIKE` / `ILIKE` (`~~` / `~~*`) | ❌ | ❌ | ❌ | ❌ | ❌ | +| `IN` / `GROUP BY` / `DISTINCT` | ❌ | ✅ | ✅ | ❌ | ✅ | +| `ORDER BY` | ❌ | ❌ | ✅ | ❌ | ✅ | +| `IS NULL` / `IS NOT NULL` | ✅ | ✅ | ✅ | ✅ | ✅ | + +Blocked *operator* cells raise an `operator … is not supported` exception — they never silently return wrong rows. `ORDER BY` is the one blocked cell that doesn't raise: it isn't an operator, so sorting a variant without an ordering term runs — but the order is meaningless (see [Sorting](/reference/eql/sorting)). Operands must be typed (`$1::eql_v3.text_eq`), or PostgreSQL resolves the native `jsonb` operator instead of the encrypted one. Both rules are covered in [Core concepts](/reference/eql/core-concepts). + +## Functions + +Every operator has a function form, for managed platforms that disallow custom operators — same typed arguments, identical resolution. 
The `MIN` / `MAX` aggregates only exist as functions: + +| Function | Equivalent | Available on | +| --- | --- | --- | +| `eql_v3.eq(a, b)` / `eql_v3.neq(a, b)` | `=` / `<>` | `text_eq`, `text_ord` / `text_ord_ore`, `text_search` | +| `eql_v3.lt` / `lte` / `gt` / `gte` | `<` `<=` `>` `>=` | `text_ord` / `text_ord_ore`, `text_search` | +| `eql_v3.contains(a, b)` / `eql_v3.contained_by(a, b)` | `@>` / `<@` | `text_match`, `text_search` | +| `eql_v3.min(col)` / `eql_v3.max(col)` | aggregate `MIN` / `MAX` | `text_ord` / `text_ord_ore`, `text_search` | + +There are no `like` / `ilike` function forms β€” encrypted text matching is `eql_v3.contains` on a `text_match` value. + +## There is no `LIKE` + +`LIKE` and `ILIKE` (`~~` / `~~*`) raise on **every** encrypted-domain variant β€” including `text_match` and `text_search`. SQL pattern matching is meaningless on ciphertext. Encrypted text matching is bloom-filter token containment β€” `@>` on a `text_match` or `text_search` column: + +```sql +-- ❌ Raises: operator not supported +SELECT * FROM users WHERE email LIKE '%alice%'; + +-- βœ… Encrypted free-text match +SELECT * FROM users WHERE email @> $1::eql_v3.text_match; +``` + +`@>` / `<@` here is **probabilistic ngram-bloom containment** β€” it tests whether the encrypted text contains the (encrypted) search terms. It is not JSONB containment and not `LIKE`. The client encrypts the search term into a bloom-filter query value; false positives are possible, false negatives are not. There are no `like` / `ilike` function forms either β€” text matching is `eql_v3.contains` on a `text_match` value. + +## Example queries + +### Exact lookup + +Equality on a `text_eq` column compares HMAC terms. 
`IN` desugars to `=`: + +```sql +SELECT * FROM users WHERE tax_id = $1::eql_v3.text_eq; + +SELECT * FROM users +WHERE tax_id IN ($1::eql_v3.text_eq, $2::eql_v3.text_eq); +``` + +### Free-text match + +The client encrypts the search term into the bloom-filter needle: + +```sql +SELECT * FROM users WHERE name @> $1::eql_v3.text_match; + +-- Function form, for platforms without custom operators +SELECT * FROM users WHERE eql_v3.contains(name, $1::eql_v3.text_match); +``` + +### The works: `text_search` + +A `text_search` column answers exact lookup, free-text match, and ordering β€” here, all three in one query: + +```sql +SELECT id, email FROM users +WHERE email @> $1::eql_v3.text_match -- token containment on bf + AND email <> $2::eql_v3.text_eq -- exclude an exact value via hm +ORDER BY eql_v3.ord_term(email) -- sort on ob +LIMIT 20; +``` + +### Sorting text + +ORE terms are order-preserving, so `ORDER BY` sorts encrypted text correctly. Write the sort key in extractor form so a btree index can do the ordering instead of a `Sort` node β€” see [Sorting](/reference/eql/sorting): + +```sql +SELECT * FROM users +ORDER BY eql_v3.ord_term(email) +LIMIT 50; +``` + +`MIN` / `MAX` work on any ord-capable text column too: + +```sql +SELECT eql_v3.min(email) FROM users; +``` + +## Where to next + + + + Hash on `eq_term`, btree on `ord_term`, GIN on `match_term`. + + + WHERE-clause patterns across all encrypted types. + + + Extractor-form sort keys and index-backed ordering. + + + Equijoins on encrypted text columns, and the same-keyset rule. + + diff --git a/content/docs/reference/index.mdx b/content/docs/reference/index.mdx new file mode 100644 index 0000000..997dff4 --- /dev/null +++ b/content/docs/reference/index.mdx @@ -0,0 +1,9 @@ +--- +title: Reference +description: "Precise API documentation for EQL, the Stack SDK, Auth, the CLI, and Proxy." 
+type: reference +--- + +This section is being built as part of the docs V2 overhaul ([CIP-3307](https://linear.app/cipherstash/issue/CIP-3307)). Track progress in [IA.md](https://github.com/cipherstash/docs/blob/v2/IA.md). + +Until it lands, current documentation lives in the [existing docs](/stack). diff --git a/content/docs/reference/meta.json b/content/docs/reference/meta.json new file mode 100644 index 0000000..b74408a --- /dev/null +++ b/content/docs/reference/meta.json @@ -0,0 +1,5 @@ +{ + "title": "Reference", + "icon": "Library", + "pages": ["..."] +} diff --git a/content/docs/reference/proxy/index.mdx b/content/docs/reference/proxy/index.mdx new file mode 100644 index 0000000..8e59184 --- /dev/null +++ b/content/docs/reference/proxy/index.mdx @@ -0,0 +1,8 @@ +--- +title: Proxy +description: "Proxy documentation — being built as part of the docs V2 overhaul." +--- + +This section is being built as part of the docs V2 overhaul ([CIP-3307](https://linear.app/cipherstash/issue/CIP-3307)). Track progress in [IA.md](https://github.com/cipherstash/docs/blob/v2/IA.md). + +Until it lands, current documentation lives in the [existing docs](/stack). diff --git a/content/docs/reference/proxy/meta.json b/content/docs/reference/proxy/meta.json new file mode 100644 index 0000000..85de4fd --- /dev/null +++ b/content/docs/reference/proxy/meta.json @@ -0,0 +1,4 @@ +{ + "title": "Proxy", + "pages": ["..."] +} diff --git a/content/docs/reference/stack/index.mdx b/content/docs/reference/stack/index.mdx new file mode 100644 index 0000000..edac1c3 --- /dev/null +++ b/content/docs/reference/stack/index.mdx @@ -0,0 +1,8 @@ +--- +title: Stack SDK +description: "Stack SDK documentation — being built as part of the docs V2 overhaul." +--- + +This section is being built as part of the docs V2 overhaul ([CIP-3307](https://linear.app/cipherstash/issue/CIP-3307)). Track progress in [IA.md](https://github.com/cipherstash/docs/blob/v2/IA.md). 
+ +Until it lands, current documentation lives in the [existing docs](/stack). diff --git a/content/docs/reference/stack/meta.json b/content/docs/reference/stack/meta.json new file mode 100644 index 0000000..d0f86af --- /dev/null +++ b/content/docs/reference/stack/meta.json @@ -0,0 +1,4 @@ +{ + "title": "Stack SDK", + "pages": ["..."] +} diff --git a/content/docs/reference/workspace/index.mdx b/content/docs/reference/workspace/index.mdx new file mode 100644 index 0000000..177bb82 --- /dev/null +++ b/content/docs/reference/workspace/index.mdx @@ -0,0 +1,8 @@ +--- +title: Workspace & account +description: "Workspace & account documentation — being built as part of the docs V2 overhaul." +--- + +This section is being built as part of the docs V2 overhaul ([CIP-3307](https://linear.app/cipherstash/issue/CIP-3307)). Track progress in [IA.md](https://github.com/cipherstash/docs/blob/v2/IA.md). + +Until it lands, current documentation lives in the [existing docs](/stack). diff --git a/content/docs/reference/workspace/meta.json b/content/docs/reference/workspace/meta.json new file mode 100644 index 0000000..1dc0214 --- /dev/null +++ b/content/docs/reference/workspace/meta.json @@ -0,0 +1,4 @@ +{ + "title": "Workspace & account", + "pages": ["..."] +} diff --git a/content/docs/security/compliance/index.mdx b/content/docs/security/compliance/index.mdx new file mode 100644 index 0000000..9af190d --- /dev/null +++ b/content/docs/security/compliance/index.mdx @@ -0,0 +1,8 @@ +--- +title: Compliance +description: "Compliance documentation — being built as part of the docs V2 overhaul." +--- + +This section is being built as part of the docs V2 overhaul ([CIP-3307](https://linear.app/cipherstash/issue/CIP-3307)). Track progress in [IA.md](https://github.com/cipherstash/docs/blob/v2/IA.md). + +Until it lands, current documentation lives in the [existing docs](/stack). 
diff --git a/content/docs/security/compliance/meta.json b/content/docs/security/compliance/meta.json new file mode 100644 index 0000000..e7c6fa5 --- /dev/null +++ b/content/docs/security/compliance/meta.json @@ -0,0 +1,4 @@ +{ + "title": "Compliance", + "pages": ["..."] +} diff --git a/content/docs/security/index.mdx b/content/docs/security/index.mdx new file mode 100644 index 0000000..33b00fe --- /dev/null +++ b/content/docs/security/index.mdx @@ -0,0 +1,9 @@ +--- +title: Architecture & security +description: "Trust model, components, availability, audit, and compliance — self-contained for security review." +type: concept +--- + +This section is being built as part of the docs V2 overhaul ([CIP-3307](https://linear.app/cipherstash/issue/CIP-3307)). Track progress in [IA.md](https://github.com/cipherstash/docs/blob/v2/IA.md). + +Until it lands, current documentation lives in the [existing docs](/stack). diff --git a/content/docs/security/meta.json b/content/docs/security/meta.json new file mode 100644 index 0000000..5aa7273 --- /dev/null +++ b/content/docs/security/meta.json @@ -0,0 +1,5 @@ +{ + "title": "Architecture & security", + "icon": "Shield", + "pages": ["..."] +} diff --git a/content/docs/solutions/index.mdx b/content/docs/solutions/index.mdx new file mode 100644 index 0000000..36278b5 --- /dev/null +++ b/content/docs/solutions/index.mdx @@ -0,0 +1,9 @@ +--- +title: Solutions +description: "What CipherStash solves: PII protection, HIPAA, AI/RAG, data residency, provable access." +type: concept +--- + +This section is being built as part of the docs V2 overhaul ([CIP-3307](https://linear.app/cipherstash/issue/CIP-3307)). Track progress in [IA.md](https://github.com/cipherstash/docs/blob/v2/IA.md). + +Until it lands, current documentation lives in the [existing docs](/stack). 
diff --git a/content/docs/solutions/meta.json b/content/docs/solutions/meta.json new file mode 100644 index 0000000..ac4f22b --- /dev/null +++ b/content/docs/solutions/meta.json @@ -0,0 +1,5 @@ +{ + "title": "Solutions", + "icon": "Target", + "pages": ["..."] +} diff --git a/next.config.mjs b/next.config.mjs index 5825239..990a7b3 100644 --- a/next.config.mjs +++ b/next.config.mjs @@ -1,13 +1,39 @@ import { createMDX } from "fumadocs-mdx/next"; +import { v2Redirects } from "./v2-redirects.mjs"; const withMDX = createMDX(); +// V2 IA migration (CIP-3325): the full legacy→v2 redirect map is gated so the +// preview site serves BOTH trees while sections migrate (legacy at /stack, v2 +// at the root). Flip on at merge; once content/stack is deleted the map +// becomes unconditional (CIP-3335). Coverage is enforced by +// `bun run validate-redirects` regardless of the flag. +const enableV2Redirects = process.env.ENABLE_V2_REDIRECTS === "1"; + /** @type {import('next').NextConfig} */ const config = { basePath: "/docs", reactStrictMode: true, async redirects() { return [ + // The app lives under the /docs basePath, so the bare domain root + // (e.g. on Vercel preview URLs) would otherwise 404. In production + // "/" never reaches this app — cipherstash.com routes only /docs/* + // here — so this only affects previews. + { + source: "/", + destination: "/docs", + basePath: false, + permanent: false, + }, + // Vanity URL for the new IA (safe to ship ungated: the path has no + // legacy traffic). Temporary until the v2 quickstart is canonical. + { + source: "/quickstart", + destination: "/get-started/quickstart", + permanent: false, + }, + ...(enableV2Redirects ? 
v2Redirects : []), // === 4-section consolidation: product sections under /cipherstash/ === { source: "/stack/encryption/:path*", @@ -287,11 +313,10 @@ const config = { destination: "/stack/deploy/aws-ecs", permanent: true, }, - { - source: "/reference/eql", - destination: "/stack/reference/eql", - permanent: false, - }, + // NOTE(v2): the AI-citation redirect "/reference/eql" β†’ + // "/stack/reference/eql" was removed here β€” its source collides with + // the v2 IA's /reference/eql page, which now serves that traffic + // directly (CIP-3325). { source: "/platform/workspaces/key-sets", destination: "/stack/cipherstash/kms/keysets", @@ -317,6 +342,13 @@ const config = { source: "/stack/:path*.mdx", destination: "/llms.mdx/stack/:path*", }, + // Raw-markdown mirror for the v2 tree (Cloudflare/agents fetch + // .mdx). Listed after the /stack rule so legacy paths keep + // resolving to the legacy collection. + { + source: "/:path*.mdx", + destination: "/llms.mdx/v2/:path*", + }, ], afterFiles: [ { diff --git a/package.json b/package.json index 7cc2dfd..3b4681a 100644 --- a/package.json +++ b/package.json @@ -3,7 +3,7 @@ "version": "0.0.0", "private": true, "scripts": { - "prebuild": "bun run generate-docs && bun run generate-docs:eql && bun run validate-links", + "prebuild": "bun run generate-docs && bun run generate-docs:eql && bun run validate-links && bun run validate-redirects", "build": "next build", "dev": "next dev -p 3001", "start": "next start", @@ -13,7 +13,8 @@ "format": "biome format --write", "generate-docs": "tsx scripts/generate-docs.ts", "generate-docs:eql": "tsx scripts/generate-eql-docs.ts", - "validate-links": "tsx scripts/validate-links.ts" + "validate-links": "tsx scripts/validate-links.ts", + "validate-redirects": "tsx scripts/validate-v2-redirects.ts" }, "dependencies": { "fumadocs-core": "16.6.0", diff --git a/scripts/validate-v2-redirects.ts b/scripts/validate-v2-redirects.ts new file mode 100644 index 0000000..97c90d0 --- /dev/null +++ 
b/scripts/validate-v2-redirects.ts @@ -0,0 +1,61 @@ +#!/usr/bin/env tsx +/** + * V2 redirect gate (CIP-3325 / CIP-3337 item 7). + * + * Every page in the legacy tree (content/stack) must be covered by an entry + * in v2-redirects.mjs β€” exact match or `:path*` wildcard β€” so that no URL is + * orphaned when the v2 IA ships. Run via `bun run validate-redirects`; wired + * into prebuild so a page added to content/stack without a mapping fails CI. + * + * This checks map *coverage*, not destination existence β€” destinations are + * stubs until each section's migration ticket lands. CIP-3335 verifies + * destinations resolve before merge. + */ +import fs from "node:fs"; +import path from "node:path"; +// eslint-disable-next-line -- .mjs import is intentional; the map is shared with next.config.mjs +import { v2Redirects } from "../v2-redirects.mjs"; + +const LEGACY_DIR = path.join(process.cwd(), "content/stack"); + +function collectSlugs(dir: string, prefix: string[] = []): string[] { + const slugs: string[] = []; + for (const entry of fs.readdirSync(dir, { withFileTypes: true })) { + if (entry.isDirectory()) { + slugs.push( + ...collectSlugs(path.join(dir, entry.name), [...prefix, entry.name]), + ); + } else if (entry.name.endsWith(".mdx") || entry.name.endsWith(".md")) { + const base = entry.name.replace(/\.mdx?$/, ""); + const parts = base === "index" ? prefix : [...prefix, base]; + slugs.push(`/stack${parts.length ? 
`/${parts.join("/")}` : ""}`); + } + } + return slugs; +} + +function matches(url: string, source: string): boolean { + if (source.endsWith("/:path*")) { + const base = source.slice(0, -"/:path*".length); + return url === base || url.startsWith(`${base}/`); + } + return url === source; +} + +const urls = collectSlugs(LEGACY_DIR); +const unmatched = urls.filter( + (url) => !v2Redirects.some((r: { source: string }) => matches(url, r.source)), +); + +if (unmatched.length > 0) { + console.error( + `✗ ${unmatched.length} legacy page(s) have no v2 redirect mapping:\n`, + ); + for (const url of unmatched.sort()) { + console.error(` ${url}`); + } + console.error("\nAdd entries to v2-redirects.mjs (see IA.md migration map)."); + process.exit(1); +} + +console.log(`✓ all ${urls.length} legacy pages covered by v2-redirects.mjs`); diff --git a/source.config.ts b/source.config.ts index 7769cf3..18ad3f5 100644 --- a/source.config.ts +++ b/source.config.ts @@ -23,6 +23,54 @@ export const docs = defineDocs({ }, }); +// V2 information architecture (CIP-3325). New content lives in content/docs +// and is served from the site root (e.g. /docs/get-started/...). The legacy +// `docs` collection above (content/stack) is served alongside it during the +// migration and is deleted once the last section moves. See IA.md. +export const v2docs = defineDocs({ + dir: "content/docs", + docs: { + schema: pageSchema.extend({ + seoTitle: z.string().optional(), + // Diátaxis page type. Every page should declare one; enforced by the + // docs lint (CIP-3337) rather than the schema so stubs can land first. + type: z.enum(["tutorial", "guide", "concept", "reference"]).optional(), + // Facets powering index pages, filtered views, and the future + // tailored-quickstart picker (CIP-3339). Nav position never depends on + // these — the sidebar tree comes from meta.json alone. 
+ components: z + .array(z.enum(["encryption", "auth", "zerokms", "eql", "proxy", "cli"])) + .optional(), + audience: z.array(z.enum(["developer", "cto", "ciso"])).optional(), + integration: z + .object({ + category: z.enum([ + "platform", + "orm", + "framework", + "auth-provider", + "language", + "runtime", + ]), + setup: z.enum(["code-only", "dashboard-required"]), + pairsWith: z.array(z.string()).optional(), + }) + .optional(), + // Review tracking (CIP-3337): API pages pin the releases they were + // verified against (e.g. { stack: "1.2.0", eql: "3.0.0" }); claims pages + // (compliance, pricing, comparisons) carry a review-by date instead. + verifiedAgainst: z.record(z.string(), z.string()).optional(), + reviewBy: z.string().optional(), + }), + postprocess: { + includeProcessedMarkdown: true, + }, + }, + meta: { + schema: metaSchema, + }, +}); + // Parse the leftover code-fence meta string (what remains after Fumadocs // extracts `title`, `tab`, and line-number directives) for the analytics // attributes documented for authors: `example-id`, `cta`, and `cta-type`. 
diff --git a/src/app/(home)/layout.tsx b/src/app/(home)/layout.tsx deleted file mode 100644 index c16b056..0000000 --- a/src/app/(home)/layout.tsx +++ /dev/null @@ -1,6 +0,0 @@ -import { HomeLayout } from "fumadocs-ui/layouts/home"; -import { baseOptions } from "@/lib/layout.shared"; - -export default function Layout({ children }: LayoutProps<"/">) { - return {children}; -} diff --git a/src/app/(home)/page.tsx b/src/app/(home)/page.tsx deleted file mode 100644 index 5311eab..0000000 --- a/src/app/(home)/page.tsx +++ /dev/null @@ -1,346 +0,0 @@ -import { - ArrowRight, - BookOpen, - Code, - Database, - ExternalLinkIcon, - FileText, - KeyRound, - Lock, - Search, - ShieldCheck, - Zap, -} from "lucide-react"; -import Link from "next/link"; -import type { ComponentType } from "react"; -import { - DrizzleLogo, - DynamoDBLogo, - PrismaLogo, - SupabaseLogo, -} from "@/components/integration-logos"; -import type { Metadata } from "next"; - -// The /docs landing page had no metadata (no ). `absolute` bypasses the -// root layout's "%s | CipherStash Docs" template so the title isn't doubled. -export const metadata: Metadata = { - title: { - absolute: "CipherStash Docs β€” Searchable encryption for Postgres", - }, - description: - "Data Level Access Control for Postgres. Searchable field-level encryption, identity-bound keys, and cryptographic audit trails.", - alternates: { canonical: "https://cipherstash.com/docs" }, -}; - -const monoClass = "font-[family-name:var(--font-fira-code)] tracking-[-0.02em]"; -const eyebrowClass = - "font-[family-name:var(--font-fira-code)] text-[10px] font-medium tracking-[0.16em] uppercase text-fd-primary"; - -const products = [ - { - title: "Encryption", - description: - "Searchable field-level encryption. Range queries, exact match, and free-text search over ciphertext. Sub-millisecond overhead.", - href: "/stack/cipherstash/encryption", - icon: Lock, - }, - { - title: "ZeroKMS", - description: - "The key management layer. 
Unique key per value, derived on demand, never stored. 100x faster than AWS KMS.", - href: "/stack/cipherstash/kms", - icon: KeyRound, - }, - { - title: "Proxy", - description: - "Transparent searchable encryption for existing PostgreSQL databases. Zero application code changes.", - href: "/stack/cipherstash/proxy", - icon: Database, - }, -]; - -const integrations: { - title: string; - description: string; - href: string; - logo: ComponentType<{ className?: string }>; -}[] = [ - { - title: "Supabase", - description: "Field-level encryption for your Supabase project.", - href: "/stack/cipherstash/supabase", - logo: SupabaseLogo, - }, - { - title: "Drizzle ORM", - description: "Encrypted column types and query operators for Drizzle.", - href: "/stack/cipherstash/encryption/drizzle", - logo: DrizzleLogo, - }, - { - title: "Prisma Next", - description: - "Searchable field-level encryption for Postgres with Prisma Next.", - href: "/stack/cipherstash/encryption/prisma-next", - logo: PrismaLogo, - }, - { - title: "DynamoDB", - description: - "Encrypted DynamoDB attributes with searchable equality lookups.", - href: "/stack/cipherstash/encryption/dynamodb", - logo: DynamoDBLogo, - }, -]; - -const resources = [ - { - title: "What is CipherStash?", - description: "DLAC, threat model, how it works", - href: "/stack/reference/what-is-cipherstash", - icon: ShieldCheck, - }, - { - title: "API Reference", - description: "SDK and API reference docs", - href: "/stack/reference", - icon: Code, - }, - { - title: "Agent Skills", - description: "CipherStash knowledge for your AI coding agent", - href: "/stack/reference/agent-skills", - icon: Zap, - }, - { - title: "Use Cases", - description: "AI/RAG, compliance, data residency", - href: "/stack/reference/use-cases", - icon: BookOpen, - }, -]; - -export default function HomePage() { - return ( - <main className="flex flex-col"> - {/* Hero */} - <section className="border-b border-fd-border"> - <div className="mx-auto w-full 
max-w-[1200px] px-6 pt-24 pb-16 md:px-12 md:pt-32 md:pb-20"> - <p className={eyebrowClass}>DLAC / DATA LEVEL ACCESS CONTROL</p> - <h1 - className={`mt-4 text-3xl font-medium text-fd-foreground md:text-5xl ${monoClass}`} - > - CipherStash Docs - </h1> - <p className="mt-4 max-w-2xl text-[17px] leading-relaxed text-fd-muted-foreground"> - Searchable field-level encryption. Identity-bound keys. - Cryptographic audit trails. Built into your existing Postgres stack. - </p> - - {/* Getting started cards */} - <div className="mt-10 grid gap-px bg-fd-border sm:grid-cols-2 border border-fd-border rounded-[2px] overflow-hidden"> - {[ - { - href: "/stack/quickstart", - icon: Zap, - title: "Quickstart", - desc: "Encrypt your first fields in 15 minutes.", - }, - { - href: "/stack/cipherstash/supabase", - icon: Database, - title: "Supabase", - desc: "Field-level encryption for Supabase.", - }, - { - href: "/stack/cipherstash/encryption/searchable-encryption", - icon: Search, - title: "Searchable encryption", - desc: "Equality, free text, range, ordering, and JSON queries over ciphertext.", - }, - { - href: "/stack/reference/agent-skills", - icon: Zap, - title: "Agent Skills", - desc: "CipherStash knowledge for Cursor, Copilot, Claude Code.", - }, - ].map((card) => ( - <Link - key={card.href} - href={card.href} - className="group flex items-center gap-4 bg-fd-background p-5 transition-colors hover:bg-fd-accent/50" - > - <div className="flex size-10 shrink-0 items-center justify-center rounded-[2px] bg-fd-primary/10 text-fd-primary"> - <card.icon className="size-5" /> - </div> - <div className="min-w-0"> - <p - className={`font-medium text-fd-foreground text-[15px] ${monoClass}`} - > - {card.title} - </p> - <p className="text-sm text-fd-muted-foreground"> - {card.desc} - </p> - </div> - <ArrowRight className="ml-auto size-4 shrink-0 text-fd-muted-foreground transition-colors group-hover:text-fd-primary" /> - </Link> - ))} - </div> - </div> - </section> - - {/* Products */} - 
<section className="mx-auto w-full max-w-[1200px] px-6 py-16 md:px-12 md:py-24"> - <p className={eyebrowClass}>§ 01 / THE STACK</p> - <h2 - className={`mt-3 text-xl font-medium text-fd-foreground md:text-2xl ${monoClass}`} - > - The Stack - </h2> - <p className="mt-2 text-fd-muted-foreground"> - Encryption, key management, and proxy. - </p> - - <div className="mt-8 grid gap-px bg-fd-border sm:grid-cols-3 border border-fd-border rounded-[2px] overflow-hidden"> - {products.map((product) => ( - <Link - key={product.title} - href={product.href} - className="group relative flex flex-col overflow-hidden bg-fd-background transition-colors hover:bg-fd-accent/50" - > - <div className="flex h-32 items-center justify-center border-b border-fd-border bg-fd-muted/20"> - <product.icon className="size-10 text-fd-muted-foreground/30" /> - </div> - <div className="flex flex-1 flex-col p-5"> - <div className="flex items-center gap-2"> - <product.icon className="size-4 text-fd-primary" /> - <h3 className={`font-medium text-fd-foreground ${monoClass}`}> - {product.title} - </h3> - </div> - <p className="mt-2 flex-1 text-sm leading-relaxed text-fd-muted-foreground"> - {product.description} - </p> - </div> - </Link> - ))} - </div> - </section> - - {/* Integrations */} - <section className="border-t border-fd-border"> - <div className="mx-auto w-full max-w-[1200px] px-6 py-16 md:px-12 md:py-24"> - <p className={eyebrowClass}>§ 02 / INTEGRATIONS</p> - <h2 - className={`mt-3 text-xl font-medium text-fd-foreground md:text-2xl ${monoClass}`} - > - Integrations - </h2> - <p className="mt-2 text-fd-muted-foreground"> - Drop-in encryption for the databases and ORMs you already use. 
- </p> - - <div className="mt-8 grid gap-px bg-fd-border sm:grid-cols-2 lg:grid-cols-4 border border-fd-border rounded-[2px] overflow-hidden"> - {integrations.map((integration) => ( - <Link - key={integration.title} - href={integration.href} - className="group flex flex-col items-center bg-fd-background p-6 text-center transition-colors hover:bg-fd-accent/50" - > - <div className="flex size-24 items-center justify-center"> - <integration.logo className="h-12 w-auto" /> - </div> - <h3 - className={`mt-4 font-medium text-fd-foreground ${monoClass}`} - > - {integration.title} - </h3> - <p className="mt-1 text-sm text-fd-muted-foreground"> - {integration.description} - </p> - </Link> - ))} - </div> - </div> - </section> - - {/* Resources */} - <section className="border-t border-fd-border"> - <div className="mx-auto w-full max-w-[1200px] px-6 py-16 md:px-12 md:py-24"> - <p className={eyebrowClass}>§ 03 / RESOURCES</p> - <h2 - className={`mt-3 text-xl font-medium text-fd-foreground md:text-2xl ${monoClass}`} - > - Resources - </h2> - - <div className="mt-8 grid gap-px bg-fd-border sm:grid-cols-2 lg:grid-cols-4 border border-fd-border rounded-[2px] overflow-hidden"> - {resources.map((resource) => ( - <Link - key={resource.title} - href={resource.href} - className="group flex items-start gap-3 bg-fd-background p-4 transition-colors hover:bg-fd-accent/50" - > - <resource.icon className="mt-0.5 size-5 shrink-0 text-fd-muted-foreground group-hover:text-fd-primary" /> - <div> - <p - className={`font-medium text-fd-foreground text-[14px] ${monoClass}`} - > - {resource.title} - </p> - <p className="mt-0.5 text-sm text-fd-muted-foreground"> - {resource.description} - </p> - </div> - </Link> - ))} - </div> - </div> - </section> - - {/* AI/LLM + CTA footer */} - <section className="border-t border-fd-border bg-fd-card/50"> - <div className="mx-auto flex w-full max-w-[1200px] flex-col items-center px-6 py-16 text-center md:px-12 md:py-20"> - <div className="flex size-10 
items-center justify-center rounded-[2px] bg-fd-primary/10 text-fd-primary"> - <FileText className="size-5" /> - </div> - <h2 - className={`mt-4 text-xl font-medium text-fd-foreground md:text-2xl ${monoClass}`} - > - AI-ready documentation - </h2> - <p className="mx-auto mt-2 max-w-lg text-sm text-fd-muted-foreground"> - Every page is clean markdown. Feed it to your LLM. - </p> - <div className="mt-6 flex flex-wrap justify-center gap-3"> - <Link - href="/llms.txt" - className="inline-flex items-center gap-2 rounded-[2px] border border-fd-border px-4 py-2 text-sm font-medium text-fd-foreground transition-colors hover:border-fd-primary/40 hover:bg-fd-accent/50" - > - <FileText className="size-4" /> - llms.txt - </Link> - <Link - href="/llms-full.txt" - className="inline-flex items-center gap-2 rounded-[2px] border border-fd-border px-4 py-2 text-sm font-medium text-fd-foreground transition-colors hover:border-fd-primary/40 hover:bg-fd-accent/50" - > - <FileText className="size-4" /> - llms-full.txt - </Link> - <a - href="https://github.com/cipherstash/stack" - target="_blank" - rel="noopener noreferrer" - className="inline-flex items-center gap-2 rounded-[2px] border border-fd-border px-4 py-2 text-sm font-medium text-fd-foreground transition-colors hover:border-fd-primary/40 hover:bg-fd-accent/50" - > - <ExternalLinkIcon className="size-4" /> - GitHub - </a> - </div> - </div> - </section> - </main> - ); -} diff --git a/src/app/[[...slug]]/layout.tsx b/src/app/[[...slug]]/layout.tsx new file mode 100644 index 0000000..d121273 --- /dev/null +++ b/src/app/[[...slug]]/layout.tsx @@ -0,0 +1,15 @@ +import { DocsLayout } from "fumadocs-ui/layouts/docs"; +import { baseOptions } from "@/lib/layout.shared"; +import { getV2PageTree } from "@/lib/source"; + +// Layout for the V2 IA tree (content/docs), served from the site root — +// including the /docs landing page (content/docs/index.mdx), which renders +// inside the same navigation shell as every other page. 
Static routes +// (/stack, /api, /og, …) take precedence over this segment as usual. +export default function Layout({ children }: LayoutProps<"/[[...slug]]">) { + return ( + <DocsLayout tree={getV2PageTree()} {...baseOptions()}> + {children} + </DocsLayout> + ); +} diff --git a/src/app/[[...slug]]/page.tsx b/src/app/[[...slug]]/page.tsx new file mode 100644 index 0000000..5efba49 --- /dev/null +++ b/src/app/[[...slug]]/page.tsx @@ -0,0 +1,84 @@ +import { + DocsBody, + DocsDescription, + DocsPage, + DocsTitle, +} from "fumadocs-ui/layouts/docs/page"; +import { createRelativeLink } from "fumadocs-ui/mdx"; +import type { Metadata } from "next"; +import { notFound } from "next/navigation"; +import { LLMCopyButton, ViewOptions } from "@/components/ai/page-actions"; +import { gitConfig } from "@/lib/layout.shared"; +import { v2source } from "@/lib/source"; +import { getMDXComponents } from "@/mdx-components"; + +// Page route for the V2 IA tree (content/docs), including the /docs landing +// page. Mirrors the legacy /stack/[[...slug]] route; the legacy route is +// deleted when the migration completes (see IA.md). + +// The landing page's URL is "/", which would produce "/docs/.mdx" — serve its +// raw-markdown mirror at /docs/index.mdx instead (normalized back to the root +// slug in the llms.mdx/v2 route). +function markdownUrl(pageUrl: string): string { + return `/docs${pageUrl === "/" ? 
"/index" : pageUrl}.mdx`; +} + +export default async function Page(props: PageProps<"/[[...slug]]">) { + const params = await props.params; + const page = v2source.getPage(params.slug); + if (!page) notFound(); + + const MDX = page.data.body; + + return ( + <DocsPage toc={page.data.toc} full={page.data.full}> + <DocsTitle>{page.data.title}</DocsTitle> + <DocsDescription className="mb-0"> + {page.data.description} + </DocsDescription> + <div className="flex flex-row gap-2 items-center border-b pb-6"> + <LLMCopyButton markdownUrl={markdownUrl(page.url)} /> + <ViewOptions + markdownUrl={markdownUrl(page.url)} + githubUrl={`https://github.com/${gitConfig.user}/${gitConfig.repo}/blob/${gitConfig.branch}/content/docs/${page.path}`} + /> + </div> + <DocsBody> + <MDX + components={getMDXComponents({ + a: createRelativeLink(v2source, page), + })} + /> + </DocsBody> + </DocsPage> + ); +} + +export async function generateStaticParams() { + return v2source.generateParams(); +} + +export async function generateMetadata( + props: PageProps<"/[[...slug]]">, +): Promise<Metadata> { + const params = await props.params; + const page = v2source.getPage(params.slug); + if (!page) notFound(); + + const title = page.data.seoTitle ?? page.data.title; + const url = `https://cipherstash.com/docs${page.url === "/" ? "" : page.url}`; + + return { + title, + description: page.data.description, + alternates: { canonical: url }, + openGraph: { + type: "article", + url, + title, + description: page.data.description, + // TODO(v2): OG images — the /og route only covers the legacy tree. + // Add a v2 OG route when the first real (non-stub) pages land. 
+ }, + }; +} diff --git a/src/app/api/search/route.ts b/src/app/api/search/route.ts index aa9d5cd..ec6bc8d 100644 --- a/src/app/api/search/route.ts +++ b/src/app/api/search/route.ts @@ -1,5 +1,5 @@ -import { source } from "@/lib/source"; import { createFromSource } from "fumadocs-core/search/server"; +import { source } from "@/lib/source"; export const { GET } = createFromSource(source, { // https://docs.orama.com/docs/orama-js/supported-languages diff --git a/src/app/layout.tsx b/src/app/layout.tsx index e6ff9a2..33912be 100644 --- a/src/app/layout.tsx +++ b/src/app/layout.tsx @@ -2,7 +2,7 @@ import { RootProvider } from "fumadocs-ui/provider/next"; import { PostHogProvider } from "@/lib/posthog/provider"; import "./global.css"; import type { Metadata } from "next"; -import { Inter, Fira_Code } from "next/font/google"; +import { Fira_Code, Inter } from "next/font/google"; // Site-wide title template so every page gets a descriptive, branded // <title>. Per-page metadata returns a bare title (e.g. 
"Keysets") which diff --git a/src/app/llms-full.txt/route.ts b/src/app/llms-full.txt/route.ts index 8e2efe8..9cd76be 100644 --- a/src/app/llms-full.txt/route.ts +++ b/src/app/llms-full.txt/route.ts @@ -1,5 +1,5 @@ import { getPostHogClient } from "@/lib/posthog/server"; -import { getLLMText, source } from "@/lib/source"; +import { getLLMText, source, v2source } from "@/lib/source"; export const revalidate = false; @@ -18,7 +18,7 @@ export async function GET(request: Request) { await posthog.flush(); } - const scan = source.getPages().map(getLLMText); + const scan = [...v2source.getPages(), ...source.getPages()].map(getLLMText); const scanned = await Promise.all(scan); return new Response(scanned.join("\n\n")); diff --git a/src/app/llms.mdx/v2/[[...slug]]/route.ts b/src/app/llms.mdx/v2/[[...slug]]/route.ts new file mode 100644 index 0000000..fc9f251 --- /dev/null +++ b/src/app/llms.mdx/v2/[[...slug]]/route.ts @@ -0,0 +1,48 @@ +import { notFound } from "next/navigation"; +import { getPostHogClient } from "@/lib/posthog/server"; +import { getLLMText, v2source } from "@/lib/source"; + +// Raw-markdown mirror for the V2 IA tree, reached via the +// `/:path*.mdx` rewrite in next.config.mjs (same pattern as the legacy +// /llms.mdx/stack route). +export const revalidate = false; + +export async function GET( + req: Request, + { params }: RouteContext<"/llms.mdx/v2/[[...slug]]">, +) { + const { slug } = await params; + // The landing page's markdown mirror is served at /docs/index.mdx (its URL + // is "/", which can't carry an .mdx suffix) — normalize back to the root. + const normalized = + !slug || (slug.length === 1 && slug[0] === "index") ? 
[] : slug; + const page = v2source.getPage(normalized); + if (!page) notFound(); + + const posthog = getPostHogClient(); + if (posthog) { + posthog.capture({ + distinctId: "llm-agent", + event: "llms_mdx_page_fetched", + properties: { + $current_url: req.url, + page_slug: normalized.join("/"), + page_title: page.data.title, + referer: req.headers.get("referer") ?? "", + user_agent: req.headers.get("user-agent") ?? "", + }, + }); + await posthog.flush(); + } + + return new Response(await getLLMText(page), { + headers: { + "Content-Type": "text/markdown", + "Access-Control-Allow-Origin": "*", + }, + }); +} + +export function generateStaticParams() { + return v2source.generateParams(); +} diff --git a/src/app/llms.txt/route.ts b/src/app/llms.txt/route.ts index 5d6bcbb..2c6696e 100644 --- a/src/app/llms.txt/route.ts +++ b/src/app/llms.txt/route.ts @@ -1,5 +1,5 @@ import { getPostHogClient } from "@/lib/posthog/server"; -import { source } from "@/lib/source"; +import { source, v2source } from "@/lib/source"; export const revalidate = false; @@ -21,7 +21,8 @@ export async function GET(request: Request) { const lines: string[] = []; lines.push("# Documentation"); lines.push(""); - for (const page of source.getPages()) { + // V2 tree first: it's the canonical IA once the migration completes. 
+ for (const page of [...v2source.getPages(), ...source.getPages()]) { lines.push(`- [${page.data.title}](${page.url}): ${page.data.description}`); } return new Response(lines.join("\n")); diff --git a/src/app/og/docs/[...slug]/route.tsx b/src/app/og/docs/[...slug]/route.tsx index 208fdfd..801a890 100644 --- a/src/app/og/docs/[...slug]/route.tsx +++ b/src/app/og/docs/[...slug]/route.tsx @@ -1,7 +1,7 @@ -import { getPageImage, source } from "@/lib/source"; +import { generate as DefaultImage } from "fumadocs-ui/og"; import { notFound } from "next/navigation"; import { ImageResponse } from "next/og"; -import { generate as DefaultImage } from "fumadocs-ui/og"; +import { getPageImage, source } from "@/lib/source"; export const revalidate = false; diff --git a/src/app/sitemap.ts b/src/app/sitemap.ts index 515283d..77fd867 100644 --- a/src/app/sitemap.ts +++ b/src/app/sitemap.ts @@ -1,10 +1,10 @@ import type { MetadataRoute } from "next"; -import { source } from "@/lib/source"; +import { source, v2source } from "@/lib/source"; const BASE_URL = "https://cipherstash.com/docs"; export default function sitemap(): MetadataRoute.Sitemap { - return source.getPages().map((page) => ({ + return [...v2source.getPages(), ...source.getPages()].map((page) => ({ url: `${BASE_URL}${page.url}`, lastModified: new Date(), changeFrequency: "weekly", diff --git a/src/app/stack/[[...slug]]/page.tsx b/src/app/stack/[[...slug]]/page.tsx index 6f5e79a..ef050ee 100644 --- a/src/app/stack/[[...slug]]/page.tsx +++ b/src/app/stack/[[...slug]]/page.tsx @@ -1,16 +1,16 @@ -import { getPageImage, source } from "@/lib/source"; import { DocsBody, DocsDescription, DocsPage, DocsTitle, } from "fumadocs-ui/layouts/docs/page"; -import { notFound } from "next/navigation"; -import { getMDXComponents } from "@/mdx-components"; -import type { Metadata } from "next"; import { createRelativeLink } from "fumadocs-ui/mdx"; +import type { Metadata } from "next"; +import { notFound } from "next/navigation"; import { 
LLMCopyButton, ViewOptions } from "@/components/ai/page-actions"; import { gitConfig } from "@/lib/layout.shared"; +import { getPageImage, source } from "@/lib/source"; +import { getMDXComponents } from "@/mdx-components"; export default async function Page(props: PageProps<"/stack/[[...slug]]">) { const params = await props.params; diff --git a/src/app/stack/layout.tsx b/src/app/stack/layout.tsx index d5b93ec..78d2389 100644 --- a/src/app/stack/layout.tsx +++ b/src/app/stack/layout.tsx @@ -1,6 +1,6 @@ -import { source } from "@/lib/source"; import { DocsLayout } from "fumadocs-ui/layouts/docs"; import { baseOptions } from "@/lib/layout.shared"; +import { source } from "@/lib/source"; export default function Layout({ children }: LayoutProps<"/stack">) { return ( diff --git a/src/components/icons/supabase.tsx b/src/components/icons/supabase.tsx index c6336f9..492ac61 100644 --- a/src/components/icons/supabase.tsx +++ b/src/components/icons/supabase.tsx @@ -44,5 +44,5 @@ export function SupabaseIcon(props: React.SVGProps<SVGSVGElement>) { </linearGradient> </defs> </svg> - ) + ); } diff --git a/src/lib/posthog/provider.tsx b/src/lib/posthog/provider.tsx index 711982a..9d7a5cb 100644 --- a/src/lib/posthog/provider.tsx +++ b/src/lib/posthog/provider.tsx @@ -1,13 +1,13 @@ "use client"; -import posthog from "posthog-js"; import { usePathname, useSearchParams } from "next/navigation"; +import posthog from "posthog-js"; import { - Suspense, createContext, + type ReactNode, + Suspense, useContext, useEffect, - type ReactNode, } from "react"; const PostHogContext = createContext<typeof posthog | null>(null); diff --git a/src/lib/source.ts b/src/lib/source.ts index 97ae84f..c056250 100644 --- a/src/lib/source.ts +++ b/src/lib/source.ts @@ -1,7 +1,8 @@ -import { docs } from "fumadocs-mdx:collections/server"; +import { docs, v2docs } from "fumadocs-mdx:collections/server"; +import type * as PageTree from "fumadocs-core/page-tree"; import { type InferPageType, loader } from 
"fumadocs-core/source"; -import { createElement } from "react"; import { icons } from "lucide-react"; +import { createElement } from "react"; import { SupabaseIcon } from "@/components/icons/supabase"; const customIcons: Record<string, () => React.ReactElement> = { @@ -23,6 +24,34 @@ export const source = loader({ icon: resolveIcon, }); +// V2 IA tree (CIP-3325): content/docs served from the site root, e.g. +// /docs/get-started/quickstart. Lives alongside the legacy `source` during +// the migration; the legacy loader and /stack routes are deleted at the end. +export const v2source = loader({ + baseUrl: "/", + source: v2docs.toFumadocsSource(), + icon: resolveIcon, +}); + +// Sidebar folders whose only page is their index render with a collapse +// chevron pointing at nothing. Collapse such folders into plain page items; +// they become folders again automatically once real sub-pages land. +function flattenEmptyFolders(nodes: PageTree.Node[]): PageTree.Node[] { + return nodes.map((node) => { + if (node.type !== "folder") return node; + const children = flattenEmptyFolders(node.children); + if (children.length === 0 && node.index) { + return { ...node.index, icon: node.index.icon ?? 
node.icon }; + } + return { ...node, children }; + }); +} + +export function getV2PageTree(): PageTree.Root { + const tree = v2source.getPageTree(); + return { ...tree, children: flattenEmptyFolders(tree.children) }; +} + export function getPageImage(page: InferPageType<typeof source>) { const segments = [...page.slugs, "image.png"]; @@ -32,7 +61,9 @@ export function getPageImage(page: InferPageType<typeof source>) { }; } -export async function getLLMText(page: InferPageType<typeof source>) { +export async function getLLMText( + page: InferPageType<typeof source> | InferPageType<typeof v2source>, +) { const processed = await page.data.getText("processed"); return `# ${page.data.title} diff --git a/src/proxy.ts b/src/proxy.ts index 028dd8f..5d45773 100644 --- a/src/proxy.ts +++ b/src/proxy.ts @@ -1,5 +1,5 @@ -import { NextResponse } from "next/server"; import type { NextFetchEvent, NextRequest } from "next/server"; +import { NextResponse } from "next/server"; import { getPostHogClient } from "@/lib/posthog/server"; const SKIP_PATHS = ["/api", "/_next/static", "/_next/image", "/ingest"]; diff --git a/v2-redirects.mjs b/v2-redirects.mjs new file mode 100644 index 0000000..dec811a --- /dev/null +++ b/v2-redirects.mjs @@ -0,0 +1,380 @@ +// V2 IA redirect map (CIP-3325): every legacy /stack/* URL → its new home. +// Derived from the migration map in IA.md; completeness is enforced by +// `scripts/validate-v2-redirects.ts` (every content/stack page must match an +// entry here, exact or wildcard). +// +// Gated behind ENABLE_V2_REDIRECTS=1 in next.config.mjs: during the migration +// the preview site serves BOTH trees (legacy at /stack, v2 at the root), so +// unmigrated content stays reachable. The flag flips on at merge; once +// content/stack is deleted these entries become unconditional (CIP-3335). +// +// Conventions (matching next.config.mjs): sources/destinations omit the +// "/docs" basePath. Order matters — specific entries before wildcards. 
+// +// All entries are `permanent: false` (307) while the IA settles — browsers +// and crawlers cache 308s aggressively, and a mis-cached destination is hard +// to walk back. Flip to permanent once the map has soaked post-merge +// (CIP-3335). +export const v2Redirects = [ + // === Roots === + { source: "/stack", destination: "/", permanent: false }, + { + source: "/stack/quickstart", + destination: "/get-started/quickstart", + permanent: false, + }, + { source: "/stack/cipherstash", destination: "/", permanent: false }, + { + source: "/stack/cipherstash/postgres", + destination: "/reference/eql", + permanent: false, + }, + { + source: "/stack/cipherstash/supabase", + destination: "/integrations/supabase", + permanent: false, + }, + + // === Encryption SDK section → Reference/stack + new homes === + { + source: "/stack/cipherstash/encryption", + destination: "/reference/stack", + permanent: false, + }, + { + source: "/stack/cipherstash/encryption/searchable-encryption", + destination: "/concepts/searchable-encryption", + permanent: false, + }, + { + source: "/stack/cipherstash/encryption/identity", + destination: "/concepts/identity-aware-encryption", + permanent: false, + }, + { + source: "/stack/cipherstash/encryption/drizzle", + destination: "/integrations/drizzle", + permanent: false, + }, + { + source: "/stack/cipherstash/encryption/prisma-next", + destination: "/integrations/prisma-next", + permanent: false, + }, + { + source: "/stack/cipherstash/encryption/dynamodb", + destination: "/integrations/aws/dynamodb", + permanent: false, + }, + { + source: "/stack/cipherstash/encryption/supabase", + destination: "/reference/stack/supabase", + permanent: false, + }, + { + source: "/stack/cipherstash/encryption/indexes", + destination: "/reference/eql/indexes", + permanent: false, + }, + { + source: "/stack/cipherstash/encryption/queries", + destination: "/reference/eql/filtering", + permanent: false, + }, + // configuration, encrypt-decrypt, bulk-operations, 
models, schema, storing-data + { + source: "/stack/cipherstash/encryption/:path*", + destination: "/reference/stack/:path*", + permanent: false, + }, + + // === KMS section → Security + Reference/auth + Concepts === + { + source: "/stack/cipherstash/kms", + destination: "/security/zerokms", + permanent: false, + }, + { + source: "/stack/cipherstash/kms/cts", + destination: "/security/cts", + permanent: false, + }, + { + source: "/stack/cipherstash/kms/oidc", + destination: "/reference/auth/oidc-configuration", + permanent: false, + }, + { + source: "/stack/cipherstash/kms/access-keys", + destination: "/reference/auth/access-keys", + permanent: false, + }, + { + source: "/stack/cipherstash/kms/clients", + destination: "/reference/auth/clients", + permanent: false, + }, + { + source: "/stack/cipherstash/kms/disaster-recovery", + destination: "/security/availability-and-continuity", + permanent: false, + }, + { + source: "/stack/cipherstash/kms/keysets", + destination: "/concepts/key-management", + permanent: false, + }, + { + source: "/stack/cipherstash/kms/regions", + destination: "/security/zerokms", + permanent: false, + }, + { + source: "/stack/cipherstash/kms/configuration", + destination: "/reference/workspace/configuration", + permanent: false, + }, + + // === Proxy section → Reference/proxy + new homes === + { + source: "/stack/cipherstash/proxy", + destination: "/reference/proxy", + permanent: false, + }, + { + source: "/stack/cipherstash/proxy/audit", + destination: "/security/audit-logging", + permanent: false, + }, + { + source: "/stack/cipherstash/proxy/getting-started", + destination: "/integrations/aws/rds-aurora", + permanent: false, + }, + { + source: "/stack/cipherstash/proxy/encrypt-tool", + destination: "/guides/migration/encrypt-existing-data", + permanent: false, + }, + { + source: "/stack/cipherstash/proxy/searchable-json", + destination: "/reference/eql/json", + permanent: false, + }, + { + source: 
"/stack/cipherstash/proxy/troubleshooting", + destination: "/guides/troubleshooting/proxy", + permanent: false, + }, + // configuration, message-flow, multitenant + { + source: "/stack/cipherstash/proxy/:path*", + destination: "/reference/proxy/:path*", + permanent: false, + }, + + // === CLI section → Reference/cli === + { + source: "/stack/cipherstash/cli", + destination: "/reference/cli", + permanent: false, + }, + { + source: "/stack/cipherstash/cli/troubleshooting", + destination: "/guides/troubleshooting/cli", + permanent: false, + }, + { + source: "/stack/cipherstash/cli/:path*", + destination: "/reference/cli/:path*", + permanent: false, + }, + + // === Deploy section → Guides === + { + source: "/stack/deploy", + destination: "/guides/deployment", + permanent: false, + }, + { + source: "/stack/deploy/going-to-production", + destination: "/guides/deployment/going-to-production", + permanent: false, + }, + { + source: "/stack/deploy/aws-ecs", + destination: "/guides/deployment/proxy-deployment", + permanent: false, + }, + { + source: "/stack/deploy/bundling", + destination: "/guides/deployment/serverless-and-bundling", + permanent: false, + }, + { + source: "/stack/deploy/sst", + destination: "/guides/deployment/serverless-and-bundling", + permanent: false, + }, + { + source: "/stack/deploy/testing", + destination: "/guides/development/testing-and-ci", + permanent: false, + }, + { + source: "/stack/deploy/team-onboarding", + destination: "/guides/development/team-onboarding", + permanent: false, + }, + { + source: "/stack/deploy/troubleshooting", + destination: "/guides/troubleshooting", + permanent: false, + }, + + // === Reference section === + { source: "/stack/reference", destination: "/reference", permanent: false }, + { + source: "/stack/reference/what-is-cipherstash", + destination: "/get-started/what-is-cipherstash", + permanent: false, + }, + { + source: "/stack/reference/security-architecture", + destination: "/security/architecture", + 
permanent: false, + }, + { + source: "/stack/reference/compliance", + destination: "/security/compliance", + permanent: false, + }, + { + source: "/stack/reference/comparisons", + destination: "/compare", + permanent: false, + }, + { + source: "/stack/reference/comparisons/:path*", + destination: "/compare/:path*", + permanent: false, + }, + { + source: "/stack/reference/use-cases", + destination: "/solutions", + permanent: false, + }, + { + source: "/stack/reference/use-cases/ai-rag", + destination: "/solutions/ai-and-rag", + permanent: false, + }, + { + source: "/stack/reference/use-cases/compliance", + destination: "/security/compliance", + permanent: false, + }, + { + source: "/stack/reference/use-cases/:path*", + destination: "/solutions/:path*", + permanent: false, + }, + { + source: "/stack/reference/billing", + destination: "/reference/workspace/billing", + permanent: false, + }, + { + source: "/stack/reference/members", + destination: "/reference/workspace/members", + permanent: false, + }, + { + source: "/stack/reference/cipher-cell", + destination: "/reference/eql/core-concepts", + permanent: false, + }, + { + source: "/stack/reference/eql-guide", + destination: "/reference/eql", + permanent: false, + }, + { + source: "/stack/reference/eql", + destination: "/reference/eql", + permanent: false, + }, + { + source: "/stack/reference/eql/:path*", + destination: "/reference/eql/:path*", + permanent: false, + }, + { + source: "/stack/reference/encryption-sdk", + destination: "/reference/stack", + permanent: false, + }, + { + source: "/stack/reference/error-handling", + destination: "/reference/stack/errors", + permanent: false, + }, + // NOTE: legacy "migration" page is the @cipherstash/protect→stack package + // rename guide, NOT data migration (see IA.md). 
+ { + source: "/stack/reference/migration", + destination: "/reference/stack/upgrading-from-protect", + permanent: false, + }, + { + source: "/stack/reference/proxy-errors", + destination: "/reference/proxy/errors", + permanent: false, + }, + { + source: "/stack/reference/proxy-reference", + destination: "/reference/proxy/configuration", + permanent: false, + }, + { + source: "/stack/reference/drizzle", + destination: "/integrations/drizzle", + permanent: false, + }, + { + source: "/stack/reference/dashboard-supabase-integration", + destination: "/integrations/supabase", + permanent: false, + }, + { + source: "/stack/reference/discovery-session", + destination: "/get-started/choose-your-stack", + permanent: false, + }, + { + source: "/stack/reference/planning-guide", + destination: "/get-started/choose-your-stack", + permanent: false, + }, + { + source: "/stack/reference/supported-solutions", + destination: "/integrations", + permanent: false, + }, + { + source: "/stack/reference/agent-skills", + destination: "/reference/agent-skills", + permanent: false, + }, + { + source: "/stack/reference/glossary", + destination: "/reference/glossary", + permanent: false, + }, + // Generated TypeDoc API reference (scripts/generate-docs.ts output) + { + source: "/stack/reference/stack/:path*", + destination: "/reference/stack/:path*", + permanent: false, + }, +];