High availability on Kubernetes #277

Merged
sriramveeraghanta merged 4 commits into master from high-availability
May 22, 2026

danciaclara (Collaborator) commented May 22, 2026

Description

Adds a new High Availability on Kubernetes guide under Self-Hosting → Methods → Kubernetes, documenting how to deploy Plane Enterprise on Kubernetes so the cluster survives the loss of a single node or AZ without manual recovery.

The guide covers:

  • What HA means for Plane (single-region, AZ/node fault tolerance — not active-active)
  • Workload tiers in the plane-enterprise Helm chart (stateless, singleton, local stateful) and how each behaves under failure
  • Cluster prerequisites: multi-AZ worker nodes and a StorageClass with volumeBindingMode: WaitForFirstConsumer
  • Replacing in-chart stateful services (postgres, redis, rabbitmq, opensearch, minio) with managed multi-AZ equivalents
  • Cloud-agnostic configuration plus a dedicated section for AWS + Karpenter
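
As an illustration of the StorageClass prerequisite in the list above, a minimal sketch might look like the following. The name `plane-ha-gp3`, the AWS EBS CSI provisioner, and the `gp3` volume type are assumptions chosen for illustration and are not taken from the guide; only `volumeBindingMode: WaitForFirstConsumer` comes from this description.

```yaml
# Hypothetical StorageClass; name, provisioner, and parameters are illustrative assumptions.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: plane-ha-gp3            # assumed name, not from the guide
provisioner: ebs.csi.aws.com    # assumed: AWS EBS CSI driver; substitute your cloud's CSI provisioner
parameters:
  type: gp3                     # EBS volume type, specific to the EBS CSI driver
# Delay volume provisioning until a pod that uses the claim is scheduled,
# so the volume is created in the same zone as the node the pod lands on.
volumeBindingMode: WaitForFirstConsumer
reclaimPolicy: Delete
allowVolumeExpansion: true
```

With the default `Immediate` binding mode, a zonal volume can be provisioned before the scheduler has picked a node, which may pin the pod to a zone with no capacity; `WaitForFirstConsumer` avoids that ordering problem.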

Sidebar update: nests the new page under the existing Kubernetes entry (docs/.vitepress/config.mts).

Type of Change

  • Documentation update

Test Scenarios

  • pnpm build passes
  • pnpm check:format passes
  • Sidebar navigation shows High availability nested under Kubernetes
  • Internal anchor links (#external-managed-services, etc.) resolve correctly

Summary by CodeRabbit

  • Documentation
    • Added comprehensive high-availability deployment guide for Kubernetes-based self-hosted deployments, including cluster prerequisites, pod distribution strategies, and HA configuration examples.
    • Reorganized self-hosting documentation navigation to improve discoverability of high-availability resources.

vercel Bot commented May 22, 2026

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| developer-docs | Ready | Preview, Comment | May 22, 2026 3:16pm |

coderabbitai Bot commented May 22, 2026

Warning

Rate limit exceeded

@sriramveeraghanta has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 54 minutes and 24 seconds before requesting another review.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: b11e547c-cd86-4138-ba55-9ae1af2b30c9

📥 Commits

Reviewing files that changed from the base of the PR and between a08ce27 and bd97f85.

📒 Files selected for processing (1)
  • docs/self-hosting/govern/high-availability.md

Walkthrough

This PR adds comprehensive high-availability deployment guidance for Plane Enterprise on Kubernetes and updates documentation navigation. A new HA guide defines failure modes, workload tiers, cluster prerequisites, service wiring, scheduling patterns, operational controls, Karpenter automation, and pre-deployment checklists with reference configuration examples.

Changes

Kubernetes HA Documentation and Navigation

| Layer / File(s) | Summary |
| --- | --- |
| Navigation structure update (docs/.vitepress/config.mts) | VitePress sidebar configuration expands the Kubernetes entry into a collapsible group with a nested High availability documentation link. |
| HA definition and cluster prerequisites (docs/self-hosting/govern/high-availability.md, lines 1–138) | Introduces the scope of HA (AZ/node fault tolerance), defines workload tiers (Tier 1 stateless, Tier 2 singleton, Tier 3 optional local), specifies cluster prerequisites (3+ AZs, StorageClass behavior, cross-zone load balancing, topology labels), and shows recommended topology. |
| External services and pod scheduling strategy (docs/self-hosting/govern/high-availability.md, lines 139–234) | Documents how to disable local stateful services and configure remote managed endpoints for Postgres/Redis/RabbitMQ/OpenSearch/storage; provides scheduling guidance including nodeSelector, tolerations, and AZ-spreading pod anti-affinity rules with examples for pinning workloads to specific node pools. |
| Pod disruption and autoscaling controls (docs/self-hosting/govern/high-availability.md, lines 235–409) | Provides PodDisruptionBudget guidance with sample manifests for Tier-1 services and warnings against Tier-2 singletons; documents HorizontalPodAutoscaler configuration for Tier-1 workloads with examples and exclusions for specific jobs. |
| Infrastructure automation and operational tuning (docs/self-hosting/govern/high-availability.md, lines 410–629) | Covers AWS Karpenter HA setup including version requirements, EC2NodeClass/NodePool examples for on-demand and spot instances, disruption rules for Tier-1 vs Tier-2 workloads, ingress/load balancer configuration, backup and disaster recovery mechanisms, and known chart gaps with workarounds. |
| Pre-go-live checklist and values.yaml reference (docs/self-hosting/govern/high-availability.md, lines 630–706) | Provides pre-go-live validation checklist with failure drills and a reference values.yaml example showing how to disable local services, configure remote endpoints, set Tier-1 replica counts, and apply AZ anti-affinity patterns using YAML anchors for all Tier-1 services. |
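
To make the "pod disruption and autoscaling controls" row more concrete, here is a rough sketch of what a PodDisruptionBudget and HorizontalPodAutoscaler for one Tier-1 (stateless) workload could look like. The resource names, the `plane` namespace, the `app: plane-web` label, the replica bounds, and the CPU target are all hypothetical; the actual manifests and label selectors are in the guide itself and may differ. The sketch also assumes the workload is a Deployment.

```yaml
# Sketch only: names, namespace, labels, and thresholds are hypothetical.
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: plane-web-pdb
  namespace: plane
spec:
  maxUnavailable: 1             # allow voluntary eviction of one pod at a time
  selector:
    matchLabels:
      app: plane-web            # must match the pod labels of the target workload
---
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: plane-web-hpa
  namespace: plane
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment            # assumed workload kind
    name: plane-web
  minReplicas: 3                # e.g. one replica per AZ in a 3-AZ cluster
  maxReplicas: 9
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
```

A budget like this only makes sense for workloads with more than one replica: on a single-replica workload, a PDB that requires one available pod would block node drains outright, which is consistent with the summary's warning against applying PDBs to Tier-2 singletons.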

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

Poem

🐰 A guide for planes that soar so high,
With clusters spread across the sky,
Three zones of trust, no single point of pain,
Through Kubernetes we scale again!
High availability, documented with care. ✨

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title directly reflects the main change: a comprehensive guide for high availability deployment on Kubernetes is the primary addition to the documentation. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |


Copilot AI left a comment

Pull request overview

Adds a new self-hosting documentation guide describing how to run Plane’s Helm-based Kubernetes deployment in a high-availability (multi-node / multi-AZ) topology, and exposes it in the docs sidebar under the Kubernetes installation section.

Changes:

  • Added a comprehensive “High Availability on Kubernetes” guide covering workload tiers, external managed dependencies, pod spreading, PDBs/HPAs, and AWS Karpenter specifics.
  • Updated the VitePress sidebar to nest a “High availability” page under “Kubernetes”.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| docs/self-hosting/govern/high-availability.md | New HA guide for Kubernetes deployments (multi-AZ, external managed services, scheduling, PDB/HPA examples). |
| docs/.vitepress/config.mts | Sidebar navigation update to include the new HA guide under Kubernetes. |

Comment thread docs/self-hosting/govern/high-availability.md Outdated
Replace "Plane Enterprise" with "Plane Commercial Edition" in the
frontmatter description and body to match the rest of /self-hosting/govern,
add a Commercial Edition badge to the H1, and add the keywords field for
SEO consistency with neighboring pages.

Addresses Copilot review on #277.

coderabbitai Bot left a comment

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@docs/self-hosting/govern/high-availability.md`:
- Around line 105-135: The fenced ASCII topology block in high-availability.md
is missing a language tag causing MD040 lint errors; edit the opening fence of
the ASCII diagram (the ``` block containing the diagram/ASCII topology) and
change it to use the text language identifier (i.e., ```text) so the diagram
remains plain text and the markdown linter is satisfied.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 75f3c048-1778-4434-a716-9e3b3bbd9023

📥 Commits

Reviewing files that changed from the base of the PR and between f4817ee and a08ce27.

📒 Files selected for processing (2)
  • docs/.vitepress/config.mts
  • docs/self-hosting/govern/high-availability.md

Comment thread docs/self-hosting/govern/high-availability.md Outdated
sriramveeraghanta merged commit e14f7c1 into master on May 22, 2026
8 checks passed
sriramveeraghanta deleted the high-availability branch on May 22, 2026 15:16

3 participants