Skip to content

Optimize gunicorn worker configuration#7935

Open
foozleface wants to merge 2 commits into
specify:mainfrom
calacademy-research:cas/perf-gunicorn-7880
Open

Optimize gunicorn worker configuration#7935
foozleface wants to merge 2 commits into
specify:mainfrom
calacademy-research:cas/perf-gunicorn-7880

Conversation

@foozleface

@foozleface foozleface commented Apr 9, 2026

Copy link
Copy Markdown
Collaborator

Fixes #7880
Contributed by @foozleface

The default gunicorn configuration uses 3 sync workers, each a separate forked process holding its own copy of the Django app in memory. For a memory-constrained container, switching to gthread (1 worker with 5 threads) reduces RSS by ~60% while maintaining concurrency. Periodic worker restart via --max-requests prevents memory leaks from accumulating. Celery workers also get worker_max_tasks_per_child=100 for the same reason.

Implementation

  • Change Dockerfile CMD from gunicorn -w 3 (sync) to gunicorn --worker-class gthread -w 1 --threads 5 --max-requests 500 --max-requests-jitter 50
  • Add app.conf.worker_max_tasks_per_child = 100 to celery_tasks.py so Celery workers restart after 100 tasks
  • Add deployment config tests that parse the Dockerfile and verify gthread, single worker, max-requests, and thread count

Testing instructions

  • Build the Docker image and verify gunicorn starts with gthread workers: docker logs <container> | grep gthread
  • Confirm worker count is 1 and threads is 5 in the startup log
  • Run the test suite: pytest tests/test_deployment_config.py -v
  • Monitor memory usage under load — RSS should be significantly lower than the 3-sync-worker configuration

Summary by CodeRabbit

  • Tests

    • Added a deployment configuration test suite that verifies the application server and task queue worker runtime settings.
  • Chores

    • Switched the application server to a threaded worker model with a single worker and multiple threads, and enabled automatic worker recycling.
    • Enabled task queue worker recycling after a fixed number of processed tasks.

@grantfitzsimmons

Copy link
Copy Markdown
Member

@coderabbitai review

@coderabbitai

coderabbitai Bot commented Apr 27, 2026

Copy link
Copy Markdown
Contributor
✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@coderabbitai

coderabbitai Bot commented Apr 27, 2026

Copy link
Copy Markdown
Contributor

Review Change Stack

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 7c4135ea-6b09-4c10-b607-dca2579043e1

📥 Commits

Reviewing files that changed from the base of the PR and between 2958a1e and 4a934d9.

📒 Files selected for processing (1)
  • tests/test_deployment_config.py

📝 Walkthrough

Walkthrough

Gunicorn startup flags in the Dockerfile are changed to a single gthread worker with 5 threads and max-requests recycling; Celery is configured to recycle child workers after 100 tasks; tests were added to validate both Dockerfile Gunicorn flags and the Celery setting.

Changes

Deployment configuration

Layer / File(s) Summary
Tests: module docstring + parsing helper
tests/test_deployment_config.py
Adds docstring, repo/Dockerfile path constants, and _parse_gunicorn_cmd() which extracts and JSON-decodes the Dockerfile CMD and returns the Gunicorn CLI flags.
Tests: Gunicorn assertions
tests/test_deployment_config.py
Adds TestGunicornConfig asserting --worker-class gthread, -w 1, --max-requests 500, --max-requests-jitter 50, and --threads 5 are present in the Dockerfile CMD.
Celery worker recycle assertion
tests/test_deployment_config.py, specifyweb/celery_tasks.py
Adds TestCeleryConfig that verifies app.conf.worker_max_tasks_per_child is set to 100 in specifyweb/celery_tasks.py.
🚥 Pre-merge checks | ✅ 5 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Testing Instructions ⚠️ Warning Testing instructions specify pytest tests/test_deployment_config.py -v, but pytest is not in requirements files, making the instructions unclear and incomplete. Add pytest to requirements-testing.txt and document the full setup: install dependencies, then run pytest. Or convert tests to Django TestCase for CI integration.
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'Optimize gunicorn worker configuration' clearly and concisely summarizes the main change: switching from 3 sync workers to a threaded configuration with memory optimization, which aligns with the primary changeset.
Linked Issues check ✅ Passed All coding requirements from issue #7880 are met: switched to gthread workers [Dockerfile], configured single worker with 5 threads [Dockerfile], added worker restart policies [celery_tasks.py, Dockerfile], and added deployment config tests [test_deployment_config.py].
Out of Scope Changes check ✅ Passed All changes are scoped to the stated objectives: Gunicorn configuration optimization, Celery worker restart policy, and deployment configuration tests. No unrelated modifications detected.
Automatic Tests ✅ Passed PR includes comprehensive automatic tests in tests/test_deployment_config.py verifying gunicorn gthread configuration, worker count, max-requests, threads, and Celery worker recycling.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
specifyweb/celery_tasks.py (1)

13-13: Add regression coverage for the Celery recycle limit.

This is part of the rollout, but it is currently untested in the provided changes. A small test that asserts app.conf.worker_max_tasks_per_child == 100 would keep the Celery half from regressing while the Dockerfile checks still pass.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@specifyweb/celery_tasks.py` at line 13, Add a regression unit test that
asserts the Celery recycle limit remains set to 100 by checking
app.conf.worker_max_tasks_per_child equals 100; implement this in the test suite
(e.g., create a new test function like test_celery_worker_max_tasks_per_child)
import or fixture the same Celery `app` used in specifyweb/celery_tasks.py, and
assert app.conf.worker_max_tasks_per_child == 100 so changes to that value will
fail the test.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@tests/test_deployment_config.py`:
- Around line 53-65: The tests test_max_requests_present and
test_threads_configured are too permissive; change them to assert the exact
tuned values: verify that "--max-requests" exists and that the parsed value
equals 500, verify that "--threads" exists and that the parsed value equals 50,
and add a new assertion to check that "--max-requests-jitter" exists and its
parsed value equals 5 (all using the same pattern of finding the flag in
self.args and reading the next element). Use the existing parsing pattern (idx =
self.args.index(...); value = int(self.args[idx + 1])) and update the assertion
messages to reflect the expected exact values.

---

Nitpick comments:
In `@specifyweb/celery_tasks.py`:
- Line 13: Add a regression unit test that asserts the Celery recycle limit
remains set to 100 by checking app.conf.worker_max_tasks_per_child equals 100;
implement this in the test suite (e.g., create a new test function like
test_celery_worker_max_tasks_per_child) import or fixture the same Celery `app`
used in specifyweb/celery_tasks.py, and assert
app.conf.worker_max_tasks_per_child == 100 so changes to that value will fail
the test.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 6b977e44-70d2-4f56-8c82-23276698e998

📥 Commits

Reviewing files that changed from the base of the PR and between a0f96a8 and 2958a1e.

📒 Files selected for processing (3)
  • Dockerfile
  • specifyweb/celery_tasks.py
  • tests/test_deployment_config.py

Comment thread tests/test_deployment_config.py Outdated
@github-project-automation github-project-automation Bot moved this from 📋Back Log to Dev Attention Needed in General Tester Board Apr 27, 2026
Tighten --max-requests (500), --threads (5) assertions, add coverage for
--max-requests-jitter (50) and celery worker_max_tasks_per_child (100).
Addresses review feedback on specify#7935.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Status: Dev Attention Needed

Development

Successfully merging this pull request may close these issues.

Default to using Gunicorn gthread or gevent workers over sync

2 participants