
fix(api): bound Celery worker concurrency to a configurable default#11075

Open
b-abderrahmane wants to merge 1 commit into prowler-cloud:master from b-abderrahmane:fix/celery-worker-concurrency-default

Conversation

@b-abderrahmane

Context

Fix #10968.

Without an explicit worker_concurrency setting, Celery falls back to os.cpu_count() of the host. On Kubernetes nodes with many CPUs (16+) the worker container spawns one prefork child per CPU, each loading the full SDK working set, and routinely OOMKills under typical 4 GiB pod limits even when idle. The existing --max-tasks-per-child 1 recycles each child after one task — it bounds per-task leaks but not the number of concurrent slots.
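
For reference, the fallback is easy to observe with a bare Celery app (a minimal sketch, assuming only celery is installed; the app name is arbitrary):

import os

from celery import Celery

app = Celery("probe")  # hypothetical app name, only used to read the default conf
print(app.conf.worker_concurrency)  # unset -> stays at the default (None/0 depending on Celery version)
print(os.cpu_count())               # ...which the worker falls back to at startup, e.g. 20 here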

Description

Adds CELERY_WORKER_CONCURRENCY = env.int("DJANGO_CELERY_WORKER_CONCURRENCY", default=4) in config/settings/celery.py. The Celery app already loads Django settings via config_from_object("django.conf:settings", namespace="CELERY"), so the setting flows through to app.conf.worker_concurrency and the prefork pool size with no entrypoint changes.
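
Concretely, the change plus the pre-existing wiring it relies on (a sketch; in the real module, env is the django-environ instance already defined at the top of the settings file):

import environ

env = environ.Env()  # already exists in config/settings/celery.py

CELERY_DEADLOCK_ATTEMPTS = env.int("DJANGO_CELERY_DEADLOCK_ATTEMPTS", default=5)    # existing sibling
CELERY_WORKER_CONCURRENCY = env.int("DJANGO_CELERY_WORKER_CONCURRENCY", default=4)  # this PR

# Existing wiring in config/celery.py: every CELERY_*-prefixed Django setting
# maps to the lower-cased Celery option of the same name, so the value above
# lands in celery_app.conf.worker_concurrency:
#
#   celery_app.config_from_object("django.conf:settings", namespace="CELERY")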

A default of 4 matches the per-process SDK memory footprint observed in typical deployments; operators size up via DJANGO_CELERY_WORKER_CONCURRENCY (also added to api/.env.example next to the sibling DJANGO_CELERY_DEADLOCK_ATTEMPTS).
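
The documented lines in api/.env.example look like this (a sketch; exact placement and surrounding comments may differ):

# Celery worker tuning
DJANGO_CELERY_DEADLOCK_ATTEMPTS=5
DJANGO_CELERY_WORKER_CONCURRENCY=4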

Changes:

  • api/src/backend/config/settings/celery.py — one Django setting, matching the existing env.int("DJANGO_*", default=...) convention used by DJANGO_CELERY_DEADLOCK_ATTEMPTS.
  • api/.env.example — document the new env var alongside its siblings.
  • api/CHANGELOG.md — entry under the 🐞 Fixed heading of [1.27.0] (Prowler UNRELEASED).

Steps to review

  1. Read the 1-line settings change and the changelog/env-example additions — three files, +7 lines.
  2. Confirm the setting name/namespace is consistent with existing siblings:
    • CELERY_DEADLOCK_ATTEMPTS = env.int("DJANGO_CELERY_DEADLOCK_ATTEMPTS", default=5) (existing)
    • CELERY_WORKER_CONCURRENCY = env.int("DJANGO_CELERY_WORKER_CONCURRENCY", default=4) (this PR)
  3. Optionally reproduce the validation below in any K8s deployment.

Evidence

Validated end-to-end on a 20-core AKS node with the worker image rebuilt from this branch (prowler-api, dockerized from api/). Worker Deployment patched to use the test image; same pod spec (requests.memory=1Gi, limits.memory=4Gi).

Before vs after

Metric                                       Before (master, no setting)   After (this PR, default 4)
Django settings.CELERY_WORKER_CONCURRENCY    (unset → falls through)       4
app.conf.worker_concurrency                  os.cpu_count() = 20           4
Prefork children per pod                     20                            4
Idle resident memory                         ~3.4 GiB                      ~880 MiB
Restart count (over 26h soak)                4 and 8 (OOMKilled)           0
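
(The memory rows are mutually consistent: ~3.4 GiB across 1 parent + 20 children and ~880 MiB across 1 parent + 4 children both imply a resident footprint on the order of 170 MiB per process, matching the per-process SDK working set described above.)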

Process tree (after) — 1 parent + 4 ForkPool children:

$ kubectl exec prowler-worker-7c56d9dd6d-4lcdt -- sh -c \
    'for d in /proc/[0-9]*; do echo "pid=${d#/proc/}  $(tr "\0" " " < "$d/cmdline")"; done' | grep celery
pid=10  python -m celery -A config.celery worker -n c5...
pid=38  python -m celery -A config.celery worker -n c5...   # ForkPoolWorker-1
pid=39  python -m celery -A config.celery worker -n c5...   # ForkPoolWorker-2
pid=40  python -m celery -A config.celery worker -n c5...   # ForkPoolWorker-3
pid=41  python -m celery -A config.celery worker -n c5...   # ForkPoolWorker-4

Wiring proof — Django setting → Celery config:

$ kubectl exec prowler-worker-... -- python manage.py shell -c "
from django.conf import settings
from config.celery import celery_app
print(settings.CELERY_WORKER_CONCURRENCY)
print(celery_app.conf.worker_concurrency)
"
4
4

Override demonstration: DJANGO_CELERY_WORKER_CONCURRENCY=2 propagates through:

$ kubectl set env deploy/prowler-worker DJANGO_CELERY_WORKER_CONCURRENCY=2
$ kubectl exec <new-pod> -- printenv DJANGO_CELERY_WORKER_CONCURRENCY
2
$ kubectl exec <new-pod> -- python manage.py shell -c "..."
settings.CELERY_WORKER_CONCURRENCY=2
celery_app.conf.worker_concurrency=2
$ kubectl top pod <new-pod>
NAME                              CPU(cores)   MEMORY(bytes)
prowler-worker-5cb6bb6bc4-2b9q7   521m         564Mi          # was 3.4 GiB

Process count drops to 1 parent + 2 children (brief overlaps from --max-tasks-per-child 1 recycling are expected).

Checklist

  • Review if the code is being covered by tests.
    • The change is a single Django settings line wrapping env.int() (the same pattern already used for DJANGO_CELERY_DEADLOCK_ATTEMPTS, which likewise has no dedicated test in api/src/backend/api/tests/test_celery_settings.py). It was instead validated in a real cluster as documented above; a minimal wiring test could look like the sketch after this checklist.
  • Review if the code is documented following pyguide §3.8.
  • Review if backport is needed.
  • Review if the Readme.md needs to be changed — N/A.
  • Ensure new entries are added to CHANGELOG.md — added under [1.27.0]🐞 Fixed.
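
If a dedicated unit test is wanted later, it could look like this (a hypothetical sketch, not part of this PR; assumes a configured Django test environment, e.g. via pytest-django, with the env var unset):

from django.conf import settings

from config.celery import celery_app


def test_worker_concurrency_flows_into_celery_conf():
    # Default when DJANGO_CELERY_WORKER_CONCURRENCY is not set in the environment.
    assert settings.CELERY_WORKER_CONCURRENCY == 4
    # config_from_object(..., namespace="CELERY") must carry it into Celery.
    assert celery_app.conf.worker_concurrency == settings.CELERY_WORKER_CONCURRENCY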

API

  • All issue/task requirements work as expected on the API — see Evidence above.
  • Endpoint response output (if applicable) — N/A (worker-side change, no endpoints touched).
  • EXPLAIN ANALYZE output for new/modified queries or indexes (if applicable) — N/A.
  • Performance test results (if applicable) — included above (memory + process-count).
  • Any other relevant evidence of the implementation (if applicable) — included above.
  • Verify if API specs need to be regenerated — N/A.
  • Check if version updates are required — N/A (no dependency or spec changes).
  • Ensure new entries are added to CHANGELOG.md.

License

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

Without an explicit `worker_concurrency` setting, Celery falls back to
`os.cpu_count()` of the host. On Kubernetes nodes with many CPUs (16+)
the worker container spawns one prefork child per CPU, each loading the
full SDK working set, and routinely OOMKills under the typical 4 GiB
limit even when idle.

Add `CELERY_WORKER_CONCURRENCY = env.int("DJANGO_CELERY_WORKER_CONCURRENCY", default=4)`
in `config/settings/celery.py`. The Celery app already loads Django
settings via `config_from_object("django.conf:settings", namespace="CELERY")`,
so the setting flows through to `app.conf.worker_concurrency` and the
prefork pool size with no entrypoint changes. Default of 4 matches the
per-process SDK memory footprint in typical deployments; operators size
up via `DJANGO_CELERY_WORKER_CONCURRENCY`.

Closes prowler-cloud#10968
@b-abderrahmane b-abderrahmane requested a review from a team as a code owner May 7, 2026 19:41
@github-actions github-actions Bot added component/api community Opened by the Community labels May 7, 2026
@github-actions

github-actions Bot commented May 7, 2026

Conflict Markers Resolved

All conflict markers have been successfully resolved in this pull request.



Development

Successfully merging this pull request may close these issues.

Celery workers OOM on hosts with many CPUs — --concurrency is unbounded by default
