Celery task conventions

Status: Placeholder, to be developed.
Last reviewed:
Reference structural sibling: guidelines/ui/forms.md (component-style sectioning + length).

Scope (when this guideline lands)

How to add and operate a Celery task: task naming (full dotted path, never bare), idempotency (must be safe to re-run), retry / backoff defaults, when to use @shared_task vs @app.task, when to use a task vs a service function vs a management command, beat schedule conventions, what make celery-logs / make beat-logs show, and how a task surfaces failure (Sentry, Django log, no silent swallowing).

Out of scope (cross-refs)

  • Per-integration sync logic (Doctolib, Pennylane, Logosw, INPI) → guidelines/integrations/*.md (placeholders). This file owns the task-shape rules; the integration files own connector specifics.
  • Logging and PII redaction in task output → guidelines/security/pii-and-logging.md (placeholder).
  • Service function placement (when business logic is in a task vs a service called by a task) → guidelines/backend/views.md (placeholder, has the service-layer convention).

Sources to mine when writing this

  • Celery configuration (config/celery.py or equivalent — locate the app definition).
  • Beat schedule (celerybeat-schedule is the runtime artefact at repo root; the source schedule is in code — find it).
  • Existing tasks: apps/sync/tasks.py (Doctolib daily 3 AM), apps/imports/tasks.py, apps/dataquality/tasks.py, apps/pennylane_sync/tasks.py.
  • Makefile celery targets — celery-logs, beat-logs, celery-heavy-logs, celery-restart. Document what each does.
  • Any past incident where a task ran twice or failed silently: use it to derive the rule that prevents recurrence (check roadmap/done/ for relevant entries).

Starter hard rules to investigate

  1. Full dotted task names in the name= argument, e.g. apps.sync.tasks.sync_doctolib_practice. Never rely on Celery's autonaming.
  2. Idempotent by construction: re-running a task with the same args produces the same end state. Use external idempotency keys where the operation isn't naturally idempotent.
  3. Explicit retry/backoff: exponential backoff with max_retries=N. Don't rely on defaults; declare per task.
  4. Errors surface to Sentry AND are logged at WARNING level or above. Never write try: … except: pass.
  5. Beat schedule colocation: the schedule entry lives next to the task definition (or in one canonical schedule file), not buried in settings.

Decision points to settle

  1. @shared_task vs @app.task: the codebase may have both. Pick one; @shared_task is more portable.
  2. Task vs management command: when to choose each. (Tasks: triggered by views or beat. Commands: ad-hoc CLI invocation.)
  3. Heavy vs light queue split: celery-heavy-logs exists — what goes on the heavy queue? Document the routing rule.
  4. Result backend: are task results stored? If yes, define the retention period; if no, document the ignore_result=True policy.
  5. Beat lock: ensure only one beat scheduler runs (avoid duplicate schedule firing) — document the lock mechanism.

Known deviations to look for during writing

  • Tasks with try: … except Exception: pass (silent swallow).
  • Tasks without idempotency guards on operations that aren't naturally idempotent (file uploads, external API writes).
  • Bare task names that auto-generate (lose intent in logs).
  • Beat schedule entries without a clear owner / reason comment.

If found, file as roadmap/backlog/celery-task-drift-2026-MM.md.
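
The silent-swallow deviation can be searched for mechanically. The following is a rough sketch using only the standard library; the function name find_silent_swallows and the sample source are illustrative, and the check is deliberately narrow (it flags bare except, except Exception, and except BaseException handlers whose body is nothing but pass).

```python
import ast

BROAD_EXCEPTIONS = {"Exception", "BaseException"}


def find_silent_swallows(source, filename="<string>"):
    """Return sorted (line, caught) pairs for broad handlers whose body is only `pass`."""
    findings = []
    for node in ast.walk(ast.parse(source, filename=filename)):
        if not isinstance(node, ast.ExceptHandler):
            continue
        if node.type is None:
            caught = "bare except"
        elif isinstance(node.type, ast.Name) and node.type.id in BROAD_EXCEPTIONS:
            caught = node.type.id
        else:
            continue  # narrow handlers such as `except ValueError` are not flagged
        if all(isinstance(stmt, ast.Pass) for stmt in node.body):
            findings.append((node.lineno, caught))
    return sorted(findings)


SAMPLE = '''
def sync():
    try:
        push()
    except Exception:
        pass
    try:
        push()
    except ValueError:
        pass
'''

print(find_silent_swallows(SAMPLE))  # → [(5, 'Exception')]
```

To audit the codebase, the same function could be fed the contents of each tasks.py file listed under "Sources to mine". It will miss handlers that catch a tuple of exceptions or that log at too low a level, so a manual pass is still needed.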