AbstractImporter usage

Status: Placeholder, to be developed.
Last reviewed: not yet reviewed.
Reference structural sibling: guidelines/ui/forms.md (component-style sectioning and length).

Scope (when this guideline lands)

Conventions for adding a new importer: subclassing AbstractImporter, declaring the ImportRow schema, bulk-mode rules, error handling, idempotency, file-format expectations, and how the importer is wired into apps/collection/ (remote agent uploads) and the admin.

Out of scope (cross-refs)

Logosw-specific format quirks (encoding, category mapping, CIVIL.txt vs ACTES_2.txt structure) → guidelines/integrations/logosw.md (placeholder).
Collection / remote agent upload pipeline (the file-arrival side: heartbeat, batching, retry, validation) → not yet a guideline; agent backlog has many active items.
Form construction for upload UI → guidelines/ui/forms.md and guidelines/backend/forms.md (placeholder).
Celery task shape for the import task → guidelines/celery/task-conventions.md (placeholder).

Sources to mine when writing this

apps/imports/ — AbstractImporter definition (importers/__init__.py or importers/base.py).
Existing importers: CIVIL, ACTES_2, GL trial balance (most recent: apps/imports/importers/gl_trial_balance.py).
apps/imports/models.py — Import and ImportRow models.
docs/IMPORTS_MODULE.md — existing module documentation (in French); extract the rules, leave the reference material in docs/.
roadmap/done/imports-bulk-performance.md — bulk-mode lessons (OOM avoidance, debug_mode flag).
roadmap/done/imports-encoding-fixes.md — encoding handling.
roadmap/done/imports-gl-payment-redesign.md — recent 5-phase migration; reference for evolving an importer.
roadmap/done/imports-logging-optimization.md — logging discipline.

Starter hard rules to investigate

Subclass AbstractImporter, declare ImportRow schema in __init__, implement _import_row(row) — never bypass the base class.
Idempotent re-run: re-importing the same file produces the same DB state. Use natural keys + update_or_create.
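Every row gets an ImportRow with status 'imported', 'skipped', or 'failed'; no row is dropped silently.
Bulk mode for >1000 rows: bulk_create with ignore_conflicts=True, batch size from settings. To verify: ignore_conflicts=True skips conflicting rows without updating them, which may clash with the idempotency and one-ImportRow-per-row rules; Django 4.1+ also supports update_conflicts=True for upsert behavior.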
Errors logged WITHOUT PII (per guidelines/security/pii-and-logging.md) — use the row's external ID, not patient name.
debug_mode=True disables bulk and adds verbose logging for diagnosing single-row failures.
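
The first three rules can be illustrated with a minimal, framework-free sketch. Everything in it is hypothetical: SketchImporter stands in for AbstractImporter (whose real API still has to be mined from apps/imports/), RowResult stands in for ImportRow, and a plain dict stands in for the database so the example runs without Django. It only shows the shape of the rules: the base class owns the loop, every row gets a recorded status, and writes are keyed on a natural key so a re-run leaves the same state.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass
class RowResult:
    """Hypothetical stand-in for an ImportRow record."""
    line_no: int
    external_id: str
    status: str  # 'imported', 'skipped' or 'failed'
    message: str = ""


class SkipRow(Exception):
    """Hypothetical signal: the subclass chose to skip this row."""


class SketchImporter(ABC):
    """Hypothetical stand-in for AbstractImporter; the real class may differ."""

    def __init__(self):
        self.results = []

    @abstractmethod
    def _import_row(self, row):
        """Persist one row; raise SkipRow to skip it deliberately."""

    def run(self, rows):
        # The base class owns the loop so that no row can go unrecorded.
        for line_no, row in enumerate(rows, start=1):
            external_id = str(row.get("external_id", ""))
            try:
                self._import_row(row)
                status, message = "imported", ""
            except SkipRow as exc:
                status, message = "skipped", str(exc)
            except Exception as exc:
                # Keep only the exception type: messages can embed row values (PII).
                status, message = "failed", type(exc).__name__
            self.results.append(RowResult(line_no, external_id, status, message))
        return self.results


class AccountImporter(SketchImporter):
    """Hypothetical concrete importer writing to an in-memory store."""

    def __init__(self, store):
        super().__init__()
        self.store = store  # dict keyed by natural key, standing in for a table

    def _import_row(self, row):
        if not row.get("external_id"):
            raise SkipRow("missing external_id")
        balance = int(row["balance"])  # raises ValueError on bad input
        # Upsert on the natural key, in the spirit of update_or_create.
        self.store[row["external_id"]] = {"balance": balance}


rows = [
    {"external_id": "A1", "balance": "100"},
    {"external_id": "", "balance": "5"},
    {"external_id": "A2", "balance": "not-a-number"},
]
store = {}
first_run = AccountImporter(store).run(rows)
snapshot = dict(store)
second_run = AccountImporter(store).run(rows)  # re-import of the same file
```

In this sketch the second run produces the same store contents as the first, and the bad row is recorded as 'failed' under its external ID rather than swallowed. Whether the real base class uses an exception, a return value, or something else to signal a skip is an open question for the guideline.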

Decision points to settle

Error policy: skip-and-log vs fail-fast on first error vs configurable per-importer?
Batch size: per-importer override vs global setting?
Progress reporting: how to expose import progress to the UI (HTMX poll? AJAX progress per the imports-ajax-progress Idea?)
Pre-validation step: should importers validate file structure BEFORE processing rows (fail-fast on bad header), or row-by-row?
Re-import without re-uploading: management command to replay an existing Import record's file? Useful for fixing importer bugs against historical data.
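
To make the batch-size question concrete, here is a small sketch of one possible answer: a global default with an optional per-importer override. It is not a decision. DEFAULT_BATCH_SIZE, BaseSketch, LargeFileImporter, and effective_batch_size are all hypothetical names, and the value 1000 is a placeholder rather than a project setting.

```python
from itertools import islice

DEFAULT_BATCH_SIZE = 1000  # hypothetical global setting


def batched(rows, size):
    """Yield lists of at most `size` rows without materializing the whole input."""
    if size < 1:
        raise ValueError("batch size must be at least 1")
    iterator = iter(rows)
    while batch := list(islice(iterator, size)):
        yield batch


class BaseSketch:
    """Hypothetical base: None means 'fall back to the global default'."""
    batch_size = None

    def effective_batch_size(self):
        return self.batch_size or DEFAULT_BATCH_SIZE


class LargeFileImporter(BaseSketch):
    """Hypothetical importer that overrides the default."""
    batch_size = 250


default_size = BaseSketch().effective_batch_size()
override_size = LargeFileImporter().effective_batch_size()
batch_lengths = [len(b) for b in batched(range(600), override_size)]
```

Each yielded batch would be what gets handed to a bulk write. The alternative of a single global setting with no override is the same code minus the class attribute, which is part of what makes the trade-off worth deciding explicitly.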

Known deviations to look for during writing

Importers that swallow errors with except Exception: pass and don't record an ImportRow.
Importers that call model.save() once per row (for row in file: model.save()) instead of using bulk mode (slow on large files).
Importer logs containing patient names or other PII.
New importers that don't subclass AbstractImporter (custom base — investigate why).

If any are found, file them as roadmap/backlog/imports-importer-drift-2026-MM.md.
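
One possible way to hunt for the first deviation (silently swallowed exceptions) is a small scan built on the standard-library ast module. This is only a sketch: find_silent_handlers is a hypothetical helper, the sample source is invented for illustration, and a real audit would read the files under apps/imports/importers/ instead.

```python
import ast


def find_silent_handlers(source):
    """Return line numbers of `except` clauses whose body is only `pass`."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.ExceptHandler) and all(
            isinstance(stmt, ast.Pass) for stmt in node.body
        ):
            hits.append(node.lineno)
    return sorted(hits)


# Invented sample: the first handler is silent, the second records the failure.
# The source is only parsed, never executed, so the undefined names are harmless.
sample = '''
def _import_row(self, row):
    try:
        save(row)
    except Exception:
        pass
    try:
        save(row)
    except ValueError as exc:
        record_failure(row, exc)
'''
silent_lines = find_silent_handlers(sample)
```

The scan flags only handlers that do nothing at all; a handler that logs but never records an ImportRow would still need manual review.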