
Pennylane integration

Status: Placeholder, to be developed.

Last reviewed:

Reference structural sibling: guidelines/i18n/translation-rules.md (rule + workflow style, connector-specific operational rules).

Scope (when this guideline lands)

Pennylane (French cloud accounting) sync: dual access model (Redshift Data Sharing + REST API v2), per-company tokens, cursor pagination, PDF download with fresh-URL handling, incremental ETL with checkpointing, hash-based integrity validation. How apps/pennylane_sync/ is wired and when to use which access path.
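As a starting point for the per-company token and fresh-URL items in the scope above, here is a minimal sketch. Every name in it (`token_for`, `download_pdf`, `get_pdf_url`, `http_get`, the `tokens` mapping) is hypothetical and does not describe what apps/pennylane_sync/ currently contains; the two callables stand in for whatever HTTP client the final implementation uses.

```python
from typing import Callable, Mapping


class MissingCompanyToken(KeyError):
    """No token is configured for the requested company."""


def token_for(company_id: str, tokens: Mapping[str, str]) -> str:
    # Deliberately no fallback to a shared token: a missing entry is an error.
    try:
        return tokens[company_id]
    except KeyError:
        raise MissingCompanyToken(company_id) from None


def download_pdf(
    company_id: str,
    invoice_id: str,
    tokens: Mapping[str, str],
    get_pdf_url: Callable[[str, str], str],  # hypothetical: (invoice_id, token) -> URL
    http_get: Callable[[str], bytes],        # hypothetical: URL -> file bytes
) -> bytes:
    token = token_for(company_id, tokens)
    # Ask for the URL on every call, immediately before downloading.
    # Nothing is stored between calls, so an expired URL is never reused.
    url = get_pdf_url(invoice_id, token)
    return http_get(url)
```

Keeping the URL lookup and the download in one function is a design choice: it leaves no place for a caller to hold on to a URL that may have expired.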

Out of scope (cross-refs)

  • Shared connector contract → guidelines/integrations/overview.md (placeholder).
  • Celery task shape → guidelines/celery/task-conventions.md (placeholder).
  • Finance domain model (GL, journal entries, reconciliation) → apps/finance/ and docs/finance-roadmap.md (reference).
  • PII / financial data redaction in logs → guidelines/security/pii-and-logging.md (placeholder).

Sources to mine when writing this

  • apps/pennylane_sync/ — current implementation (clients, tasks, models for sync state).
  • The reusable bundle (originally temp/pennylane_reusable/ — verify current location). Extracted from the penlane_tst prototype, contains the dual-backend abstraction and ETL patterns.
  • roadmap/done/meta-finance-captable-planning-session.md — context on what was extracted from penlane_tst and why.
  • roadmap/done/finance-captable-planning-2026-04-12.md (or similar) — earlier planning that set the dual-backend approach.

Starter hard rules to investigate

  1. Per-company tokens, never a shared "master" token. Tokens scoped to one Pennylane company at a time.
  2. Cursor pagination is required for every list endpoint; never rely on offset-based pagination.
  3. Fresh URL on PDF download — Pennylane PDF URLs expire; refetch the URL right before download, don't cache.
  4. Incremental ETL with checkpoint — store last-sync timestamp per company; resume on partial failure.
  5. Hash-based integrity validation — every batch carries a hash; mismatch triggers re-fetch.
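Rules 2 and 4 above can be sketched together as a cursor loop that checkpoints after each page. This is an illustration under stated assumptions, not the current implementation: `Page`, `Checkpoint`, `CheckpointStore`, `sync_company` and the `fetch_page` signature are all hypothetical, the real API v2 response fields may differ, and per-page checkpointing is only one of the frequencies still to be decided.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class Page:
    """Hypothetical page shape; the real API v2 response fields may differ."""
    items: list
    next_cursor: Optional[str]  # None means this was the last page


@dataclass
class Checkpoint:
    last_sync: Optional[str] = None  # timestamp of the last completed run
    cursor: Optional[str] = None     # resume point inside an unfinished run


@dataclass
class CheckpointStore:
    """In-memory stand-in for a per-company sync-state table (assumption)."""
    state: dict = field(default_factory=dict)

    def load(self, company_id: str) -> Checkpoint:
        return self.state.get(company_id) or Checkpoint()

    def save(self, company_id: str, checkpoint: Checkpoint) -> None:
        self.state[company_id] = checkpoint


def sync_company(
    company_id: str,
    fetch_page: Callable[[str, Optional[str], Optional[str]], Page],
    store: CheckpointStore,
    handle: Callable[[object], None],
    run_started_at: str,
) -> int:
    """Process all pages for one company and return the number of items handled."""
    cp = store.load(company_id)
    synced = 0
    while True:
        # fetch_page(company_id, since, cursor) is a hypothetical client call.
        page = fetch_page(company_id, cp.last_sync, cp.cursor)
        for item in page.items:
            handle(item)
        synced += len(page.items)
        if page.next_cursor is None:
            break
        # Persist the resume point only after the page was fully handled.
        cp.cursor = page.next_cursor
        store.save(company_id, cp)
    # Run complete: record the timestamp and clear the in-run cursor.
    cp.last_sync, cp.cursor = run_started_at, None
    store.save(company_id, cp)
    return synced
```

If `fetch_page` or `handle` raises midway, the stored cursor still points at the first page that was not fully processed, so the next run resumes there instead of from the beginning. A page interrupted partway through is replayed in full, which means `handle` should be idempotent.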

Decision points to settle

  1. Redshift vs REST API v2: when to use which? Redshift for bulk historical, API v2 for incremental? Document the cutover.
  2. Checkpoint frequency: per-batch? per-page? per-company? Storage vs replay-cost tradeoff.
  3. Hash validation failure: retry-from-scratch vs alert + manual intervention?
  4. Token rotation policy: how often, who triggers it, how is the new token deployed without downtime?
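To make decision point 3 concrete, the sketch below shows one possible middle ground: re-fetch a bounded number of times, then raise so that alerting can take over. It is not a settled policy. The hashing scheme (SHA-256 over canonical JSON), the `fetch_batch` callable and `IntegrityError` are assumptions made for illustration; where the expected hash actually comes from is still to be documented.

```python
import hashlib
import json
from typing import Callable, Tuple


class IntegrityError(Exception):
    """Raised when a batch still fails hash validation after all attempts."""


def batch_hash(records: list) -> str:
    # Canonical JSON (sorted keys, fixed separators) so equal data hashes equally.
    payload = json.dumps(records, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def fetch_validated(
    fetch_batch: Callable[[], Tuple[list, str]],  # hypothetical: returns (records, expected_hash)
    max_attempts: int = 3,
) -> list:
    for _ in range(max_attempts):
        records, expected = fetch_batch()
        if batch_hash(records) == expected:
            return records
        # Mismatch: fall through and fetch the batch again.
    raise IntegrityError(f"hash mismatch after {max_attempts} attempts")
```

Bounding the retries keeps a persistently corrupt batch from looping forever, and the raised exception gives the caller a single place to hook in an alert or a manual-intervention workflow.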

Known deviations to look for during writing

  • Pennylane API calls without per-company token scoping.
  • Cached PDF URLs that 404 on download (URL expired).
  • Sync resuming from "beginning of time" because no checkpoint.
  • Financial data logged in clear (PII / regulatory risk).

If found, file as roadmap/backlog/integrations-pennylane-drift-2026-MM.md.
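For the last deviation in the list (financial data logged in clear), a small redaction helper applied before any payload reaches a logger is one possible mitigation. The sketch assumes JSON-like payloads; the key names in `SENSITIVE_KEYS` are hypothetical examples, and the authoritative list belongs in guidelines/security/pii-and-logging.md once it exists.

```python
# Hypothetical key names; the real list is still to be defined.
SENSITIVE_KEYS = frozenset({"iban", "amount", "customer_name", "vat_number"})

REDACTED = "[REDACTED]"


def redact(value, sensitive=SENSITIVE_KEYS):
    """Return a copy of a JSON-like value with sensitive fields masked."""
    if isinstance(value, dict):
        return {
            key: REDACTED if str(key).lower() in sensitive else redact(item, sensitive)
            for key, item in value.items()
        }
    if isinstance(value, list):
        return [redact(item, sensitive) for item in value]
    # Scalars outside a sensitive key are returned unchanged.
    return value
```

A call site would then log `redact(payload)` rather than `payload`. The helper builds new containers, so the original payload is left untouched for the sync itself.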