Pennylane integration

Status: Placeholder, to be developed.
Last reviewed: not yet reviewed.
Reference structural sibling: guidelines/i18n/translation-rules.md (rule + workflow style; connector-specific operational rules).

Scope (when this guideline lands)

Pennylane (French cloud accounting) sync: dual access model (Redshift Data Sharing + REST API v2), per-company tokens, cursor pagination, PDF download with fresh-URL handling, incremental ETL with checkpointing, hash-based integrity validation. Also covers how apps/pennylane_sync/ is wired and when to use each access path.

Out of scope (cross-refs)

Shared connector contract → guidelines/integrations/overview.md (placeholder).
Celery task shape → guidelines/celery/task-conventions.md (placeholder).
Finance domain model (GL, journal entries, reconciliation) → apps/finance/ and docs/finance-roadmap.md (reference).
PII / financial data redaction in logs → guidelines/security/pii-and-logging.md (placeholder).

Sources to mine when writing this

apps/pennylane_sync/ — current implementation (clients, tasks, models for sync state).
The reusable bundle (originally temp/pennylane_reusable/; verify current location): extracted from the penlane_tst prototype, it contains the dual-backend abstraction and ETL patterns.
roadmap/done/meta-finance-captable-planning-session.md — context on what was extracted from penlane_tst and why.
roadmap/done/finance-captable-planning-2026-04-12.md (or similar) — earlier planning that set the dual-backend approach.

Starter hard rules to investigate

Per-company tokens, never a shared "master" token. Each token is scoped to one Pennylane company at a time.
Cursor pagination required for any list endpoint: never rely on offset-based pagination.
Fresh URL on PDF download: Pennylane PDF URLs expire, so refetch the URL right before each download and never cache it.
Incremental ETL with checkpoint — store last-sync timestamp per company; resume on partial failure.
Hash-based integrity validation — every batch carries a hash; mismatch triggers re-fetch.
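
The cursor-pagination and checkpoint rules above might combine roughly as follows. This is a minimal sketch under stated assumptions, not the apps/pennylane_sync/ implementation: `Page`, `Checkpoint`, `sync_company`, and the injected `fetch_page` callable are hypothetical names, no real Pennylane endpoint or response shape is assumed, and the sketch assumes a cursor stays valid long enough to resume an interrupted run.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Optional


@dataclass
class Page:
    """One page of a list response; next_cursor is None on the last page (assumed shape)."""
    items: List[Any]
    next_cursor: Optional[str]


@dataclass
class Checkpoint:
    """Per-company sync state (hypothetical; a real one would live in the database)."""
    last_sync: Optional[str] = None  # timestamp of the last fully completed run
    cursor: Optional[str] = None     # resume point of an interrupted run, if any


# fetch_page(company_id, updated_since, cursor) -> Page
FetchPage = Callable[[str, Optional[str], Optional[str]], Page]


def sync_company(
    company_id: str,
    fetch_page: FetchPage,
    checkpoints: Dict[str, Checkpoint],
    handle_item: Callable[[Any], None],
    run_started_at: str,
) -> None:
    """Walk a cursor-paginated list incrementally, checkpointing after each page."""
    cp = checkpoints.setdefault(company_id, Checkpoint())
    cursor = cp.cursor  # None means "start from the first page"
    while True:
        page = fetch_page(company_id, cp.last_sync, cursor)
        for item in page.items:
            handle_item(item)
        if page.next_cursor is None:
            break
        cursor = page.next_cursor
        cp.cursor = cursor  # if the next fetch fails, the rerun resumes here
    # Only a fully completed run advances the incremental watermark.
    cp.last_sync = run_started_at
    cp.cursor = None
```

Checkpointing per page is only one of the options still open under "Decision points to settle". Whatever granularity is chosen, `handle_item` should be idempotent, since a resumed run can replay items from the page that was in flight when the failure happened.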

Decision points to settle

Redshift vs REST API v2: when to use which? Redshift for bulk historical, API v2 for incremental? Document the cutover.
Checkpoint frequency: per-batch? per-page? per-company? Storage vs replay-cost tradeoff.
Hash validation failure: retry-from-scratch vs alert + manual intervention?
Token rotation policy: how often, who triggers it, how is the new token deployed without downtime?
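
For the hash-validation decision point above, one possible middle ground between retry-from-scratch and immediate manual intervention is a bounded re-fetch that escalates once the attempts run out. The sketch below is illustrative only: the choice of SHA-256, the canonical JSON encoding, and the names `batch_digest`, `fetch_validated`, and `BatchIntegrityError` are assumptions, since the source does not say how batch hashes are computed.

```python
import hashlib
import json
from typing import Any, Callable, List, Tuple


class BatchIntegrityError(Exception):
    """Raised when a batch still fails validation after all re-fetch attempts."""


def batch_digest(rows: List[Any]) -> str:
    """SHA-256 over a canonical JSON encoding (sorted keys, compact separators)."""
    payload = json.dumps(rows, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def fetch_validated(
    fetch_batch: Callable[[], Tuple[List[Any], str]],
    max_attempts: int = 3,
) -> List[Any]:
    """Call fetch_batch until the rows match the hash it reports, or give up.

    fetch_batch is a hypothetical callable returning (rows, expected_hash).
    """
    for _ in range(max_attempts):
        rows, expected = fetch_batch()
        if batch_digest(rows) == expected:
            return rows
    # Escalation point: the caller decides whether to alert or restart the sync.
    raise BatchIntegrityError(f"hash mismatch after {max_attempts} attempts")
```

Raising a dedicated exception keeps the policy question open: the calling task can catch it and either alert or restart, depending on what the guideline eventually settles.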

Known deviations to look for during writing

Pennylane API calls without per-company token scoping.
Cached PDF URLs that 404 on download (URL expired).
Sync resuming from "beginning of time" because no checkpoint exists.
Financial data logged in cleartext (PII / regulatory risk).

If found, file as roadmap/backlog/integrations-pennylane-drift-2026-MM.md.
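
One way to make the cached-PDF-URL deviation above hard to reintroduce is to have the download helper accept a URL resolver instead of a URL string, so callers have nothing to cache. This is a sketch under assumptions: `download_pdf`, `UrlExpired`, `get_fresh_url`, and `http_get` are hypothetical names, and how an expired URL actually surfaces (for example a 403 or 404 response) would need to be confirmed against the real API.

```python
from typing import Callable


class UrlExpired(Exception):
    """Signals that the download URL was rejected as expired (hypothetical)."""


def download_pdf(
    document_id: str,
    get_fresh_url: Callable[[str], str],
    http_get: Callable[[str], bytes],
    max_attempts: int = 2,
) -> bytes:
    """Resolve the URL immediately before each download attempt; never reuse one."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    last_error: Exception = UrlExpired("no attempt made")
    for _ in range(max_attempts):
        url = get_fresh_url(document_id)  # refetched on every attempt, not cached
        try:
            return http_get(url)
        except UrlExpired as exc:
            last_error = exc  # URL expired between resolve and download; retry
    raise last_error
```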