Pennylane integration
Status: Placeholder, to be developed.
Last reviewed: n/a
Reference structural sibling: guidelines/i18n/translation-rules.md (rule + workflow style; connector-specific operational rules).
Scope (when this guideline lands)
Pennylane (French cloud accounting) sync: dual access model (Redshift Data Sharing + REST API v2), per-company tokens, cursor pagination, PDF download with fresh-URL handling, incremental ETL with checkpointing, hash-based integrity validation. How apps/pennylane_sync/ is wired and when to use which access path.
Out of scope (cross-refs)
- Shared connector contract → guidelines/integrations/overview.md (placeholder).
- Celery task shape → guidelines/celery/task-conventions.md (placeholder).
- Finance domain model (GL, journal entries, reconciliation) → apps/finance/ and docs/finance-roadmap.md (reference).
- PII / financial data redaction in logs → guidelines/security/pii-and-logging.md (placeholder).
Sources to mine when writing this
- apps/pennylane_sync/: current implementation (clients, tasks, models for sync state).
- The reusable bundle (originally temp/pennylane_reusable/; verify current location). Extracted from the penlane_tst prototype; contains the dual-backend abstraction and ETL patterns.
- roadmap/done/meta-finance-captable-planning-session.md: context on what was extracted from penlane_tst and why.
- roadmap/done/finance-captable-planning-2026-04-12.md (or similar): earlier planning that set the dual-backend approach.
Starter hard rules to investigate
- Per-company tokens, never a shared "master" token. Tokens scoped to one Pennylane company at a time.
- Cursor pagination required for any list endpoint — never trust offset-based.
- Fresh URL on PDF download — Pennylane PDF URLs expire; refetch the URL right before download, don't cache.
- Incremental ETL with checkpoint — store last-sync timestamp per company; resume on partial failure.
- Hash-based integrity validation — every batch carries a hash; mismatch triggers re-fetch.
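As a starting point for the investigation, the token, pagination, and checkpoint rules above can be sketched together. This is a minimal illustration and not the code in apps/pennylane_sync/: `sync_company`, `fetch_page`, `store_batch`, the page shape (`items`, `next_cursor`), and the `updated_at` field are all hypothetical names, and the per-page checkpoint assumes the API returns items in ascending `updated_at` order, which still needs to be verified.

```python
from typing import Callable, Optional

# Hypothetical signature: (token, updated_since, cursor) -> page dict shaped
# like {"items": [...], "next_cursor": str or None}. Not the real API client.
FetchPage = Callable[[str, Optional[str], Optional[str]], dict]


def sync_company(
    company_id: str,
    tokens: dict,
    checkpoints: dict,
    fetch_page: FetchPage,
    store_batch: Callable[[str, list], None],
) -> int:
    """Sync one company incrementally; return the number of items stored."""
    token = tokens[company_id]  # KeyError on purpose: no shared fallback token
    since = checkpoints.get(company_id)  # None means no checkpoint yet
    cursor = None
    stored = 0
    while True:
        page = fetch_page(token, since, cursor)
        items = page["items"]
        if items:
            store_batch(company_id, items)
            stored += len(items)
            # Assumes items arrive ordered by ascending "updated_at"
            # (ISO-8601 strings), so the last item is the newest seen so far.
            checkpoints[company_id] = items[-1]["updated_at"]
        cursor = page.get("next_cursor")
        if cursor is None:
            return stored
```

Because the checkpoint is written only after `store_batch` returns, a failure mid-run leaves the last fully stored page as the resume point. In a real deployment `checkpoints` would be a persistent sync-state model and not an in-memory dict.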
Decision points to settle
- Redshift vs REST API v2: when to use which? Redshift for bulk historical, API v2 for incremental? Document the cutover.
- Checkpoint frequency: per-batch? per-page? per-company? Storage vs replay-cost tradeoff.
- Hash validation failure: retry-from-scratch vs alert + manual intervention?
- Token rotation policy: how often, who triggers it, how is the new token deployed without downtime?
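To make the hash-validation question concrete, here is one possible middle ground between retry-from-scratch and immediate manual intervention: re-fetch a bounded number of times, then raise so that alerting can take over. It is a sketch under assumptions, not a settled decision. SHA-256 over canonical JSON, the batch shape (`items`, `hash`), and the names `batch_digest`, `fetch_verified`, and `IntegrityError` are all hypothetical; where the reference hash actually comes from is still to be confirmed.

```python
import hashlib
import json
from typing import Callable


class IntegrityError(Exception):
    """Raised when a batch still fails its hash check after all attempts."""


def batch_digest(items: list) -> str:
    """SHA-256 over a canonical JSON encoding (sorted keys, no whitespace)."""
    payload = json.dumps(items, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def fetch_verified(fetch_batch: Callable[[], dict], max_attempts: int = 3) -> list:
    """Re-fetch a batch until its hash matches; escalate once attempts run out."""
    for _ in range(max_attempts):
        # Hypothetical batch shape: {"items": [...], "hash": "<hex digest>"}
        batch = fetch_batch()
        if batch_digest(batch["items"]) == batch["hash"]:
            return batch["items"]
    raise IntegrityError(f"hash mismatch after {max_attempts} attempts")
```

Sorting keys before hashing keeps the digest stable when the same record is serialized with a different key order. The caller decides what `IntegrityError` means operationally (alert, pause the company's sync, or restart from the last checkpoint).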
Known deviations to look for during writing
- Pennylane API calls without per-company token scoping.
- Cached PDF URLs that 404 on download (URL expired).
- Sync resuming from "beginning of time" because no checkpoint.
- Financial data logged in cleartext (PII / regulatory risk).
If found, file as roadmap/backlog/integrations-pennylane-drift-2026-MM.md.