Output documents — business intent
See repo source for current behavior.
This page describes why the pipeline emits each business-facing deliverable list and what stakeholders expect from them. For exact filenames, paths, and column definitions, see DATA_DICTIONARY.html.
| # | Deliverable | File(s) | Primary recipient |
|---|---|---|---|
| 1 | List of Inactive People | output_document_inactive_people.csv/.json | Internal ops (run directory; not emailed on N04) |
| 1a | SFMC marketing suppression import | {BusinessUnit}_NoLongerThere_{YYYY-MM-DD}.csv (e.g. Marketing_NoLongerThere_2026-05-27.csv) | Marketing team (notify_marketing_suppression N04) |
| 2 | List of Alternate Contacts | output_document_alternate_contacts.csv/.json | Sai Teja (IP4) |
| 3 | List of Inactive People at New Organization | output_document_inactive_new_org.csv/.json | Sai Teja (IP4) |
| 4 | List of Undeliverables | output_document_undeliverables.csv/.json | Run report consumers |
| 5 | CUPOLA-undetermined handoff (IP4) | output_document_inactive_no_cupola_match.csv/.json | Sai (notify_sai_action_items catalog N05/N06); global Max + Vish Cc |
| 6 | Email Update Requests (Changed Email) | output_document_email_update_requests.csv/.json | Run directory only; marketing team gets SFMC suppression files via N04 (*_NoLongerThere_*.csv) |
| 7 | Multipub Audit | output_document_multipub_audit.csv/.json | Run directory only; Tarun gets undetermined-sender (1.2) and upload Yes (3.1) |
| 8 | Human Review digest | output_document_human_review.csv/.json | Sai Teja (IP4); consolidated review queue |
| 9 | Impact report | impact_report.txt + .json | Client Services (attached to the run-reports email) |
Active-only policy. CUPOLA writes apply only on INACTIVE against an active CUPOLA row, or on gated EMAIL_UPDATE / TITLE_UPDATE against an active row. ACTIVE never auto-adds or auto-reactivates; OUT_OF_OFFICE never writes backends. Otherwise rows go to the Human Review digest (#8).
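The write gate described by the Active-only policy can be sketched as a single predicate. This is a minimal illustration, not the pipeline's code: the function name, the `row_is_active` encoding (`None` for "no CUPOLA row"), and the `gate_passed` flag are assumptions made for the example.

```python
from typing import Optional

# Determination labels taken from the policy text above.
GATED_UPDATES = {"EMAIL_UPDATE", "TITLE_UPDATE"}


def cupola_write_allowed(
    determination: str,
    row_is_active: Optional[bool],  # None means no CUPOLA row was matched (assumed encoding)
    gate_passed: bool = False,      # hypothetical flag for the update gate
) -> bool:
    """Return True only for the cases the policy lists as CUPOLA writes."""
    if row_is_active is not True:
        # Missing or inactive rows never receive an automatic write.
        return False
    if determination == "INACTIVE":
        return True
    if determination in GATED_UPDATES:
        return gate_passed
    # ACTIVE, OUT_OF_OFFICE, and anything else: no automatic write.
    return False
```

Any row for which this predicate is false would then be routed to one of the review queues rather than written.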
List of Inactive People
Purpose: Remove or follow up these emails across systems (Cupola, Hodor, Salesforce, SFMC, Multipub).
Fields (conceptual):
- Id
- AccountName
- Org Name
- Person Name
- Email – Auto Response Received From
- Status with Org
- Cupola organization / person identifiers and org-person links used for actions
- HODOR ProsNums and status action taken
- Salesforce Lead/Contact record IDs when the contact was resolved in Salesforce
- Multipub subscriber number, active subscriptions, and recent orders surfaced for sales
Systems and actions:
- Cupola: Org_Person_Ids linked to these emails marked inactive.
- Hodor: ProsNums linked to these emails marked “No Longer with Firm”.
- Salesforce: Associated Lead/Contact IDs recorded when lookup succeeds (for reconciliation and downstream updates).
- SFMC: Suppression for marketing is via N04 *_NoLongerThere_*.csv import files (see MARKETING_SUPPRESSION.html). Live REST auto-suppression is planned but not hooked up in production; internal column SFMC Suppression Added reflects that future path.
- Multipub: Active subscriptions and recent orders (e.g. past 12 months) associated with inactive emails/names, for sales follow-up.
Marketing email (N04): The marketing team mailbox does not receive this wide internal CSV. When inactive rows exist, the pipeline also writes {BusinessUnit}_NoLongerThere_{YYYY-MM-DD}.csv (see below) and attaches only those SFMC import files via notify_marketing_suppression.
SFMC marketing suppression import
Purpose: Give Marketing a file they can import into SFMC to suppress addresses the auto-responder marked inactive (deceased, retired, left company, no longer there).
Artifacts: One CSV per business-unit label under processing_reports/run_{timestamp}/, named {BusinessUnit}_NoLongerThere_{YYYY-MM-DD}.csv (run date from folder name). Today all rows map to Marketing_...; future inbox→business-unit mapping may produce additional files (e.g. Energy_...).
Columns: Email Address, Status (always Unsubscribed), Date Added (ISO date). Emails are deduped case-insensitively within each file. UTF-8 with BOM.
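A file with this shape could be produced roughly as follows. The sketch assumes the first-seen casing of each address is the one kept; the function names are hypothetical and the real writer may differ.

```python
import csv
import io
from datetime import date
from typing import Iterable


def suppression_filename(business_unit: str, run_date: date) -> str:
    """Build the {BusinessUnit}_NoLongerThere_{YYYY-MM-DD}.csv name."""
    return f"{business_unit}_NoLongerThere_{run_date.isoformat()}.csv"


def build_suppression_csv(emails: Iterable[str], run_date: date) -> bytes:
    """Return CSV bytes: three columns, case-insensitive dedupe, UTF-8 with BOM."""
    seen = set()
    buf = io.StringIO(newline="")
    writer = csv.writer(buf)
    writer.writerow(["Email Address", "Status", "Date Added"])
    for raw in emails:
        email = raw.strip()
        key = email.casefold()
        if not key or key in seen:
            continue  # skip blanks and case-insensitive duplicates
        seen.add(key)
        # Status is always "Unsubscribed"; Date Added is an ISO date.
        writer.writerow([email, "Unsubscribed", run_date.isoformat()])
    # "utf-8-sig" prepends the byte-order mark.
    return buf.getvalue().encode("utf-8-sig")
```

For example, `suppression_filename("Marketing", date(2026, 5, 27))` yields `Marketing_NoLongerThere_2026-05-27.csv`.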
Recipient: Marketing team via catalog N04 (NOTIFICATION_EMAIL_ERIN). See MARKETING_SUPPRESSION.html (process guide), NOTIFICATIONS_CATALOG — N04, and DATA_DICTIONARY — marketing suppression deliverable.
List of Alternate Contacts
Purpose: Consolidate alternate contacts provided in auto-replies and add or update across systems.
Fields (conceptual): Id, AccountName, Email Received From (source / lookup identity), Subject, Email Body (full source message text), Message ID (traceability), Org ID/Name, alternate name/title/email/phone/ext, Cupola Org Person ID / Person ID, HODOR ProsNum when present, comments, planned Cupola action (add vs update), HODOR library and import-template fields, flag when Sales should follow up in Multipub.
Systems and actions:
- Cupola: If alternate exists, update email/title/phone as needed; if not, add when org exists and policy allows. When adding, add_contact avoids duplicate people: same email already at another org reuses person_id and adds a new org-person link for the target org (see docs/connections/cupola.html).
- Hodor: Map fields to import template by AccountName/library (e.g. energy@thompson.com → ENGY).
- Multipub: Sales may request alternates for inactive people who still had active subscription or recent purchase.
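The duplicate-avoidance behavior described for add_contact can be illustrated with an in-memory stand-in. Everything here is hypothetical (the `FakeCupola` class, its fields, and case-insensitive email matching are assumptions); the real behavior is documented in docs/connections/cupola.html.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class FakeCupola:
    """Toy store: one person per email, many org-person links per person."""
    people: dict = field(default_factory=dict)  # normalized email -> person_id
    links: set = field(default_factory=set)     # (org_id, person_id) pairs
    _ids: count = field(default_factory=lambda: count(1))

    def add_contact(self, org_id: int, email: str) -> int:
        key = email.strip().lower()  # assumption: emails match case-insensitively
        person_id = self.people.get(key)
        if person_id is None:
            # First time this email is seen: create the person.
            person_id = next(self._ids)
            self.people[key] = person_id
        # Known or new, link the person to the target org.
        self.links.add((org_id, person_id))
        return person_id
```

Adding the same email under two orgs returns the same person_id both times and leaves two org-person links, rather than two people.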
List of Inactive People at New Organization
Purpose: Track where inactive people went and whether they should be in our systems at the new org.
Population: Rows are written when Determination.new_org_details is populated after LLM classification. category_mapper._normalize_new_org_details reads nested or flat new_org_* keys from the classifier JSON; _enrich_new_org_details falls back to final_sender_new_email (and optional org name from classifier reasoning) for Left Company, Retired, Deceased, and Changed Email when structured fields are missing.
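The nested-or-flat normalization and the fallback can be sketched as below. This is an illustration under assumptions, not the code in category_mapper: the nested key name `new_org_details` in the classifier JSON, the prefix stripping, the output key `email`, and the reading of "structured fields are missing" as "the normalized dict is empty" are all guesses made for the example.

```python
PREFIX = "new_org_"
NESTED_KEY = "new_org_details"  # assumed name of the nested object
FALLBACK_CATEGORIES = {"Left Company", "Retired", "Deceased", "Changed Email"}


def _strip_prefix(key: str) -> str:
    return key[len(PREFIX):] if key.startswith(PREFIX) else key


def normalize_new_org_details(payload: dict) -> dict:
    """Accept either a nested object or flat new_org_* keys; drop empty values."""
    nested = payload.get(NESTED_KEY)
    if isinstance(nested, dict):
        items = nested.items()
    else:
        items = (
            (k, v) for k, v in payload.items()
            if k.startswith(PREFIX) and k != NESTED_KEY
        )
    return {_strip_prefix(k): v for k, v in items if v not in (None, "")}


def enrich_new_org_details(details: dict, category: str, final_sender_new_email: str) -> dict:
    """Fall back to the sender's new email when nothing structured was found."""
    if details or category not in FALLBACK_CATEGORIES or not final_sender_new_email:
        return details
    return {"email": final_sender_new_email}
```

The optional org name taken from classifier reasoning is left out of the sketch because the source does not describe how it is extracted.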
Fields (conceptual): Id, Account Name, source email, person name, new org ID/name/title/email/phone, Org Person ID, Person ID, HODOR ProsNum when present, comments, Cupola action and whether the org exists / is AI-appropriate where applicable, HODOR library assignment notes, flag when Sales should follow up in Multipub.
Systems and actions:
- Cupola: Update or add per org/contact rules (including AI-appropriate org checks where applicable).
- Hodor: Library assignment may need review when someone changes industry/org.
- Multipub: Sales may request new contacts when prior subscription context applies.
CUPOLA-undetermined handoff (IP4)
Purpose: Give IP4 / operations a dedicated queue whenever an automated CUPOLA decision was not possible — no CUPOLA org-person row exists, or the matched row is inactive and the determination would otherwise auto-add / auto-reactivate. Under the Active-only policy this queue also captures ACTIVE determinations with no CUPOLA row and reactivation candidates (inactive CUPOLA rows on ACTIVE outcomes).
Artifacts: output_document_inactive_no_cupola_match.csv and .json under processing_reports/run_{timestamp}/. Rows are delivered inside Notifier.notify_sai_action_items (catalog N05/N06 — inactive contacts with no Cupola match). To: NOTIFICATION_EMAIL_SAI; global Max + Vish Cc.
Fields (conceptual): Account/inbox source, auto-response sender, subject, person/org hints from the email, determination label, Multipub deferral/review context when relevant, Hodor ProsNums and Salesforce IDs if lookup found those without Cupola, message id for traceability.
List of Undeliverables
Purpose: Consolidate undeliverable auto-response traffic (bounce-backs, invalid addresses) so teams can remove or correct records and align downstream systems.
Fields (conceptual): Id, AccountName, sender email, lookup email, Org Name, Person Name, subject, Cupola org/person and org-person identifiers when resolved, HODOR ProsNums, Multipub subscriber number when resolved, processing status, skip reason when applicable.
Systems and actions:
- Cupola: Org-person links recorded when a contact can be matched — supports cleanup or verification.
- Hodor: ProsNums when matched — supports alignment with master data.
- Multipub: Subscriber number when matched; Multipub Sales Request and subscription order summaries when validation shows relevant activity (catalog N02 may still fire while CUPOLA/Hodor/SF writes remain blocked for bounce-pending undeliverables).
- General: This list is the operational queue for addresses that could not be delivered to as intended. Parsed replacement contacts on bounce messages may also appear on output_document_alternate_contacts.
Email Update Requests (Changed Email)
Purpose: Per-row deliverable for every email mapped to the Changed Email category, used for Marketing / SFMC address corrections on changed-email determinations.
Artifacts: output_document_email_update_requests.csv under processing_reports/run_{timestamp}/. Changed-email rows are tracked on the master ledger; they are not attached to marketing email N04. Inactive-people SFMC suppression files (*_NoLongerThere_*.csv) are emailed separately via Notifier.notify_marketing_suppression (NOTIFICATION_EMAIL_ERIN → marketing-team-cbi@columbiabooks.com).
Fields (conceptual): Source message id, original sender, lookup email actually used, contact-found flag, contact systems hit, determination, processing status, org name, person name, the new email extracted from the auto-response, and the matched IDs from each backend system (Cupola org/org-person/person, Hodor ProsNum, Multipub SubsNum).
Systems and actions:
- SFMC / Marketing: Apply the corrected address; suppress the old one.
- Cupola / Hodor: When a backend match exists, the row carries the matching IDs so downstream operators can update the same record.
Multipub Audit (Tarun handoff)
Purpose: Cross-check audit for every INACTIVE determination that ran through the Multipub subscription gate. Lets Tarun retire (or compare against) the manual Multipub review queue.
Artifacts: output_document_multipub_audit.csv under processing_reports/run_{timestamp}/. Written for engineering review; not bulk-emailed to Tarun. Tarun receives notify_tarun_undetermined_sender_review (N03) and the upload Yes-path notify_multipub_subscriber_followup_from_upload (N09).
Tarun upload loopback: POST /multipub/upload with outcome=yes|no. Yes → Angel + Yogesh (Matt Cc). No → tarun_upload_audit.csv only. See docs/connections/multipub.html.
Fields (conceptual): Account/inbox source, email used for the Multipub lookup, person/org hints, determination label, matched Multipub subsnum, active / recently expired / single-issue subscription counts, the validation gate's flagged/deferred decision, the review reason text, a one-line summary, and the source message id.
Systems and actions:
- Multipub: Captures the validation snapshot used to decide whether the inactive workflow could proceed (deferred vs. clean inactive).
- Sales follow-up: Rows with Has Active Subscription = Yes or Inactive Action Deferred = Yes are the queue for Tarun's manual contact / cancellation review.
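Pulling the sales follow-up queue out of the audit CSV could look like the sketch below. The two column names come from the text above; treating the values case-insensitively and the helper names are assumptions.

```python
import csv
import io
from typing import Optional


def _is_yes(value: Optional[str]) -> bool:
    # Assumption: "Yes" may appear with varying case or padding.
    return (value or "").strip().lower() == "yes"


def sales_followup_rows(csv_text: str) -> list:
    """Keep rows flagged by either of the two audit columns."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        row for row in reader
        if _is_yes(row.get("Has Active Subscription"))
        or _is_yes(row.get("Inactive Action Deferred"))
    ]
```

Reading the file with `open(path, newline="", encoding="utf-8-sig")` and passing its contents to this function would be one way to apply it to output_document_multipub_audit.csv, assuming that file uses the same encoding as the suppression import.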
Human Review digest
Purpose: Single consolidated queue for every row the pipeline decided not to act on automatically. Captures reasons such as:
| Reason constant | When |
|---|---|
| HUMAN_REVIEW_REASON_ACTIVE_NEW_CONTACT | ACTIVE outcome but no CUPOLA row (no auto-add) |
| HUMAN_REVIEW_REASON_REACTIVATION_CANDIDATE | ACTIVE outcome but matched CUPOLA row is inactive (no auto-reactivate) |
| HUMAN_REVIEW_REASON_UPDATE_ON_INACTIVE | EMAIL_UPDATE / TITLE_UPDATE on inactive CUPOLA row |
| HUMAN_REVIEW_REASON_OUT_OF_OFFICE | OUT_OF_OFFICE determination; tracked as its own workflow, no system writes |
Artifacts: output_document_human_review.csv under processing_reports/run_{timestamp}/. Actionable rows are filtered into output_document_human_review_action_items.csv and emailed via notify_sai_action_items (catalog N05/N06). Venu's audit email includes reason-key legend + per-reason counts from the master CSV. Each row includes Email Body, reason, reason_detail, suggested_action, and traceability columns.
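The per-reason counts mentioned for the audit email can be derived from the reason column with a counter. This is a sketch only: the helper name and the "(missing)" bucket are assumptions, and the actual values stored in the reason column are not specified here.

```python
import csv
import io
from collections import Counter


def reason_counts(csv_text: str) -> Counter:
    """Count Human Review rows per value of the 'reason' column."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return Counter(
        (row.get("reason") or "").strip() or "(missing)"  # bucket blanks together
        for row in reader
    )
```

`reason_counts(text).most_common()` then gives the reasons ordered from most to least frequent, which is one convenient shape for a summary table.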
Impact report
Purpose: One-page headline summary of what the run actually changed. Driven by utils/impact_report.py, which reads the in-memory CupolaAuditLogger.entries list and derives three counts.
Artifacts: impact_report.txt (human readable) and impact_report.json (machine readable) next to the audit log. Impact metrics are inlined in the N08 run-audit email body and in processing_report.log; impact_report.txt is not attached to that email.
Fields:
| Field | Meaning |
|---|---|
| emails_processed | Total auto-response emails handled in the run |
| records_deactivated | CUPOLA rows flipped to inactive (status_change audit entries with requested_status=False and auto_applied=True) |
| records_added | New CUPOLA rows inserted (contact_addition audit entries with a non-empty contact_id; this count only ticks for REPLACEMENT when CUPOLA_AUTO_ADD_REPLACEMENTS=true) |
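The counting rules for the two CUPOLA fields can be sketched over a list of audit entries. This is not the code in utils/impact_report.py: representing entries as dicts, the `type` key that carries the entry kind, and passing `emails_processed` in from the caller are assumptions made for the illustration.

```python
from typing import Iterable, Mapping


def derive_impact(entries: Iterable[Mapping], emails_processed: int) -> dict:
    """Derive the headline counts from audit entries (assumed to be dicts)."""
    entries = list(entries)
    records_deactivated = sum(
        1 for e in entries
        if e.get("type") == "status_change"       # "type" key is hypothetical
        and e.get("requested_status") is False
        and e.get("auto_applied") is True
    )
    records_added = sum(
        1 for e in entries
        if e.get("type") == "contact_addition"
        and e.get("contact_id")                   # non-empty contact_id only
    )
    return {
        "emails_processed": emails_processed,
        "records_deactivated": records_deactivated,
        "records_added": records_added,
    }
```

Serializing the returned dict with `json.dumps` would give a payload in the spirit of impact_report.json, though the real file may carry additional keys.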