AutoResponder Determination System - Complete Reference#
This document is the single, authoritative reference for the AutoResponderProcess determination system. It covers the end-to-end flow from email extraction through LLM classification to action execution, including all business rules, system update logic, Multipub validation, and implementation details.
Table of Contents#
- Executive Summary
- Architecture Overview
- Complete Process Flow
- LLM Classification Pipeline
- Category-to-Determination Mapping
- Determination Types — Detailed Reference
- Multipub Validation (Tarun's Process)
- System Update Rules
- Run Outputs (Spreadsheets & Logs)
- Notifications
- Special Cases & Edge Cases
- Configuration
- Dry-Run Mode
- Project Structure
- Integration History
- Error Handling
- Performance Considerations
- Testing
- Implementation Status & Enhancements
- Known Limitations & Risks
- Team Responsibilities
Executive Summary#
The AutoResponderProcess is a production-ready system that processes auto-response emails from a central MSSQL repository. It uses an LLM-based classification pipeline (OpenAI or Ollama) to determine the status of each contact, then executes appropriate actions across four database systems: Cupola, Hodor, Multipub, and Salesforce.
Key Principles:
- Multipub and Salesforce (when the contact is related to Multipub) are treated as read-only for status and email updates — the marketing team handles those changes.
- Only Cupola and Hodor are updated directly (unless
HODOR_AUTOMATIC_UPDATES=false, in which case Hodor changes are emailed as a manual-action list instead of SQLUPDATEs). - Before marking contacts inactive, the system runs Tarun's Multipub validation process to cross-check against active/recent subscriptions.
- The system supports a comprehensive dry-run mode for safe testing.
Determination Types (6):
| Determination | Primary Action |
|---|---|
| UNKNOWN | Skip processing |
| INACTIVE | Mark inactive, check subscriptions, notify |
| REPLACEMENT | Mark original inactive, add replacement to Cupola, notify |
| EMAIL_UPDATE | Changed Email (no alt) — update email/title in Cupola/Hodor/SF (skip Multipub/SF-if-Multipub) |
| TITLE_UPDATE | Same handler as EMAIL_UPDATE; not emitted by map_category_to_determination() (reserved for tests / future use) |
| ACTIVE | Ensure active status in Cupola/Hodor (skip Multipub/SF-if-Multipub) |
Architecture Overview#
Original Email Classifier#
Location: Auto_responder_email/unified_email_classifier.py (reference only — functionality fully integrated into main application)
The original program was a FastAPI-based service with a /classify/csv/ endpoint. Its LLM classification and QA agent logic has been extracted and integrated into the main application's determination services.
Current Application#
Location: src/auto_responder/
Workflow: Extract → Deduplicate → LLM determination (Step 4a) → resolve lookup email → contact lookup (Step 4c; stub in dry-run) → execute actions (Step 4d) → notify
(main.py runs classification before contact lookup so UNKNOWN emails skip database lookups.)
Key Features:
- Extracts emails from central MSSQL repository
- Classifies each email with the LLM before contact lookup (so UNKNOWN skips DB queries)
- Looks up contacts across Cupola, Hodor, Multipub, and Salesforce (with graceful degradation)
- LLM-based determination with two-agent QA system (Classification Agent + QA Agent)
- Action engine that routes determinations to appropriate handlers
- Notification system for subscriptions, replacements, and Multipub validation flags
- Comprehensive dry-run mode
LLM Output Categories (9)#
- Not Relevant
- Flagged as Spam
- Undeliverable
- Retired
- Out of Office (with long-term leave flag)
- Deceased
- Left Company
- Changed Email
- Unknown
Application Determination Types (6)#
- ACTIVE — Person is active at organization
- INACTIVE — Person is no longer at organization
- REPLACEMENT — Person has been replaced by someone else
- EMAIL_UPDATE — Changed Email (no alternate) — only determination the mapper produces for “changed contact info” today;
_handle_title_update() - TITLE_UPDATE — Same handler as
EMAIL_UPDATEinActionEngine; the live LLM path does not set this value (mapper never returns it). Use for fixtures/tests or future wiring. - UNKNOWN — Cannot determine action type
Complete Process Flow#
PlantUML Diagram#
Source file: diagrams/business_flow_technical.puml. Pre-rendered exports sit alongside the source: business_flow_technical.svg and business_flow_technical.pdf. Re-render any of the .puml files with python docs/diagrams/_render_business_flow.py [stem] or with any PlantUML-compatible tool. The technical diagram calls out class / method entry points (EmailProcessor.deduplicate_emails(), AutoResponseDetermination.determine(), ContactLookup.lookup_with_enrichment(), ActionEngine.execute() and its _handle_* branches), the run artifacts written at each step, and the notify_* dispatches used for routing. The business-level, non-engineering view lives in diagrams/business_flow_simplified.puml — that is the file to share externally. The two diagrams are kept in sync (business_flow_simplified.puml / business_flow_technical.puml).
Processing Steps#
| Step | Description | Key Classes / notes |
|---|---|---|
| Step 1 | Extract emails from central MSSQL repository | EmailRepository |
| Step 2 | Deduplicate emails by Message ID | EmailProcessor |
| Step 3 | Initialize services (with graceful degradation) | ContactLookup, AutoResponseDetermination, ActionEngine, Notifier |
| Step 4a | LLM determination (classification, QA, map to Determination) | AutoResponseDetermination.determine() |
| — | If determination is UNKNOWN, skip (no contact lookup, no actions) | main.py |
| Step 4c | Contact lookup using resolved lookup_email (dry-run: stub Contact, no DB lookup) | ContactLookup.lookup_with_enrichment() |
| Step 4d | Execute actions based on determination | ActionEngine.execute() |
Data Flow Summary#
Email Repository (MSSQL - CBI_EmailExtraction.dbo.Emails)
↓
Step 1: Email Extraction (most recent N emails)
↓
Step 2: Deduplication (by Message ID)
↓
Step 3: Service Initialization
├─ ContactLookup (Cupola, Hodor, Multipub, Salesforce)
│ (graceful degradation — continues if some connectors fail)
├─ AutoResponseDetermination (LLM: OpenAI or Ollama)
├─ ActionEngine (with all DB connectors + Notifier)
└─ [Dry-Run: mock Cupola/Multipub/Salesforce + real HodorDatabase; optional mock email repo; ActionLogger + StageLogger]
↓
Step 4: Process Each Email
↓
┌───────────────────────────────────────────────────┐
│ Step 4a: DETERMINATION (LLM pipeline) │
│ │
│ 1. Classification Agent → Category + Contact Info │
│ 2. QA Agent → Verification + Corrections │
│ 3. Email Validation → Clean addresses │
│ 4. Normalize compound categories → single label │
│ 5. Category Mapper → Determination enum │
│ 6. Identify auto-response source email │
│ 7. Post-Processing → replacement email ≠ sender │
└───────────────────────────────────────────────────┘
↓
UNKNOWN? → skip (no lookup, no actions)
↓
┌───────────────────────────────────────────────────┐
│ Step 4c: Contact Lookup │
│ Search: Cupola → Hodor → Multipub → Salesforce │
│ (graceful degradation; dry-run: stub Contact) │
│ Returns: Unified Contact with records from all │
└───────────────────────────────────────────────────┘
↓
Step 4d: Action Engine Execution
├─→ UNKNOWN → Skip
│
├─→ INACTIVE → ┌──────────────────────────────────────────┐
│ │ MULTIPUB VALIDATION (Tarun's Process) │
│ │ 1. Match by email/user_id or name+org │
│ │ 2. Check active subscriptions │
│ │ 3. Check recently expired (12 mo) │
│ │ 4. Check single-issue purchases (12 mo) │
│ │ 5. Flag for review if subscriptions found│
│ └──────────────────────────────────────────┘
│ → Mark inactive (Cupola, Hodor only)
│ → Skip Multipub & Salesforce-if-Multipub
│ → Check subscriptions, notify if found
│
├─→ REPLACEMENT → Mark original inactive (same logic)
│ → Add replacement to Cupola
│ → Notify Sai Teja
│
├─→ EMAIL_UPDATE / TITLE_UPDATE → Update email/title in Cupola/Hodor/SF
│ → Skip Multipub & Salesforce-if-Multipub
│
└─→ ACTIVE → Ensure active (Cupola/Hodor)
→ Add to Cupola if not present
→ Skip Multipub & Salesforce-if-Multipub
↓
Notifications (if needed)
├─→ Multipub validation flag notification → Client Services
├─→ Person left with subscription notification → Max / Client Services
└─→ New replacement notification → Sai Teja
↓
Next Email / Summary ReportLLM Classification Pipeline#
Location in Code: src/auto_responder/main.py → Step 4a
Service: AutoResponseDetermination.determine()
Pipeline Steps#
Classification Agent (LLM)
- Analyzes email subject and body (truncated to 12,000 chars)
- Extracts contact information (
sender_new_email,alternate_contact,retired_personal_email) - Assigns category based on priority order
Returns initial classification result with confidence level
QA Agent (LLM)
- Verifies classification agent's output
- Checks category correctness and priority order
- Validates contact information extraction
- Applies corrections if needed
Returns QA-verified result
Email Address Validation
- Validates
sender_new_emailandretired_personal_emailusing regex Cleans and strips whitespace
- Validates
Normalize LLM category (
normalize_llm_category)- Models sometimes return compound strings (e.g.
"Out of Office, Changed Email"). - The pipeline collapses these to one canonical label using pattern priority in
category_mapper.py(highest business severity first: Undeliverable → Deceased → Retired → Left Company → Changed Email → Out of Office → …). That single label drives
map_category_to_determinationand the person/email status columns.
- Models sometimes return compound strings (e.g.
Category Mapping
- Maps normalized LLM category to
Determinationenum - Parses
alternate_contactintoReplacementInfo - If alternate_contact parsing fails for Retired/Deceased/Left Company → falls back to INACTIVE
Creates
DeterminationResult(does not setupdated_titleorTITLE_UPDATEtoday — see EMAIL_UPDATE / TITLE_UPDATE under Determination Types — Detailed Reference)
- Maps normalized LLM category to
Source email identification (
AutoResponseDeterminationStep 5 in code)identify_auto_response_source_email()setssource_emailon the result;main.pyprefers it over the raw sender when resolvinglookup_email.After Step 1b relay normalization (
normalize_relay_email, optional multi-label split when the body is available),reconcile_email_with_body()corrects mangled person addresses (e.g.colleen.bivona.domail@maricopa.edu→ signaturecolleen.bivona@domail.maricopa.edu) at ingest, during source identification (Step 3b before the cleaned-sender fallback), and again when resolvinglookup_email.
Post-Processing (alternate contacts vs sender)
- Removes replacement entries whose
replacement_emailequals the sender If no replacement entries remain and determination was REPLACEMENT:
- Final category Retired, Deceased, or Left Company → change determination to INACTIVE
- Otherwise (e.g. Changed Email with unusable alt) → change determination to UNKNOWN
- Removes replacement entries whose
Contact Information Fields#
| Field | Source | Usage | Updates Systems? |
|---|---|---|---|
sender_new_email | LLM extraction | Update email in systems | Yes |
alternate_contact | LLM extraction | Create replacement contact | Yes (Cupola) |
retired_personal_email | LLM extraction | Store in notes only | No |
is_long_term_leave | LLM flag | Log for review | No |
new_org_details (new_org_* keys) | LLM QA JSON (nested object or flat keys); mapper fallback via _enrich_new_org_details | output_document_inactive_new_org when person moved | No (IP4 research list) |
updated_title | Not populated by LLM pipeline | Would drive update_contact_title() if set | Yes (if ever wired) |
QA checklist (move / retire categories): Prefer a structured
new_org_detailsobject in QA JSON (new_org_name,new_org_email, etc.). When onlyfinal_sender_new_emailis present,category_mapper._enrich_new_org_detailsstill populatesnew_org_emailfor Left Company, Retired, Deceased, and Changed Email sooutput_document_inactive_new_orgis not empty.
Category-to-Determination Mapping#
| LLM Category | Person Status | Email Status | Determination | Special Handling |
|---|---|---|---|---|
| Not Relevant | Inconclusive | Unknown | UNKNOWN | Skip |
| Flagged as Spam | Inconclusive | Unknown | UNKNOWN | Skip |
| Undeliverable | Inconclusive | Invalid | INACTIVE | Mark email invalid |
| Retired (no alt) | Inactive | Invalid | INACTIVE | Check subscriptions |
| Retired (with alt) | Inactive | Invalid | REPLACEMENT | Mark original inactive + add replacement |
| Out of Office | Active | Valid | OUT_OF_OFFICE | First-class determination. Routed to Human Review digest (HUMAN_REVIEW_REASON_OUT_OF_OFFICE); no CUPOLA / Hodor / Multipub / SF writes — dedicated OOO extraction process will pick these up. |
| Deceased (no alt) | Inactive | Invalid | INACTIVE | Check subscriptions |
| Deceased (with alt) | Inactive | Invalid | REPLACEMENT | Mark original inactive + add replacement |
| Left Company (no alt) | Inactive | Invalid | INACTIVE | Check subscriptions |
| Left Company (with alt) | Inactive | Invalid | REPLACEMENT | Mark original inactive + add replacement |
| Changed Email (no alt) | Active | Invalid | EMAIL_UPDATE | Update email/title in Cupola/Hodor/SF (skip Multipub/SF-if-Multipub) |
| Changed Email (with alt) | Active | Invalid | REPLACEMENT | Add replacement |
| Unknown | Inconclusive | Unknown | UNKNOWN | Skip |
Compound categories: If the LLM returns multiple labels in one string,
normalize_llm_categorypicks a single row using the severity order above — not necessarily the first phrase in the string.
Fallback: If
alternate_contactis present for Retired/Deceased/Left Company but parsing fails, the determination falls back to INACTIVE.
Determination Types — Detailed Reference#
1. UNKNOWN#
Source Categories: Not Relevant, Flagged as Spam, Unknown
Business Logic: These emails are not actionable auto-responses. No database updates or notifications are required.
Process:
┌─────────────────────────────────┐
│ Skip Processing │
│ - Log warning │
│ - No database updates │
│ - Continue to next email │
└─────────────────────────────────┘Spreadsheet landing (every UNKNOWN row):
- Processing Report (master) —
processing_report_master.csv/.json— every email lands here withstatus=skipped_unknown. - Human Review —
output_document_human_review.csv/.json— a person picks the real outcome later. See Run Outputs.
Code Location: action_engine.py → execute() → early return for UNKNOWN; main.py also calls OutputDocumentCollector.add_human_review() before the skip so UNKNOWN never "disappears" from the per-run ledger.
2. INACTIVE#
Source Categories: Undeliverable, Retired (without alt), Deceased (without alt), Left Company (without alt)
Business Rules:
- Run Multipub validation before marking inactive (Tarun's process)
- Mark contact inactive in Cupola and Hodor
- Skip Multipub (subscription data only, marketing handles contact changes)
- Skip Salesforce if contact is related to a Multipub subscriber
- Check for active subscriptions across all systems
- Notify Max/Client Services if active subscription exists
- Store
retired_personal_emailin notes (reference only, not updated in any system)
Process:
┌─────────────────────────────────────────────────────────────────────────────┐
│ STEP 1: MULTIPUB VALIDATION (Tarun's Cross-Check Process) │
│ Before marking inactive, validate against Multipub: │
│ │
│ a. Match contact by: │
│ • Email address / User ID (PRIMARY) │
│ • First Name + Last Name + Organization (SECONDARY - with caution) │
│ │
│ b. Check subscription scenarios: │
│ • Active subscriptions (multi-issue, eiss > today) │
│ • Subscriptions expired within last 12 months │
│ • Single-issue purchases within last 12 months │
│ │
│ c. If relevant subscriptions found: │
│ → Flag for review │
│ → notify_multipub_subscriber_followup_batch() to Angel + Yogesh │
│ → Proceed with marking inactive (after notification) │
│ │
├─────────────────────────────────────────────────────────────────────────────┤
│ STEP 2: Update IPv4 Team Systems (Cupola and Hodor) │
│ ├─ Cupola: update_contact_status(False) + check_active_subscription() │
│ └─ Hodor: update_contact_status(False) + check_active_subscription() │
│ │
├─────────────────────────────────────────────────────────────────────────────┤
│ STEP 3: ACE Team Systems (Read-Only Approach) │
│ ├─ Multipub: **SKIP status update** │
│ │ (Multipub is subscription data only; │
│ │ marketing team handles contact changes) │
│ │ Still check_active_subscription() for consistency │
│ │ │
│ └─ Salesforce: │
│ ├─ If contact IS in Multipub → **SKIP** Contact update │
│ │ (Contact is related to Multipub subscriber) │
│ └─ If contact NOT in Multipub: │
│ → update_contact_status(False) │
│ → check_active_subscription() │
│ │
├─────────────────────────────────────────────────────────────────────────────┤
│ STEP 4: Notifications │
│ └─ If Active Subscription Found in any system: │
│ → notify_multipub_subscriber_followup_batch (catalog N02) │
│ → To: Max / Client Services │
│ │
├─────────────────────────────────────────────────────────────────────────────┤
│ STEP 5: Store Additional Information │
│ └─ Store retired_personal_email in notes (if provided, for reference) │
│ (NOT updated in any system — logged only) │
└─────────────────────────────────────────────────────────────────────────────┘Code Location: action_engine.py → _handle_inactive() and _validate_multipub_before_inactive()
3. REPLACEMENT#
Source Categories:
- Changed Email (with
alternate_contactpresent) - Retired (with
alternate_contactpresent) - Deceased (with
alternate_contactpresent) - Left Company (with
alternate_contactpresent)
Any INACTIVE category (Retired, Deceased, Left Company) with an
alternate_contactpresent is automatically mapped to REPLACEMENT. If thealternate_contactstring cannot be parsed, it falls back to INACTIVE.
Business Rules:
- Mark original person inactive (full INACTIVE flow including Multipub validation)
- Extract replacement contact info (name, email, title) from
alternate_contact - Add replacement person to Cupola only (IPv4 team system)
- Notify Sai Teja about the new replacement
Process:
┌─────────────────────────────────────────────────────┐
│ 1. Validate replacement_info │
│ └─ If None → Log warning, return False │
│ │
│ 2. Mark Original Person INACTIVE │
│ └─ Calls _handle_inactive() for original │
│ (Includes full Multipub validation process) │
│ │
│ 3. Use replacement_info from DeterminationResult │
│ └─ Contains: name, email, title │
│ Parsed from alternate_contact string │
│ Format: "Name: John, Email: john@abc.com" │
│ │
│ 4. Add Replacement to Cupola │
│ └─ check_contact_exists → add_contact() or │
│ reuse existing org_person_id (see connections/cupola.html) │
│ (IPv4 team system only) │
│ │
│ 5. Notify Sai Teja │
│ └─ notify_sai_action_items() (catalog N05/N06) │
│ About: new replacement person │
└─────────────────────────────────────────────────────┘Code Location: action_engine.py → _handle_replacement()
4. EMAIL_UPDATE / TITLE_UPDATE#
Source Categories: Changed Email (without alternate_contact, and not normalized to another label) maps to EMAIL_UPDATE in category_mapper.py.
TITLE_UPDATE: The enum and ActionEngine branch exist and share _handle_title_update() with EMAIL_UPDATE, but map_category_to_determination() never returns TITLE_UPDATE, and create_determination_result() does not set updated_title. Production runs therefore only reach this handler as EMAIL_UPDATE (with sender_new_email from the LLM). Title-only updates via updated_title would require future LLM/schema wiring or manual construction of DeterminationResult.
Business Rules:
- Update email in Cupola, Hodor, and Salesforce (with the same Multipub / Salesforce-if-Multipub skip rules as below)
- Skip Multipub (marketing handles contact info changes)
- Skip Salesforce if contact is related to a Multipub subscriber
- Log email mapping (old → new)
- When
updated_titleis present, callupdate_contact_title()per system with the same skip rules as email
Process:
┌─────────────────────────────────────────────────────────────┐
│ 1. Check for sender_new_email │
│ │
│ 2. If sender_new_email Present: │
│ └─ Update Email in systems (with skip logic): │
│ ├─ Multipub → **SKIP** │
│ │ (Marketing team handles contact info changes) │
│ ├─ Salesforce (if in Multipub) → **SKIP** │
│ │ (Contact is related to Multipub subscriber) │
│ ├─ Cupola → update_contact_email() │
│ ├─ Hodor → update_contact_email() │
│ └─ Salesforce (if NOT in Multipub) │
│ → update_contact_email() │
│ │
│ 3. Log Email Mapping │
│ └─ old_email → new_email │
│ │
│ 4. If updated_title Present: │
│ └─ Same skip rules as email; then update_contact_title() │
│ on Cupola / Hodor / Salesforce where applicable │
└─────────────────────────────────────────────────────────────┘Code Location: action_engine.py → _handle_title_update()
5. ACTIVE#
Source categories: Left without a clear OOO signal — e.g. emails that still resolve to "active person, no change needed" after QA. OOO is now tracked separately as OUT_OF_OFFICE (see §6).
Business Rules (Active-only policy):
- Never auto-add new CUPOLA rows. If the lookup finds no CUPOLA record, the row is written to
output_document_inactive_no_cupola_match.*and a Human Review row is added with reasonHUMAN_REVIEW_REASON_ACTIVE_NEW_CONTACT— IP4 verifies + creates a new entry manually. - Never auto-reactivate inactive CUPOLA rows. Inactive rows for an ACTIVE outcome are routed to Human Review with reason
HUMAN_REVIEW_REASON_REACTIVATION_CANDIDATE; IP4 verifies and creates a new CUPOLA entry (do not reactivate the old one). - Already-active CUPOLA rows: no change required (logged as "Already active").
- Skip Multipub (subscription data only).
- Skip Salesforce if contact is related to a Multipub subscriber.
Process:
┌─────────────────────────────────────────────────────────────┐
│ 1. Lookup CUPOLA rows for contact │
│ ├─ No rows → add_inactive_no_cupola_match │
│ │ + Human Review │
│ │ (ACTIVE_NEW_CONTACT) │
│ └─ Rows found → for each row: │
│ ├─ active → "Already active" (no-op) │
│ └─ inactive → Human Review │
│ (REACTIVATION_CANDIDATE) │
│ │
│ 2. All CUPOLA state changes require human verification; │
│ no update_contact_status / add_contact calls fire on │
│ this path. │
└─────────────────────────────────────────────────────────────┘Code Location: action_engine.py → _handle_active()
6. OUT_OF_OFFICE#
Source Categories: Out of Office, Out of Office (Long-term Leave: is_long_term_leave=True)
Business Rules: OOO replies signal nothing actionable for CUPOLA / Hodor / Multipub / Salesforce on the current run. They are surfaced for the planned dedicated out-of-office extraction process.
Process:
┌─────────────────────────────────────────────────────────────┐
│ 1. Add row to Human Review digest │
│ reason = HUMAN_REVIEW_REASON_OUT_OF_OFFICE │
│ │
│ 2. NO writes to any system. NO status updates. │
└─────────────────────────────────────────────────────────────┘Code Location: action_engine.py → _handle_out_of_office()
Multipub Validation (Tarun's Process)#
Purpose#
Ensure data accuracy across systems by verifying that inactive contacts from the Autoresponse process do not have active or recently relevant subscriptions in Multipub before marking them as inactive.
Applies to: INACTIVE determination (and the INACTIVE step within REPLACEMENT). Runs before marking inactive.
Matching Criteria#
| Priority | Method | Description |
|---|---|---|
| PRIMARY | Email / User ID | Direct match on email address or user ID |
| SECONDARY | Name + Org | First Name + Last Name + Organization (use with caution, only if exactly one match found) |
Subscription Scenarios Checked#
| Scenario | Criteria | SQL Details |
|---|---|---|
| Active Subscriptions | Multi-issue orders with future end date | o.stat IN ('01','02','03','04','05','30') AND o.eiss > GETDATE() AND p.SM='M' |
| Recently Expired | Expired within last 12 months | o.eiss BETWEEN DATEADD(YEAR,-1,GETDATE()) AND GETDATE() AND p.SM='M' |
| Single-Issue Purchases | Purchased within last 12 months | o.odte > DATEADD(YEAR,-1,GETDATE()) AND p.SM='S' |
Active order queries include both direct orders and site license orders (via pub.site, pub.SITEORD, pub.SITEUSER joins, excluding IP-authenticated site licenses).
Important: The focus is on subscribers (contacts), not order-level criteria. Factors such as paid/unpaid status, comp, trial, or site license orders are NOT used to exclude contacts from the review.
Validation Result Actions#
- No subscriptions found → Proceed with inactive marking
- Relevant subscriptions found → Flag for review, notify Client Services, then proceed with inactive marking (with audit trail)
Validation Flow#
Contact marked as INACTIVE by LLM
↓
_validate_multipub_before_inactive()
↓
┌─────────────────────────────────────────────────────────┐
│ 1. Find subscriber in Multipub │
│ a. Try email match │
│ b. Try user_id match │
│ c. Try name+org match (if single result only) │
│ │
│ 2. Check subscription scenarios: │
│ • _get_active_orders() │
│ • _get_recently_expired_orders() │
│ • _get_recent_single_issue_orders() │
│ │
│ 3. If any relevant subscriptions found: │
│ • Set should_flag_for_review = True │
│ • Build review_reason summary │
│ │
│ 4. Return MultipubValidationResult │
└─────────────────────────────────────────────────────────┘
↓
If flagged for review:
→ notify_multipub_subscriber_followup_batch()
→ Log detailed findings
→ Continue with inactive marking (with notification)
↓
Proceed with _handle_inactive()SQL Queries#
Active Orders (Multi-issue + Site License):
-- Direct orders
SELECT o.onum, o.subsnum, o.prdcde, p.prdtyp, p.gprdtyp,
o.stat, o.odte, o.eiss, o.pamt, o.trm, ...
FROM pub.ord o
INNER JOIN pub.prd p ON p.prdcde = o.prdcde
WHERE o.subsnum = ?
AND o.stat IN ('01','02','03','04','05','30')
AND o.eiss > GETDATE()
AND p.SM = 'M'
-- Site license orders (additional complexity)
-- Includes JOIN to pub.site, pub.SITEORD, pub.SITEUSER
-- Excludes IP-based site licensesExpired Orders (Last 12 Months):
SELECT o.onum, o.subsnum, ...
FROM pub.ord o
INNER JOIN pub.prd p ON p.prdcde = o.prdcde
INNER JOIN pub.sc sc ON sc.prdcde = o.prdcde AND sc.eiss = o.eiss
WHERE o.subsnum = ?
AND o.stat NOT IN ('01','02','03','04','05','30')
AND o.eiss BETWEEN DATEADD(YEAR,-1,GETDATE()) AND GETDATE()
AND p.SM = 'M'Single-Issue Orders (Last 12 Months):
SELECT o.onum, o.subsnum, ...
FROM pub.ord o
INNER JOIN pub.prd p ON p.prdcde = o.prdcde
WHERE o.subsnum = ?
AND o.odte > DATEADD(YEAR,-1,GETDATE())
AND p.SM = 'S'Notification Format#
When a contact is flagged for review, the notification includes:
Subject: [REVIEW REQUIRED] Multipub Validation Flag: John Doe
Contact: John Doe (john.doe@example.com)
Multipub Subscriber: 82025008
Match Method: email
=== SUBSCRIPTION FINDINGS ===
[ACTIVE SUBSCRIPTIONS] (2 found):
- Order 12345: Product Name ABC (expires 2026-12-31)
- Order 12346: Product Name XYZ (expires 2027-06-30) [Site License]
[RECENTLY EXPIRED] (1 found):
- Order 12340: Product Name DEF (expired 2025-08-15)
[RECENT SINGLE-ISSUE PURCHASES] (3 found):
- Order 12350: Single Issue Product (purchased 2025-10-01)
...
=== ACTION REQUIRED ===
Please review the above subscription details and verify that marking
this contact as inactive is appropriate.
Review Reason: 2 active subscription(s); 1 recently expired; 3 recent single-issueCode Location: multipub.py → validate_for_inactive(), action_engine.py → _validate_multipub_before_inactive()
System Update Rules#
| System | INACTIVE | REPLACEMENT | EMAIL_UPDATE / TITLE_UPDATE | ACTIVE |
|---|---|---|---|---|
| Cupola | ✅ Mark inactive | ✅ Mark original inactive + Add replacement | ✅ Update email/title | ✅ Mark active (or add new) |
| Hodor | ✅ Mark inactive (Prospect Import row plus AutoRenewal SQL by email in live mode) — or email list only if HODOR_AUTOMATIC_UPDATES=false | ✅ Mark original inactive (via _handle_inactive) — or email list | ✅ Update email/title — or email list | ✅ Mark active if inactive — or email list |
| Multipub | ❌ SKIP (subscription data only) | ❌ SKIP (via _handle_inactive) | ❌ SKIP (marketing handles) | ❌ SKIP (subscription data only) |
| Salesforce | ⚠️ Only if NOT in Multipub | ⚠️ Only if NOT in Multipub (via _handle_inactive) | ⚠️ Only if NOT in Multipub | ⚠️ Only if NOT in Multipub |
Reasoning: Multipub is a subscription management system — the marketing team is responsible for updating contact information there. Salesforce Contacts linked to Multipub subscribers are managed through the Multipub→Salesforce sync, so direct updates would create conflicts.
Database Methods#
Cupola (IPv4 Team)
update_contact_status(unique_id, is_active)update_contact_email(unique_id, new_email)check_active_subscription(unique_id)add_contact(email, name, title, organization=..., organization_id=...)— resolves target org, then same-org idempotent return, cross-orgperson_idreuse + newlink_org_person, or new person+link; seedocs/connections/cupola.html
Hodor (IPv4 Team)
update_contact_status(unique_id, is_active)— unused for INACTIVE determinations; INACTIVE flows instead combine Prospect Import rows with the AutoRenewal email-based update below.update_contact_email(unique_id, new_email)— skipped whenHODOR_AUTOMATIC_UPDATES=false(pending actions emailed to Venu via N01,NOTIFICATION_EMAIL_VENU)update_contact_title(unique_id, new_title)— samemark_inactive_by_email_for_autorenewal(email)— live-mode INACTIVE helper that executesIsNoLongerWithFirm = 1, LastUpdatedDate = getdate(), UpdatedReason = 'Updated based on AutoRenewal Program'for the sender email whenHODOR_AUTOMATIC_UPDATES=truecheck_active_subscription(unique_id)
Multipub (ACE Team)
update_contact_status(unique_id, is_active)update_contact_email(unique_id, new_email)check_active_subscription(unique_id)— Simple checkvalidate_for_inactive(...)— Full validation (Tarun's process)lookup_by_email(email)— Primary lookuplookup_by_user_id(user_id)— Primary lookuplookup_by_name_and_org(first, last, org)— Secondary lookup (with caution)
Salesforce (ACE Team)
update_contact_status(unique_id, is_active)update_contact_email(unique_id, new_email)check_active_subscription(unique_id)
Run Outputs (Spreadsheets & Logs)#
Every run writes its artifacts to a fresh processing_reports/run_{timestamp}/ folder. The invariant is simple: every email gets a row on the Processing Report (master); uncertain rows also get a row on Human Review; automated writes produce an audit bundle for Venu.
Per-email ledgers (always written)#
| File | Rows | Purpose |
|---|---|---|
processing_report_master.csv / .json | Every email | The run's ledger of record. One row per email with its final determination, outcome, systems touched, and errors. |
processing_report_ip4.csv | IP4-actionable categories only | Filtered copy for Sai's team (Out of Office, Retired, Deceased, Left Company, Changed Email). Uses the IP4 column subset and order (see ReportGenerator._CSV_COLUMNS_IP4); the master CSV remains the full ledger. |
processing_report.log | Every email | Human-readable per-email log. |
category_summary_report.csv / .json | Every email | Coarse category view (Undeliverable / Left Company / OOO / Changed Email / Other) for reporting. |
classifier_output/classifier_output.csv / .json | Every classified email | Raw LLM + QA payloads. |
batch_report.html / batch_report.pptx | Run summary | Dashboard + deck. |
Outcome-specific spreadsheets (only when rows exist)#
| File | Populated when | Consumer |
|---|---|---|
output_document_human_review.csv / .json | UNKNOWN, no-match-actionable, bounce awaiting the Max + Vish rule, or replacement parse fallback | Sai action items + Venu audit metadata (notify_sai_action_items / notify_venu_cupola_audit_files) |
output_document_inactive_people.csv / .json | INACTIVE actioned (including cross-verification rows where the person was already inactive) | Internal ops traceability; Sai action items when applicable. Marketing receives separate *_NoLongerThere_*.csv via N04 (not this file). |
{BusinessUnit}_NoLongerThere_{YYYY-MM-DD}.csv | Same population as inactive people (deduped emails per business-unit label) | Marketing team (notify_marketing_suppression, N04) — SFMC suppression import |
output_document_alternate_contacts.csv / .json | REPLACEMENT or ACTIVE-with-replacement | Sai Teja (notify_sai_action_items (catalog N05/N06) — output_document_alternate_contacts.csv) |
output_document_inactive_new_org.csv / .json | EMAIL_UPDATE where the person also moved organisations | Sai Teja (notify_sai_action_items SECTION 3) |
output_document_inactive_no_cupola_match.csv / .json | INACTIVE-style outcome with no CUPOLA match | Sai Teja (notify_sai_action_items SECTION 2) |
output_document_domain_matches.csv / .json | Domain fallback found CUPOLA org context but multiple candidate people | Sai Teja / IP4 manual selection queue |
output_document_shared_email_reviews.csv / .json | Exact email maps to multiple CUPOLA people (shared/designation mailbox) | Sai Teja / IP4 manual selection queue |
output_document_undeliverables.csv / .json | Bounce / Undeliverable | Run directory; Client Services notify_multipub_subscriber_followup_batch (N02) when Multipub evidence applies — not emailed on N04 |
output_document_multipub_audit.csv / .json | Every INACTIVE row that hit the Multipub gate | Run directory only (not bulk-emailed); Tarun gets 1.2 / upload Yes-path |
output_document_email_update_requests.csv / .json | Changed-Email rows | Run directory only (not emailed to marketing) |
Venu audit bundle (always written; attached to Venu's end-of-run email)#
| File | Purpose |
|---|---|
cupola_audit_log.csv / .json | Every CUPOLA status change + contact addition recorded this run. Always written, even when empty, so Venu gets a receipt every run. |
cupola_audit_log_rollback_plan.csv | SQL to revert each auto-applied CUPOLA status change. Only written when auto-apply actually ran. |
hodor_prospect_import.csv | Written when HODOR_AUTOMATIC_UPDATES=false; emailed via notify_hodor_prospect_import (N01 To Venu). |
Which outputs each outcome lands on#
| Outcome | Master | IP4 copy | Human Review | Outcome-specific file(s) | Venu audit bundle |
|---|---|---|---|---|---|
| UNKNOWN | ✅ | ⬜ | ✅ | — | ✅ (no writes happened) |
| No match (actionable, non-inactive) | ✅ | ⬜ | ✅ | — | ✅ |
| No match (inactive-style) | ✅ | ✅ | ⬜ | Inactive — no CUPOLA match | ✅ |
| ACTIVE | ✅ | ✅ | ⬜ | (long-leave → Client Services only) | ✅ |
| EMAIL_UPDATE / TITLE_UPDATE | ✅ | ✅ | ⬜ | Email Update Requests; Inactive at New Org (if moved) | ✅ |
| INACTIVE | ✅ | ✅ | ⬜ | Inactive People; Multipub Audit (when gated) | ✅ |
| REPLACEMENT | ✅ | ✅ | ☑ parse fallback only | Inactive People + Alternate Contacts (± Multipub Audit) | ✅ |
| Bounce | ✅ | ⬜ | ✅ until Max + Vish rule lands | Undeliverables | ✅ |
Notifications#
| Notification | Trigger | Recipient | Data / Attachments |
|---|---|---|---|
notify_multipub_subscriber_followup_batch() | INACTIVE / left-org with determined sender + Multipub evidence (active / 12mo sub / 12mo purchase) | Angel + Yogesh (Matt Cc) | multipub_subscriber_followup.csv |
notify_tarun_undetermined_sender_review() | Sender email could not be determined for Multipub routing | Tarun | tarun_undetermined_senders.csv |
notify_multipub_subscriber_followup_from_upload() | Tarun upload API outcome=yes | Angel + Yogesh (Matt Cc; Tarun Cc) | Uploaded CSV |
notify_hodor_prospect_import() | HODOR_AUTOMATIC_UPDATES=false and queued prospect-import rows | Hodor (Cc Sai) | hodor_prospect_import.csv |
notify_marketing_suppression() | End of run when *_NoLongerThere_*.csv deliverable(s) exist | Marketing team (N04) | {BusinessUnit}_NoLongerThere_{YYYY-MM-DD}.csv (SFMC import columns) |
notify_sai_action_items() | Replacements, no-Cupola-match, inactive-new-org, actionable HR rows | Sai | Section CSVs + output_document_human_review_action_items.csv |
notify_venu_cupola_audit_files() | End of every run | Venu (Cc Sai, Max) | cupola_audit_log.csv, rollback plan, HR CSV when present |
notify_run_audit_for_ip4() | End of every run | Sai | Processing log, master + IP4 CSV, category summary, batch HTML, impact.txt; backlog snapshot in body when available |
Live email routing: see NOTIFICATIONS_CATALOG.html.
Batching: Most operational emails above are queued during processing and sent once per category at end of run via ActionEngine.flush_batch_notifications() (dry-run: logged only). Cupola audit attachments to Venu are separate. See NOTIFICATIONS_REFERENCE.html for recipient env vars and the full model.
Special Cases & Edge Cases#
| Scenario | Current Behavior | Status |
|---|---|---|
| Changed Email + Alternate Contact | Maps to REPLACEMENT | ✅ Implemented |
| Out of Office + Alternate Contact | Maps to OUT_OF_OFFICE; Human Review digest only, no system writes; alternates captured by _collect_alternate_contacts where applicable | ✅ Implemented |
| Deceased + Replacement Contact | Maps to REPLACEMENT | ✅ Implemented |
| Left Company + Replacement Contact | Maps to REPLACEMENT | ✅ Implemented |
| Retired + Replacement Contact | Maps to REPLACEMENT | ✅ Implemented |
| Replacement email matches sender | Clears replacement_info, changes to UNKNOWN | ✅ Implemented |
| Alternate contact parsing fails | Falls back to INACTIVE | ✅ Implemented |
Case: Out of Office with Replacement Contact#
"I'm out of office, contact John at john@company.com"
- Category: "Out of Office" — maps to OUT_OF_OFFICE, regardless of alternate_contact.
- No system state changes. Row is added to the Human Review digest with reason
HUMAN_REVIEW_REASON_OUT_OF_OFFICEso the dedicated OOO process can pick it up. - Any parsed alternate contact is still emitted via
output_document_alternate_contacts.*for the IP4 reviewer. - Enhancement needed: Decide if temporary replacements should be added to systems
Case: Retired with Personal Email#
"I'm retiring, keep in touch at personal@gmail.com"
- Category: "Retired" → INACTIVE
retired_personal_emailstored in notes for reference- Personal email is not updated in any system
Case: Undeliverable Email#
Bounce message indicating email is invalid.
- Category: "Undeliverable" → INACTIVE
- Contact marked inactive, subscriptions checked
- Enhancement needed: Add method to mark email as invalid in EmailRepository
Configuration#
Environment Variables#
# Required for LLM classification
OPENAI_API_KEY=your_openai_api_key_here
# Optional (defaults shown)
CLASSIFIER_BACKEND=openai # or "ollama"
OLLAMA_MODEL=mistral-nemo:12b
OLLAMA_URL=http://localhost:11434/api/chat
MAX_CONCURRENCY=10
MOCK_ALL_SERVICES_FOR_DEV=false
WRITE_OPERATIONS_ENABLED=trueLLM Configuration#
- OpenAI Model: gpt-4o-mini (production)
- Temperature: 0.0 (deterministic)
- Fallback: Ollama support for local/development use
- Backend: Configured via
CLASSIFIER_BACKENDsetting
Operating Modes#
The application supports three operating modes:
Dev mock-all (MOCK_ALL_SERVICES_FOR_DEV=true)#
Cupola, Multipub, and Salesforce use mock connectors; SFMC uses a mock client. HodorDatabase is the real SQL connector (graceful-unavailable if HODOR_* / HODOR_CONNECTION_STRING is not set — not a MockHodor class). The email repository is mock by default; set dry_run_use_real_email_repository=true to read the real inbox. Step 4c contact lookup is skipped (main.py uses a minimal stub Contact so downstream steps can run). LLM classification is real. Operations are logged to dry_run_output.log where applicable.
Simulation — no writes (WRITE_OPERATIONS_ENABLED=false)#
Live database reads; writes are intercepted via ReadOnlyDatabaseWrapper for Cupola, Multipub, Salesforce, and Hodor when each connector is live and available. SFMC uses a mock client. Salesforce attempts a real connection with mock fallback. Write operations are logged to dry_run_output.log.
Live Mode (default)#
Real database connections and writes (with mock fallbacks where the code substitutes mocks after failed connectivity checks).
Mode Comparison#
| Component | Live Mode | Simulation (no writes) | Dev mock-all |
|---|---|---|---|
| Email Extraction (Step 1) | Real MSSQL repository | Real MSSQL repository | Mock by default; real if dry_run_use_real_email_repository=true |
| Cupola / Multipub / Salesforce | Real reads and writes | Real reads; writes intercepted (logged) | Mock connectors |
| Hodor | Real SQL when configured | Real reads; writes intercepted when wrapped | Real HodorDatabase when configured (not mocked); Step 4c contact lookup skipped — main.py uses a stub Contact |
| SFMC | Live or mock fallback | Mock client | Mock |
| LLM Classification | Real: OpenAI / Ollama | Real: OpenAI / Ollama | Real: OpenAI / Ollama |
| Notifications | Sent via SMTP | Logged only (no emails sent) | Logged only (no emails sent) |
| Action Logging | Standard logging | ActionLogger → dry_run_output.log | ActionLogger → dry_run_output.log |
| Stage Logging | StageLogger → stage_execution.log | StageLogger → stage_execution.log | StageLogger → stage_execution.log |
Project Structure#
Core Application#
| File | Description |
|---|---|
src/auto_responder/main.py | Main entry point — orchestrates all steps |
src/auto_responder/config.py | Application settings (Pydantic BaseSettings, loaded from .env) |
Models#
| File | Description |
|---|---|
src/auto_responder/models/email.py | AutoResponseEmail — email data model |
src/auto_responder/models/contact.py | Contact, ContactRecord — unified contact model |
src/auto_responder/models/determination.py | Determination enum, DeterminationResult, ReplacementInfo |
src/auto_responder/models/output_document.py | InactivePersonRecord, AlternateContactRecord, InactiveNewOrgRecord, InactiveNoCupolaMatchRecord, UndeliverableRecord, MultipubAuditRecord, HumanReviewRecord — output document models |
src/auto_responder/models/subscription.py | MultipubValidationResult, MultipubOrder, MultipubSubscriber |
Processors#
| File | Description |
|---|---|
src/auto_responder/processors/action_engine.py | Action execution engine — routes determinations to handlers |
src/auto_responder/processors/contact_lookup.py | Unified contact lookup across all databases |
src/auto_responder/processors/email_processor.py | Email deduplication and filtering |
Services — Determination#
| File | Description |
|---|---|
src/auto_responder/services/determination/autoresponsedetermination.py | Main determination service (LLM pipeline orchestrator) |
src/auto_responder/services/determination/llm_classifier.py | LLM classification and QA agents (OpenAI / Ollama) |
src/auto_responder/services/determination/category_mapper.py | Category → Determination mapping, alternate_contact parsing |
src/auto_responder/services/determination/mock_classifier.py | Mock LLM classifier for dry-run mode (keyword matching) |
Services — Databases#
| File | Description |
|---|---|
src/auto_responder/services/databases/base.py | Base database class (retry logic, health checks, connection enhancement) |
src/auto_responder/services/databases/cupola.py | Cupola SQL Server connector (IPv4 team) |
src/auto_responder/services/databases/hodor.py | Hodor SQL Server connector (IPv4 team) |
src/auto_responder/services/databases/multipub.py | Multipub SQL Server connector + subscription validation (ACE team) |
src/auto_responder/services/databases/salesforce.py | Salesforce REST API connector (ACE team) |
src/auto_responder/services/databases/mock.py | Mock database connectors for dry-run mode |
src/auto_responder/services/databases/readonly_wrapper.py | Read-only wrapper — live reads, mocked writes (for read-only mode) |
Services — Email & Notifications#
| File | Description |
|---|---|
src/auto_responder/services/email_repository.py | MSSQL email extraction from central repository |
src/auto_responder/services/mock_email_repository.py | Mock email repository for dry-run mode |
src/auto_responder/services/email_service.py | SMTP email sending service |
src/auto_responder/services/notifications/notifier.py | Notification service (Client Services, Sai Teja, Venu, Multipub validation flags) |
src/auto_responder/services/sfmc/client.py | SFMC REST client — suppression lists (live) |
src/auto_responder/services/sfmc/mock.py | SFMC mock client (read-only mode, dry-run) |
Services — Outlook (Deprecated)#
| File | Description |
|---|---|
src/auto_responder/services/outlook/graph_client.py | Microsoft Graph API client — DEPRECATED (replaced by central SQL server) |
Utilities#
| File | Description |
|---|---|
src/auto_responder/utils/action_logger.py | Comprehensive operation logging for dry-run and read-only modes |
src/auto_responder/utils/stage_logger.py | Stage-by-stage execution logging (always enabled) |
src/auto_responder/utils/logging_config.py | Logging configuration and correlation context |
src/auto_responder/utils/output_document_collector.py | Collects records for the three business deliverable lists |
src/auto_responder/utils/output_document_generator.py | Generates CSV/JSON output documents from collected records |
src/auto_responder/utils/report_generator.py | Per-email processing report generation (log + JSON + CSV) |
Original Reference#
| File | Description |
|---|---|
Auto_responder_email/unified_email_classifier.py | Original LLM classification program (reference only, integrated into services) |
Integration History#
The LLM-based email classification from the original Auto_responder_email/unified_email_classifier.py was fully integrated into the AutoResponderProcess. Below is a summary of the integration phases.
Phase 1: Core Integration ✅#
| Component | File | Changes |
|---|---|---|
| Model Enhancements | models/determination.py | Added sender_new_email, is_long_term_leave, retired_personal_email, confidence fields |
| Configuration | config.py | Added OpenAI, Ollama, and concurrency settings |
| LLM Classifier | services/determination/llm_classifier.py | Created — extracted from original, pure service implementation |
| Category Mapper | services/determination/category_mapper.py | Created — maps LLM categories to Determination enum |
| Determination Service | services/determination/autoresponsedetermination.py | Replaced placeholder with full LLM integration |
Phase 2: Action Enhancements ✅#
| Component | File | Changes |
|---|---|---|
| Database Connectors | services/databases/*.py | Added update_contact_email() to all connectors |
| Action Engine | processors/action_engine.py | Enhanced all handlers; added Multipub validation integration |
| Subscription Models | models/subscription.py | Created — MultipubValidationResult, MultipubOrder, MultipubSubscriber |
| Multipub Connector | services/databases/multipub.py | Major enhancements for Tarun's validation queries |
| Notifier | services/notifications/notifier.py | Added Multipub validation notification |
| Dependencies | requirements.txt | Added openai>=1.12.0 |
Error Handling#
Database Update Failures#
- Log each failure individually
- Continue with other systems (graceful degradation)
- Return
Falseif all systems fail - Return
Trueif at least one system succeeds
Notification Failures#
- Log failure but don't fail the entire operation
- Store failed notifications for manual review
LLM Classification Failures#
- Return UNKNOWN determination with confidence
0.0 - Log error with full email content
- Skip processing rather than crashing
Contact Lookup Failures#
- Graceful degradation — continues even if some database connectors fail to initialize
- Unified contact returned with records from whichever systems responded
Common Issues & Solutions#
| Issue | Solution |
|---|---|
| Replacement contact matches sender | Clear replacement_info, change to UNKNOWN |
| Invalid email format | Validate with regex before updating |
| Database connection timeout | Retry with exponential backoff |
| LLM returns malformed JSON | Return UNKNOWN, log for review |
| Contact not found in any system | Log warning, skip processing |
Performance Considerations#
Batch Processing#
- Current implementation processes emails sequentially
- Determination service supports
determine_batch()for future parallel use - Respect API rate limits for LLM calls
Database Connections#
- Reuse database connections where possible
- Connection pooling recommended
- Handle connection timeouts gracefully
Async/Sync Bridge#
- Uses
asyncio.run()which may cause issues if called from an existing event loop - Consider making
main.pyasync in the future for better performance
Caching#
- Cache contact lookups to avoid repeated queries
- Cache LLM classifications for duplicate emails
- Consider caching determination results
Testing#
Test Scenarios#
| Test Case | Input | Expected Determination | Expected Actions |
|---|---|---|---|
| Retired Person | Email with "retirement" keyword | INACTIVE | Mark inactive, check subscriptions, notify if needed |
| Out of Office | Email with "out of office" keyword | OUT_OF_OFFICE | Human Review digest only; no system writes |
| Changed Email | Email with "new email is X" | EMAIL_UPDATE | Update email/title in Cupola/Hodor/SF (skips per rules) |
| Replacement Contact | Email with "contact John at john@co.com" | REPLACEMENT | Mark inactive, add replacement, notify |
| Undeliverable | Bounce message | INACTIVE | Mark inactive, check subscriptions |
| Not Relevant | System notification | UNKNOWN | Skip processing |
Testing Checklist#
- [ ] Retired person → INACTIVE
- [ ] Out of office → OUT_OF_OFFICE (Human Review digest only, no writes)
- [ ] Changed email → EMAIL_UPDATE
- [ ] Replacement contact → REPLACEMENT
- [ ] Undeliverable → INACTIVE
- [ ] Not relevant → UNKNOWN
- [ ] Long-term leave flag → ACTIVE with logging
- [ ] Active subscription → Notification sent
- [ ] Multiple systems → All updated correctly
- [ ] Database failure → Partial success handling
- [ ] Multipub validation — active subscription → Flag notification
- [ ] Multipub validation — recently expired → Flag notification
- [ ] Multipub validation — single-issue purchase → Flag notification
- [ ] Multipub validation — no subscriptions → Proceed normally
Implementation Status & Enhancements#
Core Functionality (Implemented)#
- [x] UNKNOWN determination handling
- [x] INACTIVE determination handling
- [x] REPLACEMENT determination handling
- [x] EMAIL_UPDATE / TITLE_UPDATE determination handling (shared
_handle_title_update()) - [x] ACTIVE determination handling
- [x] Subscription checking
- [x] Notification system
- [x] Multipub validation (Tarun's process)
- [x] Category mapper — alternate_contact for Deceased/Left Company/Retired
- [x] Two-agent QA system (Classification + QA)
- [x] Dry-run mode
Enhancements Needed#
High Priority
- [ ] Optional: write-through updates to the central email repository (MSSQL inbox extract) when contact data changes — not implemented; operational updates target Cupola/Hodor/Salesforce per business rules
- [ ] Add validation for new email addresses before updating
- [x] Implement title update functionality in database connectors (
update_contact_titleon Cupola, Hodor, Multipub, Salesforce; action engine applies with Multipub / SF-if-Multipub skips) - [ ] Wire LLM (or rules) to populate
DeterminationResult.updated_title/TITLE_UPDATEif title-only changes should be first-class
Medium Priority
- [ ] Add category-specific logging (Retired vs Deceased vs Left Company)
- [ ] Add metrics tracking per category
- [ ] Add duplicate replacement prevention
- [ ] Add retry logic for transient LLM API failures
Low Priority
- [ ] Add phone number extraction from alternate_contact
- [ ] Add notification for long-term leave
- [ ] Add temporary replacement handling for Out of Office
- [ ] Add email validation in email repository
- [ ] Consider making main.py fully async
- [ ] Add caching for repeated email classifications
Future Enhancements#
- Machine Learning Model — Train custom model on historical classifications to reduce LLM dependency
- Workflow Automation — Automatic approval for high-confidence; manual review queue for low-confidence
- Advanced Contact Matching — Fuzzy matching, duplicate detection, contact relationship mapping
- Reporting Dashboard — Real-time processing status, historical trends, error analysis
Known Limitations & Risks#
- API Costs: LLM calls have per-email costs (OpenAI)
- Latency: LLM calls add processing time per email
- Rate Limits: Must respect OpenAI API rate limits
- Malformed Responses: LLM may return unparseable JSON (falls back to UNKNOWN)
- Async/Sync Bridge:
asyncio.run()may conflict with existing event loops - Title Updates:
update_contact_title()exists on connectors and_handle_title_update()applies it whenDeterminationResult.updated_titleis set; the LLM pipeline does not populateupdated_titletoday, andTITLE_UPDATEis not returned by the category mapper — title-only automation needs future wiring - Multipub Linked Server: Queries use OPENQUERY to
multipub_livelinked server — must be properly configured - Error Recovery: Transient API failures fall back to UNKNOWN; no retry logic yet
Team Responsibilities#
| System | Team | Contact |
|---|---|---|
| Cupola | IPv4 | - |
| Hodor | IPv4 | - |
| Multipub | ACE | Tarun (validation queries) |
| Salesforce | ACE | - |
| LLM Classification | - | Shubham |
| Action Engine | - | - |
| Notifications (Multipub flags) | - | Max / Client Services (recipient) |
| Notifications (Replacements) | - | Sai Teja (recipient) |
Usage#
The integration is transparent — the main workflow remains the same:
from auto_responder.main import main
# Process 10 most recent emails (default)
main()
# Process a specific number of emails
main(max_results=25)Operating Modes#
The application supports three modes controlled by environment variables:
| Mode | Environment Variable | Description |
|---|---|---|
| Live | (default) | Real database connections and writes |
| Simulation (no writes) | WRITE_OPERATIONS_ENABLED=false | Live database reads; writes intercepted via ReadOnlyDatabaseWrapper (Cupola, Multipub, Salesforce, Hodor when configured) — operations logged |
| Dev mock-all | MOCK_ALL_SERVICES_FOR_DEV=true | Mock Cupola/Multipub/Salesforce/SFMC; real HodorDatabase; optional real email repository; real LLM; contact lookup skipped in main.py |
| Hodor manual | HODOR_AUTOMATIC_UPDATES=false | Hodor reads still run; Hodor writes are not executed — intended changes are emailed to Venu (NOTIFICATION_EMAIL_VENU, default Venu). Not the same as simulation (no-write) mode. |
Note: Hodor uses a real SQL connector, not MockDatabase. Salesforce gracefully falls back to a mock if connection fails. See connections/README.html and connections/hodor.html.
Document Version: 3.6
Date: April 20, 2026
Status: Complete ✅
Last Updated: May 3, 2026 — Documented that processing_report_ip4.csv uses a fixed 23-column IP4 layout (ReportGenerator._CSV_COLUMNS_IP4) while the master CSV remains the full ledger; Client Services run-report email copy aligned. (Earlier April 20, 2026 update: Run Outputs section, Human Review, Venu audit bundle, notifications table.)