This project implements an agent-based reconciliation system powered by Large Language Models (LLMs) to automate dividend booking reconciliation between NBIM and Custody data sources.
It addresses real-world operational challenges by identifying reconciliation breaks, explaining causes, and generating control recommendations. See in agents/README.md for detailed explanation of agents and their prompts.
NBIM_case/
│
├── agents/
│ ├── README.md
│ ├── data_verification_agent.py
│ ├── reconciliation_information_agent.py
│ ├── reconciliation_eval_transaction_agent.py
│ ├── reconciliation_remediation_agent.py
│ └── reconciliation_summary_agent.py
│
├── raw_data/
│ ├── NBIM_Dividend_Bookings.csv
│ └── CUSTODY_Dividend_Bookings.csv
│
├── processed_data/
│ ├── dividend_detailed_join.csv
│ ├── dividend_llm_flagged.csv
│ └── dividend_transaction_summaries.csv
│
├── notebooks/
│ ├── README.md
│ └── data_preprocessing.ipynb
│
├── .env
├── main.py
└── README.md
The program operates in two modes, each corresponding to a distinct workflow:
Figure 1 — LLM-driven data verification and reconciliation pipeline.
Command:
python3 main.py --verifyThis mode validates that the data preprocessing logic is correct before reconciliation begins.
Loads and analyzes the Jupyter notebook in notebooks/data_preprocessing.ipynb.
Uses an LLM to verify that column renaming, data cleaning, and joins follow the expected standards.
Produces a structured verification report to provide data quality.
Command:
python3 main.py --reconciliationThis mode runs the full LLM-driven reconciliation workflow from raw data to summarized results.
-
Reads NBIM and Custody CSV files from raw_data/.
-
Standardizes column names and merges the datasets on event_key and isin.
-
Produces processed_data/dividend_detailed_join.csv (a detailed outer-joined dataset)
-
Analyzes both datasets to detect common reconciliation break types such as mismatched FX rates, missing events, or incorrect settlement amounts.
-
Generates a structured dictionary of break types, causes, and control recommendations.
-
Evaluates each break type across all transactions using contextual LLM reasoning.
-
Flags records that exhibit reconciliation inconsistencies.
-
Produces processed_data/dividend_llm_flagged.csv (a dataset with automated LLM-based flags)
-
Generates short, professional summaries for each transaction explaining what was detected and what action is recommended.
-
Produces processed_data/dividend_transaction_summaries.csv — a human-readable output ready for audit or review.
-
Reviews flagged transactions with the NBIM and Custody records to suggest concrete record-level corrections.
-
Generates proposed updates and accompanying reasons for each remediation.
-
Outputs the results to processed_data/dividend_llm_remediated.csv for review and audit.
Input Files
-
NBIM_Dividend_Bookings.csv
-
CUSTODY_Dividend_Bookings.csv
Processing Steps
-
Clean and normalize both datasets.
-
Merge them by event_key and isin.
-
Identify and evaluate reconciliation breaks using LLM analysis.
-
Generate audit-ready summaries.
Output Files
-
dividend_detailed_join.csv -> Full merged dataset
-
dividend_llm_flagged.csv -> Transactions with LLM-generated flags
-
dividend_transaction_summaries.csv -> Summarized reconciliation insights
-
dividend_llm_remediated -> Suggestions to remediated transactions
Technical Stack
-
Python 3.10+
-
LangChain – LLM orchestration framework
-
Pydantic – Structured validation of model outputs
-
pandas – Data wrangling and joining
-
dotenv – Secure environment variable management
-
Automate reconciliation analysis using LLM reasoning
-
Reduce manual investigation time
-
Provide explainable and auditable outputs
-
Standardize break classification and control documentation
The NBIM LLM-Powered Reconciliation System introduces automation into a financial control process.
Key risks and mitigations are summarized below.
| Risk | Impact | Likelihood | Mitigation |
|---|---|---|---|
| Incorrect LLM output (hallucination or mislabeling) | High | Medium | Use deterministic pre-checks, strict Pydantic schemas, and low-temperature settings. |
| Missed or false reconciliation breaks | High | Medium | Combine rule-based checks with LLM reasoning; validate against historical test cases. |
| Unsafe remediations or data overwrites | High | Low | No direct system writes: all remediations are exported to CSV for human approval. |
| Data quality or join logic errors | Medium | Medium | Preflight verification agent checks joins, types, and transformations before execution. |
| Prompt drift / inconsistent behavior | Medium | Medium | Keep a saved version of every LLM prompt and regularly test it on known example cases to make sure it still produces the expected results. |
| Sensitive data exposure | High | Low | Restrict inputs to necessary columns; store secrets in .env. |
| Auditability and explainability gaps | Medium | Medium | Log all prompts, model versions, and outputs; structured JSON ensures reproducibility. |
| Model unavailability / vendor outage | Medium | Low | Include fallback “rules-only” mode for continuity. |
| Untrusted or compromised LLM APIs | High | Low | Use only verified and secure API providers (e.g., OpenAI with authentication keys). Enforce HTTPS, validate responses, and reject unsigned or tampered outputs. |
The system uses an LLM-first approach with guardrails rather than hard deterministic rules. Model outputs are constrained by schema validation, and any proposed remediations are exported to CSV for analyst approval (human-in-the-loop). Fully autonomous financial operations require strong governance and control mechanisms to ensure reliability and trust. At the system level, this includes schema validation, prompt versioning (record of prompt templates), secure API usage, and immutable audit logs for every model decision.
Operationally, all remediation outputs remain human-reviewed before execution, preventing unauthorized or automated postings. Longer-term autonomy should introduce layered controls such as confidence scoring, dual-approval workflows, and rule-based policy checks to contain model risk while maintaining efficiency.
-
Add a Streamlit dashboard for reviewing flagged transactions
-
Integrate retrieval-augmented memory for contextual learning (valuable with previous cases)
-
Implement confidence scoring on flagging
-
Enable continuous learning from human feedback (fine-tune analysis based on expert input)
-
I plan to implement an in-context learning mechanism based on retrieval of similar historical records to support automated remediation. The approach involves clustering records by key attributes (such as break type, instrument, currency, and reconciliation flags), then applying case-based reasoning principles to retrieve and adapt solutions from comparable past cases. This would allow the system to suggest or automatically apply remediation actions based on prior successful resolutions.
-
Integrate rule-based validation layer (Combine deterministic financial rules with LLM reasoning to ensure control robustness and reduce hallucination risk)
-
Connect to real-time custody APIs (enables real time reconciliation)
Example output of a reconciliation run:
(venv) anders@Anders-sin-MacBook-Pro NBIM_case % python3 main.py --reconciliation
Running Reconciliation ...
Break Types Dictionary:
{0: 'Dividend Amount Discrepancy', 1: 'Payment Date Mismatch', 2: 'Tax Rate Variance', 3: 'Currency Conversion Errors', 4: 'Event Type Classification Errors', 5: 'Missing Events', 6: 'Settlement Amount Differences'}
Causes Dictionary:
{0: 'Differences in dividend calculations between the investment manager and custodian can lead to discrepancies in gross and net amounts reported. This may arise from variations in tax treatment or currency conversion rates applied by each party.', 1: 'Payment dates may differ due to processing times or delays in communication between the investment manager and custodian, leading to mismatches in expected cash flows.', 2: 'Variations in tax rates applied by the investment manager versus the custodian can result in discrepancies in the net amounts received, particularly for international dividends subject to different withholding tax rates.'}
Controls Dictionary:
{0: 'Implement a standardized process for calculating dividends that includes clear guidelines on tax treatment and currency conversion to ensure consistency across both systems.', 1: "Establish a timeline for payment processing that aligns both parties' expectations and includes regular updates to minimize discrepancies in payment dates.", 2: 'Regularly review and reconcile tax rates applied to dividends, ensuring both parties are aligned on the applicable rates and any changes in tax legislation.'}
Loaded 5 rows from processed_data/dividend_detailed_join.csv
Evaluating break type: Dividend Amount Discrepancy (grouped by 'event_key')
Evaluating Dividend Amount Discrepancy: 100%|██████████████████████████████████████████| 3/3 [00:05<00:00, 1.78s/it]
Completed evaluation for 3 transactions under break type 'Dividend Amount Discrepancy'
Evaluating break type: Payment Date Mismatch (grouped by 'event_key')
Evaluating Payment Date Mismatch: 100%|████████████████████████████████████████████████| 3/3 [00:05<00:00, 1.83s/it]
Completed evaluation for 3 transactions under break type 'Payment Date Mismatch'
Evaluating break type: Tax Rate Variance (grouped by 'event_key')
Evaluating Tax Rate Variance: 100%|████████████████████████████████████████████████████| 3/3 [00:05<00:00, 1.69s/it]
Completed evaluation for 3 transactions under break type 'Tax Rate Variance'
Evaluating break type: Currency Conversion Errors (grouped by 'event_key')
Evaluating Currency Conversion Errors: 100%|███████████████████████████████████████████| 3/3 [00:05<00:00, 1.84s/it]
Completed evaluation for 3 transactions under break type 'Currency Conversion Errors'
Evaluating break type: Event Type Classification Errors (grouped by 'event_key')
Evaluating Event Type Classification Errors: 100%|█████████████████████████████████████| 3/3 [00:05<00:00, 1.86s/it]
Completed evaluation for 3 transactions under break type 'Event Type Classification Errors'
Evaluating break type: Missing Events (grouped by 'event_key')
Evaluating Missing Events: 100%|███████████████████████████████████████████████████████| 3/3 [00:05<00:00, 1.81s/it]
Completed evaluation for 3 transactions under break type 'Missing Events'
Evaluating break type: Settlement Amount Differences (grouped by 'event_key')
Evaluating Settlement Amount Differences: 100%|████████████████████████████████████████| 3/3 [00:05<00:00, 1.83s/it]
Completed evaluation for 3 transactions under break type 'Settlement Amount Differences'
Saved LLM-flagged dataset to processed_data/dividend_llm_flagged.csv
Processed summary for event_key 950123456: The reconciliation of transaction event_key 950123456 has confirmed that all relevant data points align between NBIM and Custody. There are no discrepancies in dividend amounts, payment dates, tax rates, currency conversions, event type classifications, or settlement amounts. All necessary event details are present and consistent, indicating a successful reconciliation with adherence to control guidance for standardized processes and regular reviews.
Processed summary for event_key 960789012: The reconciliation of transaction event_key 960789012 revealed multiple discrepancies, including a dividend amount discrepancy of $342.77 between custody and NBIM records, a payment date mismatch of five days, and a tax rate variance where the custody records indicated a 20% tax rate compared to 25% in NBIM records. Additionally, significant currency conversion errors were identified, with a notable difference in the average FX rate used. The event type was misclassified as 'DVCA' instead of the appropriate classification for a cross-currency dividend payment. To address these issues, it is recommended to implement standardized processes for dividend calculations, align payment processing timelines, and regularly review tax rates to ensure consistency and accuracy across both systems.
Processed summary for event_key 970456789: The reconciliation of transaction event_key 970456789 revealed significant discrepancies in the reported dividend amounts between NBIM and Custody, with NBIM reporting 502,163.9 CHF and Custody reporting a total of 94,605 CHF. Additionally, the analysis identified missing events due to multiple bank accounts associated with the same event_key, suggesting potential unrecorded dividend events. Furthermore, there were settlement amount differences across the rows, indicating inconsistencies in the net amounts settled for the respective nominal bases. No issues were found regarding payment date mismatches, tax rate variances, currency conversion errors, or event type classification errors. It is recommended to implement standardized processes for dividend calculations and regular reconciliations to address these discrepancies.
Running Remediation Agent to propose record-level fixes...
Loaded 3 flagged rows and 5 joined transaction rows.
Generating remediations for 3 transactions...
Remediating: 100%|█████████████████████████████████████████████████████████████████████| 3/3 [00:32<00:00, 10.70s/it]
Overwrote 5 total rows to processed_data/dividend_llm_remediated.csv
Proposed remediation for event_key 950123456: Updated custodian from 'JPMORGAN_CHASE' to 'CUST/JPMORGANUS' to match custody records.
Proposed remediation for event_key 960789012: Updated payment date to match custody record (25.05.2025).
Proposed remediation for event_key 970456789: Updated custodian from 'UBS_SWITZERLAND' to 'CUST/UBSCH' to match custody records for all rows.
Proposed remediation for event_key 970456789: Adjusted avg_fx_rate_quotation_to_portfolio from 12.4567 to 1.0 to reflect custody records for all rows.
Proposed remediation for event_key 970456789: Updated gross_amount_quotation, net_amount_quotation, and net_amount_settlement for the third row to match custody records, correcting discrepancies in the amounts.
Remediation complete. 5 updated rows proposed and written to processed_data/dividend_llm_remediated.csv