Document Summarization with GenAI for the Compliance Team

  • Writer: Anand Nerurkar
  • Oct 4
  • 3 min read

Step 0: Sample Loan Agreement Document

File: sample-loan-agreement.pdf

Uploaded by: Compliance Officer Amit

Document Type: Term Loan Agreement

Sample Content (Excerpt):

1. Parties: ABC Bank (Lender) and XYZ Corp (Borrower)
2. Loan Amount: $20,000,000
3. Term: 5 years
4. Interest Rate: SOFR + 2%, compounded quarterly
5. Repayment: Quarterly installments
6. Collateral: Real estate properties at XYZ Corp HQ
7. Covenants:
   7.1 Borrower must maintain DSCR ≥ 1.25
   7.2 Borrower may request grace period up to 180 days
   7.3 Quarterly financial reports to be submitted within 30 days
8. Prepayment: Allowed with 2% penalty
9. Regulatory:
   AML compliance under FATCA & RBI KYC 2023 Master Circular
   Basel III exposure norms for large corporate borrowers
10. Action Items:
    Compliance verification of FATCA clause
    Risk team sensitivity analysis on revenue forecast

Step 1: Angular UI Upload

Actions:

  • Compliance Officer Amit logs in (RBAC access).

  • Uploads PDF via multipart form.

  • Metadata submitted: document type, version, sensitivity.

Tables Updated:

  1. Audit Table: audit_log

| audit_id | event_name | document_id | performed_by | timestamp | details |
|---|---|---|---|---|---|
| A1 | DOCUMENT_UPLOAD_INITIATED | null | Amit | 2025-10-04 10:00 | Upload initiated, metadata validated |

Event Emitted: DOCUMENT_UPLOAD_INITIATED

Consumed By: AuditService

AI Guardrails: Input validation, file type/size, PII masking.
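
The input checks named above could look roughly like the following sketch. The size limit, the allowed extension set, and the function name validate_upload are assumptions for illustration; the walkthrough does not state concrete values.

```python
from pathlib import PurePath

# Assumed limits and field names; adjust to the real upload policy.
ALLOWED_EXTENSIONS = {".pdf"}
MAX_SIZE_BYTES = 25 * 1024 * 1024
REQUIRED_METADATA = ("doc_type", "version", "sensitivity")


def validate_upload(file_name: str, size_bytes: int, metadata: dict) -> list[str]:
    """Return a list of validation errors; an empty list means the upload may proceed."""
    errors = []
    if PurePath(file_name).suffix.lower() not in ALLOWED_EXTENSIONS:
        errors.append(f"unsupported file type: {file_name}")
    if size_bytes <= 0 or size_bytes > MAX_SIZE_BYTES:
        errors.append(f"file size out of range: {size_bytes}")
    for key in REQUIRED_METADATA:
        if not metadata.get(key):
            errors.append(f"missing metadata field: {key}")
    return errors
```

Returning every error at once, rather than failing on the first, lets the UI show the officer all problems in a single round trip.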

Step 2: Document Service - Save & Persist

Actions:

  • File saved locally: /data/documents/sample-loan-agreement.pdf

  • Metadata stored in document_metadata table.

Table: document_metadata

| document_id | file_name | doc_type | uploaded_by | upload_date | version | sensitivity | storage_path | processed_flag | processed_date |
|---|---|---|---|---|---|---|---|---|---|
| UUID1 | sample-loan-agreement.pdf | LoanAgreement | Amit | 2025-10-04 10:01 | 1.0 | Confidential | /data/documents/sample-loan-agreement.pdf | false | null |

Audit Table Entry:

| audit_id | event_name | document_id | performed_by | timestamp | details |
|---|---|---|---|---|---|
| A2 | DOCUMENT_UPLOADED | UUID1 | DocumentService | 2025-10-04 10:01 | File saved, metadata persisted |

Event Emitted: DOCUMENT_UPLOADED

Consumed By: OCRService, AuditService
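
As a rough sketch of Step 2, the metadata row and its audit entry can be written in one transaction so that neither exists without the other. This uses SQLite from the Python standard library as a stand-in for the real database; the column types, the integer audit_id, and the helper names init_schema and save_document are assumptions.

```python
import sqlite3
import uuid
from datetime import datetime, timezone


def init_schema(conn):
    """Create the two tables used in this step (column types are assumed)."""
    conn.execute("""CREATE TABLE document_metadata (
        document_id TEXT PRIMARY KEY, file_name TEXT, doc_type TEXT,
        uploaded_by TEXT, upload_date TEXT, version TEXT, sensitivity TEXT,
        storage_path TEXT, processed_flag INTEGER DEFAULT 0, processed_date TEXT)""")
    conn.execute("""CREATE TABLE audit_log (
        audit_id INTEGER PRIMARY KEY AUTOINCREMENT, event_name TEXT,
        document_id TEXT, performed_by TEXT, timestamp TEXT, details TEXT)""")


def save_document(conn, file_name, doc_type, uploaded_by, version,
                  sensitivity, storage_path):
    """Insert the metadata row and the DOCUMENT_UPLOADED audit row together."""
    document_id = str(uuid.uuid4())
    now = datetime.now(timezone.utc).isoformat(timespec="seconds")
    with conn:  # commits both inserts, or rolls both back on error
        conn.execute(
            "INSERT INTO document_metadata (document_id, file_name, doc_type, "
            "uploaded_by, upload_date, version, sensitivity, storage_path) "
            "VALUES (?, ?, ?, ?, ?, ?, ?, ?)",
            (document_id, file_name, doc_type, uploaded_by, now, version,
             sensitivity, storage_path),
        )
        conn.execute(
            "INSERT INTO audit_log (event_name, document_id, performed_by, "
            "timestamp, details) VALUES (?, ?, ?, ?, ?)",
            ("DOCUMENT_UPLOADED", document_id, "DocumentService", now,
             "File saved, metadata persisted"),
        )
    return document_id
```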

Step 3: OCR & Text Extraction

Actions:

  • OCRService consumes DOCUMENT_UPLOADED.

  • Extracts text and splits into chunks.

  • PII masked or tokenized.
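
The chunking and masking actions above might be sketched as follows. The two regular expressions are illustrative only and would miss most real PII; the chunk size and the helper names are assumptions, and the OCR step itself is assumed to have already produced plain text.

```python
import re

# Illustrative patterns only; a production detector needs far broader coverage.
PII_PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "PHONE": re.compile(r"\+\d[\d\s-]{8,}\d"),
}


def mask_pii(text):
    """Replace each match with a [TYPE] token; report whether anything was masked."""
    masked = False
    for label, pattern in PII_PATTERNS.items():
        text, count = pattern.subn(f"[{label}]", text)
        masked = masked or count > 0
    return text, masked


def split_into_chunks(text, max_chars=200):
    """Greedy chunking: pack whole lines until max_chars would be exceeded."""
    chunks, current = [], ""
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if current and len(current) + len(line) + 1 > max_chars:
            chunks.append(current)
            current = line
        else:
            current = f"{current} {line}".strip()
    if current:
        chunks.append(current)
    return chunks


def build_text_rows(document_id, text, max_chars=200):
    """Produce rows shaped like the document_text table."""
    rows = []
    for i, chunk in enumerate(split_into_chunks(text, max_chars), start=1):
        masked_text, flag = mask_pii(chunk)
        rows.append({"chunk_id": i, "document_id": document_id,
                     "text_chunk": masked_text, "pii_masked_flag": flag})
    return rows
```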

Table: document_text

| chunk_id | document_id | text_chunk | pii_masked_flag |
|---|---|---|---|
| 1 | UUID1 | "Parties: ABC Bank (Lender) and XYZ Corp..." | true |
| 2 | UUID1 | "Loan Amount: $20,000,000; Term: 5 years..." | false |
| 3 | UUID1 | "Covenants: 7.1 DSCR ≥1.25; 7.2 grace 180d..." | false |

Audit Table Entry:

| audit_id | event_name | document_id | performed_by | timestamp | details |
|---|---|---|---|---|---|
| A3 | DOCUMENT_OCR_COMPLETED | UUID1 | OCRService | 2025-10-04 10:03 | 3 text chunks extracted, PII masked |

Event Emitted: DOCUMENT_OCR_COMPLETED

Step 4: Indexing & Embeddings

Actions:

  • IndexingService consumes DOCUMENT_OCR_COMPLETED.

  • Generates embeddings for each chunk via OpenAI API.

  • Stores embeddings in PGVector table.

  • Redis caching for frequently accessed embeddings.
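
A minimal sketch of the embed-then-cache pattern described above. To stay self-contained it does not call the OpenAI API or Redis: embed_fn is any callable that returns a vector, fake_embed is a deterministic placeholder with no semantic meaning, and InMemoryCache is a hypothetical stand-in exposing the same get/set shape a Redis client offers.

```python
import hashlib
import json


class InMemoryCache:
    """Hypothetical stand-in for a Redis client (get/set on string keys)."""

    def __init__(self):
        self._store = {}

    def get(self, key):
        return self._store.get(key)

    def set(self, key, value):
        self._store[key] = value


def fake_embed(text, dim=8):
    """Placeholder for a real embedding call; output carries no semantic meaning."""
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    return [b / 255 for b in digest[:dim]]


def index_chunks(chunks, embed_fn, cache):
    """Embed each chunk once, reusing cached vectors keyed by a hash of the text."""
    rows = []
    for chunk in chunks:
        key = "emb:" + hashlib.sha256(chunk["text_chunk"].encode("utf-8")).hexdigest()
        cached = cache.get(key)
        if cached is not None:
            vector = json.loads(cached)
        else:
            vector = embed_fn(chunk["text_chunk"])
            cache.set(key, json.dumps(vector))
        rows.append({"chunk_id": chunk["chunk_id"],
                     "document_id": chunk["document_id"],
                     "embedding_vector": vector})
    return rows
```

Keying the cache on a hash of the chunk text means re-uploading an unchanged document version does not trigger new embedding calls.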

Table: document_embeddings (PGVector)

| chunk_id | document_id | embedding_vector | created_at |
|---|---|---|---|
| 1 | UUID1 | [0.12,0.56,...0.78] | 2025-10-04 10:04 |
| 2 | UUID1 | [0.98,0.34,...0.11] | 2025-10-04 10:04 |

Audit Table Entry:

| audit_id | event_name | document_id | performed_by | timestamp | details |
|---|---|---|---|---|---|
| A4 | DOCUMENT_INDEXED | UUID1 | IndexingService | 2025-10-04 10:05 | 3 chunks indexed, embeddings created |

Event Emitted: DOCUMENT_INDEXED

Step 5: Summarization (Executive Summary)

Actions:

  • SummarizationService consumes DOCUMENT_INDEXED.

  • LLM generates structured summary with AI Guardrails (hallucination checks, citation, explainability, validation).
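
One simple way to approximate the citation and hallucination checks mentioned above is to require every summary claim to quote evidence from a specific chunk and to verify that the quote really occurs there. The claim structure (claim, evidence, chunk_id) and the function name are assumptions; this is a sketch of one possible guardrail, not the guardrail set the pipeline actually uses.

```python
import re


def _normalize(text):
    """Collapse whitespace and lowercase so trivial formatting differences do not matter."""
    return re.sub(r"\s+", " ", text).strip().lower()


def find_unsupported_claims(claims, chunks_by_id):
    """Return the claims whose quoted evidence is absent from the cited chunk."""
    unsupported = []
    for claim in claims:
        chunk_text = chunks_by_id.get(claim.get("chunk_id"), "")
        evidence = _normalize(claim.get("evidence", ""))
        if not evidence or evidence not in _normalize(chunk_text):
            unsupported.append(claim["claim"])
    return unsupported
```

Claims returned by this check could be dropped or routed to the reviewer with a warning before the summary is stored.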

Table: document_summary

| summary_id | document_id | executive_summary | key_obligations | risks | regulatory_mapping | clause_categorization | action_items | feedback_status |
|---|---|---|---|---|---|---|---|---|
| S1 | UUID1 | "Loan agreement: $20M, 5y, SOFR+2%, quarterly" | "DSCR ≥1.25; Quarterly reports; Prepayment" | "Grace 180d; Cross-border guarantees" | "AML: FATCA & RBI KYC; Basel III exposure" | "Covenants, Prepayment, Repayment" | "Verify FATCA; Risk revenue analysis" | Pending |

Audit Table Entry:

| audit_id | event_name | document_id | performed_by | timestamp | details |
|---|---|---|---|---|---|
| A5 | DOCUMENT_SUMMARIZED | UUID1 | SummarizationService | 2025-10-04 10:06 | Summary generated, AI guardrails applied |

Event Emitted: DOCUMENT_SUMMARIZED

Step 6: Feedback Loop

Actions:

  • Compliance officer reviews summary: Approve / Request changes.

  • Updates document_summary.feedback_status.
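
The review step can be modelled as a small state machine over feedback_status. In this sketch "Pending" and "Approved" come from the walkthrough, while the "ChangesRequested" label, the allowed transitions, and the event payload shape are assumptions.

```python
from enum import Enum


class FeedbackStatus(str, Enum):
    PENDING = "Pending"
    APPROVED = "Approved"
    CHANGES_REQUESTED = "ChangesRequested"  # assumed label for "Request changes"


# Assumed transitions: a summary sent back for changes returns to review later.
ALLOWED_TRANSITIONS = {
    FeedbackStatus.PENDING: {FeedbackStatus.APPROVED, FeedbackStatus.CHANGES_REQUESTED},
    FeedbackStatus.CHANGES_REQUESTED: {FeedbackStatus.PENDING},
    FeedbackStatus.APPROVED: set(),
}


def update_feedback(summary, new_status, reviewer):
    """Apply a reviewer decision and return the FEEDBACK_UPDATED event payload."""
    current = FeedbackStatus(summary["feedback_status"])
    target = FeedbackStatus(new_status)
    if target not in ALLOWED_TRANSITIONS[current]:
        raise ValueError(f"cannot move from {current.value} to {target.value}")
    summary["feedback_status"] = target.value
    return {
        "event_name": "FEEDBACK_UPDATED",
        "document_id": summary["document_id"],
        "performed_by": reviewer,
        "status": target.value,
        "details": f"Feedback: {target.value}",
    }
```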

Audit Table Entry:

| audit_id | event_name | document_id | performed_by | timestamp | details |
|---|---|---|---|---|---|
| A6 | FEEDBACK_UPDATED | UUID1 | Amit | 2025-10-04 10:07 | Feedback: Approved |

Event Emitted: FEEDBACK_UPDATED

Step 7: Approval & Notifications

Actions:

  • ApprovalService consumes FEEDBACK_UPDATED.

  • Sets document_metadata.processed_flag = true and populates processed_date.

  • Notifies risk & compliance dashboards.
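
A handler for these actions might look like the sketch below. The event payload shape (a status field), the dashboard names passed to notify, and the notify callback itself are assumptions; the real notification mechanism is not described in the walkthrough.

```python
from datetime import datetime, timezone


def handle_feedback_updated(event, metadata_row, notify):
    """Mark the document processed and notify dashboards, but only on approval."""
    if event.get("status") != "Approved":
        return None  # nothing to finalize for other feedback outcomes
    metadata_row["processed_flag"] = True
    metadata_row["processed_date"] = datetime.now(timezone.utc).isoformat(timespec="seconds")
    for dashboard in ("risk", "compliance"):
        notify(dashboard, metadata_row["document_id"])
    return {
        "event_name": "DOCUMENT_APPROVED",
        "document_id": metadata_row["document_id"],
        "performed_by": "ApprovalService",
        "details": "Document approved, dashboards notified",
    }
```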

Audit Table Entry:

| audit_id | event_name | document_id | performed_by | timestamp | details |
|---|---|---|---|---|---|
| A7 | DOCUMENT_APPROVED | UUID1 | ApprovalService | 2025-10-04 10:08 | Document approved, dashboards notified |

Event Emitted: DOCUMENT_APPROVED

Step 8: Semantic Search & Retrieval

Actions:

  • SearchService queries PGVector embeddings for semantic search.

  • Returns top-k relevant chunks.
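
To make the top-k retrieval concrete, here is a small in-memory sketch that ranks stored vectors by cosine similarity to a query vector. In a PGVector deployment the ranking would normally happen inside the database (pgvector provides a cosine distance operator, <=>); the row shape and function names below are assumptions for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vector, rows, k=3):
    """Return the chunk_ids of the k rows most similar to the query, best first."""
    scored = [(cosine_similarity(query_vector, row["embedding_vector"]), row["chunk_id"])
              for row in rows]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [chunk_id for _, chunk_id in scored[:k]]
```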

Audit Table Entry:

| audit_id | event_name | document_id | performed_by | timestamp | details |
|---|---|---|---|---|---|
| A8 | DOCUMENT_SEARCHED | UUID1 | Amit | 2025-10-04 10:09 | Retrieved 3 similar chunks for query |

Step 9: Event Flow Summary

| Event Name | Produced By | Consumed By | Action Taken |
|---|---|---|---|
| DOCUMENT_UPLOAD_INITIATED | Angular UI/Controller | AuditService | Log user & metadata |
| DOCUMENT_UPLOADED | DocumentService | OCRService, AuditService | OCR extraction, log metadata |
| DOCUMENT_OCR_COMPLETED | OCRService | IndexingService, AuditService | Generate chunks, PII masking |
| DOCUMENT_INDEXED | IndexingService | SummarizationService, SearchService, AuditService | Embeddings created, cached |
| DOCUMENT_SUMMARIZED | SummarizationService | FeedbackService, AuditService | Executive summary generated, AI guardrails applied |
| FEEDBACK_UPDATED | FeedbackService | ApprovalService, AuditService | Feedback processed, retraining if needed |
| DOCUMENT_APPROVED | ApprovalService | AuditService | Final approval logged |
| DOCUMENT_SEARCHED | SearchService | AuditService | Semantic search logged |
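
The produce/consume chain in this table can be illustrated with a tiny in-process publish/subscribe sketch. It is synchronous and runs in one process, whereas the walkthrough describes an asynchronous flow, so a real deployment would presumably use a message broker; the EventBus class, the PIPELINE mapping, and the wire helper are hypothetical, and only the automated stages from upload to summarization are wired.

```python
from collections import defaultdict


class EventBus:
    """Minimal synchronous pub/sub; a stand-in for a real message broker."""

    def __init__(self):
        self._handlers = defaultdict(list)

    def subscribe(self, event_name, handler):
        self._handlers[event_name].append(handler)

    def publish(self, event_name, payload):
        for handler in list(self._handlers[event_name]):
            handler(event_name, payload)


# Each consumed event leads the consuming service to emit the next one.
PIPELINE = {
    "DOCUMENT_UPLOADED": "DOCUMENT_OCR_COMPLETED",   # OCRService
    "DOCUMENT_OCR_COMPLETED": "DOCUMENT_INDEXED",    # IndexingService
    "DOCUMENT_INDEXED": "DOCUMENT_SUMMARIZED",       # SummarizationService
}


def wire(bus, audit_log):
    """Subscribe an audit handler to every event and a stage handler per pipeline step."""

    def audit(event_name, payload):
        audit_log.append((event_name, payload["document_id"]))

    def make_stage(next_event):
        def stage(event_name, payload):
            # Real services would do their work here before emitting.
            bus.publish(next_event, payload)
        return stage

    for name, next_event in PIPELINE.items():
        bus.subscribe(name, audit)  # audit first so the log follows emission order
        bus.subscribe(name, make_stage(next_event))
    bus.subscribe("DOCUMENT_SUMMARIZED", audit)
```

Publishing DOCUMENT_UPLOADED on a wired bus walks the chain through to DOCUMENT_SUMMARIZED, with the audit handler recording each event along the way.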

Key Features Illustrated in This Walkthrough:

  • Tables: document_metadata, document_text, document_embeddings (PGVector), document_summary, audit_log

  • AI Guardrails: Input/output validation, hallucination checks, PII masking, citation, explainability

  • Caching: Redis cache for embeddings

  • Event-Driven Architecture: Full asynchronous flow with events emitted and consumed

  • Audit: Each action is logged with timestamp, actor, document_id, event_name

 
 
 
