How Accurate Does GenAI Need to Be in Structured Finance?

Co-authored by Francesco Moscogiuri and Dylan Thiam


Accuracy is not a yes-or-no question, especially when interpreting a Collateralized Loan Obligation (CLO) prospectus.

In structured finance, a technically accurate response isn’t always useful, and a useful answer isn’t always technically complete. A good output needs to capture subtle thresholds, conditional logic, exceptions and legal nuance. 

We’re not just checking whether the Large Language Model (LLM) found the right number or clause. Generative Artificial Intelligence (GenAI) systems should understand not just the words but the structure, the logic, and the regulatory context of a deal, and be able to communicate all of this clearly.

What does this mean for GenAI in structured finance?

What’s the Threshold in Structured Finance?

Building an AI agent is relatively easy in 2025. Making it reliable for structured finance? Much less so.

In most industries, a mistake from an AI agent might be mildly infuriating. In structured finance, it can be disqualifying. You’re interpreting legal triggers, modeling priority of payments, or extracting reinvestment rules that govern hundreds of millions in collateral. A single missed clause or an ambiguous summary can change our understanding of how a deal works.

Take a CLO prospectus, for example. Coverage tests are often buried in footnotes, with thresholds that change by tranche or time period. Manager replacement provisions might sit in a section with layered conditions, cross-referencing multiple appendices. Misreading even one of those references can render the AI’s output not just unhelpful but dangerously misleading.

That’s why accuracy isn’t a binary score. It’s about stakes: the higher the consequence of misunderstanding, the closer to 100% you need to be, and the more context and clarity the model has to preserve.

But if we expect machines to operate near-perfectly in high-stakes contexts, it’s worth asking how accurate humans actually are in these contexts.

How Accurate Are Humans, Really?

When evaluating AI performance, it’s easy to treat human understanding as the gold standard. But in practice, that bar is much lower than we think, especially when it comes to dense financial documents.

In a legal study involving over 28,000 documents, seven separate review teams of trained attorneys were asked to evaluate the same dataset. When their assessments were compared, the teams agreed on a document’s relevance to the case only 43% of the time. On which documents were important enough to definitely be included, they agreed just 9% of the time.

Structured finance has similar challenges. Junior analysts and seasoned professionals alike can misread clauses, skip cross-references, or disagree on how to interpret a trigger. Fatigue, formatting inconsistencies, and complexity all play a role.

In that sense, the AI isn’t competing with perfection; it’s competing with real-world workflows where shortcuts happen, attention slips, and documents get misread.

Case Study: Validating an AI Tool on CLO Prospectuses

We’ve just completed the second validation round on an AI tool that originally gained traction after the first Structured Finance hackathon: a system designed to extract key terms, triggers, and structural elements from prospectuses. The tool focuses on turning lengthy, legally dense documents like CLO prospectuses into structured, queryable data.
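To make “structured, queryable data” concrete, here is a minimal sketch of what one extracted record could look like. The schema, the field names, and the overcollateralization example are our illustration, not the tool’s actual data model:

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class ExtractedProvision:
    """One provision pulled from a prospectus, with its source reference.
    Hypothetical schema for illustration only."""
    deal: str
    provision_type: str        # e.g. "overcollateralization test"
    tranche: Optional[str]     # some provisions apply deal-wide
    threshold: Optional[str]   # kept as text: thresholds vary by tranche and period
    source_section: str        # where in the document the clause lives
    cross_references: List[str] = field(default_factory=list)

# A made-up Class B OC test, as it might be captured:
oc_test = ExtractedProvision(
    deal="Example CLO 2021-1",
    provision_type="overcollateralization test",
    tranche="Class B",
    threshold="OC ratio >= 104.5%",
    source_section="Section 7.2, footnote 14",
    cross_references=["Appendix C (ratio definitions)"],
)

# Once provisions are records rather than prose, they become queryable:
provisions = [oc_test]
oc_tests = [p for p in provisions if p.provision_type == "overcollateralization test"]
```

Carrying a source reference on every record is what lets a system show its work, so an analyst can trace an answer back to the clause it came from.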

This round focused on thirty prospectuses from 2020 to 2025, issued in Ireland, the Cayman Islands, and the U.S.A.

The evaluation involved 1,479 questions covering real-world areas of interest: waterfall provisions, post-reinvestment restrictions, manager replacement clauses, interest coverage and overcollateralization tests, and more. Each AI-generated answer was manually reviewed and scored for accuracy.

The result: 1,475 out of 1,479 answers were marked correct – an accuracy rate of 99.7%.
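For readers who want to check the arithmetic, a few lines of Python reproduce the headline figure; the Wilson confidence interval is our own back-of-the-envelope addition, not part of the validation methodology:

```python
import math

correct, total = 1475, 1479
accuracy = correct / total
print(f"Accuracy: {accuracy:.2%}")  # 99.73%

# Wilson score interval at 95% confidence (z = 1.96):
z = 1.96
p, n = accuracy, total
centre = (p + z**2 / (2 * n)) / (1 + z**2 / n)
half = (z / (1 + z**2 / n)) * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2))
print(f"95% interval: [{centre - half:.2%}, {centre + half:.2%}]")  # roughly [99.3%, 99.9%]
```

Even under that conservative reading, the lower bound sits comfortably above 99%.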

[Figure: validation success rate over time, showing a steady increase in accuracy to approximately 99.73%.]

The most obvious thing one might notice is the steady upward trend in accuracy. This rise has been driven first and foremost by ongoing fine-tuning, targeted adjustments, and prompt engineering. There have also been infrastructure changes along the way, including a move to a different model to gain better control over performance.

But the most valuable insights came from comparing answers from this round to the first, especially in terms of completeness and clarity.

While early outputs got the facts right, this round showed clear gains not just in accuracy but in how the answers were expressed: more concise phrasing and better formatting of complex rules.

In other words, the improvement wasn’t just in content, but in form, making the deals easier to understand quickly for human users.

Consider two answers to a reinvestment rule question. One returns a dense paragraph quoting the full clause — this is technically accurate, but hard to scan. The other lays out the same information as a structured list: threshold, deadline, exception.
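To make the contrast concrete, here is a small sketch with an invented reinvestment clause; the thresholds and deadlines are made up for illustration, but the two presentations mirror the comparison:

```python
# Two renderings of the same (invented) reinvestment rule.

dense = (
    "Following the Reinvestment Period, Principal Proceeds may be reinvested "
    "only if the aggregate principal balance of the Collateral Obligations is "
    "at least 102% of the Reinvestment Target Par Balance, such reinvestment "
    "occurs within 30 business days of receipt, and no Event of Default has "
    "occurred and is continuing."
)

structured = """\
Reinvestment after the Reinvestment Period is permitted only if ALL of the following hold:
  * Threshold: collateral principal balance >= 102% of the Reinvestment Target Par Balance
  * Deadline:  reinvestment within 30 business days of receipt of proceeds
  * Exception: not permitted while an Event of Default is continuing
"""

print(dense)       # technically accurate, but hard to scan
print(structured)  # same facts: threshold, deadline, exception at a glance
```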


In finance, as in conversation, how something is said often determines whether the message lands. Humans use emphasis, repetition, and context cues to make sure what matters comes through. When AI starts doing the same, structuring complex clauses into clean logic and highlighting key exceptions, it stops feeling like a search tool and starts behaving like a partner.

So… How Accurate Should It Be?

Once AI models pass the 98-99% accuracy mark and can show their work, they are generally usable. But more value emerges when outputs are structured in a way that lets analysts follow the logic instantly and confidently.

If we keep iterating with real-world feedback, manual validation, and domain expertise, AI tools in structured finance won’t just match human performance; they will raise the bar.

If you’re working with prospectuses, waterfalls, or complex deals, we offer practical support across validation, prompt engineering, and fine-tuning.

Our team is proud to be at the forefront of applying AI to structured finance, and we’re always up for comparing notes. Reach out!
