Revolut built a foundation model on transactions: PRAGMA
The architecture: event streams (transactions, payments, CRM activity) → tokenise → train a base model → refine / finetune (they used NVIDIA Nemotron-1b-v2 for text comprehension) → universal embeddings that transfer across tasks.
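To make the "tokenise" step concrete, here is a minimal sketch of one way an event stream could be turned into token IDs. The field names, the amount buckets and the helper functions are all hypothetical, chosen for illustration; this is not Revolut's actual tokeniser.

```python
from typing import Dict, List

# Hypothetical event records; field names are illustrative, not a real schema.
events = [
    {"type": "card_payment", "amount": 12.50, "channel": "pos"},
    {"type": "transfer", "amount": 950.00, "channel": "app"},
    {"type": "card_payment", "amount": 13.10, "channel": "pos"},
]


def amount_bucket(amount: float) -> str:
    """Map a continuous amount to a coarse bucket so it becomes discrete."""
    if amount < 20:
        return "amt_small"
    if amount < 500:
        return "amt_medium"
    return "amt_large"


def event_to_token(event: dict) -> str:
    """Combine the categorical fields and the amount bucket into one token."""
    return f"{event['type']}|{event['channel']}|{amount_bucket(event['amount'])}"


def build_vocab(tokens: List[str]) -> Dict[str, int]:
    """Assign an integer ID to each distinct token, reserving 0 and 1."""
    vocab = {"<pad>": 0, "<unk>": 1}
    for tok in tokens:
        if tok not in vocab:
            vocab[tok] = len(vocab)
    return vocab


tokens = [event_to_token(e) for e in events]
vocab = build_vocab(tokens)
token_ids = [vocab.get(t, vocab["<unk>"]) for t in tokens]
print(token_ids)
```

The two small card payments collapse to the same token, which is the point: the sequence model sees recurring behaviour as recurring symbols.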
Downstream applications? Credit risk. Fraud. Personalisation.
One model. Many use cases.
It works because transactions are sequential data. Like language. Attention is all you need.
The same logic applies to loan-level data.
Loan events (payment schedules, default trajectories, collateral changes) are also sequential. The architecture translates directly.
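As a rough illustration of what "sequential" means for loans, the sketch below groups a flat event log into one time-ordered sequence per loan. The loan identifiers and event names are made up for the example and do not reflect any specific reporting template.

```python
from collections import defaultdict
from datetime import date

# Hypothetical loan-level event log; identifiers and event names are illustrative.
loan_events = [
    ("loan_A", date(2023, 3, 1), "payment_on_time"),
    ("loan_B", date(2023, 1, 15), "payment_on_time"),
    ("loan_A", date(2023, 1, 1), "payment_on_time"),
    ("loan_A", date(2023, 2, 1), "payment_missed"),
    ("loan_B", date(2023, 2, 15), "collateral_revalued"),
]


def build_sequences(events):
    """Group events by loan and order each group chronologically."""
    grouped = defaultdict(list)
    for loan_id, event_date, event_type in events:
        grouped[loan_id].append((event_date, event_type))
    # Sorting the (date, event) tuples orders each loan's history by date.
    return {
        loan_id: [event_type for _, event_type in sorted(items)]
        for loan_id, items in grouped.items()
    }


sequences = build_sequences(loan_events)
print(sequences["loan_A"])
```

Each loan becomes a "sentence" of events, which is the shape a sequence model expects.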
The difference: payment data at a neobank is (relatively) clean and centralised. Loan data across European banks is typically not ready to be used by data scientists.
It might be buried in a vendor solution or stored in the cloud in a way that is not AI-friendly.
That’s where deeploans helps. Not as the foundation model, but as the infrastructure layer that makes it possible. Clean, ECB/ESMA-aligned loan sequences. Ready for whatever sits on top.
There is more: NVIDIA has just released a new blueprint, transaction-foundation-model.
NVIDIA's core idea: the pipeline learns embeddings from tabular sequences and is meant to generalise beyond payments to any domain with structured sequential signals.
We have used it as a representation-learning layer on top of deeploans-cleaned loan-level data, feeding the learned embeddings into downstream credit / risk applications (check the GitHub pull request).
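The "embeddings in, risk model on top" pattern can be sketched in a few lines. The example below is not the code from the pull request and does not use the NVIDIA blueprint: the embeddings and default labels are made up, and a hand-written logistic regression stands in for whatever downstream model is actually used.

```python
import math

# Made-up 2-d loan embeddings and default labels (1 = defaulted), for illustration only.
embeddings = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.8]]
labels = [0, 0, 1, 1]


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(X, y, lr=0.5, epochs=500):
    """Fit logistic regression with plain batch gradient descent."""
    w = [0.0] * len(X[0])
    b = 0.0
    n = len(X)
    for _ in range(epochs):
        grad_w = [0.0] * len(w)
        grad_b = 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict_pd(x, w, b) -> float:
    """Return the modelled probability of default for one embedding."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


w, b = fit_logistic(embeddings, labels)
pd_risky = predict_pd([0.15, 0.85], w, b)
pd_safe = predict_pd([0.85, 0.15], w, b)
print(round(pd_risky, 3), round(pd_safe, 3))
```

The embeddings stay frozen here; only the small supervised model on top is trained, which keeps the labelled-data requirement low.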
The best initial use cases are:
- Default / delinquency early warning: Train the foundation model self-supervised on large unlabeled loan histories, then use the embeddings as features in a supervised PD or delinquency-transition model. This is the most direct analogue to NVIDIA’s and Revolut’s fraud example: unsupervised sequence learning first, task model second. This is also very similar to something we built a few years ago that did not work.
- Deal anomaly detection / data quality scoring: Because deeploans already focuses on fragmented and inconsistent financial data, embedding-based outlier detection could flag loans or deals whose behaviour is statistically unusual relative to comparable cohorts. That is often valuable before full risk modeling.
- Borrower / loan segmentation: Use pooled embeddings for clustering, peer grouping, and search over similar loans or deals. This is useful for surveillance, portfolio triage, and analyst workflows.
- Prepayment / restructuring propensity: Sequence embeddings are often better than hand-built features for identifying behavioral states that precede refinance, modification or stress (related to the first bullet point).
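Several of the use cases above rely on the same two primitives: pooling per-event embeddings into one vector per loan, and comparing those vectors. Here is a minimal sketch with made-up 2-d embeddings; the outlier score (one minus the mean cosine similarity to the rest of the cohort) is a simple illustrative choice, not the method used by NVIDIA, Revolut or deeploans.

```python
import math
from typing import Dict, List

Vector = List[float]

# Made-up per-event embeddings for three loans (2 dimensions for readability).
step_embeddings: Dict[str, List[Vector]] = {
    "loan_A": [[1.0, 0.0], [0.8, 0.2]],
    "loan_B": [[0.9, 0.1], [0.9, 0.3]],
    "loan_C": [[0.0, 1.0], [0.2, 0.8]],
}


def mean_pool(steps: List[Vector]) -> Vector:
    """Average the per-event vectors into a single loan-level vector."""
    return [sum(col) / len(steps) for col in zip(*steps)]


def cosine(a: Vector, b: Vector) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def most_similar(query_id: str, pooled: Dict[str, Vector]) -> str:
    """Return the other loan whose pooled embedding is closest to the query."""
    others = [lid for lid in pooled if lid != query_id]
    return max(others, key=lambda lid: cosine(pooled[query_id], pooled[lid]))


def outlier_score(loan_id: str, pooled: Dict[str, Vector]) -> float:
    """One minus the mean cosine similarity to every other loan in the cohort."""
    sims = [cosine(pooled[loan_id], v) for lid, v in pooled.items() if lid != loan_id]
    return 1.0 - sum(sims) / len(sims)


pooled = {lid: mean_pool(steps) for lid, steps in step_embeddings.items()}
nearest_to_a = most_similar("loan_A", pooled)
scores = {lid: outlier_score(lid, pooled) for lid in pooled}
most_unusual = max(scores, key=scores.get)
print(nearest_to_a, most_unusual)
```

The same pooled vectors can feed clustering for segmentation or a propensity model, so the expensive part (learning the embeddings) is shared across tasks.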
Join us for the Structured Finance Hackathon (June 10) to see how the community will be leveraging deeploans and other open source frameworks to push the frontier of innovation in the industry.

