The Engineering Behind Capitec Pulse
By Andrew Baker, Chief Information Officer, Capitec Bank
1. Introduction
When Capitec launched Pulse this week, the coverage focused on the numbers: an AI powered contact centre tool that reduces call handling time by up to 18%, delivers a 26% net performance improvement across the pilot group, and enabled agents who previously took 7% longer than the contact centre average to close that gap entirely after adoption. Those are meaningful numbers, and they are worth reporting. But they are not the interesting part of the story.
The interesting part is the engineering that makes Pulse possible at all, and why the “world first” claim, which drew measured scepticism from TechCentral and others who pointed to existing vendor platforms with broadly similar agent assist capabilities, is more defensible than the initial coverage suggested. The distinction between having a concept and being able to deploy it in production, at banking scale, against a production estate serving 25 million clients, is not a marketing question. It is a physics question. This article explains why.
2. What Pulse Actually Does
To understand why Pulse is difficult to build, it helps to understand precisely what it is being asked to do. When a Capitec client contacts the support centre through the banking app, Pulse fires. Before the agent picks up the call, the system assembles a real time contextual picture of that client’s recent account activity, drawing on signals from across the bank’s systems: declined transactions, app diagnostics, service interruptions, payment data and risk indicators. All of that context is surfaced to the agent before the first word is exchanged, so that the agent enters the conversation already knowing, or at least having a high confidence hypothesis about, why the client is calling.
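The shape of that pre-call assembly step can be sketched as a parallel fan out to several signal sources under a hard time budget. This is a minimal illustration, not Capitec's implementation: the source names, fields and budget are invented for the example.

```python
# Sketch of pre-call context assembly: query several backend signal sources in
# parallel under a time budget, so a briefing exists before the agent answers.
# All source names and fields below are illustrative assumptions.
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FuturesTimeout, as_completed

TIME_BUDGET_S = 2.0  # the whole briefing must land within a few seconds

def fetch_declines(client_id):        return {"declined_txns": 2}
def fetch_app_diagnostics(client_id): return {"last_app_error": "login_timeout"}
def fetch_service_health(client_id):  return {"outages": []}

SOURCES = {
    "declines": fetch_declines,
    "diagnostics": fetch_app_diagnostics,
    "service_health": fetch_service_health,
}

def assemble_briefing(client_id):
    """Query every signal source concurrently; a source that misses the
    deadline is dropped so the agent gets a partial briefing on time.
    (A production version would also cancel the stragglers.)"""
    briefing = {}
    with ThreadPoolExecutor(max_workers=len(SOURCES)) as pool:
        futures = {pool.submit(fn, client_id): name for name, fn in SOURCES.items()}
        try:
            for fut in as_completed(futures, timeout=TIME_BUDGET_S):
                briefing[futures[fut]] = fut.result()
        except FuturesTimeout:
            pass  # ship what we have rather than delay the call
    return briefing

print(assemble_briefing("client-123"))
```

The essential property is the deadline: a briefing that arrives after the agent has already started talking is worthless, so slow sources degrade the briefing rather than the call.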
The goal, as I described it in the launch statement, is not simply faster resolution. It is an effortless experience for clients at the moment they are most frustrated. The removal of the repetitive preamble, the “can you tell me the last four digits of your card” and “when did the problem start” that precede every contact centre interaction, is what makes the experience qualitatively different, not just marginally faster. The 18% reduction in handling time is a consequence of that. It is not the objective.
What makes this hard is not the user interface, or the machine learning, or the integration with Amazon Connect. What makes it hard is getting the right data, for the right client, in the right form, in the window of time between the client tapping “call us” and the agent picking up. That window is measured in seconds. The data in question spans the entire operational footprint of a major retail bank.
3. Why Anyone Can Build Pulse in a Meeting Room, But Not in Production
When TechCentral noted that several major technology vendors offer agent assist platforms with broadly similar real time context capabilities, they were correct on the surface. Genesys, Salesforce, Amazon Connect itself and a number of specialised contact centre AI vendors all offer products that can surface contextual information to agents during calls. The concept of giving an agent more information before they speak to a customer is not new, and Capitec has never claimed it is.
The “world first” claim is more specific than that. It is a claim about delivering real time situational understanding at the moment a call is received, built entirely on live operational data rather than batch replicated summaries, without impacting the bank’s production transaction processing. That specificity is what the coverage largely missed, and it is worth unpacking in detail, because the reason no comparable system exists is not that nobody thought of it. It is that the engineering path to deploying it safely is extremely narrow, and it requires a degree of control over the underlying data architecture that almost no bank in the world currently possesses.
To see why, it helps to understand the two approaches any bank or vendor would naturally reach for, and why both of them fail at scale.
4. Option 1: Replicate Everything Into Pulse Before the Call Arrives
The first and most intuitive approach is to build a dedicated data store for Pulse and replicate all relevant client data into it continuously. Pulse then queries its own local copy of the data when a call comes in, rather than touching production systems at all. The production estate is insulated, the data is pre assembled, and the agent gets a fast response because Pulse is working against its own index rather than firing live queries into transactional databases.
This approach has significant appeal on paper, and it is the model that most vendor platforms implicitly rely on. The problem is what happens to it at banking scale, in a real production environment, under real time load.
Most banks run their data replication through change data capture (CDC) pipelines. A CDC tool watches the database write ahead log, the sequential record of every committed transaction, and streams those changes downstream to consuming systems: the data warehouse, the fraud platform, the reporting layer, the risk systems. These pipelines are already under substantial pressure in large scale banking environments. They are not idle infrastructure with spare capacity waiting to be allocated. Adding a new, high priority, low latency replication consumer for contact centre data means competing with every other downstream consumer for CDC throughput, and the replication lag that results from that contention can easily reach the point where the data Pulse is working with is minutes or tens of minutes old rather than seconds.
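The dynamic that makes contended CDC pipelines dangerous is simple to model: whenever the change stream produces events faster than the shared pipeline can drain them, the backlog, and therefore the replication lag, grows without bound. A toy model, with rates invented purely to illustrate the point:

```python
# Toy model of CDC backlog under contention. Lag is how long the drain rate
# would need to clear the accumulated backlog. All rates are illustrative.
def simulate_lag(produce_per_s, drain_per_s, seconds):
    backlog = 0
    for _ in range(seconds):
        backlog = max(0, backlog + produce_per_s - drain_per_s)
    return backlog / drain_per_s  # seconds behind the source

# A pipeline that keeps up: lag stays at zero indefinitely.
assert simulate_lag(8_000, 10_000, 600) == 0

# Add one more high-volume consumer's worth of load: after ten minutes,
# the freshest change a downstream reader sees is minutes old.
print(f"{simulate_lag(12_000, 10_000, 600):.0f}s behind")
```

The uncomfortable feature of this curve is that it has no equilibrium: once the produce rate exceeds the drain rate, every additional second of contention makes the data older, which is exactly the failure mode a "live context" product cannot tolerate.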
For some of our core services, due to massive volumes, CDC replication is not an option, so these key services would not be eligible for Pulse if we adopted a replication architecture approach.
The more fundamental problem, though, is one of scope. You cannot wait for a call to come in before deciding what to replicate. By the time the client has initiated the support session, there is no longer enough time to fetch the relevant data from what is currently more than 60 databases and log stores. The replication into the Pulse data store has to be continuous, complete and current for all 25 million clients, not just the ones currently on calls. That means maintaining sub second freshness across the entire operational footprint of the bank, continuously, around the clock. The storage footprint of that at scale is large. The write amplification, where every transaction is written twice, once to the source system and once to the Pulse replica, effectively doubles the IOPS demand on an already loaded infrastructure. And the cost of provisioning enough I/O capacity to maintain that freshness reliably, without tail latency spikes that would degrade the contact centre experience, is substantial and recurring.
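The write amplification arithmetic is worth making explicit. Every transaction persisted at the source must be persisted again in the replica store, so steady state write IOPS roughly doubles; a back-of-envelope sketch, with a hypothetical baseline figure:

```python
# Back-of-envelope write amplification: one extra physical write per copy of
# each transaction kept in the replica store. The baseline is a made-up figure,
# not Capitec's actual load.
def total_write_iops(source_write_iops, extra_copies=1):
    return source_write_iops * (1 + extra_copies)

base = 50_000  # hypothetical peak write IOPS across the estate
print(total_write_iops(base))  # capacity must be provisioned for double the peak
```

And because the infrastructure has to be sized for peak load plus failure headroom, doubling the write demand does not double the cost at the margin; it doubles the cost of the most expensive capacity in the estate.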
All of our core services have to be designed for worst case failure states. During an outage, when all our systems are already under huge scale out pressures, contact centre call volumes are obviously at their peak as well. If Pulse replication added pressure during that scenario to the point where we could not recover our services, or had to turn it off precisely when it was most valuable, the architectural trade off would be untenable.
Option 1 works on paper. In production, against a real banking client base of the size Capitec serves, it is expensive, architecturally fragile and, in practice, not reliably fresh enough for the use case it is meant to serve.
5. Option 2: Query the Live Production Databases as the Call Comes In
The second approach is more direct: abandon the replication model entirely and let Pulse query the live production databases at the moment the call arrives. There is no replication lag, because there is no replication. The data Pulse reads is the same data the bank’s transactional systems are working with right now, because Pulse is reading from the same source. Freshness is guaranteed by definition.
This approach also fails at scale, and the failure mode is more dangerous than the one in Option 1, because it does not manifest as stale data. It manifests as degraded payment processing.
To understand why, it is necessary to understand how relational databases handle concurrent reads and writes. Many OLTP (online transaction processing) databases, with SQL Server in its default read committed isolation configuration as the textbook example, use shared locks, also called read locks, to manage concurrent access to rows and pages. When a query reads a row, it acquires a shared lock on that row for the duration of the read. A shared lock is compatible with other shared locks, so multiple readers can access the same row simultaneously without blocking each other. But a shared lock is not compatible with an exclusive lock, which is what a write operation requires. A write must wait until all shared locks on the target row have been released before it can proceed. This locking model exists for a good reason: it ensures that readers see a consistent view of data that is not mid modification. The cost of that consistency guarantee is that reads and writes are not fully independent. (Engines built on multi version concurrency control, including Oracle, MySQL’s InnoDB and PostgreSQL, take a different approach for ordinary reads, a distinction that becomes central later in this article.)
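The compatibility rules described above reduce to a small matrix, which can be modelled in a few lines. This is a deliberately minimal model of shared/exclusive lock semantics, not any particular database's lock manager:

```python
# Minimal model of shared/exclusive lock compatibility: shared locks coexist
# with each other, but an exclusive (write) lock must wait until every lock
# on the row is released, and readers must wait out an in-flight write.
SHARED, EXCLUSIVE = "S", "X"

def compatible(requested, held):
    """A requested lock is granted only if compatible with all held locks."""
    if not held:
        return True
    if requested == SHARED:
        return all(h == SHARED for h in held)  # readers stack freely
    return False                               # a writer tolerates no other lock

assert compatible(SHARED, [SHARED, SHARED])   # many concurrent readers
assert not compatible(EXCLUSIVE, [SHARED])    # a payment waits on a read
assert not compatible(SHARED, [EXCLUSIVE])    # a read waits on a payment
```

The second assertion is the whole problem in one line: on a lock based engine, a read issued for a contact centre briefing can make a payment wait.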
In a low concurrency environment, this trade off is rarely visible. Reads complete quickly, locks are released, writes proceed with negligible delay. In a high throughput banking environment, where thousands of transactions per second are competing for access to the same set of account rows, adding a new class of read traffic directly into that contention pool has measurable consequences. On a lock based engine, every time a Pulse query reads a client’s account data to prepare a contact centre briefing, it acquires shared locks on the rows it touches, and every write transaction targeting those same rows, whether a payment completing, a balance updating, or a fraud flag being set, must wait until those locks are released. At Capitec’s scale, with a large number of contact centre calls arriving simultaneously, the aggregate lock contention introduced by Pulse queries onto the production write path would generate a consistent and material increase in transaction tail latency. That is not a theoretical risk. It is a predictable consequence of the locking model, and even on an MVCC engine the additional read load and long running snapshots land on the same primary instance that is processing payments. It is a consequence that cannot be engineered away without moving the reads off the production write path entirely.
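Why contention shows up in the tail rather than the average is worth a sketch. Even a small probability that a write collides with a read lock moves the p99 sharply, because the slow path is an order of magnitude slower than the fast path. The latencies and collision rate below are invented to illustrate the shape of the effect, not measured figures:

```python
# Rough model of tail latency under read/write contention: a write that
# collides with a read lock waits out the lock hold time. All numbers are
# illustrative assumptions, not measurements.
import random

random.seed(42)

def write_latency_ms(read_collision_p, base_ms=2.0, read_hold_ms=25.0):
    wait = read_hold_ms if random.random() < read_collision_p else 0.0
    return base_ms + wait

def p99(samples):
    return sorted(samples)[int(len(samples) * 0.99)]

without_reads = [write_latency_ms(0.00) for _ in range(10_000)]
with_reads    = [write_latency_ms(0.02) for _ in range(10_000)]
print(p99(without_reads), p99(with_reads))  # 2% collisions dominate the p99
```

The mean barely moves in this model; the p99 jumps by more than a factor of ten. Payment SLAs are written against the tail, which is why this failure mode is unacceptable even when the averages look healthy.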
Option 2 solves the data freshness problem while introducing a write path degradation problem that, in a regulated banking environment, is not an acceptable trade off. The integrity and predictability of payment processing is not something that can be compromised in exchange for better contact centre context.
6. Option 3: Redesign the Foundations
Capitec arrived at a third path, and it was available to us for a reason that has nothing to do with being smarter than the engineers at other banks or at the vendor platforms. It was available because Capitec owns its source code. The entire banking stack, from the core transaction engine to the client facing application layer, is built internally. There is no third party core banking platform, no licensed system with a vendor controlled schema and a contractual restriction on architectural modification. When we decided that real time operational intelligence was worth getting right at a foundational level, we had the ability to act on that decision across the entire estate.
The central architectural choice was to build every database in the bank on Amazon Aurora PostgreSQL, with Aurora read replicas provisioned with dedicated IOPS rather than relying on Aurora’s default autoscaling burst IOPS model (with conservative min ACUs). Aurora’s architecture is important here because it separates the storage layer from the compute layer in a way that most traditional relational databases do not. In a conventional RDBMS, a read replica is a physically separate copy of the database that receives a stream of changes from the primary and applies them sequentially. Replication lag in a conventional model accumulates when write throughput on the primary outpaces the replica’s ability to apply changes. In Aurora, the primary and all read replicas share the same underlying distributed storage layer. A write committed on the primary is immediately visible to all replicas, because they are all reading from the same storage volume. The replica lag in Aurora PostgreSQL under normal operational load is measured in single digit milliseconds rather than seconds or minutes, and that difference is what makes the contact centre use case viable.
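The difference between the two replication models can be made concrete. In the apply based model, visibility lags by however long the replica needs to work through its apply backlog; in the shared storage model there is no apply step, only a small cache invalidation latency. The rates and latency figure below are illustrative assumptions:

```python
# Contrast sketch of apply-based vs shared-storage replica visibility.
# All rates and latencies are illustrative, not measured Aurora figures.
def apply_based_lag_s(commit_rate_per_s, apply_rate_per_s, seconds):
    """Conventional replica: lag accumulates whenever commits outpace apply."""
    backlog = max(0, (commit_rate_per_s - apply_rate_per_s) * seconds)
    return backlog / apply_rate_per_s  # seconds behind the primary

def shared_storage_lag_s():
    """Shared-storage replica: no apply step, only page-cache invalidation,
    which is on the order of milliseconds in the model described above."""
    return 0.005

# Five minutes of commits outpacing apply leaves the replica over a minute behind.
print(apply_based_lag_s(15_000, 12_000, 300), shared_storage_lag_s())
```

The key structural point is that the shared storage number does not depend on write throughput at all, which is what makes the freshness guarantee hold during exactly the high load incidents when the contact centre is busiest.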
Pulse has access exclusively to the read replicas. By design and by access control, it cannot touch the write path at all. This is the critical architectural guarantee. The read replicas are configured with access patterns, indexes and query plans optimised specifically for the contact centre read profile, which is structurally different from the transactional write profile the primary instances are optimised for. Because Aurora’s read replicas use PostgreSQL’s MVCC (multi version concurrency control) architecture, reads on the replica never acquire shared locks that could interfere with writes on the primary. MVCC works by maintaining multiple versions of each row simultaneously, so that every concurrent transaction can see a consistent snapshot of the data as of the moment it started. When Pulse queries a read replica, PostgreSQL serves it a snapshot of the data as it existed at the moment the query started, without acquiring any row level locks whatsoever. There is no mechanism by which Pulse’s read traffic can cause a write on the primary to wait.
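The MVCC mechanism can be illustrated with a toy versioned store. This is a greatly simplified model of the PostgreSQL idea, not its implementation: real MVCC also tracks in-flight transactions and vacuums dead row versions.

```python
# Toy MVCC store: each write appends a new row version stamped with commit
# order; a read sees the latest version committed at or before its snapshot,
# and acquires no lock. Greatly simplified relative to real PostgreSQL MVCC.
class MvccStore:
    def __init__(self):
        self.versions = {}    # key -> list of (commit_seq, value)
        self.commit_seq = 0

    def write(self, key, value):
        self.commit_seq += 1  # writes never wait for readers
        self.versions.setdefault(key, []).append((self.commit_seq, value))

    def snapshot(self):
        return self.commit_seq  # a read pins its visibility horizon at start

    def read(self, key, snap):
        visible = [v for seq, v in self.versions.get(key, []) if seq <= snap]
        return visible[-1] if visible else None

db = MvccStore()
db.write("balance:123", 500)
snap = db.snapshot()            # a Pulse-style query starts here
db.write("balance:123", 380)    # a payment commits mid-query, unblocked
assert db.read("balance:123", snap) == 500           # consistent snapshot
assert db.read("balance:123", db.snapshot()) == 380  # new reads see the write
```

The two assertions capture the trade the architecture makes: the in-flight read stays internally consistent, and the write proceeds without ever learning the read existed.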
Beyond the relational data layer, all operational log files across the platform are coalesced into Amazon OpenSearch, giving Pulse a single, indexed view of the bank’s entire log estate without requiring it to fan out queries to dozens of individual service logs scattered across the infrastructure. App diagnostics, service health events, error patterns and system signals are all searchable through a single interface, and OpenSearch’s inverted index architecture means that the kinds of pattern matching and signal correlation queries that Pulse needs to produce a useful agent briefing execute in milliseconds against a well tuned cluster, rather than in seconds against raw log streams.
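The inverted index idea that makes this fast is worth a sketch: tokens map to posting sets of document ids, and a multi term query becomes an intersection of small sets rather than a scan over raw log streams. The real OpenSearch index layers analysis, scoring and sharding on top of this; the log lines below are invented examples.

```python
# Minimal inverted index over log events: token -> set of event ids, with a
# multi-term query answered by intersecting posting sets. Log content is
# illustrative; OpenSearch's real index is far richer.
from collections import defaultdict

def build_index(docs):
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in text.lower().split():
            index[token].add(doc_id)
    return index

def search(index, *terms):
    postings = [index.get(t.lower(), set()) for t in terms]
    return set.intersection(*postings) if postings else set()

logs = {
    "evt-1": "app login timeout client 123",
    "evt-2": "payment declined client 123",
    "evt-3": "app login ok client 456",
}
idx = build_index(logs)
assert search(idx, "client", "123") == {"evt-1", "evt-2"}
assert search(idx, "login", "timeout") == {"evt-1"}
```

Because the intersection touches only the postings for the query terms, the cost scales with the number of matching events rather than the total volume of logs, which is why millisecond correlation queries are feasible against an estate-wide log store.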
The result of these architectural choices taken together is a system in which Pulse reads genuinely current data, through a read path that is completely isolated from the write path, with effectively no replication lag, no lock contention and no impact on the transaction processing that is the bank’s core operational obligation.
7. Why a Vendor Could Not Have Delivered This
This is the part of the “world first” argument that the sceptics most consistently miss, and it is worth addressing directly. The question is not whether vendors are capable of building the software components that Pulse uses. Of course they are. Amazon, Salesforce, Genesys and others have engineering teams that are among the best in the industry. The question is whether any vendor could have deployed a Pulse equivalent system successfully against a real world banking estate, and the answer to that question is almost certainly no, for reasons that have nothing to do with engineering capability and everything to do with the constraints that vendors face when they deploy into environments they did not build.
A vendor arriving at a major bank with a Pulse equivalent product would encounter a technology estate built on a core banking platform they do not control, with a CDC replication architecture that is already at or near capacity, and with OLTP databases running a locking model that is baked into the platform and cannot be modified without the platform vendor’s involvement. They would be presented with exactly the choice described in sections 4 and 5 of this article: replicate everything and accept the lag and IOPS cost, or query production and accept the locking risk. Neither of those options produces a system that works reliably at the scale and performance level that a contact centre use case demands, and a vendor has no ability to change the underlying estate to create the third option.
The only path to the architecture described in section 6 is to control the source code of the underlying banking systems and to have made the decision to build the data infrastructure correctly from the beginning, before the specific use case of contact centre AI was on anyone’s roadmap. That is a decision Capitec made, and it is a decision that most banks, running licensed core banking platforms with limited architectural flexibility, are not in a position to make regardless of budget or intent.
8. Pulse Is the First Output of a Broader Capability
It would be a mistake to read Pulse purely as a contact centre initiative, because that is not what it is. It is the first publicly visible output of a platform capability that Capitec has been building for several years, and that capability was designed to serve a much broader set of real time operational decisions than contact centre agent briefings.
The traditional data architecture in banking separates the transactional estate from the analytical estate. The OLTP systems process transactions in real time. A subset of that data is replicated, usually overnight, into a data warehouse or data lake, where it is available to analytical tools and operational decision systems. Business intelligence, fraud models, credit decisioning engines and risk systems are typically built on top of this batch refreshed analytical layer. It is a well understood model and it works reliably, but its fundamental limitation is that every decision made on the analytical layer is made on data that is, at minimum, hours old.
For fraud prevention, that delay is increasingly unacceptable. Fraud patterns evolve in minutes, and a fraud signal that is twelve hours old is, in many cases, a signal that arrived after the damage was done. For credit decisions, which should reflect a client’s current financial position rather than yesterday’s snapshot, the batch model introduces systematic inaccuracy that translates directly into worse outcomes for clients; Capitec Advances is one example, where the decision should reflect income received this morning rather than income received last month. For contact centre interactions, it means agents are working with context that may not reflect the last several hours of a client’s experience, which is precisely the window in which the problem they are calling about occurred. Capitec’s investment in the real time data infrastructure that underpins Pulse was motivated by all three of these use cases simultaneously, and Pulse is simply the first system to emerge from that investment in a publicly deployable form. It will not be the last.
9. The World First Verdict
The “world first” claim, properly understood, is this: no comparable system delivers real time situational understanding to contact centre agents at the moment a call is received, built on live operational data with sub second freshness, at the scale of a 25 million client retail banking estate, without any impact on the bank’s production write path. That is a precise claim, and it is defensible precisely because the engineering path that leads to it requires a combination of architectural decisions, including full internal ownership of source code, Aurora PostgreSQL with dedicated read replicas across the entire estate, MVCC based read isolation, and OpenSearch log aggregation, that very few organisations in the world have made, and that could not have been retrofitted to an existing banking estate by a third party vendor regardless of their capability.
Any bank can describe Pulse in a presentation. The vast majority of them cannot deploy it, because they do not control the foundations it depends on. The distinction between the idea and the working system is what the claim is actually about, and on that basis it stands.
References
TechCentral, “Capitec’s new AI tool knows your problem before you explain it”, 5 March 2026. https://techcentral.co.za/capitecs-new-ai-tool-knows-your-problem-before-you-explain-it/278635/
BizCommunity, “Capitec unveils AI system to speed up client support”, 5 March 2026. https://www.bizcommunity.com/article/capitec-unveils-ai-system-to-speed-up-client-support-400089a
MyBroadband, “Capitec launches new system that can almost read customers’ minds”, 2026. https://mybroadband.co.za/news/banking/632029-capitec-launches-new-system-that-can-almost-reads-customers-minds.html
Amazon Web Services, “Amazon Aurora PostgreSQL read replicas and replication”, AWS Documentation. https://docs.aws.amazon.com/AmazonRDS/latest/AuroraUserGuide/Aurora.Replication.html
Amazon Web Services, “Amazon Connect, cloud contact centre”, AWS Documentation. https://aws.amazon.com/connect/
PostgreSQL Global Development Group, “Chapter 13: Concurrency Control and Multi Version Concurrency Control”, PostgreSQL 16 Documentation. https://www.postgresql.org/docs/current/mvcc.html
Amazon Web Services, “What is Amazon OpenSearch Service?”, AWS Documentation. https://docs.aws.amazon.com/opensearch-service/latest/developerguide/what-is.html
Capitec Bank, “Interim Results for the six months ended 31 August 2025”, 1 October 2025. https://www.capitecbank.co.za/blog/news/2025/interim-results/