By Zuzanna Stamirowska

For a number of years, the finance sector has been moving full-speed to collect data on business operations. From fraud detection to offering real-time customer experiences, the speed at which data can be collected, transformed and applied for analysis has become a critical success factor. And many financial organisations are now rapidly pursuing projects to apply this data to real-time AI applications.

But challenges persist. The difficulty of applying Machine Learning to data streaming workflows – which both of the aforementioned examples rely on – is tangibly reducing the speed and accuracy of these use cases. That means reduced fraud detection and less personalised customer experiences.

The finance sector is not alone in facing this issue. Nor is it a new issue. I first recognised this challenge when finishing my PhD on forecasting maritime trade in 2019. Despite the logistics industry going full-throttle on its collection and use of data, this inability to apply Machine Learning to the data streams was slowing down digitisation in the industry.

The barriers of batch data

But what is stopping the finance industry from designing real-time AI applications? Most AI models are trained on static data uploads. Machine Learning models can deliver promising results from one-off, static data snapshots, but they do not work for the ever-changing inputs of a data streaming workflow. New events and updates generated on a minute-by-minute basis, such as the latest creative fraud pattern, cannot inform these models until the next batch upload, which might run daily or less often and often takes hours to process.

In practice, it means that these models are not in a continuous state of learning and their intelligence is stuck in a moment in time. Unlike humans, they cannot update their knowledge as new information is revealed, whether because what they learnt has become outdated or because it has turned out to be false or inaccurate.
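The gap between a frozen snapshot and a continuously updated model can be illustrated with a minimal Python sketch. It is not a production fraud model: it swaps in a simple running mean and standard deviation (Welford's algorithm) for a real Machine Learning model, and the class name, threshold and transaction amounts are all hypothetical, chosen only for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class RunningStats:
    """Mean and variance that can be updated one event at a time (Welford's algorithm)."""
    count: int = 0
    mean: float = 0.0
    m2: float = 0.0  # running sum of squared deviations from the mean

    def update(self, value: float) -> None:
        self.count += 1
        delta = value - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (value - self.mean)

    @property
    def std(self) -> float:
        return math.sqrt(self.m2 / self.count) if self.count > 1 else 0.0

    def is_unusual(self, value: float, threshold: float = 3.0) -> bool:
        """Flag values more than `threshold` standard deviations from the mean."""
        if self.std == 0.0:
            return False
        return abs(value - self.mean) / self.std > threshold


def fit(values) -> RunningStats:
    stats = RunningStats()
    for value in values:
        stats.update(value)
    return stats


# Hypothetical data: past transaction amounts, then a lasting change in behaviour.
history = [20.0, 22.0, 19.0, 21.0, 20.0, 18.0]
new_events = [60.0, 62.0, 58.0, 61.0, 59.0, 60.0, 61.0]

# Batch approach: fitted once on the snapshot, frozen until the next upload.
batch_model = fit(history)
batch_flags = sum(batch_model.is_unusual(v) for v in new_events)

# Streaming approach: score each event, then learn from it straight away.
stream_model = fit(history)
stream_flags = 0
for value in new_events:
    if stream_model.is_unusual(value):
        stream_flags += 1
    stream_model.update(value)

print(f"batch model flagged {batch_flags} events, streaming model flagged {stream_flags}")
```

In this toy setup the frozen model keeps flagging the new behaviour until its next refit, while the incrementally updated one adapts after the first surprising event. A real system would also need to decide which events are safe to learn from, so that genuine fraud does not teach the model that fraud is normal.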

Ultimately, the accuracy of real-time AI models falls short of what the market needs. This has, in turn, stalled the adoption of real-time AI for use cases that rely on real-time data for decision-making.

Layers of complexity

The complexity of designing streaming workflows has underpinned the challenge that financial services organisations face in implementing real-time AI applications. A specialist skillset is required to build use cases for data streaming. As a result, there are typically separate teams that focus on streaming and batch use cases, each writing in its own programming language. This has made the integration of batch and streaming workflows particularly challenging, as the teams are quite literally speaking different languages.

And as if that weren’t difficult enough, the introduction of a third workflow – generative AI, which needs real-time contextual insights to deliver value in an enterprise application – has made the situation even harder.

So now, most organisations are designing two or more different systems that are separate from each other and can’t perform incremental updates to the preliminary datasets. And until the challenge of integrating these data workflows is resolved, it will be impossible to seize the advantage of real-time AI systems for strategic decision-making, resource management, observability and monitoring, predictive maintenance, and anomaly detection.

Bridging the divide

To overcome the disparate nature of batch, streaming and LLM (large language model) workflows, new innovations now allow all three to run in a unified platform, bridging the divide between them and opening up new opportunities.

Being able to switch from batch to streaming in a click radically democratises the ability to design streaming workflows at scale and, in turn, makes it easier to put LLM pipelines into production. With batch and streaming data combined in the same workflow, real-time AI applications become a reality, as new streaming data can continuously train and update the model. A full batch data upload is no longer required. That means faster intelligence and greater accuracy, as well as secondary benefits such as reduced energy consumption, because data which doesn’t need to be updated isn’t constantly refreshed as part of batch uploads.
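As a rough illustration of one workflow serving both modes, the sketch below writes the processing logic once against any iterable of events and then feeds it first a finished list (batch) and then a generator standing in for a live feed (streaming). It is a plain-Python sketch under stated assumptions, not the API of any particular platform; the event shape, function names and account IDs are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, Tuple

Event = Tuple[str, float]  # (account_id, amount): a hypothetical event shape


def running_totals(events: Iterable[Event]) -> Iterator[Tuple[str, float]]:
    """Yield (account_id, new_total) after every event.

    Only the account touched by an event is recomputed; every other
    total is left untouched rather than refreshed.
    """
    totals: Dict[str, float] = defaultdict(float)
    for account_id, amount in events:
        totals[account_id] += amount
        yield account_id, totals[account_id]


def final_state(updates: Iterable[Tuple[str, float]]) -> Dict[str, float]:
    """Collapse a sequence of updates into the latest total per account."""
    state: Dict[str, float] = {}
    for account_id, total in updates:
        state[account_id] = total
    return state


def live_feed() -> Iterator[Event]:
    """Stand-in for an unbounded source such as a message queue."""
    yield ("acc-1", 50.0)
    yield ("acc-2", 20.0)
    yield ("acc-1", 25.0)


# Batch mode: the whole dataset is available up front.
batch_result = final_state(running_totals([("acc-1", 50.0), ("acc-2", 20.0), ("acc-1", 25.0)]))

# Streaming mode: the same function consumes events as they arrive.
stream_result = final_state(running_totals(live_feed()))

print(batch_result == stream_result)
```

Because the logic only depends on receiving events one at a time, the same code path handles historical and live data, and each event updates a single total instead of triggering a full recomputation.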

This will also undoubtedly have a dramatic effect on how real-time data is approached within an organisation, as will democratising who within the data team can design workflows for both batch and streaming. Python is emerging as the lingua franca of data processing, and Python code can then be translated into more efficient languages, like Rust. This will radically increase the number of data professionals who can work on projects that combine batch and streaming, as they’ll be able to code in the same language. In turn, this brings down one of the most common barriers organisations face in developing AI systems, driving greater innovation and creativity in use cases, as well as increasing both the speed and the number of data streaming projects an organisation can pursue.

Beckoning a new generation of real-time AI

There is a new paradigm of real-time AI applications which holds the promise of faster, smarter and more efficient processes for financial services organisations, delivering benefits at an operational level and to their customers in equal measure. Overcoming the challenge of static data uploads for Machine Learning and AI applications will enable organisations to radically scale their use of real-time data to improve the speed and accuracy of decision-making.

About the Author

Zuzanna Stamirowska is the CEO of Pathway.com – the fastest data processing engine on the market which makes real-time intelligence possible, enabling companies to power their LLMs, MLMs and enterprise data pipelines. She also authored the forecasting model for maritime trade published by the National Academy of Sciences of the USA.

 
