Business drivers such as increasing regulation, the risk of fraud, and consumer demand for tailored services are creating a fundamental shift in the way financial organizations are using their data assets. Rather than asking their data what happened yesterday, firms are asking their data what is happening now. The data that answers this question, however, is not only trapped in transactional systems not optimized for introspection, but also dispersed across many static and historical data stores.
Answering the question above is only possible if a data platform is always available, secure, and capable of:
- ingesting data at massive rates,
- scaling to retain that data, and
- providing concurrent and unencumbered access to the data.
A Shift in the Industry
Firms have already solved how to introspect historical data to discover medium and long-term patterns across their businesses. Whether leveraging a data warehouse or data lake, the business case is research, and the nettlesome operational issue is moving data from transactional applications to the right data elements in the warehouse or lake. Time elapses as the data is cleaned, conditioned, and relocated, and the once fresh data becomes stale. Old data is sufficient for historical analysis, but not for answering what is happening now.
Today, however, organizations are looking to leverage short-term trends in response to challenging business problems such as credit card fraud, real-time risk management, and improving customer experiences with timely interactions. Firms are now asking current time questions such as “what is happening now?” and “is what’s happening now different from recent experience?”
Firms continually evaluate the many technologies that might appear to help. These technologies range from sharded file system architectures to columnar databases to Spark, grid, virtualization, and a myriad of others.
First Challenge: Harnessing Transactional Data
A common obstacle is the traditional organizational structure, which separates transactional processing (IT) from analytics (data science). Each department has different goals. One division’s mission is to process orders and minimize downtime, while the other’s is to find new correlations within different sets of data.
Another obstacle is the technology itself. Transactional systems are tuned for throughput and capacity, and they often run on expensive equipment that has been modified for high performance and resiliency. Analytical systems, meanwhile, are tuned for easy exploration of the data. Data scientists request data from the transactional systems so they can analyze it in their own environment with analytical tools such as R and MATLAB.
The obstacles above introduce a costly and unnecessary delay that prevents analyzing data when it is fresh. The longer the delay, the more stale the data, and the less value it has to an organization.
Bigger Challenge: Joining Dynamic and Static Data
Even with the ability to introspect transactional data, there are larger challenges. What is missing is a panoramic view of all data assets. The ability to see all data across different data sets, whether dynamic or static, is critical to answering current time questions, which are almost never satisfied by the payload of each transaction alone. Other static or occasionally updated data is required to construct a useful answer. If a firm, for example, wants to ask how many recent purchases were made by a specific group of people for a particular type of product, it will need to access both the accounts and products data along with the current transactions.
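The purchase question above can be sketched as a three-way join. This is only an illustration under stated assumptions: the table names (accounts, products, transactions), their columns, the sample rows, and the helper functions are all hypothetical, and an in-memory SQLite database stands in for whatever platform actually holds the data.

```python
import sqlite3


def build_demo_db():
    """Create a tiny in-memory database with hypothetical reference and transaction data."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE accounts (account_id INTEGER PRIMARY KEY, segment TEXT);
        CREATE TABLE products (product_id INTEGER PRIMARY KEY, category TEXT);
        CREATE TABLE transactions (
            txn_id     INTEGER PRIMARY KEY,
            account_id INTEGER REFERENCES accounts(account_id),
            product_id INTEGER REFERENCES products(product_id),
            ts         INTEGER  -- seconds since an arbitrary epoch (assumption)
        );
    """)
    conn.executemany("INSERT INTO accounts VALUES (?, ?)",
                     [(1, "student"), (2, "student"), (3, "retiree")])
    conn.executemany("INSERT INTO products VALUES (?, ?)",
                     [(10, "credit_card"), (11, "mortgage")])
    conn.executemany("INSERT INTO transactions VALUES (?, ?, ?, ?)", [
        (100, 1, 10, 950),
        (101, 2, 10, 990),
        (102, 3, 10, 995),  # different customer segment
        (103, 1, 11, 998),  # different product category
        (104, 2, 10, 100),  # older than the "recent" cutoff used in the usage note
    ])
    return conn


def count_recent_purchases(conn, segment, category, since_ts):
    """Count transactions at or after since_ts for one customer segment and product category."""
    row = conn.execute("""
        SELECT COUNT(*)
        FROM transactions AS t
        JOIN accounts AS a ON a.account_id = t.account_id  -- static reference data
        JOIN products AS p ON p.product_id = t.product_id  -- static reference data
        WHERE a.segment = ? AND p.category = ? AND t.ts >= ?
    """, (segment, category, since_ts)).fetchone()
    return row[0]
```

Calling count_recent_purchases(build_demo_db(), "student", "credit_card", 900) counts only the two rows that match the segment, the category, and the time cutoff. The transaction payload alone carries neither the segment nor the category, which is why the reference tables have to be reachable from the same query.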
Similarly, understanding how what is happening now differs from recent experience requires both transactional and reference data. More interesting, though, is the reach into “recent experience,” i.e., a window of time measured in seconds, minutes, hours, or days, drawn from the accumulated transactional data. Anomaly detection relies on comparing accumulated real-time transactions with baselines created from longer-term trends. For example, it is advantageous to immediately know if a particular transaction between two specific parties is occurring more or less frequently than in the past.
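One way to picture the comparison between a recent window and a longer-term baseline is the minimal sketch below. It is not a description of any particular product's method: the class name, the tolerance factor, and the baseline dictionary are hypothetical, and the sketch assumes that timestamps arrive in non-decreasing order and that the baseline counts were computed elsewhere from historical data.

```python
from collections import deque


class PairFrequencyMonitor:
    """Compare a party pair's recent transaction count with a longer-term baseline."""

    def __init__(self, window_seconds, baseline_per_window, tolerance=2.0):
        self.window_seconds = window_seconds
        # Hypothetical baseline: (payer, payee) -> expected count per window,
        # assumed to be derived offline from historical data.
        self.baseline = baseline_per_window
        self.tolerance = tolerance
        self.recent = {}  # (payer, payee) -> deque of timestamps inside the window

    def observe(self, payer, payee, ts):
        """Record one transaction between two parties at time ts (seconds)."""
        self.recent.setdefault((payer, payee), deque()).append(ts)

    def status(self, payer, payee, now):
        """Return 'high', 'low', or 'normal' for the pair as of time now."""
        key = (payer, payee)
        timestamps = self.recent.get(key, deque())
        # Evict transactions that have fallen out of the "recent experience" window.
        while timestamps and timestamps[0] <= now - self.window_seconds:
            timestamps.popleft()
        count = len(timestamps)
        expected = self.baseline.get(key, 0.0)
        # A pair with no baseline is flagged 'high' as soon as it transacts at all.
        if count > expected * self.tolerance:
            return "high"
        if count < expected / self.tolerance:
            return "low"
        return "normal"
```

With a 60-second window, a baseline of two transactions per window, and the default tolerance of 2.0, a pair is reported as "high" once more than four transactions fall inside the window and as "low" once fewer than one does. The window length and tolerance are tuning choices, not values taken from the text.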
The Data Platform Approach: Unencumbered Access
The data platform approach provides a single environment to handle the concurrent workloads of high-performance transactions (ingestion) and high-performance analytics (queries). With a single platform combining data management, integration and analytics technologies, a firm can simultaneously process transactions and enable access to the newly arrived data. The real difference is “unencumbered access” to data, whether dynamic or static. This requires giving a query full, panoramic access to the data to answer a “current time” question fully.
“Unencumbered” means full join capabilities no matter the location of data. It requires direct access to the single copy of the product, account, and other reference data, without replication, without repurposing a database technology to assist in that replication, without the operational overhead to ensure replication is consistent, and without the cost associated with custom hardware over standard off-the-shelf hardware.
A join should be agnostic to data topology; it is immaterial whether data is sharded, co-sharded, non-co-sharded or non-sharded across (virtualized) commodity hardware. “Unencumbered” also means massively scalable data persistence. Data germane to “recent experience” remains accessible while “current” data is being ingested. The workload conflict (writes vs. reads) is resolved using the same “panoramic view” machinery, which in this scenario provides data from the ingestion tier to a compute or query tier.
A data platform gives the business user the means to answer “What is happening now?” and “Is what’s happening now different from recent experience?”
For more information on the InterSystems data platform and how it addresses these challenges, please visit https://financial.intersystems.com/.

























