Enterprise Strategy Group
InterSystems IRIS: High Performance Data Management Software for Concurrent Data Ingestion and Real-time Queries
Date: June 2020
Author: Kerry Dolan, Senior IT Validation Analyst
This report documents ESG’s validation of performance testing that compared concurrent data ingestion and real-time queries across several database management software products. The testing demonstrates the ability of the InterSystems IRIS data platform to ingest hundreds of millions of records while simultaneously executing millions of queries with microsecond performance, outperforming other traditional and in-memory products.
For many organizations, the ability to collect data and analyze it in real time is an essential task that drives revenue, improves visibility, informs strategy, and aids decision making. For example, applications focused on financial trading, IoT, fraud detection, and real-time personalization must ingest large amounts of data and analyze it immediately. The challenge is finding a database platform with sufficient horsepower to handle large-scale ingest and querying simultaneously without degrading performance. When ESG asked database and analytics professionals about technologies supporting data analytics, performance ranked among the most important capabilities.1
In-memory databases offer high performance but are expensive to scale and have hard memory limits that can affect reliability and cause restart delays. Traditional databases offer persistence and reliability but lack the high performance of in-memory databases. InterSystems IRIS can process both ingestion and query workloads simultaneously, with performance equal to or better than in-memory-only databases, without their limitations. InterSystems has published an open-source test to demonstrate this claim, which ESG is validating in this report.
The Solution: InterSystems IRIS
InterSystems IRIS is a data management software platform that was built for high-performance, multi-workload processing at scale. As a multi-model DBMS, it provides native support for relational, object-oriented, document, key-value, and hierarchical data objects; in addition, it enables consistent high performance for both transactional and analytic workloads simultaneously. While a complete product description is beyond the scope of this report, some key functionality is described below.
- The multidimensional data engine in InterSystems IRIS is an important contributor to its superior ingest performance. It enables efficient, compact storage with a rich data structure, which speeds data ingest, access, and updates while minimizing resource usage and disk consumption.
- Real-time analytics performance is achieved by using a transactional-bitmap indexing schema that enables InterSystems IRIS to process complex queries quickly, including on real-time data, without searching the entire database.
- InterSystems IRIS Enterprise Cache Protocol, an intelligent, distributed memory caching mechanism, enables the platform to execute sophisticated queries on very large data sets with high performance and reliability, including joins that access distributed data, without making multiple data copies.
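To illustrate why bitmap indexing lets a query avoid searching the entire database, the following is a minimal, generic sketch of a bitmap index in Python. It is not the InterSystems IRIS implementation, which this report does not describe; the `BitmapIndex` class, the `row_ids` helper, and the sample trade records are all hypothetical names and data chosen for illustration.

```python
from collections import defaultdict


class BitmapIndex:
    """Toy bitmap index: one integer bitmask per distinct column value.

    Illustrative only; not the InterSystems IRIS implementation.
    """

    def __init__(self):
        self.bitmaps = defaultdict(int)  # value -> bitmask of row ids

    def add(self, row_id, value):
        # Set bit `row_id` in the bitmap for `value` as the row is ingested.
        self.bitmaps[value] |= 1 << row_id

    def rows(self, value):
        # Return the bitmask for `value`, or 0 if the value was never seen.
        return self.bitmaps.get(value, 0)


def row_ids(bitmask):
    """Expand a bitmask into the sorted list of row ids it contains."""
    ids = []
    position = 0
    while bitmask:
        if bitmask & 1:
            ids.append(position)
        bitmask >>= 1
        position += 1
    return ids


# Hypothetical trade records: (symbol, side)
trades = [("AAPL", "buy"), ("MSFT", "sell"), ("AAPL", "sell"), ("AAPL", "buy")]
by_symbol = BitmapIndex()
by_side = BitmapIndex()
for row_id, (symbol, side) in enumerate(trades):
    by_symbol.add(row_id, symbol)
    by_side.add(row_id, side)

# WHERE symbol = 'AAPL' AND side = 'buy' becomes a bitwise AND of two bitmaps,
# so no individual row has to be inspected to evaluate the predicate.
matches = row_ids(by_symbol.rows("AAPL") & by_side.rows("buy"))
print(matches)  # [0, 3]
```

Because each newly ingested row only sets one bit per indexed column, the index can be kept current as data arrives, which is the property that makes this style of index attractive for queries on real-time data.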
Other features include:
- In-memory performance with built-in data persistence in a format optimized for rapid data access.
- Built-in distributed caching layer with automatic and guaranteed consistency.
- Full SQL support.
- Deployment on-premises, in all major public clouds, and in hybrid environments, with a single API.
ESG validated performance benefits of InterSystems IRIS using the company’s publicly available, customizable, open-source Speed Test benchmark kit.2 The benchmark was designed to measure concurrent real-time ingest and query performance. This is a common use case that financial services, fraud detection, IoT, and other applications face. For example, while financial services firms are executing thousands of trades, thousands of users are querying for order status, risk management, etc. Similarly, IoT sensor data comes in fast from the field and applications must perform immediate anomaly detection and other real-time calculations. When a database is stressed in this way, having to simultaneously ingest data and execute analytic queries can slow performance.
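The shape of such a concurrent workload can be sketched in a few lines of Python. This is a simplified, hypothetical harness and not the Speed Test kit itself: it uses an in-memory dictionary in place of a database, and the function name `run_speed_test`, its parameters, and the record layout are assumptions made for illustration.

```python
import threading
import time


def run_speed_test(duration_s=0.5, ingest_threads=2, query_threads=2):
    """Ingest into and query a shared in-memory store at the same time.

    Returns (records_ingested, queries_run, mean_query_latency_s).
    A dict stands in for the database; this is an assumption of the sketch.
    """
    store = {}
    lock = threading.Lock()
    stop = threading.Event()
    ingested = [0] * ingest_threads
    queried = [0] * query_threads
    query_time = [0.0] * query_threads

    def ingest(worker):
        key = worker
        while not stop.is_set():
            with lock:
                store[key] = {"id": key, "value": key * 2}
            ingested[worker] += 1
            key += ingest_threads  # each worker writes its own key range

    def query(worker):
        key = 0
        while not stop.is_set():
            start = time.perf_counter()
            with lock:
                # A lookup counts as a query whether or not the record
                # has been ingested yet.
                store.get(key)
            query_time[worker] += time.perf_counter() - start
            queried[worker] += 1
            key += 1

    threads = [threading.Thread(target=ingest, args=(i,)) for i in range(ingest_threads)]
    threads += [threading.Thread(target=query, args=(i,)) for i in range(query_threads)]
    for t in threads:
        t.start()
    time.sleep(duration_s)  # let both workloads run side by side
    stop.set()
    for t in threads:
        t.join()

    total_queries = sum(queried)
    mean_latency = sum(query_time) / max(1, total_queries)
    return sum(ingested), total_queries, mean_latency


if __name__ == "__main__":
    records, queries, latency = run_speed_test()
    print(f"ingested={records} queries={queries} mean_latency_s={latency:.9f}")
```

The point of running both workloads against the same store is that writers and readers contend for the same resources, so the measured query latency reflects the cost of ingesting at the same time rather than query speed in isolation.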
1 - Source: ESG Master Survey Results, The State of Data Analytics, August 2019.
2 - https://github.com/intersystems-community/irisdemo-demo-htap