In today’s business world, the need for speed is pervasive.
Customers expect instant access, real-time visibility, personalized point-of-sale interactions and same-day delivery. They have no tolerance for waiting, and zero patience for entering and re-entering information that the business already has.
It’s no wonder that high-performance in-memory databases and caches have become popular with businesses in many industries. Shedding the extra weight of persisting in-process state to disk, in-memory databases are like fast, agile motorcycles. In comparison, many traditional disk-based databases are akin to sturdy, heavy-duty trucks: they process workloads with reliability and scale, but cannot deliver comparable levels of performance.
But the high performance of in-memory databases comes at a cost.
Higher total cost of ownership. An in-memory database keeps its entire working data set in memory rather than on disk. Organizations must therefore buy and provision enough memory to hold all of the data, plus headroom to ingest new data, process analytic workloads and return query results.
Hard scalability limits. Data and queries, particularly large analytic queries, can consume all available memory, especially in applications that experience rapid growth or unexpected volatility. When memory is exhausted, the application stops and servers can fail.
Restart delays. With in-memory databases, the contents of memory are periodically written to checkpoint files, and any subsequent changes are written to write-ahead log files. When memory is exhausted and outages occur, as they are prone to do, rebuilding the in-process state from these files can take hours. For mission-critical applications, downtime can be devastating to the business. Keeping redundant copies of the data in production is an option, but it adds further to the total cost of ownership.
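To make the restart cost concrete, here is a minimal sketch of the checkpoint-plus-log recovery described above. It is a toy illustration, not any particular product's implementation: the class TinyStore, its file names and the JSON format are all assumptions made for this example.

```python
import json
import os
import tempfile


class TinyStore:
    """Toy in-memory key-value store (hypothetical) with a checkpoint and a write-ahead log."""

    def __init__(self, directory):
        self.checkpoint_path = os.path.join(directory, "checkpoint.json")
        self.wal_path = os.path.join(directory, "wal.log")
        self.data = {}

    def put(self, key, value):
        # Append the change to the log first, then apply it in memory.
        with open(self.wal_path, "a", encoding="utf-8") as wal:
            wal.write(json.dumps({"key": key, "value": value}) + "\n")
        self.data[key] = value

    def checkpoint(self):
        # Write the full in-memory state, then truncate the log.
        tmp_path = self.checkpoint_path + ".tmp"
        with open(tmp_path, "w", encoding="utf-8") as f:
            json.dump(self.data, f)
        os.replace(tmp_path, self.checkpoint_path)
        open(self.wal_path, "w", encoding="utf-8").close()

    def recover(self):
        # Rebuild state: load the last checkpoint, then replay every logged write.
        self.data = {}
        if os.path.exists(self.checkpoint_path):
            with open(self.checkpoint_path, encoding="utf-8") as f:
                self.data = json.load(f)
        replayed = 0
        if os.path.exists(self.wal_path):
            with open(self.wal_path, encoding="utf-8") as wal:
                for line in wal:
                    if line.strip():
                        record = json.loads(line)
                        self.data[record["key"]] = record["value"]
                        replayed += 1
        return replayed


def demo():
    with tempfile.TemporaryDirectory() as directory:
        store = TinyStore(directory)
        store.put("a", 1)
        store.checkpoint()
        store.put("b", 2)
        store.put("a", 3)
        # Simulate a restart: a new process starts with empty memory.
        restarted = TinyStore(directory)
        replayed = restarted.recover()
        return restarted.data, replayed
```

Nothing can be served until recover() has read the whole checkpoint and replayed every log record, so in this sketch restart time grows with both the size of the data set and the number of writes since the last checkpoint.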
What about In-Memory Caches?
A similar approach places an in-memory cache between a persistent database and the application servers to keep recently used data available for rapid access. This approach shares all of the weaknesses of in-memory databases and adds architectural and application complexity, increased network latency, and additional CPU costs from converting data between the structures used in the cache and those used in the application layer.
In fact, engineers from Google and Stanford University recently published a research paper reviewing this architectural approach. In it, they write that “the time of the [remote, in-memory key-value store] has come and gone: their domain independent APIs (e.g., PUT/GET) push complexity back to the application, leading to extra (un)marshaling overheads and network hops.”
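The overheads of a remote cache are easier to see in code. The sketch below shows one common way an application reads through such a cache. Everything in it is a stand-in invented for illustration: RemoteCacheStub and DatabaseStub are plain in-process objects, and the counters simply mark where a real deployment would pay for a network hop or a database query.

```python
import json


class RemoteCacheStub:
    """Hypothetical stand-in for a remote key-value cache: it stores and returns bytes only."""

    def __init__(self):
        self._blobs = {}
        self.round_trips = 0

    def get(self, key):
        self.round_trips += 1  # in a real system, one network hop
        return self._blobs.get(key)

    def put(self, key, blob):
        self.round_trips += 1  # in a real system, one network hop
        self._blobs[key] = blob


class DatabaseStub:
    """Hypothetical stand-in for the persistent database."""

    def __init__(self, rows):
        self._rows = rows
        self.queries = 0

    def load(self, key):
        self.queries += 1
        return self._rows[key]


def read_customer(key, cache, db):
    blob = cache.get(key)
    if blob is not None:
        # Cache hit: the application must unmarshal the bytes into its own structures.
        return json.loads(blob.decode("utf-8"))
    # Cache miss: query the database, then marshal the row and store it in the cache.
    row = db.load(key)
    cache.put(key, json.dumps(row).encode("utf-8"))
    return row


def demo():
    db = DatabaseStub({"c1": {"name": "Ada", "orders": [1, 2]}})
    cache = RemoteCacheStub()
    first = read_customer("c1", cache, db)   # miss: cache get, database query, cache put
    second = read_customer("c1", cache, db)  # hit: cache get only
    return first, second, db.queries, cache.round_trips
```

Even in this small example, the application code owns the marshaling, the miss handling and every round trip to the cache. Keeping cached entries consistent when the database changes would be yet another responsibility for the application, and the sketch does not attempt it.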
A modern data platform technology provides the best of both worlds: all of the performance benefits of in-memory technologies without any of their limitations.