Scanning the Garbage: Looking for Meaning in Mounds of Data

We all have different ways of interpreting the world around us. I tend to see things as visual metaphors.

While waiting in an airport security line recently, I noticed a maintenance worker pushing a large wheeled garbage can. He carefully lifted out a plastic trash bag and placed it on the TSA conveyor belt so it could be X-rayed. I didn’t know whether to laugh hysterically or weep at a world that has to scan its garbage to keep people safe!

Later, I thought about this some more. A friend of mine collects antique glass bottles. She spends a fair amount of time digging for these treasures in the garbage pits of long-abandoned farmhouses. How, I asked myself, is this really different from the TSA garbage screening? Both require a highly trained eye and disciplined techniques. Both are looking for items of interest in the by-products of everyday life. One is looking for something of positive value, the other trying to avert tragedy, but both require filtering out the clutter and identifying what matters.

All of which led me to patient health records.

Wait – what?

No, health information is not garbage; it is an incredibly important artifact of the care process. On the other hand, the way we have captured, managed and stored patient records historically has made it virtually impossible to sort through the information to find the relevant bits and pieces. And there is high value buried in that information that will help create positive outcomes and avoid negative ones.

Among the challenges to doing so is the sheer volume of health information – whether we are talking about linear feet of shelving for paper records or petabytes of electronic storage. Health information is also highly heterogeneous. It may be handwritten, typed on paper, scanned, or created digitally. It includes images, payment information, diagnoses, videos, treatment records, doctors’ notes, medication administrations, and adverse reactions – almost anything you can imagine.

Of course, there is also geographic dispersion; a single patient’s record may be spread across many locales over the course of a lifetime. And then there are the inconsistencies and idiosyncrasies introduced by individual record-keeping preferences, changing data standards, different information systems, and so on.

When you think about it, the wonder is that we find as much information as we do in spite of it all.

I recently came across a development project in Mongolia called TG2G, Turning Garbage into Gold. The phrase suggests other things to me when I think about mounds of data.

When business information became almost entirely digital, a new and very expensive legal business was spawned to manually search electronic archives for the information relevant to litigation. Very quickly, that young, high-touch business was itself revolutionized by eDiscovery software and machine learning technology that pore through terabytes of information, locate the relevant records, and suggest others that might be equally relevant. Essentially, this entire industry sorts through the detritus of global business and government to find litigators’ gold.

So how is this relevant to health information? It’s all in how the data is assembled and prepared. Underlying all the “big data” predictive analytics of eDiscovery software is a process of bringing together heterogeneous, often unstructured, data that is spread across time and space in all kinds of information silos. The preparation requires aggregating, indexing, de-duplicating and normalizing disparate data to create a connected, comprehensive and credible information foundation on which to build. It is the prep work that makes the data usable.
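To make that prep work a little more concrete, here is a minimal Python sketch of the four steps: aggregating, normalizing, de-duplicating and indexing. Everything in it is made up for illustration: the two sources (`CLINIC_A`, `HOSPITAL_B`), their field names, and the sample patients are assumptions, and the matching rule (exact name plus date of birth) is far cruder than the record-matching a real exchange would need.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical records from two silos with inconsistent field names and formats.
CLINIC_A = [
    {"patient": "Doe, Jane", "dob": "1980-03-05", "event": "Flu vaccine"},
    {"patient": "Doe, Jane", "dob": "1980-03-05", "event": "Flu vaccine"},  # exact duplicate
]
HOSPITAL_B = [
    {"name": "JANE DOE", "birth_date": "03/05/1980", "note": "flu vaccine "},
    {"name": "John Roe", "birth_date": "11/20/1975", "note": "Chest X-ray"},
]


def normalize_name(raw):
    """Return 'first last' in lowercase, accepting 'Last, First' or 'First Last'."""
    if "," in raw:
        last, first = [part.strip() for part in raw.split(",", 1)]
        raw = f"{first} {last}"
    return " ".join(raw.lower().split())


def normalize_date(raw):
    """Return an ISO date string, trying each known input format in turn."""
    for fmt in ("%Y-%m-%d", "%m/%d/%Y"):
        try:
            return datetime.strptime(raw, fmt).date().isoformat()
        except ValueError:
            continue
    raise ValueError(f"unrecognized date format: {raw!r}")


def normalize(record, name_key, dob_key, event_key, source):
    """Map one source-specific record onto a shared set of field names."""
    return {
        "name": normalize_name(record[name_key]),
        "dob": normalize_date(record[dob_key]),
        "event": " ".join(record[event_key].lower().split()),
        "source": source,
    }


def prepare(sources):
    """Aggregate, normalize, de-duplicate, and index records by (name, dob)."""
    index = defaultdict(list)
    seen = set()
    for source, (records, keys) in sources.items():
        for record in records:
            clean = normalize(record, *keys, source)
            fingerprint = (clean["name"], clean["dob"], clean["event"])
            if fingerprint in seen:  # same patient, same event: keep the first copy
                continue
            seen.add(fingerprint)
            index[(clean["name"], clean["dob"])].append(clean)
    return dict(index)


# Each source is paired with its own (name, date of birth, event) field names.
SOURCES = {
    "clinic_a": (CLINIC_A, ("patient", "dob", "event")),
    "hospital_b": (HOSPITAL_B, ("name", "birth_date", "note")),
}
index = prepare(SOURCES)
```

In this toy data set, four raw records collapse to two patients with one event each, because Jane Doe's flu vaccine shows up three times in three slightly different forms. The point is only that none of the downstream analytics can work until something like this has happened first.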

Sounds to me a lot like health information exchanges – both public HIEs and private, internal informatics platforms. These serve up actionable information to power real-time alerts for care transitions, population health management tools, management dashboards, retrospective quality measures – in short, all the analytics that help us understand and improve the quality and efficiency of our care processes.

My colleague, Gary Christensen, wrote recently about the kinds of innovations that became possible once the data prep work had been done by the Rhode Island health information exchange. His metaphor, cream rising to the top, was more appealing than mine – but whatever the image, there’s a lot of good stuff hidden away, just waiting to be uncovered!