
Garbage In, Gospel Out?

[Image: a red, rusty garbage can in a field of overgrown grass]

In antiquity, an oracle was a means for people to receive wise and insightful counsel that was divinely inspired.

From Assyria to Egypt to their more famous Greek counterparts, oracles were the medium by which the gods spoke to people. For people of all civilizations interested in knowing the future or making the right decision, the oracle was a way to know the unknown.

In our modern culture, computers and technology have become the new oracles (to the point that a large software company adopted the very name). And with ever-increasing amounts of data, people want insight into that data to predict the future. Now, more than ever, they want their version of the Oracle of Delphi, a priestess who could tell them what would happen in the future.

Technology companies have rushed to fulfill this ancient human need. Today, the new oracles are Artificial Intelligence, Machine Learning and Deep Learning algorithms.

And nowhere is AI more attractive than in healthcare; its potential for treatment and diagnosis is nearly limitless. In many ways, that makes sense: healthcare is an incredibly broad field, full of highly complex subject matter that is changing rapidly with new technological advances. So, like the ancient Greeks seeking wisdom at Delphi, we hope AI will help us see the future and allow us to make the right decisions on care.

But lost in the AI and machine learning healthcare gold rush is the immutable law of computing: garbage in, garbage out. That is, computers are only as good as the data they are given. Train AI on biased data, and you get biased outputs.

Put very simply, you cannot perform deep learning or machine learning, or leverage artificial intelligence, on data that is either non-existent or incorrect.

A perfect example of errors from non-existent data[i]: a 2015 study of a machine-learning technique used to predict which hospital patients would develop pneumonia complications found that it worked well in most situations. But the algorithm made one serious error: it instructed clinicians to send asthma patients home, even though they are in a high-risk category. The reason was that hospital protocol was to automatically send patients with asthma to intensive care; as a result, these patients rarely appeared in the ‘required further care’ records on which the system was trained.
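To see how this kind of failure arises, here is a minimal sketch with entirely hypothetical data (the study itself used a far more sophisticated model). Because the ICU protocol shielded asthma patients from bad outcomes, a naive frequency-based risk model learns that asthma means low risk:

```python
# Minimal sketch of the asthma/pneumonia pitfall, using made-up data.
# Because hospital protocol sent asthma patients straight to the ICU,
# the training records almost never show asthma patients with bad
# outcomes -- so a naive risk model learns "asthma = low risk".

from collections import defaultdict

# Each record: (has_asthma, had_serious_complications).
# ICU care hid the true risk for the asthma group.
training_records = [
    (False, True), (False, True), (False, False), (False, False),
    (False, True), (False, False), (False, True), (False, False),
    # Asthma patients did well *because* they were sent to the ICU:
    (True, False), (True, False), (True, False), (True, False),
]

def fit_risk_by_group(records):
    """Estimate P(complications | asthma status) by frequency counts."""
    totals, bad_outcomes = defaultdict(int), defaultdict(int)
    for has_asthma, complications in records:
        totals[has_asthma] += 1
        bad_outcomes[has_asthma] += complications
    return {group: bad_outcomes[group] / totals[group] for group in totals}

risk = fit_risk_by_group(training_records)
print(risk[False])  # 0.5 -- non-asthma patients look moderately risky
print(risk[True])   # 0.0 -- the model would send asthma patients home
```

The model is not broken; it is faithfully summarizing data that no longer reflects reality once the ICU protocol is removed from the loop. That is the trap: the missing records are invisible to the algorithm.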

To quote Pedro Domingos in The Master Algorithm: "People worry that computers will get too smart and take over the world, but the real problem is that they’re too stupid and they’ve already taken over the world."

For many organizations, the hope of easy answers creates the most severe consequence we have seen with early AI and machine learning: "garbage in, gospel out." As many articles have pointed out, the lesson from the front lines of today’s cognitive computing and deep learning initiatives is that successes are proving elusive.[ii] That is, we want so badly to believe in the promise of these new technology oracles that we assume merely using the techniques will lead to insight, better care, and improved outcomes.

Unfortunately, that is not, and never has been, the case.

Charles Babbage, the “father of the computer,” said in his 1864 book Passages from the Life of a Philosopher, “On two occasions I have been asked, ‘Pray, Mr. Babbage, if you put into the machine wrong figures, will the right answers come out?’... I am not able rightly to apprehend the kind of confusion of ideas that could provoke such a question.”

So what is a thoughtful organization to do in order to take advantage of AI and machine learning and actually improve healthcare?

My advice for organizations looking to leverage these new technologies is simple.

  1. Don’t Create “Strategy by Press Release”

Outsourcing data strategy or hoping AI can magically solve complex business challenges with unproven technology and without a plan is a fool’s errand.

It is said that the two happiest days of a boat owner’s life are the day they buy the boat and the day they sell it. The same can be true of an AI strategy by press release. There is the first press release, announcing the momentous breakthroughs and problems to be solved. Then there is the second press release, which may never get sent, announcing the quiet shutdown of the unsuccessful project.

No amount of marketing hype can stand in for solid strategy and the hard work of data science. This behavior also ties your AI strategy to a particular vendor: when you announce you’re doing big things with a particular AI technology, you tend to get locked in to a single solution. The right way to view an AI system is as another piece of information technology infrastructure, one that should be modular so components can be updated easily. Likewise, the right data strategy needs to be in place to support the aggregation and normalization of healthcare data from various systems, enabling the building, testing, and deployment of machine-learning algorithms across the organization. This approach allows organizations to take advantage of industry innovations while reducing the risk of machine-learning-system obsolescence and avoiding the cost of custom integrations.

If you call it a moonshot from the outset, are you doing anything more than preemptively admitting failure?

  2. Treat AI technologies as the pupil and not the master

At their core, AI and machine learning technologies replicate human cognition and learning capabilities, only faster. Therefore, if you don’t have a sound learning strategy in place, using AI or machine learning won’t help. It sounds simple, but it’s often overlooked. The other key point is that we should treat these technologies as willing pupils and help them grow, not as prescient, all-knowing oracles of business direction.

Cris Ross, the chief information officer of Mayo Clinic, described the present state of AI this way: “Artificial intelligence is still pretty dumb, and I don’t mean that in a really derogatory way… The best artificial intelligence today is still driven entirely by so-called semantic models, which is understanding language and the relationship of words to each other and how they build up. So the only way that these things can work is by giving them mountains of data to plow through to try and get to statistically meaningful connections, which then can be leveraged to gain some other understanding. So, this is like a 2-year-old child just learning to speak and to walk and how they interact with the world. When I put my hand on the stove, that’s not a good outcome. It’s not something immediately clear to a 2-year-old child.”

These technologies won’t cure cancer, solve world hunger, or deliver peace on earth. What they can do is make humans more efficient at processing and analyzing the right data. But in order to help, AI needs to be trained, so treat these systems like the neophytes they are and establish the right conditions for them to learn.[iii]

  3. Plan your data strategy – there is no substitute for good data

Lastly, there is no substitute for good data. We all know the immutable law of computing, GIGO: garbage in, garbage out. AI does not provide a shortcut to magical outputs from bad data. As John Bruno at Forrester recently wrote about the implications of Salesforce’s new AI, Einstein, “The future analytics-driven sales processes is bright, but the path ahead is not without its challenges. Current and potential Salesforce customers should be mindful that intelligent recommendations require a large volume of quality data. If poor data goes in, poor recommendations will come out. Cleansing data and iterating the fine-tuning of recommendations will be vital to long-term success.”[iv]
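One practical way to heed GIGO is to gate every training pipeline behind a data-quality check. The sketch below is a hypothetical illustration, not any vendor’s API: the field names and sample records are invented, and a real cleansing step would check far more than missing values.

```python
# Hypothetical sketch: a minimal data-quality gate run before any
# model training, in the spirit of "garbage in, garbage out".
# Field names and sample records are invented for illustration.

def quality_report(records, required_fields):
    """Count records unusable for training: any required field
    missing (None) or empty."""
    unusable = 0
    for rec in records:
        if any(rec.get(field) in (None, "") for field in required_fields):
            unusable += 1
    return {
        "total": len(records),
        "unusable": unusable,
        "usable_fraction": (len(records) - unusable) / len(records),
    }

patients = [
    {"age": 64, "diagnosis": "pneumonia", "outcome": "recovered"},
    {"age": None, "diagnosis": "pneumonia", "outcome": "recovered"},
    {"age": 58, "diagnosis": "", "outcome": "readmitted"},
]

report = quality_report(patients, ["age", "diagnosis", "outcome"])
print(report)  # 2 of the 3 sample records fail the gate
```

A report like this is cheap to compute and turns "our data is probably fine" into a number a team can track and improve before any model ever sees the data.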

If you have ambitions to leverage any of the current crop of technologies, know that AI is heavily dependent on mammoth amounts of data. That means the technology is applicable only where sufficiently deep and rich data sets exist, with narrow enough variation.

Data paves the way for AI, and, to ever reap the benefits of AI and machine learning, we have to establish a healthcare data strategy. In healthcare, that means moving beyond your electronic health record(s) and your data warehouse(s). To ensure you have the right underpinnings for any AI or machine learning effort, you need a healthcare data strategy, and a way to actually manage ALL your data.

It’s the only way to make sure your moonshot will even get to the launchpad.

If you’re interested in discussing data strategies for AI in more detail, or just want to offer an opposing view, I will be at BioIT World in Boston May 23-25 and will be happy to talk.


[i] Caruana, R., et al. "Intelligible Models for Healthcare: Predicting Pneumonia Risk and Hospital 30-day Readmission." Proc. 21st ACM SIGKDD Int. Conf. on Knowledge Discovery and Data Mining, 1721–1730 (ACM, 2015).

[ii] Davenport, Thomas H. "Lessons from the Cognitive Front Lines: Early Adopters of IBM's Watson." The Wall Street Journal. Dow Jones & Company, 03 Dec. 2015. Web. 15 May 2017.

[iii] Parmar, Arundhati. "AI Is 'Still Pretty Dumb' and like a '2-Year-Old'." MedCity News, 08 Mar. 2017. Web. 15 May 2017.

[iv] Bruno, John. "Can Salesforce Really Prescribe an End-to-End Sales Process?" Forrester Blogs, n.d. Web. 15 May 2017.