EHR v Covid-19. Leading Indicators: work-in-progress, VERSION 2

From nounproject.com. Leading indicators may seem like astrology and tarot cards. BUT there is science too!

Thanks to those of you who caught my non-displaying graph images. I’m reposting after converting my original PNGs to JPGs. Please let me know if you can see these and follow the reasoning below! (edited 6/15, CTL)

—–

Thanks to Brendan Drew, one of our data scientists, who is diving into the analysis of Leading Indicators, for the graphs and reasoning below. If I can twist his arm for more graphs, will pass them along.

If you recall, I discussed this recently: the idea that our future is uncertain. Even though we have survived the first wave of the Covid-19 pandemic, we are concerned about possible future waves. How might we prepare?

If you don’t know this about me already, I find “making the sausage” in informatics and data science fascinating. Here are some intermediate steps we are taking beyond my “data dilettante” days as we search for signal in the noise.

These are all new COVID-19 codes. First, note that the ORANGE line, R68.89, shows up WAY before March. It turns out this code is not only “suspected Covid-19”; it was also “Other Symptoms and Signs” previously in the ICD10 dictionary. So, that is a terrible signal. Then, the RED line, Z20.828 “Close exposure to COVID-19”, is also “Exposure to influenza”. Hmm. Then, the BLUE line, B34.2 “Coronavirus Infection”, is also “Coronavirus, unspecified.” Also hmm. Only the GREEN line, U07.1 “Coronavirus identified”, is highly specific for COVID-19 in the graph.

So, how do we make sense of this?

First, we take ONLY hospital patient codes for CONFIRMED (BLUE) versus SUSPECTED Covid (ORANGE). The BLUE CONFIRMED line shows two peaks, whereas the ORANGE line shows no real signal at all. GREEN is adjusted for Market Share based on 2019 data for that zip code (we are trying to localize prediction to the zip code level).
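For the curious, here is one way a market share adjustment could look in code. This is only a sketch under a loud assumption: that the adjustment divides our observed counts by the fraction of a zip code's hospital patients that we saw in 2019. The function name, the 0.25 share, and the counts are all made up for illustration; the actual adjustment behind the GREEN line may well differ.

```python
def adjust_for_market_share(observed_counts, market_share):
    """Scale one health system's daily counts up to an estimated
    community-wide count for a zip code.

    ASSUMPTION: the system sees a fixed fraction (market_share) of that
    zip code's hospital patients, e.g. estimated from 2019 data.
    """
    if not 0 < market_share <= 1:
        raise ValueError("market_share must be in (0, 1]")
    return [count / market_share for count in observed_counts]


# Hypothetical numbers, for illustration only.
observed = [2, 3, 5, 8]
estimated = adjust_for_market_share(observed, market_share=0.25)
print(estimated)
```

The smaller our share in a zip code, the more the estimate gets inflated, and the more any noise in our own counts gets inflated along with it.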

Now, we compare zip codes. The blue line is 80011, Aurora, near University of Colorado Hospital, a relative hot spot in the Denver Metro region. The orange line is 80634, the hot spot near Greeley hospital. We see a temporal difference: Greeley’s onset and peak come earlier than Aurora’s. Interesting.

Here is where it gets tantalizing, and we have to hold back our excitement: Pair up the outpatient symptom data with the inpatient hospitalization rate for Confirmed Covid. Here it is for Aurora, x-axis lined up by date:

Those of us who cannot contain our excitement will see a visible rise in RED (outpatient symptoms suspicious for COVID, like fever, cough, shortness of breath) in the 80011 zip code about 2 weeks BEFORE the corresponding rise in COVID-19 cases at University of Colorado Hospital in Aurora (also 80011). We WIN! Right?
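Rather than eyeballing the "about 2 weeks," one could ask the data for the lead time. Here is a minimal sketch, not our actual analysis: slide the symptom series later by 0 to 28 days and keep the shift that correlates best with the hospitalization series. The function names and the synthetic series are hypothetical.

```python
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is flat)."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / (vx * vy) ** 0.5


def best_lead_days(leading, lagging, max_lag=28):
    """Return (lag, r): the number of days by which `leading` must be
    shifted later to correlate best with `lagging`."""
    best_lag, best_r = 0, -1.0
    for lag in range(max_lag + 1):
        xs = leading[: len(leading) - lag]  # symptoms on day d
        ys = lagging[lag:]                  # admissions on day d + lag
        n = min(len(xs), len(ys))
        if n < 3:
            break
        r = pearson(xs[:n], ys[:n])
        if r > best_r:
            best_lag, best_r = lag, r
    return best_lag, best_r


# Synthetic example: one symptom bump, and the same bump 14 days later.
symptoms = [max(0, 10 - abs(day - 30)) for day in range(90)]
admissions = [0] * 14 + symptoms[:-14]
lag, r = best_lead_days(symptoms, admissions)
print(lag, round(r, 3))
```

A caution in the same spirit as the rest of this post: with a single peak in each series, a high correlation at some lag is easy to find and proves little. It takes several waves, or several zip codes, before a number like this deserves trust.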

Also, here’s the corresponding graph for Greeley:

This is a bit messier: what is that symptom peak in February? There is no corresponding COVID hospitalization peak in Feb/Mar. BUT, the symptom peak in mid March DOES correspond to a rise and peak in late March, and all of April.

My theory: mid February was probably Influenza A, and we did NOT track hospitalizations on our graph for that, AND the COVID confirmed codes did not get implemented until mid March, and maybe NOT attached in retrospect to patients who MIGHT have had COVID, but were admitted BEFORE those codes went into effect. This is harder than it looks!

Are you looking for a final answer? SORRY! We are still cranking away at this. Even though we humans have frontal lobes that CANNOT WAIT to see patterns (even where there is no pattern!), we have to resist that urge. AND, how do you teach an algorithm (even if there IS a pattern here) to tell us: YES, you should pay attention to THIS rise in the data, but THAT ONE is just random noise?

For example, imagine the 80011 graph prints out one day at a time, moving to the right. At what point would you tell the algorithm to alert us: YES, it is TIME TO BRING IN MORE DOCS AND STAFF FOR THE NEXT SURGE?

Would it be March 15, when there is an uptick? But there are lots of upticks just like that. March 22, a week later, when the line has DOUBLED from the average of 0.0007 to 0.0014?
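To make the "one day at a time" thought experiment concrete, here is a sketch of one possible alert rule: fire when the average of the last 7 days is at least double the average of the 28 days before that. The window lengths, the 2x ratio, and the function name are my assumptions for illustration, not a rule we have adopted.

```python
from statistics import mean


def first_alert_day(daily_rates, baseline_days=28, recent_days=7, ratio=2.0):
    """Replay the series one day at a time. Return the index of the first
    day on which the mean of the last `recent_days` values is at least
    `ratio` times the mean of the `baseline_days` values before them,
    or None if that never happens."""
    window = baseline_days + recent_days
    for today in range(window, len(daily_rates) + 1):
        recent = daily_rates[today - recent_days:today]
        baseline = daily_rates[today - window:today - recent_days]
        base = mean(baseline)
        if base > 0 and mean(recent) >= ratio * base:
            return today - 1
    return None


# Synthetic series: flat at 0.0007, then a step up to 0.0014 on day 35.
rates = [0.0007] * 35 + [0.0014] * 10
print(first_alert_day(rates))

# A single one-day uptick of the same size never triggers the rule.
blip = [0.0007] * 35 + [0.0014] + [0.0007] * 9
print(first_alert_day(blip))
```

The trade-off shows up right away. Averaging over 7 days ignores the one-day upticks, but it also means a true doubling is flagged only about a week after it starts. Shorten the window and you buy back lead time at the cost of false alarms.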

AND, worse yet, UCHealth is only one of 5 health systems in Metro Denver and across the state of Colorado. Will cases come to US or to other health systems? What will the peak be? Will it be a tiny peak? (Hey, CT, why did you call all of us in here for these dozen patients?) Will it be a HUGE peak? (Hey, CT, you didn’t raise enough of an alarm, there still aren’t enough of us.)

Finally, signal to noise MIGHT be easier for the summer months when Influenza is done, but what about the fall when Influenza B and many other viruses are back in action? What about seasonal allergies during spring and summer that might kick off cough and shortness of breath?

CMIO’s take? Figuring out Leading Indicators is HARD. If YOU have this figured out, let us know. We’re still working on it. But the math and the figuring-it-out is pretty fascinating in the meantime.

Author: CT Lin

CMIO, UCHealth (Colorado); Professor, University of Colorado School of Medicine
