Friday, April 10, 2020

COVID-19: It's Now Time for Health IT Vendors to Traffic in Patient Data

In numerous posts at this blog, I've brought up the issue of hospitals and health IT sellers extracting and exchanging or selling (ostensibly) anonymized clinical data from their EMR systems.  The buyers are varied, ranging from pharma and PBMs to academic researchers and government, and likely many others.

This practice is not new.  For example, see my Oct. 7, 2009 post "Health IT Vendors Trafficking in Patient Data?".

An example of EMR vendor (Cerner) data sales of "anonymous, HIPAA-compliant, EHR-derived data" for analysis. 
Cerner and EPIC are among the largest enterprise EMR sellers in the world.

Also see my November 12, 2019 post "Google’s ‘Project Nightingale’ Secretly Gathers Personal Health Data on Millions of Americans - Time to Refuse Use Of EMR's In Your Healthcare?".  In that post I cited an article on Google's efforts in this domain:

Google’s ‘Project Nightingale’ Secretly Gathers Personal Health Data on Millions of Americans

November 12, 2019

Google has been working with one of the largest healthcare systems in the U.S. to collect and analyze the personal health information of millions of citizens across 21 states, The Wall Street Journal reports.  The Tech giant reportedly teamed up with St. Louis-based Ascension, the largest non-profit health system in the country, last year, and the data sharing has accelerated since summer.

Code-named Nightingale, the project saw both companies collect personal data from patients, which included lab results, doctor diagnoses, and hospitalization records, as well as patient names and dates of birth.
Google said it plans to use the data to create new software that will improve patient care and suggest changes to their care.

More recently, a court review of a noncompete clause-related lawsuit, "FLATIRON HEALTH, INC. v. Carson", which I received via a Google alert I keep active for EHR-related litigation, describes the market for medical data in more detail.

For instance:

... Flatiron's largest line of business is its real-world evidence ("RWE") service, which converts raw clinical data from patient records into a structured format so that the data can be used for research purposes.[3] After structuring the data, Flatiron aggregates the data into data sets. Flatiron generates revenue by selling data sets to biopharmaceutical companies, as well as some regulatory agencies and researchers...

... Flatiron has developed methodologies and software systems for gathering, curating, and analyzing data from electronic health records. For example, to curate data, Flatiron has formulated rules governing how Flatiron converts information conveyed by physician notes, or other raw data in a patient record, into numeric values for variables in Flatiron's data set...

... Flatiron develops and implements these methods and systems using cross-functional teams consisting of software engineers, oncologists, clinical data specialists, data entry personnel, and others. For example, Flatiron's clinical data team writes policies and procedures to govern how Flatiron's data entry personnel curate data from unstructured records, and Flatiron's research oncology team must generally sign off on those policies and procedures. Research oncologists also describe clinically relevant concepts and rules, which software engineers incorporate into Flatiron's software codes.

Flatiron is one of numerous companies that perform services like this.  (Disclaimer:  I have no connections, financial or other interests, or involvement in this company, or others like it, whatsoever.)

Some observations:

1.  Expertise in the analysis of EHR-derived datasets is relatively common.

2.  Enterprise EHR systems are widespread and are capturing highly detailed, relatively standardized data that is far easier to extract than data held in paper records.

3.  Nearly all COVID-19 patients treated in hospitals in the United States, and in other countries with widespread EHR adoption, will have detailed data stored about their demographics (including residences and recent travel), medical and social history, medication history, pre-existing medical conditions, timelines of their signs and symptoms, chronological results of labwork and imaging studies showing response (or not) to therapy, and so forth.

4.  While systems may not be "interoperable", extraction of a uniform constrained dataset from the major EHR systems is both straightforward and, apparently, done regularly for commercial and/or research purposes.

5.  Therefore, in view of the current medical and mass economic upheaval, and what seems to be increasing public impatience with, and distrust of, the experts:

I believe and recommend that anonymized, HIPAA-compliant datasets on COVID-19 patients should be made available ASAP, for example, on an HHS website.

This would allow leveraging of widespread expertise in analysis of the data for crucial purposes including, but not limited to (just off the top of my head): a better understanding of just who is susceptible to getting severe symptoms and ARDS from COVID-19; the role of co-morbidities in outcomes; comparative effectiveness research on new and experimental treatments (such as the currently controversy-provoking hydroxychloroquine/azithromycin/zinc triad); comparison of strategies in use of mechanical ventilators; and others.
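As a rough illustration of what "anonymized" could involve before such a dataset is posted, here is a minimal Python sketch loosely modeled on a few of the HIPAA Safe Harbor rules: direct identifiers are dropped, dates are reduced to the year, ages over 89 are lumped into a single "90+" category, and ZIP codes are cut to their first three digits (or replaced with "000" for sparsely populated areas). The record fields, the `deidentify` function, and the `restricted_zip3` parameter are hypothetical names made up for this example; they do not come from any EHR vendor's export format. Safe Harbor covers 18 categories of identifiers, so this covers only a small part of what real de-identification requires.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class PatientRecord:
    """Hypothetical, heavily simplified extract of one hospital encounter."""
    name: str
    date_of_birth: date
    zip_code: str
    admission_date: date
    diagnosis: str
    on_ventilator: bool


def age_on(day: date, born: date) -> int:
    """Age in whole years on a given day."""
    had_birthday = (day.month, day.day) >= (born.month, born.day)
    return day.year - born.year - (0 if had_birthday else 1)


def deidentify(record: PatientRecord, restricted_zip3=frozenset()) -> dict:
    """Return a reduced record with a few Safe Harbor-style generalizations.

    restricted_zip3 is an assumed, caller-supplied set of three-digit ZIP
    prefixes covering 20,000 or fewer people; those are replaced by "000".
    """
    age = age_on(record.admission_date, record.date_of_birth)
    zip3 = record.zip_code[:3]
    return {
        # The name and exact date of birth are dropped entirely.
        "age": "90+" if age >= 90 else str(age),
        "zip3": "000" if zip3 in restricted_zip3 else zip3,
        # Only the year of the admission date is kept.
        "admission_year": record.admission_date.year,
        "diagnosis": record.diagnosis,
        "on_ventilator": record.on_ventilator,
    }


example = PatientRecord(
    name="Jane Doe",
    date_of_birth=date(1925, 3, 15),
    zip_code="19104",
    admission_date=date(2020, 4, 1),
    diagnosis="COVID-19 pneumonia",
    on_ventilator=True,
)
print(deidentify(example))
```

Clinical fields such as the diagnosis and ventilator status pass through untouched, which is the point: the analytic value stays in the dataset while the fields that point to a specific person are removed or coarsened.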

One of the motivations of widespread EHR adoption (including government incentives, and, after a few years, penalties for non-adopters) was the potential of EHRs for enabling "virtual clinical trials" to be conducted.

Up to now, EHRs have largely been an albatross to practicing physicians and nurses, who are called upon - mandated, actually - to perform massive amounts of clerical work in data entry, in addition to clinical work.

It's time IMHO to leverage and democratize EHR potential, not just for the benefit of high-paying data customers.

(I don't think what I describe is happening on a significant scale; I believe the data is being kept on a short leash.  I would appreciate knowing if this is not correct.)

Finally, I note that politics and the data analytics I describe don't mix well.

-- SS

1 comment:

Anonymous said...

I would expect that this data being obtained at scale by an expanding industry now means that it is no longer a source of clinical/public health value but something to be bought, sold and ultimately commodified. What does that mean for health IT if clinically your hospital can't live up to the promise but a private company "can" through aggregation and scaled deep learning/algorithms?