So much about “long COVID,” more formally known as Post-Acute Sequelae of SARS-CoV-2 infection, or PASC, is still unknown.
Why do some patients, weeks or months after their initial infection, still exhibit symptoms such as “brain fog,” shortness of breath or heart problems? What kinds of underlying conditions, or demographic characteristics, are associated with these lingering conditions? And how many people are still suffering?
The National Institutes of Health is aiming to answer these questions and others by launching an initiative that brings together experts to gather data around thousands of PASC patients.
Announced in February of this year, the PASC Initiative, also known as RECOVER, is a multimillion-dollar project to coordinate and support analysis of nationwide medical information on COVID-19 “long-haulers.” By doing so, the agency hopes to improve understanding of how to treat long-term complications of the disease.
Dr. Shawn Murphy, chief research information officer at Mass General Brigham, is one of the leaders of the team serving as the PASC Data Resource Core. Over the course of the project, the DRC will facilitate the collection and analysis of standardized data across different cohort studies, along with contributing to study designs.
Murphy spoke with Healthcare IT News this week about the need to study PASC, the challenges involved with data collection and standardization, and his hopes for the future of the project.
Q: Tell me a bit about the PASC Initiative. How did it start, and what’s the current status?
A: The government issued a request for applications. What they wanted to do was set up a pretty complicated system of data flow that came from hospitals, where they were going to find patients who had this really nasty result of COVID.
For whatever reason, the virus leaves many of us with trouble breathing, heart problems or neuropsychiatric problems. By far one of the worst ones is this chronic fatigue syndrome – myalgic encephalomyelitis. You get this brain fog where it’s hard to concentrate. And of course with that comes depression. It’s been a really difficult thing. And these symptoms can appear a long time – over 30 days – after COVID. So that’s what we’re trying to figure out.
The place to start is with the patients. The data is coming from 20 adult sites, 10 pediatric sites and seven autopsy sites: Some people don’t survive the syndrome or they die from something else.
It’s important to make sure we get enough diversity in our populations; we found that COVID affects different kinds of populations differently. We think it’s the same mechanism – residual inflammation – but are each of those symptoms different kinds of inflammation? Or is it that the history of a person’s health worsens general inflammation?
That’s something we need to figure out, from soup to nuts.
Q: How are you planning to gather the data?
A: We’re trying to gather together 20,000 patients, more or less – depending on how many people are needed for the studies – from these 37 sites.
That’s kind of the starting place, and then we collect three different kinds of data. There’s data that’s hand-entered by doctors or patients themselves, electronic health record data, and imaging data – doing MRIs on the living and the deceased.
Q: Can you say more about that hand-entered data? Where is it coming from, and how do you extract it?
A: What will happen is there will be two classes of hand-entered data. Providers will have case report forms. They have a schedule of visits. When the patient comes in they ask them a lot of questions, and they’ll fill out the form specifically with the answers. We try to make the questions as consistent as possible.
The second class is from the patients themselves, who are often much more active with data entry. You can get a patient to put down every day how they’re feeling. They’re often willing to put in details every day: that they drank less water one day, for example. Those things are important.
It’s like a needle in a haystack as far as what it is that’s actually able to help.
Then we’ll use an app to capture the data.
It all goes into the data resource core: the DRC. That’s what I lead, along with Andrea Foulkes, chief of biostatistics at Massachusetts General Hospital, and Dr. Elizabeth Karlson, director of rheumatic disease epidemiology at Brigham and Women’s Hospital.
If you really want a great thing, put together a biostatistician, an informatics person and an epidemiologist. Those skill sets come together nicely to form a cohesive plan.
Q: Are all your study sites using the same EHR vendors?
A: No. The plan is to get all this data filtering down to the DRC, where it can all be made interoperable.
The way we do that is put it into a data meta-model called i2b2, or Informatics for Integrating Biology and the Bedside – a project that’s been going on for over 15 years. What that does is it creates a place where data can all fit together. And then you can query it with web-based query tools and see what kinds of data you have, and what kinds of patients have which symptoms.
Generally, what you have to do is get the data out of the EHR, manipulate it so it fits into i2b2 and transport it to the DRC. And we do it all without keeping the name of the patient.
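The three steps Murphy describes (extract from the EHR, reshape for i2b2, ship without the patient's name) can be illustrated with a minimal Python sketch. This is not the DRC's actual pipeline. The input record layout, the `SITE_SECRET` key, the `pseudonymize` and `to_facts` helpers and the diagnosis codes are all hypothetical, chosen only for illustration. The `ObservationFact` class borrows three column names from i2b2's fact table (`patient_num`, `concept_cd`, `start_date`) but is heavily simplified; in particular, a real i2b2 `patient_num` is a numeric key rather than a hash string.

```python
from __future__ import annotations

import hashlib
import hmac
from dataclasses import dataclass
from datetime import date

# Hypothetical per-site secret; a real site would manage this key securely.
SITE_SECRET = b"hypothetical-site-secret"


@dataclass(frozen=True)
class ObservationFact:
    """Simplified stand-in for one row of an i2b2-style fact table."""
    patient_num: str   # pseudonymous identifier, never the name or record number
    concept_cd: str    # coded observation, e.g. a diagnosis code
    start_date: date   # when the observation was recorded


def pseudonymize(mrn: str) -> str:
    """Derive a stable pseudonym from a medical record number with a keyed hash."""
    digest = hmac.new(SITE_SECRET, mrn.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


def to_facts(ehr_record: dict) -> list[ObservationFact]:
    """Reshape one hypothetical EHR export record into fact rows, dropping the name."""
    patient_num = pseudonymize(ehr_record["mrn"])
    return [
        ObservationFact(patient_num, dx["code"], dx["date"])
        for dx in ehr_record["diagnoses"]
    ]


# Hypothetical EHR export for a single patient (illustrative values only).
record = {
    "mrn": "000123",
    "name": "Jane Example",
    "diagnoses": [
        {"code": "ICD10CM:R53.83", "date": date(2021, 6, 1)},
        {"code": "ICD10CM:R06.02", "date": date(2021, 6, 15)},
    ],
}

facts = to_facts(record)
for fact in facts:
    print(fact)
```

Because the pseudonym is derived with a keyed hash, the same patient maps to the same `patient_num` on every export from a given site, so rows can still be linked downstream even though the name and record number never leave the hospital.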
Q: I know this is a four-year study. What do your timeline goals look like?
A: They’re actively trying to recruit their first patient by September 1. This is extremely aggressive. Normally it would be a year before you recruit your first patient.
We meet almost every day. It’s a very aggressive timeline. But that’s the goal, because we need to figure out what we can do for our patients. The longer this goes on, the more disabled patients are going to be. You can see the impact something like this is going to have on our entire economy.
As far as we can tell, 10-15% of people who have had COVID are getting these kinds of symptoms. We’re really talking about an enormous number of patients who are going to have this problem.
It’s going to have an incredible impact. There’s a lot of angst to get in there and do something about it.
This interview has been edited and condensed for clarity.