Study: Researchers use big data to establish long COVID subtypes based on more than 250,000 VA patients

Data-driven long COVID definition will help support public health initiatives while providing clinicians with a more nuanced basis for screening and diagnosis

Release Date: June 22, 2024

BUFFALO, N.Y. — In announcing a consensus definition of long COVID earlier this month, the National Academies of Sciences, Engineering, and Medicine noted that the definition was not the final word on the condition and would be revised as new findings are published.

Earlier this year, researchers at the University at Buffalo and the Department of Veterans Affairs who were using big data to work on a data-driven long COVID definition were invited to report their findings in testimony before the National Academies.

In April, they published those findings in JMIR Public Health and Surveillance, based on more than 250,000 patients in the Veterans Health Administration who had tested positive for COVID.

Finding different subtypes

“Because we had such large numbers of patients at the VA, it was a wonderful place to do this study,” says Peter L. Elkin, MD, corresponding author and chair of the Department of Biomedical Informatics in the Jacobs School of Medicine and Biomedical Sciences at UB. “We looked at the full electronic health records of patients so that we could tease apart all the different symptoms of this complex disease to find all the different subtypes that exist.”

One of the goals is to help clinicians better recognize cases of long COVID. “The idea is to make more clinicians aware of who has long COVID and who is maybe not yet out of the woods from long COVID,” he explains. “Now that we have a definition, you can tell who in your specialty has long COVID and who doesn’t.”

The findings could help researchers develop a more scientific approach to finding out how COVID infection causes the specific conditions seen in long COVID.

The study used the electronic health records of more than 2.3 million patients who were seen at a VA health facility between Jan. 1, 2020, and Aug. 18, 2022. A total of 367,148 patients tested positive for COVID-19 at a VA facility; of those, 268,320 were considered to have long COVID, meaning they received a novel diagnosis between one and seven months after a positive COVID-19 test.
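
The selection rule described above can be illustrated with a short sketch. This is not the study's code: the function name, the approximation of a month as 30 days, and the sample diagnosis codes are all assumptions made only for illustration.

```python
from datetime import date, timedelta

# Assumption: a "month" is approximated as 30 days. The exact window
# boundaries used in the study are not given in this article.
WINDOW_START = timedelta(days=30)       # one month after the positive test
WINDOW_END = timedelta(days=7 * 30)     # seven months after the positive test


def has_novel_diagnosis_in_window(positive_test, prior_codes, later_diagnoses):
    """Return True if a diagnosis code that was not on record before the
    positive test is dated between one and seven months after it.

    positive_test   -- date of the positive COVID-19 test
    prior_codes     -- set of diagnosis codes recorded before the test
    later_diagnoses -- iterable of (code, date) pairs recorded afterward
    """
    for code, diagnosed_on in later_diagnoses:
        if code in prior_codes:
            continue  # already on record before infection, so not novel
        elapsed = diagnosed_on - positive_test
        if WINDOW_START <= elapsed <= WINDOW_END:
            return True
    return False


# Hypothetical patient: tested positive Jan. 1, 2021, and received a new
# atrial fibrillation code (I48.91, used here only as an example) in March.
flagged = has_novel_diagnosis_in_window(
    date(2021, 1, 1),
    prior_codes={"R53.83"},
    later_diagnoses=[("I48.91", date(2021, 3, 15))],
)
```

In this hypothetical case the new code falls 73 days after the test, inside the window, so the patient would be flagged. A diagnosis already on record before the test, or one dated outside the window, would not count.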

Based on the symptoms experienced by patients with long COVID, the researchers assigned a total of 324 ICD (International Classification of Diseases) codes. They identified 180 clinical scenarios and 17 clinical subtypes that were upregulated in people who had long COVID.

Highest case counts in cardiology

The highest long COVID case counts were in cardiology, with diagnoses including low blood pressure, heart failure, arrhythmias and atrial fibrillation, followed by neurology (symptoms included low back pain, severe muscle weakness and cognitive impairment), ophthalmology and pulmonology.

Among the most commonly cited symptoms were fatigue and acute respiratory distress. Respiratory issues are among the most common in long COVID, including chronic cough, respiratory failure and dependence on supplemental oxygen. Cognitive difficulties, including brain fog, were also frequently cited.

Factors that put patients at higher risk for developing long COVID included older age, other health conditions prior to becoming infected, a more severe case of COVID and low oxygen saturation during COVID. Patients who had not received a COVID-19 vaccine were 1.3 times more likely to develop long COVID.

The study provides definitions of each of the long COVID subtypes and odds ratios defining the risk for each of them.
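
For readers unfamiliar with odds ratios, the sketch below shows how one is computed from a two-by-two table. The function name and the counts are hypothetical and are not taken from the study; the article does not describe how the authors calculated or adjusted their estimates.

```python
def odds_ratio(exposed_cases, exposed_noncases, unexposed_cases, unexposed_noncases):
    """Odds of the outcome among the exposed divided by the odds among the
    unexposed, from the four cells of a two-by-two table."""
    odds_exposed = exposed_cases / exposed_noncases
    odds_unexposed = unexposed_cases / unexposed_noncases
    return odds_exposed / odds_unexposed


# Hypothetical counts, for illustration only: 30 of 100 people with a risk
# factor develop a condition, compared with 20 of 100 people without it.
example = odds_ratio(30, 70, 20, 80)
```

An odds ratio above 1 indicates the outcome is more common among people with the risk factor; a value of exactly 1 indicates no difference between the groups.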

“These data will allow us to better identify patients with long COVID and it can also help support public health research and policy initiatives going forward,” Elkin says.

The authors note that a limitation of the study was that the study population was 84% male.

Co-authors with Elkin are Skyler Resendez, PhD, a postdoctoral fellow; Hugo Sebastian Ruiz Ayala; and Prahalad Rangan, PhD, all in the Department of Biomedical Informatics in the Jacobs School; and Steven H. Brown, MD; Jonathan Nebeker, MD; and Diana Montella, MD, of the Office of Health Informatics of the Department of Veterans Affairs.

The work was supported by the National Library of Medicine, the National Institute on Alcohol Abuse and Alcoholism, and the National Center for Advancing Translational Sciences, all of the National Institutes of Health, as well as the Department of Veterans Affairs.

Media Contact Information

Ellen Goldbaum
News Content Manager
Tel: 716-645-4605