Study: Statistical Methods for Phenotype Estimation and Analysis Using Electronic Health Records
Study Dates
Last Modified
Tags
Publisher
Abstract
This study develops new statistical methods that combine the unique set of measures available for each individual to estimate a “latent phenotype.” The latent phenotype is a patient’s underlying, true disease profile, which may be only hinted at by the series of medical tests recorded in the electronic health record (EHR).
Funder(s)
Provenance
Description
Electronic health records (EHR) provide extensive information on disease risk factors that can be studied to improve our understanding of health outcomes. However, medical assessments are performed at irregular intervals in response to patients’ medical needs, which makes these data difficult to use for research. This project develops new statistical methods that combine the unique set of measures available for each individual to estimate a “latent phenotype.” The latent phenotype consists of a patient’s underlying, true disease profile, which may be only hinted at by the series of medical tests recorded in the EHR. Efficiently combining all available information for each individual and leveraging the richness and complexity of EHR data leads to better-characterized patients.
Children and adolescents with type II diabetes are identified to demonstrate the potential of our new statistical methods. Using EHR data from eight children’s hospital health systems participating in the PEDSnet federation, a pediatric diabetes latent phenotype is developed. This phenotype can be used in subsequent research for identifying patient participants or for assessing risk of other health outcomes that may be increased in children with type II diabetes. Clinician, patient, and parent partners from PEDSnet identify downstream health consequences that are most important for further study and analyze associations between the newly developed diabetes latent phenotype and these outcomes. These analyses illustrate the performance of the latent phenotype approach in a real-world context where information on risk factors and outcomes for type II diabetes is urgently needed.
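The latent phenotype idea described above can be illustrated with a small sketch. The code below is not the study's method (the related publication describes a Bayesian latent class approach); it is a simplified two-class latent class model fitted by maximum likelihood with the EM algorithm, under the assumptions that each EHR indicator is binary, that indicators are conditionally independent given the true disease status, and that unrecorded indicators (None) can be ignored. All names (posterior, fit_latent_class, simulate), the three indicators, and the simulation parameters are hypothetical, and the data are synthetic.

```python
import math
import random


def posterior(record, prevalence, theta):
    """P(true status = 1 | observed indicators); None entries are skipped."""
    log_w = [math.log(1 - prevalence), math.log(prevalence)]
    for j, x in enumerate(record):
        if x is None:  # indicator not recorded for this patient
            continue
        for k in (0, 1):
            p = theta[k][j]
            log_w[k] += math.log(p if x == 1 else 1 - p)
    m = max(log_w)  # subtract the max for numerical stability
    w0, w1 = math.exp(log_w[0] - m), math.exp(log_w[1] - m)
    return w1 / (w0 + w1)


def fit_latent_class(records, n_iter=100, eps=1e-6):
    """EM for a two-class latent class model with missing indicators."""
    n_items = len(records[0])
    prevalence = 0.3
    # Start class 1 as the "mostly positive" class so it plays the disease role.
    theta = [[0.2] * n_items, [0.8] * n_items]
    for _ in range(n_iter):
        # E-step: posterior probability of disease for each patient.
        post = [posterior(r, prevalence, theta) for r in records]
        # M-step: update prevalence and per-class positive rates.
        prevalence = min(max(sum(post) / len(post), eps), 1 - eps)
        for j in range(n_items):
            for k in (0, 1):
                num = den = 0.0
                for r, p in zip(records, post):
                    if r[j] is None:
                        continue
                    w = p if k == 1 else 1 - p
                    num += w * r[j]
                    den += w
                if den > 0:
                    theta[k][j] = min(max(num / den, eps), 1 - eps)
    post = [posterior(r, prevalence, theta) for r in records]
    return prevalence, theta, post


def simulate(n, seed=0, prevalence=0.3, missing=0.3):
    """Synthetic patients: three binary indicators, each missing at random."""
    rng = random.Random(seed)
    sens = [0.90, 0.80, 0.85]  # P(indicator positive | disease), assumed
    fpr = [0.05, 0.10, 0.10]   # P(indicator positive | no disease), assumed
    records, truth = [], []
    for _ in range(n):
        z = 1 if rng.random() < prevalence else 0
        row = []
        for j in range(3):
            if rng.random() < missing:
                row.append(None)
            else:
                p = sens[j] if z else fpr[j]
                row.append(1 if rng.random() < p else 0)
        records.append(row)
        truth.append(z)
    return records, truth


records, truth = simulate(1000, seed=0)
prevalence, theta, post = fit_latent_class(records)
print("estimated prevalence:", round(prevalence, 3))
```

Each entry of post is a per-patient probability rather than a yes/no label, which is what allows later outcome analyses to account for uncertainty in the phenotype. A patient with no recorded indicators simply receives the estimated prevalence as their probability.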
Study Aims
- To develop statistical methods for estimating latent phenotypes.
- To develop methods for incorporating latent phenotypes into analyses of health outcomes accounting for uncertainty in phenotypes and other patient covariates.
- To estimate a type II diabetes phenotype for patients in the PEDSnet federation and to estimate its associations with downstream health outcomes.
The long-term objective of this research is to provide better statistical methods for combining inconsistently collected measures derived from the EHR.
Development Code
Vocabulary
Clinical Subjects Headings
Related Publications
Hubbard RA, Huang J, Harton J, Oganisian A, Choi G, et al. January 2019. “A Bayesian latent class approach for EHR-based phenotyping.” Stat Med. 38(1):74-87.
DOI: 10.1002/sim.7953