Patient Records Consistency: Multi Site, Anomaly Detection, Longitudinal Analysis


Publisher

PEDSnet

Abstract

This check identifies anomalous data over time across multiple sites. The Patient Record Consistency module, part of the larger SSDQA ecosystem, tests the consistency of clinical data representation within a patient’s record. The goal is to ensure that the patient’s information is mutually corroborating and complete, such that two events that are expected to co-occur both appear within the same patient’s record (e.g., a leukemia diagnosis and chemotherapy).
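As a rough illustration of the co-occurrence idea, the sketch below assigns each patient to one of four event combinations (A only, B only, both, neither) and computes the proportion of patients in each. It uses base R only and does not call the package; the data frame, its column names, and the `classify_combo` helper are hypothetical and exist only for this example.

```r
# Hypothetical toy data: one row per patient, with flags for two events
# (event A = leukemia diagnosis, event B = chemotherapy). Not package output.
patients <- data.frame(
  patient_id  = 1:6,
  has_event_a = c(TRUE, TRUE, FALSE, FALSE, TRUE, TRUE),
  has_event_b = c(TRUE, FALSE, TRUE, FALSE, TRUE, TRUE)
)

# Hypothetical helper: assign each patient to exactly one event combination.
classify_combo <- function(a, b) {
  ifelse(a & b, "both",
    ifelse(a, "A only",
      ifelse(b, "B only", "neither")))
}

patients$stat_type <- classify_combo(patients$has_event_a, patients$has_event_b)

# Proportion of patients falling into each combination.
combo_levels <- c("A only", "B only", "both", "neither")
prop_event <- prop.table(table(factor(patients$stat_type, levels = combo_levels)))
print(prop_event)
```

A low "both" proportion alongside a high "A only" proportion would suggest that the expected companion event is missing from many records.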

Probe

Clinical Assessment

Access Package

# install.packages("devtools")
devtools::install_github('ssdqa/patientrecordconsistency')

Visualization Output

This check displays the Euclidean distance between two time series: the smoothed (Loess) proportion of a user-selected event category for a given site and the all-site average proportion for each time point. Three graphs are output:

  1. A line graph displaying the smoothed proportion of the event category at each site over time, with the Euclidean distance available in the tooltip when hovering over the line.
  2. A line graph displaying the raw (not smoothed) proportion of the event category at each site over time.
  3. A circular bar graph displaying the Euclidean distance from the all-site mean where the fill represents the average Loess proportion over time.
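The distance described above can be sketched in base R as follows. This is a minimal illustration under stated assumptions, not the package's implementation: the toy data are simulated, the Loess fit uses R's default `loess()` settings, and the distance is taken between each site's smoothed series and the all-site mean at the same time points. Column names mirror the raw output, but the site names and values are made up.

```r
# Hypothetical toy input: monthly proportions of one stat_type at three sites.
set.seed(42)
months <- seq(as.Date("2020-01-01"), by = "month", length.out = 24)
sites  <- c("site_a", "site_b", "site_c")
base   <- c(site_a = 0.60, site_b = 0.62, site_c = 0.35)

dat <- data.frame(
  time_start = rep(months, times = length(sites)),
  site       = rep(sites, each = length(months)),
  stringsAsFactors = FALSE
)
dat$prop_event   <- unname(base[dat$site]) + rnorm(nrow(dat), sd = 0.02)
dat$date_numeric <- as.numeric(dat$time_start)

# All-site mean proportion at each time point.
dat$mean_allsiteprop <- ave(dat$prop_event, dat$date_numeric, FUN = mean)

# Loess-smooth each site's series over time (default span is an assumption).
dat$site_loess <- NA_real_
for (s in sites) {
  idx <- dat$site == s
  fit <- loess(prop_event ~ date_numeric, data = dat[idx, ])
  dat$site_loess[idx] <- predict(fit)
}

# Euclidean distance between each site's smoothed series and the all-site mean.
dist_eucl_mean <- sapply(sites, function(s) {
  idx <- dat$site == s
  sqrt(sum((dat$site_loess[idx] - dat$mean_allsiteprop[idx])^2))
})
print(round(dist_eucl_mean, 3))
```

In this simulated example, `site_c` is given a much lower baseline proportion than the other two sites, so it ends up with the largest distance from the all-site mean.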

Raw Output

This check produces a raw data output containing 9 columns of data:

| Column | Data Type | Definition |
| --- | --- | --- |
| site | character | the name of the site being targeted |
| time_start | date | the start of the time period being examined |
| stat_type | character | string indicating the event combination of interest: A only, B only, both, or neither |
| prop_event | numeric | the proportion of patients meeting the criteria for stat_type in the time period of interest |
| mean_allsiteprop | numeric | the average patient proportion for the stat_type across sites |
| median | numeric | the median patient proportion for the stat_type across sites |
| date_numeric | numeric | the numeric equivalent of time_start |
| site_loess | numeric | the patient proportion for the stat_type with Loess regression applied |
| dist_eucl_mean | numeric | the Euclidean distance of site_loess from mean_allsiteprop |
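One way to work with a table of this shape is to reduce it to one row per site and event combination and rank sites by their distance from the all-site mean. The sketch below assumes that `dist_eucl_mean` is repeated on every row for a given site and `stat_type`; that assumption, the site names, and all values are hypothetical, and only a subset of the nine columns is included for brevity.

```r
# Hypothetical rows shaped like the raw output; every value here is made up.
raw_output <- data.frame(
  site           = rep(c("site_a", "site_b", "site_c"), each = 2),
  time_start     = rep(as.Date(c("2020-01-01", "2020-02-01")), times = 3),
  stat_type      = "both",
  prop_event     = c(0.58, 0.61, 0.60, 0.63, 0.33, 0.36),
  dist_eucl_mean = rep(c(0.11, 0.14, 0.25), each = 2),
  stringsAsFactors = FALSE
)

# Collapse to one row per site and stat_type (assumes the distance is
# constant within each site/stat_type group).
site_summary <- unique(raw_output[, c("site", "stat_type", "dist_eucl_mean")])

# Rank sites from most to least distant from the all-site mean.
site_summary <- site_summary[order(-site_summary$dist_eucl_mean), ]
print(site_summary)
```

Sites at the top of this ranking are the ones whose smoothed trend departs most from the all-site mean and may deserve closer review.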

Funder(s)

This research was made possible through the generous support of the Patient-Centered Outcomes Research Institute (PCORI). The statements presented in this work are solely the responsibility of the author(s) and do not necessarily represent the views of PCORI, its Board of Governors, or its Methodology Committee.

Related Code

Study-Specific Quality, Utility, and Breadth Assessment
Created: 2025-11
Affiliation: PEDSnet Data Coordinating Center
This suite of R packages lets you investigate multiple facets of data quality and customize analyses to your study-specific needs. Each module offers up to 8 different analyses in either the OMOP or PCORnet CDM, each taking a different view of the data while addressing the same data quality probe.

##### [View pkgdown summary here.](https://ssdqa.github.io/squba/)

Creative Commons license

Except where otherwise noted, this item's license is described as a CC-BY Attribution 4.0 License.

Cite this Data Quality Check

PEDSnet Data Coordinating Center, Wieand, K., Bailey, C., Razzaghi, H., & Dickinson, K. (2024, December). Patient Records Consistency: Multi Site, Anomaly Detection, Longitudinal Analysis. [D Q Check]. PEDSpace Knowledge Bank. https://doi.org/10.24373/pdsp-428