Patient Records Consistency: Single Site, Anomaly Detection, Cross-Sectional Analysis


Created

Last Modified


Domain

Category

Parameters

Publisher

PEDSnet

Abstract

This check provides analyses to identify anomalous data at the level of a single site. The Patient Record Consistency module, part of the larger SSDQA ecosystem, tests the consistency of clinical data representation within a patient’s record. The goal is to ensure that the patient’s information is internally corroborating and complete, such that two events that are expected to co-occur (e.g., a leukemia diagnosis and chemotherapy) both appear within the same patient’s record.

Probe

Clinical Assessment

Access Package

# install.packages("devtools")
devtools::install_github('ssdqa/patientrecordconsistency')

Visualization Output

This check outputs a bar graph displaying the Jaccard similarity index for each category of follow-up (F/U) time.

The boundaries of the F/U “bins” are provided by the user as a vector of integers for breaks. Each bin includes the starting integer and extends up to, but NOT including, the ending integer. For example, [0,1) includes patients with at least 0 and less than 1 year of F/U.
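The half-open binning described above can be sketched in base R. This is a minimal illustration, not the package's own implementation: the follow-up values and break points are invented, and the variable names `fu_years` and `fu_breaks` are used here only for readability.

```r
# Hypothetical follow-up times in years (illustrative values only)
fu_years <- c(0.2, 0.99, 1.0, 2.5, 3.0, 4.9)

# Hypothetical integer break points, as a user might supply them
fu_breaks <- c(0, 1, 3, 5)

# right = FALSE closes each interval on the left and opens it on the right,
# so a patient with exactly 1 year of follow-up falls in [1,3), not [0,1)
fu_bin <- cut(fu_years, breaks = fu_breaks, right = FALSE)

# Count of patients per follow-up bin
print(table(fu_bin))
```

With `right = FALSE`, a value equal to the final break (5 in this sketch) would not be assigned to any bin, which matches the "up to but not including" rule.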

Raw Output

This check produces a raw data output containing 11 columns of data:

| Column | Data Type | Definition |
| --- | --- | --- |
| site | character | the name of the site being targeted |
| fu_bin | character | the categorical bin of follow-up time defined by the user in the fu_breaks parameter |
| concept1 | character | the name of the first event being compared in the similarity index |
| concept2 | character | the name of the second event being compared in the similarity index |
| cocount | numeric | the number of patients with evidence of both events |
| concept1_ct | numeric | the number of patients with evidence of event 1 |
| concept2_ct | numeric | the number of patients with evidence of event 2 |
| concept_count_union | numeric | the number of patients with evidence of either event |
| jaccard_index | numeric | the Jaccard similarity index (cocount / concept_count_union) |
| concept1_prop | numeric | the proportion of patients with evidence of event 1 who have evidence of both events (cocount / concept1_ct) |
| concept2_prop | numeric | the proportion of patients with evidence of event 2 who have evidence of both events (cocount / concept2_ct) |
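As a worked illustration of how the count and ratio columns relate to one another, the sketch below computes them for a single pair of events in base R. The patient identifiers are invented and the code is an assumption about the arithmetic implied by the column definitions, not the package's internal code.

```r
# Hypothetical patient identifiers with evidence of each event
concept1_patients <- c("p1", "p2", "p3", "p4")  # e.g. a leukemia diagnosis
concept2_patients <- c("p3", "p4", "p5")        # e.g. chemotherapy

# Counts corresponding to the raw output columns
cocount <- length(intersect(concept1_patients, concept2_patients))
concept1_ct <- length(unique(concept1_patients))
concept2_ct <- length(unique(concept2_patients))
concept_count_union <- length(union(concept1_patients, concept2_patients))

# Ratios derived from the counts, using the formulas given in the definitions
result <- data.frame(
  cocount = cocount,
  concept1_ct = concept1_ct,
  concept2_ct = concept2_ct,
  concept_count_union = concept_count_union,
  jaccard_index = cocount / concept_count_union,
  concept1_prop = cocount / concept1_ct,
  concept2_prop = cocount / concept2_ct
)

print(result)
```

In this toy case two patients have both events out of five with either, so the Jaccard index is 0.4, while half of the event 1 patients and two thirds of the event 2 patients have both events.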

Funder(s)

This research was made possible through the generous support of the Patient-Centered Outcomes Research Institute (PCORI). The statements presented in this work are solely the responsibility of the author(s) and do not necessarily represent the views of PCORI, its Board of Governors, or its Methodology Committee.

Provenance

Description

Clinical Subjects Headings

Related Data Quality Result

Related Person

Related Code

Study-Specific Quality, Utility, and Breadth Assessment
Created: 2025-11
Affiliation: PEDSnet Data Coordinating Center
This suite of R packages supports investigating multiple facets of data quality and customizing analyses to study-specific needs. Each module offers up to 8 different analyses in either the OMOP or PCORnet CDM, each taking a different view of the data while still addressing the same data quality probe.

##### [View pkgdown summary here.](https://ssdqa.github.io/squba/)

Related Data Quality Check

Related Publications

Creative Commons license

Except where otherwise noted, this item's license is described as a CC-BY Attribution 4.0 License.

Cite this Data Quality Check

PEDSnet Data Coordinating Center, Wieand, K., Bailey, C., Razzaghi, H., & Dickinson, K. (2024, December). Patient Records Consistency: Single Site, Anomaly Detection, Cross-Sectional Analysis [D Q Check]. PEDSpace Knowledge Bank. https://doi.org/10.24373/pdsp-429