Expected Variables Present: Multi-Site, Anomaly Detection, Longitudinal Analysis


Created

Last Modified

Domain

Category

Parameters

Publisher

PEDSnet

Abstract

This check provides raw data and visualizations to help a user evaluate whether expected concepts are present in a dataset of interest. It summarizes the proportion of patients with co-occurring variables and helps identify anomalous data for comparison across sites.
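As a rough illustration of the proportion this check summarizes, the sketch below computes the share of patients with a variable for each site and time period using base R. The input table, its column names (`has_var` in particular), and the use of a simple mean of site-level proportions as the all-site average are assumptions made for the example; they are not taken from the package itself.

```r
# Hypothetical patient-level input: one row per patient per period, with a
# logical flag saying whether the variable of interest was observed.
patients <- data.frame(
  site       = c("site_a", "site_a", "site_a", "site_b", "site_b", "site_b"),
  time_start = as.Date(rep("2020-01-01", 6)),
  has_var    = c(TRUE, TRUE, FALSE, TRUE, FALSE, FALSE)
)

# Proportion of patients with the variable, per site and time period.
prop_by_site <- aggregate(has_var ~ site + time_start, data = patients, FUN = mean)
names(prop_by_site)[names(prop_by_site) == "has_var"] <- "prop_pt_variable"

# Assumption: the all-site average is the plain mean of the site-level
# proportions within each time period.
prop_by_site$mean_allsiteprop <- ave(prop_by_site$prop_pt_variable,
                                     prop_by_site$time_start, FUN = mean)
print(prop_by_site)
```

A pooled average (all patients across sites divided by all patients) would be an equally plausible definition; the choice matters when sites differ greatly in size.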

Probe

Clinical Assessment

Access Package

# install.packages("devtools")
devtools::install_github('ssdqa/conceptsetdistribution')

Visualization Output

This check outputs three visualizations that display the Euclidean distance between two time series: the smoothed (Loess) proportion of a user-selected variable for a given site, and the average proportion across all sites. Two line graphs (one smoothed, one raw) show the proportion of the variable at each site over time. Sites are differentiated by color, and a thick red line represents the all-site average. A circular bar graph displays each site's Euclidean distance from the all-site mean, with color representing the average Loess proportion over time.
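The distance described above can be sketched in base R as follows. This is a minimal illustration under stated assumptions, not the package's implementation: the data are invented, the `span` value is an arbitrary choice, and the all-site mean is taken over the raw (unsmoothed) site proportions.

```r
# Hypothetical yearly proportions for three sites (illustrative values only).
years <- 2015:2024
props <- data.frame(
  site = rep(c("site_a", "site_b", "site_c"), each = length(years)),
  date_numeric = rep(years, times = 3),
  prop = c(seq(0.10, 0.19, length.out = 10),
           seq(0.12, 0.21, length.out = 10),
           seq(0.40, 0.22, length.out = 10))
)

# All-site average proportion at each time point.
props$mean_allsiteprop <- ave(props$prop, props$date_numeric, FUN = mean)

# Loess-smoothed proportion, fitted separately for each site.
# The span of 0.75 is an assumed setting for this sketch.
props$site_loess <- NA_real_
for (s in unique(props$site)) {
  idx <- props$site == s
  fit <- loess(prop ~ date_numeric, data = props[idx, ], span = 0.75)
  props$site_loess[idx] <- fitted(fit)
}

# Euclidean distance between each site's smoothed series and the all-site mean.
dist_eucl_mean <- sapply(split(props, props$site), function(d) {
  sqrt(sum((d$site_loess - d$mean_allsiteprop)^2))
})
print(round(dist_eucl_mean, 3))
```

In this toy data, `site_c` starts far above the other sites and converges toward them, so its series sits farthest from the all-site mean over the full period.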

Raw Output

The raw output of this check contains nine columns:

| Column | Data Type | Definition |
| --- | --- | --- |
| site | character | the name of the site being targeted |
| time_start | date | the start of the time period being examined |
| variable | character | the name of the variable |
| prop_pt_variable / prop_row_variable | numeric | the proportion of patients or rows (based on user selection) with evidence of the variable |
| mean_allsiteprop | numeric | the average patient/row proportion across sites |
| median | numeric | the median patient/row proportion across sites |
| date_numeric | numeric | the numeric equivalent of time_start |
| site_loess | numeric | the patient/row proportion with Loess regression applied |
| dist_eucl_mean | numeric | the Euclidean distance of site_loess from mean_allsiteprop |
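One way to work with this output is to rank sites by their distance from the all-site mean. The sketch below uses a hand-made excerpt with a subset of the columns listed above; the values, the variable name `hba1c`, and the assumption that `dist_eucl_mean` repeats on every row for a given site and variable are all illustrative.

```r
# Hypothetical excerpt of the raw output (values are illustrative only).
raw_output <- data.frame(
  site = rep(c("site_a", "site_b", "site_c"), each = 2),
  time_start = as.Date(rep(c("2020-01-01", "2021-01-01"), times = 3)),
  variable = "hba1c",
  dist_eucl_mean = rep(c(0.05, 0.02, 0.31), each = 2)
)

# Keep one row per site and variable, then sort by distance, largest first.
site_dist <- unique(raw_output[, c("site", "variable", "dist_eucl_mean")])
site_dist <- site_dist[order(site_dist$dist_eucl_mean, decreasing = TRUE), ]
print(site_dist)
```

Sites at the top of this ordering are the ones whose smoothed trend departs most from the network average and are natural candidates for closer review.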

Funder(s)

This research was made possible through the generous support of the Patient-Centered Outcomes Research Institute (PCORI). The statements presented in this work are solely the responsibility of the author(s) and do not necessarily represent the views of PCORI, its Board of Governors, or its Methodology Committee.

Provenance

Description

Clinical Subjects Headings

Related Data Quality Result

Expected Variables Present Study Results II: PAQS Query 3
Created: 2025-05-30; Affiliation: PEDSnet Data Coordinating Center
The results of an Expected Variables Present check using the Multi-Site, Anomaly Detection, Longitudinal parameters. This check evaluates the annual distributions of key variables related to diabetes: stroke, second-line antidiabetics, ketoacidosis, HbA1c > 8%, elevated blood pressure, and CKD.
Expected Variables Present Study Results III: PRESERVE
Created: 2025-04-08; Affiliation: PEDSnet Data Coordinating Center
The results of an Expected Variables Present check using the Multi-Site, Anomaly Detection, Longitudinal parameters. This check assesses anomalous proportions of patients with several study variables, such as ABPM and census tract information.

Related Person

Related Code

Study-Specific Quality, Utility, and Breadth Assessment
Created: 2025-11; Affiliation: PEDSnet Data Coordinating Center
This suite of R packages lets users investigate multiple facets of data quality and customize analyses to study-specific needs. Each module supports up to 8 different analyses in either the OMOP or PCORnet CDM, each taking a different view of the data while addressing the same data quality probe.

##### [View pkgdown summary here.](https://ssdqa.github.io/squba/)

Related Data Quality Check

Related Publications

Creative Commons license

Except where otherwise noted, this item's license is described as a CC-BY Attribution 4.0 License.

Cite this Data Quality Check

PEDSnet Data Coordinating Center. (2024, June). Expected Variables Present: Multi-Site, Anomaly Detection, Longitudinal Analysis. [D Q Check]. PEDSpace Knowledge Bank. https://doi.org/10.24373/pdsp-465