Quantitative Variable Distributions: Multi Site, Anomaly Detection, Longitudinal Analysis
Created
Last Modified
Domain
Category
Parameters
Publisher
PEDSnet
Abstract
This check provides raw data and visualizations to aid a user in evaluating whether the distribution of quantitative variables aligns with clinical expectations. It can summarize the distribution of a quantitative variable (like lab result values) or patient counts (like number of patients with an outpatient visit).
Data Requirements
Probe
Clinical Assessment
Access Package
# install.packages("devtools")
devtools::install_github('ssdqa/quantvariabledistribution')
Visualization Output
This check displays the Euclidean distance between two time series: the smoothed (Loess) summary statistic as selected by the user for a given site and the all-site statistic. Three graphs are output:
- A line graph displaying the smoothed summary statistic at each site over time, with the Euclidean distance available in the tooltip when hovering over the line
- A line graph displaying the raw (not smoothed) summary statistic at each site over time
- A circular bar graph displaying the Euclidean distance from the all-site value where the fill represents the average Loess statistic over time
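The comparison described above can be sketched in base R. This is a minimal illustration on made-up data, not the package's own implementation: the toy site names and values, the use of the across-site median as the all-site statistic, the default `loess` settings, and the `dist_by_site` name are all assumptions made for this example.

```r
# Toy data: hypothetical monthly median values for two sites
set.seed(42)
months <- seq(as.Date("2020-01-01"), by = "month", length.out = 24)
toy <- data.frame(
  site = rep(c("site_a", "site_b"), each = 24),
  time_start = rep(months, times = 2),
  median_val = c(10 + rnorm(24, sd = 0.5), 12 + rnorm(24, sd = 0.5))
)
toy$date_numeric <- as.numeric(toy$time_start)

# All-site statistic per time point (assumed here: median across sites)
allsite <- aggregate(median_val ~ date_numeric, data = toy, FUN = median)
names(allsite)[2] <- "allsite_var"
toy <- merge(toy, allsite, by = "date_numeric")

# Loess-smooth each site's series over time
toy <- toy[order(toy$site, toy$date_numeric), ]
toy$site_loess <- NA_real_
for (s in unique(toy$site)) {
  idx <- toy$site == s
  fit <- loess(median_val ~ date_numeric, data = toy[idx, ])
  toy$site_loess[idx] <- predict(fit)
}

# Euclidean distance between each site's smoothed series and the all-site series
dist_by_site <- sapply(split(toy, toy$site), function(d) {
  sqrt(sum((d$site_loess - d$allsite_var)^2))
})
print(dist_by_site)
```

A larger distance indicates a site whose smoothed trajectory sits further from the all-site series over the whole period, which is the quantity the tooltip and circular bar graph are described as showing.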
Raw Output
This check produces a raw data output containing 12 columns:
| Column | Data Type | Definition |
|---|---|---|
| site | character | the name of the site being targeted OR “combined” if multiple sites were provided |
| time_start | date | the start of the time period being examined |
| value_type | character | the type of value being measured |
| mean_val or median_val | numeric | the mean or median of the value_type being measured, based on the user’s selection |
| allsite_var | numeric | the euclidean_stat of interest across all sites |
| date_numeric | numeric | the numeric equivalent of time_start |
| site_loess | numeric | the euclidean_stat value with Loess regression applied |
| dist_eucl_mean | numeric | the Euclidean distance of site_loess from allsite_var |
| euclidean_stat | character | the summary statistic selected by the user for the computation |
| output_function | character | a string indicating the type of visualization that should be generated by qvd_output |
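
Before plotting or further analysis, it can help to confirm that a result data frame carries the columns named in the table. The sketch below is a hypothetical helper and is not part of the package; `check_raw_output`, the toy row, and every value in it (including the `output_function` string) are invented for illustration.

```r
# Hypothetical helper (not part of the package): reports which of the
# columns named in the table are absent from a data frame.
check_raw_output <- function(df, stat = c("mean", "median")) {
  stat <- match.arg(stat)
  expected <- c("site", "time_start", "value_type",
                paste0(stat, "_val"), "allsite_var", "date_numeric",
                "site_loess", "dist_eucl_mean", "euclidean_stat",
                "output_function")
  missing_cols <- setdiff(expected, names(df))
  list(ok = length(missing_cols) == 0, missing = missing_cols)
}

# Toy one-row example with made-up values
toy_row <- data.frame(
  site = "site_a",
  time_start = as.Date("2020-01-01"),
  value_type = "lab_result",
  median_val = 10.2,
  allsite_var = 11.0,
  date_numeric = as.numeric(as.Date("2020-01-01")),
  site_loess = 10.4,
  dist_eucl_mean = 4.9,
  euclidean_stat = "median",
  output_function = "toy_output_label",
  stringsAsFactors = FALSE
)

result <- check_raw_output(toy_row, stat = "median")
print(result$ok)
```

The `stat` argument reflects the table's note that the summary column is named `mean_val` or `median_val` depending on the user's selection.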
Affiliation(s)
Funder(s)
This research was made possible through the generous support of Patient-Centered Outcomes Research Institute. The statements presented in this work are solely the responsibility of the author(s) and do not necessarily represent the views of PCORI, its Board of Governors, or its Methodology Committee.
Provenance
Description
Development Code
Clinical Subjects Headings
Related Publications
Creative Commons license
Except where otherwise noted, this item's license is described as a CC-BY Attribution 4.0 License.
Cite this Data Quality Check
PEDSnet Data Coordinating Center, Wieand, K., & Dickinson, K. (2025, July). Quantitative Variable Distributions: Multi Site, Anomaly Detection, Longitudinal Analysis. [D Q Check]. PEDSpace Knowledge Bank. https://doi.org/10.24373/pdsp-476

