Cohort Attrition: Multi-Site, Anomaly Detection, Cross-Sectional Analysis


Created

Last Modified


Domain

Category

Parameters

Publisher

PEDSnet

Abstract

This check provides an exploratory analysis of patient eligibility criteria for a study. It summarizes each step of attrition criteria in the cohort construction for a given site.

Probe

Clinical Assessment

Access Package

# install.packages("devtools")
devtools::install_github('ssdqa/cohortattrition')

Visualization Output

This check offers either a dot plot visualization or a line graph with a reference table. The user inputs the attrition step at which the visualization should begin and selects a variable to facet by (num_pts, prop_retained_start, prop_retained_prior, or prop_diff_prior). Anomalous data points are represented with a star. Data point size is directly proportional to the mean value per step, and data point color corresponds to the proportion of the user-selected variable (e.g., prop_diff_prior). In the line graph output option, line color corresponds to the site. Hovering over the graph displays a tooltip with a description of the attrition step, the site name, mean, standard deviation, median, MAD, and the value of the user-selected variable.
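To make the selectable variables concrete, here is a minimal base R sketch of how they could be derived from per-step patient counts for a single site. The helper name `add_attrition_props`, the `start_step_num` argument, the example counts, and the sign convention for the differences are all assumptions made for illustration; this is not the package's own implementation.

```r
# Hypothetical attrition table for one site. The four input columns follow
# the Raw Output definitions; the counts are invented for illustration.
attrition <- data.frame(
  site = "site_a",
  step_number = 0:3,
  attrition_step = c("All patients", "Has a visit",
                     "Has a diagnosis", "Age under 18"),
  num_pts = c(1000, 800, 400, 300)
)

# Assumed helper (not part of the package): derive the selectable
# variables for a single site, beginning at a user-chosen start step.
add_attrition_props <- function(tbl, start_step_num = 0) {
  tbl <- tbl[order(tbl$step_number), ]
  tbl <- tbl[tbl$step_number >= start_step_num, ]
  prior <- c(NA, head(tbl$num_pts, -1))            # count at the previous step
  tbl$prop_retained_prior <- tbl$num_pts / prior
  tbl$ct_diff_prior <- prior - tbl$num_pts         # assumed sign: patients lost
  tbl$prop_diff_prior <- tbl$ct_diff_prior / prior # assumed: proportion lost
  tbl$prop_retained_start <- tbl$num_pts / tbl$num_pts[1]
  tbl
}

result <- add_attrition_props(attrition, start_step_num = 0)
print(result[, c("step_number", "num_pts", "prop_retained_prior",
                 "prop_diff_prior", "prop_retained_start")])
```

The first retained step has no previous step, so its "prior" values are `NA`. Changing `start_step_num` rescales `prop_retained_start` against the count at that step.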

Raw Output

The raw data output of this check produces twenty-two columns of data:

| Column | Data Type | Definition |
| --- | --- | --- |
| num_pts | numeric | the number of patients associated with a given attrition step - provided in the attrition table by the user |
| step_number | numeric | an integer indicating the step associated with the attrition - provided in the attrition table by the user |
| attrition_step | character | a string describing the attrition step - provided in the attrition table by the user |
| site | character | the name of the site being targeted |
| ... | | any additional columns that were included in the user-provided attrition table will also appear in the output |
| prop_retained_prior | numeric | the proportion of patients retained from the previous attrition step |
| ct_diff_prior | numeric | the difference in patient count between a given attrition step and the previous step |
| prop_diff_prior | numeric | the proportion difference between a given attrition step and the previous step |
| prop_retained_start | numeric | the proportion of patients retained from the user-defined “start” attrition step |
| mean_val | numeric | the mean difference value (based on user selection) for each group across sites |
| median_val | numeric | the median difference value (based on user selection) for each group across sites |
| sd_val | numeric | the standard deviation of the difference value (based on user selection) for each group across sites |
| mad_val | numeric | the median absolute deviation of the difference value (based on user selection) for each group across sites |
| cov_val | numeric | the coefficient of variation of the difference value (based on user selection) for each group across sites |
| max_val | numeric | the maximum difference value (based on user selection) for each group across sites |
| min_val | numeric | the minimum difference value (based on user selection) for each group across sites |
| range_val | numeric | the range of the difference value (based on user selection) for each group across sites |
| total_ct | numeric | the total number of group members |
| analysis_eligible | character | a string indicating whether the group is eligible for anomaly detection analysis |
| lower_tail | numeric | the lower bound used to identify low anomalies |
| upper_tail | numeric | the upper bound used to identify high anomalies |
| anomaly_yn | character | a string indicating whether the value is anomalous or not |
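
As a rough illustration of how the cross-site summary columns relate to one another, the base R sketch below computes them for each attrition step from invented `prop_retained_start` values. The flagging rule (median plus or minus three MADs) is a simple stand-in chosen for this example, not the method the package uses, and the `"yes"`/`"no"` labels and the helper name `summarise_step` are likewise assumptions.

```r
# Hypothetical prop_retained_start values for four sites at two steps.
vals <- data.frame(
  site = rep(c("site_a", "site_b", "site_c", "site_d"), times = 2),
  step_number = rep(c(1, 2), each = 4),
  prop_retained_start = c(0.80, 0.82, 0.78, 0.80,
                          0.40, 0.42, 0.38, 0.05)
)

# Assumed helper: summary statistics for one step across sites.
summarise_step <- function(x) {
  c(mean_val = mean(x), median_val = median(x), sd_val = sd(x),
    mad_val = mad(x), cov_val = sd(x) / mean(x),
    max_val = max(x), min_val = min(x), range_val = max(x) - min(x),
    total_ct = length(x))
}

# One row of statistics per step, then join back onto the site-level rows.
stats <- do.call(rbind, lapply(split(vals$prop_retained_start,
                                     vals$step_number), summarise_step))
stats <- data.frame(step_number = as.numeric(rownames(stats)), stats)
out <- merge(vals, stats, by = "step_number")

# Stand-in anomaly rule (an assumption, not the package's method):
# flag values more than three MADs from the step median.
out$lower_tail <- out$median_val - 3 * out$mad_val
out$upper_tail <- out$median_val + 3 * out$mad_val
out$anomaly_yn <- ifelse(out$prop_retained_start < out$lower_tail |
                           out$prop_retained_start > out$upper_tail,
                         "yes", "no")

print(out[, c("step_number", "site", "prop_retained_start",
              "lower_tail", "upper_tail", "anomaly_yn")])
```

A median-based rule is used here because, with only a handful of sites, a single extreme value inflates the standard deviation enough that a mean-based bound may never flag it.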

Funder(s)

This research was made possible through the generous support of the Patient-Centered Outcomes Research Institute (PCORI). The statements presented in this work are solely the responsibility of the author(s) and do not necessarily represent the views of PCORI, its Board of Governors, or its Methodology Committee.

Provenance

Description

Clinical Subjects Headings

Related Data Quality Result

Cohort Attrition Study Results I: PAQS Query 3
Created: 2025-05-30
Affiliation: PEDSnet Data Coordinating Center
The results of a Cohort Attrition check using the Multi-Site, Anomaly Detection, Cross-Sectional parameters. This check investigates whether there were anomalous levels of patient retention when compared to the starting step (either step 0 or step 3) in a cohort of Type II Diabetes patients.

Related Person

Related Code

Study-Specific Quality, Utility, and Breadth Assessment
Created: 2025-11
Affiliation: PEDSnet Data Coordinating Center
This suite of R packages allows you to investigate multiple facets of data quality and customize analyses based on your study-specific needs. Each module allows up to 8 different analyses in either the OMOP or PCORnet CDM, each taking a different view of the data while still addressing the same data quality probe.

##### [View pkgdown summary here.](https://ssdqa.github.io/squba/)

Related Data Quality Check

Related Publications

Creative Commons license

Except where otherwise noted, this item's license is described as a CC-BY Attribution 4.0 License.

Cite this Data Quality Check

PEDSnet Data Coordinating Center. (2024, June). Cohort Attrition: Multi-Site, Anomaly Detection, Cross-Sectional Analysis. [D Q Check]. PEDSpace Knowledge Bank. https://doi.org/10.24373/pdsp-430