Cohort Attrition: Multi-Site, Anomaly Detection, Cross-Sectional Analysis
Abstract
This check provides an exploratory analysis of patient eligibility criteria for a study. It summarizes each step of attrition criteria in the cohort construction for a given site.
Data Requirements
Probe
Clinical Assessment
Access Package
# install.packages("devtools")
devtools::install_github('ssdqa/cohortattrition')
Visualization Output
This check offers either a dot plot visualization or a line graph with a reference table. The user inputs the attrition step at which the visualization should begin and selects a variable to facet by (num_pts, prop_retained_start, prop_retained_prior, or prop_diff_prior). Anomalous data points are represented with a star. Data point size is directly proportional to the mean value per step, and data point color corresponds to the proportion of the user-selected variable (e.g., prop_diff_prior). In the line graph output option, line color corresponds to the site. Hovering over the graph displays a tooltip with a description of the attrition step, the site name, mean, standard deviation, median, MAD, and the value of the user-selected variable.
Raw Output
The raw data output of this check produces twenty-two columns of data:
| Column | Data Type | Definition |
|---|---|---|
| num_pts | numeric | the number of patients associated with a given attrition step - provided in the attrition table by the user |
| step_number | numeric | an integer indicating the step associated with the attrition - provided in the attrition table by the user |
| attrition_step | character | a string describing the attrition step - provided in the attrition table by the user |
| site | character | the name of the site being targeted |
| ... | | any additional columns that were included in the user-provided attrition table will also appear in the output |
| prop_retained_prior | numeric | the proportion of patients retained from the previous attrition step |
| ct_diff_prior | numeric | the difference in patient count between a given attrition step and the previous step |
| prop_diff_prior | numeric | the proportion difference between a given attrition step and the previous step |
| prop_retained_start | numeric | the proportion of patients retained from the user-defined “start” attrition step |
| mean_val | numeric | the mean difference value (based on user selection) for each group across sites |
| median_val | numeric | the median difference value (based on user selection) for each group across sites |
| sd_val | numeric | the standard deviation of the difference value (based on user selection) for each group across sites |
| mad_val | numeric | the median absolute deviation of the difference value (based on user selection) for each group across sites |
| cov_val | numeric | the coefficient of variation of the difference value (based on user selection) for each group across sites |
| max_val | numeric | the maximum difference value (based on user selection) for each group across sites |
| min_val | numeric | the minimum difference value (based on user selection) for each group across sites |
| range_val | numeric | the range of the difference value (based on user selection) for each group across sites |
| total_ct | numeric | the total number of group members |
| analysis_eligible | character | a string indicating whether the group is eligible for anomaly detection analysis |
| lower_tail | numeric | the lower bound used to identify low anomalies |
| upper_tail | numeric | the upper bound used to identify high anomalies |
| anomaly_yn | character | a string indicating whether the value is anomalous or not |
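
To make the column definitions above concrete, the following base R sketch derives the per-step columns from a small invented attrition table and then summarizes one user-selected variable across sites for each step. It is an illustration under stated assumptions, not the package's implementation. The helper names (add_step_metrics, summarise_step) and the variables start_step and var_col are hypothetical. The sign of ct_diff_prior (previous count minus current count) and the denominator of prop_diff_prior (previous count) are assumptions, as is reading total_ct as the number of sites contributing to a group. The lower_tail, upper_tail, and anomaly_yn columns are not reproduced because the method behind them is not described here.

```r
# Toy attrition table with the four user-provided columns (all values invented).
attrition <- data.frame(
  site           = rep(c("site_a", "site_b", "site_c"), each = 3),
  step_number    = rep(0:2, times = 3),
  attrition_step = rep(c("All patients", "Has a visit", "Has a diagnosis"),
                       times = 3),
  num_pts        = c(1000, 800, 400,
                     2000, 1500, 1200,
                     500, 450, 90),
  stringsAsFactors = FALSE
)

start_step <- 0                      # user-defined "start" attrition step
var_col    <- "prop_retained_prior"  # user-selected variable to summarize

# Hypothetical helper: derive the per-step columns for one site's rows.
add_step_metrics <- function(df, start_step) {
  df    <- df[order(df$step_number), ]
  prior <- c(NA, head(df$num_pts, -1))  # patient count at the previous step
  start <- df$num_pts[df$step_number == start_step]
  df$prop_retained_prior <- df$num_pts / prior
  df$ct_diff_prior       <- prior - df$num_pts        # assumed sign: patients lost
  df$prop_diff_prior     <- df$ct_diff_prior / prior  # assumed denominator
  df$prop_retained_start <- df$num_pts / start
  df
}

per_site <- do.call(
  rbind,
  lapply(split(attrition, attrition$site), add_step_metrics,
         start_step = start_step)
)
rownames(per_site) <- NULL

# Hypothetical helper: summary statistics for one step across sites.
# Note: stats::mad() applies a 1.4826 scaling constant by default.
summarise_step <- function(x) {
  x <- x[!is.na(x)]
  c(mean_val   = mean(x),
    median_val = median(x),
    sd_val     = sd(x),
    mad_val    = mad(x),
    cov_val    = sd(x) / mean(x),
    max_val    = max(x),
    min_val    = min(x),
    range_val  = max(x) - min(x),
    total_ct   = length(x))  # assumed: number of sites in the group
}

# Group the selected variable by step; drop steps with no usable values
# (the first step has no previous step, so its prior-based metrics are NA).
by_step <- split(per_site[[var_col]], per_site$step_number)
by_step <- Filter(function(x) any(!is.na(x)), by_step)

summary_tbl <- as.data.frame(do.call(rbind, lapply(by_step, summarise_step)))
summary_tbl$step_number <- as.integer(rownames(summary_tbl))

print(per_site)
print(summary_tbl)
```

Changing var_col to prop_diff_prior or prop_retained_start summarizes a different variable with the same code. Merging summary_tbl back onto per_site by step_number would give one row per site and step, similar in shape to the raw output described above.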

