Patient Records Consistency: Multi Site, Anomaly Detection, Longitudinal Analysis
Abstract
This check provides analyses to identify anomalous data across time among multiple sites. The Patient Record Consistency module, part of the larger SSDQA ecosystem, tests the consistency of clinical data representation within a patient’s record. The goal is to ensure that the patient’s information is internally consistent and complete, such that two events that are expected to co-occur both appear within the same patient’s record (e.g., a leukemia diagnosis and chemotherapy).
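To make the idea of co-occurring events concrete, the following is a minimal sketch in base R, not the package's own implementation. It assigns each patient to one of the four combinations that the raw output reports in `stat_type` (A only, B only, both, or neither) and computes the proportion of patients in each. The data frame and the column names `has_event_a` and `has_event_b` are hypothetical and exist only for illustration.

```r
# Hypothetical patient-level flags; column names are illustrative only.
patients <- data.frame(
  patient_id  = 1:6,
  has_event_a = c(TRUE, TRUE, FALSE, FALSE, TRUE, FALSE),  # e.g. a leukemia diagnosis
  has_event_b = c(TRUE, FALSE, TRUE, FALSE, TRUE, FALSE)   # e.g. chemotherapy
)

# Assign each patient to exactly one of the four event combinations.
patients$stat_type <- with(patients, ifelse(
  has_event_a & has_event_b, "both",
  ifelse(has_event_a, "A only",
         ifelse(has_event_b, "B only", "neither"))
))

# Proportion of patients falling into each combination.
prop_event <- prop.table(table(patients$stat_type))
print(prop_event)
```

A low "both" proportion relative to "A only" or "B only" is the kind of signal this check is meant to surface, since it suggests that one of the two expected events is missing from many records.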
Data Requirements
Probe
Clinical Assessment
Access Package
# install.packages("devtools")
devtools::install_github('ssdqa/patientrecordconsistency')
Visualization Output
This check displays the Euclidean distance between two time series: the smoothed (Loess) proportion of a user-selected event category for a given site and the all-site average proportion for each time point. Three graphs are output:
- A line graph displaying the smoothed proportion of the event category at each site over time, with the Euclidean distance available in the tooltip when hovering over the line.
- A line graph displaying the raw (not smoothed) proportion of the event category at each site over time.
- A circular bar graph displaying the Euclidean distance from the all-site mean where the fill represents the average Loess proportion over time.
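The distance calculation described above can be sketched in base R as follows. This is an illustration under stated assumptions rather than the package's actual code: the toy data are randomly generated, the Loess fit uses the default settings of `stats::loess`, and the distance is assumed to be a single value per site computed across all time points.

```r
# Hypothetical toy data: three sites, twelve monthly time points.
set.seed(42)
toy <- expand.grid(
  site = c("site_a", "site_b", "site_c"),
  time_start = seq(as.Date("2020-01-01"), by = "month", length.out = 12),
  stringsAsFactors = FALSE
)
toy$prop_event <- runif(nrow(toy), min = 0.2, max = 0.6)

# Numeric version of the date, used as the Loess predictor.
toy$date_numeric <- as.numeric(toy$time_start)

# All-site mean proportion at each time point.
toy$mean_allsiteprop <- ave(toy$prop_event, toy$date_numeric, FUN = mean)

# Smooth each site's series with Loess (default span and degree; an assumption).
toy$site_loess <- NA_real_
for (s in unique(toy$site)) {
  idx <- toy$site == s
  fit <- loess(prop_event ~ date_numeric, data = toy[idx, ])
  toy$site_loess[idx] <- predict(fit)
}

# Euclidean distance between each site's smoothed series and the all-site mean.
dist_by_site <- sapply(split(toy, toy$site), function(d) {
  sqrt(sum((d$site_loess - d$mean_allsiteprop)^2))
})
print(dist_by_site)
```

A site whose smoothed series tracks the all-site mean closely yields a distance near zero, while a site that diverges consistently over time yields a larger value.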
Raw Output
This check produces a raw data output containing 9 columns of data:
| Column | Data Type | Definition |
|---|---|---|
| site | character | the name of the site being targeted |
| time_start | date | the start of the time period being examined |
| stat_type | character | string indicating the event combination of interest: A only, B only, both, or neither |
| prop_event | numeric | the proportion of patients meeting the criteria for stat_type in the time period of interest |
| mean_allsiteprop | numeric | the average patient proportion for the stat_type across sites |
| median | numeric | the median patient proportion for the stat_type across sites |
| date_numeric | numeric | the numeric equivalent of time_start |
| site_loess | numeric | the patient proportion for the stat_type with Loess regression applied |
| dist_eucl_mean | numeric | the Euclidean distance of site_loess from mean_allsiteprop |
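As a hedged illustration of how this output might be explored, the sketch below ranks sites by `dist_eucl_mean` for one `stat_type`. The `raw_output` data frame is a hypothetical stand-in containing only a subset of the columns, its values are made up, and it assumes that `dist_eucl_mean` repeats on every row for a given site.

```r
# Hypothetical stand-in for the raw output; values are made up for illustration.
raw_output <- data.frame(
  site = rep(c("site_a", "site_b", "site_c"), each = 2),
  time_start = rep(as.Date(c("2020-01-01", "2020-02-01")), times = 3),
  stat_type = "both",
  dist_eucl_mean = rep(c(0.12, 0.45, 0.08), each = 2),
  stringsAsFactors = FALSE
)

# Keep one row per site for the event combination of interest.
site_dist <- unique(
  raw_output[raw_output$stat_type == "both", c("site", "dist_eucl_mean")]
)

# Order sites so the largest distance from the all-site mean comes first.
site_dist <- site_dist[order(site_dist$dist_eucl_mean, decreasing = TRUE), ]
print(site_dist)
```

Sites at the top of this ordering are the ones whose smoothed proportions differ most from the all-site mean and may merit closer review.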

