Patient Records Consistency: Single Site, Anomaly Detection, Longitudinal Analysis
dc.contributor | Patient-Centered Outcomes Research Institute |
dc.contributor.author | PEDSnet |
dc.contributor.author | Wieand, Kaleigh |
dc.contributor.author | Bailey, Charles |
dc.contributor.author | Razzaghi, Hanieh |
dc.contributor.author | Dickinson, Kimberley |
dc.contributor.other | Children's Hospital of Philadelphia |
dc.date.accessioned | 2025-01-08T16:06:29Z |
dc.date.created | 2024-12-17 |
dc.description.abstract | This check identifies anomalous data over time at the level of a single site. The Patient Record Consistency module, part of the larger SSDQA ecosystem, tests whether clinical data are represented consistently within a patient's record. The goal is to ensure that each patient's information is confirmatory and complete, such that two events expected to co-occur (e.g., a leukemia diagnosis and chemotherapy) both appear in the same patient's record. |
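The co-occurrence idea can be illustrated with a minimal base R sketch. It is not the module's code: the `pts` data frame, its flag columns, and the six example patients are hypothetical, and the labels `a`, `b`, `both`, and `neither` are borrowed from the event categories named elsewhere in this record.

```r
# Hypothetical patient-level flags (illustrative data only)
pts <- data.frame(
  person_id = 1:6,
  event_a   = c(TRUE, TRUE, FALSE, TRUE, FALSE, FALSE),  # e.g. leukemia diagnosis
  event_b   = c(TRUE, FALSE, TRUE, TRUE, FALSE, FALSE)   # e.g. chemotherapy
)

# Assign each patient to exactly one event combination
pts$stat_type <- with(pts, ifelse(event_a & event_b, "both",
                           ifelse(event_a, "a",
                           ifelse(event_b, "b", "neither"))))

# Count patients per combination and convert counts to cohort proportions
counts <- table(factor(pts$stat_type, levels = c("a", "b", "both", "neither")))
summary_tbl <- data.frame(
  stat_type  = names(counts),
  stat_ct    = as.integer(counts),
  total_pts  = nrow(pts),
  prop_event = as.integer(counts) / nrow(pts)
)
print(summary_tbl)
```

A low `both` proportion alongside a high `a` proportion would suggest that event B is missing from records where it is clinically expected.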
dc.description.abstract | #### How to Access This Check
1. Access the module's R package on [GitHub](https://github.com/ssdqa/patientrecordconsistency).<br> Or install it from R:
   ```{r}
   # requires the devtools package: install.packages('devtools')
   devtools::install_github('ssdqa/patientrecordconsistency')
   ```
2. Using the vignettes provided on GitHub or the help pages in R, follow the parameter input instructions for the "Single-Site", "Anomaly Detection", and "Longitudinal Analysis" requirements. |
dc.identifier.uri | https://pedsnet.org/metadata/handle/20.500.14642/930 |
dc.publisher | PEDSnet |
dc.relation.uri | https://github.com/ssdqa/patientrecordconsistency |
dc.rights | CC-BY Attribution 4.0 License |
dc.rights.uri | http://creativecommons.org/licenses/by/4.0 |
dc.subject | Data Quality Check Categorizations::Data Quality Category::Concordance |
dc.subject | Data Quality Check Categorizations::Data Quality Category::Consistency |
dc.subject | Data Quality Check Categorizations::Dataset Evaluation Strategy::Data Source Comparison::Single Site Analysis |
dc.subject | Data Quality Check Categorizations::Dataset Evaluation Strategy::Data Anomaly Method |
dc.subject | Data Quality Check Categorizations::Dataset Evaluation Strategy::Temporal Evaluation::Longitudinal Analysis |
dc.subject | Data Quality Check Categorizations::Error Detection Approach::Data Quality Probe::Missing Expected Data |
dc.subject | Data Quality Check Categorizations::Error Detection Approach::Data Quality Probe::Misclassification Detection |
dc.subject | Data Quality Check Categorizations::Error Detection Approach::Clinical Probe::Confirmatory Clinical Data |
dc.subject | Data Quality Check Categorizations::Error Detection Approach::Clinical Probe::Clinical Consistency |
dc.subject | Data Quality Check Categorizations::Error Detection Approach::Data Quality Probe::Temporality Consistency Check |
dc.subject | Data Quality Check Categorizations::Dataset Evaluation Strategy::Data Visualization::Control Chart |
dc.subject | Data Quality Check Categorizations::Dataset Evaluation Strategy::Data Anomaly Method::Time Series Anomalies |
dc.subject | Data Quality Check Categorizations::Dataset Evaluation Strategy::Data Anomaly Method::Seasonal-Trend Decomposition Using LOESS |
dc.title | Patient Records Consistency: Single Site, Anomaly Detection, Longitudinal Analysis |
dspace.entity.type | DQCheck |
local.description.raw | This check produces a raw data output containing 9 columns for analyses over annual intervals: <br>

|Column          |Data Type|Definition                                                                                  |
|----------------|---------|--------------------------------------------------------------------------------------------|
|`site`          |character|the name of the site being targeted OR "combined" if multiple sites were provided           |
|`time_start`    |date     |the start of the time period being examined                                                 |
|`time_increment`|character|the length of each time period                                                              |
|`event_a_name`  |character|the name of event A                                                                         |
|`event_b_name`  |character|the name of event B                                                                         |
|`total_pts`     |numeric  |the total number of eligible patients in the cohort during the time period                  |
|`stat_type`     |character|string indicating the event combination of interest: A only, B only, both, or neither       |
|`stat_ct`       |numeric  |the count of patients meeting the criteria for stat_type in the time period of interest     |
|`prop_event`    |numeric  |the proportion of patients meeting the criteria for stat_type in the time period of interest|
{.dqcheck-table}

<br> For analyses over monthly or weekly intervals, it produces 11 columns: <br>

|Column             |Data Type|Definition                                                                                                      |
|-------------------|---------|----------------------------------------------------------------------------------------------------------------|
|`observed`         |numeric  |the original proportion of patients                                                                             |
|`season`           |numeric  |the seasonal component of the time series                                                                       |
|`trend`            |numeric  |the trend component of the time series                                                                          |
|`remainder`        |numeric  |the residual component after "season" and "trend" are removed from "observed" - target of anomaly detection     |
|`seasadj`          |numeric  |the adjusted seasonal component                                                                                 |
|`anomaly`          |character|a flag to indicate whether the proportion is an anomaly                                                         |
|`anomaly_direction`|numeric  |the direction of the anomaly (upper or lower)                                                                   |
|`anomaly_score`    |numeric  |the distance between the anomaly and the centerline                                                             |
|`recomposed_l1`    |numeric  |the lower level bound of the processed time series used to identify lower outliers                              |
|`recomposed_l2`    |numeric  |the upper level bound of the processed time series used to identify upper outliers                              |
|`observed_clean`   |numeric  |the original proportion after the season and trend components have been removed and anomalies have been detected|
{.dqcheck-table} |
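To show how columns such as `observed`, `season`, `trend`, and `remainder` relate, here is a minimal sketch using base R's `stats::stl()`. It is an illustration under stated assumptions rather than the module's implementation: the monthly proportions are simulated, the anomaly rule (fences at 3 times the interquartile range of the remainder) is an illustrative choice, and the numeric coding of `anomaly_direction` as 1, -1, or 0 is assumed.

```r
# Simulated monthly proportions over 5 years (hypothetical data)
set.seed(42)
n <- 60
month_idx <- seq_len(n)
observed <- 0.60 + 0.001 * month_idx +        # slow upward trend
  0.05 * sin(2 * pi * month_idx / 12) +       # yearly seasonality
  rnorm(n, sd = 0.01)                         # random noise
observed[30] <- observed[30] + 0.15           # inject one artificial anomaly

# Seasonal-trend decomposition using LOESS (base R stats::stl)
fit <- stl(ts(observed, frequency = 12), s.window = "periodic", robust = TRUE)
season    <- as.numeric(fit$time.series[, "seasonal"])
trend     <- as.numeric(fit$time.series[, "trend"])
remainder <- as.numeric(fit$time.series[, "remainder"])

# IQR fences on the remainder; the multiplier of 3 is an assumption
q <- quantile(remainder, c(0.25, 0.75), names = FALSE)
iqr <- q[2] - q[1]
lower <- q[1] - 3 * iqr
upper <- q[2] + 3 * iqr

decomp <- data.frame(
  observed  = observed,
  season    = season,
  trend     = trend,
  remainder = remainder,
  anomaly   = ifelse(remainder < lower | remainder > upper, "Yes", "No"),
  # assumed coding: 1 = above upper fence, -1 = below lower fence, 0 = none
  anomaly_direction = ifelse(remainder > upper, 1,
                      ifelse(remainder < lower, -1, 0))
)
print(decomp[decomp$anomaly == "Yes", ])
```

Because the seasonal and trend components are subtracted before flagging, a regular seasonal peak is not reported as an anomaly, while the injected spike in month 30 is.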
local.description.viz | This check's visual output depends on the time increment input by the user. <br><br>For yearly time increments, this check outputs a control chart that highlights anomalies in the proportion of patients per event category. A `P Prime` chart is used to account for the large sample sizes: the standard deviation that sets the control limits is multiplied by a numerical constant so that the limits are not unrealistically narrow. Blue dots along the line indicate non-anomalous values, while orange dots are anomalies. Only one event category, specified via the `event_filter` parameter, should be displayed on the graph. Any of the four event categories in the raw output may be chosen: `a`, `b`, `both`, or `neither`.<br><br>For smaller time increments (monthly or shorter), seasonality can make it difficult to detect true anomalies in a time series. This output computes anomalies after removing seasonality and produces 2 graphs:
1. A time series line graph with anomalies highlighted by red dots.
2. A 4-facet time series line graph showing the decomposition of the series, to make clearer how the anomalies were identified. |
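The adjustment behind a P Prime chart can be sketched with the commonly used Laney p' formulation. This is a sketch under assumptions: whether the module computes its limits exactly this way is not confirmed by this record, and the yearly counts below are invented for illustration. The "numerical constant" here is `sigma_z`, estimated from the average moving range of the standardized proportions.

```r
# Hypothetical yearly counts for one event category (illustrative only)
yr        <- 2015:2024
total_pts <- c(52000, 54000, 55000, 57000, 60000, 61000, 63000, 64000, 66000, 68000)
stat_ct   <- c(31200, 32900, 33300, 34800, 36100, 30500, 38200, 39000, 40100, 41500)

p_hat   <- stat_ct / total_pts                      # observed proportion per year
p_bar   <- sum(stat_ct) / sum(total_pts)            # centerline
sigma_p <- sqrt(p_bar * (1 - p_bar) / total_pts)    # binomial standard deviation

# Standardize, then estimate between-period variation from the moving range
z       <- (p_hat - p_bar) / sigma_p
sigma_z <- mean(abs(diff(z))) / 1.128               # 1.128 = d2 for ranges of size 2

# Laney p' limits: binomial sigma scaled by sigma_z
ucl <- p_bar + 3 * sigma_p * sigma_z
lcl <- p_bar - 3 * sigma_p * sigma_z
anomaly <- p_hat > ucl | p_hat < lcl

# For comparison: classic p-chart limits, which ignore sigma_z
classic_flag <- abs(z) > 3

print(data.frame(yr, p_hat, lcl, ucl, anomaly, classic_flag))
```

With tens of thousands of patients per year, the classic limits are so tight that ordinary year-to-year variation is flagged; scaling by `sigma_z` widens the limits so that only the sharp drop in 2020 stands out.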
local.dqcheck.requirement | cohort |
local.dqcheck.requirement | prc_event_file |
local.dqcheck.requirement | omop_or_pcornet |
local.dqcheck.requirement | multi_or_single_site |
local.dqcheck.requirement | anomaly_or_exploratory |
local.dqcheck.requirement | age_groups |
local.dqcheck.requirement | patient_level_tbl |
local.dqcheck.requirement | fu_breaks |
local.dqcheck.requirement | p_value |
local.dqcheck.requirement | time |
local.dqcheck.requirement | time_span |
local.dqcheck.requirement | time_period |
local.subject.flat | Single Site Analysis |
local.subject.flat | Data Anomaly Method |
local.subject.flat | Longitudinal Analysis |
local.subject.flat | Person-Level Analysis |
local.subject.flat | Concordance |
local.subject.flat | Missing Expected Data |
local.subject.flat | Misclassification Detection |
local.subject.flat | Confirmatory Clinical Data |
local.subject.flat | Clinical Consistency |
local.subject.flat | Expected Clinical Event Representation |
local.subject.flat | Consistency |
local.subject.flat | Temporality Consistency Check |
local.subject.flat | Control Chart |
local.subject.flat | Time Series Anomalies |
local.subject.flat | Seasonal-Trend Decomposition using LOESS |