Expected Variables Present: Multi-Site, Anomaly Detection, Cross-Sectional Analysis
| dc.contributor | Patient-Centered Outcomes Research Institute |
| dc.contributor.author | PEDSnet Data Coordinating Center |
| dc.contributor.other | PEDSnet Data Coordinating Center |
| dc.date.accessioned | 2024-09-09T17:20:49Z |
| dc.date.created | 2024-06-05 |
| dc.description.abstract | This check provides raw data and visualizations to aid a user in evaluating whether expected concepts are present in a dataset of interest. It summarizes the proportion of patients with co-occurring variables. This check promotes the identification of anomalous data to compare among sites. |
| dc.identifier.uri | https://hdl.handle.net/20.500.14642/779 |
| dc.identifier.uri | https://doi.org/10.24373/pdsp-461 |
| dc.publisher | PEDSnet |
| dc.relation.uri | https://github.com/ssdqa/expectedvariablespresent |
| dc.rights | CC-BY Attribution 4.0 License |
| dc.rights.uri | http://creativecommons.org/licenses/by/4.0 |
| dc.subject | Multi-Site Analysis |
| dc.subject | Data Anomaly Method |
| dc.subject | Cross-Sectional Analysis |
| dc.subject | Person-Level Analysis |
| dc.title | Expected Variables Present: Multi-Site, Anomaly Detection, Cross-Sectional Analysis |
| dspace.entity.type | DQCheck |
| local.code.package | # install.packages("devtools") <br> devtools::install_github('ssdqa/expectedvariablespresent') |
| local.description.raw | The raw data output of this check produces twenty-one columns of data: <br> | Column | Data Type | Definition | |-------------------|-----------|-----------------------------------------------------------------------------------------------------------------------------| |`site` | character | the name of the site being targeted | |`total_pt_ct` | numeric | the total number of patients from the cohort in the domain table | |`total_row_ct` | numeric | the total number of rows associated with patients from the cohort in the domain table | |`variable_pt_ct` | numeric | the number of patients with evidence of the variable | |`variable_row_ct` | numeric | the number of rows with evidence of the variable | |`prop_pt_variable` | numeric | the proportion of patients with evidence of the variable | |`prop_row_variable` | numeric | the proportion of rows with evidence of the variable | |`variable` | character | the name of the variable | |`mean_val` | numeric | the mean proportion of patients or rows (based on user selection) for each group across sites | |`median_val` | numeric | the median proportion of patients or rows (based on user selection) for each group across sites | |`sd_val` | numeric | the standard deviation of the proportion of patients or rows (based on user selection) for each group across sites | |`mad_val` | numeric | the median absolute deviation of the proportion of patients or rows (based on user selection) for each group across sites | |`cov_val` | numeric | the coefficient of variation of the proportion of patients or rows (based on user selection) for each group across sites | |`max_val` | numeric | the maximum proportion of patients or rows (based on user selection) for each group across sites | |`min_val` | numeric | the minimum proportion of patients or rows (based on user selection) for each group across sites | |`range_val` | numeric | the range of the proportion of patients or rows (based on user selection) for each group across sites | |`total_ct` | numeric | the total number of group members | |`analysis_eligible` | character | a string indicating whether the group is eligible for anomaly detection analysis | |`lower_tail` | numeric | the lower bound used to identify low anomalies | |`upper_tail` | numeric | the upper bound used to identify high anomalies | |`anomaly_yn` | character | a string indicating whether the value is anomalous or not | {.dqcheck-table} |
| local.description.viz | This check outputs a dot plot representing anomalous proportions of patients (or rows) with a given variable per site. The dot size shows the median absolute deviation (MAD) value for the `concept_id`, the dot color shows how often that `concept_id` is used proportionally, and a star replaces the dot when that `concept_id` is anomalous. On hover, a tooltip provides metadata for the mapped concept and the site, along with precise values for the proportion, mean proportion, median proportion, standard deviation, and MAD. |
| local.dqcheck.category | Completeness |
| local.dqcheck.clinicalprobe | Confirmatory Clinical Data |
| local.dqcheck.clinicalprobe | Clinical Follow-Up |
| local.dqcheck.clinicalprobe | Clinical Complexity |
| local.dqcheck.measurement | Hotspots Outlier Detection |
| local.dqcheck.probe | Data Representation Errors |
| local.dqcheck.probe | Misclassification Detection |
| local.dqcheck.probe | External Benchmarking |
| local.dqcheck.probe | Missing Required Data |
| local.dqcheck.requirement | cohort |
| local.dqcheck.requirement | omop_or_pcornet |
| local.dqcheck.requirement | evp_variable_file |
| local.dqcheck.requirement | multi_or_single_site |
| local.dqcheck.requirement | anomaly_or_exploratory |
| local.dqcheck.requirement | output_level |
| local.dqcheck.requirement | age_groups |
| local.dqcheck.requirement | p_value |
| local.dqcheck.requirement | time |
| local.dqcheck.requirement | time_span |
| local.dqcheck.requirement | time_period |
| local.dqcheck.type | Variable Testing |
| local.dqcheck.viz | Dot and Star Plot |
| relation.isCodeOfDQCheck | 929c8dfc-2c8b-4e62-8e1d-0fa06c542832 |
| relation.isCodeOfDQCheck.latestForDiscovery | 929c8dfc-2c8b-4e62-8e1d-0fa06c542832 |
| relation.isDQResultOfDQCheck | cb11c990-01dd-4191-b116-979d6016a514 |
| relation.isDQResultOfDQCheck.latestForDiscovery | cb11c990-01dd-4191-b116-979d6016a514 |
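
The cross-site summary columns in the raw output (`mean_val`, `median_val`, `sd_val`, `mad_val`, `cov_val`, `max_val`, `min_val`, `range_val`) can be illustrated with a small sketch. This is plain Python for illustration only, not the package's R code. The site names, the proportions, and the `summarize` helper are hypothetical. The sketch assumes a sample standard deviation, an unscaled median absolute deviation, and a coefficient of variation defined as standard deviation divided by mean; the package may make different choices. The anomaly bounds (`lower_tail`, `upper_tail`) are not reproduced, because the record does not state how they are derived.

```python
from statistics import mean, median, stdev

# Hypothetical per-site proportions of patients with one variable
# (the `prop_pt_variable` column); values are made up for illustration.
site_props = {"site_a": 0.40, "site_b": 0.50, "site_c": 0.60, "site_d": 0.90}


def summarize(props):
    """Return cross-site summary statistics keyed by the output column names."""
    values = list(props.values())
    avg = mean(values)
    med = median(values)
    sd = stdev(values)  # assumption: sample (n - 1) standard deviation
    return {
        "mean_val": avg,
        "median_val": med,
        "sd_val": sd,
        # Assumption: unscaled median absolute deviation around the median.
        "mad_val": median(abs(v - med) for v in values),
        # Coefficient of variation: standard deviation relative to the mean.
        "cov_val": sd / avg,
        "max_val": max(values),
        "min_val": min(values),
        "range_val": max(values) - min(values),
    }


summary = summarize(site_props)
for column, value in summary.items():
    print(f"{column}: {value:.4f}")
```

In this made-up example, `site_d` sits well above the other three sites, which is the kind of value the check's anomaly flag (`anomaly_yn`) is meant to surface.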
