Data Quality Check Domain

Duplicate Record Check

The Duplicate Records module lets the user specify a set of columns to include or exclude when defining what counts as a duplicate. The module then reports the proportion of rows and the proportion of patients with duplicates, as well as the median number of duplicates per patient.
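As a rough illustration of the three metrics described above, the following Python sketch computes them over a list of row dictionaries. It is not the module's actual implementation: the function name `duplicate_summary`, its parameters, and the column names in the sample data are all hypothetical. It also assumes that every row in a group of identical rows counts as a duplicate, and that the median is taken only over patients who have at least one duplicate; the module itself may define these differently.

```python
from collections import Counter
from statistics import median


def duplicate_summary(rows, patient_key, include=None, exclude=None):
    """Summarize duplication in `rows` (a non-empty list of dicts).

    Hypothetical helper, not the module's API. Rows are compared on the
    `include` columns if given, otherwise on all columns except `exclude`.
    """
    if not rows:
        raise ValueError("rows must not be empty")

    columns = list(rows[0].keys())
    if include is not None:
        key_cols = [c for c in columns if c in include]
    else:
        key_cols = [c for c in columns if c not in (exclude or ())]

    # Two rows are duplicates when they agree on every comparison column.
    signatures = [tuple(row[c] for c in key_cols) for row in rows]
    counts = Counter(signatures)

    # Assumption: every row in a group of size > 1 is flagged as a duplicate.
    is_dup = [counts[sig] > 1 for sig in signatures]

    patients = set()
    dups_per_patient = Counter()
    for row, dup in zip(rows, is_dup):
        patients.add(row[patient_key])
        if dup:
            dups_per_patient[row[patient_key]] += 1

    return {
        "prop_rows": sum(is_dup) / len(rows),
        "prop_patients": len(dups_per_patient) / len(patients),
        # Assumption: median over patients with at least one duplicate row.
        "median_dups_per_patient": (
            median(dups_per_patient.values()) if dups_per_patient else 0
        ),
    }


# Made-up sample data; `row_id` is unique per row, so it is excluded.
sample_rows = [
    {"patient_id": 1, "visit_date": "2020-01-01", "code": "A", "row_id": 1},
    {"patient_id": 1, "visit_date": "2020-01-01", "code": "A", "row_id": 2},
    {"patient_id": 2, "visit_date": "2020-02-01", "code": "B", "row_id": 3},
    {"patient_id": 3, "visit_date": "2020-03-01", "code": "C", "row_id": 4},
    {"patient_id": 3, "visit_date": "2020-03-01", "code": "C", "row_id": 5},
    {"patient_id": 3, "visit_date": "2020-03-01", "code": "C", "row_id": 6},
]

summary = duplicate_summary(sample_rows, "patient_id", exclude=["row_id"])
print(summary)
```

In this sample, five of the six rows belong to a duplicated group and two of the three patients have duplicates. Including a unique column such as `row_id` in the comparison would yield no duplicates at all, which is why the choice of included or excluded columns matters.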

Access technical code and implementation requirements for this module on GitHub.

The checks in this collection identify duplicate rows or values in a given dataset.