Automated Feature Extraction from Transcranial Doppler Procedure Notes Using Natural Language Processing: A Multi-Center Study


Study Dates

2023 - 2026

Publisher

PEDSnet

Abstract

This project aims to demonstrate the feasibility of using automated methods to extract transcranial Doppler velocities and results from multiple hospitals in the PEDSnet network. The goal is to confirm the validity of data abstraction at a minimum of 5 hospitals.

Funder(s)

This research was made possible through the generous support of the Patient-Centered Outcomes Research Institute (PCORI) and the Agency for Healthcare Research and Quality (AHRQ) 1P30HS029755. The statements presented in this work are solely the responsibility of the author(s) and do not necessarily represent the views of PCORI, its Board of Governors, or its Methodology Committee.

Description

Efforts to reduce the burden of abnormal transcranial Doppler results in patients with sickle cell disease (SCD) have been hampered by the difficulty of ascertaining those results, which are typically stored as unstructured free-text radiology reports. Multi-center studies examining transcranial Doppler results have required labor-intensive manual review [5]. Natural language processing (NLP) offers a scalable and reproducible methodology that has shown high performance in extracting clinical features from clinical notes and pathology reports. The goal of this study is to develop an NLP-based tool and workflow to extract transcranial Doppler results from radiology reports across multiple PEDSnet institutions. We will then use these reports to describe the epidemiology and outcomes of SCD-related transcranial Doppler abnormalities in a modern cohort.

Study Design

This is a retrospective cohort study to determine the feasibility of using NLP techniques to extract transcranial Doppler velocities and classify transcranial Doppler outcomes. The overall design of this aim will be as follows:

  1. Leverage the PEDSnet data pipeline to obtain transcranial Doppler clinical notes from participating sites
  2. Populate OMOP/PEDSnet NOTE table for transcranial Doppler (free text, no extracted features)
  3. Abstract transcranial Doppler notes to obtain velocities and ascertain outcomes
  4. Manually validate a sample of transcranial Doppler reports at each site and, if needed, modify our transcranial Doppler natural language processing methods.
  5. Use extracted data to complete analyses of a related PEDSnet project, Creation of a Computable Phenotype in Childhood-Onset Arterial Ischemic Stroke (CAIS).
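
The manual validation in step 4 implies comparing NLP-assigned outcomes against reviewer-assigned outcomes for the same reports. The record does not say which agreement statistics will be used, so the following is only a minimal sketch that assumes observed agreement and Cohen's kappa; the function name `agreement` and the sample labels are hypothetical.

```python
from collections import Counter


def agreement(manual: list[str], nlp: list[str]) -> tuple[float, float]:
    """Return (observed agreement, Cohen's kappa) for paired outcome labels."""
    if len(manual) != len(nlp) or not manual:
        raise ValueError("label lists must be non-empty and of equal length")
    n = len(manual)
    # Fraction of reports where the reviewer and the NLP tool agree.
    observed = sum(m == p for m, p in zip(manual, nlp)) / n
    # Agreement expected by chance, from each rater's label frequencies.
    manual_counts = Counter(manual)
    nlp_counts = Counter(nlp)
    expected = sum(manual_counts[k] * nlp_counts[k] for k in manual_counts) / (n * n)
    if expected == 1.0:
        # Both raters used one identical label throughout; kappa is undefined,
        # so report perfect agreement by convention (an assumption of this sketch).
        return observed, 1.0
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


# Hypothetical validation sample of four reports from one site.
manual_labels = ["normal", "normal", "abnormal", "conditional"]
nlp_labels = ["normal", "normal", "abnormal", "normal"]
observed, kappa = agreement(manual_labels, nlp_labels)
```

Computing the statistic per site, as the step describes, would show whether the extraction methods need site-specific adjustment before pooling results.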

Patients with a transcranial Doppler order will be identified in the PEDSnet database for participating institutions. Free-text data for these patients will be extracted and de-identified locally by sites using the TiDE program. The de-identified data will be incorporated into OMOP/PEDSnet CDM tables in the centralized PEDSnet database. The team will leverage NLP tools, such as machine learning-based feature extraction using a pre-trained model, to extract transcranial Doppler velocities and classify transcranial Doppler outcomes as normal, abnormal, conditional, indeterminate, or not performed.
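
To make the extraction and classification task concrete, here is a minimal rule-based sketch. It is not the study's method, which is described as machine learning-based with a pre-trained model; a regular expression is swapped in purely for illustration. The report wording, the patterns, and the function names are hypothetical. The cut-offs assume the commonly cited STOP trial criteria (below 170 cm/s normal, 170 to 199 cm/s conditional, 200 cm/s or above abnormal) applied to the highest velocity found, ignoring which vessel was measured. Mapping a report with no velocities to "indeterminate" is likewise an assumption.

```python
import re

# Hypothetical pattern: a 2-3 digit number followed by "cm/s" or "cm/sec".
VELOCITY_RE = re.compile(r"\b(\d{2,3}(?:\.\d+)?)\s*cm\s*/\s*s(?:ec)?\b", re.IGNORECASE)
# Hypothetical phrases signalling that the study was not done.
NOT_PERFORMED_RE = re.compile(r"\bnot performed\b|\bunable to perform\b", re.IGNORECASE)


def extract_velocities(text: str) -> list[float]:
    """Return every velocity (cm/s) mentioned in a report, in order of appearance."""
    return [float(v) for v in VELOCITY_RE.findall(text)]


def classify(text: str) -> str:
    """Assign one of the five outcome categories to a report (simplified rules)."""
    if NOT_PERFORMED_RE.search(text):
        return "not performed"
    velocities = extract_velocities(text)
    if not velocities:
        return "indeterminate"  # assumption: no usable readings in the report
    peak = max(velocities)
    if peak >= 200:  # assumed STOP threshold for abnormal
        return "abnormal"
    if peak >= 170:  # assumed STOP threshold for conditional
        return "conditional"
    return "normal"


report = "Right MCA 182 cm/s, left MCA 150 cm/sec."  # hypothetical report text
print(extract_velocities(report))  # [182.0, 150.0]
print(classify(report))            # conditional
```

A rule-based baseline like this is easy to audit against manual review, but it breaks on phrasing it has not seen, which is one reason a trained model may be preferred for reports from multiple institutions.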

Data Source

PEDSnet Production Database (2024-04)
Created: 2024-04
Affiliation: PEDSnet Data Coordinating Center
PEDSnet production database containing de-identified aggregate electronic healthcare information for ten contributing pediatric healthcare institutions. This database corresponds to Version 5.3 of the PEDSnet data model.

#### **Learn how to access PEDSnet data for research [here](https://pedsnet.org/database/access-to-data/).**
#### **Submit a [collaboration request](https://pedsnet.org/work-with-us/collaboration-request/).**

Related Person

Parent Partner

Related Study

Creation of a Computable Phenotype in Childhood-Onset Arterial Ischemic Stroke (CAIS)
Affiliation: Children's Hospital of Philadelphia; Nemours Children's Health
The primary aim of this study is to use discrete data elements in the PEDSnet database to create a valid, rule-based computable phenotype for childhood arterial ischemic stroke (CAIS). The phenotype should identify all types of CAIS, an entity that spans multiple stroke subtypes and at-risk patient populations. Patients with sickle cell disease (SCD) are of particular interest: they present to care frequently for pain and other disease complications, and the presentation and evaluation of stroke in SCD may differ from all-cause stroke.

Creative Commons license

Except where otherwise noted, this item's license is described as a CC-BY Attribution 4.0 license.

Cite this Study

Dain, A. & Beslow, L. Automated Feature Extraction from Transcranial Doppler Procedure Notes Using Natural Language Processing: A Multi-Center Study. [Study]. PEDSpace Knowledge Bank. https://hdl.handle.net/20.500.14642/1662
