Automated Feature Extraction from Transcranial Doppler Procedure Notes Using Natural Language Processing: A Multi-Center Study
| dc.contributor | Patient-Centered Outcomes Research Institute |
| dc.contributor | Agency for Healthcare Research and Quality (AHRQ) |
| dc.contributor.author | Dain, Aleksandra (Sarah) |
| dc.contributor.author | Beslow, Lauren |
| dc.contributor.other | Nemours Children's Health |
| dc.date.accessioned | 2026-06-08T16:45:34Z |
| dc.description | Efforts to reduce the burden of abnormal transcranial Doppler in patients with sickle cell disease (SCD) have been hampered by challenges in ascertaining transcranial Doppler results, which are typically stored as unstructured data elements consisting of free-text radiology reports. Multi-center studies examining transcranial Doppler results have required labor-intensive manual review.5 Natural language processing (NLP) offers a scalable and reproducible methodology that has shown high performance in extracting clinical features from clinical notes and pathology reports. The goal of this study is to develop an NLP-based tool and workflow to extract transcranial Doppler results from radiology reports across multiple PEDSnet institutions. We will then use these reports to describe the epidemiology and outcomes of SCD-related transcranial Doppler abnormalities in a modern cohort. **Study Design:** This is a retrospective cohort study to determine the feasibility of using NLP techniques to extract transcranial Doppler velocities and classify transcranial Doppler outcomes. The overall design of this aim will be as follows: 1. Leverage the PEDSnet data pipeline to obtain transcranial Doppler clinical notes from participating sites. 2. Populate the OMOP/PEDSnet NOTE table for transcranial Doppler (free text, no extracted features). 3. Abstract transcranial Doppler notes to obtain velocities and ascertain outcomes. 4. Manually validate a sample of transcranial Doppler reports at each site and, if needed, modify our transcranial Doppler natural language processing methods. 5. Use extracted data to complete analyses of a related PEDSnet project, [Creation of a Computable Phenotype in Childhood-Onset Arterial Ischemic Stroke (CAIS)](https://hdl.handle.net/20.500.14642/824). Patients with a transcranial Doppler order will be identified in the PEDSnet database for participating institutions. Free-text data from these identified patients will be extracted and de-identified locally by sites using the TiDE program. The de-identified data will be incorporated into OMOP/PEDSnet CDM tables in the centralized PEDSnet database. The team will leverage NLP tools, such as machine learning-based feature extraction using a pre-trained model, to extract transcranial Doppler velocities and classify transcranial Doppler outcomes as normal, abnormal, conditional, indeterminate, or not performed. |
| dc.description.abstract | This project aims to demonstrate the feasibility of using automated methods to extract transcranial Doppler velocities and results from multiple hospitals in the PEDSnet network. The goal is to confirm the validity of data abstraction at a minimum of 5 hospitals. |
| dc.identifier.uri | https://hdl.handle.net/20.500.14642/1662 |
| dc.publisher | PEDSnet |
| dc.rights | CC BY 4.0 (Creative Commons Attribution 4.0 International) license. |
| dc.rights.uri | https://creativecommons.org/licenses/by/4.0/ |
| dc.subject | PCORI-Funded Research |
| dc.subject | Grant-Funded Research |
| dc.subject | Cohort Study |
| dc.subject | Retrospective Study |
| dc.subject.mesh | Ultrasonography, Doppler, Transcranial |
| dc.subject.mesh | Natural Language Processing |
| dc.subject.mesh | Large Language Models |
| dc.title | Automated Feature Extraction from Transcranial Doppler Procedure Notes Using Natural Language Processing: A Multi-Center Study |
| dspace.entity.type | Study |
| local.contributor.grant | 1P30HS029755 |
| local.contributor.grant | RI-CHOP-01-PS10 |
| local.contributor.sites | Children’s Hospital of Philadelphia |
| local.contributor.sites | Nemours Children's Health |
| local.contributor.sites | Children's Hospital Colorado |
| local.contributor.sites | Texas Children's Hospital |
| local.contributor.sites | Seattle Children's Hospital |
| local.description.analytics | Selection criteria for the study sample are as follows: **Inclusion Criteria:** 1. Diagnosis of SCD genotype SS (SCD), as per a previously published phenotype: - Qualifying diagnosis code of SCD in the problem list, medical history, as a primary diagnosis at encounter, as a nonprimary diagnosis at encounter, or as a discharge diagnosis - 2 hematology/oncology outpatient visits at least 3 days apart OR 1 hospitalization in the electronic medical record - Visits for administrative purposes, imaging, and labs will be excluded. **Exclusion Criteria:** - Number of sickle cell trait diagnoses exceeds the number of qualifying SCD diagnoses - Evidence of stem cell transplant before cohort entrance date - Evidence of autologous gene therapy before cohort entrance date - Age >= 17 years on cohort entrance date (first hematology/oncology in-person encounter) - Age < 2 years on cohort exit date (last available PEDSnet encounter) |
| project.endDate | 2026 |
| project.startDate | 2023 |
| relation.isDocumentationOfStudy | 7156b9cb-99cf-430f-acc6-040fe7398373 |
| relation.isDocumentationOfStudy.latestForDiscovery | 7156b9cb-99cf-430f-acc6-040fe7398373 |
| relation.isOrgUnitOfStudy | a118440c-013c-4d44-b7ce-19a9b1441304 |
| relation.isOrgUnitOfStudy | cdb3cef9-ebdd-4ca8-9000-14573ba301bf |
| relation.isOrgUnitOfStudy | c8e42b1c-6ffb-4b73-b893-ae4b5fe07dfa |
| relation.isOrgUnitOfStudy | 751635c0-bac7-47e0-95ea-c78f3cb31390 |
| relation.isOrgUnitOfStudy | ff3cd76b-cf4c-44c1-84a6-368cfd0e9123 |
| relation.isOrgUnitOfStudy.latestForDiscovery | a118440c-013c-4d44-b7ce-19a9b1441304 |
| relation.isPublicationOfStudy | 88dd6903-1cb0-4296-9ace-62eba94d7ff2 |
| relation.isPublicationOfStudy | e01509c3-6fee-4567-9155-ccfb92e5e1d0 |
| relation.isPublicationOfStudy | af8ec59e-8375-45c5-89b2-36ee5937c485 |
| relation.isPublicationOfStudy.latestForDiscovery | 88dd6903-1cb0-4296-9ace-62eba94d7ff2 |
| relation.isStudyOfStudy | e31630db-0837-4cda-bd2d-1db206a5d751 |
| relation.isStudyOfStudy.latestForDiscovery | e31630db-0837-4cda-bd2d-1db206a5d751 |
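
The description above names two extraction tasks: pulling velocities out of free-text reports and assigning each report to one of five outcome categories. The sketch below illustrates those two tasks with plain regular expressions. It is not the study's method, which the record describes as machine learning-based feature extraction with a pre-trained model. The report phrasing, the "not performed" phrases, and the function names are hypothetical. The 170 and 200 cm/s cut-offs are the commonly cited STOP-trial screening thresholds and are an assumption here, since the record does not state which thresholds the study uses. Treating a report with no detectable velocity as "indeterminate" is likewise a simplification.

```python
import re
from typing import List

# Matches values such as "172 cm/s" or "185.5 cm/sec" (hypothetical phrasing).
# The lookbehind stops the pattern from matching the tail of a longer number.
VELOCITY_PATTERN = re.compile(
    r"(?<![\d.])(\d{2,3}(?:\.\d+)?)\s*cm\s*/\s*s(?:ec)?\b", re.IGNORECASE
)

# Illustrative phrases assumed to signal that no study was done.
NOT_PERFORMED_PATTERN = re.compile(
    r"\b(?:not performed|unable to perform|study cancell?ed)\b", re.IGNORECASE
)

# Assumed thresholds in cm/s (STOP-trial convention, not taken from this record).
CONDITIONAL_MIN = 170.0
ABNORMAL_MIN = 200.0


def extract_velocities(report_text: str) -> List[float]:
    """Return every velocity (cm/s) mentioned in the report, in order."""
    return [float(m.group(1)) for m in VELOCITY_PATTERN.finditer(report_text)]


def classify_tcd(report_text: str) -> str:
    """Assign one of the five outcome categories named in the description."""
    if NOT_PERFORMED_PATTERN.search(report_text):
        return "not performed"
    velocities = extract_velocities(report_text)
    if not velocities:
        # Simplification: no usable velocity found in the text.
        return "indeterminate"
    peak = max(velocities)
    if peak >= ABNORMAL_MIN:
        return "abnormal"
    if peak >= CONDITIONAL_MIN:
        return "conditional"
    return "normal"


if __name__ == "__main__":
    sample = "Right MCA 172 cm/s, left MCA 185.5 cm/sec."
    print(extract_velocities(sample), classify_tcd(sample))
```

A rule-based baseline like this is mainly useful as a comparison point during the manual validation step, where disagreements between the baseline, the model, and the human reviewer can be inspected report by report.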

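The selection criteria listed under `local.description.analytics` can be read as a sequence of boolean checks. The sketch below encodes them under stated assumptions: the `PatientSummary` fields are hypothetical pre-aggregated values rather than PEDSnet or OMOP column names, administrative, imaging, and lab visits are assumed to be filtered out upstream, and the cohort entrance and exit dates are assumed to be supplied by the caller.

```python
from dataclasses import dataclass
from datetime import date
from typing import List


@dataclass
class PatientSummary:
    """Hypothetical per-patient aggregates; not an actual PEDSnet schema."""
    birth_date: date
    scd_dx_count: int                  # qualifying SCD diagnosis codes
    trait_dx_count: int                # sickle cell trait diagnosis codes
    heme_onc_visit_dates: List[date]   # outpatient hematology/oncology visits
    hospitalization_count: int
    cohort_entry_date: date            # first hematology/oncology in-person encounter
    cohort_exit_date: date             # last available encounter
    prior_transplant_or_gene_therapy: bool  # evidence before cohort entry


def age_in_years(birth_date: date, on_date: date) -> int:
    """Completed years of age on a given date."""
    years = on_date.year - birth_date.year
    if (on_date.month, on_date.day) < (birth_date.month, birth_date.day):
        years -= 1
    return years


def has_two_visits_3_days_apart(visit_dates: List[date]) -> bool:
    """True if any two visits are at least 3 days apart."""
    if len(visit_dates) < 2:
        return False
    return (max(visit_dates) - min(visit_dates)).days >= 3


def is_eligible(p: PatientSummary) -> bool:
    """Apply the inclusion criteria, then the exclusion criteria."""
    if p.scd_dx_count < 1:
        return False
    if not (has_two_visits_3_days_apart(p.heme_onc_visit_dates)
            or p.hospitalization_count >= 1):
        return False
    if p.trait_dx_count > p.scd_dx_count:
        return False
    if p.prior_transplant_or_gene_therapy:
        return False
    if age_in_years(p.birth_date, p.cohort_entry_date) >= 17:
        return False
    if age_in_years(p.birth_date, p.cohort_exit_date) < 2:
        return False
    return True
```

Keeping each criterion as its own early return makes it straightforward to log which rule excluded a patient when reconciling cohort counts across sites.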