Identifying Symptom Information in Clinical Notes Using Natural Language Processing
By: Koleck, Theresa A [author]
Contributor(s): Tatonetti, Nicholas P [author] | Bakken, Suzanne [author] | Mitha, Shazia [author] | Henderson, Morgan M [author] | George, Maureen [author] | Miaskowski, Christine [author] | Smaldone, Arlene [author] | Topaz, Maxim [author]
Language: English
Copyright date: 2021
Subject(s): Computational linguistics | Computerized medical records | Constipation | Constipation - diagnosis | Depression - diagnosis | Electronic health records | Electronic Health Records - statistics & numerical data | Fatigue - diagnosis | Health records | Language processing | Medical records | Natural language interfaces | Pattern Recognition, Automated - methods | Professional identity | Sleep problems | Sleep Wake Disorders - diagnosis | Symptom Assessment - nursing | Symptoms | Tachycardia - diagnosis | Natural Language Processing
In: Nursing Research, May/June 2021, Volume 70, Number 3, pages 173-183

Item type | Current location | Home library | Call number | Status | Date due | Barcode | Item holds |
---|---|---|---|---|---|---|---|
JOURNAL ARTICLE | COLLEGE LIBRARY | COLLEGE LIBRARY PERIODICALS | | Not for loan | | | |
Abstract:

Background
Symptoms are a core concept of interest in nursing. Large-scale secondary reuse of notes in electronic health records (EHRs) has the potential to increase the quantity and quality of symptom research. However, the symptom language used in clinical notes is complex, and methods designed specifically to identify and study symptom information in EHR notes are needed.
Objectives
We aim to describe a method that combines standardized vocabularies, clinical expertise, and natural language processing to generate comprehensive symptom vocabularies and identify symptom information in EHR notes. We piloted this method with five diverse symptom concepts: constipation, depressed mood, disturbed sleep, fatigue, and palpitations.
Methods
First, we obtained synonym lists for each pilot symptom concept from the Unified Medical Language System. Then, we used two large bodies of text (clinical notes from Columbia University Irving Medical Center and PubMed abstracts containing Medical Subject Headings or keywords related to the pilot symptoms) to further expand the initial synonym vocabulary for each pilot symptom concept. We used NimbleMiner, an open-source natural language processing tool, to accomplish these tasks and evaluated NimbleMiner's symptom identification performance against a manually annotated set of common EHR note types authored by nurses and physicians.
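NimbleMiner's own interface is not reproduced here. The following is a minimal, hypothetical sketch of the underlying idea of corpus-driven vocabulary expansion: train word embeddings on a large body of text and propose near neighbours of the UMLS seed terms (abbreviations, misspellings, multiword expressions) as candidate synonyms for clinician review. The corpus file name and seed terms are illustrative assumptions, not the study's data.

```python
# Hypothetical sketch of corpus-driven synonym expansion (not NimbleMiner's API).
# Assumes a de-identified corpus in "notes.txt", one note per line.
from gensim.models import Word2Vec
from gensim.models.phrases import Phrases, Phraser

# Read and tokenize the corpus (whitespace tokenization for brevity).
with open("notes.txt", encoding="utf-8") as f:
    sentences = [line.lower().split() for line in f]

# Join frequent multiword expressions (e.g., "hard_stool") so they receive
# their own embedding vectors.
bigrams = Phraser(Phrases(sentences, min_count=10, threshold=10.0))
sentences = [bigrams[s] for s in sentences]

# Train word embeddings on the corpus.
model = Word2Vec(sentences, vector_size=200, window=5, min_count=5, workers=4)

# Seed terms taken from a UMLS synonym list (constipation, as an example).
seed_terms = ["constipation", "constipated", "obstipation"]

# Nearest neighbours of the seed terms become candidate synonyms; in the
# study's workflow, candidates are reviewed by clinicians before being added.
candidates = model.wv.most_similar(
    positive=[t for t in seed_terms if t in model.wv], topn=30
)
for term, score in candidates:
    print(f"{term}\t{score:.3f}")
```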
Results
Compared with the baseline Unified Medical Language System synonym lists, we identified up to 11 times more synonym words or expressions, including abbreviations, misspellings, and unique multiword combinations, for each symptom concept. The natural language processing system's symptom identification performance was excellent.
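The published performance figures are not reproduced in this record. As an illustration only, note-level symptom labels produced by an NLP system are typically scored against manual annotations with precision, recall, and F1, along the lines of the sketch below; the label vectors are invented.

```python
# Illustrative evaluation sketch: score predicted note-level symptom labels
# against manual (gold-standard) annotations. Labels below are invented.
from sklearn.metrics import precision_recall_fscore_support

# 1 = symptom documented in the note, 0 = not documented.
gold      = [1, 0, 1, 1, 0, 1, 0, 0]   # manual annotations
predicted = [1, 0, 1, 0, 0, 1, 1, 0]   # NLP system output

precision, recall, f1, _ = precision_recall_fscore_support(
    gold, predicted, average="binary"
)
print(f"precision={precision:.2f}  recall={recall:.2f}  F1={f1:.2f}")
```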
Discussion
Using our comprehensive symptom vocabularies and NimbleMiner to label symptoms in clinical notes produced excellent performance metrics. The ability to extract symptom information from EHR notes in an accurate and scalable manner has the potential to greatly facilitate symptom science research.
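To make the labeling step concrete, the sketch below shows one simple way an expanded symptom vocabulary could be applied to note text, with a toy negation check. The vocabulary entries, negation cues, and note are assumptions for illustration; tools such as NimbleMiner handle negation and context far more robustly.

```python
# Hypothetical sketch: label a note for symptom concepts by matching terms
# from an expanded vocabulary. Entries and note text are examples only.
import re

vocabulary = {
    "constipation": ["constipation", "constipated", "obstipation"],
    "fatigue": ["fatigue", "fatigued", "tiredness", "washed out"],
}

NEGATION_CUES = ("no ", "denies ", "without ")

def label_note(text: str) -> dict:
    """Return {concept: True/False}, skipping mentions preceded by a negation cue."""
    text = text.lower()
    labels = {}
    for concept, terms in vocabulary.items():
        found = False
        for term in terms:
            for match in re.finditer(re.escape(term), text):
                window = text[max(0, match.start() - 12):match.start()]
                if not any(cue in window for cue in NEGATION_CUES):
                    found = True
        labels[concept] = found
    return labels

print(label_note("Patient reports feeling washed out; denies constipation."))
# -> {'constipation': False, 'fatigue': True}
```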