HSR&D Citation Abstract
Using statistical text mining to supplement the development of an ontology.
Luther S, Berndt D, Finch D, Richardson M, Hickling E, Hickam D. Using statistical text mining to supplement the development of an ontology. Journal of Biomedical Informatics. 2011 Dec 1; 44 Suppl 1:S86-93.
Statistical text mining (STM) was used to supplement efforts to develop a clinical vocabulary for post-traumatic stress disorder (PTSD) in the VA. A set of outpatient progress notes was collected for a cohort of 405 unique veterans with PTSD and a comparison group of 392 veterans with other psychological conditions at one VA hospital. Two methods were employed: (1) "multi-model term scoring," which used stepwise logistic regression to develop 21 separate models by varying three frequency weight options and seven term weight options, and (2) "iterative term refinement," which used a standard stop list followed by clinical review to eliminate non-clinical terms and terms not related to PTSD. The combined results of the two methods were reviewed by two clinicians, yielding 226 unique PTSD-related terms. These results were compared with ongoing efforts to identify terms based on literature review, focus groups with clinicians treating PTSD, and review of an existing vocabulary; the comparison lends support to the contributions of the STM analyses.
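The 21 models in the multi-model term scoring method follow from crossing three frequency weight options with seven term weight options (3 x 7 = 21). The Python sketch below illustrates only that arithmetic. The abstract gives the counts but not the option names, so the labels are hypothetical placeholders, and the tally in `score_terms` (counting how many models retained a term) is an assumed scoring rule, not one confirmed by the abstract.

```python
from collections import Counter
from itertools import product

# Hypothetical placeholder labels: the abstract states only that there were
# three frequency weight options and seven term weight options.
FREQUENCY_WEIGHTS = [f"freq_weight_{i}" for i in range(1, 4)]
TERM_WEIGHTS = [f"term_weight_{i}" for i in range(1, 8)]


def model_grid():
    """Return every (frequency weight, term weight) pairing, one per model."""
    return list(product(FREQUENCY_WEIGHTS, TERM_WEIGHTS))


def score_terms(selected_terms_per_model):
    """Count how many models retained each term (an assumed scoring rule)."""
    counts = Counter()
    for terms in selected_terms_per_model:
        counts.update(set(terms))  # a term counts at most once per model
    return counts


grid = model_grid()
print(len(grid))  # 21 weighting combinations
```

In a full analysis, each pairing in `grid` would configure one stepwise logistic regression model, and the terms retained by each model would be passed to a tally such as `score_terms`.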