HSR Citation Abstract

Using ontology network structure in text mining.

Berndt DJ, McCart JA, Luther SL. Using ontology network structure in text mining. AMIA ... Annual Symposium proceedings / AMIA Symposium. AMIA Symposium. 2010 Nov 13; 2010:41-5.

Related HSR&D Project(s)

HIR 09-002 – Consortium for Healthcare Informatics Research: Clinical Inference and Modeling

Abstract:

Statistical text mining treats documents as bags of words, with a focus on term frequencies within documents and across document collections. Unlike natural language processing (NLP) techniques that rely on an engineered vocabulary or a full-featured ontology, statistical approaches do not make use of domain-specific knowledge. The freedom from biases can be an advantage, but at the cost of ignoring potentially valuable knowledge. The approach proposed here investigates a hybrid strategy based on computing graph measures of term importance over an entire ontology and injecting the measures into the statistical text mining process. As a starting point, we adapt existing search engine algorithms such as PageRank and HITS to determine term importance within an ontology graph. The graph-theoretic approach is evaluated using a smoking data set from the i2b2 National Center for Biomedical Computing, cast as a simple binary classification task for categorizing smoking-related documents, demonstrating consistent improvements in accuracy.
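
The hybrid strategy described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the toy ontology, the child-to-parent edge direction, the damping factor, and the rule for injecting importance into term frequencies (multiplying each raw count by one plus the term's PageRank score) are all assumptions made for illustration, and the names `pagerank`, `weighted_term_vector`, and `toy_ontology` are hypothetical.

```python
from collections import Counter


def pagerank(graph, damping=0.85, iterations=50):
    """Power-iteration PageRank over a dict mapping node -> list of successors."""
    nodes = set(graph)
    for targets in graph.values():
        nodes.update(targets)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by nodes without outgoing edges is spread uniformly.
        dangling = sum(rank[v] for v in nodes if not graph.get(v))
        new_rank = {v: (1.0 - damping) / n + damping * dangling / n for v in nodes}
        for v, targets in graph.items():
            if targets:
                share = damping * rank[v] / len(targets)
                for t in targets:
                    new_rank[t] += share
        rank = new_rank
    return rank


def weighted_term_vector(tokens, importance, default=0.0):
    """Bag-of-words counts scaled by ontology importance (assumed weighting rule)."""
    counts = Counter(tokens)
    return {
        term: count * (1.0 + importance.get(term, default))
        for term, count in counts.items()
    }


# Hypothetical toy ontology: each edge points from a narrower term to a broader one.
toy_ontology = {
    "cigarette": ["tobacco"],
    "cigar": ["tobacco"],
    "tobacco": ["substance"],
    "nicotine": ["substance"],
}

importance = pagerank(toy_ontology)
tokens = ["patient", "smokes", "cigarette", "and", "chews", "tobacco"]
vector = weighted_term_vector(tokens, importance)
```

Terms absent from the ontology keep their raw counts, so the statistical bag-of-words representation is preserved and the ontology only boosts terms it knows about. HITS authority scores, or any other graph measure, could be substituted for `pagerank` without changing `weighted_term_vector`.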

VA Health Systems Research