
VA Health Systems Research




2015 HSR&D/QUERI National Conference Abstract


3010 — Use of Statistical Text Mining (STM) to Adjust Colonoscopy Follow-up Rates for Patients with Positive Fecal Occult Blood Test (FOBT+) Results

Nugent SM, Minneapolis COIN; Nelson DB, Minneapolis COIN; Gravely AA, Minneapolis COIN; Lillie SE, Minneapolis COIN; Partin M, Minneapolis COIN

Objectives:
We used STM to search unstructured text from clinical notes for valid reasons for not receiving a colonoscopy in the VHA within 6 months post FOBT+, namely colonoscopy refusal (CR) or private sector colonoscopy (PSC). We then used this information to adjust colonoscopy follow-up rate estimates.

Methods:
We identified 74,014 patients with FOBT+ between August 2009 and March 2011. We extracted more than 85,000 clinical documents on the 41.4% of FOBT-appropriate patients who did not receive a colonoscopy within 6 months post FOBT+. We annotated 828 notes from 250 randomly selected patients using eHOST software. Annotators highlighted key words (i.e., terms) in the notes and classified each note as associated with CR, PSC, or neither. We used the annotated terms in STM to develop logistic-regression-based classification algorithms that separately predict CR and PSC, using split-half development (DS) and validation (VS) subsets of the annotated notes. We then used the developed models to construct predicted probabilities of CR and PSC for the non-annotated notes. These predicted probabilities were used in sensitivity analyses that reclassified predicted refusals and PSC from having no follow-up to having appropriate follow-up.
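The classification step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the term list, the example notes, and their labels are invented for illustration, plain gradient descent stands in for whatever statistical software was actually used, and the 0.5 cut-off is an assumption. The sketch builds binary term-presence features, fits a logistic regression on a development half, and reports agreement on the validation half for the CR outcome; a PSC model would follow the same pattern with PSC labels.

```python
import math
import random

# Hypothetical annotated terms; the abstract does not list the real ones.
TERMS = ["refused", "declined", "does not want", "outside", "private", "non-va"]

# Invented example notes with invented CR labels (1 = refusal, 0 = not a refusal).
ANNOTATED = [
    ("Patient refused colonoscopy.", 1),
    ("Pt declined procedure at this time.", 1),
    ("Veteran does not want colonoscopy.", 1),
    ("Patient refused further GI workup.", 1),
    ("Colonoscopy scheduled for next month.", 0),
    ("Had colonoscopy with outside private provider.", 0),
    ("Follow-up on blood pressure.", 0),
    ("Referred to GI clinic.", 0),
]

def featurize(note):
    """Binary term-presence features, with a leading 1.0 for the intercept."""
    text = note.lower()
    return [1.0] + [1.0 if term in text else 0.0 for term in TERMS]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(X, y, lr=0.5, epochs=500):
    """Fit logistic regression weights by batch gradient descent on the log-loss."""
    w = [0.0] * len(X[0])
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi))) - yi
            for j, xj in enumerate(xi):
                grad[j] += err * xj
        w = [wj - lr * gj / len(X) for wj, gj in zip(w, grad)]
    return w

def predict_proba(w, note):
    """Predicted probability that a note documents a refusal."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, featurize(note))))

def split_half(items, seed=0):
    """Randomly split labelled notes into development and validation halves."""
    shuffled = items[:]
    random.Random(seed).shuffle(shuffled)
    mid = len(shuffled) // 2
    return shuffled[:mid], shuffled[mid:]

def agreement(w, labelled, threshold=0.5):
    """Share of notes whose thresholded prediction matches the annotation."""
    hits = sum((predict_proba(w, n) >= threshold) == bool(l) for n, l in labelled)
    return hits / len(labelled)

dev, val = split_half(ANNOTATED)
weights = fit_logistic([featurize(n) for n, _ in dev], [l for _, l in dev])
print("development agreement:", agreement(weights, dev))
print("validation agreement:", agreement(weights, val))
```

Scoring a non-annotated note is then a call such as `predict_proba(weights, "Pt refused scope")`, and the resulting probabilities are what a sensitivity analysis would reclassify.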

Results:
Annotators demonstrated very good agreement when classifying notes as indicating CR (kappa = 0.898) and PSC (kappa = 0.834). Model agreement for our CR classification algorithm was 98% for DS and 80% for VS; for our PSC algorithm it was 87% for DS and 75% for VS. Applying the scored logistic regression models to all FOBT+ cases in the sample, we estimated that 8.8% refused colonoscopy and 10.1% received a colonoscopy in the private sector. The sensitivity analysis that treated identified CR or PSC as adequate follow-up increased the overall colonoscopy follow-up estimate at 6 months from 49% to over 67%.
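The arithmetic behind the adjusted estimate can be reproduced from the reported figures. The sketch below assumes that the CR and PSC estimates do not overlap and that both are expressed as shares of all FOBT+ cases; the abstract does not state either point explicitly, and the function name is hypothetical.

```python
def adjusted_follow_up(base_rate, refusal_rate, private_rate):
    """Add predicted CR and PSC shares to the observed follow-up rate.

    Assumes the three groups are mutually exclusive shares of the same
    denominator (all FOBT+ cases); this is an assumption, not a stated fact.
    """
    adjusted = base_rate + refusal_rate + private_rate
    if not 0.0 <= adjusted <= 1.0:
        raise ValueError("adjusted rate falls outside [0, 1]")
    return adjusted

# Figures reported in the Results: 49% observed, 8.8% CR, 10.1% PSC.
rate = adjusted_follow_up(0.49, 0.088, 0.101)
print(f"{rate:.1%}")  # 67.9%
```

Under these assumptions the sum is 67.9%, which matches the reported increase to over 67%.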

Implications:
We successfully employed STM techniques to estimate CR and PSC in our population of FOBT+ patients.

Impacts:
Receipt of care outside the VA or intention to treat is often documented only in clinical notes. STM provides a useful technique for extracting structured information from unstructured text. With the advent of the Veterans Choice Act, receipt of care outside the VA will most likely increase.