HIR 10-002
Pro-WATCH: Homelessness as Sentinel Event
Adiseshu V. Gundlapalli, MD PhD
VA Salt Lake City Health Care System, Salt Lake City, UT
Funding Period: September 2010 - September 2013
BACKGROUND/RATIONALE:
Post-deployment homelessness has been a major issue for Veterans after all conflicts and is a priority area for the VA. The exact number of homeless Veterans is unknown; the US Department of Veterans Affairs estimates that at least 131,000 Veterans are homeless every night in the US. There is an urgent need to develop electronic algorithms that identify homeless Veterans and those at risk for homelessness. Alerts from these algorithms will be triggered by mining structured and unstructured (free-text) data elements in the electronic medical record for known risk factors, using natural language processing methods. The program will focus on male and female OEF/OIF Veterans and is intended to complement and enhance current local and national VA initiatives to address homelessness among Veterans.
OBJECTIVE(S):
The objective of this project is to develop and validate algorithms that use clinical narratives and structured data to flag Veterans who are homeless or at high risk of homelessness.
METHODS:
(1) Identify a cohort of Veterans whose homeless status has been established.
(2) Develop a vocabulary to identify concepts, features and documentation related to homelessness in the VA electronic medical record, with special emphasis on psychosocial phenotyping.
(3) Develop electronic algorithms to identify Veterans who are homeless or at risk of homelessness, using a domain-specific lexicon and natural language processing methods.
(4) Perform retrospective validation of the algorithms using national VA electronic data.
(5) Establish working relationships with community homeless service providers in Salt Lake City, Utah.
FINDINGS/RESULTS:
1. Developing a lexicon: We generated a human-curated lexicon for concepts related to homelessness. This is an important contribution to the field, as there was no readily available lexicon for this domain.
2. Information (concept) extraction using natural language processing (NLP): We used this lexicon to develop an NLP algorithm that extracts concepts related to homelessness from VA electronic records. We trained and tested the algorithm on a human-reviewed reference standard set of medical notes that contained these concepts. The overall performance (positive predictive value, PPV) of the algorithm on this reference standard set is 77%.
3. Information retrieval for Veteran homelessness: An off-the-shelf, VA-developed tool, the Automated Retrieval Console (ARC), was successfully adapted to the homelessness domain. The ARC tool was trained to perform document-level classification. Performance was measured at a precision of 94.5, recall of 95.2, and F-measure of 94.8.
4. Early identification of concepts related to homelessness in free text: We tested the hypothesis that concepts related to homelessness written in the free text of the medical record would precede the identification of homelessness by administrative data (ICD-9-CM). We applied our natural language processing algorithms for detecting homelessness and risk factors to medical notes from 50 randomly selected Veterans who were found to be homeless using the standard VA administrative data case definition. Notes from a control group of 50 Veterans who did not have an administrative indicator for homelessness were also processed. 'Direct evidence' of homelessness appeared in the notes of 30% of homeless Veterans a month or more before an administrative code for homelessness. Notes from 88% had evidence of risk factors related to homelessness before an administrative code for homelessness was assigned. Among the non-homeless Veterans, the notes of only 1 had 'direct evidence' and the notes of 6 had 'indirect evidence'.
5. Scaling up of NLP concept extraction algorithms: We set out to develop algorithms to improve the efficiency of patient phenotyping using NLP on large corpora of text data. We sought to determine the note titles in the database with the highest yield and precision for psychosocial concepts, using our lexicon for homelessness risk factors as the basis for this work. From a database of over 1 billion documents from VA medical facilities, a random sample of 1500 documents from each of 218 enterprise note titles was chosen. Psychosocial concepts were extracted using a UIMA-AS based NLP pipeline, using a lexicon of relevant concepts with negation and template format annotators. Human reviewers evaluated a subset of documents for false positives. High-yield documents were identified by hit rate and precision, and reasons for false positivity were characterized. A total of 58,707 psychosocial concepts were identified from 316,355 documents, for an overall hit rate of 0.2 concepts per document (median 0.1, range 0 to 1.6). Of 6031 concepts reviewed from a high-yield set of note titles, the overall precision for all concept categories was 80%, with variability among note titles and concept categories. Reasons for false positivity included templating, negation, context and alternate meaning of words.
6. Identification of patterns in resource utilization prior to administrative recognition of homelessness: There are limited data on resources utilized by Veterans prior to their identification as being homeless. We performed visual analytics on longitudinal medical encounter data prior to the official recognition of homelessness in a large cohort of OEF/OIF Veterans. A statistically significant increase in the number of visits in the 30 days immediately prior to the recognition of homelessness was noted, as compared to an earlier period. Further studies are ongoing to validate this novel finding as a predictive tool.
7. Hepatitis C and homeless Veterans: We sought to describe the rates and predictors of initiation of treatment for chronic hepatitis C virus infection (HCV) in a cohort of HCV-positive US Veterans with evidence of homelessness. Rates of treatment among homeless and non-homeless HCV Veterans were very low and clinically similar, though the difference was statistically significant (6.2% vs. 7.4%, p<0.0001). Patients age 50, those with drug abuse, diabetes and hemoglobin < 10 g/dL were less likely to be treated. Genotype 2/3 increased the likelihood of treatment.
8. Pneumonia and homeless individuals: We evaluated the admission decisions and outcomes in homeless individuals diagnosed with community-acquired pneumonia (CAP) seen at an urban community hospital. A large cohort of homeless patients with CAP demonstrated higher hospitalization risk but similar lengths of stay and costs as non-homeless patients.
9. We have established excellent working relationships with our local community homeless service providers.
IMPACT:
Detecting homelessness or identifying Veterans at risk for homelessness is an important target for sentinel event surveillance. The benefits of such surveillance are multifold. In addition, this work forms the foundation for a currently funded HSR&D grant whose goal is to develop automated predictive models to identify Veterans at risk of homelessness.
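The summary metrics quoted in findings 3 and 5 can be recomputed from the reported figures. The sketch below is illustrative only: it assumes the reported F-measure is the balanced F-measure (harmonic mean of precision and recall) and that hit rate means extracted concepts divided by documents processed. The function names are hypothetical and are not part of ARC or the project's pipeline.

```python
def f_measure(precision: float, recall: float) -> float:
    """Balanced F-measure: harmonic mean of precision and recall (assumed definition)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def hit_rate(concepts: int, documents: int) -> float:
    """Average number of extracted concepts per document (assumed definition)."""
    if documents <= 0:
        raise ValueError("documents must be positive")
    return concepts / documents


# Figures as reported in findings 3 and 5 of this abstract.
arc_f = f_measure(94.5, 95.2)
overall_hit_rate = hit_rate(58_707, 316_355)

print(f"F-measure: {arc_f:.1f}")
print(f"Hit rate: {overall_hit_rate:.1f} concepts per document")
```

Under these assumptions the recomputed values round to the figures given in the abstract (94.8 and 0.2).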
DRA:
Mental, Cognitive and Behavioral Disorders, Health Systems
DRE: Research Infrastructure, Epidemiology, Diagnosis
Keywords: none
MeSH Terms: none