2065. Missing Race Data in VA Based Disparities Research: A Systematic Review
JA Long, Center for Health Equity Research and Promotion, Leonard Davis Institute of Health Economics, and University of Pennsylvania School of Medicine; MI Bamba, Center for Health Equity Research and Promotion and University of Pennsylvania School of Medicine; B Ling, Center for Health Equity Research and Promotion and University of Pittsburgh; JA Shea, Center for Health Equity Research and Promotion, Leonard Davis Institute of Health Economics, and University of Pennsylvania School of Medicine

Objectives: Many studies evaluating racial disparities within the VHA are based on secondary data analyses. Often race data are missing for at least a portion of the patients. Knowing how investigators treat missing data is critical in evaluating potential biases. The objectives of this systematic review were to quantify: (1) the data sources for VHA disparity studies; (2) how missing data were handled; and (3) the extent of missing data.

Methods: MEDLINE, EconLit, and Sociological Abstracts were searched using these keywords: (race, racial stock, ethnicity, ethnic groups, blacks, Hispanic Americans) and (United States Department of Veterans Affairs, veterans, veterans hospitals, and VA). Abstract exclusion criteria included: written before 1992; letters or review papers; did not pertain to veterans; race not mentioned; race self-reported. Two trained reviewers independently abstracted each article. Article exclusion criteria included: duplicate study populations; not a secondary data analysis; race was neither a focus of nor an important predictor in the research.

Results: Sixty-nine of 118 articles met the inclusion criteria. Race was the primary focus in 42 articles and an important predictor in 27 articles. The Patient Treatment File (PTF) was the most common source of race data (29 articles). In 32 articles, knowledge of race was required for inclusion in the analytic population. Articles were grouped into the following mutually exclusive categories: no missing race data (11); missing race data explicitly quantified (8); missing race data explicitly grouped with other data but not quantified (9); race data known for enumeration of the potential population (2); no mention of missing race data, although such data are known to be missing in the data source, e.g., PTF (5); unable to determine whether race data were missing (33). When missing race data were quantified, the proportion missing ranged from 0% to 48% (median 0%, mean 8%).

Conclusions: Missing race data are frequently present in VHA secondary data sources. However, they are rarely explicitly discussed or quantified, even when race is the primary focus of the research question.

Impact: Without clear descriptions of how much race data are missing in studies using VHA secondary data sources, readers are unable to evaluate a very important potential source of bias.