
VA Health Systems Research


HSR&D Citation Abstract


Lessons learned in dealing with missing race data: an empirical investigation

Gebregziabher MG, Zhao Y, Axon RN, Gilbert GE, Egede LE. Lessons learned in dealing with missing race data: an empirical investigation. Journal of biometrics & biostatistics. 2012 Apr 14; 3(138):1-5.




Abstract:

Background: Missing race data is a ubiquitous problem in studies using large administrative datasets such as those of the Veterans Health Administration and other sources. The most common approach to dealing with this problem has been to analyze only those records with complete data, known as Complete Case Analysis (CCA). CCA requires the assumption that data are Missing Completely At Random (MCAR), and it could lead to biased estimates with inflated standard errors.

Objective: To examine the performance of a new imputation approach, Latent Class Multiple Imputation (LCMI), for imputing missing race data and to compare it with CCA, Multiple Imputation (MI) and Log-Linear Multiple Imputation (LLMI).

Design/Participants: We empirically compared LCMI with CCA, MI and LLMI using simulated data and demonstrated their application using data from a sample of 13,705 veterans with type 2 diabetes, among whom 23% had unknown/missing race information.

Results: Our simulation study shows that under Missing At Random (MAR), LCMI leads to lower bias and lower standard error estimates compared to CCA, MI and LLMI. Similarly, in our data example, which does not conform to MCAR since subjects with missing race information had lower rates of medical comorbidities than those with race information, LCMI outperformed MI and LLMI, providing lower standard errors, especially when a relatively larger number of latent classes is assumed for the latent class imputation model.

Conclusions: Our results show that LCMI is a valid statistical technique for imputing missing categorical covariate data, and particularly missing race data, that offers advantages with respect to precision of estimates.
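The contrast between complete case analysis and multiple imputation described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' LCMI method and does not use their data: it is a simplified stand-in that imputes a binary group label from a single observed covariate (a comorbidity flag) and pools the results with Rubin's rules. Every probability, the sample size, and all function names are hypothetical values chosen for illustration, with missingness made more likely for records without the comorbidity so that the data are MAR but not MCAR.

```python
import random
import statistics


def simulate(n, rng):
    """Return (comorbid, true_group, observed_group) rows; observed_group is None when missing."""
    rows = []
    for _ in range(n):
        group = 1 if rng.random() < 0.30 else 0            # hypothetical true proportion
        comorbid = 1 if rng.random() < (0.70 if group else 0.30) else 0
        p_missing = 0.07 if comorbid else 0.35             # MAR: depends only on comorbid
        observed = None if rng.random() < p_missing else group
        rows.append((comorbid, group, observed))
    return rows


def complete_case_estimate(rows):
    """Proportion in group 1 using only records whose group is observed (CCA)."""
    seen = [obs for _, _, obs in rows if obs is not None]
    return sum(seen) / len(seen)


def multiple_imputation_estimate(rows, m, rng):
    """Impute missing labels within comorbidity strata m times; pool with Rubin's rules."""
    counts = {0: [0, 0], 1: [0, 0]}                        # stratum -> [n_group1, n_observed]
    for comorbid, _, obs in rows:
        if obs is not None:
            counts[comorbid][0] += obs
            counts[comorbid][1] += 1
    n = len(rows)
    estimates, variances = [], []
    for _ in range(m):
        # Draw P(group=1 | comorbid) from a Beta posterior (uniform prior) so that
        # uncertainty in the imputation model carries into each completed dataset.
        p = {c: rng.betavariate(1 + k, 1 + t - k) for c, (k, t) in counts.items()}
        total = 0
        for comorbid, _, obs in rows:
            if obs is None:
                total += 1 if rng.random() < p[comorbid] else 0
            else:
                total += obs
        est = total / n
        estimates.append(est)
        variances.append(est * (1 - est) / n)
    pooled = statistics.mean(estimates)
    within = statistics.mean(variances)
    between = statistics.variance(estimates)
    total_var = within + (1 + 1 / m) * between             # Rubin's total variance
    return pooled, total_var ** 0.5


rng = random.Random(2012)
rows = simulate(20000, rng)
missing_fraction = sum(obs is None for _, _, obs in rows) / len(rows)
cca = complete_case_estimate(rows)
mi, mi_se = multiple_imputation_estimate(rows, m=20, rng=rng)
print(f"missing: {missing_fraction:.3f}  CCA: {cca:.3f}  MI: {mi:.3f} (SE {mi_se:.4f})")
```

Under these assumed settings, the complete case estimate drifts above the simulated proportion of 0.30 because records with the comorbidity are both more likely to be observed and more likely to belong to group 1, while the imputation-based estimate stays close to it. A latent class imputation model, as studied in the paper, would replace the single-covariate strata used here.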





