
VA Health Systems Research


HSR Citation Abstract


Are ICD codes reliable for observational studies? Assessing coding consistency for data quality.

Nelson SJ, Yin Y, Trujillo Rivera EA, Shao Y, Ma P, Tuttle MS, Garvin J, Zeng-Treitler Q. Are ICD codes reliable for observational studies? Assessing coding consistency for data quality. Digital health. 2024 Oct 29; 10:20552076241297056.




Abstract:

OBJECTIVE: International Classification of Diseases (ICD) codes recorded in electronic health records (EHRs) are frequently used to create patient cohorts or define phenotypes. Inconsistent assignment of codes may reduce the utility of such cohorts. We assessed the reliability across time and location of the assignment of ICD codes in a US health system at the time of the transition from ICD-9-CM (ICD, 9th Revision, Clinical Modification) to ICD-10-CM (ICD, 10th Revision, Clinical Modification).

MATERIALS AND METHODS: Using clusters of equivalent codes derived from the US Centers for Disease Control and Prevention General Equivalence Mapping (GEM) tables, ICD assignments occurring during the ICD-9-CM to ICD-10-CM transition were investigated in EHR data from the US Veterans Administration Central Data Warehouse using deep learning and statistical models. These models were then used to detect abrupt changes across the transition; additionally, changes at each VA station were examined.

RESULTS: Many of the 687 most-used code clusters had ICD-10-CM assignments that differed greatly from those predicted from the codes used in ICD-9-CM. Manual reviews of a random sample found that 66% of the clusters showed problematic changes, with 37% having no apparent explanations. Notably, the observed pattern of changes varied widely across care locations.

DISCUSSION AND CONCLUSION: The observed coding variability across time and across location suggests that ICD codes in EHRs are insufficient to establish a semantically reliable cohort or phenotype. While some variations might be expected with a change in coding structure, the inconsistency across locations suggests other difficulties. Researchers should consider carefully how cohorts and phenotypes of interest are selected and defined.
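As a rough illustration of the idea behind "clusters of equivalent codes", the sketch below groups codes into clusters by treating each GEM-style mapping as a link and taking connected components. It then compares mean usage of a cluster before and after the transition. This is only a sketch under stated assumptions: the abstract does not say how the clusters were built, the function names `build_clusters` and `relative_change` are hypothetical, the code labels are placeholders rather than real ICD codes, and the simple mean comparison stands in for the deep learning and statistical models the authors used.

```python
from collections import defaultdict


def build_clusters(gem_pairs):
    """Group codes into clusters of linked codes.

    gem_pairs: iterable of (icd9_code, icd10_code) mapping pairs.
    Assumption: a cluster is a connected component of the mapping graph.
    """
    parent = {}

    def find(node):
        # Union-find lookup with path halving.
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    for icd9, icd10 in gem_pairs:
        # Tag each code with its code system so labels cannot collide.
        union(("ICD9", icd9), ("ICD10", icd10))

    clusters = defaultdict(set)
    for node in list(parent):
        clusters[find(node)].add(node)
    return list(clusters.values())


def relative_change(counts_before, counts_after):
    """Relative change in mean usage across the transition.

    A crude stand-in for the abstract's models: a large absolute value
    flags a cluster whose usage shifted abruptly.
    """
    mean_before = sum(counts_before) / len(counts_before)
    mean_after = sum(counts_after) / len(counts_after)
    return (mean_after - mean_before) / mean_before


if __name__ == "__main__":
    # Placeholder labels, not real ICD codes.
    pairs = [("a", "X"), ("b", "X"), ("c", "Y")]
    for cluster in build_clusters(pairs):
        print(sorted(cluster))
    print(relative_change([100, 100], [50, 50]))
```

Running the same comparison separately for each care location would give a per-station view of the kind the abstract describes, again only as an approximation of the authors' analysis.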





