Studying the Causal Effects of Infection within a Complex Pandemic
The last few years have been devastating for many Americans, including many Veterans and their families. The COVID-19 epidemic exacerbated, triggered, and intersected with wide-ranging social and economic changes in a dynamic policy environment. The individual and institutional implications of these intersecting forces are likely to be felt for decades.
In this context, after a competitive solicitation process, HSR&D chartered a COVID-19 Observational Research Collaboratory (CORC) with a targeted mandate: to understand the health services and clinical impacts caused by infection with SARS-CoV-2 in Veterans. CORC is one of several HSR&D initiatives interacting with VA’s Office of Research and Development COVID-19 research projects and is a collaboration of investigators across five VA facilities (Ann Arbor, Durham, Palo Alto, Portland, and Puget Sound) with deep connectivity throughout the broader HSR&D Centers of Innovation. CORC is led by a Principal Investigator team (in alphabetical order): Amy S.B. Bohnert, PhD, C. Barrett Bowling, MD, Edward J. Boyko, MD, Denise M. Hynes, PhD, RN, George N. Ioannou, MD, Theodore J. Iwashyna, MD, PhD, Matthew L. Maciejewski, PhD, Ann M. O’Hare, MA, MD, and Elizabeth M. Viglianti, MD, MPH, MSc.
The CORC team framed its primary task as providing causal evidence of the individual impact of COVID-19 infection on outcomes beyond the initial period of acute illness. To do so, we had to address the reality that randomization to infection was not possible. We turned therefore to contemporary thinking about observational causal inference using two key tools. The first tool was a directed acyclic graph (DAG) to identify mechanisms of action by which infection would affect outcomes of interest, which can inform regression adjustment. Convening a national advisory team of 31 clinical and methodological experts, we used a multi-step process to identify the relationships among potential confounding and colliding variables; throughout this process we synthesized existing knowledge and judgement. This step is important because recent developments have shown that old-fashioned “just throw everything in a reduced model” approaches to statistical model building can produce misleading answers – there are some factors, known as colliders, for which controlling increases rather than decreases bias. No simple empirical method exists for distinguishing whether a variable is a confounder (and should be controlled for) or a collider (in which case control introduces bias); instead, researchers must articulate a proposed (and not immediately verifiable) causal structure of relationships.
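The confounder/collider distinction described above can be illustrated with a small simulation. This is only a sketch on synthetic data, not CORC's analysis: the variables (u, c, and the exposure and outcome arrays) and the helper ols_slope_on_x are hypothetical, and in both scenarios the true effect of exposure on outcome is set to zero.

```python
import numpy as np

def ols_slope_on_x(x, y, controls=()):
    """Coefficient on x from an OLS regression of y on x plus any controls."""
    design = np.column_stack([np.ones_like(x), x, *controls])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef[1]

rng = np.random.default_rng(0)
n = 200_000

# Confounder: u causes both exposure and outcome; exposure has no effect.
u = rng.normal(size=n)
x_conf = u + rng.normal(size=n)
y_conf = u + rng.normal(size=n)
conf_unadjusted = ols_slope_on_x(x_conf, y_conf)      # spurious association
conf_adjusted = ols_slope_on_x(x_conf, y_conf, [u])   # control removes it

# Collider: exposure and outcome are independent; c is caused by both.
x_col = rng.normal(size=n)
y_col = rng.normal(size=n)
c = x_col + y_col + rng.normal(size=n)
col_unadjusted = ols_slope_on_x(x_col, y_col)         # correctly near zero
col_adjusted = ols_slope_on_x(x_col, y_col, [c])      # control creates bias

print(f"confounder: unadjusted={conf_unadjusted:.3f}, adjusted={conf_adjusted:.3f}")
print(f"collider:   unadjusted={col_unadjusted:.3f}, adjusted={col_adjusted:.3f}")
```

In this simulation, adjusting for the confounder moves the estimate toward the true value of zero, while adjusting for the collider moves it away. The data alone do not reveal which role a variable plays: in both scenarios the third variable is correlated with exposure and outcome, so the role has to be asserted from a causal structure.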
Having identified the DAG, we then organized our design and analysis around the concept of target trial emulation (TTE). The TTE approach avoids common biases and errors of observational studies through a two-step design process: (1) define the randomized controlled trial that would most precisely answer the causal question of interest (the target) without regard to feasibility; and (2) design an observational study that emulates that trial as closely as possible. To identify causal effects of COVID-19 infection, we imagined an impossible and unethical trial in which Veterans would be randomized prospectively to infection (e.g., through exposure at a level high enough to cause infection) or no infection – a randomized trial that should never be done. We defined the inclusion and exclusion criteria of the specified target trial and applied them to our emulated trial. We then attempted to emulate the balance in baseline characteristics achieved through randomization by matching Veterans who tested positive for SARS-CoV-2 for the first time in each calendar month of the pandemic (identified from the COVID-19 Shared Data Resource) to Veterans who had not yet tested positive as of the date of infection of their matched comparator. We used exact matching on certain key characteristics followed by propensity score matching, with propensity scores built separately for each calendar month given the changing dynamics of the epidemic. We also carefully designed our TTEs to guard against subtle selection biases and against the inclusion of information arising after infection (or, for comparators, after the corresponding match date). We incorporated over 20 DAG-informed potential confounding variables in hopes that these matched cohorts would be sufficiently balanced that internal validity would hold for a diverse range of electronic health record-based outcome investigations.
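The matching step for a single calendar month might be sketched as follows. This is an illustration on synthetic data under stated assumptions, not CORC's code: the covariates (age, comorbid, site), the caliper, greedy matching without replacement, the limit of two comparators per infected Veteran, and the function names fit_propensity, match_within_month, and smd are all hypothetical. It also ignores the requirement that comparators be uninfected as of the matched infection date.

```python
import numpy as np

def fit_propensity(X, treated, n_iter=25):
    """Logistic regression by Newton-Raphson; returns P(treated | X) per row."""
    design = np.column_stack([np.ones(len(X)), X])
    beta = np.zeros(design.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-design @ beta))
        grad = design.T @ (treated - p)
        hess = design.T @ (design * (p * (1.0 - p))[:, None])
        beta += np.linalg.solve(hess, grad)
    return 1.0 / (1.0 + np.exp(-design @ beta))

def match_within_month(strata, ps, treated, max_controls=2, caliper=0.05):
    """Greedy matching: exact on stratum, then nearest propensity score.

    Each comparator is used at most once (an assumption of this sketch).
    """
    matches, used = {}, set()
    for i in np.flatnonzero(treated == 1):
        same_stratum = np.flatnonzero((treated == 0) & (strata == strata[i]))
        pool = [int(j) for j in same_stratum
                if int(j) not in used and abs(ps[j] - ps[i]) <= caliper]
        pool.sort(key=lambda j: abs(ps[j] - ps[i]))
        matches[int(i)] = pool[:max_controls]
        used.update(pool[:max_controls])
    return matches

def smd(x, a_idx, b_idx):
    """Standardized mean difference of x between two index sets."""
    a, b = x[a_idx], x[b_idx]
    return (a.mean() - b.mean()) / np.sqrt((a.var() + b.var()) / 2)

# Synthetic "month" of data: infection risk depends on age and comorbidity.
rng = np.random.default_rng(1)
n = 3000
age = rng.normal(60, 12, size=n)
comorbid = rng.poisson(2, size=n).astype(float)
site = rng.integers(0, 5, size=n)  # exact-match characteristic
logit = -3.0 + 0.03 * (age - 60) + 0.3 * comorbid
treated = (rng.random(n) < 1 / (1 + np.exp(-logit))).astype(int)

ps = fit_propensity(np.column_stack([(age - 60) / 12, comorbid]), treated)
matches = match_within_month(site, ps, treated)

# Compare covariate balance before and after matching.
all_t, all_c = np.flatnonzero(treated == 1), np.flatnonzero(treated == 0)
matched_t = np.array([i for i, js in matches.items() if js])
matched_c = np.array([j for js in matches.values() for j in js])
smd_before = smd(comorbid, all_t, all_c)
smd_after = smd(comorbid, matched_t, matched_c)
print(f"comorbidity SMD before={smd_before:.3f}, after={smd_after:.3f}")
```

Repeating this procedure with a freshly fitted propensity model for each calendar month would mirror the month-by-month design described above. A balance check such as the standardized mean difference is one common way to judge whether the matched cohorts resemble what randomization would have produced.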
Armed with these individually matched cohorts, which include nearly every COVID-positive Veteran from each month (several hundred thousand Veterans) and their nearly identical comparators (up to 25 per COVID-positive Veteran), we have turned to three “work streams” in our Long-Term Outcomes (LTO) study. First, we are using the rich resources of the electronic health record within VA and linked Medicaid and Medicare claims to study outcomes such as healthcare utilization, costs, suicide, and depression. These studies are an essential step for forecasting future VA service needs.
Concurrently, we have begun fielding telephone-based surveys administered serially to matched cohorts of COVID-positive Veterans and their nearly identical uninfected comparators to study outcomes that do not appear reliably in the medical record or whose measurement may be heavily biased by differences in access to VA care or in coding. The surveys aim to understand the total effects caused by COVID-19 on disability, financial toxicity, and mental health outcomes. We will continue to follow these Veterans – both those initially COVID-positive and their comparators – for at least 1.5 years to understand the dynamics of recovery.
Finally, from the beginning, we in the CORC LTO study realized that we would not know all the questions that need to be answered. To discover new questions, our third work stream is an active program of qualitative inquiry that is using interviews with Veterans and their caregivers, as well as textual analysis of documentation in the VA-wide electronic health records of these Veterans. As we discover new patterns, we can explore their generalizability with integration into the ongoing electronic health record and survey-based programs of work.
While the LTO group conducts these primary analyses, the Data and Coordinating Center (DCC) of CORC is working to support not just the LTO study but also VA-wide analyses. First and foremost, that means actively partnering to find ways to overcome traditional VA barriers to sharing data across research projects. Our goal is to make the large monthly matched cohorts available to VA investigators so that they can bypass the steps of rebuilding matches and move directly to causal comparisons and investigation of outcomes. This work is supported by a fundamental commitment to sharing all code developed in CORC with readers of our papers, as well as with other VA investigators, to support more rapid, open science on pressing COVID-19 problems – or any other research for which these tools might be helpful.
In our DCC function, we are working closely with the VA Informatics and Computing Infrastructure (VINCI), VA’s Centralized Interactive Phenomics Resource (CIPHER), and the VA Information Resource Center (VIReC) to facilitate the provision of COVID-19-related data, analytic code, and methodologic expertise, and to support VA investigators working in this area. We also have a nationwide methods advisory group that provides detailed review of CORC projects to strengthen analysis plans; its members can provide similar discussion or review of other proposals for investigators who want such assistance. The DCC is currently developing a Young COVID Investigators series (You CIDS) to nurture early career investigators and advise on research design and data resources. Finally, the DCC provides a consultative function to HSR&D’s Director and other operational partners across VA on interpreting and translating emerging evidence.
The CORC program of research will, of course, not answer every important question about COVID-19 in health services research; it is not intended to. Instead, our goals are to answer select core questions with rigorous causal methods and, at the same time, catalyze the research of other investigators.