Numerous observational studies have attempted to identify risk factors for infection with SARS-CoV-2 and for COVID-19 disease outcomes. These studies have used datasets sampled from patients admitted to hospital, people tested for active infection, or people who volunteered to participate. Here, we highlight the challenge of interpreting observational evidence from such non-representative samples. Collider bias can induce associations between two or more variables that each affect the likelihood of an individual being sampled, distorting the associations between these variables within the sample. In UK Biobank data, participants tested for COVID-19 were highly selected, compared with the wider cohort, for a range of genetic, behavioural, cardiovascular, demographic, and anthropometric traits. We discuss the mechanisms inducing these problems and approaches that could help mitigate them. While collider bias should be explored in existing studies, the optimal way to mitigate the problem is to use appropriate sampling strategies at the study design stage.

Many published studies of the current SARS-CoV-2 pandemic have analysed data from non-representative samples of populations. Here, using UK Biobank samples, Gibran Hemani and colleagues discuss the potential for such studies to suffer from collider bias, and provide suggestions for optimising study design to account for this.
【Author keywords】 Risk factors, Infectious diseases, Epidemiology, Statistical methods 【Abstract keywords】 COVID-19, SARS-CoV-2 pandemic, hospital, Genetic, Infection, risk factor, observational study, COVID-19 disease, outcomes, Cohort, Patient, Study design, dataset, behavioural, mechanism, association, Evidence, problems, help, traits, participant, while, variable, Affect, mitigate, approach, populations, likelihood, collider bias, highlight, UK Biobank data, selected, tested, identify, analysed, induce, Numerous, Analysing, infection with SARS-CoV-2, sampling strategy 【Title keywords】 severity, risk, COVID-19 disease
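
The collider bias mechanism described in the abstract can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the analysis performed in the study: the two traits, the selection rule (an individual is sampled when the sum of both traits plus random noise exceeds a threshold), and all names such as `simulate` and `pearson` are hypothetical choices made only for illustration. The two traits are generated independently, so any association seen among the sampled individuals arises purely from selection.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


def simulate(n=100_000, threshold=1.0, seed=0):
    """Return (correlation in the full population, correlation among the sampled)."""
    rng = random.Random(seed)
    # Assumption: two traits that are independent in the full population.
    risk_factor = [rng.gauss(0, 1) for _ in range(n)]
    outcome = [rng.gauss(0, 1) for _ in range(n)]
    # Assumption: both traits raise the chance of being sampled (for example,
    # of being tested). Here an individual is sampled when the sum of the two
    # traits plus independent noise exceeds a threshold.
    sampled = [
        x + y + rng.gauss(0, 1) > threshold
        for x, y in zip(risk_factor, outcome)
    ]
    xs = [x for x, s in zip(risk_factor, sampled) if s]
    ys = [y for y, s in zip(outcome, sampled) if s]
    return pearson(risk_factor, outcome), pearson(xs, ys)


pop_corr, sample_corr = simulate()
print(f"full population: {pop_corr:+.3f}")
print(f"selected sample: {sample_corr:+.3f}")
```

In this setup the correlation in the full population is close to zero, while the correlation within the selected sample is clearly negative, even though neither trait influences the other. The direction and size of the induced association depend on the assumed selection rule, so real datasets may behave differently.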