Novel pathogenic coronaviruses, such as SARS-CoV and probably SARS-CoV-2, arise by homologous recombination between co-infecting viruses in a single cell. Identifying possible sources of novel coronaviruses therefore requires identifying hosts of multiple coronaviruses; however, most coronavirus-host interactions remain unknown. Here, by deploying a meta-ensemble of similarity learners from three complementary perspectives (viral, mammalian and network), we predict which mammals are hosts of multiple coronaviruses. At a mean probability cut-off of >0.5, we predict 11.5-fold more coronavirus-host associations, over 30-fold more potential SARS-CoV-2 recombination hosts, and over 40-fold more host species with four or more different subgenera of coronaviruses than have been observed to date (2.4-, 4.25- and 9-fold, respectively, at a cut-off of >0.9821). Our results demonstrate that the potential scale of novel coronavirus generation in wild and domesticated animals has been greatly underappreciated. We identify high-risk species for coronavirus surveillance.

Homologous recombination between co-infecting coronaviruses can produce novel pathogens. Here, Wardeh et al. develop a machine learning approach to predict associations between mammals and multiple coronaviruses and hence estimate the potential for generation of novel coronaviruses by recombination.
【Author keywords】 machine learning, Ecological epidemiology, Viral reservoirs, Ecological networks 【Abstract keywords】 SARS-CoV-2, coronavirus, SARS-CoV, virus, Novel coronavirus, Probability, Surveillance, Recombination, Single Cell, Pathogens, novel, predict, association, Interaction, Homologous recombination, similarity, complementary, Perspective, hosts, cut-off, multiple coronaviruses, identifying, mammalian, pathogenic coronavirus, SARS-CoV-2 recombination, Host, approach, identify, develop, associations, mammal, co-infecting 【Title keywords】 Novel coronavirus, mammalian, Host
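
The abstract reports counts at two mean probability cut-offs (>0.5 and >0.9821). The sketch below illustrates how such a threshold could turn per-perspective probabilities into predicted associations, and how hosts of multiple coronaviruses could then be counted. It is a minimal illustration, not the authors' meta-ensemble: the plain average across the three perspectives, the function names, and all virus names, host names and scores are assumptions made up for this example.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical probabilities for (coronavirus, host) pairs, one score per
# perspective. Every name and number here is invented for illustration.
scores = {
    ("CoV-A", "host_1"): {"viral": 0.98, "mammalian": 0.99, "network": 1.00},
    ("CoV-B", "host_1"): {"viral": 0.70, "mammalian": 0.60, "network": 0.80},
    ("CoV-A", "host_2"): {"viral": 0.40, "mammalian": 0.30, "network": 0.20},
    ("CoV-C", "host_2"): {"viral": 0.55, "mammalian": 0.65, "network": 0.60},
}


def predicted_associations(scores, cutoff):
    """Return the pairs whose mean probability across perspectives exceeds cutoff.

    Assumption: a simple unweighted mean stands in for the ensemble output.
    """
    return {pair for pair, p in scores.items() if mean(p.values()) > cutoff}


def hosts_with_multiple_viruses(pairs, min_viruses=2):
    """Return hosts predicted to carry at least min_viruses distinct viruses."""
    viruses_by_host = defaultdict(set)
    for virus, host in pairs:
        viruses_by_host[host].add(virus)
    return {h for h, v in viruses_by_host.items() if len(v) >= min_viruses}


# Compare the permissive and the strict cut-off quoted in the abstract.
for cutoff in (0.5, 0.9821):
    pairs = predicted_associations(scores, cutoff)
    print(cutoff, len(pairs), sorted(hosts_with_multiple_viruses(pairs)))
```

With these toy scores, raising the cut-off shrinks both the set of predicted associations and the set of candidate recombination hosts, which mirrors the direction of the difference between the fold-changes reported at >0.5 and at >0.9821.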