The Covid-19 pandemic, caused by the SARS-CoV-2 virus, has already infected more than 120 million people, of whom 70 million have recovered and 3 million have died. The high speed of infection has led to the rapid depletion of public health resources in most countries. RT-PCR is the reference diagnostic method for Covid-19. In this work we propose a new technique for representing DNA sequences: they are divided into smaller, overlapping subsequences in a pseudo-convolutional approach and represented by co-occurrence matrices. This technique eliminates the need for multiple sequence alignment. With the proposed method, virus sequences can be identified in a large database of 347,363 virus DNA sequences from 24 virus families and SARS-CoV-2. When comparing SARS-CoV-2 with virus families that cause similar symptoms, we obtained 0.97 ± 0.03 for sensitivity and 0.9919 ± 0.0005 for specificity with an MLP classifier and 30% overlap.
When comparing SARS-CoV-2 with other coronaviruses and healthy human DNA sequences, we obtained 0.99 ± 0.01 for sensitivity and 0.9986 ± 0.0002 for specificity with an MLP classifier and 50% overlap. Therefore, the molecular diagnosis of Covid-19 can be optimized by combining RT-PCR with our pseudo-convolutional method, which identifies SARS-CoV-2 DNA sequences with greater specificity and sensitivity.
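The representation described in the abstract (overlapping subsequences, each summarized by a co-occurrence matrix) can be illustrated with a minimal sketch. The abstract does not give the window length, the exact definition of co-occurrence, or how the per-window matrices are combined, so everything below is an assumption made for illustration: co-occurrence is taken to mean counts of adjacent base pairs, the per-window matrices are summed and normalized, and the names `split_with_overlap`, `cooccurrence_matrix`, and `sequence_features` are hypothetical.

```python
ALPHABET = "ACGT"
INDEX = {base: i for i, base in enumerate(ALPHABET)}


def split_with_overlap(sequence, window, overlap):
    """Cut `sequence` into windows of length `window`.

    `overlap` is the fraction (0 <= overlap < 1) shared by consecutive
    windows, e.g. 0.3 or 0.5 as in the abstract. Assumed scheme, not
    necessarily the authors' exact one.
    """
    if not 0 <= overlap < 1:
        raise ValueError("overlap must be in [0, 1)")
    step = max(1, int(round(window * (1 - overlap))))
    last_start = max(len(sequence) - window, 0)
    return [sequence[i:i + window] for i in range(0, last_start + 1, step)]


def cooccurrence_matrix(subsequence):
    """Return a 4x4 matrix counting each base followed directly by another.

    Assumption: co-occurrence means adjacent pairs. Symbols outside
    A, C, G, T (for example N) are skipped.
    """
    matrix = [[0] * 4 for _ in range(4)]
    for a, b in zip(subsequence, subsequence[1:]):
        if a in INDEX and b in INDEX:
            matrix[INDEX[a]][INDEX[b]] += 1
    return matrix


def sequence_features(sequence, window=8, overlap=0.5):
    """Sum the per-window matrices and return 16 normalized frequencies."""
    total = [[0] * 4 for _ in range(4)]
    for chunk in split_with_overlap(sequence.upper(), window, overlap):
        chunk_matrix = cooccurrence_matrix(chunk)
        for i in range(4):
            for j in range(4):
                total[i][j] += chunk_matrix[i][j]
    count = sum(sum(row) for row in total)
    return [
        total[i][j] / count if count else 0.0
        for i in range(4)
        for j in range(4)
    ]


if __name__ == "__main__":
    features = sequence_features("ACGTACGTAC", window=4, overlap=0.5)
    print(len(features), features)
```

Because every sequence maps to a vector of the same length regardless of its own length, such vectors could be passed directly to a classifier such as an MLP without aligning the sequences first, which is the property the abstract highlights.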
【Author keywords】 Molecular medicine, machine learning
【Abstract keywords】 public health, SARS-CoV-2, pandemic, Infection, RT-PCR, virus, Symptoms, DNA, sensitivity, specificity, Molecular diagnosis, resource, disease, Diagnostic method, Multiple sequence alignment, Classifier, overlap, sequence, DNA sequences, DNA sequence, approach, greater, identify, caused, died, healthy, transmitted, other coronavirus, representing, co-occurrence, the SARS-CoV-2 virus
【Title keywords】 Diagnosis, RT-PCR, virus, sequence