WO2024008955A1 - Method of screening for severe COVID-19 susceptibility

Publication number
WO2024008955A1
Authority
WO
WIPO (PCT)
Prior art keywords
cpg sites
subject
methylation levels
cpg
covid
Application number
PCT/EP2023/068925
Other languages
French (fr)
Inventor
Karl Trygve KALLEBERG
Arne SØRAAS
Janis Frederick NEUMANN
Cathrine Lund HADLEY
Espen RISKEDAL
Original Assignee
Age Labs As
Application filed by Age Labs As

Classifications

    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6876: Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
    • C12Q1/6888: Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for detection or identification of organisms
    • C12Q1/6883: Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material
    • C12Q2600/00: Oligonucleotides characterized by their use
    • C12Q2600/112: Disease subtyping, staging or classification
    • C12Q2600/118: Prognosis of disease development
    • C12Q2600/154: Methylation markers

Definitions

  • the present invention relates generally to methods of screening for susceptibility to severe COVID-19, as well as kits for screening for susceptibility to severe COVID-19.
  • The highly contagious severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) emerged in Wuhan, China, in late 2019 (Ref 1), and has since spread across the world, infecting billions of people and causing millions of deaths.
  • ARDS: acute respiratory distress syndrome (Ref 2).
  • The clinical manifestations are systemic and heterogeneous, and the organ damage appears to be largely immune-mediated and regulated by epigenetic changes (Ref 3).
  • the infection may trigger a wide range of autoimmune responses, including multisystem inflammatory syndrome in children (MIS-C) (Ref 4), hemolytic anemia, myocarditis, and Guillain-Barré syndrome (Ref 5).
  • Older patients and individuals with comorbidities and compromised immunity appear to be more prone to severe disease (Ref 6), but it is generally difficult to know in advance who will be severely affected, and who will escape with mild disease (Ref 7).
  • COVID-19 Severity Test (CST)
  • the inventors used two previously published data sets to train and develop SARS-CoV-2 severity models, while their own novel data was used solely as holdout data to validate their findings. To the knowledge of the inventors, no existing study has so far used a fully independent data set to test COVID-19 severity models.
  • the present inventors have identified 433 CpG sites in total which are highly relevant for determining susceptibility to severe COVID-19.
  • the 433 CpG sites were identified from 3 different filtering processes devised by the inventors, yielding 3 overlapping lists of CpG sites.
  • Two of the filtering processes were used to identify sites suitable for measurement in a microarray-based protocol. Specifically, one of the filtering processes involved training a model to identify a list of sites for which best performance in COVID-19 severity prediction was achieved based on the datasets used; this yielded a list of 177 CpG sites (as listed in Table 1).
  • the other filtering process involved the use of stability selection (Ref 9) to identify the most robust CpG sites for predicting susceptibility to severe COVID-19 (in brief, a selection of CpG sites with the highest probability of appearing in other potential models); this yielded a list of 251 CpG sites (as listed in Table 2).
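The idea of keeping the sites that are most likely to reappear in other potential models can be sketched as follows. This is a minimal illustration under stated assumptions, not the inventors' pipeline: the base selector (ranking sites by the absolute difference in mean methylation between classes), the function names `top_k_by_mean_gap` and `stability_selection`, and every parameter value are hypothetical. In practice stability selection is commonly paired with a sparse regression model rather than this simple ranking.

```python
import random

def top_k_by_mean_gap(rows, labels, k):
    """Hypothetical base selector: rank sites by |mean(severe) - mean(non-severe)|."""
    severe = [r for r, y in zip(rows, labels) if y == 1]
    mild = [r for r, y in zip(rows, labels) if y == 0]
    gaps = []
    for j in range(len(rows[0])):
        gap = abs(sum(r[j] for r in severe) / len(severe)
                  - sum(r[j] for r in mild) / len(mild))
        gaps.append((gap, j))
    gaps.sort(reverse=True)
    return {j for _, j in gaps[:k]}

def stability_selection(rows, labels, k, n_rounds=200, threshold=0.8, seed=0):
    """Return (stable_sites, selection_frequencies) over random half-samples."""
    rng = random.Random(seed)
    idx_severe = [i for i, y in enumerate(labels) if y == 1]
    idx_mild = [i for i, y in enumerate(labels) if y == 0]
    counts = [0] * len(rows[0])
    for _ in range(n_rounds):
        # Draw half of each class so both classes are always represented.
        pick = (rng.sample(idx_severe, max(1, len(idx_severe) // 2))
                + rng.sample(idx_mild, max(1, len(idx_mild) // 2)))
        chosen = top_k_by_mean_gap([rows[i] for i in pick],
                                   [labels[i] for i in pick], k)
        for j in chosen:
            counts[j] += 1
    freq = [c / n_rounds for c in counts]
    stable = [j for j, f in enumerate(freq) if f >= threshold]
    return stable, freq
```

Here `rows` is a list of per-subject lists of methylation levels and `labels` marks severe (1) versus non-severe (0) subjects. Sites whose selection frequency reaches `threshold` are treated as robust.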
  • the two lists of CpG sites generated by these two filtering processes overlap substantially: 91 CpG sites (listed in Table 6) are common to Tables 1 and 2, so the two lists together contain 337 distinct CpG sites (177 + 251 - 91; listed in Table 5).
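The site counts quoted so far can be checked with simple inclusion-exclusion arithmetic. The sketch below uses only the numbers stated in the text. The last step additionally assumes that the 433 sites of Table 4 are exactly the union of all three filtered lists, which the text implies but does not state outright.

```python
# Sizes reported in the text.
n_table1 = 177   # model-based list (Table 1)
n_table2 = 251   # stability-selection list (Table 2)
n_common = 91    # sites present in both lists (Table 6)
n_total = 433    # all sites identified (Table 4)

# Inclusion-exclusion for two sets: |A or B| = |A| + |B| - |A and B|
n_union_1_2 = n_table1 + n_table2 - n_common

# Assumption: Table 4 is the union of all three lists, so the remainder
# is the number of sites contributed only by the third filtering process.
n_only_third = n_total - n_union_1_2

print(n_union_1_2)   # 337
print(n_only_third)  # 96
```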
  • the third filtering process was used to identify sites particularly suitable for measurement in a polymerase chain reaction (PCR)-based protocol.
  • PCR machines are usually available onsite in hospitals, unlike microarray machines, and so it is advantageous to provide a protocol for determining susceptibility to severe COVID-19 which is specifically optimized for use with PCR.
  • CpG sites were filtered (selected) inter alia based on the magnitude of the dissimilarity of the methylation ratio distributions between severe and non-severe patients, in order to ensure the usefulness of CpG sites in a lab setting where, potentially, only a few CpG sites can be utilized at once in the absence of high throughput data.
  • the filtering process yielded a final list of 97 CpG sites as provided in Table 3. While the list of 97 CpG sites is particularly suitable for use in a PCR-based protocol, it will be readily understood that these sites are also suitable and could be used advantageously in a microarray-based protocol. Table 7 provides a subset of 34 CpG sites from Table 3 which are particularly preferred.
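Ranking sites by how differently their methylation ratios are distributed in severe and non-severe patients can be sketched as below. The text does not name the dissimilarity measure that was used; the two-sample Kolmogorov-Smirnov statistic is swapped in here purely as one plausible choice, and the function names and site identifiers are hypothetical.

```python
def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov statistic: largest gap between empirical CDFs."""
    a, b = sorted(a), sorted(b)
    i = j = 0
    d = 0.0
    while i < len(a) and j < len(b):
        x = min(a[i], b[j])
        # Advance past every value equal to x in both samples.
        while i < len(a) and a[i] == x:
            i += 1
        while j < len(b) and b[j] == x:
            j += 1
        d = max(d, abs(i / len(a) - j / len(b)))
    return d

def rank_sites(severe, non_severe, top_n):
    """severe / non_severe: dict of site id -> list of methylation ratios (0..1)."""
    scores = {s: ks_statistic(severe[s], non_severe[s]) for s in severe}
    return sorted(scores, key=scores.get, reverse=True)[:top_n]

# Toy data with made-up site ids: site_A separates the groups, site_B does not.
severe = {"site_A": [0.80, 0.90, 0.85], "site_B": [0.50, 0.40, 0.60]}
non_severe = {"site_A": [0.10, 0.20, 0.15], "site_B": [0.45, 0.50, 0.55]}
best = rank_sites(severe, non_severe, top_n=1)
```

A large statistic means the two groups barely overlap at that site, which matters when only a handful of sites can be assayed at once.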
  • the invention provides a method of screening for (or diagnosing) susceptibility (or the presence or absence of susceptibility) of a subject to (or to develop, or to developing, or to the development of) severe COVID-19 (or a method of determining whether or not a subject will develop severe COVID-19, or a method of determining the likelihood (or probability) that a subject will develop severe COVID-19), the method comprising using the methylation levels of at least or at most 1, 2, 3, 4, 5, 7, 10, 15, 20, 25, 30, 40, 50, 60, 70, 75, 80, 90, 100, 125, 150, 200, 250, 300, 350, 400 or 433 CpG sites selected from the 433 CpG sites listed in Table 4 in DNA from a biological sample obtained from the subject in order to screen for susceptibility to severe COVID-19 in the subject, wherein said methylation levels are used to provide an indication of the susceptibility of the subject to (or to develop, or to developing, or to the development of) severe COVID
  • the method of the invention may be viewed as a method of predicting (or prognosing) the course of COVID-19 in a subject, or a method of obtaining an indication of the susceptibility of a subject to (or to develop, or to developing, or to the development of) severe COVID-19, or a method of obtaining clinically relevant information about a subject, the method comprising using the methylation levels of at least or at most 1, 2, 3, 4, 5, 7, 10, 15, 20, 25, 30, 40, 50, 60, 70, 75, 80, 90, 100, 125, 150, 200, 250, 300, 350, 400 or 433 CpG sites selected from the 433 CpG sites listed in Table 4 in DNA from a biological sample obtained from the subject in order to screen for susceptibility of the subject to severe COVID-19, wherein said methylation levels are used to provide an indication of the susceptibility of the subject to (or to develop, or to developing, or to the development of) severe COVID-19.
  • where a method of or kit for “obtaining clinically relevant information” is contemplated, or a method of or kit for “diagnosing” or “prognosing” or similar is contemplated, a method of or kit for “providing information (or clinically relevant information) for diagnosing (or prognosing)” may also be contemplated.
  • the method comprises using the methylation levels of at least or at most the CpG sites referred to in each of Tables 1 to 3 as CpG site numbers 1 to 50, 1 to 49, 1 to 48, 1 to 47, 1 to 46, 1 to 45, 1 to 44, 1 to 43, 1 to 42, 1 to 41, 1 to 40, 1 to 39, 1 to 38, 1 to 37, 1 to 36,
  • the methylation levels are used to provide an indication of the susceptibility of the subject to (or to develop) severe COVID-19, etc.
  • the methylation levels are indicative of, or used as an indication of, or used to provide an indication of, the susceptibility of the subject to (or to develop) severe COVID-19, etc.
  • the invention provides a method of screening for (or diagnosing) susceptibility (or the presence or absence of susceptibility) of a subject to (or to develop, or to developing, or to the development of) severe COVID-19 (or a method of determining whether or not a subject will develop severe COVID-19, or a method of determining the likelihood (or probability) that a subject will develop severe COVID-19), the method comprising using the methylation levels of at least or at most:
  • the method of the invention may be viewed as a method of predicting (or prognosing) the course of COVID-19 in a subject, or a method of obtaining an indication of the susceptibility of a subject to (or to develop, or to developing, or to the development of) severe COVID-19, or a method of obtaining clinically relevant information about a subject, the method comprising using the methylation levels of at least or at most:
  • the method comprises using the methylation levels of at least or at most the CpG sites referred to in each of Tables 1 to 3 as CpG site numbers 1 to 50, 1 to 49, 1 to 48, 1 to 47, 1 to 46, 1 to 45, 1 to 44, 1 to 43, 1 to 42, 1 to 41, 1 to 40, 1 to 39, 1 to 38, 1 to 37, 1 to 36, 1 to 35, 1 to 34, 1 to 33, 1 to 32, 1 to 31, 1 to 30, 1 to 29, 1 to 28, 1 to 27, 1 to 26, 1 to 25, 1 to 24, 1 to 23, 1 to 22, 1 to 21, 1 to 20, 1 to 19, 1 to 18, 1 to 17, 1 to 16, 1 to 15, 1 to 14, 1 to 13, 1 to 12, 1 to 11, 1 to 10, 1 to 9, 1 to 8, 1 to 7, 1 to 6, 1 to 5, 1 to 4, 1 to 3, 1 to 2 (i.e. at least or at most CpG site numbers 1 and 2 of Tables 1 to 3 (so 6 sites in total)), or 1 (i.e. at least or at most
  • the invention provides a method of screening for (or diagnosing) susceptibility (or the presence or absence of susceptibility) of a subject to (or to develop, or to developing, or to the development of) severe COVID-19 (or a method of determining whether or not a subject will develop severe COVID-19, or a method of determining the likelihood (or probability) that a subject will develop severe COVID-19), the method comprising using the methylation levels of at least or at most 1, 2, 3, 4, 5, 7, 10, 15, 20, 25, 30, 40, 50, 60, 70, 75, 80, 90, 100, 125, 150, 200, 250, or 251 CpG sites selected from the 251 CpG sites listed in Table 2 in DNA from a biological sample obtained from the subject in order to screen for susceptibility to severe COVID-19 in the subject, wherein said methylation levels are used to provide an indication of the susceptibility of the subject to (or to develop, or to developing, or to the development of) severe COVID-19 (or wherein said methylation levels
  • the method of the invention may be viewed as a method of predicting (or prognosing) the course of COVID-19 in a subject, or a method of obtaining an indication of the susceptibility of a subject to (or to develop, or to developing, or to the development of) severe COVID-19, or a method of obtaining clinically relevant information about a subject, the method comprising using the methylation levels of at least or at most 1, 2, 3, 4, 5, 7, 10, 15, 20, 25, 30, 40, 50, 60, 70, 75, 80, 90, 100, 125, 150, 200, 250, or 251 CpG sites selected from the 251 CpG sites listed in Table 2 in DNA from a biological sample obtained from the subject in order to screen for susceptibility of the subject to severe COVID-19, wherein said methylation levels are used to provide an indication of the susceptibility of the subject to (or to develop, or to developing, or to the development of) severe COVID-19.
  • the method comprises using the methylation levels of at least or at most the CpG sites referred to in Table 2 as CpG site numbers 1 to 50, 1 to 49, 1 to 48, 1 to 47, 1 to 46, 1 to 45, 1 to 44, 1 to 43, 1 to 42, 1 to 41, 1 to 40, 1 to 39, 1 to 38, 1 to 37, 1 to 36, 1 to 35, 1 to 34, 1 to 33, 1 to 32, 1 to 31, 1 to 30, 1 to 29, 1 to 28, 1 to 27, 1 to 26, 1 to 25, 1 to 24, 1 to 23, 1 to 22, 1 to 21, 1 to 20, 1 to 19, 1 to 18, 1 to 17, 1 to 16, 1 to 15, 1 to 14, 1 to 13, 1 to 12, 1 to 11, 1 to 10, 1 to 9, 1 to 8, 1 to 7, 1 to 6, 1 to 5, 1 to 4, 1 to 3, 1 to 2 (i.e. at least or at most CpG site numbers 1 and 2 of Table 2 (so 2 sites in total)), or 1 (i.e. at least or at most CpG site number 1 of
  • the invention provides a method of screening for (or diagnosing) susceptibility (or the presence or absence of susceptibility) of a subject to (or to develop, or to developing, or to the development of) severe COVID-19 (or a method of determining whether or not a subject will develop severe COVID-19, or a method of determining the likelihood (or probability) that a subject will develop severe COVID-19), the method comprising using the methylation levels of at least or at most 1, 2, 3, 4, 5, 7, 10, 15, 20, 25, 30, 40, 50, 60, 70, 75, 80, 90, 100, 125, 150, or 177 CpG sites selected from the 177 CpG sites listed in Table 1 in DNA from a biological sample obtained from the subject in order to screen for susceptibility to severe COVID-19 in the subject, wherein said methylation levels are used to provide an indication of the susceptibility of the subject to (or to develop, or to developing, or to the development of) severe COVID-19 (or wherein said methylation levels are
  • the method of the invention may be viewed as a method of predicting (or prognosing) the course of COVID-19 in a subject, or a method of obtaining an indication of the susceptibility of a subject to (or to develop, or to developing, or to the development of) severe COVID-19, or a method of obtaining clinically relevant information about a subject, the method comprising using the methylation levels of at least or at most 1, 2, 3, 4, 5, 7, 10, 15, 20, 25, 30, 40, 50, 60, 70, 75, 80, 90, 100, 125, 150, or 177 CpG sites selected from the 177 CpG sites listed in Table 1 in DNA from a biological sample obtained from the subject in order to screen for susceptibility of the subject to severe COVID-19, wherein said methylation levels are used to provide an indication of the susceptibility of the subject to (or to develop, or to developing, or to the development of) severe COVID-19.
  • the method comprises using the methylation levels of at least or at most the CpG sites referred to in Table 1 as CpG site numbers 1 to 50, 1 to 49, 1 to 48, 1 to 47, 1 to 46, 1 to 45, 1 to 44, 1 to 43, 1 to 42, 1 to 41, 1 to 40, 1 to 39, 1 to 38, 1 to 37, 1 to 36, 1 to 35, 1 to 34, 1 to 33, 1 to 32, 1 to 31, 1 to 30, 1 to 29, 1 to 28, 1 to 27, 1 to 26, 1 to 25, 1 to 24, 1 to 23, 1 to 22, 1 to 21, 1 to 20, 1 to 19, 1 to 18, 1 to 17, 1 to 16, 1 to 15, 1 to 14, 1 to 13, 1 to 12, 1 to 11, 1 to 10, 1 to 9, 1 to 8, 1 to 7, 1 to 6, 1 to 5, 1 to 4, 1 to 3, 1 to 2 (i.e.
  • the invention provides a method of screening for (or diagnosing) susceptibility (or the presence or absence of susceptibility) of a subject to (or to develop, or to developing, or to the development of) severe COVID-19 (or a method of determining whether or not a subject will develop severe COVID-19, or a method of determining the likelihood (or probability) that a subject will develop severe COVID-19), the method comprising using the methylation levels of at least or at most 1, 2, 3, 4, 5, 7, 10, 15, 20, 25, 30, 40, 50, 60, 70, 75, 80, 90, or 91 CpG sites selected from the 91 CpG sites listed in Table 6 in DNA from a biological sample obtained from the subject in order to screen for susceptibility to severe COVID-19 in the subject, wherein said
  • the method of the invention may be viewed as a method of predicting (or prognosing) the course of COVID-19 in a subject, or a method of obtaining an indication of the susceptibility of a subject to (or to develop, or to developing, or to the development of) severe COVID-19, or a method of obtaining clinically relevant information about a subject, the method comprising using the methylation levels of at least or at most 1, 2, 3, 4, 5, 7, 10, 15, 20, 25, 30, 40, 50, 60, 70, 75, 80, 90, or 91 CpG sites selected from the 91 CpG sites listed in Table 6 in DNA from a biological sample obtained from the subject in order to screen for susceptibility of the subject to severe COVID-19, wherein said methylation levels are used to provide an indication of the susceptibility of the subject to (or to develop, or to developing, or to the development of) severe COVID-19.
  • the invention provides a method of screening for (or diagnosing) susceptibility (or the presence or absence of susceptibility) of a subject to (or to develop, or to developing, or to the development of) severe COVID-19 (or a method of determining whether or not a subject will develop severe COVID-19, or a method of determining the likelihood (or probability) that a subject will develop severe COVID-19), the method comprising using the methylation levels of at least or at most 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 12, 15, 20, 25, 30, 40, 50, 60, 70, 75, 80, 90, or 97 CpG sites selected from the 97 CpG sites listed in Table 3 in DNA from a biological sample obtained from the subject in order to screen for susceptibility to severe COVID-19 in the subject, wherein said methylation levels are used to provide an indication of the susceptibility of the subject to (or to develop, or to developing, or to the development of) severe COVID-19 (or wherein said methylation levels are used to determine whether or
  • the method of the invention may be viewed as a method of predicting (or prognosing) the course of COVID-19 in a subject, or a method of obtaining an indication of the susceptibility of a subject to (or to develop, or to developing, or to the development of) severe COVID-19, or a method of obtaining clinically relevant information about a subject, the method comprising using the methylation levels of at least or at most 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 12, 15, 20, 25, 30, 40, 50, 60, 70, 75, 80, 90, or 97 CpG sites selected from the 97 CpG sites listed in Table 3 in DNA from a biological sample obtained from the subject in order to screen for susceptibility of the subject to severe COVID-19, wherein said methylation levels are used to provide an indication of the susceptibility of the subject to (or to develop, or to developing, or to the development of) severe COVID-19.
  • the method comprises using the methylation levels of at least or at most the CpG sites referred to in Table 3 as CpG site numbers 1 to 50, 1 to 49, 1 to 48, 1 to 47, 1 to 46, 1 to 45, 1 to 44, 1 to 43, 1 to 42, 1 to 41, 1 to 40, 1 to 39, 1 to 38, 1 to 37, 1 to 36, 1 to 35, 1 to 34, 1 to 33, 1 to 32, 1 to 31, 1 to 30, 1 to 29, 1 to 28, 1 to 27, 1 to 26, 1 to 25, 1 to 24, 1 to 23, 1 to 22, 1 to 21, 1 to 20, 1 to 19, 1 to 18, 1 to 17, 1 to 16, 1 to 15, 1 to 14, 1 to 13, 1 to 12, 1 to 11, 1 to 10, 1 to 9, 1 to 8, 1 to 7, 1 to 6, 1 to 5, 1 to 4, 1 to 3, 1 to 2 (i.e. at least or at most CpG site numbers 1 and 2 of Table 3 (so 2 sites in total)), or 1 (i.e. at least or at most CpG site number 1 of
  • the invention provides a method of screening for (or diagnosing) susceptibility (or the presence or absence of susceptibility) of a subject to (or to develop, or to developing, or to the development of) severe COVID-19 (or a method of determining whether or not a subject will develop severe COVID-19, or a method of determining the likelihood (or probability) that a subject will develop severe COVID-19), the method comprising using the methylation levels of at least or at most 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 12, 15, 20, 25, 30, or 34 CpG sites selected from the 34 CpG sites listed in Table 7 in DNA from a biological sample obtained from the subject in order to screen for susceptibility to severe COVID-19 in the subject, wherein said methylation levels are used to provide an indication of the susceptibility of the subject to (or to develop, or to developing, or to the development of) severe COVID-19 (or wherein said methylation levels are used to determine whether or not the subject will develop severe COVID-19, or to determine
  • the method of the invention may be viewed as a method of predicting (or prognosing) the course of COVID-19 in a subject, or a method of obtaining an indication of the susceptibility of a subject to (or to develop, or to developing, or to the development of) severe COVID-19, or a method of obtaining clinically relevant information about a subject, the method comprising using the methylation levels of at least or at most 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 12, 15, 20, 25, 30, or 34 CpG sites selected from the 34 CpG sites listed in Table 7 in DNA from a biological sample obtained from the subject in order to screen for susceptibility of the subject to severe COVID-19, wherein said methylation levels are used to provide an indication of the susceptibility of the subject to (or to develop, or to developing, or to the development of) severe COVID-19.
  • the method comprises using the methylation levels of at least or at most the CpG sites referred to in Table 7 as CpG site numbers 1 to 34, 1 to 33, 1 to 32, 1 to 31, 1 to 30, 1 to 29, 1 to 28, 1 to 27, 1 to 26, 1 to 25, 1 to 24, 1 to 23, 1 to 22, 1 to 21, 1 to 20, 1 to 19, 1 to 18, 1 to 17, 1 to 16, 1 to 15, 1 to 14, 1 to 13, 1 to 12, 1 to 11, 1 to 10, 1 to 9, 1 to 8, 1 to 7, 1 to 6, 1 to 5, 1 to 4, 1 to 3, 1 to 2 (i.e. at least or at most CpG site numbers 1 and 2 of Table 7 (so 2 sites in total)), or 1 (i.e. at least or at most CpG site number 1 of Table 7 (so 1 site in total)).
  • the method of the invention comprises a step of obtaining (or acquiring or collecting) the methylation levels before the step of using the methylation levels.
  • the obtaining (or acquiring or collecting) of the methylation levels may include the measuring of the methylation levels, however this is not essential.
  • the obtaining (or acquiring or collecting) of the methylation levels may comprise or consist of receiving methylation levels (e.g. methylation levels that have been previously measured), for example from an external source or a third party.
  • where methylation levels are said to be “indicative of (the presence or absence of) susceptibility to severe COVID-19 in the subject” or “used to provide an indication of (the presence or absence of) susceptibility to severe COVID-19 in the subject” or other similar terms, it is meant that there is a correlation (e.g. a positive or negative correlation) between the respective methylation level and susceptibility to severe COVID-19 in the subject.
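A positive or negative correlation between one site's methylation level and the severity outcome can be illustrated with a plain Pearson coefficient against a 0/1 severity label. This is a sketch on made-up numbers; the text does not say which correlation measure is intended, and the helper name `pearson` is an assumption.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Toy data: methylation levels at one hypothetical site, and severity labels.
levels = [0.2, 0.3, 0.7, 0.8]
severe = [0, 0, 1, 1]           # 1 = severe, 0 = non-severe

r_pos = pearson(levels, severe)                   # higher methylation, more severe
r_neg = pearson([1 - v for v in levels], severe)  # mirrored site, opposite sign
```

A site is informative in either direction: what matters is the strength of the association, not its sign.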
  • COVID-19 stands for “Coronavirus disease 2019” and refers to the disease caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). In most countries, the determination of COVID-19 severity follows guidelines that are similar to those provided by the NIH to physicians in the US. Any given patient with a positive PCR test for SARS-CoV-2 is placed at the appropriate level on the following clinical spectrum:
  • Severe illness: Same as moderate illness plus SpO2 < 94%. Management is hospitalization with O2 therapy. May deteriorate rapidly.
  • “severe COVID-19” refers to a degree of COVID-19 illness wherein the subject is hospitalized (or wherein the subject should be hospitalized, or wherein hospitalization of the subject is warranted). This is in contrast to “non-severe COVID”, which refers to a degree of COVID-19 illness wherein the subject is not hospitalized (or wherein the subject should not be hospitalized, or wherein hospitalization of the subject is not warranted).
  • “severe COVID-19” refers to a degree of COVID-19 which is characterized by a score of 4 or 5 on the NIH clinical spectrum as recited above.
  • “non-severe COVID” refers to a degree of COVID-19 illness which is characterized by a score of 1, 2 or 3 on the NIH clinical spectrum as recited above.
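The two-class definition above reduces to a simple threshold on the clinical-spectrum score. The sketch below assumes the score is recorded as an integer from 1 to 5; the function name `severity_class` is hypothetical.

```python
def severity_class(nih_score):
    """Map a score on the 1-5 clinical spectrum to the two classes used in the text."""
    if nih_score not in (1, 2, 3, 4, 5):
        raise ValueError("score must be an integer from 1 to 5")
    # Scores 4 and 5 are "severe"; scores 1, 2 and 3 are "non-severe".
    return "severe" if nih_score >= 4 else "non-severe"
```

For example, `severity_class(4)` gives `"severe"` and `severity_class(3)` gives `"non-severe"`.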
  • the methods, etc., of the invention as described herein can be used to classify subjects into those with susceptibility to severe as opposed to non-severe COVID-19, or to classify subjects into those which will (or are likely to) develop severe COVID-19 as opposed to developing (or maintaining) non-severe COVID-19.
  • the present invention provides a method for determining the likelihood (or probability) that severe COVID-19 will develop in a subject.
  • the methylation level of one or more of the CpG sites as described elsewhere herein in the sample, or the overall probability value determined therefrom, shows an association with the probability of severe COVID-19 that is predicted to develop in the subject.
  • the methylation level of one or more of the CpG sites as described elsewhere herein, or the overall probability value determined therefrom, is indicative of the probability that severe COVID-19 will develop in the subject (or of the probability of severe COVID-19 progression).
  • the methods of the invention can thus be used in the selection of patients for therapy or for triaging. Hence, the method of the invention may be used to enable those subjects having greater probability of developing severe COVID-19 to be prioritised for treatment over those with a lower probability of developing severe COVID-19.
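One way to picture the step from methylation levels to an overall probability value, and then to a triage order, is sketched below. The passages above do not specify how the probability is computed; the logistic form, the weights, the intercept and the site identifiers are all invented for illustration, and the function names are hypothetical.

```python
import math

def severe_probability(levels, weights, intercept):
    """Hypothetical logistic score over methylation levels (site id -> level)."""
    z = intercept + sum(weights[s] * levels[s] for s in weights)
    return 1.0 / (1.0 + math.exp(-z))

def triage_order(subjects, weights, intercept):
    """Return subject ids sorted from highest to lowest predicted probability."""
    probs = {sid: severe_probability(lv, weights, intercept)
             for sid, lv in subjects.items()}
    return sorted(probs, key=probs.get, reverse=True)

# Made-up coefficients and subjects, for illustration only.
weights = {"site_A": 4.0, "site_B": -3.0}
intercept = -0.5
subjects = {
    "p1": {"site_A": 0.9, "site_B": 0.1},
    "p2": {"site_A": 0.2, "site_B": 0.8},
    "p3": {"site_A": 0.5, "site_B": 0.5},
}
order = triage_order(subjects, weights, intercept)
```

Subjects at the front of `order` would be prioritised for treatment. Any monotone score would produce a ranking in the same way, so the logistic form is not essential to the triage step.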
  • “selectively measuring” refers to methods wherein the methylation levels of only a finite number of CpG sites are measured, rather than the methylation levels of all or essentially all potential CpG sites in a genome.
  • “selectively measuring” methylation levels can refer to measuring the methylation levels of no more than 100000, 90000, 80000, 70000, 60000, 50000, 40000, 30000, 20000, 10000, 9000, 8000, 7000, 6000, 5000, 4000, 3000, 2000, 1000, 900, 800, 700, 600, 500, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 91, 95, 97, 100, 110, 120, 130, 140, 150, 160, 170, 177, 180, 190, 200, 210, 220, 230, 240, 250, 251, 300, 350, 400, or 433 different CpG sites.
  • “selectively using” methylation levels can refer to using the methylation levels of no more than 100000, 90000, 80000, 70000, 60000, 50000, 40000, 30000, 20000, 10000, 9000, 8000, 7000, 6000, 5000, 4000, 3000, 2000, 1000, 900, 800, 700, 600, 500, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 91, 95, 97, 100, 110, 120, 130, 140, 150, 160, 170, 177, 180, 190, 200, 210, 220,
  • “selectively detecting” refers to methods wherein the methylation levels of only a finite number of CpG sites are measured, rather than the methylation levels of all or essentially all potential CpG sites in a genome.
  • “selectively detecting” methylation levels can refer to detecting the methylation levels of no more than 100000, 90000, 80000, 70000, 60000, 50000, 40000, 30000, 20000, 10000, 9000, 8000, 7000, 6000, 5000, 4000, 3000, 2000, 1000, 900, 800, 700, 600, 500, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 91, 95, 97, 100, 110, 120, 130, 140, 150, 160, 170, 177, 180, 190, 200, 210, 220, 230, 240, 250, 251, 300, 350, 400, or 433 different CpG sites.
  • methods of the present invention may comprise using, determining or measuring, etc., the methylation levels of one or more CpG sites “selected from the list of” certain specific CpG sites set forth herein.
  • the methylation levels of one or more of the specific CpG sites “selected from the list” set forth herein are used, measured or determined, etc.
  • the methylation levels of one or more other (or distinct or alternative) CpG sites, or of one or more other (or distinct or alternative) CpG sites belonging to one or more other genes, and/or one or more other biomarkers may additionally be used, measured or determined.
  • “selected from the list of” may be an “open” term.
  • the methylation levels of only one or more of the specific CpG sites discussed herein are used, measured or determined, etc. (e.g. the methylation levels of other CpG sites or other biomarkers are not used, measured or determined).
  • CpG site is given its art recognised meaning and refers to the location in a nucleic acid molecule, or sequence representation of the molecule, where a cytosine nucleotide and guanine nucleotide occur, the 3' oxygen of the cytosine nucleotide being covalently attached to the 5' phosphate of the guanine nucleotide.
  • the nucleic acid is typically DNA.
  • the cytosine nucleotide can optionally be methylated at position 5 of the pyrimidine ring.
  • Such CpG sites can be referred to as methylated CpG sites.
  • nucleic acid sequences recited herein are recited in the 5’ to 3’ direction.
  • methylation level includes the average methylation state of a CpG site in a biological sample.
  • Methylation levels of each CpG site may be quantified by methods known in the art, for example in the form of a beta value or M value.
  • the beta value is the ratio of the methylated probe intensity and the overall intensity (sum of methylated and unmethylated probe intensities).
  • the beta-value is thus generally and conveniently a number between 0 and 1, or 0 and 100%. A value of zero indicates that all copies of the CpG site in the sample were completely unmethylated (no methylated molecules were measured) and a value of one (or 100%) indicates that every copy of the CpG site in the sample was methylated.
  • the methylation levels referred to herein are methylation states.
  • the “methylation state” of a particular CpG site in a particular cell is either methylated or nonmethylated.
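The quantities defined above can be sketched in a few lines of Python. This is a minimal illustration, not the applicant's implementation: the function names are hypothetical, the beta value follows the ratio stated above without any array-specific offset, and the M value uses the commonly used relation M = log2(beta / (1 - beta)), which is an assumption because the text does not define it.

```python
import math


def beta_value(methylated: float, unmethylated: float) -> float:
    """Ratio of methylated probe intensity to overall intensity (0 to 1)."""
    total = methylated + unmethylated
    if total <= 0:
        raise ValueError("total probe intensity must be positive")
    return methylated / total


def m_value(beta: float, eps: float = 1e-6) -> float:
    """Assumed relation: log2 ratio of methylated to unmethylated signal."""
    b = min(max(beta, eps), 1.0 - eps)  # clamp so the logarithm stays finite
    return math.log2(b / (1.0 - b))


def level_from_states(states: list[bool]) -> float:
    """Average methylation state over the copies of one CpG site in a sample."""
    if not states:
        raise ValueError("at least one observed copy is required")
    return sum(states) / len(states)
```

For instance, a methylated intensity of 300 and an unmethylated intensity of 100 give a beta value of 0.75, and two methylated copies out of four give a methylation level of 0.5.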
  • the methods of the invention are carried out in vitro or ex vivo (unless the context requires otherwise, e.g. administration steps).
  • methylation levels of any number of the 433 CpG sites listed in Table 4 could be used, i.e. the methylation levels of at least or at most or exactly 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40,
  • methylation levels of any number of the 251 CpG sites listed in Table 2 could be used, i.e. the methylation levels of at least or at most or exactly 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40,
  • any particular selection of the 251 CpG sites listed in Table 2 could be used, i.e. the methylation level of CpG site number 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46,
  • methylation levels of any number of the 177 CpG sites listed in Table 1 could be used, i.e. the methylation levels of at least or at most or exactly 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15,
  • methylation levels of any number of the 91 CpG sites listed in Table 6 could be used, i.e. the methylation levels of at least or at most or exactly 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15,
  • any particular selection of the 91 CpG sites listed in Table 6 could be used, i.e. the methylation level of CpG site number 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40,
  • methylation levels of any number of the 97 CpG sites listed in Table 3 could be used, i.e. the methylation levels of at least or at most or exactly 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40,
  • methylation levels of any number of the 34 CpG sites listed in Table 7 could be used, i.e. the methylation levels of at least or at most or exactly 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, or 34 of the CpG sites recited in Table 7 could be used.
  • CpG site cg13452062 is not used or included.
  • a diagnosis or diagnosing step e.g. a step of diagnosing susceptibility to severe COVID-19 or the presence or absence of susceptibility to severe COVID-19 in a subject
  • a classification or classification step e.g. a step of classifying a subject as having or not having susceptibility to severe COVID-19.
  • the classification or diagnosing can be achieved by assignment of a cutoff value as described elsewhere herein.
  • the indication of the susceptibility (or presence or absence of susceptibility) to severe COVID-19 in the subject may be provided or derived using machine learning (or a machine learning technique).
  • the indication may be provided using appropriate techniques such as random forest, gradient boosting, a neural network, or linear or logistic regression.
  • the indication may be provided using a combination of appropriate techniques, for example using a logistic regression model followed by a support vector machine (svm) model, or for example using a random forest classification model followed by a feedforward neural network.
  • scoring methods, scoring systems, markers or formulas can be used that comprise any appropriate combination of the CpG sites or methylation levels of the invention as described herein in order to arrive at an indication, e.g. in the form of a value or score, which can then be used for diagnosis of susceptibility (or of the presence or absence of susceptibility) to severe COVID-19 (this can also be referred to as a COVID-19 severity score or a severity risk score).
  • said methods etc. can be an algorithm that comprises any appropriate combination of the CpG sites or methylation levels as an input, to e.g. perform pattern recognition of the samples, in order to arrive at an indication, e.g.
  • Non-limiting examples of such algorithms include machine learning algorithms that implement classification (algorithmic classifiers), such as linear classifiers (e.g. Fisher’s linear discriminant, logistic regression, naive Bayes classifier, perceptron); support vector machines (e.g. least squares support vector machines); quadratic classifiers; kernel estimation (e.g. k-nearest neighbor); boosting (e.g. gradient boosting); decision trees (e.g. random forests); neural networks; and learning vector quantization.
  • classifiers e.g. machine learning, random forest, gradient boosting or logistic regression
  • Such classifiers can conveniently be trained on methylation levels from a training set of samples and then tested in terms of accuracy (or balanced accuracy) on a test set of samples.
  • the classifier may generate a black-box model that is trained on the most important methylation CpG sites or methylation levels.
  • the method comprises calculating a likelihood (or probability) of the subject having susceptibility, etc., to severe COVID-19, for example as a function of said methylation levels.
  • the likelihood (or probability) can alternatively be referred to as likelihood value (or probability value).
  • the likelihood (or probability) can be a value between 0 and 1.
  • a value of 1 can indicate a 100% likelihood (or probability) that the subject has susceptibility to severe COVID-19, and a value of 0 can indicate a 0% likelihood (or probability) that the subject has susceptibility to severe COVID-19.
  • the methods of the invention comprise calculating the likelihood as a function of a linear combination of said methylation levels, optionally to provide a value for (or representative of or corresponding to) the likelihood of the subject having susceptibility to severe COVID-19.
  • the linear combination of said methylation levels comprises a weighted sum of said methylation levels, optionally to provide a value for (or representative of or corresponding to) the likelihood of the subject having susceptibility to severe COVID-19.
  • the weighted sum of methylation levels can be formed by applying a predetermined weight (or coefficient) to each methylation value to provide a set of weighted methylation levels and then summing the weighted methylation levels, optionally to provide a value for (or representative of or corresponding to) the likelihood of the subject having susceptibility to severe COVID-19.
  • a weight (or coefficient) as described herein is a normalized weight (or normalized coefficient), standardized weight (or standardized coefficient), or standardized logistic regression weight (or standardized logistic regression coefficient).
  • the method of the invention comprises calculating the likelihood as a logistic function of a linear combination of said methylation levels, optionally to provide a value for (or representative of or corresponding to) the likelihood of the subject having susceptibility to severe COVID-19.
  • the method of the invention comprises performing a logistic regression method using said methylation levels, e.g. a linear combination of said methylation levels, optionally to provide a value for (or representative of or corresponding to) the likelihood of the subject having susceptibility to severe COVID-19.
  • the method of the invention comprises receiving data representative of said methylation levels, and inputting the data to an algorithm for evaluating said function to determine the likelihood of the subject having susceptibility to severe COVID-19.
  • the method comprises applying an algorithm (for example a statistical prediction algorithm) to the methylation levels, optionally in order to determine the susceptibility to severe COVID-19 status of the subject (or optionally to provide a value for (or representative of or corresponding to) the likelihood of the subject having susceptibility to severe COVID-19).
  • applying the algorithm can comprise: applying a weight (or coefficient), e.g. a predetermined weight (or coefficient), to each methylation value to provide a set of weighted methylation levels; summing the weighted methylation levels to provide a linear combination of methylation levels in the form of a weighted sum of said methylation levels; and applying a logistic function to the weighted sum, optionally to provide a value for (or representative of or corresponding to) the likelihood of the subject having susceptibility to severe COVID-19; and optionally comparing the likelihood value (or likelihood) with a cutoff value (or cutoff).
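The steps just listed (weighting, summing, applying a logistic function, and comparing with a cutoff) can be sketched as follows. This is an illustrative sketch only: the CpG identifiers and weights are made-up placeholders rather than values from the tables, and the `intercept` term is an assumption that the text does not mention.

```python
import math


def severity_likelihood(levels, weights, intercept=0.0):
    """Logistic function of a weighted sum of methylation levels."""
    missing = set(weights) - set(levels)
    if missing:
        raise KeyError(f"no methylation level for: {sorted(missing)}")
    # Weighted sum: one predetermined weight per CpG site.
    z = intercept + sum(weights[cpg] * levels[cpg] for cpg in weights)
    return 1.0 / (1.0 + math.exp(-z))


def is_susceptible(likelihood, cutoff=0.5):
    """A likelihood above the cutoff is read as indicating susceptibility."""
    return likelihood > cutoff


# Hypothetical panel of two CpG sites with made-up weights.
weights = {"cg_site_a": 2.0, "cg_site_b": -1.0}
levels = {"cg_site_a": 0.8, "cg_site_b": 0.6}
likelihood = severity_likelihood(levels, weights)
```

With these placeholder numbers the weighted sum is 1.0 and the likelihood is about 0.73, which is above a cutoff of 0.5 but below a stricter cutoff of 0.8.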
  • the weight (or coefficient), e.g. the predetermined weight (or coefficient), for each methylation value has been calculated using reference methylation levels for each CpG site, wherein the reference methylation levels have been measured (or determined or obtained) from severe COVID-19 subjects (or observations of severe COVID-19 subjects) and from subjects not having severe COVID-19 (or observations of subjects not having severe COVID-19).
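One way such weights could be derived from reference methylation levels is an ordinary logistic regression fit. The sketch below uses plain batch gradient descent on a tiny made-up reference set; the data, the learning rate, the number of epochs and the function name are all assumptions for illustration, and the text does not state which fitting procedure was actually used.

```python
import math


def fit_logistic_weights(X, y, lr=0.5, epochs=2000):
    """Fit an intercept and one weight per CpG site by batch gradient descent."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for row, label in zip(X, y):
            z = b + sum(wi * xi for wi, xi in zip(w, row))
            err = 1.0 / (1.0 + math.exp(-z)) - label  # prediction minus label
            grad_b += err
            for j, xj in enumerate(row):
                grad_w[j] += err * xj
        b -= lr * grad_b / n
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
    return w, b


def predict(row, w, b):
    """Likelihood for one subject given fitted weights."""
    z = b + sum(wi * xi for wi, xi in zip(w, row))
    return 1.0 / (1.0 + math.exp(-z))


# Made-up reference levels: rows are subjects, columns are two hypothetical CpG sites.
X = [[0.9, 0.2], [0.8, 0.3], [0.85, 0.25], [0.2, 0.8], [0.3, 0.7], [0.25, 0.9]]
y = [1, 1, 1, 0, 0, 0]  # 1 = severe COVID-19, 0 = not severe
w, b = fit_logistic_weights(X, y)
```

In this toy set the first site is more methylated in severe cases and the second less so, so the fit yields a positive weight for the first site and a negative weight for the second.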
  • the method comprises (or further comprises) comparing the likelihood or likelihood value with a cutoff or cutoff value (e.g. a predetermined cutoff value). In embodiments, the method comprises (or further comprises) comparing the likelihood value with a cutoff value (e.g. a predetermined cutoff value), wherein the likelihood value being above the cutoff value is indicative of susceptibility to severe COVID-19 in the subject, and wherein the likelihood value being below the cutoff value is indicative of the absence of susceptibility to severe COVID-19 in the subject.
  • the comparing step may be considered to result in a diagnosis, i.e. of the presence or absence of susceptibility to severe COVID-19 in the subject. Alternatively viewed, the comparing step may be considered to result in a classification of the subject as having or not having susceptibility to severe COVID-19.
  • the method comprises (or further comprises) providing a readout or result indicating the presence or absence of susceptibility to severe COVID-19 based on the comparison of the likelihood (or likelihood value) with the cutoff (or cutoff value).
  • the readout or result can be used as a diagnosis of the presence or absence of susceptibility to severe COVID-19 in the subject.
  • appropriate threshold or cut-off scores or values can be calculated by methods known in the art, for example from the ROC curve, for use in the methods of the invention.
  • Such cut-off scores or values or thresholds may be used to declare a sample positive or negative.
  • Appropriate or optimal cut-off scores or values or thresholds can be calculated depending on the desired outcome of the method, for example a cut-off score or value or threshold can be determined (or selected) to maximize the accuracy of the assay.
  • a cut-off score or value or threshold can be determined (or selected) to maximize the specificity of the assay, or the sensitivity of the assay, or both the sensitivity and the specificity of the assay (e.g.
  • a default cut-off can be used without calculation, for example a cut-off of 0.5 (in other words, a likelihood value of greater than 0.5 indicates susceptibility to severe COVID-19).
  • Appropriate cutoff values can readily be determined by a person skilled in the art as described elsewhere herein. However, exemplary cutoff values might be 0.5, 0.6, 0.7, 0.8, or 0.9.
  • threshold (cut-off) values could be used for any of the models (or algorithms) using different combinations of CpG sites described herein. Pre-determined or default cut-off values can also be used. Such threshold (cut-off) scores can then conveniently be used to assess the appropriate methylation data in subjects and to arrive at a diagnosis.
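As a rough sketch of how a cutoff might be selected, the snippet below scans the exemplary cutoff values mentioned above and keeps the one with the highest balanced accuracy on a labelled set. The scores and labels are made up, and maximizing balanced accuracy is only one possible criterion (an assumption here); accuracy, sensitivity or specificity could be maximized in the same way.

```python
def balanced_accuracy(scores, labels, cutoff):
    """Mean of sensitivity and specificity at a given cutoff."""
    pos = sum(1 for lab in labels if lab == 1)
    neg = len(labels) - pos
    if pos == 0 or neg == 0:
        raise ValueError("both classes must be present")
    tp = sum(1 for s, lab in zip(scores, labels) if lab == 1 and s > cutoff)
    tn = sum(1 for s, lab in zip(scores, labels) if lab == 0 and s <= cutoff)
    return 0.5 * (tp / pos + tn / neg)


def best_cutoff(scores, labels, candidates=(0.5, 0.6, 0.7, 0.8, 0.9)):
    """Return the candidate cutoff with the highest balanced accuracy."""
    return max(candidates, key=lambda c: balanced_accuracy(scores, labels, c))


# Made-up likelihood values for three severe (1) and three non-severe (0) subjects.
scores = [0.95, 0.85, 0.75, 0.65, 0.4, 0.3]
labels = [1, 1, 1, 0, 0, 0]
chosen = best_cutoff(scores, labels)
```

On this toy data a cutoff of 0.7 separates the two groups perfectly, whereas the default of 0.5 misclassifies one non-severe subject.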
  • the models of the invention provided herein show outstanding results (on average a balanced accuracy of approximately 0.8 for all models tested). Thus, these results show that the present invention provides a simple and accessible test to allow accurate screening of susceptibility (or the presence or absence of susceptibility) to severe COVID-19 in an individual. Good indicators of the performance of a diagnostic test are AUC, sensitivity, specificity, accuracy and balanced accuracy, especially AUC and balanced accuracy.
  • the “area under the receiver operating characteristic (ROC) curve” is a global measure of diagnostic accuracy.
  • the ROC curve is a plot of the pairs of sensitivity and specificity values for each cut-off, with 1-specificity (1 minus specificity) on the x-axis and sensitivity on the y-axis.
  • the AUC is independent of cut-off. In some instances, AUC can therefore be more informative of the quality of a diagnostic test than sensitivity or specificity.
  • an AUC of 0.5 suggests no discrimination (i.e. no ability to diagnose patients with and without the disease or condition based on the test), 0.7 to 0.8 is considered acceptable, 0.8 to 0.9 is considered excellent, and more than 0.9 is considered outstanding (Mandrekar, Journal of Thoracic Oncology, Volume 5, Number 9, September 2010).
  • the balanced accuracy value of a predictor at a given cut-off will generally be much lower than the AUC value of the predictor.
  • practitioners in the art would consider a balanced accuracy of approximately 0.6 or above, or 0.6 or above, to define that the predictor is acceptable/workable.
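The performance measures discussed above can be computed with a short, dependency-free sketch. The AUC is calculated here through its standard interpretation as the probability that a randomly chosen positive subject scores higher than a randomly chosen negative one (ties counted as one half); the scores and labels are made up for illustration.

```python
def roc_auc(scores, labels):
    """AUC as the fraction of positive/negative pairs ranked correctly."""
    pos = [s for s, lab in zip(scores, labels) if lab == 1]
    neg = [s for s, lab in zip(scores, labels) if lab == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def sensitivity_specificity(scores, labels, cutoff):
    """Sensitivity and specificity when scores above the cutoff are positive."""
    pos = [s for s, lab in zip(scores, labels) if lab == 1]
    neg = [s for s, lab in zip(scores, labels) if lab == 0]
    sensitivity = sum(1 for s in pos if s > cutoff) / len(pos)
    specificity = sum(1 for s in neg if s <= cutoff) / len(neg)
    return sensitivity, specificity


# Made-up likelihood values: three severe (1) and three non-severe (0) subjects.
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
labels = [1, 1, 1, 0, 0, 0]
auc = roc_auc(scores, labels)
sens, spec = sensitivity_specificity(scores, labels, cutoff=0.5)
balanced = 0.5 * (sens + spec)
```

In this toy example the AUC is 8/9 (about 0.89), while the balanced accuracy at a cutoff of 0.5 is 2/3, which shows how a single-cutoff measure can sit below the cutoff-independent AUC.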
  • the methods of the invention as described elsewhere herein have a specificity, sensitivity, balanced accuracy and/or AUC value of at least 0.58 (58%), 0.59 (59%), 0.6 (60%), 0.61 (61%), 0.62 (62%), 0.63 (63%), 0.64 (64%), 0.65 (65%), 0.66 (66%), 0.67 (67%), 0.68 (68%), 0.69 (69%), 0.7 (70%), 0.71 (71%), 0.72 (72%), 0.73 (73%), 0.74 (74%), 0.75 (75%), 0.76 (76%), 0.77 (77%), 0.78 (78%), 0.79 (79%), 0.8 (80%), 0.81 (81%), 0.82 (82%), 0.83 (83%), 0.84 (84%), 0.85 (85%), 0.86 (86%), 0.87 (87%), 0.88 (88%), 0.89 (89%), 0.9 (90%), 0.91 (91%), 0.92 (92%), 0.93 (93%), 0.94 (94%), 0.95 (95%) or 0.96 (96%).
  • the method of the invention comprises or further comprises making a diagnosis of susceptibility to severe COVID-19 based on the methylation levels referred to elsewhere herein and/or the likelihood referred to elsewhere herein. Alternatively viewed, the method of the invention may comprise or further comprise making a prognosis of the development of severe COVID-19.
  • the diagnosis may be made on the basis of (or based on) the methylation levels, likelihood value, or readout or result described elsewhere herein.
  • the diagnosis may be considered to be performed by the production of the readout or result itself. Said diagnosis may therefore be computer implemented, e.g. partially or entirely computer implemented, and/or performed in the absence of a clinician. Alternatively or in addition, the diagnosis may be considered to be the conclusion drawn by a clinician based on said methylation levels, likelihood value, or readout or result described elsewhere herein.
  • the method may comprise (or further comprise) delivering a diagnosis (or prognosis).
  • the diagnosis (or prognosis) may be based on data used or generated in the method, for example a readout, a result, or methylation levels as described elsewhere herein.
  • the delivering of the diagnosis (or prognosis) may be considered to be performed by the production of the readout or result itself.
  • the diagnosis (or prognosis) may be delivered in the form of a written or electronic report as described elsewhere herein, or may be delivered orally.
  • the diagnosis (or prognosis) may be delivered by a clinician, or by a processing system or computer.
  • the diagnosis (or prognosis) may be delivered to any relevant party, for example the subject being tested or an acquaintance thereof, or another clinician.
  • the method may further comprise outputting the data (e.g. readout, result, diagnosis, prognosis, or methylation levels, as the case may be) over a network connection, or displaying the data on a screen, e.g. a computer screen, or on an electronic display.
  • the subject is a subject who has COVID-19, or has been diagnosed or identified as having COVID-19.
  • the subject has been recently diagnosed or identified as having COVID-19, for example the subject may have been diagnosed or identified as having COVID-19 at most 1, 2, 3, 4, 5, 6 or 7 days, or 1, 2, 3 or 4 weeks prior to collection of a sample from the subject for analysis by the method of the invention.
  • the subject is a subject suspected of being (or believed to be) susceptible to severe COVID-19 (or at risk of, or at high or higher or increased risk of severe COVID-19).
  • the method of the invention may be used in order to affirm or further support that the subject is indeed susceptible to severe COVID-19.
  • the subject is a subject having (or known or believed to have) one or more known risk factors associated with (developing or the development of) severe COVID-19.
  • Such “at risk”, “suspected” or “susceptible” etc. subjects would be readily identified by a person skilled in the art. These subjects include for example subjects with a family history of susceptibility to (or development of) severe COVID-19, or a genetic predisposition to the development of severe COVID-19, or subjects diagnosed with one or more risk factors associated with severe COVID-19, or subjects with one or more recognized risk factors associated with severe COVID-19.
  • recognized risk factors for severe COVID-19 include male sex, obesity, old age, hypertension, diabetes, cardiovascular disease, chronic lung disease, Down's syndrome, cancer, treatment for certain types of cancer (for example chemotherapy), sickle cell disease, chronic kidney disease, severe liver disease, and immunodeficiency.
  • subject as used herein can also mean “individual”, “patient” or “person”.
  • the methods of the invention as described herein can be carried out on any type of subject which is capable of suffering from severe COVID-19 (or from being susceptible to severe COVID-19).
  • a wide variety of animals are known to be able to be infected with SARS-CoV-2 and develop COVID-19, including several mammals.
  • the methods of the invention may be carried out on a mammal, i.e. the subject may be a mammal.
  • the methods may be carried out on (i.e. the subject may be) a human, primate (e.g. monkey), laboratory mammal (e.g. mouse, rat, rabbit, or guinea pig), livestock mammal (e.g. horse, cattle, sheep, or pig), domestic pet (e.g. cat, dog, hamster, or ferret), zoo or sanctuary animal (e.g. lion, tiger, leopard), mink, or wild animal (e.g. deer, marmoset, or anteater).
  • the subject is preferably a human subject.
  • the subject may be male or female.
  • the human may, for example, be 0-10, 10-20, 20-30, 30-40, 40-50, 50-60, 60-70, 70-80, 80-90, 90-100 or above 100 years old.
  • the methods of the invention may be carried out on “healthy” patients (subjects) or at least patients (subjects) which are not manifesting any clinical symptoms of COVID-19, for example, patients with very early or pre-clinical stage COVID-19, and/or asymptomatic COVID-19.
  • the methods of the present invention can also be used to monitor disease progression. Such monitoring can take place before, during or after treatment of COVID-19 (or severe COVID-19), e.g. pharmaceutical therapy or other non-pharmaceutical interventions such as respiratory support, hospitalization, oxygen (O2) supplementation, non-invasive ventilation, ventilator support, ECMO, or physical therapy.
  • the present invention provides a method for monitoring COVID-19 (or severe COVID-19) or monitoring the progression of COVID-19 (or severe COVID-19) in a subject.
  • Methods of the present invention can be used in the active monitoring of patients who have not been subjected to therapy (pharmacological or non-pharmacological), e.g. to monitor the progress of COVID-19 in untreated patients.
  • serial measurements can allow an assessment of whether or not, or the extent to which, the COVID-19 is worsening or improving, thus, for example, allowing a more reasoned decision to be made as to whether therapeutic (pharmacological or non-pharmacological) intervention is necessary or advisable.
  • monitoring can also be carried out, for example, in an individual, e.g. a healthy individual or an individual who has tested positive for COVID-19 (for example through a lateral flow test or PCR test), who is thought to be at risk of developing severe COVID-19 or thought to be susceptible to developing severe COVID-19, in order to obtain an early, and ideally pre-clinical, indication of susceptibility to developing severe COVID-19.
  • the term “monitoring” COVID-19 as used herein can also be used to mean “monitoring the development of” or “monitoring the progression of” COVID-19 (or severe COVID-19).
  • Serial (periodical) measuring of the methylation level of one or more of the CpG sites in accordance with the present invention and as referred to elsewhere herein may also be used to monitor susceptibility to severe COVID-19, looking for either increasing or decreasing levels over time. Observation of altered levels (increase or decrease as the case may be) may also be used to guide and monitor therapy, both in the setting of subclinical disease, i.e. in the situation of "watchful waiting" before treatment (pharmacological or non-pharmacological), e.g. before initiation of pharmacological or non-pharmacological treatment, or during or after treatment to evaluate the effect of treatment and look for signs of therapy failure.
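A simple way to look for increasing or decreasing levels across serial measurements is to fit a straight line through the time points and inspect the sign of its slope. The sketch below does this with ordinary least squares; the sampling days and methylation levels are made up, and a linear trend is only one assumed way of summarizing serial data.

```python
def trend_slope(days, values):
    """Least-squares slope of a methylation level (or score) over time."""
    n = len(days)
    if n < 2 or n != len(values):
        raise ValueError("need at least two paired measurements")
    mean_t = sum(days) / n
    mean_v = sum(values) / n
    sxx = sum((t - mean_t) ** 2 for t in days)
    if sxx == 0:
        raise ValueError("measurements must be taken at different times")
    sxy = sum((t - mean_t) * (v - mean_v) for t, v in zip(days, values))
    return sxy / sxx


# Made-up serial measurements of one CpG site, sampled weekly.
days = [0, 7, 14]
levels = [0.40, 0.50, 0.60]
slope = trend_slope(days, levels)  # change in level per day
```

A positive slope indicates a rising level over the observation window and a negative slope a falling one; how either direction should be interpreted depends on the CpG site in question.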
  • the present invention also provides a method for predicting the response of a subject to therapy (pharmacological or non-pharmacological). For example, a subject with a higher likelihood (or probability) of developing severe COVID-19, as determined by the methylation level of one or more of the CpG sites in a sample in accordance with the present invention and as referred to elsewhere herein, may be more likely to be responsive to therapy (pharmacological or non-pharmacological) than a subject with lower likelihood (or probability) of developing severe COVID-19. In such methods the choice of therapy (pharmacological or non-pharmacological) may be guided by knowledge of the methylation level of one or more of the CpG sites in the sample.
  • the invention provides a method of monitoring (e.g. continuously monitoring or performing active surveillance of) a subject having COVID-19 or severe COVID-19 (e.g. a subject being treated for COVID-19).
  • Such monitoring may guide which treatment to use or whether no treatment should be given or whether treatment should be continued or whether the dose of a pharmaceutical agent should be increased or decreased, etc.
  • the methods of the invention are used in conjunction with (or subsequent to) known screening or diagnostic methods for identifying COVID-19, for example lateral flow tests or PCR tests.
  • the invention provides the use of the methods of the invention (e.g. screening or diagnostic methods, etc., as described herein) in conjunction with other known screening or diagnostic methods for identifying susceptibility to severe COVID-19, for example lateral flow tests or PCR tests.
  • the methods of the invention can be used as a follow up to confirm a diagnosis of susceptibility to severe COVID-19 in a subject.
  • the methods of the present invention are used alone.
  • the methods of the present invention can be carried out on any appropriate biological sample, e.g. any appropriate body fluid sample or tissue sample that contains DNA.
  • blood samples are a common source of DNA
  • other types of body fluid or tissue sample could be used by a skilled person to extract DNA containing the desired CpG sites, following the teaching as provided herein.
  • the sample has been obtained from (removed from) a subject (e.g. as described elsewhere herein, preferably a human subject).
  • the method further comprises a step of obtaining a sample from the subject.
  • by “obtained from the subject” it is meant that the biological sample has previously been obtained from the subject.
  • the patient or subject is not required to be present (while the methods of the invention are being performed).
  • the invention is not practised (or performed) on the human (or animal) body.
  • body fluid includes reference to all fluids derived from the body of a subject.
  • Exemplary fluids include blood (including all blood derived components, for example buffy coat, plasma, serum, etc.), saliva, urine, tears, bronchial secretions or mucus.
  • the body fluid is a circulatory fluid (especially blood or a blood component), or urine.
  • Especially preferred body fluids are blood or urine.
  • the sample is a blood sample (e.g. a plasma, serum, buffy coat or white blood cell sample).
  • the sample is a buffy coat sample or white blood cell sample.
  • the sample is a urine sample.
  • the body fluid or sample may be in the form of a liquid biopsy.
  • sample also encompasses any material derived by processing a body fluid or tissue sample (e.g. derived by processing a blood or urine sample). Processing of biological samples to obtain a test sample may involve one or more of: digestion, boiling, filtration, distillation, centrifugation, lyophilization, fractionation, extraction, concentration, dilution, purification, inactivation of interfering components, addition of reagents, derivatization, complexation and the like, e.g. as described elsewhere herein.
  • the biological sample is a blood, saliva, urine, solid tissue (for example cartilage from affected joints), or fecal sample.
  • the biological sample is a blood sample.
  • the blood sample is a buffy coat sample or a serum sample or a plasma sample.
  • the sample is a white blood cell (or leukocyte) sample, or is a sample comprising white blood cells (or leukocytes).
  • the DNA from the biological sample is genomic DNA.
  • the method additionally comprises the step of obtaining one or more biological samples from the subject.
  • one or more of the methylation levels in accordance with the present invention are detected directly in the biological sample, e.g. from within a sample of the subject’s blood, blood serum, blood plasma, buffy coat, white blood cell, or other sample.
  • DNA is first isolated and/or purified from the biological sample before the methylation levels are detected.
  • the biological sample may therefore comprise (or consist of, or be) isolated and/or purified DNA.
  • DNA may be isolated and/or purified from the biological samples by any suitable method which would be well known to a person skilled in the art. Such methods may include cell lysis; treatment with protease, RNase and/or detergent; and DNA purification by ethanol precipitation, phenol-chloroform extraction or minicolumn purification. Specific DNA extraction methods can be used depending on the biological sample in question.
  • the DNA can be extracted using the Monarch® Protocol for Extraction and Purification of Genomic DNA from Blood (NEB #T3010), or a magnetic bead-based technology such as the ChargeSwitch® gDNA Purification Kit (Thermofisher).
  • the method of the invention comprises, e.g. further comprises, reporting the results of the method, optionally and conveniently by preparing a written or electronic report.
  • the method of the invention is implemented by a computer (or is computer-implemented).
  • the method of the invention comprises, e.g. further comprises, treating said severe COVID-19 by therapy (pharmacological or non-pharmacological).
  • nirmatrelvir and ritonavir (Paxlovid)
  • sotrovimab (Xevudy)
  • remdesivir (Veklury)
  • sotrovimab is a neutralising monoclonal antibody (nMAb).
  • the therapy comprises a step of administering to the subject a therapeutically effective amount of one or more agents suitable for preventing the development of severe COVID-19, preferably an agent selected from the group consisting of nirmatrelvir, ritonavir, sotrovimab, remdesivir, and molnupiravir.
  • COVID-19 illness may be classified as severe COVID-19 where the subject requires hospitalization or if hospitalization is warranted.
  • supplemental oxygen may be administered to the subject.
  • the supplemental oxygen may be provided through non-invasive ventilation (for example using a face mask).
  • the supplemental oxygen may be provided by high-flow oxygen therapy (for example using a high-flow nasal cannula), mechanical ventilation (MV) (using a ventilator), or extracorporeal membrane oxygenation (ECMO) (using a heart and/or lung machine).
  • an antiviral or immunomodulator therapy may be administered to the subject, for example remdesivir, dexamethasone and/or tocilizumab.
  • an anticoagulation therapy may be administered to the subject, for example heparin.
  • the therapy comprises a step of administering to the subject a therapeutically effective amount of one or more agents suitable for treating severe COVID-19, preferably a suitable antiviral or immunomodulator agent (for example remdesivir, dexamethasone and/or tocilizumab) and/or a suitable anticoagulation therapy (for example heparin).
  • the method of the invention comprises, e.g. further comprises, altering, ceasing or continuing treatment of said subject.
  • the method of the invention comprises, e.g. further comprises, a step of measuring the methylation levels before the step of using the methylation levels.
  • the method of the invention comprises, e.g. further comprises, providing DNA (said DNA) from a biological sample obtained from the subject before the step of measuring the methylation levels.
  • a method of the invention comprising a first step of extracting DNA (e.g. genomic DNA) from a sample, e.g. a biological sample.
  • in a second step, the DNA methylation levels at multiple CpG sites as defined elsewhere herein are measured. Each measurement measures the extent of methylation at a particular CpG site.
  • the invention provides a computer program, software, or computer readable storage medium (e.g. a non-transitory and/or tangible computer readable storage medium), comprising instructions that, when executed by a processing system, cause the processing system to process data representative of methylation levels of at least or at most 1, 2, 3, 4, 5, 7, 10, 15, 20, 25, 30, 40, 50, 60, 70, 75, 80, 90, 100, 125, 150, 200, 250, 300, 350, 400 or 433 CpG sites selected from the 433 CpG sites listed in Table 4 in DNA from a biological sample obtained from a subject, to calculate a likelihood of the subject having susceptibility to severe COVID-19.
  • the invention provides a computer program, software, or computer readable storage medium (e.g. a non-transitory and/or tangible computer readable storage medium), comprising instructions that, when executed by a processing system, cause the processing system to process data representative of methylation levels of at least or at most: 1, 2, 3, 4, 5, 7, 10, 15, 20, 25, 30, 40, 50, 60, 70, 75, 80, 90, 100, 125, 150, 200, 250, 300, or 337 CpG sites selected from the 337 CpG sites listed in Table 5; and/or
  • the invention provides a computer program, software, or computer readable storage medium (e.g. a non-transitory and/or tangible computer readable storage medium), comprising instructions that, when executed by a processing system, cause the processing system to process data representative of methylation levels of at least or at most 1, 2, 3, 4, 5, 7, 10, 15, 20, 25, 30, 40, 50, 60, 70, 75, 80, 90, 100, 125, 150, 200, 250, or 251 CpG sites selected from the 251 CpG sites listed in Table 2 in DNA from a biological sample obtained from a subject, to calculate a likelihood of the subject having susceptibility to severe COVID-19.
  • the invention provides a computer program, software, or computer readable storage medium (e.g. a non-transitory and/or tangible computer readable storage medium), comprising instructions that, when executed by a processing system, cause the processing system to process data representative of methylation levels of at least or at most 1, 2, 3, 4, 5, 7, 10, 15, 20, 25, 30, 40, 50, 60, 70, 75, 80, 90, 100, 125, 150, or 177 CpG sites selected from the 177 CpG sites listed in Table 1 in DNA from a biological sample obtained from a subject, to calculate a likelihood of the subject having susceptibility to severe COVID-19.
  • the invention provides a computer program, software, or computer readable storage medium (e.g. a non-transitory and/or tangible computer readable storage medium), comprising instructions that, when executed by a processing system, cause the processing system to process data representative of methylation levels of at least or at most 1, 2, 3, 4, 5, 7, 10, 15, 20, 25, 30, 40, 50, 60, 70, 75, 80, 90, or 91 CpG sites selected from the 91 CpG sites listed in Table 6 in DNA from a biological sample obtained from a subject, to calculate a likelihood of the subject having susceptibility to severe COVID-19.
  • the invention provides a computer program, software, or computer readable storage medium (e.g. a non-transitory and/or tangible computer readable storage medium), comprising instructions that, when executed by a processing system, cause the processing system to process data representative of methylation levels of at least or at most 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 12, 15, 20, 25, 30, 40, 50, 60, 70, 75, 80, 90, or 97 CpG sites selected from the 97 CpG sites listed in Table 3 in DNA from a biological sample obtained from a subject, to calculate a likelihood of the subject having susceptibility to severe COVID-19.
  • the invention provides a computer program, software, or computer readable storage medium (e.g. a non-transitory and/or tangible computer readable storage medium), comprising instructions that, when executed by a processing system, cause the processing system to process data representative of methylation levels of at least or at most 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 12, 15, 20, 25, 30, or 34 CpG sites selected from the 34 CpG sites listed in Table 7 in DNA from a biological sample obtained from a subject, to calculate a likelihood of the subject having susceptibility to severe COVID-19.
  • the software or computer program may be stored on a non-transitory and/or tangible computer-readable storage medium, such as a hard-drive, a CD-ROM, a solid-state memory, etc., or may be communicated by a transitory signal such as data over a network.
  • the instructions cause the processing system to calculate the likelihood as a non-linear function of the combination of said methylation levels in accordance with the invention as described elsewhere herein.
  • the instructions cause the processing system to calculate the likelihood as a function of a linear combination of said methylation levels.
  • the linear combination of said methylation levels comprises a weighted sum of said methylation levels.
  • the instructions cause the processing system to calculate the likelihood as a logistic function of a linear combination of said methylation levels.
  • the instructions cause the processing system to receive data representative of said methylation levels and input the data to an algorithm for evaluating said function to determine the likelihood of the subject having susceptibility to severe COVID-19.
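The calculation described above (a logistic function of a weighted sum of methylation levels) can be sketched as follows. This is a minimal illustration only: the CpG identifiers, weights and intercept are hypothetical placeholders, not values taken from the Tables of CpG sites recited herein, and the inputs are assumed to be beta values between 0 and 1.

```python
import math

# Hypothetical weights and intercept for illustration only; these are
# NOT coefficients from the Tables of CpG sites recited herein.
WEIGHTS = {"cg_site_a": 1.5, "cg_site_b": -2.0, "cg_site_c": 0.5}
INTERCEPT = 0.25


def linear_combination(methylation, weights, intercept=0.0):
    """Weighted sum of methylation levels plus an optional offset."""
    missing = [site for site in weights if site not in methylation]
    if missing:
        raise KeyError(f"missing methylation levels for: {missing}")
    return intercept + sum(w * methylation[site] for site, w in weights.items())


def likelihood(methylation, weights=WEIGHTS, intercept=INTERCEPT):
    """Logistic function of the linear combination; result lies in (0, 1)."""
    z = linear_combination(methylation, weights, intercept)
    return 1.0 / (1.0 + math.exp(-z))


# Example input: methylation levels keyed by (hypothetical) CpG identifier.
sample = {"cg_site_a": 0.5, "cg_site_b": 0.5, "cg_site_c": 0.5}
score = likelihood(sample)
```

Raising an error when a CpG site is missing, rather than silently skipping it, is a design choice of this sketch; a real implementation might impute missing values instead.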
  • the computer program, software, or non-transitory (or tangible) computer readable storage medium comprises computer-readable code that, when executed by a processing system (or a computer), causes the processing system (or the computer) to perform one or more additional operations comprising: sending information corresponding to the methylation levels of the set of CpG sites in the biological sample to a tangible data storage device.
  • the using of the methylation levels comprises processing data representative of the methylation levels (or processing the methylation levels).
  • the methods disclosed herein may be fully or wholly computer-implemented methods. Alternatively, the methods disclosed herein may be partially computer-implemented methods. Any of the method steps disclosed herein may, wherever appropriate, be implemented as steps of the method, using any appropriate hardware and/or software.
  • the method (or one or more method steps, for example the step of using the methylation levels or the step of processing the methylation levels) may be carried out by a processor, computer, device, unit, module or means.
  • the step (or only the step) of using the methylation levels may be computer-implemented, and/or the step (or only the step) of processing of data representative of the methylation levels (or processing the methylation levels) may be computer-implemented.
  • the computer software disclosed herein may be on a transitory or a non-transitory computer-readable medium.
  • the diagnostic algorithm could be implemented on one or more further computer processing systems that are distinct from the computer processing system that is configured to train the model (or models) used in the invention.
  • a preferred embodiment provides a method of screening for the susceptibility of a subject to severe COVID-19, the method comprising using the methylation levels of at least: 10, 15, 20, 25, 30 or 35 CpG sites selected from the 337 CpG sites listed in Table 5; and/or 2, 3, 4, 5, 10, or 15 CpG sites selected from the 97 CpG sites listed in Table 3; in DNA from a biological sample obtained from the subject, wherein said methylation levels are used to provide an indication of the susceptibility of the subject to severe COVID-19; wherein the step of using the methylation levels is implemented by a computer; and wherein the biological sample is a blood sample.
  • Such methods may comprise using the methylation levels of at least:
  • the invention may also be provided in a fully developed software package or web-based program. For example, a user may access a webpage and upload their DNA methylation data. The program then emails the results, including the indication of the (presence or absence of) susceptibility to severe COVID-19, to the user.
  • where a “processing system” is recited, it should be understood that “computer” or “computer system” (or processor, or device, or unit, or module, or means) is also contemplated alternatively or in addition. Equally, all the terms recited in this paragraph may be used interchangeably where appropriate.
  • the invention provides a processing system configured to perform the method of the invention.
  • the invention provides a processing system (or a computer or computer system) configured to run the algorithm or software of the invention as provided elsewhere herein or configured to perform the methods of the invention.
  • the invention provides a method of screening or diagnosing, etc., susceptibility to severe COVID-19 in a subject, the method comprising calculating, optionally implemented by a computer, a likelihood (or probability) of the subject having susceptibility to severe COVID-19 using measurements of methylation levels of at least or at most:
  • the invention provides a method of monitoring susceptibility to severe COVID-19 in a subject, the method comprising:
  • the invention provides a method of obtaining an indication of the efficacy of a drug which is being used to treat severe COVID-19 in a subject, the method comprising:
  • the biological samples obtained in steps (a) and (b) should be directly comparable, e.g. the biological samples must both be of the same type (e.g. both are blood samples) and subsequently treated in the same manner.
  • the first time point may, for example, be at an early stage of the disease, or at most or exactly 1, 2, 3, 4, 5, 6 or 7 days, or 1, 2, 3 or 4 weeks after the subject first tested positive for COVID-19.
  • the second time point may be at a later stage of COVID-19 infection, or after the subject has been treated with a medicament suitable for the prevention of severe COVID-19 (for example as described herein) and/or a medicament suitable for the treatment of severe COVID-19 (for example as described herein).
  • the first and second time points may be separated by any suitable time interval, e.g. they may be at least 1, 2, 3, 4, 5, 6, or 7 days apart, or at least 1, 2, 3 or 4 weeks apart, or at least 1-12 months apart, or at least 1-5 years apart.
  • Serial (periodic) measuring of the methylation levels of one or more of the CpG sites in accordance with the present invention may also be used for disease monitoring, e.g. assessing disease severity (or likelihood of progression to severe disease), looking for either increasing or decreasing levels (or scores or likelihoods or likelihood values) over time.
  • an altering methylation level or score or likelihood (increase or decrease, as appropriate) of one or more of the CpG sites in accordance with the present invention over time e.g. in comparison to a control level or base-line or earlier level in the same subject, e.g. a level moving further away from the control level, base-line or earlier level in the same subject
  • an altering level (increase or decrease, as appropriate) of the methylation level of one or more of the CpG sites in accordance with the present invention over time may indicate an improving disease state, severity or prognosis.
  • a change in the methylation levels between the first and second time points in any aspects referred to herein is indicative of a change in the likelihood (or probability) of the development of severe COVID-19 (or in the severity of COVID-19) in the subject.
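A comparison between the first and second time points might be sketched as below. The function name and the `tolerance` parameter are hypothetical and introduced only for illustration; the source does not specify how large a change must be before it is regarded as meaningful.

```python
def likelihood_change(first, second, tolerance=0.0):
    """Compare likelihoods calculated at a first and a second time point.

    Returns the signed difference (second minus first) and a label.
    `tolerance` is an assumed dead band within which the likelihood is
    treated as unchanged.
    """
    for value in (first, second):
        if not 0.0 <= value <= 1.0:
            raise ValueError("likelihoods must lie between 0 and 1")
    delta = second - first
    if delta > tolerance:
        direction = "increased"
    elif delta < -tolerance:
        direction = "decreased"
    else:
        direction = "unchanged"
    return delta, direction
```

For example, a likelihood falling from 0.75 at the first time point to 0.5 at the second is reported as a decrease of 0.25.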
  • the invention provides a method of treating COVID-19 in a subject, the method comprising:
  • the invention provides a method of preventing severe COVID-19 in a subject, the method comprising: (a) obtaining an indication of an increased risk of severe COVID-19 in a subject (e.g. a healthy subject, or an at risk or susceptible subject) by performing a method of the present invention as described elsewhere herein; and
  • the invention provides a method of preventing severe COVID-19 in a subject, the method comprising the step of:
  • the treatment to be administered can also be an invasive treatment, e.g. as described elsewhere herein.
  • the methylation level or state of a CpG site may be detected by hybridisation to a probe (e.g. an oligonucleotide probe) and many such hybridisation protocols have been described (see e.g. Sambrook et al., Molecular cloning: A Laboratory Manual, 3rd Ed., 2001, Cold Spring Harbor Press, Cold Spring Harbor, NY).
  • the detection will involve a hybridisation step and/or an in vitro amplification step.
  • the target nucleic acid e.g. the methylated or unmethylated form of a particular CpG site, in a sample
  • the target nucleic acid may be detected by using an oligonucleotide with a label attached thereto, which can hybridize to the nucleic acid sequence of interest.
  • Such a labeled oligonucleotide will allow detection by direct means or indirect means. In other words, such an oligonucleotide may be used simply as a conventional oligonucleotide probe.
  • the signal from the label of the probe emanating from the sample may be detected.
  • the label is selected such that it is detectable only when the probe is hybridized to its target.
  • the probe may have a nucleic acid sequence complementary to the sequence of the CpG site of interest or a derivative thereof.
  • the probe may be complementary to the CpG site (i.e. the dinucleotide “CG” sequence) and certain adjacent residues.
  • the probe may alternatively be complementary to a derivative of the CpG site and certain immediately adjacent residues, for example 10, 20, 30, 40, 50 or 60 immediately adjacent residues.
  • the immediately adjacent residues may for example be residues to the 3’ side of (downstream of) the CG dinucleotide. This probe design format is known in the art.
  • CpG methylation can be detected using two different types of probe as used in the Illumina Infinium I Methylation Assay.
  • the probes may each be linked to a solid support, for example a bead.
  • the first probe type (named the U type in the Infinium I assay) has the sequence “CA” at its 3’ end, and thus is complementary to the sequence of an unmethylated CpG site which has been bisulfite treated (i.e. to “UG”) and subsequently amplified (i.e. to “TG”).
  • the second probe type (named the M type in the Infinium I assay) has the sequence “CG” at its 3’ end, and thus is complementary to the sequence of a methylated CpG site, whether bisulfite-treated or not (i.e. “CG”).
  • the probes may be complementary to said CpG sites and certain immediately adjacent residues (or a derivative of said sequence).
  • the immediately adjacent residues may for example be residues to the 3’ side of (downstream of) the CG dinucleotide. Annealing of complementary probes to their target sites enables single-nucleotide (or single-base) extension.
  • the nucleotide incorporated in the single-nucleotide extension may be labelled with an appropriate fluorophore (which indicates methylation or non-methylation), and the fluorescent signal may be detected using an imaging apparatus, for example Illumina iScan.
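The pairing between the two probe 3’ ends and the amplified CpG dinucleotide can be checked with a short sketch. The function names and the labels returned are hypothetical; only the sequences “CA”, “CG” and “TG” come from the description above.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def matching_probe_type(amplified_dinucleotide):
    """Return which probe 3' end pairs with the amplified CpG dinucleotide.

    "TG" arises from an unmethylated site after bisulfite treatment and
    amplification; "CG" arises from a methylated site. Any other input
    matches neither probe and yields None.
    """
    probe_3prime_ends = {
        "CA": "unmethylated-specific",
        "CG": "methylated-specific",
    }
    return probe_3prime_ends.get(reverse_complement(amplified_dinucleotide))
```

Because “CG” is its own reverse complement, the methylated-specific probe ends in “CG”, while the unmethylated-specific probe ends in “CA”, the reverse complement of “TG”.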
  • CpG methylation may be detected using a single type of probe as used in the Infinium II Methylation Assay.
  • the probe may have at its 3’ end a cytosine residue suitable for hybridizing to the guanine of the “CG” sequence.
  • the probe may be complementary to said guanine and certain immediately adjacent residues (or a derivative of said sequence).
  • the immediately adjacent residues may for example be residues to the 3’ side of (downstream of) the CG dinucleotide.
  • the probe may be linked to a solid support, for example a bead. The probe can therefore target or hybridize to the CpG site irrespective of the sequence of the CpG site after bisulfite treatment.
  • single-base extension is conducted to identify the second nucleotide of the bisulfite-treated CpG site, and thus whether the CpG site was methylated or unmethylated.
  • the nucleotide incorporated in the single-nucleotide extension may be labeled with an appropriate fluorophore (which indicates methylation or non-methylation), and the fluorescent signal may be detected using an imaging apparatus, for example Illumina iScan.
  • the probe or probes can be designed to be targeted to the sequence of the sense strand of the CpG site, or the antisense strand of the CpG site.
  • the probe or probes can be designed to be targeted to the sequence of the strand on which methylation occurs.
  • probes for use in accordance with the invention may be targeted towards (or complementary to) the sense strand of a CpG site or the antisense strand of a CpG site.
  • the term “probe” refers to an oligonucleotide capable of binding in a base-specific manner to a complementary strand of nucleic acid.
  • the term “probe” as used herein can also refer to a surface-immobilized molecule that can be recognized by a particular target as well as molecules that are not immobilized and are coupled to a detectable label.
  • the terms “probe” and “primer” can be used interchangeably herein.
  • the probe is conveniently a nucleic acid probe and thus can be a DNA or RNA oligonucleotide, typically a DNA oligonucleotide.
  • the probe may be for example 10, 20, 30, 40, 50, 60 or 70 nucleotides in length.
  • the probes of the invention may be suitable for use in a PCR method (or PCR-based method).
  • the terms “PCR primer” or “PCR kit” can be used to refer to a primer or kit which is suitable for use in a PCR protocol or is suitable for performing PCR.
  • the methylation levels have been obtained using a PCR-based method (or using PCR).
  • this is an active step of the method, hence in embodiments the method of the invention comprises obtaining the methylation levels by a PCR-based method (or by PCR) prior to using the methylation levels.
  • MS-LAMP, OpenArray and HRM as mentioned above are preferred PCR methods in respect of the present invention.
  • LAMP provides a simple workflow for detecting methylated CpG dinucleotides in synthetic and genomic DNA samples using methylation-sensitive restriction enzyme digestion followed by loop-mediated isothermal amplification.
  • OpenArray technology is one of the most high-throughput qPCR platforms, which uses a microscope slide-sized plate with through-holes which retain reaction mixtures via surface tension.
  • Methylation-sensitive high-resolution melting (MS-HRM) is based on the comparison of the melting profiles of PCR products from unknown samples with profiles specific for PCR products derived from methylated and unmethylated control DNAs.
  • the protocol consists of PCR amplification of bisulfite-modified DNA with primers designed to proportionally amplify both methylated and unmethylated templates and subsequent high-resolution melting analysis of the PCR product.
  • MS-HRM allows in-tube determination of the methylation status of the locus of interest following sodium bisulfite modification of template DNA during a short time period.
  • the terms “complementary” or “targeted” as used herein can refer to the hybridization or base pairing between nucleotides or nucleic acids (e.g. between probes), such as, for instance, between the two strands of a double stranded DNA molecule or between an oligonucleotide primer or probe and a primer or probe binding site on a single stranded nucleic acid to be sequenced or amplified.
  • the target CpG site in a sample may be detected or identified by using an oligonucleotide probe which is labeled only when hybridized to its target sequence, i.e. the probe may be selectively labeled.
  • selective labeling may be achieved using labeled nucleotides, i.e. by incorporation into the oligonucleotide probe of a nucleotide carrying a label.
  • selective labeling may occur by chain extension of the oligonucleotide probe using a polymerase enzyme which incorporates a labeled nucleotide, preferably a labeled dideoxynucleotide (e.g.
  • This approach to the detection of specific nucleotide sequences is sometimes referred to as primer extension analysis.
  • Suitable primer extension analysis techniques are well known to the skilled person, e.g. those techniques disclosed in WO99/50448, the contents of which are incorporated herein by reference.
  • Fluorescent reporter probes used in qPCR may be sequence specific oligonucleotides, typically RNA or DNA, that have a fluorescent reporter molecule at one end and a quencher molecule at the other (e.g. the reporter molecule is at the 5' end and a quencher molecule at the 3' end or vice versa).
  • the probe is designed so that the reporter is quenched by the quencher.
  • the probe is also designed to hybridize selectively to particular regions of complementary sequence which might be in the template. If these regions are between the annealed PCR primers the polymerase, if it has exonuclease activity, will degrade (depolymerise) the bound probe as it extends the nascent nucleic acid chain it is polymerising. This will relieve the quenching and fluorescence will rise. Accordingly, by measuring fluorescence after every PCR cycle, the relative amount of amplification product can be monitored in real time. Through the use of internal standard and controls, this information can be translated into
  • the amplification product may be detected, and amounts (levels) of the amplification product can be determined by any convenient means.
  • a vast number of techniques are routinely employed as standard laboratory techniques and the literature has descriptions of more specialized approaches.
  • the amplification product may be detected by visual inspection of the reaction mixture at the end of the reaction or at a desired time point.
  • the amplification product will be resolved with the aid of a label that may be preferentially bound to the amplification product.
  • a dye substance e.g. a colorimetric, chromomeric fluorescent or luminescent dye (for instance ethidium bromide or SYBR green) is used.
  • a labeled oligonucleotide probe that preferentially binds the amplification product is used.
  • the relative abundance of the methylated or unmethylated CpG site in association with (e.g. physical association with or in complex with) the probe is determined.
  • the level of a complex of the methylated or unmethylated CpG site and the probe used to detect the methylated or unmethylated CpG site is determined.
  • the level of a methylated or unmethylated CpG site in association with (e.g. in complex with) a primer (or extended primer) or probe (e.g. fluorescent reporter probe) or dye or the like may be determined.
  • DNA methylation of the CpG sites can be measured using various approaches, which range from commercial array platforms (e.g. from Illumina™) to sequencing approaches of individual genes. This includes standard lab techniques or array platforms. For a review of some methylation detection methods, see, Oakeley, E. J., Pharmacology & Therapeutics 84:389-400 (1999).
  • methylation-sensitive sequencing e.g. using an Illumina microarray such as an Illumina 450k array or Illumina Infinium Methylation EPIC Kit
  • a PCR-based method e.g. using an Illumina microarray such as an Illumina 450k array or Illumina Infinium Methylation EPIC Kit
  • HRM (high resolution melting), OpenArray or LAMP
  • reverse-phase HPLC
  • thin-layer chromatography
  • SssI methyltransferases with incorporation of labeled methyl groups
  • the chloracetaldehyde reaction
  • differentially sensitive restriction enzymes
  • hydrazine or permanganate treatment (m5C is cleaved by permanganate treatment but not by hydrazine treatment)
  • combined bisulphite-restriction analysis
  • methylation sensitive single nucleotide probe extension, methylation-sensitive single-strand conformation analysis (MS-SSCA), methylation-sensitive single-nucleotide primer extension (MS-SnuPE), base-specific
  • measuring a methylation level can comprise performing array-based PCR (e.g., digital PCR), targeted multiplex PCR, or direct sequencing without bisulfite treatment (e.g., via a nanopore technology).
  • determining methylation status comprises methylation specific PCR, real-time methylation specific PCR, quantitative methylation specific PCR (QMSP), or bisulfite sequencing.
  • a method according to the embodiments comprises treating DNA in or from a sample with bisulfite (e.g., sodium bisulfite) to convert unmethylated cytosines of CpG dinucleotides to uracil.
  • the following can also be used to measure DNA methylation levels: a) Molecular break light assay for DNA adenine methyltransferase activity is an assay that is based on the specificity of the restriction enzyme DpnI for fully methylated (adenine methylation) GATC sites in an oligonucleotide labeled with a fluorophore and quencher. The adenine methyltransferase methylates the oligonucleotide making it a substrate for DpnI. Cutting of the oligonucleotide by DpnI gives rise to a fluorescence increase.
  • Methylation-Specific Polymerase Chain Reaction (PCR) is based on a chemical reaction of sodium bisulfite with DNA that converts unmethylated cytosines of CpG dinucleotides to uracil or UpG, followed by traditional PCR.
  • methylated cytosines will not be converted in this process, and thus probes are designed to overlap the CpG site of interest, which allows one to determine methylation status as methylated or unmethylated.
  • the beta value can be calculated as the proportion of methylation.
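A beta value is commonly computed from the methylated and unmethylated signal intensities of a CpG site. The sketch below assumes that convention, including a stabilising offset in the denominator (100 is a widely used default for array data); the function name and arguments are hypothetical, and the source does not state which formula was used.

```python
def beta_value(methylated, unmethylated, offset=100.0):
    """Proportion of methylation from two signal intensities.

    Negative intensities (possible after background correction) are
    clamped to zero. `offset` is an assumed stabilising constant that
    keeps the ratio well behaved when both intensities are low.
    """
    m = max(methylated, 0.0)
    u = max(unmethylated, 0.0)
    denominator = m + u + offset
    if denominator <= 0:
        raise ValueError("intensities and offset sum to zero")
    return m / denominator
```

With the offset set to zero, intensities of 300 (methylated) and 100 (unmethylated) give a beta value of 0.75.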
  • Whole genome bisulfite sequencing, also known as BS-Seq, is a genome-wide analysis of DNA methylation.
  • ChIP-on-chip assay is based on the ability of commercially prepared antibodies to bind to DNA methylation-associated proteins like MeCP2.
  • Restriction landmark genomic scanning is a complicated and now rarely used assay based upon restriction enzymes' differential recognition of methylated and unmethylated CpG sites. This assay is similar in concept to the HELP assay.
  • Methylated DNA immunoprecipitation (MeDIP) is analogous to chromatin immunoprecipitation.
  • Immunoprecipitation is used to isolate methylated DNA fragments for input into DNA detection methods such as DNA microarrays (MeDIP-chip) or DNA sequencing (MeDIP-seq).
  • Pyrosequencing of bisulfite treated DNA is sequencing of an amplicon made by a normal forward primer (or probe) but a biotinylated reverse primer (or probe) to PCR the gene of choice.
  • the Pyrosequencer analyses the sample by denaturing the DNA and adding one nucleotide at a time to the mix according to a sequence given by the user. If there is a mismatch, it is recorded and the percentage of DNA for which the mismatch is present is noted. This gives the user a percentage methylation per CpG island.
  • the DNA e.g. genomic DNA
  • a complementary sequence e.g. a synthetic polynucleotide sequence
  • a matrix e.g. one disposed within a microarray
  • the sample may be amplified by a variety of mechanisms, some of which may employ PCR. See, for example, PCR Technology: Principles and Applications for DNA Amplification (Ed. H. A. Erlich, Freeman Press, NY, N.Y., 1992); PCR Protocols: A Guide to Methods and Applications (Eds.
  • methylation levels can be compared to an indication of (the presence or absence of) susceptibility to severe COVID-19, e.g. a weighted sum of the methylation levels can be applied to a logistic function as described herein.
  • diagnostic prediction models are contemplated for use with specific DNA samples (e.g. genomic DNA samples) and/or specific analysis techniques and/or specific individual populations.
  • a logistic regression model may predict (the presence or absence of) susceptibility to severe COVID-19 based on a weighted sum of the methylation levels optionally plus an offset (or regression intercept). To identify the weights for the weighted sum, one can use the regression coefficients of a regression model.
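The weighted-sum-plus-offset prediction described above can be sketched as follows. This is a minimal illustration only: the CpG identifiers, weights and intercept are hypothetical placeholders, not values from the Tables, and the 0.5 cutoff is the one mentioned elsewhere herein for severity classification.

```python
import math

def severity_probability(methylation, weights, intercept=0.0):
    """Apply a logistic function to a weighted sum of methylation levels.

    methylation: dict mapping CpG site id -> methylation level (e.g. beta value)
    weights:     dict mapping CpG site id -> regression coefficient
    intercept:   regression intercept (offset)
    """
    missing = [site for site in weights if site not in methylation]
    if missing:
        raise KeyError(f"missing methylation levels for: {missing}")
    linear = intercept + sum(w * methylation[site] for site, w in weights.items())
    return 1.0 / (1.0 + math.exp(-linear))

# Hypothetical site names and coefficients, for illustration only.
weights = {"cg_site_a": 2.0, "cg_site_b": -1.5}
sample = {"cg_site_a": 0.80, "cg_site_b": 0.40}

p = severity_probability(sample, weights, intercept=-0.5)
susceptible = p >= 0.5  # cutoff of 0.5 assumed for the binary call
```

A different set of coefficients (for example for a different assay or population) can be supplied without changing the function.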
  • coefficient values can be tailored to the subject being analyzed. For example, if a model is applied to male subjects only, then one set of coefficients can be used. Alternatively, if a model is applied exclusively to obese subjects, another set of coefficients can be used. Alternatively, coefficients can be fixed, for example, when a model is broadly applied to a heterogeneous group of subjects, e.g. the selection of weights provided in Tables of CpG sites recited herein.
  • Coefficient values (weights) in various models can also reflect the specific assay that is used to measure the methylation levels. Different machines may give different methylation values, which are closer or farther away from the true methylation values. The coefficients may change when the model is re-trained for another machine. For example, for beta values measured on Illumina™ methylation microarray platforms there can be one set of coefficients (weights), while for other methylation measures (e.g. using sequencing technology) there can be another set of coefficients (weights) etc. Other values may also be used instead, such as M values (transformed versions of beta values).
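As an illustration of the relationship between beta values and M values, the following sketch computes both from methylated and unmethylated probe intensities. The offset of 100 in the beta value and the clipping constant are assumptions based on common microarray practice, not values stated in this document.

```python
import math

def beta_value(methylated, unmethylated, offset=100.0):
    """Beta value: fraction of signal from the methylated probe (0 to 1).

    The offset stabilises the ratio at low intensities; 100 is an assumed,
    commonly used default.
    """
    return methylated / (methylated + unmethylated + offset)

def m_value(beta, eps=1e-6):
    """M value: log2 of the ratio beta / (1 - beta).

    beta is clipped to [eps, 1 - eps] (an assumed safeguard) to avoid
    taking the logarithm of zero or dividing by zero.
    """
    beta = min(max(beta, eps), 1.0 - eps)
    return math.log2(beta / (1.0 - beta))

# Hypothetical intensities, for illustration only.
b = beta_value(methylated=900.0, unmethylated=0.0)
m = m_value(0.8)
```

Because the two scales differ, coefficients fitted on beta values are not interchangeable with coefficients fitted on M values.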
  • the methylation levels measured by the technique are preferably measured using an Illumina 450k array or Illumina Infinium Methylation EPIC Kit, or an array of similar quality.
  • embodiments of the invention can include a variety of art accepted technical processes.
  • a bisulfite conversion process is performed so that cytosine residues in the DNA (e.g. genomic DNA) are transformed to uracil, while 5-methylcytosine residues in the DNA (e.g. genomic DNA) are not transformed to uracil.
  • Kits for DNA bisulfite modification are commercially available from, for example, Methyl Easy™ (Human Genetic Signatures™) and CpGenome™ Modification Kit (Chemicon™). See also WO04096825A1, which describes bisulfite modification methods and Olek et al. Nuc.
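The effect of bisulfite conversion on a sequence can be sketched in silico as follows. This is a simplified single-strand illustration; the function name and the representation of methylated positions as 0-based indices are assumptions made for the example.

```python
def bisulfite_convert(sequence, methylated_positions=frozenset()):
    """Return the sequence after idealised, complete bisulfite conversion.

    Unmethylated cytosines become uracil (U); cytosines whose 0-based index
    is in methylated_positions (5-methylcytosine) remain cytosine.
    """
    converted = []
    for index, base in enumerate(sequence.upper()):
        if base == "C" and index not in methylated_positions:
            converted.append("U")
        else:
            converted.append(base)
    return "".join(converted)

# Hypothetical sequence: the CpG cytosine at index 1 is methylated,
# the CpG cytosine at index 4 is not.
result = bisulfite_convert("ACGTCG", methylated_positions={1})
```

After conversion, the methylation status of a cytosine can be read as a sequence difference (C versus U, or T after amplification), which is why SNP-style detection methods are applicable.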
  • Bisulfite treatment allows the methylation status of cytosines to be detected by a variety of methods.
  • any method that may be used to detect a SNP may be used, for example, see Syvanen, Nature Rev. Gen. 2:930-942 (2001).
  • Methods such as single base extension (SBE) may be used or hybridization of sequence specific probes similar to allele specific hybridization methods.
  • SBE single base extension
  • MIP Molecular Inversion Probe
  • the invention provides a kit for screening for susceptibility to severe COVID-19 in a subject, said kit comprising: (i) probes (or other appropriate entities); (ii) an array of probes (or other appropriate entities); or (iii) a solid support (e.g. a chip) comprising probes (or other appropriate entities); for (or capable of) detecting (or measuring) the methylation levels of at least or at most 1, 2, 3, 4, 5, 7, 10, 15, 20, 25, 30, 40, 50, 60, 70, 75, 80, 90, 100, 125, 150, 200, 250, 300, 350, 400 or 433 CpG sites selected from the 433 CpG sites listed in Table 4 in DNA from a biological sample obtained from the subject.
  • a solid support e.g. a chip
  • the invention provides a kit for screening for susceptibility to severe COVID-19 in a subject, said kit comprising: (i) probes (or other appropriate entities); (ii) an array of probes (or other appropriate entities); or (iii) a solid support (e.g. a chip) comprising probes (or other appropriate entities); for (or capable of) detecting (or measuring) the methylation levels of at least or at most:
  • the invention provides a kit for screening for susceptibility to severe COVID-19 in a subject, said kit comprising: (i) probes (or other appropriate entities); (ii) an array of probes (or other appropriate entities); or (iii) a solid support (e.g. a chip) comprising probes (or other appropriate entities); for detecting (or measuring) the methylation levels of at least or at most:
  • the invention provides a kit for screening for susceptibility to severe COVID-19 in a subject, said kit comprising: (i) probes (or other appropriate entities); (ii) an array of probes (or other appropriate entities); or (iii) a solid support (e.g. a chip) comprising probes (or other appropriate entities); for detecting (or measuring) the methylation levels of at least or at most 1, 2, 3, 4, 5, 7, 10, 15, 20, 25, 30, 40, 50, 60, 70, 75, 80, 90, 100, 125, 150, 200, 250, or 251 CpG sites selected from the 251 CpG sites listed in Table 2 in DNA from a biological sample obtained from the subject.
  • the invention provides a kit for screening for susceptibility to severe COVID-19 in a subject, said kit comprising: (i) probes (or other appropriate entities); (ii) an array of probes (or other appropriate entities); or (iii) a solid support (e.g. a chip) comprising probes (or other appropriate entities); for detecting (or measuring) the methylation levels of at least or at most 1, 2, 3, 4, 5, 7, 10, 15, 20, 25, 30, 40, 50, 60, 70, 75, 80, 90, 100, 125, 150, or 177 CpG sites selected from the 177 CpG sites listed in Table 1 in DNA from a biological sample obtained from the subject.
  • the invention provides a kit for screening for susceptibility to severe COVID-19 in a subject, said kit comprising: (i) probes (or other appropriate entities); (ii) an array of probes (or other appropriate entities); or (iii) a solid support (e.g. a chip) comprising probes (or other appropriate entities); for detecting (or measuring) the methylation levels of at least or at most 1, 2, 3, 4, 5, 7, 10, 15, 20, 25, 30, 40, 50, 60, 70, 75, 80, 90, or 91 CpG sites selected from the 91 CpG sites listed in Table 6 in DNA from a biological sample obtained from the subject.
  • the invention provides a kit for screening for susceptibility to severe COVID-19 in a subject, said kit comprising: (i) probes (or other appropriate entities); (ii) an array of probes (or other appropriate entities); or (iii) a solid support (e.g. a chip) comprising probes (or other appropriate entities); for detecting (or measuring) the methylation levels of at least or at most 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 12, 15, 20, 25, 30, 40, 50, 60, 70, 75, 80, 90, or 97 CpG sites selected from the 97 CpG sites listed in Table 3 in DNA from a biological sample obtained from the subject.
  • the invention provides a kit for screening for susceptibility to severe COVID-19 in a subject, said kit comprising: (i) probes (or other appropriate entities); (ii) an array of probes (or other appropriate entities); or (iii) a solid support (e.g. a chip) comprising probes (or other appropriate entities); for detecting (or measuring) the methylation levels of at least or at most 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 12, 15, 20, 25, 30, or 34 CpG sites selected from the 34 CpG sites listed in Table 7 in DNA from a biological sample obtained from the subject.
  • the kit is used to determine (or the kit is suitable for determining) whether or not a subject has susceptibility to severe COVID-19 by utilizing measurements of methylation levels at specific CpG sites in cells derived from the biological sample, for example blood or saliva.
  • Microfluidics devices can be applied to easily accessible tissues/fluids such as blood, buccal cells, or saliva.
  • the kit comprises a plurality of probes for amplifying DNA sequences (e.g. genomic DNA sequences) of the CpG sites (or bisulfite-treated forms of the CpG sites) in accordance with the invention as described elsewhere herein.
  • the kit comprises bisulfite or sodium bisulfite.
  • kits as described above are for obtaining information useful to determine susceptibility (or the presence or absence of susceptibility) to severe COVID-19 in a subject, the kit comprising a plurality of probes (or other appropriate entities) specific for (or specifically targeted to) at least or at most the numbers and selections of CpG sites listed in any one of Tables 1 to 7 as described above, in DNA from a biological sample obtained from the subject.
  • the probes are for detecting (or measuring) the methylation levels of at least or at most 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34,
  • the probes are for detecting (or measuring) the methylation levels of CpG site numbers: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34,
  • the kit is (or comprises) an array or microarray, or is in the form of an array or microarray.
  • “array” or “microarray” as used herein refers to an intentionally created collection of molecules (e.g. probes or other appropriate entities) which can be prepared either synthetically or biosynthetically (e.g. Illumina™ HumanMethylation27 microarrays).
  • the array can assume a variety of formats, for example, libraries of probes for targeting the desired CpG site sequences; or libraries of probes for targeting the desired CpG site sequences tethered to resin beads, silica chips, or other solid supports.
  • DNA methylation microarrays commonly comprise tethered nucleic acid probes, for example the Illumina Infinium® HumanMethylation450 BeadChip.
  • kits of the invention as described herein are specifically designed for the detection (or measurement) of the CpG sites of the invention as described elsewhere herein.
  • said kits are for use in, or in accordance with, the methods of the invention as described elsewhere herein.
  • the probe (or other appropriate entity) component of said kits generally comprises (or consists of) a relatively small subset of probes (or other appropriate entities), e.g. a subset of probes for detecting (or measuring) at least or at most 1, 2, 3, 4, 5, 7, 10, 15, 20, 25, 30, 40, 50, 60, 70, 75, 80, 90, 100, 125, 150, 200, 250, 300, 350, 400 or 433 CpG sites selected from the 433 CpG sites listed in Table 4, but generally not exceeding more than or up to 433, 440, 450, 460, 470, 480, 490, or 500 different probes in total.
  • the present invention provides a kit for screening for susceptibility to severe COVID-19 in a subject, said kit comprising probes for detecting the methylation levels of at least or at most 1, 2, 3, 4, 5, 7, 10, 15, 20, 25, 30, 40, 50, 60, 70, 75, 80, 90, 100, 125, 150, 200, 250, 300, 350, 400 or 433 CpG sites selected from the 433 CpG sites listed in Table 4 in DNA from a biological sample obtained from the subject, optionally wherein the probe (or CpG probe) component of the kit consists of up to 433, 440, 450, 460, 470, 480, 490, or 500 different probes or consists of probes for detecting the methylation levels of up to 433, 440, 450, 460, 470, 480, 490, or 500 CpG sites.
  • kits may for example only comprise probes (or other appropriate entities) for detecting (or measuring) up to the 433 CpG sites of Table 4. In other words, no other probes (or no forms or copies of the other entities) for detecting (or measuring) other CpG sites are present in these examples.
  • the probe (or other appropriate entity) component of said kits generally comprises (or consists of) a relatively small subset of probes (or other appropriate entities), e.g. a subset of probes for detecting (or measuring) at least or at most:
  • the present invention provides a kit for screening for susceptibility to severe COVID-19 in a subject, said kit comprising probes for detecting the methylation levels of at least or at most:
  • the probe (or CpG probe) component of the kit consists of up to 433, 440, 450, 460, 470, 480, 490, or 500 different probes or consists of probes for detecting the methylation levels of up to 433, 440, 450, 460, 470, 480, 490, or 500 CpG sites.
  • kits may for example only comprise probes (or other appropriate entities) for detecting (or measuring) up to the 337 CpG sites listed in Table 5, and/or the 97 CpG sites listed in Table 3. In other words, no other probes (or no forms or copies of the other entities) for detecting (or measuring) other CpG sites are present in these examples.
  • the probe (or other appropriate entity) component of said kits generally comprises (or consists of) a relatively small subset of probes (or other appropriate entities), e.g. a subset of probes for detecting (or measuring) at least or at most:
  • the present invention provides a kit for screening for susceptibility to severe COVID-19 in a subject, said kit comprising probes for detecting the methylation levels of at least or at most:
  • the probe (or CpG probe) component of the kit consists of up to 337, 340, 350, 360, 370, 380, 390, 400, 410, 420, 430, 440, 450, 460, 470, 480, 490, or 500 different probes or consists of probes for detecting the methylation levels of up to 337, 340, 350, 360, 370, 380, 390, 400, 410, 420, 430, 440, 450, 460, 470, 480, 490, or 500 CpG sites.
  • kits may for example only comprise probes (or other appropriate entities) for detecting (or measuring) up to the 337 CpG sites listed in Table 5. In other words, no other probes (or no forms or copies of the other entities) for detecting (or measuring) other CpG sites are present in these examples.
  • the probe (or other appropriate entity) component of said kits generally comprises (or consists of) a relatively small subset of probes (or other appropriate entities), e.g. a subset of probes for detecting (or measuring) at least or at most 1, 2, 3, 4, 5, 7, 10, 15, 20, 25, 30, 40, 50, 60, 70, 75, 80, 90, 100, 125, 150, 200, 250, or 251 CpG sites selected from the 251 CpG sites listed in Table 2, but generally not exceeding more than or up to 251, 260, 270, 280, 290, 300, 310, 320, 330, 340, 350, 360, 370, 380, 390, 400, 410, 420, 430, 440, 450, 460, 470, 480, 490, or 500 different probes in total.
  • the present invention provides a kit for screening for susceptibility to severe COVID-19 in a subject, said kit comprising probes for detecting the methylation levels of at least or at most 1, 2, 3, 4, 5, 7, 10, 15, 20, 25, 30, 40, 50, 60, 70, 75, 80, 90, 100, 125, 150, 200, 250, or 251 CpG sites selected from the 251 CpG sites listed in Table 2 in DNA from a biological sample obtained from the subject, optionally wherein the probe (or CpG probe) component of the kit consists of up to 251, 260, 270, 280, 290, 300, 310, 320, 330, 340, 350, 360, 370, 380, 390, 400, 410, 420, 430, 440, 450, 460, 470, 480, 490, or 500 different probes or consists of probes for detecting the methylation levels of up to 251, 260, 270, 280, 290, 300, 310, 320
  • kits may for example only comprise probes (or other appropriate entities) for detecting (or measuring) up to the 251 CpG sites of Table 2. In other words, no other probes (or no forms or copies of the other entities) for detecting (or measuring) other CpG sites are present in these examples.
  • the probe (or other appropriate entity) component of said kits generally comprises (or consists of) a relatively small subset of probes (or other appropriate entities), e.g. a subset of probes for detecting (or measuring) at least or at most 1, 2, 3, 4, 5, 7, 10, 15, 20, 25, 30, 40, 50, 60, 70, 75, 80, 90, 100, 125, 150, or 177 CpG sites selected from the 177 CpG sites listed in Table 1, but generally not exceeding more than or up to 177, 180, 190, 200, 210, 220, 230, 240, 250, 260, 270, 280, 290, 300, 310, 320, 330, 340, 350, 360, 370, 380, 390, 400, 410, 420, 430, 440, 450, 460, 470, 480, 490, or 500 different probes in total.
  • the present invention provides a kit for screening for susceptibility to severe COVID-19 in a subject, said kit comprising probes for detecting the methylation levels of at least or at most 1, 2, 3, 4, 5, 7, 10, 15, 20, 25, 30, 40, 50, 60, 70, 75, 80, 90, 100, 125, 150, or 177 CpG sites selected from the 177 CpG sites listed in Table 1 in DNA from a biological sample obtained from the subject, optionally wherein the probe (or CpG probe) component of the kit consists of up to 177, 180, 190, 200, 210, 220, 230, 240, 250, 260, 270, 280, 290, 300, 310, 320, 330, 340, 350, 360, 370, 380, 390, 400, 410, 420, 430, 440, 450, 460, 470, 480, 490, or 500 different probes or consists of probes for detecting the methylation levels of up to 177, 180,
  • kits may for example only comprise probes (or other appropriate entities) for detecting (or measuring) up to the 177 CpG sites of Table 1. In other words, no other probes (or no forms or copies of the other entities) for detecting (or measuring) other CpG sites are present in these examples.
  • the probe (or other appropriate entity) component of said kits generally comprises (or consists of) a relatively small subset of probes (or other appropriate entities), e.g. a subset of probes for detecting (or measuring) at least or at most 1, 2, 3, 4, 5, 7, 10, 15, 20, 25, 30, 40, 50, 60, 70, 75, 80, 90, or 91 CpG sites selected from the 91 CpG sites listed in Table 6, but generally not exceeding more than or up to 91, 100, 110, 120, 130, 140, 150, 160, 170, 180, 190, 200, 210, 220, 230, 240, 250, 260, 270, 280, 290, 300, 310, 320, 330, 340, 350, 360, 370, 380, 390, 400, 410, 420, 430, 440, 450, 460, 470, 480, 490, or 500 different probes in total.
  • the present invention provides a kit for screening for susceptibility to severe COVID-19 in a subject, said kit comprising probes for detecting the methylation levels of at least or at most 1, 2, 3, 4, 5, 7, 10, 15, 20, 25, 30, 40, 50, 60, 70, 75, 80, 90, or 91 CpG sites selected from the 91 CpG sites listed in Table 6 in DNA from a biological sample obtained from the subject, optionally wherein the probe (or CpG probe) component of the kit consists of up to 91, 100, 110, 120, 130, 140, 150, 160, 170, 180, 190, 200, 210, 220, 230, 240, 250, 260, 270, 280, 290, 300, 310, 320, 330, 340, 350, 360, 370, 380, 390, 400, 410, 420, 430, 440, 450, 460, 470, 480, 490, or 500 different probes or consists of probes for detecting the methylation
  • kits may for example only comprise probes (or other appropriate entities) for detecting (or measuring) up to the 91 CpG sites listed in Table 6. In other words, no other probes (or no forms or copies of the other entities) for detecting (or measuring) other CpG sites are present in these examples.
  • the probe (or other appropriate entity) component of said kits generally comprises (or consists of) a relatively small subset of probes (or other appropriate entities), e.g. a subset of probes for detecting (or measuring) at least or at most 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 12, 15, 20, 25, 30, 40, 50, 60, 70, 75, 80, 90, or 97 CpG sites selected from the 97 CpG sites listed in Table 3, but generally not exceeding more than or up to 97, 100, 110, 120, 130, 140, 150, 160, 170, 180, 190, 200, 300, 400 or 500 different probes in total.
  • the present invention provides a kit for screening for susceptibility to severe COVID-19 in a subject, said kit comprising probes for detecting the methylation levels of at least or at most 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 12, 15, 20, 25, 30, 40, 50, 60, 70, 75, 80, 90, or 97 CpG sites selected from the 97 CpG sites listed in Table 3 in DNA from a biological sample obtained from the subject, optionally wherein the probe (or CpG probe) component of the kit consists of up to 97, 100, 110, 120, 130, 140, 150, 160, 170, 180, 190, 200, 300, 400 or 500 different probes or consists of probes for detecting the methylation levels of up to 97, 100, 110, 120, 130, 140, 150, 160, 170, 180, 190, 200, 300, 400 or 500 CpG sites.
  • kits may for example only comprise probes (or other appropriate entities) for detecting (or measuring) up to the 97 CpG sites listed in Table 3. In other words, no other probes (or no forms or copies of the other entities) for detecting (or measuring) other CpG sites are present in these examples.
  • a kit for obtaining information useful to determine susceptibility (or the presence or absence of susceptibility) to severe COVID-19 in a subject, the kit comprising a plurality of probes (or other appropriate entities) specific for (or specifically targeted to) at least or at most 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 12, 15, 20, 25, 30, or 34 CpG sites selected from the 34 CpG sites listed in Table 7 in DNA from a biological sample obtained from the subject.
  • the probes are for detecting (or measuring) the methylation levels of at least or at most 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, or 34 CpG sites selected from the 34 CpG sites listed in Table 7.
  • the kit is a PCR kit (i.e. a kit suitable for performing PCR or a PCR-based method), more preferably wherein the probe (or CpG probe) component of the kit consists of any of the following range of different probes, or the probe (or CpG probe) component consists of probes for detecting the methylation levels of any of the following range of CpG sites:
  • the probe is a PCR primer (i.e. a probe suitable for use in
  • the kit is a microarray kit (i.e. a kit suitable for performing a microarray-based method), more preferably wherein the probe (or CpG probe) component of the kit consists of any of the following range of different probes, or the probe (or CpG probe) component consists of probes for detecting the methylation levels of any of the following range of numbers of CpG sites:
  • the kit may comprise (or further comprise) a label necessary for the detection of the probes (or for the detection of other appropriate entities), for example a selective label as described elsewhere herein.
  • the selective labels may be for example labeled dideoxynucleotides (e.g. ddATP, ddCTP, ddGTP, ddTTP, ddUTP). Such dideoxynucleotides are used in chain extension of the oligonucleotide probe using a polymerase enzyme as described elsewhere herein.
  • the kit may therefore comprise (or further comprise) a polymerase enzyme (e.g. a DNA polymerase enzyme).
  • the kit may comprise (or further comprise) a reagent used in a DNA polymerization process, a DNA hybridization process, and/or a DNA bisulfite conversion process.
  • the kit may comprise (or further comprise) instructions for carrying out the methods of the invention.
  • a probe is for detecting or measuring the methylation level of a CpG site
  • the probe is targeted towards said CpG site and not other CpG sites, e.g. is selective for or specific for said CpG site.
  • the terms “suitable for detecting”, “suitable for determining” and “suitable for measuring” or similar are also encompassed.
  • Appropriate probes for use in the kits of the invention are described elsewhere herein but are conveniently nucleic acid probes. In embodiments the kit contains two different types of probe as described elsewhere herein.
  • the probes are attached to a solid support or a substrate.
  • “solid support” and “substrate” as used herein are used interchangeably and refer to a material or group of materials having a rigid or semi-rigid surface or surfaces.
  • at least one surface of the solid support will be substantially flat, although in some embodiments it may be desirable to physically separate synthesis regions for different compounds with, for example, wells, raised regions, pins, etched trenches, or the like.
  • the solid support will take the form of beads, resins, gels, microspheres, or other geometric configurations.
  • multiple probe types may be included for determining the methylation level of a given CpG site; for example, two probe types may be used, wherein the first probe enables detection of the methylated form of the CpG site (or, depending on the methylation detection protocol used, a derivative of said methylated form of said CpG site) and the second probe enables detection of the unmethylated form of the CpG site (or, depending on the methylation detection protocol used, a derivative of said methylated form of said CpG site, for example a bisulfite-treated or bisulfite-converted form of said CpG site).
  • the invention provides a panel of CpG sites in accordance with the invention as described elsewhere herein.
  • the invention provides a panel or set of biomarkers, said panel or set of biomarkers comprising (or consisting of) CpG sites in accordance with the invention as described elsewhere herein.
  • a decrease or increase is generally regarded as statistically significant if a statistical comparison using a significance test such as a Student's t-test, Mann-Whitney U rank-sum test, chi-square test or Fisher's exact test, one-way ANOVA or two-way ANOVA tests as appropriate, shows a probability value of <0.05.
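As an illustration of one of the named tests, the following is a minimal sketch of a two-sided Fisher's exact test on a 2x2 table, using only the standard library. The counts are hypothetical, and the two-sided p-value is computed by summing the probabilities of all tables that are no more likely than the observed one, which is one common convention.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denominator = comb(row1 + row2, col1)

    def probability(x):
        # Hypergeometric probability of observing x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / denominator

    observed = probability(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Small relative tolerance guards against floating-point ties.
    return sum(
        p for p in (probability(x) for x in range(lowest, highest + 1))
        if p <= observed * (1 + 1e-9)
    )

# Hypothetical counts, for illustration only.
p_value = fisher_exact_two_sided(1, 9, 11, 3)
significant = p_value < 0.05
```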
  • a method of screening for the susceptibility of a subject to severe COVID-19 comprising using the methylation levels of at least:
  • the method of embodiment 1 comprising using the methylation levels of at least 2, 3, 4, 5, 10, or 15 CpG sites selected from the 251 CpG sites listed in Table 2 and/or at least 2, 3, 4, 5, 10, or 15 CpG sites selected from the 177 CpG sites listed in Table 1.
  • the biological sample is a blood sample, or a white blood cell sample.
  • a computer program comprising instructions that, when executed by a processing system, cause the processing system to process data representative of methylation levels of at least:
  • the computer program of embodiment 17, comprising instructions that, when executed by a processing system, cause the processing system to process data representative of methylation levels of at least 2, 3, 4, 5, 10, or 15 CpG sites selected from the 97 CpG sites listed in Table 3, preferably selected from the 34 CpG sites listed in Table 7.
  • kits for screening for susceptibility of a subject to severe COVID-19 comprising probes for detecting the methylation levels of at least:
  • CpG sites selected from the 97 CpG sites listed in Table 3; in DNA from a biological sample obtained from the subject, wherein the CpG probe component of the kit consists of probes for detecting the methylation levels of up to 500 CpG sites.
  • kit of embodiment 19 said kit comprising probes for detecting the methylation levels of at least:
  • kit of embodiment 19 or embodiment 20 said kit comprising probes for detecting the methylation levels of at least 2, 3, 4, 5, 10, or 15 CpG sites selected from the 91 CpG sites listed in both of Tables 1 and 2.
  • kit of embodiment 19 said kit comprising probes for detecting the methylation levels of at least 2, 3, 4, 5, 10, or 15 CpG sites selected from the 97 CpG sites listed in Table 3, preferably selected from the 34 CpG sites listed in Table 7.
  • Figure 1 provides a Venn diagram showing the overlap between the three lists of CpG sites of Tables 1 to 3.
  • Figure 2 provides a flowchart for the feature selection (CpG site filtering) process.
  • Figure 3 provides a ROC curve for a model using all 177 CpG sites together from the list of Table 1.
  • Figure 4 provides a confusion matrix for a model using all 177 CpG sites together from the list of Table 1 (the same model as in Figure 3).
  • Figure 5 provides a ROC curve for a model using 5 CpG sites selected from the list of 97 CpG sites of Table 3.
  • Figure 6 provides a confusion matrix for a model using 5 CpG sites selected from the list of 97 CpG sites of Table 3 (the same model as in Figure 5).
  • Figure 7 provides model performance depending on the number of CpG sites used in the model, where the CpG sites are selected from the list of 177 CpG sites of Table 1.
  • Each boxwhisker (each bar) shows the results of 300 tested combinations (models).
  • the box is drawn from the first quartile (Q1) to the third quartile (Q3) with a horizontal line drawn in the middle to denote the median.
  • the end of the lower whisker is the minimum performance value of the given number of CpG sites (i.e. the value for the combination of CpG sites (model) which gave the poorest performance), and the end of the upper whisker is the maximum value (i.e. the value for the combination of CpG sites (model) which gave the best performance).
  • Figure 8 provides model performance depending on the number of CpG sites used in the model, where the CpG sites are selected from the list of 251 CpG sites of Table 2. Each boxwhisker (each bar) shows the results of 300 tested combinations, and is used as described for Figure 7.
  • Figure 9 provides model performance depending on the number of CpG sites used in the model, where the CpG sites are selected from the list of 97 CpG sites of Table 3. Each boxwhisker (each bar) shows the results of 300 tested combinations, and is used as described for Figures 7 and 8.
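The five values summarised by each box-and-whisker element of Figures 7 to 9 (minimum, Q1, median, Q3, maximum) can be computed as sketched below. The quartile interpolation method is an assumption, since it is not specified here, and the performance values are hypothetical.

```python
import statistics

def box_whisker_summary(values):
    """Return the five values drawn for one box-and-whisker element.

    Whiskers extend to the minimum and maximum (poorest and best model),
    the box spans Q1 to Q3, and the line inside the box is the median.
    """
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "min": min(values),
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": max(values),
    }

# Hypothetical performance values for five tested combinations of CpG sites.
summary = box_whisker_summary([0.5, 0.6, 0.7, 0.8, 0.9])
```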
  • Table 9 Cohort information detailing number of cases in each data set. Sex and age information are also supplied.
  • the technical solution is a test for diagnosing COVID-19 severity by reading the DNA methylation level in white blood cells at specific CpG sites, and combining these values into a score using a mathematical formula that classifies into either severe or non-severe COVID-19. While the formula is developed for binary classification, its readout is a continuous score from 0 (mild case) to 1 (severe), which can be understood and implemented as a severity risk score. This allows for additional, more granular categories to be introduced in a clinical context.
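  • the mathematical formula is in the form of a multiple logistic regression such that
    $$\log_b\left(\frac{p}{1-p}\right) = \beta_0 + \beta_1 x_1 + \dots + \beta_m x_m$$
    or equivalently
    $$p = \frac{1}{1 + b^{-(\beta_0 + \beta_1 x_1 + \dots + \beta_m x_m)}}$$
    where
    o $p$ is the probability of severe COVID-19 progression (a value between 0 and 1)
    o $b$ is $e$
    o $\beta_0$ is the regression intercept
    o $\beta_1$ is the weight for CpG-site 1
    o $x_1$ is the methylation level of CpG-site 1
    o $\beta_m$ is the weight for CpG-site m
    o $x_m$ is the methylation level of CpG-site m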
  • a support vector machine (svm) model can optionally be fitted on an independent development data set drawn from the same or another cohort as the training data (the dev data) using predictions of the logistic regression (elastic net) model and patient age and/or sex as input to predict severity.
  • Another instance of such a technical solution could involve a random forest classification model in place of the logistic regression, optionally followed by a feedforward neural network in place of the svm.
  • the filtering process made use of several steps to ascertain differential methylation, similar behavior in both training data sets, and usefulness in a lab setting (see Figure 2, with the Steps below indicated in brackets).
  • Step 1 overlap between data sets: any CpG sites not present in all three used data sets were discarded.
  • Step 2 differential methylation analysis: was performed in both training data sets separately. Only sites found to be significantly differentially methylated in both training data sets were retained.
  • Step 3 conformity between data sets: the sign of the differential methylation had to be equal in both data sets.
  • Step 4 distribution dissimilarity analysis: the means of the methylation ratio distributions for non-severe cases in both data sets were required to be similar.
  • Step 5 overlap between data sets: only CpG sites passing steps 1, 2 and 3 in both data sets were kept.
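The filtering steps above can be sketched as a simple set-based pipeline. This is an illustrative simplification only: the per-site statistics (p-value, signed effect, mean methylation in non-severe cases) are assumed to be precomputed, and the thresholds `alpha` and `max_mean_gap` are hypothetical parameters, not values from this document.

```python
def filter_cpg_sites(train_a, train_b, holdout_sites, alpha=0.05, max_mean_gap=0.05):
    """Return CpG sites that pass all filtering steps.

    train_a, train_b: dict mapping site id -> dict with keys
        "p_value" (differential methylation), "effect" (signed difference)
        and "nonsevere_mean" (mean methylation in non-severe cases).
    holdout_sites: iterable of site ids present in the third data set.
    """
    # Step 1: keep only sites present in all three data sets.
    shared = set(train_a) & set(train_b) & set(holdout_sites)
    kept = []
    for site in sorted(shared):
        a, b = train_a[site], train_b[site]
        # Step 2: significant differential methylation in both training sets.
        significant = a["p_value"] < alpha and b["p_value"] < alpha
        # Step 3: same sign of differential methylation in both training sets.
        same_sign = a["effect"] * b["effect"] > 0
        # Step 4: similar non-severe means in both training sets.
        similar = abs(a["nonsevere_mean"] - b["nonsevere_mean"]) <= max_mean_gap
        # Step 5: keep only sites passing every check in both data sets.
        if significant and same_sign and similar:
            kept.append(site)
    return kept
```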
  • the American and Spanish cohorts were split so that 75% of the COVID-19-positive cases formed the training data set for the initial regression modeling (which determines the used CpG sites) while the remaining 25% were used to form a development (dev) set used to fit support vector machine models taking the output of the regression models and patient age as input. Both regression and svm models use severity/hospitalization as the dependent variable to be predicted. Holdout and dev set identities were kept consistent across all models. The entire Norwegian cohort was retained as a fully independent holdout data set used exclusively for testing purposes.
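A reproducible 75%/25% split of this kind can be sketched as follows. The use of a fixed random seed to keep the training and dev set identities consistent across models is an assumption about how such consistency might be achieved; any stratification by cohort or severity is omitted from this sketch.

```python
import random

def split_train_dev(case_ids, train_fraction=0.75, seed=0):
    """Split case identifiers into training and dev sets.

    Sorting before shuffling with a fixed seed makes the split independent
    of input order, so the same cases land in the same set every time.
    """
    ordered = sorted(case_ids)
    rng = random.Random(seed)
    rng.shuffle(ordered)
    cut = round(len(ordered) * train_fraction)
    return ordered[:cut], ordered[cut:]

# Hypothetical case identifiers, for illustration only.
cases = [f"case_{i:03d}" for i in range(100)]
train_ids, dev_ids = split_train_dev(cases)
```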
  • an elastic net model was trained to arrive at a severity prediction formula optimized for microarray data. 177 CpG sites were used by the resulting model (Table 1). The predictions of the svm model form the final COVID-19 severity prediction. We then performed stability selection (Ref 9) to determine CpG sites useful for other potential elastic net models. 251 sites were identified as relevant for an array-optimized model using stability selection (Table 2). The union of the 177 CpG sites used by the most performant model and the 251 sites determined by stability selection comprises 337 sites in total (Table 5), which we have determined to be relevant for the invention. The two lists have 91 sites in common (Table 6).
  • the 97 sites were ranked in terms of order of importance (Table 3). This was achieved by training 20,000 severity predictor models using combinations of 5 randomly determined sites (out of the 97) each. For each model, Balanced Accuracy was calculated using the holdout data. Then, a new model was fitted that predicted the resulting 20,000 Balanced Accuracy values depending on whether or not each of the 97 sites was used in the predictor. Thus, each Balanced Accuracy value could be attributed to five sites that were used in a specific case. An overall estimate could thus be obtained of what influence any given site had on Balanced Accuracy. These estimates are the coefficients for the usefulness model, and those are listed in the site usefulness column. So, the higher that value is, the higher the average Balanced Accuracy of severity prediction models using the site in question.
  • the ensemble model (logistic regression followed by svm), using an exemplary set of 5 CpG sites selected from the list of 97 CpG sites of Table 3, and with the cutoff value for severity classification set at 0.5, achieved an AUC of 0.88 ( Figure 5) and balanced accuracy, sensitivity and specificity of 0.8 in the independent holdout set (Table 11, see Figure 6 for a confusion matrix).
  • Table 11
  • a box-whisker plot for each number of sites is provided in Figure 7 for the array list (300 models of each number of CpG sites (15), so 4500 models in total), in Figure 8 for the stability selection list (300 models of each number of CpG sites (15), so 4500 models in total), and in Figure 9 for the PCR list (300 models for each number of CpG sites (11), so 3300 models in total).
  • the data demonstrate that with as few as 2 sites, a balanced accuracy of 0.8 or higher can be achieved using combinations drawn from any of the three proposed sets of CpG sites. Prediction success rate does not fall below or even approach random performance (defined as a balanced accuracy of 0.5).
  • these data demonstrate that susceptibility to severe COVID-19 can be reasonably predicted (and thus that the method of the invention may be performed) using as few as 2 CpG sites selected from each of the three lists of Tables 1, 2 and/or 3.
  • Severe illness: same as moderate illness plus SpO2 ⁇ 94%. Management is hospitalization with O2 therapy. May deteriorate rapidly.
  • With COVID-19 now considered to be endemic, our method has the potential to become an important tool for clinicians facing COVID-19 patients in the years ahead, as they make decisions about the treatment for individual patients.
  • The COVID-19 severity test described herein can be an important addition to the standard of care in the monitoring and treatment of individual patients.
  • each CpG site in each Table has been assigned a “CpG site number”.
  • CpG site “cg04610187” can alternatively be referred to as “CpG site number 1 of Table 4”, and so on.
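Steps 1 to 3 of the filtering process listed above (overlap between data sets, differential methylation in both training sets, matching sign) can be pictured with a minimal Python sketch. The data layout, the function name `filter_sites`, the site names and the significance threshold `alpha` are assumptions made purely for illustration; the text above does not state which statistical test or cutoff was used.

```python
from typing import Dict, Set, Tuple

# Per-site result of a differential methylation analysis in one training set:
# (p_value, effect), where effect > 0 means higher methylation in severe cases.
DmResult = Dict[str, Tuple[float, float]]


def filter_sites(
    train_a: DmResult,
    train_b: DmResult,
    holdout_sites: Set[str],
    alpha: float = 0.05,  # hypothetical threshold, not taken from the source
) -> Set[str]:
    # Step 1: keep only sites present in all three data sets.
    shared = set(train_a) & set(train_b) & holdout_sites

    kept = set()
    for site in shared:
        p_a, effect_a = train_a[site]
        p_b, effect_b = train_b[site]
        # Step 2: significant in both training sets, analysed separately.
        if p_a >= alpha or p_b >= alpha:
            continue
        # Step 3: the sign of the differential methylation must agree.
        if (effect_a > 0) != (effect_b > 0):
            continue
        kept.add(site)
    return kept


# Toy input with made-up site names and statistics.
train_a = {"site1": (0.001, 0.10), "site2": (0.002, -0.08),
           "site3": (0.20, 0.05), "site4": (0.001, 0.07)}
train_b = {"site1": (0.004, 0.12), "site2": (0.003, 0.09),
           "site3": (0.001, 0.06), "site4": (0.002, 0.05)}
holdout_sites = {"site1", "site2", "site3"}

print(sorted(filter_sites(train_a, train_b, holdout_sites)))
```

In this toy input only `site1` survives: `site2` changes sign between the training sets, `site3` is not significant in the first training set, and `site4` is missing from the holdout data.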

Abstract

The present invention relates generally to methods of screening for susceptibility to severe COVID-19, as well as kits for screening for susceptibility to severe COVID-19. More particularly, the invention relates to a method of screening for susceptibility to severe COVID-19 in a subject, the method comprising using methylation levels of CpG sites in DNA from a biological sample obtained from the subject in order to screen for susceptibility to severe COVID-19 in the subject, wherein said methylation levels are used to provide an indication of susceptibility to severe COVID-19 in the subject.

Description

Method of screening for severe COVID-19 susceptibility
The present invention relates generally to methods of screening for susceptibility to severe COVID-19, as well as kits for screening for susceptibility to severe COVID-19.
The highly contagious severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) emerged in Wuhan, China, in late 2019 (Ref 1), and has since spread to the entire world, infecting billions and causing millions of deaths. The clinical picture of coronavirus disease 2019 (COVID-19) caused by SARS-CoV-2 infection spans a wide spectrum ranging from asymptomatic disease to acute respiratory distress syndrome (ARDS), multiorgan failure, and death (Ref 2). The clinical manifestations are systemic and heterogeneous, and the organ damage appears to be largely immune-mediated and regulated by epigenetic changes (Ref 3). The infection may trigger a wide range of autoimmune responses, including multisystem inflammatory syndrome in children (MIS-C) (Ref 4), hemolytic anemia, myocarditis, and Guillain-Barré syndrome (Ref 5). Older patients and individuals with comorbidities and compromised immunity appear to be more prone to severe disease (Ref 6), but it is generally difficult to know in advance who will be severely affected, and who will escape with mild disease (Ref 7).
The introduction of vaccines has significantly reduced the severity of disease at the population level, with numbers from the clinical trials and also aftermarket trials indicating an efficacy of 90-95% in preventing hospitalization and severe disease after vaccination, though possibly waning over time (Ref 8). However, with the rise of new variants, such as omicron, hospitalization and death rates spiked again. While patients generally exhibit less severe disease and shorter stays both in hospital and in the ICU, the sheer number of patients places a massive burden on the healthcare system. Any tool that can aid with triaging patients, predicting the clinical course or optimizing treatment would help alleviate this resource constraint, both in response to the omicron variant and in response to future variants.
Thus, there remains a need for supportive and directed therapies for COVID-19 patients, even after the widespread rollout of vaccines. In particular, identifying individuals in need of early treatment, guiding treatment selection, and predicting severity of disease remain unmet medical needs.
The invention provided herein addresses that unmet need. Provided herein is a novel multivariate DNA methylation-based assay for classifying severity of COVID-19 infection, which the inventors have named the COVID-19 Severity Test (CST). The inventors used two previously published data sets to train and develop SARS-CoV-2 severity models, while their own novel data was used solely as holdout data to validate their findings. To the knowledge of the inventors, no existing study has so far used a fully independent data set to test COVID-19 severity models.
As explained in the Examples, the present inventors have identified 433 CpG sites in total which are highly relevant for determining susceptibility to severe COVID-19. The 433 CpG sites were identified from 3 different filtering processes devised by the inventors, yielding 3 overlapping lists of CpG sites.
Two of the filtering processes were used to identify sites suitable for measurement in a microarray-based protocol. Specifically, one of the filtering processes involved training a model to identify a list of sites for which best performance in COVID-19 severity prediction was achieved based on the datasets used; this yielded a list of 177 CpG sites (as listed in Table 1). The other filtering process involved the use of stability selection (Ref 9) to identify the most robust CpG sites for predicting susceptibility to severe COVID-19 (in brief, a selection of CpG sites with the highest probability of appearing in other potential models); this yielded a list of 251 CpG sites (as listed in Table 2). The two lists of CpG sites generated by these two filtering processes exhibited significant overlap; specifically, 91 CpG sites (listed in Table 6) are common to both Tables 1 and 2, so that the two lists together comprise 337 distinct CpG sites (listed in Table 5).
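The site counts quoted in this paragraph are consistent with simple inclusion-exclusion: 177 + 251 - 91 = 337. The short Python sketch below checks this with placeholder identifiers; the names are invented for illustration, and the actual CpG site identifiers are those listed in Tables 1, 2, 5 and 6.

```python
# Placeholder identifiers only: 91 shared sites, plus the sites unique to each list.
shared = {f"shared_{i}" for i in range(91)}
table_1 = shared | {f"t1_only_{i}" for i in range(177 - 91)}
table_2 = shared | {f"t2_only_{i}" for i in range(251 - 91)}

table_5 = table_1 | table_2  # union of both lists
table_6 = table_1 & table_2  # sites common to both lists

print(len(table_1), len(table_2), len(table_6), len(table_5))  # 177 251 91 337
```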
The third filtering process was used to identify sites particularly suitable for measurement in a polymerase chain reaction (PCR)-based protocol. PCR machines are usually available on site in hospitals, unlike microarray machines, and so it is advantageous to provide a protocol for determining susceptibility to severe COVID-19 which is specifically optimized for use with PCR. In the PCR-optimized filtering process, CpG sites were filtered (selected) inter alia based on the magnitude of the dissimilarity of the methylation ratio distributions between severe and non-severe patients, in order to ensure the usefulness of CpG sites in a lab setting where, potentially, only a few CpG sites can be utilized at once in the absence of high-throughput data. The filtering process yielded a final list of 97 CpG sites as provided in Table 3. While the list of 97 CpG sites is particularly suitable for use in a PCR-based protocol, it will be readily understood that these sites are also suitable and could be used advantageously in a microarray-based protocol. Table 7 provides a subset of 34 CpG sites from Table 3 which are particularly preferred.
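One way to picture filtering by the magnitude of dissimilarity between the severe and non-severe methylation ratio distributions is to rank sites by a standardized mean difference. The text does not state which dissimilarity measure was actually used, so the measure, the function names `dissimilarity` and `top_sites`, and the toy values in this Python sketch are all assumptions.

```python
from statistics import mean, stdev
from typing import Dict, List, Tuple


def dissimilarity(severe: List[float], non_severe: List[float]) -> float:
    """Absolute standardized mean difference (an assumed measure)."""
    pooled_sd = ((stdev(severe) ** 2 + stdev(non_severe) ** 2) / 2) ** 0.5
    return abs(mean(severe) - mean(non_severe)) / pooled_sd


def top_sites(data: Dict[str, Tuple[List[float], List[float]]], n: int) -> List[str]:
    """Return the n sites whose two patient groups are most clearly separated."""
    ranked = sorted(data, key=lambda s: dissimilarity(*data[s]), reverse=True)
    return ranked[:n]


# Made-up methylation ratios (between 0 and 1) for (severe, non-severe) patients.
data = {
    "siteA": ([0.80, 0.82, 0.78, 0.81], [0.40, 0.42, 0.38, 0.41]),
    "siteB": ([0.50, 0.55, 0.45, 0.52], [0.49, 0.53, 0.47, 0.50]),
    "siteC": ([0.30, 0.35, 0.25, 0.32], [0.60, 0.66, 0.58, 0.62]),
}

print(top_sites(data, 2))
```

With these toy values the two well-separated sites (`siteA`, then `siteC`) are retained, while `siteB`, whose groups overlap almost completely, is dropped. A site with a large separation is the kind that remains informative when only a few sites can be measured at once.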
Thus, in one aspect, the invention provides a method of screening for (or diagnosing) susceptibility (or the presence or absence of susceptibility) of a subject to (or to develop, or to developing, or to the development of) severe COVID-19 (or a method of determining whether or not a subject will develop severe COVID-19, or a method of determining the likelihood (or probability) that a subject will develop severe COVID-19), the method comprising using the methylation levels of at least or at most 1 , 2, 3, 4, 5, 7, 10, 15, 20, 25, 30, 40, 50, 60, 70, 75, 80, 90, 100, 125, 150, 200, 250, 300, 350, 400 or 433 CpG sites selected from the 433 CpG sites listed in Table 4 in DNA from a biological sample obtained from the subject in order to screen for susceptibility to severe COVID-19 in the subject, wherein said methylation levels are used to provide an indication of the susceptibility of the subject to (or to develop, or to developing, or to the development of) severe COVID-19 (or wherein said methylation levels are used to determine whether or not the subject will develop severe COVID-19, or to determine the likelihood (or probability) that the subject will develop severe COVID-19).
Alternatively viewed, the method of the invention may be viewed as a method of predicting (or prognosing) the course of COVID-19 in a subject, or a method of obtaining an indication of the susceptibility of a subject to (or to develop, or to developing, or to the development of) severe COVID-19, or a method of obtaining clinically relevant information about a subject, the method comprising using the methylation levels of at least or at most 1, 2, 3, 4, 5, 7, 10, 15, 20, 25, 30, 40, 50, 60, 70, 75, 80, 90, 100, 125, 150, 200, 250, 300, 350, 400 or 433 CpG sites selected from the 433 CpG sites listed in Table 4 in DNA from a biological sample obtained from the subject in order to screen for susceptibility of the subject to severe COVID-19, wherein said methylation levels are used to provide an indication of the susceptibility of the subject to (or to develop, or to developing, or to the development of) severe COVID-19.
In all of the aspects and embodiments of the invention provided herein, where a method of or kit for “obtaining clinically relevant information” is contemplated, or a method of or kit for “diagnosing” or “prognosing” or similar is contemplated, a method of or kit for “providing information (or clinically relevant information) for diagnosing (or prognosing)” may also be contemplated.
In embodiments, the method comprises using the methylation levels of at least or at most the CpG sites referred to in each of Tables 1 to 3 as CpG site numbers 1 to 50, 1 to 49, 1 to 48, 1 to 47, 1 to 46, 1 to 45, 1 to 44, 1 to 43, 1 to 42, 1 to 41, 1 to 40, 1 to 39, 1 to 38, 1 to 37, 1 to 36, 1 to 35, 1 to 34, 1 to 33, 1 to 32, 1 to 31, 1 to 30, 1 to 29, 1 to 28, 1 to 27, 1 to 26, 1 to 25, 1 to 24, 1 to 23, 1 to 22, 1 to 21, 1 to 20, 1 to 19, 1 to 18, 1 to 17, 1 to 16, 1 to 15, 1 to 14, 1 to 13, 1 to 12, 1 to 11, 1 to 10, 1 to 9, 1 to 8, 1 to 7, 1 to 6, 1 to 5, 1 to 4, 1 to 3, 1 to 2 (i.e. at least or at most CpG site numbers 1 and 2 of Tables 1 to 3 (i.e. 6 sites in total)), or 1 (i.e. at least or at most CpG site number 1 of Tables 1 to 3 (i.e. 3 sites in total)).
In all of the aspects and embodiments herein, where it is considered that the methylation levels are used to provide an indication of the susceptibility of the subject to (or to develop) severe COVID-19, etc., it may be viewed alternatively that the methylation levels are indicative of, or used as an indication of, or used to provide an indication of, the susceptibility of the subject to (or to develop) severe COVID-19, etc.
In all of the aspects and embodiments herein, where “at least or at most 1, 2, 3, 4...” CpG sites is contemplated, “at least 2” CpG sites is preferred.
Thus, in one aspect, the invention provides a method of screening for (or diagnosing) susceptibility (or the presence or absence of susceptibility) of a subject to (or to develop, or to developing, or to the development of) severe COVID-19 (or a method of determining whether or not a subject will develop severe COVID-19, or a method of determining the likelihood (or probability) that a subject will develop severe COVID-19), the method comprising using the methylation levels of at least or at most:
1 , 2, 3, 4, 5, 7, 10, 15, 20, 25, 30, 40, 50, 60, 70, 75, 80, 90, 100, 125, 150, 200, 250, 300, or 337 CpG sites selected from the 337 CpG sites listed in Table 5; and/or
1 , 2, 3, 4, 5, 6, 7, 8, 9, 10, 12, 15, 20, 25, 30, 40, 50, 60, 70, 75, 80, 90, or 97 CpG sites selected from the 97 CpG sites listed in Table 3; in DNA from a biological sample obtained from the subject in order to screen for susceptibility to severe COVID-19 in the subject, wherein said methylation levels are used to provide an indication of the susceptibility of the subject to (or to develop, or to developing, or to the development of) severe COVID-19 (or wherein said methylation levels are used to determine whether or not the subject will develop severe COVID-19, or to determine the likelihood (or probability) that the subject will develop severe COVID-19).
Alternatively viewed, the method of the invention may be viewed as a method of predicting (or prognosing) the course of COVID-19 in a subject, or a method of obtaining an indication of the susceptibility of a subject to (or to develop, or to developing, or to the development of) severe COVID-19, or a method of obtaining clinically relevant information about a subject, the method comprising using the methylation levels of at least or at most:
1 , 2, 3, 4, 5, 7, 10, 15, 20, 25, 30, 40, 50, 60, 70, 75, 80, 90, 100, 125, 150, 200, 250, 300, or 337 CpG sites selected from the 337 CpG sites listed in Table 5; and/or
1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 12, 15, 20, 25, 30, 40, 50, 60, 70, 75, 80, 90, or 97 CpG sites selected from the 97 CpG sites listed in Table 3; in DNA from a biological sample obtained from the subject in order to screen for susceptibility of the subject to severe COVID-19, wherein said methylation levels are used to provide an indication of the susceptibility of the subject to (or to develop, or to developing, or to the development of) severe COVID-19.
In embodiments, the method comprises using the methylation levels of at least or at most the CpG sites referred to in each of Tables 1 to 3 as CpG site numbers 1 to 50, 1 to 49, 1 to 48, 1 to 47, 1 to 46, 1 to 45, 1 to 44, 1 to 43, 1 to 42, 1 to 41, 1 to 40, 1 to 39, 1 to 38, 1 to 37, 1 to 36, 1 to 35, 1 to 34, 1 to 33, 1 to 32, 1 to 31, 1 to 30, 1 to 29, 1 to 28, 1 to 27, 1 to 26, 1 to 25, 1 to 24, 1 to 23, 1 to 22, 1 to 21, 1 to 20, 1 to 19, 1 to 18, 1 to 17, 1 to 16, 1 to 15, 1 to 14, 1 to 13, 1 to 12, 1 to 11, 1 to 10, 1 to 9, 1 to 8, 1 to 7, 1 to 6, 1 to 5, 1 to 4, 1 to 3, 1 to 2 (i.e. at least or at most CpG site numbers 1 and 2 of Tables 1 to 3 (so 6 sites in total)), or 1 (i.e. at least or at most CpG site number 1 of Tables 1 to 3 (so 3 sites in total)).
In another aspect, the invention provides a method of screening for (or diagnosing) susceptibility (or the presence or absence of susceptibility) of a subject to (or to develop, or to developing, or to the development of) severe COVID-19 (or a method of determining whether or not a subject will develop severe COVID-19, or a method of determining the likelihood (or probability) that a subject will develop severe COVID-19), the method comprising using the methylation levels of at least or at most:
1 , 2, 3, 4, 5, 7, 10, 15, 20, 25, 30, 40, 50, 60, 70, 75, 80, 90, 100, 125, 150, 200, 250, 300, or 337 CpG sites selected from the 337 CpG sites listed in Table 5; in DNA from a biological sample obtained from the subject in order to screen for susceptibility to severe COVID-19 in the subject, wherein said methylation levels are used to provide an indication of the susceptibility of the subject to (or to develop, or to developing, or to the development of) severe COVID-19 (or wherein said methylation levels are used to determine whether or not the subject will develop severe COVID-19, or to determine the likelihood (or probability) that the subject will develop severe COVID-19).
Alternatively viewed, the method of the invention may be viewed as a method of predicting (or prognosing) the course of COVID-19 in a subject, or a method of obtaining an indication of the susceptibility of a subject to (or to develop, or to developing, or to the development of) severe COVID-19, or a method of obtaining clinically relevant information about a subject, the method comprising using the methylation levels of at least or at most:
1 , 2, 3, 4, 5, 7, 10, 15, 20, 25, 30, 40, 50, 60, 70, 75, 80, 90, 100, 125, 150, 200, 250, 300, or 337 CpG sites selected from the 337 CpG sites listed in Table 5; in DNA from a biological sample obtained from the subject in order to screen for susceptibility of the subject to severe COVID-19, wherein said methylation levels are used to provide an indication of the susceptibility of the subject to (or to develop, or to developing, or to the development of) severe COVID-19.
In embodiments, the method comprises using the methylation levels of at least or at most the CpG sites referred to in each of Table 1 and Table 2 as CpG site numbers 1 to 50, 1 to 49, 1 to 48, 1 to 47, 1 to 46, 1 to 45, 1 to 44, 1 to 43, 1 to 42, 1 to 41, 1 to 40, 1 to 39, 1 to 38, 1 to 37, 1 to 36, 1 to 35, 1 to 34, 1 to 33, 1 to 32, 1 to 31, 1 to 30, 1 to 29, 1 to 28, 1 to 27, 1 to 26, 1 to 25, 1 to 24, 1 to 23, 1 to 22, 1 to 21, 1 to 20, 1 to 19, 1 to 18, 1 to 17, 1 to 16, 1 to 15, 1 to 14, 1 to 13, 1 to 12, 1 to 11, 1 to 10, 1 to 9, 1 to 8, 1 to 7, 1 to 6, 1 to 5, 1 to 4, 1 to 3, 1 to 2 (i.e. at least or at most CpG site numbers 1 and 2 of Table 1 and Table 2 (so 4 sites in total)), or 1 (i.e. at least or at most CpG site number 1 of Table 1 and Table 2 (so 2 sites in total)).
In another aspect, the invention provides a method of screening for (or diagnosing) susceptibility (or the presence or absence of susceptibility) of a subject to (or to develop, or to developing, or to the development of) severe COVID-19 (or a method of determining whether or not a subject will develop severe COVID-19, or a method of determining the likelihood (or probability) that a subject will develop severe COVID-19), the method comprising using the methylation levels of at least or at most 1 , 2, 3, 4, 5, 7, 10, 15, 20, 25, 30, 40, 50, 60, 70, 75, 80, 90, 100, 125, 150, 200, 250, or 251 CpG sites selected from the 251 CpG sites listed in Table 2 in DNA from a biological sample obtained from the subject in order to screen for susceptibility to severe COVID-19 in the subject, wherein said methylation levels are used to provide an indication of the susceptibility of the subject to (or to develop, or to developing, or to the development of) severe COVID-19 (or wherein said methylation levels are used to determine whether or not the subject will develop severe COVID-19, or to determine the likelihood (or probability) that the subject will develop severe COVID-19).
Alternatively viewed, the method of the invention may be viewed as a method of predicting (or prognosing) the course of COVID-19 in a subject, or a method of obtaining an indication of the susceptibility of a subject to (or to develop, or to developing, or to the development of) severe COVID-19, or a method of obtaining clinically relevant information about a subject, the method comprising using the methylation levels of at least or at most 1, 2, 3, 4, 5, 7, 10, 15, 20, 25, 30, 40, 50, 60, 70, 75, 80, 90, 100, 125, 150, 200, 250, or 251 CpG sites selected from the 251 CpG sites listed in Table 2 in DNA from a biological sample obtained from the subject in order to screen for susceptibility of the subject to severe COVID-19, wherein said methylation levels are used to provide an indication of the susceptibility of the subject to (or to develop, or to developing, or to the development of) severe COVID-19.
In embodiments, the method comprises using the methylation levels of at least or at most the CpG sites referred to in Table 2 as CpG site numbers 1 to 50, 1 to 49, 1 to 48, 1 to 47, 1 to 46, 1 to 45, 1 to 44, 1 to 43, 1 to 42, 1 to 41, 1 to 40, 1 to 39, 1 to 38, 1 to 37, 1 to 36, 1 to 35, 1 to 34, 1 to 33, 1 to 32, 1 to 31, 1 to 30, 1 to 29, 1 to 28, 1 to 27, 1 to 26, 1 to 25, 1 to 24, 1 to 23, 1 to 22, 1 to 21 , 1 to 20, 1 to 19, 1 to 18, 1 to 17, 1 to 16, 1 to 15, 1 to 14, 1 to 13, 1 to 12, 1 to 11 , 1 to 10, 1 to 9, 1 to 8, 1 to 7, 1 to 6, 1 to 5, 1 to 4, 1 to 3, 1 to 2 (i.e. at least or at most CpG site numbers 1 and 2 of Table 2 (so 2 sites in total)), or 1 (i.e. at least or at most CpG site number 1 of Table 2 (so 1 site in total)).
In another aspect, the invention provides a method of screening for (or diagnosing) susceptibility (or the presence or absence of susceptibility) of a subject to (or to develop, or to developing, or to the development of) severe COVID-19 (or a method of determining whether or not a subject will develop severe COVID-19, or a method of determining the likelihood (or probability) that a subject will develop severe COVID-19), the method comprising using the methylation levels of at least or at most 1 , 2, 3, 4, 5, 7, 10, 15, 20, 25, 30, 40, 50, 60, 70, 75, 80, 90, 100, 125, 150, or 177 CpG sites selected from the 177 CpG sites listed in Table 1 in DNA from a biological sample obtained from the subject in order to screen for susceptibility to severe COVID-19 in the subject, wherein said methylation levels are used to provide an indication of the susceptibility of the subject to (or to develop, or to developing, or to the development of) severe COVID-19 (or wherein said methylation levels are used to determine whether or not the subject will develop severe COVID-19, or to determine the likelihood (or probability) that the subject will develop severe COVID-19).
Alternatively viewed, the method of the invention may be viewed as a method of predicting (or prognosing) the course of COVID-19 in a subject, or a method of obtaining an indication of the susceptibility of a subject to (or to develop, or to developing, or to the development of) severe COVID-19, or a method of obtaining clinically relevant information about a subject, the method comprising using the methylation levels of at least or at most 1, 2, 3, 4, 5, 7, 10, 15, 20, 25, 30, 40, 50, 60, 70, 75, 80, 90, 100, 125, 150, or 177 CpG sites selected from the 177 CpG sites listed in Table 1 in DNA from a biological sample obtained from the subject in order to screen for susceptibility of the subject to severe COVID-19, wherein said methylation levels are used to provide an indication of the susceptibility of the subject to (or to develop, or to developing, or to the development of) severe COVID-19.
In embodiments, the method comprises using the methylation levels of at least or at most the CpG sites referred to in Table 1 as CpG site numbers 1 to 50, 1 to 49, 1 to 48, 1 to 47, 1 to 46, 1 to 45, 1 to 44, 1 to 43, 1 to 42, 1 to 41, 1 to 40, 1 to 39, 1 to 38, 1 to 37, 1 to 36, 1 to 35, 1 to 34, 1 to 33, 1 to 32, 1 to 31, 1 to 30, 1 to 29, 1 to 28, 1 to 27, 1 to 26, 1 to 25, 1 to 24, 1 to 23, 1 to 22, 1 to 21, 1 to 20, 1 to 19, 1 to 18, 1 to 17, 1 to 16, 1 to 15, 1 to 14, 1 to 13, 1 to 12, 1 to 11, 1 to 10, 1 to 9, 1 to 8, 1 to 7, 1 to 6, 1 to 5, 1 to 4, 1 to 3, 1 to 2 (i.e. at least or at most CpG site numbers 1 and 2 of Table 1 (so 2 sites in total)), or 1 (i.e. at least or at most CpG site number 1 of Table 1 (so 1 site in total)).
In another aspect, the invention provides a method of screening for (or diagnosing) susceptibility (or the presence or absence of susceptibility) of a subject to (or to develop, or to developing, or to the development of) severe COVID-19 (or a method of determining whether or not a subject will develop severe COVID-19, or a method of determining the likelihood (or probability) that a subject will develop severe COVID-19), the method comprising using the methylation levels of at least or at most 1, 2, 3, 4, 5, 7, 10, 15, 20, 25, 30, 40, 50, 60, 70, 75, 80, 90, or 91 CpG sites selected from the 91 CpG sites listed in Table 6 in DNA from a biological sample obtained from the subject in order to screen for susceptibility to severe COVID-19 in the subject, wherein said methylation levels are used to provide an indication of the susceptibility of the subject to (or to develop, or to developing, or to the development of) severe COVID-19 (or wherein said methylation levels are used to determine whether or not the subject will develop severe COVID-19, or to determine the likelihood (or probability) that the subject will develop severe COVID-19).
Alternatively viewed, the method of the invention may be viewed as a method of predicting (or prognosing) the course of COVID-19 in a subject, or a method of obtaining an indication of the susceptibility of a subject to (or to develop, or to developing, or to the development of) severe COVID-19, or a method of obtaining clinically relevant information about a subject, the method comprising using the methylation levels of at least or at most 1, 2, 3, 4, 5, 7, 10, 15, 20, 25, 30, 40, 50, 60, 70, 75, 80, 90, or 91 CpG sites selected from the 91 CpG sites listed in Table 6 in DNA from a biological sample obtained from the subject in order to screen for susceptibility of the subject to severe COVID-19, wherein said methylation levels are used to provide an indication of the susceptibility of the subject to (or to develop, or to developing, or to the development of) severe COVID-19.
In another aspect, the invention provides a method of screening for (or diagnosing) susceptibility (or the presence or absence of susceptibility) of a subject to (or to develop, or to developing, or to the development of) severe COVID-19 (or a method of determining whether or not a subject will develop severe COVID-19, or a method of determining the likelihood (or probability) that a subject will develop severe COVID-19), the method comprising using the methylation levels of at least or at most 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 12, 15, 20, 25, 30, 40, 50, 60, 70, 75, 80, 90, or 97 CpG sites selected from the 97 CpG sites listed in Table 3 in DNA from a biological sample obtained from the subject in order to screen for susceptibility to severe COVID-19 in the subject, wherein said methylation levels are used to provide an indication of the susceptibility of the subject to (or to develop, or to developing, or to the development of) severe COVID-19 (or wherein said methylation levels are used to determine whether or not the subject will develop severe COVID-19, or to determine the likelihood (or probability) that the subject will develop severe COVID-19). 
Alternatively viewed, the method of the invention may be viewed as a method of predicting (or prognosing) the course of COVID-19 in a subject, or a method of obtaining an indication of the susceptibility of a subject to (or to develop, or to developing, or to the development of) severe COVID-19, or a method of obtaining clinically relevant information about a subject, the method comprising using the methylation levels of at least or at most 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 12, 15, 20, 25, 30, 40, 50, 60, 70, 75, 80, 90, or 97 CpG sites selected from the 97 CpG sites listed in Table 3 in DNA from a biological sample obtained from the subject in order to screen for susceptibility of the subject to severe COVID-19, wherein said methylation levels are used to provide an indication of the susceptibility of the subject to (or to develop, or to developing, or to the development of) severe COVID-19.
In embodiments, the method comprises using the methylation levels of at least or at most the CpG sites referred to in Table 3 as CpG site numbers 1 to 50, 1 to 49, 1 to 48, 1 to 47, 1 to 46, 1 to 45, 1 to 44, 1 to 43, 1 to 42, 1 to 41, 1 to 40, 1 to 39, 1 to 38, 1 to 37, 1 to 36, 1 to 35, 1 to 34, 1 to 33, 1 to 32, 1 to 31, 1 to 30, 1 to 29, 1 to 28, 1 to 27, 1 to 26, 1 to 25, 1 to 24, 1 to 23, 1 to 22, 1 to 21 , 1 to 20, 1 to 19, 1 to 18, 1 to 17, 1 to 16, 1 to 15, 1 to 14, 1 to 13, 1 to 12, 1 to 11 , 1 to 10, 1 to 9, 1 to 8, 1 to 7, 1 to 6, 1 to 5, 1 to 4, 1 to 3, 1 to 2 (i.e. at least or at most CpG site numbers 1 and 2 of Table 3 (so 2 sites in total)), or 1 (i.e. at least or at most CpG site number 1 of Table 3 (so 1 site in total)).
In another aspect, the invention provides a method of screening for (or diagnosing) susceptibility (or the presence or absence of susceptibility) of a subject to (or to develop, or to developing, or to the development of) severe COVID-19 (or a method of determining whether or not a subject will develop severe COVID-19, or a method of determining the likelihood (or probability) that a subject will develop severe COVID-19), the method comprising using the methylation levels of at least or at most 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 12, 15, 20, 25, 30, or 34 CpG sites selected from the 34 CpG sites listed in Table 7 in DNA from a biological sample obtained from the subject in order to screen for susceptibility to severe COVID-19 in the subject, wherein said methylation levels are used to provide an indication of the susceptibility of the subject to (or to develop, or to developing, or to the development of) severe COVID-19 (or wherein said methylation levels are used to determine whether or not the subject will develop severe COVID-19, or to determine the likelihood (or probability) that the subject will develop severe COVID-19).
Alternatively viewed, the method of the invention may be viewed as a method of predicting (or prognosing) the course of COVID-19 in a subject, or a method of obtaining an indication of the susceptibility of a subject to (or to develop, or to developing, or to the development of) severe COVID-19, or a method of obtaining clinically relevant information about a subject, the method comprising using the methylation levels of at least or at most 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 12, 15, 20, 25, 30, or 34 CpG sites selected from the 34 CpG sites listed in Table 7 in DNA from a biological sample obtained from the subject in order to screen for susceptibility of the subject to severe COVID-19, wherein said methylation levels are used to provide an indication of the susceptibility of the subject to (or to develop, or to developing, or to the development of) severe COVID-19.
In embodiments, the method comprises using the methylation levels of at least or at most the CpG sites referred to in Table 7 as CpG site numbers 1 to 34, 1 to 33, 1 to 32, 1 to 31, 1 to 30, 1 to 29, 1 to 28, 1 to 27, 1 to 26, 1 to 25, 1 to 24, 1 to 23, 1 to 22, 1 to 21, 1 to 20, 1 to 19, 1 to 18, 1 to 17, 1 to 16, 1 to 15, 1 to 14, 1 to 13, 1 to 12, 1 to 11, 1 to 10, 1 to 9, 1 to 8, 1 to 7, 1 to 6, 1 to 5, 1 to 4, 1 to 3, 1 to 2 (i.e. at least or at most CpG site numbers 1 and 2 of Table 7 (so 2 sites in total)), or 1 (i.e. at least or at most CpG site number 1 of Table 7 (so 1 site in total)).
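The nested selections listed above ("1 to 34" down to "1") can be sketched as a simple prefix selection. This is a minimal illustration only: the function name `leading_cpg_sites` and the placeholder labels are hypothetical, and the real identifiers are those given in Table 7.

```python
def leading_cpg_sites(ranked_sites, k):
    """Return the sites numbered 1 to k from a table ordered by site number."""
    if not 1 <= k <= len(ranked_sites):
        raise ValueError("k must be between 1 and the number of sites in the table")
    return ranked_sites[:k]


# Placeholder labels standing in for the 34 entries of Table 7 (not real identifiers).
table_7_sites = [f"site_{n}" for n in range(1, 35)]

# The "1 to 2" embodiment: CpG site numbers 1 and 2, so 2 sites in total.
print(leading_cpg_sites(table_7_sites, 2))
```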
As used herein, where reference is made to “using” or “measuring” methylation levels, the acts of “observing”, “acquiring”, “collecting”, “obtaining”, “determining”, “detecting” and/or “assessing” said methylation levels are contemplated alternatively or in addition. All the terms quoted in this paragraph may be used interchangeably if appropriate.
Thus for example, in embodiments, the method of the invention comprises a step of obtaining (or acquiring or collecting) the methylation levels before the step of using the methylation levels. The obtaining (or acquiring or collecting) of the methylation levels may include the measuring of the methylation levels, however this is not essential. For example, the obtaining (or acquiring or collecting) of the methylation levels may comprise or consist of receiving methylation levels (e.g. methylation levels that have been previously measured), for example from an external source or a third party.
As used herein, where it is recited that methylation levels are "indicative of (the presence or absence of) susceptibility to severe COVID-19 in the subject” or “used to provide an indication of (the presence or absence of) susceptibility to severe COVID-19 in the subject” or other similar terms, it is meant that there is a correlation (e.g. a positive or negative correlation) between the respective methylation level and susceptibility to severe COVID-19 in the subject.
Where the term “measuring” in respect of methylation levels is recited, the term “selectively measuring” is also encompassed thereby.
The term “COVID-19” as used herein stands for “Coronavirus disease 2019” and refers to the disease caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). In most countries, the determination of COVID-19 severity follows guidelines that are similar to those provided by the NIH to physicians in the US. Any given patient with a positive PCR test for SARS-CoV-2 is placed at the appropriate level on the following clinical spectrum:
1. Asymptomatic or presymptomatic infection
2. Mild Illness (managed in an ambulatory setting)
3. Moderate illness: evidence of lower respiratory infection and SpO2>=94%. Close monitoring, often at home.
4. Severe illness: Same as moderate illness plus SpO2<94%. Management is hospitalization with O2 therapy. May deteriorate rapidly.
5. Critical illness: acute respiratory distress syndrome, septic shock: ICU management.
For the purposes of this invention, the term “severe COVID-19” refers to a degree of COVID-19 illness wherein the subject is hospitalized (or wherein the subject should be hospitalized, or wherein hospitalization of the subject is warranted). This is in contrast to “non-severe COVID”, which refers to a degree of COVID-19 illness wherein the subject is not hospitalized (or wherein the subject should not be hospitalized, or wherein hospitalization of the subject is not warranted). Alternatively viewed, “severe COVID-19” refers to a degree of COVID-19 which is characterized by a score of 4 or 5 on the NIH clinical spectrum as recited above. This is in contrast to “non-severe COVID”, which refers to a degree of COVID-19 illness which is characterized by a score of 1, 2 or 3 on the NIH clinical spectrum as recited above. Thus, the methods, etc., of the invention as described herein, can be used to classify subjects into those with susceptibility to severe as opposed to non-severe COVID-19, or to classify subjects into those which will (or are likely to) develop severe COVID-19 as opposed to developing (or maintaining) non-severe COVID-19.
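The score-based definition above can be sketched as a small helper. This is a minimal illustration, assuming the clinical spectrum level is available as an integer from 1 to 5; the function name `is_severe_covid19` is hypothetical and not part of the specification.

```python
def is_severe_covid19(nih_level: int) -> bool:
    """Return True for levels 4 or 5 on the clinical spectrum recited above.

    Levels 1, 2 and 3 correspond to non-severe COVID-19.
    """
    if nih_level not in (1, 2, 3, 4, 5):
        raise ValueError("clinical spectrum level must be an integer from 1 to 5")
    return nih_level >= 4


for level in range(1, 6):
    label = "severe" if is_severe_covid19(level) else "non-severe"
    print(level, label)
```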
By “susceptible to” it is meant “at risk” or “likely to suffer from”. Thus, where reference is made herein to “susceptibility to severe COVID-19”, the terms “risk of severe COVID-19 (or of developing severe COVID-19)” or “high or higher or increased risk of severe COVID-19 (or of developing severe COVID-19)” are also contemplated alternatively or in addition.
Where it is described herein the concept of “susceptibility to severe COVID-19”, it is of course meant to refer to the susceptibility of the subject in the absence of pre-emptive medical intervention. Similarly, where it is described herein the concept of the likelihood of development of severe COVID-19 in a subject, it is meant to refer to the likelihood of development of severe COVID-19 in the subject in the absence of pre-emptive medical intervention. Therefore, in another aspect, the present invention provides a method for determining the likelihood (or probability) that severe COVID-19 will develop in a subject. In such methods the methylation level of one or more of the CpG sites as described elsewhere herein in the sample, or the overall probability value determined therefrom, shows an association with the probability of severe COVID-19 that is predicted to develop in the subject. Thus, the methylation level of one or more of the CpG sites as described elsewhere herein, or the overall probability value determined therefrom, is indicative of the probability that severe COVID-19 will develop in the subject (or of the probability of severe COVID-19 progression). In some embodiments, the more altered (more increased or more decreased depending on the CpG site in question) the methylation level (or score) of one or more (preferably two or more) of the CpG sites in comparison to a control level, or the higher the overall probability value determined therefrom, the greater the probability that the subject will develop severe COVID-19 (or the greater the probability of severe COVID-19 progression). In some embodiments the methods of the invention can thus be used in the selection of patients for therapy or for triaging. Hence, the method of the invention may be used to enable those subjects having greater probability of developing severe COVID-19 to be prioritised for treatment over those with a lower probability of developing severe COVID-19.
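The triage use described above amounts to ordering subjects by their predicted probability. The following sketch assumes that a probability of developing severe COVID-19 has already been obtained for each subject; the function name, subject labels and values are made up for illustration.

```python
def prioritise_for_treatment(probabilities: dict) -> list:
    """Order subject identifiers from highest to lowest predicted probability."""
    for subject, p in probabilities.items():
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"probability for {subject!r} must lie in [0, 1]")
    return sorted(probabilities, key=probabilities.get, reverse=True)


# Hypothetical cohort: subject label -> predicted probability of severe COVID-19.
cohort = {"subject_A": 0.12, "subject_B": 0.81, "subject_C": 0.47}
print(prioritise_for_treatment(cohort))
```

Subjects at the front of the returned list would be the first candidates for treatment under this simple ordering.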
The phrase "selectively measuring" as used herein refers to methods wherein the methylation levels of only a finite number of CpG sites are measured rather than measuring the methylation levels of all or essentially all potential CpG sites in a genome. For example, in some aspects, "selectively measuring" methylation levels can refer to measuring the methylation levels of no more than 100000, 90000, 80000, 70000, 60000, 50000, 40000, 30000, 20000, 10000, 9000, 8000, 7000, 6000, 5000, 4000, 3000, 2000, 1000, 900, 800, 700, 600, 500, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 91, 95, 97, 100, 110, 120, 130, 140, 150, 160, 170, 177, 180, 190, 200, 210, 220, 230, 240, 250, 251, 300, 350, 400, or 433 different CpG sites.
Similarly, where the term “using” in respect of methylation levels is recited, the term “selectively using” is also encompassed thereby.
The phrase "selectively using" as used herein refers to methods wherein the methylation levels of only a finite number of CpG sites are used rather than using the methylation levels of all or essentially all potential CpG sites in a genome. For example, in some aspects, "selectively using" methylation levels can refer to using the methylation levels of no more than 100000, 90000, 80000, 70000, 60000, 50000, 40000, 30000, 20000, 10000, 9000, 8000, 7000, 6000, 5000, 4000, 3000, 2000, 1000, 900, 800, 700, 600, 500, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 91, 95, 97, 100, 110, 120, 130, 140, 150, 160, 170, 177, 180, 190, 200, 210, 220, 230, 240, 250, 251, 300, 350, 400, or 433 different CpG sites.
Where the term “detecting” in respect of methylation levels is recited, the term “selectively detecting” is also encompassed thereby.
The phrase "selectively detecting" as used herein refers to methods wherein the methylation levels of only a finite number of CpG sites are measured rather than measuring the methylation levels of all or essentially all potential CpG sites in a genome. For example, in some aspects, "selectively detecting" methylation levels can refer to detecting the methylation levels of no more than 100000, 90000, 80000, 70000, 60000, 50000, 40000, 30000, 20000, 10000, 9000, 8000, 7000, 6000, 5000, 4000, 3000, 2000, 1000, 900, 800, 700, 600, 500, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 91, 95, 97, 100, 110, 120, 130, 140, 150, 160, 170, 177, 180, 190, 200, 210, 220, 230, 240, 250, 251, 300, 350, 400, or 433 different CpG sites.
As discussed herein, methods of the present invention may comprise using, determining or measuring, etc., the methylation levels of one or more CpG sites “selected from the list of” certain specific CpG sites set forth herein. For the avoidance of doubt, in some embodiments in which the methylation levels of one or more of the specific CpG sites “selected from the list” set forth herein are used, measured or determined, etc., the methylation levels of one or more other (or distinct or alternative) CpG sites, or of one or more other (or distinct or alternative) CpG sites belonging to one or more other genes, and/or one or more other biomarkers, may additionally be used, measured or determined. Thus, “selected from the list of” may be an “open” term. In other embodiments, the methylation levels of only one or more of the specific CpG sites discussed herein is used, measured or determined, etc. (e.g. the methylation levels of other CpG sites or other biomarkers are not used, measured or determined).
As used herein, the term "CpG site" is given its art recognised meaning and refers to the location in a nucleic acid molecule, or sequence representation of the molecule, where a cytosine nucleotide and guanine nucleotide occur, the 3' oxygen of the cytosine nucleotide being covalently attached to the 5' phosphate of the guanine nucleotide. The nucleic acid is typically DNA. The cytosine nucleotide can optionally be methylated at position 5 of the pyrimidine ring. Such CpG sites can be referred to as methylated CpG sites. Unless otherwise stated, nucleic acid sequences recited herein are recited in the 5’ to 3’ direction.
As used herein, the term “methylation level” includes the average methylation state of a CpG site in a biological sample. Methylation levels of each CpG site may be quantified by methods known in the art, for example in the form of a beta value or M value. When measuring DNA methylation using microarray technology (such as HumanMethylation450 BeadChip array, which covers approximately 450,000 CpG sites), the beta value is the ratio of the methylated probe intensity and the overall intensity (sum of methylated and unmethylated probe intensities). The beta-value is thus generally and conveniently a number between 0 and 1, or 0 and 100%. A value of zero indicates that all copies of the CpG site in the sample were completely unmethylated (no methylated molecules were measured) and a value of one (or 100%) indicates that every copy of the CpG site in the sample was methylated.
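The beta value defined above can be written as a short calculation. This sketch follows the ratio given in the text (methylated intensity over the sum of methylated and unmethylated intensities). The conversion to an M value uses the commonly used relation M = log2(beta / (1 - beta)), which is an assumption here since the text does not define the M value; the function names and the intensities are illustrative only.

```python
import math


def beta_value(methylated: float, unmethylated: float) -> float:
    """Ratio of methylated probe intensity to overall probe intensity."""
    total = methylated + unmethylated
    if methylated < 0 or unmethylated < 0 or total == 0:
        raise ValueError("intensities must be non-negative and not both zero")
    return methylated / total


def m_value(beta: float) -> float:
    """Assumed conversion: log2 ratio of methylated to unmethylated fraction."""
    if not 0.0 < beta < 1.0:
        raise ValueError("M value is undefined for a beta value of exactly 0 or 1")
    return math.log2(beta / (1.0 - beta))


# Made-up intensities: three quarters of the signal comes from the methylated probe.
beta = beta_value(methylated=3000.0, unmethylated=1000.0)
print(beta)           # fraction between 0 and 1
print(m_value(beta))  # positive when the site is more than 50% methylated
```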
In embodiments, the methylation levels referred to herein are methylation states. The “methylation state” of a particular CpG site in a particular cell is either methylated or nonmethylated.
In general, the methods of the invention are carried out in vitro or ex vivo (unless the context requires otherwise, e.g. administration steps).
Throughout the aspects and embodiments provided herein, it will be appreciated that the methylation levels of any number of the 433 CpG sites listed in Table 4 could be used, i.e. the methylation levels of at least or at most or exactly 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127, 128, 129, 130, 131, 132, 133, 134, 135, 136, 137, 138, 139, 140, 141, 142, 143, 144, 145, 146, 147, 148, 149, 150, 151, 152, 153, 154, 155, 156, 157, 158, 159, 160, 161, 162, 163, 164, 165, 166, 167, 168, 169, 170, 171, 172, 173, 174, 175, 176, 177, 178, 179, 180, 181, 182, 183, 184, 185, 186, 187, 188, 189, 190, 191, 192, 193, 194, 195, 196, 197, 198, 199, 200, 201, 202, 203, 204, 205, 206, 207, 208, 209, 210, 211, 212, 213, 214, 215, 216, 217, 218, 219, 220, 221, 222, 223, 224, 225, 226, 227, 228, 229, 230, 231, 232, 233, 234, 235, 236, 237, 238, 239, 240, 241, 242, 243, 244, 245, 246, 247, 248, 249, 250, 251, 252, 253, 254, 255, 256, 257, 258, 259, 260, 261, 262, 263, 264, 265, 266, 267, 268, 269, 270, 271, 272, 273, 274, 275, 276, 277, 278, 279, 280, 281, 282, 283, 284, 285, 286, 287, 288, 289, 290, 291, 292, 293, 294, 295, 296, 297, 298, 299, 300, 301, 302, 303, 304, 305, 306, 307, 308, 309, 310, 311, 312, 313, 314, 315, 316, 317, 318, 319, 320, 321, 322, 323, 324, 325, 326, 327, 328, 329, 330, 331, 332, 333, 334, 335, 336, 337, 338, 339, 340, 341, 342, 343, 344, 345, 346, 347, 348, 349, 350, 351, 352, 353, 354, 355, 356, 357, 358, 359, 360, 361, 362, 363, 364, 365, 366, 367, 368, 369, 370, 371, 372, 373, 374, 375, 376, 377, 378, 379, 380, 381, 382, 383, 384, 385, 386, 387, 388, 389, 390, 391, 392, 393, 394, 395, 396, 397, 398, 399, 400, 401, 402, 403, 404, 405, 406, 407, 408, 409, 410, 411, 412, 413, 414, 415, 416, 417, 418, 419, 420, 421, 422, 423, 424, 425, 426, 427, 428, 429, 430, 431, 432, or 433 of the CpG sites recited in Table 4 could be used.
Throughout the aspects and embodiments provided herein, it will be appreciated that any particular selection of the 433 CpG sites listed in Table 4 could be used, i.e. the methylation level of:
CpG site number 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127, 128, 129, 130, 131, 132, 133, 134, 135, 136, 137, 138, 139, 140, 141, 142, 143, 144, 145, 146, 147, 148, 149, 150, 151, 152, 153, 154, 155, 156, 157, 158, 159, 160, 161, 162, 163, 164, 165, 166, 167, 168, 169, 170, 171, 172, 173, 174, 175, 176, 177, 178, 179, 180, 181, 182, 183, 184, 185, 186, 187, 188, 189, 190, 191, 192, 193, 194, 195, 196, 197, 198, 199, 200, 201, 202, 203, 204, 205, 206, 207, 208, 209, 210, 211, 212, 213, 214, 215, 216, 217, 218, 219, 220, 221, 222, 223, 224, 225, 226, 227, 228, 229, 230, 231, 232, 233, 234, 235, 236, 237, 238, 239, 240, 241, 242, 243, 244, 245, 246, 247, 248, 249, 250, 251, 252, 253, 254, 255, 256, 257, 258, 259, 260, 261, 262, 263, 264, 265, 266, 267, 268, 269, 270, 271, 272, 273, 274, 275, 276, 277, 278, 279, 280, 281, 282, 283, 284, 285, 286, 287, 288, 289, 290, 291, 292, 293, 294, 295, 296, 297, 298, 299, 300, 301, 302, 303, 304, 305, 306, 307, 308, 309, 310, 311, 312, 313, 314, 315, 316, 317, 318, 319, 320, 321, 322, 323, 324, 325, 326, 327, 328, 329, 330, 331, 332, 333, 334, 335, 336, 337, 338, 339, 340, 341, 342, 343, 344, 345, 346, 347, 348, 349, 350, 351, 352, 353, 354, 355, 356, 357, 358, 359, 360, 361, 362, 363, 364, 365, 366, 367, 368, 369, 370, 371, 372, 373, 374, 375, 376, 377, 378, 379, 380, 381, 382, 383, 384, 385, 386, 387, 388, 389, 390, 391, 392, 393, 394, 395, 396, 397, 398, 399, 400, 401, 402, 403, 404, 405, 406, 407, 408, 409, 410, 411, 412, 413, 414, 415, 416, 417, 418, 419, 420, 421, 422, 423, 424, 425, 426, 427, 428, 429, 430, 431, 432, and/or 433 as named in Table 4 could be used.
Throughout the aspects and embodiments provided herein, it will be appreciated that the methylation levels of any number of the 337 CpG sites listed in Table 5 could be used, i.e. the methylation levels of at least or at most or exactly:
1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127, 128, 129, 130, 131, 132, 133, 134, 135, 136, 137, 138, 139, 140, 141, 142, 143, 144, 145, 146, 147, 148, 149, 150, 151, 152, 153, 154, 155, 156, 157, 158, 159, 160, 161, 162, 163, 164, 165, 166, 167, 168, 169, 170, 171, 172, 173, 174, 175, 176, 177, 178, 179, 180, 181, 182, 183, 184, 185, 186, 187, 188, 189, 190, 191, 192, 193, 194, 195, 196, 197, 198, 199, 200, 201, 202, 203, 204, 205, 206, 207, 208, 209, 210, 211, 212, 213, 214, 215, 216, 217, 218, 219, 220, 221, 222, 223, 224, 225, 226, 227, 228, 229, 230, 231, 232, 233, 234, 235, 236, 237, 238, 239, 240, 241, 242, 243, 244, 245, 246, 247, 248, 249, 250, 251, 252, 253, 254, 255, 256, 257, 258, 259, 260, 261, 262, 263, 264, 265, 266, 267, 268, 269, 270, 271, 272, 273, 274, 275, 276, 277, 278, 279, 280, 281, 282, 283, 284, 285, 286, 287, 288, 289, 290, 291, 292, 293, 294, 295, 296, 297, 298, 299, 300, 301, 302, 303, 304, 305, 306, 307, 308, 309, 310, 311, 312, 313, 314, 315, 316, 317, 318, 319, 320, 321, 322, 323, 324, 325, 326, 327, 328, 329, 330, 331, 332, 333, 334, 335, 336, or 337 of the 337 CpG sites recited in Table 5 could be used.
Throughout the aspects and embodiments provided herein, it will be appreciated that any particular selection of the 337 CpG sites listed in Table 5 could be used, i.e. the methylation level of:
CpG site number 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127, 128, 129, 130, 131, 132, 133, 134, 135, 136, 137, 138, 139, 140, 141, 142, 143, 144, 145, 146, 147, 148, 149, 150, 151, 152, 153, 154, 155, 156, 157, 158, 159, 160, 161, 162, 163, 164, 165, 166, 167, 168, 169, 170, 171, 172, 173, 174, 175, 176, 177, 178, 179, 180, 181, 182, 183, 184, 185, 186, 187, 188, 189, 190, 191, 192, 193, 194, 195, 196, 197, 198, 199, 200, 201, 202, 203, 204, 205, 206, 207, 208, 209, 210, 211, 212, 213, 214, 215, 216, 217, 218, 219, 220, 221, 222, 223, 224, 225, 226, 227, 228, 229, 230, 231, 232, 233, 234, 235, 236, 237, 238, 239, 240, 241, 242, 243, 244, 245, 246, 247, 248, 249, 250, 251, 252, 253, 254, 255, 256, 257, 258, 259, 260, 261, 262, 263, 264, 265, 266, 267, 268, 269, 270, 271, 272, 273, 274, 275, 276, 277, 278, 279, 280, 281, 282, 283, 284, 285, 286, 287, 288, 289, 290, 291, 292, 293, 294, 295, 296, 297, 298, 299, 300, 301, 302, 303, 304, 305, 306, 307, 308, 309, 310, 311, 312, 313, 314, 315, 316, 317, 318, 319, 320, 321, 322, 323, 324, 325, 326, 327, 328, 329, 330, 331, 332, 333, 334, 335, 336, and/or 337 as named in Table 5 could be used.
Throughout the aspects and embodiments provided herein, it will be appreciated that the methylation levels of any number of the 251 CpG sites listed in Table 2 could be used, i.e. the methylation levels of at least or at most or exactly 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127, 128, 129, 130, 131, 132, 133, 134, 135, 136, 137, 138, 139, 140, 141, 142, 143, 144, 145, 146, 147, 148, 149, 150, 151, 152, 153, 154, 155, 156, 157, 158, 159, 160, 161, 162, 163, 164, 165, 166, 167, 168, 169, 170, 171, 172, 173, 174, 175, 176, 177, 178, 179, 180, 181, 182, 183, 184, 185, 186, 187, 188, 189, 190, 191, 192, 193, 194, 195, 196, 197, 198, 199, 200, 201, 202, 203, 204, 205, 206, 207, 208, 209, 210, 211, 212, 213, 214, 215, 216, 217, 218, 219, 220, 221, 222, 223, 224, 225, 226, 227, 228, 229, 230, 231, 232, 233, 234, 235, 236, 237, 238, 239, 240, 241, 242, 243, 244, 245, 246, 247, 248, 249, 250, or 251 of the CpG sites recited in Table 2 could be used.
Throughout the aspects and embodiments provided herein, it will be appreciated that any particular selection of the 251 CpG sites listed in Table 2 could be used, i.e. the methylation level of CpG site number 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127, 128, 129, 130, 131, 132, 133, 134, 135, 136, 137, 138, 139, 140, 141, 142, 143, 144, 145, 146, 147, 148, 149, 150, 151, 152, 153, 154, 155, 156, 157, 158, 159, 160, 161, 162, 163, 164, 165, 166, 167, 168, 169, 170, 171, 172, 173, 174, 175, 176, 177, 178, 179, 180, 181, 182, 183, 184, 185, 186, 187, 188, 189, 190, 191, 192, 193, 194, 195, 196, 197, 198, 199, 200, 201, 202, 203, 204, 205, 206, 207, 208, 209, 210, 211, 212, 213, 214, 215, 216, 217, 218, 219, 220, 221, 222, 223, 224, 225, 226, 227, 228, 229, 230, 231, 232, 233, 234, 235, 236, 237, 238, 239, 240, 241, 242, 243, 244, 245, 246, 247, 248, 249, 250, and/or 251 as named in Table 2 could be used.
Throughout the aspects and embodiments provided herein, it will be appreciated that the methylation levels of any number of the 177 CpG sites listed in Table 1 could be used, i.e. the methylation levels of at least or at most or exactly 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127, 128, 129, 130, 131, 132, 133, 134, 135, 136, 137, 138, 139, 140, 141, 142, 143, 144, 145, 146, 147, 148, 149, 150, 151, 152, 153, 154, 155, 156, 157, 158, 159, 160, 161, 162, 163, 164, 165, 166, 167, 168, 169, 170, 171, 172, 173, 174, 175, 176, or 177 of the CpG sites recited in Table 1 could be used.
Throughout the aspects and embodiments provided herein, it will be appreciated that any particular selection of the 177 CpG sites listed in Table 1 could be used, i.e. the methylation level of CpG site number 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127, 128, 129, 130, 131, 132, 133, 134, 135, 136, 137, 138, 139, 140, 141, 142, 143, 144, 145, 146, 147, 148, 149, 150, 151, 152, 153, 154, 155, 156, 157, 158, 159, 160, 161, 162, 163, 164, 165, 166, 167, 168, 169, 170, 171, 172, 173, 174, 175, 176, and/or 177 as named in Table 1 could be used.
Throughout the aspects and embodiments provided herein, it will be appreciated that the methylation levels of any number of the 91 CpG sites listed in Table 6 could be used, i.e. the methylation levels of at least or at most or exactly 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, or 91 of the CpG sites recited in Table 6 could be used.
Throughout the aspects and embodiments provided herein, it will be appreciated that any particular selection of the 91 CpG sites listed in Table 6 could be used, i.e. the methylation level of CpG site number 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, and/or 91 as named in Table 6 could be used.
Throughout the aspects and embodiments provided herein, it will be appreciated that the methylation levels of any number of the 97 CpG sites listed in Table 3 could be used, i.e. the methylation levels of at least or at most or exactly 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, or 97 of the CpG sites recited in Table 3 could be used.
Throughout the aspects and embodiments provided herein, it will be appreciated that any particular selection of the 97 CpG sites listed in Table 3 could be used, i.e. the methylation level of CpG site number 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, and/or 97 as named in Table 3 could be used.
Throughout the aspects and embodiments provided herein, it will be appreciated that the methylation levels of any number of the 34 CpG sites listed in Table 7 could be used, i.e. the methylation levels of at least or at most or exactly 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, or 34 of the CpG sites recited in Table 7 could be used.
Throughout the aspects and embodiments provided herein, it will be appreciated that any particular selection of the 34 CpG sites listed in Table 7 could be used, i.e. the methylation level of CpG site number 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, and/or 34 as named in Table 7 could be used.
In preferred embodiments of all of the aspects and embodiments herein, any of the following ranges of numbers of CpG sites could be used from Table 1 or Table 2:
2 to 125, 3 to 125, 4 to 125, 5 to 125, 7 to 125, 10 to 125, 15 to 125, 20 to 125, 25 to 125, 30 to 125, 40 to 125, 50 to 125, 75 to 125, or 100 to 125 of the CpG sites listed in Table 1 or Table 2.
Alternative ranges of numbers of sites which could be used from Table 1 or Table 2 include:
2 to 100, 3 to 100, 4 to 100, 5 to 100, 7 to 100, 10 to 100, 15 to 100, 20 to 100, 25 to 100, 30 to 100, 40 to 100, 50 to 100, or 75 to 100 of the CpG sites listed in Table 1 or Table 2.
Alternative ranges of numbers of sites which could be used from Table 1 or Table 2 include:
2 to 75, 3 to 75, 4 to 75, 5 to 75, 7 to 75, 10 to 75, 15 to 75, 20 to 75, 25 to 75, 30 to 75, 40 to 75, or 50 to 75 of the CpG sites listed in Table 1 or Table 2.
Alternative ranges of numbers of sites which could be used from Table 1 or Table 2 include:
2 to 50, 3 to 50, 4 to 50, 5 to 50, 7 to 50, 10 to 50, 15 to 50, 20 to 50, 25 to 50, 30 to 50, or 40 to 50 of the CpG sites listed in Table 1 or Table 2.
In preferred embodiments of all of the aspects and embodiments herein, any of the following ranges of numbers of sites could be used from Table 1, Table 2 or Table 3:
2 to 15, 2 to 14, 2 to 13, 2 to 12, 2 to 11, 2 to 10, 2 to 9, 2 to 8, 2 to 7, 2 to 6, 2 to 5,
3 to 15, 4 to 15, 5 to 15, 6 to 15, 7 to 15, 8 to 15, 9 to 15, 10 to 15,
2 to 10, 3 to 10, 4 to 10, or 5 to 10 of the CpG sites listed in Table 1, Table 2 or Table 3.
In preferred embodiments of all of the aspects and embodiments herein, it will be appreciated that any particular selection of CpG sites are not used or included, i.e. the methylation level of:
CpG site number 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127, 128, 129, 130, 131, 132, 133, 134, 135, 136, 137, 138, 139, 140, 141, 142, 143, 144, 145, 146, 147, 148, 149, 150, 151, 152, 153, 154, 155, 156, 157, 158, 159, 160, 161, 162, 163, 164, 165, 166, 167, 168, 169, 170, 171, 172, 173, 174, 175, 176, 177, 178, 179, 180, 181, 182, 183, 184, 185, 186, 187, 188, 189, 190, 191, 192, 193, 194, 195, 196, 197, 198, 199, 200, 201, 202, 203, 204, 205, 206, 207, 208, 209, 210, 211, 212, 213, 214, 215, 216, 217, 218, 219, 220, 221, 222, 223, 224, 225, 226, 227, 228, 229, 230, 231, 232, 233, 234, 235, 236, 237, 238, 239, 240, 241, 242, 243, 244, 245, 246, 247, 248, 249, 250, 251, 252, 253, 254, 255, 256, 257, 258, 259, 260, 261, 262, 263, 264, 265, 266, 267, 268, 269, 270, 271, 272, 273, 274, 275, 276, 277, 278, 279, 280, 281, 282, 283, 284, 285, 286, 287, 288, 289, 290, 291, 292, 293, 294, 295, 296, 297, 298, 299, 300, 301, 302, 303, 304, 305, 306, 307, 308, 309, 310, 311, 312, 313, 314, 315, 316, 317, 318, 319, 320, 321, 322, 323, 324, 325, 326, 327, 328, 329, 330, 331, 332, 333, 334, 335, 336, 337, 338, 339, 340, 341, 342, 343, 344, 345, 346, 347, 348, 349, 350, 351, 352, 353, 354, 355, 356, 357, 358, 359, 360, 361, 362, 363, 364, 365, 366, 367, 368, 369, 370, 371, 372, 373, 374, 375, 376, 377, 378, 379, 380, 381, 382, 383, 384, 385, 386, 387, 388, 389, 390, 391, 392, 393, 394, 395, 396, 397, 398, 399, 400, 401, 402, 403, 404, 405, 406, 407, 408, 409, 410, 411, 412, 413, 414, 415, 416, 417, 418, 419, 420, 421, 422, 423, 424, 425, 426, 427, 428, 429, 430, 431, 432, and/or 433 as named in Table 4.
In a preferred embodiment of all of the aspects and embodiments herein, CpG site cg13452062 is not used or included.
Where reference is made to the use of methylation levels of CpG sites, it can also or alternatively be phrased or worded that the CpG sites themselves are used. A diagnosis or diagnosing step (e.g. a step of diagnosing susceptibility to severe COVID-19 or the presence or absence of susceptibility to severe COVID-19 in a subject) can alternatively be worded as a classification or classification step (e.g. a step of classifying a subject as having or not having susceptibility to severe COVID-19). The classification or diagnosing can be achieved by assignment of a cutoff value as described elsewhere herein.
The terms “likelihood” and “probability” and “p” can be used interchangeably herein.
The indication of the susceptibility (or presence or absence of susceptibility) to severe COVID-19 in the subject may be provided or derived using machine learning (or a machine learning technique). The indication may be provided using appropriate techniques such as random forest, gradient boosting, a neural network, or linear or logistic regression. The indication may be provided using a combination of appropriate techniques, for example using a logistic regression model followed by a support vector machine (SVM) model, or for example using a random forest classification model followed by a feedforward neural network.
Various scoring methods, scoring systems, markers or formulas can be used that comprise any appropriate combination of the CpG sites or methylation levels of the invention as described herein in order to arrive at an indication, e.g. in the form of a value or score, which can then be used for diagnosis of susceptibility (or of the presence or absence of susceptibility) to severe COVID-19 (this can also be referred to as a COVID-19 severity score or a severity risk score). For example, said methods etc., can be an algorithm that comprises any appropriate combination of the CpG sites or methylation levels as an input, to e.g. perform pattern recognition of the samples, in order to arrive at an indication, e.g. in the form of a value or score, which can then be used for diagnosis of susceptibility to severe COVID-19. Non-limiting examples of such algorithms include machine learning algorithms that implement classification (algorithmic classifiers), such as linear classifiers (e.g. Fisher’s linear discriminant, logistic regression, naive Bayes classifier, perceptron); support vector machines (e.g. least squares support vector machines); quadratic classifiers; kernel estimation (e.g. k-nearest neighbor); boosting (e.g. gradient boosting); decision trees (e.g. random forests); neural networks; and learning vector quantization.
The use of such classifiers, e.g. machine learning, random forest, gradient boosting or logistic regression, would be within the skill of a person skilled in the art. For example, such classifiers can conveniently be trained on methylation levels from a training set of samples and then tested in terms of accuracy (or balanced accuracy) on a test set of samples. The classifier may generate a black-box model that is trained on the most important methylation CpG sites or methylation levels. In embodiments of any of the methods of the invention provided herein, the method comprises calculating a likelihood (or probability) of the subject having susceptibility, etc., to severe COVID-19, for example as a function of said methylation levels.
The likelihood (or probability) can alternatively be referred to as likelihood value (or probability value). The likelihood (or probability) can be a value between 0 and 1. A value of 1 can indicate a 100% likelihood (or probability) that the subject has susceptibility to severe COVID-19, and a value of 0 can indicate a 0% likelihood (or probability) that the subject has susceptibility to severe COVID-19. In preferred embodiments, the methods of the invention comprise calculating the likelihood as a function of a linear combination of said methylation levels, optionally to provide a value for (or representative of or corresponding to) the likelihood of the subject having susceptibility to severe COVID-19.
In embodiments, the linear combination of said methylation levels comprises a weighted sum of said methylation levels, optionally to provide a value for (or representative of or corresponding to) the likelihood of the subject having susceptibility to severe COVID-19.
Alternatively viewed, the weighted sum of methylation levels can be formed by applying a predetermined weight (or coefficient) to each methylation value to provide a set of weighted methylation levels and then summing the weighted methylation levels, optionally to provide a value for (or representative of or corresponding to) the likelihood of the subject having susceptibility to severe COVID-19.
In embodiments, a weight (or coefficient) as described herein is a normalized weight (or normalized coefficient), standardized weight (or standardized coefficient), or standardized logistic regression weight (or standardized logistic regression coefficient).
In embodiments, the method of the invention comprises calculating the likelihood as a logistic function of a linear combination of said methylation levels, optionally to provide a value for (or representative of or corresponding to) the likelihood of the subject having susceptibility to severe COVID-19.
In embodiments, the method of the invention comprises performing a logistic regression method using said methylation levels, e.g. a linear combination of said methylation levels, optionally to provide a value for (or representative of or corresponding to) the likelihood of the subject having susceptibility to severe COVID-19. In embodiments, the method of the invention comprises receiving data representative of said methylation levels, and inputting the data to an algorithm for evaluating said function to determine the likelihood of the subject having susceptibility to severe COVID-19.
In embodiments, the method comprises applying an algorithm (for example a statistical prediction algorithm) to the methylation levels, optionally in order to determine the susceptibility to severe COVID-19 status of the subject (or optionally to provide a value for (or representative of or corresponding to) the likelihood of the subject having susceptibility to severe COVID-19).
In preferred embodiments, applying the algorithm, e.g. the statistical prediction algorithm, can comprise: applying a weight (or coefficient), e.g. a predetermined weight (or coefficient), to each methylation value to provide a set of weighted methylation levels; summing the weighted methylation levels to provide a linear combination of methylation levels in the form of a weighted sum of said methylation levels; and applying a logistic function to the weighted sum, optionally to provide a value for (or representative of or corresponding to) the likelihood of the subject having susceptibility to severe COVID-19; and optionally comparing the likelihood value (or likelihood) with a cutoff value (or cutoff).
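By way of non-limiting illustration only, the steps of the statistical prediction algorithm described above (weighting, summing, applying a logistic function, and comparing with a cutoff) might be sketched as follows. The CpG site identifiers, weights, intercept and methylation values shown are hypothetical placeholders and are not taken from any Table herein; the intercept term and the representation of methylation levels as values between 0 and 1 are assumptions made for the purposes of the sketch.

```python
import math


def severity_likelihood(methylation, weights, intercept=0.0):
    """Return a likelihood between 0 and 1 from methylation levels.

    methylation: dict mapping CpG site id -> methylation level (assumed 0..1)
    weights:     dict mapping CpG site id -> predetermined weight (coefficient)
    intercept:   assumed constant term; not specified in the description
    """
    # Weighted sum (linear combination) of the methylation levels.
    weighted_sum = intercept + sum(
        weights[site] * methylation[site] for site in weights
    )
    # Logistic function maps the weighted sum onto the interval (0, 1).
    return 1.0 / (1.0 + math.exp(-weighted_sum))


def classify(likelihood, cutoff=0.5):
    """True if the likelihood value is above the cutoff value."""
    return likelihood > cutoff


# Hypothetical site identifiers and values, for illustration only.
weights = {"cg_site_a": 2.0, "cg_site_b": -1.5}
sample = {"cg_site_a": 0.80, "cg_site_b": 0.20}

p = severity_likelihood(sample, weights, intercept=-0.5)
print(round(p, 3), classify(p))
```

In this sketch a likelihood value above the cutoff is treated as indicative of susceptibility, consistent with the comparing step described herein; any other cutoff value could be passed to `classify`.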
In preferred embodiments of any of the methods provided herein, the weight (or coefficient), e.g. the predetermined weight (or coefficient), for each methylation value has been calculated using reference methylation levels for each CpG site, wherein the reference methylation levels have been measured (or determined or obtained) from severe COVID-19 subjects (or observations of severe COVID-19 subjects) and from subjects not having severe COVID-19 (or observations of subjects not having severe COVID-19).
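As a purely illustrative sketch of how weights of this kind could be calculated from reference methylation levels, the following fits an unregularized logistic regression by batch gradient descent. The fitting procedure, learning rate, number of iterations and the reference data are all assumptions chosen for illustration; the description does not prescribe a particular fitting method, and in practice an established statistical package would typically be used.

```python
import math


def fit_logistic_weights(samples, labels, learning_rate=0.5, epochs=2000):
    """Fit an intercept and one weight per CpG site by batch gradient descent.

    samples: list of per-subject lists of reference methylation levels
    labels:  1 for severe COVID-19 reference subjects, 0 for subjects
             not having severe COVID-19
    """
    n_subjects = len(samples)
    n_sites = len(samples[0])
    weights = [0.0] * n_sites
    intercept = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_sites
        grad_b = 0.0
        for levels, label in zip(samples, labels):
            z = intercept + sum(w * v for w, v in zip(weights, levels))
            predicted = 1.0 / (1.0 + math.exp(-z))
            error = predicted - label
            for j, value in enumerate(levels):
                grad_w[j] += error * value
            grad_b += error
        # Step against the average gradient of the log-loss.
        weights = [
            w - learning_rate * g / n_subjects for w, g in zip(weights, grad_w)
        ]
        intercept -= learning_rate * grad_b / n_subjects
    return intercept, weights


def predict(levels, intercept, weights):
    z = intercept + sum(w * v for w, v in zip(weights, levels))
    return 1.0 / (1.0 + math.exp(-z))


# Made-up reference data for two hypothetical CpG sites.
reference = [
    [0.90, 0.20], [0.80, 0.30], [0.85, 0.25],  # severe COVID-19 subjects
    [0.30, 0.70], [0.20, 0.80], [0.25, 0.75],  # subjects without severe COVID-19
]
severe = [1, 1, 1, 0, 0, 0]

intercept, weights = fit_logistic_weights(reference, severe)
print(intercept, weights)
```

The fitted weights are unstandardized; converting them to normalized or standardized coefficients, as contemplated elsewhere herein, would be a separate step not shown in this sketch.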
In embodiments, the method comprises (or further comprises) comparing the likelihood or likelihood value with a cutoff or cutoff value (e.g. a predetermined cutoff value). In embodiments, the method comprises (or further comprises) comparing the likelihood value with a cutoff value (e.g. a predetermined cutoff value), wherein the likelihood value being above the cutoff value is indicative of susceptibility to severe COVID-19 in the subject, and wherein the likelihood value being below the cutoff value is indicative of the absence of susceptibility to severe COVID-19 in the subject.
The comparing step may be considered to result in a diagnosis, i.e. of the presence or absence of susceptibility to severe COVID-19 in the subject. Alternatively viewed, the comparing step may be considered to result in a classification of the subject as having or not having susceptibility to severe COVID-19.
In further embodiments, the method comprises (or further comprises) providing a readout or result indicating the presence or absence of susceptibility to severe COVID-19 based on the comparison of the likelihood (or likelihood value) with the cutoff (or cutoff value). In other words, the readout or result can be used as a diagnosis of the presence or absence of susceptibility to severe COVID-19 in the subject.
In the methods of the invention, appropriate threshold or cut-off scores or values can be calculated by methods known in the art, for example from the ROC curve, for use in the methods of the invention. Such cut-off scores or values or thresholds may be used to declare a sample positive or negative. Appropriate or optimal cut-off scores or values or thresholds can be calculated depending on the desired outcome of the method, for example a cut-off score or value or threshold can be determined (or selected) to maximize the accuracy of the assay. Alternatively or in addition, a cut-off score or value or threshold can be determined (or selected) to maximize the specificity of the assay, or the sensitivity of the assay, or both the sensitivity and the specificity of the assay (e.g. the maximum total sum of the sensitivity and specificity, or maximizing the accuracy). Alternatively, a default cut-off can be used without calculation, for example a cut-off of 0.5 (in other words, a likelihood value of greater than 0.5 indicates susceptibility to severe COVID-19). Appropriate cutoff values can readily be determined by a person skilled in the art as described elsewhere herein. However, exemplary cutoff values might be 0.5, 0.6, 0.7, 0.8, or 0.9.
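For illustration only, the selection of a cut-off that maximizes the total sum of the sensitivity and specificity might be sketched as below. The candidate cut-offs are the exemplary values 0.5 to 0.9 mentioned above; the likelihood scores and labels are invented for the example, and the function names are hypothetical.

```python
def sensitivity_specificity(scores, labels, cutoff):
    """Sensitivity and specificity when scores above the cutoff are positive."""
    pairs = list(zip(scores, labels))
    tp = sum(1 for s, y in pairs if y == 1 and s > cutoff)
    fn = sum(1 for s, y in pairs if y == 1 and s <= cutoff)
    tn = sum(1 for s, y in pairs if y == 0 and s <= cutoff)
    fp = sum(1 for s, y in pairs if y == 0 and s > cutoff)
    return tp / (tp + fn), tn / (tn + fp)


def best_cutoff(scores, labels, candidates):
    """Candidate cutoff with the largest sensitivity + specificity.

    Ties are resolved in favour of the earliest candidate in the list.
    """
    best, best_total = None, -1.0
    for cutoff in candidates:
        sens, spec = sensitivity_specificity(scores, labels, cutoff)
        if sens + spec > best_total:
            best, best_total = cutoff, sens + spec
    return best


# Invented likelihood values; 1 = severe COVID-19, 0 = not severe.
scores = [0.95, 0.85, 0.65, 0.55, 0.45, 0.35, 0.15]
labels = [1, 1, 1, 0, 1, 0, 0]

cutoff = best_cutoff(scores, labels, [0.5, 0.6, 0.7, 0.8, 0.9])
print(cutoff)
```

A different objective, for example maximizing sensitivity alone or specificity alone, could be substituted for the sum used in `best_cutoff`.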
Once the cut-off value has been determined, a sample whose likelihood score is below this threshold (cut-off) value is classified as not having susceptibility to severe COVID-19, or, put another way, a sample whose likelihood score is above this cut-off value is classified as having susceptibility to severe COVID-19. This way of determining threshold (cut-off) values could be used for any of the models (or algorithms) using different combinations of CpG sites described herein. Pre-determined or default cut-off values can also be used. Such threshold (cut-off) scores can then conveniently be used to assess the appropriate methylation data in subjects and to arrive at a diagnosis. Using an appropriate cut-off or threshold value (used to declare a sample positive or negative), the models of the invention provided herein show outstanding results (on average a balanced accuracy of approximately 0.8 for all models tested). Thus, these results show that the present invention provides a simple and accessible test to allow accurate screening of susceptibility (or the presence or absence of susceptibility) to severe COVID-19 in an individual. Good indicators of the performance of a diagnostic test are AUC, sensitivity, specificity, accuracy and balanced accuracy, especially AUC and balanced accuracy.
As used herein, the term "sensitivity" refers to the ability of the test to correctly identify those patients with the disease or disorder, such that a 100% sensitivity indicates a test that correctly identifies all patients with the disease or disorder. Sensitivity is calculated as: Sensitivity = (True Positives)/(True Positives + False Negatives). Sensitivity thus also provides a representation of the number of true positives or false negatives.
As used herein, the term "specificity" refers to the ability of a test to correctly identify those patients without the disease or disorder, such that a 100% specificity indicates a test that correctly identifies all patients without the disease or disorder. Specificity is calculated as: Specificity = (True Negatives)/(True Negatives + False Positives). Specificity thus also provides a representation of the number of true negatives or false positives.
As used herein, the term “accuracy” or “balanced accuracy” refers to the average of the sensitivity and specificity of the method. In other words, Balanced Accuracy = (Sensitivity + Specificity) / 2.
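The three definitions above can be illustrated with a short calculation from the counts of true and false positives and negatives. The counts used are invented for the example and do not represent results obtained with the methods of the invention.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Sensitivity, specificity and balanced accuracy from confusion counts."""
    sensitivity = tp / (tp + fn)  # true positives among all diseased
    specificity = tn / (tn + fp)  # true negatives among all non-diseased
    balanced_accuracy = (sensitivity + specificity) / 2
    return sensitivity, specificity, balanced_accuracy


# Invented counts for illustration only.
sens, spec, bal_acc = diagnostic_metrics(tp=40, fp=10, tn=90, fn=10)
print(sens, spec, bal_acc)
```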
The “area under the receiver operating characteristic (ROC) curve” (AUC) is a global measure of diagnostic accuracy. The ROC curve is a plot of the pairs of sensitivity and specificity values for each cut-off, with 1-specificity (1 minus specificity) on the x-axis and sensitivity on the y-axis. Thus, while the sensitivity and specificity of a diagnostic test depend on the cut-off, the AUC is independent of cut-off. In some instances, AUC can therefore be more informative of the quality of a diagnostic test than sensitivity or specificity.
In general, an AUC of 0.5 suggests no discrimination (i.e. no ability to diagnose patients with and without the disease or condition based on the test), 0.7 to 0.8 is considered acceptable, 0.8 to 0.9 is considered excellent, and more than 0.9 is considered outstanding (Mandrekar, Journal of Thoracic Oncology, Volume 5, Number 9, September 2010).
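As a non-limiting sketch, the AUC can be computed without choosing any cut-off by using its equivalent pairwise form: the proportion of (positive, negative) pairs of subjects in which the positive subject receives the higher score, with tied pairs counted as one half. This pairwise formulation is a substitute chosen for brevity rather than a method stated in the description, and the scores and labels are invented.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of scores.

    Equals the probability that a randomly chosen positive subject scores
    higher than a randomly chosen negative subject (ties count one half).
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Invented likelihood values; 1 = severe COVID-19, 0 = not severe.
scores = [0.95, 0.85, 0.65, 0.55, 0.45, 0.35, 0.15]
labels = [1, 1, 1, 0, 1, 0, 0]

print(auc(scores, labels))
```

A predictor that assigns the same score to every subject yields 0.5 under this calculation, matching the "no discrimination" case noted above.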
As used herein, the term “balanced accuracy” refers to the average of the sensitivity and specificity of the method. In other words, Balanced Accuracy = (Sensitivity + Specificity) / 2.
Due to the nature of AUC and balanced accuracy, the balanced accuracy value of a predictor at a given cut-off will generally be much lower than the AUC value of the predictor. Thus, practitioners in the art would consider a balanced accuracy of approximately 0.6 or above, or 0.6 or above, to define that the predictor is acceptable/workable. In embodiments, the methods of the invention as described elsewhere herein have a specificity, sensitivity, balanced accuracy and/or AUC value of at least 0.58 (58%), 0.59 (59%), 0.6 (60%), 0.61 (61%), 0.62 (62%), 0.63 (63%), 0.64 (64%), 0.65 (65%), 0.66 (66%), 0.67 (67%), 0.68 (68%), 0.69 (69%), 0.7 (70%), 0.71 (71%), 0.72 (72%), 0.73 (73%), 0.74 (74%), 0.75 (75%), 0.76 (76%), 0.77 (77%), 0.78 (78%), 0.79 (79%), 0.8 (80%), 0.81 (81%), 0.82 (82%), 0.83 (83%), 0.84 (84%), 0.85 (85%), 0.86 (86%), 0.87 (87%), 0.88 (88%), 0.89 (89%), 0.9 (90%), 0.91 (91%), 0.92 (92%), 0.93 (93%), 0.94 (94%), 0.95 (95%) or 0.96 (96%).
In embodiments, the method of the invention comprises or further comprises making a diagnosis of susceptibility to severe COVID-19 based on the methylation levels referred to elsewhere herein and/or the likelihood referred to elsewhere herein. Alternatively viewed, the method of the invention may comprise or further comprise making a prognosis of the development of severe COVID-19.
The diagnosis (or prognosis) may be made on the basis of (or based on) the methylation levels, likelihood value, or readout or result described elsewhere herein. The diagnosis may be considered to be performed by the production of the readout or result itself. Said diagnosis may therefore be computer implemented, e.g. partially or entirely computer implemented, and/or performed in the absence of a clinician. Alternatively or in addition, the diagnosis may be considered to be the conclusion drawn by a clinician based on said methylation levels, likelihood value, or readout or result described elsewhere herein.
In embodiments of any of the methods described herein, the method may comprise (or further comprise) delivering a diagnosis (or prognosis). The diagnosis (or prognosis) may be based on data used or generated in the method, for example a readout, a result, or methylation levels as described elsewhere herein. The delivering of the diagnosis (or prognosis) may be considered to be performed by the production of the readout or result itself. The diagnosis (or prognosis) may be delivered in the form of a written or electronic report as described elsewhere herein, or may be delivered orally. The diagnosis (or prognosis) may be delivered by a clinician, or by a processing system or computer. The diagnosis (or prognosis) may be delivered to any relevant party, for example the subject being tested or an acquaintance thereof, or another clinician.
In embodiments of any of the methods of the invention provided herein, the method may further comprise outputting the data (e.g. readout, result, diagnosis, prognosis, or methylation levels, as the case may be) over a network connection, or displaying the data on a screen, e.g. a computer screen, or on an electronic display. In embodiments, the subject (e.g. human subject) is a subject who has COVID-19, or has been diagnosed or identified as having COVID-19. Preferably the subject has been recently diagnosed or identified as having COVID-19, for example the subject may have been diagnosed or identified as having COVID-19 at most 1, 2, 3, 4, 5, 6 or 7 days, or 1, 2, 3 or 4 weeks prior to collection of a sample from the subject for analysis by the method of the invention.
In embodiments, the subject (e.g. human subject) is a subject suspected of being (or believed to be) susceptible to severe COVID-19 (or at risk of, or at high or higher or increased risk of severe COVID-19). The method of the invention may be used in order to affirm or further support that the subject is indeed susceptible to severe COVID-19. Thus, in embodiments, the subject is a subject having (or known or believed to have) one or more of known risk factors associated with (developing or the development of) severe COVID-19.
Such “at risk”, “suspected” or “susceptible” etc. subjects would be readily identified by a person skilled in the art. These subjects include for example subjects with a family history of susceptibility to (or development of) severe COVID-19, or a genetic predisposition to the development of severe COVID-19, or subjects diagnosed with one or more risk factors associated with severe COVID-19, or subjects with one or more recognized risk factors associated with severe COVID-19. For example, recognized risk factors for severe COVID-19 are male sex, obesity, old age, hypertension, diabetes, cardiovascular disease, or chronic lung disease, Down's syndrome, cancer, treatment for certain types of cancer (for example chemotherapy), sickle cell disease, chronic kidney disease, severe liver disease, or immunodeficiency.
The term “subject” as used herein can also mean “individual”, “patient” or “person”.
The methods of the invention as described herein can be carried out on any type of subject which is capable of suffering from severe COVID-19 (or from being susceptible to severe COVID-19). A wide variety of animals are known to be able to be infected with SARS-CoV-2 and develop COVID-19, including several mammals. Thus, the methods of the invention may be carried out on a mammal, i.e. the subject may be a mammal. The methods may be carried out on (i.e. the subject may be) a human, primate (e.g. monkey), laboratory mammal (e.g. mouse, rat, rabbit, or guinea pig), livestock mammal (e.g. horse, cattle, sheep, or pig), domestic pet (e.g. cat, dog, hamster, or ferret), zoo or sanctuary animal (e.g. lion, tiger, leopard), mink, or wild animal (e.g. deer, marmoset, or anteater).
The subject is preferably a human subject. The subject may be male or female. The human may, for example, be 0-10, 10-20, 20-30, 30-40, 40-50, 50-60, 60-70, 70-80, 80-90, 90-100 or above 100 years old. In embodiments, the subject (e.g. human subject) may be one who is at risk (or is a subject suspected of being (or believed to be) at risk) from a particular disease or disorder, e.g. severe COVID-19, or one who has previously suffered from a particular disease or disorder, e.g. severe COVID-19.
The methods of the invention may be carried out on “healthy” patients (subjects) or at least patients (subjects) which are not manifesting any clinical symptoms of COVID-19, for example, patients with very early or pre-clinical stage COVID-19, and/or asymptomatic COVID-19.
Thus, the methods of the present invention can also be used to monitor disease progression. Such monitoring can take place before, during or after treatment of COVID-19 (or severe COVID-19), e.g. pharmaceutical therapy or other non-pharmaceutical interventions such as respiratory support, hospitalization, oxygen (O2) supplementation, non-invasive ventilation, ventilator support, ECMO, or physical therapy. Thus, in another aspect the present invention provides a method for monitoring COVID-19 (or severe COVID-19) or monitoring the progression of COVID-19 (or severe COVID-19) in a subject.
Methods of the present invention can be used in the active monitoring of patients which have not been subjected to therapy (pharmacological or non-pharmacological), e.g. to monitor the progress of COVID-19 in untreated patients. For example, serial measurements can allow an assessment of whether or not, or the extent to which, the COVID-19 is worsening or improving, thus, for example, allowing a more reasoned decision to be made as to whether therapeutic (pharmacological or non-pharmacological) intervention is necessary or advisable.
As discussed above and elsewhere herein, monitoring can also be carried out, for example, in an individual, e.g. a healthy individual or an individual which has tested positive for COVID-19 (for example through a lateral flow test or PCR test), who is thought to be at risk of developing severe COVID-19 or thought to be susceptible to developing severe COVID-19, in order to obtain an early, and ideally pre-clinical, indication of susceptibility to developing severe COVID-19. The term “monitoring” COVID-19 as used herein can also be used to mean “monitoring the development of” or “monitoring the progression of” COVID-19 (or severe COVID-19).
Serial (periodical) measuring of the methylation level of one or more of the CpG sites in accordance with the present invention and as referred to elsewhere herein may also be used to monitor susceptibility to severe COVID-19, looking for either increasing or decreasing levels over time. Observation of altered levels (increase or decrease as the case may be) may also be used to guide and monitor therapy, both in the setting of subclinical disease, i.e. in the situation of "watchful waiting" before treatment (pharmacological or non-pharmacological), e.g. before initiation of pharmacological or non-pharmacological treatment, or during or after treatment to evaluate the effect of treatment and look for signs of therapy failure.
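One simple way of looking for increasing or decreasing levels across serial measurements, offered purely as an illustration, is to fit a least-squares line to the methylation level of a CpG site over time and inspect the sign of its slope. The sampling days, methylation values and the tolerance used to call a trend "stable" are all assumptions for the sketch and are not specified in the description.

```python
def trend_slope(days, levels):
    """Least-squares slope of methylation level against time (per day)."""
    n = len(days)
    mean_day = sum(days) / n
    mean_level = sum(levels) / n
    numerator = sum(
        (d - mean_day) * (v - mean_level) for d, v in zip(days, levels)
    )
    denominator = sum((d - mean_day) ** 2 for d in days)
    return numerator / denominator


def describe_trend(days, levels, tolerance=0.001):
    """Label the series as increasing, decreasing or stable.

    tolerance is a hypothetical threshold on the absolute slope.
    """
    slope = trend_slope(days, levels)
    if slope > tolerance:
        return "increasing"
    if slope < -tolerance:
        return "decreasing"
    return "stable"


# Invented serial measurements of one hypothetical CpG site.
days = [0, 7, 14, 21]
levels = [0.40, 0.46, 0.52, 0.58]

print(trend_slope(days, levels), describe_trend(days, levels))
```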
Thus, the present invention also provides a method for predicting the response of a subject to therapy (pharmacological or non-pharmacological). For example, a subject with a higher likelihood (or probability) of developing severe COVID-19, as determined by the methylation level of one or more of the CpG sites in a sample in accordance with the present invention and as referred to elsewhere herein, may be more likely to be responsive to therapy (pharmacological or non-pharmacological) than a subject with lower likelihood (or probability) of developing severe COVID-19. In such methods the choice of therapy (pharmacological or non-pharmacological) may be guided by knowledge of the methylation level of one or more of the CpG sites in the sample.
In some embodiments, the invention provides a method of monitoring (e.g. continuously monitoring or performing active surveillance of) a subject having COVID-19 or severe COVID-19 (e.g. a subject being treated for COVID-19). Such monitoring may guide which treatment to use or whether no treatment should be given or whether treatment should be continued or whether the dose of a pharmaceutical agent should be increased or decreased, etc.
Preferably, but not necessarily, the methods of the invention (e.g. screening or diagnostic methods, etc., as described herein) are used in conjunction with (or subsequent to) known screening or diagnostic methods for identifying COVID-19, for example lateral flow tests or PCR tests. In another embodiment, the invention provides the use of the methods of the invention (e.g. screening or diagnostic methods, etc., as described herein) in conjunction with other known screening or diagnostic methods for identifying susceptibility to severe COVID-19, for example lateral flow tests or PCR tests. Thus, for example, the methods of the invention can be used as a follow up to confirm a diagnosis of susceptibility to severe COVID-19 in a subject. In some embodiments the methods of the present invention are used alone.
The methods of the present invention can be carried out on any appropriate biological sample, e.g. any appropriate body fluid sample or tissue sample that contains DNA. In this regard, although blood samples are a common source of DNA, other types of body fluid or tissue sample could be used by a skilled person to extract DNA containing the desired CpG sites, following the teaching as provided herein. Typically the sample has been obtained from (removed from) a subject (e.g. as described elsewhere herein, preferably a human subject). In other aspects, the method further comprises a step of obtaining a sample from the subject. By obtained from the subject, it is meant that the biological sample is previously obtained, or has been obtained from the subject. Hence, in embodiments the patient or subject is not required to be present (while the methods of the invention are being performed). In embodiments, the invention is not practised (or performed) on the human (or animal) body.
Reference herein to "body fluid" includes reference to all fluids derived from the body of a subject. Exemplary fluids include blood (including all blood derived components, for example buffy coat, plasma, serum, etc.), saliva, urine, tears, bronchial secretions or mucus. Preferably, the body fluid is a circulatory fluid (especially blood or a blood component), or urine. Especially preferred body fluids are blood or urine. In some preferred embodiments the sample is a blood sample (e.g. a plasma, serum, buffy coat or white blood cell sample). In some preferred embodiments the sample is a buffy coat sample or white blood cell sample. In some embodiments the sample is a urine sample. The body fluid or sample may be in the form of a liquid biopsy. The term "sample" also encompasses any material derived by processing a body fluid or tissue sample (e.g. derived by processing a blood or urine sample). Processing of biological samples to obtain a test sample may involve one or more of: digestion, boiling, filtration, distillation, centrifugation, lyophilization, fractionation, extraction, concentration, dilution, purification, inactivation of interfering components, addition of reagents, derivatization, complexation and the like, e.g. as described elsewhere herein.
In embodiments, the biological sample is a blood, saliva, urine, solid tissue (for example cartilage from affected joints), or fecal sample. In preferred embodiments, the biological sample is a blood sample. Preferably, the blood sample is a buffy coat sample or a serum sample or a plasma sample. In embodiments, the sample is a white blood cell (or leukocyte) sample, or is a sample comprising white blood cells (or leukocytes).
Typically, the DNA from the biological sample is genomic DNA.
In embodiments, the method additionally comprises the step of obtaining one or more biological samples from the subject. In some embodiments, one or more of the methylation levels in accordance with the present invention are detected directly in the biological sample, e.g. from within a sample of the subject’s blood, blood serum, blood plasma, buffy coat, white blood cell, or other sample.
In embodiments, DNA is first isolated and/or purified from the biological sample before the methylation levels are detected. The biological sample may therefore comprise (or consist of, or be) isolated and/or purified DNA. DNA may be isolated and/or purified from the biological samples by any suitable method which would be well known to a person skilled in the art. Such methods may include cell lysis; treatment with protease, RNase and/or detergent; and DNA purification by ethanol precipitation, phenol-chloroform extraction or minicolumn purification. Specific DNA extraction methods can be used depending on the biological sample in question. For example, where the biological sample is a blood sample, the DNA can be extracted using the Monarch® Protocol for Extraction and Purification of Genomic DNA from Blood (NEB #T3010), or a magnetic bead-based technology such as the ChargeSwitch® gDNA Purification Kit (Thermofisher).
In embodiments, the method of the invention comprises, e.g. further comprises, reporting the results of the method, optionally and conveniently by preparing a written or electronic report.
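By way of example only, an electronic report of the kind referred to above could be serialized as JSON. The field names, the subject identifier and the values shown are hypothetical and are chosen solely to illustrate the idea; no particular report format is required by the description.

```python
import json


def build_report(subject_id, likelihood, cutoff=0.5):
    """Assemble a minimal electronic report as a plain dictionary."""
    return {
        "subject_id": subject_id,
        "likelihood": round(likelihood, 3),
        "cutoff": cutoff,
        # Readout: likelihood above the cutoff indicates susceptibility.
        "susceptible_to_severe_covid19": likelihood > cutoff,
    }


# Hypothetical subject identifier and likelihood value.
report = build_report("subject-001", 0.72, cutoff=0.6)
report_json = json.dumps(report, indent=2)
print(report_json)
```

The resulting text could then be stored, displayed on a screen, or output over a network connection as contemplated elsewhere herein.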
In embodiments, the method of the invention is implemented by a computer (or is computer-implemented).
In embodiments, the method of the invention comprises, e.g. further comprises, treating said severe COVID-19 by therapy (pharmacological or non-pharmacological).
Hospitals may offer antibody and/or antiviral treatment to individuals with COVID-19 who are at risk of developing severe COVID-19. Multiple treatment options are available and include nirmatrelvir and ritonavir (Paxlovid), sotrovimab (Xevudy), remdesivir (Veklury), and molnupiravir (Lagevrio). Nirmatrelvir, ritonavir, remdesivir and molnupiravir are antiviral medicines, while sotrovimab is a neutralising monoclonal antibody (nMAb).
Thus, in preferred embodiments the therapy comprises a step of administering to the subject a therapeutically effective amount of one or more agents suitable for preventing the development of severe COVID-19, preferably an agent selected from the group consisting of nirmatrelvir, ritonavir, sotrovimab, remdesivir, and molnupiravir.
COVID-19 illness may be classified as severe COVID-19 where the subject requires hospitalization or if hospitalization is warranted. There are multiple treatment and therapy options for the management of patients with severe COVID-19. Depending on the severity of the illness, supplemental oxygen may be administered to the subject. The supplemental oxygen may be provided through non-invasive ventilation (for example using a face mask). In cases of more severe illness, the supplemental oxygen may be provided by high-flow oxygen therapy (for example using a high-flow nasal cannula), mechanical ventilation (MV) (using a ventilator), or extracorporeal membrane oxygenation (ECMO) (using a heart and/or lung machine). Alternatively or in addition, an antiviral or immunomodulator therapy may be administered to the subject, for example remdesivir, dexamethasone and/or tocilizumab.
Alternatively or in addition, an anticoagulation therapy may be administered to the subject, for example heparin.
Thus, in preferred embodiments the therapy comprises a step of administering to the subject a therapeutically effective amount of one or more agents suitable for treating severe COVID-19, preferably a suitable antiviral or immunomodulator agent (for example remdesivir, dexamethasone and/or tocilizumab) and/or a suitable anticoagulation therapy (for example heparin).
In embodiments, the method of the invention comprises, e.g. further comprises, altering, ceasing or continuing treatment of said subject.
In embodiments, the method of the invention comprises, e.g. further comprises, a step of measuring the methylation levels before the step of using the methylation levels.
In embodiments, the method of the invention comprises, e.g. further comprises, providing DNA (said DNA) from a biological sample obtained from the subject before the step of measuring the methylation levels.
In one or more embodiments, a method of the invention is provided comprising a first step of extracting DNA (e.g. genomic DNA) from a sample, e.g. a biological sample. In a second step, the DNA methylation levels at multiple CpG sites as defined elsewhere herein are measured. Each measurement measures the extent of methylation at a particular CpG site.
In another aspect, the invention provides a computer program, software, or computer readable storage medium (e.g. a non-transitory and/or tangible computer readable storage medium), comprising instructions that, when executed by a processing system, cause the processing system to process data representative of methylation levels of at least or at most 1, 2, 3, 4, 5, 7, 10, 15, 20, 25, 30, 40, 50, 60, 70, 75, 80, 90, 100, 125, 150, 200, 250, 300, 350, 400 or 433 CpG sites selected from the 433 CpG sites listed in Table 4 in DNA from a biological sample obtained from a subject, to calculate a likelihood of the subject having susceptibility to severe COVID-19.
In another aspect, the invention provides a computer program, software, or computer readable storage medium (e.g. a non-transitory and/or tangible computer readable storage medium), comprising instructions that, when executed by a processing system, cause the processing system to process data representative of methylation levels of at least or at most:
1, 2, 3, 4, 5, 7, 10, 15, 20, 25, 30, 40, 50, 60, 70, 75, 80, 90, 100, 125, 150, 200, 250, 300, or 337 CpG sites selected from the 337 CpG sites listed in Table 5; and/or
1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 12, 15, 20, 25, 30, 40, 50, 60, 70, 75, 80, 90, or 97 CpG sites selected from the 97 CpG sites listed in Table 3; in DNA from a biological sample obtained from a subject, to calculate a likelihood of the subject having susceptibility to severe COVID-19.
In another aspect, the invention provides a computer program, software, or computer readable storage medium (e.g. a non-transitory and/or tangible computer readable storage medium), comprising instructions that, when executed by a processing system, cause the processing system to process data representative of methylation levels of at least or at most:
1, 2, 3, 4, 5, 7, 10, 15, 20, 25, 30, 40, 50, 60, 70, 75, 80, 90, 100, 125, 150, 200, 250, 300, or 337 CpG sites selected from the 337 CpG sites listed in Table 5; in DNA from a biological sample obtained from a subject, to calculate a likelihood of the subject having susceptibility to severe COVID-19.
In another aspect, the invention provides a computer program, software, or computer readable storage medium (e.g. a non-transitory and/or tangible computer readable storage medium), comprising instructions that, when executed by a processing system, cause the processing system to process data representative of methylation levels of at least or at most 1, 2, 3, 4, 5, 7, 10, 15, 20, 25, 30, 40, 50, 60, 70, 75, 80, 90, 100, 125, 150, 200, 250, or 251 CpG sites selected from the 251 CpG sites listed in Table 2 in DNA from a biological sample obtained from a subject, to calculate a likelihood of the subject having susceptibility to severe COVID-19.
In another aspect, the invention provides a computer program, software, or computer readable storage medium (e.g. a non-transitory and/or tangible computer readable storage medium), comprising instructions that, when executed by a processing system, cause the processing system to process data representative of methylation levels of at least or at most 1, 2, 3, 4, 5, 7, 10, 15, 20, 25, 30, 40, 50, 60, 70, 75, 80, 90, 100, 125, 150, or 177 CpG sites selected from the 177 CpG sites listed in Table 1 in DNA from a biological sample obtained from a subject, to calculate a likelihood of the subject having susceptibility to severe COVID-19.
In another aspect, the invention provides a computer program, software, or computer readable storage medium (e.g. a non-transitory and/or tangible computer readable storage medium), comprising instructions that, when executed by a processing system, cause the processing system to process data representative of methylation levels of at least or at most 1, 2, 3, 4, 5, 7, 10, 15, 20, 25, 30, 40, 50, 60, 70, 75, 80, 90, or 91 CpG sites selected from the 91 CpG sites listed in Table 6 in DNA from a biological sample obtained from a subject, to calculate a likelihood of the subject having susceptibility to severe COVID-19.
In another aspect, the invention provides a computer program, software, or computer readable storage medium (e.g. a non-transitory and/or tangible computer readable storage medium), comprising instructions that, when executed by a processing system, cause the processing system to process data representative of methylation levels of at least or at most 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 12, 15, 20, 25, 30, 40, 50, 60, 70, 75, 80, 90, or 97 CpG sites selected from the 97 CpG sites listed in Table 3 in DNA from a biological sample obtained from a subject, to calculate a likelihood of the subject having susceptibility to severe COVID-19.
In another aspect, the invention provides a computer program, software, or computer readable storage medium (e.g. a non-transitory and/or tangible computer readable storage medium), comprising instructions that, when executed by a processing system, cause the processing system to process data representative of methylation levels of at least or at most 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 12, 15, 20, 25, 30, or 34 CpG sites selected from the 34 CpG sites listed in Table 7 in DNA from a biological sample obtained from a subject, to calculate a likelihood of the subject having susceptibility to severe COVID-19.
The software or computer program may be stored on a non-transitory and/or tangible computer-readable storage medium, such as a hard-drive, a CD-ROM, a solid-state memory, etc., or may be communicated by a transitory signal such as data over a network.
These embodiments provide a means for executing or implementing the methods of the invention as described herein. Thus, in these embodiments, any other number of the CpG sites or other features as described elsewhere herein for the methods of the invention can be used.
In alternative embodiments, the instructions cause the processing system to calculate the likelihood as a non-linear function of the combination of said methylation levels in accordance with the invention as described elsewhere herein.
In embodiments, the instructions cause the processing system to calculate the likelihood as a function of a linear combination of said methylation levels.
In embodiments, the linear combination of said methylation levels comprises a weighted sum of said methylation levels. In embodiments, the instructions cause the processing system to calculate the likelihood as a logistic function of a linear combination of said methylation levels.
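By way of illustration only, the calculation of a likelihood as a logistic function of a weighted sum of methylation levels might be sketched as follows. The site identifiers, coefficients and intercept shown are hypothetical placeholders chosen for the example and are not values taken from any Table herein; the function name severity_likelihood is likewise an assumption.

```python
import math


def severity_likelihood(methylation, coefficients, intercept):
    """Return a likelihood in (0, 1) as a logistic function of a weighted sum.

    methylation:  dict mapping CpG site identifier -> methylation level (0..1)
    coefficients: dict mapping CpG site identifier -> weight for that site
    intercept:    constant term added to the weighted sum
    """
    missing = set(coefficients) - set(methylation)
    if missing:
        raise ValueError(f"missing methylation levels for: {sorted(missing)}")
    # Linear combination: intercept plus the weighted sum of methylation levels.
    linear = intercept + sum(
        weight * methylation[site] for site, weight in coefficients.items()
    )
    # Logistic function maps the linear combination onto the interval (0, 1).
    return 1.0 / (1.0 + math.exp(-linear))


# Hypothetical placeholder values, for illustration only.
coefficients = {"cg_site_a": 1.5, "cg_site_b": -2.0}
intercept = 0.25
levels = {"cg_site_a": 0.5, "cg_site_b": 0.5}

likelihood = severity_likelihood(levels, coefficients, intercept)
print(likelihood)
```

In such a sketch, a positive weight means that higher methylation at the site increases the calculated likelihood, and a negative weight means that it decreases the calculated likelihood.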
In embodiments, the instructions cause the processing system to receive data representative of said methylation levels and input the data to an algorithm for evaluating said function to determine the likelihood of the subject having susceptibility to severe COVID-19.
In embodiments, the computer program, software, or non-transitory (or tangible) computer readable storage medium comprises computer-readable code that, when executed by a processing system (or a computer), causes the processing system (or the computer) to perform one or more additional operations comprising: sending information corresponding to the methylation levels of the set of CpG sites in the biological sample to a tangible data storage device.
In embodiments, the using of the methylation levels comprises processing data representative of the methylation levels (or processing the methylation levels).
The methods disclosed herein may be fully or wholly computer-implemented methods. Alternatively, the methods disclosed herein may be partially computer-implemented methods. Any of the method steps disclosed herein may, wherever appropriate, be implemented as steps of the method, using any appropriate hardware and/or software. For example, the method (or one or more method steps, for example the step of using the methylation levels or the step of processing the methylation levels) may be carried out by a processor, computer, device, unit, module or means. Thus, the step (or only the step) of using the methylation levels may be computer-implemented, and/or the step (or only the step) of processing of data representative of the methylation levels (or processing the methylation levels) may be computer-implemented. The computer software disclosed herein may be on a transitory or a non-transitory computer-readable medium. The diagnostic algorithm could be implemented on one or more further computer processing systems that are distinct from the computer processing system that is configured to train the model (or models) used in the invention.
Thus, a preferred embodiment provides a method of screening for the susceptibility of a subject to severe COVID-19, the method comprising using the methylation levels of at least: 10, 15, 20, 25, 30 or 35 CpG sites selected from the 337 CpG sites listed in Table 5; and/or 2, 3, 4, 5, 10, or 15 CpG sites selected from the 97 CpG sites listed in Table 3; in DNA from a biological sample obtained from the subject, wherein said methylation levels are used to provide an indication of the susceptibility of the subject to severe COVID-19; wherein the step of using the methylation levels is implemented by a computer; and wherein the biological sample is a blood sample.
Such methods may comprise using the methylation levels of at least:
10, 15, or 20 CpG sites selected from the 251 CpG sites listed in Table 2; and/or
10, 15, or 20 CpG sites selected from the 177 CpG sites listed in Table 1; and/or
2, 3, 4, 5, 10, or 15 CpG sites selected from the 97 CpG sites listed in Table 3.
Alternative and preferred embodiments and features of the invention as described elsewhere herein apply equally to these methods of the invention.
The invention may also be provided in a fully developed software package or web-based program. For example, a user may access a webpage and upload their DNA methylation data. The program then emails the results, including the indication of the (presence or absence of) susceptibility to severe COVID-19, to the user.
Where “processing system” is recited, it should be understood that “computer” or “computer system” (or processor, or device, or unit, or module, or means) is also contemplated alternatively or in addition. Equally, all the terms recited in this paragraph may be used interchangeably where appropriate.
In another aspect, the invention provides a processing system configured to perform the method of the invention.
In another aspect, the invention provides a processing system (or a computer or computer system) configured to run the algorithm or software of the invention as provided elsewhere herein or configured to perform the methods of the invention.
In another aspect, the invention provides a method of screening or diagnosing, etc., susceptibility to severe COVID-19 in a subject, the method comprising calculating, optionally implemented by a computer, a likelihood (or probability) of the subject having susceptibility to severe COVID-19 using measurements of methylation levels of at least or at most:
(i) 1, 2, 3, 4, 5, 7, 10, 15, 20, 25, 30, 40, 50, 60, 70, 75, 80, 90, 100, 125, 150, 200, 250, 300, 350, 400 or 433 CpG sites selected from the 433 CpG sites listed in Table 4; or
(ii) 1, 2, 3, 4, 5, 7, 10, 15, 20, 25, 30, 40, 50, 60, 70, 75, 80, 90, 100, 125, 150, 200, 250, 300, or 337 CpG sites selected from the 337 CpG sites listed in Table 5, and/or 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 12, 15, 20, 25, 30, 40, 50, 60, 70, 75, 80, 90, or 97 CpG sites selected from the 97 CpG sites listed in Table 3; or
(iii) 1, 2, 3, 4, 5, 7, 10, 15, 20, 25, 30, 40, 50, 60, 70, 75, 80, 90, 100, 125, 150, 200, 250, 300, or 337 CpG sites selected from the 337 CpG sites listed in Table 5; or
(iv) 1, 2, 3, 4, 5, 7, 10, 15, 20, 25, 30, 40, 50, 60, 70, 75, 80, 90, 100, 125, 150, 200, 250, or 251 CpG sites selected from the 251 CpG sites listed in Table 2; or
(v) 1, 2, 3, 4, 5, 7, 10, 15, 20, 25, 30, 40, 50, 60, 70, 75, 80, 90, 100, 125, 150, or 177 CpG sites selected from the 177 CpG sites listed in Table 1; or
(vi) 1, 2, 3, 4, 5, 7, 10, 15, 20, 25, 30, 40, 50, 60, 70, 75, 80, 90, or 91 CpG sites selected from the 91 CpG sites listed in Table 6; or
(vii) 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 12, 15, 20, 25, 30, 40, 50, 60, 70, 75, 80, 90, or 97 CpG sites selected from the 97 CpG sites listed in Table 3; or
(viii) 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 12, 15, 20, 25, 30, or 34 CpG sites selected from the 34 CpG sites listed in Table 7;
obtained from a DNA sample of the subject. In alternative embodiments, any other number of the CpG sites or other features as described herein for the methods of the invention can be used.
In another aspect, the invention provides a method of monitoring susceptibility to severe COVID-19 in a subject, the method comprising:
(a) using the methylation levels of at least or at most:
(i) 1, 2, 3, 4, 5, 7, 10, 15, 20, 25, 30, 40, 50, 60, 70, 75, 80, 90, 100, 125, 150, 200, 250, 300, 350, 400 or 433 CpG sites selected from the 433 CpG sites listed in Table 4; or
(ii) 1, 2, 3, 4, 5, 7, 10, 15, 20, 25, 30, 40, 50, 60, 70, 75, 80, 90, 100, 125, 150, 200, 250, 300, or 337 CpG sites selected from the 337 CpG sites listed in Table 5, and/or 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 12, 15, 20, 25, 30, 40, 50, 60, 70, 75, 80, 90, or 97 CpG sites selected from the 97 CpG sites listed in Table 3; or
(iii) 1, 2, 3, 4, 5, 7, 10, 15, 20, 25, 30, 40, 50, 60, 70, 75, 80, 90, 100, 125, 150, 200, 250, 300, or 337 CpG sites selected from the 337 CpG sites listed in Table 5; or
(iv) 1, 2, 3, 4, 5, 7, 10, 15, 20, 25, 30, 40, 50, 60, 70, 75, 80, 90, 100, 125, 150, 200, 250, or 251 CpG sites selected from the 251 CpG sites listed in Table 2; or
(v) 1, 2, 3, 4, 5, 7, 10, 15, 20, 25, 30, 40, 50, 60, 70, 75, 80, 90, 100, 125, 150, or 177 CpG sites selected from the 177 CpG sites listed in Table 1; or
(vi) 1, 2, 3, 4, 5, 7, 10, 15, 20, 25, 30, 40, 50, 60, 70, 75, 80, 90, or 91 CpG sites selected from the 91 CpG sites listed in Table 6; or
(vii) 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 12, 15, 20, 25, 30, 40, 50, 60, 70, 75, 80, 90, or 97 CpG sites selected from the 97 CpG sites listed in Table 3; or
(viii) 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 12, 15, 20, 25, 30, or 34 CpG sites selected from the 34 CpG sites listed in Table 7;
(or any other number of the CpG sites as described elsewhere herein) in DNA from a biological sample obtained from the subject at a first time point; and
(b) comparing said methylation levels to the methylation levels of the same CpG sites (or the same any other number or selection of the CpG sites as described elsewhere herein) in DNA from a biological sample obtained from the subject at a second time point.
In another aspect, the invention provides a method of obtaining an indication of the efficacy of a drug which is being used to treat severe COVID-19 in a subject, the method comprising:
(a) using the methylation levels of at least or at most:
(i) 1, 2, 3, 4, 5, 7, 10, 15, 20, 25, 30, 40, 50, 60, 70, 75, 80, 90, 100, 125, 150, 200, 250, 300, 350, 400 or 433 CpG sites selected from the 433 CpG sites listed in Table 4; or
(ii) 1, 2, 3, 4, 5, 7, 10, 15, 20, 25, 30, 40, 50, 60, 70, 75, 80, 90, 100, 125, 150, 200, 250, 300, or 337 CpG sites selected from the 337 CpG sites listed in Table 5, and/or 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 12, 15, 20, 25, 30, 40, 50, 60, 70, 75, 80, 90, or 97 CpG sites selected from the 97 CpG sites listed in Table 3; or
(iii) 1, 2, 3, 4, 5, 7, 10, 15, 20, 25, 30, 40, 50, 60, 70, 75, 80, 90, 100, 125, 150, 200, 250, 300, or 337 CpG sites selected from the 337 CpG sites listed in Table 5; or
(iv) 1, 2, 3, 4, 5, 7, 10, 15, 20, 25, 30, 40, 50, 60, 70, 75, 80, 90, 100, 125, 150, 200, 250, or 251 CpG sites selected from the 251 CpG sites listed in Table 2; or
(v) 1, 2, 3, 4, 5, 7, 10, 15, 20, 25, 30, 40, 50, 60, 70, 75, 80, 90, 100, 125, 150, or 177 CpG sites selected from the 177 CpG sites listed in Table 1; or
(vi) 1, 2, 3, 4, 5, 7, 10, 15, 20, 25, 30, 40, 50, 60, 70, 75, 80, 90, or 91 CpG sites selected from the 91 CpG sites listed in Table 6; or
(vii) 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 12, 15, 20, 25, 30, 40, 50, 60, 70, 75, 80, 90, or 97 CpG sites selected from the 97 CpG sites listed in Table 3; or
(viii) 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 12, 15, 20, 25, 30, or 34 CpG sites selected from the 34 CpG sites listed in Table 7;
(or any other number of the CpG sites as described elsewhere herein) in DNA from a biological sample obtained from the subject at a first time point; and
(b) comparing said methylation levels to the methylation levels of the same CpG sites (or the same any other number or selection of the CpG sites as described elsewhere herein) within a biological sample obtained from the subject at a second time point, wherein a drug has been administered to the subject in the interval between the first and second time points or has been administered at any other appropriate time point, for example at or around the same time as the first time point, for example at a time point where a base-line level of methylation can be measured.
The biological samples obtained in steps (a) and (b) should be directly comparable, e.g. the biological samples must both be of the same type (e.g. both are blood samples) and subsequently treated in the same manner. The first time point may, for example, be at an early stage of the disease, or at most or exactly 1, 2, 3, 4, 5, 6 or 7 days, or 1, 2, 3 or 4 weeks after the subject first tested positive for COVID-19. The second time point may be at a later stage of COVID-19 infection, or after the subject has been treated with a medicament suitable for the prevention of severe COVID-19 (for example as described herein) and/or a medicament suitable for the treatment of severe COVID-19 (for example as described herein). The first and second time points may be separated by any suitable time interval, e.g. at least 1, 2, 3, 4, 5, 6, or 7 days apart, or at least 1, 2, 3 or 4 weeks apart, or at least 1-12 months apart, or at least 1-5 years apart.
Serial (periodic) measuring of the level of the methylation levels of one or more of the CpG sites in accordance with the present invention may also be used for disease monitoring, e.g. assessing disease severity (or likelihood of progression to severe disease), looking for either increasing or decreasing levels (or scores or likelihoods or likelihood values) over time. In some embodiments, an altering methylation level or score or likelihood (increase or decrease, as appropriate) of one or more of the CpG sites in accordance with the present invention over time (e.g. in comparison to a control level or base-line or earlier level in the same subject, e.g. a level moving further away from the control level, base-line or earlier level in the same subject) may indicate a worsening disease state, severity or prognosis. In some embodiments, an altering level (increase or decrease, as appropriate) of the methylation level of one or more of the CpG sites in accordance with the present invention over time (e.g. in comparison to a control level, e.g. a level moving closer to the control level) may indicate an improving disease state, severity or prognosis.
In embodiments, a change in the methylation levels between the first and second time points in any aspects referred to herein is indicative of a change in the likelihood (or probability) of the development of severe COVID-19 (or in the severity of COVID-19) in the subject.
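Purely as an illustrative sketch, the comparison of methylation levels of the same CpG sites at a first and a second time point might be implemented along the following lines. The site identifiers and levels are hypothetical placeholders, and the function name methylation_changes is an assumption made for the example; how a given change is interpreted (worsening or improving) depends on the site and is not decided by this sketch.

```python
def methylation_changes(first, second):
    """Return per-site change (second minus first) in methylation level.

    first, second: dicts mapping CpG site identifier -> methylation level
    measured at the first and second time point respectively. The same set
    of CpG sites must be present at both time points.
    """
    if set(first) != set(second):
        raise ValueError("both time points must cover the same CpG sites")
    return {site: second[site] - first[site] for site in sorted(first)}


# Hypothetical placeholder measurements, for illustration only.
first_time_point = {"cg_site_a": 0.40, "cg_site_b": 0.70}
second_time_point = {"cg_site_a": 0.55, "cg_site_b": 0.60}

changes = methylation_changes(first_time_point, second_time_point)
for site, delta in changes.items():
    direction = "increase" if delta > 0 else "decrease" if delta < 0 else "no change"
    print(f"{site}: {delta:+.2f} ({direction})")
```

The same comparison could equally be applied to a score or likelihood calculated from the methylation levels at each time point, rather than to the individual levels.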
In another aspect, the invention provides a method of treating COVID-19 in a subject, the method comprising:
(a) obtaining an indication of susceptibility (or of the presence of susceptibility) to severe COVID-19 in a subject by performing a method of the present invention as described elsewhere herein; and
(b) administering a treatment appropriate for treating COVID-19 to the subject if an indication of susceptibility (or of the presence of susceptibility) to severe COVID-19 in the subject is obtained, thereby treating COVID-19 in the subject.
In another aspect, the invention provides a method of preventing severe COVID-19 in a subject, the method comprising:
(a) obtaining an indication of an increased risk of severe COVID-19 in a subject (e.g. a healthy subject, or an at risk or susceptible subject) by performing a method of the present invention as described elsewhere herein; and
(b) administering a treatment appropriate for preventing severe COVID-19 in the subject if an indication of an increased risk of severe COVID-19 in the subject is obtained, thereby preventing severe COVID-19 in the subject.
In another aspect, the invention provides a method of preventing severe COVID-19 in a subject, the method comprising the step of:
(a) administering a treatment appropriate for preventing severe COVID-19 to the subject, wherein, prior to administration, an indication of the susceptibility to severe COVID-19 (or a prognosis of the development of severe COVID-19) in the subject, has been obtained by a method of the invention.
In the above methods, the treatment to be administered can also be an invasive treatment, e.g. as described elsewhere herein.
A number of different methods for detecting methylation levels of CpG sites are known and described in the literature and any of these may be used according to the present invention. At its simplest, the methylation level or state of a CpG site may be detected by hybridisation to a probe (e.g. an oligonucleotide probe) and many such hybridisation protocols have been described (see e.g. Sambrook et al., Molecular cloning: A Laboratory Manual, 3rd Ed., 2001, Cold Spring Harbor Press, Cold Spring Harbor, NY). Typically, the detection will involve a hybridisation step and/or an in vitro amplification step.
In one embodiment, the target nucleic acid, e.g. the methylated or unmethylated form of a particular CpG site, in a sample, may be detected by using an oligonucleotide with a label attached thereto, which can hybridize to the nucleic acid sequence of interest. Such a labeled oligonucleotide will allow detection by direct means or indirect means. In other words, such an oligonucleotide may be used simply as a conventional oligonucleotide probe. After contact of such a probe with the sample under conditions which allow hybridisation, and typically following a step (or steps) to remove unbound labeled oligonucleotide and/or non-specifically bound oligonucleotide, the signal from the label of the probe emanating from the sample may be detected. In preferred embodiments the label is selected such that it is detectable only when the probe is hybridized to its target. The probe may have a nucleic acid sequence complementary to the sequence of the CpG site of interest or a derivative thereof. The probe may be complementary to the CpG site (i.e. the dinucleotide “CG” sequence) and certain adjacent residues.
The probe may alternatively be complementary to a derivative of the CpG site and certain immediately adjacent residues, for example 10, 20, 30, 40, 50 or 60 immediately adjacent residues. The immediately adjacent residues may for example be residues to the 3’ side of (downstream of) the CG dinucleotide. This probe design format is known in the art.
CpG methylation can be detected using two different types of probe as used in the Illumina Infinium I Methylation Assay. The probes may each be linked to a solid support, for example a bead. The first probe type (named the U type in the Infinium I assay) has the sequence “CA” at its 3’ end, and thus is complementary to the sequence of an unmethylated CpG site which has been bisulfite treated (i.e. to “UG”) and subsequently amplified (i.e. to “TG”). The second probe type (named the M type in the Infinium I assay) has the sequence “CG” at its 3’ end, and thus is complementary to the sequence of a methylated CpG site, whether bisulfite-treated or not (i.e. “CG”). The probes may be complementary to said CpG sites and certain immediately adjacent residues (or a derivative of said sequence). The immediately adjacent residues may for example be residues to the 3’ side of (downstream of) the CG dinucleotide. Annealing of complementary probes to their target sites enables single-nucleotide (or single-base) extension. In order to enable detection, the nucleotide incorporated in the single-nucleotide extension may be labeled with an appropriate fluorophore (which indicates methylation or non-methylation), and the fluorescent signal may be detected using an imaging apparatus, for example Illumina iScan.
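The relationship between the bisulfite-treated, amplified target sequence and the 3’ end of the complementary probe can be illustrated with a short in-silico sketch. The example sequence and the function names are hypothetical and chosen only for illustration; the sketch models bisulfite treatment followed by amplification as converting every unmethylated cytosine to thymine while leaving methylated cytosines unchanged.

```python
def bisulfite_convert(sequence, methylated_positions):
    """Model bisulfite treatment plus amplification on one DNA strand.

    Unmethylated C is read as T after amplification; a methylated C
    (index listed in methylated_positions) remains C.
    """
    return "".join(
        "T" if base == "C" and index not in methylated_positions else base
        for index, base in enumerate(sequence)
    )


def reverse_complement(sequence):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return sequence.translate(str.maketrans("ACGT", "TGCA"))[::-1]


# Hypothetical target fragment with a single CpG site at positions 2-3.
target = "ATCGGA"

unmethylated = bisulfite_convert(target, methylated_positions=set())
methylated = bisulfite_convert(target, methylated_positions={2})

print(unmethylated[2:4], "->", reverse_complement(unmethylated[2:4]))
print(methylated[2:4], "->", reverse_complement(methylated[2:4]))
```

Here the unmethylated site reads “TG” and pairs with a probe ending in “CA”, whereas the methylated site reads “CG” and pairs with a probe ending in “CG”.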
Alternatively or in addition, CpG methylation may be detected using a single type of probe as used in the Infinium II Methylation Assay. The probe may have at its 3’ end a cytosine residue suitable for hybridizing to the guanine of the “CG” sequence. The probe may be complementary to said guanine and certain immediately adjacent residues (or a derivative of said sequence). The immediately adjacent residues may for example be residues to the 3’ side of (downstream of) the CG dinucleotide. The probe may be linked to a solid support, for example a bead. The probe can therefore target or hybridize to the CpG site irrespective of the sequence of the CpG site after bisulfite treatment. After hybridisation to the bisulfite treated sequence of interest, single-base extension is conducted to identify the second nucleotide of the bisulfite-treated CpG site, and thus whether the CpG site was methylated or unmethylated. In order to enable detection, the nucleotide incorporated in the single-nucleotide extension may be labeled with an appropriate fluorophore (which indicates methylation or non-methylation), and the fluorescent signal may be detected using an imaging apparatus, for example Illumina iScan. For detecting (or measuring) the methylation level of CpG sites which, when methylated, are methylated on both strands (i.e. the cytosine of the CpG site is methylated on both the sense and antisense DNA strands), the probe or probes can be designed to be targeted to the sequence of the sense strand of the CpG site, or the antisense strand of the CpG site. For CpG sites which, when methylated, are hemimethylated (i.e. the cytosine of the CpG site on one strand is methylated (e.g. the sense strand), while cytosine of the CpG site on the other strand is unmethylated (e.g. the antisense strand)), the probe or probes can be designed to be targeted to the sequence of the strand on which methylation occurs. 
Hence, probes for use in accordance with the invention may be targeted towards (or complementary to) the sense strand of a CpG site or the antisense strand of a CpG site.
The term "probe" as used herein refers to an oligonucleotide capable of binding in a base-specific manner to a complementary strand of nucleic acid. The term "probe" as used herein can also refer to a surface-immobilized molecule that can be recognized by a particular target as well as molecules that are not immobilized and are coupled to a detectable label. The terms “probe” and “primer” can be used interchangeably herein. The probe is conveniently a nucleic acid probe and thus can be a DNA or RNA oligonucleotide, typically a DNA oligonucleotide. The probe may be for example 10, 20, 30, 40, 50, 60 or 70 nucleotides in length.
The probes of the invention may be suitable for use in a PCR method (or PCR-based method). The term “PCR primer” or “PCR kit” can be used to refer to a primer or kit which is suitable for use in a PCR protocol or is suitable for performing PCR.
In some embodiments of the method of the invention, the methylation levels have been obtained using a PCR-based method (or using PCR). Optionally this is an active step of the method, hence in embodiments the method of the invention comprises obtaining the methylation levels by a PCR-based method (or by PCR) prior to using the methylation levels.
An exemplary protocol for predicting COVID-19 severity with a PCR test may be described as follows:
1. Select CpG sites to be measured and obtain or design primers for the CpG sites in question. Primers can be easily designed, for example using a web-based service such as PrimerSuite (Lu et al. Sci Rep 7, 41328 (2017). https://doi.org/10.1038/srep41328).
2. Run a PCR method to obtain output values corresponding to the methylation rates of the CpG sites. Several known PCR methods are available which could be used for this purpose, such as methylation-sensitive loop-mediated isothermal amplification (MS-LAMP) as described for example in Hambalek et al. 2021 (ACS Sens. 24;6(9):3242-3252) or Zerilli et al. 2010 (Clin Chem. 56(8):1287-96); high resolution melting (HRM) as described for example in Wojdacz et al. 2008 (Nat Protoc 3, 1903-1908); or the OpenArray™ platform as described for example by Broccanello et al. (in Quantitative Real-Time PCR. Methods in Molecular Biology, vol 2065. Humana, New York, NY).
3. Derive the methylation rates computationally based on the PCR results using known methodology. This can involve several calibration steps and averaging over several samples to account for technical and between-sample variance.
4. Use a machine learning model in order to calculate a COVID-19 severity score (severity risk score) using the methylation rates of the n CpG sites. This may be performed for example using a logistic regression model as described herein, along with the regression coefficients associated with each CpG site and intercept, for example as provided in Table 8.
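Step 3 of the above protocol can be sketched, for illustration only, as follows. The sketch assumes that control samples of known methylation rate are run alongside the test sample and that a straight-line standard curve is adequate; both are simplifying assumptions made for the example and are not requirements of the protocol, which may use any known methodology. All signal values and function names are hypothetical.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + offset."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("control signals must not all be identical")
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sxx
    return slope, y_bar - slope * x_bar


def methylation_rate(replicate_signals, control_signals, control_rates):
    """Estimate a methylation rate from replicate PCR output values.

    Replicates are averaged to reduce technical variance, then mapped to a
    methylation rate through a standard curve fitted on control samples of
    known methylation rate. The result is clipped to the range 0..1.
    """
    slope, offset = fit_line(control_signals, control_rates)
    rate = slope * mean(replicate_signals) + offset
    return min(1.0, max(0.0, rate))


# Hypothetical controls (0 %, 50 % and 100 % methylated) and test replicates.
control_signals = [10.0, 20.0, 30.0]
control_rates = [0.0, 0.5, 1.0]
replicates = [14.0, 16.0, 15.0]

rate = methylation_rate(replicates, control_signals, control_rates)
print(rate)
```

The methylation rates derived in this way for each of the n CpG sites are then the inputs to the model used in step 4.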
MS-LAMP, OpenArray and HRM as mentioned above are preferred PCR methods in respect of the present invention. LAMP provides a simple workflow for detecting methylated CpG dinucleotides in synthetic and genomic DNA samples using methylation-sensitive restriction enzyme digestion followed by loop-mediated isothermal amplification. OpenArray technology is one of the most high-throughput qPCR platforms, which uses a microscope slide-sized plate with through-holes which retain reaction mixtures via surface tension. Methylation-sensitive high-resolution melting (MS-HRM) is based on the comparison of the melting profiles of PCR products from unknown samples with profiles specific for PCR products derived from methylated and unmethylated control DNAs. The protocol consists of PCR amplification of bisulfite-modified DNA with primers designed to proportionally amplify both methylated and unmethylated templates and subsequent high-resolution melting analysis of the PCR product. MS-HRM allows in-tube determination of the methylation status of the locus of interest following sodium bisulfite modification of template DNA during a short time period.
The term "complementary" or “targeted” as used herein can refer to the hybridization or base pairing between nucleotides or nucleic acids (e.g. between probes), such as, for instance, between the two strands of a double stranded DNA molecule or between an oligonucleotide primer or probe and a primer or probe binding site on a single stranded nucleic acid to be sequenced or amplified.
In another embodiment, the target CpG site in a sample may be detected or identified by using an oligonucleotide probe which is labeled only when hybridized to its target sequence, i.e. the probe may be selectively labeled. Conveniently, selective labeling may be achieved using labeled nucleotides, i.e. by incorporation into the oligonucleotide probe of a nucleotide carrying a label. In other words, selective labeling may occur by chain extension of the oligonucleotide probe using a polymerase enzyme which incorporates a labeled nucleotide, preferably a labeled dideoxynucleotide (e.g. ddATP, ddCTP, ddGTP, ddTTP, ddUTP). This approach to the detection of specific nucleotide sequences is sometimes referred to as primer extension analysis. Suitable primer extension analysis techniques are well known to the skilled person, e.g. those techniques disclosed in WO99/50448, the contents of which are incorporated herein by reference.
Modifications of the basic PCR method such as qPCR (Real Time PCR) have been developed that can provide quantitative information on the template being amplified. Numerous approaches have been taken although the two most common techniques use double-stranded DNA binding fluorescent dyes or selective fluorescent reporter probes.
Fluorescent reporter probes used in qPCR may be sequence specific oligonucleotides, typically RNA or DNA, that have a fluorescent reporter molecule at one end and a quencher molecule at the other (e.g. the reporter molecule is at the 5' end and a quencher molecule at the 3' end or vice versa). The probe is designed so that the reporter is quenched by the quencher. The probe is also designed to hybridize selectively to particular regions of complementary sequence which might be in the template. If these regions are between the annealed PCR primers the polymerase, if it has exonuclease activity, will degrade (depolymerise) the bound probe as it extends the nascent nucleic acid chain it is polymerising. This will relieve the quenching and fluorescence will rise. Accordingly, by measuring fluorescence after every PCR cycle, the relative amount of amplification product can be monitored in real time. Through the use of internal standard and controls, this information can be translated into quantitative data.
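As an illustration of how per-cycle fluorescence readings might be turned into quantitative data, the following minimal sketch estimates a threshold cycle by linear interpolation and derives a relative quantity against a reference reaction. The function names, the fixed threshold and the assumed amplification efficiency of 2.0 (perfect doubling per cycle) are illustrative assumptions and are not taken from the present disclosure.

```python
from typing import Optional, Sequence


def threshold_cycle(fluorescence: Sequence[float], threshold: float) -> Optional[float]:
    """Return the fractional cycle at which fluorescence first reaches the threshold.

    fluorescence[i] is the reading taken after cycle i + 1.
    Returns None if the threshold is never reached.
    """
    for i, current in enumerate(fluorescence):
        if current >= threshold:
            if i == 0:
                return 1.0  # already above threshold after the first cycle
            previous = fluorescence[i - 1]
            # previous < threshold <= current, so the denominator is positive.
            # Cycle numbers are 1-based: previous belongs to cycle i, current to cycle i + 1.
            return i + (threshold - previous) / (current - previous)
    return None


def relative_quantity(ct_sample: float, ct_reference: float, efficiency: float = 2.0) -> float:
    """Relative template amount of sample versus reference.

    Assumes both reactions amplify with the same efficiency (hypothetical default 2.0).
    """
    return efficiency ** (ct_reference - ct_sample)
```

A lower threshold cycle indicates more starting template; in practice the threshold and efficiency would be established from the internal standards and controls mentioned above.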
The amplification product may be detected, and amounts (levels) of the amplification product can be determined by any convenient means. A vast number of techniques are routinely employed as standard laboratory techniques and the literature has descriptions of more specialized approaches. At its most simple the amplification product may be detected by visual inspection of the reaction mixture at the end of the reaction or at a desired time point. Typically the amplification product will be resolved with the aid of a label that may be preferentially bound to the amplification product. Typically a dye substance, e.g. a colorimetric, chromomeric fluorescent or luminescent dye (for instance ethidium bromide or SYBR green) is used. In other embodiments a labeled oligonucleotide probe that preferentially binds the amplification product is used.
In some embodiments, the relative abundance of the methylated or unmethylated CpG site in association with (e.g. physical association with or in complex with) the probe is determined. Thus, in some embodiments the level of a complex of the methylated or unmethylated CpG site and the probe used to detect the methylated or unmethylated CpG site is determined. In some embodiments the level of a methylated or unmethylated CpG site in association with (e.g. in complex with) a primer (or extended primer) or probe (e.g. fluorescent reporter probe) or dye or the like may be determined.
DNA methylation of the CpG sites can be measured using various approaches, which range from commercial array platforms (e.g. from Illumina™) to sequencing approaches of individual genes. This includes standard lab techniques or array platforms. For a review of some methylation detection methods, see, Oakeley, E. J., Pharmacology & Therapeutics 84:389-400 (1999).
Available methods of measuring the DNA methylation levels of CpG sites include, but are not limited to: methylation-sensitive sequencing, a microarray-based method (e.g. using an Illumina microarray such as an Illumina 450k array or Illumina Infinium Methylation EPIC Kit), a PCR-based method, high resolution melting (HRM), OpenArray or LAMP, reverse-phase HPLC, thin-layer chromatography, SssI methyltransferases with incorporation of labeled methyl groups, the chloroacetaldehyde reaction, differentially sensitive restriction enzymes, hydrazine or permanganate treatment (m5C is cleaved by permanganate treatment but not by hydrazine treatment), combined bisulphite-restriction analysis, methylation sensitive single nucleotide probe extension, methylation-sensitive single-strand conformation analysis (MS-SSCA), methylation-sensitive single-nucleotide primer extension (MS-SnuPE), base-specific cleavage/MALDI-TOF, Combined Bisulfite Restriction Analysis (COBRA), methylated DNA immunoprecipitation (MeDIP), pyrosequencing or bisulfite sequencing. For example, measuring a methylation level can comprise performing array-based PCR (e.g., digital PCR), targeted multiplex PCR, or direct sequencing without bisulfite treatment (e.g., via a nanopore technology). In some aspects, determining methylation status comprises methylation specific PCR, real-time methylation specific PCR, quantitative methylation specific PCR (QMSP), or bisulfite sequencing. In certain aspects, a method according to the embodiments comprises treating DNA in or from a sample with bisulfite (e.g., sodium bisulfite) to convert unmethylated cytosines of CpG dinucleotides to uracil.
In more detail, the following assays can also be used to measure DNA methylation levels:
a) Molecular break light assay for DNA adenine methyltransferase activity is an assay that is based on the specificity of the restriction enzyme DpnI for fully methylated (adenine methylation) GATC sites in an oligonucleotide labeled with a fluorophore and quencher. The adenine methyltransferase methylates the oligonucleotide, making it a substrate for DpnI. Cutting of the oligonucleotide by DpnI gives rise to a fluorescence increase.
b) Methylation-Specific Polymerase Chain Reaction (PCR) is based on a chemical reaction of sodium bisulfite with DNA that converts unmethylated cytosines of CpG dinucleotides to uracil or UpG, followed by traditional PCR. However, methylated cytosines will not be converted in this process, and thus probes are designed to overlap the CpG site of interest, which allows one to determine methylation status as methylated or unmethylated. The beta value can be calculated as the proportion of methylation.
c) Whole genome bisulfite sequencing, also known as BS-Seq, is a genome-wide analysis of DNA methylation. It is based on the sodium bisulfite conversion of genomic DNA, which is then sequenced on a Next-Generation Sequencing (NGS) platform. The sequences obtained are then re-aligned to the reference genome to determine methylation states of CpG dinucleotides based on mismatches resulting from the conversion of unmethylated cytosines into uracil.
d) The HpaII tiny fragment Enrichment by Ligation-mediated PCR (HELP) assay is based on restriction enzymes' differential ability to recognize and cleave methylated and unmethylated CpG DNA sites.
e) Methyl Sensitive Southern Blotting is similar to the HELP assay but uses Southern blotting techniques to probe gene-specific differences in methylation using restriction digests. This technique is used to evaluate local methylation near the binding site for the probe.
f) ChIP-on-chip assay is based on the ability of commercially prepared antibodies to bind to DNA methylation-associated proteins like MeCP2.
g) Restriction landmark genomic scanning is a complicated and now rarely used assay based upon restriction enzymes' differential recognition of methylated and unmethylated CpG sites. This assay is similar in concept to the HELP assay.
h) Methylated DNA immunoprecipitation (MeDIP) is analogous to chromatin immunoprecipitation. Immunoprecipitation is used to isolate methylated DNA fragments for input into DNA detection methods such as DNA microarrays (MeDIP-chip) or DNA sequencing (MeDIP-seq).
i) Pyrosequencing of bisulfite treated DNA is the sequencing of an amplicon generated by PCR of the gene of choice using a normal forward primer (or probe) and a biotinylated reverse primer (or probe). The Pyrosequencer then analyses the sample by denaturing the DNA and adding one nucleotide at a time to the mix according to a sequence given by the user. If there is a mismatch, it is recorded and the percentage of DNA for which the mismatch is present is noted. This gives the user a percentage methylation per CpG island.
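By way of illustration only, the read-out of the bisulfite-based assays above can be sketched as follows: after bisulfite conversion and amplification, an unmethylated cytosine is read as T and a methylated cytosine is still read as C, so the methylation level at a CpG site can be estimated as the fraction of informative reads showing C. The function name and the input format (a collection of forward-strand base calls at the cytosine position) are assumptions made for this sketch and do not describe any particular instrument or software.

```python
from typing import Iterable, Optional


def methylation_level(base_calls: Iterable[str]) -> Optional[float]:
    """Estimate the methylation level (0 to 1) at one CpG cytosine.

    base_calls holds the base observed at the cytosine position in each
    bisulfite-converted read: 'C' means protected (methylated), 'T' means
    converted (unmethylated). Other calls are treated as uninformative.
    Returns None when no informative read is available.
    """
    methylated = 0
    unmethylated = 0
    for base in base_calls:
        call = base.upper()
        if call == "C":
            methylated += 1
        elif call == "T":
            unmethylated += 1
    informative = methylated + unmethylated
    if informative == 0:
        return None
    return methylated / informative
```

For example, three C calls and one T call at a site would give an estimated methylation level of 0.75.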
In certain embodiments of the invention, the DNA (e.g. genomic DNA) is hybridized to a complementary sequence (e.g. a synthetic polynucleotide sequence) that is coupled to a matrix (e.g. one disposed within a microarray). Optionally, the DNA (e.g. genomic DNA) is transformed from its natural state via amplification by a polymerase chain reaction process. For example, prior to or concurrent with hybridization to an array, the sample may be amplified by a variety of mechanisms, some of which may employ PCR. See, for example, PCR Technology: Principles and Applications for DNA Amplification (Ed. H. A. Erlich, Freeman Press, NY, N.Y., 1992); PCR Protocols: A Guide to Methods and Applications (Eds. Innis, et al, Academic Press, San Diego, Calif, 1990); Mattila et al, Nucleic Acids Res. 19, 4967 (1991); Eckert et al, PCR Methods and Applications 1, 17 (1991); PCR (Eds. McPherson et al, IRL Press, Oxford). The sample may be amplified on the array.
Any appropriate statistical approach can be used to relate the methylation levels to an indication of (the presence or absence of) susceptibility to severe COVID-19, e.g. a weighted sum of the methylation levels can be applied to a logistic function as described herein. Using conventional regression model/analysis tools and methodologies known in the art, a number of diagnostic prediction models are contemplated for use with specific DNA samples (e.g. genomic DNA samples) and/or specific analysis techniques and/or specific individual populations.
In embodiments, a logistic regression model may predict (the presence or absence of) susceptibility to severe COVID-19 based on a weighted sum of the methylation levels optionally plus an offset (or regression intercept). To identify the weights for the weighted sum, one can use the regression coefficients of a regression model.
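The weighted sum and logistic function described above can be sketched as follows. This is a minimal illustration only: the CpG identifiers, coefficient values and intercept used by a caller would be placeholders unless taken from the Tables recited herein, and the function name is hypothetical.

```python
import math
from typing import Mapping


def severity_score(
    methylation: Mapping[str, float],
    coefficients: Mapping[str, float],
    intercept: float = 0.0,
) -> float:
    """Apply a logistic function to a weighted sum of methylation levels.

    methylation maps CpG site identifiers to measured levels (e.g. beta values).
    coefficients maps the same identifiers to regression weights.
    Returns a score between 0 and 1.
    """
    missing = [site for site in coefficients if site not in methylation]
    if missing:
        raise KeyError(f"no methylation level for CpG site(s): {missing}")

    linear = intercept + sum(
        weight * methylation[site] for site, weight in coefficients.items()
    )

    # Numerically stable logistic function for both signs of the linear term.
    if linear >= 0:
        return 1.0 / (1.0 + math.exp(-linear))
    exp_term = math.exp(linear)
    return exp_term / (1.0 + exp_term)
```

A score produced in this way would then be compared with a chosen cut-off to indicate the presence or absence of susceptibility; the cut-off itself is outside the scope of this sketch.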
The coefficient values (weights) can be tailored to the subject being analyzed. For example, if a model is applied to male subjects only, then one set of coefficients can be used. Alternatively, if a model is applied exclusively to obese subjects, another set of coefficients can be used. Alternatively, coefficients can be fixed, for example, when a model is broadly applied to a heterogeneous group of subjects, e.g. the selection of weights provided in Tables of CpG sites recited herein.
Coefficient values (weights) in various models can also reflect the specific assay that is used to measure the methylation levels. Different machines may give different methylation values, which are closer to or farther away from the true methylation values. The coefficients may change when the model is re-trained for another machine. For example, for beta values measured on Illumina™ methylation microarray platforms there can be one set of coefficients (weights), while for other methylation measures (e.g. using sequencing technology) there can be another set of coefficients (weights) etc. Other values may also be used instead, such as M values (transformed versions of beta values). The methylation levels are preferably measured using an Illumina 450k array or Illumina Infinium Methylation EPIC Kit, or an array of similar quality.
In addition to using art accepted modeling techniques (e.g. regression analyses), embodiments of the invention can include a variety of art accepted technical processes. For example, in certain embodiments of the invention, a bisulfite conversion process is performed so that cytosine residues in the DNA (e.g. genomic DNA) are transformed to uracil, while 5-methylcytosine residues in the DNA (e.g. genomic DNA) are not transformed to uracil. Kits for DNA bisulfite modification are commercially available from, for example, Methyl Easy™ (Human Genetic Signatures™) and CpGenome™ Modification Kit (Chemicon™). See also WO04096825A1, which describes bisulfite modification methods, and Olek et al. Nuc. Acids Res. 24:5064-6 (1994), which discloses methods of performing bisulfite treatment and subsequent amplification. Bisulfite treatment allows the methylation status of cytosines to be detected by a variety of methods. For example, any method that may be used to detect a SNP may be used, for example, see Syvanen, Nature Rev. Gen. 2:930-942 (2001).
Methods such as single base extension (SBE) may be used or hybridization of sequence specific probes similar to allele specific hybridization methods. In another aspect the Molecular Inversion Probe (MIP) assay may be used.
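With regard to the M values mentioned above as transformed versions of beta values, the commonly used transformation is the base-2 logit, M = log2(beta / (1 - beta)). The following minimal sketch shows the conversion in both directions. The clipping constant eps is an assumption introduced here to avoid taking the logarithm of zero when a beta value is exactly 0 or 1; it is not part of the present disclosure.

```python
import math


def beta_to_m(beta: float, eps: float = 1e-6) -> float:
    """Convert a beta value (proportion methylated, 0 to 1) to an M value.

    The beta value is clipped to [eps, 1 - eps] before the transformation.
    """
    clipped = min(max(beta, eps), 1.0 - eps)
    return math.log2(clipped / (1.0 - clipped))


def m_to_beta(m_value: float) -> float:
    """Convert an M value back to a beta value."""
    power = 2.0 ** m_value
    return power / (power + 1.0)
```

A beta value of 0.5 corresponds to an M value of 0, and more strongly methylated sites give positive M values. A model trained on one scale would need coefficients re-estimated before being applied to the other.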
In another aspect, the invention provides a kit for screening for susceptibility to severe COVID-19 in a subject, said kit comprising: (i) probes (or other appropriate entities); (ii) an array of probes (or other appropriate entities); or (iii) a solid support (e.g. a chip) comprising probes (or other appropriate entities); for (or capable of) detecting (or measuring) the methylation levels of at least or at most 1, 2, 3, 4, 5, 7, 10, 15, 20, 25, 30, 40, 50, 60, 70, 75, 80, 90, 100, 125, 150, 200, 250, 300, 350, 400 or 433 CpG sites selected from the 433 CpG sites listed in Table 4 in DNA from a biological sample obtained from the subject.
In another aspect, the invention provides a kit for screening for susceptibility to severe COVID-19 in a subject, said kit comprising: (i) probes (or other appropriate entities); (ii) an array of probes (or other appropriate entities); or (iii) a solid support (e.g. a chip) comprising probes (or other appropriate entities); for (or capable of) detecting (or measuring) the methylation levels of at least or at most:
1, 2, 3, 4, 5, 7, 10, 15, 20, 25, 30, 40, 50, 60, 70, 75, 80, 90, 100, 125, 150, 200, 250, 300, or 337 CpG sites selected from the 337 CpG sites listed in Table 5; and/or
1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 12, 15, 20, 25, 30, 40, 50, 60, 70, 75, 80, 90, or 97 CpG sites selected from the 97 CpG sites listed in Table 3; in DNA from a biological sample obtained from the subject.
In another aspect, the invention provides a kit for screening for susceptibility to severe COVID-19 in a subject, said kit comprising: (i) probes (or other appropriate entities); (ii) an array of probes (or other appropriate entities); or (iii) a solid support (e.g. a chip) comprising probes (or other appropriate entities); for detecting (or measuring) the methylation levels of at least or at most:
1, 2, 3, 4, 5, 7, 10, 15, 20, 25, 30, 40, 50, 60, 70, 75, 80, 90, 100, 125, 150, 200, 250, 300, or 337 CpG sites selected from the 337 CpG sites listed in Table 5; in DNA from a biological sample obtained from the subject.
In another aspect, the invention provides a kit for screening for susceptibility to severe COVID-19 in a subject, said kit comprising: (i) probes (or other appropriate entities); (ii) an array of probes (or other appropriate entities); or (iii) a solid support (e.g. a chip) comprising probes (or other appropriate entities); for detecting (or measuring) the methylation levels of at least or at most 1, 2, 3, 4, 5, 7, 10, 15, 20, 25, 30, 40, 50, 60, 70, 75, 80, 90, 100, 125, 150, 200, 250, or 251 CpG sites selected from the 251 CpG sites listed in Table 2 in DNA from a biological sample obtained from the subject.
In another aspect, the invention provides a kit for screening for susceptibility to severe COVID-19 in a subject, said kit comprising: (i) probes (or other appropriate entities); (ii) an array of probes (or other appropriate entities); or (iii) a solid support (e.g. a chip) comprising probes (or other appropriate entities); for detecting (or measuring) the methylation levels of at least or at most 1, 2, 3, 4, 5, 7, 10, 15, 20, 25, 30, 40, 50, 60, 70, 75, 80, 90, 100, 125, 150, or 177 CpG sites selected from the 177 CpG sites listed in Table 1 in DNA from a biological sample obtained from the subject.
In another aspect, the invention provides a kit for screening for susceptibility to severe COVID-19 in a subject, said kit comprising: (i) probes (or other appropriate entities); (ii) an array of probes (or other appropriate entities); or (iii) a solid support (e.g. a chip) comprising probes (or other appropriate entities); for detecting (or measuring) the methylation levels of at least or at most 1, 2, 3, 4, 5, 7, 10, 15, 20, 25, 30, 40, 50, 60, 70, 75, 80, 90, or 91 CpG sites selected from the 91 CpG sites listed in Table 6 in DNA from a biological sample obtained from the subject.
In another aspect, the invention provides a kit for screening for susceptibility to severe COVID-19 in a subject, said kit comprising: (i) probes (or other appropriate entities); (ii) an array of probes (or other appropriate entities); or (iii) a solid support (e.g. a chip) comprising probes (or other appropriate entities); for detecting (or measuring) the methylation levels of at least or at most 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 12, 15, 20, 25, 30, 40, 50, 60, 70, 75, 80, 90, or 97 CpG sites selected from the 97 CpG sites listed in Table 3 in DNA from a biological sample obtained from the subject.
In another aspect, the invention provides a kit for screening for susceptibility to severe COVID-19 in a subject, said kit comprising: (i) probes (or other appropriate entities); (ii) an array of probes (or other appropriate entities); or (iii) a solid support (e.g. a chip) comprising probes (or other appropriate entities); for detecting (or measuring) the methylation levels of at least or at most 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 12, 15, 20, 25, 30, or 34 CpG sites selected from the 34 CpG sites listed in Table 7 in DNA from a biological sample obtained from the subject.
In one embodiment, the kit is used to determine (or the kit is suitable for determining) whether or not a subject has susceptibility to severe COVID-19 by utilizing measurements of methylation levels at specific CpG sites in cells derived from the biological sample, for example blood or saliva. Microfluidics devices can be applied to easily accessible tissues/fluids such as blood, buccal cells, or saliva. Optionally, the kit comprises a plurality of probes for amplifying DNA sequences (e.g. genomic DNA sequences) of the CpG sites (or bisulfite-treated forms of the CpG sites) in accordance with the invention as described elsewhere herein. Optionally, the kit comprises bisulfite or sodium bisulfite.
In embodiments, the kits as described above are for obtaining information useful to determine susceptibility (or the presence or absence of susceptibility) to severe COVID-19 in a subject, the kit comprising a plurality of probes (or other appropriate entities) specific for (or specifically targeted to) at least or at most the numbers and selections of CpG sites listed in any one of Tables 1 to 7 as described above, in DNA from a biological sample obtained from the subject.
In embodiments of any of the kits described above, the probes (or other appropriate entities) are for detecting (or measuring) the methylation levels of at least or at most 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127, 128, 129, 130, 131, 132, 133, 134, 135, 136, 137, 138, 139, 140, 141, 142, 143, 144, 145, 146, 147, 148, 149, 150, 151, 152, 153, 154, 155, 156, 157, 158, 159, 160, 161, 162, 163, 164, 165, 166, 167, 168, 169, 170, 171, 172, 173, 174, 175, 176, 177, 178, 179, 180, 181, 182, 183, 184, 185, 186, 187, 188, 189, 190, 191, 192, 193, 194, 195, 196, 197, 198, 199, 200, 201, 202, 203, 204, 205, 206, 207, 208, 209, 210, 211, 212, 213, 214, 215, 216, 217, 218, 219, 220, 221, 222, 223, 224, 225, 226, 227, 228, 229, 230, 231, 232, 233, 234, 235, 236, 237, 238, 239, 240, 241, 242, 243, 244, 245, 246, 247, 248, 249, 250, 251, 252, 253, 254, 255, 256, 257, 258, 259, 260, 261, 262, 263, 264, 265, 266, 267, 268, 269, 270, 271, 272, 273, 274, 275, 276, 277, 278, 279, 280, 281, 282, 283, 284, 285, 286, 287, 288, 289, 290, 291, 292, 293, 294, 295, 296, 297, 298, 299, 300, 301, 302, 303, 304, 305, 306, 307, 308, 309, 310, 311, 312, 313, 314, 315, 316, 317, 318, 319, 320, 321, 322, 323, 324, 325, 326, 327, 328, 329, 330, 331, 332, 333, 334, 335, 336, 337, 338, 339, 340, 341, 342, 343, 344, 345, 346, 347, 348, 349, 350, 351, 352, 353, 354, 355, 356, 357, 358, 359, 360, 361, 362, 363, 364, 365, 366, 367, 368, 369, 370, 371, 372, 373, 374, 375, 376, 377, 378, 379, 380, 381, 382, 383, 384, 385, 386, 387, 388, 389, 390, 391, 392, 393, 394, 395, 396, 397, 398, 399, 400, 401, 402, 403, 404, 405, 406, 407, 408, 409, 410, 411, 412, 413, 414, 415, 416, 417, 418, 419, 420, 421, 422, 423, 424, 425, 426, 427, 428, 429, 430, 431, 432, or 433 of the 433 CpG sites recited in any one of Tables 1 to 7 as described above.
In embodiments of any of the kits described above, the probes (or other appropriate entities) are for detecting (or measuring) the methylation levels of CpG site numbers: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127, 128, 129, 130, 131, 132, 133, 134, 135, 136, 137, 138, 139, 140, 141, 142, 143, 144, 145, 146, 147, 148, 149, 150, 151, 152, 153, 154, 155, 156, 157, 158, 159, 160, 161, 162, 163, 164, 165, 166, 167, 168, 169, 170, 171, 172, 173, 174, 175, 176, 177, 178, 179, 180, 181, 182, 183, 184, 185, 186, 187, 188, 189, 190, 191, 192, 193, 194, 195, 196, 197, 198, 199, 200, 201, 202, 203, 204, 205, 206, 207, 208, 209, 210, 211, 212, 213, 214, 215, 216, 217, 218, 219, 220, 221, 222, 223, 224, 225, 226, 227, 228, 229, 230, 231, 232, 233, 234, 235, 236, 237, 238, 239, 240, 241, 242, 243, 244, 245, 246, 247, 248, 249, 250, 251, 252, 253, 254, 255, 256, 257, 258, 259, 260, 261, 262, 263, 264, 265, 266, 267, 268, 269, 270, 271, 272, 273, 274, 275, 276, 277, 278, 279, 280, 281, 282, 283, 284, 285, 286, 287, 288, 289, 290, 291, 292, 293, 294, 295, 296, 297, 298, 299, 300, 301, 302, 303, 304, 305, 306, 307, 308, 309, 310, 311, 312, 313, 314, 315, 316, 317, 318, 319, 320, 321, 322, 323, 324, 325, 326, 327, 328, 329, 330, 331, 332, 333, 334, 335, 336, 337, 338, 339, 340, 341, 342, 343, 344, 345, 346, 347, 348, 349, 350, 351, 352, 353, 354, 355, 356, 357, 358, 359, 360, 361, 362, 363, 364, 365, 366, 367, 368, 369, 370, 371, 372, 373, 374, 375, 376, 377, 378, 379, 380, 381, 382, 383, 384, 385, 386, 387, 388, 389, 390, 391, 392, 393, 394, 395, 396, 397, 398, 399, 400, 401, 402, 403, 404, 405, 406, 407, 408, 409, 410, 411, 412, 413, 414, 415, 416, 417, 418, 419, 420, 421, 422, 423, 424, 425, 426, 427, 428, 429, 430, 431, 432, and/or 433 of the CpG sites recited in any one of Tables 1 to 7.
In embodiments, the kit is (or comprises) an array or microarray, or is in the form of an array or microarray. The term "array" or "microarray" as used herein refers to an intentionally created collection of molecules (e.g. probes or other appropriate entities) which can be prepared either synthetically or biosynthetically (e.g. Illumina™ HumanMethylation27 microarrays). The array can assume a variety of formats, for example, libraries of probes for targeting the desired CpG site sequences; or libraries of probes for targeting the desired CpG site sequences tethered to resin beads, silica chips, or other solid supports. DNA methylation microarrays commonly comprise tethered nucleic acid probes, for example the Illumina Infinium® HumanMethylation450 BeadChip.
The kits of the invention as described herein are specifically designed for the detection (or measurement) of the CpG sites of the invention as described elsewhere herein. In other words, said kits are for use in, or in accordance with, the methods of the invention as described elsewhere herein.
Thus, the probe (or other appropriate entity) component of said kits generally comprises (or consists of) a relatively small subset of probes (or other appropriate entities), e.g. a subset of probes for detecting (or measuring) at least or at most 1, 2, 3, 4, 5, 7, 10, 15, 20, 25, 30, 40, 50, 60, 70, 75, 80, 90, 100, 125, 150, 200, 250, 300, 350, 400 or 433 CpG sites selected from the 433 CpG sites listed in Table 4, but generally not exceeding more than or up to 433, 440, 450, 460, 470, 480, 490, or 500 different probes in total.
Thus, in other embodiments the present invention provides a kit for screening for susceptibility to severe COVID-19 in a subject, said kit comprising probes for detecting the methylation levels of at least or at most 1, 2, 3, 4, 5, 7, 10, 15, 20, 25, 30, 40, 50, 60, 70, 75, 80, 90, 100, 125, 150, 200, 250, 300, 350, 400 or 433 CpG sites selected from the 433 CpG sites listed in Table 4 in DNA from a biological sample obtained from the subject, optionally wherein the probe (or CpG probe) component of the kit consists of up to 433, 440, 450, 460, 470, 480, 490, or 500 different probes or consists of probes for detecting the methylation levels of up to 433, 440, 450, 460, 470, 480, 490, or 500 CpG sites.
Exemplary kits may for example only comprise probes (or other appropriate entities) for detecting (or measuring) up to the 433 CpG sites of Table 4. In other words, no other probes (or no forms or copies of the other entities) for detecting (or measuring) other CpG sites are present in these examples.
In embodiments, the probe (or other appropriate entity) component of said kits generally comprises (or consists of) a relatively small subset of probes (or other appropriate entities), e.g. a subset of probes for detecting (or measuring) at least or at most:
1, 2, 3, 4, 5, 7, 10, 15, 20, 25, 30, 40, 50, 60, 70, 75, 80, 90, 100, 125, 150, 200, 250, 300, or 337 CpG sites selected from the 337 CpG sites listed in Table 5; and/or
1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 12, 15, 20, 25, 30, 40, 50, 60, 70, 75, 80, 90, or 97 CpG sites selected from the 97 CpG sites listed in Table 3; but generally not exceeding more than or up to 433, 440, 450, 460, 470, 480, 490, or 500 different probes in total.
In other embodiments, the present invention provides a kit for screening for susceptibility to severe COVID-19 in a subject, said kit comprising probes for detecting the methylation levels of at least or at most:
1, 2, 3, 4, 5, 7, 10, 15, 20, 25, 30, 40, 50, 60, 70, 75, 80, 90, 100, 125, 150, 200, 250, 300, or 337 CpG sites selected from the 337 CpG sites listed in Table 5; and/or
1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 12, 15, 20, 25, 30, 40, 50, 60, 70, 75, 80, 90, or 97 CpG sites selected from the 97 CpG sites listed in Table 3; in DNA from a biological sample obtained from the subject, optionally wherein the probe (or CpG probe) component of the kit consists of up to 433, 440, 450, 460, 470, 480, 490, or 500 different probes or consists of probes for detecting the methylation levels of up to 433, 440, 450, 460, 470, 480, 490, or 500 CpG sites.
Exemplary kits may for example only comprise probes (or other appropriate entities) for detecting (or measuring) up to the 337 CpG sites listed in Table 5, and/or the 97 CpG sites listed in Table 3. In other words, no other probes (or no forms or copies of the other entities) for detecting (or measuring) other CpG sites are present in these examples.
In embodiments, the probe (or other appropriate entity) component of said kits generally comprises (or consists of) a relatively small subset of probes (or other appropriate entities), e.g. a subset of probes for detecting (or measuring) at least or at most:
1, 2, 3, 4, 5, 7, 10, 15, 20, 25, 30, 40, 50, 60, 70, 75, 80, 90, 100, 125, 150, 200, 250, 300, or 337 CpG sites selected from the 337 CpG sites listed in Table 5; but generally not exceeding more than or up to 337, 340, 350, 360, 370, 380, 390, 400, 410, 420, 430, 440, 450, 460, 470, 480, 490, or 500 different probes in total.
In other embodiments, the present invention provides a kit for screening for susceptibility to severe COVID-19 in a subject, said kit comprising probes for detecting the methylation levels of at least or at most:
1, 2, 3, 4, 5, 7, 10, 15, 20, 25, 30, 40, 50, 60, 70, 75, 80, 90, 100, 125, 150, 200, 250, 300, or 337 CpG sites selected from the 337 CpG sites listed in Table 5; in DNA from a biological sample obtained from the subject, optionally wherein the probe (or CpG probe) component of the kit consists of up to 337, 340, 350, 360, 370, 380, 390, 400, 410, 420, 430, 440, 450, 460, 470, 480, 490, or 500 different probes or consists of probes for detecting the methylation levels of up to 337, 340, 350, 360, 370, 380, 390, 400, 410, 420, 430, 440, 450, 460, 470, 480, 490, or 500 CpG sites.
Exemplary kits may for example only comprise probes (or other appropriate entities) for detecting (or measuring) up to the 337 CpG sites listed in Table 5. In other words, no other probes (or no forms or copies of the other entities) for detecting (or measuring) other CpG sites are present in these examples.
In embodiments, the probe (or other appropriate entity) component of said kits generally comprises (or consists of) a relatively small subset of probes (or other appropriate entities), e.g. a subset of probes for detecting (or measuring) at least or at most 1, 2, 3, 4, 5, 7, 10, 15, 20, 25, 30, 40, 50, 60, 70, 75, 80, 90, 100, 125, 150, 200, 250, or 251 CpG sites selected from the 251 CpG sites listed in Table 2, but generally not exceeding more than or up to 251, 260, 270, 280, 290, 300, 310, 320, 330, 340, 350, 360, 370, 380, 390, 400, 410, 420, 430, 440, 450, 460, 470, 480, 490, or 500 different probes in total.
In other embodiments, the present invention provides a kit for screening for susceptibility to severe COVID-19 in a subject, said kit comprising probes for detecting the methylation levels of at least or at most 1, 2, 3, 4, 5, 7, 10, 15, 20, 25, 30, 40, 50, 60, 70, 75, 80, 90, 100, 125, 150, 200, 250, or 251 CpG sites selected from the 251 CpG sites listed in Table 2 in DNA from a biological sample obtained from the subject, optionally wherein the probe (or CpG probe) component of the kit consists of up to 251, 260, 270, 280, 290, 300, 310, 320, 330, 340, 350, 360, 370, 380, 390, 400, 410, 420, 430, 440, 450, 460, 470, 480, 490, or 500 different probes or consists of probes for detecting the methylation levels of up to 251, 260, 270, 280, 290, 300, 310, 320, 330, 340, 350, 360, 370, 380, 390, 400, 410, 420, 430, 440, 450, 460, 470, 480, 490, or 500 CpG sites.
Exemplary kits may for example only comprise probes (or other appropriate entities) for detecting (or measuring) up to the 251 CpG sites of Table 2. In other words, no other probes (or no forms or copies of the other entities) for detecting (or measuring) other CpG sites are present in these examples.
In embodiments, the probe (or other appropriate entity) component of said kits generally comprises (or consists of) a relatively small subset of probes (or other appropriate entities), e.g. a subset of probes for detecting (or measuring) at least or at most 1, 2, 3, 4, 5, 7, 10, 15, 20, 25, 30, 40, 50, 60, 70, 75, 80, 90, 100, 125, 150, or 177 CpG sites selected from the 177 CpG sites listed in Table 1, but generally not exceeding more than or up to 177, 180, 190, 200, 210, 220, 230, 240, 250, 260, 270, 280, 290, 300, 310, 320, 330, 340, 350, 360, 370, 380, 390, 400, 410, 420, 430, 440, 450, 460, 470, 480, 490, or 500 different probes in total.
In other embodiments, the present invention provides a kit for screening for susceptibility to severe COVID-19 in a subject, said kit comprising probes for detecting the methylation levels of at least or at most 1, 2, 3, 4, 5, 7, 10, 15, 20, 25, 30, 40, 50, 60, 70, 75, 80, 90, 100, 125, 150, or 177 CpG sites selected from the 177 CpG sites listed in Table 1 in DNA from a biological sample obtained from the subject, optionally wherein the probe (or CpG probe) component of the kit consists of up to 177, 180, 190, 200, 210, 220, 230, 240, 250, 260, 270, 280, 290, 300, 310, 320, 330, 340, 350, 360, 370, 380, 390, 400, 410, 420, 430, 440, 450, 460, 470, 480, 490, or 500 different probes or consists of probes for detecting the methylation levels of up to 177, 180, 190, 200, 210, 220, 230, 240, 250, 260, 270, 280, 290, 300, 310, 320, 330, 340, 350, 360, 370, 380, 390, 400, 410, 420, 430, 440, 450, 460, 470, 480, 490, or 500 CpG sites.
Exemplary kits may for example only comprise probes (or other appropriate entities) for detecting (or measuring) up to the 177 CpG sites of Table 1. In other words, no other probes (or no forms or copies of the other entities) for detecting (or measuring) other CpG sites are present in these examples.
In embodiments, the probe (or other appropriate entity) component of said kits generally comprises (or consists of) a relatively small subset of probes (or other appropriate entities), e.g. a subset of probes for detecting (or measuring) at least or at most 1, 2, 3, 4, 5, 7, 10, 15, 20, 25, 30, 40, 50, 60, 70, 75, 80, 90, or 91 CpG sites selected from the 91 CpG sites listed in Table 6, but generally not exceeding more than or up to 91, 100, 110, 120, 130, 140, 150, 160, 170, 180, 190, 200, 210, 220, 230, 240, 250, 260, 270, 280, 290, 300, 310, 320, 330, 340, 350, 360, 370, 380, 390, 400, 410, 420, 430, 440, 450, 460, 470, 480, 490, or 500 different probes in total.
In other embodiments, the present invention provides a kit for screening for susceptibility to severe COVID-19 in a subject, said kit comprising probes for detecting the methylation levels of at least or at most 1, 2, 3, 4, 5, 7, 10, 15, 20, 25, 30, 40, 50, 60, 70, 75, 80, 90, or 91 CpG sites selected from the 91 CpG sites listed in Table 6 in DNA from a biological sample obtained from the subject, optionally wherein the probe (or CpG probe) component of the kit consists of up to 91, 100, 110, 120, 130, 140, 150, 160, 170, 180, 190, 200, 210, 220, 230, 240, 250, 260, 270, 280, 290, 300, 310, 320, 330, 340, 350, 360, 370, 380, 390, 400, 410, 420, 430, 440, 450, 460, 470, 480, 490, or 500 different probes or consists of probes for detecting the methylation levels of up to 91, 100, 110, 120, 130, 140, 150, 160, 170, 180, 190, 200, 210, 220, 230, 240, 250, 260, 270, 280, 290, 300, 310, 320, 330, 340, 350, 360, 370, 380, 390, 400, 410, 420, 430, 440, 450, 460, 470, 480, 490, or 500 CpG sites.
Exemplary kits may for example only comprise probes (or other appropriate entities) for detecting (or measuring) up to the 91 CpG sites listed in Table 6. In other words, no other probes (or no forms or copies of the other entities) for detecting (or measuring) other CpG sites are present in these examples.
In embodiments, the probe (or other appropriate entity) component of said kits generally comprises (or consists of) a relatively small subset of probes (or other appropriate entities), e.g. a subset of probes for detecting (or measuring) at least or at most 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 12, 15, 20, 25, 30, 40, 50, 60, 70, 75, 80, 90, or 97 CpG sites selected from the 97 CpG sites listed in Table 3, but generally not exceeding more than or up to 97, 100, 110, 120, 130, 140, 150, 160, 170, 180, 190, 200, 300, 400 or 500 different probes in total.
In other embodiments, the present invention provides a kit for screening for susceptibility to severe COVID-19 in a subject, said kit comprising probes for detecting the methylation levels of at least or at most 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 12, 15, 20, 25, 30, 40, 50, 60, 70, 75, 80, 90, or 97 CpG sites selected from the 97 CpG sites listed in Table 3 in DNA from a biological sample obtained from the subject, optionally wherein the probe (or CpG probe) component of the kit consists of up to 97, 100, 110, 120, 130, 140, 150, 160, 170, 180, 190, 200, 300, 400 or 500 different probes or consists of probes for detecting the methylation levels of up to 97, 100, 110, 120, 130, 140, 150, 160, 170, 180, 190, 200, 300, 400 or 500 CpG sites.
Exemplary kits may for example only comprise probes (or other appropriate entities) for detecting (or measuring) up to the 97 CpG sites listed in Table 3. In other words, no other probes (or no forms or copies of the other entities) for detecting (or measuring) other CpG sites are present in these examples.
In embodiments, a kit is provided for obtaining information useful to determine susceptibility (or the presence or absence of susceptibility) to severe COVID-19 in a subject, the kit comprising a plurality of probes (or other appropriate entities) specific for (or specifically targeted to) at least or at most 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 12, 15, 20, 25, 30, or 34 CpG sites selected from the 34 CpG sites listed in Table 7 in DNA from a biological sample obtained from the subject.
In embodiments, the probes (or other appropriate entities) are for detecting (or measuring) the methylation levels of at least or at most 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, or 34 CpG sites selected from the 34 CpG sites listed in Table 7. In preferred embodiments of the kits herein, the kit is a PCR kit (i.e. a kit suitable for performing PCR or a PCR-based method), more preferably wherein the probe (or CpG probe) component of the kit consists of any of the following range of different probes, or the probe (or CpG probe) component consists of probes for detecting the methylation levels of any of the following range of CpG sites:
2 to 15, 2 to 14, 2 to 13, 2 to 12, 2 to 11, 2 to 10, 2 to 9, 2 to 8, 2 to 7, 2 to 6, 2 to 5,
3 to 15, 4 to 15, 5 to 15, 6 to 15, 7 to 15, 8 to 15, 9 to 15, 10 to 15,
2 to 10, 3 to 10, 4 to 10, or 5 to 10 different probes or CpG sites.
Thus, in preferred embodiments the probe is a PCR primer (i.e. a probe suitable for use in PCR).
In alternative embodiments of the kits herein, the kit is a microarray kit (i.e. a kit suitable for performing a microarray-based method), more preferably wherein the probe (or CpG probe) component of the kit consists of any of the following range of different probes, or the probe (or CpG probe) component consists of probes for detecting the methylation levels of any of the following range of numbers of CpG sites:
10 to 500, 20 to 400, 30 to 300, or 50 to 200 different probes or CpG sites.
The kit may comprise (or further comprise) a label necessary for the detection of the probes (or for the detection of other appropriate entities), for example a selective label as described elsewhere herein. The selective labels may be for example labeled dideoxynucleotides (e.g. ddATP, ddCTP, ddGTP, ddTTP, ddUTP). Such dideoxynucleotides are used in chain extension of the oligonucleotide probe using a polymerase enzyme as described elsewhere herein. The kit may therefore comprise (or further comprise) a polymerase enzyme (e.g. a DNA polymerase enzyme). The kit may comprise (or further comprise) a reagent used in a DNA polymerization process, a DNA hybridization process, and/or a DNA bisulfite conversion process. The kit may comprise (or further comprise) instructions for carrying out the methods of the invention.
Where a probe (or other appropriate entity) is for detecting or measuring the methylation level of a CpG site, it is meant that the probe (or other appropriate entity) is targeted towards said CpG site and not other CpG sites, e.g. is selective for or specific for said CpG site.
Where the language “for detecting”, “for determining” or “for measuring” or similar is used herein, the terms “suitable for detecting”, “suitable for determining” and “suitable for measuring” or similar are also encompassed. Appropriate probes for use in the kits of the invention are described elsewhere herein but are conveniently nucleic acid probes. In embodiments the kit contains two different types of probe as described elsewhere herein.
In embodiments, the probes are attached to a solid support or a substrate.
The terms "solid support" and "substrate" as used herein are used interchangeably and refer to a material or group of materials having a rigid or semi-rigid surface or surfaces. In embodiments, at least one surface of the solid support will be substantially flat, although in some embodiments it may be desirable to physically separate synthesis regions for different compounds with, for example, wells, raised regions, pins, etched trenches, or the like. In embodiments, the solid support will take the form of beads, resins, gels, microspheres, or other geometric configurations.
In the kits of the invention, multiple probe types may be included for determining the methylation level of a given CpG site; for example, two probe types may be used, wherein the first probe enables detection of the methylated form of the CpG site (or, depending on the methylation detection protocol used, a derivative of said methylated form of said CpG site) and the second probe enables detection of the unmethylated form of the CpG site (or, depending on the methylation detection protocol used, a derivative of said unmethylated form of said CpG site, for example a bisulfite-treated or bisulfite-converted form of said CpG site).
In embodiments, the invention provides a panel of CpG sites in accordance with the invention as described elsewhere herein.
In embodiments, the invention provides a panel or set of biomarkers, said panel or set of biomarkers comprising (or consisting of) CpG sites in accordance with the invention as described elsewhere herein.
As used throughout the application, the terms "a" and "an" are used in the sense that they mean "at least one", "at least a first", "one or more" or "a plurality" of the referenced components or steps, except in instances wherein an upper limit is thereafter specifically stated.
In addition, where the terms “comprise”, “comprises”, “has” or “having”, or other equivalent terms are used herein, then in some more specific embodiments these terms include the term “consists of” or “consists essentially of”, or other equivalent terms. Methods comprising certain steps also include, where appropriate, methods consisting of these steps. Methods of determining the statistical significance of differences between test groups of subjects or differences in levels or values of a particular parameter are well known and documented in the art. For example, herein a decrease or increase is generally regarded as statistically significant if a statistical comparison using a significance test such as a Student t-test, Mann-Whitney U Rank-Sum test, chi-square test or Fisher's exact test, one-way ANOVA or two-way ANOVA tests as appropriate, shows a probability value of <0.05.
Further embodiments of the invention are provided in embodiments 1 to 23 below:
1. A method of screening for the susceptibility of a subject to severe COVID-19, the method comprising using the methylation levels of at least:
2, 3, 4, 5, 10, or 15 CpG sites selected from the 337 CpG sites listed in Table 5; and/or
2, 3, 4, 5, 10, or 15 CpG sites selected from the 97 CpG sites listed in Table 3; in DNA from a biological sample obtained from the subject, wherein said methylation levels are used to provide an indication of the susceptibility of the subject to severe COVID-19.
2. The method of embodiment 1, the method comprising using the methylation levels of at least 2, 3, 4, 5, 10, or 15 CpG sites selected from the 251 CpG sites listed in Table 2 and/or at least 2, 3, 4, 5, 10, or 15 CpG sites selected from the 177 CpG sites listed in Table 1.
3. The method of embodiment 1 or embodiment 2, the method comprising using the methylation levels of at least 2, 3, 4, 5, 10, or 15 CpG sites selected from the 91 CpG sites listed in Table 6.
4. The method of embodiment 1, the method comprising using the methylation levels of at least 2, 3, 4, 5, 10, or 15 CpG sites selected from the 97 CpG sites listed in Table 3, preferably selected from the 34 CpG sites listed in Table 7.
5. The method of any one of the preceding embodiments, comprising calculating a likelihood of the subject developing severe COVID-19 as a function of said methylation levels.
6. The method of embodiment 5, comprising calculating the likelihood as a function of a linear combination of said methylation levels, preferably wherein the linear combination of said methylation levels comprises a weighted sum of said methylation levels.
7. The method of embodiment 5 or embodiment 6, comprising calculating the likelihood as a logistic function of a linear combination of said methylation levels.
8. The method of any one of embodiments 5 to 7, comprising receiving data representative of said methylation levels, and inputting the data to an algorithm for evaluating said function to determine the likelihood of the subject developing severe COVID-19.
9. The method of any one of the preceding embodiments, further comprising making a prognosis that the subject will develop severe COVID-19 based on the methylation levels referred to in any one of the preceding embodiments and/or the likelihood referred to in any one of embodiments 5 to 8, optionally by comparing the methylation levels or likelihood with a cutoff value.
10. The method of any one of the preceding embodiments, wherein said subject is suspected of being susceptible to severe COVID-19, or has one or more risk factors associated with the development of severe COVID-19.
11. The method of any one of the preceding embodiments, wherein the biological sample is a blood sample, or a white blood cell sample.
12. The method of any one of the preceding embodiments, further comprising reporting the results of the method, optionally by preparing a written or electronic report.
13. The method of any one of the preceding embodiments, wherein the method is implemented by a computer.
14. The method of any one of the preceding embodiments, wherein said method comprises a step of measuring the methylation levels before the step of using the methylation levels.
15. A computer program comprising instructions that, when executed by a processing system, cause the processing system to process data representative of methylation levels of at least:
2, 3, 4, 5, 10, or 15 CpG sites selected from the 337 CpG sites listed in Table 5; and/or
2, 3, 4, 5, 10, or 15 CpG sites selected from the 97 CpG sites listed in Table 3; in DNA from a biological sample obtained from a subject, to calculate a likelihood of the subject developing severe COVID-19.
16. The computer program of embodiment 15, comprising instructions that, when executed by a processing system, cause the processing system to process data representative of methylation levels of at least:
2, 3, 4, 5, 10, or 15 CpG sites selected from the 251 CpG sites listed in Table 2; and/or
2, 3, 4, 5, 10, or 15 CpG sites selected from the 177 CpG sites listed in Table 1.
17. The computer program of embodiment 15 or embodiment 16, comprising instructions that, when executed by a processing system, cause the processing system to process data representative of methylation levels of at least 2, 3, 4, 5, 10, or 15 CpG sites selected from the 91 CpG sites listed in Table 6.
18. The computer program of embodiment 17, comprising instructions that, when executed by a processing system, cause the processing system to process data representative of methylation levels of at least 2, 3, 4, 5, 10, or 15 CpG sites selected from the 97 CpG sites listed in Table 3, preferably selected from the 34 CpG sites listed in Table 7.
19. A kit for screening for susceptibility of a subject to severe COVID-19, said kit comprising probes for detecting the methylation levels of at least:
2, 3, 4, 5, 10, or 15 CpG sites selected from the 337 CpG sites listed in Table 5; and/or
2, 3, 4, 5, 10, or 15 CpG sites selected from the 97 CpG sites listed in Table 3; in DNA from a biological sample obtained from the subject, wherein the CpG probe component of the kit consists of probes for detecting the methylation levels of up to 500 CpG sites.
20. The kit of embodiment 19, said kit comprising probes for detecting the methylation levels of at least:
2, 3, 4, 5, 10, or 15 CpG sites selected from the 251 CpG sites listed in Table 2; and/or
2, 3, 4, 5, 10, or 15 CpG sites selected from the 177 CpG sites listed in Table 1.
21. The kit of embodiment 19 or embodiment 20, said kit comprising probes for detecting the methylation levels of at least 2, 3, 4, 5, 10, or 15 CpG sites selected from the 91 CpG sites listed in both of Tables 1 and 2.
22. The kit of embodiment 19, said kit comprising probes for detecting the methylation levels of at least 2, 3, 4, 5, 10, or 15 CpG sites selected from the 97 CpG sites listed in Table 3, preferably selected from the 34 CpG sites listed in Table 7.
23. The method, computer program or kit of any one of the preceding embodiments, wherein the methylation levels have been obtained using a PCR-based method.
The invention will be further described with reference to the following non-limiting Examples with reference to the following drawings in which:
Figure 1 provides a Venn diagram showing the overlap between the three lists of CpG sites of Tables 1 to 3.
Figure 2 provides a flowchart for the feature selection (CpG site filtering) process.
Figure 3 provides a ROC curve for a model using all 177 CpG sites together from the list of Table 1.
Figure 4 provides a confusion matrix for a model using all 177 CpG sites together from the list of Table 1 (the same model as in Figure 3).
Figure 5 provides a ROC curve for a model using 5 CpG sites selected from the list of 97 CpG sites of Table 3.
Figure 6 provides a confusion matrix for a model using 5 CpG sites selected from the list of 97 CpG sites of Table 3 (the same model as in Figure 5).
Figure 7 provides model performance depending on the number of CpG sites used in the model, where the CpG sites are selected from the list of 177 CpG sites of Table 1. Each box-whisker (each bar) shows the results of 300 tested combinations (models). The box is drawn from the first quartile (Q1) to the third quartile (Q3) with a horizontal line drawn in the middle to denote the median. The end of the lower whisker is the minimum performance value for the given number of CpG sites (i.e. the value for the combination of CpG sites (model) which gave the poorest performance), and the end of the upper whisker is the maximum value (i.e. the value for the combination of CpG sites (model) which gave the best performance).
Figure 8 provides model performance depending on the number of CpG sites used in the model, where the CpG sites are selected from the list of 251 CpG sites of Table 2. Each box-whisker (each bar) shows the results of 300 tested combinations, and is used as described for Figure 7.
Figure 9 provides model performance depending on the number of CpG sites used in the model, where the CpG sites are selected from the list of 97 CpG sites of Table 3. Each box-whisker (each bar) shows the results of 300 tested combinations, and is used as described for Figures 7 and 8.
EXAMPLES
Production of methylation level data sets
For an overview of the cohort used for construction and validation, see Table 9.
Table 9 - Cohort information detailing number of cases in each data set. Sex and age information are also supplied.
Characteristic   Dev, N = 143   Holdout, N = 371   Training, N = 427   p-value
Sex                                                                    0.042
  Female         65 (45%)       166 (45%)          227 (53%)
  Male           78 (55%)       205 (55%)          200 (47%)
Age              43 (34, 51)    51 (41, 62)        44 (35, 53)         <0.001
Severity         83 (58%)       199 (54%)          252 (59%)           0.3
Study design and participants
We used epigenetic profiles and clinical covariates drawn from a US-American cohort comprising 525 samples (American cohort), a Spanish cohort comprising 407 samples (Spanish cohort), and from 870 samples collected from our own cohort (Norwegian cohort). All samples were EDTA blood samples (so white blood cell data). In all data sets, the DNA methylation rates of over 850,000 sites were measured using the Illumina Infinium methylation EPIC array. All patients in all cohorts were tested for SARS-CoV-2 using PCR assays run on sample material collected from nasopharyngeal swabs. The data apart from the Norwegian cohort is publicly available. The data from the Norwegian cohort is available upon reasonable request.
We retained only patients that tested positive for SARS-CoV-2 for whom age, sex and hospitalization information was also available, resulting in 163 samples used from the American cohort, all 407 samples from the Spanish cohort, and 371 samples from the Norwegian cohort.
Technical solution
The technical solution is a test for diagnosing COVID-19 severity by reading the DNA methylation level in white blood cells at specific CpG sites, and combining these values into a score using a mathematical formula that classifies into either severe or non-severe COVID-19. While the formula is developed for binary classification, its readout is a continuous score from 0 (mild case) to 1 (severe), which can be understood and implemented as a severity risk score. This allows for additional, more granular categories to be introduced in a clinical context.
The mathematical formula is in the form of a multiple logistic regression such that
$$p = \frac{1}{1 + b^{-(\beta_0 + \beta_1 x_1 + \dots + \beta_m x_m)}}$$
where
o $p$ is the probability of severe COVID-19 progression (a value between 0 and 1)
o $b$ is $e$ (the base of the natural logarithm)
o $\beta_0$ is the regression intercept
o $\beta_1$ is the weight for CpG-site 1
o $x_1$ is the methylation level of CpG-site 1
o $\beta_m$ is the weight for CpG-site m
o $x_m$ is the methylation level of CpG-site m
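The logistic regression described above can be illustrated with a minimal sketch. The function name, the coefficient values and the methylation levels below are hypothetical and chosen only for illustration; they are not the coefficients of Table 1 or Table 8. The 0.5 cutoff is the one the Examples use for severity classification.

```python
import math


def severity_probability(intercept, weights, methylation_levels, base=math.e):
    """Return p = 1 / (1 + base ** -(intercept + sum(w_i * x_i)))."""
    if len(weights) != len(methylation_levels):
        raise ValueError("one weight is needed per CpG site")
    # Linear combination: intercept plus the weighted sum of methylation levels.
    linear = intercept + sum(w * x for w, x in zip(weights, methylation_levels))
    return 1.0 / (1.0 + base ** (-linear))


# Hypothetical intercept, weights and methylation levels for three CpG sites.
example_p = severity_probability(-1.0, [2.0, -1.5, 0.5], [0.8, 0.2, 0.6])

# Binary call using a cutoff of 0.5.
is_severe = example_p >= 0.5
```

Because the readout is continuous between 0 and 1, the same value could also be binned into more than two risk categories instead of being compared with a single cutoff.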
A support vector machine (svm) model can optionally be fitted on an independent development data set drawn from the same or another cohort as the training data (the dev data) using predictions of the logistic regression (elastic net) model and patient age and/or sex as input to predict severity.
Another instance of such a technical solution could involve a random forest classification model in place of the logistic regression, optionally followed by a feedforward neural network in place of the svm.
Two key steps going into developing the technical solution are first, a feature selection step and second, training and validating machine learning models to arrive at the technical solution. Both are outlined below.
Feature selection
The filtering process made use of several steps to ascertain differential methylation, similar behavior in both training data sets, and usefulness in a lab setting (see Figure 2, with the Steps below indicated in brackets).
Step 1, overlap between data sets: any CpG sites not present in all three used data sets were discarded.
Step 2, differential methylation analysis: this was performed in both training data sets separately. Only sites found to be significantly differentially methylated in both training data sets were retained.
Step 3, conformity between data sets: the sign of the differential methylation had to be equal in both data sets.
Step 4, distribution dissimilarity analysis: the means of the methylation ratio distributions for non-severe cases in both data sets were required to be similar.
Step 5, overlap between data sets: only CpG sites passing steps 1, 2 and 3 in both data sets were kept.
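The filtering steps above can be sketched as a simple per-site filter. This is only an illustration under stated assumptions: the `SiteStats` record, the site identifiers, the significance threshold `alpha` and the tolerance `max_mean_gap` are all hypothetical, and the source does not state which statistical test or thresholds were actually used.

```python
from dataclasses import dataclass


@dataclass
class SiteStats:
    effect: float          # severe minus non-severe mean methylation (assumed definition)
    p_value: float         # from a differential methylation test (test not specified)
    nonsevere_mean: float  # mean methylation ratio among non-severe cases


def filter_sites(train_a, train_b, holdout_sites, alpha=0.05, max_mean_gap=0.05):
    """Return site IDs that pass all checks in both training data sets."""
    kept = []
    # Step 1: keep only sites present in all three data sets.
    for site in sorted(set(train_a) & set(train_b) & set(holdout_sites)):
        a, b = train_a[site], train_b[site]
        # Step 2: significantly differentially methylated in both training sets.
        if a.p_value >= alpha or b.p_value >= alpha:
            continue
        # Step 3: the sign of the differential methylation must agree.
        if (a.effect > 0) != (b.effect > 0):
            continue
        # Step 4: non-severe means must be similar (tolerance is an assumption).
        if abs(a.nonsevere_mean - b.nonsevere_mean) > max_mean_gap:
            continue
        # Step 5: the site passed the checks in both data sets.
        kept.append(site)
    return kept


# Hypothetical per-site statistics for two training data sets.
train_a = {
    "site_a": SiteStats(0.10, 0.001, 0.40),
    "site_b": SiteStats(0.08, 0.200, 0.50),   # not significant in set A
    "site_c": SiteStats(-0.05, 0.010, 0.30),  # sign disagrees with set B
    "site_d": SiteStats(0.07, 0.010, 0.30),   # non-severe means differ
    "site_e": SiteStats(0.09, 0.001, 0.45),   # missing from set B
}
train_b = {
    "site_a": SiteStats(0.12, 0.004, 0.42),
    "site_b": SiteStats(0.09, 0.010, 0.51),
    "site_c": SiteStats(0.04, 0.020, 0.31),
    "site_d": SiteStats(0.06, 0.020, 0.60),
}
holdout_sites = {"site_a", "site_b", "site_c", "site_d", "site_e"}
kept_sites = filter_sites(train_a, train_b, holdout_sites)
```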
Model training and validation
The American and Spanish cohorts were split so that 75% of the COVID-19-positive cases formed the training data set for the initial regression modeling (which determines the used CpG sites) while the remaining 25% were used to form a development (dev) set used to fit support vector machine models taking the output of the regression models and patient age as input. Both regression and svm models use severity/hospitalization as the dependent variable to be predicted. Holdout and dev set identities were kept consistent across all models. The entire Norwegian cohort was retained as a fully independent holdout data set used exclusively for testing purposes.
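As a rough illustration of the 75%/25% split described above, the sketch below shuffles sample identifiers reproducibly and cuts them into a training and a dev set. The function name, the seed and the use of a plain (unstratified) shuffle are assumptions; the source does not state how the split was drawn.

```python
import random


def split_train_dev(sample_ids, train_fraction=0.75, seed=0):
    """Shuffle the IDs reproducibly and split them into (training, dev)."""
    ids = sorted(sample_ids)
    random.Random(seed).shuffle(ids)
    # Truncate so that any fractional sample falls into the dev set.
    n_train = int(len(ids) * train_fraction)
    return ids[:n_train], ids[n_train:]


# 163 American + 407 Spanish samples = 570 COVID-19-positive cases.
training_ids, dev_ids = split_train_dev(range(570))
```

With 570 samples this yields 427 training and 143 dev samples, which agrees with the group sizes reported in Table 9.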
Using all CpG sites retained after step 2 of the feature selection process outlined above, an elastic net model was trained to arrive at a severity prediction formula optimized for microarray data. 177 CpG sites were used by the resulting model (Table 1). The predictions of the svm model form the final COVID-19 severity prediction. We then performed stability selection (Ref 9) to determine CpG sites useful for other potential elastic net models. 251 sites were identified as relevant for an array-optimized model using stability selection (Table 2). The union of the 177 CpG sites used by the most performant model and the 251 sites determined by stability selection are sites (337 sites in total - Table 5) which we have determined as relevant for the invention. Of the list of 177 sites and list of 251 sites, there are 91 sites common to both lists (Table 6).
To further develop a model optimized for PCR (characterized by being less expensive and faster than a microarray but having lower resolution and requiring manual steps for each site), logistic regression models using combinations of the CpG sites which were retained after step 5 of the feature selection process were trained on the training set to predict severity. There are 97 such sites in total. The coefficients of one such model, using all of these sites together, can be found in Table 8. An exemplary protocol for predicting COVID-19 severity with a PCR test may be described as follows:
1. Select CpG sites to be measured and obtain or design primers for the CpG sites in question. Primers can be easily designed, for example using a web-based service such as PrimerSuite (Lu et al. Sci Rep 7, 41328 (2017). https://doi.org/10.1038/srep41328).
2. Run a PCR method to obtain output values corresponding to the methylation rates of the CpG sites. Several known PCR methods are available which could be used for this purpose, such as methylation-sensitive loop-mediated isothermal amplification (MS-LAMP) as described for example in Hambalek et al. 2021 (ACS Sens. 24;6(9):3242-3252) or Zerilli et al. 2010 (Clin Chem. 56(8):1287-96); high resolution melting (HRM) as described for example in Wojdacz et al. 2008 (Nat Protoc 3, 1903-1908); or the OpenArray™ platform as described for example by Broccanello et al. (in Quantitative Real-Time PCR. Methods in Molecular Biology, vol 2065. Humana, New York, NY).
3. Derive the methylation rates computationally based on the PCR results using known methodology. This can involve several calibration steps and averaging over several samples to account for technical and between-sample variance.
4. Use a machine learning model in order to calculate a COVID-19 severity score (severity risk score) using the methylation rates of the n CpG sites. This may be performed for example using a logistic regression model as described herein, along with the regression coefficients associated with each CpG site and intercept, for example as provided in Table 8.
Results
After training logistic regression models using elastic nets and performing stability selection, the model with the best performance contained 177 CpG sites (Table 1) while 251 CpG sites were selected following the stability selection process (Table 2).
Of the differentially methylated sites, 97 were found to behave similarly in both data sets (Table 3), and meet the criteria for conformity between cohorts and distribution dissimilarity between severe and mild cases outlined in Steps 3 and 4 of the feature selection process above. These 97 sites were retained to train COVID-19 severity models to ascertain the usefulness of these sites for PCR-based and similar tests. Of these 97 sites, 34 are particularly preferred and are listed in Table 7.
The 97 sites were ranked in terms of order of importance (Table 3). This was achieved by training 20,000 severity predictor models using combinations of 5 randomly determined sites (out of the 97) each. For each model, Balanced Accuracy was calculated using the holdout data. Then, a new model was fitted that predicted the resulting 20,000 Balanced Accuracy values depending on whether or not each of the 97 sites was used in the predictor. Thus, each Balanced Accuracy value could be attributed to five sites that were used in a specific case. An overall estimate could thus be obtained of what influence any given site had on Balanced Accuracy. These estimates are the coefficients for the usefulness model, and those are listed in the site usefulness column. So, the higher that value is, the higher the average Balanced Accuracy of severity prediction models using the site in question.
Thus, 433 sites in total were determined to be immediately highly relevant for COVID-19 severity prediction (Table 4). A Venn diagram showing the overlap between the different lists of sites is provided in Figure 1.
Model performance
The elastic net model followed by a svm trained using the set of all 177 CpG sites of Table 1 together (see Table 1 also for the relevant coefficients) could achieve an AUC of 0.89 (Figure 3). With the cutoff value for severity classification set at 0.5, the predictor achieved a balanced accuracy of 0.82, with a sensitivity of 0.8 and specificity of 0.83 in the independent holdout set (Table 10, see Figure 4 for a confusion matrix).
Table 10
Metric Value
Balanced Accuracy 0.8165920
Sensitivity 0.8009259
Specificity 0.8322581
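The relationship between the three metrics in Table 10 can be sketched as follows: balanced accuracy is the mean of sensitivity and specificity. The confusion-matrix counts used in the first call are hypothetical and are not taken from Figure 4; only the two rates in the last line come from Table 10.

```python
def classification_metrics(tp, fn, tn, fp):
    """Compute sensitivity, specificity and balanced accuracy from counts."""
    sensitivity = tp / (tp + fn)  # fraction of severe cases correctly flagged
    specificity = tn / (tn + fp)  # fraction of non-severe cases correctly cleared
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "balanced_accuracy": (sensitivity + specificity) / 2,
    }


# Hypothetical counts, for illustration only.
metrics = classification_metrics(tp=80, fn=20, tn=90, fp=10)

# Recomputing balanced accuracy from the sensitivity and specificity of Table 10.
table10_balanced_accuracy = (0.8009259 + 0.8322581) / 2
```

The recomputed value agrees with the balanced accuracy of 0.8165920 reported in Table 10.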
The ensemble model (logistic regression followed by svm), using an exemplary set of 5 CpG sites selected from the list of 97 CpG sites of Table 3, and with the cutoff value for severity classification set at 0.5, achieved an AUC of 0.88 (Figure 5) and balanced accuracy, sensitivity and specificity of 0.8 in the independent holdout set (Table 11, see Figure 6 for a confusion matrix).
Table 11
Metric Value
Balanced Accuracy 0.8032828
Sensitivity 0.8028846
Specificity 0.8036810
Sites needed for severity modeling
Additionally, we sought to determine how predictor performance varied with the number of CpG sites that were used. In order to achieve this, models were trained with different numbers of CpG sites randomly selected from the array list (i.e. the 177 CpG sites listed in Table 1), the stability selection list (i.e. the 251 CpG sites listed in Table 2) or the PCR list (i.e. the 97 sites listed in Table 3). For each number of CpG sites, a total of 300 models (i.e. 300 combinations of that number of CpG sites) were trained, after which the sensitivity, specificity, and balanced accuracy for each model were determined. A box-whisker plot for each number of sites is provided in Figure 7 for the array list (300 models of each number of CpG sites (15), so 4500 models in total), in Figure 8 for the stability selection list (300 models of each number of CpG sites (15), so 4500 models in total), and in Figure 9 for the PCR list (300 models for each number of CpG sites (11), so 3300 models in total). The data demonstrate that with as few as 2 sites, a balanced accuracy of 0.8 or higher can be achieved using combinations drawn from any of the three proposed sets of CpG sites. Prediction success rate does not fall below or even approach random performance (defined as a balanced accuracy of 0.5). Thus, these data demonstrate that susceptibility to severe COVID-19 can be reasonably predicted (and thus that the method of the invention may be performed) using as few as 2 CpG sites selected from each of the three lists of Tables 1, 2 and/or 3.
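The sampling scheme described above (300 random combinations for each number of CpG sites) can be sketched as below. The placeholder site identifiers, the seed and the particular list of 11 subset sizes are assumptions made for illustration; the source gives the number of sizes tested but not the sizes themselves, and the model fitting step is omitted.

```python
import random


def draw_site_combinations(site_ids, n_sites, n_models=300, seed=0):
    """Draw n_models random combinations of n_sites distinct CpG sites."""
    rng = random.Random(seed)
    return [tuple(sorted(rng.sample(site_ids, n_sites))) for _ in range(n_models)]


# Placeholder identifiers standing in for the 97 sites of the PCR list.
pcr_list = [f"site_{i:02d}" for i in range(97)]

# Hypothetical choice of 11 subset sizes.
subset_sizes = [2, 3, 4, 5, 6, 7, 8, 9, 10, 12, 15]

combinations_by_size = {
    n: draw_site_combinations(pcr_list, n, seed=n) for n in subset_sizes
}
total_models = sum(len(c) for c in combinations_by_size.values())
```

With 11 sizes and 300 combinations each, `total_models` is 3300, which corresponds to the count stated for Figure 9; each combination would then be used to fit and evaluate one model.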
Discussion
We report a model that is not just highly performant but can also be assumed to be robust as it was trained on two data sets and tested on a fully independent third one. This model can guide the decision on which patients to monitor, either as in- or outpatients, and which are likely not facing any risk. In most countries, the determination of COVID-19 severity follows guidelines that are similar to those provided by the NIH to physicians in the US (Ref 10). Any given patient with a positive PCR test for SARS-CoV-2 is placed at the appropriate level on the following clinical spectrum:
1. Asymptomatic or presymptomatic infection
2. Mild Illness (managed in an ambulatory setting)
3. Moderate illness: evidence of lower respiratory infection and SpO2>=94%. Close monitoring, often at home.
4. Severe illness: Same as moderate illness plus SpO2<94%. Management is hospitalization with O2 therapy. May deteriorate rapidly.
5. Critical illness: acute respiratory distress syndrome, septic shock: ICU management.
For the purposes of this report, we distinguish between severe (corresponding to levels 4 and 5) and non-severe (corresponding to levels 1 through 3) disease.
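The binary outcome used for modeling follows directly from this five-level spectrum. A minimal sketch, in which the function and constant names are hypothetical:

```python
# Levels of the clinical spectrum listed above, grouped as in this report.
SEVERE_LEVELS = {4, 5}         # severe illness, critical illness
NON_SEVERE_LEVELS = {1, 2, 3}  # asymptomatic/presymptomatic, mild, moderate


def is_severe(level: int) -> bool:
    """Map a clinical spectrum level (1 to 5) to the severe/non-severe label."""
    if level in SEVERE_LEVELS:
        return True
    if level in NON_SEVERE_LEVELS:
        return False
    raise ValueError(f"unknown clinical spectrum level: {level}")


labels = [is_severe(level) for level in range(1, 6)]
print(labels)
```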
In this report, we present a diagnostic method that can be used to determine the severity of COVID-19 infection based on detected specific patterns in the DNA methylation state of white blood cells collected from venous blood samples. The method has been developed using two cohorts from the US and Spain, and validated in an independent cohort of Norwegian patients. Patients in the validation cohort include both hospitalized and non-hospitalized patients.
With COVID-19 now considered to be endemic, our method has the potential to become an important tool for clinicians facing COVID-19 patients in the years ahead, as they make decisions about the treatment for individual patients. We believe the COVID-19 severity test described herein can be an important addition to the standard of care in the monitoring and treatment of individual patients.
Summary of results
1. We present a diagnostic method that can be used in most hospital labs to predict a severe COVID-19 progression.
2. Our diagnostic test is based on blood methylation data, as gene regulation has previously been found to be predictive of severity of viral disease, and the methylome is easier to sample and chemically more stable to work with compared to alternatives such as transcriptomic data.
3. We used two previously published data sets, and generated our own independent data set from 371 COVID-19 patients.
4. We were able to train a predictive ensemble model based on 177 CpG sites, achieving a balanced accuracy of 0.82 (sensitivity of 0.8, specificity of 0.83) in an independent holdout set. A secondary ensemble model optimized for PCR-based diagnostic approaches using a set of 5 CpG sites could achieve a balanced accuracy of 0.8 (sensitivity and specificity of 0.8) in the independent holdout data.
Bibliography
1. Zhou P, Yang X-L, Wang X-G, Hu B, Zhang L, Zhang W, et al. A pneumonia outbreak associated with a new coronavirus of probable bat origin. Nature. 2020 Mar;579(7798):270-3.
2. Huang C, Wang Y, Li X, Ren L, Zhao J, Hu Y, et al. Clinical features of patients infected with 2019 novel coronavirus in Wuhan, China. Lancet. 2020 Feb 15;395(10223):497-506.
3. Sawalha AH, Zhao M, Coit P, Lu Q. Epigenetic dysregulation of ACE2 and interferon-regulated genes might suggest increased COVID-19 susceptibility and severity in lupus patients. Clin Immunol. 2020 Jun;215:108410.
4. Feldstein LR, Tenforde MW, Friedman KG, Newhams M, Rose EB, Dapul H, et al. Characteristics and Outcomes of US Children and Adolescents With Multisystem Inflammatory Syndrome in Children (MIS-C) Compared With Severe Acute COVID-19. JAMA. 2021 Mar 16;325(11):1074-87.
5. Rodriguez Y, Novelli L, Rojas M, De Santis M, Acosta-Ampudia Y, Monsalve DM, et al. Autoinflammatory and autoimmune conditions at the crossroad of COVID-19. J Autoimmun. 2020 Nov;114:102506.
6. Pennington AF, Kompaniyets L, Summers AD, Danielson ML, Goodman AB, Chevinsky JR, et al. Risk of Clinical Severity by Age and Race/Ethnicity Among Adults Hospitalized for COVID-19-United States, March-September 2020. Open Forum Infect Dis. 2021 Feb;8(2):ofaa638.
7. Ponti G, Maccaferri M, Ruini C, Tomasi A, Ozben T. Biomarkers associated with COVID-19 disease progression. Crit Rev Clin Lab Sci. 2020 Sep;57(6):389-99.
8. Rosenberg ES, Dorabawila V, Easton D, Bauer UE, Kumar J, Hoen R, et al. Covid-19 Vaccine Effectiveness in New York State. N Engl J Med. 2022 Jan 13;386(2):116-27.
9. Meinshausen, N. and Bühlmann, P. (2010). Stability selection. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 72: 417-473.
10. COVID-19 Treatment Guidelines Panel. Coronavirus Disease 2019 (COVID-19) Treatment Guidelines. National Institutes of Health. Available at https://www.covid19treatmentguidelines.nih.gov/.
TABLES
Tables of CpG sites referenced in the Examples and in the main part of the description are provided below. In these Tables, the CpG ID number (e.g. cg04610187) of each CpG site is provided in the “feature” column. The chromosome on which the CpG site is located is provided in the “chr” column, and the position of the CpG site on the chromosome is provided in the “pos” column. For the purposes of referring to each CpG site in a concise manner in the present specification, each CpG site in each Table has been assigned a “CpG site number”. For example, CpG site “cg04610187” can alternatively be referred to as “CpG site number 1 of Table 4”, and so on.
[Tables of CpG sites: provided as images (imgf000072_0001 to imgf000109_0001) in the published application; not reproduced in this text.]

Claims

1. A method of screening for the susceptibility of a subject to severe COVID-19, the method comprising using the methylation levels of at least:
10, 15, 20, 25, 30, or 35 CpG sites selected from the 337 CpG sites listed in Table 5; and/or
2, 3, 4, 5, 10, or 15 CpG sites selected from the 97 CpG sites listed in Table 3; in DNA from a biological sample obtained from the subject, wherein said methylation levels are used to provide an indication of the susceptibility of the subject to severe COVID-19; wherein the step of using the methylation levels is implemented by a computer; and wherein the biological sample is a blood sample.
2. The method of claim 1, the method comprising using the methylation levels of at least:
10, 15, or 20 CpG sites selected from the 251 CpG sites listed in Table 2; and/or
10, 15, or 20 CpG sites selected from the 177 CpG sites listed in Table 1; and/or
2, 3, 4, 5, 10, or 15 CpG sites selected from the 97 CpG sites listed in Table 3.
3. The method of claim 1, the method comprising using the methylation levels of at least:
10 CpG sites selected from the 251 CpG sites listed in Table 2; and/or
15 CpG sites selected from the 177 CpG sites listed in Table 1; and/or
2, 3, 4, 5, 10, or 15 CpG sites selected from the 97 CpG sites listed in Table 3.
4. The method of claim 1, the method comprising using the methylation levels of at least:
20 CpG sites selected from the 251 CpG sites listed in Table 2; and/or
15 CpG sites selected from the 177 CpG sites listed in Table 1; and/or
2, 3, 4, 5, 10, or 15 CpG sites selected from the 97 CpG sites listed in Table 3.
5. The method of any one of the preceding claims, the method comprising using the methylation levels of at least 10, 15, or 20 CpG sites selected from the 91 CpG sites listed in Table 6.
6. The method of any one of the preceding claims, the method comprising using the methylation levels of at least 2, 3, 4, 5, 10, or 15 CpG sites selected from the 97 CpG sites listed in Table 3, preferably selected from the 34 CpG sites listed in Table 7.
7. The method of any one of the preceding claims, comprising calculating a likelihood of the subject developing severe COVID-19 as a function of said methylation levels.
8. The method of claim 7, comprising calculating the likelihood as a function of a linear combination of said methylation levels.
9. The method of claim 8, wherein the linear combination of said methylation levels comprises a weighted sum of said methylation levels.
10. The method of any one of claims 7 to 9, comprising calculating the likelihood as a logistic function of a linear combination of said methylation levels.
11. The method of any one of claims 7 to 10, comprising receiving data representative of said methylation levels, and inputting the data to an algorithm for evaluating said function to determine the likelihood of the subject developing severe COVID-19.
12. The method of any one of the preceding claims, further comprising making a prognosis that the subject will develop severe COVID-19 based on the methylation levels referred to in any one of the preceding claims and/or the likelihood referred to in any one of claims 7 to 11, optionally by comparing the methylation levels or likelihood with a cutoff value.
13. The method of any one of the preceding claims, wherein said subject is suspected of being susceptible to severe COVID-19, or has one or more risk factors associated with the development of severe COVID-19.
14. The method of any one of the preceding claims, wherein the biological sample is a white blood cell sample.
15. The method of any one of the preceding claims, further comprising reporting the results of the method, optionally by preparing a written or electronic report.
16. The method of any one of the preceding claims, wherein said method comprises a step of measuring the methylation levels before the step of using the methylation levels.
17. A computer program comprising instructions that, when executed by a processing system, cause the processing system to process data representative of methylation levels of the CpG sites as defined in any one of claims 1 to 6; in DNA from a biological sample obtained from a subject, to calculate a likelihood of the subject developing severe COVID-19; wherein the biological sample is a blood sample.
18. A kit for screening for susceptibility of a subject to severe COVID-19, said kit comprising probes for detecting the methylation levels of the CpG sites as defined in any one of claims 1 to 6; in DNA from a biological sample obtained from the subject, wherein the CpG probe component of the kit consists of probes for detecting the methylation levels of up to 500 CpG sites; wherein the biological sample is a blood sample.
19. The method of claim 6, the computer program of claim 17 or the kit of claim 18, wherein the methylation levels have been obtained, or are suitable for being obtained, using a PCR-based method.
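By way of illustration only, the calculation recited in claims 8 to 12 (a logistic function of a weighted sum of methylation levels, compared with a cutoff value) might be sketched as below. Every numeric value, the intercept term, the function names and the use of "greater than or equal to" at the cutoff are assumptions made for this sketch; the specification does not disclose them here, and the weights shown are not fitted model coefficients.

```python
import math


def severity_likelihood(methylation, weights, intercept=0.0):
    """Logistic function of a weighted sum (linear combination) of methylation levels."""
    if len(methylation) != len(weights):
        raise ValueError("one weight is required per methylation level")
    linear = intercept + sum(w * m for w, m in zip(weights, methylation))
    return 1.0 / (1.0 + math.exp(-linear))


def predict_severe(likelihood, cutoff=0.5):
    """Compare the likelihood with a cutoff; >= is an assumption of this sketch."""
    return likelihood >= cutoff


# Hypothetical methylation levels (beta values in [0, 1]) for 5 CpG sites
# and hypothetical weights; neither is taken from the specification.
example_levels = [0.82, 0.15, 0.47, 0.63, 0.30]
example_weights = [2.0, -1.5, 0.5, 1.0, -2.0]
example_likelihood = severity_likelihood(example_levels, example_weights, intercept=-1.0)
print(predict_severe(example_likelihood))
```

A cutoff of 0.5 corresponds to the value used for severity classification in the Examples; other cutoffs trade sensitivity against specificity.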
PCT/EP2023/068925 2022-07-08 2023-07-07 Method of screening for severe covid-19 susceptibility WO2024008955A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
EP22183982.2 2022-07-08
EP22183982 2022-07-08

Publications (1)

Publication Number Publication Date
WO2024008955A1 true WO2024008955A1 (en) 2024-01-11

Family

ID=82403819

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/EP2023/068925 WO2024008955A1 (en) 2022-07-08 2023-07-07 Method of screening for severe covid-19 susceptibility

Country Status (1)

Country Link
WO (1) WO2024008955A1 (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO1999050448A2 (en) 1998-04-01 1999-10-07 Genpoint A.S. Nucleic acid detection method
WO2004096825A1 (en) 2003-05-02 2004-11-11 Human Genetic Signatures Pty Ltd Treatment of nucleic acid
WO2021262894A1 (en) * 2020-06-23 2021-12-30 The Regents Of The University Of Colorado, A Body Corporate Methods for diagnosing respiratory pathogens and predicting covid-19 related outcomes

Non-Patent Citations (27)

* Cited by examiner, † Cited by third party
Title
"Coronavirus Disease 2019 (COVID-19) Treatment Guidelines", NATIONAL INSTITUTES OF HEALTH, article "COVID-19 Treatment Guidelines Panel"
"PCR Protocols: A Guide to Methods and Applications", 1990, ACADEMIC PRESS
"PCR Technology: Principles and Applications for DNA Amplification", 1992, FREEMAN PRESS
ANONYMOUS: "Infinium HumanMethylation450 BeadChip", 9 March 2012 (2012-03-09), XP055401052, Retrieved from the Internet <URL:https://support.illumina.com/content/dam/illumina-marketing/documents/products/datasheets/datasheet_humanmethylation450.pdf> [retrieved on 20170824] *
BROCCANELLO ET AL.: "Methods in Molecular Biology", vol. 2065, HUMANA, article "Quantitative Real-Time PCR"
ECKERT ET AL., PCR METHODS AND APPLICATIONS, vol. 1, 1991, pages 17
FELDSTEIN LRTENFORDE MWFRIEDMAN KGNEWHAMS MROSE EBDAPUL H ET AL.: "Characteristics and Outcomes of US Children and Adolescents With Multisystem Inflammatory Syndrome in Children (MIS-C) Compared With Severe Acute COVID-19", JAMA, vol. 325, no. 11, 16 March 2021 (2021-03-16), pages 1074 - 87
HAMBALEK ET AL., ACS SENS, vol. 6, no. 9, 2021, pages 3242 - 3252
HUANG CWANG YLI XREN LZHAO JHU Y ET AL.: "Clinical features of patients infected with 2019 novel coronavirus in Wuhan, China", LANCET, vol. 395, no. 10223, 15 February 2020 (2020-02-15), pages 497 - 506, XP086050317, DOI: 10.1016/S0140-6736(20)30183-5
JOSEPH BALNIS ET AL: "Blood DNA methylation and COVID-19 outcomes", CLINICAL EPIGENETICS, BIOMED CENTRAL LTD, LONDON, UK, vol. 13, no. 1, 25 May 2021 (2021-05-25), pages 1 - 16, XP021291369, ISSN: 1868-7075, DOI: 10.1186/S13148-021-01102-9 *
KONIGSBERG IAIN R. ET AL: "Host methylation predicts SARS-CoV-2 infection and clinical outcome", vol. 1, no. 1, 1 December 2021 (2021-12-01), XP055942345, Retrieved from the Internet <URL:https://www.nature.com/articles/s43856-021-00042-y.pdf> DOI: 10.1038/s43856-021-00042-y *
LU ET AL., SCI REP, vol. 7, 2017, pages 41328, Retrieved from the Internet <URL:https://doi.org/10.1038/srep41328>
MANDREKAR, JOURNAL OF THORACIC ONCOLOGY, vol. 5, no. 9, September 2010 (2010-09-01)
MATTILA ET AL., NUCLEIC ACIDS RES., vol. 19, 1991, pages 4967
MEINSHAUSEN, NBUHLMANN, P, JOURNAL OF THE ROYAL STATISTICAL SOCIETY: SERIES B (STATISTICAL METHODOLOGY, vol. 72, 2010, pages 417 - 473
OAKELEY, E. J., PHARMACOLOGY & THERAPEUTICS, vol. 84, 1999, pages 389 - 400
OLEK ET AL., NUC. ACIDS RES., vol. 24, 1994, pages 5064 - 6
PENNINGTON AFKOMPANIYETS LSUMMERS ADDANIELSON MLGOODMAN ABCHEVINSKY JR ET AL.: "Risk of Clinical Severity by Age and Race/Ethnicity Among Adults Hospitalized for COVID-19-United States, March-September 2020", OPEN FORUM INFECT DIS, vol. 8, no. 2, February 2021 (2021-02-01), pages 638
PONTI GMACCAFERRI MRUINI CTOMASI AOZBEN T: "Biomarkers associated with COVID-19 disease progression", CRIT REV CLIN LAB SCI, vol. 57, no. 6, September 2020 (2020-09-01), pages 389 - 99, XP055898740, DOI: 10.1080/10408363.2020.1770685
RODRIGUEZ YNOVELLI LROJAS MDE SANTIS MACOSTA-AMPUDIA YMONSALVE DM ET AL.: "Autoinflammatory and autoimmune conditions at the crossroad of COVID-19", J AUTOIMMUN, vol. 114, November 2020 (2020-11-01), pages 102506
ROSENBERG ESDORABAWILA VEASTON DBAUER UEKUMAR JHOEN R ET AL.: "Covid-19 Vaccine Effectiveness in New York State", N ENGL J MED, vol. 386, no. 2, 13 January 2022 (2022-01-13), pages 116 - 27
SAWALHA AHZHAO MCOIT PLU Q: "Epigenetic dysregulation of ACE2 and interferon-regulated genes might suggest increased COVID-19 susceptibility and severity in lupus patients", CLIN IMMUNOL, vol. 215, June 2020 (2020-06-01), pages 108410
SYVANEN, NATURE REV. GEN., vol. 2, 2001, pages 930 - 942
WOJDACZ ET AL., NAT PROTOC, vol. 3, 2008, pages 1903 - 1908
WU LANG ET AL: "An integrative multiomics analysis identifies putative causal genes for COVID-19 severity", GENETICS IN MEDICINE, NATURE PUBLISHING GROUP US, NEW YORK, vol. 23, no. 11, 28 June 2021 (2021-06-28), pages 2076 - 2086, XP037602783, ISSN: 1098-3600, [retrieved on 20210628], DOI: 10.1038/S41436-021-01243-5 *
ZERILLI ET AL., CLIN CHEM., vol. 56, no. 8, 2010, pages 1287 - 96
ZHOU PYANG X-LWANG X-GHU BZHANG LZHANG W ET AL.: "A pneumonia outbreak associated with a new coronavirus of probable bat origin", NATURE, vol. 579, no. 7798, March 2020 (2020-03-01), pages 270 - 3

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 23739275

Country of ref document: EP

Kind code of ref document: A1