WO2017021752A1 - Methods for amplifying and sequencing the genome of a hepatitis c virus - Google Patents

Methods for amplifying and sequencing the genome of a hepatitis C virus

Publication number
WO2017021752A1
Authority
WO
WIPO (PCT)
Prior art keywords
seq
hepatitis
virus
dna
sequencing
Prior art date
Application number
PCT/IB2015/001881
Other languages
French (fr)
Inventor
Sylvie Jacqueline Henriette LARRAT
Pauline TREMEAUX
Elodie SANTONI
Marie-Ange THELU
Original Assignee
Universite Joseph Fourier
Centre Hospitalier Universitaire De Grenoble
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Universite Joseph Fourier, Centre Hospitalier Universitaire De Grenoble filed Critical Universite Joseph Fourier
Priority to PCT/IB2015/001881 priority Critical patent/WO2017021752A1/en
Priority to PCT/EP2016/068589 priority patent/WO2017021471A1/en
Publication of WO2017021752A1 publication Critical patent/WO2017021752A1/en

Classifications

    • CCHEMISTRY; METALLURGY
    • C12BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12QMEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/70Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving virus or bacteriophage
    • C12Q1/701Specific hybridization probes
    • C12Q1/706Specific hybridization probes for hepatitis
    • C12Q1/707Specific hybridization probes for hepatitis non-A, non-B Hepatitis, excluding hepatitis D
    • CCHEMISTRY; METALLURGY
    • C12BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12QMEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6876Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
    • C12Q1/6888Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for detection or identification of organisms

Definitions

  • the present invention relates to methods for amplifying and sequencing the full-length genome of a hepatitis C virus.
  • the present invention also relates to primers and kits for amplifying and sequencing the full-length genome of a hepatitis C virus.
  • Chronic hepatitis C is a major disease caused by the hepatitis C virus (HCV) and responsible for more than 350 000 deaths a year through an evolution of liver fibrosis to cirrhosis or hepatocellular carcinoma. Nevertheless, it is a curable disease.
  • New standards of care are based on drugs targeting the non-structural proteins of HCV.
  • DAAs directly acting antivirals
  • SVR sustained virological response
  • Available DAAs differ in their activity against HCV genotypes and even subtypes.
  • HCV whole genome sequencing is the most accurate and reliable method.
  • viral RNA is converted in vitro into cDNA by reverse transcription and, then, cDNA is amplified by PCR to produce a DNA template.
  • Due to its size (over 9 kb), its high GC content (over 58%), its high degree of variability and its secondary structures, the HCV whole genome is difficult to amplify, and published techniques were either restricted to a single genotype - if not a single subtype (Tellier et al, Long PCR and its application to hepatitis viruses: amplification of hepatitis A, hepatitis B, and hepatitis C virus genomes.
  • genotyping is primarily achieved by the sequencing of PCR-amplified portions of the viral genome obtained from a patient sample, followed by phylogenetic analysis. However, this approach increases the risk of errors and is not appropriate for genotyping recombinant forms of HCV.
  • One of the aims of the invention is to provide a method for amplifying the full-length HCV genome with a single PCR reaction, without a prior knowledge of its genotype.
  • Another aim of the invention is to provide a reliable method for sequencing the full-length HCV genome.
  • Another aim of the invention is to provide a method of detecting genotypes and/or subtypes of hepatitis C virus in a biological sample containing one genotype and/or subtype or in a biological sample containing a mix of different genotypes and/or subtypes.
  • Another aim of the invention is to provide primers, couples of primers and kits allowing amplification of the full-length HCV genome.
  • the invention relates to a method of amplifying the full-length genome of a hepatitis C virus of any genotype comprising:
  • primers being respectively complementary to two sequences located at the two extremities of said cDNA, said two sequences being distant from each other by at least 9 kb, and
  • the present invention is based on the unexpected invention made by the Inventors that the full-length genome of a hepatitis C virus can be amplified by a single PCR reaction with one couple of primers, without prior knowledge of the genotype of the hepatitis C virus.
  • full-length genome (or “genome”) as used herein is defined as the collective gene set carried by a viral particle of hepatitis C virus. This collective gene set can be carried by RNA, cDNA or DNA material. In the invention, the term full-length genome refers to the complete sequence or a near full-length sequence of the viral nucleic acid.
  • genotypes refers to groups of viruses resulting from genetic heterogeneity and divergence in HCV sequences.
  • the classification of the HCV genotypes is based on the analysis of the full-length genome. The last consensus describes 7 genotypes numbered from 1 to 7 with nucleotide sequences varying over 30% between each other.
  • subtypes correspond to genetic variations of the particular genotypes (for example, 1a and 1b). Generally, subtypes have nucleotide sequences varying over 30% between each other.
  • genotypes such as RF_2k/1b. These recombinant forms arise in patients that are coinfected with more than one genotype and then they can be transmitted to other patients.
  • each genotype and/or subtype it is also possible to identify genetically distinct but closely related variants referred to as "quasispecies". Indeed, the viral population in a patient is generally composed of a sequence that is dominant and a number of sequences differing by a few mutations.
  • cDNA refers to "complementary DNA", a form of DNA synthesized from a RNA template by reverse transcription.
  • total nucleic acid is defined as the total genetic material extracted from a biological sample suspected to contain the genome of a hepatitis C virus.
  • Total nucleic acid can contain DNA and/or RNA molecules.
  • amplifying refers to amplification methods that require thermocycling (e.g. PCR). Amplification means increasing the relative concentration of one or more sequences in a sample at least 10-fold, relative to unamplified components of the sample.
  • PCR refers to the polymerase chain reaction. PCR involves a DNA polymerase, pairs of primers, and thermal cycling to synthesize multiple copies of two complementary strands from double strand DNA or from cDNA.
  • an excess of at least two oligonucleotide primers (forward and reverse) is introduced to a reaction mixture containing the desired target nucleic acid and the reaction mixture is submitted to a specific thermal cycling in the presence of a DNA polymerase.
  • the reaction mixture is denatured and the primers are then annealed to their respective target sequences.
  • the primers are extended with a polymerase so as to form new pairs of complementary strands.
  • the steps of denaturation, primers annealing and polymerase extension can be repeated many times and constitute one "PCR cycle" (in "two-step PCR", annealing and extension can be carried out at the same temperature).
  • the amplified segments obtained by the PCR method are, themselves, efficient templates for subsequent PCR amplifications.
  • PCR is also called “RT-PCR”, or “reverse transcription polymerase chain reaction”, when a RNA strand is reverse transcribed into its cDNA using a reverse transcriptase, and the resulting cDNA is amplified using PCR.
  • PCR is also called “LR-PCR”, or “long range PCR”, when it refers to amplification of DNA lengths (in general over 5 kb and up to 30 kb and beyond) that cannot typically be amplified using routine PCR methods or reagents.
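The cycle description above implies roughly geometric growth of the target. A minimal sketch, assuming the textbook model in which each cycle multiplies the product by (1 + efficiency); the `efficiency` parameter and both function names are illustrative and do not come from the patent:

```python
def theoretical_fold(cycles: int, efficiency: float = 1.0) -> float:
    """Fold increase after `cycles` PCR cycles.

    efficiency = 1.0 is perfect doubling per cycle (an idealisation);
    real reactions fall somewhere between 0 and 1.
    """
    if cycles < 0:
        raise ValueError("cycles must be non-negative")
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must lie in [0, 1]")
    return (1.0 + efficiency) ** cycles


def cycles_needed(fold: float, efficiency: float = 1.0) -> int:
    """Smallest whole number of cycles that reaches at least `fold`."""
    if fold < 1.0:
        raise ValueError("fold must be at least 1")
    if not 0.0 < efficiency <= 1.0:
        raise ValueError("efficiency must lie in (0, 1]")
    amount, cycles = 1.0, 0
    while amount < fold:
        amount *= 1.0 + efficiency
        cycles += 1
    return cycles
```

Under perfect doubling, the 10-fold threshold used in the definition of amplification is passed after 4 cycles; real reactions fall short of perfect doubling, so the model gives an upper bound only.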
  • the invention relates to a method as defined above, wherein said couple of primers is chosen among the group consisting in the following couples:
  • SEQ ID NO: 1 and SEQ ID NO: 5 (5NC1/HUTLA2),
  • SEQ ID NO: 1 and SEQ ID NO: 6 (5NC1/7er),
  • SEQ ID NO: 1 and SEQ ID NO: 7 (5NC1/NA2),
  • SEQ ID NO: 1 and SEQ ID NO: 8 (5NC1/NA2bis),
  • SEQ ID NO: 2 and SEQ ID NO: 5 (5NC3/HUTLA2),
  • SEQ ID NO: 2 and SEQ ID NO: 7 (5NC3/NA2),
  • SEQ ID NO: 2 and SEQ ID NO: 8 (5NC3/NA2bis),
  • SEQ ID NO: 3 and SEQ ID NO: 5 (UNI40/HUTLA2),
  • SEQ ID NO: 3 and SEQ ID NO: 7 (UNI40/NA2),
  • SEQ ID NO: 3 and SEQ ID NO: 8 (UNI40/NA2bis),
  • SEQ ID NO: 1 and SEQ ID NO: 5 (5NC1/HUTLA2)
  • SEQ ID NO: 1 and SEQ ID NO: 7 (5NC1/NA2)
  • SEQ ID NO: 2 and SEQ ID NO: 5 (5NC3/HUTLA2)
  • SEQ ID NO: 2 and SEQ ID NO: 7 (5NC3/NA2).
  • Forward primers are located in the 5'NC non-coding region, located at the 5' end of the HCV genome.
  • Reverse primers are located in the NS5B region coding for a nonstructural protein, located at the 3' end of the HCV genome.
  • 5NC1 5'NC forward GCAGAAAGCGTCTAGCCATGGCGT 1
  • HUTLA2 NS5B reverse GGGCCGGGCATGAGACACGCTGTGATAAATGTC 5
  • NA2-M13 NS5B reverse CGGGCATG GACASGCTGTGAACCAGGAAACAGCTATGACC 12
  • Table 1. List of used primers for amplification by PCR.
  • primer refers to short oligonucleotides that can be used to initiate extension of one strand of DNA.
  • the primers have a sequence that is the reverse complement of a specific region of the target DNA.
  • these primers can be mutated for modifying their target-binding properties, they can be extended with a tail-sequence for allowing nested PCR or cloning reactions (such as the primers 5NC3-M13 and NA2-M13 in table 1), they can be chemically bound to fluorescent compounds for carrying out real-time PCR reactions, or the structure of the ribose and the phosphate can be modified to improve sensitivity and specificity of the primers (such as Locked Nucleic Acids, LNA).
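The statement that a primer is the reverse complement of a region of the target can be made concrete with a short helper. This is a sketch under stated assumptions: the function name is illustrative, and the IUPAC ambiguity codes (such as the S, meaning C or G, that appears in one Table 1 sequence) are complemented with the standard IUPAC pairs.

```python
# Standard IUPAC nucleotide codes and their complements.
_IUPAC = "ACGTRYSWKMBDHVN"
_COMPLEMENT = str.maketrans(_IUPAC, "TGCAYRSWMKVHDBN")


def reverse_complement(sequence: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    sequence = sequence.upper()
    unknown = set(sequence) - set(_IUPAC)
    if unknown:
        raise ValueError(f"unsupported symbols: {sorted(unknown)}")
    # Complement every base, then reverse so the result also reads 5' to 3'.
    return sequence.translate(_COMPLEMENT)[::-1]


# 5NC1 forward primer as listed in Table 1.
FIVE_NC1 = "GCAGAAAGCGTCTAGCCATGGCGT"
# The strand it anneals to, written 5' to 3'.
FIVE_NC1_TARGET = reverse_complement(FIVE_NC1)
```

Applying the function twice returns the original sequence, which is a quick sanity check when handling primer lists by hand.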
  • the invention relates to a method as defined above, wherein said DNA polymerase has a 3'→5' exonuclease activity and/or a 5'→3' exonuclease activity, preferably a 3'→5' exonuclease activity.
  • "3'→5' exonuclease activity" and "5'→3' exonuclease activity" refer to the proofreading activities of some DNA polymerases. These proofreading activities catalyze the removal of a mononucleotide, in the 3'-5' or 5'-3' direction, at the extremity of the duplex DNA during elongation. Following base excision, the polymerase can re-insert the correct base and elongation can continue. These proofreading activities allow replication of long DNA molecules over 1,000 bp while maintaining high fidelity.
  • the invention relates to a method as defined above, wherein said DNA polymerase has a 3'→5' exonuclease activity and/or a 5'→3' exonuclease activity, preferably a 3'→5' exonuclease activity, is thermostable and allows hot-start PCR
  • the term "hot start PCR” refers to a modified form of PCR which avoids a non-specific amplification of DNA by inactivating the DNA polymerase at low temperature.
  • the DNA polymerase is active at room temperature and, when all the reaction components are put together, nonspecific primer annealing can occur due to these low temperatures.
  • This nonspecific annealed primer can then be extended by the Taq DNA polymerase, generating nonspecific products and lowering product yields.
  • specific antibodies are used to block the DNA polymerase at annealing temperature. When the temperature rises for amplification, the specific antibodies detach from the DNA polymerase and the amplification starts with greater specificity. Hot Start PCR significantly reduces nonspecific priming and the formation of primer dimers, and often increases product yields.
  • "thermostable" refers to the property of some DNA polymerases to withstand high temperatures over 90°C. Use of thermostable DNA polymerases enables running the PCR at high temperature (~60°C and above), which facilitates high specificity of the primers and reduces the production of unspecific products, such as primer dimers.
  • DNA polymerases suitable for the method of the invention include, but are not limited to, KOD Hot Start™ (Merck), Phusion™ (New England Biolabs), Q5 High Fidelity Hot Start DNA polymerase™ (New England Biolabs), Platinum Taq DNA Polymerase High Fidelity™ (Life Technologies), Taq Platinum PCR Supermix™ (Life Technologies), Accuprime Pfx DNA Polymerase™ (Life Technologies), Takara LA Taq™ (Ozyme), GXL Prime Star DNA Polymerase™ (Takara Clontech), Expand Long Template™ (Roche Diagnostics), Expand Long Range™ (Roche Diagnostics) and KAPA Long Range™ (Kapa Biosystems).
  • the invention relates to a method as defined above, wherein each primer has a size of 15 to 40 nucleotides, preferably of 20 to 40 nucleotides, more preferably of 28 to 30 nucleotides.
  • a primer of the invention has a size of 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39 or 40 nucleotides.
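The nested size ranges above (15 to 40, preferably 20 to 40, more preferably 28 to 30 nucleotides) can be checked mechanically. A minimal sketch; the function name and the returned labels are illustrative only, and the two sequences are the 5NC1 and HUTLA2 entries of Table 1:

```python
def primer_size_class(primer: str) -> str:
    """Classify a primer by the size ranges stated in the text."""
    length = len(primer)
    if 28 <= length <= 30:
        return "more preferred (28-30 nt)"
    if 20 <= length <= 40:
        return "preferred (20-40 nt)"
    if 15 <= length <= 40:
        return "within range (15-40 nt)"
    return "outside range"


# Sequences as listed in Table 1.
PRIMERS = {
    "5NC1": "GCAGAAAGCGTCTAGCCATGGCGT",
    "HUTLA2": "GGGCCGGGCATGAGACACGCTGTGATAAATGTC",
}

for name, sequence in PRIMERS.items():
    print(name, len(sequence), primer_size_class(sequence))
```

Both example primers (24 and 33 nucleotides) fall in the 20 to 40 nucleotide range.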
  • the invention relates to a method as defined above, wherein said at least one PCR cycle is:
  • This embodiment corresponds to a "two-step PCR" in which primers annealing and extension occurs at the same temperature.
  • the invention relates to a method as defined above, wherein said at least one PCR cycle is:
  • said at least one PCR cycle being repeated at least 30 times, preferably at least 40 times, more preferably at least 45 times.
  • temperature of denaturation in step (i) of the PCR cycle can be 95.0°C, 95.1°C, 95.2°C, 95.3°C, 95.4°C, 95.5°C, 95.6°C, 95.7°C, 95.8°C, 95.9°C, 96.0°C, 96.1°C, 96.2°C, 96.3°C, 96.4°C, 96.5°C, 96.6°C, 96.7°C, 96.8°C, 96.9°C, 97.0°C, 97.1°C, 97.2°C, 97.3°C, 97.4°C, 97.5°C, 97.6°C, 97.7°C, 97.8°C, 97.9°C or 98.0°C.
  • temperature of primers annealing/ DNA extension in step (ii) of the PCR cycle can be 65.0°C, 65.1°C, 65.2°C, 65.3°C, 65.4°C, 65.5°C, 65.6°C, 65.7°C, 65.8°C, 65.9°C, 66.0°C, 66.1°C, 66.2°C, 66.3°C, 66.4°C, 66.5°C, 66.6°C, 66.7°C, 66.8°C, 66.9°C, 67.0°C, 67.1°C, 67.2°C, 67.3°C, 67.4°C, 67.5°C, 67.6°C, 67.7°C, 67.8°C, 67.9°C, 68.0°C, 68.1°C, 68.2°C, 68.3°C, 68.4°C, 68.5°C, 68.6°C, 68.7°C, 68.8°C, 68.9
  • the invention relates to a method as defined above, wherein said at least one PCR cycle is:
  • said at least one PCR cycle being repeated at least 30 times, preferably at least 40 times, more preferably at least 45 times.
  • the invention relates to a method as defined above, wherein said at least one PCR cycle is:
  • said at least one PCR cycle being repeated at least 30 times, preferably at least 40 times, more preferably at least 45 times.
  • the invention relates to a method as defined above, wherein said mixture reaction is heated once at 98°C for at least 1 minute, preferably for at least 2 minutes, just before being subjected to said at least one PCR cycle.
  • the invention relates to a method as defined above, wherein one additive is added to said reaction mixture, said additive being preferably chosen among the group consisting in dimethyl sulfoxide (DMSO), bovine serum albumin (BSA), T4 Gene 32 Protein and betaine.
  • DMSO dimethyl sulfoxide
  • BSA bovine serum albumin
  • T4 Gene 32 Protein T4 Gene 32 Protein
  • additive refers to enhancing agents that can be used to increase the yield, specificity and consistency of DNA synthesis.
  • the invention relates to a method as defined above, wherein said cDNA is obtained by reverse transcription using a primer chosen among the group consisting in: SEQ ID NO: 9 and SEQ ID NO: 10.
  • reverse transcriptases suitable for the method of the invention include, but are not limited to, Superscript III First Strand System for RT-PCR™ (Life Technologies) and Primescript Reverse Transcriptase™ (Ozyme).
  • the invention relates to a method as defined above, wherein said cDNA is obtained by incubating total nucleic acid of a hepatitis C virus with a reverse transcriptase at a temperature of at least 40°C, preferably of at least 50°C, for at least 50 minutes.
  • the invention relates to a method as defined above, wherein said cDNA is obtained by incubating total nucleic acid of a hepatitis C virus with a reverse transcriptase at 50°C, during 50 minutes.
  • the invention relates to a method as defined above, wherein said cDNA is obtained by incubating total nucleic acid of a hepatitis C virus with a reverse transcriptase and at least one additive chosen among T4 Gene 32 Protein and a ribonuclease inhibitor.
  • the invention relates to a method as defined above, comprising a step of extracting said total nucleic acid of a hepatitis C virus from a biological sample containing viral particles, such as blood, plasma, hepatic puncture biopsy or peripheral blood mononuclear cells (PBMC) of a patient.
  • PBMC peripheral blood mononuclear cells
  • patient refers to an individual infected, or suspected of being infected, by a hepatitis C virus. This term includes mammals such as humans and other primates,
  • the invention relates to a method as defined above, comprising:
  • a biological sample containing viral particles such as blood, plasma, hepatic puncture biopsy or peripheral blood mononuclear cells (PBMC) of a patient
  • a cDNA of the total nucleic acid of a hepatitis C virus,
  • a couple of primers, said primers being respectively complementary to two sequences located at the two extremities of said cDNA, said two sequences being distant from each other by at least 9 kb, and
  • the invention relates to a method of sequencing the full-length genome of a hepatitis C virus comprising:
  • the sequence of the DNA molecule obtained by PCR amplification can be determined by using any method for sequencing DNA.
  • the invention relates to a method of detecting hepatitis C virus genotypes in a biological sample of a patient suspected of containing hepatitis C viral particles, comprising:
  • Since the method of the invention allows amplification and sequencing of the full-length genome of hepatitis C virus, it is possible to detect and identify the genotypes, the subtypes and the quasispecies among the HCV population present in a biological sample. Moreover, the method of the invention can be used to detect and identify new genotypes and/or subtypes and/or quasispecies that have not been described yet.
  • DNA library refers to a collection of DNA fragments that have been physically isolated from each other.
  • a DNA library allows further analysis, and in particular sequence analysis, of each DNA fragment individually.
  • DNA library can be obtained through different means according to the purpose or to the technology. For example, DNA fragments can be cloned into vectors (such as plasmids) or captured by beads, to allow the sequencing of each fragment.
  • a read alignment is carried out after step (c) to obtain consensus sequences.
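How a consensus is derived from aligned reads can be illustrated with a per-column majority vote. This is a simplified sketch, not the assembly procedure used in the examples: it assumes the reads are already aligned to equal length with "-" for gaps, and the function name is illustrative.

```python
from collections import Counter


def consensus(aligned_reads: list[str], gap: str = "-") -> str:
    """Majority-vote consensus over reads aligned to the same coordinates.

    Columns containing only gaps yield 'N'. Ties are resolved in favour
    of the base seen first in the column.
    """
    if not aligned_reads:
        raise ValueError("at least one read is required")
    length = len(aligned_reads[0])
    if any(len(read) != length for read in aligned_reads):
        raise ValueError("aligned reads must all have the same length")

    bases = []
    for column in zip(*aligned_reads):
        counts = Counter(base for base in column if base != gap)
        bases.append(counts.most_common(1)[0][0] if counts else "N")
    return "".join(bases)
```

For example, `consensus(["ACGT", "ACGA", "ACCT"])` keeps the base carried by two of the three reads at each position and returns `"ACGT"`.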
  • DNA fragments that are used to create a DNA library, have an average size between 400 and 1 100 bp, preferably between 500 and 800 bp.
  • DNA fragments have an average size of 400, 450, 500, 550, 600, 650, 700, 750, 800, 850, 900, 950, 1 000, 1 050 or 1 100 bp.
  • DNA fragments, that are used to create a DNA library have an average size of at least 1, 2, 3, 4, 5, 6, 7, 8 or 9 kb.
  • DNA fragments that are used to create a DNA library, are the double strand DNA of at least 9 kb obtained after step a) as such.
  • the invention relates to one of the methods as defined above, comprising a step of extracting total nucleic acid of the hepatitis C virus from a biological sample of a patient containing only one genotype of hepatitis C virus. In an embodiment, the invention relates to one of the methods as defined above, comprising a step of extracting total nucleic acid of the hepatitis C virus from a biological sample of a patient containing at least two different genotypes of hepatitis C virus.
  • Hepatitis C virus exists as a population of distinct but closely related viral variants. These variants may display divergent replicative capacity, cell tropism, immunologic escape, and antiviral-drug resistance.
  • a patient is co-infected with 2 or more distinct HCVs with distinct genotypes.
  • a DNA library makes it possible to separate and identify the HCV variants that are present within a sample. Therefore, it is possible to detect and identify resistance-associated variants within a sample.
  • the method of the invention can be used to detect variants that are present in low quantity, even in minority, within a sample.
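Detecting a variant present in low quantity amounts to measuring the frequency of each base among the reads covering one position. A minimal sketch; the 1% reporting threshold `min_freq` is a hypothetical value chosen for illustration, not a figure from the patent, and the function names are illustrative:

```python
from collections import Counter


def base_frequencies(column: str) -> dict[str, float]:
    """Frequency of each base among the reads covering one position."""
    if not column:
        raise ValueError("no reads cover this position")
    counts = Counter(column)
    return {base: count / len(column) for base, count in counts.items()}


def minority_variants(column: str, min_freq: float = 0.01) -> dict[str, float]:
    """Bases other than the dominant one whose frequency reaches min_freq."""
    frequencies = base_frequencies(column)
    dominant = max(frequencies, key=frequencies.get)
    return {
        base: freq
        for base, freq in frequencies.items()
        if base != dominant and freq >= min_freq
    }
```

With 95 reads carrying A and 5 carrying G at a position, `minority_variants("A" * 95 + "G" * 5)` reports G at a frequency of 0.05. In practice the threshold has to sit above the sequencing error rate of the platform.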
  • the invention relates to one of the methods as defined above, wherein said step of sequencing is achieved using the Sanger method or a NGS method.
  • "NGS" ("Next Generation Sequencing") refers to methods of nucleic acid sequencing that comprise the sequencing-by-synthesis or sequencing-by-ligation platforms currently employed by Illumina, Life Technologies, Pacific Biosciences and Roche, etc.
  • Next generation sequencing methods may also include, but not be limited to, nanopore sequencing methods such as offered by Oxford Nanopore or electronic detection-based methods such as the Ion Torrent technology commercialized by Life Technologies.
  • the invention relates to the method as defined above, wherein said step of fragmenting the double strand DNA is achieved by nebulization, by ultrasonication or by enzymatic digestion, preferably by nebulization.
  • nebulization refers to a DNA fragmentation by forcing DNA through a small hole in a nebulizer unit.
  • size of the fragments varies according to the pressure of the gas used to push the DNA through the nebulizer, the speed at which the DNA solution passes through the hole, the viscosity of the solution, and the temperature.
  • sonication refers to a DNA fragmentation method by subjecting DNA to brief periods of sonication with ultrasonic frequencies.
  • enzymatic digestion refers to a DNA fragmentation method by using enzymes to digest DNA. Usually, enzymatic digestion can be achieved using restriction enzymes that cut DNA at or near specific recognition sequences (restriction sites).
  • the invention relates to a primer chosen among the group consisting in: SEQ ID NO: 4 (2CH), SEQ ID NO: 6 (7er) and SEQ ID NO: 10 (3UTR1).
  • the invention relates to a couple of primers chosen among the group consisting in the following couples :
  • SEQ ID NO: 1 and SEQ ID NO: 5 (5NC1/HUTLA2)
  • SEQ ID NO: 1 and SEQ ID NO: 7 (5NC1/NA2)
  • SEQ ID NO: 2 and SEQ ID NO: 5 (5NC3/HUTLA2)
  • SEQ ID NO: 2 and SEQ ID NO: 7 (5NC3/NA2).
  • the invention relates to a kit for the amplification of the full-length genome of a hepatitis C virus comprising:
  • SEQ ID NO: 1 and SEQ ID NO: 7 (5NC1/NA2)
  • SEQ ID NO: 2 and SEQ ID NO: 5 (5NC3/HUTLA2)
  • SEQ ID NO: 2 and SEQ ID NO: 7 (5NC3/NA2)
  • the invention relates to a kit as defined above further comprising a reverse transcriptase.
  • the invention relates to a kit as defined above further comprising a primer for the reverse transcription chosen among SEQ ID NO: 9 and SEQ ID NO: 10.
  • Figure 1 (a-b). Examples of LR-PCR products obtained for different genotypes and subtypes using different pangenotypic primers.
  • Figure 2 (a-h). Examples of quality scores (PHRED) according to read size for 3 samples sequenced using GSJ assay (a, b and c), 3 samples sequenced using GSJ+ assay (d, e and f) and for the two pyrosequencing failures (g and h). P8S1 was sequenced with the GSJ and P14S1 with the GSJ+.
  • Figure 3 Distribution of the PHRED scores every 500 bp across the HCV genome for 6 examples. Data are presented as box plots in which 50% of the values lie within the box. The horizontal lines drawn through the middle of the boxes represent the median values. The top and bottom of each box are the 25th and 75th percentiles of all values. The numbering is based on the sequence of HCV strain H77 (GenBank accession no. AF009606).
  • Figure 4 Examples of sequence depth of coverage obtained for 6 samples.
  • the nucleotide numbering is based on the sequence of HCV strain H77 (GenBank accession no. AF009606).
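Depth of coverage, as plotted in Figure 4, is the number of reads overlapping each genome position. A minimal sketch, assuming each aligned read is summarised by a 1-based inclusive (start, end) interval on the reference numbering; the function name and input format are illustrative:

```python
def depth_of_coverage(
    read_intervals: list[tuple[int, int]], genome_length: int
) -> list[int]:
    """Per-position read depth for reads given as 1-based inclusive intervals."""
    # Difference array: +1 where a read starts, -1 just past where it ends.
    changes = [0] * (genome_length + 1)
    for start, end in read_intervals:
        if not 1 <= start <= end <= genome_length:
            raise ValueError(f"interval out of range: {(start, end)}")
        changes[start - 1] += 1
        changes[end] -= 1

    depth, running = [], 0
    for position in range(genome_length):
        running += changes[position]
        depth.append(running)
    return depth
```

Two reads covering positions 1 to 3 and 2 to 5 of a 5-nucleotide reference give `[1, 2, 2, 1, 1]`; positions with depth 0 correspond to gaps in the consensus.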
  • Figure 5 Phylogenetic analysis of the 19 near full-length genomes obtained compared with 86 reference sequences identified by their GenBank accession numbers. Bootstrap resampling (1000 replications) support values are shown at nodes. The tree is rooted using genotype 2 sequences, and all horizontal branch lengths are drawn to a scale of nucleotide substitutions per site.
  • the LR-PCR also worked for a group of 5 RF_2k/1b samples (with a mean viral load of 5.5 log IU/mL), including 3 sequential samples taken from the same patient before, during and after a failed treatment with sofosbuvir and ribavirin.
  • the 21 positive LR-PCRs were sequenced in 4 runs. Two samples (one HCV1b (P8S1) and one HCV4a (P14S1)) could not be analysed because, despite a correct number of reads (8407 and 15517 respectively), reads were too short and quality too low to allow the reconstitution of a consensus sequence (see Figure 2, panel C). Consensus sequences could be achieved for 19 samples (90%). A mean number of reads/sample of 8601 (+/- 6053) was obtained.
  • Consensus sequence covered more than 99% of the expected sequence for 15/19 samples and covered 98.8%, 97.5%, 89.8% and 77.9% for the 4 remaining ones (see Table 3).
  • P2S1 1a 1a EF407419 92.63 % (8583/9266) 1579 99.86 % (9266/9279) 15 (10-19)
  • P3S1 1a 1a EF407419 92.68 % (8525/9198) 11 306 99.69 % (9198/9227) NC
  • P4S1 1b 1b AY587016 90.18 % (8359/9269) 5 25 99.84 % (9269/9284) NC
  • P7S2 3a 3a D28917 92.42 % (8447/9140) 21 756 98.67 % (9140/9253) 32 (22-44)
  • P8S1 1b / / / 8407 / /
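The percentages in these rows follow from the fractions printed beside them. A small sketch of that arithmetic, reading the first fraction as identity with the listed reference and the second as coverage of the expected sequence; that column interpretation is an assumption, since the table header is not preserved here, and the function name is illustrative:

```python
def percent(numerator: int, denominator: int) -> float:
    """Ratio expressed as a percentage, rounded to two decimals."""
    if denominator <= 0:
        raise ValueError("denominator must be positive")
    return round(100 * numerator / denominator, 2)


# Fractions as printed in two of the table rows.
identity_fraction = (8525, 9198)   # printed next to 92.68 %
coverage_fraction = (9198, 9227)   # printed next to 99.69 %

print(percent(*identity_fraction))
print(percent(*coverage_fraction))
# Share of the 21 positive LR-PCRs that yielded a consensus sequence.
print(percent(19, 21))
```

Recomputing a percentage from its fraction is also a quick way to spot digits damaged in transcription.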
  • RAVs detected on consensus sequences at baseline were found in NS3 for 3 patients (T54A + I132V for P6S1, T54S for P10S1 and D168E for P12S1) and in NS5A for 4 patients (Y93H for P7S1 and P11S1, and L31M for P13S1 and P15S1). No known RAV (L159F, S282T, C316N/Y, L320F, V321A, N411S, M414T, Y448C/H, A553V and S556G) was detected in NS5B (30,31).
  • RNA was extracted from 1 mL of plasma and eluted in 50 µL on the easyMAG® instrument, according to the manufacturer's instructions (BioMerieux). Extracts were treated with TURBO DNase Ambion™ (Life Technologies). Reverse transcription (RT) was performed from 8 µL of RNA extract using PrimeScript™ Reverse Transcriptase (Ozyme) and a specific primer: dA20 or 3UTR1, annealing respectively on the poly(U) tail or the 3' non-coding region (3'NC). Manufacturer's instructions were followed, except for the addition of T4 Gene 32 Protein at 0.1 µg/µL (New England BioLabs). Success of the cDNA synthesis was checked with small PCRs on both sides of the genome. Long range PCR (LR-PCR)
  • HCV near-complete genome was amplified in a single fragment via a LR-PCR, from 2 µL of cDNA in a total volume of 20 µL, using either PrimeSTAR™ GXL DNA Polymerase (Takara Clontech) or KOD Xtreme™ Hot Start DNA Polymerase (Merck). The following conditions were applied: 94 °C x 2 min, (98°C x 10 s, 68°C x 9 min 25) x 45 cycles, 4°C on hold. Several primer pairs were tested for each sample. A nested PCR with internal primers or a second-round PCR with the same ones was sometimes necessary to obtain the required DNA quantities, and done following the same conditions. Primers are listed in Table 1.
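The cycling conditions above can be written down as data to estimate the programmed run time. A minimal sketch under two assumptions: "9 min 25" is read as 9 min 25 s, and ramping between temperatures and the final 4°C hold are ignored. The `Step` class and function names are illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    temperature_c: float
    seconds: int


def total_seconds(initial: Step, cycle: list[Step], n_cycles: int) -> int:
    """Programmed hold time: initial step plus n_cycles repeats of the cycle."""
    return initial.seconds + n_cycles * sum(step.seconds for step in cycle)


def format_hms(seconds: int) -> str:
    hours, remainder = divmod(seconds, 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{hours} h {minutes} min {secs} s"


INITIAL = Step(94.0, 2 * 60)
# Two-step cycle: denaturation, then combined annealing and extension.
CYCLE = [Step(98.0, 10), Step(68.0, 9 * 60 + 25)]

print(format_hms(total_seconds(INITIAL, CYCLE, 45)))
```

Under these assumptions the 45 cycles account for a little over seven hours of hold time, almost all of it in the 68°C annealing and extension step.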
  • PCR products were then purified directly or from a 0.5% agarose gel using NucleoSpin™ Gel and PCR Clean-up kit (Macherey-Nagel, Hoerdt, France). DNA concentration was quantified using Quant-iT™ PicoGreen™ dsDNA Reagent (Life Technologies, St Aubin, France) and a LightCycler™ 480 instrument (Roche Diagnostics), from 1 µL of sample diluted with 1X TE to 50 µL. PicoGreen® Reagent was then added in a 1:1 ratio.
  • the obtained SFF files were split by MIDs and converted to FASTQ files. Quality control (particularly, PHRED quality score profiles) was performed on each FASTQ file using the FASTQC software. Contigs were generated de novo using the VICUNA software (Broad Institute Inc.) with default parameters except for the minimum length of contigs to output which was set to 0 and the maximum percentage of divergence between read and consensus which was set to 50% (default: 8%), when the consensus output from V-FAT presented too many gaps.
  • V-FAT processed the output of VICUNA to orient and filter the raw contigs, merge them where they overlapped based on a reference alignment (the subtype-specific reference sequence for this alignment was determined using the HCV BLAST from Los Alamos National Laboratory, with the longest contig as query) and correct frameshifts found in coding regions. As a result, V-FAT yielded a consensus full-genome assembly.
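The PHRED quality scores inspected during quality control are stored in FASTQ files as one character per base. A minimal sketch of decoding them, assuming the common Phred+33 (Sanger) encoding; the encoding actually produced by the SFF conversion is not stated in the text, and the function names are illustrative:

```python
def phred_scores(quality_line: str, offset: int = 33) -> list[int]:
    """Decode a FASTQ quality string into integer PHRED scores."""
    scores = [ord(character) - offset for character in quality_line]
    if any(score < 0 for score in scores):
        raise ValueError("character below the encoding offset")
    return scores


def error_probability(score: float) -> float:
    """Probability that a base call is wrong: P = 10 ** (-Q / 10)."""
    return 10 ** (-score / 10)


def mean_quality(quality_line: str, offset: int = 33) -> float:
    """Arithmetic mean PHRED score of one read."""
    scores = phred_scores(quality_line, offset)
    if not scores:
        raise ValueError("empty quality string")
    return sum(scores) / len(scores)
```

A score of 20 corresponds to a 1% chance of a wrong base call and a score of 30 to 0.1%, which is why short reads with low scores could not be assembled into a consensus.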
  • Phylogenetic analysis was performed after alignment of the 19 HCV whole genome sequences obtained in this study associated with 86 reference sequences.
  • the phylogenetic tree was constructed by means of the Neighbor-Joining method on conserved sites and a Jukes-Cantor substitution model using MAFFT version 7 online software. Reliability of the various inferred clades was estimated by bootstrapping (1000 replicates). Visualization of the tree and node coloring were performed by means of Archaeopteryx.
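The Jukes-Cantor model mentioned above converts the observed proportion of differing sites p between two aligned sequences into an evolutionary distance d = -(3/4) ln(1 - 4p/3). A minimal sketch of that formula; it is not the MAFFT implementation, it simply skips any site where either sequence has a gap, and the function names are illustrative:

```python
import math


def p_distance(seq_a: str, seq_b: str, gap: str = "-") -> float:
    """Proportion of differing sites among sites without gaps."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = differing = 0
    for base_a, base_b in zip(seq_a.upper(), seq_b.upper()):
        if gap in (base_a, base_b):
            continue
        compared += 1
        differing += base_a != base_b
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


def jukes_cantor(p: float) -> float:
    """Jukes-Cantor distance in substitutions per site."""
    if not 0.0 <= p < 0.75:
        raise ValueError("p must lie in [0, 0.75) for the correction to exist")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)
```

The correction grows faster than p itself because it accounts for multiple substitutions at the same site; for p = 0.25 the distance is about 0.304 substitutions per site.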

Landscapes

  • Chemical & Material Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Organic Chemistry (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Wood Science & Technology (AREA)
  • Engineering & Computer Science (AREA)
  • Analytical Chemistry (AREA)
  • Zoology (AREA)
  • Immunology (AREA)
  • Biophysics (AREA)
  • Biotechnology (AREA)
  • Microbiology (AREA)
  • Molecular Biology (AREA)
  • Physics & Mathematics (AREA)
  • Communicable Diseases (AREA)
  • Biochemistry (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Genetics & Genomics (AREA)
  • Virology (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Abstract

The present invention relates to methods for amplifying and sequencing the full-length genome of a hepatitis C virus. The present invention also relates to primers and kits for amplifying and sequencing the full-length genome of a hepatitis C virus.

Description

METHODS FOR AMPLIFYING AND SEQUENCING THE GENOME OF A HEPATITIS C VIRUS
The present invention relates to methods for amplifying and sequencing the full-length genome of a hepatitis C virus. The present invention also relates to primers and kits for amplifying and sequencing the full-length genome of a hepatitis C virus.
Chronic hepatitis C is a major disease caused by the hepatitis C virus (HCV) and responsible for more than 350 000 deaths a year through the progression of liver fibrosis to cirrhosis or hepatocellular carcinoma. Nevertheless, it is a curable disease. New standards of care are based on drugs targeting the non-structural proteins of HCV. These directly acting antivirals (DAAs) lead to levels of sustained virological response (SVR) much higher than the classical bi-therapy with pegylated interferon-alpha and ribavirin. Available DAAs differ in their activity against HCV genotypes and even subtypes. Therefore, an accurate typing of HCV is mandatory to determine the most appropriate molecules, as well as the treatment duration (EASL Clinical Practice Guidelines: Management of hepatitis C virus infection. J Hepatol 2014; 60(2):392-420). Seven genotypes and more than 67 subtypes are now validated (Smith et al, Expanded classification of hepatitis C virus into 7 genotypes and 67 subtypes: updated criteria and genotype assignment Web resource. Hepatol Baltim Md. 2014; 59(1):318-27). Some intergenotypic recombinant forms have also been described (Morel et al, Genetic recombination of the hepatitis C virus: clinical implications. J Viral Hepat. 2011; 18(2):77-83). The current recommendation for HCV typing is to analyse variable genomic regions, such as "Core" or "NS5B". However, the analysis of a single region does not allow the detection of recombinant forms, and the use of assays typing HCV in the 5' non-coding (5'NC) and/or Core region may lead to discrepancies between the obtained genotype and the type of the non-structural regions targeted by the DAAs, resulting in therapeutic failures (Ramiere et al, Recent evidence of underestimated circulation of hepatitis C virus intergenic recombinant strain RF2k/1b in the Rhone-Alpes region, France, January to August 2014: implications for antiviral treatment. Euro Surveill Bull Eur Sur Mal Transm Eur Commun Dis Bull. 2014; 19(43); Larrat et al, Sequencing assays for failed genotyping with the versant hepatitis C virus genotype assay (LiPA), version 2.0. J Clin Microbiol. 2013; 51(9):2815-21; Avó et al, Hepatitis C virus subtyping based on sequencing of the C/E1 and NS5B genomic regions in comparison to a commercially available line probe assay. J Med Virol 2013; 85(5):815-22). Moreover, the great genetic variability of HCV also leads to the emergence of resistance-associated variants (RAVs) in the DAA-targeted proteins NS3, NS5A and/or NS5B, and their monitoring seems mandatory in clinical research. Even with a treatment combining antiviral drugs, a few therapeutic failures with the selection of RAVs in the 3 genomic regions have been reported (Sulkowski et al, Ombitasvir, paritaprevir co-dosed with ritonavir, dasabuvir, and ribavirin for hepatitis C in patients co-infected with HIV-1: a randomized trial. JAMA. 2015; 313(12):1223-31).
Among all typing methods available, HCV whole genome sequencing is the most accurate and reliable method. Commonly, prior to the sequencing of the HCV genome, viral RNA is converted in vitro into cDNA by reverse transcription and, then, cDNA is amplified by PCR to produce a DNA template. However, some difficulties remain. Due to its size (over 9 kb), its high GC content (over 58%), its high degree of variability and its secondary structures, the HCV whole genome is difficult to amplify, and published techniques were restricted to a single genotype, if not a single subtype (Tellier et al, Long PCR and its application to hepatitis viruses: amplification of hepatitis A, hepatitis B, and hepatitis C virus genomes. J Clin Microbiol. 1996; 34(12):3085-91; Fan et al, Efficient amplification and cloning of near full-length hepatitis C virus genome from clinical samples. Biochem Biophys Res Commun. 2006; 346(4):1163-72; Zhang et al., Development of a sensitive RT-PCR method for amplifying and sequencing near full-length HCV genotype 1 RNA from patient samples. Virol J. 2013; 10:53). Thus, in clinical use, genotyping is primarily achieved by the sequencing of PCR-amplified portions of the viral genome obtained from a patient sample, followed by phylogenetic analysis. However, this approach increases the risk of errors and is not appropriate for genotyping recombinant forms of HCV.
Therefore, there is a need for simple and relevant methods for amplifying and sequencing the HCV whole genome without prior knowledge of its genotype. One of the aims of the invention is to provide a method for amplifying the full-length HCV genome with a single PCR reaction, without prior knowledge of its genotype. Another aim of the invention is to provide a reliable method for sequencing the full-length HCV genome. Another aim of the invention is to provide a method of detecting genotypes and/or subtypes of hepatitis C virus in a biological sample containing one genotype and/or subtype or in a biological sample containing a mix of different genotypes and/or subtypes.
Another aim of the invention is to provide primers, couples of primers and kits allowing amplification of the full-length HCV genome.
The invention relates to a method of amplifying the full-length genome of a hepatitis C virus of any genotype comprising:
a) mixing :
- a cDNA of the total nucleic acid of a hepatitis C virus,
- a couple of primers, said primers being respectively complementary to two sequences located at the two extremities of said cDNA, said two sequences being distant from each other by at least 9 kb, and
- a DNA polymerase, to obtain a reaction mixture,
b) subjecting the reaction mixture to at least one PCR cycle to obtain a double strand DNA of at least 9 kb.
The present invention is based on the unexpected finding made by the Inventors that the full-length genome of a hepatitis C virus can be amplified by a single PCR reaction with one couple of primers, without prior knowledge of the genotype of the hepatitis C virus.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art pertinent to the methods and products described. Reference to a technology by a company or trade name herein includes reference to the corresponding generic technology, unless otherwise indicated by context. Terms and symbols of nucleic acid chemistry, biochemistry, genetics and molecular biology used herein follow those of standard treatises and texts in the field. As used herein, the following terms have the meanings ascribed to them unless specified otherwise. The terms "a" (or "an"), "and" and "the" include plural referents, unless the context clearly indicates otherwise. For example, "a cDNA" as used herein is understood to represent one or more cDNA molecules. As such, the terms "a" (or "an") and "at least one" can be used interchangeably herein, unless otherwise indicated by context.
The term "full-length genome" (or "genome") as used herein is defined as the collective gene set carried by a viral particle of hepatitis C virus. This collective gene set can be carried by RNA, cDNA or DNA material. In the invention, the term full-length genome refers to the complete sequence or a near full-length sequence of the viral nucleic acid.
The term "genotypes" refers to groups of viruses resulting from genetic heterogeneity and divergence in HCV sequences. The classification of the HCV genotypes is based on the analysis of the full-length genome. The last consensus describes 7 genotypes numbered from 1 to 7 with nucleotide sequences varying over 30% between each other.
Within each genotype are further divisions called "subtypes". The subtypes correspond to genetic variations of the particular genotypes (for example, 1a and 1b). Generally, subtypes have nucleotide sequences varying over 30% between each other.
Recombinant forms of genotypes also exist, such as RF_2k/1b. These recombinant forms arise in patients who are coinfected with more than one genotype and can then be transmitted to other patients.
Within each genotype and/or subtype, it is also possible to identify genetically distinct but closely related variants referred to as "quasispecies". Indeed, the viral population in a patient is generally composed of a sequence that is dominant and a number of sequences differing by a few mutations.
The term "cDNA" refers to "complementary DNA", a form of DNA synthesized from an RNA template by reverse transcription.
The term "total nucleic acid" as used herein is defined as the total genetic material extracted from a biological sample suspected to contain the genome of a hepatitis C virus. Total nucleic acid can contain DNA and/or RNA molecules.
The term "amplifying" refers to amplification methods that require thermocycling (e.g. PCR). Amplification means increasing the relative concentration of one or more sequences in a sample at least 10-fold, relative to unamplified components of the sample.
The term "PCR" refers to the polymerase chain reaction. PCR involves a DNA polymerase, pairs of primers, and thermal cycling to synthesize multiple copies of two complementary strands from double strand DNA or from cDNA. For amplifying a target region, an excess of at least two oligonucleotide primers (forward and reverse) is introduced to a reaction mixture containing the desired target nucleic acid and the reaction mixture is submitted to a specific thermal cycling in the presence of a DNA polymerase. To achieve amplification, the reaction mixture is denatured and the primers are then annealed to their respective target sequences. Following annealing, the primers are extended with a polymerase so as to form new pairs of complementary strands. The steps of denaturation, primer annealing and polymerase extension can be repeated many times and constitute one "PCR cycle" (in "two-step PCR", annealing and extension can be carried out at the same temperature). The amplified segments obtained by the PCR method are, themselves, efficient templates for subsequent PCR amplifications.
PCR is also called "RT-PCR", or "reverse transcription polymerase chain reaction", when an RNA strand is reverse transcribed into its cDNA using a reverse transcriptase, and the resulting cDNA is amplified using PCR.
PCR is also called "LR-PCR", or "long range PCR", when it refers to amplification of DNA lengths (in general over 5 kb and up to 30 kb and beyond) that cannot typically be amplified using routine PCR methods or reagents.
In an embodiment, the invention relates to a method as defined above, wherein said couple of primers is chosen among the group consisting in the following couples:
SEQ ID NO: 1 and SEQ ID NO: 5 (5NC1/HUTLA2),
SEQ ID NO: 1 and SEQ ID NO: 6 (5NC1/7er),
SEQ ID NO: 1 and SEQ ID NO: 7 (5NC1/NA2),
SEQ ID NO: 1 and SEQ ID NO: 8 (5NC1/NA2bis),
SEQ ID NO: 2 and SEQ ID NO: 5 (5NC3/HUTLA2),
SEQ ID NO: 2 and SEQ ID NO: 6 (5NC3/7er),
SEQ ID NO: 2 and SEQ ID NO: 7 (5NC3/NA2),
SEQ ID NO: 2 and SEQ ID NO: 8 (5NC3/NA2bis),
SEQ ID NO: 3 and SEQ ID NO: 5 (UNI40/HUTLA2),
SEQ ID NO: 3 and SEQ ID NO: 6 (UNI40/7er),
SEQ ID NO: 3 and SEQ ID NO: 7 (UNI40/NA2),
SEQ ID NO: 3 and SEQ ID NO: 8 (UNI40/NA2bis),
SEQ ID NO: 4 and SEQ ID NO: 5 (2CH/HUTLA2),
SEQ ID NO: 4 and SEQ ID NO: 6 (2CH/7er),
SEQ ID NO: 4 and SEQ ID NO: 7 (2CH/NA2), and
SEQ ID NO: 4 and SEQ ID NO: 8 (2CH/NA2bis),
preferably chosen among the group consisting in:
SEQ ID NO: 1 and SEQ ID NO: 5 (5NC1/HUTLA2),
SEQ ID NO: 1 and SEQ ID NO: 7 (5NC1/NA2),
SEQ ID NO: 2 and SEQ ID NO: 5 (5NC3/HUTLA2),
SEQ ID NO: 2 and SEQ ID NO: 6 (5NC3/7er), and
SEQ ID NO: 2 and SEQ ID NO: 7 (5NC3/NA2).
Forward primers are located in the 5'NC non-coding region, located at the 5' end of the HCV genome. Reverse primers are located in the NS5B region coding for a nonstructural protein, located at the 3' end of the HCV genome.
The sequences of primers are given in table 1.
Primers    Region  Polarity  Sequences                                   SEQ ID NO
5NC1       5'NC    forward   GCAGAAAGCGTCTAGCCATGGCGT                    1
5NC3       5'NC    forward   GCCATGGCGTTAGTATGAGT                        2
UNI40      5'NC    forward   ACTGTCTTCACGCAGAAAGCGTCTAGCCAT              3
2CH        5'NC    forward   GGAAC TAC GTCT TCACGC AGAA                  4
HUTLA2     NS5B    reverse   GGGCCGGGCATGAGACACGCTGTGATAAATGTC           5
7er        NS5B    reverse   GGGGAGCAGGTAGATGCCTA                        6
NA2        NS5B    reverse   CGCGCATGMGACASGCTGTGA                       7
NA2bis     NS5B    reverse   C GGGC YGMGAC S GCTGTGA                     8
5NC3-M13   5'NC    forward   GCCATGGCGTTAGTATGAGTACTGTAAAACGACGGCCAGT    11
NA2-M13    NS5B    reverse   CGGGCATG GACASGCTGTGAACCAGGAAACAGCTATGACC   12
Table 1. List of primers used for amplification by PCR.
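Some primers of Table 1 contain degenerate positions written with IUPAC codes (for example M = A or C, and S = C or G, in NA2). As a minimal sketch of how such a primer can be handled in practice, the Python code below expands a degenerate primer into the unambiguous sequences it stands for, computes a reverse complement, and checks a primer length against the 15 to 40 nucleotide range of the invention. The function names are hypothetical, and the sequences are copied from Table 1 as printed in this text.

```python
from itertools import product

# Standard IUPAC nucleotide codes and the bases each one stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "M": "AC", "R": "AG", "W": "AT", "S": "CG", "Y": "CT", "K": "GT",
    "V": "ACG", "H": "ACT", "D": "AGT", "B": "CGT", "N": "ACGT",
}

# Complement of each IUPAC code (e.g. M = A/C is complemented by K = T/G).
COMPLEMENT = str.maketrans("ACGTMRWSYKVHDBN", "TGCAKYWSRMBDHVN")


def expand_degenerate(primer):
    """List every unambiguous sequence encoded by a degenerate primer."""
    return ["".join(bases) for bases in product(*(IUPAC[b] for b in primer))]


def reverse_complement(seq):
    """Reverse complement of a sequence, IUPAC codes included."""
    return seq.translate(COMPLEMENT)[::-1]


def length_in_range(primer, low=15, high=40):
    """True if the primer length lies within the stated size range."""
    return low <= len(primer) <= high


if __name__ == "__main__":
    na2 = "CGCGCATGMGACASGCTGTGA"  # NA2 as printed in Table 1
    for variant in expand_degenerate(na2):
        print(variant)
    print(reverse_complement("GCCATGGCGTTAGTATGAGT"))  # 5NC3
    print(length_in_range(na2))
```

With two two-fold degenerate positions, NA2 corresponds to a mix of four distinct oligonucleotides.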
In the invention, the term "primer" refers to short oligonucleotides that can be used to initiate extension of one strand of DNA. The primers have a sequence that is the reverse complement of a specific region of the target DNA.
These primers can be modified according to specific needs.
For example, these primers can be mutated to modify their target-binding properties, they can be extended with a tail sequence to allow nested PCR or cloning reactions (such as the primers 5NC3-M13 and NA2-M13 in Table 1), they can be chemically bound to fluorescent compounds for carrying out real-time PCR reactions, or the structure of the ribose and the phosphate can be modified to improve the sensitivity and specificity of the primers (such as Locked Nucleic Acids, LNA).
In an embodiment, the invention relates to a method as defined above, wherein said DNA polymerase has a 3'→5' exonuclease activity and/or a 5'→3' exonuclease activity, preferably a 3'→5' exonuclease activity.
The terms "3'→5' exonuclease activity" and "5'→3' exonuclease activity" refer to the proofreading activities of some DNA polymerases. These proofreading activities catalyze the removal of a mononucleotide, in the 3'→5' or 5'→3' direction, at the extremity of the duplex DNA during elongation. Following base excision, the polymerase can re-insert the correct base and elongation can continue. These proofreading activities allow replication of long DNA molecules over 1,000 bp while maintaining high fidelity.
In an embodiment, the invention relates to a method as defined above, wherein said DNA polymerase has a 3'→5' exonuclease activity and/or a 5'→3' exonuclease activity, preferably a 3'→5' exonuclease activity, is thermostable and allows hot-start PCR.
The term "hot start PCR" refers to a modified form of PCR which avoids non-specific amplification of DNA by inactivating the DNA polymerase at low temperature. In conventional PCR, the DNA polymerase is active at room temperature and, when all the reaction components are put together, nonspecific primer annealing can occur due to these low temperatures. This nonspecifically annealed primer can then be extended by the Taq DNA polymerase, generating nonspecific products and lowering product yields. In hot start PCR, specific antibodies are used to block the DNA polymerase at annealing temperature. When the temperature rises for amplification, the specific antibodies detach from the DNA polymerase and the amplification starts with greater specificity. Hot start PCR significantly reduces nonspecific priming and the formation of primer dimers, and often increases product yields.
The term "thermostable" refers to the property of some DNA polymerases to withstand high temperatures over 90°C. Use of thermostable DNA polymerases enables running the PCR at high temperature (~60°C and above), which facilitates high specificity of the primers and reduces the production of unspecific products, such as primer dimers.
Examples of DNA polymerases suitable for the method of the invention include, but are not limited to, KOD Hot Start™ (Merck), Phusion™ (New England Biolabs), Q5 High Fidelity Hot Start DNA polymerase™ (New England Biolabs), Platinum Taq DNA Polymerase High Fidelity™ (Life Technologies), Taq Platinum PCR Supermix™ (Life Technologies), Accuprime Pfx DNA Polymerase™ (Life Technologies), Takara LA Taq™ (Ozyme), GXL Prime Star DNA Polymerase™ (Takara Clontech), Expand Long Template™ (Roche Diagnostics), Expand Long Range™ (Roche Diagnostics) and KAPA Long Range™ (Kapa Biosystems).
In an embodiment, the invention relates to a method as defined above, wherein each primer has a size of 15 to 40 nucleotides, preferably of 20 to 40 nucleotides, more preferably of 28 to 30 nucleotides.
In particular, a primer of the invention has a size of 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39 or 40 nucleotides.
In an embodiment, the invention relates to a method as defined above, wherein said at least one PCR cycle is:
(i) 95 to 98°C for about 10 seconds,
(ii) then, 65 to 70°C for about 9 minutes,
said at least one PCR cycle being repeated at least 30 times, preferably at least 40 times, more preferably at least 45 times.
This embodiment corresponds to a "two-step PCR" in which primer annealing and extension occur at the same temperature.
In particular, the invention relates to a method as defined above, wherein said at least one PCR cycle is:
(i) 95 to 98°C for 10 seconds,
(ii) then, 65 to 70°C for 9 minutes and 25 seconds,
said at least one PCR cycle being repeated at least 30 times, preferably at least 40 times, more preferably at least 45 times.
In particular, the temperature of denaturation in step (i) of the PCR cycle can be 95.0°C, 95.1°C, 95.2°C, 95.3°C, 95.4°C, 95.5°C, 95.6°C, 95.7°C, 95.8°C, 95.9°C, 96.0°C, 96.1°C, 96.2°C, 96.3°C, 96.4°C, 96.5°C, 96.6°C, 96.7°C, 96.8°C, 96.9°C, 97.0°C, 97.1°C, 97.2°C, 97.3°C, 97.4°C, 97.5°C, 97.6°C, 97.7°C, 97.8°C, 97.9°C or 98.0°C.
In particular, the temperature of primer annealing/DNA extension in step (ii) of the PCR cycle can be 65.0°C, 65.1°C, 65.2°C, 65.3°C, 65.4°C, 65.5°C, 65.6°C, 65.7°C, 65.8°C, 65.9°C, 66.0°C, 66.1°C, 66.2°C, 66.3°C, 66.4°C, 66.5°C, 66.6°C, 66.7°C, 66.8°C, 66.9°C, 67.0°C, 67.1°C, 67.2°C, 67.3°C, 67.4°C, 67.5°C, 67.6°C, 67.7°C, 67.8°C, 67.9°C, 68.0°C, 68.1°C, 68.2°C, 68.3°C, 68.4°C, 68.5°C, 68.6°C, 68.7°C, 68.8°C, 68.9°C, 69.0°C, 69.1°C, 69.2°C, 69.3°C, 69.4°C, 69.5°C, 69.6°C, 69.7°C, 69.8°C, 69.9°C or 70.0°C.
In an embodiment, the invention relates to a method as defined above, wherein said at least one PCR cycle is:
(i) 98°C for about 10 seconds,
(ii) then, 68°C for about 9 minutes,
said at least one PCR cycle being repeated at least 30 times, preferably at least 40 times, more preferably at least 45 times.
In particular, the invention relates to a method as defined above, wherein said at least one PCR cycle is:
(i) 98°C for 10 seconds,
(ii) then, 68°C for 9 minutes and 25 seconds,
said at least one PCR cycle being repeated at least 30 times, preferably at least 40 times, more preferably at least 45 times.
In an embodiment, the invention relates to a method as defined above, wherein said reaction mixture is heated once at 98°C for at least 1 minute, preferably for at least 2 minutes, just before being subjected to said at least one PCR cycle.
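Because each cycle includes an extension step of more than 9 minutes, the total run time is substantial. The short Python sketch below adds up the hold times of the two-step programme described above (10 seconds of denaturation and 9 minutes 25 seconds of annealing/extension per cycle, after an initial heating step). The function name and the default of 2 minutes for the initial heating are choices made for this example, and the ramp times between temperatures, which depend on the thermocycler, are ignored.

```python
from datetime import timedelta


def cycling_time(cycles, denaturation_s=10, anneal_extend_s=9 * 60 + 25,
                 initial_denaturation_s=120):
    """Total hold time of a two-step PCR programme, ramp times excluded."""
    per_cycle = denaturation_s + anneal_extend_s
    return timedelta(seconds=initial_denaturation_s + cycles * per_cycle)


if __name__ == "__main__":
    for n in (30, 40, 45):
        print(n, "cycles:", cycling_time(n))
```

Under these assumptions, 45 cycles correspond to about 7 hours and 13 minutes of hold time, and 30 cycles to about 4 hours and 50 minutes.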
In an embodiment, the invention relates to a method as defined above, wherein one additive is added to said reaction mixture, said additive being preferably chosen among the group consisting in dimethyl sulfoxide (DMSO), bovine serum albumin (BSA), T4 Gene 32 Protein and betaine.
The term "additive" refers to enhancing agents that can be used to increase the yield, specificity and consistency of DNA synthesis.
In an embodiment, the invention relates to a method as defined above, wherein said cDNA is obtained by reverse transcription using a primer chosen among the group consisting in: SEQ ID NO: 9 and SEQ ID NO: 10.
The sequences of primers used for reverse transcription are given in table 2.
Table 2. List of primers used for reverse transcription.
Examples of reverse transcriptases suitable for the method of the invention include, but are not limited to, Superscript III First Strand System for RT-PCR™ (Life Technologies) and Primescript Reverse Transcriptase™ (Ozyme).
In an embodiment, the invention relates to a method as defined above, wherein said cDNA is obtained by incubating total nucleic acid of a hepatitis C virus with a reverse transcriptase at a temperature of at least 40°C, preferably at least 50°C, for at least 50 minutes.
In particular, the invention relates to a method as defined above, wherein said cDNA is obtained by incubating total nucleic acid of a hepatitis C virus with a reverse transcriptase at 50°C for 50 minutes.
In an embodiment, the invention relates to a method as defined above, wherein said cDNA is obtained by incubating total nucleic acid of a hepatitis C virus with a reverse transcriptase and at least one additive chosen among T4 Gene 32 Protein and a ribonuclease inhibitor.
In an embodiment, the invention relates to a method as defined above, comprising a step of extracting said total nucleic acid of a hepatitis C virus from a biological sample containing viral particles, such as blood, plasma, hepatic puncture biopsy or peripheral blood mononuclear cells (PBMC) of a patient.
The term "patient" refers to an individual infected, or suspected of being infected, by a hepatitis C virus. This term includes mammals such as humans and other primates.
In an embodiment, the invention relates to a method as defined above, comprising:
a) extracting total nucleic acid of a hepatitis C virus from a biological sample containing viral particles, such as blood, plasma, hepatic puncture biopsy or peripheral blood mononuclear cells (PBMC) of a patient,
b) producing a cDNA of the total nucleic acid of a hepatitis C virus by reverse transcription,
c) mixing :
- a cDNA of the total nucleic acid of a hepatitis C virus,
- a couple of primers, said primers being respectively complementary to two sequences located at the two extremities of said cDNA, said two sequences being distant from each other by at least 9 kb, and
- a DNA polymerase, to obtain a reaction mixture,
d) subjecting the reaction mixture to at least one PCR cycle to obtain a double strand DNA of at least 9 kb.
In another aspect, the invention relates to a method of sequencing the full-length genome of a hepatitis C virus comprising:
(a) amplifying the full-length genome of a hepatitis C virus using the method as defined above to obtain a double strand DNA of at least 9 kb,
(b) sequencing said double strand DNA.
The sequence of the DNA molecule obtained by PCR amplification can be determined by using any method for sequencing DNA.
In another aspect, the invention relates to a method of detecting hepatitis C virus genotypes in a biological sample of a patient suspected of containing hepatitis C viral particles, comprising:
(a) amplifying the full-length genome of a hepatitis C virus using the method as defined above to obtain a double strand DNA of at least 9 kb,
(b) fragmenting said double strand DNA and constructing a library of DNA fragments,
(c) sequencing DNA fragments,
(d) comparing sequences of said DNA fragments with each other to identify mutations representative of hepatitis C virus variants.
Since the method of the invention allows amplification and sequencing of the full-length genome of hepatitis C virus, it is possible to detect and identify the genotypes, the subtypes and the quasispecies among the HCV population present in a biological sample. Moreover, the method of the invention can be used to detect and identify new genotypes and/or subtypes and/or quasispecies that have not been described yet.
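The comparison of step (d) can be pictured as counting, at each genome position, which bases are carried by the reads covering it. The following Python sketch is a simplified illustration under stated assumptions, not the analysis pipeline of the invention: the reads are assumed to be already aligned to a common coordinate system and padded with "-" where they do not cover a position, and the function name and the reporting threshold `min_fraction` are hypothetical.

```python
from collections import Counter


def minority_variants(aligned_reads, min_fraction=0.01):
    """Return (position, major_base, variant_base, fraction) tuples.

    aligned_reads: equal-length strings on one coordinate system, with '-'
    where a read does not cover a position. Positions are 1-based.
    """
    if not aligned_reads:
        return []
    length = len(aligned_reads[0])
    if any(len(read) != length for read in aligned_reads):
        raise ValueError("all reads must be padded to the same length")
    found = []
    for i in range(length):
        counts = Counter(read[i] for read in aligned_reads if read[i] != "-")
        depth = sum(counts.values())
        if depth == 0:
            continue  # no read covers this position
        major, _ = counts.most_common(1)[0]
        for base, n in sorted(counts.items()):
            if base != major and n / depth >= min_fraction:
                found.append((i + 1, major, base, n / depth))
    return found


if __name__ == "__main__":
    reads = ["ACGT", "ACGT", "ACGT", "ATGT", "--GT"]
    print(minority_variants(reads, min_fraction=0.2))
```

In real data the threshold has to be chosen with the sequencing error rate and the depth of coverage in mind, since a variant seen in a single low-quality read is not distinguishable from an error.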
The term "DNA library" refers to a collection of DNA fragments that have been physically isolated from each other. A DNA library allows further analysis, and in particular sequence analysis, of each DNA fragment individually. A DNA library can be obtained through different means according to the purpose or to the technology. For example, DNA fragments can be cloned into vectors (such as plasmids) or captured by beads, to allow the sequencing of each fragment.
In an embodiment, a read alignment is carried out after step (c) to obtain consensus sequences.
In an embodiment, DNA fragments, that are used to create a DNA library, have an average size between 400 and 1 100 bp, preferably between 500 and 800 bp.
In particular, DNA fragments have an average size of 400, 450, 500, 550, 600, 650, 700, 750, 800, 850, 900, 950, 1 000, 1 050 or 1 100 bp.
In an embodiment, DNA fragments, that are used to create a DNA library, have an average size of at least 1, 2, 3, 4, 5, 6, 7, 8 or 9 kb.
In an embodiment, DNA fragments, that are used to create a DNA library, are the double strand DNA of at least 9 kb obtained after step a) as such.
In an embodiment, the invention relates to one of the methods as defined above, comprising a step of extracting total nucleic acid of the hepatitis C virus from a biological sample of a patient containing only one genotype of hepatitis C virus.
In an embodiment, the invention relates to one of the methods as defined above, comprising a step of extracting total nucleic acid of the hepatitis C virus from a biological sample of a patient containing at least two different genotypes of hepatitis C virus.
Within a patient, hepatitis C virus exists as a population of distinct but closely related viral variants. These variants may display divergent replicative capacity, cell tropism, immunologic escape, and antiviral-drug resistance.
In the case of a "co-infection", a patient is co-infected with 2 or more distinct HCVs with distinct genotypes. In this context, a DNA library makes it possible to separate and identify the HCV variants that are present within a sample. Therefore, it is possible to detect and identify resistance-associated variants within a sample. Moreover, the method of the invention can be used to detect variants that are present in low quantity, even in minority, within a sample.
In an embodiment, the invention relates to one of the methods as defined above, wherein said step of sequencing is achieved using the Sanger method or a NGS method.
The term "Sanger method" refers to the classic method developed by Frederick Sanger and colleagues in 1977. This method of nucleic acid sequencing is based on the selective incorporation of chain-terminating dideoxynucleotides by DNA polymerase during in vitro DNA replication.
The term "NGS" or "Next Generation Sequencing" refers to a family of nucleic acid sequencing methods that comprises the sequencing-by-synthesis or sequencing-by-ligation platforms currently employed by Illumina, Life Technologies, Pacific Biosciences and Roche, etc. Next generation sequencing methods may also include, but not be limited to, nanopore sequencing methods such as offered by Oxford Nanopore or electronic detection-based methods such as the Ion Torrent technology commercialized by Life Technologies.
In an embodiment, the invention relates to the method as defined above, wherein said step of fragmenting the double strand DNA is achieved by nebulization, by ultrasonication or by enzymatic digestion, preferably by nebulization.
The term "nebulization" refers to a DNA fragmentation by forcing DNA through a small hole in a nebulizer unit. Generally, the size of the fragments varies according to the pressure of the gas used to push the DNA through the nebulizer, the speed at which the DNA solution passes through the hole, the viscosity of the solution, and the temperature.
The term "sonication" refers to a DNA fragmentation method by subjecting DNA to brief periods of sonication with ultrasonic frequencies.
The term "enzymatic digestion" refers to a DNA fragmentation method by using enzymes to digest DNA. Usually, enzymatic digestion can be achieved using restriction enzymes that cut DNA at or near specific recognition sequences (restriction sites).
In another aspect, the invention relates to a primer chosen among the group consisting in: SEQ ID NO: 4 (2CH), SEQ ID NO: 6 (7er) and SEQ ID NO: 10 (3UTR1).
In another aspect, the invention relates to a couple of primers chosen among the group consisting in the following couples :
SEQ ID NO: 1 and SEQ ID NO: 5 (5NC1/HUTLA2),
SEQ ID NO: 1 and SEQ ID NO: 7 (5NC1/NA2),
SEQ ID NO: 2 and SEQ ID NO: 5 (5NC3/HUTLA2),
SEQ ID NO: 2 and SEQ ID NO: 6 (5NC3/7er), and
SEQ ID NO: 2 and SEQ ID NO: 7 (5NC3/NA2).
In another aspect, the invention relates to a kit for the amplification of the full-length genome of a hepatitis C virus comprising:
- a couple of primers chosen among the group consisting in the following couples:
SEQ ID NO: 1 and SEQ ID NO: 5 (5NC1/HUTLA2),
SEQ ID NO: 1 and SEQ ID NO: 7 (5NC1/NA2),
SEQ ID NO: 2 and SEQ ID NO: 5 (5NC3/HUTLA2),
SEQ ID NO: 2 and SEQ ID NO: 6 (5NC3/7er), and
SEQ ID NO: 2 and SEQ ID NO: 7 (5NC3/NA2),
and
- a DNA polymerase.
In an embodiment, the invention relates to a kit as defined above further comprising a reverse transcriptase.
In an embodiment, the invention relates to a kit as defined above further comprising a primer for the reverse transcription chosen among SEQ ID NO: 9 and SEQ ID NO: 10.
The invention will be better explained by the following figures and examples. In any case, the following examples should not be considered as restricting the scope of the invention.
LEGENDS TO THE FIGURES
Figure 1 (a-b). Examples of LR-PCR products obtained for different genotypes and subtypes using different pangenotypic primers.
Figure 2 (a-h). Examples of quality scores (PHRED) according to read size for 3 samples sequenced using GSJ assay (a, b and c), 3 samples sequenced using GSJ+ assay (d, e and f) and for the two pyrosequencing failures (g and h). P8S1 was sequenced with the GSJ and P14S1 with the GSJ+.
Figure 3. Distribution of the PHRED scores per 500 bp across the HCV genome for 6 examples. Data are presented as box plots in which 50% of the values lie within the box. The horizontal lines drawn through the middle of the boxes represent the median values. The bottom and top of each box are the 25th and 75th percentiles of all values. The numbering is based on the sequence of HCV strain H77 (GenBank accession no. AF009606).
Figure 4. Examples of sequence depth of coverage obtained for 6 samples. The nucleotide numbering is based on the sequence of HCV strain H77 (GenBank accession no. AF009606).
Figure 5. Phylogenetic analysis of the 19 near full-length genomes obtained compared with 86 reference sequences identified by their GenBank accession numbers. Bootstrap resampling (1000 replications) support values are shown at nodes. The tree is rooted using genotype 2 sequences, and all horizontal branch lengths are drawn to a scale of nucleotide substitutions per site.
Figure 6 (a-c). Recombination similarity plot of the RF_2k/1b genomes. Colored lines indicate the percentage of identity (y-axis) to each of 19 full-length non recombinant genotype 1 or 2 genomes across the entire genome (x-axis). The following parameters were applied: window = 200 bp, step = 20 bp.
Figure 7. Localization of the mutations that appeared during treatment on the HCV RNA polymerase NS5B structure in complex with sofosbuvir and RNA. The structure representation was generated with PyMOL based on the PDB entry 4WTG.
Figure 8. Bioinformatic pipeline.
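The similarity plot of Figure 6 relies on a sliding window (window = 200 bp, step = 20 bp). As a minimal sketch of that calculation, assuming two sequences that are already aligned to the same length, the Python function below returns the percentage of identity in each window. The function name and the handling of gaps (gapped columns are skipped) are assumptions of this example and do not describe the software used to draw the figure.

```python
def window_identity(query, reference, window=200, step=20):
    """Percent identity between two aligned sequences in sliding windows.

    Returns a list of (window midpoint, percent identity) pairs.
    Columns with a gap ('-') in either sequence are skipped.
    """
    if len(query) != len(reference):
        raise ValueError("sequences must be aligned to the same length")
    points = []
    for start in range(0, len(query) - window + 1, step):
        columns = [
            (q, r)
            for q, r in zip(query[start:start + window],
                            reference[start:start + window])
            if q != "-" and r != "-"
        ]
        if columns:
            identity = 100 * sum(q == r for q, r in columns) / len(columns)
        else:
            identity = float("nan")  # window made of gaps only
        points.append((start + window // 2, identity))
    return points


if __name__ == "__main__":
    # Toy example: the second half of the reference differs completely.
    query = "A" * 400
    reference = "A" * 200 + "C" * 200
    for midpoint, identity in window_identity(query, reference, step=100):
        print(midpoint, identity)
```

When such a curve is drawn against several reference genomes, a recombination breakpoint shows up as the position where the closest reference changes from one genotype to another.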
EXAMPLES
I - cDNA synthesis and near full-length LR-PCR
Obtaining a complete cDNA of good quality is important prior to the amplification. Only cDNA samples with positive control PCRs on both 5'NC and NS5B regions were used for the LR-PCR assay. The near full-length HCV genome amplification was achieved for 16 samples out of the 19 with positive control PCRs, identified by a band above 9 kb on the electrophoresis gel. The primer pair that worked best varied from sample to sample. The complete optimized LR-PCR method allowed amplification of different genotypes: 8 samples of HCV1b, 4 samples of HCV1a, 2 samples of HCV3a and 2 samples of HCV4a (see results for HCV 1a, 1b and 3a as shown in Figure 1).
The LR-PCR also worked for a group of 5 RF_2k/1b samples (with a mean viral load of 5.5 log IU/mL), including 3 sequential samples taken from the same patient before, during and after a failed treatment with sofosbuvir and ribavirin.
II - Ultra Deep Pyrosequencing (UDPS) results
The 21 positive LR-PCRs were sequenced in 4 runs. Two samples (one HCV1b (P8S1) and one HCV4a (P14S1)) could not be analysed because, despite a correct number of reads (8407 and 15517 respectively), reads were too short and quality too low to allow the reconstruction of a consensus sequence (see Figure 2, panel C). Consensus sequences could be achieved for 19 samples (90%). A mean number of reads/sample of 8601 (+/- 6053) was obtained.
PHRED scores as a function of read length varied greatly between GSJ and GSJ+ runs, with higher quality for longer reads in GSJ+ assays (see Figure 2, panels A and B). The distribution of PHRED scores along the genome was linear in the GSJ+ assays (see Figure 3). Median PHRED scores grouped by 500-bp range were always above 20, with the exceptions of sample P13S1 between 4500 and 5000, sample P7S2 between 5500 and 6000 and sample P15S1 at the 3' extremity, where quality scores fell below 20.
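The threshold of 20 used above has a direct interpretation: a PHRED score Q corresponds to a base-calling error probability of 10^(-Q/10), so Q = 20 means 1 error in 100 calls. The following is a minimal sketch, not the pipeline used in the study, of how per-base scores could be grouped into 500-bp bins and summarized by their median. The function names, the 0-based bin starts and the toy data are assumptions made for illustration.

```python
from statistics import median


def phred_to_error_probability(q):
    """Convert a PHRED quality score into the probability of a wrong base call."""
    return 10 ** (-q / 10)


def median_phred_by_bin(positions, scores, bin_size=500):
    """Group per-base PHRED scores by genome position bin; return {bin_start: median}."""
    bins = {}
    for pos, q in zip(positions, scores):
        start = (pos // bin_size) * bin_size
        bins.setdefault(start, []).append(q)
    return {start: median(values) for start, values in sorted(bins.items())}


def low_quality_bins(medians, threshold=20):
    """Return the start of every bin whose median score falls below the threshold."""
    return [start for start, m in medians.items() if m < threshold]


# Toy data (hypothetical): genome positions and the PHRED score observed at each.
positions = [10, 250, 499, 500, 760, 999, 1000, 1200]
scores = [30, 32, 28, 18, 19, 25, 35, 33]
medians = median_phred_by_bin(positions, scores)
print(medians)
print(low_quality_bins(medians))
```

In this toy input only the 500-999 bin has a median below 20, which is the kind of local drop reported for three samples above.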
Consensus sequences covered more than 99% of the expected sequence for 15/19 samples and 98.8%, 97.5%, 89.8% and 77.9% for the 4 remaining ones (see Table 3).
Table 3.
Sample | Suspected typing* | Typing on whole genome** | Closest sequence ID | Identities | Number of reads | Global % of coverage | Median depth of coverage (IQR)
P1S1 | 1b | 1b | EF032892 | 91.36 % (7605/8324) | 16 731 | 89.82 % (8324/9267) | NC
P2S1 | 1a | 1a | EF407419 | 92.63 % (8583/9266) | 1 579 | 99.86 % (9266/9279) | 15 (10-19)
P3S1 | 1a | 1a | EF4C741S | 92.68 % (8525/9198) | 11 306 | 99.69 % (9198/9227) | NC
P4S1 | 1b | 1b | AY587016 | 90.18 % (8359/9269) | 5 625 | 99.84 % (9269/9284) | NC
P5S1 | 1a | 1a | AF511950 | 92.89 % (8548/9202) | 10 549 | 99.75 % (9202/9225) | NC
P6S1 | 1b | 1b | EF032852 | 91.21 % (8390/9199) | 6 125 | 99.73 % (9199/9224) | NC
P7S1 | 3a | 3a | D28917 | 92.16 % (8492/9214) | 4 218 | 99.39 % (9214/9271) | NC
P7S2 | 3a | 3a | D28917 | 92.42 % (8447/9140) | 21 756 | 98.78 % (9140/9253) | 32 (22-44)
P8S1 | 1b | / | / | / | 8 407 | / | /
P9S1 | RF_2k/1b | RF_2k/1b | FJ821465 | 93.24 % (8598/9221) | 13 383 | 99.87 % (9221/9233) | 957 (865-1062)
P9S2 | RF_2k/1b | RF_2k/1b | FJ821465 | 93.18 % (8589/9218) | 10 475 | 99.84 % (9218/9233) | 693 (600-789)
P9S3 | RF_2k/1b | RF_2k/1b | FJ821465 | 93.26 % (8600/9222) | 9 529 | 99.88 % (9222/9233) | 630 (542-720)
P10S1 | 1b | 1b | EF032892 | 91.1 % (8384/9214) | 6 736 | 99.93 % (9214/9220) | 451 (403-523)
P11S1 | 1b | 1b | EF032892 | 91.68 % (6560/7155) | 5 558 | 77.46 % (7155/9237) | 419 (175-523)
P11S2 | 1b | 1b | EF032892 | 91.71 % (8444/9207) | 1 592 | 99.85 % (9207/9220) | NC
P12S1 | RF_2k/1b | RF_2k/1b | AY587016 | 86.71 % (7844/9046) *** | 1 073 | 97.52 % (9046/9276) | 26 (22-36)
P13S1 | RF_2k/1b | RF_2k/1b | JX227952 | 93.86 % (8622/9186) | 975 | 99.67 % (9186/9216) | 39 (32-47)
P14S1 | 4a | / | / | / | 15 517 | / | /
P14S2 | 4a | 4a | DQ418788 | 94.62 % (8693/9187) | 5 872 | 99.64 % (9187/9220) | 289 (247-350)
P15S1 | 1a | 1a | AF511950 | 92.43 % (8520/9218) | 18 976 | 99.76 % (9218/9240) | 98 (86-117)
P15S2 | 1a | 1a | AF511950 | 92.24 % (8470/9183) | 11 358 | 99.86 % (9183/9196) | 41 (34-51)
/: UDPS failure; IQR: interquartile range; NC: not calculable
* according to NS3 and NS5B Sanger sequencing, or INNO-LiPA + NS5B Sanger sequencing for recombinant forms
** determined by phylogenetic analysis
*** P12S1: 94% identity calculated with the closest sequence of the LANL database, HC186777
amino acids 1 to 565, except amino acids 404-409 for P15S1. Median depth of coverage ranged from 15 to 957 X (see Figure 4).
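The "Global % of coverage" and "Identities" columns of Table 3 are simple ratios: covered bases over expected bases, and identical bases over covered bases. The sketch below shows one way such ratios could be derived from a pairwise alignment in which Ns and indels in the consensus are not counted as covered. It is an illustration under assumptions, not the authors' script: the function names and the toy alignment are hypothetical, and the alignment itself is taken as given.

```python
def percent(numerator, denominator, digits=2):
    """Express a ratio as a percentage rounded to the given number of digits."""
    return round(100 * numerator / denominator, digits)


def coverage_and_identity(aligned_consensus, aligned_reference):
    """Count (covered, expected, identical) bases from two aligned strings.

    Both strings come from the same global alignment, have equal length and
    use '-' for gaps. Reference gaps (insertions in the consensus) lie outside
    the expected sequence; consensus gaps and Ns count as not covered.
    """
    if len(aligned_consensus) != len(aligned_reference):
        raise ValueError("aligned sequences must have the same length")
    expected = covered = identical = 0
    for c, r in zip(aligned_consensus.upper(), aligned_reference.upper()):
        if r == "-":
            continue
        expected += 1
        if c == "-" or c == "N":
            continue
        covered += 1
        if c == r:
            identical += 1
    return covered, expected, identical


# Toy alignment (hypothetical sequences).
covered, expected, identical = coverage_and_identity("ATGTNACG-TA", "ACGTTAC-ATA")
print(percent(covered, expected), percent(identical, covered))

# The same arithmetic reproduces the P14S2 row of Table 3.
print(percent(9187, 9220), percent(8693, 9187))
```

Applied to the counts listed for P14S2 (9187 covered of 9220 expected bases, 8693 identical), the helper returns the 99.64 % coverage and 94.62 % identity given in the table.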
III - Genotyping
Phylogenetic analysis of the consensus sequences confirmed the genotypes and subtypes of all samples (see Figure 5). The suspected recombinant forms of HCV clustered with previously described RF_2k/1b on the dendrogram and were confirmed by bootscan analyses, which identified subtypes 2k and 1b as the components of the recombinant form (see Figure 6). Moreover, the presence of reads (n = 802, 20 and 27 for P9S1, P12S1 and P13S1 respectively) covering the recombination point with at least 30 2k-upstream and 30 1b-downstream nucleotides, without the need for computational reconstruction, attested to the presence of a unique recombinant RNA genome. No co-infection with other HCV types was detected by blasting every unique read longer than 50 bp (excluding those covering the 5'NC). The breakpoint was identical for the 3 patients and localised in the NS2 region, between nucleotide positions 3189 and 3200 based on H77 numbering. This recombination point was also shared with all previously described RF_2k/1b, indicating a common strain of the virus.
Calculation of the percentages of identity with the closest reference sequence identified by the phylogenetic analysis always showed less than 15% difference. Sample P12S1 exhibited the lowest identity percentage, 87%, using the AY587016 reference. However, a BLAST search of this sample sequence against the Los Alamos National Laboratory database identified sequence HC186777 as the closest one, with 94% identity (8703/9305), and confirmed the RF_2k/1b type of sample P12S1.
IV - DAAs resistance
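The read-level evidence described above relies on single reads that cross the breakpoint with at least 30 nucleotides on each side. A minimal sketch of the coordinate test is given below, assuming reads already mapped to H77 numbering with 1-based inclusive start and end positions. The function names and the example reads are hypothetical, and the sketch checks coordinates only: confirming that the upstream part is subtype 2k and the downstream part subtype 1b would need a separate comparison against reference sequences.

```python
def spans_breakpoint(read_start, read_end, bp_start=3189, bp_end=3200, flank=30):
    """True if a read covers the breakpoint interval plus `flank` bases on each side.

    Coordinates are 1-based and inclusive. With the default values the read must
    start at or before position 3159 and end at or after position 3230.
    """
    return read_start <= bp_start - flank and read_end >= bp_end + flank


def count_spanning_reads(reads, **kwargs):
    """Count the (start, end) read intervals that span the breakpoint."""
    return sum(1 for start, end in reads if spans_breakpoint(start, end, **kwargs))


# Hypothetical mapped reads as (start, end) pairs in H77 numbering.
reads = [(3100, 3400), (3170, 3400), (3000, 3210), (3159, 3230)]
print(count_spanning_reads(reads))
```

Only the first and last reads qualify here; the second starts too close to the breakpoint and the third ends too close to it.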
Analysis of the consensus sequences highlighted the presence of baseline RAVs on the genomic regions NS3, NS5A and NS5B for 7/13 patients, all naive of treatment with DAAs (see Table 4).
Table 4. RAVs detected on consensus sequences at baseline. RAVs were found in NS3 for 3 patients (T54A + I132V for P6S1, T54S for P10S1 and D168E for P12S1) and in NS5A for 4 patients (Y93H for P7S1 and P11S1, and L31M for P13S1 and P15S1). No known RAV (L159F, S282T, C316N/Y, L320F, V321A, N411S, M414T, Y448C/H, A553V and S556G) was detected in NS5B (30,31).
Intra-individual comparison of viral genomic sequences from 4 patients before treatment and after a post-treatment relapse, the treatment including sofosbuvir ± another DAA, showed the emergence of some variants in NS5B (see Table 5).
Table 5. Variants selected during DAA-based treatment in patients experiencing a relapse. SOF: sofosbuvir, DCV: daclatasvir, Peg: pegylated interferon α, /: not detected.
However, the 3D analysis of these variants showed that they were distant from the catalytic site of the polymerase and therefore could not explain the treatment failure (see Figure 7). For a fifth patient experiencing a relapse, the pre-therapeutic sequence could not be obtained, but no known RAV (resistance-associated variant) was detected in the post-therapeutic sample (P14S2). Mutations also appeared in NS5A despite the non-use of anti-NS5A in 3/4 patients.
V - Material and methods
Samples
Thirty-three plasma samples taken from 19 HCV-infected patients and belonging to the collection of the virology laboratory (N° DC-2008-680) were used to assess the technique. They were sampled between October 2010 and January 2015 and were stored at -80°C. The protocol was first optimized on genotype 1b samples with viral loads between 5.6 and 7.2 log IU/mL, and then its reproducibility and its performance on other genotypes were assessed. The genotype of these samples was initially determined using an NS3 Sanger sequencing assay and confirmed using NS5B sequencing analysis. The technique was later applied to selected samples: either from patients who had failed a DAA-based therapy, or from patients infected with a possible recombinant form RF_2k/1b. These last samples were initially classified either as 2a/2c using the VERSANT™ HCV Genotype 2.0 Assay or as 2k using Core sequencing, and classified as 1b using the NS3 assay, leading to the suspicion of recombinant forms of HCV. Their viral loads ranged from 2.2 to 5.8 log IU/mL.
Reverse transcription (RT)
Total nucleic acids were extracted from 1 mL of plasma and eluted in 50 μL on the easyMAG® instrument, according to the manufacturer's instructions (BioMerieux). Extracts were treated with TURBO DNase Ambion™ (Life Technologies). Reverse transcription (RT) was performed from 8 μL of RNA extract using PrimeScript™ Reverse Transcriptase (Ozyme) and a specific primer: dA20 or 3UTR1, annealing respectively to the poly(U) tail or the 3' non-coding region (3'NC). The manufacturer's instructions were followed, except for the addition of T4 Gene 32 Protein at 0.1 μg/μL (New England BioLabs). Success of the cDNA synthesis was checked with small PCRs on both sides of the genome.
Long range PCR (LR-PCR)
The HCV near-complete genome was amplified in a single fragment via an LR-PCR, from 2 μL of cDNA in a total volume of 20 μL, using either PrimeSTAR™ GXL DNA Polymerase (Takara Clontech) or KOD Xtreme™ Hot Start DNA Polymerase (Merck). The following conditions were applied: 94°C x 2 min, (98°C x 10 s, 68°C x 9 min 25) x 45 cycles, 4°C on hold. Several primer pairs were tested for each sample. A nested PCR with internal primers or a second-round PCR with the same primers was sometimes necessary to obtain the required DNA quantities, and was done under the same conditions. Primers are listed in Table 1. PCR products were then purified directly or from a 0.5% agarose gel using the NucleoSpin™ Gel and PCR Clean-up kit (Macherey-Nagel, Hoerdt, France). DNA concentration was quantified using Quant-iT™ PicoGreen™ dsDNA Reagent (Life Technologies, St Aubin, France) and a LightCycler™ 480 instrument (Roche Diagnostics), from 1 μL of sample diluted with TE 1X to 50 μL. PicoGreen® Reagent was then added in a 1:1 ratio.
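As a rough planning aid, the hold times of the cycling program above can be summed to estimate the instrument time of one LR-PCR run. The sketch below makes two assumptions that are not stated in the text: the "25" in "9 min 25" is read as seconds, and ramping time between temperatures is ignored. The function names are hypothetical.

```python
def program_seconds(initial_denaturation_s, step_durations_s, cycles):
    """Total hold time of a PCR program: one initial step plus repeated cycles."""
    return initial_denaturation_s + cycles * sum(step_durations_s)


def format_duration(seconds):
    """Render a number of seconds as hours, minutes and seconds."""
    hours, remainder = divmod(seconds, 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{hours} h {minutes} min {secs} s"


# 94 C x 2 min, then 45 cycles of (98 C x 10 s, 68 C x 9 min 25 s).
# Reading "9 min 25" as 9 min 25 s is an assumption.
total = program_seconds(2 * 60, [10, 9 * 60 + 25], 45)
print(format_duration(total))
```

Under these assumptions the holds alone take a little over seven hours, so a second-round or nested PCR roughly doubles the amplification time.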
DNA library/Next Generation Sequencing (454)
Between 0.500 and 1.000 μg of purified near whole genome amplicon of each sample was nebulized at 2.9 bar x 1 min. Following the manufacturer's manual for Rapid Library preparation, products were purified using the MinElute™ PCR Purification Kit (Qiagen) and small fragments were removed with Agencourt™ AMPure™ XP magnetic beads (Beckman Coulter). After ligation of multiplex identifiers and adaptors, library quality was assessed using an Agilent Bioanalyzer High Sensitivity DNA chip (Agilent Technologies) and libraries were quantitated with a TBS-380 Fluorometer. Based on this quantification, libraries were pooled equimolarly, and emulsion PCR (emPCR) and pyrosequencing were performed according to the manufacturer's protocols (Roche Diagnostics). Five to 8 samples per run were loaded on the same PicoTiterPlate and sequenced: the first 2 runs were carried out with the original GSJ system and the last 2 with the upgraded 454 GS Junior+ (GSJ+).
Sequences analysis
The obtained SFF files were split by MIDs and converted to FASTQ files. Quality control (particularly, PHRED quality score profiles) was performed on each FASTQ file using the FASTQC software. Contigs were generated de novo using the VICUNA software (Broad Institute Inc.) with default parameters except for the minimum length of contigs to output, which was set to 0, and the maximum percentage of divergence between read and consensus, which was set to 50% (default: 8%) when the consensus output from V-FAT presented too many gaps. V-FAT processed the output of VICUNA to orient and filter the raw contigs, merge them where they overlapped based on a reference alignment (the subtype-specific reference sequence for this alignment was determined using the HCV BLAST from Los Alamos National Laboratory, with the longest contig as query) and correct frameshifts found in coding regions. As a result, V-FAT yielded a consensus full-genome assembly.
This consensus was next used as a reference for reference-guided alignment with the MOSAIK software to further analyze the data, in order to obtain the depth of coverage and the PHRED quality score distribution all along the genome. Global percentages of coverage were evaluated using a Needleman-Wunsch algorithm to align each consensus sequence with its reference, within the limits of the expected sequence according to the primers used and discarding Ns as well as indels in the consensus sequences. Identity ratios were then assessed over the bases covered on this range. Coverage of the zones of interest in NS3, NS5A and NS5B was also assessed using the Geno2Pheno software. (The bioinformatics pipeline is represented in Figure 8.)
Typing analysis
Phylogenetic analysis was performed after alignment of the 19 HCV whole genome sequences obtained in this study together with 86 reference sequences. The phylogenetic tree was constructed by means of the Neighbor-Joining method on conserved sites and a Jukes-Cantor substitution model using MAFFT version 7 online software. Reliability of the various inferred clades was estimated by bootstrapping (1000 replicates). Visualization of the tree and node coloring were performed by means of Archaeopteryx.
For the recombinant forms, Bootscan analysis was carried out on the resulting consensus sequences compared to full-length genomic sequences of reference strains representing all HCV 1 and 2 subtypes, using SimPlot v3.5.1 software. The following parameters were applied: windows of 200 bp and steps of 20 bp, use of the Kimura 2-parameter distance model and the Neighbor-Joining tree model.
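The Jukes-Cantor model mentioned above corrects the observed proportion of differing sites p for multiple substitutions at the same site, giving the distance d = -(3/4) ln(1 - 4p/3). The sketch below illustrates that correction on a pair of aligned sequences. It is not the MAFFT implementation: the function names and the toy sequences are assumptions, and positions where either sequence carries a gap or an ambiguous base are simply skipped.

```python
import math


def p_distance(seq_a, seq_b):
    """Proportion of differing sites among positions where both bases are A, C, G or T."""
    compared = differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in "ACGT" and b in "ACGT":
            compared += 1
            if a != b:
                differing += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return differing / compared


def jukes_cantor_distance(p):
    """Jukes-Cantor corrected distance in substitutions per site."""
    if p >= 0.75:
        raise ValueError("p-distance too large for the Jukes-Cantor correction")
    return -0.75 * math.log(1 - 4 * p / 3)


# Toy aligned sequences (hypothetical): 1 difference over 10 sites.
p = p_distance("ACGTACGTAC", "ACGTACGTAT")
print(p, jukes_cantor_distance(p))
```

The corrected distance is slightly larger than the raw proportion, and the gap between the two widens as sequences diverge.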
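The window and step parameters are easiest to picture on a similarity plot such as the one in Figure 6, where the percentage of identity to each reference is computed in successive windows. The sketch below implements that simpler similarity scan, not the bootscan itself, which additionally builds a bootstrapped tree for every window. The function name and the toy sequences are hypothetical, and the inputs are assumed to be already aligned and of equal length.

```python
def sliding_identity(query, reference, window=200, step=20):
    """Return (window_midpoint, percent_identity) pairs along a pairwise alignment."""
    if len(query) != len(reference):
        raise ValueError("aligned sequences must have the same length")
    points = []
    for start in range(0, len(query) - window + 1, step):
        q = query[start:start + window]
        r = reference[start:start + window]
        # Gap-to-gap columns are not counted as matches.
        matches = sum(1 for a, b in zip(q, r) if a == b and a != "-")
        points.append((start + window // 2, 100 * matches / window))
    return points


# Toy recombinant (hypothetical): first half from parent A, second half from parent B.
parent_a = "A" * 600
parent_b = "C" * 600
recombinant = parent_a[:300] + parent_b[300:]
profile = sliding_identity(recombinant, parent_a)
print(profile[0], profile[10], profile[-1])
```

Identity to parent A falls from 100% to 0% across the junction, passing through 50% in the window centred on it, which is the crossing pattern used to locate a breakpoint.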
DAA resistance analysis
Whole genome consensus sequences at baseline were analysed using Geno2Pheno to search for RAVs. Intra-individual comparison of viral genomic sequences was performed by translating the nucleotide sequences into amino acids (AA) using EMBOSS Transeq and aligning them using Multalin. Modeling of the localization of the mutations on the HCV RNA polymerase NS5B structure in complex with sofosbuvir and RNA (genotype 2a, strain JFH-1) was performed using PyMOL software based on the PDB entry 4WTG.
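The intra-individual comparison amounts to translating two in-frame nucleotide sequences and listing the positions where the amino acids differ, in the usual notation (for example T54A). The sketch below illustrates this with the standard genetic code. It stands in for the EMBOSS Transeq and Multalin steps rather than reproducing them: the function names and the 9-nucleotide sequences are hypothetical, and the two sequences are assumed to be in frame, ungapped and of equal length.

```python
BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(nucleotides):
    """Translate an in-frame sequence; codons with ambiguous bases become 'X'."""
    seq = nucleotides.upper().replace("U", "T")
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, usable, 3))


def amino_acid_changes(before, after, first_position=1):
    """List substitutions between two aligned protein sequences, e.g. ['T54A']."""
    return [
        f"{a}{first_position + i}{b}"
        for i, (a, b) in enumerate(zip(before, after))
        if a != b
    ]


# Hypothetical fragments starting at amino acid 54 of a protein.
baseline = translate("ACCGATTCT")
relapse = translate("GCCGATTCT")
print(baseline, relapse, amino_acid_changes(baseline, relapse, first_position=54))
```

The `first_position` argument lets a fragment be numbered relative to the full protein, so the output can be compared directly with published RAV lists.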

Claims

1. A method of amplifying the full-length genome of a hepatitis C virus of any genotype comprising:
a) mixing :
- a cDNA of the total nucleic acid of a hepatitis C virus,
- a couple of primers, said primers being respectively complementary to two sequences located at the two extremities of said cDNA, said two sequences being distant from each other by at least 9 kb, and
- a DNA polymerase, to obtain a reaction mixture,
b) subjecting the reaction mixture to at least one PCR cycle to obtain a double strand DNA of at least 9 kb.
2. The method according to claim 1, wherein said couple of primers is chosen among the group consisting in the following couples:
SEQ ID NO: 1 and SEQ ID NO: 5,
SEQ ID NO: 1 and SEQ ID NO: 6,
SEQ ID NO: 1 and SEQ ID NO: 7,
SEQ ID NO: 1 and SEQ ID NO: 8,
SEQ ID NO: 2 and SEQ ID NO: 5,
SEQ ID NO: 2 and SEQ ID NO: 6,
SEQ ID NO: 2 and SEQ ID NO: 7,
SEQ ID NO: 2 and SEQ ID NO: 8,
SEQ ID NO: 3 and SEQ ID NO: 5,
SEQ ID NO: 3 and SEQ ID NO: 6,
SEQ ID NO: 3 and SEQ ID NO: 7,
SEQ ID NO: 3 and SEQ ID NO: 8,
SEQ ID NO: 4 and SEQ ID NO: 5,
SEQ ID NO: 4 and SEQ ID NO: 6,
SEQ ID NO: 4 and SEQ ID NO: 7, and
SEQ ID NO: 4 and SEQ ID NO: 8,
preferably chosen among the group consisting in:
SEQ ID NO: 1 and SEQ ID NO: 5,
SEQ ID NO: 1 and SEQ ID NO: 7,
SEQ ID NO: 2 and SEQ ID NO: 5,
SEQ ID NO: 2 and SEQ ID NO: 6, and
SEQ ID NO: 2 and SEQ ID NO: 7.
3. The method according to any of claims 1 or 2, wherein said DNA polymerase has a 3'→5' exonuclease activity and/or a 5'→3' exonuclease activity, preferably a 3'→5' exonuclease activity.
4. The method according to any of claims 1 to 3, wherein said DNA polymerase has a 3'→5' exonuclease activity and/or a 5'→3' exonuclease activity, preferably a 3'→5' exonuclease activity, is thermostable and allows hot-start PCR.
5. The method according to any of claims 1 to 4, wherein each primer has a size of 15 to 40 nucleotides, preferably of 20 to 40 nucleotides, more preferably of 28 to 30 nucleotides.
6. The method according to any of claims 1 to 5, wherein said at least one PCR cycle is:
(i) 95 to 98°C for about 10 seconds,
(ii) then, 65 to 70°C for about 9 minutes,
said at least one PCR cycle being repeated at least 30 times, preferably at least 40 times, more preferably at least 45 times.
7. The method according to any of claims 1 to 6, wherein said mixture reaction is heated once at 98°C for at least 1 minute, preferably for at least 2 minutes, just before being subjected to said at least one PCR cycle.
8. The method according to any of claims 1 to 7, wherein one additive is added to said reaction mixture, said additive being preferably chosen among the group consisting in dimethylsulfoxyd (DMSO), bovine serum albumin (BSA), T4 Gene 32 Protein and betaine.
9. The method according to any of claims 1 to 8, wherein said cDNA is obtained by reverse transcription using a primer chosen among the group consisting in: SEQ ID NO: 9 and SEQ ID NO: 10.
10. The method according to any of claims 1 to 9, wherein said cDNA is obtained by incubating total nucleic acid of a hepatitis C virus with a reverse transcriptase for at least 40°C, preferably for at least 50°C, during at least 50 minutes.
11. The method according to any of claims 1 to 10, wherein said cDNA is obtained by incubating total nucleic acid of a hepatitis C virus with a reverse transcriptase and at least one additive chosen among T4 Gene 32 Protein and a ribonuclease inhibitor.
12. The method according to any of claims 1 to 11, comprising a step of extracting said total nucleic acid of a hepatitis C virus from a biological sample containing viral particles, such as blood, plasma, hepatic puncture biopsy or peripheral blood mononuclear cells (PBMC) of a patient.
13. The method according to any of claims 1 to 12, comprising:
a) extracting total nucleic acid of a hepatitis C virus from a biological sample containing viral particles, such as blood, plasma or peripheral blood mononuclear cells (PBMC) of a patient,
b) producing a cDNA of the total nucleic acid of a hepatitis C virus by reverse transcription,
c) mixing :
- a cDNA of the total nucleic acid of a hepatitis C virus,
- a couple of primers, said primers being respectively complementary to two sequences located at the two extremities of said cDNA, said two sequences being distant from each other by at least 9 kb, and
- a DNA polymerase, to obtain a reaction mixture,
d) subjecting the reaction mixture to at least one PCR cycle to obtain a double strand DNA of at least 9 kb.
14. A method of sequencing the full-length genome of a hepatitis C virus comprising:
(a) amplifying the full-length genome of a hepatitis C virus using the method as defined in any of claims 1 to 13 to obtain a double strand DNA of at least 9 kb,
(b) sequencing said double strand DNA.
15. A method of detecting hepatitis C virus genotypes in a biological sample of a patient suspected of containing hepatitis C viral particles, comprising:
(a) amplifying the full-length genome of a hepatitis C virus using the method as defined in any of claims 1 to 13 to obtain a double strand DNA of at least 9 kb,
(b) fragmenting said double strand DNA and constructing a library of DNA fragments,
(c) sequencing DNA fragments,
(d) comparing sequences of said DNA fragments with each other to identify mutations representative of hepatitis C virus variants.
16. A method of sequencing the full-length genome of a hepatitis C virus according to claim 14, or a method of detecting hepatitis C virus genotypes according to claim 15, comprising a step of extracting total nucleic acid of the hepatitis C virus from a biological sample of a patient containing only one genotype of hepatitis C virus.
17. A method of sequencing the full-length genome of a hepatitis C virus according to claims 14 or 16, or a method of detecting hepatitis C virus genotypes according to claims 15 or 16, comprising a step of extracting total nucleic acid of the hepatitis C virus from a biological sample of a patient containing at least two different genotypes of hepatitis C virus.
18. A method of sequencing the full-length genome of a hepatitis C virus according to any of claims 14, 16 and 17, or a method of detecting hepatitis C virus genotypes according to any of claims 15, 16 and 17, wherein said step of sequencing is achieved using the Sanger method or a NGS method.
19. A method of detecting hepatitis C virus genotypes according to any of claims 15 to 18, wherein said step of fragmenting the double strand DNA is achieved by nebulization, by ultrasonication or by enzymatic digestion, preferably by nebulization.
20. A primer chosen among the group consisting in:
SEQ ID NO: 4, SEQ ID NO: 6 and SEQ ID NO: 10.
21. A couple of primers chosen among the group consisting in the following couples:
SEQ ID NO: 1 and SEQ ID NO: 5,
SEQ ID NO: 1 and SEQ ID NO: 7,
SEQ ID NO: 2 and SEQ ID NO: 5,
SEQ ID NO: 2 and SEQ ID NO: 6, and
SEQ ID NO: 2 and SEQ ID NO: 7.
22. A kit for the amplification of the full-length genome of a hepatitis C virus comprising:
- a couple of primers chosen among the group consisting in the following couples:
SEQ ID NO: 1 and SEQ ID NO: 5,
SEQ ID NO: 1 and SEQ ID NO: 7,
SEQ ID NO: 2 and SEQ ID NO: 5,
SEQ ID NO: 2 and SEQ ID NO: 6, and
SEQ ID NO: 2 and SEQ ID NO: 7,
and
- a DNA polymerase.
23. The kit according to claim 22 further comprising a reverse transcriptase.
24. The kit according to any of claims 22 to 23, further comprising a primer for the reverse transcription chosen among SEQ ID NO: 9 and SEQ ID NO: 10.
PCT/IB2015/001881 2015-08-03 2015-08-03 Methods for amplifying and sequencing the genome of a hepatitis c virus WO2017021752A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
PCT/IB2015/001881 WO2017021752A1 (en) 2015-08-03 2015-08-03 Methods for amplifying and sequencing the genome of a hepatitis c virus
PCT/EP2016/068589 WO2017021471A1 (en) 2015-08-03 2016-08-03 Methods for amplifying and sequencing the genome of a hepatitis c virus

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/IB2015/001881 WO2017021752A1 (en) 2015-08-03 2015-08-03 Methods for amplifying and sequencing the genome of a hepatitis c virus

Publications (1)

Publication Number Publication Date
WO2017021752A1 true WO2017021752A1 (en) 2017-02-09

Family

ID=54540129

Country Status (1)

Country Link
WO (2) WO2017021752A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111187813A (en) * 2020-02-20 2020-05-22 予果生物科技(北京)有限公司 Full-process quality control pathogenic microorganism high-throughput sequencing detection method

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110867207B (en) * 2019-11-26 2021-07-30 北京橡鑫生物科技有限公司 Evaluation method and evaluation device for verifying NGS (Next Generation Standard) variation detection method
CN112961942B (en) * 2021-03-31 2024-02-02 广州金域医学检验中心有限公司 Amplification primer for detecting HCV 2a subtype drug-resistant mutant gene, detection method and application
CN113151595B (en) * 2021-03-31 2024-02-06 广州金域医学检验中心有限公司 Amplification primer for detecting HCV6 type drug-resistant mutant gene, detection method and application
CN113025756B (en) * 2021-03-31 2024-02-06 广州金域医学检验中心有限公司 Detection method for detecting HCV type 1 drug-resistant mutant gene and application

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20120035240A1 (en) * 2003-06-12 2012-02-09 Alnylam Pharmaceuticals, Inc. Conserved hbv and hcv sequences useful for gene silencing
WO2013040060A2 (en) * 2011-09-12 2013-03-21 Pathogenica, Inc. Nucleic acids for multiplex detection of hepatitis c virus

Non-Patent Citations (14)

* Cited by examiner, † Cited by third party
Title
"EASL Clinical practice Guidelines: Management of hepatitis C virus infection", J HEPATOL., vol. 60, no. 2, 2014, pages 392 - 420
AVO ET AL.: "Hepatitis C virus subtyping based on sequencing of the C/El and NS5B genomic regions in comparison to a commercially available line probe assay", J MED VIROL, vol. 85, no. 5, 2013, pages 815 - 22
BARTOLINI BARBARA ET AL: "Near full length hepatitis C virus genome reconstruction by next generation sequencing based on genotype-independent amplification.", DIGESTIVE AND LIVER DISEASE : OFFICIAL JOURNAL OF THE ITALIAN SOCIETY OF GASTROENTEROLOGY AND THE ITALIAN ASSOCIATION FOR THE STUDY OF THE LIVER JUL 2015, vol. 47, no. 7, 23 March 2015 (2015-03-23), pages 608 - 612, XP002752731, ISSN: 1878-3562 *
DEMETRIOU VICTORIA L ET AL: "Near-full genome characterisation of two natural intergenotypic 2k/1b recombinant hepatitis C virus isolates.", ADVANCES IN VIROLOGY 2011, vol. 2011, 710438, 2011, pages 1 - 7, XP002752732, ISSN: 1687-8647, DOI: 10.1155/2011/710438 *
EILEEN Z ZHANG ET AL: "Development of a sensitive RT-PCR method for amplifying and sequencing near full-length HCV genotype 1 RNA from patient samples", VIROLOGY JOURNAL, BIOMED CENTRAL, LONDON, GB, vol. 10, no. 1, 53, 12 February 2013 (2013-02-12), pages 1 - 6, XP021139872, ISSN: 1743-422X, DOI: 10.1186/1743-422X-10-53 *
FAN ET AL.: "Efficient amplification and cloning of near full-length hepatitis C virus genome from clinical samples", BIOCHEM BIOPHYS RES COMMUN., vol. 346, no. 4, 2006, pages 1163 - 72
FAN X ET AL: "Efficient amplification and cloning of near full-length hepatitis C virus genome from clinical samples", BIOCHEMICAL AND BIOPHYSICAL RESEARCH COMMUNICATIONS, ACADEMIC PRESS INC. ORLANDO, FL, US, vol. 346, no. 4, 11 August 2006 (2006-08-11), pages 1163 - 1172, XP024925438, ISSN: 0006-291X, [retrieved on 20060811], DOI: 10.1016/J.BBRC.2006.06.039 *
LARRAT ET AL.: "Sequencing assays for failed genotyping with the versant hepatitis C virus genotype assay (LiPA), version 2.0", J CLIN MICROBIOL., vol. 51, no. 9, 2013, pages 2815 - 21
MOREL ET AL.: "Genetic recombination of the hepatitis C virus: clinical implications", J VIRAL HEPAT., vol. 18, no. 2, 2011, pages 77 - 83
RAMIERE ET AL.: "Recent evidence of underestimated circulation of hepatitis C virus intergenic recombinant strain RF2k/lb in the Rhone-Alpes region, France, January to August 2014: implications for antiviral treatment", EURO SURVEILL BULL EUR SUR MAL TRANSM EUR COMMUN DIS BULL., vol. 19, no. 43, 2014
SMITH ET AL.: "Expanded classification of hepatitis C virus into 7 genotypes and 67 genotypes: updated criteria and genotype assignment Web resource", HEPATOL BALTIM MD., vol. 59, no. 1, 2014, pages 318 - 27
SULKOWSKI ET AL.: "Ombitasvir paritaprevir co-dosed with ritonavir, dasabuvir, and ribavirin for hepatitis C in patients co-infected with HIV-1: a randomized trial", JAMA, vol. 313, no. 12, 2015, pages 1223 - 31
TELLIER: "Long PCR and its application to hepatitis viruses: amplification of hepatitis A, hepatitis B, and hepatitis C virus genomes", J CLIN MICROBIOL., vol. 34, no. 12, 1996, pages 3085 - 81
ZHANG ET AL.: "Development of a sensitive RT-PCR method for amplifying and sequencing near full-length HCV genotype 1 RNA from patient samples", VIROL J., vol. 10, 2013, pages 53

Also Published As

Publication number Publication date
WO2017021471A8 (en) 2017-09-21
WO2017021471A1 (en) 2017-02-09

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 15793902

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 15793902

Country of ref document: EP

Kind code of ref document: A1