WO2021202861A1 - Characterisation method - Google Patents

Characterisation method

Info

Publication number
WO2021202861A1
Authority
WO
WIPO (PCT)
Prior art keywords
paratope
sample
nucleic acid
igh
pathogen
Prior art date
Application number
PCT/US2021/025352
Other languages
English (en)
Inventor
Jeffrey Edward MILLER
Original Assignee
Invivoscribe, Inc.
Application filed by Invivoscribe, Inc.
Publication of WO2021202861A1

Classifications

    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61P SPECIFIC THERAPEUTIC ACTIVITY OF CHEMICAL COMPOUNDS OR MEDICINAL PREPARATIONS
    • A61P31/00 Antiinfectives, i.e. antibiotics, antiseptics, chemotherapeutics
    • A61P31/12 Antivirals
    • A61P31/14 Antivirals for RNA viruses
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61K PREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
    • A61K39/00 Medicinal preparations containing antigens or antibodies
    • A61K39/12 Viral antigens
    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K16/00 Immunoglobulins [IGs], e.g. monoclonal or polyclonal antibodies
    • C07K16/08 Immunoglobulins [IGs], e.g. monoclonal or polyclonal antibodies against material from viruses
    • C07K16/10 Immunoglobulins [IGs], e.g. monoclonal or polyclonal antibodies against material from viruses from RNA viruses
    • C07K16/1002 Coronaviridae
    • C07K16/1003 Severe acute respiratory syndrome coronavirus 2 [SARS-CoV-2 or Covid-19]
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6876 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
    • C12Q1/6881 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for tissue or cell typing, e.g. human leukocyte antigen [HLA] probes
    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K2317/00 Immunoglobulins specific features
    • C07K2317/50 Immunoglobulins specific features characterized by immunoglobulin fragments
    • C07K2317/52 Constant or Fc region; Isotype
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N2770/00 MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA ssRNA viruses positive-sense
    • C12N2770/00011 Details
    • C12N2770/20011 Coronaviridae
    • C12N2770/20034 Use of virus or viral component as vaccine, e.g. live-attenuated or inactivated virus, VLP, viral protein

Definitions

  • The present invention relates generally to a method of identifying candidate immunoglobulin paratopes directed to a pathogen of interest. More particularly, the present invention is directed to a method of screening for candidate immunoglobulin paratopes directed to a pathogen-derived epitope by screening for immunoglobulin paratopes which have both undergone class switching and are expressed across a genetically diverse infected population. The method of the present invention facilitates the detection and analysis of immunologically dominant paratopes, together with the epitopes to which they are directed.
  • The paratopes identified by the screening method of the present invention are useful, inter alia, for identifying paratope sequences for recombinant antibody production, identifying and isolating candidate epitopes and immunogens for vaccine development, developing point of care diagnostics, developing immunogen expression systems, identifying and/or developing neutralising antibodies, assessing the immune status of individuals who have been previously infected with said pathogen and assessing the immune status of individuals vaccinated with antigen-based vaccines.
  • Mammals are required to defend themselves against a multitude of pathogens including viruses, bacteria, fungi and parasites, as well as non-pathogenic insults such as tumours and toxic, or otherwise harmful, agents.
  • Effector mechanisms have evolved which are capable of mounting a defence against such antigens.
  • These mechanisms are mediated by soluble molecules and/or by cells.
  • Cellular immunity is characterised by two distinct forms of immune response, these being the innate (non-specific) and adaptive (specific) immune responses.
  • Innate immunity, which is not dependent upon recognition of specific antigens, is usually the first line of defence against invading pathogens.
  • The adaptive immune response is designed not only to recognise and respond to a particular antigen in a highly specific manner but, further, to adapt its response during an infection to improve its recognition of the pathogen. This optimised response is then retained, in the form of immunological memory, after the pathogen has been eliminated, thereby enabling the adaptive immune system to mount faster and stronger defences each time that a given pathogen is encountered.
  • Adaptive immune responses are effected by lymphocytes and fall into two broad classes of response - antibody (immunoglobulin) responses and cell-mediated immune responses, which are regulated by two different classes of lymphocytes, these being B cells and T cells, respectively.
  • B cells secrete immunoglobulins which circulate in the bloodstream and permeate other body fluids to bind specifically to the foreign antigen that stimulated their production.
  • B cell effector mechanisms are activated when cell surface bound antibodies bind to antigen.
  • The antigen/antibody complex is internalised and enzymatically degraded into peptides which then undergo MHC Class II presentation and T helper cell binding. Activation of the B cells by this mechanism induces proliferation and differentiation to antibody secreting plasma cells.
  • Antibody bound antigens can then be cleared by a range of mechanisms including complement activation, phagocytosis and inhibition of antigen functionality, such as in the context of competitive inhibition.
  • The antibody response is broadly divided into two phases. Following initial antigen exposure, the primary immune response marks the initial antibody response phase and is characterised by the secretion of IgM, followed by class switching to IgG secretion and the generation of memory B cells. Subsequent exposure to the same antigen produces a more rapid secondary immune response from the memory B cell population. Immunoglobulin class switching is also known as isotype switching, isotypic commutation or class-switch recombination and is the biological mechanism that changes a B cell’s production of immunoglobulin from one type to another, such as from the isotype IgM to the isotype IgG.
  • During class switching, the variable region of the heavy chain stays the same. Since the variable region does not change, class switching does not affect antigen specificity. Instead, the antibody retains affinity for the same antigens, but can interact with different effector molecules. Class switching occurs after activation of a mature B cell via its membrane-bound antibody molecule to generate the different classes of antibody, all with the same variable domains as the original antibody generated in the immature B cell during the process of V(D)J recombination but possessing distinct constant domains in their heavy chains. Naive mature B cells produce both IgM and IgD, which are the first two heavy chain segments in the immunoglobulin locus. After activation by antigen, these B cells proliferate. If these activated B cells encounter specific signalling molecules via their CD40 and cytokine receptors (both modulated by T helper cells), they undergo antibody class switching to produce IgG, IgA or IgE antibodies.
  • Affinity maturation will also occur. This is the process by which activated B cells produce antibodies with increased affinity for a given antigen. With repeated exposure to the same antigen, a host will produce antibodies of successively greater affinities. A secondary immune response can elicit antibodies with several fold greater affinity than a primary response. Affinity maturation is thought to involve two interrelated processes, occurring in the germinal centres of the secondary lymphoid organs, these being somatic hypermutation and clonal selection. Somatic hypermutation is characterised by mutations in the variable, antigen-binding coding sequences (known as complementarity determining regions (CDRs)) of the immunoglobulin genes.
  • The mutation rate is up to 1,000,000 times higher than in cell lines outside the lymphoid system.
  • The increased mutation rate results in 1-2 mutations per CDR and, hence, per cell generation, and alters the binding specificity and binding affinities of the resultant antibodies.
  • B cells that have undergone somatic hypermutation must compete for limiting growth resources, including the availability of antigen and paracrine signals.
  • The follicular dendritic cells of the germinal centres present antigen to the B cells, and the B cell progeny with the highest affinities for antigen, having gained a competitive advantage, are favoured for positive selection leading to their survival. Over several rounds of selection, the resultant secreted antibodies will exhibit increased affinities for antigen.
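As a rough illustration of the two interrelated processes described above, somatic hypermutation and clonal selection, the following Python sketch repeats rounds of random mutation followed by selection of the best-scoring variant. It is a toy model under stated assumptions: the `affinity` score (the number of positions matching a hypothetical `target` string), the mutation `rate` and the number of `progeny` are all illustrative and are not taken from the specification, and real antibody binding is not determined this way.

```python
import random


def affinity(seq: str, target: str) -> int:
    """Toy affinity: number of positions where seq matches a hypothetical target."""
    return sum(a == b for a, b in zip(seq, target))


def mutate(seq: str, rng: random.Random, rate: float) -> str:
    """Replace each base with a random base with probability `rate`."""
    return "".join(rng.choice("ACGT") if rng.random() < rate else base for base in seq)


def affinity_maturation(seed_seq, target, rounds=10, progeny=20, rate=0.05, rng=None):
    """Repeat mutation and selection; return the final sequence and score history."""
    if rng is None:
        rng = random.Random(0)
    best = seed_seq
    history = [affinity(best, target)]
    for _ in range(rounds):
        # The parent competes with its mutated progeny, so the score never drops.
        pool = [best] + [mutate(best, rng, rate) for _ in range(progeny)]
        best = max(pool, key=lambda s: affinity(s, target))
        history.append(affinity(best, target))
    return best, history


if __name__ == "__main__":
    final_seq, scores = affinity_maturation("ACGTACGTACGT", "TTGTACGAACGA")
    print(final_seq, scores)
```

Because the parent sequence is kept in each selection pool, the score history is non-decreasing, which mirrors the statement that successive rounds of selection yield antibodies of increased affinity.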
  • Vaccine production comprises several stages.
  • Identification of the antigen target itself is critical.
  • Methods of production of the antigen are determined and antigen is generated.
  • Viruses can be grown on primary cells such as chicken eggs (e.g., for influenza) or on continuous cell lines such as cultured human cells (e.g., for Hepatitis A). If bacteria are used, these are usually grown in bioreactors.
  • A recombinant protein derived from the viruses or bacteria can be generated in yeast, bacteria, or cell cultures and provides a means of generating detoxified antigens.
  • Recombinant antigen production is expected to become increasingly important due to its ability to rapidly produce highly specific, including rationally designed, antigens in a highly enriched state. After the antigen is generated, it is isolated from the cells used to generate it and the vaccine is formulated. Combination vaccines are more difficult to develop and produce due to potential incompatibilities and interactions amongst the antigens and other components involved.
  • An alternative approach for making vaccines is to bypass the ex vivo production of antigen and direct protein production in the host cells using antigen epitopes from recombinant mRNA molecules that can encode an entire protein of interest. These mRNA vaccines have recently proved very successful (e.g., Pfizer and Moderna SARS-CoV-2 vaccines).
  • Detoxification may be required before an antigen (for example, pertussis toxin) can safely be administered to humans, but some detoxification methods may destroy epitopes in the process, and impact immunogenicity.
  • Initial in vitro studies are important to evaluate antigen-antibody binding capacity and function, and investigation of the immune response to a candidate antigen is necessary.
  • The identification of suitable immunogens which are efficacious across a genetically diverse population remains slow and labour intensive. Accordingly, there is an ongoing need to develop improved means for identifying pathogen immunogens suitable for applications such as vaccine development and diagnostics, in particular immunointeraction-based point of care diagnostics.
  • This information thereby enables a wide range of applications including the generation of immunogens for vaccine development, development of diagnostics (such as point of care diagnostics), development of immunogen expression systems, identifying and/or developing neutralising antibodies, assessment of the immune status of individuals who have been infected with said pathogen and assessment of the immune status of individuals vaccinated with antigen based vaccines directed to said pathogen.
  • The present invention has now enabled the development of a means of rapidly and accurately identifying and characterising affinity-matured, pathogen-directed immunoglobulin paratopes which are functional across a genetically diverse population.
  • One aspect of the present invention is directed to a method of identifying candidate immunoglobulin paratopes or part thereof directed to a pathogen of interest, said method comprising:
  • step (iii) sequencing the nucleic acid region encoding the paratope or part thereof of the IgM and/or IgD heavy chain mRNA of the sample of step (i) and the IgH isotype switched nucleic acid of the one or more samples of step (ii) and wherein said paratope or part thereof comprises at least the CDR3 region or part thereof;
  • step (iv) analysing the nucleic acid sequencing results of step (iii) to identify paratope encoding nucleic acid sequences which are expressed by both the sequencing sample result of the step (i) sample and the sequencing sample result of the step (ii) sample and which exhibit at least 90% identity across the length of the exon sequences or at least 90% identity across the length of the exon sequences encoding the CDR3 region;
  • step (v) optionally determining the amino acid sequences encoded by the sequences identified in step (iv);
  • step (vi) screening IgH paratope nucleic acid sequence data from one or more other subjects infected with said pathogen to identify the presence of IgH paratope nucleic acid sequences which correlate to any of the paratope sequences identified in step (iv) and/or which encode the amino acid sequences or homologs thereof of step (v) in at least one of said other subjects; wherein IgH paratope nucleic acid sequences identified in step (vi) are indicative of a candidate IgH paratope directed to said pathogen.
  • said infection of step (i) is a primary infection.
  • said IgH isotype switched nucleic acid is IgG, IgA and/or IgE heavy chain nucleic acid.
  • the subject pathogen is a microorganism, for example a virus, bacterium or parasite.
  • said pathogen is an antigen derived from a microorganism, environmental agent or allergen. More particularly, said pathogen is a vaccine.
  • said infection occurs by the administration of the pathogen to the subject. More particularly, said administration is vaccination or other mode of antigen challenge.
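The comparison in step (iv), which keeps sequences found in both the step (i) and step (ii) results at 90% identity or more, can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the specification does not name an alignment method, so identity is approximated here from the edit distance normalised by the longer sequence, and the function names (`edit_distance`, `percent_identity`, `shared_paratopes`) are hypothetical.

```python
from typing import Iterable, List, Tuple


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance (substitutions, insertions, deletions)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def percent_identity(a: str, b: str) -> float:
    """Assumed identity measure: 100 * (1 - edit distance / longer length)."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 100.0
    return 100.0 * (longest - edit_distance(a, b)) / longest


def shared_paratopes(
    unswitched: Iterable[str], switched: Iterable[str], threshold: float = 90.0
) -> List[Tuple[str, str, float]]:
    """Pair step (i) sequences with step (ii) sequences meeting the threshold."""
    switched = list(switched)
    pairs = []
    for m in unswitched:
        for s in switched:
            pid = percent_identity(m, s)
            if pid >= threshold:
                pairs.append((m, s, pid))
    return pairs
```

The all-against-all loop is quadratic in the number of sequences; for repertoire-scale data a clustering or indexing step would likely be needed, which is outside the scope of this sketch.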
  • step (iii) sequencing the nucleic acid region encoding the paratope or part thereof of the IgM and/or IgD heavy chain mRNA of the sample of step (i) and the IgH isotype switched nucleic acid of the one or more samples of step (ii) and wherein said paratope or part thereof comprises at least the CDR3 region or part thereof;
  • step (iv) analysing the nucleic acid sequencing results of step (iii) to identify paratope encoding nucleic acid sequences which are expressed by both the sequencing sample result of the step (i) sample and the sequencing sample result of the step (ii) sample and which exhibit at least 90% identity across the length of the exon sequences or at least 90% identity across the length of the exon sequence encoding the CDR3 region;
  • step (v) optionally determining the amino acid sequences encoded by the sequences identified in step (iv);
  • step (vi) screening IgH paratope nucleic acid sequence data from one or more other subjects infected with said pathogen to identify the presence of IgH paratope nucleic acid sequences which correlate to any of the paratope sequences identified in step (iv) and/or which encode the amino acid sequences or homologs thereof of step (v) in at least one of said other subjects; wherein IgH paratope nucleic acid sequences identified in step (vi) are indicative of a candidate IgH paratope directed to said pathogen.
  • the peripheral blood B cell nucleic acid is derived from a sample of peripheral blood mononuclear cells.
  • the peripheral blood B cell nucleic acid is derived from a sample of peripheral blood lymphocytes.
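Step (v), the optional determination of the amino acid sequences encoded by the nucleic acid sequences identified in step (iv), amounts to codon translation. The sketch below builds the standard genetic code table and translates an in-frame sequence; it assumes the input already starts at the first base of a codon, and the `translate` helper and the example sequence are illustrative rather than taken from the specification.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G (third base fastest).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(nucleotides: str) -> str:
    """Translate an in-frame DNA or mRNA sequence; unknown codons become 'X'."""
    seq = nucleotides.upper().replace("U", "T")
    usable = len(seq) - len(seq) % 3  # ignore an incomplete trailing codon
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, usable, 3))


if __name__ == "__main__":
    # Hypothetical fragment for illustration only.
    print(translate("TGTGCGAGATGG"))
```

Working at the amino acid level lets step (vi) match sequences that differ only by synonymous nucleotide changes.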
  • step (iii) sequencing the nucleic acid region encoding the paratope or part thereof of the IgM and/or IgD heavy chain mRNA of the sample of step (i) and the IgH isotype switched nucleic acid of the one or more samples of step (ii) and wherein said paratope or part thereof comprises at least the CDR3 region or part thereof;
  • step (iv) analysing the nucleic acid sequencing results of step (iii) to identify paratope encoding nucleic acid sequences which are expressed by both the sequencing sample result of the step (i) sample and the sequencing sample result of the step (ii) sample and which exhibit at least 90% identity across the length of the exon sequences or at least 90% identity across the length of the exon sequences encoding the CDR3 region;
  • step (v) optionally determining the amino acid sequences encoded by the sequences identified in step (iv);
  • step (vi) screening IgH paratope nucleic acid sequence data from one or more other subjects infected with said pathogen to identify the presence of IgH paratope nucleic acid sequences which correlate to any of the paratope sequences identified in step (iv) and/or which encode the amino acid sequences or homologs thereof of step (v) in at least one of said other subjects; wherein IgH paratope nucleic acid sequences identified in step (vi) are indicative of a candidate IgH paratope directed to said pathogen.
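The cross-subject screen of step (vi) can be sketched as a lookup of each candidate from step (iv) or (v) in the paratope sequence data of other infected subjects. In this sketch an exact string match stands in for sequences that "correlate" or are "homologs", which is a simplifying assumption; the function name `screen_cohort`, the subject identifiers and the sequences are hypothetical.

```python
from typing import Dict, Iterable, List


def screen_cohort(
    candidates: Iterable[str],
    cohort: Dict[str, Iterable[str]],
    min_subjects: int = 1,
) -> Dict[str, List[str]]:
    """Map each candidate to the other subjects in which it is also found.

    `cohort` maps a subject identifier to that subject's paratope sequences.
    Candidates seen in fewer than `min_subjects` other subjects are dropped.
    """
    repertoires = {sid: set(seqs) for sid, seqs in cohort.items()}
    hits: Dict[str, List[str]] = {}
    for cand in candidates:
        subjects = sorted(sid for sid, seqs in repertoires.items() if cand in seqs)
        if len(subjects) >= min_subjects:
            hits[cand] = subjects
    return hits


if __name__ == "__main__":
    cohort = {"S2": ["CARDYW", "CAKGGW"], "S3": ["CARDYW"], "S4": ["CTTTTW"]}
    print(screen_cohort(["CARDYW", "CAAAAW"], cohort))
```

The default `min_subjects=1` follows the "at least one of said other subjects" wording; raising it would favour paratopes shared more widely across the population.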
  • step (iii) sequencing the nucleic acid region encoding the paratope or part thereof of the IgM heavy chain mRNA of the sample of step (i) and the IgG heavy chain mRNA of the one or more samples of step (ii) and wherein said paratope or part thereof comprises at least the CDR3 region or part thereof;
  • step (iv) analysing the nucleic acid sequencing results of step (iii) to identify paratope encoding nucleic acid sequences which are expressed by both the sequencing sample result of the step (i) sample and the sequencing sample result of the step (ii) sample and which exhibit at least 90% identity across the length of the exon sequences or at least 90% identity across the length of the exon sequences encoding the CDR3 region;
  • step (v) optionally determining the amino acid sequences encoded by the sequences identified in step (iv);
  • step (vi) screening IgH paratope nucleic acid sequence data from one or more other subjects infected with said pathogen to identify the presence of IgH paratope nucleic acid sequences which correlate to any of the paratope sequences identified in step (iv) and/or which encode the amino acid sequences or homologs thereof of step (v) in at least one of said other subjects; wherein IgH paratope nucleic acid sequences identified in step (vi) are indicative of a candidate IgH paratope directed to said pathogen.
  • said pathogen is a microorganism.
  • said RNA is mRNA.
  • the B cell RNA is derived from a sample of peripheral blood mononuclear cells.
  • the B cell RNA is derived from a sample of peripheral blood lymphocytes.
  • sequential samples are isolated 2, 3, 4, 5, 6, 7 or 8 weeks apart.
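The sampling interval stated above can be checked programmatically when sample collection dates are recorded. The following sketch computes the gaps between consecutive samples in weeks and tests that each falls within 2 to 8 weeks; treating the listed values as an inclusive range, and the helper names and dates used, are assumptions made for illustration.

```python
from datetime import date
from typing import List, Sequence

MIN_WEEKS = 2  # lower bound taken from the stated 2 to 8 week spacing
MAX_WEEKS = 8  # upper bound taken from the stated 2 to 8 week spacing


def intervals_in_weeks(sample_dates: Sequence[date]) -> List[float]:
    """Gaps between consecutive sample dates, in weeks."""
    ordered = sorted(sample_dates)
    return [(later - earlier).days / 7 for earlier, later in zip(ordered, ordered[1:])]


def schedule_ok(sample_dates: Sequence[date]) -> bool:
    """True if every consecutive pair of samples is 2 to 8 weeks apart."""
    return all(MIN_WEEKS <= w <= MAX_WEEKS for w in intervals_in_weeks(sample_dates))


if __name__ == "__main__":
    # Hypothetical collection dates.
    print(schedule_ok([date(2021, 1, 1), date(2021, 1, 29), date(2021, 3, 12)]))
```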
  • step (iii) sequencing the nucleic acid region encoding the paratope or part thereof of the IgM heavy chain mRNA of the sample of step (i) and sequencing the nucleic acid region encoding the paratope or part thereof of the IgH isotype switched rearranged DNA of the one or more samples of step (ii) and wherein said paratope or part thereof comprises at least the CDR3 region or part thereof;
  • step (iv) analysing the nucleic acid sequencing results of step (iii) to identify paratope encoding nucleic acid sequences which are expressed by both the sequencing sample result of the step (i) sample and the sequencing sample result of the step (ii) sample and which exhibit at least 90% identity across the length of the exon sequences or at least 90% identity across the length of the exon sequences encoding the CDR3 region;
  • step (v) optionally determining the amino acid sequences encoded by the sequences identified in step (iv);
  • step (vi) screening IgH paratope nucleic acid sequence data from one or more other subjects infected with said pathogen to identify the presence of IgH paratope nucleic acid sequences which correlate to any of the paratope sequences identified in step (iv) and/or which encode the amino acid sequences or homologs thereof of step (v) in at least one of said other subjects; wherein IgH paratope nucleic acid sequences identified in step (vi) are indicative of a candidate IgH paratope directed to said pathogen.
  • said isotype switch IgH is IgG.
  • said isotype switch is detected by clonal expansion analysis.
  • said isotype switch is detected by amplification and sequencing of IgG rearranged genomic DNA.
  • step (iii) sequencing the nucleic acid region encoding the paratope or part thereof of the IgM heavy chain mRNA of the sample of step (i) and the IgG heavy chain mRNA of the one or more samples of step (ii) and wherein said paratope or part thereof comprises at least the CDR3 region or part thereof;
  • step (iv) analysing the nucleic acid sequencing results of step (iii) to identify paratope encoding nucleic acid sequences which are expressed by both the sequencing sample result of the step (i) sample and the sequencing sample result of the step (ii) sample and which exhibit at least 90% identity across the length of the exon sequences or at least 90% identity across the length of the exon sequences encoding the CDR3 region;
  • step (v) optionally determining the amino acid sequences encoded by the sequences identified in step (iv);
  • step (vi) screening IgH paratope nucleic acid sequence data from one or more other subjects infected with said pathogen to identify the presence of IgH paratope nucleic acid sequences which correlate to any of the paratope sequences identified in step (iv) and/or which encode the amino acid sequences or homologs thereof of step (v) in at least one of said other subjects; wherein IgH paratope nucleic acid sequences identified in step (vi) are indicative of a candidate IgH paratope directed to said pathogen.
  • step (iv) analysing the cDNA sequencing results of step (iii) to identify paratope encoding cDNA sequences which are expressed by both the sequencing sample result of the step (i) sample and the sequencing sample result of the step (ii) sample and which exhibit at least 90% identity across the length of the exon sequences or at least 90% identity across the length of the exon sequences encoding the CDR3 region;
  • step (v) optionally determining the amino acid sequences encoded by the sequences identified in step (iv);
  • step (vi) screening IgH paratope DNA sequence data from one or more other subjects infected with said pathogen to identify the presence of IgH paratope nucleic acid sequences which correlate to any of the paratope sequences identified in step (iv) and/or which encode the amino acid sequences or homologs thereof of step (v) in at least one of said other subjects; wherein IgH paratope DNA sequences identified in step (vi) are indicative of a candidate IgH paratope directed to said pathogen.
  • the present invention is directed to a method of identifying candidate immunoglobulin paratopes or part thereof directed to a pathogen of interest, said method comprising:
  • step (ii) wherein said paratope or part thereof comprises at least the CDR3 region or part thereof and sequencing said amplification product;
  • step (iv) analysing the cDNA sequencing results of step (iii) to identify paratope encoding cDNA sequences which are expressed by both the sequencing sample result of the step (i) sample and the sequencing sample result of the step (ii) sample and which exhibit at least 90% identity across the length of the exon sequences or at least 90% identity across the length of the exon sequences encoding the CDR3 region;
  • step (v) optionally determining the amino acid sequences encoded by the sequences identified in step (iv);
  • step (vi) screening IgH paratope DNA sequence data from one or more other subjects infected with said pathogen to identify the presence of IgH paratope nucleic acid sequences which correlate to any of the paratope sequences identified in step (iv) and/or which encode the amino acid sequences or homologs thereof of step (v) in at least one of said other subjects; wherein IgH paratope DNA sequences identified in step (vi) are indicative of a candidate IgH paratope directed to said pathogen.
  • said pathogen is a microorganism.
  • the B cell RNA is derived from a sample of peripheral blood mononuclear cells.
  • the B cell RNA is derived from a sample of peripheral blood lymphocytes.
  • sequential samples are isolated 2, 3, 4, 5, 6, 7 or 8 weeks apart.
  • said paratope which is amplified includes, but is not limited to:
  • V gene segment region such as a region predisposed to undergoing hypermutation
  • J gene segment region encoding a portion of the CDR3 are amplified and sequenced.
  • step (iii) sequencing the cDNA reverse transcribed from the mRNA encoding the paratope or part thereof of the IgM heavy chain of the sample of step (i) and the IgH isotype switched nucleic acid of the one or more samples of step (ii) and wherein said paratope or part thereof comprises at least the CDR3 region or part thereof;
  • step (iv) analysing the cDNA sequencing results of step (iii) to identify paratope encoding cDNA sequences which are expressed by both the sequencing sample result of the step (i) sample and the sequencing sample result of the step (ii) sample and which exhibit at least 90% identity across the length of the exon sequences or at least 90% identity across the length of the exon sequences encoding the CDR3 region;
  • step (vi) screening IgH paratope DNA sequence data from one or more other subjects infected with said pathogen to identify the presence of IgH paratope nucleic acid sequences which either correlate to any of the paratope sequences identified in step (iv) or which encode the amino acid sequences or homologs thereof of step (v) in at least one of said other subjects; wherein IgH paratope DNA sequences identified in step (vi) are indicative of a candidate IgH paratope directed to said pathogen.
  • said pathogen is a microorganism.
  • the B cell nucleic acid is derived from a sample of peripheral blood mononuclear cells.
  • the B cell nucleic acid is derived from a sample of peripheral blood lymphocytes.
  • sequential samples are isolated 2, 3, 4, 5, 6, 7 or 8 weeks apart, for example 4-8 weeks apart.
  • the nucleic acid sequence data of step (vi) is mRNA sequence data and said IgH is IgG.
  • the nucleic acid sequence data of step (vi) is rearranged IgH genomic DNA sequence data.
  • step (iii) comprises non-selectively reverse transcribing the RNA of the samples of steps (i) and (ii) and (a) selectively amplifying the paratope region or fragment thereof of the IgM heavy chain of the cDNA transcribed from the sample of step (i), (b) selectively amplifying the IgG heavy chain from the cDNA transcribed from the one or more samples of step (ii), and sequencing the amplification products of (a) and (b);
  • step (iii) comprises selectively reverse transcribing the IgM heavy chain mRNA of the sample of step (i) and the IgG heavy chain mRNA of the sample of step (ii), amplifying the paratope region or fragment thereof of the cDNA transcribed from the samples of steps (i) and (ii) wherein said paratope regions comprise at least the CDR3 region or part thereof and sequencing said amplification product;
  • said paratope which is amplified includes, but is not limited to:
  • V gene segment region such as a region predisposed to undergoing hypermutation
  • J gene segment region encoding a portion of the CDR3 are amplified and sequenced.
  • The present invention is predicated, in part, on the development of a means to identify immunoglobulin (Ig) paratopes which are generated by the immune system in response to a pathogenic insult.
  • Candidate paratopes are identified which are more likely to have been generated to highly immunogenic antigens, thereby enabling effective and rapid vaccine design and immunophenotypic diagnostic assay development.
  • one aspect of the present invention is directed to a method of identifying candidate immunoglobulin paratopes or part thereof directed to a pathogen of interest, said method comprising:
  • step (iii) sequencing the nucleic acid region encoding the paratope or part thereof of the IgM and/or IgD heavy chain mRNA of the sample of step (i) and the IgH isotype switched nucleic acid of the one or more samples of step (ii) and wherein said paratope or part thereof comprises at least the CDR3 region or part thereof;
  • step (iv) analysing the nucleic acid sequencing results of step (iii) to identify paratope encoding nucleic acid sequences which are expressed by both the sequencing sample result of the step (i) sample and the sequencing sample result of the step (ii) sample and which exhibit at least 90% identity across the length of the exon sequences or at least 90% identity across the length of the exon sequence encoding the CDR3 region;
  • step (v) optionally determining the amino acid sequences encoded by the sequences identified in step (iv);
  • step (vi) screening IgH paratope nucleic acid sequence data from one or more other subjects infected with said pathogen to identify the presence of IgH paratope nucleic acid sequences which correlate to any of the paratope sequences identified in step (iv) and/or which encode the amino acid sequences or homologs thereof of step (v) in at least one of said other subjects; wherein IgH paratope nucleic acid sequences identified in step (vi) are indicative of a candidate IgH paratope directed to said pathogen.
  • the infection of step (i) is a primary infection.
  • said IgH isotype switched nucleic acid is IgG, IgA and/or IgE heavy chain nucleic acid.
  • Reference to “immunoglobulin” should be understood as a reference to all forms of any member of the immunoglobulin family of molecules and to derivatives thereof, including the isolated heavy and light chains in either monomeric, dimeric or multimeric form.
  • The immunoglobulin family of molecules encompasses five classes (isotypes) of immunoglobulin, being IgM, IgG, IgA, IgD and IgE.
  • the immunoglobulins classed as IgG and IgA are further divided into the subclasses IgG1, IgG2, IgG3, IgG4, IgA1 and IgA2.
  • the immunoglobulin molecule in its native form is a tetramer and consists of two identical pairs of immunoglobulin chains, each pair having one light and one heavy chain. In each pair, the light and heavy chain variable regions are together responsible for binding to an antigen, and the constant regions are responsible for the antibody effector functions.
  • Each individual antibody molecule exhibits unique complementarity determining region (CDR) domains in the variable regions of the immunoglobulin heavy and light chains, thereby attributing that immunoglobulin molecule with a highly specific epitope reactivity. It is currently believed that the human immune system generates in excess of 10^8 unique immunoglobulin epitope specificities.
  • immunoglobulin or derivative thereof includes, for example, all protein forms of these molecules or their derivatives, mutants or variants including, for example, any isoforms which may arise from alternative splicing of the DNA or of the encoding mRNA. Also included in this definition are mutants, variants (such as polymorphic variants) or derivatives of these molecules. “Derivatives” include, for example, portions of the immunoglobulin molecule such as the Fab fragment, F(ab')2 fragment, Fc fragment or the individual heavy and light chains or fragments thereof (such as the variable or constant region domains).
  • the immunoglobulin variable region encoding genomic DNA which may be rearranged includes the variable regions associated with the heavy chain or the κ or λ light chains.
  • reference to “IgH” should be understood as a reference to the immunoglobulin heavy chain, in either monomeric, dimeric or multimeric form, irrespective of its isotype. Accordingly, “IgH” is used to refer to a heavy chain of any isotype/class.
  • the variable region of each heavy and light chain is composed of three complementarity determining region (CDR)/hypervariable loops which are separated by four flanking framework region antiparallel β sheets which function to support the binding of the antibody to the antigen.
  • the antigen binding site which is formed comprises six CDRs which form a cleft or region to which a specific epitope will bind.
  • By “framework region” is meant a region of an immunoglobulin light or heavy chain variable region which is interrupted by three hypervariable regions, also called CDRs.
  • the extent of the framework region and CDRs has been precisely defined (see, for example, Kabat et al, “Sequences of Proteins of Immunological Interest”, U.S. Department of Health and Human Services, 1983).
  • the sequences of the framework regions of different light or heavy chains are relatively conserved within a species.
  • the framework region of an antibody that is the combined framework regions of the constituent light and heavy chains, serves to position and align the CDRs.
  • the CDRs are primarily responsible for binding to an epitope.
  • references to an immunoglobulin “paratope” should therefore be understood as a reference to the six CDR antigen binding site of a dimerised heavy or light immunoglobulin chain or the three CDR antigen binding sites of a heavy and light immunoglobulin chain monomer.
  • the paratope is generally understood as a reference to the CDR subregions of the variable region and, as further discussed hereafter, the information which is obtained in terms of identifying a candidate paratope will usually also include some or all of the framework region sequence information, particularly since the framework region defines the overall three dimensional folding structure of the variable region and therefore the positioning and orientation of the CDRs relative to each other. This defines the nature of the epitope to which any given set of CDRs will bind. Accordingly, reference to “paratope” in the context of the present invention should be understood to extend to include the framework regions.
  • the present invention is also directed to identifying a “part” of a paratope.
  • By “part” is meant a subregion of the paratope.
  • For example, one may seek to identify and characterise only some of the CDR regions, such as only the CDR3 region.
  • V(D)J recombination in organisms with an adaptive immune system is an example of a type of site-specific genetic recombination that helps immune cells rapidly diversify to recognise and adapt to new pathogens.
  • Each lymphoid cell undergoes somatic recombination of its germ line variable region gene segments (either V and J, D and J or V, D and J segments), depending on the particular gene segments rearranged, in order to generate a total antigen diversity of approximately 10^16 distinct variable region structures.
  • During rearrangement of the VJ, DJ or VDJ segments of any given immunoglobulin or TCR gene, nucleotides are randomly removed and/or inserted at the junction between the segments. This leads to the generation of enormous diversity.
  • the loci for these gene segments are widely separated in the germline but recombination during lymphoid development results in apposition of a V, (D) and J gene, with the junctions between these genes being characterised by small regions of insertion and deletion of nucleotides. This process occurs randomly so that each normal lymphocyte comes to bear a unique V(D)J rearrangement.
  • V, D and J gene segments are clustered into families. For example, there are 52 different functional V gene segments for the κ immunoglobulin light chain and 5 J gene segments. For the immunoglobulin heavy chain, there are 55 functional V gene segments, 23 functional D gene segments and 6 J gene segments. Across the totality of the immunoglobulin and T cell receptor V, D and J gene segment families, there are a large number of individual gene segments, thereby enabling enormous diversity in terms of the unique combination of V(D)J rearrangements which can be effected.
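  • As a worked illustration of the combinatorial component alone, the segment counts quoted above can be multiplied out. This is a sketch only: it ignores junctional (N and P nucleotide) diversity and the λ light chain, and the variable and function names are assumptions introduced for the example.

```python
# Functional gene segment counts as quoted in the text above
HEAVY = {"V": 55, "D": 23, "J": 6}
KAPPA = {"V": 52, "J": 5}


def count_combinations(segments):
    """Multiply the number of available segments of each type."""
    total = 1
    for n in segments.values():
        total *= n
    return total


heavy_combos = count_combinations(HEAVY)   # 55 * 23 * 6
kappa_combos = count_combinations(KAPPA)   # 52 * 5
paired_combos = heavy_combos * kappa_combos
print(heavy_combos, kappa_combos, paired_combos)
```

  • On these figures there are 7,590 heavy chain and 260 κ chain segment combinations, or 1,973,400 heavy/κ pairings; the far larger overall diversity arises from junctional insertion and deletion rather than from segment choice.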
  • the terminology “gene segment” is not exclusively a reference to a segment of a gene. Rather, in the context of immunoglobulin gene rearrangement, it is a reference to a gene in its own right with these gene segments being clustered into families.
  • a “rearranged” immunoglobulin variable region gene should be understood herein as a gene in which two or more of one V segment, one J segment and one D segment (if a D segment is incorporated into the particular rearranged variable gene in issue) have been spliced together to form a single rearranged “gene”.
  • this rearranged “gene” is actually a stretch of genomic DNA comprising one V gene segment, one J gene segment and one D gene segment which have been spliced together. It is therefore sometimes also referred to as a “gene region” since it is actually made up of 2 or 3 distinct V, D or J genes (herein referred to as gene segments) which have been spliced together.
  • the individual “gene segments” of the rearranged immunoglobulin gene are therefore defined as the individual V, D and J genes. These genes are discussed in detail on the IMGT database.
  • the term “gene” will be used herein to refer to the rearranged immunoglobulin variable gene.
  • the term “gene segment” will be used herein to refer to the V, D and J segments. However, it should be noted that there is significant inconsistency in the use of “gene”/“gene segment” language in terms of immunoglobulin and T cell receptor rearrangement.
  • the IMGT refers to individual V, D and J “genes”, while some scientific publications refer to these as “gene segments”.
  • Some sources refer to the rearranged variable immunoglobulin or T cell receptor as a “gene region” while others refer to it as a “gene”.
  • the nomenclature which is used in this specification is as defined earlier.
  • N regions are also unique and are themselves sometimes therefore useful targets in the context of target sequence analysis. Accordingly, it is generally understood that the V(D)J rearrangement provides combinatorial diversity while the addition of N nucleotides or palindromic (P) nucleotides provides junctional diversity.
  • the secondary structure of the protein molecule which is translated does itself comprise unique features which are themselves often the subject of analysis, albeit in terms of the DNA sequence regions within the V(D)J rearrangement which encode these secondary structure features.
  • the translated variable region of IgH (the immunoglobulin heavy chain) or the TCR β or δ chains takes the form of three looped hypervariable regions which are usually referred to as the complementarity determining regions (CDR) 1, 2 and 3. These CDR regions are flanked by four framework regions (FR) 1, 2, 3 and 4.
  • the V gene segment is understood to encode the CDR1, CDR2, leader sequence, FR1, FR2 and FR3.
  • the CDR3 region is encoded by part of the V gene segment, all of the D gene segment and part of the J gene segment.
  • the remainder of the J gene segment generally encodes FR4.
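  • The relationship between gene segments and the encoded regions described above can be captured in a small lookup table. This is an illustrative data structure only; the names are assumptions introduced for the sketch.

```python
# Which gene segment(s) encode each region, as described in the text above
REGION_TO_SEGMENTS = {
    "leader": ("V",),
    "FR1": ("V",),
    "CDR1": ("V",),
    "FR2": ("V",),
    "CDR2": ("V",),
    "FR3": ("V",),
    "CDR3": ("V", "D", "J"),  # part of V, all of D, part of J
    "FR4": ("J",),            # remainder of J
}


def regions_encoded_by(segment):
    """List the regions to which a given gene segment contributes."""
    return [r for r, segs in REGION_TO_SEGMENTS.items() if segment in segs]


print(regions_encoded_by("D"))
print(regions_encoded_by("J"))
```

  • The table makes explicit why the CDR3 region is the most informative target: it is the only region spanning the V-D and D-J junctions.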
  • the method of the present invention is directed to identifying candidate paratopes directed to a pathogen of interest.
  • By “candidate” is meant a paratope that may be, but is not necessarily, directed to the pathogen of interest.
  • the present method maximizes the possibility of identifying paratopes specifically directed to the antigen of interest. This method minimizes the possibility of detecting pre-existing cross-reactive paratopes which were generated in response to a different but related pathogen at an earlier point in time or paratopes directed to another antigen which is simultaneously but opportunistically infecting the initial subject who is tested.
  • the present invention provides a highly refined and selective list of candidate paratopes for further characterization and study.
  • pathogen should be understood as a reference to any agent which causes disease or the presence of which is otherwise unwanted in a mammal.
  • said pathogen may be a microorganism (such as a bacterium, virus, fungus, yeast, parasite or mycoplasma), an antigen derived from a plant (such as pollen or the immunogenic components of nuts) or an antigen shed from or produced by an organism (such as dust mite feces).
  • said pathogen may be a non-living agent, such as a synthetically generated or naturally occurring toxin, synthetically generated or naturally occurring environmental antigen (such as dust or latex), an antigenic component of a pathogen (for example, an immunogen or epitope derived from a pathogen, which may be additionally formulated with other agents to facilitate its administration to a subject, such as in the context of a vaccine), a hapten derived from a pathogen and which has been formulated such that immunogenicity is enabled upon administration to a subject, a prion or any other proteinaceous or non-proteinaceous molecule, the clearance or neutralisation of which via the induction of an immune response, such as an antibody response, may be desirable.
  • the subject pathogen may or may not result in the onset of a disease condition.
  • many pathogens do induce diseases.
  • some pathogens can colonise an animal and exist in a symbiotic relationship without the onset of a disease condition.
  • Such pathogens due to their foreign nature, may nevertheless result in the onset of an acute or chronic immune response, the analysis of which response in accordance with the methods defined herein may be nevertheless desirable.
  • Other pathogens may be innocuous but nevertheless induce undesirable immune responses such as allergies or anaphylactic responses.
  • Reference to “pathogen” should also be understood to encompass pathogens which have either naturally or non-naturally undergone some form of mutation, genetic manipulation or any other form of manipulation.
  • pathogens include, but are not limited to, bacteria, viruses, parasites, allergens and vaccines which are formulated with immunogens derived from pathogens.
  • the subject pathogen is a microorganism, for example a virus such as SARS-CoV-2 (the causative agent of COVID-19).
  • said pathogen is a vaccine, in particular a vaccine directed to a microorganism or an allergen.
  • the subject tested in accordance with the present method may be any human or non-human animal including humans, primates, livestock animals (e.g. sheep, pigs, cows, horses, donkeys), laboratory test animals (e.g. mice, rats, rabbits, guinea pigs), companion animals (e.g. dogs, cats), captive wild animals (e.g. foxes, kangaroos, deer), aves (e.g. chicken, geese, ducks, emus, ostriches), reptiles or fish.
  • any biological sample which may contain lymphocytes which are elicited by a primary immune response may be used as the starting cellular population for analysis of B cell nucleic acid.
  • Reference to “biological sample” should be understood as a reference to any biological material derived from a subject. Such samples include, but are not limited to, blood, bone marrow, urine, lymph, cerebrospinal fluid, ascites, saliva, swabs (e.g. throat or nasal swabs), biopsy or tissue or aspirate specimens (e.g. lymph node samples), pleural fluid and fluid which has been introduced into the body of an animal and subsequently removed such as, for example, the saline solution extracted from the lung following lung lavage or the solution retrieved from an enema wash.
  • the biological sample which is tested according to the method of the present invention may be tested directly or may require some form of pre-treatment prior to testing.
  • the sample may require the addition of a reagent, such as a buffer, to mobilise the sample.
  • the sample which is the subject of testing may be freshly isolated or it may have been isolated at an earlier point in time and subsequently stored or otherwise treated prior to testing.
  • the sample may have been collected at an earlier point in time and frozen or otherwise preserved in order to facilitate its transportation to the site of testing.
  • the sample may be treated to neutralise any possible pathogenic infection, thereby reducing the risk of transmission of the infection to the technician.
  • said biological sample is a peripheral blood sample.
  • a method of identifying candidate immunoglobulin paratopes or part thereof directed to a pathogen of interest comprising:
  • step (iii) sequencing the nucleic acid region encoding the paratope or part thereof of the IgM and/or IgD heavy chain mRNA of the sample of step (i) and the IgH isotype switched nucleic acid of the one or more samples of step (ii) and wherein said paratope or part thereof comprises at least the CDR3 region or part thereof;
  • step (iv) analysing the nucleic acid sequencing results of step (iii) to identify paratope encoding nucleic acid sequences which are expressed by both the sequencing sample result of the step (i) sample and the sequencing sample result of the step (ii) sample and which exhibit at least 90% identity across the length of the exon sequences or at least 90% identity across the length of the exon sequence encoding the CDR3 region;
  • step (v) optionally determining the amino acid sequences encoded by the sequences identified in step (iv);
  • step (vi) screening IgH paratope nucleic acid sequence data from one or more other subjects infected with said pathogen to identify the presence of IgH paratope nucleic acid sequences which correlate to any of the paratope sequences identified in step (iv) and/or which encode the amino acid sequences or homologs thereof of step (v) in at least one of said other subjects; wherein IgH paratope nucleic acid sequences identified in step (vi) are indicative of a candidate IgH paratope directed to said pathogen.
  • said IgH isotype switched nucleic acid is IgG, IgA and/or IgE heavy chain nucleic acid.
  • the subject pathogen is a microorganism.
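  • The optional step (v) amounts to translating each identified nucleotide sequence into its amino acid sequence. A minimal sketch using the standard genetic code is given below; it assumes the read is already in the correct reading frame, and the function name and example read are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(_BASES, repeat=3), _AMINO)
}


def translate(nucleotides: str) -> str:
    """Translate an in-frame DNA or mRNA sequence; unknown codons become 'X'."""
    seq = nucleotides.upper().replace("U", "T")
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, usable, 3)
    )


# Hypothetical CDR3-encoding read (illustrative only)
peptide = translate("TGTGCGAGAGATTACTACGGTAGTGGTTAC")
print(peptide)
```

  • Because the genetic code is degenerate, nucleotide sequences that differ slightly (for example TAC versus TAT in the final codon) may encode the same amino acid sequence, which is one reason for comparing at the amino acid level as well.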
  • any suitable method may be used which would be well known to those of skill in the art. These include, but are not limited to, enrichment of lymphocytes or mononuclear cells from a whole blood sample using techniques such as density gradient based methods (e.g. Ficoll-Paque separation), elutriation methods (e.g. counterflow centrifugation elutriation) or cell surface marker based separation such as FACS, magnetic bead separation or any other method which facilitates cell surface antigen based cellular subpopulation enrichment.
  • cell surface antigen based enrichment is particularly preferred (e.g. magnetic bead separation or flow cytometric sorting).
  • In terms of Ficoll-Paque methods, there are several different methods which can be used including two-stage gradient separation based on the isolation of peripheral blood mononuclear cells followed by lymphocyte enrichment, isolation by cell preparation tubes or isolation by Streck or SepMate tubes. It should be understood that the subject cells are an enriched population of cells and are not necessarily pure.
  • the degree of enrichment or purity which is achievable will depend on the technique which is selected for use to enrich for the target cell population, whether that be lymphocytes, mononuclear cells or B cells, in the blood sample.
  • the blood sample or the enriched or purified cellular subpopulation which is used according to the method of the present invention may be tested directly or may require some form of treatment prior to testing. For example, it may require the addition of a reagent, such as a buffer, to mobilise the sample. Alternatively, it may require some other form of pre-treatment such as heparinisation, in order to prevent clotting, or the inactivation of live virus.
  • peripheral blood lymphocytes and peripheral blood mononuclear cells not only comprise heterogeneous subpopulations of cells but may further comprise contaminating cells.
  • Although peripheral blood mononuclear cell populations enriched by Ficoll-Paque density gradient centrifugation are highly enriched, it is not uncommon for a small proportion of contaminating granulocytes to nevertheless be present.
  • an enriched population of lymphocytes may nevertheless comprise contaminating NK cells.
  • the peripheral blood B cell nucleic acid is derived from a sample of peripheral blood mononuclear cells.
  • the peripheral blood B cell nucleic acid is derived from a sample of peripheral blood lymphocytes.
  • the B cell derived nucleic acid is a subpopulation of the total nucleic acid extracted from the starting cellular sample.
  • nucleic acid or “nucleotide” should be understood as a reference to both deoxyribonucleic acid or nucleotides and ribonucleic acid or nucleotides or purine or pyrimidine bases.
  • it should be understood to encompass phosphate esters of ribonucleotides and/or deoxyribonucleotides, including all forms of DNA (e.g. cDNA and genomic DNA) and RNA.
  • the subject RNA should be understood to encompass all forms of RNA including, but not limited to, primary RNA transcripts, messenger RNA, transfer RNA, microRNA and ribosomal RNA.
  • B cell derived nucleic acid is a reference to the skilled person either obtaining a sample and extracting the nucleic acid themselves or obtaining a sample of pre-extracted nucleic acid.
  • the subject nucleic acid may be highly enriched for B cell nucleic acid or the B cell nucleic acid may form part of a heterogeneous nucleic acid population, depending on the heterogeneity of the cellular population from which the nucleic acid was extracted.
  • the nucleic acid is RNA
  • the subject RNA may be total RNA or it may be a subpopulation of RNA - such as primary RNA and mRNA (i.e. excluding tRNA, rRNA and miRNA) or it may be mRNA alone.
  • paratope expression is encoded by primary RNA and mRNA.
  • RNA may be in any suitable form and may have been pretreated prior to receipt and testing, such as inactivation of live virus or treatment to reduce or inactivate the activity of enzymes, such as RNAses, which degrade RNA. It should also be understood that the RNA may have been freshly isolated or it may have been stored (for example by freezing) prior to testing or otherwise treated prior to testing. Still further, although the RNA which is analysed in accordance with the present method is enriched, there may nevertheless be nucleic or non-nucleic acid contaminants present in the sample. In one embodiment, said RNA is mRNA.
  • a method of identifying candidate immunoglobulin paratopes or part thereof directed to a pathogen of interest comprising:
  • step (iii) sequencing the nucleic acid region encoding the paratope or part thereof of the IgM and/or IgD heavy chain mRNA of the sample of step (i) and the IgH isotype switched nucleic acid of the one or more samples of step (ii) and wherein said paratope or part thereof comprises at least the CDR3 region or part thereof;
  • step (iv) analysing the nucleic acid sequencing results of step (iii) to identify paratope encoding nucleic acid sequences which are expressed by both the sequencing sample result of the step (i) sample and the sequencing sample result of the step (ii) sample and which exhibit at least 90% identity across the length of the exon sequences or at least 90% identity across the length of the exon sequences encoding the CDR3 region;
  • step (v) optionally determining the amino acid sequences encoded by the sequences identified in step (iv);
  • step (vi) screening IgH paratope nucleic acid sequence data from one or more other subjects infected with said pathogen to identify the presence of IgH paratope nucleic acid sequences which correlate to any of the paratope sequences identified in step (iv) and/or which encode the amino acid sequences or homologs thereof of step (v) in at least one of said other subjects; wherein IgH paratope nucleic acid sequences identified in step (vi) are indicative of a candidate IgH paratope directed to said pathogen.
  • the peripheral blood B cell nucleic acid is derived from a sample of peripheral blood mononuclear cells.
  • the peripheral blood B cell nucleic acid is derived from a sample of peripheral blood lymphocytes.
  • said IgH isotype switched nucleic acid is IgG, IgA and/or IgE heavy chain nucleic acid.
  • the subject pathogen is a microorganism.
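  • The screening of step (vi) can be sketched as a lookup of each candidate against the repertoires of other infected subjects. The sketch below is a simplification which treats "correlate" as an exact amino acid match; the subject labels, sequences and function name are hypothetical.

```python
def screen_other_subjects(candidates, repertoires, min_subjects=1):
    """Map each candidate to the other subjects whose repertoire contains it.

    Only candidates found in at least `min_subjects` repertoires are kept.
    """
    supported = {}
    for cand in candidates:
        carriers = sorted(s for s, seqs in repertoires.items() if cand in seqs)
        if len(carriers) >= min_subjects:
            supported[cand] = carriers
    return supported


# Hypothetical data (illustrative only)
candidates = ["CARDYYGSGY", "CAKGGFDYW"]
repertoires = {
    "subject_B": {"CARDYYGSGY", "CTTDRPFDY"},
    "subject_C": {"CARDYYGSGY"},
    "subject_D": {"CVRHGMDVW"},
}
result = screen_other_subjects(candidates, repertoires)
print(result)
```

  • Here only the first candidate is retained, since it appears in two other subjects; a homolog-tolerant comparison could be substituted for the exact match where appropriate.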
  • the identification of candidate immunoglobulin paratopes in accordance with the present invention is predicated on the comparative analysis of the immunoglobulin paratope expression of multiple peripheral B cell samples which are drawn sequentially from a subject experiencing a primary infection by said pathogen.
  • Reference to “infection” should be understood as a reference to the exposure of the subject, by any means, to the pathogen of interest.
  • said exposure may occur by either natural or non-natural means including, but not limited to, environmental exposure (such as infection due to contagious spread of pathogens between subjects) or active challenge, such as deliberately administering a pathogen to a subject, as might occur where a subject is vaccinated with an antigen derived from a pathogen.
  • By “primary infection” is meant the initial exposure of the subject to the pathogen of interest.
  • A primary infection is typically characterised by an IgM or IgD immune response to the initial exposure, both of these isotypes being associated with the first wave of immunity in a primary infection, which will usually progress through an isotype switch to an IgG, IgA or IgE response, together with ongoing affinity maturation of the paratope itself.
  • IgH isotype switched nucleic acid should be understood as a reference to an IgH which has undergone isotype switching from the early IgM/IgD phase to IgG, IgA and/or IgE.
  • the RNA sample of step (i) is isolated during the IgM/IgD phase of the immune response and the one or more sequential samples of step (ii) are isolated subsequently to immunoglobulin isotype switching to IgG/IgA/IgE. This is most easily achieved by isolating the sample of step (i) in the early phase of infection.
  • the method of the present invention is designed to selectively amplify only those specific immunoglobulin isotypes which are expected to be expressed in a primary immune response which is directed to the pathogen of interest, thereby ensuring that to the extent that the earliest stage of infection may have actually been missed, there would simply be no amplification product produced and the sample analysis would not proceed.
  • said pathogen is an antigen derived from a microorganism, environmental agent or allergen. More particularly, said pathogen is a vaccine.
  • said infection occurs by the administration of the pathogen to the subject. More particularly, said administration is vaccination or other mode of antigen challenge.
  • the present method is designed to selectively screen for IgM and/or IgD in the sample of step (i) and then IgG, IgA and/or IgE in the subsequent step (ii) sequential samples.
  • Where isotype switching has not yet occurred, the samples of step (ii) will simply show no amplification product. Accordingly, one may elect to simply analyse two samples in total, that is the step (i) sample isolated in the early stage of infection and a second (step (ii)) sample which is isolated after immunoglobulin isotype switching has occurred, or else one may elect to isolate several sequential samples in step (ii), for example to investigate immunoglobulin affinity maturation events or simply to ensure that, where the timing of likely isotype switching is difficult to determine, sufficient sequential samples are isolated to ensure that a post-isotype switch sample is obtained.
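  • The isotype-selective logic described above can be sketched as a simple gate: a sample proceeds to analysis only if at least one of the isotypes expected for its step yields an amplification product. The step labels and function names below are assumptions made for illustration.

```python
# Isotypes screened for at each step, following the description above
EXPECTED_ISOTYPES = {
    "step_i": {"IgM", "IgD"},          # early, pre-switch sample
    "step_ii": {"IgG", "IgA", "IgE"},  # later, isotype switched samples
}


def amplified_isotypes(step, detected):
    """Isotypes that are both targeted at this step and detected."""
    return EXPECTED_ISOTYPES[step] & set(detected)


def can_proceed(step, detected):
    """True if at least one targeted isotype gave an amplification product."""
    return bool(amplified_isotypes(step, detected))


print(can_proceed("step_i", ["IgG"]))          # early phase missed
print(can_proceed("step_ii", ["IgG", "IgM"]))  # switched isotype present
```

  • A step (i) sample in which only a switched isotype is detectable therefore does not proceed, mirroring the absence of an amplification product.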
  • the samples which are isolated in accordance with the present method are isolated at any suitable interval, such as not less than 1 week apart or not less than 2 weeks apart.
  • the sample of step (i) and the first sample of step (ii) are drawn from the patient not less than two weeks apart and any further sequential samples isolated after the initial sample of step (ii) are also not less than two weeks apart.
  • the intervals at which the samples are isolated need not be the same from one sample to the next.
  • For example, the first and second samples may have been isolated two weeks apart while the second and third samples may be isolated at a different interval, such as 1, 2, 3 or 4 weeks apart. Any further sequential samples may be isolated at yet another interval.
  • the subject samples are isolated 1-16 weeks apart, 2-14 weeks apart, 2-13 weeks apart, 2-10 weeks apart, 2-8 weeks apart, 2-6 weeks apart or 2-4 weeks apart. More preferably, said samples are isolated 2, 3, 4, 5, 6, 7 or 8 weeks apart.
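  • A sampling schedule can be checked against a chosen minimum interval as sketched below. The two week minimum reflects one of the embodiments above; the dates and function names are hypothetical.

```python
from datetime import date


def intervals_in_weeks(draw_dates):
    """Weeks elapsed between each consecutive pair of sample draw dates."""
    return [(b - a).days / 7 for a, b in zip(draw_dates, draw_dates[1:])]


def meets_minimum_interval(draw_dates, min_weeks=2):
    """True if every consecutive pair of draws is at least min_weeks apart."""
    return all(w >= min_weeks for w in intervals_in_weeks(draw_dates))


# Hypothetical schedule: step (i) sample, then two step (ii) samples
draws = [date(2021, 1, 4), date(2021, 1, 18), date(2021, 2, 15)]
print(intervals_in_weeks(draws))
print(meets_minimum_interval(draws))
```

  • The intervals need not be equal; in this example the samples are two and then four weeks apart and both satisfy the two week minimum.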
  • the present invention is predicated on comparing immunoglobulin paratope sequences both before and after immunoglobulin isotype switching in the context of a subject experiencing a primary infection and, also, as between the infected subject and the wider infected population.
  • the present method is also applicable to identifying and characterising multiple immunoglobulin paratope specificities which are simultaneously generated by the immune system and directed to the pathogen of interest, provided that isotype switching has occurred.
  • IgM and IgD are commonly found during the early stage of a primary immune response, this transitioning via isotype switching to IgG, IgA or IgE.
  • IgM is generally known to be a first response antibody. It is expressed on the surface of B cells and in a secreted form with very high avidity. It functions to eliminate pathogens in the early stages of B cell mediated immunity. IgD is similarly associated with the earliest stages of a primary immune response and is generally found in a membrane bound form. IgG is the major immunoglobulin found in serum and is secreted at a later stage in a primary immune response, generally after isotype switching from IgM. It provides the majority of antibody based immunity against invading pathogens. The IgG3 subtype is known to cross the placenta.
  • IgA and IgE are also associated with the later stages of a primary immune response and, like IgG, also arise after class switching.
  • IgA is predominantly localised to mucosal tissues, such as the gut, respiratory and urogenital tract, and prevents their colonization by pathogens. It is resistant to digestion and is secreted in milk.
  • IgE binds to allergens, triggers histamine release from mast cells and is involved in allergy. It is also known to be protective against parasitic worms.
  • the present method may be designed to screen for multiple immunoglobulin isotypes in the context of any one sample, such as screening for both IgM and IgD in the context of sample (i).
  • Where the pathogen of interest is a virus or bacterium, said immunoglobulin analysis is directed to an IgM to IgG isotype switch.
  • a method of identifying candidate immunoglobulin paratopes or part thereof directed to a pathogen of interest comprising:
  • step (iii) sequencing the nucleic acid region encoding the paratope or part thereof of the IgM heavy chain mRNA of the sample of step (i) and the IgG heavy chain mRNA of the one or more samples of step (ii) and wherein said paratope or part thereof comprises at least the CDR3 region or part thereof;
  • step (iv) analysing the nucleic acid sequencing results of step (iii) to identify paratope encoding nucleic acid sequences which are expressed by both the sequencing sample result of the step (i) sample and the sequencing sample result of the step (ii) sample and which exhibit at least 90% identity across the length of the exon sequences or at least 90% identity across the length of the exon sequences encoding the CDR3 region;
  • step (v) optionally determining the amino acid sequences encoded by the sequences identified in step (iv);
  • step (vi) screening IgH paratope nucleic acid sequence data from one or more other subjects infected with said pathogen to identify the presence of IgH paratope nucleic acid sequences which correlate to any of the paratope sequences identified in step (iv) and/or which encode the amino acid sequences or homologs thereof of step (v) in at least one of said other subjects; wherein IgH paratope nucleic acid sequences identified in step (vi) are indicative of a candidate IgH paratope directed to said pathogen.
  • said pathogen is a microorganism.
  • said RNA is mRNA.
  • the peripheral blood B cell RNA is derived from a sample of peripheral blood mononuclear cells.
  • the peripheral blood B cell RNA is derived from a sample of peripheral blood lymphocytes.
  • sequential samples are isolated 2, 3, 4, 5, 6, 7 or 8 weeks apart, for example 4-8 weeks apart.
  • samples of the present invention may be analysed at the time that they are received or they may be stored until all of the sequentially isolated samples are received and thereafter analysed simultaneously.
  • the present method is predicated upon identifying immunoglobulin paratopes which have undergone isotype switching in a subject experiencing a primary infection to a pathogen of interest.
  • This objective is facilitated by testing sequentially drawn samples commencing with an initial sample drawn early in the immune response through to one or more samples which are drawn at later stages of the immune response.
  • immunoglobulin isotype switching is a well understood mechanism which can occur anywhere from several days to up to several weeks after initial infection and production of IgM and/or IgD.
  • a clonal analysis is performed to determine whether a given paratope sequence has derived from a clonally expanded population of cells. This enables differentiation between the presence of a DNA immunoglobulin rearrangement which corresponds to a particular paratope but is simply representative of the immune repertoire versus a DNA immunoglobulin rearrangement which has been the subject of clonal expansion.
  • a method of identifying candidate immunoglobulin paratopes or part thereof directed to a pathogen of interest comprising:
  • step (iii) sequencing the nucleic acid region encoding the paratope or part thereof of the IgM heavy chain mRNA of the sample of step (i) and sequencing the nucleic acid region encoding the paratope or part thereof of the IgH isotype switched rearranged DNA of the one or more samples of step (ii) and wherein said paratope or part thereof comprises at least the CDR3 region or part thereof;
  • step (iv) analysing the nucleic acid sequencing results of step (iii) to identify paratope encoding nucleic acid sequences which are expressed by both the sequencing sample result of the step (i) sample and the sequencing sample result of the step (ii) sample and which exhibit at least 90% identity across the length of the exon sequences or at least 90% identity across the length of the exon sequences encoding the CDR3 region;
  • step (v) optionally determining the amino acid sequences encoded by the sequences identified in step (iv);
  • step (vi) screening IgH paratope nucleic acid sequence data from one or more other subjects infected with said pathogen to identify the presence of IgH paratope nucleic acid sequences which correlate to any of the paratope sequences identified in step (iv) and/or which encode the amino acid sequences or homologs thereof of step (v) in at least one of said other subjects; wherein IgH paratope nucleic acid sequences identified in step (vi) are indicative of a candidate IgH paratope directed to said pathogen.
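The optional amino acid determination of step (v) amounts to translating each identified nucleotide sequence. The following is a minimal sketch, assuming the standard genetic code and assuming that the reading frame begins at the first nucleotide supplied; the function and variable names are illustrative and the example sequence is hypothetical.

```python
BASES = "TCAG"
# Standard genetic code, listed in TCAG order for each codon position
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(nt_seq):
    """Translate a nucleotide sequence in frame 1 (an assumption).

    RNA input is accepted by mapping U to T. Trailing nucleotides that do
    not form a full codon are ignored; unrecognised codons become 'X'.
    """
    nt_seq = nt_seq.upper().replace("U", "T")
    usable = len(nt_seq) - len(nt_seq) % 3
    return "".join(
        CODON_TABLE.get(nt_seq[i:i + 3], "X") for i in range(0, usable, 3)
    )


peptide = translate("TGTGCGAGAGAT")  # hypothetical CDR3-like fragment
```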
  • said isotype switch IgH is IgG.
  • said isotype switch is detected by clonal expansion analysis.
  • said isotype switch is detected by amplification and sequencing of IgG rearranged genomic DNA.
  • step (iii) sequencing the nucleic acid region encoding the paratope or part thereof of the IgM heavy chain mRNA of the sample of step (i) and the IgG heavy chain mRNA of the one or more samples of step (ii) and wherein said paratope or part thereof comprises at least the CDR3 region or part thereof;
  • step (iv) analysing the nucleic acid sequencing results of step (iii) to identify paratope encoding nucleic acid sequences which are expressed by both the sequencing sample result of the step (i) sample and the sequencing sample result of the step (ii) sample and which exhibit at least 90% identity across the length of the exon sequences or at least 90% identity across the length of the exon sequences encoding the CDR3 region;
  • step (v) optionally determining the amino acid sequences encoded by the sequences identified in step (iv);
  • step (vi) screening IgH paratope nucleic acid sequence data from one or more other subjects infected with said pathogen to identify the presence of IgH paratope nucleic acid sequences which correlate to any of the paratope sequences identified in step (iv) and/or which encode the amino acid sequences or homologs thereof of step (v) in at least one of said other subjects; wherein IgH paratope nucleic acid sequences identified in step (vi) are indicative of a candidate IgH paratope directed to said pathogen.
  • said pathogen is a microorganism.
  • the peripheral blood B cell nucleic acid is derived from a sample of peripheral blood mononuclear cells.
  • the peripheral blood B cell nucleic acid is derived from a sample of peripheral blood lymphocytes.
  • sequential samples are isolated 2, 3, 4, 5, 6, 7 or 8 weeks apart, for example 4-8 weeks apart.
  • the nucleic acid sequence data of step (vi) is mRNA sequence data and said IgH is IgG.
  • the nucleic acid sequence data of step (vi) is rearranged IgH genomic DNA sequence data.
  • the DNA and/or RNA samples of the present invention undergo sequencing of the nucleic acid region which encodes the immunoglobulin paratope.
  • the present invention therefore extends to both directly sequencing RNA or directly sequencing cDNA which has been reverse transcribed from the RNA population of interest. It is well within the skill of the person in the art to design methodology directed to screening for either DNA or RNA.
  • the method of the present invention is designed to more efficiently and accurately identify paratopes exhibiting high levels of functionality, such as in the context of highly effective neutralising antibodies, than has previously been possible. This is achieved by performing a comparative analysis of paratope nucleic acid sequences as between primary immune response induced Ig mRNA and isotype switched Ig nucleic acid, from a subject exposed to a pathogen of interest, to identify clonally expanded sequences exhibiting sequence identity across both samples.
  • the correlation of these sequences with clonally expanded rearranged genomic Ig DNA derived from one or more other subjects likely to have also been exposed to the pathogen of interest enables refinement of the candidate paratopes to those which have consistently been identified as having undergone clonal expansion across a genetically diverse population and are therefore likely to represent highly efficacious paratopes.
  • the present method may be conveniently designed to introduce an initial step to enrich for paratope sequences which are likely to be of interest.
  • Paratope sequencing of clonally expanded Ig populations prior to pathogen exposure provides a baseline control reading of Ig paratopes which have been expressed in response to antigens other than the pathogen of interest.
  • the mammalian immune system is simultaneously responding to multiple foreign antigens.
  • the identification and sequencing of Ig paratopes from clonally expanded B cells in the samples drawn subsequently to pathogen exposure would be expected to identify multiple paratopes which have undergone IgM expression followed by Ig class switching.
  • the number of paratopes which are the subject of analysis can be reduced.
  • this enrichment step can only be applied in the context of a pathogen in respect of which its initial exposure to the subject can be controlled or is otherwise predictable. Since most naturally occurring pathogen infections are opportunistic, such as in the context of microorganism infections, this embodiment will be of limited use in terms of these types of pathogens.
  • this enrichment method can be applied by virtue of the fact that the timing of the initial administration of the vaccine can be delayed until an appropriate preimmunization B cell sample is drawn.
  • a method of identifying candidate immunoglobulin paratopes or part thereof directed to a pathogen of interest comprising:
  • step (iv) sequencing the nucleic acid region encoding the paratope or part thereof of the Ig mRNA of step (i) and the IgM and/or IgD heavy chain mRNA of the sample of step (ii) and the IgH isotype switched nucleic acid of the one or more samples of step (iii) and wherein said paratope or part thereof comprises at least the CDR3 region or part thereof;
  • step (v) analysing the nucleic acid sequencing results of step (iv) to identify paratope encoding nucleic acid sequences which are expressed by both the sequencing sample result of the step (ii) sample and the sequencing sample result of the step (iii) sample, but not the sequencing sample result of step (i), and which exhibit at least 90% identity across the length of the exon sequences or at least 90% identity across the length of the exon sequences encoding the CDR3 region;
  • step (vi) optionally determining the amino acid sequences encoded by the sequences identified in step (v);
  • step (vii) screening IgH paratope nucleic acid sequence data from one or more other subjects infected with said pathogen to identify the presence of IgH paratope nucleic acid sequences which correlate to any of the paratope sequences identified in step (v) and/or which encode the amino acid sequences or homologs thereof of step (vi) in at least one of said other subjects; wherein IgH paratope nucleic acid sequences identified in step (vii) are indicative of a candidate IgH paratope directed to said pathogen.
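The enrichment logic of step (v) in this embodiment (sequences shared by the early and isotype switched samples but absent from the pre-exposure baseline) may be sketched as follows. This is illustrative only: it assumes equal-length, pre-aligned sequences, the helper names are hypothetical, and the toy sequences are not real immunoglobulin data.

```python
def identity(a, b):
    """Percent identity for two equal-length, pre-aligned sequences.
    Sequences of unequal length are treated as non-matching (assumption)."""
    if len(a) != len(b) or not a:
        return 0.0
    return 100.0 * sum(x == y for x, y in zip(a, b)) / len(a)


def matches_any(seq, pool, threshold=90.0):
    """True if seq reaches the identity threshold against any pool member."""
    return any(identity(seq, other) >= threshold for other in pool)


def enriched_candidates(baseline, early, switched, threshold=90.0):
    """Sequences of the early (IgM/IgD) sample that are also found in the
    isotype switched sample but not in the pre-exposure baseline sample."""
    return [
        seq for seq in early
        if matches_any(seq, switched, threshold)
        and not matches_any(seq, baseline, threshold)
    ]


# Hypothetical toy data
baseline = ["AAAAAAAAAA"]
early = ["AAAAAAAAAA", "ACGTACGTAC"]
switched = ["AAAAAAAAAT", "ACGTACGTAA"]
result = enriched_candidates(baseline, early, switched)
```

Here the first early sequence is discarded because it already appears in the baseline sample, leaving only the second.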
  • the nucleic acid region which is the subject of sequencing is cDNA.
  • the RNA which is obtained in steps (i) and optionally steps (ii) and (vi) will initially require reverse transcription (RT) to DNA.
  • the cDNA template is preferably subjected to selective amplification, such as PCR amplification, of the paratope region using primers designed to selectively amplify the subject region, or part thereof, in order to generate a library of paratope cDNA molecules which are appropriate for use with the selected sequencing technology.
  • 1-step RT-PCR is a single-tube reaction which either uses a mixture of a reverse transcriptase and a traditional thermostable DNA polymerase or uses a thermostable DNA polymerase, such as that from Thermus thermophilus, which exhibits reverse transcriptase activity in addition to DNA polymerase activity in the presence of Mn2+ ions.
  • 2-step RT-PCR consists of two separate reactions.
  • RNA strand is reverse transcribed into its cDNA using the enzyme reverse transcriptase.
  • a portion of the first reaction, containing the produced cDNA, is then added to a second reaction containing a thermostable DNA polymerase, which through traditional PCR, amplifies the cDNA many fold.
  • the selection of primers to facilitate RNA reverse transcription will depend upon the population of RNA molecules which is sought to be reverse transcribed.
  • reverse transcriptases require a short primer to bind to its complementary sequences on the RNA template and serve as a starting point for synthesis of a new strand.
  • primers of three basic types are generally used: random primers, oligo(dT) primers and gene-specific primers. More particularly: (i) Random primers are oligonucleotides which exhibit random base sequences. They are often six nucleotides long and are usually referred to as random hexamers, N6, or dN6. Due to their random binding (i.e., no template specificity), random primers can potentially anneal to any RNA species in the sample.
  • these primers will amplify total RNA and are often used to ensure reverse transcription of RNAs which do not exhibit poly(A) tails (e.g. rRNA, tRNA, non-coding RNAs, small RNAs, prokaryotic mRNA), degraded RNA (e.g. from FFPE tissue), and RNA with known secondary structures (e.g. viral genomes).
  • random primers will improve cDNA synthesis for detection, they are not an ideal choice for full-length reverse transcription of long RNA. Increasing the concentration of random hexamers in reverse transcription reactions improves cDNA yield but results in shorter cDNA fragments due to increased binding at multiple sites on the same template.
  • Oligo(dT) primers consist of a stretch of 12-18 deoxythymidines that anneal to poly(A) tails of eukaryotic mRNAs, which make up only 1-5% of total RNA. These primers are suitable for constructing cDNA libraries from eukaryotic mRNAs. Because of their specificity for poly(A) tails, oligo(dT) primers are not suitable for degraded RNA, such as from formalin-fixed, paraffin-embedded (FFPE) samples, nor for RNAs that lack poly(A) tails, such as prokaryotic RNAs and microRNAs. Oligo(dT) primers may be modified to improve efficiency of reverse transcription.
  • oligo(dT) primers may be extended to 20 nucleotides or longer to enable their annealing in reverse transcription reactions at higher temperatures.
  • oligo(dT) primers may include degenerate bases like dN (dA, dT, dG, or dC) and dV (either dG, dA, or dC) at the 3' end. This modification prevents poly(A) slippage and locks the priming site immediately upstream of the poly(A) tail. These primers are referred to as anchored oligo(dT).
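To make the anchored oligo(dT) design concrete, the degenerate 3' bases can be expanded into the individual primer sequences they represent. The sketch below is an assumption-laden illustration: it uses the common layout of a run of dT followed by dV then dN at the 3' end, a length of 20 thymidines, and hypothetical function and variable names.

```python
from itertools import product

# IUPAC degenerate codes used by anchored oligo(dT) primers
IUPAC = {"V": "ACG", "N": "ACGT"}


def anchored_oligo_dt(n_t=20, anchor="VN"):
    """Expand an anchored oligo(dT) primer, written 5' to 3', into every
    concrete sequence covered by its degenerate 3' anchor bases."""
    options = [IUPAC[symbol] for symbol in anchor]
    return ["T" * n_t + "".join(combo) for combo in product(*options)]


primers = anchored_oligo_dt()
```

Because dV excludes dT, none of the expanded primers can extend the thymidine run by one more base, which reflects the described effect of locking the priming site at the boundary of the poly(A) tail.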
  • Gene-specific primers offer the most specific priming in reverse transcription. These primers are designed based on known sequences of the target RNA. Since the primers bind to specific RNA sequences, a unique set of gene-specific primers is required for each target RNA. Gene-specific primers are commonly used in 1-step RT-PCR applications.
  • RNA is sought to be reverse transcribed
  • a mixture of random primers and oligo(dT) primers will generally amplify most RNA molecules.
  • the use of oligo(dT) primers alone will generally achieve the selective reverse transcription of mRNA.
  • a primer directed to the 5’ end of a specific constant region isotype, such as the immunoglobulin μ or γ chains, can be used to selectively reverse transcribe IgH RNA of a specific isotype.
  • total RNA or total mRNA is reverse transcribed and thereafter an amplification step is performed to facilitate the generation of an isotype-specific paratope region DNA amplification product, this product then being the subject of sequencing.
  • This amplification step may be achieved in any suitable manner including, for example, selectively amplifying the cDNA of a specific immunoglobulin isotype using primers designed to enable amplification, in a single step, of the paratope region of a specific heavy chain isotype.
  • the amplification product in this embodiment will comprise both the paratope region and part of the 5’ end of the heavy chain constant region.
  • This product may, for example, no longer include any of the IgH constant region sequence which formed part of the first round amplification product.
  • the present invention is directed to a method of identifying candidate immunoglobulin paratopes or part thereof directed to a pathogen of interest, said method comprising:
  • step (iv) analysing the cDNA sequencing results of step (iii) to identify paratope encoding cDNA sequences which are expressed by both the sequencing sample result of the step (i) sample and the sequencing sample result of the step (ii) sample and which exhibit at least 90% identity across the length of the exon sequences or at least 90% identity across the length of the exon sequences encoding the CDR3 region;
  • step (v) optionally determining the amino acid sequences encoded by the sequences identified in step (iv);
  • step (vi) screening IgH paratope DNA sequence data from one or more other subjects infected with said pathogen to identify the presence of IgH paratope nucleic acid sequences which correlate to any of the paratope sequences identified in step (iv) and/or which encode the amino acid sequences or homologs thereof of step (v) in at least one of said other subjects; wherein IgH paratope DNA sequences identified in step (vi) are indicative of a candidate IgH paratope directed to said pathogen.
  • Reference to “non-selectively” reverse transcribing RNA should be understood as a reference to reverse transcribing some, but not necessarily all, RNA in a non-gene specific manner.
  • said non-selective reverse transcription is reverse transcription of total RNA and in another embodiment said reverse transcription is reverse transcription of mRNA.
  • Reference to “selectively” amplifying cDNA should be understood as a reference to amplifying the subject cDNA in a gene or gene segment specific manner. It would be appreciated by the skilled person that such selective amplification is achieved via the design of gene or gene segment specific primers. This selectivity can be at the level of the paratope or part thereof which is sought to be amplified and/or at the level of the IgH isotype which is sought to be amplified.
  • said selective amplification is directed to amplifying a specific IgH isotype together with all or part of the paratope regions of the selected IgH isotype.
  • the paratope amplification is also regarded as selective since the primers which are utilized are specifically directed to the paratope region, despite the fact that a range of different epitopic specificities of paratopes will be amplified.
  • said paratope which is amplified includes, but is not limited to:
  • the subject forward primer will be directed to a specific position within a V, D or J gene segment while the reverse primer will be directed to the 5’ end of the C region of the IgH chain isotype of interest.
  • the V gene segment is understood to encode the CDR1, CDR2, leader sequence, FR1, FR2 and FR3.
  • the CDR3 region is encoded by part of the V gene segment, all of the D gene segment and part of the J gene segment.
  • the remainder of the J gene segment generally encodes FR4.
  • the present invention is directed to a method of identifying candidate immunoglobulin paratopes or part thereof directed to a pathogen of interest, said method comprising:
  • said paratope or part thereof comprises at least the CDR3 region or part thereof and sequencing said amplification product
  • step (iv) analysing the cDNA sequencing results of step (iii) to identify paratope encoding cDNA sequences which are expressed by both the sequencing sample result of the step (i) sample and the sequencing sample result of the step (ii) sample and which exhibit at least 90% identity across the length of the exon sequences or at least 90% identity across the length of the exon sequences encoding the CDR3 region;
  • step (v) optionally determining the amino acid sequences encoded by the sequences identified in step (iv);
  • step (vi) screening IgH paratope DNA sequence data from one or more other subjects infected with said pathogen to identify the presence of IgH paratope nucleic acid sequences which correlate to any of the paratope sequences identified in step (iv) and/or which encode the amino acid sequences or homologs thereof of step (v) in at least one of said other subjects; wherein IgH paratope DNA sequences identified in step (vi) are indicative of a candidate IgH paratope directed to said pathogen.
  • said pathogen is a microorganism.
  • the peripheral blood B cell RNA is derived from a sample of peripheral blood mononuclear cells.
  • the peripheral blood B cell RNA is derived from a sample of peripheral blood lymphocytes.
  • sequential samples are isolated 2, 3, 4, 5, 6, 7 or 8 weeks apart.
  • Reference to “selectively” reverse transcribing RNA in the context of this embodiment of the present invention should be understood as a reference to reverse transcribing in a gene specific manner, specifically, an IgH isotype specific manner. Means for achieving this would be well known to those of skill in the art and include the use of isotype specific primers which are specifically directed to the IgH isotype gene of interest.
  • the primers may be directed to any part of the IgH isotype gene and need not necessarily reverse transcribe the RNA of the entire constant region.
  • the primer is designed to selectively hybridize to and enable reverse transcription of the paratope region of a specific IgH isotype chain, it is irrelevant what portion of the subject constant region is also reverse transcribed.
  • the primers selected for paratope amplification need not be designed to also amplify the constant region, although this is not necessarily expressly excluded. Accordingly, a single set of primers can be used to amplify the paratope or fragment thereof of the transcription product from both steps (i) and (ii). Still further, since there is no imperative to amplify the constant region, one may conveniently use primers and amplification kits which are commercially available, such as the Lymphotrack kits, which amplify a diverse array of paratope regions.
  • said paratope which is amplified includes, but is not limited to: (iii) The DNA encoding the entire paratope region including all of framework regions 1-4 and CDRs 1-3 and optionally the leader sequence.
  • V gene segment region such as a region predisposed to undergoing hypermutation
  • J gene segment region encoding a portion of the CDR3 are amplified and sequenced.
  • the B cell RNA is derived from a sample of peripheral blood mononuclear cells.
  • the B cell RNA is derived from a sample of peripheral blood lymphocytes.
  • sequential samples are isolated 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15 or 16 weeks apart.
  • said pathogen is a virus or bacterium, preferably SARS-CoV-2.
  • said pathogen is an antigen derived from a microorganism, environmental agent or allergen. More particularly, said pathogen is a vaccine.
  • said infection occurs by the administration of the pathogen to the subject. More particularly, said administration is vaccination or other mode of antigen challenge.
  • the paratope region which is sought to be amplified and sequenced minimally includes all or part of the CDR3 region.
  • the CDR3 region plays a particularly significant role, relative to CDR1 and CDR2, in determining the epitope specificity of a given paratope.
  • the paratope region which is selected for analysis includes all or part of the CDR3 region.
  • the paratope region which is selected for analysis includes all or part of the CDR3 region together with the nucleic acid sequence encoding the 5-20 amino acids flanking the boundaries of the CDR3 region at the FR3 and FR4 regions.
  • Methods of performing the amplification and sequencing of step (iii) would be well known to the skilled person.
  • particularly efficacious methods include the use of PCR amplification and the design and use of primers which amplify the paratope region, or part thereof, such that a paratope amplicon library is generated which is suitable for use on a high throughput next generation sequencing (NGS) platform.
  • subsequently to the preparation of a library of amplified DNA templates for analysis these templates are anchored to a solid support via an adaptor sequence. Once attached, cluster generation is commenced.
  • the objective is to create hundreds of identical strands of the template DNA. Some will correspond to the forward strand and others to the complementary reverse strand. Clusters are then generated through bridge amplification. Polymerases move along a strand of DNA, generating its complementary strand. The original strand is washed away, leaving only the reverse strand. At the top of the reverse strand there is another adaptor sequence. The DNA strand bends and attaches to an anchored oligonucleotide that is complementary to this adaptor sequence. Polymerases then attach to the reverse strand, and its complementary strand (which is identical to the original strand) is generated.
  • the now double stranded DNA is denatured so that each strand can separately attach to other unoccupied anchored oligonucleotide sequences which are complementary to the adaptors present at each end of the amplicons.
  • This bridge amplification proceeds to simultaneously generate thousands of clusters corresponding to individual templates across the solid support (often referred to as a “flow cell”). The amplification is therefore clonal within the context of an individual cluster since each cluster is generated from a single starting template DNA.
  • the clonal expansion is evidenced by an increase in the number of sequence reads corresponding to a single specific paratope relative to the otherwise heterogeneous background array of the IgH paratope locus, this heterogeneous background being indicative of the immunoglobulin repertoire of the subject being tested.
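As a rough illustration of how clonal expansion might be read out from sequencing data, each distinct paratope sequence can be counted and expressed as a fraction of all reads, with sequences standing well above the heterogeneous background being flagged. The 5% cut-off, the function name and the placeholder read labels below are assumptions chosen for the example and are not values taken from the specification.

```python
from collections import Counter


def expanded_clonotypes(reads, min_fraction=0.05):
    """Return paratope sequences whose share of all reads is at least
    min_fraction, mapped to that share. The 5% default is an arbitrary,
    illustrative threshold."""
    counts = Counter(reads)
    total = sum(counts.values())
    return {
        seq: n / total
        for seq, n in counts.items()
        if n / total >= min_fraction
    }


# Hypothetical read set: one dominant sequence over a diverse background
reads = ["CLONE_A"] * 30 + [f"BACKGROUND_{i}" for i in range(70)]
result = expanded_clonotypes(reads)
```

In this toy read set the dominant sequence accounts for 30 of 100 reads, whereas each background sequence accounts for a single read and is therefore not flagged.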
  • a clonally expanded paratope is identified from one of the sequentially obtained samples from a subject undergoing an infection with a pathogen of interest and the paratope is determined to have been expressed by IgM in the initial sample drawn from the subject and at least one of the sequentially obtained samples containing the clonally expanded paratope is drawn at a time point after IgM isotype switching would have occurred, this is indicative of the subject paratope corresponding to a paratope which has undergone class switching in accordance with the method of the present invention.
  • the reverse strand is generated via another round of bridge amplification.
  • the forward strand is then washed away and the process of sequence by synthesis repeats for the reverse strand. In this way, bidirectional sequencing is achieved.
  • the amplification and sequencing steps can be designed to facilitate single read sequencing.
  • steps (i) and (ii), as well as the reverse transcription, amplification and sequencing of each of the multiple samples, can be performed in any order or simultaneously, the objective of these steps being to generate the sequencing data readout which is analysed in step (iv).
  • these steps need not even be performed at substantially the same time.
  • these steps may be performed days, or even weeks, apart with the results subsequently analysed relative to one another.
  • the sequence data derived therefrom is analysed to identify paratope DNA sequences which are expressed by both the sample of step (i) and the samples of step (ii).
  • the expression of a paratope by two different immunoglobulin isotypes particularly an isotype characteristic of an early immune response (such as IgM or IgD) and an isotype characteristic of a maturing immune response which has undergone class switching (such as IgG, IgA or IgE) is indicative of the occurrence of an active immune response.
  • a tolerance of several nucleotide differences between sequences of different clusters is a threshold under which those paratope sequences may be classified as being the same paratope, albeit one which has undergone class switching.
  • the tolerance which is set will depend on the circumstances. For example, where affinity maturation has occurred, the tolerance level may be required to be softened in order to accommodate additional paratope DNA sequence differences in order to account for the somatic hypermutation that occurs in the context of the affinity maturation process.
  • sequence relationships between two or more polynucleotides include “reference sequence”, “comparison window”, “sequence similarity”, “sequence identity”, “percentage of sequence similarity”, “percentage of sequence identity”, “substantially similar” and “substantial identity”. Because two polynucleotides may each comprise (1) a sequence (i.e.
  • sequence comparisons between two (or more) polynucleotides are typically performed by comparing sequences of the two polynucleotides over a "comparison window" to identify and compare local regions of sequence similarity.
  • Optimal alignment of sequences for aligning a comparison window may be conducted by computerized implementations of algorithms (GAP, BESTFIT, FASTA, and TFASTA) or by inspection and the best alignment (i.e. resulting in the highest percentage identity over the comparison window) generated by any of the various methods selected.
  • paratope sequences may be compared over the full length of the paratope or part thereof which is of interest. Where multiple isolated contiguous regions of the paratope sequence are being analysed relative to the corresponding regions of other samples, each of these regions will be separately and individually compared, over the length of that discrete region, to its corresponding region in another sample.
  • sequence identity refers to the extent that sequences are identical or similar on a nucleotide-by-nucleotide basis over a window of comparison.
  • a “percentage of sequence identity” is calculated by comparing two optimally aligned sequences over the window of comparison, determining the number of positions at which the identical nucleic acid base (e.g. A, T, C, G) occurs in both sequences to yield the number of matched positions, dividing the number of matched positions by the total number of positions in the window of comparison (i.e., the window size), and multiplying the result by 100 to yield the percentage of sequence identity.
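The calculation described above can be expressed directly in code. The sketch below assumes that the two sequences have already been optimally aligned (so that they are of equal length, with "-" marking any gaps) and treats a gap position as a mismatch; both the gap handling and the function name are assumptions made for illustration.

```python
def percent_identity(aligned_a, aligned_b, start=0, end=None):
    """Percentage of sequence identity over a comparison window.

    aligned_a, aligned_b -- pre-aligned sequences of equal length
    start, end           -- window bounds (Python slice semantics)
    Matched positions are divided by the window size and multiplied by 100.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    window_a = aligned_a[start:end]
    window_b = aligned_b[start:end]
    if not window_a:
        raise ValueError("comparison window is empty")
    matched = sum(
        1 for x, y in zip(window_a, window_b) if x == y and x != "-"
    )
    return 100.0 * matched / len(window_a)


full = percent_identity("ACGTACGTAC", "ACGTACGTAA")
partial = percent_identity("ACGTACGTAC", "ACGTACGTAA", start=0, end=5)
```

Restricting the window, as in the second call, corresponds to comparing only a selected region such as the sequence encoding the CDR3.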
  • the method of the present invention is designed to identify the presence of both identical shared exon sequences and exon sequences exhibiting 90% or more identity across the length of the paratope exon sequence or exon sequence region of interest.
  • the comparison window between sequences may be directly comparable.
  • the sequences may be directly compared over the length of the sequence.
  • comparison is preferably made of all the exon sequence data which is obtained but it should be understood that one may also elect to compare only selected exon regions, such as the exons(s) encoding the CDR3 region or part thereof.
  • Reference to an “exon” should be understood as a reference to any region of DNA which encodes a protein product. “Exon” should also be understood to include reference to the N region junctions and P nucleotides which are often formed during immunoglobulin gene segment rearrangement and function to provide junctional diversity.
  • affinity maturation is the process by which activated B cells produce antibodies with increased affinity for an antigen during the course of an immune response. With repeated exposure to the same antigen, a subject will often produce antibodies of successively greater affinities. In fact, a secondary immune response can elicit antibodies with several fold greater affinity than in a primary response. Affinity maturation is generally the result of somatic hypermutation and selection by T helper cells.
  • the method of the present invention is designed to detect not only identical paratope sequences which are expressed by two immunoglobulin isotypes but, further, paratope sequences which exhibit sequence variation and which are indicative of affinity matured paratopes.
  • the skilled person can determine where to set the sequence identity analysis tolerances so as to either specifically identify the occurrence of affinity maturation or to simply categorise such maturation as indicative of the same paratope sequence as the sequences prior to affinity maturation, which would be relevant if the output of the experiment is to simply identify potential immunogenic epitopes for a given pathogen.
  • said percent identity which is indicative of two paratope sequences being the same is at least 96%, 97%, 98% or greater than 99% over the length of the sequence comparison window.
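For a sense of scale, an identity threshold can be converted into the number of nucleotide differences it permits over a given comparison window. The sketch below is a worked arithmetic example only; the 45-nucleotide window is a hypothetical length chosen for illustration, and integer arithmetic is used to avoid rounding artefacts.

```python
def max_mismatches(window_length, min_identity_pct):
    """Largest number of mismatched positions that still meets a minimum
    percent identity over a window of the given length, i.e.
    floor(window_length * (100 - min_identity_pct) / 100)."""
    return window_length * (100 - min_identity_pct) // 100


# Hypothetical 45-nucleotide comparison window
allowed = {pct: max_mismatches(45, pct) for pct in (90, 96, 97, 98, 99)}
```

Over such a window a 90% threshold tolerates four differences, whereas thresholds of 98% or more tolerate none, which illustrates why the tolerance may need to be relaxed where somatic hypermutation is expected.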
  • the method of the present invention thereby provides for the step of the further characterisation and refinement of candidate paratope determination via correlation of the step (iv) results with population based data analysis which can be performed using rearranged genomic B cell IgH paratope sequence data. Nevertheless, the method of the present invention does not exclude the sequencing and analysis of IgH paratope mRNA expression if this is available.
  • In terms of the analysis of other subjects (e.g. the wider population) in accordance with step (vi), it should be understood that these subjects may have been infected at any time and need not be recently infected. These subjects may therefore have had a primary infection or a secondary infection and, in terms of their immune response to the subject pathogen, may be at any stage of immunoresponsiveness.
  • the subjects which are analysed in step (vi) may be any number of subjects provided that at least one other subject is analysed.
  • Reference to “other subject” in this regard should therefore be understood as a reference to subjects other than the subject who undergoes initial analysis in step (i) and (ii).
  • Preferably, as many rearranged IgH sequence data results as available are analysed.
  • sequencing results from single or multiple sequential time points for any given subject may be analysed.
  • reference to screening “nucleic acid sequence data” generated by these subjects should be understood as a reference to any type or form of IgH sequence data including mRNA, cDNA or genomic data.
  • said nucleic acid sequence data is genomic DNA sequence data.
  • Once the candidate paratope sequence data of step (iv) is known, it is well within the skill of the person in the art to correlate these sequences with any of the previously described nucleic acid sequence data which are available in terms of other subjects who have been infected with the pathogen of interest. Accordingly, to the extent that B cell IgH sequence data which encodes the paratope region becomes available for any reason, such as in the context of lymphoid clonal analyses, these data can also be reviewed in the context of the present method if the medical history of the patient indicates prior infection with the subject pathogen.
  • In the context of step (vi), it is within the skill of the person in the art to align and analyse all or part of the sequence data of step (iv) with whatever equivalent data is available, as described hereinbefore.
  • It is not uncommon in certain B cell clonal studies to only amplify and sequence the DNA encoding the CDR3 paratope region of the rearranged B cell genomic DNA, which corresponds to the sequencing of part of the V gene segment, all of the D gene segment and part of the J gene segment.
  • Even if the subject data of step (iv) extends to a significantly larger region of the IgH paratope, the skilled person will simply correlate only that part of the step (iv) exon sequence which is relevant to the exon sequence data available in respect of the step (vi) subject.
  • the method of the present invention may utilise data which is newly generated, such as where a peripheral blood sample is received for an unrelated reason (e.g. as lymphoid clonal analysis of genomic DNA) but is also used to isolate and sequence B cell IgH paratope mRNA.
  • this approach, commonly referred to as data mining, can be readily accomplished by receiving FASTQ files of subjects tested by next generation sequencing tests that interrogate the antigen receptor loci.
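As a rough illustration of working with received FASTQ files, the sketch below reads four-line FASTQ records from a text handle using only the standard library. It assumes the common unwrapped layout (header, sequence, separator, quality); the function name is hypothetical, and a production pipeline would more likely rely on an established parser.

```python
from typing import Iterator, TextIO, Tuple


def read_fastq(handle: TextIO) -> Iterator[Tuple[str, str, str]]:
    """Yield (read_id, sequence, quality) from a four-line-per-record FASTQ."""
    while True:
        header = handle.readline()
        if not header:
            return  # end of file
        if not header.strip():
            continue  # tolerate blank lines between records
        if not header.startswith("@"):
            raise ValueError(f"malformed FASTQ header: {header!r}")
        sequence = handle.readline().strip()
        handle.readline()  # '+' separator line, ignored
        quality = handle.readline().strip()
        if len(sequence) != len(quality):
            raise ValueError("sequence and quality lengths differ")
        yield header[1:].strip(), sequence, quality
```

The yielded sequences can then be passed to whatever alignment or identity comparison is used for the population screen.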
  • Reference in step (vi) to screening the IgH paratope data of other subjects to identify the presence of IgH paratope nucleic acid sequences which “correlate” to the candidate paratope sequence identified in step (iv) should be understood as a reference to identifying sequences which exhibit exact identity to one another or otherwise exhibit a significant level of identity. This should be understood to include:
  • Where, regardless of the degree of sequence identity, the specific V, D and/or J gene segments of the rearranged IgH paratope of step (iii) have been classified in terms of their family member identity and these correspond to the V, D and/or J gene segments which have rearranged to form the IgH paratope of the subject of step (vi).
  • By screening the population of step (vi) to identify evidence of the clonal expansion of B cells expressing an IgH which encodes a paratope amino acid sequence or sequence motif which correlates to a candidate paratope on the basis that it exhibits homology when one or more conservative amino acid substitutions are deemed to exhibit identity, a significantly more accurate determination can be made of candidate paratopes which are expressed and function across the wider population.
  • the substitution of one or more amino acids in the CDR or framework regions with a conservative substitution will not adversely impact this functionality.
  • a conservative amino acid substitution substitutes one amino acid for another of the same class such as substitution of one hydrophobic amino acid, such as isoleucine, valine, leucine, or methionine, for another, or substitution of one polar amino acid for another, such as substitution of arginine for lysine, glutamic acid for aspartic acid or glutamine for asparagine.
  • conservative amino acid substitutions typically include substitutions within the following groups: glycine and alanine; valine, isoleucine and leucine; aspartic acid and glutamic acid; asparagine and glutamine; serine and threonine; lysine and arginine; and phenylalanine and tyrosine.
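The substitution groups listed above can be applied directly when comparing two paratope amino acid sequences. The following is a minimal sketch in which residues from the same group are deemed identical; the function names are illustrative and the sequences are assumed to be pre-aligned and of equal length.

```python
# Conservative groups exactly as listed in the text (one-letter codes).
CONSERVATIVE_GROUPS = [
    set("GA"),   # glycine, alanine
    set("VIL"),  # valine, isoleucine, leucine
    set("DE"),   # aspartic acid, glutamic acid
    set("NQ"),   # asparagine, glutamine
    set("ST"),   # serine, threonine
    set("KR"),   # lysine, arginine
    set("FY"),   # phenylalanine, tyrosine
]


def residues_equivalent(a: str, b: str) -> bool:
    """True if residues are identical or fall in the same conservative group."""
    if a == b:
        return True
    return any(a in group and b in group for group in CONSERVATIVE_GROUPS)


def conservative_identity(seq_a: str, seq_b: str) -> float:
    """Percent identity where conservative substitutions count as matches."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be non-empty and of equal length")
    hits = sum(residues_equivalent(a, b) for a, b in zip(seq_a, seq_b))
    return 100.0 * hits / len(seq_a)
```

For example, under these groups the illustrative sequences `CARDY` and `CAKEF` score as fully identical, whereas `CARDY` and `CAWDY` differ at one of five positions.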
  • the candidate paratope sequences which are identified in step (vi) as being shared with sequences identified in step (iv) are then assessed to determine what percentage of the subjects in step (vi) are found to exhibit the candidate paratope sequence. Where at least one of the subjects analysed in step (vi) is found to express the candidate sequence, the status of the sequence as a candidate paratope is maintained.
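The prevalence assessment described above amounts to simple counting. The sketch below assumes each step (vi) subject is represented by a set of paratope sequences (a hypothetical data layout) and uses exact matching for brevity; any of the identity measures discussed earlier could be substituted.

```python
from typing import Dict, Set


def prevalence(candidate: str, subject_sequences: Dict[str, Set[str]]) -> float:
    """Percentage of subjects whose repertoire contains the candidate."""
    if not subject_sequences:
        raise ValueError("at least one subject is required")
    carriers = sum(1 for seqs in subject_sequences.values() if candidate in seqs)
    return 100.0 * carriers / len(subject_sequences)


def candidate_maintained(candidate: str, subject_sequences: Dict[str, Set[str]]) -> bool:
    """Candidate status is kept if at least one other subject carries it."""
    return any(candidate in seqs for seqs in subject_sequences.values())
```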
  • the degeneracy of the genetic code results in a range of different DNA sequences potentially coding for the same amino acid sequence.
  • the present method optionally provides for the in silico determination of the amino acid sequence which is translated from the candidate paratope sequence identified in step (iv). Once this sequence is known, and taking into account DNA codon degeneracy, all possible DNA sequences which would code for that amino acid sequence can be predicted.
  • The IgH paratope nucleic acid sequence data of step (vi) can be examined to determine whether any of the alternative and degenerate nucleic acid sequences are present, thereby more accurately indicating the scope of antigens to which an immune response has been generated in those subjects.
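The degeneracy analysis described above can be sketched using the standard genetic code. The helper names below are hypothetical, and the sketch assumes in-frame DNA sequences. Because the number of degenerate sequences grows multiplicatively with peptide length, `degenerate_hits` translates the observed sequences and compares at the amino acid level instead of enumerating every variant, which gives the same answer.

```python
from collections import defaultdict
from itertools import product
from typing import Dict, Iterable, Iterator, List

# Standard genetic code, built in TCAG order ('*' marks stop codons).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE: Dict[str, str] = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
CODONS_FOR: Dict[str, List[str]] = defaultdict(list)
for _codon, _aa in CODON_TABLE.items():
    CODONS_FOR[_aa].append(_codon)


def translate(dna: str) -> str:
    """Translate an in-frame DNA sequence; unknown codons become 'X'."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, usable, 3))


def degenerate_sequences(peptide: str) -> Iterator[str]:
    """Yield every DNA sequence that encodes the given peptide."""
    for codons in product(*(CODONS_FOR[aa] for aa in peptide)):
        yield "".join(codons)


def degenerate_hits(peptide: str, observed: Iterable[str]) -> List[str]:
    """Observed DNA sequences that encode the candidate peptide."""
    return [dna for dna in observed if translate(dna) == peptide]
```

For a three-residue illustrative peptide such as `CAR`, the enumeration already yields 2 x 4 x 6 = 48 DNA sequences.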
  • another embodiment of the present invention is directed to a method of identifying candidate immunoglobulin paratopes or part thereof directed to a pathogen of interest, said method comprising:
  • step (iii) sequencing the cDNA reverse transcribed from the mRNA encoding the paratope or part thereof of the IgM heavy chain of the sample of step (i) and the IgH isotype switched nucleic acid of the one or more samples of step (ii) and wherein said paratope or part thereof comprises at least the CDR3 region or part thereof;
  • step (iv) analysing the cDNA sequencing results of step (iii) to identify paratope encoding cDNA sequences which are expressed by both the sequencing sample result of the step (i) sample and the sequencing sample result of the step (ii) sample and which exhibit at least 90% identity across the length of the exon sequences or at least 90% identity across the length of the exon sequences encoding the CDR3 region;
  • step (vi) screening IgH paratope DNA sequence data from one or more other subjects infected with said pathogen to identify the presence of IgH paratope nucleic acid sequences which either correlate to any of the paratope sequences identified in step (iv) or which encode the amino acid sequences or homologs thereof of step (v) in at least one of said other subjects; wherein IgH paratope DNA sequences identified in step (vi) are indicative of a candidate IgH paratope directed to said pathogen.
  • said pathogen is a microorganism.
  • the B cell nucleic acid is derived from a sample of peripheral blood mononuclear cells.
  • the B cell nucleic acid is derived from a sample of peripheral blood lymphocytes.
  • sequential samples are isolated 2, 3, 4, 5, 6, 7 or 8 weeks apart, for example 4-8 weeks apart.
  • the nucleic acid sequence data of step (vi) is mRNA sequence data and said IgH is IgG.
  • the nucleic acid sequence data of step (vi) is rearranged IgH genomic DNA sequence data.
  • step (iii) comprises non-selectively reverse transcribing the RNA of the sample of steps (i) and (ii) and (a) selectively amplifying the paratope region or fragment thereof of the IgM heavy chain of the cDNA transcribed from the sample of step (i), (b) selectively amplifying the IgG heavy chain from the cDNA transcribed from the one or more samples of step (ii) and sequencing the amplification products of (a) and (b)
  • step (iii) comprises selectively reverse transcribing the IgM heavy chain mRNA of the sample of step (i) and the IgG heavy chain mRNA of the sample of step (ii), amplifying the paratope region or fragment thereof of the cDNA transcribed from the samples of steps (i) and (ii) wherein said paratope regions comprise at least the CDR3 region or part thereof and sequencing said amplification product;
  • said paratope which is amplified includes, but is not limited to:
  • DJ or VDJ rearrangements of IgH are amplified and sequenced.
  • V gene segment region such as a region predisposed to undergoing hypermutation
  • J gene segment region encoding a portion of the CDR3 are amplified and sequenced.
  • recombinant DNA technology will enable the insertion of the VDJ-Constant DNA region sequence of that paratope into a vector for recombinant antibody expression, while the paratope sequence information itself may facilitate synthetic antibody generation.
  • the invention also contemplates synthetic or recombinant antibody fragments including, for example, Fv, Fab, Fab’ and F(ab’)2 fragments or antigen-binding molecules such as synthetic stabilised Fv fragment.
  • Exemplary fragments of this type include single chain Fv fragments (sFv, frequently termed scFv) in which a peptide linker is used to bridge the N terminus or C terminus of a VH domain with the C terminus or N-terminus, respectively, of a VL domain.
  • ScFv lack all constant parts of whole antibodies.
  • the synthetic stabilized Fv fragment comprises a disulphide stabilized Fv (dsFv) in which cysteine residues are introduced into the VH and VL domains such that in the fully folded Fv molecule the two residues will form a disulphide bond therebetween.
  • Also contemplated as synthetic or recombinant antigen-binding molecules are single variable region domains (termed dAbs).
  • the synthetic or recombinant antigen-binding molecule may comprise a “minibody”.
  • minibodies are small versions of whole antibodies, which encode in a single chain the essential elements of a whole antibody.
  • the minibody is comprised of the VH and VL domains fused to the hinge region and CH3 domain of the immunoglobulin molecule.
  • the synthetic or recombinant antigen-binding molecule may be multivalent (i.e. having more than one antigen binding site). Such multivalent molecules may be specific for one or more antigens. Multivalent molecules of this type may be prepared by dimerization of two antibody fragments through a cysteinyl-containing peptide as, for example disclosed by. Alternatively, dimerization may be facilitated by fusion of the antibody fragments to amphiphilic helices that naturally dimerize or by use of domains (such as leucine zippers jun and fos) that preferentially heterodimerize.
  • this methodology can be used to effect the extracellular single-chain variable fragment (scFv) expression of an antibody directed to the subject paratope.
  • one may seek to use these vectors to develop CAR-T cells designed for therapeutic use.
  • CAR-T technology although largely focussed on its applicability to the treatment of cancer, is also being repurposed for the treatment of infection, in particular chronic infection.
  • the epitope to which it is specifically directed can be isolated, sequenced and thereafter recombinantly or synthetically produced.
  • one may use engineered scFv to immunoprecipitate and identify antigens expressing the subject epitope from the blood of infected individuals or from cell cultures of the subject pathogen, such as cultures of SARS-CoV-2 infected human cells, to enable the identification of pathogen derived antigen epitopes to target for vaccine production.
  • MALDI-TOF mass spectrometry or other technologies can also be used to identify the antigen amino acid sequence in order to enable the development of expression systems for candidate vaccine production.
  • the generation of immunogenic molecules suitable for use as a vaccine is a priority.
  • the epitope to which a paratope binds may not, in isolation, be immunogenic.
  • an epitope which is a hapten may be coupled to a carrier protein to facilitate the induction of an immune response to the hapten.
  • antibodies expressing the subject paratope may be used to isolate a larger antigen molecule which expresses the epitope to which the paratope binds.
  • the antibodies or other form of antigen binding molecule expressing the paratopes identified in accordance with the present invention may be utilised as a diagnostic tool for screening for the presence of the pathogens to which these paratopes are directed.
  • These diagnostic tools are based on leveraging basic immunointeraction principles between antibodies and antigens and provide the possibility of extremely rapid and therefore highly valuable point of care diagnostic tests.
  • the range of assays, suitable for use either at the point of care or in a laboratory, which can be developed include but are not limited to paper radio immunoabsorbent test (PRIST), enzyme linked immunoabsorbent assay (ELISA), radio-immunoassay (RIA), immunoradiometric assay (IRMA) and luminescence immunoassay (LIA). Still further, one may develop Western blotting or flow cytometry procedures which are also predicated on the use of immunointeractions.
  • the ability to identify, isolate and generate the antigen molecules to which these paratopes are directed enables the development of tests directed to screening for the presence of antibodies in a patient, for example, who may be infected with the pathogen.
  • the application of the relevant antigen to an ELISA or other such test system enables rapid and accurate screening of a biological sample for the presence of antibody.
  • Such methods are also useful for the ongoing monitoring or assessment of the immune status of individuals who have been previously infected with said pathogen or assessing the immune status of individuals vaccinated with an antigen derived from the pathogen.
  • the method of the present invention enables identification of the diversity and selectivity of paratopes directed to the range of epitopes by a population of subjects vaccinated with a specific vaccine. Accordingly, there is provided a means of determining the efficacy of vaccines and vaccine candidates. Still further, the skilled person would appreciate that the functionality of any vaccine is dependent on the MHC haplotype expressed by the subject in issue and the extent to which effective presentation of the selected antigen can be achieved by a given MHC haplotype in order to maximize vaccine immunogenicity.
  • HLA haplotypes differ between subjects, and vaccines which may be efficiently presented by one class of haplotype may not be efficiently presented by another haplotype.
  • the present invention enables the identification and selection of vaccine candidates either for specific racial or ethnic populations which exhibit unique MHC haplotypes or to enable the design of a “pan-vaccine candidate” that is effective across a wide range of different MHC haplotypes.
  • RNA is extracted from the peripheral blood lymphocytes and 2 separate cDNA synthesis reactions are performed on aliquots of RNA using primers that reverse transcribe the entire IGH repertoire using an IGH constant-region-specific primer for cDNA synthesis.
  • An IgM specific cDNA synthesis is performed and an IgG specific cDNA synthesis is performed.
  • the cDNA populations are amplified with separate indices using the LymphoTrack® IGH FR1 Assay.
  • the IgM-specific clonal populations in the initial sample are identified and the IgG-specific clonal populations in the later samples are identified.
  • LymphoTrack® analyses are repeated at one or more subsequent time points testing genomic DNA samples.
  • Sequences are identified that can be confidently determined (employing the appropriate filtered mutation rate), to be shared or correlated between the samples.
  • Identical or nearly identical CDR3 DNA sequences are identified and the translated protein sequence(s) are determined.
  • Population based analysis of patients determined to be positive for the SARS-CoV-2 virus is performed to identify sequences encoding identical or nearly identical CDR3 sequences or degenerate sequences encoding identical or nearly identical CDR3 sequences or amino acid sequence motifs.
  • RNA is extracted from the peripheral blood lymphocytes and cDNA synthesis is performed on aliquots of RNA using primers that reverse transcribe the entire IGH repertoire using primers that target all of the heavy chain constant regions.
  • the cDNA population is amplified using only the LymphoTrack® IGH V-region upstream primers together with the downstream heavy chain Constant region-specific primers.
  • the IgM-specific clonal populations in the initial sample are identified and the IgG-specific clonal populations in the later samples are identified testing cDNA or genomic DNA samples.
  • the IgG analyses are repeated at one or more subsequent time points.
  • Sequences are identified that can be confidently determined (employing the appropriate filtered mutation rate), to be shared or correlated between the samples.
  • Identical or nearly identical CDR3 DNA sequences are identified and the translated protein sequence(s) are determined.
  • Population based analysis of patients determined to be positive for the SARS-CoV-2 virus is performed to identify sequences encoding identical or nearly identical CDR3 sequences or degenerate sequences encoding identical or nearly identical CDR3 sequences or amino acid sequence motifs.
  • RNA is extracted from the peripheral blood lymphocytes from the sample obtained at the earlier time point and a cDNA synthesis reaction is performed on an aliquot of RNA using primers that reverse transcribe the entire IGH repertoire using at least one IGH IgM constant-region-specific primer for cDNA synthesis.
  • the IgM specific cDNA population is amplified using the LymphoTrack® IGH FR1 Assay. The IgM-specific clonal populations in the initial sample are identified.
  • DNA is extracted from the peripheral blood lymphocytes from the sample obtained at one or more subsequent or later time points.
  • the genomic DNA population is amplified using the LymphoTrack® IGH FR1 Assay.
  • Sequences are identified that can be confidently determined (employing the appropriate filtered mutation rate), to be shared or correlated between the initial IgM cDNA and later genomic DNA samples. Identical or nearly identical CDR3 DNA sequences are identified and the translated protein sequence(s) are determined.
  • VCD viable cell density
  • Viability %.
  • RLT buffer + 2-mercaptoethanol solution (lysis buffer): Minimum volume required is 600 µl per sample of 5 x 10^6 live cells. In chemical fume hood, add 10 µl 2-mercaptoethanol per 1 ml RLT buffer. Vortex for 10 seconds after observing a visible swirl. Buffer RLT/BME solution may be stored at 4°C for up to a month. Vortex lysed cells vigorously and thoroughly, for at least 10 seconds after a stable vortex is formed on high. If clumps are still present, then freeze lysate at -80°C.
  • Extract RNA using Qiagen RNeasy mini kit.
  • Frozen lysates should be incubated at 37°C in a water bath until completely thawed and salts are dissolved.
  • Collect 4-8 mL of peripheral blood from said individual in EDTA tubes by a phlebotomist at an interval corresponding to the peak period of IgM response (7-15 days) following pathogen exposure, antigen challenge, or vaccination.
  • RLT buffer + 2-mercaptoethanol solution (lysis buffer): Minimum volume required is 600 µl per sample of 5 x 10^6 live cells. In chemical fume hood, add 10 µl 2-mercaptoethanol per 1 ml RLT buffer. Vortex for 10 seconds after observing a visible swirl. Buffer RLT/BME solution may be stored at 4°C for up to a month. Vortex lysed cells vigorously and thoroughly, for at least 10 seconds after a stable vortex is formed on high. If clumps are still present, then freeze lysate at -80°C. Extract RNA using Qiagen RNeasy mini kit. Frozen lysates (from step 2.2.14) should be incubated at 37°C in a water bath until completely thawed and salts are dissolved.
  • RNAse-free water Reuse the collection tube.
  • 6 Quantify the RNA using Qubit RNA kit.
  • 7 Freeze the extracted RNA at -80°C.
  • In separate reactions, reverse transcribe extracted RNA from said baseline and second time point specimens with a primer directed to the IgM heavy chain constant region. Perform cDNA synthesis using Invitrogen’s Superscript II reverse transcriptase and IgM primer (5’-ACG GGG AAT TCT CAC AGG AG-3’) (SEQ ID NO: 1).
  • Use Thermal Program 1. Add cDNA Step 2 MM into Step 1 MM.
  • Use Thermal Program 2. Add 1 µl of Superscript II reverse transcriptase per reaction.
  • Use Thermal Program 3. Sequence said reverse transcribed baseline and second time point specimen cDNAs using a nucleic acid sequencing assay that targets and identifies members of the subject immunoglobulin repertoire. Test cDNA from Step 3 using LymphoTrack IGH FR1 - MiSeq Assay (IFU 280389 revG). Amplify cDNA using LymphoTrack IGH FR1 PCR MM. Each sample uses a different index for PCR MM. Use Thermal Program 4. Purify the PCR product using AmPure XP beads at 1x ratio. Quantify the purified PCR product using Agilent BioAnalyzer.
  • Use bioinformatics software to identify, in said future time point specimens, candidate clonotypes and amino acid CDR sequences that were previously identified.
  • the software can search for the previously identified candidate clonotype within later timepoints. This search will be based solely on sequence identity.
  • Use alignment software to track affinity maturation (clones with related but not identical nucleic acid or CDR amino acid sequences), and to identify candidate antibody molecules vs databases of previously identified neutralizing antibody molecules.
  • Candidate clonotypes from multiple patients/sources will be compared at the amino acid level of their CDRs to identify subsets of similar binding domains. The comparison will be done not with just sequence identity, but with chemical similarity parameters (e.g. amino acids with similar hydrophobicity are considered similar).
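One way to sketch a comparison based on chemical similarity is with the published Kyte-Doolittle hydropathy scale. The choice of that scale, the tolerance value and the function names are all assumptions made for illustration; the source does not specify which similarity parameters are used, and a single-scale comparison of this kind is deliberately crude.

```python
# Kyte-Doolittle hydropathy index per residue (one-letter codes).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def chemically_similar(a: str, b: str, tolerance: float = 1.0) -> bool:
    """Residues are similar if identical or close in hydropathy.

    The tolerance is a hypothetical parameter, not a value from the source.
    """
    if a == b:
        return True
    return abs(KYTE_DOOLITTLE[a] - KYTE_DOOLITTLE[b]) <= tolerance


def cdr_similarity(cdr_a: str, cdr_b: str, tolerance: float = 1.0) -> float:
    """Fraction of aligned positions judged chemically similar."""
    if len(cdr_a) != len(cdr_b) or not cdr_a:
        raise ValueError("CDRs must be non-empty and of equal length")
    hits = sum(chemically_similar(a, b, tolerance) for a, b in zip(cdr_a, cdr_b))
    return hits / len(cdr_a)
```

Clonotypes whose pairwise similarity exceeds a chosen cut-off could then be grouped into subsets of similar binding domains.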
  • the LymphoTrack IGH FR1 Assay - MiSeq targets the conserved framework 1 (FR1) region within the VH segments of the IGH gene to identify clonal IGH VH-JH rearrangements and the associated VH-JH region DNA sequences, and provides the frequency distribution of VH region and JH region segment utilization and the degree of somatic hypermutation (SHM) of rearranged genes using the Illumina MiSeq platform.
  • the LymphoTrack IGH FR2 Assay - MiSeq targets the conserved framework 2 (FR2) region within the VH segments of the IGH gene to identify clonal IGH VH-JH rearrangements and the associated VH-JH region DNA sequences, and provides the frequency distribution of VH region and JH region segment utilization using the Illumina MiSeq platform.
  • the LymphoTrack IGH FR3 Assay - MiSeq targets the conserved framework 3 (FR3) region within the VH segments of the IGH gene to identify clonal IGH VH-JH rearrangements and the associated VH-JH region DNA sequences, and provides the frequency distribution of VH region and JH region segment utilization using the Illumina MiSeq platform.
  • Each single multiplex master mix targets one of the conserved IGH framework regions (FR1, FR2, or FR3) within the VH and the JH regions described in lymphoid malignancies.
  • Primers included in the master mixes are designed with Illumina adapters and up to 48 different indices. This method allows for a one-step PCR, and pooling of amplicons from several different samples and targets (generated with other LymphoTrack Assays for the Illumina MiSeq instrument) onto one MiSeq flow cell, allowing for up to 48 samples per target to be analyzed in parallel in a single run.
  • PCR amplicons are purified to remove excess primers, nucleotides, salts, and enzymes using the Agencourt® AMPure® XP system.
  • This method utilizes solid-phase reversible immobilization (SPRI) paramagnetic bead technology for high-throughput purification of PCR amplicons.
  • PCR products 100 bp or larger are selectively bound to paramagnetic beads while contaminants such as excess primers, primer dimers, salts, and unincorporated dNTPs are washed away. Amplicons can then be eluted and separated from the paramagnetic beads resulting in a more purified PCR product for downstream analysis and amplicon quantification.
  • Purified amplicons are quantified using the KAPA™ Library Quantification Kits for Illumina platforms.
  • Purified and diluted PCR amplicons and a set of six pre-diluted DNA standards are amplified by quantitative (qPCR) methods, using the KAPA SYBR® FAST qPCR Master Mix and primers.
  • the primers in the KAPA kit target Illumina P5 and P7 flow cell adapter oligo sequences.
  • NGS technologies used in this assay rely on the amplification of genetic sequences using a series of consensus forward and reverse primers that include adapter and index tags.
  • Amplicons generated with the LymphoTrack Master Mixes are quantified, pooled, and loaded onto a flow cell for sequencing with an Illumina MiSeq sequencing platform.
  • the amplified products in the library are hybridized to oligonucleotides on a flow cell and are amplified to form local clonal colonies (bridge amplification).
  • Four types of reversible terminator bases (RT-bases) are added and the sequencing strand of DNA is extended one nucleotide at a time.
  • a CCD camera takes an image of the light emitted as each RT-base is added, after which the terminator group is cleaved to allow incorporation of the next base.
  • LymphoTrack allows for two different levels of multiplexing in order to reduce costs and time for laboratories.
  • the first level of multiplexing originates from the multiple indices that are provided with the assays. Each of these 48 indices acts as a unique barcode that allows amplicons from individual samples to be pooled together after PCR amplification to generate the sequencing library; the resulting sequences are sorted by the bioinformatics software, which identifies those that originated from an individual sample.
  • the second level of multiplexing originates from the ability of the accompanying software to sort sequencing data by both index and target. This allows amplicons generated with targeted primers (even those tagged with the same index) to be pooled together to generate the library to be sequenced on a single flow cell.
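The two levels of sorting can be pictured as binning reads by an (index, target) pair. The sketch below is not the LymphoTrack Software; it assumes each read has already been annotated with its index and target, which is a hypothetical input layout chosen purely for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# Hypothetical read layout: (index, target, sequence).
Read = Tuple[str, str, str]


def sort_reads(reads: Iterable[Read]) -> Dict[Tuple[str, str], List[str]]:
    """Group sequences by (index, target) so pooled amplicons can be separated."""
    bins: Dict[Tuple[str, str], List[str]] = defaultdict(list)
    for index, target, sequence in reads:
        bins[(index, target)].append(sequence)
    return dict(bins)
```

Reads that share an index but come from different targets land in different bins, which is what allows amplicons tagged with the same index to be pooled on one flow cell.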
  • the number of samples that can be multiplexed onto a single flow cell is also dependent on the flow cell that is utilized.
  • Illumina’s standard flow cells (MiSeq v3) can generate 20-25 million reads. To determine the number of reads per sample, divide the total number of reads for the flow cell by the number of samples that will be multiplexed. Illumina also manufactures other flow cells that utilize the same sequencing chemistry, but generate fewer reads. When using these alternative flow cells one must consider that fewer total reads either means less depth per sample or fewer samples can be run on the flow cell to achieve the same depth per sample.
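The depth arithmetic described above can be written out as a short sketch. The function names and the example figures in the usage note are illustrative and assume reads are spread evenly across samples, which real runs only approximate.

```python
def reads_per_sample(total_reads: int, n_samples: int) -> int:
    """Expected reads per sample if reads are split evenly across samples."""
    if n_samples <= 0:
        raise ValueError("n_samples must be positive")
    return total_reads // n_samples


def max_samples(total_reads: int, required_depth: int) -> int:
    """Largest number of samples that still reaches the required depth each."""
    if required_depth <= 0:
        raise ValueError("required_depth must be positive")
    return total_reads // required_depth
```

With 20 million reads and 48 multiplexed samples, this gives roughly 416,000 reads per sample; a flow cell yielding fewer total reads lowers either that depth or the number of samples accordingly.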
  • the minimum input quantity is 50 ng of high quality DNA (5 µL of sample DNA at a minimum concentration of 10 ng/µL). DNA must be quantified and be free of PCR inhibitors. Resuspend DNA in an appropriate solution such as 0.1X TE (1 mM Tris-HCl, 0.1 mM EDTA, pH 8.0, prepared with molecular biology grade water) or molecular biology grade water alone.
  • RNA is extracted within 48 hours.
  • archived cells frozen in 10% DMSO + 90% Fetal Bovine Serum (FBS) can be used.
  • Heparin is a powerful inhibitor of PCR.
  • Ficoll and other separation media are generally effective at removing heparin from samples.
  • Amplification controls that test for presence of a housekeeping gene transcript are used to detect amplification inhibitors.
  • Peripheral blood lymphocytes (PBLs) are isolated by diluting whole blood 1:1 with RPMI 1640 and banding on Ficoll-Hypaque. Cells collected at the interface are washed 3x with RPMI or phosphate buffered saline (PBS) before RNA extraction.
  • Step 1 Isolate cells using separation media (e.g., Ficoll-Hypaque); rinse using PBS or RPMI 1640.
  • Step 2 Extract RNA from unknown samples.
  • Step 3 Synthesize cDNA from unknown sample RNAs, and the positive and negative control RNAs included in the assay kit.
  • Specimen Control Size Ladder Master Mix from Invivoscribe (2-096-0021 for ABI detection or 2-096-0020 for gel detection).
  • the Specimen Control Size Ladder targets multiple genes and generates a series of amplicons of approximately 100, 200, 300, 400 and 600 bp.
  • NTC no template control
  • Amplification Amplify the samples using the PCR program:
  • the quantity of library DNA loaded onto the MiSeq flow cell is critical for generating optimal cluster density and obtaining high-quality data in a sequencing run.
  • a separate pool is created for each LymphoTrack Assay and corresponding target. After final quantification of a pooled library for each target, LymphoTrack Assays are sequenced individually or can be multiplexed together.
  • a MiSeq Sample Sheet is established using the Illumina Experiment Manager.
  • the MiSeq produces sequencing data that is analyzed using the LymphoTrack Software (7-500-0009). Samples prepared with LymphoTrack IGH Assays can be processed into fully analyzed data using the LymphoTrack Software - MiSeq.
  • NGS Negative Control top % reads ⁇ 1.0%
  • the first 33 nts are not unique in the genome, and the reverse primer is designed immediately after this region, ending at position 55. Accordingly, 55 nts of this cDNA will be IGH-M, and everything upstream is VDJ.
  • the primer design extends about 15 nts into the exon, and ends at position 35, meaning 35 nts of this cDNA is IGH-G, and everything upstream is VDJ. These primers exhibit a higher Tm than typical (GC rich area). chr14 + 105856164 105856183
  • B cell mRNA was extracted and reverse-transcribed into cDNA with an IgM-specific primer.
  • gDNA was extracted from samples taken at the later time points. Both cDNA and gDNA were analysed using the LymphoTrack IGH FR1 Assay and Software. Clonally expanded VJ rearrangements which were prevalent in the sequentially drawn samples post-pathogen exposure but not in the baseline sample were identified. A clonal sequence found emerging at a later timepoint but absent from the baseline sample was designated a candidate clonal sequence. Candidate clonal sequences were further analysed by designating their intrinsic CDR segments.
  • IgM sequences which include the CDR3 nucleic acid and translated amino acid clonal sequences. These sequences were identified as present at the first post-exposure timepoint in each subject but not in the baseline samples. The CDR3 region is shown in bold in the amino acid sequence. These candidate clonotypes are considered emerging clonotypes as the baseline sample does not contain an IgM clonal signal of the same sequence.
  • the sequence information which was obtained included the CDR1, CDR2 and CDR3 regions, although only the CDR3 sequence is shown since the CDR3 is the most likely region to show variable binding affinity.
  • the CDR sequences of the candidate clonotypes are used to track emerging clones at later timepoints via a direct alignment of the nucleic acid sequences in the later gDNA sample datasets. These CDR sequences are also used to define unique sets of pathogen-specific binding affinities by comparing the amino acid sequences amongst the candidate clonotypes, and to broad public data sets. In Figure 1, multiple CDR3 sequences are shown, detected from the IGH repertoire analysis of patients recovering from a COVID-19 infection.
  • Figure 1: https://www.cell.com/immunity/pdf/S1074-7613(20)30279-
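
The selection of candidate clonal sequences (expanded after exposure, absent at baseline) and their tracking in later gDNA datasets can be sketched in Python. This is only an illustrative sketch, not the LymphoTrack Software's method: it assumes each sample has already been reduced to a mapping from clonal nucleotide sequence to read count, it substitutes exact substring matching for the "direct alignment" step, and the function names, the `min_fraction` cutoff, and the toy sequences are all hypothetical.

```python
from typing import Dict, List


def emerging_clonotypes(
    baseline: Dict[str, int],
    later: Dict[str, int],
    min_fraction: float = 0.01,
) -> List[str]:
    """Return sequences clonally expanded in `later` but absent from `baseline`.

    `min_fraction` is an illustrative cutoff (fraction of total reads),
    not a value taken from the source text.
    """
    total = sum(later.values())
    if total == 0:
        return []
    candidates = [
        seq
        for seq, count in later.items()
        if seq not in baseline and count / total >= min_fraction
    ]
    # Most abundant first; ties broken alphabetically so output is deterministic.
    return sorted(candidates, key=lambda s: (-later[s], s))


def track_clonotypes(cdr3_list: List[str], reads: List[str]) -> Dict[str, int]:
    """Count reads in a later dataset that contain each candidate CDR3 exactly.

    Exact substring matching is an assumption standing in for alignment.
    """
    return {cdr3: sum(1 for read in reads if cdr3 in read) for cdr3 in cdr3_list}


# Toy example with made-up sequences (not real repertoire data).
baseline_sample = {"ACGTACGT": 50, "TTTTCCCC": 50}
post_exposure = {"ACGTACGT": 40, "GGGGAAAA": 55, "CCCCGGGG": 5}
candidates = emerging_clonotypes(baseline_sample, post_exposure, min_fraction=0.1)
later_reads = ["TTGGGGAAAATT", "ACGTACGT", "GGGGAAAA"]
tracked = track_clonotypes(candidates, later_reads)
```

In practice a tolerant alignment (allowing mismatches from sequencing error or somatic hypermutation) may be more appropriate than exact matching; that choice is left open here.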

Abstract

Aspects described herein relate to methods of identifying candidate immunoglobulin paratopes directed against a pathogen of interest. Some embodiments relate to screening for candidate immunoglobulin paratopes directed against a pathogen-derived epitope by screening for immunoglobulin paratopes that have both undergone class switching and are expressed across a genetically diverse infected population. The approaches described herein facilitate the detection and analysis of immunologically dominant paratopes, together with the epitopes against which they are directed. Paratopes identified by the screening methods can be used to identify paratope sequences for the production of recombinant antibodies, to identify and isolate candidate epitopes and immunogens for vaccine development, to develop point-of-care diagnostics, to develop immunogen expression systems, to identify and/or develop neutralizing antibodies, to assess the immune status of subjects previously infected with the pathogen, and to assess the immune status of subjects vaccinated with antigen-based vaccines.
PCT/US2021/025352 2020-04-02 2021-04-01 Characterization method WO2021202861A1 (fr)
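
The screening criterion in the abstract (paratopes that are both class-switched and expressed across a genetically diverse infected population) can be illustrated with a short Python sketch. It is a simplified illustration under stated assumptions, not the claimed method: records are assumed to be (subject, isotype, CDR3 amino acid sequence) tuples, IgG, IgA and IgE are treated as the class-switched isotypes, and the function name, the `min_subjects` threshold, and the toy CDR3 strings are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple

# (subject_id, isotype, cdr3_amino_acid_sequence); this layout is an assumption.
Record = Tuple[str, str, str]


def shared_class_switched_paratopes(
    records: Iterable[Record],
    min_subjects: int = 2,
    switched_isotypes: Tuple[str, ...] = ("IgG", "IgA", "IgE"),
) -> List[str]:
    """Return CDR3 sequences seen in a class-switched isotype in at least
    `min_subjects` distinct subjects (illustrative threshold)."""
    subjects_by_cdr3: Dict[str, Set[str]] = defaultdict(set)
    for subject, isotype, cdr3 in records:
        if isotype in switched_isotypes:
            subjects_by_cdr3[cdr3].add(subject)
    return sorted(
        cdr3
        for cdr3, subjects in subjects_by_cdr3.items()
        if len(subjects) >= min_subjects
    )


# Toy records with made-up CDR3 strings (not real data).
toy_records = [
    ("S1", "IgG", "ARDYW"),
    ("S2", "IgG", "ARDYW"),  # class-switched and shared by two subjects
    ("S1", "IgM", "ARGGW"),
    ("S2", "IgM", "ARGGW"),  # shared, but not class-switched
    ("S3", "IgA", "ARTTW"),  # class-switched, but seen in one subject only
]
shared = shared_class_switched_paratopes(toy_records)
```

Exact sequence identity is used here for simplicity; grouping near-identical CDR3s would need an explicit similarity rule, which the source does not specify.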

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
US202063004419P 2020-04-02 2020-04-02
US63/004,419 2020-04-02
US202063005120P 2020-04-03 2020-04-03
US63/005,120 2020-04-03

Publications (1)

Publication Number Publication Date
WO2021202861A1 (fr) 2021-10-07

Family

ID=77930417

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2021/025352 WO2021202861A1 (fr) 2020-04-02 2021-04-01 Characterization method

Country Status (1)

Country Link
WO (1) WO2021202861A1 (fr)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170349954A1 (en) * 2008-11-07 2017-12-07 Adaptive Biotechnologies Corp. Monitoring health and disease status using clonotype profiles
US20170362653A1 (en) * 2014-11-25 2017-12-21 Adaptive Biotechnologies Corporation Characterization of adaptive immune response to vaccination or infection using immunosequencing
US20180363059A1 (en) * 2013-02-04 2018-12-20 The Board Of Trustees Of The Leland Stanford Junior University Measurement and comparison of immune diversity by high-throughput sequencing
WO2020018837A1 (fr) * 2018-07-18 2020-01-23 Life Technologies Corporation Compositions and methods for immune repertoire sequencing

Similar Documents

Publication Publication Date Title
US11061030B2 (en) Affinity-oligonucleotide conjugates and uses thereof
AU2020213348B2 (en) Uniquely tagged rearranged adaptive immune receptor genes in a complex gene set
US11001895B2 (en) Methods of monitoring conditions by sequence analysis
JP7341661B2 (ja) High-throughput process for T cell receptor target identification of natively paired T cell receptor sequences
JP6672310B2 (ja) High-throughput nucleotide library sequencing
AU2018273999A1 (en) High-throughput polynucleotide library sequencing and transcriptome analysis
JP2019537430A (ja) Affinity-oligonucleotide conjugates and uses thereof
WO2021202861A1 (fr) Characterization method
CN115485392A (zh) Methods for identifying ligand-blocking antibodies and for determining antibody potency

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21782404

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21782404

Country of ref document: EP

Kind code of ref document: A1