CN112840033A - Biomarkers for myalgic encephalomyelitis/chronic fatigue syndrome (ME/CFS) - Google Patents

Biomarkers for myalgic encephalomyelitis/chronic fatigue syndrome (ME/CFS)

Info

Publication number
CN112840033A
Authority
CN
China
Prior art keywords
ighv3
cfs
variables
ighv
ighv1
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201980066179.1A
Other languages
Chinese (zh)
Inventor
山村隆
佐藤和贵郎
小野纮彦
松谷隆治
中村征史
北浦一孝
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Genesis Genesis Corp
National Center of Neurology and Psychiatry
Repertoire Genesis Inc
Original Assignee
Genesis Genesis Corp
National Center of Neurology and Psychiatry
Application filed by Genesis Genesis Corp, National Center of Neurology and Psychiatry

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6876 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
    • C12Q1/6883 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6809 Methods for determination or identification of nucleic acids involving differential detection
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N33/00 Investigating or analysing materials by specific methods not covered by groups G01N1/00 - G01N31/00
    • G01N33/48 Biological material, e.g. blood, urine; Haemocytometers
    • G01N33/50 Chemical analysis of biological material, e.g. blood, urine; Testing involving biospecific ligand binding methods; Immunological testing
    • G01N33/5005 Chemical analysis of biological material, e.g. blood, urine; Testing involving biospecific ligand binding methods; Immunological testing involving human or animal cells
    • G01N33/5094 Chemical analysis of biological material, e.g. blood, urine; Testing involving biospecific ligand binding methods; Immunological testing involving human or animal cells for blood cell populations
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N33/00 Investigating or analysing materials by specific methods not covered by groups G01N1/00 - G01N31/00
    • G01N33/48 Biological material, e.g. blood, urine; Haemocytometers
    • G01N33/50 Chemical analysis of biological material, e.g. blood, urine; Testing involving biospecific ligand binding methods; Immunological testing
    • G01N33/68 Chemical analysis of biological material, e.g. blood, urine; Testing involving biospecific ligand binding methods; Immunological testing involving proteins, peptides or amino acids
    • G01N33/6893 Chemical analysis of biological material, e.g. blood, urine; Testing involving biospecific ligand binding methods; Immunological testing involving proteins, peptides or amino acids related to diseases not provided for elsewhere
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q2600/00 Oligonucleotides characterized by their use
    • C12Q2600/158 Expression markers
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N2800/00 Detection or diagnosis of diseases
    • G01N2800/30 Psychoses; Psychiatry
    • G01N2800/306 Chronic fatigue syndrome

Landscapes

  • Life Sciences & Earth Sciences (AREA)
  • Chemical & Material Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Engineering & Computer Science (AREA)
  • Organic Chemistry (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Immunology (AREA)
  • Analytical Chemistry (AREA)
  • Molecular Biology (AREA)
  • Zoology (AREA)
  • Wood Science & Technology (AREA)
  • Physics & Mathematics (AREA)
  • Genetics & Genomics (AREA)
  • Microbiology (AREA)
  • General Health & Medical Sciences (AREA)
  • Biochemistry (AREA)
  • Biotechnology (AREA)
  • Hematology (AREA)
  • Pathology (AREA)
  • Urology & Nephrology (AREA)
  • Biomedical Technology (AREA)
  • Cell Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Biophysics (AREA)
  • General Physics & Mathematics (AREA)
  • Food Science & Technology (AREA)
  • Medicinal Chemistry (AREA)
  • Ecology (AREA)
  • Tropical Medicine & Parasitology (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)
  • Apparatus Associated With Microorganisms And Enzymes (AREA)

Abstract

The present invention provides a method for diagnosing myalgic encephalomyelitis/chronic fatigue syndrome (ME/CFS). The present invention provides a method for using the B cell receptor (BCR) pool as an index for ME/CFS. One or more variables selected from the group consisting of the frequency of use of 1 or more genes in the IgGH chain variable region of the subject's BCR, the subject's BCR diversity index, and the number of 1 or more immune cell subsets of the subject may be used as the index of ME/CFS for the subject.

Description

Biomarkers for myalgic encephalomyelitis/chronic fatigue syndrome (ME/CFS)
Technical Field
The present invention relates to the field of disease diagnosis, in particular to the field of diagnosis of myalgic encephalomyelitis/chronic fatigue syndrome (ME/CFS).
Background
Myalgic encephalomyelitis/chronic fatigue syndrome (ME/CFS) is a disease whose core symptoms are marked fatigue, post-exertional malaise, sleep disorders, cognitive dysfunction, and orthostatic intolerance, accompanied by various other symptoms such as pain, autonomic dysfunction, and hypersensitivity to light, sound, food, and chemicals. In some cases the disease develops after an infection. Because the pathology is unclear and no biomarkers for the disease have been found, multiple diagnostic criteria coexist, which hinders both diagnosis and research. In addition, no effective treatment has yet been established.
Disclosure of Invention
Means for solving the problems
The present invention provides a method of using the B cell receptor (BCR) pool as an indicator of myalgic encephalomyelitis/chronic fatigue syndrome (ME/CFS). The present specification demonstrates that the frequency of use of genes in the IgGH chain variable region of the BCR is altered in ME/CFS patients compared with healthy controls, and ME/CFS can be diagnosed using this information. In addition to the genes in the IgGH chain variable region of the BCR, the number of immune cell subsets (B cells, regulatory T cells, etc.) of the subject may also be used as an index. The frequency of gene use can be determined by methods including large-scale, high-efficiency BCR library analysis.
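As an illustration of what a gene usage frequency might look like in practice, the following is a minimal sketch. It assumes that the frequency of use of a gene is the percentage of sequencing reads assigned to that gene; the function name `usage_frequency` and the input format (one gene call per read) are hypothetical and are not taken from this specification.

```python
from collections import Counter

def usage_frequency(gene_calls):
    """Return the usage frequency (%) of each gene.

    gene_calls: list with one gene name per read (hypothetical input format).
    Assumption: frequency = reads assigned to the gene / all reads * 100.
    """
    counts = Counter(gene_calls)
    total = sum(counts.values())
    if total == 0:
        return {}  # no reads, so no frequencies can be computed
    return {gene: 100.0 * n / total for gene, n in counts.items()}

# Toy example with four reads (illustrative data only).
reads = ["IGHV3-30", "IGHV3-30", "IGHV1-3", "IGHV3-49"]
freq = usage_frequency(reads)
print(freq)
```

The resulting percentages are the kind of per-gene values that could then serve as variables in the items below.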
Examples of embodiments of the present invention are shown in the following items.
(item 1) A method of using a B Cell Receptor (BCR) pool of a subject as an index for myalgic encephalomyelitis/chronic fatigue syndrome (ME/CFS) in the subject.
(item 2) the method according to the above item, wherein the index of myalgic encephalomyelitis/chronic fatigue syndrome (ME/CFS) of the subject is 1 or more variables including the frequency of use of 1 or more genes in the IgGH chain variable region of the BCR of the subject.
(item 2A) the method of any one of the above items, wherein the 1 or more variables are indicative of the subject suffering from ME/CFS but not from other diseases.
(item 3) the method of any one of the above items, wherein the 1 OR more genes are selected from the group consisting of IGHV1-2, IGHV1-3, IGHV1-8, IGHV1-18, IGHV1-24, IGHV1-38-4, IGHV1-45, IGHV1-46, IGHV1-58, IGHV1-69, IGHV1-69-2, IGHV1-69D, IGHV1/OR15-1, IGHV1/OR15-5, IGHV1/OR15-9, IGHV1/OR21-1, IGHV2-5, IGHV2-26, IGHV2-70, IGHV 7-70D, IGHV2/OR16-5, IGHV 16-7, IGHV 16-9, IGHV 16-11, IGHV 16-72, IGHV2 2-72, IGHV 16-72, IGHV 16-16, IGHV 16-72-, IGHV3-30, IGHV3-30-3, IGHV3-30-5, IGHV3-33, IGHV3-35, IGHV3-38, IGHV3-38-3, IGHV3-43, IGHV3-43D, IGHV3-48, IGHV3-49, IGHV3-53, IGHV3-64, IGHV3-64D, IGHV3-66, IGHV3-72, IGHV3-73, IGHV3-74, IGHV3-NL 3, IGHV3/OR 3-7, IGHV3/OR 3-6, IGHV3/OR 3-8, IGHV3/OR 3-9, IGHV3/OR 3-10, IGHV3/OR 3-12, IGHV3/OR 3-13, IGHV 573-72-30-72, IGHV 3-72-3, IGHV 3-72-3, IGHV 3-72-3-72, IGHV4-59, IGHV4-61, IGHV4/OR15-8, IGHV5-10-1, IGHV5-51, IGHV6-1, IGHV7-4-1, IGHV7-81, IGHD1-1, IGHD1-7, IGHD1-14, IGHD1-20, IGHD1-26, IGHD1/OR15-1a/b, IGHD2-2, IGHD2-8, IGHD2-15, IGHD2-21, IGHD2/OR 2-2 a/b, IGHD 2-3, IGHD 2-9, IGHD 2-10, IGHD 2-16, IGHD 2-22, IGHD2/OR 2-3 a/b, IGHD 2-4, IGHD-72, IGHD-2-72, IGHD-2/2-72, IGHD 2/2-72 a/b, IGHD 2/2-2, IGHD 2/72-72, IGHD 2-72, IGHD6-6, IGHD6-13, IGHD6-19, IGHD6-25, IGHD7-27, IGHJ1, IGHJ2, IGHJ3, IGHJ4, IGHJ5, IGHJ6, IGHG1, IGHG2, IGHG3, IGHG4, and IGHGP.
(item 3-1) the method of any one of the above items, wherein the 1 OR more genes are selected from the group consisting of IGHV-73, IGHV-69-2, IGHV-51, IGHV-31, IGHV-23/OR-9, IGHV-39, IGHD-12, IGHV-43-17, IGHV-10-1, IGHD/OR-4 a/b, IGHG, IGHV/OR-5, IGHV/OR-9, IGHD-7, IGHV-21, IGHD-6, IGHV-33, IGHD-23, IGHV-30-5, IGHV-23, IGHD-13, IGHV-64-48, IGHV-64, IGHG, IGHV-49, IGHV-30-3, IGHD-26, IGHJ, IGHV-30, IGHV-1, IGHV-4 a/A/B, IGHV-9, IGHV/OR-9, IGHV-7, IG, At least 1 gene selected from the group consisting of IGHGP, IGHV1-3, and IGHD 3-22.
(item 3-2) the method of any one of the above items, wherein the 1 or more genes comprise at least 1 gene selected from the group consisting of IGHD6-6, IGHV3-33, IGHD4-23, IGHV3-30-5, IGHV3-23, IGHD6-13, IGHV3-64D, IGHV3-48, IGHV3-64, IGHG1, IGHV3-49, IGHV3-30-3, IGHD1-26, IGHJ6, IGHV3-30, IGHGP, IGHV1-3, and IGHD3-22.
(item 3-3) the method of any one of the above items, wherein the 1 or more genes comprise at least 1 gene selected from the group consisting of IGHV3-49, IGHV3-30-3, IGHD1-26, IGHJ6, IGHV3-30, IGHGP, IGHV1-3, and IGHD3-22.
(item 3-4) the method of any one of the above items, wherein the 1 or more genes comprise at least 1 gene selected from the group consisting of IGHV1-3, IGHV3-30, IGHV3-30-3, IGHV3-49, IGHD1-26, and IGHJ6.
(item 4-1) the method of any one of the above items, wherein the 1 or more variables further comprise the number of 1 or more subpopulations of immune cells.
(item 4-2) the method according to any one of the above items, wherein the 1 or more variables show AUC ≧ 0.7 in a regression analysis-based ROC curve for discriminating between a normal control and ME/CFS.
(item 4-3) the method according to any one of the above items, wherein the 1 or more variables show AUC ≧ 0.8 in a regression analysis-based ROC curve for discriminating between a normal control and ME/CFS.
(item 4-4) the method of any one of the above items, wherein the number of the subpopulations of immune cells is selected from the group consisting of the number of B cells, the number of naive B cells, the number of memory B cells, the number of plasmablasts, the number of activated naive B cells, the number of transitional B cells, the number of regulatory T cells, the number of memory T cells, the number of follicular helper T cells, the number of Tfh1 cells, the number of Tfh2 cells, the number of Tfh17 cells, the number of Th1 cells, the number of Th2 cells, and the number of Th17 cells.
(item 5-1) the method of any one of the above items, wherein the 1 or more variables comprise frequency of use of 2 or more genes in an IgGH chain variable region of the BCR of the subject.
(item 5-2) the method according to any one of the above items, wherein the 1 or more variables show AUC ≧ 0.7 in a regression analysis-based ROC curve for discriminating between a normal control and ME/CFS.
(item 5-3) the method according to any one of the above items, wherein the 1 or more variables show AUC ≧ 0.8 in a regression analysis-based ROC curve for discriminating between a normal control and ME/CFS.
(item 5-4) the method according to any one of the above items, wherein the 1 or more variables show AUC ≧ 0.9 in a regression analysis-based ROC curve for discriminating between a normal control and ME/CFS.
(item 6-1) the method according to any one of the above items, wherein the 1 or more variables comprise a combination of 2 or more variables selected from the group consisting of the frequency of use of 1 or more genes in the IgGH chain variable region of the BCR of the subject, the BCR diversity index of the subject, and the number of 1 or more subsets of immune cells of the subject, and the 1 or more variables show AUC ≧ 0.7 in a regression analysis-based ROC curve for discriminating between normal control and ME/CFS.
(item 6-2) the method of any one of the above items, wherein the 1 or more variables include 3 or more variables selected from the group.
(item 6-3) the method of any one of the above items, wherein the 1 or more variables include 4 or more variables selected from the group.
(item 6-4) the method of any one of the above items, wherein the 1 or more variables include 5 or more variables selected from the group.
(item 6-5) the method of any one of the above items, wherein the 1 or more variables include 6 or more variables selected from the group.
(item 6-6) the method according to any one of the above items, wherein the 1 or more variables show AUC ≧ 0.8 in a regression analysis-based ROC curve for discriminating between a normal control and ME/CFS.
(item 6-7) the method according to any one of the above items, wherein the 1 or more variables show AUC ≧ 0.9 in a regression analysis-based ROC curve for discriminating between a normal control and ME/CFS.
(item 7) the method of any one of the above items, wherein the 1 or more genes comprise IGHV1-3, IGHV3-30, IGHV3-30-3, IGHV3-49, IGHD1-26, and IGHJ6.
(item 8) the method of any one of the above items, wherein the 1 or more variables comprise the number of B cells of the subject.
(item 9) the method of any one of the above items, wherein the 1 or more variables comprise the number of regulatory T cells (Tregs) in the subject.
(item 10) the method of any one of the above items, wherein the frequency of use of the 1 or more genes is determined by a method comprising large-scale high-efficiency BCR library analysis.
(item 11) the method of any one of the above items, wherein the 1 or more variables comprise frequency of use of at least 1 gene selected from the group consisting of IGHV3-49, IGHV3-30-3, IGHD1-26, IGHV3-30, IGHJ6, IGHGP, IGHV4-31, IGHV3-64, IGHD3-22, IGHV3-33, IGHV3-73, IGHV5-10-1, and IGHV4-34.
(item 12) the method of any one of the above items, comprising:
(a) using a part of the 1 or more variables as an index of ME/CFS in the subject, and
(b) using a part of the 1 or more variables as an index that the subject has ME/CFS rather than another disease.
(item 13) the method of any one of the above items, wherein (b) is performed a plurality of times for a plurality of other diseases.
(item 14) the method of any one of the above items, wherein the other disease comprises Multiple Sclerosis (MS).
(item 15) A method in which 1 or more variables including the frequency of use of 1 or more genes in the IgGH chain variable region of a BCR of a subject are used as an indicator that the subject suffers from myalgic encephalomyelitis/chronic fatigue syndrome (ME/CFS) rather than Multiple Sclerosis (MS).
(item 15A) the method of the above item, wherein features of 1 or more of the above items are present.
(item 16) the method of any one of the above items, wherein the 1 OR more genes comprise a gene selected from the group consisting of IGHV1-2, IGHV1-3, IGHV1-8, IGHV1-18, IGHV1-24, IGHV1-38-4, IGHV1-45, IGHV1-46, IGHV1-58, IGHV1-69, IGHV1-69-2, IGHV1-69D, IGHV1/OR15-1, IGHV1/OR15-5, IGHV1/OR15-9, IGHV1/OR21-1, IGHV2-5, IGHV2-26, IGHV2-70, IGHV 7-70D, IGHV2/OR16-5, IGHV 16-7, IGHV 16-9, IGHV 16-11, IGHV 16-72, IGHV2 2-72, IGHV 16-72-16, IGHV 16-72-16-72, IGHV 16-72, IGHV, IGHV3-30, IGHV3-30-3, IGHV3-30-5, IGHV3-33, IGHV3-35, IGHV3-38, IGHV3-38-3, IGHV3-43, IGHV3-43D, IGHV3-48, IGHV3-49, IGHV3-53, IGHV3-64, IGHV3-64D, IGHV3-66, IGHV3-72, IGHV3-73, IGHV3-74, IGHV3-NL 3, IGHV3/OR 3-7, IGHV3/OR 3-6, IGHV3/OR 3-8, IGHV3/OR 3-9, IGHV3/OR 3-10, IGHV3/OR 3-12, IGHV3/OR 3-13, IGHV 573-72-30-72, IGHV 3-72-3, IGHV 3-72-3, IGHV 3-72-3-72, IGHV4-59, IGHV4-61, IGHV4/OR15-8, IGHV5-10-1, IGHV5-51, IGHV6-1, IGHV7-4-1, IGHV7-81, IGHD1-1, IGHD1-7, IGHD1-14, IGHD1-20, IGHD1-26, IGHD1/OR15-1a/b, IGHD2-2, IGHD2-8, IGHD2-15, IGHD2-21, IGHD2/OR 2-2 a/b, IGHD 2-3, IGHD 2-9, IGHD 2-10, IGHD 2-16, IGHD 2-22, IGHD2/OR 2-3 a/b, IGHD 2-4, IGHD-72, IGHD-2-72, IGHD-2/2-72, IGHD 2/2-72 a/b, IGHD 2/2-2, IGHD 2/72-72, IGHD 2-72, IGHD6-6, IGHD6-13, IGHD6-19, IGHD6-25, IGHD7-27, IGHJ1, IGHJ2, IGHJ3, IGHJ4, IGHJ5, IGHJ6, IGHG1, IGHG2, IGHG3, IGHG4, and IGHGP.
(item 17) the method of any one of the above items, wherein the 1 or more genes comprise at least 1 gene selected from the group consisting of IGHV1-3, IGHV3-30, IGHV3-30-3, IGHD1-26, IGHV3-49, and IGHJ6.
(item 18) the method of any one of the above items, wherein the 1 or more variables comprise frequency of use of 2 or more genes in the IgGH chain variable region of the subject's BCR.
(item 19) the method according to any one of the above items, wherein the 1 or more variables show AUC ≧ 0.7 in a regression analysis-based ROC curve for discriminating MS and ME/CFS.
(item 20) the method according to any one of the above items, wherein the 1 or more variables show AUC ≧ 0.8 in a regression analysis-based ROC curve for discriminating MS and ME/CFS.
(item 21) the method according to any one of the above items, wherein the 1 or more variables show AUC ≧ 0.9 in a regression analysis-based ROC curve for discriminating MS and ME/CFS.
(item 22) the method of any one of the above items, wherein the 1 or more variables comprise frequency of use of 3 or more genes in the IgGH chain variable region of the subject's BCR.
(item 23) the method of any one of the above items, wherein the 1 or more variables show AUC ≧ 0.7 in a regression analysis-based ROC curve for discriminating MS and ME/CFS.
(item 24) the method according to any one of the above items, wherein the 1 or more variables show AUC ≧ 0.8 in a regression analysis-based ROC curve for discriminating MS and ME/CFS.
(item 25) the method of any one of the above items, wherein the 1 or more variables show AUC ≧ 0.9 in a regression analysis-based ROC curve for discriminating MS and ME/CFS.
(item 26) the method of any one of the above items, wherein (b) is performed by the method of any one of the above items.
(item A1) A method for diagnosing myalgic encephalomyelitis/chronic fatigue syndrome (ME/CFS) in a subject, comprising: determining the subject's B cell receptor (BCR) pool; and diagnosing the subject as suffering from ME/CFS based on the BCR pool.
(item A1-1) the method of the above item, having the features recited in 1 or more of the above items.
(item A2) A method for treating myalgic encephalomyelitis/chronic fatigue syndrome (ME/CFS), comprising: determining a subject's B cell receptor (BCR) pool; diagnosing the subject as suffering from ME/CFS based on the BCR pool; and treating the subject.
(item A2-1) the method of the above item, having the features recited in 1 or more of the above items.
(item A3) the method of any one of the above items, wherein the subject is diagnosed as suffering from myalgic encephalomyelitis/chronic fatigue syndrome (ME/CFS) based on 1 or more variables including the frequency of use of 1 or more genes in the IgGH chain variable region of the BCR of the subject.
(item A4) the method according to any one of the above items, wherein the subject is diagnosed as suffering from myalgic encephalomyelitis/chronic fatigue syndrome (ME/CFS) by calculating, for the subject, the value of an equation containing the variables and comparing the value with a threshold value.
(item A5) the method of any one of the above items, comprising:
(A) providing, for myalgic encephalomyelitis/chronic fatigue syndrome (ME/CFS), a plurality of variables including the frequency of use of 1 or more genes in the IgGH chain variable region of the BCR;
(B) providing a discriminant generated by performing multivariate analysis on the variables;
(C) substituting the values of the subject's variables into the discriminant to calculate a probability of suffering from ME/CFS; and
(D) determining that the subject suffers from ME/CFS when the probability of suffering from ME/CFS is higher than a specified value.
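Steps (C) and (D) of this item can be illustrated with a minimal sketch, assuming the discriminant takes the form of a logistic model. The intercept, the coefficients, the choice of genes, and the 0.5 cutoff below are hypothetical placeholders and are not values from this specification.

```python
import math

# Hypothetical discriminant: a constant plus one coefficient per variable.
# These numbers are illustrative placeholders only.
INTERCEPT = -1.0
COEFFICIENTS = {"IGHV3-30": 0.8, "IGHJ6": -0.2}

def mecfs_probability(usage):
    """Step (C): substitute the subject's variable values into the discriminant."""
    z = INTERCEPT + sum(coef * usage[gene] for gene, coef in COEFFICIENTS.items())
    return 1.0 / (1.0 + math.exp(-z))

def is_mecfs(usage, threshold=0.5):
    """Step (D): compare the probability with a specified value (assumed 0.5 here)."""
    return mecfs_probability(usage) > threshold

# Usage frequencies (%) for one hypothetical subject.
subject = {"IGHV3-30": 5.0, "IGHJ6": 10.0}
print(round(mecfs_probability(subject), 3), is_mecfs(subject))
```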
(item A6) the method of any one of the above items, comprising:
(A) providing, for myalgic encephalomyelitis/chronic fatigue syndrome (ME/CFS), a plurality of variables including the frequency of use of 1 or more genes in the IgGH chain variable region of the BCR;
(B) providing a discriminant generated by performing multivariate analysis on the variables, wherein (B) comprises:
(B-1) subjecting the variables to univariate or multivariate logistic regression with the patient/healthy person classification as the target variable,
(B-2) calculating the constant and coefficients of the discriminant from the constant and partial regression coefficients of the logit model formula generated in the logistic regression, and
(B-3) generating the discriminant based on the constant and coefficients obtained in (B-2);
(C) substituting the values of the subject's variables into the discriminant to calculate a probability of suffering from ME/CFS; and
(D) determining that the subject suffers from ME/CFS when the probability of suffering from ME/CFS is higher than a specified value.
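Steps (B-1) to (B-3) can be sketched as follows. This is only an illustration under stated assumptions: the logistic regression is fitted by plain batch gradient descent (the specification does not say how the fit is performed), the data are synthetic, and the names `fit_logistic` and `make_discriminant` are hypothetical.

```python
import math

def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fit_logistic(X, y, lr=0.1, epochs=5000):
    """(B-1)/(B-2): fit a logit model; return its constant and coefficients."""
    n, d = len(X), len(X[0])
    const, coefs = 0.0, [0.0] * d
    for _ in range(epochs):
        grad_const, grad_coefs = 0.0, [0.0] * d
        for row, label in zip(X, y):
            err = sigmoid(const + sum(c * v for c, v in zip(coefs, row))) - label
            grad_const += err
            for j, v in enumerate(row):
                grad_coefs[j] += err * v
        const -= lr * grad_const / n
        coefs = [c - lr * g / n for c, g in zip(coefs, grad_coefs)]
    return const, coefs

def make_discriminant(const, coefs):
    """(B-3): build the discriminant p(x) = 1 / (1 + exp(-(const + coefs . x)))."""
    def discriminant(x):
        return sigmoid(const + sum(c * v for c, v in zip(coefs, x)))
    return discriminant

# Synthetic data: one variable (a usage frequency in %), label 1 = patient.
X = [[1.0], [2.0], [3.0], [5.0], [4.0], [6.0], [7.0], [8.0]]
y = [0, 0, 0, 0, 1, 1, 1, 1]
const, coefs = fit_logistic(X, y)
discriminant = make_discriminant(const, coefs)
print(discriminant([8.0]) > 0.5, discriminant([1.0]) > 0.5)
```

In practice a statistics package would normally be used for the fit; the hand-written loop is kept here only so that the sketch has no dependencies.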
(item A6-A) the method of any one of the above items, comprising:
(A) determining a combination of variables, wherein (A) comprises:
(A-1) comparing myalgic encephalomyelitis/chronic fatigue syndrome (ME/CFS) patients with healthy subjects and providing a plurality of variables, including the frequency of use of 1 or more genes in the IgGH chain variable region of the BCR, for which a significant difference is detected, and
(A-2) performing univariate logistic analysis using 1 gene in the IgGH chain variable region of the BCR as the independent variable, or multivariate logistic analysis using 2 or more genes in the IgGH chain variable region of the BCR as independent variables, to obtain a logistic regression model equation, performing ROC analysis to measure the goodness of fit of the logistic regression model equation, and selecting genes showing an AUC value equal to or greater than a predetermined value as variables for the discriminant;
(B) providing a discriminant generated by performing multivariate analysis on the variables for the discriminant, wherein (B) comprises:
(B-1) subjecting the variables for the discriminant to univariate or multivariate logistic regression with the patient/healthy person classification as the target variable,
(B-2) calculating the constant and coefficients of the discriminant, and
(B-3) generating the discriminant based on the constant and coefficients obtained in (B-2);
(C) substituting the values of the subject's variables into the discriminant to calculate a probability of suffering from ME/CFS; and
(D) determining that the subject suffers from ME/CFS when the probability of suffering from ME/CFS is higher than a specified value.
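The AUC-based selection in step (A-2) can be sketched for the univariate case. The sketch relies on one assumption worth stating: because a univariate logistic model is a monotonic function of its single variable, the ROC AUC of the fitted model equals the rank-based AUC of the raw variable, taken in whichever direction is favourable. The function names, the gene labels, the data, and the 0.7 cutoff are all hypothetical.

```python
def auc(patient_values, control_values):
    """Rank-based AUC: probability that a randomly chosen patient value
    exceeds a randomly chosen control value (ties count as 0.5)."""
    wins = 0.0
    for p in patient_values:
        for c in control_values:
            if p > c:
                wins += 1.0
            elif p == c:
                wins += 0.5
    return wins / (len(patient_values) * len(control_values))

def select_variables(patients, controls, min_auc=0.7):
    """Keep genes whose univariate AUC reaches the predetermined value."""
    selected = {}
    for gene in patients:
        a = auc(patients[gene], controls[gene])
        a = max(a, 1.0 - a)  # a logistic fit may use either direction
        if a >= min_auc:
            selected[gene] = a
    return selected

# Synthetic usage frequencies (%) for two placeholder genes.
patients = {"gene_A": [5.0, 6.0, 7.0, 8.0], "gene_B": [1.0, 2.0, 3.0, 4.0]}
controls = {"gene_A": [1.0, 2.0, 3.0, 6.5], "gene_B": [1.5, 2.5, 3.5, 0.5]}
selected = select_variables(patients, controls)
print(selected)
```

For the multivariate case the same AUC calculation would be applied to the fitted model's predicted probabilities rather than to a single raw variable.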
(item A7) the method of any one of the above items, comprising:
(A) providing, for myalgic encephalomyelitis/chronic fatigue syndrome (ME/CFS), a plurality of variables including the frequency of use of 1 or more genes in the IgGH chain variable region of the BCR;
(B) providing a discriminant generated by performing multivariate analysis on the variables;
(C) substituting the values of the subject's variables into the discriminant to calculate a probability of suffering from ME/CFS;
(AA) providing, for a disease other than myalgic encephalomyelitis/chronic fatigue syndrome (ME/CFS), a plurality of variables including the frequency of use of 1 or more genes in the IgGH chain variable region of the BCR;
(BB) providing a discriminant generated by performing multivariate analysis on the variables;
(CC) substituting the values of the subject's variables into the discriminant to calculate a probability of suffering from the disease other than ME/CFS; and
(D) determining that the subject suffers from ME/CFS when the probability of suffering from ME/CFS is higher than a predetermined value and the probability of suffering from the disease other than ME/CFS is lower than a predetermined value.
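The combined decision in step (D) of this item can be written as a short sketch. It assumes the probabilities from steps (C) and (CC) have already been computed, and it allows several other diseases to be checked in turn; the function name and both cutoff values are hypothetical.

```python
def is_mecfs_not_other(p_mecfs, p_others, mecfs_cutoff=0.5, other_cutoff=0.5):
    """Return True only if the ME/CFS probability exceeds its cutoff and
    every other-disease probability stays below its cutoff.

    p_others: dict mapping a disease name to its probability (assumed format).
    """
    if p_mecfs <= mecfs_cutoff:
        return False
    return all(p < other_cutoff for p in p_others.values())

print(is_mecfs_not_other(0.9, {"MS": 0.2}))  # high ME/CFS, low MS
print(is_mecfs_not_other(0.9, {"MS": 0.8}))  # MS cannot be excluded
```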
(item A8) A method for in vitro diagnosing myalgic encephalomyelitis/chronic fatigue syndrome (ME/CFS) in a subject, comprising: the subject is diagnosed as suffering from myalgic encephalomyelitis/chronic fatigue syndrome (ME/CFS) based on the subject's B-cell receptor (BCR) pool.
(item A8-1) the method of the above item, wherein there are features recited in 1 or more of the above items.
(item B1) A program comprising instructions that, when executed by 1 or more processors, cause the processors to acquire 1 or more variables including variables associated with the BCR library of a subject, and to determine whether the subject has ME/CFS based on the 1 or more variables.
(item B1-1) the program of the above item, having the features recited in 1 or more of the above items.
(item B2) A storage medium on which is recorded a program comprising instructions that, when executed by 1 or more processors, cause the processors to acquire 1 or more variables including variables associated with the BCR library of a subject, and to determine whether the subject has ME/CFS based on the 1 or more variables.
(item B2-1) the storage medium of the above item, wherein the features recited in 1 or more of the above items are present.
(item B2-2) the storage medium of any one of the above items, which is a non-transitory storage medium.
(item C1) A system, comprising:
a recording unit configured to record information on 1 or more variables including a variable associated with the BCR library of a subject; and
a determination unit configured to obtain the information and determine whether the subject has ME/CFS.
(item C1-1) the system of the above item, wherein the system has the features of 1 or more of the above items.
Effects of the invention
According to the present invention, myalgic encephalomyelitis/chronic fatigue syndrome (ME/CFS) can be diagnosed with certainty. The present invention is also useful for prediction and prognosis of ME/CFS onset, and as a marker for developing a therapeutic agent for ME/CFS.
Drawings
FIG. 1 is a graph showing the results of comparison of the frequency of use of each IGHV gene (IGHV family) of BCR obtained from samples of ME/CFS disease patients and healthy controls. The vertical axis represents the frequency (%) of use of each gene, and bars represent standard error. HC: healthy controls, ME/CFS: ME/CFS disease patients.
FIG. 2 is a graph showing the results of comparison of the frequency of use of each IGHD and IGHJ family of BCR obtained from samples of ME/CFS disease patients and healthy controls. The vertical axis represents the frequency (%) of use of each gene, and bars represent standard error. HC: healthy controls, ME/CFS: ME/CFS disease patients.
FIG. 3A is a dot plot (dot plot) of the frequency of use of the IGHV genes described in samples from ME/CFS disease patients and healthy controls. HC: healthy controls, ME/CFS: ME/CFS disease patients.
FIG. 3B is a dot plot showing the frequency of use of the IGHV genes described in samples from ME/CFS disease patients and healthy controls. HC: healthy controls, ME/CFS: ME/CFS disease patients.
FIG. 4 is a dot plot of the numbers of B cell populations documented for samples from ME/CFS disease patients and healthy controls. HC: healthy controls, ME/CFS: ME/CFS disease patients.
FIG. 5 is a dot plot of the number of regulatory T cell populations from samples of ME/CFS disease patients and healthy controls. HC: healthy controls, ME/CFS: ME/CFS disease patients.
FIG. 6 is a dot plot of the numbers of T cell populations documented for samples from ME/CFS disease patients and healthy controls. HC: healthy controls, ME/CFS: ME/CFS disease patients.
FIG. 7 is a graph showing ROC curves based on regression analysis using the use frequencies and B cell amounts (%) of IGHV1-3, IGHV3-30, IGHV3-30-3, IGHV3-49, IGHD1-26, and IGHJ6 as variables.
FIG. 8 is a graph showing ROC curves based on regression analysis using the use frequencies and Treg amounts (%) of IGHV1-3, IGHV3-30, IGHV3-30-3, IGHV3-49, IGHD1-26, and IGHJ6 as variables.
FIG. 9 is a graph showing the difference in the frequency of use of the IGH gene between ME/CFS patients (n = 37) and MS patients (n = 10). ME/CFS: ME/CFS disease patients, MS: MS disease patients.
FIG. 10 is a graph showing the difference in diversity index between ME/CFS patients (n = 37) and MS patients (n = 10). ME/CFS: ME/CFS disease patients, MS: MS disease patients. No significant difference was detected between ME/CFS patients (n = 37) and MS patients (n = 10) for any diversity index.
FIG. 11 is a graph showing the difference in the frequency of use of the IGH gene between ME/CFS patients (n = 37) and non-ME/CFS patients (n = 33). ME/CFS: ME/CFS disease patients, non-ME/CFS: healthy controls + MS disease patients.
Detailed Description
The present invention is described below with reference to preferred embodiments. Throughout this specification, expressions in the singular should be understood to include the plural unless otherwise specified. Thus, singular articles (e.g., "a," "an," and "the" in English) should be understood to include the plural unless specifically stated otherwise. In addition, unless otherwise specified, terms used in the present specification should be understood to have the meanings commonly used in the field. Accordingly, unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. In case of conflict, the present specification, including definitions, will control.
The definitions of the terms used in the present specification and/or the basic technical contents will be described below as appropriate.
(ME/CFS)
Myalgic encephalomyelitis/chronic fatigue syndrome (ME/CFS) is a serious chronic disease whose core symptoms are severe, unexplained fatigue lasting 6 months or longer, extreme exhaustion after exertion, sleep disorders, and cognitive dysfunction, and which is often accompanied by orthostatic intolerance, pain, digestive symptoms, and hypersensitivity to light, sound, odor, and chemical substances. The number of patients in Japan is estimated to be 100,000 or more. However, the pathology remains unclear, and there is no disease-specific test or effective treatment, so appropriate medical care is not currently being provided.
In recent years, a group in Norway reported that B cell depletion therapy with rituximab was effective, and the immunopathology of ME/CFS has attracted worldwide attention. B cells are responsible for antibody production in the adaptive immune system, and their B cell receptors, generated by gene rearrangement, are extremely diverse so as to cope with diverse antigens. In autoimmune diseases such as systemic lupus erythematosus and in hematological tumors of the B cell lineage, specific B cell receptors (clones) increase and can be detected as a skew in the population (repertoire) of an individual's diverse B cell receptors. Repertoire analysis by a bias-eliminating method using a next-generation sequencer has recently become available and may be useful for detecting abnormalities of the B cell lineage in ME/CFS.
Diagnosis of ME/CFS is based on symptoms, interviews, and the like, and is made according to the Fukuda criteria, the Canadian criteria, or the International Consensus Criteria. No biomarker representing the disease has yet been developed for ME/CFS, and diagnosis relies on combinations of symptoms, for which several sets of diagnostic criteria exist. The Fukuda criteria were created first (1994), followed by the Canadian criteria (2003) and the international criteria (2011); over this period experts reached agreement on the core symptoms of the disease, and work on new diagnostic criteria is ongoing. Diagnostic criteria, including older ones, coexist because two kinds of criteria are needed: 1) detailed, rigorous criteria for advancing research, understanding of the pathology, and therapeutic development (research criteria); and 2) simple criteria that enable general physicians to make a diagnosis so that patient-oriented medical and social interventions can proceed (clinical diagnostic criteria). Specifically, the Fukuda criteria are still considered necessary as research criteria, whereas the newer criteria focus on diagnosis and treatment. Such dual criteria are generally not accepted for other diseases but are exceptionally accepted by experts for ME/CFS. The diagnostic criteria created by the Ministry of Health, Labour and Welfare of Japan are based on these overseas criteria and adapted to the actual situation in Japan (setting of PS, etc.). In the present invention, ME/CFS may refer to the disease defined by the Fukuda criteria as research criteria, but may, as necessary, refer to the disease defined by other criteria for clinical use. In the present invention, the diagnosis of ME/CFS can be confirmed by any diagnostic criteria accepted in the art.
In the present invention, subjects at high risk of developing ME/CFS, or subjects who have developed ME/CFS with a poor prognosis, can be treated appropriately. For the treatment of ME/CFS, immunomodulators (rituximab and the like), nonsteroidal anti-inflammatory drugs, antidepressants, anxiolytics, and the like can be used.
(BCR repertoire)
In one embodiment, the present invention provides a method for determining an index of myalgic encephalomyelitis/chronic fatigue syndrome (ME/CFS) in a subject using the B cell receptor (BCR) repertoire. In the present specification, a "B cell receptor (BCR)", also referred to as a B cell antigen receptor, is a complex composed of an Igα/Igβ (CD79a/CD79b) heterodimer (α/β) associated with a membrane-bound immunoglobulin (mIg) molecule. The mIg subunit binds antigen and causes the receptors to aggregate, while the α/β subunit transmits the signal into the cell. When BCRs aggregate, the Src family kinases Lyn, Blk, and Fyn are rapidly activated, as are the tyrosine kinases Syk and Btk. The complexity of BCR signaling gives rise to many different outcomes, including survival, tolerance (anergy; lack of a hypersensitivity response to antigen) or apoptosis, cell division, and differentiation into antibody-producing cells or memory B cells. Hundreds of millions of T cells with different TCR variable region sequences are produced, and likewise hundreds of millions of B cells with different BCR (or antibody) variable region sequences are produced. Because the individual sequences of TCRs and BCRs differ as a result of rearrangement of the genomic sequence or the introduction of mutations, clues to the antigen specificity of T cells and B cells can be obtained by determining the genomic sequence of the TCR or BCR or the sequence of its mRNA (cDNA).
In the present specification, the "V region" refers to the variable (V) region within the variable region of a TCR chain or a BCR chain.
In the present specification, the "D region" refers to the D region within the variable region of a TCR chain or a BCR chain.
In the present specification, the "J region" refers to the J region within the variable region of a TCR chain or a BCR chain.
In the present specification, the "C region" refers to the constant (C) region of a TCR chain or a BCR chain.
In the present specification, the "repertoire of variable regions" refers to the collection of V(D)J regions arbitrarily generated by gene rearrangement in TCRs or BCRs. The terms TCR repertoire, BCR repertoire, and the like are used (rendered in places in this specification as "library" or "pool"), and they are sometimes also called, for example, T cell repertoire or B cell repertoire. A repertoire can be said to carry both information on how diverse it is as a whole and information on how frequently each gene is used. In one embodiment, the following method is provided: 1 or more variables including the frequency of use of 1 or more genes in the IgGH chain variable region of the BCR of a subject are used as an index of myalgic encephalomyelitis/chronic fatigue syndrome (ME/CFS) in the subject. The frequency of use of 1 or more genes in the IgGH chain variable region of the BCR can be derived as follows.
One method of determining the BCR repertoire examines how each V chain is used by the B cells in a sample; in the analogous TCR analysis, the proportion of T cells expressing each Vβ chain is measured by flow cytometry (FACS analysis) using antibodies specific to individual Vβ chains. In addition, TCR repertoire analysis based on molecular biology approaches can be designed from information on TCR genes obtained from the human genome sequence. One such method comprises extracting RNA from a cell sample, synthesizing complementary DNA, and then amplifying the TCR genes by PCR for quantification.
Nucleic acid can be extracted from a cell sample using a tool known in the art, such as the RNeasy Plus Universal Mini Kit (QIAGEN). Total RNA can be extracted and purified from cells dissolved in TRIzol LS reagent using a total RNA miniprep kit (QIAGEN). Complementary DNA can be synthesized from the extracted RNA using any reverse transcriptase known in the art, such as Superscript III™ (Invitrogen).
PCR amplification of the BCR genes can be suitably performed by those skilled in the art using any polymerase known in the art. However, for highly variable genes such as the BCR genes, amplification "without bias" is advantageous for accurate measurement.
For PCR amplification, known approaches are to design a plurality of primers specific to each BCR V chain and quantify each separately by real-time PCR or the like, or to amplify with these V chain-specific primers simultaneously (multiplex PCR). However, even when an endogenous control is used to quantify each V chain, accurate analysis is not possible if a large number of primers is used. Multiplex PCR has the further disadvantage that differences in amplification efficiency between primers introduce bias into the PCR amplification.
In a preferred embodiment of the present invention, BCR genes of all isotypes and subtypes are amplified without changing their frequency of occurrence, using a single primer set consisting of 1 forward primer and 1 reverse primer, as described in WO2015/075939 (Repertoire Genesis Inc.). The primer design described below is advantageous for unbiased amplification.
In view of the structure of the BCR genes, no primer is set in the highly diverse V region; instead, an adaptor sequence is added to the 5' end, and the gene including the entire V region is amplified. The adaptor may be of any length and base sequence; about 20 base pairs is most preferred, but sequences of 10 to 100 bases may also be used. The adaptor attached to the 3' end is removed with a restriction enzyme, and all BCR genes are amplified using an adaptor primer having the same sequence as the 20-base-pair adaptor and a reverse primer specific to the C region, which is a common sequence.
Complementary strand DNA is synthesized from BCR gene messenger RNA by reverse transcriptase, and double-stranded complementary DNA is then synthesized. Double-stranded complementary DNA containing V regions of different lengths is obtained through the reverse transcription reaction and the double-strand synthesis reaction, and an adaptor comprising 20 base pairs and 10 base pairs is attached to the 5' end of these genes by a DNA ligase reaction.
For BCR, these genes can be amplified by setting reverse primers in the C regions of the μ, α, δ, γ, and ε heavy chains and the κ and λ light chains. The reverse primer set in the C region matches the sequence of each of Cμ, Cα, Cδ, Cγ, Cε, Cκ, and Cλ of the BCR and has enough mismatches that it does not prime on other C region sequences. The reverse primer for the C region is optimized with respect to base sequence, base composition, DNA melting temperature (Tm), and the presence or absence of self-complementary sequences so that it can be used for amplification together with the adaptor primer. By setting the primer in a region of the C region sequence other than the bases that differ between allele sequences, all alleles can be amplified uniformly. To increase the specificity of the amplification reaction, multi-step nested PCR is performed.
For a sequence in which no primer contains a sequence that differs between allele sequences, the length (number of bases) of the primer candidate sequence is not particularly limited, but is 10 to 100 bases, preferably 15 to 50 bases, and more preferably 20 to 30 bases. Such unbiased amplification is preferable for identifying low-frequency genes (1/10000 to 1/100000 or less).
The BCR library can be determined from the read data obtained by sequencing the BCR gene amplified as described above.
The sequencing method is not particularly limited as long as the sequence of the nucleic acid sample can be determined, and any method known in the art can be used, but Next Generation Sequencing (NGS) is preferably used. Examples of the next-generation Sequencing include, but are not limited to, pyrosequencing, Sequencing by synthesis (Sequencing by synthesis), Sequencing by ligation, and ion semiconductor Sequencing.
The resulting reads are aligned to reference sequences comprising the V, D, and J genes, and the number of unique reads is derived to determine the BCR repertoire.
In one embodiment, the reference database used is prepared separately for the V, D, J, C gene regions. Typically, a nucleic acid sequence data set for each region and each allele disclosed by IMGT is used, but the present invention is not limited thereto, and any data set may be used as long as each sequence is assigned a unique ID.
The obtained read data (including data that has been appropriately processed, e.g., by trimming, as necessary) is used as the input sequence set; a homology search is performed against the reference database of each gene region, and the closest reference allele and its sequence alignment are recorded. Here, an algorithm with high mismatch tolerance is used for the homology search for regions other than C. For example, when general BLAST is used as the homology search program, settings such as a shorter window size, a reduced mismatch penalty, and a reduced gap penalty are made for each region. The closest reference allele is selected using the homology score, the alignment length, the kernel length (the length of a run of consecutive matching bases), and the number of matching bases as indices, applied in a predetermined order of priority. For input sequences for which the V and J used have been determined, the CDR3 sequence is extracted using the start of CDR3 on the reference V and the end of CDR3 on the reference J as markers. This sequence is translated into an amino acid sequence and used for classification of the D region. Where a reference database of D regions can be prepared, the combination of the homology search result and the amino acid translation result is used as the classification result.
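The selection of the closest reference allele described above can be sketched as follows. This is a minimal illustration only: the specification states that the indices are applied in a predetermined order of priority but does not give that order, so the order used here (homology score, then alignment length, then kernel length, then number of matching bases) is an assumption, and the `Hit` record and the example values are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Hit:
    """One homology-search hit of a read against a reference allele (hypothetical record)."""
    allele: str       # reference allele ID, e.g. "IGHV3-30*01"
    score: float      # homology score reported by the search program
    align_len: int    # alignment length
    kernel_len: int   # longest run of consecutive matching bases
    matches: int      # number of matching bases

def closest_allele(hits):
    """Return the best hit under the ASSUMED priority order, or None if there are no hits."""
    if not hits:
        return None
    # Tuples compare element by element, so earlier indices take priority.
    return max(hits, key=lambda h: (h.score, h.align_len, h.kernel_len, h.matches))

# Made-up hits for a single read; the first two tie on score and alignment length.
hits = [
    Hit("IGHV3-30*01", 250.0, 290, 120, 288),
    Hit("IGHV3-30-3*01", 250.0, 290, 95, 288),
    Hit("IGHV3-33*01", 241.0, 292, 80, 285),
]
print(closest_allele(hits).allele)  # IGHV3-30*01 (tie broken by kernel length)
```

Using a tuple as the sort key keeps the priority order in one place, so a different predetermined order only requires reordering the tuple elements.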
Based on the above, V, D, J, and C alleles are assigned to each sequence in the input set. The BCR repertoire is then derived by calculating, over the entire input set, the frequency of occurrence of each of V, D, J, and C or of their combinations. Depending on the precision required for classification, the frequency of occurrence is calculated in units of alleles or in units of gene names. The latter is achieved by converting each allele to its gene name.
After the V, D, J, and C regions have been assigned to the read data, matching reads are tallied, and for each unique read (a read whose sequence is not identical to any other) the number of reads detected in the sample and its ratio (frequency) to the total number of reads can be calculated.
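The tallying described above can be sketched as follows, assuming that each read has already been assigned an allele and that allele names follow the "gene*allele" convention (e.g., "IGHV3-30*01"). The function names and the example calls are hypothetical and are not part of the analysis software referred to in this specification.

```python
from collections import Counter

def gene_of(allele):
    """Convert an allele name such as 'IGHV3-30*01' to its gene name 'IGHV3-30'."""
    return allele.split("*", 1)[0]

def unique_read_table(reads):
    """Map each unique read sequence to (number of reads, fraction of all reads)."""
    total = len(reads)
    return {seq: (n, n / total) for seq, n in Counter(reads).items()}

def gene_usage(alleles, by_gene=True):
    """Frequency of use (% of reads) per gene name, or per allele if by_gene is False."""
    names = [gene_of(a) if by_gene else a for a in alleles]
    total = len(names)
    return {name: 100.0 * n / total for name, n in Counter(names).items()}

# Made-up V assignments for four reads.
v_calls = ["IGHV3-30*01", "IGHV3-30*02", "IGHV1-3*01", "IGHV3-49*01"]
print(gene_usage(v_calls))  # {'IGHV3-30': 50.0, 'IGHV1-3': 25.0, 'IGHV3-49': 25.0}
```

Counting in units of gene names merges alleles of the same gene, which corresponds to the coarser of the two levels of classification precision.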
The diversity index or similarity index can be calculated from data such as the number of samples, the types of reads, and the number of reads, using statistical analysis software such as EstimateS or R (vegan). In a preferred embodiment, TCR repertoire analysis software (Repertoire Genesis Inc.) is used.
In the present specification, "BCR diversity" refers to the diversity of the B cell receptor repertoire of a subject, and can be measured by a person skilled in the art using various methods known in the art. An index indicating BCR diversity is referred to as a "BCR diversity index". The BCR diversity index may be any index known in the art, and examples include indices obtained by applying to BCR diversity indices such as the Shannon-Weaver index, Simpson index, inverse Simpson index, Pielou's evenness index, normalized Shannon-Weaver index, DE indices (e.g., DE50 index, DE30 index, DE80 index), and Unique indices (e.g., Unique 50 index, Unique 30 index, Unique 80 index).
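Several of the indices listed above can be computed from the read counts of the unique reads, as in the following minimal sketch. The Shannon-Weaver, Simpson (in the 1 − Σp² form used by R vegan), inverse Simpson, and Pielou formulas are the standard ones. The DE50 definition used here (the number of top-ranked unique reads accounting for 50% of all reads, divided by the total number of unique reads) is an assumption, since this section does not define it.

```python
import math

def diversity_indices(counts):
    """counts: read count of each unique read (clone); every count must be > 0."""
    total = sum(counts)
    p = [c / total for c in counts]
    sum_sq = sum(x * x for x in p)
    shannon = -sum(x * math.log(x) for x in p)
    richness = len(counts)

    # ASSUMED DE50 definition: smallest number of top-ranked unique reads whose
    # cumulative read count reaches 50% of all reads, over the number of unique reads.
    running, top = 0, 0
    for c in sorted(counts, reverse=True):
        running += c
        top += 1
        if running >= 0.5 * total:
            break

    return {
        "shannon": shannon,
        "simpson": 1.0 - sum_sq,
        "inverse_simpson": 1.0 / sum_sq,
        "pielou": shannon / math.log(richness) if richness > 1 else 0.0,
        "de50": top / richness,
    }

# Made-up counts: a perfectly even repertoire versus one dominated by a single clone.
print(diversity_indices([1, 1, 1, 1]))
print(diversity_indices([97, 1, 1, 1]))
```

Pielou's evenness divides the Shannon-Weaver index by its maximum possible value, ln(number of unique reads), so it equals 1 for a perfectly even repertoire.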
(Large-scale high-efficiency BCR repertoire analysis)
In a preferred embodiment of the invention, the BCR repertoire is determined by large-scale high-efficiency BCR repertoire analysis. In the present specification, "large-scale high-efficiency repertoire analysis" is the analysis described in WO2015/075939 (the entire disclosure of which is incorporated herein by reference as needed), and when applied to BCR it is referred to as "large-scale high-efficiency BCR repertoire analysis". It is a method for quantitatively analyzing, using a database, the repertoire of a subject (variable regions of the T cell receptor (TCR) or B cell receptor (BCR)), the method comprising:
(1) a step of providing a nucleic acid sample containing TCR or BCR nucleic acid sequences amplified without bias from the subject;
(2) a step of determining the nucleic acid sequences contained in the nucleic acid sample; and
(3) a step of calculating the frequency of occurrence of each gene or combination thereof from the determined nucleic acid sequences and deriving the TCR or BCR repertoire of the subject,
wherein the nucleic acid sample contains a plurality of TCR or BCR nucleic acid sequences, and step (2) can be achieved by determining the nucleic acid sequences by a single sequencing using a universal adaptor primer.
Preferably, in the method comprising steps (1) to (3) above, step (1) comprises the following steps:
(1-1) a step of synthesizing complementary DNA using an RNA sample derived from target cells as a template;
(1-2) a step of synthesizing double-stranded complementary DNA using the complementary DNA as a template;
(1-3) a step of synthesizing adaptor-added double-stranded complementary DNA by adding a universal adaptor primer sequence to the double-stranded complementary DNA;
(1-4) a step of performing a first PCR amplification reaction using the adaptor-added double-stranded complementary DNA, a universal adaptor primer comprising the universal adaptor primer sequence, and a first TCR or BCR C region-specific primer, wherein the first TCR or BCR C region-specific primer is designed to have a sequence that is sufficiently specific for the target C region of the TCR or BCR and has no homology to other gene sequences, and to contain, downstream, bases that are mismatched between subtypes when amplified;
(1-5) a step of performing a second PCR amplification reaction using the PCR amplification product of (1-4), the universal adaptor primer, and a second TCR or BCR C region-specific primer, wherein the second TCR or BCR C region-specific primer is designed to have a sequence that perfectly matches the C region of the TCR or BCR downstream of the sequence of the first TCR or BCR C region-specific primer but has no homology to other gene sequences, and to contain, downstream, bases that are mismatched between subtypes when amplified; and
(1-6) a step of performing a third PCR amplification reaction using the PCR amplification product of (1-5), an additional universal adaptor primer in which a first additional adaptor nucleic acid sequence is included in the nucleic acid sequence of the universal adaptor primer, and an adaptor-attached third TCR or BCR C region-specific primer in which a second additional adaptor nucleic acid sequence and a molecular identification (MID Tag) sequence are appended to a third TCR or BCR C region-specific sequence, wherein the third TCR or BCR C region-specific primer is designed to have a sequence that perfectly matches the C region of the TCR or BCR downstream of the sequence of the second TCR or BCR C region-specific primer but has no homology to other gene sequences, and to contain, downstream, bases that are mismatched between subtypes when amplified; the first additional adaptor nucleic acid sequence is a sequence suitable for binding to DNA capture beads and for the emPCR reaction; the second additional adaptor nucleic acid sequence is a sequence suitable for the emPCR reaction; and the molecular identification (MID Tag) sequence is a sequence that confers uniqueness so that the amplification product can be identified.
Specific details of this method are described in WO2015/075939, and those skilled in the art can carry out the analysis by referring as appropriate to that document and to the examples of the present specification.
(diagnosis)
In one embodiment of the present invention, a method is provided for using the B cell receptor (BCR) repertoire of a subject as an index of myalgic encephalomyelitis/chronic fatigue syndrome (ME/CFS) in the subject. In addition to diagnosing the onset of ME/CFS in the subject, the method may include providing an index for predicting or diagnosing the onset of ME/CFS or for developing a therapeutic agent for ME/CFS. The method may be carried out in vitro or in silico.
The method may include using 1 or more variables including a variable related to the BCR repertoire. When 1 or more variables are used as an index of ME/CFS, for example, the value of an appropriate formula containing the 1 or more variables may be calculated for a given subject and compared with an appropriate reference. The formula can be obtained by logistic regression or the like. Examples of variables that can be used include, but are not limited to, 1 or more variables selected from the frequency of use of 1 or more genes in the IgGH chain variable region of the subject's BCR, the BCR diversity index of the subject, and the number of 1 or more immune cell subsets of the subject, or a combination of 2 or more of these variables.
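As a minimal sketch of how the value of such a formula might be calculated and compared with a reference, the following assumes a logistic regression formula that has already been fitted. The intercept, coefficients, variable names, and threshold are hypothetical placeholders chosen for illustration and are not values from this specification.

```python
import math

# HYPOTHETICAL intercept and coefficients, for illustration only.
INTERCEPT = -1.0
COEFFICIENTS = {"IGHV3-30": 0.40, "IGHV1-3": 0.25, "B_cells_percent": 0.05}

def logistic_value(variables, coefficients=COEFFICIENTS, intercept=INTERCEPT):
    """Value of the logistic formula 1 / (1 + exp(-(b0 + sum of b_i * x_i)))."""
    z = intercept + sum(b * variables[name] for name, b in coefficients.items())
    return 1.0 / (1.0 + math.exp(-z))

def exceeds_reference(variables, threshold=0.5):
    """Compare the formula value with a reference value (threshold is an assumption)."""
    return logistic_value(variables) >= threshold

# Made-up subject: frequencies of use (%) and B cell amount (%).
subject = {"IGHV3-30": 5.0, "IGHV1-3": 2.0, "B_cells_percent": 10.0}
print(round(logistic_value(subject), 4))  # 0.8808
print(exceeds_reference(subject))         # True
```

In practice the coefficients would be estimated from reference data (e.g., ME/CFS patients and healthy controls), and the reference value would be chosen from the ROC curve.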
In an embodiment of the present invention, it is preferable that 1 or more variables including the frequency of use of 1 or more IGH genes of the subject's BCR be used as the index of ME/CFS. This makes it possible to use the 1 or more variables as an index that the subject suffers from ME/CFS rather than from another disease (i.e., differential diagnosis). While not wishing to be bound by theory, the number of immune cell subsets and the BCR repertoire diversity index, for example, may also change in other diseases that affect overall immune status; they may therefore suggest ME/CFS in a subject but cannot exclude other diseases. In contrast, the frequency of use of IGH genes is thought to change according to the mechanism of a particular disease, and is therefore considered to serve as an index that the subject has ME/CFS rather than another disease, which is very useful in actual clinical diagnosis.
"Other diseases" relative to ME/CFS include all diseases that show symptoms similar to those of ME/CFS (severe, unexplained fatigue lasting 6 months or longer, extreme exhaustion after exertion, sleep disorders, and cognitive dysfunction as core symptoms, accompanied by orthostatic intolerance, pain, digestive symptoms, hypersensitivity to light, sound, odor, and chemical substances, and the like), as well as any disease that may affect immune status. Examples include mental diseases (e.g., depression, adjustment disorder, somatoform disorder), primary sleep disorders (e.g., sleep apnea, narcolepsy), endocrine diseases (e.g., hypopituitarism, thyroid disease), infectious diseases (e.g., chronic infections such as AIDS, hepatitis B, and hepatitis C), autoimmune diseases (e.g., rheumatoid arthritis, systemic lupus erythematosus, Sjögren's syndrome), inflammatory diseases (e.g., chronic inflammatory diseases such as inflammatory bowel disease and chronic pancreatitis), and neurological diseases (e.g., multiple sclerosis (MS), autoimmune encephalitis). Embodiments of the present invention can provide methods of distinguishing ME/CFS from other diseases in a subject. One example of such another disease is multiple sclerosis (MS). In carrying out the method for distinguishing ME/CFS from other diseases described in the present specification, the matters described for the method of using the subject's B cell receptor (BCR) repertoire as an index of myalgic encephalomyelitis/chronic fatigue syndrome (ME/CFS) can be applied as necessary.
Diagnosis of ME/CFS is made on the basis of a "combination of clinical symptoms", and in doing so the exclusion of other diseases and other causes is a very important step. The "criteria for the combination of clinical symptoms" may be satisfied as a result of lifestyle habits (irregularity, overwork, etc.), and may also be satisfied as a result of other causes such as malignant tumors, metabolic diseases, and cranial nerve diseases. Therefore, as shown in the examples of the present specification, the B cell repertoire, or the frequency of use of genes in it, which reflects the cause of the disease, is considered very useful in clinical diagnosis. For example, it is considered that a more accurate differential diagnosis can be made by supplementing the clinical diagnostic criteria with the discrimination method using the BCR repertoire (frequency of use of IGH genes) described in the present specification. Alternatively, the method of the present invention may be used in combination with clinical information (clinical findings, MRI findings, cerebrospinal fluid findings, and the like) as needed. In one embodiment, for a patient showing systemic symptoms including those of ME/CFS, a single disease can finally be identified by determining the likelihood of each disease using a discriminant model or discriminant formula based on the frequency of use of IGH genes.
In the present specification, "sensitivity" refers to the probability that a case that should be determined as positive is correctly determined as positive; the higher the sensitivity, the fewer the false negatives. High sensitivity is useful for exclusion diagnosis (rule-out).
In the present specification, "specificity" refers to the probability that a case that should be determined as negative is correctly determined as negative; the higher the specificity, the fewer the false positives. High specificity is useful for confirming a diagnosis.
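The two definitions above correspond to the following calculation from true and predicted labels. This is a minimal sketch; the function name and the example labels are made up for illustration.

```python
def sensitivity_specificity(actual, predicted):
    """actual, predicted: equal-length lists of booleans (True = positive)."""
    pairs = list(zip(actual, predicted))
    tp = sum(a and p for a, p in pairs)                # correctly determined positive
    fn = sum(a and not p for a, p in pairs)            # false negatives
    tn = sum((not a) and (not p) for a, p in pairs)    # correctly determined negative
    fp = sum((not a) and p for a, p in pairs)          # false positives
    return tp / (tp + fn), tn / (tn + fp)

# Made-up example: 3 positive cases and 2 negative cases.
actual = [True, True, True, False, False]
predicted = [True, True, False, False, True]
sens, spec = sensitivity_specificity(actual, predicted)
print(sens, spec)  # 2 of 3 positives detected, 1 of 2 negatives detected
```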
In the present specification, the "ROC curve" refers to a curve obtained by plotting sensitivity against (1 - specificity) as the cut-off value, obtained from a marker-based regression equation, is varied as a parameter. The AUC (area under the curve) of the ROC curve can be appropriately determined by those skilled in the art. The AUC of the ROC curve is considered to indicate the performance of the prediction model. In the examples of the present specification, a number of variables or combinations of variables showing high predictive performance (AUC ≥ 0.7, AUC ≥ 0.8, or AUC ≥ 0.9 in the ROC curve based on regression analysis) have been verified. An AUC of about 0.7 is considered to indicate that the model may be usable as a diagnostic model.
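The construction of the ROC curve and its AUC can be sketched as follows, sweeping the cut-off over the observed scores and integrating by the trapezoidal rule. The scores stand for the values of a regression equation; the example labels and scores are made up, and both classes are assumed to be present.

```python
def roc_points(labels, scores):
    """Return (1 - specificity, sensitivity) points as the cut-off is lowered."""
    pos = sum(1 for y in labels if y)
    neg = len(labels) - pos
    points = [(0.0, 0.0)]
    for cut in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(labels, scores) if y and s >= cut)
        fp = sum(1 for y, s in zip(labels, scores) if not y and s >= cut)
        points.append((fp / neg, tp / pos))
    return points

def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(points, points[1:]))

# Made-up example: 1 = ME/CFS, 0 = healthy control.
labels = [1, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.1]
print(auc(roc_points(labels, scores)))  # 0.75
```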
In the present invention, by using as an index of ME/CFS in a subject 1 or more variables (which may be a combination of variables as necessary) showing AUC ≥ 0.7 in the ROC curve based on regression analysis, ME/CFS, which has so far been difficult to diagnose, can be diagnosed.
The method may comprise a step of obtaining, for the subject, 1 or more variables for use as an index of ME/CFS. In one embodiment, the step of obtaining a variable may comprise a step of analyzing a sample of the subject, for example, a step of measuring the BCR repertoire of the subject and/or a step of measuring the number of cell subsets of the subject. The step of measuring the BCR repertoire may include a step of determining the BCR diversity of the subject and/or a step of determining the frequency of use of 1 or more genes in the IgGH chain variable region of the subject's BCR. The number of cell subsets of a subject can be determined by any method known to those skilled in the art, including flow cytometry. Alternatively, data on values previously determined for the subject may be obtained as the variables.
The method may include a step of determining whether or not the subject has ME/CFS based on the 1 or more variables described in the present specification. The determination step may be performed by comparing the value of the dependent variable of a function containing the 1 or more variables with an appropriate threshold.
In one embodiment, the methods described herein can be carried out in silico. Such a method comprises a step of obtaining 1 or more variables including a variable related to the BCR repertoire of a subject, and a step of determining whether the subject has ME/CFS based on the 1 or more variables. A program, a storage medium recording the program, or a system for implementing any of the methods described in the present specification also falls within the scope of the present invention.
In one embodiment, a program or a storage medium storing the program is provided, the program including instructions that, when executed by 1 or more processors, cause the processors to obtain 1 or more variables including a variable related to the BCR repertoire of a subject, and to determine whether the subject has ME/CFS based on the 1 or more variables. A system may also be provided that includes: a recording unit configured to record information on 1 or more variables including a variable related to the BCR repertoire of a subject; and a determination unit configured to obtain the information and determine whether the subject has ME/CFS. The system may be a computer system as needed, and may include the program described in this specification or a storage medium on which the program is recorded.
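One possible arrangement of the recording unit and the determination unit is sketched below. The class names, the in-memory storage, the scoring function, and the threshold are all hypothetical and serve only to illustrate the division of roles; an actual system could store and evaluate the variables in any manner.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict

@dataclass
class RecordingUnit:
    """Records the variables (e.g., IGH gene frequencies of use) for each subject."""
    records: Dict[str, Dict[str, float]] = field(default_factory=dict)

    def record(self, subject_id: str, variables: Dict[str, float]) -> None:
        self.records[subject_id] = dict(variables)

    def fetch(self, subject_id: str) -> Dict[str, float]:
        return self.records[subject_id]

@dataclass
class DeterminationUnit:
    """Obtains recorded variables and compares a score with a threshold."""
    recorder: RecordingUnit
    score: Callable[[Dict[str, float]], float]  # e.g., a fitted regression formula
    threshold: float                            # HYPOTHETICAL cut-off value

    def determine(self, subject_id: str) -> bool:
        return self.score(self.recorder.fetch(subject_id)) >= self.threshold

# Made-up usage: the score is simply one frequency of use (%), for illustration only.
recorder = RecordingUnit()
recorder.record("subject-001", {"IGHV3-30": 6.2})
unit = DeterminationUnit(recorder, score=lambda v: v["IGHV3-30"], threshold=5.0)
print(unit.determine("subject-001"))  # True
```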
(index)
(IGH gene)
One embodiment of the present invention provides a method for using 1 or more variables including the frequency of use of 1 or more genes (IGH genes) in an IgGH chain variable region of a target BCR as an index of ME/CFS of the target. As the IGH gene, a gene selected from the group consisting of IGHV-2, IGHV-3, IGHV-8, IGHV-18, IGHV-24, IGHV-38-4, IGHV-45, IGHV-46, IGHV-58, IGHV-69-2, IGHV-69/OR-1, IGHV/OR-5, IGHV/OR-9, IGHV/OR-1, IGHV-5, IGHV-26, IGHV-70/OR-5, IGHV-7, IGHV-9, IGHV-11, IGHV-13, IGHV-15, IGHV-16, IGHV-20, IGHV-21, IGHV-23-25, IGHV-30-3, IGHV-30-5, IGHV-4, IGHV-5, IG, IGHV3-33, IGHV3-35, IGHV3-38, IGHV3-38-3, IGHV3-43, IGHV3-43D, IGHV3-48, IGHV3-49, IGHV3-53, IGHV3-64, IGHV3-64D, IGHV3-66, IGHV D, IGHV3-72, IGHV D, IGHV3-73, IGHV D, IGHV3-74, IGHV D, IGHV3-NL D, IGHV3, IGHV D, IGHV3/OR D, IGHV3-7, IGHV 72/OR D, IGHV3-6, IGHV D, IGHV3/OR D, IGHV 3-8, IGHV D, IGHV3/OR D, IGHV3-9, IGHV D, IGHV3/OR D, IGHV 3-10, IGHV D, IGHV3/OR D, IGHV 3-12, IGHV D, IGHV3/OR D, IGHV3-13, IGHV D, IGHV3-4, IGHV D, IGHV 3-28, IGHV D, IGHV 3-30-2, IGHV D, IGHV3-30, IGHV D, IGHV3-72, IGHV D, IGHV 3-72-D, IGHV3-72, IGHV D, IGHV 3-72-D, IGHV 3-72-D, IGHV3, IGHV D, IGHV3, IGHV5-51, IGHV6-1, IGHV7-4-1, IGHV7-81, IGHD1-1, IGHD1-7, IGHD1-14, IGHD1-20, IGHD1-26, IGHD1/OR15-1a/b, IGHD2-2, IGHD2-8, IGHD2-15, IGHD2-21, IGHD2/OR15-2a/b, IGHD3-3, IGHD3-9, IGHD3-10, IGHD3-16, IGHD3-22, IGHD3/OR 3-3 a/b, IGHD 3-4, IGHD 3-11, IGHD 3-17, IGHD 3-23, IGHD3/OR 3-72 a/b, IGHD 3972-72, IGHD-72, IGHD 3-72, IGHD 3-72-3, IGHD 3-72-3-72, IGHD 3-72-3, IGHD 3-72, IGHD 3-72-3-72, IGHD7-27, IGHJ1, IGHJ2, IGHJ3, IGHJ4, IGHJ5, IGHJ6, IGHG1, IGHG2, IGHG3, IGHGG 4, IGHGP, and any combination thereof.
The examples of the present specification show that the frequency of use of various IGH genes can serve as an index of ME/CFS. In the present invention, the number of IGH genes used is not particularly limited as long as it is at least 1, and any number of genes from 1 to 117 may be used. The 1 or more variables used as indices in the present invention may include the frequencies of use of about 1, about 2, about 3, about 4, about 5, about 6, about 7, about 8, about 9, about 10, about 15, about 20, about 30, about 40, about 50, about 60, about 70, about 80, about 90, about 100, or about 110 IGH genes, or of a number of IGH genes exceeding these numbers. Where appropriate, the 1 or more variables may be a single variable.
In a preferred embodiment, the 1 or more IGH genes are selected from the group consisting of IGHV3-73, IGHV1-69-2, IGHV5-51, IGHV4-31, IGHV3-23D, IGHV1/OR15-9, IGHV4-39, IGHD5-12, IGHV3-43D, IGHD4-17, IGHV5-10-1, IGHD4/OR15-4a/b, IGHG4, IGHV1/OR15-5, IGHV3/OR16-9, IGHD1-7, IGHV3-21, IGHD6-6, IGHV3-33, IGHD4-23, IGHV3-30-5, IGHV3-23, IGHD6-13, IGHV3-64D, IGHV3-48, IGHV3-64, IGHG1, IGHV3-49, IGHV3-30-3, IGHD1-26, IGHJ6, IGHV3-30, IGHGP, IGHV1-3, and IGHD3-22. More preferably, the 1 or more IGH genes include at least 1 gene selected from the group consisting of IGHD6-6, IGHV3-33, IGHD4-23, IGHV3-30-5, IGHV3-23, IGHD6-13, IGHV3-64D, IGHV3-48, IGHV3-64, IGHG1, IGHV3-49, IGHV3-30-3, IGHD1-26, IGHJ6, IGHV3-30, IGHGP, IGHV1-3, and IGHD3-22. Even more preferably, the 1 or more IGH genes include at least 1 gene selected from the group consisting of IGHV3-49, IGHV3-30-3, IGHD1-26, IGHJ6, IGHV3-30, IGHGP, IGHV1-3, and IGHD3-22. In another embodiment, the 1 or more genes comprise at least 1 gene selected from the group consisting of IGHV1-3, IGHV3-30, IGHV3-30-3, IGHV3-49, IGHD1-26, and IGHJ6. The examples of the present specification suggest that each of the above-mentioned genes can be used alone for prediction.
In another embodiment of the present invention, there is provided a method in which the 1 or more variables include the frequencies of use of 2 or more genes in the IgGH chain variable region of the BCR of a subject. It is considered that using the frequencies of use of 2 or more genes can further improve the accuracy of the method. One skilled in the art can select an appropriate combination of the frequencies of use of 2 or more genes in view of the description of the present specification. The 1 or more variables may be selected so that the ROC curve based on regression analysis shows AUC ≧ 0.7, AUC ≧ 0.8, or AUC ≧ 0.9. Examples of combinations of genes include the combination of IGHV1-3, IGHV3-30, IGHV3-30-3, IGHV3-49, IGHD1-26, and IGHJ6. One skilled in the art can appropriately calculate the AUC of a certain combination of genes in the ROC curve based on regression analysis according to the method described in the present specification, and can determine whether or not that combination shows a desired AUC. Regression analysis can be used, for example, to discriminate between normal controls and ME/CFS.
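As an illustration of checking whether a set of predicted values reaches a desired AUC, the following minimal sketch computes the AUC with the rank (Mann-Whitney) formulation, which is equal to the area under the ROC curve. The function names, the example scores, and the labels are assumptions made for illustration and are not taken from the examples of the present specification.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve, computed as the probability that a
    patient (label 1) receives a higher score than a control (label 0).
    Ties between a patient and a control count as one half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes are required")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def meets_auc(scores, labels, threshold=0.7):
    """True when the predicted values reach the desired AUC."""
    return roc_auc(scores, labels) >= threshold


# Hypothetical predicted values from a regression equation:
# three patients followed by three controls.
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.2]
labels = [1, 1, 1, 0, 0, 0]
print(roc_auc(scores, labels), meets_auc(scores, labels, 0.8))
```

In this hypothetical data set, 8 of the 9 patient-control pairs are ordered correctly, so the AUC is 8/9.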
In the examples of the present specification, with respect to the combination of 2 IGH genes, any of the 117 genes can realize a combination showing AUC ≧ 0.7 in the ROC curve based on regression analysis, in combination with an appropriate other IGH gene.
(diversity index)
In the context of the present invention, in the diagnosis of myalgic encephalomyelitis/chronic fatigue syndrome (ME/CFS) in a subject, the BCR diversity index of the subject may be used as a variable instead of or in addition to other variables.
As the BCR diversity index, any index known in the art may be used, and examples thereof include the Shannon-Weaver index, the Simpson index, the Inverse Simpson index, Pielou's species evenness index, the Normalized Shannon-Weaver index, and DE indices (e.g., DE50 index, DE30 index, DE80 index).
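For reference, several of these indices can be computed from the copy numbers of the unique reads in a sample, as in the following minimal sketch. The definitions used here are common textbook forms and are assumptions: the Simpson index is taken as 1 minus the sum of squared proportions, and the DE index is taken as the fraction of unique reads (largest first) needed to reach the given percentage of total reads. The software used in the examples may define them differently.

```python
import math


def _proportions(counts):
    positive = [c for c in counts if c > 0]
    if not positive:
        raise ValueError("at least one unique read with a positive count is required")
    total = sum(positive)
    return [c / total for c in positive]


def diversity_indices(counts):
    """Return several diversity indices for a list of unique-read copy numbers."""
    props = _proportions(counts)
    shannon = -sum(p * math.log(p) for p in props)
    lam = sum(p * p for p in props)  # sum of squared proportions
    richness = len(props)
    pielou = shannon / math.log(richness) if richness > 1 else 0.0
    return {
        "shannon": shannon,
        "simpson": 1.0 - lam,        # assumed form: 1 - sum(p^2)
        "inverse_simpson": 1.0 / lam,
        "pielou": pielou,
    }


def de_index(counts, percent=50):
    """Assumed DE definition: fraction of unique reads, taken from the most
    abundant downward, whose copies reach `percent` % of all reads."""
    ordered = sorted((c for c in counts if c > 0), reverse=True)
    if not ordered:
        raise ValueError("at least one unique read with a positive count is required")
    target = sum(ordered) * percent / 100
    running = 0
    for k, c in enumerate(ordered, start=1):
        running += c
        if running >= target:
            return k / len(ordered)


# Hypothetical copy numbers of four unique reads.
counts = [50, 30, 10, 10]
print(diversity_indices(counts), de_index(counts, 50))
```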
In example 3-2, Table 7 of the present specification, several diversity indices with significance of 0.1 ≦ P <0.2 were extracted in the single regression analysis. In addition, in example 6 of the present specification, it was found that: among combinations of multiple variables, combinations of several variables including diversity index showed AUC values of 0.8 or more in ROC analysis (table 18).
(cell subsets)
In the context of the present invention, the number of cell subsets can be used as a variable instead of or in addition to other variables in the diagnosis of myalgic encephalomyelitis/chronic fatigue syndrome (ME/CFS) in a subject. As the number of cell subsets, for example, the number of immune cell subsets can be used. For purposes of this specification, a "cell subpopulation" refers to any collection of cells having certain common characteristics within a cell population comprising cells of diverse characteristics. A specific cell subset may be referred to by a term known in the art, or may be specified by describing an arbitrary property (e.g., expression of a cell surface marker).
Examples of the cell subsets include, but are not limited to, B cells, naive B cells, memory B cells, plasmablasts, activated naive B cells, transitional B cells, regulatory T cells, memory T cells, follicular helper T cells, Tfh1 cells, Tfh2 cells, Tfh17 cells, Th1 cells, Th2 cells, and Th17 cells.
The number of cell subsets can be determined by a person skilled in the art by means of, for example, a flow cytometer. For example, after a sample is collected, red blood cells are removed by hemolysis or specific gravity centrifugation, the sample is reacted with fluorescent-labeled antibodies (an antibody against the target antigen and its control antibody), and the cells are thoroughly washed and then observed using a flow cytometer. The detected scattered light and fluorescence are converted into electric signals, and the electric signals are analyzed by a computer. As a result, the intensity of FSC indicates the size of the cell and the intensity of SSC indicates the intracellular structure, so that lymphocytes, monocytes, and granulocytes can be distinguished. Thereafter, the target cell population is gated as necessary, and the pattern of antigen expression in these cells is examined. In practicing the methods of the invention, one skilled in the art can suitably identify the surface markers of the indicated cells to differentiate or count the cells.
Particularly useful cell subsets include B cells and regulatory T cells (Tregs). In the present invention, 1 or more variables including the number of B cells and/or the number of Tregs may be used as the index of ME/CFS in addition to or instead of other variables.
The number of cell subsets can be used as a ratio relative to a suitable reference. The number of B cells is, for example, the frequency (%) of B cells in peripheral blood mononuclear cells. The number of Tregs is, for example, the frequency (%) of Tregs in all CD4 positive T cells.
In example 3-2, Table 7 of the present specification, several cell subset variables with significance of 0.1 ≦ P <0.2 were extracted in a single regression analysis. In addition, in example 6 of the present specification, it was found that: among the combinations of variables, the combination of several variables including the cell subpopulation variable showed AUC values above 0.8 in ROC analysis (table 18).
(combination)
As described in the present specification, 1 or more variables including the frequency of use of 1 or more genes in the IgGH chain variable region of the target BCR can be used as the index of ME/CFS of the target. The 1 or more variables may be any combination of the variables described in the present specification. Preferably, the combination of 2 or more variables selected from the group consisting of the frequency of use of 1 or more genes in the IgGH chain variable region of the BCR of the subject, the BCR diversity index of the subject, and the number of 1 or more immune cell subsets of the subject is included. The 2 or more variables may be 3 or more, 4 or more, 5 or more, 6 or more, or any number of variables exceeding these numbers.
It has been verified in the examples of the present specification that combinations of multiple variables show a high AUC in the regression analysis based ROC curve used to discriminate between normal controls and ME/CFS, and one skilled in the art can appropriately combine variables and practice the invention using a combination showing a specific AUC. In one embodiment, the combination of variables may show AUC ≧ 0.7, AUC ≧ 0.8, AUC ≧ 0.85, AUC ≧ 0.9, AUC ≧ 0.95, or AUC ≧ 0.99 in the ROC curve based on regression analysis. One skilled in the art can appropriately calculate the AUC of a certain combination of variables in the ROC curve based on regression analysis according to the method described in the present specification, and can determine whether or not that combination shows a desired AUC. Examples of the combination of variables include, but are not limited to, the combination of the frequencies of use of IGHV1-3, IGHV3-30, IGHV3-30-3, IGHV3-49, IGHD1-26, and IGHJ6 of the subject and the number of B cells, or the combination of the frequencies of use of IGHV1-3, IGHV3-30, IGHV3-30-3, IGHV3-49, IGHD1-26, and IGHJ6 of the subject and the number of Tregs.
Combining variables that individually show predictive performance in single regression analysis may lead to higher predictive performance, although this is not required. In the examples of the present specification, among combinations of variables that each showed a significant difference of P <0.2 between the patient group and the control group in the single regression analysis, 46% of the 2-variable combinations, 81% of the 3-variable combinations, and 96% of the 4-variable combinations showed AUC ≧ 0.7 in the ROC analysis.
In the present invention, the diagnosis of the presence of myalgic encephalomyelitis/chronic fatigue syndrome (ME/CFS) in a subject can be performed, for example, by the following procedure. Regarding myalgic encephalomyelitis/chronic fatigue syndrome (ME/CFS), a plurality of variables including the frequency of use of 1 or more genes in the IgGH chain variable region of BCR can be provided. Next, a discriminant generated by performing multivariate analysis on the variable can be provided. The providing of the discriminant may include, for example, the steps of: for the variables, univariate or multivariate logistic regression is carried out by taking the differentiation of patients/healthy people as target variables; calculating a constant and a coefficient of a discriminant from a constant and a partial regression coefficient of a Logit model equation generated in logistic regression; and generating a discriminant based on the constant and the coefficient obtained in the processing. Then, the value of the subject variable is substituted into the discriminant, so that the probability of suffering from ME/CFS can be calculated. In case the probability of suffering from ME/CFS is higher than a specified value, it may be determined that the subject suffers from ME/CFS.
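The procedure above can be sketched as follows: a logistic regression is fitted, its constant and partial regression coefficients are turned into a discriminant, and a subject's values are substituted into the discriminant to obtain a probability. This is a minimal sketch that uses plain gradient ascent as a stand-in for a statistics package; the function names, the training data, and the cutoff of 0.5 are assumptions made for illustration.

```python
import math


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(rows, labels, learning_rate=0.1, steps=20000):
    """Estimate the constant and partial regression coefficients of the
    logit model by gradient ascent on the log-likelihood."""
    n_features = len(rows[0])
    constant, coefs = 0.0, [0.0] * n_features
    for _ in range(steps):
        grad_c, grad_w = 0.0, [0.0] * n_features
        for x, y in zip(rows, labels):
            err = y - sigmoid(constant + sum(w * v for w, v in zip(coefs, x)))
            grad_c += err
            for j, v in enumerate(x):
                grad_w[j] += err * v
        constant += learning_rate * grad_c / len(rows)
        coefs = [w + learning_rate * g / len(rows) for w, g in zip(coefs, grad_w)]
    return constant, coefs


def discriminant(constant, coefs):
    """Return a function mapping a subject's variable values to P(ME/CFS)."""
    def probability(values):
        return sigmoid(constant + sum(w * v for w, v in zip(coefs, values)))
    return probability


# Hypothetical training data: one variable per subject,
# label 1 = patient, 0 = healthy control.
rows = [[1.0], [2.0], [3.0], [4.0], [5.0], [6.0]]
labels = [0, 0, 1, 0, 1, 1]
constant, coefs = fit_logistic(rows, labels)
probability = discriminant(constant, coefs)

CUTOFF = 0.5  # the "specified value"; chosen arbitrarily for this sketch
print(probability([5.5]) > CUTOFF)
```

In practice the cutoff would be chosen from the ROC curve according to the sensitivity and specificity that are required.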
More specifically, in the present invention, the diagnosis of the presence of myalgic encephalomyelitis/chronic fatigue syndrome (ME/CFS) in a subject can be performed, for example, by the following procedure. In the method, a combination of variables is determined. The determination of the combination of variables may comprise the steps of: (1) comparing myalgic encephalomyelitis/chronic fatigue syndrome (ME/CFS) patients with healthy subjects to provide a plurality of variables, including the frequency of use of 1 or more genes in the IgGH chain variable region of the BCR, in which a significant difference is detected; and/or (2) performing univariate logistic analysis using 1 gene in the IgGH variable region of the BCR as an independent variable, or performing multivariate logistic analysis using 2 or more genes in the IgGH variable region of the BCR as independent variables, to obtain a logistic regression model equation, performing ROC analysis to measure the goodness of fit of the logistic regression model equation, and selecting genes showing higher AUC values as variables for the discriminant.
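Step (2) for the single-gene case can be sketched as below. Because a univariate logistic model is a monotonic transform of its single variable, the AUC of its predicted values equals the AUC of the raw variable, or 1 minus that AUC when the coefficient is negative; the sketch therefore ranks variables by the larger of the two. The variable names and values are hypothetical and are not data from the examples.

```python
def roc_auc(values, labels):
    """Rank-based AUC: probability that a patient value exceeds a control value."""
    pos = [v for v, y in zip(values, labels) if y == 1]
    neg = [v for v, y in zip(values, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes are required")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def rank_by_univariate_auc(table, labels):
    """table maps a variable name to its per-subject values.
    Returns (name, AUC) pairs sorted from highest to lowest AUC."""
    ranked = []
    for name, values in table.items():
        a = roc_auc(values, labels)
        ranked.append((name, max(a, 1.0 - a)))  # direction-independent
    ranked.sort(key=lambda item: item[1], reverse=True)
    return ranked


# Hypothetical usage frequencies for three patients and three controls.
labels = [1, 1, 1, 0, 0, 0]
table = {
    "gene_a": [5, 6, 7, 1, 2, 3],   # higher in patients
    "gene_b": [1, 2, 3, 4, 5, 6],   # lower in patients
    "gene_c": [1, 4, 2, 3, 5, 0],   # weakly informative
}
for name, auc in rank_by_univariate_auc(table, labels):
    print(name, round(auc, 3))
```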
Next, a discriminant generated by performing multivariate analysis on the variables for the discriminant can be provided. The providing of the discriminant may include, for example, the steps of: for the variables, univariate or multivariate logistic regression is carried out by taking the differentiation of patients/healthy people as target variables; calculating a constant and a coefficient of a discriminant from a constant and a partial regression coefficient of a Logit model equation generated in logistic regression; and generating a discriminant based on the constant and the coefficient obtained in the processing. Then, the value of the subject variable is substituted into the discriminant, so that the probability of suffering from ME/CFS can be calculated. In case the probability of suffering from ME/CFS is higher than a specified value, it may be determined that the subject suffers from ME/CFS.
(differential diagnosis)
Embodiments of the present invention provide methods of distinguishing ME/CFS from other diseases in a subject. In one embodiment of the present invention, differential diagnosis is performed by using a formula including an index (variable) that differs between an ME/CFS group and a non-ME/CFS group containing normal controls and patients with other diseases. In this case, for example, healthy subjects and MS patients may together be set as the non-ME/CFS group, IGH genes effective for discrimination from the ME/CFS group may be selected, and 1 discriminant including the selected genes may be used. Examples include the IGH genes that showed significant differences in the test for significant differences between the ME/CFS group and the non-ME/CFS group in the examples of the present specification; as the 1 or more variables, variables including the frequency of use of at least 1 gene selected from the group consisting of IGHV3-49, IGHV3-30-3, IGHD1-26, IGHV3-30, IGHJ6, IGHGP, IGHV4-31, IGHV3-64, IGHD3-22, IGHV3-33, IGHV3-73, IGHV5-10-1, and IGHV4-34 may be used.
Alternatively, in another embodiment, the following mode is contemplated: after discrimination is performed using a discriminant for discriminating between normal controls and ME/CFS patients, the values are substituted into a discriminant for discriminating other diseases (for example, MS) to rule out the possibility of the other diseases. In this case, ME/CFS is discriminated from normal controls as described in the present specification, and thereafter (or before, or at the same time) discrimination from other diseases is performed by further using another discriminant. That is, a method may be provided that includes the steps of: (a) using a part of the 1 or more variables as an index of ME/CFS of the subject, and (b) using a part of the 1 or more variables as an index that the subject has ME/CFS and not another disease. The other diseases may include multiple sclerosis (MS). For a plurality of other diseases, (b) may be performed multiple times.
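Steps (a) and (b) can be pictured as a sequence of checks, as in the following minimal sketch. It assumes that each discriminant has already produced a probability of ME/CFS for the subject; the function name, the returned labels, and the single shared cutoff are assumptions made for illustration.

```python
def differential_call(p_vs_control, p_vs_other, cutoff=0.5):
    """Two-stage discrimination.

    p_vs_control: P(ME/CFS) from the ME/CFS-versus-normal-control discriminant.
    p_vs_other:   dict mapping a disease name to P(ME/CFS) from the
                  ME/CFS-versus-that-disease discriminant; step (b) is
                  repeated once per entry.
    """
    # Step (a): discriminate ME/CFS from normal controls.
    if p_vs_control <= cutoff:
        return "not ME/CFS"
    # Step (b): rule out each other disease in turn.
    for disease, p in p_vs_other.items():
        if p <= cutoff:
            return "other disease not excluded: " + disease
    return "ME/CFS"


# Hypothetical probabilities for three subjects.
print(differential_call(0.92, {"MS": 0.81}))
print(differential_call(0.92, {"MS": 0.30}))
print(differential_call(0.20, {"MS": 0.81}))
```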
One embodiment of the present invention is a method for determining that a subject suffers from myalgic encephalomyelitis/chronic fatigue syndrome (ME/CFS) and not another disease (e.g., multiple sclerosis (MS)) using 1 or more variables including the frequency of use of 1 or more genes in the IgGH chain variable region of the BCR of the subject. Such a method can be combined with the discrimination between ME/CFS and normal controls described in the present specification as necessary. Here, in the case where differentiation from MS is desired, the 1 or more genes may include at least 1 gene selected from the group consisting of IGHV1-3, IGHV3-30, IGHV3-30-3, IGHD1-26, IGHV3-49, and IGHJ6. The 1 or more genes may be selected so that the ROC curve based on regression analysis for discriminating between MS and ME/CFS shows AUC ≧ 0.7, AUC ≧ 0.8, or AUC ≧ 0.9. In the present invention, the number of IGH genes used for discriminating between MS and ME/CFS is not particularly limited as long as it is at least 1, and any number of genes from 1 to 117 may be used. The 1 or more variables used as indices in the present invention may include the frequencies of use of about 1, about 2, about 3, about 4, about 5, about 6, about 7, about 8, about 9, about 10, about 15, about 20, about 30, about 40, about 50, about 60, about 70, about 80, about 90, about 100, or about 110 IGH genes, or of a number of IGH genes exceeding these numbers. Where appropriate, the 1 or more variables may be a single variable.
In the present specification, the term "subject" refers to any living organism that is the subject of diagnosis, detection, or the like of the present invention. Preferably the subject is a human.
In the present specification, the term "sample" refers to any substance obtained from a subject, and includes, for example, peripheral blood, tissue biopsy, cell sample, lymph, saliva, urine, and the like. The preferred sample can be appropriately selected by those skilled in the art based on the description of the present specification.
In the present specification, "or" is used when "at least 1 or more of the items listed in the text can be adopted". The same applies to "or". In the present specification, when a range is explicitly described as "within 2 values", the range also includes 2 values themselves.
All references cited in the present specification, such as scientific literature, patents, and patent applications, are incorporated herein by reference in their entirety to the same extent as if each reference were specifically and individually indicated to be incorporated herein by reference.
The present invention has been described above by showing preferred embodiments for easy understanding. The present invention will be described below based on examples, but the above description and the following examples are provided for illustrative purposes only and are not provided for limiting the present invention. Therefore, the scope of the present invention is not limited to the embodiments and examples described in the specification, but is defined only by the claims.
Examples
In the examples described below in the present specification, the following abbreviations are used as appropriate.
[ Table A ]
Figure BDA0003010544010000321
[ example 1]
Next-generation B cell receptor repertoire analysis using peripheral blood mononuclear cells from ME/CFS patients
1. Materials and methods
1.1. Peripheral blood mononuclear cell isolation and RNA extraction
10 mL of whole blood was collected from the ME/CFS patients (37 subjects) and healthy controls (23 subjects) shown in Table 1 into heparin-containing blood collection tubes, and peripheral blood mononuclear cells (PBMCs) were separated by Ficoll-Paque PLUS density gradient centrifugation. Total RNA was extracted from the isolated PBMCs and purified using a total RNA miniprep kit (Qiagen, Germany). RNA was quantified using an Agilent 2100 Bioanalyzer (Agilent).
[ Table 1]
TABLE 1 ME/CFS patients
Figure BDA0003010544010000331
The performance status is evaluated on a 10-point scale from 0 to 9, with 9 being the most severe. The p-value for age was calculated by the Mann-Whitney U test and the p-value for gender by Fisher's exact test, and p <0.05 was regarded as significant.
1.2. Synthesis of complementary DNA and double-stranded complementary DNA
The BCR genes were amplified using the adaptor-ligation PCR method. Complementary DNA (cDNA) was synthesized using an oligo(dT) primer containing a restriction enzyme site (BSL-18E: Table 2) and reverse transcriptase. Then, double-stranded complementary DNA (ds-cDNA) was synthesized using E. coli DNA ligase (Invitrogen), DNA polymerase I (Invitrogen), and ribonuclease H (RNase H) (Invitrogen). Thereafter, a 5'-end blunting reaction was performed with T4 DNA polymerase, and the ends were cleaved with the restriction enzyme Not I. After column purification using the MinElute Reaction Cleanup Kit (QIAGEN), the P20EA/P10EA adaptor was added by a ligation reaction with T4 ligase, and the adaptor-ligated ds-cDNA was digested with the restriction enzyme Not I.
1.3.PCR
To specifically amplify the heavy chain (IGH) gene of immunoglobulin G (IgG) of the B cell receptor (BCR), three nested PCRs were performed with KAPA HiFi HotStart ReadyMix (Nippon Genetics) on a T100 thermal cycler (Bio-Rad). The first PCR was performed using P20EA and CG1, followed by the second PCR using P20EA and CG2. Furthermore, the tag sequences required for sequencing were added using P22EA-ST1-R and CG-ST1-R. After the remaining primers were removed using Agencourt AMPure beads, indexes were added using Nextera XT Index Kit v2 Set A (Illumina).
[ Table 2]
TABLE 2 primer sequences
Figure BDA0003010544010000341
(corresponding to the sequence numbers 1 to 7 from the top)
1.4. Next-generation sequencing analysis
The concentration of the amplification product was measured using a Qubit (registered trademark) 3.0 fluorometer (Thermo Fisher Scientific), the product was diluted to 4 nM, and a portion of PhiX Control v3 (Illumina) was mixed in to prepare the final preparation. Paired-end sequencing of the final preparation was performed with the MiSeq Reagent Kit v3 (600 cycles, Illumina) on an Illumina MiSeq sequencer.
1.5. Analysis using repertoire analysis software
Using the pair of FASTQ base sequence data sets obtained by MiSeq sequencing, the V region sequence (IGHV), D region sequence (IGHD), J region sequence (IGHJ), C region sequence (IGHC), and CDR3 sequence of the IGH gene were determined by the repertoire analysis software Repertoire Genesis. Reads having identical IGHV, IGHD, IGHJ, and IGHC assignments and an identical CDR3 amino acid sequence were treated as one unique read, and the number of copies of each unique read in each sample was counted. The frequencies of use of IGHV, IGHD, IGHJ, and IGHC and the diversity indices of each sample were calculated from the counts of the unique reads.
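The counting step described above can be sketched as follows. The sketch assumes that each annotated read is represented as a tuple of its IGHV, IGHD, IGHJ, and IGHC assignments and its CDR3 amino acid sequence, and that the frequency of use is weighted by copy number; both the data layout and the weighting are assumptions, and the reads shown are hypothetical.

```python
from collections import Counter


def count_unique_reads(reads):
    """reads: iterable of (ighv, ighd, ighj, ighc, cdr3_aa) tuples.
    Reads with identical tuples are collapsed into one unique read,
    and the value is the number of copies of that unique read."""
    return Counter(reads)


def usage_frequency(unique_counts, position):
    """Frequency of use (%) of each gene at the given tuple position
    (0 = IGHV, 1 = IGHD, 2 = IGHJ, 3 = IGHC), weighted by copy number."""
    total = sum(unique_counts.values())
    per_gene = Counter()
    for key, copies in unique_counts.items():
        per_gene[key[position]] += copies
    return {gene: 100.0 * n / total for gene, n in per_gene.items()}


# Hypothetical annotated reads from one sample.
reads = [
    ("IGHV3-30", "IGHD1-26", "IGHJ6", "IGHG1", "CARDYW"),
    ("IGHV3-30", "IGHD1-26", "IGHJ6", "IGHG1", "CARDYW"),
    ("IGHV3-49", "IGHD3-22", "IGHJ4", "IGHG2", "CTRGYW"),
    ("IGHV3-30", "IGHD3-22", "IGHJ4", "IGHG1", "CAKDYW"),
]
unique = count_unique_reads(reads)
print(len(unique), usage_frequency(unique, 0))
```

The copy numbers in `unique` are also the input from which diversity indices would be calculated.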
2. Results
Sequencing data of 20-30 million reads were obtained from the samples of ME/CFS patients and healthy controls (Table 3). The numbers of total reads, assigned reads, in-frame reads, and unique reads did not differ between ME/CFS patients and healthy controls. The frequencies of use of IGHV, IGHD, IGHJ, and IGHC and the diversity indices calculated from the read data were compared between ME/CFS patients and healthy controls. Among the IGHV genes, the frequencies of use of IGHV1-3, IGHV3-30, IGHV3-30-3, and IGHV3-49 were significantly higher in ME/CFS patients than in healthy controls (FIG. 1, significance levels for the Mann-Whitney test: P <0.05, P <0.001, and P <0.001). IGHD1-26 among the IGHD genes and IGHJ6 among the IGHJ genes showed significantly higher frequencies of use in ME/CFS patients than in healthy controls (FIG. 2, P <0.01 and P <0.05). The dot plots are shown in FIGS. 3A and 3B. These results suggest that measurement of the frequencies of use of IGHV1-3, IGHV3-30, IGHV3-30-3, IGHD1-26, IGHV3-49, and IGHJ6 by next-generation BCR repertoire analysis can be used for the identification of ME/CFS, a disease for which no effective biomarker exists (Table 4).
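A two-group comparison of this kind can be sketched with a Mann-Whitney U test as below. The sketch uses the normal approximation with tie correction and no continuity correction, which is an assumption; the specification does not state which variant of the test was used, and exact methods may give different p-values for small groups. The usage frequencies are hypothetical.

```python
import math


def mann_whitney_u(x, y):
    """Return (U statistic for x, two-sided p-value by normal approximation)."""
    combined = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    n1, n2 = len(x), len(y)
    rank_sum_x = 0.0
    tie_term = 0.0
    i = 0
    while i < len(combined):
        j = i
        while j < len(combined) and combined[j][0] == combined[i][0]:
            j += 1
        midrank = (i + 1 + j) / 2.0  # average of ranks i+1 .. j
        t = j - i                    # size of this tie group
        tie_term += t ** 3 - t
        rank_sum_x += midrank * sum(1 for k in range(i, j) if combined[k][1] == 0)
        i = j
    u = rank_sum_x - n1 * (n1 + 1) / 2.0
    n = n1 + n2
    variance = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if variance == 0:
        return u, 1.0
    z = (u - n1 * n2 / 2.0) / math.sqrt(variance)
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided
    return u, p


# Hypothetical usage frequencies (%) of one gene in two groups.
patients = [5.0, 6.0, 7.0]
controls = [1.0, 2.0, 3.0]
u, p = mann_whitney_u(patients, controls)
print(u, p < 0.05)
```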
[ Table 3]
TABLE 3 number of reads obtained by NGS sequencing
Figure BDA0003010544010000351
[ Table 4]
TABLE 4 identification of biomarker candidates I for ME/CFS disease
Figure BDA0003010544010000352
[ example 2]
Flow cytometry-based comparative study of lymphocytes from ME/CFS disease patients
1. Method
Peripheral blood mononuclear cells (PBMCs) were isolated by the method described in [example 1]. Thereafter, the cells were stained with various fluorescent dye-labeled monoclonal antibodies, and the frequencies of the following lymphocyte subsets were calculated using flow cytometers (FACS Canto II and FACS Aria II flow cytometers (BD Biosciences)).
(%) B cells: CD19+ cells / PBMC;
naive B cells (nB): CD19+ CD27- / CD19+ cells;
memory B cells (mB): CD19+ CD27+ CD180+ / CD19+;
plasmablasts (PB): CD19+ CD27+ CD180- CD38high / CD19+;
transitional B cells (TrB): CD19+ CD27- CD24+ MitoTracker Green high / CD19+;
memory CD4 T cells (mCD4T): CD3+ CD4+ CD127+ CD45RA- / CD3+ CD4+;
follicular helper T cells (Tfh): CD3+ CD4+ CD127+ CD45RA- CXCR5+ / CD3+ CD4+;
helper T cell 1 (Th1 cells): CD3+ CD4+ CD127+ CD45RA- CXCR5- CXCR3+ CCR6- / CD3+ CD4+ CD127+ CD45RA-;
helper T cell 2 (Th2 cells): CD3+ CD4+ CD127+ CD45RA- CXCR5- CXCR3- CCR6- / CD3+ CD4+ CD127+ CD45RA-;
helper T cell 17 (Th17 cells): CD3+ CD4+ CD127+ CD45RA- CXCR5- CXCR3- CCR6+ / CD3+ CD4+ CD127+ CD45RA-;
follicular helper T cell 1 (Tfh1 cells): CD3+ CD4+ CD127+ CD45RA- CXCR5+ CXCR3+ CCR6- / CD3+ CD4+ CD127+ CD45RA- CXCR5+;
follicular helper T cell 2 (Tfh2 cells): CD3+ CD4+ CD127+ CD45RA- CXCR5+ CXCR3- CCR6- / CD3+ CD4+ CD127+ CD45RA- CXCR5+;
follicular helper T cell 17 (Tfh17 cells): CD3+ CD4+ CD127+ CD45RA- CXCR5+ CXCR3- CCR6+ / CD3+ CD4+ CD127+ CD45RA- CXCR5+;
regulatory T cells (Tregs): CD3+ CD4+ CD45RA- CD127- CD25++ / CD3+ CD4+.
For the frequencies obtained, the Mann-Whitney U test was performed between ME/CFS patients and healthy subjects to examine significant differences between the 2 groups. P <0.05 was regarded as statistically significant.
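Each frequency above is a ratio of the cells in a numerator gate to the cells in a parent (denominator) gate. The following minimal sketch illustrates the arithmetic on a toy data set in which every cell is reduced to the set of markers it is positive for. The data layout, the marker label "CD25hi" standing in for CD25++, and the cells themselves are assumptions; real flow cytometry data are gated on fluorescence intensities rather than on positive/negative flags.

```python
def in_gate(cell, positive=(), negative=()):
    """True if the cell is positive for every marker in `positive`
    and negative for every marker in `negative`."""
    return all(m in cell for m in positive) and not any(m in cell for m in negative)


def subset_frequency(cells, numerator, denominator):
    """Frequency (%) of numerator-gate cells among denominator-gate cells.
    Each gate is a (positive markers, negative markers) pair."""
    parent = [c for c in cells if in_gate(c, *denominator)]
    if not parent:
        raise ValueError("no cells in the parent gate")
    hits = [c for c in parent if in_gate(c, *numerator)]
    return 100.0 * len(hits) / len(parent)


# Toy PBMC sample: each cell is the set of markers it expresses.
cells = [
    {"CD19"}, {"CD19", "CD27"},
    {"CD3", "CD4", "CD25hi"},
    {"CD3", "CD4", "CD127"},
    {"CD3", "CD4", "CD45RA"},
    {"CD3", "CD4", "CD127", "CD25hi"},
    set(), set(),
]

ALL_PBMC = ((), ())
CD4_T = (("CD3", "CD4"), ())
b_cell_pct = subset_frequency(cells, (("CD19",), ()), ALL_PBMC)
treg_pct = subset_frequency(
    cells, (("CD3", "CD4", "CD25hi"), ("CD45RA", "CD127")), CD4_T)
print(b_cell_pct, treg_pct)
```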
2. Results
As shown in FIGS. 4, 5, and 6, the frequency (%) of B cells was significantly higher in the disease group, the frequency of regulatory T cells (Tregs) was significantly lower in the disease group, and the frequency of follicular helper T cell 17 (Tfh17) was significantly higher in the disease group.
[ example 3-1]
Predictive identification of ME/CFS using IGH gene and cell subset frequency data for which significant differences were observed between the 2 groups
1. Method
For the 2 groups of ME/CFS patients (37 cases) and healthy controls (23 cases), whether ME/CFS could be predicted was examined using the 6 IGH genes in which significant differences were detected (Table 5) together with (%) B cells or (%) regulatory T cells as cell subset frequency data. Multivariate logistic analysis was performed using SPSS software (IBM), and receiver operating characteristic (ROC) curves were generated using the predicted values of the dependent variable of the regression equation. Further, as a performance evaluation value for the prediction by these variables, the area under the ROC curve (AUC) was obtained.
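How an ROC curve and its AUC follow from the predicted values of a regression equation can be sketched as below. The curve is built by lowering the decision threshold one distinct predicted value at a time, and the area is obtained with the trapezoidal rule. The function names and the predicted values are assumptions made for illustration; they are not output of the SPSS analysis.

```python
def roc_points(scores, labels):
    """Return ROC points (false positive rate, true positive rate),
    starting at (0, 0) and ending at (1, 1)."""
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both classes are required")
    ordered = sorted(zip(scores, labels), key=lambda t: -t[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ordered):
        j = i
        # Subjects with the same predicted value cross the threshold together.
        while j < len(ordered) and ordered[j][0] == ordered[i][0]:
            if ordered[j][1] == 1:
                tp += 1
            else:
                fp += 1
            j += 1
        points.append((fp / n_neg, tp / n_pos))
        i = j
    return points


def trapezoid_auc(points):
    """Area under the piecewise-linear ROC curve."""
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Hypothetical predicted values: three patients, then three controls.
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.2]
labels = [1, 1, 1, 0, 0, 0]
curve = roc_points(scores, labels)
print(curve[-1], trapezoid_auc(curve))
```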
2. Results
A good AUC value of 0.946 was obtained in the analysis using the 6 IGH gene variables and (%) B cells (FIG. 7). In addition, a very good AUC value of 0.957 was obtained in the analysis based on a regression equation using the 6 IGH gene variables and (%) regulatory T cells (FIG. 8).
[ Table 5]
TABLE 5 identification of biomarker candidates II for ME/CFS disease
Figure BDA0003010544010000371
[ example 3-2]
Extraction of IGH genes and cell subset variables with high prediction performance for ME/CFS diseases based on single regression analysis
1. Method
Single regression analysis was performed with each one of the 74 IGHVs, 32 IGHDs, 6 IGHJs, 5 IGHCs, 4 diversity indices, and 15 cell subset frequency data as the independent variable and the binary variable of ME/CFS patient versus healthy control as the dependent variable. The variables used are shown in Table 6. Variables satisfying the significance levels P <0.05, 0.05 ≦ P <0.1, and 0.1 ≦ P <0.2 were extracted. The single and logistic regression analyses used glm() of R, and the ROC analysis used the R package pROC.
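The extraction into the three significance bands can be sketched as follows, assuming that a p-value has already been obtained for each variable from its single regression. The variable names and p-values below are hypothetical and are not the values of Table 7.

```python
def significance_tier(p):
    """Return the significance band a p-value falls into, or None."""
    if p < 0.05:
        return "P < 0.05"
    if p < 0.1:
        return "0.05 <= P < 0.1"
    if p < 0.2:
        return "0.1 <= P < 0.2"
    return None


def extract_by_tier(p_values):
    """p_values maps a variable name to its single-regression p-value.
    Returns a dict from band to the sorted names of the variables in it."""
    tiers = {"P < 0.05": [], "0.05 <= P < 0.1": [], "0.1 <= P < 0.2": []}
    for name, p in p_values.items():
        tier = significance_tier(p)
        if tier is not None:
            tiers[tier].append(name)
    return {tier: sorted(names) for tier, names in tiers.items()}


# Hypothetical p-values for five variables.
p_values = {"var_a": 0.003, "var_b": 0.07, "var_c": 0.15,
            "var_d": 0.64, "var_e": 0.049}
print(extract_by_tier(p_values))
```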
[ Table 6]
TABLE 6 variables for regression analysis and ROC analysis
Figure BDA0003010544010000381
2. Results
From the total of 136 variables (74 IGHVs, 32 IGHDs, 6 IGHJs, 5 IGHCs, 4 diversity indices, and 15 cell subset data), 8 IGH genes and a total of 8 cell subset variables were extracted as variables satisfying the significance level P <0.05 (Table 7). The variables satisfying 0.05 ≦ P <0.1 were 10 IGH genes and the frequency data of 4 cell subsets. The variables satisfying 0.1 ≦ P <0.2 were 16 IGH genes, 3 diversity indices, and 2 cell subset variables. It was suggested that a single variable satisfying the significance level P <0.2 is effective for predicting ME/CFS. In particular, it was suggested that a single variable satisfying the significance level P <0.05 is very effective for predicting ME/CFS (Table 8). The variables satisfying the significance level P <0.2 were used in the analysis based on multivariate logistic regression.
It should be noted that ROC analysis was performed individually on several variables, and the resulting AUC values were as follows: IGHV3-49: 0.764, IGHV3-30-3: 0.759, IGHD1-26: 0.731, IGHJ6: 0.672, IGHV3-30: 0.693, IGHV1-3: 0.659, B cells: 0.771, Tregs: 0.817, Shannon: 0.617, Inverse: 0.636.
[ Table 7]
Table 7 variables that satisfy significance levels in single regression analysis
Figure BDA0003010544010000391
[ Table 8]
TABLE 8 identification of biomarker candidates III for ME/CFS disease
Figure BDA0003010544010000401
[ example 4]
Predictive identification of ME/CFS disease Using 2 IGH genes
1. Method
Multivariate logistic regression analysis was performed on every combination of 2 IGH genes out of the total of 117 IGH genes (74 IGHV, 32 IGHD, 6 IGHJ, and 5 IGHC), and ROC curves were prepared using the predicted values of the dependent variable of the regression equations. For these ROC curves, AUC values representing the predictive performance for ME/CFS were calculated, and the combinations of IGH genes showing AUC values of 0.7 or more or 0.8 or more, together with a list of the IGH genes used in these combinations, were extracted. Logistic regression analysis used glm() of R, and ROC analysis used the R package pROC.
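The exhaustive screening of gene combinations can be sketched as below. The sketch assumes a caller-supplied function `auc_of` that fits the regression for one combination and returns its AUC; here it is replaced by a lookup in a small table of hypothetical AUC values so that the enumeration and filtering logic can be shown on its own.

```python
from itertools import combinations


def screen_combinations(genes, auc_of, size=2, threshold=0.7):
    """Evaluate every `size`-gene combination and keep those whose AUC
    reaches `threshold`. Returns (hits, genes used in the hits)."""
    hits = []
    for combo in combinations(genes, size):
        auc = auc_of(combo)
        if auc >= threshold:
            hits.append((combo, auc))
    used = sorted({gene for combo, _ in hits for gene in combo})
    return hits, used


# Hypothetical AUC values standing in for fitted regression models.
example_auc = {
    ("gene_a", "gene_b"): 0.82,
    ("gene_a", "gene_c"): 0.65,
    ("gene_b", "gene_c"): 0.71,
}
genes = ["gene_a", "gene_b", "gene_c"]

hits_07, used_07 = screen_combinations(genes, example_auc.get, threshold=0.7)
hits_08, used_08 = screen_combinations(genes, example_auc.get, threshold=0.8)
print(len(hits_07), used_07)
print(len(hits_08), used_08)
```

The same function covers combinations of 3 or more genes by changing `size`, at the cost of a rapidly growing number of models to fit.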
2. Results
In the ROC analysis using any 2 of the IGH genes, the numbers of IGH gene combinations showing AUC values of 0.8 or more and 0.7 or more were 30 groups and 637 groups, respectively. The combinations of variables showing AUC ≧ 0.7 by two-variable logistic regression and ROC analysis are shown in Table 11. In addition, 26 IGH genes were used in the combinations with AUC values of 0.8 or more, and 117 IGH genes were used in the combinations with AUC values of 0.7 or more (Table 9). In addition, of the 30 groups of gene combinations having AUC values of 0.8 or more, 10 groups (33%) were combinations of variables showing significance in the single regression analysis, and 20 groups (67%) were IGH showing significance in the single regression analysis (Table 10). These results show that predictive identification of ME/CFS patients can be performed by combining arbitrary IGH genes.
[ Table 9]
TABLE 9 IGH genes that can be used in predictive identification of ME/CFS diseases
Figure BDA0003010544010000411
[ Table 10]
TABLE 10 prediction of ME/CFS patients based on any 2 IGH gene combinations
Figure BDA0003010544010000421
*1: Underlined genes are those for which significance was confirmed by single regression analysis at a significance level of P <0.05.
[ Table 11-1]
Combinations of variables showing AUC ≧ 0.7 by bivariate logistic regression and ROC analysis
Figure BDA0003010544010000431
[ Table 11-2]
Figure BDA0003010544010000441
[ Table 11-3]
Figure BDA0003010544010000451
[ Table 11-4]
Figure BDA0003010544010000461
[ Table 11-5]
Figure BDA0003010544010000471
[ Table 11-6]
Figure BDA0003010544010000481
[ Table 11-7]
Figure BDA0003010544010000491
[ Table 11-8]
Figure BDA0003010544010000501
[ Table 11-9]
Figure BDA0003010544010000511
[ Table 11-10]
Figure BDA0003010544010000521
[ Table 11-11]
Figure BDA0003010544010000531
[ example 5]
Multivariate logistic regression analysis using multiple IGH genes
1. Method
Multivariate logistic regression analysis was performed exhaustively on every combination of 2 or more IGH genes from the total of 117 IGH genes (74 IGHV, 32 IGHD, 6 IGHJ and 5 IGHC). Using the predicted values of the dependent variable of each regression equation thus obtained, an ROC curve was prepared and the AUC value was calculated. Single regression and logistic regression analyses were performed with glm() in R, and ROC analysis with the R package pROC.
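The exhaustive procedure described above can be sketched as follows. This is a minimal Python illustration rather than the glm()/pROC workflow used in the analysis: the logistic model is fitted by plain gradient ascent on standardized inputs, and the gene names (`gene_A` to `gene_D`), the frequency values and all function names are invented for the example.

```python
import math
from itertools import combinations


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def standardize(col):
    """Scale one variable to mean 0 and (population) standard deviation 1."""
    mean = sum(col) / len(col)
    sd = math.sqrt(sum((v - mean) ** 2 for v in col) / len(col)) or 1.0
    return [(v - mean) / sd for v in col]


def linear_predictor(b, x):
    """b0 + b1*x1 + ... + bp*xp."""
    return b[0] + sum(bj * xj for bj, xj in zip(b[1:], x))


def fit_logistic(rows, y, lr=0.5, n_iter=2000):
    """Estimate b0..bp by gradient ascent on the mean log-likelihood."""
    n, p = len(rows), len(rows[0])
    b = [0.0] * (p + 1)
    for _ in range(n_iter):
        grad = [0.0] * (p + 1)
        for x, yi in zip(rows, y):
            err = yi - sigmoid(linear_predictor(b, x))
            grad[0] += err
            for j, xj in enumerate(x):
                grad[j + 1] += err * xj
        b = [bj + lr * g / n for bj, g in zip(b, grad)]
    return b


def roc_auc(labels, scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for yi, s in zip(labels, scores) if yi == 1]
    neg = [s for yi, s in zip(labels, scores) if yi == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def scan_combinations(freq, y, k=2):
    """Fit one model per k-gene combination and rank the combinations by AUC."""
    cols = {gene: standardize(values) for gene, values in freq.items()}
    results = []
    for genes in combinations(sorted(cols), k):
        rows = list(zip(*(cols[g] for g in genes)))
        b = fit_logistic(rows, y)
        scores = [sigmoid(linear_predictor(b, x)) for x in rows]
        results.append((roc_auc(y, scores), genes))
    return sorted(results, reverse=True)


# Invented toy data: 4 ME/CFS samples (1) followed by 4 controls (0).
y = [1, 1, 1, 1, 0, 0, 0, 0]
freq = {
    "gene_A": [7, 8, 9, 10, 1, 2, 3, 4],
    "gene_B": [3, 5, 2, 6, 4, 3, 5, 2],
    "gene_C": [1, 4, 2, 3, 2, 3, 1, 4],
    "gene_D": [6, 2, 5, 3, 4, 6, 2, 5],
}
ranked = scan_combinations(freq, y)
hits = [(auc, genes) for auc, genes in ranked if auc >= 0.8]
```

Passing `k=3` or `k=4` to `scan_combinations` gives the 3- and 4-variable scans; because the AUC is computed on the same samples used for fitting, it measures fit rather than out-of-sample performance.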
2. Results
In the multivariate logistic regression analysis and ROC analysis using any 2 variables from the 117 IGH genes, 30 (0.44%) of the 6786 combinations in total showed AUC values of 0.8 or more (Table 12). The IGH genes used in these combinations numbered 26 (Table 13). With any 3 variables, 4469 (1.7%), 349 (0.13%), and 4 (0.0015%) of the 260130 combinations showed AUC values of 0.8 or more, 0.85 or more, and 0.9 or more, respectively (Table 12). The IGH genes used in these combinations numbered 117 (100%), 102 (87%), and 6 (5.1%), respectively (Table 13). With 4 variables, 3264 combinations (0.044%) showed AUC values of 0.9 or more, 99458 (1.4%) showed 0.85 or more, and 489529 (6.6%) showed 0.8 or more. With 4 variables, each of the 117 genes was used in at least one combination showing an AUC of 0.8 or more. The IGH genes that can be used in the predictive identification of ME/CFS are shown in Table 14. Combinations of genes with AUC values of 0.9 or more, which are considered to exhibit high prediction performance, are shown in Tables 15 and 16.
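The totals and percentages quoted above follow from the number of ways of choosing 2, 3 or 4 genes out of 117. A short Python check is given below; the helper name `share` is an assumption introduced for this sketch.

```python
from math import comb

N_GENES = 117  # 74 IGHV + 32 IGHD + 6 IGHJ + 5 IGHC


def share(hits, k, n=N_GENES):
    """Percentage of the C(n, k) possible k-gene combinations that `hits` represents."""
    return 100.0 * hits / comb(n, k)


# Number of distinct 2-, 3- and 4-gene combinations.
totals = {k: comb(N_GENES, k) for k in (2, 3, 4)}

# Shares reported in the text for AUC >= 0.8 (2 genes), >= 0.8 (3 genes), >= 0.9 (4 genes).
two_gene_share = share(30, 2)
three_gene_share = share(4469, 3)
four_gene_share = share(3264, 4)
```

The 4-gene total (7413705) is not stated in the text, but it reproduces the reported 0.044%, 1.4% and 6.6%.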
[ Table 12]
TABLE 12 Number and frequency of IGH gene combinations showing high AUC values
Figure BDA0003010544010000541
[ Table 13]
TABLE 13 Number and frequency of IGH genes used in combinations showing high AUC values
Figure BDA0003010544010000542
[ Table 14-1]
TABLE 14 IGH genes that can be used in predictive identification of ME/CFS diseases
Figure BDA0003010544010000551
[ Table 14-2]
Figure BDA0003010544010000561
[ Table 15]
TABLE 15 Combinations of 3 IGH genes with high prediction performance (AUC value of 0.9 or more)
Figure BDA0003010544010000562
[ Table 16-1]
TABLE 16 Combinations of 4 IGH genes with high prediction performance (AUC value of 0.9 or more) (top 100 by AUC value)
Figure BDA0003010544010000571
[ Table 16-2]
Figure BDA0003010544010000581
[ Table 16-3]
Figure BDA0003010544010000591
[ example 6]
Multivariate logistic regression analysis based on combination of variables
1. Method
Multivariate logistic regression analysis was performed on combinations of 2, 3 and 4 variables drawn from the 46 variables (35 IGH, 8 cell subsets and 3 diversity indices) that satisfied the significance level P < 0.2 in the single regression analysis. Using the predicted values of the dependent variable of each regression equation thus obtained, an ROC curve was prepared and the AUC value was calculated. Single regression and logistic regression analyses were performed with glm() in R, and ROC analysis with the R package pROC.
2. Results
In the multivariate logistic regression analysis and ROC analysis using any 2 of the 46 selected variables, 102 combinations (9.9%) showed AUC values of 0.8 or more, and 382 combinations (36.9%) showed AUC values of 0.7 or more and less than 0.8 (Table 17). With 3 variables, 85 combinations (0.6%) showed AUC values of 0.9 or more, 3472 (22.9%) showed 0.8 or more and less than 0.9, and 8826 (58.1%) showed 0.7 or more and less than 0.8. With 4 variables, 5330 combinations (3.3%) showed AUC values of 0.9 or more, 63981 (39.2%) showed 0.8 or more and less than 0.9, and 87291 (53.5%) showed 0.7 or more and less than 0.8. These results suggest that combining multiple variables improves the performance of ME/CFS prediction. Table 18 shows the combinations of 2 variables with AUC values of 0.8 or more, and Table 19 shows the combinations of 3 variables with AUC values of 0.9 or more. Combinations including Treg predominate, suggesting that including Treg among the variables raises prediction performance.
[ Table 17]
TABLE 17 Number of combinations showing high AUC values among the 46 selected variables
Figure BDA0003010544010000601
[ Table 18-1]
TABLE 18 ME/CFS patient prediction Performance based on any 2 variable combination from 46 variables (AUC ≧ 0.8)
Figure BDA0003010544010000611
[ Table 18-2]
Figure BDA0003010544010000621
[ Table 18-3]
Figure BDA0003010544010000631
[ Table 18-4]
Figure BDA0003010544010000632
*1: IGH/FCM: combination of IGH variables and cell subset variables; IGH: IGH variables only; FCM: cell subset variables only
*2: Treg is underlined.
[ Table 19-1]
TABLE 19 prediction of ME/CFS patients based on any 3 variable combination from 46 variables (AUC ≧ 0.9)
Figure BDA0003010544010000641
[ Table 19-2]
Figure BDA0003010544010000651
[ Table 19-3]
Figure BDA0003010544010000661
*1: IGH/FCM: combination of IGH variables and cell subset variables; IGH: IGH variables only; FCM: cell subset variables only
*2: Treg is underlined.
Note that, among combinations of cell subset variables with each other, the highest AUC value was 0.87 (Treg, Tfh17), whereas combining a cell subset variable with an IGH gene gave AUC values higher than 0.87 for Treg + IGHV1-3, Treg + IGHV3-23, and Treg + IGHGP. In addition, the B cell count gave higher AUC values when combined with IGHV3-49, IGHV3-30-3, or IGHD1-26 than its maximum AUC of 0.84 obtained with Treg. Thus, additionally using specific IGH genes alongside cell subset variables can yield higher predictive performance than combining cell subset variables with each other.
[ example 7]
Discrimination of ME/CFS patients from patients with other diseases
1. Method
In addition to the samples from the 37 myalgic encephalomyelitis/chronic fatigue syndrome (ME/CFS) patients analyzed in Example 1, whole blood was newly collected from 10 patients with multiple sclerosis (MS). PBMC and RNA were isolated according to the method described in the Materials and Methods of Example 1. Subsequently, complementary-strand DNA and double-stranded complementary DNA were synthesized, PCR and next-generation sequencing were performed, and the data were analyzed with repertoire analysis software. The V region (IGHV), D region (IGHD), J region (IGHJ) and C region (IGHC) sequence assignments and the CDR3 sequence were determined from the read data of each sample. The number of unique reads in each sample was counted, and the frequency of use was calculated for 74 IGHV, 32 IGHD, 6 IGHJ and 5 IGHC. The read counts and the frequencies of use of IGHV, IGHD, IGHJ and IGHC were tested for significant differences between the ME/CFS and MS groups using the Mann-Whitney test. Using the frequency-of-use data of the ME/CFS patients and MS patients, single regression and logistic regression analyses were performed with glm in R, and ROC analysis with the R package pROC. For ROC analysis, receiver operating characteristic (ROC) curves were prepared using the predicted values of the dependent variable of the regression equation. The area under the ROC curve (AUC) was obtained as the performance measure for prediction with these variables.
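The frequency-of-use calculation can be illustrated with a minimal Python sketch. It assumes that the frequency of a gene is its share of the unique reads assigned within one gene class (here, V genes) of a sample, expressed as a percentage; the read counts and the function name `usage_frequency` are invented for illustration.

```python
def usage_frequency(unique_reads):
    """Convert per-gene unique read counts into percent frequency of use."""
    total = sum(unique_reads.values())
    if total == 0:
        raise ValueError("no reads assigned")
    return {gene: 100.0 * n / total for gene, n in unique_reads.items()}


# Invented unique read counts for three V genes in one sample.
counts = {"IGHV3-23": 50, "IGHV3-30": 30, "IGHV1-3": 20}
freq = usage_frequency(counts)
```

The resulting per-sample percentages are the values compared between groups and entered into the regression models as variables.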
2. Results
2-1. Differences in IGH gene usage frequency between ME/CFS and MS patients
Sequencing data of 14 to 28 million reads were obtained from the 10 MS patient samples (Table 20). The numbers of total reads, assigned reads, in-frame reads, and unique reads did not differ significantly between the ME/CFS patients and the MS patients. The frequency of use of IGHV3-7, IGHV3-33, IGHV3-73, IGHV3-NL1, IGHV4-28, IGHV4-39, IGHD4-17, IGHD5-5, and IGHD5-18 was significantly higher in MS patients than in ME/CFS patients (Mann-Whitney test: *P < 0.05, **P < 0.01). On the other hand, the frequency of use of IGHV3-23 and IGHV3-23D was significantly lower in MS patients than in ME/CFS patients (Fig. 9). Similarly, the diversity indices (Shannon index, inverse Simpson index, Pielou index, DE50 index), which are variables other than IGH genes, were examined for differences between the ME/CFS patients and the MS patients (Fig. 10). No significant difference was seen in any of the diversity indices. These results suggest that the diversity indices are not useful for distinguishing ME/CFS patients from MS patients, whereas the frequency of use of IGH genes reflects disease specificity and is effective for this distinction.
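For reference, three of the diversity indices named above can be computed from clone read counts as in the following Python sketch. It uses the standard textbook definitions (Shannon index with the natural logarithm, inverse Simpson index, Pielou evenness), which are assumed here and may differ in detail from the definitions used in the analysis; the DE50 index is omitted because its definition is not given in this section. The function names are hypothetical.

```python
import math


def clone_fractions(counts):
    """Relative abundance of each clone, ignoring clones with zero reads."""
    total = sum(counts)
    return [c / total for c in counts if c > 0]


def shannon_index(counts):
    """H = -sum(p_i * ln p_i)."""
    return -sum(p * math.log(p) for p in clone_fractions(counts))


def inverse_simpson_index(counts):
    """1 / sum(p_i^2)."""
    return 1.0 / sum(p * p for p in clone_fractions(counts))


def pielou_index(counts):
    """Shannon index divided by its maximum ln(S) for S observed clones."""
    s = sum(1 for c in counts if c > 0)
    return shannon_index(counts) / math.log(s) if s > 1 else 0.0


even = [25, 25, 25, 25]   # four equally abundant clones
skewed = [97, 1, 1, 1]    # one dominant clone
```

A perfectly even repertoire such as `even` maximizes all three indices, while a repertoire dominated by one clone such as `skewed` lowers them.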
[ Table 20]
TABLE 20 number of sequencing reads in MS patients
Figure BDA0003010544010000681
2-2. Identification between ME/CFS and MS patients using multivariate logistic analysis
Next, it was verified whether ME/CFS patients and MS patients could be distinguished by multivariate logistic analysis of the above 11 IGH genes for which significant differences were detected between ME/CFS patients and MS patients. Logistic regression analysis was performed using the frequency of use of any 2 or any 3 of the 11 IGH genes, and AUC values were calculated. Combinations with high AUC values are listed in Tables 21 and 22 as combinations of IGH genes with excellent prediction performance.
[ Table 21]
TABLE 21 identification based on any 2-variable combination of IGH genes for which significant differences were observed between ME/CFS patients and MS patients (AUC ≧ 0.8)
Figure BDA0003010544010000691
[ Table 22]
TABLE 22 identification based on any 3-variable combination of IGH genes for which significant differences were observed between ME/CFS patients and MS patients (AUC ≧ 0.85)
Figure BDA0003010544010000701
Next, using any 2 or 3 variables from the IGH genes shown to distinguish healthy persons from ME/CFS patients, it was investigated whether ME/CFS patients and MS patients could be distinguished by multivariate logistic analysis (Tables 23 to 30). Many combinations with high AUC values, indicating high performance of the multivariate logistic model, were obtained, showing that ME/CFS patients and MS patients can be predicted with high specificity and sensitivity. Among the 35 IGH genes showing differences at p < 0.2 between healthy persons and ME/CFS patients, 48 combinations of any 2 variables showed AUC ≧ 0.8, and 21 combinations of any 3 variables showed AUC ≧ 0.9. Among the 18 IGH genes showing differences at p < 0.1 between healthy persons and ME/CFS patients, 17 combinations of any 2 variables (AUC ≧ 0.8) and 21 combinations of any 3 variables (AUC ≧ 0.875) showed high values. Among the 8 IGH genes showing differences at p < 0.05 between healthy persons and ME/CFS patients, 7 combinations of any 2 variables (AUC ≧ 0.8) and 22 combinations of any 3 variables (AUC ≧ 0.8) showed high values. Among the 6 IGH genes tested in Example 3-1, 1 combination of any 2 variables (AUC ≧ 0.7) and 11 combinations of any 3 variables (AUC ≧ 0.7) showed high values. From the above results, it is clear that ME/CFS patients and MS patients can be identified with high accuracy by using the frequency of use of one or more IGH genes.
[ Table 23-1]
TABLE 23 identification of ME/CFS and MS based on the combination of any 2 variables from IGH showing significant differences in p <0.2 between healthy and ME/CFS patients (AUC ≧ 0.8)
Figure BDA0003010544010000711
[ Table 23-2]
Figure BDA0003010544010000721
[ Table 24]
TABLE 24 identification of ME/CFS and MS based on the combination of any 3 variables from IGH showing significant differences in p <0.2 between healthy and ME/CFS patients (AUC ≧ 0.9)
Figure BDA0003010544010000722
[ Table 25]
TABLE 25 identification of ME/CFS and MS based on the combination of any 2 variables from IGH showing significant differences in p <0.1 between healthy and ME/CFS patients (AUC ≧ 0.8)
Figure BDA0003010544010000731
[ Table 26]
TABLE 26 identification of ME/CFS and MS based on the combination of any 3 variables from IGH showing significant differences in p <0.1 between healthy and ME/CFS patients (AUC ≧ 0.85)
Figure BDA0003010544010000741
[ Table 27]
TABLE 27 identification of ME/CFS and MS based on the combination of any 2 variables from the IGH gene showing significant differences in p <0.05 between healthy and ME/CFS patients (AUC ≧ 0.8)
Figure BDA0003010544010000742
[ Table 28]
TABLE 28 identification of ME/CFS and MS based on the combination of any 3 variables from IGH showing significant differences in p <0.05 between healthy and ME/CFS patients (AUC ≧ 0.8)
Figure BDA0003010544010000751
[ Table 29]
TABLE 29 identification of ME/CFS and MS based on the combination of any 2 variables from the 6 IGHs tested in example 3-1 (AUC ≧ 0.7)
Figure BDA0003010544010000752
[ Table 30]
TABLE 30 identification of ME/CFS and MS based on the combination of any 3 variables from the 6 IGHs tested in example 3-1 (AUC ≧ 0.7)
Figure BDA0003010544010000761
2-3. Identification between ME/CFS patients and MS patients based on multivariate logistic analysis using any IGH genes
Multiple logistic analysis was performed using every combination of 2 IGH genes (6786 combinations in total) from the total of 117 IGH genes (74 IGHV, 32 IGHD, 6 IGHJ and 5 IGHC), and the IGH showing high AUC values, indicating high predictive performance, were selected (Table 31). The numbers of combinations showing AUC ≧ 0.9, AUC ≧ 0.8 and AUC ≧ 0.7 were 4, 252 and 1879, respectively. The 2 variables IGHD3-3 and IGHGP showed the highest AUC, 0.92. Next, multiple logistic analysis was performed using every combination of 3 IGH genes from the 117 IGH genes (260130 combinations in total), and the IGH showing high AUC values, indicating high predictive performance, were selected (Table 32).
[ Table 31-1]
TABLE 31 Identification of ME/CFS and MS based on any 2 IGH gene combinations (top 50)
Figure BDA0003010544010000771
[ Table 31-2]
Figure BDA0003010544010000781
[ Table 32-1]
TABLE 32 Identification of ME/CFS and MS based on any 3 IGH gene combinations (top 50)
Figure BDA0003010544010000791
[ Table 32-2]
Figure BDA0003010544010000801
2-4. Comparison of ME/CFS patients and non-ME/CFS patients
To investigate which IGH genes distinguish ME/CFS patients from non-ME/CFS patients, the 23 healthy persons and the 10 MS patients were combined into a non-ME/CFS group, and the frequency of use of the IGH genes was compared with that of the group of 37 ME/CFS patients. The Mann-Whitney test between the 2 groups showed that the ME/CFS patient group was significantly higher than the non-ME/CFS group for IGHV3-49 (P = 0.0013), IGHV3-30-3 (P = 0.0022), IGHD1-26 (P = 0.0053), IGHV4-34 (P = 0.0118), IGHV3-30 (P = 0.186), IGHV4-31 (P = 0.0205), IGHV3-64 (P = 0.0286), IGHJ6 (P = 0.0304), and IGHV5-10-1 (P = 0.0373). On the other hand, the non-ME/CFS group was significantly higher for IGHGP (P = 0.0061), IGHD3-22 (P = 0.0313), IGHV3-33 (P = 0.0332), and IGHV3-73 (P = 0.0332) (Table 33, Fig. 11).
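The group comparison above relies on the Mann-Whitney test. A minimal Python sketch of the statistic is given below; it uses average ranks for ties and a two-sided normal approximation without tie or continuity correction, so its P values are only approximate and will not exactly match those of a statistics package. The function name and the sample values are assumptions for illustration.

```python
import math


def mann_whitney_u(a, b):
    """Return (U statistic of sample a, two-sided P value by normal approximation)."""
    pooled = sorted((v, g) for g, sample in enumerate((a, b)) for v in sample)
    ranks = [0.0] * len(pooled)
    i = 0
    while i < len(pooled):
        j = i
        # Extend j to the end of the run of tied values starting at i.
        while j + 1 < len(pooled) and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        for k in range(i, j + 1):
            ranks[k] = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        i = j + 1
    rank_sum_a = sum(r for r, (_, g) in zip(ranks, pooled) if g == 0)
    n1, n2 = len(a), len(b)
    u = rank_sum_a - n1 * (n1 + 1) / 2
    mean = n1 * n2 / 2
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mean) / sd
    return u, math.erfc(abs(z) / math.sqrt(2))


# Invented frequencies of use (%) of one gene in two small groups.
u_stat, p_value = mann_whitney_u([1.0, 2.0, 3.0], [4.0, 5.0, 6.0])
```

Dividing U by the product of the two group sizes gives the single-variable AUC for ranking one group above the other, which links this test to the ROC analysis used elsewhere in this example.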
[ Table 33]
TABLE 33 IGH genes for which significant differences were detected between the ME/CFS and non-ME/CFS groups
Figure BDA0003010544010000811
Mann-Whitney test
2-5. Identification between ME/CFS and non-ME/CFS patients based on multivariate logistic analysis using any IGH genes
Multiple logistic regression analysis was performed using any 2 variables from the total of 117 IGH genes (74 IGHV, 32 IGHD, 6 IGHJ and 5 IGHC; 6786 combinations), and the IGH showing high AUC values, indicating high predictive performance for discriminating between ME/CFS patients and non-ME/CFS patients, were selected (Table 34). IGH showing high AUC values when 3 variables were used (260130 combinations in total) were selected in the same manner (Table 35). With any 2 variables, 5 combinations showed AUC ≧ 0.8 and 469 showed AUC ≧ 0.7. With any 3 variables, 37 combinations showed AUC ≧ 0.85 and 1164 showed AUC ≧ 0.8. With 2 variables, the combination of IGHGP and IGHV3-30-3 showed the highest AUC of 0.820, and with 3 variables, the combination of IGHGP, IGHV3-30, and IGHV3-49 showed the highest AUC of 0.889.
[ Table 34-1]
TABLE 34 identification of ME/CFS and non-ME/CFS based on any combination of 2 variables (top 50)
Figure BDA0003010544010000821
[ Table 34-2]
Figure BDA0003010544010000831
[ Table 35-1]
TABLE 35 identification of ME/CFS and non-ME/CFS based on any combination of 3 variables (top 50)
Figure BDA0003010544010000841
[ Table 35-2]
Figure BDA0003010544010000851
3. Prediction discrimination model for ME/CFS differential diagnosis
For diagnosis, a predictive model equation for identifying ME/CFS patients is constructed from the frequency-of-use data of IGH genes obtained by BCR repertoire analysis. Discrimination models based on logistic regression equations were prepared using combinations of IGH genes selected by the significant difference test between 2 groups or showing high AUC values in the multivariate logistic analysis. Examples for differential diagnosis are shown below.
The combination of variables may be determined, for example, as follows.
(1) Healthy persons and ME/CFS patients are compared, and variables (for example, the frequency of use of genes in the IgGH chain variable region of the BCR) for which significant differences are detected are selected; or
(2) A logistic regression model equation is obtained by univariate logistic analysis using 1 variable (for example, 1 gene in the IgGH chain variable region of the BCR) as the independent variable, or by multivariate logistic analysis using 2 or more variables (for example, 2 or more genes) as independent variables. ROC analysis is then performed to measure the fit of the regression model, and the variables showing higher AUC values are selected.
Once a combination of variables is given, univariate or multivariate logistic regression analysis is performed with ME/CFS patient (y = 1) or healthy person (y = 0) as the target variable and the individual variables (for example, gene frequency data) as x1, x2, x3, …. In the case of (2), the logistic analysis is performed exhaustively over combinations of multiple variables.
The logistic regression analysis can use glm, the generalized linear model function in R, but the analysis is not limited to this package. In the logistic regression analysis, letting π be the probability of ME/CFS, the following Logit model equation is fitted, and the constant b0 and the partial regression coefficients b1 to bp are obtained at the same time. That is, the coefficients of the Logit model equation can be determined from a dataset of patients and healthy persons whose status is known, together with their variables (gene frequencies, etc.).
[ mathematical formula 1]
Figure BDA0003010544010000861
π: probability of ME/CFS; b0: constant; b1 to bp: partial regression coefficients
Once the constant and the partial regression coefficients have been determined, the probability π of ME/CFS can be obtained by the following formula. When frequency data are newly obtained, discrimination prediction can be performed by inputting the values (for example, a value of 0.5 or more is judged as ME/CFS).
[ mathematical formula 2]
Figure BDA0003010544010000862
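Assuming the Logit model takes the standard form log(π/(1 − π)) = b0 + b1x1 + … + bpxp, so that π = 1/(1 + exp(−(b0 + b1x1 + … + bpxp))), the discrimination step can be sketched in Python as follows. The coefficient values are placeholders chosen for illustration, not the fitted values of any discriminant in this specification, and the function names are likewise assumptions.

```python
import math


def mecfs_probability(intercept, coeffs, freqs):
    """pi = 1 / (1 + exp(-(b0 + sum_j b_j * x_j))) for one sample."""
    z = intercept + sum(b * freqs[gene] for gene, b in coeffs.items())
    return 1.0 / (1.0 + math.exp(-z))


def classify(pi, threshold=0.5):
    """Judge a sample as ME/CFS when its probability reaches the threshold."""
    return "ME/CFS" if pi >= threshold else "not ME/CFS"


# Placeholder constant and partial regression coefficients (NOT fitted values).
B0 = -2.0
B = {"IGHGP": -0.5, "IGHV3-30-3": 1.5}

# Invented frequency-of-use values (%) for a newly measured sample.
sample = {"IGHGP": 1.0, "IGHV3-30-3": 2.0}
pi = mecfs_probability(B0, B, sample)  # linear predictor = -2.0 - 0.5 + 3.0 = 0.5
label = classify(pi)
```

The 0.5 cut-off mirrors the example given in the text; a different threshold can be passed to `classify` to trade sensitivity against specificity.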
The prediction can also be carried out in two steps: ME/CFS is first predicted by a discriminant that distinguishes ME/CFS from healthy persons, and MS patients are then excluded by a discriminant that distinguishes ME/CFS from MS. In each step, the probability can be predicted by inputting frequency data into a Logit model equation in essentially the same manner as described above.
3-1. one-step prediction model
ME/CFS and non-ME/CFS (healthy people as well as MS patients) were predicted using 1 discriminant. The discrimination is performed in one step by any of the following examples or other combinations of the predictive variables.
[ mathematical formula 3]
(example of using 13 IGHs in which significant differences were observed among 2 groups)
Prediction variables: IGHV3-49, IGHV3-30-3, IGHD1-26, IGHGP, IGHV4-34, IGHV3-30, IGHV4-31, IGHV3-64, IGHJ6, IGHD3-22, IGHV3-33, IGHV3-73, IGHV5-10-1
The discriminant:
Figure BDA0003010544010000871
π: probability of ME/CFS; π/(1 − π): odds
(Example using 2 IGH variables with a high AUC value)
Prediction variables: IGHGP, IGHV3-30-3
The discriminant:
Figure BDA0003010544010000872
(Example using 3 IGH variables with a high AUC value)
Prediction variables: IGHGP, IGHV3-30, IGHV3-49
The discriminant:
Figure BDA0003010544010000873
3-2. two-step prediction model
ME/CFS patients are predicted by first discriminating ME/CFS from healthy persons using discriminant 1 and then excluding patients with other diseases such as MS using discriminant 2. The discrimination is performed in two steps using any of the following examples or other combinations of the predictive variables.
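The two-step flow can be sketched in Python as follows. Both discriminants are represented as a constant plus partial regression coefficients of a logistic model; the numerical values are placeholders and not the fitted coefficients of this specification. The gene pairs are taken from the highest-AUC 2-variable combinations reported in the results, and the function names and the 0.5 threshold are assumptions for illustration.

```python
import math


def logistic(intercept, coeffs, freqs):
    """Probability from a logistic discriminant: 1 / (1 + exp(-(b0 + sum b_j x_j)))."""
    z = intercept + sum(b * freqs[gene] for gene, b in coeffs.items())
    return 1.0 / (1.0 + math.exp(-z))


def two_step_predict(freqs, disc1, disc2, threshold=0.5):
    """Step 1: ME/CFS vs healthy. Step 2: ME/CFS vs other disease (MS)."""
    if logistic(*disc1, freqs) < threshold:
        return "healthy"
    if logistic(*disc2, freqs) < threshold:
        return "other disease (e.g. MS)"
    return "ME/CFS"


# Placeholder discriminants as (constant, coefficients); NOT fitted values.
DISC1 = (-1.0, {"IGHGP": -1.0, "IGHV3-30-3": 1.0})  # ME/CFS vs healthy
DISC2 = (0.0, {"IGHGP": -1.0, "IGHD3-3": 1.0})      # ME/CFS vs MS

# Invented frequency-of-use values (%) for three samples.
sample_a = {"IGHGP": 0.5, "IGHV3-30-3": 3.0, "IGHD3-3": 2.0}
sample_b = {"IGHGP": 2.0, "IGHV3-30-3": 1.0, "IGHD3-3": 1.0}
sample_c = {"IGHGP": 1.0, "IGHV3-30-3": 4.0, "IGHD3-3": 0.5}
results = [two_step_predict(s, DISC1, DISC2) for s in (sample_a, sample_b, sample_c)]
```

Step 2 could be repeated with further discriminants to exclude additional diseases one at a time.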
1) Discrimination between ME/CFS patient and healthy person (discriminant 1)
[ mathematical formula 4]
(example of using 6 IGHs where significant differences were observed among 2 groups)
Prediction variables: IGHV1-3, IGHV3-30, IGHV3-30-3, IGHD1-26, IGHV3-49, IGHJ6
The discriminant:
Figure BDA0003010544010000874
(Example using 2 IGH variables with a high AUC value)
Prediction variables: IGHGP, IGHV3-30-3
Figure BDA0003010544010000875
(Example using 3 IGH variables with a high AUC value)
Prediction variables: IGHGP, IGHV3-30-3, IGHV3-49
Figure BDA0003010544010000876
2) Exclusion discrimination (discriminant 2) for exclusion of other diseases (MS patients) from ME/CFS patients
[ mathematical formula 5]
Prediction variables: IGHV3-7, IGHV3-23, IGHV3-23D, IGHV3-33, IGHV3-73, IGHV3-NL1, IGHV4-28, IGHV4-39, IGHD4-17, IGHD5-5, IGHD5-18
The discriminant:
Figure BDA0003010544010000881
Discriminant when 2 IGHs are used as prediction variables
Prediction variables: IGHGP, IGHD3-3
Figure BDA0003010544010000882
Discriminant when 3 IGHs are used as prediction variables
Prediction variables: IGHGP, IGHD3-3, IGHD1-14
Figure BDA0003010544010000883
(Note)
As described above, the present invention has been exemplified using its preferred embodiments, but it should be understood that the scope of the present invention is to be construed only by the claims. It is understood that the patents, patent applications, and documents cited in the present specification are incorporated herein by reference to the same extent as if their contents were specifically described in the present specification.
(related application)
This application claims priority from Japanese Patent Application No. 2018-155380 filed on August 22, 2018 and Japanese Patent Application No. 2019-44885 filed on March 12, 2019. The entire contents of these applications are incorporated by reference into this specification for all purposes.
Industrial applicability
The present invention can be used in a diagnostic agent for myalgic encephalomyelitis/chronic fatigue syndrome (ME/CFS).
Sequence Listing free text
Sequence number 1: BSL18E primer
Sequence number 2: p20EA primer
Sequence number 3: p10EA primer
Sequence number 4: CG1 primer
Sequence number 5: CG2 primer
Sequence number 6: p22EA-ST1-R primer
Sequence number 7: CG-ST1-R primer
Sequence listing
<110> National Research and Development Agency National Center of Neurology and Psychiatry
Genesis Genesis Corp.
<120> biomarkers for myalgic encephalomyelitis/chronic fatigue syndrome (ME/CFS)
<130> EC008PCT
<150> JP 2018-155380
<151> 2018-08-22
<160> 7
<170> PatentIn version 3.5
<210> 1
<211> 35
<212> DNA
<213> Artificial sequence
<220>
<223> BSL18E primer
<220>
<221> misc_feature
<222> (35)..(35)
<223> n is a, c, g, t or u
<400> 1
aaagcggccg catgcttttt tttttttttt tttvn 35
<210> 2
<211> 10
<212> DNA
<213> Artificial sequence
<220>
<223> P20EA primer
<400> 2
gggaattcgg 10
<210> 3
<211> 20
<212> DNA
<213> Artificial sequence
<220>
<223> P10EA primer
<400> 3
taatacgact ccgaattccc 20
<210> 4
<211> 21
<212> DNA
<213> Artificial sequence
<220>
<223> CG1 primer
<400> 4
caccttggtg ttgctgggct t 21
<210> 5
<211> 21
<212> DNA
<213> Artificial sequence
<220>
<223> CG2 primer
<400> 5
tcctgaggac tgtaggacag c 21
<210> 6
<211> 55
<212> DNA
<213> Artificial sequence
<220>
<223> P22EA-ST1-R primer
<400> 6
gtctcgtggg ctcggagatg tgtataagag acagctaata cgactccgaa ttccc 55
<210> 7
<211> 54
<212> DNA
<213> Artificial sequence
<220>
<223> CG-ST1-R primer
<400> 7
tcgtcggcag cgtcagatgt gtataagaga cagtgagttc cacgacaccg tcac 54

Claims (43)

1. A method wherein the repertoire of B cell receptors of a subject is used as an index of myalgic encephalomyelitis/chronic fatigue syndrome in said subject, said B cell receptors being denoted BCR and said myalgic encephalomyelitis/chronic fatigue syndrome being denoted ME/CFS.
2. The method according to claim 1, wherein 1 or more variables including the frequency of use of 1 or more genes in the IgGH chain variable region of the BCR of the subject are used as an index of myalgic encephalomyelitis/chronic fatigue syndrome in the subject, the myalgic encephalomyelitis/chronic fatigue syndrome being denoted ME/CFS.
3. The method of claim 2, wherein said 1 or more variables are indicative of the subject suffering from ME/CFS but not from other diseases.
4. The method according to any one of claims 1 to 3, wherein the 1 OR more genes include genes selected from the group consisting of IGHV1-2, IGHV1-3, IGHV1-8, IGHV1-18, IGHV1-24, IGHV1-38-4, IGHV1-45, IGHV1-46, IGHV1-58, IGHV1-69, IGHV1-69-2, IGHV1-69D, IGHV1/OR15-1, IGHV1/OR15-5, IGHV1/OR15-9, IGHV 1/21-1, IGHV2-5, IGHV2-26, IGHV2-70, IGHV 2-70/70D, IGHV2/OR16-5, IGHV 16-7, IGHV 16-9, IGHV 16-11, IGHV 16-13, IGHV 87472-72, IGHV 16-72-16, IGHV 16-72, IGHV3-30, IGHV3-30-3, IGHV3-30-5, IGHV3-33, IGHV3-35, IGHV3-38, IGHV3-38-3, IGHV3-43, IGHV3-43D, IGHV3-48, IGHV3-49, IGHV3-53, IGHV3-64, IGHV3-64D, IGHV3-66, IGHV3-72, IGHV3-73, IGHV3-74, IGHV3-NL 3, IGHV3/OR 3-7, IGHV3/OR 3-6, IGHV3/OR 3-8, IGHV3/OR 3-9, IGHV3/OR 3-10, IGHV3/OR 3-12, IGHV3/OR 3-13, IGHV 573-72-30-72, IGHV 3-72-3, IGHV 3-72-3, IGHV 3-72-3-72, IGHV4-59, IGHV4-61, IGHV4/OR15-8, IGHV5-10-1, IGHV5-51, IGHV6-1, IGHV7-4-1, IGHV7-81, IGHD1-1, IGHD1-7, IGHD1-14, IGHD1-20, IGHD1-26, IGHD1/OR15-1a/b, IGHD2-2, IGHD2-8, IGHD2-15, IGHD2-21, IGHD2/OR 2-2 a/b, IGHD 2-3, IGHD 2-9, IGHD 2-10, IGHD 2-16, IGHD 2-22, IGHD2/OR 2-3 a/b, IGHD 2-4, IGHD-72, IGHD-2-72, IGHD-2/2-72, IGHD 2/2-72 a/b, IGHD 2/2-2, IGHD 2/72-72, IGHD 2-72, IGHD6-6, IGHD6-13, IGHD6-19, IGHD6-25, IGHD7-27, IGHJ1, IGHJ2, IGHJ3, IGHJ4, IGHJ5, IGHJ6, IGHG1, IGHG2, IGHG3, IGHG4, and IGHGP.
5. The method of claim 4, wherein, the 1 OR more genes include at least one selected from the group consisting of IGHV3-73, IGHV1-69-2, IGHV5-51, IGHV4-31, IGHV3-23D, IGHV1/OR15-9, IGHV4-39, IGHD5-12, IGHV3-43D, IGHD4-17, IGHV5-10-1, IGHD4/OR15-4a/b, IGHG4, IGHV1/OR15-5, IGHV3/OR16-9, IGHD1-7, IGHV3-21, IGHV 6-6, IGHV3-33, IGHD4-23, IGHV 4-30-5, IGHV 4-23, IGHD 4-13, IGHV 4-48, IGHV 4-64, IGHV 4-72, IGHV 4-30-72, IGHV 4-72-30-72, IGHV 4-30-72, IGHV 4-3-.
6. The method of claim 5, wherein the 1 or more genes comprise at least 1 gene selected from the group consisting of IGHD6-6, IGHV3-33, IGHD4-23, IGHV3-30-5, IGHV3-23, IGHD6-13, IGHV3-64D, IGHV3-48, IGHV3-64, IGHG1, IGHV3-49, IGHV3-30-3, IGHD1-26, IGHJ6, IGHV3-30, IGHGP, IGHV1-3, and IGHD 3-22.
7. The method of claim 6, wherein the 1 or more genes comprise at least 1 gene selected from the group consisting of IGHV3-49, IGHV3-30-3, IGHD1-26, IGHJ6, IGHV3-30, IGHGP, IGHV1-3, and IGHD 3-22.
8. The method of claim 7, wherein the 1 or more genes comprise at least 1 gene selected from the group consisting of IGHV1-3, IGHV3-30, IGHV3-30-3, IGHV3-49, IGHD1-26, and IGHJ 6.
9. The method of claim 2, wherein the 1 or more variables further comprise the number of 1 or more subpopulations of immune cells.
10. The method of claim 9, wherein the 1 or more variables show an AUC ≧ 0.7 in a regression analysis-based ROC curve used to discriminate between normal control and ME/CFS.
11. The method of claim 10, wherein the 1 or more variables show an AUC ≧ 0.8 in a regression analysis-based ROC curve used to discriminate between normal control and ME/CFS.
12. The method of any one of claims 9-11, wherein the number of immune cell subpopulations is selected from the group consisting of number of B cells, number of naive B cells, number of memory B cells, number of plasmablasts, number of activated naive B cells, number of transitional B cells, number of regulatory T cells, number of memory T cells, number of follicular helper T cells, number of Tfh1 cells, number of Tfh2 cells, number of Tfh17 cells, number of Th1 cells, number of Th2 cells, and number of Th17 cells.
13. The method of claim 2, wherein the 1 or more variables comprise frequency of use of 2 or more genes in the subject's BCR IgGH chain variable region.
14. The method of claim 13, wherein the 1 or more variables show an AUC ≧ 0.7 in a regression analysis-based ROC curve used to discriminate between normal control and ME/CFS.
15. The method of claim 14, wherein the 1 or more variables show an AUC ≧ 0.8 in a regression analysis-based ROC curve used to discriminate between normal control and ME/CFS.
16. The method of claim 15, wherein the 1 or more variables show an AUC ≧ 0.9 in a regression analysis-based ROC curve used to discriminate between normal control and ME/CFS.
17. The method of claim 2, wherein the 1 or more variables comprise a combination of 2 or more variables selected from the group consisting of frequency of use of 1 or more genes in the IgGH chain variable region of the subject's BCR, the subject's BCR diversity index, and the number of 1 or more subpopulations of immune cells of the subject, the 1 or more variables exhibiting an AUC ≧ 0.7 in a regression analysis-based ROC curve used to discriminate between normal control and ME/CFS.
18. The method of claim 17, wherein the 1 or more variables comprise 3 or more variables selected from the group.
19. The method of claim 18, wherein the 1 or more variables comprise 4 or more variables selected from the group.
20. The method of claim 19, wherein the 1 or more variables comprise 5 or more variables selected from the group.
21. The method of claim 20, wherein the 1 or more variables comprise 6 or more variables selected from the group.
22. The method of any one of claims 17 to 21, wherein the 1 or more variables show an AUC of 0.8 or more in a regression analysis based ROC curve used to discriminate between normal controls and ME/CFS.
23. The method of any one of claims 17 to 22, wherein the 1 or more variables show an AUC of 0.9 or more in a regression analysis based ROC curve used to discriminate between normal controls and ME/CFS.
24. The method of claim 2, wherein the 1 or more genes comprise IGHV1-3, IGHV3-30, IGHV3-30-3, IGHV3-49, IGHD1-26, and IGHJ 6.
25. The method of any one of claims 2 to 24, wherein the 1 or more variables comprise the number of B cells of the subject.
26. The method of any one of claims 2 to 25, wherein the 1 or more variables comprise the number of regulatory T cells (denoted Tregs) of the subject.
27. The method of any one of claims 2 to 26, wherein the frequency of use of the 1 or more genes is determined by a method comprising large-scale high-efficiency BCR repertoire analysis.
28. The method of any one of claims 2 to 27, wherein the 1 or more variables comprise the frequency of use of at least 1 gene selected from the group consisting of IGHV3-49, IGHV3-30-3, IGHD1-26, IGHV3-30, IGHJ6, IGHGP, IGHV4-31, IGHV3-64, IGHD3-22, IGHV3-33, IGHV3-73, IGHV5-10-1, and IGHV4-34.
29. The method of any one of claims 1 to 27, comprising:
(a) using a part of the 1 or more variables as an index that the subject has ME/CFS, and
(b) using a part of the 1 or more variables as an index that the subject has ME/CFS rather than another disease.
30. The method of claim 29, wherein (b) is performed a plurality of times for a plurality of other diseases.
31. The method of claim 28 or 29, wherein the other disease comprises multiple sclerosis (denoted MS).
32. A method wherein 1 or more variables, including the frequency of use of 1 or more genes in the IgG H chain variable region of a BCR of a subject, are used as an index that the subject has myalgic encephalomyelitis/chronic fatigue syndrome (denoted ME/CFS) rather than multiple sclerosis.
33. The method according to claim 32, wherein the 1 OR more genes comprise a gene selected from the group consisting of IGHV1-2, IGHV1-3, IGHV1-8, IGHV1-18, IGHV1-24, IGHV1-38-4, IGHV1-45, IGHV1-46, IGHV1-58, IGHV1-69, IGHV1-69-2, IGHV1-69D, IGHV1/OR15-1, IGHV1/OR15-5, IGHV1/OR15-9, IGHV1/OR21-1, IGHV2-5, IGHV2-26, IGHV2-70, IGHV2-70 2/OR 2-5, IGHV2-7, IGHV 2-9, IGHV 2-11, IGHV 2-13, IGHV 2-72, IGHV2-2, IGHV 2-72, IGHV 2-72, IGHV 3623-2-72, IGHV2, IGHV3-30-3, IGHV3-30-5, IGHV3-33, IGHV3-35, IGHV3-38, IGHV3-38-3, IGHV3-43, IGHV3-43D, IGHV3-48, IGHV3-49, IGHV3-53, IGHV3-64, IGHV3-64D, IGHV3-66, IGHV3-72, IGHV3-73, IGHV3-74, IGHV3-NL 3, IGHV3/OR 3-7, IGHV 3/HV 3-6, IGHV3/OR 3-8, IGHV3/OR 3-9, IGHV3/OR 3-10, IGHV3/OR 3-12, IGHV 72/OR 3-13, IGHV3-4, IGHV 3-28, IGHV 30-72, IGHV 3-72-3, IGHV 3-72-3-72, IGHV3-, IGHV4-61, IGHV4/OR15-8, IGHV5-10-1, IGHV5-51, IGHV6-1, IGHV7-4-1, IGHV7-81, IGHD1-1, IGHD1-7, IGHD1-14, IGHD1-20, IGHD1-26, IGHD1/OR15-1a/b, IGHD2-2, IGHD2-8, IGHD2-15, IGHD2-21, IGHD2/OR 2-2 a/b, IGHD 2-3, IGHD 2-9, IGHD 2-10, IGHD 2-16, IGHD 2-22, IGHD2/OR 2-3 a/b, IGHD 2-4, IGHD 2-11, IGHD-72, IGHD-OR 72-72, IGHD-72, IGHD-2-72, IGHD 2/2-72-2, IGHD 2/2-72-2-72, IGHD6-6, IGHD6-13, IGHD6-19, IGHD6-25, IGHD7-27, IGHJ1, IGHJ2, IGHJ3, IGHJ4, IGHJ5, IGHJ6, IGHG1, IGHG2, IGHG3, IGHG4, and IGHGP.
34. The method of claim 32 or 33, wherein the 1 or more genes comprise at least 1 gene selected from the group consisting of IGHV1-3, IGHV3-30, IGHV3-30-3, IGHD1-26, IGHV3-49, and IGHJ6.
35. The method of claim 32, wherein the 1 or more variables comprise the frequency of use of 2 or more genes in the IgG H chain variable region of the subject's BCR.
36. The method of claim 35, wherein the 1 or more variables show an AUC ≧ 0.7 in a regression analysis-based ROC curve used to discriminate between MS and ME/CFS.
37. The method of claim 36, wherein the 1 or more variables show an AUC ≧ 0.8 in a regression analysis-based ROC curve used to discriminate between MS and ME/CFS.
38. The method of claim 37, wherein the 1 or more variables show an AUC ≧ 0.9 in a regression analysis-based ROC curve used to discriminate between MS and ME/CFS.
39. The method of claim 32, wherein the 1 or more variables comprise the frequency of use of 3 or more genes in the IgG H chain variable region of the subject's BCR.
40. The method of claim 39, wherein said 1 or more variables exhibit an AUC ≧ 0.7 in a regression analysis-based ROC curve used to discriminate between MS and ME/CFS.
41. The method of claim 40, wherein said 1 or more variables exhibit an AUC ≧ 0.8 in a regression analysis-based ROC curve used to discriminate between MS and ME/CFS.
42. The method of claim 41, wherein said 1 or more variables exhibit an AUC ≧ 0.9 in a regression analysis-based ROC curve used to discriminate between MS and ME/CFS.
43. The method of claim 31, wherein (b) is performed by the method of any one of claims 32 to 42.
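The AUC thresholds recited in the claims above (≧ 0.7, ≧ 0.8, ≧ 0.9 in a regression analysis-based ROC curve) can be illustrated with a minimal sketch. It assumes logistic regression as the regression analysis and uses made-up gene usage frequencies; the function names (`fit_logistic`, `roc_auc`, `auc_tier`), the learning rate, and the data are hypothetical and are not taken from the patent.

```python
import math

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fit_logistic(X, y, lr=0.5, epochs=2000):
    """Fit logistic regression by batch gradient descent; returns (weights, bias)."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def predict_score(w, b, x):
    """Predicted probability of the positive class (here: ME/CFS)."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)

def roc_auc(scores, labels):
    """Area under the ROC curve, computed as the fraction of
    (positive, negative) pairs ranked correctly; ties count as 0.5."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    if not pos or not neg:
        raise ValueError("both classes are required")
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0
               for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

def auc_tier(auc):
    """Highest claimed threshold (0.9, 0.8 or 0.7) that the AUC meets, else None."""
    for threshold in (0.9, 0.8, 0.7):
        if auc >= threshold:
            return threshold
    return None

if __name__ == "__main__":
    # Made-up usage frequencies of two genes (fractions of reads); illustration only.
    X = [[0.1, 0.5], [0.2, 0.4], [0.8, 0.5], [0.9, 0.6]]
    y = [0, 0, 1, 1]  # 0 = normal control, 1 = ME/CFS (hypothetical labels)
    w, b = fit_logistic(X, y)
    auc = roc_auc([predict_score(w, b, x) for x in X], y)
    print(auc, auc_tier(auc))
```

In practice the AUC would be estimated on subjects not used to fit the model (for example by cross-validation); the sketch scores the training data only to stay short.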
CN201980066179.1A 2018-08-22 2019-08-21 Biomarkers for myalgic encephalomyelitis/chronic fatigue syndrome (ME/CFS) Pending CN112840033A (en)

Applications Claiming Priority (5)

Application Number Priority Date Filing Date Title
JP2018155380 2018-08-22
JP2018-155380 2018-08-22
JP2019-044885 2019-03-12
JP2019044885 2019-03-12
PCT/JP2019/032673 WO2020040210A1 (en) 2018-08-22 2019-08-21 Biomarker for myalgic encephalomyelitis/chronic fatigue syndrome (me/cfs)

Publications (1)

Publication Number Publication Date
CN112840033A (en) 2021-05-25

Family

ID=69592052

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201980066179.1A Pending CN112840033A (en) 2018-08-22 2019-08-21 Biomarkers for myalgic encephalomyelitis/chronic fatigue syndrome (ME/CFS)

Country Status (6)

Country Link
US (1) US20220364170A1 (en)
EP (1) EP3842547A4 (en)
JP (2) JPWO2020040210A1 (en)
CN (1) CN112840033A (en)
TW (1) TW202022121A (en)
WO (1) WO2020040210A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113238061A (en) * 2021-07-09 2021-08-10 中南大学湘雅医院 Kit for indicating myasthenia gravis by CD180 negative B cells and application

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111808195A (en) * 2020-06-30 2020-10-23 中国科学院心理研究所 Method for obtaining B cell antibody gene of anti-N-methyl-D-aspartate receptor encephalitis and research on immune repertoire thereof
CN114381492B (en) * 2021-11-25 2024-04-23 杭州拓宏生物科技有限公司 Myalgia encephalomyelitis marker microorganism and application thereof
CN114517235A (en) * 2021-12-31 2022-05-20 杭州拓宏生物科技有限公司 Myalgic encephalomyelitis marker gene and application thereof

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2015075939A1 (en) * 2013-11-21 2015-05-28 Repertoire Genesis株式会社 T cell receptor and b cell receptor repertoire analysis system, and use of same in treatment and diagnosis
WO2016023077A1 (en) * 2014-08-15 2016-02-18 Griffith University Biological markers
WO2017222056A1 (en) * 2016-06-23 2017-12-28 国立研究開発法人理化学研究所 T-cell receptor and b-cell receptor repertoire analysis system using one-step reverse transcription template-switching pcr

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2015075939A1 (en) * 2013-11-21 2015-05-28 Repertoire Genesis株式会社 T cell receptor and b cell receptor repertoire analysis system, and use of same in treatment and diagnosis
CN106103711A (en) * 2013-11-21 2016-11-09 组库创世纪株式会社 T cell receptor and B cell receptor repertoire analysis system, and use of same in treatment and diagnosis
EP3091074A1 (en) * 2013-11-21 2016-11-09 Repertoire Genesis Incorporation T cell receptor and b cell receptor repertoire analysis system, and use of same in treatment and diagnosis
WO2016023077A1 (en) * 2014-08-15 2016-02-18 Griffith University Biological markers
WO2017222056A1 (en) * 2016-06-23 2017-12-28 国立研究開発法人理化学研究所 T-cell receptor and b-cell receptor repertoire analysis system using one-step reverse transcription template-switching pcr
CN109312327A (en) * 2016-06-23 2019-02-05 国立研究开发法人理化学研究所 T-cell receptor and B-cell receptor repertoire analysis system using one-step reverse transcription template-switching PCR
CN109328235A (en) * 2016-06-23 2019-02-12 国立研究开发法人理化学研究所 One-step reverse transcription template-switching PCR

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
JAIN VAGEESH et al.: "Prevalence of and risk factors for severe cognitive and sleep symptoms in ME/CFS and MS", 《BMC NEUROLOGY》, vol. 17, no. 117, pages 1 - 10, XP055687325, DOI: 10.1186/s12883-017-0896-0 *
MARIA EDILOVA et al.: "Control of lymphocytic leukemia through regulation of TRAF1 protein degradation", 《FOCIS》, vol. 81 *
ONO H et al.: "Dysregulation of T and B cells in myalgic encephalomyelitis/chronic fatigue syndrome", 《JOURNAL OF THE NEUROLOGICAL SCIENCES》, pages 899 - 900 *
尚颖, 曹东平, 贾海英, 刘志国: "Characteristics and evaluation of chronic fatigue syndrome", 中国临床康复 (Chinese Journal of Clinical Rehabilitation), no. 24 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113238061A (en) * 2021-07-09 2021-08-10 中南大学湘雅医院 Kit for indicating myasthenia gravis by CD180 negative B cells and application
CN113238061B (en) * 2021-07-09 2021-10-01 中南大学湘雅医院 Kit for indicating myasthenia gravis by CD180 negative B cells and application

Also Published As

Publication number Publication date
JPWO2020040210A1 (en) 2021-08-10
EP3842547A4 (en) 2022-05-18
JP2024036563A (en) 2024-03-15
WO2020040210A1 (en) 2020-02-27
TW202022121A (en) 2020-06-16
US20220364170A1 (en) 2022-11-17
EP3842547A1 (en) 2021-06-30

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination