WO2022150354A1 - Multiplexed testing of lymphocytes for antigen specificity - Google Patents
- Publication number
- WO2022150354A1 (application PCT/US2022/011275)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- antigens
- antigen
- lymphocyte
- unique
- cell
Classifications
-
- G—PHYSICS
- G01—MEASURING; TESTING
- G01N—INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
- G01N33/00—Investigating or analysing materials by specific methods not covered by groups G01N1/00 - G01N31/00
- G01N33/48—Biological material, e.g. blood, urine; Haemocytometers
- G01N33/50—Chemical analysis of biological material, e.g. blood, urine; Testing involving biospecific ligand binding methods; Immunological testing
- G01N33/53—Immunoassay; Biospecific binding assay; Materials therefor
- G01N33/569—Immunoassay; Biospecific binding assay; Materials therefor for microorganisms, e.g. protozoa, bacteria, viruses
- G01N33/56966—Animal cells
- G01N33/56972—White blood cells
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q1/00—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
- C12Q1/68—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
- C12Q1/6876—Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
- C12Q1/6881—Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for tissue or cell typing, e.g. human leukocyte antigen [HLA] probes
-
- C—CHEMISTRY; METALLURGY
- C40—COMBINATORIAL TECHNOLOGY
- C40B—COMBINATORIAL CHEMISTRY; LIBRARIES, e.g. CHEMICAL LIBRARIES
- C40B30/00—Methods of screening libraries
- C40B30/04—Methods of screening libraries by measuring the ability to specifically bind a target molecule, e.g. antibody-antigen binding, receptor-ligand binding
-
- G—PHYSICS
- G01—MEASURING; TESTING
- G01N—INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
- G01N2333/00—Assays involving biological materials from specific organisms or of a specific nature
- G01N2333/435—Assays involving biological materials from specific organisms or of a specific nature from animals; from humans
- G01N2333/705—Assays involving receptors, cell surface antigens or cell surface determinants
- G01N2333/70503—Immunoglobulin superfamily, e.g. VCAMs, PECAM, LFA-3
- G01N2333/7051—T-cell receptor (TcR)-CD3 complex
Definitions
- the present invention relates generally to identification of lymphocyte receptors that are specific to target antigens. More particularly, the present invention relates to systems and methods of accurately identifying lymphocyte (e.g., B cell or T cell) receptor sequence chains that are specific to one or more antigens or peptides of interest.
- Patent Nos. 10,066,265 and 10,077,478 disclose methods for determining the sequence of one or more lymphocyte receptor chains specific to antigens of interest but fail to disclose systems and methods that can produce accurate lymphocyte receptor chain sequences (e.g., with low false positive/negative rates) specific to one or more target antigens. There exists a need for improved methods and assays for discovering lymphocyte receptor chain sequences that bind to specific antigens in pool-based detection formats and algorithms.
- the invention provides for a method for determining a T cell receptor chain sequence, or a portion thereof, specific for one or more antigens, the method comprising: sorting a plurality of first antigens into a plurality of reaction mixtures, wherein the sorting comprises adding a unique antigen of the plurality of first antigens to a unique subset of the plurality of reaction mixtures, and wherein two unique antigens are not added to any two identical subsets of the plurality of reaction mixtures, contacting each reaction mixture with a biological sample comprising a plurality of T cells, providing a condition for a first activated T cell in at least one reaction mixture of the plurality of reaction mixtures to expand in number such that a plurality of T cell clones is formed, contacting the plurality of T cell clones with a query antigen, separating a second activated T cell and a non-activated T cell from a subset of the plurality of reaction mixtures, wherein the second activated T cell recognizes
- separating the second activated T cell and the non-activated T cell is performed based on a marker, wherein the marker is selected from the group consisting of CD3, CD4, CD8, CD137, OX40, CD25, PD-L1, CD69, CD154, and a combination thereof.
- the T cell receptor chain sequence comprises a receptor chain sequence pair, wherein the receptor chain sequence pair consists of an alpha chain sequence and a beta chain sequence.
- the second activated T cell recognizes the query antigen by binding an MHC complex comprising the query antigen.
- the sorting further comprises applying, using a processor, an error-correcting code configured to determine the unique subset of the plurality of reaction mixtures that the unique antigen is added to.
- the error-correcting code is a superimposed code.
- the detecting comprises applying, using a processor, a decoding algorithm, wherein the decoding algorithm is configured to detect the unique antigen specific for the T cell receptor chain sequence when the T cell receptor chain sequence is not substantially present in at least one reaction mixture of the unique subset of the plurality of reaction mixtures.
- the decoding algorithm is a nearest neighbor algorithm.
- the query antigen is different from any antigen of the plurality of first antigens.
- separating the second activated T cell and the non-activated T cell from the subset of the plurality of reaction mixtures is performed using multimer sorting. In some embodiments, separating the second activated T cell and the non-activated T cell from the subset of the plurality of reaction mixtures is performed using fluorescence-based sorting. In some embodiments, separating the second activated T cell and the non-activated T cell from the subset of the plurality of reaction mixtures is performed using bead-based sorting. In some embodiments, a number of reaction mixtures corresponding to the unique subset of the plurality of reaction mixtures is a function of a number of expected unique antigens that are specific to the T cell receptor chain sequence.
- the plurality of reaction mixtures comprises at least one control reaction mixture, wherein the control reaction mixture does not contain any antigens that are added to the biological sample.
- the detecting further comprises computing a frequency of T cells that express the T cell receptor chain sequence.
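The sorting step described above requires only that no two antigens share an identical subset of reaction mixtures. A minimal sketch of one way to satisfy that constraint follows, assuming every antigen is placed in the same number of mixtures `k`. The function names, the fixed weight `k`, and the lexicographic assignment are illustrative assumptions, not the error-correcting code the patent refers to.

```python
from itertools import combinations
from math import comb


def min_pools(n_antigens: int, k: int) -> int:
    """Smallest number of reaction mixtures n with C(n, k) >= n_antigens."""
    n = k
    while comb(n, k) < n_antigens:
        n += 1
    return n


def assign_pools(antigens, n_pools: int, k: int) -> dict:
    """Map each antigen to a distinct k-subset of reaction-mixture indices."""
    if comb(n_pools, k) < len(antigens):
        raise ValueError("not enough reaction mixtures for this many antigens")
    subsets = combinations(range(n_pools), k)
    # Each antigen receives the next unused subset, so no two antigens
    # can end up in an identical set of mixtures.
    return {antigen: frozenset(next(subsets)) for antigen in antigens}


# Hypothetical antigen labels, for illustration only.
antigens = [f"Ag{i}" for i in range(1, 11)]
n_pools = min_pools(len(antigens), k=2)
design = assign_pools(antigens, n_pools, k=2)
```

Consecutive subsets in this design differ in a single mixture, so it does not by itself tolerate a missed detection; a code with larger pairwise distance would be needed for that.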
- the invention provides for a method for determining a T cell receptor chain sequence, or a portion thereof, specific for one or more antigens, the method comprising: adding a plurality of first antigens to a first reaction mixture, contacting the first reaction mixture with a biological sample comprising a plurality of T cells, providing a condition for a first activated T cell in the first reaction mixture to expand in number such that a plurality of T cell clones is formed, sorting a plurality of query antigens into a plurality of reaction mixtures, wherein the sorting comprises adding a first query antigen of the plurality of query antigens to a unique subset of the plurality of reaction mixtures, wherein two unique query antigens are not added to any two identical subsets of the plurality of reaction mixtures, and wherein the first query antigen is different from any antigen of the plurality of first antigens, contacting each reaction mixture of the plurality of reaction mixtures with a portion of the first reaction mixture comprising the pluralit
- separating the second activated T cell from the subset of the plurality of T cell clones is performed based on a marker, wherein the marker is selected from the group consisting of CD3, CD4, CD8, CD137, OX40, CD25, PD-L1, CD69, CD154, and a combination thereof.
- the T cell receptor chain sequence comprises a receptor chain sequence pair, wherein the receptor chain sequence pair consists of an alpha chain sequence and a beta chain sequence.
- the second activated T cell recognizes the first query antigen by binding an MHC complex comprising the first query antigen.
- the sorting further comprises applying, using a processor, an error-correcting code configured to determine the unique subset of the plurality of reaction mixtures that the first query antigen is added to.
- the error-correcting code is a superimposed code.
- the detecting comprises applying, using a processor, a decoding algorithm, wherein the decoding algorithm is configured to detect the first query antigen specific for the T cell receptor chain sequence when the T cell receptor chain sequence is not substantially present in at least one reaction mixture of the unique subset of the plurality of reaction mixtures.
- the decoding algorithm is a nearest neighbor algorithm.
- separating the second activated T cell from the subset of the plurality of T cell clones is performed using multimer sorting. In some embodiments, separating the second activated T cell from the subset of the plurality of T cell clones is performed using fluorescence-based sorting. In some embodiments, separating the second activated T cell from the subset of the plurality of T cell clones is performed using bead-based sorting. In some embodiments, a number of reaction mixtures corresponding to the unique subset of the plurality of reaction mixtures is a function of a number of expected query antigens that are specific to the T cell receptor chain sequence.
- the plurality of reaction mixtures comprises at least one control reaction mixture, wherein the control reaction mixture does not contain any antigens that are added to the biological sample.
- the detecting further comprises computing a frequency of T cells that express the T cell receptor chain sequence.
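The decoding embodiments above describe recovering the antigen even when the receptor sequence is missing from one of its assigned reaction mixtures. A minimal sketch of nearest neighbor decoding by Hamming distance is shown below. The four-antigen codebook, the `max_distance` cutoff, and the tie handling are assumptions made for illustration; the codebook was chosen so that every pair of codewords differs in four mixtures.

```python
def hamming(a, b) -> int:
    """Number of reaction mixtures in which two presence patterns differ."""
    return sum(x != y for x, y in zip(a, b))


def nearest_neighbor_decode(observed, codebook, max_distance=1):
    """Return the antigen whose codeword is closest to the observed pattern.

    Returns None if the closest codeword is farther than max_distance
    or if two codewords are equally close.
    """
    ranked = sorted((hamming(observed, cw), name) for name, cw in codebook.items())
    best_distance, best_name = ranked[0]
    if best_distance > max_distance:
        return None
    if len(ranked) > 1 and ranked[1][0] == best_distance:
        return None
    return best_name


# Hypothetical codebook: 1 means the antigen was added to that mixture.
codebook = {
    "Ag1": (1, 1, 1, 0, 0, 0),
    "Ag2": (0, 0, 1, 1, 1, 0),
    "Ag3": (1, 0, 0, 0, 1, 1),
    "Ag4": (0, 1, 0, 1, 0, 1),
}

# The receptor was detected in mixtures 0 and 1 but dropped out of mixture 2.
observed = (1, 1, 0, 0, 0, 0)
decoded = nearest_neighbor_decode(observed, codebook)
```

Here `decoded` is `"Ag1"`, because the observed pattern is one mixture away from the Ag1 codeword and at least three away from every other codeword.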
- the invention provides for a method for determining a lymphocyte cell receptor chain sequence, or a portion thereof, specific for at least two unique antigens, the method comprising: sorting a plurality of antigens into a plurality of reaction mixtures, wherein the sorting comprises adding the at least two unique antigens of the plurality of antigens to at least two unique subsets of the plurality of reaction mixtures such that the at least two unique antigens are not added to any two identical subsets of the plurality of reaction mixtures, and wherein the at least two unique subsets are configured to allow a detection of the at least two unique antigens that are specific to the lymphocyte cell receptor chain sequence, contacting each reaction mixture of the plurality of reaction mixtures with a biological sample comprising a plurality of lymphocytes, separating a target lymphocyte from a subset of the plurality of lymphocytes, wherein the target lymphocyte recognizes the at least two unique antigens of the plurality of antigens, sequencing nucleic
- the lymphocyte is a T cell or a B cell. In some embodiments, separating the target lymphocyte is performed using multimer sorting. In some embodiments, the target lymphocyte is a T cell, and wherein separating the T cell is based on a marker selected from the group consisting of CD3, CD4, CD8, CD137, OX40, CD25, PD-L1, CD69, and CD154. In some embodiments, the lymphocyte cell receptor chain sequence comprises a receptor chain sequence pair, and wherein the receptor chain sequence pair consists of two components of a receptor of the target lymphocyte.
- a number of reaction mixtures comprising the at least two unique subsets is a function of a number of expected antigens that are specific to the lymphocyte cell receptor chain sequence.
- the plurality of reaction mixtures comprises at least one control reaction mixture, and wherein the control reaction mixture does not contain any antigens that are added to the biological sample.
- the target lymphocyte recognizes the at least two unique antigens of the plurality of antigens by binding the at least two unique antigens of the plurality of antigens or by binding two or more molecular complexes comprising the at least two unique antigens of the plurality of antigens.
- the detecting further comprises applying, by a processor, a nearest set decoding algorithm configured to determine the at least two unique antigens that are specific to the lymphocyte cell receptor chain sequence. In some embodiments, the detecting further comprises: applying, by a processor, a decoding algorithm, wherein the decoding algorithm is configured to detect the at least two unique antigens that are specific to the lymphocyte cell receptor chain sequence when the lymphocyte cell receptor chain sequence is not substantially present in at least one reaction mixture of the at least two unique subsets of the plurality of reaction mixtures. In some embodiments, the method further comprises assigning a superimposed code to each antigen of the plurality of antigens, wherein the superimposed code is configured to allow a detection of the at least two unique antigens that are specific to the lymphocyte cell receptor chain sequence.
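When one receptor recognizes two antigens, it is expected to appear in the union of the reaction mixtures assigned to both. A minimal sketch of nearest set decoding under that assumption follows: every candidate set of up to two antigens is scored by the Hamming distance between its combined pattern and the observation. The codebook, the `max_set_size` limit, and the choice to return `None` on ties are illustrative assumptions.

```python
from itertools import combinations


def union(codewords):
    """Combined presence pattern of several antigens (bitwise OR)."""
    return tuple(int(any(bits)) for bits in zip(*codewords))


def nearest_set_decode(observed, codebook, max_set_size=2):
    """Return the antigen set whose combined pattern is closest to observed.

    Returns None when two candidate sets are equally close.
    """
    best = None  # (distance, frozenset of antigen names)
    tie = False
    for size in range(1, max_set_size + 1):
        for names in combinations(sorted(codebook), size):
            pattern = union([codebook[n] for n in names])
            d = sum(x != y for x, y in zip(observed, pattern))
            if best is None or d < best[0]:
                best, tie = (d, frozenset(names)), False
            elif d == best[0]:
                tie = True
    return None if tie else best[1]


# Hypothetical codebook; all single and pairwise unions are distinct.
codebook = {
    "Ag1": (1, 1, 0, 0, 0, 0),
    "Ag2": (0, 0, 1, 1, 0, 0),
    "Ag3": (0, 0, 0, 0, 1, 1),
    "Ag4": (1, 0, 1, 0, 1, 0),
}

pair = nearest_set_decode((1, 1, 1, 1, 0, 0), codebook)
ambiguous = nearest_set_decode((1, 1, 1, 0, 0, 0), codebook)
```

With this small codebook the exact pattern decodes to Ag1 and Ag2, while a single dropout produces a tie and is reported as ambiguous, which is why a code with more separation between unions would be preferred in practice.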
- the invention provides for a method for determining a lymphocyte cell receptor chain sequence, or a portion thereof, specific for at least two unique antigens, the method comprising: sorting a plurality of antigens into a plurality of reaction mixtures, wherein the sorting comprises adding the at least two unique antigens of the plurality of antigens to at least two unique subsets of the plurality of reaction mixtures such that the at least two unique antigens are not added to any two identical subsets of the plurality of reaction mixtures, and wherein the at least two unique subsets are configured to allow a detection of the at least two unique antigens that are specific to the lymphocyte cell receptor chain sequence, contacting each reaction mixture with a biological sample comprising a plurality of lymphocytes, separating a target lymphocyte from a subset of the plurality of lymphocytes, wherein the target lymphocyte reacts with the at least two unique antigens of the plurality of antigens, sequencing nucleic acids of the target lymphocyte to
- the lymphocyte is a T cell or a B cell.
- the lymphocyte cell receptor chain sequence comprises a receptor chain sequence pair, and wherein the receptor chain sequence pair consists of two components of a receptor of the target lymphocyte. In some embodiments, the method further comprises contacting at least one reaction mixture of the plurality of reaction mixtures with a query antigen.
- the invention provides for a method for determining a lymphocyte cell receptor chain sequence, or a portion thereof, specific to a unique antigen, the method comprising: sorting a plurality of antigens into a plurality of reaction mixtures, wherein the sorting comprises adding a unique antigen of the plurality of antigens to a unique subset of the plurality of reaction mixtures such that two different unique antigens are not added to the unique subset, contacting each reaction mixture of the plurality of reaction mixtures with a biological sample comprising a plurality of lymphocytes, separating a target lymphocyte from a subset of the plurality of lymphocytes, wherein the target lymphocyte recognizes the unique antigen, after separating the target lymphocyte, sequencing nucleic acids of the target lymphocyte to obtain the lymphocyte receptor chain sequence, wherein the sequencing is performed by single-cell sequencing, and detecting the unique antigen, wherein the detecting comprises: computing a frequency of lymphocyte cells that express the lymphocyte receptor chain sequence.
- the lymphocyte is a T cell or a B cell.
- the target lymphocyte is a T cell, and wherein the T cell is separated based on a marker selected from the group consisting of CD3, CD4, CD8, CD137, OX40, CD25, PD-L1, CD69, CD154, and a combination thereof.
- the lymphocyte cell receptor chain sequence comprises a receptor chain sequence pair, and wherein the receptor chain sequence pair consists of two components of a receptor of the target lymphocyte.
- the detecting further comprises: computing a gene expression value of a gene of the target lymphocyte.
- the plurality of reaction mixtures comprises at least one control reaction mixture, and wherein the control reaction mixture does not contain any antigens that are added to the biological sample.
- the target lymphocyte recognizes the unique antigen by binding the unique antigen or by binding one or more molecular complexes comprising the unique antigen.
- the detecting further comprises applying, by a processor, a nearest set decoding algorithm configured to determine the unique antigen that is specific to the lymphocyte receptor chain sequence.
- the detecting further comprises: applying, by a processor, a decoding algorithm, wherein the decoding algorithm is configured to detect the one or more antigens that are specific to the lymphocyte receptor chain sequence when the lymphocyte cell receptor chain sequence is not substantially present in at least one reaction mixture of the unique subset of the plurality of reaction mixtures, and wherein the at least one reaction mixture comprises the one or more antigens.
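The frequency computation and the control reaction mixture described above can be combined into a presence call per mixture. The sketch below assumes single-cell data has already been reduced to one (mixture, receptor sequence) pair per cell. The labels `TCR_A` and `TCR_B`, the fold-over-control threshold, and the `floor` value are hypothetical and not taken from the source.

```python
from collections import Counter, defaultdict


def clonotype_frequencies(cells):
    """Fraction of cells in each reaction mixture expressing each sequence.

    cells: iterable of (mixture_id, receptor_sequence), one entry per cell.
    """
    counts = defaultdict(Counter)
    for mixture, seq in cells:
        counts[mixture][seq] += 1
    return {
        m: {s: n / sum(c.values()) for s, n in c.items()}
        for m, c in counts.items()
    }


def presence_vector(freqs, seq, mixtures, control, fold=5.0, floor=1e-6):
    """Call a sequence present where it exceeds the control by `fold`."""
    # floor keeps the threshold positive when the control lacks the sequence.
    background = max(freqs.get(control, {}).get(seq, 0.0), floor)
    return tuple(
        int(freqs.get(m, {}).get(seq, 0.0) >= fold * background)
        for m in mixtures
    )


# Toy data with placeholder sequence labels.
cells = (
    [("pool1", "TCR_A")] * 8 + [("pool1", "TCR_B")] * 2
    + [("pool2", "TCR_B")] * 10
    + [("control", "TCR_A")] * 1 + [("control", "TCR_B")] * 9
)
freqs = clonotype_frequencies(cells)
pattern_a = presence_vector(freqs, "TCR_A", ["pool1", "pool2"], "control")
pattern_b = presence_vector(freqs, "TCR_B", ["pool1", "pool2"], "control")
```

The resulting presence pattern is the kind of input a decoding step would compare against the antigen-to-mixture assignment.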
- the invention provides for a method for determining a lymphocyte receptor chain sequence, or a portion thereof, specific for at least one antigen, the method comprising: providing a biological sample comprising a plurality of lymphocytes, extracting a plurality of first antigen presenting cells from the biological sample, dividing the plurality of first antigen presenting cells into a plurality of first reaction mixtures, sorting a plurality of first antigens into the plurality of first reaction mixtures, wherein the sorting comprises adding a unique first antigen of the plurality of first antigens to a unique subset of the plurality of first reaction mixtures, and wherein two unique first antigens are not added to any two identical subsets of the plurality of first reaction mixtures, contacting each first reaction mixture with the biological sample, providing a condition for a first activated lymphocyte in at least one first reaction mixture of the plurality of first reaction mixtures to expand in number such that a plurality of lymphocyte clones is formed, extracting a plurality of
- the lymphocyte is a T cell or a B cell.
- HLA typing of the biological sample to determine a predicted display of at least one antigen of the plurality of first antigens by an MHC molecule present in the biological sample.
- enriching the plurality of lymphocytes prior to sorting the plurality of first antigens into the plurality of first reaction mixtures.
- enriching the plurality of lymphocytes after providing the condition for the first activated lymphocyte to expand in number and prior to extracting the plurality of second antigen presenting cells.
- separating the second activated lymphocyte and the non-activated lymphocyte is performed based on a marker, wherein the marker is selected from the group consisting of CD3, CD4, CD8, CD137, OX40, CD25, PD-L1, CD69, CD154, and a combination thereof.
- the second activated lymphocyte recognizes the query antigen by binding an MHC complex comprising the query antigen.
- the sorting further comprises applying, using a processor, an error-correcting code configured to determine the unique subset of the plurality of first reaction mixtures that the unique first antigen is added to.
- the error-correcting code is a collision free superimposed code configured to allow for detection of at least two unique first antigens specific for the lymphocyte receptor chain sequence.
- the collision free superimposed code is determined by a random search method.
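A minimal Python sketch of a random search for such a code. It assumes "collision free" means that the bitwise OR of the code words of any one or two antigens is distinct from that of every other set of one or two antigens; that reading, the fixed number of pools per antigen (`weight`), and the names `union`, `is_collision_free`, and `random_search` are illustrative assumptions, not definitions taken from the disclosure.

```python
import random
from itertools import combinations

def union(words):
    """Bitwise OR of equal-length code words (the pools hit by a set of antigens)."""
    return tuple(max(bits) for bits in zip(*words))

def is_collision_free(code, max_hits=2):
    """True if every set of 1..max_hits code words has a distinct union."""
    seen = set()
    for size in range(1, max_hits + 1):
        for subset in combinations(code, size):
            u = union(subset)
            if u in seen:
                return False
            seen.add(u)
    return True

def random_search(num_antigens, num_pools, weight, max_tries=1000, seed=0):
    """Draw random constant-weight code words until a collision-free code is found.

    Assumption: each antigen is placed in exactly `weight` pools.
    Returns a list of code words, or None if no code was found in max_tries.
    """
    rng = random.Random(seed)
    for _ in range(max_tries):
        code = []
        for _ in range(num_antigens):
            ones = set(rng.sample(range(num_pools), weight))
            code.append(tuple(1 if i in ones else 0 for i in range(num_pools)))
        if is_collision_free(code):
            return code
    return None
```

For example, `random_search(5, 12, 3)` looks for 5 code words over 12 reaction mixtures with 3 mixtures per antigen; larger antigen sets generally need more mixtures or more tries.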
- the collision free superimposed code consists of: a plurality of prefix codes, wherein a prefix code of the plurality of prefix codes is assigned to the unique first antigen of the plurality of first antigens, wherein the prefix code identifies an overlap set, wherein the prefix code is identical for more than one first antigen of the plurality of first antigens within the overlap set, and a plurality of suffix codes, wherein a suffix code of the plurality of suffix codes is assigned to the unique first antigen of the plurality of first antigens, wherein a combination of the prefix code and the suffix code is distinct for the unique first antigen.
- the detecting comprises applying, using a processor, a decoding algorithm, wherein the decoding algorithm is configured to detect the unique first antigen specific for the lymphocyte receptor chain sequence when the lymphocyte receptor chain sequence is not substantially present in at least one reaction mixture of the unique subset of the plurality of first reaction mixtures.
- the decoding algorithm is a nearest set algorithm.
- the query antigen is different from any antigen of the plurality of first antigens.
- separating the second activated lymphocyte and the non-activated lymphocyte from the subset of the plurality of final reaction mixtures is performed using multimer sorting.
- separating the second activated lymphocyte and the non-activated lymphocyte from the subset of the plurality of final reaction mixtures is performed using fluorescence-based sorting.
- separating the second activated lymphocyte and the non-activated lymphocyte from the subset of the plurality of final reaction mixtures is performed using bead-based sorting.
- a number of reaction mixtures corresponding to the unique subset of the plurality of first reaction mixtures is a function of a number of expected unique first antigens that are specific to the lymphocyte receptor chain sequence.
- the plurality of first reaction mixtures comprises at least one control reaction mixture, wherein the control reaction mixture does not contain any antigens that are added to the biological sample.
- the detecting further comprises computing a frequency of lymphocytes that express the lymphocyte receptor chain sequence.
- FIG. 1 illustrates a flow chart of multiplexing of antigens into samples using an error correcting code that detects errors during demultiplexing.
- FIG. 2 illustrates a flow chart of detection of lymphocytes specific to antigens.
- FIG. 3 illustrates a flow chart of detection of lymphocytes that are expanded by exposure to one or more identified first antigens and are activated by one or more query antigens.
- FIG. 4 illustrates a flow chart of detection of lymphocytes that are expanded by exposure to one or more first antigens and are activated by one or more identified query antigens.
- a “unique antigen” is an antigen with a specific amino acid sequence.
- a “unique antigen” is an antigen derived from a specific epitope which can include multiple related peptides that are derived from that same epitope, and the “unique antigen” can therefore have more than one possible amino acid sequence.
- a lymphocyte is an immune system cell (e.g., T cell or B cell) that displays a receptor.
- “LCR” refers to a lymphocyte cell receptor.
- a lymphocyte receptor chain sequence means the sequence of a portion of a receptor molecule that is most variable (e.g., a CDR3 region).
- a lymphocyte receptor sequence pair is the two chain sequences of an immune receptor’s two components (e.g., for a T cell receptor, the alpha and beta chain sequences; for a B cell receptor, the heavy and light chain sequences).
- a lymphocyte recognizes an antigen when at least one of the lymphocyte’s receptors binds the antigen, when at least one of the lymphocyte’s receptors binds a complex that includes an antigen (e.g., an MHC complex), or when the lymphocyte is activated by its receptor binding the antigen.
- One advantage of the present systems and methods relates to LCR promiscuity. Certain LCR chain sequences will recognize more than one antigen, and those antigens may be contained in different pools (also referred to as reaction mixtures herein). Thus, an LCR sequence discovery algorithm that depends on LCR chain sequences appearing only in the pools/reaction mixtures unique to one antigen may fail to produce accurate results.
- a second advantage of the present systems and methods relates to host lymphocyte activation and non-specific markers. Lymphocytes may display native activation markers when they are isolated from animals or patients in peripheral blood mononuclear cell (PBMC) samples, and thus their activation will not be a consequence of the assay antigens.
- a third advantage of the present systems and methods relates to experimental noise correction.
- a fourth advantage of the present systems and methods relates to LCR chain sequence count calibration.
- the level of lymphocyte cell recognition of an antigen and sequence discovery will vary from assay to assay and person to person.
- a means to normalize LCR chain sequence counts from different assays using control antigens/peptides can facilitate their direct comparison.
- the present disclosure employs coding and an antigen control pool to reduce assay errors introduced by LCR promiscuity, host lymphocyte cell activation, and experimental noise. It also provides LCR chain sequence count calibration to permit comparison of disparate assays.
- pooled assays are used to discover LCR chain sequences that correspond to LCRs displayed by lymphocyte cells that recognize a specific peptide/antigen.
- K refers to the total number of antigens (or peptides) (e.g., 15).
- N refers to the total number of antigen pools (e.g., 7) into which the K antigens (or peptides) are separated.
- antigens are placed into pools in a manner that allows the identification of LCRs on lymphocyte cells that recognize more than one antigen (or peptide).
- antigens are encoded into pools such that LCR chain sequences corresponding to an antigen (or peptide) do not have to appear (or be detected) in all pools where the antigen (or peptide) was present.
- the ability to detect LCRs that recognize antigens (or peptides) without having all corresponding pools that contain the antigen be recognized by lymphocytes with the LCR improves the sensitivity and accuracy of the assay.
- the method begins by distributing a plurality of antigens (also referred to as peptides herein) into a plurality of antigen pools.
- antigens (e.g., antigen 1 to antigen 15 as shown in FIG. 1) are distributed into pools based on a minimum Hamming distance between the binary encodings of the antigen pools where they reside.
- Antigens (peptides) are given numbers from 1 to K (e.g., 1 to 15), and each antigen (peptide) number is encoded into N bits (e.g., each bit labeled as 0 or 1), where N is the total number of antigen pools.
- the N-bit encoding of an antigen number may be called its code word.
- FIG. 1 shows an example of 15 antigens (or peptides) that are each encoded into 7 bits (of 0s and 1s), where 7 is the number of antigen pools.
- an antigen is placed/distributed into a given antigen pool if the bit corresponding to that antigen pool is labeled “1” in the encoding of its number, and the peptide is not placed/distributed into a given antigen pool if the bit corresponding to that antigen pool is labeled “0”, as shown in FIG. 1.
- the encoding of the antigen number uses an error correcting code, such as a Hamming code, to enforce a minimum distance in bit changes between the encodings of two antigen numbers.
- the distance between two encodings as measured by the number of bit differences is called the Hamming distance.
- FIG. 1 shows the use of a “Hamming(7,4)” code that encodes 15 peptides into 7-bit code words (corresponding to 7 antigen pools), resulting in a minimum Hamming distance of 3 (i.e., 4 data bits, 3 parity bits, and 7 total bits corresponding to 7 antigen pools).
- code words which do not place an antigen into at least one pool (i.e., all zeros)
- FIG. 1 does not utilize the all zero code word from the Hamming(7,4) code.
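The encoding described above can be sketched in Python as follows. This is a minimal illustration that assumes a systematic Hamming(7,4) layout (4 data bits followed by 3 parity bits); the bit order, the parity equations, and the names `hamming74`, `hamming_distance`, and `pool_layout` are assumptions made for the example and need not match the layout of FIG. 1.

```python
from itertools import combinations

def hamming74(number):
    """Encode an antigen number 1..15 as a 7-bit code word (tuple of 0/1).

    Assumed layout: 4 data bits (most significant first), then 3 parity bits.
    """
    d = [(number >> i) & 1 for i in (3, 2, 1, 0)]
    p1 = d[0] ^ d[1] ^ d[3]
    p2 = d[0] ^ d[2] ^ d[3]
    p3 = d[1] ^ d[2] ^ d[3]
    return tuple(d + [p1, p2, p3])

def hamming_distance(a, b):
    """Number of bit positions in which two code words differ."""
    return sum(x != y for x, y in zip(a, b))

def pool_layout(num_antigens=15, num_pools=7):
    """Return (code words by antigen number, antigen numbers by pool).

    An antigen goes into pool n when bit n of its code word is 1.
    Numbering starts at 1, so the all-zero code word is never used.
    """
    code = {k: hamming74(k) for k in range(1, num_antigens + 1)}
    pools = {n: [k for k, w in code.items() if w[n - 1] == 1]
             for n in range(1, num_pools + 1)}
    return code, pools

code, pools = pool_layout()
min_dist = min(hamming_distance(a, b)
               for a, b in combinations(code.values(), 2))
```

Under this layout every antigen lands in at least three pools, each pool holds eight antigens, and `min_dist` is 3, so a single wrong pool call still leaves the observed pattern closer to the correct code word than to any other.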
- the use of an error correcting code can improve the sensitivity of the assay by not requiring detection of an LCR chain sequence from a lymphocyte that recognizes an antigen in every pool where the antigen is present.
- the use of an error correcting code improves the accuracy of the assay by allowing the detection in a biological sample of a LCR chain sequence from a lymphocyte that recognizes an antigen in one or more pools where the antigen is not present (i.e., false positive).
- the use of an error correcting code also improves the accuracy of the assay by allowing the lack of detection in a biological sample of a LCR chain sequence from a lymphocyte that recognizes an antigen in one or more pools where the antigen is present (i.e., false negative).
- codes for asymmetric channels can be used when the chance of a “1” occurring by error is higher than the chance of a “0” occurring by error.
- codes for asymmetric channels can be used when the chance of a “0” occurring by error is higher than the chance of a “1” occurring by error.
- a “1” occurs more often than a “0” when the separation of lymphocytes based on various markers is imperfect (i.e., false positive; e.g., occurring at step 203 of FIG. 2).
- a “0” occurs more often than a “1” when there are a small number of lymphocyte cells that recognize an antigen (or peptide), and thus certain pools may have an insufficient number of lymphocyte cells that recognize an antigen (or peptide) to generate a “1” signal (i.e., false negative).
- a “1” occurs more often than a “0” not due to error or chance, but rather when a lymphocyte cell recognizes more than one antigen (or peptide), and thus produces hits in pools associated with both antigens (or peptides). Examples of asymmetric codes that can perform error detection and correction optimally under these circumstances can be found in Kim and Freiman (1959), incorporated by reference in its entirety herein.
- the antigen pools are exposed to a tissue sample (e.g., PBMCs) so that the antigens specific to each antigen pool are exposed to the lymphocytes contained in the tissue sample.
- lymphocyte cells are activated by the antigens and then separated into activated and non-activated cells, and optionally also separated by other markers, as described in greater detail below.
- lymphocyte cells bind the antigens and are then separated into antigen bound and non bound cells, and optionally also separated by other markers, as described in greater detail below.
- the method begins at step 201, in which antigens (e.g., peptides) are separated into a plurality of antigen pools (e.g., antigen pool 1 to antigen pool N) using the methods described herein (e.g., see FIG. 1).
- step 201 further includes creating a control pool (“Control Pool 0” in FIG. 2), which is free of added peptides/antigens (but may include peptides/antigens endogenous to a tissue sample, for example at step 201).
- the same tissue sample is split equally so that each antigen pool and the control pool are exposed to substantially the same tissue sample (e.g., with the same number and distribution of lymphocytes).
- lymphocytes that are activated by the antigen pools are allowed time to expand.
- the antigen pools are separately re-stimulated with a query set of one or more antigens to test if the expanded lymphocytes respond to the query set of antigens.
- An example protocol that stimulates T cells with a first set of antigens and then queries with a second set of antigens is described by Tapia-Calle et al. (2019) “A PBMC-Based System to Assess Human T Cell Responses to Influenza Vaccine Candidates In Vitro.” Vaccines (Basel). 2019 Nov 13;7(4):181, which is incorporated by reference in its entirety herein.
- LCR chain sequences that correspond to lymphocytes that recognize the query antigens are determined using the pool based methods described herein.
- each query antigen is assigned to the same pool as a pre-determined corresponding original pool antigen.
- this assay permits the identification of lymphocyte clones that recognize both sets of antigens. For example, an increase in the frequency of a LCR chain sequence in a subset of the antigen pools in which a first antigen was added means that the LCR chain sequence is specific to that first antigen (since the corresponding lymphocytes were allowed time to expand, resulting in increased frequencies of the LCR sequence in corresponding antigen pools).
- a query antigen is then added to the same set of antigen pools matched to a first antigen. If the same LCR chain sequence is detected in an activated set of lymphocytes from the same group of antigen pools, a conclusion can be drawn that the LCR chain sequence recognizes both the first antigen and the query antigen.
- query antigens are employed to test if a proposed derivative of a natural peptide, included as a first antigen, will cause expansion of lymphocyte clones that are activated by a query peptide (in which the query peptide is the natural peptide corresponding to the derivative of the natural peptide that was used as the first antigen).
- self-peptides are employed as query antigens to test if proposed vaccine peptides (or antigens) in the first antigen pools activate lymphocytes that are also activated by naturally occurring self-peptides (e.g., the query peptides are self-peptides).
- the activated lymphocytes are allowed time to expand.
- the activated and expanded lymphocytes are then separated into pools that are stimulated with a second set of pool specific antigens (e.g., query peptides).
- Lymphocytes are separated into activated and non-activated cells, and optionally also separated by cell type.
- this method is used to test which specific query antigens in the antigen pools are recognized by lymphocytes activated by the first set of antigens.
- adjuvants are added at step 201 when the tissue sample is exposed to antigens (e.g., prior to, simultaneously with, or following exposure to the antigens).
- One example method of using adjuvants is described in Lissina et al. (2016), “Priming of Qualitatively Superior Human Effector CD8+ T Cells Using TLR8 Ligand Combined with FLT3 Ligand” J Immunol. 2016 Jan 1;196(1):256-263, incorporated by reference in its entirety herein.
- antigen specific responses to the use of adjuvants are observed based on the enrichment of LCR chain sequences in specific antigen pools.
- the adjuvants added at step 201 are molecules that provide co-stimulatory signals for lymphocytes (e.g., CD28 agonists, ICOS agonists, IL-2).
- lymphocytes are separated by their binding of antigens, and optionally also separated by lymphocyte cell type or other markers.
- methods of separating T cells based on the binding of their T cell receptors (TCRs) include MHC multimer (multimer) sorting, where a multimer displays a peptide in the context of an MHC molecule (see Klinger, et al., “Multiplex Identification of Antigen-Specific T Cell Receptors Using a Combination of Immune Assays and Immune Receptor Sequencing” PLoS One. 2015 Oct 28;10(10):e0141561).
- a set of fluorescent multimers is used that collectively displays all of the antigens (or peptides) present in a pool when bound by one or more than one MHC molecule.
- a given pool’s cells are then sorted by cells that are specific to the multimers assigned to the pool by fluorescence activated cell sorting (FACS).
- multi-parameter FACS is used to separate each cell by multimer positive and negative cells with the addition of one or more additional markers such as CD4+ (CD4+ T Cell), and CD8+ (CD8+ T Cell), or other desired markers.
- Methods of separating B cells include sorting B cells that are bound to an antigen in a pool, and optionally by their type as determined by cell surface markers or other means known in the art.
- Example methods of sorting B cells based on their binding of antigens are described in Scheid, et al., “A method for identification of HIV gp140 binding memory B cells in human blood” J Immunol Methods. 2009;343(2):65-67 and Zimmermann, et al., “Antigen Extraction and B Cell Activation Enable Identification of Rare Membrane Antigen Specific Human B Cells” Front Immunol. 2019;10:829, which are incorporated by reference herein in their entireties.
- lymphocytes are separated into activated and non-activated cells, and optionally also separated by cell type (e.g., T cell, T cell type).
- activation markers that are specific for activated cells, and/or different cell types, can be used to identify and then separate cells that are activated by an antigen.
- Assays such as Activation Induced Markers (AIM) can be used to identify activation markers (see Bowyer et al. (2018) “Activation-induced markers detect vaccine-specific CD4+ T cell responses not measured by assays conventionally used in clinical trials” Vaccines, 6(3), 50 and Reiss S, et al., (2017) “Comparative analysis of activation induced marker (AIM) assays for sensitive identification of antigen-specific CD4 T cells” PLoS One, 12(10), e0186998, incorporated by reference in their entireties herein).
- Cell markers can be extracellular or intracellular, and cell permeabilization is used to permit antibodies to recognize intracellular markers. For example, activated T cells have been identified by their cell surface OX40+ CD25+ markers using AIM.
- the type of cell that is activated can be further discriminated with other activation markers, including CD3+ (CD3+ T Cell), CD4+ (CD4+ T Cell), and CD8+ (CD8+ T Cell).
- Other T cell activation markers known in the art can be used, including CD137, OX40, CD25, PD-L1, CD69, and CD154.
- Lymphocyte cells can be physically separated by their markers at step 203 to enable the sequencing of the LCR chain sequences (at step 205, discussed in greater details below) in the physically separated cells.
- four separations of T cells result from each pool at step 203: 1) CD8+, Activated, 2) CD8+, Not activated, 3) CD4+, Activated, and 4) CD4+, Not-activated.
- Cell separation can be accomplished with bead-based methods, cell sorting-based methods, or other separation methods known in the art.
- Cell separation is accomplished at step 203.
- cell separation can be two-way, four-way, or more ways.
- one or more separations for each pool are retained.
- Markers used for separation can include cell proteins, antigen epitopes, antigens that are fluorescently tagged, fluorescent antibodies, fluorescent reagents, and other markers known in the art. Marker specific antibodies can be conjugated to beads, the beads can be exposed to a population of cells, and cells containing the selected markers can be physically separated by separating the beads. When the desired cells are positive for more than one antibody, bead selections can be done serially.
- selection antibodies can be conjugated with a fluorescent dye and fluorescence activated cell sorting can be employed.
- antigens are fluorescently tagged, and sorting can be accomplished using this as one marker.
- Multi-parameter flow sorting can permit the separation of cells based on markers such as type (e.g., CD4, CD8) and activation status at the same time.
- all cell separations are retained for each antigen pool.
- four separations of T cells result from each antigen pool: 1) CD8+, Activated, 2) CD8+, Not activated, 3) CD4+, Activated, and 4) CD4+, Not-activated.
- nucleic acids are extracted from each separation of cells and separately amplified using TCR chain (e.g., T cell alpha, T cell beta, or both) or B cell receptor (BCR) chain (e.g., B cell heavy chain, B cell light chain, or both) specific PCR primers for sequencing.
- DNA is extracted from each separation for sequencing.
- RNA is extracted from each separation and converted into DNA by reverse transcription for sequencing.
- control nucleic acid molecules that will be amplified with one or more of the specific PCR primers are added prior to PCR amplification to each separation at one or more pre-determined concentrations to enable precise quantification of the number of LCR chain molecules present.
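A minimal Python sketch of how a spiked-in control could be used to turn read counts into molecule estimates. It assumes the control and the LCR chain templates amplify and sequence with equal efficiency, which is a simplification; the function name `estimate_molecules` and its parameters are hypothetical.

```python
def estimate_molecules(lcr_reads, control_reads, control_molecules_added):
    """Estimate the number of LCR chain molecules in one separation.

    Assumption: reads per input molecule are the same for the control
    template and for the LCR chain template in this separation.
    """
    if control_reads <= 0:
        raise ValueError("control not detected; cannot calibrate this separation")
    reads_per_molecule = control_reads / control_molecules_added
    return lcr_reads / reads_per_molecule
```

For instance, if 1,000 control molecules yield 1,500 reads, an LCR chain sequence with 300 reads in the same separation corresponds to roughly 200 input molecules. Applying the same scaling per separation is one way to compare counts across assays.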
- multiplex PCR is used to simultaneously amplify nucleic acid sequences originating from different LCR chains.
- PCR primers encode bar codes that are contained in all of their product nucleic acid molecules as known in the art (Stahlberg, et al., “Simple multiplexed PCR-based barcoding of DNA for ultrasensitive mutation detection by next-generation sequencing” Nat Protoc. 2017 Apr;12(4):664-682, and Binladen, et al.
- PCR primers include Unique Molecular Identifiers (UMI) to provide more accurate counting of LCR chain molecules as known in the art (Kivioja, et al., “Counting absolute numbers of molecules using unique molecular identifiers” Nat Methods. 2011 Nov 20;9(1):72-4, incorporated by reference in its entirety herein).
- the nucleic acids derived from separations from each pool include a separation specific bar-code when prepared for sequencing in step 204.
- the amplified nucleic acids include a pool specific bar code to permit the mixing of pools for sequencing when prepared in step 204.
- separate nucleic acid primers specific for LCR chains (e.g., alpha or beta) are used that include a chain specific bar code to amplify nucleic acids from each pool for sequencing in step 204.
- molecules corresponding to amplified LCR chains contain a unique molecular identifier (UMI) and three bar codes: a separation specific bar code, an antigen pool specific bar code, and an LCR chain specific bar code (e.g., alpha or beta).
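A minimal Python sketch of UMI-based molecule counting for reads that carry the three bar codes. It assumes the reads have already been demultiplexed into tuples and that UMIs are free of sequencing errors (real pipelines usually collapse near-identical UMIs); the tuple layout and the name `count_molecules` are hypothetical.

```python
from collections import defaultdict

def count_molecules(reads):
    """Count distinct UMIs per (separation, pool, chain, LCR sequence).

    reads: iterable of (separation, pool, chain, umi, lcr_sequence) tuples,
    an assumed pre-parsed form of the bar-coded sequencing reads.
    PCR duplicates share a UMI and are therefore counted once.
    """
    umis = defaultdict(set)
    for separation, pool, chain, umi, lcr_sequence in reads:
        umis[(separation, pool, chain, lcr_sequence)].add(umi)
    return {key: len(seen) for key, seen in umis.items()}
```

Keying the counts by separation, pool, and chain keeps the samples distinguishable after they are mixed on one sequencing run.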
- single-cell based methods are used to sequence LCR chains from one or more separations.
- methods for measuring the RNA transcriptomes of single cells can provide paired sequences of LCR chains (De Simone, et al., “Single Cell T Cell Receptor Sequencing: Techniques and Future Challenges” Front Immunol. 2018 Jul 18;9:1638, Singh, et al., “High-throughput targeted long-read single cell sequencing reveals the clonal and transcriptional landscape of lymphocytes” Nat Commun. 2019 Jul 16;10(1):3120, Stubbington, et al., “T cell fate and clonality inference from single-cell transcriptomes” Nat Methods.
- methods for sequencing the DNA of single cells can be used to produce LCR chain sequencing reads from single cells or a count of the number of cells that contain an LCR chain sequence (Zong, et al., “Genome-wide detection of single-nucleotide and copy-number variations of a single human cell” Science. 2012;338(6114):1622-1626).
- methods for measuring the RNA transcriptomes of single cells can be used that do not require the physical separation of single cells (Rosenberg, et al., “Single-cell profiling of the developing mouse brain and spinal cord with split-pool barcoding” Science.
- methods that provide mRNA transcript levels from single cells can provide transcript levels for genes that indicate lymphocyte activation or other state information that can be used in addition to, or instead of, marker information to separate cells for analysis (Singh, et al., “High-throughput targeted long-read single cell sequencing reveals the clonal and transcriptional landscape of lymphocytes” Nat Commun. 2019 Jul 16;10(1):3120).
- results from single-cell based methods are used in step 205 to determine, for each sequenced LCR chain, the pools in which it is enriched, as described herein.
- the number of cells that contain an LCR chain sequence is used instead of LCR read counts in step 205.
- mRNA transcript levels for genes from single-cell based methods are used to create or augment separations for desired analysis.
- mRNA expression markers include elevated expression of genes characteristic of active tissue resident cytotoxic lymphocytes, such as CCL4, NKG7, GZMA, and GZMK (Singh, et al. 2019).
- expression or other sequencing derived markers from individual cells are used to augment or replace the separation labels (e.g., CD8+ Activated) associated with the physical separation of cells.
- all or a portion of the cells in a pool can be analyzed by single-cell methods without separation by step 203.
- the bar-coded separations are combined for sequencing on a high-throughput sequencer.
- the separations from each pool have their LCRs sequenced using high throughput sequencing technology.
- adequate sequencing depth (number of raw reads from the sequencing instrument)
- decoding proceeds by identifying LCR chain sequences enriched in a desired set of physically separated pools, for example activated CD8+ cells.
- LCR enrichment in a pool is determined by comparing LCR chain read counts observed in a desired separation (e.g., CD8+ Activated) to a function of the read counts observed in one or more other separations for the same pool (e.g., CD8+ Not activated, CD4+ Activated, CD4+ Not Activated).
- LCR enrichment in a pool is determined by comparing LCR chain read counts observed in a desired separation (e.g., CD8+ Activated) to the read counts of control nucleic acid molecules in one or more pools for the desired separation.
- LCR enrichment in a pool is determined by comparing LCR chain read counts observed in a desired separation (e.g., CD8+ Activated) to a function of the read counts for one or more separations (e.g., CD8+ Activated) in one or more pools.
- LCR enrichment in a pool is determined by comparing LCR chain read counts in a desired separation (e.g., CD8+, Activated) to a function of the read counts observed in one or more separations in Control Pool 0 (e.g., CD8+,
- LCR enrichment in a pool is determined by computing a probability that the LCR chain read counts observed in a desired separation (e.g., CD8+ Activated) are drawn from a distribution computed using the read counts for one or more separations (e.g., CD8+ Activated) in one or more pools, and comparing this probability to a predetermined threshold (e.g., using standard deviation of a distribution).
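A minimal Python sketch of the probability-threshold test, assuming the reference read counts are modeled with a normal distribution fitted by mean and standard deviation. The normal model, the one-sided test, the default `alpha`, and the name `is_enriched` are assumptions for illustration; count data may be better served by a Poisson or negative binomial model.

```python
from statistics import NormalDist, mean, stdev

def is_enriched(observed, reference_counts, alpha=0.01):
    """Flag a pool as enriched when the observed count is improbably high.

    observed:         read count of one LCR chain sequence in the target pool
    reference_counts: counts of the same sequence and separation in other
                      pools (at least two values are needed for stdev)
    alpha:            assumed upper-tail probability threshold
    """
    mu, sigma = mean(reference_counts), stdev(reference_counts)
    if sigma == 0:
        # Degenerate reference: any count above the common value stands out.
        return observed > mu
    upper_tail = 1.0 - NormalDist(mu, sigma).cdf(observed)
    return upper_tail < alpha
```

With reference counts of 10, 12, 9, 11, 10, and 8 reads, an observed count of 40 is flagged as enriched while an observed count of 11 is not.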
- LCR enrichment in a target pool is determined by computing the distribution of read counts observed in a desired separation (e.g., CD8+ Activated) in the target pool and comparing this distribution to one or more distributions of read counts observed in one or more separations (e.g., CD8+ Activated) in one or more other pools.
- the enrichment of LCR chains in one or more pools is determined using statistical tests (e.g., Mann-Whitney U test, rank-sum test, Chi-squared test, t-test, ANOVA followed by post hoc tests) or other techniques known in the art when comparing to one or more alternative pools.
- LCR chain read counts are normalized in each pool by dividing by the total number of LCR chain read counts in complementary separations in that pool (e.g., for CD8+ Activated read counts: divide the CD8+ Activated read counts by the total CD8+ Activated plus CD8+ Not Activated read counts).
- LCR chain read counts are normalized in each pool by dividing by the total number of LCR chain read counts in that pool.
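A minimal Python sketch of the per-pool normalization. It assumes the denominator is the total LCR chain read count, over all sequences, in the listed separations of the same pool; the data layout and the name `normalize_counts` are hypothetical.

```python
def normalize_counts(seq_reads, totals, separations):
    """Normalize one LCR chain sequence's read counts pool by pool.

    seq_reads:   {pool: reads of this sequence in the target separation}
    totals:      {pool: {separation: total LCR chain reads}}
    separations: separations whose totals form the denominator, e.g. the
                 target separation plus its complement, or all separations
    """
    normalized = {}
    for pool, reads in seq_reads.items():
        denom = sum(totals[pool][s] for s in separations)
        # A pool with no reads at all contributes a normalized count of 0.
        normalized[pool] = reads / denom if denom else 0.0
    return normalized
```

Passing the CD8+ Activated and CD8+ Not Activated separations gives the complementary-separation variant; passing every separation in the pool gives the whole-pool variant.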
- the pool specific LCR chain read counts are normalized, and the normalized LCR chain read counts for that separation from all pools are clustered into two clusters using clustering methods known in the art (e.g., 2-means clustering).
- the cluster with the smaller average number of normalized read counts is labeled “0” and the cluster with the larger average number of normalized read counts is labeled “1”.
- an LCR chain sequence in a specific pool and separation is assigned a “1” or “0” based on the label of its most likely cluster assignment.
- an LCR chain sequence in a specific pool and separation is assigned a “1” or “0” based on the label of its most likely cluster, determined by maximum posterior probability using Bayesian inference.
- the LCR chain sequences assigned a “1” are considered to have been enriched.
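A minimal Python sketch of the 2-means labeling, assuming it is applied to the normalized read counts of one LCR chain sequence across all pools for one separation. The hard nearest-center assignment stands in for the "most likely cluster" rule (it is not the Bayesian posterior variant), and the name `two_means_labels` is hypothetical.

```python
def two_means_labels(values, iterations=100):
    """Cluster 1-D values into two groups and label the higher-mean group 1.

    values: normalized read counts, one per pool, in pool order.
    Returns a list of 0/1 labels in the same order.
    """
    lo, hi = min(values), max(values)
    if lo == hi:
        return [0] * len(values)  # no spread, so no pool stands out
    labels = []
    for _ in range(iterations):
        # Assign each value to the nearer center (ties go to cluster 0).
        labels = [1 if abs(v - hi) < abs(v - lo) else 0 for v in values]
        zeros = [v for v, lab in zip(values, labels) if lab == 0]
        ones = [v for v, lab in zip(values, labels) if lab == 1]
        new_lo, new_hi = sum(zeros) / len(zeros), sum(ones) / len(ones)
        if (new_lo, new_hi) == (lo, hi):
            break  # centers stopped moving
        lo, hi = new_lo, new_hi
    return labels
```

Reading the labels in pool order gives the binary pattern of pools in which the sequence is treated as enriched.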
- LCR chain sequence enrichment in a pool is determined using the number of cells containing a given LCR chain sequence instead of the number of observed LCR chain sequence read counts as described herein.
- sequencing reads include a cell specific bar code that permits the identification of the number of cells that contain a given LCR chain sequence.
- the number of observed sequencing reads will vary from cell to cell depending on the number of RNA molecules present in the cell that contain an LCR chain sequence.
- cell counts provide a more accurate method of determining the number of cells that contain a LCR chain sequence.
- specific cells that contain a LCR chain sequence can be identified with one or more desired markers.
- variations and errors in the sequencing process that result in different numbers of observed LCR chain sequences for a given cell can be eliminated by using the number of cells that include a given LCR chain sequence (e.g., based on a predetermined threshold of LCR chain sequence detection in a given cell).
- the number of cells containing a LCR chain sequence is used for analysis in steps 205-207 in place of read counts for each LCR chain sequence.
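A minimal Python sketch of counting cells instead of reads. It assumes each read has been reduced to a (cell bar code, LCR chain sequence) pair and that a cell counts only when the sequence is seen at least `min_reads_per_cell` times; that threshold value and the name `cells_per_sequence` are hypothetical.

```python
from collections import Counter

def cells_per_sequence(reads, min_reads_per_cell=2):
    """Count the cells that contain each LCR chain sequence.

    reads: iterable of (cell_barcode, lcr_sequence) pairs.
    A cell contributes one count to a sequence regardless of how many
    reads it produced, provided it meets the detection threshold.
    """
    reads_per_cell = Counter(reads)
    cells = Counter()
    for (_cell, sequence), n in reads_per_cell.items():
        if n >= min_reads_per_cell:
            cells[sequence] += 1
    return dict(cells)
```

A highly expressing cell with many reads and a weakly expressing cell with few reads each count once, which removes the cell-to-cell variation in transcript abundance from the enrichment calculation.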
- bulk sequencing methods are used for read counts which can still produce accurate results.
- read counts or cell counts may be used.
- a binary number corresponding to the LCR chain sequence is determined corresponding to the antigen pools where it is enriched.
- the method proceeds by decoding the binary number with the error correcting code used for encoding (e.g., see FIG. 1).
- a nearest neighbor decoding algorithm as known in the art decodes the binary number into the antigen number with a corresponding code word with the smallest Hamming distance from the binary number. If there is more than one antigen code word with the same smallest distance, the decoding algorithm outputs an error.
- the result of decoding can be a valid antigen number, or it can represent an error.
- the code used for decoding can detect errors when the pattern of enrichment does not correspond to a single antigen/peptide, and can correct errors when LCR chain sequence enrichment is corrupted by noise in samples up to the error correction limit of the code used.
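Nearest neighbor decoding with an error output on ties can be sketched as below. The code word table is a made-up example with minimum Hamming distance 3, and the function names are assumptions chosen for illustration.

```python
def hamming(a, b):
    """Number of positions at which two equal-length bit strings differ."""
    if len(a) != len(b):
        raise ValueError("code words must have the same length")
    return sum(x != y for x, y in zip(a, b))

def nearest_neighbor_decode(observed, code_words):
    """Return the antigen number whose code word is uniquely closest to
    `observed`, or "ERROR" if several code words share the smallest distance.

    code_words: dict mapping antigen number -> code word (bit string).
    """
    distances = {antigen: hamming(observed, word)
                 for antigen, word in code_words.items()}
    smallest = min(distances.values())
    winners = [antigen for antigen, d in distances.items() if d == smallest]
    return winners[0] if len(winners) == 1 else "ERROR"

# Hypothetical code for three antigens in six pools (minimum distance 3).
code_words = {1: "111000", 2: "000111", 3: "101101"}

print(nearest_neighbor_decode("110000", code_words))  # 1 (one pool missed)
print(nearest_neighbor_decode("010010", code_words))  # ERROR (tie between 1 and 2)
```

With a minimum distance of 3, a single corrupted pool is corrected, while an enrichment pattern equidistant from two code words is reported as an error rather than guessed.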
- a nearest set decoding algorithm as described herein decodes the binary number into one or more antigen numbers.
- the result of the methods described herein is the output of LCR sequences enriched for each antigen (e.g., peptide) in each antigen pool.
- the decoding of antigen number(s) corresponding to an LCR chain sequence is based on the number of read counts of the LCR chain sequence in all pools, and these read counts are interpreted by a machine learning classifier (e.g., a neural network or other statistical model) that has been trained on examples of the code employed for placing antigens (peptides) in pools.
- the decoding of the antigen number(s) corresponding to a LCR chain sequence is based on the number of reads of the LCR chain sequence in all pools, and a maximum a posteriori estimator of the best antigen number(s) for the LCR chain sequence is employed.
- the method of the present disclosure includes any combination of one or more of steps 201-207.
- unique TCR chain sequences corresponding to alpha and beta chains are independently decoded for a desired separation.
- unique BCR chain sequences corresponding to BCR heavy and light chains are independently decoded for a desired separation.
- when the same antigen number is decoded for a TCR alpha and a TCR beta chain sequence, and only one alpha chain sequence and one beta chain sequence decode into that antigen number, they are considered to have originated from the same TCR alpha-beta receptor sequence pair that is associated with that antigen.
- all of the TCR alpha and TCR beta chain sequences that decode to the same antigen number are ranked in each pool by their read counts where one rank list is created for alpha chains, and one for beta chains.
- if a TCR alpha chain and a TCR beta chain sequence in each pool have the same pool specific rank order of read counts in the alpha and beta chain rank lists, they are considered to have originated from the same TCR alpha-beta receptor sequence pair.
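The rank-based pairing of alpha and beta chains that decode to the same antigen can be sketched as follows. The data layout, the function names, and the tie-breaking rule (by chain name) are assumptions for illustration; chains missing from a pool are treated as having zero reads.

```python
def rank_profiles(counts_by_pool):
    """Map each chain to its tuple of read-count ranks, one rank per pool.

    counts_by_pool: {pool: {chain: read_count}}. Rank 1 is the highest
    read count in a pool; ties are broken by chain name (an assumption).
    """
    pools = sorted(counts_by_pool)
    chains = sorted({c for pool in pools for c in counts_by_pool[pool]})
    profiles = {chain: [] for chain in chains}
    for pool in pools:
        counts = counts_by_pool[pool]
        ordered = sorted(chains, key=lambda c: (-counts.get(c, 0), c))
        for rank, chain in enumerate(ordered, start=1):
            profiles[chain].append(rank)
    return {chain: tuple(ranks) for chain, ranks in profiles.items()}

def pair_by_rank(alpha_counts, beta_counts):
    """Pair alpha and beta chains whose rank is identical in every pool."""
    alpha = rank_profiles(alpha_counts)
    beta = rank_profiles(beta_counts)
    pairs = []
    for a_chain, a_profile in alpha.items():
        matches = [b for b, b_profile in beta.items() if b_profile == a_profile]
        if len(matches) == 1:
            pairs.append((a_chain, matches[0]))
    return pairs

# Placeholder read counts for chains that decoded to the same antigen.
alpha_counts = {"pool1": {"A1": 90, "A2": 40}, "pool3": {"A1": 20, "A2": 75}}
beta_counts = {"pool1": {"B1": 35, "B2": 80}, "pool3": {"B1": 60, "B2": 15}}
pairs = pair_by_rank(alpha_counts, beta_counts)
print(pairs)  # [('A1', 'B2'), ('A2', 'B1')]
```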
- single-cell sequencing methods are used to determine TCR alpha-beta receptor sequence pairs.
- when the same antigen number is decoded for a BCR heavy and a BCR light chain sequence, and only one heavy chain sequence and one light chain sequence decode into that antigen number, they are considered to have originated from the same BCR heavy-light receptor sequence pair that is associated with that antigen.
- all of the BCR heavy and BCR light chain sequences that decode to the same antigen number are ranked in each pool by their read counts where one rank list is created for heavy chains, and one for light chains. If a BCR heavy chain and a BCR light chain sequence in each pool have the same pool specific rank order of read counts in the heavy and light chain rank lists, they are considered to have originated from the same BCR heavy-light receptor sequence pair.
- single-cell sequencing methods are used to determine BCR heavy-light receptor sequence pairs.
- the invention provides for a method for determining a T cell receptor chain sequence, or a portion thereof, specific for one or more antigens, the method comprising: sorting a plurality of first antigens into a plurality of reaction mixtures, wherein the sorting comprises adding a unique antigen of the plurality of first antigens to a unique subset of the plurality of reaction mixtures, and wherein two unique antigens are not added to any two identical subsets of the plurality of reaction mixtures, contacting each reaction mixture with a biological sample comprising a plurality of T cells, providing a condition for a first activated T cell in at least one reaction mixture of the plurality of reaction mixtures to expand in number such that a plurality of T cell clones is formed, contacting the plurality of T cell clones with a query antigen, separating a second activated T cell and a non-activated T cell from a subset of the plurality of reaction mixtures, wherein the second activated T cell recognizes
- separating the second activated T cell and the non-activated T cell is performed based on a marker, wherein the marker is selected from the group consisting of CD3, CD4, CD8, CD137, OX40, CD25, PD-L1, CD69, CD154, and a combination thereof.
- the T cell receptor chain sequence comprises a receptor chain sequence pair, wherein the receptor chain sequence pair consists of an alpha chain sequence and a beta chain sequence.
- the second activated T cell recognizes the query antigen by binding an MHC complex comprising the query antigen.
- the sorting further comprises applying, using a processor, an error-correcting code configured to determine the unique subset of the plurality of reaction mixtures that the unique antigen is added to.
- the error-correcting code is a superimposed code.
- the detecting comprises applying, using a processor, a decoding algorithm, wherein the decoding algorithm is configured to detect the unique antigen specific for the T cell receptor chain sequence when the T cell receptor chain sequence is not substantially present in at least one reaction mixture of the unique subset of the plurality of reaction mixtures.
- the decoding algorithm is a nearest neighbor algorithm.
- the query antigen is different from any antigen of the plurality of first antigens.
- separating the second activated T cell and the non-activated T cell from the subset of the plurality of reaction mixtures is performed using multimer sorting. In some embodiments, separating the second activated T cell and the non-activated T cell from the subset of the plurality of reaction mixtures is performed using fluorescence-based sorting. In some embodiments, separating the second activated T cell and the non-activated T cell from the subset of the plurality of reaction mixtures is performed using bead-based sorting. In some embodiments, a number of reaction mixtures corresponding to the unique subset of the plurality of reaction mixtures is a function of a number of expected unique antigens that are specific to the T cell receptor chain sequence.
- the plurality of reaction mixtures comprises at least one control reaction mixture, wherein the control reaction mixture does not contain any antigens that are added to the biological sample.
- the detecting further comprises computing a frequency of T cells that express the T cell receptor chain sequence.
- the invention provides for a method for determining a T cell receptor chain sequence, or a portion thereof, specific for one or more antigens, the method comprising: adding a plurality of first antigens to a first reaction mixture, contacting the first reaction mixture with a biological sample comprising a plurality of T cells, providing a condition for a first activated T cell in the first reaction mixture to expand in number such that a plurality of T cell clones is formed, sorting a plurality of query antigens into a plurality of reaction mixtures, wherein the sorting comprises adding a first query antigen of the plurality of query antigens to a unique subset of the plurality of reaction mixtures, wherein two unique query antigens are not added to any two identical subsets of the plurality of reaction mixtures, and wherein the first query antigen is different from any antigen of the plurality of first antigens, contacting each reaction mixture of the plurality of reaction mixtures with a portion of the first reaction mixture comprising the pluralit
- separating the second activated T cell from the subset of the plurality of T cell clones is performed based on a marker, wherein the marker is selected from the group consisting of CD3, CD4, CD8, CD137, OX40, CD25, PD-L1, CD69, CD154, and a combination thereof.
- the T cell receptor chain sequence comprises a receptor chain sequence pair, wherein the receptor chain sequence pair consists of an alpha chain sequence and a beta chain sequence.
- the second activated T cell recognizes the first query antigen by binding an MHC complex comprising the first query antigen.
- the sorting further comprises applying, using a processor, an error-correcting code configured to determine the unique subset of the plurality of reaction mixtures that the first query antigen is added to.
- the error-correcting code is a superimposed code.
- the detecting comprises applying, using a processor, a decoding algorithm, wherein the decoding algorithm is configured to detect the first query antigen specific for the T cell receptor chain sequence when the T cell receptor chain sequence is not substantially present in at least one reaction mixture of the unique subset of the plurality of reaction mixtures.
- the decoding algorithm is a nearest neighbor algorithm.
- separating the second activated T cell from the subset of the plurality of T cell clones is performed using multimer sorting. In some embodiments, separating the second activated T cell from the subset of the plurality of T cell clones is performed using fluorescence-based sorting. In some embodiments, separating the second activated T cell from the subset of the plurality of T cell clones is performed using bead-based sorting. In some embodiments, a number of reaction mixtures corresponding to the unique subset of the plurality of reaction mixtures is a function of a number of expected query antigens that are specific to the T cell receptor chain sequence. In some embodiments, the plurality of reaction mixtures comprises at least one control reaction mixture, wherein the control reaction mixture does not contain any antigens that are added to the biological sample.
- the detecting further comprises computing a frequency of T cells that express the T cell receptor chain sequence.
- the invention provides for a method for determining a lymphocyte cell receptor chain sequence, or a portion thereof, specific for at least two unique antigens, the method comprising: sorting a plurality of antigens into a plurality of reaction mixtures, wherein the sorting comprises adding the at least two unique antigens of the plurality of antigens to at least two unique subsets of the plurality of reaction mixtures such that the at least two unique antigens are not added to any two identical subsets of the plurality of reaction mixtures, and wherein the at least two unique subsets are configured to allow a detection of the at least two unique antigens that are specific to the lymphocyte cell receptor chain sequence, contacting each reaction mixture of the plurality of reaction mixtures with a biological sample comprising a plurality of lymphocytes, separating a target lymphocyte from a subset of the plurality of lymphocytes, wherein the target lymphocyte recognizes the at least two unique antigens of the plurality of antigens, sequencing nucleic
- the lymphocyte is a T cell or a B cell. In some embodiments, separating the target lymphocyte is performed using multimer sorting. In some embodiments, the target lymphocyte is a T cell, and wherein separating the T cell is based on a marker selected from the group consisting of CD3, CD4, CD8, CD137, OX40, CD25, PD-L1, CD69, and CD154. In some embodiments, the lymphocyte cell receptor chain sequence comprises a receptor chain sequence pair, and wherein the receptor chain sequence pair consists of two components of a receptor of the target lymphocyte.
- a number of reaction mixtures comprising the at least two unique subsets is a function of a number of expected antigens that are specific to the lymphocyte cell receptor chain sequence.
- the plurality of reaction mixtures comprises at least one control reaction mixture, and wherein the control reaction mixture does not contain any antigens that are added to the biological sample.
- the target lymphocyte recognizes the at least two unique antigens of the plurality of antigens by binding the at least two unique antigens of the plurality of antigens or by binding two or more molecular complexes comprising the at least two unique antigens of the plurality of antigens.
- the detecting further comprises applying, by a processor, a nearest set decoding algorithm configured to determine the at least two unique antigens that are specific to the lymphocyte cell receptor chain sequence. In some embodiments, the detecting further comprises: applying, by a processor, a decoding algorithm, wherein the decoding algorithm is configured to detect the at least two unique antigens that are specific to the lymphocyte cell receptor chain sequence when the lymphocyte cell receptor chain sequence is not substantially present in at least one reaction mixture of the at least two unique subsets of the plurality of reaction mixtures. In some embodiments, the method further comprises assigning a superimposed code to each antigen of the plurality of antigens, wherein the superimposed code is configured to allow a detection of the at least two unique antigens that are specific to the lymphocyte cell receptor chain sequence.
- the invention provides for a method for determining a lymphocyte cell receptor chain sequence, or a portion thereof, specific for at least two unique antigens, the method comprising: sorting a plurality of antigens into a plurality of reaction mixtures, wherein the sorting comprises adding the at least two unique antigens of the plurality of antigens to at least two unique subsets of the plurality of reaction mixtures such that the at least two unique antigens are not added to any two identical subsets of the plurality of reaction mixtures, and wherein the at least two unique subsets are configured to allow a detection of the at least two unique antigens that are specific to the lymphocyte cell receptor chain sequence, contacting each reaction mixture with a biological sample comprising a plurality of lymphocytes, separating a target lymphocyte from a subset of the plurality of lymphocytes, wherein the target lymphocyte reacts with the at least two unique antigens of the plurality of antigens, sequencing nucleic acids of the target lymphocyte to
- the lymphocyte is a T cell or a B cell.
- the lymphocyte cell receptor chain sequence comprises a receptor chain sequence pair, and wherein the receptor chain sequence pair consists of two components of a receptor of the target lymphocyte. In some embodiments, the method further comprises contacting at least one reaction mixture of the plurality of reaction mixtures with a query antigen.
- the invention provides for a method for determining a lymphocyte cell receptor chain sequence, or a portion thereof, specific to a unique antigen, the method comprising: sorting a plurality of antigens into a plurality of reaction mixtures, wherein the sorting comprises adding a unique antigen of the plurality of antigens to a unique subset of the plurality of reaction mixtures such that two different unique antigens are not added to the unique subset, contacting each reaction mixture of the plurality of reaction mixtures with a biological sample comprising a plurality of lymphocytes, separating a target lymphocyte from a subset of the plurality of lymphocytes, wherein the target lymphocyte recognizes the unique antigen, after separating the target lymphocyte, sequencing nucleic acids of the target lymphocyte to obtain the lymphocyte receptor chain sequence, wherein the sequencing is performed by single-cell sequencing, and detecting the unique antigen, wherein the detecting comprises: computing a frequency of lymphocyte cells that express the lymphocyte receptor chain sequence.
- the lymphocyte is a T cell or a B cell.
- the target lymphocyte is a T cell, and wherein the T cell is separated based on a marker selected from the group consisting of CD3, CD4, CD8, CD137, OX40, CD25, PD-L1, CD69, CD154, or a combination thereof.
- the lymphocyte cell receptor chain sequence comprises a receptor chain sequence pair, and wherein the receptor chain sequence pair consists of two components of a receptor of the target lymphocyte.
- the detecting further comprises: computing a gene expression value of a gene of the target lymphocyte.
- the plurality of reaction mixtures comprises at least one control reaction mixture, and wherein the control reaction mixture does not contain any antigens that are added to the biological sample.
- the target lymphocyte recognizes the unique antigen by binding the unique antigen or by binding one or more molecular complexes comprising the unique antigen.
- the detecting further comprises applying, by a processor, a nearest set decoding algorithm configured to determine the unique antigen that is specific to the lymphocyte receptor chain sequence.
- the detecting further comprises: applying, by a processor, a decoding algorithm, wherein the decoding algorithm is configured to detect the one or more antigens that are specific to the lymphocyte receptor chain sequence when the lymphocyte cell receptor chain sequence is not substantially present in at least one reaction mixture of the unique subset of the plurality of reaction mixtures, and wherein the at least one reaction mixture comprises the one or more antigens.
- the invention provides for a method for determining a lymphocyte receptor chain sequence, or a portion thereof, specific for at least one antigen, the method comprising: providing a biological sample comprising a plurality of lymphocytes, extracting a plurality of first antigen presenting cells from the biological sample, dividing the plurality of first antigen presenting cells into a plurality of first reaction mixtures, sorting a plurality of first antigens into the plurality of first reaction mixtures, wherein the sorting comprises adding a unique first antigen of the plurality of first antigens to a unique subset of the plurality of first reaction mixtures, and wherein two unique first antigens are not added to any two identical subsets of the plurality of first reaction mixtures, contacting each first reaction mixture with the biological sample, providing a condition for a first activated lymphocyte in at least one first reaction mixture of the plurality of first reaction mixtures to expand in number such that a plurality of lymphocyte clones is formed, extracting a plurality of
- the lymphocyte is a T cell or a B cell.
- HLA typing of the biological sample to determine a predicted display of at least one antigen of the plurality of first antigens by an MHC molecule present in the biological sample.
- enriching the plurality of lymphocytes prior to sorting the plurality of first antigens into the plurality of first reaction mixtures.
- enriching the plurality of lymphocytes after providing the condition for the first activated lymphocyte to expand in number and prior to extracting the plurality of second antigen presenting cells.
- separating the second activated lymphocyte and the non-activated lymphocyte is performed based on a marker, wherein the marker is selected from the group consisting of CD3, CD4, CD8, CD137, OX40, CD25, PD-L1, CD69, CD154, and a combination thereof.
- the second activated lymphocyte recognizes the query antigen by binding an MHC complex comprising the query antigen.
- the sorting further comprises applying, using a processor, an error-correcting code configured to determine the unique subset of the plurality of first reaction mixtures that the unique first antigen is added to.
- the error-correcting code is a collision free superimposed code configured to allow for detection of at least two unique first antigens specific for the lymphocyte receptor chain sequence.
- the collision free superimposed code is determined by a random search method.
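A random search for a collision free superimposed code can be sketched as below. This is only one plausible reading of "random search": code words of fixed weight are drawn at random, and a draw is accepted when every bit-wise "OR" of up to r code words is distinct. All function names and parameter values are hypothetical, and the prefix/suffix construction is not attempted here.

```python
import random
from itertools import combinations

def or_words(words):
    """Bit-wise OR of a collection of integer-encoded code words."""
    combined = 0
    for word in words:
        combined |= word
    return combined

def is_collision_free(words, r):
    """True if all OR-combinations of 1..r code words are pairwise distinct."""
    seen = set()
    for size in range(1, r + 1):
        for combo in combinations(words, size):
            combined = or_words(combo)
            if combined in seen:
                return False
            seen.add(combined)
    return True

def random_search_code(num_antigens, num_pools, pools_per_antigen, r,
                       seed=0, max_tries=10000):
    """Draw random fixed-weight code words until the code is collision free.

    Returns a list of integers (bit p set means the antigen is in pool p),
    or None if no collision free code was found within max_tries draws.
    """
    rng = random.Random(seed)
    for _ in range(max_tries):
        words = []
        for _ in range(num_antigens):
            pools = rng.sample(range(num_pools), pools_per_antigen)
            words.append(sum(1 << p for p in pools))
        if len(set(words)) == num_antigens and is_collision_free(words, r):
            return words
    return None

# Assumed example: 6 antigens, 10 pools, 3 pools per antigen, pairs decodable.
code = random_search_code(num_antigens=6, num_pools=10, pools_per_antigen=3, r=2)
```

Because every combined code word is unique, any valid enrichment pattern maps back to exactly one set of antigens. The exhaustive check grows combinatorially with the number of antigens and r, so this sketch is suitable only for small panels.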
- the collision free superimposed code consists of: a plurality of prefix codes, wherein a prefix code of the plurality of prefix codes is assigned to the unique first antigen of the plurality of first antigens, wherein the prefix code identifies an overlap set, wherein the prefix code is identical for more than one first antigen of the plurality of first antigens within the overlap set, and a plurality of suffix codes, wherein a suffix code of the plurality of suffix codes is assigned to the unique first antigen of the plurality of first antigens, wherein a combination of the prefix code and the suffix code is distinct for the unique first antigen.
- the detecting comprises applying, using a processor, a decoding algorithm, wherein the decoding algorithm is configured to detect the unique first antigen specific for the lymphocyte receptor chain sequence when the lymphocyte receptor chain sequence is not substantially present in at least one reaction mixture of the unique subset of the plurality of first reaction mixtures.
- the decoding algorithm is a nearest set algorithm.
- the query antigen is different from any antigen of the plurality of first antigens.
- separating the second activated lymphocyte and the non-activated lymphocyte from the subset of the plurality of final reaction mixtures is performed using multimer sorting.
- separating the second activated lymphocyte and the non-activated lymphocyte from the subset of the plurality of final reaction mixtures is performed using fluorescence-based sorting. In some embodiments, separating the second activated lymphocyte and the non-activated lymphocyte from the subset of the plurality of final reaction mixtures is performed using bead-based sorting. In some embodiments, a number of reaction mixtures corresponding to the unique subset of the plurality of first reaction mixtures is a function of a number of expected unique first antigens that are specific to the lymphocyte receptor chain sequence. In some embodiments, the plurality of first reaction mixtures comprises at least one control reaction mixture, wherein the control reaction mixture does not contain any antigens that are added to the biological sample. In some embodiments, the detecting further comprises computing a frequency of lymphocytes that express the lymphocyte receptor chain sequence.
- superimposed codes are used to separate peptides/antigens into antigen pools at step 201 which allows the assay to detect which peptides/antigens are recognized by a single LCR chain sequence when it recognizes more than one peptide/antigen.
- An example of a superimposed code is a Zatocoding (see Mooers, C. N., and Ashby, W. R., 1951, incorporated by reference in its entirety herein).
- superimposed codes are applied to assign each antigen (e.g., peptide) to n antigen pools that are unique to the antigen. If N is the total number of antigen pools utilized, then a given antigen is assigned to a subset of n of these antigen pools, where $n < N$. In some embodiments, preferably n is equal to F*N, where F is the optimal fraction of antigen pools.
- the binary number corresponding to the pools that an antigen is assigned to is the code word of that antigen, where a pool in which it is present is assigned a “1” and a pool where it is absent is assigned a “0”, and these binary digits are concatenated to form the antigen’s code word (e.g., for five pools, inclusion in pools 1 and 3, and exclusion in pools 2, 4, and 5 would result in the binary number “10100”).
- the fraction of antigen pools F is typically $1 - 2^{-1/r}$, where r is the desired detection ability, that is, the number of antigens recognized by a given TCR chain sequence that the assay should be able to resolve. Table 1 provides the fraction, F, of the total number of antigen pools, N, that should be used for a given antigen according to the equation above.
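The pool fraction can be evaluated numerically as in the sketch below. It assumes the formula reads F = 1 - 2^(-1/r), which is the form under which the superposition of r code words is expected to leave half of the pools empty, since (1 - F)^r = 1/2. The function name is illustrative.

```python
def pool_fraction(r):
    """Fraction F of pools per antigen, assuming F = 1 - 2**(-1/r)."""
    if r < 1:
        raise ValueError("r must be at least 1")
    return 1 - 2 ** (-1 / r)

for r in range(1, 6):
    F = pool_fraction(r)
    # (1 - F)**r is the expected fraction of pools left empty by r antigens.
    print(f"r={r}  F={F:.3f}  (1-F)^r={(1 - F) ** r:.3f}")
```

For r = 1 the fraction is 0.5, and it decreases as r grows, so each antigen is placed in fewer pools when a receptor is expected to recognize more antigens.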
- each antigen (e.g., peptide) is allocated to F*N antigen pools, except that it is ensured that no two antigens are allocated to exactly the same group of antigen pools.
- an antigen’s code word describes the pools in which it is present and absent, where “1” represents a pool where it is present and “0” represents a pool where it is absent. These binary digits are concatenated in pool number order (e.g., the antigen code word “01100” means the antigen is present in pools 2 and 3, and not present in pools 1, 4, and 5).
- the assignment of antigens to antigen pools is recorded.
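Allocating antigens to distinct pool subsets and recording their code words can be sketched as follows. Random allocation, the function names, and the example sizes are assumptions; only the code word convention (pools concatenated in pool number order) is taken from the text.

```python
import math
import random

def assign_pools(num_antigens, num_pools, pools_per_antigen, seed=0):
    """Assign each antigen to a distinct subset of pools (pools numbered from 1)."""
    if num_antigens > math.comb(num_pools, pools_per_antigen):
        raise ValueError("not enough distinct pool subsets for all antigens")
    rng = random.Random(seed)
    used = set()
    assignment = {}
    for antigen in range(1, num_antigens + 1):
        while True:
            pools = frozenset(rng.sample(range(1, num_pools + 1), pools_per_antigen))
            if pools not in used:   # no two antigens share the same subset
                break
        used.add(pools)
        assignment[antigen] = pools
    return assignment

def code_word(pools, num_pools):
    """Binary code word: "1" where the antigen is present, in pool number order."""
    return "".join("1" if p in pools else "0" for p in range(1, num_pools + 1))

print(code_word({1, 3}, 5))  # 10100
print(code_word({2, 3}, 5))  # 01100

assignment = assign_pools(num_antigens=8, num_pools=5, pools_per_antigen=2)
recorded = {antigen: code_word(pools, 5) for antigen, pools in assignment.items()}
```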
- the sequence’s enrichment is computed versus its presence in the sequencing data from the negative selection of this pool (e.g., CD8+ Not Activated).
- the sequence’s enrichment is computed versus its presence in the sequencing data from other antigen pools.
- LCR chain sequence enrichment is computed based on read counts.
- enrichment is computed based on read counts as corrected by UMIs.
- LCR chain sequence enrichment is computed based on cell counts.
- pool specific LCR chain sequence enrichment is computed as described herein.
- if an LCR chain sequence is enriched in a number of antigen pools that is larger than r*F*N, then the LCR chain sequence is flagged as recognizing more than r antigens.
- the antigen pools it was assigned to are evaluated for enriched LCR chain sequences.
- the LCR chain sequence is output as recognizing the antigen.
- the false positive rate of the assay is expected to be bounded by $(1/2)^n$ when r is an accurate estimate. Thus, when n is more than about 3, the false positive rate should be small.
- N is increased, which causes a corresponding increase in n and lowers the false positive rate to a desired level.
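The relationship between the number of pools and the false positive bound can be sketched as below. The sketch reads the bound as (1/2)^n with n = F*N pools per antigen and F = 1 - 2^(-1/r); the function names and the target rate are illustrative assumptions.

```python
import math

def false_positive_bound(n):
    """Upper bound on the false positive rate for n pools per antigen."""
    return 0.5 ** n

def pools_needed(r, target_rate):
    """Smallest n and total pool count N = n / F meeting the target bound."""
    F = 1 - 2 ** (-1 / r)
    n = math.ceil(math.log2(1 / target_rate))
    N = math.ceil(n / F)
    return n, N

def recognizes_more_than_r(num_enriched_pools, r, F, N):
    """Flag a chain enriched in more than r*F*N pools."""
    return num_enriched_pools > r * F * N

n, N = pools_needed(r=2, target_rate=0.01)
print(n, N, false_positive_bound(n))  # 7 24 0.0078125
```

Under these assumptions, resolving two antigens per receptor at a bound of 1% would call for 7 pools per antigen out of 24 pools in total.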
- collision free superimposed codes as described herein are utilized to ensure that every valid code word can be decoded into a single unique set of antigens.
- the receptor sequence pairing of LCR chain sequences (T cell alpha and beta, B cell heavy and light) is accomplished as described herein for paired chains that are assigned to the same antigen or antigens. Rank comparisons of read counts for pairing receptor chain sequences is done for each antigen separately.
- a binary number corresponding to the enrichment of an LCR chain sequence is constructed by concatenating its enriched (“1”) and non-enriched (“0”) pools (e.g., “10101” corresponds to an LCR chain sequence enriched in pools 1, 3, and 5, and not enriched in pools 2 and 4).
- the Hamming distance of this binary number is computed with respect to the result of the “OR” of the code words for each possible combination of the antigens. Described herein is a nearest set decoding algorithm which determines whether there is a unique nearest neighbor in Hamming distance between the binary number and a single antigen code word, or between the binary number and the Boolean bit-wise “OR” of a combination of two or more antigen code words.
- when such a unique nearest neighbor in Hamming distance is found, the nearest set decoding algorithm outputs the corresponding combination of antigens as being recognized by the LCR chain sequence. For example, if there are K antigens, the method considers all $2^K$ possible “OR” combinations of antigen code words, including single code words, all combinations of 2 code words, all combinations of 3 code words, and so on.
- This method allows decoding in situations where a LCR chain sequence is specific to more than one antigen (e.g., by computing a Hamming distance for a set of combined code words).
- antigens are only considered in combinations if their code words have a minimum number of “1” bits that are also present in the binary number being decoded.
- the method considers all possible “OR” combinations of antigen code words from up to r antigens (where r is the number of antigens expected to be recognized by a typical LCR used during encoding).
- other distance metrics (e.g., Euclidean distance, cosine distance) are used.
- the nearest set decoding method outputs an error when no unique nearest neighbor is found.
- a nearest set decoding algorithm consists of the following computational steps. In some embodiments, the inputs for the computation are:
- N: Number of antigen pools.
- $E_1, \ldots, E_N$: The observed enrichment (enriched: “1”; non-enriched: “0”) of an LCR chain sequence in each of the N antigen pools.
- $C_1, \ldots, C_K$: Matrix of code words for each of K antigens.
- $C_i$ specifies a binary number corresponding to the antigen pools where antigen i is present.
- the binary digits are concatenated in pool number order, where “1” represents a pool where the antigen is present, and “0” represents a pool where it is absent.
- m: Threshold minimum number of antigen pools overlapping with the observed enrichment to consider an antigen for “OR” combinations during superimposed decoding.
- Neighbor-Distance: A distance function (e.g., Hamming distance, Euclidean distance, cosine distance) used to compute the distance between two code words. This function takes in two code words represented as binary numbers and outputs an integer distance. In some embodiments, generalized minimum distance decoding or maximum likelihood decoding can be used for neighbor distance functions.
- a corresponding binary number sequence B is constructed by concatenating the enriched (“1”) and non-enriched (“0”) pools for the LCR chain sequence.
- a set of basis code words W is computed for the purpose of decoding.
- $W = \bigcup_i C_i$ (where W is the union of all code words in C and i is a given antigen).
- W is the union of all $2^K$ possible bit-wise Boolean “OR” combinations of antigen code words in C, including single code words, all combinations of 2 code words, all combinations of 3 code words, and so on, and each base code word in W is annotated by the combination of antigen code words used to create it. For example, if $C_1$ is “11000” and $C_2$ is “00101” then the combination of $C_1$ and $C_2$ would be represented by “11101” in W which is the bit-wise “OR” of the two code words, and “11101” would be annotated as the combination $C_1$ and $C_2$.
- a superimposed code (e.g., a Zatocoding; a collision free superimposed code)
- antigens are only considered in combinations if their code words have at least m “1” bits that are also present in B, the code word being decoded.
- W does not include combinations of antigen code words for more than r antigens at once, and thus the number of possible “OR” combinations of antigen code words up to r antigens is $\sum_{i=1}^{r} \binom{K}{i}$ (where r is the number of antigens expected to be recognized by a typical LCR used during encoding).
- W stores both the binary code word and its annotation of the one or more antigens that corresponds to the basis code word.
- the distances $d_1, \ldots, d_j$ between B and all basis code words 1, ..., j in W are computed using the Neighbor-Distance function.
- the Neighbor-Distance function uses a Hamming distance
- the Neighbor-Distance is the number of positions in a code word sequence in which the two code words differ.
- for N pools, a code word has N positions.
- d2 1.
- if more than one basis code word is at the minimum distance z, the output will be an error (“ERROR”). Otherwise, the output will be the annotated basis antigen(s) in $W_i$ corresponding to the basis code word whose distance $d_i$ equals z.
- the output may consist of a single antigen or multiple antigens that were combined using “OR” to form basis code word $W_i$. If the output consists of multiple antigens, the LCR chain sequence is specific to more than one antigen.
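The nearest set decoding steps above can be sketched as follows, using Hamming distance as the Neighbor-Distance function. The symbols B, C, r, and m follow the text; the function names, the integer encoding of code words, and the example code are assumptions made for illustration.

```python
from itertools import combinations

def hamming(a, b):
    """Hamming distance between two integer-encoded code words."""
    return bin(a ^ b).count("1")

def nearest_set_decode(B, C, r, m=1, distance=hamming):
    """Decode observed enrichment B into a tuple of antigen numbers or "ERROR".

    B: observed enrichment as a bit string, one digit per pool.
    C: dict mapping antigen number -> code word (bit string).
    r: largest number of antigens combined with "OR".
    m: minimum number of "1" bits an antigen must share with B.
    """
    b = int(B, 2)
    # Keep only antigens overlapping the observed enrichment in >= m pools.
    candidates = {antigen: int(word, 2) for antigen, word in C.items()
                  if bin(int(word, 2) & b).count("1") >= m}
    # Build the annotated basis W: OR-combinations of up to r code words.
    W = []
    for size in range(1, r + 1):
        for combo in combinations(sorted(candidates), size):
            word = 0
            for antigen in combo:
                word |= candidates[antigen]
            W.append((word, combo))
    if not W:
        return "ERROR"
    d = [distance(b, word) for word, _ in W]
    z = min(d)
    nearest = [combo for (_, combo), d_i in zip(W, d) if d_i == z]
    return nearest[0] if len(nearest) == 1 else "ERROR"

# Example code words; antigens 1 and 2 match the "11000"/"00101" example above.
C = {1: "11000", 2: "00101", 3: "01010"}
print(nearest_set_decode("11101", C, r=2))  # (1, 2)
print(nearest_set_decode("00101", C, r=2))  # (2,)
print(nearest_set_decode("01000", C, r=2))  # ERROR (antigens 1 and 3 tie)
```

Limiting combinations to r antigens and filtering by m keeps the basis small; without those limits the basis grows exponentially in the number of antigens.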
- a separate control pool is established that contains no antigens/peptides (“Control Pool 0”; see FIG. 2).
- This pool is separated at step 203, as are the other pools, and is used to detect cells that are activated when they are retrieved from a donor.
- donor cells are derived from humans or animals.
- LCR chain sequences that are found in the separated active set of cells in the control pool represent LCR chain sequences that correspond to host activated cells or cells that contain AIM markers that are not induced by the antigens/peptides in the other pools (i.e., the antigen pools). In some embodiments, these LCR chain sequences can be eliminated from the antigen specific set of LCR chain sequences discovered in the remainder of the antigen pools.
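The elimination of control pool sequences described above amounts to a set difference. The sketch below assumes each pool's active fraction has already been reduced to a list of LCR chain sequence strings; the function name and the example strings are hypothetical placeholders, not data from this disclosure.

```python
def remove_control_sequences(antigen_pool_sequences, control_pool_sequences):
    """Drop sequences that also appear in the active set of the control pool."""
    background = set(control_pool_sequences)
    return {
        pool: [seq for seq in sequences if seq not in background]
        for pool, sequences in antigen_pool_sequences.items()
    }

# Placeholder sequences for illustration only.
control_pool_0 = ["SEQ_A", "SEQ_B"]
antigen_pools = {1: ["SEQ_A", "SEQ_C"], 2: ["SEQ_D", "SEQ_B", "SEQ_C"]}
filtered = remove_control_sequences(antigen_pools, control_pool_0)
```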
- control antigens, e.g., control peptides
- Control antigens that are broadly present in the human population can be derived from common immunizations such as measles, mumps, rubella, polio, and other control antigens/peptides can be used in addition to antigens specific to a target of interest.
- a threshold level of detection of the control antigens in a representative human population can be predetermined.
- added control antigens e.g., control peptides
- control peptides are added to the list of target antigens or query antigens to form a complete set of K antigens/peptides to be assayed (e.g., peptides 1 to K can include one or more target peptides and one or more control peptides).
- the counts of LCR chain sequences for control antigens can be used to normalize counts for other antigens to provide comparable figures across PBMC samples.
- normalization is accomplished by adjusting the LCR chain sequence counts in a given sample for an antigen to be presented as a ratio of the antigen’s counts divided by the sum of the control antigen counts.
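The normalization described above can be written as a short function. This is a sketch under stated assumptions: counts are held in a dictionary keyed by antigen name, the function name is hypothetical, and the example counts are invented for illustration.

```python
def normalize_counts(counts, control_antigens):
    """Express each non-control antigen's count as a ratio to the summed control counts."""
    control_total = sum(counts[antigen] for antigen in control_antigens)
    if control_total == 0:
        raise ValueError("control antigen counts sum to zero; cannot normalize")
    return {
        antigen: count / control_total
        for antigen, count in counts.items()
        if antigen not in control_antigens
    }

# Invented example counts for one PBMC sample.
sample_counts = {"measles": 30, "polio": 20, "target_1": 10, "target_2": 5}
normalized = normalize_counts(sample_counts, control_antigens={"measles", "polio"})
```

Because every sample is divided by its own control total, the resulting ratios can be compared across PBMC samples that differ in sequencing depth or cell number.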
- antigens are distributed into antigen pools based on a minimum Hamming distance between the binary encoding of pools where they reside as described in this disclosure (e.g., using a Hamming(7,4) code; see FIG. 1).
- codes for asymmetric channels can be used when the chance of a “1” occurring by error is higher than the chance of a “0” occurring by error such as when a T cell recognizes more than one antigen (see Kim and Freiman, 1959, for examples of asymmetric codes).
- other error correcting codes can be employed.
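As an illustration of pool assignment with a Hamming(7,4) code, the sketch below generates the 16 code words of one standard systematic Hamming(7,4) construction, checks that their minimum pairwise Hamming distance is 3, and maps each antigen to the pools marked by “1” bits. The parity equations, the function names, and the choice to skip the all-zero code word (which would place an antigen in no pool) are assumptions made for this example; the assignment shown in FIG. 1 may differ.

```python
from itertools import product

def hamming74_codewords():
    """All 16 code words of a systematic Hamming(7,4) code as 7-bit tuples."""
    words = []
    for d1, d2, d3, d4 in product((0, 1), repeat=4):
        p1 = d1 ^ d2 ^ d4  # parity bits chosen so the minimum distance is 3
        p2 = d1 ^ d3 ^ d4
        p3 = d2 ^ d3 ^ d4
        words.append((d1, d2, d3, d4, p1, p2, p3))
    return words

def min_distance(words):
    """Smallest Hamming distance between any two distinct code words."""
    return min(
        sum(x != y for x, y in zip(a, b))
        for i, a in enumerate(words)
        for b in words[i + 1:]
    )

def assign_pools(antigens, words):
    """Give each antigen the 1-based pool numbers marked by its code word."""
    usable = [w for w in words if any(w)]  # skip the all-zero code word
    if len(antigens) > len(usable):
        raise ValueError("more antigens than usable code words")
    return {
        antigen: [i + 1 for i, bit in enumerate(word) if bit]
        for antigen, word in zip(antigens, usable)
    }

words = hamming74_codewords()
pools = assign_pools(["antigen_1", "antigen_2"], words)
```

A minimum distance of 3 means that a single pool-level error (one wrong bit) still leaves the observed pattern closer to the correct antigen code word than to any other.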
- FIG. 3 shows a method of determining one or more LCR chain sequences associated with lymphocytes that are expanded by one or more identified first antigens, where the lymphocytes are subsequently activated by one or more query antigens.
- at least one of the query antigens and first antigens are not identical.
- a tissue sample, e.g., PBMCs
- the tissue sample is then HLA typed at step 302 to determine the predicted display of antigens by the MHC molecules present in the tissue sample.
- the HLA typing at step 302 is used to determine the pool specific first antigens and query antigens that are used based upon their predicted or known display by MHC molecules.
- lymphocytes e.g., B cells, CD4+ T cells, and/or CD8+ T cells, or any other desired set of lymphocytes or lymphocyte combinations
- in a first lymphocyte enrichment step (step 303), lymphocytes are enriched from a portion of the tissue sample from step 301 using negative magnetic bead selection, or other methods known in the art, including methods described in Dagur et al. (2015) “Collection, Storage, and Preparation of Human Blood Cells.” Curr Protoc Cytom.
- the output of step 303 (enriched lymphocytes) is divided into one or more unstimulated pools (i.e., one or more control pools) and N stimulated pools at step 304.
- step 303 is omitted, and the method proceeds directly to step 304 where the output of step 301 (tissue sample preparation) is divided into unstimulated pool(s)
- first antigen presenting cells are prepared from the tissue sample at step 301 using various methods such as those described by Schanen et al. (2008) “A novel approach for the generation of human dendritic cells from blood monocytes in the absence of exogenous factors.” J Immunol Methods. 2008 Jun 1;335(1-2):53-64, and Moser et al. (2010) “Optimization of a dendritic cell-based assay for the in vitro priming of naive human CD4+ T cells.” J Immunol Methods.
- the first APCs are divided into a total of N first APC pools.
- pool specific first antigens are added to the N first APC pools, wherein each pool specific first antigen is added to a unique subset of the N first APC pools using the encoding methods described herein.
- nucleic acid constructs encoding the pool specific first antigens are transfected or virally delivered into the cells in the N first APC pools with the pool selection being accomplished using the encoding strategies described herein.
- the pool specific first antigens are vaccines or proteins
- the first APCs, e.g., dendritic cells
- the N first APC pools from step 306 are added to the N stimulation pools from step 304 with corresponding numbers (e.g., APC pool 1 is added to stimulation pool 1, etc.).
- the first APCs from step 305 are added to the unstimulated pools from step 304 without exposure to the pool specific first antigens.
- the unstimulated pools and N stimulation pools at step 304 will already contain APCs (e.g., dendritic cells), and thus steps 305 and 306 are eliminated and each pool specific first antigen is added directly to a unique subset of the N stimulation pools at step 307 using the encoding methods described herein.
- control antigens, e.g., a CAP1 peptide or other known MHC class I or class II control peptides
- control antigens are added to all pools at step 307.
- control antigens are added to the first APCs at step 305.
- the control antigens are selected based upon the HLA typing from step 302.
- the lymphocytes from step 307 are allowed to expand.
- typical expansion times are 10 to 12 days, and typical culture expansion conditions are described by Tapia-Calle et al. (2019) and Schanen et al. (2011) “Coupling sensitive in vitro and in silico techniques to assess cross-reactive CD4(+) T cells against the swine-origin H1N1 influenza virus.” Vaccine. 2011 Apr 12;29(17):3299-309, each of which is incorporated by reference in its entirety herein.
- multiple rounds of in vitro stimulation are used that repeat steps 305-308 to expand rare lymphocytes, for example using the in vitro stimulation cycle method described in Abrams et al. (1997) “Generation of stable CD4+ and CD8+ T cell lines from patients immunized with ras oncogene-derived peptides reflecting codon 12 mutations.” Cell Immunol. 1997 Dec 15;182(2):137-51, incorporated by reference in its entirety herein.
- the enrichment of lymphocytes activated by the control antigens added at step 305 or step 307 is monitored to determine the number of rounds of in vitro stimulation required.
- step 309 second lymphocyte enrichment step
- step 309 is omitted and lymphocytes are not enriched after they have undergone expansion at step 308.
- second APCs are prepared fresh in step 310 from the tissue sample from step 301 as described herein; the second APCs are added into a single pool, and the query antigens are added to this single pool of second APCs.
- the second APCs are pulsed (e.g., for two hours) and the second APCs are then washed.
- nucleic acid constructs encoding the query antigens are transfected or virally delivered into the second APCs.
- the second APCs after antigen addition in step 310, are added to the unstimulated pool(s) and N stimulated pools.
- the query antigens are all added directly to the unstimulated pool(s) and to the N stimulated pools along with output of second APC preparation from step 310.
- the unstimulated pool(s) and N stimulated pools will already contain APCs (e.g., dendritic cells), step 310 is eliminated, and at step 311, all query antigens are added directly to the one or more unstimulated pool(s) and N stimulation pools.
- APCs e.g., dendritic cells
- cells in the resulting pools are given time to activate and then each pool is separated by markers for activated and non-activated lymphocytes of a desired type, the LCR chains in each pool specific fraction are sequenced, and the decoding algorithm described herein is used to assign, at step 313, LCR chain sequences to one or more first antigens based upon their expansion of lymphocytes that were subsequently activated by query antigens.
- the enrichment of LCR chain sequences in the N stimulated pools utilizes the LCR chain sequence read counts or cell counts observed for the same LCR chain sequence in the unstimulated pool(s), and the detection of an enriched LCR chain sequence of a lymphocyte that recognizes a first antigen in one or more of the N stimulation pools is based upon its relative read count or cell count when compared to the unstimulated pool(s).
- This enrichment is then used for decoding one or more pool specific first antigens as described herein.
- This LCR chain sequence enrichment corresponds to a lymphocyte that is activated by at least one of the query antigens in addition to the one or more first antigens that are decoded.
- these LCR chain sequences recognize both the one or more first antigens decoded and at least one of the query antigens.
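One way to turn the enrichment comparison above into the binary code word that is passed to decoding is sketched below. The fold-change threshold, the pseudocount, and the function name are assumptions chosen for illustration; the statistical metrics actually used to call a sequence present in a pool are not specified in this passage.

```python
def observed_code_word(stimulated_counts, unstimulated_count,
                       min_fold=5.0, pseudocount=1.0):
    """Mark a pool with "1" when the sequence is enriched over the unstimulated pool.

    stimulated_counts: read or cell counts of one LCR chain sequence in pools 1..N.
    unstimulated_count: count of the same sequence in the unstimulated pool.
    min_fold and pseudocount are illustrative assumptions, not disclosed values.
    """
    bits = []
    for count in stimulated_counts:
        fold = (count + pseudocount) / (unstimulated_count + pseudocount)
        bits.append("1" if fold >= min_fold else "0")
    return "".join(bits)

# Invented counts for one LCR chain sequence across N = 5 stimulated pools.
word = observed_code_word([40, 2, 55, 0, 3], unstimulated_count=4)
```

The resulting string has one position per pool and can be compared against the basis code words by Hamming distance.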
- a method for determining one or more LCR chain sequences associated with lymphocytes that are activated by one or more identified query antigens, where the lymphocytes have been previously expanded by one or more first antigens.
- at least one of the query antigens and first antigens are not identical.
- a tissue sample, e.g., PBMCs
- the tissue sample is prepared at step 401 and is HLA typed at step 402 to determine the predicted display of antigens by the MHC molecules present in the tissue sample.
- the HLA typing from step 402 is used to determine the first antigens and query antigens that are used based upon their predicted or known display by MHC molecules.
- lymphocytes e.g., B cells, CD4+ T cells, and/or CD8+ T cells, or any other desired set of lymphocytes or lymphocyte combinations
- lymphocytes are enriched from a portion of the tissue sample at step 403 using negative magnetic bead selection, or other methods including methods described in Dagur et al. (2015).
- the tissue sample is used directly without lymphocyte enrichment and step 403 is omitted.
- the output of step 403 is divided into one or more unstimulated pools (i.e., one or more control pools) and a stimulated pool.
- APCs are prepared from the tissue sample from step 401 using, for example, the methods described by Schanen et al. (2008) and Moser et al. (2010). Preparing purified APCs for antigen presentation to lymphocytes can improve the effectiveness of antigen display and of lymphocyte activation by the APCs.
- the APCs from step 405 are divided into a control APC pool (not exposed to first antigens) and a first antigen exposed APC pool.
- the first antigens are then added to the first antigen exposed APC pool.
- the antigen exposed APC fraction of cells is pulsed (e.g., for two hours) with the first antigens and then washed.
- nucleic acid constructs encoding the first antigens are transfected or virally delivered into the antigen exposed fraction of APCs.
- the first antigen exposed APCs from step 406 are added to the stimulated pool from step 404.
- the control APC pool (not exposed to first antigen) from step 406 is added to the unstimulated pool(s) from step 404.
- the first antigens are added directly to the stimulated pool from step 404 along with output of APC preparation from step 405.
- the unstimulated (i.e., control) and stimulation pools from step 404 will already contain APCs (e.g., dendritic cells), steps 405 and 406 are eliminated, and the first antigens are added directly to the N stimulation pools from step 404.
- control antigens, e.g., a CAP1 peptide or other known MHC class I or class II control peptides
- control antigens are added to all pools at step 407.
- control antigens are added to the first APCs at step 405.
- the control antigens are selected based upon the HLA typing from step 402. [0079] As shown in FIG.
- the lymphocytes from step 407 are allowed to expand.
- typical expansion times are 10 to 12 days, and typical culture expansion conditions are described by Tapia-Calle et al. (2019) and Schanen et al. (2011).
- multiple rounds of in vitro stimulation are used that repeat steps 405, 406, and 407 to expand rare lymphocytes, for example using the in vitro stimulation cycle method described in Abrams et al. (1997) “Generation of stable CD4+ and CD8+ T cell lines from patients immunized with ras oncogene-derived peptides reflecting codon 12 mutations.” Cell Immunol. 1997 Dec 15;182(2):137-51, incorporated in its entirety herein.
- the enrichment of lymphocytes activated by the control antigens added at step 405 or 407 is monitored to determine the number of rounds of in vitro stimulation required.
- desired lymphocytes are enriched at step 409 using negative magnetic bead selection, or other methods as described above.
- step 409 is omitted and lymphocytes are not enriched after they have undergone expansion at step 408.
- fresh second APCs are prepared at step 410 from the tissue sample prepared at step 401 as described herein, and the second APCs are split into second control APC and second N APC pools.
- pool specific query antigens are encoded and placed into the second N APC pools as described by the methods herein.
- all of the pool specific query antigens are added to the second control pool of APCs from step 410 to test the unstimulated pool(s) for lymphocyte activation that is independent of first antigen stimulation.
- the APCs are pulsed (e.g., for two hours) in their respective pools and then the APCs are washed.
- the stimulated pool is divided into N stimulated pools.
- the antigen exposed second N APC pools from step 411 are added to these N stimulation pools with corresponding numbers (e.g., second APC pool 1 is added to stimulation pool 1, etc.).
- the second control APC pools (exposed to the query antigens) from step 411 are added to the unstimulated pool(s).
- when lymphocyte enrichment at step 409 is not used, the output of step 408 will already contain APCs (e.g., dendritic cells), steps 410 and 411 are omitted, and at step 412, pool specific query antigens are added to the unstimulated and N stimulated pools created by step 412 with pool selection for each antigen accomplished using the encoding methods described herein.
- APCs, e.g., dendritic cells
- the lymphocytes are given time to activate, and then each pool is separated by markers for activated and non-activated lymphocytes of a desired type, the LCR chains in each pool specific fraction are sequenced, and the decoding algorithm described herein is used to assign LCR chain sequences to one or more query antigens that activate lymphocytes that were expanded by the set of first antigens.
- the enrichment of LCR chain sequences in the N stimulated pools utilizes the LCR chain sequence read counts or cell counts observed for the same LCR chain sequence in the unstimulated pool(s), and the detection of an enriched LCR chain sequence of a lymphocyte that recognizes a query antigen in one or more of the N stimulation pools is based upon its increased read count or cell count when compared to the unstimulated pool(s).
- This enrichment is then used for decoding one or more pool specific query antigens as described herein.
- This LCR chain sequence enrichment corresponds to a lymphocyte that is expanded by at least one of the first antigens in addition to the one or more query antigens that are decoded.
- these LCR chain sequences recognize both the one or more query antigens decoded and at least one of the first antigens.
- APCs or APCs mixed with other cell types can be stimulated with a vaccine that consists of one or more antigens that are physically associated with (e.g., covalently coupled to) a VHH domain that binds to cells that have MHC class II molecules on their surface.
- a VHH targeting domain is any VHH domain that competes for binding to MHC class II complexes HLA-DR1, HLA-DR2, and HLA-DR4 with a VHH comprising SEQ ID NO: 1 or SEQ ID NO:
- VHH targeting domains are VHH molecules that bind to cell surface proteins of antigen presenting cells (e.g., DEC-205). In some embodiments, VHH targeting domains are VHH molecules that bind to cell surface proteins present on cells that have MHC class II molecules on their surface. In some embodiments, VHH targeting domains are VHH molecules that bind to cell type specific surface proteins (e.g,
- antigens physically associated with VHH targeting domains are used in one or more of the following steps: steps 306 and 311 of FIG. 3, as well as steps 406 and 411 of FIG. 4.
- Examples of VHH targeting domains are SEQ ID NO: 1 and SEQ ID NO: 2.
- VHH targeting domains are joined to antigens with linker sequences including fusion protein linkers described in Chen et al. (2012) “Fusion protein linkers: property, design and functionality.” Advanced Drug Delivery Reviews 65.10 (2013): 1357-1369. PMID 23026637, which is incorporated by reference in its entirety herein.
- linker sequences appear before an antigen.
- linker sequences appear after an antigen.
- antigens are natively occurring epitopes, such as the KRAS neoantigens LVVVGADGV (SEQ ID NO: 5) and EYKLVVVGADGVG (SEQ ID NO: 7).
- antigens are heteroclitic derivatives of naturally occurring epitopes as described by U.S. Patent No. 11,058,751, which is incorporated in its entirety herein.
- a vaccine comprises one or more heteroclitic antigens that are physically associated with a VHH targeting domain.
- LMVVGADGV (SEQ ID NO: 4) is a heteroclitic derivative of LVVVGADGV (SEQ ID NO: 5), and EYKFVVFGSDGAG (SEQ ID NO: 6) is a heteroclitic derivative of EYKLVVVGADGVG (SEQ ID NO: 7).
- An example of a VHH targeting domain (SEQ ID NO: 1) that is combined with a linker (SEQ ID NO: 3) and the single heteroclitic antigen LMVVGADGV (SEQ ID NO: 4) is SEQ ID NO: 8.
- VHH targeting domain SEQ ID NO: 1
- SEQ ID NO: 3 linker
- SEQ ID NO: 4 heteroclitic antigen LMVVGADGV
- SEQ ID NO: 3 linker
- EYKFVVFGSDGAG SEQ ID NO: 6
- a VHH-antigen molecule is a single polypeptide vaccine that encodes one or more antigens that are covalently coupled to a VHH targeting domain.
- VHH-antigen molecules are SEQ ID NO: 8 and SEQ ID NO: 9.
- VHH-antigen molecules can be expressed and purified, using for example the methods described in U.S. Patent No. 9,751,945, which is incorporated herein in its entirety.
- a VHH-antigen molecule is encoded as an mRNA molecule that is expressed in vivo, for example in a cell line or in an individual.
- the encoding of a VHH-antigen molecule as an mRNA molecule for expression includes a start codon at its beginning.
- the encoding of a VHH-antigen molecule as an mRNA molecule includes a secretion signal sequence as described in U.S. Patent No. 9,751,945, which is incorporated herein in its entirety.
- a VHH-antigen mRNA molecule is delivered with an mRNA-LNP formulation as is known in the art.
- a vaccine for administration to an individual can be constructed by physically associating (e.g., covalent coupling) one or more antigens to a VHH targeting domain.
- a vaccine for administration to an individual can be constructed by physically associating (e.g., covalent coupling) one or more heteroclitic antigens to a VHH targeting domain.
- collision free superimposed codes are used to assign antigens to pools.
- a collision free superimposed code is defined as a superimposed code that guarantees that each superimposed code word has a unique decoding into one or more antigens.
- a superimposed code encodes multiple antigens into a single superimposed code word by the logical “OR” of their antigen specific code words.
- collision free superimposed codes assume that A antigens are each placed into n pools out of a total of N pools and LCRs only recognize up to r antigens.
- the superimposed code for antigens 1 and 2 in Table 2 is “1 1 1 0 0 1 1 0 0 1” which does not collide with any other antigen code word (or superimposed code word of two antigens) in Table 2.
- the collision free superimposed code in Table 2 guarantees that any superimposed code word (a single antigen code word, or the logical OR of any two antigen code words) has a unique decoding into its originating one or two antigens.
- nearest set decoding as described herein can be used to determine the antigens recognized by an LCR based upon the appearance of the LCR receptor sequence in pools that correspond to a “1” in a superimposed code, and “0” where the LCR receptor sequence does not appear.
- LCR receptor sequence appearance in a pool is based upon statistical metrics as described herein.
- collision free superimposed codes are determined by a random search method.
- an antigen is chosen at random to initialize the search.
- at Step 1, a random code word is chosen for the antigen that is distinct from any previously chosen antigen code word, where the randomly chosen antigen code word has exactly n “1” bits and a total length of N bits.
- at Step 2, all superimposed code words for existing antigens and the new antigen code word for combinations up to r are computed.
- at Step 3, if any of the superimposed code words computed in Step 2 are the same, then the method returns to Step 1 to pick a replacement antigen code word.
- at Step 4, the code word for the antigen is recorded, a new antigen is chosen at random, and the method continues again from Step 1.
- the method has determined a collision free superimposed code. In some embodiments, if at Step 1 all possible remaining code words have been tried for a given antigen, then the method stops with failure for the parameters provided, and the method can be repeated starting over from Step 1. In some embodiments, if a fixed number of random code words selected at Step 1 fail in a row without a new code word being recorded at Step 4, the method stops with failure to find a collision free superimposed code, and the method can be repeated from Step 1. After multiple failed attempts, it is possible that a superimposed code with the given constraints does not exist.
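The random search described above can be sketched as follows. This is a minimal illustration under stated assumptions: code words are Python integers, the function names and the `seed` parameter are hypothetical, and only the second stopping rule (a fixed number of consecutive failures, here `max_failures`) is implemented.

```python
import random
from itertools import combinations

def superimposed_words(code_words, r):
    """OR together every combination of 1..r code words and return all results."""
    results = []
    for size in range(1, r + 1):
        for combo in combinations(code_words, size):
            word = 0
            for w in combo:
                word |= w
            results.append(word)
    return results

def is_collision_free(code_words, r):
    """True when no two combinations of up to r code words share an OR result."""
    words = superimposed_words(code_words, r)
    return len(words) == len(set(words))

def random_search(num_antigens, N, n, r, max_failures=1000, seed=0):
    """Return a list of collision free code words, or None after repeated failures."""
    rng = random.Random(seed)
    chosen = []
    failures = 0
    while len(chosen) < num_antigens:
        # Step 1: random code word with exactly n "1" bits out of N positions.
        candidate = sum(1 << bit for bit in rng.sample(range(N), n))
        # Steps 2 and 3: recompute all superimposed words and check for collisions.
        if candidate not in chosen and is_collision_free(chosen + [candidate], r):
            chosen.append(candidate)  # Step 4: record the code word
            failures = 0
        else:
            failures += 1
            if failures >= max_failures:
                return None  # caller may restart the search with another seed
    return chosen

code = random_search(num_antigens=4, N=10, n=3, r=2)
```

Recomputing every superimposed code word on each attempt keeps the sketch close to the steps as written; a larger search would more likely cache the existing OR results and test only the combinations that involve the new candidate.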
- antigens are arranged into overlap sets, where it is assumed that no LCR can recognize antigens in distinct overlap sets. For example, 30 antigens can be organized into 10 overlap sets of 3 antigens each. In this example, it is assumed that each LCR may recognize a maximum of r antigens in each overlap set.
- a collision free superimposed code consists of a prefix code that determines an overlap set, and a suffix code that determines the one or more antigens within this overlap set.
- a given antigen is placed into pools corresponding to “1” bits in the prefix code for its overlap set, and into pools corresponding to “1” bits in their antigen specific code (the suffix code) within their overlap set.
- the suffix code has one code word for each antigen in the largest overlap set.
- overlap sets share code words (e.g., the first antigen in each overlap set has the same suffix code word, the second antigen in each overlap set has the same suffix code word, etc.).
- the suffix code is a collision free superimposed code with r equal to the assumed maximum number of antigens that are recognized by an LCR within an overlap set.
- the number of bits (e.g., pools) for the suffix code is chosen to accommodate the number of antigens in the largest overlap set and the value of r.
- Table 3 illustrates a collision free superimposed code for 30 antigens placed into 8 pools where each LCR is assumed to not recognize antigens in distinct overlap sets.
- a “1” indicates that an antigen is placed into a pool, and a “0” indicates that an antigen is not placed into a pool.
- the example superimposed code in Table 3 is for 30 antigens organized into 10 overlap sets of 3 antigens per set.
- a prefix code is used to place the 30 antigens into pools P1 to P5, and a suffix code is used to place the 30 antigens into pools P6 to P8.
- the prefix code uses a two out of five encoding system.
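The prefix and suffix construction for 30 antigens in 8 pools can be sketched as below. The two out of five prefix follows the text; the one-hot suffix (one pool per antigen position within an overlap set) is an assumption made for this example, since the suffix code words of Table 3 are not reproduced here, and all function names are hypothetical.

```python
from itertools import combinations

PREFIX_BITS, SUFFIX_BITS = 5, 3  # pools P1 to P5 and P6 to P8

def two_of_five_prefixes():
    """All 5-bit strings with exactly two "1" bits (10 in total)."""
    return ["".join("1" if i in pair else "0" for i in range(PREFIX_BITS))
            for pair in combinations(range(PREFIX_BITS), 2)]

def one_hot_suffixes():
    """Assumed suffix code: antigen j of an overlap set gets only bit j set."""
    return ["".join("1" if i == j else "0" for i in range(SUFFIX_BITS))
            for j in range(SUFFIX_BITS)]

def build_code():
    """Map (overlap set, antigen position) to an 8-bit pool membership string."""
    code = {}
    for s, prefix in enumerate(two_of_five_prefixes(), start=1):
        for a, suffix in enumerate(one_hot_suffixes(), start=1):
            code[(s, a)] = prefix + suffix
    return code

def bitwise_or(words):
    """Position-wise logical OR of equal-length bit strings."""
    return "".join("1" if any(w[i] == "1" for w in words) else "0"
                   for i in range(len(words[0])))

def within_set_superimposed(code, max_r=3):
    """OR every combination of up to max_r antigens drawn from one overlap set."""
    seen = {}
    for s in sorted({key[0] for key in code}):
        members = sorted(key for key in code if key[0] == s)
        for size in range(1, max_r + 1):
            for combo in combinations(members, size):
                word = bitwise_or([code[m] for m in combo])
                seen.setdefault(word, []).append(combo)
    return seen

code = build_code()
seen = within_set_superimposed(code)
```

Because the prefix identifies the overlap set and the assumed suffix identifies any subset of its three antigens, every superimposed code word formed within a single overlap set decodes uniquely, which is the collision free property under the assumption that no LCR recognizes antigens in distinct overlap sets.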
Abstract
The invention relates to a method for determining a T lymphocyte receptor chain sequence specific to a unique antigen, comprising: sorting a plurality of antigens into a plurality of reaction mixtures, wherein the sorting comprises adding a unique antigen of the plurality of antigens to a unique subset of the plurality of reaction mixtures such that two different unique antigens are not added to the same unique subset; contacting each reaction with a biological sample comprising a plurality of lymphocytes; separating a target lymphocyte from a subset of the plurality of lymphocytes, wherein the target lymphocyte recognizes the unique antigen; after separating the target lymphocyte, sequencing nucleic acids of the target lymphocyte to obtain the lymphocyte receptor chain sequence, wherein the sequencing is performed by single cell sequencing; and detecting the unique antigen, wherein the detecting comprises: calculating a frequency of lymphocyte cells that express the lymphocyte receptor chain sequence.
Applications Claiming Priority (6)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US17/142,745 (US11111489B1) | 2021-01-06 | 2021-01-06 | Multiplexed testing of lymphocytes for antigen specificity |
US17/386,702 (US20220213466A1) | 2021-07-28 | 2021-07-28 | Multiplexed testing of lymphocytes for antigen specificity |
US63/262,974 (US202163262974P) | 2021-10-25 | 2021-10-25 | |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2022150354A1 (fr) | 2022-07-14 |
Family
ID=82358138
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/US2022/011275 (WO2022150354A1, fr) | 2021-01-06 | 2022-01-05 | Multiplexed testing of lymphocytes for antigen specificity |
Country Status (1)
Country | Link |
---|---|
WO (1) | WO2022150354A1 (fr) |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 22737025 Country of ref document: EP Kind code of ref document: A1 |
|
NENP | Non-entry into the national phase |
Ref country code: DE |
|
122 | Ep: pct application non-entry in european phase |
Ref document number: 22737025 Country of ref document: EP Kind code of ref document: A1 |