WO2022150354A1 - Multiplexed lymphocyte assay for antigen specificity

Multiplexed lymphocyte assay for antigen specificity

Info

Publication number
WO2022150354A1
Authority
WO
WIPO (PCT)
Prior art keywords
antigens
antigen
lymphocyte
unique
cell
Prior art date
Application number
PCT/US2022/011275
Other languages
English (en)
Inventor
David Kenneth GIFFORD
Brandon CARTER
Original Assignee
Think Therapeutics, Inc.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from US 17/142,745 (external priority, patent US11111489B1)
Application filed by Think Therapeutics, Inc.
Publication of WO2022150354A1

Classifications

    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N33/00 Investigating or analysing materials by specific methods not covered by groups G01N1/00 - G01N31/00
    • G01N33/48 Biological material, e.g. blood, urine; Haemocytometers
    • G01N33/50 Chemical analysis of biological material, e.g. blood, urine; Testing involving biospecific ligand binding methods; Immunological testing
    • G01N33/53 Immunoassay; Biospecific binding assay; Materials therefor
    • G01N33/569 Immunoassay; Biospecific binding assay; Materials therefor for microorganisms, e.g. protozoa, bacteria, viruses
    • G01N33/56966 Animal cells
    • G01N33/56972 White blood cells
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6876 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
    • C12Q1/6881 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for tissue or cell typing, e.g. human leukocyte antigen [HLA] probes
    • C CHEMISTRY; METALLURGY
    • C40 COMBINATORIAL TECHNOLOGY
    • C40B COMBINATORIAL CHEMISTRY; LIBRARIES, e.g. CHEMICAL LIBRARIES
    • C40B30/00 Methods of screening libraries
    • C40B30/04 Methods of screening libraries by measuring the ability to specifically bind a target molecule, e.g. antibody-antigen binding, receptor-ligand binding
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N2333/00 Assays involving biological materials from specific organisms or of a specific nature
    • G01N2333/435 Assays involving biological materials from specific organisms or of a specific nature from animals; from humans
    • G01N2333/705 Assays involving receptors, cell surface antigens or cell surface determinants
    • G01N2333/70503 Immunoglobulin superfamily, e.g. VCAMs, PECAM, LFA-3
    • G01N2333/7051 T-cell receptor (TcR)-CD3 complex

Definitions

  • the present invention relates generally to identification of lymphocyte receptors that are specific to target antigens. More particularly, the present invention relates to systems and methods of accurately identifying lymphocyte (e.g., B cell or T cell) receptor sequence chains that are specific to one or more antigens or peptides of interest.
  • Patent Nos. 10,066,265 and 10,077,478 disclose methods for determining the sequence of one or more lymphocyte receptor chains specific to antigens of interest but fail to disclose systems and methods that can produce accurate lymphocyte receptor chain sequences (e.g., with low false positive/negative rates) specific to one or more target antigens. There exists a need for improved methods and assays for discovering lymphocyte receptor chain sequences that bind to specific antigens in pool-based detection formats and algorithms.
  • the invention provides for a method for determining a T cell receptor chain sequence, or a portion thereof, specific for one or more antigens, the method comprising: sorting a plurality of first antigens into a plurality of reaction mixtures, wherein the sorting comprises adding a unique antigen of the plurality of first antigens to a unique subset of the plurality of reaction mixtures, and wherein two unique antigens are not added to any two identical subsets of the plurality of reaction mixtures, contacting each reaction mixture with a biological sample comprising a plurality of T cells, providing a condition for a first activated T cell in at least one reaction mixture of the plurality of reaction mixtures to expand in number such that a plurality of T cell clones is formed, contacting the plurality of T cell clones with a query antigen, separating a second activated T cell and a non-activated T cell from a subset of the plurality of reaction mixtures, wherein the second activated T cell recognizes
  • separating the second activated T cell and the non-activated T cell is performed based on a marker, wherein the marker is selected from the group consisting of CD3, CD4, CD8, CD137, OX40, CD25, PD-L1, CD69, CD154, and a combination thereof.
  • the T cell receptor chain sequence comprises a receptor chain sequence pair, wherein the receptor chain sequence pair consists of an alpha chain sequence and a beta chain sequence.
  • the second activated T cell recognizes the query antigen by binding an MHC complex comprising the query antigen.
  • the sorting further comprises applying, using a processor, an error-correcting code configured to determine the unique subset of the plurality of reaction mixtures that the unique antigen is added to.
  • the error-correcting code is a superimposed code.
  • the detecting comprises applying, using a processor, a decoding algorithm, wherein the decoding algorithm is configured to detect the unique antigen specific for the T cell receptor chain sequence when the T cell receptor chain sequence is not substantially present in at least one reaction mixture of the unique subset of the plurality of reaction mixtures.
  • the decoding algorithm is a nearest neighbor algorithm.
  • the query antigen is different from any antigen of the plurality of first antigens.
  • separating the second activated T cell and the non-activated T cell from the subset of the plurality of reaction mixtures is performed using multimer sorting. In some embodiments, separating the second activated T cell and the non-activated T cell from the subset of the plurality of reaction mixtures is performed using fluorescence-based sorting. In some embodiments, separating the second activated T cell and the non-activated T cell from the subset of the plurality of reaction mixtures is performed using bead-based sorting. In some embodiments, a number of reaction mixtures corresponding to the unique subset of the plurality of reaction mixtures is a function of a number of expected unique antigens that are specific to the T cell receptor chain sequence.
  • the plurality of reaction mixtures comprises at least one control reaction mixture, wherein the control reaction mixture does not contain any antigens that are added to the biological sample.
  • the detecting further comprises computing a frequency of T cells that express the T cell receptor chain sequence.
  • the invention provides for a method for determining a T cell receptor chain sequence, or a portion thereof, specific for one or more antigens, the method comprising: adding a plurality of first antigens to a first reaction mixture, contacting the first reaction mixture with a biological sample comprising a plurality of T cells, providing a condition for a first activated T cell in the first reaction mixture to expand in number such that a plurality of T cell clones is formed, sorting a plurality of query antigens into a plurality of reaction mixtures, wherein the sorting comprises adding a first query antigen of the plurality of query antigens to a unique subset of the plurality of reaction mixtures, wherein two unique query antigens are not added to any two identical subsets of the plurality of reaction mixtures, and wherein the first query antigen is different from any antigen of the plurality of first antigens, contacting each reaction mixture of the plurality of reaction mixtures with a portion of the first reaction mixture comprising the pluralit
  • separating the second activated T cell from the subset of the plurality of T cell clones is performed based on a marker, wherein the marker is selected from the group consisting of CD3, CD4, CD8, CD137, OX40, CD25, PD-L1, CD69, CD154, and a combination thereof.
  • the T cell receptor chain sequence comprises a receptor chain sequence pair, wherein the receptor chain sequence pair consists of an alpha chain sequence and a beta chain sequence.
  • the second activated T cell recognizes the first query antigen by binding an MHC complex comprising the first query antigen.
  • the sorting further comprises applying, using a processor, an error-correcting code configured to determine the unique subset of the plurality of reaction mixtures that the first query antigen is added to.
  • the error-correcting code is a superimposed code.
  • the detecting comprises applying, using a processor, a decoding algorithm, wherein the decoding algorithm is configured to detect the first query antigen specific for the T cell receptor chain sequence when the T cell receptor chain sequence is not substantially present in at least one reaction mixture of the unique subset of the plurality of reaction mixtures.
  • the decoding algorithm is a nearest neighbor algorithm.
  • separating the second activated T cell from the subset of the plurality of T cell clones is performed using multimer sorting. In some embodiments, separating the second activated T cell from the subset of the plurality of T cell clones is performed using fluorescence-based sorting. In some embodiments, separating the second activated T cell from the subset of the plurality of T cell clones is performed using bead-based sorting. In some embodiments, a number of reaction mixtures corresponding to the unique subset of the plurality of reaction mixtures is a function of a number of expected query antigens that are specific to the T cell receptor chain sequence.
  • the plurality of reaction mixtures comprises at least one control reaction mixture, wherein the control reaction mixture does not contain any antigens that are added to the biological sample.
  • the detecting further comprises computing a frequency of T cells that express the T cell receptor chain sequence.
  • the invention provides for a method for determining a lymphocyte cell receptor chain sequence, or a portion thereof, specific for at least two unique antigens, the method comprising: sorting a plurality of antigens into a plurality of reaction mixtures, wherein the sorting comprises adding the at least two unique antigens of the plurality of antigens to at least two unique subsets of the plurality of reaction mixtures such that the at least two unique antigens are not added to any two identical subsets of the plurality of reaction mixtures, and wherein the at least two unique subsets are configured to allow a detection of the at least two unique antigens that are specific to the lymphocyte cell receptor chain sequence, contacting each reaction mixture of the plurality of reaction mixtures with a biological sample comprising a plurality of lymphocytes, separating a target lymphocyte from a subset of the plurality of lymphocytes, wherein the target lymphocyte recognizes the at least two unique antigens of the plurality of antigens, sequencing nucleic
  • the lymphocyte is a T cell or a B cell. In some embodiments, separating the target lymphocyte is performed using multimer sorting. In some embodiments, the target lymphocyte is a T cell, and wherein separating the T cell is based on a marker selected from the group consisting of CD3, CD4, CD8, CD137, OX40, CD25, PD-L1, CD69, and CD154. In some embodiments, the lymphocyte cell receptor chain sequence comprises a receptor chain sequence pair, and wherein the receptor chain sequence pair consists of two components of a receptor of the target lymphocyte.
  • a number of reaction mixtures comprising the at least two unique subsets is a function of a number of expected antigens that are specific to the lymphocyte cell receptor chain sequence.
  • the plurality of reaction mixtures comprises at least one control reaction mixture, and wherein the control reaction mixture does not contain any antigens that are added to the biological sample.
  • the target lymphocyte recognizes the at least two unique antigens of the plurality of antigens by binding the at least two unique antigens of the plurality of antigens or by binding two or more molecular complexes comprising the at least two unique antigens of the plurality of antigens.
  • the detecting further comprises applying, by a processor, a nearest set decoding algorithm configured to determine the at least two unique antigens that are specific to the lymphocyte cell receptor chain sequence. In some embodiments, the detecting further comprises: applying, by a processor, a decoding algorithm, wherein the decoding algorithm is configured to detect the at least two unique antigens that are specific to the lymphocyte cell receptor chain sequence when the lymphocyte cell receptor chain sequence is not substantially present in at least one reaction mixture of the at least two unique subsets of the plurality of reaction mixtures. In some embodiments, the method further comprises assigning a superimposed code to each antigen of the plurality of antigens, wherein the superimposed code is configured to allow a detection of the at least two unique antigens that are specific to the lymphocyte cell receptor chain sequence.
  • the invention provides for a method for determining a lymphocyte cell receptor chain sequence, or a portion thereof, specific for at least two unique antigens, the method comprising: sorting a plurality of antigens into a plurality of reaction mixtures, wherein the sorting comprises adding the at least two unique antigens of the plurality of antigens to at least two unique subsets of the plurality of reaction mixtures such that the at least two unique antigens are not added to any two identical subsets of the plurality of reaction mixtures, and wherein the at least two unique subsets are configured to allow a detection of the at least two unique antigens that are specific to the lymphocyte cell receptor chain sequence, contacting each reaction mixture with a biological sample comprising a plurality of lymphocytes, separating a target lymphocyte from a subset of the plurality of lymphocytes, wherein the target lymphocyte reacts with the at least two unique antigens of the plurality of antigens, sequencing nucleic acids of the target lymphocyte to
  • the lymphocyte is a T cell or a B cell.
  • the lymphocyte cell receptor chain sequence comprises a receptor chain sequence pair, and wherein the receptor chain sequence pair consists of two components of a receptor of the target lymphocyte. In some embodiments, the method further comprises contacting at least one reaction mixture of the plurality of reaction mixtures with a query antigen.
  • the invention provides for a method for determining a lymphocyte cell receptor chain sequence, or a portion thereof, specific to a unique antigen, the method comprising: sorting a plurality of antigens into a plurality of reaction mixtures, wherein the sorting comprises adding a unique antigen of the plurality of antigens to a unique subset of the plurality of reaction mixtures such that two different unique antigens are not added to the unique subset, contacting each reaction mixture of the plurality of reaction mixtures with a biological sample comprising a plurality of lymphocytes, separating a target lymphocyte from a subset of the plurality of lymphocytes, wherein the target lymphocyte recognizes the unique antigen, after separating the target lymphocyte, sequencing nucleic acids of the target lymphocyte to obtain the lymphocyte receptor chain sequence, wherein the sequencing is performed by single-cell sequencing, and detecting the unique antigen, wherein the detecting comprises: computing a frequency of lymphocyte cells that express the lymphocyte receptor chain sequence.
  • the lymphocyte is a T cell or a B cell.
  • the target lymphocyte is a T cell, and wherein the T cell is separated based on a marker selected from the group consisting of CD3, CD4, CD8, CD137, OX40, CD25, PD-L1, CD69, CD154, or a combination thereof.
  • the lymphocyte cell receptor chain sequence comprises a receptor chain sequence pair, and wherein the receptor chain sequence pair consists of two components of a receptor of the target lymphocyte.
  • the detecting further comprises: computing a gene expression value of a gene of the target lymphocyte.
  • the plurality of reaction mixtures comprises at least one control reaction mixture, and wherein the control reaction mixture does not contain any antigens that are added to the biological sample.
  • the target lymphocyte recognizes the unique antigen by binding the unique antigen or by binding one or more molecular complexes comprising the unique antigen.
  • the detecting further comprises applying, by a processor, a nearest set decoding algorithm configured to determine the unique antigen that is specific to the lymphocyte receptor chain sequence.
  • the detecting further comprises: applying, by a processor, a decoding algorithm, wherein the decoding algorithm is configured to detect the one or more antigens that are specific to the lymphocyte receptor chain sequence when the lymphocyte cell receptor chain sequence is not substantially present in at least one reaction mixture of the unique subset of the plurality of reaction mixtures, and wherein the at least one reaction mixture comprises the one or more antigens.
  • the invention provides for a method for determining a lymphocyte receptor chain sequence, or a portion thereof, specific for at least one antigen, the method comprising: providing a biological sample comprising a plurality of lymphocytes, extracting a plurality of first antigen presenting cells from the biological sample, dividing the plurality of first antigen presenting cells into a plurality of first reaction mixtures, sorting a plurality of first antigens into the plurality of first reaction mixtures, wherein the sorting comprises adding a unique first antigen of the plurality of first antigens to a unique subset of the plurality of first reaction mixtures, and wherein two unique first antigens are not added to any two identical subsets of the plurality of first reaction mixtures, contacting each first reaction mixture with the biological sample, providing a condition for a first activated lymphocyte in at least one first reaction mixture of the plurality of first reaction mixtures to expand in number such that a plurality of lymphocyte clones is formed, extracting a plurality of
  • the lymphocyte is a T cell or a B cell.
  • HLA typing of the biological sample to determine a predicted display of at least one antigen of the plurality of first antigens by an MHC molecule present in the biological sample.
  • enriching the plurality of lymphocytes prior to sorting the plurality of first antigens into the plurality of first reaction mixtures.
  • enriching the plurality of lymphocytes after providing the condition for the first activated lymphocyte to expand in number and prior to extracting the plurality of second antigen presenting cells.
  • separating the second activated lymphocyte and the non-activated lymphocyte is performed based on a marker, wherein the marker is selected from the group consisting of CD3, CD4, CD8, CD137, OX40, CD25, PD-L1, CD69, CD154, and a combination thereof.
  • the second activated lymphocyte recognizes the query antigen by binding an MHC complex comprising the query antigen.
  • the sorting further comprises applying, using a processor, an error-correcting code configured to determine the unique subset of the plurality of first reaction mixtures that the unique first antigen is added to.
  • the error-correcting code is a collision free superimposed code configured to allow for detection of at least two unique first antigens specific for the lymphocyte receptor chain sequence.
  • the collision free superimposed code is determined by a random search method.
  • the collision free superimposed code consists of: a plurality of prefix codes, wherein a prefix code of the plurality of prefix codes is assigned to the unique first antigen of the plurality of first antigens, wherein the prefix code identifies an overlap set, wherein the prefix code is identical for more than one first antigen of the plurality of first antigens within the overlap set, and a plurality of suffix codes, wherein a suffix code of the plurality of suffix codes is assigned to the unique first antigen of the plurality of first antigens, wherein a combination of the prefix code and the suffix code is distinct for the unique first antigen.
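The random search mentioned above can be illustrated with a minimal Python sketch. It rests on assumptions that are not stated in the source: "collision free" is taken to mean that every set of one or two antigens produces a distinct pool pattern (the bitwise OR of the members' code words), every code word has the same fixed weight, and the names `union_signatures`, `is_collision_free`, `random_search`, and `decode_set` are hypothetical. The sketch does not implement the prefix/suffix construction described above.

```python
import random
from itertools import combinations

def union_signatures(code):
    """Pool pattern (bitwise OR of code words) for every set of one or two antigens."""
    sigs = {}
    for size in (1, 2):
        for subset in combinations(sorted(code), size):
            pattern = 0
            for antigen in subset:
                pattern |= code[antigen]
            sigs[subset] = pattern
    return sigs

def is_collision_free(code):
    """Assumed definition: no two antigen sets (size 1 or 2) share a pool pattern."""
    sigs = union_signatures(code)
    return len(set(sigs.values())) == len(sigs)

def random_search(num_antigens, num_pools, weight, seed=0, max_tries=10_000):
    """Draw random fixed-weight code words until a collision-free code is found."""
    rng = random.Random(seed)
    # Every pool pattern with exactly `weight` pools set, stored as an integer bit mask.
    all_words = [sum(1 << p for p in pools)
                 for pools in combinations(range(num_pools), weight)]
    for _ in range(max_tries):
        words = rng.sample(all_words, num_antigens)
        code = dict(enumerate(words, start=1))  # antigen number -> code word
        if is_collision_free(code):
            return code
    return None  # no collision-free code found within max_tries

def decode_set(pattern, code):
    """Look up which one or two antigens explain an observed pool pattern."""
    lookup = {sig: subset for subset, sig in union_signatures(code).items()}
    return lookup.get(pattern)

code = random_search(num_antigens=6, num_pools=10, weight=3)
# A receptor that recognizes antigens 2 and 5 appears in the union of their pools.
print(decode_set(code[2] | code[5], code))
```

With this definition, a receptor chain sequence observed in exactly the union of two antigens' pools can be traced back to that pair, which is the situation the claims describe for sequences specific to at least two unique antigens.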
  • the detecting comprises applying, using a processor, a decoding algorithm, wherein the decoding algorithm is configured to detect the unique first antigen specific for the lymphocyte receptor chain sequence when the lymphocyte receptor chain sequence is not substantially present in at least one reaction mixture of the unique subset of the plurality of first reaction mixtures.
  • the decoding algorithm is a nearest set algorithm.
  • the query antigen is different from any antigen of the plurality of first antigens.
  • separating the second activated lymphocyte and the non-activated lymphocyte from the subset of the plurality of final reaction mixtures is performed using multimer sorting.
  • separating the second activated lymphocyte and the non-activated lymphocyte from the subset of the plurality of final reaction mixtures is performed using fluorescence-based sorting. In some embodiments, separating the second activated lymphocyte and the non-activated lymphocyte from the subset of the plurality of final reaction mixtures is performed using bead-based sorting. In some embodiments, a number of reaction mixtures corresponding to the unique subset of the plurality of first reaction mixtures is a function of a number of expected unique first antigens that are specific to the lymphocyte receptor chain sequence. In some embodiments, the plurality of first reaction mixtures comprises at least one control reaction mixture, wherein the control reaction mixture does not contain any antigens that are added to the biological sample. In some embodiments, the detecting further comprises computing a frequency of lymphocytes that express the lymphocyte receptor chain sequence.
  • FIG. 1 illustrates a flow chart of multiplexing of antigens into samples using an error correcting code that detects errors during demultiplexing.
  • FIG. 2 illustrates a flow chart of detection of lymphocytes specific to antigens.
  • FIG. 3 illustrates a flow chart of detection of lymphocytes that are expanded by exposure to one or more identified first antigens and are activated by one or more query antigens.
  • FIG. 4 illustrates a flow chart of detection of lymphocytes that are expanded by exposure to one or more first antigens and are activated by one or more identified query antigens.
  • a “unique antigen” is an antigen with a specific amino acid sequence.
  • a “unique antigen” is an antigen derived from a specific epitope which can include multiple related peptides that are derived from that same epitope, and the “unique antigen” can therefore have more than one possible amino acid sequence.
  • a lymphocyte is an immune system cell (e.g., T cell or B cell) that displays a receptor.
  • a lymphocyte cell receptor is abbreviated LCR.
  • a lymphocyte receptor chain sequence means the sequence of a portion of a receptor molecule that is most variable (e.g., a CDR3 region).
  • a lymphocyte receptor sequence pair is the two chain sequences of an immune receptor’s two components (e.g., for a T cell receptor, it is the alpha and beta chain sequence; for a B cell receptor, it is the heavy and light chain sequence).
  • a lymphocyte recognizes an antigen when at least one of the lymphocyte’s receptors binds the antigen, when at least one of the lymphocyte’s receptors binds a complex that includes an antigen (e.g., MHC complex), or the lymphocyte is activated when its receptor binds the antigen.
  • One advantage of the present systems and methods relates to LCR promiscuity. Certain LCR chain sequences will recognize more than one antigen, and those antigens may be contained in different pools (also referred to as reaction mixtures herein). Thus, an LCR sequence discovery algorithm that depends on LCR chain sequences appearing in pools/reaction mixtures unique to one antigen may fail to produce accurate results.
  • a second advantage of the present systems and methods relates to host lymphocyte activation and non-specific markers. Lymphocytes may display native activation markers when they are isolated from animals or patients in peripheral blood mononuclear cell (PBMC) samples, and thus their activation will not be a consequence of the assay antigens.
  • a third advantage of the present systems and methods relates to experimental noise correction.
  • a fourth advantage of the present systems and methods relates to LCR chain sequence count calibration.
  • the level of lymphocyte cell recognition of an antigen and sequence discovery will vary from assay to assay and person to person.
  • a means to normalize LCR chain sequence counts from different assays using control antigens/peptides can facilitate their direct comparison.
  • the present disclosure employs coding and an antigen control pool to reduce assay errors introduced by LCR promiscuity, host lymphocyte cell activation, and experimental noise. It also provides LCR chain sequence count calibration to permit comparison of disparate assays.
  • pooled assays are used to discover LCR chain sequences that correspond to LCRs displayed by lymphocyte cells that recognize a specific peptide/antigen.
  • K antigens (e.g., 15)
  • N antigen pools (e.g., 7)
  • K refers to the total number of antigens (or peptides)
  • N refers to the total number of antigen pools into which the K antigens (or peptides) are separated.
  • antigens are placed into pools in a manner that allows the identification of LCRs on lymphocyte cells that recognize more than one antigen (or peptide).
  • antigens are encoded into pools such that LCR chain sequences corresponding to an antigen (or peptide) do not have to appear (or be detected) in all pools where the antigen (or peptide) was present.
  • the ability to detect LCRs that recognize antigens (or peptides) without having all corresponding pools that contain the antigen be recognized by lymphocytes with the LCR improves the sensitivity and accuracy of the assay.
  • the method begins by distributing a plurality of antigens (also referred to as peptides herein) into a plurality of antigen pools.
  • antigens (e.g., antigen 1 to antigen 15, as shown in FIG. 1) are distributed into pools based on a minimum Hamming distance between the binary encodings of the antigen pools in which they reside.
  • Antigens (peptides) are given numbers from 1 to K (e.g., 1 to 15), and each antigen (peptide) number is encoded into N bits (e.g., each bit labeled as 0 or 1), where N is the total number of antigen pools.
  • the N-bit encoding of an antigen number may be called its code word.
  • FIG. 1 shows an example of 15 antigens (or peptides) that are each encoded into 7 bits (of 0s and 1s), where 7 is the number of antigen pools.
  • an antigen is placed/distributed into a given antigen pool if the bit corresponding to that antigen pool is labeled “1” in the encoding of its number, and the peptide is not placed/distributed into a given antigen pool if the bit corresponding to that antigen pool is labeled “0”, as shown in FIG. 1.
  • the encoding of the antigen number uses an error correcting code, such as a Hamming code, to enforce a minimum distance in bit changes between the encodings of two antigen numbers.
  • the distance between two encodings as measured by the number of bit differences is called the Hamming distance.
  • FIG. 1 shows the use of a “Hamming(7,4)” code that encodes 15 peptides into 7 bit code words (corresponding to 7 antigen pools) resulting in a minimum Hamming distance of 3 (i.e., 4 data bits, 3 parity bits, and 7 total bits corresponding to 7 antigen pools).
  • code words which do not place an antigen into at least one pool (i.e., all zeros)
  • FIG. 1 does not utilize the all zero code word from the Hamming(7,4) code.
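The pool assignment described above can be sketched in Python as follows. This is an illustrative construction only: the bit ordering and the mapping from antigen numbers to code words are assumptions and need not match FIG. 1, and the function names are hypothetical.

```python
from itertools import combinations, product

def hamming74_encode(data_bits):
    """Encode 4 data bits (d1, d2, d3, d4) as a 7-bit Hamming(7,4) code word."""
    d1, d2, d3, d4 = data_bits
    p1 = d1 ^ d2 ^ d4
    p2 = d1 ^ d3 ^ d4
    p3 = d2 ^ d3 ^ d4
    # Conventional bit order: p1 p2 d1 p3 d2 d3 d4
    return (p1, p2, d1, p3, d2, d3, d4)

def build_pool_design(num_antigens=15):
    """Assign antigens 1..K to non-zero code words and list each pool's contents."""
    # Skip the all-zero data word so every antigen lands in at least one pool.
    words = [hamming74_encode(d) for d in product((0, 1), repeat=4) if any(d)]
    code = {antigen: words[antigen - 1] for antigen in range(1, num_antigens + 1)}
    # An antigen goes into pool p when bit p of its code word is 1.
    pools = {p: [a for a, w in code.items() if w[p - 1] == 1] for p in range(1, 8)}
    return code, pools

def hamming_distance(u, v):
    """Number of positions at which two code words differ."""
    return sum(a != b for a, b in zip(u, v))

code, pools = build_pool_design()
min_dist = min(hamming_distance(code[a], code[b]) for a, b in combinations(code, 2))
print("minimum Hamming distance:", min_dist)
for pool, members in pools.items():
    print(f"pool {pool}: antigens {members}")
```

In this construction every pool receives 8 of the 15 antigens, and any two antigens differ in at least 3 pools.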
  • the use of an error correcting code can improve the sensitivity of the assay by not requiring detection of an LCR chain sequence from a lymphocyte that recognizes an antigen in every pool where the antigen is present.
  • the use of an error correcting code improves the accuracy of the assay by tolerating the detection in a biological sample of an LCR chain sequence from a lymphocyte that recognizes an antigen in one or more pools where the antigen is not present (i.e., a false positive).
  • the use of an error correcting code also improves the accuracy of the assay by tolerating the lack of detection in a biological sample of an LCR chain sequence from a lymphocyte that recognizes an antigen in one or more pools where the antigen is present (i.e., a false negative).
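A minimal sketch of how such errors can be absorbed during decoding is given below. It assumes a Hamming(7,4) codebook built from the 15 non-zero data words (an illustrative mapping, not necessarily that of FIG. 1) and a plain nearest-code-word rule; the function names are hypothetical.

```python
from itertools import product

def hamming74_codebook():
    """Map antigen numbers 1..15 to the non-zero Hamming(7,4) code words."""
    data_words = [d for d in product((0, 1), repeat=4) if any(d)]
    book = {}
    for antigen, (d1, d2, d3, d4) in enumerate(data_words, start=1):
        book[antigen] = (d1 ^ d2 ^ d4, d1 ^ d3 ^ d4, d1, d2 ^ d3 ^ d4, d2, d3, d4)
    return book

def decode_nearest(observed, codebook):
    """Return (antigen, distance) for the code word closest to the observed pattern."""
    def distance(word):
        return sum(a != b for a, b in zip(observed, word))
    best = min(codebook, key=lambda antigen: distance(codebook[antigen]))
    return best, distance(codebook[best])

codebook = hamming74_codebook()

# Simulate one pool-level error for antigen 9: flip the result of pool 3.
observed = list(codebook[9])
observed[2] ^= 1
print(decode_nearest(tuple(observed), codebook))  # (9, 1)
```

Because the minimum Hamming distance is 3, any single false positive or false negative pool still decodes to the original antigen; the returned distance lets a caller reject patterns that are too far from every code word.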
  • codes for asymmetric channels can be used when the chance of a “1” occurring by error is higher than the chance of a “0” occurring by error. In some embodiments, codes for asymmetric channels can be used when the chance of a “0” occurring by error is higher than the chance of a “1” occurring by error. In some embodiments, a “1” occurs more often than a “0” when the separation of lymphocytes based on various markers is imperfect (i.e., false positive; e.g., occurring at step 203 of FIG. 2).
  • a “0” occurs more often than a “1” when there are a small number of lymphocyte cells that recognize an antigen (or peptide), and thus certain pools may have an insufficient number of lymphocyte cells that recognize an antigen (or peptide) to generate a “1” signal (i.e., false negative).
  • a “1” occurs more often than a “0” not due to error or chance, but rather when a lymphocyte cell recognizes more than one antigen (or peptide), and thus produces hits in pools associated with both antigens (or peptides). Examples of asymmetric codes that can perform error detection and correction optimally under these circumstances can be found in Kim and Freiman (1959), incorporated by reference in its entirety herein.
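The effect of asymmetric error rates on decoding can be illustrated with a small Python sketch. It does not reproduce the codes of Kim and Freiman (1959); instead it swaps in a simple cost-weighted nearest-code-word rule, and the toy codebook, the cost values, and the function names are all hypothetical.

```python
def asymmetric_cost(observed, word, cost_fp, cost_fn):
    """Sum the penalties for spurious 1s (false positives) and missing 1s (false negatives)."""
    cost = 0.0
    for obs, expected in zip(observed, word):
        if obs == 1 and expected == 0:
            cost += cost_fp  # signal in a pool that should be empty
        elif obs == 0 and expected == 1:
            cost += cost_fn  # no signal in a pool that contains the antigen
    return cost

def decode_asymmetric(observed, codebook, cost_fp, cost_fn):
    """Pick the antigen whose code word explains the observation at the lowest cost."""
    return min(codebook,
               key=lambda name: asymmetric_cost(observed, codebook[name], cost_fp, cost_fn))

# Toy codebook for illustration only (not a minimum-distance-3 code).
toy_codebook = {"antigen_A": (1, 1, 1, 0, 0), "antigen_B": (0, 1, 0, 0, 0)}
observed = (1, 1, 0, 0, 0)

# Missing signals are the likelier error, so a false negative is the cheaper explanation.
dropout_likely = decode_asymmetric(observed, toy_codebook, cost_fp=3.0, cost_fn=1.0)
# Spurious signals are the likelier error, so a false positive is the cheaper explanation.
spurious_likely = decode_asymmetric(observed, toy_codebook, cost_fp=1.0, cost_fn=3.0)
print(dropout_likely, spurious_likely)  # antigen_A antigen_B
```

The same observation is attributed to different antigens depending on which error type is considered more likely, which is why the expected direction of errors matters when choosing a code and decoder.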
  • the antigen pools are exposed to a tissue sample (e.g., PBMCs) to cause antigen pool specific antigens to be exposed to the lymphocytes contained in the tissue sample.
  • a tissue sample e.g, PBMCs
  • lymphocyte cells are activated by the antigens and then separated into activated and non-activated cells, and optionally also separated by other markers, as described in greater detail below.
  • lymphocyte cells bind the antigens and are then separated into antigen bound and non bound cells, and optionally also separated by other markers, as described in greater detail below.
  • the method begins at step 201, in which antigens (e.g., peptides) are separated into a plurality of antigen pools (e.g., antigen pool 1 to antigen pool N) using the methods described herein (e.g., see FIG. 1).
  • step 201 further includes creating a control pool (“Control Pool 0” in FIG. 2), which is free of added peptides/antigens (but may include peptides/antigens endogenous to a tissue sample, for example at step 201).
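  • The pooling at step 201 can be sketched as follows, assuming each antigen has already been assigned a binary code word in which bit i indicates membership in antigen pool i. The code words and names below are hypothetical; the actual code is selected as described with reference to FIG. 1. Control Pool 0 receives no added antigens.

```python
def build_pools(codebook):
    """Map antigen code words to pool contents; pool 0 is the antigen-free control."""
    n_pools = len(next(iter(codebook.values())))
    pools = {0: []}  # Control Pool 0: no added antigens
    for i in range(1, n_pools + 1):
        # Antigen goes into pool i when bit i-1 of its code word is 1.
        pools[i] = [antigen for antigen, word in codebook.items() if word[i - 1] == 1]
    return pools

# Hypothetical code words for three peptides across four antigen pools.
codebook = {"pep1": (1, 1, 0, 0), "pep2": (0, 1, 1, 0), "pep3": (0, 0, 1, 1)}
pools = build_pools(codebook)
```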
  • the same tissue sample is split equally so that each antigen pool and the control pool are exposed to substantially the same tissue sample (e.g., with the same number and distribution of lymphocytes).
  • lymphocytes that are activated by the antigen pools are allowed time to expand.
  • the antigen pools are separately re-stimulated with a query set of one or more antigens to test if the expanded lymphocytes respond to the query set of antigens.
  • An example protocol that stimulates T cells with a first set of antigens and then queries with a second set of antigens is described by Tapia-Calle et al. (2019) “A PBMC-Based System to Assess Human T Cell Responses to Influenza Vaccine Candidates In Vitro.” Vaccines (Basel). 2019 Nov 13;7(4):181, which is incorporated by reference in its entirety herein.
  • LCR chain sequences that correspond to lymphocytes that recognize the query antigens are determined using the pool based methods described herein.
  • each query antigen is assigned to the same pool as a pre-determined corresponding original pool antigen.
  • this assay permits the identification of lymphocyte clones that recognize both sets of antigens. For example, an increase in the frequency of a LCR chain sequence in a subset of the antigen pools in which a first antigen was added means that the LCR chain sequence is specific to that first antigen (since the corresponding lymphocytes were allowed time to expand, resulting in increased frequencies of the LCR sequence in corresponding antigen pools).
  • a query antigen is then added to the same set of antigen pools matched to a first antigen. If the same LCR chain sequence is detected in an activated set of lymphocytes from the same group of antigen pools, a conclusion can be drawn that the LCR chain sequence recognizes both the first antigen and the query antigen.
  • query antigens are employed to test if a proposed derivative of a natural peptide, included as a first antigen, will cause expansion of lymphocyte clones that are activated by a query peptide (in which the query peptide is the natural peptide corresponding to the derivative of the natural peptide that was used as the first antigen).
  • self-peptides are employed as query antigens to test if proposed vaccine peptides (or antigens) in the first antigen pools activate lymphocytes that also are activated by self-peptides that are naturally found (e.g., query peptides are comprised of self-peptides).
  • a tissue sample (e.g., PBMCs) is exposed to a set of first antigens (e.g., peptides).
  • the activated lymphocytes are allowed time to expand.
  • the activated and expanded lymphocytes are then separated into pools that are stimulated with a second set of pool specific antigens (e.g., query peptides).
  • Lymphocytes are separated into activated and non-activated cells, and optionally also separated by cell type.
  • this method is used to test which specific query antigens in the antigen pools are recognized by lymphocytes activated by the first set of antigens.
  • adjuvants are added at step 201 when the tissue sample is exposed to antigens (e.g., prior to, simultaneously with, or following exposure to the antigens).
  • One example method of using adjuvants is described in Lissina et al. (2016), “Priming of Qualitatively Superior Human Effector CD8+ T Cells Using TLR8 Ligand Combined with FLT3 Ligand” J Immunol. 2016 Jan 1;196(1):256-263, incorporated by reference in its entirety herein.
  • antigen specific responses to the use of adjuvants are observed based on the enrichment of LCR chain sequences in specific antigen pools.
  • the adjuvants added at step 201 are molecules that provide co-stimulatory signals for lymphocytes (e.g., CD28 agonists, ICOS agonists, IL-2).
  • lymphocytes are separated by their binding of antigens, and optionally also separated by lymphocyte cell type or other markers.
  • methods of separating T cells based on the binding of their T cell receptors (TCRs) include MHC multimer (multimer) sorting, where a multimer displays a peptide in the context of an MHC molecule (see Klinger, et al., “Multiplex Identification of Antigen-Specific T Cell Receptors Using a Combination of Immune Assays and Immune Receptor Sequencing” PLoS One. 2015 Oct 28;10(10):e0141561).
  • a set of fluorescent multimers is used that collectively displays all of the antigens (or peptides) present in a pool when bound by one or more than one MHC molecule.
  • a given pool’s cells are then sorted by cells that are specific to the multimers assigned to the pool by fluorescence activated cell sorting (FACS).
  • multi-parameter FACS is used to separate each cell by multimer positive and negative cells with the addition of one or more additional markers such as CD4+ (CD4+ T Cell), and CD8+ (CD8+ T Cell), or other desired markers.
  • Methods of separating B cells include sorting B cells that are bound to an antigen in a pool, and optionally by their type as determined by cell surface markers or other means known in the art.
  • Example methods of sorting B cells based on their binding of antigens are described in Scheid, et al., “A method for identification of HIV gp140 binding memory B cells in human blood” J Immunol Methods. 2009;343(2):65-67 and Zimmermann, et al., “Antigen Extraction and B Cell Activation Enable Identification of Rare Membrane Antigen Specific Human B Cells” Front Immunol. 2019;10:829, which are incorporated by reference herein in their entireties.
  • lymphocytes are separated into activated and non-activated cells, and optionally also separated by cell type (e.g., T cell, T cell type).
  • activation markers that are specific for activated cells, and/or different cell types, can be used to identify and then separate cells that are activated by an antigen.
  • Assays such as Activation Induced Markers (AIM) can be used to identify activation markers (see Bowyer et al. (2018) “Activation-induced markers detect vaccine-specific CD4+ T cell responses not measured by assays conventionally used in clinical trials” Vaccines, 6(3), 50 and Reiss S, et al. (2017) “Comparative analysis of activation induced marker (AIM) assays for sensitive identification of antigen-specific CD4 T cells” PLoS One, 12(10), e0186998, incorporated by reference in their entireties herein).
  • Cell markers can be extracellular or intracellular, and cell permeabilization is used to permit antibodies to recognize intracellular markers. For example, activated T cells have been identified by their cell surface OX40+ CD25+ markers using AIM.
  • the type of cell that is activated can be further discriminated with other activation markers, including CD3+ (CD3+ T Cell), CD4+ (CD4+ T Cell), and CD8+ (CD8+ T Cell).
  • Other T cell activation markers known in the art can be used, including CD137, OX40, CD25, PD-L1, CD69, and CD154.
  • Lymphocyte cells can be physically separated by their markers at step 203 to enable the sequencing of the LCR chain sequences (at step 205, discussed in greater detail below) in the physically separated cells.
  • four separations of T cells result from each pool at step 203: 1) CD8+, Activated, 2) CD8+, Not activated, 3) CD4+, Activated, and 4) CD4+, Not-activated.
  • Cell separation can be accomplished with bead-based methods, cell sorting-based methods, or other separation methods known in the art.
  • Cell separation is accomplished at step 203.
  • cell separation can be two-way, four-way, or more ways.
  • one or more separations for each pool are retained.
  • Markers used for separation can include cell proteins, antigen epitopes, antigens that are fluorescently tagged, fluorescent antibodies, fluorescent reagents, and other methods known in the art. Marker specific antibodies can be conjugated to beads, the beads can be exposed to a population of cells, and cells containing the selected markers can be physically separated by separating the beads. When selected cells are desired that are positive for more than one antibody, bead selections can be done serially.
  • selection antibodies can be conjugated with a fluorescent dye and fluorescence activated cell sorting can be employed.
  • antigens are fluorescently tagged, and sorting can be accomplished using this as one marker.
  • Multi-parameter flow sorting can permit the separation of cells based on markers such as type (e.g., CD4, CD8) and their activation status at the same time.
  • all cell separations are retained for each antigen pool.
  • four separations of T cells result from each antigen pool: 1) CD8+, Activated, 2) CD8+, Not activated, 3) CD4+, Activated, and 4) CD4+, Not-activated.
  • nucleic acids are extracted from each separation of cells and separately amplified using TCR chain (e.g., T cell alpha, T cell beta, or both) or B cell receptor (BCR) chain (e.g., B cell heavy chain, B cell light chain, or both) specific PCR primers for sequencing.
  • DNA is extracted from each separation for sequencing.
  • RNA is extracted from each separation and converted into DNA by reverse transcription for sequencing.
  • control nucleic acid molecules that will be amplified with one or more of the specific PCR primers are added prior to PCR amplification to each separation at one or more pre-determined concentrations to enable precise quantification of the number of LCR chain molecules present.
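  • A minimal sketch of quantification against a control template follows. It assumes a single control added at a known number of molecules and an amplification efficiency comparable to that of the LCR templates; both are simplifying assumptions, and the function and variable names are hypothetical.

```python
def estimate_molecules(lcr_reads, control_reads, control_molecules):
    """Scale LCR read counts by the reads-per-molecule rate of the spike-in control."""
    if control_reads <= 0:
        raise ValueError("control template was not detected")
    reads_per_molecule = control_reads / control_molecules
    return {seq: reads / reads_per_molecule for seq, reads in lcr_reads.items()}

# Illustrative read counts for two LCR chain sequences in one separation.
lcr_reads = {"LCR_1": 400, "LCR_2": 50}

# 1000 control molecules were added and produced 2000 reads (2 reads per molecule).
estimates = estimate_molecules(lcr_reads, control_reads=2000, control_molecules=1000)
```

  • Adding controls at several pre-determined concentrations, as the text allows, would permit fitting a calibration curve instead of a single ratio.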
  • multiplex PCR is used to simultaneously amplify nucleic acid sequences originating from different LCR chains.
  • PCR primers encode bar codes that are contained in all of their product nucleic acid molecules as known in the art (Stahlberg, et al., “Simple multiplexed PCR-based barcoding of DNA for ultrasensitive mutation detection by next-generation sequencing” Nat Protoc. 2017 Apr;12(4):664-682, and Binladen, et al.
  • PCR primers include Unique Molecular Identifiers (UMI) to provide more accurate counting of LCR chain molecules as known in the art (Kivioja, et al., “Counting absolute numbers of molecules using unique molecular identifiers” Nat Methods. 2011 Nov 20;9(1):72-4, incorporated by reference in its entirety herein).
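  • UMI-based counting can be sketched as collapsing reads that share both a UMI and an LCR chain sequence, since such reads are presumed to be PCR copies of one original molecule. The read tuples below are illustrative, and the sketch assumes error-free UMIs (no correction of sequencing errors within the UMI).

```python
from collections import defaultdict

def count_molecules(reads):
    """Count distinct UMIs per LCR chain sequence; PCR duplicates share a UMI."""
    umis = defaultdict(set)
    for umi, sequence in reads:
        umis[sequence].add(umi)
    return {sequence: len(seen) for sequence, seen in umis.items()}

# Each read is (UMI, LCR chain sequence); the first two reads are PCR duplicates.
reads = [("AACG", "LCR_1"), ("AACG", "LCR_1"), ("TTGA", "LCR_1"), ("CCAT", "LCR_2")]
molecule_counts = count_molecules(reads)
```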
  • the nucleic acids derived from separations from each pool include a separation specific bar-code when prepared for sequencing in step 204.
  • the amplified nucleic acids include a pool specific bar code to permit the mixing of pools for sequencing when prepared in step 204.
  • separate nucleic acid primers specific for LCR chains (e.g., alpha or beta) are used that include a chain specific bar code to amplify nucleic acids from each pool for sequencing in step 204.
  • molecules corresponding to amplified LCR chains contain a unique molecular identifier (UMI) and three bar codes: a separation specific bar code, an antigen pool specific bar code, and a LCR chain specific bar code (e.g., alpha or beta).
  • single-cell based methods are used to sequence LCR chains from one or more separations.
  • methods for measuring the RNA transcriptomes of single cells can provide paired sequences of LCR chains (De Simone, et al., “Single Cell T Cell Receptor Sequencing: Techniques and Future Challenges” Front Immunol. 2018 Jul 18;9:1638, Singh, et al., “High-throughput targeted long-read single cell sequencing reveals the clonal and transcriptional landscape of lymphocytes” Nat Commun. 2019 Jul 16;10(1):3120, Stubbington, et al., “T cell fate and clonality inference from single-cell transcriptomes” Nat Methods.
  • methods for sequencing the DNA of single cells can be used to produce LCR chain sequencing reads from single cells or a count of the number of cells that contain a LCR chain sequence (Zong, et al., “Genome-wide detection of single-nucleotide and copy-number variations of a single human cell” Science. 2012;338(6114):1622-1626).
  • methods for measuring the RNA transcriptomes of single cells can be used that do not require the physical separation of single cells (Rosenberg, et al. “Single-cell profiling of the developing mouse brain and spinal cord with split-pool barcoding” Science.
  • methods that provide mRNA transcript levels from single cells can provide transcript levels for genes that indicate lymphocyte activation or other state information that can be used in addition to, or instead of, marker information to separate cells for analysis (Singh, et al. “High-throughput targeted long-read single cell sequencing reveals the clonal and transcriptional landscape of lymphocytes” Nat Commun. 2019 Jul 16;10(1):3120).
  • results from single-cell based methods are used in step 205 to determine, for each sequenced LCR chain, the pools in which it is enriched, as described herein.
  • the number of cells that contain an LCR chain sequence is used instead of LCR read counts in step 205.
  • mRNA transcript levels for genes from single-cell based methods are used to create or augment separations for desired analysis.
  • mRNA expression markers include elevated expression of genes characteristic of active tissue resident cytotoxic lymphocytes, such as CCL4, NKG7, GZMA, and GZMK (Singh, et al. 2019).
  • expression or other sequencing derived markers from individual cells are used to augment or replace the separation labels (e.g., CD8+ Activated) associated with the physical separation of cells.
  • all or a portion of the cells in a pool can be analyzed by single-cell methods without separation by step 203.
  • the bar-coded separations are combined for sequencing on a high-throughput sequencer.
  • the separations from each pool have their LCRs sequenced using high throughput sequencing technology.
  • adequate sequencing depth (number of raw reads from the sequencing instrument)
  • decoding proceeds by identifying LCR chain sequences enriched in a desired set of physically separated pools, for example activated CD8+ cells.
  • LCR enrichment in a pool is determined by comparing LCR chain read counts observed in a desired separation (e.g., CD8+ Activated) to a function of the read counts observed in one or more other separations for the same pool (e.g., CD8+ Not activated, CD4+ Activated, CD4+ Not Activated). In some embodiments, LCR enrichment in a pool is determined by comparing LCR chain read counts observed in a desired separation (e.g., CD8+ Activated) to the read counts from one or more read counts of control nucleic acid molecules in one or more pools for the desired separation.
  • LCR enrichment in a pool is determined by comparing LCR chain read counts observed in a desired separation (e.g., CD8+ Activated) to a function of the read counts for one or more separations (e.g., CD8+ Activated) in one or more pools. In some embodiments, LCR enrichment in a pool is determined by comparing LCR chain read counts in a desired separation (e.g., CD8+, Activated) to a function of the read counts observed in one or more separations in Control Pool 0 (e.g., CD8+,
  • LCR enrichment in a pool is determined by computing a probability that the LCR chain read counts observed in a desired separation (e.g., CD8+ Activated) are drawn from a distribution computed using the read counts for one or more separations (e.g., CD8+ Activated) in one or more pools, and comparing this probability to a predetermined threshold (e.g., using standard deviation of a distribution).
  • LCR enrichment in a target pool is determined by computing the distribution of read counts observed in a desired separation (e.g., CD8+ Activated) in the target pool and comparing this distribution to one or more distributions of read counts observed in one or more separations (e.g., CD8+ Activated) in one or more other pools.
  • the enrichment of LCR chains in one or more pools is determined using statistical tests (e.g., Mann-Whitney U test, rank-sum test, Chi-squared test, t-test, ANOVA followed by post hoc tests) or other techniques known in the art when comparing to one or more alternative pools.
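  • One of the threshold-based comparisons above can be sketched as a z-score test. The sketch assumes the reference distribution is summarized by the mean and sample standard deviation of counts for the same separation in other pools, and uses a hypothetical cutoff of three standard deviations; neither choice is prescribed by the disclosure.

```python
from statistics import mean, stdev

def is_enriched(target_count, reference_counts, z_cutoff=3.0):
    """Flag a pool as enriched when its count lies z_cutoff SDs above the reference mean."""
    mu = mean(reference_counts)
    sigma = stdev(reference_counts)  # sample standard deviation; needs >= 2 values
    if sigma == 0:
        # Degenerate reference: fall back to a simple comparison with the mean.
        return target_count > mu
    return (target_count - mu) / sigma >= z_cutoff

# Illustrative counts of one LCR chain sequence in CD8+ Activated cells of other pools.
reference = [10, 12, 9, 11, 8]

strong_signal = is_enriched(40, reference)   # far above the reference pools
weak_signal = is_enriched(12, reference)     # within normal variation
```

  • Any of the named statistical tests could replace the z-score where several replicate measurements per pool are available.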
  • LCR chain read counts are normalized in each pool by dividing by the total number of LCR chain read counts in complementary separations in that pool (e.g., for CD8+ Activated read counts: divide the CD8+ Activated read counts by the total CD8+ Activated plus CD8+ Not Activated read counts). In some embodiments, LCR chain read counts are normalized in each pool by dividing by the total number of LCR chain read counts in that pool.
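  • The first normalization can be sketched as below. The sketch assumes the denominator is the total read count summed over all LCR chain sequences in the listed separations of one pool; a per-sequence denominator is another possible reading of the text. Names and counts are illustrative.

```python
def normalize(counts, denominator_separations):
    """Divide each sequence's read count by the total reads in the given separations."""
    total = sum(sum(sep.values()) for sep in denominator_separations)
    if total == 0:
        return {seq: 0.0 for seq in counts}
    return {seq: n / total for seq, n in counts.items()}

# Illustrative read counts for one pool.
cd8_activated = {"LCR_1": 30, "LCR_2": 10}
cd8_not_activated = {"LCR_1": 10, "LCR_2": 50}

# CD8+ Activated counts divided by CD8+ Activated plus CD8+ Not Activated totals.
normalized = normalize(cd8_activated, [cd8_activated, cd8_not_activated])
```

  • Passing every separation of the pool as `denominator_separations` gives the second normalization, which divides by the total LCR chain read count in that pool.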
  • the pool specific LCR chain read counts are normalized, and the normalized LCR chain read counts for that separation from all pools are clustered into two clusters using clustering methods known in the art (e.g., 2-means clustering).
  • the cluster with the smaller average number of normalized read counts is labeled “0” and the cluster with the larger average number of normalized read counts is labeled “1”.
  • an LCR chain sequence in a specific pool and separation is assigned a “1” or “0” based on the label of its most likely cluster assignment.
  • an LCR chain sequence in a specific pool and separation is assigned a “1” or “0” according to the label of the cluster with its maximum posterior probability, as determined using Bayesian inference. In some embodiments, the LCR chain sequences assigned a “1” are considered to have been enriched.
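  • The clustering and labeling steps can be sketched with a one-dimensional 2-means procedure, in which each pool's normalized count is assigned to the nearer of two cluster centers and the higher-mean cluster is labeled “1”. This is a hard-assignment sketch with hypothetical values; it does not implement the Bayesian posterior variant, and it labels every pool “0” when all values are identical (an assumption).

```python
def two_means_labels(values, iterations=100):
    """Cluster 1-D values into two groups; 1 marks the higher-mean cluster, 0 the lower."""
    low, high = min(values), max(values)
    if low == high:
        return [0] * len(values)  # no spread: nothing is called enriched
    labels = []
    for _ in range(iterations):
        # Assign each value to the nearer center (ties go to the low cluster).
        labels = [1 if abs(v - high) < abs(v - low) else 0 for v in values]
        ones = [v for v, lab in zip(values, labels) if lab == 1]
        zeros = [v for v, lab in zip(values, labels) if lab == 0]
        new_low, new_high = sum(zeros) / len(zeros), sum(ones) / len(ones)
        if (new_low, new_high) == (low, high):
            break  # centers stopped moving
        low, high = new_low, new_high
    return labels

# Normalized counts of one LCR chain sequence across six antigen pools (illustrative).
normalized_counts = [0.01, 0.02, 0.30, 0.01, 0.28, 0.02]
bits = two_means_labels(normalized_counts)
```

  • The resulting list of labels, read across pools, is the binary pattern that is later decoded into an antigen number.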
  • LCR chain sequence enrichment in a pool is determined using the number of cells containing a given LCR chain sequence instead of the number of observed LCR chain sequence read counts as described herein.
  • sequencing reads include a cell specific bar code that permits the identification of the number of cells that contain a given LCR chain sequence.
  • the number of observed sequencing reads will vary from cell-to-cell depending on the number of RNA molecules present in the cell that contain a LCR chain sequence.
  • cell counts provide a more accurate method of determining the number of cells that contain a LCR chain sequence.
  • specific cells that contain a LCR chain sequence can be identified with one or more desired markers.
  • variations and errors in the sequencing process that result in different numbers of observed LCR chain sequences for a given cell can be eliminated by using the number of cells that include a given LCR chain sequence (e.g., based on a predetermined threshold of LCR chain sequence detection in a given cell).
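  • Cell-based counting can be sketched as follows, assuming each read carries a cell specific bar code and that a cell is counted for a sequence only when at least `min_reads_per_cell` reads support it. The threshold value and the names are hypothetical.

```python
from collections import Counter

def cells_per_sequence(reads, min_reads_per_cell=2):
    """Count cells carrying each LCR chain sequence from (cell_barcode, sequence) reads."""
    per_cell = Counter(reads)  # reads observed for each (cell, sequence) pair
    cells = Counter()
    for (cell, sequence), n_reads in per_cell.items():
        if n_reads >= min_reads_per_cell:
            cells[sequence] += 1
    return dict(cells)

# Cell c1 expresses LCR_1 highly, c2 weakly; the single LCR_1 read in c3 is below threshold.
reads = (
    [("c1", "LCR_1")] * 50
    + [("c2", "LCR_1")] * 3
    + [("c3", "LCR_1")]
    + [("c3", "LCR_2")] * 4
)
cell_counts = cells_per_sequence(reads)
```

  • Although LCR_1 has 54 reads in total, it is counted in two cells, so a single highly expressing cell does not dominate the enrichment estimate.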
  • the number of cells containing a LCR chain sequence is used for analysis in steps 205-207 in place of read counts for each LCR chain sequence.
  • bulk sequencing methods are used for read counts which can still produce accurate results.
  • read counts or cell counts may be used.
  • a binary number corresponding to the LCR chain sequence is determined corresponding to the antigen pools where it is enriched.
  • the method proceeds by decoding the binary number with the error correcting code used for encoding (e.g., see FIG. 1).
  • a nearest neighbor decoding algorithm as known in the art decodes the binary number into the antigen number with a corresponding code word with the smallest Hamming distance from the binary number. If there is more than one antigen code word with the same smallest distance, the decoding algorithm outputs an error.
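  • A minimal sketch of nearest neighbor decoding follows. The code words are hypothetical, and a tie between antigens is reported as `None` to represent the decoding error described above.

```python
def hamming(a, b):
    """Number of positions at which two equal-length bit tuples differ."""
    return sum(x != y for x, y in zip(a, b))

def nearest_neighbor_decode(observed, codebook):
    """Return the antigen with the closest code word, or None on a tie (decoding error)."""
    distances = {antigen: hamming(observed, word) for antigen, word in codebook.items()}
    best = min(distances.values())
    winners = [antigen for antigen, d in distances.items() if d == best]
    return winners[0] if len(winners) == 1 else None

# Hypothetical code words: antigen number -> membership in seven antigen pools.
codebook = {
    1: (1, 1, 1, 0, 0, 0, 0),
    2: (0, 0, 1, 1, 1, 0, 0),
    3: (0, 0, 0, 0, 1, 1, 1),
}

# One pool of antigen 1 was missed (a false negative); decoding still recovers antigen 1.
corrected = nearest_neighbor_decode((1, 0, 1, 0, 0, 0, 0), codebook)
# This pattern is equally close to antigens 1 and 2, so an error is reported.
ambiguous = nearest_neighbor_decode((0, 0, 1, 0, 0, 0, 0), codebook)
```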
  • the result of decoding can be a valid antigen number, or it can represent an error.
  • the code used for decoding can detect errors when the pattern of enrichment does not correspond to a single antigen/peptide, and can correct errors when LCR chain sequence enrichment is corrupted by noise in samples up to the error correction limit of the code used.
  • a nearest set decoding algorithm as described herein decodes the binary number into one or more antigen numbers.
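  • The sketch below illustrates one way a nearest set decoder could work; it is an assumption for illustration and not necessarily the algorithm of the disclosure. It assumes that a lymphocyte recognizing several antigens is enriched in the union of their pools, so the expected pattern for a set is the bitwise OR of its code words, and it searches all sets up to a hypothetical maximum size.

```python
from itertools import combinations

def superimpose(words):
    """Bitwise OR of code words: pools expected to be positive for a set of antigens."""
    return tuple(int(any(bits)) for bits in zip(*words))

def nearest_set_decode(observed, codebook, max_set_size=2):
    """Return (closest antigen sets, distance) by Hamming distance to superimposed words."""
    best_distance, best_sets = None, []
    for size in range(1, max_set_size + 1):
        for antigens in combinations(sorted(codebook), size):
            pattern = superimpose([codebook[a] for a in antigens])
            distance = sum(x != y for x, y in zip(observed, pattern))
            if best_distance is None or distance < best_distance:
                best_distance, best_sets = distance, [antigens]
            elif distance == best_distance:
                best_sets.append(antigens)
    return best_sets, best_distance

# Hypothetical code words over six pools.
codebook = {"A": (1, 1, 0, 0, 0, 0), "B": (0, 0, 1, 1, 0, 0), "C": (0, 0, 0, 0, 1, 1)}

# Enrichment in the pools of both A and B decodes to the set {A, B}.
sets, distance = nearest_set_decode((1, 1, 1, 1, 0, 0), codebook)
```

  • When more than one set is returned, the pattern is ambiguous and could be treated as a decoding error, by analogy with nearest neighbor decoding.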
  • the result of the methods described herein is the output of LCR sequences enriched for each antigen (e.g., peptide) in each antigen pool.
  • the decoding of antigen number(s) corresponding to an LCR chain sequence is based on the number of read counts of the LCR chain sequence in all pools, and these read counts are interpreted by a machine learning classifier (e.g., a neural network or other statistical model) that has been trained on examples of the code employed for placing antigens (peptides) in pools.
  • the decoding of the antigen number(s) corresponding to a LCR chain sequence is based on the number of reads of the LCR chain sequence in all pools, and a maximum a posteriori estimator of the best antigen number(s) for the LCR chain sequence is employed.
  • the method of the present disclosure includes any combination of one or more of steps 201-207.
  • unique TCR chain sequences corresponding to alpha and beta chains are independently decoded for a desired separation.
  • unique BCR chain sequences corresponding to BCR heavy and light chains are independently decoded for a desired separation.
  • when the same antigen number is decoded for a TCR alpha and a TCR beta chain sequence, and only one alpha chain sequence and one beta chain sequence decode into that antigen number, they are considered to have originated from the same TCR alpha-beta receptor sequence pair that is associated with that antigen.
  • all of the TCR alpha and TCR beta chain sequences that decode to the same antigen number are ranked in each pool by their read counts where one rank list is created for alpha chains, and one for beta chains.
  • if a TCR alpha chain and a TCR beta chain sequence in each pool have the same pool specific rank order of read counts in the alpha and beta chain rank lists, they are considered to have originated from the same TCR alpha-beta receptor sequence pair.
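  • The two pairing rules above can be sketched together. The input maps each chain sequence that decoded to one antigen number to its list of per-pool read counts; the names and counts are hypothetical, and ties in read counts within a pool are not specially handled.

```python
def pair_chains(alpha_counts, beta_counts):
    """Pair alpha and beta chains that decoded to the same antigen number.

    Each argument maps chain sequence -> list of read counts, one per pool.
    """
    if not alpha_counts or not beta_counts:
        return []
    # Rule 1: a single alpha and a single beta chain are paired directly.
    if len(alpha_counts) == 1 and len(beta_counts) == 1:
        return [(next(iter(alpha_counts)), next(iter(beta_counts)))]

    def rank_profile(counts):
        """Rank of each sequence within every pool (0 = most reads)."""
        n_pools = len(next(iter(counts.values())))
        profile = {seq: [] for seq in counts}
        for pool in range(n_pools):
            ordered = sorted(counts, key=lambda s: counts[s][pool], reverse=True)
            for rank, seq in enumerate(ordered):
                profile[seq].append(rank)
        return {seq: tuple(ranks) for seq, ranks in profile.items()}

    # Rule 2: pair chains whose rank is identical in every pool.
    alpha_ranks, beta_ranks = rank_profile(alpha_counts), rank_profile(beta_counts)
    return [(a, b) for a, ra in alpha_ranks.items()
            for b, rb in beta_ranks.items() if ra == rb]

alpha = {"a1": [90, 80], "a2": [20, 30]}
beta = {"b1": [70, 60], "b2": [10, 15]}
pairs = pair_chains(alpha, beta)
```

  • The same logic applies to BCR heavy and light chains by substituting the corresponding rank lists.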
  • single-cell sequencing methods are used to determine TCR alpha-beta receptor sequence pairs.
  • when the same antigen number is decoded for a BCR heavy and a BCR light chain sequence, and only one heavy chain sequence and one light chain sequence decode into that antigen number, they are considered to have originated from the same BCR heavy-light receptor sequence pair that is associated with that antigen.
  • all of the BCR heavy and BCR light chain sequences that decode to the same antigen number are ranked in each pool by their read counts where one rank list is created for heavy chains, and one for light chains. If a BCR heavy chain and a BCR light chain sequence in each pool have the same pool specific rank order of read counts in the heavy and light chain rank lists, they are considered to have originated from the same BCR heavy-light receptor sequence pair.
  • single-cell sequencing methods are used to determine BCR heavy-light receptor sequence pairs.
  • the invention provides for a method for determining a T cell receptor chain sequence, or a portion thereof, specific for one or more antigens, the method comprising: sorting a plurality of first antigens into a plurality of reaction mixtures, wherein the sorting comprises adding a unique antigen of the plurality of first antigens to a unique subset of the plurality of reaction mixtures, and wherein two unique antigens are not added to any two identical subsets of the plurality of reaction mixtures, contacting each reaction mixture with a biological sample comprising a plurality of T cells, providing a condition for a first activated T cell in at least one reaction mixture of the plurality of reaction mixtures to expand in number such that a plurality of T cell clones is formed, contacting the plurality of T cell clones with a query antigen, separating a second activated T cell and a non-activated T cell from a subset of the plurality of reaction mixtures, wherein the second activated T cell recognizes
  • separating the second activated T cell and the non-activated T cell is performed based on a marker, wherein the marker is selected from the group consisting of CD3, CD4, CD8, CD137, OX40, CD25, PD-L1, CD69, CD154, and a combination thereof.
  • the T cell receptor chain sequence comprises a receptor chain sequence pair, wherein the receptor chain sequence pair consists of an alpha chain sequence and a beta chain sequence.
  • the second activated T cell recognizes the query antigen by binding an MHC complex comprising the query antigen.
  • the sorting further comprises applying, using a processor, an error-correcting code configured to determine the unique subset of the plurality of reaction mixtures that the unique antigen is added to.
  • the error-correcting code is a superimposed code.
  • the detecting comprises applying, using a processor, a decoding algorithm, wherein the decoding algorithm is configured to detect the unique antigen specific for the T cell receptor chain sequence when the T cell receptor chain sequence is not substantially present in at least one reaction mixture of the unique subset of the plurality of reaction mixtures.
  • the decoding algorithm is a nearest neighbor algorithm.
  • the query antigen is different from any antigen of the plurality of first antigens.
  • separating the second activated T cell and the non-activated T cell from the subset of the plurality of reaction mixtures is performed using multimer sorting. In some embodiments, separating the second activated T cell and the non-activated T cell from the subset of the plurality of reaction mixtures is performed using fluorescence-based sorting. In some embodiments, separating the second activated T cell and the non-activated T cell from the subset of the plurality of reaction mixtures is performed using bead-based sorting. In some embodiments, a number of reaction mixtures corresponding to the unique subset of the plurality of reaction mixtures is a function of a number of expected unique antigens that are specific to the T cell receptor chain sequence.
  • the plurality of reaction mixtures comprises at least one control reaction mixture, wherein the control reaction mixture does not contain any antigens that are added to the biological sample.
  • the detecting further comprises computing a frequency of T cells that express the T cell receptor chain sequence.
  • the invention provides for a method for determining a T cell receptor chain sequence, or a portion thereof, specific for one or more antigens, the method comprising: adding a plurality of first antigens to a first reaction mixture, contacting the first reaction mixture with a biological sample comprising a plurality of T cells, providing a condition for a first activated T cell in the first reaction mixture to expand in number such that a plurality of T cell clones is formed, sorting a plurality of query antigens into a plurality of reaction mixtures, wherein the sorting comprises adding a first query antigen of the plurality of query antigens to a unique subset of the plurality of reaction mixtures, wherein two unique query antigens are not added to any two identical subsets of the plurality of reaction mixtures, and wherein the first query antigen is different from any antigen of the plurality of first antigens, contacting each reaction mixture of the plurality of reaction mixtures with a portion of the first reaction mixture comprising the pluralit
  • separating the second activated T cell from the subset of the plurality of T cell clones is performed based on a marker, wherein the marker is selected from the group consisting of CD3, CD4, CD8, CD137, OX40, CD25, PD-L1, CD69, CD154, and a combination thereof.
  • the T cell receptor chain sequence comprises a receptor chain sequence pair, wherein the receptor chain sequence pair consists of an alpha chain sequence and a beta chain sequence.
  • the second activated T cell recognizes the first query antigen by binding an MHC complex comprising the first query antigen.
  • the sorting further comprises applying, using a processor, an error-correcting code configured to determine the unique subset of the plurality of reaction mixtures that the first query antigen is added to.
  • the error-correcting code is a superimposed code.
  • the detecting comprises applying, using a processor, a decoding algorithm, wherein the decoding algorithm is configured to detect the first query antigen specific for the T cell receptor chain sequence when the T cell receptor chain sequence is not substantially present in at least one reaction mixture of the unique subset of the plurality of reaction mixtures.
  • the decoding algorithm is a nearest neighbor algorithm.
  • separating the second activated T cell from the subset of the plurality of T cell clones is performed using multimer sorting. In some embodiments, separating the second activated T cell from the subset of the plurality of T cell clones is performed using fluorescence-based sorting. In some embodiments, separating the second activated T cell from the subset of the plurality of T cell clones is performed using bead-based sorting. In some embodiments, a number of reaction mixtures corresponding to the unique subset of the plurality of reaction mixtures is a function of a number of expected query antigens that are specific to the T cell receptor chain sequence. In some embodiments, the plurality of reaction mixtures comprises at least one control reaction mixture, wherein the control reaction mixture does not contain any antigens that are added to the biological sample.
  • the detecting further comprises computing a frequency of T cells that express the T cell receptor chain sequence.
  • the invention provides for a method for determining a lymphocyte cell receptor chain sequence, or a portion thereof, specific for at least two unique antigens, the method comprising: sorting a plurality of antigens into a plurality of reaction mixtures, wherein the sorting comprises adding the at least two unique antigens of the plurality of antigens to at least two unique subsets of the plurality of reaction mixtures such that the at least two unique antigens are not added to any two identical subsets of the plurality of reaction mixtures, and wherein the at least two unique subsets are configured to allow a detection of the at least two unique antigens that are specific to the lymphocyte cell receptor chain sequence, contacting each reaction mixture of the plurality of reaction mixtures with a biological sample comprising a plurality of lymphocytes, separating a target lymphocyte from a subset of the plurality of lymphocytes, wherein the target lymphocyte recognizes the at least two unique antigens of the plurality of antigens, sequencing nucleic
  • the lymphocyte is a T cell or a B cell. In some embodiments, separating the target lymphocyte is performed using multimer sorting. In some embodiments, the target lymphocyte is a T cell, and wherein separating the T cell is based on a marker selected from the group consisting of CD3, CD4, CD8, CD137, OX40, CD25, PD-L1, CD69, and CD154. In some embodiments, the lymphocyte cell receptor chain sequence comprises a receptor chain sequence pair, and wherein the receptor chain sequence pair consists of two components of a receptor of the target lymphocyte.
  • a number of reaction mixtures comprising the at least two unique subsets is a function of a number of expected antigens that are specific to the lymphocyte cell receptor chain sequence.
  • the plurality of reaction mixtures comprises at least one control reaction mixture, and wherein the control reaction mixture does not contain any antigens that are added to the biological sample.
  • the target lymphocyte recognizes the at least two unique antigens of the plurality of antigens by binding the at least two unique antigens of the plurality of antigens or by binding two or more molecular complexes comprising the at least two unique antigens of the plurality of antigens.
  • the detecting further comprises applying, by a processor, a nearest set decoding algorithm configured to determine the at least two unique antigens that are specific to the lymphocyte cell receptor chain sequence. In some embodiments, the detecting further comprises: applying, by a processor, a decoding algorithm, wherein the decoding algorithm is configured to detect the at least two unique antigens that are specific to the lymphocyte cell receptor chain sequence when the lymphocyte cell receptor chain sequence is not substantially present in at least one reaction mixture of the at least two unique subsets of the plurality of reaction mixtures. In some embodiments, the method further comprises assigning a superimposed code to each antigen of the plurality of antigens, wherein the superimposed code is configured to allow a detection of the at least two unique antigens that are specific to the lymphocyte cell receptor chain sequence.
  • the invention provides for a method for determining a lymphocyte cell receptor chain sequence, or a portion thereof, specific for at least two unique antigens, the method comprising: sorting a plurality of antigens into a plurality of reaction mixtures, wherein the sorting comprises adding the at least two unique antigens of the plurality of antigens to at least two unique subsets of the plurality of reaction mixtures such that the at least two unique antigens are not added to any two identical subsets of the plurality of reaction mixtures, and wherein the at least two unique subsets are configured to allow a detection of the at least two unique antigens that are specific to the lymphocyte cell receptor chain sequence, contacting each reaction mixture with a biological sample comprising a plurality of lymphocytes, separating a target lymphocyte from a subset of the plurality of lymphocytes, wherein the target lymphocyte reacts with the at least two unique antigens of the plurality of antigens, sequencing nucleic acids of the target lymphocyte to
  • the lymphocyte is a T cell or a B cell.
  • the lymphocyte cell receptor chain sequence comprises a receptor chain sequence pair, and wherein the receptor chain sequence pair consists of two components of a receptor of the target lymphocyte. In some embodiments, the method further comprises contacting at least one reaction mixture of the plurality of reaction mixtures with a query antigen.
  • the invention provides for a method for determining a lymphocyte cell receptor chain sequence, or a portion thereof, specific to a unique antigen, the method comprising: sorting a plurality of antigens into a plurality of reaction mixtures, wherein the sorting comprises adding a unique antigen of the plurality of antigens to a unique subset of the plurality of reaction mixtures such that two different unique antigens are not added to the unique subset, contacting each reaction mixture of the plurality of reaction mixtures with a biological sample comprising a plurality of lymphocytes, separating a target lymphocyte from a subset of the plurality of lymphocytes, wherein the target lymphocyte recognizes the unique antigen, after separating the target lymphocyte, sequencing nucleic acids of the target lymphocyte to obtain the lymphocyte receptor chain sequence, wherein the sequencing is performed by single-cell sequencing, and detecting the unique antigen, wherein the detecting comprises: computing a frequency of lymphocyte cells that express the lymphocyte receptor chain sequence.
  • the lymphocyte is a T cell or a B cell.
  • the target lymphocyte is a T cell, and wherein the T cell is separated based on a marker selected from the group consisting of CD3, CD4, CD8, CD137, OX40, CD25, PD-L1, CD69, CD154, or a combination thereof.
  • the lymphocyte cell receptor chain sequence comprises a receptor chain sequence pair, and wherein the receptor chain sequence pair consists of two components of a receptor of the target lymphocyte.
  • the detecting further comprises: computing a gene expression value of a gene of the target lymphocyte.
  • the plurality of reaction mixtures comprises at least one control reaction mixture, and wherein the control reaction mixture does not contain any antigens that are added to the biological sample.
  • the target lymphocyte recognizes the unique antigen by binding the unique antigen or by binding one or more molecular complexes comprising the unique antigen.
  • the detecting further comprises applying, by a processor, a nearest set decoding algorithm configured to determine the unique antigen that is specific to the lymphocyte receptor chain sequence.
  • the detecting further comprises: applying, by a processor, a decoding algorithm, wherein the decoding algorithm is configured to detect the one or more antigens that are specific to the lymphocyte receptor chain sequence when the lymphocyte cell receptor chain sequence is not substantially present in at least one reaction mixture of the unique subset of the plurality of reaction mixtures, and wherein the at least one reaction mixture comprises the one or more antigens.
  • the invention provides for a method for determining a lymphocyte receptor chain sequence, or a portion thereof, specific for at least one antigen, the method comprising: providing a biological sample comprising a plurality of lymphocytes, extracting a plurality of first antigen presenting cells from the biological sample, dividing the plurality of first antigen presenting cells into a plurality of first reaction mixtures, sorting a plurality of first antigens into the plurality of first reaction mixtures, wherein the sorting comprises adding a unique first antigen of the plurality of first antigens to a unique subset of the plurality of first reaction mixtures, and wherein two unique first antigens are not added to any two identical subsets of the plurality of first reaction mixtures, contacting each first reaction mixture with the biological sample, providing a condition for a first activated lymphocyte in at least one first reaction mixture of the plurality of first reaction mixtures to expand in number such that a plurality of lymphocyte clones is formed, extracting a plurality of
  • the lymphocyte is a T cell or a B cell.
  • HLA typing of the biological sample to determine a predicted display of at least one antigen of the plurality of first antigens by an MHC molecule present in the biological sample.
  • enriching the plurality of lymphocytes prior to sorting the plurality of first antigens into the plurality of first reaction mixtures.
  • enriching the plurality of lymphocytes after providing the condition for the first activated lymphocyte to expand in number and prior to extracting the plurality of second antigen presenting cells.
  • separating the second activated lymphocyte and the non-activated lymphocyte is performed based on a marker, wherein the marker is selected from the group consisting of CD3, CD4, CD8, CD137, OX40, CD25, PD-L1, CD69, CD154, and a combination thereof.
  • the second activated lymphocyte recognizes the query antigen by binding an MHC complex comprising the query antigen.
  • the sorting further comprises applying, using a processor, an error-correcting code configured to determine the unique subset of the plurality of first reaction mixtures that the unique first antigen is added to.
  • the error-correcting code is a collision free superimposed code configured to allow for detection of at least two unique first antigens specific for the lymphocyte receptor chain sequence.
  • the collision free superimposed code is determined by a random search method.
  • the collision free superimposed code consists of: a plurality of prefix codes, wherein a prefix code of the plurality of prefix codes is assigned to the unique first antigen of the plurality of first antigens, wherein the prefix code identifies an overlap set, wherein the prefix code is identical for more than one first antigen of the plurality of first antigens within the overlap set, and a plurality of suffix codes, wherein a suffix code of the plurality of suffix codes is assigned to the unique first antigen of the plurality of first antigens, wherein a combination of the prefix code and the suffix code is distinct for the unique first antigen.
  • the detecting comprises applying, using a processor, a decoding algorithm, wherein the decoding algorithm is configured to detect the unique first antigen specific for the lymphocyte receptor chain sequence when the lymphocyte receptor chain sequence is not substantially present in at least one reaction mixture of the unique subset of the plurality of first reaction mixtures.
  • the decoding algorithm is a nearest set algorithm.
  • the query antigen is different from any antigen of the plurality of first antigens.
  • separating the second activated lymphocyte and the non-activated lymphocyte from the subset of the plurality of final reaction mixtures is performed using multimer sorting.
  • separating the second activated lymphocyte and the non-activated lymphocyte from the subset of the plurality of final reaction mixtures is performed using fluorescence-based sorting. In some embodiments, separating the second activated lymphocyte and the non-activated lymphocyte from the subset of the plurality of final reaction mixtures is performed using bead-based sorting. In some embodiments, a number of reaction mixtures corresponding to the unique subset of the plurality of first reaction mixtures is a function of a number of expected unique first antigens that are specific to the lymphocyte receptor chain sequence. In some embodiments, the plurality of first reaction mixtures comprises at least one control reaction mixture, wherein the control reaction mixture does not contain any antigens that are added to the biological sample. In some embodiments, the detecting further comprises computing a frequency of lymphocytes that express the lymphocyte receptor chain sequence.
  • superimposed codes are used to separate peptides/antigens into antigen pools at step 201 which allows the assay to detect which peptides/antigens are recognized by a single LCR chain sequence when it recognizes more than one peptide/antigen.
  • An example of a superimposed code is a Zatocoding (see Mooers, C. N., and Ashby, W. R., 1951, incorporated by reference in its entirety herein).
  • superimposed codes are applied to assign each antigen (e.g., peptide) to n antigen pools that are unique to the antigen. If N is the total number of antigen pools utilized, then a given antigen is assigned to a subset of n of these antigen pools, where n < N. In some embodiments, preferably n is equal to F*N, where F is the fraction of antigen pools that are optimal.
  • the binary number corresponding to the pools that an antigen is assigned to is the code word of that antigen, where a pool in which it is present is assigned a “1” and a pool where it is absent is assigned a “0”, and these binary digits are concatenated to form the antigen’s code word (e.g., for five pools, inclusion in pools 1 and 3, and exclusion in pools 2, 4, and 5 would result in the binary number “10100”).
  • the fraction of antigen pools F is typically $1 - 2^{-1/r}$, where r is the desired detection ability of a given TCR chain sequence to recognize r antigens. Table 1 provides the fraction, F, of the total number of antigen pools, N, that should be used for a given antigen according to the equation above.
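By way of a non-limiting illustration, the relationship between r, F, and n may be sketched as follows. The sketch assumes the fraction is F = 1 - 2^(-1/r), the value at which superimposing r code words leaves about half of the pools set. The function names, the choice of N = 20 pools, and the rounding of F*N to a whole number of pools are assumptions made for illustration and are not taken from the disclosure.

```python
def pool_fraction(r: int) -> float:
    """Fraction F of pools each antigen is placed in, assuming F = 1 - 2**(-1/r)."""
    if r < 1:
        raise ValueError("r must be at least 1")
    return 1.0 - 2.0 ** (-1.0 / r)


def pools_per_antigen(r: int, total_pools: int) -> int:
    """n = F*N, rounded to a whole pool (rounding rule is an assumption), at least 1."""
    return max(1, round(pool_fraction(r) * total_pools))


if __name__ == "__main__":
    total_pools = 20  # hypothetical N
    for r in range(1, 6):
        print(f"r={r}  F={pool_fraction(r):.4f}  n={pools_per_antigen(r, total_pools)}")
```

As r grows, F shrinks, so each antigen occupies fewer pools and the superimposed code words of r antigens remain distinguishable.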
  • each antigen (e.g., peptide) is assigned to F*N antigen pools, except that it is ensured that no two antigens are allocated to exactly the same group of antigen pools.
  • an antigen’s code word describes the pools in which it is present and absent, where “1” represents a pool where it is present and “0” represents a pool where it is absent. These binary digits are concatenated in pool number order (e.g., the antigen code word “01100” means the antigen is present in pools 2 and 3, and not present in pools 1, 4, and 5).
  • the assignment of antigens to antigen pools is recorded.
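As a minimal sketch of the pool assignment and code word construction described above, the following example assigns each antigen to n distinct pools and records the resulting code words. Drawing the pools at random is an assumption of this sketch, as are the function name, the seed parameter, and the peptide labels.

```python
import random
from math import comb


def assign_pools(antigens, total_pools, pools_per_antigen, seed=0):
    """Assign each antigen to `pools_per_antigen` pools; no two antigens share a pool set.

    Returns a dict mapping antigen name to its code word, a string of "1"/"0"
    digits in pool number order ("1" = present in that pool).
    """
    if len(antigens) > comb(total_pools, pools_per_antigen):
        raise ValueError("not enough distinct pool subsets for this many antigens")
    rng = random.Random(seed)  # random draw is an assumption of this sketch
    used = set()
    code_words = {}
    for antigen in antigens:
        while True:
            pools = frozenset(rng.sample(range(total_pools), pools_per_antigen))
            if pools not in used:  # reject a subset already given to another antigen
                break
        used.add(pools)
        code_words[antigen] = "".join(
            "1" if pool in pools else "0" for pool in range(total_pools)
        )
    return code_words


if __name__ == "__main__":
    # Hypothetical peptide labels; the returned dict is the recorded assignment.
    for name, word in assign_pools(["pep1", "pep2", "pep3"], 5, 2).items():
        print(name, word)
```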
  • the sequence’s enrichment is computed versus its presence in the sequencing data from the negative selection of this pool (e.g., CD8+ Not Activated).
  • the sequence’s enrichment is computed versus its presence in the sequencing data from other antigen pools.
  • LCR chain sequence enrichment is computed based on read counts.
  • enrichment is computed based on read counts as corrected by UMIs.
  • LCR chain sequence enrichment is computed based on cell counts.
  • pool specific LCR chain sequence enrichment is computed as described herein.
  • if an LCR chain sequence is enriched in a number of antigen pools that is larger than r*F*N, then the LCR chain sequence is flagged as recognizing more than r antigens.
  • the antigen pools it was assigned to are evaluated for enriched LCR chain sequences.
  • the LCR chain sequence is output as recognizing the antigen.
  • the false positive rate of the assay is expected to be bounded by $(1/2)^n$ when r is an accurate estimate. Thus, when n is more than about 3, the false positive rate should be small.
  • N is increased which causes a corresponding increase in n to lower the false positive rate to a desired level.
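For illustration only, the trade-off between the total number of pools and the false positive bound may be sketched as below. The sketch assumes the bound (1/2)^n with n = F*N and F = 1 - 2^(-1/r). The function names, the floor rounding of F*N, and the search limit are assumptions of this sketch.

```python
def false_positive_bound(pools_per_antigen: int) -> float:
    """Assumed upper bound on the false positive rate, (1/2)**n."""
    return 0.5 ** pools_per_antigen


def min_total_pools(r: int, target_rate: float, max_pools: int = 1000):
    """Smallest N whose n = floor(F*N) brings the bound at or below target_rate.

    Returns (N, n). Flooring F*N is a conservative choice made for this sketch.
    """
    fraction = 1.0 - 2.0 ** (-1.0 / r)
    for total_pools in range(1, max_pools + 1):
        n = int(fraction * total_pools)
        if n >= 1 and false_positive_bound(n) <= target_rate:
            return total_pools, n
    raise ValueError("target rate not reachable within max_pools")


if __name__ == "__main__":
    print(min_total_pools(r=1, target_rate=0.01))
```

Because F decreases with r, a larger expected r requires a larger N to reach the same n and therefore the same bound.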
  • collision free superimposed codes as described herein are utilized to ensure that every valid code word can be decoded into a single unique set of antigens.
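A minimal sketch of a collision check follows, under the assumption that a code is collision free when every bit-wise “OR” of up to r code words is produced by exactly one set of antigens. The function names and the example code words are hypothetical.

```python
from itertools import combinations


def superimpose(words):
    """Bit-wise OR of code words given as "0"/"1" strings, returned as an int."""
    mask = 0
    for word in words:
        mask |= int(word, 2)
    return mask


def find_collisions(code_words, r):
    """Return pairs of distinct antigen sets (size <= r) that superimpose identically."""
    seen = {}
    collisions = []
    names = sorted(code_words)
    for size in range(1, r + 1):
        for combo in combinations(names, size):
            mask = superimpose(code_words[name] for name in combo)
            if mask in seen:
                collisions.append((seen[mask], combo))
            else:
                seen[mask] = combo
    return collisions


if __name__ == "__main__":
    good = {"a": "1100", "b": "0011", "c": "1010"}
    bad = {"a": "110", "b": "011", "c": "101"}
    print(find_collisions(good, r=2))  # no colliding sets
    print(find_collisions(bad, r=2))   # every pair ORs to "111"
```

A random search of the kind mentioned in this disclosure could, for example, redraw code words until such a check reports no collisions.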
  • the receptor sequence pairing of LCR chain sequences (T cell alpha and beta, B cell heavy and light) is accomplished as described herein for paired chains that are assigned to the same antigen or antigens. Rank comparisons of read counts for pairing receptor chain sequences are done for each antigen separately.
  • a binary number corresponding to the enrichment of an LCR chain sequence is constructed by concatenating its enriched (“1”) and non-enriched (“0”) pools (e.g., “10101” corresponds to an LCR chain sequence enriched in pools 1, 3, and 5, and not enriched in pools 2 and 4).
  • the Hamming distance of this binary number is computed with respect to the result of the “OR” of the code words for each possible combination of the antigens. Described herein is a nearest set decoding algorithm which determines whether there is a unique nearest neighbor in Hamming distance between the binary number and a single antigen code word, or between the binary number and the Boolean bit-wise “OR” of a combination of two or more antigen code words.
  • when such a unique nearest neighbor in Hamming distance is found, the nearest set decoding algorithm outputs the corresponding combination of antigens as being recognized by the LCR chain sequence. For example, if there are K antigens, the method considers all $2^K$ possible “OR” combinations of antigen code words, including single code words, all combinations of 2 code words, all combinations of 3 code words, and so on.
  • This method allows decoding in situations where a LCR chain sequence is specific to more than one antigen (e.g., by computing a Hamming distance for a set of combined code words).
  • antigens are only considered in combinations if their code words have a minimum number of “1” bits that are also present in the binary number being decoded.
  • the method considers all possible “OR” combinations of antigen code words from up to r antigens (where r is the number of antigens expected to be recognized by a typical LCR used during encoding).
  • other distance metrics (e.g., Euclidean distance, cosine distance) can be used in place of the Hamming distance.
  • when no unique nearest neighbor is found, the nearest set decoding method outputs an error.
  • a nearest set decoding algorithm consists of the following computational steps. In some embodiments, the inputs for the computation are:
  • N: Number of antigen pools.
  • $E_{1,\ldots,N}$: The observed enrichment (enriched: “1”; non-enriched: “0”) of an LCR chain sequence in each of the N antigen pools.
  • $C_{1,\ldots,K}$: Matrix of code words for each of K antigens. $C_i$ specifies a binary number corresponding to the antigen pools where antigen i is present. The binary digits are concatenated in pool number order, where “1” represents a pool where the antigen is present, and “0” represents a pool where it is absent.
  • m: Threshold minimum number of antigen pools overlapping with the observed enrichment to consider an antigen for “OR” combinations during superimposed decoding.
  • Neighbor-Distance: A distance function (e.g., Hamming distance, Euclidean distance, cosine distance) used to compute the distance between two code words. This function takes in two code words represented as binary numbers and outputs an integer distance. In some embodiments, generalized minimum distance decoding or maximum likelihood decoding can be used for neighbor distance functions.
  • a corresponding binary number sequence B is constructed by concatenating the enriched (“1”) and non-enriched (“0”) pools for the LCR chain sequence.
  • a set of basis code words W is computed for the purpose of decoding.
  • $W = \bigcup_{i} C_i$ (where W is the union of all code words in C and i is a given antigen).
  • W is the union of all $2^K$ possible bit-wise Boolean “OR” combinations of antigen code words in C, including single code words, all combinations of 2 code words, all combinations of 3 code words, and so on, and each base code word in W is annotated by the combination of antigen code words used to create it. For example, if $C_1$ is “11000” and $C_2$ is “00101” then the combination of $C_1$ and $C_2$ would be represented by “11101” in W which is the bit-wise “OR” of the two code words, and “11101” would be annotated as the combination $C_1$ and $C_2$.
  • a superimposed code (e.g., a Zatocoding; a collision free superimposed code)
  • antigens are only considered in combinations if their code words have at least m “1” bits that are also present in B, the code word being decoded.
  • W does not include combinations of antigen code words for more than r antigens at once, and thus the number of possible “OR” combinations of antigen code words up to r antigens is $\sum_{i=1}^{r} \binom{K}{i}$ (where r is the number of antigens expected to be recognized by a typical LCR used during encoding).
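As a hedged illustration of why limiting W to combinations of at most r antigens matters, the sketch below counts the non-empty “OR” combinations of up to r antigens out of K, that is, the sum of the binomial coefficients C(K, i) for i from 1 to r, and compares that count with the roughly 2^K combinations of an unrestricted search. The function name and the example values of K and r are assumptions.

```python
from math import comb


def basis_size(num_antigens: int, r: int) -> int:
    """Number of non-empty antigen sets of size 1..r drawn from K antigens."""
    return sum(comb(num_antigens, i) for i in range(1, r + 1))


if __name__ == "__main__":
    for k, r in [(10, 2), (100, 3)]:
        print(f"K={k} r={r}: {basis_size(k, r)} basis code words, "
              f"versus {2 ** k - 1} without the limit")
```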
  • W stores both the binary code word and its annotation of the one or more antigens that corresponds to the basis code word.
  • the distances $d_1, \ldots, d_j$ between B and all basis code words $1, \ldots, j$ in W are computed using the Neighbor-Distance function.
  • the Neighbor-Distance function uses a Hamming distance
  • the Neighbor-Distance is the number of positions in a code word sequence in which the two code words differ.
  • N pools, a code word has N positions.
  • d2 1.
  • the output will be an error (“ERROR”). Otherwise, the output will be the annotated basis antigen(s) in Wi corresponding to basis code word di with distance z.
  • the output may consist of a single antigen or multiple antigens that were combined using “OR” to form basis code word Wi. If the output consists of multiple antigens, the LCR chain sequence is specific to more than one antigen.
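The computational steps above may be sketched, in a non-limiting way, as follows. The sketch assumes a Hamming Neighbor-Distance, limits combinations to at most r antigens, applies the overlap threshold m, and treats any tie for the minimum distance as an error. The function names, the string representation of code words, and the antigen labels are assumptions of this sketch.

```python
from itertools import combinations


def hamming_distance(a: int, b: int) -> int:
    """Number of bit positions in which two code words differ."""
    return bin(a ^ b).count("1")


def nearest_set_decode(enrichment, code_words, r, m=1):
    """Return the unique nearest antigen set as a tuple, or "ERROR" if none exists.

    enrichment: observed "0"/"1" string B, one digit per pool.
    code_words: dict mapping antigen name to its "0"/"1" code word.
    """
    observed = int(enrichment, 2)
    # Keep antigens sharing at least m "1" bits with the observed enrichment.
    candidates = sorted(
        name for name, word in code_words.items()
        if bin(int(word, 2) & observed).count("1") >= m
    )
    best_distance, best_sets = None, []
    for size in range(1, r + 1):
        for combo in combinations(candidates, size):
            basis = 0
            for name in combo:
                basis |= int(code_words[name], 2)  # bit-wise OR builds the basis word
            distance = hamming_distance(observed, basis)
            if best_distance is None or distance < best_distance:
                best_distance, best_sets = distance, [combo]
            elif distance == best_distance:
                best_sets.append(combo)
    if len(best_sets) != 1:  # no candidate, or no unique nearest neighbor
        return "ERROR"
    return best_sets[0]


if __name__ == "__main__":
    codes = {"C1": "11000", "C2": "00101", "C3": "01010"}
    print(nearest_set_decode("11101", codes, r=2))  # matches C1 OR C2 exactly
    print(nearest_set_decode("11100", codes, r=2))  # tie between {C1} and {C1, C2}
```

In the second call, the loss of a single pool leaves two antigen sets equally near, which illustrates why a unique nearest neighbor is required before an antigen set is reported.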
  • a separate control pool is established that contains no antigens/peptides (“Control Pool 0”; see FIG. 2).
  • This pool is separated at step 203, as are the other pools, and is used to detect cells that are activated when they are retrieved from a donor.
  • donor cells are derived from humans or animals.
  • LCR chain sequences that are found in the separated active set of cells in the control pool represent LCR chain sequences that correspond to host activated cells or cells that contain AIM markers that are not induced by the antigens/peptides in the other pools (i.e., the antigen pools). In some embodiments, these LCR chain sequences can be eliminated from the antigen specific set of LCR chain sequences discovered in the remainder of the antigen pools.
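One simple way to apply the control pool, offered only as a sketch, is to drop from every antigen pool any LCR chain sequence that also appears among the activated cells of the control pool. The function name and the placeholder sequence labels below are hypothetical and do not represent real receptor sequences.

```python
def remove_control_sequences(antigen_pool_hits, control_pool_hits):
    """Drop sequences seen in the activated fraction of the control pool.

    antigen_pool_hits: dict mapping pool number to a list of enriched sequences.
    control_pool_hits: iterable of sequences found activated in Control Pool 0.
    """
    control = set(control_pool_hits)
    return {
        pool: [seq for seq in sequences if seq not in control]
        for pool, sequences in antigen_pool_hits.items()
    }


if __name__ == "__main__":
    hits = {1: ["seqA", "seqB"], 2: ["seqB", "seqC"]}  # placeholder labels
    print(remove_control_sequences(hits, ["seqB"]))
```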
  • control antigens (e.g., control peptides)
  • Control antigens that are broadly present in the human population can be derived from common immunizations such as measles, mumps, rubella, polio, and other control antigens/peptides can be used in addition to antigens specific to a target of interest.
  • a threshold level of detection of the control antigens in a representative human population can be predetermined.
  • added control antigens (e.g., control peptides)
  • control peptides are added to the list of target antigens or query antigens to form a complete set of K antigens/peptides to be assayed (e.g., peptides 1-K can include one or more target peptides and one or more control peptides).
  • the counts of LCR chain sequences for control antigens can be used to normalize counts for other antigens to provide comparable figures across PBMC samples.
  • normalization is accomplished by adjusting the LCR chain sequence counts in a given sample for an antigen to be presented as a ratio of the antigen’s counts divided by the sum of the control antigen counts.
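The normalization described above may be sketched as follows, with each antigen's count divided by the sum of the control antigen counts from the same sample. The function name, the antigen labels, and the counts are hypothetical values chosen for illustration.

```python
def normalize_counts(antigen_counts, control_counts):
    """Express each antigen's LCR chain sequence count as a ratio to total control counts."""
    control_total = sum(control_counts.values())
    if control_total == 0:
        raise ValueError("no control antigen counts; the sample cannot be normalized")
    return {antigen: count / control_total for antigen, count in antigen_counts.items()}


if __name__ == "__main__":
    targets = {"target_1": 30, "target_2": 5}   # hypothetical counts
    controls = {"control_1": 40, "control_2": 60}
    print(normalize_counts(targets, controls))
```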
  • antigens are distributed into antigen pools based on a minimum Hamming distance between the binary encoding of pools where they reside as described in this disclosure (e.g., using a Hamming(7,4) code; see FIG. 1).
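As an illustrative sketch of a Hamming(7,4) code used as pool code words, the example below generates the 16 code words from one common systematic generator matrix and checks that any two differ in at least three positions. The particular parity rows, the function name, and the suggestion that each antigen takes one non-zero code word across seven pools are assumptions of this sketch; FIG. 1 may use a different arrangement.

```python
from itertools import product

# Parity part P of a systematic generator matrix G = [I4 | P] (one common choice).
PARITY_ROWS = [(1, 1, 0), (1, 0, 1), (0, 1, 1), (1, 1, 1)]


def hamming74_encode(data_bits):
    """Encode 4 data bits into a 7-digit code word (4 data digits, then 3 parity digits)."""
    parity = [
        sum(bit * row[j] for bit, row in zip(data_bits, PARITY_ROWS)) % 2
        for j in range(3)
    ]
    return "".join(str(bit) for bit in list(data_bits) + parity)


code_words = [hamming74_encode(bits) for bits in product((0, 1), repeat=4)]

# Minimum pairwise Hamming distance over all distinct code words.
min_distance = min(
    sum(x != y for x, y in zip(a, b))
    for i, a in enumerate(code_words)
    for b in code_words[i + 1:]
)

if __name__ == "__main__":
    # The all-zero word would place an antigen in no pool, so it would be skipped.
    print(len(code_words) - 1, "usable code words, minimum distance", min_distance)
```

A minimum distance of three means that a single erroneous pool reading still leaves the observed pattern closer to the correct code word than to any other.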
  • codes for asymmetric channels can be used when the chance of a “1” occurring by error is higher than the chance of a “0” occurring by error such as when a T cell recognizes more than one antigen (see Kim and Freiman, 1959, for examples of asymmetric codes).
  • other error correcting codes can be employed.
  • FIG. 3 shows a method of determining one or more LCR chain sequences associated with lymphocytes that are expanded by one or more identified first antigens, where the lymphocytes are subsequently activated by one or more query antigens.
  • at least one of the query antigens and first antigens are not identical.
  • a tissue sample (e.g., PBMCs) is prepared at step 301.
  • the tissue sample is then HLA typed at step 302 to determine the predicted display of antigens by the MHC molecules present in the tissue sample.
  • the HLA typing at step 302 is used to determine the pool specific first antigens and query antigens that are used based upon their predicted or known display by MHC molecules.
  • lymphocytes (e.g., B cells, CD4+ T cells, and/or CD8+ T cells, or any other desired set of lymphocytes or lymphocyte combinations) are enriched at step 303 (first lymphocyte enrichment) from a portion of the tissue sample from step 301 using negative magnetic bead selection, or other methods as are known in the art including methods described in Dagur et al. (2015) “Collection, Storage, and Preparation of Human Blood Cells.” Curr Protoc Cytom.
  • the output of step 303 (enriched lymphocytes) is divided into one or more unstimulated pools (i.e., one or more control pools) and N stimulated pools at step 304.
  • step 303 is omitted, and the method proceeds directly to step 304 where the output of step 301 (tissue sample preparation) is divided into unstimulated pool(s)
  • first antigen presenting cells are prepared from the tissue sample at step 301 using various methods such as those described by Schanen et al. (2008) “A novel approach for the generation of human dendritic cells from blood monocytes in the absence of exogenous factors.” J Immunol Methods. 2008 Jun 1;335(1-2):53-64, and Moser et al. (2010) “Optimization of a dendritic cell-based assay for the in vitro priming of naive human CD4+ T cells.” J Immunol Methods.
  • the first APCs are divided into a total of N first APC pools.
  • pool specific first antigens are added to the N first APC pools, wherein each pool specific first antigen is added to a unique subset of the N first APC pools using the encoding methods described herein.
  • nucleic acid constructs encoding the pool specific first antigens are transfected or virally delivered into the cells in the N first APC pools with the pool selection being accomplished using the encoding strategies described herein.
  • the pool specific first antigens are vaccines or proteins
  • the first APCs (e.g., dendritic cells)
  • the N first APC pools from step 306 are added to the N stimulation pools from step 304 with corresponding numbers (e.g., APC pool 1 is added to stimulation pool 1, etc.).
  • the first APCs from step 305 are added to the unstimulated pools from step 304 without exposure to the pool specific first antigens.
  • the unstimulated pools and N stimulation pools at step 304 will already contain APCs (e.g., dendritic cells) and thus steps 305 and 306 are eliminated and each pool specific first antigen is added directly to a unique subset of the N stimulation pools at step 307 using the encoding methods described herein.
  • control antigens (e.g., a CAP1 peptide or other known MHC class I or class II control peptides)
  • control antigens are added to all pools at step 307.
  • control antigens are added to the first APCs at step 305.
  • the control antigens are selected based upon the HLA typing from step 302.
  • the lymphocytes from step 307 are allowed to expand.
  • typical expansion times are 10 to 12 days, and typical culture expansion conditions are described by Tapia-Calle et al. (2019) and Schanen et al. (2011) “Coupling sensitive in vitro and in silico techniques to assess cross-reactive CD4(+) T cells against the swine-origin H1N1 influenza virus.” Vaccine. 2011 Apr 12;29(17):3299-309, each of which is incorporated by reference in its entirety herein.
  • multiple rounds of in vitro stimulation are used that repeat steps 305-308 to expand rare lymphocytes, for example using the in vitro stimulation cycle method described in Abrams et al. (1997) “Generation of stable CD4+ and CD8+ T cell lines from patients immunized with ras oncogene-derived peptides reflecting codon 12 mutations.” Cell Immunol. 1997 Dec 15;182(2):137-51, incorporated by reference in its entirety herein.
  • the enrichment of lymphocytes activated by the control antigens added at step 305 or step 307 is monitored to determine the number of rounds of in vitro stimulation required.
  • step 309 second lymphocyte enrichment step
  • step 309 is omitted and lymphocytes are not enriched after they have undergone expansion at step 308.
  • second APCs are prepared fresh in step 310 from the tissue sample at step 301 as described herein, the second APCs are added into a single pool, and the query antigens are added to this single pool of second APCs.
  • the second APCs are pulsed (e.g., for two hours) and the second APCs are then washed.
  • nucleic acid constructs encoding the query antigens are transfected or virally delivered into the second APCs.
  • the second APCs after antigen addition in step 310, are added to the unstimulated pool(s) and N stimulated pools.
  • the query antigens are all added directly to the unstimulated pool(s) and to the N stimulated pools along with output of second APC preparation from step 310.
  • the unstimulated pool(s) and N stimulated pools will already contain APCs (e.g., dendritic cells), so step 310 is eliminated and, at step 311, all query antigens are added directly to the one or more unstimulated pool(s) and N stimulation pools.
  • cells in the resulting pools are given time to activate and then each pool is separated by markers for activated and non-activated lymphocytes of a desired type, the LCR chains in each pool specific fraction are sequenced, and the decoding algorithm described herein is used to assign, at step 313, LCR chain sequences to one or more first antigens based upon their expansion of lymphocytes that were subsequently activated by query antigens.
  • the enrichment of LCR chain sequences in the N stimulated pools utilizes the LCR chain sequence read counts or cell counts observed for the same LCR chain sequence in the unstimulated pool(s), and the detection of an enriched LCR chain sequence of a lymphocyte that recognizes a first antigen in one or more of the N stimulation pools is based upon its relative read count or cell count when compared to the unstimulated pool(s).
  • This enrichment is then used for decoding one or more pool specific first antigens as described herein.
  • This LCR chain sequence enrichment corresponds to a lymphocyte that is activated by at least one of the query antigens in addition to the one or more first antigens that are decoded.
  • these LCR chain sequences recognize both the one or more first antigens decoded and at least one of the query antigens.
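The comparison against the unstimulated pool(s) may be sketched as below, where a pool is called enriched for an LCR chain sequence when its read count exceeds the unstimulated count by a chosen fold. The fold threshold, the pseudocount, the function name, and the example counts are all assumptions of this sketch, and a practical analysis would likely also account for differences in sequencing depth between pools.

```python
def enrichment_vector(stimulated_counts, unstimulated_count, min_fold=5.0, pseudocount=1.0):
    """Build the "0"/"1" enrichment string for one LCR chain sequence.

    stimulated_counts: read (or cell) counts of the sequence in pools 1..N, in order.
    unstimulated_count: count of the same sequence in the unstimulated pool(s).
    min_fold and pseudocount are hypothetical tuning values.
    """
    digits = []
    for count in stimulated_counts:
        fold = (count + pseudocount) / (unstimulated_count + pseudocount)
        digits.append("1" if fold >= min_fold else "0")
    return "".join(digits)


if __name__ == "__main__":
    # Hypothetical counts for one sequence across five stimulated pools.
    print(enrichment_vector([120, 3, 95, 0, 2], unstimulated_count=4))
```

The resulting string has the same form as the binary number passed to the decoding step described herein.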
  • a method for determining one or more LCR chain sequences associated with lymphocytes that are activated by one or more identified query antigens, where the lymphocytes have been previously expanded by one or more first antigens.
  • at least one of the query antigens and first antigens are not identical.
  • a tissue sample (e.g., PBMCs) is prepared at step 401 and is HLA typed at step 402 to determine the predicted display of antigens by the MHC molecules present in the tissue sample.
  • the HLA typing from step 402 is used to determine the first antigens and query antigens that are used based upon their predicted or known display by MHC molecules.
  • lymphocytes (e.g., B cells, CD4+ T cells, and/or CD8+ T cells, or any other desired set of lymphocytes or lymphocyte combinations) are enriched from a portion of the tissue sample at step 403 using negative magnetic bead selection, or other methods including methods described in Dagur et al. (2015).
  • the tissue sample is used directly without lymphocyte enrichment and step 403 is omitted.
  • the output of step 403 is divided into one or more unstimulated pools (i.e., one or more control pools) and a stimulated pool.
  • APCs are prepared from the tissue sample from step 401 using, for example, the methods described by Schanen et al. (2008) and Moser et al. (2010). Preparing purified APCs for antigen presentation to lymphocytes can improve the effectiveness of antigen display and of lymphocyte activation by the APCs.
  • the APCs from step 405 are divided into a control APC pool (not exposed to first antigens) and a first antigen exposed APC pool.
  • the first antigens are then added to the first antigen exposed APC pool.
  • the antigen exposed APC fraction of cells is pulsed (e.g., for two hours) with the first antigens and then washed.
  • nucleic acid constructs encoding the first antigens are transfected or virally delivered into the antigen exposed fraction of APCs.
  • the first antigen exposed APCs from step 406 are added to the stimulated pool from step 404.
  • the control APC pool (not exposed to first antigen) from step 406 is added to the unstimulated pool(s) from step 404.
  • the first antigens are added directly to the stimulated pool from step 404 along with output of APC preparation from step 405.
  • the unstimulated (i.e., control) and stimulated pools from step 404 will already contain APCs (e.g., dendritic cells), steps 405 and 406 are omitted, and the first antigens are added directly to the N stimulation pools from step 404.
  • control antigens (e.g., a CAP1 peptide or other known MHC class I or class II control peptides) are added to all pools at step 407.
  • control antigens are added to the first APCs at step 405.
  • the control antigens are selected based upon the HLA typing from step 402.
  • the lymphocytes from step 407 are allowed to expand.
  • typical expansion times are 10-12 days, and typical culture expansion conditions are described by Tapia-Calle et al. (2019) and Schanen et al. (2011).
  • multiple rounds of in vitro stimulation are used that repeat steps 405, 406, and 407 to expand rare lymphocytes, for example using the in vitro stimulation cycle method described in Abrams et al. (1997) “Generation of stable CD4+ and CD8+ T cell lines from patients immunized with ras oncogene-derived peptides reflecting codon 12 mutations.” Cell Immunol. 1997 Dec 15; 182(2): 137-51, incorporated in its entirety herein.
  • the enrichment of lymphocytes activated by the control antigens added at step 405 or 407 is monitored to determine the number of rounds of in vitro stimulation required.
  • desired lymphocytes are enriched at step 409 using negative magnetic bead selection, or other methods as described above.
  • step 409 is omitted and lymphocytes are not enriched after they have undergone expansion at step 408.
  • fresh second APCs are prepared at step 410 from the tissue sample prepared at step 401 as described herein, and the second APCs are split into second control APC and second N APC pools.
  • pool specific query antigens are encoded and placed into the second N APC pools as described by the methods herein.
  • all of the pool specific query antigens are added to the second control pool of APCs from step 410 to test the unstimulated pool(s) for lymphocyte activation that is independent of first antigen stimulation.
  • the APCs are pulsed (e.g., for two hours) in their respective pools and then the APCs are washed.
  • the stimulated pool is divided into N stimulated pools.
  • the antigen exposed second N APC pools from step 411 are added to these N stimulated pools with corresponding numbers (e.g., second APC pool 1 is added to stimulated pool 1, etc.).
  • the second control APC pools (exposed to the query antigens) from step 411 are added to the unstimulated pool(s).
  • when lymphocyte enrichment at step 409 is not used, the output of step 408 will already contain APCs (e.g., dendritic cells), steps 410 and 411 are omitted, and at step 412, pool specific query antigens are added to the unstimulated and N stimulated pools created by step 412, with pool selection for each antigen accomplished using the encoding methods described herein.
  • the lymphocytes are given time to activate, and then each pool is separated by markers for activated and non-activated lymphocytes of a desired type, the LCR chains in each pool specific fraction are sequenced, and the decoding algorithm described herein is used to assign LCR chain sequences to one or more query antigens that activate lymphocytes that were expanded by the set of first antigens.
  • the enrichment of LCR chain sequences in the N stimulated pools utilizes the LCR chain sequence read counts or cell counts observed for the same LCR chain sequence in the unstimulated pool(s), and the detection of an enriched LCR chain sequence of a lymphocyte that recognizes a query antigen in one or more of the N stimulated pools is based upon its increased read count or cell count when compared to the unstimulated pool(s).
  • This enrichment is then used for decoding one or more pool specific query antigens as described herein.
  • This LCR chain sequence enrichment corresponds to a lymphocyte that is expanded by at least one of the first antigens in addition to the one or more query antigens that are decoded.
  • these LCR chain sequences recognize both the one or more query antigens decoded and at least one of the first antigens.
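  • The comparison of stimulated and unstimulated pools described above can be sketched as follows. This is a minimal illustration, not the statistical metric of the source (which is defined elsewhere in the specification): the function name `enriched_pools`, the fold-change threshold `min_fold`, and the `pseudocount` are hypothetical, and the counts are assumed to be already normalized for sequencing depth.

```python
from typing import List


def enriched_pools(stim_counts: List[float],
                   unstim_count: float,
                   min_fold: float = 4.0,
                   pseudocount: float = 1.0) -> List[int]:
    """Return a 0/1 pattern with one entry per stimulated pool.

    A pool is marked 1 when the LCR chain sequence count in that pool,
    relative to its count in the unstimulated pool, reaches min_fold.
    The threshold and pseudocount are illustrative assumptions.
    """
    pattern = []
    for count in stim_counts:
        fold = (count + pseudocount) / (unstim_count + pseudocount)
        pattern.append(1 if fold >= min_fold else 0)
    return pattern


# Hypothetical counts for one LCR chain sequence in N = 4 stimulated pools,
# with 3 reads observed in the unstimulated (control) pool.
pattern = enriched_pools([50, 2, 0, 40], unstim_count=3)
print(pattern)
```

  • The resulting 0/1 pattern across pools is the input that a decoding step would compare against the antigen code words.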
  • APCs or APCs mixed with other cell types can be stimulated with a vaccine that consists of one or more antigens that are physically associated (e.g., covalently coupled) to a VHH domain that binds to cells that have MHC class II molecules on their surface.
  • a VHH targeting domain is any VHH domain that competes for binding to MHC class II complexes HLA-DR1, HLA-DR2, and HLA-DR4 with a VHH comprising SEQ ID NO: 1 or SEQ ID NO:
  • VHH targeting domains are VHH molecules that bind to cell surface proteins of antigen presenting cells (e.g., DEC-205). In some embodiments, VHH targeting domains are VHH molecules that bind to cell surface proteins present on cells that have MHC class II molecules on their surface. In some embodiments, VHH targeting domains are VHH molecules that bind to cell type specific surface proteins (e.g.,
  • antigens physically associated with VHH targeting domains are used in one or more of the following steps: steps 306 and 311 of FIG. 3, as well as steps 406 and 411 of FIG. 4.
  • Examples of VHH targeting domains are SEQ ID NO: 1 and SEQ ID NO: 2.
  • VHH targeting domains are joined to antigens with linker sequences including fusion protein linkers described in Chen et al. (2012) “Fusion protein linkers: property, design and functionality.” Advanced Drug Delivery Reviews 65.10 (2013): 1357-1369. PMID 23026637, which is incorporated by reference in its entirety herein.
  • linker sequences appear before an antigen.
  • linker sequences appear after an antigen.
  • antigens are naturally occurring epitopes, such as the KRAS neoantigens LVVVGADGV (SEQ ID NO: 5) and EYKLVVVGADGVG (SEQ ID NO: 7).
  • antigens are heteroclitic derivatives of naturally occurring epitopes as described by U.S. Patent No. 11,058,751, which is incorporated in its entirety herein.
  • a vaccine comprises one or more heteroclitic antigens that are physically associated with a VHH targeting domain.
  • LMVVGADGV (SEQ ID NO: 4) is a heteroclitic derivative of LVVVGADGV (SEQ ID NO: 5), and EYKFVVFGSDGAG (SEQ ID NO: 6) is a heteroclitic derivative of EYKLVVVGADGVG (SEQ ID NO: 7).
  • An example of a VHH targeting domain (SEQ ID NO: 1) that is combined with a linker (SEQ ID NO: 3) and the single heteroclitic antigen LMVVGADGV (SEQ ID NO: 4) is SEQ ID NO: 8.
  • VHH targeting domain (SEQ ID NO: 1)
  • linker (SEQ ID NO: 3)
  • heteroclitic antigen LMVVGADGV (SEQ ID NO: 4)
  • linker (SEQ ID NO: 3)
  • heteroclitic antigen EYKFVVFGSDGAG (SEQ ID NO: 6)
  • a VHH-antigen molecule is a single polypeptide vaccine that encodes one or more antigens that are covalently coupled to a VHH targeting domain.
  • Examples of VHH-antigen molecules are SEQ ID NO: 8 and SEQ ID NO: 9.
  • VHH-antigen molecules can be expressed and purified, using for example the methods described in U.S. Patent No. 9,751,945, which is incorporated herein in its entirety.
  • a VHH-antigen molecule is encoded as an mRNA molecule that is expressed in vivo, for example in a cell line or in an individual.
  • the encoding of a VHH-antigen molecule as an mRNA molecule for expression includes a start codon at its beginning.
  • the encoding of a VHH-antigen molecule as an mRNA molecule includes a secretion signal sequence as described in U.S. Patent No. 9,751,945, which is incorporated herein in its entirety.
  • a VHH-antigen mRNA molecule is delivered with an mRNA-LNP formulation as is known in the art.
  • a vaccine for administration to an individual can be constructed by physically associating (e.g., covalent coupling) one or more antigens to a VHH targeting domain.
  • a vaccine for administration to an individual can be constructed by physically associating (e.g., covalent coupling) one or more heteroclitic antigens to a VHH targeting domain.
  • collision free superimposed codes are used to assign antigens to pools.
  • a collision free superimposed code is defined as a superimposed code that guarantees that each superimposed code word has a unique decoding into one or more antigens.
  • a superimposed code encodes multiple antigens into a single superimposed code word by the logical “OR” of their antigen specific code words.
  • collision free superimposed codes assume that A antigens are each placed into n pools out of a total of N pools and LCRs only recognize up to r antigens.
  • the superimposed code for antigens 1 and 2 in Table 2 is “1 1 1 0 0 1 1 0 0 1” which does not collide with any other antigen code word (or superimposed code word of two antigens) in Table 2.
  • the collision free superimposed code in Table 2 guarantees that any superimposed code word (a single antigen code word, or the logical OR of any two antigen code words) has a unique decoding into its originating one or two antigens.
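  • The logical OR construction and the collision free property described above can be sketched as follows. The code words here are a small hypothetical example, not the contents of Table 2, and the helper names `superimpose` and `is_collision_free` are illustrative.

```python
from itertools import combinations
from typing import Dict, List


def superimpose(words: List[str]) -> str:
    """Bitwise logical OR of equal-length code words written as '0'/'1' strings."""
    length = len(words[0])
    return "".join(
        "1" if any(w[i] == "1" for w in words) else "0" for i in range(length)
    )


def is_collision_free(code: Dict[str, str], r: int) -> bool:
    """True if every set of 1..r antigens yields a distinct superimposed word."""
    seen = set()
    for k in range(1, r + 1):
        for subset in combinations(sorted(code), k):
            word = superimpose([code[a] for a in subset])
            if word in seen:
                return False  # two different antigen sets share one word
            seen.add(word)
    return True


# Hypothetical code: 3 antigens, N = 6 pools, n = 2 pools per antigen.
good = {"A": "110000", "B": "001100", "C": "000011"}
# Hypothetical colliding code: A|B and A|C both give "1110".
bad = {"A": "1100", "B": "0110", "C": "1010"}

print(superimpose([good["A"], good["B"]]))
print(is_collision_free(good, r=2), is_collision_free(bad, r=2))
```

  • With r = 2 the check enumerates every single antigen and every pair, which mirrors the guarantee stated for single code words and the logical OR of any two code words.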
  • nearest set decoding as described herein can be used to determine the antigens recognized by an LCR based upon the appearance of the LCR receptor sequence in pools that correspond to a “1” in a superimposed code, and “0” where the LCR receptor sequence does not appear.
  • LCR receptor sequence appearance in a pool is based upon statistical metrics as described herein.
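  • One way to picture the decoding step is the sketch below. It assumes that "nearest" means the smallest Hamming distance between the observed pool pattern and the superimposed word of a candidate antigen set; the specification's own definition is given elsewhere and may differ. The code words and the function name `nearest_set_decode` are hypothetical, and ties are broken in favor of the first (smallest) candidate set examined.

```python
from itertools import combinations
from typing import Dict, Tuple


def nearest_set_decode(observed: str,
                       code: Dict[str, str],
                       r: int) -> Tuple[Tuple[str, ...], int]:
    """Return (antigen set, distance) for the closest superimposed code word.

    observed is a '0'/'1' string with one position per pool: '1' where the
    LCR sequence was judged to appear, '0' where it was not.
    """
    best_set, best_dist = None, None
    for k in range(1, r + 1):
        for subset in combinations(sorted(code), k):
            # Logical OR of the code words of the candidate antigens.
            word = "".join(
                "1" if any(code[a][i] == "1" for a in subset) else "0"
                for i in range(len(observed))
            )
            dist = sum(x != y for x, y in zip(word, observed))
            if best_dist is None or dist < best_dist:
                best_set, best_dist = subset, dist
    return best_set, best_dist


# Hypothetical code: 3 antigens, N = 6 pools, n = 2 pools per antigen.
code = {"A": "110000", "B": "001100", "C": "000011"}

print(nearest_set_decode("111100", code, r=2))  # exact match to A OR B
print(nearest_set_decode("110111", code, r=2))  # one spurious pool
```

  • A nonzero distance indicates that the observed pattern matched no code word exactly, so a practical implementation would likely also report the distance or reject decodings above some cutoff.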
  • collision free superimposed codes are determined by a random search method.
  • an antigen is chosen at random to initialize the search.
  • at Step 1, a random code word is chosen for the antigen that is distinct from any previously chosen antigen code word, where the randomly chosen antigen code word has exactly n “1” bits and a total length of N bits.
  • at Step 2, all superimposed code words for existing antigens and the new antigen code word for combinations up to r are computed.
  • at Step 3, if any of the superimposed code words computed in Step 2 are the same, then the method returns to Step 1 to pick a replacement antigen code word.
  • at Step 4, the code word for the antigen is recorded, and a new antigen is chosen at random, and the method continues again from Step 1.
  • once every antigen has a recorded code word, the method has determined a collision free superimposed code. In some embodiments, if at Step 1 all possible remaining code words have been tried for a given antigen, then the method stops with failure for the parameters provided, and the method can be repeated starting over from Step 1. In some embodiments, if a fixed number of random code words selected at Step 1 fail in a row without a new code word being recorded at Step 4, the method stops with failure to find a collision free superimposed code, and the method can be repeated from Step 1. After multiple failed attempts, it is possible that a superimposed code with the given constraints does not exist.
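  • The random search above can be sketched as follows, under stated assumptions: code words are held as integers whose bits mark pools, antigens are treated as interchangeable so the random choice of the next antigen reduces to filling the next slot, and the names `random_search_code`, `collision_free`, and `max_fail` are hypothetical.

```python
import random
from itertools import combinations
from typing import List, Optional


def collision_free(words: List[int], r: int) -> bool:
    """Steps 2-3: OR every combination of up to r words and look for repeats."""
    seen = set()
    for k in range(1, r + 1):
        for subset in combinations(words, k):
            word = 0
            for w in subset:
                word |= w
            if word in seen:
                return False
            seen.add(word)
    return True


def random_search_code(num_antigens: int, N: int, n: int, r: int,
                       max_fail: int = 1000,
                       seed: Optional[int] = None) -> Optional[List[int]]:
    """Return one N-bit code word (n bits set) per antigen, or None on failure."""
    rng = random.Random(seed)
    # All C(N, n) candidate code words with exactly n "1" bits.
    candidates = [sum(1 << b for b in bits) for bits in combinations(range(N), n)]
    code: List[int] = []
    for _ in range(num_antigens):
        untried = [w for w in candidates if w not in code]
        rng.shuffle(untried)  # Step 1: random order, no word tried twice
        for attempt, word in enumerate(untried):
            if attempt >= max_fail:
                return None  # too many consecutive failures
            if collision_free(code + [word], r):
                code.append(word)  # Step 4: record and move to next antigen
                break
        else:
            return None  # every remaining code word was tried and failed
    return code


code = random_search_code(num_antigens=3, N=6, n=2, r=2, seed=0)
print([format(w, "06b") for w in code])
```

  • A return value of None corresponds to the failure cases in the text; the caller can retry with a different seed or relax N, n, or r.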
  • antigens are arranged into overlap sets, where it is assumed that no LCR can recognize antigens in distinct overlap sets. For example, 30 antigens can be organized into 10 overlap sets of 3 antigens each. In this example, it is assumed that each LCR may recognize a maximum of r antigens in each overlap set.
  • a collision free superimposed code consists of a prefix code that determines an overlap set, and a suffix code that determines the one or more antigens within this overlap set.
  • a given antigen is placed into pools corresponding to “1” bits in the prefix code for its overlap set, and into pools corresponding to “1” bits in its antigen specific code (the suffix code) within its overlap set.
  • the suffix code has one code word for each antigen in the largest overlap set.
  • overlap sets share code words (e.g., the first antigen in each overlap set has the same suffix code word, the second antigen in each overlap set has the same suffix code word, etc.).
  • the suffix code is a collision free superimposed code with r equal to the assumed maximum number of antigens that are recognized by an LCR within an overlap set.
  • the number of bits (e.g., pools) for the suffix code is chosen to accommodate the number of antigens in the largest overlap set and the value of r.
  • Table 3 illustrates a collision free superimposed code for 30 antigens placed into 8 pools where each LCR is assumed to not recognize antigens in distinct overlap sets.
  • a “1” indicates that an antigen is placed into a pool, and a “0” indicates that an antigen is not placed into a pool.
  • the example superimposed code in Table 3 is for 30 antigens organized into 10 overlap sets of 3 antigens per set.
  • a prefix code is used to place the 30 antigens into pools P1 to P5, and a suffix code is used to place the 30 antigens into pools P6 to P8.
  • the prefix code uses a two out of five encoding system.
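  • The prefix and suffix layout can be sketched as follows. A two out of five code has C(5, 2) = 10 code words, which matches the 10 overlap sets. The actual suffix code words of Table 3 are not reproduced here, so this sketch assumes a one-hot suffix over pools P6 to P8 (which is collision free for any r up to 3); the function names are hypothetical.

```python
from itertools import combinations
from typing import Dict, List, Tuple

Word = Tuple[int, ...]


def two_of_five_prefixes() -> List[Word]:
    """All 10 five-bit words with exactly two '1' bits (pools P1 to P5)."""
    return [tuple(1 if i in pair else 0 for i in range(5))
            for pair in combinations(range(5), 2)]


def build_overlap_code(num_sets: int = 10,
                       set_size: int = 3) -> Dict[Tuple[int, int], Word]:
    """Map (overlap set index, antigen index within set) to a pool code word."""
    prefixes = two_of_five_prefixes()
    # Assumed one-hot suffix, shared by all overlap sets (pools P6 to P8).
    suffixes = [tuple(1 if i == j else 0 for i in range(set_size))
                for j in range(set_size)]
    return {(s, j): prefixes[s] + suffixes[j]
            for s in range(num_sets) for j in range(set_size)}


def decode_overlap(observed: Word) -> Tuple[int, List[int]]:
    """Read the overlap set from the prefix and the antigens from the suffix."""
    set_index = two_of_five_prefixes().index(tuple(observed[:5]))
    antigens = [j for j, bit in enumerate(observed[5:]) if bit]
    return set_index, antigens


code = build_overlap_code()
print(len(code), code[(0, 0)])
print(decode_overlap((1, 1, 0, 0, 0, 1, 0, 1)))
```

  • Because an LCR is assumed not to recognize antigens in distinct overlap sets, every antigen it recognizes shares one prefix, so the OR of the prefixes is that prefix itself and only the suffix bits need to separate antigens within the set.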

Abstract

The invention relates to a method for determining a T cell receptor chain sequence specific to a unique antigen, comprising: sorting a plurality of antigens into a plurality of reaction mixtures, the sorting comprising adding a unique antigen of the plurality of antigens to a unique subset of the plurality of reaction mixtures such that no two different unique antigens are added to the same unique subset; contacting each reaction with a biological sample comprising a plurality of lymphocytes; separating a target lymphocyte from a subset of the plurality of lymphocytes, the target lymphocyte recognizing the unique antigen; after separating the target lymphocyte, sequencing nucleic acids of the target lymphocyte to obtain the lymphocyte receptor chain sequence, the sequencing being performed by single-cell sequencing; and detecting the unique antigen, the detecting comprising: calculating a frequency of lymphocyte cells that express the lymphocyte receptor chain sequence.
PCT/US2022/011275 2021-01-06 2022-01-05 Multiplexed testing of lymphocytes for antigen specificity WO2022150354A1 (fr)

Applications Claiming Priority (6)

Application Number Priority Date Filing Date Title
US17/142,745 2021-01-06
US17/142,745 US11111489B1 (en) 2021-01-06 2021-01-06 Multiplexed testing of lymphocytes for antigen specificity
US17/386,702 US20220213466A1 (en) 2021-01-06 2021-07-28 Multiplexed testing of lymphocytes for antigen specificity
US17/386,702 2021-07-28
US202163262974P 2021-10-25 2021-10-25
US63/262,974 2021-10-25

Publications (1)

Publication Number Publication Date
WO2022150354A1 (fr) 2022-07-14

Family

ID=82358138

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2022/011275 WO2022150354A1 (fr) 2021-01-06 2022-01-05 Multiplexed testing of lymphocytes for antigen specificity

Country Status (1)

Country Link
WO (1) WO2022150354A1 (fr)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5371750A (en) * 1990-03-02 1994-12-06 Mitsubishi Denki Kabushiki Kaisha Error-correction encoding and decoding system
US20150025812A1 (en) * 2011-01-27 2015-01-22 Norman A. Paradis Method and apparatus for discovery, development and clinical application of multiplex assays based on patterns of cellular response
US20180087109A1 (en) * 2014-04-01 2018-03-29 Adaptive Biotechnologies Corp. Determining antigen-specific t-cells
US20190025299A1 (en) * 2015-09-25 2019-01-24 Francois Vigneault High throughput process for t cell receptor target identification of natively-paired t cell receptor sequences
US11111489B1 (en) * 2021-01-06 2021-09-07 Think Therapeutics, Inc. Multiplexed testing of lymphocytes for antigen specificity

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
MARK KLINGER, FRANCOIS PEPIN, JEN WILKINS, THOMAS ASBURY, TOBIAS WITTKOP, JIANBIAO ZHENG, MARTIN MOORHEAD, MALEK FAHAM: "Multiplex Identification of Antigen-Specific T Cell Receptors Using a Combination of Immune Assays and Immune Receptor Sequencing", PLOS ONE, vol. 10, no. 10, pages e0141561, XP055389430, DOI: 10.1371/journal.pone.0141561 *

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22737025

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 22737025

Country of ref document: EP

Kind code of ref document: A1