CN111850018B - Multi-MHC genotype and antigen MiniGene combinatorial library, and construction method and application thereof - Google Patents

Info

Publication number
CN111850018B
Authority
CN
China
Prior art keywords
epitope
mhc
sequence
vector
screening
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010634108.1A
Other languages
Chinese (zh)
Other versions
CN111850018A (en
Inventor
陈红松
谢兴旺
廖维甲
陈冬波
陈谱
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Peking University Peoples Hospital
Original Assignee
Peking University Peoples Hospital
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Peking University Peoples Hospital
Priority to CN202010634108.1A
Publication of CN111850018A
Application granted
Publication of CN111850018B
Legal status: Active
Anticipated expiration

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/63 Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K14/00 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
    • C07K14/435 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans
    • C07K14/705 Receptors; Cell surface antigens; Cell surface determinants
    • C07K14/70503 Immunoglobulin superfamily
    • C07K14/7051 T-cell receptor (TcR)-CD3 complex
    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K14/00 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
    • C07K14/435 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans
    • C07K14/705 Receptors; Cell surface antigens; Cell surface determinants
    • C07K14/70503 Immunoglobulin superfamily
    • C07K14/70539 MHC-molecules, e.g. HLA-molecules
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/63 Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
    • C12N15/66 General methods for inserting a gene into a vector to form a recombinant vector using cleavage and ligation; Use of non-functional linkers or adaptors, e.g. linkers containing the sequence for a restriction endonuclease
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/63 Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
    • C12N15/79 Vectors or expression systems specially adapted for eukaryotic hosts
    • C12N15/85 Vectors or expression systems specially adapted for eukaryotic hosts for animal cells
    • C12N15/86 Viral vectors
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6806 Preparing nucleic acids for analysis, e.g. for polymerase chain reaction [PCR] assay
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6869 Methods for sequencing
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6876 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
    • C12Q1/6883 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material
    • C12Q1/6886 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material for cancer
    • C CHEMISTRY; METALLURGY
    • C40 COMBINATORIAL TECHNOLOGY
    • C40B COMBINATORIAL CHEMISTRY; LIBRARIES, e.g. CHEMICAL LIBRARIES
    • C40B40/00 Libraries per se, e.g. arrays, mixtures
    • C40B40/02 Libraries contained in or displayed by microorganisms, e.g. bacteria or animal cells; Libraries contained in or displayed by vectors, e.g. plasmids; Libraries containing only microorganisms or vectors
    • C CHEMISTRY; METALLURGY
    • C40 COMBINATORIAL TECHNOLOGY
    • C40B COMBINATORIAL CHEMISTRY; LIBRARIES, e.g. CHEMICAL LIBRARIES
    • C40B50/00 Methods of creating libraries, e.g. combinatorial synthesis
    • C40B50/06 Biochemical methods, e.g. using enzymes or whole viable microorganisms
    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K2319/00 Fusion polypeptide
    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K2319/00 Fusion polypeptide
    • C07K2319/40 Fusion polypeptide containing a tag for immunodetection, or an epitope for immunisation
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N2740/00 Reverse transcribing RNA viruses
    • C12N2740/00011 Details
    • C12N2740/10011 Retroviridae
    • C12N2740/15011 Lentivirus, not HIV, e.g. FIV, SIV
    • C12N2740/15041 Use of virus, viral particle or viral elements as a vector
    • C12N2740/15043 Use of virus, viral particle or viral elements as a vector viral genome or elements thereof as genetic vector
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q2600/00 Oligonucleotides characterized by their use
    • C12Q2600/156 Polymorphic or mutational markers

Landscapes

  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Chemical & Material Sciences (AREA)
  • Organic Chemistry (AREA)
  • Genetics & Genomics (AREA)
  • Engineering & Computer Science (AREA)
  • Zoology (AREA)
  • Wood Science & Technology (AREA)
  • Biochemistry (AREA)
  • Molecular Biology (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Biotechnology (AREA)
  • General Engineering & Computer Science (AREA)
  • Immunology (AREA)
  • General Health & Medical Sciences (AREA)
  • Biophysics (AREA)
  • Microbiology (AREA)
  • Biomedical Technology (AREA)
  • Physics & Mathematics (AREA)
  • Analytical Chemistry (AREA)
  • Medicinal Chemistry (AREA)
  • Chemical Kinetics & Catalysis (AREA)
  • Plant Pathology (AREA)
  • Cell Biology (AREA)
  • Toxicology (AREA)
  • General Chemical & Material Sciences (AREA)
  • Gastroenterology & Hepatology (AREA)
  • Pathology (AREA)
  • Virology (AREA)
  • Hospice & Palliative Care (AREA)
  • Oncology (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)
  • Peptides Or Proteins (AREA)

Abstract

The invention relates to the technical field of molecular biology, and in particular to a multi-MHC genotype and antigen MiniGene combinatorial library and a construction method and application thereof. The invention provides a T cell epitope screening vector comprising an MHC subtype connecting region, an epitope connecting region, and a tag sequence located between the two, wherein the MHC subtype connecting region is used for connecting an MHC subtype coding sequence and the epitope connecting region is used for connecting a candidate epitope coding sequence. The invention also provides a complete set of vectors, based on this vector, carrying different MHC subtype sequences, and a multi-MHC genotype and antigen MiniGene combinatorial library constructed from the complete set of vectors and candidate epitope coding sequences. In epitope screening, the combinatorial library offers high throughput, low error, high resolution, and the ability to monitor the MHC-epitope combinations, and provides efficient molecular tools and methods for identifying T cell epitopes.

Description

Multi-MHC genotype and antigen MiniGene combinatorial library, and construction method and application thereof
Technical Field
The invention relates to the technical field of molecular biology, and in particular to a T cell epitope screening vector, a complete set of vectors based on that vector, a multi-MHC genotype and antigen MiniGene combinatorial library, and a construction method and application thereof.
Background
T lymphocytes are broadly divided into two categories: CD4+ T lymphocytes and CD8+ cytotoxic T lymphocytes. Cytotoxic T lymphocytes (CTLs) are a specialized T cell population that participates in immune responses by secreting various cytokines, kills targets such as certain virus-infected cells and tumor cells, and, together with natural killer cells, forms an important line of antiviral and antitumor defense. CTLs use the T cell receptor (TCR) on their membrane to recognize antigens presented by the major histocompatibility complex (MHC) on the cell surface, then secrete cytokines or cytolytic molecules that kill the target cell.
In recent years, CTL recognition of tumor cells has become the basis of tumor immunotherapy. The key to tumor immunotherapy is to find a suitable tumor antigen to serve as a specific target on tumor cells and to expand the immune cells that specifically recognize that antigen.
A tumor neoantigen is an antigen that arises in tumor cells through DNA mutation, viral infection, and similar events and is completely absent from normal cells. It can be bound by the human major histocompatibility complex (human leukocyte antigen, HLA), presented to T cells, and recognized by the TCR, triggering an immune response. Most neoantigens arise from DNA mutation; a neoantigen is generally a polypeptide of 7-11 amino acids, and the mutation generally affects a single base. Tumor cells can carry up to thousands of base mutations, and a considerable number of them ultimately give rise to neoantigens that are recognized by the patient's own immune cells and initiate a series of immune reactions. Compared with tumor-associated antigens, neoantigens have better specificity and immunogenicity, so the discovery of tumor neoantigens is of great significance for tumor immunotherapy.
At present, tumor neoantigens are identified mainly by finding tumor-specific mutation sites through whole-exome sequencing and transcriptome sequencing, followed by methods such as HLA affinity prediction, co-immunoprecipitation, and mass spectrometry. Chinese patent CN108796055A discloses a tumor neoantigen detection method based on second-generation sequencing, comprising a variant detection step, an MHC molecule identification step, a variant annotation step, a mutant peptide prediction step, a mutant peptide MHC class II affinity prediction step, an antigen expression abundance detection step, a clonality analysis step, and a comprehensive scoring and ranking step for candidate tumor neoantigens. However, because of their complicated procedures and the limited accuracy of affinity prediction, these methods are not suited to efficient, high-throughput screening and identification of tumor neoantigens.
Disclosure of Invention
The invention aims to provide a T cell epitope screening vector that contains both an MHC subtype connecting region and a candidate epitope sequence connecting region and can be used to screen T cell epitopes against different MHC subtypes. Another object of the invention is to provide a multi-MHC genotype and antigen MiniGene combinatorial library constructed with the T cell epitope screening vector, wherein the combinatorial library comprises vectors carrying different combinations of MHC subtype sequences and candidate epitope sequences, and each vector in the library carries the coding sequence of only 1 MHC subtype and the coding sequence of only 1 candidate epitope.
The invention also provides the application of the multi-MHC genotype and antigen MiniGene combinatorial library in high-throughput screening of T cell epitopes. First, T cell epitope screening vectors containing different MHC subtype sequences are constructed according to the MHC typing result of an individual. Next, candidate epitope sequences are synthesized for each mutation site, based on the mutation sites, encoded proteins, or polypeptide sequences specific to diseases such as tumors or microbial infections that are found by sequencing (DNA sequencing and RNA sequencing). Finally, the candidate epitope sequences and the T cell epitope screening vectors containing different MHC subtype sequences are used to construct the multi-MHC genotype and antigen MiniGene combinatorial library, which can be used to express and screen T cell epitopes.
The invention first provides a T cell epitope screening vector comprising an epitope screening core region. The epitope screening core region comprises an MHC subtype connecting region, an epitope connecting region, and a tag sequence (tag) located upstream or downstream of the epitope connecting region, wherein the MHC subtype connecting region is used for connecting an MHC subtype coding sequence and the epitope connecting region is used for connecting a candidate epitope coding sequence.
The MHC molecule of the invention can be derived from any animal that has MHC molecules, including but not limited to human, mouse, rat, or other model organisms, and may be encoded by an MHC class I or MHC class II gene. In humans, the MHC genes are also called HLA genes; the HLA class I genes mainly include HLA-A, HLA-B, and HLA-C, and the HLA class II genes mainly include DR, DQ, DP, DO, and DM.
The T cell epitope screening vector can be constructed by connecting the epitope screening core region to a starting vector. The starting vector can be any vector capable of being replicated in microbial cells, or replicated or integrated in mammalian cells, including but not limited to lentiviral vectors such as lenti-PURO-T2A-EGFP (shown in SEQ ID NO. 36).
In the prior art, several candidate epitope coding sequences joined in tandem are connected to the epitope connecting region at once. On the one hand, the artificial tandem sequence readily creates non-natural coding sequences and the non-natural peptides they encode, which interferes with epitope screening and increases screening error. On the other hand, determining which of the tandem candidate epitope coding sequences actually gives rise to the T cell epitope requires further identification experiments, so the identification resolution is low.
The invention instead combines one MHC subtype with one candidate epitope sequence per vector, and the subsequent library construction method then generates vector combinations covering multiple MHC subtypes and multiple candidate epitope sequences in one-to-one pairings. This not only avoids the non-natural coding sequences caused by artificial concatenation but also markedly improves the resolution of epitope identification.
In the T cell epitope screening vector, the MHC subtype connecting region is connected to 1 MHC subtype coding sequence, and during T cell epitope screening the epitope connecting region is likewise connected to the coding sequence of 1 candidate epitope.
The function of the tag sequence is to allow the MHC subtype carried by a vector to be determined by high-throughput sequencing of the tag sequence together with the epitope coding sequence, so that the combination of MHC subtype and epitope coding sequence in that vector is known. The invention has found that introducing a heterologous sequence as the tag may increase the probability that artificial sequence gives rise to non-natural coding sequences or non-natural epitopes, causing unnecessary interference with the expression and screening of epitopes.
The tag sequence is therefore preferably an endogenous DNA sequence of the animal from which the MHC subtype coding sequence is derived.
Preferably, the tag sequence is located between the MHC subtype connecting region and the epitope connecting region, which facilitates the cloning operations.
Preferably, the total length of the tag sequence, the candidate epitope coding sequence, and the spacer sequence between them does not exceed the read length of the high-throughput sequencing used. This allows the combination of MHC subtype and epitope coding sequence in a vector to be determined by high-throughput sequencing.
Taking next-generation sequencing as an example, the total length of the tag sequence, the candidate epitope coding sequence, and the spacer sequence between them is less than or equal to 300 bp. If third-generation sequencing is used, the spacer sequence can be lengthened further as needed.
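As a rough illustration of this length constraint, the check below adds up the three segments and compares the sum with the read length. The function name and the example lengths (a 15 bp tag, a 72 bp spacer, a 75 bp epitope coding sequence) are assumptions chosen for illustration, not values prescribed here.

```python
def fits_read_length(tag_len: int, spacer_len: int, epitope_len: int,
                     read_len: int = 300) -> bool:
    """Return True when tag + spacer + epitope can be covered by a single read."""
    if min(tag_len, spacer_len, epitope_len) < 0:
        raise ValueError("lengths must be non-negative")
    return tag_len + spacer_len + epitope_len <= read_len


# Illustrative (assumed) lengths in bp: tag, spacer, epitope coding sequence.
print(fits_read_length(15, 72, 75))    # total 162 bp
print(fits_read_length(15, 250, 75))   # total 340 bp
```

For a longer-read platform, the same check can be reused by passing a larger `read_len`.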
In a preferred scheme of the invention, the tag sequence is derived from the MHC subtype coding sequence and is 10-20 bp long.
In the T cell epitope screening vector, a 2A self-cleaving peptide is connected upstream of each of the MHC subtype connecting region, the epitope connecting region, and the tag sequence. The 2A self-cleaving peptides help ensure stable and balanced expression of the coding sequences connected into each region.
Preferably, T2A is connected upstream of the MHC subtype connecting region, P2A upstream of the epitope connecting region, and E2A upstream of the tag sequence. Different self-cleaving peptides differ in self-cleaving activity, and this arrangement better balances the expression levels of the coding sequences connected into each region and reduces interference from the remaining parts with the expression of the antigen peptide.
In the above T cell epitope screening vector, the MHC subtype connecting region and the epitope connecting region share a single promoter, which is located upstream of the MHC subtype connecting region.
Preferably, the promoter is any one selected from the EF-1α promoter, CAG promoter, CMV promoter, PGK1 promoter, and SV40 promoter.
More preferably, the promoter is the EF-1α promoter. The activity of the EF-1α promoter is well suited to the expression, screening, and identification of MHC and epitope peptides, and it works well with the downstream self-cleaving peptide sequences to ensure efficient expression of MHC and epitope peptides.
In the T cell epitope screening vector, the epitope screening core region comprises, in order from 5' to 3', a promoter, T2A, the MHC subtype connecting region, E2A, the tag sequence, P2A, and the epitope connecting region.
Preferably, restriction sites are provided upstream of the MHC subtype connecting region, downstream of the tag sequence, and both upstream and downstream of the epitope connecting region, which facilitates replacement of the MHC subtype coding sequence and insertion of the candidate epitope coding sequence.
In one embodiment of the invention, single BamHI and AfeI restriction sites are provided upstream of the MHC subtype connecting region and downstream of the tag sequence, respectively, and BsmBI sites are provided both upstream and downstream of the epitope connecting region as a pair.
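Because cloning relies on these enzymes cutting only at the designed positions, it can be useful to confirm that a sequence to be inserted carries no additional recognition sites. The sketch below scans both strands for the standard recognition sequences of BamHI (GGATCC), AfeI (AGCGCT), and BsmBI (CGTCTC). Running such a pre-check is an assumption about good practice rather than a step stated here, and the function names are hypothetical.

```python
# Standard recognition sequences, written 5'->3' on one strand.
SITES = {"BamHI": "GGATCC", "AfeI": "AGCGCT", "BsmBI": "CGTCTC"}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an uppercase or lowercase A/C/G/T string."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def count_sites(seq: str, site: str) -> int:
    """Count occurrences of a recognition site on either strand."""
    seq = seq.upper()
    motifs = {site, reverse_complement(site)}  # one motif if palindromic
    width = len(site)
    return sum(1 for i in range(len(seq) - width + 1) if seq[i:i + width] in motifs)


def internal_sites(insert: str) -> dict:
    """Map enzyme name -> number of sites found in the insert (hits only)."""
    hits = {}
    for name, site in SITES.items():
        n = count_sites(insert, site)
        if n:
            hits[name] = n
    return hits


print(internal_sites("GCTGCTGCTGCTGCTGCTGCT"))  # no sites expected
print(internal_sites("AAGGATCCAA"))             # one BamHI site
```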
To avoid, as far as possible, non-natural coding sequences created by artificial joining and to minimize interference with the screening of candidate epitope fragments, during T cell epitope screening the 5' end of the candidate epitope coding sequence is preferably connected seamlessly to the 3' end of the 2A self-cleaving peptide, and the 3' end of the candidate epitope coding sequence either is a stop codon or forms a stop codon once connected to the T cell epitope screening vector. This design avoids introducing an additional fusion sequence at the N-terminus of the epitope polypeptide.
Seamless connection means that the 5' end of the candidate epitope coding sequence is connected directly to the 3' end of the 2A self-cleaving peptide with no other spacer sequence introduced.
The T cell epitopes include tumor neoantigens, tumor-associated antigens, autoimmune disease-associated antigens, microbial epitopes, and the like.
The candidate epitope coding sequence of the invention may be a DNA fragment (MiniGene) whose length is an integer multiple of 3 and between 21 and 150 bp, and which contains correctly arranged triplet codons that can be translated in the 5' to 3' direction into a contiguous polypeptide sequence.
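The MiniGene criteria above lend themselves to a simple automated check. The following is a minimal sketch that tests the length rules and translates the sequence with the standard genetic code to look for stop codons before the final codon. The function names are hypothetical, and treating a terminal stop codon as acceptable and as counting toward the 21-150 bp range is an assumption.

```python
# Standard genetic code in the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(seq: str) -> str:
    """Translate an in-frame A/C/G/T sequence; '*' marks a stop codon."""
    seq = seq.upper()
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq), 3))


def check_minigene(seq: str, min_len: int = 21, max_len: int = 150) -> list:
    """Return a list of problems; an empty list means the sequence passes."""
    seq = seq.upper()
    if set(seq) - set("ACGT"):
        return ["contains characters other than A, C, G, T"]
    problems = []
    if len(seq) % 3:
        problems.append("length is not a multiple of 3")
    if not min_len <= len(seq) <= max_len:
        problems.append(f"length {len(seq)} bp is outside {min_len}-{max_len} bp")
    # Only translate complete codons; a stop is tolerated in the last codon only.
    if len(seq) % 3 == 0 and "*" in translate(seq)[:-1]:
        problems.append("internal stop codon")
    return problems


print(check_minigene("GCT" * 7))                      # passes
print(check_minigene("GCT" * 3 + "TAA" + "GCT" * 4))  # internal stop codon
```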
When the candidate epitope is a candidate tumor neoantigen, the candidate epitope coding sequence comprises 1 tumor-specific mutation site together with its upstream and downstream sequences.
In one embodiment of the invention, when the candidate epitope is a candidate tumor neoantigen, the candidate epitope coding sequence is 75 bp long and comprises a tumor-specific mutation site plus 21-36 bp of sequence (encoding 7-12 amino acids) on each of its upstream and downstream sides. The length of the candidate epitope coding sequence can be adjusted when the tumor-specific mutation site lies near the N-terminal start or the C-terminal end of the wild-type protein.
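One way to read the 75 bp design is as the mutated codon flanked by 12 codons (36 bp) on each side, shortened when the mutation lies close to either end of the coding sequence. The sketch below extracts such a window under that reading. The function name, the 0-based codon index, and the assumption that the input is an in-frame coding sequence already carrying the mutation are all illustrative choices.

```python
def minigene_window(cds: str, codon_index: int, flank_codons: int = 12) -> str:
    """Cut an in-frame window centred on the mutated codon.

    cds          -- in-frame coding sequence that already carries the mutation
    codon_index  -- 0-based index of the mutated codon
    flank_codons -- codons kept on each side (12 codons = 36 bp)
    """
    if len(cds) % 3:
        raise ValueError("coding sequence length must be a multiple of 3")
    n_codons = len(cds) // 3
    if not 0 <= codon_index < n_codons:
        raise ValueError("codon_index is outside the coding sequence")
    start = max(0, codon_index - flank_codons)              # clip at the N-terminal end
    end = min(n_codons, codon_index + flank_codons + 1)     # clip at the C-terminal end
    return cds[3 * start:3 * end]


toy_cds = "GCT" * 100                        # placeholder sequence, not real data
print(len(minigene_window(toy_cds, 50)))     # full window: 36 + 3 + 36 bp
print(len(minigene_window(toy_cds, 2)))      # shortened near the N-terminal end
```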
In the T cell epitope screening vector, the epitope connecting region can be pre-loaded with a DNA fragment (such as eGFP); for epitope screening, this fragment is excised by restriction digestion and the candidate epitope coding sequence is then seamlessly connected in its place.
The T cell epitope screening vector also comprises functional elements that enable it to replicate and integrate in a host cell, and a functional element for resistance selection.
Preferably, the T cell epitope screening vector is obtained by connecting the epitope screening core region to a lentiviral vector.
Because MHC typing differs between individuals, the sequence of the T cell epitope screening vector provided by the invention varies with the MHC subtype coding sequence connected to it. Html database discloses human MHC (HLA) typing data; those skilled in the art can obtain different MHC subtype coding sequences from this database or from MHC databases for other animals, and can also obtain individualized MHC subtype coding sequences from the MHC sequencing and typing result of the individual to be tested.
The invention also provides a construction method for the T cell epitope screening vector, comprising: digesting the starting vector with restriction enzymes and connecting it to the epitope screening core region to obtain the T cell epitope screening vector.
Taking the human MHC type HLA-A0201 as an example, the lentiviral vector lenti-PURO-T2A-EGFP is used as the starting vector and connected to the epitope screening core region to obtain a T cell epitope screening vector (lenti-Puror-T2A-HLA(A0201)-E2A-Tag(A0201)-P2A-eGFP). In this vector the HLA subtype connecting region carries the HLA-A0201 coding sequence and the epitope connecting region is pre-loaded with the eGFP gene. The nucleotide sequence of the vector is shown in SEQ ID NO. 1, in which positions 2650-5526 are the epitope screening core region, positions 2650-2861 the EF-1α promoter, positions 3487-3540 T2A, positions 3547-4647 the HLA-A0201 coding sequence, positions 4648-4707 E2A, positions 4708-4722 the tag sequence (derived from the HLA coding sequence), positions 4729-4785 P2A, and positions 4795-5514 the eGFP gene. BsmBI sites are provided upstream and downstream of the epitope connecting region, and BamHI and AfeI sites are provided upstream of the HLA subtype connecting region and downstream of the tag sequence, respectively. Replacing the HLA-A0201 coding sequence and tag sequence of the vector with another HLA subtype coding sequence and its corresponding tag sequence yields a different T cell epitope screening vector. Replacing the eGFP of the vector with a candidate epitope coding sequence makes the vector usable for screening T cell epitopes.
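The coordinates given for SEQ ID NO. 1 can be collected into a small feature table to derive element lengths and confirm the 5' to 3' order of the core region. The sketch below uses only the positions stated in the text; the table layout and function names are assumptions made for illustration.

```python
# (name, start, end) with 1-based inclusive positions as listed for SEQ ID NO. 1.
CORE_REGION = (2650, 5526)
FEATURES = [
    ("EF-1 alpha promoter", 2650, 2861),
    ("T2A", 3487, 3540),
    ("HLA-A0201 coding sequence", 3547, 4647),
    ("E2A", 4648, 4707),
    ("tag sequence", 4708, 4722),
    ("P2A", 4729, 4785),
    ("eGFP", 4795, 5514),
]


def feature_length(start: int, end: int) -> int:
    """Length in bp of a 1-based inclusive interval."""
    return end - start + 1


def check_layout(features, core) -> None:
    """Raise ValueError if features overlap, are out of order, or leave the core."""
    previous_end = core[0] - 1
    for name, start, end in features:
        if start <= previous_end or end < start or end > core[1]:
            raise ValueError(f"feature out of place: {name}")
        previous_end = end


check_layout(FEATURES, CORE_REGION)
for name, start, end in FEATURES:
    print(f"{name}: {feature_length(start, end)} bp")
```

With these coordinates the tag sequence comes out at 15 bp, matching the tag length used in the tag design rules.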
Taking the tumor-specific mutation site PGM5-M (Strønen E, Toebes M, Kelderman S, et al. Targeting of cancer neoantigens with donor-derived T cell receptor repertoires. Science, 2016, 352(6291): 1337-1341) as an example, a candidate epitope coding sequence was synthesized for PGM5-M (mutated), with PGM5-W (unmutated) used as a control. The candidate epitope coding sequence of PGM5-M is shown in SEQ ID NO. 7, and that of PGM5-W in SEQ ID NO. 8.
The invention further provides a complete set of vectors comprising at least 2 of the T cell epitope screening vectors described above. In the complete set of vectors, the MHC subtype connecting regions of the vectors are connected to different MHC subtype coding sequences, the tag sequences of the vectors differ from one another, and the tag sequences correspond one-to-one to the MHC subtype coding sequences.
In the above complete set of vectors, the one-to-one correspondence between tag sequences and MHC subtype coding sequences means that each tag sequence corresponds to one specific MHC subtype coding sequence. With this design, high-throughput sequencing that reads from the tag sequence to the end of the epitope coding sequence identifies the MHC subtype sequence carried by the vector from its tag and thereby determines the combination of MHC subtype and epitope coding sequence in that vector.
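The decoding step can be pictured as two lookups on each read: the leading tag identifies the MHC subtype, and the trailing sequence identifies the candidate epitope. The sketch below assumes reads have already been trimmed to start at the tag and end at the last base of the epitope coding sequence. The tag and epitope sequences are made-up placeholders, and all names are hypothetical.

```python
from collections import Counter


def decode_read(read: str, tag_to_mhc: dict, epitope_to_name: dict,
                tag_len: int = 15):
    """Return (MHC subtype, epitope name) for one read, or None if unrecognised.

    Assumes the read starts at the first base of the tag and ends at the
    last base of the epitope coding sequence.
    """
    read = read.upper()
    mhc = tag_to_mhc.get(read[:tag_len])
    if mhc is None:
        return None
    for sequence, name in epitope_to_name.items():
        if read.endswith(sequence):
            return mhc, name
    return None


# Placeholder sequences for illustration only (not real tags or epitopes).
tags = {"ACGTACGTACGTACG": "HLA-A0201", "TTGCATTGCATTGCA": "HLA-A2402"}
epitopes = {"GCT" * 7: "epitope-1", "GAA" * 7: "epitope-2"}
spacer = "CCCCCC"

reads = [
    "ACGTACGTACGTACG" + spacer + "GCT" * 7,
    "TTGCATTGCATTGCA" + spacer + "GAA" * 7,
    "TTGCATTGCATTGCA" + spacer + "GAA" * 7,
    "GGGGGGGGGGGGGGG" + spacer + "GCT" * 7,  # unknown tag, skipped
]
counts = Counter(c for c in (decode_read(r, tags, epitopes) for r in reads) if c)
print(counts)
```

Counting the decoded pairs in this way gives a per-combination read tally that could be used to monitor library composition.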
As an embodiment of the invention, the kit comprises a T cell epitope screening vector linked to all Asian MHC-class I coding sequences or to a more frequent MHC-class I coding sequence in Asians, respectively.
As another embodiment of the present invention, the kit comprises T cell epitope screening vectors separately linked to coding sequences of all MHC subtypes of an individual.
For the tag sequences of the different vectors in the above complete set of vectors, the invention provides a regular design scheme that introduces no non-natural sequences and does not affect the expression of the MHC and epitope coding sequences. The tag sequence is 15 bp long, and the specific design is as follows:
for the vector linked to HLA-A0201, the tag sequence tag A-0201 consists of the bases at positions 7 (02 × 3 + 1) to 9 (02 × 3 + 3) and the bases at positions 1 (00 × 3 + 1) to 12 (00 × 3 + 12) of HLA-A0201;
for the vector linked to HLA-A2402, the tag sequence tag A-2402 consists of the bases at positions 73 (24 × 3 + 1) to 75 (24 × 3 + 3) and the bases at positions 7 (02 × 3 + 1) to 18 (02 × 3 + 12) of HLA-A2402;
for the vector linked to HLA-B2402, the tag sequence tag B-2402 consists of the bases at positions 73 (24 × 3 + 1) to 78 (24 × 3 + 6) and the bases at positions 7 (02 × 3 + 1) to 15 (02 × 3 + 9) of HLA-B2402;
for the vector linked to HLA-C2402, the tag sequence tag C-2402 consists of the bases at positions 73 (24 × 3 + 1) to 81 (24 × 3 + 9) and the bases at positions 7 (02 × 3 + 1) to 12 (02 × 3 + 6) of HLA-C2402.
Those skilled in the art can design the tag sequence of each vector in the complete set of vectors by following the above design rules, or can select other endogenous DNA sequences as tag sequences, so that the vectors can be distinguished by sequencing and the combination of MHC subtype and candidate epitope sequence in each vector can be monitored.
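The tag rule can be illustrated with a short sketch. It assumes only what the text states: a tag is built from two windows of the HLA coding sequence, given as 1-based positions, and is 15 bp long. The function names `slice_1based` and `build_tag` are hypothetical helpers introduced for illustration; the 18-base prefix is taken from the HLA-A0201 coding sequence as it appears in SEQ ID NO.1 (positions 3547-3564).

```python
def slice_1based(seq: str, start: int, length: int) -> str:
    """Return `length` bases of `seq` beginning at 1-based position `start`."""
    if start < 1 or start - 1 + length > len(seq):
        raise ValueError("requested window falls outside the sequence")
    return seq[start - 1:start - 1 + length]


def build_tag(cds: str, first: tuple, second: tuple) -> str:
    """Concatenate two (start, length) windows of an HLA coding sequence."""
    tag = slice_1based(cds, *first) + slice_1based(cds, *second)
    if len(tag) != 15:
        raise ValueError("the tag sequence must be 15 bp long")
    return tag


# First 18 bases of the HLA-A0201 coding sequence in SEQ ID NO.1 (positions 3547-3564).
HLA_A0201_PREFIX = "atggccgtcatggcgccc"

# Rule stated for HLA-A0201: positions 7-9, then positions 1-12.
tag_a0201 = build_tag(HLA_A0201_PREFIX, first=(7, 3), second=(1, 12))
print(tag_a0201)
```

For the HLA-B and HLA-C examples in the text, the same helper would be called with window lengths of 6 + 9 and 9 + 6 bases respectively, on the corresponding full coding sequences.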
The invention further provides a multi-MHC genotype and antigen MiniGene combinatorial library, obtained by linking candidate epitope coding sequences into the above complete set of vectors.
The invention also provides a construction method of the combinatorial library, in which the complete set of vectors is digested with the endonuclease corresponding to the restriction sites upstream and downstream of the epitope connection region, and the candidate epitope coding sequences are seamlessly ligated into the epitope connection region.
Specifically, the construction method comprises the following steps: mixing the different candidate epitope coding sequence fragments in an equimolar ratio to obtain a fragment mixture; mixing the complete set of vectors in an equimolar ratio, digesting and recovering them; and then seamlessly ligating them with the fragment mixture.
The construction method further comprises the following steps: after ligation, the ligation products were transformed into E.coli competent cells, and the combinatorial library was obtained by screening, culturing and extracting plasmids.
The invention further provides any one of the following applications of the T cell epitope screening vector, the complete set of vectors, the combinatorial library, or the construction method of the combinatorial library:
(1) the application in T cell epitope screening or expression;
(2) the application in target screening of disease diagnosis or treatment;
(3) the use thereof for the preparation of a product for the diagnosis or treatment of a disease.
In the above (1), the T cell epitope is preferably a tumor neoantigen, a tumor-associated antigen, an autoimmune disease-associated antigen, or a microbial epitope;
in the above (2), the disease is preferably a tumor, an autoimmune disease or a disease caused by microbial infection;
in the above (3), the product is preferably a diagnostic agent, a drug or a vaccine.
The beneficial effects of the invention are as follows: the invention provides a vector for screening T cell epitopes, a complete set of vectors based on this vector and carrying different MHC subtype sequences, and a multi-MHC genotype and antigen MiniGene combinatorial library constructed from the complete set of vectors. In the combinatorial library of the present invention, each plasmid contains 1 MHC subtype sequence and 1 candidate epitope sequence, so that each plasmid pairs one MHC subtype with one candidate epitope, and the library contains vectors carrying the different combinations of MHC subtypes and candidate epitope sequences. After cells are transfected, the library can simultaneously express multiple combinations of MHC subtypes and epitope peptides.
With the complete set of vectors and the library construction method provided by the invention, an individualized, high-capacity multi-MHC genotype and antigen MiniGene combinatorial library can be obtained efficiently, and combinations of MHC molecules and candidate antigen fragments can be expressed in batches. Because the tag sequences correspond one-to-one to the MHC subtypes, the combinations of MHC subtype and candidate epitope fragment in the constructed library can be monitored by high-throughput sequencing, which facilitates neoantigen screening. Through subsequent proteasome processing, all candidate antigen peptides arising from one mutation site or one candidate epitope coding sequence can be generated (if the epitope is a tumor neoantigen, its length can be 7, 8, 9, 10 or 11 amino acids; taking Examples 1 and 2 as an example, the combinatorial library contains 5 × 28 = 140 plasmids and can express 5 × 14 × 45 = 3150 candidate tumor neoantigen peptides). In addition, the vector or complete set of vectors provided by the invention can be linked to candidate epitope peptide sequences obtained by screening, for expression or verification; MHC molecules, the antigen fragments obtained by screening, and combinations thereof can also be expressed in batches for use in clinical cell therapy.
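The counts quoted above can be reproduced with a small sketch. It assumes one reading of the figure 45 that the text does not spell out: a 75-base MiniGene encodes 25 amino acids with the mutated residue at position 13, and every peptide of 7 to 11 residues that contains that position is counted. The 25-residue string and the function name `peptides_covering` are hypothetical and serve only as placeholders.

```python
def peptides_covering(protein: str, site: int, lengths=range(7, 12)) -> list:
    """All windows of the given lengths that include 1-based position `site`."""
    found = []
    for k in lengths:
        first_start = max(1, site - k + 1)
        last_start = min(site, len(protein) - k + 1)
        for start in range(first_start, last_start + 1):
            found.append(protein[start - 1:start - 1 + k])
    return found


# Hypothetical 25-residue MiniGene translation; "X" at position 13 marks the mutated residue.
minigene = "ACDEFGHIKLMNXPQRSTVWYACDE"
per_site = len(peptides_covering(minigene, site=13))

n_hla, n_mutant, n_wild_type = 5, 14, 14
plasmids = n_hla * (n_mutant + n_wild_type)
candidate_peptides = n_hla * n_mutant * per_site
print(per_site, plasmids, candidate_peptides)
```

Under this assumption each length k contributes k windows, so 7 + 8 + 9 + 10 + 11 = 45 peptides per mutation site and HLA subtype.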
The vectors and library provided by the invention offer high throughput, low error (the introduction of non-natural sequences, and their interference with epitope screening, is avoided as far as possible), high resolution (the candidate epitope coding sequences capable of inducing a specific T cell immune response can be determined accurately in a single pass), and the ability to be monitored by sequencing. They provide efficient molecular tools and methods for identifying tumor neoantigens and microbial epitopes, and are of significance for the immunotherapy of diseases such as tumors and microbial infections.
Drawings
FIG. 1 is a plasmid map of the starting vector lenti-PURO-T2A-EGFP in example 1 of the present invention.
FIG. 2 is a plasmid map of the T cell epitope screening vector lenti-Puror-T2A-HLA(A0201)-E2A-Tag(A0201)-P2A-eGFP in example 1 of the present invention.
FIG. 3 shows the flow cytometry results for the 721-221 cell line infected with the lenti-Puror-T2A-HLA(A0201)-E2A-Tag(A0201)-P2A-eGFP vector in example 1 of the present invention.
FIG. 4 is a structural diagram of a candidate tumor neoantigen encoding sequence 75 bases in length in example 2 of the present invention.
FIG. 5 is a schematic diagram showing how the candidate tumor neoantigen coding sequence fragment of example 2 of the present invention is expressed in vivo and then processed by the proteasome to produce neoantigen peptides.
Detailed Description
Preferred embodiments of the present invention will be described in detail with reference to the following examples. It is to be understood that the following examples are given for illustrative purposes only and are not intended to limit the scope of the present invention. Various modifications and alterations of this invention will become apparent to those skilled in the art without departing from the spirit and scope of this invention.
The experimental procedures used in the following examples are all conventional procedures unless otherwise specified.
Materials, reagents and the like used in the following examples are commercially available unless otherwise specified.
Example 1 construction of T cell epitope screening vectors
The lentiviral vector lenti-PURO-T2A-EGFP is used as the starting vector (FIG. 1; sequence shown as SEQ ID NO.36), and HLA-A0201, HLA-A2402, HLA-B1501, HLA-B3501 and HLA-C0701 are taken as example HLA types to construct T cell epitope screening vectors carrying different HLA subtype coding sequences.
The T cell epitope screening vector carrying HLA-A0201 is constructed as follows:
1. To meet the requirements of subsequent restriction site selection, the BsmBI restriction sites in the lenti-PURO-T2A-EGFP vector are removed by synonymous point mutation, so that the vector no longer contains any BsmBI site.
Whether the lenti-PURO-T2A-EGFP vector with the BsmBI sites removed still functions normally is verified at the cellular level as follows: the vector is transfected into cells and antibiotic is added to the cell culture medium; the transfected cells grow normally in the antibiotic-containing medium, which shows that the resistance gene of the vector is expressed normally. The lenti-PURO-T2A-EGFP used in all subsequent operations is the version with the BsmBI sites removed.
2. DNA fragments containing T2A, HLA-A0201, E2A, and Tag (A0201), and DNA fragments containing P2A and eGFP were synthesized, respectively.
3. The EGFP region of the lenti-PURO-T2A-EGFP vector is excised by double digestion with BamHI and EcoRI, and the digested lenti-PURO-T2A-EGFP backbone is recovered. All fragments synthesized in step 2 are joined to the backbone by seamless ligation, the ligation product is transformed into E. coli competent cells, and a T cell epitope screening vector carrying HLA-A0201 is obtained after screening and identification. It is named lenti-Puror-T2A-HLA(A0201)-E2A-Tag(A0201)-P2A-eGFP (sequence shown as SEQ ID NO.1; vector map shown in FIG. 2). In this vector, positions 2650-5526 are the epitope screening core region, positions 2650-2861 are the EF-1α promoter, positions 3487-3540 are T2A, positions 3547-4647 are the HLA-A0201 coding sequence, positions 4648-4707 are E2A, positions 4708-4722 are the tag sequence (derived from the HLA coding sequence), positions 4729-4785 are P2A, and positions 4795-5514 are the eGFP gene. BsmBI sites lie upstream and downstream of the epitope connection region, a BamHI site lies upstream of the HLA subtype connection region, and an AfeI site lies downstream of the tag sequence. Replacing the HLA-A0201 coding sequence and the tag sequence of the vector with another HLA subtype coding sequence and its corresponding tag sequence yields a different T cell epitope screening vector. Replacing the eGFP of the vector with a candidate epitope coding sequence makes the vector usable for screening T cell epitopes.
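As a quick consistency check on the coordinates listed for SEQ ID NO.1, the following sketch records each element as a 1-based inclusive span and reports its length. The coordinates come from the text; the dictionary layout and the helper `span_length` are illustrative choices, not part of the described method.

```python
# 1-based inclusive coordinates in SEQ ID NO.1, as listed in the text.
FEATURES = {
    "EF-1 alpha promoter": (2650, 2861),
    "T2A": (3487, 3540),
    "HLA-A0201 coding sequence": (3547, 4647),
    "E2A": (4648, 4707),
    "tag": (4708, 4722),
    "P2A": (4729, 4785),
    "eGFP": (4795, 5514),
}


def span_length(span: tuple) -> int:
    """Length of a 1-based inclusive (start, end) span."""
    start, end = span
    return end - start + 1


lengths = {name: span_length(span) for name, span in FEATURES.items()}

# Elements that are translated should span a whole number of codons.
translated = ["T2A", "HLA-A0201 coding sequence", "E2A", "tag", "P2A", "eGFP"]
for name in translated:
    print(f"{name}: {lengths[name]} bp, whole codons: {lengths[name] % 3 == 0}")
```

The 6-base gaps at positions 3541-3546 and 4723-4728 are consistent with six-base restriction sites placed upstream of the HLA coding sequence and downstream of the tag sequence.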
The T cell epitope screening vectors carrying HLA-A2402, HLA-B1501, HLA-B3501 and HLA-C0701 are constructed in the same way, except that HLA-A0201 is replaced by the respective HLA subtype sequence and the tag sequence is replaced, in order, by tag (A2402) shown in SEQ ID NO.2, tag (B1501) shown in SEQ ID NO.3, tag (B3501) shown in SEQ ID NO.4 and tag (C0701) shown in SEQ ID NO.5. The coding sequences of the HLA-A2402, HLA-B1501, HLA-B3501 and HLA-C0701 subtypes can be obtained from https://www.ebi.ac.uk/ipd/imgt/HLA/allele. If a BsmBI restriction site is present in an HLA subtype coding sequence, it is removed by synonymous point mutation during synthesis.
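Checking a coding sequence for BsmBI sites before synthesis can be sketched as below. The recognition sequence CGTCTC is the standard BsmBI site; the function names are hypothetical, and the sketch only locates sites on either strand without attempting the synonymous substitution itself.

```python
BSMBI_SITE = "CGTCTC"  # BsmBI recognition sequence


def reverse_complement(seq: str) -> str:
    """Reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_sites(seq: str, site: str = BSMBI_SITE) -> list:
    """Return (1-based position, strand) for every occurrence of `site`."""
    seq = seq.upper()
    hits = []
    for strand, motif in (("+", site), ("-", reverse_complement(site))):
        start = seq.find(motif)
        while start != -1:
            hits.append((start + 1, strand))
            start = seq.find(motif, start + 1)
    return sorted(hits)


# Positions 4781-4800 of SEQ ID NO.1 (end of P2A through the start of eGFP).
print(find_sites("gaccgagagacgagatggtg"))
# First 18 bases of the HLA-A0201 coding sequence in SEQ ID NO.1.
print(find_sites("atggccgtcatggcgccc"))
```

A coding sequence for which `find_sites` returns an empty list needs no change; any hit marks a codon region where a synonymous substitution would be required.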
The 721-221 cell line infected with the lenti-Puror-T2A-HLA(A0201)-E2A-Tag(A0201)-P2A-eGFP vector was analyzed by flow cytometry. The results are shown in FIG. 3 and indicate that HLA is expressed normally.
Example 2 construction of a Combined Multi-HLA genotype and neo-antigen MiniGene library
A multi-HLA genotype and neoantigen MiniGene combinatorial library is constructed from the T cell epitope screening vectors carrying HLA-A0201, HLA-A2402, HLA-B1501, HLA-B3501 and HLA-C0701 constructed in Example 1 and from synthesized candidate tumor neoantigen coding sequence fragments. The specific method is as follows:
1. Synthesis of candidate tumor neoantigen coding sequence fragments
Based on the melanoma neoantigen information disclosed in the literature, the gene mutation sites are determined and the candidate tumor neoantigen coding sequence fragments are synthesized.
Taking PGM5-W and PGM5-M as an example, the mutation information disclosed in the literature (Strønen E, Toebes M, Kelderman S, et al. Targeting of cancer neoantigens with donor-derived T cell receptor repertoires. Science, 2016, 352(6291): 1337-1341.) shows that base C at position 71098890 of the PGM5 gene on chromosome 9 is mutated to T, producing a neoantigen presented by HLA-A0201.
The cDNA sequence of the PGM5 gene was obtained from the NCBI CCDS database, and the C/T mutation reported in the literature was precisely located using the UCSC database. The 36 bases upstream and the 36 bases downstream of the mutation site, together with the 3 bases of the codon to which the mutation site itself belongs, form a candidate tumor neoantigen coding sequence 75 bases long (schematic in FIG. 4). For subsequent seamless ligation, a 24-base homology arm is added at the 5' end and at the 3' end of the 75-base candidate tumor neoantigen coding sequence. The sequences of the homology arms at the 5' end and 3' end are as follows:
5’-GATGTCGAAGAGAATCCTGGACCG-3’(SEQ ID NO.6);
5’-GCTTGATATCGAATTCTTAGGCTA-3’(SEQ ID NO.7).
A candidate tumor neoantigen coding sequence fragment 123 bp long and a wild-type fragment without the mutation (control) were synthesized. The sequence of the candidate tumor neoantigen coding sequence fragment (mutant, PGM5-M) is shown as SEQ ID NO.8, and the sequence of the wild-type fragment (PGM5-W) is shown as SEQ ID NO.9.
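The fragment layout described above (36 + 3 + 36 bases, plus two 24-base homology arms, 123 bases in total) can be sketched as follows. The cDNA used here is a toy sequence, not PGM5; `core_window` and `add_arms` are hypothetical helper names; and the 3' arm is left as a placeholder because the orientation in which SEQ ID NO.7 is joined to the fragment is not restated in this sketch.

```python
FLANK = 36     # bases taken on each side of the mutated codon
ARM_LEN = 24   # length of each homology arm


def core_window(cdna: str, codon_index: int) -> str:
    """Return the 75-base window centred on the 0-based codon `codon_index`."""
    codon_start = 3 * codon_index
    start, end = codon_start - FLANK, codon_start + 3 + FLANK
    if start < 0 or end > len(cdna):
        raise ValueError("mutation lies too close to an end of the cDNA")
    return cdna[start:end]


def add_arms(core: str, arm5: str, arm3: str) -> str:
    """Attach the 5' and 3' homology arms to the 75-base core."""
    if len(arm5) != ARM_LEN or len(arm3) != ARM_LEN:
        raise ValueError("homology arms must be 24 bases long")
    return arm5 + core + arm3


toy_cdna = "GCT" * 40                      # hypothetical 40-codon cDNA, not a real gene
core = core_window(toy_cdna, codon_index=20)

ARM5 = "GATGTCGAAGAGAATCCTGGACCG"          # SEQ ID NO.6
ARM3_PLACEHOLDER = "N" * ARM_LEN           # placeholder, see the note above
fragment = add_arms(core, ARM5, ARM3_PLACEHOLDER)
print(len(core), len(fragment))
```

Because 36 is a multiple of 3, the window keeps the reading frame of the cDNA, with 12 codons on each side of the mutated codon.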
Using the method described above, a further 13 pairs of candidate tumor neoantigen coding sequence fragments (wild type and mutant) were synthesized from the mutation information disclosed in the literature (Lu Y, Yao X, Crystal JS, et al. Efficient identification of mutated cancer antigens recognized by T cells associated with durable tumor regressions. Clinical Cancer Research, 2014, 20(13): 3401-3410. Strønen E, Toebes M, Kelderman S, et al. Targeting of cancer neoantigens with donor-derived T cell receptor repertoires. Science, 2016, 352(6291): 1337-1341.). The synthesized fragment sequences are as follows:
(1) ASTN1-M presented by HLA-A0201 (SEQ ID NO.10) and its wild type ASTN1-W (SEQ ID NO.11);
(2) CDK4-M presented by HLA-A0201 (SEQ ID NO.12) and its wild type CDK4-W (SEQ ID NO.13);
(3) KMT2D-M presented by HLA-A0201 (SEQ ID NO.14) and its wild type KMT2D-W (SEQ ID NO.15);
(4) BCS1L-M presented by HLA-A0201 (SEQ ID NO.16) and its wild type BCS1L-W (SEQ ID NO.17);
(5) GNL3L-M presented by HLA-A0201 (SEQ ID NO.18) and its wild type GNL3L-W (SEQ ID NO.19);
(6) SLC38A1-M presented by HLA-A0201 (SEQ ID NO.20) and its wild type SLC38A1-W (SEQ ID NO.21);
(7) USP28-M presented by HLA-A0201 (SEQ ID NO.22) and its wild type USP28-W (SEQ ID NO.23);
(8) HELLC-M presented by HLA-A2402 (SEQ ID NO.24) and its wild type HELLC-W (SEQ ID NO.25);
(9) DNAH17-M presented by HLA-B1501 (SEQ ID NO.26) and its wild type DNAH17-W (SEQ ID NO.27);
(10) SMARCD3-M presented by HLA-B1501 (SEQ ID NO.28) and its wild type SMARCD3-W (SEQ ID NO.29);
(11) MAP2K3-M presented by HLA-B1501 (SEQ ID NO.30) and its wild type MAP2K3-W (SEQ ID NO.31);
(12) PARG-M presented by HLA-B3501 (SEQ ID NO.32) and its wild type PARG-W (SEQ ID NO.33);
(13) POLA2-M presented by HLA-C0701 (SEQ ID NO.34) and its wild type POLA2-W (SEQ ID NO.35).
A total of 28 synthetic fragments were obtained: 14 candidate tumor neoantigen coding sequence fragments and their 14 wild-type controls. FIG. 5 is a schematic diagram of how neoantigen peptides are produced by proteasome processing after a candidate tumor neoantigen coding sequence fragment is expressed in vivo.
2. The T cell epitope screening vectors carrying HLA-A0201, HLA-A2402, HLA-B1501, HLA-B3501 and HLA-C0701 constructed in Example 1 were mixed in an equimolar ratio, digested with BsmBI and recovered. The 28 synthetic fragments obtained in step 1 were mixed in an equimolar ratio, and the vector mixture and the mixture of 28 synthetic fragments were combined for seamless ligation.
The ligation reaction system is shown in Table 1:
TABLE 1 Ligation reaction system
Component                                           Volume
BsmBI-digested HLA vector mixture (153 ng/μl)       0.66 μl
Synthetic fragment mixture (10 ng/μl)               0.8 μl
2× seamless ligation premix                         10 μl
Deionized water                                     to 20 μl
The reaction conditions for ligation were as follows: 50 ℃ for 60 min.
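The quantities in Table 1 can be turned into masses and an approximate insert-to-vector molar ratio with the sketch below. Two values are assumptions that the text does not give: an average mass of about 650 g/mol per base pair of double-stranded DNA, and a linearized vector length of roughly 10,100 bp (SEQ ID NO.1 is 10862 bp before the eGFP region is removed). The helper `pmol` is hypothetical.

```python
TOTAL_UL = 20.0
vector_ul, vector_ng_per_ul = 0.66, 153.0   # BsmBI-digested HLA vector mixture
insert_ul, insert_ng_per_ul = 0.8, 10.0     # synthetic fragment mixture
premix_ul = 10.0                            # 2x seamless ligation premix

vector_ng = vector_ul * vector_ng_per_ul
insert_ng = insert_ul * insert_ng_per_ul
water_ul = TOTAL_UL - vector_ul - insert_ul - premix_ul


def pmol(mass_ng: float, length_bp: int, g_per_mol_per_bp: float = 650.0) -> float:
    """Approximate picomoles of double-stranded DNA (650 g/mol per bp is an assumption)."""
    return mass_ng * 1000.0 / (length_bp * g_per_mol_per_bp)


ASSUMED_VECTOR_BP = 10_100   # assumption: approximate length after BsmBI digestion
INSERT_BP = 123              # fragment length stated in the text

ratio = pmol(insert_ng, INSERT_BP) / pmol(vector_ng, ASSUMED_VECTOR_BP)
print(f"vector {vector_ng:.2f} ng, insert {insert_ng:.2f} ng, water {water_ul:.2f} ul")
print(f"approximate insert:vector molar ratio {ratio:.1f}:1")
```

The ratio is only indicative, since the five vectors differ slightly in length and the exact digested length is not stated.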
3. The ligation product is electroporated into E. coli competent cells as follows:
25 ml of electrocompetent E. coli cells (CAT # DE1080) were thawed on ice, 1 μl of the ligation product was added, and the cells were electroporated. The electroporated suspension was quickly resuspended in 1 ml of preheated SOC medium, a further 5 ml of SOC medium was added, and the culture was shaken at 37 ℃ for 1 hour. The culture was then plated, the colonies were scraped from the plates, the bacteria were grown with shaking, and plasmid was extracted on a large scale to obtain the multi-HLA genotype and neoantigen MiniGene combinatorial library.
Example 3 sequencing validation of combinatorial libraries
Clone sequencing and second-generation sequencing were performed on the multi-HLA genotype and neoantigen MiniGene combinatorial library constructed in Example 2.
Clone sequencing is intended to give a preliminary estimate of the accuracy of library construction. Accuracy is defined as the probability that the digested parent vector and the inserted fragment are correctly ligated, and is calculated as: number of correctly ligated clones / total number of clones sequenced. The clone sequencing results showed that the accuracy of library construction was 94.12% (Table 2).
TABLE 2 sequencing results of clones
[Table 2 is reproduced as images in the original publication: Figures BDA0002567250680000061, BDA0002567250680000071 and BDA0002567250680000081.]
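The accuracy formula amounts to a simple ratio. In the sketch below the function name is hypothetical, and the counts 16 and 17 are illustrative only: they are one pair of integers that yields 94.12%, and are not taken from Table 2.

```python
def ligation_accuracy(correct: int, total: int) -> float:
    """Percentage of sequenced clones with a correct vector-fragment junction."""
    if total <= 0 or not 0 <= correct <= total:
        raise ValueError("counts must satisfy 0 <= correct <= total and total > 0")
    return 100.0 * correct / total


# Hypothetical counts chosen only because they reproduce the reported percentage.
print(f"{ligation_accuracy(16, 17):.2f}%")
```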
The aim of second-generation sequencing is to give a final evaluation of the correctness and coverage of the combinations of HLA subtype sequence and candidate tumor neoantigen coding sequence fragment in the library. The results show that the library correctly covers 140 plasmids carrying the different combinations of the HLA-A0201, HLA-A2402, HLA-B1501, HLA-B3501 and HLA-C0701 subtypes with the candidate tumor neoantigen coding sequence fragments PGM5-M, ASTN1-M, CDK4-M, KMT2D-M, BCS1L-M, GNL3L-M, SLC38A1-M, USP28-M, HELLC-M, DNAH17-M, SMARCD3-M, MAP2K3-M, PARG-M and POLA2-M and their wild-type fragments (5 HLA subtypes, 14 candidate neoantigen coding sequence fragments and 14 wild-type fragments in total; the combinatorial library comprises 5 × 28 = 140 plasmids), with good homogeneity. A total of 1872953 random reads were obtained from the sample, and the read counts for each combination of HLA subtype and candidate tumor neoantigen coding sequence fragment are shown in Table 3. The library can express 5 × 14 × 45 = 3150 candidate tumor neoantigen peptides and can be used for screening candidate tumor neoantigen peptides.
TABLE 3 Second-generation sequencing results (read counts per fragment and HLA subtype)
Fragment A0201 A2402 B1501 B3501 C0701
ASTN1-M 37645 26982 10942 22842 34502
ASTN1-W 38061 27151 11034 23066 34762
BCS1L-M 28924 20932 8391 18138 25105
BCS1L-W 31450 22704 9115 19640 27191
CDK4-M 3364 2327 894 1947 3108
CDK4-W 3341 2330 903 1952 3114
DNAH17-M 41110 29836 12245 25102 37524
DNAH17-W 41063 29777 12214 25078 37473
GNL3L-M 527 377 166 313 491
GNL3L-W 484 343 157 299 456
HELLC-M 2541 1854 656 1453 2295
HELLC-W 2492 1815 653 1433 2255
KMT2D-M 20502 14700 6035 12305 19253
KMT2D-W 20377 14709 6034 12233 19198
MAP2K3-M 27908 20234 8287 17393 24763
MAP2K3-W 28573 20790 8512 17826 25437
PARG-M 1674 1193 496 975 1651
PARG-W 1550 1110 460 913 1520
PGM5-M 1986 1537 554 1260 1785
PGM5-W 2010 1565 561 1261 1804
POLA2-M 36797 28170 10783 22449 35264
POLA2-W 36941 28251 10798 22463 35267
SLC38A1-M 27889 19988 7983 15974 24448
SLC38A1-W 28153 20162 8059 16157 24705
SMARCD3-M 31920 23332 9482 19073 28822
SMARCD3-W 31967 23387 9456 19065 28972
USP28-M 213 171 43 125 205
USP28-W 205 174 42 121 204
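Coverage of the 140 combinations can be checked directly from Table 3. The sketch below transcribes the table as text and counts the combinations with at least one read; the parsing approach and variable names are illustrative, and no statistic beyond what the table itself contains is implied.

```python
TABLE_3 = """
ASTN1-M 37645 26982 10942 22842 34502
ASTN1-W 38061 27151 11034 23066 34762
BCS1L-M 28924 20932 8391 18138 25105
BCS1L-W 31450 22704 9115 19640 27191
CDK4-M 3364 2327 894 1947 3108
CDK4-W 3341 2330 903 1952 3114
DNAH17-M 41110 29836 12245 25102 37524
DNAH17-W 41063 29777 12214 25078 37473
GNL3L-M 527 377 166 313 491
GNL3L-W 484 343 157 299 456
HELLC-M 2541 1854 656 1453 2295
HELLC-W 2492 1815 653 1433 2255
KMT2D-M 20502 14700 6035 12305 19253
KMT2D-W 20377 14709 6034 12233 19198
MAP2K3-M 27908 20234 8287 17393 24763
MAP2K3-W 28573 20790 8512 17826 25437
PARG-M 1674 1193 496 975 1651
PARG-W 1550 1110 460 913 1520
PGM5-M 1986 1537 554 1260 1785
PGM5-W 2010 1565 561 1261 1804
POLA2-M 36797 28170 10783 22449 35264
POLA2-W 36941 28251 10798 22463 35267
SLC38A1-M 27889 19988 7983 15974 24448
SLC38A1-W 28153 20162 8059 16157 24705
SMARCD3-M 31920 23332 9482 19073 28822
SMARCD3-W 31967 23387 9456 19065 28972
USP28-M 213 171 43 125 205
USP28-W 205 174 42 121 204
"""
SUBTYPES = ["A0201", "A2402", "B1501", "B3501", "C0701"]

# Map (HLA subtype, fragment name) to its read count.
counts = {}
for line in TABLE_3.strip().splitlines():
    name, *values = line.split()
    for subtype, value in zip(SUBTYPES, values):
        counts[(subtype, name)] = int(value)

covered = sum(1 for n in counts.values() if n > 0)
print(f"{covered} of {len(counts)} combinations have at least one read")
print("lowest and highest read counts:", min(counts.values()), max(counts.values()))
```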
Although the invention has been described in detail hereinabove by way of general description, specific embodiments and experiments, it will be apparent to those skilled in the art that many modifications and improvements can be made thereto based on the invention. Accordingly, such modifications and improvements are intended to be within the scope of the invention as claimed.
Sequence listing
<110> Peking University People's Hospital
<120> multi-MHC genotype and antigen MiniGene combinatorial library, and construction method and application thereof
<130> KHP201112551.8
<160> 36
<170> SIPOSequenceListing 1.0
<210> 1
<211> 10862
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 1
gtcgacggat cgggagatct cccgatcccc tatggtgcac tctcagtaca atctgctctg 60
atgccgcata gttaagccag tatctgctcc ctgcttgtgt gttggaggtc gctgagtagt 120
gcgcgagcaa aatttaagct acaacaaggc aaggcttgac cgacaattgc atgaagaatc 180
tgcttagggt taggcgtttt gcgctgcttc gcgatgtacg ggccagatat acgcgttgac 240
attgattatt gactagttat taatagtaat caattacggg gtcattagtt catagcccat 300
atatggagtt ccgcgttaca taacttacgg taaatggccc gcctggctga ccgcccaacg 360
acccccgccc attgacgtca ataatgacgt atgttcccat agtaacgcca atagggactt 420
tccattgacg tcaatgggtg gagtatttac ggtaaactgc ccacttggca gtacatcaag 480
tgtatcatat gccaagtacg ccccctattg acgtcaatga cggtaaatgg cccgcctggc 540
attatgccca gtacatgacc ttatgggact ttcctacttg gcagtacatc tacgtattag 600
tcatcgctat taccatggtg atgcggtttt ggcagtacat caatgggcgt ggatagcggt 660
ttgactcacg gggatttcca agtctccacc ccattgacgt caatgggagt ttgttttggc 720
accaaaatca acgggacttt ccaaaatgtc gtaacaactc cgccccattg acgcaaatgg 780
gcggtaggcg tgtacggtgg gaggtctata taagcagcgc gttttgcctg tactgggtct 840
ctctggttag accagatctg agcctgggag ctctctggct aactagggaa cccactgctt 900
aagcctcaat aaagcttgcc ttgagtgctt caagtagtgt gtgcccgtct gttgtgtgac 960
tctggtaact agagatccct cagacccttt tagtcagtgt ggaaaatctc tagcagtggc 1020
gcccgaacag ggacttgaaa gcgaaaggga aaccagagga gctctctcga cgcaggactc 1080
ggcttgctga agcgcgcacg gcaagaggcg aggggcggcg actggtgagt acgccaaaaa 1140
ttttgactag cggaggctag aaggagagag atgggtgcga gagcgtcagt attaagcggg 1200
ggagaattag atcgcgatgg gaaaaaattc ggttaaggcc agggggaaag aaaaaatata 1260
aattaaaaca tatagtatgg gcaagcaggg agctagaacg attcgcagtt aatcctggcc 1320
tgttagaaac atcagaaggc tgtagacaaa tactgggaca gctacaacca tcccttcaga 1380
caggatcaga agaacttaga tcattatata atacagtagc aaccctctat tgtgtgcatc 1440
aaaggataga gataaaagac accaaggaag ctttagacaa gatagaggaa gagcaaaaca 1500
aaagtaagac caccgcacag caagcggccg ctgatcttca gacctggagg aggagatatg 1560
agggacaatt ggagaagtga attatataaa tataaagtag taaaaattga accattagga 1620
gtagcaccca ccaaggcaaa gagaagagtg gtgcagagag aaaaaagagc agtgggaata 1680
ggagctttgt tccttgggtt cttgggagca gcaggaagca ctatgggcgc agcgtcaatg 1740
acgctgacgg tacaggccag acaattattg tctggtatag tgcagcagca gaacaatttg 1800
ctgagggcta ttgaggcgca acagcatctg ttgcaactca cagtctgggg catcaagcag 1860
ctccaggcaa gaatcctggc tgtggaaaga tacctaaagg atcaacagct cctggggatt 1920
tggggttgct ctggaaaact catttgcacc actgctgtgc cttggaatgc tagttggagt 1980
aataaatctc tggaacagat ttggaatcac acgacctgga tggagtggga cagagaaatt 2040
aacaattaca caagcttaat acactcctta attgaagaat cgcaaaacca gcaagaaaag 2100
aatgaacaag aattattgga attagataaa tgggcaagtt tgtggaattg gtttaacata 2160
acaaattggc tgtggtatat aaaattattc ataatgatag taggaggctt ggtaggttta 2220
agaatagttt ttgctgtact ttctatagtg aatagagtta ggcagggata ttcaccatta 2280
tcgtttcaga cccacctccc aaccccgagg ggacccgaca ggcccgaagg aatagaagaa 2340
gaaggtggag agagagacag agacagatcc attcgattag tgaacggatc ggcactgcgt 2400
gcgccaattc tgcagacaaa tggcagtatt catccacaat tttaaaagaa aaggggggat 2460
tggggggtac agtgcagggg aaagaatagt agacataata gcaacagaca tacaaactaa 2520
agaattacaa aaacaaatta caaaaattca aaattttcgg gtttattaca gggacagcag 2580
agatccagtt tggttaatta gctagctagg tcttgaaagg agtgggaatt ggctccggtg 2640
cccgtcagtg ggcagagcgc acatcgccca cagtccccga gaagttgggg ggaggggtcg 2700
gcaattgatc cggtgcctag agaaggtggc gcggggtaaa ctgggaaagt gatgtcgtgt 2760
actggctccg cctttttccc gagggtgggg gagaaccgta tataagtgca gtagtcgccg 2820
tgaacgttct ttttcgcaac gggtttgccg ccagaacaca ggaccggttc tagagccacc 2880
atgaccgagt acaagcccac ggtgcgcctc gccacccgcg acgacgtccc ccgggccgta 2940
cgcaccctcg ccgccgcgtt cgccgactac cccgccacgc gccacaccgt cgacccggac 3000
cgccacatcg agcgggtcac cgagctgcaa gaactcttcc tcacgcgcgt cgggctcgac 3060
atcggcaagg tgtgggtcgc ggacgacggc gccgcggtgg cggtctggac cacgccggag 3120
agcgtcgaag cgggggcggt gttcgccgag atcggcccgc gcatggccga gttgagcggt 3180
tcccggctgg ccgcgcagca acagatggaa ggcctcctgg cgccgcaccg gcccaaggag 3240
cccgcgtggt tcctggccac cgtcggcgta tcgcccgacc accagggcaa gggtctgggc 3300
agcgccgtcg tgctccccgg agtggaggcg gccgagcgcg ccggggtgcc cgccttcctg 3360
gagacctccg cgccccgcaa cctccccttc tacgagcggc tcggcttcac cgtcaccgcc 3420
gacgtcgagg tgcccgaagg accgcgcacc tggtgcatga cccgcaagcc cggtgccggc 3480
tccggagagg gcaggggaag tctgctaacc tgcggcgacg tggaggaaaa tcccggcccc 3540
ggatccatgg ccgtcatggc gccccgaacc ctcgtcctgc tactctcggg ggctctggcc 3600
ctgacccaga cctgggcggg ctctcactcc atgaggtatt tcttcacatc cgtgtcccgg 3660
cccggccgcg gggagccccg cttcatcgca gtgggctacg tggacgacac gcagttcgtg 3720
cggttcgaca gcgacgccgc gagccagagg atggagccgc gggcgccgtg gatagagcag 3780
gagggtccgg agtattggga cggggagaca cggaaagtga aggcccactc acagactcac 3840
cgagtggacc tggggaccct gcgcggctac tacaaccaga gcgaggccgg ttctcacacc 3900
gtccagagga tgtatggctg cgacgtgggg tcggactggc gcttcctccg cgggtaccac 3960
cagtacgcct acgacggcaa ggattacatc gccctgaaag aggacctgcg ctcttggacc 4020
gcggcggaca tggcagctca gaccaccaag cacaagtggg aggcggccca tgtggcggag 4080
cagttgagag cctacctgga gggcacgtgc gtggagtggc tccgcagata cctggagaac 4140
gggaaggaga cactgcagcg cacggacgcc cccaaaacgc atatgactca ccacgctgtc 4200
tctgaccatg aagccaccct gaggtgctgg gccctgagct tctaccctgc ggagatcaca 4260
ctgacctggc agcgggatgg ggaggaccag acccaggaca cggagctcgt ggagaccagg 4320
cctgcagggg atggaacctt ccagaagtgg gcggctgtgg tggtgccttc tggacaggag 4380
cagagataca cctgccatgt gcagcatgag ggtttgccca agcccctcac cctgagatgg 4440
gagccgtctt cccagcccac catccccatc gtgggcatca ttgctggcct ggttctcttt 4500
ggagctgtga tcactggagc tgtggtcgct gctgtgatgt ggaggaggaa gagctcagat 4560
agaaaaggag ggagctactc tcaggctgca agcagtgaca gtgcccaggg ctctgatgtg 4620
tctctcacag cttgtaaagt gggaggtcag tgtactaatt atgctctctt gaaattggct 4680
ggagatgttg agagcaaccc aggtcccgtc atggccgtca tgagcgctgc aacaaacttc 4740
tctctgctga aacaagccgg agatgtcgaa gagaatcctg gaccgagaga cgagatggtg 4800
agcaagggcg aggagctgtt caccggggtg gtgcccatcc tggtcgagct ggacggcgac 4860
gtaaacggcc acaagttcag cgtgtccggc gagggcgagg gcgatgccac ctacggcaag 4920
ctgaccctga agttcatctg caccaccggc aagctgcccg tgccctggcc caccctcgtg 4980
accaccctga cctacggcgt gcagtgcttc agccgctacc ccgaccacat gaagcagcac 5040
gacttcttca agtccgccat gcccgaaggc tacgtccagg agcgcaccat cttcttcaag 5100
gacgacggca actacaagac ccgcgccgag gtgaagttcg agggcgacac cctggtgaac 5160
cgcatcgagc tgaagggcat cgacttcaag gaggacggca acatcctggg gcacaagctg 5220
gagtacaact acaacagcca caacgtctat atcatggccg acaagcagaa gaacggcatc 5280
aaggtgaact tcaagatccg ccacaacatc gaggacggca gcgtgcagct cgccgaccac 5340
taccagcaga acacccccat cggcgacggc cccgtgctgc tgcccgacaa ccactacctg 5400
agcacccagt ccgccctgag caaagacccc aacgagaagc gcgatcacat ggtcctgctg 5460
gagttcgtga ccgccgccgg gatcactctc ggcatggacg agctgtacaa gtaacgtctc 5520
atagcctaag aattcgatat caagcttatc ggtaatcaac ctctggatta caaaatttgt 5580
gaaagattga ctggtattct taactatgtt gctcctttta cgctatgtgg atacgctgct 5640
ttaatgcctt tgtatcatgc tattgcttcc cgtatggctt tcattttctc ctccttgtat 5700
aaatcctggt tgctgtctct ttatgaggag ttgtggcccg ttgtcaggca acgtggcgtg 5760
gtgtgcactg tgtttgctga cgcaaccccc actggttggg gcattgccac cacctgtcag 5820
ctcctttccg ggactttcgc tttccccctc cctattgcca cggcggaact catcgccgcc 5880
tgccttgccc gctgctggac aggggctcgg ctgttgggca ctgacaattc cgtggtgttg 5940
tcggggaaat catcgtcctt tccttggctg ctcgcctgtg ttgccacctg gattctgcgc 6000
gggacgtcct tctgctacgt cccttcggcc ctcaatccag cggaccttcc ttcccgcggc 6060
ctgctgccgg ctctgcggcc tcttccgcgt cttcgccttc gccctcagac gagtcggatc 6120
tccctttggg ccgcctcccc gcatcgatac cgtcgacctc gagacctaga aaaacatgga 6180
gcaatcacaa gtagcaatac agcagctacc aatgctgatt gtgcctggct agaagcacaa 6240
gaggaggagg aggtgggttt tccagtcaca cctcaggtac ctttaagacc aatgacttac 6300
aaggcagctg tagatcttag ccacttttta aaagaaaagg ggggactgga agggctaatt 6360
cactcccaac gaagacaaga tatccttgat ctgtggatct accacacaca aggctacttc 6420
cctgattggc agaactacac accagggcca gggatcagat atccactgac ctttggatgg 6480
tgctacaagc tagtaccagt tgagcaagag aaggtagaag aagccaatga aggagagaac 6540
acccgcttgt tacaccctgt gagcctgcat gggatggatg acccggagag agaagtatta 6600
gagtggaggt ttgacagccg cctagcattt catcacatgg cccgagagct gcatccggac 6660
tgtactgggt ctctctggtt agaccagatc tgagcctggg agctctctgg ctaactaggg 6720
aacccactgc ttaagcctca ataaagcttg ccttgagtgc ttcaagtagt gtgtgcccgt 6780
ctgttgtgtg actctggtaa ctagagatcc ctcagaccct tttagtcagt gtggaaaatc 6840
tctagcaggg cccgtttaaa cccgctgatc agcctcgact gtgccttcta gttgccagcc 6900
atctgttgtt tgcccctccc ccgtgccttc cttgaccctg gaaggtgcca ctcccactgt 6960
cctttcctaa taaaatgagg aaattgcatc gcattgtctg agtaggtgtc attctattct 7020
ggggggtggg gtggggcagg acagcaaggg ggaggattgg gaagacaata gcaggcatgc 7080
tggggatgcg gtgggctcta tggcttctga ggcggaaaga accagctggg gctctagggg 7140
gtatccccac gcgccctgta gcggcgcatt aagcgcggcg ggtgtggtgg ttacgcgcag 7200
cgtgaccgct acacttgcca gcgccctagc gcccgctcct ttcgctttct tcccttcctt 7260
tctcgccacg ttcgccggct ttccccgtca agctctaaat cgggggctcc ctttagggtt 7320
ccgatttagt gctttacggc acctcgaccc caaaaaactt gattagggtg atggttcacg 7380
tagtgggcca tcgccctgat agacggtttt tcgccctttg acgttggagt ccacgttctt 7440
taatagtgga ctcttgttcc aaactggaac aacactcaac cctatctcgg tctattcttt 7500
tgatttataa gggattttgc cgatttcggc ctattggtta aaaaatgagc tgatttaaca 7560
aaaatttaac gcgaattaat tctgtggaat gtgtgtcagt tagggtgtgg aaagtcccca 7620
ggctccccag caggcagaag tatgcaaagc atgcatctca attagtcagc aaccaggtgt 7680
ggaaagtccc caggctcccc agcaggcaga agtatgcaaa gcatgcatct caattagtca 7740
gcaaccatag tcccgcccct aactccgccc atcccgcccc taactccgcc cagttccgcc 7800
cattctccgc cccatggctg actaattttt tttatttatg cagaggccga ggccgcctct 7860
gcctctgagc tattccagaa gtagtgagga ggcttttttg gaggcctagg cttttgcaaa 7920
aagctcccgg gagcttgtat atccattttc ggatctgatc agcacgtgtt gacaattaat 7980
catcggcata gtatatcggc atagtataat acgacaaggt gaggaactaa accatggcca 8040
agttgaccag tgccgttccg gtgctcaccg cgcgcgacgt cgccggagcg gtcgagttct 8100
ggaccgaccg gctcgggttc tcccgggact tcgtggagga cgacttcgcc ggtgtggtcc 8160
gggacgacgt gaccctgttc atcagcgcgg tccaggacca ggtggtgccg gacaacaccc 8220
tggcctgggt gtgggtgcgc ggcctggacg agctgtacgc cgagtggtcg gaggtcgtgt 8280
ccacgaactt ccgggacgcc tccgggccgg ccatgaccga gatcggcgag cagccgtggg 8340
ggcgggagtt cgccctgcgc gacccggccg gcaactgcgt gcacttcgtg gccgaggagc 8400
aggactgaca cgtgctacga gatttcgatt ccaccgccgc cttctatgaa aggttgggct 8460
tcggaatcgt tttccgggac gccggctgga tgatcctcca gcgcggggat ctcatgctgg 8520
agttcttcgc ccaccccaac ttgtttattg cagcttataa tggttacaaa taaagcaata 8580
gcatcacaaa tttcacaaat aaagcatttt tttcactgca ttctagttgt ggtttgtcca 8640
aactcatcaa tgtatcttat catgtctgta taccgtcgac ctctagctag agcttggcgt 8700
aatcatggtc atagctgttt cctgtgtgaa attgttatcc gctcacaatt ccacacaaca 8760
tacgagccgg aagcataaag tgtaaagcct ggggtgccta atgagtgagc taactcacat 8820
taattgcgtt gcgctcactg cccgctttcc agtcgggaaa cctgtcgtgc cagctgcatt 8880
aatgaatcgg ccaacgcgcg gggagaggcg gtttgcgtat tgggcgctct tccgcttcct 8940
cgctcactga ctcgctgcgc tcggtcgttc ggctgcggcg agcggtatca gctcactcaa 9000
aggcggtaat acggttatcc acagaatcag gggataacgc aggaaagaac atgtgagcaa 9060
aaggccagca aaaggccagg aaccgtaaaa aggccgcgtt gctggcgttt ttccataggc 9120
tccgcccccc tgacgagcat cacaaaaatc gacgctcaag tcagaggtgg cgaaacccga 9180
caggactata aagataccag gcgtttcccc ctggaagctc cctcgtgcgc tctcctgttc 9240
cgaccctgcc gcttaccgga tacctgtccg cctttctccc ttcgggaagc gtggcgcttt 9300
ctcatagctc acgctgtagg tatctcagtt cggtgtaggt cgttcgctcc aagctgggct 9360
gtgtgcacga accccccgtt cagcccgacc gctgcgcctt atccggtaac tatcgtcttg 9420
agtccaaccc ggtaagacac gacttatcgc cactggcagc agccactggt aacaggatta 9480
gcagagcgag gtatgtaggc ggtgctacag agttcttgaa gtggtggcct aactacggct 9540
acactagaag aacagtattt ggtatctgcg ctctgctgaa gccagttacc ttcggaaaaa 9600
gagttggtag ctcttgatcc ggcaaacaaa ccaccgctgg tagcggtggt ttttttgttt 9660
gcaagcagca gattacgcgc agaaaaaaag gatctcaaga agatcctttg atcttttcta 9720
cggggtctga cgctcagtgg aacgaaaact cacgttaagg gattttggtc atgagattat 9780
caaaaaggat cttcacctag atccttttaa attaaaaatg aagttttaaa tcaatctaaa 9840
gtatatatga gtaaacttgg tctgacagtt accaatgctt aatcagtgag gcacctatct 9900
cagcgatctg tctatttcgt tcatccatag ttgcctgact ccccgtcgtg tagataacta 9960
cgatacggga gggcttacca tctggcccca gtgctgcaat gataccgcga gacccacgct 10020
caccggctcc agatttatca gcaataaacc agccagccgg aagggccgag cgcagaagtg 10080
gtcctgcaac tttatccgcc tccatccagt ctattaattg ttgccgggaa gctagagtaa 10140
gtagttcgcc agttaatagt ttgcgcaacg ttgttgccat tgctacaggc atcgtggtgt 10200
cacgctcgtc gtttggtatg gcttcattca gctccggttc ccaacgatca aggcgagtta 10260
catgatcccc catgttgtgc aaaaaagcgg ttagctcctt cggtcctccg atcgttgtca 10320
gaagtaagtt ggccgcagtg ttatcactca tggttatggc agcactgcat aattctctta 10380
ctgtcatgcc atccgtaaga tgcttttctg tgactggtga gtactcaacc aagtcattct 10440
gagaatagtg tatgcggcga ccgagttgct cttgcccggc gtcaatacgg gataataccg 10500
cgccacatag cagaacttta aaagtgctca tcattggaaa acgttcttcg gggcgaaaac 10560
tctcaaggat cttaccgctg ttgagatcca gttcgatgta acccactcgt gcacccaact 10620
gatcttcagc atcttttact ttcaccagcg tttctgggtg agcaaaaaca ggaaggcaaa 10680
atgccgcaaa aaagggaata agggcgacac ggaaatgttg aatactcata ctcttccttt 10740
ttcaatatta ttgaagcatt tatcagggtt attgtctcat gagcggatac atatttgaat 10800
gtatttagaa aaataaacaa ataggggttc cgcgcacatt tccccgaaaa gtgccacctg 10860
ac 10862
<210> 2
<211> 15
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 2
ggcgtcatgg cgccc 15
<210> 3
<211> 15
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 3
gccctgatgc gggtc 15
<210> 4
<211> 15
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 4
atgtccatgc gggtc 15
<210> 5
<211> 15
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 5
gccctcctca tgcgg 15
<210> 6
<211> 24
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 6
gatgtcgaag agaatcctgg accg 24
<210> 7
<211> 24
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 7
gcttgatatc gaattcttag gcta 24
<210> 8
<211> 75
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 8
aaatccttca ttggccagca gtttgctgtg gggagctatg tctacagcgt ggcgaagacg 60
gatagttttg aatac 75
<210> 9
<211> 75
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 9
aaatccttca ttggccagca gtttgctgtg gggagccatg tctacagcgt ggcgaagacg 60
gatagttttg aatac 75
<210> 10
<211> 75
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 10
ctcaagtacc tggggtgccg ctacagcgag atcaaactct acggacttga ctgggcggag 60
ctcagccggg acctc 75
<210> 11
<211> 75
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 11
ctcaagtacc tggggtgccg ctacagcgag atcaaaccct acggacttga ctgggcggag 60
ctcagccggg acctc 75
<210> 12
<211> 75
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 12
attggtgtcg gtgcctatgg gacagtgtac aaggcccttg atccccacag tggccacttt 60
gtggccctca agagt 75
<210> 13
<211> 75
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 13
attggtgtcg gtgcctatgg gacagtgtac aaggcccttg atccccacag tggccacttt 60
gtggccctca agagt 75
<210> 14
<211> 75
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 14
gagcaagagg accgggccct ctcccctgtc atcccccaca ttcctcgggc cagcatccca 60
gtcttcccag atacc 75
<210> 15
<211> 75
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 15
gagcaagagg accgggccct ctcccctgtc atccccctca ttcctcgggc cagcatccca 60
gtcttcccag atacc 75
<210> 16
<211> 75
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 16
gggctggtgg gtgtgggcac agccctggcc ctggcccgaa agggtgtcca actgggcctg 60
gtggcattcc ggcgc 75
<210> 17
<211> 75
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 17
gggctggtgg gtgtgggcac agccctggcc ctggcccgga agggtgtcca actgggcctg 60
gtggcattcc ggcgc 75
<210> 18
<211> 75
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 18
aaggccagta cccagcatca ggtcaaaaac ctgaattgtt gcagtgtgcc agtagatcag 60
gcctctgagt cactg 75
<210> 19
<211> 75
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 19
aaggccagta cccagcatca ggtcaaaaac ctgaatcgtt gcagtgtgcc agtagatcag 60
gcctctgagt cactg 75
<210> 20
<211> 75
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 20
acagaccagg atggagataa aggaactcaa agaattttgg ctgccctttt cttgggcctg 60
ggggtgttgt tctcc 75
<210> 21
<211> 75
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 21
acagaccagg atggagataa aggaactcaa agaatttggg ctgccctttt cttgggcctg 60
ggggtgttgt tctcc 75
<210> 22
<211> 75
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 22
gagggcatta atgtgatgaa tgaactgatc atccccttca ttcaccttat cattaataat 60
gacatttcca aggat 75
<210> 23
<211> 75
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 23
gagggcatta atgtgatgaa tgaactgatc atcccctgca ttcaccttat cattaataat 60
gacatttcca aggat 75
<210> 24
<211> 75
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 24
acacaagaat ttaagatcga tgaagaattg gtaacatatt ctgggaagtt cttgattttg 60
gatcgaatgc tgcca 75
<210> 25
<211> 75
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 25
acacaagaat ttaagatcga tgaagaattg gtaacaaatt ctgggaagtt cttgattttg 60
gatcgaatgc tgcca 75
<210> 26
<211> 75
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 26
gtcatgaatt tggtgctgtt tgaggacgcc gtggcttaca tctgcaggat taatcgcatc 60
ctggagtctc cccgg 75
<210> 27
<211> 75
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 27
gtcatgaatt tggtgctgtt tgaggacgcc gtggctcaca tctgcaggat taatcgcatc 60
ctggagtctc cccgg 75
<210> 28
<211> 75
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 28
aaagccacga aaagcaaact ttttgagttt ctggtctatg gggtgcgccc cgggatgccg 60
tctggagccc ggatg 75
<210> 29
<211> 75
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 29
aaagccacga aaagcaaact ttttgagttt ctggtccatg gggtgcgccc cgggatgccg 60
tctggagccc ggatg 75
<210> 30
<211> 75
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 30
ggcatcagtg gctacttggt ggactctgtg gccaagatga tggatgccgg ctgcaagccc 60
tacatggccc ctgag 75
<210> 31
<211> 75
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 31
ggcatcagtg gctacttggt ggactctgtg gccaagacga tggatgccgg ctgcaagccc 60
tacatggccc ctgag 75
<210> 32
<211> 75
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 32
ttcacctttg gggactcaga attgatgaga gacattaaca gcatgcacat tttccttact 60
gaaaggaaac tcact 75
<210> 33
<211> 75
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 33
ttcacctttg gggactcaga attgatgaga gacatttaca gcatgcacat tttccttact 60
gaaaggaaac tcact 75
<210> 34
<211> 75
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 34
acaattattg aaggcacaag aagctccggc tcccactttg tctttgtccc gtcattgaga 60
gatgtgcacc atgag 75
<210> 35
<211> 75
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 35
acaattattg aaggcacaag aagctccggc tcccaccttg tctttgtccc gtcattgaga 60
gatgtgcacc atgag 75
<210> 36
<211> 9599
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 36
gtcgacggat cgggagatct cccgatcccc tatggtgcac tctcagtaca atctgctctg 60
atgccgcata gttaagccag tatctgctcc ctgcttgtgt gttggaggtc gctgagtagt 120
gcgcgagcaa aatttaagct acaacaaggc aaggcttgac cgacaattgc atgaagaatc 180
tgcttagggt taggcgtttt gcgctgcttc gcgatgtacg ggccagatat acgcgttgac 240
attgattatt gactagttat taatagtaat caattacggg gtcattagtt catagcccat 300
atatggagtt ccgcgttaca taacttacgg taaatggccc gcctggctga ccgcccaacg 360
acccccgccc attgacgtca ataatgacgt atgttcccat agtaacgcca atagggactt 420
tccattgacg tcaatgggtg gagtatttac ggtaaactgc ccacttggca gtacatcaag 480
tgtatcatat gccaagtacg ccccctattg acgtcaatga cggtaaatgg cccgcctggc 540
attatgccca gtacatgacc ttatgggact ttcctacttg gcagtacatc tacgtattag 600
tcatcgctat taccatggtg atgcggtttt ggcagtacat caatgggcgt ggatagcggt 660
ttgactcacg gggatttcca agtctccacc ccattgacgt caatgggagt ttgttttggc 720
accaaaatca acgggacttt ccaaaatgtc gtaacaactc cgccccattg acgcaaatgg 780
gcggtaggcg tgtacggtgg gaggtctata taagcagcgc gttttgcctg tactgggtct 840
ctctggttag accagatctg agcctgggag ctctctggct aactagggaa cccactgctt 900
aagcctcaat aaagcttgcc ttgagtgctt caagtagtgt gtgcccgtct gttgtgtgac 960
tctggtaact agagatccct cagacccttt tagtcagtgt ggaaaatctc tagcagtggc 1020
gcccgaacag ggacttgaaa gcgaaaggga aaccagagga gctctctcga cgcaggactc 1080
ggcttgctga agcgcgcacg gcaagaggcg aggggcggcg actggtgagt acgccaaaaa 1140
ttttgactag cggaggctag aaggagagag atgggtgcga gagcgtcagt attaagcggg 1200
ggagaattag atcgcgatgg gaaaaaattc ggttaaggcc agggggaaag aaaaaatata 1260
aattaaaaca tatagtatgg gcaagcaggg agctagaacg attcgcagtt aatcctggcc 1320
tgttagaaac atcagaaggc tgtagacaaa tactgggaca gctacaacca tcccttcaga 1380
caggatcaga agaacttaga tcattatata atacagtagc aaccctctat tgtgtgcatc 1440
aaaggataga gataaaagac accaaggaag ctttagacaa gatagaggaa gagcaaaaca 1500
aaagtaagac caccgcacag caagcggccg ctgatcttca gacctggagg aggagatatg 1560
agggacaatt ggagaagtga attatataaa tataaagtag taaaaattga accattagga 1620
gtagcaccca ccaaggcaaa gagaagagtg gtgcagagag aaaaaagagc agtgggaata 1680
ggagctttgt tccttgggtt cttgggagca gcaggaagca ctatgggcgc agcgtcaatg 1740
acgctgacgg tacaggccag acaattattg tctggtatag tgcagcagca gaacaatttg 1800
ctgagggcta ttgaggcgca acagcatctg ttgcaactca cagtctgggg catcaagcag 1860
ctccaggcaa gaatcctggc tgtggaaaga tacctaaagg atcaacagct cctggggatt 1920
tggggttgct ctggaaaact catttgcacc actgctgtgc cttggaatgc tagttggagt 1980
aataaatctc tggaacagat ttggaatcac acgacctgga tggagtggga cagagaaatt 2040
aacaattaca caagcttaat acactcctta attgaagaat cgcaaaacca gcaagaaaag 2100
aatgaacaag aattattgga attagataaa tgggcaagtt tgtggaattg gtttaacata 2160
acaaattggc tgtggtatat aaaattattc ataatgatag taggaggctt ggtaggttta 2220
agaatagttt ttgctgtact ttctatagtg aatagagtta ggcagggata ttcaccatta 2280
tcgtttcaga cccacctccc aaccccgagg ggacccgaca ggcccgaagg aatagaagaa 2340
gaaggtggag agagagacag agacagatcc attcgattag tgaacggatc ggcactgcgt 2400
gcgccaattc tgcagacaaa tggcagtatt catccacaat tttaaaagaa aaggggggat 2460
tggggggtac agtgcagggg aaagaatagt agacataata gcaacagaca tacaaactaa 2520
agaattacaa aaacaaatta caaaaattca aaattttcgg gtttattaca gggacagcag 2580
agatccagtt tggttaatta gctagctagg tcttgaaagg agtgggaatt ggctccggtg 2640
cccgtcagtg ggcagagcgc acatcgccca cagtccccga gaagttgggg ggaggggtcg 2700
gcaattgatc cggtgcctag agaaggtggc gcggggtaaa ctgggaaagt gatgtcgtgt 2760
actggctccg cctttttccc gagggtgggg gagaaccgta tataagtgca gtagtcgccg 2820
tgaacgttct ttttcgcaac gggtttgccg ccagaacaca ggaccggttc tagagccacc 2880
atgaccgagt acaagcccac ggtgcgcctc gccacccgcg acgacgtccc ccgggccgta 2940
cgcaccctcg ccgccgcgtt cgccgactac cccgccacgc gccacaccgt cgacccggac 3000
cgccacatcg agcgggtcac cgagctgcaa gaactcttcc tcacgcgcgt cgggctcgac 3060
atcggcaagg tgtgggtcgc ggacgacggc gccgcggtgg cggtctggac cacgccggag 3120
agcgtcgaag cgggggcggt gttcgccgag atcggcccgc gcatggccga gttgagcggt 3180
tcccggctgg ccgcgcagca acagatggaa ggcctcctgg cgccgcaccg gcccaaggag 3240
cccgcgtggt tcctggccac cgtcggcgtc tcgcccgacc accagggcaa gggtctgggc 3300
agcgccgtcg tgctccccgg agtggaggcg gccgagcgcg ccggggtgcc cgccttcctg 3360
gagacctccg cgccccgcaa cctccccttc tacgagcggc tcggcttcac cgtcaccgcc 3420
gacgtcgagg tgcccgaagg accgcgcacc tggtgcatga cccgcaagcc cggtgccggc 3480
tccggagagg gcaggggaag tctgctaacc tgcggcgacg tggaggaaaa tcccggcccc 3540
ggatccatgg tgagcaaggg cgaggagctg ttcaccgggg tggtgcccat cctggtcgag 3600
ctggacggcg acgtaaacgg ccacaagttc agcgtgtccg gcgagggcga gggcgatgcc 3660
acctacggca agctgaccct gaagttcatc tgcaccaccg gcaagctgcc cgtgccctgg 3720
cccaccctcg tgaccaccct gacctacggc gtgcagtgct tcagccgcta ccccgaccac 3780
atgaagcagc acgacttctt caagtccgcc atgcccgaag gctacgtcca ggagcgcacc 3840
atcttcttca aggacgacgg caactacaag acccgcgccg aggtgaagtt cgagggcgac 3900
accctggtga accgcatcga gctgaagggc atcgacttca aggaggacgg caacatcctg 3960
gggcacaagc tggagtacaa ctacaacagc cacaacgtct atatcatggc cgacaagcag 4020
aagaacggca tcaaggtgaa cttcaagatc cgccacaaca tcgaggacgg cagcgtgcag 4080
ctcgccgacc actaccagca gaacaccccc atcggcgacg gccccgtgct gctgcccgac 4140
aaccactacc tgagcaccca gtccgccctg agcaaagacc ccaacgagaa gcgcgatcac 4200
atggtcctgc tggagttcgt gaccgccgcc gggatcactc tcggcatgga cgagctgtac 4260
aagtaagaat tcgatatcaa gcttatcggt aatcaacctc tggattacaa aatttgtgaa 4320
agattgactg gtattcttaa ctatgttgct ccttttacgc tatgtggata cgctgcttta 4380
atgcctttgt atcatgctat tgcttcccgt atggctttca ttttctcctc cttgtataaa 4440
tcctggttgc tgtctcttta tgaggagttg tggcccgttg tcaggcaacg tggcgtggtg 4500
tgcactgtgt ttgctgacgc aacccccact ggttggggca ttgccaccac ctgtcagctc 4560
ctttccggga ctttcgcttt ccccctccct attgccacgg cggaactcat cgccgcctgc 4620
cttgcccgct gctggacagg ggctcggctg ttgggcactg acaattccgt ggtgttgtcg 4680
gggaaatcat cgtcctttcc ttggctgctc gcctgtgttg ccacctggat tctgcgcggg 4740
acgtccttct gctacgtccc ttcggccctc aatccagcgg accttccttc ccgcggcctg 4800
ctgccggctc tgcggcctct tccgcgtctt cgccttcgcc ctcagacgag tcggatctcc 4860
ctttgggccg cctccccgca tcgataccgt cgacctcgag acctagaaaa acatggagca 4920
atcacaagta gcaatacagc agctaccaat gctgattgtg cctggctaga agcacaagag 4980
gaggaggagg tgggttttcc agtcacacct caggtacctt taagaccaat gacttacaag 5040
gcagctgtag atcttagcca ctttttaaaa gaaaaggggg gactggaagg gctaattcac 5100
tcccaacgaa gacaagatat ccttgatctg tggatctacc acacacaagg ctacttccct 5160
gattggcaga actacacacc agggccaggg atcagatatc cactgacctt tggatggtgc 5220
tacaagctag taccagttga gcaagagaag gtagaagaag ccaatgaagg agagaacacc 5280
cgcttgttac accctgtgag cctgcatggg atggatgacc cggagagaga agtattagag 5340
tggaggtttg acagccgcct agcatttcat cacatggccc gagagctgca tccggactgt 5400
actgggtctc tctggttaga ccagatctga gcctgggagc tctctggcta actagggaac 5460
ccactgctta agcctcaata aagcttgcct tgagtgcttc aagtagtgtg tgcccgtctg 5520
ttgtgtgact ctggtaacta gagatccctc agaccctttt agtcagtgtg gaaaatctct 5580
agcagggccc gtttaaaccc gctgatcagc ctcgactgtg ccttctagtt gccagccatc 5640
tgttgtttgc ccctcccccg tgccttcctt gaccctggaa ggtgccactc ccactgtcct 5700
ttcctaataa aatgaggaaa ttgcatcgca ttgtctgagt aggtgtcatt ctattctggg 5760
gggtggggtg gggcaggaca gcaaggggga ggattgggaa gacaatagca ggcatgctgg 5820
ggatgcggtg ggctctatgg cttctgaggc ggaaagaacc agctggggct ctagggggta 5880
tccccacgcg ccctgtagcg gcgcattaag cgcggcgggt gtggtggtta cgcgcagcgt 5940
gaccgctaca cttgccagcg ccctagcgcc cgctcctttc gctttcttcc cttcctttct 6000
cgccacgttc gccggctttc cccgtcaagc tctaaatcgg gggctccctt tagggttccg 6060
atttagtgct ttacggcacc tcgaccccaa aaaacttgat tagggtgatg gttcacgtag 6120
tgggccatcg ccctgataga cggtttttcg ccctttgacg ttggagtcca cgttctttaa 6180
tagtggactc ttgttccaaa ctggaacaac actcaaccct atctcggtct attcttttga 6240
tttataaggg attttgccga tttcggccta ttggttaaaa aatgagctga tttaacaaaa 6300
atttaacgcg aattaattct gtggaatgtg tgtcagttag ggtgtggaaa gtccccaggc 6360
tccccagcag gcagaagtat gcaaagcatg catctcaatt agtcagcaac caggtgtgga 6420
aagtccccag gctccccagc aggcagaagt atgcaaagca tgcatctcaa ttagtcagca 6480
accatagtcc cgcccctaac tccgcccatc ccgcccctaa ctccgcccag ttccgcccat 6540
tctccgcccc atggctgact aatttttttt atttatgcag aggccgaggc cgcctctgcc 6600
tctgagctat tccagaagta gtgaggaggc ttttttggag gcctaggctt ttgcaaaaag 6660
ctcccgggag cttgtatatc cattttcgga tctgatcagc acgtgttgac aattaatcat 6720
cggcatagta tatcggcata gtataatacg acaaggtgag gaactaaacc atggccaagt 6780
tgaccagtgc cgttccggtg ctcaccgcgc gcgacgtcgc cggagcggtc gagttctgga 6840
ccgaccggct cgggttctcc cgggacttcg tggaggacga cttcgccggt gtggtccggg 6900
acgacgtgac cctgttcatc agcgcggtcc aggaccaggt ggtgccggac aacaccctgg 6960
cctgggtgtg ggtgcgcggc ctggacgagc tgtacgccga gtggtcggag gtcgtgtcca 7020
cgaacttccg ggacgcctcc gggccggcca tgaccgagat cggcgagcag ccgtgggggc 7080
gggagttcgc cctgcgcgac ccggccggca actgcgtgca cttcgtggcc gaggagcagg 7140
actgacacgt gctacgagat ttcgattcca ccgccgcctt ctatgaaagg ttgggcttcg 7200
gaatcgtttt ccgggacgcc ggctggatga tcctccagcg cggggatctc atgctggagt 7260
tcttcgccca ccccaacttg tttattgcag cttataatgg ttacaaataa agcaatagca 7320
tcacaaattt cacaaataaa gcattttttt cactgcattc tagttgtggt ttgtccaaac 7380
tcatcaatgt atcttatcat gtctgtatac cgtcgacctc tagctagagc ttggcgtaat 7440
catggtcata gctgtttcct gtgtgaaatt gttatccgct cacaattcca cacaacatac 7500
gagccggaag cataaagtgt aaagcctggg gtgcctaatg agtgagctaa ctcacattaa 7560
ttgcgttgcg ctcactgccc gctttccagt cgggaaacct gtcgtgccag ctgcattaat 7620
gaatcggcca acgcgcgggg agaggcggtt tgcgtattgg gcgctcttcc gcttcctcgc 7680
tcactgactc gctgcgctcg gtcgttcggc tgcggcgagc ggtatcagct cactcaaagg 7740
cggtaatacg gttatccaca gaatcagggg ataacgcagg aaagaacatg tgagcaaaag 7800
gccagcaaaa ggccaggaac cgtaaaaagg ccgcgttgct ggcgtttttc cataggctcc 7860
gcccccctga cgagcatcac aaaaatcgac gctcaagtca gaggtggcga aacccgacag 7920
gactataaag ataccaggcg tttccccctg gaagctccct cgtgcgctct cctgttccga 7980
ccctgccgct taccggatac ctgtccgcct ttctcccttc gggaagcgtg gcgctttctc 8040
atagctcacg ctgtaggtat ctcagttcgg tgtaggtcgt tcgctccaag ctgggctgtg 8100
tgcacgaacc ccccgttcag cccgaccgct gcgccttatc cggtaactat cgtcttgagt 8160
ccaacccggt aagacacgac ttatcgccac tggcagcagc cactggtaac aggattagca 8220
gagcgaggta tgtaggcggt gctacagagt tcttgaagtg gtggcctaac tacggctaca 8280
ctagaagaac agtatttggt atctgcgctc tgctgaagcc agttaccttc ggaaaaagag 8340
ttggtagctc ttgatccggc aaacaaacca ccgctggtag cggtggtttt tttgtttgca 8400
agcagcagat tacgcgcaga aaaaaaggat ctcaagaaga tcctttgatc ttttctacgg 8460
ggtctgacgc tcagtggaac gaaaactcac gttaagggat tttggtcatg agattatcaa 8520
aaaggatctt cacctagatc cttttaaatt aaaaatgaag ttttaaatca atctaaagta 8580
tatatgagta aacttggtct gacagttacc aatgcttaat cagtgaggca cctatctcag 8640
cgatctgtct atttcgttca tccatagttg cctgactccc cgtcgtgtag ataactacga 8700
tacgggaggg cttaccatct ggccccagtg ctgcaatgat accgcgagac ccacgctcac 8760
cggctccaga tttatcagca ataaaccagc cagccggaag ggccgagcgc agaagtggtc 8820
ctgcaacttt atccgcctcc atccagtcta ttaattgttg ccgggaagct agagtaagta 8880
gttcgccagt taatagtttg cgcaacgttg ttgccattgc tacaggcatc gtggtgtcac 8940
gctcgtcgtt tggtatggct tcattcagct ccggttccca acgatcaagg cgagttacat 9000
gatcccccat gttgtgcaaa aaagcggtta gctccttcgg tcctccgatc gttgtcagaa 9060
gtaagttggc cgcagtgtta tcactcatgg ttatggcagc actgcataat tctcttactg 9120
tcatgccatc cgtaagatgc ttttctgtga ctggtgagta ctcaaccaag tcattctgag 9180
aatagtgtat gcggcgaccg agttgctctt gcccggcgtc aatacgggat aataccgcgc 9240
cacatagcag aactttaaaa gtgctcatca ttggaaaacg ttcttcgggg cgaaaactct 9300
caaggatctt accgctgttg agatccagtt cgatgtaacc cactcgtgca cccaactgat 9360
cttcagcatc ttttactttc accagcgttt ctgggtgagc aaaaacagga aggcaaaatg 9420
ccgcaaaaaa gggaataagg gcgacacgga aatgttgaat actcatactc ttcctttttc 9480
aatattattg aagcatttat cagggttatt gtctcatgag cggatacata tttgaatgta 9540
tttagaaaaa taaacaaata ggggttccgc gcacatttcc ccgaaaagtg ccacctgac 9599
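
As a minimal sketch (not part of the patent text), the relationship between two of the listed 75-nt sequences can be checked programmatically. The Python snippet below copies SEQ ID NO. 8 and SEQ ID NO. 9 verbatim from the listing above and reports where they differ. The helper name `diff_positions` is hypothetical, and the reading frame is assumed to start at the first nucleotide. The listing does not say which of the two sequences is the reference and which is the variant, so the code only reports the difference.

```python
# SEQ ID NO. 8 and SEQ ID NO. 9, copied from the sequence listing (75 nt each).
SEQ_8 = (
    "aaatccttca ttggccagca gtttgctgtg gggagctatg tctacagcgt ggcgaagacg "
    "gatagttttg aatac"
).replace(" ", "")
SEQ_9 = (
    "aaatccttca ttggccagca gtttgctgtg gggagccatg tctacagcgt ggcgaagacg "
    "gatagttttg aatac"
).replace(" ", "")


def diff_positions(a, b):
    """Return the 1-based positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return [i + 1 for i, (x, y) in enumerate(zip(a, b)) if x != y]


positions = diff_positions(SEQ_8, SEQ_9)
for pos in positions:
    # Assumption: the reading frame starts at nucleotide 1 of the fragment.
    codon_index = (pos - 1) // 3
    start = codon_index * 3
    print(pos, codon_index + 1, SEQ_8[start:start + 3], SEQ_9[start:start + 3])
```

With the two sequences as listed, the only difference falls at position 37, that is, in the 13th of 25 codons under the assumed frame, which places the variable site in the middle of the 75-nt fragment.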

Claims (21)

  1. A T cell epitope screening vector comprising an epitope screening core region, wherein the epitope screening core region comprises an MHC subtype connecting region, an epitope connecting region, and a tag sequence located upstream or downstream of the epitope connecting region;
    the MHC subtype connecting region is for ligation of an MHC subtype coding sequence;
    the epitope connecting region is for ligation of a candidate epitope coding sequence.
  2. The T cell epitope screening vector of claim 1, wherein said MHC subtype connecting region is linked with the coding sequence of one MHC subtype;
    the epitope connecting region is linked with the coding sequence of one candidate epitope.
  3. The T cell epitope screening vector of claim 1, wherein said tag sequence is an endogenous DNA sequence of the animal from which said MHC subtype coding sequence is derived.
  4. The T cell epitope screening vector according to claim 3, wherein said tag sequence is located between said MHC subtype connecting region and said epitope connecting region.
  5. The T cell epitope screening vector of claim 4, wherein the total length of said tag sequence, said candidate epitope coding sequence, and the spacer sequence between them does not exceed the read length of high-throughput sequencing.
  6. The T cell epitope screening vector according to any one of claims 1 to 5, wherein a 2A self-cleaving peptide is linked upstream of each of said MHC subtype connecting region, said epitope connecting region and said tag sequence.
  7. The T cell epitope screening vector according to claim 6, wherein T2A is linked upstream of said MHC subtype connecting region, P2A is linked upstream of said epitope connecting region, and E2A is linked upstream of said tag sequence.
  8. The T cell epitope screening vector according to any one of claims 1 to 5 and 7, wherein said MHC subtype connecting region and said epitope connecting region share a single promoter; the promoter is located upstream of the MHC subtype connecting region.
  9. The T cell epitope screening vector according to claim 8, wherein said promoter is any one selected from the group consisting of the EF-1α promoter, CAG promoter, CMV promoter, PGK1 promoter and SV40 promoter.
  10. The T cell epitope screening vector according to any one of claims 1 to 5, 7 and 9, wherein the epitope screening core region comprises, in order from 5' to 3', a promoter, T2A, an MHC subtype connecting region, E2A, a tag sequence, P2A and an epitope connecting region.
  11. The T cell epitope screening vector according to claim 10, wherein enzyme cleavage sites are provided upstream and downstream of said MHC subtype connecting region, downstream of said tag sequence, and upstream and downstream of said epitope connecting region.
  12. The T cell epitope screening vector according to any one of claims 1 to 5, 7, 9 and 11, wherein the 5' end of the candidate epitope coding sequence is seamlessly linked to the 3' end of the 2A self-cleaving peptide, and the 3' end of the candidate epitope coding sequence is a stop codon or forms a stop codon upon ligation into the T cell epitope screening vector.
  13. The T cell epitope screening vector of claim 12, wherein said epitope is a tumor neoantigen, and said candidate epitope coding sequence comprises one tumor-specific mutation site and its upstream and/or downstream sequences.
  14. The T cell epitope screening vector according to any one of claims 1 to 5, 7, 9, 11 or 13, further comprising a functional element enabling replication and/or integration in a host cell and a functional element for resistance screening.
  15. The T cell epitope screening vector according to any one of claims 1 to 5, 7, 9, 11 or 13, wherein the T cell epitope screening vector is obtained by linking the epitope screening core region to a lentiviral vector.
  16. A vector set comprising at least 2 T cell epitope screening vectors; in the vector set, each vector is selected from the T cell epitope screening vector of any one of claims 1 to 15;
    in the vector set, the MHC subtype connecting regions of the vectors are linked to different MHC subtype coding sequences; the tag sequences of the vectors differ from one another and correspond one-to-one to the MHC subtype coding sequences.
  17. A multi-MHC genotype and antigen MiniGene combinatorial library obtained by linking the vector set of claim 16 to the candidate epitope coding sequences set forth in SEQ ID NOs. 8-35, respectively.
  18. A method of constructing a multi-MHC genotype and antigen MiniGene combinatorial library, wherein the library is obtained by linking the vector set of claim 16 to candidate epitope coding sequences, the method comprising: digesting the vector set with the endonucleases corresponding to the cleavage sites upstream and downstream of the epitope connecting region, and seamlessly ligating the candidate epitope coding sequences into the epitope connecting region.
  19. The construction method according to claim 18, comprising: mixing the different candidate epitope coding sequence fragments in an equimolar ratio to obtain a fragment mixture; mixing the vectors of the vector set in an equimolar ratio; digesting and recovering the mixed vectors; and seamlessly ligating them with the fragment mixture.
  20. Use of the T cell epitope screening vector of any one of claims 1 to 15, or the vector set of claim 16, or the combinatorial library of claim 17, or the method of constructing the combinatorial library of claim 18 or 19, for screening or expressing T cell epitopes.
  21. The use according to claim 20, wherein said T cell epitope is a tumor neoantigen, a tumor-associated antigen, an autoimmune disease-associated antigen or a microbial epitope.
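
Claims 5 and 16 impose two design constraints that are simple to check when planning a vector set: the tag, spacer and candidate epitope coding sequence together must fit within one sequencing read, and each MHC subtype must have its own distinct tag. The following is a minimal Python sketch of both checks, not the patent's method. The names `ReadoutRegion` and `tags_are_one_to_one`, the 15-nt tag length, the 66-nt spacer length, the read lengths and the tag strings are all illustrative assumptions; only the 75-nt epitope length is taken from SEQ ID NOs. 8-35.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class ReadoutRegion:
    """Lengths (in nt) of the elements that one sequencing read must span."""
    tag_len: int      # tag sequence identifying the MHC subtype (assumed value)
    spacer_len: int   # sequence between tag and epitope (assumed value)
    epitope_len: int  # candidate epitope coding sequence

    def total_len(self) -> int:
        return self.tag_len + self.spacer_len + self.epitope_len

    def fits_in_read(self, read_len: int) -> bool:
        # Claim 5: the total length must not exceed the read length.
        return self.total_len() <= read_len


def tags_are_one_to_one(tag_by_mhc: Dict[str, str]) -> bool:
    """Claim 16: every MHC subtype maps to a tag that no other subtype uses."""
    tags = list(tag_by_mhc.values())
    return len(set(tags)) == len(tags)


# Hypothetical example values, for illustration only.
region = ReadoutRegion(tag_len=15, spacer_len=66, epitope_len=75)
print(region.total_len(), region.fits_in_read(150), region.fits_in_read(250))

hypothetical_tags = {"MHC_subtype_A": "acgtacgtacgtacg", "MHC_subtype_B": "ttgacctgaacgtca"}
print(tags_are_one_to_one(hypothetical_tags))
```

A design that fails the read-length check would need a shorter spacer, a shorter tag, or a longer read length before tag and epitope could be recovered from the same read.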
CN202010634108.1A 2020-07-02 2020-07-02 Multi-MHC genotype and antigen MiniGene combinatorial library, and construction method and application thereof Active CN111850018B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010634108.1A CN111850018B (en) 2020-07-02 2020-07-02 Multi-MHC genotype and antigen MiniGene combinatorial library, and construction method and application thereof

Publications (2)

Publication Number Publication Date
CN111850018A CN111850018A (en) 2020-10-30
CN111850018B true CN111850018B (en) 2021-06-22

Family

ID=73153633

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010634108.1A Active CN111850018B (en) 2020-07-02 2020-07-02 Multi-MHC genotype and antigen MiniGene combinatorial library, and construction method and application thereof

Country Status (1)

Country Link
CN (1) CN111850018B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101314047A (en) * 2007-06-01 2008-12-03 中国科学院微生物研究所 Method and medicament for treating viral infection
CN110499324A (en) * 2019-09-02 2019-11-26 中生康元生物科技(北京)有限公司 A method of for identifying the bacterial expression vector and screening and identification tumour neoantigen of tumour neoantigen
CN110706742A (en) * 2019-09-30 2020-01-17 中生康元生物科技(北京)有限公司 Pan-cancer tumor neoantigen high-throughput prediction method and application thereof

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2009273377A (en) * 2008-05-12 2009-11-26 Igaku Seibutsugaku Kenkyusho:Kk Method for selecting candidate peptide presented on surface of cancer cell following binding to mhc class i molecule
CN104342453A (en) * 2013-08-06 2015-02-11 深圳先进技术研究院 Minicircle DNA recombinant parent plasmid containing genetically engineered antibody gene expression cassette, minicircle DNA containing the expression cassette and application thereof

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
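High-throughput Screening of Human Tumor Antigen-specific CD4 T Cells, Including Neoantigen-reactive T Cells; Costa-Nunes, C et al.; Clinical Cancer Research; 2019-07-15; Vol. 25, No. 14; pp. 4320-4331, see full text *
Construction of a random expression library of mouse sarcoma for screening tumor neoantigens; Zhao Huizhun; China Master's Theses Full-text Database; 2016-02-29; pp. E072-150, see full text *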

Also Published As

Publication number Publication date
CN111850018A (en) 2020-10-30

Similar Documents

Publication Publication Date Title
KR102227657B1 (en) Method for in vivo high-throughput evaluating of RNA-guided nuclease activity
ES2819976T5 (en) Compositions and medical uses for TCR reprogramming with fusion proteins
AU2021250992A1 (en) Compositions and methods for directing proteins to specific loci in the genome
CN101605899B (en) A novel vector and expression cell line for mass production of recombinant protein and a process of producing recombinant protein using same
CN104059942B (en) Avian pneumo-encephalitis virus heat-resisting live vaccine vectors system and application thereof
CN112501269B (en) Method for rapidly identifying high-affinity TCR antigen cross-reactivity
CA2610702A1 (en) Targeting cells with altered microrna expression
KR20140035537A (en) Targeted gene delivery for dendritic cell vaccination
CN101273121A (en) Improved methods and compositions for increasing longevity and protein yield from a cell culture
KR20220133999A (en) Closed linear DNA with modified nucleotides
CN111850018B (en) Multi-MHC genotype and antigen MiniGene combinatorial library, and construction method and application thereof
US20220033845A1 (en) Expression vectors for eukaryotic expression systems
US6468754B1 (en) Vector and method for targeted replacement and disruption of an integrated DNA sequence
KR101791296B1 (en) Expression cassette and vector with genes related Alzheimer&#39;s disease and transgenic cell line made from it
KR20230044506A (en) Gene Therapy Using Nucleic Acid Constructs Containing the Methyl CPG Binding Protein 2 (MECP2) Promoter Sequence
CN109913499A (en) It is a kind of suitable for long-chain non-coding RNA construct and express integrated slow virus carrier system and its application
CN109943584B (en) Recombinant vector and recombinant yeast strain for producing sabinene, and construction method and application thereof
CN113198017B (en) Application of inhibitor taking ITGB1 protein as target in preparation of anti-SARS-CoV-2 medicine
KR20160129568A (en) Transgenic zebrafish expressing a liver-specific hIL-6 gene and method for producing thereof
RU2724431C1 (en) Recombinant cyto-car-yt-lact cell line exhibiting high cytotoxic activity with respect to psca-positive human cancer cells
CN110699381A (en) Mediterranean anemia gene therapy vector construction method and application thereof
RU2802825C2 (en) Gene construct for the expression of recombinant proteins based on the segment of the sars-cov-2 s-protein, including rbd and sd1, fused with the fc fragment of igg, a method of obtaining recombinant proteins, antigens and antigenic compositions for the induction of long-term antibody and cellular immunity against sars-co-2
KR102346159B1 (en) High Efficiency Expression Vector and Uses thereof
KR20230169221A (en) Non-viral homology-mediated end joining
CN100410382C (en) Co-expression carrier and eukaryon expressing carrier capable of inducing cell immune response

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant