CN112210596A - Tumor neoantigen prediction method based on gene fusion event and application thereof - Google Patents

Info

Publication number
CN112210596A
Authority
CN
China
Prior art keywords
fusion
gene
tumor
fusion gene
neoantigen
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010933998.6A
Other languages
Chinese (zh)
Other versions
CN112210596B (en
Inventor
程旭东
管旭东
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhongsheng Kangyuan Bio Tech Beijing Co ltd
Original Assignee
Zhongsheng Kangyuan Bio Tech Beijing Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhongsheng Kangyuan Bio Tech Beijing Co ltd filed Critical Zhongsheng Kangyuan Bio Tech Beijing Co ltd
Priority to CN202010933998.6A priority Critical patent/CN112210596B/en
Publication of CN112210596A publication Critical patent/CN112210596A/en
Application granted granted Critical
Publication of CN112210596B publication Critical patent/CN112210596B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • CCHEMISTRY; METALLURGY
    • C12BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12QMEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6869Methods for sequencing
    • AHUMAN NECESSITIES
    • A61MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61KPREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
    • A61K39/00Medicinal preparations containing antigens or antibodies
    • A61K39/0005Vertebrate antigens
    • A61K39/0011Cancer antigens
    • AHUMAN NECESSITIES
    • A61MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61PSPECIFIC THERAPEUTIC ACTIVITY OF CHEMICAL COMPOUNDS OR MEDICINAL PREPARATIONS
    • A61P35/00Antineoplastic agents
    • CCHEMISTRY; METALLURGY
    • C07ORGANIC CHEMISTRY
    • C07KPEPTIDES
    • C07K14/00Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
    • C07K14/435Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans
    • C07K14/46Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans from vertebrates
    • C07K14/47Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans from vertebrates from mammals
    • C07K14/4701Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans from vertebrates from mammals not used
    • C07K14/4748Tumour specific antigens; Tumour rejection antigen precursors [TRAP], e.g. MAGE
    • GPHYSICS
    • G16INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16BBIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B30/00ICT specially adapted for sequence analysis involving nucleotides or amino acids
    • AHUMAN NECESSITIES
    • A61MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61KPREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
    • A61K39/00Medicinal preparations containing antigens or antibodies
    • A61K2039/57Medicinal preparations containing antigens or antibodies characterised by the type of response, e.g. Th1, Th2
    • AHUMAN NECESSITIES
    • A61MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61KPREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
    • A61K39/00Medicinal preparations containing antigens or antibodies
    • A61K2039/57Medicinal preparations containing antigens or antibodies characterised by the type of response, e.g. Th1, Th2
    • A61K2039/575Medicinal preparations containing antigens or antibodies characterised by the type of response, e.g. Th1, Th2 humoral response

Landscapes

  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Chemical & Material Sciences (AREA)
  • Organic Chemistry (AREA)
  • General Health & Medical Sciences (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Zoology (AREA)
  • Biophysics (AREA)
  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Immunology (AREA)
  • Medicinal Chemistry (AREA)
  • Microbiology (AREA)
  • Veterinary Medicine (AREA)
  • Wood Science & Technology (AREA)
  • Molecular Biology (AREA)
  • Genetics & Genomics (AREA)
  • Biochemistry (AREA)
  • Analytical Chemistry (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Biotechnology (AREA)
  • Pharmacology & Pharmacy (AREA)
  • Animal Behavior & Ethology (AREA)
  • Public Health (AREA)
  • Nuclear Medicine, Radiotherapy & Molecular Imaging (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Medical Informatics (AREA)
  • Mycology (AREA)
  • Epidemiology (AREA)
  • Toxicology (AREA)
  • Oncology (AREA)
  • Gastroenterology & Hepatology (AREA)
  • General Chemical & Material Sciences (AREA)
  • Chemical Kinetics & Catalysis (AREA)
  • Theoretical Computer Science (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • General Engineering & Computer Science (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Abstract

The invention discloses a tumor neoantigen prediction method based on gene fusion events and an application thereof. The prediction method detects fusion genes and MHC types from raw high-throughput next-generation sequencing data and scores each candidate with a purpose-built scoring function to screen out fusion gene tumor neoantigens with high reliability. The prediction method can markedly improve the efficiency and accuracy of tumor neoantigen screening, reduces the false positive rate of the results, and can be applied to multiple cancer types, so that fusion gene tumor neoantigens can be predicted without distinguishing between cancer types.

Description

Tumor neoantigen prediction method based on gene fusion event and application thereof
Technical Field
The invention belongs to the fields of bioinformatics technology and tumor immunotherapy, and particularly relates to a tumor neoantigen prediction method based on a gene fusion event and application thereof.
Background
At present, the treatment of malignant tumors still faces many difficulties, and new treatment strategies are urgently needed. In recent years, tumor immunotherapy has developed rapidly and is receiving increasing attention. Tumor immunotherapy refers to the elimination of tumor cells by the body's own immune system, and includes antibody therapy, cell therapy, tumor vaccines, and so on. The occurrence and development of tumors is often accompanied by fusion breakpoints in multiple genes. A neoantigen is an epitope-specific antigen generated by tumor cell mutation; it is expressed only on tumor cells and does not induce immune tolerance in the body. A growing number of studies have shown that immunotherapy targeting neoantigens has achieved good clinical results in some cancer patients. Therefore, the screening and identification of tumor-specific neoantigens is one of the key technologies for improving the effect of tumor immunotherapy and is the basis for individualized immunotherapy.
Gene fusion refers to the formation of a hybrid gene by joining together part or all of two or more genes that are normally independent of each other. Gene fusions can arise from chromosomal translocations, inversions, insertions, deletions, and the like. Gene fusion is closely related to the occurrence and development of diseases such as tumors, and gene fusion events are widespread in the tumors studied so far. An abnormal chromosomal structure (the Philadelphia chromosome) was found in chronic myeloid leukemia by Nowell and Hungerford in 1960. In 1980, the first study revealed a pathogenic role of the BCR and ABL1 gene fusion event in Burkitt's lymphoma. Gene fusion is an important type of variation in tumor cells and can have a great influence on the functional characteristics of the cells.
Tumor immunotherapy based on tumor neoantigens has attracted wide attention in the medical field in recent years, and recent research and clinical results also show that tumor neoantigen immunotherapy has broad application prospects. Tumor neoantigens are derived from new amino acid or protein sequences generated by genetic variation in tumors. Gene fusion can generate a large number of active abnormal protein sequences and thereby drive tumor formation, so fusion genes are important targets for the discovery of tumor neoantigens. However, whole-genome high-throughput screening for fusion gene tumor neoantigens remains a difficult problem. The present application therefore develops a high-throughput method for efficiently and accurately screening fusion gene tumor neoantigens, which can markedly improve the efficiency and accuracy of such screening.
Disclosure of Invention
In view of the above problems of the prior art, it is an object of the present invention to provide a method for predicting neoantigens of tumors based on gene fusion events, in which score values of neoantigens are creatively calculated using a scoring function based on a characteristic value of a tumor neoantigen of a fusion gene, and the neoantigens are ranked according to the score values, and the ranked neoantigens are highly reliable. The method provided by the invention comprises multi-step quality control and comprehensive analysis, so that the accuracy of the result and the verification rate of the specific antigen are greatly improved, the workload of experimental verification is greatly reduced, and a foundation is laid for the subsequent design of an anti-tumor vaccine, the development of an anti-tumor drug and the evaluation of a tumor treatment response biomarker.
The above object of the present invention is achieved by the following technical solutions:
A first aspect of the invention provides a scoring function for evaluating the credibility of a fusion gene tumor neoantigen, characterized in that the scoring function comprises the following characteristic values: the number of Junction Reads supporting the fusion gene, the number of Spanning Fragments supporting the fusion gene, the average single-base coverage of the gene upstream of the fusion event, and the average single-base coverage of the gene downstream of the fusion event.
Further, in a specific embodiment of the present invention, the scoring function is as follows: Score = -lg(IC/500) + [(JunctionReadCount + SpanningFragCount) × 2 / (UpstreamCov + DownstreamCov)];
wherein IC = mean(IC50[i:i+n]), representing the median taken over the affinity values from the various software; JunctionReadCount is the number of Junction Reads supporting the fusion gene; SpanningFragCount is the number of Spanning Fragments supporting the fusion gene; UpstreamCov is the average single-base coverage of the gene upstream of the fusion event; DownstreamCov is the average single-base coverage of the gene downstream of the fusion event.
Further, the plurality of software includes NetMHCpan, NetMHCIIpan, NetMHC, NetMHCII.
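As an illustration only, the scoring function can be sketched in Python as follows. The sketch assumes that lg denotes the base-10 logarithm, that IC is expressed in the same units as the constant 500 (nM is assumed here), and that the formula carries a leading minus sign, so that a lower IC50 (stronger predicted binding) raises the score. The function name and argument names are hypothetical.

```python
import math


def fusion_neoantigen_score(ic, junction_read_count, spanning_frag_count,
                            upstream_cov, downstream_cov):
    """Score = -lg(IC/500) + 2 * (junction + spanning) / (upstream + downstream).

    ic: aggregated IC50 value over the affinity predictors (assumed nM).
    upstream_cov / downstream_cov: average single-base coverage of the
    upstream and downstream partner genes.
    """
    if ic <= 0:
        raise ValueError("IC must be positive")
    total_cov = upstream_cov + downstream_cov
    if total_cov <= 0:
        raise ValueError("summed coverage must be positive")
    # Affinity term: positive when IC < 500, zero at IC == 500.
    affinity_term = -math.log10(ic / 500.0)
    # Support term: fusion-supporting reads relative to the mean partner coverage.
    support_term = 2.0 * (junction_read_count + spanning_frag_count) / total_cov
    return affinity_term + support_term
```

With these assumptions, a candidate at IC = 500 contributes nothing through the affinity term, so its score is determined entirely by read support relative to coverage.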
The second aspect of the invention provides a prediction method of a fusion gene tumor neoantigen.
Further, the prediction method comprises obtaining the following characteristic values: tumor tissue RNA-bam file, fusion gene in tumor tissue, gene expression level and mutant polypeptide affinity prediction value.
Further, the prediction method comprises the credibility ranking of the fusion gene tumor neoantigens obtained by the scoring function according to the first aspect of the invention.
Further, the prediction method comprises the following steps:
(1) obtaining a tumor tissue sample and RNA-seq sequencing data;
(2) detecting the fusion gene of the tumor tissue;
(3) calculating the expression quantity of the fusion gene;
(4) annotation of fusion genes;
(5) extracting fusion polypeptide;
(6) identifying MHC molecule types;
(7) HLA affinity prediction;
(8) the scoring function of the first aspect of the invention is used to obtain the scoring order of the confidence level of the fusion gene tumor neoantigen.
Further, step (1) of the prediction method comprises: obtaining tumor tissue from a patient with any type of cancer, and completing RNA-seq sequencing of the tumor tissue on an Illumina high-throughput sequencing platform.
Further, the raw data obtained by the above sequencing need to undergo quality control, reference genome alignment, and bam file processing.
Wherein, quality control: performing quality control on the raw RNA sequencing fastq data with FastQC software to obtain the quality-controlled data AO.clean.fq.gz;
reference genome alignment: aligning the quality-controlled RNA data to the reference genome with hisat2 software to obtain a bam file of the tumor RNA data;
bam file processing: the aligned bam file needs further processing; the RNA data bam file is sorted and quality controlled to obtain the processed RNA-bam file.
Preferably, the expression level of the fusion gene is calculated by using RSEM;
preferably, the fusion genes in the tumor tissue are detected using STAR-Fusion software;
preferably, the detected fusion genes are annotated using AGFusion;
preferably, the polypeptides are extracted in a sliding-window manner; specifically, a sliding window 8-15 amino acids in length is moved stepwise across the positions upstream and downstream of the mutation site, with a step length of 1;
preferably, the MHC I and MHC II molecule types are identified using seq2HLA;
preferably, the prediction is carried out using multiple software tools, namely NetMHCpan, NetMHCIIpan, NetMHC and NetMHCII, each of which yields a corresponding IC50 value as its affinity prediction result.
Further, the credibility ranking in step (8) of the prediction method is based on the score values of the fusion gene tumor neoantigens calculated with the scoring function according to the first aspect of the present invention; the neoantigens are ranked from the highest score to the lowest, and a higher score indicates higher credibility of the fusion gene tumor neoantigen.
In a third aspect, the invention provides the use of a scoring function according to the first aspect of the invention for predicting fusion gene tumor neoantigens.
In a fourth aspect, the invention provides a fusion gene tumor neoantigen.
Further, the neoantigen is selected from one or more of the following group: RBP4_FRA10AC1, AHNAK_RPS11, SMURF1_KPNA7, STRN3_HECTD1.
In a fifth aspect, the present invention provides a method of screening for a neoantigen according to the fourth aspect of the invention.
Further, the screening method comprises the prediction method according to the second aspect of the present invention.
Further, the screening method further comprises verifying the predicted result.
Preferably, the verification of the predicted result comprises fusion gene verification and immunological verification;
preferably, the fusion gene verification refers to performing PCR verification on the fusion gene for predicting the neoantigen;
preferably, the immunological validation refers to ELISPOT validation of the neoantigen corresponding to the fusion event that is positively validated in the fusion gene validation result.
The sixth aspect of the invention provides the use of the neoantigen of the fourth aspect of the invention in the preparation of an anti-tumor drug or vaccine.
The seventh aspect of the present invention provides an apparatus for predicting a tumor neoantigen of a fusion gene.
Further, the apparatus comprises a memory for storing a program and a processor for executing the program to implement the prediction method according to the second aspect of the present invention.
An eighth aspect of the present invention provides a computer-readable storage medium.
Further, the computer readable storage medium includes a program, which is executable by a processor to perform the prediction method according to the second aspect of the present invention.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. Further, some terms are explained as follows.
The term "tumor neoantigen" as used herein refers to a new antigen produced by a tumor cell as a result of genetic change, which is recognized by immune cells (via the T cell receptor) and thereby activates them. Cancer cells accumulate many gene mutations as they develop, and some of these mutations give rise to proteins that are absent from normal tissues and normal cells; such proteins may activate the immune system and cause it to attack the cancer cells. These abnormal proteins (abnormal antigens), produced by gene mutation in cancer cells and able to activate the immune system (that is, recognized by immune cells), are tumor neoantigens. Tumor neoantigens are a key factor in stimulating the body's immune system to mount an initial immune response against tumor cells, and the screening and identification of tumor neoantigens is key to accelerating the development of individualized immunotherapy for tumor patients.
The term "fusion gene" as used herein refers to a new gene formed when partial or complete sequences of two different genes are fused together by some mechanism, such as genomic variation. In general, a fusion gene refers to a gene resulting from fusion at the genome level. However, fusion can also occur at the transcriptome level, where the RNAs transcribed from two different genes are joined to form a new fusion RNA, which may or may not encode a protein. A fusion gene produced at the genome level may or may not be expressed (for example, because the promoter region is disrupted), depending on the fusion. Fusion genes are mainly produced by the following three mechanisms: (1) Chromosomal translocation: for example, segments on two chromosomes are exchanged, resulting in fusion between genes on the two chromosomes; (2) Interstitial deletion: for example, the genes and segments lying between two genes on a chromosome are deleted, which ultimately fuses the two genes; (3) Chromosomal inversion: for example, the segment between genes on a chromosome is inverted, which ultimately fuses the genes.
The invention has the advantages and beneficial effects that:
(1) the invention constructs a scientific scoring function for the first time, scientifically distributes the weight of each key factor influencing the fusion gene tumor neoantigen, and improves the reliability of the prediction result.
(2) The invention provides a prediction method for predicting the fusion gene tumor neoantigen only by using the RNA sequencing data of the tumor tissue, and the method does not depend on other data such as DNA and the like, thereby greatly shortening the prediction time.
(3) The invention aims at the tumor fusion gene to realize high-throughput and high-accuracy prediction of the tumor neoantigen of the fusion gene.
(4) The multi-step quality control and comprehensive analysis provided by the invention greatly improve the accuracy of the result, improve the verification rate of the neoantigens, shorten the application period and ensure the reliability of the result.
(5) The prediction method provided by the invention can be applied to various cancer species, and the prediction of the fusion gene tumor neoantigen can be realized without distinguishing the cancer species.
(6) The invention provides fusion gene tumor neoantigens predicted and screened by the prediction method, comprising RBP4_FRA10AC1, AHNAK_RPS11, SMURF1_KPNA7 and STRN3_HECTD1.
Drawings
Embodiments of the invention are described in detail below with reference to the attached drawing figures, wherein:
FIG. 1 is a flowchart of a method for predicting a tumor neoantigen of a fusion gene according to an embodiment of the present invention;
FIG. 2 is a diagram showing the results of PCR verification of a fusion gene predicted to have a corresponding neoantigen;
FIG. 3 is a graph showing the results of enzyme-linked immunospot assay (ELISPOT assay) for detecting positive control;
FIG. 4 is a graph showing the results of enzyme-linked immunospot assay (ELISPOT assay) for detecting negative control;
FIG. 5 is a graph showing the results of detection of RBP4_FRA10AC1 by enzyme-linked immunospot assay (ELISPOT assay);
FIG. 6 is a graph showing the results of detection of AHNAK_RPS11 by enzyme-linked immunospot assay (ELISPOT assay);
FIG. 7 is a graph showing the results of detection of DNPH1_CRIP3 by enzyme-linked immunospot assay (ELISPOT assay);
FIG. 8 is a graph showing the results of detection of SMURF1_KPNA7 by enzyme-linked immunospot assay (ELISPOT assay);
FIG. 9 is a graph showing the results of detection of STRN3_HECTD1 by enzyme-linked immunospot assay (ELISPOT assay).
Detailed Description
The present invention will be described in further detail with reference to the following detailed description and accompanying drawings. In the following description, numerous details are set forth in order to provide a better understanding of the present application. However, those skilled in the art will readily recognize that some of the features may be omitted or replaced with other elements, materials, methods in different instances.
Example 1 Prediction of fusion gene tumor neoantigens
The flow chart of the method for predicting fusion gene tumor neoantigens is shown in FIG. 1. The specific process is as follows:
1. Material preparation
Tumor tissue was taken from a liver cancer patient, and RNA-seq sequencing of the tumor tissue was completed on a high-throughput sequencing platform such as Illumina.
2. Data quality control
Quality control was performed on the raw RNA-seq fastq data with FastQC software to obtain quality-controlled (clean) data.
3. Data alignment
The quality-controlled RNA data were aligned to the reference genome with hisat2 software to obtain a bam file of the tumor RNA data.
4. Bam file processing
The aligned bam file needs further processing; the RNA data bam file was sorted and quality controlled to obtain the processed RNA-bam file.
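Steps 3 and 4 might be chained as in the following sketch, which only builds the command lines and does not run them. It assumes paired-end reads and uses samtools for sorting and indexing; samtools is not named in the source, and the index prefix, file names, and sample name are hypothetical.

```python
import shlex


def build_alignment_commands(index_prefix, fastq_r1, fastq_r2, sample, threads=4):
    """Return the hisat2 and samtools command lines for one tumor RNA sample."""
    sam = f"{sample}.sam"
    sorted_bam = f"{sample}.sorted.bam"
    return [
        # Align the quality-controlled paired-end reads to the reference genome.
        ["hisat2", "-p", str(threads), "-x", index_prefix,
         "-1", fastq_r1, "-2", fastq_r2, "-S", sam],
        # Coordinate-sort the alignments and write them as BAM.
        ["samtools", "sort", "-@", str(threads), "-o", sorted_bam, sam],
        # Index the sorted BAM so downstream tools can query it by region.
        ["samtools", "index", sorted_bam],
    ]


if __name__ == "__main__":
    for cmd in build_alignment_commands(
            "reference/genome", "tumor_1.clean.fq.gz", "tumor_2.clean.fq.gz", "tumor"):
        print(shlex.join(cmd))
```

Returning argument lists rather than shell strings keeps the commands safe to pass to `subprocess.run` without shell quoting issues.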
5. Detection of fusion genes
The fusion genes in the tumor tissue were detected using STAR-Fusion software.
6. Quantification of gene expression
Calculation of gene expression level was carried out using RSEM.
7. Fusion gene annotation
The fusion gene obtained by the detection was annotated using AGFusion.
8. Fusion polypeptide extraction
Based on the genetic mutation and somatic mutation information obtained in the above steps, the polypeptides at the mutation site are extracted comprehensively and accurately, and the corresponding polypeptide sequences of the normal wild-type genotype are extracted as well. Polypeptide extraction uses a sliding-window approach: sliding windows 8-15 amino acids in length are moved stepwise across the positions upstream and downstream of the mutation site, with a step length of 1, to extract every polypeptide sequence that contains the mutant amino acid.
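The sliding-window extraction described above can be sketched as follows. The sketch assumes that the "mutation site" of a fusion protein is a single 0-based residue index (for example, the first residue after the fusion junction) and that every window of 8-15 residues containing that index is wanted; the function and parameter names are hypothetical.

```python
def sliding_window_peptides(protein, site, min_len=8, max_len=15):
    """Return every peptide of min_len..max_len residues that contains `site`.

    protein: amino acid sequence of the fusion protein.
    site: 0-based index of the residue that each peptide must cover.
    The window start advances one residue at a time (step length 1).
    """
    if not 0 <= site < len(protein):
        raise ValueError("site lies outside the protein sequence")
    peptides = []
    for length in range(min_len, max_len + 1):
        first_start = max(0, site - length + 1)       # leftmost window covering site
        last_start = min(site, len(protein) - length)  # rightmost window that fits
        for start in range(first_start, last_start + 1):
            peptides.append(protein[start:start + length])
    return peptides
```

Away from the ends of the protein, a window of length L yields L peptides, so the 8-15 range gives 8 + 9 + ... + 15 = 92 candidate peptides per site.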
9. MHC molecule type identification
Based on the RNA sequencing data, the MHC I and MHC II molecule types were identified using seq2HLA. The tumor patient was typed as: HLA-A*11:01, HLA-A*26:01, HLA-B*40:01, HLA-B*38:01, HLA-C*07:02, and HLA-C*12:03.
10. HLA affinity prediction
Based on the polypeptide sequences and HLA types obtained in the above steps, affinity was predicted with multiple software tools (NetMHCpan, NetMHCIIpan, NetMHC and NetMHCII), each of which yields a corresponding IC50 value as its affinity prediction result.
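The per-tool IC50 values then have to be reduced to a single IC value for each peptide and HLA pair. The text writes this aggregate as mean(IC50[i:i+n]) while describing it as a median, so the sketch below offers both options; the function name, the dictionary layout, and the numbers in the usage lines are hypothetical.

```python
from statistics import mean, median


def aggregate_ic50(predictions, how="median"):
    """Combine per-tool IC50 predictions for one peptide/HLA pair into one IC value.

    predictions: mapping of tool name -> IC50, or None if the tool gave no result.
    how: "median" or "mean".
    """
    values = [v for v in predictions.values() if v is not None]
    if not values:
        raise ValueError("no IC50 predictions available")
    if how == "median":
        return median(values)
    if how == "mean":
        return mean(values)
    raise ValueError(f"unknown aggregation: {how!r}")


if __name__ == "__main__":
    example = {"NetMHCpan": 100.0, "NetMHC": 300.0, "other_tool": 800.0}
    print(aggregate_ic50(example, "median"), aggregate_ic50(example, "mean"))
```

The median is less sensitive to a single outlying predictor, which matters because IC50 predictions from different tools can differ by orders of magnitude.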
11. Ordering high affinity mutant polypeptides
The following scoring function is used to calculate the Score value of each predicted fusion gene tumor neoantigen:
Score = -lg(IC/500) + [(JunctionReadCount + SpanningFragCount) × 2 / (UpstreamCov + DownstreamCov)]
The Score value is positively correlated with the reliability of the neoantigen.
Wherein IC = mean(IC50[i:i+n]), representing the median taken over the affinity values from the various software; JunctionReadCount is the number of Junction Reads supporting the fusion gene; SpanningFragCount is the number of Spanning Fragments supporting the fusion gene; UpstreamCov is the average single-base coverage of the gene upstream of the fusion event; DownstreamCov is the average single-base coverage of the gene downstream of the fusion event.
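The text does not state how the average single-base coverage of a partner gene is computed. One possible reading, sketched below, averages per-base depth over a gene interval, taking as input lines in the three-column tab-separated layout (chromosome, 1-based position, depth) that `samtools depth` produces. The choice of interval and the function name are assumptions.

```python
def mean_base_coverage(depth_lines, chrom, start, end):
    """Average per-base depth over chrom:start-end (1-based, inclusive).

    depth_lines: iterable of "chrom<TAB>pos<TAB>depth" strings.
    Positions without a line are counted as depth 0, since depth reports
    often omit uncovered positions.
    """
    if end < start:
        raise ValueError("end must not be smaller than start")
    total = 0
    for line in depth_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        if fields[0] == chrom and start <= int(fields[1]) <= end:
            total += int(fields[2])
    return total / (end - start + 1)
```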
The neoantigens are ranked by score value from high to low to obtain the fusion gene tumor neoantigens with high reliability (see Table 1).
TABLE 1 fusion Gene tumor neoantigen score ranking
[Table 1: image BDA0002671261540000091 in the original publication]
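The ranking step itself amounts to a descending sort on the score. A minimal sketch follows, assuming each candidate is a record with a precomputed "score" field; the record layout and the candidate names in the usage lines are hypothetical and are not taken from Table 1.

```python
def rank_candidates(candidates, top_n=None):
    """Sort candidate neoantigens from highest to lowest score.

    candidates: list of dicts, each with at least a "score" key.
    top_n: if given, keep only the top_n highest-scoring candidates.
    """
    ranked = sorted(candidates, key=lambda c: c["score"], reverse=True)
    return ranked if top_n is None else ranked[:top_n]


if __name__ == "__main__":
    demo = [
        {"fusion": "GENE_A_GENE_B", "peptide": "PEPTIDEAA", "score": 0.7},
        {"fusion": "GENE_C_GENE_D", "peptide": "PEPTIDEBB", "score": 2.1},
    ]
    for rank, cand in enumerate(rank_candidates(demo), start=1):
        print(rank, cand["fusion"], cand["score"])
```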
Example 2 Validation of candidate fusion gene tumor neoantigens
1. PCR validation of the fusion genes predicted to yield neoantigens
To rule out false-positive fusion events, PCR validation was performed on the fusion genes predicted to have corresponding neoantigens.
The experimental method: primers spanning the fusion point were designed on the upstream and downstream genes of each fusion gene using Primer Premier 5 software; that is, the forward primer lies in the gene upstream of the fusion event and the reverse primer lies in the gene downstream of the fusion event. The resulting primer sequences are shown in Table 2. The experimental procedures were carried out according to the instructions of the kit used.
The experimental results: 5 fusion genes were verified as positive (see FIG. 2): RBP4_FRA10AC1, AHNAK_RPS11, DNPH1_CRIP3, SMURF1_KPNA7, and STRN3_HECTD1.
2. ELISPOT validation of neoantigens corresponding to positive fusion events
The neoantigen polypeptides corresponding to the 5 PCR-positive fusion genes were further verified for immunogenicity by ELISPOT experiments.
The experimental method: ELISPOT validation was performed on the 5 neoantigens corresponding to the positive fusion events.
The experimental results: the results obtained for the positive control and the negative control are shown in FIGS. 3 and 4, respectively. Of the 5 neoantigens tested, 4 gave immunogenic positive reactions (including one weak positive) and 1 was negative (see FIGS. 5-9). Finally, the neoantigens corresponding to the 4 positive results were taken as candidate neoantigens: RBP4_FRA10AC1, AHNAK_RPS11, SMURF1_KPNA7, and STRN3_HECTD1.
TABLE 2 Primer sequences obtained by designing cross-fusion-point primers on the genes upstream and downstream of each fusion gene
RBP4_FRA10AC1_F1(5’-3’) GGCACCTTCACAGACACC(SEQ ID NO.1)
RBP4_FRA10AC1_R1(5’-3’) AGCTCTATCCTCTAGGAGCTAC(SEQ ID NO.2)
RBP4_FRA10AC1_F2(5’-3’) TGGGCACCTTCACAGACAC(SEQ ID NO.3)
RBP4_FRA10AC1_R2(5’-3’) TCACATTAAGGAGCGGAGG(SEQ ID NO.4)
AHNAK_RPS11_F(5’-3’) GGGGATGATGAGGAGTACC(SEQ ID NO.5)
AHNAK_RPS11_R(5’-3’) TGAAGCGCACTGTCTTGCTC(SEQ ID NO.6)
DGCR2_GSC2_F(5’-3’) GTTGCAGCCGAGAGTGTG(SEQ ID NO.7)
DGCR2_GSC2_R(5’-3’) GTACTCACGTCAGGATACTGG(SEQ ID NO.8)
DNPH1_CRIP3_F(5’-3’) GACAGGACGCTGTACGAGC(SEQ ID NO.9)
DNPH1_CRIP3_R(5’-3’) GGCCTGAGTTAGGGTGACC(SEQ ID NO.10)
PLA2G6_TMEM184B_F(5’-3’) CACTCAGATGGATGTCACCG(SEQ ID NO.11)
PLA2G6_TMEM184B_R(5’-3’) GCTGACGGAGATGTTGTAGA(SEQ ID NO.12)
SMURF1_KPNA7_F(5’-3’) GACTGGGCTCGGCTGGAAG(SEQ ID NO.13)
SMURF1_KPNA7_R(5’-3’) CGCCTGGAGGATGCAAGA(SEQ ID NO.14)
STRN3_HECTD1_F(5’-3’) CACTACATCCAGCACGAGTG(SEQ ID NO.15)
STRN3_HECTD1_R(5’-3’) CTGGCTGGGTAGTTACAGGA(SEQ ID NO.16)
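As a simple consistency check on Table 2, the sketch below compares each primer's length with the <211> length recorded for the same SEQ ID NO. in the sequence listing and reports its GC fraction. The primer sequences and lengths are copied from this document; the helper names are hypothetical, and GC fraction is shown only as a commonly inspected primer property, not as a criterion stated in the source.

```python
# (name, sequence from Table 2, <211> length from the sequence listing), SEQ ID NO. 1-16
PRIMERS = [
    ("RBP4_FRA10AC1_F1", "GGCACCTTCACAGACACC", 18),
    ("RBP4_FRA10AC1_R1", "AGCTCTATCCTCTAGGAGCTAC", 22),
    ("RBP4_FRA10AC1_F2", "TGGGCACCTTCACAGACAC", 19),
    ("RBP4_FRA10AC1_R2", "TCACATTAAGGAGCGGAGG", 19),
    ("AHNAK_RPS11_F", "GGGGATGATGAGGAGTACC", 19),
    ("AHNAK_RPS11_R", "TGAAGCGCACTGTCTTGCTC", 20),
    ("DGCR2_GSC2_F", "GTTGCAGCCGAGAGTGTG", 18),
    ("DGCR2_GSC2_R", "GTACTCACGTCAGGATACTGG", 21),
    ("DNPH1_CRIP3_F", "GACAGGACGCTGTACGAGC", 19),
    ("DNPH1_CRIP3_R", "GGCCTGAGTTAGGGTGACC", 19),
    ("PLA2G6_TMEM184B_F", "CACTCAGATGGATGTCACCG", 20),
    ("PLA2G6_TMEM184B_R", "GCTGACGGAGATGTTGTAGA", 20),
    ("SMURF1_KPNA7_F", "GACTGGGCTCGGCTGGAAG", 19),
    ("SMURF1_KPNA7_R", "CGCCTGGAGGATGCAAGA", 18),
    ("STRN3_HECTD1_F", "CACTACATCCAGCACGAGTG", 20),
    ("STRN3_HECTD1_R", "CTGGCTGGGTAGTTACAGGA", 20),
]


def gc_fraction(seq):
    """Fraction of G and C bases in a DNA sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def length_mismatches(primers):
    """Names of primers whose actual length differs from the listed length."""
    return [name for name, seq, listed in primers if len(seq) != listed]


if __name__ == "__main__":
    for name, seq, _ in PRIMERS:
        print(f"{name}\t{len(seq)} nt\tGC={gc_fraction(seq):.2f}")
    print("length mismatches:", length_mismatches(PRIMERS))
```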
The above-described embodiments are only for illustrating the present invention and are not to be construed as limiting the present invention. As will be understood by those of ordinary skill in the art: various changes, modifications, substitutions and alterations can be made to the embodiments without departing from the principles and spirit of the invention, the scope of which is defined by the claims and their equivalents.
Sequence listing
<110> Zhongsheng Kangyuan Biotechnology (Beijing) Co., Ltd
<120> tumor neoantigen prediction method based on gene fusion event and application thereof
<141> 2020-09-08
<160> 16
<170> SIPOSequenceListing 1.0
<210> 1
<211> 18
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 1
ggcaccttca cagacacc 18
<210> 2
<211> 22
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 2
agctctatcc tctaggagct ac 22
<210> 3
<211> 19
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 3
tgggcacctt cacagacac 19
<210> 4
<211> 19
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 4
tcacattaag gagcggagg 19
<210> 5
<211> 19
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 5
ggggatgatg aggagtacc 19
<210> 6
<211> 20
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 6
tgaagcgcac tgtcttgctc 20
<210> 7
<211> 18
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 7
gttgcagccg agagtgtg 18
<210> 8
<211> 21
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 8
gtactcacgt caggatactg g 21
<210> 9
<211> 19
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 9
gacaggacgc tgtacgagc 19
<210> 10
<211> 19
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 10
ggcctgagtt agggtgacc 19
<210> 11
<211> 20
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 11
cactcagatg gatgtcaccg 20
<210> 12
<211> 20
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 12
gctgacggag atgttgtaga 20
<210> 13
<211> 19
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 13
gactgggctc ggctggaag 19
<210> 14
<211> 18
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 14
cgcctggagg atgcaaga 18
<210> 15
<211> 20
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 15
cactacatcc agcacgagtg 20
<210> 16
<211> 20
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 16
cactacatcc agcacgagtg 20

Claims (10)

1. A scoring function for assessing the credibility of a fusion gene tumor neoantigen, characterized in that the scoring function comprises the following feature values: the number of junction reads supporting the fusion gene, the number of spanning fragments supporting the fusion gene, the average single-base coverage of the upstream gene of the fusion event, and the average single-base coverage of the downstream gene of the fusion event.
2. The scoring function according to claim 1, characterized in that it is expressed as Score = -lg(IC/500) + [(junctionReadCount + spanningFragCount) × 2 / (upstreamcov + downstreamcov)];
wherein IC = mean(IC50[i:i+n]), representing the median taken over the affinity values predicted by the various software; junctionReadCount denotes the number of junction reads supporting the fusion gene; spanningFragCount denotes the number of spanning fragments supporting the fusion gene; upstreamcov denotes the average single-base coverage of the gene upstream of the fusion event; downstreamcov denotes the average single-base coverage of the gene downstream of the fusion event;
preferably, the various software includes NetMHCpan, NetMHCIIpan, NetMHC and NetMHCII.
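The formula in claim 2 can be sketched in Python as follows. This is a minimal illustration and not the applicant's implementation: the function name, argument names and example values are hypothetical, lg is assumed to be the base-10 logarithm, and IC is computed as the arithmetic mean of the per-tool IC50 values, as the expression mean(IC50[i:i+n]) suggests. If the median is intended, statistics.median can be substituted.

```python
import math
from statistics import mean


def fusion_score(ic50_values, junction_read_count, spanning_frag_count,
                 upstream_cov, downstream_cov):
    """Score = -lg(IC/500) + (junctionReadCount + spanningFragCount) * 2
    / (upstreamcov + downstreamcov), following the formula in claim 2."""
    if not ic50_values or min(ic50_values) <= 0:
        raise ValueError("need at least one positive IC50 value")
    total_cov = upstream_cov + downstream_cov
    if total_cov <= 0:
        raise ValueError("combined coverage must be positive")
    ic = mean(ic50_values)  # assumption: arithmetic mean across tools
    affinity_term = -math.log10(ic / 500.0)  # assumption: lg means log10
    support_term = (junction_read_count + spanning_frag_count) * 2.0 / total_cov
    return affinity_term + support_term


# Hypothetical candidates; all numbers are illustrative only.
candidates = {
    "candidate_A": fusion_score([50.0, 50.0], 10, 10, 20.0, 20.0),
    "candidate_B": fusion_score([500.0, 500.0], 5, 5, 20.0, 20.0),
}
# Rank candidates from highest to lowest score.
ranked = sorted(candidates, key=candidates.get, reverse=True)
```

Under these assumptions, a candidate whose IC equals 500 contributes 0 from the affinity term, so a lower predicted IC50 and stronger read support relative to coverage both raise the score.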
3. A method for predicting a fusion gene tumor neoantigen, comprising ranking fusion gene tumor neoantigens by the confidence levels obtained with the scoring function of claim 1 or 2.
4. The prediction method according to claim 3, characterized in that it comprises the steps of:
(1) obtaining a tumor tissue sample and RNA-seq sequencing data;
(2) detecting the fusion gene of the tumor tissue;
(3) calculating the expression quantity of the fusion gene;
(4) annotation of fusion genes;
(5) extracting fusion polypeptide;
(6) identifying MHC molecule types;
(7) HLA affinity prediction;
(8) obtaining a scoring ranking of the confidence level of the fusion gene tumor neoantigens using the scoring function of claim 1 or 2;
preferably, the fusion gene in the tumor tissue is detected using STAR-Fusion software;
preferably, the detected fusion gene is annotated using AGFusion;
preferably, the polypeptides are extracted in a sliding-window manner; specifically, a sliding window with a length of 8-15 amino acids is moved stepwise across the positions upstream and downstream of the mutation site, with a step size of 1;
preferably, the prediction is carried out using multiple software tools such as NetMHCpan, NetMHCIIpan, NetMHC and NetMHCII, each of which yields a corresponding IC50 value as its affinity prediction result.
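The sliding-window extraction described in claim 4 can be illustrated with the short Python sketch below. It assumes the fusion protein is available as a single amino acid string and that the position of the fusion junction is known. The function name, the 0-based junction_index convention and the restriction to windows that span the junction are assumptions made for this illustration, not details given in the claims.

```python
def sliding_window_peptides(protein, junction_index, min_len=8, max_len=15):
    """Collect peptides of length min_len..max_len that span the junction.

    junction_index is the 0-based index of the first residue contributed by
    the downstream gene (hypothetical convention for this sketch).
    """
    peptides = []
    for length in range(min_len, max_len + 1):
        # A window spans the junction when it contains both residue
        # junction_index - 1 and residue junction_index.
        first_start = max(0, junction_index - length + 1)
        last_start = min(junction_index - 1, len(protein) - length)
        # Step size of 1, as stated in the claim.
        for start in range(first_start, last_start + 1):
            peptides.append(protein[start:start + length])
    return peptides


# Toy sequence (letters only, not a real protein); the junction lies
# between "J" (index 9) and "K" (index 10).
toy_protein = "ABCDEFGHIJKLMNOPQRST"
peptides = sliding_window_peptides(toy_protein, junction_index=10)
eight_mers = [p for p in peptides if len(p) == 8]
```

Each extracted peptide would then be passed to the affinity prediction step. Restricting windows to those that cross the junction keeps only peptides containing sequence from both fusion partners; a frameshifted downstream segment would need additional windows that this sketch does not cover.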
5. Use of the scoring function of claim 1 or 2 for predicting fusion gene tumor neoantigen.
6. A fusion gene tumor neoantigen selected from one or more of the following group: RBP4_FRA10AC1, AHNAK_RPS11, SMURF1_KPNA7, STRN3_ programmed 1.
7. A method of screening for neoantigens according to claim 6, which comprises the prediction method according to claim 3 or 4.
8. Use of the neoantigen of claim 6 in the preparation of an anti-tumor drug or vaccine.
9. An apparatus for predicting a fusion gene tumor neoantigen, the apparatus comprising a memory for storing a program and a processor for executing the program to perform the prediction method of claim 3 or 4.
10. A computer-readable storage medium comprising a program executable by a processor to perform the prediction method of claim 3 or 4.
CN202010933998.6A 2020-09-08 2020-09-08 Tumor neoantigen prediction method based on gene fusion event and application thereof Active CN112210596B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010933998.6A CN112210596B (en) 2020-09-08 2020-09-08 Tumor neoantigen prediction method based on gene fusion event and application thereof

Publications (2)

Publication Number Publication Date
CN112210596A (en) 2021-01-12
CN112210596B (en) 2022-04-26

Family

ID=74048830

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010933998.6A Active CN112210596B (en) 2020-09-08 2020-09-08 Tumor neoantigen prediction method based on gene fusion event and application thereof

Country Status (1)

Country Link
CN (1) CN112210596B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CA3066635A1 (en) * 2017-06-09 2018-12-13 Gritstone Oncology, Inc. Neoantigen identification, manufacture, and use
CN110322925A (en) * 2019-07-18 2019-10-11 杭州纽安津生物科技有限公司 A method of prediction fusion generates neoantigen
CN110706742A (en) * 2019-09-30 2020-01-17 中生康元生物科技(北京)有限公司 Pan-cancer tumor neoantigen high-throughput prediction method and application thereof
CN110752041A (en) * 2019-10-23 2020-02-04 深圳裕策生物科技有限公司 Method, device and storage medium for predicting neoantigen based on next generation sequencing

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
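Wang Guangzhi (王广志) et al.: "Research progress in antigen peptide prediction for personalized tumor neoantigen vaccines", Progress in Biochemistry and Biophysics *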

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113345516A (en) * 2021-06-23 2021-09-03 深圳裕泰抗原科技有限公司 HLA genotyping method, device and storage medium
CN116825188A (en) * 2023-06-25 2023-09-29 北京泛生子基因科技有限公司 Method, device and computer readable storage medium for identifying tumor neoantigens at the multi-omics level based on high-throughput sequencing technology
CN116825188B (en) * 2023-06-25 2024-04-09 北京泛生子基因科技有限公司 Method, device and computer readable storage medium for identifying tumor neoantigens at the multi-omics level based on high-throughput sequencing technology

Also Published As

Publication number Publication date
CN112210596B (en) 2022-04-26

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20210119

Address after: 315800 room 3593, building 2, 406, Xinqi Jingang Road, Beilun District, Ningbo City, Zhejiang Province

Applicant after: Ningbo Free Trade Zone Daming investment partnership (L.P.)

Address before: 102200-1, building 3, yard 12, Changsheng Road, science and Technology Park, Changping District, Beijing

Applicant before: ZHONGSHENG KANGYUAN BIO-TECH (BEIJING) Co.,Ltd.

TA01 Transfer of patent application right

Effective date of registration: 20210326

Address after: 102200-1, building 3, yard 12, Changsheng Road, science and Technology Park, Changping District, Beijing

Applicant after: ZHONGSHENG KANGYUAN BIO-TECH (BEIJING) Co.,Ltd.

Address before: 315800 room 3593, building 2, 406, Xinqi Jingang Road, Beilun District, Ningbo City, Zhejiang Province

Applicant before: Ningbo Free Trade Zone Daming investment partnership (L.P.)

GR01 Patent grant