CN111755067A

CN111755067A - Screening method of tumor neoantigen

Info

Publication number: CN111755067A
Application number: CN201910242904.8A
Authority: CN
Inventors: 雷俊卿; 苏小平; 秦汉楠
Original assignee: Geyuan Zhishan Shanghai Bio Tech Co ltd
Current assignee: Geyuan Zhishan Shanghai Bio Tech Co ltd
Priority date: 2019-03-28
Filing date: 2019-03-28
Publication date: 2020-10-09

Abstract

The invention discloses a method for screening tumor neoantigens, aiming to solve the problem that existing methods cannot screen high-quality tumor neoantigens from multiple angles. The method comprises the following specific steps: step one, selecting the variation types of the tumor; step two, performing RNA variation analysis on the tumor; step three, performing MHC molecule analysis on normal blood and the tumor; step four, performing RNA expression analysis on the tumor; step five, performing variation annotation on the tumor; step six, analyzing the variation driver genes of the tumor; step seven, predicting the HLA molecule binding affinity of the tumor; step eight, analyzing the mutation frequency of the tumor; step nine, comprehensively scoring the tumor neoantigens; step ten, analyzing the synthesis difficulty of the tumor neoantigens; step eleven, comprehensively selecting the final tumor neoantigens. The invention optimally combines tumor mutation analysis with expression-based prediction of tumor-specific antigens, so that the analysis process is more efficient and accurate.

Description

Screening method of tumor neoantigen

Technical Field

The invention relates to the field of tumor immunity, in particular to a method for screening a tumor neoantigen.

Background

Tumor-specific antigens (TSAs) are antigens characteristic of tumor cells and are also known as neoantigens. Tumor-specific antigens were proposed in the first half of the last century. Later, with the development of molecular biology and a deeper understanding of the function of major histocompatibility complex (MHC) molecules, Boon et al. first discovered that, in tumors, complexes of tumor-produced specific peptides and MHC molecules can be recognized by T cells such as CD8+ or CD4+ T cells. Subsequent studies recognized that these T-cell-recognized antigens are derived from genomic variations of tumors expressed as tumor-specific peptides (neo-epitopes), and they are defined as neoantigens. Unlike tumor-associated antigens, tumor-specific antigens are present only in tumor cells.

In July 2017, the results of two independent phase I clinical trials were published in the same issue of the British scientific journal Nature. In these trials, neoantigens specifically expressed by tumor cells as a result of gene mutation were identified by sequencing the DNA and RNA of the tumor cells, and a personalized tumor vaccine was then constructed and infused back into the human body to activate immune cells and kill the tumor cells carrying the antigen. This was the first cancer vaccine research to succeed in clinical trials.

The published prediction methods for tumor neoantigens mainly comprise EpiToolKit and Epi-Seq. However, EpiToolKit starts only from mutations: it does not consider the depth and coverage of the sequencing data, does not assess mutation quality from the data quality, and therefore cannot judge the quality of the resulting neoantigens. In addition, EpiToolKit does not consider the expression abundance of the neoantigen, which causes false-positive predictions and prevents the screening of high-quality neoantigens. Many mutations at the DNA level are not expressed; on average about 50% of mutations may not be expressed, which can cause false positives in neoantigen prediction. Moreover, the expression of a mutation may be high or low, and in general the higher the expression, the stronger the immunogenicity. Finally, EpiToolKit does not compare the mutant peptide with the normal peptide. A high-quality neoantigen generally has a higher affinity than the corresponding normal peptide, and the lack of such a comparison also causes false positives when screening for high-quality neoantigens.

Epi-Seq predicts tumor-specific antigens only from tumor expression data, which also causes false positives. First, false positives are easily caused by RNA editing. Second, because RNA sequencing is performed after reverse transcription into cDNA, this process also introduces a large number of false positives. Third, detection based on comparing tumor cDNA with germline DNA itself produces many false positives. Together, these factors result in more false positives among the neoantigens obtained by Epi-Seq.

Therefore, there is currently no method that screens high-quality tumor neoantigens from multiple angles directly on the basis of sequencing alignment results, and related research is ongoing.

Disclosure of Invention

The object of the present invention is to provide a method for screening tumor neoantigens, so as to solve the above problems of the prior art.

In order to achieve the purpose, the invention provides the following technical scheme:

a screening method of tumor neoantigen comprises the following specific steps:

step one, selecting a variation type of tumor somatic cells;

step two, RNA variation analysis is carried out on the tumor somatic cells;

step three, respectively carrying out Major Histocompatibility Complex (MHC) molecule analysis on normal blood and tumor somatic cells;

step four, RNA expression analysis is carried out on the tumor somatic cells;

step five, carrying out variation annotation on the tumor somatic cells;

step six, analyzing the variation driver genes of the tumor somatic cells;

step seven, predicting the binding affinity of Human Leukocyte Antigen (HLA) molecules for the tumor somatic cells;

step eight, analyzing the mutation frequency of the tumor somatic cells;

step nine, comprehensively scoring and ranking candidate tumor neoantigens;

step ten, analyzing the synthesis difficulty of the candidate tumor neoantigen;

step eleven, integrating the results of the step nine and the step ten to select the final tumor neoantigen.

As a further scheme of the invention: the variation types of the tumor somatic cells in the first step comprise DNA point mutation, insertion deletion mutation and frame shift mutation of the tumor somatic cells.

As a further scheme of the invention: in step three, the OptiType, xHLA and seq2HLA software is used to carry out MHC molecule analysis on normal blood and tumor somatic cells, and the results of the three are integrated to determine the HLA typing result of the tumor somatic cells.

As a further scheme of the invention: the variation annotation in step five comprises annotation of the point mutations, insertion-deletion mutations and frameshift mutations among the tumor somatic variations.

As a further scheme of the invention: in step six, the variation driver gene analysis is performed on the point mutations, insertion-deletion mutations and frameshift mutations in the tumor somatic cells.

As a further scheme of the invention: in step seven, the HLA molecule binding affinity of the tumor somatic cells is predicted from the HLA molecule type of the tumor somatic cells, the predicted mutant peptide fragments obtained in the mutant peptide prediction step, and the wild-type peptide sequences corresponding to the predicted mutant peptide fragments.

As a further scheme of the invention: the scoring and ranking in step nine are based on the MHC affinity, the antigen expression abundance, the comparison with the wild-type peptide, the mutation frequency of the tumor somatic cells, whether the variation is also an RNA variation, and whether it lies in a tumor driver gene.

As a further scheme of the invention: in step ten, the synthesis difficulty of the candidate tumor neoantigen is analyzed according to its molecular weight, isoelectric point, net charge at pH 7, average hydrophilicity and hydrophilic residue ratio.

Compared with the prior art, the invention has the beneficial effects that:

tumor mutation analysis and expression-based prediction of tumor-specific antigens are optimally combined, so that the analysis process is more efficient and accurate;

the invention handles not only human gene data but also includes a mouse analysis module, so that neoantigen prediction analysis has a wider range of application;

the method starts from reading the fastq files and generates the results automatically with one click; it optimizes the combined calling of intermediate files in big-data processing and uses multi-task distributed processing, which greatly improves analysis efficiency and lowers the hardware requirements of biological big-data analysis; the tumor mutation analysis results are more accurate, which further improves the accuracy of subsequent treatment, and the method has positive prospects for use.

Drawings

FIG. 1 is a graph of Mutant_affinity_score in Example 2 of the method for screening tumor neoantigens, assuming Δ is 3.

FIG. 2 is a graph of Expression_score in Example 2 of the method for screening tumor neoantigens, where expression_TMP is the expression level generated in step 4.4.

FIG. 3 is a graph of Normal_affinity_score in Example 2 of the method for screening tumor neoantigens, where normal_score is the affinity score of the wild-type peptide chain calculated in step 7.6.

Detailed Description

The technical solution of the present patent will be described in further detail with reference to the following embodiments.

Example 1

A screening method of tumor neoantigen comprises the following specific steps:

1, selection of tumor somatic variation

Internationally recognized GATK tumor somatic mutation detection software and commercial software are used to analyze the whole-exome next-generation sequencing results of the tumor somatic cell sample and the normal blood sample, and mutations with a high mutation frequency that are detected by multiple detection tools are taken as candidate mutations; at the same time, variation analysis is carried out on the sequencing results of the tumor somatic cell transcriptome;

2, tumor somatic RNA variation analysis step

The somatic variations of the tumor somatic DNA and the RNA variations of the tumor somatic cells are combined to finally determine the tumor somatic variations.

3, MHC molecule analysis step

The HLA typing software OptiType is used to analyze the HLA class I molecules in the whole-exome next-generation sequencing results of the tumor somatic cell sample and the normal blood sample respectively; the HLA typing software xHLA is used to analyze the HLA class I and HLA class II molecules in the whole-exome next-generation sequencing results of the tumor somatic cell sample and the normal blood sample respectively; the HLA typing software seq2HLA is used to analyze the HLA class I and HLA class II molecules in the sequencing results of the tumor somatic cell transcriptome; the three results are combined to finally confirm the HLA typing result, and the sample identity is confirmed from the three results.

4, analysis of RNA expression in tumor somatic cells

Transcriptome expression analysis is carried out on the sequencing results of the tumor somatic cell transcriptome, and the TPM (Transcripts Per Million) values of the genes and transcripts are determined.

5, variant annotation step

The point mutations, insertion-deletion mutations and frameshift mutations among the tumor somatic variations are annotated from the genomic variation, through the corresponding transcript, to the amino acid change; TMB (tumor mutation burden) analysis is performed on the tumor somatic variations;

6, tumor somatic cell driver gene analysis step

With reference to the COSMIC tumor database, driver gene analysis is carried out on the point mutations, insertion-deletion mutations and frameshift mutations among the tumor somatic variations.

7, HLA molecule affinity prediction step

The HLA molecule types of the tumor somatic cell sample obtained in the MHC molecule analysis step, the predicted mutant peptide fragments obtained in the mutant peptide prediction step, and the wild-type peptide sequences corresponding to the predicted mutant peptide fragments are used as the input of MHC class I and MHC class II affinity prediction software, and the affinities of the mutant peptide fragments for MHC class I and MHC class II molecules are predicted respectively. For MHC class I molecules, the mutant peptide fragments are predicted using the NNAlign neural network algorithm, which combines binding affinity data and MS eluted ligand data.

8, tumor somatic mutation frequency analysis step

Tumor mutation frequency analysis software is used to detect the frequency of the tumor somatic mutation among all DNA at the gene locus; the higher the mutation frequency, the higher the percentage of tumor somatic cells.

9, comprehensive scoring and ranking step of candidate tumor neoantigens

Each predicted mutant peptide among the candidate tumor neoantigens is scored according to factors such as MHC affinity, antigen expression abundance, comparison with the wild-type peptide, tumor somatic mutation frequency, RNA variation and tumor somatic driver gene status; the peptides are ranked from high to low by score, and those with high scores are selected as tumor neoantigens.

10, analysis step of ease of polypeptide synthesis

Taking the predicted mutant peptide as the center, the peptide chain is extended by 30 amino acids to the left and to the right, and polypeptide synthesis difficulty analysis software is used to analyze the synthesis difficulty of the candidate tumor neoantigen in terms of molecular weight, isoelectric point, net charge at pH 7, average hydrophilicity and hydrophilic residue ratio.

11, final selection step of candidate tumor neoantigens

The final tumor neoantigens to be synthesized are selected according to the score of step 9 and the ease of polypeptide synthesis of step 10.

Example 2

A screening method of tumor neoantigen comprises the following specific steps:

1, selection of tumor somatic variation

1.1. Next-generation sequencing was carried out on the tumor tissue and blood of the sample using the relevant reagents, with sequencing depths of 200X and 100X respectively;

1.2. Comprehensive quality analysis of the sequencing data from 1.1 was performed using the fastp software from OpenGene. If Q20 was less than 98%, or Q30 was less than 90%, or the GC ratio was abnormal, the sequencing data were considered to be of unacceptable quality and the neoantigen analysis was stopped. Reads with too low quality, too short length, or too many N bases were filtered out.

1.3. BWA-MEM alignment was performed on the clean fastq data. The sample type was determined: for a human sample, the GRCh38 human reference genome was selected for aligning the tumor tissue sequencing data and the blood sequencing data; for a mouse sample, the GRCm38 reference genome was selected for aligning the tumor tissue sequencing data and the blood sequencing data.

1.4. Further sequencing quality statistics. The Phred score of each sequencing cycle was computed separately for the tumor tissue and blood data after step 1.3; the Q value of every sequencing cycle was required to be greater than 30, otherwise the neoantigen analysis was stopped. The sequencing depth of the tumor tissue and blood data after step 1.3 was also calculated, and the neoantigen analysis was stopped if the depth was below 200X or 100X, respectively.

1.5. Duplicate reads were removed from the data after step 1.3.

1.6. The data after step 1.5 were realigned according to the known indel information in 1000G.

1.7. A model was built from existing variation databases to generate a recalibration table, and the base quality scores were then corrected according to this model.

1.8. Tumor somatic mutation analysis was performed using MuTect and MuTect2, and the variant results of both were combined.
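The stop conditions of steps 1.2 and 1.4 can be sketched as a single gate function. This is only an illustrative sketch: the `QcMetrics` container, the function name and the `gc_range` bounds are hypothetical (the text says only that an "abnormal" GC ratio stops the analysis and gives no numeric range), and the metrics are assumed to have already been read from the QC tool's report.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class QcMetrics:
    """Hypothetical container for metrics already extracted from QC reports."""
    q20_rate: float                 # fraction of bases with Phred >= 20 (0..1)
    q30_rate: float                 # fraction of bases with Phred >= 30 (0..1)
    gc_content: float               # fraction of G/C bases (0..1)
    per_cycle_q: Sequence[float]    # mean Phred score of each sequencing cycle
    mean_depth: float               # mean sequencing depth after alignment


def qc_failures(m: QcMetrics, min_depth: float,
                gc_range: Tuple[float, float] = (0.35, 0.65)) -> List[str]:
    """Return the failed checks; an empty list means the analysis may continue."""
    failures = []
    if m.q20_rate < 0.98:                       # step 1.2: Q20 below 98%
        failures.append("Q20 below 98%")
    if m.q30_rate < 0.90:                       # step 1.2: Q30 below 90%
        failures.append("Q30 below 90%")
    # Assumption: gc_range is a placeholder, not a value from the source.
    if not gc_range[0] <= m.gc_content <= gc_range[1]:
        failures.append("abnormal GC ratio")
    if any(q <= 30 for q in m.per_cycle_q):     # step 1.4: every cycle must exceed Q30
        failures.append("a sequencing cycle has Q <= 30")
    if m.mean_depth < min_depth:                # step 1.4: 200X tumor, 100X blood
        failures.append("depth below %gX" % min_depth)
    return failures
```

A caller would run `qc_failures(tumor_metrics, 200)` and `qc_failures(blood_metrics, 100)` and stop the neoantigen analysis if either list is non-empty.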

2, analysis of RNA variation in tumor tissues

2.1. RNA-seq was performed on the tumor tissue with a sequencing cluster count of 60M.

2.2. The RNA-seq data were quality-checked using FastQC software. If the quality was unacceptable, the neoantigen analysis was stopped.

2.3. Alignment was performed using STAR software. The sample type was determined: for a human sample, the GRCh38 human reference genome was selected for aligning the tumor tissue sequencing data and the blood sequencing data; for a mouse sample, the GRCm38 reference genome was selected for aligning the tumor tissue sequencing data and the blood sequencing data.

2.4. Further sequencing quality statistics. The Phred score of each sequencing cycle was computed separately for the tumor tissue and blood data after step 2.3; the Q value of every sequencing cycle was required to be greater than 30, otherwise the neoantigen analysis was stopped.

2.5. Duplicate reads were removed from the data after step 2.3.

2.6. A split-reads strategy was used to discover new splice junctions.

2.7. The data after step 2.6 were realigned according to the known indel information in 1000G.

2.8. A model was built from existing variation databases to generate a recalibration table, and the base quality scores were then corrected according to this model.

2.9. Variation analysis was performed using HaplotypeCaller. In the HaplotypeCaller command, the emit_conf parameter was set to 30, the call_conf parameter was set to 25, and the ploidy parameter was set to 4.

3, MHC molecule analysis step

3.1. The HLA typing software OptiType was used to analyze the HLA class I molecules in the whole-exome next-generation sequencing results of the sample tumor tissue and the normal blood respectively;

3.2. the HLA typing software xHLA was used to analyze the HLA class I and HLA class II molecules in the whole-exome next-generation sequencing results of the sample tumor tissue and the normal blood respectively;

3.3. the HLA typing software seq2HLA was used to analyze the HLA class I and HLA class II molecules in the sequencing results of the tumor tissue transcriptome;

3.4. the results of 3.1, 3.2 and 3.3 were combined to finally determine the HLA typing result. If the three results differed greatly, an alarm was raised, the program exited, and the neoantigen analysis was stopped.
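The text does not state how the three typing results are combined or how "differ greatly" is measured. One possible reading, sketched below purely as an assumption, is a per-locus majority vote in which a locus with no genotype supported by at least two tools triggers the stop. The function name, the `Typing` layout and the `min_votes` parameter are all hypothetical.

```python
from collections import Counter
from typing import Dict, List, Tuple

# One tool's result: locus -> pair of alleles, e.g. {"A": ("A*02:01", "A*24:02")}
Typing = Dict[str, Tuple[str, str]]


def consensus_hla(results: List[Typing], min_votes: int = 2) -> Typing:
    """Majority vote per locus (assumed rule, not taken from the source).

    Raises ValueError when no genotype at a locus is reported by at least
    min_votes tools, which stands in for the 'alarm and exit' of step 3.4.
    """
    consensus: Typing = {}
    loci = set().union(*(r.keys() for r in results))
    for locus in sorted(loci):
        # Sort each allele pair so that allele order does not affect the vote.
        votes = Counter(tuple(sorted(r[locus])) for r in results if locus in r)
        genotype, count = votes.most_common(1)[0]
        if count < min_votes:
            raise ValueError("HLA typing results disagree at locus " + locus)
        consensus[locus] = genotype
    return consensus
```

Because OptiType reports class I only, a class II locus is seen by two tools at most, so under this rule both must agree on it.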

4, analysis of RNA expression in tumor tissue

4.1. The sequencing results of the tumor tissue transcriptome were aligned. The sample type was determined: for a human sample, the grch38_tran human reference genome was selected for aligning the tumor tissue sequencing data and the blood sequencing data; for a mouse sample, the grcm38_tran reference genome was selected for aligning the tumor tissue sequencing data and the blood sequencing data.

4.2. The BAM file output by 4.1 was sorted with samtools.

4.3. The expression levels of the transcriptome were calculated using StringTie.

4.4. The TPM (Transcripts Per Million) values of the transcriptome were extracted from the gtf file generated in 4.3.
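Step 4.4 can be illustrated with a small parser. The sketch assumes the usual StringTie GTF layout, in which each `transcript` line carries `transcript_id` and `TPM` attributes in the ninth tab-separated column; the function name is hypothetical and the code is an illustration, not the patent's implementation.

```python
import re
from typing import Dict, Iterable

# Matches GTF attributes of the form: key "value";
_ATTR = re.compile(r'(\S+) "([^"]*)";')


def transcript_tpm(gtf_lines: Iterable[str]) -> Dict[str, float]:
    """Map transcript_id -> TPM for every 'transcript' feature line."""
    tpm: Dict[str, float] = {}
    for line in gtf_lines:
        if line.startswith("#"):          # skip header/comment lines
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != "transcript":
            continue                      # exon lines carry no TPM attribute
        attrs = dict(_ATTR.findall(fields[8]))
        if "transcript_id" in attrs and "TPM" in attrs:
            tpm[attrs["transcript_id"]] = float(attrs["TPM"])
    return tpm
```

With an open file this would be called as `transcript_tpm(open(path))`; the resulting value per transcript is what the scoring formulas later refer to as expression_TMP.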

5, variant annotation step

5.1. Genomic variation annotation of the point mutations, indels and frameshift mutations among the tumor somatic variations was performed with VEP software. The sample type was determined: for a human sample, the GRCh38 human reference genome was selected for the tumor tissue and blood sequencing data; for a mouse sample, the GRCm38 reference genome was selected for the tumor tissue and blood sequencing data.

5.2. The vcf format was converted to maf format with vcf2maf software.

5.2. The tumor somatic variations were filtered: variations of the Intron, 5'UTR, 3'UTR, IGR, 5'Flank, 3'Flank, RNA and lincRNA types were removed, only variations not present in dbSNP were retained, and the TMB (tumor mutation burden) was calculated.
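The filtering rule above can be sketched as follows. The column names `Variant_Classification` and `dbSNP_RS` follow the MAF format; treating an empty or `novel` value of `dbSNP_RS` as "not in dbSNP" is an assumption. The text does not give the TMB formula, so the mutations-per-megabase convention and the `target_size_mb` parameter are assumptions too.

```python
from typing import Dict, Iterable, List

# Variant classes that the filtering step removes.
EXCLUDED = {"Intron", "5'UTR", "3'UTR", "IGR", "5'Flank", "3'Flank", "RNA", "lincRNA"}


def filter_variants(rows: Iterable[Dict[str, str]]) -> List[Dict[str, str]]:
    """Keep MAF rows that are in a retained class and absent from dbSNP."""
    kept = []
    for row in rows:
        if row["Variant_Classification"] in EXCLUDED:
            continue
        # Assumption: an empty or "novel" dbSNP_RS value means "not in dbSNP".
        if row.get("dbSNP_RS", "") not in ("", "novel"):
            continue
        kept.append(row)
    return kept


def tmb_per_mb(kept: List[Dict[str, str]], target_size_mb: float) -> float:
    """Assumed TMB definition: retained mutations per megabase of target region."""
    return len(kept) / target_size_mb
```

Rows could come from `csv.DictReader(maf_file, delimiter="\t")` after skipping the MAF comment lines.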

6, tumor driver gene analysis step

7, HLA molecule affinity prediction step

The HLA molecule types of the tumor sample obtained in the MHC molecule analysis step, the predicted mutant peptide fragments obtained in the mutant peptide prediction step, and the wild-type peptide sequences corresponding to the predicted mutant peptide fragments are used as the input of MHC class I and MHC class II affinity prediction software, and the affinities of the mutant peptide fragments for MHC class I and MHC class II molecules are predicted respectively. For MHC class I molecules, the mutant peptide fragments are predicted using the NNAlign neural network algorithm, which combines binding affinity data and MS eluted ligand data.

7.1. The sample type was determined: for a human sample, the cDNA reference sequence and the peptide sequence of GRCh38 were selected; for a mouse sample, the cDNA reference sequence and the peptide sequence of GRCm38 were selected. For SNP, insertion and deletion variations, the wild-type and mutant peptide chains of the corresponding predicted lengths were looked up using the vcf2maf result from 5.2; for frameshift variations, the wild-type and mutant peptide chains of the corresponding predicted lengths were looked up according to the cDNA sequence and the peptide reference sequence.

7.2. The HLA class I molecules generated in step 3 were analyzed using netMHCpan-4.0. NetMHCpan-4.0 was run with the binding affinity (BA) and mass spectrometry (MS) data integrated, so that information was obtained from two different angles. In NetMHCpan-4.0, the MHC class I data of the IEDB database are first screened as necessary, the model is trained on binding affinity (BA) data and mass spectrometry eluted ligand (MS eluted ligand) data, the information of the two data types is integrated by an artificial neural network method, and, based on the NNAlign framework, the prediction of the affinity value with which a peptide binds a specific MHC molecule and of the peptide length is improved. The NetMHCpan-4.0 method improves prediction accuracy for tumor neoantigens, validated eluted ligands (ELs) and T cell epitopes. netMHCpan-4.0 was used to predict and score the affinity of the HLA class I molecules for the wild-type peptide chains of 8-15 amino acids and the mutant peptide chains of 8-15 amino acids generated in 7.1.

7.3. The HLA class II molecules generated in step 3 were analyzed using netMHCIIpan-3.2. The affinity of the HLA class II molecules for the wild-type peptide chains of 8-15 amino acids and the mutant peptide chains of 8-15 amino acids generated in 7.1 was predicted and scored.

7.4. netMHC analysis was performed on the HLA class I and class II molecules generated in step 3, with the affinity threshold set at 500 nM. The affinity of the HLA class I molecules for the wild-type peptide chains of 8-15 amino acids and the mutant peptide chains of 8-15 amino acids generated in 7.1 was predicted and scored.

7.5. The HLA class I and class II molecules generated in step 3 were analyzed using NetMHCcons, with the affinity threshold set at 500 nM. The affinity of the HLA class I molecules for the wild-type peptide chains of 8-15 amino acids and the mutant peptide chains of 8-15 amino acids generated in 7.1 was predicted and scored.

7.6. The median of the HLA class I affinity scores from the above three software tools was taken, and this median was used as the final affinity score.
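Two parts of step 7 lend themselves to a short illustration: enumerating the 8-15 residue wild-type/mutant peptide pairs of 7.1, and taking the median across tools in 7.6. The sketch below handles only a same-length amino acid substitution (indels and frameshifts need the cDNA-based lookup described in 7.1); function names and the dictionary layout are hypothetical.

```python
from statistics import median
from typing import Dict, Iterable, List, Tuple


def peptide_windows(wild: str, mutant: str, pos: int,
                    lengths: Iterable[int] = range(8, 16)) -> List[Tuple[str, str]]:
    """All (wild-type, mutant) peptide pairs of each length covering 0-based pos.

    Only valid for substitutions, where both proteins have the same length.
    """
    if len(wild) != len(mutant):
        raise ValueError("substitution expected: sequences must have equal length")
    pairs = []
    for k in lengths:
        first = max(0, pos - k + 1)          # leftmost window still covering pos
        last = min(pos, len(mutant) - k)     # rightmost window that fits the protein
        for start in range(first, last + 1):
            pairs.append((wild[start:start + k], mutant[start:start + k]))
    return pairs


def consensus_affinity(scores_by_tool: Dict[str, float]) -> float:
    """Step 7.6: the median of the per-tool affinity scores is the final score."""
    return median(scores_by_tool.values())
```

Each peptide pair would be scored by every predictor, and `consensus_affinity` would then be applied separately to the mutant scores and to the wild-type scores.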

8, tumor mutation frequency analysis step

Tumor mutation frequency analysis software is used to detect the frequency of the tumor mutation among all DNA at the gene locus; the higher the mutation frequency, the higher the percentage of tumor cells. The mutation frequency is read from the VCF file: if the variant comes from MuTect analysis, the FA field in the VCF file is read; if it comes from MuTect2, the AF field in the VCF file is read.
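Reading FA or AF from a VCF record can be sketched as below. The sketch assumes the value sits in the per-sample FORMAT column of the tumor sample, which is where both callers normally write it; the function name and the `caller` argument are hypothetical.

```python
def allele_frequency(format_field: str, sample_field: str, caller: str) -> float:
    """Read the tumor allele frequency from one VCF record.

    format_field: the FORMAT column, e.g. "GT:AD:AF:DP"
    sample_field: the tumor sample column, e.g. "0/1:30,10:0.25:40"
    caller:       "mutect" (reads FA) or "mutect2" (reads AF)
    """
    key = {"mutect": "FA", "mutect2": "AF"}[caller.lower()]
    values = dict(zip(format_field.split(":"), sample_field.split(":")))
    # Multi-allelic records may list several comma-separated values;
    # this sketch keeps only the first alternate allele.
    return float(values[key].split(",")[0])
```

The returned value corresponds to the allele_frequency term used in Formula five.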

9, comprehensive scoring and ranking step of candidate tumor neoantigens

Each predicted mutant peptide among the candidate tumor neoantigens is scored according to factors such as MHC affinity, antigen expression abundance, comparison with the wild-type peptide, tumor mutation frequency, RNA variation and tumor driver gene status; the peptides are ranked from high to low by score, and those with high scores are selected as tumor neoantigens.

9.1. Formula one

Neoantigen score = live antigen score

The neoantigen scores of the peptide chains of 8-15 amino acids from 7.6 were calculated respectively, and all neoantigens were sorted in descending order of score.

9.2. Formula two

Mutant_affinity_score calculation formula:

Mutant_affinity_score = Δ * (1 + e^(mutant_score*10 - 5))

mutant_score is the affinity score of the mutant peptide chain calculated in 7.6; the affinity is used as the exponent of the natural exponential function. Δ is the number of tumor variations: for an SNP variation, Δ is 1; for an insertion or deletion, Δ is the specific number of insertions or deletions; for a frameshift variation, Δ is the specific number of frameshifts. FIG. 1 shows the curve assuming Δ is 3.
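Formula two can be written directly as a function. The sketch reads the exponent as 10 × mutant_score − 5 under ordinary operator precedence, which is an interpretation of the printed formula, and it assumes mutant_score is on a 0 to 1 scale so that the exponent spans −5 to 5; the text states neither point explicitly.

```python
import math


def mutant_affinity_score(mutant_score: float, delta: int) -> float:
    """Formula two: delta * (1 + e^(10 * mutant_score - 5)).

    mutant_score: median affinity score of the mutant peptide (step 7.6),
                  assumed here to lie between 0 and 1.
    delta:        1 for an SNP, otherwise the number of inserted/deleted
                  or frameshifted positions.
    """
    return delta * (1.0 + math.exp(mutant_score * 10 - 5))
```

For example, with Δ = 3 (the setting of FIG. 1) and mutant_score = 0.5 the exponent is zero, so the score is 3 × (1 + 1) = 6.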

9.3. Formula three

Expression_score calculation formula:

Expression_score = tanh(expression_TMP)

expression_TMP is the expression level generated in 4.4. As FIG. 2 shows, once the expression level of the transcriptome reaches a certain level, the value of Expression_score saturates at 1.
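The saturation behaviour of Formula three is easy to check numerically. A minimal sketch, using the identifier `expression_TMP` from the formula in lower case:

```python
import math


def expression_score(expression_tmp: float) -> float:
    """Formula three: tanh of the transcript expression level from step 4.4."""
    return math.tanh(expression_tmp)


if __name__ == "__main__":
    # The score rises steeply for small expression values and then flattens.
    for level in (0.0, 0.5, 1.0, 2.0, 5.0, 50.0):
        print(level, round(expression_score(level), 4))
```

Because tanh is bounded by 1, very highly expressed transcripts cannot dominate the ranking through expression alone; above an expression value of a few units the score is effectively constant.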

9.4. Formula four

Normal_affinity_score calculation formula:

Normal_affinity_score = 1 / (1 + e^(normal_score*10 - 5))

normal_score is the affinity score of the wild-type peptide chain calculated in 7.6; the affinity is used as the exponent of the natural exponential function, and the reciprocal is then taken. FIG. 3 shows the curve.
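Formula four is a decreasing logistic curve in the wild-type score. As with Formula two, the sketch below reads the exponent as 10 × normal_score − 5 and assumes a 0 to 1 score scale; both are interpretations rather than statements from the text.

```python
import math


def normal_affinity_score(normal_score: float) -> float:
    """Formula four: 1 / (1 + e^(10 * normal_score - 5)).

    normal_score: median affinity score of the wild-type peptide (step 7.6),
                  assumed here to lie between 0 and 1.
    """
    return 1.0 / (1.0 + math.exp(normal_score * 10 - 5))
```

The value is 0.5 at normal_score = 0.5 and falls towards 0 as the wild-type peptide binds more strongly, so candidates whose wild-type counterpart already binds well are penalised. This matches the Background's point that a good neoantigen should bind better than the normal peptide.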

9.5. Formula five

α = 0.99*allele_frequency + 0.9*TBM + 0.1*in_RNA_mutant + 0.1*is_cancer_driven_gene

allele_frequency: tumor mutation frequency calculated in step 8.

TBM: tumor mutation burden (TMB) calculated in 5.2.

in_RNA_mutant: whether the tumor variation is among the RNA variations found in 2.9.

is_cancer_driven_gene: whether the gene is a tumor driver gene according to step 6.
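Formula five is a weighted sum and can be sketched as below. The two yes/no terms are assumed to contribute 1 when true and 0 when false, which the text implies but does not state; the Python argument names are lower-case renderings of the terms in the formula.

```python
def alpha(allele_frequency: float, tbm: float,
          in_rna_mutant: bool, is_cancer_driven_gene: bool) -> float:
    """Formula five: weighted sum of frequency, burden and two boolean flags.

    Assumption: each flag counts as 1 if true and 0 if false.
    """
    return (0.99 * allele_frequency
            + 0.9 * tbm
            + 0.1 * int(in_rna_mutant)
            + 0.1 * int(is_cancer_driven_gene))
```

For a variant with allele frequency 0.5, TBM 0, seen in RNA but not in a driver gene, this gives 0.495 + 0 + 0.1 + 0 = 0.595.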

10, analysis step of ease of polypeptide synthesis

Taking the predicted mutant peptide as the center, the peptide chain is extended by 25-30 amino acids to the left and to the right, and polypeptide synthesis difficulty analysis software is used to analyze the synthesis difficulty of the candidate tumor neoantigens in terms of molecular weight, isoelectric point, net charge at pH 7, average hydrophilicity and hydrophilic residue ratio.

11, final selection step of candidate tumor neoantigens

The final tumor neoantigens to be synthesized are selected according to the score of step 9 and the ease of polypeptide synthesis of step 10.
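Part of step 10 can be illustrated without the unnamed synthesis-difficulty software. The sketch below extends a peptide by a flank on each side and computes two of the listed properties. The residue masses are standard average values; the set of residues counted as hydrophilic is an assumption, since the text does not define it, and isoelectric point, net charge and average hydrophilicity are left out.

```python
# Average residue masses in daltons (one water is added per peptide).
AVERAGE_RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER_MASS = 18.01528
# Assumption: charged and polar residues are the ones counted as hydrophilic.
HYDROPHILIC = set("RKDENQHST")


def flanked_peptide(protein: str, start: int, end: int, flank: int = 25) -> str:
    """Extend protein[start:end] by up to `flank` residues on each side."""
    return protein[max(0, start - flank):end + flank]


def molecular_weight(peptide: str) -> float:
    """Average molecular weight: sum of residue masses plus one water."""
    return sum(AVERAGE_RESIDUE_MASS[aa] for aa in peptide) + WATER_MASS


def hydrophilic_ratio(peptide: str) -> float:
    """Fraction of residues that belong to the assumed hydrophilic set."""
    return sum(aa in HYDROPHILIC for aa in peptide) / len(peptide)
```

Slicing clips automatically at the protein ends, so a mutation near a terminus simply yields a shorter flank on that side.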

The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents, improvements and the like that fall within the spirit and principle of the present invention are intended to be included therein. Any reference sign in a claim should not be construed as limiting the claim concerned.

Furthermore, it should be understood that although this description is organized by embodiments, not every embodiment contains only a single independent technical solution. This manner of description is adopted for clarity only; those skilled in the art should treat the description as a whole, and the technical solutions of the embodiments may be combined as appropriate to form other embodiments understandable to those skilled in the art.

Claims

1. A method for screening tumor neoantigens is characterized by comprising the following specific steps:

step one, selecting a variation type of tumor somatic cells;

step two, RNA variation analysis is carried out on the tumor somatic cells;

step three, respectively carrying out MHC molecule analysis on normal blood and tumor somatic cells;

step four, RNA expression analysis is carried out on the tumor somatic cells;

step five, carrying out variation annotation on the tumor somatic cells;

step six, analyzing the variation driver genes of the tumor somatic cells;

step seven, predicting the HLA molecule binding affinity of the tumor somatic cells;

step eight, analyzing the mutation frequency of the tumor somatic cells;

step nine, comprehensively scoring and ranking candidate tumor neoantigens;

step ten, analyzing the synthesis difficulty of the candidate tumor neoantigen;

2. The method for screening tumor neoantigens according to claim 1, wherein the types of the variation of tumor somatic cells in the first step include DNA point mutation, insertion deletion mutation and frame shift mutation of tumor somatic cells.

3. The method for screening tumor neoantigens according to claim 1, wherein the variation annotations in the step five comprise variation annotations of point mutations, insertion/deletion mutations and frameshift mutations in tumor somatic variations.

4. The method for screening tumor neoantigens according to claim 1 or 2, wherein in step six the variation driver gene analysis is performed on the point mutations, insertion-deletion mutations and frameshift mutations in the tumor somatic cells.

5. The method for screening tumor neoantigens according to claim 1, wherein in step seven the HLA molecule binding affinity of the tumor somatic cells is predicted from the HLA molecule type of the tumor somatic cells, the predicted mutant peptide fragments obtained in the mutant peptide prediction step, and the wild-type peptide sequences corresponding to the predicted mutant peptide fragments.

6. The method for screening tumor neoantigens according to claim 1 or 3, wherein the scoring and ranking in step nine are based on the MHC affinity, the antigen expression abundance, the comparison with the wild-type peptide, the mutation frequency of the tumor somatic cells, whether the variation is an RNA variation, and whether it lies in a tumor driver gene.

7. The method for screening tumor neoantigens according to claim 1, wherein in step ten the synthesis difficulty of the candidate tumor neoantigen is analyzed according to its molecular weight, isoelectric point, net charge at pH 7, average hydrophilicity and hydrophilic residue ratio.