CN111755067A - Screening method of tumor neoantigen - Google Patents


Publication number
CN111755067A
Authority
CN
China
Prior art keywords
tumor
mutation
somatic cells
analysis
variation
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910242904.8A
Other languages
Chinese (zh)
Inventor
雷俊卿
苏小平
秦汉楠
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Geyuan Zhishan Shanghai Bio Tech Co ltd
Original Assignee
Geyuan Zhishan Shanghai Bio Tech Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Geyuan Zhishan Shanghai Bio Tech Co ltd
Priority to CN201910242904.8A
Publication of CN111755067A
Legal status: Pending

Classifications

    • G: PHYSICS
    • G16: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B: BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B20/00: ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations
    • G16B20/20: Allele or variant detection, e.g. single nucleotide polymorphism [SNP] detection
    • G: PHYSICS
    • G16: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B: BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B30/00: ICT specially adapted for sequence analysis involving nucleotides or amino acids

Landscapes

  • Life Sciences & Earth Sciences (AREA)
  • Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Engineering & Computer Science (AREA)
  • Biotechnology (AREA)
  • General Health & Medical Sciences (AREA)
  • Biophysics (AREA)
  • Analytical Chemistry (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Chemical & Material Sciences (AREA)
  • Evolutionary Biology (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Medical Informatics (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Theoretical Computer Science (AREA)
  • Genetics & Genomics (AREA)
  • Molecular Biology (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)
  • Investigating Or Analysing Biological Materials (AREA)

Abstract

The invention discloses a method for screening tumor neoantigens, aiming to solve the problem that existing methods cannot screen high-quality tumor neoantigens from multiple angles. The method comprises the following specific steps: step one, selecting the variation type of the tumor; step two, performing RNA variation analysis on the tumor; step three, performing MHC molecule analysis on normal blood and the tumor; step four, performing RNA expression analysis on the tumor; step five, performing variation annotation on the tumor; step six, analyzing the variation driver genes of the tumor; step seven, predicting the HLA molecule binding affinity of the tumor; step eight, analyzing the mutation frequency of the tumor; step nine, comprehensively scoring the tumor neoantigens; step ten, analyzing the synthesis difficulty of the tumor neoantigens; step eleven, comprehensively selecting the final tumor neoantigens. The invention optimally combines tumor mutation analysis and tumor expression analysis to predict tumor-specific antigens, so that the analysis process is more efficient and accurate.

Description

Screening method of tumor neoantigen
Technical Field
The invention relates to the field of tumor immunity, in particular to a method for screening a tumor neoantigen.
Background
Tumor-specific antigens (TSAs) are antigens that are unique to tumor cells and are also known as neoantigens. Tumor-specific antigens were first proposed in the first half of the last century. Later, with the development of molecular biology and a deeper understanding of the function of major histocompatibility complex (MHC) molecules, Boon et al. first discovered that complexes of tumor-derived specific peptides and MHC molecules can be recognized by T cells such as CD8+ or CD4+ T cells. Subsequent studies recognized that the antigens recognized by these T cells are derived from genomic variations of the tumor that are expressed as tumor-specific peptides (neo-epitopes), and these were defined as neoantigens. Unlike tumor-associated antigens, tumor-specific antigens are present only in tumor cells.
In July 2017, the results of two independent phase I clinical trials were published in the same issue of the British scientific journal Nature. In these trials, neoantigens specifically expressed by tumor cells as a result of gene mutations were identified by sequencing the DNA and RNA of the tumor cells; personalized tumor vaccines were then constructed and infused back into the human body to activate immune cells and kill the tumor cells carrying these antigens. These were the first cancer vaccine studies to succeed in clinical trials.
The tumor neoantigen prediction methods published so far are mainly EpiToolkit and Epi-Seq. However, EpiToolkit starts only from the mutations: it does not consider the depth and coverage of the sequencing data, does not assess mutation quality from data quality, and therefore cannot judge the quality of the neoantigens obtained. In addition, EpiToolkit does not consider expression abundance or whether the neoantigen is expressed at all, which causes false positive predictions and makes it impossible to screen high-quality neoantigens. Many mutations at the DNA level are not expressed (on average, perhaps 50% of mutations are not expressed), and these can cause false positives in neoantigen prediction. Expressed mutations also differ in expression level, and in general the higher the expression, the stronger the immunogenicity. Furthermore, EpiToolkit does not compare the mutant peptide with the normal peptide. A high-quality neoantigen generally has a higher affinity than the corresponding normal peptide, and the lack of this comparison also causes false positives when screening for high-quality neoantigens.
Epi-Seq predicts tumor-specific antigens only from tumor expression data, which also causes false positives. First, false positives are easily caused by RNA editing; second, because RNA sequencing is performed after reverse transcription into cDNA, this process also introduces a large number of false positives; third, the detection approach of comparing tumor cDNA against germline DNA itself yields many false positives. These factors result in more false positives among the neoantigens obtained by Epi-Seq.
Therefore, there is currently no method that screens high-quality tumor neoantigens from multiple angles directly on the basis of sequencing alignment results, and related research is ongoing.
Disclosure of Invention
The present invention is directed to a method for screening tumor neoantigens, which solves the above problems of the prior art.
In order to achieve the purpose, the invention provides the following technical scheme:
a method for screening tumor neoantigens comprises the following specific steps:
step one, selecting the variation type of tumor somatic cells;
step two, performing RNA variation analysis on the tumor somatic cells;
step three, performing Major Histocompatibility Complex (MHC) molecule analysis on normal blood and tumor somatic cells respectively;
step four, performing RNA expression analysis on the tumor somatic cells;
step five, performing variation annotation on the tumor somatic cells;
step six, analyzing the variation driver genes of the tumor somatic cells;
step seven, predicting the binding affinity of Human Leukocyte Antigen (HLA) molecules for the tumor somatic cells;
step eight, analyzing the mutation frequency of the tumor somatic cells;
step nine, comprehensively scoring and ranking candidate tumor neoantigens;
step ten, analyzing the synthesis difficulty of the candidate tumor neoantigens;
step eleven, integrating the results of step nine and step ten to select the final tumor neoantigens.
As a further scheme of the invention: the variation types of the tumor somatic cells in step one comprise DNA point mutations, insertion-deletion mutations and frameshift mutations of the tumor somatic cells.
As a further scheme of the invention: in step three, the OptiType, xHLA and seq2HLA software tools are used to perform MHC molecule analysis on normal blood and tumor somatic cells, and the three results are integrated to determine the HLA typing result of the tumor somatic cells.
As a further scheme of the invention: the variation annotation in step five comprises annotation of the point mutations, insertion-deletion mutations and frameshift mutations among the tumor somatic variations.
As a further scheme of the invention: in step six, driver gene analysis is performed on the point mutations, insertion-deletion mutations and frameshift mutations in the tumor somatic cells.
As a further scheme of the invention: in step seven, the HLA molecule binding affinity of the tumor somatic cells is predicted from the HLA molecule type of the tumor somatic cells, the predicted mutant peptide fragments obtained in the mutant peptide prediction step, and the wild-type peptide sequences corresponding to the predicted mutant peptide fragments.
As a further scheme of the invention: the scoring and ranking in step nine are based on the MHC affinity of the tumor somatic cells, the antigen expression abundance of the tumor somatic cells, the degree of contrast with the wild-type peptide, the mutation frequency of the tumor somatic cells, whether the variation is an RNA mutation, and whether it lies in a tumor driver gene.
As a further scheme of the invention: in step ten, the analysis is based on the molecular weight, isoelectric point, net charge at pH 7, average hydrophilicity and hydrophilic residue ratio of the candidate tumor neoantigen.
Compared with the prior art, the invention has the beneficial effects that:
tumor mutation analysis and tumor expression analysis are optimally combined to predict tumor-specific antigens, so that the analysis process is more efficient and accurate;
the invention is not limited to human genetic data: a mouse analysis module is also included, which widens the scope of application of neoantigen prediction analysis;
the method starts from reading the fastq files and generates the results automatically with one click; it optimizes the combined calling of intermediate files in big data processing, greatly improves analysis efficiency through multi-task distributed processing, lowers the hardware requirements of biological big data analysis, makes the tumor mutation analysis results more accurate, thereby improves the accuracy of subsequent treatment, and has good prospects for use.
Drawings
FIG. 1 is the curve obtained in Example 2 of the method for screening a tumor neoantigen when Δ is assumed to be 3.
FIG. 2 is a graph for Example 2 of the method for screening a tumor neoantigen, in which expression_TMP is the expression level generated in step 4.4.
FIG. 3 is a graph for Example 2 of the method for screening a tumor neoantigen, in which normal_score is the affinity score of the wild-type peptide chain calculated in step 7.6.
Detailed Description
The technical solution of the present patent will be described in further detail with reference to the following embodiments.
Example 1
A screening method of tumor neoantigen comprises the following specific steps:
1, tumor somatic variation selection step
The whole-exome next-generation sequencing results of a tumor somatic cell sample and a normal blood sample are analyzed with the internationally recognized GATK tumor somatic mutation detection software and with commercial software, and mutations with a high mutation frequency that are detected by multiple detection tools are taken as candidate mutations; at the same time, mutation analysis is performed on the tumor somatic transcriptome sequencing results;
2, tumor somatic RNA variation analysis step
The somatic mutations of the tumor somatic DNA and the RNA mutations of the tumor somatic cells are combined to finally determine the tumor somatic variations.
3, MHC molecule analysis step
The HLA typing software OptiType is used to analyze HLA class I molecules in the whole-exome next-generation sequencing results of the tumor somatic cell sample and of the normal blood sample respectively; the HLA typing software xHLA is used to analyze HLA class I and HLA class II molecules in the whole-exome next-generation sequencing results of the tumor somatic cell sample and of the normal blood sample respectively; the HLA typing software seq2HLA is used to analyze HLA class I and HLA class II molecules in the tumor somatic transcriptome sequencing results; the three results are combined to finally determine the HLA typing result, and sample consistency is confirmed from the three results.
4, tumor somatic RNA expression analysis step
Transcriptome expression analysis is performed on the tumor somatic transcriptome sequencing results to determine the TPM (Transcripts Per Million) values of genes and transcripts.
5, variant annotation step
Point mutations, insertion-deletion mutations and frameshift mutations among the tumor somatic variations are annotated from the genomic mutation through the corresponding transcript to the amino acid change; TMB (tumor mutation burden) analysis is performed on the tumor somatic variations;
6, tumor somatic driver gene analysis step
With reference to the COSMIC tumor database, driver gene analysis is performed on the point mutations, insertion-deletion mutations and frameshift mutations among the tumor somatic variations.
7, HLA molecule affinity prediction step
The HLA molecule types of the tumor somatic cell sample obtained in the MHC molecule analysis step, the predicted mutant peptide fragments obtained in the mutant peptide prediction step, and the wild-type peptide sequences corresponding to the predicted mutant peptide fragments are used as the input of MHC class I and MHC class II affinity prediction software, which predicts the affinity of the mutant peptide fragments for MHC class I and MHC class II molecules respectively. For MHC class I molecules, the mutant peptide fragments are predicted with the NNAlign artificial neural network algorithm, which combines binding affinity data and MS eluted ligand data.
8, tumor somatic mutation frequency analysis step
Tumor mutation frequency analysis software is used to detect the frequency of the tumor somatic mutation among all DNA at that gene locus; the higher the mutation frequency, the higher the proportion of tumor somatic cells.
9, comprehensive scoring and ranking step for candidate tumor neoantigens
The predicted mutant peptide fragments among the candidate tumor neoantigens are scored according to influencing factors such as MHC affinity, antigen expression abundance, comparison with the wild-type peptide, tumor somatic mutation frequency, whether the variation is an RNA mutation, and whether it lies in a tumor somatic driver gene; the fragments are sorted by score from high to low, and those with high scores are selected as tumor neoantigens.
10, polypeptide synthesis difficulty analysis step
Taking the predicted mutant peptide as the center, the peptide chain is extended by 30 residues to the left and to the right respectively, and polypeptide synthesis difficulty analysis software is used to analyze the synthesis difficulty of the candidate tumor neoantigen in terms of molecular weight, isoelectric point, net charge at pH 7, average hydrophilicity and hydrophilic residue ratio.
11, candidate tumor neoantigen final selection step
The tumor neoantigens to be synthesized are finally selected according to the score from step 9 and the polypeptide synthesis difficulty from step 10.
Example 2
A screening method of tumor neoantigen comprises the following specific steps:
1, tumor somatic variation selection step
1.1. Next-generation sequencing is performed on the tumor tissue and blood of the sample using the relevant reagents, with sequencing depths of 200X and 100X respectively;
1.2. Comprehensive quality analysis of the sequencing data from 1.1 is performed with OpenGene's fastp software. If Q20 is below 98%, Q30 is below 90%, or the GC ratio is abnormal, the sequencing data are considered to be of unacceptable quality and the neoantigen analysis is stopped. Reads of too low quality, reads that are too short, and reads with too many N bases are filtered out.
1.3. The clean fastq data are aligned with BWA-MEM. The sample type is determined: for a human sample, the GRCh38 human reference genome is used to align the tumor tissue sequencing data and the blood sequencing data; for a mouse sample, the GRCm38 reference genome is used to align the tumor tissue sequencing data and the blood sequencing data.
1.4. Further sequencing quality statistics are computed. The Phred score of each sequencing cycle is computed for the tumor tissue data and the blood data after 1.3; the Q value of every sequencing cycle must be greater than 30, otherwise the neoantigen analysis is stopped. The sequencing depth of the tumor tissue data and the blood data after 1.3 is calculated, and the neoantigen analysis is stopped if the depths are below 200X and 100X respectively.
1.5. Duplicate reads are removed from the data after step 1.3.
1.6. The data after step 1.5 are realigned according to the known indel information in 1000 Genomes (1000G).
1.7. An existing variation database is used to build a model and generate a recalibration table. The base quality scores are then corrected according to this model.
1.8. Tumor somatic mutation analysis is performed with MuTect and MuTect2, and the variant results of the two are combined.
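The stop conditions in steps 1.2 and 1.4 amount to a simple quality gate. The following is a minimal sketch of such a gate, assuming the Q20 rate, Q30 rate and mean depth have already been extracted from the QC reports; the `QcSummary` class and the `qc_failures` function are hypothetical names, and the GC-ratio check is left out because the text gives no numeric threshold for it.

```python
from dataclasses import dataclass


@dataclass
class QcSummary:
    """Hypothetical container for values read from the QC reports."""
    q20_rate: float    # fraction of bases with Phred quality >= 20
    q30_rate: float    # fraction of bases with Phred quality >= 30
    mean_depth: float  # mean sequencing depth after alignment


def qc_failures(summary, min_depth, min_q20=0.98, min_q30=0.90):
    """Return the list of failed checks; an empty list means analysis may continue."""
    failures = []
    if summary.q20_rate < min_q20:
        failures.append("Q20 below threshold")
    if summary.q30_rate < min_q30:
        failures.append("Q30 below threshold")
    if summary.mean_depth < min_depth:
        failures.append("sequencing depth below threshold")
    return failures


# Thresholds from the text: 200X for tumor tissue, 100X for blood.
tumor = QcSummary(q20_rate=0.985, q30_rate=0.93, mean_depth=210.0)
blood = QcSummary(q20_rate=0.985, q30_rate=0.89, mean_depth=120.0)
tumor_ok = not qc_failures(tumor, min_depth=200.0)
blood_ok = not qc_failures(blood, min_depth=100.0)
```

Returning the failed checks rather than a bare boolean makes it easier to report why the neoantigen analysis was stopped.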
2, tumor tissue RNA variation analysis step
2.1. RNA-seq is performed on the tumor tissue, with 60M sequencing clusters.
2.2. The quality of the RNA-seq data is checked with FastQC software. If the quality is unacceptable, the neoantigen analysis is stopped.
2.3. Alignment is performed with STAR software. The sample type is determined: for a human sample, the GRCh38 human reference genome is used to align the tumor tissue sequencing data and the blood sequencing data; for a mouse sample, the GRCm38 reference genome is used to align the tumor tissue sequencing data and the blood sequencing data.
2.4. Further sequencing quality statistics are computed. The Phred score of each sequencing cycle is computed for the tumor tissue data and the blood data after 2.3; the Q value of every sequencing cycle must be greater than 30, otherwise the neoantigen analysis is stopped.
2.5. Duplicate reads are removed from the data after step 2.3.
2.6. A split-reads strategy is used to discover new splice junctions.
2.7. The data after step 2.6 are realigned according to the known indel information in 1000 Genomes (1000G).
2.8. An existing variation database is used to build a model and generate a recalibration table. The base quality scores are then corrected according to this model.
2.9. Mutation analysis is performed with HaplotypeCaller. In the HaplotypeCaller command, the emit_conf parameter is set to 30, the call_conf parameter is set to 25, and the ploidy parameter is set to 4.
3, MHC molecule analysis step
3.1. The HLA typing software OptiType is used to analyze HLA class I molecules in the whole-exome next-generation sequencing results of the sample tumor tissue and of the normal blood respectively;
3.2. The HLA typing software xHLA is used to analyze HLA class I and HLA class II molecules in the whole-exome next-generation sequencing results of the sample tumor tissue and of the normal blood respectively;
3.3. The HLA typing software seq2HLA is used to analyze HLA class I and HLA class II molecules in the tumor tissue transcriptome sequencing results;
3.4. The results of 3.1, 3.2 and 3.3 are combined to finally determine the HLA typing result. If the three results differ greatly, an alarm is raised, the program exits, and the neoantigen analysis is stopped.
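Step 3.4 does not state how the three typing results are combined. One plausible reading is a per-locus majority vote, sketched here under that assumption; the function name `consensus_hla`, the input layout and the `min_support` rule are all hypothetical and are not taken from the patent.

```python
from collections import Counter


def consensus_hla(calls, min_support=2):
    """Majority-vote HLA genotype per locus.

    calls maps a tool name to {locus: (allele1, allele2)}.
    Raises ValueError when no genotype reaches min_support at a locus,
    mirroring the "alarm and exit" behaviour described in step 3.4.
    """
    loci = set()
    for per_tool in calls.values():
        loci.update(per_tool)

    consensus = {}
    for locus in sorted(loci):
        # Sort each allele pair so that allele order does not affect the vote.
        votes = Counter(
            tuple(sorted(per_tool[locus]))
            for per_tool in calls.values()
            if locus in per_tool
        )
        genotype, support = votes.most_common(1)[0]
        if support < min_support:
            raise ValueError(f"HLA typing tools disagree at locus {locus}: {dict(votes)}")
        consensus[locus] = genotype
    return consensus


example_calls = {
    "OptiType": {"A": ("A*02:01", "A*24:02")},
    "xHLA": {"A": ("A*24:02", "A*02:01")},
    "seq2HLA": {"A": ("A*02:01", "A*11:01")},
}
example_consensus = consensus_hla(example_calls)
```

With this rule, class II loci that are typed by only two tools are accepted only when both tools agree.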
4, tumor tissue RNA expression analysis step
4.1. The tumor tissue transcriptome sequencing results are aligned. The sample type is determined: for a human sample, the grch38_tran human reference genome is used to align the tumor tissue sequencing data and the blood sequencing data; for a mouse sample, the grcm38_tran reference genome is used to align the tumor tissue sequencing data and the blood sequencing data.
4.2. The bam files output by 4.1 are sorted with samtools.
4.3. The transcriptome expression levels are calculated with stringtie.
4.4. The TPM (Transcripts Per Million) values of the transcripts are extracted from the gtf file generated in 4.3.
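Step 4.4 can be illustrated with a short parser. This sketch assumes the gtf file follows the usual StringTie layout, in which each `transcript` line carries `transcript_id` and `TPM` attributes in the ninth column; the function name `transcript_tpm` is hypothetical.

```python
import re

# Matches GTF attributes of the form: key "value";
_ATTRIBUTE = re.compile(r'(\S+) "([^"]*)";')


def transcript_tpm(gtf_text):
    """Return {transcript_id: TPM} for every transcript line in a GTF string."""
    tpm_by_transcript = {}
    for line in gtf_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and header comments
        fields = line.split("\t")
        if len(fields) < 9 or fields[2] != "transcript":
            continue  # exon lines do not carry a TPM attribute
        attributes = dict(_ATTRIBUTE.findall(fields[8]))
        if "TPM" in attributes and "transcript_id" in attributes:
            tpm_by_transcript[attributes["transcript_id"]] = float(attributes["TPM"])
    return tpm_by_transcript
```

The resulting dictionary is the kind of per-transcript expression table that the later scoring step consumes as expression_TMP.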
5, variant annotation step
5.1. Genomic variant annotation of the point mutations, indel mutations and frameshift mutations among the tumor somatic variations is performed with VEP software. The sample type is determined: for a human sample, the GRCh38 human reference genome is selected; for a mouse sample, the GRCm38 reference genome is selected.
5.2. The vcf format is converted to maf format with vcf2maf software.
5.3. The tumor somatic variations are filtered: variations of the types Intron, 5'UTR, 3'UTR, IGR, 5'Flank, 3'Flank, RNA and lincRNA are removed, only variations that are not in dbSNP are kept, and the TMB (tumor mutation burden) is calculated.
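The variant filter and TMB calculation of this step might look as follows. This is a sketch under several assumptions: rows are dictionaries keyed by the MAF columns `Variant_Classification` and `dbSNP_RS`, an empty or `novel` value in `dbSNP_RS` is taken to mean that the variant is not in dbSNP, and TMB is taken as mutations per megabase of covered sequence, which is a common definition that the text itself does not spell out.

```python
# Variant classes removed according to the text.
EXCLUDED_CLASSES = {
    "Intron", "5'UTR", "3'UTR", "IGR", "5'Flank", "3'Flank", "RNA", "lincRNA",
}

# Assumed markers for "no dbSNP record" in the dbSNP_RS column.
NOT_IN_DBSNP = {"", ".", "novel"}


def filter_variants(rows):
    """Keep variants outside the excluded classes that are absent from dbSNP."""
    kept = []
    for row in rows:
        if row["Variant_Classification"] in EXCLUDED_CLASSES:
            continue
        if row.get("dbSNP_RS", "") not in NOT_IN_DBSNP:
            continue
        kept.append(row)
    return kept


def tumor_mutation_burden(n_variants, covered_megabases):
    """Mutations per megabase (assumed definition of TMB)."""
    if covered_megabases <= 0:
        raise ValueError("covered_megabases must be positive")
    return n_variants / covered_megabases
```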
6, tumor driver gene analysis step
With reference to the COSMIC tumor database, driver gene analysis is performed on the point mutations, insertion-deletion mutations and frameshift mutations among the tumor somatic variations.
7, HLA molecule affinity prediction step
The HLA molecule types of the tumor sample obtained in the MHC molecule analysis step, the predicted mutant peptide fragments obtained in the mutant peptide prediction step, and the wild-type peptide sequences corresponding to the predicted mutant peptide fragments are used as the input of MHC class I and MHC class II affinity prediction software, which predicts the affinity of the mutant peptide fragments for MHC class I and MHC class II molecules respectively. For MHC class I molecules, the mutant peptide fragments are predicted with the NNAlign artificial neural network algorithm, which combines binding affinity data and MS eluted ligand data.
7.1. The sample type is determined: for a human sample, the GRCh38 cDNA reference sequences and the GRCh38 peptide sequences are selected; for a mouse sample, the GRCm38 cDNA reference sequences and the GRCm38 peptide sequences are selected. For SNP, insertion and deletion variations, the wild-type and mutant peptide chains of the corresponding predicted lengths are looked up using the vcf2maf result from 5.2; for frameshift mutations, the wild-type and mutant peptide chains of the corresponding predicted lengths are looked up from the cDNA sequences and the peptide reference sequences.
7.2. The HLA class I molecules generated in step 3 are analyzed with netMHCpan-4.0, which is run with the parameters that integrate binding affinity (BA) data and mass spectrometry (MS) data enabled, so that information is obtained from two different angles. netMHCpan-4.0 first performs the necessary screening of the MHC class I data in the IEDB database and then trains its model on binding affinity (BA) data and MS eluted ligand data; the information from the two data types is integrated by an artificial neural network method based on the NNAlign framework, which predicts the affinity with which a peptide binds a specific MHC molecule and accommodates peptides of different lengths. The netMHCpan-4.0 method improves prediction accuracy for tumor neoantigens, validated eluted ligands (ELs) and T cell epitopes. netMHCpan-4.0 is used to predict and score the affinity of the HLA class I molecules for the wild-type peptide chains of 8 to 15 residues and the mutant peptide chains of 8 to 15 residues generated in 7.1.
7.3. The HLA class II molecules generated in step 3 are analyzed with netMHCIIpan-3.2. The affinity of the HLA class II molecules for the wild-type peptide chains of 8 to 15 residues and the mutant peptide chains of 8 to 15 residues generated in 7.1 is predicted and scored.
7.4. The HLA class I and class II molecules generated in step 3 are analyzed with netMHC, with the affinity threshold set at 500 nM. The affinity of the HLA class I molecules for the wild-type peptide chains of 8 to 15 residues and the mutant peptide chains of 8 to 15 residues generated in 7.1 is predicted and scored.
7.5. The HLA class I and class II molecules generated in step 3 are analyzed with NetMHCcons, with the affinity threshold set at 500 nM. The affinity of the HLA class I molecules for the wild-type peptide chains of 8 to 15 residues and the mutant peptide chains of 8 to 15 residues generated in 7.1 is predicted and scored.
7.6. The median of the HLA class I affinity scores from the above three software tools is taken as the final affinity score.
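Two parts of step 7 lend themselves to a short illustration: enumerating the 8 to 15 residue peptide windows that cover a mutated position (7.1), and taking the median of the three class I predictions (7.6). The sketch below handles only a single amino acid substitution; the function names are hypothetical, and the affinity values would in practice come from the prediction tools named in the text.

```python
from statistics import median


def substitute(protein, pos, alt):
    """Return the protein with the residue at 0-based index pos replaced by alt."""
    return protein[:pos] + alt + protein[pos + 1:]


def peptide_windows(protein, pos, min_len=8, max_len=15):
    """All peptides of min_len..max_len residues that cover 0-based index pos."""
    windows = set()
    for k in range(min_len, max_len + 1):
        first = max(0, pos - k + 1)        # leftmost start that still covers pos
        last = min(pos, len(protein) - k)  # rightmost start that fits in the protein
        for start in range(first, last + 1):
            windows.add(protein[start:start + k])
    return sorted(windows)


def final_affinity(scores):
    """Median of the per-tool class I affinity scores, as in step 7.6."""
    return median(scores)


wild_type = "ACDEFGHIKLMNPQRSTVWY"
mutant = substitute(wild_type, 10, "A")
wild_type_peptides = peptide_windows(wild_type, 10)
mutant_peptides = peptide_windows(mutant, 10)
```

Generating the wild-type and mutant windows from the same position keeps the two lists aligned, so each mutant peptide can be compared with its wild-type counterpart.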
8, tumor mutation frequency analysis step
Tumor mutation frequency analysis software is used to detect the frequency of the tumor mutation among all DNA at that gene locus; the higher the mutation frequency, the higher the proportion of tumor cells. The mutation frequency is read from the VCF file: for results produced by MuTect, the FA field of the VCF file is read; for results produced by MuTect2, the AF field of the VCF file is read.
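Reading the mutation frequency can be sketched with plain string handling. This example assumes that FA (MuTect) and AF (MuTect2) are stored as per-sample FORMAT values and that the tumor sample's column index is known; `tumor_allele_frequency` and its parameters are hypothetical names.

```python
def tumor_allele_frequency(vcf_line, caller, sample_index=0):
    """Read the allele frequency of one sample from a single VCF data line.

    caller is "mutect" (reads FA) or "mutect2" (reads AF).
    sample_index selects the sample column, counting from the first sample.
    """
    keys_by_caller = {"mutect": "FA", "mutect2": "AF"}
    if caller not in keys_by_caller:
        raise ValueError(f"unknown caller: {caller}")

    fields = vcf_line.rstrip("\n").split("\t")
    format_keys = fields[8].split(":")                  # FORMAT column
    sample_values = fields[9 + sample_index].split(":")
    sample = dict(zip(format_keys, sample_values))

    # Multi-allelic sites list one value per ALT allele; take the first.
    return float(sample[keys_by_caller[caller]].split(",")[0])
```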
9, comprehensive scoring and ranking step for candidate tumor neoantigens
Each predicted mutant peptide fragment among the candidate tumor neoantigens is scored according to influencing factors such as MHC affinity, antigen expression abundance, comparison with the wild-type peptide, tumor mutation frequency, whether the variation is an RNA mutation, and whether it lies in a tumor driver gene; the fragments are sorted by score from high to low, and those with high scores are selected as tumor neoantigens.
9.1. Formula one
Neoantigen score calculation formula:
Figure DEST_PATH_IMAGE002
The neoantigen scores of the peptide chains of 8 to 15 residues from 7.6 are calculated separately, and all neoantigens are sorted by score in descending order.
9.2. Formula two
Mutant_affinity_score calculation formula:
$$\text{Mutant\_affinity\_score} = \Delta \cdot \left(1 + e^{\,\text{mutant\_score} \cdot 10^{-5}}\right)$$
Here mutant_score is the affinity score of the mutant peptide chain calculated in 7.6; the affinity is used as the exponent of the natural base e in the calculation. Δ is the number of tumor variations: for an SNP variation, Δ is 1; for an insertion or deletion, Δ is the specific number of insertions or deletions; for a frameshift variation, Δ is the specific number of frameshifts. FIG. 1 shows the curve when Δ is assumed to be 3.
9.3. Formula three
Expression_score calculation formula:
$$\text{Expression\_score} = \tanh(\text{expression\_TMP})$$
Figure DEST_PATH_IMAGE008
As shown in FIG. 2, expression_TMP is the expression level generated in 4.4; once the transcript expression level reaches a certain level, the value of Expression_score is 1.
9.4. Formula four
Normal_affinity_score calculation formula:
$$\text{Normal\_affinity\_score} = \frac{1}{1 + e^{\,\text{normal\_score} \cdot 10^{-5}}}$$
As shown in FIG. 3, normal_score is the affinity score of the wild-type peptide chain calculated in 7.6; the affinity is used as the exponent of the natural base e, and the reciprocal is then taken to obtain the value.
9.5. Formula five
Figure DEST_PATH_IMAGE009
$$\alpha = 0.99 \cdot \text{allele\_frequency} + 0.9 \cdot \text{TMB} + 0.1 \cdot \text{in\_RNA\_mutant} + 0.1 \cdot \text{is\_cancer\_driven\_gene}$$
allele_frequency: the tumor mutation frequency calculated in step 8.
TMB: the tumor mutation burden calculated in step 5.
in_RNA_mutant: whether the tumor variation is among the RNA variations from 2.9.
is_cancer_driven_gene: whether the gene is a tumor driver gene according to step 6.
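The component scores of formulas two to five can be written out directly. The sketch below reads the exponent in formulas two and four as score × 10^-5, which is an assumption about the flattened superscripts in the source text; the combining formula one is given only as an image and is deliberately not reproduced. Function and parameter names are hypothetical.

```python
import math


def mutant_affinity_score(mutant_score, delta):
    """Formula two, reading the exponent as mutant_score * 1e-5 (assumption)."""
    return delta * (1.0 + math.exp(mutant_score * 1e-5))


def expression_score(expression_tpm):
    """Formula three: tanh saturates towards 1 as expression grows."""
    return math.tanh(expression_tpm)


def normal_affinity_score(normal_score):
    """Formula four, reading the exponent as normal_score * 1e-5 (assumption)."""
    return 1.0 / (1.0 + math.exp(normal_score * 1e-5))


def alpha(allele_frequency, tmb, in_rna_mutant, is_cancer_driver_gene):
    """Weighted sum from formula five; the two flags are booleans."""
    return (
        0.99 * allele_frequency
        + 0.9 * tmb
        + 0.1 * float(in_rna_mutant)
        + 0.1 * float(is_cancer_driver_gene)
    )
```

Because tanh is already very close to 1 for inputs above about 3, transcripts with even a modest TPM receive nearly the full expression score, which matches the saturation described for FIG. 2.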
10, polypeptide synthesis difficulty analysis step
Taking the predicted mutant peptide as the center, the peptide chain is extended by 25 to 30 residues to the left and to the right respectively, and polypeptide synthesis difficulty analysis software is used to analyze the synthesis difficulty of the candidate tumor neoantigens in terms of molecular weight, isoelectric point, net charge at pH 7, average hydrophilicity and hydrophilic residue ratio.
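Some of the descriptors named in this step can be approximated with standard tables. The sketch below pads the mutant peptide with flanking residues and computes average hydrophilicity on the Hopp-Woods scale, a hydrophilic residue ratio and an approximate net charge at pH 7. The choice of scale, the definition of "hydrophilic" as a positive Hopp-Woods value, and the pKa values are all assumptions, since the text does not name the analysis software or its parameters; molecular weight and isoelectric point are omitted for brevity.

```python
# Hopp-Woods hydrophilicity values (assumed scale).
HOPP_WOODS = {
    "R": 3.0, "D": 3.0, "E": 3.0, "K": 3.0, "S": 0.3, "N": 0.2, "Q": 0.2,
    "G": 0.0, "P": 0.0, "T": -0.4, "H": -0.5, "A": -0.5, "C": -1.0,
    "M": -1.3, "V": -1.5, "I": -1.8, "L": -1.8, "Y": -2.3, "F": -2.5, "W": -3.4,
}

# Approximate side-chain and terminal pKa values (assumed; published sets differ).
POSITIVE_PKA = {"K": 10.8, "R": 12.5, "H": 6.5}
NEGATIVE_PKA = {"D": 3.9, "E": 4.1, "C": 8.5, "Y": 10.1}
N_TERM_PKA, C_TERM_PKA = 8.6, 3.6


def long_peptide(protein, start, end, flank=30):
    """Extend the peptide protein[start:end] by up to flank residues on each side."""
    return protein[max(0, start - flank):min(len(protein), end + flank)]


def average_hydrophilicity(peptide):
    return sum(HOPP_WOODS[aa] for aa in peptide) / len(peptide)


def hydrophilic_ratio(peptide):
    """Fraction of residues with a positive Hopp-Woods value."""
    return sum(1 for aa in peptide if HOPP_WOODS[aa] > 0) / len(peptide)


def net_charge(peptide, ph=7.0):
    """Henderson-Hasselbalch estimate of the net charge at the given pH."""
    charge = 1.0 / (1.0 + 10 ** (ph - N_TERM_PKA))   # N-terminal amine
    charge -= 1.0 / (1.0 + 10 ** (C_TERM_PKA - ph))  # C-terminal carboxyl
    for aa in peptide:
        if aa in POSITIVE_PKA:
            charge += 1.0 / (1.0 + 10 ** (ph - POSITIVE_PKA[aa]))
        elif aa in NEGATIVE_PKA:
            charge -= 1.0 / (1.0 + 10 ** (NEGATIVE_PKA[aa] - ph))
    return charge
```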
11, candidate tumor neoantigen final selection step
The tumor neoantigens to be synthesized are finally selected according to the score from step 9 and the polypeptide synthesis difficulty from step 10.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents, improvements and the like that fall within the spirit and principle of the present invention are intended to be included therein. Any reference sign in a claim should not be construed as limiting the claim concerned.
Furthermore, it should be understood that although this description is organized by embodiments, not every embodiment contains only a single independent technical solution. The description is written this way for clarity only; those skilled in the art should treat the description as a whole, and the technical solutions of the embodiments may be combined as appropriate to form other embodiments that those skilled in the art can understand.

Claims (7)

1. A method for screening tumor neoantigens, characterized by comprising the following specific steps:
step one, selecting the variation type of tumor somatic cells;
step two, performing RNA variation analysis on the tumor somatic cells;
step three, performing MHC molecule analysis on normal blood and tumor somatic cells respectively;
step four, performing RNA expression analysis on the tumor somatic cells;
step five, performing variation annotation on the tumor somatic cells;
step six, analyzing the variation driver genes of the tumor somatic cells;
step seven, predicting the HLA molecule binding affinity of the tumor somatic cells;
step eight, analyzing the mutation frequency of the tumor somatic cells;
step nine, comprehensively scoring and ranking candidate tumor neoantigens;
step ten, analyzing the synthesis difficulty of the candidate tumor neoantigens;
step eleven, integrating the results of step nine and step ten to select the final tumor neoantigens.
2. The method for screening tumor neoantigens according to claim 1, wherein the variation types of the tumor somatic cells in step one include DNA point mutations, insertion/deletion mutations, and frameshift mutations of the tumor somatic cells.
3. The method for screening tumor neoantigens according to claim 1, wherein the variation annotation in step five comprises annotating the point mutations, insertion/deletion mutations, and frameshift mutations among the tumor somatic variations.
4. The method for screening tumor neoantigens according to claim 1 or 2, wherein in step six the variation driver gene analysis is performed on the point mutations, insertion/deletion mutations, and frameshift mutations in the tumor somatic cells.
5. The method for screening tumor neoantigens according to claim 1, wherein in step seven the HLA molecule binding affinity of the tumor somatic cells is predicted from the HLA molecule type of the tumor somatic cells, the predicted mutant peptide obtained in the mutant peptide prediction step, and the wild-type peptide sequence corresponding to that predicted mutant peptide.
6. The method for screening tumor neoantigens according to claim 1 or 3, wherein the ranking in step nine is based on the MHC affinity of the tumor somatic cells, the antigen expression abundance of the tumor somatic cells, the degree of contrast with the wild-type peptide, the mutation frequency of the tumor somatic cells, whether the variation is an RNA mutation, and whether the gene is a tumor driver gene.
7. The method for screening tumor neoantigens according to claim 1, wherein step ten comprises analyzing the synthesis difficulty of the candidate tumor neoantigens in terms of molecular weight, isoelectric point, net charge at pH 7, average hydrophilicity, and hydrophilic residue ratio.
CN201910242904.8A 2019-03-28 2019-03-28 Screening method of tumor neoantigen Pending CN111755067A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910242904.8A CN111755067A (en) 2019-03-28 2019-03-28 Screening method of tumor neoantigen

Publications (1)

Publication Number Publication Date
CN111755067A true CN111755067A (en) 2020-10-09

Family

ID=72671533

Country Status (1)

Country Link
CN (1) CN111755067A (en)

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104662171A (en) * 2012-07-12 2015-05-27 普瑟姆尼股份有限公司 Personalized cancer vaccines and adoptive immune cell therapies
EP3323070A1 (en) * 2015-07-14 2018-05-23 Personal Genome Diagnostics Inc. Neoantigen analysis
CN108351916A (en) * 2015-07-14 2018-07-31 个人基因组诊断公司 Neoantigen is analyzed
WO2019008365A1 (en) * 2017-07-05 2019-01-10 The Francis Crick Institute Limited Method for treating cancer by targeting a frameshift indel neoantigen
CN108796055A (en) * 2018-06-12 2018-11-13 深圳裕策生物科技有限公司 Tumor neogenetic antigen detection method, device and storage medium based on the sequencing of two generations

Cited By (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112309502A (en) * 2020-10-14 2021-02-02 深圳市新合生物医疗科技有限公司 Method and system for calculating tumor neoantigen load
CN112309502B (en) * 2020-10-14 2024-09-20 深圳市新合生物医疗科技有限公司 Method and system for calculating tumor neoantigen load
CN112466396A (en) * 2020-12-04 2021-03-09 中山大学附属第一医院 Screening method of tumor high-affinity new antigen and application of tumor high-affinity new antigen in indication of treatment prognosis curative effect of PD-1 of liver cancer patient
CN113322233A (en) * 2021-04-19 2021-08-31 格源致善(上海)生物科技有限公司 Improved preparation method and application of reactive T cells based on neoantigens
CN113517021A (en) * 2021-06-09 2021-10-19 海南精准医疗科技有限公司 Cancer driver gene prediction method
CN113517021B (en) * 2021-06-09 2022-09-06 海南精准医疗科技有限公司 Cancer driver gene prediction method
CN114446389A (en) * 2022-02-08 2022-05-06 上海科技大学 Tumor neoantigen characteristic analysis and immunogenicity prediction tool and application thereof
CN114446389B (en) * 2022-02-08 2024-05-14 上海科技大学 Tumor neoantigen feature analysis and immunogenicity prediction tool and application thereof
CN114882951A (en) * 2022-05-27 2022-08-09 深圳裕泰抗原科技有限公司 Method and device for detecting MHC II tumor neoantigen based on next generation sequencing data
CN114882951B (en) * 2022-05-27 2022-12-27 深圳裕泰抗原科技有限公司 Method and device for detecting MHC II tumor neoantigen based on next generation sequencing data
CN117174166A (en) * 2023-10-26 2023-12-05 北京基石京准诊断科技有限公司 Tumor neoantigen prediction method and system based on third-generation sequencing data
CN117174166B (en) * 2023-10-26 2024-03-26 北京基石生命科技有限公司 Tumor neoantigen prediction method and system based on third-generation sequencing data

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination