CN109584960A - Method, apparatus and storage medium for predicting tumor neoantigens - Google Patents

Method, apparatus and storage medium for predicting tumor neoantigens

Info

Publication number
CN109584960A
Authority
CN
China
Prior art keywords
mutation
peptide
fusion
wild
length
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201811531729.6A
Other languages
Chinese (zh)
Other versions
CN109584960B (en)
Inventor
叶浩
李祥永
戴珩
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xukang medical technology (Suzhou) Co., Ltd
Original Assignee
Shanghai Whale Boat Gene Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Whale Boat Gene Technology Co Ltd
Priority to CN201811531729.6A
Publication of CN109584960A
Application granted
Publication of CN109584960B
Legal status: Active
Anticipated expiration

Landscapes

  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)
  • Peptides Or Proteins (AREA)

Abstract

The present invention relates to a method for predicting tumor neoantigens, comprising the steps of: (1) detecting somatic mutations and gene fusions from paired tumor and germline control samples; (2) generating a fusion peptide and the corresponding wild-type peptides for each pair of fused genes; (3) generating mutant peptides and the corresponding wild-type peptides for each somatic mutation; (4) constructing a personal genome specific to the tumor sample and generating mutant peptides that contain multiple mutations; (5) determining whether the single-mutation and multi-mutation peptides actually exist; (6) removing mutant peptides whose sequence is identical to a sequence at another position of a wild-type protein; (7) performing HLA typing, predicting the affinity between each neopeptide and the HLA molecules, and taking the high-affinity neopeptides as candidate tumor neoantigens. The invention also provides a corresponding apparatus and computer storage medium. With the method, apparatus and storage medium of the invention, biomarkers of tumor therapy response can be assessed effectively, and accurate candidate peptides can be provided for tumor vaccine design.

Description

Method, apparatus and storage medium for predicting tumor neoantigens
Technical field
The present invention relates to the field of bioinformatics, in particular to biomarker discovery for tumor immunotherapy, and more specifically to a method for predicting the tumor neoantigens formed by somatic mutations and gene fusions, and to its application.
Background art
Tumor neoantigens
A tumor neoantigen is a "non-self" novel protein polypeptide, not originally present in the human body, that is recognized by human antigen-presenting cells. These "non-self" neopeptides derive mainly from the breakdown of mutant proteins formed by mutations in tumor cells. The biological process of neoantigen presentation can be divided into 5 steps: 1. antigen-presenting cells (APCs) take up tumor cells by endocytosis and cleave the proteins in the tumor cells (including mutant proteins) into short peptides; 2. transporter proteins inside the APC (TAP, endosome) carry these peptides into the endoplasmic reticulum; 3. the grooves of the HLA class I and class II molecules expressed in the endoplasmic reticulum anchor the peptides and form stable complexes (class I molecules bind peptides of 8~11 amino acids, class II molecules bind peptides of 13~25 amino acids); 4. the MHC molecule-peptide complexes in the endoplasmic reticulum are secreted through the Golgi apparatus to the APC surface; 5. the T cell receptor (TCR) on the surface of immune T cells recognizes the HLA molecule-peptide complexes on the APC surface and triggers the subsequent immune response. Tumor neoantigens are the key factor that triggers the primary immune response of the body's immune system against tumor cells.
Application of tumor neoantigens in tumor immunity
Immunotherapy aims to restore the immune system's ability to recognize and kill tumor cells and thereby eliminate the tumor. Unlike traditional chemotherapy and radiotherapy, which kill healthy cells and tumor cells alike, and unlike tyrosine kinase inhibitors and similar drugs, which kill directly by blocking the signaling pathways for tumor cell growth and survival, immunotherapy is a new and efficient mode of tumor treatment. In 2013, cancer immunotherapy was ranked first among the "top ten scientific breakthroughs of the year" by the US journal Science. The 2018 Nobel Prize also went to the field of immunotherapy. Despite this momentum, the strategy of using the immune system to attack tumors is effective only for certain cancers and a minority of patients. Without any biomarker screening, the overall response rate of most solid tumors is below 30%. For tumors screened with microsatellite instability-high/mismatch repair deficiency as the biomarker, the overall response rate of PD1 therapy can rise above 50%. Selecting patients for immunotherapy with a suitable biomarker is therefore the key to precision medicine in tumor immunity. In October 2018, tumor mutational burden was formally written into the non-small cell lung cancer practice guidelines of the US National Comprehensive Cancer Network. Neoantigens, as the final effectors through which tumor mutational burden triggers an immune response, may become a more accurate biomarker for assessing the benefit of immunotherapy. Personalized tumor vaccine therapy based on tumor neoantigens is another important application scenario. A tumor vaccine returns the neoantigens detected in a patient's tumor cells to the body to stimulate an immune response that specifically eliminates the tumor cells presenting these neoantigens. At present, neoantigens are returned to the body in the form of peptides, nucleic acids, or DC cells induced in vitro. Ott (PMID:29542692), Sahin (PMID:28678784), Carreno (PMID:25837513) et al. applied predicted neoantigens to small skin cancer cohorts in these three tumor vaccine forms, respectively, and obtained good therapeutic effects. In summary, tumor neoantigens can serve as a biomarker for assessing the benefit of immunotherapy and can also be applied directly in tumor vaccine therapy.
Existing neoantigen prediction pipelines and methods
Whole-exome hybrid-capture sequencing based on next-generation sequencing makes high-throughput detection of tumor somatic mutations possible. The common neoantigen prediction pipeline at present is: 1. build a neopeptide library: annotate the somatic mutations at the protein level and, around each mutated position in the protein, enumerate the mutant peptides of 8~11 and 13~25 amino acids and the corresponding wild-type peptides; 2. predict the affinity of the HLA molecules for each neopeptide and its corresponding wild-type peptide: use open-source affinity prediction software to predict the affinity of the neopeptides and wild-type peptides for the HLA molecules, filter with empirical thresholds, and select the potential neoantigens. The affinity prediction software generally used for class I and class II HLA molecules is NetMHC, netMHCpan, netMHCII and netMHCIIpan, developed at the Technical University of Denmark (http://www.cbs.dtu.dk/services/). There are currently two mainstream open-source neoantigen prediction packages: Topiary (https://github.com/openvax/topiary), developed by the OpenVax project group at the Icahn School of Medicine at Mount Sinai, and pVACtools (https://github.com/griffithlab/pVACtools), developed by the Malachi Griffith laboratory at the McDonnell Genome Institute, Washington University. Both packages have been used in several research papers on large TCGA tumor datasets published in authoritative journals such as Cancer Cell and Immunity (PMID:29657128, PMID:29628290). Because Topiary and pVACtools both use NetMHC, netMHCpan and similar tools for affinity prediction, that part is not described further here. For neopeptide library generation, both packages generate single-mutation peptides, one mutation at a time. This has an obvious design flaw, as shown in Fig. 1: if two or more cis mutations occur within a span of 8~11 or 13~25 amino acids, the conventional method loses the neopeptides that contain several mutations. Yet these missing neopeptides formed by multiple cis mutations may also be neoantigens that are truly immunogenic in the body. In addition, these open-source pipelines simply call every mutant polypeptide formed by a mutation a neopeptide. In fact, some mutant polypeptides are not true neopeptides, because the same sequence may already exist at another position of a wild-type protein sequence, away from the mutated position. This is especially common when the mutations fall in low-complexity sequence regions (for example, insertions or deletions in repetitive sequence). For example, the polypeptide sequence at positions 523~530 of the wild-type PRX protein is LKVSEMKL, and the polypeptide sequence at positions 471~478 is PKVSEMKL. The mutation chr19:40902691A>G changes amino acid 523 of PRX from L to P. The resulting mutant polypeptide PKVSEMKL differs from the wild-type peptide at positions 523~530 but is identical to the peptide at positions 471~478 of the wild-type PRX protein. This mutant polypeptide therefore already exists in the wild-type PRX protein and is not a true neopeptide for the body. These design flaws of the conventional methods directly affect the accuracy of neoantigen prediction.
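The PRX case above can be illustrated with a minimal sketch. The protein below is a toy stand-in, not the real PRX sequence: only the two octamers quoted in the text are placed at positions 471~478 and 523~530, and everything else is filler. The function name `is_true_neopeptide` is hypothetical.

```python
def is_true_neopeptide(peptide, wild_type_proteins):
    """Return False if the peptide already occurs anywhere in a wild-type protein."""
    return not any(peptide in seq for seq in wild_type_proteins)

# Toy wild-type protein (assumption): filler residues plus the two octamers
# from the text at 1-based positions 471~478 and 523~530.
toy_wild = "G" * 470 + "PKVSEMKL" + "G" * 44 + "LKVSEMKL" + "G" * 20

# L523P: replace the residue at 1-based position 523 (0-based index 522).
toy_mutant = toy_wild[:522] + "P" + toy_wild[523:]
mutant_peptide = toy_mutant[522:530]   # octamer starting at the mutated residue
```

Here `is_true_neopeptide(mutant_peptide, [toy_wild])` is false, because the octamer already exists at positions 471~478, so the peptide would be dropped from the neopeptide library.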
Summary of the invention
The purpose of the present invention is to overcome the above shortcomings of the prior art and to provide a method for predicting tumor neoantigens that allows more effective assessment of biomarkers of tumor therapy response and provides accurate candidate peptides for tumor vaccine design.
To achieve this purpose, one aspect of the present invention provides a method for predicting tumor neoantigens, constituted as follows:
The method comprises the steps of:
(1) detecting somatic mutations and gene fusions from paired tumor and germline control samples;
(2) generating a fusion peptide and the corresponding wild-type peptides for each pair of fused genes;
(3) generating mutant peptides and the corresponding wild-type peptides for each somatic mutation;
(4) constructing a personal genome specific to the tumor sample and generating multi-mutation peptides that contain multiple mutations;
(5) using the cis/trans relationship between mutations to determine whether the single-mutation and multi-mutation peptides actually exist, and generating the mutant peptides that truly exist;
(6) removing mutant peptides whose sequence is identical to a sequence at another position of a wild-type protein, and constructing the complete neopeptide library;
(7) performing HLA typing based on the bam file of the germline control sample, predicting the affinity between each neopeptide and the HLA molecules, and taking the high-affinity neopeptides as candidate tumor neoantigens.
Preferably, for the somatic mutation detection in step (1), after the Mutect2 tool outputs the somatic mutation results under default parameters, a further quality control filter is applied, which requires that: the mutation frequency is greater than 2%; the sequencing depth at the mutation site is greater than 10; and at least 2 reads support the mutation, with an average base quality of those reads > 20.
Preferably, for the gene fusion detection in step (1), if the input is whole-exome (WES) or whole-genome (WGS) sequencing data, gene fusions are detected with the FACTERA tool under default parameters; if the input is RNAseq data, gene fusions are detected with the STAR-Fusion tool under default parameters. A further quality control step then requires the number of junction reads to be >= 1 in order to reduce false positives, where junction reads are reads that directly cover the fusion breakpoint.
Preferably, in step (2), the fusion breakpoint annotation tool AGFusion annotates the fusion breakpoint according to the genomic coordinates of the 5' and 3' breakpoints and synthesizes the full-length fusion protein sequence; fusion peptides of length L that contain the fusion breakpoint are then cut out, together with the corresponding 5' and 3' wild-type peptides.
Preferably, the specific cutting rules are:
Determine the coordinate index of the 5' fusion breakpoint on the 5' wild-type protein of length p5 and on the fusion protein of length g: align the fusion protein with the 5' wild-type protein sequence to obtain the longest identical fragment seq1, the coordinate index m of seq1 on the 5' wild-type protein and its coordinate index t on the fusion protein; with s1 the length of seq1, the coordinate index of the 5' breakpoint is m+s1 on the wild-type protein and t+s1 on the fusion protein;
Determine the coordinate index of the 3' fusion breakpoint on the 3' wild-type protein of length p3: align the fusion protein with the 3' wild-type protein to obtain the longest identical fragment seq2 and the coordinate index n of seq2 on the 3' wild-type protein, where the length of seq2 is s2;
Cut the fusion peptides of length L and the corresponding 5' and 3' wild-type peptides:
When the 3' fusion breakpoint does not cause a frameshift, each fusion peptide has two corresponding wild-type peptides, one from the 5' end and one from the 3' end. On the fusion protein, fusion peptides of length L are cut from the minimum coordinate index t+s1-L to the maximum coordinate index t; on the 5' wild-type protein, from the minimum coordinate index m+s1-L to the maximum coordinate index m+s1, and on the 3' wild-type protein, from n-L to the maximum coordinate index n, the two corresponding wild-type peptides of length L are cut;
When the 3' fusion breakpoint causes a frameshift, each fusion peptide has only one corresponding 5' wild-type peptide. On the fusion protein, fusion peptides of length L are cut from the minimum coordinate index t+s1-L to the maximum coordinate index g-L; on the 5' wild-type protein, the corresponding wild-type peptides of length L are generated in order from the minimum coordinate index m+s1-L to the maximum coordinate index p5-L.
Preferably, in step (3),
SnpEff is used to annotate the base mutation at each somatic genomic position onto each transcript in the Ensembl database and onto the corresponding protein sequence;
Mutant peptides of length L and the corresponding wild-type peptides are cut out.
Preferably, the cutting rules are:
For missense mutations and non-frameshift mutations, taking the mutation coordinate as the center, L-1 amino acids are taken toward the 5' end and L-1 amino acids toward the 3' end, generating mutant peptides of 8~11 and 13~25 amino acids that contain the mutated amino acid, together with the corresponding wild-type peptides;
A typical single point mutation yields 38 mutant/wild-type peptide pairs of 8~11 amino acids and 247 mutant/wild-type peptide pairs of 13~25 amino acids;
For frameshift mutations, starting L-1 amino acids before the mutation site and extending until the first stop codon appears, mutant peptides of 8~11 and 13~25 amino acids are generated together with the wild-type peptides at the corresponding coordinates.
Preferably, in step (4),
The mutation information in the VCF file of somatic mutations is imported into the human reference genome in a single pass, replacing the wild-type bases at the original coordinates, to generate the complete personal genome of the tumor sample;
The mutation-containing transcripts, edited at the genomic base level, are translated into mutant protein sequences with the Biopython tool;
Based on the mutant protein coordinate index of each transcript annotated for the single mutations in step (3), and following the peptide cutting rules of step (3), mutant polypeptides of 8~11 and 13~25 amino acids that contain the mutations are generated;
In step (5):
The bam file of the tumor sample is read with pysam, the reads aligned to each mutation site are exported, and the Jaccard coefficient is calculated for every pair of mutation sites: $J(i,j) = \frac{|f(i) \cap f(j)|}{|f(i) \cup f(j)|}$, where f(i) and f(j) denote the sets of NGS reads that support mutation i and mutation j, respectively. When the Jaccard coefficient is 0, no tumor subclone carries mutation i and mutation j at the same time, and the multi-mutation peptides generated in step (4) that contain both mutation i and mutation j must be removed; when the Jaccard coefficient is 1, all tumor subclones carry mutation i and mutation j at the same time, and the single-mutation peptides generated in step (3) that contain only mutation i or only mutation j must be removed; when the Jaccard coefficient is between 0 and 1, both the single-mutation peptides generated in step (3) that contain mutation i or mutation j and the multi-mutation peptides generated in step (4) that contain both mutation i and mutation j are retained.
Preferably, in step (7), five different NGS-based HLA typing tools are used, and the HLA typing result with the highest consistency among them is taken in order to reduce false positives; preferably, Polysolver, HLA-HD, HLA-PRG-LA, OptiType and Hla-genotyper are used to compute the HLA typing. For each HLA allele in each of the 8 major HLA classes, the initial score is 0; each time the allele is detected by one tool, its score increases by 1; the highest-scoring HLA allele is taken as the final typing result for each HLA class;
The affinity between the neopeptides and the HLA molecules is predicted as follows: when the affinity between a neopeptide and an HLA molecule is <= 500 nM and the relative rank is <= 2%, the neopeptide is regarded as a candidate tumor neoantigen.
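The voting and threshold rules above can be sketched as follows. This is a minimal illustration under stated assumptions: the input layout (`calls_by_tool`, a dict of per-tool, per-locus allele lists) and the function names are hypothetical, the tool names in the test data are placeholders, and ties are kept rather than broken, since the text does not say how ties are resolved.

```python
from collections import Counter

def consensus_hla(calls_by_tool):
    """Score each allele +1 per tool that reports it; keep the top scorers per locus."""
    scores = {}
    for calls in calls_by_tool.values():
        for locus, alleles in calls.items():
            # set() so a homozygous call counts once per tool
            scores.setdefault(locus, Counter()).update(set(alleles))
    consensus = {}
    for locus, counter in scores.items():
        top = max(counter.values())
        consensus[locus] = sorted(a for a, s in counter.items() if s == top)
    return consensus

def is_candidate_neoantigen(affinity_nm, percentile_rank):
    """Thresholds from the text: affinity <= 500 nM and relative rank <= 2%."""
    return affinity_nm <= 500 and percentile_rank <= 2

# Hypothetical calls from three tools for one locus.
example_calls = {
    "tool_a": {"A": ["A*02:01", "A*24:02"]},
    "tool_b": {"A": ["A*02:01", "A*11:01"]},
    "tool_c": {"A": ["A*02:01", "A*24:02"]},
}
example_consensus = consensus_hla(example_calls)
```

In this toy input, A*02:01 is reported by all three tools and is the only top-scoring allele for locus A.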
The present invention also provides an application of the described method for predicting tumor neoantigens in the preparation of anti-tumor drugs or vaccines.
By generating a personal genome specific to the tumor sample, the tumor neoantigen prediction of the invention and its application remedy two major defects of the current mainstream methods: 1. multi-mutation peptides are split incorrectly or lost outright; 2. mutant peptides formed by mutations in low-complexity regions (which are often found in wild-type proteins) are mistaken for neopeptides. The neoantigen prediction method of the invention can therefore reflect the true neoantigen status of a tumor sample more accurately and more completely. In data from 13 hepatocellular carcinoma cases treated with immunotherapy, it was confirmed that the neoantigen load calculated by the invention can be effectively applied to assessing the benefit of tumor immunotherapy. Because the accurate and comprehensive neoantigen prediction of the invention provides a reliable source of peptides for tumor vaccine design, tumor vaccines are another potential application scenario of the invention.
Brief description of the drawings
Fig. 1 shows the difference between the prior art methods and the present invention in building the neopeptide library.
Fig. 2 is a flow diagram of the neoantigen prediction provided by the invention.
Fig. 3 compares the method provided by the invention with the open-source software Topiary and pVACtools in Embodiment 1.
Fig. 4 is a survival curve analysis of the neoantigens calculated by the invention in hepatocellular carcinoma immunotherapy samples.
Specific embodiments
To describe the technical content of the invention more clearly, it is further described below with reference to specific embodiments.
The tumor neoantigen prediction method based on constructing a tumor personal genome provided by the invention includes: rules for cutting fusion peptides that contain the fusion breakpoint and the corresponding wild-type peptides; constructing the tumor personal genome and generating mutant peptides that contain multiple mutations; fully considering tumor heterogeneity by using the Jaccard coefficient of the NGS reads at the mutation sites to measure the cis/trans relationship between mutations, which ensures the accuracy of the generated mutant peptides; generating the most consistent HLA typing result from several different tools, which ensures the accuracy of the HLA typing as far as possible; and removing mutant peptides that are identical to sequences in wild-type proteins, so that the neopeptides generated are neopeptides in the true sense.
In the method provided by the invention: generation of mutant peptides from both gene fusions and somatic mutations is provided, which ensures that the sources of mutant peptides are comprehensive; a personal genome of the tumor sample is constructed, so that mutant peptides containing multiple mutations are generated accurately; tumor heterogeneity is fully taken into account through the cis/trans relationship between mutations, which ensures that the mutant peptides reflect the true situation in the body; mutant peptides identical to wild-type protein sequences are removed, which ensures the accuracy of the neopeptides; the most consistent result among several different HLA typing tools is taken, which improves the accuracy of the HLA typing as far as possible; and the time-consuming prediction of HLA-neopeptide affinity is processed in parallel, which effectively improves computational efficiency.
Based on somatic mutations detected by whole-exome or whole-genome sequencing, the invention constructs a personal genome specific to the tumor patient and comprehensively examines the NGS read information of multiple cis mutations, providing a comprehensive and accurate method for generating neopeptides. On this basis, mainstream affinity prediction methods are integrated to compute the neoantigens, so that biomarkers of tumor therapy response can be assessed more effectively and accurate candidate peptides can be provided for tumor vaccine design.
With reference to Fig. 2, the method for predicting tumor neoantigens provided by the invention comprises the following steps:
Step1: detect somatic mutations and gene fusions from the paired tumor and germline control samples.
The open-source tools Mutect2 and FACTERA are used for somatic mutation detection and gene fusion detection, respectively (when the input file is RNAseq data, gene fusions are detected with STAR-Fusion).
To ensure the reliability of the somatic mutation results, the following quality control filter is applied after the default Mutect2 parameter filtering:
A. the mutation frequency is greater than 2%; B. the sequencing depth at the mutation site is greater than 10; C. at least 2 reads support the mutation, with an average base quality of those reads > 20.
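A minimal sketch of the three quality control criteria above, applied to variants that have already been parsed from the Mutect2 output. The `Variant` record and its field names are hypothetical, and the reading of criterion C (the mean over the supporting reads of each read's average base quality must exceed 20) is an assumption, since the text does not spell out how the average is taken.

```python
from dataclasses import dataclass
from statistics import mean

@dataclass
class Variant:
    chrom: str
    pos: int
    ref: str
    alt: str
    depth: int             # total reads covering the site in the tumor sample
    alt_read_quals: list   # one average base quality per read supporting alt

def passes_qc(v, min_vaf=0.02, min_depth=10, min_alt_reads=2, min_mean_bq=20):
    """Apply criteria A, B and C; all comparisons follow the text literally."""
    alt_reads = len(v.alt_read_quals)
    if v.depth <= min_depth:                # B: depth must be greater than 10
        return False
    if alt_reads < min_alt_reads:           # C: at least 2 supporting reads
        return False
    if alt_reads / v.depth <= min_vaf:      # A: frequency must be greater than 2%
        return False
    return mean(v.alt_read_quals) > min_mean_bq  # C: average base quality > 20

good = Variant("chr1", 100, "A", "G", depth=50, alt_read_quals=[30, 32, 28])
shallow = Variant("chr1", 200, "C", "T", depth=10, alt_read_quals=[30, 30])
rare = Variant("chr1", 300, "G", "A", depth=200, alt_read_quals=[35, 35])
noisy = Variant("chr1", 400, "T", "C", depth=50, alt_read_quals=[15, 18])
```

Only `good` passes: `shallow` fails the depth criterion, `rare` has a frequency of 1%, and `noisy` has supporting reads with low base quality.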
For the detection of gene fusion breakpoints, after the FACTERA tool (or STAR-Fusion for RNAseq data) outputs the gene fusion results under default parameters, at least one junction read (a read that directly covers the fusion breakpoint) is additionally required.
Step2: generate a fusion peptide and the corresponding wild-type peptides for each pair of fused genes.
The AGFusion tool is used to annotate each transcript of the genes at the 5' and 3' ends of the fusion breakpoint and to generate the corresponding complete fusion protein sequence. Polypeptides of 8-11 and 13-25 amino acids that contain the fusion breakpoint are then cut from the fusion protein sequence, together with the corresponding wild-type peptides from the 5' and 3' ends.
Specifically:
1. Fusion breakpoint annotation: AGFusion annotates the fusion breakpoint according to the genomic coordinates of the 5' and 3' breakpoints (specifically chromosome number + coordinate, e.g. chr21:42866283) and synthesizes the full-length fusion protein sequence.
2. Cut the fusion peptides of length L that contain the fusion breakpoint, together with the corresponding 5' and 3' wild-type peptides.
A) Determine the coordinate index of the 5' fusion breakpoint on the 5' wild-type protein of length p5 and on the fusion protein of length g. Align the fusion protein with the 5' wild-type protein sequence to obtain the longest identical fragment seq1, of length s1, together with the coordinate index m of seq1 on the 5' wild-type protein and its coordinate index t on the fusion protein; the coordinate index of the 5' breakpoint is then m+s1 on the wild-type protein and t+s1 on the fusion protein.
B) Determine the coordinate index of the 3' fusion breakpoint on the 3' wild-type protein of length p3. Align the fusion protein with the 3' wild-type protein to obtain the longest identical fragment seq2, of length s2, and the coordinate index n of seq2 on the 3' wild-type protein.
C) Cut the fusion peptides of length L and the corresponding 5' and 3' wild-type peptides.
When the 3' fusion breakpoint does not cause a frameshift, each fusion peptide has two corresponding wild-type peptides, one from the 5' end and one from the 3' end. On the fusion protein, fusion peptides of length L are cut from the minimum coordinate index t+s1-L to the maximum coordinate index t. On the 5' wild-type protein, peptides are cut from the minimum coordinate index m+s1-L to the maximum coordinate index m+s1. On the 3' wild-type protein, peptides are cut from n-L to the maximum coordinate index n. This yields the two corresponding wild-type peptides of length L.
When the 3' fusion breakpoint causes a frameshift, each fusion peptide has only one corresponding 5' wild-type peptide. On the fusion protein, fusion peptides of length L are cut from the minimum coordinate index t+s1-L to the maximum coordinate index g-L. On the 5' wild-type protein, wild-type peptides of length L are cut starting from the minimum coordinate index m+s1-L up to the maximum coordinate index p5-L.
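The breakpoint localization in A) can be sketched with the standard library's longest-common-substring search. This is an illustrative sketch, not the patent's exact procedure: the sequences are invented toy proteins, the function names are hypothetical, indices are 0-based, and the window bounds used here (every length-L window with at least one residue on each side of the junction) are an assumed reading rather than the index ranges quoted above.

```python
from difflib import SequenceMatcher

def locate_5p_breakpoint(fusion, wt5):
    """Find seq1, the longest fragment shared by the 5' wild-type and fusion proteins.

    Returns (m + s1, t + s1): the breakpoint index on the wild-type protein
    and on the fusion protein.
    """
    sm = SequenceMatcher(None, wt5, fusion, autojunk=False)
    match = sm.find_longest_match(0, len(wt5), 0, len(fusion))
    m, t, s1 = match.a, match.b, match.size
    return m + s1, t + s1

def junction_peptides(fusion, breakpoint, L):
    """All length-L windows that span the junction (assumed window rule)."""
    peptides = []
    for start in range(max(0, breakpoint - L + 1), breakpoint):
        pep = fusion[start:start + L]
        if len(pep) == L:
            peptides.append(pep)
    return peptides

# Toy sequences (assumption): first 10 residues of wt5 fused to wt3 from index 6.
wt5 = "MKTAYIAKQRQISFVK"
wt3 = "MSHFSRWDLGGEHYTP"
fusion = wt5[:10] + wt3[6:]

bp_wild, bp_fusion = locate_5p_breakpoint(fusion, wt5)
octamers = junction_peptides(fusion, bp_fusion, 8)
```

For an in-frame fusion, the matching wild-type peptides would be cut at the same offsets relative to the breakpoint on each partner protein; that part is omitted here to keep the sketch short.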
Step3: generate single-mutation peptides and the corresponding wild-type peptides for each somatic mutation.
SnpEff is used to annotate the somatic base mutations on the genome onto each transcript in the Ensembl database and onto the corresponding protein sequence. Mutant peptides of 8-11 and 13-25 amino acids that contain the protein mutation site are cut out, together with the wild-type peptides at the corresponding positions.
Specifically:
A) SnpEff is used to annotate the base mutation at each somatic genomic position onto each transcript in the Ensembl database and onto the corresponding protein sequence.
B) Cut the mutant peptides of length L and the corresponding wild-type peptides.
For missense mutations and non-frameshift mutations, taking the mutation coordinate as the center, L-1 amino acids are taken toward the 5' end and L-1 amino acids toward the 3' end (L is the length of the mutant peptide to be generated). This generates mutant peptides of 8-11 and 13-25 amino acids that contain the mutated amino acid, together with the corresponding wild-type peptides.
A typical single point mutation yields 38 mutant/wild-type peptide pairs of 8-11 amino acids and 247 mutant/wild-type peptide pairs of 13-25 amino acids.
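The counts of 38 and 247 follow from the windowing rule: a mutated residue far from the protein ends lies in exactly L windows of length L, so the totals are 8+9+10+11 = 38 and 13+14+...+25 = 247. A minimal sketch that reproduces these counts on a toy protein is shown below; the sequence, position and function names are hypothetical, and indices are 0-based.

```python
def windows_around(protein, pos, L):
    """All (start, peptide) windows of length L that contain residue `pos`."""
    out = []
    for start in range(pos - L + 1, pos + 1):
        if start >= 0 and start + L <= len(protein):
            out.append((start, protein[start:start + L]))
    return out

def mutant_wild_pairs(wild, pos, alt_aa, lengths):
    """Pair each mutant window with the wild-type window at the same coordinates."""
    mutant = wild[:pos] + alt_aa + wild[pos + 1:]
    pairs = []
    for L in lengths:
        for start, wt_pep in windows_around(wild, pos, L):
            pairs.append((mutant[start:start + L], wt_pep))
    return pairs

toy_protein = "ACDEFGHIKLMNPQRSTVWY" * 3      # 60 residues, invented
class1_pairs = mutant_wild_pairs(toy_protein, 30, "W", range(8, 12))
class2_pairs = mutant_wild_pairs(toy_protein, 30, "W", range(13, 26))
```

For mutations close to either end of the protein, fewer windows fit, so the counts are upper bounds rather than fixed values.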
If the mutation is a frameshift mutation, peptides are taken starting L-1 amino acids before the mutation site and extending until the first stop codon appears, generating mutant peptides of 8-11 and 13-25 amino acids together with the wild-type peptides at the corresponding coordinates.
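The frameshift rule can be sketched in the same style. The sketch assumes the frameshifted protein has already been translated, with `*` marking stop codons, and that `first_changed` is the 0-based index of the first altered residue; both the input convention and the function name are hypothetical.

```python
def frameshift_peptides(mutant_protein, first_changed, L):
    """Length-L windows from L-1 residues before the frameshift to the first stop."""
    seq = mutant_protein.split("*")[0]          # truncate at the first stop codon
    start0 = max(0, first_changed - L + 1)      # begin L-1 residues upstream
    return [seq[s:s + L] for s in range(start0, len(seq) - L + 1)]

# Toy frameshifted protein (assumption): residues from index 10 on are novel.
toy_frameshift = "MKTAYIAKQRWDLGG*EH"
fs_octamers = frameshift_peptides(toy_frameshift, 10, 8)
```

Every window returned contains at least one novel residue, because the earliest window ends exactly at the first altered position.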
Step4: construct the personal genome specific to the tumor sample and generate mutant peptides that contain multiple mutations.
All mutated bases detected in Step1 are substituted in one batch for the bases of the human reference genome. The advantage is that multiple mutations occurring in the same gene are captured together.
Losing or mis-annotating mutant peptides that contain multiple mutations is a major weakness of existing neoantigen prediction tools. The invention remedies this by generating a personal genome specific to the tumor sample. Specifically, the mutation information in the VCF file of somatic mutations is imported into the human reference genome in a single pass, replacing the wild-type bases at the original coordinates, to generate the complete personal genome of the tumor sample.
The mutation-containing transcripts, edited at the genomic base level, are translated into mutant protein sequences with the Biopython tool. Based on the mutant protein coordinate index of each transcript annotated for the single mutations in Step 3, and following the peptide cutting rules of Step 3, mutant polypeptides of 8-11 and 13-25 amino acids that contain the mutations are generated.
Compared with editing at the protein level based on the annotation of each single mutation, the invention edits the bases of the genome to generate the tumor personal genome and then annotates the whole transcript in one pass, which reflects the mutant protein resulting from multiple base mutations more accurately. This matters especially when point mutations fall within the codon of the same amino acid. For example, chr11:56143803A>G and chr11:56143804G>A occur in the same codon; annotated individually at the protein level they are ORBU1:p.Gln235Arg and ORBU1:p.Gln235His, respectively, which causes a conflict when editing at the protein level. Note that in this step the reference genome version used for the mutations must match the reference genome version into which they are imported. The invention currently supports the major human reference genome versions GRCh37 and GRCh38 and can be extended to the reference genomes of other species such as rat and mouse.
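Why genome-level editing avoids the same-codon conflict can be shown with a minimal sketch. The text uses Biopython for translation; this sketch substitutes a small standard-codon-table lookup built from the standard library so that it is self-contained. The coding sequence and the two substitutions are invented toy data, not the chr11 example above, and the function names are hypothetical.

```python
# Standard genetic code in the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {a + b + c: AMINO[16 * i + 4 * j + k]
               for i, a in enumerate(BASES)
               for j, b in enumerate(BASES)
               for k, c in enumerate(BASES)}

def apply_snvs(reference, snvs):
    """Substitute all SNVs in one pass; snvs are (0-based pos, ref, alt) tuples."""
    seq = list(reference)
    for pos, ref, alt in snvs:
        if reference[pos] != ref:
            raise ValueError(f"reference mismatch at {pos}: expected {ref}")
        seq[pos] = alt
    return "".join(seq)

def translate(cds):
    """Translate complete codons of a coding sequence; '*' marks a stop."""
    usable = len(cds) - len(cds) % 3
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, usable, 3))

toy_cds = "ATGCAGTAA"            # Met-Gln-stop (invented)
snv_a = (4, "A", "G")            # alone: CAG -> CGG, Gln -> Arg
snv_b = (5, "G", "C")            # alone: CAG -> CAC, Gln -> His
both = translate(apply_snvs(toy_cds, [snv_a, snv_b]))
```

Annotated one at a time, the two substitutions suggest two different amino acid changes at the same residue. Applied together at the base level they give the codon CGC, so the protein carries a single, unambiguous Arg at that position.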
Step5: Use the cis/trans relationship between mutations to judge whether the single-mutation peptides generated in Step3 and the multi-mutation peptides generated in Step4 are genuine, and generate the mutant peptides that really exist.
Based on the read information in the sequence-alignment BAM file, the relationship between mutations is determined to be cis or trans, and from this the mutant peptides containing multiple mutations or a single mutation are judged genuine or not. Taking two mutations as an example: if the two mutations are in trans, that is, no read contains both mutations at the same time, the mutant peptides containing this pair of mutations are removed and only the mutant peptides containing each mutation alone are retained. If the two mutations are in cis, only the mutant peptides containing both mutations are retained.
Because of tumor heterogeneity, the mutations listed in the somatic mutation file are not all in cis; that is, they do not necessarily occur in the same tumor subclone. Multiple mutations may be scattered across different subclones, which shows up in the sequencing data as the mutations appearing on different NGS reads. This step therefore introduces the BAM file of the tumor sample alignment to judge whether these multi-mutation and single-mutation peptides are genuine. Only by checking whether the NGS reads supporting these mutations overlap can it be guaranteed that the multi-mutation peptides generated in Step4 and the single-mutation peptides generated in Step3 really exist.
Specifically, the BAM file of the tumor sample is read with pysam, the information on the reads passing through the chromosome coordinate of each mutation site is exported, and the Jaccard coefficient is calculated for every pair of mutation sites: $J(i,j) = \frac{|f(i) \cap f(j)|}{|f(i) \cup f(j)|}$, where $f(i)$ and $f(j)$ denote the sets of NGS reads that support mutation i and mutation j, respectively. When the Jaccard coefficient is 0, no tumor subclone carries mutation i and mutation j together, and the mutant peptides containing both mutation i and mutation j must be removed. When the Jaccard coefficient is 1, all tumor subclones carry mutation i and mutation j together, and the mutant peptides generated in Step3 that contain only mutation i or only mutation j must be removed. When the Jaccard coefficient lies between 0 and 1, both the single-mutation peptides generated in Step3 and the multi-mutation peptides containing mutation i and mutation j generated in Step4 are retained. This step finally produces a library of genuine, complete mutant peptides.
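The decision rule can be sketched as follows. This is a minimal illustration under stated assumptions: the text extracts the reads from the BAM file with pysam, while the sketch takes the two sets of supporting read names as plain Python sets so that it runs without a BAM file. The function names and read names are hypothetical.

```python
def jaccard(reads_i, reads_j):
    """Jaccard coefficient of two sets of read names (0.0 if both sets are empty)."""
    union = reads_i | reads_j
    if not union:
        return 0.0
    return len(reads_i & reads_j) / len(union)


def phasing_decision(reads_i, reads_j):
    """Decide which peptide classes to keep for a pair of mutations i and j."""
    coefficient = jaccard(reads_i, reads_j)
    if coefficient == 0:
        # No read carries both mutations: keep single-mutation peptides only.
        return {"keep_single": True, "keep_multi": False}
    if coefficient == 1:
        # Every supporting read carries both mutations: keep multi-mutation peptides only.
        return {"keep_single": False, "keep_multi": True}
    # Mixed evidence: keep both classes of peptide.
    return {"keep_single": True, "keep_multi": True}


# Hypothetical read names supporting each mutation.
reads_i = {"read1", "read2", "read3"}
reads_j = {"read2", "read3", "read4"}
print(jaccard(reads_i, reads_j), phasing_decision(reads_i, reads_j))
```

Returning 0.0 for two empty sets is a choice made for the sketch; the text does not say how a pair of mutations without supporting reads is treated.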
Step6: Remove mutant peptides whose sequence is identical to another position of a wild-type protein, and build the complete nascent polypeptide library.
A nascent polypeptide is a mutation-derived mutant peptide that does not exist in a wild-type protein; only then can the body's immune system regard it as new. The mutant peptides produced by Step5 are not all nascent polypeptides. In particular, for protein mutations that occur in repetitive regions, a sequence identical to the mutant peptide is easily found at another position of the wild-type protein, and such a peptide cannot be called a true neoantigen peptide. Existing open-source tools overlook this point entirely. For every mutant peptide, this step obtains the wild-type protein sequence of each transcript through pyensembl and checks whether the mutant peptide occurs in the wild-type protein sequence.
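The check amounts to a substring search of each mutant peptide against the wild-type protein sequences. The sketch below assumes the wild-type sequences have already been retrieved (the text obtains them through pyensembl) and are passed in as plain strings; the function names and the toy sequences are hypothetical.

```python
def occurs_in_wild_type(peptide, wild_type_proteins):
    """True if the peptide appears verbatim anywhere in any wild-type protein."""
    return any(peptide in protein for protein in wild_type_proteins)


def split_nascent(mutant_peptides, wild_type_proteins):
    """Separate true nascent polypeptides from peptides that also exist in wild type."""
    true_nascent, false_nascent = [], []
    for peptide in mutant_peptides:
        if occurs_in_wild_type(peptide, wild_type_proteins):
            false_nascent.append(peptide)
        else:
            true_nascent.append(peptide)
    return true_nascent, false_nascent


# Hypothetical wild-type protein with a repetitive PQ region.
wild_type_proteins = ["MKTPQPQPQPQPQLSR"]
# Hypothetical mutant peptides: the first is reproduced by the repeat, the second is not.
mutant_peptides = ["PQPQPQPQ", "PQPQLSWA"]

true_nascent, false_nascent = split_nascent(mutant_peptides, wild_type_proteins)
print(true_nascent, false_nascent)
```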
Step7: Perform high-consistency HLA typing based on the BAM file of the germline control sample.
Given that even the gold-standard sequencing method for HLA typing reaches a consistency of at most 84% (PMID: 27802932), the present invention uses 5 different next-generation-sequencing HLA typing tools and takes the HLA typing result with the highest consistency among them in order to reduce false positives. Specifically, HLA typing is computed with Polysolver, HLA-HD, HLA-PRG-LA, OptiType and hla-genotyper. Of these, Polysolver and OptiType only perform class I HLA typing, while the other three can also be used for class II HLA typing. For each HLA allele in the 8 major HLA classes (A, B, C, DRB, DPA, DPB, DQA, DQB), the initial score is 0, and the score is increased by 1 each time the allele is detected by one tool. The highest-scoring HLA allele is taken as the final typing result for each HLA class.
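The voting scheme can be sketched as below. This is an illustration only: the data layout (a dictionary from tool name to per-locus allele calls), the function names and the allele calls are hypothetical, and the text does not say how ties are resolved, so the sketch simply returns every allele tied at the top score.

```python
from collections import Counter, defaultdict


def score_alleles(calls_by_tool):
    """Give each allele one point per tool that reports it, grouped by HLA locus."""
    scores = defaultdict(Counter)
    for calls in calls_by_tool.values():
        for locus, alleles in calls.items():
            for allele in set(alleles):  # a tool votes at most once per allele
                scores[locus][allele] += 1
    return scores


def consensus_typing(calls_by_tool):
    """Return, for each locus, the allele(s) with the highest score."""
    result = {}
    for locus, counter in score_alleles(calls_by_tool).items():
        best = max(counter.values())
        result[locus] = sorted(a for a, s in counter.items() if s == best)
    return result


# Hypothetical calls from three of the five tools.
calls_by_tool = {
    "OptiType": {"A": ["A*02:01", "A*24:02"], "B": ["B*15:01"]},
    "Polysolver": {"A": ["A*02:01", "A*24:02"], "B": ["B*15:01"]},
    "HLA-HD": {"A": ["A*02:01", "A*11:01"], "B": ["B*40:01"]},
}
print(consensus_typing(calls_by_tool))
```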
Step8: Predict the affinity between nascent polypeptides and HLA molecules, and calculate the neoantigen burden of the sample according to the affinity.
Neoantigen prediction in effect simulates, within the antigen-presenting cell, the anchoring of a nascent polypeptide in the structural groove of the HLA molecule. The existing mainstream tools, such as netMHC, netMHCpan, netMHCII, netMHCIIpan and MHCflurry, all train a neural network model specific to each HLA molecule on real affinity data for HLA molecules and short peptides, and then use it to predict the affinity between that HLA molecule and a nascent polypeptide. Almost all reported neoantigen prediction tools use this kind of algorithm. The present invention also uses these mainstream affinity prediction algorithms.
The specific screening condition is: when the affinity between a nascent polypeptide and an HLA molecule is ≤ 500 nM and the relative affinity rank is ≤ 2%, the nascent polypeptide is regarded as a neoantigen. The affinity is reported as an IC50 value, the nascent polypeptide concentration at which 50% of the HLA molecules are bound, in nM. The smaller the value, the higher the affinity between the peptide and the HLA protein encoded by that allele. The relative rank of the affinity is expressed as Rank (%), the percentile rank of the nascent polypeptide's IC50 value within the IC50 data set of 400000 randomly generated peptides. The smaller the value, the higher the relative position of the peptide's affinity for the HLA molecule. The total number of neoantigen-HLA complexes that reach this threshold is called the tumor neoantigen burden.
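The screening condition and the resulting burden can be sketched as follows. The two thresholds come from the text; the record layout, the function names and the prediction values are hypothetical, and the sketch assumes the IC50 and Rank values have already been produced by one of the affinity predictors named above.

```python
def is_neoantigen(ic50_nm, rank_percent, max_ic50_nm=500.0, max_rank_percent=2.0):
    """Apply both thresholds: IC50 <= 500 nM and Rank <= 2%."""
    return ic50_nm <= max_ic50_nm and rank_percent <= max_rank_percent


def neoantigen_burden(predictions):
    """Count distinct (peptide, HLA allele) complexes that pass the thresholds."""
    complexes = {
        (p["peptide"], p["allele"])
        for p in predictions
        if is_neoantigen(p["ic50_nm"], p["rank_percent"])
    }
    return len(complexes)


# Hypothetical predictor output for four peptide-HLA pairs.
predictions = [
    {"peptide": "PQPQLSWA", "allele": "A*02:01", "ic50_nm": 35.0, "rank_percent": 0.4},
    {"peptide": "PQPQLSWA", "allele": "B*15:01", "ic50_nm": 480.0, "rank_percent": 1.9},
    {"peptide": "KTAYIAKQW", "allele": "A*02:01", "ic50_nm": 320.0, "rank_percent": 3.5},
    {"peptide": "KTAYIAKQW", "allele": "B*15:01", "ic50_nm": 2100.0, "rank_percent": 1.0},
]
print(neoantigen_burden(predictions))
```

Counting peptide-allele pairs rather than peptides follows the definition in the text, where the burden is the number of neoantigen-HLA complexes.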
Embodiment 1
This embodiment starts from the mutation file of one non-small cell lung cancer sample (the specific mutation information is given in Table 1) and compares Topiary, pVACtools and the method of the present invention in predicting nascent polypeptides of 8-11 amino acids.
This embodiment serves as a validation of the scheme of the invention and demonstrates its advantages over the two current mainstream tools. Fig. 3 shows the procedure and results of comparing the neoantigen prediction method of the invention with the two mainstream open-source tools. Because all three tools predict the affinity between peptides and HLA molecules with the same method, only the differences in nascent polypeptide generation are considered here.
Table 1. Information on the 58 somatic mutation sites of the non-small cell lung cancer patient
The method provided by the invention uses whether a peptide can be detected in a wild-type protein as the basis for judging whether a generated nascent polypeptide is genuine, so all mutant polypeptides formed by mutations can be divided into two parts: true nascent polypeptides and false nascent polypeptides. In this example, the invention generated 203 false nascent polypeptides (mutant polypeptide sequences that can be found in a wild-type protein) and 1792 true nascent polypeptides. Topiary and pVACtools generated 1748 and 1702 nascent polypeptides, respectively.
Comparing the results of the three tools reveals three clear findings. 1. The nascent polypeptides generated by the present invention and by pVACtools completely cover the results of Topiary. 2. Among the nascent polypeptide sequences predicted by Topiary and pVACtools, 62 can be found in wild-type proteins; treating every mutant polypeptide that derives from a mutation as a nascent polypeptide is therefore ill-considered. 3. pVACtools and Topiary are weak at handling adjacent double-base mutations. For a double-base substitution, pVACtools forcibly splits it into two single-base mutations, which causes amino acid annotation errors, while Topiary ignores the whole double-base mutation, which reduces the nascent polypeptides generated overall (4 in this sample). The 76 erroneous nascent polypeptides unique to pVACseq come from 4 mutations: chr15:28947425G>A; chr15:28947426A>G; chr4:145041707C>A; chr4:145041708T>C. These 4 mutations were produced by forcibly splitting the two adjacent double-base mutations in the somatic mutation file, chr15:28947425GA>AG and chr4:145041707CT>AC. A careful analysis of the 190 true nascent polypeptides unique to the present invention shows that they are concentrated mainly in the 9 mutations listed in Table 2. Among these, 4 double-base mutations were lost, and in particular the nascent polypeptides formed by one EGFR hotspot mutation are missing. In addition, 3 mutations lost part of their transcript annotations, which led to the loss of the corresponding nascent polypeptides.
In summary, the invention achieves the intended design goal and makes up for the shortcomings of existing tools. This will help to calculate the tumor neoantigen burden accurately for assessing the effect of immunotherapy, and to provide reliable polypeptide information for later tumor vaccine design.
Table 2. Information on the 9 mutations corresponding to the 190 nascent polypeptides unique to the present invention
Embodiment 2
Embodiment 2 is a concrete application scenario of the method provided by the invention in tumor immunotherapy. It illustrates the application value of the invention in tumor immunotherapy and its advantage over the currently approved tumor mutation burden. A high tumor mutation burden indicates more somatic mutations in the tumor, which means that more tumor neoantigens can be produced and that such tumor cells are more likely to be recognized by immune cells. This is the biological rationale for using tumor mutation burden as a biomarker to assess the effect of immunotherapy. Embodiment 2 verifies the validity of the tumor neoantigen burden calculated by the invention as a biomarker for immunotherapy.
Table 3 shows the overall survival (OS) data of 13 hepatocellular carcinoma patients who received immunotherapy, together with the tumor mutation burden detected by whole-exome sequencing and the sample neoantigen burden calculated by the invention. Here, tumor mutation burden (TMB) is defined as the number of non-synonymous somatic mutations detected across the whole exome. Tumor neoantigen burden (TNB) is the total number of nascent polypeptide-HLA complexes that meet the threshold (affinity between nascent polypeptide and HLA molecule ≤ 500 nM and relative rank ≤ 2%). By the median TMB, the patients are divided into two groups: 7 in the high-TMB group and 6 in the low-TMB group. In the same way, the patients are divided into a high-TNB group and a low-TNB group by the median TNB. Fig. 4 shows the survival curves for the high and low TNB groups and for the high and low TMB groups. TNB significantly distinguishes the survival of immunotherapy patients (p < 0.05; OS of 565 days in the high-TNB group versus 185 days in the low-TNB group). Although the high-TMB group shows a trend toward longer OS than the low-TMB group, the difference is not statistically significant (p = 0.29; OS of 336 days in the high-TMB group versus 304 days in the low-TMB group). This result shows that the neoantigen prediction method of the invention assesses the effect of tumor immunotherapy more accurately than tumor mutation burden. Neoantigen burden therefore has good application prospects as a biomarker.
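The median-based grouping can be sketched as below. This is an illustration with made-up numbers: the 13 burden values and patient labels are hypothetical and are not the values of Table 3, and placing patients whose value equals the median in the high group is an assumption, chosen because it reproduces a 7-versus-6 split for 13 distinct values.

```python
from statistics import median


def split_by_median(burden_by_patient):
    """Split patients into high (>= median) and low (< median) burden groups."""
    cutoff = median(burden_by_patient.values())
    high = sorted(p for p, v in burden_by_patient.items() if v >= cutoff)
    low = sorted(p for p, v in burden_by_patient.items() if v < cutoff)
    return high, low


# Hypothetical burden values for 13 patients (not the data of Table 3).
values = [12, 45, 7, 88, 23, 51, 5, 64, 30, 19, 72, 9, 40]
burden_by_patient = {f"P{n:02d}": v for n, v in enumerate(values, start=1)}

high, low = split_by_median(burden_by_patient)
print(len(high), len(low))
```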
Table 3
In summary, when generating nascent polypeptides, the neoantigen prediction method of the invention generates a personal genome specific to the tumor sample and thereby makes up for two major defects of current mainstream methods: 1. multi-mutation peptides are wrongly split or lost outright; 2. mutant peptides formed by mutations in low-complexity regions (which can often be found in wild-type proteins) are mistaken for nascent polypeptides. In addition, the invention covers the nascent polypeptides formed by both somatic mutations and gene fusions. The neoantigen prediction method of the invention can therefore reflect the true neoantigen status of a tumor sample more comprehensively and accurately. In the data from 13 hepatocellular carcinoma patients who received immunotherapy, the neoantigen burden calculated by the invention was confirmed to be effective for assessing the benefit of tumor immunotherapy. Because accurate and comprehensive neoantigen prediction provides a reliable source of peptides for tumor vaccine design, application to tumor vaccines is another potential use of the invention.
In this specification, the invention has been described with reference to specific embodiments. It is clear, however, that various modifications and changes can still be made without departing from the spirit and scope of the invention. The specification and drawings are therefore to be regarded as illustrative rather than restrictive.

Claims (9)

1. A method for predicting tumor neoantigens, characterized in that the method comprises the steps of:
(1) performing somatic mutation detection and gene fusion detection on a paired tumor and germline control sample;
(2) generating a fusion peptide and the corresponding wild-type peptides for each pair of fused genes;
(3) generating single-mutation peptides and the corresponding wild-type peptides based on each somatic mutation;
(4) constructing a personal genome specific to the tumor sample and generating multi-mutation peptides containing multiple mutations;
(5) judging, from the cis/trans relationship between mutations, whether the single-mutation and multi-mutation peptides are genuine, and generating the mutant peptides that really exist;
(6) removing mutant peptides whose sequence is identical to another position of a wild-type protein, and constructing a complete nascent polypeptide library;
(7) performing HLA typing based on the BAM file of the germline control sample, predicting the affinity between nascent polypeptides and HLA molecules, and taking the nascent polypeptides with high affinity as candidate tumor neoantigens.
2. The method for predicting tumor neoantigens according to claim 1, characterized in that, in step (1):
for somatic mutations, after the somatic mutation results are exported under the default parameters of the Mutect2 tool, further quality-control filtering is performed, the quality-control filtering comprising: a mutation frequency greater than 2%; a sequencing depth at the mutation site greater than 10; and at least 2 reads supporting the mutation, with an average base quality of those reads > 20;
for gene fusion detection, if the input is whole-exome (WES) or whole-genome (WGS) sequencing data, gene fusions are detected with the FACTERA tool under default parameters; if the input is RNAseq data, gene fusions are detected with the STAR-Fusion tool under default parameters, and further quality control is then performed by requiring a number of junction reads >= 1 in order to reduce false positives, the junction reads being reads that directly cover the fusion breakpoint.
3. The method for predicting tumor neoantigens according to claim 1, characterized in that, in step (2), the fusion breakpoint is annotated: the AGFusion tool is used to annotate the fusion breakpoint according to the genomic coordinates of the 5' and 3' breakpoints and to generate the full-length fusion protein sequence; fusion peptides of length L that contain the fusion breakpoint, and the corresponding 5'-end and 3'-end wild-type peptides, are intercepted;
preferably, the specific interception rules are:
determining the coordinate index of the 5' fusion breakpoint on the 5' wild-type protein of length p5 and on the fusion protein of length g: the fusion protein is aligned with the 5' wild-type protein sequence to obtain the longest identical fragment seq1, the coordinate index m of seq1 on the 5' wild-type protein and its coordinate index t on the fusion protein, the length of seq1 being s1; the coordinate index of the 5' breakpoint is then m+s1 on the wild-type protein and t+s1 on the fusion protein;
determining the coordinate index of the 3' fusion breakpoint on the 3' wild-type protein of length p3: the fusion protein is aligned with the 3' wild-type protein to obtain the longest identical fragment seq2 and the coordinate index n of seq2 on the 3' wild-type protein, the length of seq2 being s2;
intercepting the fusion peptides of length L and the corresponding 5' and 3' wild-type peptides:
when the 3' fusion breakpoint does not cause a frameshift, each fusion peptide has two corresponding wild-type peptides, one from the 5' end and one from the 3' end; fusion peptides of length L are intercepted from the fusion protein from the minimum coordinate index t+s1-L to the maximum coordinate index t; the two corresponding wild-type peptides of length L are intercepted from the 5' wild-type protein from the minimum coordinate index m+s1-L to the maximum coordinate index m, and from the 3' wild-type protein from coordinate index n-L to the maximum coordinate index n;
when the 3' fusion breakpoint causes a frameshift, each fusion peptide has only one corresponding 5' wild-type peptide; fusion peptides of length L are intercepted from the fusion protein from the minimum coordinate index t+s1-L to the maximum coordinate index g-L, and the corresponding wild-type peptides of length L are generated in sequence from the 5' wild-type protein from the minimum coordinate index m+s1-L to the maximum coordinate index p5-L.
4. The method for predicting tumor neoantigens according to claim 1, characterized in that, in step (3),
annotation is performed with SnpEff, and each base mutation in the somatic genome is annotated onto every transcript in the Ensembl database and onto the corresponding protein sequence;
mutant peptides of length L and the corresponding wild-type peptides are intercepted.
5. The method for predicting tumor neoantigens according to claim 4, characterized in that the interception rules are:
for missense mutations and non-frameshift mutations, centered on the mutation coordinate, L-1 amino acids are taken toward the 5' end and L-1 amino acids are taken toward the 3' end, generating mutant segments of 8~11 and 13~25 amino acids that contain the mutated amino acid, together with the corresponding wild-type peptides;
for a typical single point mutation, 38 pairs of mutant and wild-type peptides of 8~11 amino acids and 247 pairs of mutant and wild-type peptides of 13~25 amino acids are generated;
for frameshift mutations, starting from L-1 amino acids before the mutation site and extending until the first stop codon appears, mutant peptides of 8~11 and 13~25 amino acids and the wild-type peptides at the corresponding coordinates are generated.
6. The method for predicting tumor neoantigens according to claim 1, characterized in that, in step (4),
all mutations in the somatic-mutation VCF file are imported into the human reference genome in one pass, replacing the wild-type bases at the original coordinates, to generate the personal genome specific to the tumor sample;
the mutation-containing transcripts, edited at the genome base level, are translated into mutant protein sequences with the Biopython tool;
based on the mutant-protein coordinate index of each transcript obtained from the single-mutation annotation in step (3), and following the peptide interception rules of step (3), mutant peptides of 8~11 and 13~25 amino acids that contain the mutations are generated;
in step (5):
the BAM file of the tumor sample is read with pysam, the information on the reads aligned to each mutation site is exported, and the Jaccard coefficient is calculated for every pair of mutation sites: $J(i,j) = \frac{|f(i) \cap f(j)|}{|f(i) \cup f(j)|}$, where $f(i)$ and $f(j)$ denote the sets of NGS reads that support mutation i and mutation j, respectively; when the Jaccard coefficient is 0, no tumor subclone carries mutation i and mutation j together, and the multi-mutation peptides generated in step (4) that contain both mutation i and mutation j must be removed; when the Jaccard coefficient is 1, all tumor subclones carry mutation i and mutation j together, and the single-mutation peptides generated in step (3) that contain only mutation i or only mutation j must be removed; when the Jaccard coefficient lies between 0 and 1, both the single-mutation peptides generated in step (3) that contain mutation i or mutation j and the multi-mutation peptides generated in step (4) that contain both mutation i and mutation j are retained.
7. The method for predicting tumor neoantigens according to claim 1, characterized in that, in step (7), 5 different next-generation-sequencing HLA typing tools are used, and false positives are reduced by taking the HLA typing result with the highest consistency; preferably, HLA typing is computed with Polysolver, HLA-HD, HLA-PRG-LA, OptiType and hla-genotyper; for each HLA allele in the 8 major HLA classes, the initial score is 0 and the score is increased by 1 each time the allele is detected by one tool, and the highest-scoring HLA allele is taken as the final typing result for each HLA class;
the affinity between nascent polypeptides and HLA molecules is predicted, specifically: when the affinity between a nascent polypeptide and an HLA molecule is ≤ 500 nM and the relative rank is ≤ 2%, the nascent polypeptide is regarded as a candidate tumor neoantigen.
8. A device for predicting tumor neoantigens, characterized in that the device comprises a memory for storing a program and a processor for executing the program, so as to implement the method for predicting tumor neoantigens according to any one of claims 1 to 7.
9. A computer-readable storage medium, characterized in that it comprises a program executable by a processor to carry out the method for predicting tumor neoantigens according to any one of claims 1 to 7.
CN201811531729.6A 2018-12-14 2018-12-14 Method, device and storage medium for predicting tumor neoantigen Active CN109584960B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811531729.6A CN109584960B (en) 2018-12-14 2018-12-14 Method, device and storage medium for predicting tumor neoantigen

Publications (2)

Publication Number Publication Date
CN109584960A true CN109584960A (en) 2019-04-05
CN109584960B CN109584960B (en) 2021-07-30

Family

ID=65928671

Country Status (1)

Country Link
CN (1) CN109584960B (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right
Effective date of registration: 20200422
Address after: 215123 Unit 201, B7 Building, 218 Xinghu Street, Suzhou Industrial Park, Jiangsu Province
Applicant after: Xukang medical technology (Suzhou) Co., Ltd
Address before: 201318 4-5 Floors, Area A, Building 19, 3399 Lane, Kangxin Road, Pudong New District, Shanghai
Applicant before: SHANGHAI JINGZHOU GENE TECHNOLOGY Co.,Ltd.
GR01 Patent grant