CN109767811A - Method for constructing a linear model for predicting tumor mutation load, and method and device for predicting tumor mutation load - Google Patents

Publication number
CN109767811A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201811447772.4A
Other languages
Chinese (zh)
Other versions
CN109767811B (en)
Inventor
张静波
李孟键
王建伟
伍启熹
刘倩
刘珂弟
唐宇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing You Xun Medical Laboratory Laboratory Co Ltd
Original Assignee
Beijing You Xun Medical Laboratory Laboratory Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing You Xun Medical Laboratory Laboratory Co Ltd
Priority to CN201811447772.4A
Publication of CN109767811A
Application granted
Publication of CN109767811B
Legal status: Active
Anticipated expiration

Landscapes

  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Abstract

The invention discloses a method for constructing a linear model for predicting tumor mutation load, and a method and device for predicting tumor mutation load. The construction method comprises: S1, screening synonymous and nonsynonymous mutations in protein-coding regions and counting them separately as $N_{syn}$ and $N_{non}$; S2, calculating $L_{target}$, then the synonymous-mutation term of the target-region tumor mutation load, $T_{sys} = N_{syn}/L_{target}$, and the nonsynonymous-mutation term, $T_{non} = N_{non}/L_{target}$; S3, establishing a multivariate linear model and calculating the tumor mutation load of the sample to be tested: $TMB = a_1 T_{sys} + a_2 T_{non} + a_3 \log(T_{non})$. With this technical solution, the whole-exome tumor mutation load can be predicted, which reduces cost and shortens the detection cycle.

Description

Method for constructing a linear model for predicting tumor mutation load, and method and device for predicting tumor mutation load
Technical field
The present invention relates to the field of biomedical technology, and in particular to a method for constructing a linear model for predicting tumor mutation load, and a method and device for predicting tumor mutation load.
Background
Tumor mutation load (TMB, also known as tumor mutational burden) is the number of nonsynonymous somatic mutations per megabase in the coding regions of exons. It is a new biomarker for predicting the efficacy of immunotherapy and has good application prospects. Somatic mutations can alter protein sequences and generate neoantigens. These neoantigens are recognized by the patient's own immune system as non-self antigens, which activates T cells and triggers an immune response. A high tumor mutation load therefore produces more neoantigens, which helps the immune system kill tumor cells. Many studies have shown that tumor mutation load is significantly correlated with the efficacy of immunotherapy.
The method currently used to detect tumor mutation load is the strategy proposed by the Lawrence team in Nature in 2015, which assesses tumor mutation load by counting somatic mutations across the whole exome (mean depth < 200X). However, this method requires whole-exome sequencing, which is costly and has a long detection cycle.
Summary of the invention
The present invention aims to provide a method for constructing a linear model for predicting tumor mutation load, and a method and device for predicting tumor mutation load, so as to solve the technical problem that prior-art methods for detecting tumor mutation load require whole-exome sequencing, which is costly and has a long detection cycle.
To achieve the above object, according to one aspect of the present invention, a method for constructing a linear model for predicting tumor mutation load is provided. The construction method comprises the following steps: S1, obtaining the mutation data of the sample to be tested by sequencing and sequence analysis, screening synonymous and nonsynonymous mutations in protein-coding regions, and counting the number of synonymous mutations $N_{syn}$ and the number of nonsynonymous mutations $N_{non}$ in the target region; S2, calculating the length of the protein-coding region of the target region, $L_{target}$, then the synonymous-mutation term of the target-region tumor mutation load, $T_{sys} = N_{syn}/L_{target}$, and the nonsynonymous-mutation term, $T_{non} = N_{non}/L_{target}$; S3, establishing a multivariate linear model and calculating the tumor mutation load of the sample to be tested, $TMB = a_1 T_{sys} + a_2 T_{non} + a_3 \log(T_{non})$, where $a_1$, $a_2$ and $a_3$ are obtained by fitting to existing database data.
Further, the existing database includes the TCGA database.
According to another aspect of the present invention, a method for predicting tumor mutation load is provided. The method comprises the following steps: S1, obtaining a tumor sample and a normal sample from the same patient and extracting DNA from each; S2, capturing tumor-related genes with probes according to the principle of target-region capture; S3, performing high-throughput sequencing and sequence analysis to obtain the mutation data of the tumor sample; and S4, predicting the tumor mutation load (TMB) with the linear model built by the construction method described above.
Further, S3 also includes: selecting high-quality sequencing reads by removing reads whose N content is greater than 5% and removing reads that contain adapter sequence.
Further, the sequence analysis in S3 includes: aligning the tumor sample DNA and the normal sample DNA to the reference genome with alignment software, then detecting the mutation data of the tumor sample with variant-detection software.
Further, S4 also includes: annotating the mutation sites with the ANNOVAR software to obtain the synonymous and nonsynonymous mutations of the target region, and counting each.
According to a further aspect of the present invention, a device for predicting tumor mutation load is provided. The device is used to store or run modules, or the modules are components of the device. The modules are software modules, of which there are one or more, and they are used to execute the above method for predicting tumor mutation load.
With the technical solution of the present invention, the whole-exome tumor mutation load is predicted from the tumor mutation load of the target region, which reduces cost and shortens the detection cycle.
Brief description of the drawings
The accompanying drawings, which form a part of this application, are provided for further understanding of the invention. The illustrative embodiments and their descriptions serve to explain the invention and do not unduly limit it. In the drawings:
Fig. 1 shows a correlation plot of the TMB calculated by the method of Embodiment 1 against the TMB calculated from whole-exome data;
Fig. 2 shows a direct comparison of the TMB obtained by the method of Embodiment 1 (549 panel) and the TMB obtained from the whole exome (WXS); and
Fig. 3 shows a thumbnail of the mutations obtained from sample PM00G18**0228 in Embodiment 2.
Detailed description of the embodiments
It should be noted that, where no conflict arises, the embodiments of this application and the features of those embodiments may be combined with one another. The invention is described in detail below with reference to the embodiments.
According to a typical embodiment of the present invention, a method for constructing a linear model for predicting tumor mutation load is provided. The method comprises the following steps: S1, obtaining the mutation data of the sample to be tested by sequencing and sequence analysis, screening synonymous and nonsynonymous mutations in protein-coding regions, and counting the number of synonymous mutations $N_{syn}$ and the number of nonsynonymous mutations $N_{non}$ in the target region; S2, calculating the length of the protein-coding region of the target region, $L_{target}$; calculating the synonymous-mutation term of the target-region tumor mutation load, $T_{sys} = N_{syn}/L_{target}$; calculating the nonsynonymous-mutation term, $T_{non} = N_{non}/L_{target}$; S3, establishing a multivariate linear model and calculating the tumor mutation load of the sample to be tested, $TMB = a_1 T_{sys} + a_2 T_{non} + a_3 \log(T_{non})$, where $a_1$, $a_2$ and $a_3$ are obtained by fitting to existing database data.
With the technical solution of the present invention, calculating tumor mutation load by targeted sequencing effectively replaces calculating it by whole-exome sequencing, which reduces cost and shortens the detection cycle.
According to a typical embodiment of the present invention, the existing database includes the TCGA database. For example, in one embodiment of the invention, the fit uses 10092 WES data sets covering 33 tumor types downloaded from TCGA. Because TMB calculated from WES is currently accepted as the most accurate, the WES-derived TMB is taken as the reference during fitting: the mutation information for the panel genes of this embodiment is extracted from the WES data and fitted against the WES-derived TMB to obtain the model parameters.
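The fitting step described above can be sketched as an ordinary least-squares problem. This is a minimal sketch under stated assumptions, not the procedure used in the patent: the text does not name the fitting algorithm, the base of the logarithm, or how samples without nonsynonymous mutations are handled. Least squares via NumPy, the natural logarithm, and dropping samples whose nonsynonymous term is zero are all assumptions, and the function names are hypothetical.

```python
import numpy as np


def design_matrix(t_sys, t_non):
    """Stack the three model terms: T_sys, T_non and log(T_non)."""
    t_sys = np.asarray(t_sys, dtype=float)
    t_non = np.asarray(t_non, dtype=float)
    # Assumption: natural logarithm; the source does not state the base.
    return np.column_stack([t_sys, t_non, np.log(t_non)])


def fit_coefficients(t_sys, t_non, tmb_wes):
    """Estimate (a1, a2, a3) by least squares against WES-derived TMB."""
    t_sys = np.asarray(t_sys, dtype=float)
    t_non = np.asarray(t_non, dtype=float)
    tmb_wes = np.asarray(tmb_wes, dtype=float)
    # Assumption: samples with T_non == 0 are dropped, since log(0) is undefined.
    keep = t_non > 0
    x = design_matrix(t_sys[keep], t_non[keep])
    coef, *_ = np.linalg.lstsq(x, tmb_wes[keep], rcond=None)
    return coef


if __name__ == "__main__":
    # Synthetic data only, generated from made-up coefficients for illustration.
    rng = np.random.default_rng(0)
    t_sys = rng.uniform(0.5, 20.0, size=200)
    t_non = rng.uniform(0.5, 50.0, size=200)
    made_up = np.array([0.1, 0.8, 0.5])
    tmb_wes = design_matrix(t_sys, t_non) @ made_up
    print(fit_coefficients(t_sys, t_non, tmb_wes))
```

The model has no intercept term in the source, so none is added to the design matrix here.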
According to a typical embodiment of the present invention, a method for predicting tumor mutation load is provided. The method comprises the following steps: S1, obtaining a tumor sample and a normal sample from the same patient and extracting DNA from each; S2, capturing tumor-related genes with probes according to the principle of target-region capture; S3, performing high-throughput sequencing and sequence analysis to obtain the mutation data of the tumor sample; and S4, predicting the tumor mutation load (TMB) with the linear model built by the above construction method.
Calculating tumor mutation load by targeted sequencing effectively replaces calculating it by whole-exome sequencing, which reduces cost and shortens the detection cycle.
Preferably, S3 also includes: selecting high-quality sequencing reads by removing reads whose N content is greater than 5% and removing reads that contain adapter sequence, which further improves the accuracy of the prediction. In a typical embodiment of the invention, the sequence analysis in S3 includes: aligning the tumor sample DNA and the normal sample DNA to the reference genome with alignment software, then detecting the mutation data of the tumor sample with variant-detection software.
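The read filtering in S3 (dropping reads with N content above 5% and reads containing adapter sequence) could look roughly like the sketch below. The source names neither a tool nor an adapter sequence, so the function names are hypothetical and the default adapter, a commonly used Illumina adapter prefix, is an assumption chosen only for illustration.

```python
def n_fraction(seq: str) -> float:
    """Fraction of bases in the read that are the ambiguous base N."""
    if not seq:
        return 1.0
    return seq.upper().count("N") / len(seq)


def keep_read(seq: str,
              adapter: str = "AGATCGGAAGAGC",  # assumed adapter, not from the source
              max_n_fraction: float = 0.05) -> bool:
    """Return True if the read passes both filters described in S3."""
    s = seq.upper()
    if not s:
        return False
    # Filter 1: remove reads whose N content is greater than 5%.
    if n_fraction(s) > max_n_fraction:
        return False
    # Filter 2: remove reads that contain adapter sequence.
    if adapter.upper() in s:
        return False
    return True


if __name__ == "__main__":
    reads = ["ACGT" * 10, "NNN" + "ACGT" * 10, "ACGT" * 5 + "AGATCGGAAGAGC"]
    print([keep_read(r) for r in reads])
```

Exact substring matching is the simplest possible adapter check; dedicated trimming tools also handle partial and mismatched adapter hits, which this sketch does not attempt.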
Preferably, S4 also includes: annotating the mutation sites with the ANNOVAR software to obtain the synonymous and nonsynonymous mutations of the target region, and counting each.
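Counting the two mutation classes from an annotation table might be sketched as follows. The column name `ExonicFunc.refGene` and the labels `synonymous SNV` and `nonsynonymous SNV` are assumptions about a tab-delimited ANNOVAR output produced with refGene annotation; the source states only that ANNOVAR is used. Whether other classes such as stop-gain variants count as nonsynonymous is not specified in the text, so they are left out here. The function name and the sample rows are hypothetical.

```python
import csv
import io
from collections import Counter


def count_syn_nonsyn(handle, column="ExonicFunc.refGene"):
    """Count synonymous and nonsynonymous SNVs in a tab-delimited table.

    Returns a tuple (n_syn, n_non).
    """
    counts = Counter()
    for row in csv.DictReader(handle, delimiter="\t"):
        counts[row.get(column, "")] += 1
    # Assumption: only these two exact labels are counted.
    return counts["synonymous SNV"], counts["nonsynonymous SNV"]


if __name__ == "__main__":
    # Made-up rows for illustration; not data from the patent.
    table = (
        "Chr\tStart\tExonicFunc.refGene\n"
        "1\t100\tsynonymous SNV\n"
        "1\t200\tnonsynonymous SNV\n"
        "2\t300\tnonsynonymous SNV\n"
        "3\t400\tstopgain\n"
    )
    n_syn, n_non = count_syn_nonsyn(io.StringIO(table))
    print(n_syn, n_non)
```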
According to a typical embodiment of the present invention, a device for predicting tumor mutation load is provided. The device is used to store or run modules, or the modules are components of the device. The modules are software modules, of which there are one or more, and they are used to execute any of the above methods.
The beneficial effects of the invention are further illustrated below with reference to the embodiments.
Embodiment 1
10092 WES data sets covering 33 tumor types were downloaded from the TCGA database (the 33 tumor types are ACC, KIRC, PRAD, BLCA, KIRP, READ, BRCA, LAML, SARC, CESC, LGG, SKCM, CHOL, LIHC, STAD, COAD, LUAD, TGCT, DLBC, LUSC, THCA, ESCA, MESO, THYM, GBM, OV, UCEC, HNSC, PAAD, UCS, KICH, PCPG and UVM). The mutation information within the targeted capture regions of a custom panel of 549 genes was extracted from these data, and the target-region TMB was calculated with the following multivariate linear model.
S1, obtain the mutation data of the sample to be tested by sequencing and sequence analysis, screen synonymous and nonsynonymous mutations in protein-coding regions, and count the number of synonymous mutations $N_{syn}$ and the number of nonsynonymous mutations $N_{non}$ in the target region.
S2, calculate the length of the protein-coding region of the target region, $L_{target}$; calculate the synonymous-mutation term of the target-region tumor mutation load, $T_{sys} = N_{syn}/L_{target}$; calculate the nonsynonymous-mutation term, $T_{non} = N_{non}/L_{target}$.
S3, establish a multivariate linear model and calculate the tumor mutation load of the sample to be tested: $TMB = a_1 T_{sys} + a_2 T_{non} + a_3 \log(T_{non})$.
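Steps S1 to S3 reduce to a few lines of arithmetic once the counts are known. The sketch below assumes the natural logarithm (the base is not stated in the source) and takes the coefficients as inputs, because their fitted values are not disclosed in the text; the function names are hypothetical.

```python
import math


def tmb_terms(n_syn: int, n_non: int, l_target_mbp: float):
    """Return (T_sys, T_non): mutation counts divided by the target length."""
    if l_target_mbp <= 0:
        raise ValueError("target-region length must be positive")
    return n_syn / l_target_mbp, n_non / l_target_mbp


def predict_tmb(n_syn: int, n_non: int, l_target_mbp: float,
                a1: float, a2: float, a3: float) -> float:
    """Evaluate TMB = a1*T_sys + a2*T_non + a3*log(T_non)."""
    t_sys, t_non = tmb_terms(n_syn, n_non, l_target_mbp)
    if t_non <= 0:
        # log(T_non) is undefined when no nonsynonymous mutation is found;
        # the source does not say how this case is handled.
        raise ValueError("T_non must be positive to take its logarithm")
    # Assumption: natural logarithm.
    return a1 * t_sys + a2 * t_non + a3 * math.log(t_non)


if __name__ == "__main__":
    # Placeholder coefficients for illustration only, not fitted values.
    print(predict_tmb(n_syn=2, n_non=4, l_target_mbp=2.0, a1=1.0, a2=1.0, a3=0.0))
```

Raising an error for a zero nonsynonymous count is one possible design choice; a production pipeline would need an explicit rule for that case.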
The TMB calculated directly from WES was used as the standard for comparison. Fig. 1 shows the correlation between the TMB calculated by the method of this embodiment and the TMB calculated from whole-exome data. The abscissa is the TMB calculated by the method of this embodiment, and the ordinate is the TMB calculated from the whole exome; the correlation is $R^2 = 0.9898$. This demonstrates the accuracy of the modeling method of the invention.
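A correlation of this kind between panel-derived and WES-derived TMB can be computed as a squared Pearson coefficient. The source does not say whether its reported value is a squared Pearson correlation or a coefficient of determination from a regression, so the choice below is an assumption, and the function name is hypothetical.

```python
def pearson_r_squared(x, y):
    """Squared Pearson correlation between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return (sxy * sxy) / (sxx * syy)


if __name__ == "__main__":
    # Made-up values for illustration; not data from the patent.
    panel_tmb = [1.0, 2.0, 3.0, 4.0]
    wes_tmb = [2.0, 4.0, 6.0, 8.0]
    print(pearson_r_squared(panel_tmb, wes_tmb))
```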
Fig. 2 shows a direct comparison, across the 33 tumor types, of the TMB obtained by the method of this embodiment (549 panel) and the TMB obtained from the whole exome (WXS). The abscissa is the tumor type, and the ordinate is the calculated TMB.
Embodiment 2
A tumor sample and a normal control sample were obtained for sample PM00G18**0228, and DNA was extracted from each. Using an in-house 549-gene panel, the tumor and normal DNA were captured at the same time, built into libraries, and sequenced. Alignment, mutation detection, annotation of the results, and mutation filtering were then performed, finally yielding the subject's detailed mutation information (16466 mutations, see Fig. 3).
The mutations were classified into 5027 nonsynonymous mutations ($N_{non}$) and 11439 synonymous mutations ($N_{syn}$). The length of the protein-coding region of the target region is 1.373692 Mbp ($L_{target}$). The nonsynonymous-mutation term ($T_{non}$) and the synonymous-mutation term ($T_{sys}$) of the target-region tumor mutation load were calculated and substituted into the established multivariate linear model, giving a tumor mutation load of 82.9 for the sample to be tested.
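As a worked check, the two terms can be computed from the counts and the region length given in this embodiment. The coefficients $a_1$, $a_2$, $a_3$ and the base of the logarithm are not disclosed in the text, so the final value of 82.9 cannot be reproduced from these numbers alone; the per-Mbp unit follows from $L_{target}$ being given in Mbp.

```latex
\begin{aligned}
N_{non} + N_{syn} &= 5027 + 11439 = 16466,\\
T_{non} &= \frac{N_{non}}{L_{target}} = \frac{5027}{1.373692} \approx 3659.5 \ \text{mutations/Mbp},\\
T_{sys} &= \frac{N_{syn}}{L_{target}} = \frac{11439}{1.373692} \approx 8327.2 \ \text{mutations/Mbp}.
\end{aligned}
```

The sum of the two classes matches the 16466 mutations reported for this sample.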
Fig. 3 shows a thumbnail of the mutations obtained from sample PM00G18**0228.
The above are only preferred embodiments of the present invention and are not intended to limit it. For those skilled in the art, the invention may be modified and varied in many ways. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within its scope of protection.

Claims (7)

1. A method for constructing a linear model for predicting tumor mutation load, characterized by comprising the following steps:
S1, obtaining the mutation data of the sample to be tested by sequencing and sequence analysis, screening synonymous and nonsynonymous mutations in protein-coding regions, and counting the number of synonymous mutations $N_{syn}$ and the number of nonsynonymous mutations $N_{non}$ in the target region;
S2, calculating the length of the protein-coding region of the target region, $L_{target}$, calculating the synonymous-mutation term of the target-region tumor mutation load, $T_{sys} = N_{syn}/L_{target}$, and calculating the nonsynonymous-mutation term, $T_{non} = N_{non}/L_{target}$;
S3, establishing a multivariate linear model and calculating the tumor mutation load TMB of the sample to be tested:
$TMB = a_1 T_{sys} + a_2 T_{non} + a_3 \log(T_{non})$, where $a_1$, $a_2$ and $a_3$ are obtained by fitting to existing database data.
2. The construction method according to claim 1, characterized in that the existing database includes the TCGA database.
3. A method for predicting tumor mutation load, characterized by comprising the following steps:
S1, obtaining a tumor sample and a normal sample from the same patient and extracting DNA from each;
S2, capturing tumor-related genes with probes according to the principle of target-region capture;
S3, performing high-throughput sequencing and sequence analysis to obtain the mutation data of the tumor sample; and
S4, predicting the tumor mutation load TMB with the linear model built by the method for constructing a linear model for predicting tumor mutation load according to claim 1.
4. The method according to claim 3, characterized in that S3 further includes: selecting high-quality sequencing reads by removing reads whose N content is greater than 5% and removing reads that contain adapter sequence.
5. The method according to claim 4, characterized in that the sequence analysis in S3 includes: aligning the tumor sample DNA and the normal sample DNA to the reference genome with alignment software, then detecting the mutation data of the tumor sample with variant-detection software.
6. The method according to claim 4, characterized in that S4 further includes: annotating the mutation sites with the ANNOVAR software to obtain the synonymous and nonsynonymous mutations of the target region, and counting each.
7. A device for predicting tumor mutation load, characterized in that:
the device is used to store or run modules, or the modules are components of the device; wherein the modules are software modules, of which there are one or more, and the software modules are used to execute the method according to any one of claims 3 to 6.
CN201811447772.4A 2018-11-29 2018-11-29 Method for constructing linear model for predicting tumor mutation load, method and device for predicting tumor mutation load Active CN109767811B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811447772.4A CN109767811B (en) 2018-11-29 2018-11-29 Method for constructing linear model for predicting tumor mutation load, method and device for predicting tumor mutation load

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811447772.4A CN109767811B (en) 2018-11-29 2018-11-29 Method for constructing linear model for predicting tumor mutation load, method and device for predicting tumor mutation load

Publications (2)

Publication Number Publication Date
CN109767811A true CN109767811A (en) 2019-05-17
CN109767811B CN109767811B (en) 2020-01-31

Family

ID=66450349

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811447772.4A Active CN109767811B (en) 2018-11-29 2018-11-29 Method for constructing linear model for predicting tumor mutation load, method and device for predicting tumor mutation load

Country Status (1)

Country Link
CN (1) CN109767811B (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110504032A (en) * 2019-08-23 2019-11-26 元码基因科技(无锡)有限公司 The method for predicting Tumor mutations load based on the image procossing of hematoxylin-eosin dye piece
CN111826447A (en) * 2020-09-21 2020-10-27 求臻医学科技(北京)有限公司 Method for detecting tumor mutation load and prediction model
CN111933219A (en) * 2020-09-16 2020-11-13 北京求臻医学检验实验室有限公司 Detection method of molecular marker tumor deletion mutation load
CN112786103A (en) * 2020-12-31 2021-05-11 普瑞基准生物医药(苏州)有限公司 Method and device for analyzing feasibility of target sequencing Panel for estimating tumor mutation load
CN113257349A (en) * 2021-06-10 2021-08-13 元码基因科技(北京)股份有限公司 Method for selecting design interval for analyzing tumor mutation load and application

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106570349A (en) * 2016-10-28 2017-04-19 深圳华大基因科技服务有限公司 Specificity tumor probe area designing method for acquiring high-throughput sequencing in target area, device and probe
CN107287285A (en) * 2017-03-28 2017-10-24 上海至本生物科技有限公司 It is a kind of to predict the method that homologous recombination absent assignment and patient respond to treatment of cancer
CN108009400A (en) * 2018-01-11 2018-05-08 至本医疗科技(上海)有限公司 Full-length genome Tumor mutations load forecasting method, equipment and storage medium
CN108470114A (en) * 2018-04-27 2018-08-31 元码基因科技(北京)股份有限公司 The method of two generation sequencing datas analysis Tumor mutations load based on single sample
WO2018175501A1 (en) * 2017-03-20 2018-09-27 Caris Mpi, Inc. Genomic stability profiling
CN108588194A (en) * 2018-05-28 2018-09-28 北京诺禾致源科技股份有限公司 Utilize the method and device of high-flux sequence Data Detection Tumor mutations load

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
KEIICHI HATAKEYAMA ET AL.: "Tumor mutational burden analysis of 2,000 Japanese cancer genomes using whole exome and targeted gene panel sequencing", 《BIOMEDICAL RESEARCH》 *
周进学 等: "肿瘤基因高通量捕获测序技术检测肝癌细胞株体细胞突变", 《中华实验外科杂志》 *

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110504032A (en) * 2019-08-23 2019-11-26 元码基因科技(无锡)有限公司 The method for predicting Tumor mutations load based on the image procossing of hematoxylin-eosin dye piece
CN110504032B (en) * 2019-08-23 2022-09-09 元码基因科技(无锡)有限公司 Method for predicting tumor mutation load based on image processing of hematoxylin-eosin staining tablet
CN111933219A (en) * 2020-09-16 2020-11-13 北京求臻医学检验实验室有限公司 Detection method of molecular marker tumor deletion mutation load
CN111933219B (en) * 2020-09-16 2021-06-08 北京求臻医学检验实验室有限公司 Detection method of molecular marker tumor deletion mutation load
CN111826447A (en) * 2020-09-21 2020-10-27 求臻医学科技(北京)有限公司 Method for detecting tumor mutation load and prediction model
CN111826447B (en) * 2020-09-21 2021-01-05 求臻医学科技(北京)有限公司 Method for detecting tumor mutation load and prediction model
CN112786103A (en) * 2020-12-31 2021-05-11 普瑞基准生物医药(苏州)有限公司 Method and device for analyzing feasibility of target sequencing Panel for estimating tumor mutation load
CN112786103B (en) * 2020-12-31 2024-03-15 普瑞基准生物医药(苏州)有限公司 Method and device for analyzing feasibility of target sequencing Panel in estimating tumor mutation load
CN113257349A (en) * 2021-06-10 2021-08-13 元码基因科技(北京)股份有限公司 Method for selecting design interval for analyzing tumor mutation load and application
CN113257349B (en) * 2021-06-10 2021-10-01 元码基因科技(北京)股份有限公司 Method for selecting design interval for analyzing tumor mutation load and application

Also Published As

Publication number Publication date
CN109767811B (en) 2020-01-31


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant