CN110229902A - Method for determining an assessment gene group for gastric cancer prognosis prediction

Info

Publication number
CN110229902A
Authority
CN
China
Prior art keywords
gene
gastric cancer
rhoa
sample
assessment
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910550753.2A
Other languages
Chinese (zh)
Inventor
施巍炜
牟硕
王凯
柳文进
赵松辉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
To Medical Science And Technology (shanghai) Co Ltd
Original Assignee
To Medical Science And Technology (shanghai) Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by To Medical Science And Technology (shanghai) Co Ltd
Priority to CN201910550753.2A
Publication of CN110229902A
Legal status: Pending

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6876 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
    • C12Q1/6883 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material
    • C12Q1/6886 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material for cancer
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q2600/00 Oligonucleotides characterized by their use
    • C12Q2600/118 Prognosis of disease development
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q2600/00 Oligonucleotides characterized by their use
    • C12Q2600/156 Polymorphic or mutational markers

Landscapes

  • Chemical & Material Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Organic Chemistry (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Engineering & Computer Science (AREA)
  • Immunology (AREA)
  • Pathology (AREA)
  • Analytical Chemistry (AREA)
  • Zoology (AREA)
  • Genetics & Genomics (AREA)
  • Wood Science & Technology (AREA)
  • Physics & Mathematics (AREA)
  • Biotechnology (AREA)
  • Microbiology (AREA)
  • Molecular Biology (AREA)
  • Hospice & Palliative Care (AREA)
  • Biophysics (AREA)
  • Oncology (AREA)
  • Biochemistry (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)
  • Investigating Or Analysing Biological Materials (AREA)

Abstract

The present invention provides a method for determining an assessment gene group for gastric cancer prognosis prediction, characterized in that an assessment gene group containing multiple genes is obtained by screening the genes included in the RhoA protein activity regulatory pathway, that is, the pathway that regulates RhoA protein activity. The method specifically comprises the following steps: Step 1, taking the genes included in the RhoA gene regulatory pathway as a candidate target gene set; Step 2, based on the expression data of each gene in the candidate target gene set and the expression data of the RhoA gene for each gastric cancer sample used for screening, determining a pending assessment gene group according to a predetermined determination rule; Step 3, for a gastric cancer validation sample group comprising multiple gastric cancer validation samples, classifying each validation sample according to a predetermined classification method as a good-prognosis sample or a poor-prognosis sample, performing Cox survival analysis on all good-prognosis samples and all poor-prognosis samples, and thereby determining whether the pending assessment gene group is an assessment gene group that can be used for assessing gastric cancer prognosis.

Description

Method for determining an assessment gene group for gastric cancer prognosis prediction
Technical field
The invention belongs to the field of biology, and in particular relates to a method for determining an assessment gene group for gastric cancer prognosis prediction.
Background
Gastric cancer prognosis prediction is of great significance for assessing and guiding the design of clinical treatment plans and for research on new therapeutic targets.
In gastric cancer, whole-exome mutation analysis has shown that the RhoA gene is mutated at a relatively high rate in the gastric cancer population, with mutations particularly enriched in the diffuse subtype. However, when the prognosis associated with RhoA mutation is examined by survival analysis, that is, when prognosis groups are defined by mutation of the RhoA gene alone, no significant relationship is found. Associating gastric cancer prognosis with the RhoA gene alone therefore cannot meet the needs of gastric cancer prognosis prediction, and a new predictive indicator that is highly correlated with gastric cancer prognosis is needed.
In addition, current technical solutions for relating genes to prognosis in gastric cancer fall into two main classes. The first class associates the mutation of a single gene with prognosis. This kind of indicator is limited by the mutation frequency in the population: for example, the mutation frequency of the RhoA gene is 6.3%, so the number of samples that can be collected limits the robustness of the survival analysis result. The second class associates the mutation status of certain gene combinations with prognosis; a multi-gene molecular marker can reach a higher mutation frequency. However, the genes chosen are usually compiled from gene sets found to have a relatively high mutation frequency in tumors, and reliable evidence of a biological mechanism linking the genes is lacking, which makes such markers difficult to apply reliably in clinical practice.
Summary of the invention
The present invention provides a method for determining an assessment gene group for gastric cancer prognosis.
To achieve the above object, the present invention adopts the following technical solutions:
The object of the present invention is to provide a method for determining an assessment gene group for gastric cancer prognosis prediction, characterized in that an assessment gene group containing multiple genes is obtained by screening the genes included in the RhoA protein activity regulatory pathway, that is, the pathway that regulates RhoA protein activity. The method specifically comprises the following steps: Step 1, taking the genes included in the RhoA gene regulatory pathway as a candidate target gene set; Step 2, based on the expression data of each gene in the candidate target gene set and the expression data of the RhoA gene for each gastric cancer sample used for screening, determining a pending assessment gene group according to a predetermined determination rule; Step 3, for a gastric cancer validation sample group comprising multiple gastric cancer validation samples, classifying each validation sample according to a predetermined classification method as a good-prognosis sample or a poor-prognosis sample, performing Cox survival analysis on all good-prognosis samples and all poor-prognosis samples, and thereby determining whether the pending assessment gene group is an assessment gene group that can be used for assessing gastric cancer prognosis.
The determination method provided by the present invention may further have the following feature: in Step 2, the predetermined determination rule is as follows. The median RhoA gene expression level is determined from the RhoA expression data of all gastric cancer samples. Based on this median, all gastric cancer samples are divided into a RhoA high expression group and a RhoA low expression group. Then, using the expression data of each gene in the candidate target gene set for each gastric cancer sample, a differential expression significance analysis is performed for each gene in the candidate target gene set, comparing the set of expression values of that gene obtained from the samples in the high expression group with the set of expression values of that gene obtained from the samples in the low expression group. All genes in the candidate target gene set whose differential expression significance analysis result is significant, together with the RhoA gene, are determined to be the pending assessment gene group.
The determination method provided by the present invention may further have the following feature: when the significance probability value p < 0.05, the differential expression significance analysis result is significant.
The determination method provided by the present invention may further have the following feature: in Step 3, the predetermined classification method is as follows. When the mutation of at least one gene in the pending assessment gene group of a gastric cancer validation sample meets the predetermined mutation conditions, that validation sample is determined to be a good-prognosis sample; when no gene in the pending assessment gene group of a gastric cancer validation sample has a mutation that meets the predetermined mutation conditions, that validation sample is determined to be a poor-prognosis sample. When the survival difference significance analysis between the result of the Cox survival analysis of all good-prognosis samples and the result of the Cox survival analysis of all poor-prognosis samples is significant, the pending assessment gene group is determined to be an assessment gene group that can be used for assessing gastric cancer prognosis.
The determination method provided by the present invention may further have the following feature: when the significance probability value p < 0.05, the result of the survival difference significance analysis is significant.
The determination method provided by the present invention may further have the following feature: the predetermined mutation conditions are that the mutated gene simultaneously meets all of the following: the gene carries a single nucleotide mutation or an insertion/deletion (indel) mutation; the single nucleotide mutation or indel is a loss-of-function mutation; the coverage of the single nucleotide mutation or indel is greater than or equal to a predetermined coverage; the number of reads supporting the single nucleotide mutation or indel is greater than or equal to a predetermined number; the strand bias is less than or equal to a predetermined percentage; the distance from the site of the single nucleotide mutation or indel to any repetitive sequence region present in the gene is greater than a predetermined distance; and functional annotation of the gene shows that the mutation lies in a protein-coding region.
The determination method provided by the present invention may further have the following feature: the predetermined coverage is 30x.
The determination method provided by the present invention may further have the following feature: the predetermined number of supporting reads is 3.
The determination method provided by the present invention may further have the following feature: the predetermined percentage is 95%.
The determination method provided by the present invention may further have the following feature: the predetermined distance is 5 bp.
The determination method provided by the present invention may further have the following feature: the assessment gene group includes the following genes: RHOA, ARAP3, ARHGAP4, ARHGAP6, ARHGDIA, ARHGEF10, ARHGEF17, ARHGEF25, ARHGEF3, CDC42, DEF6, DLC1, DYNLL1, ECT2, FARP1, LRP2, MCF2L, MTG1, NET1, OBSCN, PAK1, PKN1, PRPF38B, SLC6A2, SRGAP1, TRIO and TSPAN1.
Action and effect of the invention
In the method provided by the present invention for determining an assessment gene group for gastric cancer prognosis prediction, the assessment gene group is obtained by screening the genes included in the RhoA protein activity regulatory pathway, so the resulting group consists of genes related to the regulation of RhoA protein activity. RhoA protein activity is closely tied to the RhoA signaling pathway that regulates cell contraction, movement and migration, and plays a crucial role throughout the life activities of tumor cells. Therefore, compared with associating patient prognosis with gene combinations chosen from gene sets that merely show a relatively high mutation frequency in gastric tumors, the assessment gene group determined by the method of the invention can be applied more reliably in clinical practice. In particular, the assessment gene group of the present embodiment predicts gastric cancer prognosis in Chinese gastric cancer patients with high accuracy and can be applied to that population with high reliability. In addition, because the assessment gene group determined by this method contains multiple genes, it is far less constrained by mutation frequency in the population than methods that associate the mutation of a single gene with prognosis, which in turn greatly reduces the extent to which the number of collected samples limits the robustness of the survival analysis result.
Brief description of the drawings
Fig. 1 is a schematic diagram of the regulation of the RhoA protein activity regulatory pathway of the present invention;
Fig. 2 is a flow chart of the steps of the determination method in Embodiment 1 of the present invention;
Fig. 3 is a schematic diagram of the determination of the median RhoA gene expression value in the determination method in Embodiment 1 of the present invention;
Fig. 4 shows the result of determining the assessment gene group in the determination method in Embodiment 1 of the present invention;
Fig. 5 is a schematic diagram of the survival analysis result for the gastric cancer validation sample group from the TCGA database in Embodiment 1 of the present invention;
Fig. 6 is a schematic diagram of the survival analysis result for the gastric cancer validation sample group from the ACRG database in Embodiment 1 of the present invention;
Fig. 7 is a schematic diagram of the hazard ratio results for the two gastric cancer validation sample groups in Embodiment 1 of the present invention;
Fig. 8 is a schematic diagram of the survival analysis result from the clinical case validation in Embodiment 3 of the present invention.
Specific embodiments
Specific embodiments of the invention are described below with reference to the drawings. For the specific methods or materials used in the embodiments, those skilled in the art may, within the technical concept of the present invention, make conventional substitutions based on existing technology; the invention is not limited to what is specifically recorded in the embodiments.
Unless otherwise specified, the methods used in the embodiments are conventional methods, and the materials, reagents and the like that are used are commercially available.
Definitions and terms used in the following embodiments:
1. RhoA protein activity regulatory pathway
In functional genomics, this refers to the whole network of pathways that influence the regulation of RhoA protein activity.
Fig. 1 is a schematic diagram of the regulation of the RhoA protein activity regulatory pathway of the present invention.
As shown in Fig. 1, RHOA encodes a member of the Rho family of small GTPases, which cycles between an inactive GDP-bound state and an active GTP-bound state. Activation of RhoA is achieved mainly through guanine nucleotide exchange factors (GEFs), encoded by genes such as OBSCN, TRIO and NET1. Inactivation is carried out mainly by GTPase-activating proteins (GAPs), encoded by genes such as DLC1, SRGAP1 and ARAP3. In addition, ARHGDIA controls the sequestration of RhoA-GDP in the RhoA-GDI complex. RhoA protein in the activated state can act on downstream effector molecules and thereby influence processes such as tumor invasion and metastasis.
2. Regulatory genes
The RhoA protein has two conformational states, RhoA-GTP and RhoA-GDP, and switches between them, that is, between active and inactive. GDP-bound RhoA is free in the cytoplasm, whereas GTP-bound RhoA acts on intracellular effectors. Three classes of proteins regulate the switch between active and inactive states: (1) guanine nucleotide exchange factors (GEFs), which promote the release of GDP; (2) GTPase-activating proteins (GAPs), which are negative regulators that accelerate GTP hydrolysis by the RhoA GTPase, converting RhoA from the active RhoA-GTP state to the inactive RhoA-GDP state; (3) GDP dissociation inhibitors (GDIs), which prevent GDP from separating from the RhoA protein and can inhibit RhoA GTPase activity. The regulatory genes referred to in this patent are mainly the genes encoding these three classes of proteins.
3. Coverage (sequencing coverage)
Refers to the number of layers of reads covering a position after the sequencing reads are aligned to the reference genome. For a single gene it refers to the coverage of that gene; for a single mutation it refers to the coverage at that mutation.
4. Supporting reads
Generally refers to the number of sequencing reads that align to the same mutation site or gene locus on the reference genome. The larger the number, the higher the probability that the tested sample carries the mutation or gene in question.
5. Strand bias
Refers to a seriously unbalanced distribution between forward and reverse reads, which usually occurs in the end regions of capture probes. It is expressed as the percentage of all reads aligned to the reference genome at a site that are forward reads or reverse reads, taking whichever orientation (forward or reverse) accounts for the larger share. A large value most likely indicates that the capture has a strand preference, which can lead to false-positive results.
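The definition above can be illustrated with a minimal sketch. The function name `strand_bias_fraction` and its inputs (forward and reverse read counts at one site) are hypothetical and chosen only for illustration; the patent does not give a formula beyond the verbal definition.

```python
def strand_bias_fraction(forward_reads: int, reverse_reads: int) -> float:
    """Share of reads on the dominant strand at one site.

    0.5 means perfectly balanced strands; 1.0 means all reads come from one strand.
    This is an illustrative reading of the definition, not the patent's own code.
    """
    total = forward_reads + reverse_reads
    if total == 0:
        raise ValueError("no reads cover this site")
    # Take whichever orientation accounts for the larger share of reads.
    return max(forward_reads, reverse_reads) / total
```

For example, a site supported by 19 forward reads and 1 reverse read has a strand bias of 0.95, which sits exactly at the 95% limit used later in the embodiment.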
6. Mutation of unknown function
Refers to a mutation whose annotation in current databases is not supported by specific biological experiments, so that the actual effect of the mutation is unknown.
7. Non-coding region
In exome sequencing, the actual probe design covers not only all protein-coding genes but also some important genomic sequences outside the coding regions, such as certain promoter regions, intron regions, and the 5' untranslated region (5' UTR) and 3' untranslated region (3' UTR) flanking a gene. These non-coding regions usually do not have a direct effect but indirectly regulate gene expression.
Embodiment 1
Fig. 2 is a flow chart of the steps of the determination method in Embodiment 1 of the present invention.
This embodiment illustrates the method for determining the assessment gene group for gastric cancer prognosis prediction. As shown in Fig. 2, the determination method specifically comprises the following steps:
Step S1, determining the candidate target gene set, specifically:
All genes included in the RhoA protein activity regulatory pathway are taken as the candidate target gene set. Specifically, in this embodiment the candidate target gene set consists of regulatory genes of the RhoA protein activity regulatory pathway that are verified by experimental data in the NCI database; a total of 48 regulatory genes were selected through literature tracking and verification as the candidate target gene set.
Step S2, determining the pending assessment gene group: based on the expression data of each gene in the candidate target gene set and the expression data of the RhoA gene for each gastric cancer sample used for screening, the pending assessment gene group is determined according to the predetermined determination rule, specifically:
In this embodiment, 123 gastric cancer samples are first collected from the database published by TCGA and used as the gastric cancer samples for screening to determine the pending assessment gene group. Each of these samples has corresponding whole-transcriptome data, survival data, genomic mutation data and expression data for each gene in the candidate target gene set. The pending assessment gene group is then determined according to the predetermined determination rule.
The predetermined determination rule is specifically:
Fig. 3 is a schematic diagram of the determination of the median RhoA gene expression value in the determination method in Embodiment 1 of the present invention.
S2-1, first determine the median RhoA gene expression level in the gastric cancer samples, specifically:
The median RhoA gene expression level is determined from the RhoA expression data of all gastric cancer samples. In this embodiment, as shown in Fig. 3, the 123 gastric cancer samples are sorted by the RhoA expression value of each sample, based on the TCGA whole-transcriptome data. In Fig. 3, the frequency distribution is the grey histogram, the density is the curve, and the central vertical dashed line is the median RhoA expression level in the group. The expression level is estimated from the transcriptome read counts as the number of fragments per kilobase of gene per million reads (FPKM), and the horizontal axis is on a base-2 logarithmic scale. In this embodiment, the median RhoA gene expression level is determined to be FPKM = 177.
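As a worked illustration of the expression measure in S2-1, the following minimal sketch computes FPKM from raw counts, takes the cohort median, and converts a value to the log2 scale used on the horizontal axis. The function names and the toy numbers are assumptions for illustration; only the FPKM definition, the log2 axis and the median value of 177 come from the text.

```python
import math
import statistics

def fpkm(fragments_in_gene: float, gene_length_bp: float, total_mapped_fragments: float) -> float:
    """Fragments per kilobase of gene per million mapped fragments."""
    return fragments_in_gene * 1e9 / (gene_length_bp * total_mapped_fragments)

def median_expression(values):
    """Median expression across samples (hypothetical helper around statistics.median)."""
    return statistics.median(values)

def log2_expression(value: float) -> float:
    """Expression on the base-2 log scale used for plotting."""
    return math.log2(value)

# Toy cohort of three RhoA FPKM values (illustrative numbers, not patent data).
toy_rhoa_fpkm = [150.0, 177.0, 300.0]
toy_median = median_expression(toy_rhoa_fpkm)
```

With these toy values the median is 177.0, and `log2_expression(177.0)` is roughly 7.47, which is where a median line would fall on a log2 axis.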
S2-2, then determine the RhoA high expression group and the RhoA low expression group among the gastric cancer samples, specifically:
Based on the median, all gastric cancer samples (the 123 gastric cancer samples in this embodiment) are divided into a RhoA high expression group and a RhoA low expression group. The high expression group consists of all gastric cancer samples whose RhoA expression level is greater than the median, and the low expression group consists of all gastric cancer samples whose RhoA expression level is lower than the median; the sample at the median may be placed in either the high expression group or the low expression group.
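The grouping in S2-2 can be sketched as follows. The function name `split_by_median` and the dictionary input (sample ID mapped to RhoA expression) are hypothetical. The text allows the median sample to go to either group; this sketch places it in the high expression group, which is an arbitrary choice.

```python
import statistics

def split_by_median(rhoa_expression: dict):
    """Split samples into RhoA high and low expression groups at the median.

    rhoa_expression maps sample ID -> RhoA expression value (e.g. FPKM).
    Samples equal to the median go to the high group (an assumption; the
    source permits either group).
    """
    median = statistics.median(rhoa_expression.values())
    high = sorted(s for s, v in rhoa_expression.items() if v >= median)
    low = sorted(s for s, v in rhoa_expression.items() if v < median)
    return median, high, low

# Toy cohort of 123 samples with distinct expression values 1..123.
toy = {f"S{i:03d}": float(i) for i in range(1, 124)}
toy_median, toy_high, toy_low = split_by_median(toy)
```

With 123 distinct values this yields 62 samples in the high group and 61 in the low group.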
S2-3, next, perform a differential expression significance analysis between the high expression group and the low expression group for each gene in the candidate target gene set, specifically:
Using the expression data of each gene in the candidate target gene set for each gastric cancer sample, a differential expression significance analysis is performed for each gene, comparing the set of expression values of that gene obtained from the samples in the high expression group with the set of expression values of that gene obtained from the samples in the low expression group. For example, suppose there are 62 gastric cancer samples in the high expression group and 61 in the low expression group. For a gene A in the candidate target gene set, the expression values of gene A in all samples of the high expression group form one expression data set (62 values), and likewise the expression values of gene A in all samples of the low expression group form another expression data set (61 values); the differential expression significance analysis is performed between these two data sets. The same analysis is carried out for every gene in the candidate target gene set.
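The text does not name the statistical test used for the differential expression significance analysis. As one possible illustration only, the sketch below uses a two-sided Mann-Whitney U test with a normal approximation (no tie or continuity correction), implemented with the standard library; the choice of test and the function name `mann_whitney_p` are assumptions, not the patent's method.

```python
import math

def mann_whitney_p(high_values, low_values) -> float:
    """Two-sided Mann-Whitney U p-value via the normal approximation.

    Illustrative stand-in for the unspecified significance test; it ignores
    tie and continuity corrections, so p-values are approximate.
    """
    pooled = sorted([(v, 0) for v in high_values] + [(v, 1) for v in low_values])
    n = len(pooled)
    rank_sum_high = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0  # mean of ranks i+1 .. j for tied values
        rank_sum_high += avg_rank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    n1, n2 = len(high_values), len(low_values)
    u1 = rank_sum_high - n1 * (n1 + 1) / 2.0
    mean_u = n1 * n2 / 2.0
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (u1 - mean_u) / sd_u
    return math.erfc(abs(z) / math.sqrt(2.0))
```

In use, `high_values` would hold the 62 expression values of gene A from the high expression group and `low_values` the 61 values from the low expression group, and the call would be repeated once per candidate gene.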
S2-4, finally, all genes in the candidate target gene set whose differential expression significance analysis result is significant, together with the RhoA gene, are determined to be the pending assessment gene group. In this embodiment, a significance probability value p < 0.05 indicates that the differential expression significance analysis result is significant.
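The selection rule in S2-4 can be sketched in a few lines. The function name `select_pending_group`, the dictionary input and the placeholder gene names `GENE_X` and `GENE_Y` are hypothetical; the p < 0.05 cutoff and the addition of RhoA come from the text.

```python
def select_pending_group(p_values: dict, alpha: float = 0.05, anchor_gene: str = "RHOA") -> list:
    """Return the pending assessment gene group.

    p_values maps candidate gene symbol -> p-value of its differential
    expression test between the RhoA high and low expression groups.
    Genes with p < alpha are kept, and the RhoA gene itself is always added.
    """
    significant = sorted(g for g, p in p_values.items() if p < alpha and g != anchor_gene)
    return [anchor_gene] + significant

# Toy p-values (illustrative numbers only).
toy_group = select_pending_group({"TRIO": 0.001, "GENE_X": 0.2, "DLC1": 0.049, "GENE_Y": 0.05})
```

Note that a gene with p exactly equal to 0.05 is not selected, because the rule requires p < 0.05.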
Table 1 shows the results of the differential expression significance analysis for each gene in the candidate target gene set in Embodiment 1 of the present invention.
In Table 1:
Gene: gene name; Ensembl_ID: gene ID in the Ensembl database; High_group: average expression level of the corresponding gene in the RhoA high expression group; Low_group: average expression level of the corresponding gene in the RhoA low expression group; Up/down: whether the gene is upregulated, that is, whether its average expression level in the high expression group is higher than its average expression level in the low expression group.
p-value is the result of the statistical test between the high expression group and the low expression group for each gene; when it is less than 0.05, the differential expression of the corresponding gene between the high expression group and the low expression group is significant. q.value is the value obtained after multiple-comparison correction across all genes, with a threshold of 10% (q.value ≤ 0.1).
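The text does not say which multiple-comparison correction produces q.value. As an illustration under that caveat, the sketch below applies the Benjamini-Hochberg procedure, a common way to obtain q-values; the function name and the toy p-values are assumptions.

```python
def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values (q-values), in the input order.

    Assumed correction method; the source only states that q.value is the
    result after multiple-comparison correction.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        q_values[idx] = running_min
    return q_values

toy_q = benjamini_hochberg([0.01, 0.04, 0.03, 0.5])
passing = [q <= 0.1 for q in toy_q]
```

Here the first three toy genes pass the q.value ≤ 0.1 threshold and the fourth does not.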
As can be seen from Table 1, among the 48 genes, the first 26 genes in the table, as well as the RhoA gene itself, are significantly differentially expressed between the high expression group and the low expression group.
Fig. 4 shows the result of determining the assessment gene group in the determination method in Embodiment 1 of the present invention.
As shown in Fig. 4, in this embodiment 26 differentially expressed genes were found in the candidate target gene set (the 48 genes above). Together with RhoA itself, a total of 27 genes form the pending assessment gene group; the function and description of each gene are shown in Table 2.
Step S3, validating whether the pending assessment gene group is an assessment gene group
To perform the prognosis validation analysis, a gastric cancer validation sample group that comes from a single source database and comprises multiple gastric cancer validation samples is selected for validation.
First, each of the gastric cancer validation samples is classified according to the predetermined classification method as a good-prognosis sample (RhoA_S positive sample) or a poor-prognosis sample (RhoA_S negative sample). The predetermined classification method is as follows: when the mutation of at least one gene in the pending assessment gene group of a gastric cancer validation sample meets the predetermined mutation conditions, that validation sample is determined to be a good-prognosis sample; when no gene in the pending assessment gene group of a gastric cancer validation sample has a mutation that meets the predetermined mutation conditions, that validation sample is determined to be a poor-prognosis sample.
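The classification rule can be sketched as a set intersection. The 27 gene symbols are those listed in this document; the function name `classify_sample` and its input (the set of genes in one sample that carry a mutation already found to meet the predetermined mutation conditions) are hypothetical.

```python
ASSESSMENT_GENE_GROUP = frozenset({
    "RHOA", "ARAP3", "ARHGAP4", "ARHGAP6", "ARHGDIA", "ARHGEF10", "ARHGEF17",
    "ARHGEF25", "ARHGEF3", "CDC42", "DEF6", "DLC1", "DYNLL1", "ECT2", "FARP1",
    "LRP2", "MCF2L", "MTG1", "NET1", "OBSCN", "PAK1", "PKN1", "PRPF38B",
    "SLC6A2", "SRGAP1", "TRIO", "TSPAN1",
})

def classify_sample(genes_with_qualifying_mutation) -> str:
    """Label one validation sample.

    genes_with_qualifying_mutation: gene symbols whose mutation in this sample
    meets the predetermined mutation conditions (filtering is done elsewhere).
    """
    if ASSESSMENT_GENE_GROUP & set(genes_with_qualifying_mutation):
        return "RhoA_S positive"  # good-prognosis sample
    return "RhoA_S negative"      # poor-prognosis sample
```

A sample with a qualifying mutation in TRIO would therefore be labeled RhoA_S positive, while a sample whose only qualifying mutations fall outside the 27 genes would be labeled RhoA_S negative.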
In this embodiment, the predetermined mutation conditions are that the mutated gene simultaneously meets all of the following: the gene carries a single nucleotide mutation or an insertion/deletion (indel) mutation; the single nucleotide mutation or indel is not a gain-of-function mutation; the coverage of the single nucleotide mutation or indel is greater than or equal to the predetermined coverage; the number of reads supporting the single nucleotide mutation or indel is greater than or equal to the predetermined number; the strand bias is less than or equal to the predetermined percentage; the distance from the site of the single nucleotide mutation or indel to any repetitive sequence region present in the gene is greater than the predetermined distance; and functional annotation of the gene shows that the mutation lies in a protein-coding region.
In the above, single nucleotide mutations or indels are required because the function of these two types of mutation has been studied and is well understood.
The inventors have found that a gain-of-function mutation increases the expression of the gene, which enhances the activity of the cancer cell, making it similar to a wild-type tumor cell, whereas a loss-of-function mutation reduces tumor cell activity and is associated with a better clinical prognosis. In this embodiment, therefore, carrying a gene with a loss-of-function mutation is used as one of the conditions for classifying a gastric cancer validation sample as a good-prognosis sample, and correspondingly, carrying a gene with a gain-of-function mutation is used as one of the conditions for classifying a gastric cancer validation sample as a poor-prognosis sample.
The larger the number of supporting reads, the higher the probability that the tested sample carries the mutation or gene in question, and the more accurate the result. The larger the strand bias, the more false positives there are in the detection result, so a smaller value is more accurate. The inventors consider that current sequencing technologies give unstable results around repetitive sequences, which leads to erroneous mutation calls; the larger the predetermined distance, the lower the probability of an error in the detection result.
In the present embodiment, it may be preferable that predetermined coverage is 30x, and predetermined item number is 3, and predetermined percentage 95% makes a reservation for Distance is 5bp.
In addition, inventors believe that, for being mutated Unknown Function after annotation, such mutation can interfere the analysis of data, It needs to remove from subsequent analysis;And be noncoding region after annotating, not usually directly affect, but indirect adjustments and controls gene Expression.Only consider the result that annotation obtains for the mutation of protein encoding regions in the present embodiment.
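The predetermined mutation condition can be sketched as a single filter function. This is only an illustration: the `Mutation` record and its field names are hypothetical and do not correspond to the output of any particular variant caller, and the strand bias is assumed here to be expressed as a fraction between 0 and 1. The thresholds are the preferred values given above.

```python
from dataclasses import dataclass

# Preferred thresholds stated in this embodiment.
MIN_COVERAGE = 30          # predetermined coverage (30x)
MIN_SUPPORTING_READS = 3   # predetermined number of supporting reads
MAX_STRAND_BIAS = 0.95     # predetermined percentage (95%), as a fraction
MIN_REPEAT_DISTANCE = 5    # predetermined distance in bp


@dataclass
class Mutation:
    # Hypothetical record; field names are assumptions for illustration.
    kind: str               # "snv" or "indel"
    gain_of_function: bool  # True if annotated as gain-of-function
    coverage: int           # sequencing depth at the site
    supporting_reads: int   # reads supporting the mutation
    strand_bias: float      # assumed to be a fraction in [0, 1]
    repeat_distance: int    # bp to the nearest repetitive region in the gene
    annotation: str         # e.g. "coding", "noncoding", "unknown"


def meets_mutation_condition(m: Mutation) -> bool:
    """Return True only if every criterion of the condition holds."""
    return (
        m.kind in ("snv", "indel")
        and not m.gain_of_function
        and m.coverage >= MIN_COVERAGE
        and m.supporting_reads >= MIN_SUPPORTING_READS
        and m.strand_bias <= MAX_STRAND_BIAS
        and m.repeat_distance > MIN_REPEAT_DISTANCE
        and m.annotation == "coding"
    )
```

Note that the distance criterion is strict (greater than 5 bp), while the coverage, read-count, and strand-bias criteria include the boundary values.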
In the present embodiment, to improve the reliability of the validation result, two gastric cancer validation sample sets obtained from two source databases are validated separately; the two databases are the TCGA and ACRG gastric cancer databases.
Then, Cox survival analysis is performed on all good-prognosis samples to obtain a result, denoted A here for convenience, and Cox survival analysis is performed on all poor-prognosis samples to obtain a result, denoted B. A survival-difference significance analysis is then performed between results A and B; when the result is significant, the candidate assessment gene group is determined to be an assessment gene group that can be used to assess gastric cancer prognosis. In the present embodiment, a significance probability value p < 0.05 indicates that the result of the survival-difference significance analysis is significant. The results are shown in Figs. 5-8.
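As a rough illustration of a survival-difference significance analysis between two groups, the following sketch implements a two-group log-rank test in plain Python. This is a substitute chosen for brevity, not the Cox model used in this embodiment; the log-rank test is closely related to the score test of a Cox model with a single binary covariate. The function name and input layout are assumptions.

```python
import math


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test; returns (chi_square, p_value).

    times_*: follow-up time per sample.
    events_*: 1 if death was observed, 0 if the sample was censored.
    """
    data = [(t, e, 0) for t, e in zip(times_a, events_a)]
    data += [(t, e, 1) for t, e in zip(times_b, events_b)]
    event_times = sorted({t for t, e, _ in data if e == 1})

    observed_a = expected_a = variance = 0.0
    for t in event_times:
        at_risk_a = sum(1 for u, _, g in data if u >= t and g == 0)
        at_risk_b = sum(1 for u, _, g in data if u >= t and g == 1)
        n = at_risk_a + at_risk_b
        deaths_a = sum(1 for u, e, g in data if u == t and e == 1 and g == 0)
        deaths = sum(1 for u, e, _ in data if u == t and e == 1)
        observed_a += deaths_a
        expected_a += deaths * at_risk_a / n
        if n > 1:  # hypergeometric variance is undefined for n == 1
            variance += (deaths * (at_risk_a / n) * (at_risk_b / n)
                         * (n - deaths) / (n - 1))

    if variance == 0:
        return 0.0, 1.0
    chi_square = (observed_a - expected_a) ** 2 / variance
    # Survival function of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi_square / 2))
    return chi_square, p_value
```

With the threshold of this embodiment, a returned p-value below 0.05 would be read as a significant survival difference between the two groups.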
Fig. 5 is a schematic diagram of the survival analysis results for the gastric cancer validation sample set from the TCGA database in Embodiment 1 of the present invention;
Fig. 6 is a schematic diagram of the survival analysis results for the gastric cancer validation sample set from the ACRG database in Embodiment 1 of the present invention.
Figs. 5 and 6 each show the difference in overall survival between all good-prognosis samples (RhoA_S-positive samples) and all poor-prognosis samples (RhoA_S-negative samples) of the corresponding database. In the Kaplan-Meier plot, the thick line represents the RhoA_S-positive samples and the thin line represents the RhoA_S-negative samples. The vertical axis of the upper panel represents the survival probability (Probability of Survival), and the horizontal axis represents survival time (Time, in days). An "X" on a curve marks censored data, that is, a sample whose follow-up ended at that point. The p-value of the significance test of the Cox model is given in the lower left corner of each figure. The numbers beneath each plot give the number of surviving individuals of each type at different survival times; the upper row corresponds to the negative samples and the lower row to the positive samples.
Fig. 7 is a schematic diagram of the hazard ratio results for the two gastric cancer validation sample sets in Embodiment 1 of the present invention.
In Fig. 7, each black square represents the magnitude of the hazard ratio, and the line segments on either side represent the 95% confidence interval. The dashed line marks a hazard ratio of 1 (that is, equal risk for RhoA_S-positive and RhoA_S-negative individuals). A hazard ratio below 1 indicates that RhoA_S-positive individuals have better prognostic survival.
As Figs. 5-7 show, in both data sets the prognostic survival of the good-prognosis samples is better than that of the poor-prognosis samples, and the difference is statistically significant. The candidate assessment gene group is therefore determined to be an assessment gene group that can be used to assess gastric cancer prognosis. It follows from the determination process that, when predicting the prognosis of a gastric cancer patient, if the mutation of at least one gene in the assessment gene group of the patient's gastric cancer sample meets the predetermined mutation condition above, the patient can be predicted to have a good prognosis; if no gene in the assessment gene group of the gastric cancer sample carries a mutation meeting the predetermined mutation condition, the patient can be predicted to have a poor prognosis.
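The prediction rule just described reduces to a set intersection. In the minimal sketch below, `qualifying_genes` is a hypothetical input: the set of genes in one sample whose mutations have already passed the predetermined mutation condition. The function name and the "good"/"poor" labels are likewise assumptions for illustration.

```python
def predict_prognosis(qualifying_genes, assessment_genes):
    """Return "good" if at least one assessment gene carries a qualifying
    mutation in the sample, otherwise "poor"."""
    hits = set(qualifying_genes) & set(assessment_genes)
    return "good" if hits else "poor"
```

For example, a sample with a qualifying mutation in DLC1 would be predicted "good" when DLC1 belongs to the assessment gene group, whereas a sample with qualifying mutations only in genes outside the group would be predicted "poor".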
Embodiment 2
This embodiment illustrates the possible biological mechanism by which the assessment gene group (RhoA_S) determined in Embodiment 1 relates to prognosis. To this end, we performed functional enrichment on all genes in RhoA_S that meet the predetermined mutation condition in the good-prognosis samples; the pathways in which the mutations of the good-prognosis samples are mainly enriched are shown in Table 3.
Table 3 shows the pathways in which mutations in the good-prognosis samples are significantly enriched. In the table, the function number is the pathway identifier; the function description is the function that the pathway performs; the pathway gene count is the total number of genes in the pathway; and the number of genes meeting the predetermined mutation condition is, for each pathway, the total number of times that RhoA_S genes carrying a qualifying mutation occur in that pathway, counted over all RhoA_S-positive samples. For example, for the pathway "interaction between species", the RhoA_S genes meeting the predetermined mutation condition were counted 16 times in total.
Table 3 includes pathways for functions such as leukocyte migration and production of molecular mediator of immune response.
Table 3 shows that the pathways enriched for mutations in the RhoA_S-positive samples are mostly related to cell motility and the immune response, indicating that RhoA_S-positive and RhoA_S-negative samples differ at the genomic level in functions related to cell motility and the immune response.
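The per-pathway count described for Table 3 can be sketched as follows. The pathway names and gene memberships used in any call to this function would have to come from a pathway annotation source, which is not specified here; the function name and input layout are assumptions.

```python
from collections import Counter


def count_pathway_hits(pathways, positive_samples):
    """Count, per pathway, how often qualifying genes occur in it.

    pathways: dict mapping pathway name -> set of member genes.
    positive_samples: list of sets, one per RhoA_S-positive sample, each
        holding the RhoA_S genes with a qualifying mutation in that sample.
    Returns a Counter mapping pathway name -> total occurrences summed
    over all samples.
    """
    counts = Counter()
    for genes in positive_samples:
        for name, members in pathways.items():
            counts[name] += len(genes & members)
    return counts
```

Because the count is summed over samples, a gene mutated in several samples contributes once per sample, which is how a pathway can reach a count larger than the number of its member genes in RhoA_S.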
To further explore the biological significance of the assessment gene group determined in this embodiment, we collected, for the good-prognosis samples (RhoA_S-positive) and the poor-prognosis samples (RhoA_S-negative) in the TCGA database, immune microenvironment data and immunohistochemistry data on tumor-infiltrating cells; the results are shown in Table 4.
Table 4 compares immune-cell-related pathological and molecular indicators between the RhoA_S-positive and RhoA_S-negative samples.
In Table 4, Mean is the arithmetic mean of each indicator; Median is the median of each indicator; 1st Qu. is the first quartile (25th percentile) of each indicator; and 3rd Qu. is the third quartile (75th percentile) of each indicator.
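For reference, the four summary statistics of Table 4 can be computed with the Python standard library as sketched below. The quartile convention is an assumption: the text does not state which interpolation method was used, and the "inclusive" method is chosen here only as one common option.

```python
import statistics


def summarize(values):
    """Return the summary statistics used in Table 4 for one indicator."""
    # method="inclusive" is an assumed quartile convention.
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "1st Qu.": q1,
        "Median": median,
        "Mean": statistics.fmean(values),
        "3rd Qu.": q3,
    }
```

Applying `summarize` separately to the RhoA_S-positive and RhoA_S-negative values of an indicator gives the two rows that would be compared for that indicator.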
Table 4 shows that the content of infiltrating immune cells in tissue sections of RhoA_S-positive samples is higher. Immune cell composition inferred from RNA-seq signal shows that the content of CD8+ cells is higher in RhoA_S-positive samples. The diversity of the TCR genes with which CD8+ cells recognize tumor cells is also significantly higher in RhoA_S-positive samples. In addition, expression of the marker genes of cytolytic activity (granzyme A and perforin) is higher in RhoA_S-positive samples. These results show that the immune microenvironment of RhoA_S-positive samples is more active.
From the above we can conclude that, for the assessment gene group obtained by the determination method of Embodiment 1, the higher degree of immune cell infiltration and the stronger immune cell activity in RhoA_S-positive samples may be one of the reasons for the better prognosis, which allows the assessment gene group to predict gastric cancer prognosis accurately.
Embodiment 3
This embodiment confirms that the assessment gene group obtained by the determination method of Embodiment 1 can indeed serve as a clinical basis for gastric cancer prognosis prediction.
In this embodiment, we collected clinical samples of 61 diffuse-type gastric cancers and 50 intestinal-type gastric cancers from Chinese patients. These samples are mainly clinical stage II to stage III. For each sample we collected overall survival, death status, and other clinical information such as sex, age, and information on EBV and HER2 amplification (see Table 5).
DNA was extracted from these 111 gastric cancer clinical samples, whole-exome sequencing was performed on the samples to obtain exome sequencing results, and the sequencing results were then processed.
In processing the sequencing results in this embodiment, adapters and low-quality bases are removed from the raw data (only bases with quality q > 15 are retained), which reduces the influence of sequencer errors.
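The quality filter can be illustrated with a small trimming function. Two points are assumptions rather than statements of this embodiment: that low-quality bases are trimmed from the 3' end of each read, and that qualities are encoded as Phred+33 characters as in common FASTQ files. The function name is hypothetical.

```python
def trim_low_quality(seq, qual, min_q=15, offset=33):
    """Trim bases from the 3' end while their Phred quality is <= min_q.

    seq: read bases; qual: FASTQ quality string of the same length.
    Only bases with quality strictly greater than min_q are kept at the
    end of the read. Returns the trimmed (seq, qual) pair.
    """
    end = len(seq)
    while end > 0 and ord(qual[end - 1]) - offset <= min_q:
        end -= 1
    return seq[:end], qual[:end]
```

In Phred+33 encoding the character "0" corresponds to quality 15 and "1" to quality 16, so a trailing "0" is trimmed and a trailing "1" is kept.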
The raw data are aligned to the genome using the software bwa (parameters set to -k 10 -t 20 -W 5 -B 1 -U 5 -M, meaning that the minimum seed length is 10, the number of threads is 20, chains shorter than 5 are discarded, the mismatch penalty is 1, the penalty for an unpaired read pair is 5, and shorter split hits are marked as secondary alignments), with human genome hg19 as the reference.
PCR duplicates are removed using the picard software with default parameter settings.
Mutation analysis is then performed on the de-duplicated BAM files: point mutations are analyzed with Mutect, and insertions and deletions are analyzed with Pindel. The resulting VCF files of mutation information are collected and summarized.
Fig. 8 is a schematic diagram of the survival analysis results from the clinical case validation in Embodiment 3 of the present invention.
From the processed sequencing results, the assessment gene group of each of the 111 gastric cancer samples is identified, the good-prognosis samples and poor-prognosis samples among the 111 gastric cancer clinical samples are determined according to the predetermined classification method of Embodiment 1, and survival analysis is performed on the good-prognosis and poor-prognosis samples. The results are shown in Fig. 8.
Explanations of markings in Fig. 8 that are identical to those in Fig. 5 are omitted. As Fig. 8 shows, when the assessment gene group determined in Embodiment 1 is used to predict prognosis in Chinese gastric cancer patients, the significance is p = 0.00062. The assessment gene group of the invention thus has a clear advantage for the Chinese population, indicating that it fits the actual mutation distribution of the Chinese population and is well suited to prognosis prediction in that population.
Function and effect of the embodiments
As the validation in Embodiments 2 and 3 shows, the method for determining an assessment gene group for gastric cancer prognosis prediction provided in Embodiment 1 obtains the assessment gene group by screening the genes contained in the RhoA protein activity regulatory pathway, which regulates RhoA protein activity. The assessment gene group obtained in this way therefore contains genes related to the regulation of RhoA protein activity. RhoA protein activity is closely related to the RhoA signaling pathway that regulates cell contraction, movement, and migration, and plays a vital role throughout the life activities of tumor cells. Moreover, the RhoA gene has a relatively high mutation rate in the gastric cancer population, with mutations enriched particularly in the diffuse subtype of gastric cancer, so genes related to its expression regulation are also significantly correlated with gastric cancer prognosis. Consequently, compared with gene combinations that are chosen from the set of genes found to have a high mutation frequency across various gastric tumors and then associated with patient prognosis, the assessment gene group determined by the method of the invention can be applied more reliably in clinical practice. In particular, the assessment gene group of this embodiment assesses the prognosis of Chinese gastric cancer patients with high accuracy and can be applied with high reliability to gastric cancer prognosis prediction in Chinese gastric cancer patients. In addition, because the assessment gene group determined by this method contains multiple genes, it greatly reduces, compared with methods that associate the mutation of a single gene with prognosis, the limitation that the mutation frequency in the population and the size of the collected sample place on the robustness of the survival analysis result.

Claims (10)

1. A method for determining an assessment gene group for gastric cancer prognosis prediction, characterized in that:
an assessment gene group containing multiple genes is obtained by screening the genes contained in the RhoA protein activity regulatory pathway, which regulates RhoA protein activity,
the method specifically comprising the following steps:
Step 1: taking the genes contained in the RhoA gene regulatory pathway as a candidate target gene set;
Step 2: determining a candidate assessment gene group according to a predetermined determination rule, based on the expression data of each gene in the candidate target gene set and the expression data of the RhoA gene in each gastric cancer sample used for screening;
Step 3: based on a gastric cancer validation sample set comprising multiple gastric cancer validation samples, classifying each gastric cancer validation sample according to a predetermined classification method to obtain all good-prognosis samples and all poor-prognosis samples, and determining from Cox survival analysis of these samples whether the candidate assessment gene group is an assessment gene group that can be used to assess gastric cancer prognosis,
wherein, in step 2, the predetermined determination rule is:
determining the median RhoA gene expression level from the RhoA gene expression data of all the gastric cancer samples;
dividing all the gastric cancer samples, based on the median, into a RhoA high-expression group and a RhoA low-expression group;
for each gene in the candidate target gene set, based on the expression data of that gene in each gastric cancer sample, performing a differential expression significance analysis between the set of expression data of the gene obtained from the gastric cancer samples in the high-expression group and the set of expression data of the gene obtained from the gastric cancer samples in the low-expression group;
determining all genes in the candidate target gene set whose differential expression significance analysis result is significant, together with the RhoA gene, as the candidate assessment gene group.
2. The determination method according to claim 1, characterized in that:
wherein a significance probability value p < 0.05 indicates that the differential expression significance analysis result is significant.
3. The determination method according to claim 1, characterized in that:
wherein, in step 3, the predetermined classification method is: when the mutation of at least one gene in the candidate assessment gene group of a gastric cancer validation sample meets a predetermined mutation condition, the gastric cancer validation sample is determined to be a good-prognosis sample; when no gene in the candidate assessment gene group of a gastric cancer validation sample has a mutation meeting the predetermined mutation condition, the gastric cancer validation sample is determined to be a poor-prognosis sample;
when a survival-difference significance analysis between the result of Cox survival analysis of all the good-prognosis samples and the result of Cox survival analysis of all the poor-prognosis samples gives a significant result, the candidate assessment gene group is determined to be the assessment gene group that can be used to assess gastric cancer prognosis.
4. The determination method according to claim 3, characterized in that:
wherein a significance probability value p < 0.05 indicates that the result of the survival-difference significance analysis is significant.
5. The determination method according to claim 3, characterized in that:
wherein the predetermined mutation condition is that the mutated gene satisfies all of the following simultaneously: the gene carries a single-nucleotide mutation or an insertion/deletion mutation; the single-nucleotide mutation or insertion/deletion mutation is a loss-of-function mutation; the coverage at the single-nucleotide mutation or insertion/deletion mutation is greater than or equal to a predetermined coverage; the number of reads supporting the single-nucleotide mutation or insertion/deletion mutation is greater than or equal to a predetermined number; the strand bias is less than or equal to a predetermined percentage; the distance from the site of the single-nucleotide mutation or insertion/deletion mutation to any repetitive sequence region in the gene is greater than a predetermined distance; and functional annotation of the gene places the mutation in a protein-coding region.
6. The determination method according to claim 5, characterized in that:
wherein the predetermined coverage is 30x.
7. The determination method according to claim 5, characterized in that:
wherein the predetermined number is 3.
8. The determination method according to claim 5, characterized in that:
wherein the predetermined percentage is 95%.
9. The determination method according to claim 5, characterized in that:
wherein the predetermined distance is 5 bp.
10. The determination method according to any one of claims 1-9, characterized in that:
wherein the assessment gene group includes the following genes: RHOA, ARAP3, ARHGAP4, ARHGAP6, ARHGDIA, ARHGEF10, ARHGEF17, ARHGEF25, ARHGEF3, CDC42, DEF6, DLC1, DYNLL1, ECT2, FARP1, LRP2, MCF2L, MTG1, NET1, OBSCN, PAK1, PKN1, PRPF38B, SLC6A2, SRGAP1, TRIO and TSPAN1.
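For illustration only, the predetermined determination rule of step 2 in claim 1 can be sketched as follows. The claims do not name a specific differential expression test; a two-sided Mann-Whitney U test with a normal approximation (no tie or continuity correction) is assumed here purely as an example. Assigning samples equal to the median to the low-expression group is also an assumption, as are all function names and the input layout.

```python
import math
import statistics


def split_by_median(rhoa_expr):
    """rhoa_expr: dict sample id -> RhoA expression. Returns (high, low)."""
    med = statistics.median(rhoa_expr.values())
    high = [s for s, v in rhoa_expr.items() if v > med]
    low = [s for s, v in rhoa_expr.items() if v <= med]  # ties go to low (assumed)
    return high, low


def mann_whitney_p(x, y):
    """Two-sided Mann-Whitney U p-value using a normal approximation."""
    pooled = sorted((v, i) for i, v in enumerate(list(x) + list(y)))
    ranks = [0.0] * len(pooled)
    i = 0
    while i < len(pooled):
        j = i
        while j + 1 < len(pooled) and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        for k in range(i, j + 1):  # tied values share the average rank
            ranks[pooled[k][1]] = (i + j) / 2 + 1
        i = j + 1
    n1, n2 = len(x), len(y)
    u = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - n1 * n2 / 2) / sd
    return math.erfc(abs(z) / math.sqrt(2))


def candidate_gene_group(expr, rhoa_gene="RHOA", alpha=0.05):
    """expr: dict gene -> dict sample id -> expression (must include RhoA).

    Returns the RhoA gene plus every gene whose expression differs
    significantly between the RhoA high- and low-expression groups.
    """
    high, low = split_by_median(expr[rhoa_gene])
    selected = [rhoa_gene]
    for gene, values in expr.items():
        if gene == rhoa_gene:
            continue
        p = mann_whitney_p([values[s] for s in high], [values[s] for s in low])
        if p < alpha:
            selected.append(gene)
    return selected
```

The `alpha` default of 0.05 follows the threshold in claim 2; any other significance test could be substituted for `mann_whitney_p` without changing the structure of the rule.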
CN201910550753.2A 2019-06-24 2019-06-24 The determination method of assessment gene group for gastric cancer prognosis prediction Pending CN110229902A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910550753.2A CN110229902A (en) 2019-06-24 2019-06-24 The determination method of assessment gene group for gastric cancer prognosis prediction

Publications (1)

Publication Number Publication Date
CN110229902A true CN110229902A (en) 2019-09-13

Family

ID=67856438

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910550753.2A Pending CN110229902A (en) 2019-06-24 2019-06-24 The determination method of assessment gene group for gastric cancer prognosis prediction

Country Status (1)

Country Link
CN (1) CN110229902A (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108130372A (en) * 2018-01-17 2018-06-08 华中科技大学鄂州工业技术研究院 A kind of method and device for the instruction of acute myeloid leukemia drug
CA3069469A1 (en) * 2017-07-21 2019-01-24 Genentech, Inc. Therapeutic and diagnostic methods for cancer
CN109715829A (en) * 2016-05-16 2019-05-03 迪莫·迪特里希 A method of the reaction of assessment prognosis and prediction malignant disease patient to immunization therapy

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
无: "2018 ASCO-GI:RhoA调控通路突变对胃癌患者生存的影响,http://www.sohu.com/a/219225327_711199", 《搜狐》 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112037863A (en) * 2020-08-26 2020-12-04 南京医科大学 Early NSCLC prognosis prediction system
CN112037863B (en) * 2020-08-26 2022-06-21 南京医科大学 Early NSCLC prognosis prediction system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20190913