CN106503489A - Method and device for obtaining mutation sites of genes corresponding to the cardiovascular system - Google Patents

Info

Publication number
CN106503489A
Authority
CN
China
Prior art keywords
gene
variant sites
cardiovascular system
site
information
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201610973131.7A
Other languages
Chinese (zh)
Inventor
范振鑫
郭涛
何苗
刘鱼
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chengdu Xin Yun Decoding Technology Co Ltd
Original Assignee
Chengdu Xin Yun Decoding Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chengdu Xin Yun Decoding Technology Co Ltd
Priority to CN201610973131.7A
Publication of CN106503489A

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B20/00 ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations

Abstract

This application provides a method and a device for obtaining mutation sites of genes corresponding to the cardiovascular system, and relates to the technical field of bioinformatics. The method includes: aligning multiple short sequences of a gene to be tested against a reference genome to obtain preliminary variant site information of the gene to be tested; according to the preliminary variant site information, deleting from the multiple preliminary variant sites those that do not satisfy a preset retention condition, and taking the variant sites of the gene to be tested that remain after deletion as the sites to be checked; comparing the sites to be checked with multiple variant sites of genes corresponding to the cardiovascular system in a cardiovascular system gene library; and, when the sites to be checked include a variant site whose position and mutated base are both identical to those of a variant site in the cardiovascular system gene library, obtaining the site mutation status of the genes corresponding to the cardiovascular system in the gene to be tested. The method and device can obtain the mutation status of multiple cardiovascular-system-related variant sites among the variant sites of the gene to be tested.

Description

Method and device for obtaining mutation sites of genes corresponding to the cardiovascular system
Technical field
This application relates to the technical field of bioinformatics, and in particular to a method and a device for obtaining mutation sites of genes corresponding to the cardiovascular system.
Background
With the development and maturation of medical science, genomics and high-throughput sequencing technology, precision medicine (Precision Medicine) is being applied in countries around the world and has become a new medical model. Precision medicine is an approach to disease prevention and treatment that takes into account individual differences in genes, environment and lifestyle, and formulates personalized, precise medical treatment and health management plans according to each person's genetic information.
Because everyone's genetic background is different, this process requires determining the mutation status of each person's genome, or of certain genes associated with the relevant organ or body part. The base mutation status is then analyzed and compared to determine the final probability of disease, so that a corresponding medical treatment and health management plan can be drawn up.
The cardiovascular system is a closed system of tubes composed of the heart and the blood vessels. The heart is the driving organ, and the blood vessels are the conduits that transport blood. Through the rhythmic contraction and relaxation of the heart, blood is driven to circulate continuously through the vessels in a fixed direction, which is called blood circulation. Blood circulation is one of the most important physiological functions for the survival of the body. Through blood circulation, all the functions of blood are realized, and the distribution of blood volume is adjusted at any time to meet the needs of active organs and tissues, thereby ensuring the relative constancy of the body's internal environment and the normal progress of metabolism. The cardiovascular system is therefore a vital system of the organism, and lesions in it can have extremely serious consequences. Taking preventive measures against cardiovascular system diseases to reduce the probability of onset is thus extremely important.
Because the onset of cardiovascular system diseases is to some extent linked to genes, differences in the base mutation status at sites of genes corresponding to the cardiovascular system may lead to differences in the onset, and the probability of onset, of different cardiovascular system diseases. Under the precision medicine model, predicting the onset and probability of cardiovascular system diseases from the base mutation status of genes corresponding to the cardiovascular system, combined with other information, is therefore an effective way to prevent such diseases. Here, cardiovascular system diseases means disorders of the cardiovascular system.
Existing approaches to determining the site mutation status of cardiovascular system genes generally use chemical methods to obtain the base mutation status at a particular specified position of the gene to be tested. The number of mutation sites obtained in this way is limited: typically only the mutation status of one or a few bases can be obtained, and it is not possible to determine simultaneously the mutation status of as many as possible of the variant sites of genes corresponding to the cardiovascular system in the gene to be tested. As a result, subsequent predictions of cardiovascular system disease that combine this result with other information may deviate considerably.
Summary of the invention
In view of this, the embodiments of this application provide a method and a device for obtaining mutation sites of genes corresponding to the cardiovascular system. By comparing the variant sites of the gene to be tested with multiple variant sites of genes corresponding to the cardiovascular system in a cardiovascular system gene library, the base mutation status of multiple variant sites of genes corresponding to the cardiovascular system in the gene to be tested can be obtained, so as to alleviate the above problem.
To achieve the above purpose, the technical solution adopted by this application is as follows:
A method for obtaining mutation sites of genes corresponding to the cardiovascular system, the method including: aligning multiple short sequences of a gene to be tested against a reference genome to obtain preliminary variant site information of the gene to be tested, the preliminary variant site information including the mutated bases of multiple preliminary variant sites and the position information of each preliminary variant site; according to the preliminary variant site information, deleting from the multiple preliminary variant sites those that do not satisfy a preset retention condition, and taking the variant sites of the gene to be tested that remain after deletion as the sites to be checked; comparing the sites to be checked with multiple variant sites of genes corresponding to the cardiovascular system in a cardiovascular system gene library, the cardiovascular system gene library including the mutated base and the position of each variant site of the genes corresponding to the cardiovascular system; and, when the sites to be checked include a variant site whose position and mutated base are both identical to those of a variant site in the cardiovascular system gene library, obtaining the site mutation status of the genes corresponding to the cardiovascular system in the gene to be tested.
A device for obtaining mutation sites of genes corresponding to the cardiovascular system, the device including: an alignment module, configured to align multiple short sequences of a gene to be tested against a reference genome to obtain preliminary variant site information of the gene to be tested, the preliminary variant site information including the mutated bases of multiple preliminary variant sites and the position information of each preliminary variant site; a filtering module, configured to delete, according to the preliminary variant site information, those of the multiple preliminary variant sites that do not satisfy a preset retention condition, and to take the variant sites of the gene to be tested that remain after deletion as the sites to be checked; a comparison module, configured to compare the sites to be checked with multiple variant sites of genes corresponding to the cardiovascular system in a cardiovascular system gene library, the cardiovascular system gene library including the mutated base and the position of each variant site of the genes corresponding to the cardiovascular system; and a mutation acquisition module, configured to obtain the site mutation status of the genes corresponding to the cardiovascular system in the gene to be tested when the sites to be checked include a variant site whose position and mutated base are both identical to those of a variant site in the cardiovascular system gene library.
In the method and device provided by the embodiments of this application, once the variant sites of the gene to be tested have been obtained, they are compared with multiple variant sites of genes corresponding to the cardiovascular system in the cardiovascular system gene library, which includes the mutated base and the position of each variant site of the genes corresponding to the cardiovascular system. When the gene to be tested contains a variant site whose position and mutated base are both identical to those of a variant site in the cardiovascular system gene library, it can be determined that the gene to be tested carries a mutation of a gene corresponding to the cardiovascular system.
Because the cardiovascular system gene library includes multiple variant sites related to the cardiovascular system, this solution can determine multiple cardiovascular-system-related variant sites in the gene to be tested, together with the specific base mutation status of those variant sites.
To make the above purposes, features and advantages of this application clearer and easier to understand, preferred embodiments are described in detail below with reference to the accompanying drawings.
Brief description of the drawings
To make the purposes, technical solutions and advantages of the embodiments of this application clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the accompanying drawings. The described embodiments are some, but not all, of the embodiments of this application. All other embodiments obtained by a person of ordinary skill in the art on the basis of the embodiments in this application without creative work fall within the scope of protection of this application.
Fig. 1 shows a structural schematic diagram of the computer provided by an embodiment of this application;
Fig. 2 shows a flow chart of the method for obtaining mutation sites of genes corresponding to the cardiovascular system provided by the first embodiment of this application;
Fig. 3 shows a flow chart of some of the steps of the method for obtaining mutation sites of genes corresponding to the cardiovascular system provided by the first embodiment of this application;
Fig. 4 shows a functional block diagram of the device for obtaining mutation sites of genes corresponding to the cardiovascular system provided by the second embodiment of this application;
Fig. 5 shows a functional block diagram of the gene library establishing module of the device for obtaining mutation sites of genes corresponding to the cardiovascular system provided by the second embodiment of this application;
Fig. 6 shows a functional block diagram of the filtering module of the device for obtaining mutation sites of genes corresponding to the cardiovascular system provided by the second embodiment of this application;
Fig. 7 shows a functional block diagram of the alignment module of the device for obtaining mutation sites of genes corresponding to the cardiovascular system provided by the second embodiment of this application.
Detailed description
The technical solutions in the embodiments of this application are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some, not all, of the embodiments of this application. The components of the embodiments of this application that are generally described and illustrated in the drawings can be arranged and designed in a variety of different configurations. Therefore, the following detailed description of the embodiments provided in the drawings is not intended to limit the scope of the claimed application, but merely represents selected embodiments of this application. All other embodiments obtained by a person skilled in the art on the basis of the embodiments of this application without creative work fall within the scope of protection of this application.
It should be noted that similar reference numerals and letters denote similar items in the following drawings, so once an item has been defined in one drawing it need not be further defined or explained in subsequent drawings. In the description of this application, the terms "first", "second" and the like are used only to distinguish between descriptions and should not be understood as indicating or implying relative importance.
Fig. 1 is a block diagram of the computer 100 of this application. The computer 100 includes the device 200 for obtaining mutation sites of genes corresponding to the cardiovascular system, a memory 101, a memory controller 102, a processor 103, a peripheral interface 104, an input-output unit 105, and other components.
The memory 101, memory controller 102, processor 103, peripheral interface 104 and input-output unit 105 are electrically connected to one another, directly or indirectly, to enable the transmission or exchange of data. For example, these elements can be electrically connected to one another through one or more communication buses or signal lines. The device 200 for obtaining mutation sites of genes corresponding to the cardiovascular system includes at least one software function module that can be stored in the memory 101 in the form of software or firmware, or built into the operating system (OS) of the computer 100. The processor 103 is used to execute the executable modules stored in the memory 101, for example the software function modules or computer programs included in the device 200.
The memory 101 may be, but is not limited to, random access memory (Random Access Memory, RAM), read-only memory (Read Only Memory, ROM), programmable read-only memory (Programmable Read-Only Memory, PROM), erasable programmable read-only memory (Erasable Programmable Read-Only Memory, EPROM), electrically erasable programmable read-only memory (Electric Erasable Programmable Read-Only Memory, EEPROM), and so on. The memory 101 is used to store a program, and the processor 103 executes the program after receiving an execution instruction. The method performed by the computer 100, as defined by the flow disclosed in any embodiment of this application, can be applied in the processor 103 or implemented by the processor 103.
The processor 103 may be an integrated circuit chip with signal processing capability. The processor 103 may be a general-purpose processor, including a central processing unit (Central Processing Unit, CPU), a network processor (Network Processor, NP) and the like; it may also be a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component. It can implement or execute the methods, steps and logic block diagrams disclosed in the embodiments of this application. The general-purpose processor may be a microprocessor, or the processor 103 may be any conventional processor.
The peripheral interface 104 couples various input/output devices to the processor 103 and the memory 101. In some embodiments, the peripheral interface 104, the processor 103 and the memory controller 102 can be implemented in a single chip. In other examples, they can each be implemented by a separate chip.
The input-output unit 105 allows a user to input data, enabling interaction between the user and the computer. The input-output unit may be, but is not limited to, a data reading device, a mouse, a keyboard and the like.
It should be understood that the structure shown in Fig. 1 is only illustrative. The computer 100 may include more or fewer components than shown in Fig. 1, or have a configuration different from that shown in Fig. 1. Each component shown in Fig. 1 can be implemented in hardware, software or a combination of the two.
First embodiment
This embodiment of the application provides a method for obtaining mutation sites of genes corresponding to the cardiovascular system, used to obtain the base mutation status of the variant sites of cardiovascular-system-related genes in a gene to be tested. Referring to Fig. 2, the method includes:
Step S110: Align multiple short sequences of the gene to be tested against a reference genome to obtain preliminary variant site information of the gene to be tested. The preliminary variant site information includes the mutated bases of multiple preliminary variant sites and the position information of each preliminary variant site.
First, multiple short sequences of the gene to be tested are obtained; the short sequences can be output by a second-generation sequencing platform. The short sequences of the gene to be tested are then aligned against the reference genome. For example, if the gene to be tested is a human gene, the reference genome is the human reference genome.
The alignment process can include multiple rounds of alignment as well as steps such as duplicate removal; after alignment, preliminary variant site information covering multiple variant sites is obtained.
Specifically, as shown in Fig. 3, in this embodiment the process of aligning the data to obtain the preliminary variant site information in this step can include:
Step S111: Perform a first alignment of the multiple short sequences of the gene to be tested against the reference genome to obtain an alignment result in SAM format.
The short sequences of the gene to be tested are aligned against the reference genome. The alignment can be carried out with existing alignment software, such as Bowtie2, yielding an alignment result in SAM format, which stores the alignment information obtained. It should be understood that the SAM-format alignment result includes information about each base in the gene to be tested, such as position information.
The specific alignment software used and the representation of the alignment result are not limited in this embodiment, as long as the multiple short sequences of the gene to be tested can be aligned against the reference genome and alignment information representing the result can be obtained.
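As an illustration of the kind of information a SAM-format alignment result carries, the following minimal Python sketch reads one alignment line into a named record. It relies only on the eleven mandatory tab-separated SAM columns; the `SamRecord` type, the `parse_sam_line` helper and the example read are hypothetical and are not part of the patent.

```python
from typing import NamedTuple


class SamRecord(NamedTuple):
    """A subset of the mandatory SAM alignment fields (hypothetical helper type)."""
    qname: str   # read (short sequence) name
    flag: int    # bitwise alignment flag
    rname: str   # reference sequence name, e.g. a chromosome
    pos: int     # 1-based leftmost mapping position
    mapq: int    # mapping quality
    cigar: str   # alignment description
    seq: str     # read bases
    qual: str    # per-base quality string


def parse_sam_line(line: str) -> SamRecord:
    """Parse one non-header SAM line into a SamRecord."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 11:
        raise ValueError("a SAM alignment line has at least 11 tab-separated fields")
    return SamRecord(
        qname=fields[0],
        flag=int(fields[1]),
        rname=fields[2],
        pos=int(fields[3]),
        mapq=int(fields[4]),
        cigar=fields[5],
        seq=fields[9],
        qual=fields[10],
    )


# Made-up example read, for illustration only.
example = "read1\t0\tchr1\t100\t42\t5M\t*\t0\t0\tACGTA\tIIIII"
record = parse_sam_line(example)
print(record.rname, record.pos, record.mapq)
```

Header lines (those starting with `@`) would need to be skipped before calling the helper.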
Step S112: Remove duplicates from the alignment result, so that the number of short sequences aligned to any one position of the reference genome is less than or equal to 1.
The alignment result obtained in step S111 contains a certain proportion of duplicate sequences and results; for example, multiple short sequences may be aligned to the same position of the reference genome. In this step, the alignment result is therefore deduplicated.
In this embodiment, the software Picard can be used for deduplication. Specifically, the MarkDuplicates tool of Picard can be used, yielding a deduplicated result in BAM format.
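The idea of keeping at most one short sequence per reference position can be sketched as follows. This is a deliberately simplified illustration and not Picard's actual duplicate-marking algorithm; the tuple layout `(name, chrom, pos, mapq)` and the rule of keeping the read with the highest mapping quality are assumptions made for the example.

```python
def deduplicate(reads):
    """Keep one read per (chrom, pos); ties are resolved in favor of the first read seen.

    reads: iterable of (name, chrom, pos, mapq) tuples (hypothetical layout).
    """
    best = {}
    for read in reads:
        _name, chrom, pos, mapq = read
        key = (chrom, pos)
        # Assumption: the read with the highest mapping quality is retained.
        if key not in best or mapq > best[key][3]:
            best[key] = read
    return sorted(best.values(), key=lambda r: (r[1], r[2]))


reads = [
    ("r1", "chr1", 100, 30),
    ("r2", "chr1", 100, 40),  # same start position as r1
    ("r3", "chr1", 150, 20),
]
unique_reads = deduplicate(reads)
print(unique_reads)
```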
Step S113: Perform local realignment (local multiple alignment) on the deduplicated alignment result.
Short sequences aligned against the reference genome are difficult to align accurately in highly similar repeat regions, so false-positive variant sites, such as false-positive SNPs, are easily obtained in the repeat regions of the genome. A false-positive variant site is a variant site that results from an incorrect alignment. To reduce the number and proportion of false-positive variant sites, in this embodiment local realignment is performed on the deduplicated alignment result.
Specifically, the local realignment can be carried out with IndelRealigner in GATK, yielding a realigned result in BAM format. The process generally has three steps: a. detect suspicious regions that need realignment; b. realign these suspicious regions; c. repair the mate-pair information lost during realignment.
Step S114: Recalculate the base quality scores in the alignment result after local realignment.
In step S111 of the foregoing processing, each single base is assigned a quality score (Quality score) during data processing, reflecting the confidence in the nucleotide observed at that base.
The quality scores obtained in the foregoing processing are not well linked to the probability of a genotyping error. In addition, the quality score of a single base is not linked to other parameters, for example different sequencing platforms, different sequencing cycles and different libraries within the same sample.
Therefore, in step S114, the quality score of each base is related to the various factors in the sequencing process and recalculated, generating a new quality score that is used to judge whether each base is credible.
Specifically, in this embodiment, GATK can be used to perform empirical quality score recalibration, yielding a result in BAM format.
Step S115: According to the base quality scores, perform SNP and indel analysis on the alignment result after local realignment to obtain the preliminary variant site information.
According to the recalculated base quality scores, preliminary SNP and indel calling is performed on the alignment result obtained after local realignment, that is, SNP and indel genotyping, to obtain variant site information that includes multiple variant sites. This variant site information serves as the preliminary variant site information, and the variant sites it contains serve as the preliminary variant sites. The preliminary variant site information includes the mutated bases of the multiple preliminary variant sites and the position of each variant site. In this embodiment, variant sites are SNPs and indels; preferably, variant sites are SNPs only.
Specifically, in this step, the analysis can be carried out with the UnifiedGenotyper of GATK. Because many data filtering parameters are applied to filter the data again after SNP genotyping is complete, in order to further control data quality, the standard minimum confidence thresholds are all set to zero in this step. SNPs denotes the plural of SNP.
The preliminary SNP and indel calling can also be carried out in other ways, which are not limited in this embodiment; for example, the HaplotypeCaller of GATK can be used.
In this step, a VCF file containing the preliminary variant site information can be obtained. The preliminary variant site information in the VCF file includes each variant site obtained in step S110 and the position information of each variant site; other information is also included and is not described further here.
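To make the content of the preliminary variant site information concrete, the sketch below extracts the position and mutated base(s) from one VCF data line, using only the fixed leading VCF columns (CHROM, POS, ID, REF, ALT). The `Variant` type, the `parse_vcf_line` helper and the example lines are hypothetical illustrations, not part of the patent.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Variant:
    """Position and bases of one variant site (hypothetical helper type)."""
    chrom: str
    pos: int
    ref: str
    alts: Tuple[str, ...]

    @property
    def is_snp(self) -> bool:
        # A SNP replaces exactly one reference base with one alternate base.
        return len(self.ref) == 1 and all(len(a) == 1 for a in self.alts)


def parse_vcf_line(line: str) -> Variant:
    """Parse the first five columns of a VCF data line (not a '#' header line)."""
    chrom, pos, _vid, ref, alt = line.rstrip("\n").split("\t")[:5]
    return Variant(chrom, int(pos), ref, tuple(alt.split(",")))


# Made-up example lines, for illustration only.
snp = parse_vcf_line("chr1\t12345\t.\tA\tG\t50\tPASS\t.")
indel = parse_vcf_line("chr1\t200\t.\tAT\tA\t50\tPASS\t.")
print(snp, snp.is_snp)
print(indel, indel.is_snp)
```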
Step S120: According to the preliminary variant site information, delete from the multiple preliminary variant sites those that do not satisfy the preset retention condition, and take the variant sites of the gene to be tested that remain after deletion as the sites to be checked.
The preliminary variant sites in the preliminary variant site information obtained in step S110 may still include false-positive variant sites. This step therefore filters the preliminary variant sites further, deleting those with a higher probability of being false positives, and takes the variant sites remaining after deletion as the variant sites of the gene to be tested, so that the variant sites finally obtained are more accurate. The result after deletion also includes the position information of each variant site and other information, which is not described further here.
Specifically, in this step, variant sites that do not satisfy the preset retention condition can be deleted in one or more of the following ways:
Mode one: Remove, from the multiple preliminary variant sites, the variant sites whose number of alleles is greater than a preset threshold.
Variant sites whose number of alleles exceeds the preset threshold have a higher probability of being false-positive variant sites and are removed. In this embodiment, the preset threshold can be set according to actual needs. Because sites containing more than one allele have a higher genotyping error rate, the preset threshold is preferably 1.
When the preset threshold is 1, variant sites whose number of alleles is greater than 1 are removed.
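Mode one can be sketched as a simple list filter. The sketch assumes that the "number of alleles" at a site is the number of alternate alleles recorded for it (the ALT column of a VCF line); that reading, the dictionary layout and the function name are assumptions made for illustration.

```python
def remove_multiallelic(variants, max_alleles=1):
    """Drop variant sites whose alternate-allele count exceeds max_alleles.

    variants: list of dicts with an "alts" list (hypothetical layout).
    """
    return [v for v in variants if len(v["alts"]) <= max_alleles]


preliminary = [
    {"chrom": "chr1", "pos": 100, "ref": "A", "alts": ["G"]},
    {"chrom": "chr1", "pos": 250, "ref": "C", "alts": ["T", "G"]},  # two alternate alleles
]
kept = remove_multiallelic(preliminary)
print([v["pos"] for v in kept])
```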
Mode two: Delete, from the multiple preliminary variant sites, all variant sites located within the upstream range or the downstream range of each insertion or deletion (indel), where the upstream range and the downstream range each contain a preset number of bases.
The short sequences used for alignment are usually output by a second-generation sequencing platform, and such short sequences are more prone to misalignment near regions of insertion or deletion (indel); the local realignment in the processing described above cannot completely eliminate this kind of error. All variant sites within the upstream range or the downstream range of an indel are therefore deleted to reduce the probability of false-positive results.
The upstream range and the downstream range each contain a preset number of bases. The preset number can be determined by the user according to actual needs and is not restricted in this embodiment; the preset numbers for the upstream range and the downstream range can be the same or different.
In this embodiment, the upstream range preferably contains 5 bases and the downstream range preferably contains 5 bases. That is, all indels among the preliminary variant sites are identified, and for each indel, all variant sites within 5 bp (5 bases) upstream of it, or all variant sites within 5 bp downstream of it, are deleted.
In this embodiment, it is possible to delete only the variant sites in the upstream range of an indel, or only those in the downstream range, or to delete the variant sites in both the upstream range and the downstream range.
Preferably, in this embodiment, what is deleted is all SNPs within the upstream range or the downstream range of an indel.
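A minimal sketch of mode two follows, under stated assumptions: a site is treated as an indel when its reference and alternate alleles differ in length, distances are measured from the indel's recorded position (ignoring the indel's own length), SNPs in both the upstream and downstream windows are removed, and indels themselves are kept. The dictionary layout and function names are hypothetical.

```python
def is_indel(variant):
    """Treat a site as an indel when REF and ALT lengths differ (assumption)."""
    return len(variant["ref"]) != len(variant["alt"])


def remove_snps_near_indels(variants, upstream=5, downstream=5):
    """Drop SNPs lying within the given number of bases of any indel on the same chromosome."""
    indel_positions = {}
    for v in variants:
        if is_indel(v):
            indel_positions.setdefault(v["chrom"], []).append(v["pos"])

    kept = []
    for v in variants:
        if is_indel(v):
            kept.append(v)  # indels themselves are retained in this sketch
            continue
        near = any(
            p - upstream <= v["pos"] <= p + downstream
            for p in indel_positions.get(v["chrom"], [])
        )
        if not near:
            kept.append(v)
    return kept


preliminary = [
    {"chrom": "chr1", "pos": 97, "ref": "A", "alt": "G"},    # 3 bp upstream of the indel
    {"chrom": "chr1", "pos": 100, "ref": "AT", "alt": "A"},  # the indel
    {"chrom": "chr1", "pos": 105, "ref": "C", "alt": "T"},   # 5 bp downstream
    {"chrom": "chr1", "pos": 106, "ref": "G", "alt": "A"},   # 6 bp downstream, kept
    {"chrom": "chr2", "pos": 100, "ref": "T", "alt": "C"},   # other chromosome, kept
]
filtered = remove_snps_near_indels(preliminary)
print([(v["chrom"], v["pos"]) for v in filtered])
```

Passing `upstream=0` or `downstream=0` narrows the window on one side, which approximates deleting sites on the other side only; note that a SNP recorded at the indel's own position would still be removed.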
Mode three: Delete, from the multiple preliminary variant sites, the variant sites that are separated from each other by fewer than a preset number of bases.
In this step, variant sites that are close to each other are deleted, that is, variant sites whose distance from each other is less than a certain value.
In this embodiment, the preset number of bases is not limited and can be set according to actual needs.
Preferably, the preset number of bases is 4: if there are variant sites with fewer than 4 bases between them, they are deleted. That is, variant sites within 5 bp upstream or downstream of each other are deleted.
Preferably, in this step, what is deleted is the SNPs that are separated from each other by fewer than the preset number of bases.
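Mode three can be illustrated with a pass over position-sorted sites. The sketch assumes that "bases between" two sites at positions p < q means q - p - 1, and that both members of a too-close pair are deleted; the tuple layout `(chrom, pos)` and the function name are hypothetical.

```python
def remove_clustered(variants, min_gap=4):
    """Drop every site that has a neighbor with fewer than min_gap bases in between.

    variants: list of (chrom, pos) tuples (hypothetical layout).
    """
    ordered = sorted(variants)
    too_close = set()
    # Checking adjacent pairs in sorted order is enough: if any two sites are
    # too close, every adjacent pair lying between them is too close as well.
    for left, right in zip(ordered, ordered[1:]):
        same_chrom = left[0] == right[0]
        if same_chrom and (right[1] - left[1] - 1) < min_gap:
            too_close.add(left)
            too_close.add(right)
    return [v for v in variants if v not in too_close]


preliminary = [
    ("chr1", 100),
    ("chr1", 103),  # 2 bases between 100 and 103: both removed
    ("chr1", 110),
    ("chr1", 115),  # 4 bases between 110 and 115: both kept
    ("chr2", 101),  # different chromosome: kept
]
spaced = remove_clustered(preliminary)
print(spaced)
```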
Mode four: Delete, from the multiple preliminary variant sites, the variant sites whose GQ (Genotype quality) value is less than a preset GQ threshold.
GQ (Genotype quality) is a phred-scaled probability value. For each site, the GQ value expresses the possibility that the genotype currently obtained for the site is not the true genotype, and so indicates how reliable the obtained genotype is. It is calculated as:
GQ = -10 * log10(P[error]), where P[error] is the probability that the genotype obtained for the site is not the true genotype.
Preferably, in this embodiment, the preset GQ threshold is 20. When the GQ threshold is 20, the theoretical error rate is 1%.
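The relationship between a phred-scaled value and the underlying error probability can be checked numerically. The two helper functions below simply apply the formula above and its inverse; their names are hypothetical.

```python
import math


def phred_to_error(q):
    """Error probability implied by a phred-scaled quality value."""
    return 10 ** (-q / 10)


def error_to_phred(p_error):
    """Phred-scaled quality value for a given error probability."""
    return -10 * math.log10(p_error)


for q in (10, 20, 30):
    print(q, phred_to_error(q))
```

A threshold of 20 therefore corresponds to an error probability of 10^(-2), that is 1%, which matches the figure stated above.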
Mode five: Among the plurality of preliminary variant sites, delete the variant sites whose MQ (Mapping Quality) value is below a preset MQ threshold.
MQ expresses the specificity (uniqueness) of an alignment. When the same short sequence can be aligned to different regions of the same genome, the larger the difference between the alignment score of the best alignment region (the first best alignment) and the alignment score of the second-best alignment region (the second best alignment), the more specific the alignment and the higher the MQ value.
In this embodiment, variant sites whose MQ value is below the preset MQ threshold are considered to have a higher probability of being false positives and are deleted.
Preferably, in the present embodiment, the preset MQ threshold is 30. It has been verified that when MQ is 30, P[error] = 0.001; that is, the probability that the sequence actually aligns to a position other than the current one is at most 0.1%.
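Modes four and five both reduce to a threshold test per site. The sketch below assumes each site is a dictionary with `GQ` and `MQ` keys, which is a hypothetical in-memory layout rather than anything the embodiment prescribes; the default thresholds are the preferred values of 20 and 30.

```python
def filter_by_quality(sites, min_gq=20, min_mq=30):
    """Delete sites whose GQ or MQ falls below the preset thresholds."""
    return [s for s in sites if s["GQ"] >= min_gq and s["MQ"] >= min_mq]


candidates = [
    {"pos": 1200, "GQ": 45, "MQ": 60},  # passes both tests
    {"pos": 3400, "GQ": 12, "MQ": 60},  # GQ below 20: deleted
    {"pos": 5600, "GQ": 50, "MQ": 18},  # MQ below 30: deleted
]
high_quality = filter_by_quality(candidates)
```

Sites exactly at a threshold are kept here, since the text deletes only sites whose value is "less than" the threshold.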
In embodiments of the present invention, modes one to five are optional; that is, in this step any one of the modes, several of them, or all of them may be used. When several modes are used to delete the variant sites that do not satisfy the retention conditions, the order in which the modes are executed is not limited. The modes can of course also be executed in parallel.
In addition, in step S120, when several modes are executed serially, a later mode may operate on the result of an earlier mode. For example, suppose that mode one (removing, from the plurality of preliminary variant sites, the variant sites whose number of alleles exceeds a preset threshold) and mode three (deleting, from the plurality of preliminary variant sites, the variant sites that lie within the preset number of bases of one another) are both used, with mode one executed first and mode three second. Then what mode three deletes may be the variant sites, among those remaining after mode one, that lie within the preset number of bases of one another.
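Serial execution, where each mode works on the output of the previous one, can be sketched as a simple chain of functions. Everything named below (`run_serially`, the dictionary keys, the allele threshold of 2, and the single-chromosome simplification) is an assumption for illustration only.

```python
def drop_multiallelic(sites, max_alleles=2):
    """Mode one (sketch): remove sites with more alleles than a preset threshold."""
    return [s for s in sites if s["n_alleles"] <= max_alleles]


def drop_close_neighbours(sites, min_gap=4):
    """Mode three (sketch): remove sites with fewer than `min_gap` bases between them.

    All sites are assumed to lie on one chromosome to keep the example short.
    """
    ordered = sorted(sites, key=lambda s: s["pos"])
    kept = []
    for i, s in enumerate(ordered):
        near_prev = i > 0 and s["pos"] - ordered[i - 1]["pos"] - 1 < min_gap
        near_next = i + 1 < len(ordered) and ordered[i + 1]["pos"] - s["pos"] - 1 < min_gap
        if not (near_prev or near_next):
            kept.append(s)
    return kept


def run_serially(sites, filters):
    """Apply each filter to the result of the previous one."""
    for f in filters:
        sites = f(sites)
    return sites


sites = [
    {"pos": 100, "n_alleles": 3},
    {"pos": 102, "n_alleles": 2},
    {"pos": 300, "n_alleles": 2},
]
one_then_three = run_serially(sites, [drop_multiallelic, drop_close_neighbours])
three_then_one = run_serially(sites, [drop_close_neighbours, drop_multiallelic])
```

In this toy input the two orders keep different sites: running mode one first removes position 100 before the neighbour test, so position 102 survives, whereas running mode three first removes both 100 and 102. The order is therefore a design choice worth fixing when results need to be reproducible.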
After the preliminary variant sites have been filtered by deletion in step S120, the variant sites in the final result are taken as the sites to be checked of the gene under test, and can be represented as a VCF-format file.
Step S130: Compare the sites to be checked with the plurality of variant sites of the genes corresponding to the cardiovascular system in a cardiovascular system gene bank, the cardiovascular system gene bank including the mutated base of each variant site of the genes corresponding to the cardiovascular system and the position of each variant site.
In embodiments of the present invention, the cardiovascular system gene bank is established first; it includes the mutated base of each variant site of the genes corresponding to the cardiovascular system and the position of each variant site.
The cardiovascular system gene bank is established before the comparison in step S130. Specifically, it may be established by obtaining gene site information related to the cardiovascular system from the COSMIC gene database, the NCBI ClinVar database, other major authoritative academic journals at home and abroad, and the gene databases published by genetic testing companies and relevant government departments. What is mainly obtained is gene site information that includes the base mutation of each variant site of the genes corresponding to the cardiovascular system and the position of each variant site.
Of course, the gene site information may also be obtained from other data sources, which is not limited in the present embodiment.
Further, the gene site information obtained may also include the effect on protein function of each mutated base of each variant site of the genes corresponding to the cardiovascular system; that is, it records what effect the mutation of the base at a given variant site from the normal base to the current mutated base has on the function of the corresponding protein.
Of course, in the present embodiment, the gene site information obtained may also include one or more of the following: the abbreviated name of the gene corresponding to each mutation site, the full name of the gene, the coordinate of the site in the human genome, the corresponding tissue or organ type, the gene mutation type, the base of the normal gene at the site, whether clinical research shows this mutation at this site to be pathogenic, the population in which the mutation was originally found, the sex of the original mutation carrier, the age of the original mutation carrier, and the source in which the mutation was originally recorded.
Then, the gene site information whose credibility is below a preset standard and the gene site information that is erroneous are deleted from the gene site information, and the remaining gene site information forms the cardiovascular system gene bank.
In the present embodiment, gene site information below the preset standard includes at least one of the following:
1) gene site information obtained from non-SCI journals or from journals of very poor reputation in the field, where a journal of very poor reputation may be one whose impact factor is below a certain value or one that fails to meet other evaluation criteria; 2) gene site information for which the sample size used in the original document is below a certain value and is therefore insufficient to support a scientific conclusion; 3) gene site information for which, in the original document recording the gene site, the site is not among the most important gene sites found in the document, where the most important gene sites may be the sites in the top 10% of the results obtained.
Erroneous gene site information includes at least one of the following: 1) the original document cited for the gene site information in the source database does not actually report the site; 2) in the original document recording the gene site, the result for the gene site is not statistically significant.
Of course, the preset standard and the criteria for judging gene site information to be erroneous are not limited in the present embodiment and can be determined according to the actual situation.
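The curation rules above can be sketched as two predicates applied to each record. The field names and every numeric cut-off below (impact factor 2.0, sample size 100, significance level 0.05) are hypothetical placeholders, since the embodiment leaves the concrete values open; only the top-10% rule comes from the text.

```python
def is_credible(rec, min_impact=2.0, min_samples=100, top_fraction=0.10):
    """Credibility test: journal quality, sample size, and rank within the paper's results."""
    return (
        rec["sci_indexed"]
        and rec["impact_factor"] >= min_impact
        and rec["sample_size"] >= min_samples
        and rec["rank_fraction"] <= top_fraction
    )


def is_erroneous(rec, alpha=0.05):
    """Error test: site not reported in the cited source, or result not significant."""
    return (not rec["reported_in_source"]) or rec["p_value"] >= alpha


def build_gene_bank(records):
    """Keep only records that are credible and not erroneous."""
    return [r for r in records if is_credible(r) and not is_erroneous(r)]


good = {"id": "site-1", "sci_indexed": True, "impact_factor": 8.0, "sample_size": 500,
        "rank_fraction": 0.05, "reported_in_source": True, "p_value": 0.001}
small_study = dict(good, id="site-2", sample_size=12)
not_significant = dict(good, id="site-3", p_value=0.2)

gene_bank = build_gene_bank([good, small_study, not_significant])
```

Keeping the two tests as separate functions mirrors the text, which treats low credibility and outright error as distinct grounds for deletion.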
Further, as research on genes related to the cardiovascular system continues, the known mutations of the variant sites of those genes keep being updated, and the current cardiovascular system gene bank does not necessarily contain the mutations of all variant sites of cardiovascular-system-related genes. Embodiments of the present invention therefore also include updating the cardiovascular system database at preset time intervals.
Specifically, the update may proceed as follows: at each preset time interval, obtain the latest research papers related to the cardiovascular system published in authoritative academic journals such as Nature and Nature Genetics, take the latest gene site information related to the cardiovascular system from those papers, delete from it the gene site information whose credibility is below the preset standard and the gene site information that is erroneous, and add the rest to the cardiovascular system database to complete the update.
After the cardiovascular system gene bank is obtained, the sites to be checked are compared with the plurality of variant sites of the genes corresponding to the cardiovascular system in the cardiovascular system database.
In the present embodiment, the comparison may be carried out directly after the sites to be checked are obtained in step S120, or it may be triggered by a user; that is, the comparison in step S130 is executed after a query request triggered by the user is received.
Alternatively, the user may input one or more of the sites to be checked obtained in step S120, and in step S130 the sites input by the user are compared with the plurality of variant sites of the genes corresponding to the cardiovascular system in the cardiovascular system gene bank.
Alternatively, the user may obtain variant sites related to the cardiovascular system directly from the cardiovascular system gene bank. Specifically, the user inputs information such as a gene name or the coordinate of a site in the genome through an input/output unit. After the information input by the user is received, the cardiovascular system gene bank is searched according to that information, and the search results, such as the gene name, site coordinate, and base mutation type, are displayed. If the information input by the user is found in the cardiovascular system gene bank, the gene site corresponding to the input information is related to the cardiovascular system and has a base mutation. It will be understood that the coordinate of a site in the genome is the position of that site in the genome.
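The direct query described above amounts to a search by gene name or coordinate. The sketch below assumes the gene bank is a list of dictionaries; the `lookup` function, the field names, and the gene names `GENE_A` and `GENE_B` are placeholders, not real entries.

```python
def lookup(gene_bank, gene=None, chrom=None, pos=None):
    """Return all gene bank entries matching every criterion the user supplied."""
    hits = []
    for entry in gene_bank:
        if gene is not None and entry["gene"] != gene:
            continue
        if chrom is not None and entry["chrom"] != chrom:
            continue
        if pos is not None and entry["pos"] != pos:
            continue
        hits.append(entry)
    return hits


bank = [
    {"gene": "GENE_A", "chrom": "chr1", "pos": 1000, "ref": "A", "alt": "G"},
    {"gene": "GENE_A", "chrom": "chr1", "pos": 2000, "ref": "C", "alt": "T"},
    {"gene": "GENE_B", "chrom": "chr7", "pos": 500, "ref": "G", "alt": "A"},
]
by_gene = lookup(bank, gene="GENE_A")
by_coordinate = lookup(bank, chrom="chr7", pos=500)
not_found = lookup(bank, gene="GENE_C")
```

An empty result means the input was not found in the gene bank, so no cardiovascular-related mutation is recorded for it.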
Step S140: When the sites to be checked contain variant sites whose position and mutated base are both identical to those of variant sites in the cardiovascular system gene bank, obtain the site mutation situation of the genes corresponding to the cardiovascular system in the gene under test.
When the comparison shows that the sites to be checked contain variant sites identical to variant sites in the cardiovascular system database, it can be determined, from those identical variant sites in the database, that the gene under test contains site mutations of genes corresponding to the cardiovascular system and that each mutation is consistent with that of the identical variant site in the database. It is thus possible to obtain which variant sites of cardiovascular-system-related genes are present in the gene under test and the specific mutation at each such site, where the mutation specifies which base at which position has mutated into which base.
It should be understood that identical variant sites are sites whose position is the same and whose base mutation is the same; that is, a site to be checked that has the same mutated base at the same position is considered identical to a variant site in the cardiovascular system database. Such a site is therefore a variant site of a gene corresponding to the cardiovascular system.
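The identity test, same position and same mutated base, can be sketched as a dictionary lookup keyed on both. The record layout and the names `match_sites`, `chrom`, `pos`, `alt`, and `GENE_A` are assumptions for the example.

```python
def match_sites(query_sites, gene_bank):
    """Return gene bank entries whose position and mutated base both match a query site."""
    index = {(e["chrom"], e["pos"], e["alt"]): e for e in gene_bank}
    matches = []
    for site in query_sites:
        key = (site["chrom"], site["pos"], site["alt"])
        if key in index:
            matches.append(index[key])
    return matches


bank = [{"gene": "GENE_A", "chrom": "chr1", "pos": 1000, "ref": "A", "alt": "G"}]
queries = [
    {"chrom": "chr1", "pos": 1000, "alt": "G"},  # same position, same mutated base
    {"chrom": "chr1", "pos": 1000, "alt": "T"},  # same position, different base
    {"chrom": "chr2", "pos": 1000, "alt": "G"},  # different position
]
matched = match_sites(queries, bank)
```

Only the first query matches: a site at the right position with a different mutated base is not treated as identical.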
Relevant personnel can then determine the cardiovascular disease status of the subject corresponding to the gene under test according to the site mutation situation obtained for the genes corresponding to the cardiovascular system in the gene under test, together with other information, such as the possible disease status under each kind of mutation of the cardiovascular-system-related genes.
Further, in the present embodiment, the effect on protein function of the mutation at each variant site in the gene under test can be determined from the site mutation situation of the genes corresponding to the cardiovascular system in the gene under test and from the effect on protein function, recorded in the cardiovascular system database, of each mutated base of each variant site of those genes. It can thereby be determined which cardiovascular-system-related protein functions of the subject corresponding to the gene under test (for example, the corresponding person) may be affected, and how. A person skilled in the art can then, based on the effect on protein function and in combination with other information, such as the relationship between changes in protein function and the specific functions of organs, judge the probability that the subject corresponding to the gene under test suffers from cardiovascular disease and which cardiovascular diseases the subject may suffer from.
Of course, embodiments of the present invention may additionally, or directly, include the pathogenicity of each mutation of each variant site with respect to cardiovascular diseases. For example, the effect on angina pectoris may be: pathogenic, possibly pathogenic, risk factor, uncertain, conflicting research results, or benign. If the pathogenicity of a certain mutated base at a certain position is "risk factor", a subject carrying this mutated base at this position has a high probability of suffering from angina pectoris and should take care to prevent it.
Second embodiment
The present embodiment provides a device 200 for acquiring mutation sites of genes corresponding to the cardiovascular system. Referring to Fig. 4, the device 200 includes:
An alignment module 210, configured to align a plurality of short sequences of a gene under test with a reference genome to obtain preliminary variant site information of the gene under test, the preliminary variant site information including the mutated bases of a plurality of preliminary variant sites and the position information of each preliminary variant site.
A filtering module 220, configured to delete, according to the preliminary variant site information, the variant sites among the plurality of preliminary variant sites that do not satisfy preset retention conditions, and to take the variant sites of the gene under test that remain after deletion as sites to be checked.
A comparison module 230, configured to compare the sites to be checked with the plurality of variant sites of the genes corresponding to the cardiovascular system in a cardiovascular system gene bank, the cardiovascular system gene bank including the mutated base of each variant site of the genes corresponding to the cardiovascular system and the position of each variant site.
A mutation acquisition module 240, configured to obtain the site mutation situation of the genes corresponding to the cardiovascular system in the gene under test when the sites to be checked contain variant sites whose position and mutated base are both identical to those of variant sites in the cardiovascular system gene bank.
Further, the cardiovascular system gene bank also includes the effect on protein function of each mutated base of each variant site of the genes corresponding to the cardiovascular system. The mutation acquisition module 240 in the present embodiment is further configured to determine, according to the site mutation situation of the genes corresponding to the cardiovascular system in the gene under test, the effect on protein function of the mutation at each variant site in the gene under test.
Further, in the present embodiment, as shown in Fig. 4, the device also includes a gene bank establishing module 250, configured to establish the cardiovascular system gene bank. The gene bank establishing module 250 includes: a data acquisition unit 251, configured to obtain gene site information related to the cardiovascular system from the COSMIC gene database and the NCBI ClinVar database, the gene site information including the mutated base of each variant site of the genes corresponding to the cardiovascular system and the position of each variant site; and a data deletion unit 252, configured to delete from the gene site information the gene site information whose credibility is below a preset standard and the gene site information that is erroneous, the remaining gene site information forming the cardiovascular system gene bank.
Further, as shown in Fig. 5, the gene bank establishing module 250 also includes an updating unit 253, configured to update the cardiovascular system gene bank at preset time intervals.
Further, as shown in Fig. 6, in the present embodiment the filtering module 220 includes one or more of the following. A first deletion unit 221, configured to remove, from the plurality of preliminary variant sites, the variant sites whose number of alleles exceeds a preset threshold. A second deletion unit 222, configured to delete, from the plurality of preliminary variant sites, all variant sites located in the upstream span or the downstream span of each insertion or deletion, the number of bases in the upstream span and in the downstream span being a preset number. A third deletion unit 223, configured to delete, from the plurality of preliminary variant sites, the variant sites that lie within a preset number of bases of one another. A fourth deletion unit 224, configured to delete, from the plurality of preliminary variant sites, the variant sites whose GQ value is below a preset GQ threshold. A fifth deletion unit 225, configured to delete, from the plurality of preliminary variant sites, the variant sites whose MQ value is below a preset MQ threshold.
In this embodiment, referring to Fig. 7, the alignment module 210 may include: an alignment unit 211, configured to perform an initial alignment of the plurality of short sequences of the gene under test against the reference genome to obtain an alignment result in SAM format; a deduplication unit 212, configured to deduplicate the alignment result so that the number of short sequences aligned to any one position of the reference genome is less than or equal to 1; a realignment unit 213, configured to perform local realignment on the deduplicated alignment result; a calculation unit 214, configured to recalculate the base quality scores in the alignment result after local realignment; and a preliminary calling unit 215, configured to perform SNP and indel analysis on the alignment result after local realignment according to the base quality scores, to obtain the preliminary variant site information.
In summary, with the method and device for acquiring mutation sites of genes corresponding to the cardiovascular system provided by the embodiments of the present invention, after the sites to be checked of the gene under test are obtained, they are compared with the plurality of variant sites of the corresponding genes in the cardiovascular system gene bank. The mutations of the variant sites in the gene under test that are related to the cardiovascular system can thereby be obtained, to assist in judging the possible disease status with respect to cardiovascular diseases.
It should be noted that the embodiments in this specification are described in a progressive manner. Each embodiment focuses on its differences from the other embodiments, and the identical or similar parts of the embodiments may be referred to one another. The device embodiments are described relatively briefly because they are basically similar to the method embodiments; for the related parts, refer to the corresponding description of the method embodiments.
In the several embodiments provided in this application, it should be understood that the disclosed devices and methods may also be implemented in other ways. The device embodiments described above are merely illustrative. For example, the flowcharts and block diagrams in the accompanying drawings show the possible architecture, functions, and operations of devices, methods, and computer program products according to multiple embodiments of this application. In this regard, each block in a flowchart or block diagram may represent a module, a program segment, or a part of code, which contains one or more executable instructions for implementing the specified logical function. It should also be noted that, in some alternative implementations, the functions marked in the blocks may occur in an order different from that marked in the drawings. For example, two consecutive blocks may in fact be executed substantially in parallel, and they may sometimes be executed in the reverse order, depending on the functions involved. It should also be noted that each block in the block diagrams and/or flowcharts, and each combination of blocks in the block diagrams and/or flowcharts, may be implemented by a dedicated hardware-based system that performs the specified functions or actions, or by a combination of dedicated hardware and computer instructions.
In addition, the functional modules in the embodiments of this application may be integrated to form one independent part, each module may exist separately, or two or more modules may be integrated to form one independent part.
If the functions are implemented in the form of software functional modules and sold or used as an independent product, they may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of this application in essence, or the part that contributes to the prior art, or a part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server 100, a network device, or the like) to execute all or some of the steps of the methods described in the embodiments of this application. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a portable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc. It should be noted that, in this document, relational terms such as "first", "second", and "another" are used only to distinguish one entity or operation from another, and do not necessarily require or imply any actual relationship or order between those entities or operations. Moreover, the terms "include", "comprise", and any variants thereof are intended to cover non-exclusive inclusion, so that a process, method, article, or device that includes a series of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or device. Without further limitation, an element defined by the phrase "including a ..." does not exclude the presence of other identical elements in the process, method, article, or device that includes the element.
The foregoing is merely the preferred embodiments of this application and is not intended to limit this application. For those skilled in the art, this application may have various modifications and variations. Any modification, equivalent replacement, improvement, and the like made within the spirit and principles of this application shall fall within the protection scope of this application. It should be noted that similar reference numerals and letters denote similar items in the accompanying drawings; therefore, once an item is defined in one drawing, it need not be further defined or explained in subsequent drawings.
The above are merely specific embodiments of this application, but the protection scope of this application is not limited thereto. Any change or replacement that a person skilled in the art can readily conceive within the technical scope disclosed in this application shall fall within the protection scope of this application. Therefore, the protection scope of this application shall be subject to the protection scope of the claims.

Claims (10)

1. A method for acquiring mutation sites of genes corresponding to the cardiovascular system, characterized in that the method comprises:
Aligning a plurality of short sequences of a gene under test with a reference genome to obtain preliminary variant site information of the gene under test, the preliminary variant site information including the mutated bases of a plurality of preliminary variant sites and the position information of each preliminary variant site;
According to the preliminary variant site information, deleting the variant sites among the plurality of preliminary variant sites that do not satisfy preset retention conditions, and taking the variant sites of the gene under test that remain after deletion as sites to be checked;
Comparing the sites to be checked with a plurality of variant sites of the genes corresponding to the cardiovascular system in a cardiovascular system gene bank, the cardiovascular system gene bank including the mutated base of each variant site of the genes corresponding to the cardiovascular system and the position of each variant site;
When the sites to be checked contain variant sites whose position and mutated base are both identical to those of variant sites in the cardiovascular system gene bank, obtaining the site mutation situation of the genes corresponding to the cardiovascular system in the gene under test.
2. The method according to claim 1, characterized in that the cardiovascular system gene bank further includes the effect on protein function of each mutated base of each variant site of the genes corresponding to the cardiovascular system, and the method further comprises:
According to the site mutation situation of the genes corresponding to the cardiovascular system in the gene under test, determining the effect on protein function of the mutation at each variant site in the gene under test.
3. The method according to claim 1, characterized in that, before comparing the sites to be checked with the plurality of variant sites of the genes corresponding to the cardiovascular system in the cardiovascular system gene bank, the method further comprises establishing the cardiovascular system gene bank, and establishing the cardiovascular system gene bank comprises:
Obtaining gene site information related to the cardiovascular system from the COSMIC gene database and the NCBI ClinVar database, the gene site information including the mutated base of each variant site of the genes corresponding to the cardiovascular system and the position of each variant site;
Deleting from the gene site information the gene site information whose credibility is below a preset standard and the gene site information that is erroneous, the remaining gene site information forming the cardiovascular system gene bank.
4. The method according to claim 3, characterized by further comprising:
Updating the cardiovascular system gene bank at preset time intervals.
5. The method according to claim 1, characterized in that deleting the variant sites among the plurality of preliminary variant sites that do not satisfy the preset retention conditions comprises one or more of the following:
Removing, from the plurality of preliminary variant sites, the variant sites whose number of alleles exceeds a preset threshold;
Deleting, from the plurality of preliminary variant sites, all variant sites located in the upstream span or the downstream span of each insertion or deletion, the number of bases in the upstream span and in the downstream span being a preset number;
Deleting, from the plurality of preliminary variant sites, the variant sites that lie within a preset number of bases of one another;
Deleting, from the plurality of preliminary variant sites, the variant sites whose GQ value is below a preset GQ threshold;
Deleting, from the plurality of preliminary variant sites, the variant sites whose MQ value is below a preset MQ threshold.
6. The method according to claim 1, characterized in that aligning the plurality of short sequences of the gene under test with the reference genome to obtain the preliminary variant site information of the gene under test comprises:
Performing an initial alignment of the plurality of short sequences of the gene under test against the reference genome to obtain an alignment result in SAM format;
Deduplicating the alignment result so that the number of short sequences aligned to any one position of the reference genome is less than or equal to 1;
Performing local realignment on the deduplicated alignment result;
Recalculating the base quality scores in the alignment result after local realignment;
According to the base quality scores, performing SNP and indel analysis on the alignment result after local realignment to obtain the preliminary variant site information.
7. The method according to claim 1, characterized in that the variant sites are SNPs.
8. A device for acquiring mutation sites of genes corresponding to the cardiovascular system, characterized in that the device comprises:
An alignment module, configured to align a plurality of short sequences of a gene under test with a reference genome to obtain preliminary variant site information of the gene under test, the preliminary variant site information including the mutated bases of a plurality of preliminary variant sites and the position information of each preliminary variant site;
A filtering module, configured to delete, according to the preliminary variant site information, the variant sites among the plurality of preliminary variant sites that do not satisfy preset retention conditions, and to take the variant sites of the gene under test that remain after deletion as sites to be checked;
A comparison module, configured to compare the sites to be checked with a plurality of variant sites of the genes corresponding to the cardiovascular system in a cardiovascular system gene bank, the cardiovascular system gene bank including the mutated base of each variant site of the genes corresponding to the cardiovascular system and the position of each variant site;
A mutation acquisition module, configured to obtain the site mutation situation of the genes corresponding to the cardiovascular system in the gene under test when the sites to be checked contain variant sites whose position and mutated base are both identical to those of variant sites in the cardiovascular system gene bank.
9. The device according to claim 8, characterized by further comprising a gene bank establishing module configured to establish the cardiovascular system gene bank, the gene bank establishing module comprising:
A data acquisition unit, configured to obtain gene site information related to the cardiovascular system from the COSMIC gene database and the NCBI ClinVar database, the gene site information including the mutated base of each variant site of the genes corresponding to the cardiovascular system and the position of each variant site;
A data deletion unit, configured to delete from the gene site information the gene site information whose credibility is below a preset standard and the gene site information that is erroneous, the remaining gene site information forming the cardiovascular system gene bank.
10. The device according to claim 8, characterized in that the filtering module comprises one or more of the following:
A first deletion unit, configured to remove, from the plurality of preliminary variant sites, the variant sites whose number of alleles exceeds a preset threshold;
A second deletion unit, configured to delete, from the plurality of preliminary variant sites, all variant sites located in the upstream span or the downstream span of each insertion or deletion, the number of bases in the upstream span and in the downstream span being a preset number;
A third deletion unit, configured to delete, from the plurality of preliminary variant sites, the variant sites that lie within a preset number of bases of one another;
A fourth deletion unit, configured to delete, from the plurality of preliminary variant sites, the variant sites whose GQ value is below a preset GQ threshold;
A fifth deletion unit, configured to delete, from the plurality of preliminary variant sites, the variant sites whose MQ value is below a preset MQ threshold.
CN201610973131.7A 2016-11-04 2016-11-04 The acquisition methods and device in the mutational site of the corresponding gene of cardiovascular system Pending CN106503489A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610973131.7A CN106503489A (en) 2016-11-04 2016-11-04 The acquisition methods and device in the mutational site of the corresponding gene of cardiovascular system

Publications (1)

Publication Number Publication Date
CN106503489A true CN106503489A (en) 2017-03-15

Family

ID=58323613

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610973131.7A Pending CN106503489A (en) 2016-11-04 2016-11-04 The acquisition methods and device in the mutational site of the corresponding gene of cardiovascular system

Country Status (1)

Country Link
CN (1) CN106503489A (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101539967A (en) * 2008-12-12 2009-09-23 深圳华大基因研究院 Method for detecting mononucleotide polymorphism
CN104462869A (en) * 2014-11-28 2015-03-25 天津诺禾致源生物信息科技有限公司 Method and device for detecting somatic cell SNP
US20160188793A1 (en) * 2014-12-29 2016-06-30 Counsyl, Inc. Method For Determining Genotypes in Regions of High Homology
CN106011224A (en) * 2015-12-24 2016-10-12 晶能生物技术(上海)有限公司 Nervous system genetic disease gene united screening method, kit and preparation method thereof

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
GABOR T MARTH ET AL: "The functional spectrum of low-frequency coding variation", 《GENOME BIOLOGY》 *
张颖等: "成骨不全症家系基因突变位点的检测", 《中国医学工程》 *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107122624A (en) * 2017-05-01 2017-09-01 杨永臣 The HGVS titles generation of human mutation and the implementation method of analysis system
CN107122624B (en) * 2017-05-01 2021-11-12 杨永臣 Method for realizing HGVS name generation and analysis system of human gene mutation
CN109979531A (en) * 2019-03-29 2019-07-05 北京市商汤科技开发有限公司 A kind of genetic mutation recognition methods, device and storage medium
CN109979531B (en) * 2019-03-29 2021-08-31 北京市商汤科技开发有限公司 Gene variation identification method, device and storage medium

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20170315