CN106503488A - Method and device for acquiring mutation sites of genes corresponding to the digestive system - Google Patents

Publication number
CN106503488A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201610972446.XA
Other languages
Chinese (zh)
Inventor
范振鑫
刘鱼
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chengdu Xin Yun Decoding Technology Co Ltd
Original Assignee
Chengdu Xin Yun Decoding Technology Co Ltd
Application filed by Chengdu Xin Yun Decoding Technology Co Ltd
Priority to CN201610972446.XA
Publication of CN106503488A
Legal status: Pending

Classifications

    • G: PHYSICS
    • G16: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B: BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B20/00: ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations

Abstract

This application provides a method and device for acquiring mutation sites of genes corresponding to the digestive system, and relates to the technical field of bioinformatics. The method includes: aligning multiple short sequences of a gene under test against a reference genome to obtain preliminary variant site information of the gene under test; according to the preliminary variant site information, deleting from the multiple preliminary variant sites those that do not satisfy a preset retention condition, and taking the variant sites of the gene under test that remain after deletion as sites to be checked; comparing the sites to be checked with multiple variant sites of genes corresponding to the digestive system in a digestive system gene library; and, when a site to be checked has the same position and the same mutant base as a variant site in the digestive system gene library, obtaining the site mutation status of the genes corresponding to the digestive system in the gene under test. The method and device can obtain the mutation status of the multiple digestive-system-related variant sites among the variant sites of the gene under test.

Description

Method and device for acquiring mutation sites of genes corresponding to the digestive system
Technical field
The present application relates to the technical field of bioinformatics, and in particular to a method and device for acquiring mutation sites of genes corresponding to the digestive system.
Background
With the development and maturation of medical science, genomics, and high-throughput sequencing technologies, precision medicine (Precision Medicine) has been adopted in countries around the world and has become a new medical model. Precision medicine is an approach to disease prevention and treatment that takes into account individual differences in genes, environment, and lifestyle, and formulates personalized, precise medical and health management plans according to each person's genetic information.
Because everyone's genetic background is different, this process requires determining the mutation status of each person's genome, or of certain genes associated with a particular organ or body part. The base mutation status is then analyzed and compared to determine the final likelihood of disease, so that a corresponding medical and health management plan can be formulated.
The digestive system (digestive system) consists of two major parts: the digestive tract and the digestive glands. The digestive tract includes the oral cavity, pharynx, esophagus, stomach, small intestine (duodenum, jejunum, ileum), and large intestine (cecum, appendix, colon, rectum, anal canal). The digestive glands comprise small digestive glands and large digestive glands. The small digestive glands are scattered in the walls of the various parts of the digestive tract; the large digestive glands comprise three pairs of salivary glands (parotid, submandibular, sublingual), the liver, and the pancreas. The digestive system is an important system of the organism, and pathological changes in it can have extremely serious effects. It is therefore very important to take preventive measures against digestive system diseases in order to reduce the probability of onset.
Because the incidence of digestive system diseases is related to genes to some extent, differences in the base mutation status at sites of genes corresponding to the digestive system may lead to differences in the incidence and onset probability of different digestive system diseases. The precision medicine model can therefore be used to predict the incidence and probability of digestive system diseases from the base mutation status of the genes corresponding to the digestive system combined with other information, which is an effective way to prevent such diseases.
Existing approaches to determining the site mutation status of digestive system genes generally use chemical methods to obtain the base mutation status at one specified site of the gene under test. The number of mutation sites obtained in this way is limited: usually only the mutation status of one or a few bases can be obtained, and it is not possible to determine simultaneously the mutation status of as many variant sites as possible of the genes corresponding to the digestive system in the gene under test. As a result, subsequent predictions of digestive system disease status that combine other information may deviate considerably.
Content of the invention
In view of this, the embodiment of the present application provides a kind of acquisition methods in the mutational site of the corresponding gene of digestive system And device, by by multiple changes of corresponding with the digestive system in digestive system gene pool for the variant sites of testing gene gene Ectopic sites are compared, it is hereby achieved that the base of multiple variant sites of the corresponding gene of digestive system in testing gene Catastrophe, to improve the problems referred to above.
To achieve these goals, the technical scheme that the application is adopted is as follows:
A kind of acquisition methods in the mutational site of the corresponding gene of digestive system, methods described include:By testing gene Multiple short sequences carry out comparing with reference gene group, obtain the preliminary variant sites information of testing gene, the preliminary change Ectopic sites information includes the mutating alkali yl of multiple preliminary variant sites and the positional information of each preliminary variant sites;According to The preliminary variant sites information, the variant sites for being unsatisfactory for default reserve in the plurality of preliminary variant sites are deleted Remove, using the variant sites in the testing gene obtained after deletion as site to be checked;By the site to be checked and Digestive Multiple variant sites of the corresponding gene of digestive system in system gene pool are compared, and the digestive system gene pool includes The mutating alkali yl of each variant sites of the corresponding gene of digestive system and each variant sites position;When described to be checked There are and mutating alkali yl identical variant sites identical with position in the digestive system gene pool in site, obtain described to be measured The site mutation situation of the corresponding gene of digestive system in gene.
A kind of acquisition device in the mutational site of the corresponding gene of digestive system, described device include:Comparing module, is used for The multiple short sequence of testing gene and reference gene group are carried out comparing, the preliminary variant sites letter of testing gene is obtained Breath, the preliminary variant sites information include the mutating alkali yl of multiple preliminary variant sites and each preliminary variant sites Positional information;Filtering module, for according to the preliminary variant sites information, will be unsatisfactory in the plurality of preliminary variant sites The variant sites of default reserve are deleted, using the variant sites in the testing gene obtained after deletion as position to be checked Point;Comparison module, for by multiple changes of corresponding with the digestive system in digestive system gene pool for the site to be checked gene Ectopic sites are compared, and the digestive system gene pool includes the mutation of each variant sites of the corresponding gene of digestive system Base and each variant sites position;Mutation acquisition module, when existing in the site to be checked and the digestive system In gene pool, position is identical and mutating alkali yl identical variant sites, corresponding for obtaining digestive system in the testing gene The site mutation situation of gene.
The acquisition methods and device in the mutational site of the corresponding gene of digestive system that the embodiment of the present application is provided, are obtaining In the case of the variant sites of testing gene, by the variant sites of testing gene with digestive system pair in digestive system gene pool Multiple variant sites of the gene that answers are compared, and digestive system gene pool includes each change of the corresponding gene of digestive system The mutating alkali yl of ectopic sites and each variant sites position.When presence in testing gene and digestive system gene pool middle position Put identical and mutating alkali yl identical variant sites, it may be determined that there is the corresponding gene of digestive system in the testing gene and dash forward Become.
As digestive system gene pool includes multiple variant sites related to digestive system, then this programme can determine Multiple variant sites related to digestive system in testing gene, and the concrete base mutation situation of the plurality of variant sites.
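The comparison described above amounts to an exact match on position and mutant base. The following is a minimal sketch of that idea only; the `Variant` record, the `match_library` function, and the sample data are hypothetical names and values introduced for illustration and do not come from the application.

```python
from typing import NamedTuple


class Variant(NamedTuple):
    """Hypothetical record for one variant site."""
    chrom: str  # chromosome name
    pos: int    # 1-based position on the reference genome
    alt: str    # mutant base


def match_library(candidates, library):
    """Return the candidate sites whose position AND mutant base both
    appear in the gene library."""
    known = set(library)
    return [v for v in candidates if v in known]


# Made-up data, for illustration only.
library = [Variant("chr1", 1000, "A"), Variant("chr2", 500, "T")]
candidates = [
    Variant("chr1", 1000, "A"),  # same position, same mutant base: match
    Variant("chr1", 1000, "G"),  # same position, different base: no match
    Variant("chr3", 42, "C"),    # position not in the library: no match
]
hits = match_library(candidates, library)
```

Storing the library as a set keeps each lookup constant-time on average, which matters when the library holds many sites.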
To make the above objects, features, and advantages of the present application more apparent, preferred embodiments are described in detail below with reference to the accompanying drawings.
Brief description of the drawings
To make the objects, technical solutions, and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are some, rather than all, of the embodiments of the present application. All other embodiments obtained by those of ordinary skill in the art based on the embodiments of the present application without creative effort fall within the scope of protection of the present application.
Fig. 1 shows a schematic structural diagram of the computer provided by the embodiments of the present application;
Fig. 2 shows a flow chart of the method for acquiring mutation sites of genes corresponding to the digestive system provided by the first embodiment of the present application;
Fig. 3 shows a flow chart of some of the steps of the method for acquiring mutation sites of genes corresponding to the digestive system provided by the first embodiment of the present application;
Fig. 4 shows a functional block diagram of the device for acquiring mutation sites of genes corresponding to the digestive system provided by the second embodiment of the present application;
Fig. 5 shows a functional block diagram of the gene library building module of the device for acquiring mutation sites of genes corresponding to the digestive system provided by the second embodiment of the present application;
Fig. 6 shows a functional block diagram of the filtering module of the device for acquiring mutation sites of genes corresponding to the digestive system provided by the second embodiment of the present application;
Fig. 7 shows a functional block diagram of the alignment module of the device for acquiring mutation sites of genes corresponding to the digestive system provided by the second embodiment of the present application.
Detailed description
The technical solutions in the embodiments of the present application are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some, rather than all, of the embodiments of the present application. The components of the embodiments of the present application, as generally described and illustrated in the drawings, can be arranged and designed in a variety of different configurations. Therefore, the following detailed description of the embodiments provided in the drawings is not intended to limit the scope of the claimed application, but merely represents selected embodiments of the present application. All other embodiments obtained by those skilled in the art based on the embodiments of the present application without creative effort fall within the scope of protection of the present application.
It should be noted that similar reference numerals and letters denote similar items in the following drawings; therefore, once an item is defined in one drawing, it need not be further defined or explained in subsequent drawings. In addition, in the description of the present application, the terms "first", "second", and so on are used only to distinguish descriptions and should not be understood as indicating or implying relative importance.
Fig. 1 is a block diagram of the computer 100 of the present application. The computer 100 includes the device 200 for acquiring mutation sites of genes corresponding to the digestive system, a memory 101, a memory controller 102, a processor 103, a peripheral interface 104, an input/output unit 105, and other components.
The memory 101, memory controller 102, processor 103, peripheral interface 104, and input/output unit 105 are electrically connected to one another, directly or indirectly, to enable the transmission or exchange of data. For example, these elements may be electrically connected to one another through one or more communication buses or signal lines. The device 200 for acquiring mutation sites of genes corresponding to the digestive system includes at least one software function module that can be stored in the memory 101 in the form of software or firmware, or built into the operating system (OS) of the computer 100. The processor 103 is used to execute the executable modules stored in the memory 101, for example the software function modules or computer programs included in the device 200.
The memory 101 may be, but is not limited to, random access memory (Random Access Memory, RAM), read-only memory (Read Only Memory, ROM), programmable read-only memory (Programmable Read-Only Memory, PROM), erasable programmable read-only memory (Erasable Programmable Read-Only Memory, EPROM), electrically erasable programmable read-only memory (Electric Erasable Programmable Read-Only Memory, EEPROM), and the like. The memory 101 is used to store a program, and the processor 103 executes the program after receiving an execution instruction. The method performed by the computer 100, as defined by the processes disclosed in any embodiment of the present application, may be applied to the processor 103 or implemented by the processor 103.
The processor 103 may be an integrated circuit chip with signal processing capability. The processor 103 may be a general-purpose processor, including a central processing unit (Central Processing Unit, CPU), a network processor (Network Processor, NP), and the like; it may also be a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component. It can implement or execute the methods, steps, and logic block diagrams disclosed in the embodiments of the present application. The general-purpose processor may be a microprocessor, or the processor 103 may be any conventional processor.
The peripheral interface 104 couples various input/output devices to the processor 103 and the memory 101. In some embodiments, the peripheral interface 104, the processor 103, and the memory controller 102 may be implemented in a single chip. In other examples, they may each be implemented by a separate chip.
The input/output unit 105 allows the user to input data, enabling interaction between the user and the computer. The input/output unit may be, but is not limited to, a data reading device, a mouse, a keyboard, and the like.
It should be understood that the structure shown in Fig. 1 is only illustrative; the computer 100 may include more or fewer components than shown in Fig. 1, or have a configuration different from that shown in Fig. 1. Each component shown in Fig. 1 may be implemented in hardware, software, or a combination thereof.
First embodiment
This embodiment of the present application provides a method for acquiring mutation sites of genes corresponding to the digestive system, used to obtain the base mutation status of the variant sites of digestive-system-related genes in a gene under test. Referring to Fig. 2, the method includes:
Step S110: Align multiple short sequences of the gene under test against a reference genome to obtain preliminary variant site information of the gene under test, the preliminary variant site information including the mutant bases of multiple preliminary variant sites and the position information of each preliminary variant site.
First, multiple short sequences of the gene under test are obtained; the short sequences may be output by a second-generation sequencing platform. The short sequences of the gene under test are aligned against the reference genome. For example, if the gene under test is a human gene, the reference genome is the human reference genome.
Of course, the alignment process may include several rounds of alignment, duplicate removal, and other processing, after which the preliminary variant site information including multiple variant sites is obtained.
Specifically, as shown in Fig. 3, in this embodiment the process of aligning the data to obtain the preliminary variant site information in this step may include:
Step S111: First align the multiple short sequences of the gene under test against the reference genome to obtain an alignment result in SAM format.
The short sequences of the gene under test are aligned against the reference genome. This can be done with existing alignment software such as Bowtie2, which yields an alignment result in SAM format; the SAM-format result stores the alignment information obtained from the alignment. It should be understood that the SAM-format result includes information on each base in the gene under test, such as position information.
Of course, the alignment software used and the representation of the alignment result are not limited in this embodiment, as long as the multiple short sequences of the gene under test can be aligned against the reference genome and alignment information representing the result can be obtained.
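To illustrate what a SAM-format alignment result carries, the following minimal sketch reads the mandatory tab-separated columns of one SAM alignment line (QNAME, FLAG, RNAME, POS, MAPQ, CIGAR, RNEXT, PNEXT, TLEN, SEQ, QUAL). The `SamRecord` class, the `parse_sam_line` function, and the sample line are hypothetical and not part of the application; real pipelines would normally rely on an established SAM/BAM library.

```python
from dataclasses import dataclass


@dataclass
class SamRecord:
    """Subset of the mandatory SAM columns (hypothetical helper type)."""
    qname: str  # read name
    flag: int   # bitwise flag
    rname: str  # reference sequence name
    pos: int    # 1-based leftmost mapping position
    mapq: int   # mapping quality
    cigar: str  # CIGAR string
    seq: str    # read bases
    qual: str   # per-base qualities, ASCII-encoded


def parse_sam_line(line):
    """Parse one SAM line; return None for header lines starting with '@'."""
    if line.startswith("@"):
        return None
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 11:
        raise ValueError("a SAM alignment line needs at least 11 columns")
    return SamRecord(
        qname=fields[0], flag=int(fields[1]), rname=fields[2],
        pos=int(fields[3]), mapq=int(fields[4]), cigar=fields[5],
        seq=fields[9], qual=fields[10],
    )


# Made-up alignment line, for illustration only.
record = parse_sam_line("read1\t0\tchr1\t100\t42\t5M\t*\t0\t0\tACGTA\tIIIII")
```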
Step S112: Remove duplicates from the alignment result so that the number of short sequences aligned to any one position of the reference genome is less than or equal to 1.
The alignment result obtained in step S111 contains a certain proportion of duplicate sequences and results; for example, multiple short sequences may be aligned to the same position of the reference genome. In this step, duplicates are therefore removed from the alignment result.
In this embodiment, the software Picard can be used for duplicate removal. Specifically, Picard's MarkDuplicates tool can be used, yielding a deduplicated result in BAM format.
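The effect of duplicate removal, at most one short sequence per reference position, can be pictured with the simplified sketch below. It is not Picard's algorithm: the tuple layout, the `dedup_by_position` function, and the rule of keeping the read with the highest summed base quality are assumptions made for illustration, and read strand and pairing are ignored.

```python
def dedup_by_position(reads):
    """Keep one read per (chromosome, position).

    `reads` is an iterable of (name, chrom, pos, qual_sum) tuples, a
    hypothetical layout. Among reads at the same position, the one with
    the highest summed base quality is kept (an assumed tie-break rule).
    """
    best = {}
    for read in reads:
        _name, chrom, pos, qual_sum = read
        key = (chrom, pos)
        if key not in best or qual_sum > best[key][3]:
            best[key] = read
    # Return the survivors in genomic order.
    return sorted(best.values(), key=lambda r: (r[1], r[2]))


# Made-up reads, for illustration only.
reads = [
    ("r1", "chr1", 100, 150),
    ("r2", "chr1", 100, 180),  # duplicate position, higher quality: kept
    ("r3", "chr1", 200, 90),
]
unique_reads = dedup_by_position(reads)
```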
Step S113: Perform local realignment (local multiple alignment) on the deduplicated alignment result.
Because the short sequences aligned against the reference genome are difficult to align accurately to highly similar repeat regions, false-positive variant sites, such as false-positive SNPs, are easily obtained in the repeat regions of the genome. A false-positive variant site is a variant site resulting from an incorrect alignment. To reduce the number and proportion of false-positive variant sites, in this embodiment local realignment is performed on the deduplicated alignment result.
Specifically, the local realignment (local multiple alignment) can be performed with IndelRealigner in GATK, yielding a realigned result in BAM format. The realignment process generally has three steps: a. detect suspicious regions that need realignment; b. realign these suspicious regions; c. repair the mate-pair information lost during realignment.
Step S114: Recalculate the base quality scores in the locally realigned alignment result.
In step S111 of the foregoing processing, each individual base is assigned a quality score (Quality score) during data processing, which reflects the confidence in the nucleotide observed for that base.
The quality scores obtained during the foregoing processing are not well correlated with the likelihood of erroneous genotyping results. Moreover, the quality score of a single base is not linked to other parameters, for example, within the same sample, the sequencing platform, the sequencing cycle, or the library.
Therefore, in step S114, the quality score of each base is linked to the various factors in the sequencing process and recalculated, generating a new quality score that is used to judge whether each base is credible.
Specifically, in this embodiment, GATK can be used to perform empirical quality score recalibration, yielding a result in BAM format.
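The idea behind empirical recalibration can be sketched as follows: bases are grouped by covariates such as reported quality and sequencing cycle, the observed mismatch rate in each group is measured, and that rate is converted back to a Phred-scaled score. This is a simplified illustration, not GATK's implementation; the `empirical_qualities` function, the choice of covariates, and the +1/+2 smoothing are assumptions.

```python
import math
from collections import defaultdict


def empirical_qualities(observations):
    """Compute an empirical Phred score per (reported_quality, cycle) bin.

    `observations` is an iterable of (reported_quality, cycle, is_mismatch)
    tuples, a hypothetical layout. The +1/+2 smoothing (an assumption)
    avoids taking log10 of zero in bins without mismatches.
    """
    counts = defaultdict(lambda: [0, 0])  # bin -> [mismatches, total]
    for reported_q, cycle, is_mismatch in observations:
        bin_counts = counts[(reported_q, cycle)]
        bin_counts[0] += int(is_mismatch)
        bin_counts[1] += 1
    return {
        key: -10 * math.log10((mismatches + 1) / (total + 2))
        for key, (mismatches, total) in counts.items()
    }


# Made-up data: 1000 bases reported as Q30 at cycle 1, 10 of them mismatches.
observations = [(30, 1, i < 10) for i in range(1000)]
table = empirical_qualities(observations)
```

In this made-up example the bases were reported as Q30 but behave like roughly Q20, which is the kind of discrepancy recalibration is meant to correct.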
Step S115: According to the base quality scores, perform SNP and indel analysis on the locally realigned alignment result to obtain the preliminary variant site information.
According to the recalculated base quality scores, preliminary calling of SNPs and indels is performed on the locally realigned alignment result, that is, SNP and indel genotyping, to obtain variant site information including multiple variant sites. This variant site information serves as the preliminary variant site information, and the multiple variant sites it includes serve as the preliminary variant sites. The preliminary variant site information includes the mutant bases of the multiple preliminary variant sites and the position of each variant site. In this embodiment the variant sites are SNPs and indels; preferably, the variant sites are SNPs only.
Specifically, in this step, GATK's UnifiedGenotyper can be used for the analysis. Because many data filtering parameters are applied to filter the data again after SNP genotyping is completed, in order to further control data quality, the standard minimum confidence thresholds are both set to zero in this step. It should be understood that SNPs is the plural of SNP.
Of course, the preliminary calling of SNPs and indels may also be carried out in other ways, for example with GATK's HaplotypeCaller; this embodiment is not limited in this respect.
In this step, a VCF file containing the preliminary variant site information can be obtained. The preliminary variant site information in the VCF file includes each variant site obtained in step S110 and its corresponding position information, as well as other information not detailed here.
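As an illustration of how positions and mutant bases can be read from such a file, the sketch below parses the fixed leading columns of VCF data lines (CHROM, POS, ID, REF, ALT). The `parse_vcf_sites` function, the dictionary layout, and the sample lines are hypothetical and only meant to show the shape of the data.

```python
def parse_vcf_sites(lines):
    """Extract chromosome, position, reference base, and mutant bases
    from VCF text lines, skipping header lines that start with '#'."""
    sites = []
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        sites.append({
            "chrom": fields[0],
            "pos": int(fields[1]),         # 1-based position
            "ref": fields[3],              # reference allele
            "alts": fields[4].split(","),  # one entry per alternate allele
        })
    return sites


# Made-up VCF content, for illustration only.
vcf_lines = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr1\t100\t.\tA\tG\t50\tPASS\t.",
    "chr1\t200\t.\tC\tT,G\t60\tPASS\t.",
]
sites = parse_vcf_sites(vcf_lines)
```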
Step S120: According to the preliminary variant site information, delete from the multiple preliminary variant sites the variant sites that do not satisfy the preset retention condition, and take the variant sites of the gene under test that remain after deletion as the sites to be checked.
The preliminary variant sites in the preliminary variant site information obtained in step S110 may still include false-positive variant sites. This step therefore filters the preliminary variant sites further, deleting those with a higher likelihood of being false positives, and takes the variant sites remaining after deletion as the variant sites of the gene under test, so that the variant sites finally obtained are more accurate. It should be understood that the result after deletion also includes the position information of each variant site and other information, which is not detailed here.
Specifically, in this step, the variant sites that do not satisfy the preset retention condition may be deleted in one or more of the following modes:
Mode one: Remove, from the multiple preliminary variant sites, the variant sites whose number of alleles is greater than a preset threshold.
Variant sites whose number of alleles exceeds the preset threshold are more likely to be false positives and are therefore removed. In this embodiment, the preset threshold can be set according to actual needs. Because sites containing more than one allele have a higher rate of genotyping error, the preset threshold is preferably 1.
When the preset threshold is 1, variant sites with more than one allele are removed.
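Mode one can be sketched as a simple count filter. The sketch assumes that "number of alleles" means the number of alternate (mutant) alleles recorded for a site; that reading, the `drop_multiallelic` function, and the dictionary layout are assumptions made for illustration.

```python
def drop_multiallelic(sites, max_alt_alleles=1):
    """Keep only sites with at most `max_alt_alleles` alternate alleles.

    Each site is a dict with an "alts" list (hypothetical layout). The
    default of 1 mirrors the preferred threshold in the text.
    """
    return [s for s in sites if len(s["alts"]) <= max_alt_alleles]


# Made-up sites, for illustration only.
sites = [
    {"pos": 100, "alts": ["G"]},       # one alternate allele: kept
    {"pos": 200, "alts": ["T", "G"]},  # two alternate alleles: removed
]
kept = drop_multiallelic(sites)
```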
Mode two: Delete, from the multiple preliminary variant sites, all variant sites located within the upstream range or the downstream range of each insertion-deletion (indel), where the upstream range and the downstream range each include a preset number of bases.
The short sequences being aligned are often output by a second-generation sequencing platform, and such short sequences are more prone to misalignment near insertion-deletion (indel) regions; the local realignment in the foregoing processing cannot completely eliminate this kind of error. All variant sites within the upstream range or downstream range of an indel are therefore deleted to reduce the likelihood of false-positive results.
The number of bases included in the upstream range and in the downstream range is a preset number, which can be determined by the user according to actual needs and is not restricted in this embodiment; the preset numbers for the upstream range and the downstream range may be the same or different.
In this embodiment, the upstream range preferably includes 5 bases, and the downstream range preferably includes 5 bases. That is, all indels among the preliminary variant sites are identified, and for each indel, all variant sites within 5 bp (5 bases) upstream of it are deleted, or all variant sites within 5 bp downstream of it are deleted.
Of course, in this embodiment, it is possible to delete only the variant sites in the upstream range of an indel or only those in the downstream range, or to delete the variant sites in both the upstream range and the downstream range.
Preferably, in this embodiment, what is deleted is all SNPs within the upstream range or downstream range of an insertion-deletion (indel).
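A minimal sketch of mode two follows, under stated assumptions: a site is treated as an indel when its reference and mutant alleles differ in length, the indel is taken to span the positions covered by its reference allele, both flanks are filtered, and only SNPs are removed. The function names and data layout are hypothetical.

```python
def is_indel(site):
    """Treat a site as an indel when REF and ALT differ in length."""
    return len(site["ref"]) != len(site["alt"])


def drop_snps_near_indels(sites, flank=5):
    """Remove SNPs lying within `flank` bases upstream or downstream of
    any indel on the same chromosome; the indels themselves are kept."""
    indels = [s for s in sites if is_indel(s)]
    kept = []
    for site in sites:
        if is_indel(site):
            kept.append(site)
            continue
        near_indel = any(
            indel["chrom"] == site["chrom"]
            and indel["pos"] - flank
            <= site["pos"]
            <= indel["pos"] + len(indel["ref"]) - 1 + flank
            for indel in indels
        )
        if not near_indel:
            kept.append(site)
    return kept


# Made-up sites, for illustration only.
sites = [
    {"chrom": "chr1", "pos": 100, "ref": "AT", "alt": "A"},  # deletion
    {"chrom": "chr1", "pos": 97, "ref": "C", "alt": "T"},    # 3 bp upstream
    {"chrom": "chr1", "pos": 107, "ref": "G", "alt": "A"},   # outside flank
    {"chrom": "chr2", "pos": 100, "ref": "G", "alt": "C"},   # other chromosome
]
kept = drop_snps_near_indels(sites)
```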
Mode three: Delete, from the multiple preliminary variant sites, the variant sites that are separated from one another by fewer than a preset number of bases.
In this step, variant sites that lie close to one another are deleted; that is, variant sites whose distance from one another is less than a certain value are deleted.
In this embodiment, the preset number of bases is not limited and can be set according to actual needs.
Preferably, the preset number of bases is 4: if there are variant sites separated from one another by fewer than 4 bases, they are deleted. That is, variant sites lying within 5 bp upstream or downstream of one another are deleted.
Preferably, in this step, what is deleted is SNPs separated from one another by fewer than the preset number of bases.
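Mode three can be sketched as a neighbour check on sorted positions. The sketch assumes the "within 5 bp of one another" reading, removes every member of a close pair, and represents sites as (chromosome, position) tuples; the `drop_clustered` function, the `window` parameter, and that interpretation are assumptions made for illustration.

```python
def drop_clustered(sites, window=5):
    """Remove every site that has a neighbour on the same chromosome
    within `window` bases; `sites` is a list of (chrom, pos) tuples."""
    ordered = sorted(sites)
    kept = []
    for i, (chrom, pos) in enumerate(ordered):
        close_prev = (
            i > 0
            and ordered[i - 1][0] == chrom
            and pos - ordered[i - 1][1] <= window
        )
        close_next = (
            i + 1 < len(ordered)
            and ordered[i + 1][0] == chrom
            and ordered[i + 1][1] - pos <= window
        )
        if not (close_prev or close_next):
            kept.append((chrom, pos))
    return kept


# Made-up sites, for illustration only.
sites = [("chr1", 100), ("chr1", 103), ("chr1", 200), ("chr2", 101)]
kept = drop_clustered(sites)
```

Because the sites are sorted first, only the immediate neighbours need to be checked: if the nearest neighbour is outside the window, every other site is too.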
Mode four: Delete, from the multiple preliminary variant sites, the variant sites whose GQ (Genotype quality) value is less than a preset GQ threshold.
GQ (Genotype quality) is a Phred-scaled posterior probability (phred-scaled probability) value. For each site, the GQ value expresses the likelihood that the genotype currently obtained for the site is not the true one; in other words, it reflects how much confidence can be placed in the genotype obtained at the site. It is calculated as:
GQ = -10 * log10(P[error]), where P[error] is the probability that the genotype obtained at the corresponding site is not the true one.
Preferably, in this embodiment, the preset GQ threshold is 20. It has been verified that at a GQ threshold of 20 the theoretical error rate is 1%.
Mode five: among the plurality of preliminary variant sites, delete the variant sites whose MQ (mapping quality) value is below a preset MQ threshold.
MQ represents the uniqueness (selectivity) of an alignment. When the same short sequence can be aligned to different regions of the same genome, the larger the difference between the alignment score of the best alignment and the alignment score of the second-best alignment, the more unique the alignment and the higher the MQ value.
In this embodiment, variant sites whose MQ value is below the preset MQ threshold are considered more likely to be false positives and are deleted.
Preferably, in this embodiment, the preset MQ threshold is 30. It has been verified that when the MQ value is 30, P[error] = 0.001, i.e. the probability that the sequence actually belongs at a position other than the one it is currently aligned to is at most 0.1%.
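A minimal Python sketch of the MQ filter follows. It assumes each variant site is held as a dictionary with a hypothetical `"MQ"` field, and it assumes that sites with no MQ value are dropped; neither detail is specified in the text above.

```python
def filter_by_mq(sites, min_mq=30):
    """Keep sites whose MQ is at least `min_mq`.

    Assumption: a site without an MQ value is treated as unreliable
    and is dropped as well.
    """
    return [s for s in sites if s.get("MQ") is not None and s["MQ"] >= min_mq]


sites = [
    {"pos": 1200, "alt": "T", "MQ": 60},
    {"pos": 3400, "alt": "G", "MQ": 12},   # below the threshold, deleted
    {"pos": 5600, "alt": "A", "MQ": 30},   # exactly at the threshold, kept
    {"pos": 7800, "alt": "C"},             # no MQ recorded, deleted
]
kept_positions = [s["pos"] for s in filter_by_mq(sites)]
```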
In the embodiments of the invention, modes one to five are optional: in this step, any one of them, several of them, or all of them may be used. When several modes are used to delete the variant sites that do not meet the retention condition, the order in which the modes are executed is not limited. The modes may also be executed in parallel.
In addition, when several modes are executed serially in step S120, a later mode may operate on the result of an earlier one. For example, suppose mode one (removing, from the plurality of preliminary variant sites, the sites whose number of alleles exceeds the preset threshold) and mode three (deleting the sites separated from each other by fewer than the preset number of bases) are both used, and mode one is executed first. Mode three then deletes closely spaced sites from among the variant sites that remain after mode one.
After the preliminary variant sites have been filtered by deletion in step S120, the variant sites in the final result serve as the sites to be checked of the gene to be tested, and can be represented as a VCF file.
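Serial execution, where each later mode sees only the sites that survived the earlier one, can be sketched in Python as below. The record layout (`"pos"`, `"alleles"`), the function names and the allele threshold of 2 are assumptions made for illustration and are not taken from the text.

```python
def drop_multiallelic(sites, max_alleles=2):
    """Mode one: remove sites with more alleles than the preset threshold."""
    return [s for s in sites if len(s["alleles"]) <= max_alleles]


def drop_clustered(sites, min_gap=4):
    """Mode three: remove sites with fewer than `min_gap` bases between them."""
    ordered = sorted(sites, key=lambda s: s["pos"])
    close = set()
    for a, b in zip(ordered, ordered[1:]):
        if b["pos"] - a["pos"] - 1 < min_gap:
            close.update((a["pos"], b["pos"]))
    return [s for s in ordered if s["pos"] not in close]


def run_serially(sites, filters):
    """Apply each filter to the output of the previous one."""
    for apply_filter in filters:
        sites = apply_filter(sites)
    return sites


sites = [
    {"pos": 100, "alleles": ["A", "C", "G"]},
    {"pos": 102, "alleles": ["T", "C"]},
    {"pos": 300, "alleles": ["G", "A"]},
]
result = run_serially(sites, [drop_multiallelic, drop_clustered])
result_positions = [s["pos"] for s in result]
```

With this toy input, mode one removes position 100 first, so position 102 no longer has a close neighbour when mode three runs. Running mode three first on the same input would also discard position 102.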
Step S130: compare the sites to be checked with the multiple variant sites of the genes corresponding to the digestive system in the digestive system gene pool, where the digestive system gene pool includes the mutated base and the position of each variant site of the genes corresponding to the digestive system.
In the embodiments of the invention, the digestive system gene pool is established first. It includes the mutated base and the position of each variant site of the genes corresponding to the digestive system.
The digestive system gene pool is established before the comparison in step S130. Specifically, it may be established by obtaining the gene locus information related to the digestive system from the COSMIC gene database, the ClinVar database of NCBI, and the gene databases published by other major authoritative international and domestic academic journals, genetic testing companies and relevant government departments. The information obtained mainly comprises the base mutation and the position of each variant site of the genes corresponding to the digestive system.
Of course, the gene locus information may also be obtained from other data sources, which are not limited in this embodiment.
Further, the gene locus information obtained may also include the effect on protein function of each mutated base of each variant site of the genes corresponding to the digestive system, i.e. what effect the change from the normal base to the current mutated base at a given variant site has on the function of the corresponding protein.
Of course, in this embodiment, the gene locus information obtained may also include one or more of the following: the abbreviated gene name for each mutation site, the full gene name, the coordinate of the site in the human genome, the corresponding tissue or organ type, the gene mutation type, the base of the normal gene at the site, whether clinical research indicates that this mutation at this site is pathogenic, the population in which the mutation was originally found, the sex of the original carrier of the mutation, the age of the original carrier of the mutation, and the source in which the mutation was originally recorded.
Then, the gene locus information whose credibility is below a preset standard and the erroneous gene locus information are deleted, and the remaining gene locus information forms the digestive system gene pool.
In this embodiment, gene locus information whose credibility is below the preset standard includes at least one of the following:
1) gene locus information obtained from a non-SCI journal or from a journal with a poor reputation in the field, where a journal with a poor reputation may be one whose impact factor is below a certain value or one that fails some other criterion; 2) gene locus information for which the sample size used in the original publication is below a certain value and is therefore insufficient to support a scientific conclusion; 3) gene locus information for which, in the original publication, the locus is not among the most important loci found, where the most important loci may be the loci ranked in the top 10% of the reported results.
Erroneous gene locus information includes at least one of the following: 1) the original publication cited by the source database does not in fact report the locus; 2) in the original publication that records the locus, the result for the locus is not statistically significant.
Of course, the preset standard and the criteria for judging gene locus information to be erroneous are not limited in this embodiment and can be determined according to actual conditions.
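The deletion of low-credibility and erroneous records when building the gene pool can be sketched as follows. Every field name and every numeric cutoff except the top-10% rule is a hypothetical placeholder, since the text leaves the concrete standard open; the sketch also applies all criteria at once, whereas an implementation may use only some of them.

```python
def is_credible(rec, min_impact_factor=2.0, min_sample_size=30, top_fraction=0.10):
    """Credibility check; the two numeric minimums are assumed values."""
    if not rec["sci_indexed"] or rec["impact_factor"] < min_impact_factor:
        return False  # non-SCI journal or journal with a low impact factor
    if rec["sample_size"] < min_sample_size:
        return False  # too few samples to support a conclusion
    if rec["rank_fraction"] > top_fraction:
        return False  # locus is not in the top 10% of the reported results
    return True


def is_erroneous(rec, alpha=0.05):
    """Error check; the significance level `alpha` is an assumed value."""
    return (not rec["reported_in_source"]) or rec["p_value"] >= alpha


def build_gene_pool(records):
    """Keep only records that are credible and not erroneous."""
    return [r for r in records if is_credible(r) and not is_erroneous(r)]


records = [
    {"id": "rec1", "sci_indexed": True, "impact_factor": 8.1, "sample_size": 500,
     "rank_fraction": 0.02, "reported_in_source": True, "p_value": 0.001},
    {"id": "rec2", "sci_indexed": False, "impact_factor": 0.4, "sample_size": 500,
     "rank_fraction": 0.02, "reported_in_source": True, "p_value": 0.001},
    {"id": "rec3", "sci_indexed": True, "impact_factor": 8.1, "sample_size": 500,
     "rank_fraction": 0.02, "reported_in_source": True, "p_value": 0.30},
]
pool_ids = [r["id"] for r in build_gene_pool(records)]
```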
Further, as research on genes related to the digestive system continues, the known mutations of the variant sites of these genes keep being updated, and the current digestive system gene pool does not necessarily contain the mutations of all variant sites of digestive-system-related genes. The embodiments of the invention therefore also include updating the digestive system gene pool at preset intervals.
Specifically, the update may proceed as follows: at each preset interval, obtain the research papers related to the digestive system that have most recently been published in internationally authoritative academic journals such as Nature and Nature Genetics; take the latest gene locus information related to the digestive system from these papers; delete the gene locus information whose credibility is below the preset standard and the erroneous gene locus information; and add the remainder to the digestive system gene pool to complete the update.
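One update round can be sketched in Python as below, assuming the gene pool is a dictionary keyed by chromosome, position and mutated base. The function name `update_gene_pool`, the record fields and the `keep` predicate (which stands in for the credibility and error checks) are hypothetical.

```python
def update_gene_pool(pool, new_records, keep):
    """Return a new pool with the acceptable new records merged in.

    `keep` is a caller-supplied predicate that rejects low-credibility
    or erroneous records (assumption: the same standard used at build time).
    """
    updated = dict(pool)
    for rec in new_records:
        if keep(rec):
            updated[(rec["chrom"], rec["pos"], rec["alt"])] = rec
    return updated


pool = {("chr1", 1000, "A"): {"chrom": "chr1", "pos": 1000, "alt": "A", "ok": True}}
new_records = [
    {"chrom": "chr2", "pos": 2000, "alt": "T", "ok": True},
    {"chrom": "chr3", "pos": 3000, "alt": "G", "ok": False},  # rejected
]
updated_pool = update_gene_pool(pool, new_records, keep=lambda r: r["ok"])
```

Returning a new dictionary leaves the previous pool untouched, so a comparison that is already running is not affected by the update.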
After the digestive system gene pool is obtained, the sites to be checked are compared with the multiple variant sites of the genes corresponding to the digestive system in the digestive system gene pool.
In this embodiment, the comparison may be carried out directly after the sites to be checked are obtained in step S120, or it may be triggered by the user, i.e. the comparison in step S130 is executed after a query request triggered by the user is received.
Alternatively, the user may input one or more of the sites to be checked obtained in step S120, and in step S130 the sites input by the user are compared with the multiple variant sites of the genes corresponding to the digestive system in the digestive system gene pool.
Alternatively, the user may obtain the variant sites related to the digestive system directly from the digestive system gene pool. Specifically, the user enters information such as a gene name or the coordinate of a site in the genome through an input/output unit. After the information is received, the digestive system gene pool is searched according to it, and the search results, such as the gene name, the site coordinate and the base mutation type, are displayed. If the information entered by the user is found in the digestive system gene pool, the corresponding gene locus is related to the digestive system and has a base mutation. It should be understood that the coordinate of a site in the genome is the position of the site.
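A user query against the gene pool by gene name or by coordinate can be sketched as follows. The record fields, the function name `lookup` and the placeholder gene names are hypothetical and serve only as an illustration.

```python
def lookup(gene_pool, gene=None, chrom=None, pos=None):
    """Return the pool records that match every criterion that was supplied."""
    hits = []
    for rec in gene_pool:
        if gene is not None and rec["gene"] != gene:
            continue
        if chrom is not None and rec["chrom"] != chrom:
            continue
        if pos is not None and rec["pos"] != pos:
            continue
        hits.append(rec)
    return hits


gene_pool = [
    {"gene": "GENE_X", "chrom": "chr1", "pos": 1000, "ref": "G", "alt": "A"},
    {"gene": "GENE_X", "chrom": "chr1", "pos": 1500, "ref": "C", "alt": "T"},
    {"gene": "GENE_Y", "chrom": "chr2", "pos": 2000, "ref": "T", "alt": "G"},
]
by_gene = lookup(gene_pool, gene="GENE_X")
by_coordinate = lookup(gene_pool, chrom="chr2", pos=2000)
```

An empty result means the entered gene or coordinate is not recorded in the pool.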
Step S140: when a site to be checked has the same position and the same mutated base as a variant site in the digestive system gene pool, obtain the site mutation status of the genes corresponding to the digestive system in the gene to be tested.
When the comparison finds that a site to be checked is identical to a variant site in the digestive system gene pool, it can be determined from that identical variant site that the gene to be tested carries a site mutation of a gene corresponding to the digestive system, and that the mutation is the same as the one recorded for the identical variant site in the gene pool. It is thus possible to obtain which variant sites of digestive-system-related genes are present in the gene to be tested and the specific mutation at each of them, i.e. which base at which position has mutated to which base.
It should be understood that identical variant sites are variant sites with the same position and the same base mutation: a site to be checked that has the same mutated base at the same position is considered identical to the variant site in the digestive system gene pool, and is therefore a site of a gene corresponding to the digestive system.
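The matching rule (same position and same mutated base) can be sketched in Python by keying both the sites to be checked and the gene pool on chromosome, position and mutated base. The coordinates and gene names below are placeholders, not real loci, and the function name is hypothetical.

```python
def find_identical_sites(candidates, gene_pool):
    """Return the pool entries for candidates whose chromosome, position
    and mutated base all match an entry in the gene pool."""
    return {key: gene_pool[key] for key in candidates if key in gene_pool}


gene_pool = {
    ("chr1", 1000, "A"): {"gene": "GENE_X"},
    ("chr2", 2000, "T"): {"gene": "GENE_Y"},
}
candidates = [
    ("chr1", 1000, "A"),  # same position and same mutated base: a match
    ("chr2", 2000, "C"),  # same position but a different base: no match
    ("chr3", 3000, "G"),  # position not in the pool: no match
]
matches = find_identical_sites(candidates, gene_pool)
```

Including the mutated base in the key is what distinguishes an identical variant site from a site that merely shares a position with a pool entry.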
Relevant personnel can then determine the digestive system disease status of the subject corresponding to the gene to be tested, based on the site mutation status obtained for the genes corresponding to the digestive system and on other information, such as the diseases that may arise under each mutation of the digestive-system-related genes.
Further, in this embodiment, the effect on protein function of the mutation at each variant site in the gene to be tested can be determined from the site mutation status of the genes corresponding to the digestive system in the gene to be tested, together with the effect on protein function recorded in the digestive system gene pool for each mutated base of each variant site. It can thus be determined which digestive-system-related protein functions of the subject corresponding to the gene to be tested (for example, the corresponding person) are affected, and how. Skilled practitioners can then combine the effect on protein function with other information, such as the relationship between changes in protein function and specific organ functions, to judge the probability that the subject has a digestive system disease and which digestive system diseases the subject may suffer from.
Of course, in the embodiments of the invention, the gene pool may also directly include the pathogenicity of the mutation at each variant site for digestive system diseases. The effect on a given digestive system disease may be: pathogenic, likely pathogenic, risk factor, uncertain, conflicting research results, or benign. If the pathogenicity of a given mutated base at a given position is "risk factor", a subject carrying that mutated base at that position has a very high probability of developing the digestive system disease in question and should take care to prevent it.
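Looking up the recorded protein-function effect for each matched site can be sketched as below. The `"protein_effect"` field, its values and the function name are hypothetical; the text does not specify how the effect is encoded.

```python
def protein_effects(matched_sites, gene_pool):
    """Map each matched site to the protein-function effect recorded
    in the gene pool, skipping sites with no recorded effect."""
    effects = {}
    for key in matched_sites:
        entry = gene_pool.get(key)
        if entry is not None and "protein_effect" in entry:
            effects[key] = entry["protein_effect"]
    return effects


gene_pool = {
    ("chr1", 1000, "A"): {"gene": "GENE_X", "protein_effect": "loss of function"},
    ("chr2", 2000, "T"): {"gene": "GENE_Y"},  # no effect recorded
}
matched = [("chr1", 1000, "A"), ("chr2", 2000, "T")]
effects = protein_effects(matched, gene_pool)
```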
Second embodiment
This embodiment provides a device 200 for obtaining the mutation sites of genes corresponding to the digestive system. Referring to Fig. 4, the device 200 includes:
An alignment module 210, configured to align the multiple short sequences of the gene to be tested against a reference genome to obtain the preliminary variant site information of the gene to be tested, where the preliminary variant site information includes the mutated bases of multiple preliminary variant sites and the position information of each preliminary variant site.
A filtering module 220, configured to delete, according to the preliminary variant site information, the variant sites among the multiple preliminary variant sites that do not meet a preset retention condition, and to take the variant sites remaining in the gene to be tested after deletion as the sites to be checked.
A comparison module 230, configured to compare the sites to be checked with the multiple variant sites of the genes corresponding to the digestive system in the digestive system gene pool, where the digestive system gene pool includes the mutated base and the position of each variant site of the genes corresponding to the digestive system.
A mutation acquisition module 240, configured to obtain the site mutation status of the genes corresponding to the digestive system in the gene to be tested when a site to be checked has the same position and the same mutated base as a variant site in the digestive system gene pool.
Further, the digestive system gene pool also includes the effect on protein function of each mutated base of each variant site of the genes corresponding to the digestive system. The mutation acquisition module 240 in this embodiment is further configured to determine, according to the site mutation status of the genes corresponding to the digestive system in the gene to be tested, the effect on protein function of the mutation at each variant site in the gene to be tested.
Further, in this embodiment, as shown in Fig. 4, the device also includes a gene pool establishing module 250, configured to establish the digestive system gene pool. The gene pool establishing module 250 includes: a data acquisition unit 251, configured to obtain the gene locus information related to the digestive system from the COSMIC gene database and the ClinVar database of NCBI, where the gene locus information includes the mutated base and the position of each variant site of the genes corresponding to the digestive system; and a data deletion unit 252, configured to delete the gene locus information whose credibility is below a preset standard and the erroneous gene locus information, the remaining gene locus information forming the digestive system gene pool.
Further, as shown in Fig. 5, the gene pool establishing module 250 also includes an updating unit 253, configured to update the digestive system gene pool at preset intervals.
Further, as shown in Fig. 6, in this embodiment the filtering module 220 includes one or more of the following. A first deletion unit 221, configured to remove, from the plurality of preliminary variant sites, the variant sites whose number of alleles exceeds a preset threshold. A second deletion unit 222, configured to delete, from the plurality of preliminary variant sites, all variant sites located within the upstream span or the downstream span of each insertion/deletion (indel), where the upstream span and the downstream span each cover a preset number of bases. A third deletion unit 223, configured to delete, from the plurality of preliminary variant sites, the variant sites separated from each other by fewer than a preset number of bases. A fourth deletion unit 224, configured to delete, from the plurality of preliminary variant sites, the variant sites whose GQ value is below a preset GQ threshold. A fifth deletion unit 225, configured to delete, from the plurality of preliminary variant sites, the variant sites whose MQ value is below a preset MQ threshold.
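The rule applied by the second deletion unit can be sketched in Python as below. The span of 5 bases, the record layout and the choice to keep the indel itself are assumptions made for this illustration; the text only states that the span is a preset number of bases.

```python
def drop_sites_near_indels(sites, span=5):
    """Delete every site lying within `span` bases upstream or downstream
    of an indel at another position (assumption: the indel itself stays)."""
    indel_positions = [s["pos"] for s in sites if s["type"] == "indel"]

    def near_indel(site):
        return any(
            p != site["pos"] and abs(site["pos"] - p) <= span
            for p in indel_positions
        )

    return [s for s in sites if not near_indel(s)]


sites = [
    {"pos": 100, "type": "indel"},
    {"pos": 103, "type": "snp"},   # 3 bases from the indel, deleted
    {"pos": 106, "type": "snp"},   # 6 bases from the indel, kept
    {"pos": 500, "type": "snp"},
]
kept_positions = [s["pos"] for s in drop_sites_near_indels(sites)]
```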
In this embodiment, referring to Fig. 7, the alignment module 210 may include: an alignment unit 211, configured to perform an initial alignment of the multiple short sequences of the gene to be tested against the reference genome to obtain an alignment result in SAM format; a deduplication unit 212, configured to remove duplicates from the alignment result so that at most one short sequence is aligned to any one position of the reference genome; a realignment unit 213, configured to perform local realignment on the deduplicated alignment result; a calculation unit 214, configured to recalculate the base quality scores in the alignment result after local realignment; and a preliminary calling unit 215, configured to perform SNP and indel analysis on the locally realigned alignment result according to the base quality scores, to obtain the preliminary variant site information.
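The deduplication rule (at most one short sequence per reference position) can be sketched in Python as below. Keeping the read with the highest mapping quality is an assumption, since the text does not say which duplicate is retained; the record fields and the function name are hypothetical.

```python
def remove_duplicates(reads):
    """Keep at most one read per (chromosome, start position).

    Assumption: among duplicates, the read with the highest mapping
    quality is kept; ties keep the read seen first.
    """
    best = {}
    for read in reads:
        key = (read["chrom"], read["pos"])
        if key not in best or read["mapq"] > best[key]["mapq"]:
            best[key] = read
    return sorted(best.values(), key=lambda r: (r["chrom"], r["pos"]))


reads = [
    {"name": "r1", "chrom": "chr1", "pos": 100, "mapq": 40},
    {"name": "r2", "chrom": "chr1", "pos": 100, "mapq": 60},  # duplicate of r1
    {"name": "r3", "chrom": "chr1", "pos": 250, "mapq": 30},
]
kept_names = [r["name"] for r in remove_duplicates(reads)]
```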
In summary, in the method and device for obtaining the mutation sites of genes corresponding to the digestive system provided by the embodiments of the invention, after the sites to be checked of the gene to be tested are obtained, they are compared with the multiple variant sites of the corresponding genes in the digestive system gene pool. This yields the mutation status of the variant sites in the gene to be tested that are related to the digestive system, which assists in judging the possible disease status for digestive system diseases.
It should be noted that the embodiments in this specification are described progressively: each embodiment focuses on its differences from the others, and the parts that are the same or similar may be cross-referenced. Because the device embodiments are basically similar to the method embodiments, they are described briefly; for the relevant details, see the corresponding description of the method embodiments.
In the several embodiments provided in this application, it should be understood that the disclosed devices and methods may also be implemented in other ways. The device embodiments described above are only illustrative. For example, the flowcharts and block diagrams in the drawings show the possible architecture, functions and operations of devices, methods and computer program products according to multiple embodiments of this application. In this regard, each block in a flowchart or block diagram may represent a module, a program segment or a portion of code that contains one or more executable instructions for implementing the specified logical function. It should also be noted that in some alternative implementations the functions noted in the blocks may occur in an order different from that noted in the drawings. For example, two consecutive blocks may in fact be executed substantially in parallel, or sometimes in the reverse order, depending on the functions involved. It should also be noted that each block in the block diagrams and/or flowcharts, and each combination of such blocks, can be implemented by a dedicated hardware-based system that performs the specified functions or actions, or by a combination of dedicated hardware and computer instructions.
In addition, the functional modules in the embodiments of this application may be integrated to form one independent part, each module may exist separately, or two or more modules may be integrated to form one independent part.
If the functions are implemented as software functional modules and sold or used as an independent product, they may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of this application in essence, or the part of it that contributes to the prior art, or a part of the technical solution, may be embodied as a software product. The computer software product is stored in a storage medium and includes instructions that cause a computer device (which may be a personal computer, the server 100, a network device, or the like) to execute all or some of the steps of the methods of the embodiments of this application. The storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, read-only memory (ROM), random access memory (RAM), a magnetic disk or an optical disc. It should be noted that relational terms such as "first" and "second" are used herein only to distinguish one entity or operation from another, and do not necessarily require or imply any actual relationship or order between these entities or operations. Moreover, the terms "include", "comprise" and any variants of them are intended to be non-exclusive, so that a process, method, article or device that includes a series of elements includes not only those elements but also other elements not expressly listed, or elements inherent to the process, method, article or device. Without further limitation, an element introduced by the phrase "including a ..." does not exclude the presence of additional identical elements in the process, method, article or device that includes the element.
The above are only preferred embodiments of this application and do not limit it; those skilled in the art may make various modifications and changes to this application. Any modification, equivalent replacement or improvement made within the spirit and principles of this application shall fall within its scope of protection. It should be noted that similar reference numerals and letters denote similar items in the drawings, so once an item is defined in one drawing it need not be defined or explained again in subsequent drawings.
The above are only specific embodiments of this application, but its scope of protection is not limited to them. Any change or replacement that a person skilled in the art could readily conceive within the technical scope disclosed in this application shall fall within the scope of protection of this application, which shall therefore be determined by the scope of the claims.

Claims (10)

1. A method for obtaining the mutation sites of genes corresponding to the digestive system, characterized in that the method includes:
aligning the multiple short sequences of a gene to be tested against a reference genome to obtain the preliminary variant site information of the gene to be tested, where the preliminary variant site information includes the mutated bases of multiple preliminary variant sites and the position information of each preliminary variant site;
deleting, according to the preliminary variant site information, the variant sites among the multiple preliminary variant sites that do not meet a preset retention condition, and taking the variant sites remaining in the gene to be tested after deletion as the sites to be checked;
comparing the sites to be checked with the multiple variant sites of the genes corresponding to the digestive system in a digestive system gene pool, where the digestive system gene pool includes the mutated base and the position of each variant site of the genes corresponding to the digestive system;
when a site to be checked has the same position and the same mutated base as a variant site in the digestive system gene pool, obtaining the site mutation status of the genes corresponding to the digestive system in the gene to be tested.
2. The method according to claim 1, characterized in that the digestive system gene pool also includes the effect on protein function of each mutated base of each variant site of the genes corresponding to the digestive system, and the method also includes:
determining, according to the site mutation status of the genes corresponding to the digestive system in the gene to be tested, the effect on protein function of the mutation at each variant site in the gene to be tested.
3. The method according to claim 1, characterized in that, before the sites to be checked are compared with the multiple variant sites of the genes corresponding to the digestive system in the digestive system gene pool, the method also includes establishing the digestive system gene pool, and establishing the digestive system gene pool includes:
obtaining the gene locus information related to the digestive system from the COSMIC gene database and the ClinVar database of NCBI, where the gene locus information includes the mutated base and the position of each variant site of the genes corresponding to the digestive system;
deleting the gene locus information whose credibility is below a preset standard and the erroneous gene locus information, the remaining gene locus information forming the digestive system gene pool.
4. The method according to claim 3, characterized in that it also includes:
updating the digestive system gene pool at preset intervals.
5. The method according to claim 1, characterized in that deleting the variant sites among the multiple preliminary variant sites that do not meet the preset retention condition includes one or more of the following:
removing, from the plurality of preliminary variant sites, the variant sites whose number of alleles exceeds a preset threshold;
deleting, from the plurality of preliminary variant sites, all variant sites located within the upstream span or the downstream span of each insertion/deletion (indel), where the upstream span and the downstream span each cover a preset number of bases;
deleting, from the plurality of preliminary variant sites, the variant sites separated from each other by fewer than a preset number of bases;
deleting, from the plurality of preliminary variant sites, the variant sites whose GQ value is below a preset GQ threshold;
deleting, from the plurality of preliminary variant sites, the variant sites whose MQ value is below a preset MQ threshold.
6. The method according to claim 1, characterized in that aligning the multiple short sequences of the gene to be tested against the reference genome to obtain the preliminary variant site information of the gene to be tested includes:
performing an initial alignment of the multiple short sequences of the gene to be tested against the reference genome to obtain an alignment result in SAM format;
removing duplicates from the alignment result so that at most one short sequence is aligned to any one position of the reference genome;
performing local realignment on the deduplicated alignment result;
recalculating the base quality scores in the alignment result after local realignment;
performing SNP and indel analysis on the locally realigned alignment result according to the base quality scores, to obtain the preliminary variant site information.
7. The method according to claim 1, characterized in that the variant sites are SNPs.
8. A device for obtaining the mutation sites of genes corresponding to the digestive system, characterized in that the device includes:
an alignment module, configured to align the multiple short sequences of a gene to be tested against a reference genome to obtain the preliminary variant site information of the gene to be tested, where the preliminary variant site information includes the mutated bases of multiple preliminary variant sites and the position information of each preliminary variant site;
a filtering module, configured to delete, according to the preliminary variant site information, the variant sites among the multiple preliminary variant sites that do not meet a preset retention condition, and to take the variant sites remaining in the gene to be tested after deletion as the sites to be checked;
a comparison module, configured to compare the sites to be checked with the multiple variant sites of the genes corresponding to the digestive system in a digestive system gene pool, where the digestive system gene pool includes the mutated base and the position of each variant site of the genes corresponding to the digestive system;
a mutation acquisition module, configured to obtain the site mutation status of the genes corresponding to the digestive system in the gene to be tested when a site to be checked has the same position and the same mutated base as a variant site in the digestive system gene pool.
9. The device according to claim 8, characterized in that it also includes a gene pool establishing module configured to establish the digestive system gene pool, and the gene pool establishing module includes:
a data acquisition unit, configured to obtain the gene locus information related to the digestive system from the COSMIC gene database and the ClinVar database of NCBI, where the gene locus information includes the mutated base and the position of each variant site of the genes corresponding to the digestive system;
a data deletion unit, configured to delete the gene locus information whose credibility is below a preset standard and the erroneous gene locus information, the remaining gene locus information forming the digestive system gene pool.
10. The device according to claim 8, characterized in that the filtering module includes one or more of the following:
a first deletion unit, configured to remove, from the plurality of preliminary variant sites, the variant sites whose number of alleles exceeds a preset threshold;
a second deletion unit, configured to delete, from the plurality of preliminary variant sites, all variant sites located within the upstream span or the downstream span of each insertion/deletion (indel), where the upstream span and the downstream span each cover a preset number of bases;
a third deletion unit, configured to delete, from the plurality of preliminary variant sites, the variant sites separated from each other by fewer than a preset number of bases;
a fourth deletion unit, configured to delete, from the plurality of preliminary variant sites, the variant sites whose GQ value is below a preset GQ threshold;
a fifth deletion unit, configured to delete, from the plurality of preliminary variant sites, the variant sites whose MQ value is below a preset MQ threshold.
CN201610972446.XA 2016-11-04 2016-11-04 The acquisition methods and device in the mutational site of the corresponding gene of digestive system Pending CN106503488A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610972446.XA CN106503488A (en) 2016-11-04 2016-11-04 The acquisition methods and device in the mutational site of the corresponding gene of digestive system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610972446.XA CN106503488A (en) 2016-11-04 2016-11-04 The acquisition methods and device in the mutational site of the corresponding gene of digestive system

Publications (1)

Publication Number Publication Date
CN106503488A true CN106503488A (en) 2017-03-15

Family

ID=58323186

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610972446.XA Pending CN106503488A (en) 2016-11-04 2016-11-04 The acquisition methods and device in the mutational site of the corresponding gene of digestive system

Country Status (1)

Country Link
CN (1) CN106503488A (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101539967A (en) * 2008-12-12 2009-09-23 深圳华大基因研究院 Method for detecting mononucleotide polymorphism
CN104462869A (en) * 2014-11-28 2015-03-25 天津诺禾致源生物信息科技有限公司 Method and device for detecting somatic cell SNP
US20160188793A1 (en) * 2014-12-29 2016-06-30 Counsyl, Inc. Method For Determining Genotypes in Regions of High Homology
CN106011224A (en) * 2015-12-24 2016-10-12 晶能生物技术(上海)有限公司 Nervous system genetic disease gene united screening method, kit and preparation method thereof

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Gabor T. Marth et al.: "The functional spectrum of low-frequency coding variation", Genome Biology *
Zhang Ying et al.: "Detection of gene mutation sites in osteogenesis imperfecta pedigrees", China Medical Engineering *

Similar Documents

Publication Publication Date Title
Choudhury et al. High-depth African genomes inform human migration and health
Turakhia et al. Stability of SARS-CoV-2 phylogenies
Sibbesen et al. Accurate genotyping across variant classes and lengths using variant graphs
US20200027557A1 (en) Multimodal modeling systems and methods for predicting and managing dementia risk for individuals
CN106407747A (en) Method and device for acquiring mutation sites of genes corresponding to tumors
Zhang et al. HAPLORE: a program for haplotype reconstruction in general pedigrees without recombination
BR112020013636A2 (en) method to facilitate the prenatal diagnosis of a genetic disorder from a maternal sample associated with the pregnant woman, method for identifying contamination associated with at least one between preparation of sequencing library and high-throughput sequencing and method for characterization associated with at least one between sequencing library preparation and sequencing
JP2019515369A (en) Genetic variant-phenotypic analysis system and method of use
JP2018037093A (en) Systems and methods for disease-associated human genomic variant analysis and reporting
WO2020014280A1 (en) DEEP LEARNING-BASED FRAMEWORK FOR IDENTIFYING SEQUENCE PATTERNS THAT CAUSE SEQUENCE-SPECIFIC ERRORS (SSEs)
US20220414597A1 (en) Methods for Analysis of Digital Data
KR101828052B1 (en) Method and apparatus for analyzing copy-number variation (cnv) of gene
JP2018502602A (en) Method for genotyping in regions of high homology
Johnston et al. PEMapper and PECaller provide a simplified approach to whole-genome sequencing
US20140088942A1 (en) Molecular genetic diagnostic system
CN106529211A (en) Variable site obtaining method and apparatus
Liang Bioinformatics for biomedical science and clinical applications
KR20220069943A (en) Single-cell RNA-SEQ data processing
Glusman et al. Ultrafast comparison of personal genomes via precomputed genome fingerprints
EP3619712B1 (en) Deep learning-based framework for identifying sequence patterns that cause sequence-specific errors
Genovese et al. SpeedHap: an accurate heuristic for the single individual SNP haplotyping problem with many gaps, high reading error rate and low coverage
Bobak et al. Assessment of imputation methods for missing gene expression data in meta-analysis of distinct cohorts of tuberculosis patients
CN106407745A (en) Mutation site acquisition method and device for a gene corresponding to skin
CN106503489A (en) The acquisition methods and device in the mutational site of the corresponding gene of cardiovascular system
CN106407746A (en) Method and device for acquiring mutational sites of genes corresponding to respiratory system

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 2017-03-15