CN106778078B - DNA sequence similarity comparison method based on the Kendall correlation coefficient - Google Patents
- Publication number
- CN106778078B (application CN201611186639.9A)
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Fee Related
Classifications
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B30/00—ICT specially adapted for sequence analysis involving nucleotides or amino acids
Abstract
The invention discloses a DNA sequence similarity comparison method based on the Kendall correlation coefficient, comprising the following steps: 1) obtain N DNA sequences to be compared; 2) choose a length k, obtain the k-words of each DNA sequence with a sliding window, and combine them into a corresponding vector; 3) using the k-words obtained in step 2), count the number of times each k-word occurs in the DNA sequence to obtain the frequency vector of the k-words, with entries denoted x_i and the k-word frequencies of a sequence denoted X = {x_i}; 4) combine the k-word vectors of the N DNA sequences in pairs, giving $C_N^2 = N(N-1)/2$ combinations, with the two k-word frequency vectors of each combination denoted x and y; 5) for the k-word frequency vectors x and y of each combination, calculate the corresponding Kendall correlation coefficient; 6) build the N×N similarity coefficient matrix of the N DNA sequences to obtain the similarity and the evolutionary relationship diagram of the DNA sequences. The invention improves the effectiveness of DNA sequence similarity comparison, reduces computational complexity, and shortens computation time.
Description
Technical field
The invention relates to the field of computing and bioinformatics processing, and in particular to a DNA sequence similarity comparison method based on the Kendall correlation coefficient.
Background art
The central task of bioinformatics is to extract knowledge from a vast sea of DNA sequence data. Bioinformaticians must not only provide efficient means of data storage but also develop effective data analysis tools. Only with new, effective analysis tools can DNA sequence information be converted into biological knowledge, the structural and functional information contained in the sequences be worked out, and their biological significance be fully understood.
The theoretical basis of DNA sequence comparison is the theory of evolution: if two DNA sequences are sufficiently similar, they are presumed to share a common evolutionary ancestor and to have diverged through genetic variation processes such as the substitution of residues, the deletion of residues or sequence fragments, and sequence recombination. DNA sequence similarity and DNA sequence homology are different concepts: the degree of similarity between DNA sequences is a quantifiable parameter, whereas whether sequences are homologous must be verified against evolutionary facts. In practice, DNA sequence comparison uses a specific mathematical model or algorithm to find the maximum number of matching bases between two or more DNA sequences.
Huang Yujuan, Wang Tianming et al. used the frequency and position information of the k-words in a DNA sequence to construct a probability distribution; the distance between two vectors is expressed through this distribution, and the smaller the value, the more closely related the species. Vinga and Almeida proposed a DNA sequence comparison approach based on word frequencies: a sliding window counts the occurrences of every word of length k, giving a k-word count or frequency vector. Each DNA sequence is thereby mapped to a vector in a high-dimensional Euclidean space, so that comparing the similarity of DNA sequences becomes a comparison between vectors.
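The sliding-window word count described above can be sketched as follows. This is a minimal illustration, not the implementation used in the cited works: the function names `kword_counts` and `kword_vector` and the choice to skip windows containing symbols outside {A, T, G, C} are assumptions made for the example.

```python
from itertools import product


def kword_counts(seq, k, alphabet="ATGC"):
    """Count every k-word of `seq` using a sliding window of width k.

    Hypothetical helper: all 4**k possible words are listed, so words
    that never occur get a count of 0.
    """
    counts = {"".join(p): 0 for p in product(alphabet, repeat=k)}
    for i in range(len(seq) - k + 1):
        word = seq[i:i + k]
        if word in counts:  # assumption: ignore windows with other symbols
            counts[word] += 1
    return counts


def kword_vector(seq, k, alphabet="ATGC"):
    """Return the counts as a vector in a fixed word order."""
    return list(kword_counts(seq, k, alphabet).values())


if __name__ == "__main__":
    counts = kword_counts("ATAACTA", 2)
    print({w: c for w, c in counts.items() if c > 0})
```

Because every sequence is mapped onto the same ordered list of words, two vectors produced with the same k and alphabet can be compared entry by entry.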
Pairwise DNA sequence comparison compares two DNA sequences with a specific algorithm in order to find the match of maximum similarity between them. The Kendall correlation coefficient is widely used for correlation prediction in time series, hydrology, water-quality series and similar data, but it has not been used for DNA sequence similarity matching.
Summary of the invention
The object of the invention is to overcome the deficiencies of the prior art by providing a DNA sequence similarity comparison method based on the Kendall correlation coefficient that builds an N×N similarity coefficient matrix for N DNA sequences, obtains the evolutionary relationships among the N sequences, and improves both the effectiveness of DNA sequence similarity comparison and the computational efficiency.
The technical solution adopted by the invention is as follows:
A DNA sequence similarity comparison method based on the Kendall correlation coefficient, comprising the following steps:
1) obtain N DNA sequences to be compared;
2) choose a length k, obtain the k-words of each DNA sequence with a sliding window, and combine them into a corresponding vector;
3) using the k-words obtained in step 2), count the number of times each k-word occurs in the DNA sequence, i.e. compute the frequency vector of the k-words in the sequence, with entries denoted x_i;
4) combine the k-word vectors of the N DNA sequences in pairs, giving $C_N^2 = N(N-1)/2$ combinations, with the two vectors of each combination denoted X = {x_i} and Y = {y_i};
5) for the k-word frequency vectors {x_i} and {y_i} of each combination, calculate the corresponding Kendall correlation coefficient;
6) build the N×N Kendall correlation coefficient matrix of the N DNA sequences to obtain the similarity information and the evolutionary relationship diagram of the DNA sequences.
Further, in step 2), a word frequency vector of word length k is taken for each DNA sequence.
Further, in step 5), the Kendall correlation coefficient of the k-words of the DNA sequences is obtained as follows:
a) obtain the k-words of the DNA sequence A to be compared, where the length of A is n, by the following formula: [formula not preserved in this text]
b) calculate the frequency of each k-word by the following formula: x_i = {number of times the i-th k-word occurs in DNA sequence A};
c) for the combined vectors X and Y, calculate the Kendall correlation coefficient by the following formula: [formula not preserved in this text], where t_x is the number of concordant pairs in {x_i}, {y_i}, t_y is the number of discordant pairs in {x_i, y_i}, and T is the total number of distinct k-words in {x_i, y_i};
d) t_x and t_y in step c) are obtained as follows: if $(x_i - x_j)(y_i - y_j)$ is positive, i.e. the two differences have the same sign, the pair is counted in t_x as a concordant pair of {x_i, y_i}; if it is negative, i.e. the two differences have opposite signs, the pair is counted in t_y as a discordant pair of {x_i, y_i}.
The resulting Kendall correlation coefficient is denoted τ and takes values in [-1, 1]. The closer τ is to 1, the stronger the positive correlation between the two DNA sequences; the closer τ is to -1, the stronger the negative correlation between them; a value of τ close to 0 indicates that the two DNA sequences are uncorrelated.
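As an illustration of how a coefficient in [-1, 1] arises from counts of concordant and discordant pairs, the sketch below implements the standard Kendall tau-a, τ = (concordant − discordant) / (n(n − 1)/2). This normalisation is an assumption: the exact formula of the invention is not preserved in this text, and the function name `kendall_tau_a` is hypothetical. Tied pairs, which are common when many k-word counts are 0, contribute to neither count here.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """Standard Kendall tau-a between two equal-length vectors.

    Assumption: the denominator is the number of index pairs
    n * (n - 1) / 2; the patent's own formula may normalise differently.
    """
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    n = len(x)
    if n < 2:
        raise ValueError("need at least two entries")
    concordant = discordant = 0
    for i, j in combinations(range(n), 2):
        sign = (x[i] - x[j]) * (y[i] - y[j])
        if sign > 0:        # both differences have the same sign
            concordant += 1
        elif sign < 0:      # the differences have opposite signs
            discordant += 1
        # sign == 0: tied pair, counted as neither
    return (concordant - discordant) / (n * (n - 1) / 2)


if __name__ == "__main__":
    print(kendall_tau_a([1, 2, 3, 4], [1, 2, 3, 4]))   # identical ranking
    print(kendall_tau_a([1, 2, 3, 4], [4, 3, 2, 1]))   # reversed ranking
```

A caller could restrict both vectors to the k-words that occur in at least one of the two sequences before calling the function, which would match the role of T as the number of distinct k-words.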
The N×N Kendall correlation coefficient matrix is then constructed. This matrix is symmetric with every diagonal entry equal to 1; it gives the pairwise affinity information of the N DNA sequences, from which their evolutionary relationships are constructed.
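A minimal sketch of the matrix construction is given below, assuming the frequency vectors are already available and using the standard Kendall tau-a as a stand-in for the coefficient (an assumption, as is every function name). Only the upper triangle is computed and mirrored, and the diagonal is set to 1 directly, as stated above.

```python
from itertools import combinations


def tau_a(x, y):
    """Standard Kendall tau-a (assumed normalisation n * (n - 1) / 2)."""
    n = len(x)
    score = 0
    for i, j in combinations(range(n), 2):
        sign = (x[i] - x[j]) * (y[i] - y[j])
        score += (sign > 0) - (sign < 0)
    return score / (n * (n - 1) / 2)


def similarity_matrix(vectors):
    """Build the symmetric N x N coefficient matrix for N vectors."""
    n = len(vectors)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        matrix[i][i] = 1.0                      # diagonal fixed to 1
    for i, j in combinations(range(n), 2):      # N(N-1)/2 pairs
        matrix[i][j] = matrix[j][i] = tau_a(vectors[i], vectors[j])
    return matrix


if __name__ == "__main__":
    for row in similarity_matrix([[1, 2, 3], [3, 2, 1], [1, 3, 2]]):
        print(row)
```

Computing only the N(N-1)/2 pairwise combinations and mirroring them is what keeps the work to roughly half of a full N×N evaluation.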
The DNA sequence similarity comparison method of the invention uses a sliding window to obtain the k-word frequency vectors of the DNA sequences to be analyzed, combines the k-word vectors of the N DNA sequences in pairs, and computes the Kendall correlation coefficient of the k-word frequency vectors of each pair. This makes it possible to perform similarity detection on multiple DNA sequences, and the detection results effectively reflect the evolutionary relationships between the sequences. The method is concise: only one symmetric matrix, whose main-diagonal entries are all 1, needs to be built, which reduces computational complexity and improves computational efficiency. The Kendall coefficient can serve as a feature value for DNA sequence similarity prediction and yields good accuracy.
Brief description of the drawings
The invention is described in further detail below with reference to the drawings and specific embodiments.
Fig. 1 is a flow diagram of the DNA sequence similarity comparison method based on the Kendall correlation coefficient according to the invention;
Fig. 2 is the evolutionary relationship diagram of DNA sequences obtained with the DNA sequence similarity comparison method based on the Kendall correlation coefficient according to the invention.
Detailed description of the embodiments
As shown in Fig. 1 and Fig. 2, the method of the invention is further elaborated using the DNA coding sequences of 20 species as the objects of analysis. As shown in Fig. 1, the DNA sequence similarity comparison method based on the Kendall correlation coefficient of this embodiment comprises the following steps:
1) select the DNA coding sequences of 20 species as the initial DNA sequences; the names and lengths of the sequences of the 20 species are shown in Table 1;
Species name | DNA sequence length |
---|---|
baboon | 16522 |
bluewhale | 16403 |
cat | 17010 |
common_chimpanzee | 16564 |
cow | 16339 |
fin_whale | 16399 |
gibbon | 16473 |
gorilla | 16365 |
grayseal | 16798 |
harborseal | 16827 |
horse | 16661 |
human | 16570 |
mouse | 16296 |
opossum | 17085 |
orangutan | 16390 |
pigmy_chimpanzee | 16555 |
platypus | 17020 |
rat | 16301 |
wallaroo | 16897 |
whiterhinoceros | 16833 |
Table 1: DNA sequence information of the species
2) obtain the k-words of the initial DNA sequences of step 1) and combine them to obtain the k-word frequency vector of each initial DNA sequence (see Vinga, S., Almeida, J. S. Alignment-free sequence comparison: a review [J]. Bioinformatics, 2003: 513-523). The method uses a sliding window to find how often each short DNA word of length k appears in the DNA sequence under test. For the 4 DNA bases {A, T, G, C}, taking k = 2 gives $4^2 = 16$ possible k-words, and k = 3 gives $4^3 = 64$. For example, for the DNA sequence A = ATAACTA with the k-words W_2 = {AT, TA, AA, TT, AG, GA, AC, CA, CT, ...}, the frequency vector is {1, 2, 1, 0, 0, 0, 1, 0, 1, 0, ...}; for the DNA sequence fragment B = ACAACTTA, the k-word frequency vector is {0, 1, 1, 1, 0, 0, 2, 1, 1, 0, ...};
3) for the N DNA sequences, find the N k-word frequency vectors and combine them in pairs, giving $C_N^2 = N(N-1)/2$ combinations; the two frequency vectors of each combination are denoted X and Y;
4) calculate the Kendall correlation coefficient by the following formula: [formula not preserved in this text], where t_x is the number of concordant pairs between {x_i, y_i} and the other k-word frequencies, t_y is the number of discordant pairs between {x_i, y_i} and the other k-word frequencies, and T is the total number of distinct k-words in {x_i, y_i}; for the DNA sequence fragments A and B of step 2), the total number of distinct k-words is T = 7;
5) t_x and t_y in step 4) are obtained as follows: if $(x_i - x_j)(y_i - y_j)$ is positive, i.e. the two differences have the same sign, the pair is counted in t_x as a concordant pair of {x_i, y_i}; if it is negative, i.e. the two differences have opposite signs, the pair is counted in t_y as a discordant pair of {x_i, y_i};
6) construct the N×N Kendall correlation coefficient matrix. This matrix is symmetric with diagonal entries equal to 1, so it can usually be reduced to an upper triangular matrix. Because similarity and distance are negatively related, the similarity values are negated to convert them to distances before the evolutionary relationship diagram is built; see Fig. 2.
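The conversion from similarity to distance in step 6) can be sketched as below. The negation follows the description; the tree-building step is an assumption, since the text does not name the algorithm behind Fig. 2. A simple average-linkage agglomerative clustering is used here purely for illustration, and the names `to_distance` and `average_linkage_tree` are hypothetical.

```python
def to_distance(similarity):
    """Negate every similarity value, as described in step 6)."""
    return [[-s for s in row] for row in similarity]


def average_linkage_tree(dist, names):
    """Merge the two closest clusters repeatedly; return a nested tuple.

    Assumption: average linkage stands in for whatever method produced
    the evolutionary relationship diagram in the patent.
    """
    clusters = {i: (names[i], [i]) for i in range(len(names))}
    next_id = len(names)
    while len(clusters) > 1:
        ids = sorted(clusters)
        best = None
        for pos, a in enumerate(ids):
            for b in ids[pos + 1:]:
                members_a, members_b = clusters[a][1], clusters[b][1]
                total = sum(dist[i][j] for i in members_a for j in members_b)
                mean = total / (len(members_a) * len(members_b))
                if best is None or mean < best[0]:
                    best = (mean, a, b)
        _, a, b = best
        tree_a, members_a = clusters.pop(a)
        tree_b, members_b = clusters.pop(b)
        clusters[next_id] = ((tree_a, tree_b), members_a + members_b)
        next_id += 1
    return next(iter(clusters.values()))[0]


if __name__ == "__main__":
    sim = [[1.0, 0.9, 0.1],
           [0.9, 1.0, 0.2],
           [0.1, 0.2, 1.0]]
    print(average_linkage_tree(to_distance(sim), ["a", "b", "c"]))
```

Negated similarities are negative numbers, which is harmless for this clustering because only the ordering of the pairwise values matters.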
Result analysis: we computed the Pearson correlation coefficient between the DNA sequence similarity calculated with the Kendall coefficient and the edit distance, and obtained -0.94. This indicates that the method of the invention computes DNA sequence similarity with high accuracy and, being fast to compute, can be a very effective alternative to the edit distance.
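The kind of check described in the result analysis can be sketched as follows: compute an edit distance for each sequence pair, then correlate those distances with the corresponding similarity values. The sketch assumes the Levenshtein distance as the edit distance and uses hypothetical function names; it does not reproduce the 20-species data or the reported value.

```python
from math import sqrt


def edit_distance(a, b):
    """Levenshtein distance by dynamic programming, one row at a time."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        prev = cur
    return prev[-1]


def pearson(u, v):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    cov = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    norm_u = sqrt(sum((a - mean_u) ** 2 for a in u))
    norm_v = sqrt(sum((b - mean_v) ** 2 for b in v))
    return cov / (norm_u * norm_v)


if __name__ == "__main__":
    print(edit_distance("ATAACTA", "ACAACTTA"))
```

A strongly negative Pearson value is the expected direction here, because a higher similarity should go with a smaller edit distance.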
The above is only an embodiment of the invention and does not limit the scope of the invention. Any equivalent structure or equivalent process transformation made using the content of the specification and drawings of the invention, whether applied directly or indirectly in other related technical fields, likewise falls within the scope of protection of the invention.
Claims (4)
1. A DNA sequence similarity comparison method based on the Kendall correlation coefficient, characterized in that it comprises the following steps:
1) obtain N DNA sequences to be compared;
2) choose a length k, obtain the k-words of each DNA sequence with a sliding window, and combine them into a corresponding vector;
3) using the k-words obtained in step 2), count the number of times each k-word occurs in the DNA sequence, i.e. compute the frequency vector of the k-words in the sequence, with entries denoted x_i;
4) combine the k-word vectors of the N DNA sequences in pairs, giving $C_N^2 = N(N-1)/2$ combinations, with the two vectors of each combination denoted X = {x_i} and Y = {y_i};
5) for the k-word frequency vectors {x_i} and {y_i} of each combination, calculate the corresponding Kendall correlation coefficient;
wherein in step 5), the Kendall correlation coefficient of the k-words of the DNA sequences is obtained as follows:
a) obtain the k-words of the DNA sequence A to be compared, where the length of A is n, by the following formula: [formula not preserved in this text]
b) calculate the frequency of each k-word by the following formula: x_i = {number of times the i-th k-word occurs in DNA sequence A};
c) for the combined vectors X and Y, calculate the Kendall correlation coefficient by the following formula: [formula not preserved in this text], where t_x is the number of concordant pairs in {x_i}, {y_i}, t_y is the number of discordant pairs in {x_i, y_i}, and T is the total number of distinct k-words in {x_i, y_i};
d) t_x and t_y in step c) are obtained as follows: if $(x_i - x_j)(y_i - y_j)$ is positive, i.e. the two differences have the same sign, the pair is counted in t_x as a concordant pair of {x_i, y_i}; if it is negative, i.e. the two differences have opposite signs, the pair is counted in t_y as a discordant pair of {x_i, y_i};
6) build the N×N Kendall correlation coefficient matrix of the N DNA sequences to obtain the similarity information and the evolutionary relationship diagram of the DNA sequences.
2. The DNA sequence similarity comparison method based on the Kendall correlation coefficient according to claim 1, characterized in that: in step 2), a word frequency vector of word length k is taken for each DNA sequence.
3. The DNA sequence similarity comparison method based on the Kendall correlation coefficient according to claim 1, characterized in that: the resulting Kendall correlation coefficient is denoted τ and takes values in [-1, 1]; the closer τ is to 1, the stronger the positive correlation between the two DNA sequences; the closer τ is to -1, the stronger the negative correlation between them; a value of τ close to 0 indicates that the two DNA sequences are uncorrelated.
4. The DNA sequence similarity comparison method based on the Kendall correlation coefficient according to claim 1, characterized in that: the N×N Kendall correlation coefficient matrix constructed in step 6) is a symmetric matrix with every diagonal entry equal to 1, which gives the pairwise affinity information of the N DNA sequences, from which their evolutionary relationships are constructed.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201611186639.9A CN106778078B (en) | 2016-12-20 | 2016-12-20 | DNA sequence similarity comparison method based on the Kendall correlation coefficient |
Publications (2)
Publication Number | Publication Date |
---|---|
CN106778078A (en) | 2017-05-31 |
CN106778078B (en) | 2019-04-09 |
Family
ID=58896076
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201611186639.9A Expired - Fee Related CN106778078B (en) | 2016-12-20 | 2016-12-20 | DNA sequence similarity comparison method based on the Kendall correlation coefficient |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106778078B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108846262A (en) * | 2018-05-31 | 2018-11-20 | 广西大学 | Method for constructing a phylogenetic tree by computing RNA secondary structure distance based on DFT |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102732609A (en) * | 2011-04-08 | 2012-10-17 | 博奥生物有限公司 | Method for detecting similarity of oligonucleotide and target genome |
WO2014019164A1 (en) * | 2012-08-01 | 2014-02-06 | 深圳华大基因研究院 | Method and device for analyzing microbial community composition |
CN104395900A (en) * | 2013-03-15 | 2015-03-04 | 北京未名博思生物智能科技开发有限公司 | Spatial arithmetic method of sequence alignment |
CN104657628A (en) * | 2015-01-08 | 2015-05-27 | 深圳华大基因科技服务有限公司 | Proton-based transcriptome sequencing data comparison and analysis method and system |
WO2016058089A1 (en) * | 2014-10-17 | 2016-04-21 | The Hospital For Sick Children | Dna methylation markers for overgrowth syndromes |
EP3081257A1 (en) * | 2015-04-17 | 2016-10-19 | Sorin CRM SAS | Active implantable medical device for cardiac stimulation comprising means for detecting a remodelling or reverse remodelling phenomenon of the patient |
CN106203471A (en) * | 2016-06-22 | 2016-12-07 | 南京航空航天大学 | A kind of based on the Spectral Clustering merging Kendall Tau distance metric |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20040101846A1 (en) * | 2002-11-22 | 2004-05-27 | Collins Patrick J. | Methods for identifying suitable nucleic acid probe sequences for use in nucleic acid arrays |
Non-Patent Citations (1)
Title |
---|
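Model research and application of k-word-based DNA sequence analysis; Huang Yujuan; China Doctoral Dissertations Full-text Database (Basic Sciences); 2012-09-15 (No. 09); pp. A006-9 |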
Also Published As
Publication number | Publication date |
---|---|
CN106778078A (en) | 2017-05-31 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 20190409 |