CN104376234A - Promoter identification method and system - Google Patents


Info

Publication number
CN104376234A
Authority
CN
China
Prior art keywords
test data
eigenvector
promoter
sub
described test
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201410727536.3A
Other languages
Chinese (zh)
Other versions
CN104376234B (en)
Inventor
张莉
徐文轩
鲁亚平
王邦军
张召
杨季文
李凡长
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Suzhou University
Original Assignee
Suzhou University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Suzhou University filed Critical Suzhou University
Priority to CN201410727536.3A priority Critical patent/CN104376234B/en
Publication of CN104376234A publication Critical patent/CN104376234A/en
Application granted granted Critical
Publication of CN104376234B publication Critical patent/CN104376234B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Abstract

The invention discloses a promoter identification method and system. The method comprises: obtaining test data X; determining primary feature vectors of the test data X; performing feature extraction on the primary feature vectors with an autoencoder to obtain secondary feature vectors of the test data X; and classifying the secondary feature vectors with a preset support vector machine to obtain classification results, the test data X being determined to be a promoter when the classification results meet a preset condition. Compared with the prior art, which directly classifies the feature vectors extracted through KL divergence, the method and system use the neural network learning algorithm of the autoencoder, which effectively improves promoter recognition performance and thereby recognition accuracy.

Description

Promoter recognition method and system
Technical field
The present invention relates to the technical field of human gene recognition, and in particular to a promoter recognition method and system.
Background
Accurate identification of promoters plays a vital role in annotating the functions of a whole genome.
At present, promoters are usually identified with a classification method based on a support vector machine (SVM). In this method, the SVM classifier identifies promoters from word-frequency statistical features based on KL divergence; that is, it directly classifies the feature vectors extracted with KL divergence. However, because KL statistical features do not represent the separable characteristics of genes well, this method has poor recognition performance and low promoter recognition accuracy.
Summary of the invention
In view of this, the present invention provides a promoter recognition method and system that improve recognition performance and thereby recognition accuracy.
To solve the above technical problem, the present invention provides a promoter recognition method, comprising:
obtaining test data and determining primary feature vectors of the test data;
performing feature extraction on the primary feature vectors of the test data with an autoencoder to obtain secondary feature vectors of the test data;
classifying the secondary feature vectors of the test data with a preset support vector machine to obtain classification results, and determining that the test data is a promoter when the classification results meet a preset condition.
In the above recognition method, preferably, obtaining the test data and determining the primary feature vectors of the test data comprises:
obtaining the test data;
extracting word-frequency statistical features of the test data;
extracting feature vectors from the word-frequency statistical features using KL divergence to obtain the primary feature vectors of the test data.
In the above recognition method, preferably, the word-frequency statistical features are extracted by computing 5-mer word-frequency statistics on the test data.
In the above recognition method, preferably, the method further comprises:
obtaining training data X and determining primary feature vectors of the training data X;
performing feature extraction on the primary feature vectors of the training data X with the autoencoder to obtain secondary feature vectors of the training data X;
training a support vector machine with the secondary feature vectors of the training data X to obtain the preset support vector machine.
In the above recognition method, preferably, the method further comprises:
setting the number of neurons in the input layer, the hidden layer and the output layer of the autoencoder.
In the above recognition method, preferably, the activation function of the hidden layer and of the output layer is the sigmoid function.
In the above recognition method, preferably, the primary feature vectors of the test data comprise:
a primary feature vector that distinguishes promoters from exons;
a primary feature vector that distinguishes promoters from introns;
a primary feature vector that distinguishes promoters from 3'-UTRs.
The secondary feature vectors of the test data comprise:
a secondary feature vector corresponding to the promoter-exon primary feature vector;
a secondary feature vector corresponding to the promoter-intron primary feature vector;
a secondary feature vector corresponding to the promoter-3'-UTR primary feature vector.
The classification results comprise:
a first classification result corresponding to the promoter-exon secondary feature vector;
a second classification result corresponding to the promoter-intron secondary feature vector;
a third classification result corresponding to the promoter-3'-UTR secondary feature vector;
wherein each of the first, second and third classification results is +1 or -1.
In the above recognition method, preferably, the test data is determined to be a promoter as follows:
when at least two of the first, second and third classification results are +1, the test data is determined to be a promoter.
The present invention also provides a promoter recognition system, comprising:
a primary feature determining unit, configured to obtain test data and determine primary feature vectors of the test data;
a secondary feature determining unit, configured to perform feature extraction with an autoencoder on each of the primary feature vectors of the test data to obtain secondary feature vectors of the test data;
a promoter determining unit, configured to classify the secondary feature vectors of the test data with a preset support vector machine to obtain classification results, and to determine that the test data is a promoter when the classification results meet a preset condition.
In the above recognition system, preferably, the system further comprises:
a preset support vector machine training unit, configured to obtain training data X and determine primary feature vectors of the training data X; to perform feature extraction on the primary feature vectors of the training data X with the autoencoder to obtain secondary feature vectors of the training data X; and to train a support vector machine with the secondary feature vectors of the training data X to obtain the preset support vector machine.
In the promoter recognition method and system provided by the present invention, an autoencoder performs feature extraction on the primary feature vectors of the test data, i.e. the feature vectors that the prior art extracts with KL divergence, to obtain secondary feature vectors; a preset support vector machine then classifies these secondary feature vectors. Compared with the prior art, which directly classifies the feature vectors extracted with KL divergence, the present invention uses the neural network learning algorithm of the autoencoder, which effectively improves promoter recognition performance and thereby recognition accuracy.
Brief description of the drawings
To illustrate the technical solutions of the embodiments of the present invention or of the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below are only embodiments of the present invention; a person of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is a flowchart of embodiment 1 of a promoter recognition method of the present invention;
Fig. 2 is a flowchart of embodiment 2 of a promoter recognition method of the present invention;
Fig. 3 is a flowchart of embodiment 3 of a promoter recognition method of the present invention;
Fig. 4 is a structural block diagram of embodiment 1 of a promoter recognition system of the present invention.
Detailed description
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings in the embodiments. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art from the embodiments of the present invention without creative effort fall within the scope of protection of the present invention.
The core of the present invention is to provide a promoter recognition method and system that improve recognition performance and thereby recognition accuracy.
To help those skilled in the art better understand the solution of the present invention, the present invention is described in further detail below with reference to the drawings and specific embodiments.
Referring to Fig. 1, which shows a flowchart of embodiment 1 of a promoter recognition method of the present invention, the recognition method may comprise the following steps:
Step S100: obtain test data and determine primary feature vectors of the test data;
In the present invention, the primary feature vectors of the test data are KL statistical features. In probability theory and information theory, the KL divergence (Kullback-Leibler divergence, KLD), also known as relative entropy, describes the difference between two probability distributions P and Q, and it is asymmetric.
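The asymmetry of the KL divergence mentioned above can be checked numerically. The following is a minimal sketch for discrete distributions; the function name `kl_divergence` and the two example distributions are illustrative assumptions, not part of the patent.

```python
import math

def kl_divergence(p, q):
    """D(P || Q) = sum_i p_i * ln(p_i / q_i) for discrete distributions.

    Terms with p_i == 0 contribute 0; q_i must be positive wherever p_i > 0.
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Two illustrative distributions over two outcomes (made-up values).
p = [0.5, 0.5]
q = [0.9, 0.1]

d_pq = kl_divergence(p, q)  # about 0.511
d_qp = kl_divergence(q, p)  # about 0.368, different from d_pq
```

Swapping the arguments changes the value, which is why the direction (promoter frequencies first, non-promoter frequencies second) matters in the feature selection described later.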
In the present invention, promoters are considered together with non-promoter sequences such as coding exons and introns. The KL significant features of the 5-mers of the input gene test data are computed, giving three significant feature vectors of the test data, i.e. the primary feature vectors of the test data, which specifically comprise:
a primary feature vector that distinguishes promoters from exons;
a primary feature vector that distinguishes promoters from introns;
a primary feature vector that distinguishes promoters from 3'-UTRs.
Step S101: perform feature extraction on the primary feature vectors of the test data with an autoencoder to obtain secondary feature vectors of the test data;
In the present invention, each primary feature vector has its own autoencoder, and the secondary feature vectors of the test data comprise:
a secondary feature vector corresponding to the promoter-exon primary feature vector;
a secondary feature vector corresponding to the promoter-intron primary feature vector;
a secondary feature vector corresponding to the promoter-3'-UTR primary feature vector.
Here, an autoencoder is an unsupervised neural network learning algorithm that uses back-propagation to approximate an identity function, so that its output is close to its input;
Specifically, the three trained autoencoders h_1, h_2 and h_3 are used to extract features again, giving the three secondary feature vectors of the test data.
Step S102: classify the secondary feature vectors of the test data with a preset support vector machine to obtain classification results, and determine that the test data is a promoter when the classification results meet a preset condition.
In the present invention, the classification results comprise:
a first classification result corresponding to the promoter-exon secondary feature vector;
a second classification result corresponding to the promoter-intron secondary feature vector;
a third classification result corresponding to the promoter-3'-UTR secondary feature vector;
wherein each of the first, second and third classification results is +1 or -1.
In the present invention, the test data is determined to be a promoter as follows:
when at least two of the first, second and third classification results are +1, the test data is determined to be a promoter.
Specifically, the classification model, i.e. the preset support vector machine, makes a classification decision on each of the three secondary feature vectors of the test data, where the output of each function g_i(·) is +1 or -1. The three outputs are then combined for the final decision on the test data: if at least two of the three functions output +1, the test data x is determined to be a promoter; otherwise it is a non-promoter. In the present invention, an SVM is used as the classifier, and a radial basis kernel function is chosen to form a non-linear SVM.
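The two-out-of-three decision rule described above can be sketched as follows. This is a minimal illustration; the function name `is_promoter` and the input validation are assumptions added for the example.

```python
def is_promoter(votes):
    """Return True when at least two of the three classifier outputs are +1.

    `votes` holds the outputs of the promoter-exon, promoter-intron and
    promoter-3'-UTR classifiers, each expected to be +1 or -1.
    """
    if len(votes) != 3 or any(v not in (1, -1) for v in votes):
        raise ValueError("expected three outputs, each +1 or -1")
    return sum(1 for v in votes if v == 1) >= 2

decision_a = is_promoter([1, 1, -1])   # two classifiers vote +1
decision_b = is_promoter([1, -1, -1])  # only one classifier votes +1
```

With an odd number of binary classifiers the vote can never tie, so no tie-breaking rule is needed.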
The human gene promoter recognition method provided above first reprocesses the extracted KL statistical features with an autoencoder and only then performs classification. Compared with the prior art, which directly classifies the feature vectors extracted with KL divergence, the present invention uses the neural network learning algorithm of the autoencoder, which effectively improves promoter recognition performance and thereby recognition accuracy.
Referring to Fig. 2, which shows a flowchart of embodiment 2 of a promoter recognition method of the present invention, step S100 (obtaining test data and determining the primary feature vectors of the test data) specifically comprises the following steps:
Step S200: obtain the test data;
Step S201: extract word-frequency statistical features of the test data;
Step S202: extract feature vectors from the word-frequency statistical features using KL divergence to obtain the primary feature vectors of the test data.
In the present application, the word-frequency statistical features are extracted by computing 5-mer word-frequency statistics on the test data.
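As an illustration of 5-mer word-frequency statistics, the sketch below counts overlapping 5-mers in a DNA sequence and normalizes the counts to relative frequencies. The use of overlapping windows, the skipping of windows that contain symbols other than A, C, G and T, and all names are assumptions; the patent does not specify these details.

```python
from itertools import product

K = 5
ALPHABET = "ACGT"
# All 4**5 = 1024 possible 5-mers in a fixed lexicographic order.
KMERS = ["".join(p) for p in product(ALPHABET, repeat=K)]
KMER_INDEX = {kmer: i for i, kmer in enumerate(KMERS)}

def kmer_frequencies(sequence):
    """Return the 1024-dimensional relative-frequency vector of 5-mers."""
    counts = [0] * len(KMERS)
    seq = sequence.upper()
    total = 0
    for i in range(len(seq) - K + 1):
        # Windows containing N or other symbols are skipped (assumption).
        idx = KMER_INDEX.get(seq[i:i + K])
        if idx is not None:
            counts[idx] += 1
            total += 1
    return [c / total for c in counts] if total else [0.0] * len(KMERS)

# A 10-base toy sequence has 6 overlapping windows:
# ACGTA, CGTAC, GTACG, TACGT, ACGTA, CGTAC.
freqs = kmer_frequencies("ACGTACGTAC")
```

Under the same assumptions, a 251 bp sequence contributes 247 overlapping windows.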
In the present invention, the preset support vector machine is obtained by training on human gene training data. Referring to Fig. 3, which shows a flowchart of embodiment 3 of a promoter recognition method of the present invention, the training process of the support vector machine may specifically comprise:
Step S300: obtain training data X and determine primary feature vectors of the training data X;
Specifically, the gene sequence training data X is input as N labeled samples (x_i, y_i), where x_i ∈ R^L, y_i ∈ {"promoter", "exon", "intron", "3'-UTR"}, N is the number of samples and L is the sequence length. Let f_pr be the frequency with which a 5-mer occurs in promoters, and f_np^a the frequency with which it occurs in non-promoter sequences of class a (a = 1 denotes exons; a = 2 denotes introns; a = 3 denotes 3'-UTRs). The KL divergence is defined as follows:
$$D_a(f_{pr}, f_{np}^{a}) = \sum_{i=1}^{1024} f_{pr}(i) \ln \frac{f_{pr}(i)}{f_{np}^{a}(i)}, \quad a = 1, 2, 3$$
Sort the 1024 summands f_pr(i) ln(f_pr(i) / f_np^a(i)) in descending order, denote the sorted sequence by d_a ∈ R^1024, and let:
$$R_a = \frac{\sum_{m=1}^{n_a} d_a(m)}{D_a(f_{pr}, f_{np}^{a})}, \quad n_a \in [1, 1024]$$
The upper summation limit n_a is increased step by step from 1, and the corresponding R_a is computed at each step. Once R_a ≥ 98%, the occurrence frequencies of the first n_a 5-mers are taken as the significant features for distinguishing promoters from non-promoter sequences of class a. n_a is retained, and the significant features of each gene sample x_i are then computed. This yields three significant feature sets X_1, X_2 and X_3: X_1 denotes the significant features that distinguish promoters from exons, X_2 those that distinguish promoters from introns, and X_3 those that distinguish promoters from 3'-UTRs.
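The selection rule above (rank the per-5-mer KL terms, then keep the smallest prefix whose cumulative share of the total divergence reaches 98%) can be sketched as follows. This is an illustrative sketch, not the patent's implementation: the function name, the `eps` smoothing that avoids log(0), the early return for a non-positive total, and the toy four-bin distributions are all assumptions.

```python
import math

def select_significant_kmers(f_pr, f_np, threshold=0.98, eps=1e-12):
    """Return (selected_indices, D_a) for one non-promoter class.

    f_pr, f_np: equal-length frequency lists (promoter and non-promoter).
    The indices are those of the top-ranked KL terms whose cumulative
    share R_a of the total divergence D_a first reaches `threshold`.
    """
    terms = [(p + eps) * math.log((p + eps) / (q + eps))
             for p, q in zip(f_pr, f_np)]
    d_total = sum(terms)
    if d_total <= 0:  # identical distributions: nothing to select
        return [], d_total
    order = sorted(range(len(terms)), key=lambda i: terms[i], reverse=True)
    selected, cumulative = [], 0.0
    for i in order:
        selected.append(i)
        cumulative += terms[i]
        if cumulative / d_total >= threshold:
            break
    return selected, d_total

# Toy example with 4 bins instead of 1024 (made-up frequencies).
f_pr = [0.4, 0.4, 0.1, 0.1]
f_np = [0.25, 0.25, 0.25, 0.25]
selected, d_total = select_significant_kmers(f_pr, f_np)
```

Individual terms f_pr(i) ln(f_pr(i)/f_np(i)) are negative wherever f_pr(i) < f_np(i), so the partial sums can exceed the total and R_a can rise above 1 before the threshold test stops the loop. In the toy example the first term alone gives R_a of about 0.975, so a second term is needed.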
Step S301: perform feature extraction on the primary feature vectors of the training data X with the autoencoder to obtain secondary feature vectors of the training data X;
Specifically, three autoencoders h_1, h_2 and h_3 relearn the significant training feature sets X_1, X_2 and X_3 obtained by the feature extraction module, in order to obtain another feature representation of these training feature sets, i.e. the secondary feature vectors of the training data X.
Before the autoencoder is used, the number of neurons in its input layer, hidden layer and output layer must be set. The output layer has the same number of neurons as the input layer, and the activation function of the hidden layer and of the output layer is the sigmoid function. Feeding X_a into the a-th autoencoder for learning yields the secondary feature set of the training data, denoted X'_a = h_a(X_a).
In the present invention, the number of hidden-layer neurons is twice the number of input-layer neurons, and the number of iterations is 1200.
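A minimal autoencoder with the structure described above (output size equal to input size, hidden layer twice the input size, sigmoid activations in both layers, 1200 iterations of back-propagation) might look like the sketch below. Several details are assumptions, since the patent does not state them: the squared-error loss, per-sample gradient updates, the learning rate, the weight initialization, and the use of the hidden activations as the secondary features.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

class Autoencoder:
    """Single-hidden-layer autoencoder with sigmoid hidden and output units."""

    def __init__(self, n_in, seed=0):
        rng = random.Random(seed)
        self.n_in, self.n_hid = n_in, 2 * n_in  # hidden layer is twice the input size
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(self.n_hid)]
        self.b1 = [0.0] * self.n_hid
        self.w2 = [[rng.uniform(-0.5, 0.5) for _ in range(self.n_hid)] for _ in range(n_in)]
        self.b2 = [0.0] * n_in

    def encode(self, x):
        return [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
                for row, b in zip(self.w1, self.b1)]

    def decode(self, h):
        return [sigmoid(sum(w * hi for w, hi in zip(row, h)) + b)
                for row, b in zip(self.w2, self.b2)]

    def loss(self, data):
        """Mean squared reconstruction error per sample."""
        return sum(sum((o - xi) ** 2 for o, xi in zip(self.decode(self.encode(x)), x))
                   for x in data) / len(data)

    def train(self, data, iterations=1200, lr=0.5):  # lr is an assumed value
        for _ in range(iterations):
            for x in data:
                h = self.encode(x)
                out = self.decode(h)
                # Output-layer error signal for squared error with sigmoid units.
                d_out = [(o - xi) * o * (1 - o) for o, xi in zip(out, x)]
                # Back-propagate to the hidden layer before w2 is updated.
                d_hid = [hj * (1 - hj) * sum(d_out[k] * self.w2[k][j] for k in range(self.n_in))
                         for j, hj in enumerate(h)]
                for k in range(self.n_in):
                    for j in range(self.n_hid):
                        self.w2[k][j] -= lr * d_out[k] * h[j]
                    self.b2[k] -= lr * d_out[k]
                for j in range(self.n_hid):
                    for i in range(self.n_in):
                        self.w1[j][i] -= lr * d_hid[j] * x[i]
                    self.b1[j] -= lr * d_hid[j]

# Toy primary features in (0, 1); real inputs would be selected 5-mer frequencies.
data = [[0.9, 0.1, 0.1], [0.1, 0.9, 0.1], [0.1, 0.1, 0.9]]
ae = Autoencoder(n_in=3)
loss_before = ae.loss(data)
ae.train(data)
loss_after = ae.loss(data)
secondary = [ae.encode(x) for x in data]  # hidden activations as secondary features
```

Because the output units are sigmoids, the reconstruction targets must lie in [0, 1], which relative frequencies satisfy without further scaling.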
Step S302: train support vector machines with the secondary feature vectors of the training data X to obtain the preset support vector machine.
Specifically, three support vector machines are trained on the secondary feature sets X'_1, X'_2 and X'_3, giving three SVM classification models g_1, g_2 and g_3, i.e. the preset support vector machine, where g_1 is the promoter-exon classifier, g_2 is the promoter-intron classifier and g_3 is the promoter-3'-UTR classifier.
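Once trained, each classifier g_a with a radial basis kernel decides by the sign of a kernel-weighted sum over its support vectors. The sketch below shows only this decision step; solving the SVM training problem is not shown. The support vectors, coefficients, bias and gamma value are hypothetical hand-set numbers for illustration, not values from the patent.

```python
import math

def rbf_kernel(u, v, gamma):
    """Radial basis kernel K(u, v) = exp(-gamma * ||u - v||^2)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(u, v)))

def svm_decide(x, support_vectors, dual_coefs, bias, gamma):
    """Return +1 or -1 from sign(sum_i coef_i * K(sv_i, x) + bias).

    Each dual coefficient is alpha_i * y_i, as produced by SVM training.
    """
    score = sum(c * rbf_kernel(sv, x, gamma)
                for sv, c in zip(support_vectors, dual_coefs)) + bias
    return 1 if score >= 0 else -1

# Hypothetical two-support-vector model: one promoter-like, one non-promoter-like.
support_vectors = [[0.0, 0.0], [1.0, 1.0]]
dual_coefs = [1.0, -1.0]
label_near_first = svm_decide([0.1, 0.1], support_vectors, dual_coefs, bias=0.0, gamma=1.0)
label_near_second = svm_decide([0.9, 0.9], support_vectors, dual_coefs, bias=0.0, gamma=1.0)
```

In practice an established SVM library would handle both training and prediction; the point here is only that the decision surface is non-linear in the secondary features because of the kernel.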
In the present invention, the training data (also called the training set) and the test data (also called the test set) are human genes. The promoter data set comes from the EPD database, the exon and intron data sets come from the EID database, and the 3'-UTR data set comes from the UTRdb database. The selected data are composed as follows: each promoter sequence is 251 bp long, taken from 200 bp upstream to 50 bp downstream of the TSS, i.e. the range [-200, +50] with the TSS at position 0; each exon, intron and 3'-UTR sequence is also 251 bp long. From the data set, 8000 samples (x_i, y_i), where x_i ∈ R^251 and y_i ∈ {"promoter", "exon", "intron", "3'-UTR"}, are randomly drawn for the experiment. The ratio of promoter, exon, intron and 3'-UTR samples is 1:1:1:1; because promoters are treated as class +1 and all non-promoters as class -1, the positive and negative samples are imbalanced (1:3).
Of these, 4000 samples, where x_i ∈ R^251 and y_i ∈ {"promoter", "exon", "intron", "3'-UTR"}, form the training set, and the remaining 4000 samples form the test set.
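The window arithmetic and the half-and-half split of the 8000 samples can be sketched as follows. The function names, the 0-based forward-strand indexing and the seeded shuffle are assumptions made for the illustration.

```python
import random

def promoter_window(chromosome, tss_index, upstream=200, downstream=50):
    """Return the [-200, +50] window around a TSS at 0-based `tss_index`.

    The window covers 200 upstream bases, the TSS itself (position 0)
    and 50 downstream bases, i.e. 251 bases in total.
    """
    start, end = tss_index - upstream, tss_index + downstream + 1
    if start < 0 or end > len(chromosome):
        raise ValueError("window extends past the sequence")
    return chromosome[start:end]

def split_half(samples, seed=0):
    """Shuffle a copy of `samples` and split it into two equal halves."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

toy_chromosome = "ACGT" * 100  # 400-base placeholder sequence
window = promoter_window(toy_chromosome, tss_index=250)
first_half, second_half = split_half(range(8000))
```

The half-open slice `[tss - 200, tss + 51)` is what yields 251 bases; using `tss + 50` as the slice end would drop the last downstream base.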
Corresponding to embodiment 1 of the promoter recognition method described above, the present invention also provides embodiment 1 of a promoter recognition system. Referring to Fig. 4, the recognition system 400 may specifically comprise:
a primary feature determining unit 401, configured to obtain test data and determine primary feature vectors of the test data;
a secondary feature determining unit 402, configured to perform feature extraction with an autoencoder on each of the primary feature vectors of the test data to obtain secondary feature vectors of the test data;
a promoter determining unit 403, configured to classify the secondary feature vectors of the test data with a preset support vector machine to obtain classification results, and to determine that the test data is a promoter when the classification results meet a preset condition.
In the present invention, preferably, the recognition system further comprises:
a preset support vector machine training unit, configured to obtain training data X and determine primary feature vectors of the training data X; to perform feature extraction on the primary feature vectors of the training data X with the autoencoder to obtain secondary feature vectors of the training data X; and to train a support vector machine with the secondary feature vectors of the training data X to obtain the preset support vector machine.
In summary, in the technical solution provided by the present invention, word-frequency (5-mer) statistical features are first extracted from the gene test data. Considering promoters together with non-promoter sequences such as coding exons and introns, KL divergence is then used to select the most representative and most discriminative significant word-frequency features for distinguishing promoters from exons, promoters from introns, and promoters from 3'-UTRs. The resulting feature vectors are each reprocessed with an autoencoder. The feature representations produced by the three autoencoders are classified by support vector machines, and the classification results are finally combined. Experimental results show that the system has good recognition performance.
The beneficial effects of the present invention can be verified by the following experiment:
Using the SVM-cascade-based human gene promoter recognition method proposed by the present invention, 8000 samples were randomly drawn from the given data set ten times for testing, with imbalanced positive and negative samples; the reported results are the averages over the ten runs. To show the experimental effect clearly, the proposed method is compared on the same data set with the algorithm in "Human Promoter Recognition Algorithm".
According to the Bajic evaluation criteria, sensitivity (S_n), specificity (S_p) and averaged conditional probability (ACP) can be used to evaluate the performance of an algorithm:
$$S_n = \frac{TP}{TP + FN}, \qquad S_p = \frac{TP}{TP + FP}$$
$$ACP = \frac{1}{4}\left(\frac{TP}{TP + FN} + \frac{TP}{TP + FP} + \frac{TN}{TN + FP} + \frac{TN}{TN + FN}\right)$$
Here TP is the number of promoter sequences correctly identified as promoters; FN is the number of promoter sequences incorrectly identified as non-promoters; FP is the number of non-promoter sequences incorrectly identified as promoters; and TN is the number of non-promoter sequences correctly identified as non-promoters.
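The three measures can be computed directly from the four counts. In the sketch below the function name and the example counts are made up for illustration; they are not results from the patent.

```python
def bajic_metrics(tp, fn, fp, tn):
    """Return (S_n, S_p, ACP) as defined by the formulas above."""
    sn = tp / (tp + fn)
    sp = tp / (tp + fp)
    acp = 0.25 * (tp / (tp + fn) + tp / (tp + fp)
                  + tn / (tn + fp) + tn / (tn + fn))
    return sn, sp, acp

# Hypothetical confusion counts: 100 promoters and 300 non-promoters.
sn, sp, acp = bajic_metrics(tp=80, fn=20, fp=10, tn=290)
```

Note that S_p as defined here is TP / (TP + FP), the fraction of predicted promoters that are true promoters, rather than TN / (TN + FP).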
We tested the system on the test set and compared its performance with that of a system without the feature reprocessing module. Table 1 gives the results for the two systems.
Table 1. Classification performance of the two systems
System performance    Prior art    The present invention
S_n                   73.91        77.50
S_p                   82.08        82.69
ACP                   72.25        73.99
The experimental results show that applying an autoencoder to reprocess the features improves recognition performance: sensitivity rises from 73.91 to 77.50, specificity from 82.08 to 82.69, and ACP from 72.25 to 73.99. By extracting and classifying features separately for the different gene regions, the method has a certain advantage.
It should be noted that the embodiments in this specification are described progressively: each embodiment focuses on its differences from the other embodiments, and the same or similar parts of the embodiments can be understood by reference to one another. Because the system embodiment is essentially similar to the method embodiments, it is described relatively briefly; for the relevant parts, see the description of the method embodiments.
The promoter recognition method and system provided by the present invention have been described in detail above. Specific examples are used herein to explain the principle and implementation of the present invention, and the description of the above embodiments is only intended to help in understanding the method of the present invention and its core idea. It should be pointed out that a person of ordinary skill in the art can make improvements and modifications to the present invention without departing from its principle, and these improvements and modifications also fall within the scope of protection of the claims of the present invention.

Claims (10)

1. A promoter recognition method, characterized by comprising:
obtaining test data and determining primary feature vectors of the test data;
performing feature extraction on the primary feature vectors of the test data with an autoencoder to obtain secondary feature vectors of the test data;
classifying the secondary feature vectors of the test data with a preset support vector machine to obtain classification results, and determining that the test data is a promoter when the classification results meet a preset condition.
2. The recognition method of claim 1, characterized in that obtaining the test data and determining the primary feature vectors of the test data comprises:
obtaining the test data;
extracting word-frequency statistical features of the test data;
extracting feature vectors from the word-frequency statistical features using KL divergence to obtain the primary feature vectors of the test data.
3. The recognition method of claim 2, characterized in that the word-frequency statistical features are extracted by computing 5-mer word-frequency statistics on the test data.
4. The recognition method of claim 1, characterized by further comprising:
obtaining training data X and determining primary feature vectors of the training data X;
performing feature extraction on the primary feature vectors of the training data X with the autoencoder to obtain secondary feature vectors of the training data X;
training a support vector machine with the secondary feature vectors of the training data X to obtain the preset support vector machine.
5. The recognition method of claim 4, characterized by further comprising:
setting the number of neurons in the input layer, the hidden layer and the output layer of the autoencoder.
6. The recognition method of claim 5, characterized in that the activation function of the hidden layer and of the output layer is the sigmoid function.
7. The recognition method of claim 1, characterized in that the primary feature vectors of the test data comprise:
a primary feature vector that distinguishes promoters from exons;
a primary feature vector that distinguishes promoters from introns;
a primary feature vector that distinguishes promoters from 3'-UTRs;
the secondary feature vectors of the test data comprise:
a secondary feature vector corresponding to the promoter-exon primary feature vector;
a secondary feature vector corresponding to the promoter-intron primary feature vector;
a secondary feature vector corresponding to the promoter-3'-UTR primary feature vector;
the classification results comprise:
a first classification result corresponding to the promoter-exon secondary feature vector;
a second classification result corresponding to the promoter-intron secondary feature vector;
a third classification result corresponding to the promoter-3'-UTR secondary feature vector;
wherein each of the first, second and third classification results is +1 or -1.
8. The recognition method of claim 7, characterized in that the test data is determined to be a promoter as follows:
when at least two of the first, second and third classification results are +1, the test data is determined to be a promoter.
9. A promoter recognition system, characterized by comprising:
a primary feature determining unit, configured to obtain test data and determine primary feature vectors of the test data;
a secondary feature determining unit, configured to perform feature extraction with an autoencoder on each of the primary feature vectors of the test data to obtain secondary feature vectors of the test data;
a promoter determining unit, configured to classify the secondary feature vectors of the test data with a preset support vector machine to obtain classification results, and to determine that the test data is a promoter when the classification results meet a preset condition.
10. The recognition system of claim 9, characterized by further comprising:
a preset support vector machine training unit, configured to obtain training data X and determine primary feature vectors of the training data X; to perform feature extraction on the primary feature vectors of the training data X with the autoencoder to obtain secondary feature vectors of the training data X; and to train a support vector machine with the secondary feature vectors of the training data X to obtain the preset support vector machine.
CN201410727536.3A 2014-12-03 2014-12-03 promoter recognition method and system Active CN104376234B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410727536.3A CN104376234B (en) 2014-12-03 2014-12-03 promoter recognition method and system

Publications (2)

Publication Number Publication Date
CN104376234A true CN104376234A (en) 2015-02-25
CN104376234B CN104376234B (en) 2017-12-26

Family

ID=52555138

Country Status (1)

Country Link
CN (1) CN104376234B (en)

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103870719A (en) * 2014-04-09 2014-06-18 苏州大学 Human gene promoter identification method and system

Non-Patent Citations (11)

* Cited by examiner, † Cited by third party
Title
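HAIHONG ZHANG et al.: "A Kernel Autoassociator Approach to Pattern Classification", 《IEEE TRANSACTIONS ON SYSTEMS, MAN, AND CYBERNETICS-PART B: CYBERNETICS》 *
NILS PLATH: "Extracting low-dimensional features by means of Deep Network Architectures", 《PhD thesis, Technische Universität Berlin》 *
YUSHU GAO et al.: "Extract Features Using Stacked Denoised Autoencoder", 《INTELLIGENT COMPUTING IN BIOINFORMATICS》 *
刘咏梅 et al.: "Promoter recognition method based on feature synthesis", 《Computer Engineering and Applications》 *
孙丽慧 et al.: "Gene detection based on SDEC and support vector machines", 《Chinese Journal of Electron Devices》 *
智慧 et al.: "Recognition of human Pol II promoters using a new knowledge-based encoding method and a two-layer SVM", 《Journal of Harbin Medical University》 *
李时辉 et al.: "Application of 2D-PCA to gene feature extraction", 《Space Medicine & Medical Engineering》 *
杜耀华 et al.: "A discriminant analysis method for prokaryotic promoters based on feature selection", 《Acta Biophysica Sinica》 *
梅丽: "Research on human promoter recognition algorithms", 《China Master's Theses Full-text Database, Basic Sciences》 *
石欣 et al.: "Abnormal gait recognition based on secondary feature extraction and SVM", 《Chinese Journal of Scientific Instrument》 *
蒋胜利 et al.: "A new method for identifying protein mass spectrometry data by secondary projection", 《Journal of Sun Yat-sen University (Natural Science Edition)》 *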

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104834834A (en) * 2015-04-09 2015-08-12 Zhangjiagang Industrial Technology Research Institute of Suzhou University Construction method and device of promoter recognition system
CN105550538A (en) * 2016-02-03 2016-05-04 Suzhou University Human gene promoter recognition method and system
CN105550538B (en) * 2016-02-03 2018-06-01 Suzhou University Human gene promoter recognition method and system
CN106054778A (en) * 2016-07-22 2016-10-26 Beijing Research Center for Information Technology in Agriculture Cold chain transportation process intelligent monitoring sampling method and device and cold chain vehicle
CN106054778B (en) * 2016-07-22 2018-11-20 Beijing Research Center for Information Technology in Agriculture Cold chain transportation process intelligent monitoring sampling method and device and cold chain vehicle
CN107145836A (en) * 2017-04-13 2017-09-08 Xidian University Hyperspectral image classification method based on stacked boundary discrimination autoencoder
CN107145836B (en) * 2017-04-13 2020-04-07 Xidian University Hyperspectral image classification method based on stacked boundary discrimination autoencoder
CN108647489A (en) * 2018-05-15 2018-10-12 Huazhong Agricultural University Method and system for screening disease drug targets and target combinations

Also Published As

Publication number Publication date
CN104376234B (en) 2017-12-26

Similar Documents

Publication Publication Date Title
CN104376234A (en) Promoter identification method and system
CN102156871B (en) Image classification method based on category correlated codebook and classifier voting strategy
CN106951825A (en) Face image quality assessment system and implementation method
CN105389379A (en) Rubbish article classification method based on distributed feature representation of text
CN105389583A (en) Image classifier generation method, and image classification method and device
CN106126751A (en) Classification method and device with time availability
CN104239554A (en) Cross-domain and cross-category news commentary emotion prediction method
CN102156885A (en) Image classification method based on cascaded codebook generation
CN105045913B (en) File classification method based on WordNet and latent semantic analysis
CN103077399B (en) Biological microscopic image classification method based on integrated cascade
CN104834918A (en) Human behavior recognition method based on Gaussian process classifier
CN109887279B (en) Traffic jam prediction method and system
CN104142960A (en) Internet data analysis system
CN103927550A (en) Handwritten number identifying method and system
CN104750875A (en) Machine error data classification method and system
CN104978569A (en) Sparse representation based incremental face recognition method
CN104809229B (en) Text feature word extraction method and system
CN105224577A (en) Multi-label text classification method and system
CN104462870A (en) Method and device for identifying human gene promoter
CN111260490A (en) Rapid claims settlement method and system based on tree model for car insurance
CN107886130A (en) Fast kNN classification method based on clustering and similarity weighting
CN104866867B (en) Multinational banknote serial number character recognition method based on a banknote sorting machine
CN111414473A (en) Semi-supervised classification method and system
CN105760471A (en) Classification method for two types of texts based on multiconlitron
CN104834834A (en) Construction method and device of promoter recognition system

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant