CN108595913A - Supervised learning method for identifying mRNA and lncRNA
- Publication number: CN108595913A
- Application number: CN201810449074.1A
- Authority: CN (China)
- Prior art keywords: lncrna, mrna, sequence, model, layer
- Legal status: Granted (the legal status is an assumption and is not a legal conclusion)
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B20/00—ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations
Abstract
The invention discloses a supervised learning method for distinguishing mRNA from lncRNA, comprising the steps of: using human mRNA and lncRNA data from the GENCODE database as the training set and test set, and converting each transcript sequence into k-mer sequences; counting the frequency of each k-mer type in every sequence and normalizing the counts; arranging the k-mer frequencies in matrix form as the input to the convolutional layer of a convolutional neural network model; and training and testing the constructed convolutional neural network on the human mRNA and lncRNA data to determine the model parameters, so that the model can predict whether a human transcript is an lncRNA or an mRNA. The method relies on differences in the k-mer features of mRNA and lncRNA and applies supervised learning of those features with a convolutional neural network, taking the k-mers of a sequence as the model input. The accuracy of the trained model is reported to reach 98% or more, which lays a good foundation for further analysis of the biological functions of lncRNA sequences.
Description
Technical field
The present invention relates to biotechnology, and in particular to techniques for predicting the class of an unknown human transcript sequence, especially for identifying whether it is an lncRNA sequence or an mRNA sequence. More specifically, it relates to a supervised learning method for distinguishing mRNA from lncRNA.
Background art
Long non-coding RNAs (lncRNAs) are non-coding RNAs found in eukaryotes that are longer than 200 nucleotides and show no protein-coding potential. They were initially regarded as by-products of RNA polymerase II transcription with no biological function. Further research has shown that, although lncRNAs do not encode proteins themselves, they play a very important role in the regulation of eukaryotic gene expression. Acting as RNA, they take part in many important regulatory processes, such as genomic imprinting, chromatin modification, transcriptional activation, transcriptional interference and intranuclear transport, and thereby exert their biological functions. lncRNAs are usually between 200 nt and 100,000 nt long and are structurally similar to mRNAs: they are spliced and have poly(A) tails and promoter structures. Because lncRNAs and mRNAs are similar in these respects and both have biological functions, identifying and distinguishing them is particularly important.
At present there are many methods for identifying lncRNA. The usual workflow is to first extract sequence features, such as open reading frames and protein sequence similarity, and then train a machine learning method on the extracted features to obtain a model that distinguishes lncRNA from mRNA. Well-performing models of this kind include CNCI, CPC, CSF and PhyloCSF, all of which achieve good results on their respective training and validation data sets.
However, methods that train on extracted sequence features depend heavily on sequencing quality. Current research indicates that high-throughput sequencing cannot guarantee the accuracy of the sequences obtained: sequencing errors, varying sequencing depth and sequencing bias all affect sequencing quality. Some researchers have taken these problems into account and proposed identification algorithms that do not depend on sequence quality, but those methods suffer from cumbersome models, heavy computation and high time cost. There is therefore a need for a model that classifies lncRNA and mRNA effectively without depending on sequence quality or on biological structural features of the sequence, while keeping the model simple and avoiding heavy computation.
Summary of the invention
The purpose of the present invention is to overcome the deficiencies of the background art described above by providing a supervised learning method, based on a convolutional neural network, that distinguishes mRNA from lncRNA according to differences in their k-mer features.
To achieve this purpose, the supervised learning method for distinguishing mRNA from lncRNA according to the present invention comprises the following steps:
1) download the human mRNA transcript data and lncRNA transcript data from the GENCODE database, and select the mRNA and lncRNA sequences that meet the length requirement as experimental data;
2) convert each transcript sequence sample in the experimental data into k-mer sequences, where k is a natural number greater than 0;
3) count the occurrences of each k-mer type in every sequence and then normalize, that is, compute the frequency of each k-mer among all k-mers of that sequence, so that the k-mer frequencies of each sequence sum to 1;
4) arrange the k-mer frequencies in matrix form as the input to the convolutional layer of the convolutional neural network model, and build the model architecture from convolutional layers, a pooling layer and a fully connected layer that uses the softmax function as its activation function;
5) divide the experimental data into a model training sample data set and a model test sample data set, and train the convolutional neural network model on the training sample data set to obtain a classification model;
6) optimize the convolutional neural network model by adjusting its parameters and the value of k, and verify the classification accuracy on the model test sample data set, so that mRNA and lncRNA sequences can be predicted accurately.
Preferably, in step 1), the mRNA and lncRNA sequences that meet the length requirement are selected as follows: 2000 to 10000 sequences are randomly drawn from the downloaded human mRNA transcript data and lncRNA transcript data and their lengths are analyzed to determine the length ranges of lncRNA and mRNA; mRNA and lncRNA sequences are then randomly selected, as experimental data, from the downloaded data that fall within those length ranges.
Preferably, in step 4), the first layer of the convolutional neural network model uses 32 convolution kernels of size 3*3 with the ReLU activation function, and zero padding around the border keeps the matrix size unchanged by the convolution; the second layer uses 64 convolution kernels of size 3*3 with the ReLU activation function; the third layer is a max pooling layer with a 2*2 pooling region, and Dropout with probability 0.25 is applied to the connections between the pooling layer and the fully connected layer; the last layer is a fully connected layer with 128 neurons, fully connected to the pooling layer, after which Dropout with probability 0.5 is applied to the connections between the fully connected layer and the output layer neurons; finally, the classification result is obtained with the softmax function as the activation function.
Preferably, in step 5), the model training sample data set contains no fewer than 10000 sequences and the model test sample data set contains no fewer than 1000 sequences.
Preferably, in step 2), k takes the values 1, 2 and 3.
Preferably, the length ranges of lncRNA and mRNA are 250 nt to 3500 nt and 200 nt to 4000 nt, respectively.
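The preferred setting k = 1, 2, 3 yields 4 + 16 + 64 = 84 frequency values per transcript. A minimal sketch of building such a combined feature vector is given here for illustration. The function name `multi_k_features`, the overlapping stride-1 window, the mapping of U to T and the concatenation order are assumptions, since the text does not specify them.

```python
from collections import Counter
from itertools import product


def multi_k_features(sequence, ks=(1, 2, 3), alphabet="ATCG"):
    """Concatenate normalized k-mer frequencies for several k values.

    Assumption: k-mers are read with an overlapping window of stride 1,
    and windows containing symbols outside the alphabet are ignored.
    """
    seq = sequence.upper().replace("U", "T")  # assumption: RNA written as DNA
    features = []
    for k in ks:
        counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
        vocab = ["".join(p) for p in product(alphabet, repeat=k)]
        total = sum(counts[kmer] for kmer in vocab)
        # Each block of 4**k values sums to 1 (or to 0 for too-short input).
        features.extend(counts[kmer] / total if total else 0.0 for kmer in vocab)
    return features


vector = multi_k_features("AATTCGGCTA")
print(len(vector))
```

Each k contributes its own normalized block, so the full vector sums to the number of k values used rather than to 1.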
The present invention relies on differences in the k-mer features of mRNA and lncRNA and proposes a method for distinguishing lncRNA from mRNA by supervised learning of k-mer features with a convolutional neural network, laying a good foundation for further analysis of the biological functions of lncRNA sequences. The method takes advantage of the strengths of convolutional neural networks: the constructed network is trained and tested on human mRNA and lncRNA from the GENCODE database with the k-mers of each sequence as model input, the model parameters are determined, and the resulting model is used to predict whether a human transcript is an lncRNA or an mRNA. The accuracy of the trained model is reported to reach 98% or more.
Description of the drawings
Fig. 1 is a flow chart of the supervised learning method for distinguishing mRNA from lncRNA.
Fig. 2 is a schematic diagram of the structure of the convolutional neural network in the embodiment of the present invention.
Fig. 3 is a flow chart of training and testing on mRNA and lncRNA sequences with the convolutional neural network in the embodiment of the present invention.
Detailed description of the embodiments
To make the technical solution, technical features and advantages of the present invention clearer, the invention is described in further detail below with reference to an embodiment.
As shown in Fig. 1, the supervised learning method for distinguishing mRNA from lncRNA proposed by the present invention comprises the following steps:
1) Download the human mRNA transcript data and lncRNA transcript data from the GENCODE database, and select the mRNA and lncRNA sequences that meet the length requirement as experimental data.
The downloaded data set contains 27720 lncRNA transcripts and 199324 mRNA transcripts. To balance the amounts of lncRNA and mRNA data, 5000 lncRNA sequences and 5000 mRNA sequences were randomly selected from the 27720 lncRNA records and the 199324 mRNA records, respectively, and their lengths were analyzed statistically. On this basis, the length ranges for lncRNA and mRNA samples were set to 250 nt to 3500 nt and 200 nt to 4000 nt, respectively. Then 20000 sequences meeting the length requirement were randomly selected from the data set for each of lncRNA and mRNA as experimental data. Of these, 15000 lncRNA transcripts and 15000 mRNA transcripts form the training sample data set of the model, and the remaining 5000 of each form the test sample data set.
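The selection and split described above can be sketched as follows. This is an illustrative outline, not the inventors' code: the function name `select_and_split`, the fixed random seed and the inclusive treatment of the length bounds are assumptions.

```python
import random

# Length ranges in nucleotides, as stated in the text.
LENGTH_RANGES = {"lncRNA": (250, 3500), "mRNA": (200, 4000)}


def select_and_split(sequences, label, n_train=15000, n_test=5000, seed=0):
    """Filter one class by length, then draw disjoint training and test sets.

    Assumption: both bounds of the length range are inclusive.
    """
    low, high = LENGTH_RANGES[label]
    eligible = [s for s in sequences if low <= len(s) <= high]
    if len(eligible) < n_train + n_test:
        raise ValueError("not enough sequences within the length range")
    rng = random.Random(seed)  # seeded for reproducibility (assumption)
    chosen = rng.sample(eligible, n_train + n_test)  # sampling without replacement
    return chosen[:n_train], chosen[n_train:]
```

Calling the function once per class with the default sizes reproduces the 15000/5000 split per class, which keeps the two classes balanced in both sets.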
2) Convert each transcript sequence sample in the experimental data into k-mer sequences, where k is a natural number greater than 0.
Each transcript sequence sample in the experimental data is converted into k-mer sequences. Taking k=6 as an example, there are 4096 (4^6) possible 6-mer subsequences. Arranged in the order A, T, C, G, the set of 6-mer subsequences is {AAAAAA, AAAAAT, AAAAAC, AAAAAG, AAAATA, AAAATT, AAAATC, AAAATG, AAAACA, AAAACT, ..., GGGGGG}.
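The ordered 6-mer set can be generated mechanically. The sketch below uses Python's `itertools.product`; the function name `enumerate_kmers` is chosen here for illustration.

```python
from itertools import product

ALPHABET = "ATCG"  # base ordering used in the text: A, T, C, G


def enumerate_kmers(k):
    """Return all 4**k possible k-mers, ordered with A < T < C < G."""
    if k < 1:
        raise ValueError("k must be a natural number greater than 0")
    # product() varies the last position fastest, which matches the
    # ordering AAAAAA, AAAAAT, AAAAAC, AAAAAG, AAAATA, ... in the text.
    return ["".join(letters) for letters in product(ALPHABET, repeat=k)]


kmers = enumerate_kmers(6)
print(len(kmers), kmers[:4], kmers[-1])
```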
3) Count the occurrences of each k-mer type in every sequence and then normalize, that is, compute the frequency of each k-mer among all k-mers of that sequence, so that the k-mer frequencies of each sequence sum to 1. Taking k=6 as an example, the number of occurrences of each of the 6-mer subsequences AAAAAA, AAAAAT, AAAAAC, AAAAAG, AAAATA, AAAATT, AAAATC, AAAATG, AAAACA, AAAACT, ..., GGGGGG is counted, and the frequency of each 6-mer is then calculated as its share of the total number of 6-mers in the sequence.
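A minimal sketch of the counting and normalization step follows. The text does not say whether k-mers are read with an overlapping window, how ambiguous bases are handled, or whether U is written as T, so the stride-1 window, the skipping of windows with unknown symbols and the U-to-T mapping are all assumptions, as is the function name `kmer_frequencies`.

```python
from collections import Counter
from itertools import product


def kmer_frequencies(sequence, k, alphabet="ATCG"):
    """Return normalized k-mer frequencies in A, T, C, G order.

    The returned list has 4**k entries that sum to 1 whenever the
    sequence contains at least one valid k-mer.
    """
    seq = sequence.upper().replace("U", "T")  # assumption: RNA written as DNA
    vocab = ["".join(p) for p in product(alphabet, repeat=k)]
    # Assumption: overlapping window that advances one base at a time.
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    # Windows with symbols outside the alphabet (e.g. N) are not counted.
    total = sum(counts[kmer] for kmer in vocab)
    if total == 0:
        return [0.0] * len(vocab)
    return [counts[kmer] / total for kmer in vocab]


freqs = kmer_frequencies("AATTCG", 1)
print(freqs)
```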
4) Arrange the k-mer frequencies in matrix form as the input to the convolutional layer of the convolutional neural network model. The data then pass through the convolutional layers, the pooling layer and the fully connected layer that uses the softmax function as its activation function. This forms the convolutional neural network model architecture of the present invention, which consists of two convolutional layers, one pooling layer and one fully connected layer.
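The text does not state how the frequency vector is laid out as a matrix. One possible layout, shown here purely as an assumption, is a row-major square matrix padded with zeros: 4096 values (k=6) fit a 64 by 64 matrix exactly, while 84 values (k=1, 2, 3 combined) need a 10 by 10 matrix with 16 padding cells. The function name `to_square_matrix` is hypothetical.

```python
import math


def to_square_matrix(values):
    """Arrange a flat frequency vector as a zero-padded square matrix.

    Assumption: row-major fill; the patent text does not specify the layout.
    """
    n = len(values)
    side = math.isqrt(n)
    if side * side < n:
        side += 1  # smallest square that can hold all values
    padded = list(values) + [0.0] * (side * side - n)
    return [padded[row * side:(row + 1) * side] for row in range(side)]


matrix = to_square_matrix([0.25] * 84)
print(len(matrix), len(matrix[0]))
```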
The specific architecture of the convolutional neural network model is shown in Fig. 2, and the corresponding process flow is shown in Fig. 3. The first layer of the model uses 32 convolution kernels of size 3*3 with the ReLU activation function, and zero padding around the border keeps the matrix size unchanged by the convolution. The second layer is also a convolutional layer, with 64 convolution kernels of size 3*3 and the ReLU activation function. The third layer is a max pooling layer with a 2*2 pooling region. Dropout with probability 0.25 is applied to the connections between the pooling layer and the fully connected layer to prevent overfitting. The last layer is a fully connected layer with 128 neurons; Dropout with probability 0.5 is applied to the connections between the fully connected layer and the output layer neurons, and the classification result is finally obtained with the softmax function as the activation function. The loss function used during training is the cross-entropy loss, and the optimizer is Adadelta.
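To make the layer sizes concrete, the following sketch traces output shapes and trainable parameter counts through the described layers. It is a bookkeeping aid rather than a trainable model. Several details are assumptions because the text leaves them open: a single-channel input matrix, no padding on the second convolution, and two output classes. The function names are hypothetical.

```python
def conv2d_shape(height, width, channels_in, filters, kernel=3, padding="same"):
    """Output shape and parameter count of a stride-1 2D convolution."""
    params = kernel * kernel * channels_in * filters + filters  # weights + biases
    if padding == "same":  # zero padding keeps height and width unchanged
        return (height, width, filters), params
    return (height - kernel + 1, width - kernel + 1, filters), params


def trace_cnn(height, width, n_classes=2, second_conv_padding="valid"):
    """List (layer name, output shape, parameter count) for the described CNN."""
    layers = []
    shape, params = conv2d_shape(height, width, 1, 32, padding="same")
    layers.append(("conv 3x3, 32 kernels, ReLU, zero padding", shape, params))
    # Assumption: the second convolution is unpadded; the text does not say.
    shape, params = conv2d_shape(shape[0], shape[1], shape[2], 64,
                                 padding=second_conv_padding)
    layers.append(("conv 3x3, 64 kernels, ReLU", shape, params))
    shape = (shape[0] // 2, shape[1] // 2, shape[2])
    layers.append(("max pool 2x2, then dropout 0.25", shape, 0))
    flat = shape[0] * shape[1] * shape[2]
    layers.append(("fully connected 128, then dropout 0.5", (128,), flat * 128 + 128))
    layers.append(("softmax output", (n_classes,), 128 * n_classes + n_classes))
    return layers


for name, shape, params in trace_cnn(64, 64):
    print(f"{name:45s} {str(shape):15s} {params}")
```

Dropout layers add no parameters, so they appear only in the layer names. Under these assumptions most of the parameters sit in the connection between the pooling layer and the 128-neuron layer, which is consistent with applying Dropout there.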
5) Train the convolutional neural network on the training sample data set, which consists of the selected 15000 lncRNA transcripts and 15000 mRNA transcripts, to obtain the classification model.
6) Optimize the convolutional neural network model by adjusting its parameters and the value of k, and verify the classification accuracy on the model test sample data set, so that mRNA and lncRNA sequences can be predicted accurately.
By adjusting the parameters of the convolutional neural network model and the value of k, the optimal combination of k values was finally determined to be k = 1, 2, 3. Under 10-fold cross-validation with this setting, the average loss on the training set was 0.0430 and the average classification accuracy was 0.9872; the average loss on the validation set was 0.0431 and the average classification accuracy was 0.9790.
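The 10-fold cross-validation mentioned above can be organized with index folds, as in this minimal sketch. The shuffling, the fixed seed and the function name `k_fold_indices` are assumptions; the text only states that 10-fold cross-validation was used.

```python
import random


def k_fold_indices(n_samples, n_folds=10, seed=0):
    """Yield (train_indices, validation_indices) pairs for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # assumption: samples are shuffled first
    # Fold i takes every n_folds-th shuffled index, starting at position i.
    folds = [indices[i::n_folds] for i in range(n_folds)]
    for i in range(n_folds):
        validation = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, validation


for train_idx, val_idx in k_fold_indices(30000):
    # A model would be trained on train_idx and evaluated on val_idx here;
    # the reported figures are averages of loss and accuracy over the folds.
    pass
```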
Finally, under the same conditions, the data were also trained and validated with four other machine learning methods: random forest (RF), logistic regression (LR), decision tree (DT) and support vector machine (SVM). The classification performance of each model compared with the convolutional neural network (CNN) is given in Table 1:
Table 1. Comparison of the classification performance of each model
According to Table 1, the model proposed by the present invention shows a clear improvement over the other prior-art methods in precision, accuracy and recall.
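For reference, the three measures named above can be computed from predicted and true labels as sketched below. Treating lncRNA as the positive class is an assumption, as is the function name `binary_metrics`; the example labels are toy values and do not reproduce Table 1.

```python
def binary_metrics(y_true, y_pred, positive="lncRNA"):
    """Compute accuracy, precision and recall for a two-class problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    return {
        "accuracy": (tp + tn) / len(pairs) if pairs else 0.0,
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }


# Toy labels for illustration only.
truth = ["lncRNA", "lncRNA", "mRNA", "mRNA"]
predicted = ["lncRNA", "mRNA", "mRNA", "mRNA"]
print(binary_metrics(truth, predicted))
```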
Those skilled in the art will understand that the specific embodiments described herein are intended only to explain the present invention and not to limit it. Any modification, equivalent replacement or improvement made within the spirit and principles of the present invention shall fall within its scope of protection.
Claims (6)
1. A supervised learning method for distinguishing mRNA from lncRNA, characterized by comprising the following steps:
1) downloading the human mRNA transcript data and lncRNA transcript data from the GENCODE database, and selecting the mRNA and lncRNA sequences that meet the length requirement as experimental data;
2) converting each transcript sequence sample in the experimental data into k-mer sequences, where k is a natural number greater than 0;
3) counting the occurrences of each k-mer type in every sequence and then normalizing, that is, computing the frequency of each k-mer among all k-mers of that sequence, so that the k-mer frequencies of each sequence sum to 1;
4) arranging the k-mer frequencies in matrix form as the input to the convolutional layer of the convolutional neural network model, and building the model architecture from convolutional layers, a pooling layer and a fully connected layer that uses the softmax function as its activation function;
5) dividing the experimental data into a model training sample data set and a model test sample data set, and training the convolutional neural network model on the training sample data set to obtain a classification model;
6) optimizing the convolutional neural network model by adjusting its parameters and the value of k, and verifying the classification accuracy on the model test sample data set, so that mRNA and lncRNA sequences can be predicted accurately.
2. The supervised learning method for distinguishing mRNA from lncRNA according to claim 1, characterized in that, in step 1), the mRNA and lncRNA sequences that meet the length requirement are selected as follows: 2000 to 10000 sequences are randomly drawn from the downloaded human mRNA transcript data and lncRNA transcript data and their lengths are analyzed to determine the length ranges of lncRNA and mRNA; mRNA and lncRNA sequences are then randomly selected, as experimental data, from the downloaded data that fall within those length ranges.
3. The supervised learning method for distinguishing mRNA from lncRNA according to claim 1, characterized in that, in step 4), the first layer of the convolutional neural network model uses 32 convolution kernels of size 3*3 with the ReLU activation function, and zero padding around the border keeps the matrix size unchanged by the convolution; the second layer uses 64 convolution kernels of size 3*3 with the ReLU activation function; the third layer is a max pooling layer with a 2*2 pooling region, and Dropout with probability 0.25 is applied to the connections between the pooling layer and the fully connected layer; the last layer is a fully connected layer with 128 neurons, fully connected to the pooling layer, after which Dropout with probability 0.5 is applied to the connections between the fully connected layer and the output layer neurons; finally, the classification result is obtained with the softmax function as the activation function.
4. The supervised learning method for distinguishing mRNA from lncRNA according to claim 1, characterized in that, in step 5), the model training sample data set contains no fewer than 10000 sequences and the model test sample data set contains no fewer than 1000 sequences.
5. The supervised learning method for distinguishing mRNA from lncRNA according to claim 1, characterized in that, in step 2), k takes the values 1, 2 and 3.
6. The supervised learning method for distinguishing mRNA from lncRNA according to claim 2, characterized in that the length ranges of lncRNA and mRNA are 250 nt to 3500 nt and 200 nt to 4000 nt, respectively.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810449074.1A CN108595913B (en) | 2018-05-11 | 2018-05-11 | Supervised learning method for identifying mRNA and lncRNA |
Publications (2)
Publication Number | Publication Date |
---|---|
CN108595913A true CN108595913A (en) | 2018-09-28 |
CN108595913B CN108595913B (en) | 2021-07-06 |
Family
ID=63637233
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810449074.1A Active CN108595913B (en) | 2018-05-11 | 2018-05-11 | Supervised learning method for identifying mRNA and lncRNA |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108595913B (en) |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102332064A (en) * | 2011-10-07 | 2012-01-25 | 吉林大学 | Biological species identification method based on genetic barcode |
WO2013138727A1 (en) * | 2012-03-15 | 2013-09-19 | Sabiosciences Corp. | Method, kit and array for biomarker validation and clinical use |
CN106778079A (en) * | 2016-11-22 | 2017-05-31 | 重庆邮电大学 | A kind of DNA sequence dna k mer frequency statistics methods based on MapReduce |
Cited By (26)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109559781A (en) * | 2018-10-24 | 2019-04-02 | 成都信息工程大学 | A kind of two-way LSTM and CNN model that prediction DNA- protein combines |
CN109448795A (en) * | 2018-11-12 | 2019-03-08 | 山东农业大学 | The recognition methods of circRNA a kind of and device |
CN109448795B (en) * | 2018-11-12 | 2021-04-16 | 山东农业大学 | Method and device for recognizing circRNA |
CN109903813A (en) * | 2019-01-23 | 2019-06-18 | 华东师范大学 | A kind of gene order similarity calculating method based on multiple k values |
CN109903813B (en) * | 2019-01-23 | 2023-03-31 | 华东师范大学 | Gene sequence similarity calculation method based on multiple k values |
CN110189797B (en) * | 2019-06-17 | 2022-10-21 | 福建师范大学 | Sequence error number prediction method based on DBN |
CN110189797A (en) * | 2019-06-17 | 2019-08-30 | 福建师范大学 | A kind of sequence errors number prediction technique based on DBN |
CN110428870A (en) * | 2019-08-08 | 2019-11-08 | 苏州泓迅生物科技股份有限公司 | A kind of method and its application of prediction heavy chain of antibody light chain pairing probability |
CN110428870B (en) * | 2019-08-08 | 2023-03-21 | 苏州泓迅生物科技股份有限公司 | Method for predicting antibody heavy chain and light chain pairing probability and application thereof |
CN110942805A (en) * | 2019-12-11 | 2020-03-31 | 云南大学 | Insulator element prediction system based on semi-supervised deep learning |
CN111223522B (en) * | 2020-01-06 | 2023-04-28 | 西安理工大学 | Method for identifying lncRNA based on fuzzy k-mer utilization rate |
CN111223522A (en) * | 2020-01-06 | 2020-06-02 | 西安理工大学 | Method for identifying lncRNA based on fuzzy k-mer utilization rate |
CN111933217B (en) * | 2020-06-17 | 2024-04-05 | 西安电子科技大学 | DNA motif length prediction method and prediction system based on deep learning |
CN111933217A (en) * | 2020-06-17 | 2020-11-13 | 西安电子科技大学 | DNA (deoxyribonucleic acid) motif length prediction method and prediction system based on deep learning |
CN112711907A (en) * | 2020-12-29 | 2021-04-27 | 浙江大学 | Energy consumption-based manufacturing equipment yield analysis method |
CN112786112A (en) * | 2021-01-19 | 2021-05-11 | 中山大学 | Prediction method and system for combination of lncRNA and target DNA |
CN112786112B (en) * | 2021-01-19 | 2023-10-20 | 中山大学 | Method and system for predicting combination of lncRNA and target DNA |
CN113658643A (en) * | 2021-07-22 | 2021-11-16 | 西安理工大学 | Prediction method for lncRNA and mRNA based on attention mechanism |
CN113658643B (en) * | 2021-07-22 | 2024-02-13 | 西安理工大学 | Method for predicting lncRNA and mRNA based on attention mechanism |
CN113808671B (en) * | 2021-08-30 | 2024-02-06 | 西安理工大学 | Method for distinguishing coding ribonucleic acid from non-coding ribonucleic acid based on deep learning |
CN113808671A (en) * | 2021-08-30 | 2021-12-17 | 西安理工大学 | Method for distinguishing coding ribonucleic acid from non-coding ribonucleic acid based on deep learning |
CN113851190A (en) * | 2021-11-01 | 2021-12-28 | 四川大学华西医院 | Heterogeneous mRNA sequence optimization method |
CN113851190B (en) * | 2021-11-01 | 2023-07-21 | 四川大学华西医院 | Heterogeneous mRNA sequence optimization method |
CN115579062A (en) * | 2022-11-17 | 2023-01-06 | 南京腾鸿医疗科技有限公司 | Specific promoter expression information prediction method based on convolutional neural network |
CN116417068A (en) * | 2023-02-03 | 2023-07-11 | 中国人民解放军军事科学院军事医学研究院 | Method, system and device for predicting laboratory source of engineering nucleic acid sequence based on deep learning |
CN116417068B (en) * | 2023-02-03 | 2024-01-16 | 中国人民解放军军事科学院军事医学研究院 | Method, system and device for predicting laboratory source of engineering nucleic acid sequence based on deep learning |
Also Published As
Publication number | Publication date |
---|---|
CN108595913B (en) | 2021-07-06 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||