CN111276187B - Gene expression profile feature learning method based on self-encoder

Gene expression profile feature learning method based on self-encoder

Info

Publication number
CN111276187B
CN111276187B (application CN202010029068.8A)
Authority
CN
China
Prior art keywords
layer
data
sample
encoder
gene expression
Prior art date
Legal status
Active
Application number
CN202010029068.8A
Other languages
Chinese (zh)
Other versions
CN111276187A (en)
Inventor
彭绍亮
张磊
李非
毕夏安
周德山
肖港
辛彬
王子航
Current Assignee
Hunan University
Original Assignee
Hunan University
Priority date
Filing date
Publication date
Application filed by Hunan University
Priority to CN202010029068.8A
Publication of CN111276187A
Application granted
Publication of CN111276187B
Legal status: Active

Classifications

    • G: PHYSICS
    • G16: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B: BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B 30/00: ICT specially adapted for sequence analysis involving nucleotides or amino acids
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/04: Architecture, e.g. interconnection topology
    • G06N 3/045: Combinations of networks
    • G: PHYSICS
    • G16: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B: BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B 50/00: ICT programming tools or database systems specially adapted for bioinformatics

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Biophysics (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Evolutionary Biology (AREA)
  • Biotechnology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Computational Linguistics (AREA)
  • Molecular Biology (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Chemical & Material Sciences (AREA)
  • Analytical Chemistry (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Computing Systems (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Data Mining & Analysis (AREA)
  • Biomedical Technology (AREA)
  • Artificial Intelligence (AREA)
  • Bioethics (AREA)
  • Databases & Information Systems (AREA)
  • Investigating Or Analysing Biological Materials (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention belongs to the field of computer science and discloses a gene expression profile feature learning method based on an autoencoder. The method exploits the ability of deep learning to capture data characteristics accurately: it combines the multi-channel structure of convolutional neural networks with the feature learning capability of autoencoders in a new multi-channel autoencoder model, which learns a feature representation of relatively low dimensionality that effectively distinguishes the original gene expression profile data.

Description

Gene expression profile feature learning method based on self-encoder
Technical Field
The invention relates to a gene expression profile feature learning method based on an autoencoder, and belongs to the field of computer science.
Background
The LINCS gene expression profile data cover the genome-wide expression levels of human cell lines under many different experimental conditions, capturing the overall cellular state of in vitro cell models under different drugs, doses and time points, and providing the basic data necessary for computational analysis and experimental validation. The data cover the silencing and over-expression of more than 4,000 genes and the perturbation of 77 typical cell lines with more than 7,000 small-molecule compounds, including the necessary biological replicate experiments. However, even slight differences in experimental conditions lead to large differences in the gene expression levels of the same cell line. Moreover, the human genome comprises more than twenty thousand genes: if all of their expression levels are used directly as sample features, redundancy may exist among the features, and using such high-dimensional features directly increases the difficulty of model training and hurts model performance. Although many deep learning models have been applied to gene expression data, which are characteristically high-dimensional with small sample sizes, their computational complexity is relatively high. Therefore, before a deep learning model is used, a suitable feature representation learning method is required to reduce the dimensionality of the gene expression data.
To solve this problem, it is necessary to devise a gene expression profile feature learning method that extracts, from high-dimensional sparse gene expression profile data, a feature representation of relatively low dimensionality that still distinguishes the original data effectively, so that it can be better applied to tasks such as gene expression profile classification.
Disclosure of Invention
In order to achieve the above object, the present invention provides a gene expression profile feature learning method based on an autoencoder. Feature learning divides broadly into traditional feature learning and neural-network-based feature learning. Traditional feature learning learns a projection matrix that maps the data linearly from a high-dimensional feature space into a low-dimensional space, improving the separability of the data and thereby obtaining a better representation of it. Neural-network-based feature learning improves the learning capability through linear transformations of the input data combined with the nonlinear activation functions of the neurons. The invention exploits the ability of deep learning to capture data characteristics accurately: it combines the multi-channel structure of convolutional neural networks with the feature learning capability of autoencoders in a new multi-channel autoencoder model, which learns a feature representation of relatively low dimensionality that effectively distinguishes the original gene expression profile data.
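For contrast, the traditional projection-matrix approach described above can be sketched with a PCA projection; this is a generic illustration under hypothetical data, not part of the claimed method.

```python
import numpy as np
from sklearn.decomposition import PCA

# Hypothetical gene expression matrix: 100 samples x 978 landmark genes.
X = np.random.rand(100, 978)

# Learn a projection matrix and map the data linearly into a low-dimensional
# space; PCA stands in for the projection-matrix methods described above.
pca = PCA(n_components=8)
Z = pca.fit_transform(X)   # (100, 8) low-dimensional representation
W = pca.components_        # the learned projection matrix, shape (8, 978)
```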
The technical scheme adopted by the invention is as follows:
a gene expression profile characteristic learning method based on an autoencoder comprises the following steps:
step 1, importing the gene expression profile file from the Level 3 stage of the LINCS data set, and extracting the required data according to cell line, experiment type, perturbation condition and the like;
step 2, processing the extracted sample data into input data that can be used directly by the feature learning model;
step 3, extracting from the high-dimensional sparse gene expression profile data a feature representation of relatively low dimensionality that effectively distinguishes the original data;
This step combines the multi-channel structure of the convolutional neural network with the feature learning capability of the autoencoder in a new multi-channel autoencoder model, whose performance is greatly improved over the original autoencoder.
Step 4, verifying the quality of the features predicted by the model: taking the gene expression profile experimental group as the classification label, the prediction accuracy is computed with a KNN classification method; the higher the accuracy, the better the features learned by the multi-channel autoencoder perform in the classification task.
Preferably, in step 1, the required cell line, experiment type, perturbation condition and other information are screened as needed from the gene information, cell line information and gene expression profile sample information in the LINCS data set attachments; because each gene expression profile sample record is unique, the sample information serves as the screening criterion; finally, the samples holding the required data are located by their sample identifiers, and their expression profile values are the required data.
Preferably, the step 2 comprises the following steps:
step 2.1, reading the extracted gene expression profile sample data;
Step 2.2, manually adding label information according to sample class: the samples of the first class read are labeled 0, those of the second class are labeled 1, and so on for each subsequent class.
Preferably, the step 3 comprises the following steps:
step 3.1, reading the sample data and sample label data, and dividing them into a training set and a validation set;
step 3.2, initializing an Adam optimizer, setting the learning rate to 0.003, and selecting mean absolute error (MAE) as the loss function;
step 3.3, constructing a deep multi-channel autoencoder model, where the multi-channel autoencoder (MCAE) comprises an encoding process and a decoding process and adopts a directed three-layer neural network structure: an input layer, a hidden layer and an output layer, the input layer and the output layer having the same dimensionality n, and the hidden layer having dimensionality m;
step 3.4, feeding the training set into the multi-channel autoencoder for model training, saving the best weights according to the loss on the validation set, and loading the best weights after training to obtain the optimal model;
step 3.5, predicting the feature representation of the sample data using the optimal model.
Preferably, in step 3.3, the encoding process from the input layer to the hidden layer reduces the dimensionality of the input data, and the resulting code serves as the feature representation of the data. Denoting the encoding function by $f$:

$$h = \frac{1}{n}\sum_{i=1}^{n} h_i, \qquad h_i = f_i(x) = S_f(w_i x + p)$$

where $h$ is the final feature representation, $h_i$ is the feature representation of the $i$-th channel, $n$ is the number of channels, $S_f$ is the encoder activation function, taken as the ReLU function $S(x) = \max(0, x)$, $w_i$ is the weight matrix between the $i$-th channel of the input layer and the hidden layer, and $p \in \mathbb{R}^m$ is a bias term. The decoding process, from the hidden layer to the output layer, reconstructs the input data from the code produced by the hidden layer. Denoting the decoding function by $g$:

$$y = \frac{1}{n}\sum_{i=1}^{n} y_i, \qquad y_i = g(h_i) = S_g(w h_i + q)$$

where $y$ is the final reconstructed data, $y_i$ is the reconstructed data of the $i$-th channel, $n$ is the number of channels, $S_g$ is the decoder activation function, taken as the sigmoid function $S(x) = 1/(1 + e^{-x})$, $w$ is the weight matrix between the hidden layer and the output layer, and $q \in \mathbb{R}^n$ is a bias term.
superposing a plurality of multi-channel self-encoders layer by layer to obtain a depth multi-channel self-encoder model containing a plurality of hidden layers, and expressing by 2-dimensional features: a first layer 978 × 1 of the depth multichannel self-encoder, wherein 978 is a feature dimension, 1 is a channel number, 5 times of weights are initialized by using different random seeds, and the feature of 978 dimensions is input by using a Dense layer respectively to obtain a second layer 128 × 5, namely 5 layers 128 × 1; the second layer 128 x 5 initializes 4 times of weights with different random seeds, the first weight uses a Dense layer to input 5 128-dimensional features respectively to obtain 5 2-dimensional features, then the average value is calculated according to the corresponding position to obtain the first 2 x 1 of the third layer, and the rest is repeated to obtain the second, third and fourth 2 x 1 of the third layer, namely the third layer 2 x 4, the fourth layer 128 x 5 and the fifth layer 978 x 1.
Preferably, in step 4, the feature data and sample label data are first read and divided into a training set and a test set; a KNN classifier is then instantiated, with the n_neighbors parameter taken from 1 to 10 in turn; the KNN model is fitted on the training set; and finally the trained KNN model predicts the sample labels of the test set, the proportion of correctly predicted labels being the prediction accuracy.
Drawings
FIG. 1 is a schematic diagram of a multi-channel self-encoder according to the present invention;
FIG. 2 is a schematic diagram of a depth multi-channel self-encoder according to the present invention.
Detailed Description
The invention is further illustrated with reference to the following figures and examples.
A gene expression profile feature learning method based on an autoencoder comprises the following steps:
step 1, importing the gene expression profile file from the Level 3 stage of the LINCS data set, and extracting the required data according to cell line, experiment type, perturbation condition and the like;
step 2, processing the extracted sample data into input data that can be used directly by the feature learning model;
step 3, extracting from the high-dimensional sparse gene expression profile data a feature representation of relatively low dimensionality that effectively distinguishes the original data;
This step combines the multi-channel structure of the convolutional neural network with the feature learning capability of the autoencoder in a new multi-channel autoencoder model, whose performance is greatly improved over the original autoencoder.
Step 4, verifying the quality of the features predicted by the model: taking the gene expression profile experimental group as the classification label, the prediction accuracy is computed with a KNN classification method; the higher the accuracy, the better the features learned by the multi-channel autoencoder perform in the classification task.
Preferably, in step 1, the required cell line, experiment type, perturbation condition and other information are screened as needed from the gene information, cell line information and gene expression profile sample information in the LINCS data set attachments; because each gene expression profile sample record is unique, the sample information serves as the screening criterion; finally, the samples holding the required data are located by their sample identifiers, and their expression profile values are the required data.
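As a concrete sketch of this screening step, the following Python snippet reads a Level 3 GCTX file with the cmapPy package and filters samples by cell line, experiment type and perturbation. The file names (patterned on the public GSE92742 LINCS release), the metadata columns cell_id, pert_type and pert_iname, and the chosen filter values are illustrative assumptions, not details fixed by the patent.

```python
import pandas as pd
from cmapPy.pandasGEXpress.parse import parse  # pip install cmapPy

# Sample annotation file distributed alongside the Level 3 data
# (file and column names follow the GSE92742 release; treat as placeholders).
inst_info = pd.read_csv("GSE92742_Broad_LINCS_inst_info.txt", sep="\t")

# Screen by cell line, experiment type and perturbation condition.
# Each sample identifier (inst_id) is unique, so it serves as the lookup key.
wanted = inst_info[
    (inst_info["cell_id"] == "MCF7")
    & (inst_info["pert_type"] == "trt_cp")
    & (inst_info["pert_iname"].isin(["vorinostat", "sirolimus"]))
]

# Read only the expression-profile columns of the selected samples.
gctoo = parse("GSE92742_Broad_LINCS_Level3.gctx", cid=list(wanted["inst_id"]))
samples = gctoo.data_df.T  # rows = samples, columns = 978 landmark genes
samples.to_csv("extracted_samples.csv")
```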
Preferably, the step 2 comprises the following steps:
step 2.1, reading the extracted gene expression profile sample data;
Step 2.2, manually adding label information according to sample class: the samples of the first class read are labeled 0, those of the second class are labeled 1, and so on for each subsequent class.
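A minimal sketch of this labeling scheme, assuming the extracted profiles of each sample class are stored in separate CSV files (the file names are hypothetical); classes receive the labels 0, 1, 2, ... in the order in which they are read.

```python
import numpy as np
import pandas as pd

# One file of extracted expression profiles per sample class
# (hypothetical names); reading order determines the label.
class_files = ["class_vorinostat.csv", "class_sirolimus.csv"]

frames, labels = [], []
for label, path in enumerate(class_files):  # first class -> 0, second -> 1, ...
    df = pd.read_csv(path, index_col=0)
    frames.append(df)
    labels.extend([label] * len(df))

X = pd.concat(frames).to_numpy(dtype="float32")  # shape (n_samples, 978)
y = np.asarray(labels)                           # manually added label column
```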
Preferably, the step 3 comprises the following steps:
step 3.1, reading the sample data and sample label data, and dividing them into a training set and a validation set;
step 3.2, initializing an Adam optimizer, setting the learning rate to 0.003, and selecting mean absolute error (MAE) as the loss function;
step 3.3, constructing a deep multi-channel autoencoder model, where the multi-channel autoencoder (MCAE) comprises an encoding process and a decoding process and adopts a directed three-layer neural network structure: an input layer, a hidden layer and an output layer, the input layer and the output layer having the same dimensionality n, and the hidden layer having dimensionality m;
step 3.4, feeding the training set into the multi-channel autoencoder for model training, saving the best weights according to the loss on the validation set, and loading the best weights after training to obtain the optimal model;
step 3.5, predicting the feature representation of the sample data using the optimal model.
Preferably, in step 3.3, the encoding process from the input layer to the hidden layer reduces the dimensionality of the input data, and the resulting code serves as the feature representation of the data. Denoting the encoding function by $f$:

$$h = \frac{1}{n}\sum_{i=1}^{n} h_i, \qquad h_i = f_i(x) = S_f(w_i x + p)$$

where $h$ is the final feature representation, $h_i$ is the feature representation of the $i$-th channel, $n$ is the number of channels, $S_f$ is the encoder activation function, taken as the ReLU function $S(x) = \max(0, x)$, $w_i$ is the weight matrix between the $i$-th channel of the input layer and the hidden layer, and $p \in \mathbb{R}^m$ is a bias term. The decoding process, from the hidden layer to the output layer, reconstructs the input data from the code produced by the hidden layer. Denoting the decoding function by $g$:

$$y = \frac{1}{n}\sum_{i=1}^{n} y_i, \qquad y_i = g(h_i) = S_g(w h_i + q)$$

where $y$ is the final reconstructed data, $y_i$ is the reconstructed data of the $i$-th channel, $n$ is the number of channels, $S_g$ is the decoder activation function, taken as the sigmoid function $S(x) = 1/(1 + e^{-x})$, $w$ is the weight matrix between the hidden layer and the output layer, and $q \in \mathbb{R}^n$ is a bias term. A schematic diagram of the network structure is shown in fig. 1.
A deep multi-channel autoencoder model containing several hidden layers is obtained by superposing multiple multi-channel autoencoders layer by layer, illustrated with a 2-dimensional feature representation (as shown in fig. 2): the first layer of the deep multi-channel autoencoder is 978 × 1, where 978 is the feature dimension and 1 is the number of channels; five sets of weights are initialized with different random seeds, and each applies a Dense layer to the 978-dimensional input, giving the second layer of 128 × 5, i.e. five 128 × 1 blocks. From the second layer, four sets of weights are initialized with different random seeds; the first set applies a Dense layer to each of the five 128-dimensional features to obtain five 2-dimensional features, which are then averaged position-wise to give the first 2 × 1 block of the third layer. Repeating this for the remaining sets gives the second, third and fourth 2 × 1 blocks, so that the third layer is 2 × 4, followed by the fourth layer of 128 × 5 and the fifth layer of 978 × 1.
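The following Keras sketch assembles the five-layer embodiment just described (978 × 1 → 128 × 5 → 2 × 4 → 128 × 5 → 978 × 1), together with the Adam optimizer, 0.003 learning rate and MAE loss of step 3.2 and the best-weight checkpointing of step 3.4. The patent lists only the layer sizes, so the seed values, the way the decoder channels recombine, and the stand-in training data below are assumptions of this sketch.

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import Model, initializers, layers

IN_DIM, HID, CODE = 978, 128, 2   # layer sizes from the embodiment
N_ENC, N_CODE = 5, 4              # encoder channels and code channels

inp = layers.Input(shape=(IN_DIM,))                       # layer 1: 978 x 1

# Layer 2 (128 x 5): five Dense branches whose weights are initialized
# with different random seeds (seed values here are arbitrary).
branches = [
    layers.Dense(HID, activation="relu",
                 kernel_initializer=initializers.GlorotUniform(seed=s))(inp)
    for s in range(N_ENC)
]

# Layer 3 (2 x 4): each of four weight sets maps every 128-dim branch to
# 2 dims with one shared Dense layer, then averages the five outputs
# position-wise, as described above.
codes = []
for s in range(N_CODE):
    to_code = layers.Dense(CODE, activation="relu",
                           kernel_initializer=initializers.GlorotUniform(seed=100 + s))
    codes.append(layers.Average()([to_code(b) for b in branches]))
code = layers.Concatenate()(codes)                        # 2 x 4 = 8 values

# Layers 4 and 5 (128 x 5, then 978 x 1): the decoder mirrors the encoder;
# this mirroring is an assumption, since the text lists only the sizes.
dec = [
    layers.Dense(HID, activation="relu",
                 kernel_initializer=initializers.GlorotUniform(seed=200 + s))(code)
    for s in range(N_ENC)
]
out = layers.Average()([layers.Dense(IN_DIM, activation="sigmoid")(b) for b in dec])

autoencoder = Model(inp, out)
encoder = Model(inp, code)        # produces the learned feature representation
autoencoder.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.003),
                    loss="mae")   # step 3.2: Adam, learning rate 0.003, MAE

# Step 3.4 sketch with dummy stand-in data; real profiles come from step 2.
X = np.random.rand(256, IN_DIM).astype("float32")
X_tr, X_val = X[:200], X[200:]
ckpt = tf.keras.callbacks.ModelCheckpoint(
    "best.weights.h5", monitor="val_loss",
    save_best_only=True, save_weights_only=True)
autoencoder.fit(X_tr, X_tr, validation_data=(X_val, X_val),
                epochs=10, batch_size=64, callbacks=[ckpt])
autoencoder.load_weights("best.weights.h5")
features = encoder.predict(X)     # step 3.5: feature representation, (256, 8)
```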
Preferably, in step 4, the feature data and sample label data are first read and divided into a training set and a test set; a KNN classifier is then instantiated, with the n_neighbors parameter taken from 1 to 10 in turn; the KNN model is fitted on the training set; and finally the trained KNN model predicts the sample labels of the test set, the proportion of correctly predicted labels being the prediction accuracy.
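A scikit-learn sketch of this evaluation; the encoder model and the labeled arrays X and y are carried over from the sketches above and are therefore assumptions rather than interfaces fixed by the patent.

```python
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Feature data from the trained encoder plus the labels added in step 2.2
# (encoder, X and y carried over from the earlier sketches).
features = encoder.predict(X)
X_tr, X_te, y_tr, y_te = train_test_split(
    features, y, test_size=0.2, random_state=0, stratify=y)

for k in range(1, 11):              # n_neighbors taken from 1 to 10 in turn
    knn = KNeighborsClassifier(n_neighbors=k).fit(X_tr, y_tr)
    acc = knn.score(X_te, y_te)     # fraction of correctly predicted labels
    print(f"n_neighbors={k}: accuracy={acc:.3f}")
```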
It will be appreciated by those of ordinary skill in the art that the embodiments described here are intended to help the reader understand the principles of the invention; the invention is not limited to the specifically recited embodiments and examples. Those skilled in the art can make various other specific changes and combinations based on the teachings of the present invention without departing from its spirit, and such changes and combinations remain within the scope of the invention.

Claims (1)

1. A gene expression profile feature learning method based on an autoencoder, characterized by comprising the following steps:
step 1, importing the gene expression profile file from the Level 3 stage of the LINCS data set, and extracting the required data according to cell line, experiment type, perturbation condition and the like;
screening the required cell line, experiment type, perturbation condition and other information as needed from the gene information, cell line information and gene expression profile sample information in the LINCS data set attachments; because each gene expression profile sample record is unique, taking the sample information as the screening criterion; and finally locating the samples holding the required data by their sample identifiers, the expression profile values of these samples being the required data;
step 2, processing the extracted sample data into input data that can be used directly by the feature learning model, which specifically comprises the following steps:
step 2.1, reading the extracted gene expression profile sample data;
step 2.2, manually adding label information according to sample class: the samples of the first class read are labeled 0, those of the second class are labeled 1, and so on for each subsequent class;
step 3, extracting from the high-dimensional sparse gene expression profile data a feature representation of relatively low dimensionality that effectively distinguishes the original data, which specifically comprises the following steps:
step 3.1, reading the sample data and sample label data, and dividing them into a training set and a validation set;
step 3.2, initializing an Adam optimizer, setting the learning rate to 0.003, and selecting mean absolute error (MAE) as the loss function;
step 3.3, constructing a deep multi-channel autoencoder model, wherein the multi-channel autoencoder comprises an encoding process and a decoding process and adopts a directed three-layer neural network structure: an input layer, a hidden layer and an output layer, the input layer and the output layer having the same dimensionality n, and the hidden layer having dimensionality m;
the encoding process from the input layer to the hidden layer reduces the dimensionality of the input data, the resulting code serving as the feature representation of the data; denoting the encoding function by $f$:

$$h = \frac{1}{n}\sum_{i=1}^{n} h_i, \qquad h_i = f_i(x) = S_f(w_i x + p)$$

where $h$ is the final feature representation, $h_i$ is the feature representation of the $i$-th channel, $n$ is the number of channels, $S_f$ is the encoder activation function, taken as the ReLU function $S(x) = \max(0, x)$, $w_i$ is the weight matrix between the $i$-th channel of the input layer and the hidden layer, and $p \in \mathbb{R}^m$ is a bias term; the decoding process from the hidden layer to the output layer reconstructs the input data from the code produced by the hidden layer; denoting the decoding function by $g$:

$$y = \frac{1}{n}\sum_{i=1}^{n} y_i, \qquad y_i = g(h_i) = S_g(w h_i + q)$$

where $y$ is the final reconstructed data, $y_i$ is the reconstructed data of the $i$-th channel, $n$ is the number of channels, $S_g$ is the decoder activation function, taken as the sigmoid function $S(x) = 1/(1 + e^{-x})$, $w$ is the weight matrix between the hidden layer and the output layer, and $q \in \mathbb{R}^n$ is a bias term;
superposing a plurality of multi-channel autoencoders layer by layer to obtain a deep multi-channel autoencoder model containing a plurality of hidden layers, illustrated with a 2-dimensional feature representation: the first layer of the deep multi-channel autoencoder is 978 × 1, where 978 is the feature dimension and 1 is the number of channels; five sets of weights are initialized with different random seeds, and each applies a Dense layer to the 978-dimensional input, giving the second layer of 128 × 5, i.e. five 128 × 1 blocks; from the second layer, four sets of weights are initialized with different random seeds, the first set applying a Dense layer to each of the five 128-dimensional features to obtain five 2-dimensional features, which are then averaged position-wise to give the first 2 × 1 block of the third layer; repeating this for the remaining sets gives the second, third and fourth 2 × 1 blocks, so that the third layer is 2 × 4, followed by the fourth layer of 128 × 5 and the fifth layer of 978 × 1;
step 3.4, feeding the training set into the multi-channel autoencoder for model training, saving the best weights according to the loss on the validation set, and loading the best weights after training to obtain the optimal model;
step 3.5, predicting the feature representation of the sample data using the optimal model;
step 4, verifying the quality of the features predicted by the model: taking the gene expression profile experimental group as the classification label, the prediction accuracy is computed with a KNN classification method, where higher accuracy indicates that the features learned by the multi-channel autoencoder perform better in the classification task;
firstly, reading the feature data and sample label data and dividing them into a training set and a test set; then instantiating a KNN classifier, taking the n_neighbors parameter from 1 to 10 in turn; then fitting the KNN model on the training set; and finally using the trained KNN model to predict the sample labels of the test set, the proportion of correctly predicted labels being the prediction accuracy.
CN202010029068.8A · Priority 2020-01-12 · Filed 2020-01-12 · Gene expression profile feature learning method based on self-encoder · Active · CN111276187B (en)

Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
CN202010029068.8A (CN111276187B) | 2020-01-12 | 2020-01-12 | Gene expression profile feature learning method based on self-encoder

Applications Claiming Priority (1)

Application Number | Priority Date | Filing Date | Title
CN202010029068.8A (CN111276187B) | 2020-01-12 | 2020-01-12 | Gene expression profile feature learning method based on self-encoder

Publications (2)

Publication Number | Publication Date
CN111276187A (en) | 2020-06-12
CN111276187B (en) | 2021-09-10

Family

ID=71001828

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010029068.8A Active CN111276187B (en) 2020-01-12 2020-01-12 Gene expression profile feature learning method based on self-encoder

Country Status (1)

Country Link
CN (1) CN111276187B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111785326B (en) * 2020-06-28 2024-02-06 西安电子科技大学 Gene expression profile prediction method after drug action based on generation of antagonism network
CN111882066B (en) * 2020-07-23 2023-11-14 浙江大学 Inverse fact reasoning equipment based on deep characterization learning
CN114496303A (en) * 2022-01-06 2022-05-13 湖南大学 Anticancer drug screening method based on multichannel neural network
CN117095744A (en) * 2023-08-21 2023-11-21 上海信诺佰世医学检验有限公司 Copy number variation detection method based on single-sample high-throughput transcriptome sequencing data

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104819846A (en) * 2015-04-10 2015-08-05 北京航空航天大学 Rolling bearing sound signal fault diagnosis method based on short-time Fourier transform and sparse laminated automatic encoder
CN110561192A (en) * 2019-09-11 2019-12-13 大连理工大学 Deep hole boring cutter state monitoring method based on stacking self-encoder

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104819846A (en) * 2015-04-10 2015-08-05 北京航空航天大学 Rolling bearing sound signal fault diagnosis method based on short-time Fourier transform and sparse laminated automatic encoder
CN110561192A (en) * 2019-09-11 2019-12-13 大连理工大学 Deep hole boring cutter state monitoring method based on stacking self-encoder

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
Learning Classifiers from Synthetic Data Using a Multichannel Autoencoder; Xi Zhang et al.; arXiv:1503.03163v1; 2015-03-11; pp. 1-11 *
Learning influential genes on cancer gene expression data with stacked denoising autoencoders; Teixeira et al.; 2017 IEEE International Conference on Bioinformatics and Biomedicine (BIBM); 2017-12-18; pp. 1-5 *

Also Published As

Publication Number | Publication Date
CN111276187A (en) | 2020-06-12

Similar Documents

Publication Publication Date Title
CN111276187B (en) Gene expression profile feature learning method based on self-encoder
CN111667884B (en) Convolutional neural network model for predicting protein interactions using protein primary sequences based on attention mechanism
CN107622182B (en) Method and system for predicting local structural features of protein
CN109086805B (en) Clustering method based on deep neural network and pairwise constraints
CN110751044B (en) Urban noise identification method based on deep network migration characteristics and augmented self-coding
CN107742061B (en) Protein interaction prediction method, system and device
CN111538761A (en) Click rate prediction method based on attention mechanism
CN114927162A (en) Multi-set correlation phenotype prediction method based on hypergraph representation and Dirichlet distribution
CN112699960A (en) Semi-supervised classification method and equipment based on deep learning and storage medium
CN113889192B (en) Single-cell RNA-seq data clustering method based on deep noise reduction self-encoder
CN116386899A (en) Graph learning-based medicine disease association relation prediction method and related equipment
Wang et al. Human mitochondrial genome compression using machine learning techniques
CN116386729A (en) scRNA-seq data dimension reduction method based on graph neural network
CN114880538A (en) Attribute graph community detection method based on self-supervision
CN114783526A (en) Depth unsupervised single cell clustering method based on Gaussian mixture graph variation self-encoder
Tabus et al. Classification and feature gene selection using the normalized maximum likelihood model for discrete regression
CN114093419A (en) RBP binding site prediction method based on multitask deep learning
CN113362900A (en) Mixed model for predicting N4-acetylcytidine
CN112735604B (en) Novel coronavirus classification method based on deep learning algorithm
Tabus et al. Normalized maximum likelihood models for Boolean regression with application to prediction and classification in genomics
CN115019876A (en) Gene expression prediction method and device
CN115579068A (en) Pre-training and deep clustering-based metagenome species reconstruction method
CN115348182A (en) Long-term spectrum prediction method based on depth stack self-encoder
CN111599412B (en) DNA replication initiation region identification method based on word vector and convolutional neural network
CN114334013A (en) Single cell clustering method, device, equipment and readable storage medium

Legal Events

Date Code Title Description
PB01: Publication
SE01: Entry into force of request for substantive examination
GR01: Patent grant