CN113643758B - Prediction method for obtaining beta-lactam drug resistance resistant gene facing enterobacter - Google Patents

Prediction method for obtaining beta-lactam drug resistance resistant gene facing enterobacter

Info

Publication number
CN113643758B
CN113643758B CN202111113150.XA CN202111113150A CN113643758B CN 113643758 B CN113643758 B CN 113643758B CN 202111113150 A CN202111113150 A CN 202111113150A CN 113643758 B CN113643758 B CN 113643758B
Authority
CN
China
Prior art keywords
cnn
training set
hmm
enterobacter
prediction
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202111113150.XA
Other languages
Chinese (zh)
Other versions
CN113643758A (en)
Inventor
廖晓萍
刘雅红
方畅
吴精乙
高源�
凌宏韬
吴名柔
吴玉寒
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
South China Agricultural University
Original Assignee
South China Agricultural University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by South China Agricultural University
Priority to CN202111113150.XA
Publication of CN113643758A
Application granted
Publication of CN113643758B
Active legal status
Anticipated expiration

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B30/00 ICT specially adapted for sequence analysis involving nucleotides or amino acids
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/29 Graphical models, e.g. Bayesian networks
    • G06F18/295 Markov models or related models, e.g. semi-Markov models; Markov random fields; Networks embedding Markov models
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B40/00 ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B5/00 ICT specially adapted for modelling or simulations in systems biology, e.g. gene-regulatory networks, protein interaction networks or metabolic networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Health & Medical Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Biophysics (AREA)
  • General Health & Medical Sciences (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Medical Informatics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Software Systems (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Biotechnology (AREA)
  • Molecular Biology (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • Computing Systems (AREA)
  • Computational Linguistics (AREA)
  • Physiology (AREA)
  • Chemical & Material Sciences (AREA)
  • Analytical Chemistry (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Bioethics (AREA)
  • Databases & Information Systems (AREA)
  • Epidemiology (AREA)
  • Public Health (AREA)
  • Apparatus Associated With Microorganisms And Enzymes (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a prediction method for the acquisition of beta-lactam drug resistance genes in enterobacter bacteria, comprising the following steps: step one, constructing a training set and a test set; step two, randomly segmenting sequences to form k-mers; step three, constructing a prediction model; step four, outputting the optimal result of each prediction; and step five, optimizing, testing and predicting. For beta-lactam drug resistance in enterobacter bacteria, the method provides accurate prediction, credible results, a reasonable algorithm and a stable model; it can identify or classify ARGs that differ greatly from those in the reference database, improves prediction performance, and realizes a deep learning application led by biological significance.

Description

Prediction method for obtaining beta-lactam drug resistance resistant gene facing enterobacter
Technical Field
The invention belongs to the field of bioinformatics, and particularly relates to a prediction method for the acquisition of beta-lactam drug resistance genes in enterobacter bacteria.
Background
With the widespread use and abuse of antibiotics in clinical and agricultural settings, the resistance of bacterial pathogens to antibiotics (antibiotic resistance) has become an urgent public health problem. Antibiotic resistance genes (ARGs) must therefore be detected and predicted in clinical and environmental samples, and their types determined, so that targeted therapeutic or control measures can be formulated. Rapid identification of ARGs in pathogens also helps to optimize antibacterial therapy. Culture-based antibiotic susceptibility testing (AST) provides phenotypic resistance results for microorganisms, but it may take weeks and is less informative than sequencing-based methods about the epidemiology of ARGs. Culture-based methods are also not applicable to non-culturable bacteria. Genomic sequence data provide another view of resistance, enabling researchers to assess the genetic mechanisms that confer resistance in each strain.
Explanation of terms:
k-mers: units, i.e. subsequences of length k cut from a sequence.
Disclosure of Invention
To solve the above problems, the invention discloses a prediction method for the acquisition of beta-lactam drug resistance genes in enterobacter bacteria. For beta-lactam drug resistance in enterobacter bacteria, the method provides accurate prediction, credible results, a reasonable algorithm and a stable model; it can identify or classify ARGs that differ greatly from those in the reference database, improves prediction performance, and realizes a deep learning application led by biological significance.
To achieve this purpose, the technical scheme of the invention is as follows:
a prediction method for the acquisition of beta-lactam drug resistance genes in enterobacteriaceae, comprising the following steps:
step one, obtaining enterobacter drug-resistant phenotype-genotype data, and dividing the data into a training set and a test set according to whether the enterobacter gene data contain drug resistance genes;
step two, randomly segmenting the enterobacter drug-resistant phenotype-genotype sequences in the training set to form k-mers of a preset length; extracting sequence features of the k-mers by feature engineering; and performing grey relational analysis (GRA) on the sequence features, wherein the grey relational analysis uses the following formula:
$$\zeta_{i0}(k) = \frac{\min_i \min_k |x_0(k) - x_i(k)| + \rho \max_i \max_k |x_0(k) - x_i(k)|}{|x_0(k) - x_i(k)| + \rho \max_i \max_k |x_0(k) - x_i(k)|}$$
wherein $\zeta_{i0}(k)$ represents the grey relational coefficient and k denotes the k-th point; the index 0 denotes the target vector and i denotes a reference vector; $x_0(k)$ represents the value of the target vector at point k and $x_i(k)$ represents the value of the i-th reference vector at point k; $\min_i \min_k$ means traversing all vectors and points and taking the minimum value of $|x_0(k) - x_i(k)|$; $\max_i \max_k$ means traversing all vectors and points and taking the maximum value of $|x_0(k) - x_i(k)|$; and $\rho$ represents a weight coefficient;
step three, constructing a prediction model: constructing a CNN-HMM mixed model as the prediction model;
step four, inputting the training set into the CNN-HMM mixed model and optimizing the CNN-HMM mixed model, wherein the optimization consists of adding a self-attention mechanism, an adaptive enhancement mechanism (AdaBoost) and an extreme adaptive enhancement mechanism (XGBoost), finally obtaining an optimized CNN-HMM mixed model;
step five, testing the optimized CNN-HMM mixed model on the test set;
step six, if the accuracy is higher than a preset threshold, the final CNN-HMM mixed model is obtained; otherwise, the training set is expanded and steps one to five are repeated until the final CNN-HMM mixed model is obtained;
step seven, using the final CNN-HMM mixed model to predict whether an enterobacter bacteria sample has acquired a beta-lactam drug resistance gene.
In a further improvement, in step one, the enterobacter drug-resistant phenotype-genotype data are obtained from the NCBI and EMBL-EBI databases, and genes associated with non-acquired drug resistance are removed from the data.
In a further improvement, the k-mers have a length of 7-10 bp or 14-16 bp.
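One way to read "randomly segmenting" a sequence into k-mers of a preset length is sketched below. It assumes that random segmentation means drawing random start positions and cutting a fragment of length k at each one; the patent does not spell out the sampling scheme, so this reading, the function names, and the example sequence are all hypothetical.

```python
import random
from collections import Counter


def random_kmers(sequence, k, n_samples, seed=None):
    """Cut n_samples fragments of length k at random start positions."""
    if not 1 <= k <= len(sequence):
        raise ValueError("k must be between 1 and the sequence length")
    rng = random.Random(seed)  # seeded generator for reproducible cuts
    last_start = len(sequence) - k
    starts = [rng.randint(0, last_start) for _ in range(n_samples)]
    return [sequence[s:s + k] for s in starts]


def kmer_counts(kmers):
    """Count how often each k-mer occurs (a simple sequence feature)."""
    return Counter(kmers)


if __name__ == "__main__":
    # Hypothetical nucleotide sequence; k = 7 lies in the 7-10 bp range.
    fragments = random_kmers("ATGCGTACGTTAGCATCGGA", k=7, n_samples=5, seed=0)
    print(kmer_counts(fragments))
```

The counts are only one possible feature; any other feature engineering step could consume the same list of fragments.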
In a further improvement, the CNN self-attention mechanism is introduced as follows: an embedding layer represents each k-mer as a dense vector; a convolutional layer is then established to capture local information in the sequence; a two-layer self-attention network is then constructed to capture the parts of the given protein sequence most relevant to ARG classification; finally, a fully connected layer and a softmax function generate the output, and cross-entropy loss is used as the loss function during training.
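The attention, softmax and cross-entropy stages described above can be illustrated with a small pure-Python sketch. It assumes scaled dot-product self-attention with identity query, key and value projections (no learned attention parameters), omits the convolutional layer, and uses mean pooling before the fully connected layer; the softmax here plays the role of what the text calls the flexible maximum transfer function. All function names, weights and inputs are hypothetical, and this is not presented as the network of the patent.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(vectors):
    """Scaled dot-product self-attention with identity projections."""
    dim = len(vectors[0])
    output = []
    for query in vectors:
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in vectors
        ]
        weights = softmax(scores)
        # Each output row is a weighted average of all input vectors.
        output.append(
            [sum(w * v[j] for w, v in zip(weights, vectors)) for j in range(dim)]
        )
    return output


def classify(embedded_kmers, fc_weights, fc_bias):
    """Two attention layers, mean pooling, a linear layer, then softmax."""
    hidden = self_attention(self_attention(embedded_kmers))
    pooled = [sum(column) / len(hidden) for column in zip(*hidden)]
    logits = [
        sum(w * x for w, x in zip(row, pooled)) + b
        for row, b in zip(fc_weights, fc_bias)
    ]
    return softmax(logits)


def cross_entropy(probabilities, true_index):
    """Cross-entropy loss for one example with a single true class."""
    return -math.log(probabilities[true_index])


if __name__ == "__main__":
    # Hypothetical 2-dimensional embeddings of three k-mers, two classes.
    probs = classify([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]],
                     fc_weights=[[1.0, -1.0], [-1.0, 1.0]],
                     fc_bias=[0.0, 0.0])
    print(probs, cross_entropy(probs, 0))
```

In a trained model the projections and the fully connected weights would be learned by minimizing the cross-entropy loss; here they are fixed only to keep the example short.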
In a further improvement, the CNN-HMM mixed model is constructed as follows: (1) the HMM model is divided into several groups, with clearly different networks between the groups; (2) the parameters of the HMM are reduced by establishing a small hidden layer with H hidden nodes; (3) the number of hidden nodes is varied to adjust the number of parameters appropriately for training sets of various sizes; (4) the parameters of the whole CNN are used to construct the HMM, which keeps the initialization and the implementation of the network structure flexible; the parameters of the CNN include radial basis functions, multiple hidden layers, sparse connectivity, weight sharing, Gaussian prior distributions, and hyper-parameters.
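The text does not give equations for how the CNN and the HMM exchange information. As a hedged illustration only, the sketch below shows one common hybrid arrangement: a network supplies per-position state posteriors, these are divided by state priors to give scaled likelihoods, and the HMM forward algorithm combines them with transition probabilities. The function names and all numbers are hypothetical, and this arrangement is an assumption rather than the construction claimed in the patent.

```python
def scaled_likelihoods(posteriors, priors):
    """Turn network posteriors P(state | x) into scaled emission scores.

    Dividing by the state priors is the usual hybrid-model convention;
    posteriors[t][s] is the network output at position t for state s.
    """
    return [[p / q for p, q in zip(row, priors)] for row in posteriors]


def forward_likelihood(initial, transition, emission_scores):
    """HMM forward algorithm.

    initial[s]            -- probability of starting in state s
    transition[s][t]      -- probability of moving from state s to state t
    emission_scores[p][s] -- emission score of state s at position p
    """
    n_states = len(initial)
    alpha = [initial[s] * emission_scores[0][s] for s in range(n_states)]
    for scores in emission_scores[1:]:
        alpha = [
            scores[t] * sum(alpha[s] * transition[s][t] for s in range(n_states))
            for t in range(n_states)
        ]
    return sum(alpha)


if __name__ == "__main__":
    # Hypothetical two-state model (e.g. "resistance-related" vs "other").
    initial = [0.5, 0.5]
    transition = [[0.9, 0.1], [0.2, 0.8]]
    posteriors = [[0.8, 0.2], [0.7, 0.3], [0.1, 0.9]]
    scores = scaled_likelihoods(posteriors, priors=[0.5, 0.5])
    print(forward_likelihood(initial, transition, scores))
```

Comparing this score between competing HMMs, for example one per gene group, is one way such a hybrid could be used to assign a sequence to a group.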
In a further improvement, in step seven, the L2R (learning to rank) method is used to rank the prediction results of the different predictions, thereby solving the multi-label prediction problem.
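The patent does not say which learning-to-rank algorithm is used. The sketch below only illustrates the general idea under stated assumptions: candidate labels are ordered by a predicted score, and a pairwise hinge loss measures how well the relevant labels are ranked above the irrelevant ones. The function names, the margin, and the example label names are hypothetical.

```python
def rank_labels(scores):
    """Order labels from highest to lowest predicted score."""
    return sorted(scores, key=scores.get, reverse=True)


def pairwise_hinge_loss(scores, relevant, margin=1.0):
    """Mean hinge loss over (relevant, irrelevant) label pairs.

    The loss is zero when every relevant label outscores every
    irrelevant label by at least the margin.
    """
    irrelevant = [label for label in scores if label not in relevant]
    pairs = [(r, i) for r in relevant for i in irrelevant]
    if not pairs:
        return 0.0
    total = sum(max(0.0, margin - (scores[r] - scores[i])) for r, i in pairs)
    return total / len(pairs)


if __name__ == "__main__":
    # Hypothetical scores for three beta-lactam classes.
    scores = {"penicillins": 2.0, "cephalosporins": 0.5, "carbapenems": -1.0}
    print(rank_labels(scores))
    print(pairwise_hinge_loss(scores, relevant={"penicillins"}))
```

Ranking turns a multi-label problem into an ordering problem: the labels at the top of the list are reported, and the loss penalizes any irrelevant label that climbs above a relevant one.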
In a further improvement, the adaptive enhancement mechanism and the extreme adaptive enhancement mechanism are introduced into the CNN-HMM model as follows: for a training set $T = \{(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)\}$, wherein
$$x_i \in X \subseteq \mathbb{R}^n,$$
$$y_i \in Y = \{-1, +1\},$$
wherein n denotes the n-th data item, $x_i$ represents the first classification of the i-th element in the training set, $y_i$ represents the second classification of the i-th element in the training set, $\mathbb{R}^n$ represents the n-th power of the real numbers $\mathbb{R}$, and n represents the number of elements of the training set;
first, the training set weights are initialized as $D_1 = (w_{11}, w_{12}, \ldots, w_{1n})$, wherein $w_{1n}$ represents the initial weight of the n-th element of the training set;
in each round, the training set data are sampled according to that round's training set weights to obtain $T_m$; a base learner $h_m$ is obtained from $T_m$ in each round; and a final learner is finally obtained from the M constructed base learners:
$$f(x) = \sum_{m=1}^{M} \alpha_m h_m(x)$$
wherein x represents the input of the submodels, M represents the number of base learners, m is the index of the submodel, $\alpha_m$ represents the coefficient of the m-th base learner, $h_m(x)$ represents the base learner obtained from samples of the weighted training set, and $T_m$ represents the m-th training data set.
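The boosting procedure can be sketched with a standard AdaBoost loop. Two assumptions are made loudly here: the base learners are one-dimensional decision stumps chosen purely for brevity (the patent applies boosting to the CNN-HMM model), and the sample weights enter through a weighted error rather than through resampling each round's training set as the text describes. The function names and the data are hypothetical.

```python
import math


def stump_predict(x, threshold, polarity):
    """Decision stump: predict polarity when x >= threshold, else -polarity."""
    return polarity if x >= threshold else -polarity


def train_stump(xs, ys, weights):
    """Pick the stump with the lowest weighted error on labels in {-1, +1}."""
    best = None
    for threshold in sorted(set(xs)):
        for polarity in (1, -1):
            error = sum(
                w for x, y, w in zip(xs, ys, weights)
                if stump_predict(x, threshold, polarity) != y
            )
            if best is None or error < best[0]:
                best = (error, threshold, polarity)
    return best


def adaboost(xs, ys, rounds):
    """Return a list of (alpha_m, threshold, polarity) base learners."""
    n = len(xs)
    weights = [1.0 / n] * n  # D_1: uniform initial weights
    learners = []
    for _ in range(rounds):
        error, threshold, polarity = train_stump(xs, ys, weights)
        error = min(max(error, 1e-10), 1.0 - 1e-10)  # avoid log(0)
        alpha = 0.5 * math.log((1.0 - error) / error)
        learners.append((alpha, threshold, polarity))
        # Raise the weight of misclassified samples, lower the others.
        weights = [
            w * math.exp(-alpha * y * stump_predict(x, threshold, polarity))
            for x, y, w in zip(xs, ys, weights)
        ]
        total = sum(weights)
        weights = [w / total for w in weights]
    return learners


def predict(learners, x):
    """Sign of the weighted vote f(x) = sum of alpha_m * h_m(x)."""
    score = sum(a * stump_predict(x, t, p) for a, t, p in learners)
    return 1 if score >= 0 else -1


if __name__ == "__main__":
    model = adaboost([1, 2, 3, 4], [-1, -1, 1, 1], rounds=3)
    print([predict(model, x) for x in [1, 2, 3, 4]])
```

The final prediction takes the sign of the weighted sum, which is the usual way a sum of weighted base learners is turned into a label in {-1, +1}.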
The invention has the advantages that:
for beta-lactam drug resistance in enterobacter bacteria, the method provides accurate prediction, credible results, a reasonable algorithm and a stable model; it can identify or classify ARGs that differ greatly from those in the reference database, improves prediction performance, and realizes a deep learning application led by biological significance.
Drawings
FIG. 1 is a schematic flow chart of the present invention.
Detailed Description
The invention is further explained by the following embodiments in conjunction with the drawings.
As shown in FIG. 1, for the dataset, we collected enterobacter resistance phenotype-genotype data from NCBI and EMBL-EBI.
For the algorithm, the technology uses a CNN (convolutional neural network) and an HMM (hidden Markov model) to construct a CNN-HMM mixed model; the two algorithms are constructed separately, and the optimal result of each prediction is output. In the CNN, we introduce a self-attention mechanism to make the CNN model self-supervised. The parameters and elements of the CNN-HMM model are adjusted together so that the two parts are coordinated with each other.
To achieve good calculation accuracy while controlling time cost, the algorithm randomly cuts the gene sequence, taking 7-10 nucleotides as one k-mer; to achieve stable calculation, one k-mer can instead consist of 14-16 nucleotides.
Because the resistance phenotype is a discrete variable, the algorithm uses the L2R (Learning to Rank) method to solve the multi-label prediction problem.
For the stability of the model, an adaptive boosting mechanism (AdaBoost) and an extreme adaptive boosting mechanism (XGBoost) are introduced.
The prediction is accurate: a series of algorithms such as CNN, HMM and L2R can identify or classify ARGs that differ greatly from those in the reference database, which improves prediction performance and realizes a deep learning application led by biological significance.
The result is credible: a draft bacterial genome is used as the input of the algorithm model and a prediction result is output, so that the prediction result better matches the actual situation and the original biological significance is retained.
The algorithm is reasonable: it suits the mathematical characteristics of sequence base changes and, compared with other types of machine learning algorithms, better meets the requirements of genetics and developmental biology.
The model is stable: with the addition of the adaptive enhancement mechanism and the extreme adaptive enhancement mechanism, the model has better robustness and stability and does not easily crash under the load of large data.
While embodiments of the invention have been disclosed above, it is not limited to the applications set forth in the specification and the embodiments, which are fully applicable to various fields of endeavor for which the invention pertains, and further modifications may readily be made by those skilled in the art, it being understood that the invention is not limited to the details shown and described herein without departing from the general concept defined by the appended claims and their equivalents.

Claims (3)

1. A prediction method for obtaining a beta-lactam drug resistance resistant gene facing enterobacteriaceae is characterized by comprising the following steps:
step one, acquiring enterobacter drug-resistant phenotype-genotype data, and dividing the enterobacter drug-resistant phenotype-genotype data into a training set and a testing set according to the enterobacter gene data conditions containing drug-resistant genes and not containing the drug-resistant genes;
step two, randomly segmenting the sequence of the enterobacter drug-resistant phenotype-genotype in the training set to form k-mers with preset length; adopting the sequence characteristics of k-mers extracted by characteristic engineering, wherein the k-mers represents a unit; the length of the k-mers is 7-10bp or 14-16bp;
carrying out grey relational analysis on the sequence features, wherein the grey relational analysis uses the following formula:
$$\zeta_{i0}(k) = \frac{\min_i \min_k |x_0(k) - x_i(k)| + \rho \max_i \max_k |x_0(k) - x_i(k)|}{|x_0(k) - x_i(k)| + \rho \max_i \max_k |x_0(k) - x_i(k)|}$$
wherein $\zeta_{i0}(k)$ represents the grey relational coefficient and k denotes the k-th point; the index 0 denotes the target vector and i denotes a reference vector; $x_0(k)$ represents the value of the target vector at point k and $x_i(k)$ represents the value of the i-th reference vector at point k; $\min_i \min_k$ means traversing all vectors and points and taking the minimum value of $|x_0(k) - x_i(k)|$; $\max_i \max_k$ means traversing all vectors and points and taking the maximum value of $|x_0(k) - x_i(k)|$; and $\rho$ represents a weight coefficient;
step three, constructing a prediction model: constructing a CNN-HMM mixed model as a prediction model;
step four, inputting the training set into the CNN-HMM mixed model and optimizing the CNN-HMM mixed model, wherein the optimization method comprises adding a CNN self-attention mechanism, an adaptive enhancement mechanism and an extreme adaptive enhancement mechanism, finally obtaining an optimized CNN-HMM mixed model;
the CNN self-attention mechanism is introduced as follows: an embedding layer represents each k-mer as a dense vector; a convolutional layer is then established to capture local information in the sequence; a two-layer self-attention network is then constructed to capture the parts of the given protein sequence most relevant to ARG classification; finally, a fully connected layer and a softmax function generate the output, wherein cross-entropy loss is used as the loss function during training;
the CNN-HMM mixed model construction method comprises the following steps: (1) the HMM model is divided into several groups, and the networks between the groups have obvious difference; (2) reducing parameters of an HMM by establishing a small hidden layer with H hidden nodes; (3) the number of the hidden nodes is changed to properly adjust the number of the parameters so as to adapt to the sizes of various training sets; (4) the parameters of the whole CNN are used for constructing the HMM, and the processes of various initializations and network structure realization are flexible; the parameters of the CNN comprise a radial basis function, multiple hidden layers, sparse connectivity, weight sharing, gaussian prior distribution and hyper-parameters;
the adaptive enhancement mechanism and the extreme adaptive enhancement mechanism are introduced into the CNN-HMM mixed model as follows: for a training set $T = \{(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)\}$, wherein
$$x_i \in X \subseteq \mathbb{R}^n,$$
$$y_i \in Y = \{-1, +1\},$$
wherein n denotes the n-th data item, $x_i$ represents the first classification of the i-th element in the training set, $y_i$ represents the second classification of the i-th element in the training set, $\mathbb{R}^n$ represents the n-th power of the real numbers $\mathbb{R}$, and n represents the number of elements of the training set;
first, the training set weights are initialized as $D_1 = (w_{11}, w_{12}, \ldots, w_{1n})$, wherein $w_{1n}$ represents the initial weight of the n-th element of the training set;
in each round, the training set data are sampled according to that round's training set weights to obtain $T_m$; a base learner $h_m$ is obtained from $T_m$ in each round; and a final learner is finally obtained from the M constructed base learners:
$$f(x) = \sum_{m=1}^{M} \alpha_m h_m(x)$$
wherein x represents the input of the submodels, M represents the number of base learners, m is the index of the submodel, $\alpha_m$ represents the coefficient of the m-th base learner, $h_m(x)$ represents the base learner obtained from samples of the weighted training set, and $T_m$ represents the m-th training data set;
step five, testing the optimized CNN-HMM mixed model by adopting a test set;
step six, if the accuracy is higher than a preset threshold value, a final CNN-HMM mixed model is obtained, otherwise, a training set is expanded, and the steps one to five are repeated until the final CNN-HMM mixed model is obtained;
and seventhly, predicting whether the enterobacter bacteria sample obtains the beta-lactam drug resistance gene or not by adopting a final CNN-HMM mixed model.
2. The prediction method for obtaining a beta-lactam drug resistance resistant gene facing enterobacteriaceae according to claim 1, wherein in step one, the enterobacter drug-resistant phenotype-genotype data are obtained from the NCBI and EMBL-EBI databases, and genes associated with non-acquired drug resistance are removed from the data.
3. The prediction method for obtaining the beta-lactam drug resistance gene facing the enterobacter bacteria in the claim 1, wherein the prediction results of different predictions are ranked by using an L2R method in the seventh step, so as to solve the multi-label prediction problem.
CN202111113150.XA 2021-09-22 2021-09-22 Prediction method for obtaining beta-lactam drug resistance resistant gene facing enterobacter Active CN113643758B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111113150.XA CN113643758B (en) 2021-09-22 2021-09-22 Prediction method for obtaining beta-lactam drug resistance resistant gene facing enterobacter

Publications (2)

Publication Number Publication Date
CN113643758A (en) 2021-11-12
CN113643758B (en) 2023-04-07

Family

ID=78426148

Country Status (1)

Country Link
CN (1) CN113643758B (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant