CN114611719A - XGboost training method based on cuckoo search algorithm - Google Patents
- Publication number
- CN114611719A
- Authority
- CN
- China
- Prior art keywords
- bird nest
- xgboost
- bird
- nest
- random
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
- G06N20/20—Ensemble learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/211—Selection of the most significant subset of features
- G06F18/2113—Selection of the most significant subset of features by ranking or filtering the set of features, e.g. using a measure of variance or of feature cross-correlation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
- G06F18/2148—Generating training patterns; Bootstrap methods, e.g. bagging or boosting characterised by the process organisation or structure, e.g. boosting cascade
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/004—Artificial life, i.e. computing arrangements simulating life
- G06N3/006—Artificial life, i.e. computing arrangements simulating life based on simulated virtual individual or collective life forms, e.g. social simulations or particle swarm optimisation [PSO]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computing arrangements using knowledge-based models
- G06N5/01—Dynamic search techniques; Heuristics; Dynamic trees; Branch-and-bound
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- Artificial Intelligence (AREA)
- General Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- Evolutionary Computation (AREA)
- Software Systems (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Computing Systems (AREA)
- Life Sciences & Earth Sciences (AREA)
- Mathematical Physics (AREA)
- Computational Linguistics (AREA)
- Evolutionary Biology (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Medical Informatics (AREA)
- Health & Medical Sciences (AREA)
- Biomedical Technology (AREA)
- Biophysics (AREA)
- General Health & Medical Sciences (AREA)
- Molecular Biology (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention relates to the field of machine learning, in particular to a novel XGBoost training method based on cuckoo search (CS). The XGBoost trained by this method (CS-based XGBoost) is applied to employee turnover prediction on a real-world employee information data set from the field of enterprise personnel management. In addition, the CS-based XGBoost is compared with existing XGBoost models trained by other optimization algorithms, including GA and PSO, as well as with four classifiers: GBDT, RF, SVM and KNN. Experimental results and the corresponding discussion show that the CS-based XGBoost is superior to the comparison models on the main performance indexes such as accuracy, precision and recall.
Description
Technical Field
The invention relates to the field of machine learning, in particular to a novel XGBoost training method based on a cuckoo search algorithm.
Background
With the rapid development of artificial intelligence technology, machine learning algorithms are applied in many industries to solve practical problems. The volume of data in every field is growing explosively with industrial development, and such massive data cannot be processed effectively by manual effort alone, so effective computer algorithms are urgently needed to analyze and exploit it. Applying artificial intelligence to the data processing problems of each field has therefore long been a research hotspot. XGBoost, a typical representative of ensemble learning techniques, can handle large-scale machine learning tasks efficiently. Since its introduction, owing to its performance advantages and acceptable time and memory complexity, it has been widely used in many research areas, ranging from cancer diagnosis and medical record analysis to credit risk assessment and metagenomics. Although the traditional XGBoost (i.e., XGBoost with default parameter settings) is widely applied, an original model without parameter optimization fits a given data set poorly, which results in poor generalization performance and adaptability. XGBoost has more than thirty hyperparameters, and its performance depends heavily on how they are tuned for training, so tuning them is very important.
Disclosure of Invention
The invention aims to solve the problem of parameter optimization during XGBoost model training, and provides a novel XGBoost training method based on a cuckoo search algorithm.
The purpose of the invention can be realized by the following technical scheme:
A novel XGBoost training method based on a cuckoo search algorithm comprises the following steps:
(1) preprocessing the original data set: first, each column of attribute values in the data set is scaled to the interval [0,1] by min-max normalization; second, feature dimensionality reduction is performed on the data set by a random forest feature selection method;
(2) dividing the preprocessed data set into a training set and a test set according to a user-defined proportion;
(3) training the hyperparameters of the XGBoost by the XGBoost training method based on the cuckoo search algorithm;
(4) constructing the XGBoost from the set of optimal parameter values obtained by training, and then inputting the training set to train the XGBoost;
(5) testing the trained XGBoost with the test set, and outputting the prediction result;
(6) evaluating the prediction performance of the XGBoost with four model performance evaluation indexes: Accuracy, Precision, Recall and F1 score.
In step (1), a random forest feature selection algorithm is used to screen the data set. Specifically, the data set is divided into a training set and a test set according to a fixed proportion; the training set is input to train a random forest model; the importance score of each feature is output and the scores are sorted in descending order; a feature importance score threshold is then set; finally, the features whose importance score is smaller than the threshold are deleted, giving the dimensionality-reduced data set.
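The min-max normalization named in step (1) can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the function name `min_max_scale` and the toy rows are hypothetical (the two columns merely reuse the minimum and maximum values that Table 1 reports for monthly hours and years at the company), and the handling of constant columns is a choice made here, not something the patent specifies.

```python
from typing import List


def min_max_scale(rows: List[List[float]]) -> List[List[float]]:
    """Scale every column of `rows` to the interval [0, 1] (min-max normalization)."""
    n_cols = len(rows[0])
    col_min = [min(r[j] for r in rows) for j in range(n_cols)]
    col_max = [max(r[j] for r in rows) for j in range(n_cols)]
    scaled = []
    for r in rows:
        scaled.append([
            # A constant column has max == min; map it to 0.0 instead of dividing by zero.
            (r[j] - col_min[j]) / (col_max[j] - col_min[j]) if col_max[j] > col_min[j] else 0.0
            for j in range(n_cols)
        ])
    return scaled


# Toy rows (made-up records): [average monthly hours, years at the company]
sample = [[96.0, 2.0], [203.0, 6.0], [310.0, 10.0]]
scaled = min_max_scale(sample)
print(scaled)  # [[0.0, 0.0], [0.5, 0.5], [1.0, 1.0]]
```

Each column is scaled independently, so attributes with very different ranges (for example 96 to 310 hours versus 2 to 10 years) end up on the same [0, 1] scale.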
In step (3), the XGBoost is trained by the XGBoost training method based on the cuckoo search algorithm, specifically:
(4-1) determining the bird nest population size n; the dimension d of the bird nest position, namely the number of parameters to be optimized in the XGBoost; the discovery probability $P_a$; the upper and lower bounds of the bird nest search space; and the maximum number of iterations Max_iter. The classification Accuracy of the XGBoost model prediction is set as the fitness function of a bird nest. The matrix representation Nest of the bird nest positions and the corresponding fitness vector NF are given by formula (1) and formula (2):
$$\mathrm{Nest} = \begin{bmatrix} x_{1,1} & x_{1,2} & \cdots & x_{1,d} \\ x_{2,1} & x_{2,2} & \cdots & x_{2,d} \\ \vdots & \vdots & \ddots & \vdots \\ x_{n,1} & x_{n,2} & \cdots & x_{n,d} \end{bmatrix} \tag{1}$$
$$\mathrm{NF} = \begin{bmatrix} f_1 & f_2 & \cdots & f_n \end{bmatrix}^{T} \tag{2}$$
wherein: n represents the number of bird nests, d represents the dimension of the bird nest position, $x_{i,j}$ represents the j-th dimension of the i-th bird nest, and $f_i$ represents the fitness value corresponding to the i-th bird nest.
(4-2) randomly initializing the bird nest positions: in the search space S (S = [lb, ub]), the bird nest positions are initialized according to $x_{*,j} = \mathrm{random}(lb_j, ub_j)$, wherein $ub_j$ and $lb_j$ are respectively the upper and lower search bounds of the j-th hyperparameter variable to be optimized, and random() represents a random function that returns a random number within the range $[lb_j, ub_j]$;
(4-3) calculating a fitness function value of the bird nest according to the set fitness function, and reserving the optimal bird nest gt (namely the bird nest position vector with the maximum fitness value);
(4-4) updating the bird nest positions by Lévy flight: the positions of the current bird nests are randomly changed using formula (3) to obtain a group of new bird nest positions; the new bird nest positions are compared with the old ones, and the bird nest positions with the larger fitness values are retained;
wherein: $\alpha > 0$ is the step size scaling factor, and $L(\lambda)$ represents the Lévy flight function, i.e., $\text{Lévy} \sim u = t^{-\lambda}$, $(1 < \lambda \le 3)$.
(4-5) discarding a small fraction of the worse nests and creating new nests: looping from the 1st bird nest to the n-th bird nest, a uniformly distributed random number r ∈ [0,1] is generated in each pass; if r is larger than $P_a$, the bird nest position is updated using formula (4), otherwise it is not updated. When the loop ends, a group of new bird nest positions is obtained;
wherein $X_{lj}$ and $X_{kj}$ are randomly selected solutions, $H(\mu)$ is the Heaviside function, $P_a$ is a switching parameter for balancing local and global random walks, s is the step size, and $\varepsilon$ is a uniformly distributed random number.
(4-6) calculating the fitness corresponding to the updated bird nest position, and reserving the locally optimal bird nest pt (namely, storing a bird nest position vector with the maximum fitness value in the current bird nest);
(4-7) comparing the fitness values of pt and gt, and if the fitness value of pt is larger than gt, updating the global optimal gt;
(4-8) comparing pt with gt, and updating global optimal gt (including the bird nest position GXbox and the fitness value Gfmax thereof);
(4-9) judging whether the maximum iteration number is reached: and if not, returning to (4-4) to continue the loop iteration, otherwise, returning to the global optimal bird nest position gt.
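The two position-update formulas referred to in steps (4-4) and (4-5) are not reproduced in this text. For orientation only, the updates commonly given for the standard cuckoo search algorithm are written below with the symbols defined above. This is an assumption about what the missing formulas show, not a transcription of the patent: $X_i^{(t)}$ (the position of the i-th nest at iteration t) and the vector form of the two randomly selected nests $X_l$ and $X_k$ are notation introduced here, and $\oplus$ and $\otimes$ denote entry-wise operations.

```latex
% Levy-flight update (standard cuckoo search form; assumed to correspond to the formula used in step (4-4))
X_i^{(t+1)} = X_i^{(t)} + \alpha \oplus L(\lambda),
\qquad L(\lambda) \sim u = t^{-\lambda}, \quad 1 < \lambda \le 3

% Discovery / abandonment update (standard cuckoo search form; assumed to correspond to formula (4))
X_i^{(t+1)} = X_i^{(t)} + \alpha\, s \otimes H(P_a - \varepsilon) \otimes \left( X_l^{(t)} - X_k^{(t)} \right)
```

In the first update the heavy-tailed Lévy step occasionally produces a long jump, which supports global exploration; in the second, the Heaviside factor switches the move on or off for a given nest, and the difference of two random nests sets its direction and scale.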
Compared with existing XGBoost training methods, the invention has the following beneficial effects:
(1) the invention provides a novel XGBoost training method based on a cuckoo search algorithm, which is superior to the existing PSO-based and GA-based XGBoost training methods when optimizing multimodal functions;
(2) the invention provides a novel XGBoost training method based on a cuckoo search algorithm, which maintains an effective balance between local search and diversity or randomness;
(3) the invention provides a novel XGBoost training method based on a cuckoo search algorithm, which has only two control parameters, making the algorithm simpler and more general.
Drawings
Fig. 1 is a schematic flow diagram of XGBoost optimized by the cuckoo search algorithm in the embodiment.
FIG. 2 is a diagram illustrating feature score ordering according to an embodiment.
Detailed Description
Embodiments of the present invention will be described in detail below with reference to the accompanying drawings.
Examples
A novel XGboost training method based on cuckoo search algorithm comprises the following specific processes:
1. Data set preprocessing
The employee data set HR_comma_sep, from the human resource management category of the official Kaggle website, is selected. It contains 14999 employee records with 10 attribute features and no missing values; the attribute details are shown in Table 1. The attribute left is the class label, which indicates whether the employee has left (1 = left, 0 = still employed), and is denoted y. The other 9 sample attributes are denoted X and are normalized by the min-max method.
TABLE 1 Attribute feature details of the employee data set
Attribute | Meaning | Number | Maximum value | Minimum value |
---|---|---|---|---|
satisfaction_level | Satisfaction level | f0 | 1.00 | 0.00 |
last_evaluation | Performance evaluation | f1 | 1.00 | 0.36 |
number_project | Number of completed projects | f2 | 7.00 | 2.00 |
average_montly_hours | Average monthly working hours | f3 | 310.00 | 96.00 |
time_spend_company | Years of service at the company | f4 | 10.00 | 2.00 |
work_accident | Whether there has been a work accident | f5 | 1.00 | 0.00 |
promotion | Whether promoted in the past 5 years | f6 | 1.00 | 0.00 |
department | Department | f7 | 9.00 | 0.00 |
Salary | Salary level | f8 | 2.00 | 0.00 |
left | Whether the employee has left | class | 1 | 0 |
2. Screening the data set with the random forest feature selection algorithm
The original data set is screened by a feature selection method: reducing the dimensionality of the data set improves operating efficiency, and deleting redundant or irrelevant attribute features improves the prediction accuracy of the model. The data set is screened by the random forest feature selection algorithm in the following steps:
Step one: the data set (X, y) is divided into a training set (X_train, y_train) and a test set (X_test, y_test) at a ratio of 7:3;
Step two: the training set is input to train the random forest classification model rf_model, and rf_model.feature_importances_ is called to output the importance score of each feature; the scores are sorted in descending order, as shown in Fig. 2;
Step three: the importance score threshold thresh is set to 0.004383; the SelectFromModel function is used to retain the features whose score is larger than thresh, and the transform(X) function converts the original samples X into new samples X. The retained features are f0, f4, f2, f3, f1, f7 and f8; features f5 and f6 are deleted.
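The thresholding logic of step three can be illustrated without any machine learning library. The sketch below is a simplified stand-in, not the scikit-learn workflow described above: the helper `select_by_importance` is hypothetical, and the importance scores are made-up placeholders chosen only so that the ranking and the split reproduce the features reported as retained and deleted. The patent states the threshold (0.004383) but does not list the actual scores.

```python
from typing import Dict, List, Tuple


def select_by_importance(scores: Dict[str, float], thresh: float) -> Tuple[List[str], List[str]]:
    """Rank features by importance (descending) and split them at `thresh`."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    kept = [f for f in ranked if scores[f] > thresh]      # "larger than thresh" is retained
    dropped = [f for f in ranked if scores[f] <= thresh]  # everything else is deleted
    return kept, dropped


# Placeholder importance scores (NOT from the patent; illustrative values only).
placeholder_scores = {
    "f0": 0.30, "f4": 0.20, "f2": 0.18, "f3": 0.15, "f1": 0.12,
    "f7": 0.03, "f8": 0.014, "f5": 0.004, "f6": 0.002,
}
kept, dropped = select_by_importance(placeholder_scores, thresh=0.004383)
print(kept)     # ['f0', 'f4', 'f2', 'f3', 'f1', 'f7', 'f8']
print(dropped)  # ['f5', 'f6']
```

With real data, the scores would come from the trained random forest, and the retained column names would be used to build the reduced sample matrix.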
3. Data set partitioning
The dimensionality-reduced data set (X, y) is divided into a training set (X_train, y_train) and a test set (X_test, y_test) at a ratio of 7:3.
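A 7:3 split of this kind can be sketched with the standard library alone. The helper name `split_indices`, the fixed seed and the choice to round the training share down are assumptions made for illustration; the patent does not say how the split is randomized or rounded.

```python
import random
from typing import List, Tuple


def split_indices(n_samples: int, train_fraction: float = 0.7, seed: int = 0) -> Tuple[List[int], List[int]]:
    """Return shuffled (train_indices, test_indices) for a data set with n_samples rows."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)        # seeded so the split is reproducible
    n_train = int(n_samples * train_fraction)   # training share, rounded down
    return indices[:n_train], indices[n_train:]


# 14999 is the number of employee records stated for the HR_comma_sep data set.
train_idx, test_idx = split_indices(14999)
print(len(train_idx), len(test_idx))  # 10499 4500
```

The index lists can then be used to pick the corresponding rows of X and y, so features and labels stay aligned.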
4. Training the XGBoost with the XGBoost training method based on the cuckoo search algorithm
The XGBoost has many hyperparameters; to further improve the prediction accuracy of the model, the optimal parameter set of the model is searched for by the cuckoo search algorithm. Referring to Fig. 1, the specific implementation steps of training the XGBoost by the XGBoost training method based on the cuckoo search algorithm are as follows:
Step one: the bird nest population size n is set to 25 and the dimension d to 9; the discovery probability $P_a$ is set; the upper and lower bounds of the bird nest search space are shown in Table 2; the maximum number of iterations MaxN is 100; and the matrix representation of the bird nest population is shown in formula (1);
Step two: the bird nest positions are randomly initialized in the search space according to $x_{*,j} = \mathrm{random}(lb_j, ub_j)$, wherein $ub_j$ and $lb_j$ are respectively the upper and lower search bounds of the j-th hyperparameter variable to be optimized, and random() represents a random function that returns a random number within the interval $[lb_j, ub_j]$;
Step three: the fitness function value of each bird nest is calculated according to the set fitness function ((1) the XGBoost parameters are set to the position values of the bird nest in Nest; (2) the training set is input to train the model; (3) the test set is input into the trained model and the classification Accuracy of the model is calculated), and the optimal bird nest gt (namely, the bird nest position vector with the maximum fitness value) is retained;
Step four: the bird nest positions are updated by Lévy flight: the positions of the current bird nests are randomly changed using formula (3) to obtain a group of new bird nest positions; the new bird nest positions are compared with the old ones, and the bird nest positions with the larger fitness values are retained;
Step five: a small fraction of the worse nests is discarded and new nests are created: looping from the 1st bird nest to the n-th bird nest, a uniformly distributed random number r ∈ [0,1] is generated in each pass; if r is greater than $P_a$, the bird nest position is updated using formula (4), otherwise it is not updated. When the loop ends, a group of new bird nest positions is obtained;
Step six: the fitness corresponding to the updated bird nest positions is calculated, and the locally optimal bird nest pt (namely, the bird nest position vector with the maximum fitness value among the current bird nests) is retained;
Step seven: the fitness values of pt and gt are compared, and if the fitness value of pt is larger than that of gt, the global optimum gt is updated;
Step eight: whether the maximum number of iterations has been reached is judged: if not, return to step four to continue the loop iteration; otherwise, return the global optimal bird nest position gt.
TABLE 2 Upper and lower bounds of the parameters
Parameter | Search range |
---|---|
learning_rate | [0.01,0.3] |
n_estimators | [10,2000] |
max_depth | [1,15] |
min_child_weight | [0,10] |
gamma | [0.01,10.0] |
subsample | [0.01,1.0] |
colsample_bytree | [0.01,1.0] |
reg_alpha | [0.01,1.0] |
reg_lambda | [0.01,1.0] |
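Steps one to eight can be sketched in plain Python over the search ranges of Table 2. This is a minimal sketch under several stated assumptions, not the patented implementation. The fitness function here is a toy stand-in (`toy_fitness`); in the method described above it would be the test-set Accuracy of an XGBoost model built from the nest position. The discovery probability `pa=0.25`, the step scale `alpha=0.01`, the use of Mantegna's algorithm with `beta=1.5` to draw Lévy steps, the scaling of steps by each parameter's range, and the rounding of `n_estimators` and `max_depth` to integers are all choices made for this illustration. All function names are hypothetical.

```python
import math
import random
from typing import Callable, Dict, List, Sequence, Tuple

# Search ranges copied from Table 2 (lower bound, upper bound).
TABLE_2_BOUNDS: Dict[str, Tuple[float, float]] = {
    "learning_rate": (0.01, 0.3), "n_estimators": (10.0, 2000.0),
    "max_depth": (1.0, 15.0), "min_child_weight": (0.0, 10.0),
    "gamma": (0.01, 10.0), "subsample": (0.01, 1.0),
    "colsample_bytree": (0.01, 1.0), "reg_alpha": (0.01, 1.0),
    "reg_lambda": (0.01, 1.0),
}
INTEGER_PARAMS = {"n_estimators", "max_depth"}  # assumption: rounded before use


def levy_step(rng: random.Random, beta: float = 1.5) -> float:
    """One heavy-tailed step via Mantegna's algorithm (assumed; lambda = 1 + beta)."""
    sigma = (math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
             / (math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2))) ** (1 / beta)
    u = rng.gauss(0.0, sigma)
    v = rng.gauss(0.0, 1.0) or 1e-12  # guard against an exact zero draw
    return u / abs(v) ** (1 / beta)


def clip(x: Sequence[float], bounds: Sequence[Tuple[float, float]]) -> List[float]:
    """Keep every coordinate inside its [lb_j, ub_j] range."""
    return [min(max(v, lo), hi) for v, (lo, hi) in zip(x, bounds)]


def cuckoo_search(fitness: Callable[[List[float]], float],
                  bounds: Sequence[Tuple[float, float]],
                  n: int = 25, pa: float = 0.25, max_iter: int = 100,
                  alpha: float = 0.01, seed: int = 0) -> Tuple[List[float], float]:
    rng = random.Random(seed)
    # Step two: random initial nests inside the search space.
    nests = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n)]
    fit = [fitness(x) for x in nests]
    # Step three: remember the best nest gt.
    best = max(range(n), key=fit.__getitem__)
    gt, gt_fit = list(nests[best]), fit[best]
    for _ in range(max_iter):
        # Step four: Levy-flight candidate; keep it only if it is fitter.
        for i in range(n):
            cand = clip([x + alpha * (hi - lo) * levy_step(rng)
                         for x, (lo, hi) in zip(nests[i], bounds)], bounds)
            cand_fit = fitness(cand)
            if cand_fit > fit[i]:
                nests[i], fit[i] = cand, cand_fit
        # Step five: if r > pa, move the nest along the difference of two random nests.
        for i in range(n):
            if rng.random() > pa:
                a, b, s = rng.choice(nests), rng.choice(nests), rng.random()
                nests[i] = clip([x + s * (p - q) for x, p, q in zip(nests[i], a, b)], bounds)
                fit[i] = fitness(nests[i])
        # Steps six and seven: local best pt, then update gt if pt is fitter.
        pt = max(range(n), key=fit.__getitem__)
        if fit[pt] > gt_fit:
            gt, gt_fit = list(nests[pt]), fit[pt]
    return gt, gt_fit


def to_params(position: Sequence[float]) -> Dict[str, float]:
    """Map a nest position back to named XGBoost hyperparameters."""
    return {name: (int(round(v)) if name in INTEGER_PARAMS else v)
            for name, v in zip(TABLE_2_BOUNDS, position)}


def toy_fitness(position: List[float]) -> float:
    """Stand-in for test-set Accuracy: peaks at 1.0 in the middle of every range."""
    sq = sum(((v - (lo + hi) / 2) / (hi - lo)) ** 2
             for v, (lo, hi) in zip(position, TABLE_2_BOUNDS.values()))
    return 1.0 / (1.0 + sq)


bounds = list(TABLE_2_BOUNDS.values())
best_position, best_fitness = cuckoo_search(toy_fitness, bounds)
print(to_params(best_position), round(best_fitness, 4))
```

To use the sketch for actual tuning, `toy_fitness` would be replaced by a function that builds an XGBoost classifier from `to_params(position)`, trains it on the training set and returns its Accuracy; each fitness call then costs one full model training, so n = 25 nests over 100 iterations means several thousand training runs.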
TABLE 3 Optimal parameter set
Parameter | Optimal value |
---|---|
learning_rate | 0.1457 |
n_estimators | 85 |
max_depth | 15 |
min_child_weight | 0.019 |
gamma | 0.0113 |
subsample | 0.86916 |
colsample_bytree | 1.0 |
reg_alpha | 0.7277 |
reg_lambda | 0.2664 |
5. Training the optimized XGBoost model and carrying out model evaluation
The training set is input to train the optimized XGBoost model, and the trained XGBoost classification model is evaluated with four indexes, Accuracy, Precision, Recall and F1, which are calculated as follows:
$$\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$$
$$\mathrm{Precision} = \frac{TP}{TP + FP}$$
$$\mathrm{Recall} = \frac{TP}{TP + FN}$$
$$F1 = \frac{2 \times \mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}$$
where TP is the number of employees who left and were correctly predicted as leaving, FP is the number of employees who did not leave but were incorrectly predicted as leaving, TN is the number of employees who did not leave and were correctly predicted as not leaving, and FN is the number of employees who left but were incorrectly predicted as not leaving.
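The four indexes follow directly from the confusion counts, as the short sketch below shows. The function names and the confusion counts in the first example are made up for illustration; the second example only recomputes F1 from the Precision and Recall that Table 4 reports for RF-CS-XGB, as a consistency check.

```python
from typing import Dict


def f1_from(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    return 2 * precision * recall / (precision + recall) if precision + recall else 0.0


def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> Dict[str, float]:
    """Accuracy, Precision, Recall and F1 from the four confusion counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "precision": precision,
        "recall": recall,
        "f1": f1_from(precision, recall),
    }


# Made-up confusion counts, for illustration only.
metrics = classification_metrics(tp=90, fp=10, tn=880, fn=20)
print(metrics)

# Consistency check against the RF-CS-XGB row of Table 4 (Precision 99.22%, Recall 96.86%).
print(round(100 * f1_from(0.9922, 0.9686), 2))  # 98.03
```

The recomputed value matches the 98.03% F1 reported for RF-CS-XGB, which confirms that the reported F1 is the harmonic mean of the reported Precision and Recall.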
6. Performing employee turnover prediction
The test set is input into the trained XGBoost model for prediction to obtain the final prediction result.
7. Design of experiments
To verify the effectiveness of the proposed method, two groups of comparison experiments are set up. The first group compares three models on the four indexes (Accuracy, Precision, Recall and F1): the original XGBoost model XGB, the model RF-XGB that uses random forest feature screening, and the proposed model RF-CS-XGB; the results are shown in Table 4. The second group compares the proposed RF-CS-XGB with common classification models that are processed only by the random forest feature selection method: random forest (RF-RF), logistic regression (RF-LR), support vector machine (RF-SVM), gradient boosting decision tree (RF-GBDT) and k-nearest neighbors (RF-KNN); the results are shown in Table 5.
TABLE 4 Results of the first group of comparison experiments
Model | Accuracy | Precision | Recall | F1 |
---|---|---|---|---|
XGB | 97.40% | 97.17% | 91.53% | 94.27% |
RF-XGB | 97.44% | 97.27% | 91.63% | 94.37% |
RF-CS-XGB | 99.09% | 99.22% | 96.86% | 98.03% |
TABLE 5 Results of the second group of comparison experiments
Model | Accuracy | Precision | Recall | F1 |
---|---|---|---|---|
RF-RF | 99.04% | 99.32% | 96.57% | 97.93% |
RF-LR | 76.60% | 49.83% | 27.40% | 35.36% |
RF-SVM | 81.53% | 92.31% | 22.84% | 36.61% |
RF-GBDT | 97.58% | 97.38% | 92.10% | 94.67% |
RF-KNN | 95.62% | 90.36% | 90.96% | 90.66% |
RF-CS-XGB | 99.09% | 99.22% | 96.86% | 98.03% |
The above embodiment describes in detail a specific implementation of the XGBoost training method based on the cuckoo search algorithm as applied to employee turnover prediction; it is intended only to help in understanding the proposed method and its core ideas.
Claims (2)
1. A novel XGBoost training method based on a cuckoo search algorithm, characterized by comprising the following steps:
step 1: preprocessing the original data set, including normalization and feature dimensionality reduction, and dividing the processed data set into a training set and a test set according to a fixed proportion;
step 2: training the XGBoost by the XGBoost training method based on the cuckoo search algorithm;
step 3: constructing the XGBoost from the set of parameter values obtained by training;
step 4: testing the constructed XGBoost with the test set, and comprehensively evaluating the model with four model evaluation indexes: Accuracy, Precision, Recall and F1 score.
2. The new XGBoost training method based on cuckoo search algorithm as claimed in claim 1, wherein: the training of the XGBoost by the XGBoost training method based on the cuckoo search algorithm in step 2 specifically comprises:
step 2-1: determining the bird nest population size n; the dimension d of the bird nest position, namely the number of parameters to be optimized in the XGBoost; the discovery probability $P_a$; the upper and lower bounds of the bird nest search space; and the maximum number of iterations Max_iter. The classification Accuracy predicted by the XGBoost model is set as the fitness function of a bird nest, and the matrix representation Nest of the bird nest positions and the corresponding fitness vector NF are represented as follows:
$$\mathrm{Nest} = \begin{bmatrix} x_{1,1} & x_{1,2} & \cdots & x_{1,d} \\ x_{2,1} & x_{2,2} & \cdots & x_{2,d} \\ \vdots & \vdots & \ddots & \vdots \\ x_{n,1} & x_{n,2} & \cdots & x_{n,d} \end{bmatrix}$$
$$\mathrm{NF} = \begin{bmatrix} f_1 & f_2 & \cdots & f_n \end{bmatrix}^{T}$$
wherein: $x_{i,j}$ represents the j-th dimension of the i-th bird nest; n represents the number of bird nests; d represents the dimension of the bird nest, namely the number of XGBoost parameters to be optimized; and $f_i$ represents the fitness value corresponding to the i-th bird nest.
Step 2-2: randomly initializing the bird nest positions: in the search space S (S = [lb, ub]), the bird nest positions are initialized according to $x_{*,j} = \mathrm{random}(lb_j, ub_j)$, wherein $ub_j$ and $lb_j$ are respectively the upper and lower search bounds of the j-th hyperparameter variable to be optimized, and random() represents a random function that returns a random number within the interval $[lb_j, ub_j]$.
Step 2-3: calculating the fitness value of each bird nest from the classification Accuracy of the XGBoost, and retaining the optimal bird nest gt (namely, the bird nest position vector with the maximum fitness value);
step 2-4: updating the bird nest positions by Lévy flight: the positions of the current bird nests are randomly changed by the following formula to obtain a group of new bird nest positions; the new bird nest positions are compared with the old ones, and the bird nest positions with the larger fitness values are retained;
wherein: $\alpha > 0$ is the step size scaling factor, and $L(\lambda)$ represents the Lévy flight function, i.e., $\text{Lévy} \sim u = t^{-\lambda}$, $(1 < \lambda \le 3)$.
Step 2-5: discarding a small fraction of the worse nests and creating new nests: looping from the 1st bird nest to the n-th bird nest, a uniformly distributed random number r ∈ [0,1] is generated in each pass; if r is greater than $P_a$, the bird nest position is updated by the following formula, otherwise it is not updated. When the loop ends, a new group of bird nest positions is obtained.
Wherein $X_{lj}$ and $X_{kj}$ are randomly selected solutions, $H(\mu)$ is the Heaviside function, $P_a$ is a switching parameter for balancing local and global random walks, s is the step size, and $\varepsilon$ is a uniformly distributed random number;
step 2-6: calculating the fitness corresponding to the updated bird nest position, and reserving the local optimal bird nest pt (namely, the position of the bird nest with the maximum fitness value in the current bird nest is saved);
step 2-7: comparing the fitness value of pt with that of gt, and if the fitness value of pt is greater than that of gt, updating the global optimal gt;
step 2-8: judging whether the maximum iteration number is reached: and if not, returning to 2-4 to continue the cycle iteration, otherwise, returning to the global optimal bird nest position gt.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210236632.2A CN114611719A (en) | 2022-03-11 | 2022-03-11 | XGboost training method based on cuckoo search algorithm |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210236632.2A CN114611719A (en) | 2022-03-11 | 2022-03-11 | XGboost training method based on cuckoo search algorithm |
Publications (1)
Publication Number | Publication Date |
---|---|
CN114611719A true CN114611719A (en) | 2022-06-10 |
Family
ID=81862317
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210236632.2A Pending CN114611719A (en) | 2022-03-11 | 2022-03-11 | XGboost training method based on cuckoo search algorithm |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114611719A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115406882A (en) * | 2022-10-31 | 2022-11-29 | 常州安控电器成套设备有限公司 | GBDT and improved MFO-based water quality pollutant detection method |
-
2022
- 2022-03-11 CN CN202210236632.2A patent/CN114611719A/en active Pending
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US10713597B2 (en) | Systems and methods for preparing data for use by machine learning algorithms | |
CN108920556B (en) | Expert recommending method based on discipline knowledge graph | |
CN108985335B (en) | Integrated learning prediction method for irradiation swelling of nuclear reactor cladding material | |
CN108733976B (en) | Key protein identification method based on fusion biology and topological characteristics | |
US11366806B2 (en) | Automated feature generation for machine learning application | |
Casalino et al. | Incremental adaptive semi-supervised fuzzy clustering for data stream classification | |
US20220277188A1 (en) | Systems and methods for classifying data sets using corresponding neural networks | |
CN110310012B (en) | Data analysis method, device, equipment and computer readable storage medium | |
Martínez-Ballesteros et al. | Improving a multi-objective evolutionary algorithm to discover quantitative association rules | |
CN111309577B (en) | Spark-oriented batch application execution time prediction model construction method | |
Santhosh et al. | Generalized fuzzy logic based performance prediction in data mining | |
CN114611719A (en) | XGboost training method based on cuckoo search algorithm | |
CN110968693A (en) | Multi-label text classification calculation method based on ensemble learning | |
CN110175631A (en) | A kind of multiple view clustering method based on common Learning Subspaces structure and cluster oriental matrix | |
Tiruneh et al. | Feature selection for construction organizational competencies impacting performance | |
CN113469288A (en) | High-risk personnel early warning method integrating multiple machine learning algorithms | |
CN114385808A (en) | Text classification model construction method and text classification method | |
CN116913394A (en) | Cell type annotation method based on single cell transcriptome data | |
Ibrahım | WBBA-KM: a hybrid weight-based bat algorithm with K-means algorithm for cluster analysis | |
CN111832645A (en) | Classification data feature selection method based on discrete crow difference collaborative search algorithm | |
CN116756373A (en) | Project review expert screening method, system and medium based on knowledge graph update | |
Zhao et al. | Rfe based feature selection improves performance of classifying multiple-causes deaths in colorectal cancer | |
KR101085066B1 (en) | An Associative Classification Method for detecting useful knowledge from huge multi-attributes dataset | |
CN115344386A (en) | Method, device and equipment for predicting cloud simulation computing resources based on sequencing learning | |
Li et al. | Parameters optimization of back propagation neural network based on memetic algorithm coupled with genetic algorithm |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||