CN110991653A - Method for classifying unbalanced data sets - Google Patents

Method for classifying unbalanced data sets

Info

Publication number
CN110991653A
CN110991653A
Authority
CN
China
Prior art keywords
data set
sample
unbalanced data
samples
minority
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201911256817.4A
Other languages
Chinese (zh)
Inventor
简玉琳
叶茂
闵艳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
University of Electronic Science and Technology of China
Original Assignee
University of Electronic Science and Technology of China
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by University of Electronic Science and Technology of China filed Critical University of Electronic Science and Technology of China
Priority to CN201911256817.4A priority Critical patent/CN110991653A/en
Publication of CN110991653A publication Critical patent/CN110991653A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00Machine learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/243Classification techniques relating to the number of classes
    • G06F18/24323Tree-organised classifiers


Abstract

The invention discloses a method for classifying unbalanced data sets, applicable to fields such as network intrusion detection, animal age prediction, and vehicle performance evaluation. To address the low classification accuracy on minority classes in the prior art, the method exploits the relationship between the minority and majority classes in the original data: on the basis of the original training data, the SMOTE and K-nearest-neighbor algorithms are used to construct a new set that focuses on minority samples and the related majority samples. Two random forests of the same size are built, one from the original training data and one from the new set; the decision trees of the two forests are then merged into one large forest, which jointly tests the test set to obtain the classification result. Compared with the prior art, classification accuracy is greatly improved.

Description

Method for classifying unbalanced data sets
Technical Field
The invention belongs to the fields of network intrusion detection, animal age prediction, vehicle performance evaluation and the like, and particularly relates to a technology for unbalanced data set classification.
Background
Classification is one of the important research directions in machine learning; through years of development, many mature algorithms have been formed and successfully applied in practice. These conventional classification algorithms aim at the highest overall classification accuracy and assume that the number of samples in each class of the data set is roughly balanced. In practical problems, however, the following situation arises: a data set contains two classes, one of which has far fewer samples than the other; the former is called the minority class and the latter the majority class. Owing to the difficulties encountered when classifying such unbalanced data sets, the problem has recently attracted increasing attention. How to classify unbalanced data sets correctly while improving the classification accuracy of the minority class has become a research focus in data mining.
Conventional learning algorithms assume an essentially balanced data distribution and target overall classification accuracy, so on unbalanced data they show a clear preference for the numerically dominant majority class: accuracy on the majority class rises while accuracy on the minority class falls. In practical problems, however, it is usually the minority class that matters. For example, intrusion samples in network intrusion detection are generally below 1% of the data; a classifier that predicts every sample as the majority class still attains 99% accuracy yet identifies no intrusions, which is obviously of no help. Similarly, when prospecting for oil with sea-surface photographs returned by satellites, most pictures show no obvious sign of oil and only a few indicate possible oil resources; finding natural oil resources as accurately as possible among the mass of satellite images is what matters. Classifying unbalanced data sets therefore has high application value and broad application prospects.
A related prior patent is "A classification method for unbalanced data", invented by a team at Beihang University (Beijing University of Aeronautics and Astronautics), published on June 3, 2015, publication number CN104679860A. That method learns two kinds of decision functions from a training sample set, then derives membership and membership-classification decision functions in turn, and finally classifies the samples falling in a second overlapping region determined in the test set. However, that patent focuses only on optimizing the decision function; its steps are overly complex and involve many parameters. For unbalanced data sets, the information about the majority and minority classes contained in the data itself is also important for classification, so how to collect and use this information matters as well; in this respect the patent is incomplete.
Another related patent, "Oversampling method, apparatus, device and medium for unbalanced data classification", was filed with the Chinese intellectual property office by a team at Guangzhou University on May 10, 2018, published on October 12, 2018, publication number CN108647728B. That patent processes the minority-class samples of an unbalanced data set, obtains the number of corresponding majority samples with a K-nearest-neighbor algorithm, and determines the class of minority samples by back-inference from the number of majority samples. In other words, it focuses only on processing the data set and, not being fused with a subsequent algorithm for improvement, has certain limitations.
Disclosure of Invention
In order to solve the above technical problem, the present invention provides a method for classifying unbalanced data sets.
The technical scheme adopted by the invention is as follows: a method for classifying unbalanced data sets uses the SMOTE and K-nearest-neighbor algorithms to process the original training data set and construct a new set that focuses on minority samples and the majority samples related to them; two random forests of the same size are then built, one trained on the original training data set and the other on the newly constructed data set; finally, the decision trees of the two forests are merged into one large forest, which jointly tests the test set to obtain the classification result.
The invention comprises the following technologies:
1. K-nearest-neighbor algorithm
The part used by the invention: with the Euclidean distance as the metric, find the k nearest majority-class samples and the k nearest minority-class samples around a minority-class sample point. The Euclidean distance in n-dimensional space is expressed as:
d(x, y) = sqrt((x1 - y1)^2 + (x2 - y2)^2 + ... + (xn - yn)^2)
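As an illustration, a minimal sketch of this nearest-neighbor search (function and variable names are our own; NumPy is assumed):

```python
import numpy as np

def k_nearest(center, candidates, k):
    """Return the k rows of `candidates` closest to `center` in Euclidean distance."""
    dists = np.sqrt(((candidates - center) ** 2).sum(axis=1))
    return candidates[np.argsort(dists)[:k]]

# For one minority-class sample, find its nearest majority and minority neighbours.
x = np.array([0.0, 0.0])                      # a minority-class sample
X_maj = np.array([[3.0, 3.0], [1.0, 0.0], [5.0, 5.0]])
X_min = np.array([[0.0, 1.0], [4.0, 4.0]])    # the other minority-class samples
maj_nn = k_nearest(x, X_maj, k=2)             # two closest majority samples
min_nn = k_nearest(x, X_min, k=1)             # closest minority sample
```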
2. SMOTE algorithm
SMOTE, the Synthetic Minority Oversampling Technique, is an improved scheme based on the random-oversampling algorithm. Its basic idea is to analyze the minority-class samples and add artificially synthesized new samples, derived from them, to the data set. The general flow is as follows:
for each sample x in the minority class, compute, with the Euclidean distance as the metric, the distance from x to every sample in the minority-class set Smin, obtaining the k nearest neighbors of x;
according to the sample imbalance ratio, set a sampling proportion to determine the sampling multiplier N, and randomly select several samples from the k nearest neighbors of each minority sample x; a selected neighbor is denoted xn;
the sampling multiplying power N is set according to the unbalanced proportion of each data set so as to achieve the purpose of balancing less types and various samples after oversampling. Assume that the imbalance ratio is of the multiple classes: and if the minority is 7:1, setting the sampling multiplying factor N to 7.
For each randomly selected neighbor xn, a new sample is constructed with the original sample according to the following formula:
xnew=x+rand(0,1)*|x-xn|
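A minimal sketch of this synthesis step (names are our own; note the patent writes |x − xn|, while the standard SMOTE form uses the signed difference (xn − x), which keeps the new point on the segment between x and xn — the sketch uses the signed form):

```python
import numpy as np

rng = np.random.default_rng(42)

def smote_synthesize(x, x_n):
    """Create one synthetic minority sample on the segment between a minority
    sample x and a chosen neighbour x_n: x_new = x + rand(0,1) * (x_n - x)."""
    return x + rng.random() * (x_n - x)

x = np.array([1.0, 2.0])
x_n = np.array([3.0, 4.0])
x_new = smote_synthesize(x, x_n)   # lies componentwise between x and x_n
```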
3. ORF (online random forest)
ORF is the online-learning form of RF (random forest). Unlike a batch random forest, an online random forest updates each of its decision trees whenever a new sample arrives, which improves the stability of the forest and extends the life cycle of the whole model.
Suppose a forest of size T, [t1, t2, ..., tT], and a set S = {(x1, y1), (x2, y2), ..., (xN, yN)} of N random samples, where xi = [xi1, xi2, ..., xim]^T ∈ R^m and yi ∈ {1, ..., K} is the class label of each sample pair. A bootstrap sampling method is applied to each decision tree to construct its training set. The training set enters the forest serially; for each sample si = <xi, yi> about to enter the forest, a random integer k is obtained from a Poisson distribution:
P(k) = λ^k · e^(−λ) / k!
This can be written k ~ Poisson(λ), where λ is in general a constant. For the generated k there are two cases:
1) if k > 0, the sample si = <xi, yi> is used to train the forest k times;
2) otherwise, the sample si = <xi, yi> is used to compute the out-of-bag error rate of the forest and adjust it.
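The per-sample update rule above (Oza–Russell style online bagging) can be sketched as follows; `StubTree` is a stand-in for a real incrementally updatable decision tree, and all names are illustrative:

```python
import math
import random

def poisson_k(lam=1.0, rng=random.random):
    """Draw k ~ Poisson(lam) by Knuth's multiplication method."""
    limit, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng()
        if p <= limit:
            return k
        k += 1

class StubTree:
    """Stand-in for an online decision tree: counts updates and OOB uses."""
    def __init__(self):
        self.updates = 0
        self.oob_uses = 0
    def update(self, sample):
        self.updates += 1
    def record_oob(self, sample):
        self.oob_uses += 1

def online_bagging_step(trees, sample, lam=1.0):
    """Each tree sees the incoming sample k ~ Poisson(lam) times; if k == 0,
    the sample is used for that tree's out-of-bag error instead."""
    for tree in trees:
        k = poisson_k(lam)
        if k > 0:
            for _ in range(k):
                tree.update(sample)
        else:
            tree.record_oob(sample)

forest = [StubTree() for _ in range(10)]
online_bagging_step(forest, ([0.5, 1.2], 1))
```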
4. Data sets
The data sets used in the experiments comprise 9 real-world unbalanced data sets, relating to vehicle performance evaluation, animal age prediction and similar fields, from the UCI machine learning repository, and 9 artificial unbalanced data sets from the KEEL repository.
5. Fusion algorithm
The invention makes full use of the learning ability of the online random forest and, exploiting the generalization advantages of the K-nearest-neighbor and SMOTE algorithms, processes the unbalanced data set and fuses the result into the classification stage, achieving a brand-new improvement. In other words, unlike previous algorithms that focus only on the data stage or only on the algorithm stage, the proposed model builds two classifiers, attending respectively to the original data set and to its minority samples, thereby improving classification performance.
In summary, the implementation process of the invention is as follows:
S1, for each data set, divide it into a training set and a test set according to the ten-fold cross-validation principle;
S2, train a standard online random forest on the training set of S1;
S3, split the training set of S1 into majority and minority classes, and construct a new data set of the same size as the training set using the K-nearest-neighbor algorithm and SMOTE;
S4, train an online random forest of the same size as in S2 on the new data set of S3;
S5, merge the two online random forests into one and jointly test the test set of S1.
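The five steps can be sketched with scikit-learn's batch `RandomForestClassifier` standing in for the online random forests (assumptions: scikit-learn has no online forest; the S3 set here is a plain resampling stand-in for the KNN+SMOTE construction; and the S5 merge concatenates the fitted trees of both forests, a known scikit-learn idiom rather than the patent's literal procedure):

```python
import copy
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# S1: an imbalanced two-class problem, split into train / test sets.
X, y = make_classification(n_samples=600, weights=[0.9, 0.1], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

# S2: first forest, trained on the original training set.
rf1 = RandomForestClassifier(n_estimators=50, random_state=0).fit(X_tr, y_tr)

# S3: minority-focused set (the real method uses KNN + SMOTE; here we simply
# oversample the minority class and add some majority samples).
rng = np.random.RandomState(0)
idx_min = np.where(y_tr == 1)[0]
idx_maj = rng.choice(np.where(y_tr == 0)[0], size=len(idx_min), replace=False)
idx_up = rng.choice(idx_min, size=3 * len(idx_min), replace=True)
idx = np.concatenate([idx_min, idx_up, idx_maj])

# S4: second forest of the same size, trained on the new set.
rf2 = RandomForestClassifier(n_estimators=50, random_state=1).fit(X_tr[idx], y_tr[idx])

# S5: merge the trees of both forests and classify the test set jointly.
merged = copy.deepcopy(rf1)
merged.estimators_ = rf1.estimators_ + rf2.estimators_
merged.n_estimators = len(merged.estimators_)
pred = merged.predict(X_te)
```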
The invention has the following beneficial effects: the method combines a classification algorithm with the information contained in the data set so that the two jointly classify the unbalanced data. Specifically, the SMOTE and K-nearest-neighbor algorithms process the original training data to oversample the minority class and undersample the majority class; two online random forests of the same size are then trained as classifiers for the original data and the newly built data respectively, and finally fused into one forest to test the test set.
Drawings
FIG. 1 is a general flow chart for implementing an unbalanced data set classification algorithm;
FIG. 2 is a generalized algorithm diagram for implementing an unbalanced data set classification algorithm;
FIG. 3 is a schematic diagram of the K-nearest neighbor algorithm;
FIG. 4 is a diagram of the results of the SMOTE algorithm;
wherein, fig. 4(a) is a schematic diagram of neighbors found in the SMOTE algorithm, and fig. 4(b) is a schematic diagram of synthesized few-class samples in the SMOTE algorithm;
FIG. 5 is a rough diagram of online random forest implementation classification.
Detailed Description
When handling the classification of an unbalanced data set, searching for and exploiting the relation between the minority and majority classes of the original data can effectively improve classification performance. The choice of classifier is equally important: if the classification algorithm can be fused with the information in the data set so that the two jointly classify the unbalanced data, performance improves greatly and generalization is good. Based on this idea, the invention uses the familiar SMOTE and K-nearest-neighbor algorithms to process the original data, oversampling the minority class and undersampling the majority class; it then trains two online random forests of the same size as classifiers for the original data and the newly built data respectively, and finally fuses them into one forest to test the test set.
Fig. 1 is a flowchart of a scheme of the present invention, and this embodiment takes network intrusion detection as an example to explain the contents of the present invention:
In the first step, the original unbalanced data corresponding to network intrusion are obtained. Specifically, the unbalanced data of this embodiment comprise 9 real-world unbalanced data sets from the UCI machine learning repository and 9 artificial unbalanced data sets from the KEEL repository. Each original unbalanced data set is first divided into a training set and a test set according to the ten-fold cross-validation principle;
then the training set is divided into a majority-class set Xmaj and a minority-class set Xmin. For each data set, the forest size T is determined, together with an appropriate Q (Q is a variable integer, set per data set in this embodiment, chosen so that the new data set generated in the subsequent steps is comparable in size to the original data set).
In the second step, as shown in Fig. 2, a new data set is constructed using KNN and SMOTE. Fig. 3 is a schematic diagram of the K-nearest-neighbor algorithm: to decide which class the sample represented by the circle marked "?" at the center belongs to, its K nearest neighbors are found, and the class with the most neighbors among them is taken as the sample's class. The present invention uses only the nearest-neighbor search part of this procedure. Specifically:
using the K-nearest-neighbor algorithm, select the k majority-class samples and the k minority-class samples nearest to each minority-class sample; a variable integer Q is set per data set, and for each minority-class sample the following operations are performed:
randomly select an integer q, randomly draw q samples from the k majority-class neighbors, and de-duplicate any repeats among them, so that at most q samples are finally kept;
as shown in Fig. 4, randomly construct (Q − q) synthetic minority samples from the k minority-class neighbors using the SMOTE strategy; in Fig. 4(a), xi is the minority sample currently selected as the center, and the sample pointed to by the arrow is one of its currently selected k nearest minority-class neighbors; in Fig. 4(b), the block on the segment between xi and that neighbor, labeled "Generated Synthetic Instance", is a newly constructed minority sample;
combine the constructed samples and the original minority-class samples into a new sample set;
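One reading of this second step, for a single minority sample, can be sketched as follows (helper names and the exact handling of q are our interpretation, not the patent's literal procedure):

```python
import numpy as np

rng = np.random.default_rng(0)

def k_nearest(center, candidates, k):
    """Return the k rows of `candidates` closest to `center` in Euclidean distance."""
    d = np.sqrt(((candidates - center) ** 2).sum(axis=1))
    return candidates[np.argsort(d)[:k]]

def build_for_minority_sample(x, X_maj, X_min, k=3, Q=6):
    """For one minority sample x: keep up to q de-duplicated nearby majority
    samples and synthesize (Q - q) minority samples with SMOTE among the k
    nearest minority neighbours."""
    maj_nn = k_nearest(x, X_maj, min(k, len(X_maj)))
    min_nn = k_nearest(x, X_min, min(k, len(X_min)))
    q = int(rng.integers(1, Q))                         # random integer q < Q
    picked = maj_nn[rng.integers(0, len(maj_nn), size=q)]
    picked = np.unique(picked, axis=0)                  # de-duplicate: at most q remain
    synth = [x + rng.random() * (min_nn[rng.integers(0, len(min_nn))] - x)
             for _ in range(Q - q)]
    return picked, np.array(synth)

x = np.array([0.0, 0.0])
X_maj = np.array([[1.0, 0.0], [0.0, 2.0], [3.0, 3.0], [5.0, 1.0]])
X_min = np.array([[0.5, 0.5], [2.0, 2.0], [0.0, 1.0]])
picked, synth = build_for_minority_sample(x, X_maj, X_min)
```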
In the third step, an online random forest of size T is trained on the new sample set and another on the original unbalanced training set; the forest trained on the original unbalanced training set is denoted ORF1, and the forest trained on the new sample set of the second step is denoted ORF2.
In the fourth step, ORF1 and ORF2 are merged into one random forest, which jointly tests the test set obtained in the first step to produce the classification result. Table 1 compares the classification results of the method of the invention with the prior art. To show the effectiveness of the proposed algorithm clearly, algorithms of the same type were selected for comparison: RF, BRAF, TempC and AdaBoost, with FORF-S denoting the method of the invention; the evaluation index is G-mean (%). Table 1 shows that the proposed FORF-S model performs best on animal age prediction, disease assessment monitoring and the artificial unbalanced data, with an improvement of up to 20% over the best of the other methods (Hepatitis data set). Although TempC performs best on the vehicle evaluation (Car-good) and protein cellular-localization (Yeast) data sets, our model is comparable and competitive.
TABLE 1 comparison of the classification results obtained by the method of the invention with the prior art
[Table 1 appears as an image in the original publication; its numerical values are not reproduced here.]
In Table 1, Dataset denotes the data set: Abalone19 predicts the age of abalone from physical measurements; Breast is breast-cancer recurrence data; Car-good is vehicle evaluation data; Haberman is survival data of breast-cancer patients; Hepatitis is survival data of hepatitis patients; Yeast is protein cellular-localization prediction data; Clover70, Paw70 and Subc70 come from the KEEL repository, and the rest from the UCI machine learning repository.
When a new sample (xi, yi) arrives, the forest is updated once, as shown in Fig. 5. The bars beside each node represent samples of different classes, and each update aims to classify the samples better. Specifically: starting from the root node at the top, each bar represents one class and its length represents the number of samples of that class; walking downward, each node branches over different classes with different sample counts, so at each node the bars may have different lengths.
It will be appreciated by those of ordinary skill in the art that the embodiments described herein are intended to help the reader understand the principles of the invention, which is not limited to the specifically recited embodiments and examples. Various modifications and alterations will occur to those skilled in the art; any modification, equivalent replacement or improvement made within the spirit and principle of the present invention shall fall within the scope of its claims.

Claims (6)

1. A method for classifying unbalanced data sets, characterized in that two classifiers are constructed, respectively: a first classifier constructed from an original unbalanced data set, and a second classifier constructed from the minority-class samples of the original unbalanced data set; the unbalanced data set is then classified by a third classifier obtained by merging the first classifier and the second classifier.
2. A method for classification of an unbalanced data set as claimed in claim 1, wherein the first classifier is a first online random forest obtained by a process comprising:
dividing the acquired original unbalanced data set into two parts, wherein one part is used as a training set, and the other part is used as a test set;
and training according to the training set to obtain a first online random forest.
3. The method for classifying an unbalanced data set according to claim 2, wherein the collected original unbalanced data set is divided into a training set and a testing set according to a ten-fold cross validation principle.
4. A method for classification of an unbalanced data set as claimed in claim 2, wherein the second classifier is a second online random forest obtained by a process comprising:
dividing the training set into a majority sample set and a minority sample set, and constructing a new data set with the same size as the training set by using a K neighbor algorithm and SMOTE;
and training a second online random forest with the same size as the first online random forest by using the new data set.
5. The method for classifying an unbalanced data set according to claim 4, further comprising classifying the test set using a third classifier to obtain a classification result.
6. The method for classifying an unbalanced data set according to claim 5, wherein the construction of the new data set specifically comprises:
for each sample in the minority class, calculating the distance from the sample to all samples in the minority class sample set by taking the Euclidean distance as a standard to obtain k neighbor of the sample;
setting a sampling proportion as a sampling multiplying factor according to the sample unbalance proportion, and randomly selecting a plurality of samples from k neighbors of each minority sample x according to the sampling multiplying factor to be used as the neighbors of the minority sample x;
for each randomly selected neighbor xn, respectively constructing a new sample with the corresponding few class samples according to the following formula:
xnew=x+rand(0,1)*|x-xn|;
where xn is the neighbor corresponding to the minority sample x.
CN201911256817.4A 2019-12-10 2019-12-10 Method for classifying unbalanced data sets Pending CN110991653A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911256817.4A CN110991653A (en) 2019-12-10 2019-12-10 Method for classifying unbalanced data sets

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911256817.4A CN110991653A (en) 2019-12-10 2019-12-10 Method for classifying unbalanced data sets

Publications (1)

Publication Number Publication Date
CN110991653A true CN110991653A (en) 2020-04-10

Family

ID=70091753

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911256817.4A Pending CN110991653A (en) 2019-12-10 2019-12-10 Method for classifying unbalanced data sets

Country Status (1)

Country Link
CN (1) CN110991653A (en)

Cited By (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111967343A (en) * 2020-07-27 2020-11-20 广东工业大学 Detection method based on simple neural network and extreme gradient lifting model fusion
CN112465245A (en) * 2020-12-04 2021-03-09 复旦大学青岛研究院 Product quality prediction method for unbalanced data set
CN112836735A (en) * 2021-01-27 2021-05-25 中山大学 Optimized random forest processing unbalanced data set method
CN112836735B (en) * 2021-01-27 2023-09-01 中山大学 Method for processing unbalanced data set by optimized random forest
CN112932497A (en) * 2021-03-10 2021-06-11 中山大学 Unbalanced single-lead electrocardiogram data classification method and system
CN113178264A (en) * 2021-05-04 2021-07-27 温州医科大学附属第一医院 Deep muscle layer infiltration data prediction method and system
CN113553580A (en) * 2021-07-12 2021-10-26 华东师范大学 Intrusion detection method for unbalanced data
CN113628697A (en) * 2021-07-28 2021-11-09 上海基绪康生物科技有限公司 Random forest model training method for classification unbalance data optimization
CN113689053A (en) * 2021-09-09 2021-11-23 国网安徽省电力有限公司电力科学研究院 Strong convection weather overhead line power failure prediction method based on random forest
CN113689053B (en) * 2021-09-09 2024-03-29 国网安徽省电力有限公司电力科学研究院 Strong convection weather overhead line power failure prediction method based on random forest
CN115050477A (en) * 2022-06-21 2022-09-13 河南科技大学 Bayesian optimization based RF and LightGBM disease prediction method
CN115050477B (en) * 2022-06-21 2023-06-20 河南科技大学 Bethes-optimized RF and LightGBM disease prediction method


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination