CN108764346A - An entropy-based hybrid-sampling ensemble classifier - Google Patents

An entropy-based hybrid-sampling ensemble classifier

Info

Publication number
CN108764346A
CN108764346A (application CN201810536985.8A)
Authority
CN
China
Prior art keywords
sample
entropy
class
training
group
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201810536985.8A
Other languages
Chinese (zh)
Inventor
王喆
李冬冬
程阳
杜文莉
张静
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
East China University of Science and Technology
Original Assignee
East China University of Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by East China University of Science and Technology filed Critical East China University of Science and Technology
Priority to CN201810536985.8A priority Critical patent/CN108764346A/en
Publication of CN108764346A publication Critical patent/CN108764346A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present invention provides an entropy-based hybrid-sampling ensemble classifier. First, the information entropy of each original training sample is computed. The negative-class samples are then split into two groups according to their entropy. Next, the lower-entropy group is randomly divided into several subsets, and each subset is fused with the other group to form a new negative subset. The positive class is then oversampled, and the enlarged positive class is merged with each negative subset to form a new data view. Finally, each generated view is handled by its own classifier, and the classifiers are combined into an ensemble. At test time, a sample is fed into the trained model for recognition. Compared with traditional sampling methods, the invention introduces entropy to avoid the excessive randomness of sampling, prevents the loss of sample information, and mitigates the sample-overlap problem. By fully accounting for the degree of class imbalance, the invention derives an appropriate sampling rate for each dataset and offers a new method for, and a new way of thinking about, the imbalance problem.

Description

An entropy-based hybrid-sampling ensemble classifier
Technical field
The present invention relates to an entropy-based hybrid-sampling ensemble classifier and belongs to the field of pattern recognition.
Background technology
Because of its prevalence in everyday life and in scientific research, the class-imbalance problem has become one of the most important topics in data mining and machine learning. Imbalance arises whenever the goal is to handle rare but important cases, i.e. when one class has far fewer samples than another. Fraud detection, disease diagnosis, and access control are typical examples. In fraud detection, fraudulent cases account for only a small fraction of regular transactions. An access-control system spends most of its time handling requests from family members, while records of strangers are rare; in practice, mistaking a stranger for a family member is far more serious than mistaking a family member for a stranger. The two classes therefore deserve different levels of attention when such problems are handled. To describe the imbalance problem precisely, the class with many samples is called the majority or negative class, and the class with few samples the minority or positive class. The ratio of negative to positive sample counts is called the imbalance ratio (IR) and measures how unbalanced a dataset is.
Although standard algorithms achieve good results on balanced datasets, they usually show poor recognition of the positive class on imbalanced problems. Two families of techniques are commonly used to address this. The first operates at the data level and refers to sampling methods: they balance the data as far as possible in a preprocessing stage and are independent of any particular classifier. The second operates at the algorithm level and includes thresholding, one-class learning, and cost-sensitive learning. Unlike data-level methods, algorithm-level methods do not change the sample distribution; instead they design algorithms better suited to the imbalance problem.
The present invention mainly proposes a new sampling method that takes the sample distribution into account during sampling while also avoiding the loss of sample information. On the one hand, the method uses information entropy to measure the distribution of the original samples. In information theory, entropy is a tool for quantifying the certainty of a sample. Samples close to the boundary between the positive and negative classes play an important role in classification and deserve special attention; such samples usually have low classification certainty, so their entropy tends to differ from that of samples far from the boundary. To set boundary samples apart from the rest, the method treats them differently during training. On the other hand, to avoid losing sample information during undersampling, the negative class is partitioned directly into several subsets rather than reduced to a selected subset, so all negative samples take part in training. Meanwhile, the positive class is oversampled to balance each new data subset. These subsets are treated as several data views. Because the positive class in each view only needs to grow to the size of a negative subset, the number of synthetic samples created by oversampling stays small, which mitigates the resulting sample-overlap problem. Finally, the invention combines the hybrid-sampling strategy with an ensemble method into a new classification model, which is of real significance for research on classification algorithms for the imbalance problem.
Invention content
Technical problem: The present invention provides a hybrid-sampling-plus-ensemble classification model that addresses the imbalance problem. During undersampling, information entropy is introduced to measure the sample distribution, which reduces the excessive randomness of traditional sampling and lets important samples be treated differently during sampling. At the same time, no sample information is lost: all training samples are retained while sampling. During oversampling, the number of newly created synthetic samples is controlled to mitigate the sample-overlap problem. Finally, the data views produced by sampling are combined into an ensemble. The proposed model is independent of any specific classifier and accommodates classifiers of many types, so the strengths of each classifier can be fully exploited and the final classification result improved.
Technical solution: First, the raw sample data is divided into a training set and a test set. Second, the information entropy of each sample in the original training set is computed. Third, the negative-class samples are split into two groups by comparing their entropy with a preset threshold: the entropy of the first group exceeds the threshold, and the entropy of the second group does not. Fourth, the second group is randomly divided into M parts, and each part is fused with the first group to form a new subset. Fifth, the positive class is oversampled until it balances the negative-class subsets, and the enlarged positive class is combined with each negative subset to form M new data views. Finally, the M views are handled by M base classifiers, and an ensemble method produces the training result. In the test step, a test sample is substituted into the discriminant function of the model for recognition.
The technical solution can be refined further. When computing sample entropy during training, the invention is based on the k-nearest-neighbour algorithm, although any nearest-neighbour method could in fact be used. The number of views M is determined by the imbalance ratio IR of each dataset; in the experiments, M is chosen from {1, 2, ..., round(IR)}, where round(IR) rounds IR to the nearest integer. Finally, in the ensemble stage the invention uses majority voting, although other ensemble methods are equally applicable.
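As a small illustration of how the candidate view counts follow from the imbalance ratio, the sketch below computes IR from a ±1 label vector and enumerates {1, ..., round(IR)}. The function names are our own, not from the patent:

```python
def imbalance_ratio(y):
    """IR = (#negative samples) / (#positive samples); labels are +1 / -1."""
    n_pos = sum(1 for t in y if t == 1)
    n_neg = len(y) - n_pos
    return n_neg / n_pos

def candidate_M(y):
    """Candidate view counts M, chosen from {1, 2, ..., round(IR)}."""
    return list(range(1, round(imbalance_ratio(y)) + 1))
```

For instance, a dataset with 2 positive and 6 negative samples has IR = 3.0, so M would be chosen from {1, 2, 3}.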
Advantageous effects: Compared with the prior art, the present invention has the following advantages:
The invention proposes an entropy-based hybrid-sampling algorithm that tackles imbalanced data at the data level. Sampling algorithms are easy to implement and achieve considerable results on imbalanced data, so they are widely used in research. Yet existing sampling methods still have defects: undersampling may discard the information of important samples, and oversampling may cause sample overlap. To solve these problems, this document presents a new hybrid-sampling ensemble method called EHSEL. On the one hand, it takes the sample distribution into account during sampling, so important samples are treated differently. On the other hand, it retains all original samples during training, avoiding any loss of sample information. In addition, the algorithm only enlarges the positive class to the size of each negative subset, which markedly reduces the effect of sample overlap. Several types of base classifier are used to train the data views produced in the preprocessing stage, and an ensemble method then combines these base classifiers. To verify the method, experiments were designed on 48 real KEEL imbalanced benchmark datasets comparing EHSEL_BP with BP, EHSEL_GFRNN with GFRNN, and EHSEL_SVM with SVM. The results show that the method significantly improves the classification results of the original base classifiers on imbalanced datasets. Further experiments compared EHSEL_BP with six related algorithms on imbalanced datasets; EHSEL_BP ranked first among all algorithms, showing that it outperforms the comparison algorithms on imbalanced data and that the proposed method is a feasible and effective approach for imbalanced datasets.
Description of the drawings
Fig. 1 is the overall flow chart of the entropy-based hybrid-sampling ensemble learning model of the present invention.
Specific implementation
To describe the content of the invention more clearly, it is further explained below with reference to an example and the accompanying drawing. The example given here does not limit the scope covered by the invention. The entropy-based hybrid-sampling ensemble classifier algorithm of the invention comprises the following steps:
Step 1: Input the training samples $\{(x_i, y_i)\}_{i=1}^{N}$, where N is the number of training samples; $y_i = +1$ means sample $x_i$ belongs to the positive class and $y_i = -1$ means $x_i$ belongs to the negative class.
Step 2: First compute the information entropy of each training sample, as follows:
Step 2.1: Compute the pairwise Euclidean distance between all samples:

$$D(x_i, x_j) = \sqrt{\sum_{l=1}^{d} (x_{il} - x_{jl})^2} \qquad (1)$$

where $D(x_i, x_j)$ is the distance between samples $x_i$ and $x_j$, and $d$ is the sample dimension.
Step 2.2: Next compute the information entropy of each training sample. The entropy of sample $x_i$ is

$$E(x_i) = -\sum_{j=1}^{C} P_j(x_i) \log P_j(x_i) \qquad (2)$$

where $C$ is the number of classes in the training samples and $P_j(x_i)$ is the probability, computed by the nearest-neighbour rule, that $x_i$ belongs to class $j$. The probability $P_j(x_i)$ is obtained as follows:
Step 2.2.1: For each sample $x_i$, compute its class probabilities:

$$P_j(x_i) = \frac{num_j}{k} \qquad (3)$$

where $num_j$ is the number of samples in the candidate neighbour set $x_{candi}$ that belong to class $j$, and $k$ is the number of neighbours in the k-nearest-neighbour method.
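Steps 2.1 through 2.2.1 can be sketched in Python as follows — a minimal, illustrative implementation of the k-nearest-neighbour entropy, with names of our own choosing rather than anything prescribed by the patent:

```python
import math
from collections import Counter

def knn_entropy(X, y, k=5):
    """Per-sample information entropy E(x_i) = -sum_j P_j log P_j, where
    P_j(x_i) = num_j / k is the class-j fraction among the k nearest
    neighbours of x_i (the sample itself is excluded)."""
    def dist(a, b):
        # Euclidean distance, formula (1)
        return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))
    entropies = []
    for i, xi in enumerate(X):
        nbrs = sorted((j for j in range(len(X)) if j != i),
                      key=lambda j: dist(xi, X[j]))[:k]
        counts = Counter(y[j] for j in nbrs)
        e = -sum((c / k) * math.log(c / k) for c in counts.values())
        entropies.append(e)
    return entropies
```

Samples deep inside one class get entropy 0 (all neighbours agree), while samples on the class boundary approach log 2, which is exactly the distinction Step 3 exploits.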
Step 3: Let the entropies of all samples be $\{E_1, E_2, \ldots, E_N\}$. The negative-class samples are first divided into two groups according to these entropies. Given a threshold $\alpha$, the first group is

$$G_1 = \{x_i \mid x_i \in x_{neg},\ E_i > \alpha\} \qquad (4)$$

and the second group is

$$G_2 = \{x_j \mid x_j \in x_{neg},\ E_j \le \alpha\} \qquad (5)$$

where $x_{neg}$ denotes the negative-class data.
Step 4: Keep the first group intact. Then randomly partition the second group into M subsets, written $\{Sub_1, Sub_2, \ldots, Sub_M\}$; M is called the resampling rate. Finally, fuse the retained first group into each subset, forming M subsets of the original negative class, denoted $\{S_1, S_2, \ldots, S_M\}$.
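Steps 3 and 4 — thresholding the negative class by entropy and fusing the high-entropy group $G_1$ into each random part of $G_2$ — might look like the sketch below. This is an illustrative implementation of our own; the patent does not prescribe this exact code:

```python
import random

def split_negative(neg_idx, entropies, alpha, M, seed=0):
    """Split negative-class sample indices into G1 (entropy > alpha, kept
    whole) and G2 (entropy <= alpha, randomly cut into M parts), then fuse
    G1 into every part so no negative sample is ever discarded."""
    g1 = [i for i in neg_idx if entropies[i] > alpha]
    g2 = [i for i in neg_idx if entropies[i] <= alpha]
    rng = random.Random(seed)
    rng.shuffle(g2)
    size = len(g2) // M
    parts = [g2[j * size:(j + 1) * size] for j in range(M - 1)]
    parts.append(g2[(M - 1) * size:])   # last part absorbs the remainder
    return [g1 + p for p in parts]      # S_i = G1 ∪ Sub_i
```

Note the design point the description emphasizes: because $G_1$ is copied into every subset and $G_2$ is partitioned rather than subsampled, the union of the subsets covers all negative samples.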
Step 5: With the negative-class sampling complete, turn next to the positive class. To bring the positive class as close as possible to the size of the negative subsets, the SMOTE algorithm is used here to increase the number of positive samples; how many new samples are needed depends on each negative subset $S_i$. Let the original positive samples be $S_{pos}$ and the newly generated ones $S_{smo}$. The enlarged positive class is then

$$S'_{pos} = S_{pos} + S_{smo} \qquad (6)$$
Step 6: Finally, combine the positive class generated above with the negative subsets to form M new groups of data:

$$V_i = S'_{pos} + S_i, \quad i = 1, 2, \ldots, M \qquad (7)$$

Each group of data can be regarded as a data view, so the method ultimately produces M new views from the original training set through hybrid sampling.
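Steps 5 and 6 can be sketched as below. The `smote_like` helper is a minimal interpolation-based stand-in for SMOTE — it synthesizes points on line segments between a positive sample and one of its nearest positive neighbours — not the reference SMOTE implementation, and both function names are our own:

```python
import random

def smote_like(pos, n_new, k=2, seed=0):
    """Generate n_new synthetic positives by linear interpolation between a
    random positive sample and one of its k nearest positive neighbours."""
    rng = random.Random(seed)
    def dist2(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    new = []
    for _ in range(n_new):
        x = rng.choice(pos)
        nbrs = sorted((p for p in pos if p is not x),
                      key=lambda p: dist2(x, p))[:k]
        nb = rng.choice(nbrs)
        t = rng.random()
        new.append(tuple(xi + t * (ni - xi) for xi, ni in zip(x, nb)))
    return new

def make_views(pos, neg_subsets, seed=0):
    """V_i = (S_pos + S_smo) + S_i: grow the positive class only up to the
    size of each negative subset, then pair them into one view per subset."""
    views = []
    for i, sub in enumerate(neg_subsets):
        extra = max(0, len(sub) - len(pos))
        pos_new = list(pos) + smote_like(pos, extra, seed=seed + i)
        views.append((pos_new, sub))
    return views
```

Because `extra` is capped at the size of each negative subset, the number of synthetic samples stays small — the property the description credits with mitigating sample overlap.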
Step 7: Next, train a separate base classifier on each view. Because the sampling above operates in the preprocessing stage, it is independent of the classifier: almost any classifier can be used with this method. Here three different types of classifier are used: the BP, GFRNN, and SVM algorithms.
The BP algorithm is a classical neural network algorithm. Suppose there are L layers and layer $l$ has $S_l$ neurons. For a sample $(x_i, y_i)$ the cost function is

$$J(W, b; x_i, y_i) = \frac{1}{2} \left\| h_{W,b}(x_i) - y_i \right\|^2 \qquad (8)$$

where W holds the connection weights between layers, b the biases, and $h_{W,b}(x_i)$ is the network output for a sample. The cost function over all samples can then be written

$$J(W, b) = \frac{1}{N} \sum_{i=1}^{N} J(W, b; x_i, y_i) + \frac{\lambda}{2} \sum_{l=1}^{L-1} \sum_{i=1}^{S_{l+1}} \sum_{j=1}^{S_l} \left( W_{ij}^{(l)} \right)^2 \qquad (9)$$

where $l = 1, 2, \ldots, L$ and $W_{ij}^{(l)}$ is the connection weight between neuron $i$ of layer $l+1$ and neuron $j$ of layer $l$.
The GFRNN algorithm decides the class of a query point z from the gravitation acting on it. Its main formula is

$$F_c(z) = \sum_{x_j \in X_{candi},\, y_j = c} F(z, x_j) \qquad (10)$$

where c indexes the classes and $X_{candi}$ is the candidate sample set determined by the nearest-neighbour method. The gravitation between the query point z and a sample $x_j$ is computed as

$$F(z, x_j) = \frac{1}{d(z, x_j)^2} \qquad (11)$$

where $d(z, x_j)$ is the Euclidean distance between z and $x_j$.
SVM is a discriminative method whose idea is to find a boundary that separates the positive and negative classes as widely as possible. The objective function of SVM is

$$\min_{w,\, b,\, \zeta} \ \frac{1}{2} \|w\|^2 + C \sum_{i=1}^{N} \zeta_i \quad \text{s.t.} \ \ y_i \left( w^{\mathrm{T}} \phi(x_i) + b \right) \ge 1 - \zeta_i, \ \ \zeta_i \ge 0, \ i = 1, 2, \ldots, N \qquad (12)$$

where $\phi$ denotes the kernel feature map, $\zeta_i$ are slack variables that allow some unimportant samples to be misclassified, and C is the regularization parameter.
Step 8: The base classifiers produce M trained models, one per data view. To combine these models, majority voting decides the final result. Suppose the output of model j for test sample $x_i$ is $p_{i,j}$, $j = 1, 2, \ldots, M$. The majority vote for $x_i$ is

$$\hat{y}_i = \arg\max_{c \in \{+1, -1\}} \sum_{j=1}^{M} \mathbb{I}(p_{i,j} = c) \qquad (13)$$

where $\hat{y}_i$ is the predicted class label of $x_i$, the labels +1 and -1 denote the positive and negative classes respectively, and M is the number of data views.
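The majority vote of Step 8 can be sketched as follows. The tie-breaking rule (falling back to the positive class) is our own illustrative choice, since the text does not specify one:

```python
from collections import Counter

def majority_vote(preds):
    """Combine the M per-view predictions for one sample by majority vote.
    On a tie, default to the positive (minority) class -- an assumption of
    this sketch, not stated in the source."""
    top = Counter(preds).most_common()
    if len(top) > 1 and top[0][1] == top[1][1]:
        return +1
    return top[0][0]
```

With an odd M (the usual choice for voting ensembles) the tie branch never fires.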
The specific embodiments of the invention have been described above with reference to the drawing. Those skilled in the art will understand, however, that various improvements and equivalent substitutions can be made without departing from the spirit and principles of the invention. The claims of the invention cover such improved and equivalently substituted schemes, all of which fall within its scope of protection.
Experimental design
Choice of experimental datasets: the real imbalanced datasets used here are selected from the KEEL imbalanced benchmarks. Their details are listed in the table below.
Dataset      IR     Size  Dimension
pima         1.87   768   8
vehicle0     3.25   846   18
glass04vs5   9.22   92    9
glass2       11.59  214   9
abalone918   16.40  713   8
yeast5       32.73  1484  8
All algorithms select their parameters as follows. Each dataset is first divided into two parts by 5-fold cross-validation, one part for training and one for testing. The training set is then further divided into five folds numbered 1 to 5: folds 1-4 train while fold 5 validates, then folds 1, 2, 3, and 5 train while fold 4 validates, and so on for five rounds in total. After the best parameters are chosen, the model is evaluated on the test set.
Comparison models: In the experiments we choose M BP networks as base classifiers; the resulting system is named EHSEL_BP. Its classification performance is compared against the hybrid-sampling algorithm REA, the hybrid algorithm EasyEnsemble, the undersampling algorithm Bagging, the boosting algorithm AdaBoost, and the oversampling algorithm SMOTE-SVM.
Performance metric: Because the task concerns imbalanced datasets, overall accuracy is not an appropriate evaluation index. The average accuracy (AACC) is therefore used as the evaluation criterion. AACC accounts for the accuracy of each class, so it avoids situations where the positive class is badly misclassified yet the overall evaluation still looks good because the negative class is rarely misclassified. In practice, AACC is computed from the correct-classification rates of the positive and negative classes, denoted $TP_{rate}$ and $TN_{rate}$ respectively:

$$AACC = \frac{TP_{rate} + TN_{rate}}{2} \qquad (14)$$
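The AACC criterion is simply the mean of the two per-class recalls, so each class carries equal weight regardless of imbalance. A direct sketch (helper name is our own):

```python
def aacc(y_true, y_pred):
    """Average accuracy: (TP_rate + TN_rate) / 2, i.e. the mean of the
    positive-class recall and the negative-class recall."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == -1 and p == -1)
    n_pos = sum(1 for t in y_true if t == 1)
    n_neg = sum(1 for t in y_true if t == -1)
    return 0.5 * (tp / n_pos + tn / n_neg)
```

For example, a classifier that labels everything negative scores 0.5 under AACC even though its overall accuracy on a highly imbalanced set would be near 1.0.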
Experimental results:
Each cell of the results table gives the prediction result under the AACC criterion together with its standard deviation; each row corresponds to a dataset and each column to an algorithm. The best result on each dataset is shown in bold, and each algorithm's rank score is given in brackets.
From the classification results and rank scores of each algorithm, EHSEL_BP achieves the best result on every dataset, and its stability across all datasets gives it the highest average classification performance and a top average rank score.

Claims (5)

1. An entropy-based hybrid-sampling ensemble classifier, characterized in that training the classifier comprises the following steps:
1) dividing the raw sample data into a training set and a test set;
2) computing the information entropy of each sample on the original training set;
3) splitting the negative-class samples into two groups by comparing their entropy with a preset threshold, the entropy of the first group exceeding the threshold and the entropy of the second group not exceeding it;
4) randomly dividing the second group into M parts, and fusing each part with the first group to form a new subset;
5) oversampling the positive class until it balances the negative-class subsets, and combining it with each negative subset to form M new data views;
6) handling the views with M base classifiers and obtaining the training result of the training samples with an ensemble method;
7) in the test step, substituting a test sample into the corresponding discriminant function of the model for recognition.
2. The entropy-based hybrid-sampling ensemble classifier according to claim 1, characterized in that in the second training step the training samples are divided into two groups by the size of their entropy, specifically: given a threshold $\alpha$, the first group is

$$G_1 = \{x_i \mid x_i \in x_{neg},\ E_i > \alpha\}$$

and the second group is

$$G_2 = \{x_j \mid x_j \in x_{neg},\ E_j \le \alpha\}$$

where $x_{neg}$ denotes the negative-class data.
3. The entropy-based hybrid-sampling ensemble classifier according to claim 1, characterized in that in step 4) the number M of random equal parts of the negative class is determined by the imbalance ratio of the dataset: if the imbalance ratio is IR, then M is chosen from {1, 2, ..., round(IR)}, where round(·) rounds IR to the nearest integer; in this way the invention computes a suitable undersampling rate for each dataset while fully accounting for its degree of imbalance.
4. The entropy-based hybrid-sampling ensemble classifier according to claim 1, characterized in that in the third training step the second group is divided into M parts at random: if the second group's sample indices are {1, 2, ..., N}, the grouping is {1, 2, ..., floor(N/M)}, {floor(N/M)+1, floor(N/M)+2, ..., 2·floor(N/M)}, ..., {(M-1)·floor(N/M)+1, (M-1)·floor(N/M)+2, ..., N}, where floor(·) rounds down.
5. The entropy-based hybrid-sampling ensemble classifier according to claim 1, wherein in the fifth training step M new data views are generated from the raw dataset and a separate base classifier is trained on each view; the base classifiers may be M classifiers of a single type or M classifiers of mixed types, and may be, but are not limited to, the following types:
the BP algorithm, a classical neural network algorithm: suppose there are L layers and layer $l$ has $S_l$ neurons; for a sample $(x_i, y_i)$ the cost function is

$$J(W, b; x_i, y_i) = \frac{1}{2} \left\| h_{W,b}(x_i) - y_i \right\|^2$$

where W holds the connection weights between layers, b the biases, and $h_{W,b}(x_i)$ is the network output for a sample; the cost function over all samples can then be written

$$J(W, b) = \frac{1}{N} \sum_{i=1}^{N} J(W, b; x_i, y_i) + \frac{\lambda}{2} \sum_{l=1}^{L-1} \sum_{i=1}^{S_{l+1}} \sum_{j=1}^{S_l} \left( W_{ij}^{(l)} \right)^2$$

where $l = 1, 2, \ldots, L$ and $W_{ij}^{(l)}$ is the connection weight between neuron $i$ of layer $l+1$ and neuron $j$ of layer $l$;
the GFRNN algorithm, which decides the class of a query point z from the gravitation acting on it; its main formula is

$$F_c(z) = \sum_{x_j \in X_{candi},\, y_j = c} F(z, x_j)$$

where c indexes the classes and $X_{candi}$ is the candidate sample set determined by the nearest-neighbour method; the gravitation between the query point z and a sample $x_j$ is computed as

$$F(z, x_j) = \frac{1}{d(z, x_j)^2}$$

where $d(z, x_j)$ is the Euclidean distance between z and $x_j$;
SVM, a discriminative method whose idea is to find a boundary that separates the positive and negative classes as widely as possible; the objective function of SVM is

$$\min_{w,\, b,\, \zeta} \ \frac{1}{2} \|w\|^2 + C \sum_{i=1}^{N} \zeta_i \quad \text{s.t.} \ \ y_i \left( w^{\mathrm{T}} \phi(x_i) + b \right) \ge 1 - \zeta_i, \ \ \zeta_i \ge 0, \ i = 1, 2, \ldots, N$$

where $\phi$ denotes the kernel feature map, $\zeta_i$ are slack variables that allow some unimportant samples to be misclassified, and C is the regularization parameter; since the invention is independent of any specific classifier, it is robust across classifier types.
CN201810536985.8A 2018-05-30 2018-05-30 An entropy-based hybrid-sampling ensemble classifier Pending CN108764346A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810536985.8A CN108764346A (en) 2018-05-30 2018-05-30 An entropy-based hybrid-sampling ensemble classifier

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810536985.8A CN108764346A (en) 2018-05-30 2018-05-30 An entropy-based hybrid-sampling ensemble classifier

Publications (1)

Publication Number Publication Date
CN108764346A true CN108764346A (en) 2018-11-06

Family

ID=64004108

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810536985.8A Pending CN108764346A (en) 2018-05-30 2018-05-30 An entropy-based hybrid-sampling ensemble classifier

Country Status (1)

Country Link
CN (1) CN108764346A (en)

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109934203A (en) * 2019-03-25 2019-06-25 南京大学 A kind of cost-sensitive increment type face identification method based on comentropy selection
CN109934203B (en) * 2019-03-25 2023-09-29 南京大学 Cost-sensitive incremental face recognition method based on information entropy selection
CN110266672A (en) * 2019-06-06 2019-09-20 华东理工大学 Network inbreak detection method based on comentropy and confidence level down-sampling
CN110266672B (en) * 2019-06-06 2021-09-28 华东理工大学 Network intrusion detection method based on information entropy and confidence degree downsampling
CN111260120A (en) * 2020-01-12 2020-06-09 桂林理工大学 Weather data entropy value-based weather day prediction method
CN112583414A (en) * 2020-12-11 2021-03-30 北京百度网讯科技有限公司 Scene processing method, device, equipment, storage medium and product
WO2022257458A1 (en) * 2021-06-08 2022-12-15 平安科技(深圳)有限公司 Vehicle insurance claim behavior recognition method, apparatus, and device, and storage medium
CN117076861A (en) * 2023-08-18 2023-11-17 深圳市深国际湾区投资发展有限公司 Data fusion-based related data processing system, method and medium

Similar Documents

Publication Publication Date Title
CN108764346A (en) An entropy-based hybrid-sampling ensemble classifier
CN105589806B (en) A kind of software defect tendency Forecasting Methodology based on SMOTE+Boosting algorithms
CN104809877B (en) The highway place traffic state estimation method of feature based parameter weighting GEFCM algorithms
CN102521656B (en) Integrated transfer learning method for classification of unbalance samples
CN103020642A (en) Water environment monitoring and quality-control data analysis method
CN110309888A (en) A kind of image classification method and system based on layering multi-task learning
CN105956621B (en) A kind of flight delay method for early warning based on evolution sub- sampling integrated study
CN111582596A (en) Pure electric vehicle endurance mileage risk early warning method integrating traffic state information
CN105260738A (en) Method and system for detecting change of high-resolution remote sensing image based on active learning
CN110009030A (en) Sewage treatment method for diagnosing faults based on stacking meta learning strategy
CN105740914A (en) Vehicle license plate identification method and system based on neighboring multi-classifier combination
CN108877947A (en) Depth sample learning method based on iteration mean cluster
CN112633337A (en) Unbalanced data processing method based on clustering and boundary points
CN103426004B (en) Model recognizing method based on error correcting output codes
CN106482967A (en) A kind of Cost Sensitive Support Vector Machines locomotive wheel detecting system and method
CN104123678A (en) Electricity relay protection status overhaul method based on status grade evaluation model
Gupta et al. Clustering-Classification based prediction of stock market future prediction
CN105975611A (en) Self-adaptive combined downsampling reinforcing learning machine
CN110246134A (en) A kind of rail defects and failures sorter
CN112001788A (en) Credit card default fraud identification method based on RF-DBSCAN algorithm
CN104809476A (en) Multi-target evolutionary fuzzy rule classification method based on decomposition
CN110020868A (en) Anti- fraud module Decision fusion method based on online trading feature
CN105512675A (en) Memory multi-point crossover gravitational search-based feature selection method
Cao et al. Imbalanced data classification using improved clustering algorithm and under-sampling method
CN109685133A (en) The data classification method of prediction model low cost, high discrimination based on building

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20181106