CN110533116A - Adaptive ensemble classification method for imbalanced data based on Euclidean distance - Google Patents
Adaptive ensemble classification method for imbalanced data based on Euclidean distance
- Publication number
- CN110533116A CN110533116A CN201910832525.4A CN201910832525A CN110533116A CN 110533116 A CN110533116 A CN 110533116A CN 201910832525 A CN201910832525 A CN 201910832525A CN 110533116 A CN110533116 A CN 110533116A
- Authority
- CN
- China
- Prior art keywords
- sample
- classifier
- classification
- test
- fundamental
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Links
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2413—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on distances to training or reference patterns
- G06F18/24147—Distances to closest patterns, e.g. nearest neighbour classification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/285—Selection of pattern recognition techniques, e.g. of classifiers in a multi-classifier system
Abstract
The invention discloses an adaptive ensemble classification method for imbalanced data based on Euclidean distance. First, several diverse balanced subsets are obtained with a random balancing method, and multiple base classifiers are then trained, one on each balanced subset. A classifier pre-selection algorithm is applied before the dynamic selection algorithm. After the base classifiers have been screened, a new dynamic selection algorithm is proposed: it assesses how the classifiers perform on the region surrounding the sample to be classified, rating a classifier as stronger the more minority-class samples within that region it classifies correctly. Finally, a distance-based adaptive ensemble rule combines the predictions of the selected base classifiers into the output. The method builds base classifiers on diverse subsets, the proposed dynamic selection algorithm picks out the sub-classifiers with the strongest classification ability, and the proposed ensemble rule provides a better combined output; together these effectively improve classification accuracy on imbalanced data.
Description
Technical field
The invention belongs to the field of artificial intelligence, and is specifically an adaptive ensemble classification method for imbalanced data based on Euclidean distance.
Background technique
Imbalanced data refers to training samples in which the number of samples of one or several classes differs greatly from the number of samples of the other classes. Research reports show that class-imbalance problems arise across many real-world fields, such as facial age estimation, oil-spill detection in satellite images, anomaly detection, fraudulent credit-card transaction identification, software defect prediction, and image annotation. Researchers therefore pay close attention to data imbalance and have held several workshops and meetings on it, for example at the Association for the Advancement of Artificial Intelligence (AAAI) in 2000, the International Conference on Machine Learning (ICML) in 2003, and the ACM Special Interest Group on Knowledge Discovery and Data Mining (SIGKDD) in 2004.
In two-class imbalance problems, the training samples are generally divided into a majority class and a minority class. People usually care more about the minority-class samples than about the majority-class samples. In credit-card fraud identification, for example, the cost of misclassifying a fraudulent transaction as a normal one is much higher than that of misclassifying a normal transaction as fraudulent, because in the latter case staff can contact the cardholder to confirm whether the transaction was initiated by the cardholder. A minority class whose sample count is far below that of the majority class can therefore have serious consequences. Because most traditional classification algorithms, such as decision trees, k-nearest neighbours and RIPPER, tend to produce models that maximise overall classification accuracy, minority-class samples are often ignored. For example, on a data set in which only 1% of the samples belong to the minority class, a model that classifies every sample as the majority class still achieves 99% overall accuracy, yet such a "highly accurate" classifier misclassifies exactly the minority-class samples that we want to classify correctly.
More and more ensemble learning methods from machine learning and data mining are being applied in practice to imbalanced data classification, but most such algorithms improve prediction accuracy on imbalanced data only to a limited extent. Each base classifier is an expert on a local region, yet these methods do not account for the fact that the classification ability of each base classifier differs across test samples, and letting poorly performing base classifiers take part in the final ensemble harms the generalisation ability of the ensemble model. Moreover, the subsets generated for training the base classifiers should be diverse, so as to guarantee the diversity of the base classifiers. At the same time, the ensemble rule of most ensemble learning methods is determined by majority voting, which ignores the relationship between training samples and test samples, so the predictions provided by even an optimised set of base classifiers cannot be improved further.
Summary of the invention
To address the insufficient diversity of sub-classifiers in ensemble learning, the failure to account for poorly performing base classifiers, and the design of the ensemble rule, this application proposes an adaptive ensemble classification method for imbalanced data based on Euclidean distance, which improves classification accuracy on imbalanced data.
To achieve the above object, the technical solution of the present invention is an adaptive ensemble classification method for imbalanced data based on Euclidean distance, comprising the following steps:
Step 1: pre-process the data to obtain diverse balanced subsets;
Step 2: train m homogeneous classifiers on the m balanced subsets using the same classification learning algorithm, forming a candidate classifier pool;
Step 3: pre-select base classifiers in the candidate classifier pool, deleting the classifiers that lack the ability to classify minority-class samples;
Step 4: from the classifier pool screened in Step 3, use a dynamic selection algorithm to pick out the candidate sub-classifiers with strong classification ability on samples in the region surrounding the test sample, forming the base classifier set;
Step 5: use a distance-based adaptive ensemble rule to combine the selected base classifier set's predictions for the test sample into the output.
Further, in Step 1 the data pre-processing comprises obtaining the training set, validation set and test set, and obtaining balanced subsets from the training set by random balancing. The specific steps are as follows:
1. Divide the samples of the original data set into a training set S_train, a validation set S_va and a test set S_test in the quantity ratio a:b:c, ensuring that the ratio of majority-class to minority-class samples in each of the divided training, validation and test sets is consistent with that of the original data set;
2. Draw a random number num_rand according to formula (1):
num_rand = S_min + rand(0,1) * (S_max - S_min)   (1)
where S_min is the number of minority-class samples in the training set S_train, rand(0,1) is a random number between 0 and 1, and S_max is the number of majority-class samples in S_train;
3. Sample from the majority-class samples of S_train at random without replacement until the newly formed sample set contains num_rand samples; meanwhile oversample the minority class according to formula (2), adding each generated sample z to the minority-class samples and repeating until the number of minority-class samples reaches num_rand; merging the newly formed majority-class samples with the oversampled minority-class samples yields one balanced subset;
z = β·p + (1 - β)·q   (2)
where p and q are minority-class samples in S_train and β is a random number between 0 and 1;
4. Repeat steps 2 and 3 until m balanced subsets are obtained.
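The random balancing loop above can be sketched as follows. Only formulas (1) and (2) come from the text; the function and variable names are illustrative, and samples are assumed to be lists of feature values.

```python
import random

def random_balance(majority, minority, seed=None):
    """One randomly balanced subset: undersample the majority class and
    oversample the minority class to a shared random target size."""
    rng = random.Random(seed)
    s_min, s_max = len(minority), len(majority)
    # Formula (1): num_rand = S_min + rand(0,1) * (S_max - S_min)
    num_rand = int(s_min + rng.random() * (s_max - s_min))
    # Undersample the majority class without replacement.
    maj_sub = rng.sample(majority, num_rand)
    # Oversample the minority class by interpolating random pairs,
    # formula (2): z = beta*p + (1 - beta)*q.
    min_sub = list(minority)
    while len(min_sub) < num_rand:
        p, q = rng.choice(minority), rng.choice(minority)
        beta = rng.random()
        min_sub.append([beta * a + (1 - beta) * b for a, b in zip(p, q)])
    return maj_sub, min_sub
```

Calling this m times (step 4) yields the m diverse balanced subsets.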
Further, in Step 2 the candidate classifier pool is constructed. Specifically, the same classification learning algorithm is applied to each of the m subsets obtained in Step 1, yielding m homogeneous base classifiers that form the candidate classifier pool.
Further, in Step 3 the base classifiers in the candidate classifier pool need to be pre-selected. The specific steps are as follows:
1. For the current sample x_q to be classified in the test set S_test, compute its k nearest neighbours in the validation set S_va. If the k nearest neighbours contain samples of different classes, record the k current neighbours as Ψ; if the k nearest neighbours all belong to the same class, go to Step 4;
2. Using Ψ as input, each base classifier h_i in the candidate classifier pool predicts labels for Ψ with the true labels removed, producing the output y_p;
3. Compare the predicted output y_p with the true labels y of Ψ; delete any base classifier that cannot simultaneously classify correctly at least one minority-class and one majority-class sample. After deletion, n base classifiers remain in the candidate pool.
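A minimal sketch of this pre-selection filter, assuming each classifier is a callable returning a predicted label and that the minority class is labelled 1 (both assumptions for illustration only):

```python
def preselect(classifiers, psi_X, psi_y, minority_label=1):
    """Keep only classifiers that, on the k validation neighbours (psi),
    correctly classify at least one minority AND one majority sample."""
    kept = []
    for clf in classifiers:
        preds = [clf(x) for x in psi_X]
        correct_min = any(p == y and y == minority_label
                          for p, y in zip(preds, psi_y))
        correct_maj = any(p == y and y != minority_label
                          for p, y in zip(preds, psi_y))
        if correct_min and correct_maj:
            kept.append(clf)  # classifier can handle both classes
    return kept
```

Classifiers that never get a minority sample right are dropped before the (more expensive) dynamic selection runs.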
Further, in Step 4 the pre-selected candidate classifiers need to be selected dynamically. The specific steps are as follows:
1. For the current sample x_q to be classified in the test set S_test, compute its k nearest neighbours in the validation set S_va and denote the k samples by £;
2. Using £ as input, each base classifier h_i in the candidate classifier pool predicts labels for £ with the true labels removed, producing the output y_out; from the predicted output y_out and the true labels y, compute the ability weight of each base classifier according to formula (3), where I(·) is the indicator function and θ_j is the weight coefficient of the class of the j-th sample;
3. After the ability weights are computed, sort them by value and take the top P% of the n base classifiers to form the base classifier set C'.
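The body of formula (3) is not reproduced in this text, so the sketch below uses an assumed class-weighted score in its place: a correct prediction on a neighbour of class j adds θ_j to the classifier's weight, with the minority-class weight θ_1 larger than the majority-class weight θ_0. The names and the exact weighting are illustrative, not the patent's formula.

```python
def dynamic_select(classifiers, nn_X, nn_y, top_pct=0.15, theta=(1.0, 2.0)):
    """Rank classifiers by class-weighted correct predictions on the k
    validation-set nearest neighbours of the test sample, keep the top P%.
    theta[1] > theta[0] so correct minority-class (label 1) hits count
    more -- an assumed stand-in for formula (3)."""
    scored = []
    for clf in classifiers:
        weight = sum(theta[y] for x, y in zip(nn_X, nn_y) if clf(x) == y)
        scored.append((weight, clf))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    n_keep = max(1, int(round(len(classifiers) * top_pct)))
    return [clf for _, clf in scored[:n_keep]]
```

With `top_pct=0.15` this matches the embodiment's "take the top 15%".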
Further, in Step 5 the selected classifier set C' provides the ensemble prediction output for the current sample to be classified. The specific steps are as follows:
1. Compute the parameters R1 and R2 according to formulas (4) and (5), where t is the number of base classifiers in the set C', P_i1 and P_i2 are the probabilities of the minority class and the majority class, respectively, that the i-th classifier assigns to the test sample, D_i1 and D_i2 are the average Euclidean distances from the test sample to the i-th base classifier's minority-class and majority-class training samples, respectively, and α is an adaptive parameter that needs to be established according to the classification algorithm used.
Before computing distances, the samples are normalised by formula (6):
x' = (x - x_min) / (x_max - x_min)   (6)
where x' is the value after normalisation, x is the value before normalisation, and x_max and x_min are the maximum and minimum values in the sample data, respectively;
2. Compare the values of R1 and R2: if R1 > R2, the current sample is classified as the minority class, otherwise as the majority class.
Repeat Steps 3, 4 and 5 until all samples in the test set S_test have been classified.
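The exact bodies of formulas (4) and (5) are not reproduced in this text; the sketch below is therefore one assumed reading consistent with the quantities described, in which each classifier's class probability is discounted by the α-scaled mean Euclidean distance to that class, so that nearer classes contribute more. The min-max helper implements formula (6).

```python
def min_max(x, x_min, x_max):
    # Formula (6): min-max normalisation to [0, 1] before computing distances.
    return (x - x_min) / (x_max - x_min)

def integrate(probs, dists, alpha=1.0):
    """Assumed stand-in for formulas (4)-(5).
    probs[i] = (P_i1, P_i2): minority/majority probabilities from classifier i.
    dists[i] = (D_i1, D_i2): mean Euclidean distances from the test sample to
    classifier i's minority/majority training samples."""
    # Discount each probability by distance**alpha (illustrative form).
    r1 = sum(p1 / (d1 ** alpha) for (p1, _p2), (d1, _d2) in zip(probs, dists))
    r2 = sum(p2 / (d2 ** alpha) for (_p1, p2), (_d1, d2) in zip(probs, dists))
    label = "minority" if r1 > r2 else "majority"
    return label, r1, r2
```

Whatever the exact formula, the decision rule of step 2 is the comparison R1 > R2 shown in the last lines.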
By the above method, the present invention achieves the following:
(1) The subsets obtained with the random balancing method are diverse, which guarantees that the base classifiers built on them are diverse.
(2) The added pre-selection method ensures that the subsequent dynamic selection algorithm can select base classifiers better and faster.
(3) The dynamic selection algorithm selects base classifiers with stronger ability for each sample to be classified, avoiding the decline in generalisation performance caused by letting poorly performing base classifiers take part in the final decision output.
(4) The proposed ensemble rule combines the outputs of all the base classifiers and considers the relationship between the training set and the test set, namely that a sample to be classified should preferentially be assigned to the class of its nearest samples. This ensemble rule effectively fuses the multiple output values into the ensemble output and improves ensemble output accuracy.
Detailed description of the invention
Fig. 1 is the implementation flow chart of the invention.
Specific embodiment
With reference to Fig. 1, the flow chart of the steps of the present invention, the implementation process of the invention is explained in detail. The embodiments of the present invention are implemented on the premise of the technical scheme of the present invention; detailed implementation modes and specific operating processes are given, but the protection scope of the present invention is not limited to the following embodiments.
An adaptive ensemble classification method for imbalanced data based on Euclidean distance comprises the generation of the candidate classifier pool, the dynamic selection of a set of base classifiers with stronger classification ability, and the adaptive ensemble output of the base classifiers. It comprises the following steps in sequence:
(1) Pre-process the data to obtain the training set, validation set and test set, and apply the random balancing method to the training set to obtain m balanced subsets;
(2) Train m homogeneous classifiers on these m balanced subsets with the same classification learning algorithm, forming the candidate classifier pool;
(3) Pre-select base classifiers in the candidate classifier pool, deleting the classifiers that lack the ability to classify minority-class samples;
(4) From the classifier pool screened in step (3), use the dynamic selection algorithm to pick out the candidate sub-classifiers with the strongest classification ability on samples in the region surrounding the test sample;
(5) Use a distance-based adaptive ensemble rule to output the selected base classifiers' prediction for the test sample.
Diverse subsets are obtained with the random balancing method: a value between the majority-class sample count and the minority-class sample count is assigned at random, the majority class is undersampled and the minority class is oversampled to reach that value, achieving data balance; the steps are repeated until the number of generated subsets reaches the desired number.
To reduce the number of candidate classifiers, improve the efficiency of the dynamic selection algorithm, and help it select base classifiers with stronger ability, a pre-selection method deletes part of the base classifiers. Specifically, the k nearest neighbours of the current test sample are taken from the validation set and used as input to every sub-classifier in the candidate classifier pool for prediction; base classifiers that lack the ability to distinguish minority-class samples are deleted.
The classification ability of each base classifier is computed from how it classifies the validation-set nearest neighbours of the test sample. Specifically, the k nearest neighbours of the current test sample are taken in the validation set, each base classifier in the candidate pool predicts outputs for these k neighbours, and the base classifiers that both classify the minority class relatively well and maintain overall accuracy are chosen. This differs from traditional dynamic selection algorithms, which are mostly designed to preserve overall accuracy; base classifiers chosen that way on imbalanced samples are biased towards the majority class.
Each base classifier provides an output for the sample to be predicted. The ensemble rule considers not only the output of each base classifier but also the relationship between the sample to be classified and the training samples. In the formulas, t is the number of base classifiers, P_i1 and P_i2 are the probabilities of class 1 and class 2, respectively, that the i-th classifier assigns to the test sample, D_i1 and D_i2 are the average Euclidean distances from the test sample to the i-th base classifier's class-1 and class-2 training samples, respectively, and α is an adaptive parameter established according to the classification algorithm used.
If R1 > R2, the current sample is classified as class 1, otherwise as class 2.
This embodiment uses the ecoli046vs5 data set from KEEL, a public standard imbalanced data repository. The ecoli046vs5 data set contains 203 samples in total, each with 7 attributes: 20 minority-class samples and 183 majority-class samples, for an imbalance ratio of 9.15. The imbalanced data classification process is as follows:
(1) Divide the original imbalanced learning sample set so that the sample counts of the training set, validation set and test set are in the ratio 8:1:1, ensuring that the ratio of majority-class to minority-class samples in each divided set is consistent with that of the original imbalanced learning sample set.
(2) The specific steps of random balancing on the training set are:
1. Draw a random number num_rand according to formula (1);
2. Oversample the minority-class samples of the training set S_train according to formula (2) until their number reaches num_rand, and undersample the majority-class samples until their number reaches num_rand, obtaining one balanced subset;
3. Repeat steps 1 and 2 until 100 balanced subsets are obtained.
(3) Train 100 homogeneous classifiers on these 100 balanced subsets with the decision tree algorithm, forming the candidate classifier pool;
(4) Apply the pre-selection method to the base classifiers obtained in step (3). The specific steps are as follows:
1. For the current sample x_q to be classified in the test set S_test, compute its 7 nearest neighbours in the validation set S_va. If the 7 nearest neighbours contain samples of different classes, record the 7 current neighbours as Ψ; if the 7 nearest neighbours all belong to the same class, go to step (5);
2. Using Ψ as input, each base classifier h_i in the candidate classifier pool predicts labels for Ψ with the true labels removed, producing the output y_p;
3. Compare the predicted output y_p with the true labels y of Ψ; delete any base classifier that cannot simultaneously classify correctly at least one minority-class and one majority-class sample. After deletion, n base classifiers remain in the candidate pool.
(5) Dynamically select among the n base classifiers obtained in step (4). The specific steps are as follows:
1. For the current sample x_q to be classified in the test set S_test, compute its 7 nearest neighbours in the validation set S_va and denote the 7 samples by £;
2. Using £ as input, each base classifier h_i in the candidate classifier pool predicts labels for £ with the true labels removed, producing the output y_out; from the predicted output y_out and the true labels y, compute the ability weight of each base classifier according to formula (3);
3. After the ability weights are computed, sort them by value and take the top 15% of the n base classifiers to form the base classifier set C'.
(6) To determine the value of α in formulas (4)-(5), cross-validation over different α values is performed on the validation set; with decision trees the resulting α value is 1. Substituting this α value into formulas (4)-(5), the values of R1 and R2 are computed and compared: if R1 > R2, the current sample is classified as the minority class, otherwise as the majority class.
Steps (4), (5) and (6) are repeated until all samples in the test set S_test have been classified.
To better illustrate the validity of the algorithm, it is compared against the decision tree algorithm used alone and the decision tree algorithm after SMOTE processing, and AUC is used as the evaluation index to quantify the final output.
Table 1: classification results of different methods on the ecoli046vs5 data set
As can be seen from Table 1, in the ecoli046vs5 imbalanced data classification experiment the Euclidean-distance-based adaptive ensemble classification method for imbalanced data proposed in this application obtains an AUC value of 0.9192, an improvement in classification performance over the other typical processing methods. The experimental results show that the method effectively combines the respective advantages of the dynamic selection algorithm and of the ensemble rule design, and can effectively improve the prediction accuracy on imbalanced data and the generalisation ability of the ensemble model.
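AUC, the evaluation index reported above, can be computed without any library via its rank interpretation: the probability that a randomly chosen minority (positive) sample receives a higher score than a randomly chosen majority (negative) sample, with ties counted as half. This is a generic sketch, not code from the patent.

```python
def auc(scores_pos, scores_neg):
    """Rank-based AUC over positive (minority) and negative (majority)
    scores; ties contribute 0.5."""
    wins = 0.0
    for sp in scores_pos:
        for sn in scores_neg:
            wins += 1.0 if sp > sn else (0.5 if sp == sn else 0.0)
    return wins / (len(scores_pos) * len(scores_neg))
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, which is why AUC, unlike overall accuracy, is not inflated by always predicting the majority class.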
Claims (5)
1. An adaptive ensemble classification method for imbalanced data based on Euclidean distance, characterised by comprising the following steps:
Step 1: pre-process the data to obtain diverse balanced subsets;
Step 2: train m homogeneous classifiers on the m balanced subsets using the same classification learning algorithm, forming a candidate classifier pool;
Step 3: pre-select base classifiers in the candidate classifier pool, deleting the classifiers that lack the ability to classify minority-class samples;
Step 4: from the classifier pool screened in Step 3, use a dynamic selection algorithm to pick out the candidate sub-classifiers with strong classification ability on samples in the region surrounding the test sample, forming the base classifier set;
Step 5: use a distance-based adaptive ensemble rule to combine the selected base classifier set's predictions for the test sample into the output.
2. The adaptive ensemble classification method for imbalanced data based on Euclidean distance according to claim 1, characterised in that in Step 1 the data pre-processing comprises obtaining the training set, validation set and test set, and obtaining balanced subsets from the training set by random balancing; the specific steps are as follows:
1. Divide the samples of the original data set into a training set S_train, a validation set S_va and a test set S_test in the quantity ratio a:b:c, ensuring that the ratio of majority-class to minority-class samples in each of the divided training, validation and test sets is consistent with that of the original data set;
2. Draw a random number num_rand according to formula (1):
num_rand = S_min + rand(0,1) * (S_max - S_min)   (1)
where S_min is the number of minority-class samples in the training set S_train, rand(0,1) is a random number between 0 and 1, and S_max is the number of majority-class samples in S_train;
3. Sample from the majority-class samples of S_train at random without replacement until the newly formed sample set contains num_rand samples; meanwhile oversample the minority class according to formula (2), adding each generated sample z to the minority-class samples and repeating until the number of minority-class samples reaches num_rand; merging the newly formed majority-class samples with the oversampled minority-class samples yields one balanced subset;
z = β·p + (1 - β)·q   (2)
where p and q are minority-class samples in S_train and β is a random number between 0 and 1;
4. Repeat steps 2 and 3 until m balanced subsets are obtained.
3. The adaptive ensemble classification method for imbalanced data based on Euclidean distance according to claim 1, characterised in that in Step 3 the base classifiers in the candidate classifier pool need to be pre-selected; the specific steps are as follows:
1. For the current sample x_q to be classified in the test set S_test, compute its k nearest neighbours in the validation set S_va; if the k nearest neighbours contain samples of different classes, record the k current neighbours as Ψ; if the k nearest neighbours all belong to the same class, go to Step 4;
2. Using Ψ as input, each base classifier h_i in the candidate classifier pool predicts labels for Ψ with the true labels removed, producing the output y_p;
3. Compare the predicted output y_p with the true labels y of Ψ; delete any base classifier that cannot simultaneously classify correctly at least one minority-class and one majority-class sample; after deletion, n base classifiers remain in the candidate pool.
4. The adaptive ensemble classification method for imbalanced data based on Euclidean distance according to claim 1, characterised in that in Step 4 the pre-selected candidate classifiers need to be selected dynamically; the specific steps are as follows:
1. For the current sample x_q to be classified in the test set S_test, compute its k nearest neighbours in the validation set S_va and denote the k samples by £;
2. Using £ as input, each base classifier h_i in the candidate classifier pool predicts labels for £ with the true labels removed, producing the output y_out; from the predicted output y_out and the true labels y, compute the ability weight of each base classifier according to formula (3), where I(·) is the indicator function and θ_j is the weight coefficient of the class of the j-th sample;
3. After the ability weights are computed, sort them by value and take the top P% of the n base classifiers to form the base classifier set C'.
5. The adaptive ensemble classification method for imbalanced data based on Euclidean distance according to claim 4, characterised in that in Step 5 the selected classifier set C' provides the ensemble prediction output for the current sample to be classified; the specific steps are as follows:
1. Compute the parameters R1 and R2 according to formulas (4) and (5), where t is the number of base classifiers in the set C', P_i1 and P_i2 are the probabilities of the minority class and the majority class, respectively, that the i-th classifier assigns to the test sample, D_i1 and D_i2 are the average Euclidean distances from the test sample to the i-th base classifier's minority-class and majority-class training samples, respectively, and α is an adaptive parameter;
before computing distances, the samples are normalised by formula (6):
x' = (x - x_min) / (x_max - x_min)   (6)
where x' is the value after normalisation, x is the value before normalisation, and x_max and x_min are the maximum and minimum values in the sample data, respectively;
2. Compare the values of R1 and R2: if R1 > R2, the current sample is classified as the minority class, otherwise as the majority class.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910832525.4A CN110533116A (en) | 2019-09-04 | 2019-09-04 | Adaptive ensemble classification method for imbalanced data based on Euclidean distance
Publications (1)
Publication Number | Publication Date |
---|---|
CN110533116A true CN110533116A (en) | 2019-12-03 |
Family
ID=68666803
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910832525.4A Pending CN110533116A (en) | Adaptive ensemble classification method for imbalanced data based on Euclidean distance
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110533116A (en) |
Cited By (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111080442A (en) * | 2019-12-21 | 2020-04-28 | 湖南大学 | Credit scoring model construction method, device, equipment and storage medium |
CN111210343A (en) * | 2020-02-21 | 2020-05-29 | 浙江工商大学 | Credit card fraud detection method based on unbalanced stream data classification |
CN111210343B (en) * | 2020-02-21 | 2022-03-29 | 浙江工商大学 | Credit card fraud detection method based on unbalanced stream data classification |
CN112035719A (en) * | 2020-09-01 | 2020-12-04 | 渤海大学 | Class imbalance data classification method and system based on convex polyhedron classifier |
CN112035719B (en) * | 2020-09-01 | 2024-02-20 | 渤海大学 | Category imbalance data classification method and system based on convex polyhedron classifier |
CN113204481A (en) * | 2021-04-21 | 2021-08-03 | 武汉大学 | Class imbalance software defect prediction method based on data resampling |
CN113204481B (en) * | 2021-04-21 | 2022-03-04 | 武汉大学 | Class imbalance software defect prediction method based on data resampling |
CN113673573A (en) * | 2021-07-22 | 2021-11-19 | 华南理工大学 | Anomaly detection method based on self-adaptive integrated random fuzzy classification |
CN113673573B (en) * | 2021-07-22 | 2024-04-30 | 华南理工大学 | Abnormality detection method based on self-adaptive integrated random fuzzy classification |
CN114220026A (en) * | 2021-12-30 | 2022-03-22 | 杭州电子科技大学 | Sea surface small target detection method based on multi-classification idea |
CN114548327A (en) * | 2022-04-27 | 2022-05-27 | 湖南工商大学 | Software defect prediction method, system, device and medium based on balanced subsets |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110533116A (en) | Adaptive ensemble classification method for imbalanced data based on Euclidean distance | |
CN109492026B (en) | Telecommunication fraud classification detection method based on improved active learning technology | |
CN106326913A (en) | Money laundering account determination method and device | |
WO2019179403A1 (en) | Fraud transaction detection method based on sequence width depth learning | |
Ahalya et al. | Data clustering approaches survey and analysis | |
CN110147321A (en) | Recognition method for defect-prone high-risk modules based on software networks
CN107766418A (en) | Credit evaluation method based on a fusion model, electronic device and storage medium
CN108363810A (en) | Text classification method and device
CN106228554B (en) | Coal dust image segmentation method based on fuzzy rough sets and multi-attribute reduction
CN108319987A (en) | Combined filter-wrapper traffic feature selection method based on support vector machines
CN109886284B (en) | Fraud detection method and system based on hierarchical clustering | |
CN107273387A (en) | Ensemble classification for high-dimensional and imbalanced data
CN109739844A (en) | Data classification method based on decaying weights
CN110147760A (en) | Efficient method for power quality disturbance image feature extraction and recognition
CN112633337A (en) | Unbalanced data processing method based on clustering and boundary points | |
CN110377605A (en) | Sensitive attribute identification and classification grading method for structured data
CN112417176B (en) | Method, equipment and medium for mining implicit association relation between enterprises based on graph characteristics | |
CN110134719A (en) | Structured data sensitive attribute identification and classification grading method
CN104850868A (en) | Customer segmentation method based on k-means and neural network clustering
CN112001788A (en) | Credit card default fraud identification method based on RF-DBSCAN algorithm | |
CN106934410A (en) | Data classification method and system
CN109993042A (en) | Face recognition method and device
CN110334773A (en) | Machine-learning-based screening method for model input features
Dong | Application of Big Data Mining Technology in Blockchain Computing | |
CN110516741A (en) | Class-overlap imbalanced data classification method based on dynamic classifier selection |
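The patent's full text is not reproduced on this page, but its translated title and classification codes (ensemble generation via bagging/boosting, nearest-neighbour distance classification) suggest an ensemble of base classifiers trained on rebalanced subsets, with votes weighted adaptively by Euclidean distance at test time. The sketch below illustrates that general idea only; every name, the nearest-centroid base learner, and the inverse-distance weighting scheme are hypothetical choices, not taken from the patent:

```python
import math
import random

def euclid(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def centroid(points):
    """Component-wise mean of a list of vectors."""
    n = len(points)
    return [sum(p[i] for p in points) / n for i in range(len(points[0]))]

class NearestCentroid:
    """Toy base classifier: predict the class with the closest centroid."""
    def fit(self, X, y):
        self.cents = {c: centroid([x for x, t in zip(X, y) if t == c])
                      for c in set(y)}
        return self

    def predict(self, x):
        return min(self.cents, key=lambda c: euclid(x, self.cents[c]))

def adaptive_ensemble(X_min, X_maj, n_learners=5, seed=0):
    """Train base learners on balanced subsets: the full minority class
    plus an equal-sized random undersample of the majority class."""
    rng = random.Random(seed)
    learners = []
    for _ in range(n_learners):
        maj_sub = rng.sample(X_maj, len(X_min))
        X = X_min + maj_sub
        y = [1] * len(X_min) + [0] * len(maj_sub)
        learners.append((NearestCentroid().fit(X, y), centroid(X)))
    return learners

def ensemble_predict(learners, x):
    """Distance-weighted vote: learners whose training subset lies closer
    (in Euclidean distance) to the test sample count more."""
    votes = {0: 0.0, 1: 0.0}
    for clf, subset_center in learners:
        w = 1.0 / (1.0 + euclid(x, subset_center))
        votes[clf.predict(x)] += w
    return max(votes, key=votes.get)

# Tiny demo: minority cluster near (5, 5), majority cluster near the origin.
X_min = [[5, 5], [5, 6], [6, 5]]
X_maj = [[0, 0], [0, 1], [1, 0], [1, 1], [0, 2],
         [2, 0], [2, 1], [1, 2], [2, 2], [0.5, 0.5]]
model = adaptive_ensemble(X_min, X_maj)
print(ensemble_predict(model, [5.5, 5.5]))  # falls in the minority region
print(ensemble_predict(model, [0.5, 0.5]))  # falls in the majority region
```

Undersampling to balanced subsets keeps each base learner unbiased toward the majority class, while the per-sample distance weighting lets the ensemble favour learners whose training data resembles the query.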
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
RJ01 | Rejection of invention patent application after publication | |
Application publication date: 2019-12-03 |