CN106372655A - Synthetic method for minority class samples in non-balanced IPTV data set - Google Patents

Synthetic method for minority class samples in non-balanced IPTV data set

Info

Publication number
CN106372655A
Authority
CN
China
Prior art keywords
sample
minority class
samples
dangerous
major
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201610753263.9A
Other languages
Chinese (zh)
Inventor
魏昕
李智林
周亮
黄若尘
刘榕华
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing Post and Telecommunication University
Nanjing University of Posts and Telecommunications
Original Assignee
Nanjing Post and Telecommunication University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing Post and Telecommunication University
Priority to CN201610753263.9A
Publication of CN106372655A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2413 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on distances to training or reference patterns
    • G06F18/24147 Distances to closest patterns, e.g. nearest neighbour classification

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a method for synthesizing minority class samples in an imbalanced IPTV data set. It aims to overcome a defect of existing minority class synthesis methods, which generate new samples directly without first analyzing the minority samples and thereby degrade the performance of the subsequent classification and prediction model. The method first finds the neighbor set of each minority class sample; divides the minority class samples into a noise set, a safe set and a dangerous set according to the proportion of classes among their neighbors; leaves the samples in the noise set unprocessed; calculates the ratio of the size of the safe set to the size of the dangerous set and derives a selection probability from it; selects the safe set or the dangerous set according to that probability; and generates new minority class samples from the samples in the selected set. The method excludes minority class samples that have a negative effect on classification, makes better use of the minority class samples near the classification boundary, and yields new minority class samples that better improve the performance of the subsequent classification and prediction model.

Description

A method for synthesizing minority class samples on an imbalanced IPTV data set
Technical field
The present invention relates to the field of imbalanced data processing, and in particular to a method for synthesizing minority class samples on an imbalanced IPTV data set.
Background
As domestic fixed-network operators transform their business, Internet-based value-added services have become an important source of new revenue growth for operators, and Internet Protocol television (IPTV) services in particular have grown rapidly. IPTV has the following characteristics: (1) users can obtain high-quality digital media services; (2) users can freely select video programs over a broadband IP network; (3) it offers operators a broad emerging market. In recent years, operators and researchers have worked to improve the experience and satisfaction of IPTV users by studying the key factors that affect user quality of experience (QoE).
In existing solutions, the user's QoE is predicted with machine learning models and related techniques, based on status data collected from IPTV set-top boxes and on users' fault report data. In most cases, however, the IPTV network is in good condition, the user experience is good, and no fault is reported; only in a limited number of cases is the user experience poor and a fault reported. The data collected by the set-top boxes are therefore imbalanced. There are two classes: "user reported a fault" and "user did not report a fault". The number of samples in the fault-report class is far smaller than the number in the no-report class, so in this problem the fault-report class is the minority class and the no-report class is the majority class.
To handle imbalanced data, it is often necessary to synthesize some minority class samples according to the characteristics of the existing data so that the two classes reach a balance in size. Among existing methods, the synthetic minority over-sampling technique (SMOTE) is an over-sampling technique frequently used to synthesize minority class samples. Although the SMOTE algorithm has many advantages, it still has defects, including overfitting and data variability. In particular, SMOTE generates the same number of synthetic samples for each minority sample without taking neighboring samples into account, which increases the probability of sample overlap within the minority class. In addition, some minority class samples lie near the classification boundary and play a key role for the subsequent classifier, while others lie inside the majority class and are noise; generating minority class samples from the latter works against classification. The existing SMOTE algorithm does not consider these issues. The present invention therefore addresses these defects of the SMOTE technique, so as to better solve the data imbalance problem in IPTV user QoE prediction.
Summary of the invention
The technical problem to be solved by the present invention is to provide, in view of the deficiencies described in the background, a method for synthesizing minority class samples on an imbalanced IPTV data set.
To solve the above technical problem, the present invention adopts the following technical solution:
A method for synthesizing minority class samples on an imbalanced IPTV data set, comprising the following steps:
Step 1: for each sample point $x_i$ in the minority class sample set $x_{minor}$, find its k-nearest-neighbor set $s_i$, where k is a natural number, $i = 1, \dots, n$, and $x_i \in x_{minor}$; the k-nearest-neighbor set is the set formed by the k samples nearest to $x_i$;
Step 2: analyze the characteristics of each minority class sample using the k-nearest-neighbor sets obtained in step 1, and divide the minority class samples into three sets: a noise set, a safe set and a dangerous set;
Step 3: leave the samples in the noise set unprocessed, and calculate the ratio t between the number of samples in the safe set and the number of samples in the dangerous set;
Step 4: generate a random number b uniformly distributed on the interval [0, 1]; if $b \in [0, t/(t+1)]$, select all samples in the dangerous set as input to the standard SMOTE algorithm to generate new minority class samples; otherwise, select all samples in the safe set as input to the standard SMOTE algorithm to generate new minority class samples;
Step 5: combine the original minority class samples and the newly generated minority class samples to form the new minority class set.
As a further preferred scheme of the method for synthesizing minority class samples on an imbalanced IPTV data set according to the present invention, step 2 comprises the following steps:
Step 2.1: count the number of samples in $s_i$ that belong to the majority class $x_{major}$, denoted $|s_i \cap x_{major}|$, i.e. the number of samples in the intersection of the majority class sample set $x_{major}$ and $s_i$.
Step 2.2: determine the interval in which $|s_i \cap x_{major}|$ falls; there are three cases:
If $|s_i \cap x_{major}| = k$, the current sample $x_i$ lies inside the majority class and is regarded as noise for the classification problem; all samples in $x_{minor}$ that satisfy this condition form the noise set;
If $0 \le |s_i \cap x_{major}| < 0.5k$, the current sample $x_i$ has little risk of being misclassified; all samples in $x_{minor}$ that satisfy this condition form the safe set;
If $0.5k \le |s_i \cap x_{major}| < k$, the current sample $x_i$ is at risk of being misclassified; all samples in $x_{minor}$ that satisfy this condition form the dangerous set.
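The three cases of step 2.2 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `partition_minority`, the dictionary `neighbor_labels` (mapping each minority sample index to the class labels of its k nearest neighbors) and the label value 0 for the majority class are hypothetical names chosen here, not part of the patent text.

```python
def partition_minority(neighbor_labels, k, majority_label=0):
    """Split minority samples into noise, safe and dangerous sets.

    neighbor_labels: dict {sample index: labels of its k nearest neighbors}
    (a hypothetical input format assumed for this sketch).
    """
    noise, safe, dangerous = [], [], []
    for i, labels in neighbor_labels.items():
        # m plays the role of |s_i ∩ x_major|: majority samples among the neighbors
        m = sum(1 for lab in labels if lab == majority_label)
        if m == k:                # every neighbor is a majority sample
            noise.append(i)
        elif m < 0.5 * k:         # 0 <= m < 0.5k
            safe.append(i)
        else:                     # 0.5k <= m < k
            dangerous.append(i)
    return noise, safe, dangerous
```

With k = 4, a sample whose neighbors are all majority samples goes to the noise set, one with a single majority neighbor goes to the safe set, and one with two or three majority neighbors goes to the dangerous set.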
As a further preferred scheme of the method for synthesizing minority class samples on an imbalanced IPTV data set according to the present invention, the SMOTE algorithm in step 4 is computed as follows: let the current sample be $x_i$; randomly select a sample $x_j$ from the k-nearest-neighbor set $s_i$ of this sample, and generate a random number $\delta$ uniformly distributed on the interval [0, 1]; the newly generated minority class sample is then $x_{new} = x_i + \delta \times (x_j - x_i)$.
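The interpolation formula above can be illustrated with a short NumPy sketch. The function name `smote_sample` and its arguments are hypothetical; the sketch assumes the neighbor set is passed in as an array of feature vectors. Note that NumPy's `Generator.random()` draws from the half-open interval [0, 1), which is a negligible difference from the closed interval in the text.

```python
import numpy as np

def smote_sample(x_i, neighbors, rng):
    """Generate one synthetic sample x_new = x_i + delta * (x_j - x_i)."""
    x_i = np.asarray(x_i, dtype=float)
    neighbors = np.asarray(neighbors, dtype=float)
    # pick x_j at random from the k-nearest-neighbor set of x_i
    x_j = neighbors[rng.integers(len(neighbors))]
    # delta is uniform on [0, 1), so x_new lies on the segment from x_i to x_j
    delta = rng.random()
    return x_i + delta * (x_j - x_i)

rng = np.random.default_rng(0)
x_new = smote_sample([0.0, 0.0], [[2.0, 2.0], [4.0, 0.0]], rng)
```

Because the new point is a convex combination of $x_i$ and $x_j$, it always lies on the line segment joining the two samples.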
Compared with the prior art, the above technical solution has the following technical effects:
1. By generating minority class samples, the present invention can address problems such as classification and prediction on imbalanced data;
2. The present invention classifies the minority class samples and does not use minority class samples that are submerged among majority class samples to generate new samples, which avoids the performance degradation that noise causes in subsequent classification. In addition, because the samples in the dangerous set lie near the boundary between the two classes, generating as many new minority class samples as possible from the samples in this set helps to substantially improve the performance of subsequent classification and prediction methods;
3. The present invention can avoid the data overlap problem that arises when the traditional SMOTE algorithm generates minority class samples.
Brief description of the drawings
Fig. 1 is a flow chart of the method for synthesizing minority class samples on an imbalanced IPTV data set according to the present invention;
Fig. 2 shows the G-mean comparison results of three methods for processing the imbalanced IPTV data set under the kNN classifier;
Fig. 3 shows the G-mean comparison results of three methods for processing the imbalanced IPTV data set under the C4.5 classifier;
Fig. 4 shows the G-mean comparison results when the minority class data generated by the standard SMOTE method and by the method proposed in the present invention are used as the test set.
Detailed description
The technical solution of the present invention is described in further detail below with reference to the accompanying drawings:
As shown in Fig. 1, a method for synthesizing minority class samples on an imbalanced IPTV data set comprises the following steps:
Step 1: find the k-nearest-neighbor set $s_i$ of every minority class sample point, where k is a natural number and i is a positive integer;
Step 2: analyze the characteristics of each minority class sample using the k-nearest-neighbor sets obtained in step 1, and divide the minority class samples into three sets: a noise set, a safe set and a dangerous set;
Step 3: leave the samples in the noise set unprocessed, and calculate the ratio t between the number of samples in the safe set and the number of samples in the dangerous set;
Step 4: generate a random number b uniformly distributed on the interval [0, 1]; if $b \in [0, t/(t+1)]$, select all samples in the dangerous set as input to the standard SMOTE algorithm to generate new minority class samples; otherwise, select all samples in the safe set as input to the standard SMOTE algorithm to generate new minority class samples;
Step 5: combine the original minority class samples and the newly generated minority class samples to form the new minority class set.
The detailed procedure of each step is as follows:
Step 1: let the data collected by the IPTV set-top boxes consist of status data $x = \{x_i\}_{i=1}^{n}$ and user fault report data $y = \{y_i\}_{i=1}^{n}$, which correspond one-to-one. Each vector $x_i$ has dimension p and reflects the IPTV network condition (delay, packet loss, stalling, etc.); $y_i$ is a scalar that marks whether the user reported a fault: if the user reported a fault, $y_i = 1$; otherwise $y_i = 0$. The minority class sample set $x_{minor}$ is thus defined as all $x_i$ with $y_i = 1$, $i = 1, \dots, n$; the majority class sample set $x_{major}$ is defined as all $x_i$ with $y_i = 0$, $i = 1, \dots, n$, i.e. $x_{major} = x \setminus x_{minor}$. For each sample $x_i \in x_{minor}$ in the minority class, compute its Euclidean distance to all samples in $x$ and select the k nearest samples to form the k-nearest-neighbor set $s_i$ of $x_i$.
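A brute-force version of this neighbor search can be sketched with NumPy as follows. The names `k_neighbor_sets`, `X` and `y` are hypothetical. Excluding the sample itself from its own neighbor set is an assumption of this sketch, since the text does not say so explicitly. A data set as large as the one in the embodiment would normally call for a spatial index rather than this exhaustive scan.

```python
import numpy as np

def k_neighbor_sets(X, y, k):
    """Return {i: indices of the k nearest samples} for each minority sample (y == 1)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    neighbor_sets = {}
    for i in np.flatnonzero(y == 1):
        # Euclidean distance from x_i to every sample in the whole data set
        dist = np.linalg.norm(X - X[i], axis=1)
        dist[i] = np.inf  # assumption: a sample is not its own neighbor
        neighbor_sets[int(i)] = np.argsort(dist, kind="stable")[:k]
    return neighbor_sets

# Four one-dimensional samples; samples 0 and 3 belong to the minority class.
sets = k_neighbor_sets([[0.0], [1.0], [2.0], [10.0]], [1, 0, 0, 1], k=2)
```

The class labels of the neighbors, `y[sets[i]]`, are what the next step counts to obtain the number of majority samples in each neighbor set.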
Step 2: analyze the characteristics of each minority class sample through its k-nearest-neighbor set and further classify the minority class samples, as follows:
(2-1) Count how many of the k samples in $s_i$ belong to the majority class $x_{major}$, i.e. obtain $|s_i \cap x_{major}|$. This can be done by counting the class labels y of the samples in $s_i$.
(2-2) Determine the interval in which $|s_i \cap x_{major}|$ falls; there are three cases:
Case 1: if $|s_i \cap x_{major}| = k$, the current sample $x_i$ lies inside the majority class and is regarded as noise for the classification problem. The set of all samples in $x_{minor}$ that satisfy this condition is defined as the "noise set";
Case 2: if $0 \le |s_i \cap x_{major}| < 0.5k$, the current sample $x_i$ has little risk of being misclassified. The set of all samples in $x_{minor}$ that satisfy this condition is defined as the "safe set";
Case 3: if $0.5k \le |s_i \cap x_{major}| < k$, the current sample $x_i$ is at risk of being misclassified. The set of all samples in $x_{minor}$ that satisfy this condition is defined as the "dangerous set";
(2-3) The sample points in the noise set (case 1) receive no further processing, i.e. they are not used to generate new minority class samples. The sample points in the safe set (case 2) and in the dangerous set (case 3) proceed to the next step.
Step 3: calculate the ratio between the number of samples in the safe set and the number of samples in the dangerous set obtained in the previous step, denoted t.
Step 4: generate a random number uniformly distributed on the interval [0, 1], denoted b. If $b \in [0, t/(t+1)]$, all samples in the dangerous set are used as input to the standard SMOTE algorithm to generate new minority class samples; otherwise, all samples in the safe set are used as input to the standard SMOTE algorithm to generate new minority class samples. The original minority class samples and the newly generated minority class samples are then combined to form the new minority class set.
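The random choice between the two sets in steps 3 and 4 can be sketched as below. The function name `choose_seed_set` and the fallback used when the dangerous set is empty are assumptions of this sketch; the text does not describe that corner case.

```python
import random

def choose_seed_set(n_safe, n_dangerous, rng=None):
    """Return "dangerous" or "safe" following the rule b in [0, t/(t+1)]."""
    rng = rng or random
    if n_dangerous == 0:
        # assumption: with no dangerous samples, fall back to the safe set
        return "safe"
    t = n_safe / n_dangerous   # ratio of safe-set size to dangerous-set size
    b = rng.random()           # uniform random number on [0, 1)
    return "dangerous" if b <= t / (t + 1) else "safe"

choice = choose_seed_set(n_safe=30, n_dangerous=10)
```

Since $t/(t+1)$ equals the safe-set size divided by the combined size of the two sets, the dangerous set is chosen with exactly that probability. With 30 safe and 10 dangerous samples, for instance, the threshold is 0.75.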
The standard SMOTE algorithm used in this step is as follows: let the current sample be $x_i$; randomly select a sample $x_j$ from the k-nearest-neighbor set $s_i$ of this sample, and generate a random number $\delta$ uniformly distributed on the interval [0, 1]; the newly generated minority class sample is then $x_{new} = x_i + \delta \times (x_j - x_i)$.
It should be noted that the number of new samples to be generated is determined by the difference between the number of majority class samples and the number of original minority class samples. Let the numbers of majority and minority class samples be $|x_{major}|$ and $|x_{minor}|$ respectively; then $|x_{major}| - |x_{minor}|$ new minority class samples need to be generated. If this step uses the safe set (containing $n_{safe}$ samples) as the input of the standard SMOTE algorithm, each sample in the safe set needs to run the standard SMOTE algorithm $(|x_{major}| - |x_{minor}|)/n_{safe}$ times. Similarly, if this step uses the dangerous set (containing $n_{danger}$ samples) as the input of the standard SMOTE algorithm, each sample in the dangerous set needs to run the standard SMOTE algorithm $(|x_{major}| - |x_{minor}|)/n_{danger}$ times.
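As a rough worked example of this bookkeeping, the sketch below computes how many synthetic samples are needed and how many SMOTE runs fall on each sample of the selected set. The function name `runs_per_sample` is hypothetical, and spreading the remainder over the first few samples when the division is not exact is an assumption of this sketch; the text does not say how a non-integer quotient is handled.

```python
def runs_per_sample(n_major, n_minor, n_selected):
    """Return (needed, base, extra) for balancing the two classes.

    needed: number of synthetic samples, |x_major| - |x_minor|
    base:   SMOTE runs for every sample in the selected set
    extra:  number of samples that get one additional run (assumption)
    """
    needed = n_major - n_minor
    base, extra = divmod(needed, n_selected)
    return needed, base, extra

# Class sizes from the embodiment; a selected set of 1000 samples is hypothetical.
needed, base, extra = runs_per_sample(434179, 4871, 1000)
```

For these inputs 429308 synthetic samples are needed, which amounts to 429 runs for every sample in the selected set plus one extra run for 308 of them.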
Embodiment and performance evaluation
To better illustrate the advantages of the method for synthesizing minority class samples on an imbalanced IPTV data set designed in the present invention, it is applied to predicting user fault reports in an IPTV system. The two original data sets both come from Jiangsu Telecom. Data set 1 (i.e. $x$) contains IPTV key performance indicator (KPI) data from April 1 to April 10. Data set 2 (i.e. $y$) contains the users' fault report data (fault reports received by telephone).
After the raw data sets are collected, they must be cleaned to remove duplicate records, erroneous records and records with missing attribute values, and the data in data set 1 are matched one-to-one with the data in data set 2. The data in data set 1 are then labeled according to the fault report labels in data set 2, for use in training the subsequent prediction model. After cleaning, data set $x$ contains 439050 records (samples) in total, of which 4871 belong to the minority class and 434179 belong to the majority class; the dimension p of each record is 11. The meaning of each dimension is given in Table 1.
Table 1: Meaning of each dimension of the data
After data cleaning, to balance the majority and minority class samples, the method for synthesizing minority class samples on an imbalanced IPTV data set designed in the present invention is used to generate minority class samples, so that the total number of new minority class samples is 40 times the number of minority class samples in the original data set.
With 150000 majority class samples and 150000 (new) minority class samples as the training data set, the k-nearest-neighbor (kNN) classification algorithm and the C4.5 decision tree algorithm are used for model training, and the trained models classify the remaining data in the data set. Fig. 2 compares the G-mean under the kNN classifier for three methods: classifying directly without generating new minority class samples, classifying after generating new minority class samples with the standard SMOTE method only, and classifying after generating new minority class samples with the method proposed in the present invention.
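The text does not spell out how the G-mean is computed. The sketch below assumes the definition commonly used for two-class imbalanced problems, the geometric mean of the true positive rate on the minority class and the true negative rate on the majority class. The function name `g_mean` and the label convention (1 for minority, 0 for majority) are assumptions, and both classes are assumed to be present in the test labels.

```python
import math

def g_mean(y_true, y_pred):
    """Geometric mean of sensitivity (minority recall) and specificity (majority recall)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    sensitivity = tp / (tp + fn)  # assumes at least one minority sample
    specificity = tn / (tn + fp)  # assumes at least one majority sample
    return math.sqrt(sensitivity * specificity)

score = g_mean([1, 1, 0, 0, 0, 0], [1, 0, 0, 0, 0, 1])
```

Unlike plain accuracy, this measure drops to zero when a classifier labels every sample as the majority class, which is why it is a natural choice for imbalanced data.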
In Fig. 2 and Fig. 3, the ratios of minority class to majority class in the data sets being classified are 1:20 (6000:12000), 1:25 (6000:15000) and 1:30 (6000:18000) respectively. Under these three test ratios, for both the kNN classifier and the C4.5 classifier, the G-mean of the minority class sample synthesis method designed in the present invention is higher than that of the other two methods. Comparing Fig. 2 with Fig. 3 shows that, compared with the C4.5 classifier, the kNN classifier combined with the proposed minority class sample synthesis method gives better classification performance.
In addition, the minority class data generated by the standard SMOTE method and by the method proposed in the present invention are used as test sets to compare the G-mean of the two methods. In Fig. 4, the numbers on the horizontal axis represent the number of minority class samples generated by the standard SMOTE method and by the proposed method respectively. In the test data, the ratio of minority class to majority class is 1:20. In all three cases, the G-mean of the proposed method is higher than that of the standard SMOTE method.
The experimental results show that the minority class sample synthesis method designed in the present invention significantly improves classification and prediction performance on existing imbalanced IPTV data sets.

Claims (3)

1. A method for synthesizing minority class samples on an imbalanced IPTV data set, characterized in that it comprises the following steps:
Step 1: for each sample point $x_i$ in the minority class sample set $x_{minor}$, find its k-nearest-neighbor set $s_i$, where k is a natural number, $i = 1, \dots, n$, and $x_i \in x_{minor}$; the k-nearest-neighbor set is the set formed by the k samples nearest to $x_i$;
Step 2: analyze the characteristics of each minority class sample using the k-nearest-neighbor sets obtained in step 1, and divide the minority class samples into three sets: a noise set, a safe set and a dangerous set;
Step 3: leave the samples in the noise set unprocessed, and calculate the ratio t between the number of samples in the safe set and the number of samples in the dangerous set;
Step 4: generate a random number b uniformly distributed on the interval [0, 1]; if $b \in [0, t/(t+1)]$, select all samples in the dangerous set as input to the standard SMOTE algorithm to generate new minority class samples; otherwise, select all samples in the safe set as input to the standard SMOTE algorithm to generate new minority class samples;
Step 5: combine the original minority class samples and the newly generated minority class samples to form the new minority class set.
2. The method for synthesizing minority class samples on an imbalanced IPTV data set according to claim 1, characterized in that step 2 comprises the following steps:
Step 2.1: count the number of samples in $s_i$ that belong to the majority class $x_{major}$, denoted $|s_i \cap x_{major}|$, i.e. the number of samples in the intersection of the majority class sample set $x_{major}$ and $s_i$;
Step 2.2: determine the interval in which $|s_i \cap x_{major}|$ falls; there are three cases:
If $|s_i \cap x_{major}| = k$, the current sample $x_i$ lies inside the majority class and is regarded as noise for the classification problem; all samples in $x_{minor}$ that satisfy this condition form the noise set;
If $0 \le |s_i \cap x_{major}| < 0.5k$, the current sample $x_i$ has little risk of being misclassified; all samples in $x_{minor}$ that satisfy this condition form the safe set;
If $0.5k \le |s_i \cap x_{major}| < k$, the current sample $x_i$ is at risk of being misclassified; all samples in $x_{minor}$ that satisfy this condition form the dangerous set.
3. The method for synthesizing minority class samples on an imbalanced IPTV data set according to claim 1, characterized in that the SMOTE algorithm in step 4 is computed as follows: let the current sample be $x_i$; randomly select a sample $x_j$ from the k-nearest-neighbor set $s_i$ of this sample, and generate a random number $\delta$ uniformly distributed on the interval [0, 1]; the newly generated minority class sample is then $x_{new} = x_i + \delta \times (x_j - x_i)$.
CN201610753263.9A 2016-08-26 2016-08-26 Synthetic method for minority class samples in non-balanced IPTV data set Pending CN106372655A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610753263.9A CN106372655A (en) 2016-08-26 2016-08-26 Synthetic method for minority class samples in non-balanced IPTV data set

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610753263.9A CN106372655A (en) 2016-08-26 2016-08-26 Synthetic method for minority class samples in non-balanced IPTV data set

Publications (1)

Publication Number Publication Date
CN106372655A true CN106372655A (en) 2017-02-01

Family

ID=57903855

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610753263.9A Pending CN106372655A (en) 2016-08-26 2016-08-26 Synthetic method for minority class samples in non-balanced IPTV data set

Country Status (1)

Country Link
CN (1) CN106372655A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109871862A (en) * 2018-12-28 2019-06-11 北京航天测控技术有限公司 A kind of failure prediction method based on synthesis minority class over-sampling and deep learning
CN111782904A (en) * 2019-12-10 2020-10-16 国网天津市电力公司电力科学研究院 Improved SMOTE algorithm-based unbalanced data set processing method and system
CN112365060A (en) * 2020-11-13 2021-02-12 广东电力信息科技有限公司 Preprocessing method for power grid internet of things perception data

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102945280A (en) * 2012-11-15 2013-02-27 翟云 Unbalanced data distribution-based multi-heterogeneous base classifier fusion classification method
CN103500159A (en) * 2013-09-06 2014-01-08 西安交通大学 Method for recognizing topics of nonequilibrium interactive texts based on example obtaining
CN103927874A (en) * 2014-04-29 2014-07-16 东南大学 Automatic incident detection method based on under-sampling and used for unbalanced data set
CN105589806A (en) * 2015-12-17 2016-05-18 北京航空航天大学 SMOTE+Boosting algorithm based software defect tendency prediction method
CN105760889A (en) * 2016-03-01 2016-07-13 中国科学技术大学 Efficient imbalanced data set classification method

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
HUI HAN等: ""Borderline-SMOTE:A new over-sampling method in imbalanced data sets learning"", 《ADVANCES IN INTELLIGENT COMPUTING》 *
王璐林: ""面向不平衡样本的Boosting分类算法研究"", 《中国优秀硕士学位论文全文数据库 信息科技辑》 *

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109871862A (en) * 2018-12-28 2019-06-11 北京航天测控技术有限公司 A kind of failure prediction method based on synthesis minority class over-sampling and deep learning
CN111782904A (en) * 2019-12-10 2020-10-16 国网天津市电力公司电力科学研究院 Improved SMOTE algorithm-based unbalanced data set processing method and system
CN111782904B (en) * 2019-12-10 2023-10-27 国网天津市电力公司电力科学研究院 Unbalanced data set processing method and system based on improved SMOTE algorithm
CN112365060A (en) * 2020-11-13 2021-02-12 广东电力信息科技有限公司 Preprocessing method for power grid internet of things perception data
CN112365060B (en) * 2020-11-13 2024-01-26 广东电力信息科技有限公司 Preprocessing method for network Internet of things sensing data

Similar Documents

Publication Publication Date Title
CN111817982B (en) Encrypted flow identification method for category imbalance
CN107330477A (en) A kind of improvement SMOTE resampling methods classified for lack of balance data
CN108229550B (en) Cloud picture classification method based on multi-granularity cascade forest network
CN105760889A (en) Efficient imbalanced data set classification method
CN102420723A (en) Anomaly detection method for various kinds of intrusion
CN107038167A (en) Big data excavating analysis system and its analysis method based on model evaluation
CN106228389A (en) Network potential usage mining method and system based on random forests algorithm
CN110012029A (en) A kind of method and system for distinguishing encryption and non-encrypted compression flow
CN106372655A (en) Synthetic method for minority class samples in non-balanced IPTV data set
CN107273387A (en) Towards higher-dimension and unbalanced data classify it is integrated
CN108243435B (en) Parameter optimization method and device in LTE cell scene division
CN109981474A (en) A kind of network flow fine grit classification system and method for application-oriented software
CN116582133B (en) Intelligent management system for data in transformer production process
CN114239807A (en) RFE-DAGMM-based high-dimensional data anomaly detection method
CN104484412A (en) Big data analysis system based on multiform processing
CN103780588A (en) User abnormal behavior detection method in digital home network
CN111145027A (en) Suspected money laundering transaction identification method and device
CN106251241A (en) A kind of feature based selects the LR Bagging algorithm improved
CN109376752A (en) A kind of PTM-WKNN classification method and device based on unbalanced dataset
CN109213865A (en) A kind of software bug report categorizing system and classification method
CN106056160B (en) User fault reporting prediction method under unbalanced IPTV data set
CN111600878A (en) Low-rate denial of service attack detection method based on MAF-ADM
CN107770813B (en) LTE uplink interference classification method based on PCA and two-dimensional skewness characteristics
CN110097120A (en) Network flow data classification method, equipment and computer storage medium
CN108494620B (en) Network service flow characteristic selection and classification method

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20170201

RJ01 Rejection of invention patent application after publication