CN102521656A - Integrated transfer learning method for classification of unbalance samples - Google Patents


Info

Publication number
CN102521656A
Authority
CN
China
Prior art keywords: sample, training, data, classification, samples
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201110452050XA
Other languages
Chinese (zh)
Other versions
CN102521656B (en)
Inventor
于重重
谭励
田蕊
刘宇
吴子珺
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Technology and Business University
Original Assignee
Beijing Technology and Business University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Technology and Business University filed Critical Beijing Technology and Business University
Priority to CN201110452050.XA priority Critical patent/CN102521656B/en
Publication of CN102521656A publication Critical patent/CN102521656A/en
Application granted granted Critical
Publication of CN102521656B publication Critical patent/CN102521656B/en
Expired - Fee Related legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention relates to an integrated transfer learning method for classifying unbalanced samples, comprising the following steps. During initialization, positive and negative samples are given different weights, so that the negative samples, which make up a small share of the total but carry a large amount of information, receive large initial weights. In each training round, a fixed proportion of samples is drawn as a training subset; after training, the classifier with the smallest error among several simple classifiers is selected as the weak classifier, and the training data set is adjusted by a dynamic redundant-data elimination algorithm. After T rounds of iteration a sequence of weak classifiers is obtained, and the weak classifiers are combined (boosted) into a strong classifier. By effectively exploiting the classification regularities of old data, the method discovers the classification regularities of new data with a similar distribution; in particular, it provides a new approach to classifying class-unbalanced data, preserves the influence of the few negative samples during classification training, effectively raises their contribution rate, and improves both the efficiency and the accuracy of classification.

Description

Integrated transfer learning method for unbalanced sample classification
Technical field
The invention belongs to the field of machine learning. For supplementary training data with a large amount of redundancy and a strong imbalance between positive and negative samples, it proposes an improved integrated transfer learning algorithm that transfers knowledge from this supplementary training data to help classify the target data.
Background technology
Transfer learning has been a hot research topic in machine learning in recent years. It addresses tasks where labelled data is scarce by reusing outdated data in the new task: although a large amount of outdated data differs from the problem domain to be solved, some of it is bound to be helpful to the new classification problem. To find this useful data, a small amount of already-labelled new data is used to mine the valuable information in the old data. A more effective classification model is then trained from the useful information in both parts, realising knowledge transfer from the old data to the new data.
At present, there are multiple solutions for different transfer learning tasks:
Q. Yang et al. generalised the Naive Bayes classifier into one that supports cross-domain text classification, realising knowledge transfer between texts of different domains. (W. Dai, G.-R. Xue, Q. Yang, and Y. Yu. Transferring naive Bayes classifiers for text classification. The Twenty-Second National Conference on Artificial Intelligence, 2007, 540-545.)
Dai et al. applied ensemble learning to transfer learning: using the boosting technique to "lift" a weak learning algorithm into a strong one yields the TrAdaboost algorithm. TrAdaboost directly combines the auxiliary data and the target data into one mixed data set, uses it as the training set, and then trains a classification model on it with the TrAdaboost algorithm. (Y. Liu and P. Stone. Value-function-based transfer for reinforcement learning using structure mapping. In Proceedings of the Twenty-First National Conference on Artificial Intelligence, 2006, 877-882.)
Applying ensemble learning to transfer learning can, without changing the classification precision of the weak classifiers, boost the weak learning algorithm into a strong one and thereby effectively improve the transfer learning effect. However, this approach still has some problems:
The TrAdaboost algorithm is designed for symmetric two-class problems and treats positive and negative samples equally. In the real world, however, the distributions of the two classes can be extremely unbalanced, and their importance can differ greatly.
In addition, the auxiliary data often contains a large amount of redundancy. Such data may be very dissimilar to the target data set, and its presence not only slows model training but also degrades classification precision.
Summary of the invention
The purpose of the invention is to provide a new method which, by optimising the weight assignment and adjustment strategy, raises the contribution rate of the class of samples (the negative samples) that is small in volume but rich in information; which dynamically rejects "irrelevant" data during training and, according to a configured lower weight threshold, eliminates the data whose weights have become too small; and under which, over T rounds of iterative training, the supplementary training data is progressively optimised.
The principle of the invention is as follows. Transfer is used to classify data whose positive and negative samples are unbalanced. First, the feature attribute vectors extracted from the supplementary training data and the target data are mixed into a training set, and a weak learning algorithm is applied to each feature dimension of this set. At initialisation, positive and negative samples are given different weights, so that the negative samples, which form a small share of the total but carry much information, start with large weights. In each training round a proportion of the samples is drawn as a training subset; after training finishes, the classifier with the smallest error among several simple classifiers is selected as the weak classifier h, and the training data set is adjusted according to the dynamic redundant-data elimination algorithm. In this way, after T rounds of iteration a weak-classifier sequence (h_1, h_2, ..., h_T) is obtained; the final classification function f(x) is produced by voting, i.e. the weak classifiers are stacked (boosted) into a single strong classifier. The method flow is shown in Fig. 5.
The technical scheme provided by the invention is as follows:
An integrated transfer learning method for unbalanced sample classification, characterised by comprising the following steps:
1) mix the auxiliary data set A and the target data set O proportionally into a training data set C;
2) initialise the sample weights;
3) normalise the sample weights;
with T the total number of iterations, each of the T training rounds performs steps 4)-9) in turn:
4) randomly draw a training subset D;
5) if D contains both positive and negative samples, go to step 6); otherwise draw some samples of the missing class and insert them into D, so that D is guaranteed to contain both classes;
6) on D, train a base classifier for each feature dimension with the weak learning algorithm P and sum them into a weak classifier;
7) compute the training error rate of the weak classifier h_t on the target training data, where t is the iteration index;
8) adjust the sample weights according to the classification error rate;
9) dynamically reject redundant data;
10) obtain the final integrated classifier and output positive and negative samples.
In step 1), part of the data is extracted proportionally from the auxiliary data set A and the target data set O and mixed into the training data set C = {(X_1, Y_1), (X_2, Y_2), ..., (X_N, Y_N)}, where (X_i, Y_i) is a training sample composed of a feature attribute vector and a class label, i = 1, 2, ..., N. The first n samples of C come from A and the remaining m samples from O, with n + m = N. X_i ∈ X, where X is the input sample data; X_i is the q-dimensional feature attribute vector of the sample, and Y_i ∈ {0, +1} is its class label.
In step 3), the normalised sample weights are computed by dividing each sample's initial weight by the sum of all sample weights.
In step 4), the drawn training subset D contains half of the samples in C.
In step 6), the weak learning algorithm is a decision tree, an artificial neural network, or an SVM.
In step 7), the training error rate ε_t of the weak classifier h_t on the target training data is computed as

ε_t = Σ_{i=n+1}^{n+m} ω_i^t |h_t(x_i) − y_i| / Σ_{i=n+1}^{n+m} ω_i^t

where ω_i^t is the weight of the i-th sample at the t-th iteration, h_t(x_i) is the output for the i-th sample at the t-th iteration, and y_i is the true class label of the i-th sample. If the computed ε_t exceeds 1/2, its value is set to 1/2, i.e. ε_t is at most 1/2.
In step 9), after each round of training, any training sample whose weight has fallen below the configured lower threshold is regarded as redundant data and deleted from the training samples.
In step 9), when the total number of training samples is less than or equal to the configured minimum, training stops, i.e. the iteration of steps 4)-9) ends.
In step 10), the final integrated classifier outputs

h_f(x) = 1 if Σ_{t=1}^{T} h_t(x) ≥ T/2, and 0 otherwise,

i.e. the majority vote of the weak classifiers is taken as the final classification result, where the value 1 represents the majority class of the unbalanced data and the value 0 the minority class; x is a training sample, T is the total number of iterations, t is the iteration index, and h_t(x) is the output for sample x at the t-th iteration.
The integrated transfer learning method described above is applied to bridge monitoring. For the final integrated classifier output value Z: Z = 1 indicates the sample is positive and the bridge is in a healthy state; Z = 0 indicates the sample is negative and the bridge is in a damaged state.
Beneficial effects of the invention: with the provided technical scheme, the classification regularities of existing old data can be used effectively to discover the classification regularities of new data with a similar distribution. In particular, the scheme provides a new method for classifying class-unbalanced data, preserves the role of the few negative samples in classification training, effectively raises their contribution rate, and improves both the efficiency and the precision of classification.
Description of drawings
Fig. 1 Block diagram of the embodiment steps
Fig. 2 Error rate of the invention on training data and test data
Fig. 3 Relative error of TrAdaboost as the training data grows
Fig. 4 Relative error of the invention as the training data grows
Fig. 5 Flow chart of the inventive method
Table 1 Input data
Table 2 Composition of the training data
Table 3 Test results of the TrAdaboost algorithm and the inventive algorithm
Embodiment
The integrated transfer learning method for unbalanced sample classification provided by the invention (referred to as UBITLA) proceeds as follows (see Fig. 1):
1. Input: the input data comes from two parts, the auxiliary data set A and the target data set O. Part of each is extracted proportionally and mixed into the training data set C = {(X_1, Y_1), (X_2, Y_2), ..., (X_N, Y_N)}, where (X_i, Y_i) is a training sample composed of a feature attribute vector and a class label, i = 1, 2, ..., N. The first n samples of C come from A and the remaining m samples from O (n + m = N). The preset number of iterations is T. X_i ∈ X, where X is the input sample data; X_i is the q-dimensional feature attribute vector of the sample, and Y_i ∈ {0, +1} is its class label.
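A minimal sketch of this mixing step; the extraction ratios and the function name are hypothetical, the only property taken from the text is that the first n samples of C come from A and the remaining m from O:

```python
import random

def mix_training_set(A, O, ratio_a=0.8, ratio_o=0.8, seed=0):
    """Step-1 sketch: draw a proportion of each set (ratios are hypothetical)
    and concatenate, auxiliary samples first, so C[0:n] comes from A and
    C[n:N] from O with n + m = N."""
    rng = random.Random(seed)
    part_a = rng.sample(A, int(len(A) * ratio_a))
    part_o = rng.sample(O, int(len(O) * ratio_o))
    return part_a + part_o, len(part_a)
```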
2. Initialise the sample weights. The initial weight of the i-th sample is denoted ω_i^{0,1}, where the superscript 1 marks the initial state and the superscript 0 marks a weight that has not yet been normalised, i = 1, 2, ..., N; d and l are the numbers of positive samples in A and O respectively. [The initialisation formula appears only as an image in the source.]
3. Normalise the sample weights:

ω_i^1 = ω_i^{0,1} / Σ_{i=1}^{N} ω_i^{0,1}

where ω_i^1 is the normalised weight of the i-th sample: each sample's initial weight is divided by the total sample weight to obtain the normalised weight.
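The normalisation step is a plain division by the total; a minimal sketch (function name assumed):

```python
def normalise_weights(weights):
    """Step 3: divide each initial weight by the total so the weights sum to 1."""
    total = sum(weights)
    return [w / total for w in weights]
```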
With T the total number of iterations, each of the T training rounds performs steps 4-9 in turn:
4. Randomly draw the training subset D from C; D contains half of the samples in C.
5. Check whether D contains both positive and negative samples. If it contains both classes, go to step 6; if it contains only one class, draw some samples of the other class directly and insert them into D, guaranteeing that the training subset contains both positive and negative samples.
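Steps 4-5 can be sketched as follows; the top-up size for a missing class is an arbitrary choice here, since the patent only requires that both classes end up present:

```python
import random

def draw_subset(C, seed=0):
    """Steps 4-5 sketch: draw half of C at random; if a class is absent from
    the draw, top the subset up with samples of the missing class."""
    rng = random.Random(seed)
    D = rng.sample(C, len(C) // 2)
    present = {y for _, y in D}
    for missing in {0, 1} - present:
        extras = [s for s in C if s[1] == missing]
        if extras:
            D.extend(rng.sample(extras, max(1, len(extras) // 4)))
    return D
```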
6. On the training subset D, use the weak learning algorithm P (a basic classification algorithm such as a decision tree, an artificial neural network, or an SVM) to train one base classifier h_t^j per feature dimension, j = 1, 2, ..., q, where h denotes a base classifier built by the basic classification algorithm, q is the dimension of the feature attribute vector, and t denotes the t-th iteration round. The weak classifier is obtained by summing the base classifiers:

h_t = Σ_{j=1}^{q} h_t^j
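The sum of q binary base classifiers is an integer vote count rather than a 0/1 label; one plausible reading, used in this sketch, is to threshold the summed votes at q/2 (the thresholding is an assumption, the text only states the sum):

```python
def weak_from_base(base_classifiers):
    """Step-6 sketch: build h_t from per-feature base classifiers h_t^j.
    Each h_t^j sees only feature j; their summed votes are thresholded at q/2."""
    q = len(base_classifiers)
    def h_t(x):
        votes = sum(h_j(x[j]) for j, h_j in enumerate(base_classifiers))
        return 1 if 2 * votes >= q else 0
    return h_t
```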
7. Compute the training error rate of the weak classifier h_t on the target training data:

ε_t = Σ_{i=n+1}^{n+m} ω_i^t |h_t(x_i) − y_i| / Σ_{i=n+1}^{n+m} ω_i^t

where ω_i^t is the weight of the i-th sample at the t-th iteration, h_t(x_i) is the output for the i-th sample at the t-th iteration, and y_i is the true class label of the i-th sample. If the computed ε_t exceeds 1/2, its value is set to 1/2, i.e. ε_t is at most 1/2.
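A direct transcription of this error rate, including the clipping at 1/2 (function name assumed; indices follow the convention that the m target samples occupy positions n..N-1 of C):

```python
def target_error(h, C, w, n):
    """Step 7: weighted error rate of h_t on the target samples, clipped at 1/2."""
    idx = range(n, len(C))
    num = sum(w[i] * abs(h(C[i][0]) - C[i][1]) for i in idx)
    den = sum(w[i] for i in idx)
    return min(num / den, 0.5)
```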
8. Adjust the sample weights.

If y_i = 0 and h_t(x_i) ≠ y_i, where 1 ≤ i ≤ n, then

ω_i^{t+1} = ω_i^t β^{|h_t(x_i) − y_i|} + dr × ω_i^t  (0 ≤ dr ≤ 1)

otherwise

ω_i^{t+1} = ω_i^t β^{|h_t(x_i) − y_i|} for 1 ≤ i ≤ n,
ω_i^{t+1} = ω_i^t β_t^{−|h_t(x_i) − y_i|} for n+1 ≤ i ≤ m+n,

with β = 1/(1 + √(2 ln n / T)) and β_t = ε_t/(1 − ε_t), as in TrAdaboost. That is, if the i-th sample is a negative sample and disagrees with the output of the t-th classifier, its weight is adjusted by the first formula; otherwise it is adjusted by the second. Here dr is a decay factor whose role is to give the weight adjustment of misclassified negative samples a memory effect, ensuring that their weights do not shrink too quickly.
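A sketch of this update. The β and β_t factors follow classic TrAdaboost, which is an assumption here because the source shows them only as images; auxiliary mistakes are down-weighted by β, target mistakes up-weighted by 1/β_t, and a misclassified auxiliary negative keeps dr times its old weight as memory:

```python
import math

def update_weights(w, preds, ys, n, eps_t, T, dr=0.5):
    """Step-8 sketch of the weight adjustment (TrAdaboost-style factors assumed)."""
    beta = 1.0 / (1.0 + math.sqrt(2.0 * math.log(max(n, 2)) / T))
    beta_t = eps_t / (1.0 - eps_t)
    out = []
    for i, (wi, p, y) in enumerate(zip(w, preds, ys)):
        d = abs(p - y)
        if i < n:                         # auxiliary sample (first n of C)
            nw = wi * beta ** d
            if y == 0 and d:              # misclassified negative: decay memory
                nw += dr * wi
        else:                             # target sample
            nw = wi * beta_t ** (-d) if d else wi
        out.append(nw)
    return out
```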
9. Dynamically reject redundant data. After each round of training, any training sample whose weight is below the configured lower threshold r is regarded as redundant data and deleted from the training samples. Training stops when the total number of training samples falls below the configured minimum.
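A sketch of the rejection step; the stop-before-pruning behaviour when the pool would shrink to the minimum is a design choice here, the text only states both conditions:

```python
def prune_redundant(C, w, r, min_samples):
    """Step 9: delete samples whose weight fell below the threshold r; report
    that training should stop once min_samples or fewer would remain."""
    keep = [i for i, wi in enumerate(w) if wi >= r]
    if len(keep) <= min_samples:
        return C, w, True                 # stop training, keep the pool as-is
    return [C[i] for i in keep], [w[i] for i in keep], False
```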
10. Output: the final integrated classifier outputs

h_f(x) = 1 if Σ_{t=1}^{T} h_t(x) ≥ T/2, and 0 otherwise,

i.e. the majority vote of the weak classifiers is the final classification result, where 1 represents the majority class of the unbalanced data and 0 the minority class.
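The final vote is a one-liner; breaking ties toward the majority class 1 is an assumption consistent with the Σ h_t(x) ≥ T/2 reading above:

```python
def strong_classify(weak, x):
    """Step 10: majority vote of the T weak classifiers; 1 is read as the
    majority (healthy) class, 0 as the minority (damage) class."""
    votes = sum(h(x) for h in weak)
    return 1 if 2 * votes >= len(weak) else 0
```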
The embodiment uses an existing sea-crossing bridge monitoring data set with two years of history (DataS1) and the monitoring data set of a newly completed highway bridge (DataS2) as its research objects. From the measured strain data of the two bridges, fixed proportions of data are extracted for the morning peak, the evening peak, and the morning and afternoon off-peak periods to serve as the auxiliary training data set and the target data set; the composition of the data is given in Table 2. The ratio of positive to negative samples is 5:1. The static strain data of 14 key monitoring points on the deck of each bridge are used as the 14-dimensional input data. Output data: 1 represents normal, 0 represents damage. In each training round, half of the data is randomly drawn as the training subset, and a part of the target data is likewise randomly drawn as the test subset.
In step 1 above, the actual input data is shown in Table 1. The ratio of auxiliary training data to target data is about 5:1, the ratio of the positive to the negative class is about 5:1, and the total number of samples is 6000. In step 10, an output of 1 denotes a positive sample, indicating that the bridge is in a healthy state; an output of 0 denotes a negative sample, indicating that the bridge is in a damaged state.
This embodiment applies the integrated transfer learning method for unbalanced sample classification to the classification of real bridge monitoring data. Given that the damage data in real bridge monitoring is small in volume and the positive and negative samples are unbalanced, making reasonable use of outdated data to help classify new data can effectively raise the contribution rate of the small but information-rich bridge damage data and thereby improve the recognition rate of both classes. This can effectively guide personnel to monitor bridge structures that produce damage data more closely and to take corresponding maintenance measures in time.
Fig. 2 illustrates the effectiveness of building the classification model with the help of the transferred data. Figs. 3 and 4 show that the precision of the final classifier can be improved by optimising the auxiliary training sample set. The figures show that the invention can optimise the auxiliary training data set and thereby achieve gains in both efficiency and precision, improving the transfer learning effect.
            1     2     3     4     5     6     7     8     9     10    11    12    13    Output
Sample 1    44    53.7  20.8  24.8  26.8  30.8  33.7  31.9  30    29.2  71.3  43.8  89.6  0
Sample 2    23    24.3  43.8  55.7  20.2  25    27.1  30.6  33.3  31.7  20.7  29.9  70.9  1
Sample 3    42.6  89.4  26.3  21.8  67.9  48.8  54.8  22.1  30.8  70.6  42.5  89.4  26.5  1
Sample 4    20.8  23.8  42.8  58.9  19    25.8  28.5  30.6  32.1  30.7  31    31.5  69.6  1
Sample 5    32.9  46.8  20.1  14.7  39.4  50.7  14.3  18.1  21.9  28.9  28.5  28.1  40.7  1
Sample 6    44.9  53.6  20.1  22.1  41.8  51.7  19    23.5  29.1  27.9  31.7  30.1  19.5  1
Sample 7    20.5  17.5  58.5  32.2  46.6  21.1  14.2  41.2  48.4  15.3  16.8  20.7  28.9  0
Sample 8    29.9  70.9  42.6  89.4  26.3  21.8  67.9  48.8  54.8  22.1  30.8  70.6  42.5  1
Sample 9    31.5  69.6  41.3  88.6  26.5  20.7  66.1  47.2  53.6  20.8  23.5  42.5  59.7  0
Sample 10   32.6  79.1  55.2  95.2  32.3  27.3  76.6  54.1  59.1  18.3  33.9  78.7  54.4  1
Table 1 Input data
Table 2 Composition of the training data (the table appears only as an image in the source)
Table 3 Test results of the TrAdaboost algorithm and the inventive algorithm (the table appears only as an image in the source)

Claims (10)

1. An integrated transfer learning method for unbalanced sample classification, characterised by comprising the steps of:
1) mixing the auxiliary data set A and the target data set O proportionally into a training data set C;
2) initialising the sample weights;
3) normalising the sample weights;
with T the total number of iterations, each of the T training rounds performing steps 4)-9) in turn:
4) randomly drawing a training subset D;
5) if D contains both positive and negative samples, going to step 6); otherwise drawing some samples of the missing class and inserting them into D, so that D is guaranteed to contain both classes;
6) on D, training a base classifier for each feature dimension with the weak learning algorithm P and summing them into a weak classifier;
7) computing the training error rate of the weak classifier h_t on the target training data, where t is the iteration index;
8) adjusting the sample weights according to the classification error rate;
9) dynamically rejecting redundant data;
10) obtaining the final integrated classifier and outputting positive and negative samples.
2. The integrated transfer learning method of claim 1, wherein in step 1) part of the data is extracted proportionally from the auxiliary data set A and the target data set O and mixed into the training data set C = {(X_1, Y_1), (X_2, Y_2), ..., (X_N, Y_N)}, where (X_i, Y_i) is a training sample composed of a feature attribute vector and a class label, i = 1, 2, ..., N; the first n samples of C come from A and the remaining m samples from O, n + m = N; X_i ∈ X, where X is the input sample data, X_i is the q-dimensional feature attribute vector of the sample, and Y_i ∈ {0, +1} is its class label.
3. The integrated transfer learning method of claim 1, wherein in step 3) the normalised sample weights are computed by dividing each sample's initial weight by the sum of all sample weights.
4. The integrated transfer learning method of claim 1, wherein in step 4) the drawn training subset D contains half of the samples in C.
5. The integrated transfer learning method of claim 1, wherein in step 6) the weak learning algorithm is a decision tree, an artificial neural network, or an SVM.
6. The integrated transfer learning method of claim 2, wherein in step 7) the training error rate ε_t of the weak classifier h_t on the target training data is computed as

ε_t = Σ_{i=n+1}^{n+m} ω_i^t |h_t(x_i) − y_i| / Σ_{i=n+1}^{n+m} ω_i^t

where ω_i^t is the weight of the i-th sample at the t-th iteration, h_t(x_i) is the output for the i-th sample at the t-th iteration, and y_i is its true class label; if the computed ε_t exceeds 1/2, its value is set to 1/2, i.e. ε_t is at most 1/2.
7. The integrated transfer learning method of claim 1, wherein in step 9), after each round of training, any training sample whose weight is below the configured lower threshold is regarded as redundant data and deleted from the training samples.
8. The integrated transfer learning method of claim 1, wherein in step 9) training stops when the total number of training samples is less than or equal to the configured minimum, i.e. the iteration of steps 4)-9) ends.
9. The integrated transfer learning method of claim 6, wherein in step 10) the final integrated classifier outputs

h_f(x) = 1 if Σ_{t=1}^{T} h_t(x) ≥ T/2, and 0 otherwise,

i.e. the majority vote of the weak classifiers is taken as the final classification result, where the value 1 represents the majority class of the unbalanced data and the value 0 the minority class; x is a training sample, T is the total number of iterations, t is the iteration index, and h_t(x) is the output for sample x at the t-th iteration.
10. The integrated transfer learning method of any of claims 1 to 9, applied to bridge monitoring, wherein for the final integrated classifier output value Z, Z = 1 indicates the sample is positive and the bridge is in a healthy state, and Z = 0 indicates the sample is negative and the bridge is in a damaged state.
CN201110452050.XA 2011-12-29 2011-12-29 Integrated transfer learning method for classification of unbalance samples Expired - Fee Related CN102521656B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201110452050.XA CN102521656B (en) 2011-12-29 2011-12-29 Integrated transfer learning method for classification of unbalance samples

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201110452050.XA CN102521656B (en) 2011-12-29 2011-12-29 Integrated transfer learning method for classification of unbalance samples

Publications (2)

Publication Number Publication Date
CN102521656A true CN102521656A (en) 2012-06-27
CN102521656B CN102521656B (en) 2014-02-26

Family

ID=46292567

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201110452050.XA Expired - Fee Related CN102521656B (en) 2011-12-29 2011-12-29 Integrated transfer learning method for classification of unbalance samples

Country Status (1)

Country Link
CN (1) CN102521656B (en)

Cited By (36)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104573708A (en) * 2014-12-19 2015-04-29 天津大学 Ensemble-of-under-sampled extreme learning machine
CN104751234A (en) * 2013-12-31 2015-07-01 华为技术有限公司 User asset predicting method and device
CN104951809A (en) * 2015-07-14 2015-09-30 西安电子科技大学 Unbalanced data classification method based on unbalanced classification indexes and integrated learning
CN105243394A (en) * 2015-11-03 2016-01-13 中国矿业大学 Evaluation method for performance influence degree of classification models by class imbalance
CN105589037A (en) * 2016-03-16 2016-05-18 合肥工业大学 Ensemble learning-based electric power electronic switch device network fault diagnosis method
CN106570164A (en) * 2016-11-07 2017-04-19 中国农业大学 Integrated foodstuff safety text classification method based on deep learning
CN106909981A (en) * 2015-12-23 2017-06-30 阿里巴巴集团控股有限公司 Model training, sample balance method and device and personal credit points-scoring system
CN106934462A (en) * 2017-02-09 2017-07-07 华南理工大学 Defence under antagonism environment based on migration poisons the learning method of attack
CN107291739A (en) * 2016-03-31 2017-10-24 阿里巴巴集团控股有限公司 Evaluation method, system and the equipment of network user's health status
CN107316061A (en) * 2017-06-22 2017-11-03 华南理工大学 A kind of uneven classification ensemble method of depth migration study
CN107644057A (en) * 2017-08-09 2018-01-30 天津大学 A kind of absolute uneven file classification method based on transfer learning
CN107728476A (en) * 2017-09-20 2018-02-23 浙江大学 A kind of method from non-equilibrium class extracting data sensitive data based on SVM forest
CN107944874A (en) * 2017-12-13 2018-04-20 阿里巴巴集团控股有限公司 Air control method, apparatus and system based on transfer learning
CN108154237A (en) * 2016-12-06 2018-06-12 华为技术有限公司 A kind of data processing system and method
CN108520220A (en) * 2018-03-30 2018-09-11 百度在线网络技术(北京)有限公司 model generating method and device
CN108629419A (en) * 2017-03-21 2018-10-09 发那科株式会社 Machine learning device and thermal displacement correction device
CN109143199A (en) * 2018-11-09 2019-01-04 大连东软信息学院 Sea clutter small target detecting method based on transfer learning
CN109272056A (en) * 2018-10-30 2019-01-25 成都信息工程大学 The method of data balancing method and raising data classification performance based on pseudo- negative sample
CN109462610A (en) * 2018-12-24 2019-03-12 哈尔滨工程大学 A kind of network inbreak detection method based on Active Learning and transfer learning
CN109508457A (en) * 2018-10-31 2019-03-22 浙江大学 A kind of transfer learning method reading series model based on machine
CN109523018A (en) * 2019-01-08 2019-03-26 重庆邮电大学 A kind of picture classification method based on depth migration study
CN109800807A (en) * 2019-01-18 2019-05-24 北京市商汤科技开发有限公司 The training method and classification method and device of sorter network, electronic equipment
CN109886303A (en) * 2019-01-21 2019-06-14 武汉大学 A kind of TrAdaboost sample migration aviation image classification method based on particle group optimizing
CN109948478A (en) * 2019-03-06 2019-06-28 中国科学院自动化研究所 The face identification method of extensive lack of balance data neural network based, system
CN110245232A (en) * 2019-06-03 2019-09-17 网易传媒科技(北京)有限公司 File classification method, device, medium and calculating equipment
CN110688983A (en) * 2019-08-22 2020-01-14 中国矿业大学 Microseismic signal identification method based on multi-mode optimization and ensemble learning
CN110998648A (en) * 2018-08-09 2020-04-10 北京嘀嘀无限科技发展有限公司 System and method for distributing orders
CN111046924A (en) * 2019-11-26 2020-04-21 成都旷视金智科技有限公司 Data processing method, device and system and storage medium
CN111291818A (en) * 2020-02-18 2020-06-16 浙江工业大学 Non-uniform class sample equalization method for cloud mask
CN111881289A (en) * 2020-06-10 2020-11-03 北京启明星辰信息安全技术有限公司 Training method of classification model, and detection method and device of data risk category
CN112465152A (en) * 2020-12-03 2021-03-09 中国科学院大学宁波华美医院 Online migration learning method suitable for emotional brain-computer interface
CN112819076A (en) * 2021-02-03 2021-05-18 中南大学 Deep transfer learning-based medical image classification model training method and device
US11049007B2 (en) 2016-05-06 2021-06-29 Fujitsu Limited Recognition apparatus based on deep neural network, training apparatus and methods thereof
CN113421122A (en) * 2021-06-25 2021-09-21 创络(上海)数据科技有限公司 Refined churn prediction method for first-purchase users under an improved transfer learning framework
CN113657428A (en) * 2021-06-30 2021-11-16 北京邮电大学 Method and device for extracting network traffic data
CN114765772A (en) * 2021-01-04 2022-07-19 中国移动通信有限公司研究院 Method and device for outputting terminal information and readable storage medium

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101794396A (en) * 2010-03-25 2010-08-04 西安电子科技大学 System and method for recognizing remote sensing image targets based on transfer network learning
CN101840569A (en) * 2010-03-19 2010-09-22 西安电子科技大学 Projection pursuit hyperspectral image segmentation method based on transfer learning

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101840569A (en) * 2010-03-19 2010-09-22 西安电子科技大学 Projection pursuit hyperspectral image segmentation method based on transfer learning
CN101794396A (en) * 2010-03-25 2010-08-04 西安电子科技大学 System and method for recognizing remote sensing image targets based on transfer network learning

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Liu Wei, et al.: "Ensemble transfer learning algorithm for unbalanced sample classification", Computer Engineering and Applications *
Zhang Yanfeng, et al.: "An improved AdaBoost algorithm: M-AsyAdaBoost", Transactions of Beijing Institute of Technology *

Cited By (55)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104751234B (en) * 2013-12-31 2018-10-19 华为技术有限公司 User asset prediction method and device
CN104751234A (en) * 2013-12-31 2015-07-01 华为技术有限公司 User asset predicting method and device
CN104573708A (en) * 2014-12-19 2015-04-29 天津大学 Ensemble-of-under-sampled extreme learning machine
CN104951809A (en) * 2015-07-14 2015-09-30 西安电子科技大学 Unbalanced data classification method based on unbalanced classification indexes and integrated learning
CN105243394A (en) * 2015-11-03 2016-01-13 中国矿业大学 Evaluation method for performance influence degree of classification models by class imbalance
CN105243394B (en) * 2015-11-03 2019-03-19 中国矿业大学 Evaluation method for the influence degree of class imbalance on classification model performance
CN106909981B (en) * 2015-12-23 2020-08-25 阿里巴巴集团控股有限公司 Model training method, sample balancing method, model training device, sample balancing device and personal credit scoring system
CN106909981A (en) * 2015-12-23 2017-06-30 阿里巴巴集团控股有限公司 Model training method, sample balancing method and device, and personal credit scoring system
CN105589037A (en) * 2016-03-16 2016-05-18 合肥工业大学 Ensemble learning-based electric power electronic switch device network fault diagnosis method
CN107291739A (en) * 2016-03-31 2017-10-24 阿里巴巴集团控股有限公司 Evaluation method, system and the equipment of network user's health status
US11049007B2 (en) 2016-05-06 2021-06-29 Fujitsu Limited Recognition apparatus based on deep neural network, training apparatus and methods thereof
CN106570164A (en) * 2016-11-07 2017-04-19 中国农业大学 Integrated foodstuff safety text classification method based on deep learning
CN108154237A (en) * 2016-12-06 2018-06-12 华为技术有限公司 A kind of data processing system and method
CN108154237B (en) * 2016-12-06 2022-04-05 华为技术有限公司 Data processing system and method
CN106934462A (en) * 2017-02-09 2017-07-07 华南理工大学 Transfer-based learning method for defending against poisoning attacks in adversarial environments
CN108629419B (en) * 2017-03-21 2023-07-14 发那科株式会社 Machine learning device and thermal displacement correction device
CN108629419A (en) * 2017-03-21 2018-10-09 发那科株式会社 Machine learning device and thermal displacement correction device
CN107316061B (en) * 2017-06-22 2020-09-22 华南理工大学 Deep transfer learning ensemble method for unbalanced classification
CN107316061A (en) * 2017-06-22 2017-11-03 华南理工大学 Deep transfer learning ensemble method for unbalanced classification
CN107644057B (en) * 2017-08-09 2020-03-03 天津大学 Absolute imbalance text classification method based on transfer learning
CN107644057A (en) * 2017-08-09 2018-01-30 天津大学 Absolutely unbalanced text classification method based on transfer learning
CN107728476A (en) * 2017-09-20 2018-02-23 浙江大学 SVM-forest based method for extracting sensitive data from unbalanced data
CN107728476B (en) * 2017-09-20 2020-05-22 浙江大学 SVM-forest based method for extracting sensitive data from unbalanced data
CN107944874A (en) * 2017-12-13 2018-04-20 阿里巴巴集团控股有限公司 Risk control method, device and system based on transfer learning
CN107944874B (en) * 2017-12-13 2021-07-20 创新先进技术有限公司 Risk control method, device and system based on transfer learning
CN108520220B (en) * 2018-03-30 2021-07-09 百度在线网络技术(北京)有限公司 Model generation method and device
CN108520220A (en) * 2018-03-30 2018-09-11 百度在线网络技术(北京)有限公司 Model generation method and device
CN110998648A (en) * 2018-08-09 2020-04-10 北京嘀嘀无限科技发展有限公司 System and method for distributing orders
CN109272056B (en) * 2018-10-30 2021-09-21 成都信息工程大学 Data balancing method based on pseudo negative sample and method for improving data classification performance
CN109272056A (en) * 2018-10-30 2019-01-25 成都信息工程大学 Data balancing method based on pseudo negative samples and method for improving data classification performance
CN109508457A (en) * 2018-10-31 2019-03-22 浙江大学 Transfer learning method based on a machine reading sequence model
CN109143199A (en) * 2018-11-09 2019-01-04 大连东软信息学院 Sea clutter small target detecting method based on transfer learning
CN109462610A (en) * 2018-12-24 2019-03-12 哈尔滨工程大学 A kind of network inbreak detection method based on Active Learning and transfer learning
CN109523018A (en) * 2019-01-08 2019-03-26 重庆邮电大学 Image classification method based on deep transfer learning
CN109523018B (en) * 2019-01-08 2022-10-18 重庆邮电大学 Image classification method based on deep transfer learning
CN109800807A (en) * 2019-01-18 2019-05-24 北京市商汤科技开发有限公司 Training method and classification method and device of classification network, and electronic equipment
CN109800807B (en) * 2019-01-18 2021-08-31 北京市商汤科技开发有限公司 Training method and classification method and device of classification network, and electronic equipment
CN109886303A (en) * 2019-01-21 2019-06-14 武汉大学 TrAdaboost sample transfer aerial image classification method based on particle swarm optimization
CN109948478B (en) * 2019-03-06 2021-05-11 中国科学院自动化研究所 Large-scale unbalanced data face recognition method and system based on neural network
CN109948478A (en) * 2019-03-06 2019-06-28 中国科学院自动化研究所 Large-scale unbalanced data face recognition method and system based on neural network
CN110245232A (en) * 2019-06-03 2019-09-17 网易传媒科技(北京)有限公司 Text classification method, device, medium and computing equipment
CN110245232B (en) * 2019-06-03 2022-02-18 网易传媒科技(北京)有限公司 Text classification method, device, medium and computing equipment
CN110688983A (en) * 2019-08-22 2020-01-14 中国矿业大学 Microseismic signal identification method based on multi-mode optimization and ensemble learning
CN111046924B (en) * 2019-11-26 2023-12-19 成都旷视金智科技有限公司 Data processing method, device, system and storage medium
CN111046924A (en) * 2019-11-26 2020-04-21 成都旷视金智科技有限公司 Data processing method, device and system and storage medium
CN111291818A (en) * 2020-02-18 2020-06-16 浙江工业大学 Non-uniform class sample equalization method for cloud mask
CN111881289B (en) * 2020-06-10 2023-09-08 北京启明星辰信息安全技术有限公司 Training method of classification model, and detection method and device of data risk class
CN111881289A (en) * 2020-06-10 2020-11-03 北京启明星辰信息安全技术有限公司 Training method of classification model, and detection method and device of data risk category
CN112465152B (en) * 2020-12-03 2022-11-29 中国科学院大学宁波华美医院 Online transfer learning method suitable for emotional brain-computer interfaces
CN112465152A (en) * 2020-12-03 2021-03-09 中国科学院大学宁波华美医院 Online transfer learning method suitable for emotional brain-computer interfaces
CN114765772A (en) * 2021-01-04 2022-07-19 中国移动通信有限公司研究院 Method and device for outputting terminal information and readable storage medium
CN112819076B (en) * 2021-02-03 2022-06-17 中南大学 Deep transfer learning-based medical image classification model training method and device
CN112819076A (en) * 2021-02-03 2021-05-18 中南大学 Deep transfer learning-based medical image classification model training method and device
CN113421122A (en) * 2021-06-25 2021-09-21 创络(上海)数据科技有限公司 Refined churn prediction method for first-purchase users under an improved transfer learning framework
CN113657428A (en) * 2021-06-30 2021-11-16 北京邮电大学 Method and device for extracting network traffic data

Also Published As

Publication number Publication date
CN102521656B (en) 2014-02-26

Similar Documents

Publication Publication Date Title
CN102521656B (en) Integrated transfer learning method for classification of unbalance samples
CN105373606A (en) Unbalanced data sampling method in improved C4.5 decision tree algorithm
CN105487526B (en) Fast RVM fault diagnosis method for sewage treatment
CN106709754A (en) Power user grouping method based on text mining
CN104966105A (en) Robust machine error retrieving method and system
CN103886330A (en) Classification method based on semi-supervised SVM ensemble learning
CN104657718A (en) Face recognition method based on face image feature extreme learning machine
CN103020122A (en) Transfer learning method based on semi-supervised clustering
CN106203534A (en) A kind of cost-sensitive Software Defects Predict Methods based on Boosting
CN103177265B (en) High-definition image classification method based on kernel function and sparse coding
CN104598920B (en) Scene classification method based on Gist feature and extreme learning machine
CN110188047A (en) A kind of repeated defects report detection method based on binary channels convolutional neural networks
CN112925908A (en) Attention-based text classification method and system for graph Attention network
CN106022954A (en) Multiple BP neural network load prediction method based on grey correlation degree
CN106681305A (en) Online fault diagnosing method for Fast RVM (relevance vector machine) sewage treatment
CN106156805A (en) Classifier training method for data with missing sample labels
CN108664633A (en) A method of carrying out text classification using diversified text feature
CN110348608A (en) Improved LSTM prediction method based on fuzzy clustering algorithm
CN104951987B (en) Crop Breeding evaluation method based on decision tree
CN105975611A (en) Self-adaptive combined downsampling reinforcing learning machine
CN110738232A (en) grid voltage out-of-limit cause diagnosis method based on data mining technology
CN106127333A (en) Movie attendance forecasting method and system
CN102629272A (en) Clustering based optimization method for examination system database
CN109784488A (en) Construction method of binarized convolutional neural networks suitable for embedded platforms
CN104751175A (en) Multi-label scene classification method of SAR (Synthetic Aperture Radar) image based on incremental support vector machine

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C53 Correction of patent of invention or patent application
CB03 Change of inventor or designer information

Inventor after: Tan Li

Inventor after: Su Weijun

Inventor after: Yu Chongchong

Inventor after: Tian Rui

Inventor after: Liu Yu

Inventor after: Wu Zijun

Inventor after: Ma Meng

Inventor before: Yu Chongchong

Inventor before: Tan Li

Inventor before: Tian Rui

Inventor before: Liu Yu

Inventor before: Wu Zijun

COR Change of bibliographic data

Free format text: CORRECT: INVENTOR; FROM: YU CHONGCHONG TAN LI TIAN RUI LIU YU WU ZIJUN TO: TAN LI SU WEIJUN YU CHONGCHONG TIAN RUI LIU YU WU ZIJUN MA MENG

C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20140226

Termination date: 20161229