CN105760899A - Adboost training learning method and device based on distributed computation and detection cost ordering - Google Patents

Adboost training learning method and device based on distributed computation and detection cost ordering

Info

Publication number
CN105760899A
CN105760899A CN201610201531.6A CN201610201531A
Authority
CN
China
Prior art keywords
strong classifier
classifier
sample
computer
distributed
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201610201531.6A
Other languages
Chinese (zh)
Other versions
CN105760899B (en)
Inventor
田雨农
吴子章
周秀田
于维双
陆振波
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Dalian Roiland Technology Co Ltd
Original Assignee
Dalian Roiland Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Dalian Roiland Technology Co Ltd filed Critical Dalian Roiland Technology Co Ltd
Priority to CN201610201531.6A
Publication of CN105760899A
Application granted
Publication of CN105760899B
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting

Abstract

The invention belongs to the field of computation, relates to an Adboost training learning method and device based on distributed computation and detection cost ordering, and solves the problem that training on high-dimensional mass data samples detects slowly because a single computer has an upper limit on memory and computing power, so as to realize parallel training, form a new cascade classifier through iteration, and increase detection speed. The method is characterized by comprising: Step 3, distributed training: a first sample is split and distributed to the computers, each computer trains a strong classifier according to the strong classifier indexes it is allocated, and a new cascade classifier is formed according to the classification capacity of the strong classifiers; and Step 4, the next round of distributed training: the misclassified first samples on the computers are collected, split, and distributed to the computers again, each computer trains a strong classifier according to the strong classifier indexes it is allocated, and a new cascade classifier is formed according to the classification capacity of the strong classifiers.

Description

Adboost training learning method and device based on distributed computation and detection cost ordering
Technical field
The invention belongs to the field of computation and relates to a training learning method for a classifier.
Background technology
As the commercialization of artificial intelligence keeps accelerating, using machine learning algorithms to train on one-dimensional, two-dimensional, three-dimensional, or even higher-dimensional samples has become an invisible obstacle for many machine learning applications. As the dimension and quantity of samples grow, the memory occupied by training samples and the CPU and GPU computing resources required keep increasing as well. The memory of a single PC or server always has an upper limit, which to a large extent restricts the number of samples that can be used for training. Although today's traditional machine learning algorithms differ in training techniques, their consumption of system resources is unavoidable. Combining mass-data sample training with parallel and distributed computation is therefore an indispensable means.
Summary of the invention
In order to adapt to high-dimensional sample training and to solve the problem that training on high-dimensional mass data samples detects slowly because a single computer has an upper limit on memory and computing power, the present invention proposes an Adboost training learning method and device based on distributed computation and detection cost ordering, so as to realize parallel training, iteratively form a new cascade classifier, and increase detection speed.
To achieve these goals, the key technical points are as follows:
An Adboost training learning method based on distributed computation and detection cost ordering comprises the following steps:
Step 1: set the detection target and determine the number of computers for distribution according to the number of classifiers in the cascade;
Step 2: distribute the second sample to each computer;
Step 3: distributed training: split the first sample and distribute it to the computers; each computer trains a strong classifier according to the strong classifier parameters it is allocated, and a new cascade classifier is formed according to the classification capacity of the strong classifiers;
Step 4: the next round of distributed training: collect the misclassified first samples from the computers, split them, and distribute them to the computers again; each computer trains a strong classifier according to the strong classifier parameters it is allocated, and a new cascade classifier is formed according to the classification capacity of the strong classifiers;
Step 5: repeat Step 4 until the new cascade classifier meets the exit condition;
wherein the first sample is the more numerous class of the positive and negative samples to be tested, and the second sample is the less numerous class of the positive and negative samples to be tested.
The invention further relates to an Adboost training learning device based on distributed computation and detection cost ordering, comprising:
a detection target setting device, which sets the detection target and determines the number of computers for distribution according to the number of classifiers in the cascade;
a second sample distributing device, which distributes the second sample to each computer;
a first distributed training device, which performs distributed training: the first sample is split and distributed to the computers, each computer trains a strong classifier according to the strong classifier parameters it is allocated, and a new cascade classifier is formed according to the classification capacity of the strong classifiers;
a second distributed training device, which performs the next round of distributed training: the misclassified first samples on the computers are collected, split, and distributed to the computers again, each computer trains a strong classifier according to the strong classifier parameters it is allocated, and a new cascade classifier is formed according to the classification capacity of the strong classifiers;
a repeated distributed training device, which repeats the training of the second distributed training device until the new cascade classifier meets the exit condition;
wherein the first sample is the more numerous class of the positive and negative samples to be tested, and the second sample is the less numerous class of the positive and negative samples to be tested.
Beneficial effects:
1. The present invention exploits the characteristics of the Adboost algorithm itself and splits its cascade classifier into strong classifiers computed in parallel and distributed over multiple computers, overcoming the dependence on the memory and computing power of a single computer.
2. The present invention uses multiple iterations of distributed training to continuously compress and redistribute the huge sample set, and composes the final cascade classifier in the order of the classification capacity of the strong classifiers on the computers, so that the optimal detection speed is reached during real-time detection.
3. When the distributed training is iterated, the last three rounds of distributed training apply a special adjustment of the index strategy, which ensures stable convergence of the final cascade classifier.
Brief description of the drawings
Fig. 1 is a flow chart of the distributed Adboost training method.
Detailed description of the invention
Embodiment 1: an Adboost training learning method based on distributed computation and detection cost ordering comprises the following steps:
Step 1: set the detection target and determine the number of computers for distribution according to the number of classifiers in the cascade;
Step 2: distribute the second sample to each computer;
Step 3: distributed training: split the first sample and distribute it to the computers; each computer trains a strong classifier according to the strong classifier parameters it is allocated, and a new cascade classifier is formed according to the classification capacity of the strong classifiers;
Step 4: the next round of distributed training: collect the misclassified first samples from the computers, split them, and distribute them to the computers again; each computer trains a strong classifier according to the strong classifier parameters it is allocated, and a new cascade classifier is formed according to the classification capacity of the strong classifiers;
Step 5: repeat Step 4 until the new cascade classifier meets the exit condition, i.e., until the cascade classifier meets the set detection target; wherein the first sample is the more numerous class of the positive and negative samples to be tested, and the second sample is the less numerous class of the positive and negative samples to be tested.
This embodiment exploits the characteristics of the Adboost algorithm itself and splits its cascade classifier into strong classifiers computed in parallel and distributed over multiple computers, overcoming the dependence on the memory and computing power of a single computer. It further uses multiple iterations of distributed training to continuously compress and redistribute the huge sample set, and composes the new cascade classifier in the order of the classification capacity of the strong classifiers on the computers, so that the optimal detection speed is reached during real-time detection.
Embodiment 2: the technical scheme is the same as in Embodiment 1, and more specifically, Step 3 consists of:
S3.1. split the first sample and distribute it to the computers in turn;
S3.2. distribute the strong classifier parameters to the computers;
S3.3. each computer trains a strong classifier, and the trained strong classifiers are collected on a server; the server applies the collected strong classifiers to all test samples (preferably testing in segments) and sorts the strong classifiers by classification capacity, using the misclassification rate accumulated by each collected strong classifier over all test samples as the sorting criterion, a lower misclassification rate meaning a stronger classification capacity; the collected strong classifiers are then assembled in the sorted order into the new cascade classifier, as sketched below.
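A minimal Python sketch of this server-side step follows. It is only an illustrative reading of S3.3, not the patented implementation; the classifier interface, the segment size, and the helper names misclassification_rate and assemble_cascade are assumptions.

from typing import Callable, List, Sequence, Tuple

Sample = Tuple[Sequence[float], int]               # (feature vector, label in {+1, -1})
StrongClassifier = Callable[[Sequence[float]], int]

def misclassification_rate(clf: StrongClassifier, samples: List[Sample],
                           segment_size: int = 10000) -> float:
    # Accumulate the error rate over all test samples, segment by segment,
    # so the full sample set never has to be scored in one pass.
    errors, total = 0, 0
    for start in range(0, len(samples), segment_size):
        for x, y in samples[start:start + segment_size]:
            errors += int(clf(x) != y)
            total += 1
    return errors / max(total, 1)

def assemble_cascade(collected: List[StrongClassifier],
                     test_samples: List[Sample]) -> List[StrongClassifier]:
    # Sort the collected strong classifiers by ascending misclassification rate
    # (lower error = stronger classification capacity) and use that order as the
    # stage order of the new cascade: stage 1 is the strongest classifier.
    return sorted(collected, key=lambda c: misclassification_rate(c, test_samples))

In use, the server would call assemble_cascade with the n strong classifiers received from the training computers and the pooled test samples, then append the resulting stages to the cascade.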
Embodiment 3: the technical scheme is the same as in Embodiment 1 or 2, and more specifically: in Step 1, the detection target is set, including the false alarm rate index of the cascade classifier and the recall rate index of the cascade classifier, where the false alarm rate target is set equal to 0 and the recall rate target is set equal to 99%; in Step 2, distributing the strong classifier parameters means distributing the same recall rate and false alarm rate indexes as those of each strong classifier in the whole cascade classifier; the number of computers is n, the number of strong classifiers is Max_Num, the number of strong classifiers assembled equals (n + n/2 + n/4 + ...), and it is required that (n + n/2 + n/4 + ...) < 2n ≤ Max_Num.
If the total number of first samples is N and the number of computers is n, the number of negative samples distributed to each computer is k = N/n. The false alarm rate of each strong classifier must satisfy Strong_False_Alarm ≤ 50%. Assuming the number of strong classifiers Max_Num is at most 20, then n ≤ 10; since the number of strong classifiers assembled in subsequent computations equals (n + n/2 + n/4 + ...) and (n + n/2 + n/4 + ...) < 2n ≤ Max_Num, the recall rate of each strong classifier must satisfy Strong_Detection_Rate ≥ 99.95%.
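A minimal sketch in Python (the helper names round_sizes and negatives_per_computer and the concrete figures in the example are assumptions, not requirements of the patent) of how the round sizes and the per-computer quota follow from these constraints:

def round_sizes(n, max_num):
    # Yield the number of computers used in each distributed round. Each round
    # the surviving negative samples shrink to at most half, so the number of
    # computers needed also halves; the total number of strong classifiers
    # produced then stays below 2n <= Max_Num.
    assert 2 * n <= max_num, "the setup requires 2n <= Max_Num"
    total = 0
    while n >= 1:
        total += n
        yield n
        n //= 2
    assert total < max_num

def negatives_per_computer(N, n):
    # k = N / n negative samples are sent to each of the n computers.
    return N // n

print(list(round_sizes(10, 20)))             # [10, 5, 2, 1] -> 18 strong classifiers in total
print(negatives_per_computer(100000, 10))    # 10000 negatives per computer (N = 100000 assumed)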
When Step 3 and Step 4 are repeated, the labels of the negative samples that have not been excluded (i.e., the misclassified negatives) on each computer are gathered on an extra computer and the set is split again. When splitting again, because the total number of remaining negative samples is at most 50% of the original scale, the number of computers needed is also halved.
First, the remaining capacity for strong classifiers in the cascade, Max_Num - n, must be compared with the number of strong classifiers about to be produced in this round, n/2. The remaining capacity must be at least as large as the number about to be produced, i.e., (n/2) ≤ (Max_Num - n), and the condition 2n ≤ Max_Num set above guarantees exactly this.
Embodiment 4: the technical scheme is the same as in Embodiment 1, 2, or 3, and more specifically: the sorting is such that strong classifiers with higher classification capacity are placed earlier and strong classifiers with weaker classification capacity are placed later. Forming the new cascade classifier: strong classifiers with higher classification capacity occupy earlier positions in the cascade and weaker ones occupy later positions; the strong classifier with the strongest classification capacity is taken as the first-stage strong classifier of the cascade, followed in order by the second-stage strong classifier, the third-stage classifier, ..., and the n-th-stage strong classifier, where n is the number of computers; the first-stage through n-th-stage strong classifiers together constitute the new cascade classifier. Here n denotes the number of computers used in the distributed training, i.e., each computer produces exactly one strong classifier.
Embodiment 5: the technical scheme is the same as in Embodiment 1, 2, 3, or 4, and more specifically: when the distributed training reaches the second-to-last and the last round, the false alarm rate allocated in the strong classifier parameters is gradually reduced, and in the last round the false alarm rate is set to no more than 0%. When the distributed training is iterated, the last three rounds of distributed training apply a special adjustment of the index strategy, which ensures stable convergence of the final cascade classifier.
Let the overall recall rate index be A and the recall rate index of each of the first k strong classifier stages be a; the overall recall rate achieved by the first k stages is then a^k, and since the cascade has not yet exited, a^k is still above A. If the next-stage strong classifier is to serve as the last strong classifier, it therefore only needs to reach a recall rate of A/a^k, so the recall rate of that strong classifier is set to A/a^k. When A/a^k < a, this setting relaxes the exit condition of the last strong classifier to a certain extent and thus accelerates its convergence.
Throughout the training process, the samples that remain misclassified at the end usually have a very low degree of separability, i.e., they are hard to classify correctly; in traditional training methods, training on this portion of the samples tends to consume most of the training resources and most of the weak classifiers. The above method can alleviate, and to a great extent even eliminate, this training cost, so that the training of the classifier converges faster.
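As a numerical illustration of the recall rate setting above (A = 99% and a = 99.95% are the index values used elsewhere in this description; the stage count k = 17 is assumed for the example): a^17 = 0.9995^17 ≈ 99.15%, so the last stage only needs a recall of A/a^17 ≈ 0.99/0.9915 ≈ 99.85%, which is lower than the per-stage index of 99.95% and therefore easier to converge to.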
Embodiment 6: this embodiment targets the Adboost learning algorithm and proposes an Adboost training learning method based on distributed computation and detection cost ordering. The mathematical properties of the algorithm are first used to perform two-class classification on the samples; at the same time, exploiting the difference in size between the positive and negative sample sets, the class with fewer samples, here the positive samples, is distributed to each computer in turn.
The negative samples are then split: if the negative samples total N and there are n computers, then k = N/n negative samples are distributed to each computer in turn. The specific training flow is shown in Fig. 1.
It is assumed here that the false alarm rate index of the whole cascade classifier is Cascade_False_Alarm = 0;
the recall rate index of the whole cascade classifier is Cascade_Detection_Rate = 99%;
the false alarm rate index of each strong classifier is Strong_False_Alarm ≤ 50%;
For example, suppose the number of strong classifiers Max_Num is at most 20; then n ≤ 10, because the number of strong classifiers assembled in subsequent computations equals (n + n/2 + n/4 + ...) and (n + n/2 + n/4 + ...) < 2n ≤ Max_Num. The recall rate index of each strong classifier is then Strong_Detection_Rate ≥ 99.95%;
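As a short check of the 99.95% figure (assuming the per-stage recalls simply multiply across the cascade): with up to Max_Num = 20 stages and an overall recall target of Cascade_Detection_Rate = 99%, each stage must retain at least 0.99^(1/20) ≈ 0.99950 of the positive samples, i.e., about 99.95%, and indeed 0.9995^20 ≈ 0.9900.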
(1) The process of the first distributed training is described below.
Here the samples on each computer are assigned indexes, i.e., the same recall rate and false alarm rate indexes as those of each strong classifier in the whole cascade classifier are distributed. The indexes received by each computer are identical: the false alarm rate of each strong classifier must satisfy Strong_False_Alarm ≤ 50%, and the recall rate of each strong classifier must satisfy Strong_Detection_Rate ≥ 99.95%.
Thus, on each computer, once the first-stage strong classifier has been trained and is used to detect the samples on that computer, it is guaranteed that at least 99.95% of the positive samples are detected and that at least 50% of the negative samples are excluded. The first-stage strong classifier on each computer is therefore capable of correctly performing two-class classification on all the positive samples and on the negative samples it excludes. The first-stage strong classifiers on the computers are then sent to the server one by one, where the strong classifiers are collected.
On the server, the strong classifier sent by each computer is used to test all the negative samples in segments (the total number of negative samples may be too large to test in one pass), and its misclassification rate on the negative samples is accumulated and taken as the index of its classification capacity. The strong classifiers are sorted by classification capacity, and the strong classifier with the strongest classification capacity becomes the first-stage strong classifier of the cascade classifier, followed by the second-stage strong classifier, ..., and the n-th-stage strong classifier.
This completes the first round of distributed Adboost training, achieving a recall of (99.95%)^n on all the positive samples and a separation capacity of at least 50% on the negative samples. The subsequent distributed training rounds therefore need to train the separation capacity on the negative samples further.
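For instance, with the example figure n = 10 used above, (99.95%)^10 ≈ 99.5%, which at this point is still above the overall 99% recall target.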
(2) The process of the second distributed training is described below.
For the negative samples that have not been excluded (i.e., the misclassified negatives) on each computer, their sample labels are gathered on an extra computer and the set is split again. When splitting again, because the total number of remaining negative samples is at most 50% of the original scale, the number of computers needed is also halved.
First, the remaining capacity for strong classifiers in the cascade, Max_Num - n, must be compared with the number of strong classifiers about to be produced in this round, n/2. The remaining capacity must be at least as large as the number about to be produced, i.e., (n/2) ≤ (Max_Num - n), and the condition 2n ≤ Max_Num set above guarantees exactly this.
Then, similarly to the first distributed training, the samples on each computer are assigned indexes, i.e., the same recall rate and false alarm rate indexes as those of each strong classifier in the whole cascade classifier. The indexes received by each computer are that the false alarm rate of each strong classifier must satisfy Strong_False_Alarm ≤ 50% and the recall rate of each strong classifier must satisfy Strong_Detection_Rate ≥ 99.95%.
(3) The third distributed training and the final distributed training rounds.
From the third distributed training onwards, each round proceeds in the same way as the previous two, until only three computers remain, at which point special handling is required. In the present invention, when only three computers remain in training, two more rounds of distributed training still need to be performed, and the remaining negative samples must be completely separated from the positive samples in these last two distributed training rounds.
Let the overall recall rate index be A and the recall rate index of each of the first k strong classifier stages be a; the overall recall rate achieved by the first k stages is then a^k, and since the cascade has not yet exited, a^k is still above A. If the next-stage strong classifier is to serve as the last strong classifier, it therefore only needs to reach a recall rate of A/a^k, so the recall rate of that strong classifier is set to A/a^k. When A/a^k < a, this setting relaxes the exit condition of the last strong classifier to a certain extent and thus accelerates its convergence.
Throughout the training process, the samples that remain misclassified at the end usually have a very low degree of separability, i.e., they are hard to classify correctly; in traditional training methods, training on this portion of the samples tends to consume most of the training resources and most of the weak classifiers. The above method can alleviate, and to a great extent even eliminate, this training cost, so that the training of the classifier converges faster.
The evaluation indexes therefore need to be adjusted accordingly: the recall rate and false alarm rate of the second-to-last distributed training are set to Strong_Detection_Rate ≥ 99.95% and Strong_False_Alarm ≤ 30%, providing a transition for the convergence of the indexes of the last training round; the indexes of the last round are then set to Strong_Detection_Rate ≥ 99.95% and Strong_False_Alarm ≤ 0%.
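A minimal sketch of this index schedule in Python (the function name stage_indexes and the round bookkeeping are assumptions; the three index pairs come from the description above):

def stage_indexes(round_idx, total_rounds):
    # Return the recall / false-alarm indexes handed to every computer in the
    # given distributed-training round (0-based). Ordinary rounds use
    # (recall >= 99.95%, false alarm <= 50%); the second-to-last round tightens
    # the false alarm budget to 30% as a transition, and the last round to 0%.
    if round_idx == total_rounds - 1:
        return {"Strong_Detection_Rate": 0.9995, "Strong_False_Alarm": 0.0}
    if round_idx == total_rounds - 2:
        return {"Strong_Detection_Rate": 0.9995, "Strong_False_Alarm": 0.30}
    return {"Strong_Detection_Rate": 0.9995, "Strong_False_Alarm": 0.50}

print([stage_indexes(r, 4)["Strong_False_Alarm"] for r in range(4)])   # [0.5, 0.5, 0.3, 0.0]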
Embodiment 7: an Adboost training learning device based on distributed computation and detection cost ordering, comprising:
a detection target setting device, which sets the detection target and determines the number of computers for distribution according to the number of classifiers in the cascade;
a second sample distributing device, which distributes the second sample to each computer;
a first distributed training device, which performs distributed training: the first sample is split and distributed to the computers, each computer trains a strong classifier according to the strong classifier parameters it is allocated, and a new cascade classifier is formed according to the classification capacity of the strong classifiers;
a second distributed training device, which performs the next round of distributed training: the misclassified first samples on the computers are collected, split, and distributed to the computers again, each computer trains a strong classifier according to the strong classifier parameters it is allocated, and a new cascade classifier is formed according to the classification capacity of the strong classifiers;
a repeated distributed training device, which repeats the training of the second distributed training device until the new cascade classifier meets the exit condition;
wherein the first sample is the more numerous class of the positive and negative samples to be tested, and the second sample is the less numerous class of the positive and negative samples to be tested.
More specifically, the device of this embodiment sets the detection target, including the false alarm rate index of the cascade classifier and the recall rate index of the cascade classifier; distributing the strong classifier parameters means distributing the same recall rate and false alarm rate indexes as those of each strong classifier in the whole cascade classifier; the number of computers is n, the number of strong classifiers is Max_Num, the number of strong classifiers assembled equals (n + n/2 + n/4 + ...), and (n + n/2 + n/4 + ...) < 2n ≤ Max_Num.
The sorting is such that strong classifiers with higher classification capacity are placed earlier and strong classifiers with weaker classification capacity are placed later. Forming the new cascade classifier: the strong classifier with the strongest classification capacity is taken as the first-stage strong classifier of the cascade, followed in order by the second-stage strong classifier, the third-stage classifier, ..., and the n-th-stage strong classifier, where n is the number of computers; the first-stage through n-th-stage strong classifiers together constitute the new cascade classifier.
When the distributed training reaches the second-to-last and the last round, the false alarm rate allocated in the strong classifier parameters is gradually reduced, and in the last round the false alarm rate is set to no more than 0%.
The device of any of the technical schemes described in this embodiment may be used to perform the methods described in Embodiments 1 to 6.

Claims (10)

1. An Adboost training learning method based on distributed computation and detection cost ordering, characterized by comprising the following steps:
Step 1: setting a detection target and determining the number of computers for distribution according to the number of classifiers in the cascade;
Step 2: distributing the second sample to each computer;
Step 3: distributed training: splitting the first sample and distributing it to the computers; each computer training a strong classifier according to the strong classifier parameters it is allocated, and forming a new cascade classifier according to the classification capacity of the strong classifiers;
Step 4: the next round of distributed training: collecting the misclassified first samples from the computers, splitting them, and distributing them to the computers again; each computer training a strong classifier according to the strong classifier parameters it is allocated, and forming a new cascade classifier according to the classification capacity of the strong classifiers;
Step 5: repeating Step 4 until the new cascade classifier meets the exit condition;
wherein the first sample is the more numerous class of the positive and negative samples to be tested, and the second sample is the less numerous class of the positive and negative samples to be tested.
2. The Adboost training learning method based on distributed computation and detection cost ordering according to claim 1, characterized in that Step 3 consists of:
S3.1. splitting the first sample and distributing it to the computers in turn;
S3.2. distributing the strong classifier parameters to the computers;
S3.3. training a strong classifier on each computer, collecting the trained strong classifiers on a server, applying the collected strong classifiers to all test samples on the server, sorting the strong classifiers by classification capacity, and assembling the collected strong classifiers in the sorted order into the new cascade classifier.
3. The Adboost training learning method based on distributed computation and detection cost ordering according to claim 1, characterized in that in Step 1 the detection target is set, including the false alarm rate index of the cascade classifier and the recall rate index of the cascade classifier; in Step 2, distributing the strong classifier parameters means distributing the same recall rate and false alarm rate indexes as those of each strong classifier in the whole cascade classifier; the number of computers is n, the number of strong classifiers is Max_Num, the number of strong classifiers assembled equals (n + n/2 + n/4 + ...), and (n + n/2 + n/4 + ...) < 2n ≤ Max_Num.
4. The Adboost training learning method based on distributed computation and detection cost ordering according to claim 2, characterized in that the sorting is such that strong classifiers with higher classification capacity are placed earlier and strong classifiers with weaker classification capacity are placed later; forming the new cascade classifier means taking the strong classifier with the strongest classification capacity as the first-stage strong classifier of the cascade classifier, followed in order by the second-stage strong classifier, the third-stage classifier, ..., and the n-th-stage strong classifier, where n is the number of computers, the first-stage through n-th-stage strong classifiers together constituting the new cascade classifier.
5. The Adboost training learning method based on distributed computation and detection cost ordering according to claim 1, 2, 3, or 4, characterized in that when the distributed training reaches the second-to-last and the last round, the false alarm rate allocated in the strong classifier parameters is gradually reduced, and in the last round the false alarm rate is set to no more than 0%.
6. An Adboost training learning device based on distributed computation and detection cost ordering, characterized by comprising:
a detection target setting device, which sets the detection target and determines the number of computers for distribution according to the number of classifiers in the cascade; a second sample distributing device, which distributes the second sample to each computer;
a first distributed training device, which performs distributed training: the first sample is split and distributed to the computers, each computer trains a strong classifier according to the strong classifier parameters it is allocated, and a new cascade classifier is formed according to the classification capacity of the strong classifiers;
a second distributed training device, which performs the next round of distributed training: the misclassified first samples on the computers are collected, split, and distributed to the computers again, each computer trains a strong classifier according to the strong classifier parameters it is allocated, and a new cascade classifier is formed according to the classification capacity of the strong classifiers;
a repeated distributed training device, which repeats the training of the second distributed training device until the new cascade classifier meets the exit condition;
wherein the first sample is the more numerous class of the positive and negative samples to be tested, and the second sample is the less numerous class of the positive and negative samples to be tested.
7. The Adboost training learning device based on distributed computation and detection cost ordering according to claim 6, characterized in that the detection target is set, including the false alarm rate index of the cascade classifier and the recall rate index of the cascade classifier; distributing the strong classifier parameters means distributing the same recall rate and false alarm rate indexes as those of each strong classifier in the whole cascade classifier; the number of computers is n, the number of strong classifiers is Max_Num, the number of strong classifiers assembled equals (n + n/2 + n/4 + ...), and (n + n/2 + n/4 + ...) < 2n ≤ Max_Num.
8. The Adboost training learning device based on distributed computation and detection cost ordering according to claim 7, characterized in that the sorting is such that strong classifiers with higher classification capacity are placed earlier and strong classifiers with weaker classification capacity are placed later; forming the new cascade classifier means taking the strong classifier with the strongest classification capacity as the first-stage strong classifier of the cascade classifier, followed in order by the second-stage strong classifier, the third-stage classifier, ..., and the n-th-stage strong classifier, where n is the number of computers, the first-stage through n-th-stage strong classifiers together constituting the new cascade classifier.
9. The Adboost training learning device based on distributed computation and detection cost ordering according to claim 6, 7, or 8, characterized in that when the distributed training reaches the second-to-last and the last round, the false alarm rate allocated in the strong classifier parameters is gradually reduced, and in the last round the false alarm rate is set to no more than 0%.
10. The Adboost training learning device based on distributed computation and detection cost ordering according to claim 9, characterized in that the method of gradually reducing the false alarm rate in the strong classifier parameters is: the overall recall rate index is set to A, the recall rate index of each of the first k strong classifier stages is a, the overall recall rate achieved by the first k stages has reached a^k, and the recall rate of the strong classifier is set to A/a^k.
CN201610201531.6A 2016-03-31 2016-03-31 Training learning method and device based on distributed computing and detection cost sequence Active CN105760899B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610201531.6A CN105760899B (en) 2016-03-31 2016-03-31 Training learning method and device based on distributed computing and detection cost sequence

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610201531.6A CN105760899B (en) 2016-03-31 2016-03-31 Training learning method and device based on distributed computing and detection cost sequence

Publications (2)

Publication Number Publication Date
CN105760899A (en) 2016-07-13
CN105760899B (en) 2019-04-05

Family

ID=56347107

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610201531.6A Active CN105760899B (en) 2016-03-31 2016-03-31 Training learning method and device based on distributed computing and detection cost sequence

Country Status (1)

Country Link
CN (1) CN105760899B (en)

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20070112701A1 (en) * 2005-08-15 2007-05-17 Microsoft Corporation Optimization of cascaded classifiers
CN101075291A (en) * 2006-05-18 2007-11-21 中国科学院自动化研究所 Efficient promoting exercising method for discriminating human face
CN101743537A (en) * 2007-07-13 2010-06-16 微软公司 Multiple-instance pruning for learning efficient cascade detectors
US20130013542A1 (en) * 2009-08-11 2013-01-10 At&T Intellectual Property I, L.P. Scalable traffic classifier and classifier training system
CN103020712A (en) * 2012-12-28 2013-04-03 东北大学 Distributed classification device and distributed classification method for massive micro-blog data

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
贺灏等 (He Hao et al.): "一种分布式计算的Adaboost训练算法" [An Adaboost Training Algorithm Based on Distributed Computing], 《计算机应用与软件》 [Computer Applications and Software] *

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107463444A (en) * 2017-07-13 2017-12-12 中国航空工业集团公司西安飞机设计研究所 A kind of false alarm rate distribution method

Also Published As

Publication number Publication date
CN105760899B (en) 2019-04-05

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant