CN110378872A - Multi-source adaptive balanced transfer learning method for crack image detection - Google Patents
Multi-source adaptive balanced transfer learning method for crack image detection
- Publication number
- CN110378872A (application CN201910496225.3A)
- Authority
- CN
- China
- Prior art keywords
- weight
- data set
- sample
- training
- image
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/23—Clustering techniques
- G06F18/232—Non-hierarchical techniques
- G06F18/2321—Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
- G06F18/23213—Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/0002—Inspection of images, e.g. flaw detection
- G06T7/0004—Industrial image inspection
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20081—Training; Learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
- G06T2207/30108—Industrial image inspection
- G06T2207/30132—Masonry; Concrete
Abstract
The invention discloses a multi-source adaptive balanced transfer learning method for crack image detection, comprising the following steps: 1) a correction coefficient is added on the basis of the TrAdaBoost algorithm to solve the problem that the auxiliary data weights converge too quickly; 2) an adaptive covering parameter is introduced into the correction coefficient to reflect whether a similarity relationship exists between the auxiliary data sets and the target data set; 3) a final balance weight method keeps the importance of the finally obtained target data set consistent with that of the crack data sets of each field, improving the accuracy and efficiency of dam crack image detection. The invention improves crack detection accuracy and boosts dam crack image detection performance on small sample data sets.
Description
Technical field
The invention belongs to the technical field of image detection, and in particular relates to a multi-source adaptive balanced transfer learning method for crack image detection.
Background technique
China has more reservoir dams than any other country in the world: by the end of 2016, more than 98,000 reservoir dams of all kinds had been completed in China. As dams age and come under the influence of the natural environment and human factors, a series of visible defects such as deformation, cracks, leakage and calcium precipitation appear on the surface and inside of dams, increasing the probability of failure and threatening people's lives and property. Cracks are one of the main hazards to dams.
Traditional machine learning methods require a large number of training samples and assume that the training data and the test data are drawn from the same distribution. In practical applications, however, test data and training data do not necessarily satisfy this same-distribution assumption. Transfer learning relaxes these two basic assumptions of conventional machine learning. It mainly targets domains with small, limited sample sizes, where machine learning alone easily overfits and fails to train and learn. By exploiting well-trained models and samples from fields with a certain similarity, a model that meets the task requirements can be constructed, achieving the effect of a good model even on a small data set.
The TrAdaBoost algorithm is an instance-based transfer learning method for the case where the training set and the test set follow different distributions; it obtains good results when the auxiliary data and the source data share many similarities. TrAdaBoost selects a usable portion of the auxiliary training set and combines it with the target training set, training a more accurate model than the target training set alone.
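The weight dynamics that motivate this patent can be illustrated with a minimal numerical sketch. The β factor and the two update rules below follow the standard TrAdaBoost form, and the per-sample losses are hypothetical stand-ins for a real weak classifier; this is not the patent's exact algorithm.

```python
import numpy as np

def tradaboost_weight_trace(n_aux, n_tgt, M):
    """Trace the total auxiliary-weight share over M TrAdaBoost rounds:
    misclassified auxiliary samples lose weight each round, while
    misclassified target samples gain weight (AdaBoost-style)."""
    w_aux = np.ones(n_aux) / (n_aux + n_tgt)
    w_tgt = np.ones(n_tgt) / (n_aux + n_tgt)
    beta = 1.0 / (1.0 + np.sqrt(2.0 * np.log(n_aux) / M))  # fixed source decay
    history = []
    for _ in range(M):
        # hypothetical 0/1 losses standing in for a trained weak classifier
        loss_aux = np.random.binomial(1, 0.4, n_aux)
        loss_tgt = np.random.binomial(1, 0.2, n_tgt)
        eps = np.sum(w_tgt * loss_tgt) / np.sum(w_tgt)      # target error rate
        eps = min(max(eps, 1e-6), 0.499)
        beta_t = eps / (1.0 - eps)
        w_aux *= beta ** loss_aux        # shrink misclassified auxiliary weights
        w_tgt *= beta_t ** (-loss_tgt)   # grow misclassified target weights
        s = w_aux.sum() + w_tgt.sum()
        w_aux, w_tgt = w_aux / s, w_tgt / s
        history.append(w_aux.sum())
    return history

np.random.seed(0)
h = tradaboost_weight_trace(n_aux=100, n_tgt=20, M=30)
# the auxiliary share decays round after round -- the convergence problem
# that the correction coefficient below is designed to counteract
```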
Summary of the invention
Object of the invention: to overcome the problems in the prior art that dam crack images are scarce, that training samples are unevenly distributed, and that the TrAdaBoost algorithm easily weakens the effect of the auxiliary data sets during training, the present invention provides a multi-source adaptive balanced transfer learning method for crack image detection. A strong classifier for dam crack images is trained to improve crack detection accuracy and boost dam crack image detection performance on small sample data sets.
Technical solution: to achieve the above object, the present invention provides a multi-source adaptive balanced transfer learning method for crack image detection, comprising the following steps:
(1) input the multi-source auxiliary image data sets;
(2) perform K-means clustering on the images, then reject the pictures that differ greatly from the target data;
(3) discard the auxiliary data that differ greatly from the target from the crack image library, and train the classifier;
(4) set the weight update formula to w_i^{a,m+1} = λ_m C_m w_i^{a,m} β^{|F_m(x_i) − y_i|}, with the adaptive covering parameter λ_m, to modify the corresponding weights; a correction coefficient is added to the weight update strategy, an adaptive covering parameter is introduced into the correction coefficient, and finally the final balance weight method resets the final weight of the target data set to the average of the per-field auxiliary training set weights in the last iteration;
(5) update the weight vector and return to step (3) until an SVM strong classifier is obtained; finally reset the weight of D_T: reset the D_T weights to the average weight of each D_s after the iterations, and use D_s and the reset D_T to jointly train one final classifier.
Further, the specific steps in step (2) of performing K-means clustering on the images and then rejecting the pictures that differ greatly from the target data are as follows:
(2.1) first convert each image X_i (i = 1, 2, …, n) in the crack image library to grayscale and store it sequentially in a one-dimensional matrix D_X;
(2.2) then store it block by block with a block length of 10 pixels and a moving step of 3 pixels, recording the starting position of each block, to obtain n pixel-block data sets; arbitrarily select the gray means of 30 image blocks as the initial cluster centers;
(2.3) according to the gray mean of each image matrix block, compute the Euclidean distance of these objects to the 30 image sample cluster centers, as shown in the following formula:

dis(x_i, y_j) = √( Σ_k (x_{ik} − y_{jk})² )

then divide again by the minimum distance to the corresponding gray means, assigning each image matrix block to the most similar class. Here dis(x_i, y_j) is the distance between the two data objects x_i and y_j: the smaller dis(x_i, y_j), the more similar x_i and y_j are; the larger it is, the greater the gap between them;
(2.4) recalculate the centroid (pixel gray mean) of each changed image block cluster;
(2.5) repeat steps (2.3) and (2.4) until the cluster centers of each data class no longer change.
After the input image matrices are stored as pixel blocks, the pixel matrix blocks are clustered with the K-means clustering algorithm, the pixel blocks in each cluster are sorted by their Euclidean distance to the cluster center, and the pictures with a large clustering distance are deleted.
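Steps (2.1)-(2.5) can be sketched as follows. The 10-pixel block length, 3-pixel step and distance-based deletion follow the text, while the per-image feature (mean block gray level), the deletion fraction and the demo images are simplifying assumptions.

```python
import numpy as np

def blockify(gray, block_len=10, step=3):
    """Flatten a grayscale image and cut it into overlapping 1-D blocks of
    10 pixels with a moving step of 3, as in step (2.2)."""
    flat = gray.ravel().astype(float)
    starts = range(0, len(flat) - block_len + 1, step)
    return np.array([flat[s:s + block_len] for s in starts])

def kmeans_filter(images, k=30, n_iter=50, drop_frac=0.2, rng=None):
    """Cluster per-image block gray means with K-means and drop the images
    farthest from their cluster centre (the 'large clustering distance' rule)."""
    rng = np.random.default_rng(rng)
    feats = np.array([blockify(img).mean() for img in images])[:, None]
    k = min(k, len(feats))
    centers = feats[rng.choice(len(feats), k, replace=False)]
    for _ in range(n_iter):
        d = np.abs(feats - centers.T)            # Euclidean distance in 1-D
        label = d.argmin(axis=1)
        new = np.array([feats[label == j].mean(axis=0) if np.any(label == j)
                        else centers[j] for j in range(k)])
        if np.allclose(new, centers):
            break
        centers = new
    dist = np.abs(feats[:, 0] - centers[label, 0])
    keep = dist.argsort()[: int(np.ceil(len(images) * (1 - drop_frac)))]
    return sorted(keep.tolist())

# three similar 8x8 images plus one gray-level outlier; k=1 for this tiny demo
imgs = [np.full((8, 8), v) for v in (10, 12, 11, 200)]
kept = kmeans_filter(imgs, k=1, drop_frac=0.25, rng=0)
```

With these toy images the outlier (index 3) is the farthest from the cluster center and is the one discarded.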
Further, the specific steps of training the classifier in step (3) are as follows:
(3.1) Let the labeled target-domain training set be D_T = {(x_t, y_t)} and D_S the set of N auxiliary data sets, i.e. D_S = {D_1, D_2, …, D_N} = {(x_1, y_1), …, (x_k, y_k), …, (x_N, y_N)}; initialize the weight vector w = (w_S, w_T), and normalize the samples of the merged training set; combine each multi-source auxiliary data set D_i (i = 1, 2, …, N) with the labeled target-domain data set D_T as <D_i, D_T> to obtain the combined data set D_{i,T};
(3.2) start training the network: on each combination D_{i,T}, uniformly perform image preprocessing, image segmentation and feature extraction, save all the features, and train an SVM classifier to obtain the i-th weak classifier F_m;
(3.3) compute the error of the weak classifier F_m on D_S and on D_T separately, as the weighted misclassification rate

ε = Σ_i w_i |F_m(x_i) − y_i| / Σ_i w_i ,

where n_S is the auxiliary data set sample size, n_T the target data set sample size, y_i the true class, and F_m(x_i) the class output by classifier F_m.
Further, the specific steps in step (4) of resetting the final weight of the target data set to the average of the per-field auxiliary training set weights in the last iteration are as follows:
(4.1) Add a correction coefficient to the update of the auxiliary data set weights:
On the basis of TrAdaBoost, a correction coefficient is added to the update of the auxiliary data set sample weights. As the number of iterations M grows, the auxiliary training set of every field is eventually classified correctly; after M iterations, the sum of the auxiliary sample weights of each field a is

W_a = Σ_{i=1}^{n_a} w_i^{a} ,

where a is an auxiliary training set, n_a the number of samples in a, and w_i^a the weight of each training sample in source domain a.
The weights of correctly predicted samples in the target data set b remain unchanged, so the sum of the correct-sample weights W_b^c is

W_b^c = Σ_{i: F_m(x_i)=y_i} w_i^{b} ,

where n_b is the number of samples in b, w_i^b the weight of each training sample in b, and ε_b the error rate of the weak classifier on b.
Mispredicted samples in the target data set b are updated with β_b = ε_b/(1 − ε_b) to modify the corresponding weights:

w_i^{b} ← w_i^{b} β_b^{−1} = w_i^{b} (1 − ε_b)/ε_b .

The sum of the error-sample weights W_b^e in the target data set b is

W_b^e = Σ_{i: F_m(x_i)≠y_i} w_i^{b} (1 − ε_b)/ε_b .

The sum of all target-domain sample weights, i.e. the correct-sample plus error-sample weights, is

W_b = W_b^c + W_b^e .

When the number of iterations is sufficiently large, the auxiliary training set of every field is classified correctly; after the iterations β^{|F_m(x_i) − y_i|} = 1, so w_i^{a,m+1} = w_i^{a,m}.
If a correction coefficient C_m is added for the auxiliary data set samples, the weight becomes

w_i^{a,m+1} = C_m w_i^{a,m} β^{|F_m(x_i) − y_i|} .

Since the auxiliary data set sample weights are stable at this point, i.e. w_i^{a,m+1} = w_i^{a,m}, the correction coefficient can be obtained as

C_m = 2(1 − ε_b) .

As the correction coefficient formula shows, C_m is inversely related to the error rate ε_b of the weak classifier on the target data set b: the larger ε_b, the smaller C_m, the more the auxiliary data set sample weights increase, and the greater their influence on the next round of weak classifier training; the smaller ε_b, the larger C_m, the more the auxiliary data set sample weights decrease, and the smaller their influence on the next round of weak classifier training. Therefore, adding the correction coefficient C_m on the basis of the TrAdaBoost algorithm keeps the sample weights of the target data set and of the auxiliary data sets converging at the same time.
(4.2) Introduce the adaptive covering parameter:
An adaptive covering parameter is introduced into the correction coefficient. The adaptive covering parameter is the sum of the classification accuracies of the base classifier on the auxiliary data set and on the target data set, that is

λ_m = (1 − ε_a) + (1 − ε_b) .

The per-field auxiliary data sample weights after the (m+1)-th iteration are then

w_i^{a,m+1} = λ_m C_m w_i^{a,m} β^{|F_m(x_i) − y_i|} .
(4.3) The final balance weight method:
The final weight of the target data set is reset to the average of the per-field auxiliary training set weights in the last iteration, making the importance of the finally obtained target data set consistent with that of the per-field auxiliary training sets.
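The three mechanisms of step (4) can be sketched together. The closed forms of the correction coefficient and the adaptive covering parameter appear only as images in the original publication, so C_m = 2(1 − ε_b) and λ_m = (1 − ε_a) + (1 − ε_b) below are assumed reconstructions from the stated properties, not a verbatim transcription.

```python
import numpy as np

def update_aux_weights(w_aux, loss_aux, beta, eps_a, eps_b):
    """One auxiliary-weight update with the correction coefficient C_m and
    the adaptive covering parameter lam (both ASSUMED forms: C_m falls as
    the target error eps_b rises; lam sums the two accuracies)."""
    C_m = 2.0 * (1.0 - eps_b)
    lam = (1.0 - eps_a) + (1.0 - eps_b)
    return lam * C_m * np.asarray(w_aux, dtype=float) * beta ** np.asarray(loss_aux)

def final_balance(w_tgt, aux_weight_sets):
    """Final balance weight method: reset every target weight to the average
    of the per-field auxiliary training-set weights from the last iteration."""
    avg = float(np.mean([np.mean(w) for w in aux_weight_sets]))
    return np.full_like(np.asarray(w_tgt, dtype=float), avg)

w = np.array([0.1, 0.1])
loss = np.array([1, 0])            # first sample misclassified, second correct
w_low = update_aux_weights(w, loss, beta=0.6, eps_a=0.1, eps_b=0.1)
w_high = update_aux_weights(w, loss, beta=0.6, eps_a=0.1, eps_b=0.4)
# larger target error eps_b -> smaller correction coefficient C_m
w_t = final_balance(np.ones(3), [np.array([0.2, 0.4]), np.array([0.6])])
```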
Beneficial effects: compared with the prior art, the present invention has the following advantages:
(1) K-means clustering before the TrAdaBoost algorithm deletes the pictures with a large clustering distance from the crack image library, which benefits the training of the subsequent classifier and improves training efficiency.
(2) Compared with TrAdaBoost, introducing the correction coefficient solves the problem that, as the number of iterations increases, the source-domain weights decline too quickly and the gap between the target and source-domain weights grows too large.
(3) Introducing the adaptive covering parameter into the correction coefficient reflects whether a similarity relationship exists between the source-domain training data sets and the target-domain training data set, improving the detection performance of the method.
(4) The final balance weight method makes the importance of the finally obtained target data set consistent with that of the crack data sets of each field, boosting the performance of the dam crack image classifier on small sample data sets.
Detailed description of the invention
Fig. 1 is flow chart of the invention.
Specific embodiment
Combined with specific embodiments below, the present invention is furture elucidated, it should be understood that these embodiments are merely to illustrate the present invention
Rather than limit the scope of the invention, after the present invention has been read, those skilled in the art are to various equivalences of the invention
The modification of form falls within the application range as defined in the appended claims.
The multi-source adaptive balanced TrAdaBoost transfer learning method for crack image detection of the present invention, as shown in Algorithm 1, comprises two parts: K-means image clustering and multi-source adaptive balanced TrAdaBoost transfer learning.
Algorithm 1: K-means-based multi-source adaptive balanced TrAdaBoost transfer learning method
1) K-means image clustering:
In the K-means clustering method, K denotes the number of cluster centroids and "means" denotes the mean of the data in a cluster. Its core idea is to randomly select the initial centers of k clusters, each center representing one cluster, and to assign each remaining data object to the nearest cluster according to its distance from each cluster center.
The K-means image clustering method uses the Euclidean distance as the similarity measure for clustering and sorting. The pictures with a large clustering distance are deleted from the crack image library, which benefits the training of the subsequent classifier and improves training efficiency.
The specific steps of the K-means image clustering method are as follows:
Step 1: first convert each image X_i (i = 1, 2, …, n) in the crack image library to grayscale and store it sequentially in a one-dimensional matrix D_X;
Step 2: then store it block by block with a block length of 10 pixels and a moving step of 3 pixels, recording the starting position of each block, to obtain n pixel-block data sets; arbitrarily select the gray means of 30 image blocks as the initial cluster centers;
Step 3: according to the gray mean of each image matrix block, compute the Euclidean distance of these objects to the 30 image sample cluster centers, as shown in formula (1):

dis(x_i, y_j) = √( Σ_k (x_{ik} − y_{jk})² )    (1)

then divide again by the minimum distance to the corresponding gray means, assigning each image matrix block to the most similar class. Here dis(x_i, y_j) is the distance between the two data objects x_i and y_j: the smaller dis(x_i, y_j), the more similar x_i and y_j are; the larger it is, the greater the gap between them.
Step 4: recalculate the centroid (pixel gray mean) of each changed image block cluster;
Step 5: repeat steps 3 and 4 until the cluster centers of each data class no longer change.
After the input image matrices are stored as pixel blocks, the pixel matrix blocks are clustered with the K-means clustering algorithm, the pixel blocks in each cluster are sorted by their Euclidean distance to the cluster center, and the pictures with a large clustering distance are deleted. Discarding the auxiliary data that differ greatly from the target from the crack image library benefits the training of the subsequent classifier and improves training efficiency.
2) Multi-source adaptive balanced TrAdaBoost transfer learning
The multi-source adaptive balanced TrAdaBoost (Multi-source Adaptive Balance TrAdaBoost, MABtrA) transfer learning method adds a correction coefficient on the basis of the TrAdaBoost algorithm to solve the problem that the auxiliary data weights converge too quickly; an adaptive covering parameter is introduced into the correction coefficient to reflect whether a similarity relationship exists between the auxiliary data sets and the target data set; and after the iterations, the final balance weight method makes the importance of the finally obtained target data set consistent with that of the crack data sets of each field, improving the accuracy and efficiency of dam crack image detection. Adding the correction coefficient to update the auxiliary data set sample weights, introducing the adaptive covering parameter, and the final balance weight method are detailed as follows:
(1) Add a correction coefficient to the update of the auxiliary data set weights
Because differences exist between the auxiliary data sets of each field and the target data set, the weak classifier obtained by training has a high error rate on the target data set. Consequently, the weight of the auxiliary training set of each field keeps decreasing as the number of iterations increases, and the finally trained weights become so small that the auxiliary data sets are rendered irrelevant and can no longer assist learning on the target data set. Meanwhile, the weight of the target data set keeps increasing with the iterations, which easily produces hard-to-classify samples.
In order to make better use of the auxiliary data sets of each field and the target data set for training, a correction coefficient is added on the basis of TrAdaBoost when updating the auxiliary data set sample weights. As the number of iterations m grows, the auxiliary training set of every field is eventually classified correctly; after m iterations, the sum of the auxiliary sample weights of each field a is

W_a = Σ_{i=1}^{n_a} w_i^{a} ,    (2)

where a is an auxiliary training set, n_a the number of samples in a, and w_i^a the weight of each training sample in source domain a.
The weights of correctly predicted samples in the target data set b remain unchanged, so the sum of the correct-sample weights W_b^c is

W_b^c = Σ_{i: F_m(x_i)=y_i} w_i^{b} ,    (3)

where n_b is the number of samples in b, w_i^b the weight of each training sample in b, and ε_b the error rate of the weak classifier on b.
Mispredicted samples in the target data set b are updated with β_b = ε_b/(1 − ε_b) to modify the corresponding weights:

w_i^{b} ← w_i^{b} β_b^{−1} = w_i^{b} (1 − ε_b)/ε_b .    (4)

The sum of the error-sample weights W_b^e in the target data set b is

W_b^e = Σ_{i: F_m(x_i)≠y_i} w_i^{b} (1 − ε_b)/ε_b .    (5)

The sum of all target-domain sample weights, i.e. the correct-sample plus error-sample weights, is

W_b = W_b^c + W_b^e .    (6)

Therefore, the auxiliary data set sample weight distribution at iteration M+1 is

w_i^{a,M+1} = w_i^{a,M} β^{|F_M(x_i) − y_i|} .    (7)

When the number of iterations is sufficiently large, the auxiliary training set of every field is classified correctly; after the iterations β^{|F_M(x_i) − y_i|} = 1, and combining with formula (7) gives

w_i^{a,M+1} = w_i^{a,M} .    (8)

If a correction coefficient C_m is added for the auxiliary data set samples, the weight becomes

w_i^{a,m+1} = C_m w_i^{a,m} β^{|F_m(x_i) − y_i|} .    (9)

Since the auxiliary data set sample weights are stable at this point, i.e. w_i^{a,m+1} = w_i^{a,m}, the correction coefficient can be obtained from relations (8) and (9) as

C_m = 2(1 − ε_b) .    (10)

Formula (10) shows that the correction coefficient C_m is inversely related to the error rate ε_b of the weak classifier on the target data set b: the larger ε_b, the smaller C_m, the more the auxiliary data set sample weights increase, and the greater their influence on the next round of weak classifier training; the smaller ε_b, the larger C_m, the more the auxiliary data set sample weights decrease, and the smaller their influence on the next round of weak classifier training. Therefore, adding the correction coefficient C_m on the basis of the TrAdaBoost algorithm keeps the sample weights of the target data set and of the auxiliary data sets converging at the same time.
(2) Introduce the adaptive covering parameter
However, even when ε_b is low, the classification effect of the weak classifier on the source-domain training sets can still differ, and these differences reflect the correlation between the source-domain training sets and the target-domain training set. In order to reflect this similarity relationship, an adaptive covering parameter is introduced into the correction coefficient. The adaptive covering parameter is the sum of the classification accuracies of the base classifier on the auxiliary data set and on the target data set, that is

λ_m = (1 − ε_a) + (1 − ε_b) .    (11)

The per-field auxiliary data sample weights after the (m+1)-th iteration are then

w_i^{a,m+1} = λ_m C_m w_i^{a,m} β^{|F_m(x_i) − y_i|} .    (12)
(3) The final balance weight method
The basic idea of the final balance weight method is as follows: during the iterations, the auxiliary data weights keep declining and the target data weights keep increasing, so after the iterations the gap between the auxiliary and target data weights is large; but when the final classifier is formed, the target data set and the auxiliary training set of every field should be treated fairly. The final weight of the target data set is therefore reset to the average of the per-field auxiliary training set weights in the last iteration, making the importance of the finally obtained target data set consistent with that of the per-field auxiliary training sets and improving the detection accuracy of the algorithm.
The evaluation criteria of the specific embodiment of the invention are as follows:
The evaluation criteria of the specific embodiment of the invention are recall (Recall), precision (Precision), accuracy (Accuracy) and the comprehensive evaluation index (F-Measure). Recall is the ratio of correctly predicted positives to the actual positives, i.e. the proportion of the targets of a class that are identified; precision is the ratio of correctly predicted positives to all predicted positives, i.e. the proportion of real targets among all returned results; accuracy is the proportion of correctly predicted samples among all samples; the comprehensive evaluation index is a combined assessment of recall and precision:

Recall = TP / (TP + FN)    (13)
Precision = TP / (TP + FP)    (14)
Accuracy = (TP + TN) / (TP + TN + FP + FN)    (15)
F-Measure = (2 × TP) / (2 × TP + FP + FN)    (16)

For all four evaluation criteria, a larger value indicates a better prediction effect of the algorithm.
The meanings of TP, FN, FP and TN are shown in the binary classification confusion matrix of Table 1.
Table 1 Binary classification confusion matrix

                    Predicted positive    Predicted negative
Actual positive            TP                    FN
Actual negative            FP                    TN
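The four criteria can be computed directly from the Table 1 counts; formula (16) is as given in the text, and the other three take their standard forms matching the prose definitions:

```python
def binary_metrics(tp, fn, fp, tn):
    """Recall, precision, accuracy and F-measure from confusion-matrix counts."""
    recall = tp / (tp + fn)                        # formula (13)
    precision = tp / (tp + fp)                     # formula (14)
    accuracy = (tp + tn) / (tp + tn + fp + fn)     # formula (15)
    f_measure = (2 * tp) / (2 * tp + fp + fn)      # formula (16)
    return recall, precision, accuracy, f_measure

r, p, a, f = binary_metrics(tp=40, fn=10, fp=10, tn=40)
# r = 0.8, p = 0.8, a = 0.8, f = 0.8
```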
Fig. 1 is the model training flow chart of the embodiment of the present invention; the working process is as follows:
1. Input the multi-source auxiliary image data sets.
2. After K-means clustering of the images, delete the pictures that differ greatly from the target source. The specific steps of the K-means image clustering method are as follows:
Step 1: first convert each image X_i (i = 1, 2, …, n) in the crack image library to grayscale and store it sequentially in a one-dimensional matrix D_X;
Step 2: then store it block by block with a block length of 10 pixels and a moving step of 3 pixels, recording the starting position of each block, to obtain n pixel-block data sets; arbitrarily select the gray means of 30 image blocks as the initial cluster centers;
Step 3: according to the gray mean of each image matrix block, compute the Euclidean distance of these objects to the 30 image sample cluster centers, as shown in formula (1):

dis(x_i, y_j) = √( Σ_k (x_{ik} − y_{jk})² )    (1)

then divide again by the minimum distance to the corresponding gray means, assigning each image matrix block to the most similar class. Here dis(x_i, y_j) is the distance between the two data objects x_i and y_j: the smaller dis(x_i, y_j), the more similar x_i and y_j are; the larger it is, the greater the gap between them.
Step 4: recalculate the centroid (pixel gray mean) of each changed image block cluster;
Step 5: repeat steps 3 and 4 until the cluster centers of each data class no longer change.
After the input image matrices are stored as pixel blocks, the pixel matrix blocks are clustered with the K-means clustering algorithm, the pixel blocks in each cluster are sorted by their Euclidean distance to the cluster center, and the pictures with a large clustering distance are deleted. Discarding the auxiliary data that differ greatly from the target from the crack image library benefits the training of the subsequent classifier and improves training efficiency.
3. Let the labeled target-domain training set be D_T = {(x_t, y_t)} and D_S the set of N auxiliary data sets, i.e. D_S = {D_1, D_2, …, D_N} = {(x_1, y_1), …, (x_k, y_k), …, (x_N, y_N)}. Initialize the weight vector w = (w_S, w_T), and normalize the samples of the merged training set. Combine each multi-source auxiliary data set D_i (i = 1, 2, …, N) with the labeled target-domain data set D_T as <D_i, D_T> to obtain the combined data set D_{i,T}.
4. Start training the network: on each combination D_{i,T}, uniformly perform image preprocessing, image segmentation and feature extraction, save all the features, and train an SVM classifier to obtain the i-th weak classifier F_m.
5. Compute the error of the weak classifier F_m on D_S and on D_T separately, as the weighted misclassification rate

ε = Σ_i w_i |F_m(x_i) − y_i| / Σ_i w_i ,

where n_S is the auxiliary data set sample size, n_T the target data set sample size, y_i the true class, and F_m(x_i) the class output by classifier F_m.
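Step 5's error computation can be sketched as follows; the exact formula is rendered as an image in the source, so the standard weighted misclassification rate is assumed:

```python
import numpy as np

def weighted_error(w, y_true, y_pred):
    """Weighted misclassification rate of a weak classifier on one sample set
    (assumed standard form: sum of weights of the misses over the weight total)."""
    w = np.asarray(w, dtype=float)
    miss = (np.asarray(y_true) != np.asarray(y_pred)).astype(float)
    return float(np.sum(w * miss) / np.sum(w))

# errors of F_m on the auxiliary set D_S and on the target set D_T
eps_s = weighted_error([1, 1, 2], [0, 1, 1], [0, 0, 1])   # one miss of weight 1
eps_t = weighted_error([1, 1], [1, 0], [1, 0])            # no misses
# eps_s = 0.25, eps_t = 0.0
```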
6. Set the weight update formula to w_i^{a,m+1} = λ_m C_m w_i^{a,m} β^{|F_m(x_i) − y_i|}, with the adaptive covering parameter λ_m, to modify the corresponding weights. A correction coefficient is added to the weight update strategy, an adaptive covering parameter is introduced into the correction coefficient, and finally the final balance weight method resets the final weight of the target data set to the average of the per-field auxiliary training set weights in the last iteration. The specific principle is as follows:
(1) increase the weight that correction coefficient updates auxiliary data collection
In order to preferably using each field auxiliary data collection and target data set training, increase on the basis of TrAdaBoost
The weight of correction coefficient update auxiliary data collection sample.When the number of iterations M constantly increases, every field supplemental training collection can be by
It is correct to return, after M iteration, the sum of each field of auxiliary sample weights are as follows:
Wherein, a is auxiliary training set, naTo assist number of samples in training set a,For each training sample in source domain a
Weight.
The correct sample weights of forecast sample are constant in target data set b, then the weights sum of correct sampleAre as follows:
Wherein, nbFor number of samples in target data set b,For training sample weight each in b,It is Weak Classifier in b
On error rate.
Prediction error sample needs to update in target data set bModify corresponding weight:
The weights sum of error sample in target data set bAre as follows:
The sum of all aiming field sample weights, even if correct sample and error sample weights sum:
When the number of iterations is sufficiently large, each field supplemental training collection can be returned correctly, after iteration,Connection formula 7 can obtain:
If auxiliary data, which integrates sample, increases correction coefficient as Cm, weight becomes:
Due to auxiliary data collection sample weights at this time stablize it is constant, i.e.,It can according to relational expression 8 and 9
Obtain correction coefficient are as follows:
From the correction coefficient formula it can be seen that $C_m$ is inversely related to the error rate $\varepsilon_b$ of the weak classifier on the target data set b: the larger $\varepsilon_b$, the smaller $C_m$, the greater the relative weight of the auxiliary data set samples, and the greater their influence on the weak classifier trained in the next iteration; the smaller $\varepsilon_b$, the larger $C_m$, the smaller the relative weight of the auxiliary data set samples, and the smaller their influence on the next iteration's weak classifier. Adding the correction coefficient $C_m$ on the basis of the TrAdaBoost algorithm therefore keeps both the target data set and the auxiliary data set sample weights convergent.
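The correction-coefficient update derived above can be sketched numerically as follows. This is a minimal illustration of the stated relations only; the function and variable names are my own, not from the patent:

```python
import numpy as np

def correction_coefficient(eps_b: float) -> float:
    """Correction coefficient C_m = 2 * (1 - eps_b), inversely related to the
    weak classifier's error rate eps_b on the target data set b."""
    return 2.0 * (1.0 - eps_b)

def balanced_update(w_aux, w_tgt, tgt_correct, eps_b):
    """One weight update: misclassified target samples are scaled by
    (1 - eps_b) / eps_b, auxiliary samples by C_m, then everything is
    renormalized; the auxiliary share of the total weight is preserved."""
    w_aux = np.asarray(w_aux, float)
    w_tgt = np.asarray(w_tgt, float)
    c_m = correction_coefficient(eps_b)
    w_tgt = np.where(tgt_correct, w_tgt, w_tgt * (1.0 - eps_b) / eps_b)
    w_aux = w_aux * c_m
    total = w_aux.sum() + w_tgt.sum()
    return w_aux / total, w_tgt / total
```

When eps_b equals the weighted error of the target samples, the auxiliary share of the total weight is identical before and after the update, which is exactly the stability condition used to derive $C_m$.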
(2) Introducing the adaptive covering parameter
However, even when $\varepsilon_b$ is low, the weak classifier's classification quality can still differ across the source-domain training sets, and this difference reflects the correlation between each source-domain training set and the target-domain training set. To capture this similarity, an adaptive covering parameter is introduced into the correction coefficient. The adaptive covering parameter $\lambda_m$ is the sum of the base classifier's classification accuracies on the auxiliary data set and on the target data set, that is:
$$\lambda_m=(1-\varepsilon_a)+(1-\varepsilon_b)$$
where $\varepsilon_a$ is the error rate of the weak classifier on auxiliary data set a. The auxiliary-domain sample weight after the (m+1)-th iteration is then:
$$w_i^{a,m+1}=C_m\,\lambda_m\,w_i^{a,m}$$
(3) Final balanced-weight method
In the last iteration the target data set's final weight is reset to the average of each domain's auxiliary training set weights, so that the finally obtained target data set and the auxiliary training sets of each domain are weighted consistently, improving the detection accuracy of the algorithm.
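A sketch of the final balanced-weight step; the interpretation (each target weight reset to the mean of the per-domain mean auxiliary weights) and all names are my own reading of the description:

```python
import numpy as np

def final_balance(w_target, aux_weight_sets):
    """Reset every target-sample weight to the average of each domain's
    auxiliary training-set weights in the last iteration."""
    domain_means = [np.asarray(w, float).mean() for w in aux_weight_sets]
    avg = float(np.mean(domain_means))
    return np.full(len(w_target), avg)
```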
Step 6: update the weight vector.
Step 7: repeat steps 4, 5 and 6 until the set number of iterations M is reached, obtaining the SVM strong classifier. Finally reset the weights of $D_T$: reset the $D_T$ weights to the average weight of each $D_s$ after iteration, and use $D_s$ together with the reset $D_T$ to jointly train one final classifier.
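The iteration described in the steps above can be sketched as below. This is a hedged skeleton: the patent's weak learner is an SVM, replaced here by a toy weighted-majority stub so the sketch stays dependency-free, the adaptive covering parameter is omitted for brevity, and all names are illustrative:

```python
import numpy as np

def majority_stub(X, y, w):
    """Toy weak learner: predicts the weighted majority class (the patent
    trains an SVM weak classifier at this point)."""
    label = 1 if w[y == 1].sum() >= w[y == 0].sum() else 0
    return lambda X_: np.full(len(X_), label)

def boost(X_s, y_s, X_t, y_t, M=5, fit_weak=majority_stub):
    """Skeleton of the iteration: train a weak classifier on the weighted
    union of auxiliary set D_s and target set D_T, update weights with the
    correction coefficient C_m = 2(1 - eps_b), then apply the final reset."""
    n = len(y_s) + len(y_t)
    w_s = np.full(len(y_s), 1.0 / n)
    w_t = np.full(len(y_t), 1.0 / n)
    models = []
    for m in range(M):
        X = np.vstack([X_s, X_t])
        y = np.concatenate([y_s, y_t])
        w = np.concatenate([w_s, w_t])
        clf = fit_weak(X, y, w / w.sum())      # weak classifier F_m
        models.append(clf)
        pred_t = clf(X_t)
        eps_b = np.sum(w_t * (pred_t != y_t)) / w_t.sum()
        eps_b = min(max(eps_b, 1e-6), 0.49)    # keep the update well-defined
        w_t = np.where(pred_t == y_t, w_t, w_t * (1 - eps_b) / eps_b)
        w_s = w_s * 2.0 * (1.0 - eps_b)        # correction coefficient C_m
    w_t = np.full(len(w_t), w_s.mean())        # final balanced-weight reset
    return models, w_s, w_t
```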
The evaluation criteria of the specific embodiment of the invention are as follows:
They are recall (Recall), precision (Precision), accuracy (Accuracy) and the comprehensive evaluation index (F-Measure). Recall is the ratio of correctly predicted positive samples to all actual positive samples, i.e. the proportion of such targets that are identified; precision is the ratio of correctly predicted positive samples to all samples predicted as positive, i.e. the proportion of real targets among all returned results; accuracy is the proportion of correctly predicted samples among all samples; the comprehensive evaluation index is a combined assessment of recall and precision.
Recall=TP/(TP+FN) (13)
Precision=TP/(TP+FP) (14)
Accuracy=(TP+TN)/(TP+TN+FP+FN) (15)
F-Measure=(2×TP)/(2×TP+FP+FN) (16)
For all four evaluation criteria above, a larger value indicates a better prediction effect of the algorithm.
The meanings of TP, FN, FP and TN are shown in the two-class confusion matrix of Table 1.
Table 1: two-class confusion matrix
 | Predicted positive | Predicted negative |
---|---|---|
Actual positive | TP | FN |
Actual negative | FP | TN |
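The four criteria can be computed directly from the confusion-matrix counts (a small helper; the function name is my own):

```python
def metrics(tp: int, fn: int, fp: int, tn: int):
    """Recall, precision, accuracy and F-Measure from the counts of a
    two-class confusion matrix (TP, FN, FP, TN)."""
    recall = tp / (tp + fn)
    precision = tp / (tp + fp)
    accuracy = (tp + tn) / (tp + fn + fp + tn)
    f_measure = 2 * tp / (2 * tp + fp + fn)
    return recall, precision, accuracy, f_measure
```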
The above embodiments show that, in practical applications where dam crack images are scarce, the training sample distribution is unbalanced and the TrAdaBoost algorithm easily weakens the effect of the auxiliary data sets during training; the method of the invention can train a strong classifier for dam crack images, improve crack detection accuracy, and improve detection performance on small-sample dam crack image data sets.
Claims (4)
1. A multi-source adaptive equalization transfer learning method towards crack image detection, characterized by comprising the following steps:
(1) inputting the multi-source auxiliary image data sets;
(2) performing K-means clustering on the images, then rejecting the pictures that differ greatly from the target data;
(3) discarding the auxiliary data that differ greatly from the target from the crack image library, and training the classifier;
(4) setting the weight update formula as $w_i^{a,m+1}=C_m\,\lambda_m\,w_i^{a,m}$ and the adaptive covering parameter as $\lambda_m=(1-\varepsilon_a)+(1-\varepsilon_b)$ to modify the corresponding weights; wherein a correction coefficient is added to the weight update strategy, the adaptive covering parameter is introduced into the correction coefficient, and finally, with the final balanced-weight method, the target data set's final weight is reset in the last iteration to the average of each domain's auxiliary training set weights;
(5) updating the weight vector and returning to step (3) to obtain the SVM strong classifier; finally resetting the weights of $D_T$: resetting the $D_T$ weights to the average weight of each $D_s$ after iteration, and using $D_s$ together with the reset $D_T$ to jointly train one final classifier.
2. The multi-source adaptive equalization transfer learning method towards crack image detection according to claim 1, characterized in that, in step (2), the specific steps of performing K-means clustering on the images and then rejecting the pictures that differ greatly from the target data are as follows:
(2.1) first convert each image $X_i$ (i=1,2,...,n) in the crack image library to grayscale and store them successively in a one-dimensional matrix $D_X$;
(2.2) then store the images block by block with a block length of 10 pixels and a moving step of 3 pixels, recording the starting position of every small block, to obtain n pixel-block data sets; arbitrarily select the gray means of 30 image blocks as the initial cluster centers;
(2.3) according to the gray mean of each small image-matrix block, compute the distance of these objects to the 30 image-sample cluster centers using the Euclidean distance shown below; then repartition the gray means of the image blocks by minimum distance, assigning each image-matrix block to the most similar class;
$$dis(x_i,y_j)=\sqrt{\sum_{k=1}^{d}(x_{ik}-y_{jk})^{2}}$$
where $dis(x_i,y_j)$ is the distance between the two data objects $x_i$ and $y_j$; the smaller the value of $dis(x_i,y_j)$, the more similar $x_i$ and $y_j$ are; the larger the value of $dis(x_i,y_j)$, the greater the gap between $x_i$ and $y_j$;
(2.4) recompute the centroid of the changed pixel gray means of each image block;
(2.5) repeat the above steps (2.3) and (2.4) until the cluster center of each data class no longer changes;
after the input image matrices are stored in the form of pixel blocks, the pixel-matrix blocks are clustered with the K-means clustering algorithm, the Euclidean distances from the pixel-block centers in each cluster set to the cluster center are sorted, and the pictures farthest from their cluster are deleted.
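Steps (2.1)-(2.5) can be sketched as follows: a minimal NumPy-only version, with a deterministic initialisation in place of the patent's arbitrary choice of 30 initial centres, and illustrative names throughout:

```python
import numpy as np

def patch_gray_means(gray, size=10, step=3):
    """Gray-level mean of each size x size patch taken with the given
    stride, as in steps (2.1)-(2.2)."""
    h, w = gray.shape
    return np.array([gray[r:r + size, c:c + size].mean()
                     for r in range(0, h - size + 1, step)
                     for c in range(0, w - size + 1, step)])

def kmeans_distances(x, k, iters=100):
    """K-means on the 1-D patch means (steps (2.3)-(2.5)); returns each
    sample's Euclidean distance to its nearest final centre, which can then
    be sorted to delete the farthest patches/pictures."""
    centres = x[:k].astype(float)               # deterministic init for the sketch
    for _ in range(iters):
        d = np.abs(x[:, None] - centres[None, :])
        labels = d.argmin(axis=1)
        new = np.array([x[labels == j].mean() if np.any(labels == j) else centres[j]
                        for j in range(k)])
        if np.allclose(new, centres):
            break
        centres = new
    return np.abs(x[:, None] - centres[None, :]).min(axis=1)
```

Smaller distance means more similar, so deletion candidates are the samples with the largest distances, e.g. `np.argsort(dist)[-m:]`.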
3. The multi-source adaptive equalization transfer learning method towards crack image detection according to claim 1, characterized in that the specific steps of training the classifier in step (3) are as follows:
(3.1) let the target-domain labelled training set be $D_T=\{(x_t,y_t)\}$ and let $D_S$ be the set of N auxiliary data sets, i.e. $D_S=\{D_1,D_2,...,D_N\}=\{(x_1,y_1),...,(x_k,y_k),...,(x_N,y_N)\}$; initialize the weight vector $w=(w_S,w_T)$ and normalize the samples of the merged training set; combine each multi-source auxiliary data set $D_i$ (i=1,2,...,N) with the target-domain labelled data set $D_T$ respectively as $<D_i,D_T>$ to obtain the combined data sets $D_{i,T}$;
(3.2) start training the network: on each combination $D_{i,T}$, uniformly perform image preprocessing, image segmentation and feature extraction, save all features, and train the SVM classifier to obtain the weak classifiers $F_m$;
(3.3) compute the error of the weak classifier $F_m$ on $D_S$ and on $D_T$ respectively:
$$\varepsilon_S=\frac{\sum_{i=1}^{n_S} w_i\,|F_m(x_i)-y_i|}{\sum_{i=1}^{n_S} w_i},\qquad \varepsilon_T=\frac{\sum_{i=1}^{n_T} w_i\,|F_m(x_i)-y_i|}{\sum_{i=1}^{n_T} w_i}$$
where $n_S$ is the number of auxiliary data set samples and $n_T$ is the number of target data set samples; $y_i$ is the true class and $F_m(x_i)$ is the class output by the classifier $F_m$.
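The weighted error rate of step (3.3) can be computed as below (a small helper with illustrative names; the weight normalisation follows the TrAdaBoost-style error used throughout):

```python
import numpy as np

def weighted_error(w, y_true, y_pred):
    """Weight-normalised error rate of a weak classifier F_m on one sample
    set: sum_i w_i * |F_m(x_i) - y_i| / sum_i w_i."""
    w = np.asarray(w, float)
    mis = (np.asarray(y_true) != np.asarray(y_pred)).astype(float)
    return float(np.sum(w * mis) / np.sum(w))
```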
4. The multi-source adaptive equalization transfer learning method towards crack image detection according to claim 1, characterized in that, in step (4), the specific steps of resetting the target data set's final weight in the last iteration to the average of each domain's auxiliary training set weights are as follows:
(4.1) add a correction coefficient to update the auxiliary data set weights:
a correction coefficient is added on the basis of TrAdaBoost to update the weights of the auxiliary data set samples; as the number of iterations M increases, every domain's auxiliary training set is eventually classified correctly, and after M iterations the sum of the auxiliary sample weights of each domain is:
$$W_a=\sum_{i=1}^{n_a} w_i^{a}$$
where a is an auxiliary training set, $n_a$ is the number of samples in auxiliary training set a, and $w_i^{a}$ is the weight of each training sample in source domain a;
the weights of correctly predicted samples in the target data set b remain unchanged, so the weight sum of the correct samples $W_b^{r}$ is:
$$W_b^{r}=(1-\varepsilon_b)\sum_{i=1}^{n_b} w_i^{b}$$
where $n_b$ is the number of samples in target data set b, $w_i^{b}$ is the weight of each training sample in b, and $\varepsilon_b$ is the error rate of the weak classifier on b;
mispredicted samples in the target data set b have their weights updated by the factor $(1-\varepsilon_b)/\varepsilon_b$:
$$w_i^{b}\leftarrow w_i^{b}\cdot\frac{1-\varepsilon_b}{\varepsilon_b}$$
so the weight sum of the mispredicted samples in b, $W_b^{w}$, is:
$$W_b^{w}=(1-\varepsilon_b)\sum_{i=1}^{n_b} w_i^{b}$$
and the sum of all target-domain sample weights, i.e. the weight sums of the correct and mispredicted samples together, is:
$$W_b=W_b^{r}+W_b^{w}=2(1-\varepsilon_b)\sum_{i=1}^{n_b} w_i^{b}$$
when the number of iterations is sufficiently large, every domain's auxiliary training set is classified correctly, so after an iteration $W_a^{m+1}=W_a^{m}$, while the target weight sum grows by the factor $2(1-\varepsilon_b)$;
suppose the auxiliary data set samples are given a correction coefficient $C_m$, so that the weight becomes:
$$\hat{w}_i^{a}=C_m\,w_i^{a}$$
since the auxiliary data set sample weights are required to remain stable, i.e. $w_i^{a,m+1}=w_i^{a,m}$, the correction coefficient is obtained as:
$$C_m=2(1-\varepsilon_b)$$
from the correction coefficient formula it can be seen that $C_m$ is inversely related to the error rate $\varepsilon_b$ of the weak classifier on the target data set b: the larger $\varepsilon_b$, the smaller $C_m$, the greater the relative weight of the auxiliary data set samples, and the greater their influence on the weak classifier trained in the next iteration; the smaller $\varepsilon_b$, the larger $C_m$, the smaller the relative weight of the auxiliary data set samples, and the smaller their influence on the next iteration's weak classifier; adding the correction coefficient $C_m$ on the basis of the TrAdaBoost algorithm therefore keeps both the target data set and the auxiliary data set sample weights convergent;
(4.2) introduce the adaptive covering parameter:
an adaptive covering parameter is introduced into the correction coefficient; the adaptive covering parameter $\lambda_m$ is the sum of the base classifier's classification accuracies on the auxiliary data set and on the target data set, that is:
$$\lambda_m=(1-\varepsilon_a)+(1-\varepsilon_b)$$
and the auxiliary-domain sample weight after the (m+1)-th iteration is:
$$w_i^{a,m+1}=C_m\,\lambda_m\,w_i^{a,m}$$
(4.3) final balanced-weight method:
the target data set's final weight is reset in the last iteration to the average of each domain's auxiliary training set weights, so that the finally obtained target data set and the auxiliary training sets of each domain are weighted consistently.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910496225.3A CN110378872A (en) | 2019-06-10 | 2019-06-10 | A kind of multi-source adaptive equalization transfer learning method towards crack image detection |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910496225.3A CN110378872A (en) | 2019-06-10 | 2019-06-10 | A kind of multi-source adaptive equalization transfer learning method towards crack image detection |
Publications (1)
Publication Number | Publication Date |
---|---|
CN110378872A true CN110378872A (en) | 2019-10-25 |
Family
ID=68249935
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910496225.3A Pending CN110378872A (en) | 2019-06-10 | 2019-06-10 | A kind of multi-source adaptive equalization transfer learning method towards crack image detection |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110378872A (en) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111291818A (en) * | 2020-02-18 | 2020-06-16 | 浙江工业大学 | Non-uniform class sample equalization method for cloud mask |
CN112668583A (en) * | 2021-01-07 | 2021-04-16 | 浙江星汉信息技术股份有限公司 | Image recognition method and device and electronic equipment |
CN113011513A (en) * | 2021-03-29 | 2021-06-22 | 华南理工大学 | Image big data classification method based on general domain self-adaption |
CN113516334A (en) * | 2021-03-12 | 2021-10-19 | 中电建电力检修工程有限公司 | Dam joint and crack inspection method and system |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103761311A (en) * | 2014-01-23 | 2014-04-30 | 中国矿业大学 | Sentiment classification method based on multi-source field instance migration |
CN107341512A (en) * | 2017-07-06 | 2017-11-10 | 广东工业大学 | A kind of method and device of transfer learning classification |
CN109254219A (en) * | 2018-11-22 | 2019-01-22 | 国网湖北省电力有限公司电力科学研究院 | A kind of distribution transforming transfer learning method for diagnosing faults considering multiple factors Situation Evolution |
2019-06-10: application CN201910496225.3A (CN) filed; published as CN110378872A; status: Pending
Patent Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103761311A (en) * | 2014-01-23 | 2014-04-30 | 中国矿业大学 | Sentiment classification method based on multi-source field instance migration |
CN107341512A (en) * | 2017-07-06 | 2017-11-10 | 广东工业大学 | A kind of method and device of transfer learning classification |
CN109254219A (en) * | 2018-11-22 | 2019-01-22 | 国网湖北省电力有限公司电力科学研究院 | A kind of distribution transforming transfer learning method for diagnosing faults considering multiple factors Situation Evolution |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110378872A (en) | A kind of multi-source adaptive equalization transfer learning method towards crack image detection | |
CN105354595B (en) | A kind of robust visual pattern classification method and system | |
CN111191732A (en) | Target detection method based on full-automatic learning | |
CN109447998B (en) | Automatic segmentation method based on PCANet deep learning model | |
CN108428229A (en) | It is a kind of that apparent and geometric properties lung's Texture Recognitions are extracted based on deep neural network | |
CN104484681B (en) | Hyperspectral Remote Sensing Imagery Classification method based on spatial information and integrated study | |
CN103699904B (en) | The image computer auxiliary judgment method of multisequencing nuclear magnetic resonance image | |
CN104182538B (en) | Image search method based on semi-supervised Hash | |
CN109376796A (en) | Image classification method based on active semi-supervised learning | |
CN107657008A (en) | Across media training and search method based on depth discrimination sequence study | |
CN105808665B (en) | A kind of new image search method based on cartographical sketching | |
CN112365471B (en) | Cervical cancer cell intelligent detection method based on deep learning | |
CN110751027B (en) | Pedestrian re-identification method based on deep multi-instance learning | |
CN110210625A (en) | Modeling method, device, computer equipment and storage medium based on transfer learning | |
CN109726746A (en) | A kind of method and device of template matching | |
CN109448854A (en) | A kind of construction method of pulmonary tuberculosis detection model and application | |
CN112597324A (en) | Image hash index construction method, system and equipment based on correlation filtering | |
CN108897750A (en) | Merge the personalized location recommendation method and equipment of polynary contextual information | |
CN109858972A (en) | The prediction technique and device of ad click rate | |
CN113032613B (en) | Three-dimensional model retrieval method based on interactive attention convolution neural network | |
CN112200262B (en) | Small sample classification training method and device supporting multitasking and cross-tasking | |
CN108038467B (en) | A kind of sparse face identification method of mirror image in conjunction with thickness level | |
CN110378384B (en) | Image classification method combining privilege information and ordering support vector machine | |
CN107341189A (en) | A kind of indirect labor carries out the method and system of examination, classification and storage to image | |
CN116612307A (en) | Solanaceae disease grade identification method based on transfer learning |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20191025 ||