CN110378872A - Multi-source adaptive balanced transfer learning method for crack image detection - Google Patents

Multi-source adaptive balanced transfer learning method for crack image detection

Info

Publication number
CN110378872A
CN110378872A
Authority
CN
China
Prior art keywords: weight, data set, sample, training, image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910496225.3A
Other languages
Chinese (zh)
Inventor
毛莺池
唐江红
王静
刘凡
平萍
黄倩
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hohai University HHU
Original Assignee
Hohai University HHU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hohai University HHU filed Critical Hohai University HHU
Priority to CN201910496225.3A priority Critical patent/CN110378872A/en
Publication of CN110378872A publication Critical patent/CN110378872A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G06F18/232 Non-hierarchical techniques
    • G06F18/2321 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G06F18/23213 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/0002 Inspection of images, e.g. flaw detection
    • G06T7/0004 Industrial image inspection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20081 Training; Learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/30 Subject of image; Context of image processing
    • G06T2207/30108 Industrial image inspection
    • G06T2207/30132 Masonry; Concrete

Abstract

The invention discloses a multi-source adaptive balanced transfer learning method for crack image detection, comprising the following steps: 1) a correction coefficient is added on the basis of the TrAdaBoost algorithm to solve the problem that the auxiliary-data weights converge too fast; 2) an adaptive covering parameter is introduced into the correction coefficient to reflect whether a similarity relationship exists between the auxiliary data sets and the target data set; 3) a final balance-weight method keeps the importance of the finally obtained target data set consistent with that of each domain's crack data set, improving the accuracy and efficiency of dam-crack image detection. The invention improves crack-detection accuracy and boosts the detection performance of dam-crack images on small-sample data sets.

Description

Multi-source adaptive balanced transfer learning method for crack image detection
Technical field
The invention belongs to the field of distributed database clustering, and in particular relates to a multi-source adaptive balanced transfer learning method for crack image detection.
Background art
China has more reservoir dams than any other country in the world; by the end of 2016 it had built more than 98,000 reservoir dams of all types. As dams age, the combined influence of the natural environment and human factors produces a series of visible defects on the surface and inside of dams, such as deformation, cracks, leakage, and calcified precipitates, which raises the probability of failure and threatens people's lives and property. Cracks are one of the main hazards of a dam.
Traditional machine learning methods require a large number of training samples and assume that the training data and the test data follow the same distribution. In practical applications, however, test data and training data do not necessarily satisfy this assumption. Transfer learning relaxes these two basic assumptions of conventional machine learning. It mainly targets domain-specific data sets with small, limited sample sizes, for which direct machine learning easily overfits and fails to train and learn: by exploiting well-trained models and samples from fields with a certain similarity, it constructs a model that meets the task requirements, thereby achieving a good model on a small data set.
The TrAdaBoost algorithm is an instance-selection transfer learning method for the case where the training set and the test set follow different distributions; it obtains good results when the auxiliary data share many similarities with the target data. By selecting a usable part of the auxiliary training set and combining it with the target training set, TrAdaBoost trains a more accurate model than the target training set can on its own.
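For orientation, one round of the classic TrAdaBoost update that the invention builds on (Dai et al., 2007) can be sketched as follows; the helper is illustrative and assumes binary labels in {0, 1}:

```python
import numpy as np

def tradaboost_weight_update(w_src, w_tgt, pred_src, pred_tgt,
                             y_src, y_tgt, n_iters):
    """One round of the classic TrAdaBoost weight update.

    w_src/w_tgt: current sample weights; pred_*/y_*: predictions and
    true labels in {0, 1}. Returns updated (normalized) weights.
    """
    # Weighted error of the weak classifier on the target set only.
    eps_t = np.sum(w_tgt * np.abs(pred_tgt - y_tgt)) / np.sum(w_tgt)
    eps_t = float(np.clip(eps_t, 1e-10, 0.499))     # keep beta_t well defined

    beta = 1.0 / (1.0 + np.sqrt(2.0 * np.log(len(w_src)) / n_iters))
    beta_t = eps_t / (1.0 - eps_t)

    # Misclassified auxiliary (source) samples are down-weighted ...
    w_src = w_src * beta ** np.abs(pred_src - y_src)
    # ... while misclassified target samples are up-weighted.
    w_tgt = w_tgt * beta_t ** (-np.abs(pred_tgt - y_tgt))

    total = np.sum(w_src) + np.sum(w_tgt)
    return w_src / total, w_tgt / total
```

Misclassified auxiliary samples are steadily down-weighted by β < 1, which is exactly the too-fast convergence of auxiliary weights that the correction coefficient introduced below is designed to counter.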
Summary of the invention
Purpose of the invention: in order to overcome the problems in the prior art that dam-crack images are scarce, that training samples are unevenly distributed, and that the TrAdaBoost algorithm easily weakens the contribution of the auxiliary data sets during training, the present invention provides a multi-source adaptive balanced transfer learning method for crack image detection that trains a strong classifier for dam-crack images, improves crack-detection accuracy, and boosts the detection performance of dam-crack images on small-sample data sets.
Technical solution: to achieve the above object, the present invention provides a multi-source adaptive balanced transfer learning method for crack image detection, comprising the following steps:
(1) Input the multi-source auxiliary image data sets;
(2) Perform K-means clustering on the images, then remove the images that differ greatly from the target data;
(3) Discard the auxiliary data that differ greatly from the target from the crack image library, and train the classifier;
(4) Set the weight update formula to w_i^{m+1} = C_m ρ_m w_i^m β^{|F_m(x_i) − y_i|}, with the adaptive covering parameter ρ_m = (1 − ε_S^m) + (1 − ε_T^m), to modify the corresponding weights; a correction coefficient is added to the weight-update strategy and the adaptive covering parameter is introduced into the correction coefficient, and finally the final balance-weight method resets the final weight of the target data set to the average of each domain's auxiliary-training-set weights in the last iteration;
(5) Update the weight vector and return to step (3) until the strong SVM classifier is obtained; finally reset the weights of D_T: set the D_T weights to the average weight of each D_s after iteration, and use D_s together with the reset D_T to jointly train one final classifier.
Further, the specific steps of performing K-means clustering on the images in step (2) and then removing the images that differ greatly from the target data are as follows:
(2.1) First convert each image X_i (i = 1, 2, ..., n) in the crack image library to gray scale and store it sequentially in a one-dimensional matrix D_X;
(2.2) Then store the images block by block with a block length of 10 pixels and a moving step of 3 pixels, recording the starting position of each small block; this yields n pixel-block data sets, from which the gray-level means of 30 image blocks are arbitrarily selected as initial cluster centres;
(2.3) According to the gray-level mean of each image-matrix block, compute the distance between these objects and the 30 image-sample cluster centres using the Euclidean distance shown below; then partition the blocks again by minimum distance to the block gray-level means, assigning each image-matrix block to the most similar class;

dis(x_i, y_j) = √( Σ_{k=1}^{d} (x_{ik} − y_{jk})² )

where dis(x_i, y_j) is the distance between the two data objects x_i and y_j: the smaller dis(x_i, y_j), the more similar x_i and y_j; the larger it is, the greater the gap between them;
(2.4) Recalculate the centroid of each changed image block's pixel gray-level mean;
(2.5) Repeat steps (2.3) and (2.4) until the cluster centre of each data class no longer changes;
After the input image matrices are stored in the form of pixel blocks, the pixel-matrix blocks are clustered with the K-means algorithm, the Euclidean distances from the pixel blocks in each cluster to the cluster centre are sorted, and the images farthest from their cluster centres are deleted.
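As an illustration of steps (2.1) to (2.5), a minimal sketch in Python with scikit-learn's KMeans; the 10-pixel block, 3-pixel stride, and 30 centres follow the text, while summarizing each image's block means into one feature vector and the keep_ratio cutoff are simplifying assumptions:

```python
import numpy as np
from sklearn.cluster import KMeans

def filter_auxiliary_images(gray_images, n_clusters=30, block=10,
                            stride=3, keep_ratio=0.9):
    """Cluster block-wise gray means with K-means and drop the images
    whose summaries lie farthest from their cluster centres.

    gray_images: list of 2-D uint8 arrays (already gray-scaled).
    keep_ratio is an assumed knob: the fraction of images retained.
    """
    feats = []
    for img in gray_images:
        # Slide a block x block window with the given stride and record
        # the mean gray level of every block.
        means = [img[r:r + block, c:c + block].mean()
                 for r in range(0, img.shape[0] - block + 1, stride)
                 for c in range(0, img.shape[1] - block + 1, stride)]
        feats.append([np.mean(means), np.std(means)])  # per-image summary
    feats = np.asarray(feats)

    km = KMeans(n_clusters=min(n_clusters, len(feats)), n_init=10).fit(feats)
    # Euclidean distance of each image's summary to its cluster centre.
    dist = np.linalg.norm(feats - km.cluster_centers_[km.labels_], axis=1)
    keep = np.argsort(dist)[: int(keep_ratio * len(feats))]
    return [gray_images[i] for i in sorted(keep)]
```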
Further, the specific steps of training the classifier in step (3) are as follows:
(3.1) Let D_T = {(x_t, y_t)} be the labelled target-domain training set and D_S the collection of N auxiliary data sets, i.e. D_S = {D_1, D_2, ..., D_N} = {(x_1, y_1), ..., (x_k, y_k), ..., (x_N, y_N)}; initialize the weight vector w = (w_S, w_T) and normalize the sample weights of the merged training set; combine each multi-source auxiliary data set D_i (i = 1, 2, ..., N) with the labelled target-domain data set D_T as <D_i, D_T>, obtaining the combined data set D_{i,T};
(3.2) Start training the network: on each combination D_{i,T}, uniformly perform image preprocessing, image segmentation, and feature extraction, save all the features, and train an SVM classifier, obtaining the weak classifier F_m;
(3.3) Compute the errors of the weak classifier F_m on D_S and D_T separately:

ε_S^m = Σ_{i=1}^{n_S} w_i |F_m(x_i) − y_i| / Σ_{i=1}^{n_S} w_i,
ε_T^m = Σ_{i=1}^{n_T} w_i |F_m(x_i) − y_i| / Σ_{i=1}^{n_T} w_i

where n_S is the number of samples in the auxiliary data sets and n_T is the number of samples in the target data set; y_i is the true class and F_m(x_i) is the class output by the classifier F_m.
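A sketch of this weighted error, assuming 0/1 labels so that |F_m(x_i) − y_i| reduces to a misclassification indicator:

```python
import numpy as np

def weighted_error(weights, preds, labels):
    """Weighted error of a weak classifier on a (sub)set: the weight mass
    of misclassified samples divided by the total weight."""
    w = np.asarray(weights, dtype=float)
    wrong = (np.asarray(preds) != np.asarray(labels)).astype(float)
    return float(np.sum(w * wrong) / np.sum(w))

# Usage on the merged weight vector w = (w_S, w_T):
#   eps_S = weighted_error(w[:n_S], F_m.predict(X_S), y_S)
#   eps_T = weighted_error(w[n_S:], F_m.predict(X_T), y_T)
```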
Further, the specific steps of resetting the final weight of the target data set in step (4) to the average of each domain's auxiliary-training-set weights in the last iteration are as follows:
(4.1) Add a correction coefficient to update the weights of the auxiliary data sets:
A correction coefficient is added on the basis of TrAdaBoost to update the weights of the auxiliary-data-set samples. As the number of iterations M grows, every domain's auxiliary training set comes to be classified correctly, so after M iterations the sum of each domain's auxiliary sample weights is

S_a = Σ_{i=1}^{n_a} w_i^a

where a denotes an auxiliary training set, n_a is the number of samples in auxiliary training set a, and w_i^a is the weight of each training sample in source domain a;
The weights of correctly predicted samples in the target data set b remain unchanged, so the weight sum of the correct samples, S_corr^b, is

S_corr^b = (1 − ε_b^m) Σ_{i=1}^{n_b} w_i^b

where n_b is the number of samples in the target data set b, w_i^b is the weight of each training sample in b, and ε_b^m is the error rate of the weak classifier on b;
Mispredicted samples in the target data set b must be updated to modify the corresponding weights:

w_i^b ← w_i^b (1 − ε_b^m) / ε_b^m

so the weight sum of the erroneous samples in b, S_err^b, is

S_err^b = ε_b^m Σ_{i=1}^{n_b} w_i^b · (1 − ε_b^m) / ε_b^m = (1 − ε_b^m) Σ_{i=1}^{n_b} w_i^b;
The sum of all target-domain sample weights, that is, the correct plus the erroneous weight sums, is therefore

S^b = S_corr^b + S_err^b = 2 (1 − ε_b^m) Σ_{i=1}^{n_b} w_i^b;
When the number of iterations is sufficiently large, every domain's auxiliary training set is classified correctly, so its weight sum stays fixed while the target weight sum is scaled by 2(1 − ε_b^m) per round; after normalization the auxiliary weights therefore shrink by this factor at every iteration. If the auxiliary-data-set samples are given a correction coefficient C_m, the weight update becomes

w_i^{m+1} = C_m w_i^m β^{|F_m(x_i) − y_i|}

where β = 1 / (1 + √(2 ln n_S / M)) is the TrAdaBoost down-weighting factor. Requiring the auxiliary-data-set sample weights to stay stable, i.e. w_i^{m+1} = w_i^m for correctly classified auxiliary samples after normalization, yields the correction coefficient

C_m = 2 (1 − ε_b^m).

The formula for the correction coefficient shows that C_m is inversely related to the error rate ε_b^m of the weak classifier on the target data set b: the larger ε_b^m, the smaller C_m and the larger the auxiliary-data-set sample weights remain relative to the target, so their influence on the next iteration's weak classifier grows; the smaller ε_b^m, the larger C_m and the smaller the relative auxiliary weights, so their influence on the next round's training shrinks. Adding the correction coefficient C_m on the basis of the TrAdaBoost algorithm therefore keeps the sample weights of the target data set and of the auxiliary data sets convergent together;
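Under this reconstruction the correction coefficient coincides with the dynamic update factor of Al-Stouhi and Reddy's dynamic TrAdaBoost, C_m = 2(1 − ε_b^m); as a sketch:

```python
def correction_coefficient(eps_target):
    """C_m = 2 * (1 - eps_b^m): near 2 when the weak classifier is accurate
    on the target set, near 0 as the target error grows, offsetting the
    2 * (1 - eps_b^m) growth of the target weight mass each round."""
    return 2.0 * (1.0 - eps_target)

# Example: eps_b = 0.2 gives C_m = 1.6; eps_b = 0.45 gives C_m = 1.1.
```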
(4.2) Introduce the adaptive covering parameter:
An adaptive covering parameter is introduced into the correction coefficient; it is the sum of the base classifier's classification accuracies on the auxiliary data set and the target data set, namely

ρ_m = (1 − ε_S^m) + (1 − ε_T^m).

The weight of an auxiliary-domain data sample after the (m+1)-th iteration is then

w_i^{m+1} = C_m ρ_m w_i^m β^{|F_m(x_i) − y_i|};
(4.3) The final balance-weight method:
The final weight of the target data set is reset to the average of each domain's auxiliary-training-set weights in the last iteration, so that the finally obtained target data set and each domain's auxiliary training set carry consistent importance.
Beneficial effects: compared with the prior art, the present invention has the following advantages:
(1) K-means clustering is applied before the TrAdaBoost algorithm, and images far from their cluster centres are deleted from the crack image library, which benefits the training of the subsequent classifier and improves training efficiency.
(2) Compared with TrAdaBoost, introducing the correction coefficient solves the problem that, as the number of iterations increases, the source-domain weights decline too fast and the gap between the target and source-domain weights grows too large.
(3) Introducing the adaptive covering parameter into the correction coefficient reflects whether a similarity relationship exists between the source-domain training data sets and the target-domain training data set, improving the method's detection performance.
(4) The final balance-weight method makes the importance of the finally obtained target data set consistent with that of each domain's crack data set, raising the performance of the dam-crack image classifier on small-sample data sets.
Detailed description of the invention
Fig. 1 is the flow chart of the invention.
Specific embodiment
The present invention is further elucidated below in combination with specific embodiments. It should be understood that these embodiments are merely illustrative of the present invention and do not limit its scope; after reading the present invention, modifications by those skilled in the art to its various equivalent forms fall within the scope defined by the appended claims of this application.
The multi-source adaptive balanced TrAdaBoost transfer learning method for crack image detection of the present invention, given as Algorithm 1, comprises two parts: K-means image clustering and multi-source adaptive balanced TrAdaBoost transfer learning.
Algorithm 1: K-means-based multi-source adaptive balanced TrAdaBoost transfer learning method
1) K-means image clustering:
In the K-means clustering method, K denotes the number of cluster centroids and 'means' denotes the mean of the data within a cluster. Its core idea is: randomly select the initial centres of k clusters, each centre representing one cluster; then assign every remaining data object to the nearest cluster according to its distance from each cluster centre.
With the K-means image clustering method, clusters are sorted using the Euclidean distance as the similarity measure. Images far from their cluster centres are deleted from the crack image library, which benefits the training of the subsequent classifier and improves training efficiency. The specific steps of the K-means image clustering method are as follows:
Step 1: First convert each image X_i (i = 1, 2, ..., n) in the crack image library to gray scale and store it sequentially in a one-dimensional matrix D_X;
Step 2: Then store the images block by block with a block length of 10 pixels and a moving step of 3 pixels, recording the starting position of each small block; this yields n pixel-block data sets, from which the gray-level means of 30 image blocks are arbitrarily selected as initial cluster centres;
Step 3: According to the gray-level mean of each image-matrix block, compute the distance between these objects and the 30 image-sample cluster centres using the Euclidean distance shown below; then partition the blocks again by minimum distance to the block gray-level means, assigning each image-matrix block to the most similar class;

dis(x_i, y_j) = √( Σ_{k=1}^{d} (x_{ik} − y_{jk})² )

where dis(x_i, y_j) is the distance between the two data objects x_i and y_j: the smaller dis(x_i, y_j), the more similar x_i and y_j; the larger it is, the greater the gap between them.
Step 4: Recalculate the centroid of each changed image block's pixel gray-level mean;
Step 5: Repeat steps 3 and 4 until the cluster centre of each data class no longer changes.
After the input image matrices are stored in the form of pixel blocks, the pixel-matrix blocks are clustered with the K-means clustering algorithm, the Euclidean distances from the pixel blocks in each cluster to the cluster centre are sorted, and the images farthest from their cluster centres are deleted. Discarding the auxiliary data that differ greatly from the target from the crack image library benefits the training of the subsequent classifier and improves training efficiency.
2) Multi-source adaptive balanced TrAdaBoost transfer learning
The multi-source adaptive balanced TrAdaBoost (Multi-source Adaptive Balance TrAdaBoost, MABtrA) transfer learning method adds a correction coefficient on the basis of the TrAdaBoost algorithm to solve the problem that the auxiliary-data weights converge too fast; an adaptive covering parameter is introduced into the correction coefficient to reflect whether a similarity relationship exists between the auxiliary data sets and the target data set; and after iteration the final balance-weight method makes the importance of the finally obtained target data set consistent with that of each domain's crack data set, improving the accuracy and efficiency of dam-crack image detection. Adding the correction coefficient to update the auxiliary-sample weights, introducing the adaptive covering parameter, and the final balance-weight method are detailed as follows:
(1) Add a correction coefficient to update the weights of the auxiliary data sets
Because differences exist between each domain's auxiliary data set and the target data set, the weak classifier obtained by training has a high error rate on the target data set; as a result, the weight of each domain's auxiliary training set keeps shrinking as the number of iterations grows, until the trained weights become so small that the auxiliary data sets are effectively uncorrelated with the task and can no longer assist learning on the target data set. Meanwhile, the weight of the target data set keeps growing as the iterations proceed, which easily produces hard-to-classify samples.
In order to make better use of each domain's auxiliary data set and the target data set in training, a correction coefficient is added on the basis of TrAdaBoost to update the weights of the auxiliary-data-set samples. As the number of iterations m grows, every domain's auxiliary training set comes to be classified correctly, so after m iterations the sum of each domain's auxiliary sample weights is

S_a = Σ_{i=1}^{n_a} w_i^a

where a denotes an auxiliary training set, n_a is the number of samples in auxiliary training set a, and w_i^a is the weight of each training sample in source domain a.
The weights of correctly predicted samples in the target data set b remain unchanged, so the weight sum of the correct samples, S_corr^b, is

S_corr^b = (1 − ε_b^m) Σ_{i=1}^{n_b} w_i^b

where n_b is the number of samples in the target data set b, w_i^b is the weight of each training sample in b, and ε_b^m is the error rate of the weak classifier on b.
Mispredicted samples in the target data set b must be updated to modify the corresponding weights:

w_i^b ← w_i^b (1 − ε_b^m) / ε_b^m

so the weight sum of the erroneous samples in b, S_err^b, is

S_err^b = ε_b^m Σ_{i=1}^{n_b} w_i^b · (1 − ε_b^m) / ε_b^m = (1 − ε_b^m) Σ_{i=1}^{n_b} w_i^b.

The sum of all target-domain sample weights, that is, the correct plus the erroneous weight sums, is therefore

S^b = S_corr^b + S_err^b = 2 (1 − ε_b^m) Σ_{i=1}^{n_b} w_i^b.

When the number of iterations is sufficiently large, every domain's auxiliary training set is classified correctly, so its weight sum stays fixed while the target weight sum is scaled by 2(1 − ε_b^m) per round; after normalization the auxiliary weights therefore shrink by this factor at every iteration. If the auxiliary-data-set samples are given a correction coefficient C_m, the weight update becomes

w_i^{m+1} = C_m w_i^m β^{|F_m(x_i) − y_i|}

where β = 1 / (1 + √(2 ln n_S / M)) is the TrAdaBoost down-weighting factor. Requiring the auxiliary-data-set sample weights to stay stable, i.e. w_i^{m+1} = w_i^m for correctly classified auxiliary samples after normalization, yields the correction coefficient

C_m = 2 (1 − ε_b^m).

The formula shows that the correction coefficient C_m is inversely related to the error rate ε_b^m of the weak classifier on the target data set b: the larger ε_b^m, the smaller C_m and the larger the auxiliary-data-set sample weights remain relative to the target, so their influence on the next iteration's weak classifier grows; the smaller ε_b^m, the larger C_m and the smaller the relative auxiliary weights, so their influence on the next round's training shrinks. Adding the correction coefficient C_m on the basis of the TrAdaBoost algorithm therefore keeps the sample weights of the target data set and of the auxiliary data sets convergent together.
(2) Introduce the adaptive covering parameter
However, even when ε_b is low, the weak classifier's performance can still differ across the source-domain training sets, and these differences reflect the correlation between each source-domain training set and the target-domain training set. To reflect this similarity relationship, an adaptive covering parameter is introduced into the correction coefficient; it is the sum of the base classifier's classification accuracies on the auxiliary data set and the target data set, namely

ρ_m = (1 − ε_S^m) + (1 − ε_T^m).

The weight of an auxiliary-domain data sample after the (m+1)-th iteration is then

w_i^{m+1} = C_m ρ_m w_i^m β^{|F_m(x_i) − y_i|}.
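A sketch of the corrected auxiliary-weight update; that ρ_m and C_m combine multiplicatively is an assumed reading rather than a confirmed detail of the patent:

```python
import numpy as np

def update_aux_weights(w_aux, preds, labels, eps_aux, eps_tgt, beta):
    """Auxiliary-sample weights after iteration m, assuming the covering
    parameter rho_m scales the corrected TrAdaBoost rule multiplicatively."""
    rho_m = (1.0 - eps_aux) + (1.0 - eps_tgt)    # adaptive covering parameter
    c_m = 2.0 * (1.0 - eps_tgt)                  # correction coefficient C_m
    miss = (np.asarray(preds) != np.asarray(labels)).astype(float)
    return c_m * rho_m * np.asarray(w_aux, dtype=float) * beta ** miss
```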
(3) The final balance-weight method
The basic idea of the final balance-weight method is: during the iterations the auxiliary-data weights keep declining while the target-data weights keep growing, so that after iteration the gap between the auxiliary-data weights and the target-data weights is large; when the final classifier is formed, however, the target data set and every domain's auxiliary training set should be treated fairly. Resetting the final weight of the target data set to the average of each domain's auxiliary-training-set weights in the last iteration gives the finally obtained target data set and each domain's auxiliary training set consistent importance and improves the detection accuracy of the algorithm.
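One plausible reading of the reset, sketched below: average each domain's auxiliary weights from the last iteration, then assign the mean of those averages uniformly to the target samples:

```python
import numpy as np

def final_balance_reset(aux_weights_per_domain, n_target):
    """Reset every target-sample weight to the mean of the per-domain
    average auxiliary weights from the last iteration."""
    domain_means = [np.mean(w) for w in aux_weights_per_domain]
    return np.full(n_target, float(np.mean(domain_means)))
```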
Fig. 1 is the model-training flow chart of the embodiment of the present invention; the workflow is as follows:
1. Input the multi-source auxiliary image data sets.
2. After K-means clustering of the images, delete the pictures that differ greatly from the target source. The specific steps of the K-means image clustering method are as follows:
Step 1: First convert each image X_i (i = 1, 2, ..., n) in the crack image library to gray scale and store it sequentially in a one-dimensional matrix D_X;
Step 2: Then store the images block by block with a block length of 10 pixels and a moving step of 3 pixels, recording the starting position of each small block; this yields n pixel-block data sets, from which the gray-level means of 30 image blocks are arbitrarily selected as initial cluster centres;
Step 3: According to the gray-level mean of each image-matrix block, compute the distance between these objects and the 30 image-sample cluster centres using the Euclidean distance shown below; then partition the blocks again by minimum distance to the block gray-level means, assigning each image-matrix block to the most similar class;

dis(x_i, y_j) = √( Σ_{k=1}^{d} (x_{ik} − y_{jk})² )

where dis(x_i, y_j) is the distance between the two data objects x_i and y_j: the smaller dis(x_i, y_j), the more similar x_i and y_j; the larger it is, the greater the gap between them.
Step 4: Recalculate the centroid of each changed image block's pixel gray-level mean;
Step 5: Repeat steps 3 and 4 until the cluster centre of each data class no longer changes.
After the input image matrices are stored in the form of pixel blocks, the pixel-matrix blocks are clustered with the K-means clustering algorithm, the Euclidean distances from the pixel blocks in each cluster to the cluster centre are sorted, and the images farthest from their cluster centres are deleted. Discarding the auxiliary data that differ greatly from the target from the crack image library benefits the training of the subsequent classifier and improves training efficiency.
3. Let D_T = {(x_t, y_t)} be the labelled target-domain training set and D_S the collection of N auxiliary data sets, i.e. D_S = {D_1, D_2, ..., D_N} = {(x_1, y_1), ..., (x_k, y_k), ..., (x_N, y_N)}. Initialize the weight vector w = (w_S, w_T) and normalize the sample weights of the merged training set. Combine each multi-source auxiliary data set D_i (i = 1, 2, ..., N) with the labelled target-domain data set D_T as <D_i, D_T>, obtaining the combined data set D_{i,T}.
4. Start training the network: on each combination D_{i,T}, uniformly perform image preprocessing, image segmentation, and feature extraction, save all the features, and train an SVM classifier, obtaining the weak classifier F_m.
5. Compute the errors of the weak classifier F_m on D_S and D_T separately:

ε_S^m = Σ_{i=1}^{n_S} w_i |F_m(x_i) − y_i| / Σ_{i=1}^{n_S} w_i,
ε_T^m = Σ_{i=1}^{n_T} w_i |F_m(x_i) − y_i| / Σ_{i=1}^{n_T} w_i

where n_S is the number of samples in the auxiliary data sets and n_T is the number of samples in the target data set; y_i is the true class and F_m(x_i) is the class output by the classifier F_m.
6. Set the weight update formula to w_i^{m+1} = C_m ρ_m w_i^m β^{|F_m(x_i) − y_i|}, with the adaptive covering parameter ρ_m = (1 − ε_S^m) + (1 − ε_T^m), to modify the corresponding weights. A correction coefficient is added to the weight-update strategy and the adaptive covering parameter is introduced into the correction coefficient; finally, the final balance-weight method resets the final weight of the target data set to the average of each domain's auxiliary-training-set weights in the last iteration. The specific principle is as follows:
(1) Add a correction coefficient to update the weights of the auxiliary data sets
In order to make better use of each domain's auxiliary data set and the target data set in training, a correction coefficient is added on the basis of TrAdaBoost to update the weights of the auxiliary-data-set samples. As the number of iterations M grows, every domain's auxiliary training set comes to be classified correctly, so after M iterations the sum of each domain's auxiliary sample weights is

S_a = Σ_{i=1}^{n_a} w_i^a

where a denotes an auxiliary training set, n_a is the number of samples in auxiliary training set a, and w_i^a is the weight of each training sample in source domain a.
The weights of correctly predicted samples in the target data set b remain unchanged, so the weight sum of the correct samples, S_corr^b, is

S_corr^b = (1 − ε_b^m) Σ_{i=1}^{n_b} w_i^b

where n_b is the number of samples in the target data set b, w_i^b is the weight of each training sample in b, and ε_b^m is the error rate of the weak classifier on b.
Mispredicted samples in the target data set b must be updated to modify the corresponding weights:

w_i^b ← w_i^b (1 − ε_b^m) / ε_b^m

so the weight sum of the erroneous samples in b, S_err^b, is

S_err^b = ε_b^m Σ_{i=1}^{n_b} w_i^b · (1 − ε_b^m) / ε_b^m = (1 − ε_b^m) Σ_{i=1}^{n_b} w_i^b.

The sum of all target-domain sample weights, that is, the correct plus the erroneous weight sums, is therefore

S^b = S_corr^b + S_err^b = 2 (1 − ε_b^m) Σ_{i=1}^{n_b} w_i^b.

When the number of iterations is sufficiently large, every domain's auxiliary training set is classified correctly, so its weight sum stays fixed while the target weight sum is scaled by 2(1 − ε_b^m) per round; after normalization the auxiliary weights therefore shrink by this factor at every iteration. If the auxiliary-data-set samples are given a correction coefficient C_m, the weight update becomes

w_i^{m+1} = C_m w_i^m β^{|F_m(x_i) − y_i|}

where β = 1 / (1 + √(2 ln n_S / M)) is the TrAdaBoost down-weighting factor. Requiring the auxiliary-data-set sample weights to stay stable, i.e. w_i^{m+1} = w_i^m for correctly classified auxiliary samples after normalization, yields the correction coefficient

C_m = 2 (1 − ε_b^m).

The formula shows that the correction coefficient C_m is inversely related to the error rate ε_b^m of the weak classifier on the target data set b: the larger ε_b^m, the smaller C_m and the larger the auxiliary-data-set sample weights remain relative to the target, so their influence on the next iteration's weak classifier grows; the smaller ε_b^m, the larger C_m and the smaller the relative auxiliary weights, so their influence on the next round's training shrinks. Adding the correction coefficient C_m on the basis of the TrAdaBoost algorithm therefore keeps the sample weights of the target data set and of the auxiliary data sets convergent together.
(2) Introduce the adaptive covering parameter
However, even when ε_b is low, the weak classifier's performance can still differ across the source-domain training sets, and these differences reflect the correlation between each source-domain training set and the target-domain training set. To reflect this similarity relationship, an adaptive covering parameter is introduced into the correction coefficient; it is the sum of the base classifier's classification accuracies on the auxiliary data set and the target data set, namely

ρ_m = (1 − ε_S^m) + (1 − ε_T^m).

The weight of an auxiliary-domain data sample after the (m+1)-th iteration is then

w_i^{m+1} = C_m ρ_m w_i^m β^{|F_m(x_i) − y_i|}.
(3) The final balance-weight method
The final weight of the target data set is reset to the average of each domain's auxiliary-training-set weights in the last iteration, giving the finally obtained target data set and each domain's auxiliary training set consistent importance and improving the detection accuracy of the algorithm.
7. Update the weight vector: auxiliary-sample weights are updated with the corrected rule above, and target-sample weights are updated as w_i ← w_i β_T^{−|F_m(x_i) − y_i|}, where β_T = ε_T^m / (1 − ε_T^m).
8. Repeat steps 4 to 7 until the set number of iterations M is reached, obtaining the strong SVM classifier. Finally reset the weights of D_T: set the D_T weights to the average weight of each D_s after iteration, and use D_s together with the reset D_T to jointly train one final classifier.
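Putting steps 3 to 8 together, a condensed end-to-end sketch of the MABtrA loop, with scikit-learn's LinearSVC standing in for the SVM weak learner; the forms of C_m and ρ_m and the final reset follow the reconstructions above and remain assumptions:

```python
import numpy as np
from sklearn.svm import LinearSVC

def mabtra_train(X_aux, y_aux, X_tgt, y_tgt, n_iters=20):
    """Condensed sketch of the MABtrA loop (steps 3 to 8). Weak learners are
    weight-fitted linear SVMs; with several source domains the final reset
    would use the mean of the per-domain average weights."""
    n_a = len(y_aux)
    X = np.vstack([X_aux, X_tgt])
    y = np.concatenate([y_aux, y_tgt])
    w = np.full(len(y), 1.0 / len(y))                # initial weight vector
    beta = 1.0 / (1.0 + np.sqrt(2.0 * np.log(n_a) / n_iters))

    for m in range(n_iters):
        clf = LinearSVC().fit(X, y, sample_weight=w)  # weak classifier F_m
        miss = (clf.predict(X) != y).astype(float)
        eps_a = np.sum(w[:n_a] * miss[:n_a]) / np.sum(w[:n_a])
        eps_t = np.sum(w[n_a:] * miss[n_a:]) / np.sum(w[n_a:])
        eps_t = float(np.clip(eps_t, 1e-10, 0.499))   # keep beta_t well defined
        c_m = 2.0 * (1.0 - eps_t)                     # correction coefficient
        rho_m = (1.0 - eps_a) + (1.0 - eps_t)         # covering parameter
        beta_t = eps_t / (1.0 - eps_t)
        w[:n_a] *= c_m * rho_m * beta ** miss[:n_a]   # corrected aux update
        w[n_a:] *= beta_t ** (-miss[n_a:])            # target update (boost errors)
        w /= w.sum()

    # Final balance: reset target weights to the average auxiliary weight,
    # then train one last classifier on the re-balanced combined set.
    w[n_a:] = w[:n_a].mean()
    return LinearSVC().fit(X, y, sample_weight=w / w.sum())
```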
The evaluation criteria of the specific embodiment of the invention are as follows:
The evaluation criteria of the specific embodiment are recall (Recall), precision (Precision), accuracy (Accuracy), and the comprehensive evaluation index (F-Measure). Recall is the ratio of correctly predicted positives to all actual positives, i.e. the proportion of targets of this kind that are identified; precision is the ratio of correctly predicted positives to all predicted positives, i.e. the proportion of real targets among everything returned; accuracy is the proportion of correctly predicted samples among all samples; the comprehensive evaluation index combines recall and precision:

Recall = TP / (TP + FN)
Precision = TP / (TP + FP)
Accuracy = (TP + TN) / (TP + TN + FP + FN)
F-Measure = 2 × TP / (2 × TP + FP + FN)
For all four of the above criteria, larger values indicate better prediction performance of the algorithm.
The meanings of TP, FN, FP, and TN are given by the binary-classification confusion matrix of Table 1.
Table 1. Binary-classification confusion matrix

                    Predicted positive    Predicted negative
Actual positive     TP                    FN
Actual negative     FP                    TN
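Expressed in the counts of Table 1, the four criteria compute directly; a small sketch:

```python
def evaluation_metrics(tp, fn, fp, tn):
    """Recall, Precision, Accuracy and F-Measure from the Table 1 counts."""
    recall = tp / (tp + fn)                      # identified share of real positives
    precision = tp / (tp + fp)                   # real positives among predictions
    accuracy = (tp + tn) / (tp + fn + fp + tn)   # correct predictions overall
    f_measure = 2 * tp / (2 * tp + fp + fn)      # harmonic blend of P and R
    return recall, precision, accuracy, f_measure
```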
The above embodiments show that, for the practical problems that dam-crack images are scarce, that training samples are unevenly distributed, and that the TrAdaBoost algorithm easily weakens the contribution of the auxiliary data sets during training, the method of the invention can train a strong classifier for dam-crack images, improve crack-detection accuracy, and boost the detection performance of dam-crack images on small-sample data sets.

Claims (4)

1. A multi-source adaptive balanced transfer learning method for crack image detection, characterized by comprising the following steps:
(1) Input the multi-source auxiliary image data sets;
(2) Perform K-means clustering on the images, then remove the images that differ greatly from the target data;
(3) Discard the auxiliary data that differ greatly from the target from the crack image library, and train the classifier;
(4) Set the weight update formula to w_i^{m+1} = C_m ρ_m w_i^m β^{|F_m(x_i) − y_i|}, with the adaptive covering parameter ρ_m = (1 − ε_S^m) + (1 − ε_T^m), to modify the corresponding weights; a correction coefficient is added to the weight-update strategy and the adaptive covering parameter is introduced into the correction coefficient, and finally the final balance-weight method resets the final weight of the target data set to the average of each domain's auxiliary-training-set weights in the last iteration;
(5) Update the weight vector and return to step (3) until the strong SVM classifier is obtained; finally reset the weights of D_T: set the D_T weights to the average weight of each D_s after iteration, and use D_s together with the reset D_T to jointly train one final classifier.
2. The multi-source adaptive balanced transfer learning method for crack image detection according to claim 1, characterized in that the specific steps of performing K-means clustering on the images in step (2) and then removing the images that differ greatly from the target data are as follows:
(2.1) First convert each image X_i (i = 1, 2, ..., n) in the crack image library to gray scale and store it sequentially in a one-dimensional matrix D_X;
(2.2) Then store the images block by block with a block length of 10 pixels and a moving step of 3 pixels, recording the starting position of each small block; this yields n pixel-block data sets, from which the gray-level means of 30 image blocks are arbitrarily selected as initial cluster centres;
(2.3) According to the gray-level mean of each image-matrix block, compute the distance between these objects and the 30 image-sample cluster centres using the Euclidean distance shown below; then partition the blocks again by minimum distance to the block gray-level means, assigning each image-matrix block to the most similar class;

dis(x_i, y_j) = √( Σ_{k=1}^{d} (x_{ik} − y_{jk})² )

where dis(x_i, y_j) is the distance between the two data objects x_i and y_j: the smaller dis(x_i, y_j), the more similar x_i and y_j; the larger it is, the greater the gap between them;
(2.4) Recalculate the centroid of each changed image block's pixel gray-level mean;
(2.5) Repeat steps (2.3) and (2.4) until the cluster centre of each data class no longer changes;
After the input image matrices are stored in the form of pixel blocks, the pixel-matrix blocks are clustered with the K-means clustering algorithm, the Euclidean distances from the pixel blocks in each cluster to the cluster centre are sorted, and the images farthest from their cluster centres are deleted.
3. The multi-source adaptive balanced transfer learning method for crack image detection according to claim 1, characterized in that the specific steps of training the classifier in step (3) are as follows:
(3.1) Let D_T = {(x_t, y_t)} be the labelled target-domain training set and D_S the collection of N auxiliary data sets, i.e. D_S = {D_1, D_2, ..., D_N} = {(x_1, y_1), ..., (x_k, y_k), ..., (x_N, y_N)}; initialize the weight vector w = (w_S, w_T) and normalize the sample weights of the merged training set; combine each multi-source auxiliary data set D_i (i = 1, 2, ..., N) with the labelled target-domain data set D_T as <D_i, D_T>, obtaining the combined data set D_{i,T};
(3.2) Start training the network: on each combination D_{i,T}, uniformly perform image preprocessing, image segmentation, and feature extraction, save all the features, and train an SVM classifier, obtaining the weak classifier F_m;
(3.3) Compute the errors of the weak classifier F_m on D_S and D_T separately:

ε_S^m = Σ_{i=1}^{n_S} w_i |F_m(x_i) − y_i| / Σ_{i=1}^{n_S} w_i,
ε_T^m = Σ_{i=1}^{n_T} w_i |F_m(x_i) − y_i| / Σ_{i=1}^{n_T} w_i

where n_S is the number of samples in the auxiliary data sets and n_T is the number of samples in the target data set; y_i is the true class and F_m(x_i) is the class output by the classifier F_m.
4. The multi-source adaptive balanced transfer learning method for crack image detection according to claim 1, characterized in that the specific steps of resetting the final weight of the target data set in step (4) to the average of each domain's auxiliary-training-set weights in the last iteration are as follows:
(4.1) Add a correction coefficient to update the weights of the auxiliary data sets:
A correction coefficient is added on the basis of TrAdaBoost to update the weights of the auxiliary-data-set samples. As the number of iterations M grows, every domain's auxiliary training set comes to be classified correctly, so after M iterations the sum of each domain's auxiliary sample weights is

S_a = Σ_{i=1}^{n_a} w_i^a

where a denotes an auxiliary training set, n_a is the number of samples in auxiliary training set a, and w_i^a is the weight of each training sample in source domain a;
The weights of correctly predicted samples in the target data set b remain unchanged, so the weight sum of the correct samples, S_corr^b, is

S_corr^b = (1 − ε_b^m) Σ_{i=1}^{n_b} w_i^b

where n_b is the number of samples in the target data set b, w_i^b is the weight of each training sample in b, and ε_b^m is the error rate of the weak classifier on b;
Mispredicted samples in the target data set b must be updated to modify the corresponding weights:

w_i^b ← w_i^b (1 − ε_b^m) / ε_b^m

so the weight sum of the erroneous samples in b, S_err^b, is

S_err^b = ε_b^m Σ_{i=1}^{n_b} w_i^b · (1 − ε_b^m) / ε_b^m = (1 − ε_b^m) Σ_{i=1}^{n_b} w_i^b;
The sum of all target-domain sample weights, that is, the correct plus the erroneous weight sums, is therefore

S^b = S_corr^b + S_err^b = 2 (1 − ε_b^m) Σ_{i=1}^{n_b} w_i^b;
When the number of iterations is sufficiently large, every domain's auxiliary training set is classified correctly, so its weight sum stays fixed while the target weight sum is scaled by 2(1 − ε_b^m) per round; after normalization the auxiliary weights therefore shrink by this factor at every iteration. If the auxiliary-data-set samples are given a correction coefficient C_m, the weight update becomes

w_i^{m+1} = C_m w_i^m β^{|F_m(x_i) − y_i|}

where β = 1 / (1 + √(2 ln n_S / M)) is the TrAdaBoost down-weighting factor. Requiring the auxiliary-data-set sample weights to stay stable, i.e. w_i^{m+1} = w_i^m for correctly classified auxiliary samples after normalization, yields the correction coefficient

C_m = 2 (1 − ε_b^m).

The formula for the correction coefficient shows that C_m is inversely related to the error rate ε_b^m of the weak classifier on the target data set b: the larger ε_b^m, the smaller C_m and the larger the auxiliary-data-set sample weights remain relative to the target, so their influence on the next iteration's weak classifier grows; the smaller ε_b^m, the larger C_m and the smaller the relative auxiliary weights, so their influence on the next round's training shrinks; adding the correction coefficient C_m on the basis of the TrAdaBoost algorithm therefore keeps the sample weights of the target data set and of the auxiliary data sets convergent together;
(4.2) Introduce the adaptive covering parameter:
An adaptive covering parameter is introduced into the correction coefficient; it is the sum of the base classifier's classification accuracies on the auxiliary data set and the target data set, namely

ρ_m = (1 − ε_S^m) + (1 − ε_T^m).

The weight of an auxiliary-domain data sample after the (m+1)-th iteration is then

w_i^{m+1} = C_m ρ_m w_i^m β^{|F_m(x_i) − y_i|};
(4.3) The final balance-weight method:
The final weight of the target data set is reset to the average of each domain's auxiliary-training-set weights in the last iteration, so that the finally obtained target data set and each domain's auxiliary training set carry consistent importance.
CN201910496225.3A 2019-06-10 2019-06-10 Multi-source adaptive balanced transfer learning method for crack image detection Pending CN110378872A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910496225.3A CN110378872A (en) 2019-06-10 2019-06-10 Multi-source adaptive balanced transfer learning method for crack image detection


Publications (1)

Publication Number Publication Date
CN110378872A (en) 2019-10-25

Family

ID=68249935

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910496225.3A Pending CN110378872A (en) 2019-06-10 2019-06-10 Multi-source adaptive balanced transfer learning method for crack image detection

Country Status (1)

Country Link
CN (1) CN110378872A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111291818A (en) * 2020-02-18 2020-06-16 浙江工业大学 Non-uniform class sample equalization method for cloud mask
CN112668583A (en) * 2021-01-07 2021-04-16 浙江星汉信息技术股份有限公司 Image recognition method and device and electronic equipment
CN113011513A (en) * 2021-03-29 2021-06-22 华南理工大学 Image big data classification method based on general domain self-adaption
CN113516334A (en) * 2021-03-12 2021-10-19 中电建电力检修工程有限公司 Dam joint and crack inspection method and system

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103761311A (en) * 2014-01-23 2014-04-30 中国矿业大学 Sentiment classification method based on multi-source field instance migration
CN107341512A (en) * 2017-07-06 2017-11-10 广东工业大学 A kind of method and device of transfer learning classification
CN109254219A (en) * 2018-11-22 2019-01-22 国网湖北省电力有限公司电力科学研究院 A kind of distribution transforming transfer learning method for diagnosing faults considering multiple factors Situation Evolution



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication (application publication date: 20191025)