CN107886539B - High-precision gear visual detection method in industrial scene - Google Patents

High-precision gear visual detection method in industrial scene

Info

Publication number
CN107886539B
Authority
CN
China
Prior art keywords
image
target
classifier
feature
samples
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201710974598.8A
Other languages
Chinese (zh)
Other versions
CN107886539A (en)
Inventor
张印辉
田敏
王森
何自芬
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Kunming University of Science and Technology
Original Assignee
Kunming University of Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Kunming University of Science and Technology filed Critical Kunming University of Science and Technology
Priority to CN201710974598.8A priority Critical patent/CN107886539B/en
Publication of CN107886539A publication Critical patent/CN107886539A/en
Application granted granted Critical
Publication of CN107886539B publication Critical patent/CN107886539B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/70Determining position or orientation of objects or cameras
    • G06T7/73Determining position or orientation of objects or cameras using feature-based methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00Image enhancement or restoration
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00Image enhancement or restoration
    • G06T5/10Image enhancement or restoration using non-spatial domain filtering
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10004Still image; Photographic image
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10024Color image
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20172Image enhancement details
    • G06T2207/20192Edge enhancement; Edge preservation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • G06T2207/30108Industrial image inspection
    • G06T2207/30164Workpiece; Machine component

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Data Mining & Analysis (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Image Analysis (AREA)

Abstract

The invention relates to a high-precision gear visual detection method in an industrial scene and belongs to the target detection field of machine learning technology. First, positive sample images containing a gear target and negative sample images without a gear target are acquired, the gear targets are annotated with bounding boxes, and the images are divided into a training set and a test set in a 1:1 ratio. After the images are enhanced with the Par-King image enhancement algorithm, histogram of oriented gradients (HOG) features are extracted, giving the corresponding feature positive samples and feature negative samples. Two different classifiers are trained on the extracted training set feature samples: one is a common overall SVM classifier, the other a combined local SVR classifier in the frequency domain. The two classifiers then jointly match the test set feature samples to obtain the optimal detection position of the target. With this combined model matching method, the high-precision position information of the gear target in the industrial scene can be acquired effectively.

Description

High-precision gear visual detection method in industrial scene
Technical Field
The invention relates to a high-precision gear visual detection method in an industrial scene, in particular to a gear visual detection method based on joint model matching in the industrial scene, and belongs to the field of target detection of machine learning technology.
Background
Industrial robots in industrial production use object detection to automatically recognize the category and exact position of the parts to be processed and then perform the corresponding operations such as grasping, welding, and cutting. Compared with the traditional production mode, this automated production mode based on automatic target detection improves production efficiency and saves labor.
Object detection is an important research topic in the fields of pattern recognition and digital image processing. The topic has progressed rapidly over the past decade or so: many excellent object detection algorithms are proposed every year, and both detection quality and speed are continuously improved. The AdaBoost framework of Viola et al. classifies Haar-like wavelet features and then uses a sliding-window search strategy to achieve accurate and effective localization; it was the first object-class detection algorithm able to run in real time while giving a good detection rate, and it is mainly applied to face detection. Dalal et al. proposed using the local Histogram of Oriented Gradients (HOG) of an image as the feature and a Support Vector Machine (SVM) as the classifier for pedestrian detection; HOG features reflect the orientation information of the target object well, object detection technology has developed even faster since this feature appeared, and various improved HOG features have since been proposed.
Driven by the requirements on detection precision and speed, Henriques et al. proposed the Block-Circulant Decomposition algorithm in 2013, whose results on the INRIA and ETH pedestrian detection data sets are clearly better than those of traditional detection algorithms. The present method applies this algorithm to gear target detection in an industrial scene. Because illumination changes markedly in the industrial scene, some target images are captured unclearly, and HOG features are sensitive to oriented gradients, the original images are first transformed into the fuzzy domain for gradient enhancement before the HOG features are extracted, so that the features are better separable. Following the steps of the block-circulant decomposition algorithm, the extracted feature samples are Fourier-transformed, an independent Support Vector Regression (SVR) classifier is trained in the frequency domain for each block at the corresponding position of the HOG features, and the blocks are then inverse-transformed into an overall combined SVR classifier in the spatial domain. Since this classifier is a combination focused on individual local blocks, its power to discriminate overall differences is weaker than its power to discriminate local differences; with this in mind, an overall SVM classifier model is also trained on the extracted features, the target is detected jointly with the SVM classifier model and the combined SVR classifier model, and the final detection uses the image pyramid matching method proposed by Felzenszwalb et al. The invention is funded by national science fund projects (61461022 and 61761024); it mainly aims to explore a global-local feature multi-scale coupling mechanism and a robust fusion algorithm for multi-scale perceptual error measures, to resolve the inconsistency between the coupled posterior test and the real distribution and the inconsistency of the multi-scale error-measure optimization structure, and to provide a theoretical basis for efficient, fast and accurate detection and segmentation of foreground target information on a production line in a dynamic scene.
Disclosure of Invention
To address these problems, the invention provides a high-precision gear visual detection method for industrial scenes. According to the detection precision requirement, the separability of the gradient direction histogram features of target and background is improved with the Par-King image enhancement algorithm, two different classifier models, an SVM (support vector machine) and a frequency-domain SVR (support vector regression), are trained on a self-built data set, and the test set images are jointly and visually detected with the image pyramid matching method of the deformable part model algorithm.
The technical scheme of the invention is as follows: a high-precision gear visual detection method in an industrial scene comprises the following specific steps:
step1, acquiring positive sample images containing a gear target and negative sample images without the gear target in an industrial scene, carrying out bounding box labeling on the gear targets, and dividing the images into a training set and a test set in a 1:1 ratio;
step2, enhancing the images of the training set and the test set by using a Par-King image enhancement algorithm;
step3, extracting HOG (histogram of gradient directions) characteristics of the image processed in Step2 to obtain corresponding characteristic positive samples and characteristic negative samples;
step4, training two different classifiers for the extracted training set feature samples, wherein one classifier is a common overall SVM classifier, and the other classifier is a local SVR combined classifier on a frequency domain;
and Step5, carrying out joint matching on the feature samples of the test set by the two classifiers by using an image pyramid matching algorithm to obtain the optimal detection position of the target.
In Step2, the image enhancement method is as follows:
Step2.1, a gray image X with gray level L is regarded as an array of fuzzy points. Assuming X is an M×N gray image, the fuzzy point array is:

        | μ11/x11   μ12/x12   ...   μ1N/x1N |
    X = | μ21/x21   μ22/x22   ...   μ2N/x2N |                              (1)
        |   ...       ...     ...     ...   |
        | μM1/xM1   μM2/xM2   ...   μMN/xMN |

where μmn/xmn denotes that the pixel at coordinate (m, n) in image X, with pixel value xmn, has the fuzzy feature value μmn. The plane formed by all the fuzzy feature values is the fuzzy feature plane.
Step2.2, the image X is transformed from the image domain to the fuzzy domain with the transformation function (also called the membership function):

    μmn = G(xmn) = [1 + (xmax - xmn)/Fd]^(-Fe)                              (2)

    m = 1, 2, ..., M;  n = 1, 2, ..., N.

where xmax is the maximum pixel value in image X, Fd is the denominator fuzzification parameter and Fe is the exponential fuzzification parameter; for images with pixel values in the range 0 to 255, Fd and Fe are usually chosen as 128 and 1, respectively.
Step2.3, the fuzzy domain {μmn} is enhanced with the enhancement function:

    μ'mn = T(μmn) = 2·(μmn)^2,             0 ≤ μmn ≤ 0.5
                  = 1 - 2·(1 - μmn)^2,     0.5 < μmn ≤ 1                    (3)
Step2.4, after the enhancement is finished, the result is transformed back to the spatial domain with xmn = G^(-1)(μ'mn), where G^(-1) is the inverse of the function G.
Because the Par-King algorithm processes a single-channel gray image, and the training set and test set image data are RGB three-channel images, the Par-King algorithm is used for independently enhancing three channels in one frame of image, and then the three channels of RGB images are combined to obtain a final enhanced image.
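For illustration only, a minimal Python sketch of this per-channel fuzzy enhancement is given below. It assumes the standard Pal-King-style contrast-intensification operator as the enhancement function of formula (3) and the default parameters Fd = 128 and Fe = 1; the function names are illustrative and not part of the claimed method.

    import numpy as np

    def fuzzy_enhance_channel(x, fd=128.0, fe=1.0):
        """Enhance one gray channel via the fuzzy-domain scheme of Step2.1-Step2.4."""
        x = x.astype(np.float64)
        x_max = x.max()
        # Step2.2: image domain -> fuzzy domain (membership function G, formula (2))
        mu = (1.0 + (x_max - x) / fd) ** (-fe)
        # Step2.3: contrast intensification in the fuzzy domain (assumed INT operator)
        mu = np.where(mu <= 0.5, 2.0 * mu ** 2, 1.0 - 2.0 * (1.0 - mu) ** 2)
        # Step2.4: fuzzy domain -> image domain with the inverse of G
        x_enh = x_max - fd * (mu ** (-1.0 / fe) - 1.0)
        return np.clip(x_enh, 0, 255).astype(np.uint8)

    def fuzzy_enhance_rgb(img):
        """Enhance the three RGB channels independently and recombine them."""
        return np.dstack([fuzzy_enhance_channel(img[..., c]) for c in range(3)])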
In Step3, the HOG feature extraction method comprises the following steps:
the original HOG characteristics divide the sample image into a plurality of cells multiplied by cells (the cells respectively use the center difference to calculate the gradient amplitude M of each pixel point (x, y)(x,y)And the gradient direction omega(x,y)The calculation formula is as follows:
Figure BDA0001438251940000034
Ω(x,y)=arctan(My/Mx) (5)
where Mx and My are the horizontal and vertical gradients at pixel (x, y), respectively, computed as:
Mx=N(x+1,y)-N(x-1,y) (6)
My=N(x,y+1)-N(x,y-1) (7)
where N(x, y) is the pixel value of pixel (x, y).
Since all the image data of the present invention are RGB color images, the maximum of the three color-channel gradient magnitudes at each pixel location is selected as the output. The improved HOG features are extracted with the vlfeat function library: the gradient directions in [0, 2π) are divided evenly into 2×k bins (k = 1, 2, 3, ...), histogram statistics of all gradient values in each cell over these 2×k bins give a 2×k-dimensional feature vector, every four adjacent cells form a region block, and bilinear interpolation over the four cell vectors of a block gives the 2×k-dimensional feature vector output of that block; in addition, the gradient directions in [0, π) are divided evenly into k bins and a histogram without gradient magnitudes is counted, giving a k-dimensional feature vector output in the same way; finally, the reciprocal of the L2 norm of each of the four cells in a block is taken as that cell's normalization factor, and the four normalization factors are output as a 4-dimensional feature vector of the block. The HOG feature vector of one region block therefore has (4 + 3×k) dimensions.
The sample image is traversed with this block feature vector rule, with a traversal step of cellsize, going from the top-left corner first downward and then rightward to the bottom-right corner; at least half of every region block lies within the sample image. Therefore, for an RGB image of size w × h × 3, the number of horizontal region blocks hogw and the number of vertical region blocks hogh are:
hogw=(w+cellsize/2)/cellsize (8)
hogh=(h+cellsize/2)/cellsize (9)
the dimension of the HOG feature matrix of the sample image obtained finally is hogw × hogh × (4+3 × k).
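The per-pixel gradient computation of formulas (4)-(7), the channel-maximum selection, and the feature-map size rule of formulas (8)-(9) could be sketched in Python roughly as follows; cellsize = 8 and k = 9 (which gives the 31-dimensional blocks used later in the embodiment) are assumed example values, and the helper names are hypothetical.

    import numpy as np

    def pixel_gradients(channel):
        """Central-difference gradients, magnitude and direction per pixel (formulas (4)-(7))."""
        g = channel.astype(np.float64)
        mx = np.zeros_like(g)
        my = np.zeros_like(g)
        mx[:, 1:-1] = g[:, 2:] - g[:, :-2]        # Mx = N(x+1, y) - N(x-1, y)
        my[1:-1, :] = g[2:, :] - g[:-2, :]        # My = N(x, y+1) - N(x, y-1)
        mag = np.sqrt(mx ** 2 + my ** 2)          # M(x,y) = sqrt(Mx^2 + My^2)
        ang = np.arctan2(my, mx) % (2 * np.pi)    # direction folded into [0, 2*pi)
        return mag, ang

    def dominant_gradient(img_rgb):
        """Keep, at every pixel, the gradient of the colour channel with the largest magnitude."""
        mags, angs = zip(*(pixel_gradients(img_rgb[..., c]) for c in range(3)))
        mags, angs = np.stack(mags), np.stack(angs)
        idx = mags.argmax(axis=0)
        rows, cols = np.indices(idx.shape)
        return mags[idx, rows, cols], angs[idx, rows, cols]

    def hog_map_shape(w, h, cellsize=8, k=9):
        """Feature-map size from formulas (8)-(9) and the (4 + 3*k) per-block dimension."""
        hogw = (w + cellsize // 2) // cellsize
        hogh = (h + cellsize // 2) // cellsize
        return hogw, hogh, 4 + 3 * k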
In Step4, the classifier training method is as follows:
Step4.1, to train the SVM model, each extracted feature sample has size m1×n1×p, with a feature positive samples and b1 feature negative samples; each feature sample is flattened into a 1×(m1×n1×p)-dimensional feature vector, an SVM classifier is trained from the resulting (a+b1)×(m1×n1×p)-dimensional feature matrix, and the classifier is then reshaped back to m1×n1×p as the final SVM model w1.
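A minimal sketch of this Step4.1 training is given below; scikit-learn's LinearSVC is used only as one possible linear SVM solver, and the regularization constant is an illustrative assumption.

    import numpy as np
    from sklearn.svm import LinearSVC

    def train_svm_template(pos_feats, neg_feats):
        """pos_feats / neg_feats: lists of (m1, n1, p) HOG feature samples."""
        m1, n1, p = pos_feats[0].shape
        X = np.array([f.ravel() for f in pos_feats + neg_feats])   # (a+b1) x (m1*n1*p) matrix
        y = np.array([1] * len(pos_feats) + [-1] * len(neg_feats))
        clf = LinearSVC(C=1.0).fit(X, y)
        # reshape the learned hyperplane back to feature-map shape: final model w1
        return clf.coef_.reshape(m1, n1, p)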
Step4.2, to train the SVR model, each extracted feature sample has size m2×n2×p, with a feature positive samples and b2 feature negative samples; the extracted features are Fourier-transformed, an independent hyperplane model is trained for each block of all feature samples in the frequency domain, the blocks are then combined into an overall hyperplane model according to their original spatial positions, and an inverse Fourier transform yields the final SVR model w2.
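A sketch of the frequency-domain per-block training of Step4.2 follows; for simplicity a regularized least-squares (ridge) solution stands in for the SVR solver of the actual method, so the code only illustrates the transform / per-block training / combination / inverse-transform structure described above.

    import numpy as np

    def train_freq_block_model(pos_feats, neg_feats, lam=1e-2):
        """pos_feats / neg_feats: lists of (m2, n2, p) HOG feature samples."""
        feats = np.array(pos_feats + neg_feats)                  # (N, m2, n2, p)
        y = np.array([1.0] * len(pos_feats) + [-1.0] * len(neg_feats))
        F = np.fft.fft2(feats, axes=(1, 2))                      # DFT over the two spatial axes
        _, m2, n2, p = F.shape
        W = np.zeros((m2, n2, p), dtype=complex)
        for i in range(m2):                                      # one independent model per block
            for j in range(n2):
                X = F[:, i, j, :]                                # (N, p) complex block samples
                A = X.conj().T @ X + lam * np.eye(p)             # regularized normal equations
                W[i, j] = np.linalg.solve(A, X.conj().T @ y)
        return np.real(np.fft.ifft2(W, axes=(0, 1)))             # combined model w2 in the spatial domain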
In Step5, the joint matching method comprises the following steps:
Because the size of the target to be detected varies and the target is imaged from different angles, a standard image pyramid is built by repeated smoothing and downsampling, i.e., a series of images of different sizes is generated from one image. The learned models w1 and w2 are then each dot-multiplied with the features of every pyramid level, giving the scores of the two models at different scales and positions as output; an overall score is defined for each root position, the optimal target position and its score are selected for each of the two models, the scores of the two detected target positions are compared, and the higher-scoring position is output as the final position.
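A schematic sketch of this joint pyramid matching is given below. The feature extractor, the pyramid scales, and the use of a plain cross-correlation for the dot-product scoring are illustrative assumptions; the actual method follows the Felzenszwalb-style pyramid matching.

    import numpy as np
    from scipy.ndimage import zoom
    from scipy.signal import correlate

    def detect(feature_fn, image, w1, w2, scales=(1.0, 0.8, 0.64, 0.5)):
        """feature_fn maps an image to its HOG feature map (hogh x hogw x dim)."""
        best = None
        for s in scales:
            level = zoom(image, (s, s, 1), order=1)              # one (approximate) pyramid level
            feats = feature_fn(level)
            for name, w in (("w1", w1), ("w2", w2)):
                if any(fs < ws for fs, ws in zip(feats.shape, w.shape)):
                    continue                                     # template larger than this level
                # dot product of the template with every window of this level
                scores = correlate(feats, w, mode="valid")[..., 0]
                i, j = np.unravel_index(scores.argmax(), scores.shape)
                cand = (scores[i, j], name, s, i, j)
                if best is None or cand[0] > best[0]:
                    best = cand
        return best   # (score, winning model, scale, row, col) of the final detection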
The invention has the beneficial effects that:
(1) the gear visual identification method introduces the machine learning algorithm to the gear visual identification under the industrial scene, and the identification effect of the machine learning algorithm is obviously improved compared with the detection precision of the traditional visual algorithm;
(2) according to the method, a Par-King image enhancement algorithm is introduced to enhance the original image, so that the separability of the target and background gradient direction histogram features is improved;
(3) the invention trains two classifiers to carry out joint detection on the test image, thereby realizing the advantage complementation of different classifiers;
(4) the method can effectively acquire the high-precision position information of the gear target in the industrial scene by using the combined model matching method.
Drawings
FIG. 1 is a block flow diagram of the present invention;
FIG. 2 is an example of partial positive sample images of the present invention; the black boxes are the bounding box annotations;
FIG. 3 is an example of a partial negative example image of the present invention;
FIG. 4 is a partial original image of the present invention;
FIG. 5 is an enhanced image corresponding to the image of FIG. 4 according to the present invention;
FIG. 6 is a visualization of the w1 model of the present invention, with the positive sample model on the left and the negative sample model on the right;
FIG. 7 is a visualization of the w2 model of the present invention, with the positive sample model on the left and the negative sample model on the right;
FIG. 8 shows a part of the test results of the present invention;
FIG. 9 is a precision-recall curve of the present invention;
fig. 10 is an enlarged view of a portion of fig. 9 of the present invention.
Detailed Description
Example 1: as shown in fig. 1 to 10, a high-precision gear visual inspection method in an industrial scene includes the following specific steps:
Step1, the equipment used to collect the data comprises a six-degree-of-freedom manipulator with a binocular stereo vision system, a camera, and several gear parts. With the conveyor belt and its surroundings in the industrial scene as the main background, gear positive samples and background negative samples are captured: 320 positive sample images and 688 negative sample images are acquired and stored in JPG format; for each positive sample the coordinates of the bounding box containing the target are annotated and used as the ground-truth label of the target, and a data set is built with 160 training positive samples, 344 training negative samples, 160 test positive samples and 344 test negative samples. Part of the collected data is shown in FIGS. 2 and 3;
step2, enhancing the images of the training set and the test set by using a Par-King image enhancement algorithm;
in Step2, the image enhancement method is as follows:
a gray image X with a gray level L is regarded as a fuzzy point array, the gray image X is converted into a fuzzy domain by utilizing a membership function, and then is converted into an image domain after being enhanced by an enhancement function to obtain an enhanced image;
because the Par-King algorithm processes single-channel gray images, and the image data of the training set and the test set are RGB three-channel images, the Par-King algorithm is used for independently enhancing three channels in one frame of image, and then the three channels of RGB images are combined to obtain a final enhanced image. Fig. 4 and 5 show a part of the original image and the enhanced image. The qualitative contrast shows that the edge contour of the enhanced image is clearer compared with the edge contour of the original image.
The change in gradient magnitude of the image after Par-King enhancement is analyzed quantitatively with the average per-pixel gradient magnitude difference before and after enhancement. For an RGB image I of size M×N×3, let T(x,y) = sqrt(Ix^2 + Iy^2) be the gradient magnitude of one channel at pixel (x, y), where Ix and Iy are the horizontal and vertical gradients at that pixel, and let I' be the enhanced image of I with gradient magnitude T'(x,y). The average gradient magnitude difference L of each channel of image I is then computed as:

    L = (1/(M×N)) · Σx Σy [T'(x,y) - T(x,y)]   (10)
when L >0, it indicates that the average gradient amplitude of the enhanced image is increased compared to the original image. The L values for each channel of the four exemplary images of FIGS. 6-7 are found by the above equation as shown in Table 1:
TABLE 1 Average gradient magnitude enhancement results of partial images

Image      R        G        B
Image 1    0.5975   0.1604   0.1529
Image 2    1.1827   0.2905   0.3022
Image 3    0.8658   0.2086   0.1924
Image 4    1.8346   0.2275   0.2542
As can be seen from Table 1, the gradient amplitude of the image can be effectively improved by enhancing the image by using the Par-King algorithm, so that the gradient information of the image is more obvious.
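For illustration, the per-channel average gradient magnitude difference described above could be computed roughly as follows; numpy's central-difference gradient stands in here for whatever gradient operator was actually used.

    import numpy as np

    def mean_gradient_gain(original, enhanced):
        """Per-channel average gradient-magnitude difference L (positive = stronger gradients)."""
        def grad_mag(ch):
            gy, gx = np.gradient(ch.astype(np.float64))
            return np.sqrt(gx ** 2 + gy ** 2)
        return [float((grad_mag(enhanced[..., c]) - grad_mag(original[..., c])).mean())
                for c in range(3)]   # [L_R, L_G, L_B], comparable to the rows of Table 1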
Step3, extracting HOG (histogram of gradient directions) characteristics of the image processed in Step2 to obtain corresponding characteristic positive samples and characteristic negative samples;
in Step3, the HOG feature extraction method comprises the following steps:
selecting the maximum value of the gradient magnitudes of the three color channels at each pixel location as the output; dividing the gradient directions of [0, 2π) evenly into 2×k bins (k = 1, 2, 3, ...), extracting the improved HOG features with the vlfeat function library, and finally obtaining a (4+3×k)-dimensional HOG feature vector for each region block;
traversing the sample image by using a block feature vector calculation rule, wherein the traversal step length is cellsize; for an RGB image of size w × h × 3, the number of horizontal area blocks hogw and the number of vertical area blocks hogh are:
hogw=(w+cellsize/2)/cellsize (11)
hogh=(h+cellsize/2)/cellsize (12)
the dimension of the HOG feature matrix of the sample image obtained finally is hogw × hogh × (4+3 × k).
Specifically, the 31-dimensional HOG features of the small region containing the target in an original positive sample are used as a feature positive sample. Because feature extraction does not change the spatial relationship between different blocks, after the HOG features of an original negative sample are extracted they are split, at a certain step length both vertically and horizontally from the top-left to the bottom-right, into several different HOG features of the same size as the feature positive samples, each of which is used as a feature negative sample; this effectively expands the number of negative samples and compensates for a possible shortage of negative samples;
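A small sketch of this negative-sample expansion is given below, assuming the crop size equals the feature positive-sample size (for example the 19 × 18 blocks used for the SVM model in Step4.1) and an illustrative step length:

    import numpy as np

    def crop_negative_windows(neg_hog, win_h=19, win_w=18, step=1):
        """Slide a (win_h, win_w) window over one negative-image HOG map; every crop
        becomes an independent feature negative sample with the same block layout."""
        H, W, _ = neg_hog.shape
        crops = []
        for i in range(0, H - win_h + 1, step):
            for j in range(0, W - win_w + 1, step):
                crops.append(neg_hog[i:i + win_h, j:j + win_w, :])
        return crops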
step4, training two different classifiers for the extracted training set feature samples, wherein one classifier is a common overall SVM classifier, and the other classifier is a local SVR combined classifier on a frequency domain;
in Step4, the classifier training method is as follows:
Step4.1, to train the SVM model, each extracted feature sample has size 19 × 18 × 31, with 316 feature positive samples and 3480 feature negative samples. Each feature sample is flattened into a 1 × 10602-dimensional feature vector, an SVM classifier is trained from the 3796 × 10602-dimensional feature matrix, and the classifier is then reshaped back to 19 × 18 × 31 as the final SVM model w1.
Step4.2, to train the SVR model, each extracted feature sample has size 21 × 20 × 31, with 316 feature positive samples and 4676 feature negative samples. Drawing on the idea of the block-circulant decomposition algorithm, the extracted features are Fourier-transformed, an independent hyperplane model is trained for each block of all feature samples in the frequency domain, the blocks are then combined into an overall hyperplane model according to their original spatial positions, and an inverse Fourier transform yields the final SVR model w2. The two classifier models are visualized separately with the vlfeat library, as shown in FIGS. 6-7.
Following the above target detection flow, the Par-King algorithm is used to enhance the gradient information of the images, the 31-dimensional HOG features are extracted, and the overall SVM model w1 and the frequency-domain SVR block-combination model w2 are trained from the training set feature samples; the following step is then carried out;
and Step5, carrying out joint matching on the feature samples of the test set by the two classifiers by using an image pyramid matching algorithm to obtain the optimal detection position of the target.
In Step5, the joint matching method comprises the following steps:
Because the size of the target to be detected varies and the target is imaged from different angles, a standard image pyramid is built by repeated smoothing and downsampling, i.e., a series of images of different sizes is generated from one image. The learned models w1 and w2 are then each dot-multiplied with the features of every pyramid level, giving the scores of the two models at different scales and positions as output; an overall score is defined for each root position, the optimal target position and its score are selected for each of the two models, the scores of the two detected target positions are compared, and the higher-scoring position is output as the final position.
Aiming at the visual detection of mechanical parts in an industrial scene, the effect of the method is tested on the self-built single-class gear target detection data set. FIG. 8 shows qualitative examples of partial detection results of w1 and w2: the dark boxes are the detections of the w1 model and the bright boxes are the detections of the w2 model. Each model alone already detects well, but comparing the two results shows that the w1 model often merges a dimly lit region into the target, indicating that the w2 model is more robust to illumination effects than w1; where the influence of illumination is small, however, the w2 model includes more redundant background, and there the w1 model localizes more precisely than w2.
The two models are therefore combined for target detection, and the better of their two optimal positions is selected as the final detection output, so that the two models complement each other's weaknesses. The tests are run on a notebook computer with a Core i7 processor and 12 GB of memory. Precision and recall are used to compare the present algorithm with the original SVM and circulant algorithms, the circulant+SVM combined algorithm without the Par-King algorithm, and the SVM and circulant algorithms with the Par-King algorithm added; the running times and accuracies are listed in Table 2, and the precision-recall curves are plotted in FIGS. 9-10.
Although the improved method of the invention is not superior in computation speed, its average detection precision reaches 96.8%, an improvement over the 93% average detection precision obtained when the circulant pedestrian detection algorithm is applied to this task, as the performance comparison in Table 2 shows.
TABLE 2 Performance comparison of the six detection algorithms

Method                    Classifier training time/s   Average detection time/s   Average detection accuracy/%
SVM                       3.745                        0.608                      91.9
circulant                 19.019                       0.392                      93.0
circulant+SVM             22.764                       0.991                      95.1
Par-King+SVM              3.636                        2.911                      94.8
Par-King+circulant        16.481                       2.718                      93.8
Method of the invention   20.117                       5.614                      96.8
While the present invention has been described in detail with reference to the embodiments shown in the drawings, the present invention is not limited to the embodiments, and various changes can be made without departing from the spirit of the present invention within the knowledge of those skilled in the art.

Claims (5)

1. A high-precision gear visual detection method in an industrial scene is characterized by comprising the following steps: the method comprises the following specific steps:
step1, acquiring positive sample images containing a gear target and negative sample images without the gear target in an industrial scene, carrying out bounding box labeling on the gear targets, and dividing the images into a training set and a test set in a 1:1 ratio;
step2, enhancing the images of the training set and the test set by using a Par-King image enhancement algorithm;
step3, extracting HOG (histogram of gradient directions) characteristics of the image processed in Step2 to obtain corresponding characteristic positive samples and characteristic negative samples;
step4, training two different classifiers for the extracted training set feature samples, wherein one classifier is a common overall SVM classifier, and the other classifier is a local SVR combined classifier on a frequency domain;
and Step5, carrying out joint matching on the feature samples of the test set by the two classifiers by using an image pyramid matching algorithm to obtain the optimal detection position of the target.
2. The visual inspection method for the high-precision gear in the industrial scene according to claim 1 is characterized in that: in Step2, the image enhancement method is as follows:
a gray image X with a gray level L is regarded as a fuzzy point array, the gray image X is converted into a fuzzy domain by utilizing a membership function, and then is converted into an image domain after being enhanced by an enhancement function to obtain an enhanced image;
because the Par-King algorithm processes a single-channel gray image, and the training set and test set image data are RGB three-channel images, the Par-King algorithm is used for independently enhancing three channels in one frame of image, and then the three channels of RGB images are combined to obtain a final enhanced image.
3. The visual inspection method for the high-precision gear in the industrial scene according to claim 1 is characterized in that:
in Step3, the HOG feature extraction method comprises the following steps:
selecting the maximum value of the gradient magnitudes of the three color channels at each pixel location as the output; dividing the gradient directions of [0, 2π) evenly into 2×k bins, where k = 1, 2, 3, ..., extracting the improved HOG features with the vlfeat function library, and finally obtaining a (4+3×k)-dimensional HOG feature vector for each region block;
traversing the sample image by using a block feature vector calculation rule, wherein the traversal step length is cellsize; for an RGB image of size w × h × 3, the number of horizontal area blocks hogw and the number of vertical area blocks hogh are:
hogw=(w+cellsize/2)/cellsize (1)
hogh=(h+cellsize/2)/cellsize (2)
the dimension of the HOG feature matrix of the sample image obtained finally is hogw × hogh × (4+3 × k).
4. The visual inspection method for the high-precision gear in the industrial scene according to claim 1 is characterized in that: in Step4, the classifier training method is as follows:
step4.1, to train the SVM model, each extracted feature sample has size m1×n1×p, with a feature positive samples and b1 feature negative samples; each feature sample is flattened into a 1×(m1×n1×p)-dimensional feature vector, an SVM classifier is trained from the (a+b1)×(m1×n1×p)-dimensional feature matrix, and the classifier is then reshaped back to m1×n1×p as the final SVM model w1;
step4.2, to train the SVR model, each extracted feature sample has size m2×n2×p, with a feature positive samples and b2 feature negative samples; the extracted features are Fourier-transformed, an independent hyperplane model is trained for each block of all feature samples in the frequency domain, the blocks are then combined into an overall hyperplane model according to their original spatial positions, and an inverse Fourier transform yields the final SVR model w2.
5. The visual inspection method for the high-precision gear in the industrial scene according to claim 1 is characterized in that: in Step5, the joint matching method comprises the following steps:
because the size of the target to be detected varies and the target is imaged from different angles, a standard image pyramid is built by repeated smoothing and downsampling, i.e., a series of images of different sizes is generated from one image; the learned models w1 and w2 are then each dot-multiplied with the features of every pyramid level, giving the scores of the two models at different scales and positions as output; an overall score is defined for each root position, the optimal target position and its score are selected for each of the two models, the scores of the two detected target positions are compared, and the higher-scoring position is output as the final position.
CN201710974598.8A 2017-10-19 2017-10-19 High-precision gear visual detection method in industrial scene Active CN107886539B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710974598.8A CN107886539B (en) 2017-10-19 2017-10-19 High-precision gear visual detection method in industrial scene

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710974598.8A CN107886539B (en) 2017-10-19 2017-10-19 High-precision gear visual detection method in industrial scene

Publications (2)

Publication Number Publication Date
CN107886539A CN107886539A (en) 2018-04-06
CN107886539B true CN107886539B (en) 2021-05-14

Family

ID=61781831

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710974598.8A Active CN107886539B (en) 2017-10-19 2017-10-19 High-precision gear visual detection method in industrial scene

Country Status (1)

Country Link
CN (1) CN107886539B (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108875819B (en) * 2018-06-08 2020-10-27 浙江大学 Object and component joint detection method based on long-term and short-term memory network
CN109886932A (en) * 2019-01-25 2019-06-14 中国计量大学 Gear ring of wheel speed sensor detection method of surface flaw based on SVM
CN109948432A (en) * 2019-01-29 2019-06-28 江苏裕兰信息科技有限公司 A kind of pedestrian detection method
CN111651629B (en) * 2019-03-27 2023-08-18 上海铼锶信息技术有限公司 Method and system for constructing full sample data
CN111353526A (en) * 2020-02-19 2020-06-30 上海小萌科技有限公司 Image matching method and device and related equipment

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102054178A (en) * 2011-01-20 2011-05-11 北京联合大学 Chinese painting image identifying method based on local semantic concept
CN103033362A (en) * 2012-12-31 2013-04-10 湖南大学 Gear fault diagnosis method based on improving multivariable predictive models
CN104506162A (en) * 2014-12-15 2015-04-08 西北工业大学 Fault prognosis method for high-order particle filter on basis of LS-SVR (least squares support vector regression) modeling
CN105160434A (en) * 2015-09-15 2015-12-16 武汉大学 Wind power ramp event prediction method by adopting SVM to select forecasting model
WO2016080913A1 (en) * 2014-11-18 2016-05-26 Agency For Science, Technology And Research Method and device for traffic sign recognition
CN105956632A (en) * 2016-05-20 2016-09-21 浙江宇视科技有限公司 Target detection method and device
CN106444703A (en) * 2016-09-20 2017-02-22 西南石油大学 Rotating equipment running state fuzzy evaluation and prediction methods based on occurrence probability of fault modes
CN106503748A (en) * 2016-11-07 2017-03-15 湖南源信光电科技有限公司 A kind of based on S SIFT features and the vehicle targets of SVM training aids
CN106769051A (en) * 2017-03-10 2017-05-31 哈尔滨理工大学 A kind of rolling bearing remaining life Forecasting Methodology based on MCEA KPCA and combination S VR

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9311564B2 (en) * 2012-10-05 2016-04-12 Carnegie Mellon University Face age-estimation and methods, systems, and software therefor

Patent Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102054178A (en) * 2011-01-20 2011-05-11 北京联合大学 Chinese painting image identifying method based on local semantic concept
CN103033362A (en) * 2012-12-31 2013-04-10 湖南大学 Gear fault diagnosis method based on improving multivariable predictive models
WO2016080913A1 (en) * 2014-11-18 2016-05-26 Agency For Science, Technology And Research Method and device for traffic sign recognition
CN104506162A (en) * 2014-12-15 2015-04-08 西北工业大学 Fault prognosis method for high-order particle filter on basis of LS-SVR (least squares support vector regression) modeling
CN105160434A (en) * 2015-09-15 2015-12-16 武汉大学 Wind power ramp event prediction method by adopting SVM to select forecasting model
CN105956632A (en) * 2016-05-20 2016-09-21 浙江宇视科技有限公司 Target detection method and device
CN106444703A (en) * 2016-09-20 2017-02-22 西南石油大学 Rotating equipment running state fuzzy evaluation and prediction methods based on occurrence probability of fault modes
CN106503748A (en) * 2016-11-07 2017-03-15 湖南源信光电科技有限公司 A kind of based on S SIFT features and the vehicle targets of SVM training aids
CN106769051A (en) * 2017-03-10 2017-05-31 哈尔滨理工大学 A kind of rolling bearing remaining life Forecasting Methodology based on MCEA KPCA and combination S VR

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
"Unified modeling based on SVM and SVR for prediction of forest area ratio by human population density and relief energy";Ryuei Nishii et.al.;《IGARSS》;20151231;第2552-2555页 *
"基于SVR的组合预测模型及其应用";刘显德,高泓;《计算机工程与设计》;20091231;第19卷(第30期);第4506-4508页 *

Also Published As

Publication number Publication date
CN107886539A (en) 2018-04-06

Similar Documents

Publication Publication Date Title
CN107886539B (en) High-precision gear visual detection method in industrial scene
CN110543837B (en) Visible light airport airplane detection method based on potential target point
CN109800824B (en) Pipeline defect identification method based on computer vision and machine learning
CN107657279B (en) Remote sensing target detection method based on small amount of samples
CN105956582B (en) A kind of face identification system based on three-dimensional data
CN105574527B (en) A kind of quick object detecting method based on local feature learning
CN107767387B (en) Contour detection method based on variable receptive field scale global modulation
CN107230203B (en) Casting defect identification method based on human eye visual attention mechanism
CN110399884B (en) Feature fusion self-adaptive anchor frame model vehicle detection method
CN106485651B (en) The image matching method of fast robust Scale invariant
CN104036284A (en) Adaboost algorithm based multi-scale pedestrian detection method
CN102930300B (en) Method and system for identifying airplane target
CN105335725A (en) Gait identification identity authentication method based on feature fusion
CN108710909B (en) Counting method for deformable, rotary and invariant boxed objects
CN107256547A (en) A kind of face crack recognition methods detected based on conspicuousness
CN106709452B (en) Instrument position detection method based on intelligent inspection robot
CN110222661B (en) Feature extraction method for moving target identification and tracking
CN104298995A (en) Three-dimensional face identification device and method based on three-dimensional point cloud
CN112257711B (en) Method for detecting damage fault of railway wagon floor
Chu et al. Strip steel surface defect recognition based on novel feature extraction and enhanced least squares twin support vector machine
CN105809173A (en) Bionic vision transformation-based image RSTN (rotation, scaling, translation and noise) invariant attributive feature extraction and recognition method
CN114863189B (en) Intelligent image identification method based on big data
CN114821358A (en) Optical remote sensing image marine ship target extraction and identification method
CN105354547A (en) Pedestrian detection method in combination of texture and color features
CN107679467B (en) Pedestrian re-identification algorithm implementation method based on HSV and SDALF

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant