CN113537410A - Universal automatic balancing method for deep learning positive samples - Google Patents


Info

Publication number
CN113537410A
Authority
CN
China
Prior art keywords
image; small; samples; sample; contrast
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202111071518.0A
Other languages
Chinese (zh)
Other versions
CN113537410B (en)
Inventor
都卫东
王岩松
王天翔
吴健雄
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Focusight Technology Co Ltd
Original Assignee
Focusight Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Focusight Technology Co Ltd filed Critical Focusight Technology Co Ltd
Priority to CN202111071518.0A priority Critical patent/CN113537410B/en
Publication of CN113537410A publication Critical patent/CN113537410A/en
Application granted granted Critical
Publication of CN113537410B publication Critical patent/CN113537410B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00: Pattern recognition
    • G06F 18/20: Analysing
    • G06F 18/24: Classification techniques
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/08: Learning methods

Abstract

The invention relates to a universal automatic balancing method for deep learning positive samples, comprising: S1, determining the order in which all images are traversed and processed according to the sample-subset distribution; S2, selecting the optimal channel for each image; S3, taking sliding-window screenshots on the optimal channel, computing the attributes of each captured small image, and classifying the small images by those attributes; S4, determining the number of small images to capture from each large image according to the sample-subset distribution and the total number of required samples; S5, balancing the required quantities according to the per-class quantity ratios; S6, selecting samples according to the required quantities; and S7, sending the resulting samples, matching the required number, into a neural network for training. The method solves the problem of a high over-kill rate in the trained network caused by uneven selection and distribution of the positive samples; it is universal, requires no manual intervention, and runs fully automatically.

Description

Universal automatic balancing method for deep learning positive samples
Technical Field
The invention relates to the technical field of machine-vision image inspection, and in particular to a universal automatic balancing method for deep learning positive samples.
Background
Because the resolution of industrial inspection images is too large, they cannot be fed directly into a neural network when deep learning is used; they must first be processed with a traditional algorithm that extracts suspicious defects, after which a crop of fixed size centered on each suspicious defect is taken, and the resulting small defect images are sent to the network for further judgment. The above describes the inference phase, i.e., the usage phase of the neural network.
In the training stage, negative samples are obtained by cropping the image around each manual annotation; when selecting positive samples, the most primitive method is to slide a window directly over a defect-free image and take every small image produced by the sliding crop as a positive sample.
Later, to avoid an imbalance between the numbers of positive and negative samples, the required number of positive samples was determined at a ratio of n to 1 relative to the negative samples (n is generally 3), coordinates were selected at random on a defect-free image, and n small positive-sample images were cropped.
However, in the scheme that crops the image with a sliding window and takes all resulting small images as positive samples, the number of defects in a real application environment is small while the number of positive samples obtained this way is large, so the positive and negative sample counts become unbalanced, which leads to a high over-kill (false-rejection) rate.
The random-selection method suppresses the numerical imbalance between positive and negative samples, but introduces a new problem:
for most industrial product images, such as glass or mobile phone back shells, when the surface of a product has no defect, the region located at the non-edge is often a flat region, that is, the gray scale aberration is not large, and the number of samples of the type is most; the gray level difference of the edge area is larger, the contrast is higher, the shape difference between the samples in the class is larger than that of the flat area, the diversity requirement of the samples in theory is met, and the occupied quantity is small; this results in that, if a random selection method is used, the probability that the edge area sample is selected is small, the probability that the edge area sample is missed is high, and even if the edge area sample is selected, the number of the edge area sample is small, and finally the trained network is over killed in the edge area.
For the above problems, an equalization scheme based on the distribution of coordinate positions can generally be designed for different inspected products and different imaging effects, but the algorithm must be tailored to each specific product and imaging effect, so it is not universally applicable; it also requires threshold tuning and therefore cannot be automated.
Moreover, industrial inspection images are often acquired on different inspection devices whose imaging conditions are not exactly the same. If the positive samples are all captured from one or a few images on a single device, samples reflecting the imaging conditions of the other devices will be missing, so the trained network works normally on the device from which the positive samples were selected but shows higher over-kill on the other devices.
Disclosure of Invention
The technical problem to be solved by the invention is as follows: in industrial inspection image processing, when positive samples are drawn, the selected positive samples may lack diversity, coverage, and representativeness, resulting in a high over-kill rate for the network trained on them.
The technical scheme adopted by the invention to solve this problem is a universal automatic balancing method for deep learning positive samples, comprising the following steps:
S1, numbering the images collected on each of several machines to form sample subsets, and determining the order in which all images are traversed according to the sample-subset distribution;
S2, selecting the optimal channel for each image;
S3, taking sliding-window screenshots on the optimal channel, computing the attributes of each captured small image, and classifying the small images by those attributes;
S4, determining the number of small images to capture from each large image according to the sample-subset distribution and the total number of required samples;
S5, according to the per-class quantity ratios, suppressing the low-relative-contrast subclass within the low-contrast class, increasing the high-relative-contrast subclass of the low-contrast class and the high-contrast class according to an enhancement strategy, and finally determining how many small images of each class to capture from each large image;
S6, once the required number of each class of small image on each large image is determined, selecting among the small images obtained by window sliding on the optimal channel according to a connected-domain balanced selection strategy, mapping each selected screenshot window back onto the original large image, and cropping the small image at that window position on the original image as a sample;
S7, sending the resulting samples, matching the required number, into a neural network for training;
and S8, selecting small images from those not yet selected, according to the established completion strategy, until the required sample number is reached.
The method classifies the positive samples obtained by sliding-window cropping using a binary tree combined with classification features, then selects samples from the resulting classes by balanced selection. This guarantees the diversity of the positive samples, ensures that positive samples that are rare under a particular feature pattern are not omitted, and reduces the over-kill rate of the trained network. The method solves the problem of a high over-kill rate caused by an uneven distribution of selected positive samples, is universal, requires no manual intervention, and runs fully automatically.
The invention has the beneficial effects that:
1. For the problem that random sampling may miss edge regions while a coordinate-based equalization scheme requires a per-product algorithm and is therefore not universal, the invention classifies samples using a binary tree combined with contrast and average gray level; classification and balancing are based on features rather than coordinate distribution, so the algorithm is universally applicable.
2. For the problem that feature-based classification requires threshold tuning and cannot classify and balance automatically, the invention uses feature-distribution statistics combined with an unsupervised automatic classifier to select the optimal image channel and determine the optimal classification thresholds automatically, so the algorithm determines its thresholds and optimizes itself without intervention.
3. For the problem that selecting positive samples from a single image or a single device leads to high over-kill when devices and imaging conditions differ, the invention divides the images obtained on different devices into separate sample subsets and selects samples in a balanced way across the subsets.
Drawings
Fig. 1 shows the traversal and sampling-crop pattern for the large-image samples.
Fig. 2 shows the logic and steps of sample small-image classification.
Fig. 3 is a schematic diagram of the optimal-channel selection method when the image is a color image.
Detailed Description
The invention will now be described in further detail with reference to the drawings and preferred embodiments. These drawings are simplified schematic views illustrating only the basic structure of the present invention in a schematic manner, and thus show only the constitution related to the present invention.
A universal method for automatic balancing of deep learning positive samples, as shown in figs. 1-3, comprises the following steps:
S1, numbering the images collected on each of several machines to form sample subsets, and determining the order in which all images are traversed according to the sample-subset distribution;
S2, selecting the optimal channel for each image;
S3, taking sliding-window screenshots on the optimal channel, computing the attributes of each captured small image, and classifying the small images by those attributes;
S4, determining the number of small images to capture from each large image according to the sample-subset distribution and the total number of required samples;
S5, according to the per-class quantity ratios, suppressing the low-relative-contrast subclass within the low-contrast class, increasing the high-relative-contrast subclass of the low-contrast class and the high-contrast class according to an enhancement strategy, and finally determining how many small images of each class to capture from each large image;
S6, once the required number of each class of small image on each large image is determined, selecting among the small images obtained by window sliding on the optimal channel according to a connected-domain balanced selection strategy, mapping each selected screenshot window back onto the original large image, and cropping the small image at that window position on the original image as a sample;
S7, sending the resulting samples, matching the required number, into a neural network for training;
and S8, selecting small images from those not yet selected, according to the established completion strategy, until the required sample number is reached.
The following examples are given.
Suppose there are N machines, each of which captures a certain number of images (the counts may differ between machines). Number the machines 1, 2, 3, ..., N, with i (i = 1, 2, 3, ..., N) denoting the machine number; number all images captured by each machine, so the j-th image on machine i is numbered i_j. Each machine i (i = 1, 2, 3, ..., N) can be regarded as one subset.
1. Traversal processing mode
As shown in fig. 1, the order of traversing and processing all sample images follows the sequence numbers 1, 2, 3, ...: traversal starts in the horizontal direction, processes the current row completely, and then moves to the next row;
the first image of the first machine is processed first, then the first image of the second machine, then the first image of the third machine, and so on up to the first image of the N-th (i.e., last) machine; this constitutes one cycle;
then the second image of the first machine is processed, and each image on each machine is processed in turn following this traversal order;
here, "processing" refers to capturing small-image samples from the sample image; the detailed capture strategy is given in the later steps.
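A minimal sketch of this round-robin traversal (the helper name `traversal_order` is hypothetical; per-machine image counts are supplied as a list):

```python
def traversal_order(counts):
    """counts[i] = number of images captured on machine i+1.
    Returns (machine, image) pairs in the cyclic order of step 1:
    one image per machine per cycle, machines visited in order."""
    order = []
    max_images = max(counts)
    for j in range(1, max_images + 1):           # cycle over image index
        for i, n in enumerate(counts, start=1):  # then over machines
            if j <= n:                           # a machine may have fewer images
                order.append((i, j))
    return order

print(traversal_order([2, 3, 1]))
# first cycle: (1,1),(2,1),(3,1); second: (1,2),(2,2); third: (2,3)
```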
2. Selecting an optimal channel:
(1) if the image is a grayscale image, the original image itself is taken as the optimal channel;
(2) if the image is a color image, a first round of the loop shown in fig. 1 is performed in order, and the following operations are applied to each image traversed in the loop:
before the cyclic capture shown in fig. 3 begins, the first cycle is executed once more purely for statistics; the first image of each machine is traversed in first-cycle order without capturing, only statistical data are collected, and the optimal channel is selected from those statistics.
the specific method comprises the following steps: and (3) performing channel decomposition on each image in the traversal (the number of subsets is 3, and the first round of circulation corresponds to 3 images in the graph) to obtain: red channel R, blue channel G, green channel B, color transition Gray scale graph Gray = R0.299 + G0.587 + B0.114, saturation channel S, luminance channel V.
And binarizing images of Gray, R, G, B, S and V, 6 channels decomposed from each big image by using an Otsu method, and subtracting the Gray level of a bright area from the average value of the Gray level of a dark area to obtain the contrast of each channel.
As shown in fig. 3, the channel with the largest contrast is selected from the images of 6 channels, i.e. Gray, R, G, B, S and V, which are decomposed from each map, and is denoted as Idexi, i is the number of the subset (machine); and obtaining the channel with the maximum contrast corresponding to each big image in the first round of cycle, namely Idex1, Idex2 and Idex3, and then selecting the channel selected the most times as the best channel, namely Idex.
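The per-channel contrast measure and the best-channel vote can be sketched in pure Python (8-bit gray values supplied as flat lists; `otsu_threshold` and `channel_contrast` are illustrative names, not the patent's):

```python
def otsu_threshold(pixels):
    """Otsu's method on a flat list of 0-255 gray values."""
    hist = [0] * 256
    for v in pixels:
        hist[v] += 1
    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_t, best_var, w_b, sum_b = 0, -1.0, 0, 0.0
    for t in range(256):
        w_b += hist[t]                    # background weight
        if w_b == 0:
            continue
        w_f = total - w_b                 # foreground weight
        if w_f == 0:
            break
        sum_b += t * hist[t]
        mean_b = sum_b / w_b
        mean_f = (sum_all - sum_b) / w_f
        var_between = w_b * w_f * (mean_b - mean_f) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t

def channel_contrast(pixels):
    """Light-dark contrast of one channel after Otsu binarization:
    mean(bright region) - mean(dark region), 0 if one region is empty."""
    t = otsu_threshold(pixels)
    bright = [v for v in pixels if v > t]
    dark = [v for v in pixels if v <= t]
    if not bright or not dark:
        return 0.0
    return sum(bright) / len(bright) - sum(dark) / len(dark)

# the channel with the largest contrast wins the vote for this image
channels = {"Gray": [10] * 50 + [200] * 50, "R": [90] * 50 + [110] * 50}
best = max(channels, key=lambda k: channel_contrast(channels[k]))
print(best)  # "Gray"
```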
3. Classification scheme
After the optimal channel has been selected, the traversal screenshots shown in fig. 1 formally begin and are classified, following the traversal order of step 1.
As shown in fig. 2, within each cycle of fig. 1 the following thresholds are obtained by statistics: the contrast threshold Tc, the average-gray threshold Tg, the polymerization-degree threshold Tp, the boundary-entropy threshold Te, the dark-side relative-contrast threshold Tl, and the bright-side relative-contrast threshold Th.
Samples classified into the low-contrast, low-gray class are further divided into low-relative-contrast and high-relative-contrast subclasses, with Tl as the dividing threshold;
similarly, samples classified into the low-contrast, high-gray class are further divided into low-relative-contrast and high-relative-contrast subclasses, with Th as the dividing threshold.
The small images cropped from each large image are classified using thresholds obtained statistically with an automatic classifier; the specific procedure is as follows:
(1) Traverse the current round, i.e., the current row shown in fig. 1. According to the small-image size required for training the neural network, slide a window of height H and width W over each large image from the upper-left corner to the lower-right corner in row-major order, using H/2 and W/2 as the vertical and horizontal step lengths, and crop the small images in sequence; compute each small image's attribute values for classification as follows:
contrast c:
before sliding, binarize the large image currently being cropped using Otsu's method to obtain the bright (higher-gray) and dark (lower-gray) regions of the large image; each small image cropped as the window slides is mapped back to the large image via the window's coordinate position to obtain the corresponding bright and dark regions, and c is computed as follows:
if the large-image position corresponding to the small-image window contains only a bright region or only a dark region, c = 0;
if both a bright and a dark region are present, c = MeanLight − MeanDark, where MeanLight and MeanDark are the average gray levels of the bright and dark regions, respectively, at the large-image position corresponding to the small-image window;
average gray g:
i.e., the average gray level of the cropped small image;
degree of polymerization p:
if the large-image position corresponding to the small-image window contains only a bright region or only a dark region, the polymerization degree is 1;
if both bright and dark regions are present, decompose each into connected domains; compute the area of each bright-region connected domain, AreaLights = [AreaLight1, AreaLight2, AreaLight3, ...], and of each dark-region connected domain, AreaDarks = [AreaDark1, AreaDark2, AreaDark3, ...], then apply the formula p = min(max(AreaLights), max(AreaDarks)) / s;
that is, take the maximum connected-domain area within the bright region and within the dark region, take the smaller of these two maxima, and divide by the small-image area s = W × H to obtain the polymerization degree p.
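A sketch of the polymerization-degree computation on a small binarized patch (0 = dark, 1 = bright; 4-connectivity is assumed, since the text does not specify the connectivity):

```python
from collections import deque

def _component_areas(mask):
    """Areas of 4-connected components of True cells in a 2-D mask."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    areas = []
    for y in range(h):
        for x in range(w):
            if mask[y][x] and not seen[y][x]:
                area, q = 0, deque([(y, x)])
                seen[y][x] = True
                while q:
                    cy, cx = q.popleft()
                    area += 1
                    for ny, nx in ((cy-1, cx), (cy+1, cx), (cy, cx-1), (cy, cx+1)):
                        if 0 <= ny < h and 0 <= nx < w and mask[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            q.append((ny, nx))
                areas.append(area)
    return areas

def polymerization_degree(binary):
    """p = min(max bright-component area, max dark-component area) / (W*H);
    1.0 when the patch is all bright or all dark, as in step 3."""
    h, w = len(binary), len(binary[0])
    bright = _component_areas([[v == 1 for v in row] for row in binary])
    dark = _component_areas([[v == 0 for v in row] for row in binary])
    if not bright or not dark:
        return 1.0
    return min(max(bright), max(dark)) / (w * h)

patch = [[1, 1, 0, 0],
         [1, 1, 0, 0],
         [0, 0, 0, 0],
         [0, 0, 0, 1]]
print(polymerization_degree(patch))  # min(max bright=4, max dark=11)/16 = 0.25
```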
Boundary entropy e:
if the large-image position corresponding to the small-image window contains only a bright region or only a dark region, e = 0;
if both a bright and a dark region are present, then
e = −Σ_{ig=0}^{255} p_ig · log(p_ig)
where ig ranges over the pixel gray values 0-255 and p_ig is the ratio of the number of pixels with gray level ig in the boundary region to the total number of pixels in the boundary region; the boundary region is the band obtained by expanding the border between the bright and dark regions outward by a radius r, r being defined by the formula shown in the accompanying figure.
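Under the usual reading of the formula above as a Shannon entropy over the boundary band's gray histogram, a sketch is (extraction of the band itself is assumed done; only its pixel values are passed in):

```python
import math

def boundary_entropy(gray_values):
    """e = -sum p_ig * log(p_ig) over the gray histogram of the boundary
    band between the bright and dark regions; 0 for an empty or
    uniform band."""
    if not gray_values:
        return 0.0
    hist = {}
    for v in gray_values:
        hist[v] = hist.get(v, 0) + 1
    n = len(gray_values)
    return -sum((c / n) * math.log(c / n) for c in hist.values())

print(boundary_entropy([100] * 8))               # uniform band -> 0.0
print(round(boundary_entropy([0, 255] * 4), 4))  # two equal bins -> log(2) ~ 0.6931
```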
relative contrast:
this is divided into cl, the relative contrast of samples with low average gray, and ch, the relative contrast of samples with high average gray;
both cl and ch are computed as follows:
apply Otsu threshold segmentation to the small image; if the segmentation result contains only a bright region or only a dark region, the relative contrast is 0;
if the segmentation result contains both bright and dark regions, the relative contrast is the difference between their average gray levels.
The complementary polymerization operation is as follows:
in fig. 2, the complementary polymerization operation is applied to the low-polymerization-degree class. Specifically, for a high-contrast sample falling into the low-polymerization class, compare the areas of the smallest connected domains in the bright and dark regions; form a translation vector from the center of the larger-area connected domain (origin) to the center of the smaller-area connected domain (end point); then slide the screenshot window of the current small image along this vector with a step length of 5, bounded by min(W, H)/2. If the polymerization degree exceeds 0.3 at any point during sliding, replace the original small image with the one captured by the current window; if no small image exceeds 0.3 after sliding completes, replace the original with the small image of largest polymerization degree among all those obtained during sliding.
(2) After the classification attribute values of all small images in the current round have been obtained as above, all sample attribute values to be classified are sent to a smoothed-histogram classifier following the binary-tree logic of fig. 2. The classification threshold for the attribute currently in use is computed as follows:
normalize the attribute values to the range 0-255, determine the classification threshold with a Gaussian-smoothed histogram method, then map the threshold found in the 0-255 range back to the original range to obtain the attribute's classification threshold;
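A hypothetical sketch of this normalize-smooth-threshold procedure (the text does not give the smoothing width or the valley-picking rule, so the sigma value and the two-peaks-then-valley choice below are assumptions):

```python
import math

def smoothed_histogram_threshold(values, sigma=3.0):
    """Normalize attribute values to 0-255, Gaussian-smooth the histogram,
    take the lowest valley between the two dominant peaks as the split
    point, and map it back to the original value range."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return lo
    hist = [0.0] * 256
    for v in values:
        hist[round((v - lo) * 255 / (hi - lo))] += 1
    # Gaussian smoothing with replicated borders
    radius = int(3 * sigma)
    kernel = [math.exp(-j * j / (2 * sigma * sigma)) for j in range(-radius, radius + 1)]
    ksum = sum(kernel)
    smooth = [sum(hist[min(max(i + j, 0), 255)] * kernel[j + radius]
                  for j in range(-radius, radius + 1)) / ksum
              for i in range(256)]
    # the two highest local maxima are taken as the dominant peaks
    peaks = [i for i in range(1, 255)
             if smooth[i] >= smooth[i - 1] and smooth[i] >= smooth[i + 1]]
    peaks.sort(key=lambda i: smooth[i], reverse=True)
    if len(peaks) < 2:
        return lo + (hi - lo) / 2  # degenerate histogram: fall back to midpoint
    p1, p2 = sorted(peaks[:2])
    valley = min(range(p1, p2 + 1), key=lambda i: smooth[i])
    return lo + valley * (hi - lo) / 255  # map back to the original range

t = smoothed_histogram_threshold([0, 100] + [30] * 100 + [70] * 100)
print(30 < t < 70)  # the threshold falls between the two modes
```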
All calculations in this step are performed on the optimal channel selected in step 2.
4. Demand quantity allocation
Let N be the demand given by network training, i.e., a total of N sample small images are required; let n_sub be the number of subsets of large images obtained from different devices (e.g., n_sub = 3 in fig. 1), and n_min the number of images in the subset with the fewest images (e.g., n_min = 3 in fig. 1);
in the loop shown in fig. 1, the number of small samples to take from each large image is need_image = N / (n_sub × n_min); on the large image currently being processed, as shown in fig. 2, the required number of sample small images for class 1 (low average gray), class 2 (high average gray), class 3 (low boundary entropy), and class 4 (high boundary entropy) is need_i = need_image / 4.
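The allocation arithmetic as a sketch (integer division is an assumption; the text does not say how fractional demands are rounded):

```python
def allocate(N, n_sub, n_min):
    """need_image = N / (n_sub * n_min) small samples per large image,
    split evenly over the four first-level classes."""
    need_image = N // (n_sub * n_min)  # assumed integer division
    need_i = need_image // 4
    return need_image, need_i

print(allocate(3600, 3, 3))  # (400, 100): 400 per large image, 100 per class
```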
5. demand quantity balancing
Taking class 1 as an example: following the method and description of step "3. Classification scheme", class 1 is split by Tl into class 1.1 (low relative contrast) and class 1.2 (high relative contrast);
class 1 corresponds to the low-contrast, low-gray class of step 3; the samples of that class are divided again into low-relative-contrast and high-relative-contrast subclasses, the former being class 1.1 and the latter class 1.2;
let their counts on the current large image be n1 and n2 respectively, so their ratio is n2/n1; the class-1 demand is distributed to classes 1.1 and 1.2 in inverse proportion to these counts, with each class receiving at least 1, i.e., need_1.1 = max(1, need_1 × (n2/(n1+n2))), need_1.2 = max(1, need_1 × (n1/(n1+n2)));
where need_1 is the class-1 demand on the current large image, and need_1.1 and need_1.2 are the quantities allocated to classes 1.1 and 1.2.
Let the numbers of class-1.1 and class-1.2 samples on the current large image be n_1.1 and n_1.2 respectively.
If n_1.2 < need_1.2, the shortfall is assigned to classes 3 and 4,
i.e.
need_3 = need_3 + [(need_1 − (need_1.1 + n_1.2))/2];
need_4 = need_4 + [(need_1 − (need_1.1 + n_1.2))/2].
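The inverse-proportion split and the shortfall reassignment can be sketched as follows (the rounding behavior is an assumption; function names are illustrative):

```python
def balance_class1(need_1, n1, n2):
    """Split need_1 between classes 1.1 and 1.2 in inverse proportion to
    their observed counts n1 and n2, each share at least 1 (step 5)."""
    total = n1 + n2
    need_11 = max(1, round(need_1 * n2 / total))  # rare class gets the larger share
    need_12 = max(1, round(need_1 * n1 / total))
    return need_11, need_12

def reassign_shortfall(need_1, need_11, n_12, need_3, need_4):
    """If class 1.2 has only n_12 < need_1.2 samples, the shortfall
    need_1 - (need_1.1 + n_1.2) is split evenly onto classes 3 and 4."""
    shortfall = need_1 - (need_11 + n_12)
    half = shortfall // 2
    return need_3 + half, need_4 + (shortfall - half)

# 100 samples wanted for class 1; 90 low-relative-contrast vs 10 high
need_11, need_12 = balance_class1(100, 90, 10)
print(need_11, need_12)  # 10 90
```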
Similarly, classes 2.1 and 2.2 within class 2 correspond to the subclasses with relative contrast below and above Th in step "3. Classification scheme", and their required numbers are balanced by the same method.
6. Selecting samples by required number
The screenshots saved in this step are obtained by performing the screenshot operation on the original large image.
The required number of each class of samples on each large image in the cycle is determined as above; the classified sample small images are then selected according to the required numbers, as follows:
(1) if the number of small images of the current class on the current image, has_i, is less than or equal to the class's required number need_i on that image (i denotes the class, i = 1.1, 1.2, 2.1, 2.2, 3, 4), all small images of that class on the image are selected and saved directly;
(2) if has_i > need_i, select as follows:
map the positions of all windows of the current class on the current image back onto the large image, so that the windows form rectangular areas on the large image, then apply connected-domain processing to the resulting area.
Let there be n_c connected domains. Traverse the 1st, 2nd, ..., n_c-th connected domain in turn, randomly selecting one sample from the connected domain currently being visited; if the selected count reaches the required number partway through, the whole sample-balancing step stops and the obtained samples, matching the required number, are sent to the neural network for training.
If the number selected in the first pass is below the required number, traverse the 1st, 2nd, ..., n_c-th connected domain again, randomly selecting one not-yet-selected sample from the current connected domain; if all samples in a connected domain have already been selected, skip it.
Continue selecting samples by this traverse-and-randomly-select procedure over the connected domains, and exit the loop once the number selected reaches the class's required number on that image.
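A sketch of this traverse-and-randomly-select loop over connected domains (samples are represented by ids grouped per domain; the fixed seed exists only to make the sketch reproducible):

```python
import random

def select_balanced(domains, need, seed=0):
    """Round-robin selection over connected domains: each pass visits the
    domains in order and randomly picks one not-yet-selected sample from
    each non-empty domain, until `need` samples are chosen or all
    domains are exhausted. `domains` is a list of lists of sample ids."""
    rng = random.Random(seed)
    remaining = [list(d) for d in domains]
    chosen = []
    while len(chosen) < need and any(remaining):
        for dom in remaining:
            if len(chosen) >= need:
                break
            if dom:  # skip fully consumed domains
                chosen.append(dom.pop(rng.randrange(len(dom))))
    return chosen

picked = select_balanced([[1, 2, 3], [4], [5, 6]], need=4)
print(sorted(picked))  # one sample per domain first, then a second pass
```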
7. Shortfall and completion handling
The shortfall handling is as follows:
in the loop above, if the number of samples of the current class on the current image is less than the number required of that class for that image, all samples of that class on the image are selected.
The completion handling is as follows:
as shown in fig. 1, after all cycles are finished, if the number of selected small images N_select has not reached the total demand N, count the total number N_reduce of all unselected samples in classes 1.2, 2.2, 3, and 4; if these cannot fill, or exactly fill, the gap, i.e., N_reduce <= N − N_select, select all unselected small images of those classes and merge them into the final set of selected samples.
If N_reduce > N − N_select, traverse classes 1.2, 2.2, 3, and 4 in turn, randomly selecting one small image from those not yet selected in the current class, and repeat the loop until the total number selected (i.e., the total of small images selected in this completion process plus those selected above) reaches the total demand, stopping immediately once it does.
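The completion strategy, sketched with leftovers grouped by class (the class keys and the fixed traversal order 1.2, 2.2, 3, 4 follow the text; everything else is illustrative):

```python
import random

def complete(selected, leftovers_by_class, N, seed=0):
    """If the selected count misses the total demand N, draw from the
    unselected samples of classes 1.2, 2.2, 3, 4: take all of them when
    they just fill (or cannot fill) the gap, otherwise round-robin
    random picks until N is reached."""
    rng = random.Random(seed)
    gap = N - len(selected)
    pool = {k: list(v) for k, v in leftovers_by_class.items()}
    n_reduce = sum(len(v) for v in pool.values())
    if n_reduce <= gap:               # cannot (or exactly) fill: take everything
        for v in pool.values():
            selected.extend(v)
        return selected
    order = ["1.2", "2.2", "3", "4"]  # fixed class traversal order from the text
    while len(selected) < N:
        for cls in order:
            if len(selected) >= N:
                break
            if pool[cls]:
                selected.append(pool[cls].pop(rng.randrange(len(pool[cls]))))
    return selected

out = complete(list(range(7)), {"1.2": [70], "2.2": [80, 81], "3": [], "4": [90]}, N=10)
print(len(out))  # 10
```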
While particular embodiments of the present invention have been described in the foregoing specification, various modifications and alterations to the previously described embodiments will become apparent to those skilled in the art from this description without departing from the spirit and scope of the invention.

Claims (9)

1. A universal automatic balancing method for deep learning positive samples, characterized by comprising the following steps:
S1, numbering the images collected on each of several machines to form sample subsets, and determining the order in which all images are traversed according to the sample-subset distribution;
S2, selecting the optimal channel for each image;
S3, taking sliding-window screenshots on the optimal channel, computing the attributes of each captured small image, and classifying the small images by those attributes;
S4, determining the number of small images to capture from each large image according to the sample-subset distribution and the total number of required samples;
S5, according to the per-class quantity ratios, suppressing the low-relative-contrast subclass within the low-contrast class, increasing the high-relative-contrast subclass of the low-contrast class and the high-contrast class according to an enhancement strategy, and finally determining how many small images of each class to capture from each large image;
S6, once the required number of each class of small image on each large image is determined, selecting among the small images obtained by window sliding on the optimal channel according to a connected-domain balanced selection strategy, mapping each selected screenshot window back onto the original large image, and cropping the small image at that window position on the original image as a sample;
and S7, sending the resulting samples, matching the required number, into a neural network for training.
2. The universal automatic balancing method for deep learning positive samples according to claim 1, characterized by further comprising the following step:
S8, selecting small images from all small images not yet selected, according to an established completion strategy, so as to reach the required number of samples.
3. The universal automatic balancing method for deep learning positive samples according to claim 1 or 2, characterized in that: in step S2, if the image is a grayscale image, the image itself is used as the optimal channel; if the image is a color image, the image is decomposed into six channel images: grayscale conversion, red, green, blue, saturation, and brightness; among these six images, the one with the highest bright-dark contrast, obtained by a bright-dark contrast method based on global binarization, is selected as the optimal channel of the image; after optimal-channel selection has been performed for all images, the channel chosen most often is counted and used as the optimal channel.
4. The universal automatic balancing method for deep learning positive samples according to claim 1 or 2, characterized in that: in step S3, according to the small-image size required by the neural network to be trained, a window of height H and width W is slid row by row from the top-left corner to the bottom-right corner of each large image, with H/2 and W/2 as the vertical and horizontal step sizes, and small images are cropped in sequence; the attribute values used for classification are calculated for each small image; all sample attribute values to be classified are then fed into a smoothed-histogram classifier and classified according to binary-tree logic.
5. The universal automatic balancing method for deep learning positive samples according to claim 4, characterized in that: the attribute values used to classify the small images comprise the contrast c, the average gray level g, the aggregation degree p, and the boundary entropy e; the thresholds for the classification attribute values are: a contrast threshold Tc, an average gray threshold Tg, an aggregation-degree threshold Tp, a boundary entropy threshold Te, a dark contrast threshold Tl, and a bright contrast threshold Th.
6. The universal automatic balancing method for deep learning positive samples according to claim 5, characterized by having a supplementary aggregation step for the low-aggregation-degree classes, specifically as follows:
among the small images classified as high contrast but with a low aggregation degree, the areas of the smallest connected domains in the bright region and the dark region are compared, and a translation vector is formed with the center of the larger connected domain as its origin and the center of the smaller connected domain as its endpoint; along this vector direction, the capture window corresponding to the current small image slides with a set distance as the step length and min(W, H)/2 as the boundary; if the aggregation degree exceeds a set value during sliding, the small image captured by the current window replaces the original small image; if the sliding finishes without the aggregation degree of any small image exceeding the set value, the small image with the largest aggregation degree among all small images obtained during sliding replaces the original one.
7. The universal automatic balancing method for deep learning positive samples according to claim 1 or 2, characterized in that: in step S6, the classified small-image samples are selected according to the required number, the selection method being:
1) if the number have_i of small-image samples of the current class in the current image is less than or equal to the required number need_i of that class in the current image, where i denotes the class and i = 1.1, 1.2, 2.1, 2.2, 3, 4, all small images of that class on the image are selected and stored directly;
2) if have_i > need_i, the positions of all capture windows of the current class on the current image are mapped onto the large image, i.e., these windows are composed, as rectangles, into regions on the large image, and connected-domain processing is then performed on the resulting regions.
8. The universal automatic balancing method for deep learning positive samples according to claim 7, characterized in that: in step 2), assuming there are n_c connected domains, the 1st, 2nd, ..., n_c-th connected domains are traversed in sequence, and a sample is randomly selected from the connected domain currently being traversed; if the number of selected samples reaches the required number midway, the entire sample balancing procedure is stopped, and the obtained samples, whose number matches the required number, are fed into a neural network for training; if the number of selected samples is less than the required number, the 1st, 2nd, ..., n_c-th connected domains are traversed again in sequence, and an unselected sample is randomly selected from the connected domain currently being traversed; if all samples in a connected domain have already been selected, that connected domain is skipped.
9. The universal automatic balancing method for deep learning positive samples according to claim 2, characterized in that: in step S8, if the number N_select of selected small images fails to reach the total required number N, it is checked whether the total number N_reduce of all unselected samples in the classes with high relative contrast, low boundary entropy, and high boundary entropy cannot reach, or exactly reaches, the shortfall, i.e., whether N_reduce <= N - N_select; if so, all these unselected small images are selected and merged into the finally selected samples; if not, the classes with high relative contrast, low boundary entropy, and high boundary entropy are traversed in sequence, one small-image sample being randomly selected from the unselected small-image samples of the current class, and this loop is repeated until the total number selected reaches the total required number; the loop is stopped if the total number selected, i.e., the total number of small-image samples selected during the completion process together with the preceding steps, reaches the total required number N midway.
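The connected-domain balanced selection of claim 8 can be illustrated with the following sketch. The representation of each connected domain as a list of candidate samples, and the function name `select_balanced`, are assumptions for illustration; the patent itself does not prescribe a data structure.

```python
import random

def select_balanced(connected_domains, need):
    """Traverse the connected domains in order, randomly taking one
    unselected sample from each; repeat the sweep, skipping exhausted
    domains, until `need` samples are chosen or nothing remains."""
    pools = [list(d) for d in connected_domains]  # working copies
    selected = []
    while len(selected) < need and any(pools):
        for pool in pools:
            if len(selected) >= need:
                break  # required number reached midway: stop
            if pool:  # skip domains whose samples are all selected
                selected.append(pool.pop(random.randrange(len(pool))))
    return selected
```

Sweeping the domains round-robin rather than draining them one by one is what balances the selection across connected domains, as the claim intends.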
CN202111071518.0A 2021-09-14 2021-09-14 Universal automatic balancing method for deep learning positive samples Active CN113537410B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111071518.0A CN113537410B (en) 2021-09-14 2021-09-14 Universal automatic balancing method for deep learning positive samples


Publications (2)

Publication Number Publication Date
CN113537410A true CN113537410A (en) 2021-10-22
CN113537410B CN113537410B (en) 2021-12-07

Family

ID=78092468

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111071518.0A Active CN113537410B (en) 2021-09-14 2021-09-14 Universal automatic balancing method for deep learning positive samples

Country Status (1)

Country Link
CN (1) CN113537410B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114612799A (en) * 2022-03-11 2022-06-10 应急管理部国家自然灾害防治研究院 Space self-adaptive positive and negative sample generation method and system based on landslide/non-landslide area ratio

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103996050A (en) * 2014-05-08 2014-08-20 清华大学深圳研究生院 Guard net detection method based on Fourier spectrum under polar coordinates
CN106022338A (en) * 2016-05-23 2016-10-12 麦克奥迪(厦门)医疗诊断系统有限公司 Automatic ROI (Regions of Interest) detection method of digital pathologic full slice image
CN106898035A (en) * 2017-01-19 2017-06-27 博康智能信息技术有限公司 A kind of dress ornament sample set creation method and device
CN107480628A (en) * 2017-08-10 2017-12-15 苏州大学 A kind of face identification method and device
CN107679074A (en) * 2017-08-25 2018-02-09 百度在线网络技术(北京)有限公司 A kind of Picture Generation Method and equipment
CN109711228A (en) * 2017-10-25 2019-05-03 腾讯科技(深圳)有限公司 A kind of image processing method that realizing image recognition and device, electronic equipment
CN111199214A (en) * 2020-01-04 2020-05-26 西安电子科技大学 Residual error network multispectral image ground feature classification method


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
WU Chunzhi et al.: "An effective imbalanced-sample generation method and its application in planetary gearbox fault diagnosis", Acta Armamentarii *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114612799A (en) * 2022-03-11 2022-06-10 应急管理部国家自然灾害防治研究院 Space self-adaptive positive and negative sample generation method and system based on landslide/non-landslide area ratio
CN114612799B (en) * 2022-03-11 2022-09-16 应急管理部国家自然灾害防治研究院 Space self-adaptive positive and negative sample generation method and system based on landslide/non-landslide area ratio

Also Published As

Publication number Publication date
CN113537410B (en) 2021-12-07

Similar Documents

Publication Publication Date Title
US20190362484A1 (en) Patch selection for neural network based no-reference image quality assessment
CN107871316B (en) Automatic X-ray film hand bone interest area extraction method based on deep neural network
CN110619333B (en) Text line segmentation method, text line segmentation device and electronic equipment
CN103971134B (en) Image classification, retrieval and bearing calibration, and related device
CN108664839B (en) Image processing method and device
CN111695373B (en) Zebra stripes positioning method, system, medium and equipment
CN109740721A (en) Wheat head method of counting and device
CN113792827B (en) Target object recognition method, electronic device, and computer-readable storage medium
CN109035254A (en) Based on the movement fish body shadow removal and image partition method for improving K-means cluster
US20190272627A1 (en) Automatically generating image datasets for use in image recognition and detection
CN108241821A (en) Image processing equipment and method
CN113537410B (en) Universal automatic balancing method for deep learning positive samples
CN110837809A (en) Blood automatic analysis method, blood automatic analysis system, blood cell analyzer, and storage medium
CN114581723A (en) Defect classification method, device, storage medium, equipment and computer program product
CN109509188A (en) A kind of transmission line of electricity typical defect recognition methods based on HOG feature
CN113222959A (en) Fresh jujube wormhole detection method based on hyperspectral image convolutional neural network
CN112347805A (en) Multi-target two-dimensional code detection and identification method, system, device and storage medium
CN108305270A (en) A kind of storage grain worm number system and method based on mobile phone photograph
CN111046782A (en) Fruit rapid identification method for apple picking robot
CN110826571A (en) Image traversal algorithm for image rapid identification and feature matching
CN111667509B (en) Automatic tracking method and system for moving target under condition that target and background colors are similar
CN106920266A (en) The Background Generation Method and device of identifying code
CN107545565A (en) A kind of solar energy half tone detection method
CN108711139B (en) One kind being based on defogging AI image analysis system and quick response access control method
CN115546141A (en) Small sample Mini LED defect detection method and system based on multi-dimensional measurement

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
PE01 Entry into force of the registration of the contract for pledge of patent right

Denomination of invention: A Universal Deep Learning Positive Sample Automatic Equalization Method

Effective date of registration: 20231226

Granted publication date: 20211207

Pledgee: Industrial and Commercial Bank of China Changzhou Wujin Branch

Pledgor: FOCUSIGHT TECHNOLOGY Co.,Ltd.

Registration number: Y2023980074344
