CN113221881A

CN113221881A - Multi-level smart phone screen defect detection method

Info

Publication number: CN113221881A
Application number: CN202110489131.0A
Authority: CN
Inventors: 陈垣毅
Original assignee: Hangzhou City University
Current assignee: Hangzhou Zhenwei Food Collection Technology Co ltd
Priority date: 2021-04-30
Filing date: 2021-04-30
Publication date: 2021-08-06
Anticipated expiration: 2041-04-30
Also published as: CN113221881B

Abstract

The invention relates to a multi-level smartphone screen defect detection method comprising the following steps: manually labeling defects in the image, then separating the foreground defect image from the background image; performing multi-scale defect-region feature extraction on the data set with a deep residual network and a feature pyramid network to obtain multi-scale feature images; and extracting the target region of interest. The beneficial effects of the invention are as follows: extracting the region of interest improves the efficiency of image preprocessing; it addresses the problem that existing general-purpose object detection techniques suffer strong background interference in the small-object detection task of mobile phone screen defect detection; and a targeted augmentation scheme is provided for the shortage of data in real-world deployment, which greatly reduces the up-front manual labeling effort and further improves the efficiency of the overall scheme.

Description

Multi-level smart phone screen defect detection method

Technical Field

The invention belongs to the field of mobile phone screen defect detection, and particularly relates to a multi-level smart phone screen defect detection method.

Background

The screen is a core component of a smartphone and the focal point of human-computer interaction, so its quality strongly affects the user experience. Phone manufacturers therefore place increasingly strict requirements on the screen production process. However, screens are easily affected by the production environment and the manufacturing process. To keep phones with defective screens from reaching the market, where they would harm consumers and damage the reputation of screen manufacturers, manufacturers inspect screen quality. The traditional approach stations workers on the production line to inspect each screen by eye, which suffers from low detection efficiency, high labor cost, and the lack of a uniform judgment standard. Some screen defect detection techniques based on traditional computer vision also exist, but most design an algorithm for only one or a few specific defect types; whenever a new defect type appears, a new algorithm must be designed for it, so generality is poor. Specifically, the task of detecting appearance defects on smartphone screens presents the following main difficulties:

1) Defects are strongly correlated with the background. Whether a region of interest contains a defect must be judged together with the background information at its location; the same pattern may or may not be a defect depending on the background. For example, an off-color defect is identified only when the foreground differs significantly from the background image in hue, lightness, and similar properties.

2) Background information is highly disruptive. The irrelevant background between the rectangular image boundary and the irregular target detection region is very likely to contain regions that resemble defects.

3) Defects are varied and ambiguously defined. Because production environments, production standards, and production technologies differ, different defect types appear on screens in different production scenarios. A unified classification standard is currently lacking, and the labeled defect type varies from one annotator to another.

4) Labeled data sets vary in quality. Unlike detection data sets such as VOC and COCO, which are widely accepted and used by the academic community, industrial data sets must be collected specifically for the requirements of each task scenario. Subjectively, annotators have limited understanding of artificial intelligence technology; objectively, defect samples are hard to tell apart. As a result, industrial data sets of smartphone screen defects have two main problems: first, the number of samples differs greatly between categories, so the data are not independent and identically distributed; second, the total number of samples is insufficient.

The first three problems are specific to mobile phone screen appearance defect detection and arise because the definition of a defect varies with the scenario. The fourth is a difficulty common to deploying deep learning algorithms in practice.

Disclosure of Invention

The invention aims to overcome the defects in the prior art and provides a multi-level smart phone screen defect detection method.

The multi-level smartphone screen defect detection method comprises the following steps:

step 1, manually labeling defects in an image, then separating the foreground defect image from the background image; on the basis of the labeled foreground defect image, augmenting the color and size of the foreground defect image to expand the number of samples and the morphological diversity of the foreground defects; finally, combining the augmented foreground defect image with the augmented background image to generate a data set that is reliably annotated, class-balanced, and suited to the actual production scenario;

step 1.1, manually labeling the position and category of the rectangular region containing each foreground defect in a small number of images, then separating the foreground defect image from the background image;

step 1.2, denoising the foreground defect image: image noise is handled by median filtering, which sets the gray value of a target pixel to the median of all pixels in a neighborhood around it, so that pixel values in the neighborhood are closer to the true values and isolated noise is eliminated;

because filtering reduces the sharpness of image contours and edges, the denoised image is sharpened with the Laplacian operator to enhance the difference in gray value between objects and the background:

$$g(x, y) = f(x, y) + c\,\nabla^{2} f(x, y)$$

in the above formula, f(x, y) is the denoised image; ∇²f(x, y) is the Laplacian of the image; c is the enhancement coefficient, generally 1; g(x, y) is the enhanced image;

the hue, sharpness, and brightness of the foreground defect image are randomly adjusted, and the image is rotated by a random angle and scaled by a random factor within a fixed range;
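The denoising and sharpening of step 1.2 can be sketched in plain Python as follows. This is a minimal illustration, not the patented implementation: the 3 x 3 window, the edge clamping, and the function names `median_filter` and `laplacian_sharpen` are assumptions. The Laplacian term is computed with the positive-center 4-neighbor kernel (a sign-convention assumption) so that an enhancement coefficient c = 1 sharpens rather than blurs.

```python
from statistics import median

def median_filter(img, k=3):
    """Replace each pixel by the median of its k x k neighbourhood (edges clamped)."""
    h, w, r = len(img), len(img[0]), k // 2
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = [img[min(max(y + dy, 0), h - 1)][min(max(x + dx, 0), w - 1)]
                      for dy in range(-r, r + 1) for dx in range(-r, r + 1)]
            out[y][x] = median(window)
    return out

def laplacian_sharpen(img, c=1.0):
    """g = f + c * lap, where lap uses the positive-centre kernel (assumed convention)."""
    h, w = len(img), len(img[0])

    def px(y, x):
        # Clamp coordinates so border pixels reuse their nearest neighbour.
        return img[min(max(y, 0), h - 1)][min(max(x, 0), w - 1)]

    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            lap = (4 * px(y, x) - px(y - 1, x) - px(y + 1, x)
                   - px(y, x - 1) - px(y, x + 1))
            # Clip to the valid 8-bit gray range.
            out[y][x] = min(255, max(0, round(px(y, x) + c * lap)))
    return out

# Usage: an isolated bright pixel is removed, then a step edge gains contrast.
noisy = [[10] * 5 for _ in range(5)]
noisy[2][2] = 200
clean = median_filter(noisy)
edge = [[50, 50, 100, 100] for _ in range(3)]
sharp = laplacian_sharpen(edge)
```

In practice an image library would be used for both operations; the loops above only make the arithmetic explicit.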

step 1.3, the four defect types black spots, dirt, scratches, and hair have distinct features of their own, whereas an off-color defect is characterized by its color difference from the adjacent background, so an augmented off-color foreground is difficult to paste onto arbitrary background images; off-color defects are therefore produced by adjusting pixel values within a randomly selected background image region:

in the pixel-adjustment formula, value_new is the augmented pixel value; value_old is the original pixel value, in the range 0 to 255; brightness is the brightness ratio, taking the value 0.9 or 1.1, where 0.9 produces a dark off-color defect and 1.1 a bright off-color defect; exp ranges from 1.5 to 3 to cover off-color defects of different shapes; a and b are half the width and half the height of the rectangular background image region; factor is, geometrically, the ratio of the distance from a point to the defect center to the distance from the defect region boundary to the defect center, and is set to 0.95 (less than 1) so that the defect region includes some background information, which improves generalization; center_x and center_y are the horizontal and vertical coordinates of the center of the rectangular background image region; a rectangular coordinate system is established with (center_x, center_y) as the origin, and x and y are the horizontal and vertical coordinates of a point in the rectangular background image region;
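The pixel-adjustment formula itself is not reproduced in this text, so the sketch below is only one plausible form built from the variables defined above, not the patent's formula. It assumes an elliptical falloff: the normalized distance r of a pixel from the region center is compared with factor, the weight (1 - r / factor) raised to exp decays toward the ellipse boundary, and the pixel is scaled between 1 and brightness by that weight. The function name `add_offcolor_defect` and this exact expression are hypothetical.

```python
def add_offcolor_defect(region, brightness=0.9, exp=2.0, factor=0.95):
    """Darken (0.9) or brighten (1.1) an elliptical patch inside a gray-level region.

    ASSUMED form, not the patent's formula:
        r         = sqrt(((x - cx) / a)**2 + ((y - cy) / b)**2)
        weight    = max(0, 1 - r / factor) ** exp
        value_new = value_old * (1 + (brightness - 1) * weight)
    """
    h, w = len(region), len(region[0])
    cx, cy = (w - 1) / 2, (h - 1) / 2   # centre of the rectangular region
    a, b = w / 2, h / 2                 # half width and half height
    out = []
    for y, row in enumerate(region):
        new_row = []
        for x, value_old in enumerate(row):
            r = (((x - cx) / a) ** 2 + ((y - cy) / b) ** 2) ** 0.5
            weight = max(0.0, 1.0 - r / factor) ** exp
            value_new = value_old * (1.0 + (brightness - 1.0) * weight)
            new_row.append(min(255, max(0, round(value_new))))  # keep within 0..255
        out.append(new_row)
    return out

# Usage: a dark and a bright off-color patch on a uniform 9 x 9 background.
background = [[200] * 9 for _ in range(9)]
dark = add_offcolor_defect(background, brightness=0.9)
bright = add_offcolor_defect(background, brightness=1.1)
```

With factor below 1, pixels near the rectangle border keep their original value, which matches the stated intent that the defect region retains some background.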

step 1.4, multi-band image fusion is performed between the foreground defect image and a random position in the target region of interest: Laplacian pyramids are built for the foreground defect image and the background image, and each layer is fused:

in the fusion formula, superscript i denotes the i-th layer of the Laplacian pyramid; the three pyramid terms denote, respectively, the features of the i-th layer of the Laplacian pyramid of the output fused image, of the foreground defect image, and of the background image; Rⁱ denotes the fusion region of the i-th layer, and i generally takes the value 2; multi-band fusion blends over a larger scale at low frequencies to avoid a visible seam between the foreground defect and the background, and over a smaller scale at high frequencies to keep background information from interfering with the foreground defect;
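The per-layer fusion of step 1.4 can be illustrated on a one-dimensional signal. The sketch assumes the standard multi-band blending rule, in which each fused layer is R^i times the foreground layer plus (1 - R^i) times the background layer; the fusion formula is not reproduced in this text, so that rule, the pair-averaging `down`, the sample-repeating `up`, and all function names are assumptions chosen for brevity.

```python
def down(sig):
    """Halve the resolution by averaging adjacent pairs."""
    return [(sig[i] + sig[i + 1]) / 2 for i in range(0, len(sig), 2)]

def up(sig):
    """Double the resolution by repeating each sample."""
    return [v for v in sig for _ in range(2)]

def laplacian_pyramid(sig, levels):
    """Return [L^0, ..., L^(levels-1), residual]; len(sig) must be divisible by 2**levels."""
    pyr, cur = [], list(sig)
    for _ in range(levels):
        nxt = down(cur)
        pyr.append([c - u for c, u in zip(cur, up(nxt))])  # band-pass detail
        cur = nxt
    pyr.append(cur)                                        # low-frequency residual
    return pyr

def blend(fg, bg, mask, levels=2):
    """Fuse fg into bg layer by layer, with the mask downsampled to each layer."""
    lf, lb = laplacian_pyramid(fg, levels), laplacian_pyramid(bg, levels)
    masks, m = [], list(mask)
    for _ in range(levels + 1):
        masks.append(m)
        if len(m) > 1:
            m = down(m)
    fused = [[r * f + (1 - r) * b for r, f, b in zip(rm, f_i, b_i)]
             for rm, f_i, b_i in zip(masks, lf, lb)]
    out = fused[-1]
    for band in reversed(fused[:-1]):  # collapse the pyramid, coarse to fine
        out = [d + u for d, u in zip(band, up(out))]
    return out

# Usage: paste the left half of fg onto bg.
fg = [10, 20, 30, 40, 50, 60, 70, 80]
bg = [5, 5, 5, 5, 5, 5, 5, 5]
result = blend(fg, bg, [1, 1, 1, 1, 0, 0, 0, 0])
```

A soft mask with fractional values would spread the transition more widely at the coarse layers than at the fine ones, which is the effect the step describes.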

step 2, multi-scale defect-region features are extracted from the data set with a deep residual network and a feature pyramid network to obtain multi-scale feature images; the feature pyramid network fuses the multi-scale feature images, which reduces the loss of small defects during training and improves the generalization of defect detection; it generates high-level feature maps, whose features carry high-level semantic information and a larger receptive field and are suited to detecting large objects, and shallow feature maps, whose features carry low-level detail information and a smaller receptive field and are suited to detecting small objects; a region detection network proposes a series of candidate defect regions in the image, and the deep residual network, whose skip connections add the input of the network to its output, extracts the image features of the candidate defect regions; finally, the feature map that fuses all of this information is classified to complete the object classification and regression tasks; convolution with a deep residual network of up to 152 layers learns image features better, while the feature pyramid network extracts features at different scales and combines large-scale position information with small-scale semantic information, which makes the method well suited to detecting small-area defects on mobile phone screens;
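The fusion of coarse and fine feature maps in step 2 can be sketched with a toy top-down merge. This is a simplification under stated assumptions: feature maps are single-channel 2D lists, each level is exactly half the size of the previous one, upsampling is nearest-neighbor, and the convolutions that a real feature pyramid network applies on the lateral connections and after merging are omitted. The names `upsample2x` and `top_down_merge` are illustrative.

```python
def upsample2x(fmap):
    """Nearest-neighbour 2x upsampling of a 2D feature map."""
    return [[v for v in row for _ in range(2)] for row in fmap for _ in range(2)]

def top_down_merge(laterals):
    """Merge feature maps ordered fine -> coarse.

    Starting from the coarsest map, each level is the element-wise sum of its
    lateral map and the upsampled merged map from the level above it.
    """
    merged = [laterals[-1]]
    for lateral in reversed(laterals[:-1]):
        upsampled = upsample2x(merged[0])
        merged.insert(0, [[l + u for l, u in zip(l_row, u_row)]
                          for l_row, u_row in zip(lateral, upsampled)])
    return merged

# Usage: three levels of size 4x4, 2x2 and 1x1.
c2 = [[1] * 4 for _ in range(4)]     # fine map: detail, small receptive field
c3 = [[10] * 2 for _ in range(2)]
c4 = [[100]]                         # coarse map: semantics, large receptive field
p2, p3, p4 = top_down_merge([c2, c3, c4])
```

After the merge, the finest output `p2` carries contributions from every coarser level, which is why small defects can be detected on it with semantic context.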

step 3, extracting the target region of interest based on the feature images from step 2, in order to shrink the detection region, improve detection efficiency, and avoid interference from background information: filtering out background information greatly improves the accuracy of defect detection; the target region of interest of images captured at different stations is detected by a region detection network with large-scale anchors;

step 4, generating potential defect regions within the target region of interest using the region detection network, and judging, in combination with the feature images from step 2, whether each potential defect region contains a defect, to obtain the defect type; feature vectors of equal length are extracted; the region detection network, by setting anchors of different scales, uses a softmax function to estimate the probability that each region is a foreground defect or background, and regresses the positions of the candidate foreground defect regions;

step 5, outputting the defect detection result.

Preferably, step 4 specifically comprises the following steps:

step 4.1, compiling statistics on the sizes and aspect ratios of the defect samples in the data set and setting anchor scales accordingly; the anchor aspect ratios are 0.2, 0.5, 1, 2, and 5, and the anchor sizes are 32, 64, 128, 256, and 512 pixels, which screens out defect regions more effectively and speeds up the regression of candidate boxes;
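The anchor set of step 4.1 can be enumerated as below. The five aspect ratios and five sizes come from the text; treating the ratio as height divided by width and keeping the anchor area equal to the squared size are assumptions, following a common convention in anchor-based detectors, and `make_anchors` is an illustrative name.

```python
from math import sqrt

RATIOS = (0.2, 0.5, 1, 2, 5)      # aspect ratios from step 4.1
SIZES = (32, 64, 128, 256, 512)   # anchor sizes in pixels from step 4.1

def make_anchors(ratios=RATIOS, sizes=SIZES):
    """Return (width, height) pairs, one per size/ratio combination.

    Assumes ratio = height / width and area = size ** 2.
    """
    anchors = []
    for size in sizes:
        for ratio in ratios:
            width = size / sqrt(ratio)
            height = size * sqrt(ratio)
            anchors.append((width, height))
    return anchors

anchors = make_anchors()
```

The elongated ratios 0.2 and 5 are the ones that would fit thin defects such as scratches and hair.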

step 4.2, in combination with the feature images, performing region-of-interest pooling on the candidate defect regions, normalizing the image features, and mapping the normalized features to a probability distribution over defect categories through a softmax function, where the objective function is:

$$J(\theta) = -\frac{1}{N}\sum_{i=1}^{N} y_i \log \hat{y}_i + \lambda R(\theta)$$

in the above formula, J(θ) is the objective function, N is the number of samples, y is the label class, ŷ is the predicted class, θ denotes the model parameters, R(θ) is the L² regularization term that prevents overfitting, and λ is an empirically set parameter.
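The classification objective of step 4.2 can be sketched as a softmax cross-entropy with an L² penalty. The cross-entropy form is an assumption consistent with the symbols defined above (J, N, y, θ, R, λ); the function names and the default value of `lam` are illustrative.

```python
from math import exp, log

def softmax(logits):
    """Map raw scores to a probability distribution (shifted for numerical stability)."""
    m = max(logits)
    exps = [exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def objective(batch_logits, labels, theta, lam=1e-4):
    """Mean cross-entropy over N samples plus lam * sum(theta ** 2)."""
    n = len(labels)
    cross_entropy = -sum(log(softmax(z)[y]) for z, y in zip(batch_logits, labels)) / n
    l2_penalty = sum(t * t for t in theta)
    return cross_entropy + lam * l2_penalty

def predict(logits):
    """Return the index of the category with the highest probability."""
    probs = softmax(logits)
    return max(range(len(probs)), key=probs.__getitem__)

# Usage: five defect categories, one sample.
scores = [0.1, 2.0, -1.0, 0.3, 0.0]
category = predict(scores)
loss = objective([scores], [1], theta=[0.5, -0.5], lam=0.01)
```

The predicted category is the argmax of the softmax output, matching the rule that the highest classification probability determines the detection result.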

Preferably, the defects in step 1 include hair, dirt, black spots, scratches, and off-color defects.

Preferably, in step 2, within the high-level feature map, high-level semantics are fused either with a low-resolution feature map that has strong semantic information or with a high-resolution feature map that has weak semantic information but rich spatial information.

Preferably, the target region of interest in step 3 is the approximately rectangular mobile phone screen region in the image.

Preferably, the feature pyramid network in step 3 consists of three parts: the first part up-samples through a convolutional network, the second part down-samples from the feature map, and the third part is the lateral connections, which fuse same-sized feature maps from the first two parts to obtain the multi-scale feature map.

Preferably, in step 4.2, feature vectors of equal length are extracted by region-of-interest pooling, and the category with the highest defect classification probability is taken as the classification result.

The invention has the beneficial effects that:

To address the strong correlation between defects and background and the strong interference of background information, the invention adds a segmentation and extraction stage between the defect region detection networks to extract the target region of interest, which effectively avoids interference from background information. To address the variety and ambiguous definition of defects and the uneven quality of labeled data sets, the invention provides a local image region augmentation mechanism tailored to mobile phone screen appearance defect detection; it removes the subjective differences between annotators, is highly extensible, and can accommodate growing task requirements. The method effectively improves defect detection capability and classification accuracy, improves detection efficiency and real-time performance, and meets the requirements of actual production.

By extracting the region of interest, the invention improves the efficiency of image preprocessing and addresses the problem that existing general-purpose object detection techniques suffer strong background interference in the small-object detection task of mobile phone screen defect detection. It also provides a targeted augmentation scheme for the shortage of data in real-world deployment, which greatly reduces the up-front manual labeling effort and further improves the efficiency of the overall scheme.

Drawings

FIG. 1 is an overall flowchart of defect target detection according to the present invention;

FIG. 2 is a flow chart of defect sample enhancement according to the present invention;

FIG. 3 is a flowchart of defect foreground enhancement according to the present invention;

FIG. 4 is a flow chart of defect detection according to the present invention;

FIG. 5 is a structural diagram of the feature pyramid network used to extract multi-scale image features according to the present invention.

Detailed Description

The present invention is further described below with reference to examples. The examples are given only to aid understanding of the invention. It should be noted that a person skilled in the art can make several improvements and modifications to the invention without departing from its principle, and such improvements and modifications also fall within the protection scope of the claims of the present invention.

Example 1:

the overall flow of the multi-level smartphone screen defect detection method is shown in FIG. 1 and comprises the following steps:

step 1, as shown in FIG. 2, manually labeling defects in an image, then separating the foreground defect image from the background image; on the basis of the labeled foreground defect image, augmenting the color and size of the foreground defect image (as shown in FIG. 3) to expand the number of samples and the morphological diversity of the foreground defects; finally, combining the augmented foreground defect image with the augmented background image to generate a data set that is reliably annotated, class-balanced, and suited to the actual production scenario; the defects include hair, dirt, black spots, scratches, and off-color defects;

in the sharpening formula of step 1.2, f(x, y) is the denoised image; in the fusion formula of step 1.4, the foreground term denotes the features of the i-th layer of the Laplacian pyramid of the foreground defect image;

step 2, as shown in FIG. 4, multi-scale defect-region features are extracted from the data set with a deep residual network and a feature pyramid network to obtain multi-scale feature images; the feature pyramid network fuses the multi-scale feature images, which reduces the loss of small defects during training and improves the generalization of defect detection; it generates high-level feature maps, whose features carry high-level semantic information and a larger receptive field and are suited to detecting large objects, and shallow feature maps, whose features carry low-level detail information and a smaller receptive field and are suited to detecting small objects; a region detection network proposes a series of candidate defect regions in the image, and the deep residual network, whose skip connections add the input of the network to its output, extracts the image features of the candidate defect regions; finally, the feature map that fuses all of this information is classified to complete the object classification and regression tasks; convolution with a deep residual network of up to 152 layers learns image features better, while the feature pyramid network extracts features at different scales and combines large-scale position information with small-scale semantic information, which makes the method well suited to detecting small-area defects on mobile phone screens; within the high-level feature map, high-level semantics are fused either with a low-resolution feature map that has strong semantic information or with a high-resolution feature map that has weak semantic information but rich spatial information;

step 3, as shown in FIG. 4, extracting the target region of interest based on the feature images from step 2, in order to shrink the detection region, improve detection efficiency, and avoid interference from background information:

unlike conventional object detection, which searches the whole image, mobile phone screen appearance defect detection only concerns a specific region; images from different stations have different regions of interest, and the detection region usually occupies 50% to 80% of the whole image. Defects are defined relative to the appearance of the screen rather than by an absolute standard, so background outside the detection region causes strong interference, and filtering it out greatly improves the accuracy of defect detection.

Traditional edge contour detection algorithms can extract the image region containing the screen reasonably well, but have the following problems:

1) They are slow: image cropping accounts for up to 50% of the total detection time of the model and becomes a bottleneck that severely limits detection efficiency.

2) They generalize poorly: edge contour detection requires binarizing the image as a preliminary step; a manually set binarization threshold cannot adapt to different shooting stations and lighting conditions, and adaptive approaches such as the watershed algorithm still fall short under some lighting conditions and easily produce false detections from interference in the background region.

The region of interest in images captured at a given station is relatively fixed, subject only to varying degrees of positional and illumination deviation imposed by the hardware. Its image features are therefore relatively stable and are easily detected by a region detection network with large-scale anchors. Compared with extraction based on edge contour detection, this approach shares the feature images computed in the preceding steps and therefore greatly reduces detection time.
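The effect of restricting detection to the region of interest can be sketched at pixel level. This is a simplification: the method described above extracts the region on shared feature images with a region detection network, whereas the helpers below (`crop_to_roi`, `to_full_coords`, `roi_fraction`, all hypothetical names) assume the region is already known as a box (x1, y1, x2, y2) and only show the cropping and the coordinate bookkeeping.

```python
def crop_to_roi(image, roi):
    """Keep only the pixels inside roi = (x1, y1, x2, y2), end-exclusive."""
    x1, y1, x2, y2 = roi
    return [row[x1:x2] for row in image[y1:y2]]

def to_full_coords(box, roi):
    """Map a box found inside the cropped region back to full-image coordinates."""
    offset_x, offset_y = roi[0], roi[1]
    x1, y1, x2, y2 = box
    return (x1 + offset_x, y1 + offset_y, x2 + offset_x, y2 + offset_y)

def roi_fraction(image, roi):
    """Share of the full image area covered by the region of interest."""
    x1, y1, x2, y2 = roi
    return (x2 - x1) * (y2 - y1) / (len(image) * len(image[0]))

# Usage: a 10 x 10 image whose screen region covers 64% of the frame.
image = [[0] * 10 for _ in range(10)]
roi = (1, 2, 9, 10)
screen = crop_to_roi(image, roi)
defect_in_full_image = to_full_coords((0, 0, 2, 2), roi)
fraction = roi_fraction(image, roi)
```

A fraction of 0.64 falls inside the 50% to 80% range quoted above, so roughly a third of the frame would be excluded from defect detection in this example.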

Filtering out background information greatly improves the accuracy of defect detection; the target region of interest of images captured at different stations is detected by a region detection network with large-scale anchors;

the target region of interest is the approximately rectangular mobile phone screen region in the image; as shown in FIG. 5, the feature pyramid network consists of three parts: the first part up-samples through a convolutional network, the second part down-samples from the feature map, and the third part is the lateral connections, which fuse same-sized feature maps from the first two parts to obtain the multi-scale feature map;

step 4, as shown in FIG. 4, generating potential defect regions within the target region of interest using the region detection network, and judging, in combination with the feature images from step 2, whether each potential defect region contains a defect, to obtain the defect type; feature vectors of equal length are extracted; the region detection network, by setting anchors of different scales, uses a softmax function to estimate the probability that each region is a foreground defect or background, and regresses the positions of the candidate foreground defect regions;

in the objective function of step 4.2, ŷ is the predicted class, θ denotes the model parameters, R(θ) is the L² regularization term that prevents overfitting, and λ is an empirically set parameter;

feature vectors of equal length are extracted by region-of-interest pooling, and the category with the highest defect classification probability is taken as the classification result;

step 5, outputting the defect detection result.

Example 2:

on the basis of Example 1, a multi-level convolutional neural network model is trained and tested with the data set generated in step 1. The experimental data comprise 5422 labeled real-scene samples. The evaluation metrics are defect detection precision, recall, and F1 score: precision is the fraction of predicted samples that are correct, and recall is the fraction of positive samples that are correctly detected. F1 is defined as follows:

$$F1 = \frac{2 \times precision \times recall}{precision + recall}$$

a detection is counted as correct when the IoU (intersection over union) between the detected region and the labeled region is at least 0.5 and the detected category matches the labeled category. IoU is the ratio of the area of the intersection to the area of the union of the predicted region and the ground-truth region. The experimental results are shown in Table 1 below.

TABLE 1 Defect detection results by category

Defect type	Precision	Recall	F1
Black spot	97.2%	93.1%	0.95
Hair	96.1%	97.0%	0.96
Scratch	95.2%	95.1%	0.95
Dirt	94.9%	93.6%	0.94
Off-color	93.3%	94.7%	0.94
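The evaluation rule used for Table 1 can be sketched as follows. The IoU threshold of 0.5 and the category-match requirement come from the text; the box format (x1, y1, x2, y2) and the function names are assumptions made for illustration.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    inter = inter_w * inter_h
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def is_correct(pred_box, pred_cls, gt_box, gt_cls, threshold=0.5):
    """A detection counts only if the category matches and IoU >= threshold."""
    return pred_cls == gt_cls and iou(pred_box, gt_box) >= threshold

def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0

# Usage: the black-spot row of Table 1.
black_spot_f1 = round(f1_score(0.972, 0.931), 2)
overlap = iou((0, 0, 10, 10), (5, 0, 15, 10))
```

For the black-spot row, a precision of 97.2% and a recall of 93.1% give an F1 of about 0.95, in agreement with the table.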

Claims

1. A multi-level smartphone screen defect detection method, characterized by comprising the following steps:

step 1, manually labeling defects in an image, then separating the foreground defect image from the background image; on the basis of the labeled foreground defect image, augmenting the color and size of the foreground defect image; finally, combining the augmented foreground defect image with the augmented background image to generate a data set;

step 1.2, denoising the foreground defect image: image noise is handled by median filtering, which sets the gray value of a target pixel to the median of all pixels in a neighborhood around it and removes isolated noise;

sharpening the denoised image with the Laplacian operator to enhance the difference in gray value between objects and the background:

$$g(x, y) = f(x, y) + c\,\nabla^{2} f(x, y)$$

in the above formula, f(x, y) is the denoised image; ∇²f(x, y) is the Laplacian of the image; c is the enhancement coefficient; g(x, y) is the enhanced image;

step 1.3, adjusting pixel values within a randomly selected background image region:

in the pixel-adjustment formula, value_new is the augmented pixel value; value_old is the original pixel value, in the range 0 to 255; brightness is the brightness ratio, taking the value 0.9 or 1.1, where 0.9 produces a dark off-color defect and 1.1 a bright off-color defect; exp ranges from 1.5 to 3; a and b are half the width and half the height of the rectangular background image region; factor is the ratio of the distance from a point to the defect center to the distance from the defect region boundary to the defect center; center_x and center_y are the horizontal and vertical coordinates of the center of the rectangular background image region; a rectangular coordinate system is established with (center_x, center_y) as the origin, and x and y are the horizontal and vertical coordinates of a point in the rectangular background image region;

the background term denotes the features of the i-th layer of the Laplacian pyramid of the background image; Rⁱ denotes the fusion region of the i-th layer;

step 2, extracting multi-scale defect-region features from the data set with a deep residual network and a feature pyramid network to obtain multi-scale feature images; fusing the multi-scale feature images with the feature pyramid network to generate a high-level feature map with high-level semantics and a shallow feature map with low-level semantics; proposing candidate defect regions in the image with a region detection network, and extracting the image features of the candidate defect regions with the deep residual network, whose skip connections add the input of the network to its output; finally, classifying the feature maps;

step 3, extracting the target region of interest based on the feature images from step 2: filtering out background information; detecting the target region of interest of images captured at different stations with a region detection network with large-scale anchors;

step 5, outputting the defect detection result.

2. The multi-level smartphone screen defect detection method according to claim 1, wherein step 4 specifically comprises the following steps:

step 4.1, compiling statistics on the sizes and aspect ratios of the defect samples in the data set and setting anchor scales; the anchor aspect ratios include 0.2, 0.5, 1, 2, and 5, and the anchor sizes include 32, 64, 128, 256, and 512 pixels;

step 4.2, in combination with the feature images, performing region-of-interest pooling on the candidate defect regions, normalizing the image features, and mapping the normalized features to a probability distribution over defect categories through a softmax function:

in the objective function, ŷ is the predicted class, θ denotes the model parameters, R(θ) is the L² regularization term that prevents overfitting, and λ is a parameter.

3. The multi-level smartphone screen defect detection method of claim 1, wherein: the defects in step 1 include hair, dirt, black spots, scratches, and off-color defects.

4. The multi-level smartphone screen defect detection method of claim 1, wherein: in step 2, within the high-level feature map, high-level semantics are fused either with a low-resolution feature map that has strong semantic information or with a high-resolution feature map that has weak semantic information but rich spatial information.

5. The multi-level smartphone screen defect detection method of claim 1, wherein: the target region of interest in step 3 is the mobile phone screen region in the image.

6. The multi-level smartphone screen defect detection method of claim 1, wherein: in step 3, the feature pyramid network consists of three parts: the first part up-samples through a convolutional network, the second part down-samples from the feature map, and the third part is the lateral connections, which fuse same-sized feature maps from the first two parts to obtain the multi-scale feature map.

7. The multi-level smartphone screen defect detection method of claim 2, wherein: in step 4.2, feature vectors of equal length are extracted by region-of-interest pooling, and the category with the highest defect classification probability is taken as the classification result.