CN117557487A - Smooth object highlight removing method and system based on pix2pixHD and defect detecting device - Google Patents
- Publication number: CN117557487A
- Application number: CN202311495156.7A
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/764—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
- G06V10/765—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects using rules for classification or partitioning the feature space
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
- G06N3/0455—Auto-encoder networks; Encoder-decoder networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/0475—Generative networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/774—Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
Abstract
The invention discloses a pix2pixHD-based method and system for removing highlights from smooth objects, and a defect detection device. It relates to the technical field of computer vision and image processing and solves the prior-art problem that the original defect information of a highlight region cannot be preserved when the highlight is removed. The invention comprises the following steps: pairs of real images of a smooth object, one with highlights and one without, are input into a pix2pixHD network for training to obtain a highlight removal model, wherein the pix2pixHD network comprises a generator and a discriminator, spectral normalization and a self-attention mechanism are added to the generator, and spectral normalization is added to the discriminator. When a highlight region is removed, the original defect information of the object in that region is well preserved, the image quality degrades less than with conventional highlight removal techniques, and the method outperforms other prior art both visually and on image quality indexes.
Description
Technical Field
The invention relates to the technical field of computer vision and image processing, in particular to a pix2pixHD-based method and system for removing highlights from smooth objects, and a defect detection device.
Background
Under illumination, most materials in natural scenes show specular highlights on their surfaces. The phenomenon is especially pronounced on smooth materials, and removing specular highlights has long been a key problem in computer vision and image processing. In practical industrial applications, highlights seriously affect many links in production. The most common effect is that specular highlights introduce noise and interference into the original image, degrading the performance of tasks such as small-target detection, tracking, and recognition. In particular, computer-vision-based automated techniques such as image segmentation, object detection, tracking, and recognition rely heavily on the color or saturation intensity information of the image itself. Complex multi-region highlights severely degrade image quality and corrupt the original information and intensity values of the pixels, causing a dramatic drop in the performance of these automated techniques. For example, when a defect lies within a highlight region, defect detection fails, mainly because the defect features are not distinct under highlight conditions.
The prior-art schemes and their defects are as follows:
1. Traditional methods based on the dichromatic reflection model operate with thresholds in various forms, mainly treating the brightest pixels as highlights. They target objects whose surfaces are dark overall and only moderately smooth, rather than the highlighted objects found in arbitrary real scenes. Faced with real-scene images whose surfaces are bright and whose highlights resemble the color of the object under inspection, these methods suffer from difficult highlight localization, low processing speed, and a poor, often negligible, effect in removing strong highlights.
2. Specular removal methods based on sparse and low-rank reflection models rest on the observation that specular regions in an image typically have higher luminance values than other regions, and decompose and process the image accordingly. However, such methods tend to remove all low-frequency information in the image, including normal scene reflections, so image details are often removed excessively: the image becomes dark and distorted, the actual removal of strong highlights is poor, and other information in the highlight region is not preserved when the highlight is removed.
3. The deep-learning-based specular highlight detection network (SHDNet) uses multi-scale contextual contrast features to accurately detect specular highlights at different scales. On this basis, its authors propose a novel multi-task network combining highlight detection and removal, aiming to detect and remove bright spots in natural images, with excellent results on their own datasets. However, the published test model is trained on the SHIQ dataset, which is rather generic; faced with intense, strongly targeted highlight images the network cannot reach its potential, so it is unsuitable for study objects characterized by smooth surface materials and strong reflectivity.
4. A fast smooth-surface highlight removal method combining U2-Net and the LaMa model removes highlights with the ideas of image segmentation and image inpainting. It can remove large areas of strong highlight in terms of visual effect, but cannot retain information such as defects in the highlight region.
In short, the prior art removes the defect information inherent to the object in the highlight region together with the highlight itself, causing subsequent automated defect detection to fail.
Disclosure of Invention
In order to solve the problems in the prior art, the invention provides a pix2pixHD-based method and system for removing highlights from smooth objects, and a defect detection device, aiming to solve the prior-art problem that the original defect information of a highlight region cannot be preserved when the highlight is removed.
A pix2pixHD-based smooth object highlight removal method comprises: inputting pairs of real images of a smooth object with highlights and without highlights into a pix2pixHD network for training to obtain a highlight removal model, wherein the pix2pixHD network comprises a generator and a discriminator, spectral normalization and a self-attention mechanism are added to the generator, and spectral normalization is added to the discriminator.
Preferably, the training comprises:
firstly, preprocessing an input image by using the pix2pixHD network, wherein the preprocessing comprises image format conversion, image resizing and image normalization;
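The preprocessing steps can be sketched as follows (a minimal NumPy sketch; nearest-neighbor resizing and the [-1, 1] normalization range are illustrative assumptions, not mandated by the patent):

```python
import numpy as np

def resize_nearest(img, out_h, out_w):
    """Nearest-neighbor resize: an illustrative stand-in for the resizing step."""
    in_h, in_w = img.shape[:2]
    ys = (np.arange(out_h) * in_h) // out_h   # source row for each output row
    xs = (np.arange(out_w) * in_w) // out_w   # source column for each output column
    return img[ys][:, xs]

def normalize(img):
    """Map uint8 pixels in [0, 255] to floats in [-1, 1], the usual pix2pix input range."""
    return img.astype(np.float32) / 127.5 - 1.0
```

Format conversion (e.g. BGR to RGB, channel reordering) would precede these two steps in a full pipeline.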
inputting the preprocessed image data into a generator, converting the input image into a generated composite image, transmitting the generated composite image pair and a real image pair to a discriminator, carrying out true and false discrimination on the image pairs by the discriminator, outputting corresponding discrimination results, and respectively calculating losses of the generator and the discriminator according to the discrimination results;
then, calculating gradients of losses of the generator and the discriminator with respect to the respective parameters using a back propagation algorithm, and updating the parameters of the generator and the discriminator using an optimizer, respectively;
by alternately training the generator and the discriminator and optimizing their parameters by gradient descent, the generator is enabled to generate more realistic images while the discriminator is enabled to discriminate the generated images from realistic images more accurately.
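The alternating update scheme above can be illustrated with a deliberately tiny toy game: a one-parameter "generator" against a regularized linear "critic", each updated in turn by a plain gradient step. This is an illustrative stand-in for the training dynamics only, not the patent's actual pix2pixHD losses:

```python
import numpy as np

x_real = 2.0    # the single "real" sample
theta = -1.0    # generator parameter: its fake sample is theta itself
w = 0.0         # critic parameter: critic scores x as w * x
lr = 0.1

for _ in range(500):
    # critic step: ascend  w*x_real - w*theta - 0.5*w**2  (L2-penalized critic)
    grad_w = (x_real - theta) - w
    w += lr * grad_w
    # generator step: ascend  w*theta  (make the fake score like the real sample)
    theta += lr * w
```

At equilibrium the fake sample matches the real one (theta -> 2) and the critic can no longer tell them apart (w -> 0), mirroring how alternating optimization drives the generator toward realistic outputs.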
Preferably, the generator is a global generator that processes the input through downsampling and upsampling to capture information at different scales and the global structure. Spectral normalization is added to the generator to control the magnitude of the weights by constraining the spectral norm of the generator's weight matrices, preventing the generator parameters from exploding and yielding higher-quality samples. The self-attention mechanism is added to the generator to help the model learn global dependencies and to strengthen the generator's modeling of long-range context, so that the generator produces more coherent images with a better global structure.
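The two additions can be sketched in NumPy with hypothetical helper names (in practice a framework built-in such as `torch.nn.utils.spectral_norm` applies the normalization per layer during training):

```python
import numpy as np

def spectral_normalize(W, n_iter=50):
    """Divide W by its largest singular value, estimated by power iteration:
    the core trick of spectral normalization."""
    u = np.ones(W.shape[0])
    for _ in range(n_iter):
        v = W.T @ u
        v /= np.linalg.norm(v)
        u = W @ v
        u /= np.linalg.norm(u)
    sigma = u @ (W @ v)          # estimated spectral norm of W
    return W / sigma

def self_attention(X, Wq, Wk, Wv):
    """Scaled dot-product self-attention over a sequence of feature vectors X;
    every position can attend to every other, giving global dependencies."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[1])
    scores -= scores.max(axis=1, keepdims=True)   # numerical stability
    A = np.exp(scores)
    A /= A.sum(axis=1, keepdims=True)             # each row: attention weights
    return A @ V
```

After `spectral_normalize`, the layer's Lipschitz constant is bounded by 1, which is what stabilizes GAN training.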
Preferably, the discriminator comprises a global discriminator and a local discriminator: the global discriminator judges the entire image, capturing global structure and consistency information, while the local discriminator judges the texture details of local regions to provide finer discrimination results. Spectral normalization is added to the discriminator to control the magnitude of the weights by constraining the spectral norm of the discriminator's weight matrices, which improves the discriminator's ability to distinguish real samples from generated samples and enhances the stability of GAN training.
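The multi-scale idea, one discriminator on the full-resolution image and another on a downsampled copy, can be sketched as follows (hypothetical helpers; each `d` here is just a stand-in callable, whereas pix2pixHD's discriminators see image pairs):

```python
import numpy as np

def avg_pool2(img):
    """2x average-pool downsampling, producing the coarser copy fed to the
    next discriminator scale."""
    h, w = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2
    x = img[:h, :w]
    return (x[0::2, 0::2] + x[1::2, 0::2] + x[0::2, 1::2] + x[1::2, 1::2]) / 4.0

def multiscale_scores(img, discriminators):
    """Run one discriminator per scale: full resolution first, then each halved copy."""
    scores, x = [], img
    for d in discriminators:
        scores.append(d(x))
        x = avg_pool2(x)
    return scores
```

The coarse scale sees global layout; the fine scale sees texture detail, matching the global/local split described above.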
Preferably, the objective function of the multi-scale discriminator is defined as:

$$\min_G \max_{D_1, D_2} \sum_{k=1,2} \mathcal{L}_{\mathrm{GAN}}(G, D_k) \qquad (1)$$

wherein G represents the generator and $D_1$ and $D_2$ represent two discriminator networks at different scales. pix2pixHD extracts features from multiple layers of the discriminator and learns to match these intermediate representations between the real image and the generated image. To stabilize training, the GAN loss in the above formula is improved by adding a discriminator-based feature matching loss, which can be expressed as:

$$\mathcal{L}_{\mathrm{FM}}(G, D_k) = \mathbb{E}_{(s,x)} \sum_{i=1}^{T} \frac{1}{N_i} \left\| D_k^{(i)}(s, x) - D_k^{(i)}(s, G(s)) \right\|_1 \qquad (2)$$

wherein G represents the generator, $D_k$ denotes the k-th discriminator network, $D_k^{(i)}$ its i-th layer, T is the total number of layers, $N_i$ is the number of elements in layer i, s is the picture to be converted, x is the conversion target picture, and G(s) is the target picture produced by the generator network.
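Under that notation, the feature matching term can be sketched in NumPy (a hypothetical helper operating on already-extracted per-layer feature arrays; `np.mean` supplies the 1/N_i factor and the Python `sum` runs over the T layers):

```python
import numpy as np

def feature_matching_loss(feats_real, feats_fake):
    """L1 distance between discriminator features of the real pair (s, x) and
    the generated pair (s, G(s)): mean over each layer's N_i elements, summed
    over the T layers."""
    assert len(feats_real) == len(feats_fake)
    return sum(float(np.mean(np.abs(fr - ff)))
               for fr, ff in zip(feats_real, feats_fake))
```

In training this is computed once per discriminator scale k and added to the generator's loss with weight lambda.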
After combining the objective function and the loss function, the final objective function is expressed as:

$$\min_G \left( \left( \max_{D_1, D_2} \sum_{k=1,2} \mathcal{L}_{\mathrm{GAN}}(G, D_k) \right) + \lambda \sum_{k=1,2} \mathcal{L}_{\mathrm{FM}}(G, D_k) \right) \qquad (3)$$

wherein G represents the generator, D represents the discriminator, $D_k$ denotes the k-th discriminator network, and $\lambda$ is a hyper-parameter controlling the weight of $\mathcal{L}_{\mathrm{FM}}$.
Preferably, the highlight removal model is obtained by training the pix2pixHD network on m input sample pairs for n epochs until the training loss converges.
The smooth object highlight removing system based on pix2pixHD comprises a storage module, a model training module and a highlight removing module;
the storage module is used for storing the highlight real image of the smooth object and the non-highlight real image of the smooth object;
the model training module trains the pix2pixHD network by utilizing the smooth object image stored by the storage module to acquire a highlight removal model;
the highlight removing module is used for removing the highlights in the real image of the smooth object by using the highlight removing model.
The pix2pixHD-based smooth object defect detection device comprises an edge device, an input device and an output device. The edge device is deployed with a highlight removal model trained by the pix2pixHD-based smooth object highlight removal method and with a defect detection model; the highlight removal model removes highlights from real images of the smooth object, and defect detection is performed on the highlight-removed images. The input device comprises a camera for collecting real images of the smooth object in real time, and the output device comprises a display for showing the highlight-removed smooth object images and the detected defects.
The beneficial effects of the invention include:
compared with the prior art, the invention optimizes the parameters of the generator and the discriminator through alternate training and gradient descent, so that the generator can generate more realistic images, the discriminator can accurately discriminate the generated images and real images, and spectrum normalization is added in the generator and the discriminator, thereby improving the performance of the whole highlight removal model; in addition, the global generator is used for capturing information of different scales and global structures through up-sampling and down-sampling processing input, the global discriminator is used for discriminating the whole image, the global structure and consistency information are captured, and the local discriminator is used for discriminating texture details of local areas so as to provide finer discrimination results; therefore, when the highlight region is removed, the original defect information of the object under the highlight region can be better kept, and meanwhile, the image quality after the highlight region is removed is reduced less than that of the conventional highlight removal technology.
Drawings
Fig. 1 is a schematic flow chart of a method for removing highlight from a smooth object based on a pix2pixHD model in example 1.
Fig. 2 is a visual comparison of a smooth object highlight removal method based on the pix2pixHD model of example 1 with other recent highlight removal methods.
FIG. 3 is a comparison of the results of defect detection of an experimental object under high light conditions and defect detection after image highlights are removed using the methods of FIG. 2.
Fig. 4 is a schematic diagram of the structure of a pix2pixHD based smooth object highlight removal system.
Fig. 5 is a schematic structural diagram of a pix2 pixHD-based smooth object defect detection device.
Detailed Description
For the purposes of making the objects, technical solutions and advantages of the embodiments of the present application more clear, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is apparent that the described embodiments are only some embodiments of the present application, but not all embodiments. Thus, the following detailed description of the embodiments of the present application, as provided in the accompanying drawings, is not intended to limit the scope of the application, as claimed, but is merely representative of selected embodiments of the application. All other embodiments, which can be made by those skilled in the art based on the embodiments of the present application without making any inventive effort, are intended to be within the scope of the present application.
Example 1
Specific embodiments of the present invention will be described in detail below with reference to fig. 1-3;
As shown in fig. 1, the pix2pixHD-based smooth object highlight removal method comprises: inputting photographed images of highlighted smooth objects together with their highlight-free counterparts into a pix2pixHD network for training, driving the network to learn the mapping from highlight images to highlight-free images, and finally obtaining a highlight removal model that can automatically remove the highlight regions in an image. Pix2pixHD first preprocesses the input image data (image format conversion, image resizing, image normalization); the preprocessed image data is fed to the generator, which converts the input image into a generated composite image; the generated image pair and the real image pair are passed to the discriminator, which judges each pair as real or fake and outputs the corresponding discrimination results, from which the losses of the generator and the discriminator are calculated. The gradients of these losses with respect to the respective parameters are then computed by back-propagation, and the parameters of the generator and the discriminator are updated with an optimizer. By alternately training the generator and the discriminator and optimizing their parameters by gradient descent, the generator learns to generate more realistic images while the discriminator learns to distinguish generated images from real images more accurately.
Training of the pix2pixHD network: the dataset is a small-sample dataset comprising 2000 real images of differently shaped objects with highlights and the 2000 corresponding highlight-free real images, captured in the same scenes at various angles in the horizontal direction. The dataset images are split 9:1 into a training set and a test set. During training, the training images are first resized to 640×640 and the convolution layers are initialized with Xavier initialization. The network is trained with the Adam optimizer, the batch size is set to 4, the initial learning rate is lr = 2e-4, and ngf (the number of generator filters) is set to 64.
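The initialization and optimizer named above can be sketched as follows: a minimal NumPy version of Xavier (Glorot) uniform initialization and of one Adam update at the stated lr = 2e-4. Real training would use a framework's built-ins; the helper names here are hypothetical:

```python
import numpy as np

def xavier_uniform(fan_in, fan_out, rng):
    """Xavier (Glorot) uniform initialization: samples in [-limit, limit]
    with limit = sqrt(6 / (fan_in + fan_out))."""
    limit = np.sqrt(6.0 / (fan_in + fan_out))
    return rng.uniform(-limit, limit, size=(fan_out, fan_in))

def adam_step(theta, grad, m, v, t, lr=2e-4, b1=0.9, b2=0.999, eps=1e-8):
    """One Adam update with bias correction; t is the 1-based step count."""
    m = b1 * m + (1 - b1) * grad
    v = b2 * v + (1 - b2) * grad ** 2
    m_hat = m / (1 - b1 ** t)
    v_hat = v / (1 - b2 ** t)
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v
```

Because Adam rescales each step to roughly lr in magnitude, the small learning rate 2e-4 gives the slow, stable updates GAN training needs.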
In particular, the objective function of the multi-scale discriminator used during training can be expressed as:

$$\min_G \max_{D_1, D_2} \sum_{k=1,2} \mathcal{L}_{\mathrm{GAN}}(G, D_k) \qquad (1)$$

wherein G represents the generator and $D_1$ and $D_2$ represent two discriminator networks at different scales. pix2pixHD extracts features from multiple layers of the discriminator and learns to match these intermediate representations between the real image and the generated image. To stabilize training, the GAN loss in the above formula is improved by adding a discriminator-based feature matching loss, which can be expressed as:

$$\mathcal{L}_{\mathrm{FM}}(G, D_k) = \mathbb{E}_{(s,x)} \sum_{i=1}^{T} \frac{1}{N_i} \left\| D_k^{(i)}(s, x) - D_k^{(i)}(s, G(s)) \right\|_1 \qquad (2)$$

wherein G represents the generator, $D_k$ denotes the k-th discriminator network, $D_k^{(i)}$ its i-th layer, T is the total number of layers, $N_i$ is the number of elements in layer i, s is the picture to be converted, x is the conversion target picture, and G(s) is the target picture produced by the generator network.
After combining the objective function equation (1) and the loss function equation (2), the final objective function is expressed as:

$$\min_G \left( \left( \max_{D_1, D_2} \sum_{k=1,2} \mathcal{L}_{\mathrm{GAN}}(G, D_k) \right) + \lambda \sum_{k=1,2} \mathcal{L}_{\mathrm{FM}}(G, D_k) \right) \qquad (3)$$

wherein G represents the generator, D represents the discriminator, $D_k$ denotes the k-th discriminator network, and $\lambda$ is a hyper-parameter controlling the weight of $\mathcal{L}_{\mathrm{FM}}$.
We train the network until the loss converges: at a certain epoch the generator loss begins to stabilize and approach a fixed value, indicating that the generator has essentially converged, and the discriminator loss likewise gradually stabilizes. This process iterates for approximately 110 epochs over the 2000 image pairs of the dataset, split 9:1 into training and test sets, and the whole training takes about 10 hours. At test time, ngf must also be set to 64, and the input images are resized to 640×640 and fed into the trained network for highlight removal.
Compared with the prior art, the method clearly improves the highlight removal effect on smooth objects; see Table 1. PSNR (Peak Signal-to-Noise Ratio) is a commonly used image quality index that measures the difference, or degree of distortion, between an original image and a processed image: the higher the PSNR, the smaller the difference and the better the image quality. SSIM (Structural Similarity Index) is an image quality index that measures the structural similarity between two images and is widely used to compare the impact of image processing, compression, or restoration algorithms on image quality; it is computed from the luminance, contrast, and structure information of the images and yields a similarity score between 0 and 1, where a score closer to 1 indicates higher structural similarity and better image quality. The visual comparison is shown in fig. 2, where (a) is the input highlight image, (b) is the true highlight-free image, (c)-(f) show the highlight removal results of other prior art, and (g) shows the result of the method of example 1. As can be seen, the conventional methods (c), (d), and (e) handle highlights poorly, leaving considerable highlight residue on the object and failing to truly remove the highlights on the smooth surface, while (f) removes the original defect features of the highlight region together with the highlight itself, which clearly contradicts the objective facts.
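The two indexes can be sketched in NumPy as follows. PSNR is implemented exactly as defined; the SSIM shown is a simplified single-window (global) variant with the standard c1/c2 constants, whereas the usual metric averages this over local windows:

```python
import numpy as np

def psnr(ref, img, max_val=255.0):
    """Peak signal-to-noise ratio in dB; higher means less distortion."""
    mse = np.mean((ref.astype(np.float64) - img.astype(np.float64)) ** 2)
    return 10.0 * np.log10(max_val ** 2 / mse)

def ssim_global(x, y, max_val=255.0):
    """Single-window SSIM from luminance, contrast, and structure terms;
    equals 1.0 for identical images."""
    c1, c2 = (0.01 * max_val) ** 2, (0.03 * max_val) ** 2
    x = x.astype(np.float64)
    y = y.astype(np.float64)
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    return ((2 * mx * my + c1) * (2 * cov + c2)) / \
           ((mx ** 2 + my ** 2 + c1) * (vx + vy + c2))
```

For example, two 8-bit images differing by a constant 10 gray levels have MSE 100 and hence PSNR of about 28.13 dB.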
Table 1. Comparison of results on the smooth-object highlight data between the method of the present invention (ours) and other prior-art methods
Fig. 3 compares defect detection on an object under highlight conditions with defect detection after the image highlights are removed by the methods of fig. 2. It can be seen that under highlight the defect features are not distinct and correct defect detection is impossible; only after the highlights are removed with the proposed highlight removal method can the defects be detected correctly.
In summary, the invention provides a pix2pixHD-based highlight removal method that solves the prior-art problem of removing, together with the highlight region, the defect information inherent to the object in that region. On a self-made smooth-object highlight dataset, the method outperforms other prior art visually, on image quality indexes, and in retaining the defect information of the original image's highlight region.
Example 2
Referring to fig. 4, a pix2 pixHD-based smooth object highlight removal system, it should be understood that the system corresponds to the above-described embodiment of the method of fig. 1, and is capable of performing the steps involved in the embodiment of the method of fig. 1, and specific functions of the system may be referred to the above description, and detailed descriptions thereof are omitted herein as appropriate to avoid redundancy. As shown in fig. 4:
the smooth object highlight removing system based on pix2pixHD comprises a storage module, a model training module and a highlight removing module;
the storage module is used for storing the highlight real image of the smooth object and the non-highlight real image of the smooth object;
the model training module trains the pix2pixHD network by utilizing the smooth object image stored by the storage module to acquire a highlight removal model;
the highlight removing module is used for removing the highlights in the real image of the smooth object by using the highlight removing model.
Example 3
With reference to fig. 5, a pix2 pixHD-based smooth object defect detection apparatus, it should be understood that the system corresponds to the above embodiment of the method of fig. 1, and is capable of performing the steps involved in the embodiment of the method of fig. 1, and specific functions of the system may be referred to the above description, and detailed descriptions thereof are omitted herein as appropriate to avoid redundancy.
The pix2pixHD-based smooth object defect detection device comprises an edge device, an input device and an output device. The edge device is deployed with a highlight removal model trained by the pix2pixHD-based smooth object highlight removal method and with a defect detection model; the highlight removal model removes highlights from real images of the smooth object, and defect detection is performed on the highlight-removed images, the defect detection model being trained with YOLOX on images of smooth objects bearing defects. The input device comprises a mouse, a keyboard and a camera, the camera collecting real images of the smooth object in real time; the output device comprises a display for showing the highlight-removed smooth object images and the detected defects.
The foregoing examples merely represent specific embodiments of the present application, which are described in more detail and are not to be construed as limiting the scope of the present application. It should be noted that, for those skilled in the art, several variations and modifications can be made without departing from the technical solution of the present application, which fall within the protection scope of the present application.
Claims (8)
1. A pix2pixHD-based smooth object highlight removal method, comprising: inputting pairs of real images of a smooth object with highlights and without highlights into a pix2pixHD network for training to obtain a highlight removal model, wherein the pix2pixHD network comprises a generator and a discriminator, spectral normalization and a self-attention mechanism are added to the generator, and spectral normalization is added to the discriminator.
2. The pix2pixHD-based smooth object highlight removal method according to claim 1, wherein the training comprises:
first, preprocessing an input image with the pix2pixHD network, the preprocessing comprising image formatting, image restoration and image normalization;
next, inputting the preprocessed image data into the generator, which converts the input image into a generated composite image; transmitting the generated image pair and the real image pair to the discriminator, which judges each pair as real or fake and outputs the corresponding discrimination results, the losses of the generator and the discriminator being calculated respectively from these results;
then, computing the gradients of the generator and discriminator losses with respect to their respective parameters by back-propagation, and updating the parameters of the generator and the discriminator respectively with an optimizer.
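The training procedure of claim 2 can be sketched as a single alternating update step. The toy generator and discriminator below are placeholder networks, and the BCE-based adversarial loss is one common formulation used here for illustration; the claims that follow describe the patent's actual architectures and objective.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

gen = nn.Sequential(nn.Conv2d(3, 3, 3, padding=1), nn.Tanh())
disc = nn.Sequential(nn.Conv2d(6, 1, 3, padding=1))   # sees an (input, image) pair
opt_g = torch.optim.Adam(gen.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(disc.parameters(), lr=2e-4)
bce = nn.BCEWithLogitsLoss()

s = torch.rand(1, 3, 32, 32)   # preprocessed highlight image
x = torch.rand(1, 3, 32, 32)   # highlight-free target image

# Discriminator step: real pair scored toward 1, generated pair toward 0.
fake = gen(s).detach()
d_real = disc(torch.cat([s, x], 1))
d_fake = disc(torch.cat([s, fake], 1))
loss_d = bce(d_real, torch.ones_like(d_real)) + bce(d_fake, torch.zeros_like(d_fake))
opt_d.zero_grad()
loss_d.backward()
opt_d.step()

# Generator step: try to make the discriminator score the generated pair as real.
d_fake = disc(torch.cat([s, gen(s)], 1))
loss_g = bce(d_fake, torch.ones_like(d_fake))
opt_g.zero_grad()
loss_g.backward()
opt_g.step()
```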
3. The pix2pixHD-based smooth object highlight removal method according to claim 1, wherein the generator is a global generator that processes the input through upsampling and downsampling to capture information at different scales as well as the global structure; spectral normalization is added to the generator, constraining the spectral norm of the generator weight matrices to control the magnitude of the weights, preventing the generator parameters from exploding and producing higher-quality samples; the self-attention mechanism is added to the generator to help the model learn global dependencies, enhancing the generator's ability to model long-range context so that it generates more coherent images with a better global structure.
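A minimal sketch of the two generator additions named in claim 3, using PyTorch's built-in `spectral_norm` wrapper and a SAGAN-style self-attention block; the channel sizes and the learnable `gamma` gate (initialized to zero, so the block starts as an identity) are illustrative assumptions, not the patent's exact configuration.

```python
import torch
import torch.nn as nn
from torch.nn.utils import spectral_norm

class SelfAttention(nn.Module):
    """Self-attention over all spatial positions, with spectrally
    normalized 1x1 convolutions for the query/key/value projections."""
    def __init__(self, c):
        super().__init__()
        self.q = spectral_norm(nn.Conv2d(c, c // 2, 1))
        self.k = spectral_norm(nn.Conv2d(c, c // 2, 1))
        self.v = spectral_norm(nn.Conv2d(c, c, 1))
        self.gamma = nn.Parameter(torch.zeros(1))  # gate: starts as identity

    def forward(self, x):
        b, c, h, w = x.shape
        q = self.q(x).flatten(2).transpose(1, 2)   # B x HW x C/2
        k = self.k(x).flatten(2)                   # B x C/2 x HW
        attn = torch.softmax(q @ k, dim=-1)        # B x HW x HW dependencies
        v = self.v(x).flatten(2)                   # B x C x HW
        out = (v @ attn.transpose(1, 2)).view(b, c, h, w)
        return x + self.gamma * out

x = torch.rand(2, 8, 8, 8)
y = SelfAttention(8)(x)
```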
4. The pix2pixHD-based smooth object highlight removal method according to claim 1, wherein the discriminator comprises a global discriminator and a local discriminator, the global discriminator discriminating the whole image to capture global structure and consistency information, and the local discriminator discriminating the texture details of local areas to provide finer discrimination results; spectral normalization is added to the discriminator, constraining the spectral norm of the discriminator weight matrices to control the weights, which improves the performance of the discriminator, better distinguishes real samples from generated samples, and enhances the stability of the GAN during training.
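The two-scale discriminator of claim 4 can be sketched as follows: one PatchGAN-style branch judges the full-resolution image pair and captures local texture details, while a second branch judges a 2x-downsampled copy and, through its larger effective receptive field, captures global structure. Every convolution is wrapped in spectral normalization; the layer sizes are illustrative assumptions.

```python
import torch
import torch.nn as nn
from torch.nn.utils import spectral_norm

class PatchDiscriminator(nn.Module):
    """One scale: a small spectrally normalized patch discriminator."""
    def __init__(self, in_c=6):
        super().__init__()
        self.net = nn.Sequential(
            spectral_norm(nn.Conv2d(in_c, 16, 4, stride=2, padding=1)),
            nn.LeakyReLU(0.2),
            spectral_norm(nn.Conv2d(16, 1, 4, stride=1, padding=1)),
        )

    def forward(self, pair):
        return self.net(pair)

class MultiScaleDiscriminator(nn.Module):
    """Full-resolution branch (local texture) plus a branch on a
    2x-downsampled copy (global structure via a wider receptive field)."""
    def __init__(self):
        super().__init__()
        self.d_local = PatchDiscriminator()
        self.d_global = PatchDiscriminator()
        self.down = nn.AvgPool2d(3, stride=2, padding=1)

    def forward(self, pair):
        return self.d_local(pair), self.d_global(self.down(pair))

pair = torch.rand(1, 6, 32, 32)            # (input, image) pair, channels stacked
out_local, out_global = MultiScaleDiscriminator()(pair)
```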
5. The pix2pixHD-based smooth object highlight removal method according to claim 4, wherein the objective function of the discriminator is defined as:
min_G max_{D_1, D_2} Σ_{k=1,2} L_GAN(G, D_k)
wherein G represents the generator, and D_1 and D_2 represent two discriminator networks operating at different scales; pix2pixHD extracts features from multiple layers of the discriminator and learns to match these intermediate representations between the real image and the generated image; to stabilize training, the GAN loss in the above formula is improved by adding a discriminator-based feature matching loss, which can be expressed as:
L_FM(G, D_k) = E_{(s,x)} Σ_{i=1}^{T} (1/N_i) ||D_k^{(i)}(s, x) − D_k^{(i)}(s, G(s))||_1
wherein G represents the generator, D_k represents the k-th discriminator network, i represents the i-th layer in the discriminator network, T is the total number of layers, N_i represents the number of elements in each layer, s represents the picture to be converted, x represents the conversion target picture, and G(s) represents the target picture generated by the generator network;
after combining the objective function and the loss function, the final objective function is expressed as:
min_G ( ( max_{D_1,D_2} Σ_{k=1,2} L_GAN(G, D_k) ) + λ Σ_{k=1,2} L_FM(G, D_k) )
wherein G represents the generator, D represents the discriminator, D_k represents the k-th discriminator network, and λ is a hyperparameter controlling the weight of L_FM.
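The feature matching term L_FM of claim 5 reduces, per discriminator layer, to a mean absolute difference between the activations produced by the real pair (s, x) and by the generated pair (s, G(s)); the mean reduction supplies the 1/N_i factor. A sketch for one discriminator D_k, assuming the two feature lists come from hooks on its intermediate layers:

```python
import torch

def feature_matching_loss(feats_real, feats_fake):
    """Sum over layers i of (1/N_i) * ||D_k^(i)(s, x) - D_k^(i)(s, G(s))||_1.

    Mean-reduced L1 per layer equals the (1/N_i) normalization; real
    features are detached so only the generator receives this gradient.
    """
    total = torch.zeros(())
    for fr, ff in zip(feats_real, feats_fake):
        total = total + torch.nn.functional.l1_loss(ff, fr.detach())
    return total

# Illustrative feature lists for two discriminator layers.
feats_real = [torch.ones(1, 4, 8, 8), torch.ones(1, 8, 4, 4)]
feats_same = [f.clone() for f in feats_real]
feats_zero = [torch.zeros_like(f) for f in feats_real]
```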
6. The pix2pixHD-based smooth object highlight removal method according to claim 1, wherein the highlight removal model is obtained by training the pix2pixHD network on m input samples for n epochs until the loss converges.
7. A pix2pixHD-based smooth object highlight removal system, characterized by comprising a storage module, a model training module and a highlight removal module;
the storage module is used for storing real images of smooth objects with highlights and real images of smooth objects without highlights;
the model training module trains the pix2pixHD network using the smooth object images stored in the storage module to obtain the highlight removal model;
the highlight removal module is used for removing the highlights in a real image of a smooth object using the highlight removal model.
8. A pix2pixHD-based smooth object defect detection device, characterized by comprising an edge device, an input device and an output device; the edge device is provided with a trained defect detection model and the highlight removal model obtained by the method according to any one of claims 1 to 6, and is used for removing highlights from a real image of a smooth object and performing defect detection on the highlight-removed image; the input device comprises a camera for acquiring real images of the smooth object in real time; and the output device comprises a display for showing the highlight-removed smooth object image and the detected defects.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311495156.7A CN117557487A (en) | 2023-11-10 | 2023-11-10 | Smooth object highlight removing method and system based on pix2pixHD and defect detecting device |
Publications (1)
Publication Number | Publication Date |
---|---|
CN117557487A true CN117557487A (en) | 2024-02-13 |
Family
ID=89813906
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202311495156.7A Pending CN117557487A (en) | 2023-11-10 | 2023-11-10 | Smooth object highlight removing method and system based on pix2pixHD and defect detecting device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN117557487A (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title
---|---|---|---|---
CN117911908A (en) * | 2024-03-20 | 2024-04-19 | 湖北经济学院 | Enhancement processing method and system for aerial image of unmanned aerial vehicle
CN117911908B (en) * | 2024-03-20 | 2024-05-28 | 湖北经济学院 | Enhancement processing method and system for aerial image of unmanned aerial vehicle
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||