CN109636792B - Lens defect detection method based on deep learning - Google Patents

Lens defect detection method based on deep learning

Info

Publication number
CN109636792B
CN109636792B CN201811533354.7A CN201811533354A
Authority
CN
China
Prior art keywords
layer
network structure
image
lens
layers
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201811533354.7A
Other languages
Chinese (zh)
Other versions
CN109636792A (en)
Inventor
曹睿龙 (Cao Ruilong)
郭孟宇 (Guo Mengyu)
穆港 (Mu Gang)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Yi Si Si Hangzhou Technology Co ltd
Original Assignee
Isvision Hangzhou Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Isvision Hangzhou Technology Co Ltd filed Critical Isvision Hangzhou Technology Co Ltd
Priority to CN201811533354.7A priority Critical patent/CN109636792B/en
Publication of CN109636792A publication Critical patent/CN109636792A/en
Application granted granted Critical
Publication of CN109636792B publication Critical patent/CN109636792B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/0002 Inspection of images, e.g. flaw detection
    • G06T 7/0004 Industrial image inspection
    • G06T 7/0008 Industrial image inspection checking presence/absence
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/30 Subject of image; Context of image processing
    • G06T 2207/30108 Industrial image inspection

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Biophysics (AREA)
  • Evolutionary Computation (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Quality & Reliability (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a lens defect detection method based on deep learning. First, an offline gallery of training samples is acquired; in an upper computer, a caffe model and the ARM-NN SDK toolkit are used to generate network structures corresponding to the image features in the gallery, and different protobuf dynamic link libraries are generated from those network structures. During actual detection, the lower computer directly calls the different protobuf dynamic link libraries, analyzes the acquired images in real time, and judges whether the lens has defect features, realizing automatic detection of the lens image to be detected. By classifying the image features separately, the false detection rate is reduced and the whole lens image to be detected is inspected; at the same time, because the lower computer analyzes and processes the images itself, the cost of equipment and wiring is lower than with a method that transmits the images to the upper computer for analysis.

Description

Lens defect detection method based on deep learning
Technical Field
The invention relates to the field of visual inspection, in particular to a lens defect detection method based on deep learning.
Background
Lenses are widely used both in everyday life and in industry, so intelligent detection of lens defects has very broad application prospects.
Take the inspection of laser welding protective lenses in industrial production as an example. Compared with traditional welding techniques, laser welding has unmatched advantages in welding precision, efficiency, reliability and automation, and as the demands of modern automobile manufacturing grow, the dependence on laser welding keeps increasing. After the welding operation has run for a certain time, the transparent protective lens of the welding machine is prone to cracking from splashes of high-temperature welding slag, which affects welding quality and can even damage the welding machine or the robot; effective inspection of the laser welding protective lens is therefore very important.
At present there are two schemes for inspecting the lens surface. In the first, workers downstream on the production line inspect manually, preventing welding accidents caused by defective protective lenses by checking the protective lens of the laser welding head by hand. However, discovering welding problems this way has a certain lag: line workers usually inspect the body-in-white as a whole after laser welding, and once the lens surface is damaged partway through, a large number of poorly welded bodies-in-white must be repair-welded, which lowers production efficiency and raises manufacturing cost. The second existing scheme uses a traditional mirror-defect detection sensor: a camera captures an image of the lens surface to obtain the real-time condition of the laser welding protective lens, and the image is used to judge whether the lens is defective. This method is highly automated, works in real time and is therefore widely used. Its processing of the protective lens image is as follows: first the circular field of view of the protective lens is extracted from the whole image, and the dead-pixel area is compared with the user-defined dead-pixel count; if the dead-pixel area is smaller than the user-defined count, the protective lens is reported as normal; if the dead-pixel area is greater than or equal to the user-defined count, the dead-pixel outline is marked and a protective lens abnormality is reported.
This method only identifies the gray level and the number of pixels and cannot analyze whether those pixels are genuine dead spots. On an actual welding site the condition of the protective lens is complex, and dead spots, shadows, blur, offset and similar situations can all occur.
If the lighting on the welding line changes strongly, reflection and vibration easily deepen the shadow at the edge of the protective lens, and the existing image detection method then produces false detections: once the shadow deepens, dark pixels in the field of view cluster to a degree at which they are mistaken for dead spots, causing false alarms. Meanwhile, the existing method crops out only the circular field-of-view region and does not inspect the area outside it, which narrows the measurement range and wastes usable data.
Disclosure of Invention
In order to solve the technical problems, the invention provides a lens defect detection method based on deep learning.
The technical scheme is as follows:
a lens defect detection method based on deep learning comprises the following steps:
step 1) storing not less than 1000 images of the lens to be detected as an offline gallery and inputting the offline gallery into an lmdb database; according to a caffe model, classifying each lens image to be detected by its image features and adding the corresponding feature label, establishing the correspondence between the lens images to be detected and the feature labels; the number of feature-label types is recorded as n;
copying all the to-be-detected lens images with the same feature labels into the same sub-classification gallery;
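By way of illustration only (the directory names and label values below are hypothetical, and the patent itself stores the gallery in an lmdb database through the caffe tooling), the following C++ sketch shows the sub-gallery organization of step 1): every image carrying the same feature label is copied into the same sub-classification gallery.

```cpp
#include <filesystem>
#include <string>
#include <utility>
#include <vector>

namespace fs = std::filesystem;

// Illustrative sketch only (paths and labels are hypothetical): every lens
// image that carries the same feature label is copied into the same
// sub-classification gallery, one gallery per label.
int main() {
    std::vector<std::pair<std::string, std::string>> labelled = {
        {"gallery/img_0001.png", "dead_spot"},
        {"gallery/img_0002.png", "shadow"},
        {"gallery/img_0003.png", "blur"},
        {"gallery/img_0004.png", "offset"},
    };

    for (const auto& [image, label] : labelled) {
        fs::path sub_gallery = fs::path("sub_galleries") / label;
        fs::create_directories(sub_gallery);                  // one directory per feature label
        fs::copy_file(image, sub_gallery / fs::path(image).filename(),
                      fs::copy_options::overwrite_existing);  // copy, do not move
    }
    return 0;
}
```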
step 2) carrying out an averaging operation on all the lens images to be detected in each sub-classification gallery in turn, each sub-classification gallery correspondingly generating one image mean binary file;
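A minimal sketch of the averaging operation of step 2), assuming OpenCV is available; in the method itself the image mean binary file is produced by the caffe toolchain, so this only illustrates the pixel-wise mean taken over one sub-classification gallery.

```cpp
#include <opencv2/opencv.hpp>
#include <string>
#include <vector>

// Illustrative sketch (file names are hypothetical): all lens images of one
// sub-classification gallery are averaged pixel-wise into a single mean image,
// which would then be written out as the image mean binary file.
cv::Mat computeMeanImage(const std::vector<std::string>& files) {
    cv::Mat acc;                                    // floating-point accumulator
    int used = 0;
    for (const std::string& f : files) {
        cv::Mat img = cv::imread(f, cv::IMREAD_GRAYSCALE);
        if (img.empty()) continue;                  // skip unreadable images
        cv::Mat imgF;
        img.convertTo(imgF, CV_32F);
        if (acc.empty()) acc = cv::Mat::zeros(imgF.size(), CV_32F);
        acc += imgF;                                // accumulate pixel sums
        ++used;
    }
    if (used > 0) acc /= static_cast<double>(used); // pixel-wise mean
    return acc;
}

int main() {
    cv::Mat mean = computeMeanImage({"sub_galleries/dead_spot/img_0001.png",
                                     "sub_galleries/dead_spot/img_0002.png"});
    return mean.empty() ? 1 : 0;
}
```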
step 3) sequentially inputting the image mean binary files into an ARM-NN SDK toolkit according to a predefined Makefile compiling rule, wherein the ARM-NN SDK toolkit executes make compiling according to the Makefile compiling rule, and each image mean binary file correspondingly generates a group of intermediate files containing the compilable files;
step 4) according to configuration information of a prototxt file preset for an ARM-NN SDK toolkit, correspondingly generating a network structure for the intermediate file obtained in the step 3), wherein the network structure comprises a convolution layer, a pooling layer, an activation layer, a full connection layer, a softmax layer, a drop layer and an output layer which are connected according to a certain rule;
the number of intermediate files is n, the number of corresponding network structures is also n, and they are denoted DLi, i = 1, 2, 3 … n;
Step 5) inputting into each network structure DLi all the lens images to be detected from the sub-classification gallery corresponding to that network structure;
iterating the network structure according to the preset configuration information of the prototxt file, the total number of iterations S being 8000-20000, and reducing the learning rate every q iterations until the total number of iterations is completed;
after the iterations are completed, obtaining the accuracy of the iterated network structure DLi' according to the verification information in the softmax layer;
when the accuracy reaches 90%, storing the iterated network structure DLi' in a binary protobuf file; if the accuracy is less than 90%, acquiring the offline gallery again and returning to step 1);
iterating the n network structures respectively to obtain n binary protobuf files;
or the following method is adopted for iteration:
iterating the network structure according to the preset configuration information of the prototxt file, the total number of iterations S being 8000-20000; every q iterations, obtaining the accuracy of the current network structure according to the verification information in the softmax layer, reducing the learning rate and continuing the iteration; when the accuracy reaches 90%, ending the iteration and storing the parameters of the iterated network structure DLi' in a binary protobuf file; if the accuracy is still less than 90% when the total number of iterations has been completed, acquiring the offline gallery again and returning to step 1);
iterating the n network structures respectively to obtain n binary protobuf files;
step 6) loading the n binary protobuf files into an ARM-NN SDK toolkit, changing a software compiling tool g++ in the Makefile compiling rule into a g++ corresponding to a target microprocessor architecture, wherein the software compiling tool g++ is used for compiling files into executable programs;
the target processor is the architecture of the lower computer; common processor architectures include x86, ARM, PowerPC, etc.; in the invention the processor architecture used for detection is 32-bit ARMv7;
g++ builds for different microprocessor architectures are not interchangeable, i.e., compiling an executable program for ARM requires an ARM g++ toolchain;
starting cross compilation according to other Makefile compilation rules in the step 3), and correspondingly generating n protobuf dynamic link libraries;
the protobuf dynamic link library can be directly called by a C++ program;
step 7) loading the protobuf dynamic link libraries when the lower computer main program runs, processing the lens images to be detected that are acquired in real time, and outputting a detection result corresponding to the image features of the lens to be detected.
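The following is only a sketch of how the lower computer main program of step 7) might load one of the generated libraries at run time; the library file name and the entry point detect_defect are hypothetical, since the method only specifies that the protobuf dynamic link library is directly callable from a C++ program.

```cpp
#include <dlfcn.h>
#include <cstdio>

// Minimal sketch of step 7 on the lower computer: one of the generated shared
// libraries is loaded at run time and a detection entry point is called. The
// library path and the symbol name "detect_defect" are hypothetical.
int main() {
    void* handle = dlopen("./libdl1_dead_spot.so", RTLD_NOW);
    if (!handle) {
        std::fprintf(stderr, "dlopen failed: %s\n", dlerror());
        return 1;
    }

    // hypothetical signature: image buffer in, 1 returned if the feature is present
    using DetectFn = int (*)(const unsigned char* pixels, int width, int height);
    auto detect = reinterpret_cast<DetectFn>(dlsym(handle, "detect_defect"));
    if (!detect) {
        std::fprintf(stderr, "dlsym failed: %s\n", dlerror());
        dlclose(handle);
        return 1;
    }

    // ... the image acquired in real time from the camera would be passed here ...
    dlclose(handle);
    return 0;
}
```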
Further, in the ARM-NN SDK toolkit the Makefile compiling rule defines the path of the project source code, the path for generated files and the software compiling tool g++; it designates the target microprocessor architecture (ARCH) as "arm", the compilation toolchain (CC) as "arm-linux-gnueabihf-gcc", the source files required to compile protobuf, the Linux kernel version, and the link libraries.
Further, the prototxt file is used for configuring model parameters, and is configured and compiled before the step 3);
the configuration information of the prototxt file comprises:
setting the number and connection relations of the convolutional layers, pooling layers, activation layers, full connection layers, softmax layer, drop layer and output layer in each network structure in step 4); specifically, defining the name, the input and output, and the acquired variable values of each layer. Taking the convolutional layer as an example, the name and characteristics of the layer, its input and output, the acquired variable values, the convolution kernel size, the weight matrix and other information are defined (an illustrative layer definition is sketched after this list).
Setting a total number of iterations S, a number of iterations q, and each learning rate for each network structure in step 5).
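As an illustration of the layer definitions mentioned above, the sketch below embeds a hypothetical convolutional-layer entry in prototxt text format and reads it back through the C++ classes generated from caffe.proto; the layer name, kernel size and output count are example values, not those of the invention.

```cpp
#include <caffe/proto/caffe.pb.h>
#include <google/protobuf/text_format.h>
#include <cstdio>
#include <string>

// Illustrative only: a hypothetical convolutional-layer entry in prototxt text
// format, parsed through the protobuf classes generated from caffe.proto, to
// show the kind of information (name, inputs/outputs, kernel size, outputs)
// such a layer definition carries.
int main() {
    const std::string prototxt = R"(
        layer {
          name: "conv1"
          type: "Convolution"
          bottom: "data"
          top: "conv1"
          convolution_param { num_output: 32 kernel_size: 5 stride: 1 }
        }
    )";

    caffe::NetParameter net;
    if (!google::protobuf::TextFormat::ParseFromString(prototxt, &net) ||
        net.layer_size() == 0) {
        std::fprintf(stderr, "failed to parse the prototxt snippet\n");
        return 1;
    }

    const caffe::LayerParameter& conv = net.layer(0);
    std::printf("layer %s (%s): %u outputs, kernel size %u\n",
                conv.name().c_str(), conv.type().c_str(),
                conv.convolution_param().num_output(),
                conv.convolution_param().kernel_size(0));
    return 0;
}
```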
Preferably, the settings for each network structure in step 5) in the prototxt file are as follows: the total number of iterations S is 10000, the interval q is 1000, and for the learning rate: the initial value is selected in the range 0.01-0.8, and every q iterations the learning rate is reduced to 50% of its previous value.
The learning rate is set differently according to the complexity of the pictures and the size of the sample: if the learning rate is set too small, convergence is very slow; if it is set too large, the optimum cannot be reached.
Setting from experience: generally a smaller learning rate is chosen to ensure the stability of the system; the initial learning rate is selected in the range 0.01-0.8, preferably 0.01, and every q iterations the learning rate is reduced to 50% of its previous value.
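A minimal sketch of the stepped learning-rate schedule described above, using the preferred values S = 10000, q = 1000 and an initial learning rate of 0.01; the training pass itself is elided.

```cpp
#include <cstdio>

// Minimal sketch of the stepped learning-rate schedule: total iterations S,
// learning rate halved every q iterations, using the preferred values above.
int main() {
    const int S = 10000;   // total number of iterations S
    const int q = 1000;    // reduce the learning rate every q iterations
    double lr = 0.01;      // preferred initial learning rate

    for (int iter = 1; iter <= S; ++iter) {
        // ... one training iteration over the sub-classification gallery runs here ...
        if (iter % q == 0) {
            lr *= 0.5;     // reduced to 50% of the previous value
            std::printf("iteration %d: learning rate = %g\n", iter, lr);
        }
    }
    return 0;
}
```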
Further, 4 feature tags are set in the step 1), including: presence/absence of dead spots, presence/absence of shadows, yes/no blurring, yes/no offset;
classifying the offline gallery into 4 sub-classification galleries according to 4 different feature labels;
at this time, 4 network structures are correspondingly generated in step 4), recorded respectively as: network structure DL1 for judging presence/absence of dead spots, network structure DL2 for judging presence/absence of shadows, network structure DL3 for judging yes/no blur, and network structure DL4 for judging yes/no offset.
Preferably, the prototxt file is configured for each network structure in step 4) as follows:
the network structure DL1 for judging presence/absence of dead spots has 16 layers in total, comprising 1 input layer, 3 convolution layers, 3 pooling layers, 4 activation layers, 2 full connection layers, 1 softmax layer, 1 drop layer and 1 output layer, connected in the order: input layer-convolution layer-activation layer-pooling layer-convolution layer-activation layer-pooling layer-convolution layer-activation layer-pooling layer-full connection layer-activation layer-drop layer-full connection layer-softmax layer-output layer;
the network structure DL2 for judging presence/absence of shadows has 16 layers in total, comprising 1 input layer, 3 convolution layers, 3 pooling layers, 4 activation layers, 2 full connection layers, 1 softmax layer, 1 drop layer and 1 output layer, connected in the order: input layer-convolution layer-activation layer-pooling layer-convolution layer-activation layer-pooling layer-convolution layer-activation layer-pooling layer-full connection layer-activation layer-drop layer-full connection layer-softmax layer-output layer;
the network structure DL3 for judging yes/no blur has 13 layers in total, comprising 1 input layer, 2 convolution layers, 2 pooling layers, 3 activation layers, 2 full connection layers, 1 softmax layer, 1 drop layer and 1 output layer, connected in the order: input layer-convolution layer-activation layer-pooling layer-convolution layer-activation layer-pooling layer-full connection layer-activation layer-drop layer-full connection layer-softmax layer-output layer;
the network structure DL4 for judging yes/no offset has 13 layers in total, comprising 1 input layer, 2 convolution layers, 2 pooling layers, 3 activation layers, 2 full connection layers, 1 softmax layer, 1 drop layer and 1 output layer, connected in the order: input layer-convolution layer-activation layer-pooling layer-convolution layer-activation layer-pooling layer-full connection layer-activation layer-drop layer-full connection layer-softmax layer-output layer.
The convolution layer mainly extracts features through convolution kernels, and each feature value is the sum of products of a template and corresponding pixels of an original image.
The pooling layer reduces the spatial resolution of the convolutional layer primarily by downsampling.
The activation layer performs an activation operation on the input data: each data element that meets the condition is activated and passed to the next layer, otherwise it is not passed on.
The drop layer randomly suppresses some neurons, leaving them in an inactive state.
The softmax layer mainly normalizes the data after the full connection layer so that it falls in the range [0, 1]; the accuracy of the current network structure can be obtained from it after each iteration.
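A toy numerical illustration of two of the layer operations described above, with made-up values: one convolution feature value computed as the sum of products of a 3x3 template with the pixels beneath it, and a softmax that maps the fully connected outputs into the range [0, 1].

```cpp
#include <cmath>
#include <cstdio>
#include <vector>

// Toy numbers only: one convolution feature value as the sum of products of a
// 3x3 template with the pixels beneath it, and a softmax normalizing the fully
// connected outputs into the range [0, 1].
int main() {
    const double patch[9]  = {10, 20, 30, 40, 50, 60, 70, 80, 90};  // image pixels
    const double kernel[9] = { 0, -1,  0, -1,  5, -1,  0, -1,  0};  // convolution template

    double feature = 0.0;
    for (int i = 0; i < 9; ++i) feature += patch[i] * kernel[i];    // sum of products
    std::printf("convolution feature value: %g\n", feature);

    // softmax over two fully connected outputs (e.g. "defect present" / "absent")
    std::vector<double> logits = {2.0, 0.5};
    double denom = 0.0;
    for (double v : logits) denom += std::exp(v);
    for (std::size_t k = 0; k < logits.size(); ++k)
        std::printf("class %zu probability: %g\n", k, std::exp(logits[k]) / denom);
    return 0;
}
```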
Further, in step 7), the size of the lens image acquired in real time is first unified, so that the resolution of the real-time image that is finally analyzed is consistent with the images in the offline gallery;
loading the real-time image to be detected to a lower computer main program, and calling a corresponding protobuf dynamic link library by the main program;
the detection result finally output in the step 7) is as follows: presence/absence of a dead-spot image, presence/absence of a shadow image, yes/no blur image, yes/no offset image.
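By way of illustration, the sketch below shows the preprocessing of step 7): the image acquired in real time is resized so that its resolution matches the offline gallery before the dynamic link libraries are called. The file name and the 640x480 gallery resolution are assumptions made only for this example.

```cpp
#include <opencv2/opencv.hpp>
#include <cstdio>

// Sketch of the step 7) preprocessing (file name and 640x480 gallery resolution
// are assumptions): the image acquired in real time is resized so that its
// resolution matches the offline gallery, after which the four dynamic link
// libraries would each be called on it.
int main() {
    const cv::Size gallery_resolution(640, 480);         // resolution of the offline gallery

    cv::Mat live = cv::imread("live_lens.png", cv::IMREAD_GRAYSCALE);
    if (live.empty()) {
        std::fprintf(stderr, "could not read the image acquired in real time\n");
        return 1;
    }
    if (live.size() != gallery_resolution)
        cv::resize(live, live, gallery_resolution);       // unify the image size

    // ... call the dead spot / shadow / blur / offset libraries on `live` and
    //     output one yes/no result per feature label ...
    return 0;
}
```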
Further, the pictures in the offline gallery are images of the laser welding protective lens collected by a camera while the robot drives the welding gun to perform laser welding.
The advantages are that:
the method utilizes an ARM-NN SDK toolkit and a cafe deep learning structure, an upper computer trains a network structure aiming at the characteristics of the lens to be detected, a dynamic link library which can be called by a lower computer is generated, the lower computer calls the dynamic link library, the whole lens image to be detected is analyzed and processed, the characteristic information in the image is identified, and the automatic detection of the lens image to be detected is realized; by carrying out different classification on the image characteristics, the false detection rate is reduced, the detection of the whole lens image to be detected is realized, and meanwhile, the lower computer analyzes and processes the image, so that the cost caused by equipment and wiring is saved compared with a method of transmitting the image to the upper computer for analysis.
Drawings
FIG. 1 is a schematic flow chart of an embodiment of the present invention;
FIG. 2 is a diagram showing the results of laser welding inspection of lenses using a conventional method;
FIG. 3 is a diagram showing the results of laser welding inspection of the lenses obtained by the method of the present invention.
Detailed Description
The technical solution of the present invention will be described in detail below with reference to a specific process for detecting defects of a laser welding protective lens.
A lens defect detection method based on deep learning is characterized in that an ARM-NN SDK toolkit is utilized in an upper computer to generate a protobuf dynamic link library, the protobuf dynamic link library is called in a lower computer, and whether defect characteristics exist in a protective lens or not is judged;
the method comprises the following steps: in the process of carrying out laser welding by driving a welding gun by a robot, a camera collects a laser welding gun protective lens image once in each laser welding process; collecting 2000 laser welding gun protective lens images and storing the images as an off-line gallery;
step 1), classifying the offline gallery:
storing the offline image library into an lmdb database, classifying the offline image library according to the caffe model and the image characteristics, and adding a characteristic label corresponding to the image characteristics for each picture;
the image feature/feature label includes: presence/absence of dead spots, presence/absence of shadows, yes/no blurring, yes/no offset; n is 4;
classifying the offline gallery into four sub-classification galleries according to different feature labels, and copying all images with the same feature labels into one sub-classification gallery;
configuring and compiling a prototxt file;
the configuration information of the prototxt file includes:
the prototxt file configures each network structure as follows:
the network structure DL1 for judging presence/absence of dead spots has 16 layers in total and is formed by connecting 1 input layer, 3 convolution layers, 3 pooling layers, 4 activation layers, 2 full connection layers, 1 softmax layer, 1 drop layer and 1 output layer in the order: input layer-convolution layer-activation layer-pooling layer-convolution layer-activation layer-pooling layer-convolution layer-activation layer-pooling layer-full connection layer-activation layer-drop layer-full connection layer-softmax layer-output layer;
the network structure DL2 for judging presence/absence of shadows has 16 layers in total and is formed by connecting 1 input layer, 3 convolution layers, 3 pooling layers, 4 activation layers, 2 full connection layers, 1 softmax layer, 1 drop layer and 1 output layer in the order: input layer-convolution layer-activation layer-pooling layer-convolution layer-activation layer-pooling layer-convolution layer-activation layer-pooling layer-full connection layer-activation layer-drop layer-full connection layer-softmax layer-output layer;
the network structure DL3 for judging yes/no blur has 13 layers in total and is formed by connecting 1 input layer, 2 convolution layers, 2 pooling layers, 3 activation layers, 2 full connection layers, 1 softmax layer, 1 drop layer and 1 output layer in the order: input layer-convolution layer-activation layer-pooling layer-convolution layer-activation layer-pooling layer-full connection layer-activation layer-drop layer-full connection layer-softmax layer-output layer;
the network structure DL4 for judging yes/no offset has 13 layers in total and is formed by connecting 1 input layer, 2 convolution layers, 2 pooling layers, 3 activation layers, 2 full connection layers, 1 softmax layer, 1 drop layer and 1 output layer in the order: input layer-convolution layer-activation layer-pooling layer-convolution layer-activation layer-pooling layer-full connection layer-activation layer-drop layer-full connection layer-softmax layer-output layer.
For each network structure, the total number of iterations S is 10000, the interval q is 1000, and the initial value of the learning rate is chosen as 0.01.
Step 2), generating a mean value binary file:
carrying out the averaging operation on the protective lens feature images in each sub-classification gallery in turn, generating four image mean binary files corresponding to the different sub-classification galleries;
the image mean binary file comprises the position information of the image features in each protective lens image in the corresponding sub-classification gallery;
step 3), generating the compilable intermediate files:
defining the Makefile compilation rules: in the ARM-NN SDK toolkit, the Makefile compiling rule defines the path of the project source code, the path for generated files and the software compiling tool g++; it designates the target microprocessor architecture (ARCH) as 'arm', the compilation toolchain (CC) as 'arm-linux-gnueabihf-gcc', the source files required to compile protobuf, the Linux kernel version and the link libraries;
sequentially inputting the image mean binary files into the ARM-NN SDK toolkit according to the defined Makefile compiling rule; the ARM-NN SDK toolkit executes make compilation according to the Makefile compiling rule and correspondingly generates four groups of intermediate files;
each group of intermediate files comprises two compilable files with the suffixes .pb.cc and .pb.h, which are compiled in the subsequent steps;
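For illustration, the sketch below shows how a generated .pb.cc/.pb.h pair is typically consumed in the later steps, here using caffe's own generated header as the example: the header declares C++ message classes, and the compiled .pb.cc lets those messages be written to and read back from binary protobuf files. The network name and output file name are hypothetical.

```cpp
#include <caffe/proto/caffe.pb.h>   // a generated .pb.h of exactly this kind
#include <fstream>

// Illustrative sketch (the network name and file name are hypothetical): the
// generated header declares C++ message classes, and the matching .pb.cc is
// compiled and linked so that those messages can be serialized to and parsed
// from binary protobuf files.
int main() {
    caffe::NetParameter net;
    net.set_name("DL1_dead_spot");

    std::ofstream out("dl1.protobuf", std::ios::binary);
    net.SerializeToOstream(&out);                        // write a binary protobuf file
    out.close();

    caffe::NetParameter loaded;
    std::ifstream in("dl1.protobuf", std::ios::binary);
    loaded.ParseFromIstream(&in);                        // read it back
    return 0;
}
```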
step 4), constructing the models:
according to the preset configuration information of the prototxt file, script tools are executed on the four groups of intermediate files in turn, correspondingly generating four network structures, recorded respectively as: network structure DL1 for judging presence/absence of dead spots, network structure DL2 for judging presence/absence of shadows, network structure DL3 for judging yes/no blur, and network structure DL4 for judging yes/no offset.
The network structure comprises a convolution layer, a pooling layer, an activation layer, a full connection layer, a softmax layer, a drop layer and an output layer which are connected in sequence;
step 5), model training:
the protective lens images in the presence/absence-of-dead-spots sub-classification gallery are input to the network structure DL1 for judging dead spots; the protective lens images in the yes/no-blur sub-classification gallery are input to the network structure DL3 for judging yes/no blur; the protective lens images in the presence/absence-of-shadow sub-classification gallery are input to the network structure DL2 for judging shadows; and the protective lens images in the yes/no-offset sub-classification gallery are input to the network structure DL4 for judging yes/no offset.
Each network structure is iterated according to the configuration information of the preset prototxt file, each iteration traversing all the images input to that network structure; after 10000 iterations are completed, the accuracy of the network structure is obtained according to the verification information in the softmax layer;
every 1000 iterations, the learning rate is reduced to 50% of its previous value;
when the accuracy reaches 90%, storing the parameters of the iterated network structure in a binary protobuf file;
respectively iterating the four network structures to obtain four binary protobuf files;
if the accuracy is less than 90%, acquiring the off-line gallery again, and returning to the step 1);
step 6), generating a protobuf dynamic link library:
changing the g++ in the original Makefile to the g++ for the target microprocessor architecture; the software compiling tool g++ is used for compiling files into executable programs; the link path of the dynamic link library files is specified;
loading the four binary protobuf files into an ARM-NN SDK toolkit, starting cross compilation, and correspondingly generating four ARMv7-A architecture 32-bit protobuf dynamic link libraries;
the protobuf dynamic link libraries can be directly called by a C++ program;
step 7), the lower computer judges the image characteristics:
the lower computer main program loads the protobuf dynamic link libraries and processes the protective lens images acquired in real time; the size of each lens image acquired in real time is unified so that the resolution of the image finally analyzed is consistent with the images in the offline gallery;
loading the protection lens image acquired in real time to a lower computer main program, and calling a corresponding protobuf dynamic link library by the main program; the lower computer main program analyzes the input image, and outputs a corresponding detection result: presence/absence of a dead-spot image, presence/absence of a shadow image, yes/no blur image, yes/no offset image.
FIG. 2 shows the result of inspecting a laser welding lens with the existing method; in the figure, reference numeral 1 is the detection edge, 101 is a shadow area, and 102 is a circled dead spot. It can be seen that with the existing method the shadowed part of the image is identified as a dead spot, so the judgment is wrong.
FIG. 3 shows the result of inspecting the laser welding lens with the method of the present invention; reference numeral 2 is the detection edge, 201 is a shadow area, and 202 is a circled dead spot. It can be seen that with the method of the invention the shadowed part of the image is not identified as a dead spot, and the judgment is accurate.
The invention realizes the defect identification and detection of the lens based on deep learning, and the embodiment provides the detection process of the laser welding protective lens.

Claims (8)

1. A lens defect detection method based on deep learning is characterized in that: the method comprises the following steps:
step 1) storing not less than 1000 images of the lens to be detected as an offline gallery, inputting the offline gallery into an lmdb database, classifying each image of the lens to be detected according to image characteristics and adding a corresponding characteristic label according to a caffe model, and establishing a corresponding relation between the image of the lens to be detected and the characteristic label; the number of the types of the feature tags is recorded as n;
copying all the to-be-detected lens images with the same feature labels into the same sub-classification gallery;
step 2) carrying out averaging operation on all the images of the lens to be detected in each sub-classification gallery in sequence, wherein each sub-classification gallery correspondingly generates an image mean value binary file;
step 3) sequentially inputting the image mean binary files into an ARM-NN SDK toolkit according to a predefined Makefile compiling rule, wherein the ARM-NN SDK toolkit executes make compiling according to the Makefile compiling rule, and each image mean binary file correspondingly generates a group of intermediate files containing the compilable files;
step 4) according to configuration information of a prototxt file preset for an ARM-NN SDK toolkit, correspondingly generating a network structure for the intermediate file obtained in the step 3), wherein the network structure comprises a convolution layer, a pooling layer, an activation layer, a full connection layer, a softmax layer, a drop layer and an output layer which are connected according to a certain rule;
the number of intermediate files is n, the number of corresponding network structures is also n, and they are denoted DLi, i = 1, 2, 3 … n;
Step 5) inputting into each network structure DLi all the lens images to be detected from the sub-classification gallery corresponding to that network structure;
iterating the network structure according to the preset configuration information of the prototxt file, the total number of iterations S being 8000-20000, and reducing the learning rate every q iterations until the total number of iterations is completed;
after the iterations are completed, obtaining the accuracy of the iterated network structure DLi' according to the verification information in the softmax layer;
when the accuracy reaches 90%, storing the iterated network structure DLi' in a binary protobuf file; if the accuracy is less than 90%, acquiring the offline gallery again and returning to step 1);
iterating the n network structures respectively to obtain n binary protobuf files;
alternatively,
iterating the network structure according to the preset configuration information of the prototxt file, the total number of iterations S being 8000-20000; every q iterations, obtaining the accuracy of the current network structure according to the verification information in the softmax layer, reducing the learning rate and continuing the iteration; when the accuracy reaches 90%, ending the iteration and storing the parameters of the iterated network structure DLi' in a binary protobuf file; if the accuracy is still less than 90% when the total number of iterations has been completed, acquiring the offline gallery again and returning to step 1);
iterating the n network structures respectively to obtain n binary protobuf files;
step 6) loading the n binary protobuf files into an ARM-NN SDK toolkit, changing a software compiling tool g++ in the Makefile compiling rule into the g++ on a target microprocessor architecture, wherein the software compiling tool g++ is used for compiling files into executable programs;
starting cross compilation according to other Makefile compilation rules in the step 3), and correspondingly generating n protobuf dynamic link libraries;
the protobuf dynamic link library can be directly called by a C++ program;
step 7) loading the protobuf dynamic link libraries when the lower computer main program runs, processing the lens images to be detected that are acquired in real time, and outputting a detection result corresponding to the image features of the lens to be detected.
2. The lens defect detection method based on deep learning of claim 1, wherein: in the ARM-NN SDK toolkit, the Makefile compiling rule comprises the path of the project source code, the path for generated files and the software compiling tool g++; the target microprocessor architecture is designated as 'arm', the compilation toolchain version as 'arm-linux-gnueabihf-gcc', together with the source files required to compile protobuf, the Linux kernel version and the link libraries.
3. The lens defect detection method based on deep learning of claim 1, wherein: the prototxt file is used for configuring model parameters;
the configuration information of the prototxt file comprises:
setting the number of layers and connection relations of the convolutional layer, the pooling layer, the activation layer, the full connection layer, the softmax layer, the drop layer and the output layer in each network structure in the step 4);
setting a total number of iterations S, a number of iterations q, and each learning rate for each network structure in step 5).
4. The lens defect detection method based on deep learning of claim 3, wherein:
the settings of each network structure in the step 5) in the prototxt file are as follows: the total number of iterations S is 10000, the number of times q is 1000, and the setting of each learning rate: the selection range of the initial value of the learning rate is 0.01-0.8, and the learning rate is reduced to 50% of the previous value every iteration q times.
5. The lens defect detection method based on deep learning of claim 1, wherein: the lens is a laser welding protective lens; 4 feature labels are set in step 1), including: presence/absence of dead spots, presence/absence of shadows, yes/no blurring, yes/no offset;
classifying the offline gallery into 4 sub-classification galleries according to 4 different feature labels;
correspondingly generating 4 network structures in step 4), recorded respectively as: network structure DL1 for judging presence/absence of dead spots, network structure DL2 for judging presence/absence of shadows, network structure DL3 for judging yes/no blur, and network structure DL4 for judging yes/no offset.
6. The lens defect detection method based on deep learning of claim 5, wherein:
the prototxt file is configured for each network structure in step 4) as follows:
the network structure DL1 for judging presence/absence of dead spots has 16 layers in total, comprising 1 input layer, 3 convolution layers, 3 pooling layers, 4 activation layers, 2 full connection layers, 1 softmax layer, 1 drop layer and 1 output layer, connected in the order: input layer-convolution layer-activation layer-pooling layer-convolution layer-activation layer-pooling layer-convolution layer-activation layer-pooling layer-full connection layer-activation layer-drop layer-full connection layer-softmax layer-output layer;
the network structure DL2 for judging presence/absence of shadows has 16 layers in total, comprising 1 input layer, 3 convolution layers, 3 pooling layers, 4 activation layers, 2 full connection layers, 1 softmax layer, 1 drop layer and 1 output layer, connected in the order: input layer-convolution layer-activation layer-pooling layer-convolution layer-activation layer-pooling layer-convolution layer-activation layer-pooling layer-full connection layer-activation layer-drop layer-full connection layer-softmax layer-output layer;
the network structure DL3 for judging yes/no blur has 13 layers in total, comprising 1 input layer, 2 convolution layers, 2 pooling layers, 3 activation layers, 2 full connection layers, 1 softmax layer, 1 drop layer and 1 output layer, connected in the order: input layer-convolution layer-activation layer-pooling layer-convolution layer-activation layer-pooling layer-full connection layer-activation layer-drop layer-full connection layer-softmax layer-output layer;
the network structure DL4 for judging yes/no offset has 13 layers in total, comprising 1 input layer, 2 convolution layers, 2 pooling layers, 3 activation layers, 2 full connection layers, 1 softmax layer, 1 drop layer and 1 output layer, connected in the order: input layer-convolution layer-activation layer-pooling layer-convolution layer-activation layer-pooling layer-full connection layer-activation layer-drop layer-full connection layer-softmax layer-output layer.
7. The method for detecting lens defects based on deep learning as claimed in claim 5 or 6, wherein: in step 7), the size of the lens image acquired in real time is first unified, so that the resolution of the real-time image that is finally analyzed is consistent with the images in the offline gallery;
loading the real-time image to be detected to a lower computer main program, and calling a corresponding protobuf dynamic link library by the main program;
the detection result finally output in the step 7) is as follows: presence/absence of a dead-spot image, presence/absence of a shadow image, yes/no blur image, yes/no offset image.
8. The lens defect detection method based on deep learning of claim 5, wherein: the pictures in the off-line gallery are images of laser welding protection lenses collected by a camera in the process that the robot drives the welding gun to carry out laser welding.
CN201811533354.7A 2018-12-14 2018-12-14 Lens defect detection method based on deep learning Active CN109636792B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811533354.7A CN109636792B (en) 2018-12-14 2018-12-14 Lens defect detection method based on deep learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811533354.7A CN109636792B (en) 2018-12-14 2018-12-14 Lens defect detection method based on deep learning

Publications (2)

Publication Number Publication Date
CN109636792A CN109636792A (en) 2019-04-16
CN109636792B true CN109636792B (en) 2020-05-22

Family

ID=66073994

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811533354.7A Active CN109636792B (en) 2018-12-14 2018-12-14 Lens defect detection method based on deep learning

Country Status (1)

Country Link
CN (1) CN109636792B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2017040600A (en) * 2015-08-21 2017-02-23 キヤノン株式会社 Inspection method, inspection device, image processor, program and record medium
CN107103235A (en) * 2017-02-27 2017-08-29 广东工业大学 A kind of Android malware detection method based on convolutional neural networks
CN107292333A (en) * 2017-06-05 2017-10-24 浙江工业大学 A kind of rapid image categorization method based on deep learning
CN107643296A (en) * 2017-07-21 2018-01-30 易思维(天津)科技有限公司 Defect detection method and device for laser welding protective lens on automobile production line
CN108631727A (en) * 2018-03-26 2018-10-09 河北工业大学 A kind of solar panel defect identification method based on convolutional neural networks

Also Published As

Publication number Publication date
CN109636792A (en) 2019-04-16

Similar Documents

Publication Publication Date Title
CN111179251B (en) Defect detection system and method based on twin neural network and by utilizing template comparison
CN110648305B (en) Industrial image detection method, system and computer readable recording medium
CN1323369C (en) Image recognition apparatus and image recognition method, and teaching apparatus and teaching method of the image recognition apparatus
CN109840900A (en) A kind of line detection system for failure and detection method applied to intelligence manufacture workshop
CN111489352A (en) Tunnel gap detection and measurement method and device based on digital image processing
CN111858356A (en) UI automatic testing method based on image recognition technology
WO2024002187A1 (en) Defect detection method, defect detection device, and storage medium
CN111626995B (en) Intelligent insert detection method and device for workpiece
Liu et al. An approach for auto bridge inspection based on climbing robot
CN112199295B (en) Spectrum-based deep neural network defect positioning method and system
CN109636792B (en) Lens defect detection method based on deep learning
CN115719326A (en) PCB defect detection method and device
TWI822968B (en) Color filter inspection device, inspection device, color filter inspection method, and inspection method
CN115753791B (en) Defect detection method, device and system based on machine vision
JP4814116B2 (en) Mounting board appearance inspection method
US10241000B2 (en) Method for checking the position of characteristic points in light distributions
WO2023053029A1 (en) Method for identifying and characterizing, by means of artificial intelligence, surface defects on an object and cracks on brake discs subjected to fatigue tests
Ivanovska et al. Visual Inspection and Error Detection in a Reconfigurable Robot Workcell: An Automotive Light Assembly Example.
JP7380332B2 (en) Image processing device, control method and program for the image processing device
Kefer et al. An intelligent robot for flexible quality inspection
CN104677906A (en) Image information detecting method
KR20230063742A (en) Method for detecting defect of product using hierarchical CNN in smart factory, and recording medium thereof
Nahar et al. Computer aided system for inspection of assembled PCB
CN113689495A (en) Hole center detection method based on deep learning and hole center detection device thereof
CN112508925A (en) Electronic lock panel quality detection method, system, computer device and storage medium

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant
CP01 Change in the name or title of a patent holder

Address after: Room 495, building 3, 1197 Bin'an Road, Binjiang District, Hangzhou City, Zhejiang Province 310051

Patentee after: Yi Si Si (Hangzhou) Technology Co.,Ltd.

Address before: Room 495, building 3, 1197 Bin'an Road, Binjiang District, Hangzhou City, Zhejiang Province 310051

Patentee before: ISVISION (HANGZHOU) TECHNOLOGY Co.,Ltd.

CP01 Change in the name or title of a patent holder