CN113421230B - Visual detection method for defects of vehicle-mounted liquid crystal display light guide plate based on target detection network


Info

Publication number
CN113421230B
CN113421230B (application CN202110635854.7A)
Authority
CN
China
Prior art keywords
network
feature
image
sub
light guide
Prior art date
Legal status
Active
Application number
CN202110635854.7A
Other languages
Chinese (zh)
Other versions
CN113421230A (en)
Inventor
李俊峰
王昊
杨元勋
周栋峰
Current Assignee
Hubei Yuejia Intelligent Technology Co ltd
Original Assignee
Zhejiang Sci Tech University ZSTU
Priority date
Filing date
Publication date
Application filed by Zhejiang Sci Tech University ZSTU filed Critical Zhejiang Sci Tech University ZSTU
Priority to CN202110635854.7A priority Critical patent/CN113421230B/en
Publication of CN113421230A publication Critical patent/CN113421230A/en
Application granted granted Critical
Publication of CN113421230B publication Critical patent/CN113421230B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/0002 Inspection of images, e.g. flaw detection
    • G06T 7/0004 Industrial image inspection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/24 Classification techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/25 Fusion techniques
    • G06F 18/253 Fusion techniques of extracted features
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods
    • G06N 3/084 Backpropagation, e.g. using gradient descent
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/10 Segmentation; Edge detection
    • G06T 7/11 Region-based segmentation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/10 Segmentation; Edge detection
    • G06T 7/136 Segmentation; Edge detection involving thresholding
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/10 Segmentation; Edge detection
    • G06T 7/194 Segmentation; Edge detection involving foreground-background segmentation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/70 Determining position or orientation of objects or cameras
    • G06T 7/73 Determining position or orientation of objects or cameras using feature-based methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/10 Image acquisition modality
    • G06T 2207/10004 Still image; Photographic image
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/30 Subject of image; Context of image processing
    • G06T 2207/30108 Industrial image inspection
    • G06T 2207/30121 CRT, LCD or plasma display


Abstract

The invention relates to the technical field of image recognition and discloses a visual detection method for defects of a vehicle-mounted liquid crystal display light guide plate based on a target detection network, comprising the following steps: acquiring an image of the light guide plate; preprocessing the image; establishing and training a one-stage target detection network comprising a trunk feature extraction sub-network, a feature fusion sub-network and a classification and regression sub-network; and detecting defects: four effective feature layers are obtained by feature fusion through the image pyramid of the feature fusion sub-network and passed to the classification and regression sub-network to obtain the light guide plate defect prediction results, which are displayed on the host computer. The invention addresses the imbalance of positive and negative samples, the detection rate of tiny defects and the detection efficiency, completes defect localization and defect classification at the same time, provides a visualized result, and achieves industrial application.

Description

Visual detection method for defects of vehicle-mounted liquid crystal display light guide plate based on target detection network
Technical Field
The invention relates to the technical field of image recognition, in particular to a visual detection method for defects of a vehicle-mounted liquid crystal display light guide plate based on a target detection network.
Background
A light guide plate is made from an optical acrylic or PC sheet on whose bottom surface light guide points are printed, using highly reflective, non-light-absorbing materials, by laser engraving, V-shaped cross-grid engraving or UV screen printing. The light guide plate is an important component of the backlight module of a liquid crystal display, and defects such as bright spots, scratches and crush injuries are unavoidable during its production and directly affect the display effect. By shape, defects fall into two main categories: point defects and line defects. Point defects are punctiform flaws formed inside the light guide plate, mainly bright spots and crush injuries; a bright-spot defect appears when, during plasticizing, the plastic raw material cannot melt completely because the temperature is too low, or when heavy dust around the forming machine or unclean raw material introduces white impurities. Line defects are linear flaws formed on the surface of the light guide plate, mainly scratch marks; they arise chiefly from scratches on the mold core surface, or from unclean surfaces in contact with the light guide plate, such as a polishing machine or cleaning rollers, rubbing against the plate as it moves.
At present, domestic light guide plate defect detection is done mainly by hand: under the lighting of an inspection jig, the light guide plate is lit up and inspectors visually check whether bright spots, scratches or other defects appear at one or more places on the plate, thereby judging whether it is defective. The limitations of manual defect detection are obvious: (1) the working environment is poor, and facing the lit light guide plate for long periods seriously harms the workers' eyesight; (2) detection relies on human-eye judgment and recognition, so subjective factors intervene and a quantifiable quality standard is hard to form; (3) manual work is easily disturbed by factors such as the external environment and eye fatigue, which affects the actual detection efficiency and accuracy to a certain extent; (4) light guide plate inspection is complex and difficult and the defect types are numerous, so the relevant inspection skills are hard for workers to master. Because of these limitations, the precision, efficiency and stability of manual bright-spot detection can hardly meet enterprise requirements. Vehicle navigation light guide plates demand high quality-inspection precision: defects larger than 10 μm must be detected, a requirement that an ordinary industrial area-array camera can hardly satisfy. A 16K line-scan camera is therefore used to capture a clear image of the vehicle navigation light guide plate, the acquired high-resolution image measuring 10084 × 14500 pixels. On the industrial site, enterprises require the defect detection of one light guide plate to finish within 6 seconds, which also places high demands on detection efficiency. Existing deep-learning-based defect detection methods cannot effectively detect the smaller point defects and shallower line defects in vehicle navigation light guide plates and suffer from serious false and missed detections, so they hardly meet the accuracy requirements of industrial inspection.
Disclosure of Invention
The invention aims to solve the technical problem of providing a visual detection method for defects of a vehicle-mounted liquid crystal display light guide plate based on a target detection network, which automatically detects and classifies light guide plate defects, addresses the imbalance of positive and negative samples, the detection rate of tiny defects and the detection efficiency, completes defect localization and defect classification at the same time, provides a visualized result, and achieves industrial application.
In order to solve the technical problems, the invention provides a visual detection method for defects of a vehicle-mounted liquid crystal display light guide plate based on a target detection network, which comprises the following steps:
S01, collecting an image of the light guide plate: acquiring an image of the light guide plate with a 16K line-scan camera and transmitting it to the host computer for processing;
S02, image preprocessing: obtaining the image of the light guide plate ROI region by threshold segmentation, and then cutting the ROI image into a group of small images of size H × W × 1, adjacent small images overlapping by about 1/10 of the image width;
S03, establishing and training a one-stage target detection network, wherein the one-stage target detection network comprises a trunk feature extraction sub-network, a feature fusion sub-network and a classification and regression sub-network;
S04, defect detection:
inputting the H × W × 1 small images produced by the S02 image preprocessing into the one-stage target detection network trained in S03; the feature extraction sub-network extracts four multi-scale feature maps from each H × W × 1 small image, the image pyramid of the feature fusion sub-network produces four effective feature layers by feature fusion, and the four effective feature layers are passed to the classification and regression sub-network to obtain the light guide plate defect prediction results, which are displayed on the host computer.
As an improvement of the visual detection method for defects of the vehicle-mounted liquid crystal display light guide plate based on the target detection network:
the trunk feature extraction sub-network comprises a batch residual network ResNeXt50 with 5 convolution layers as the baseline network, in which the 1×1 convolution in the lower half of each ResNeXt_block of the convolution layers Conv2, Conv3, Conv4 and Conv5 is replaced by a Ghost_Module; the input of the trunk feature extraction sub-network is the H × W × 1 small image obtained in S02, and its outputs are the multi-scale feature maps P1, P2, P3 and P4 produced by the convolution layers Conv2, Conv3, Conv4 and Conv5 respectively;
the inputs of the feature fusion sub-network are the multi-scale feature maps P1, P2, P3 and P4; their channels are first adjusted to p1_in, p2_in, p3_in and p4_in by a 1×1 convolution module, and then the four effective feature layers p1_out, p2_out, p3_out and p4_out are obtained through the feature pyramid by feature fusion;
the classification and regression sub-network comprises four class+box subnet structures, each comprising a class subnet and a box subnet; the class subnet consists of 4 convolutions with 256 channels followed by 1 convolution with num_priority × num_classes channels, and the box subnet consists of 4 convolutions with 256 channels followed by 1 convolution with num_priority × 4 channels; the four effective feature layers p1_out, p2_out, p3_out and p4_out are each passed through one class+box subnet structure, and finally each class+box subnet structure outputs a prediction result: the target position information and the type corresponding to each prediction box at each grid point of the effective feature layer.
As a further improvement of the visual detection method for defects of the vehicle-mounted liquid crystal display light guide plate based on the target detection network:
training the one-stage target detection network comprises:
S0301, building a training data set: collecting images of light guide plates and establishing a data set; manually labeling the images in the data set, framing each defect with a rectangular box and entering the corresponding defect name; then applying data enhancement to the data-set images with background replacement and brightness transformation, and dividing the images into a training set, a validation set and a test set in a 7:2:1 ratio;
S0302, establishing the loss function Focal loss:
FL(p_t) = -α_t · (1 - p_t)^γ · log(p_t)

where p_t is the probability that the prediction is correct, α_t is a weighting factor set to 0.25, and γ is a tunable focusing parameter set to 2;
S0303, training
The optimizer is Adam with a learning rate of 1×10⁻⁵. The training-set images established in S0301 are input into the one-stage target detection network for training; the loss of the current epoch is computed from the network's output predictions and the manual labels via the Focal loss function, and the network parameters are adjusted by back-propagation and gradient descent so that the training-set loss keeps decreasing. If the validation-set loss of the current training epoch is lower than that of the previous epochs, the model of the current epoch is saved; the one-stage target detection network is trained for 200 epochs in total. The network is then tested on the test set, giving the trained one-stage target detection network.
As a further improvement of the visual detection method for defects of the vehicle-mounted liquid crystal display light guide plate based on the target detection network:
each convolution operation of the trunk feature extraction sub-network is followed by batch normalization BN and a ReLU activation function. BN is defined as follows: the batch mean μ is subtracted from each input pixel x_i and the result is divided by the standard deviation √(σ² + ε) to obtain the normalized value x̂_i, which is then scaled and shifted to give the batch-normalized value y_i:

x̂_i = (x_i − μ) / √(σ² + ε),  y_i = γ · x̂_i + β

where μ and σ² are the mean and variance of the current batch of size n; ε is a small fixed value; γ and β are parameters learned by the network;

the activation function ReLU is defined as ReLU(x) = max(0, x).
the beneficial effects of the invention are mainly as follows:
1. The trunk feature extraction sub-network is improved and optimized on the basis of the ResNeXt50 network: a Ghost_Module replaces the 1×1 convolution in the lower half of each ResNeXt_block, which reduces the number of parameters and the resource consumption and accelerates training and inference;
2. Light guide plate images contain regions of differing light-guide-point density; traditional detection algorithms usually need to partition the image and apply a different processing strategy to each partition. The invention needs no such partitioning, because the one-stage target detection network handles the density differences by learning;
3. The feature fusion sub-network is improved so that shallow and deep semantic information inside the sub-network is fused effectively, further improving the detection capability for small target defects.
4. The detection algorithm of the invention has strong universality and stability, reduces misjudgment and missed detection, and improves detection precision.
Drawings
The following describes the embodiments of the present invention in further detail with reference to the accompanying drawings.
FIG. 1 is a schematic flow chart of a visual detection method of defects of a vehicle-mounted liquid crystal display light guide plate based on a target detection network;
FIG. 2 is a schematic diagram of a one-stage object detection network according to the present invention;
FIG. 3 is a schematic diagram of a backbone feature extraction sub-network of the one-stage object detection network of FIG. 2;
FIG. 4 is a diagram illustrating an exemplary structure of a feature fusion subnetwork of the one-phase object detection network of FIG. 2;
FIG. 5 is a schematic diagram of the classification and regression sub-network of the one-stage object detection network of FIG. 2;
FIG. 6 is a schematic diagram of the detection result of comparative experiment 1.
Detailed Description
The invention will be further described with reference to the following specific examples, but the scope of the invention is not limited thereto:
Embodiment 1: a visual detection method for defects of a vehicle-mounted liquid crystal display light guide plate based on a target detection network, as shown in Figs. 1-5, comprising the following steps:
Step 1, collecting an image of the light guide plate
A light guide plate image acquisition device is arranged at the end of the vehicle navigation light guide plate production line; it acquires images with a 16K line-scan camera and transmits the acquired light guide plate images to the host computer for processing;
Step 2, image preprocessing
The acquired light guide plate image contains background regions. The background is removed with a threshold segmentation technique to obtain the image of the light guide plate ROI, which further improves detection efficiency;
the extracted ROI image is then cut into a group of small images of size H × W × 1, adjacent small images overlapping by about 1/10 of the image width; the overlap preserves the integrity of defect edges so that no defect is split apart by the cutting. Here H = 600 and W = 600, and every H and W appearing below have the same values; this group of H × W × 1 small images is the input of the subsequent steps;
Step 3, establishing the one-stage target detection network
The one-stage target detection network comprises a trunk feature extraction sub-network, a feature fusion sub-network and a classification and regression sub-network;
Step 3.1, establishing the trunk feature extraction sub-network
The trunk feature extraction sub-network takes the batch residual network ResNeXt50 as its baseline network and comprises 5 sequentially connected convolution layers Conv1, Conv2, Conv3, Conv4 and Conv5. The input of the sub-network, i.e. the input of Conv1, is the H × W × 1 small image obtained in step 2, and each of Conv2-Conv5 takes the output of the previous layer as its input. The outputs of the sub-network are the outputs of Conv2, Conv3, Conv4 and Conv5, namely the multi-scale feature maps P1, P2, P3 and P4 respectively, which simultaneously serve as the inputs of the feature fusion sub-network (IFPN).
Replacing the 1×1 convolution in the lower half of each ResNeXt_block of the convolution layers Conv2, Conv3, Conv4 and Conv5 in the baseline ResNeXt50 network with a Ghost_Module yields the trunk feature extraction sub-network. The improved ResNeXt50 network reduces the number of parameters and the resource consumption and accelerates training and inference; its layer structure is shown in Table 1;
table 1: backbone feature extraction sub-network hierarchical structure table
The convolution layer Conv1 is a 7×7 convolution kernel with 128 channels, followed by a max-pooling layer with a stride of 3 and a 3×3 pooling size. The convolution layers Conv2, Conv3, Conv4 and Conv5 all share a similar structure. Taking Conv2 as an example, as shown in Fig. 3, its input is the output of Conv1; it connects in sequence a 128-channel 1×1 convolution kernel, 32 parallel groups of stacked 128-channel 3×3 convolution layers, and a Ghost_Module standing in for the 256-channel 1×1 convolution, and it outputs the multi-scale feature map P1;
the convolution layers Conv2, conv3, conv4 and Conv5 in table 1 contain convolutions of 1X1, and the convolution calculation amount is as follows:
computation=n×h×w×c×k×k
wherein: n is the number of channels for outputting the feature map, h and w are the height and width of the feature map respectively, c is the number of input channels, and k is the convolution kernel size;
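As a quick numeric check of the formula (all values are illustrative, not taken from Table 1):

```python
# computation = n * h * w * c * k * k for a single 1x1 convolution
n, h, w, c, k = 256, 150, 150, 128, 1
print(n * h * w * c * k * k)  # 737280000 multiply-accumulate operations
```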
Ghost_Module defines init_channels as its final number of output channels, output(1) as the feature maps produced by the pre-convolution, output(2) as the feature maps produced by the cheap linear operation, and ratio as the ratio between the final channel count and the number of output(1) channels. Ghost_Module is implemented in two steps, as follows:
(1) perform a pre-convolution whose number of output channels is output_channels/ratio, where ratio = 2;
(2) perform the simple linear transformation cheap_operation: each feature map of the pre-convolution module is convolved once more, with the group number equal to the number of pre-convolution feature maps and the number of output channels equal to new_channels; the channels of output(1) and output(2) added together give the final channel count init_channels of the Ghost_Module;
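A minimal PyTorch sketch of such a two-step Ghost_Module follows; ratio = 2 comes from the description above, while the 3×3 cheap-operation kernel and the module layout are assumptions borrowed from the GhostNet design rather than from the patent text:

```python
import torch
import torch.nn as nn

class GhostModule(nn.Module):
    """Two-step Ghost_Module sketch: a pre-convolution produces output(1),
    a grouped 'cheap operation' produces output(2), and their concatenation
    gives the final init_channels output channels."""
    def __init__(self, in_channels, out_channels, ratio=2, dw_kernel=3):
        super().__init__()
        primary_channels = out_channels // ratio            # output(1) channels
        cheap_channels = out_channels - primary_channels    # output(2) channels
        self.primary_conv = nn.Sequential(
            nn.Conv2d(in_channels, primary_channels, 1, bias=False),
            nn.BatchNorm2d(primary_channels), nn.ReLU(inplace=True))
        # one convolution per pre-convolution feature map: groups = primary_channels
        self.cheap_operation = nn.Sequential(
            nn.Conv2d(primary_channels, cheap_channels, dw_kernel,
                      padding=dw_kernel // 2, groups=primary_channels, bias=False),
            nn.BatchNorm2d(cheap_channels), nn.ReLU(inplace=True))

    def forward(self, x):
        out1 = self.primary_conv(x)
        out2 = self.cheap_operation(out1)
        return torch.cat([out1, out2], dim=1)  # init_channels in total
```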
the backbone feature extraction sub-network operates using batch normalization BN (Batch Normalization) to input pixel point x i Subtracting the mean mu and then dividing by the mean square errorObtaining a normalized value x i Then scale transformation and offset are carried out to obtain a value y after batch normalization i Wherein:
n is the batch size; epsilon is a fixed value in order to prevent divide-by-0 errors; gamma and beta are parameters learned by a network, normalization and activation functions (BN+ReLU) are required to be carried out after each convolution operation in the trunk feature extraction sub-network, and one activation function is next to the default BN operation, so that the regularization of the network is facilitated; the activation function employs a ReLU activation function, wherein:
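In PyTorch the convolution-BN-ReLU pattern described above is conveniently wrapped once and reused throughout the backbone; a small sketch (the helper name is an assumption):

```python
import torch.nn as nn

def conv_bn_relu(in_ch, out_ch, k, stride=1):
    """Convolution followed by BN and ReLU, the pattern applied after every
    convolution in the backbone; nn.BatchNorm2d realises the
    (x - mu) / sqrt(var + eps) normalisation with learned gamma and beta."""
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, k, stride=stride, padding=k // 2, bias=False),
        nn.BatchNorm2d(out_ch),  # eps is the small fixed value preventing division by zero
        nn.ReLU(inplace=True),
    )
```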
Step 3.2, constructing the feature fusion sub-network
The operation of the feature fusion sub-network (IFPN) can be divided into three parts; four effective feature layers are obtained through the image pyramid by feature fusion, as shown in Fig. 4. The process is as follows:
(1) The multi-scale feature maps P1, P2, P3 and P4 output by the trunk feature extraction sub-network cannot be used directly as inputs of the feature fusion sub-network (IFPN); a 1×1 convolution module adjusts their channels to p1_in, p2_in, p3_in and p4_in respectively before they enter the feature pyramid. p4_in is upsampled and stacked with p3_in to obtain p3_td; p3_td is upsampled and stacked with p2_in to obtain p2_td; p2_td is likewise upsampled and stacked with p1_in to obtain p1_tg. This completes the first part of the feature fusion sub-network (IFPN);
(2) The multi-scale feature maps P1, P2, P3 and P4 are feature layers rich in information obtained from the trunk feature extraction sub-network; as the backbone convolutions proceed, their receptive fields grow while the fine granularity in the picture decreases. Compared with P3 and P4, the feature maps P1 and P2 have undergone fewer convolutions and retain richer fine granularity, so they are more sensitive to small point and line defects of the light guide plate, help complete localization quickly, and reduce missed detections. To make full use of this shallow semantic information, p1_tg is downsampled and stacked with p2_td and p2_in to obtain p2_tg, an attention mechanism being introduced during stacking to judge which channel information deserves more attention; p2_tg is downsampled once and stacked with p3_td to obtain p3_tg, and p3_tg is downsampled once and stacked with p4_in to obtain p4_out;
(3) For point defects and the shallower line defects, most features are extracted from p1_tg and p2_tg, yet p1_tg differs little from p1_in, so it is hard to obtain sufficiently rich feature information from it and the gain in detection accuracy is limited. A third feature fusion is therefore added: p4_out is upsampled and stacked with p3_tg to obtain p3_out; p3_out is upsampled once and stacked with p2_tg to obtain p2_out; p2_out is upsampled one step further and stacked with p1_tg and, through an attention mechanism, with p1_in to obtain p1_out. To eliminate the aliasing effect brought by feature fusion, during stacking the feature layers are added and then passed through a 3×3 depthwise separable convolution, a 1×1 convolution performing channel adjustment.
The feature layers p1_out, p2_out, p3_out and p4_out obtained through the feature pyramid of the feature fusion sub-network (IFPN) are called effective feature layers to distinguish them from ordinary feature layers.
Step 3.3, constructing the classification and regression sub-network
The classification and regression sub-network comprises four class+box subnet structures; the four effective feature layers p1_out, p2_out, p3_out and p4_out are each passed through one class+box subnet structure. Each class+box subnet structure comprises a class subnet and a box subnet. The class subnet applies 4 convolutions with 256 channels followed by 1 convolution with num_priority × num_classes channels; the num_priority × num_classes convolution predicts the type corresponding to each prediction box at each grid point of the effective feature layer, where num_priority is the number of prior boxes owned by the effective feature layer and num_classes is the number of target types the network detects in total. The box subnet applies 4 convolutions with 256 channels followed by 1 convolution with num_priority × 4 channels, where 4 refers to the adjustment of a prior box; the num_priority × 4 convolution predicts the change of each prior box at each grid point of the effective feature layer. Finally, each class+box subnet structure outputs a prediction result: the change of each prior box at each grid point of each effective feature layer, i.e. the target position information and the type corresponding to each prediction box at each grid point of the effective feature layer.
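The class+box subnet structure can be sketched as follows; the 3×3 kernel of the head convolutions and the values num_priority = 9 and num_classes = 3 (point, line, surface) are illustrative assumptions, only the 4 × 256-channel convolutions plus one predicting convolution being fixed by the description:

```python
import torch.nn as nn

def _head(in_ch, out_ch):
    # 4 stacked 256-channel convolutions followed by one predicting convolution
    layers, ch = [], in_ch
    for _ in range(4):
        layers += [nn.Conv2d(ch, 256, 3, padding=1), nn.ReLU(inplace=True)]
        ch = 256
    layers.append(nn.Conv2d(256, out_ch, 3, padding=1))
    return nn.Sequential(*layers)

class ClassBoxSubnet(nn.Module):
    """One class+box subnet structure; the same structure is applied to
    each of the four effective feature layers p1_out..p4_out."""
    def __init__(self, in_ch=256, num_priority=9, num_classes=3):
        super().__init__()
        self.class_subnet = _head(in_ch, num_priority * num_classes)  # type per prior per grid point
        self.box_subnet = _head(in_ch, num_priority * 4)              # 4 adjustments per prior per grid point

    def forward(self, feat):
        return self.class_subnet(feat), self.box_subnet(feat)
```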
Step 4, training and testing the one-stage target detection network
Step 4.1, building the training data set
At the end of the vehicle navigation light guide plate production line, light guide plate images from the industrial site are collected by the light guide plate image acquisition device, which comprises a 16K line-scan camera, and a data set is established. The images in the data set are labeled manually, each defect being framed with a rectangular box and the corresponding defect name entered. Data enhancement with background replacement and brightness transformation then yields 2897 images with point defects, 2936 with line defects and 2856 with surface defects, which are divided into a training set, a validation set and a test set in a 7:2:1 ratio: 6082 images in the training set, 1737 in the validation set and 870 in the test set, each set containing light guide plate images with point defects, line defects and surface defects as evenly as possible;
step 4.2, establishing a loss function
To address the extreme imbalance between positive and negative samples in light guide plate defect detection, the Focal loss function, which balances positive and negative samples, is adopted:

FL(p_t) = -α_t · (1 - p_t)^γ · log(p_t)

where p_t is the probability that the prediction is correct, α_t is a weighting factor set to 0.25, and γ is a tunable focusing parameter set to 2;
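A direct transcription of this loss, with a toy evaluation showing how an easy example (p_t = 0.9) is down-weighted relative to a hard one (p_t = 0.1):

```python
import torch

def focal_loss(p_t, alpha_t=0.25, gamma=2.0):
    """FL(p_t) = -alpha_t * (1 - p_t)^gamma * log(p_t), with p_t the predicted
    probability of the true class; a bare sketch, without the per-anchor
    bookkeeping a full detector needs."""
    return -alpha_t * (1 - p_t) ** gamma * torch.log(p_t)

print(focal_loss(torch.tensor([0.9, 0.1])))  # roughly 0.0003 and 0.4663
```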
Step 4.3, training the one-stage target detection network
The network is built, trained and tested with PyTorch 1.2. The optimizer is Adam with a learning rate of 1×10⁻⁵. The training-set images built in step 4.1 are input into the one-stage target detection network established in step 3 for training; the loss of the current epoch is computed from the network's output predictions and the manual labels via the Focal loss function, and the network parameters are adjusted by back-propagation and gradient descent so that the training-set loss keeps decreasing. If the validation-set loss of the current training epoch is lower than that of the previous epochs, the model of the current epoch is saved; the one-stage target detection network is trained for 200 epochs in total.
After training, the parameters of the one-stage target detection network are saved and exported as a configuration file, giving the trained one-stage target detection network.
Step 4.4, off-line testing
The test set is input into the one-stage target detection network trained in step 4.3 for defect detection. The feature extraction sub-network extracts 4 multi-scale feature maps from each input H × W × 1 image; the feature fusion sub-network produces 4 effective feature layers, which are passed to the classification and regression sub-network to obtain 4 prediction results: the change of each prior box at each grid point of each effective feature layer, i.e. the target position information and the type corresponding to each prediction box at each grid point of the effective feature layer. The prediction results are output and saved on the host computer of the light guide plate production line;
the Accuracy of the prediction result is measured by adopting an average Accuracy AP, mAP (mean Average Precision) and an Accuracy Accuracy:
1) The AP reflects the average precision for each kind of defect; it is expressed as the definite integral of precision over recall:

AP(x) = ∫₀¹ precision_rate d(recall_rate)

where x denotes a defect of a certain kind,
precision_rate refers to the precision: precision_rate = TP / (TP + FP),
recall_rate refers to the recall: recall_rate = TP / (TP + FN),
where
TP (True Positive), the number of correct defect detection results, refers to positive samples predicted as positive;
FP (False Positive), false detection, refers to negative samples predicted as positive, i.e. the number of such erroneous detections;
FN (False Negative), missed detection, refers to positive samples predicted as negative, i.e. the number of defects not detected;
TN (True Negative), negative samples predicted as negative, has no practical meaning in this detection task;
2) mAP is the mean of all AP values; as an overall performance evaluation index of the model, it reflects the average detection precision over all categories:

mAP = (1 / |X|) · Σ_{x∈X} AP(x)

where X is the set of all defect types;
3) Accuracy:

Accuracy = (TP + TN) / (TP + TN + FP + FN).
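The three metrics can be written down directly; the numerical integration of AP over sorted (recall, precision) points is one common approximation of the definite integral above, not necessarily the one used here:

```python
def precision_recall(tp, fp, fn):
    """precision_rate = TP/(TP+FP); recall_rate = TP/(TP+FN)."""
    return tp / (tp + fp), tp / (tp + fn)

def average_precision(precisions, recalls):
    """Area under the precision-recall curve, approximating
    AP(x) = integral over [0, 1] of precision d(recall)."""
    ap, prev_r = 0.0, 0.0
    for r, p in sorted(zip(recalls, precisions)):
        ap += p * (r - prev_r)
        prev_r = r
    return ap

def mean_average_precision(aps):
    """mAP = mean of the per-class AP values over the defect-type set X."""
    return sum(aps) / len(aps)

def accuracy(tp, tn, fp, fn):
    """Accuracy = (TP+TN)/(TP+TN+FP+FN)."""
    return (tp + tn) / (tp + tn + fp + fn)
```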
the test results are shown in table 2,
table 2 test results
The network structure provided by the invention has higher accuracy rate which reaches 98.6% and mAP reaches 96.7%, which proves that the network structure based on the one-stage target detection has higher accuracy rate and can obtain excellent effect on the task of detecting the defects of the light guide plate; as the training set, the verification set and the test set are all the light guide plate images actually obtained from the online production, the invention is also verified to be suitable for the detection of the online production.
Step 5, detecting defects with the trained one-stage target detection network and outputting the results
The H × W × 1 small images produced by the step 2 image preprocessing are fed to the feature extraction sub-network, which extracts the four multi-scale feature maps P1, P2, P3 and P4; the feature pyramid of the feature fusion sub-network then produces the four effective feature layers p1_out, p2_out, p3_out and p4_out by feature fusion, and these are passed through the class+box subnet structures of the classification and regression sub-network to obtain the light guide plate defect prediction results, which are displayed on the host computer.
Comparative experiment 1:
the method comprises the steps of adopting one light guide plate image with line defects, point defects and surface defects, wherein the light guide plate image with the line defects comprises one light guide plate image with deep line defects and one light guide plate image with shallow line defects, as shown in fig. 6, preprocessing the light guide plate image in an upper computer through the step 2 image described in the embodiment 1 to obtain H multiplied by W multiplied by 1 small images, respectively inputting the H multiplied by W multiplied by 1 small images into a Retinonet network and a one-stage target detection network of the invention, carrying out a comparison experiment of defect detection, and outputting a result to show that the one-stage target detection network of the invention recognizes all the defects, wherein the Retinonet network cannot recognize the shallow line defects and the point defects, so that the detection capability and the detection precision of the small target defects can be improved.
Finally, it should also be noted that the above list is merely a few specific embodiments of the present invention. Obviously, the invention is not limited to the above embodiments, but many variations are possible. All modifications directly derived or suggested to one skilled in the art from the present disclosure should be considered as being within the scope of the present invention.

Claims (3)

1. A visual detection method for defects of a vehicle-mounted liquid crystal display light guide plate based on a target detection network, characterized by comprising the following steps:
S01, collecting an image of the light guide plate: acquiring an image of the light guide plate with a 16K line-scan camera and transmitting it to the host computer for processing;
S02, image preprocessing: obtaining the image of the light guide plate ROI region by threshold segmentation, and then cutting the ROI image into a group of small images of size H × W × 1, adjacent small images overlapping by about 1/10 of the image width;
S03, establishing and training a one-stage target detection network, wherein the one-stage target detection network comprises a trunk feature extraction sub-network, a feature fusion sub-network and a classification and regression sub-network;
S04, defect detection:
inputting the H × W × 1 small images produced by the S02 image preprocessing into the one-stage target detection network trained in S03; the feature extraction sub-network extracts four multi-scale feature maps from each H × W × 1 small image, the image pyramid of the feature fusion sub-network produces four effective feature layers by feature fusion, and the four effective feature layers are passed to the classification and regression sub-network to obtain the light guide plate defect prediction results, which are displayed on the host computer;
the trunk feature extraction sub-network comprises a batch residual network ResNeXt50 with 5 convolution layers as the baseline network, in which the 1×1 convolution in the lower half of each ResNeXt_block of the convolution layers Conv2, Conv3, Conv4 and Conv5 is replaced by a Ghost_Module; the input of the trunk feature extraction sub-network is the H × W × 1 small image obtained in S02, and its outputs are the multi-scale feature maps P1, P2, P3 and P4 produced by the convolution layers Conv2, Conv3, Conv4 and Conv5 respectively;
the inputs of the feature fusion sub-network are the multi-scale feature maps P1, P2, P3 and P4; their channels are first adjusted to p1_in, p2_in, p3_in and p4_in by a 1×1 convolution module, and then the four effective feature layers p1_out, p2_out, p3_out and p4_out are obtained through the feature pyramid by feature fusion;
the trunk feature extraction sub-network outputs the multi-scale feature maps P1, P2, P3 and P4, whose channels are adjusted by a 1×1 convolution module to p1_in, p2_in, p3_in and p4_in respectively before entering the feature pyramid; after upsampling, p4_in is stacked with p3_in to obtain p3_td; after upsampling, p3_td is stacked with p2_in to obtain p2_td; after upsampling, p2_td is likewise stacked with p1_in to obtain p1_tg, completing the first part of the feature fusion sub-network IFPN;
after downsampling, p1_tg is stacked with p2_td and p2_in to obtain p2_tg, an attention mechanism being introduced during stacking; p2_tg is downsampled once and stacked with p3_td to obtain p3_tg, and p3_tg is downsampled once and stacked with p4_in to obtain p4_out;
p4_out is upsampled and stacked with p3_tg to obtain p3_out; p3_out is upsampled once and stacked with p2_tg to obtain p2_out; p2_out is upsampled one step further and stacked with p1_tg and, through an attention mechanism, with p1_in to obtain p1_out; during stacking, the feature layers are added and then passed through a 3×3 depthwise separable convolution, a 1×1 convolution performing channel adjustment;
the classification and regression sub-network comprises four class+box subnet structures, each comprising a class subnet and a box subnet; the class subnet consists of 4 convolutions with 256 channels followed by 1 convolution with num_priority × num_classes channels, the num_priority × num_classes convolution predicting the type corresponding to each prediction box at each grid point of the effective feature layer, where num_priority is the number of prior boxes owned by the effective feature layer and num_classes is the number of target types the network detects in total;
the box subnet consists of 4 convolutions with 256 channels followed by 1 convolution with num_priority × 4 channels, where 4 refers to the adjustment of a prior box, the num_priority × 4 convolution predicting the change of each prior box at each grid point of the effective feature layer; the four effective feature layers p1_out, p2_out, p3_out and p4_out are each passed through one class+box subnet structure, and finally each class+box subnet structure outputs a prediction result: the target position information and the type corresponding to each prediction box at each grid point of the effective feature layer.
2. The visual detection method for defects of the vehicle-mounted liquid crystal display light guide plate based on the target detection network according to claim 1, characterized in that:
training the one-stage target detection network comprises:
S0301, building a training data set: collecting images of light guide plates and establishing a data set; manually labeling the images in the data set, framing each defect with a rectangular box and entering the corresponding defect name; then applying data enhancement to the data-set images with background replacement and brightness transformation, and dividing the images into a training set, a validation set and a test set in a 7:2:1 ratio;
S0302, establishing the loss function Focal loss:
FL(p_t) = -α_t · (1 - p_t)^γ · log(p_t)

where p_t is the probability that the prediction is correct, α_t is a weighting factor set to 0.25, and γ is a tunable focusing parameter set to 2;
S0303, training
The optimizer is Adam with a learning rate of 1×10⁻⁵. The training-set images established in S0301 are input into the one-stage target detection network for training; the loss of the current epoch is computed from the network's output predictions and the manual labels via the Focal loss function, and the network parameters are adjusted by back-propagation and gradient descent so that the training-set loss keeps decreasing; if the validation-set loss of the current training epoch is lower than that of the previous epochs, the model of the current epoch is saved; the one-stage target detection network is trained for 200 epochs in total; the network is then tested on the test set, giving the trained one-stage target detection network.
3. The visual detection method for defects of the vehicle-mounted liquid crystal display light guide plate based on the target detection network according to claim 2, characterized in that:
each convolution operation of the trunk feature extraction sub-network is followed by batch normalization BN and a ReLU activation function, the BN being defined as follows: the batch mean μ is subtracted from each input pixel x_i and the result is divided by the standard deviation √(σ² + ε) to obtain the normalized value x̂_i, which is then scaled and shifted to give the batch-normalized value y_i:

x̂_i = (x_i − μ) / √(σ² + ε),  y_i = γ · x̂_i + β

where μ and σ² are the mean and variance of the current batch of size n; ε is a fixed value that prevents division-by-zero errors; γ and β are parameters learned by the network;

the activation function ReLU is defined as ReLU(x) = max(0, x).
CN202110635854.7A 2021-06-08 2021-06-08 Visual detection method for defects of vehicle-mounted liquid crystal display light guide plate based on target detection network Active CN113421230B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110635854.7A CN113421230B (en) 2021-06-08 2021-06-08 Visual detection method for defects of vehicle-mounted liquid crystal display light guide plate based on target detection network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110635854.7A CN113421230B (en) 2021-06-08 2021-06-08 Visual detection method for defects of vehicle-mounted liquid crystal display light guide plate based on target detection network

Publications (2)

Publication Number Publication Date
CN113421230A CN113421230A (en) 2021-09-21
CN113421230B (en) 2023-10-20

Family

ID=77788104

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110635854.7A Active CN113421230B (en) 2021-06-08 2021-06-08 Visual detection method for defects of vehicle-mounted liquid crystal display light guide plate based on target detection network

Country Status (1)

Country Link
CN (1) CN113421230B (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114612444B (en) * 2022-03-16 2023-02-03 南京航空航天大学 Fine defect analysis method based on progressive segmentation network
CN114612443B (en) * 2022-03-16 2022-11-22 南京航空航天大学 Multi-mode data complex defect feature detection method
CN115018934B (en) * 2022-07-05 2024-05-31 浙江大学 Stereoscopic image depth detection method combining cross skeleton window and image pyramid
CN115775236B (en) * 2022-11-24 2023-07-14 广东工业大学 Visual detection method and system for surface micro defects based on multi-scale feature fusion
CN116503382B (en) * 2023-05-25 2023-10-13 中导光电设备股份有限公司 Method and system for detecting scratch defects of display screen


Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111401419A (en) * 2020-03-05 2020-07-10 浙江理工大学桐乡研究院有限公司 Improved RetinaNet-based employee dressing specification detection method
CN112036236A (en) * 2020-07-22 2020-12-04 济南浪潮高新科技投资发展有限公司 GhostNet-based detection model training method, device and medium
CN112434723A (en) * 2020-07-23 2021-03-02 之江实验室 Day/night image classification and object detection method based on attention network
CN111951249A (en) * 2020-08-13 2020-11-17 浙江理工大学 Mobile phone light guide plate defect visual detection method based on multitask learning network
CN112163602A (en) * 2020-09-14 2021-01-01 湖北工业大学 Target detection method based on deep neural network

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
Focal Loss for Dense Object Detection; Tsung-Yi Lin et al.; arXiv:1708.02002v2; pp. 1-10 *
Image-to-Image Style Transfer Based on the Ghost Module; Yan Jiang et al.; Computers, Materials & Continua; pp. 4051-4067 *
Small-Object Detection in UAV-Captured Images via Multi-Branch Parallel Feature Pyramid Networks; Yingjie Liu et al.; IEEE Access; pp. 145740-145750 *
Research on defect detection of mobile phone mainboards based on RetinaNet; Ma Meirong et al.; Computer Engineering and Science; pp. 673-682 *
Faster R-CNN object detection fusing an improved FPN with an association network; Wang Changjian et al.; https://kns.cnki.net/kcms/detail/31.1289.TP.20210208.1426.002.html; pp. 1-14 *

Also Published As

Publication number Publication date
CN113421230A (en) 2021-09-21


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20240617

Address after: 230000 B-1015, wo Yuan Garden, 81 Ganquan Road, Shushan District, Hefei, Anhui.

Patentee after: HEFEI MINGLONG ELECTRONIC TECHNOLOGY Co.,Ltd.

Country or region after: China

Address before: 310018, No. 2, No. 5, Xiasha Higher Education Park, Hangzhou, Zhejiang

Patentee before: ZHEJIANG SCI-TECH University

Country or region before: China

TR01 Transfer of patent right

Effective date of registration: 20240913

Address after: 437400 Factory No.1, Phase I, Tongcheng County Economic Development Zone Industrial Park, Xianning City, Hubei Province (self declared)

Patentee after: Hubei Yuejia Intelligent Technology Co.,Ltd.

Country or region after: China

Address before: 230000 B-1015, wo Yuan Garden, 81 Ganquan Road, Shushan District, Hefei, Anhui.

Patentee before: HEFEI MINGLONG ELECTRONIC TECHNOLOGY Co.,Ltd.

Country or region before: China