CN115294033A

CN115294033A - Tire belt layer difference level and misalignment defect detection method based on semantic segmentation network

Info

Publication number: CN115294033A
Application number: CN202210855373.1A
Authority: CN
Inventors: 彭晨; 肖亮; 纪玉华; 杨朔; 张振; 伊鑫
Original assignee: University of Shanghai for Science and Technology
Current assignee: University of Shanghai for Science and Technology
Priority date: 2022-07-19
Filing date: 2022-07-19
Publication date: 2022-11-04

Abstract

The invention provides a method for detecting tire belt layer difference level and misalignment defects based on a semantic segmentation network. The method comprises: collecting radial tire images of the belt layer, toe and sidewall to pre-train an encoder network; segmenting the regions on the left and right sides of the crown belt layer to build training and test data sets; designing a decoder network that fuses the feature maps of all stages and outputs the belt layer region at pixel level; integrating the semantic segmentation algorithm into the defect detection software; obtaining the tire model through a code scanner and determining the coordinates of the tire belt layer from the ratio of pixel distance on the tire X-ray image to actual size; and judging, according to the actual production standard, whether either of the two defects occurs and how severe it is. The method can monitor belt layer defects in real time, and addresses two problems: traditional image processing methods cannot flexibly meet the detection requirements of complex and varied tire models, and manual inspection has a high missed-detection rate.

Description

Tire belt layer difference level and misalignment defect detection method based on semantic segmentation network

Technical Field

The invention relates to the field of automatic detection of tire quality, in particular to a tire belt layer difference level and misalignment defect detection method based on a semantic segmentation network.

Background

Tire quality affects not only the safety of vehicle occupants but also the lives and property of every road user. Quality inspection during tire production therefore keeps unqualified products from reaching consumers. Radial tires currently hold the main market share; tires of this type have several layers of steel cord applied to the tread, which greatly improves their strength and stability. The radial tire has a complicated structure and more production steps, which gives rise to many types of defects. To find internal defects, the most common nondestructive testing scheme is to place the tire on its side in an X-ray chamber for irradiation and to measure the resulting two-dimensional grayscale image against enterprise standards. Although many enterprises still rely on manual visual inspection, that process is seriously deficient in efficiency and accuracy. Defect detection based on computer vision is the development trend.

To meet the demand for intelligent detection, many researchers have studied tire X-ray images. Traditional defect detection methods require algorithm parameters to be set for each tire type and are particularly sensitive to the brightness of the grayscale image. How to design a tire defect detection scheme that reduces the influence of image brightness on the detection result while handling the defect detection of multiple tire models is a problem in urgent need of a solution.

In recent years, deep learning techniques have been widely used in image classification, object recognition, semantic segmentation and related fields. In the actual production process of a tire factory, defect detection includes judging not only whether a defect exists but also how severe it is. Defects can generally be classified by severity into defective products and waste products. The invention therefore aims to distinguish defective products from waste products accurately by applying a deep learning network.

The tire belt layer region is prone to difference level and misalignment defects, and the boundary coordinates of the belt layer determine whether these defects occur. In manual visual inspection, the severity of a defect can only be judged by measuring each tire accurately, one at a time, which seriously limits the detection speed. A semantic segmentation network performs pixel-level classification and therefore meets the requirement of extracting and measuring the pixel coordinates of the belt layer. The invention accordingly proposes to use a semantic segmentation network to replace the manual judgment of defect level.

Disclosure of Invention

The invention detects the two belt layer defects with a lightweight semantic segmentation network that balances segmentation precision against inference speed, so that detection of belt layer difference level and misalignment defects is superior to manual visual inspection in both speed and precision.

To achieve this purpose, the method proceeds as follows: a code reader acquires the model information contained in the two-dimensional code attached to the tire surface, a distance sensor senses the moment at which the tire is inspected in the X-ray chamber, and the X-ray image collected by the upper computer is transmitted to the automatic defect detection system to complete online inspection of tire quality. The key point of the invention is the design of a deep-learning-based semantic segmentation network, which comprises the preprocessing of the original tire X-ray image, the design of the encoder structure and the design of the decoder structure. The output of the tire X-ray image detection algorithm is displayed dynamically on a purpose-designed upper computer interface. The semantic segmentation network involves a training process and a testing process, and the data used come from real samples provided by cooperating tire manufacturers. The input image size of the semantic segmentation network is 256 × 256 pixels, whereas the original tire X-ray image is 8000 × 2469 pixels; the network inputs are cropped from it using the dedicated region segmentation algorithm of the invention. The upper computer that acquires the X-ray images and the upper computer on which the automatic defect detection software is installed are independent of each other.

According to the above idea, the present invention is specifically as follows:

A method for detecting tire belt layer difference level and misalignment defects based on a semantic segmentation network can be used in the pre-delivery inspection of finished tire quality; tires are divided into qualified products, defective products and waste products according to the detection results, and the corresponding sorting actions are executed. The method comprises the following steps:

Step 1, determining the installation positions of the distance sensor and the code scanner according to the action flow of the tire X-ray image generation equipment, and designing a signal trigger device;

Step 2, performing region segmentation on the collected tire X-ray images, cropping them to the input image size required by the semantic segmentation network, and forming a data set;

Step 3, designing a lightweight encoder structure that captures the texture features and spatial information of each region of the X-ray image and records semantic context information, and saving the parameter weights of the encoder by means of the constructed pre-training method;

Step 4, designing a decoder structure based on feature fusion, which learns from the training data set and outputs pixel-level classification results for the grayscale images;

Step 5, calculating the actual tire position corresponding to the belt layer image coordinates according to the scale, comparing against the standards for defective products and waste products, and determining the defect position, size and type.

Further, the step 1 specifically includes the following steps:

Step 1.1, the code scanner reads the bar code label on the tire surface to obtain the tire model and is connected to the automatic defect detection software through a network cable; the ratio between pixels on the tire image and actual size used in the detection stage is then looked up from the model-to-scale correspondence stored in the database connected to the software.

Step 1.2, the action flow of the tire X-ray imaging equipment is tracked and recorded, the action associated with the end of X-ray imaging is noted, and a suitable installation position for the distance sensor is found by analysis, on the premise that imaging is not affected.

Step 1.3, the distance sensor is installed and communication paths with the PLC control module and the upper computer are established. The detection signal of the distance sensor is transmitted to the PLC control unit in real time; when the position information matches the detection action position of the X-ray imaging equipment, the PLC control unit sends an instruction to read the X-ray image to the automatic defect detection software, and the two industrial computers then transfer and store the image data.
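The model-to-scale lookup of step 1.1 can be pictured with a minimal sketch. It is an illustration under stated assumptions, not the software described here: the model codes, the scale values, and the names `SCALE_MM_PER_PIXEL` and `pixels_to_mm` are hypothetical, and an in-memory dictionary stands in for the database.

```python
# Hypothetical model -> scale table (millimetres per pixel). In the described
# system these values would come from the database behind the detection software.
SCALE_MM_PER_PIXEL = {
    "MODEL-A": 0.25,
    "MODEL-B": 0.5,
}


def pixels_to_mm(model: str, pixel_distance: float) -> float:
    """Convert a distance measured on the X-ray image into millimetres."""
    try:
        scale = SCALE_MM_PER_PIXEL[model]
    except KeyError:
        # An unknown model must not silently fall back to a default scale.
        raise ValueError(f"no scale stored for tire model {model!r}") from None
    return pixel_distance * scale
```

Raising an error for an unknown model, rather than guessing a scale, keeps a mis-scanned code from producing a plausible but wrong measurement.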

Further, the step 2 specifically includes the following steps:

Step 2.1, the contrast of the image is improved by histogram equalization. To further highlight the texture differences between the toe, the carcass (including the sidewall and the shoulder), and the crown, the Scharr operator is considered here for eliminating the carcass cords. The Scharr operator in the vertical direction is a 3 × 3 matrix, and its convolution with the image I is expressed by the following formula:

Convolution in the vertical direction effectively eliminates the texture of the steel cords, so cord elimination is realized with the Scharr operator in the vertical direction.
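The sketch below illustrates why a directional derivative can remove a stripe-like cord texture. It uses the standard 3 × 3 Scharr coefficients (3, 10, 3). Which of the two orientations the method calls "vertical" cannot be determined from the text, so both are shown. The synthetic stripe image and the names `SCHARR_DX`, `SCHARR_DY` and `correlate3x3` are assumptions made for illustration.

```python
# Standard 3x3 Scharr kernels.
SCHARR_DX = [[-3, 0, 3], [-10, 0, 10], [-3, 0, 3]]   # derivative along columns (x)
SCHARR_DY = [[-3, -10, -3], [0, 0, 0], [3, 10, 3]]   # derivative along rows (y)


def correlate3x3(image, kernel):
    """Valid-mode 3x3 cross-correlation; returns absolute responses."""
    h, w = len(image), len(image[0])
    out = []
    for r in range(1, h - 1):
        row = []
        for c in range(1, w - 1):
            acc = 0
            for i in range(3):
                for j in range(3):
                    acc += kernel[i][j] * image[r + i - 1][c + j - 1]
            row.append(abs(acc))
        out.append(row)
    return out


# Synthetic "cords": horizontal stripes two pixels thick, so the gray value
# depends only on the row index.
stripes = [[255 if (r // 2) % 2 == 0 else 0 for _ in range(8)] for r in range(8)]

resp_dx = correlate3x3(stripes, SCHARR_DX)  # derivative taken along the stripes
resp_dy = correlate3x3(stripes, SCHARR_DY)  # derivative taken across the stripes
```

The derivative taken along the stripes is zero everywhere, while the derivative taken across them is large, so choosing the kernel whose derivative direction follows the cords suppresses them and leaves other structure visible.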

Step 2.2, after the convolution with the Scharr operator, the gray values of the full-size tire X-ray image differ markedly between regions, and binarization is performed with the Otsu method. The segmentation threshold is the one that maximizes the between-class variance of the foreground and the background, which is given by:

$$\sigma^2 = P_a (m_a - m_g)^2 + P_b (m_b - m_g)^2$$

In the above formula, $m_a$ and $m_b$ are the mean gray values of the pixels in the foreground and the background respectively, $m_g$ is the mean gray value of the entire image, $P_a$ is the proportion of foreground pixels in the whole image, and $P_b$ is the proportion of background pixels in the whole image. The binarization threshold is selected by iterating in sequence over the candidate thresholds $0 \le k \le 255$; the optimal segmentation threshold $k^*$ satisfies

$$\sigma^2(k^*) = \max_{0 \le k \le 255} \sigma^2(k)$$
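The Otsu threshold search of step 2.2 can be sketched in a few lines. This is a minimal illustration that takes a flat list of 8-bit gray values. Treating the brighter class as foreground (class a) is an assumption, and the function name `otsu_threshold` is hypothetical.

```python
def otsu_threshold(pixels):
    """Return the threshold k in 0..255 that maximizes the between-class variance.

    Pixels with gray value <= k form class b; the remaining pixels form class a.
    """
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    gray_sum = sum(v * n for v, n in enumerate(hist))
    m_g = gray_sum / total               # mean gray value of the whole image

    best_k, best_var = 0, -1.0
    count_b = 0                          # number of pixels with value <= k
    sum_b = 0                            # sum of their gray values
    for k in range(256):
        count_b += hist[k]
        sum_b += k * hist[k]
        count_a = total - count_b
        if count_b == 0 or count_a == 0:
            continue                     # one class is empty for this k
        p_a, p_b = count_a / total, count_b / total
        m_a = (gray_sum - sum_b) / count_a
        m_b = sum_b / count_b
        var = p_a * (m_a - m_g) ** 2 + p_b * (m_b - m_g) ** 2
        if var > best_var:
            best_k, best_var = k, var
    return best_k
```

Accumulating `count_b` and `sum_b` while sweeping k avoids recomputing the class means from scratch for each of the 256 candidates.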

Step 2.3, the tire region segmentation task is converted into connected domain contour extraction: the contour coordinates of each connected domain are calculated with a contour tracking algorithm, the area of each connected domain is computed from them, and the connected domain with the largest area is selected. The crown region represented by this connected domain is then determined from the position of its contour coordinates.
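Selecting the largest connected domain can be sketched as follows. Note the substitution: the text describes contour tracking, whereas this sketch uses a breadth-first flood fill with 4-connectivity, which yields the same area and extent for a filled region but is a different algorithm. The function name `largest_component` and the bounding-box return format are assumptions.

```python
from collections import deque


def largest_component(mask):
    """Return (area, box) of the largest 4-connected foreground region.

    mask is a list of rows of 0/1 values; box is (top, left, bottom, right),
    inclusive, or None when the mask has no foreground pixel.
    """
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    best_area, best_box = 0, None
    for r0 in range(h):
        for c0 in range(w):
            if not mask[r0][c0] or seen[r0][c0]:
                continue
            # Flood-fill one component, tracking its area and extent.
            queue = deque([(r0, c0)])
            seen[r0][c0] = True
            area, top, left, bottom, right = 0, r0, c0, r0, c0
            while queue:
                r, c = queue.popleft()
                area += 1
                top, bottom = min(top, r), max(bottom, r)
                left, right = min(left, c), max(right, c)
                for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
                    if (0 <= nr < h and 0 <= nc < w
                            and mask[nr][nc] and not seen[nr][nc]):
                        seen[nr][nc] = True
                        queue.append((nr, nc))
            if area > best_area:
                best_area, best_box = area, (top, left, bottom, right)
    return best_area, best_box
```

The left and right limits of the returned box play the role of the crown boundary coordinates that the next step expands.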

Step 2.4, the boundary coordinates of the left and right sides of the crown are each expanded, and a region map is cropped while keeping the image width in the vertical direction at 256 pixels; the left crown region map contains the coordinates of the two belt layers on the left side, and the right crown region map contains the coordinates of the two belt layers on the right side. Sub-images of 256 × 256 pixels are then cut out by taking 256 pixels at a time in sequence along the horizontal direction. If the height of the last sub-image would be less than 256 pixels, part of the region above it is included as well, so that the crop is redundant (overlapping). The resulting sub-images form the input images of the network training stage.
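The redundant cropping rule can be sketched as a computation of tile start offsets along the cutting axis. The function name `tile_starts` is hypothetical, and applying it to the 8000-pixel side of the original image is an assumption about which axis is cut.

```python
def tile_starts(length: int, tile: int = 256) -> list[int]:
    """Start offsets of consecutive tiles covering `length` pixels.

    Tiles are laid end to end. If the last tile would run past the edge, it is
    shifted back so that it ends exactly at the edge and overlaps its neighbour.
    """
    if length < tile:
        raise ValueError("strip is shorter than one tile")
    starts = list(range(0, length - tile + 1, tile))
    if starts[-1] + tile < length:
        starts.append(length - tile)   # redundant (overlapping) final crop
    return starts
```

For a length of 8000 this gives 31 aligned tiles plus one shifted tile starting at 7744, so every pixel is covered and every sub-image has the full 256-pixel size the network expects.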

Step 2.5, the input images in the pre-training stage are of three types, namely the toe region, the sidewall region and the belt layer region. The texture structures of the three regions differ, so the network can learn the representative texture features of each region during pre-training. The image size is 256 × 256 pixels, and the class numbers are 0, 1 and 2 respectively.

Step 2.6, the labels of the input images in the training stage are annotated with Labelme: belt layer pixels inside the annotated contour are assigned class 1, and background pixels outside the annotated contour are assigned class 0.

Step 2.7, the ratio of training set to test set images is 4:1 in the pre-training stage and 3:1 in the training stage. Images are selected by random batch extraction, and extraction stops once the required proportion is reached. The data set of belt layer region images used in the training stage covers all tire models currently in production, and the numbers of tires of the various models are roughly equal.
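The 4:1 and 3:1 divisions of step 2.7 can be sketched with a seeded shuffle. This is a simplification: the text describes drawing random batches until the ratio is reached, while the sketch shuffles once and cuts at the ratio. Balancing by tire model is not shown, and the function name `split_dataset` and the fixed seed are assumptions.

```python
import random


def split_dataset(items, train_parts: int, test_parts: int, seed: int = 0):
    """Randomly split items into (train, test) with ratio train_parts:test_parts."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)   # seeded so the split is reproducible
    n_train = len(shuffled) * train_parts // (train_parts + test_parts)
    return shuffled[:n_train], shuffled[n_train:]


# Pre-training stage: 4:1; training stage: 3:1.
pretrain_train, pretrain_test = split_dataset(range(100), 4, 1)
seg_train, seg_test = split_dataset(range(100), 3, 1)
```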

Further, the step 3 specifically includes the following steps:

Step 3.1, the pre-training stage performs a classification task on the input images. To reduce the number of samples needed in the data set, an efficient contrastive learning method is adopted, so the data in the pre-training stage do not need to be labeled. The network that performs the classification task uses a modified ConvNeXt followed by a pooling layer and a linear layer, and the number of output classes equals the number of negative samples stored in the queue structure.

Step 3.2, pre-training continuously updates the learnable parameters of the network; finally, the parameter values of the layers before the pooling layer are retained and loaded as the initial values of the semantic segmentation network in the training stage.

Further, the step 4 specifically includes the following steps:

Step 4.1, the decoder of the semantic segmentation network comprises a depthwise separable convolution module, a 1 × 1 convolution module and a multi-size pooling module. A multi-size pooling structure is connected at the deepest layer of the encoder; the shallow and deep feature maps are each passed through a channel mapping before fusion, and depthwise separable convolution is used to extract features after fusion.
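A short calculation shows why depthwise separable convolution suits a lightweight decoder. The formulas are the standard weight counts for the two layer types; the channel counts and kernel size used in the example are hypothetical and are not taken from the text.

```python
def standard_conv_params(c_in: int, c_out: int, k: int) -> int:
    """Weights of a k x k standard convolution (bias ignored)."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in: int, c_out: int, k: int) -> int:
    """Weights of a k x k depthwise convolution followed by a 1 x 1 pointwise one."""
    return k * k * c_in + c_in * c_out


# Hypothetical layer: 128 input channels, 128 output channels, 3 x 3 kernel.
standard = standard_conv_params(128, 128, 3)
separable = depthwise_separable_params(128, 128, 3)
```

For this hypothetical layer the standard convolution needs 147456 weights and the separable version 17536, roughly an eightfold reduction, which is the kind of saving that helps balance segmentation precision against inference speed.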

The decoder up-samples its output at each of the four stages to the same size as the input image, calculates the cross-entropy loss against the label map for each, and combines these losses to train the network parameters by back propagation. The loss function is shown in the following formula:

where $G$ is the label map; $P_i$ is the output of the decoder at stage $i$; $\beta_i$ is the weight of the deep feature map's loss in the total loss, set by default to $\beta_1 = 0.5$, $\beta_2 = 0.25$, $\beta_3 = 0.1$; $L_m$ is the main loss calculation strategy; $P_0$ is the output of the decoder at the first stage; $i$ is the stage number of the decoder; and $L_a$ is the auxiliary loss calculation strategy.
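One reading consistent with the symbol definitions above is a main loss on the first-stage output plus weighted auxiliary losses on the other three stages, $L = L_m(G, P_0) + \sum_{i=1}^{3} \beta_i L_a(G, P_i)$. This form is an assumption inferred from the definitions, not a formula confirmed by the text. The sketch below implements it under the further assumptions that $L_m$ and $L_a$ are both mean per-pixel binary cross entropy and that maps are passed as flat lists; the names `cross_entropy` and `total_loss` are hypothetical.

```python
import math


def cross_entropy(label, prob, eps=1e-7):
    """Mean binary cross entropy between a 0/1 label map and foreground probabilities."""
    total = 0.0
    for g, p in zip(label, prob):
        p = min(max(p, eps), 1.0 - eps)   # clamp to avoid log(0)
        total += -(g * math.log(p) + (1 - g) * math.log(1.0 - p))
    return total / len(label)


def total_loss(label, outputs, betas=(0.5, 0.25, 0.1)):
    """outputs[0] is P_0 (main loss); outputs[1:] are P_1..P_3 (auxiliary losses)."""
    if len(outputs) != len(betas) + 1:
        raise ValueError("expected one main output plus one output per beta")
    loss = cross_entropy(label, outputs[0])
    for beta, p_i in zip(betas, outputs[1:]):
        loss += beta * cross_entropy(label, p_i)
    return loss
```

With the default weights the auxiliary terms contribute at most 0.85 of a main-loss unit in total, so the first-stage output dominates training while the deeper stages still receive a direct gradient signal.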

Step 4.2, when the pixel-level output for a tire X-ray image is predicted, the encoder output of the shallowest layer is used; its feature map contains semantic information, spatial information and boundary information.

Further, the step 5 specifically includes the following steps:

Step 5.1, in the prediction process, the original tire X-ray image is region-segmented and cropped to obtain N samples of size 256 × 256; the semantic segmentation network predicts the two classes of region, and the coordinates of the left and right belt layers are obtained from the boundaries of the segmented regions.

Step 5.2, defect detection measures coordinate spacing on the basis of the pixel-level segmentation: after the semantic segmentation network predicts the belt layer region, the boundary coordinates are mapped, the scale between pixels on the image and real positions on the tire is applied to decide whether a defect has occurred, and defective tires and waste tires are sorted.
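Step 5 can be sketched as three small operations: read boundary coordinates from the predicted mask, convert a pixel spacing to millimetres, and compare it with two limits. The quantity measured here (the spread of one edge across rows), the limit values, and the names `left_edge_columns`, `edge_spread_mm` and `grade` are all hypothetical; the actual measurements and limits follow the production standard, which the text does not specify.

```python
def left_edge_columns(mask):
    """For each row of a 0/1 mask, the column of the first belt pixel (None if absent)."""
    return [row.index(1) if 1 in row else None for row in mask]


def edge_spread_mm(mask, mm_per_pixel: float) -> float:
    """Spread between the extreme edge positions, converted to millimetres."""
    cols = [c for c in left_edge_columns(mask) if c is not None]
    if not cols:
        raise ValueError("mask contains no belt pixels")
    return (max(cols) - min(cols)) * mm_per_pixel


def grade(deviation_mm: float, defective_mm: float, waste_mm: float) -> str:
    """Classify a measured deviation against two hypothetical limits."""
    if deviation_mm >= waste_mm:
        return "waste"
    if deviation_mm >= defective_mm:
        return "defective"
    return "qualified"
```

A typical call chain would be `grade(edge_spread_mm(mask, scale), defective_mm, waste_mm)`, where `scale` is the millimetres-per-pixel value looked up for the scanned tire model.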

Compared with the prior art, the invention has the following advantages:

1. The method is easy to implement, requires no manual intervention, and detects faults automatically in real time.

2. The detection speed of the tyre belt layer difference level and the misalignment defect is high, and the precision is high.

3. The online real-time diagnosis of the belt layer difference level and the misalignment defect can be realized, and the fault information interface can be dynamically displayed.

4. The quality of the finished tire is subjected to defect detection, so that the defect of the production link can be found in time, and reference is provided for optimizing the production process.

Drawings

FIG. 1 is a flow chart of two types of defect detection for a tire belt in an embodiment of the present invention;

FIG. 2 is a diagram of a pre-training network in an embodiment of the present invention;

FIG. 3 is an architecture diagram of a semantic segmentation network in an embodiment of the present invention;

FIG. 4 is a schematic diagram of a multi-sized pooling module in an embodiment of the present invention;

FIG. 5 is a diagram of the coordinate positioning and defect detection effects of the belt in an embodiment of the present invention;

FIG. 6 is a diagram illustrating a defect in an embodiment of the present invention.

Detailed Description

The technical solution in the embodiments of the present invention is clearly and completely described below with reference to the accompanying drawings.

The embodiment is an exemplary illustration of the invention. The defect detection method is not limited to detecting defect types in the belt layer region, and a person skilled in the art can apply it to tire defect detection in other regions.

As shown in FIG. 1, a method for detecting tire belt layer difference level and misalignment defects based on a semantic segmentation network includes the following steps:

Step 1, designing and installing the distance sensor and the code scanner.

In the embodiment of the invention, the code scanner reads the bar code label on the tire surface to obtain the tire model and is connected to the automatic defect detection software through a network cable; the ratio between pixels on the tire image and actual size used in the detection stage is then looked up from the model-to-scale correspondence stored in the database connected to the software.

Further, the action flow of the tire X-ray imaging equipment is tracked and recorded, the action associated with the end of X-ray imaging is noted, and a suitable installation position for the distance sensor is found by analysis, on the premise that imaging is not affected.

Further, the distance sensor is installed and communication paths with the PLC control module and the upper computer are established. The detection signal of the distance sensor is transmitted to the PLC control unit in real time; when the position information matches the detection action position of the X-ray imaging equipment, the PLC control unit sends an instruction to read the X-ray image to the automatic defect detection software, and the two industrial computers then transfer and store the image data.

Step 2, constructing and preprocessing the data set.

In the embodiment of the invention, histogram equalization is used to improve image contrast. To further highlight the texture differences between the toe, the carcass (including the sidewall and the shoulder), and the crown, the Scharr operator is considered here for eliminating the carcass cords. The Scharr operator in the vertical direction is a 3 × 3 matrix, and its convolution with the image I is expressed by the following formula:

Convolution in the vertical direction effectively eliminates the texture of the steel cords, so cord elimination is realized in the invention with the Scharr operator in the vertical direction.

Further, after the convolution with the Scharr operator, the gray values of the full-size tire X-ray image differ markedly between regions, and binarization is performed with the Otsu method. The segmentation threshold is the one that maximizes the between-class variance of the foreground and the background, which is given by:

$$\sigma^2 = P_a (m_a - m_g)^2 + P_b (m_b - m_g)^2$$

In the above formula, $m_a$ and $m_b$ are the mean gray values of the pixels in class a and class b respectively, and $m_g$ is the mean gray value of the whole image. The binarization threshold is selected by iterating in sequence over the candidate thresholds $0 \le k \le 255$; the optimal segmentation threshold $k^*$ satisfies

$$\sigma^2(k^*) = \max_{0 \le k \le 255} \sigma^2(k)$$

$P_a$ is the proportion of class-a pixels in the whole image, and $P_b$ is the proportion of class-b pixels in the whole image.

Further, the tire region segmentation task is converted into connected domain contour extraction: the contour coordinates of each connected domain are calculated with a contour tracking algorithm, the area of each connected domain is computed from them, and the connected domain with the largest area is selected. The crown region represented by this connected domain is then determined from the position of its contour coordinates.

Further, the boundary coordinates of the left and right sides of the crown are each expanded, and a region map is cropped while keeping the image width in the vertical direction at 256 pixels; the left crown region map contains the coordinates of the two belt layers on the left side, and the right crown region map contains the coordinates of the two belt layers on the right side. Sub-images of 256 × 256 pixels are then cut out by taking 256 pixels at a time in sequence along the horizontal direction. If the height of the last sub-image would be less than 256 pixels, part of the region above it is included as well, so that the crop is redundant (overlapping). The resulting sub-images form the input images of the network training stage.

Further, the input images in the pre-training stage are of three types, namely the toe region, the sidewall region and the belt layer region. The texture structures of the three regions differ, so the network can learn the representative texture features of each region during pre-training. The image size is 256 × 256 pixels, and the class numbers are 0, 1 and 2 respectively.

Further, the labels of the input images in the training stage are annotated with Labelme: belt layer pixels inside the annotated contour are assigned class 1, and background pixels outside the annotated contour are assigned class 0.

Further, the ratio of training set to test set images is 4:1 in the pre-training stage and 3:1 in the training stage. Images are selected by random batch extraction, and extraction stops once the required proportion is reached. The data set of belt layer region images used in the training stage covers all tire models currently in production, and the numbers of tires of the various models are roughly equal.

Step 3, pre-training based on contrastive learning.

In the embodiment of the invention, the pre-training stage performs a classification task on the input images. To reduce the number of samples needed in the data set, an efficient contrastive learning method is adopted, so the data in the pre-training stage do not need to be labeled. The network that performs the classification task uses a modified ConvNeXt followed by a pooling layer and a linear layer, and the number of output classes equals the number of negative samples stored in the queue structure.

Further, pre-training continuously updates the learnable parameters of the network; finally, the parameter values of the layers before the pooling layer are retained and loaded as the initial values of the semantic segmentation network in the training stage.

Further, as shown in FIG. 2, the negative-sample dictionary used in pre-training has a capacity of 100; the dictionary is updated dynamically with each training batch, and at each update the batch of samples that has been stored longest is replaced.

Further, each tire X-ray image fed to the pre-training network generates its own positive sample through image augmentation. By comparing the similarity between the input image and its positive sample with the similarity between the input image and the negative samples in the dictionary, the similarity to positive samples is continuously increased and the similarity to negative samples is reduced, so that the texture feature encoding structure effectively identifies the feature information of the belt layer.
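The queue-based comparison of positive and negative samples can be sketched as follows. The text does not name the loss, so the InfoNCE form used here, the temperature value, and the assumption that feature vectors are already L2-normalized are all assumptions. Only the dictionary capacity of 100 and the oldest-first replacement come from the text. The names `info_nce` and `enqueue_batch` are hypothetical.

```python
import math
from collections import deque

QUEUE_CAPACITY = 100  # negative-sample dictionary size stated in the text


def info_nce(query, positive, negatives, temperature=0.07):
    """InfoNCE loss for one query: the positive key competes with all negatives."""
    def dot(u, v):
        return sum(a * b for a, b in zip(u, v))

    logits = [dot(query, positive) / temperature]
    logits += [dot(query, n) / temperature for n in negatives]
    m = max(logits)                       # subtract the max for numerical stability
    log_sum = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_sum - logits[0]            # equals -log softmax(logits)[0]


def enqueue_batch(queue, keys):
    """Add a batch of encoded keys; a full deque discards its oldest keys first."""
    queue.extend(keys)


negative_queue = deque(maxlen=QUEUE_CAPACITY)
```

The loss falls as the query moves closer to its augmented positive and away from the queued negatives, which is the behaviour the paragraph above describes. Using `deque(maxlen=...)` gives the oldest-first replacement without any explicit bookkeeping.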

Step 4, designing a decoder structure based on feature fusion.

As shown in FIG. 3, the decoder of the semantic segmentation network in the embodiment of the present invention comprises a depthwise separable convolution module, a 1 × 1 convolution module and a multi-size pooling module. As shown in FIG. 4, a multi-size pooling structure is connected at the deepest layer of the encoder; the shallow and deep feature maps are each passed through a channel mapping before fusion, and depthwise separable convolution is used to extract features after fusion.
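As a generic illustration of pooling one feature map at several grid sizes, the sketch below average-pools a single-channel square map into 1 × 1, 2 × 2 and 4 × 4 grids. The bin sizes, the use of average pooling, and the names `adaptive_avg_pool` and `multi_size_pool` are assumptions. The text does not give the module's internal configuration, and the later upsampling and fusion of the pooled maps is omitted.

```python
def adaptive_avg_pool(feature, bins: int):
    """Average-pool a square 2-D map (list of rows) down to bins x bins cells."""
    size = len(feature)
    if size % bins:
        raise ValueError("map size must be divisible by the number of bins")
    step = size // bins
    pooled = []
    for br in range(bins):
        row = []
        for bc in range(bins):
            cell = [feature[r][c]
                    for r in range(br * step, (br + 1) * step)
                    for c in range(bc * step, (bc + 1) * step)]
            row.append(sum(cell) / len(cell))
        pooled.append(row)
    return pooled


def multi_size_pool(feature, bin_sizes=(1, 2, 4)):
    """Pool the same map at several scales; the bin sizes are illustrative."""
    return {b: adaptive_avg_pool(feature, b) for b in bin_sizes}
```

The coarse grids summarize global context and the fine grids keep local detail, which is why such a module is usually attached to the deepest, lowest-resolution encoder output.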

Further, in step 4, the decoder up-samples its output at each of the four stages to the same size as the input image, calculates the cross-entropy loss against the label map for each, and combines these losses to train the network parameters by back propagation. The loss function is shown in the following formula:

where $G$ is the label map; $P_i$ is the output of the decoder at stage $i$; $\beta_i$ is the weight of the deep feature map's loss in the total loss, set by default to $\beta_1 = 0.5$, $\beta_2 = 0.25$, $\beta_3 = 0.1$; $L_m$ is the main loss calculation strategy; $P_0$ is the output of the decoder at the first stage; $i$ is the stage number of the decoder; and $L_a$ is the auxiliary loss calculation strategy.

Further, when the pixel-level output for a tire X-ray image is predicted, the encoder output of the shallowest layer is used; its feature map contains semantic information, spatial information and boundary information.

In the prediction process, the original tire X-ray image is region-segmented and cropped to obtain N samples of size 256 × 256; the semantic segmentation network predicts the two classes of region, and the coordinates of the left and right belt layers are obtained from the boundaries of the segmented regions.

Further, as shown in FIG. 5, defect detection measures coordinate spacing on the basis of the pixel-level segmentation: after the semantic segmentation network predicts the belt layer region, the boundary coordinates are mapped, the scale between pixels on the image and real positions on the tire is applied to decide whether a defect has occurred, and defective tires and waste tires are sorted. FIG. 6 schematically shows the belt layer difference level and misalignment.

The tire belt layer difference level and misalignment defect detection method based on a semantic segmentation network comprises: collecting radial tire images of the belt layer, toe and sidewall to pre-train an encoder network; segmenting the regions on the left and right sides of the crown belt layer to build training and test data sets; designing a decoder network that fuses the feature maps of all stages and outputs the belt layer region at pixel level; integrating the semantic segmentation algorithm into the defect detection software; obtaining the tire model through a code scanner and determining the coordinates of the tire belt layer from the ratio of pixel distance on the tire X-ray image to actual size; and judging, according to the actual production standard, whether either of the two defects occurs and how severe it is. The embodiment of the invention can monitor belt layer defects in real time, and addresses two problems: traditional image processing methods cannot flexibly meet the detection requirements of complex and varied tire models, and manual inspection has a high missed-detection rate.

The embodiments of the present invention have been described with reference to the accompanying drawings, but the present invention is not limited to these embodiments. Various changes and modifications can be made according to the purpose of the invention; any change, modification, substitution, combination or simplification made in accordance with the spirit and principle of the technical solution of the present invention shall be regarded as an equivalent substitution and, as long as it meets the purpose of the invention and does not depart from its technical principle and inventive concept, shall fall within the protection scope of the present invention.

Claims

1. A tire belt layer difference level and misalignment defect detection method based on a semantic segmentation network, characterized by comprising the following steps:

2. The tire belt layer difference level and misalignment defect detection method based on a semantic segmentation network as claimed in claim 1, characterized in that step 1 specifically comprises:

step 1.1, the code scanner reads the bar code label on the tire surface to obtain the tire model and is connected to the automatic defect detection software through a network cable, and the ratio between pixels on the tire image and actual size used in the detection stage is looked up from the model-to-scale correspondence stored in the database connected to the software;

step 1.2, the action flow of the tire X-ray imaging equipment is tracked and recorded, the action associated with the end of X-ray imaging is noted, and a suitable installation position for the distance sensor is found by analysis, on the premise that imaging is not affected;

3. The tire belt error level and misalignment defect detection method based on the semantic segmentation network as claimed in claim 1, wherein: the step 2 specifically comprises:

step 2.1, utilizing histogram equalization to improve the contrast of the image, and further highlighting the toe opening and the carcass: the method comprises the texture difference of a tire side, a tire shoulder and a tire crown, and takes the Scharr operator into consideration to eliminate tire body cords, wherein the operator of the Scharr operator in the vertical direction is a 3 multiplied by 3 matrix, and the convolution operation of an image I is expressed as the following formula:

convolution in the vertical direction effectively eliminates the texture of the steel cords, so cord elimination is performed with the Sobel operator in the vertical direction;

in the above formula, $m_a$ and $m_b$ denote the mean gray values of the foreground and background pixels respectively, $m_g$ denotes the mean gray value of the entire image, $P_a$ denotes the proportion of foreground pixels in the entire image, and $P_b$ denotes the proportion of background pixels in the entire image; the binarization threshold is selected by iterating the segmentation threshold $k$ in sequence over $0 \le k \le 255$, and the optimal segmentation threshold $k^*$ satisfies

step 2.3, converting the tire region segmentation task into connected domain contour extraction: the contour coordinates of each connected domain are calculated with a contour tracking algorithm so that the area of each connected domain can be computed, the connected domain with the largest area is selected, and the tire crown region it represents is determined from the position of its contour coordinates;

step 2.4, expanding the boundary coordinates of the left and right sides of the tire crown respectively and cropping out region maps while ensuring that the width of the image in the vertical direction is 256 pixels, wherein the left crown region map contains the coordinates of the two belt layers on the left side and the right crown region map contains the coordinates of the two belt layers on the right side; sub-images of 256 × 256 pixels are then cropped by taking 256 pixels at a time in sequence in the horizontal direction, and when the height of the last sub-image would be less than 256 pixels, part of the region above it is included so that the crop overlaps the previous one (redundant cropping); these sub-images form the input images of the network training stage;

step 2.5, the input images of the pre-training stage comprise three classes, namely the toe opening region, the sidewall region and the belt region; because the texture structures of the three regions differ, the network learns representative texture features of each region in the pre-training stage; the image size is 256 × 256 pixels, and the class labels are 0, 1 and 2 respectively;

step 2.6, labeling the input images of the training stage with Labelme as a binary classification: belt layer pixels inside the labeled contour are assigned class 1, and background pixels outside the labeled contour are assigned class 0;

step 2.7, the ratio of training set images to test set images is 4:1 in the pre-training stage and 3:1 in the training stage; images are divided by random extraction in batches, and extraction stops once the required data set proportion is reached; the data set of belt region images used in the training stage covers all tire models currently in production, and the numbers of images of the different tire models are roughly equal.
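The thresholding, largest-region selection and redundant cropping in steps 2.2 to 2.4 can be sketched as follows. This is an illustrative sketch under stated assumptions, not the implementation of the invention: the threshold criterion is assumed to be the between-class variance $P_a(m_a - m_g)^2 + P_b(m_b - m_g)^2$ of Otsu's method, which matches the symbols defined above; the brighter class is assumed to be the foreground; a flood fill stands in for the contour tracking algorithm; and all function names are hypothetical.

```python
from collections import deque


def otsu_threshold(pixels):
    """Return the k in 0..255 that maximizes the between-class variance."""
    total = len(pixels)
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    m_g = sum(i * n for i, n in enumerate(hist)) / total  # mean gray of whole image
    best_k, best_var = 0, -1.0
    for k in range(256):
        n_b = sum(hist[:k + 1])  # pixels <= k, assumed to be background
        n_a = total - n_b        # pixels > k, assumed to be foreground
        if n_a == 0 or n_b == 0:
            continue             # one class is empty, skip this k
        m_b = sum(i * hist[i] for i in range(k + 1)) / n_b
        m_a = sum(i * hist[i] for i in range(k + 1, 256)) / n_a
        p_a, p_b = n_a / total, n_b / total
        var = p_a * (m_a - m_g) ** 2 + p_b * (m_b - m_g) ** 2
        if var > best_var:
            best_k, best_var = k, var
    return best_k


def largest_component(mask):
    """Return (area, (top, left, bottom, right)) of the largest 4-connected region of 1s."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    best = (0, None)
    for r0 in range(h):
        for c0 in range(w):
            if mask[r0][c0] != 1 or seen[r0][c0]:
                continue
            queue = deque([(r0, c0)])
            seen[r0][c0] = True
            area, top, left, bottom, right = 0, r0, c0, r0, c0
            while queue:
                r, c = queue.popleft()
                area += 1
                top, bottom = min(top, r), max(bottom, r)
                left, right = min(left, c), max(right, c)
                for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
                    if 0 <= nr < h and 0 <= nc < w and mask[nr][nc] == 1 and not seen[nr][nc]:
                        seen[nr][nc] = True
                        queue.append((nr, nc))
            if area > best[0]:
                best = (area, (top, left, bottom, right))
    return best


def tile_offsets(length, tile=256):
    """Start offsets of consecutive tiles; the last tile is shifted back to stay full size."""
    if length < tile:
        raise ValueError("region is shorter than one tile")
    offsets = list(range(0, length - tile + 1, tile))
    if offsets[-1] + tile < length:
        offsets.append(length - tile)  # overlaps the previous tile
    return offsets


# Toy image: one stray bright pixel and one 3 x 2 bright block.
image = [
    [200, 10, 10, 10, 10],
    [10, 10, 200, 200, 10],
    [10, 10, 200, 200, 10],
    [10, 10, 200, 200, 10],
]
k = otsu_threshold([v for row in image for v in row])
mask = [[1 if v > k else 0 for v in row] for row in image]
area, box = largest_component(mask)
```

For a region 600 pixels long, `tile_offsets(600)` places the last crop at offset 344 so that it still spans 256 pixels and overlaps the crop before it, which is one way to read the redundant cropping of step 2.4.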

4. The tire belt layer difference level and misalignment defect detection method based on the semantic segmentation network as claimed in claim 1, wherein step 3 specifically comprises:

step 3.1, performing a classification task on the input images in the pre-training stage, wherein, in order to reduce the number of samples required in the data set, an efficient contrastive learning method is adopted that does not require the pre-training data to be labeled; the network that performs the classification task uses an improved ConvNeXt followed by a pooling layer and a linear (fully connected) layer, and the number of output categories equals the number of negative samples stored in a queue structure;

5. The tire belt layer difference level and misalignment defect detection method based on the semantic segmentation network as claimed in claim 1, wherein step 4 specifically comprises:

step 4.1, the decoder of the semantic segmentation network comprises a depthwise separable convolution module, a 1 × 1 convolution module and a multi-size pooling module; the multi-size pooling structure is connected to the deepest layer of the encoder, the shallow feature map and the deep feature map are each mapped through their channels before fusion, and features are extracted with depthwise separable convolution after fusion;

the outputs of the four decoder stages are each upsampled to the size of the input image, the cross-entropy loss of each output is calculated against the label map, and the network parameters are trained by back propagation on the combined loss; the loss function is shown in the following formula:

wherein $G$ denotes the label map; $P_i$ is the output of the decoder at stage $i$; $\beta_i$ is the weight of the deep feature map losses in the total loss, set by default to $\beta_1 = 0.5$, $\beta_2 = 0.25$, $\beta_3 = 0.1$; $L_m$ is the main loss calculation strategy; $P_0$ denotes the output of the decoder at the first stage; $i$ denotes the stage index of the decoder; and $L_a$ denotes the auxiliary loss calculation strategy.
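The loss formula itself is not reproduced in this text. Reading the symbol definitions, one plausible form is $L = L_m(G, P_0) + \sum_{i=1}^{3} \beta_i L_a(G, P_i)$. The sketch below assumes that form and further assumes that both $L_m$ and $L_a$ are the same pixel-wise binary cross-entropy; the function names are hypothetical, and plain nested lists stand in for the upsampled decoder outputs.

```python
import math


def pixel_cross_entropy(label, prob, eps=1e-7):
    """Mean binary cross-entropy between a 0/1 label map and belt probabilities."""
    total, count = 0.0, 0
    for g_row, p_row in zip(label, prob):
        for g, p in zip(g_row, p_row):
            p = min(max(p, eps), 1.0 - eps)  # avoid log(0)
            total += -(g * math.log(p) + (1 - g) * math.log(1.0 - p))
            count += 1
    return total / count


BETAS = (0.5, 0.25, 0.1)  # beta_1 .. beta_3, the default values given in the claim


def combined_loss(label, outputs):
    """Assumed form: main loss on P_0 plus weighted auxiliary losses on P_1..P_3.

    outputs[0] is P_0 and outputs[1:] are P_1..P_3, all already upsampled
    to the size of the label map.
    """
    main = pixel_cross_entropy(label, outputs[0])
    aux = sum(b * pixel_cross_entropy(label, p) for b, p in zip(BETAS, outputs[1:]))
    return main + aux
```

With these weights the auxiliary terms together contribute at most 0.85 times a single-stage loss, so the first-stage output $P_0$ dominates the gradient under this assumed form.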

6. The tire belt layer difference level and misalignment defect detection method based on the semantic segmentation network as claimed in claim 1, wherein step 5 specifically comprises:

step 5.1, in the prediction process, the original tire X-ray image undergoes the region segmentation and cropping to obtain N samples of 256 × 256 pixels, the semantic segmentation network predicts the binary-classified regions, and the coordinates of the belt layers on the left and right sides are obtained from the boundaries of the segmented regions;
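Reading belt layer coordinates off the boundary of a predicted binary region, as in step 5.1, can be sketched as below. The sketch assumes the prediction is a nested list of 0/1 values with class 1 marking belt layer pixels; the function name `belt_edges` and the per-row representation of the coordinates are assumptions made for illustration only.

```python
def belt_edges(mask):
    """For each row, return (leftmost, rightmost) column of class-1 pixels.

    Rows without any belt layer pixel yield None.
    """
    edges = []
    for row in mask:
        cols = [c for c, v in enumerate(row) if v == 1]
        edges.append((cols[0], cols[-1]) if cols else None)
    return edges


# Toy prediction: the belt edge shifts one pixel to the right in the second row.
prediction = [
    [0, 1, 1, 0],
    [0, 0, 1, 1],
    [0, 0, 0, 0],
]
edges = belt_edges(prediction)
```

Differences between such edge coordinates, multiplied by the pixel-to-size ratio of the tire model, would give distances in actual units.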