CN117173461A - Multi-visual task filling container defect detection method, system and medium


Info

Publication number
CN117173461A
Authority
CN
China
Prior art keywords
defect
detection
defect image
task
storage tank
Prior art date
Legal status
Pending
Application number
CN202311100670.6A
Other languages
Chinese (zh)
Inventor
李刚
张林
熊茂盛
李婕
Current Assignee
Hubei Shenglin Bio Engineering Co ltd
Original Assignee
Hubei Shenglin Bio Engineering Co ltd
Priority date
Filing date
Publication date
Application filed by Hubei Shenglin Bio Engineering Co ltd filed Critical Hubei Shenglin Bio Engineering Co ltd
Priority to CN202311100670.6A priority Critical patent/CN117173461A/en
Publication of CN117173461A publication Critical patent/CN117173461A/en
Pending legal-status Critical Current

Classifications

    • Y - GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 - TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02P - CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P90/00 - Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P90/30 - Computing systems specially adapted for manufacturing

Landscapes

  • Image Analysis (AREA)

Abstract

The application relates to a multi-vision task filling container defect detection method, system and medium. The method comprises the following specific steps: data acquisition, in which images of the inside and outside of the container storage tank are acquired to obtain storage tank defect image data; data enhancement, in which additional defect image samples are generated from the acquired storage tank defect images and the samples are classified and given pixel-level segmentation labels; defect classification, detection and segmentation modeling, in which the multi-task learning framework MTL-CDS is used to build and train three visual task models for defect classification, detection and segmentation; and upper-computer parameter estimation, in which the model with the best evaluation indexes is deployed to a field workstation, the filling container is monitored on line through the upper computer, the defect types are extracted, the relevant defect regions are segmented, and parameter results are provided as effective data evidence for subsequent defect evaluation. The method solves the problems of low defect detection accuracy and difficult field application in traditional storage tank inspection.

Description

Multi-visual task filling container defect detection method, system and medium
Technical Field
The application belongs to the technical field of machine vision and digital image processing, and particularly relates to a multi-vision task filling container defect detection method, system and medium.
Background
Storage tank containers develop defects of varying severity during use for various reasons such as transportation and product friction; some of these defects affect not only subsequent production but also the corrosion resistance and wear resistance of the final product. Because the internal space of a storage tank is large, it is difficult to perform efficient defect detection and judgment manually within a limited time, so non-destructive testing techniques based on advanced instruments have gradually developed.
Common methods for storage tank inspection include ultrasonic testing, magnetic flux leakage testing, acoustic emission testing, three-dimensional laser scanning and magnetic particle testing. Ultrasonic testing can clearly reveal defects on the surface of the inspected equipment from the degree of attenuation of the sound wave, can effectively detect buried defects in welds and the like, and is also suitable for latent quality defects in pressure vessel bolts. Magnetic flux leakage testing usually scans the magnetized workpiece with a Hall probe or an induction coil and infers the corresponding defects from the detected signals. Acoustic emission testing exploits the fact that changes in the internal structure of a material redistribute its internal stress, converting mechanical energy into acoustic energy, and enables on-line inspection of storage tanks. Three-dimensional laser scanning uses the laser ranging principle to measure vertical and horizontal angles on a large tank, obtaining the spatial coordinates of each measuring point in the scanned area and reconstructing the tank body in three dimensions. However, these methods suffer from complex instruments, high cost and restrictions on the materials that can be inspected. Magnetic particle testing equipment is simple, easy to operate, fast, highly sensitive to surface defects and low in cost, but it cannot detect deeper internal defects, and its sensitivity decreases with defect depth.
As a necessity on production lines in the biological, chemical, food and other fields, storage tank equipment also urgently needs an automatic defect detection technology that is highly automated and intelligent, high in detection precision and low in cost. With the development of artificial intelligence, in particular the emergence of the AlexNet deep convolutional network, deep learning has entered a stage of rapid development, and classification and detection methods based on deep learning are gradually being applied. From detecting internal defects invisible to the naked eye in oil tanks, oil storage tanks and large water tanks, to detecting fine-grained defects on production lines such as steel plates and mobile phone screens, the feasibility of industrial non-destructive testing with deep learning has been demonstrated.
Disclosure of Invention
The embodiment of the application aims to provide a multi-visual task filling container defect detection method, system and medium, which solve the problems of low defect detection accuracy and difficult field application of the traditional storage tank.
In order to achieve the above purpose, the present application provides the following technical solutions:
in a first aspect, an embodiment of the present application provides a multi-vision task filling container defect detection method, including the following specific steps:
s1: acquiring data, namely acquiring images inside and outside a container storage tank to acquire storage tank defect image data;
s2: data enhancement, regenerating more defect image samples for the obtained storage tank defect image, and classifying, dividing and labeling the defect image samples at the pixel level;
s3: the method comprises the steps of classifying, detecting and establishing a segmentation model of defects, and utilizing a multi-task learning frame system MTL-CDS structure to perform the establishment and training of three visual task models of defect classification, detection and segmentation;
s4: and (3) estimating parameters of the upper computer, deploying a model with optimal evaluation indexes to a field workstation, carrying out on-line monitoring on the filling container through the upper computer, extracting the types of defects, segmenting relevant defect areas, providing parameter results, and providing effective data evidence for subsequent defect evaluation.
In step S1, the main acquisition parts for the images inside and outside the container storage tank include the tank mouth, the tank bottom and the vicinity of the weld seams. To obtain the storage tank defect image data, the acquired images are converted to RGB three-channel mode, the pixel mean of all images is computed per channel without considering the spatial position of the pixel points, and during model training and validation this mean is subtracted from the value at each position on the image, yielding normalized sample data and avoiding abnormal values.
The step S2 includes:
making matched data: the obtained storage tank defect image undergoes background removal and binarization thresholding to obtain a binary image I_mask containing only the defects, with values 0 and 255, where 255 represents a defective area; I_mask is then fed to the generator, which fills in the masked region to obtain a generated defect image I_mask_out; next, I_mask is paired with the generated defect image I_mask_out and input to the discriminator as a negative sample, while I_mask is paired with a real defect image and input to the discriminator as a positive sample; the discriminator outputs the probability that its input is a real image;
condition generation countermeasure network training: firstly, fixing parameters of a generator, supervising the discriminator through a real defect image, learning the characteristics of the real defect image in the process of optimizing the discriminator parameters, further distinguishing the generated defect image from the real defect image, then, fixing the parameters of the discriminator, starting to optimize the parameters of the generator, enabling the generated defect image to gradually approach the real defect image, and alternately executing the two processes until the discriminator cannot distinguish the real defect image from the generated defect image, so as to obtain a defect image sample with enhanced data.
The step S2 further includes:
pixel-level classification and segmentation labels are produced for the data-enhanced defect image samples using labelme software; flawed regions are calibrated as 1, forming the labels of the segmentation task; the minimum and maximum values of the abscissa and ordinate of each segmented region are taken to form the smallest rectangular frame surrounding the defect region, forming the labels of the detection task; the regions and detection frames are then named by class according to the 4 common flaw types, forming the labels of the classification task; this yields a multi-classification task dataset comprising four sample types (plaques, cracks, pitted surfaces and scratches) with 1000 samples of each type, providing a sample dataset for training of the subsequent network model.
The step S3 of modeling includes:
l for detecting loss function of task detection The boundary box is refined by using a heat map containing the pixel size value of the box, the specific expression is shown in the formula 1, gamma and sigma are an integer super-parameter, N is the total number of targets,the values of 0 and 1,0 indicating that the current point p does not have the defect type c,1 indicating that the current point p has the defect type c,
since the detection is based on the detection of the anchor-free frame, the size information of the detection frame needs to be predicted, and
by L boundingbox Expressed as regression lossThe function is obtained by a specific expression (2), wherein,5 is denoted as predicted detection frame size, s k Represented as the size of the marked box, the expression. The second order norm,
meanwhile, for detecting the accuracy of the frame, an offset L of the heat map bounding box is established off The specific expression is shown in the formula (3),
wherein,representing the predicted offset value, p being the center point coordinate, R being the scale, and +.>To scale the center point coordinates by R,
in the segmentation task of the MTL-CDSnet architecture, image classification is also performed; the feature map is gradually up-sampled to the segmentation-map size in the Decoder process, and a softmax layer is combined as normalization over the classes to predict each pixel and region; the corresponding losses L_{seg} and L_{classify} are given by equations (4) and (5):

L_{seg} = -y\log y' - (1-y)\log(1-y') \qquad (4)

L_{classify} = -\frac{1}{batchsize}\sum_{i=1}^{batchsize}\sum_{j=1}^{C} p_{i,j}\log \hat p_{i,j} \qquad (5)

where y denotes the annotated segmentation value and y' the predicted segmentation value; batchsize is the batch value set during training, generally 4, 8 or 16; C is the total number of defect classes; p_{i,j} is the true label of the i-th sample for the j-th class, and \hat p_{i,j} is the probability predicted by the network;
the three visual tasks of classification, detection and segmentation are combined together in parallel, and the anchor-free detection is carried out detection ,L boundingbox And L off Utilization of segmentation and classification L seg And L classify The loss function of the final MTL-CDS can
Expressed in terms of description as equation (6):
where α, β are the effects of a single task on the total loss, the specific parameter selection can be adjusted according to the characteristics of the dataset.
The training in step S3 includes: using a resnet50 pre-trained on the ImageNet dataset for the training model parameters, and replacing the last down-sampling layer of the residual network with dilated convolution, so that the resolution of the last layer is 1/16 of the original resolution.
In a second aspect, an embodiment of the present application provides a multi-vision task filling container defect detection system, including:
the data acquisition module is used for acquiring images inside and outside the container storage tank, acquiring storage tank defect image data and transmitting the storage tank defect image data to the data enhancement module;
the data enhancement module is used for regenerating more defect image samples for the obtained storage tank defect image, classifying and segmenting and labeling the defect image samples at the pixel level, and transmitting the defect image samples to the model building and training module;
the model building and training module is used for carrying out the building and training of three visual task models, namely defect classification, detection and segmentation, by utilizing the MTL-CDS structure of the multi-task learning framework system;
the upper computer parameter estimation module is used for deploying the model with the optimal evaluation index to the field workstation, carrying out on-line monitoring on the filling container through the upper computer, extracting the type of the defect, dividing the related defect area, providing a parameter result and providing effective data evidence for subsequent defect evaluation.
In a third aspect, embodiments of the present application provide a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the multi-vision task filling container defect detection method according to any of the above.
Compared with the prior art, the application has the beneficial effects that:
aiming at the problem of defect sample deficiency, a CGAN generation type training model mode is adopted, and more sample images are generated by combining samples collected in the earlier stage, so that enough data support is provided for deep learning in the aspect of filling container defect detection.
Aiming at the variety of defect types, the defects are divided into four major categories and an MTL-CDSnet multi-task learning network architecture is designed that classifies, detects and segments defects simultaneously; its performance is comparable to that of traditional single-task models, yet because the three learning tasks share a backbone network and other operations, the traditional single-task learning models are greatly simplified;
the filling container defect detection upper computer is designed, can perform on-line monitoring on filling equipment, extracts the types of defects, divides relevant defect areas, provides parameter results, and provides effective data evidence for subsequent defect evaluation.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are needed in the embodiments of the present application will be briefly described below, it should be understood that the following drawings only illustrate some embodiments of the present application and should not be considered as limiting the scope, and other related drawings can be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a schematic flow chart of a method according to an embodiment of the present application;
FIG. 2 is a schematic diagram of a monitoring effect provided by an embodiment of the present application;
fig. 3 is a system block diagram provided by an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be described below with reference to the accompanying drawings in the embodiments of the present application. It should be noted that: like reference numerals and letters denote like items in the following figures, and thus once an item is defined in one figure, no further definition or explanation thereof is necessary in the following figures.
The terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising one … …" does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.
The terms "first," "second," and the like, are used merely to distinguish one entity or action from another entity or action, and are not to be construed as indicating or implying any actual such relationship or order between such entities or actions.
As shown in fig. 1, a multi-vision task filling container defect detection method according to an embodiment of the present application includes:
step 1: data acquisition terminal
Acquisition device
Images inside and outside the container storage tank are acquired with a Basler industrial camera and supplementary light-source illumination; the camera resolution is 2048x1080. The main acquisition parts include the tank mouth, the tank bottom, the vicinity of the weld seams, and the like.
1.2 imaging sample data collection
Acquiring storage tank defect image data: four typical surface defects were collected, including plaques, cracks, pitted surfaces and scratches. The acquired images are converted to RGB three-channel mode, and the pixel mean of all images is computed per channel, without considering the spatial position of the pixel points. During model training and validation, this mean is subtracted from the value at each position of the picture, yielding normalized sample data and avoiding abnormal values.
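The per-channel mean subtraction described above can be sketched as follows; the function name and batch layout (N, H, W, 3) are illustrative, not from the patent.

```python
import numpy as np

def channel_mean_normalize(images):
    """Subtract the per-channel pixel mean, computed over all images and all
    pixels regardless of spatial position, as the acquisition step describes."""
    images = np.asarray(images, dtype=np.float64)
    channel_mean = images.mean(axis=(0, 1, 2))  # one mean per R, G, B channel
    return images - channel_mean, channel_mean

# Toy batch: two 2x2 RGB "images"
batch = np.arange(24, dtype=np.float64).reshape(2, 2, 2, 3)
normalized, mean = channel_mean_normalize(batch)
```

After normalization the per-channel mean of the batch is zero, which is the intended effect of subtracting the dataset mean.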
Step 2: data enhancement
2.1 semi-supervised data enhancement method
In the field of defect detection and identification, a large number of labeling samples are usually required to be collected for training of a deep learning model. However, in an actual application scenario, since the probability of occurrence of a part of defects is low, it is often difficult to collect enough defect samples, and the detection effect of the deep learning model needs a large number of samples as a support, which poses a great challenge to improving the model accuracy.
A conditional generative adversarial network (Conditional GAN, abbreviated cGAN) consists of two parts: the generator, responsible for producing data, and the discriminator, responsible for judging whether data is real or synthetic. The generator continuously improves so that the generated data becomes more realistic and the discriminator judges it to be real; the discriminator likewise optimizes itself to judge more accurately. The discriminator's only trusted input is the real data, and the generator's goal is to produce synthetic data consistent with the real data.
Since cGAN requires the training data to be paired, paired data is created first. A collected sample image undergoes background removal and binarization thresholding to obtain a binary image I_mask containing only the flaws, with values 0 and 255, where 255 represents a defective region. I_mask is then fed to the generator, which fills in the masked region to obtain a generated defect image I_mask_out. Next, I_mask is paired with the generated defect image I_mask_out and input to the discriminator as a negative sample, while I_mask is paired with a real defect image and input to the discriminator as a positive sample; the discriminator outputs the probability that its input is a real image.
The specific training process is described as: firstly, parameters of a generator are fixed, a discriminator is supervised through a real flaw image, and in the process of optimizing the parameters of the discriminator, the characteristics of the real flaw image are learned, so that the generated flaw image can be distinguished from the real flaw image. Then, the parameters of the discriminator are fixed, and the parameters of the generator are optimized, so that the generated defect image gradually approaches to the real defect image. The above two processes are alternately performed until the discriminator cannot distinguish between the true defect image and the generated defect image.
More defective image samples can be regenerated by using cGAN, providing sufficient data support for subsequent classification, inspection and segmentation.
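A minimal sketch of the data-preparation side of this pipeline, the binarization into I_mask and the positive/negative pairing fed to the discriminator; the threshold value and function names are illustrative assumptions, and the actual generator/discriminator networks are omitted.

```python
import numpy as np

def binarize_defect_mask(gray, threshold=128):
    """Background-removed grayscale image -> binary I_mask with values 0 and
    255, where 255 marks the defective area (threshold is illustrative)."""
    return np.where(np.asarray(gray) >= threshold, 255, 0).astype(np.uint8)

def make_discriminator_pairs(i_mask, generated, real):
    """Pair I_mask with the generated image as a negative sample (label 0)
    and with the real defect image as a positive sample (label 1)."""
    negative = ((i_mask, generated), 0)
    positive = ((i_mask, real), 1)
    return [negative, positive]
```

During training these labeled pairs supervise the discriminator, while the generator is updated in alternation as described above.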
2.2 image annotation
The flaw sample images are given pixel-level classification and segmentation labels with labelme software, where flawed regions are calibrated as 1, forming the labels of the segmentation task. The minimum and maximum values of the abscissa and ordinate of each segmented region are taken to form the smallest rectangular frame surrounding the defect region, forming the labels of the detection task. Finally, the regions and detection frames are named by class according to the 4 common flaw types, forming the labels of the classification task.
After acquisition with the industrial camera and sample enhancement, four types of samples are finally formed, including plaques, cracks, pitted surfaces and scratches, giving a multi-classification task dataset with 1000 samples per class for training the subsequent network model.
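Deriving the detection label from a segmentation label, as described above, amounts to taking the extreme coordinates of the labeled pixels; a minimal sketch (function name is an assumption):

```python
import numpy as np

def mask_to_bbox(mask):
    """Smallest axis-aligned rectangle enclosing all defect pixels (value 1),
    i.e. the detection-task label derived from the segmentation label."""
    ys, xs = np.nonzero(mask)
    if xs.size == 0:
        return None  # no defect labeled in this sample
    return (int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max()))
```

The returned tuple is (x_min, y_min, x_max, y_max); pairing it with the region's class name yields the combined detection and classification annotation.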
Step 3: classification, detection and segmentation of flaws
3.1 neural network-based Multi-task learning framework
On the premise of managing the network size, latency and performance effects caused by layer sharing, and the influence on shared training in different multi-task learning settings, the application mainly considers using the same network structure to perform the three visual tasks of defect classification, detection and segmentation. The structure of the multi-task learning framework (Multi-task Learning for Classification, Detection and Segmentation, MTL-CDSnet) is shown in the middle box of figure 1. The backbone network of MTL-CDSnet combines stacked convolution layers, sigmoid activations and batch normalization. The acquired images enter a backbone network built from ResNet50, which learns a multi-level shared feature map. Classification, detection and segmentation are then performed in three heads, achieving parallel multi-tasking.
Current detection methods such as Mask RCNN or SSD use bounding boxes and select candidate boxes with non-maximum suppression; although this achieves high detection accuracy, it involves hyper-parameters such as the number of anchor boxes, the size ratios and the number of image division regions. Similar to the classical CenterNet network, MTL-CDSnet adopts anchor-free detection, and performs semantic segmentation with the Decoder of Unet combined with a self-attention module. In the anchor-free method, the bounding box is converted directly into a two-dimensional Gaussian distribution to form a heat map, whose maximum marks the center of the bounding box; this maximum is used to recover the rectangular bounding box at a later stage. The anchor-free detection algorithm works directly on the heat map generated from the feature map, needs no default values to discretize data for bounding-box detection, and greatly simplifies the detection network.
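Rendering a box center as a 2-D Gaussian peak on the heat map, as the anchor-free scheme above describes, can be sketched as follows; the sigma default is illustrative (in CenterNet-style methods it is usually derived from the box size).

```python
import numpy as np

def gaussian_heatmap(shape, center, sigma=2.0):
    """Splat a box center onto a heat map as a 2-D Gaussian peak; the
    maximum of the heat map marks the bounding-box center."""
    h, w = shape
    y, x = np.mgrid[0:h, 0:w]
    cy, cx = center
    return np.exp(-((x - cx) ** 2 + (y - cy) ** 2) / (2.0 * sigma ** 2))
```

At inference, local maxima of the predicted heat map are taken as box centers, and the rectangle is reconstructed from the predicted size and offset.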
Based on the above feature map, the detection head is responsible for interpreting the feature map into detection results. In the detection-head module, a 3x3 convolution layer first compresses the input feature map to 256 channels, followed by two parallel 1x1 convolution layers that generate the target center-point heat map and the target scale prediction map.
In the application, the loss function of the detection task, L_{detection}, refines the bounding box using another heat map containing the pixel size values of the boxes. Its expression is given in equation (1), where γ and σ are integer hyper-parameters (less than 10) and N is the total number of targets; Y_{pc} takes the values 0 and 1, with 0 indicating that defect type c is not present at the current point p and 1 indicating that it is:

L_{detection} = -\frac{1}{N}\sum_{p,c}\begin{cases}(1-\hat Y_{pc})^{\gamma}\log \hat Y_{pc}, & Y_{pc}=1\\(1-Y_{pc})^{\sigma}\,\hat Y_{pc}^{\gamma}\log(1-\hat Y_{pc}), & \text{otherwise}\end{cases}\qquad (1)
Since the detection is anchor-free, the size of the detection frame must also be predicted. The regression loss is denoted L_{boundingbox} and given by equation (2), where \hat s_k denotes the predicted detection frame size, s_k the size of the annotated box, and \lVert\cdot\rVert_2 the second-order norm:

L_{boundingbox} = \frac{1}{N}\sum_{k=1}^{N}\lVert \hat s_k - s_k \rVert_2 \qquad (2)
Meanwhile, to improve the accuracy of the detection frame, an offset loss L_{off} for the heat-map bounding box is established, as shown in equation (3), where \hat O_{\tilde p} denotes the predicted offset value, p the center-point coordinate, R the scale, and \tilde p = \lfloor p/R \rfloor the center-point coordinate scaled by R:

L_{off} = \frac{1}{N}\sum_{p}\left|\hat O_{\tilde p} - \left(\frac{p}{R} - \tilde p\right)\right| \qquad (3)
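The three detection losses can be sketched numerically as below, under the reconstructions of equations (1) to (3) given above (the original equations are images in the patent, so the exact forms here are CenterNet-style assumptions; function names are illustrative).

```python
import numpy as np

def detection_focal_loss(pred, gt, gamma=2, sigma=4):
    """Penalty-reduced focal loss over the center heat map, per the
    reconstructed equation (1): positives (gt == 1) use (1-p)^gamma * log(p);
    the remaining points are down-weighted by (1 - gt)^sigma."""
    pred = np.clip(np.asarray(pred, float), 1e-12, 1 - 1e-12)
    gt = np.asarray(gt, float)
    pos = gt == 1
    n = max(int(pos.sum()), 1)  # N: total number of target centers
    pos_loss = ((1 - pred[pos]) ** gamma * np.log(pred[pos])).sum()
    neg_loss = ((1 - gt[~pos]) ** sigma * pred[~pos] ** gamma
                * np.log(1 - pred[~pos])).sum()
    return -(pos_loss + neg_loss) / n

def bbox_size_loss(pred_sizes, gt_sizes):
    """Equation (2): mean second-order norm between predicted and annotated
    box sizes over the targets."""
    d = np.asarray(pred_sizes, float) - np.asarray(gt_sizes, float)
    return float(np.linalg.norm(d, axis=-1).mean())

def offset_loss(pred_offsets, centers, R):
    """Equation (3): L1 penalty between the predicted offset and the
    quantization error p/R - floor(p/R) introduced by the scale R."""
    centers = np.asarray(centers, float)
    target = centers / R - np.floor(centers / R)
    return float(np.abs(np.asarray(pred_offsets, float) - target).mean())
```

A predicted heat map that matches the ground truth yields a much smaller focal loss than a uniform one, and both regression losses vanish for perfect predictions.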
In addition, the semantic segmentation task in the MTL-CDSnet architecture also performs image classification. The feature map is gradually up-sampled to the segmentation-map size in the Decoder process, and a softmax layer is used as normalization over the classes to predict each pixel and region. The corresponding losses L_{seg} and L_{classify} are given by equations (4) and (5):

L_{seg} = -y\log y' - (1-y)\log(1-y') \qquad (4)

L_{classify} = -\frac{1}{batchsize}\sum_{i=1}^{batchsize}\sum_{j=1}^{C} p_{i,j}\log \hat p_{i,j} \qquad (5)

where y denotes the annotated segmentation value and y' the predicted segmentation value; batchsize is the batch value set during training, generally 4, 8, 16, etc.; C is the total number of defect classes; p_{i,j} is the true label of the i-th sample for the j-th class, and \hat p_{i,j} is the probability predicted by the network.
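Equations (4) and (5) can be sketched as follows; the clipping epsilon and function names are illustrative additions for numerical stability.

```python
import numpy as np

def seg_loss(y, y_pred, eps=1e-12):
    """Equation (4): per-pixel binary cross-entropy between the annotated
    segmentation value y and the predicted value y'."""
    y = np.asarray(y, float)
    y_pred = np.clip(np.asarray(y_pred, float), eps, 1 - eps)
    return float(np.mean(-y * np.log(y_pred) - (1 - y) * np.log(1 - y_pred)))

def classify_loss(p_true, p_pred, eps=1e-12):
    """Equation (5): cross-entropy over the C defect classes, averaged over
    the batch; p_true is one-hot, p_pred the network's softmax output."""
    p_true = np.asarray(p_true, float)
    p_pred = np.clip(np.asarray(p_pred, float), eps, 1.0)
    return float(-(p_true * np.log(p_pred)).sum(axis=1).mean())
```

Both losses approach zero as the predictions approach the labels, and grow as the predicted probabilities drift away from them.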
These three visual tasks (classification, detection and segmentation) are combined together in parallel: anchor-free detection contributes L_{detection}, L_{boundingbox} and L_{off}, while segmentation and classification contribute L_{seg} and L_{classify}. The loss function of the final MTL-CDSnet can be expressed as equation (6):

L_{MTL-CDS} = L_{detection} + L_{boundingbox} + L_{off} + \alpha L_{seg} + \beta L_{classify} \qquad (6)
the influence of a and beta on the total loss by a single task can be considered as noise parameters and can be learned. The specific parameter selection may be adjusted according to the characteristics of the data set, for example, setting α, β to a constant to achieve the best performance index.
3.2 training method of network model
The application trains the network model on a self-built dataset; the training platform is a server with a GeForce RTX 4080 GPU. Model parameters are initialized from a resnet50 pre-trained on the ImageNet dataset. To preserve the detail information of the images, the last down-sampling layer of the residual network is replaced with dilated convolution, so that the resolution of the last layer is 1/16 of the original. Training runs for 300 epochs in total. The Adam optimizer is used with an initial learning rate of 0.0001, the learning rate is balanced with a Poly strategy of adaptive adjustment, and batch_size is set to 8. The dataset is split into training, validation and test sets in the ratio 8:1:1.
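The Poly learning-rate strategy mentioned above is commonly implemented as below; the power value 0.9 is a conventional choice and an assumption here, as the text does not state it.

```python
def poly_lr(base_lr, epoch, max_epoch, power=0.9):
    """Poly learning-rate decay: lr = base_lr * (1 - epoch/max_epoch)**power.
    Starts at base_lr and decays smoothly to zero at max_epoch."""
    return base_lr * (1.0 - epoch / max_epoch) ** power
```

With base_lr = 0.0001 and 300 epochs, as in the training setup, the rate starts at 0.0001 and reaches zero at the final epoch.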
The segmentation accuracy of the segmented regions is evaluated with the intersection-over-union (IoU), which measures the overlap of the detection frame and the ground-truth frame as the ratio of their intersection to their union. The comprehensive evaluation index mAP (mean Average Precision), combining precision and recall, is selected to evaluate the detection task. The harmonic mean of precision and recall, F1-score, is chosen to evaluate the network's ability to find all samples of a class. The three evaluation indexes can be expressed as the group of formulas (7):

IoU = \frac{TP}{TP+FP+FN}, \quad mAP = \frac{1}{C}\sum_{i=1}^{C} AP_i, \quad F1 = \frac{2\,Precision \cdot Recall}{Precision + Recall} \qquad (7)

with Precision = TP/(TP+FP) and Recall = TP/(TP+FN).
the TP is expressed as a predicted positive example, and the actual positive example is the predicted positive example, namely the algorithm is correctly predicted; FP represents a positive case of prediction, and a negative case of actual prediction, i.e., an algorithm prediction error; FN represents a negative example of prediction and a positive example of actual prediction, namely an algorithm prediction error; c is the number of categories in the target detection task, and APi is the average precision of specific categories.
Table 1 shows the evaluation results of the MTL-CDSnet multi-task network structure when a single task and when multiple tasks are executed simultaneously. The evaluation indexes of the multi-task setting are comparable to those of the single tasks, indicating that the features extracted by the three tasks in the bottom convolution layers are similar; sharing the low-level features simplifies the original three models while preserving task accuracy.
Table 1 evaluation results of single and multiple tasks
Network structure | Segmentation (IoU) | Detection (mAP) | Classification (F1-score)
Segmentation only | 72.9% | N/A | N/A
Detection only | N/A | 55.1% | N/A
Classification only | N/A | N/A | 78.3%
Segmentation + detection | 71.3% | 56.0% | N/A
Segmentation + detection + classification | 72.9% | 56.2% | 77.8%
Step 4: host computer parameter estimation
After the MTL-CDS model is trained as in step 3, the model with the best evaluation indexes is deployed to a workstation. The workstation has two main working modes: in the first, an industrial camera monitors the filling container in real time; in the second, the camera first scans and stores images, which are then opened locally with the host computer. The operation interface is shown in fig. 2. Taking real-time detection as an example, a picture is acquired every 15 frames and fed into the deployed trained model, the multiple tasks are run, and the detection results are displayed in the visual area, such as the types and number of defects present in a filling container and the pixel proportion occupied by each defect. When a new type of flaw is encountered, the image can be stored in the sample library to expand the training set, and clicking "model update" produces a new model trained on the expanded samples.
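The real-time sampling logic described above (one picture every 15 frames) can be sketched as follows; `read_frame` and `run_multitask_model` are hypothetical placeholders for the camera driver and the deployed model, not APIs from the patent.

```python
# Sketch of the real-time mode: sample every 15th frame from the camera
# stream and run it through the deployed multi-task model.
def should_sample(frame_count, every=15):
    """True for every `every`-th frame (the text acquires every 15 frames)."""
    return frame_count % every == 0

def monitor(read_frame, run_multitask_model, every=15):
    """read_frame() returns a frame or None when the camera stops;
    yields one model result per sampled frame."""
    frame_count = 0
    while True:
        frame = read_frame()
        if frame is None:          # camera stopped
            return
        frame_count += 1
        if should_sample(frame_count, every):
            # result: defect classes, counts, per-defect pixel proportions
            yield run_multitask_model(frame)
```

Used with a fake frame source, `monitor` yields exactly the frames the text describes sending to the model.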
The application provides a deep-learning-based filling container defect detection method whose notable advantages over existing defect detection methods are:
(1) to address the shortage of defect samples, a cGAN generative training model is adopted that generates additional sample images from the samples collected in the earlier stage, providing sufficient data support for deep learning in filling container defect detection;
(2) to handle the variety of defect types, defects are divided into four major categories and the MTL-CDSnet multi-task learning network architecture is designed to classify, detect and segment defects simultaneously; its performance is comparable to that of the traditional single-task models, while the three learning tasks share a backbone network and related operations, greatly simplifying the traditional single-task learning models;
(3) a filling container defect detection host computer is designed that can monitor the filling equipment online, extract the defect types, segment the relevant defect regions, and provide parameter results, supplying effective data evidence for subsequent defect evaluation.
An embodiment of the application provides a multi-vision task filling container defect detection system, comprising:
the data acquisition module 1, used for acquiring images of the inside and outside of the container storage tank to obtain storage tank defect image data, and transmitting the data to the data enhancement module 2;
the data enhancement module 2, used for generating additional defect image samples from the obtained storage tank defect images, classifying them and applying pixel-level segmentation labels, and transmitting the samples to the model building and training module 3;
the model building and training module 3, used for building and training the three visual task models of defect classification, detection and segmentation using the MTL-CDS structure of the multi-task learning framework;
the host computer parameter estimation module 4, which deploys the model with the best evaluation indexes to a field workstation, monitors the filling container online through the host computer, extracts the defect types, segments the relevant defect regions, and provides parameter results as effective data evidence for subsequent defect evaluation.
An embodiment of the present application provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the multi-vision task filling container defect detection method described in any of the above.
The above description is only an example of the present application and is not intended to limit its scope; various modifications and variations will be apparent to those skilled in the art. Any modification, equivalent replacement or improvement made within the spirit and principle of the present application shall be included in its protection scope.

Claims (8)

1. A multi-vision task filling container defect detection method, characterized by comprising the following steps:
S1: data acquisition: acquiring images of the inside and outside of a container storage tank to obtain storage tank defect image data;
S2: data enhancement: performing sample enhancement on the acquired storage tank defect images to generate additional defect image samples, and applying pixel-level classification and segmentation labels to the defect image samples;
S3: establishing defect classification, detection and segmentation models: building and training the three visual task models of defect classification, detection and segmentation using the MTL-CDS structure of the multi-task learning framework;
S4: host computer parameter estimation: deploying the model with the best evaluation indexes to a field workstation, monitoring the filling container online through the host computer, extracting the defect types, segmenting the relevant defect regions, and providing parameter results as effective data evidence for subsequent defect evaluation.
2. The multi-vision task filling container defect detection method according to claim 1, wherein in step S1 the main acquisition locations for the images of the inside and outside of the container storage tank include the tank opening, the tank bottom and the vicinity of the weld seams; the storage tank defect image data are obtained by converting the acquired images into three-channel RGB mode and computing the per-channel pixel mean over all images; during model training and verification the mean is subtracted from the value at each position of the image, yielding normalized sample data and avoiding abnormal values.
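A minimal sketch of the per-channel mean normalization in this claim, assuming images arrive as H×W×3 RGB arrays; the function names are illustrative.

```python
# Sketch: compute the per-channel pixel mean over all RGB images, then
# subtract it from each image at training/verification time, as claim 2
# describes.
import numpy as np

def channel_means(images):
    """images: list of HxWx3 RGB arrays; returns the 3 per-channel means
    computed over every pixel of every image."""
    stacked = np.concatenate([img.reshape(-1, 3) for img in images], axis=0)
    return stacked.mean(axis=0)

def normalize(image, means):
    """Subtract the per-channel means from every position of the image."""
    return image.astype(np.float32) - means
```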
3. A multi-vision task filling container defect detection method according to claim 1, wherein said step S2 comprises:
making paired data: performing background removal and binarization thresholding on the acquired storage tank defect image to obtain a binary image I_mask containing only the defect, with values 0 and 255, where 255 represents the defective area; I_mask is then input to the generator, which fills the defect region of I_mask to obtain a generated defect image I_mask_out; next, the I_mask image is paired with the generated defect image I_mask_out and input to the discriminator as a negative sample, while I_mask is paired with a real defect image and input to the discriminator as a positive sample; the discriminator outputs the probability that its input is a real defect image;
conditional generative adversarial network training: first, the generator parameters are fixed and the discriminator is supervised with real defect images; in optimizing the discriminator parameters it learns the characteristics of real defect images and thereby distinguishes generated defect images from real ones; then the discriminator parameters are fixed and the generator parameters are optimized so that the generated defect images gradually approach real defect images; the two processes are executed alternately until the discriminator cannot distinguish real from generated defect images, yielding data-enhanced defect image samples.
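The alternating optimization in this claim can be sketched as below, with tiny linear stand-ins for the generator and discriminator; the layer sizes, learning rates and BCE loss are illustrative assumptions, not the patent's networks.

```python
# Sketch of the alternating cGAN training in step S2: phase 1 updates the
# discriminator D on (mask, real) vs (mask, generated) pairs with G frozen;
# phase 2 updates the generator G with D's parameters fixed.
import torch
import torch.nn as nn

G = nn.Linear(8, 8)                                # generator stand-in
D = nn.Sequential(nn.Linear(16, 1), nn.Sigmoid())  # discriminator on pairs
bce = nn.BCELoss()
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)

def train_step(mask, real):
    fake = G(mask)                                 # generated defect image
    # Phase 1: optimize D; G is fixed because the fake is detached.
    d_real = D(torch.cat([mask, real], dim=1))            # positive pair
    d_fake = D(torch.cat([mask, fake.detach()], dim=1))   # negative pair
    loss_d = (bce(d_real, torch.ones_like(d_real))
              + bce(d_fake, torch.zeros_like(d_fake)))
    opt_d.zero_grad(); loss_d.backward(); opt_d.step()
    # Phase 2: optimize G; only opt_g steps, so D stays fixed. G tries to
    # make D label the generated pair as real.
    d_fake = D(torch.cat([mask, fake], dim=1))
    loss_g = bce(d_fake, torch.ones_like(d_fake))
    opt_g.zero_grad(); loss_g.backward(); opt_g.step()
    return loss_d.item(), loss_g.item()
```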
4. A multi-vision task filling container defect detection method according to claim 3, wherein said step S2 further comprises:
performing pixel-level classification and segmentation labeling of the enhanced defect image samples with the labelme software: defect regions are labeled as 1, forming the annotation for the segmentation task; the minimum and maximum of the horizontal and vertical coordinates of each segmented region are taken to form the minimal rectangular box enclosing the defect region, forming the annotation for the detection task; the regions and detection boxes are named according to the 4 common flaw classes, forming the annotation for the classification task; the result is a multi-class task dataset of four sample types, namely patches, cracks, pitted surfaces and scratches, with 1000 samples per class, providing the sample dataset for training the subsequent network model.
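Deriving the detection-task box from the segmentation annotation, as this claim describes, amounts to taking the coordinate extremes of the labeled polygon; a minimal sketch:

```python
# Sketch: minimal enclosing rectangle of a labelme polygon, formed from the
# min/max of the vertices' horizontal and vertical coordinates.
def polygon_to_bbox(points):
    """points: list of (x, y) vertices of a segmentation polygon;
    returns the detection label as (x1, y1, x2, y2)."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return (min(xs), min(ys), max(xs), max(ys))
```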
5. The multi-vision task filling container defect detection method according to claim 1, wherein establishing the models in step S3 comprises:
l for detecting loss function of task detection The boundary box is refined by using a heat map containing the pixel size value of the box, the specific expression is shown in the formula 1, gamma and sigma are an integer super-parameter, N is the total number of targets,the values of 0 and 1,0 indicating that the current point p does not have the defect type c,1 indicating that the current point p has the defect type c,
since the detection is based on the detection of the anchor-free frame, the size information of the detection frame needs to be predicted and L is used boundingbox The regression loss function is expressed as a specific expression (2), wherein,the detection frame size, s, expressed as predicted k Represented as the size of the marked box, the expression. The second order norm,
meanwhile, for detecting the accuracy of the frame, an offset L of the heat map bounding box is established off The specific expression is shown in the formula (3),
wherein,representing the predicted offset value, p being the center point coordinate, R being the scale, and +.>To scale the center point coordinates by R,
in the segmentation task of the MTL-CDSnet architecture, image classification is also performed: in the Decoder the feature map is progressively upsampled to the size of the segmentation map, and a softmax layer normalizes over the classes to predict each pixel and each region; the corresponding losses L_seg and L_classify are given by formulas (4) and (5),
L_seg = -y log(y′) - (1 - y) log(1 - y′)    (4)
where y denotes the labeled segmentation value and y′ the predicted segmentation value; batch_size is the batch value set during training, generally 4, 8 or 16; C is the total number of defect classes; p_i,j is the true label of the i-th sample on the j-th class; and p̂_i,j is the probability predicted by the network;
the three visual tasks of classification, detection and segmentation are combined in parallel: the detection branch uses the anchor-free losses L_detection, L_boundingbox and L_off, and the segmentation and classification branches use L_seg and L_classify; the loss function of the final MTL-CDS can be expressed as in formula (6):
where α and β weight the contribution of the individual tasks to the total loss; the specific values can be adjusted according to the characteristics of the dataset.
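Since formula (6) itself is not reproduced in this text, the following is an assumed reading of the final loss as a weighted sum of the five task losses, with α and β as described above; the exact grouping in the patent's formula (6) may differ.

```python
# Hedged sketch of the total MTL-CDS loss: the three detection losses plus
# the segmentation and classification losses weighted by alpha and beta.
def mtl_cds_loss(l_detection, l_boundingbox, l_off,
                 l_seg, l_classify, alpha=1.0, beta=1.0):
    return (l_detection + l_boundingbox + l_off
            + alpha * l_seg + beta * l_classify)
```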
6. The multi-vision task filling container defect detection method according to claim 1, wherein training the model in step S3 comprises: initializing the model parameters from a ResNet-50 pre-trained on the ImageNet dataset; replacing the downsampling of the last layer of the residual network with dilated convolution so that the resolution of the last layer is 1/16 of the original resolution; training on a GeForce RTX 4080 GPU platform with an Adam optimizer and an initial learning rate of 0.0001; balancing the network learning speed by adaptively adjusting the learning rate with a Poly strategy; setting batch_size to 8; and splitting the training dataset in a ratio of 8:1:1.
7. A multi-vision task filling container defect detection system, characterized by comprising:
a data acquisition module for acquiring images of the inside and outside of the container storage tank to obtain storage tank defect image data and transmitting the data to the data enhancement module;
a data enhancement module for generating additional defect image samples from the obtained storage tank defect images, applying pixel-level classification and segmentation labels to them, and transmitting the samples to the model building and training module;
a model building and training module for building and training the three visual task models of defect classification, detection and segmentation using the MTL-CDS structure of the multi-task learning framework;
a host computer parameter estimation module that deploys the model with the best evaluation indexes to the host computer, monitors the filling container online through the host computer, extracts the defect types, segments the relevant defect regions, and provides parameter results as effective data evidence for subsequent defect evaluation.
8. A computer-readable storage medium storing a computer program, characterized in that the computer program, when executed by a processor, implements the multi-vision task filling container defect detection method according to any one of claims 1 to 6.
CN202311100670.6A 2023-08-29 2023-08-29 Multi-visual task filling container defect detection method, system and medium Pending CN117173461A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311100670.6A CN117173461A (en) 2023-08-29 2023-08-29 Multi-visual task filling container defect detection method, system and medium

Publications (1)

Publication Number Publication Date
CN117173461A true CN117173461A (en) 2023-12-05

Family

ID=88929130

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311100670.6A Pending CN117173461A (en) 2023-08-29 2023-08-29 Multi-visual task filling container defect detection method, system and medium

Country Status (1)

Country Link
CN (1) CN117173461A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117540283A (en) * 2024-01-10 2024-02-09 河北省药品医疗器械检验研究院(河北省化妆品检验研究中心) Intelligent electromagnetic appliance performance evaluation method
CN117710310A (en) * 2023-12-14 2024-03-15 上海千映智能科技有限公司 System and method for detecting appearance defects of chip based on artificial intelligence
CN117789184A (en) * 2024-02-26 2024-03-29 沈阳派得林科技有限责任公司 Unified weld joint ray image intelligent identification method

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20200175352A1 (en) * 2017-03-14 2020-06-04 University Of Manitoba Structure defect detection using machine learning algorithms
CN111696079A (en) * 2020-05-15 2020-09-22 中山大学 Surface defect detection method based on multi-task learning
CN111862067A (en) * 2020-07-28 2020-10-30 中山佳维电子有限公司 Welding defect detection method and device, electronic equipment and storage medium
CN111986142A (en) * 2020-05-23 2020-11-24 冶金自动化研究设计院 Unsupervised enhancement method for surface defect image data of hot-rolled plate coil
CN113378675A (en) * 2021-05-31 2021-09-10 南京理工大学 Face recognition method for simultaneous detection and feature extraction
CN114820541A (en) * 2022-05-07 2022-07-29 武汉象点科技有限公司 Defect detection method based on reconstructed network
CN115375617A (en) * 2022-06-28 2022-11-22 鲁班嫡系机器人(深圳)有限公司 Defect detection and training method and device, storage medium and equipment
CN115564950A (en) * 2022-09-29 2023-01-03 中北大学 Small sample rocket projectile bonding defect detection method based on deep learning
CN115631132A (en) * 2022-09-09 2023-01-20 国能珠海港务有限公司 Network training method, defect detection method, device, storage medium and equipment


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Shen Xiaohai et al.: "Surface defect detection of aluminum based on multi-task deep learning", 《激光与光电子学进展》 (Laser & Optoelectronics Progress), vol. 57, no. 10, 31 May 2020 (2020-05-31), pages 1-10 *
Xie Yuan et al.: "Surface defect detection model for injection-molded bottles based on semi-supervised deep convolutional generative adversarial networks", 《计算机科学》 (Computer Science), 28 June 2020 (2020-06-28), pages 92-96 *


Similar Documents

Publication Publication Date Title
Medak et al. Automated defect detection from ultrasonic images using deep learning
CN117173461A (en) Multi-visual task filling container defect detection method, system and medium
CN110148130B (en) Method and device for detecting part defects
JP2024509411A (en) Defect detection method, device and system
Perner et al. A comparison between neural networks and decision trees based on data from industrial radiographic testing
Parlak et al. Deep learning-based detection of aluminum casting defects and their types
Jiang et al. A machine vision-based realtime anomaly detection method for industrial products using deep learning
Posilović et al. Flaw detection from ultrasonic images using YOLO and SSD
Zhao et al. A sparse-representation-based robust inspection system for hidden defects classification in casting components
CN114445366A (en) Intelligent long-distance pipeline radiographic image defect identification method based on self-attention network
Zuo et al. Classifying cracks at sub-class level in closed circuit television sewer inspection videos
Provencal et al. Identification of weld geometry from ultrasound scan data using deep learning
AU2020271967A1 (en) Method for determining the geometry of a defect on the basis of non-destructive measurement methods using direct inversion
Gupta et al. Deep learning model for defect analysis in industry using casting images
JP2019530096A (en) Digital image processing to remove unwanted parts
Guldur et al. Automated classification of detected surface damage from point clouds with supervised learning
Dulecha et al. Crack detection in single-and multi-light images of painted surfaces using convolutional neural networks
Sutcliffe et al. Automatic defect recognition of single-v welds using full matrix capture data, computer vision and multi-layer perceptron artificial neural networks
Zuo et al. An X-Ray-Based Automatic Welding Defect Detection Method for Special Equipment System
Molefe et al. Classification of thermite welding defects using local binary patterns and k nearest neighbors
CN113763395B (en) Method for analyzing cavitation bubble dynamics based on image
CN113781513B (en) Leakage detection method and system for water supply pipeline of power plant
Skilton et al. Combining object detection with generative adversarial networks for in-component anomaly detection
Akgül Mobile-DenseNet: Detection of building concrete surface cracks using a new fusion technique based on deep learning
Singh et al. Segmentation technique for the detection of Micro cracks in solar cell using support vector machine

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination