CN113989585B

CN113989585B - Medium-thickness plate surface defect detection method based on multi-feature fusion semantic segmentation

Info

Publication number: CN113989585B
Application number: CN202111200761.8A
Authority: CN
Inventors: 吴昆鹏; 孙文权; 杨朝霖; 石杰
Original assignee: University of Science and Technology Beijing USTB
Current assignee: University of Science and Technology Beijing USTB
Priority date: 2021-10-13
Filing date: 2021-10-13
Publication date: 2022-08-26
Anticipated expiration: 2041-10-13
Also published as: CN113989585A

Abstract

The invention provides a medium plate surface defect detection method based on multi-feature fusion semantic segmentation, and belongs to the technical field of steel plate surface defect detection. The method comprises: constructing a multi-feature fusion semantic segmentation model comprising an encoding part and a decoding part, wherein the encoding part obtains features of a plurality of different receptive fields through convolutions with different dilation rates, and features of the same size are fused at each scale of the decoding part, which enhances the model's ability to extract contextual semantic information; and, based on a semi-supervised learning optimization strategy, training the constructed model using three types of data, namely a labeled defect sample set, an unlabeled defect sample set and a steel plate background sample set, so that the trained model can detect surface defects of medium plates. The invention can improve the accuracy of defect detection and identification.

Description

Medium-thickness plate surface defect detection method based on multi-feature fusion semantic segmentation

Technical Field

The invention relates to the technical field of steel plate surface defect detection, in particular to a medium plate surface defect detection method based on multi-feature fusion semantic segmentation.

Background

Medium plates generally refer to steel plates with a thickness of 4.5 mm to 25 mm. Surface defects inevitably occur during their production and seriously affect the quality of the final product, so they must be detected accurately and in time to prevent unqualified products from reaching customers.

Early surface defect detection relied mainly on manual inspection with auxiliary tools such as a flashlight after the steel plate had cooled. Manual inspection can hardly cover all steel plates; it is incomplete, not real-time and not traceable. In addition, the field environment is harsh and hazardous, which does not meet intelligent and safe production standards.

Therefore, at the present stage, machine-vision-based detection equipment is used in place of inspection by the naked eye, and some visual detection methods are already mature and applied in fields such as hot rolling, cold rolling, nonferrous metals and light industry. However, on a medium plate production line the surface background is complex and interference from iron scale and water marks is strong, so common object detection methods do not adapt well to these scenes. False alarms and missed detections occur easily, and the final defect detection and identification accuracy cannot meet actual field requirements.

Disclosure of Invention

The embodiment of the invention provides a medium plate surface defect detection method based on multi-feature fusion semantic segmentation, which can improve the detection and identification accuracy of defects. The technical scheme is as follows:

the embodiment of the invention provides a medium plate surface defect detection method based on multi-feature fusion semantic segmentation, which is applied to electronic equipment and comprises the following steps:

constructing a multi-feature fusion semantic segmentation model, wherein the multi-feature fusion semantic segmentation model comprises an encoding part and a decoding part, features of a plurality of different receptive fields are obtained in the encoding part through convolutions with different dilation rates, and features of the same size are fused at each scale of the decoding part;

based on a semi-supervised learning optimization strategy, the constructed multi-feature fusion semantic segmentation model is optimized and trained by utilizing three types of data, namely a labeled defect sample set, an unlabeled defect sample set and a steel plate background sample set, so that the trained multi-feature fusion semantic segmentation model can be used for detecting the surface defects of the medium and thick plates.

Further, the skeleton structure of the encoding part includes a feature extraction skeleton, which is Resnet18, Resnet50 or Xception39; the multi-scale features output by the feature extraction skeleton are {C0, C1, C2, C3, C4}; wherein, if the input picture size is H × W, the size of the C0 feature is H × W, the size of the C1 feature is H/2 × W/2, the size of the C2 feature is H/4 × W/4, the size of the C3 feature is H/8 × W/8, and the size of the C4 feature is H/16 × W/16;

obtaining features of a plurality of different receptive fields in the encoding part through convolutions with different dilation rates includes:

the C0 feature is convolved with a dilation rate of 2 and resized to the size of C1 to obtain the C0_1 feature, the C0 feature is convolved with a dilation rate of 4 and resized to the size of C2 to obtain the C0_2 feature, the C0 feature is convolved with a dilation rate of 8 and resized to the size of C3 to obtain the C0_3 feature, and the C0 feature is convolved with a dilation rate of 16 and resized to the size of C4 to obtain the C0_4 feature;

the C1 feature is convolved with a dilation rate of 2 and resized to the size of C2 to obtain the C1_2 feature, the C1 feature is convolved with a dilation rate of 4 and resized to the size of C3 to obtain the C1_3 feature, and the C1 feature is convolved with a dilation rate of 8 and resized to the size of C4 to obtain the C1_4 feature;

the C2 feature is convolved with a dilation rate of 2 and resized to the size of C3 to obtain the C2_3 feature, and the C2 feature is convolved with a dilation rate of 4 and resized to the size of C4 to obtain the C2_4 feature;

the C3 feature is convolved with a dilation rate of 2 and resized to the size of C4 to obtain the C3_4 feature.

Further, fusing the features of the same size at each scale of the decoding part includes:

fusing the {C4, C3_4, C2_4, C1_4, C0_4} features at the H/16 × W/16 scale and obtaining a feature P3 through convolution and 2× upsampling;

fusing the {P3, C3, C2_3, C1_3, C0_3} features at the H/8 × W/8 scale and obtaining a feature P2 through convolution and 2× upsampling;

fusing the {P2, C2, C1_2, C0_2} features at the H/4 × W/4 scale and obtaining a feature P1 through convolution and 2× upsampling;

fusing the {P1, C1, C0_1} features at the H/2 × W/2 scale and obtaining a feature P0 through convolution and 2× upsampling;

and fusing the {P0, C0} features at the H × W scale and obtaining a feature P through convolution, wherein the feature P is followed by the multi-Head output structure.

Further, the multi-feature fusion semantic segmentation model further comprises a multi-Head output structure, wherein

the multi-Head output structure includes two branches, one of which outputs the defect segmentation map and the other of which outputs the region weight map.

Further, the output defect segmentation map represents defect categories by gray values: the gray value corresponding to the background is a first preset value, and the gray value of each successive defect category is increased by a second preset value;

in the output region weight map, within a single target region, the weight at the centroid of the region is set to a third preset value, the weights of the other points in the region decay with their distance from the centroid at a rate given by a fourth preset value, and the weight does not fall below a fifth preset value.

Further, the defect categories include: work roll marks, pits, warping, longitudinal cracks, longitudinal scratches, transverse cracks, star-shaped cracks and foreign matter press-in.

Further, the optimization training of the constructed multi-feature fusion semantic segmentation model by using three types of data, namely a labeled defect sample set, an unlabeled defect sample set and a steel plate background sample set, based on the semi-supervised learning optimization strategy comprises the following steps:

based on the semi-supervised learning optimization strategy, the constructed multi-feature fusion semantic segmentation model is optimized and trained in a multi-stage training mode using three types of data: the labeled defect sample set D^l, the unlabeled defect sample set D^u and the steel plate background sample set D^b.

Further, optimizing and training the constructed multi-feature fusion semantic segmentation model in a multi-stage training mode using the labeled defect sample set D^l, the unlabeled defect sample set D^u and the steel plate background sample set D^b, based on the semi-supervised learning optimization strategy, comprises:

in the first training stage, the multi-feature fusion semantic segmentation model is trained on the set D^l; the model input is an original steel plate image from D^l, and the model output is a defect segmentation map and a region weight map; a cross entropy loss and an L1 loss are computed between the model output and the sample labels to optimize the model parameters, and the best model of this stage is retained as the teacher model F(T);

in the second training stage, the teacher model F(T) obtained in the first stage is used to generate pseudo labels for the unlabeled defect samples in the set D^u, as follows:

the unlabeled defect samples in D^u are input into the teacher model F(T); the defect regions in the output defect segmentation map are discretized according to the region weight map output by F(T) to obtain independent defect regions; the original image at the position of each defect region is cropped and input into a binary classification model for a secondary judgment; the label values of the regions classified as background in the secondary judgment and of the regions segmented as background by the teacher model F(T) are all set to a sixth preset value, and the label values of the other regions are kept unchanged, which gives the pseudo label of the image; the set D^ul formed by the unlabeled defect samples and the generated pseudo labels is then used to train the multi-feature fusion semantic segmentation model, and the second training stage yields the model F(S);

in the third training stage, the sets D^l and D^ul are merged into an augmented data set D^e; the model F(S) serves as the first branch and is copied to obtain the model F'(S) as the second branch, the model parameters of the second branch being copied entirely from the first branch; the input of the first branch is a sample from D^e, and the input of the second branch is obtained by fusing the foreground defect regions of the first-branch input sample with a background sample randomly drawn from the set D^b, with the sample labels kept unchanged; the model F(S) of the first branch is optimized, on the one hand, by the cross entropy loss between the first-branch output and the corresponding labels and, on the other hand, by the consistency loss between the outputs of the two branches, which yields the third-stage training model.

Further, the binary classification model is obtained by transfer learning on defect patches and background patches, wherein

the transferred model is Inception-V3, EfficientNet-B3 or Resnet101, and the binary classification model has 2 outputs: background and defect.

The technical scheme provided by the embodiments of the invention has at least the following beneficial effects:

In the embodiments of the invention, to address the complex surface background of medium plates and the heavy interference from color-difference defects, a multi-feature fusion semantic segmentation model is constructed to detect surface defects of medium plates. Exploiting the property that dilated convolution and down-sampling can produce receptive fields of the same size, the encoding part obtains features of a plurality of different receptive fields through convolutions with different dilation rates and retains these features so as to accumulate more contextual semantic information. Features of the same size are fused at each scale of the decoding part, absorbing feature information from different branches to strengthen the model's detection capability and its ability to mine contextual semantic information, which improves the detection effect. Meanwhile, to address the high labeling cost of semantic segmentation samples, a semi-supervised learning optimization strategy is provided in which three types of data, namely a labeled defect sample set, an unlabeled defect sample set and a steel plate background sample set, are used for model training, which further improves the accuracy and anti-interference capability of the model in detecting and identifying defects.

Drawings

In order to illustrate the technical solutions in the embodiments of the present invention more clearly, the drawings needed in the description of the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without creative effort.

FIG. 1 is a schematic flow chart of a method for detecting surface defects of a medium plate based on multi-feature fusion semantic segmentation according to an embodiment of the present invention;

FIG. 2 is a schematic structural diagram of a multi-feature fusion semantic segmentation model according to an embodiment of the present invention;

FIG. 3 is a schematic diagram of a first stage training process in a semi-supervised learning optimization process according to an embodiment of the present invention;

FIG. 4 is a schematic diagram of a second stage training process in a semi-supervised learning optimization process according to an embodiment of the present invention;

FIG. 5 is a schematic diagram of a third stage training process in a semi-supervised learning optimization process according to an embodiment of the present invention;

FIG. 6 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.

Detailed Description

In order to make the objects, technical solutions and advantages of the present invention more apparent, embodiments of the present invention will be described in detail with reference to the accompanying drawings.

As shown in FIG. 1 and FIG. 2, an embodiment of the present invention provides a method for detecting surface defects of a medium plate based on multi-feature fusion semantic segmentation. The method may be implemented by an electronic device, which may be a terminal or a server, and includes:

S101, constructing a multi-feature fusion semantic segmentation model, wherein the multi-feature fusion semantic segmentation model comprises an encoding part and a decoding part, features of a plurality of different receptive fields are obtained in the encoding part through convolutions with different dilation rates, and features of the same size are fused at each scale of the decoding part;

In this embodiment, the skeleton structure of the encoding part includes a feature extraction skeleton, which is Resnet18, Resnet50 or Xception39; the multi-scale features output by the feature extraction skeleton are {C0, C1, C2, C3, C4}; if the input picture size is H × W, the size of the C0 feature is H × W, the size of the C1 feature is H/2 × W/2, the size of the C2 feature is H/4 × W/4, the size of the C3 feature is H/8 × W/8, and the size of the C4 feature is H/16 × W/16. Features of a plurality of different receptive fields are then obtained through convolutions with different dilation rates; for example, the C3 feature is convolved with a dilation rate of 2 and resized to the size of C4 to obtain the C3_4 feature.

In this embodiment, assume that in application the size of the input picture is 1024 × 1024; then the size of the C0 feature is 1024 × 1024, the size of the C1 feature is 512 × 512, the size of the C2 feature is 256 × 256, the size of the C3 feature is 128 × 128, and the size of the C4 feature is 64 × 64;

the C0 feature is convolved with a dilation rate of 2 and resized to 512 × 512 to obtain the C0_1 feature, the C0 feature is convolved with a dilation rate of 4 and resized to 256 × 256 to obtain the C0_2 feature, the C0 feature is convolved with a dilation rate of 8 and resized to 128 × 128 to obtain the C0_3 feature, and the C0 feature is convolved with a dilation rate of 16 and resized to 64 × 64 to obtain the C0_4 feature;

the C1 feature is convolved with a dilation rate of 2 and resized to 256 × 256 to obtain the C1_2 feature, the C1 feature is convolved with a dilation rate of 4 and resized to 128 × 128 to obtain the C1_3 feature, and the C1 feature is convolved with a dilation rate of 8 and resized to 64 × 64 to obtain the C1_4 feature;

the C2 feature is convolved with a dilation rate of 2 and resized to 128 × 128 to obtain the C2_3 feature, and the C2 feature is convolved with a dilation rate of 4 and resized to 64 × 64 to obtain the C2_4 feature;

the C3 feature is convolved with a dilation rate of 2 and resized to 64 × 64 to obtain the C3_4 feature.
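The branch layout listed above follows a regular pattern: in every listed case, the branch from Ci to the scale of Cj uses a dilation rate of 2^(j-i). The following Python sketch only enumerates that layout and is not the patented implementation; the function name `dilation_schedule` and the closed-form rule are illustrative assumptions inferred from the listed values.

```python
def dilation_schedule(height, width, num_levels=5):
    """Enumerate encoder feature sizes and cross-scale dilated branches.

    Assumption (inferred from the listed rates): feature Ck has size
    (height / 2**k, width / 2**k), and branch Ci_j (i < j) convolves Ci with
    dilation rate 2**(j - i) and resizes the result to the size of Cj.
    """
    sizes = {f"C{k}": (height >> k, width >> k) for k in range(num_levels)}
    branches = {}
    for i in range(num_levels):
        for j in range(i + 1, num_levels):
            branches[f"C{i}_{j}"] = {
                "source": f"C{i}",
                "dilation": 2 ** (j - i),
                "target_size": sizes[f"C{j}"],
            }
    return sizes, branches


sizes, branches = dilation_schedule(1024, 1024)
for name, spec in branches.items():
    print(name, spec)
```

For a 1024 × 1024 input this yields ten branches, from C0_1 (dilation rate 2, resized to 512 × 512) to C3_4 (dilation rate 2, resized to 64 × 64), matching the values given in this embodiment.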

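In this embodiment, fusing the features of the same size at each scale of the decoding part includes:

fusing the {C4, C3_4, C2_4, C1_4, C0_4} features at the H/16 × W/16 scale and obtaining a feature P3 through convolution and 2× upsampling;

fusing the {P3, C3, C2_3, C1_3, C0_3} features at the H/8 × W/8 scale and obtaining a feature P2 through convolution and 2× upsampling;

fusing the {P2, C2, C1_2, C0_2} features at the H/4 × W/4 scale and obtaining a feature P1 through convolution and 2× upsampling;

fusing the {P1, C1, C0_1} features at the H/2 × W/2 scale and obtaining a feature P0 through convolution and 2× upsampling;

and fusing the {P0, C0} features at the H × W scale and obtaining a feature P through convolution, wherein the feature P is followed by the multi-Head output structure.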

In the embodiment, { C4, C3_4, C2_4, C1_4, C0_4} features are fused at a 64 × 64 scale, and a feature P3 is obtained through convolution and 2 times of upsampling;

fusing { P3, C3, C2_3, C1_3, C0_3} features at a 128 × 128 scale and obtaining feature P2 through convolution and 2-fold upsampling;

fusing { P2, C2, C1_2, C0_2} features at 256 × 256 scale and obtaining feature P1 through convolution and 2 times up-sampling;

fusing the { P1, C1 and C0_1} features under the scale of 512 x 512 and obtaining a feature P0 through convolution and 2 times of upsampling;

and fusing the {P0, C0} features at the 1024 × 1024 scale and obtaining a feature P through convolution, wherein the feature P is followed by the multi-Head output structure.
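The decoder steps above can be summarized as a plan that lists, from the coarsest to the finest scale, which features are fused and which feature results. The sketch below only reproduces this bookkeeping; the function name `fusion_plan` and the dictionary keys are hypothetical, and the fusion operator itself (for example concatenation or addition) is not specified in the text and is therefore left out.

```python
def fusion_plan(num_levels=5):
    """List the features fused at each decoder scale, coarsest scale first."""
    plan = []
    for k in range(num_levels - 1, -1, -1):
        inputs = []
        if k < num_levels - 1:
            inputs.append(f"P{k}")  # decoder feature upsampled from the coarser scale
        inputs.append(f"C{k}")      # encoder feature at this scale
        # dilated branches that were resized to this scale, e.g. C3_4 ... C0_4
        inputs.extend(f"C{i}_{k}" for i in range(k - 1, -1, -1))
        plan.append({
            "scale_level": k,                       # size is (H / 2**k, W / 2**k)
            "inputs": inputs,
            "output": f"P{k - 1}" if k > 0 else "P",
            "upsample_x2": k > 0,                   # the final H x W step only convolves
        })
    return plan


for step in fusion_plan():
    print(step)
```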

In this embodiment, exploiting the property that dilated convolution and down-sampling can produce receptive fields of the same size, the encoding part of the multi-feature fusion semantic segmentation model obtains features of a plurality of different receptive fields through convolutions with different dilation rates and retains these features so as to accumulate more contextual semantic information. Features of the same size are fused at each scale of the decoding part, absorbing feature information from different branches to strengthen the model's detection capability and its ability to mine contextual semantic information, which improves the detection effect.

In this embodiment, the multi-feature fusion semantic segmentation model further includes a multi-Head output structure, which includes two branches: one branch outputs the defect segmentation map, and the other outputs the region weight map.

In this embodiment, the output defect segmentation map represents defect categories by gray values: the gray value corresponding to the background is a first preset value (e.g., 0), and the gray value of each successive defect category is increased by a second preset value (e.g., 1). In the output region weight map, within a single target region, the weight at the centroid of the region is set to a third preset value (e.g., 1), the weights of the other points in the region decay with their distance from the centroid at a rate given by a fourth preset value (e.g., 0.01), and the weight does not fall below a fifth preset value (e.g., 0.1). Note that where an area contains no defect at all, the weight is 0.
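As a rough illustration of the region weight map, the Python sketch below builds the weights for a single region with the example values from the text (peak 1, rate 0.01, floor 0.1). The text does not state the exact decay law or distance metric, so the linear decay per pixel of Euclidean distance used here is an assumption, as is the function name `region_weight_map`.

```python
import math


def region_weight_map(mask, peak=1.0, rate=0.01, floor=0.1):
    """Weight map for one region given as a 2D list of 0/1 values.

    Assumed decay law (not stated in the text): the weight falls linearly by
    `rate` per pixel of Euclidean distance from the region centroid and is
    clipped at `floor`. Pixels outside the region keep weight 0.
    """
    points = [(r, c) for r, row in enumerate(mask) for c, v in enumerate(row) if v]
    weights = [[0.0] * len(row) for row in mask]
    if not points:
        return weights  # no defect at all: every weight stays 0
    centroid_r = sum(r for r, _ in points) / len(points)
    centroid_c = sum(c for _, c in points) / len(points)
    for r, c in points:
        distance = math.hypot(r - centroid_r, c - centroid_c)
        weights[r][c] = max(floor, peak - rate * distance)
    return weights


demo = region_weight_map([[0, 0, 0], [1, 1, 1], [0, 0, 0]])
print(demo)
```

Because the weight peaks near the centroid of each region, thresholding this map can separate defects that touch each other in the segmentation map, which is how the second training stage uses it.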

S102, based on a semi-supervised learning optimization strategy, optimizing and training the constructed multi-feature fusion semantic segmentation model by using three types of data, namely a labeled defect sample set, a non-labeled defect sample set and a steel plate background sample set, so that the trained multi-feature fusion semantic segmentation model can detect the surface defects of the medium and thick plates.

In this embodiment, the samples used in the semi-supervised learning optimization process include the labeled defect sample set D^l, the unlabeled defect sample set D^u and the steel plate background sample set D^b. The optimization process trains the constructed multi-feature fusion semantic segmentation model in a multi-stage training mode. Table 1 shows the number of samples in each sample set. The defects fall into 8 categories: work roll marks, pits, warping, longitudinal cracks, longitudinal scratches, transverse cracks, star-shaped cracks and foreign matter press-in; for example, these categories are assigned the gray values 1, 2, 3, 4, 5, 6, 7 and 8, respectively.

TABLE 1 Number of samples in each sample set

Data set	Labeled defect sample set D^l	Unlabeled defect sample set D^u	Steel plate background sample set D^b
Number of samples	1500	5000	500

In this embodiment, a labeled defect sample includes an original steel plate image and the corresponding pixel-level semantic category labels, where the original steel plate image is 1024 × 1024 pixels in size and at least one region of the image contains a defect; an unlabeled defect sample contains only an original steel plate image in which at least one region contains a defect; a steel plate background sample is also an image of 1024 × 1024 pixels, but it contains no defects at all.

In this embodiment, the performing optimization training on the constructed multi-feature fusion semantic segmentation model by using the multi-stage training mode may specifically include the following steps:

In the first training stage (as shown in FIG. 3), the multi-feature fusion semantic segmentation model is trained on the set D^l. The model input is an original steel plate image from D^l, and the model output is a defect segmentation map and a region weight map. A cross entropy loss and an L1 loss are computed between the model output and the sample labels to optimize the model parameters (specifically, the parameters of each convolution kernel in the model), and the best model of this stage is retained as the teacher model F(T), which is used to generate pseudo labels for the unlabeled defect samples;
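A minimal sketch of a first-stage objective is given below. The text names a cross entropy loss and an L1 loss but does not say which output each applies to or how they are weighted; the sketch assumes cross entropy on the defect segmentation output, L1 on the region weight output, and a hypothetical balancing factor `l1_weight`.

```python
import math


def cross_entropy(probs, labels):
    """Mean pixel-wise cross entropy; probs[i] holds class probabilities of pixel i."""
    return -sum(math.log(p[y]) for p, y in zip(probs, labels)) / len(labels)


def l1_loss(pred, target):
    """Mean absolute error between predicted and target region weights."""
    return sum(abs(a - b) for a, b in zip(pred, target)) / len(target)


def stage1_loss(seg_probs, seg_labels, weight_pred, weight_target, l1_weight=1.0):
    # Assumption: the two terms are simply added, with l1_weight as a free knob.
    return cross_entropy(seg_probs, seg_labels) + l1_weight * l1_loss(weight_pred, weight_target)


loss = stage1_loss(
    seg_probs=[[0.5, 0.5], [0.25, 0.75]],  # two pixels, two classes
    seg_labels=[0, 1],
    weight_pred=[1.0, 0.5],
    weight_target=[1.0, 0.0],
)
print(loss)
```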

In the second training stage (as shown in FIG. 4), the set D^u has no label values and cannot participate in training directly; the teacher model F(T) obtained in the first stage is therefore used to generate pseudo labels for the unlabeled defect samples in D^u, specifically as follows:

the unlabeled defect samples in D^u are input into the teacher model F(T). The defect segmentation map produced by F(T) is not used directly as the pseudo label. Instead, the regions with a gray value greater than 0 (namely the defect regions) in the output defect segmentation map are discretized according to the region weight map output by F(T) to obtain independent defect regions; the original image at the position of each defect region is cropped and input into a binary classification model with higher accuracy for a secondary judgment; the label values of the regions classified as background in the secondary judgment and of the regions segmented as background by the teacher model F(T) are all set to a sixth preset value (for example, -1), and the label values of the other regions are kept unchanged, which gives the pseudo label of the image. The set D^ul formed by the unlabeled defect samples and the generated pseudo labels is then used to train the multi-feature fusion semantic segmentation model, and the second training stage yields the model F(S);
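The pseudo-label rule above can be sketched as follows. The inputs are assumed to be already available: the teacher's segmentation map, a map of independent region ids, and the binary classifier's verdict per region. The names `make_pseudo_label`, `region_ids` and `is_defect` are hypothetical and serve only this illustration.

```python
SIXTH_PRESET = -1  # example value given in the text


def make_pseudo_label(seg_map, region_ids, is_defect):
    """Build a pseudo label from the teacher output and the secondary judgment.

    seg_map:    2D list of teacher class ids (0 = background).
    region_ids: 2D list with the id of the independent defect region at each
                pixel (0 where the teacher predicted background).
    is_defect:  dict mapping region id -> True if the binary classifier
                confirmed the cropped region as a defect.
    """
    pseudo = []
    for seg_row, id_row in zip(seg_map, region_ids):
        row = []
        for cls, rid in zip(seg_row, id_row):
            if cls == 0 or not is_defect.get(rid, False):
                row.append(SIXTH_PRESET)  # teacher background or rejected region
            else:
                row.append(cls)           # confirmed defect keeps its class id
        pseudo.append(row)
    return pseudo


label = make_pseudo_label(
    seg_map=[[0, 3, 3], [0, 0, 5]],
    region_ids=[[0, 1, 1], [0, 0, 2]],
    is_defect={1: True, 2: False},
)
print(label)
```

In this example region 1 is confirmed and keeps class 3, while region 2 is rejected and, like the teacher's background pixels, receives the sixth preset value.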

In this embodiment, discretizing the regions with a gray value greater than 0 (namely the defect regions) in the output defect segmentation map according to the region weight map output by the teacher model F(T) to obtain independent defect regions may specifically include the following steps:

in the defect segmentation map, the number of discontinuous regions with a gray value greater than 0 (namely defect regions) is counted through contour analysis and denoted n1;

the data range of the region weight map is 0 to 1; the map is segmented with a weight threshold of 0.5, and the number of discontinuous regions with a value greater than 0 after thresholding is denoted n2;

when n1 is equal to n2, the image within the minimum circumscribed rectangle of each single region is cropped from the original image directly according to the position information of the n1 regions obtained from the defect segmentation map;

when n1 is not equal to n2, the center points of the n2 regions in the thresholded region weight map are first calculated and used as clustering center points; the n1 regions in the defect segmentation map are then traversed one by one, and if the extent of a region contains more than 2 clustering center points, the region is re-clustered and split into several sub-regions using those clustering center points, otherwise the region is kept unchanged; finally, the image within the minimum circumscribed rectangle of each region is cropped from the original image according to the position information of the regions in the split defect segmentation map.
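The discretization steps above can be sketched in pure Python as follows. Several details are assumptions rather than statements of the text: regions are found with 4-connectivity instead of a contour library, rectangles are axis-aligned bounding boxes given as (top, left, bottom, right), and a region is split by assigning each pixel to its nearest clustering center. The translated text reads "more than 2" center points; the hypothetical `min_centers` parameter defaults to 2 so that a region holding two centers is also split, and it can be raised to 3 for the literal reading.

```python
def components(mask):
    """Return the 4-connected regions of truthy pixels as lists of (row, col)."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    regions = []
    for r in range(h):
        for c in range(w):
            if mask[r][c] and not seen[r][c]:
                stack, pixels = [(r, c)], []
                seen[r][c] = True
                while stack:
                    y, x = stack.pop()
                    pixels.append((y, x))
                    for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                        if 0 <= ny < h and 0 <= nx < w and mask[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            stack.append((ny, nx))
                regions.append(pixels)
    return regions


def bbox(pixels):
    ys = [y for y, _ in pixels]
    xs = [x for _, x in pixels]
    return (min(ys), min(xs), max(ys), max(xs))


def centroid(pixels):
    n = len(pixels)
    return (sum(y for y, _ in pixels) / n, sum(x for _, x in pixels) / n)


def split_defect_regions(seg_map, weight_map, threshold=0.5, min_centers=2):
    """Return one bounding box per independent defect region."""
    seg_regions = components([[v > 0 for v in row] for row in seg_map])            # n1 regions
    core_regions = components([[v > threshold for v in row] for row in weight_map])  # n2 regions
    if len(seg_regions) == len(core_regions):
        return [bbox(p) for p in seg_regions]
    centers = [centroid(p) for p in core_regions]  # clustering center points
    boxes = []
    for pixels in seg_regions:
        members = set(pixels)
        inside = [c for c in centers if (round(c[0]), round(c[1])) in members]
        if len(inside) >= min_centers:
            groups = {i: [] for i in range(len(inside))}
            for y, x in pixels:  # assign each pixel to its nearest center
                nearest = min(
                    range(len(inside)),
                    key=lambda i: (y - inside[i][0]) ** 2 + (x - inside[i][1]) ** 2,
                )
                groups[nearest].append((y, x))
            boxes.extend(bbox(g) for g in groups.values() if g)
        else:
            boxes.append(bbox(pixels))
    return boxes


# One merged region in the segmentation map, two peaks in the weight map.
print(split_defect_regions([[1, 1, 1, 1, 1, 1, 1]], [[0.9, 1.0, 0.9, 0.2, 0.9, 1.0, 0.9]]))
```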

In the third training stage (as shown in FIG. 5), the sets D^l and D^ul are merged into an augmented data set D^e. The model F(S) serves as the first branch and is copied to obtain the model F'(S) as the second branch, the model parameters of the second branch being copied entirely from the first branch. The input of the first branch is a sample from D^e, and the input of the second branch is obtained by fusing the foreground defect regions of the first-branch input sample with a background sample randomly drawn from the set D^b, with the sample labels kept unchanged. The model F(S) of the first branch is optimized, on the one hand, by the cross entropy loss between its output and the corresponding labels and, on the other hand, by the consistency loss between the outputs of the two branches. This yields the third-stage training model (namely the model F(S) after the third training stage), which is used to detect surface defects of medium plates.
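The two ingredients of the third stage, building the second-branch input and combining the losses, can be sketched as below. The text does not define how foreground and background are fused or which consistency loss is used; the sketch assumes a direct pixel paste of labeled defect pixels onto the background and a mean squared difference between branch outputs. All function names and the `consistency_weight` factor are hypothetical.

```python
def fuse_with_background(image, label, background):
    """Paste the foreground (label > 0) pixels of `image` onto `background`.

    Pixels labeled 0 (background) or -1 (pseudo-label preset) are taken from
    the background sample. The label itself is left unchanged by the caller.
    """
    return [
        [img_px if lab > 0 else bg_px for img_px, lab, bg_px in zip(img_row, lab_row, bg_row)]
        for img_row, lab_row, bg_row in zip(image, label, background)
    ]


def consistency_loss(out_first, out_second):
    """Assumed form: mean squared difference between flattened branch outputs."""
    return sum((a - b) ** 2 for a, b in zip(out_first, out_second)) / len(out_first)


def stage3_loss(ce_first_branch, out_first, out_second, consistency_weight=1.0):
    # Cross entropy of the first branch plus the weighted consistency term.
    return ce_first_branch + consistency_weight * consistency_loss(out_first, out_second)


fused = fuse_with_background(
    image=[[10, 20], [30, 40]],
    label=[[0, 2], [-1, 1]],
    background=[[1, 2], [3, 4]],
)
print(fused)
print(stage3_loss(0.25, [1.0, 0.0], [0.0, 0.0]))
```

Swapping the background while keeping the defect pixels and labels fixed is what lets the consistency term penalize predictions that depend on the background rather than on the defect itself.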

In this embodiment, the binary classification model with higher accuracy is obtained by transfer learning on defect patches and background patches. A defect patch is an image collected at the production site that contains only a defect; it is generally 128 × 128 pixels in size and is easy to collect, and the cropped original defect images can be used as part of the defect patches. A background patch is a sub-image cropped at random from all sample sets that contains only background and no defect region. There are 10000 defect patches and 35000 background patches in total. The transferred model can be one of Inception-V3, EfficientNet-B3 and Resnet101; its output part is modified so that the classifier has 2 outputs, where an output of 0 represents background and an output of 1 represents a defect. Part of the low-level model parameters are then frozen and the high-level feature parameters are optimized, so that a binary classification model with higher accuracy is obtained quickly.

In the embodiment, in order to deal with the problem that labeling cost of semantic segmentation samples is high, a semi-supervised learning optimization strategy is provided, model training is performed by using three types of data, namely a labeled defect sample set, an unlabelled defect sample set and a steel plate background sample set, the training process is divided into three stages, continuous progressive optimization is performed, useful information hidden in the labeled defect sample set, the unlabelled defect sample set and the steel plate background sample set is fully mined, accuracy and anti-interference capability of model detection and identification of defects are further improved, and the semi-supervised learning optimization strategy has a very strong practical application value.

The multi-feature fusion semantic segmentation model constructed in the embodiment is applied to the detection process of the defects of the medium plate, the detection accuracy of the defects is greatly improved compared with that of some common target detection methods, and meanwhile, the multi-feature fusion semantic segmentation model plays a good role in the aspects of false defect inhibition, false alarm reduction, omission ratio reduction and the like.

To address the complex surface background of the medium plate and the heavy interference from color-difference defects, the method for detecting surface defects of a medium plate based on multi-feature fusion semantic segmentation establishes a multi-feature fusion semantic segmentation model to detect the surface defects of the medium plate. Exploiting the property that dilated convolution and the downsampling process yield receptive fields of the same size, the encoding part of the model obtains features of a plurality of different receptive fields through convolutions with different dilation rates and retains these features so as to accumulate more contextual semantic information. Features of the same size are fused at each scale of the decoding part, absorbing feature information from different branches to enhance the detection capability of the model and its ability to mine contextual semantic information, thereby improving the detection effect. Meanwhile, to address the high labeling cost of semantic segmentation samples, a semi-supervised learning optimization strategy is provided, in which model training uses three types of data, namely a labeled defect sample set, an unlabeled defect sample set and a steel plate background sample set, further improving the accuracy and anti-interference capability of the model in detecting and identifying defects.
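The relationship between dilation rate and downsampling referred to above can be checked with a short calculation. The sketch below assumes a 3 × 3 kernel (the kernel size is not stated here) and treats downsampling as plain subsampling by a given stride, which is a simplification; the function names are illustrative only.

```python
def effective_kernel(kernel_size, dilation):
    """Span in pixels covered along one axis by a dilated convolution kernel."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def span_after_downsampling(kernel_size, stride):
    """Span in input pixels covered by an undilated kernel applied to a
    feature map that was subsampled by `stride` (simplified model)."""
    return (kernel_size - 1) * stride + 1


# Dilation rates 2, 4 and 8 compared with 2x, 4x and 8x downsampling.
for rate in (2, 4, 8):
    print(rate, effective_kernel(3, rate), span_after_downsampling(3, rate))
```

Under these assumptions, a 3 × 3 convolution with dilation rate 2, 4 or 8 spans 5, 9 or 17 pixels, the same span that an undilated 3 × 3 kernel covers on a feature map downsampled by a factor of 2, 4 or 8.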

Fig. 6 is a schematic structural diagram of an electronic device 600 according to an embodiment of the present invention. The electronic device 600 may vary considerably in configuration or performance, and may include one or more processors (central processing units, CPUs) 601 and one or more memories 602, where at least one instruction is stored in the memory 602 and is loaded and executed by the processor 601 to implement the above method for detecting surface defects of a medium plate based on multi-feature fusion semantic segmentation.

In an exemplary embodiment, a computer-readable storage medium is also provided, for example a memory including instructions that can be executed by a processor in a terminal to perform the above method for detecting surface defects of a medium plate based on multi-feature fusion semantic segmentation. For example, the computer-readable storage medium may be a read-only memory (ROM), a random access memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, and the like.

It will be understood by those skilled in the art that all or part of the steps for implementing the above embodiments may be implemented by hardware, or may be implemented by a program instructing relevant hardware, where the program may be stored in a computer-readable storage medium, and the storage medium may be a read-only memory, a magnetic disk or an optical disk.

The above description is only for the purpose of illustrating the preferred embodiments of the present invention and should not be taken as limiting the scope of the present invention, which is intended to cover any modifications, equivalents, improvements, etc. within the spirit and scope of the present invention.

Claims

1. A method for detecting surface defects of a medium plate based on multi-feature fusion semantic segmentation is characterized by comprising the following steps:

constructing a multi-feature fusion semantic segmentation model, wherein the multi-feature fusion semantic segmentation model comprises an encoding part and a decoding part; the encoding part obtains features of a plurality of different receptive fields through convolutions with different dilation rates, and features of the same size are fused at each scale of the decoding part;

based on a semi-supervised learning optimization strategy, performing optimization training on the constructed multi-feature fusion semantic segmentation model using three types of data, namely a labeled defect sample set, an unlabeled defect sample set and a steel plate background sample set, so that the trained multi-feature fusion semantic segmentation model can be used to detect surface defects of the medium plate;

wherein the backbone structure of the encoding part comprises a feature extraction backbone Resnet18, Resnet50 or Xception39; the scale features output by the feature extraction backbone are {C0, C1, C2, C3, C4}; wherein, if the input picture size is H × W, the size of the C0 feature is H × W, the size of the C1 feature is H/2 × W/2, the size of the C2 feature is H/4 × W/4, the size of the C3 feature is H/8 × W/8, and the size of the C4 feature is H/16 × W/16;

the C1 feature is subjected to convolution with a dilation rate of 2 and resized to the size of C2 to obtain the C1_2 feature, the C1 feature is subjected to convolution with a dilation rate of 4 and resized to the size of C3 to obtain the C1_3 feature, and the C1 feature is subjected to convolution with a dilation rate of 8 and resized to the size of C4 to obtain the C1_4 feature;

the C2 feature is subjected to convolution with a dilation rate of 2 and resized to the size of C3 to obtain the C2_3 feature, and the C2 feature is subjected to convolution with a dilation rate of 4 and resized to the size of C4 to obtain the C2_4 feature;

the C3 feature is subjected to convolution with a dilation rate of 2 and resized to the size of C4 to obtain the C3_4 feature;

wherein fusing the features of the same size at each scale of the decoding part comprises:

at the H/16 × W/16 scale, the {C4, C3_4, C2_4, C1_4, C0_4} features are fused, and feature P3 is obtained through convolution and 2× upsampling;

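at the H/8 × W/8 scale, the {P3, C3, C2_3, C1_3, C0_3} features are fused, and feature P2 is obtained through convolution and 2× upsampling;

at the H/4 × W/4 scale, the {P2, C2, C1_2, C0_2} features are fused, and feature P1 is obtained through convolution and 2× upsampling;

at the H/2 × W/2 scale, the {P1, C1, C0_1} features are fused, and feature P0 is obtained through convolution and 2× upsampling;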

at the H × W scale, the {P0, C0} features are fused, and feature P is obtained through convolution, wherein feature P is connected to a multi-Head output structure;

wherein performing optimization training on the constructed multi-feature fusion semantic segmentation model based on the semi-supervised learning optimization strategy using three types of data, namely the labeled defect sample set, the unlabeled defect sample set and the steel plate background sample set, comprises:

based on the semi-supervised learning optimization strategy, performing optimization training on the constructed multi-feature fusion semantic segmentation model in a multi-stage training mode using three types of data, namely the labeled defect sample set D^l, the unlabeled defect sample set D^u and the steel plate background sample set D^b;

wherein performing optimization training on the constructed multi-feature fusion semantic segmentation model in a multi-stage training mode, based on the semi-supervised learning optimization strategy and using the labeled defect sample set D^l, the unlabeled defect sample set D^u and the steel plate background sample set D^b, comprises:

in the second stage of training, the unlabeled defect samples in set D^u are input into a teacher model F(T); the defect regions in the output defect segmentation map are discretized according to the region weight map output by the teacher model F(T) to obtain independent defect regions; the original defect image at the position corresponding to each defect region is cropped and input into a binary classification model for a secondary judgment; the label values of the defect regions classified as background in the secondary judgment and of the regions segmented as background by the teacher model F(T) are all set to a sixth preset value, while the label values of the other regions remain unchanged, so as to obtain a pseudo label of the image; the multi-feature fusion semantic segmentation model is trained using the set D^ul formed by the unlabeled defect samples and the generated pseudo labels, completing the second stage of training and obtaining a model F(S);

in the third stage of training, set D^l and set D^ul are combined to obtain an augmented data set D^e; the model F(S) serves as a first branch and is copied to obtain a model F'(S) as a second branch, the model parameters of the second branch being copied entirely from the model of the first branch; the input of the first branch is a sample from set D^e, and the input of the second branch is obtained by fusing the foreground defect regions of the sample from the first branch with a background sample randomly drawn from set D^b, with the label of the sample kept unchanged; the model F(S) of the first branch is optimized, on the one hand, through the cross entropy loss between the output of the first branch and the corresponding label and, on the other hand, through the consistency loss between the output results of the two branches, so as to obtain a third-stage trained model.
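The pseudo-label generation of the second stage recited above can be sketched as follows. This is an illustrative sketch only: 4-connected component labeling is substituted here for the discretization based on the region weight map, the callable `is_defect` stands in for the binary classification model and its cropped input, and the parameter `preset_value` stands in for the sixth preset value, whose actual value is not given in this section.

```python
from collections import deque


def connected_regions(seg, background_value=0):
    """Split non-background pixels into 4-connected regions, each a list of (row, col)."""
    h, w = len(seg), len(seg[0])
    seen = [[False] * w for _ in range(h)]
    regions = []
    for r in range(h):
        for c in range(w):
            if seg[r][c] == background_value or seen[r][c]:
                continue
            queue, region = deque([(r, c)]), []
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                region.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < h and 0 <= nx < w and not seen[ny][nx]
                            and seg[ny][nx] != background_value):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            regions.append(region)
    return regions


def make_pseudo_label(seg, is_defect, background_value=0, preset_value=0):
    """Build a pseudo label from a teacher segmentation map.

    Regions rejected by `is_defect` and pixels segmented as background are set
    to `preset_value`; all other label values are kept unchanged.
    """
    label = [[preset_value if v == background_value else v for v in row] for row in seg]
    for region in connected_regions(seg, background_value):
        if not is_defect(region):
            for y, x in region:
                label[y][x] = preset_value
    return label


# Hypothetical teacher output with two candidate regions (class 1 and class 2).
seg = [[1, 1, 0, 0],
       [0, 0, 0, 2],
       [0, 0, 0, 2]]
# Stand-in for the secondary judgment: reject any region touching column 0.
pseudo = make_pseudo_label(seg, is_defect=lambda reg: all(x > 0 for _, x in reg))
```

In this toy example the class-1 region is rejected by the stand-in classifier and reverts to the preset value, while the class-2 region keeps its label.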

2. The method for detecting surface defects of a medium plate based on multi-feature fusion semantic segmentation according to claim 1, wherein the multi-feature fusion semantic segmentation model further comprises a multi-Head output structure;

3. The method for detecting surface defects of a medium plate based on multi-feature fusion semantic segmentation according to claim 2, wherein the output defect segmentation map represents defect categories using gray value information, the gray value corresponding to the background is a first preset value, and the gray values corresponding to the defect categories increase successively by a second preset value;

4. The method for detecting surface defects of a medium plate based on multi-feature fusion semantic segmentation according to claim 3, wherein the defect categories comprise: work roll marks, pits, warping, longitudinal cracks, longitudinal scratches, transverse cracks, star-shaped cracks and pressed-in foreign matter.
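The gray value encoding of defect categories recited in claims 3 and 4 can be illustrated with a minimal sketch. The values `first_preset=0` and `second_preset=30` are assumptions chosen only so that all eight categories fit into an 8-bit gray range; the actual preset values are not given in this section, and the order of the categories simply follows the order in which they are listed.

```python
# Category names follow the list in claim 4; the order is assumed to be the listing order.
DEFECT_CATEGORIES = [
    "work roll mark", "pit", "warping", "longitudinal crack",
    "longitudinal scratch", "transverse crack", "star-shaped crack",
    "pressed-in foreign matter",
]


def gray_value_table(categories, first_preset=0, second_preset=30):
    """Assign the background the first preset value and add the second
    preset value once more for each successive defect category."""
    table = {"background": first_preset}
    for i, name in enumerate(categories, start=1):
        table[name] = first_preset + i * second_preset
    return table


def decode_gray(value, table):
    """Map a gray value of the segmentation map back to its category name."""
    inverse = {gray: name for name, gray in table.items()}
    return inverse[value]


table = gray_value_table(DEFECT_CATEGORIES)
```

With these assumed values the largest gray value is 240, so a single-channel 8-bit segmentation map can carry the background and all eight categories.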

5. The method for detecting surface defects of a medium plate based on multi-feature fusion semantic segmentation according to claim 1, wherein the binary classification model is obtained by transfer learning training using small defect images and small background images;