CN114419078A - Surface defect region segmentation method and device based on convolutional neural network - Google Patents

Surface defect region segmentation method and device based on convolutional neural network Download PDF

Info

Publication number
CN114419078A
Authority
CN
China
Prior art keywords
information
surface defect
defect region
image
neural network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202210335463.8A
Other languages
Chinese (zh)
Other versions
CN114419078B (en)
Inventor
弭宝瞳
周展
王涛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Jushi Intelligent Technology Co ltd
Original Assignee
Beijing Jushi Intelligent Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Jushi Intelligent Technology Co ltd filed Critical Beijing Jushi Intelligent Technology Co ltd
Priority to CN202210335463.8A
Publication of CN114419078A
Application granted
Publication of CN114419078B
Legal status: Active (Current)
Anticipated expiration

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/11Region-based segmentation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20016Hierarchical, coarse-to-fine, multiscale or multiresolution image processing; Pyramid transform
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • G06T2207/30108Industrial image inspection
    • G06T2207/30136Metal

Abstract

The application provides a surface defect region segmentation method and device based on a convolutional neural network. The method comprises the steps of extracting feature maps of different scales corresponding to an input image, acquiring image structure information and scene information of different levels of the input image according to the feature maps of different scales, and outputting a surface defect region segmentation result according to the image structure information and the scene information of different levels, so that hierarchical dynamic aggregation of context information is achieved without introducing additional supervision, and the accuracy of workpiece surface defect region segmentation is improved.

Description

Surface defect region segmentation method and device based on convolutional neural network
Technical Field
The application belongs to the technical field of defect detection, and particularly relates to a surface defect region segmentation method and device based on a convolutional neural network.
Background
In industrial production, the surfaces of related products need to be inspected for defects. For example, during the production and processing of steel, defects such as holes, scratches, inclusions, scrapes, and roll marks easily arise under the influence of many factors such as the raw materials, the rolling equipment, and the workers' operating technique. These defects not only affect the appearance of the steel but also impair properties such as corrosion resistance, wear resistance, and fatigue strength, seriously reducing its quality. With the rapid development of deep learning technology, workpiece surface defect region segmentation methods based on deep learning have made great progress in recent years. The existing deep-learning-based methods segment defect regions using multi-scale pooling and convolution operations; however, because they focus only on aggregating the global context information of the defect region in the image, the scene information contained in that context is incomplete, and the segmentation of the workpiece surface defect region is often inaccurate when the image is complex.
Disclosure of Invention
To overcome, at least to some extent, the problem that existing deep-learning-based workpiece surface defect region segmentation methods focus only on aggregating the global context information of the defect region in an image, so that the scene information contained in that context is incomplete and the segmentation of the workpiece surface defect region is inaccurate when the image is complex, the present application provides a surface defect region segmentation method and device based on a convolutional neural network.
In a first aspect, the present application provides a surface defect region segmentation method based on a convolutional neural network, including:
extracting feature maps of different scales corresponding to the input image;
acquiring image structure information and scene information of different levels of the input image according to the feature maps of different scales;
and outputting a surface defect region segmentation result according to the image structure information and the scene information of different layers.
Further, the extracting feature maps of different scales corresponding to the input image includes:
inputting the input image into a pre-trained ResNet network to extract bottom layer feature maps with different scales;
inputting the bottom-layer feature maps of different scales into a channel feature pyramid, wherein the channel feature pyramid processes N/4 of the channels of the bottom-layer feature maps of different scales with 3×3 convolutions, N/4 of the channels with 5×5 convolutions, and N/2 of the channels with 7×7 convolutions;
and performing skip-connection concatenation on the feature maps processed by the channel feature pyramid to obtain a multi-scale feature map.
Further, the acquiring image structure information and scene information of different levels of the input image according to the feature maps of different scales includes:
inputting the feature maps of different scales into a dynamic context aggregation module to obtain mask maps of different structural layers;
performing a matrix transposition operation on the mask map;
multiplying the feature map with the transposed mask map to obtain a context information vector;
remapping the context information vector onto the mask map to obtain refined image interpretation information;
and obtaining feature vectors of the image structure information and the scene information according to the refined image interpretation information.
Further, the inputting the feature maps of different scales into the dynamic context aggregation module to obtain mask maps of different structural layers includes:
performing a 1×1 convolution operation on the feature maps of different scales to obtain an intermediate feature map;
inputting the intermediate feature map into a dynamic context aggregation module;
if the number of layers of the dynamic context aggregation module is greater than 1, concatenating the output feature map of the previous-layer dynamic context aggregation module with the output features of the current layer, and processing the concatenated feature map with a 3×3 convolution, a 1×1 convolution, and a Softmax function to obtain mask maps of different structural layers of the dynamic context aggregation module.
Further, the inputting the feature maps of different scales into the dynamic context aggregation module to obtain mask maps of different structural layers further includes:
if the number of layers of the dynamic context aggregation module is 1, processing the intermediate feature map with a 3×3 convolution, a 1×1 convolution, and a Softmax function to obtain a mask map of the dynamic context aggregation module.
Further, the context information vector includes context information corresponding to a plurality of categories, and the remapping the context information vector to the mask map to obtain refined image interpretation information includes:
and calculating the average value of the context information corresponding to a plurality of categories in the context information vector to obtain refined image interpretation information.
Further, the outputting a surface defect region segmentation result according to the image structure information and the scene information of the different layers includes:
recursively acquiring feature maps corresponding to the plurality of pieces of refined image interpretation information;
and performing up-sampling operation on the feature maps corresponding to the plurality of refined image interpretation information to obtain a surface defect area segmentation result.
In a second aspect, the present application provides a surface defect region segmentation apparatus based on a convolutional neural network, including:
the extraction module is used for extracting feature maps with different scales corresponding to the input image;
the acquisition module is used for acquiring image structure information and scene information of different levels of the input image according to the feature maps with different scales;
and the output module is used for outputting the segmentation result of the surface defect region according to the image structure information and the scene information of different layers.
The technical scheme provided by the embodiment of the application can have the following beneficial effects:
according to the surface defect region segmentation method and device based on the convolutional neural network, the feature maps of different scales corresponding to the input image are extracted, the image structure information and the scene information of different levels of the input image are obtained according to the feature maps of different scales, the segmentation result of the surface defect region is output according to the image structure information and the scene information of different levels, the hierarchical dynamic aggregation of context information is achieved on the premise that no additional supervision is introduced, and the accuracy of workpiece surface defect region segmentation is improved by obtaining the image structure information and the scene information of different levels of the input image.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the application.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present application and together with the description, serve to explain the principles of the application.
Fig. 1 is a flowchart of a surface defect region segmentation method based on a convolutional neural network according to an embodiment of the present application.
Fig. 2 is a flowchart of a surface defect region segmentation method based on a convolutional neural network according to another embodiment of the present application.
Fig. 3 is a flowchart of another surface defect region segmentation method based on a convolutional neural network according to an embodiment of the present application.
Fig. 4 is a flowchart of another surface defect region segmentation method based on a convolutional neural network according to an embodiment of the present application.
Fig. 5 is a flowchart of another surface defect region segmentation method based on a convolutional neural network according to an embodiment of the present application.
Fig. 6 is a functional block diagram of a surface defect area segmentation apparatus based on a convolutional neural network according to an embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the technical solutions of the present application will be described in detail below. It is to be understood that the embodiments described are only a few embodiments of the present application and not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the examples given herein without making any creative effort, shall fall within the protection scope of the present application.
Fig. 1 is a flowchart of a surface defect region segmentation method based on a convolutional neural network according to an embodiment of the present application, and as shown in fig. 1, the surface defect region segmentation method based on a convolutional neural network includes:
s11: extracting feature maps of different scales corresponding to the input image;
s12: acquiring image structure information and scene information of different levels of an input image according to feature maps of different scales;
s13: and outputting a surface defect region segmentation result according to the image structure information and the scene information of different layers.
Traditional deep-learning-based workpiece surface defect region segmentation methods perform defect region segmentation based on multi-scale pooling and convolution operations. However, because such methods focus only on aggregating the global context information of the defect region in the image, the scene information contained in that context is incomplete, and the segmentation of the workpiece surface defect region is often inaccurate when the image is complex.
In this embodiment, feature maps of different scales corresponding to the input image are extracted, image structure information and scene information of different levels of the input image are obtained according to the feature maps of different scales, and the surface defect region segmentation result is output according to the image structure information and the scene information of different levels. Hierarchical dynamic aggregation of context information is thus achieved without introducing additional supervision, and the accuracy of workpiece surface defect region segmentation is improved by obtaining image structure information and scene information of different levels of the input image.
Fig. 2 is a flowchart of a surface defect region segmentation method based on a convolutional neural network according to another embodiment of the present application. As shown in Fig. 2, the surface defect region segmentation method based on a convolutional neural network includes:
S201: inputting an input image into a pre-trained ResNet network to extract bottom-layer feature maps of different scales;
For the input image, feature maps of different scales are extracted using a pre-trained ResNet network, and the bottom-layer feature maps are output.
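As a hedged illustration of this step, the sketch below extracts feature maps at several scales from a pre-trained ResNet; the choice of ResNet-50, the torchvision feature-extraction utility, and the tapped layers are assumptions, since the description only states that a pre-trained ResNet is used.

```python
import torch
from torchvision.models import resnet50
from torchvision.models.feature_extraction import create_feature_extractor

# Assumed backbone and tap points: the description only says "a pre-trained ResNet".
backbone = create_feature_extractor(
    resnet50(weights="IMAGENET1K_V1"),
    return_nodes={"layer1": "c2", "layer2": "c3", "layer3": "c4", "layer4": "c5"},
)

image = torch.randn(1, 3, 224, 224)         # a single input image
features = backbone(image)                  # bottom-layer feature maps at four scales
print({name: tuple(f.shape) for name, f in features.items()})
```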
S202: inputting the bottom-layer feature maps of different scales into a channel feature pyramid, wherein the channel feature pyramid processes N/4 of the channels of the bottom-layer feature maps with 3×3 convolutions, N/4 of the channels with 5×5 convolutions, and N/2 of the channels with 7×7 convolutions;
S203: performing skip-connection concatenation on the feature maps processed by the channel feature pyramid to obtain a multi-scale feature map.
The bottom-layer feature map X ∈ R^(C×H×W), where R denotes the real-valued feature space, C is the number of channels, H is the image height, and W is the image width, serves as the input of the channel feature pyramid; the flow of the channel feature pyramid is shown in Fig. 3. For the bottom-layer feature map, N/4 of the channels are processed with 3×3 convolutions, N/4 of the channels are processed with 5×5 convolutions, and the remaining N/2 channels are processed with 7×7 convolutions; the processed feature maps are then concatenated through a skip connection to obtain the final output feature map. Through the multi-channel feature pyramid processing, a multi-scale feature map is effectively extracted while the amount of computation is kept under control.
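The channel split and kernel sizes above fully determine the branches of the channel feature pyramid; the following PyTorch sketch shows one possible arrangement. The module name, the padding choices, and the realization of the skip connection as a concatenation with the input are assumptions rather than the patented implementation.

```python
import torch
import torch.nn as nn

class ChannelFeaturePyramid(nn.Module):
    """Sketch of the channel feature pyramid: N/4, N/4 and N/2 of the input
    channels are processed with 3x3, 5x5 and 7x7 convolutions respectively,
    then re-joined with the input via a skip-connection concatenation."""

    def __init__(self, channels: int):
        super().__init__()
        c4 = channels // 4
        c2 = channels - 2 * c4                  # the remaining N/2 channels
        # Padding keeps the spatial size so the three branches can be concatenated.
        self.branch3 = nn.Conv2d(c4, c4, kernel_size=3, padding=1)
        self.branch5 = nn.Conv2d(c4, c4, kernel_size=5, padding=2)
        self.branch7 = nn.Conv2d(c2, c2, kernel_size=7, padding=3)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        c4 = x.shape[1] // 4
        x3, x5, x7 = torch.split(x, [c4, c4, x.shape[1] - 2 * c4], dim=1)
        y = torch.cat([self.branch3(x3), self.branch5(x5), self.branch7(x7)], dim=1)
        return torch.cat([x, y], dim=1)         # skip-connection concatenation

feat = torch.randn(1, 64, 50, 50)               # a 64-channel bottom-layer feature map
out = ChannelFeaturePyramid(64)(feat)           # -> [1, 128, 50, 50]
```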
To capture the image structure information and the scene structure hierarchy, the present application designs the network to learn a multi-scale pyramid representation of the hierarchical context of the input image.
S204: inputting the feature maps of different scales into a dynamic context aggregation module to obtain mask maps of different structural layers;
In some embodiments, inputting the feature maps of different scales into the dynamic context aggregation module to obtain mask maps of different structural layers includes:
performing a 1×1 convolution operation on the feature maps of different scales to obtain an intermediate feature map;
inputting the intermediate feature map into the dynamic context aggregation module; the flow of the dynamic context aggregation module is shown in Fig. 4.
If the number of layers of the dynamic context aggregation module is greater than 1, the output feature map of the previous-layer dynamic context aggregation module is concatenated with the output features of the current layer, and the concatenated feature map is processed with a 3×3 convolution, a 1×1 convolution, and a Softmax function to obtain the mask maps of the different structural layers of the dynamic context aggregation module.
Or, if the number of layers of the dynamic context aggregation module is 1, the intermediate feature map is processed with a 3×3 convolution, a 1×1 convolution, and a Softmax function to obtain the mask map of the dynamic context aggregation module.
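As an illustration of the mask generation just described, the sketch below builds the mask map of a single dynamic context aggregation layer; the function name, the tensor shapes, and the example hyper-parameters (64 reduced channels, K = 4 classes) are assumptions.

```python
from typing import Optional

import torch
import torch.nn as nn
import torch.nn.functional as F

def layer_mask(f_reduced: torch.Tensor,
               prev_output: Optional[torch.Tensor],
               conv3: nn.Conv2d,
               conv1: nn.Conv2d) -> torch.Tensor:
    """For n = 1 only the channel-reduced feature map is used; for n > 1 the
    previous layer's output is concatenated with it first, then a 3x3
    convolution, a 1x1 convolution and Softmax produce the mask map."""
    x = f_reduced if prev_output is None else torch.cat([prev_output, f_reduced], dim=1)
    return F.softmax(conv1(conv3(x)), dim=1)    # mask map of this layer: [B, K, H, W]

# Example for the first layer (n = 1), assuming 64 reduced channels and K = 4 classes.
f_reduced = torch.randn(1, 64, 50, 50)
conv3 = nn.Conv2d(64, 64, kernel_size=3, padding=1)
conv1 = nn.Conv2d(64, 4, kernel_size=1)
mask = layer_mask(f_reduced, None, conv3, conv1)    # -> [1, 4, 50, 50]
```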
The present application uses a plurality of hierarchical dynamic context aggregation modules to extract scene information of different layers of the input image in a recursive manner; the processing flow of the hierarchical dynamic context aggregation module is shown in Fig. 4. For the feature map F output by the backbone network, a 1×1 convolution is first applied to reduce the number of channels, yielding a reduced feature map F'. The feature maps F and F' then serve as the input of the first-level dynamic context aggregation module. For the dynamic context aggregation module with n = 1, the feature map F' is processed successively with a 3×3 convolution, a 1×1 convolution, and the Softmax function. For a dynamic context aggregation module with n > 1, the output feature map of the (n-1)-th layer dynamic context aggregation module is first concatenated with the feature map F', and the concatenated result is then processed with a 3×3 convolution, a 1×1 convolution, and the Softmax function. This finally yields the mask map Mn of the layer, whose number of channels equals the number of categories K that need to be predicted.
S205: performing a matrix transposition operation on the mask map;
S206: multiplying the feature map with the transposed mask map to obtain a context information vector;
The obtained mask map Mn is transposed and then matrix-multiplied with the feature map F to obtain the context information vector Cn.
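A minimal sketch of this transpose-and-multiply step, assuming the mask map and the feature map are flattened over the spatial dimensions and processed in batched form, could look as follows; the flattening order is an assumption.

```python
import torch

def context_vectors(mask: torch.Tensor, feat: torch.Tensor) -> torch.Tensor:
    """Flattens the mask map and the feature map spatially and multiplies them,
    giving one context vector per predicted category."""
    b, k, h, w = mask.shape                 # mask map Mn: [B, K, H, W]
    c = feat.shape[1]                       # feature map F: [B, C, H, W]
    m = mask.reshape(b, k, h * w)           # [B, K, HW]
    f = feat.reshape(b, c, h * w)           # [B, C, HW]
    return torch.bmm(m, f.transpose(1, 2))  # context information vectors Cn: [B, K, C]
```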
s207: remapping the context information vector to a mask image to obtain refined image interpretation information;
The vector Cn contains the context information corresponding to each of the K categories. The refined image interpretation information of the n-th layer is obtained by averaging the context information corresponding to the categories in Cn, where i denotes the i-th object category and j denotes a pixel of the input image in the corresponding calculation. The n such vectors are then combined to obtain the output context information vector.
In some embodiments, the context information vector includes context information corresponding to a plurality of categories, and remapping the context information vector to the mask map results in refined image interpretation information, including:
and calculating the average value of the context information corresponding to a plurality of categories in the context information vector to obtain refined image interpretation information.
And obtaining the characteristic vectors of the image structure information and the scene information according to the refined image interpretation information.
The obtained context information vector is remapped onto the mask map Mn to obtain the output feature vector, and this feature vector is the output of the n-th layer dynamic context aggregation module.
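One way the remapping could be realized, assuming each pixel's output feature is a mask-weighted mixture of the per-category context vectors, is sketched below; the exact weighting is not spelled out in the text, so this is only an illustration.

```python
import torch

def remap_context(mask: torch.Tensor, ctx: torch.Tensor) -> torch.Tensor:
    """Projects the per-category context vectors back onto the spatial grid
    using the mask responses as weights."""
    # mask: [B, K, H, W], ctx: [B, K, C]  ->  output feature map: [B, C, H, W]
    return torch.einsum("bkhw,bkc->bchw", mask, ctx)
```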
S208: recursively acquiring the feature maps corresponding to the plurality of pieces of refined image interpretation information;
S209: performing an up-sampling operation on the feature maps corresponding to the plurality of pieces of refined image interpretation information to obtain the surface defect region segmentation result; the whole process is shown in Fig. 5.
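Putting steps S201-S209 together, a high-level sketch of the pipeline might read as follows; every module interface here (backbone, pyramid, aggregation layers, classifier) is an assumption made for illustration, not the patented implementation.

```python
import torch
import torch.nn.functional as F

def segment_surface_defects(image, backbone, pyramid, dca_layers, classifier):
    """Backbone feature extraction, channel feature pyramid, recursive dynamic
    context aggregation, classification and upsampling to the input size."""
    feats = pyramid(backbone(image))        # multi-scale feature map
    prev = None
    for layer in dca_layers:                # recursive, layered context aggregation
        prev = layer(feats, prev)           # each layer refines the previous output
    logits = classifier(prev)               # e.g. a 1x1 convolution to K classes (assumption)
    logits = F.interpolate(logits, size=image.shape[-2:],
                           mode="bilinear", align_corners=False)
    return logits.argmax(dim=1)             # per-pixel surface defect region labels
```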
Existing refined image interpretation methods based on multi-scale pooling and convolution operations focus on aggregating the global context information in an image while ignoring the structural hierarchy and the objects of the scene. Since the hierarchical structure of an image consists of the scene and a series of objects, exploiting this hierarchical structure helps in understanding the context information of images containing complex scenes and in refined image interpretation.
In the embodiment, the hierarchical dynamic aggregation of the context information is realized by using the plurality of dynamic context aggregation modules on the premise of not introducing additional supervision, and the accuracy of workpiece surface defect area segmentation is improved by performing recursive refined image interpretation on the input image.
An embodiment of the present invention provides a surface defect region segmentation apparatus based on a convolutional neural network, and as shown in a functional structure diagram of fig. 6, the surface defect region segmentation apparatus based on the convolutional neural network includes:
the extraction module 61 is used for extracting feature maps of different scales corresponding to the input image;
an obtaining module 62, configured to obtain image structure information and scene information of different levels of the input image according to feature maps of different scales;
and the output module 63 is configured to output the surface defect region segmentation result according to the image structure information and the scene information of different layers.
In this embodiment, the extraction module extracts feature maps of different scales corresponding to the input image; the acquisition module acquires image structure information and scene information of different levels of the input image according to the feature maps of different scales; and the output module outputs the surface defect region segmentation result according to the image structure information and the scene information of different levels. Hierarchical dynamic aggregation of context information can thus be achieved without introducing additional supervision, and the accuracy of workpiece surface defect region segmentation is improved by acquiring image structure information and scene information of different levels of the input image.
It is understood that the same or similar parts in the above embodiments may be mutually referred to, and the same or similar parts in other embodiments may be referred to for the content which is not described in detail in some embodiments.
It should be noted that, in the description of the present application, the terms "first", "second", etc. are used for descriptive purposes only and are not to be construed as indicating or implying relative importance. Further, in the description of the present application, the meaning of "a plurality" means at least two unless otherwise specified.
Any process or method descriptions in flow charts or otherwise described herein may be understood as representing modules, segments, or portions of code which include one or more executable instructions for implementing specific logical functions or steps of the process, and the scope of the preferred embodiments of the present application includes other implementations in which functions may be executed out of order from that shown or discussed, including substantially concurrently or in reverse order, depending on the functionality involved, as would be understood by those reasonably skilled in the art of the present application.
It should be understood that portions of the present application may be implemented in hardware, software, firmware, or a combination thereof. In the above embodiments, the various steps or methods may be implemented in software or firmware stored in memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, any one or combination of the following techniques, which are known in the art, may be used: a discrete logic circuit having a logic gate circuit for implementing a logic function on a data signal, an application specific integrated circuit having an appropriate combinational logic gate circuit, a Programmable Gate Array (PGA), a Field Programmable Gate Array (FPGA), or the like.
It will be understood by those skilled in the art that all or part of the steps carried by the method for implementing the above embodiments may be implemented by hardware related to instructions of a program, which may be stored in a computer readable storage medium, and when the program is executed, the program includes one or a combination of the steps of the method embodiments.
In addition, functional units in the embodiments of the present application may be integrated into one processing module, or each unit may exist alone physically, or two or more units are integrated into one module. The integrated module can be realized in a hardware mode, and can also be realized in a software functional component mode. The integrated module, if implemented in the form of a software functional component and sold or used as a stand-alone product, may also be stored in a computer readable storage medium.
The storage medium mentioned above may be a read-only memory, a magnetic or optical disk, etc.
In the description herein, reference to the description of the term "one embodiment," "some embodiments," "an example," "a specific example," or "some examples," etc., means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the application. In this specification, the schematic representations of the terms used above do not necessarily refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
Although embodiments of the present application have been shown and described above, it is understood that the above embodiments are exemplary and should not be construed as limiting the present application, and that variations, modifications, substitutions and alterations may be made to the above embodiments by those of ordinary skill in the art within the scope of the present application.
It should be noted that the present invention is not limited to the above-mentioned preferred embodiments, and those skilled in the art can obtain other products in various forms without departing from the spirit of the present invention, but any changes in shape or structure can be made within the scope of the present invention with the same or similar technical solutions as those of the present invention.

Claims (8)

1. A surface defect region segmentation method based on a convolutional neural network is characterized by comprising the following steps:
extracting feature maps of different scales corresponding to the input image;
acquiring image structure information and scene information of different levels of the input image according to the feature maps of different scales;
and outputting a surface defect region segmentation result according to the image structure information and the scene information of different layers.
2. The surface defect region segmentation method based on the convolutional neural network as claimed in claim 1, wherein the extracting feature maps of different scales corresponding to the input image comprises:
inputting the input image into a pre-trained ResNet network to extract bottom layer feature maps with different scales;
inputting the bottom-layer feature maps of different scales into a channel feature pyramid, wherein the channel feature pyramid processes N/4 of the channels of the bottom-layer feature maps of different scales with 3×3 convolutions, N/4 of the channels with 5×5 convolutions, and N/2 of the channels with 7×7 convolutions;
and performing skip-connection concatenation on the feature maps processed by the channel feature pyramid to obtain a multi-scale feature map.
3. The surface defect region segmentation method based on the convolutional neural network as claimed in claim 1, wherein the obtaining of image structure information and scene information of different levels of the input image according to the feature maps of different scales comprises:
inputting the feature maps of different scales into a dynamic context aggregation module to obtain mask maps of different structural layers;
performing a matrix transposition operation on the mask map;
multiplying the feature map with the transposed mask map to obtain a context information vector;
remapping the context information vector onto the mask map to obtain refined image interpretation information;
and obtaining feature vectors of the image structure information and the scene information according to the refined image interpretation information.
4. The surface defect region segmentation method based on the convolutional neural network as claimed in claim 3, wherein the inputting the feature maps of different scales into the dynamic context aggregation module to obtain the mask maps of different structural layers comprises:
performing a 1×1 convolution operation on the feature maps of different scales to obtain an intermediate feature map;
inputting the intermediate feature map into a dynamic context aggregation module;
if the number of layers of the dynamic context aggregation module is greater than 1, concatenating the output feature map of the previous-layer dynamic context aggregation module with the output features of the current layer, and processing the concatenated feature map with a 3×3 convolution, a 1×1 convolution, and a Softmax function to obtain mask maps of different structural layers of the dynamic context aggregation module.
5. The surface defect region segmentation method based on the convolutional neural network as claimed in claim 4, wherein the inputting the feature maps of different scales into the dynamic context aggregation module to obtain mask maps of different structural layers further comprises:
if the number of layers of the dynamic context aggregation module is 1, processing the intermediate feature map with a 3×3 convolution, a 1×1 convolution, and a Softmax function to obtain a mask map of the dynamic context aggregation module.
6. The surface defect region segmentation method based on convolutional neural network as claimed in claim 4, wherein the context information vector comprises context information corresponding to a plurality of classes, and the remapping the context information vector to the mask map yields refined image interpretation information, comprising:
and calculating the average value of the context information corresponding to a plurality of categories in the context information vector to obtain refined image interpretation information.
7. The surface defect region segmentation method based on the convolutional neural network as claimed in claim 6, wherein the outputting the surface defect region segmentation result according to the different levels of image structure information and scene information comprises:
recursively acquiring feature maps corresponding to the plurality of pieces of refined image interpretation information;
and performing up-sampling operation on the feature maps corresponding to the plurality of refined image interpretation information to obtain a surface defect area segmentation result.
8. A surface defect region segmentation apparatus based on a convolutional neural network, comprising:
the extraction module is used for extracting feature maps with different scales corresponding to the input image;
the acquisition module is used for acquiring image structure information and scene information of different levels of the input image according to the feature maps with different scales;
and the output module is used for outputting the segmentation result of the surface defect region according to the image structure information and the scene information of different layers.
CN202210335463.8A 2022-04-01 2022-04-01 Surface defect region segmentation method and device based on convolutional neural network Active CN114419078B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210335463.8A CN114419078B (en) 2022-04-01 2022-04-01 Surface defect region segmentation method and device based on convolutional neural network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210335463.8A CN114419078B (en) 2022-04-01 2022-04-01 Surface defect region segmentation method and device based on convolutional neural network

Publications (2)

Publication Number Publication Date
CN114419078A (en) 2022-04-29
CN114419078B (en) 2022-06-24

Family

ID=81263613

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210335463.8A Active CN114419078B (en) 2022-04-01 2022-04-01 Surface defect region segmentation method and device based on convolutional neural network

Country Status (1)

Country Link
CN (1) CN114419078B (en)


Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20210342579A1 (en) * 2018-09-05 2021-11-04 Daimler Ag Method for identifying a hand pose in a vehicle
CN109584246A (en) * 2018-11-16 2019-04-05 成都信息工程大学 Based on the pyramidal DCM cardiac muscle diagnosis and treatment irradiation image dividing method of Analysis On Multi-scale Features
CN113421267A (en) * 2021-05-07 2021-09-21 江苏大学 Point cloud semantic and instance joint segmentation method and system based on improved PointConv
CN113887613A (en) * 2021-09-29 2022-01-04 平安银行股份有限公司 Deep learning method, device and equipment based on attention mechanism and storage medium
CN113658180A (en) * 2021-10-20 2021-11-16 北京矩视智能科技有限公司 Surface defect region segmentation method and device based on spatial context guidance

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
张旺: "基于深度学习的多尺度边缘检测算法研究", 《中国优秀博硕士学位论文全文数据库(硕士)信息科技辑》, no. 1, 15 January 2022 (2022-01-15) *
陈裕如等: "基于自适应像素级注意力模型的场景深度估计", 《应用光学》, no. 03, 15 May 2020 (2020-05-15) *

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2020190485A (en) * 2019-05-22 2020-11-26 日本製鉄株式会社 Identification model generation device, identification model generation method and identification model generation program, and steel defect determination device, steel defect determination method and steel defect determination program

Also Published As

Publication number Publication date
CN114419078B (en) 2022-06-24

Similar Documents

Publication Publication Date Title
CN108647585B (en) Traffic identifier detection method based on multi-scale circulation attention network
CN111179229B (en) Industrial CT defect detection method based on deep learning
CN108830285B (en) Target detection method for reinforcement learning based on fast-RCNN
CN111126472A (en) Improved target detection method based on SSD
JP2024509411A (en) Defect detection method, device and system
CN111461212B (en) Compression method for point cloud target detection model
CN114627052A (en) Infrared image air leakage and liquid leakage detection method and system based on deep learning
CN111914843B (en) Character detection method, system, equipment and storage medium
CN113313706B (en) Power equipment defect image detection method based on detection reference point offset analysis
CN114565770B (en) Image segmentation method and system based on edge auxiliary calculation and mask attention
CN114972759A (en) Remote sensing image semantic segmentation method based on hierarchical contour cost function
CN115131797A (en) Scene text detection method based on feature enhancement pyramid network
CN114419078B (en) Surface defect region segmentation method and device based on convolutional neural network
CN111739037A (en) Semantic segmentation method for indoor scene RGB-D image
CN112070040A (en) Text line detection method for video subtitles
CN113963333B (en) Traffic sign board detection method based on improved YOLOF model
CN115147418A (en) Compression training method and device for defect detection model
CN114494823A (en) Commodity identification, detection and counting method and system in retail scene
CN114092467A (en) Scratch detection method and system based on lightweight convolutional neural network
CN112200766A (en) Industrial product surface defect detection method based on area-associated neural network
CN114419036B (en) Surface defect region segmentation method and device based on boundary information fusion
CN114998222A (en) Automobile differential shell surface detection method, electronic equipment and medium
CN114332473A (en) Object detection method, object detection device, computer equipment, storage medium and program product
CN114882298B (en) Optimization method and device for confrontation complementary learning model
CN116310613A (en) Industrial product surface defect detection model training method, detection method and device

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant