CN112150460A - Detection method, detection system, device, and medium

Info

Publication number: CN112150460A (application CN202011112325.0A)
Authority: CN (China)
Prior art keywords: picture, data, detection, neural network, feature
Legal status: Granted; currently active
Other languages: Chinese (zh)
Other versions: CN112150460B (en)
Inventor: 崔淼
Current Assignee: Shanghai Xiaoi Robot Technology Co Ltd
Original Assignee: Shanghai Xiaoi Robot Technology Co Ltd
Application filed by Shanghai Xiaoi Robot Technology Co Ltd
Publication of application: CN112150460A; publication of grant: CN112150460B

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/0002Inspection of images, e.g. flaw detection
    • G06T7/0004Industrial image inspection
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • G06T5/70
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • G06T2207/30108Industrial image inspection
    • G06T2207/30124Fabrics; Textile; Paper

Abstract

Embodiments of the invention provide a detection method, a detection system, a device, and a medium. The detection method comprises the following steps: obtaining a detection picture; inputting the detection picture into a first convolutional neural network for processing, wherein the first convolutional neural network first reduces and then raises the dimensionality of the detection picture's feature data; obtaining first feature data from the dimension-reduction-and-raising process; obtaining second feature data from the dimension-reduction process; fusing the first feature data and the second feature data to obtain spatial feature data; and judging defect information from the spatial feature data. Embodiments of the invention can improve detection precision.

Description

Detection method, detection system, device, and medium
Technical Field
Embodiments of the invention relate to the field of artificial intelligence and computer vision, and in particular to a detection method, a detection system, a device, and a medium.
Background
The textile industry has long held an important position in China's economy: China's cloth output in 2019 exceeded 100 billion meters, and output continues to grow. With the rapid development of artificial intelligence and big data, applying artificial intelligence and computer vision technology to the textile industry would undoubtedly bring enormous value. Cloth defect detection is an important link in production and quality management in the textile industry, yet it has so far been performed by manual inspection. Manual inspection is slow and labor-intensive, its judgments of cloth defects are inconsistent because of subjective factors, and it severely limits the degree of automation of the textile production flow. Manual inspection speed is generally 15-18 m/min, and at this speed a single inspector can only cover a cloth width of 0.8-1 m, so the cloth inspection and finishing links become bottlenecks in the whole production process. Manual inspection also depends too heavily on the experience of the inspectors, and over-detection, missed detection, and misclassification of defects occur frequently; defects with inconspicuous features are especially prone to being missed when inspected only by eye. Existing detection algorithms, meanwhile, suffer from low detection rates, high over-detection rates, and slow speed, and thus struggle to meet industrial requirements. The invention therefore provides a more efficient flaw detection method that combines a sliding window with a fully convolutional network for cloth flaw detection, addressing the missed detection, over-detection, and misclassification problems of current detection algorithms and the associated industrial pain points.
Referring to fig. 1 and 2, two schematic views of cloth with flaws are shown. In fig. 1, the surface of the cloth has a long patchwork defect 10. In fig. 2, the surface of the cloth has a less conspicuous stain defect 20.
In the prior art, an image of the cloth in a fabric processing procedure is acquired to obtain an input image, and the input image is then analyzed to detect flaws. However, existing flaw detection suffers from insufficient precision. Specifically, the rectangular frame 30 in fig. 2 is the detection result output by a conventional detection method; the stain defect 20 on the cloth does not lie within the rectangular frame 30, i.e., the method has failed to detect the stain defect 20 accurately.
Besides fabric surface defects, image recognition can also be applied to defect recognition for other products (such as metal, plastic molds, automobile components, and mechanical parts), as well as to scenes such as bill, license plate, and face recognition; in these other application scenarios, detection precision is likewise unsatisfactory.
Disclosure of Invention
The invention aims to provide a detection method, a detection system, a device, and a medium that improve detection precision.
The technical solution of the invention provides a detection method comprising the following steps: obtaining a detection picture; inputting the detection picture into a first convolutional neural network for processing, wherein the first convolutional neural network first reduces and then raises the dimensionality of the detection picture's feature data; obtaining first feature data from the dimension-reduction-and-raising process; obtaining second feature data from the dimension-reduction process; fusing the first feature data and the second feature data to obtain spatial feature data; and judging defect information from the spatial feature data.
Optionally, the detection method further comprises: inputting the detection picture into a second convolutional neural network for processing to obtain detail feature data; fusing the spatial feature data and the detail feature data to obtain fused feature data; and judging the defect information from the fused feature data.
Optionally, the step of fusing the spatial feature data and the detail feature data to obtain fused feature data includes: fusing the spatial feature data and the detail feature data based on a preset weight to obtain the fused feature data.
Optionally, before the detection picture is obtained, the detection method further includes a modeling step comprising: obtaining a sample picture; performing the first convolutional neural network processing on the sample picture to obtain sample spatial feature data; performing the second convolutional neural network processing on the sample picture to obtain sample detail feature data; fusing the sample spatial feature data and the sample detail feature data based on an initial weight to obtain sample picture data, completing one round of training; and continuously adjusting the initial weight over multiple rounds of training, taking the adjusted weight as the preset weight once the loss on the sample picture data meets a specified value.
Optionally, the step of obtaining a sample picture includes: obtaining an original sample picture and converting it into a mask picture; and performing grayscale processing on the mask picture to obtain a grayscale picture, the grayscale picture and the original sample picture together serving as the sample pictures.
Optionally, the step of judging the defect information on the detection picture includes: inputting the spatial feature data and/or the detail feature data into a contour recognition model to obtain a plurality of contour prediction data for the detection picture; superimposing the features of the plurality of contour prediction data to obtain predicted superimposed data; and judging the defect information based on the predicted superimposed data.
Optionally, the second convolutional neural network comprises a VGG network and the first convolutional neural network comprises a MobileNet V2 network; or the first convolutional neural network comprises a MobileNet V2 network and a feature image pyramid for processing the data output by the MobileNet V2 network; or the first convolutional neural network comprises a ResNet50 network, a feature image pyramid, and a fully convolutional network.
Optionally, the detection picture is a cloth picture, and the defect information is flaw information on the cloth.
Correspondingly, an embodiment of the present invention further provides a detection system, including:
the first picture acquisition unit, used to acquire a detection picture; the semantic unit, used to input the detection picture into a first convolutional neural network for processing to obtain spatial feature data; and a judging unit that judges defect information based on the spatial feature data; wherein inputting the detection picture into the first convolutional neural network for processing comprises: first reducing and then raising the dimensionality of the detection picture's feature data; obtaining first feature data from the dimension-reduction-and-raising process; obtaining second feature data from the dimension-reduction process; and fusing the first feature data and the second feature data to obtain the spatial feature data, the defect information being judged from the spatial feature data.
Correspondingly, an embodiment of the invention further provides a device comprising the detection system provided by the embodiment of the invention.
Optionally, the device further comprises: a sampling apparatus for photographing an object to be detected; the first picture acquisition unit is used to acquire an original picture from the sampling apparatus.
Optionally, the sampling apparatus is a camera, and the device further comprises: a transport platform for transporting the object to be detected; the camera is arranged on the transport platform and photographs the object to be detected; the detection system judges the type and position of defects on the object to be detected from the picture of the object obtained by the camera.
Correspondingly, an embodiment of the present invention further provides a medium having computer instructions stored thereon, wherein the computer instructions, when executed, perform the steps of the detection method according to the embodiments of the invention.
Compared with the prior art, the technical scheme of the invention has the following advantages:
in the technical scheme of the invention, the detection picture is subjected to image recognition by a Convolutional Neural Network (CNN) deep learning method to obtain the defect information on the detection picture.
Drawings
FIG. 1 is a schematic view of a cloth with a flaw;
FIG. 2 is a schematic view of another cloth with a flaw;
FIG. 3 is a schematic flow chart of an embodiment of the detection method of the present invention;
FIG. 4 is a diagram illustrating the detection picture 101 obtained in step S1 of FIG. 3;
FIG. 5 is a schematic diagram of a residual structure;
FIG. 6 is a schematic diagram of a first convolutional neural network of step S2 in FIG. 3;
FIG. 7 is a schematic diagram of another first convolutional neural network of step S2 in FIG. 3;
FIG. 8 is a schematic diagram of histogram equalization;
FIG. 9 is a schematic diagram of the fusion step of step S4 in FIG. 3;
FIG. 10, panels a and b, respectively, are graphs illustrating a comparison of the output of a prior art detection method and a detection method of the present invention;
FIG. 11 is a functional block diagram of an embodiment of a detection system of the present invention;
fig. 12 is a functional block diagram of an embodiment of the apparatus of the present invention.
Detailed Description
As described in the background section, cloth defect detection in the prior art suffers from low detection accuracy; the problems of prior-art image processing are analyzed below in conjunction with figs. 1 and 2.
In the prior art, the image processing applied to an input image includes algorithms such as Gaussian filtering, and these image algorithms need their parameters adjusted for different pictures. The patchwork defect 10 in fig. 1 is large, while the stain defect 20 in fig. 2 is small. Flaws of different sizes require different parameter settings before they can be detected accurately.
In an actual cloth processing procedure, when it is impossible to know in advance what kind of flaws will occur on the cloth, image processing is usually performed with a single parameter setting, which easily causes targets to be missed. For example, the patchwork defect 10 in fig. 1 is large and easily detected, while the stain defect 20 in fig. 2 is small and easily overlooked, giving rise to missed defects.
In addition, to reduce computational complexity, the prior art also constrains the input image by methods such as cropping or scaling the picture, which easily loses spatial detail; the detail loss caused at the boundary is especially severe, so detection accuracy easily degrades.
To solve this technical problem, embodiments of the present invention provide a detection method in which the detection picture is recognized by a convolutional neural network (CNN) deep learning method to obtain the defect information on it: spatial and detail feature data are obtained for one detection picture through the first and second convolutional neural network processing, respectively, so the combined picture data contains spatial information without losing detail information. On the basis of guaranteed processing efficiency, the detection method of the embodiments can therefore achieve higher defect detection precision.
The embodiments herein take cloth defect detection as an example, but the invention is not limited to it: the method can also be used for defect recognition on metals, plastic molds, automobile components, mechanical parts, and the like, and the image recognition technique can further be applied to scenes such as bill, license plate, and face recognition.
Referring to fig. 3, a flow chart of an embodiment of the detection method of the present invention is schematically shown. The detection method comprises the following steps:
step S1, obtaining a detection picture;
step S2, performing first convolution neural network processing on the detection picture to obtain spatial characteristic data;
step S3, performing second convolution neural network processing on the detection picture to obtain detail feature data, wherein the second convolution neural network is shallower than the first convolution neural network in level and wider than the first convolution neural network in channel;
step S4, fusing the spatial feature data and the detail feature data to obtain picture data;
step S5, judging defect information based on the picture data.
It should be noted that the above steps may be added to, deleted, modified, or combined according to the actual situation. For example, in some embodiments steps S3 and S4 may be omitted and the defect information obtained directly from the spatial feature data; in other embodiments S4 and S5 may be combined so that the defect information is judged from the fused feature information. Specifically, for example, the detection method includes the following steps:
firstly, obtaining a detection picture;
inputting the detected picture into a first convolution neural network for processing, wherein the first convolution neural network firstly reduces the dimension and then increases the dimension of the feature data of the detected picture; obtaining first characteristic data according to the dimension reduction and dimension increase process; obtaining second characteristic data according to the dimension reduction process;
thirdly, fusing the first characteristic data and the second characteristic data to obtain spatial characteristic data;
and fourthly, judging the defect information according to the spatial characteristic data.
The following describes each step of the above detection method in detail.
As shown in fig. 4, step S1 is executed to obtain the detection picture 101. The detection picture 101 here refers to a picture that can be recognized and processed by a convolutional neural network.
In this embodiment, the detection method detects the fabric, and the defect information is defect information on the fabric, so that the detected picture is a fabric picture.
In an actual cloth processing procedure, the cloth moves rapidly along the production line, and during flaw detection the cloth surface is photographed by a camera (or another image sensor) to obtain an original picture of the surface. After the original picture is obtained, this embodiment further performs equal-size segmentation on it to obtain a plurality of detection pictures 101, because the size of the original picture produced by a common industrial camera does not meet the processing requirements of the convolutional neural network, and the picture must be cut.
For example, an original picture of 4096 × 500 is cut by equal segmentation into 500 × 500 detection pictures, which are then sent to the convolutional neural network for processing.
In other embodiments, the segmentation size can be chosen according to the convolutional neural network's requirements on the detection picture; alternatively, if the original picture already meets those requirements, no segmentation is needed.
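As a rough illustration of the equal-size segmentation described above (not code from the patent; the tile size and the edge-padding strategy for the leftover strip are assumptions), a minimal Python/NumPy sketch might look like this:

```python
import numpy as np

def split_into_tiles(image: np.ndarray, tile: int = 500) -> list:
    """Cut an image into equal tile x tile patches; edge replication pads
    the leftover strip so every patch has the full tile size."""
    h, w = image.shape[:2]
    pad_h, pad_w = (-h) % tile, (-w) % tile
    pad_spec = ((0, pad_h), (0, pad_w)) + ((0, 0),) * (image.ndim - 2)
    padded = np.pad(image, pad_spec, mode="edge")
    return [padded[y:y + tile, x:x + tile]
            for y in range(0, padded.shape[0], tile)
            for x in range(0, padded.shape[1], tile)]

# A 4096 x 500 original strip yields 9 tiles of 500 x 500 (the ninth is padded).
strip = np.zeros((500, 4096, 3), dtype=np.uint8)
print(len(split_into_tiles(strip)))  # 9
```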
To facilitate identification of the defect information, this embodiment further includes: after obtaining the original picture and before performing the equal-size segmentation, preprocessing the original picture to enhance its feature information. Since embodiments of the invention identify whether image features that deviate anomalously from the cloth background are present, strengthening the feature information in the picture makes the defect-related anomalous image information more salient, which benefits the accuracy of subsequent defect identification and detection.
Specifically, the preprocessing in this embodiment is Gaussian filtering of the original picture. By smoothing the data, Gaussian filtering is very effective at suppressing normally distributed noise, yielding an image with a high signal-to-noise ratio that reflects the real image information.
In other embodiments, the preprocessing may further include picture dilation (Dilation) or picture erosion (Erosion). Dilation strengthens the information of the picture's features, while erosion weakens noise and thereby highlights the feature information, so both serve to strengthen the picture's feature information.
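A minimal OpenCV sketch of this preprocessing (the kernel sizes are illustrative assumptions, not values from the patent):

```python
import cv2
import numpy as np

def preprocess(original: np.ndarray) -> np.ndarray:
    # Gaussian filtering: smoothing suppresses normally distributed noise.
    return cv2.GaussianBlur(original, (5, 5), 0)

def emphasize_features(img: np.ndarray, mode: str = "dilate") -> np.ndarray:
    # Dilation strengthens feature information; erosion weakens noise.
    kernel = np.ones((3, 3), np.uint8)
    return cv2.dilate(img, kernel) if mode == "dilate" else cv2.erode(img, kernel)
```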
Step S2 is executed: the first convolutional neural network processing is performed on the detection picture to obtain spatial feature data.
The detection picture obtained in step S1 can be represented by data: specifically, the input to the network is a pixel-value matrix, each element of which is a pixel value representing a different gray level in the picture.
Features (features) in the pixel value matrix can be extracted and learned through a convolutional neural network, and image information (such as a flat background, a small object on the background, an edge of a large object on the background, and the like) of a detected picture is obtained.
The process of feature extraction mainly comprises the following steps: the pixel value matrix is convolved by different convolution kernels (filters) (usually 3 × 3 or 5 × 5) to obtain different feature maps (feature maps), and based on the feature maps and subsequent processing (e.g., sampling, etc.), the image learning and recognition process can be realized.
In the embodiment of the invention, the first convolutional neural network processing is performed by a convolutional neural network with narrow channels (Channel) and a deep hierarchy, yielding a first matrix that embodies spatial information and serves as the spatial feature data.
During the first convolutional neural network processing, as the number of downsampling or convolution operations increases, the receptive field over the pixel matrix gradually grows and the overlap between receptive fields keeps increasing, so the information obtained describes a region, i.e., it is feature information within the current region or between adjacent regions. High-level semantics can therefore be obtained by enlarging the receptive field, which in turn yields the spatial feature data.
The first convolutional neural network has narrow channels (e.g., Channel = 32 or 64) and correspondingly few convolution kernels, which reduces the computation required for image processing.
It should be noted that the deeper the hierarchy of a deep convolutional network, i.e., the more convolutions it stacks, the more easily the gradients between layers diverge and errors arise.
This embodiment obtains the spatial feature data through a convolutional neural network with a residual structure, which can extract abstract spatial features of the cloth defects and thus optimizes the training effect. A network without a residual structure loses part of the input data after dimension raising and lowering; the residual structure corrects the lost signal against the original information as a reference, enabling rich abstract spatial features to be extracted. Fig. 5 shows a specific residual structure formed by three convolutional layers: the first 1 × 1 convolution reduces the 256-dimensional channels to 64, and 256 feature maps are output by adding the output of the last convolutional layer to the features entering the first.
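The three-convolution residual unit of fig. 5 can be sketched in PyTorch as follows (layer details beyond the 256 → 64 → 256 channel path are assumptions):

```python
import torch
from torch import nn

class Bottleneck(nn.Module):
    """1x1 conv reduces 256 channels to 64, 3x3 extracts features,
    1x1 restores 256 channels; the block input is added back so the
    lost signal is corrected against the original information."""
    def __init__(self, channels: int = 256, reduced: int = 64):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, reduced, 1, bias=False),
            nn.BatchNorm2d(reduced), nn.ReLU(inplace=True),
            nn.Conv2d(reduced, reduced, 3, padding=1, bias=False),
            nn.BatchNorm2d(reduced), nn.ReLU(inplace=True),
            nn.Conv2d(reduced, channels, 1, bias=False),
            nn.BatchNorm2d(channels),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return torch.relu(self.body(x) + x)  # residual addition outputs 256 feature maps

print(Bottleneck()(torch.randn(1, 256, 64, 64)).shape)  # (1, 256, 64, 64)
```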
Specifically, the first convolutional neural network may be a MobileNet V2 network. The main framework of MobileNet V2 combines the depthwise separable convolutions of MobileNet V1 with the residual connections of the ResNet residual network, and adopts the approach of raising the dimension first and then lowering it: expansion, convolutional feature extraction, and compression are executed in sequence. MobileNet V2 is a lightweight network with narrow channels and deep layers, which speeds up the network's processing of the detection picture 101.
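The expansion-extraction-compression sequence of a MobileNet V2 style block might be sketched as follows (the expansion factor and layer details are assumptions):

```python
import torch
from torch import nn

class InvertedResidual(nn.Module):
    """Raise the dimension first (1x1 expansion), extract features with a
    depthwise 3x3 convolution, then lower it (1x1 linear compression);
    a skip connection is kept when input and output shapes match."""
    def __init__(self, in_ch: int, out_ch: int, stride: int = 1, expand: int = 6):
        super().__init__()
        hidden = in_ch * expand
        self.use_skip = stride == 1 and in_ch == out_ch
        self.block = nn.Sequential(
            nn.Conv2d(in_ch, hidden, 1, bias=False),                          # expansion
            nn.BatchNorm2d(hidden), nn.ReLU6(inplace=True),
            nn.Conv2d(hidden, hidden, 3, stride, 1, groups=hidden, bias=False),  # depthwise
            nn.BatchNorm2d(hidden), nn.ReLU6(inplace=True),
            nn.Conv2d(hidden, out_ch, 1, bias=False),                         # compression
            nn.BatchNorm2d(out_ch),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        y = self.block(x)
        return x + y if self.use_skip else y
```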
In other embodiments, other convolutional neural networks may be used to process the detection picture and obtain high-level semantics. Referring to fig. 6, a schematic diagram of a first convolutional neural network of step S2 in fig. 3 is shown. This first convolutional neural network includes the MobileNet network 201 and a Feature Pyramid Network 202 (FPN) that further processes the data output by the MobileNet network 201.
Specifically, FPN is a method of fusing features at different resolutions: the feature map at each resolution is added element-wise to an upsampled lower-resolution feature map, strengthening the features at every level, which improves target detection performance markedly. Moreover, because the FPN builds on the MobileNet network through cross-layer connections and low-resolution feature addition, it adds little computation compared with the embodiment that uses only the MobileNet V2 network, thereby balancing efficiency and precision.
Specifically, as shown in fig. 6, the FPN uses 4 feature layers (e.g., C2, C3, C4, C5 extract 32-, 64-, 128-, and 256-dimensional features from the picture, respectively), and each layer incorporates the lower-resolution features (e.g., C4 incorporates the features of C5). By fusing features at different scales, the FPN can extract features from the pixel matrix at multiple scales, preventing as far as possible the loss of defect targets in the detection picture 101.
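The cross-layer, element-wise fusion the FPN performs can be sketched like this (the common width and the upsampling mode are assumptions):

```python
import torch
from torch import nn
import torch.nn.functional as F

class FPN(nn.Module):
    """Each level is projected to a common width, and every layer
    incorporates the upsampled lower-resolution level by element-wise
    addition (C4 incorporates C5, C3 incorporates C4, and so on)."""
    def __init__(self, in_channels=(32, 64, 128, 256), width: int = 64):
        super().__init__()
        self.lateral = nn.ModuleList(nn.Conv2d(c, width, 1) for c in in_channels)

    def forward(self, feats):  # feats = [C2, C3, C4, C5], high to low resolution
        laterals = [conv(f) for conv, f in zip(self.lateral, feats)]
        outs = [laterals[-1]]  # start from the lowest resolution (C5)
        for lat in reversed(laterals[:-1]):
            up = F.interpolate(outs[0], size=lat.shape[-2:], mode="nearest")
            outs.insert(0, lat + up)  # element-wise addition across scales
        return outs  # fused pyramid [P2, P3, P4, P5]
```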
More specifically, the first convolutional neural network shown in fig. 6 is used as follows. First, a detection picture containing flaws is obtained; it may be a true-color (RGB) image. To identify a flaw in a detection picture, the flaw must first be located, and its type and size are then identified based on that location. The detection picture to be identified can be sent to a terminal, which, after obtaining the picture, starts locating the flaw to be identified and executes the following steps.
First, the detection picture is input into the first neural network shown in fig. 6 to obtain a plurality of feature layers of different feature dimensions, where the first neural network is a model, obtained by analyzing the flaw features in samples, for extracting the flaw feature layers of an input picture. The first neural network model acquires texture feature layers of the input picture at different feature dimensions and is trained on the defective and non-defective parts of sample images. When the feature dimension corresponds to the pixel values of the picture, the texture extraction model may use a MobileNet-plus-FPN feature image pyramid model, convolving the pixels of the detection picture with a number of different convolution kernels to obtain a plurality of feature layers; for example, the number of feature layers may be 19.
Second, basic feature layers are screened out of the plurality of feature layers. A basic feature layer is one of the feature layers best suited to locating flaws. After the feature layers are obtained, not all of them undergo the next operation; they are screened according to the identification requirement, and only the basic feature layers with the best flaw-locating effect are kept. If the neural network model is MobileNet V2, layers 2, 3, 4, and 5 of the 19 texture feature layers can serve as basic feature layers, the feature-matrix dimensions of the selected layers being 1/2, 1/4, 1/8, and 1/16 of the original image, respectively.
Third, the basic feature layers are feature-superimposed to obtain the feature layer of the flaws in the detection picture. After the basic feature layers are obtained, if there is more than one of them, their features need to be superimposed to obtain a feature layer representing the position of the flaw in the detection picture. When the basic texture feature layers correspond to convolution layers obtained from the detection image through several different convolution kernels, pixel interpolation can be applied to the basic feature layers to obtain high-dimensional pixel maps of each of them.
Finally, the position of the flaw in the detection picture is acquired from the flaw feature layer, according to the distribution of features in the layer, such as the distribution of the feature pixels corresponding to the flaw.
Referring to fig. 7, a schematic diagram of another first convolutional neural network of step S2 in fig. 3 is shown. The first convolutional neural network of this embodiment consists of a ResNet101 or ResNet50 network, a feature image pyramid, and a fully convolutional network. Its processing further includes fusing the feature data obtained by the network through the native module 301, the segmentation module 302, and the fusion module 303.
Specifically, as shown in fig. 7, when the detection picture 101 is input into the network for processing, the feature data is handled by first lowering the dimension (e.g., the process from block1 to block5) and then raising it (e.g., the process from up4 to up1). The native module 301 outputs the first feature data obtained from the whole dimension-lowering-and-raising process, and the segmentation module 302 outputs the second feature data obtained from the first, dimension-lowering half.
The fusion module 303 fuses the first feature data and the second feature data to obtain spatial feature data.
In a specific embodiment, the detection picture 101 is input into the first convolutional neural network, whose base is composed of ResNet101 or ResNet50, a feature image pyramid, and a fully convolutional network; feature pyramid fusion (FPN) is then performed between layers 1, 3, and 5 of the convolutional blocks in ResNet101 or ResNet50 and up4, up3, up2, and up1, respectively. up4, up3, up2, and up1 upsample the feature maps using dilated (atrous) convolution; the native module 301 outputs the first feature data and the segmentation module 302 outputs the second feature data, the two are fused, and a full convolution operation is applied to the fused features. Dilated convolution enlarges the receptive field without the information loss of pooling, so every convolution output covers a large range of information. Before the detection picture is input into the neural network, it may be preprocessed, for example by histogram equalization. As shown in fig. 8, histogram equalization raises the global contrast of the cloth defect; especially when the contrast of the defect is close to that of the background, this method spreads the defect brightness better across the histogram. It can therefore be used to enhance the cloth's defect features and make the classification of similar flaws more accurate; the histogram-equalized image is input into the model for training, finally yielding a mask image.
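A minimal sketch of the histogram-equalization preprocessing (equalizing the luminance channel of a color picture is one common choice, assumed here rather than specified by the patent):

```python
import cv2
import numpy as np

def equalize(bgr: np.ndarray) -> np.ndarray:
    """Raise the global contrast of the cloth picture so low-contrast
    defects spread better across the brightness histogram."""
    ycrcb = cv2.cvtColor(bgr, cv2.COLOR_BGR2YCrCb)
    ycrcb[..., 0] = cv2.equalizeHist(ycrcb[..., 0])  # equalize luminance only
    return cv2.cvtColor(ycrcb, cv2.COLOR_YCrCb2BGR)
```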
One technical effect of this operation is to avoid the missed detection of inconspicuous defect features during convolution and to make the classification of cloth defects more accurate, while also increasing the robustness of the trained model.
More specifically, the first convolutional neural network shown in fig. 7 is used as follows. First, a detection picture containing flaws is obtained. As with fig. 6, to identify whether a detection picture contains a flaw, the flaw is generally located first and its type and size are then identified based on the location. The detection picture to be identified can be sent to the terminal, which executes the following steps after acquiring it.
First, the detection picture is input into the first convolutional neural network, in which conv3, block1, block3, and block5 form a fully convolutional structure with residual connections, built by combining the ResNet101 base network with an image pyramid. After the detection picture is input, a first feature layer is obtained through conv3, a second through conv3 and block1, a third through block3, and a fourth through block5, giving a plurality of feature layers of different feature dimensions of the detection image.
Second, the fourth feature layer from block5 is upsampled by up4 (upsampling is also called image interpolation) to give a fifth feature layer, which is stacked with the third feature layer; up3 then yields a sixth feature layer, and so on, until up1 finally yields an eighth feature layer. The first feature data output by up1 enters the native module and the second feature data output by up4 enters the segmentation module; the outputs of the two modules are then segmented and fused, and a full convolution operation is applied to the fused features.
Finally, mask images of the different defect types of the flaws in the detection picture are acquired from the fused feature layer, i.e., the spatial feature data. Once the defect feature layer is obtained, the position and type of the defect in the detection picture can be read from the distribution of features in the layer, such as the pixel distribution corresponding to the defect.
In the embodiments of figs. 6 and 7, the defect information may be obtained directly from the spatial feature data obtained above. In another embodiment, the defect information may instead be obtained from feature data that fuses the spatial feature data with detail feature data.
Step S3 is executed: second convolutional neural network processing is performed on the detection picture to obtain detail feature data, where the second convolutional neural network is shallower in hierarchy than the first and wider in channel.
In the embodiment of the invention, the second convolutional neural network is complementary to the first, so complementary feature information can be obtained. In addition, the second convolutional neural network and the first process the detection picture in parallel, which improves the processing efficiency of the detection method.
The second convolutional neural network processing, performed by a convolutional neural network with wide channels and shallow layers, yields a second matrix that embodies detail information and serves as the detail feature data.
Specifically, the second convolutional neural network's shallow hierarchy gives it a correspondingly small receptive field, so the feature map it finally outputs embodies finer-grained feature information.
The second convolutional neural network has wide channels (e.g., Channel = 512) and can process the three RGB channels, so more detail information is obtained through its larger number of convolution kernels.
In this embodiment, the second convolutional neural network adopts a VGG network structure.
Specifically, each layer of the VGG network structure comprises a convolutional layer, batch normalization (Batch Normalization), and an activation function.
The stride of the first layer can be set to 2, so the feature map output by the second convolutional neural network processing is 1/8 of the original input; the granularity is thus fine and detail information can be obtained.
In practical application, the step size and the size of the convolution kernel can be adjusted according to the requirements of calculation speed and precision.
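A shallow, wide VGG-style detail branch consistent with this description might be sketched as follows (the exact layer count and channel widths are assumptions; only the conv + batch-norm + activation pattern, the stride-2 first stage, and the 1/8 output scale come from the text):

```python
import torch
from torch import nn

def conv_bn_relu(in_ch: int, out_ch: int, stride: int = 1) -> nn.Sequential:
    # each layer: convolution + batch normalization + activation function
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, 3, stride, 1, bias=False),
        nn.BatchNorm2d(out_ch),
        nn.ReLU(inplace=True),
    )

class DetailBranch(nn.Module):
    """Three stride-2 stages bring the feature map to 1/8 of the input
    while wide channels preserve fine-grained detail information."""
    def __init__(self):
        super().__init__()
        self.stages = nn.Sequential(
            conv_bn_relu(3, 64, stride=2), conv_bn_relu(64, 64),
            conv_bn_relu(64, 128, stride=2), conv_bn_relu(128, 128),
            conv_bn_relu(128, 512, stride=2), conv_bn_relu(512, 512),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.stages(x)

print(DetailBranch()(torch.randn(1, 3, 512, 512)).shape)  # (1, 512, 64, 64): 1/8 scale
```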
It should be noted that, compared with the second convolutional neural network of step S3, the first convolutional neural network adopted in step S2 has a residual structure, which reduces errors introduced by network processing; the first convolutional neural network also performs feature fusion across network layers, so more features can be retained.
In addition, compared with the second convolutional neural network, the first has fewer parameters for the same number of convolutional layers, which reduces computation; this characteristic mainly concerns the MobileNet V2 base network.
Step S4 is executed: the spatial feature data and the detail feature data are fused to obtain picture data. Fusion is the addition of corresponding positions of the first matrix embodying spatial information and the second matrix embodying detail information, and the preset weight is the ratio of the spatial feature data to the detail feature data in this addition.
The weight of the spatial feature data and the weight of the detail feature data are respectively between 0 and 1, and the sum of the two weights is 1.
The spatial feature data and the detail feature data are mutually complementary; by fusing them, the resulting picture data contains spatial information without loss of detail information, so higher detection precision can be ensured while a certain processing speed is maintained.
Specifically, the fusing step comprises: fusing the spatial feature data and the detail feature data based on a preset weight to obtain the picture data.
In practical applications, the preset weight may be set to 1:1; that is, the picture data is obtained by simply adding the spatial feature data and the detail feature data. This processing is simple and computationally cheap.
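The weighted element-wise addition can be written in a few lines (a sketch; the patent leaves the exact implementation open):

```python
import torch

def fuse(spatial: torch.Tensor, detail: torch.Tensor, w: float = 0.5) -> torch.Tensor:
    """Add corresponding positions of the two matrices; w and (1 - w)
    are the preset weights, each between 0 and 1 and summing to 1."""
    assert spatial.shape == detail.shape
    return w * spatial + (1.0 - w) * detail  # w = 0.5 is the 1:1 case, up to scale
```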
The two feature data can also be fused in other ways. Fig. 9 shows one such scheme for step S4: the spatial feature data is processed by two different convolutions to obtain first and second spatial data, and the detail feature data by two different convolutions to obtain first and second detail data. During fusion, the first spatial data is combined with the first or second detail data, and the second spatial data with the first or second detail data, giving four combinations. With these multiple combinations, adjusting the preset weights makes the loss of the obtained picture data relative to the original picture smaller, so the original picture's information is reflected more truly and the accuracy of defect judgment improves.
In other embodiments, more paths or combinations of paths may be used to configure the weights to perform the fusing step.
In the fusion step, the spatial feature data and the detail feature data learned in step S2 and step S3 are superimposed together to obtain fusion feature data, and the picture data can be obtained according to the fusion feature data, thereby completing the learning process of the detected picture.
Before the picture data is obtained, the following operations can further be applied to the spatial feature data and/or the detail feature data so that the type and position of the flaw are judged more accurately: inputting the spatial feature data and/or the detail feature data into a contour recognition model to obtain a plurality of contour prediction data for the detection picture; superimposing the features of the plurality of contour prediction data to obtain predicted superimposed data; and judging the position and/or type of the defect based on the predicted superimposed data.
First, the spatial feature data and/or the detail feature data are input into the contour recognition model to obtain a plurality of contour prediction data for the detection picture. A contour recognition model trained as described above can take the feature information of one sample and output a plurality of contour prediction data, each corresponding to one of several training data derived from the contour of the recognition object in the sample, so the plurality of contour prediction data represent the contours of the predicted target object over different size ranges. Inputting the feature information of an image to be detected therefore yields a plurality of contour prediction data that predict the contour of the target object in that image over a plurality of size ranges.
Second, the features of the plurality of contour prediction data are superimposed to obtain predicted superimposed data. For a small target object in the image to be detected, or several target objects overlapping one another, contour prediction data obtained from a single size range, as in the related art, may fail to predict them or to divide them clearly. Because the plurality of contour prediction data predict the target object's contour over several size ranges, an object that cannot be predicted in one size range may still be predicted in another. Superimposing the contour prediction data that represent the contours over the several size ranges yields predicted superimposed data that clearly reflect the detection result for small target objects or for multiple superimposed ones.
Finally, the contour data of the target object is obtained based on the predicted superimposed data. The predicted superimposed data clearly reflect the detection result even for several target objects superimposed together; the contour data of each target object is obtained from them, its position and contour are determined, and the detection result is output. The contours of overlapping target objects can be clearly separated and each object's contour accurately predicted, for example when identifying flaws of the cloth during cloth inspection.
In one embodiment, the plurality of contour prediction data include contours obtained by shrinking the contour of the recognition object corresponding to the basic data by M different scaling factors, taking the geometric center of the flaw contour corresponding to the spatial feature data and/or the detail feature data as the scaling center.
The M contour prediction data acquired by this shrinking reflect the contour of the flaw in the sample at different proportions, making the reference values clearer at each proportion; since the output contour prediction data reflect the flaw at different proportions, the contour recognition model gains the ability to recognize objects at different scales and can recognize them more comprehensively, accurately, and quickly.
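A NumPy sketch of generating the M shrunken contours (the scaling factors are illustrative):

```python
import numpy as np

def scaled_contours(contour: np.ndarray, factors=(1.0, 0.8, 0.6, 0.4)) -> list:
    """Shrink a flaw contour about its geometric center by M different
    scaling factors, one reduced contour per factor."""
    center = contour.mean(axis=0)  # geometric center used as the scaling center
    return [center + f * (contour - center) for f in factors]

square = np.array([[0, 0], [10, 0], [10, 10], [0, 10]], dtype=float)
for c in scaled_contours(square, factors=(1.0, 0.5)):
    print(c.tolist())
```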
Step S5, based on the picture data, determines defect information.
Machine learning on the detection picture yields the corresponding picture data, which is compared with prestored defect information to judge the position and/or type of the defect.
In this embodiment, equal-size segmentation is performed before the detection picture is input into the network. Correspondingly, judging the defect information on the detection picture based on the picture data includes: merging the picture data corresponding to the plurality of detection pictures and judging the position or type of the defect based on the merged data.
During merging, each detection picture is restored to the position it occupied when cut, so the picture data of the whole original picture is obtained, which facilitates accurate localization of the defect position.
It should be noted that the picture data here is equivalent to a matrix whose elements indicate whether each position contains a defect and, if so, the defect type. For example, positions without flaws have element value 0, while defective positions have element values 1, 2, 3, ..., where 1, 2, ... represent different defect types.
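Reading positions and types out of such a label matrix is straightforward; a sketch (the type names are hypothetical placeholders, not codes from the patent):

```python
import numpy as np

DEFECT_NAMES = {1: "stain", 2: "patchwork", 3: "snag"}  # hypothetical type codes

def report_defects(label_map: np.ndarray) -> None:
    """Element value 0 marks defect-free positions; positive values
    encode the defect type at that position."""
    for code in np.unique(label_map):
        if code == 0:
            continue
        ys, xs = np.nonzero(label_map == code)
        name = DEFECT_NAMES.get(int(code), "unknown")
        print(f"type {code} ({name}): x [{xs.min()}, {xs.max()}], "
              f"y [{ys.min()}, {ys.max()}], {len(xs)} px")
```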
Referring again to the flowchart of fig. 3, in this embodiment a modeling step is also required before actual detection is performed; it is mainly used to configure the preset weight. During modeling, the convolutional neural network carries out feature learning, which is also the learning of defect features.
Specifically, the modeling step comprises: obtaining a sample picture; performing the first convolutional neural network processing on the sample picture to obtain sample spatial feature data; performing the second convolutional neural network processing on the sample picture to obtain sample detail feature data; fusing the sample spatial feature data and the sample detail feature data based on an initial weight to obtain sample picture data, completing one round of training; and adjusting the initial weight over multiple rounds of training, taking the adjusted weight as the preset weight once the loss on the sample picture data meets a specified value.
The modeling step executes the same processing as the detection method; the difference is the data input into the network. During modeling, the input is sample pictures, from which the network on the one hand learns the defect features in the pictures and on the other hand configures the preset weight of the two feature data used in the fusion step.
In the initial learning process, the initial weight is set randomly; the weight is adjusted in each round of learning to reduce the loss on the picture data, and once the loss on the sample picture data meets a specified value, the adjusted weight is taken as the preset weight.
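The patent does not spell out how the weight is adjusted; one natural reading is a trainable fusion weight updated by backpropagation, sketched here with a toy loss standing in for the real training objective:

```python
import torch
from torch import nn

class LearnedFusion(nn.Module):
    """Fusion weight initialized randomly and adjusted each round of
    training to reduce the loss, as in the modeling step above."""
    def __init__(self):
        super().__init__()
        self.logit = nn.Parameter(torch.randn(1))  # randomly set initial weight

    def forward(self, spatial: torch.Tensor, detail: torch.Tensor) -> torch.Tensor:
        w = torch.sigmoid(self.logit)  # keep the weight between 0 and 1
        return w * spatial + (1 - w) * detail

fusion = LearnedFusion()
opt = torch.optim.Adam(fusion.parameters(), lr=1e-2)
spatial, detail = torch.randn(1, 8, 4, 4), torch.randn(1, 8, 4, 4)
target = 0.7 * spatial + 0.3 * detail  # toy target for illustration only
for _ in range(200):
    opt.zero_grad()
    loss = nn.functional.mse_loss(fusion(spatial, detail), target)
    loss.backward()
    opt.step()
print(torch.sigmoid(fusion.logit).item())  # approaches the preset weight 0.7
```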
And then, in the actual detection process, detecting by using the preset weight obtained in the modeling process.
In addition, the modeling step differs from the detection method in that after an original sample picture is obtained, it is converted into a mask picture; the mask picture is grayscale-processed to obtain a grayscale picture, and the grayscale picture and the original sample picture together serve as the sample pictures.
Training with grayscale-processed sample pictures keeps the data volume small while still reflecting the defect edge information, which benefits the learning of defect features.
It should be noted that the original sample pictures in the modeling step and the original pictures actually detected are both equally segmented pictures. The grayscale picture and the original sample picture are trained in pairs as sample pictures; through the original sample picture, an association is established between the grayscale picture and the original picture, so that in the subsequent detection process, defect detection is achieved simply by inputting the original picture.
Referring to fig. 10, panels a and b compare the output of a prior-art detection method with that of the detection method of the present invention.
For the stain defect 501, the detection box 502 of the prior-art method in panel a is not calibrated to the location of the defect 501. As shown in panel b, the detection method of the invention accurately marks the location of the stain defect 501 and determines that the defect type is a spot and the defect count is 1.
In other embodiments, other types of defects may also be detected, such as pins, snags, patchwork, and so on.
It should further be noted that when the detection method of the embodiments of the invention detects defect information on a cloth picture, one cloth picture can be processed in under 0.1 s; moreover, the defect information is detected with an mIOU precision of 0.7 or more. The detection method of the embodiments thus takes both processing speed and detection precision into account.
Accordingly, the present invention further provides a detection system, referring to fig. 11, which shows a functional block diagram of an embodiment of the detection system of the present invention, the detection system includes:
a first picture obtaining unit 601, configured to obtain a detection picture;
a semantic unit 602, configured to perform first convolutional neural network processing on the detected picture to obtain spatial feature data;
a detail unit 603, configured to perform a second convolutional neural network processing on the detected picture to obtain detail feature data, where the second convolutional neural network has a shallower level than the first convolutional neural network, and a channel of the second convolutional neural network is wider than that of the first convolutional neural network;
a fusion unit 604, configured to fuse the spatial feature data and the detail feature data to obtain picture data;
the determining unit 605 is configured to determine defect information according to the picture data.
The detection system of the embodiment of the invention performs image recognition on the detection picture by the CNN deep-learning method to obtain the defect information on it. For one detection picture, the detection system obtains spatial and detail feature data through the first and second convolutional neural networks, respectively; the combined picture data therefore contains spatial information without losing detail information, and the detection system of the embodiment can achieve higher defect detection precision while ensuring processing efficiency.
The various elements and modules of the detection system are described in detail below with reference to the figures.
With reference to fig. 4 in combination, a first picture acquisition unit 601 configured to obtain a detection picture 101; the detection picture 101 here refers to a picture that can be recognized and processed by a convolutional neural network.
In this embodiment, the detection system is configured to detect a fabric, and the defect information is defect information on the fabric. Therefore, the detection picture is a cloth picture.
In an actual cloth processing procedure, the cloth moves rapidly along the production line, and during flaw detection the cloth surface is photographed by a camera (or another image sensor) to obtain an original picture of the surface. After the original picture is obtained, this embodiment further performs equal-size segmentation on it to obtain a plurality of detection pictures 101, because the size of the original picture produced by a common industrial camera does not meet the processing requirements of the convolutional neural network, and the picture must be cut.
For example, an original picture of 4096 × 500 is cut by equal segmentation into 500 × 500 detection pictures, the leftover part being enlarged to 500 × 500; alternatively, it is cut into 512 × 500 detection pictures. The pictures are then sent to the convolutional neural network for processing.
In other embodiments, the segmentation size can be chosen according to the convolutional neural network's requirements on the detection picture; alternatively, if the original picture already meets those requirements, no equal-size segmentation is needed.
It should be noted that the types of defects that may occur on the cloth are many, and the sizes of the defects are different. In order to facilitate the identification of the defect information, the first picture capturing unit 601 in the system of this embodiment is further configured to perform preprocessing on the original picture before performing equal-amount segmentation, so as to enhance the feature information of the picture. Because the embodiment of the invention identifies whether the image characteristics which are different from and abnormal with the cloth background exist on the cloth background, the abnormal image information related to the defects can be more remarkable by strengthening the characteristic information on the picture, thereby being beneficial to the accuracy of subsequent defect identification and detection.
The first picture acquiring unit 601 is configured to perform expansion or corrosion preprocessing on the original picture; the image expansion processing can strengthen the information of the image characteristics; the image corrosion treatment can weaken noise so as to highlight the characteristic information, so that the image expansion treatment and the image corrosion treatment can both play a role in strengthening the characteristic information of the image.
Alternatively, the first picture acquisition unit 601 is configured to preprocess the original picture with Gaussian filtering. Gaussian filtering smooths the data and is very effective at suppressing normally distributed noise, yielding an image with a high signal-to-noise ratio that reflects the real image information.
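A minimal OpenCV sketch of the three preprocessing options just described; the kernel size and filter window are illustrative values, not parameters fixed by the embodiment:

    import cv2
    import numpy as np

    original = np.random.randint(0, 256, (500, 4096), dtype=np.uint8)  # stand-in original picture
    kernel = np.ones((3, 3), np.uint8)

    dilated = cv2.dilate(original, kernel)            # dilation: strengthens feature information
    eroded = cv2.erode(original, kernel)              # erosion: weakens noise to highlight features
    smoothed = cv2.GaussianBlur(original, (5, 5), 0)  # Gaussian filtering: suppresses normally distributed noise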
A semantic unit 602, configured to perform first convolutional neural network processing on the detected picture to obtain spatial feature data;
The detection picture obtained by the first picture acquisition unit 601 can be represented as data, specifically a pixel matrix in which each element is a pixel value representing a gray level. That is, the input to the first convolutional neural network is a matrix of pixel values.
Features in the pixel-value matrix can be extracted and learned by a convolutional neural network to obtain the image information of the detection picture (such as a flat background, a small object on the background, or the edge of a large object on the background).
The feature extraction process mainly comprises convolving the pixel-value matrix with different convolution kernels (filters, usually 3 × 3 or 5 × 5) to obtain different feature maps; based on these feature maps and subsequent processing (e.g., sampling), the image learning and recognition process can be realized.
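As a toy illustration of this extraction step (assuming PyTorch; the channel count of 8 is arbitrary), a bank of 3 × 3 convolution kernels applied to a pixel-value matrix yields one feature map per kernel, and a sampling step then reduces the resolution:

    import torch
    import torch.nn as nn

    pixel_matrix = torch.rand(1, 1, 500, 500)   # one grayscale detection picture
    conv = nn.Conv2d(in_channels=1, out_channels=8, kernel_size=3, padding=1)
    feature_maps = conv(pixel_matrix)           # 8 feature maps, one per convolution kernel
    sampled = nn.MaxPool2d(2)(feature_maps)     # subsequent sampling halves the resolution
    print(feature_maps.shape, sampled.shape)    # (1, 8, 500, 500) and (1, 8, 250, 250)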
In this embodiment of the invention, the first convolutional neural network processing is performed by a convolutional neural network with narrow channels and a deep hierarchy to obtain a first matrix embodying the spatial information, i.e., the spatial feature data.
As the hierarchy of the first convolutional neural network deepens, i.e., as the number of downsampling or convolution operations increases, the receptive fields over the pixel matrix gradually enlarge and the overlap between receptive fields keeps growing, so the information obtained describes a whole region, namely feature information within a region or between adjacent regions. High-level semantics can therefore be obtained by enlarging the receptive field, and from these the spatial feature data is derived.
The first convolutional neural network has narrow channels (e.g., 32 or 64); correspondingly, the number of convolution kernels is small, which reduces the computation required for image processing.
It should be noted that the deeper the hierarchy of a deep convolutional network, i.e., the more convolutions, the more easily the gradients between layers diverge and the more easily errors arise. This embodiment obtains the spatial feature data through a convolutional neural network with a direct-connection (shortcut) structure; by letting the input data carry the residual, such a network reduces the errors introduced by network processing and optimizes the training effect.
Specifically, the first convolutional neural network may be a MobileNet V2 network. The main framework of MobileNet V2 combines the building blocks of MobileNet V1 with the residual units of the ResNet residual network, raising the dimension first and then lowering it: expansion, convolutional feature extraction, and compression are executed in sequence. MobileNet V2 is a lightweight network with narrow channels and deep layers, which speeds up the network's processing of the detection picture 101.
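The expansion / feature-extraction / compression sequence with a shortcut can be sketched as follows (a simplified stride-1 unit in PyTorch, not the exact MobileNet V2 configuration; the channel width and expansion factor are illustrative):

    import torch
    import torch.nn as nn

    class InvertedResidual(nn.Module):
        """Raise dimension, extract features depthwise, then lower dimension."""
        def __init__(self, channels, expansion=6):
            super().__init__()
            hidden = channels * expansion
            self.block = nn.Sequential(
                nn.Conv2d(channels, hidden, 1, bias=False),             # expansion
                nn.BatchNorm2d(hidden), nn.ReLU6(inplace=True),
                nn.Conv2d(hidden, hidden, 3, padding=1,
                          groups=hidden, bias=False),                   # depthwise feature extraction
                nn.BatchNorm2d(hidden), nn.ReLU6(inplace=True),
                nn.Conv2d(hidden, channels, 1, bias=False),             # compression
                nn.BatchNorm2d(channels),
            )

        def forward(self, x):
            return x + self.block(x)  # direct-connection (shortcut) structure carries the residual

    x = torch.rand(1, 32, 64, 64)
    print(InvertedResidual(32)(x).shape)  # torch.Size([1, 32, 64, 64])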
In other embodiments, other convolutional neural networks may be used to process the detection picture to obtain high-level semantics. Fig. 6 shows a schematic diagram of another first convolutional neural network for step 2 in fig. 3. This first convolutional neural network comprises a MobileNet network 201 and a Feature Pyramid Network 202 (FPN) for further processing the data output by the MobileNet network 201.
Specifically, FPN is a method of fusing features at different resolutions: the feature map at each resolution is added element-wise to an upsampled lower-resolution feature, which strengthens the features at every level and can markedly improve target-detection performance. Moreover, because the FPN here only adds cross-layer connections and low-resolution feature addition on top of the MobileNet network, it increases the computation only slightly compared with the embodiment that uses the MobileNet V2 network alone, thereby balancing efficiency and accuracy.
Specifically, as shown in fig. 6, the FPN fuses 4 levels of features (e.g., C2, C3, C4 and C5 extract features from the picture with 32, 64, 128 and 256 channels, respectively), and each level incorporates the lower-resolution features (e.g., C4 incorporates the features of C5). By fusing features at different scales, the FPN can extract features from the pixel matrix at multiple scales, preventing as far as possible the loss of defect targets on the detection picture 101.
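One fusion step of such a pyramid can be sketched as follows (PyTorch; the channel numbers follow the example above but are otherwise assumptions): the lower-resolution feature is upsampled and added element-wise to a 1 × 1 lateral convolution of the higher-resolution feature map:

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    def fpn_merge(high_res, low_res, lateral):
        """Fuse one pyramid level: lateral 1x1 conv + upsampled low-resolution feature."""
        upsampled = F.interpolate(low_res, size=high_res.shape[-2:], mode="nearest")
        return lateral(high_res) + upsampled  # element-wise addition

    c4 = torch.rand(1, 128, 32, 32)   # e.g. the C4 feature map
    c5 = torch.rand(1, 256, 16, 16)   # e.g. the lower-resolution C5 feature map
    p5 = nn.Conv2d(256, 256, 1)(c5)
    p4 = fpn_merge(c4, p5, lateral=nn.Conv2d(128, 256, 1))
    print(p4.shape)                   # torch.Size([1, 256, 32, 32])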
As shown in fig. 7, the first convolutional neural network may also consist of a ResNet-50 network, a feature pyramid network, and a fully convolutional network. Processing by this first convolutional neural network further includes handling the feature data produced by the network through a native module 301, a segmentation module 302, and a fusion module 303.
Specifically, as shown in fig. 7, when the detection picture 101 is input into the network for processing, the feature data is first reduced in dimension (e.g., the process from block1 to block5) and then raised in dimension (e.g., the process from up4 to up1).
The native module 301 outputs the first feature data obtained from the full dimension-reduction-then-raising process, while the segmentation module 302 outputs the second feature data obtained from the first half of the process, i.e., dimension reduction only.
The fusion module 303 fuses the first feature data and the second feature data to obtain spatial feature data.
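A minimal sketch of this arrangement (the toy encoder and decoder stand in for block1–block5 and up4–up1; all layer sizes are assumptions): the segmentation module taps the output of the dimension-reduction half, the native module the output of the full reduce-then-raise path, and the fusion module adds the two after aligning shapes:

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    encoder = nn.Sequential(                        # stands in for block1 .. block5 (dimension reduction)
        nn.Conv2d(1, 64, 3, stride=2, padding=1), nn.ReLU(),
        nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(),
    )
    decoder = nn.Sequential(                        # stands in for up4 .. up1 (dimension raising)
        nn.ConvTranspose2d(128, 64, 2, stride=2), nn.ReLU(),
        nn.ConvTranspose2d(64, 64, 2, stride=2), nn.ReLU(),
    )
    project = nn.Conv2d(128, 64, 1)                 # aligns the second feature data for fusion

    picture = torch.rand(1, 1, 128, 128)
    second_feature = encoder(picture)               # segmentation module 302: reduction half only
    first_feature = decoder(second_feature)         # native module 301: full reduce-then-raise path
    aligned = F.interpolate(project(second_feature), size=first_feature.shape[-2:])
    spatial_feature = first_feature + aligned       # fusion module 303
    print(spatial_feature.shape)                    # torch.Size([1, 64, 128, 128])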
The detection system of this embodiment further comprises a detail unit 603, configured to perform second convolutional neural network processing on the detection picture to obtain detail feature data, wherein the second convolutional neural network has a shallower hierarchy and wider channels than the first convolutional neural network.
The second convolutional neural network used by the detail unit 603 is complementary to the first convolutional neural network, so complementary feature information can be obtained. Moreover, because the second convolutional neural network runs in parallel with the first, the processing efficiency of the detection method is improved.
The second convolutional neural network processing is performed by a convolutional neural network with wide channels and shallow layers to obtain a second matrix embodying the detail information, i.e., the detail feature data.
Specifically, because the second convolutional neural network has a shallow hierarchy, its receptive field is correspondingly small, so the feature map it finally outputs can embody finer-grained feature information.
The second convolutional neural network has wide channels (e.g., 512); the larger number of convolution kernels captures more detail information.
In this embodiment, the second convolutional neural network adopts a VGG network structure.
Specifically, each layer of the VGG network structure comprises: convolutional layer, batch normalization and activation functions.
The stride of the first layer of each stage can be set to 2, so the feature map output by the second convolutional neural network is 1/8 the size of the original input; this preserves fine granularity and allows the detail information to be obtained.
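The wide, shallow detail branch can be sketched as follows (PyTorch; three stages with stride 2 in the first layer of each, so the output is 1/8 of the input, with illustrative channel widths):

    import torch
    import torch.nn as nn

    def stage(in_ch, out_ch):
        """One stage: convolutional layer + batch normalization + activation."""
        return nn.Sequential(
            nn.Conv2d(in_ch, out_ch, 3, stride=2, padding=1),  # stride 2 in the first layer
            nn.BatchNorm2d(out_ch),
            nn.ReLU(inplace=True),
        )

    detail_branch = nn.Sequential(stage(1, 128), stage(128, 256), stage(256, 512))
    picture = torch.rand(1, 1, 512, 512)
    detail_feature = detail_branch(picture)
    print(detail_feature.shape)   # torch.Size([1, 512, 64, 64]), i.e. 1/8 of the input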
It should be noted that, compared with the second convolutional neural network, the first convolutional neural network has a residual structure, which reduces the errors introduced by network processing; it also performs cross-layer feature fusion, so more features are retained. In addition, for the same number of convolutional layers, the first convolutional neural network can have fewer parameters than the second, reducing the amount of computation.
The detection system of the embodiment of the invention also comprises: the fusion unit 604 is configured to fuse the spatial feature data and the detail feature data according to a preset weight to obtain picture data.
The fusion is a process of adding, at corresponding positions, the first matrix representing the spatial information and the second matrix representing the detail information; the preset weight refers to the proportions assigned to the spatial feature data and the detail feature data in this addition.
The weight of the spatial feature data and the weight of the detail feature data are respectively between 0 and 1, and the sum of the two weights is 1.
The spatial feature data and the detail feature data complement each other; by fusing them, the resulting picture data contains spatial information without loss of detail information, so higher detection accuracy can be ensured while a certain processing speed is maintained.
In practical applications, the preset weight may be set to 1:1, i.e., the spatial feature data and the detail feature data are simply added to obtain the picture data. This processing is simple and computationally cheap.
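In code, the weighted fusion reduces to a convex combination of the two matrices (a sketch; the weight of 0.5 is an assumed value, and the 1:1 case corresponds to plain addition):

    import torch

    spatial = torch.rand(1, 512, 64, 64)  # first matrix: spatial feature data
    detail = torch.rand(1, 512, 64, 64)   # second matrix: detail feature data

    w = 0.5                               # preset weight; w and (1 - w) sum to 1
    picture_data = w * spatial + (1 - w) * detail  # addition at corresponding positions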
The fusion step can also be carried out in other ways. As shown in fig. 11, the fusion unit 604 is configured to process the spatial feature data through two different convolutions to obtain first spatial data and second spatial data, and to process the detail feature data through two different convolutions to obtain first detail data and second detail data. During fusion, the fusion unit 604 combines the first or second spatial data with the first or second detail data, giving four possible combinations. With these multiple combinations, the preset weight can be adjusted so that the loss of the obtained picture data relative to the original picture is smaller, reflecting the original picture's information more faithfully and improving the accuracy of defect judgment.
In other embodiments, more paths or combinations of paths may be used to configure the weights to perform the fusing step.
The fusion unit 604 obtains picture data by superimposing the spatial feature data and the detail feature data that are learned separately, thereby completing the learning process of the detected picture.
As shown in fig. 11, the detection system further includes: the determining unit 605 is configured to determine defect information according to the picture data.
The determining unit 605 performs machine learning on the detected picture to obtain corresponding picture data, and compares the picture data with the defect information stored in advance to determine the position and/or type of the defect.
In this embodiment, equal segmentation is performed before the detection pictures are input into the network. Correspondingly, the determining unit 605 is further configured to merge the picture data corresponding to the multiple detection pictures and to judge the position and/or type of the defect based on the merged data.
During merging, the determining unit 605 restores the detection pictures according to their positions at the time of cutting, thereby obtaining picture data for the entire original picture, which facilitates accurate localization of the defect position.
It should be noted that the picture data here is equivalent to a matrix whose elements indicate whether each position is defective and, if so, the defect type. For example, positions without flaws have the element value 0; defective positions have element values 1, 2, 3, ..., where each value represents a different defect type.
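A sketch of the merging step (NumPy; the tile width matches the earlier segmentation sketch, and simply truncating the enlarged remainder tile back to its cut width is a simplification of the exact inverse resize):

    import numpy as np

    def merge_tiles(tile_labels, height, width, tile_width=500):
        """Write per-tile label matrices back to their cutting positions."""
        merged = np.zeros((height, width), dtype=np.int32)
        x = 0
        for labels in tile_labels:
            w = min(tile_width, width - x)
            merged[:, x:x + w] = labels[:, :w]  # 0 = no flaw; 1, 2, 3, ... = defect types
            x += w
        return merged

    tiles = [np.zeros((500, 500), dtype=np.int32) for _ in range(9)]
    label_map = merge_tiles(tiles, height=500, width=4096)
    print(label_map.shape)  # (500, 4096): the whole original picture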
The above describes the functions and interconnections of the modules of the detection system during actual detection. In practice, before detection, the deep convolutional neural networks of the modules must also be trained. The training process mainly serves to configure the preset weight and thereby build the model; in addition, the detection system completes defect-feature learning during training so that defect information can be compared and judged in subsequent detection.
With continued reference to the functional block diagram of the detection system in fig. 11, the detection system further includes a second picture acquisition unit 701 configured to obtain a sample picture. The semantic unit 602 is further configured to perform the first convolutional neural network processing on the sample picture to obtain sample spatial feature data; the detail unit 603 is further configured to perform the second convolutional neural network processing on the sample picture to obtain sample detail feature data; and the fusion unit 604 is further configured to fuse the sample spatial feature data and the sample detail feature data according to an initial weight to obtain sample picture data, completing one round of training. The initial weight is adjusted over multiple rounds of training, and when the loss of the sample picture data meets the specification value, the adjusted weight is taken as the preset weight.
With continued reference to fig. 11, the second picture acquisition unit 701 includes: a first picture processing unit 7011 configured to obtain an original sample picture and convert it into a mask picture; and a second picture processing unit 7012 configured to perform grayscale processing on the mask picture to obtain a grayscale image, the grayscale image and the original sample picture together serving as the sample pictures.
Training with grayscale-processed sample pictures keeps the data volume small on the one hand and reflects defect edge information on the other, which facilitates the learning of defect features.
It should be noted that the original sample pictures obtained by the first picture processing unit 7011 and the detection pictures obtained by the first picture acquisition unit 601 are both equally segmented pictures. The grayscale image and the original sample picture are trained in pairs as sample pictures; through the original sample picture, an association between the grayscale image and the original image is established, so that in subsequent detection, inputting the original picture alone suffices for defect detection.
During training for modeling, the data fed into the network differ from those used in detection: the sample pictures are input into the network, and based on learning from them the network both learns the defect features in the pictures and configures the preset weights of the two kinds of feature data used in the fusion step.
At the start of learning, the initial weight is set randomly; in each pass it is adjusted to reduce the loss of the picture data, and once the loss of the sample picture data meets the specification value, the adjusted weight is taken as the preset weight. In subsequent actual detection, this preset weight obtained during modeling is used.
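The weight-adjustment loop can be sketched as follows (PyTorch; the feature tensors, the surrogate loss and the specification value are all placeholders for the embodiment's real training data and criterion):

    import torch

    torch.manual_seed(0)
    w = torch.rand(1, requires_grad=True)   # initial weight: randomly set
    optimizer = torch.optim.SGD([w], lr=0.1)
    spec_value = 1e-3                       # specification value for the loss

    spatial = torch.rand(1, 8, 16, 16)      # sample spatial feature data (placeholder)
    detail = torch.rand(1, 8, 16, 16)       # sample detail feature data (placeholder)
    target = 0.3 * spatial + 0.7 * detail   # stand-in supervision signal

    for step in range(1000):                # each pass adjusts the weight to reduce the loss
        picture_data = w * spatial + (1 - w) * detail
        loss = torch.mean((picture_data - target) ** 2)
        if loss.item() < spec_value:
            break                           # loss meets the specification value
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

    preset_weight = w.detach()              # the adjusted weight becomes the preset weight
    print(step, preset_weight.item())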
Referring to fig. 12, a schematic diagram of an embodiment of the apparatus of the present invention is shown.
The apparatus comprises the detection system of the embodiment of the invention and is used to judge the type and/or position of defects on cloth from a detection picture of the cloth to be detected. A cloth defect detection apparatus is described below as a specific example; in other embodiments the detection system may be used in other apparatus.
The apparatus further comprises: and the sampling device 30 is used for photographing the object to be detected. A first picture acquisition unit in the detection system is used for acquiring an original picture from the sampling device.
Specifically, the sampling device 30 is a camera (it may also be another image sensor). In the cloth processing procedure, the cloth moves rapidly along the production line. During detection, the apparatus photographs the cloth surface with the camera to obtain a picture of the surface, then processes that picture with the detection system to judge whether the surface has defects and to analyze the defect information, for example the position and type of the defect.
The camera can photograph the moving cloth at a high shooting rate (tens of thousands of exposures per second), thereby ensuring the production efficiency of the line.
The apparatus further comprises: the transportation platform 40 is used for transporting the cloth to be detected; the camera is arranged on the cloth conveying platform and used for photographing the cloth to be detected.
The apparatus may further include: and the marking device is used for marking the defects on the cloth according to the types and positions of the defects judged by the detection system.
The cloth manufacturer can choose to discard the cloth section with the defects according to the defect type and the product quality requirement, or the cloth section with the defects is still used as a qualified product after the defects are removed through cleaning.
The apparatus disclosed by the invention can be applied to the field of cloth detection, and also to other fields such as automobile-part flatness detection and image recognition. Since the apparatus includes the detection system described above, it achieves high detection accuracy and high detection efficiency.
Correspondingly, the embodiment of the invention also provides a medium, on which computer instructions are stored, and when the computer instructions are executed, the steps of the detection method of the invention are executed.
Although the present invention is disclosed above, the present invention is not limited thereto. Various changes and modifications may be effected therein by one skilled in the art without departing from the spirit and scope of the invention as defined in the appended claims.

Claims (14)

1. A method of detection, comprising:
obtaining a detection picture;
inputting the detected picture into a first convolution neural network for processing, wherein the first convolution neural network firstly reduces the dimension of the characteristic data of the detected picture and then increases the dimension; obtaining first characteristic data according to the dimension reduction and dimension increase process; obtaining second characteristic data according to the dimension reduction process;
fusing the first characteristic data and the second characteristic data to obtain spatial characteristic data;
and judging the defect information according to the spatial characteristic data.
2. The detection method of claim 1, further comprising: inputting the detection picture into a second convolutional neural network for processing to obtain detail characteristic data;
fusing the spatial feature data and the detail feature data to obtain fused feature data;
and judging the defect information according to the fusion characteristic data.
3. The detection method according to claim 2, wherein the step of fusing the spatial feature data and the detail feature data to obtain fused feature data comprises:
and fusing the spatial feature data and the detail feature data based on a preset weight to obtain fused feature data.
4. The detection method of claim 3, wherein prior to obtaining the detection picture, the detection method further comprises: a modeling step comprising:
obtaining a sample picture;
performing the first convolution neural network processing on the sample picture to obtain sample spatial feature data;
performing second convolution neural network processing on the sample picture to obtain sample detail characteristic data;
based on the initial weight, fusing the sample space characteristic data and the sample detail characteristic data to obtain sample picture data, and finishing one-time training;
and continuously adjusting the initial weight through multiple times of training, and when the loss of the sample picture data meets a specification value, taking the adjusted weight as a preset weight.
5. The detection method of claim 4, wherein the step of obtaining a sample picture comprises: obtaining an original sample picture, and converting the original sample picture into a mask picture;
and carrying out gray level processing on the mask image to obtain a gray level image, and taking the gray level image and the original sample image as sample images.
6. The detection method of claim 1, wherein the step of obtaining the detection picture comprises: obtaining an original picture; and cutting the original picture to obtain a plurality of detection pictures.
7. The detecting method as claimed in claim 1, wherein the step of determining the defect information on the detected picture comprises: inputting the spatial feature information and/or the detail feature information into a contour recognition model to obtain a plurality of contour prediction data of the detection image;
performing feature superposition on the plurality of contour prediction data to obtain prediction superposition data; and judging defect information based on the predicted superimposed data.
8. The detection method of any one of claims 1-7, wherein the second convolutional neural network comprises a VGG network, and the first convolutional neural network comprises a MobileNet V2 network; alternatively,
the first convolutional neural network comprises a MobileNet V2 network and a feature pyramid network for processing the data output by the MobileNet V2 network; alternatively,
the first convolutional neural network comprises a ResNet101 or ResNet50 network, a feature pyramid network, and a fully convolutional network.
9. The inspection method according to any one of claims 1 to 7, wherein the inspection picture is a picture of a cloth, and the defect information is defect information on the cloth.
10. A detection system, comprising:
the first picture acquisition unit is used for acquiring a detection picture;
the semantic unit is used for inputting the detection picture into a first convolutional neural network for processing to obtain spatial feature data;
a judging unit, configured to judge the defect information based on the spatial feature data,
wherein, the inputting the detected picture into the first convolution neural network for processing comprises: firstly reducing the dimension of the feature data of the detected picture and then increasing the dimension; obtaining first characteristic data according to the dimension reduction and dimension increase process; obtaining second characteristic data according to the dimension reduction process; and fusing the first characteristic data and the second characteristic data to obtain spatial characteristic data.
11. An apparatus comprising a detection system according to claim 10.
12. The apparatus of claim 11, further comprising: the sampling device is used for photographing an object to be detected; the first picture acquisition unit is used for acquiring an original picture from the sampling device.
13. The apparatus of claim 11, wherein the sampling device is a camera, the apparatus further comprising: the transportation platform is used for transporting the object to be detected; the camera is arranged on the transportation platform and is used for photographing an object to be detected;
the detection system is used for judging the defect type and/or position of the object to be detected according to the picture of the object to be detected obtained by the camera.
14. A medium having stored thereon computer instructions, characterized in that the computer instructions are operable to perform the steps of the method according to any one of claims 1 to 9.
CN202011112325.0A 2020-10-16 2020-10-16 Detection method, detection system, device and medium Active CN112150460B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011112325.0A CN112150460B (en) 2020-10-16 2020-10-16 Detection method, detection system, device and medium

Publications (2)

Publication Number Publication Date
CN112150460A true CN112150460A (en) 2020-12-29
CN112150460B CN112150460B (en) 2024-03-15

Family

ID=73951285

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011112325.0A Active CN112150460B (en) 2020-10-16 2020-10-16 Detection method, detection system, device and medium

Country Status (1)

Country Link
CN (1) CN112150460B (en)

Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109492537A (en) * 2018-10-17 2019-03-19 桂林飞宇科技股份有限公司 A kind of object identification method and device
CN109615016A (en) * 2018-12-20 2019-04-12 北京理工大学 A kind of object detection method of the convolutional neural networks based on pyramid input gain
CN110458818A (en) * 2019-08-06 2019-11-15 苏州感知线智能科技有限公司 A kind of betel nut detection method based on neural network algorithm
CN110766046A (en) * 2019-09-16 2020-02-07 华北电力大学 Air quality measurement method for two-channel convolutional neural network ensemble learning
CN110992311A (en) * 2019-11-13 2020-04-10 华南理工大学 Convolutional neural network flaw detection method based on feature fusion
CN111402203A (en) * 2020-02-24 2020-07-10 杭州电子科技大学 Fabric surface defect detection method based on convolutional neural network
CN111583198A (en) * 2020-04-23 2020-08-25 浙江大学 Insulator picture defect detection method combining FasterR-CNN + ResNet101+ FPN
CN111754513A (en) * 2020-08-07 2020-10-09 腾讯科技(深圳)有限公司 Product surface defect segmentation method, defect segmentation model learning method and device

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Meng An et al.: "Fabric defect detection using deep learning: An Improved Faster R-CNN approach", 2020 International Conference on Computer Vision, Image and Deep Learning *

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112200790A (en) * 2020-10-16 2021-01-08 鲸斛(上海)智能科技有限公司 Cloth defect detection method, device and medium
CN112686869A (en) * 2020-12-31 2021-04-20 上海智臻智能网络科技股份有限公司 Cloth flaw detection method and device
CN112686274A (en) * 2020-12-31 2021-04-20 上海智臻智能网络科技股份有限公司 Target object detection method and device
CN112884721A (en) * 2021-02-01 2021-06-01 成都市览图科技有限公司 Anomaly detection method and system and computer readable storage medium
CN112884721B (en) * 2021-02-01 2024-03-29 吴俊 Abnormality detection method, abnormality detection system and computer-readable storage medium
CN113344918A (en) * 2021-08-02 2021-09-03 深圳市深科硅橡胶模具有限公司 Thermoforming mold detection method and system and readable storage medium
CN113344918B (en) * 2021-08-02 2021-12-03 深圳市深科硅橡胶模具有限公司 Thermoforming mold detection method and system and readable storage medium
CN117860207A (en) * 2024-03-13 2024-04-12 长春理工大学 Video non-contact measurement pain identification method and system based on data analysis
CN117860207B (en) * 2024-03-13 2024-05-10 长春理工大学 Video non-contact measurement pain identification method and system based on data analysis

Also Published As

Publication number Publication date
CN112150460B (en) 2024-03-15

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant