CN116539619B - Product defect detection method, system, device and storage medium - Google Patents

Product defect detection method, system, device and storage medium

Info

Publication number
CN116539619B
CN116539619B
Authority
CN
China
Prior art keywords
data
product
point cloud
prediction result
detected
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202310424078.5A
Other languages
Chinese (zh)
Other versions
CN116539619A (en)
Inventor
彭广德
王睿
李卫燊
李卫铳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangzhou Ligong Industrial Co ltd
Original Assignee
Guangzhou Ligong Industrial Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangzhou Ligong Industrial Co ltd filed Critical Guangzhou Ligong Industrial Co ltd
Priority to CN202310424078.5A
Publication of CN116539619A
Application granted
Publication of CN116539619B
Active legal status
Anticipated expiration

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/0002Inspection of images, e.g. flaw detection
    • G06T7/0004Industrial image inspection
    • GPHYSICS
    • G01MEASURING; TESTING
    • G01NINVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N21/00Investigating or analysing materials by the use of optical means, i.e. using sub-millimetre waves, infrared, visible or ultraviolet light
    • G01N21/84Systems specially adapted for particular applications
    • G01N21/88Investigating the presence of flaws or contamination
    • G01N21/8851Scan or image signal processing specially adapted therefor, e.g. for scan signal adjustment, for detecting different kinds of defects, for compensating for structures, markings, edges
    • GPHYSICS
    • G01MEASURING; TESTING
    • G01NINVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N29/00Investigating or analysing materials by the use of ultrasonic, sonic or infrasonic waves; Visualisation of the interior of objects by transmitting ultrasonic or sonic waves through the object
    • G01N29/44Processing the detected response signal, e.g. electronic circuits specially adapted therefor
    • G01N29/4445Classification of defects
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/0464Convolutional networks [CNN, ConvNet]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/047Probabilistic or stochastic networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/30Noise filtering
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • G06V10/44Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • G06V10/56Extraction of image or video features relating to colour
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/764Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • G06V10/765Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects using rules for classification or partitioning the feature space
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/80Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
    • G06V10/806Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level of extracted features
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • GPHYSICS
    • G01MEASURING; TESTING
    • G01NINVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N21/00Investigating or analysing materials by the use of optical means, i.e. using sub-millimetre waves, infrared, visible or ultraviolet light
    • G01N21/84Systems specially adapted for particular applications
    • G01N21/88Investigating the presence of flaws or contamination
    • G01N21/8851Scan or image signal processing specially adapted therefor, e.g. for scan signal adjustment, for detecting different kinds of defects, for compensating for structures, markings, edges
    • G01N2021/8854Grading and classifying of flaws
    • GPHYSICS
    • G01MEASURING; TESTING
    • G01NINVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N21/00Investigating or analysing materials by the use of optical means, i.e. using sub-millimetre waves, infrared, visible or ultraviolet light
    • G01N21/84Systems specially adapted for particular applications
    • G01N21/88Investigating the presence of flaws or contamination
    • G01N21/8851Scan or image signal processing specially adapted therefor, e.g. for scan signal adjustment, for detecting different kinds of defects, for compensating for structures, markings, edges
    • G01N2021/8887Scan or image signal processing specially adapted therefor, e.g. for scan signal adjustment, for detecting different kinds of defects, for compensating for structures, markings, edges based on image processing techniques
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10028Range image; Depth image; 3D point clouds
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • G06T2207/30108Industrial image inspection
    • G06T2207/30164Workpiece; Machine component
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02PCLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P90/00Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P90/30Computing systems specially adapted for manufacturing

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Software Systems (AREA)
  • Multimedia (AREA)
  • Computing Systems (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Medical Informatics (AREA)
  • Databases & Information Systems (AREA)
  • Mathematical Physics (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Signal Processing (AREA)
  • Chemical & Material Sciences (AREA)
  • Analytical Chemistry (AREA)
  • Biochemistry (AREA)
  • Immunology (AREA)
  • Pathology (AREA)
  • Probability & Statistics with Applications (AREA)
  • Quality & Reliability (AREA)
  • Investigating Materials By The Use Of Optical Means Adapted For Particular Applications (AREA)

Abstract

The invention discloses a product defect detection method, system, device and storage medium. The method comprises the following steps: acquiring point cloud data, audio data and image data of a product to be detected; inputting the point cloud data, the audio data and the image data into a first defect detection model to obtain a first prediction result, the first prediction result representing the probability that the product to be detected is a defective product based on multimode data of the product to be detected; inputting the point cloud data into a second defect detection model to obtain a second prediction result; inputting the image data into a third defect detection model to obtain a third prediction result; and weighting the first prediction result, the second prediction result and the third prediction result to obtain a detection classification result of the product to be detected, the detection classification result indicating whether the product to be detected is a defective product or a normal product. The embodiment of the invention improves the detection stability of product defect detection and the precision of the detection result, and can be widely applied to the technical field of defect detection.

Description

Product defect detection method, system, device and storage medium
Technical Field
The invention relates to the technical field of defect detection, in particular to a method, a system, a device and a storage medium for detecting product defects.
Background
Industrial anomaly detection aims at finding abnormal areas of products and plays an important role in industrial quality inspection. In industrial scenes, most industrial anomaly detection methods are unsupervised, and only the test examples are examined during detection. In addition, most existing industrial anomaly detection methods are based on two-dimensional image detection. However, in quality inspection of industrial products, human inspectors use both three-dimensional shape and color characteristics to determine whether a product is defective. Therefore, the detection stability and detection precision of product anomaly detection methods in the related art are poor and cannot meet industrial requirements.
Disclosure of Invention
The present invention aims to solve at least one of the technical problems existing in the prior art to a certain extent.
Therefore, the invention aims to provide a high-precision product defect detection method, system, device and storage medium.
In order to achieve the technical purpose, the technical scheme adopted by the embodiment of the invention comprises the following steps:
in one aspect, an embodiment of the present invention provides a method for detecting a product defect, including the following steps:
The product defect detection method of the embodiment of the invention comprises the following steps: acquiring point cloud data, audio data and image data of a product to be detected; inputting the point cloud data, the audio data and the image data into a first defect detection model to obtain a first prediction result; the first prediction result is used for representing the probability that the product to be detected belongs to a defective product based on multimode data of the product to be detected; inputting the point cloud data into a second defect detection model to obtain a second prediction result; the second prediction result is used for representing the probability that the product to be detected belongs to a defective product based on the point cloud data of the product to be detected; inputting the image data into a third defect detection model to obtain a third prediction result; the third prediction result is used for representing the probability that the product to be detected belongs to a defective product based on the image data of the product to be detected; weighting the first prediction result, the second prediction result and the third prediction result to obtain a detection classification result of the product to be detected; the detection classification result is used for representing that the product to be detected belongs to a defective product or a normal product. According to the embodiment of the invention, the product is predicted based on the multidimensional data of the product to be detected through the first defect detection model; predicting the product based on the point cloud data of the product to be detected through a second defect detection model; and simultaneously, predicting the product based on the image data of the product to be detected through a third defect detection model. The product to be detected is classified by integrating the prediction results, so that the multidimensional characteristics of the product can be fully considered, and the detection stability of product defect detection and the precision of the detection result are improved.
In addition, the product defect detection method according to the above embodiment of the present invention may further have the following additional technical features:
further, the method for detecting the product defects in the embodiment of the invention further comprises the following steps:
acquiring a data stream of a product to be detected, wherein the data stream comprises a plurality of data frames;
Acquiring point cloud data, audio data and image data of each data frame, and determining a detection classification result corresponding to each data frame according to the point cloud data, the audio data and the image data of each data frame;
if a first count of consecutive data frames have the same detection classification result, determining the detection classification result of the product to be detected; the first count characterizes the number of consecutive data frames detected as having the same detection classification result.
Further, in an embodiment of the present invention, the step of inputting the point cloud data, the audio data, and the image data into a first defect detection model to obtain a first prediction result includes:
inputting the point cloud data, the audio data and the image data into a feature extractor to obtain point cloud feature data, audio feature data and image feature data; the first defect detection model includes the feature extractor;
Inputting the image characteristic data into a first task network to obtain a first score; the first score is used for representing a defect score based on the image characteristic data;
Inputting the point cloud characteristic data into a second task network to obtain a second score; the second score is used for representing a defect score based on the point cloud characteristic data;
performing feature fusion on the point cloud feature data, the audio feature data and the image feature data to obtain fusion feature data;
inputting the fusion characteristic data into a multi-mode characteristic network to obtain a third score; the third score is used for representing a defect score based on the fusion characteristic data;
Inputting the audio characteristic data into a third task network to obtain a fourth score; the fourth score is used for representing a defect score based on the audio feature data; each of the first task network, the second task network and the third task network is a sub-model having a head task network structure;
And inputting the first score, the second score, the third score and the fourth score into a defect detection sub-model to obtain a first prediction result.
Further, in one embodiment of the present invention, the method further comprises the steps of:
Sample point cloud data, sample audio data, sample image data and corresponding state labels of a batch of sample products are obtained; the state label is used for representing the real result of whether the sample product is a defective product;
Inputting the sample point cloud data, the sample audio data and the sample image data into an initialized first defect detection model, and processing the sample point cloud data, the sample audio data and the sample image data through the first defect detection model to obtain a fourth prediction result; the fourth prediction result is used for representing whether the sample product belongs to a defective product or not;
Determining a total loss value of training according to the fourth prediction result and the state label;
And updating parameters of the first defect detection model according to the total loss value to obtain a trained first defect detection model.
Further, in an embodiment of the present invention, the step of performing feature fusion on the point cloud feature data, the audio feature data, and the image feature data to obtain fused feature data includes:
inputting the point cloud characteristic data, the audio characteristic data and the image characteristic data into a multi-layer perceptron to extract interaction characteristic data;
Processing the interactive feature data through a full connection layer to obtain feature mapping data;
and carrying out feature fusion processing on the feature mapping data to obtain fusion feature data.
Further, in an embodiment of the present invention, the step of weighting the first prediction result, the second prediction result, and the third prediction result to obtain a detection classification result of the product to be detected includes:
And processing the first prediction result, the second prediction result and the third prediction result through a support vector machine to obtain a detection classification result of the product to be detected.
Further, in one embodiment of the present invention, the method further comprises:
Preprocessing the image data to obtain processed image data; the processed image data is used for representing an image with a preset size;
Performing registration interpolation processing on the point cloud data to obtain processed point cloud data; wherein the processed point cloud data and the processed image data have the same size;
And carrying out Mel frequency cepstrum coefficient extraction processing on the audio data to obtain processed audio data.
In another aspect, an embodiment of the present invention provides a product defect detection system, including:
the first module is used for acquiring point cloud data, audio data and image data of a product to be detected;
The second module is used for inputting the point cloud data, the audio data and the image data into a first defect detection model to obtain a first prediction result; the first prediction result is used for representing the probability that the product to be detected belongs to a defective product based on multimode data of the product to be detected;
the third module is used for inputting the point cloud data into a second defect detection model to obtain a second prediction result; the second prediction result is used for representing the probability that the product to be detected belongs to a defective product based on the point cloud data of the product to be detected;
A fourth module, configured to input the image data into a third defect detection model, to obtain a third prediction result; the third prediction result is used for representing the probability that the product to be detected belongs to a defective product based on the image data of the product to be detected;
A fifth module, configured to perform weighting processing on the first prediction result, the second prediction result, and the third prediction result, to obtain a detection classification result of the product to be detected; the detection classification result is used for representing that the product to be detected belongs to a defective product or a normal product.
In another aspect, an embodiment of the present invention provides a product defect detecting apparatus, including:
At least one processor;
At least one memory for storing at least one program;
the at least one program, when executed by the at least one processor, causes the at least one processor to implement any of the product defect detection methods described above.
In another aspect, an embodiment of the present invention provides a storage medium in which a processor-executable program is stored, which when executed by a processor is configured to implement any one of the product defect detection methods described above.
According to the embodiment of the invention, the product is predicted based on the multidimensional data of the product to be detected through the first defect detection model; predicting the product based on the point cloud data of the product to be detected through a second defect detection model; and simultaneously, predicting the product based on the image data of the product to be detected through a third defect detection model. The point cloud data represents the three-dimensional shape of the product, and the image data represents the color and other information of the product. The product to be detected is classified by integrating the prediction results, so that the multidimensional characteristics, the shapes and the color characteristics of the product can be fully considered, and the detection stability of the defect detection of the product and the precision of the detection result can be improved.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the following description refers to the accompanying drawings of the embodiments of the present invention or of the related prior art. It should be understood that the drawings in the following description only illustrate some embodiments of the technical solutions of the present invention, and those skilled in the art may obtain other drawings from these drawings without inventive labor.
FIG. 1 is a schematic flow chart of an embodiment of a method for detecting product defects according to the present invention;
FIG. 2 is a flow chart illustrating another embodiment of a method for detecting product defects according to the present invention;
FIG. 3 is a flow chart of one embodiment of feature fusion provided by the present invention;
FIG. 4 is a schematic flow chart of the second defect detection model provided by an embodiment of the present invention;
FIG. 5 is a schematic flow chart of the third defect detection model provided by an embodiment of the present invention;
FIG. 6 is a schematic diagram illustrating a product defect detection system according to an embodiment of the present invention;
Fig. 7 is a schematic structural diagram of an embodiment of a product defect detecting device according to the present invention.
Detailed Description
Embodiments of the present invention are described in detail below, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to like or similar elements or elements having like or similar functions throughout. The embodiments described below by referring to the drawings are illustrative only and are not to be construed as limiting the invention. The step numbers in the following embodiments are set for convenience of illustration only, and the order between the steps is not limited in any way, and the execution order of the steps in the embodiments may be adaptively adjusted according to the understanding of those skilled in the art.
Industrial anomaly detection aims at finding abnormal areas of products and plays an important role in industrial quality inspection. In an industrial scenario, a large number of normal examples are readily available, but defect examples are scarce. Most existing industrial anomaly detection methods are unsupervised: they are trained only on normal examples, and the test examples are only seen at inference time. In addition, most existing industrial anomaly detection methods are based on two-dimensional images. However, in quality inspection of industrial products, human inspectors use both three-dimensional shape and color characteristics to determine whether a product is defective; three-dimensional information is necessary and closely related to product defect detection. In the related art, two-dimensional industrial anomaly detection has been widely used, but multi-modal industrial anomaly detection based on three-dimensional point clouds, RGB images and audio signals remains largely unexplored. Existing multi-modal industrial anomaly detection methods generally concatenate the multi-modal features directly, which causes strong interference between the features and harms the stability and accuracy of defect detection.
It will be appreciated that the core idea of unsupervised anomaly detection is to find the distinction between anomalous representations and normal representations. Current two-dimensional industrial anomaly detection methods can be divided into two categories: (1) reconstruction-based methods. The image reconstruction task is widely applied in anomaly detection methods to learn normal representations. For a single-modality input (a two-dimensional image or a three-dimensional point cloud), a reconstruction-based method is easy to implement, but for multi-modal inputs it is difficult to find a reconstruction target. (2) Methods based on a pre-trained feature extractor. One intuitive way to utilize a feature extractor is to map the extracted features to a normal distribution and treat out-of-distribution features as anomalies. A common method is to construct the normal distribution directly from transformed feature information; in the method provided by the embodiment of the invention, some representative features are stored to implicitly construct the feature distribution. Compared with a reconstruction-based method, this approach directly uses a pre-trained feature extractor, does not involve designing a multi-modal reconstruction target, and is a better choice for multi-modal tasks. Current multi-modal industrial anomaly detection methods directly concatenate the features of the modalities. However, when the feature dimension is high, interference between multi-modal features can be severe and result in performance degradation.
In order to solve these problems, the embodiment of the invention provides a multi-modal anomaly detection scheme based on RGB images, three-dimensional point clouds and audio signals. Unlike existing methods that directly concatenate the features of two modalities, the embodiment of the invention proposes a hybrid fusion scheme to reduce interference between multi-modal features and encourage feature interaction. The embodiment of the invention provides unsupervised feature fusion for fusing multi-modal features, trained with a contrastive loss to learn the internal relation between multi-modal feature blocks at the same position. For the final decision, decision-layer fusion is constructed so that all modalities are considered for anomaly detection and segmentation.
It can be understood that anomaly detection needs features carrying both global and local information: local information helps detect small defects, while global information captures the relationships between all parts. The embodiment of the invention therefore introduces a point feature alignment operation to better align three-dimensional and two-dimensional features and improve the accuracy of product defect detection.
The method, system, apparatus and storage medium for detecting a product defect according to an embodiment of the present invention will be described in detail with reference to the accompanying drawings, and first, the method for detecting a product defect according to an embodiment of the present invention will be described with reference to the accompanying drawings.
Referring to fig. 1, a product defect detection method is provided in an embodiment of the present invention. The product defect detection method in the embodiment of the present invention may be applied to a terminal, a server, software running in the terminal or the server, and the like. The terminal may be, but is not limited to, a tablet computer, a notebook computer, a desktop computer, etc. The server may be an independent physical server, a server cluster or a distributed system formed by a plurality of physical servers, or a cloud server providing cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, CDNs, and basic cloud computing services such as big data and artificial intelligence platforms. The product defect detection method in the embodiment of the invention mainly comprises the following steps:
S100: acquiring point cloud data, audio data and image data of a product to be detected;
S200: inputting the point cloud data, the audio data and the image data into a first defect detection model to obtain a first prediction result; the first prediction result is used for representing the probability that the product to be detected belongs to the defective product based on multimode data of the product to be detected;
S300: inputting the point cloud data into a second defect detection model to obtain a second prediction result; the second prediction result is used for representing the probability that the product to be detected belongs to the defective product based on the point cloud data of the product to be detected;
S400: inputting the image data into a third defect detection model to obtain a third prediction result; the third prediction result is used for representing the probability that the product to be detected belongs to the defective product based on the image data of the product to be detected;
S500: weighting the first prediction result, the second prediction result and the third prediction result to obtain a detection classification result of the product to be detected; the detection classification result is used for representing that the product to be detected belongs to a defective product or a normal product; the detection classification result is used for representing the defect position of the defect product.
In some possible implementations, referring to fig. 2, the first defect detection model, the second defect detection model, and the third defect detection model in the embodiment of the present invention are trained models. The first defect detection model determines a first prediction result based on multimode data of the product to be detected; the second defect detection model determines a second prediction result based on the three-dimensional point cloud data of the product to be detected; and the third defect detection model determines a third prediction result based on the two-dimensional image data of the product to be detected. The three prediction results are combined to determine the classification of the product to be detected.
In the embodiment of the invention, for the training process of the first defect detection model, normal/defective products are collected and their defect areas are annotated to obtain two-dimensional and three-dimensional defect data sets, and the audio data is labelled to obtain an audio classification data set. Training is carried out with the multi-modal convolutional neural network until convergence, yielding a multi-modal defect detection model (namely the first defect detection model). For localizing defect positions on defective products, the model training process may be as follows: target scene pictures are collected, data frames are annotated, and a segmentation data set is built from the inferred defect regions. In addition, a certain number of defect samples are selected, the defect areas are cut out, their size, orientation, color saturation and the like are randomly changed (including noise data generated by spatial data augmentation and convolutional neural networks such as GANs), and the results are pasted onto defect-free normal samples to synthesize a new data set. These two parts of data form a segmentation data set for defect detection, which can be used to predict the location of the defective part of a defective product. Training with a multi-modal convolutional neural network until convergence yields a part defect detection model (namely the first defect detection model).
In the embodiment of the invention, the second defect detection model performs three-dimensional point cloud detection and the third defect detection model performs two-dimensional visual detection. When the second defect detection model and the third defect detection model are trained, rule-based judgment is carried out through the related algorithms, and hyper-parameter tuning is performed automatically until the generalization capability of the models is optimal, so that the trained second defect detection model and the trained third defect detection model are obtained.
In the embodiment of the invention, referring to fig. 3, when a defective product is predicted, a data frame is preprocessed into several different input specifications and input into the trained convolutional neural network to obtain the output scores S_rgb, S_pt, S_fq and S_pts (namely the first score, the second score, the third score and the fourth score, intermediate results of the first defect detection model). All of these outputs are combined and trained to obtain a multi-modal model (namely the defect detection sub-model, a sub-model belonging to the first defect detection model). The data frame is also processed and judged by the two-dimensional and three-dimensional modelling algorithms (namely the second defect detection model and the third defect detection model), giving the scores ψγ, ψa and ψb respectively (namely the first prediction result, the second prediction result and the third prediction result). The multi-modal model and the defect scores obtained by the traditional algorithms are then combined in a weighted judgment Y = w1·ψγ + w2·ψa + w3·ψb, the defect area of the product is automatically inferred and segmented, and the position coordinate information of the defect area (center point and length, width and height: x, y, z, w, h, l) is output.
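The weighted judgment above reduces to a few lines. A minimal sketch follows, assuming the weights w1, w2, w3 and the 0.5 decision threshold are tuning parameters chosen elsewhere (the patent does not fix their values):

```python
# Weighted decision Y = w1*psi_gamma + w2*psi_a + w3*psi_b over the three prediction results.
# The weight values and the threshold below are illustrative placeholders.
def fuse_predictions(psi_gamma: float, psi_a: float, psi_b: float,
                     w1: float = 0.5, w2: float = 0.25, w3: float = 0.25,
                     threshold: float = 0.5):
    y = w1 * psi_gamma + w2 * psi_a + w3 * psi_b   # combined defect score
    return y, y >= threshold                        # True -> classified as defective

score, is_defective = fuse_predictions(0.82, 0.40, 0.71)
```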
Specifically, in one embodiment, data is pulled in real time from an industrial production scene as a video stream or the like, normal objects and defective objects are detected and classified by the first defect detection model, the second defect detection model and the third defect detection model, and weighted voting is performed on the results of the three models to decide whether the product is defective. If no defect is detected, the frame is judged to contain no defect and counted accordingly; if a defect is detected, the object area is extracted from the original data frame. Meanwhile, the defect classification model is applied, unreasonable product defect areas are removed according to prior information on the relative sizes and relative positions of different defect types, and the remaining defects are judged to be object defects under the following two conditions: (1) the regions of the normal product and the defective product in the data frame do not intersect; (2) the classified defect areas of the product contain defects whose characteristics are hard to find (such as scratches or color differences).
The technical scheme provided by the embodiment of the invention is as follows: collect and label the product images; collect defect images and label the different types of defects in them; cut defect areas out of the images, randomly transform them, paste them onto background images, and compute the corresponding labels; train the first defect detection model using a multi-modal convolutional neural network; use the second defect detection model and the third defect detection model to process the visual and point cloud data, selecting the best judgment hyper-parameters for the different defect classification models; and judge whether defects exist, and count them, from the prediction results of the three defect detection models through an adaptive voting algorithm. This approach addresses the low accuracy and poor robustness of product defect detection in existing edge-side industrial production scenes. Compared with existing methods, the method provided by the embodiment of the invention is easy to implement; it copes with complex scenes where object sizes and shapes vary widely, some defects are small, and accuracy is affected by information dimensions such as depth; and it uses several models with one set of adaptively adjustable parameters to complete the product defect detection task with high accuracy and robustness. It can easily be generalized to various edge application scenarios.
Optionally, in an embodiment of the present invention, the step of inputting the point cloud data into the second defect detection model to obtain a second prediction result further includes:
the second defect detection model preprocesses the point cloud data to obtain preprocessed point cloud data;
The second defect detection model performs feature extraction on the preprocessed point cloud data to obtain first data;
And performing defect prediction on the first data by the second defect detection model to obtain a second prediction result.
In some possible implementations, referring to fig. 4, the training process of the second defect detection model may be as follows. Three-dimensional point cloud data in STL format (a commonly used three-dimensional model data format supporting mesh data types) is input. Point cloud preprocessing: operations such as point cloud denoising and point cloud filtering are applied, and common characteristics such as curvature, normal vectors and surface roughness are computed to improve the effect of subsequent processing. Feature extraction: the point cloud is converted into three-dimensional data from which useful features such as volumetric shape, contour and cavity noise are extracted. Feature selection: the most representative features are selected to reduce redundant information. Parameter tuning and model building: defect rules, such as volume, shape size, flatness and concave holes, are used together with prior data to perform hyper-parameter tuning and fit the defect detection model, which outputs whether the product is defective and all the point cloud coordinates x, y, z of the defect.
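A minimal sketch of such a rule-based point-cloud branch is shown below. Open3D is an assumption (the patent names no library), and the denoising parameters and defect thresholds are placeholders:

```python
# Point-cloud branch sketch: load STL mesh, sample points, denoise, estimate normals,
# then apply simple rule checks. Library choice and thresholds are assumptions.
import numpy as np
import open3d as o3d

def point_cloud_defect_check(stl_path: str) -> bool:
    mesh = o3d.io.read_triangle_mesh(stl_path)
    pcd = mesh.sample_points_uniformly(number_of_points=20000)

    # Denoising / filtering: drop statistical outliers.
    pcd, _ = pcd.remove_statistical_outlier(nb_neighbors=20, std_ratio=2.0)

    # Normal vectors serve as a crude proxy for surface roughness / flatness.
    pcd.estimate_normals(
        search_param=o3d.geometry.KDTreeSearchParamHybrid(radius=1.0, max_nn=30))

    pts = np.asarray(pcd.points)
    extent = pts.max(axis=0) - pts.min(axis=0)       # bounding-box size (volume shape)
    flatness = 1.0 - np.abs(np.asarray(pcd.normals)[:, 2]).mean()

    # Rule-based judgement with placeholder thresholds.
    return bool(extent.prod() > 1.2e6 or flatness > 0.3)
```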
In some possible embodiments, referring to fig. 5, the training process of the third defect detection model may be as follows. The input RGB image is converted into a corresponding gray-scale image with a size of 1920 x 1080 (an illustrative example; those skilled in the art may adapt it to the product surface size, and the invention is not limited in this regard). Image preprocessing: including image denoising, image enhancement, etc. Feature extraction: useful features such as texture, color and shape are extracted from the image. Feature selection: the most representative features are selected to reduce redundant information. Parameter tuning and model building: using the labelled data set and specific rules such as the size, shape and area of a defect, the model judges whether a defect is present and outputs its x and y coordinates.
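A minimal sketch of the image branch, with OpenCV for preprocessing and Otsu thresholding plus a contour-area rule standing in for the rule-based judgment; the blur kernel, threshold mode and minimum area are illustrative assumptions:

```python
import cv2

def image_defect_check(bgr_image, min_area: float = 50.0):
    gray = cv2.cvtColor(bgr_image, cv2.COLOR_BGR2GRAY)      # grayscale conversion
    gray = cv2.resize(gray, (1920, 1080))                    # size from the description
    denoised = cv2.GaussianBlur(gray, (5, 5), 0)             # image denoising
    _, mask = cv2.threshold(denoised, 0, 255,
                            cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    # Candidate defect regions passing the area rule, reported as (x, y) coordinates.
    defects = [cv2.boundingRect(c)[:2] for c in contours if cv2.contourArea(c) > min_area]
    return len(defects) > 0, defects
```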
Optionally, in one embodiment of the present invention, the method further comprises:
acquiring a data stream of a product to be detected, wherein the data stream comprises a plurality of data frames;
acquiring point cloud data, audio data and image data of each data frame, and determining a detection classification result corresponding to each data frame according to the point cloud data, the audio data and the image data of each data frame;
if a first count of consecutive data frames have the same detection classification result, determining the detection classification result of the product to be detected; the first count characterizes the number of consecutive data frames detected as having the same detection classification result.
In some possible embodiments, the point cloud data, the audio data and the image data of the product to be tested may be derived from the data frames, defect detection is performed on a plurality of consecutive data frames, and the final detection classification result is determined according to the following logic: if t consecutive frames (i.e., the first count) within a data stream of s frames are judged to give a consistent result, the product defect count is judged to be valid and the system assigns a stable count value. For example, in the embodiment of the invention, a preset count may be set, and if the first count is greater than the preset count, the detection classification result of the product to be detected may be determined.
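A minimal sketch of this frame-stabilisation rule in pure Python; the value of t is an assumed preset count:

```python
def stabilised_result(frame_results, t: int = 5):
    """Return the detection classification once t consecutive frames agree, else None."""
    run, current = 0, None
    for result in frame_results:              # e.g. "defective" or "normal" per data frame
        if result == current:
            run += 1
        else:
            current, run = result, 1
        if run >= t:
            return current                    # stable result: commit the classification
    return None                               # not yet stable; keep reading the stream

print(stabilised_result(["normal"] + ["defective"] * 5))   # -> "defective"
```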
Optionally, in one embodiment of the present invention, the step of inputting the point cloud data, the audio data, and the image data into the first defect detection model to obtain the first prediction result includes:
Inputting the point cloud data, the audio data and the image data into a feature extractor to obtain point cloud feature data, audio feature data and image feature data; the first defect detection model includes a feature extractor;
Inputting the image characteristic data into a first task network to obtain a first score; the first score is used for representing a defect score based on the image characteristic data;
inputting the point cloud characteristic data into a second task network to obtain a second score; the second score is used for representing a defect score based on the point cloud characteristic data;
performing feature fusion on the point cloud feature data, the audio feature data and the image feature data to obtain fusion feature data;
inputting the fusion characteristic data into a multi-mode characteristic network to obtain a third score; the third score is used for characterizing a defect score based on the fused feature data;
Inputting the audio characteristic data into a third task network to obtain a fourth score; the fourth score is used for representing the defect score based on the audio feature data; each of the first task network, the second task network and the third task network is a sub-model having a head task network structure;
Inputting the first score, the second score, the third score and the fourth score into a defect detection sub-model to obtain a first prediction result.
In some possible embodiments, referring to fig. 2, a vision transformer may be used as the multi-modal backbone network, with point cloud data and RGB pictures (i.e., image data) input during training. A two-dimensional image, a three-dimensional point cloud and an audio signal are taken as input, and a pre-trained backbone network is taken as the feature extractor, which first extracts features from the data. Specifically, the feature extractor is trained in advance for feature extraction. Illustratively, ConvNeXt may be selected for feature extraction from the image data, PointNet for feature extraction from the point cloud data, and a convolutional neural network for feature extraction from the audio data. Two-dimensional feature representations, 3D feature representations and audio features are extracted separately by the feature extractor. Based on each kind of feature, defect prediction is carried out on the product, yielding three scores (namely the first score, the second score and the fourth score). Meanwhile, the three kinds of feature data are fused and defect prediction is carried out on the fused features to obtain the third score. The four scores are then combined to perform defect prediction on the product, mitigating the low prediction precision that arises when features are simply concatenated.
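A minimal PyTorch sketch of this scoring path, with the pre-trained backbones replaced by already-extracted feature vectors and all layer sizes chosen arbitrarily (none of these dimensions come from the patent):

```python
# Per-modality features go to task heads (first/second/fourth scores), fused features go
# to the multi-modal head (third score), and a small sub-model combines the four scores.
import torch
import torch.nn as nn

class FirstDefectModel(nn.Module):
    def __init__(self, d_rgb=256, d_pt=256, d_fq=128):
        super().__init__()
        self.head_rgb = nn.Sequential(nn.Linear(d_rgb, 64), nn.ReLU(), nn.Linear(64, 1))
        self.head_pt = nn.Sequential(nn.Linear(d_pt, 64), nn.ReLU(), nn.Linear(64, 1))
        self.head_fq = nn.Sequential(nn.Linear(d_fq, 64), nn.ReLU(), nn.Linear(64, 1))
        self.head_fused = nn.Sequential(nn.Linear(d_rgb + d_pt + d_fq, 64),
                                        nn.ReLU(), nn.Linear(64, 1))
        self.sub_model = nn.Linear(4, 1)          # defect detection sub-model over 4 scores

    def forward(self, f_rgb, f_pt, f_fq):
        s1 = self.head_rgb(f_rgb)                 # first score  (image features)
        s2 = self.head_pt(f_pt)                   # second score (point cloud features)
        s4 = self.head_fq(f_fq)                   # fourth score (audio features)
        fused = torch.cat([f_rgb, f_pt, f_fq], dim=-1)
        s3 = self.head_fused(fused)               # third score  (fused features)
        scores = torch.cat([s1, s2, s3, s4], dim=-1)
        return torch.sigmoid(self.sub_model(scores))   # first prediction result (probability)

model = FirstDefectModel()
p = model(torch.randn(1, 256), torch.randn(1, 256), torch.randn(1, 128))
```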
The defect detection sub-model can output a position prediction for the defective product according to the defect position information contained in the four scores.
Optionally, in one embodiment of the present invention, the method further comprises:
Sample point cloud data, sample audio data, sample image data and corresponding state labels of a batch of sample products are obtained; the state label is used for representing the real result of whether the sample product is a defective product;
Inputting sample point cloud data, sample audio data and sample image data into an initialized first defect detection model, and processing the sample point cloud data, the sample audio data and the sample image data through the first defect detection model to obtain a fourth prediction result; the fourth prediction result is used for representing whether the sample product belongs to a defective product or not;
Determining a total loss value of training according to the fourth prediction result and the state label;
And updating parameters of the first defect detection model according to the total loss value to obtain a trained first defect detection model.
Optionally, in one embodiment of the present invention, the method further comprises:
sample point cloud data, sample audio data and sample image data of a batch sample product are obtained as a first sample set;
Obtaining batch of defect sample products, selecting a defect region in the defect sample products, preprocessing the defect region, and combining the defect region with a normal region to form a second sample set;
combining the first sample set and the second sample set to form a third sample set;
Training the first defect detection model according to the first sample set to obtain a pre-trained first defect detection model;
And fine tuning the pre-trained first defect detection model according to the third sample set to obtain a trained first defect detection model.
In some possible embodiments, the training process of the first defect detection model may be as follows. First, data set creation is performed: product defect/normal pictures, point cloud data and audio signals are collected; labeling standards and formats for semantic segmentation, point cloud segmentation and audio classification are formulated; labeling personnel follow a standard workflow; and the images and labels constitute data set one (i.e., the first sample set). Next, a certain number of defect samples are selected, their defect areas are cut out and randomly transformed (shape and other characteristics), and pasted onto the defect-free normal data to synthesize a new data set; labels for the defect positions in the new data are computed, and this data and its labels form data set two (namely the second sample set). Data set one and data set two are combined to form data set three (i.e., the third sample set). The first defect detection model is trained on data set one until it converges and then fine-tuned on data set three to obtain the trained first defect detection model.
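A minimal sketch of the cut-and-paste synthesis used for data set two, assuming OpenCV/NumPy and assuming the transformed patch fits inside the normal image; the scale and rotation ranges are placeholders:

```python
import random
import numpy as np
import cv2

def paste_defect(normal_img: np.ndarray, defect_img: np.ndarray, defect_mask: np.ndarray):
    ys, xs = np.where(defect_mask > 0)
    patch = defect_img[ys.min():ys.max() + 1, xs.min():xs.max() + 1].copy()
    patch_mask = defect_mask[ys.min():ys.max() + 1, xs.min():xs.max() + 1].copy()

    # Random size / orientation changes (color jitter omitted for brevity).
    scale = random.uniform(0.5, 1.5)
    patch = cv2.resize(patch, None, fx=scale, fy=scale)
    patch_mask = cv2.resize(patch_mask, None, fx=scale, fy=scale,
                            interpolation=cv2.INTER_NEAREST)
    k = random.randint(0, 3)
    patch, patch_mask = np.rot90(patch, k), np.rot90(patch_mask, k)

    out, out_mask = normal_img.copy(), np.zeros(normal_img.shape[:2], np.uint8)
    h, w = patch_mask.shape[:2]
    y0 = random.randint(0, out.shape[0] - h)
    x0 = random.randint(0, out.shape[1] - w)
    region = out[y0:y0 + h, x0:x0 + w]
    region[patch_mask > 0] = patch[patch_mask > 0]
    out_mask[y0:y0 + h, x0:x0 + w] = (patch_mask > 0).astype(np.uint8)
    return out, out_mask           # synthetic defect image and its segmentation label
```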
Optionally, in one embodiment of the present invention, the step of performing feature fusion on the point cloud feature data, the audio feature data and the image feature data to obtain fused feature data includes:
inputting the point cloud characteristic data, the audio characteristic data and the image characteristic data into a multi-layer perceptron, and extracting interactive characteristic data;
Processing the interactive feature data through a full connection layer to obtain feature mapping data;
And carrying out feature fusion processing on the feature mapping data to obtain fusion feature data.
In some possible implementations, the embodiment of the invention provides an unsupervised feature fusion module and designs a contrastive loss to train it. Referring to FIG. 3, given RGB features f_rgb, point cloud features f_pt and audio features f_fq, features of different modalities at the same location are encouraged to share more corresponding information, while features at different locations share less. Therefore, the embodiment of the invention uses multi-layer perceptron (MLP) layers {M_rgb, M_pt, M_fq} to extract the interaction information among the three modalities, uses fully connected layers {β_rgb, β_pt, β_fq} to map the processed features to a low-dimensional space, and concatenates (concat) these features. Through this feature fusion, the low detection precision that arises when features are simply combined for prediction is mitigated.
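A minimal PyTorch sketch of this fusion module; the feature dimensions, MLP depths and the InfoNCE-style form of the contrastive loss are assumptions, since the patent only specifies MLPs {M_rgb, M_pt, M_fq}, fully connected mappings {β_rgb, β_pt, β_fq}, concatenation, and a contrastive loss over same-position features:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FeatureFusion(nn.Module):
    def __init__(self, d_rgb=256, d_pt=256, d_fq=128, d_low=64):
        super().__init__()
        mlp = lambda d: nn.Sequential(nn.Linear(d, d), nn.ReLU(), nn.Linear(d, d))
        self.m_rgb, self.m_pt, self.m_fq = mlp(d_rgb), mlp(d_pt), mlp(d_fq)
        self.beta_rgb = nn.Linear(d_rgb, d_low)   # map to a low-dimensional space
        self.beta_pt = nn.Linear(d_pt, d_low)
        self.beta_fq = nn.Linear(d_fq, d_low)

    def forward(self, f_rgb, f_pt, f_fq):
        z_rgb = self.beta_rgb(self.m_rgb(f_rgb))
        z_pt = self.beta_pt(self.m_pt(f_pt))
        z_fq = self.beta_fq(self.m_fq(f_fq))
        fused = torch.cat([z_rgb, z_pt, z_fq], dim=-1)   # fusion feature data
        return fused, (z_rgb, z_pt, z_fq)

def contrastive_loss(z_a, z_b, temperature=0.07):
    """Features of two modalities at the same position (same row) are positives."""
    z_a, z_b = F.normalize(z_a, dim=-1), F.normalize(z_b, dim=-1)
    logits = z_a @ z_b.t() / temperature
    targets = torch.arange(z_a.size(0), device=z_a.device)
    return F.cross_entropy(logits, targets)
```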
Optionally, in an embodiment of the present invention, the step of weighting the first prediction result, the second prediction result, and the third prediction result to obtain a detection classification result of the product to be detected includes:
and processing the first prediction result, the second prediction result and the third prediction result through a support vector machine to obtain a detection classification result of the product to be detected.
In some possible implementations, the embodiment of the present invention provides decision-layer fusion. Because some defects exhibit anomalies in only a single modality, the partial feature information is extracted and fused using the multi-modal features. Therefore, the color features, position features and fusion features output by the original segmentation, audio, point cloud and multi-modal model structures are processed separately and connected to a simple SVM (support vector machine) or a regression network structure, which finally decides the defect anomaly and segmentation scores ψγ.
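A minimal sketch of the decision-layer fusion with scikit-learn (an assumed library choice; the training rows and labels here are synthetic examples, one row per product with the three prediction results as columns):

```python
import numpy as np
from sklearn.svm import SVC

X_train = np.array([[0.9, 0.8, 0.7], [0.1, 0.2, 0.3], [0.8, 0.6, 0.9], [0.2, 0.1, 0.2]])
y_train = np.array([1, 0, 1, 0])                 # 1 = defective, 0 = normal

svm = SVC(kernel="rbf", probability=True).fit(X_train, y_train)
print(svm.predict([[0.85, 0.7, 0.75]]))          # detection classification result
```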
Optionally, in one embodiment of the present invention, the method further comprises:
Preprocessing the image data to obtain processed image data; the processed image data is used for representing an image with a preset size;
Carrying out registration interpolation processing on the point cloud data to obtain processed point cloud data; the processed point cloud data and the processed image data have the same size;
And carrying out Mel frequency cepstrum coefficient extraction processing on the audio data to obtain the processed audio data.
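A minimal sketch of these three preprocessing steps; librosa is an assumption for the MFCC extraction, the 1920x1080 target size comes from the detailed description below, and the point cloud registration itself is only stubbed:

```python
import cv2
import librosa
import numpy as np

def preprocess_image(bgr_image):
    return cv2.resize(bgr_image, (1920, 1080))            # image with the preset size

def preprocess_audio(y: np.ndarray, sr: int = 22050):
    return librosa.feature.mfcc(y=y, sr=sr, n_mfcc=20)    # MFCC matrix for the network

def preprocess_point_cloud(points: np.ndarray, image_shape=(1080, 1920)):
    # Placeholder: registration/interpolation should resample the cloud so that it
    # aligns pixel-for-pixel with the processed image (same size).
    return np.zeros(image_shape, dtype=np.float32)
```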
In some possible implementations, the present embodiments also provide point cloud feature registration. Specifically, a line-scanning camera extracts data frames during a moving scan, and interpolation sampling over the data stream synthesizes a three-dimensional point cloud feature encoding vector. After interpolation, each point p is projected onto the two-dimensional plane using its coordinates and the camera parameters; the projected point is denoted p_(x,y), and positions on the two-dimensional plane that no sparse point maps to are set to 0. The projection map is represented as {p_(x,y) | (x, y) ∈ D} (where D is the two-dimensional planar area of the RGB image), and it has the same size as the input RGB image. An average pooling operation is then used to obtain patch features on the two-dimensional planar feature map.
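A minimal NumPy sketch of this projection and pooling, assuming pinhole intrinsics (fx, fy, cx, cy) and an 8x8 patch size, neither of which is specified in the patent:

```python
import numpy as np

def project_points(points, fx, fy, cx, cy, h=1080, w=1920):
    """points: (N, 3) camera-frame coordinates with positive depth z."""
    plane = np.zeros((h, w), dtype=np.float32)            # unmatched positions stay 0
    z = points[:, 2]
    u = np.round(points[:, 0] * fx / z + cx).astype(int)
    v = np.round(points[:, 1] * fy / z + cy).astype(int)
    ok = (u >= 0) & (u < w) & (v >= 0) & (v < h)
    plane[v[ok], u[ok]] = z[ok]                            # sparse projection map
    return plane

def patch_features(plane, patch=8):
    h, w = plane.shape                                     # average pooling into patches
    return plane[:h - h % patch, :w - w % patch].reshape(
        h // patch, patch, w // patch, patch).mean(axis=(1, 3))
```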
Specifically, in the embodiment of the invention, the data frame is preprocessed: the input image is resized to 3×1920×1080 and recorded as X; the input 3D point cloud STL data undergoes a series of registration and interpolation operations and is finally aligned with the image data and recorded as L; and continuous audio frame data is input (for example, y being the audio signal samples and sr the sampling rate, with a default of 22050 Hz) and the Mel-frequency cepstral coefficients (MFCC) of the audio are extracted so that the audio is in a format the neural network can train on, recorded as M. The data are fed into the multi-modal neural network, the corresponding features are routed to the audio, image and point cloud tasks, the losses are computed with weights a, b, c, d for the four tasks, i.e. total Loss = a·(Loss1 + DLoss1) + b·(Loss2 + DLoss2) + c·(Loss3 + DLoss3) + d·(Loss4 + DLoss4), the errors are back-propagated and the parameters updated, and the process is repeated until the network converges. In summary, the embodiment of the invention further includes the steps of: acquiring sample point cloud data, sample audio data and sample image data; inputting the sample point cloud data, sample audio data and sample image data into the first defect detection model to obtain a first score, a second score, a third score and a fourth score; obtaining the loss corresponding to each score; determining a comprehensive loss from the losses corresponding to the scores; and performing back propagation according to the comprehensive loss to obtain a converged first defect detection model.
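A minimal sketch of the weighted total loss; the individual task and auxiliary loss values and the weights a, b, c, d below are placeholders:

```python
import torch

def total_loss(task_losses, aux_losses, weights=(1.0, 1.0, 1.0, 1.0)):
    """Loss = a*(Loss1+DLoss1) + b*(Loss2+DLoss2) + c*(Loss3+DLoss3) + d*(Loss4+DLoss4)."""
    return sum(w * (l + d) for w, l, d in zip(weights, task_losses, aux_losses))

loss = total_loss([torch.tensor(0.3)] * 4, [torch.tensor(0.1)] * 4)
# During training the summed loss is back-propagated with loss.backward().
```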
As can be seen from the above, according to the embodiment of the present invention, the product is predicted based on the multi-dimensional data of the product to be detected through the first defect detection model; the product is predicted based on the point cloud data of the product to be detected through the second defect detection model; and at the same time, the product is predicted based on the image data of the product to be detected through the third defect detection model. The point cloud data represent the three-dimensional shape of the product, and the image data represent the color and other information of the product. The product to be detected is classified by integrating the prediction results, so that the multi-dimensional characteristics, shape and color features of the product are fully considered, which improves the stability of product defect detection and the precision of the detection result. Specifically, the multi-modal network architecture provided by the embodiment of the present invention is efficient and easy to train to convergence. The product data set can be acquired on site in real time, and the defect data set can be rapidly generated using automatic labeling and data augmentation, so the number of targets in the data set can be greatly increased. The accuracy is high: the detection of classified defect objects on the product can reach high accuracy. The recall rate and precision for objects can be improved through the synthesized object data set; meanwhile, unreasonable defect detection areas can be removed using the relative size and relative position of the product defects, and false detections are further removed according to the comprehensive analysis of the classified defective objects. The adaptive hyper-parameter voting mechanism can judge more accurately whether a flawed product is a defective product. Further, by continuously collecting image frames for the judgment, interference from background data can be avoided. The detection results are rich and diversified: the defect detection algorithm can judge multiple kinds of defects, locate the positions and types of the defects, and count the defects in real time, and the results can be returned in a personalized way according to actual requirements.
Next, a product defect detection system according to an embodiment of the present invention will be described with reference to fig. 6.
FIG. 6 is a schematic diagram of a product defect detection system according to an embodiment of the present invention, the system specifically includes:
A first module 610, configured to obtain point cloud data, audio data, and image data of a product to be detected;
A second module 620, configured to input the point cloud data, the audio data, and the image data into a first defect detection model, to obtain a first prediction result; the first prediction result is used for representing the probability that the product to be detected belongs to the defective product based on multimode data of the product to be detected;
A third module 630, configured to input the point cloud data into a second defect detection model, to obtain a second prediction result; the second prediction result is used for representing the probability that the product to be detected belongs to the defective product based on the point cloud data of the product to be detected;
A fourth module 640, configured to input the image data into a third defect detection model, to obtain a third prediction result; the third prediction result is used for representing the probability that the product to be detected belongs to the defective product based on the image data of the product to be detected;
A fifth module 650, configured to perform weighting processing on the first prediction result, the second prediction result, and the third prediction result, so as to obtain a detection classification result of the product to be detected; the detection classification result is used for representing that the product to be detected belongs to a defective product or a normal product.
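As a non-limiting illustration, the weighting performed by the fifth module 650 might be sketched as follows; the weights and threshold are illustrative hyper-parameters (the disclosure also contemplates handling this step with a support vector machine).

```python
def classify_product(p1, p2, p3, weights=(0.4, 0.3, 0.3), threshold=0.5):
    """Weight the first, second and third prediction results (defect probabilities
    from the multimodal, point cloud and image models) and classify the product."""
    score = weights[0] * p1 + weights[1] * p2 + weights[2] * p3
    return "defective" if score >= threshold else "normal"

print(classify_product(0.9, 0.7, 0.4))   # -> "defective" with these illustrative weights
```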
It can be seen that the content of the above method embodiment is applicable to this system embodiment; the functions specifically implemented by the system embodiment are the same as those of the method embodiment, and the beneficial effects achieved by the system embodiment are the same as those achieved by the method embodiment.
Referring to fig. 7, an embodiment of the present invention provides a product defect detecting apparatus, including:
at least one processor 710;
At least one memory 720 for storing at least one program;
The at least one program, when executed by the at least one processor 710, causes the at least one processor 710 to implement the product defect detection method.
Similarly, the content of the above method embodiment is applicable to this apparatus embodiment; the functions specifically implemented by the apparatus embodiment are the same as those of the above method embodiment, and the beneficial effects achieved by the apparatus embodiment are the same as those achieved by the above method embodiment.
The embodiment of the present invention also provides a computer-readable storage medium in which a processor-executable program is stored, which when executed by a processor is configured to perform the above-described product defect detection method.
Similarly, the content in the above method embodiment is applicable to the present storage medium embodiment, and the specific functions of the present storage medium embodiment are the same as those of the above method embodiment, and the achieved beneficial effects are the same as those of the above method embodiment.
In some alternative embodiments, the functions/acts noted in the block diagrams may occur out of the order noted in the operational illustrations. For example, two blocks shown in succession may in fact be executed substantially concurrently or the blocks may sometimes be executed in the reverse order, depending upon the functionality/acts involved. Furthermore, the embodiments presented and described in the flowcharts of the present invention are provided by way of example in order to provide a more thorough understanding of the technology. The disclosed methods are not limited to the operations and logic flows presented herein. Alternative embodiments are contemplated in which the order of various operations is changed, and in which sub-operations described as part of a larger operation are performed independently.
Furthermore, while the invention is described in the context of functional modules, it should be appreciated that, unless otherwise indicated, one or more of the functions and/or features may be integrated in a single physical device and/or software module or may be implemented in separate physical devices or software modules. It will also be appreciated that a detailed discussion of the actual implementation of each module is not necessary to an understanding of the present invention. Rather, the actual implementation of the various functional modules in the apparatus disclosed herein will be apparent to those skilled in the art from consideration of their attributes, functions and internal relationships. Accordingly, one of ordinary skill in the art can implement the invention as set forth in the claims without undue experimentation. It is also to be understood that the specific concepts disclosed are merely illustrative and are not intended to be limiting upon the scope of the invention, which is to be defined in the appended claims and their full scope of equivalents.
The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention, in essence or in the part contributing to the prior art, may be embodied in the form of a software product stored in a storage medium and including several programs for causing a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes a USB flash disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, or other media capable of storing program code.
Logic and/or steps represented in the flowcharts or otherwise described herein, for example an ordered listing of executable programs for implementing logical functions, can be embodied in any computer-readable medium for use by or in connection with a program execution system, apparatus, or device, such as a computer-based system, a processor-containing system, or another system that can fetch the programs from the program execution system, apparatus, or device and execute them. For the purposes of this description, a "computer-readable medium" can be any means that can contain, store, communicate, propagate, or transport the program for use by or in connection with the program execution system, apparatus, or device.
More specific examples (a non-exhaustive list) of the computer-readable medium include the following: an electrical connection (electronic device) having one or more wires, a portable computer diskette (magnetic device), a Random Access Memory (RAM), a Read-Only Memory (ROM), an Erasable Programmable Read-Only Memory (EPROM or flash memory), an optical fiber device, and a portable Compact Disc Read-Only Memory (CD-ROM). In addition, the computer-readable medium may even be paper or another suitable medium on which the program is printed, since the program can be electronically captured, for instance via optical scanning of the paper or other medium, then compiled, interpreted or otherwise processed in a suitable manner if necessary, and then stored in a computer memory.
It is to be understood that portions of the present invention may be implemented in hardware, software, firmware, or a combination thereof. In the above embodiments, the various steps or methods may be implemented in software or firmware stored in a memory and executed by a suitable program execution system. For example, if implemented in hardware, as in another embodiment, they may be implemented using any one of, or a combination of, the following techniques well known in the art: discrete logic circuits having logic gates for implementing logic functions on data signals, application-specific integrated circuits having suitable combinational logic gates, programmable gate arrays (PGAs), field-programmable gate arrays (FPGAs), and the like.
In the foregoing description of this specification, reference to the terms "one embodiment/example", "another embodiment/example", "certain embodiments/examples", and the like means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the invention. In this specification, schematic representations of the above terms do not necessarily refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
While embodiments of the present invention have been shown and described, it will be understood by those of ordinary skill in the art that: many changes, modifications, substitutions and variations may be made to the embodiments without departing from the spirit and principles of the invention, the scope of which is defined by the claims and their equivalents.
While the preferred embodiment of the present invention has been described in detail, the present invention is not limited to the embodiments described above, and various equivalent modifications and substitutions can be made by those skilled in the art without departing from the spirit of the present invention, and these equivalent modifications and substitutions are intended to be included in the scope of the present invention as defined in the appended claims.

Claims (10)

1. A method for detecting product defects, comprising the steps of:
Acquiring three-dimensional point cloud data, audio data and image data of a product to be detected;
inputting the three-dimensional point cloud data, the audio data and the image data into a first defect detection model to obtain a first prediction result; the first prediction result is used for representing the probability that the product to be detected belongs to a defective product based on multimode data of the product to be detected;
inputting the three-dimensional point cloud data into a second defect detection model to obtain a second prediction result; the second prediction result is used for representing the probability that the product to be detected belongs to a defective product based on the three-dimensional point cloud data of the product to be detected;
Inputting the image data into a third defect detection model to obtain a third prediction result; the third prediction result is used for representing the probability that the product to be detected belongs to a defective product based on the image data of the product to be detected;
Weighting the first prediction result, the second prediction result and the third prediction result to obtain a detection classification result of the product to be detected; the detection classification result is used for representing that the product to be detected belongs to a defective product or a normal product, and the detection classification result is used for predicting the defect position of the defective product.
2. The method for detecting product defects according to claim 1, wherein the step of acquiring three-dimensional point cloud data, audio data, and image data of the product to be detected comprises:
acquiring a data stream of a product to be detected, wherein the data stream comprises a plurality of data frames;
acquiring three-dimensional point cloud data, audio data and image data of each data frame, and obtaining the three-dimensional point cloud data, the audio data and the image data of the product to be detected according to the three-dimensional point cloud data, the audio data and the image data of each data frame;
the method further comprises the steps of:
determining a detection classification result corresponding to each data frame;
if a continuous first count of data frames have the same detection classification result, determining the detection classification result of the product to be detected; wherein the first count is used to characterize the number of data frames detected as having the same detection classification result.
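By way of a non-limiting illustration only, the consecutive-frame criterion of claim 2 could be sketched as a small Python helper; the function name and the example frame values are hypothetical.

```python
def stable_classification(frame_results, first_count):
    """Return the product-level result once `first_count` consecutive data frames
    share the same detection classification result, otherwise None."""
    run_value, run_length = None, 0
    for result in frame_results:
        run_length = run_length + 1 if result == run_value else 1
        run_value = result
        if run_length >= first_count:
            return run_value
    return None

print(stable_classification(["normal", "defective", "defective", "defective"], 3))  # -> "defective"
```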
3. The method of claim 1, wherein the step of inputting the three-dimensional point cloud data, the audio data, and the image data into a first defect detection model to obtain a first prediction result comprises:
inputting the three-dimensional point cloud data, the audio data and the image data into a feature extractor to obtain point cloud feature data, audio feature data and image feature data; the first defect detection model includes the feature extractor;
Inputting the image characteristic data into a first task network to obtain a first score; the first score is used for representing a defect score based on the image characteristic data;
Inputting the point cloud characteristic data into a second task network to obtain a second score; the second score is used for representing a defect score based on the point cloud characteristic data;
performing feature fusion on the point cloud feature data, the audio feature data and the image feature data to obtain fusion feature data;
inputting the fusion characteristic data into a multi-mode characteristic network to obtain a third score; the third score is used for representing a defect score based on the fusion characteristic data;
Inputting the audio characteristic data into a third task network to obtain a fourth score; the fourth score is used for representing a defect score based on the audio feature data; the first task network, the second task network and the third task network are each a sub-model having a head task network structure;
And inputting the first score, the second score, the third score and the fourth score into a defect detection sub-model to obtain a first prediction result.
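By way of a non-limiting illustration, the wiring described in claim 3 could be sketched in PyTorch as follows; all encoder choices and layer sizes are placeholders, the inputs are assumed to be pre-flattened feature vectors, and only the claimed connectivity (shared feature extraction, three head task networks, a multi-modal feature network over the fused features, and a defect detection sub-model over the four scores) is reflected.

```python
import torch
import torch.nn as nn

class FirstDefectDetectionModel(nn.Module):
    """Connectivity sketch of claim 3 with illustrative placeholder layers."""
    def __init__(self, d_img=128, d_pc=128, d_audio=64):
        super().__init__()
        # Feature extractor: stand-ins for the real image / point cloud / audio encoders.
        self.img_enc   = nn.Sequential(nn.LazyLinear(d_img), nn.ReLU())
        self.pc_enc    = nn.Sequential(nn.LazyLinear(d_pc), nn.ReLU())
        self.audio_enc = nn.Sequential(nn.LazyLinear(d_audio), nn.ReLU())
        # Head task networks producing per-modality defect scores.
        self.img_head   = nn.Linear(d_img, 1)    # first score
        self.pc_head    = nn.Linear(d_pc, 1)     # second score
        self.audio_head = nn.Linear(d_audio, 1)  # fourth score
        # Multi-modal feature network over the fused features (third score).
        self.fusion_head = nn.Sequential(
            nn.Linear(d_img + d_pc + d_audio, 64), nn.ReLU(), nn.Linear(64, 1))
        # Defect detection sub-model over the four scores -> first prediction result.
        self.decision = nn.Linear(4, 1)

    def forward(self, x_img, x_pc, x_audio):
        f_img, f_pc, f_audio = self.img_enc(x_img), self.pc_enc(x_pc), self.audio_enc(x_audio)
        fused = torch.cat([f_img, f_pc, f_audio], dim=-1)
        scores = torch.cat([self.img_head(f_img), self.pc_head(f_pc),
                            self.fusion_head(fused), self.audio_head(f_audio)], dim=-1)
        return torch.sigmoid(self.decision(scores))   # probability of a defective product
```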
4. The method of claim 1, further comprising the steps of:
Sample three-dimensional point cloud data, sample audio data, sample image data and corresponding state labels of batch sample products are obtained; the state label is used for representing whether the sample product belongs to a real result of a defective product or not;
Inputting the sample three-dimensional point cloud data, the sample audio data and the sample image data into an initialized first defect detection model, and processing the sample three-dimensional point cloud data, the sample audio data and the sample image data through the first defect detection model to obtain a fourth prediction result; the fourth prediction result is used for representing whether the sample product belongs to a defective product or not;
Determining a total loss value of training according to the fourth prediction result and the state label;
And updating parameters of the first defect detection model according to the total loss value to obtain a trained first defect detection model.
5. The method for detecting product defects according to claim 3, wherein the step of feature-fusing the point cloud feature data, the audio feature data, and the image feature data to obtain fused feature data comprises:
inputting the point cloud characteristic data, the audio characteristic data and the image characteristic data into a multi-layer perceptron to extract interaction characteristic data;
Processing the interactive feature data through a full connection layer to obtain feature mapping data;
and carrying out feature fusion processing on the feature mapping data to obtain fusion feature data.
6. The method according to claim 1, wherein the step of weighting the first predicted result, the second predicted result, and the third predicted result to obtain the detection classification result of the product to be detected comprises:
And processing the first prediction result, the second prediction result and the third prediction result through a support vector machine to obtain a detection classification result of the product to be detected.
7. The method according to claim 1, wherein before the step of inputting the three-dimensional point cloud data, the audio data, and the image data into a first defect detection model to obtain a first prediction result, the method further comprises:
Preprocessing the image data to obtain processed image data; the processed image data is used for representing an image with a preset size;
Registering and interpolating the three-dimensional point cloud data to obtain processed three-dimensional point cloud data; wherein the processed three-dimensional point cloud data and the processed image data have the same size;
And carrying out Mel frequency cepstrum coefficient extraction processing on the audio data to obtain processed audio data.
8. A product defect detection system, comprising:
the first module is used for acquiring three-dimensional point cloud data, audio data and image data of a product to be detected;
The second module is used for inputting the three-dimensional point cloud data, the audio data and the image data into a first defect detection model to obtain a first prediction result; the first prediction result is used for representing the probability that the product to be detected belongs to a defective product based on multimode data of the product to be detected;
the third module is used for inputting the three-dimensional point cloud data into a second defect detection model to obtain a second prediction result; the second prediction result is used for representing the probability that the product to be detected belongs to a defective product based on the three-dimensional point cloud data of the product to be detected;
A fourth module, configured to input the image data into a third defect detection model, to obtain a third prediction result; the third prediction result is used for representing the probability that the product to be detected belongs to a defective product based on the image data of the product to be detected;
A fifth module, configured to perform weighting processing on the first prediction result, the second prediction result, and the third prediction result, to obtain a detection classification result of the product to be detected; the detection classification result is used for representing that the product to be detected belongs to a defective product or a normal product.
9. A product defect detection apparatus, comprising:
At least one processor;
At least one memory for storing at least one program;
the at least one program, when executed by the at least one processor, causes the at least one processor to implement the product defect detection method of any one of claims 1 to 7.
10. A computer-readable storage medium in which a processor-executable program is stored, characterized in that the processor-executable program is for realizing the product defect detection method according to any one of claims 1 to 7 when being executed by a processor.