CN116958021A - Product defect identification method based on artificial intelligence, related device and medium - Google Patents

Product defect identification method based on artificial intelligence, related device and medium

Info

Publication number
CN116958021A
CN116958021A CN202211433182.2A
Authority
CN
China
Prior art keywords
feature
feature point
defect
feature map
point
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211433182.2A
Other languages
Chinese (zh)
Inventor
吴文龙
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd filed Critical Tencent Technology Shenzhen Co Ltd
Priority to CN202211433182.2A priority Critical patent/CN116958021A/en
Publication of CN116958021A publication Critical patent/CN116958021A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/0002Inspection of images, e.g. flaw detection
    • G06T7/0004Industrial image inspection
    • G06T7/001Industrial image inspection using an image reference approach
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/30Determination of transform parameters for the alignment of images, i.e. image registration
    • G06T7/33Determination of transform parameters for the alignment of images, i.e. image registration using feature-based methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • G06V10/44Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/764Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/774Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10004Still image; Photographic image
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • G06T2207/30108Industrial image inspection
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02PCLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P90/00Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P90/30Computing systems specially adapted for manufacturing

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Multimedia (AREA)
  • Evolutionary Computation (AREA)
  • Computing Systems (AREA)
  • Artificial Intelligence (AREA)
  • Databases & Information Systems (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Software Systems (AREA)
  • Quality & Reliability (AREA)
  • Image Analysis (AREA)

Abstract

The present disclosure provides an artificial intelligence based product defect identification method, related apparatus and medium. The product defect identification method comprises the following steps: acquiring an image to be detected and a reference image of a product; generating a first feature map of the image to be detected and a second feature map of the reference image; determining a calibration matrix based on the first feature map and the second feature map; calibrating the first feature map by using the calibration matrix to obtain a third feature map; and identifying the defect of the product from the third feature map. The embodiments of the disclosure reduce the product defect detection time and improve the detection efficiency. The embodiments of the disclosure can be applied to various scenarios such as industrial automation, artificial intelligence, and the Internet of Things.

Description

Product defect identification method based on artificial intelligence, related device and medium
Technical Field
The present disclosure relates to the field of industrial automation, and in particular, to an artificial intelligence-based product defect identification method, related apparatus, and medium.
Background
In the industrial automation quality inspection process, because industrial parts are complex, the parts are generally photographed from multiple angles, and part defects are then detected from the part images taken at those angles. Because the mechanical arm carrying the camera shakes slightly, the captured image may be jittered. In addition, industrial cameras generally use fixed-focus lenses, so only a small region of interest (ROI) in the photographed image may be clear while the rest is blurred, and defects can therefore be identified only within the ROI. Consequently, after an image of the part is captured, it is generally necessary to calibrate the image first to reduce the above-mentioned problems of jitter and ROI mismatch, and then identify defects from the calibrated image.
In the prior art, the two processes of image calibration and defect identification from the calibrated image are separate: the calibration process must be completed first, and only then is defect identification performed on the calibrated image, which takes a long time.
Disclosure of Invention
The embodiment of the disclosure provides a product defect identification method based on artificial intelligence, a related device and a medium, which can reduce the detection time of product defects and improve the detection efficiency.
According to an aspect of the present disclosure, there is provided an artificial intelligence-based product defect identification method, including:
acquiring an image to be detected and a reference image of a product;
generating a first feature map of the image to be detected and a second feature map of the reference image;
determining a calibration matrix based on the first feature map and the second feature map;
calibrating the first feature map by using the calibration matrix to obtain a third feature map;
and identifying defects of the product from the third feature map.
According to an aspect of the present disclosure, there is provided an artificial intelligence-based product defect recognition apparatus, including:
the acquisition unit is used for acquiring an image to be detected and a reference image of the product;
the generation unit is used for generating a first feature map of the image to be detected and a second feature map of the reference image;
the determining unit is used for determining a calibration matrix based on the first feature map and the second feature map;
the calibration unit is used for calibrating the first feature map by utilizing the calibration matrix to obtain a third feature map;
and the first identification unit is used for identifying the defects of the product from the third feature map.
Optionally, the determining unit is specifically configured to:
obtaining a set of matched feature point pairs, wherein the set of matched feature point pairs comprises a plurality of matched feature point pairs, each matched feature point pair comprises a first feature point and a second feature point, the first feature point is from the first feature map, and the second feature point is from the second feature map and corresponds to the first feature point;
and determining the calibration matrix based on the matched characteristic point pair set.
Optionally, the determining unit is specifically configured to:
acquiring a relevance score matrix based on the first feature map and the second feature map, wherein each feature point in the first feature map corresponds to a column of the relevance score matrix, each feature point in the second feature map corresponds to a row of the relevance score matrix, and each element of the relevance score matrix equals the relevance score between the feature point in the first feature map corresponding to the element's column and the feature point in the second feature map corresponding to the element's row;
selecting a plurality of first feature points from the first feature map;
and for each selected first feature point, acquiring, based on the relevance score matrix, a second feature point corresponding to the first feature point in the second feature map.
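For illustration only, the relevance score matrix described above can be computed as follows. This is a minimal sketch assuming the relevance score of two feature points is the inner product of their feature vectors (the disclosure does not fix a particular similarity measure), with feature maps represented as PyTorch tensors:

```python
import torch

def relevance_score_matrix(first_feature_map, second_feature_map):
    """Relevance score matrix between two feature maps.

    first_feature_map:  tensor of shape (C, H1, W1); its feature points
                        index the columns of the returned matrix.
    second_feature_map: tensor of shape (C, H2, W2); its feature points
                        index the rows of the returned matrix.
    The inner product of the C-dimensional feature vectors is assumed as
    the relevance score.
    """
    c, h1, w1 = first_feature_map.shape
    _, h2, w2 = second_feature_map.shape
    f1 = first_feature_map.reshape(c, h1 * w1)    # one column per feature point
    f2 = second_feature_map.reshape(c, h2 * w2)   # one row per feature point
    return f2.T @ f1                              # shape (H2*W2, H1*W1)
```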
Optionally, the determining unit is specifically configured to:
setting a sliding window in the column of the relevance score matrix corresponding to the selected first feature point, wherein the sliding window comprises a first sliding window element, a second sliding window element, a third sliding window element and a fourth sliding window element, each of which covers one relevance score in the column; the second sliding window element is immediately below the first sliding window element; the third sliding window element is below the second sliding window element and separated from it by a number of rows equal to the number of columns of the second feature map minus 2; and the fourth sliding window element is immediately below the third sliding window element;
and sliding the window along the column corresponding to the first feature point, and acquiring the second feature point according to the four relevance scores falling within the sliding window during the sliding.
Optionally, the determining unit is specifically configured to:
determining, for each sliding position during sliding of the sliding window, a sum of four relevance scores falling in the sliding window;
and acquiring the second feature point based on the four feature points in the second feature map that correspond to the four relevance scores whose sum is the largest.
Optionally, the determining unit is specifically configured to:
taking any one of the four feature points as a first reference feature point, wherein the feature point which is positioned in the same row as the first reference feature point in the four feature points is a second reference feature point, the feature point which is positioned in the same column as the first reference feature point in the four feature points is a third reference feature point, and the feature point which is positioned in neither the same row nor the same column as the first reference feature point in the four feature points is a fourth reference feature point;
establishing a coordinate system by taking the center of the first reference feature point as the origin, taking the direction from the origin toward the center of the second reference feature point as the positive direction of the transverse axis, taking the direction from the origin toward the center of the third reference feature point as the positive direction of the longitudinal axis, and taking the standard side length of a feature point as the unit of both axes;
determining the center coordinates of the second feature point based on the relevance scores of the first reference feature point, the second reference feature point, the third reference feature point and the fourth reference feature point;
and acquiring the second feature point based on the center coordinates of the second feature point and the standard side length of the feature point.
Optionally, the determining unit is specifically configured to:
determining the abscissa of the center based on the sum of the relevance scores of the second reference feature point and the fourth reference feature point and the sum of the relevance scores of the first reference feature point and the third reference feature point;
and determining the ordinate of the center based on the sum of the relevance scores of the third reference feature point and the fourth reference feature point and the sum of the relevance scores of the first reference feature point and the second reference feature point.
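As an illustrative sketch of the two determinations above, the center coordinates can be obtained, for example, by a score-weighted average over the 2x2 window of reference feature points; the exact weighting is an assumption here (the text only states which sums the abscissa and ordinate depend on), and coordinates are expressed in units of the standard feature-point side length:

```python
def second_feature_point_center(s1, s2, s3, s4):
    """Estimate the center coordinates of the second feature point.

    s1..s4 are the relevance scores of the first..fourth reference feature
    points: s1 at the origin, s2 in the same row (transverse direction),
    s3 in the same column (longitudinal direction), s4 diagonal.
    A soft score-weighted average is assumed.
    """
    total = s1 + s2 + s3 + s4
    x = (s2 + s4) / total   # abscissa: weight of the second and fourth points
    y = (s3 + s4) / total   # ordinate: weight of the third and fourth points
    return x, y
```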
Optionally, the determining unit is specifically configured to:
after acquiring a relevance score matrix based on the first feature map and the second feature map, inputting the first feature map and the second feature map into a self-attention model to generate a first relevance score adjustment value;
inputting the first feature map and the second feature map into an interactive attention model to generate a second relevance score adjustment value;
And adjusting the relevance score matrix by using the first relevance score adjustment value and the second relevance score adjustment value.
Optionally, the determining unit is specifically configured to:
performing a first process a first number of times, the first process comprising: extracting a second number of matched feature point pairs from the set of matched feature point pairs; calculating a candidate calibration matrix from the extracted matched feature point pairs; determining, for each matched feature point pair of the set, a first error with respect to the candidate calibration matrix; and determining a third number of matched feature point pairs whose first error is less than a first threshold;
and taking, as the determined calibration matrix, the candidate calibration matrix calculated in the round of the first process for which the third number is largest.
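The first process corresponds to a RANSAC-style estimation. A minimal sketch under assumed choices (OpenCV's getPerspectiveTransform as the 4-point solver and Euclidean reprojection error as the first error; neither is mandated by the text):

```python
import cv2
import numpy as np

def determine_calibration_matrix(pairs, first_number=1000, second_number=4,
                                 first_threshold=3.0):
    """RANSAC-style selection of the calibration matrix.

    pairs: array of shape (N, 2, 2); pairs[i, 0] is a first feature point and
    pairs[i, 1] the matched second feature point. Parameter names mirror the
    "first number", "second number" and "first threshold" of the text.
    """
    pairs = np.asarray(pairs, dtype=np.float32)
    best_matrix, best_count = None, -1
    ones = np.ones((len(pairs), 1), dtype=np.float32)
    for _ in range(first_number):                                  # first process
        idx = np.random.choice(len(pairs), second_number, replace=False)
        matrix = cv2.getPerspectiveTransform(pairs[idx, 0], pairs[idx, 1])
        projected = np.concatenate([pairs[:, 0], ones], axis=1) @ matrix.T
        projected = projected[:, :2] / projected[:, 2:3]
        errors = np.linalg.norm(projected - pairs[:, 1], axis=1)   # first error
        count = int((errors < first_threshold).sum())              # third number
        if count > best_count:
            best_matrix, best_count = matrix, count
    return best_matrix
```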
Optionally, the first feature map includes a plurality of first feature maps of different feature extraction ratios, the second feature map includes a plurality of second feature maps of different feature extraction ratios, and correspondingly, the third feature map includes a plurality of third feature maps of different feature extraction ratios;
the first identification unit is specifically configured to:
inputting the plurality of third feature maps into a defect recognition model, and recognizing defects from the plurality of third feature maps by the defect recognition model.
Optionally, the product defect identifying device based on artificial intelligence further comprises:
a second identifying unit configured to input a plurality of the third feature maps into a defect classification model, and identify a type of the defect for the identified defect by the defect classification model;
and a third recognition unit, configured to input a plurality of third feature maps into a defect contour recognition model, and recognize a contour of the defect for the recognized defect by the defect contour recognition model.
Optionally, the product defect identifying device based on artificial intelligence further comprises:
a training unit for jointly training the defect recognition model and the defect classification model by:
constructing a sample image set having a plurality of sample images, the sample images having a first label indicating whether the sample image has a defect and a second label indicating a type of the defect;
generating a fourth feature map of sample images of the sample image set;
identifying defects according to the fourth feature map by using the defect identification model, and classifying the identified defects according to the fourth feature map by using the defect classification model;
constructing an error function based on the comparison of the identification result of the defect identification model and the first label and the comparison of the classification result of the defect classification model and the second label;
and training the defect identification model and the defect classification model by utilizing the error function.
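A minimal sketch of one such joint training step, assuming both models output logits and that the combined error function is the plain sum of a binary cross-entropy term (defect or no defect) and a cross-entropy term (defect type); none of these specific loss choices is fixed by the text:

```python
import torch.nn.functional as F

def joint_training_step(fourth_feature_map, first_label, second_label,
                        defect_recognition_model, defect_classification_model,
                        optimizer):
    """One joint update of the defect recognition and classification models.

    first_label: float tensor of 0/1 values, whether each sample has a defect.
    second_label: long tensor of defect type indices for each sample.
    """
    defect_logits = defect_recognition_model(fourth_feature_map)
    type_logits = defect_classification_model(fourth_feature_map)

    recognition_error = F.binary_cross_entropy_with_logits(defect_logits, first_label)
    classification_error = F.cross_entropy(type_logits, second_label)
    error = recognition_error + classification_error    # combined error function

    optimizer.zero_grad()
    error.backward()
    optimizer.step()
    return error.item()
```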
According to an aspect of the present disclosure, there is provided an electronic device comprising a memory storing a computer program and a processor implementing an artificial intelligence based product defect identification method as described above when executing the computer program.
According to an aspect of the present disclosure, there is provided a computer-readable storage medium storing a computer program which, when executed by a processor, implements an artificial intelligence based product defect identification method as described above.
According to an aspect of the present disclosure, there is provided a computer program product comprising a computer program, which is read and executed by a processor of a computer device, causing the computer device to perform the artificial intelligence based product defect identification method as described above.
In the embodiments of the disclosure, because the two processes involved in product defect identification, image calibration and defect identification from the calibrated image, each ordinarily include a step of generating a feature map from the image to be detected, these feature map generation steps are merged: the generated feature map is used both for calibration and for the final defect identification. The calibration process calibrates the feature map directly, and product defect identification is performed directly on the calibrated feature map. Compared with first calibrating the image to be detected and then generating a feature map from the calibrated image, this reduces the number of times the feature map generation step is executed, thereby reducing the defect detection time and improving the defect detection efficiency.
Additional features and advantages of the disclosure will be set forth in the description which follows, and in part will be apparent from the description, or may be learned by practice of the disclosure. The objectives and other advantages of the disclosure will be realized and attained by the structure particularly pointed out in the written description and claims hereof as well as the appended drawings.
Drawings
The accompanying drawings are included to provide a further understanding of the disclosed embodiments and are incorporated in and constitute a part of this specification, illustrate embodiments of the disclosure and together with the description serve to explain, without limitation, the disclosed embodiments.
FIG. 1 is an architecture diagram of a system to which an artificial intelligence based product defect identification method is applied, according to an embodiment of the present disclosure;
FIG. 2A is a schematic illustration of a scenario in which an artificial intelligence-based product defect identification method is applied in industrial automation quality inspection, according to an embodiment of the present disclosure;
FIG. 2B is a schematic diagram of an application of an artificial intelligence based product defect identification method in industrial automation quality inspection in accordance with an embodiment of the present disclosure;
FIG. 3 is a flow chart of an artificial intelligence based product defect identification method according to one embodiment of the present disclosure;
FIG. 4 is a block diagram of an implementation of an artificial intelligence based product defect identification method according to one embodiment of the present disclosure;
FIG. 5 is a block diagram of one implementation of the feature extraction model of FIG. 4;
FIG. 6 is a block diagram of one particular implementation of the feature pyramid network of FIG. 4;
FIG. 7 is a flow chart of one specific implementation of step 330 in FIG. 3;
FIG. 8 is a schematic diagram of the set of matched feature point pairs in step 710 of FIG. 7;
FIG. 9 is a flow chart of one specific implementation of step 710 of FIG. 7;
FIG. 10 is a schematic diagram of obtaining a relevance score matrix based on the first feature map and the second feature map in step 910 of FIG. 9;
FIG. 11 is a flow chart of one specific implementation of step 930 of FIG. 9;
FIGS. 12A-B illustrate the principle by which steps 1110-1120 of FIG. 11 obtain the second feature point from the four elements in the sliding window;
FIGS. 13A-C illustrate various sliding positions of the sliding window in step 1120 of FIG. 11;
FIG. 14 shows a flow chart of one specific implementation of step 1120 of FIG. 11;
FIGS. 15A-C illustrate the sums of the four relevance scores at different sliding positions of the sliding window in step 1410 of FIG. 14;
FIG. 16 shows a flow chart of a specific implementation of step 1420 of FIG. 14;
FIGS. 17A-B illustrate the process of acquiring the second feature point in steps 1610-1640 of FIG. 16;
FIG. 18 is a flow chart of another specific implementation of step 710 of FIG. 7;
FIG. 19A is a schematic diagram of the self-attention model of step 911 of FIG. 18;
FIG. 19B is a schematic diagram of the structure of the interactive attention model in step 912 of FIG. 18;
FIGS. 20A-E are schematic diagrams illustrating the process of adjusting the relevance score matrix with the first relevance score adjustment value and the second relevance score adjustment value in step 913 of FIG. 18;
FIGS. 21A-B illustrate implementation data for one specific implementation of step 720 of FIG. 7;
FIG. 22 is a flow chart of an artificial intelligence based product defect identification method according to another embodiment of the present disclosure;
FIG. 23 is a particular flowchart for jointly training a defect recognition model and a defect classification model;
FIG. 24 is a block diagram of a product defect identification device according to one embodiment of the present disclosure;
FIG. 25 is a block diagram of a terminal implementing an artificial intelligence based product defect identification method according to one embodiment of the present disclosure;
FIG. 26 is a block diagram of a server implementing an artificial intelligence based product defect identification method according to one embodiment of the present disclosure.
Detailed Description
In order to make the objects, technical solutions and advantages of the present disclosure more apparent, the present disclosure will be further described in detail with reference to the accompanying drawings and examples. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the present disclosure.
Before proceeding to a more detailed description of the disclosed embodiments, the terms and expressions involved in the disclosed embodiments are described; they are applicable to the following explanation:
Artificial intelligence: a theory, method, technology and application system that uses a digital computer, or a machine controlled by a digital computer, to simulate and extend human intelligence, perceive the environment, acquire knowledge, and use that knowledge to obtain target results. In other words, artificial intelligence is a comprehensive branch of computer science that attempts to understand the essence of intelligence and to produce new intelligent machines that can react in a way similar to human intelligence. Artificial intelligence studies the design principles and implementation methods of various intelligent machines so that the machines can perceive, reason and make decisions. Artificial intelligence technology is a comprehensive discipline covering a wide range of fields, including both hardware-level and software-level technologies. Basic artificial intelligence technologies generally include sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing, operation/interaction systems, and mechatronics. Artificial intelligence software technologies mainly include computer vision, speech processing, natural language processing, and machine learning/deep learning. With the research and advancement of artificial intelligence technology, it has been studied and applied in many fields, such as smart homes, smart wearable devices, virtual assistants, smart speakers, smart marketing, autonomous driving, unmanned aerial vehicles, robots, smart medical care, and smart customer service. It is believed that with further technological development, artificial intelligence technology will be applied in more fields and with increasing value.
Machine Learning (ML) is a multi-disciplinary field involving probability theory, statistics, approximation theory, convex analysis, algorithmic complexity theory and other subjects. It studies how a computer can simulate or implement human learning behavior to acquire new knowledge or skills and reorganize existing knowledge structures so as to continuously improve its own performance. Machine learning is the core of artificial intelligence and the fundamental way to make computers intelligent; it is applied across all areas of artificial intelligence. Machine learning and deep learning typically include techniques such as artificial neural networks, belief networks, reinforcement learning, transfer learning, inductive learning, and learning from demonstration.
Homography transformation: a mapping from one plane to another, typically represented by a homography transformation matrix. Once the homography transformation matrix is determined, transforming the coordinates of any point on one plane with the matrix yields the coordinates of the corresponding point on the other plane. The homography transformation matrix is typically a 3x3 matrix with 9 elements, but one element is determined by the other 8 (the matrix is defined only up to scale), so only 8 elements can be chosen freely, giving 8 degrees of freedom. Since the 8 elements are unknown, as long as the coordinates of 4 known point pairs are available (each pair consisting of a point on one plane and its corresponding point on the other plane), and each known point contributes 2 coordinate values (abscissa and ordinate), a system of 8 linear equations in the 8 unknowns can be set up and solved. Therefore, a 3x3 homography transformation matrix can usually be uniquely determined from 4 known point pairs.
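As a concrete illustration of this 8-equation system, the homography can be solved from 4 known point pairs as follows; this is a sketch using NumPy, and the function and variable names are not part of the disclosure:

```python
import numpy as np

def homography_from_4_pairs(src_pts, dst_pts):
    """Solve the 3x3 homography (last element fixed to 1) from 4 point pairs.

    src_pts, dst_pts: arrays of shape (4, 2) with the (x, y) coordinates of
    corresponding points on the two planes.
    """
    A, b = [], []
    for (x, y), (u, v) in zip(src_pts, dst_pts):
        # u = (h00*x + h01*y + h02) / (h20*x + h21*y + 1)
        # v = (h10*x + h11*y + h12) / (h20*x + h21*y + 1)
        A.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        A.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.extend([u, v])
    h = np.linalg.solve(np.asarray(A, dtype=float), np.asarray(b, dtype=float))
    return np.append(h, 1.0).reshape(3, 3)

# Toy usage: the 4 corners of a unit square mapped to shifted positions.
src = np.array([[0, 0], [1, 0], [1, 1], [0, 1]], dtype=float)
dst = src + np.array([0.1, 0.2])
H = homography_from_4_pairs(src, dst)
```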
Industrial automation quality inspection: refers to a process for automated quality inspection during the production of industrial products using computer algorithms. Compared with the traditional manual quality inspection process, the industrial automatic quality inspection has the characteristics of intelligence, high efficiency and stability, and is the development direction of future industrial quality inspection.
In the industrial automated quality inspection process, a detection algorithm based on deep learning is generally adopted to detect part defects. Because of the complexity of the part, it is common to take images of the part from multiple angles and identify defects from those images. Multi-angle shooting raises two problems: first, because the mechanical arm carrying the camera shakes slightly, the captured images also exhibit some jitter; second, industrial cameras generally use fixed-focus lenses, so at some oblique shooting angles only a small part of the image is clear while the rest is blurred, which means a region of interest (ROI) must be drawn on the image at each angle and a detection result counts as valid only if it falls within the ROI. Because of these two problems, in the industrial automated quality inspection process the captured image of the part is generally calibrated first, to eliminate the influence of jitter and to keep the ROI in the part image consistent with the ideal ROI, and the defect is then identified from the calibrated image. At present, the two processes of image calibration and defect identification from the calibrated image are isolated: all steps of image calibration must be performed before all steps of defect identification from the calibrated image can begin, so the process is time-consuming and inefficient.
Therefore, there is a need for a product defect identification scheme that reduces the detection duration and improves the detection efficiency.
System architecture and scenarios to which embodiments of the present disclosure are applied
FIG. 1 is a system architecture diagram to which an artificial intelligence based product defect identification method is applied, according to an embodiment of the present disclosure. The system 100 includes a table 110, a pipeline 120, a product 130, a plurality of differently angled cameras 140, a defect recognition device 150, and a reference image library 160.
The table 110 is a work platform for producing products. The apparatus for producing the product is arranged on the table 110, workers work on the table 110, and defects of the product are detected on the table 110. For part production, the machining station of the parts serves as the table 110.
The pipeline 120, also called an assembly line, is an industrial production mode in which each production unit focuses only on processing a certain segment of the work, so as to improve efficiency. For example, in part machining, a worker responsible for casting casts the part on the section of the line 120 where casting is performed, a worker responsible for cutting cuts the part on the section where cutting is performed, and an inspection worker or inspection apparatus inspects the part on the section where inspection is performed. Since each production unit usually transfers the product by means of a conveyor such as a belt, the conveyor itself is sometimes also referred to as the line 120.
When the product is transferred to the location where the inspection is performed via the pipeline 120, a plurality of sets of cameras 140 are positioned at the location where the inspection is performed for capturing images of the product from a plurality of angles. As shown in fig. 1, cameras 140 are mounted at various angles above the pipeline 120. The reason for taking images of the product from different angles is that certain defects may only be apparent in images at one angle and not at other angles. Taking images of the product from multiple angles helps to improve the comprehensiveness of defect detection.
The camera 140 transmits the captured images of the product at different angles to the defect recognition device 150, and the defect recognition device 150 identifies defects from those images. The artificial intelligence based product defect identification method of the embodiments of the present disclosure is mainly performed by the defect recognition device 150. It may be embodied as a terminal (for example, a device installed separately for a certain detection position of a workshop and responsible only for detecting product defects at that position), or as a server (for example, one that uniformly performs defect recognition on product images captured at different detection positions of multiple workshops or even multiple factories). Terminals may include desktop computers, laptops, PDAs (personal digital assistants), cell phones, dedicated terminals, and the like. The server may be a high-performance computer in a network platform, a cluster of multiple high-performance computers, a portion of a high-performance computer (e.g., a virtual machine), a combination of portions of multiple high-performance computers (e.g., virtual machines), etc.
The reference image library 160 stores reference images of various angles of the product. The defect recognition apparatus 150 refers to reference images of different angles of the products in the reference image library 160 when recognizing defects from the product images photographed from different angles.
The embodiment of the disclosure can be applied to an industrial automation quality inspection scene. Fig. 2A shows a schematic diagram of an industrial automation quality inspection scenario.
As shown in fig. 2A, the products 130 are conveyed on the line 120 positioned on the table 110. At the end of the conveyance is a receiving bin 190. Before the product 130 enters the bin 190, an inspection is performed to determine whether it is defective; defective products may need to be re-produced. At the end of the line 120, immediately before the product 130 enters the receiving bin 190, a detection support 138 is provided above the line 120 for holding the equipment necessary for defect detection, such as the camera 140.
The product 130 passes under the detection support 138 as it is conveyed on the line 120. Cameras 140 are disposed at a plurality of different angles below the detection support 138, capture images of the product 130 from different angles, and send the images to the defect recognition device 150 to detect defects. After passing under the detection support 138, the product 130 enters the receiving bin 190 at the end of the line 120.
Fig. 2B shows a schematic diagram of product defect detection in an industrial automation quality inspection scenario. A plurality of cameras 140 at different angles capture images of the product from different angles. These images serve as the images to be detected. Since these images suffer from the jitter and limited-ROI-clarity problems described above, reference images of the product, which are free of jitter and have well-defined ROI positions, are stored in the reference image library 160 to mitigate these problems. The defect recognition device 150 calibrates the image to be detected against the reference image to reduce jitter and perform ROI matching, obtaining a calibrated image. From the calibrated image, defects in the product can be identified more accurately.
General description of embodiments of the disclosure
According to one embodiment of the present disclosure, a product defect identification method based on artificial intelligence is provided.
A product refers to any tangible manufactured article. It may be a complete item (e.g., an automobile) or a component of a complete item (e.g., a part). It may be a consumer product or a manufacturing tool itself (on a production line that manufactures cutting machines, the cutting machine is the product, even though it may later be used to cut other products).
A product defect refers to an externally observable aspect of a product that does not comply with the production specification. It may be a stain, a bulge, a dent or a missing portion on the product, or the product may have been machined into a wrong shape. For example, if the standard shape of a part is a hexagon but the part is machined into an octagon, the overall shape is a product defect.
The product defect recognition method refers to a method of recognizing defects of a product. In a narrow sense, it is merely a method of identifying whether a product has a defect; in a broad sense, it also includes identifying the type of the defect and the contour of the defect after the defect is identified (described in detail in the embodiments below).
As shown in fig. 3, an artificial intelligence based product defect recognition method according to an embodiment of the present disclosure includes:
step 310, obtaining an image to be detected and a reference image of a product;
step 320, generating a first feature map of the image to be detected and a second feature map of the reference image;
step 330, determining a calibration matrix based on the first feature map and the second feature map;
step 340, calibrating the first feature map by using the calibration matrix to obtain a third feature map;
step 350, identifying a defect of the product from the third feature map.
Steps 310-350 are described in detail below.
The image to be detected of the product in step 310 refers to the original image of the product used for defect detection. It is acquired from the camera 140. As described above, cameras 140 at different angles capture images to be detected of the product from different angles and send them to the defect recognition device 150. The defect recognition device 150 receives the images to be detected of the product at the different angles.
The reference image of the product in step 310 refers to an image of the product that serves as a reference for defect detection. It is a product image captured at a specific camera angle, without jitter and with a standard ROI area. By comparing the image to be detected with it, the jitter in the image to be detected can be eliminated and its ROI area corrected. It is obtained from the reference image library 160. The reference images may be indexed in the reference image library 160 by product and camera angle. After an image to be detected is received from the camera 140 at a specific angle at a certain detection position, the product inspected at that detection position can be determined from the detection position, and the reference image corresponding to that product and camera angle can be retrieved from the reference image library 160 using the product index and the camera angle index.
In step 320, a feature map refers to an image reconstructed from the features extracted from an image. It differs from the original image in that each pixel of the original image is a direct record of the real world; for example, the color of a pixel in the original image corresponds to the color of the corresponding location on the photographed real-world object. A point in the feature map, however, is not necessarily a direct record of the real world: it is the feature value of some aspect extracted from the original image. For example, color can be decomposed into RGB (red, green, blue) components, and a feature value in the feature map may represent only the red component of the color at the corresponding position. After a feature map is further convolved by convolutional neural network nodes, the resulting feature map no longer corresponds point by point to the object in the real world; it is more abstract and is a summarization built on top of the original image pixels.
The first feature map is a feature map extracted from an image to be detected, and the second feature map is a feature map extracted from a reference image. In step 320, generating a first feature map of the image to be detected and a second feature map of the reference image may be implemented by a feature extraction model. The feature extraction model may be a deep learning model such as Convolutional Neural Network (CNN).
Fig. 5 shows a structure diagram of a CNN as an example of a feature extraction model. As shown in fig. 5, the CNN includes an input layer, a plurality of hidden layers 1-n, and an output layer. The input layer has a plurality of processing nodes, each having a weight matrix (convolution kernel). Each hidden layer also has a plurality of processing nodes, each processing node also having a weight matrix (convolution kernel). The output layer also has a plurality of processing nodes, each of which also has a weight matrix (convolution kernel).
Each processing node of the input layer receives features of some aspect extracted in the input image (image to be detected or reference image). For example, the 3 processing nodes of the input layer in fig. 5 receive pixel values of R (red), G (green), and B (blue) in the input image, respectively. A certain feature extraction ratio may be set. The feature extraction ratio refers to the ratio of the number of extracted features to the total number of pixels of the input image. For example, a feature extraction ratio of 1/8 means that one feature is extracted every 8 pixels in the input image. Extracting a feature every 8 pixels refers to extracting an R value of 1 pixel every 8 pixels if the feature is an R value of a pixel. Assuming that the CNN of fig. 5 is a feature extraction model with a feature extraction scale of 1/8, the first processing node of the input layer receives the R value of one pixel extracted every 8 pixels in the input image; the second processing node of the input layer receives the G value of one pixel extracted from each 8 pixels in the input image; the third processing node of the input layer receives the B value of one pixel extracted every 8 pixels in the input image.
Each processing node of the input layer convolves the input of the CNN with its own convolution kernel to obtain that node's output, which serves as the input of each processing node of hidden layer 1. Each processing node of hidden layer 1 convolves the outputs of the input-layer nodes with its own convolution kernel to obtain its output, which serves as the input of the processing nodes of hidden layer 2. And so on, until each processing node of the output layer convolves the outputs of the processing nodes of hidden layer n with its own convolution kernel to obtain the output feature map. The output feature map has the same feature extraction ratio as the input.
When the input image is an image to be detected, the output feature map is a first feature map. When the input image is a reference image, the output feature map is a second feature map. Since the image to be detected may include a plurality of product images photographed at different angles, there are also a plurality of corresponding first feature maps. Since the reference image may comprise a plurality of reference images of different angles, there are also a plurality of corresponding second feature maps.
In addition, the first feature map may further include a plurality of first feature maps having different feature extraction ratios, and the second feature map may further include a plurality of second feature maps having different feature extraction ratios. As shown in FIG. 4, feature extraction models of 3 feature extraction ratios (1/8, 1/16, and 1/32) may be set. They can all employ the CNN structure shown in fig. 5. Since the feature extraction ratio of 1/16 is smaller than that of 1/8, the feature extraction ratio of 1/32 is smaller than that of 1/16, the output of the feature extraction model with the feature extraction ratio of 1/8 can be taken as the input of the feature extraction model with the feature extraction ratio of 1/16, and the output of the feature extraction model with the feature extraction ratio of 1/16 can be taken as the input of the feature extraction model with the feature extraction ratio of 1/32. The feature extraction model with the feature extraction ratio of 1/8, the feature extraction model with the feature extraction ratio of 1/16 and the feature extraction model with the feature extraction ratio of 1/32 are combined to serve as a backbone network.
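For illustration, a minimal PyTorch sketch of such a backbone is given below; the layer counts and channel widths are assumptions, and each feature extraction model is reduced to strided convolutions rather than the full multi-layer CNN of fig. 5:

```python
import torch
import torch.nn as nn

def conv_stage(in_ch, out_ch, stride=2):
    """One backbone stage: a strided 3x3 convolution that halves resolution."""
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size=3, stride=stride, padding=1),
        nn.ReLU(inplace=True))

class Backbone(nn.Module):
    """Produces feature maps at 1/8, 1/16 and 1/32 of the input resolution."""

    def __init__(self):
        super().__init__()
        self.to_1_8 = nn.Sequential(              # input -> 1/8 resolution
            conv_stage(3, 64), conv_stage(64, 128), conv_stage(128, 256))
        self.to_1_16 = conv_stage(256, 256)       # 1/8  -> 1/16
        self.to_1_32 = conv_stage(256, 256)       # 1/16 -> 1/32

    def forward(self, image):                     # image: (B, 3, H, W)
        c3 = self.to_1_8(image)                   # feature map at 1/8
        c4 = self.to_1_16(c3)                     # feature map at 1/16
        c5 = self.to_1_32(c4)                     # feature map at 1/32
        return c3, c4, c5
```

Feeding the image to be detected through such a backbone yields first feature maps at the three ratios; feeding the reference image through it yields the corresponding second feature maps.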
It should be noted that the number of feature extraction models shown in fig. 4 is exemplary. Those skilled in the art can, with the benefit of the teachings of the present disclosure, make embodiments having more or fewer feature extraction models. In addition, the above-described feature extraction ratios of 1/8, 1/16, and 1/32 are also exemplary, and those skilled in the art can change to other feature extraction ratios as needed.
The advantage of setting up a plurality of feature extraction models with different feature extraction ratios is that feature maps at several different feature extraction ratios are generated simultaneously for defect recognition. Feature maps at different feature extraction ratios reflect the global features identified at those ratios, so compared with a scheme that identifies defects from a feature map at only one feature extraction ratio, the recognition accuracy is greatly improved.
Next, in step 330, a calibration matrix is determined based on the first and second feature maps.
The calibration matrix refers to a matrix for calibrating an input image so as to reduce jitter and ROI area mismatch in the input image. An example of it is the homography transformation matrix described above. The homography transformation matrix reflects the mapping of one image onto another: transforming the coordinates of any point on one image with the homography transformation matrix yields the coordinates of that point mapped onto the other image. If the other image is an image under ideal conditions (a reference image, with no jitter or ROI area mismatch), the mapped coordinates are the coordinates where the point is expected to be located. Thus, after the whole input image is transformed by the homography transformation matrix, a calibrated image is obtained, i.e. an image in which jitter and region-of-interest (ROI) mismatch are reduced.
Since the second feature map is derived based on the reference image, which reflects the state without jitter and ROI area mismatch, the calibration matrix can be determined based on the first and second feature maps. Specific determination methods are described in detail hereinafter.
Next, in step 340, the first feature map is calibrated by using the calibration matrix to obtain a third feature map.
In one embodiment, calibration may be achieved by means of a calibration matrix application function as shown in fig. 4. The calibration matrix application function refers to a function that applies the calibration matrix to an input pixel or feature point matrix to obtain a calibrated pixel or feature point matrix. In one embodiment, the calibration matrix application function is implemented by a feature point offset matrix generation function and a feature point offset function.
The offset matrix generation function is a function that generates, from the calibration matrix, a feature point offset matrix corresponding to the calibration matrix. The feature point offset matrix records the amount by which the position of each feature point of the input feature map needs to be offset. Offsetting the coordinates of each feature point of the input feature point matrix by the corresponding offset value in the feature point offset matrix yields the coordinates of the offset feature points. An example of such a function is the affine_grid function in PyTorch. The feature point offset function is a function that performs the offset: it offsets the coordinates of each feature point of the input feature point matrix by the corresponding amount according to the feature point offset matrix, obtaining the calibrated feature point coordinates and thus the third feature map. An example of such a function is the grid_sample function in PyTorch.
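A minimal sketch of this calibration step using the two PyTorch functions named above; note that affine_grid accepts only an affine (2x3) transform, so a full 3x3 homography would require building the sampling grid explicitly, and the alignment convention here is an assumption:

```python
import torch
import torch.nn.functional as F

def calibrate_feature_map(first_feature_map, calibration_matrix):
    """Warp the first feature map into the third feature map.

    first_feature_map: tensor of shape (B, C, H, W).
    calibration_matrix: tensor of shape (B, 2, 3), the affine part of the
    calibration transform.
    """
    grid = F.affine_grid(calibration_matrix,            # feature point offset matrix
                         list(first_feature_map.shape), align_corners=False)
    third_feature_map = F.grid_sample(first_feature_map, grid,   # apply the offsets
                                      align_corners=False)
    return third_feature_map
```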
Besides the above embodiment, the calibration may be performed in other ways, for example by manually estimating, based on the calibration matrix, the amount of change required for each feature point in the first feature map. However, the above manner, implemented by means of the feature point offset matrix generation function and the feature point offset function, improves the calibration efficiency and reduces the processing overhead.
It should be noted that in one embodiment, as shown in fig. 4, the calibration matrix may not be applied to the first feature map generated by every feature extraction model, but only to the first feature map generated by the bottom-most feature extraction model of the backbone network (the feature extraction model with the largest feature extraction ratio, such as the 1/8 feature extraction model in fig. 4), yielding the corresponding third feature map. The upper-layer feature extraction models in the backbone network then perform further feature extraction on the basis of the third feature map generated by the bottom-layer feature extraction model, rather than on a first feature map, so as to generate the corresponding third feature maps of those layers. As shown in fig. 4, the feature extraction model with the 1/16 feature extraction ratio performs further feature extraction on the basis of the third feature map generated by the feature extraction model with the 1/8 feature extraction ratio, generating the third feature map with the 1/16 feature extraction ratio. The feature extraction model with the 1/32 feature extraction ratio performs further feature extraction on the basis of the third feature map generated by the feature extraction model with the 1/16 feature extraction ratio, generating the third feature map with the 1/32 feature extraction ratio. Since the first feature map is only an intermediate result of each feature extraction model, fig. 4 shows only the third feature maps generated by these feature extraction models; the first feature map is not shown.
In the above embodiment, the calibration matrix acts only on the bottom-layer feature extraction model of the backbone network. Because the upper feature extraction models do not need to generate first feature maps and instead generate their third feature maps directly on the basis of the third feature map extracted by the bottom-layer feature extraction model, the feature extraction efficiency is improved.
In addition, in one embodiment, as shown in fig. 4, the third feature maps output by the feature extraction models may be further processed by a feature pyramid network to produce third feature maps at additional feature extraction ratios.
In general, backbone networks have inherent overhead limitations, and it is not desirable to set up too many layers of feature extraction models, typically 3-4 layers. However, for subsequent defect recognition, the more layers of feature maps of different feature extraction ratios are used as inputs, the more high-level image features can be found from the feature maps by the defect recognition model, so that the accuracy of defect recognition is improved. Thus, in one embodiment, a feature map of still other feature extraction scales is generated based on the 3-4 layers of feature maps of different feature extraction scales generated by the backbone network. In this embodiment, it is implemented by a feature pyramid network.
Fig. 6 illustrates an exemplary architecture of a feature pyramid network. The feature pyramid network has a plurality of processing layers in one-to-one correspondence with the feature extraction models of the backbone network, each processing layer including an input convolution module and an output convolution module. The input convolution module receives as input the third feature map output by the corresponding feature extraction model. The convolution result of the input convolution module is passed to an up-sampler, which produces an up-sampled result matching the feature extraction ratio of the next processing layer and passes it to that layer. Each processing layer other than the uppermost one receives the up-sampled result from the up-sampler of the layer above, adds it to the convolution result of its own input convolution module, and uses the sum as the input of its output convolution module. In this way, the third feature map output by a processing layer reflects not only the features of the third feature map at that layer's feature extraction ratio, but also the higher-level features summarized by the layers above, which benefits the accuracy of the subsequent defect recognition.
As shown in fig. 6, the third feature map output by the feature extraction model with the 1/8 feature extraction ratio is C3, the third feature map output by the feature extraction model with the 1/16 feature extraction ratio is C4, and the third feature map output by the feature extraction model with the 1/32 feature extraction ratio is C5; the processed third feature maps output by the processing layers corresponding to C3, C4 and C5 in the feature pyramid network are P3, P4 and P5, respectively. P4 reflects not only the features this processing layer extracts from the third feature map C4 but also the higher-level features of the third feature map C5 passed down from the layer above, so the semantics of the processed third feature map P4 are greatly enriched. Likewise, P3 reflects not only the features this processing layer extracts from the third feature map C3 but also the higher-level features of the third feature maps C4 and C5 passed down from the layers above, so the semantics of the processed third feature map P3 are greatly enriched.
In addition, the convolution result of the input convolution module of the topmost processing layer is also downsampled by a downsampler into features with an even smaller feature extraction ratio, which are convolved by a corresponding output convolution module and output as third feature maps with smaller feature extraction ratios. This achieves the effect of generating feature maps at still more feature extraction scales on the basis of the 3-4 layers of feature maps at different feature extraction scales produced by the backbone network, which improves defect identification accuracy.
As shown in fig. 6, the convolution result of the input convolution module corresponding to the third feature map C5 (feature extraction ratio 1/32) is passed through a 2:1 downsampler to generate features with a feature extraction ratio of 1/64, which are convolved by the corresponding output convolution module to obtain a third feature map P6 with a feature extraction ratio of 1/64. The features with the 1/64 feature extraction ratio are then passed through another 2:1 downsampler to generate features with a feature extraction ratio of 1/128, which are convolved by the corresponding output convolution module to obtain a third feature map P7 with a feature extraction ratio of 1/128. The set of third feature maps is thus expanded from P3-P5 to include P6 and P7.
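For illustration only, the following is a minimal sketch of such a feature pyramid written with PyTorch-style modules. The channel counts, the 1×1 and 3×3 convolutions, nearest-neighbor upsampling, 2:1 max pooling, and the assumption that the merged result of each layer is what gets upsampled are choices made for the sketch and are not prescribed by this disclosure; the sketch only mirrors the data flow described above (lateral convolution, top-down addition, and extra downsampled levels P6 and P7).

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SimpleFPN(nn.Module):
    """Illustrative feature pyramid: builds P3-P7 from backbone maps C3-C5."""
    def __init__(self, in_channels=(256, 512, 1024), out_channels=256):
        super().__init__()
        # "input convolution modules": one 1x1 conv per processing layer
        self.inputs = nn.ModuleList([nn.Conv2d(c, out_channels, 1) for c in in_channels])
        # "output convolution modules": one 3x3 conv per output map P3-P7
        self.outputs = nn.ModuleList([nn.Conv2d(out_channels, out_channels, 3, padding=1)
                                      for _ in range(5)])

    def forward(self, c3, c4, c5):
        t5 = self.inputs[2](c5)                                       # topmost layer, no addition
        t4 = self.inputs[1](c4) + F.interpolate(t5, scale_factor=2)   # add upsampled result from above
        t3 = self.inputs[0](c3) + F.interpolate(t4, scale_factor=2)
        p3, p4, p5 = self.outputs[0](t3), self.outputs[1](t4), self.outputs[2](t5)
        # extra scales 1/64 and 1/128 by 2:1 downsampling of the topmost convolution result
        t6 = F.max_pool2d(t5, kernel_size=2)
        p6 = self.outputs[3](t6)
        p7 = self.outputs[4](F.max_pool2d(t6, kernel_size=2))
        return p3, p4, p5, p6, p7
```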
Next, in step 350, a defect of the product is identified from the third feature map.
As described above, identifying product defects can be understood in a narrow sense or in a broad sense. In the narrow sense, only the presence or absence of a defect in the product is recognized. In this case, the third feature map may be input into a trained defect recognition model. Where only a third feature map of a single feature extraction ratio is available, that third feature map alone is input into the defect recognition model, which identifies from it whether the product is defective. Where third feature maps of a plurality of different feature extraction ratios are available, as shown in fig. 4, the third feature maps of the plurality of feature extraction ratios may all be input into the defect recognition model, which identifies whether the product is defective based on all of them.
The defect recognition model may take the form of, for example, a Convolutional Neural Network (CNN), a decision tree, a Recurrent Neural Network (RNN), or the like. When CNN is used, reference may be made to the structure diagram shown in fig. 5, and a detailed description will be omitted.
When training the defect recognition model, a sample image set may be constructed in advance. The sample image set contains a large number of sample images of the product used for training. For each sample image, it is known whether the product in the image has a defect, and each sample image is labeled with a first label accordingly. The first label indicates whether the sample image contains a defect. Next, a fourth feature map, analogous to the third feature map of the embodiments of the present disclosure, is generated for each sample image using a method similar to steps 310-340. The fourth feature map is input into the defect recognition model, which outputs a recognition result indicating whether a defect exists. The recognition result is compared with the first label. If the proportion of sample images in the sample image set whose recognition results are consistent with the first labels is higher than a first proportion (for example, 95%), the defect recognition model is trained successfully. If the proportion is not higher than the first proportion, parameters in the defect recognition model are adjusted until the proportion of recognition results consistent with the first labels is higher than the first proportion.
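As a rough illustration of the training criterion just described (adjust the model until the share of recognition results matching the first labels exceeds the first proportion), the following sketch uses hypothetical model.predict and model.adjust_parameters placeholders; the actual model structure and update rule are not specified here.

```python
import numpy as np

def train_until_accurate(model, fourth_feature_maps, first_labels,
                         first_ratio=0.95, max_rounds=100):
    """Illustrative loop: adjust the defect recognition model until the share of
    recognition results consistent with the first labels exceeds first_ratio."""
    for _ in range(max_rounds):
        predictions = np.array([model.predict(f) for f in fourth_feature_maps])
        consistent_ratio = np.mean(predictions == np.array(first_labels))
        if consistent_ratio > first_ratio:
            return True            # training considered successful
        model.adjust_parameters()  # placeholder for one parameter-update step
    return False
```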
In the broad sense of identifying product defects, identifying defects includes not only identifying the presence or absence of a defect, but also identifying the type of defect and outlining the defect. This may employ a defect classification model and a defect contour recognition model as shown in fig. 4. At this time, as shown in fig. 22, the defect identifying method of the embodiment of the present disclosure further includes:
step 360, inputting the plurality of third feature maps into a defect classification model, which identifies the type of the identified defect;
step 370, inputting the plurality of third feature maps into a defect contour recognition model, which identifies the contour of the identified defect.
The defect classification model and the defect contour recognition model are similar to the defect recognition model and may likewise adopt structures such as a convolutional neural network (CNN), a decision tree, or a recurrent neural network (RNN); for brevity, they are not described again. Note that the defect classification model and the defect contour recognition model start to operate only if the defect recognition model recognizes a defect. Therefore, as shown in fig. 4, the defect recognition model also has an output connected directly to the defect classification model and the defect contour recognition model as a trigger input.
As shown in fig. 4, the defect recognition model, the defect classification model, and the defect contour recognition model may all be located within the recognition head. The recognition head is the component that performs the defect-related detection on the product.
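The trigger relationship inside the recognition head can be summarized with a small sketch; the predict methods below are hypothetical placeholders for the three models and are not part of the disclosure.

```python
def recognition_head(third_feature_maps, defect_model, classify_model, contour_model):
    """Illustrative recognition head: classification and contour recognition
    run only when the defect recognition model reports a defect."""
    if not defect_model.predict(third_feature_maps):
        return {"defect": False}
    return {
        "defect": True,
        "type": classify_model.predict(third_feature_maps),    # step 360
        "contour": contour_model.predict(third_feature_maps),  # step 370
    }
```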
The advantage of steps 360-370 is that, once step 350 has identified whether the product is defective, additionally identifying the defect type and defect contour in steps 360-370 facilitates further analysis of the defect and the formulation of countermeasures against it. In addition, by jointly inputting a plurality of third feature maps with different feature extraction ratios, the classification and contour recognition of defects can take into account information at different levels in those third feature maps, which improves the quality of defect classification and contour recognition.
Additionally, considering that identifying whether a defect exists is inseparable from identifying which types of defects exist, in one embodiment, as shown in FIG. 23, the defect recognition model and the defect classification model may be jointly trained by:
step 2310, constructing a sample image set;
step 2320, generating a fourth feature map of the sample image set;
step 2330, identifying defects according to the fourth feature map by using the defect identification model, and classifying the identified defects according to the fourth feature map by using the defect classification model;
step 2340, constructing an error function based on the comparison of the recognition result of the defect recognition model with the first label and the comparison of the classification result of the defect classification model with the second label;
step 2350, training the defect recognition model and the defect classification model by using the error function.
Steps 2310-2350 are described in detail below.
The sample image set in step 2310 is a collection of sample images. A sample image is a product image used as a sample for jointly training the defect recognition model and the defect classification model. It must be ensured that a portion of the sample images are images of defective products; for example, more than 20% may be required to be images of defective products to ensure the training effect of the models. Each sample image is labeled in advance with a first label. The first label indicates whether the sample image contains a defect. A second label is additionally applied to each sample image whose first label indicates a defect. The second label indicates the type of the defect. In one example, a sample image set of 1000 sample images may be employed. The more sample images in the sample image set, the better the joint training effect.
Next, in step 2320, a fourth feature map is generated for each sample image of the sample image set. In one embodiment, the fourth feature map may be extracted from the sample image at a predetermined feature extraction scale by a deep learning model such as a CNN. In another embodiment, a method similar to steps 310-340 of the disclosed embodiments may be used: a corresponding reference image is obtained for the sample image, feature maps are obtained from the sample image and the reference image respectively, a calibration matrix is determined based on these feature maps, and the feature map obtained from the sample image is then calibrated with the calibration matrix to obtain the fourth feature map. The latter embodiment can reduce the influence of jitter and ROI region mismatch in the sample image and improve the model training effect.
Then, in step 2330, the fourth feature map is input into the defect recognition model, which recognizes, based on the fourth feature map, whether the product has a defect. For sample images in which a defect is recognized, the fourth feature map is also input into the defect classification model, which identifies the type of the product defect.
Next, in step 2340, an error function is constructed based on the comparison of the identification result of the defect identification model with the first label and the comparison of the classification result of the defect classification model with the second label. In one embodiment, the error function is constructed as follows:
E = (B1/C1 + B2/C2)/2    Equation 3
Wherein E is the joint training error of the defect recognition model and the defect classification model, C1 is the total number of sample images in the sample image set, B1 is the number of sample images whose recognition results from the defect recognition model are inconsistent with the first labels, C2 is the number of sample images in the sample image set that have a second label and for which the defect recognition model also recognizes a defect, and B2 is the number of such sample images whose classification results from the defect classification model are inconsistent with the second labels. C2 is defined in this way because, for a sample image in which the defect recognition model recognizes no defect, the defect classification model cannot be triggered to classify a defect, while a sample image for which a defect is recognized but which actually has no defect carries no second label and therefore cannot participate in the comparison with the classification result of the defect classification model.
B1/C1 in Equation 3 is the ratio of erroneous recognitions of the defect recognition model, and B2/C2 is the ratio of erroneous classifications of the defect classification model. Taking the average of the two gives the joint training error, which thus represents both the error of the defect recognition model and the error of the defect classification model. Jointly training the defect recognition model and the defect classification model in this way improves the model training effect.
In another embodiment, the error function may also be constructed as follows:
E = α·B1/C1 + β·B2/C2    Equation 4
Where α is the weight of the ratio of erroneous recognitions of the defect recognition model and β is the weight of the ratio of erroneous classifications of the defect classification model. In general, α + β = 1. Compared with Equation 3, Equation 4 replaces the average with a weighted sum of the two ratios. This weighting allows α and β to be adjusted flexibly according to actual needs; α and β effectively represent the importance of defect recognition and defect classification in the specific application, which improves the flexibility of model training. At the same time, defect recognition and defect classification can better adapt to the requirements of the application scenario.
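Both error functions can be written as one small helper; with α = β = 0.5 the weighted form of Equation 4 reduces to the average of Equation 3. This is only a sketch of the arithmetic, assuming B2 counts erroneous classifications as discussed above.

```python
def joint_training_error(recog_wrong, total_samples,
                         classify_wrong, classified_samples,
                         alpha=0.5, beta=0.5):
    """Joint training error E of the defect recognition and defect classification
    models. alpha = beta = 0.5 gives Equation 3; other weights give Equation 4."""
    recognition_error = recog_wrong / total_samples          # B1 / C1
    classification_error = classify_wrong / classified_samples  # B2 / C2
    return alpha * recognition_error + beta * classification_error
```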
Then, in step 2350, the defect recognition model and the defect classification model are trained using the error function. In a particular implementation, an error threshold (e.g., 5%) may be set. If the joint training error E is smaller than the error threshold, the training is successful, and training of the defect recognition model and the defect classification model is stopped. If E is not smaller than the error threshold, parameters in the defect recognition model and in the defect classification model are adjusted, and the sample images in the sample image set are input into the two models again for training, until the joint training error E is smaller than the error threshold.
The benefit of the embodiment of steps 2310-2350 is that it does not train the defect recognition model and the defect classification model in isolation; instead, an error function is constructed from the errors produced jointly by the two models, and the parameters of the two models are adjusted jointly, taking into account that identifying whether a defect exists is inseparable from identifying which types of defects are present. This greatly improves the training effect of the defect recognition model and the defect classification model, and thus the accuracy of defect recognition and defect classification.
Apart from step 330, which is discussed in further detail below, the other steps have now been described in detail. Product defect identification typically involves two processes: calibrating the image, and identifying defects from the calibrated image. Both processes involve generating a feature map from the image to be detected, and in steps 310-370 these feature-map generation steps are merged: the feature map is used both for calibration and as the basis for subsequent defect identification. The process of identifying product defects in step 350 is performed directly on the third feature map produced by the calibration. Compared with the prior-art approach of identifying defects on the image to be detected (first calibrating the image to be detected, then generating a feature map from the calibrated image, and then identifying defects on that feature map), the process of generating a feature map from the calibrated image is eliminated, which shortens the defect detection time and improves defect detection efficiency.
The detailed implementation of step 330 is described below.
Further detailed description of step 330
As shown in fig. 7, in one embodiment, step 330 includes:
step 710, acquiring a matched characteristic point pair set;
step 720, determining a calibration matrix based on the matched feature point pair set.
Step 710 may be performed by the matched feature point pair acquisition module of fig. 4 and step 720 may be performed by the calibration matrix generation algorithm module of fig. 4. Steps 710-720 are described below.
The calibration matrix is a matrix used to calibrate the first feature map formed from the image to be detected, so as to eliminate jitter and ROI mismatch problems, from which the reference image is free. The calibration of the first feature map therefore depends on a comparison with the second feature map generated from the reference image. A matching feature point pair is a pair of feature points, one found in the first feature map and one in the second feature map, used to realize this comparison. A matching feature point pair includes a first feature point from the first feature map and a second feature point from the second feature map corresponding to the first feature point. The set of matching feature point pairs is a set formed by a plurality of such matching feature point pairs.
The number of matching feature point pairs in the set of matching feature point pairs is related to the size of the calibration matrix. Take an n×n homography transformation matrix as an example. Because one of the n² transform coefficients of the homography transformation matrix is constrained by the other elements (the matrix is defined only up to scale), the matrix has (n²-1) degrees of freedom. Since a matching feature point pair has an abscissa and an ordinate, 2 equations about the transform coefficients can be listed from each pair, so listing the equations from at least (n²-1)/2 matching feature point pairs suffices to solve for the homography transformation matrix. Therefore, the number of matching feature point pairs in the set should be at least (n²-1)/2.
It should be noted that the second feature point in the second feature map corresponding to a first feature point is not necessarily one of the intrinsic feature points extracted from the reference image; it may be a new feature point combining respective parts of several intrinsic feature points. The second feature map is formed by extracting features from the reference image at a predetermined feature extraction scale. For example, at a 1/8 feature extraction scale, one pixel is selected for every 8 pixels in the reference image, and an intrinsic feature point (e.g., the R value of that pixel) is formed from it. The second feature point corresponding to the first feature point does not necessarily coincide with one of these intrinsic feature points; it may well cover a portion of each of several intrinsic feature points (e.g., correspond to a position in the middle of several extracted pixels in the reference image).
As shown in fig. 8, each small square in the first feature map or the second feature map corresponds to one intrinsic feature point extracted from the image to be detected or the reference image. Assuming that the center of a small square at the left upper corner of the first feature map is taken as the origin of a coordinate system, the horizontal right direction is the positive x-axis direction, and the vertical downward direction is the positive y-axis direction, a plane rectangular coordinate system is established. The unit of the plane rectangular coordinate system is the side length of one characteristic point. 3 first feature points P1-3 are selected from the first feature map, wherein the three first feature points are inherent feature points in the first feature map, namely, the three first feature points exactly coincide with a certain small square in the first feature map, and the central coordinates of the three first feature points are (0, 0), (1, 1) and (1, 2) respectively. The second feature points corresponding to the first feature points in the second feature map are not necessarily coincident with a small square. In fig. 8, a second feature point P1' corresponding to the first feature point P1 in the second feature map coincides with a small square, and its center coordinates are (2, 1); the second feature point P2' corresponding to the first feature point P2 in the second feature map does not uniquely coincide with one small square, but occupies a part of each of three small squares having center coordinates (2, 2), (3, 2), (2, 3); the second feature point P3' corresponding to the first feature point P3 in the second feature map does not uniquely coincide with one small square, but occupies a part of each of two small squares whose center coordinates are (1, 3), (2, 3). Thus, the first feature point P1 and the second feature point P1', the first feature point P2 and the second feature point P2', and the first feature point P3 and the second feature point P3' are three matching feature point pairs, and form a matching feature point pair set.
The specific implementation of step 710 will be described in detail later.
In step 720, a calibration matrix is determined based on the set of matching feature point pairs.
In one embodiment, step 720 includes:
performing a first process a first number of times, the first process comprising: extracting a second number of matching feature point pairs from the set of matching feature point pairs; calculating a candidate calibration matrix from the extracted matching feature point pairs; determining the first error, on the candidate calibration matrix, of each matching feature point pair of the set; and determining a third number of matching feature point pairs whose first error is less than a first threshold;
and taking, as the determined calibration matrix, the candidate calibration matrix calculated in the first process that yields the largest third number among the first number of first processes.
The above embodiment is based on the following idea. The set of matching feature point pairs formed in step 710 contains a plurality of matching feature point pairs, which may play different roles in determining the calibration matrix. For example, certain matching feature point pairs may lie close together and exhibit similar characteristics, so using only them to determine the calibration matrix may yield a calibration matrix of low quality. Therefore, some comparatively representative matching feature point pairs need to be selected, from which the calibration matrix is determined. In the above embodiment, the extraction is performed a plurality of times (the first number of times), and each time a plurality of matching feature point pairs (the second number) is taken. From each such group of extracted pairs, a corresponding calibration matrix can be obtained as a candidate calibration matrix. It is then measured, by means of the first error described above, how many matching feature point pairs of the whole set fit this candidate calibration matrix. The more matching feature point pairs fit the candidate calibration matrix, the higher its quality. For the candidate calibration matrix obtained from each extraction, the number (the third number) of matching feature point pairs in the whole set that fit it can thus be obtained. The largest third number indicates the extraction of highest quality, and the candidate calibration matrix generated by that extraction can be used as the determined calibration matrix.
The calculation of the candidate calibration matrix from the extracted matching feature point pairs in the first process can be implemented by listing the equations for the transform coefficients of the candidate calibration matrix from the abscissa and ordinate of each matching feature point pair and solving those equations.
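One common way to solve such a system, shown here only as a sketch, is a direct linear transform over a 3×3 homography (8 degrees of freedom, hence at least 4 matching feature point pairs); the disclosure does not prescribe this particular solver.

```python
import numpy as np

def homography_from_pairs(first_points, second_points):
    """Estimate a 3x3 homography mapping first-feature-map points to
    second-feature-map points from >= 4 matching pairs (DLT-style sketch)."""
    rows = []
    for (x, y), (u, v) in zip(first_points, second_points):
        # two equations about the transform coefficients per matching pair
        rows.append([x, y, 1, 0, 0, 0, -u * x, -u * y, -u])
        rows.append([0, 0, 0, x, y, 1, -v * x, -v * y, -v])
    A = np.asarray(rows, dtype=float)
    # the solution is the right singular vector with the smallest singular value
    _, _, vt = np.linalg.svd(A)
    H = vt[-1].reshape(3, 3)
    return H / H[2, 2]   # fix the scale so the last coefficient is 1
```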
In one embodiment, the determining the first error of the matching feature point pair set in the first process on the candidate calibration matrix may be implemented by the following steps: mapping the first feature point in the matched feature point pair to a third feature point in the second feature map according to the candidate calibration matrix; a distance between the third feature point and the second feature point in the matching feature point pair is determined as the first error.
Mapping the first feature point into the second feature map according to the candidate calibration matrix is the inverse of calculating the candidate calibration matrix from the matching feature point pairs. For the second number of matching feature point pairs extracted from the set, since the candidate calibration matrix was calculated from them, the third feature point obtained by mapping the first feature point into the second feature map according to the candidate calibration matrix is exactly the second feature point; the distance between the third feature point and the second feature point is 0, and the first error is 0. For the matching feature point pairs other than those extracted pairs, the third feature point obtained by mapping according to the candidate calibration matrix may not coincide with the second feature point. However, if the third feature point is only slightly offset from the second feature point, the matching feature point pair may still be considered to substantially fit the candidate calibration matrix. Only when the distance between the third feature point and the second feature point is sufficiently large, e.g., not less than the first threshold, is the matching feature point pair considered not to fit the candidate calibration matrix. The distance between the third feature point and the second feature point is the first error. Therefore, within the set of matching feature point pairs, the first errors of the second number of extracted pairs are 0, the remaining pairs have various first errors, and the third number is determined by how many of all the first errors are smaller than the first threshold.
The distance may be a Euclidean distance or a cosine distance, and the first error may be calculated accordingly.
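Putting the pieces of the first process together, a sketch of the repeated extraction and fit counting might look as follows, reusing the homography_from_pairs sketch above; the random sampling strategy, fixed seed, and Euclidean first error are assumptions made for illustration.

```python
import numpy as np

def map_point(H, point):
    """Map a first feature point into the second feature map with the candidate matrix."""
    x, y = point
    u, v, w = H @ np.array([x, y, 1.0])
    return np.array([u / w, v / w])

def select_calibration_matrix(pairs, first_number=5, second_number=4, first_threshold=0.2):
    """Illustrative first process: repeat first_number times; each time solve a
    candidate matrix from second_number random pairs and count the pairs whose
    first error (Euclidean distance) is below the first threshold."""
    rng = np.random.default_rng(0)
    best_matrix, best_third_number = None, -1
    for _ in range(first_number):
        chosen = rng.choice(len(pairs), size=second_number, replace=False)
        sample = [pairs[i] for i in chosen]
        H = homography_from_pairs([p for p, _ in sample], [q for _, q in sample])
        first_errors = [np.linalg.norm(map_point(H, p) - np.asarray(q)) for p, q in pairs]
        third_number = sum(e < first_threshold for e in first_errors)
        if third_number > best_third_number:   # keep the best round, like the inner point set
            best_matrix, best_third_number = H, third_number
    return best_matrix
```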
The advantage of taking, as the first error, the distance between the second feature point and the third feature point obtained by mapping the first feature point into the second feature map is that, for each first feature point, the actual point to which it is mapped under the candidate calibration matrix is found, and the first error is obtained from the distance between that point and the second feature point. The first error thus reflects the actual deviation caused by the candidate calibration matrix over all matching feature point pairs in the set, which is more accurate and improves the quality of the determined calibration matrix.
Assume that there are 10 pairs of matching feature points in the set of matching feature point pairs. The first number is 5 and the second number is 4. An example of the above embodiment is described below in conjunction with fig. 21A-B. That is, 5 rounds of extraction will be performed among 10 matching feature point pairs, with 4 feature point pairs randomly extracted for each round.
As shown in fig. 21A, in the first round of extraction, 4 feature point pairs are extracted from the 10 matching feature point pairs, and a candidate calibration matrix A is calculated from them. As shown in fig. 21B, the first errors of the 10 matching feature point pairs on the candidate calibration matrix A are then found to be 0.19, 0.17, 0.25, 0, 0, 0.19, 0.23, 0, 0, 0.26, respectively, where the four first errors of 0 correspond to the 4 extracted pairs used to calculate the candidate calibration matrix. The first threshold is 0.2. The first error of matching feature point pair 3 is 0.25, not smaller than the first threshold 0.2; the first error of matching feature point pair 7 is 0.23, not smaller than the first threshold 0.2; the first error of matching feature point pair 10 is 0.26, not smaller than the first threshold 0.2. These three matching feature point pairs are removed from the 10, leaving 7 matching feature point pairs, so the third number is 7. Returning to fig. 21A, when 4 feature point pairs are extracted in the second round, a candidate calibration matrix B is obtained and the resulting third number is 6; in the third round, a candidate calibration matrix C is obtained and the third number is 5; in the fourth round, a candidate calibration matrix D is obtained and the third number is 8; in the fifth round, a candidate calibration matrix E is obtained and the third number is 6. Since the largest third number among the five rounds is 8, corresponding to the fourth round, the corresponding candidate calibration matrix D is taken as the determined calibration matrix.
The above embodiment may also be implemented by means of an inner point set. The inner point set is the set of matching feature point pairs whose determined first errors are smaller than the first threshold, and it is updated across the rounds of the first process described above. In each round, the third number of matching feature point pairs whose first errors are smaller than the first threshold is compared with the number of matching feature point pairs already in the inner point set. If it is larger, all matching feature point pairs in the inner point set are replaced by the matching feature point pairs of the current round whose first errors are smaller than the first threshold. After all rounds have been executed, the candidate calibration matrix corresponding to the matching feature point pairs in the inner point set is the determined calibration matrix.
As shown in fig. 21A, after the first process of the first round is performed, the candidate calibration matrix obtained is A, and the third number of matching feature point pairs whose first errors are smaller than the first threshold is 7. The inner point set is initially empty, so the 7 matching feature point pairs with first errors smaller than the first threshold are placed into the inner point set. After the first process of the second round, the candidate calibration matrix obtained is B and the third number is 6, not greater than 7, so nothing is done. After the third round, the candidate calibration matrix obtained is C and the third number is 5, not greater than 7, so nothing is done. After the fourth round, the candidate calibration matrix obtained is D and the third number is 8, greater than 7, so the 7 matching feature point pairs in the inner point set are replaced with the 8 matching feature point pairs of the fourth round whose first errors are smaller than the first threshold. After the fifth round, the candidate calibration matrix obtained is E and the third number is 6, not greater than 8, so nothing is done. At this point the matching feature point pairs in the inner point set were generated in the fourth round, whose corresponding candidate calibration matrix is D, and the candidate calibration matrix D is taken as the determined calibration matrix.
The implementation mode of the inner point set can simplify the process of determining the calibration matrix and improve the determination efficiency of the calibration matrix.
The advantage of performing the first process a first number of times and determining the calibration matrix from the candidate calibration matrix corresponding to the largest third number is that a plurality of selections of matching feature point pairs generate a plurality of candidate calibration matrices, and their actual effects are compared according to the errors of the whole set of matching feature point pairs on each candidate. Because the actual errors are taken into account, the accuracy of the determined calibration matrix is improved.
The advantage of steps 710-720 is that the calibration matrix is determined from matching feature point pairs actually extracted from the first feature map and the second feature map. Because these feature point pairs reflect the actual, objective situation, the calibration accuracy is high.
The detailed process of step 710 is discussed below.
Further detailed description of step 710
In one embodiment, as shown in FIG. 9, step 710 includes:
step 910, acquiring a relevance score matrix based on the first feature map and the second feature map;
step 920, selecting a plurality of first feature points from the first feature map;
step 930, for each selected first feature point, acquiring a second feature point corresponding to the first feature point in the second feature map based on the association score matrix.
Steps 910-930 are described below.
The relevance score of step 910 refers to a score representing the degree of relevance of the feature points in the first feature map to the feature points in the second feature map. Its value is generally between 0 and 100%. If the feature point a in the first feature map is mapped to the feature point B in the second feature map, the association degree score of the feature point a and the feature point B is 100%. For feature points in the second feature map that are closer to feature point B, their relevance score to feature point a will also be higher. For feature points in the second feature map that are farther from feature point B, their relevance scores to feature point a are lower.
Any intrinsic feature point in the first feature map (a feature point generated from a pixel directly extracted from the image to be detected) and any intrinsic feature point in the second feature map of fig. 8 have a relevance score between them. The matrix formed by all of these relevance scores is called the relevance score matrix.
When the relevance score matrix is formed, all feature points of the first feature map are flattened into a row vector, each element of which corresponds to one column of the relevance score matrix; all feature points of the second feature map are flattened into a column vector, each element of which corresponds to one row of the relevance score matrix. Thus, if the first feature map is a (p1×p2) matrix and the second feature map is a (q1×q2) matrix, the relevance score matrix has (p1×p2) columns and (q1×q2) rows. As shown in fig. 10, the first feature map P has 3 rows and 2 columns. First, the feature points P00 and P01 of the first row are taken out. Then the feature points P10 and P11 of the second row are taken out and appended after P00 and P01. Then the feature points P20 and P21 of the third row are taken out and appended after P10 and P11. This yields a row vector (P00, P01, P10, P11, P20, P21), whose 6 elements correspond to the 6 columns of the relevance score matrix. The second feature map Q has 2 rows and 4 columns. First, the feature points Q00, Q01, Q02, Q03 of the first row are taken out. Then the feature points Q10, Q11, Q12, Q13 of the second row are taken out and appended after Q00, Q01, Q02, Q03, yielding a column vector (Q00, Q01, Q02, Q03, Q10, Q11, Q12, Q13), whose 8 elements correspond to the 8 rows of the relevance score matrix.
Each element in the relevance score matrix thus formed represents the relevance score between the feature point in the first feature map corresponding to the column in which the element is located and the feature point in the second feature map corresponding to the row in which the element is located. As shown in fig. 10, the element 0.3 in the 4th row and 3rd column of the relevance score matrix is the relevance score between the feature point P10 corresponding to the 3rd column in the first feature map and the feature point Q02 corresponding to the 4th row in the second feature map. Fig. 10 shows the relevance score between each feature point in the first feature map P and each feature point in the second feature map Q.
The association score matrix may be formed in a manually determined manner. That is, for each first feature point in the first feature map, it is manually checked in the second feature map which feature point is the feature point to which it is mapped, the degree of association score of this feature point with the first feature point is 100%, and the degree of association score of other feature points in the second feature map with the first feature point is manually given according to the distance from this other feature point to the feature point to which the first feature point is mapped. This embodiment takes a lot of manual time and is inaccurate.
In one embodiment, the relevance score matrix may be automatically generated by:
taking the feature point rows in the first feature map as a first dimension, the feature point columns in the first feature map as a second dimension, and adding a predetermined third dimension, constructing a first three-dimensional matrix with the feature point values in the first feature map as its elements;
taking the feature point rows in the second feature map as a fourth dimension, the feature point columns in the second feature map as a fifth dimension, and adding a predetermined sixth dimension, constructing a second three-dimensional matrix with the feature point values in the second feature map as its elements;
multiplying the transposed matrix of the first three-dimensional matrix by the second three-dimensional matrix to obtain a four-dimensional matrix with third dimension and sixth dimension eliminated;
and obtaining a relevance score matrix from the four-dimensional matrix.
The first feature map is formed from feature values (such as R values) of pixels extracted from the image to be detected at a certain feature extraction ratio, the feature value of each such pixel forming one intrinsic feature point value. These feature values are arranged in a two-dimensional matrix. The rows of this two-dimensional matrix are the first dimension, and its columns are the second dimension. To obtain the relevance score matrix, a thickness dimension may be added to the two-dimensional matrix; that is, each pixel is regarded as having a thickness, so the corresponding feature point also has a thickness coordinate. Since the thickness coordinates are eliminated in the subsequent calculation, an arbitrary thickness can be set (for example, the thickness coordinates of all elements of the two-dimensional matrix are 1). The two-dimensional matrix thus becomes a three-dimensional matrix, i.e., the first three-dimensional matrix, and the feature point values of the two-dimensional matrix become the feature point values at the corresponding coordinates of the first three-dimensional matrix.
Similarly, the intrinsic feature point values in the second feature map may be regarded as arranged in a two-dimensional matrix, whose rows are the fourth dimension and whose columns are the fifth dimension. To obtain the relevance score matrix, a sixth dimension, the thickness dimension, may be added to this two-dimensional matrix, with arbitrary thickness coordinates. The two-dimensional matrix thus becomes a three-dimensional matrix, namely the second three-dimensional matrix, and its feature point values become the feature point values at the corresponding coordinates of the second three-dimensional matrix.
And adding the thickness dimension, so that the first characteristic diagram and the second characteristic diagram are changed into a first three-dimensional matrix and a second three-dimensional matrix, and multiplying the transposed matrix of the first three-dimensional matrix by the second three-dimensional matrix to completely eliminate the added thickness dimension, thereby obtaining a four-dimensional matrix with the third dimension and the sixth dimension eliminated. Assuming that the first three-dimensional matrix is a matrix of (m1×m2×m3), the second three-dimensional matrix is a matrix of (n1×n2×n3), where M3 and N3 are the number of elements in the thickness dimension, and the resulting four-dimensional matrix is a matrix of (m1×m2×n1×n2).
Two-dimensional or three-dimensional matrices have shapes that are distinguishable to the human eye, whereas four-dimensional matrices do not have shapes that are distinguishable to the human eye. However, if a plurality of dimensions of the four-dimensional matrix are unified and element values of the plurality of dimensions are arranged in one dimension, the four-dimensional matrix can be reduced in dimension to a two-dimensional or three-dimensional matrix having a certain shape. A dimension of a two-dimensional or three-dimensional matrix is now the result of the unification of multiple dimensions in four dimensions, which pulls the possible combined values of the multiple dimensions into one dimension. For example, in the above-described matrix of (m1×m2×n1×n2), M1 elements are in total in the first dimension and M2 elements are in total in the second dimension on the four-dimensional matrix, and the first dimension and the second dimension are stretched into one dimension, and (m1×m2) elements are in the dimension. Similarly, the fourth dimension of the four-dimensional matrix has N1 elements, the fifth dimension has N2 elements, and the fourth and fifth dimensions are stretched into one dimension having (n1×n2) elements. Thus, the four-dimensional matrix becomes a two-dimensional matrix having (m1×m2) elements in one dimension and (n1×n2) elements in the other dimension. The relevance score matrix is obtained by normalizing the two-dimensional matrix (changing the maximum element value in the two-dimensional matrix to 1 and dividing the other element values in the two-dimensional matrix by the maximum element value).
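A minimal numpy sketch of this dimension-raising, multiplication and flattening is given below, assuming a thickness of 1 and normalization by the maximum element; the exact contraction order and normalization details are illustrative and not prescribed by the disclosure.

```python
import numpy as np

def relevance_score_matrix(first_map, second_map):
    """Sketch of the relevance score matrix construction described above."""
    # first and second three-dimensional matrices: rows x columns x thickness (thickness = 1)
    a = first_map[:, :, np.newaxis].astype(float)   # (M1, M2, 1)
    b = second_map[:, :, np.newaxis].astype(float)  # (N1, N2, 1)
    # contract away the thickness dimensions -> four-dimensional matrix (M1, M2, N1, N2)
    four_dim = np.einsum('ijt,klt->ijkl', a, b)
    # flatten the first-map dims into columns and the second-map dims into rows
    scores = four_dim.reshape(a.shape[0] * a.shape[1], b.shape[0] * b.shape[1]).T
    return scores / scores.max()                    # normalise so the maximum becomes 1
```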
The embodiment of obtaining the relevance score matrix by raising the dimensions of the first and second feature maps, transposing, and multiplying has the advantage that the relevance score matrix is obtained through simple matrix operations, which reduces processing overhead and improves processing efficiency.
The above-described operation of generating the relevance score matrix may be performed by the relevance score matrix generation module of fig. 4. After the relevance score matrix generation module generates the relevance score matrix, in one embodiment, the relevance scores in the relevance score matrix generated by the relevance score matrix generation module may be adjusted by the self-attention model and the interactive attention model.
As shown in fig. 18, in this embodiment, after step 910, the product defect identification method includes:
step 911, inputting the first feature map and the second feature map into the self-attention model to generate a first relevance score adjustment value;
step 912, inputting the first feature map and the second feature map into the interaction attention model to generate a second relevance score adjustment value;
step 913, adjusting the relevance score matrix by using the first relevance score adjustment value and the second relevance score adjustment value.
The self-attention model is a model for generating a correlation adjustment value by measuring the interaction between elements for an input having a plurality of elements, thereby measuring the degree of attention of each of the elements. In step 911, the first feature map is input into a self-attention model, and the self-attention model measures the degree of attention of each feature point in the first feature map according to the interaction relationship of each feature point in the first feature map, so as to generate a first relevance score adjustment value. This first relevance score adjustment value is for a feature point in the first feature map. The feature points in the first feature map correspond to a column in the relevance score matrix, and therefore, the column in the relevance score matrix is adjusted according to the first relevance score adjustment value.
Similarly, the second feature map is input into a self-attention model, and the self-attention model measures the attention degree of each feature point in the second feature map according to the interaction relation of each feature point in the second feature map, so as to generate a first relevance score adjustment value. The first relevance score adjustment value is for a feature point in the second feature map, corresponding to a row in the relevance score matrix. One row in the relevance score matrix is adjusted according to the first relevance score adjustment value.
Fig. 19A shows an exemplary structure of the self-attention model. The self-attention model includes a plurality of self-attention model nodes and decides the number of enabled nodes according to the number of received feature point values. Assume that the first feature map includes U feature point values P0, P1, ..., P(U-1); correspondingly, the self-attention model enables U self-attention model nodes L0, L1, ..., L(U-1), where U is a positive integer of 2 or more. The first self-attention model node L0 not only generates a first relevance score adjustment value a0 from the input feature point value P0, but also generates its own state information S0 from P0 and passes it to the next self-attention model node L1. The (i+1)-th self-attention model node Li (i = 1, 2, ..., U-1), on the one hand, generates the corresponding first relevance score adjustment value ai from the input feature point value Pi and the state information S(i-1) passed by the previous node L(i-1); on the other hand, it generates its own state information Si from Pi and S(i-1) and passes it to the next node L(i+1). The state information here reflects the interaction among the U feature point values P0, P1, ..., P(U-1). Thus, the obtained first relevance score adjustment values a0, a1, ..., a(U-1) also reflect that interaction. Fig. 19A illustrates the operation of the self-attention model taking only the feature point values of the first feature map as input; the operation is similar when the feature point values of the second feature map are taken as input.
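The node chain just described can be summarized schematically as follows; node_fn stands in for the learned per-node computation, which is not specified here and is purely a placeholder.

```python
def self_attention_adjustments(feature_values, node_fn):
    """Schematic chain of self-attention model nodes: node i produces the first
    relevance score adjustment value a_i and state S_i from the input feature
    point value P_i and the state S_(i-1) passed by the previous node."""
    adjustments, state = [], None
    for p in feature_values:
        a, state = node_fn(p, state)  # hypothetical learned node
        adjustments.append(a)
    return adjustments
```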
The interactive attention model is a model that, for a plurality of inputs each having multiple elements, measures the interaction between elements of the different inputs and thereby generates relevant adjustment values. In step 912, the first feature map and the second feature map are input into the interactive attention model. The feature points in the first feature map interact with the feature points in the second feature map, and the interactive attention model measures this interaction to generate the second relevance score adjustment values. A second relevance score adjustment value is for a point pair composed of one feature point in the first feature map and one feature point in the second feature map. When adjustment is performed with a second relevance score adjustment value, only one element of the relevance score matrix is adjusted, namely the element at the intersection of the column corresponding to the feature point in the first feature map and the row corresponding to the feature point in the second feature map.
FIG. 19B illustrates an exemplary structure of the interactive attention model. The interactive attention model includes a plurality of interactive attention model nodes and decides the number of enabled nodes according to the number of feature point values in the received first feature map. Still taking the case where the first feature map includes U feature point values P0, P1, ..., P(U-1) as an example, the interactive attention model enables U interactive attention model nodes G0, G1, ..., G(U-1). In addition, the first interactive attention model node G0 also receives all V feature point values Q0, Q1, ..., Q(V-1) of the second feature map as V-way initial state information. The first interactive attention model node G0 not only generates, from the first feature point value P0 of the first feature map and the V-way initial state information Q0, Q1, ..., Q(V-1), the second relevance score adjustment values b(0,0), b(0,1), ..., b(0,V-1) between P0 and each feature point value of the second feature map, but also generates V-way state information S(0,0), S(0,1), ..., S(0,V-1) from P0 and the V-way initial state information and passes it to the next interactive attention model node G1. The (i+1)-th interactive attention model node Gi (i = 1, 2, ..., U-1), on the one hand, generates the second relevance score adjustment values b(i,0), b(i,1), ..., b(i,V-1) between Pi and each feature point value of the second feature map from the (i+1)-th feature point value Pi of the first feature map and the received V-way state information S(i-1,0), S(i-1,1), ..., S(i-1,V-1); on the other hand, it generates V-way state information S(i,0), S(i,1), ..., S(i,V-1) from Pi and the received V-way state information, and passes it to the next interactive attention model node G(i+1). The state information passed by each interactive attention model node in this process reflects the interaction of each feature point of the second feature map with the feature points of the first feature map. Therefore, the obtained second relevance score adjustment values measure the interaction between the feature points in the first feature map and the feature points in the second feature map.
After the first relevance score adjustment value and the second relevance score adjustment value are obtained, the relevance score matrix may be adjusted in step 913 using the first relevance score adjustment value and the second relevance score adjustment value. Specifically, since the first relevance score adjustment value corresponds to one column or one row in the relevance score matrix, the element value of the one column or one row in the relevance score matrix corresponding to the first relevance score adjustment value may be adjusted using the first relevance score adjustment value. Since the second relevance score adjustment value corresponds to a single element in the relevance score matrix, the element value in the relevance score matrix corresponding to the second relevance score adjustment value may be adjusted with the second relevance score adjustment value.
Fig. 20A shows a schematic diagram of a relevance score matrix. The first feature map has 4×2 feature points and the second feature map has 4×2 feature points, so the relevance score matrix is a (4×2)×(4×2), i.e., 8×8, matrix. Fig. 20B shows the 8 first relevance score adjustment values obtained for the 8 feature points of the first feature map: the 3rd feature point P2 has a first relevance score adjustment value of 0.1, which corresponds to column 3 of the relevance score matrix, so 0.1 is added to all elements of column 3; the 8th feature point P7 has a first relevance score adjustment value of 0.2, which corresponds to column 8 of the relevance score matrix, so 0.2 is added to all elements of column 8. The resulting relevance score matrix is shown in fig. 20C.
Fig. 20D shows the second relevance score adjustment values corresponding to the 64 point pairs formed by the 8 feature points of the first feature map and the 8 feature points of the second feature map. They are arranged in a matrix with the same structure as the relevance score matrix: the element in the ith row and jth column of the matrix of fig. 20D is exactly the second relevance score adjustment value for the element in the ith row and jth column of the relevance score matrix. Thus, the matrix of fig. 20C can be added to the matrix of fig. 20D to obtain the matrix of fig. 20E, i.e., the adjusted relevance score matrix.
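The adjustment of step 913 can be sketched as a few element-wise additions; feature points without an adjustment value are assumed here to have an adjustment of 0, and the array layout follows the figures above.

```python
import numpy as np

def adjust_relevance_scores(scores, first_adjustments_cols, first_adjustments_rows,
                            second_adjustments):
    """Apply the adjustments of step 913: each first adjustment value shifts a whole
    column (first feature map point) or row (second feature map point); the matrix of
    second adjustment values is added element-wise (fig. 20C + fig. 20D -> fig. 20E)."""
    adjusted = scores + np.asarray(first_adjustments_cols)[np.newaxis, :]    # per column
    adjusted = adjusted + np.asarray(first_adjustments_rows)[:, np.newaxis]  # per row
    return adjusted + np.asarray(second_adjustments)                         # per element
```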
The advantage of the embodiment of steps 911-913 is that the self-attention model measures the interaction among the feature points within the first feature map or the second feature map, while the interactive attention model measures the interaction between feature points of the first feature map and feature points of the second feature map. The corresponding score adjustment values are generated and used to adjust the relevance score matrix, which improves the quality of the relevance score matrix and thus the accuracy of defect identification.
After step 913 is performed, as shown in fig. 18, a plurality of first feature points are selected in the first feature map in step 920. As shown in fig. 8, the selected first feature points are, as far as possible, intrinsic feature points of the first feature map, such as P1, P2 and P3.
Next, in step 930, for each first feature point selected, a second feature point corresponding to the first feature point in the second feature map is acquired based on the association score matrix. Note that the second feature point here is not necessarily an inherent feature point of the second feature map. As shown in fig. 8, it may be a respective portion that occupies a plurality of inherent feature points.
Because the relevance score matrix lists the relevance score between each intrinsic feature point in the first feature map and each intrinsic feature point in the second feature map, and because feature points in the second feature map that are closer to the result of mapping the first feature point into the second feature map have higher relevance scores, steps 910-930 can acquire the second feature point corresponding to each first feature point by means of the relevance score matrix. The second feature points found in this way help improve the quality of the acquired set of matching feature point pairs and the accuracy of defect identification.
The implementation of step 930 is described in detail below.
Further detailed description of step 930
Each feature point selected in the first feature map corresponds to one column in the relevance score matrix. The relevance score represents how close a feature point in the second feature map is to the result of mapping the first feature point into the second feature map. If a relevance score of 100% is found in the column corresponding to the first feature point, the corresponding feature point is the one to which the first feature point is mapped, and it can be selected as the second feature point. In one embodiment, if there is no 100% relevance score in that column, the feature point corresponding to the highest relevance score may be roughly selected as the second feature point.
However, as shown in Fig. 8, the second feature point is not necessarily one of the inherent feature points of the second feature map; it may cover respective portions of a plurality of inherent feature points. In that case, the inherent feature points covered by the second feature point are found first, and the second feature point is then located from them. Specifically, since the second feature point covers at most 4 inherent feature points, the 4 associated inherent feature points are found first in order to locate the second feature point accurately. As shown in Fig. 8, for the second feature point P3', the 4 associated inherent feature points may be the feature points whose center coordinates are (2, 2), (3, 2), (2, 3) and (3, 3).
The 4 associated inherent feature points are in fact the 4 feature points at the intersections of 2 adjacent rows and 2 adjacent columns of the second feature map. As shown in Fig. 12A, a plane rectangular coordinate system is established with the center of the inherent feature point in the upper left corner of the first feature map or the second feature map as the origin, the horizontal rightward direction as the positive x axis, and the vertical downward direction as the positive y axis. The center coordinates of the first feature point X on the first feature map are (i_a, j_a). The second feature point to which the first feature point X maps in the second feature map is Y; it is not exactly one inherent feature point of the second feature map, and the center coordinates of its 4 associated inherent feature points Y1-Y4 are (i_b, j_b), (i_b+1, j_b), (i_b, j_b+1) and (i_b+1, j_b+1).
Since the coordinates of the first feature point X are (i_a, j_a), there are j_a rows above the first feature point X in the first feature map, each row containing W_a feature points, and there are i_a feature points to the left of X in its own row, as shown in Fig. 12B. The first feature point therefore corresponds to column (j_a·W_a + i_a) of the relevance score matrix. The distribution within this column of the relevance scores of the 4 inherent feature points Y1-Y4 associated with the second feature point of Fig. 12A is discussed below.
For the inherent feature point Y1 (i_b, j_b): in the second feature map there are j_b rows above Y1, each row containing W_b feature points, and there are i_b feature points to the left of Y1 in its own row; therefore Y1 corresponds to row (j_b·W_b + i_b) of the relevance score matrix, and the corresponding relevance score is 0.1. The inherent feature point Y2 (i_b+1, j_b) corresponds to row (j_b·W_b + i_b + 1), with a relevance score of 0.3; the inherent feature point Y3 (i_b, j_b+1) corresponds to row ((j_b+1)·W_b + i_b), with a relevance score of 0.2; and the inherent feature point Y4 (i_b+1, j_b+1) corresponds to row ((j_b+1)·W_b + i_b + 1), with a relevance score of 0.5.
It can be seen that, within column (j_a·W_a + i_a), the relevance scores corresponding to the inherent feature points Y1 (i_b, j_b) and Y2 (i_b+1, j_b) are adjacent, the relevance scores corresponding to the inherent feature points Y3 (i_b, j_b+1) and Y4 (i_b+1, j_b+1) are also adjacent, and the relevance scores corresponding to Y2 (i_b+1, j_b) and Y3 (i_b, j_b+1) are separated by (W_b - 2) relevance scores. This distribution rule always holds among the positions of the relevance scores corresponding to the four inherent feature points Y1-Y4.
In the description of Figs. 12A-B above, the distribution, within the column corresponding to the first feature point, of the 4 inherent feature points associated with the second feature point was discussed on the assumption that the second feature point is known. In reality the second feature point is unknown; step 930 finds the second feature point by finding the 4 inherent feature points with which it is associated. Since the relative positions of the relevance scores corresponding to the four inherent feature points Y1-Y4 are fixed as described above, the 4 associated inherent feature points can be found with a sliding window, as shown in Figs. 13A-C.
In this embodiment, as shown in fig. 11, step 930 includes:
step 1110, setting a sliding window in a column corresponding to the first feature point selected in the association degree score matrix;
step 1120, sliding the sliding window in the column corresponding to the first feature point, and acquiring a second feature point according to four elements falling in the sliding window during the sliding of the sliding window.
A sliding window is a window that slides over the data; in general, certain processing is performed on the content falling inside the window as it slides. As shown in Figs. 13A-C, the sliding window may include a first sliding window element W1, a second sliding window element W2, a third sliding window element W3 and a fourth sliding window element W4. A sliding window element is a unit of the sliding window sized to cover exactly one relevance score in the relevance score matrix. Since the sliding window is applied only to column (j_a·W_a + i_a), each of W1-W4 covers one relevance score in that column. The second sliding window element W2 is immediately below the first sliding window element W1; the third sliding window element W3 is below the second sliding window element W2 and spaced from it by a number of rows equal to the number of columns W_b of the second feature map minus 2; and the fourth sliding window element W4 is immediately below the third sliding window element W3. This layout is consistent with the relative positions of the relevance scores corresponding to the four inherent feature points Y1-Y4 described above, so the four sliding window elements W1-W4 can cover the relevance scores of the four inherent feature points Y1-Y4.
The sliding window is then made to slide in the column corresponding to the first feature point. Specifically, the first sliding window element W1 may initially cover the uppermost relevance score of column (j_a·W_a + i_a), with the other three sliding window elements W2-W4 positioned accordingly by the rule above, and the four relevance scores falling in the four sliding window elements W1-W4 are obtained. Next, the sliding window elements W1-W4 are slid down by one window element side length at a time, and the four relevance scores falling in them are obtained again, until the fourth sliding window element W4 covers the lowest relevance score of column (j_a·W_a + i_a) and the window cannot slide down any further.
As shown in Fig. 13A, starting with the first sliding window element W1 covering the uppermost relevance score of column (j_a·W_a + i_a), the four relevance scores obtained are 0.1, 0.3, 0.2 and 0.5. Next, as shown in Fig. 13B, each sliding window element W1-W4 slides down by one window element side length, and the four relevance scores obtained are 0.3, 0.1, 0.5 and 0.1. Then, as shown in Fig. 13C, each sliding window element W1-W4 slides down by one more window element side length, and the four relevance scores obtained are 0.1, 0.2, 0.1 and 0.3. At this point the fourth sliding window element W4 covers the lowest relevance score of the column, and the sliding ends.
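The traversal of the window positions can be sketched as follows (Python/NumPy; the column of values is an assumption chosen so that the quadruples match those of Figs. 13A-C, since the full column is not given):

import numpy as np

def window_positions(column, W_b):
    """Enumerate, for every sliding position, the four relevance scores covered
    by the window elements W1-W4, which sit at rows r, r+1, r+W_b, r+W_b+1 of
    the column (W3 is spaced W_b - 2 rows below W2)."""
    for r in range(len(column) - W_b - 1):
        yield r, column[[r, r + 1, r + W_b, r + W_b + 1]]

# Assumed column of relevance scores for a second feature map with W_b = 4.
column = np.array([0.1, 0.3, 0.1, 0.2, 0.2, 0.5, 0.1, 0.3])
for r, quad in window_positions(column, W_b=4):
    print(r, quad)  # (0.1, 0.3, 0.2, 0.5), (0.3, 0.1, 0.5, 0.1), (0.1, 0.2, 0.1, 0.3)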
Then, the second feature point is acquired from the four elements falling in the sliding window during the sliding. In one embodiment, the second feature point is obtained from the sum of the four relevance scores falling in the sliding window. The reason is that if the center of the second feature point lies between the centers of its four associated inherent feature points Y1-Y4, i.e. the second feature point covers respective portions of those four inherent feature points, the sum of their relevance scores must be larger than the sum of the relevance scores of any other four inherent feature points in the same positional relationship on the second feature map; the second feature point can therefore be acquired from the sum of the four relevance scores falling in the sliding window at each position of the window.
Specifically, in this embodiment, as shown in fig. 14, step 1120 includes:
step 1410, determining, for each sliding position in the sliding process of the sliding window, a sum of four relevance scores falling in the sliding window;
step 1420, obtaining a second feature point based on four feature points corresponding to the four largest association scores in the second feature map.
As shown in Fig. 15A, when the first sliding window element W1 covers the uppermost relevance score of column (j_a·W_a + i_a), the sum of the four relevance scores falling in the sliding window is 0.1 + 0.3 + 0.2 + 0.5 = 1.1. After each sliding window element W1-W4 slides down by one window element side length, as shown in Fig. 15B, the sum of the four relevance scores falling in the sliding window is 0.3 + 0.1 + 0.5 + 0.1 = 1.0. After each sliding window element W1-W4 slides down by one more window element side length, as shown in Fig. 15C, the sum of the four relevance scores falling in the sliding window is 0.1 + 0.2 + 0.1 + 0.3 = 0.7. The sum of the four relevance scores falling in the sliding window is therefore largest at the sliding position of Fig. 15A, so in step 1420 the second feature point is acquired from the four feature points of the second feature map corresponding to these four relevance scores. The specific acquisition method is described in detail below.
An advantage of steps 1410-1420 is that the second feature point is obtained based on the sum of the four relevance scores falling in the sliding window at each sliding position. When the center of the second feature point lies between the centers of the four feature points corresponding to the four relevance scores falling in the sliding window, those four relevance scores are generally larger than the relevance scores of any four surrounding feature points in a similar positional relationship. Obtaining the second feature point by comparing the sums and taking the maximum is simple and feasible, and greatly reduces the cost of acquiring the second feature point.
The advantage of steps 1110-1120 is that, by means of the sliding window, all possible groups of four inherent feature points in the second feature map that satisfy the fixed positional relationship of the four inherent feature points associated with the second feature point are traversed, and the four associated inherent feature points used to acquire the second feature point are found among them. This maintains the accuracy of acquiring the second feature point while improving the acquisition efficiency.
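A minimal sketch of steps 1410-1420, assuming the same window layout and the example column used above (the helper name and values are illustrative assumptions), is:

import numpy as np

def best_quadruple(column, W_b):
    """Return the window position whose four relevance scores have the largest
    sum, together with those scores ordered as (S00, S10, S01, S11)."""
    best_r, best_quad = -1, None
    for r in range(len(column) - W_b - 1):
        quad = column[[r, r + 1, r + W_b, r + W_b + 1]]
        if best_quad is None or quad.sum() > best_quad.sum():
            best_r, best_quad = r, quad
    return best_r, best_quad

column = np.array([0.1, 0.3, 0.1, 0.2, 0.2, 0.5, 0.1, 0.3])
r, (s00, s10, s01, s11) = best_quadruple(column, W_b=4)
print(r, s00, s10, s01, s11)   # 0 0.1 0.3 0.2 0.5  (sums are 1.1, 1.0, 0.7)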
Further description is provided below with respect to a specific implementation of step 1420.
Further detailed description of step 1420
In general, feature points are extracted from pixels and have the same side length as a pixel. Therefore, the key to acquiring the second feature point in step 1420 is to determine the position of its center. If a feature point is regarded as a square, the center of the feature point is the center of the inscribed circle of that square.
In this embodiment, as shown in fig. 16, step 1420 includes:
in step 1610, any one of the four feature points is taken as a first reference feature point, a feature point located in the same row as the first reference feature point in the four feature points is taken as a second reference feature point, a feature point located in the same column as the first reference feature point in the four feature points is taken as a third reference feature point, and a feature point located in neither the same row nor the same column as the first reference feature point in the four feature points is taken as a fourth reference feature point;
Step 1620, using the center of the first reference feature point as an origin, the direction from the origin to the center of the second reference feature point as the forward direction of the horizontal axis, the direction from the origin to the center of the third reference feature point as the forward direction of the vertical axis, and the standard side length of a feature point as the unit of both axes, to establish a coordinate system;
step 1630, determining the center coordinates of the second feature point based on the relevance scores of the first reference feature point, the second reference feature point, the third reference feature point and the fourth reference feature point;
and step 1640, acquiring the second feature point based on the center coordinates of the second feature point and the standard side length of the feature point.
To determine the position of the center of the second feature point, a coordinate system is first established; steps 1610-1620 are the process of establishing a plane rectangular coordinate system. Step 1420 obtains the second feature point based on the four feature points of the second feature map corresponding to the largest four relevance scores, and for convenience of describing the coordinate system these four feature points are named first. As shown in Fig. 17A, among the four feature points Y1-Y4, suppose Y1 is selected as the first reference feature point. Then the feature point Y2 in the same row as Y1 is the second reference feature point, the feature point Y3 in the same column as Y1 is the third reference feature point, and the feature point Y4 in neither the same row nor the same column as Y1 is the fourth reference feature point. Note that the second, third and fourth reference feature points change with the choice of the first reference feature point: for example, if Y2 is selected as the first reference feature point, then Y1 becomes the second reference feature point, Y4 becomes the third reference feature point, and Y3 becomes the fourth reference feature point.
As shown in Fig. 17A, a plane rectangular coordinate system is established with the center of the first reference feature point Y1 as the origin, the direction from the origin to the center of the second reference feature point Y2 as the positive x axis, the direction from the origin to the center of the third reference feature point Y3 as the positive y axis, and the standard side length of a feature point as the unit of both axes. The coordinates of the centers of Y1, Y2, Y3 and Y4 are then (0, 0), (1, 0), (0, 1) and (1, 1), respectively, and the center of the second feature point Y lies within the square formed by these 4 points.
Next, in step 1630, the center coordinates of the second feature point are solved according to the relevance scores of the first to fourth reference feature points.
In one embodiment, step 1630 includes:
determining an abscissa of a center of the second feature point based on a sum of association scores of the second reference feature point and the fourth reference feature point and a sum of association scores of the first reference feature point and the third reference feature point;
and determining the ordinate of the center of the second feature point based on the sum of the association scores of the third reference feature point and the fourth reference feature point and the sum of the association scores of the first reference feature point and the second reference feature point.
The center coordinates of the second feature point consist of an abscissa and an ordinate, both between 0 and 1. The centers of Y1, Y2, Y3 and Y4, with coordinates (0, 0), (1, 0), (0, 1) and (1, 1), form a square whose center is (0.5, 0.5), and the second feature point lies within this square. The abscissa of the second and fourth reference feature points is 1, and the abscissa of the first and third reference feature points is 0. If the relevance scores of the first to fourth reference feature points are equal, the left and right sides of the square match the first feature point equally well, every position within the square is an equally good estimate of where the first feature point maps, and the abscissa of the second feature point is taken to be 0.5. If the relevance scores of the second and fourth reference feature points (abscissa 1) are generally greater than those of the first and third reference feature points (abscissa 0), the second feature point should be shifted from the center toward the right, because the right side is closer to the position to which the first feature point maps. The distance of this rightward shift can be determined from the sum of the relevance scores of the second and fourth reference feature points and the sum of the relevance scores of the first and third reference feature points: the more the former exceeds the latter, the further the abscissa of the second feature point shifts to the right. The following formula is therefore set:
x_T = 0.5 + (S_11 + S_10 - S_00 - S_01)/2    (Formula 1)

where x_T is the abscissa of the center of the second feature point, 0.5 is the abscissa of the center of the square formed by the centers of the first to fourth reference feature points, S_00 is the relevance score of the first reference feature point, S_10 is the relevance score of the second reference feature point, S_01 is the relevance score of the third reference feature point, S_11 is the relevance score of the fourth reference feature point, and (S_11 + S_10 - S_00 - S_01)/2 is the distance by which the abscissa of the second feature point is shifted from 0.5 in the positive x direction (a negative value means a shift in the negative x direction).

Similarly, the formula for the ordinate of the second feature point may be set as follows:

y_T = 0.5 + (S_11 + S_01 - S_00 - S_10)/2    (Formula 2)

where y_T is the ordinate of the center of the second feature point, 0.5 is the ordinate of the center of the square formed by the centers of the first to fourth reference feature points, S_00, S_10, S_01 and S_11 are defined as in Formula 1, and (S_11 + S_01 - S_00 - S_10)/2 is the distance by which the ordinate of the second feature point is shifted from 0.5 in the positive y direction (a negative value means a shift in the negative y direction).
For example, from Fig. 15A, S_00 = 0.1, S_10 = 0.3, S_01 = 0.2 and S_11 = 0.5. Substituting these into Formula 1 gives x_T = 0.5 + (0.5 + 0.3 - 0.1 - 0.2)/2 = 0.75, and substituting them into Formula 2 gives y_T = 0.5 + (0.5 + 0.2 - 0.1 - 0.3)/2 = 0.65. Therefore, as shown in Fig. 17B, the coordinates of the center of the second feature point are (0.75, 0.65).
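Formula 1 and Formula 2 can be checked with a short sketch (Python; the function name is an assumption for illustration):

def second_point_center(s00, s10, s01, s11):
    """Center of the second feature point in the local coordinate system whose
    origin is the center of the first reference feature point (Formulas 1, 2)."""
    x_t = 0.5 + (s11 + s10 - s00 - s01) / 2
    y_t = 0.5 + (s11 + s01 - s00 - s10) / 2
    return x_t, y_t

print(second_point_center(0.1, 0.3, 0.2, 0.5))   # approximately (0.75, 0.65)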
The above embodiment has the advantage that, in the square formed by the centers of the four reference feature points, the second and fourth reference feature points are the two reference feature points with the largest abscissa and the first and third reference feature points are the two with the smallest abscissa, so the sum of the relevance scores of the two points with the largest abscissa and the sum of the relevance scores of the two points with the smallest abscissa together determine how far the second feature point deviates left or right from the center position. The abscissa of the center of the second feature point is therefore determined from the sum of the relevance scores of the second and fourth reference feature points and the sum of the relevance scores of the first and third reference feature points. Similarly, the ordinate of the center of the second feature point is determined from the sum of the relevance scores of the third and fourth reference feature points and the sum of the relevance scores of the first and second reference feature points. In this way, the efficiency of acquiring the second feature point is improved.
Next, in step 1640, the second feature point is acquired based on the center coordinates of the second feature point and the standard side length of a feature point. As shown in Fig. 17B, with a standard side length of 1, the second feature point can be determined to be located in the square formed by (0.25, 0.15), (1.25, 0.15), (0.25, 1.15) and (1.25, 1.15).
The advantage of steps 1610-1640 is that the position of the center of the second feature point is related to the relevance scores of the first, second, third and fourth reference feature points, and those relevance scores determine how far the center of the second feature point deviates up, down, left and right. Determining the center coordinates of the second feature point from the relevance scores of the first to fourth reference feature points therefore greatly improves the efficiency of acquiring the second feature point.
Apparatus and device descriptions of embodiments of the present disclosure
It will be appreciated that, although the steps in the flowcharts above are shown in the order indicated by the arrows, they are not necessarily executed in that order. Unless explicitly stated in this specification, the order of the steps is not strictly limited and they may be performed in other orders. Moreover, at least some of the steps in the flowcharts may include several sub-steps or stages that are not necessarily performed at the same moment but may be performed at different moments; these sub-steps or stages need not be performed sequentially, and may be performed in turn or alternately with other steps or with at least part of the sub-steps or stages of other steps.
In the embodiments of the present application, when processing is performed on data related to the characteristics of a target object, such as attribute information or an attribute information set of the target object, the permission or consent of the target object is obtained first, and the collection, use and processing of such data comply with the relevant laws, regulations and standards of the relevant countries and regions. In addition, when an embodiment of the present application needs to obtain attribute information of a target object, the individual permission or individual consent of the target object is obtained, for example through a pop-up window or a jump to a confirmation page, and only after the individual permission or individual consent of the target object has been explicitly obtained is the target-object-related data necessary for the normal operation of the embodiment acquired.
Fig. 24 is a schematic structural view of a product defect recognition device 2400 provided in an embodiment of the present disclosure. The product defect recognition apparatus 2400 includes:
an acquisition unit 2410 for acquiring an image to be detected of a product and a reference image;
a generation unit 2420 for generating a first feature map of the image to be detected and a second feature map of the reference image;
a determining unit 2430 configured to determine a calibration matrix based on the first feature map and the second feature map;
A calibration unit 2440, configured to calibrate the first feature map by using the calibration matrix to obtain a third feature map;
a first identifying unit 2450 is configured to identify a defect of the product from the third feature map.
Alternatively, the determining unit 2430 is specifically configured to:
obtaining a set of matched feature point pairs, wherein the set of matched feature point pairs comprises a plurality of matched feature point pairs, each matched feature point pair comprises a first feature point and a second feature point, the first feature point is from a first feature map, and the second feature point is from a second feature map and corresponds to the first feature point;
based on the set of matching feature point pairs, a calibration matrix is determined.
Alternatively, the determining unit 2430 is specifically configured to:
acquiring a relevance score matrix based on a first feature map and a second feature map, wherein each feature point in the first feature map corresponds to a column of the relevance score matrix, each feature point in the second feature map corresponds to a row of the relevance score matrix, and an element in the relevance score matrix is equal to a relevance score of a corresponding feature point in the first feature map of the column where the element is located and a corresponding feature point in the second feature map of the row where the element is located;
Selecting a plurality of first feature points from the first feature map;
and acquiring second feature points corresponding to the first feature points in the second feature map based on the association degree score matrix aiming at each selected first feature point.
Alternatively, the determining unit 2430 is specifically configured to:
setting a sliding window in a column corresponding to a first characteristic point selected from the association degree score matrix, wherein the sliding window comprises a first sliding window element, a second sliding window element, a third sliding window element and a fourth sliding window element, each sliding window element in the first sliding window element, the second sliding window element, the third sliding window element and the fourth sliding window element covers one association degree score in the column, the second sliding window element is below the first sliding window element and is close to the first sliding window element, the number of rows of the third sliding window element, which is below the second sliding window element and is spaced from the second sliding window element, is the number of columns of the second characteristic diagram minus 2, and the fourth sliding window element is below the third sliding window element and is close to the third sliding window element;
and sliding the sliding window in the column corresponding to the first characteristic point, and acquiring a second characteristic point according to four elements falling in the sliding window in the sliding process of the sliding window.
Alternatively, the determining unit 2430 is specifically configured to:
Determining, for each sliding position during sliding of the sliding window, a sum of four relevance scores falling in the sliding window;
and acquiring a second feature point based on four feature points corresponding to the maximum four relevance scores in the second feature map.
Alternatively, the determining unit 2430 is specifically configured to:
taking any one of the four feature points as a first reference feature point, taking the feature point which is in the same row as the first reference feature point in the four feature points as a second reference feature point, taking the feature point which is in the same column as the first reference feature point in the four feature points as a third reference feature point, and taking the feature point which is not in the same row as the first reference feature point in the four feature points and is not in the same column as the first reference feature point as a fourth reference feature point;
establishing a coordinate system by taking the center of the first reference feature point as an origin, taking the direction of the origin pointing to the center of the second reference feature point as the forward direction of the transverse axis, taking the direction of the origin pointing to the center of the third reference feature point as the forward direction of the longitudinal axis, and taking the standard side length of the feature point as the unit of the transverse axis and the longitudinal axis;
determining center coordinates of the second feature points based on the relevance scores of the first reference feature points, the second reference feature points, the third reference feature points and the fourth reference feature points;
And acquiring the second characteristic point based on the center coordinates of the second characteristic point and the standard side length of the characteristic point.
Alternatively, the determining unit 2430 is specifically configured to:
determining an abscissa of a center of the second feature point based on a sum of association scores of the second reference feature point and the fourth reference feature point and a sum of association scores of the first reference feature point and the third reference feature point;
and determining the ordinate of the center of the second feature point based on the sum of the association scores of the third reference feature point and the fourth reference feature point and the sum of the association scores of the first reference feature point and the second reference feature point.
Alternatively, the determining unit 2430 is specifically configured to:
after a relevance score matrix is acquired based on the first feature map and the second feature map, inputting the first feature map and the second feature map into a self-attention model to generate a first relevance score adjustment value;
inputting the first feature map and the second feature map into an interactive attention model to generate a second relevance score adjustment value;
and adjusting the relevance score matrix by using the first relevance score adjustment value and the second relevance score adjustment value.
Alternatively, the determining unit 2430 is specifically configured to:
performing a first process a first number of times, the first process comprising: extracting a second number of matching feature point pairs from the set of matching feature point pairs; calculating a candidate calibration matrix according to the second number of extracted matching feature point pairs; determining a first error of the matching feature point pairs of the set of matching feature point pairs with respect to the candidate calibration matrix; and determining a third number of matching feature point pairs whose first error is less than a first threshold;
and taking, as the determined calibration matrix, the candidate calibration matrix calculated in the first process whose third number is largest among the first number of first processes.
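The first process described above can be sketched as the following random-sampling loop (Python/NumPy; estimate_matrix and projection_error are placeholders for the candidate-matrix solver and error measure of the embodiment, and are assumptions here):

import numpy as np

def select_calibration_matrix(pairs, first_number, second_number,
                              first_threshold, estimate_matrix, projection_error):
    """Perform the first process `first_number` times: sample `second_number`
    matching feature point pairs, fit a candidate calibration matrix, count the
    pairs whose first error is below `first_threshold` (the third number), and
    keep the candidate with the largest count."""
    best_matrix, best_count = None, -1
    for _ in range(first_number):
        idx = np.random.choice(len(pairs), size=second_number, replace=False)
        candidate = estimate_matrix([pairs[i] for i in idx])
        errors = np.array([projection_error(candidate, p) for p in pairs])
        count = int((errors < first_threshold).sum())
        if count > best_count:
            best_matrix, best_count = candidate, count
    return best_matrix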
Optionally, the first feature map includes a plurality of first feature maps of different feature extraction ratios, the second feature map includes a plurality of second feature maps of different feature extraction ratios, and the third feature map correspondingly includes a plurality of third feature maps of different feature extraction ratios;
the first identifying unit 2450 is specifically configured to:
inputting the plurality of third feature maps into a defect recognition model, and recognizing defects according to the plurality of third feature maps by the defect recognition model.
Optionally, the product defect recognition apparatus 2400 further includes:
a second identifying unit (not shown) for inputting the plurality of third feature maps into a defect classification model, identifying a type of defect for the identified defect by the defect classification model;
and a third recognition unit (not shown) for inputting the plurality of third feature maps into a defect contour recognition model, and recognizing the contour of the defect for the recognized defect by the defect contour recognition model.
Optionally, the product defect recognition apparatus 2400 further includes: a training unit (not shown) for jointly training the defect recognition model and the defect classification model by:
Constructing a sample image set, wherein the sample image set is provided with a plurality of sample images, the sample images are provided with first labels and second labels, the first labels indicate whether the sample images are provided with defects or not, and the second labels indicate the types of the defects;
generating a fourth feature map of the sample images of the sample image set;
identifying defects according to the fourth feature map by using the defect identification model, and classifying the identified defects according to the fourth feature map by using the defect classification model;
constructing an error function based on the comparison of the identification result of the defect identification model with the first label and the comparison of the classification result of the defect classification model with the second label;
and training the defect identification model and the defect classification model by utilizing an error function.
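A minimal sketch of one joint training step, assuming a PyTorch-style backbone and two heads (the module names, the loss functions and the equal loss weighting are assumptions, not specified by the embodiment), is:

import torch
import torch.nn.functional as F

def joint_training_step(backbone, recognition_head, classification_head,
                        optimizer, images, first_labels, second_labels):
    """One joint training step: shared feature maps feed both heads, and the
    error function combines the recognition loss (against the first label)
    and the classification loss (against the second label)."""
    feature_maps = backbone(images)                    # fourth feature map
    recognition_logits = recognition_head(feature_maps)
    classification_logits = classification_head(feature_maps)

    recognition_loss = F.binary_cross_entropy_with_logits(
        recognition_logits, first_labels.float())
    classification_loss = F.cross_entropy(classification_logits, second_labels)
    loss = recognition_loss + classification_loss      # combined error function

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()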
As described above, the defect identifying apparatus 150 may be embodied as a terminal or as a server. When embodied as a terminal, referring to Fig. 25, Fig. 25 is a block diagram of a portion of a terminal implementing the product defect identification method of an embodiment of the present disclosure. The terminal includes: radio frequency (RF) circuit 2510, memory 2515, input unit 2530, display unit 2540, sensor 2550, audio circuit 2560, wireless fidelity (WiFi) module 2570, processor 2580, and power supply 2590. It will be appreciated by those skilled in the art that the terminal structure shown in Fig. 25 does not constitute a limitation on the mobile phone or computer, which may include more or fewer components than shown, combine certain components, or adopt a different arrangement of components.
The RF circuit 2510 may be used for receiving and transmitting signals during a message exchange or a call; in particular, after receiving downlink information from a base station, it passes the information to the processor 2580 for processing, and it also sends uplink data to the base station.
The memory 2515 may be used to store software programs and modules, and the processor 2580 performs various functional applications and data processing of the terminal by executing the software programs and modules stored in the memory 2515.
The input unit 2530 may be used to receive input numerical or character information and generate key signal inputs related to setting and function control of the terminal. Specifically, the input unit 2530 may include a touch panel 2531 and other input devices 2532.
The display unit 2540 may be used to display input information or provided information and various menus of the terminal. The display unit 2540 may include a display panel 2541.
Audio circuitry 2560, speaker 2561, microphone 2562 can provide an audio interface.
In this embodiment, the processor 2580 included in the terminal may perform the product defect recognition method of the previous embodiment.
Terminals of embodiments of the present disclosure include, but are not limited to, cell phones, computers, intelligent voice interaction devices, intelligent home appliances, vehicle terminals, aircraft, and the like. The embodiment of the invention can be applied to various scenes, including but not limited to cloud technology, artificial intelligence, intelligent transportation, auxiliary driving and the like.
Fig. 26 is a block diagram of a portion of a server implementing the product defect identification method of an embodiment of the present disclosure. Servers may vary widely in configuration or performance, and may include one or more central processing units (Central Processing Units, CPU for short) 2622 (e.g., one or more processors), memory 2632, and one or more storage media 2630 (e.g., one or more mass storage devices) storing applications 2642 or data 2644. The memory 2632 and the storage media 2630 may be transient or persistent storage. The programs stored on a storage medium 2630 may include one or more modules (not shown), each of which may include a series of instruction operations for the server. Further, the central processing unit 2622 may be configured to communicate with the storage medium 2630 and execute, on the server, the series of instruction operations in the storage medium 2630.
The server(s) may also include one or more power supplies 2626, one or more wired or wireless network interfaces 2650, one or more input/output interfaces 2658, and/or one or more operating systems 2641, such as Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™, and the like.
Processor 2622 in the server may be used to perform the product defect identification method of the embodiments of the present disclosure.
The embodiments of the present disclosure also provide a computer-readable storage medium storing a program code for executing the product defect identification method of the foregoing embodiments.
The disclosed embodiments also provide a computer program product comprising a computer program. The processor of the computer device reads the computer program and executes it, causing the computer device to execute the product defect identification method described above.
The terms "first," "second," "third," "fourth," and the like in the description of the present disclosure and in the above-described figures, if any, are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that the embodiments of the disclosure described herein may be capable of operation in sequences other than those illustrated or described herein, for example. Furthermore, the terms "comprises," "comprising," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed or inherent to such process, method, article, or apparatus.
It should be understood that in this disclosure, "at least one" means one or more, and "a plurality" means two or more. "And/or" describes an association relationship between associated objects and indicates that three relationships may exist; for example, "A and/or B" may mean: only A exists, only B exists, or both A and B exist, where A and B may be singular or plural. The character "/" generally indicates that the associated objects before and after it are in an "or" relationship. "At least one of" and similar expressions refer to any combination of the listed items, including any combination of single items or plural items. For example, at least one (one) of a, b or c may mean: a, b, c, "a and b", "a and c", "b and c", or "a and b and c", where a, b and c may each be single or plural.
It should be understood that in the description of the embodiments of the present disclosure, "a plurality of" (or "multiple") means two or more, and that "greater than", "less than", "exceeding", etc. are understood as not including the stated number, while "above", "below", "within", etc. are understood as including the stated number.
In the several embodiments provided in the present disclosure, it should be understood that the disclosed systems, devices, and methods may be implemented in other manners. For example, the apparatus embodiments described above are merely illustrative, e.g., the division of elements is merely a logical functional division, and there may be additional divisions of actual implementation, e.g., multiple elements or components may be combined or integrated into another system, or some features may be omitted, or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed with each other may be an indirect coupling or communication connection via some interfaces, devices or units, which may be in electrical, mechanical or other form.
The units described as separate units may or may not be physically separate, and units shown as units may or may not be physical units, may be located in one place, or may be distributed over a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional unit in each embodiment of the present disclosure may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units may be implemented in hardware or in software functional units.
The integrated units, if implemented in the form of software functional units and sold or used as stand-alone products, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present disclosure may be embodied in essence or a part contributing to the prior art or all or part of the technical solution in the form of a software product stored in a storage medium, including several instructions to cause a computer device (which may be a personal computer, a server, or a network device, etc.) to perform all or part of the steps of the methods of the various embodiments of the present disclosure. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a random access Memory (Random Access Memory, RAM), a magnetic disk, or an optical disk, or other various media capable of storing program codes.
It should also be appreciated that the various implementations provided by the embodiments of the present disclosure may be arbitrarily combined to achieve different technical effects.
The above is a specific description of the embodiments of the present disclosure, but the present disclosure is not limited to the above embodiments, and various equivalent modifications and substitutions can be made by those skilled in the art without departing from the spirit of the present disclosure, and are included in the scope of the present disclosure as defined in the claims.

Claims (15)

1. An artificial intelligence-based product defect identification method is characterized by comprising the following steps:
acquiring an image to be detected and a reference image of a product;
generating a first feature map of the image to be detected and a second feature map of the reference image;
determining a calibration matrix based on the first feature map and the second feature map;
calibrating the first feature map by using the calibration matrix to obtain a third feature map;
and identifying defects of the product from the third characteristic diagram.
2. The method of claim 1, wherein the determining a calibration matrix based on the first feature map and the second feature map comprises:
Obtaining a set of matched feature point pairs, wherein the set of matched feature point pairs comprises a plurality of matched feature point pairs, each matched feature point pair comprises a first feature point and a second feature point, the first feature point is from the first feature map, and the second feature point is from the second feature map and corresponds to the first feature point;
and determining the calibration matrix based on the matched characteristic point pair set.
3. The method of claim 2, wherein the obtaining the set of matching feature point pairs comprises:
acquiring a relevance score matrix based on the first feature map and the second feature map, wherein each feature point in the first feature map corresponds to a column of the relevance score matrix, each feature point in the second feature map corresponds to a row of the relevance score matrix, and an element in the relevance score matrix is equal to a feature point corresponding to a column of the element in the first feature map and a relevance score of a feature point corresponding to a row of the element in the second feature map;
selecting a plurality of first feature points from the first feature map;
And acquiring second feature points corresponding to the first feature points in the second feature map based on the relevance score matrix for each selected first feature point.
4. A method according to claim 3, wherein, for each selected first feature point, based on the association score matrix, obtaining a second feature point corresponding to the first feature point in the second feature map includes:
setting a sliding window in a column corresponding to the first feature point selected in the association score matrix, wherein the sliding window comprises a first sliding window element, a second sliding window element, a third sliding window element and a fourth sliding window element, each sliding window element in the first sliding window element, the second sliding window element, the third sliding window element and the fourth sliding window element covers one association score in the column, the second sliding window element is below the first sliding window element and is close to the first sliding window element, the number of rows of the third sliding window element, which is below the second sliding window element and is spaced from the second sliding window element, is the column number of the second feature map minus 2, and the fourth sliding window element is below the third sliding window element and is close to the third sliding window element;
And enabling the sliding window to slide in the column corresponding to the first characteristic point, and acquiring the second characteristic point according to four elements falling in the sliding window in the sliding process of the sliding window.
5. The method of claim 4, wherein the obtaining the second feature point from four elements falling in the sliding window during sliding of the sliding window includes:
determining, for each sliding position during sliding of the sliding window, a sum of four relevance scores falling in the sliding window;
and acquiring the second feature points based on the four feature points corresponding to the maximum four relevance scores in the second feature map.
6. The method of claim 5, wherein the obtaining the second feature point based on the four feature points corresponding to the four highest relevance scores in the second feature map includes:
taking any one of the four feature points as a first reference feature point, wherein the feature point which is positioned in the same row as the first reference feature point in the four feature points is a second reference feature point, the feature point which is positioned in the same column as the first reference feature point in the four feature points is a third reference feature point, and the feature point which is positioned in neither the same row nor the same column as the first reference feature point in the four feature points is a fourth reference feature point;
Establishing a coordinate system by taking the center of the first reference feature point as an origin, taking the direction of the origin pointing to the center of the second reference feature point as the forward direction of a transverse axis, taking the direction of the origin pointing to the center of the third reference feature point as the forward direction of a longitudinal axis, and taking the standard side length of the feature point as the unit of the transverse axis and the longitudinal axis;
determining center coordinates of the second feature points based on relevance scores of the first reference feature points, the second reference feature points, the third reference feature points and the fourth reference feature points;
and acquiring the second characteristic point based on the center coordinates of the second characteristic point and the standard side length of the characteristic point.
7. The method of claim 6, wherein the determining the center coordinates of the second feature point based on the relevance scores of the first reference feature point, the second reference feature point, the third reference feature point, and the fourth reference feature point comprises:
determining an abscissa of the center based on a sum of association scores of the second reference feature point and the fourth reference feature point and a sum of association scores of the first reference feature point and the third reference feature point;
And determining the ordinate of the center based on the sum of the relevance scores of the third reference feature point and the fourth reference feature point and the sum of the relevance scores of the first reference feature point and the second reference feature point.
8. The method of claim 3, wherein after obtaining a relevance score matrix based on the first feature map and the second feature map, the method further comprises:
inputting the first feature map and the second feature map into a self-attention model to generate a first relevance score adjustment value;
inputting the first feature map and the second feature map into an interactive attention model to generate a second relevance score adjustment value;
and adjusting the relevance score matrix by using the first relevance score adjustment value and the second relevance score adjustment value.
9. The method of claim 2, wherein the determining the calibration matrix based on the set of matched pairs of feature points comprises:
performing a first number of times a first process, the first process comprising: extracting a second number of matching feature point pairs from the set of matching feature point pairs; calculating a candidate calibration matrix according to the second number of the matched characteristic point pairs; determining a first error of the matching feature point pairs of the set of matching feature point pairs on the candidate calibration matrix; determining a third number of matched feature point pairs for which the first error is less than a first threshold;
And taking the candidate calibration matrix calculated in the first process with the largest third number in the first process as the determined calibration matrix.
10. The method of claim 1, wherein the first feature map comprises a plurality of first feature maps of different feature extraction scales, the second feature map comprises a plurality of second feature maps of the different feature extraction scales, and correspondingly the third feature map comprises a plurality of third feature maps of the different feature extraction scales;
the identifying the defect of the product from the third feature map comprises:
inputting the third feature images into a defect recognition model, and recognizing defects according to the third feature images by the defect recognition model.
11. The method of claim 10, wherein after inputting a plurality of the third feature maps into a defect recognition model, recognizing a defect from the plurality of the third feature maps by the defect recognition model, the method further comprises:
inputting a plurality of third feature maps into a defect classification model, and identifying the type of the defect aiming at the identified defect by the defect classification model;
Inputting a plurality of third feature maps into a defect contour recognition model, and recognizing the contour of the defect aiming at the recognized defect by the defect contour recognition model.
12. The method of claim 11, wherein the defect recognition model and the defect classification model are jointly trained by:
constructing a sample image set having a plurality of sample images, the sample images having a first label indicating whether the sample image has a defect and a second label indicating a type of the defect;
generating a fourth feature map of sample images of the sample image set;
identifying defects according to the fourth feature map by using the defect identification model, and classifying the identified defects according to the fourth feature map by using the defect classification model;
constructing an error function based on the comparison of the identification result of the defect identification model and the first label and the comparison of the classification result of the defect classification model and the second label;
and training the defect identification model and the defect classification model by utilizing the error function.
13. An artificial intelligence based product defect identification device, comprising:
the acquisition unit is used for acquiring an image to be detected and a reference image of the product;
the generation unit is used for generating a first characteristic image of the image to be detected and a second characteristic image of the reference image;
a determining unit configured to determine a calibration matrix based on the first feature map and the second feature map;
the calibration unit is used for calibrating the first characteristic diagram by utilizing the calibration matrix to obtain a third characteristic diagram;
and the first identification unit is used for identifying the defects of the product from the third characteristic diagram.
14. An electronic device comprising a memory and a processor, the memory storing a computer program, characterized in that the processor implements the artificial intelligence based product defect identification method of any one of claims 1 to 12 when executing the computer program.
15. A computer readable storage medium storing a computer program, wherein the computer program when executed by a processor implements the artificial intelligence based product defect identification method of any one of claims 1 to 12.
CN202211433182.2A 2022-11-16 2022-11-16 Product defect identification method based on artificial intelligence, related device and medium Pending CN116958021A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211433182.2A CN116958021A (en) 2022-11-16 2022-11-16 Product defect identification method based on artificial intelligence, related device and medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211433182.2A CN116958021A (en) 2022-11-16 2022-11-16 Product defect identification method based on artificial intelligence, related device and medium

Publications (1)

Publication Number Publication Date
CN116958021A true CN116958021A (en) 2023-10-27

Family

ID=88460744

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211433182.2A Pending CN116958021A (en) 2022-11-16 2022-11-16 Product defect identification method based on artificial intelligence, related device and medium

Country Status (1)

Country Link
CN (1) CN116958021A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117152027A (en) * 2023-10-31 2023-12-01 广东中科凯泽信息科技有限公司 Intelligent telescope based on image processing and artificial intelligent recognition
CN117152027B (en) * 2023-10-31 2024-02-09 广东中科凯泽信息科技有限公司 Intelligent telescope based on image processing and artificial intelligent recognition
CN118152203A (en) * 2024-05-10 2024-06-07 三一智造(深圳)有限公司 Board card automatic test method based on board card structure and test data association analysis
CN118152203B (en) * 2024-05-10 2024-07-23 三一智造(深圳)有限公司 Board card automatic test method based on board card structure and test data association analysis

Similar Documents

Publication Publication Date Title
CN108009515B (en) Power transmission line positioning and identifying method of unmanned aerial vehicle aerial image based on FCN
CN116958021A (en) Product defect identification method based on artificial intelligence, related device and medium
CN110334708A (en) Difference automatic calibrating method, system, device in cross-module state target detection
JP7059883B2 (en) Learning device, image generator, learning method, and learning program
CA3066029A1 (en) Image feature acquisition
CN112200045A (en) Remote sensing image target detection model establishing method based on context enhancement and application
CN111310826B (en) Method and device for detecting labeling abnormality of sample set and electronic equipment
CN109272546B (en) Fry length measuring method and system
CN113095370A (en) Image recognition method and device, electronic equipment and storage medium
CN110415214A (en) Appearance detecting method, device, electronic equipment and the storage medium of camera module
WO2021169049A1 (en) Method for glass detection in real scene
CN111738036A (en) Image processing method, device, equipment and storage medium
CN111461113A (en) Large-angle license plate detection method based on deformed plane object detection network
CN111242026A (en) Remote sensing image target detection method based on spatial hierarchy perception module and metric learning
CN118154603A (en) Display screen defect detection method and system based on cascading multilayer feature fusion network
CN116805387B (en) Model training method, quality inspection method and related equipment based on knowledge distillation
CN113139540B (en) Backboard detection method and equipment
CN112669452B (en) Object positioning method based on convolutional neural network multi-branch structure
CN115205793B (en) Electric power machine room smoke detection method and device based on deep learning secondary confirmation
CN111881934A (en) Method for discovering spatial relationship between attributes and categories of electronic components
CN109443697B (en) Optical center testing method, device, system and equipment
CN117011216A (en) Defect detection method and device, electronic equipment and storage medium
CN115270841A (en) Bar code detection method and device, storage medium and computer equipment
CN114418969A (en) Defect detection method, device, equipment and storage medium
CN113989632A (en) Bridge detection method and device for remote sensing image, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination