CN117456291B - Defect classification method and device, electronic equipment and storage medium

Info

Publication number
CN117456291B
CN117456291B (application CN202311803304.7A)
Authority
CN
China
Prior art keywords
image
defect
sample images
sample
prompt information
Prior art date
Legal status
Active
Application number
CN202311803304.7A
Other languages
Chinese (zh)
Other versions
CN117456291A (en)
Inventor
韩晓
徐海俊
Current Assignee
Suzhou Mega Technology Co Ltd
Original Assignee
Suzhou Mega Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Suzhou Mega Technology Co Ltd
Priority to CN202311803304.7A
Publication of CN117456291A
Application granted
Publication of CN117456291B
Legal status: Active


Classifications

    • G06V 10/764 — Image or video recognition or understanding using pattern recognition or machine learning: classification, e.g. of video objects
    • G06T 7/0004 — Image analysis: inspection of images; industrial image inspection
    • G06V 10/25 — Image preprocessing: determination of region of interest [ROI] or a volume of interest [VOI]
    • G06V 10/454 — Local feature extraction: integrating the filters into a hierarchical structure, e.g. convolutional neural networks [CNN]
    • G06V 10/7753 — Generating sets of training patterns: incorporation of unlabelled data, e.g. multiple instance learning [MIL]
    • G06V 10/806 — Fusion of extracted features at the sensor, preprocessing, feature extraction or classification level
    • G06V 10/82 — Image or video recognition or understanding using neural networks
    • G06V 20/70 — Scenes; scene-specific elements: labelling scene content, e.g. deriving syntactic or semantic representations
    • G06T 2207/20081 — Special algorithmic details: training; learning
    • G06T 2207/20084 — Special algorithmic details: artificial neural networks [ANN]
    • G06T 2207/30148 — Subject of image: industrial image inspection; semiconductor; IC; wafer
    • G06V 2201/06 — Recognition of objects for industrial automation
    • G06V 2201/07 — Target detection

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Computation (AREA)
  • Databases & Information Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Medical Informatics (AREA)
  • Software Systems (AREA)
  • Computing Systems (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biodiversity & Conservation Biology (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Quality & Reliability (AREA)
  • Image Analysis (AREA)

Abstract

An embodiment of the invention provides a defect classification method and apparatus, an electronic device, and a storage medium. The method comprises: inputting an image to be inspected and first position prompt information into a feature extraction network of a defect classification model to obtain image features and a position identification feature; and inputting the image features and the position identification feature into a fully connected neural network of the defect classification model to obtain a defect classification result. The defect classification model is trained on a target sample data set comprising labeled data and unlabeled data. The labeled data comprise a plurality of first sample images with corresponding second position prompt information and annotation labels; the unlabeled data comprise a plurality of second sample images with corresponding third position prompt information. The second position prompt information and/or the third position prompt information are obtained by labeling the positions of defect regions in the corresponding sample images with a semi-automatic labeling network. The method reduces manual intervention and improves defect classification efficiency.

Description

Defect classification method and device, electronic equipment and storage medium
Technical Field
The present invention relates to the field of image processing technology, and more particularly, to a defect classification method, a defect classification apparatus, an electronic device, and a storage medium.
Background
In semiconductor manufacturing, semiconductors must be inspected to determine whether defects are present and, if so, of what type. Taking a semiconductor wafer as an example, in the prior art the wafer is generally inspected by automated optical inspection, and the manufacturing process is optimized according to the types of defects detected. However, wafer defects come in many varieties, and their morphology often differs considerably across wafer types or process steps, so the presence and type of defects must then be determined either by manual inspection or by a deep learning classification algorithm with a fixed set of classes. Manual inspection is costly and inefficient. A fixed-class deep learning classification algorithm, on the other hand, typically requires a large amount of labeled data for training. Moreover, because defect types are so varied, it is difficult to label all of them at once, so data must be collected and manually labeled continually. When a new defect class appears, whether the new class meets the requirements must also be checked manually. The whole process is long, involves substantial manual intervention, and is inefficient.
Disclosure of Invention
The present invention has been made in view of the above-described problems. The invention provides a defect classification method, a defect classification device, an electronic device and a storage medium.
According to an aspect of the present invention, there is provided a defect classification method, the method comprising: inputting an image to be inspected and first position prompt information into a feature extraction network of a defect classification model to obtain image features and a position identification feature of the image to be inspected, wherein the first position prompt information indicates the position of a defect region in the image to be inspected; and inputting the image features and the position identification feature into a fully connected neural network of the defect classification model to obtain a defect classification result, wherein the defect classification result indicates the category to which the defect region in the image to be inspected belongs. The defect classification model is trained on a target sample data set comprising labeled data and unlabeled data. The labeled data comprise a plurality of first sample images together with second position prompt information and an annotation label corresponding to each first sample image, the annotation label indicating the category to which the defect region in the corresponding first sample image belongs. The unlabeled data comprise a plurality of second sample images together with third position prompt information corresponding to each second sample image. The second position prompt information and/or the third position prompt information are obtained by labeling the positions of defect regions in the corresponding sample images with a semi-automatic labeling network.
Illustratively, the defect classification model is trained by: acquiring a target sample data set; and training the defect classification model on the target sample data set. Acquiring the target sample data set comprises: acquiring a plurality of first sample images; inputting the plurality of first sample images into the semi-automatic labeling network to determine the positions of the defect regions in each first sample image; labeling those positions to obtain the second position prompt information corresponding to each first sample image; acquiring the annotation label corresponding to each first sample image; acquiring a plurality of second sample images; inputting the plurality of second sample images into the semi-automatic labeling network to determine the positions of the defect regions in each second sample image; and labeling those positions to obtain the third position prompt information corresponding to each second sample image.
Illustratively, inputting the plurality of first sample images into the semi-automatic labeling network to determine the positions of the defect regions in each first sample image comprises: inputting the plurality of first sample images, together with the initial position prompt information corresponding to each of them, into the semi-automatic labeling network to determine the positions of the defect regions. Likewise, inputting the plurality of second sample images into the semi-automatic labeling network to determine the positions of the defect regions in each second sample image comprises: inputting the plurality of second sample images, together with the initial position prompt information corresponding to each of them, into the semi-automatic labeling network to determine the positions of the defect regions.
Illustratively, the initial position prompt information corresponding to any first sample image or any second sample image is obtained by: matching the first sample image or second sample image against a template image, wherein the sample images contain a target object and the template image is an image of the target object without defects; and determining any region of the first sample image or second sample image that is inconsistent with the template image as a defect region, thereby obtaining the initial position prompt information.
Alternatively, the initial position prompt information corresponding to any first sample image or any second sample image may be obtained by acquiring initial position prompt information input by a user.
Illustratively, acquiring the annotation labels corresponding to the plurality of first sample images comprises: when the plurality of first sample images are input into the semi-automatic labeling network, also determining the category of the defect region in each first sample image, thereby obtaining the annotation label corresponding to each first sample image.
Illustratively, training the defect classification model on the target sample data set comprises performing the following model training operations on the target sample data set: training the defect classification model with the labeled data to obtain an initially trained defect classification model; inputting the plurality of second sample images and their corresponding third position prompt information into the initially trained defect classification model to obtain a first prediction classification result for each second sample image, the first prediction classification result indicating the category to which the defect region in the corresponding second sample image belongs; obtaining pseudo labels for at least some of the plurality of second sample images based on their first prediction classification results; inputting the plurality of first sample images and their corresponding second position prompt information into the initially trained defect classification model to obtain a second prediction classification result for each first sample image, the second prediction classification result indicating the category to which the defect region in the corresponding first sample image belongs; inputting the at least some second sample images and their corresponding third position prompt information into the initially trained defect classification model to obtain a third prediction classification result for each of them, the third prediction classification result indicating the category to which the defect region in the corresponding second sample image belongs; calculating a prediction loss value based on the differences between the annotation labels of the first sample images and the second prediction classification results, and between the pseudo labels of the at least some second sample images and the third prediction classification results; and optimizing the parameters of the initially trained defect classification model based on the prediction loss value to obtain the trained defect classification model corresponding to the current round of training.
Illustratively, the image to be inspected comprises a plurality of first images to be inspected, and the training operation further comprises: after the current round of training is completed, obtaining a fourth prediction classification result, predicted by the trained defect classification model, for each of at least one test image, the fourth prediction classification result indicating the category to which the defect region in the corresponding test image belongs, wherein the at least one test image comprises one or more of: at least some of the plurality of second sample images; the image to be inspected; and a new image other than the plurality of second sample images and the image to be inspected; outputting the fourth prediction classification results corresponding to the at least one test image; and, in response to verification information input by a user, performing a corresponding operation on the fourth prediction classification results and/or the test images.
Illustratively, performing a corresponding operation on the fourth prediction classification results and/or the test images in response to the verification information input by the user comprises one or more of the following, illustrated by the sketch after this paragraph: for any test image assigned a known category by the fourth prediction classification result, deleting the test image if the verification information includes deletion information for that test image, a known category being a category indicated by an annotation label in the labeled data; for any new category appearing in the fourth prediction classification results, deleting the new category from the results if the verification information includes deletion information for that category, a new category being a category different from every known category; for any new category appearing in the fourth prediction classification results, merging the new category with the known category designated by the merging information if the verification information includes merging information for that category; and, for any new category appearing in the fourth prediction classification results, adding the new category and its corresponding test images to the labeled data if the verification information includes addition information for that category.
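As a hedged illustration of how these verification operations might be applied in code (the operation names, data structures, and field names below are assumptions made for this sketch, not taken from the patent):

```python
# A minimal sketch of applying user verification information to the fourth
# prediction classification results. `results` maps a test-image id to its
# predicted category; all field names are illustrative assumptions.
def apply_verification(check_items, results, labeled_data, known_classes):
    for item in check_items:
        if item["op"] == "delete_image":
            # delete a test image that was assigned a known category
            results.pop(item["image_id"], None)
        elif item["op"] == "delete_class":
            # remove a new category from the prediction results
            results = {k: v for k, v in results.items() if v != item["new_class"]}
        elif item["op"] == "merge":
            # merge a new category into the known category the user designated
            results = {k: (item["known_class"] if v == item["new_class"] else v)
                       for k, v in results.items()}
        elif item["op"] == "add":
            # add the new category and its test images to the labeled data
            known_classes.add(item["new_class"])
            labeled_data.update(
                {k: v for k, v in results.items() if v == item["new_class"]})
    return results, labeled_data, known_classes
```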
Illustratively, the first position prompt information, the second position prompt information, and the third position prompt information each include one or more of: target frame prompt information corresponding to a target detection frame enclosing the defect region; mask prompt information corresponding to a mask of the defect region; and identification point prompt information corresponding to one or more identification points inside the defect region. Correspondingly, the position identification feature includes one or more of: a target frame encoding feature corresponding to the target detection frame enclosing the defect region; a mask encoding feature corresponding to the mask of the defect region; and an identification point encoding feature corresponding to the one or more identification points inside the defect region.
Illustratively, the defect classification model further comprises an attention network, and before inputting the image features and the position identification feature into the fully connected neural network to obtain the defect classification result, the method further comprises: inputting the image features and the position identification feature into the attention network for feature fusion to obtain a fusion feature. Inputting the image features and the position identification feature into the fully connected neural network then comprises: inputting the fusion feature into the fully connected neural network to obtain the defect classification result.
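A minimal sketch of such an attention-based fusion in PyTorch, assuming the image features form a token sequence (e.g. ViT patch tokens) and the position identification feature acts as the query; all dimensions are illustrative, not taken from the patent:

```python
import torch
import torch.nn as nn

# Cross-attention fusion: the position identification feature attends over the
# image feature tokens, yielding a fusion feature for the FC classifier.
attention = nn.MultiheadAttention(embed_dim=256, num_heads=8, batch_first=True)

image_tokens = torch.randn(1, 196, 256)    # image features as a token sequence
position_feature = torch.randn(1, 1, 256)  # position identification feature

fused, _ = attention(query=position_feature, key=image_tokens, value=image_tokens)
fusion_feature = fused.squeeze(1)          # (1, 256), input to the FC network
```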
Illustratively, the feature extraction network comprises an image encoding module and a position encoding module, and inputting the image to be inspected and the first position prompt information into the feature extraction network of the defect classification model to obtain the image features and the position identification feature comprises: inputting the image to be inspected into the image encoding module to obtain the image features; and inputting the first position prompt information into the position encoding module to obtain the position identification feature.
Illustratively, the image encoding module is trained in an unsupervised manner, and its parameters are kept fixed while the defect classification model is trained.
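In PyTorch this can be a few lines; the sketch below assumes the model exposes its image encoding module as `image_encoder` (an illustrative name):

```python
import torch
import torch.nn as nn

def freeze_image_encoder(model: nn.Module) -> torch.optim.Optimizer:
    """Fix the pre-trained image encoding module (assumed to be exposed as
    `model.image_encoder`, an illustrative name) and build an optimizer
    over the remaining trainable parameters."""
    for param in model.image_encoder.parameters():
        param.requires_grad = False  # keep the image encoder fixed
    return torch.optim.Adam(
        (p for p in model.parameters() if p.requires_grad), lr=1e-4)
```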
Illustratively, the defect classification model is an open world semi-supervised classification model.
Illustratively, the image to be inspected is a wafer image containing a wafer, and the defect region is a defect region on the wafer.
According to another aspect of the present invention, there is also provided a defect classification apparatus, comprising: a first input module, configured to input an image to be inspected and first position prompt information into a feature extraction network of a defect classification model to obtain image features and a position identification feature of the image to be inspected, the first position prompt information indicating the position of a defect region in the image to be inspected; and a second input module, configured to input the image features and the position identification feature into a fully connected neural network of the defect classification model to obtain a defect classification result indicating the category to which the defect region in the image to be inspected belongs. The defect classification model is trained on a target sample data set comprising labeled data and unlabeled data: the labeled data comprise a plurality of first sample images together with second position prompt information and an annotation label corresponding to each first sample image, the annotation label indicating the category to which the defect region in the corresponding first sample image belongs; the unlabeled data comprise a plurality of second sample images together with third position prompt information corresponding to each second sample image; and the second position prompt information and/or the third position prompt information are obtained by labeling the positions of defect regions in the corresponding sample images with a semi-automatic labeling network.
According to yet another aspect of the present invention, there is also provided an electronic device comprising a processor and a memory, the memory having stored therein computer program instructions which, when executed by the processor, are adapted to carry out the defect classification method described above.
According to a further aspect of the present invention, there is also provided a storage medium storing a computer program or instructions which, when executed, perform the defect classification method described above.
With the defect classification method and apparatus, electronic device, and storage medium according to embodiments of the present invention, the image features and position identification feature of the image to be inspected are obtained by inputting the image to be inspected and the first position prompt information into the feature extraction network of the defect classification model, and the defect classification result is obtained by inputting those features into the fully connected neural network of the model. Because the position prompt information contained in the labeled and/or unlabeled data of the target sample data set is obtained with a semi-automatic labeling network, and the defect classification model is trained on that data set, the model's attention to the region of interest is strengthened, and the defect regions of images to be inspected can be classified with relatively little labeled data. The scheme also reduces manual intervention and improves defect classification efficiency.
Drawings
The above and other objects, features, and advantages of the present invention will become more apparent from the following detailed description of embodiments of the present invention taken in conjunction with the accompanying drawings. The accompanying drawings provide a further understanding of embodiments of the invention; they are incorporated in and constitute a part of this specification, illustrate the invention together with its embodiments, and do not limit the invention. In the drawings, like reference numerals generally denote like parts or steps.
FIG. 1 shows a schematic flow chart of a defect classification method according to an embodiment of the invention;
FIG. 2 shows a schematic diagram of obtaining prompt information using a semi-automatic labeling network according to an embodiment of the present invention;
FIG. 3 illustrates a defect classification model training schematic in accordance with an embodiment of the invention;
FIG. 4 shows a schematic diagram of acquiring fusion features according to one embodiment of the invention;
FIG. 5 shows a schematic diagram of a defect classification model training process according to another embodiment of the invention;
FIG. 6 shows a schematic block diagram of a defect classification apparatus according to an embodiment of the invention; and
Fig. 7 shows a schematic block diagram of an electronic device according to an embodiment of the invention.
Detailed Description
To make the objects, technical solutions, and advantages of the present invention more apparent, exemplary embodiments according to the present invention are described in detail below with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention, and it should be understood that the present invention is not limited by the exemplary embodiments described herein. All other embodiments obtained by a person skilled in the art, based on the embodiments described in this application and without inventive effort, shall fall within the scope of the invention.
In order to at least partially solve the above-mentioned problems, an embodiment of the present invention provides a defect classification method. FIG. 1 shows a schematic flow chart of a defect classification method 100 according to one embodiment of the invention. As shown in FIG. 1, the method 100 may include the following steps S110 and S120.
Step S110: input the image to be inspected and the first position prompt information into the feature extraction network of the defect classification model to obtain the image features and position identification feature of the image to be inspected, wherein the first position prompt information indicates the position of the defect region in the image to be inspected.
The image to be inspected may be any type of image containing a target object. The target object may be any type of object, such as a wafer, a character, or an electronic component. The image to be inspected may be a static image or any video frame of a dynamic video. It may be an original image acquired by an image acquisition device (for example, an image sensor in a camera), or an image obtained by preprocessing (e.g., digitizing, normalizing, or smoothing) such an original image.
The first position prompt information indicates the position of the defect region in the image to be inspected. It may be labeled manually, or obtained by detecting the defect region with a defect detection model. For example, the first position prompt information may be any type of prompt, such as a detection frame (bounding box), a mask, or position points corresponding to the defect region; the present invention is not limited in this respect. Inputting the image to be inspected and the first position prompt information into the feature extraction network of the defect classification model yields the image features and position identification feature of the image to be inspected. The defect classification model may be any classification model.
Illustratively, the defect classification model is an open-world semi-supervised classification model. In one embodiment of the invention, the defect classification model may be an Open-World Semi-Supervised Learning (OWSSL) model, such as Towards Realistic Semi-Supervised Learning (TRSSL).
Inputting the image to be inspected and the first position prompt information into the feature extraction network of the defect classification model yields the image features and position identification feature of the image to be inspected. By way of example and not limitation, the feature extraction network may be implemented with a Vision Transformer (ViT), a convolutional neural network (CNN), or the like.
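A minimal sketch of such a feature extraction network in PyTorch, assuming a bounding-box position prompt; the tiny convolutional image encoder stands in for a ViT or CNN backbone, and all module names and dimensions are illustrative assumptions:

```python
import torch
import torch.nn as nn

class FeatureExtractionNetwork(nn.Module):
    def __init__(self, embed_dim: int = 256):
        super().__init__()
        # Image encoding module: a tiny CNN stands in for a ViT/CNN backbone.
        self.image_encoder = nn.Sequential(
            nn.Conv2d(3, 64, kernel_size=7, stride=4, padding=3),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
            nn.Linear(64, embed_dim),
        )
        # Position encoding module: maps a normalized box (x1, y1, x2, y2)
        # to a position identification feature.
        self.prompt_encoder = nn.Sequential(
            nn.Linear(4, 64),
            nn.ReLU(),
            nn.Linear(64, embed_dim),
        )

    def forward(self, image: torch.Tensor, box_prompt: torch.Tensor):
        image_feat = self.image_encoder(image)           # (B, embed_dim)
        position_feat = self.prompt_encoder(box_prompt)  # (B, embed_dim)
        return image_feat, position_feat
```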
Step S120: input the image features and the position identification feature into the fully connected neural network of the defect classification model to obtain a defect classification result indicating the category to which the defect region in the image to be inspected belongs. The defect classification model is trained on a target sample data set comprising labeled data and unlabeled data: the labeled data comprise a plurality of first sample images together with the second position prompt information and annotation label corresponding to each first sample image, the annotation label indicating the category to which the defect region in the corresponding first sample image belongs; the unlabeled data comprise a plurality of second sample images together with the third position prompt information corresponding to each second sample image; and the second position prompt information and/or the third position prompt information are obtained by labeling the positions of the defect regions in the corresponding sample images with a semi-automatic labeling network.
Illustratively, the defect classification model may include a fully connected neural network. Inputting the image features and the position identification feature into the fully connected neural network yields the defect classification result, which indicates the category to which the defect region in the image to be inspected belongs. Such categories may include, but are not limited to, scratches, missing material, particles, and the like.
Illustratively, the defect classification model is obtained by training on a target sample data set. In one embodiment, the target sample data set includes labeled data and unlabeled data. The labeled data include a plurality of first sample images, each with its own second position prompt information and annotation label. Like the first position prompt information, the second position prompt information may include any one or more position prompts such as a labeling frame, a mask, or identification points corresponding to the defect region in the first sample image. The annotation label indicates the category to which the defect region in the corresponding first sample image belongs. The unlabeled data include a plurality of second sample images and the third position prompt information corresponding to each second sample image. The plurality of second sample images may include an initial second sample image together with images obtained from it by several rounds of data enhancement. Data enhancement may include, but is not limited to, flipping, rotating, or cropping the initial second sample image. For example, if there are 100 initial second sample images U, two rounds of data enhancement yield the images U1 after the first enhancement and the images U2 after the second enhancement, so the plurality of second sample images may include U, U1, and U2; a sketch of this follows below. The third position prompt information is similar to the first and second position prompt information described above and, for brevity, is not described again here. The second position prompt information and/or the third position prompt information may be obtained by labeling the positions of the defect regions in the corresponding sample images with the semi-automatic labeling network. By way of example and not limitation, the semi-automatic labeling network may be implemented with a semi-automatic labeling algorithm such as the Segment Anything Model (SAM) or Segment Everything Everywhere All at Once (SEEM).
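For instance, the enhanced copies U1 and U2 could be produced with standard image transforms; the specific transforms and parameters here are assumptions, and note that any geometric enhancement must also be applied to the corresponding position prompt so the two stay consistent:

```python
from PIL import Image
from torchvision import transforms

# Flip / rotate / crop enhancements, as described above; parameters are illustrative.
augment = transforms.Compose([
    transforms.RandomHorizontalFlip(p=0.5),
    transforms.RandomRotation(degrees=15),
    transforms.RandomResizedCrop(size=224, scale=(0.8, 1.0)),
])

u = Image.open("second_sample.png").convert("RGB")  # initial second sample image U
u1 = augment(u)  # enhanced copy U1
u2 = augment(u)  # enhanced copy U2
# The unlabeled set then contains U, U1, and U2.
```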
Illustratively, the first position prompt information, the second position prompt information, and the third position prompt information each include one or more of: target frame prompt information corresponding to a target detection frame enclosing the defect region; mask prompt information corresponding to a mask of the defect region; and identification point prompt information corresponding to one or more identification points inside the defect region. Correspondingly, the position identification feature includes one or more of: a target frame encoding feature corresponding to the target detection frame enclosing the defect region; a mask encoding feature corresponding to the mask of the defect region; and an identification point encoding feature corresponding to the one or more identification points inside the defect region.
In one embodiment, for convenience of description, the following takes the first position prompt information as an example; a person of ordinary skill in the art can understand the content of the second and third position prompt information from the description of the first. The first position prompt information may include any one or more of: target frame prompt information corresponding to a target detection frame enclosing the defect region; mask prompt information corresponding to a mask of the defect region; and identification point prompt information corresponding to one or more identification points inside the defect region. For example, it may include only one of these three prompts, or all three. The target frame prompt information may be presented as a circumscribed rectangle of the defect region. The mask prompt information may be presented as a binary map in which each pixel of the defect region is highlighted. The identification point prompt information may be presented as the image coordinates of one or more identification points in the image to be inspected, where an identification point may be any representative pixel inside the defect region. The position identification feature correspondingly includes any one or more of: a target frame encoding feature, a mask encoding feature, and an identification point encoding feature. The type of the position identification feature matches the type of the first position prompt information: when the first position prompt information includes all three kinds of prompts, the position identification feature includes all three corresponding encoding features.
With the above technical scheme, the target frame prompt information, the mask prompt information, and/or the identification point prompt information, together with the corresponding position identification features, provide supervision information that benefits the training of the defect classification model, reducing the interference of irrelevant regions and improving training efficiency.
For example, if the position identification feature is a target frame encoding feature, inputting the image features and the target frame encoding feature into the fully connected neural network yields the defect classification result, which may include a target frame and confidences for the defect region in the image to be inspected. From the confidences of the different categories, the category with the maximum confidence can be determined as the category to which the defect region belongs.
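A minimal sketch of such a fully connected classification head; the dimensions and number of defect categories are illustrative assumptions:

```python
import torch
import torch.nn as nn

class DefectClassifierHead(nn.Module):
    """Maps concatenated image and position identification features to
    per-category confidences."""
    def __init__(self, embed_dim: int = 256, num_classes: int = 5):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(2 * embed_dim, 256),
            nn.ReLU(),
            nn.Linear(256, num_classes),
        )

    def forward(self, image_feat: torch.Tensor, box_feat: torch.Tensor) -> torch.Tensor:
        return self.fc(torch.cat([image_feat, box_feat], dim=-1))  # logits

head = DefectClassifierHead()
logits = head(torch.randn(1, 256), torch.randn(1, 256))
confidences = logits.softmax(dim=-1)
category = confidences.argmax(dim=-1)  # category with the maximum confidence
```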
According to the defect classification method provided by the embodiments of the present invention, the image features and position identification feature of the image to be inspected are obtained by inputting the image to be inspected and the first position prompt information into the feature extraction network of the defect classification model, and the defect classification result is obtained by inputting those features into the fully connected neural network. Because the position prompt information in the labeled and/or unlabeled data of the target sample data set is obtained with a semi-automatic labeling network, and the defect classification model is trained on that data set, the model's attention to the region of interest is strengthened, and the defect regions of images to be inspected can be classified with relatively little labeled data. The method also reduces manual intervention and improves defect classification efficiency.
Illustratively, the defect classification model is trained by acquiring a target sample data set and training the defect classification model on it. Acquiring the target sample data set comprises: acquiring a plurality of first sample images; inputting them into the semi-automatic labeling network to determine the positions of their defect regions; labeling those positions to obtain the second position prompt information for each first sample image; acquiring the annotation label for each first sample image; acquiring a plurality of second sample images; inputting them into the semi-automatic labeling network to determine the positions of their defect regions; and labeling those positions to obtain the third position prompt information for each second sample image.
In one embodiment, the first sample images are acquired in a manner similar to the image to be inspected, which has been described in detail in step S110 and, for brevity, is not repeated here. Inputting the plurality of first sample images into the semi-automatic labeling network determines the positions of the defect regions in each first sample image, and labeling those positions yields the second position prompt information for each first sample image. For example, the positions may be labeled with any position labeling method such as a target frame, a mask, or position points. The annotation labels corresponding to the first sample images may be pre-labeled and stored in a local storage device, or labeled manually. The second sample images are acquired in a similar manner, and the third position prompt information corresponding to each second sample image may be obtained in the same way as the second position prompt information; for brevity, this is not repeated here.
With the above technical scheme, the defect classification model can be trained on the target sample data set. The first sample images in the data set are input into the semi-automatic labeling network to determine the positions of their defect regions, and labeling those positions yields the second position prompt information and annotation labels; the third position prompt information for the second sample images is obtained in a similar way. Because the positions of the defect regions in the first and second sample images are obtained through the semi-automatic labeling network, no manual intervention is needed and the efficiency is high.
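One possible realization of this labeling step uses SAM, which the text names as a candidate semi-automatic labeling network; the checkpoint path and the box-prompt interface below are placeholders, and the patent does not commit to this exact implementation:

```python
import numpy as np
import cv2
from segment_anything import sam_model_registry, SamPredictor

# Load a SAM checkpoint (placeholder path) and wrap it in a predictor.
sam = sam_model_registry["vit_b"](checkpoint="sam_vit_b_01ec64.pth")
predictor = SamPredictor(sam)

def label_defect_region(image_bgr: np.ndarray, initial_box: np.ndarray) -> np.ndarray:
    """Refine an initial position prompt (XYXY box) into a defect-region mask,
    usable as second/third position prompt information."""
    predictor.set_image(cv2.cvtColor(image_bgr, cv2.COLOR_BGR2RGB))
    masks, scores, _ = predictor.predict(box=initial_box, multimask_output=False)
    return masks[0]  # boolean mask marking the defect region
```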
Illustratively, inputting the plurality of first sample images into the semi-automatic labeling network to determine the positions of their defect regions may comprise: inputting the plurality of first sample images together with their corresponding initial position prompt information into the semi-automatic labeling network. Likewise, inputting the plurality of second sample images into the semi-automatic labeling network to determine the positions of their defect regions may comprise: inputting the plurality of second sample images together with their corresponding initial position prompt information into the semi-automatic labeling network.
In one embodiment, inputting the plurality of first sample images and their corresponding initial position prompt information into the semi-automatic labeling network determines the positions of the defect regions in each first sample image. Similarly, inputting the plurality of second sample images and their corresponding initial position prompt information into the semi-automatic labeling network determines the positions of the defect regions in each second sample image. The initial position prompt information may be labeled manually or stored in advance in the storage device of a host computer; the present invention is not limited in this respect.
With the above technical scheme, inputting the initial position prompt information corresponding to the first and/or second sample images into the semi-automatic labeling network determines the positions of the defect regions in those images, which makes it possible to obtain high-quality defect position information.
Illustratively, the initial position prompt information corresponding to any first sample image or any second sample image is obtained by: matching the first sample image or second sample image against a template image, wherein the sample images contain a target object and the template image is an image of the target object without defects; and determining any region of the first sample image or second sample image that is inconsistent with the template image as a defect region, thereby obtaining the initial position prompt information.
Illustratively, the image to be inspected is a wafer image containing a wafer, and the defect region is a defect region on the wafer. In one embodiment of the invention, the target object may be a wafer, the image to be inspected may be an image containing a wafer, and the defect region may represent the location of a defect on the wafer. In this way, defects on the wafer can be classified, so that corresponding operations can be performed for each type of defect.
In one embodiment, in an application scenario such as wafer automated optical inspection (AOI), where a template image exists and the target objects are highly consistent, the template image may be constructed by acquiring images containing defect-free target objects. In such a scenario, the images acquired of the target object differ little from one acquisition to the next, so the template image can be obtained by acquiring images of defect-free target objects and aligning their image coordinates. By matching the first or second sample image against the template image, a region that is inconsistent with the template can be determined as a defect region, and the initial position prompt information can be obtained from the position of that region. By way of example and not limitation, template matching includes, but is not limited to, gray-scale matching, deformation matching, shape matching, and the like.
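A minimal sketch of deriving the initial position prompt by image differencing against an already aligned, defect-free template; the threshold and kernel sizes are illustrative assumptions:

```python
import cv2
import numpy as np

def initial_box_from_template(sample_gray: np.ndarray,
                              template_gray: np.ndarray) -> np.ndarray | None:
    """Return an XYXY box around the region where the sample disagrees
    with the defect-free template, or None if no such region exists."""
    diff = cv2.absdiff(sample_gray, template_gray)       # gray-scale difference
    _, mask = cv2.threshold(diff, 30, 255, cv2.THRESH_BINARY)
    mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, np.ones((3, 3), np.uint8))
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    if not contours:
        return None  # consistent with the template: no defect prompt
    x, y, w, h = cv2.boundingRect(max(contours, key=cv2.contourArea))
    return np.array([x, y, x + w, y + h])
```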
With the above technical scheme, the defect region in the first or second sample image can be determined by matching it against the template image. Template matching is a mature technique, so this approach is both efficient and accurate.
Alternatively, the initial position prompt information corresponding to any first sample image or any second sample image may be obtained by acquiring initial position prompt information input by a user.
In one embodiment, the apparatus for performing the defect classification method may comprise, or be communicatively connected to, an input device and/or an output device. The input device may include, but is not limited to, one or more of a mouse, a keyboard, a microphone, or a touch screen; the output device may include, but is not limited to, one or more of a display device, a speaker, and the like. The user may input the initial position prompt information through an input device such as a mouse or keyboard. For example, the display interface of the display device may provide an input control for the initial position prompt information, into which the user types the positions of identification points to obtain identification point prompt information. As another example, the display interface may display the defect region of the first or second sample image, and the user may draw the target frame enclosing the defect region with the mouse to obtain target frame prompt information.
With the above technical scheme, the user inputs the initial position prompt information through the input and/or output devices, which ensures its accuracy. Meanwhile, thanks to the semi-automatic labeling network, the efficiency of this manual labeling remains high.
Illustratively, acquiring the annotation labels corresponding to the plurality of first sample images may comprise: when the plurality of first sample images are input into the semi-automatic labeling network, also determining the category of the defect region in each first sample image, thereby obtaining the annotation label corresponding to each first sample image.
In one embodiment, when the plurality of first sample images are input into the semi-automatic labeling network, the category of the defect region in each first sample image may also be determined. These categories may be stored in advance in the storage device of a host computer, or labeled manually when the first sample images are input into the network.
With the above technical scheme, determining the categories of the defect regions while the first sample images are input into the semi-automatic labeling network improves the efficiency of obtaining the annotation labels. Meanwhile, the semi-automatic labeling network can be fine-tuned with the determined categories of the defect regions, making the annotation labels more accurate.
FIG. 2 shows a schematic diagram of obtaining prompt information using a semi-automatic labeling network according to an embodiment of the invention. As shown in FIG. 2, the sample image may be a first and/or second sample image of the preceding embodiments, and the initial position prompt information indicates the defect region in the sample image. The initial position prompt information may be obtained by template matching or input by a user. Inputting the sample image and its initial position prompt information into the semi-automatic labeling network yields the sample image with prompt information, where the prompt information may be the second and/or third position prompt information. Optionally, the semi-automatic labeling network may also be fine-tuned with the prompted sample images.
Illustratively, training the defect classification model on the target sample data set may comprise performing the following model training operations on the target sample data set: training the defect classification model with the labeled data to obtain an initially trained defect classification model; inputting the plurality of second sample images and their corresponding third position prompt information into the initially trained defect classification model to obtain a first prediction classification result for each second sample image, indicating the category to which the defect region in the corresponding second sample image belongs; obtaining pseudo labels for at least some of the second sample images based on their first prediction classification results; inputting the plurality of first sample images and their corresponding second position prompt information into the initially trained defect classification model to obtain a second prediction classification result for each first sample image, indicating the category to which the defect region in the corresponding first sample image belongs; inputting the at least some second sample images and their corresponding third position prompt information into the initially trained defect classification model to obtain a third prediction classification result for each of them, indicating the category to which the defect region in the corresponding second sample image belongs; calculating a prediction loss value based on the differences between the annotation labels of the first sample images and the second prediction classification results, and between the pseudo labels of the at least some second sample images and the third prediction classification results; and optimizing the parameters of the initially trained defect classification model based on the prediction loss value to obtain the trained defect classification model corresponding to the current round of training.
In one embodiment, the target sample dataset may include labeled data and unlabeled data. The labeled data may include a plurality of first sample images together with the second position prompt information and labeling label corresponding to each first sample image. FIG. 3 illustrates a defect classification model training schematic according to one embodiment of the invention. The defect classification model is first trained with the labeled data to obtain an initially trained defect classification model. As shown in fig. 3, inputting the second sample images and the third position prompt information corresponding to each second sample image (i.e. the unlabeled data) into the initially trained defect classification model yields the first prediction classification results corresponding to the second sample images. The first prediction classification result may be used to indicate the category to which the defect region in the corresponding second sample image belongs. Pseudo labels corresponding to all or part of the second sample images are then acquired based on the first prediction classification results. Constraints may be added to the obtained pseudo labels of the second sample images, for example using the Sinkhorn-Knopp algorithm. Similarly, inputting the first sample images and the second position prompt information corresponding to each first sample image into the initially trained defect classification model yields the second prediction classification results corresponding to the first sample images. The second prediction classification result may indicate the category to which the defect region in the corresponding first sample image belongs. The at least part of the second sample images carrying pseudo labels and their corresponding third position prompt information (i.e. the pseudo-labeled data shown in fig. 3) are input into the initially trained defect classification model to obtain the third prediction classification results corresponding to the at least part of the second sample images. The third prediction classification result may indicate the category to which the defect region in the corresponding second sample image belongs. The prediction classification results shown in fig. 3 may include the third prediction classification results and the second prediction classification results. "ema" denotes an exponential moving average of the defect classification model weights. The prediction loss value can be calculated by a preset loss function based on the difference between the labeling labels corresponding to the first sample images and the second prediction classification results and the difference between the pseudo labels corresponding to the at least part of the second sample images and the third prediction classification results. The preset loss function may be any loss function, such as a cross entropy loss function, which is not limited by the present invention. In one embodiment of the present invention, the labeling labels corresponding to the plurality of first sample images and the pseudo labels corresponding to the at least part of the second sample images may be substituted into the preset loss function to calculate the prediction loss value. Parameters of the initially trained defect classification model may then be optimized based on the prediction loss value using back-propagation and gradient descent.
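A minimal PyTorch-style sketch of one such training round is given below. The call signature model(images, prompts), the confidence threshold for accepting pseudo labels, and the EMA decay are hypothetical choices for illustration; they are not taken from the patent.

import torch
import torch.nn.functional as F

def train_round(model, ema_model, labeled_loader, unlabeled_loader,
                optimizer, conf_thresh=0.9, ema_decay=0.999):
    model.train()
    for (x_l, prompt_l, y_l), (x_u, prompt_u) in zip(labeled_loader,
                                                     unlabeled_loader):
        # Second prediction classification results on the labeled data.
        logits_l = model(x_l, prompt_l)
        loss = F.cross_entropy(logits_l, y_l)
        # First prediction classification results on the unlabeled data;
        # keep only confident predictions as pseudo labels.
        with torch.no_grad():
            probs_u = F.softmax(model(x_u, prompt_u), dim=1)
        conf, pseudo_y = probs_u.max(dim=1)
        keep = conf > conf_thresh
        if keep.any():
            # Third prediction classification results on the pseudo-labeled
            # part, supervised by the pseudo labels.
            logits_u = model(x_u[keep], prompt_u[keep])
            loss = loss + F.cross_entropy(logits_u, pseudo_y[keep])
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        # "ema": running average of the defect classification model weights.
        with torch.no_grad():
            for p_ema, p in zip(ema_model.parameters(), model.parameters()):
                p_ema.mul_(ema_decay).add_(p, alpha=1 - ema_decay)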
The optimization operation may be performed repeatedly until the defect classification model reaches a converged state. After the present round of training ends, the trained defect classification model corresponding to this round of training operations is obtained.
According to the above technical scheme, the defect classification model is trained with both labeled and unlabeled data, and new categories are discovered based on the consistency between multiple data augmentations of the same initial sample image together with the added constraint. Meanwhile, the trained defect classification model can detect defects of previously unknown categories.
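The added constraint mentioned above, the Sinkhorn-Knopp algorithm, can be sketched in the SwAV-style form commonly used to balance pseudo-label assignments across classes; the iteration count and temperature below are assumed values, not the patent's.

import torch

def sinkhorn_knopp(scores, n_iters=3, eps=0.05):
    """Balance a batch of class scores into a near-doubly-stochastic
    assignment so pseudo labels spread across classes (SwAV-style)."""
    q = torch.exp(scores / eps).t()              # (num_classes, batch)
    q = q / q.sum()
    K, B = q.shape
    for _ in range(n_iters):
        q = q / q.sum(dim=1, keepdim=True) / K   # rows (classes) sum to 1/K
        q = q / q.sum(dim=0, keepdim=True) / B   # columns (samples) sum to 1/B
    return (q * B).t()                           # (batch, num_classes)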
Illustratively, the image to be measured may include a plurality of first images to be measured, and the training operation may further include: after the present round of training is completed, obtaining fourth prediction classification results, predicted by the trained defect classification model, corresponding to at least one test image, wherein the fourth prediction classification result is used to indicate the category to which the defect region in the corresponding test image belongs, and the at least one test image includes one or more of the following: at least part of the plurality of second sample images; the image to be measured; a new image other than the plurality of second sample images and the image to be measured; outputting the fourth prediction classification results corresponding to the at least one test image; and performing corresponding operations on the fourth prediction classification results and/or the test images in response to verification information input by a user.
In one embodiment, the image to be measured may include a plurality of first images to be measured. With the trained defect classification model obtained in the foregoing embodiment, one or more test images may each be input into the model to obtain the fourth prediction classification result corresponding to each test image. The fourth prediction classification result may indicate the category to which the defect region in the corresponding test image belongs. A test image may be any image containing the target object. For example, the test images may include at least part of the plurality of second sample images, the image to be measured, and new images other than the plurality of second sample images and the image to be measured; alternatively, the one or more test images may include only at least part of the plurality of second sample images, or only the image to be measured, or only new images other than the plurality of second sample images and the image to be measured.
In one embodiment, after the fourth prediction classification results are obtained, the fourth prediction classification result corresponding to each of the at least one test image may be output. For example, if the output device is a display device, the fourth prediction classification result may be displayed in the display interface of the display device. The user may then input verification information through an input device, and a corresponding operation may be performed on the fourth prediction classification result and/or the test image according to that verification information. For example, if the user's verification information indicates that some test images do not meet the requirements and should be deleted, the unsatisfactory images may be automatically removed from the test images in response to the verification information input by the user.
According to the above technical scheme, after the present round of training is completed, the fourth prediction classification results corresponding to the at least one test image, predicted by the trained defect classification model, can be obtained and output. The user may input verification information to perform corresponding operations on the fourth prediction classification results and/or the test images. This scheme makes it convenient for the user to review the fourth prediction classification results and the test images and to operate on them accordingly, which meets the needs of different users and provides strong interactivity.
Illustratively, performing corresponding operations on the fourth prediction classification result and/or the test image in response to the verification information input by the user may include one or more of the following: for any test image classified into a known category based on the fourth prediction classification result, deleting the test image if the verification information includes deletion information about the test image, wherein a known category is a category indicated by a labeling label in the labeled data; for any new category divided in the fourth prediction classification result, deleting the new category from the fourth prediction classification result if the verification information includes deletion information about the new category, wherein a new category is a category different from the known categories; for any new category divided in the fourth prediction classification result, merging the new category and the known category designated by the merging information into the same category if the verification information includes merging information about the new category; and for any new category divided in the fourth prediction classification result, adding the new category and the test images corresponding to the new category into the labeled data if the verification information includes adding information about the new category.
In one embodiment, the fourth prediction classification result may indicate a defect region belonging to a known category or a defect region belonging to an unknown category. A known category is a category indicated by a labeling label in the labeled data. For example, if the fourth prediction classification result corresponding to a test image indicates that the category to which the defect region in the test image belongs is scratch, the category of the defect region indicated by the fourth prediction classification result is a known category.
In the first embodiment, for any test image classified into a known category based on the fourth prediction classification result, the test image is deleted if the verification information includes deletion information about the test image. For example, for test image A, if the user inputs characters such as "delete test image A", or clicks the "x" control corresponding to test image A, which may indicate that the user wishes to delete the test image, test image A may be deleted from the at least one test image in response to the verification information currently input by the user.
In the second embodiment, for any new category divided in the fourth prediction classification result, the new category is deleted from the fourth prediction classification result if the verification information includes deletion information about the new category. A new category represents a category other than the known categories. For example, suppose the fourth prediction classification result corresponding to test image B indicates that the category to which the defect region in test image B belongs is category b. If the known categories do not include category b, category b is treated as a new category. If the verification information input by the user includes deletion information, such as the characters "delete category b", which may indicate that category b is an invalid category that the user wishes to delete, category b may be deleted from the fourth prediction classification result in response to the verification information currently input by the user.
In the third embodiment, for any new category divided in the fourth prediction classification result, the new category may be merged with the known category designated by the merging information into the same category if the verification information includes merging information about the new category. For example, if the fourth prediction classification result corresponding to test image C indicates that the category to which the defect region in test image C belongs is category c, and category c is highly similar to the known scratch category, the user may input merging information to merge category c with the scratch category. After the new category and the designated known category are merged into the same category, the defect region contained in test image C is relabeled with the merged category.
In the fourth embodiment, for any new category divided in the fourth prediction classification result, the new category and the test images corresponding to the new category are added to the labeled data if the verification information includes adding information about the new category. For example, if the fourth prediction classification result corresponding to test image D indicates that the category to which the defect region in test image D belongs is category d, and category d is a valid new category, the user may input adding information to add category d to the known categories. Meanwhile, based on the adding information input by the user, test image D corresponding to category d may be added to the labeled data.
According to the above technical scheme, different operations may be performed according to different verification information. When the verification information includes deletion information about any test image classified into a known category, that test image may be deleted. When the verification information includes deletion information about any new category, that new category may be deleted from the fourth prediction classification result. When the verification information includes merging information about any new category, that new category may be merged with the known category designated by the merging information into the same category. When the verification information includes adding information about any new category, that new category and the test images corresponding to it may be added to the labeled data. With this scheme, new categories can be deleted, added to the known categories, or merged with known categories through manual verification, and test images can be deleted, so that the labeled data can be iteratively optimized to obtain more new categories.
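As a hypothetical sketch only, the four verification operations could be dispatched as follows; the action names and the dictionary-based data structures are illustrative assumptions, not the patent's interface.

def apply_verification(action, payload, results, labeled_data, test_images):
    """results: dict mapping category -> list of test images (assumed)."""
    if action == "delete_image":        # first embodiment
        test_images.remove(payload["image"])
    elif action == "delete_category":   # second embodiment
        results.pop(payload["category"], None)
    elif action == "merge_category":    # third embodiment: fold the new
        # category's images into the designated known category
        images = results.pop(payload["category"], [])
        results.setdefault(payload["known"], []).extend(images)
    elif action == "add_category":      # fourth embodiment: promote the new
        # category and its test images into the labeled data
        labeled_data[payload["category"]] = results.get(payload["category"], [])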
Illustratively, the defect classification model may further comprise an attention network, and the method may further comprise, before inputting the image features and the position identification features into the fully connected neural network of the defect classification model to obtain the defect classification result: inputting the image features and the position identification features into the attention network for feature fusion to obtain fusion features. In this case, inputting the image features and the position identification features into the fully connected neural network of the defect classification model to obtain the defect classification result may include: inputting the fusion features into the fully connected neural network to obtain the defect classification result.
In one embodiment, the defect classification model may also include an attention network. The image features and the position identification features are input into the attention network for feature fusion, and the fusion features corresponding to the image features and the position identification features are obtained through the attention mechanism. The obtained fusion features are then input into the fully connected neural network to obtain the defect classification result.
FIG. 4 illustrates a schematic diagram of acquiring fusion features according to one embodiment of the invention. As shown in fig. 4, inputting the image to be measured into the image encoding module yields the image encoding features corresponding to the image to be measured. Inputting the target frame prompt information in the first position prompt information corresponding to the image to be measured into the target frame encoder yields the target frame encoding features, and inputting the mask prompt information in the first position prompt information into the mask encoder yields the mask encoding features. The obtained image encoding features, target frame encoding features, and mask encoding features are input into the attention network to obtain the fusion features.
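A hedged sketch of such an attention fusion is given below, assuming cross-attention in which the prompt encodings act as queries over the image tokens; the dimensions and mean pooling are illustrative assumptions rather than the patent's exact design.

import torch
import torch.nn as nn

class PromptFusion(nn.Module):
    """Fuse image encoding features with box/mask encoding features."""
    def __init__(self, dim=256, heads=8):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, image_feat, box_feat, mask_feat):
        # image_feat: (B, N, dim) image tokens;
        # box_feat, mask_feat: (B, 1, dim) prompt encodings.
        prompts = torch.cat([box_feat, mask_feat], dim=1)
        fused, _ = self.attn(query=prompts, key=image_feat, value=image_feat)
        return fused.mean(dim=1)  # (B, dim) fusion feature for the FC head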
According to the above technical scheme, the image features and the position identification features are fused through the attention network, so that the obtained fusion features contain both. This guides the defect classification model to focus on the region corresponding to the position identification features and reduces the interference of irrelevant regions on the defect classification result.
Illustratively, the feature extraction network may include an image encoding module and a position encoding module, and inputting the image to be measured and the first position prompt information into the feature extraction network of the defect classification model to obtain the image features and position identification features of the image to be measured may include: inputting the image to be measured into the image encoding module to obtain the image features; and inputting the first position prompt information into the position encoding module to obtain the position identification features.
In one embodiment, the feature extraction network may include an image encoding module and a position encoding module. The image to be measured is input into the image encoding module (Image Encoding Module) to obtain the image features. By way of example and not limitation, the image encoding may be implemented with, but is not limited to, convolutional neural networks (Convolutional Neural Networks, CNN) or Vision Transformers (ViT). In one embodiment of the invention, a convolutional neural network is used to obtain the image features corresponding to the image to be measured. The first position prompt information is input into the position encoding module to obtain the position identification features. By way of example and not limitation, the position encoding may be any of conditional position encoding, learnable absolute position encoding, sine-cosine function encoding, relative position encoding, and the like. In one embodiment, the two-dimensional position coordinates in the first position prompt information are encoded with sine and cosine functions to obtain the corresponding position identification features.
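For the sine-cosine option, a minimal sketch of encoding the two-dimensional coordinates carried in the first position prompt information might look as follows; the feature dimension and the [0, 1] coordinate normalization are assumptions for illustration.

import torch

def sincos_encode(coords, dim=128):
    """coords: (..., 2) normalized (x, y) -> (..., 2*dim) identification feature."""
    half = torch.arange(0, dim, 2, dtype=torch.float32)
    freqs = torch.exp(half * (-torch.log(torch.tensor(10000.0)) / dim))
    x = coords[..., 0:1] * freqs   # (..., dim/2) per coordinate
    y = coords[..., 1:2] * freqs
    return torch.cat([x.sin(), x.cos(), y.sin(), y.cos()], dim=-1)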
According to the above technical scheme, the image features are obtained by inputting the image to be measured into the image encoding module, and the position identification features are obtained by inputting the first position prompt information into the position encoding module. This ensures the efficiency and accuracy of acquiring the image features and the position identification features.
Illustratively, the image encoding module is trained in an unsupervised manner, and its parameters are kept fixed when the defect classification model is trained.
In one embodiment, the image encoding module may be trained in an unsupervised manner, for example with DINO (Emerging Properties in Self-Supervised Vision Transformers) or DINOv2 (Learning Robust Visual Features without Supervision). When the defect classification model is trained, the parameters of the image encoding module are kept fixed, and only branches such as the position encoding module and the fully connected neural network are trained. Therefore, the training cost of the defect classification model is relatively low, and its training efficiency can be improved.
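As a hedged sketch, a DINOv2 backbone loaded from torch.hub could be frozen while only the position encoding branch and the fully connected head are trained; the hub entry point is the public DINOv2 one, while the embedding dimension wiring and the category count are assumptions for illustration.

import torch
import torch.nn as nn

# Public DINOv2 hub entry point; the surrounding wiring is hypothetical.
backbone = torch.hub.load("facebookresearch/dinov2", "dinov2_vits14")
for p in backbone.parameters():
    p.requires_grad = False            # image encoding module stays fixed

position_encoder = nn.Linear(4, 384)   # e.g. encodes an (x0, y0, x1, y1) box
fc_head = nn.Linear(384, 8)            # 8 defect categories, assumed
optimizer = torch.optim.AdamW(
    list(position_encoder.parameters()) + list(fc_head.parameters()), lr=1e-4)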
FIG. 5 shows a schematic diagram of a defect classification model training process according to another embodiment of the invention. As shown in fig. 5, when the defect classification model is trained for the first time, the unlabeled data and the labeled data in the sample dataset may be input into the defect classification model to obtain the corresponding prediction classification results. As shown in fig. 5, the prediction classification results may include 8 defect categories (each square representing one defect category), of which the 4 defect categories within the dashed box represent known categories and the other 4 represent new categories. Based on the obtained prediction classification results, the user may input different verification information to verify them. After verification is completed, new categories that do not meet the requirements may be deleted from the prediction classification results (for example, the two new categories indicated by arrows in fig. 5 are invalid categories and may be deleted), some new categories may be merged with the known categories in the labeled data, or new categories may be added to the known categories. For merged or added new categories, the test images corresponding to that portion of the prediction classification results may be re-partitioned into the labeled data to continue training the defect classification model. It will be appreciated that when the defect classification model is trained for the first time, the training sample dataset does not contain "new class annotation data".
According to another aspect of the present invention, a defect classification apparatus is also provided. Fig. 6 shows a schematic block diagram of a defect classification apparatus 600 according to an embodiment of the invention. As shown in fig. 6, the defect classification apparatus 600 comprises a first input module 610 and a second input module 620.
The first input module 610 is configured to input an image to be measured and first position prompt information into a feature extraction network of the defect classification model to obtain an image feature and a position identification feature of the image to be measured, where the first position prompt information is used to indicate a position where a defect region in the image to be measured is located.
A second input module 620, configured to input the image features and the position identification features into a fully connected neural network of the defect classification model to obtain a defect classification result, where the defect classification result is used to indicate the category to which the defect region in the image to be measured belongs; the defect classification model is a classification model obtained through training on a target sample dataset, the target sample dataset includes labeled data and unlabeled data, the labeled data includes a plurality of first sample images and the second position prompt information and labeling labels corresponding to the first sample images, the labeling labels are used to indicate the categories of the defect regions in the corresponding first sample images, the unlabeled data includes a plurality of second sample images and the third position prompt information corresponding to the second sample images, and the second position prompt information and/or the third position prompt information are obtained by performing position labeling on the defect regions in the corresponding sample images using a semi-automatic labeling network.
Those skilled in the art will understand the specific implementation and advantages of the defect classification device according to the above description about the defect classification method 100, and the detailed description is omitted herein for brevity.
According to still another aspect of the present invention, an electronic device is also provided. Fig. 7 shows a schematic block diagram of an electronic device according to an embodiment of the invention. As shown in fig. 7, the electronic device 700 includes a processor 710 and a memory 720, wherein the memory 720 stores computer program instructions which, when executed by the processor 710, are used to perform the defect classification method described above.
According to yet another aspect of the present invention, there is also provided a storage medium storing computer programs/instructions. The storage medium may include, for example, a storage component of a tablet computer, a hard disk of a personal computer, an erasable programmable read-only memory (EPROM), a portable compact disc read-only memory (CD-ROM), a USB memory, or any combination of the foregoing, and may be any combination of one or more computer readable storage media. When run by a processor, the computer programs/instructions are used to perform the defect classification method described above.
Those skilled in the art will understand the specific implementation of the electronic device and the storage medium according to the above description about the defect classification method, and for brevity, the description is omitted here.
Although the illustrative embodiments have been described herein with reference to the accompanying drawings, it is to be understood that the above illustrative embodiments are merely illustrative and are not intended to limit the scope of the present invention thereto. Various changes and modifications may be made therein by one of ordinary skill in the art without departing from the scope and spirit of the invention. All such changes and modifications are intended to be included within the scope of the present invention as set forth in the appended claims.
Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
In the several embodiments provided by the present invention, it should be understood that the disclosed apparatus and method may be implemented in other ways. For example, the above-described device embodiments are merely illustrative, e.g., the division of the elements is merely a logical functional division, and there may be additional divisions when actually implemented, e.g., multiple elements or components may be combined or integrated into another device, or some features may be omitted or not performed.
In the description provided herein, numerous specific details are set forth. However, it is understood that embodiments of the invention may be practiced without these specific details. In some instances, well-known methods, structures and techniques have not been shown in detail in order not to obscure an understanding of this description.
Similarly, it should be appreciated that in order to streamline the invention and aid in understanding one or more of the various inventive aspects, various features of the invention are sometimes grouped together in a single embodiment, figure, or description thereof in the description of exemplary embodiments of the invention. However, the method of the present invention should not be construed as reflecting the following intent: i.e., the claimed invention requires more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive aspects lie in less than all features of a single disclosed embodiment. Thus, the claims following the detailed description are hereby expressly incorporated into this detailed description, with each claim standing on its own as a separate embodiment of this invention.
It will be understood by those skilled in the art that all of the features disclosed in this specification (including any accompanying claims, abstract and drawings), and all of the processes or units of any method or apparatus so disclosed, may be combined in any combination, except combinations where the features are mutually exclusive. Each feature disclosed in this specification (including any accompanying claims, abstract and drawings), may be replaced by alternative features serving the same, equivalent or similar purpose, unless expressly stated otherwise.
Furthermore, those skilled in the art will appreciate that while some embodiments described herein include some features but not others included in other embodiments, combinations of features of different embodiments are meant to be within the scope of the invention and form different embodiments. For example, in the claims, any of the claimed embodiments may be used in any combination.
Various component embodiments of the invention may be implemented in hardware, or in software modules running on one or more processors, or in a combination thereof. Those skilled in the art will appreciate that some or all of the functions of some of the modules in the defect classification device according to an embodiment of the invention may be implemented in practice using a microprocessor or Digital Signal Processor (DSP). The present invention can also be implemented as an apparatus program (e.g., a computer program and a computer program product) for performing a portion or all of the methods described herein. Such a program embodying the present invention may be stored on a computer readable medium, or may have the form of one or more signals. Such signals may be downloaded from an internet website, provided on a carrier signal, or provided in any other form.
It should be noted that the above-mentioned embodiments illustrate rather than limit the invention, and that those skilled in the art will be able to design alternative embodiments without departing from the scope of the appended claims. In the claims, any reference signs placed between parentheses shall not be construed as limiting the claim. The word "a" or "an" preceding an element does not exclude the presence of a plurality of such elements. The invention may be implemented by means of hardware comprising several distinct elements, and by means of a suitably programmed computer. The use of the words first, second, third, etc. do not denote any order. These words may be interpreted as names.
The foregoing description is merely illustrative of specific embodiments of the present invention and the scope of the present invention is not limited thereto, and any person skilled in the art can easily think about variations or substitutions within the scope of the present invention. The protection scope of the invention is subject to the protection scope of the claims.

Claims (16)

1. A method of defect classification, the method comprising:
Inputting an image to be detected and first position prompt information into a feature extraction network of a defect classification model to obtain image features and position identification features of the image to be detected, wherein the first position prompt information is used for indicating the position of a defect region in the image to be detected;
inputting the image features and the position identification features into a fully connected neural network of the defect classification model to obtain a defect classification result, wherein the defect classification result is used for indicating the category of a defect region in the image to be detected;
The defect classification model is a classification model obtained through training on a target sample dataset, the target sample dataset comprises labeled data and unlabeled data, the labeled data comprises a plurality of first sample images and second position prompt information and labeling labels corresponding to each first sample image, the labeling labels are used for indicating the categories of defect areas in the corresponding first sample images, the unlabeled data comprises a plurality of second sample images and third position prompt information corresponding to each second sample image, the second position prompt information and/or the third position prompt information are obtained by carrying out position labeling on the defect areas in the corresponding sample images by utilizing a semi-automatic labeling network, and the semi-automatic labeling network is implemented by a segment-anything model or a model that segments everything at once; the second position prompt information and the third position prompt information are used as inputs to the defect classification model;
Wherein the defect classification model is trained by:
Acquiring the target sample dataset;
training the defect classification model through the target sample data set;
Wherein the acquiring the target sample dataset comprises:
Acquiring the plurality of first sample images;
Inputting the plurality of first sample images into the semiautomatic labeling network to determine the positions of defect areas corresponding to the plurality of first sample images;
Labeling positions of defect areas corresponding to the plurality of first sample images respectively to obtain second position prompt information corresponding to the plurality of first sample images respectively;
acquiring labeling labels corresponding to the plurality of first sample images respectively;
acquiring the plurality of second sample images;
Inputting the plurality of second sample images into the semiautomatic labeling network to determine the positions of defect areas corresponding to the plurality of second sample images;
labeling positions of defect areas corresponding to the second sample images to obtain third position prompt information corresponding to the second sample images;
Wherein the training of the defect classification model by the target sample dataset comprises:
performing the following model training operations based on the target sample dataset:
Training the defect classification model by using the labeled data to obtain an initially trained defect classification model;
Inputting the plurality of second sample images and third position prompt information corresponding to the plurality of second sample images into the defect classification model which is initially trained so as to obtain first prediction classification results corresponding to the plurality of second sample images, wherein the first prediction classification results are used for indicating the category of the defect region in the corresponding second sample images;
acquiring pseudo labels corresponding to at least part of the second sample images in the plurality of second sample images based on the first prediction classification results corresponding to the plurality of second sample images;
Inputting the plurality of first sample images and second position prompt information corresponding to the plurality of first sample images into the defect classification model which is initially trained so as to obtain second prediction classification results corresponding to the plurality of first sample images, wherein the second prediction classification results are used for indicating the category to which the defect region in the corresponding first sample image belongs;
Inputting the at least part of the second sample images and the third position prompt information corresponding to the at least part of the second sample images into the defect classification model which is initially trained so as to obtain third prediction classification results corresponding to the at least part of the second sample images, wherein the third prediction classification results are used for indicating the category to which the defect region in the corresponding second sample image belongs;
Calculating a prediction loss value based on the difference between the labeling labels corresponding to the plurality of first sample images and the second prediction classification result and the difference between the pseudo labels corresponding to the at least part of the second sample images and the third prediction classification result;
And optimizing parameters in the defect classification model which is initially trained based on the predicted loss value to obtain the defect classification model which is trained and completed and corresponds to the round of training operation.
2. The method of claim 1, wherein,
Inputting the plurality of first sample images into the semiautomatic labeling network, and determining positions of defect areas corresponding to the plurality of first sample images respectively includes:
Inputting the plurality of first sample images and initial position prompt information corresponding to the plurality of first sample images into the semi-automatic labeling network so as to determine the positions of defect areas corresponding to the plurality of first sample images;
Inputting the plurality of second sample images into the semiautomatic labeling network, and determining positions of defect areas corresponding to the plurality of second sample images respectively includes:
And inputting the plurality of second sample images and initial position prompt information corresponding to the plurality of second sample images into the semi-automatic labeling network so as to determine the positions of defect areas corresponding to the plurality of second sample images.
3. The method of claim 2, wherein the initial position cue information corresponding to any one of the first sample image and any one of the second sample image is obtained by:
matching the first sample image or the second sample image with a template image, wherein the image to be detected contains a target object, and the template image is an image containing the target object without defects;
And determining the area, inconsistent with the template image, on the first sample image or the second sample image as a defect area in the first sample image or the second sample image so as to obtain the initial position prompt information.
4. The method of claim 2, wherein the initial position cue information corresponding to any one of the first sample image and any one of the second sample image is obtained by:
and acquiring the initial position prompt information input by the user.
5. The method of claim 1, wherein the obtaining the labeling labels for each of the plurality of first sample images comprises:
and when the plurality of first sample images are input into the semiautomatic labeling network, determining the category of the defect area corresponding to each of the plurality of first sample images so as to obtain labeling labels corresponding to each of the plurality of first sample images.
6. The method of claim 1, wherein the image to be measured comprises a plurality of first images to be measured, the training operation further comprising:
After the present round of training operation is completed, obtaining fourth prediction classification results, predicted by the trained defect classification model, corresponding to at least one test image respectively, wherein the fourth prediction classification results are used for indicating the category to which the defect region in the corresponding test image belongs, and the at least one test image comprises one or more of the following: at least part of the plurality of second sample images; the image to be measured; new images other than the plurality of second sample images and the image to be measured;
outputting fourth prediction classification results corresponding to the at least one test image respectively;
And responding to the check information input by the user, and executing corresponding operation on the fourth prediction classification result and/or the test image.
7. The method of claim 6, wherein the performing a corresponding operation on the fourth predictive classification result and/or the test image in response to the user entered verification information comprises one or more of:
For any test image classified into a known category based on the fourth prediction classification result, deleting the test image if the checking information comprises deletion information about the test image, wherein the known category is a category indicated by a labeling label in the labeled data;
for any new category divided in the fourth prediction classification result, if the checking information comprises deletion information about the new category, deleting the new category from the fourth prediction classification result, wherein the new category is a category different from the known category;
For any new category divided in the fourth prediction classification result, if the checking information comprises merging information about the new category, merging the new category and a known category designated by the merging information into the same category;
and for any new category divided in the fourth prediction classification result, if the check information comprises the adding information about the new category, adding the new category and the test image corresponding to the new category into the labeled data.
8. The method of any one of claims 1 to 5, wherein
The first position prompt information, the second position prompt information and the third position prompt information respectively comprise one or more of the following: target frame prompt information corresponding to a target detection frame containing the defect area; mask prompt information corresponding to the mask of the defect area; identification point prompt information corresponding to one or more identification points in the defect area;
The location identification feature includes one or more of the following: a target frame encoding feature corresponding to a target detection frame containing the defective region; mask coding features corresponding to the mask of the defect region; identification point coding features corresponding to one or more identification points within the defective area.
9. The method of any of claims 1-5, wherein the defect classification model further comprises an attention network, the method further comprising, prior to said inputting the image features and the location identification features into the fully connected neural network of the defect classification model to obtain defect classification results:
inputting the image features and the position identification features into the attention network for feature fusion so as to obtain fusion features;
The inputting the image features and the location identification features into the fully connected neural network of the defect classification model to obtain defect classification results comprises:
and inputting the fusion characteristics into the fully-connected neural network to obtain the defect classification result.
10. The method according to any one of claims 1 to 5, wherein the feature extraction network includes an image encoding module and a position encoding module, and the inputting the image to be measured and the first position hint information into the feature extraction network of the defect classification model to obtain the image feature and the position identification feature of the image to be measured includes:
inputting the image to be detected into the image coding module to obtain the image characteristics;
And inputting the first position prompt information into the position coding module to obtain the position identification characteristic.
11. The method of claim 10, wherein the image coding module is trained using an unsupervised training scheme, wherein parameters of the image coding module remain fixed while training the defect classification model, and wherein parameters of the position coding module are trained.
12. The method of any of claims 1-5, wherein the defect classification model is an open world semi-supervised classification model.
13. The method of any of claims 1-5, wherein the image to be measured is a wafer image comprising a wafer and the defect area is a defect area on the wafer.
14. A defect classification device, the device comprising:
the first input module is used for inputting an image to be detected and first position prompt information into a feature extraction network of a defect classification model to obtain image features and position identification features of the image to be detected, wherein the first position prompt information is used for indicating the position of a defect region in the image to be detected;
the second input module is used for inputting the image features and the position identification features into the fully-connected neural network of the defect classification model to obtain a defect classification result, wherein the defect classification result is used for indicating the category of the defect region in the image to be detected;
The defect classification model is a classification model obtained through training on a target sample dataset, the target sample dataset comprises labeled data and unlabeled data, the labeled data comprises a plurality of first sample images and second position prompt information and labeling labels corresponding to each first sample image, the labeling labels are used for indicating the categories of defect areas in the corresponding first sample images, the unlabeled data comprises a plurality of second sample images and third position prompt information corresponding to each second sample image, the second position prompt information and/or the third position prompt information are obtained by carrying out position labeling on the defect areas in the corresponding sample images by utilizing a semi-automatic labeling network, and the semi-automatic labeling network is implemented by a segment-anything model or a model that segments everything at once; the second position prompt information and the third position prompt information are used as inputs to the defect classification model;
Wherein the defect classification model is trained by:
Acquiring the target sample dataset;
training the defect classification model through the target sample data set;
Wherein the acquiring the target sample dataset comprises:
Acquiring the plurality of first sample images;
Inputting the plurality of first sample images into the semiautomatic labeling network to determine the positions of defect areas corresponding to the plurality of first sample images;
Labeling positions of defect areas corresponding to the plurality of first sample images respectively to obtain second position prompt information corresponding to the plurality of first sample images respectively;
acquiring labeling labels corresponding to the plurality of first sample images respectively;
acquiring the plurality of second sample images;
Inputting the plurality of second sample images into the semiautomatic labeling network to determine the positions of defect areas corresponding to the plurality of second sample images;
labeling positions of defect areas corresponding to the second sample images to obtain third position prompt information corresponding to the second sample images;
Wherein the training of the defect classification model by the target sample dataset comprises:
performing the following model training operations based on the target sample dataset:
Training the defect classification model by using the labeled data to obtain an initially trained defect classification model;
Inputting the plurality of second sample images and third position prompt information corresponding to the plurality of second sample images into the defect classification model which is initially trained so as to obtain first prediction classification results corresponding to the plurality of second sample images, wherein the first prediction classification results are used for indicating the category of the defect region in the corresponding second sample images;
acquiring pseudo labels corresponding to at least part of the second sample images in the plurality of second sample images based on the first prediction classification results corresponding to the plurality of second sample images;
Inputting the plurality of first sample images and second position prompt information corresponding to the plurality of first sample images into the defect classification model which is initially trained so as to obtain second prediction classification results corresponding to the plurality of first sample images, wherein the second prediction classification results are used for indicating the category to which the defect region in the corresponding first sample image belongs;
Inputting the at least part of the second sample images and the third position prompt information corresponding to the at least part of the second sample images into the defect classification model which is initially trained so as to obtain third prediction classification results corresponding to the at least part of the second sample images, wherein the third prediction classification results are used for indicating the category to which the defect region in the corresponding second sample image belongs;
Calculating a prediction loss value based on the difference between the labeling labels corresponding to the plurality of first sample images and the second prediction classification result and the difference between the pseudo labels corresponding to the at least part of the second sample images and the third prediction classification result;
And optimizing parameters in the defect classification model which is initially trained based on the predicted loss value to obtain the defect classification model which is trained and completed and corresponds to the round of training operation.
15. An electronic device comprising a processor and a memory, wherein the memory has stored therein computer program instructions which, when executed by the processor, are adapted to carry out the defect classification method of any of claims 1-13.
16. A storage medium storing a computer program/instruction, wherein the computer program/instruction is operable, when executed, to perform the defect classification method of any of claims 1-13.
CN202311803304.7A 2023-12-26 2023-12-26 Defect classification method and device, electronic equipment and storage medium Active CN117456291B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311803304.7A CN117456291B (en) 2023-12-26 2023-12-26 Defect classification method and device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311803304.7A CN117456291B (en) 2023-12-26 2023-12-26 Defect classification method and device, electronic equipment and storage medium

Publications (2)

Publication Number Publication Date
CN117456291A CN117456291A (en) 2024-01-26
CN117456291B true CN117456291B (en) 2024-04-16

Family

ID=89593384

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311803304.7A Active CN117456291B (en) 2023-12-26 2023-12-26 Defect classification method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN117456291B (en)


Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20220058440A1 (en) * 2020-08-24 2022-02-24 Chevron U.S.A. Inc. Labeling an unlabeled dataset
CN114841972A (en) * 2022-05-09 2022-08-02 浙江大学 Power transmission line defect identification method based on saliency map and semantic embedded feature pyramid

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110349145A (en) * 2019-07-09 2019-10-18 京东方科技集团股份有限公司 Defect inspection method, device, electronic equipment and storage medium
CN112560999A (en) * 2021-02-18 2021-03-26 成都睿沿科技有限公司 Target detection model training method and device, electronic equipment and storage medium
CN113205176A (en) * 2021-04-19 2021-08-03 重庆创通联达智能技术有限公司 Method, device and equipment for training defect classification detection model and storage medium
CN115082659A (en) * 2022-06-28 2022-09-20 杭州萤石软件有限公司 Image annotation method and device, electronic equipment and storage medium
CN115170512A (en) * 2022-07-06 2022-10-11 苏州镁伽科技有限公司 Defect classification and identification method and device, storage medium and electronic equipment
CN116524263A (en) * 2023-04-28 2023-08-01 南京邮电大学 Semi-automatic labeling method for fine-grained images

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Xu Ke et al. Online Detection Technology for Metal Surface Quality. Metallurgical Industry Press, 2016, pp. 24-27. *

Also Published As

Publication number Publication date
CN117456291A (en) 2024-01-26

Similar Documents

Publication Publication Date Title
CN107067003B (en) Region-of-interest boundary extraction method, device, equipment and computer storage medium
CN109977191B (en) Problem map detection method, device, electronic equipment and medium
CN110910343A (en) Method and device for detecting pavement cracks and computer equipment
CN114299034A (en) Defect detection model training method, defect detection method and device
WO2021046726A1 (en) Method and device for detecting mechanical equipment parts
CN111489352A (en) Tunnel gap detection and measurement method and device based on digital image processing
TW201947453A (en) Patern recognition method of autoantibody immunofluorescence image
CN114049356A (en) Method, device and system for detecting structure apparent crack
CN115908988A (en) Defect detection model generation method, device, equipment and storage medium
CN114972268A (en) Defect image generation method and device, electronic equipment and storage medium
Wang et al. Automatic identification and location of tunnel lining cracks
CN114428110A (en) Method and system for detecting defects of fluorescent magnetic powder inspection image of bearing ring
CN117456291B (en) Defect classification method and device, electronic equipment and storage medium
CN117314863A (en) Defect output method, device, equipment and storage medium
CN117456290B (en) Defect classification method and device, electronic equipment and storage medium
CN116721104A (en) Live three-dimensional model defect detection method and device, electronic equipment and storage medium
CN115423798A (en) Defect identification method, defect identification device, computer equipment, storage medium and computer program product
CN115775386A (en) User interface component identification method and device, computer equipment and storage medium
CN114495108A (en) Character detection method and device, electronic equipment and readable medium
Kee et al. Cracks identification using mask region-based denoised deformable convolutional network
CN113392837A (en) License plate recognition method and device based on deep learning
CN117456170B (en) Target detection method and device, electronic equipment and storage medium
CN117372510B (en) Map annotation identification method, terminal and medium based on computer vision model
CN117557786B (en) Material quality detection method, device, computer equipment and storage medium
Ashraf et al. Machine learning-based pavement crack detection, classification, and characterization: a review

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant