CN117456290A - Defect classification method and device, electronic equipment and storage medium - Google Patents

Defect classification method and device, electronic equipment and storage medium

Info

Publication number
CN117456290A
Authority
CN
China
Prior art keywords
image
defect
category
defect classification
prompt information
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202311802922.XA
Other languages
Chinese (zh)
Other versions
CN117456290B (en)
Inventor
韩晓
徐海俊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Suzhou Mega Technology Co Ltd
Original Assignee
Suzhou Mega Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Suzhou Mega Technology Co Ltd
Priority to CN202311802922.XA
Publication of CN117456290A
Application granted; publication of CN117456290B
Legal status: Active (granted)

Classifications

    • G06V 10/764 - Image or video recognition or understanding using pattern recognition or machine learning, using classification, e.g. of video objects
    • G06T 7/0004 - Image analysis; inspection of images, e.g. flaw detection; industrial image inspection
    • G06V 10/25 - Image preprocessing; determination of region of interest [ROI] or a volume of interest [VOI]
    • G06V 10/454 - Local feature extraction; biologically inspired filters integrated into a hierarchical structure, e.g. convolutional neural networks [CNN]
    • G06V 10/7753 - Generating sets of training patterns; incorporation of unlabelled data, e.g. multiple instance learning [MIL]
    • G06V 10/806 - Fusion, i.e. combining data from various sources, of extracted features
    • G06V 10/82 - Image or video recognition or understanding using neural networks
    • G06V 20/70 - Labelling scene content, e.g. deriving syntactic or semantic representations
    • G06T 2207/20081 - Training; learning
    • G06T 2207/20084 - Artificial neural networks [ANN]
    • G06T 2207/30148 - Industrial image inspection; semiconductor; IC; wafer
    • G06V 2201/06 - Recognition of objects for industrial automation
    • G06V 2201/07 - Target detection
    • Y02P 90/30 - Computing systems specially adapted for manufacturing

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Computation (AREA)
  • Databases & Information Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Medical Informatics (AREA)
  • Software Systems (AREA)
  • Computing Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biodiversity & Conservation Biology (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Computational Linguistics (AREA)
  • Quality & Reliability (AREA)
  • Image Analysis (AREA)
  • Investigating Materials By The Use Of Optical Means Adapted For Particular Applications (AREA)

Abstract

An embodiment of the invention provides a defect classification method and apparatus, an electronic device, and a storage medium. The method comprises the following steps: inputting an image to be detected and first position prompt information into a feature extraction network of a defect classification model to obtain image features and position identification features of the image to be detected, wherein the first position prompt information indicates the position of a defect region in the image to be detected; and inputting the image features and the position identification features into a fully connected neural network of the defect classification model to obtain a defect classification result, wherein the defect classification result indicates the category to which the defect region in the image to be detected belongs. The defect classification model is an open-world semi-supervised classification model. This scheme reduces the interference of regions other than the defect region in the image to be detected on the defect classification result, and improves detection efficiency and the accuracy of the defect classification result.

Description

Defect classification method and device, electronic equipment and storage medium
Technical Field
The present invention relates to the field of image processing technology, and more particularly, to a defect classification method, a defect classification apparatus, an electronic device, and a storage medium.
Background
In semiconductor manufacturing, defects in semiconductors need to be detected to determine whether a defect exists and what type it is. Taking a semiconductor wafer as an example, in the prior art the wafer is generally inspected by automated optical inspection, and the manufacturing process is optimized according to the type of defect detected. However, wafer defects come in many types, and the morphology of defects often differs considerably between different types of wafers or between wafers at different process steps, so whether a wafer has defects, and the type of those defects, must be determined either by manual inspection or by a deep learning classification algorithm with a fixed set of categories. Manual inspection is costly and inefficient. A fixed-category deep learning classification algorithm requires re-acquiring labeled data and retraining the classification model, and cannot effectively detect new defect types.
Disclosure of Invention
The present invention has been made in view of the above-described problems. The invention provides a defect classification method, a defect classification device, an electronic device and a storage medium.
According to an aspect of the present invention, there is provided a defect classification method, the method comprising: inputting the image to be detected and first position prompt information into a feature extraction network of the defect classification model to obtain image features and position identification features of the image to be detected, wherein the first position prompt information is used for indicating the position of a defect region in the image to be detected; inputting the image features and the position identification features into a fully-connected neural network of a defect classification model to obtain a defect classification result, wherein the defect classification result is used for indicating the category of a defect region in an image to be detected; the defect classification model is an open world semi-supervised classification model.
Illustratively, the defect classification model further comprises an attention network, the method further comprising, prior to inputting the image features and the location identification features into the fully connected neural network of the defect classification model to obtain defect classification results: inputting the image features and the position identification features into an attention network for feature fusion so as to obtain fusion features; inputting the image features and the position identification features into a fully connected neural network of a defect classification model to obtain a defect classification result, comprising: and inputting the fusion characteristics into a fully-connected neural network to obtain a defect classification result.
Illustratively, the first location hint information includes: target frame prompt information corresponding to a target detection frame containing a defect area, and/or mask prompt information corresponding to a mask of the defect area, and/or identification point prompt information corresponding to one or more identification points in the defect area; the location identification feature includes: a target frame encoding feature corresponding to a target detection frame containing the defective area, and/or a mask encoding feature corresponding to a mask of the defective area, and/or an identification point encoding feature corresponding to one or more identification points within the defective area.
The feature extraction network includes an image encoding module and a position encoding module, and inputs the image to be detected and the first position prompt information into the feature extraction network of the defect classification model to obtain an image feature and a position identification feature of the image to be detected, including: inputting the image to be detected into an image coding module to obtain image characteristics; the first position prompt information is input into a position coding module to obtain a position identification feature.
Illustratively, the first location hint information includes: target frame prompt information corresponding to a target detection frame containing a defect area, and/or mask prompt information corresponding to a mask of the defect area, and/or identification point prompt information corresponding to one or more identification points in the defect area; the location identification feature includes: a target frame encoding feature corresponding to a target detection frame containing a defective area, and/or a mask encoding feature corresponding to a mask of the defective area, and/or an identification point encoding feature corresponding to one or more identification points within the defective area; the position coding module comprises: a target frame encoder, and/or a mask encoder, and/or an identification point encoder; inputting the first position prompt information into a position coding module to obtain a position identification feature, including: inputting the target frame prompt information into a target frame encoder to obtain target frame coding characteristics; and/or inputting mask prompt information into a mask encoder to obtain mask coding features; and/or inputting the identification point prompt information into an identification point encoder to obtain identification point coding characteristics.
The image coding module is obtained by training in an unsupervised training mode, and the parameters of the image coding module are kept fixed when the defect classification model is trained.
Illustratively, the defect classification model is obtained by the following training operations: acquiring a sample data set, wherein the sample data set comprises marked data and unmarked data, the marked data comprises a plurality of first sample images, second position prompt information corresponding to each first sample image and marking labels, the marking labels are used for indicating the category of a defect area in the corresponding first sample images, and the unmarked data comprises a plurality of second sample images and third position prompt information corresponding to each second sample image; the following model training operations are performed based on the sample dataset: training the defect classification model by using the marked data to obtain an initially trained defect classification model; inputting a plurality of second sample images and third position prompt information corresponding to the plurality of second sample images into the defect classification model subjected to initial training to obtain first prediction classification results corresponding to the plurality of second sample images, wherein the first prediction classification results are used for indicating the category of the defect region in the corresponding second sample images; acquiring pseudo labels corresponding to at least part of the second sample images in the plurality of second sample images based on the first prediction classification results corresponding to the second sample images; inputting a plurality of first sample images and second position prompt information corresponding to the plurality of first sample images into the defect classification model subjected to initial training to obtain second prediction classification results corresponding to the plurality of first sample images, wherein the second prediction classification results are used for indicating the category to which the defect area in the corresponding first sample images belongs; inputting at least part of the second sample images and at least part of third position prompt information corresponding to the second sample images into the defect classification model subjected to initial training to obtain at least part of third prediction classification results corresponding to the second sample images, wherein the third prediction classification results are used for indicating the category of the defect area in the corresponding second sample images; calculating a prediction loss value based on the difference between the labeling labels corresponding to the first sample images and the second prediction classification result and the difference between the pseudo labels corresponding to at least part of the second sample images and the third prediction classification result; parameters in the initially trained defect classification model are optimized based on the predictive loss values to obtain a trained defect classification model corresponding to the present round of training operations.
Illustratively, the image under test comprises a plurality of first images under test, the training operation further comprising: after the training operation of the round is completed, a fourth prediction classification result which is obtained through prediction of the trained defect classification model and corresponds to at least one test image is obtained, wherein the fourth prediction classification result is used for indicating the category to which the defect area in the corresponding test image belongs, and the at least one test image comprises one or more of the following: at least a portion of the plurality of second sample images; an image to be measured; a new image other than the plurality of second sample images and the image to be measured; outputting fourth prediction classification results corresponding to at least one test image respectively; and responding to the check information input by the user, and executing corresponding operation on the fourth prediction classification result and/or the test image.
Illustratively, responsive to the user entered verification information, performing a corresponding operation on the fourth predictive classification result and/or the test image, including one or more of: for any test image classified into a known class based on the fourth prediction classification result, deleting the test image if the verification information comprises deletion information about the test image, wherein the known class is a class indicated by a labeling label in labeled data; for any new category divided in the fourth prediction classification result, if the checking information comprises deletion information about the new category, deleting the new category from the fourth prediction classification result, wherein the new category is a category different from the known category; for any new category divided in the fourth prediction classification result, if the check information comprises merging information about the new category, merging the new category and a known category designated by the merging information into the same category; and for any new category divided in the fourth prediction classification result, if the check information comprises the adding information about the new category, adding the new category and the test image corresponding to the new category into the marked data.
Illustratively, the image to be measured is a wafer image including a wafer, and the defect area is a defect area on the wafer.
According to another aspect of the present invention, there is also provided a defect classification apparatus, including: the first input module is used for inputting the image to be detected and the first position prompt information into the feature extraction network of the defect classification model to obtain the image features and the position identification features of the image to be detected, wherein the first position prompt information is used for indicating the position of the defect region in the image to be detected; the second input module is used for inputting the image features and the position identification features into the fully-connected neural network of the defect classification model to obtain a defect classification result, wherein the defect classification result is used for indicating the category of the defect region in the image to be detected; the defect classification model is an open world semi-supervised classification model.
According to yet another aspect of the present invention, there is also provided an electronic device comprising a processor and a memory, the memory having stored therein computer program instructions which, when executed by the processor, are adapted to carry out the defect classification method described above.
According to a further aspect of the present invention, there is also provided a storage medium storing a computer program/instructions which, when executed, carry out the defect classification method described above.
According to the defect classification method and apparatus, the electronic device, and the storage medium described above, the image features and the position identification features of the image to be detected can be obtained by inputting the image to be detected and the first position prompt information into the feature extraction network of the defect classification model. The obtained image features and position identification features are then input into the fully connected neural network of the defect classification model to obtain a defect classification result. By obtaining position identification features for the image to be detected, this scheme reduces the interference of regions other than the defect region on the defect classification result, and improves detection efficiency and the accuracy of the defect classification result. In addition, because the defect classification model is an open-world semi-supervised classification model, it can not only detect the category of the defect region in the image to be detected but also discover previously unknown new categories.
Drawings
The above and other objects, features, and advantages of the present invention will become more apparent from the following more particular description of embodiments of the present invention, as illustrated in the accompanying drawings. The accompanying drawings are included to provide a further understanding of embodiments of the invention; they are incorporated in and constitute a part of this specification, illustrate the invention together with its embodiments, and do not limit the invention. In the drawings, like reference numerals generally refer to like parts or steps.
FIG. 1 shows a schematic flow chart of a defect classification method according to an embodiment of the invention;
FIG. 2 illustrates a schematic diagram of acquiring fusion features according to one embodiment of the invention;
FIG. 3 shows a schematic diagram of a defect classification model training process according to an embodiment of the invention;
FIG. 4 shows a schematic diagram of a defect classification model training process according to another embodiment of the invention;
FIG. 5 shows a schematic block diagram of a defect classification apparatus according to an embodiment of the invention; and
FIG. 6 shows a schematic block diagram of an electronic device according to an embodiment of the invention.
Description of the embodiments
In order to make the objects, technical solutions, and advantages of the present invention more apparent, exemplary embodiments according to the present invention will be described in detail with reference to the accompanying drawings. It should be apparent that the described embodiments are only some, and not all, of the embodiments of the present invention, and it should be understood that the present invention is not limited by the example embodiments described herein. Based on the embodiments of the invention described in the present application, all other embodiments obtained by a person skilled in the art without inventive effort shall fall within the scope of the invention.
In order to at least partially solve the above-mentioned problems, an embodiment of the present invention provides a defect classification method. FIG. 1 shows a schematic flow chart of a defect classification method 100 according to one embodiment of the invention. As shown in fig. 1, the method 100 may include the following steps S110 and S120.
Step S110, inputting the image to be detected and the first position prompt information into a feature extraction network of a defect classification model to obtain image features and position identification features of the image to be detected, wherein the defect classification model is an open world semi-supervised classification model, and the first position prompt information is used for indicating the position of a defect region in the image to be detected.
The image to be measured may be, for example, any type of image including a target object. The target object may be any type of object such as a wafer, a character, an electronic component, etc. The image to be measured can be a static image or any video frame in a dynamic video. The image to be measured may be an original image acquired by an image acquisition device (for example, an image sensor in a camera), or may be an image obtained after preprocessing (such as digitizing, normalizing, smoothing, etc.) the original image.
Illustratively, the image to be measured is a wafer image including a wafer, and the defect area is a defect area on the wafer. In one embodiment of the invention, the target object may be a wafer. The image to be measured may be an image containing a wafer. The defect region may represent a location on the wafer where the defect is located. Defects on the wafer can be classified in this way, so that corresponding operations are performed on different types of defects, respectively.
The first position prompt information indicates the position of the defect region in the image to be detected. It may be annotated manually, or obtained by detecting the defect region in the image to be detected with a defect detection model. For example, the first position prompt information may be any type of prompt information, such as a detection frame (bounding box), a mask, or position points corresponding to the defect region, which is not limited by the present invention. The image to be detected and the first position prompt information are input into the feature extraction network of the defect classification model to obtain the image features and the position identification features of the image to be detected. The defect classification model is an Open-World Semi-Supervised Learning (OWSSL) classification model, for example one following Towards Realistic Semi-Supervised Learning (TRSSL). By way of example and not limitation, the feature extraction network may be implemented with a Vision Transformer (ViT), a convolutional neural network (CNN) backbone, or the like.
Illustratively, the first location hint information may include: target frame prompt information corresponding to a target detection frame containing a defect area, and/or mask prompt information corresponding to a mask of the defect area, and/or identification point prompt information corresponding to one or more identification points in the defect area; the location identification feature may include: a target frame encoding feature corresponding to a target detection frame containing the defective area, and/or a mask encoding feature corresponding to a mask of the defective area, and/or an identification point encoding feature corresponding to one or more identification points within the defective area.
In one embodiment, the first location hint information may include: target frame prompt information corresponding to a target detection frame containing a defect area, mask prompt information corresponding to a mask of the defect area, and any one or more of identification point prompt information corresponding to one or more identification points in the defect area. For example, the first location hint information may include target frame hint information corresponding to a target detection frame that includes a defective area, or the first location hint information may include mask hint information corresponding to a mask for the defective area, or the first location hint information may include identification point hint information corresponding to one or more identification points within the defective area. For another example, the first location hint information may include target frame hint information corresponding to a target detection frame that includes a defective area, mask hint information corresponding to a mask for the defective area, and identification point hint information corresponding to one or more identification points within the defective area. For example, the target frame prompt may be presented as a circumscribed rectangular box of the defect area. Illustratively, the mask hint information may be presented via a binary map. In the binary image, each pixel in the defective area may be highlighted. The identification point prompt information can be presented through image coordinates of one or more identification points in the image to be detected. The identification point may be any representative pixel point in the defect area. The location identification feature may include: a target frame encoding feature corresponding to a target detection frame containing a defective area, a mask encoding feature corresponding to a mask of the defective area, and any one or more of identification point encoding features corresponding to one or more identification points within the defective area. The type of location identification feature corresponds to the type of first location hint information. That is, when the first location hint information includes target frame hint information corresponding to a target detection frame that includes a defective area, mask hint information corresponding to a mask of the defective area, and identification point hint information corresponding to one or more identification points within the defective area, the location identification feature may include target frame code features corresponding to the target detection frame that includes the defective area, mask code features corresponding to the mask of the defective area, and identification point code features corresponding to the one or more identification points within the defective area.
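To make the three prompt forms concrete, the following Python sketch shows one possible in-memory representation of the first position prompt information; the class name PositionPrompt and its field layout are illustrative assumptions and are not part of the patent.

    from dataclasses import dataclass
    from typing import Optional, Sequence, Tuple

    import numpy as np

    @dataclass
    class PositionPrompt:
        # Any combination of the three fields may be present, mirroring the
        # "and/or" wording of the embodiment; all names here are assumptions.
        box: Optional[Tuple[float, float, float, float]] = None    # target detection frame (x_min, y_min, x_max, y_max)
        mask: Optional[np.ndarray] = None                           # binary map, 1 inside the defect area
        points: Optional[Sequence[Tuple[float, float]]] = None      # image coordinates of identification points

    # Example: a prompt consisting of a bounding box and two identification points.
    prompt = PositionPrompt(box=(120.0, 80.0, 180.0, 140.0),
                            points=[(150.0, 100.0), (160.0, 120.0)])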
According to this technical scheme, first position prompt information consisting of any one or more of the target frame prompt information corresponding to the target detection frame containing the defect area, the mask prompt information corresponding to the mask of the defect area, and the identification point prompt information corresponding to one or more identification points in the defect area, together with the corresponding position identification features, can provide supervision information that benefits the training of the defect classification model, thereby reducing the interference of irrelevant regions on the training of the defect classification model and improving its training efficiency.
Step S120, inputting the image feature and the position identification feature into the fully connected neural network of the defect classification model, so as to obtain a defect classification result, where the defect classification result is used to indicate the category to which the defect region in the image to be detected belongs.
Illustratively, the defect classification model may include a fully connected neural network. The image features and the position identification features are input into the Fully Connected Neural Network of the defect classification model to obtain a defect classification result. The defect classification result may indicate the category to which the defect region in the image to be detected belongs. The categories to which defect regions belong may include, but are not limited to, scratches, missing material, particulate matter, and the like. For example, if the position identification feature is a target frame encoding feature, the image feature and the target frame encoding feature are input into the fully connected neural network to obtain the defect classification result. The defect classification result may include the target frame corresponding to the defect region in the image to be detected and confidence levels for the different categories; the category with the maximum confidence level can be determined as the category to which the defect region belongs.
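As a non-limiting illustration, the Python sketch below shows a minimal fully connected classification head that maps a fused feature vector to per-category confidences and takes the most confident category; the layer sizes and number of categories are assumptions and are not specified by the patent.

    import torch
    import torch.nn as nn

    feature_dim, num_categories = 256, 10   # assumed sizes; the patent does not fix them
    classifier = nn.Sequential(
        nn.Linear(feature_dim, feature_dim),
        nn.ReLU(),
        nn.Linear(feature_dim, num_categories),
    )

    fused_feature = torch.randn(1, feature_dim)            # stand-in for the fused image/position feature
    confidences = classifier(fused_feature).softmax(-1)    # confidence level for each category
    predicted_category = confidences.argmax(-1)            # category with the maximum confidence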
According to the defect classification method provided by the embodiment of the invention, the image features and the position identification features of the image to be detected can be obtained by inputting the image to be detected and the first position prompt information into the feature extraction network of the defect classification model. The obtained image features and position identification features are then input into the fully connected neural network of the defect classification model to obtain a defect classification result. By obtaining the position identification features of the image to be detected, this scheme reduces the interference of regions other than the defect region on the defect classification result, and improves detection efficiency and the accuracy of the defect classification result. Meanwhile, because the defect classification model is an open-world semi-supervised classification model, it can not only detect the category of the defect region in the image to be detected but also discover previously unknown new categories.
Illustratively, the defect classification model may further include an attention network, and the method may further include, prior to inputting the image features and the location identification features into the fully connected neural network of the defect classification model to obtain the defect classification result: inputting the image features and the position identification features into an attention network for feature fusion so as to obtain fusion features; inputting the image features and the position identification features into a fully connected neural network of a defect classification model to obtain a defect classification result, comprising: and inputting the fusion characteristics into a fully-connected neural network to obtain a defect classification result.
In one embodiment, the defect classification model may also include an attention network. The image features and the position identification features are input into the attention network for feature fusion, and the fusion features corresponding to the image features and the position identification features can be obtained through an attention-mechanism operation. The obtained fusion features are then input into the fully connected neural network to obtain the defect classification result.
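By way of example and not limitation, the sketch below illustrates one possible form of this fusion step, using cross-attention in which the position identification features query the image features; the token counts and dimensions are assumptions, and the patent does not prescribe this exact attention layout.

    import torch
    import torch.nn as nn

    embed_dim, num_heads = 256, 8
    attention = nn.MultiheadAttention(embed_dim, num_heads, batch_first=True)

    image_tokens = torch.randn(1, 196, embed_dim)   # e.g. 14 x 14 patch features from the image encoding module
    prompt_tokens = torch.randn(1, 3, embed_dim)    # e.g. target frame, mask and identification point encoding features

    # Position tokens attend to the image tokens, so the fused feature focuses
    # on the region indicated by the first position prompt information.
    fused, _ = attention(query=prompt_tokens, key=image_tokens, value=image_tokens)
    fusion_feature = fused.mean(dim=1)              # pooled fusion feature fed to the fully connected network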
According to the technical scheme, the image features and the position identification features can be fused through the attention network, so that the obtained fusion features simultaneously comprise the image features and the position identification features. Therefore, the defect classification model can be prompted to pay attention to the region corresponding to the position identification feature, and interference of defect classification results of irrelevant regions is reduced.
Illustratively, the feature extraction network may include an image encoding module and a position encoding module, and inputting the image to be detected and the first position prompt information into the feature extraction network of the defect classification model to obtain an image feature and a position identification feature of the image to be detected, and may include: inputting the image to be detected into an image coding module to obtain image characteristics; the first position prompt information is input into a position coding module to obtain a position identification feature.
In one embodiment, the feature extraction network may include an image encoding module and a position encoding module. The image to be detected is input into the image encoding module (Image Encoding Module) to obtain the image features. By way of example and not limitation, implementations of the image encoding module include, but are not limited to, convolutional neural networks (CNN) and Vision Transformers (ViT). In one embodiment of the invention, a convolutional neural network can be used to obtain the image features corresponding to the image to be detected. The first position prompt information is input into the position encoding module to obtain the position identification features. By way of example and not limitation, the position encoding may be any of conditional position encoding, learnable absolute position encoding, sine-cosine function encoding, relative position encoding, and the like. In one embodiment, the two-dimensional position coordinates in the first position prompt information may be encoded with sine and cosine functions to obtain the corresponding position identification features.
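As a minimal sketch of the sine-cosine option mentioned above, the function below encodes two-dimensional position coordinates with fixed sine and cosine functions; the frequency schedule, the output dimension, and the function name are assumptions rather than the patent's exact formulation.

    import torch

    def sincos_embed_2d(coords: torch.Tensor, dim: int = 256) -> torch.Tensor:
        # coords: (N, 2) x/y coordinates normalised to [0, 1].
        # Half of the output channels encode x, the other half encode y.
        half = dim // 4
        freqs = torch.exp(torch.arange(half) * (-torch.log(torch.tensor(10000.0)) / half))
        x, y = coords[:, 0:1], coords[:, 1:2]
        return torch.cat([torch.sin(x * freqs), torch.cos(x * freqs),
                          torch.sin(y * freqs), torch.cos(y * freqs)], dim=-1)

    points = torch.tensor([[0.25, 0.40], [0.60, 0.55]])   # two identification points in the defect area
    point_features = sincos_embed_2d(points)              # (2, 256) identification point encoding features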
According to the technical scheme, the image to be detected is input into the image coding module, so that the image characteristics can be obtained. And inputting the first position prompt information into a position coding module to obtain the position identification characteristic. This ensures the efficiency and accuracy of acquiring image features and position-coding features.
Illustratively, the first location hint information includes: target frame prompt information corresponding to a target detection frame containing a defect area, and/or mask prompt information corresponding to a mask of the defect area, and/or identification point prompt information corresponding to one or more identification points in the defect area; the location identification feature includes: a target frame encoding feature corresponding to a target detection frame containing a defective area, and/or a mask encoding feature corresponding to a mask of the defective area, and/or an identification point encoding feature corresponding to one or more identification points within the defective area; the position coding module comprises: a target frame encoder, and/or a mask encoder, and/or an identification point encoder; inputting the first position prompt information into a position coding module to obtain a position identification feature, including: inputting the target frame prompt information into a target frame encoder to obtain target frame coding characteristics; inputting mask prompt information into a mask encoder to obtain mask coding characteristics; the identification point prompt information is input into an identification point encoder to obtain identification point coding characteristics.
In one embodiment, the first position prompt information and the position identification features are as described in detail in the previous embodiments and are not repeated here for brevity. The position encoding module may include any one or more of a target frame encoder, a mask encoder, and an identification point encoder. For example, when the first position prompt information includes target frame prompt information and mask prompt information, the position encoding module may include a target frame encoder and a mask encoder. For another example, when the first position prompt information includes only target frame prompt information, the position encoding module includes the target frame encoder. In one embodiment of the invention, the target frame encoder may encode the target frame position information with a fixed sine-cosine position embedding (sin-cos Position Embedding). The mask encoder may embed the mask information using convolution. Different types of position prompt information are input into the corresponding encoders to obtain the corresponding position identification features: the target frame prompt information is input into the target frame encoder to obtain the target frame encoding features; similarly, the mask prompt information is input into the mask encoder to obtain the mask encoding features, and the identification point prompt information is input into the identification point encoder to obtain the identification point encoding features.
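The sketch below shows one possible shape for the target frame encoder and the mask encoder: the frame corners are embedded with the sincos_embed_2d helper sketched earlier, and the mask is embedded with a small convolutional stack, in line with the embodiment above. All class names and layer sizes are assumptions.

    import torch
    import torch.nn as nn

    class BoxEncoder(nn.Module):
        # Encodes a target detection frame by sin-cos embedding its two corner
        # points (uses the sincos_embed_2d sketch above); sizes are assumptions.
        def __init__(self, dim: int = 256):
            super().__init__()
            self.proj = nn.Linear(2 * dim, dim)

        def forward(self, boxes: torch.Tensor) -> torch.Tensor:   # (N, 4) normalised boxes
            corners = torch.stack([boxes[:, :2], boxes[:, 2:]], dim=1)                 # (N, 2, 2)
            emb = sincos_embed_2d(corners.reshape(-1, 2)).reshape(len(boxes), -1)      # (N, 2 * dim)
            return self.proj(emb)

    class MaskEncoder(nn.Module):
        # Embeds the defect mask with convolutions, as the embodiment suggests.
        def __init__(self, dim: int = 256):
            super().__init__()
            self.net = nn.Sequential(
                nn.Conv2d(1, 16, kernel_size=3, stride=2, padding=1), nn.ReLU(),
                nn.Conv2d(16, dim, kernel_size=3, stride=2, padding=1),
                nn.AdaptiveAvgPool2d(1),
            )

        def forward(self, mask: torch.Tensor) -> torch.Tensor:    # (N, 1, H, W) binary mask
            return self.net(mask.float()).flatten(1)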
According to the technical scheme, the coding features corresponding to the position prompt messages of different types are obtained through different encoders, so that the accuracy of the obtained coding features can be ensured.
FIG. 2 illustrates a schematic diagram of acquiring fusion features according to one embodiment of the invention. As shown in FIG. 2, the image to be detected is input into the image encoding module to obtain the image encoding features corresponding to the image to be detected. The target frame prompt information in the first position prompt information corresponding to the image to be detected is input into the target frame encoder to obtain the target frame encoding features. The mask prompt information in the first position prompt information is input into the mask encoder to obtain the mask encoding features. The obtained image encoding features, target frame encoding features, and mask encoding features are then input into an attention network to obtain the fusion features.
The image coding module is obtained by training in an unsupervised training mode, and the parameters of the image coding module are kept fixed when the defect classification model is trained.
In one embodiment, the image encoding module may be trained in an unsupervised (self-supervised) manner, for example with DINO (Emerging Properties in Self-Supervised Vision Transformers) or DINOv2 (Learning Robust Visual Features without Supervision). When the defect classification model is trained, the parameters of the image encoding module are kept fixed, and only branches such as the position encoding module and the fully connected neural network are trained. The training cost is therefore relatively low, and the training efficiency of the defect classification model can be improved.
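A minimal sketch of this training setup is given below: the pre-trained image encoding module is frozen and only the position encoding branch and the classification head receive gradients. The stand-in modules and the optimiser settings are assumptions, not values taken from the patent.

    import torch
    import torch.nn as nn

    image_encoder = nn.Sequential(nn.Conv2d(3, 256, 16, stride=16), nn.Flatten(2))  # stand-in for a DINO/DINOv2-style backbone
    position_encoder = nn.Linear(4, 256)   # stand-in for the position encoding module (e.g. the target frame branch)
    classifier = nn.Linear(256, 10)        # stand-in for the fully connected neural network

    # Freeze the self-supervised image encoding module; only the other branches are trained.
    for p in image_encoder.parameters():
        p.requires_grad_(False)
    image_encoder.eval()

    optimizer = torch.optim.AdamW(
        list(position_encoder.parameters()) + list(classifier.parameters()), lr=1e-4)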
Illustratively, the defect classification model is obtained by the following training operations: acquiring a sample data set, wherein the sample data set comprises marked data and unmarked data, the marked data comprises a plurality of first sample images, second position prompt information corresponding to each first sample image and marking labels, the marking labels are used for indicating the category of a defect area in the corresponding first sample images, and the unmarked data comprises a plurality of second sample images and third position prompt information corresponding to each second sample image; the following model training operations are performed based on the sample dataset: training the defect classification model by using the marked data to obtain an initially trained defect classification model; inputting a plurality of second sample images and third position prompt information corresponding to the plurality of second sample images into the defect classification model subjected to initial training to obtain first prediction classification results corresponding to the plurality of second sample images, wherein the first prediction classification results are used for indicating the category of the defect region in the corresponding second sample images; acquiring pseudo labels corresponding to at least part of the second sample images in the plurality of second sample images based on the first prediction classification results corresponding to the second sample images; inputting a plurality of first sample images and second position prompt information corresponding to the plurality of first sample images into the defect classification model subjected to initial training to obtain second prediction classification results corresponding to the plurality of first sample images, wherein the second prediction classification results are used for indicating the category to which the defect area in the corresponding first sample images belongs; inputting at least part of the second sample images and at least part of third position prompt information corresponding to the second sample images into the defect classification model subjected to initial training to obtain at least part of third prediction classification results corresponding to the second sample images, wherein the third prediction classification results are used for indicating the category of the defect area in the corresponding second sample images; calculating a prediction loss value based on the difference between the labeling labels corresponding to the first sample images and the second prediction classification result and the difference between the pseudo labels corresponding to at least part of the second sample images and the third prediction classification result; parameters in the initially trained defect classification model are optimized based on the predictive loss values to obtain a trained defect classification model corresponding to the present round of training operations.
In one embodiment, the sample dataset may include labeled data and unlabeled data. The labeled data may include a plurality of first sample images together with second position prompt information and a labeling label corresponding to each first sample image. The manner of acquiring the first sample images is similar to that of the image to be detected, which has been described in detail in step S110 and is not repeated here for brevity. Each first sample image has corresponding second position prompt information and a labeling label. Similar to the first position prompt information, the second position prompt information may include any one or more of target frame prompt information, mask prompt information, and identification point prompt information corresponding to the defect area contained in the first sample image. The labeling label may indicate the category to which the defect area in the corresponding first sample image belongs. The unlabeled data may include a plurality of second sample images and third position prompt information corresponding to each second sample image. The manner of acquiring the second sample images is likewise similar to that of the image to be detected and is not repeated here. The plurality of second sample images may include an initial second sample image together with the images obtained by applying data enhancement to it several times. Implementations of data enhancement include, but are not limited to, flipping, rotating, and cropping the initial second sample image. For example, if there are 100 initial second sample images U, then after two rounds of data enhancement each initial image yields an image U1 after the first enhancement and an image U2 after the second enhancement; the plurality of second sample images may then include the initial second sample image U, the first-enhanced image U1, and the second-enhanced image U2. The third position prompt information is similar to the first and second position prompt information described in the previous embodiments and is not repeated here. FIG. 3 illustrates a defect classification model training process according to one embodiment of the invention. The defect classification model is first trained using the labeled data to obtain an initially trained defect classification model. As shown in FIG. 3, the first prediction classification results corresponding to the second sample images can be obtained by inputting the plurality of second sample images and the third position prompt information corresponding to each of them (i.e., the unlabeled data) into the initially trained defect classification model. The first prediction classification result may indicate the category to which the defect region in the corresponding second sample image belongs. Pseudo labels corresponding to all or part of the second sample images are then obtained based on the first prediction classification results.
Constraints may be added to the pseudo labels corresponding to the second sample images; for example, the Sinkhorn-Knopp algorithm may be used to add constraints to the obtained pseudo labels. Similarly, the plurality of first sample images and their corresponding second position prompt information are input into the initially trained defect classification model to obtain the second prediction classification results corresponding to the first sample images. The second prediction classification result may indicate the category to which the defect region in the corresponding first sample image belongs. The at least part of the second sample images carrying pseudo labels and their corresponding third position prompt information (i.e., the pseudo-labeled data shown in FIG. 3) are input into the initially trained defect classification model to obtain the third prediction classification results corresponding to those second sample images. The third prediction classification result may indicate the category to which the defect region in the corresponding second sample image belongs. The prediction classification results shown in FIG. 3 may include the third prediction classification results and the second prediction classification results. "ema" in FIG. 3 denotes an exponential moving average of the defect classification model weights. The prediction loss value can be calculated with a preset loss function based on the difference between the labeling labels corresponding to the first sample images and the second prediction classification results, and the difference between the pseudo labels corresponding to at least part of the second sample images and the third prediction classification results. The preset loss function may be any loss function, such as a cross-entropy loss function, which is not limited by the present invention. In one embodiment of the invention, the labeling labels corresponding to the plurality of first sample images and the pseudo labels corresponding to at least part of the second sample images may be substituted, together with the corresponding prediction classification results, into the preset loss function to calculate the prediction loss value. Parameters in the initially trained defect classification model may then be optimized using back-propagation and gradient descent based on the prediction loss value. The optimization operation may be repeated until the defect classification model converges. After the present round of training, a trained defect classification model corresponding to this round of training operations is obtained.
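The sketch below illustrates the pseudo-label constraint and the combined loss: a Sinkhorn-Knopp-style normalisation balances the soft pseudo-labels of the unlabeled batch across categories, and cross-entropy terms are computed against the labeling labels and the pseudo labels. The iteration count, batch sizes, and equal loss weighting are assumptions and may differ from the patent's exact formulation.

    import torch
    import torch.nn.functional as F

    def sinkhorn_knopp(logits: torch.Tensor, n_iters: int = 3) -> torch.Tensor:
        # Alternately normalise over categories and over the batch so that the
        # soft pseudo-labels are roughly balanced across categories.
        q = torch.exp(logits).t()                 # (num_categories, batch)
        q /= q.sum()
        for _ in range(n_iters):
            q /= q.sum(dim=1, keepdim=True)       # each category used equally often
            q /= q.shape[0]
            q /= q.sum(dim=0, keepdim=True)       # each sample gets a probability distribution
            q /= q.shape[1]
        return (q * q.shape[1]).t()               # (batch, num_categories) soft pseudo-labels

    labeled_logits = torch.randn(8, 10)           # second prediction classification results
    labels = torch.randint(0, 10, (8,))           # labeling labels of the first sample images
    unlabeled_logits = torch.randn(16, 10)        # third prediction classification results
    pseudo_labels = sinkhorn_knopp(unlabeled_logits.detach())

    prediction_loss = (F.cross_entropy(labeled_logits, labels)
                       + F.cross_entropy(unlabeled_logits, pseudo_labels))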
According to the above technical scheme, the defect classification model is trained using both the labeled data and the unlabeled data, and new categories are discovered based on the constraint added to the pseudo labels and on the consistency between multiple data-enhanced views of the same initial sample image. Meanwhile, the trained defect classification model can detect defects of previously unknown categories.
Illustratively, the image under test includes a plurality of first images under test, and the training operation may further include: after the training operation of the round is completed, a fourth prediction classification result which is obtained through prediction of the trained defect classification model and corresponds to at least one test image is obtained, wherein the fourth prediction classification result is used for indicating the category to which the defect area in the corresponding test image belongs, and the at least one test image comprises one or more of the following: at least a portion of the plurality of second sample images; an image to be measured; a new image other than the plurality of second sample images and the image to be measured; outputting fourth prediction classification results corresponding to at least one test image respectively; and responding to the check information input by the user, and executing corresponding operation on the fourth prediction classification result and/or the test image.
In one embodiment, the image to be measured may include a plurality of first images to be measured. In the foregoing embodiment, the defect classification model after the training is obtained through the training, and one or more test images may be respectively input into the defect classification model, so as to obtain a fourth prediction classification result corresponding to each test image. The fourth prediction classification result may indicate a category to which the defective region in the corresponding test image belongs. The test image may be any image containing the target object. For example, the test image may include at least a part of the plurality of second sample images, the image to be tested, and a new image other than the plurality of second sample images and the image to be tested. The one or more test images may also include only at least part of the plurality of second sample images, or the image to be tested, or a new image other than the plurality of second sample images and the image to be tested.
In one embodiment, the apparatus for performing the defect classification method in the embodiment of the present invention may include an input device and/or an output device. The input device and/or the output device may be communicatively coupled to, or included in, the apparatus for performing the defect classification method in the embodiment of the present invention. The input device may include, but is not limited to, one or more of a mouse, a keyboard, a microphone, a touch screen, and the like. The output device may include, but is not limited to, one or more of a display device, a speaker, and the like. After the fourth prediction classification results are obtained, the fourth prediction classification result corresponding to each of the at least one test image may be output. For example, if the output device is a display device, the fourth prediction classification results may be displayed in a display interface of the display device. The user may also input check information through the input device. Based on the check information input by the user, a corresponding operation may be performed on the fourth prediction classification result and/or the test image. For example, if the check information input by the user indicates that some test images that do not meet the requirements should be deleted, the unsatisfactory images may be automatically deleted from the test images in response to the check information currently input by the user.
According to the above technical scheme, after the training operation of the present round is completed, the fourth prediction classification results, predicted by the trained defect classification model and respectively corresponding to the at least one test image, can be obtained and output. The user may then input check information to perform a corresponding operation on the fourth prediction classification result and/or the test image. This scheme allows the user to conveniently check the fourth prediction classification results and the test images and to perform corresponding operations on them, thereby meeting the requirements of different users and providing strong interactivity.
Illustratively, the responding to the check information input by the user and performing a corresponding operation on the fourth prediction classification result and/or the test image may include one or more of the following: for any test image classified into a known category based on the fourth prediction classification result, deleting the test image if the check information includes deletion information about the test image, wherein the known category is a category indicated by a labeling label in the labeled data; for any new category divided in the fourth prediction classification result, if the check information includes deletion information about the new category, deleting the new category from the fourth prediction classification result, wherein the new category is a category different from the known categories; for any new category divided in the fourth prediction classification result, if the check information includes merging information about the new category, merging the new category and the known category designated by the merging information into the same category; and for any new category divided in the fourth prediction classification result, if the check information includes adding information about the new category, adding the new category and the test image corresponding to the new category into the labeled data.
In one embodiment, the fourth prediction classification result may indicate that a defective area belongs to a known category, or may indicate that a defective area belongs to an unknown category. A known category is a category indicated by a labeling label in the labeled data. For example, if the fourth prediction classification result corresponding to a test image indicates that the category to which the defect area in the test image belongs is a scratch, and the scratch category is indicated by a labeling label in the labeled data, then the category of the defect area indicated by the fourth prediction classification result is a known category.
In the first embodiment, for any test image classified into a known category based on the fourth prediction classification result, if the check information includes deletion information about the test image, the test image is deleted. For example, for a test image A, if the user inputs text such as "delete test image A", or clicks an "×" control corresponding to the test image A, this may indicate that the user desires to delete the test image, and the test image A may then be deleted from the at least one test image in response to the check information currently input by the user.
In the second embodiment, for any new category divided in the fourth prediction classification result, if the check information includes deletion information about the new category, the new category is deleted from the fourth prediction classification result. A new category may represent a category other than the known categories. For example, the fourth prediction classification result corresponding to a test image B indicates that the category to which the defect region contained in the test image B belongs is a category b. If the known categories do not include the category b, the category b may be taken as a new category. If the check information input by the user includes deletion information, such as text of "delete category b", which may indicate that the category b is an invalid category and the user desires to delete it, the category b may be deleted from the fourth prediction classification result in response to the check information currently input by the user.
In the third embodiment, for any new category divided in the fourth prediction classification result, if the check information includes merging information about the new category, the new category may be merged into the same category as the known category designated by the merging information. For example, the fourth prediction classification result corresponding to a test image C indicates that the category to which the defect area contained in the test image C belongs is a category c, and the similarity between the category c and the scratch category is high; the user may then input merging information to merge the category c and the scratch category. After the new category and the known category designated by the merging information are merged into the same category, the defect area contained in the test image C may be labeled with the merged category.
In the fourth embodiment, for any new category divided in the fourth prediction classification result, if the check information includes adding information about the new category, the new category and the test image corresponding to the new category are added into the labeled data. For example, if the fourth prediction classification result corresponding to a test image D indicates that the category to which the defective region contained in the test image D belongs is a category d, and the category d is a valid new category, the user may input adding information to add the category d to the known categories. Meanwhile, based on the adding information input by the user, the test image D corresponding to the category d may be added into the labeled data.
According to the above technical solution, different operations may be performed according to different check information based on the fourth prediction classification result. When the check information includes deletion information about any test image classified into a known category, the test image may be deleted. When the check information includes deletion information about any new category, the new category may be deleted from the fourth prediction classification result. When the check information includes merging information about any new category, the new category may be merged into the same category as the known category designated by the merging information. When the check information includes adding information about any new category, the new category and the test image corresponding to the new category may be added into the labeled data. In this scheme, operations such as deleting a new category, adding a new category to the known categories, or merging a new category with a known category can be performed through manual checking, and test images can also be deleted. This allows the labeled data to be iteratively optimized so as to obtain more new categories.
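To illustrate how such check information might be applied programmatically, the following Python sketch implements the four operations as simple dictionary and list manipulations. The data structures, the field names (op, image_id, category, target) and the function name apply_check_info are illustrative assumptions; the invention does not prescribe a concrete representation. A graphical front end would typically translate clicks on "×" controls or typed commands into records of this form before calling such a routine.

def apply_check_info(check_info, predictions, test_images, known_categories, labeled_data):
    # predictions: dict mapping image id -> predicted category
    # test_images: dict mapping image id -> image
    # known_categories: set of category names from the labeled data
    # labeled_data: list of (image, category) pairs
    op = check_info["op"]

    if op == "delete_image":                     # first embodiment: delete a test image
        predictions.pop(check_info["image_id"], None)
        test_images.pop(check_info["image_id"], None)

    elif op == "delete_new_category":            # second embodiment: drop an invalid new category
        cat = check_info["category"]
        predictions = {k: v for k, v in predictions.items() if v != cat}

    elif op == "merge_new_category":             # third embodiment: merge a new category into a known one
        new_cat, known_cat = check_info["category"], check_info["target"]
        predictions = {k: (known_cat if v == new_cat else v) for k, v in predictions.items()}

    elif op == "add_new_category":               # fourth embodiment: add the new category to the labeled data
        cat = check_info["category"]
        known_categories.add(cat)
        for image_id, pred_cat in predictions.items():
            if pred_cat == cat and image_id in test_images:
                labeled_data.append((test_images[image_id], cat))

    return predictions, test_images, known_categories, labeled_data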
FIG. 4 shows a schematic diagram of a defect classification model training process according to another embodiment of the present invention. As shown in fig. 4, when the defect classification model is trained for the first time, the unlabeled data and the labeled data in the sample data set may be input into the defect classification model to obtain corresponding prediction classification results. As shown in fig. 4, the prediction classification results may include 8 defect categories (each square may represent one defect category), wherein the 4 defect categories within the dashed box represent known categories and the other 4 defect categories represent new categories. Based on the obtained prediction classification results, the user may input different check information to check the prediction classification results. After the checking is completed, new categories that do not meet the requirements may be deleted from the prediction classification results (for example, the two new categories indicated by arrows in fig. 4 do not meet the requirements and may be deleted), some new categories in the prediction classification results may be merged with known categories in the labeled data, or new categories in the prediction classification results may be added to the known categories. For a merged or added new category, the test images corresponding to that part of the prediction classification results may be re-partitioned into the labeled data, and the defect classification model continues to be trained. It will be appreciated that when the defect classification model is trained for the first time, the sample data set used for training does not contain the "new category annotation data".
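The iterative process of fig. 4 can likewise be sketched as an outer loop that alternates training with user checking. The Python sketch below composes the hypothetical train_step and apply_check_info helpers from the earlier sketches; the sampling, prediction and check-information callbacks are passed in as parameters and, like the number of rounds, are assumptions for illustration only.

def iterative_training(model, ema_model, optimizer, labeled_data, unlabeled_data,
                       known_categories, sample_batch, predict, get_user_check_info,
                       num_rounds=3, steps_per_round=1000):
    # sample_batch, predict and get_user_check_info are caller-supplied callables
    # (hypothetical); the first training round contains no "new category annotation data".
    for _ in range(num_rounds):
        for _ in range(steps_per_round):
            train_step(model, ema_model,
                       sample_batch(labeled_data), sample_batch(unlabeled_data), optimizer)

        # Predict on test images and let the user check the results.
        predictions = {image_id: predict(model, image) for image_id, image in unlabeled_data.items()}
        for check_info in get_user_check_info(predictions):
            predictions, unlabeled_data, known_categories, labeled_data = apply_check_info(
                check_info, predictions, unlabeled_data, known_categories, labeled_data)
        # Merged or added new categories now sit in labeled_data, so the next round
        # continues training with the enlarged set of known categories.
    return model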
According to another aspect of the present invention, there is also provided a defect classification apparatus. Fig. 5 shows a schematic block diagram of a defect classification apparatus 500 according to an embodiment of the present invention. As shown in fig. 5, the defect classification apparatus 500 includes a first input module 510 and a second input module 520.
The first input module 510 is configured to input an image to be measured and first location prompt information into a feature extraction network of a defect classification model to obtain image features and location identification features of the image to be measured, where the defect classification model is an open world semi-supervised classification model, and the first location prompt information is configured to indicate a location of a defect region in the image to be measured.
The second input module 520 is configured to input the image feature and the location identification feature into a fully connected neural network of the defect classification model to obtain a defect classification result, where the defect classification result is used to indicate a category to which the defect region in the image to be detected belongs.
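As a purely illustrative sketch of the two modules, the following PyTorch-style model pairs an image encoder and a position prompt encoder (here, a target-box prompt) with an attention-based fusion step and a fully connected classification head. The layer sizes, the choice of a four-coordinate box prompt, and the fusion details are assumptions, not values specified by the present invention. Using the position identification feature as the attention query lets the classifier focus the image features on the indicated defect region; this is one reasonable reading of the feature-fusion step described above, not the only possible one.

import torch
import torch.nn as nn

class DefectClassifier(nn.Module):
    def __init__(self, num_classes, feat_dim=256):
        super().__init__()
        # Stand-in image coding module (in practice a pretrained encoder could be used
        # and kept frozen, as described for the image coding module).
        self.image_encoder = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, feat_dim))
        # Position coding module for a target-box prompt (x1, y1, x2, y2).
        self.prompt_encoder = nn.Linear(4, feat_dim)
        # Attention network used for feature fusion.
        self.fusion = nn.MultiheadAttention(feat_dim, num_heads=4, batch_first=True)
        # Fully connected neural network producing the defect classification result.
        self.head = nn.Sequential(nn.Linear(feat_dim, feat_dim), nn.ReLU(),
                                  nn.Linear(feat_dim, num_classes))

    def forward(self, image, position_prompt):
        img_feat = self.image_encoder(image).unsqueeze(1)             # [B, 1, D] image features
        pos_feat = self.prompt_encoder(position_prompt).unsqueeze(1)  # [B, 1, D] position identification features
        fused, _ = self.fusion(query=pos_feat, key=img_feat, value=img_feat)
        return self.head(fused.squeeze(1))                            # defect classification logits

# Example with hypothetical shapes: logits = DefectClassifier(num_classes=8)(
#     torch.randn(2, 3, 224, 224), torch.tensor([[10., 20., 60., 80.], [5., 5., 40., 40.]]))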
Those skilled in the art will understand the specific implementation and advantages of the defect classification device according to the above description about the defect classification method 100, and the detailed description is omitted herein for brevity.
According to still another aspect of the present invention, an electronic device is also provided. Fig. 6 shows a schematic block diagram of an electronic device according to an embodiment of the present invention. As shown in fig. 6, the electronic device 600 includes a processor 610 and a memory 620, and the memory 620 stores computer program instructions which, when executed by the processor 610, are used to perform the defect classification method described above.
According to yet another aspect of the present invention, there is also provided a storage medium storing a computer program/instructions. The storage medium may include, for example, a storage component of a tablet computer, a hard disk of a personal computer, an erasable programmable read-only memory (EPROM), a portable compact disc read-only memory (CD-ROM), a USB memory, or any combination of the foregoing storage media. The storage medium may be any combination of one or more computer-readable storage media. When run by a processor, the computer program/instructions are used to perform the defect classification method described above.
Those skilled in the art will understand the specific implementation of the electronic device and the storage medium according to the above description about the defect classification method, and for brevity, the description is omitted here.
Although the illustrative embodiments have been described herein with reference to the accompanying drawings, it is to be understood that the above illustrative embodiments are merely illustrative and are not intended to limit the scope of the present invention thereto. Various changes and modifications may be made therein by one of ordinary skill in the art without departing from the scope and spirit of the invention. All such changes and modifications are intended to be included within the scope of the present invention as set forth in the appended claims.
Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
In the several embodiments provided by the present invention, it should be understood that the disclosed apparatus and method may be implemented in other ways. For example, the above-described device embodiments are merely illustrative, e.g., the division of the elements is merely a logical functional division, and there may be additional divisions when actually implemented, e.g., multiple elements or components may be combined or integrated into another device, or some features may be omitted or not performed.
In the description provided herein, numerous specific details are set forth. However, it is understood that embodiments of the invention may be practiced without these specific details. In some instances, well-known methods, structures and techniques have not been shown in detail in order not to obscure an understanding of this description.
Similarly, it should be appreciated that in order to streamline the invention and aid in understanding one or more of the various inventive aspects, various features of the invention are sometimes grouped together in a single embodiment, figure, or description thereof in the description of exemplary embodiments of the invention. However, the method of the present invention should not be construed as reflecting the following intent: i.e., the claimed invention requires more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive aspects lie in less than all features of a single disclosed embodiment. Thus, the claims following the detailed description are hereby expressly incorporated into this detailed description, with each claim standing on its own as a separate embodiment of this invention.
It will be understood by those skilled in the art that all of the features disclosed in this specification (including any accompanying claims, abstract and drawings), and all of the processes or units of any method or apparatus so disclosed, may be combined in any combination, except combinations where the features are mutually exclusive. Each feature disclosed in this specification (including any accompanying claims, abstract and drawings), may be replaced by alternative features serving the same, equivalent or similar purpose, unless expressly stated otherwise.
Furthermore, those skilled in the art will appreciate that while some embodiments described herein include some features but not others included in other embodiments, combinations of features of different embodiments are meant to be within the scope of the invention and form different embodiments. For example, in the claims, any of the claimed embodiments may be used in any combination.
Various component embodiments of the invention may be implemented in hardware, or in software modules running on one or more processors, or in a combination thereof. Those skilled in the art will appreciate that some or all of the functions of some of the modules in the defect classification device according to an embodiment of the invention may be implemented in practice using a microprocessor or Digital Signal Processor (DSP). The present invention can also be implemented as an apparatus program (e.g., a computer program and a computer program product) for performing a portion or all of the methods described herein. Such a program embodying the present invention may be stored on a computer readable medium, or may have the form of one or more signals. Such signals may be downloaded from an internet website, provided on a carrier signal, or provided in any other form.
It should be noted that the above-mentioned embodiments illustrate rather than limit the invention, and that those skilled in the art will be able to design alternative embodiments without departing from the scope of the appended claims. In the claims, any reference signs placed between parentheses shall not be construed as limiting the claim. The word "a" or "an" preceding an element does not exclude the presence of a plurality of such elements. The invention may be implemented by means of hardware comprising several distinct elements, and by means of a suitably programmed computer. The use of the words first, second, third, etc. does not denote any order. These words may be interpreted as names.
The foregoing description is merely illustrative of specific embodiments of the present invention and the scope of the present invention is not limited thereto, and any person skilled in the art can easily think about variations or substitutions within the scope of the present invention. The protection scope of the invention is subject to the protection scope of the claims.

Claims (13)

1. A method of defect classification, the method comprising:
inputting an image to be detected and first position prompt information into a feature extraction network of a defect classification model to obtain image features and position identification features of the image to be detected, wherein the first position prompt information is used for indicating the position of a defect region in the image to be detected;
inputting the image features and the position identification features into a fully connected neural network of the defect classification model to obtain a defect classification result, wherein the defect classification result is used for indicating the category of a defect region in the image to be detected;
the defect classification model is an open world semi-supervised classification model.
2. The method of claim 1, wherein the defect classification model further comprises an attention network, the method further comprising, prior to said inputting the image features and the location identification features into the fully connected neural network of the defect classification model to obtain defect classification results:
inputting the image features and the position identification features into the attention network for feature fusion so as to obtain fusion features;
the inputting the image features and the location identification features into the fully connected neural network of the defect classification model to obtain defect classification results comprises:
and inputting the fusion features into the fully-connected neural network to obtain the defect classification result.
3. The method of claim 1, wherein,
the first position prompt information includes: target frame prompt information corresponding to a target detection frame containing the defect area, and/or mask prompt information corresponding to a mask of the defect area, and/or identification point prompt information corresponding to one or more identification points in the defect area;
the location identification feature comprises: a target frame encoding feature corresponding to a target detection frame containing the defective area, and/or a mask encoding feature corresponding to a mask of the defective area, and/or an identification point encoding feature corresponding to one or more identification points within the defective area.
4. The method of claim 1, wherein the feature extraction network comprises an image coding module and a position coding module, and the inputting the image to be detected and the first position prompt information into the feature extraction network of the defect classification model to obtain the image features and the position identification features of the image to be detected comprises:
inputting the image to be detected into the image coding module to obtain the image features;
and inputting the first position prompt information into the position coding module to obtain the position identification features.
5. The method of claim 4, wherein the first position prompt information comprises: target frame prompt information corresponding to a target detection frame containing the defect area, and/or mask prompt information corresponding to a mask of the defect area, and/or identification point prompt information corresponding to one or more identification points in the defect area; the position identification feature comprises: a target frame coding feature corresponding to a target detection frame containing the defective area, and/or a mask coding feature corresponding to a mask of the defective area, and/or an identification point coding feature corresponding to one or more identification points within the defective area; the position coding module comprises: a target frame encoder, and/or a mask encoder, and/or an identification point encoder;
the inputting the first position prompt information into the position coding module to obtain the position identification feature includes:
inputting the target frame prompt information into the target frame encoder to obtain the target frame coding feature; and/or,
inputting the mask prompt information into the mask encoder to obtain the mask coding feature; and/or,
and inputting the identification point prompt information into the identification point encoder to obtain the identification point coding feature.
6. The method of claim 4, wherein the image coding module is trained using an unsupervised training scheme, wherein parameters of the image coding module remain fixed while training the defect classification model, and wherein parameters of the position coding module are trained.
7. The method according to any one of claims 1-6, wherein the defect classification model is obtained by a training operation comprising:
acquiring a sample data set, wherein the sample data set comprises marked data and unmarked data, the marked data comprises a plurality of first sample images, second position prompt information corresponding to each first sample image and marking labels, the marking labels are used for indicating the category of a defect area in the corresponding first sample image, and the unmarked data comprises a plurality of second sample images and third position prompt information corresponding to each second sample image;
Performing the following model training operations based on the sample dataset:
training the defect classification model by using the marked data to obtain an initially trained defect classification model;
inputting the plurality of second sample images and third position prompt information corresponding to the plurality of second sample images into the defect classification model which is initially trained so as to obtain first prediction classification results corresponding to the plurality of second sample images, wherein the first prediction classification results are used for indicating the category of the defect region in the corresponding second sample images;
acquiring pseudo labels corresponding to at least part of the second sample images in the plurality of second sample images based on the first prediction classification results corresponding to the plurality of second sample images;
inputting the plurality of first sample images and second position prompt information corresponding to the plurality of first sample images into the defect classification model which is initially trained so as to obtain second prediction classification results corresponding to the plurality of first sample images, wherein the second prediction classification results are used for indicating the category to which the defect region in the corresponding first sample image belongs;
inputting the at least part of the second sample images and the third position prompt information corresponding to the at least part of the second sample images into the defect classification model which is initially trained, so as to obtain third prediction classification results corresponding to the at least part of the second sample images, wherein the third prediction classification result is used for indicating the category to which the defect region in the corresponding second sample image belongs;
calculating a prediction loss value based on the difference between the labeling labels corresponding to the plurality of first sample images and the second prediction classification result and the difference between the pseudo labels corresponding to the at least part of the second sample images and the third prediction classification result;
and optimizing parameters in the defect classification model which is initially trained based on the predicted loss value to obtain the defect classification model which is trained and completed and corresponds to the round of training operation.
8. The method of claim 7, wherein the image to be measured comprises a plurality of first images to be measured, the training operation further comprising:
after the training operation of the present round is completed, obtaining fourth prediction classification results which are predicted by the trained defect classification model and respectively correspond to at least one test image, wherein the fourth prediction classification result is used for indicating the category to which the defect area in the corresponding test image belongs, and the at least one test image comprises one or more of the following: at least a portion of the plurality of second sample images; the image to be detected; new images other than the plurality of second sample images and the image to be detected;
outputting the fourth prediction classification results respectively corresponding to the at least one test image;
and responding to the check information input by the user, and executing corresponding operation on the fourth prediction classification result and/or the test image.
9. The method of claim 8, wherein the responding to the check information input by the user, and performing a corresponding operation on the fourth prediction classification result and/or the test image, comprises one or more of the following:
for any test image classified into a known category based on the fourth prediction classification result, deleting the test image if the check information comprises deletion information about the test image, wherein the known category is a category indicated by a labeling label in the marked data;
for any new category divided in the fourth prediction classification result, if the check information comprises deletion information about the new category, deleting the new category from the fourth prediction classification result, wherein the new category is a category different from the known category;
for any new category divided in the fourth prediction classification result, if the check information comprises merging information about the new category, merging the new category and a known category designated by the merging information into the same category;
and for any new category divided in the fourth prediction classification result, if the check information comprises adding information about the new category, adding the new category and the test image corresponding to the new category into the marked data.
10. The method of any of claims 1-6, wherein the image to be measured is a wafer image comprising a wafer and the defect area is a defect area on the wafer.
11. A defect classification device, the device comprising:
the first input module is used for inputting an image to be detected and first position prompt information into a feature extraction network of a defect classification model to obtain image features and position identification features of the image to be detected, wherein the first position prompt information is used for indicating the position of a defect region in the image to be detected;
the second input module is used for inputting the image features and the position identification features into the fully-connected neural network of the defect classification model to obtain a defect classification result, wherein the defect classification result is used for indicating the category of the defect region in the image to be detected;
the defect classification model is an open world semi-supervised classification model.
12. An electronic device comprising a processor and a memory, wherein the memory has stored therein computer program instructions which, when executed by the processor, are adapted to carry out the defect classification method of any of claims 1-10.
13. A storage medium storing a computer program/instruction which, when executed, is adapted to carry out the defect classification method of any one of claims 1-10.
CN202311802922.XA 2023-12-26 2023-12-26 Defect classification method and device, electronic equipment and storage medium Active CN117456290B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311802922.XA CN117456290B (en) 2023-12-26 2023-12-26 Defect classification method and device, electronic equipment and storage medium

Publications (2)

Publication Number Publication Date
CN117456290A true CN117456290A (en) 2024-01-26
CN117456290B CN117456290B (en) 2024-04-16

Family

ID=89582244

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311802922.XA Active CN117456290B (en) 2023-12-26 2023-12-26 Defect classification method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN117456290B (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113205176A (en) * 2021-04-19 2021-08-03 重庆创通联达智能技术有限公司 Method, device and equipment for training defect classification detection model and storage medium
CN114359622A (en) * 2021-12-06 2022-04-15 中国科学院深圳先进技术研究院 Image classification method based on convolution neural network-converter hybrid architecture
CN114972174A (en) * 2022-04-06 2022-08-30 电子科技大学中山学院 Defect detection method and device, electronic equipment and storage medium
CN116012291A (en) * 2022-11-21 2023-04-25 南京工业大学 Industrial part image defect detection method and system, electronic equipment and storage medium
CN116485735A (en) * 2023-04-06 2023-07-25 深兰人工智能应用研究院(山东)有限公司 Defect detection method, defect detection device, electronic equipment and computer readable storage medium
CN116994135A (en) * 2023-07-28 2023-11-03 南京航空航天大学 Ship target detection method based on vision and radar fusion

Also Published As

Publication number Publication date
CN117456290B (en) 2024-04-16

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant