WO2023208091A1 - Detection formula configuration and optimization method and apparatus, electronic device and storage medium - Google Patents

Info

Publication number
WO2023208091A1
WO2023208091A1 PCT/CN2023/091070 CN2023091070W WO2023208091A1 WO 2023208091 A1 WO2023208091 A1 WO 2023208091A1 CN 2023091070 W CN2023091070 W CN 2023091070W WO 2023208091 A1 WO2023208091 A1 WO 2023208091A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
detection
information
defect
detection result
Prior art date
Application number
PCT/CN2023/091070
Other languages
French (fr)
Chinese (zh)
Inventor
王敬贤
刘涛
潘成安
邓帅飞
易兵
鲁阳
张记晨
周许超
Original Assignee
上海微电子装备(集团)股份有限公司
Priority date
Filing date
Publication date
Application filed by 上海微电子装备(集团)股份有限公司
Publication of WO2023208091A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/762 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using clustering, e.g. of similar faces in social networks
    • G06V10/763 Non-hierarchical techniques, e.g. based on statistics of modelling distributions
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00 Computing arrangements using knowledge-based models
    • G06N5/04 Inference or reasoning models
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00 Computing arrangements using knowledge-based models
    • G06N5/04 Inference or reasoning models
    • G06N5/042 Backward inferencing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/0002 Inspection of images, e.g. flaw detection
    • G06T7/0004 Industrial image inspection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/762 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using clustering, e.g. of similar faces in social networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77 Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/774 Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/30 Subject of image; Context of image processing
    • G06T2207/30108 Industrial image inspection
    • G06T2207/30148 Semiconductor; IC; Wafer

Definitions

  • The invention relates to the field of semiconductor technology, and in particular to a detection recipe setting and optimization method, device, electronic equipment and storage medium.
  • Wafer warpage (bow) and wafer surface morphology are key parameters that affect process stability and product yield, and they have a critical influence on wafer yield. For example, after a wafer undergoes processes such as etching or thin film deposition, it may warp to varying degrees or its surface may become uneven; as another example, a robot may scratch the wafer during the manufacturing of semiconductor integrated circuits. Wafer defects are therefore what chip manufacturers pay most attention to during yield inspection. Once a wafer is defective, it is difficult to remedy through subsequent processes, so it is crucial to detect defects on the wafer surface quickly and accurately to avoid wasting production resources on defective products that flow into the next process.
  • The wafer defect detection process usually uses forward adjustment of process parameters.
  • The detection parameters of the detection process are usually adjusted one at a time. Because the coupling relationship between parameters cannot be taken into account, repeated adjustment of a single parameter may lead to deviations in the parameter adjustment results.
  • The detection formula requires repeated parameter adjustment, which increases manpower and time costs.
  • Existing detection formulas are difficult to apply to defect detection in new processes. Adjusting the parameters of a detection formula requires some algorithm background, which places high demands on users.
  • the purpose of the present invention is to provide a detection recipe setting and optimization method, system, electronic equipment and storage medium in view of the defects existing in the prior art.
  • The detection recipe setting and optimization method provided by the invention is based on prior knowledge from the detection result data and fully considers the coupling relationship between parameters, so that the strategy and parameter setting values of the detection formula are determined in one pass, which improves both the efficiency of the detection process and the detection accuracy of the detection formula.
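  • The present invention provides a detection formula setting and optimization method, including:
  • Annotating a first data sample to obtain a second data sample; wherein the first data sample includes several pieces of detection result data, and the second data sample includes the detection result data and the label corresponding to each piece of detection result data;
  • According to the second data sample, obtaining the data feature distribution information of the detection object;
  • Using a preset outlier statistical analysis strategy, performing outlier statistical analysis on the data feature distribution information to obtain defect distribution boundary information, and determining the detection formula according to the preset outlier statistical analysis strategy;
  • According to the defect distribution boundary information and the preset outlier statistical analysis strategy, setting or optimizing the values of the detection parameters of the detection formula through reverse derivation.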
  • The detection result data includes basic information and characteristic data information of the detection object; wherein the characteristic data information includes position information of the detection result on the detection object, and one or more of the process flow information of the detection object and the grayscale information, shape information and texture information of the data information of the detection result;
  • Annotating the first data sample to obtain the second data sample includes:
  • For each piece of detection result data, obtaining the original information corresponding to the detection result data on the detection object based on the basic information of the detection object and the position information of the detection result on the detection object;
  • Based on the original information, determining whether the defect indicated by the data information of the detection result is a true defect; if so, the detection result data is marked as true defect data; if not, the detection result data is marked as noise data;
  • The second data sample is obtained based on all the detection result data and the label corresponding to each piece of detection result data.
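  • The annotation steps above can be sketched as follows. This is a minimal illustration, not the patented implementation: the names `DetectionResult`, `annotate`, `load_original` and `is_true_defect` are hypothetical, and the re-judgment itself (manual or by machine) is passed in as a callable because the text does not fix how it is done.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Iterable, List, Tuple

TRUE_DEFECT = "true_defect"
NOISE = "noise"


@dataclass(frozen=True)
class DetectionResult:
    """One piece of detection result data (hypothetical layout)."""
    object_id: str                # basic information of the detection object
    position: Tuple[int, int]     # position of the detection result on the object
    features: Dict[str, float]    # e.g. grayscale, shape or texture values


def annotate(
    first_sample: Iterable[DetectionResult],
    load_original: Callable[[str, Tuple[int, int]], Any],
    is_true_defect: Callable[[Any], bool],
) -> List[Tuple[DetectionResult, str]]:
    """Build the second data sample: each result paired with its label."""
    second_sample = []
    for result in first_sample:
        # Look up the original information (e.g. an image patch) for this result.
        original = load_original(result.object_id, result.position)
        # Re-judge the result and attach the corresponding label.
        label = TRUE_DEFECT if is_true_defect(original) else NOISE
        second_sample.append((result, label))
    return second_sample
```

  • In use, `load_original` would typically read an image region from storage and `is_true_defect` would wrap an operator decision or a classifier; both are assumptions of this sketch.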
  • the detection object includes a Wafer;
  • the basic information of the Wafer includes the number of the Wafer, the number of Dies it contains, and the basic information of each Die;
  • the basic information of the Die includes the Die number and the image information of the Die;
  • Obtaining the original information corresponding to the detection result data on the detection object based on the basic information of the detection object and the position information of the detection result on the detection object includes:
  • the image information of the detection result corresponding to the piece of detection result data on the Die is obtained.
  • Obtaining the data feature distribution information of the detection object according to the second data sample includes:
  • Determining the characteristic data axis and the segmentation data axis, and establishing a feature space based on the characteristic data axis and the segmentation data axis; wherein:
  • the characteristic data axis represents the characteristic data information of the detection result data;
  • the segmentation data axis represents segmentation feature information;
  • the segmentation feature information includes feature data information other than that represented by the characteristic data axis;
  • the feature space includes one or more characteristic data axes and one or more segmentation data axes.
  • Arranging the second data samples according to the feature space to obtain data feature distribution information of the detection object includes:
  • In the horizontal axis direction, according to the feature value of the characteristic data information represented by the characteristic data axis, and in the vertical axis direction, according to the feature value of the feature data information represented by the segmentation data axis, arranging the second data samples to obtain a defect feature distribution map.
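  • As a rough illustration of the arrangement described above, the sketch below projects labelled samples onto a two-dimensional feature space, with one characteristic data axis as the horizontal coordinate and one segmentation data axis as the vertical coordinate. The function name `arrange_in_feature_space` and the dictionary layout of the samples are assumptions made for this example.

```python
from typing import Dict, List, Tuple

Sample = Tuple[Dict[str, float], str]   # (feature values, label)
Point = Tuple[float, float, str]        # (horizontal value, vertical value, label)


def arrange_in_feature_space(
    samples: List[Sample],
    feature_axis: str,
    segmentation_axis: str,
) -> List[Point]:
    """Place every labelled sample at (feature value, segmentation value)."""
    points = [
        (features[feature_axis], features[segmentation_axis], label)
        for features, label in samples
    ]
    # Order by horizontal value first, then by vertical value.
    return sorted(points, key=lambda p: (p[0], p[1]))
```

  • The returned points can be passed directly to a plotting tool to draw a distribution map in which true defects and noise are coloured by label.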
  • using a preset outlier statistical analysis strategy to perform outlier statistical analysis on the data feature distribution information to obtain defect distribution boundary information includes:
  • Training the outlier statistical analysis model includes: training the selected outlier statistical analysis model according to the detection result data and the data feature distribution information until the obtained defect distribution boundary information of the detection object satisfies the first preset condition;
  • Using the data segmentation method to perform outlier statistical analysis on the data feature distribution information includes: based on the detection result data and the data feature distribution information, obtaining at least one first segmentation threshold on the feature data axis and/or the segmented data axis; and obtaining the defect distribution boundary information according to the first segmentation threshold until the obtained defect distribution boundary information of the detection object satisfies the second preset condition.
  • The segmented data axis represents process flow information; and performing threshold segmentation on the characteristic data axis and/or the segmented data axis, based on the detection result data and the data feature distribution information, until the defect distribution boundary information of the detection object satisfies the second preset condition includes:
  • the defect distribution boundary information of the detection object is obtained.
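  • A minimal sketch of threshold segmentation on a single feature axis follows. The text does not define the second preset condition, so this example assumes one: at least `min_recall` of the samples labelled as true defects must lie at or above the threshold. It also assumes that true defects have larger feature values than noise. The function name and parameters are hypothetical.

```python
from typing import List, Sequence


def find_segmentation_threshold(
    values: Sequence[float],
    is_defect: Sequence[bool],
    min_recall: float = 1.0,
) -> float:
    """Return the largest threshold t such that 'value >= t' still keeps
    at least min_recall of the true defect samples (assumed condition)."""
    defect_values: List[float] = sorted(
        v for v, flag in zip(values, is_defect) if flag
    )
    if not defect_values:
        raise ValueError("at least one true defect sample is required")
    # Number of true defects allowed to fall below the threshold.
    # int() rounds down, so the achieved recall is never below min_recall.
    allowed_misses = int(len(defect_values) * (1.0 - min_recall))
    return defect_values[allowed_misses]
```

  • With `min_recall=1.0` the threshold is simply the smallest feature value among the true defects, which is the tightest boundary that loses none of them.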
  • The preset outlier statistical analysis strategy also includes: an outlier statistical analysis strategy that combines data segmentation and model learning;
  • The outlier statistical analysis strategy that combines data segmentation and model learning includes: based on the data feature distribution information, obtaining at least one first segmentation threshold on the segmentation data axis for the detection result data labeled as true defects; and training the selected outlier statistical analysis model according to the first segmentation threshold and the data feature distribution information until the obtained defect distribution boundary information of the detection object meets the third preset condition.
  • setting or optimizing the values of detection parameters of the detection formula through reverse derivation based on the defect distribution boundary information and the preset outlier statistical analysis strategy including:
  • the values of the detection parameters of the detection recipe are set or optimized.
  • the preset outlier statistical analysis strategy is a data segmentation method
  • the data distribution density of the detection result data of the detection object is counted as the reverse derivation strategy
  • all detection result data of the detection object are used as the input data information
  • according to the data distribution density of the characteristic data information of all the detection result data in the feature space, the feature space is divided into normal areas, noise areas and true defect areas;
  • the normal area is the area where the data distribution density is greater than the first density threshold;
  • the noise area is the area where the data distribution density is less than or equal to the first density threshold and greater than the second density threshold;
  • the true defect area is the area where the data distribution density is less than or equal to the second density threshold.
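  • The three-way split by density can be written down directly. In the sketch below the comparison rules follow the text; how density is measured is an assumption (here, the number of samples sharing a histogram bin of width `bin_width`), and the function names are hypothetical.

```python
import math
from collections import Counter
from typing import List, Sequence


def bin_densities(values: Sequence[float], bin_width: float) -> List[int]:
    """Assumed density measure: number of samples in the same histogram bin."""
    bins = [math.floor(v / bin_width) for v in values]
    counts = Counter(bins)
    return [counts[b] for b in bins]


def classify_by_density(
    density: float, first_threshold: float, second_threshold: float
) -> str:
    """Apply the normal / noise / true defect rule from the text."""
    if second_threshold >= first_threshold:
        raise ValueError("second_threshold must be below first_threshold")
    if density > first_threshold:
        return "normal"
    if density > second_threshold:
        return "noise"
    return "true_defect"
```

  • Dense regions are treated as normal and the sparsest regions as true defects, which matches the outlier view of defects used throughout the description.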
  • the preset outlier statistical analysis strategy is an outlier statistical analysis strategy based on Gaussian model
  • the Gaussian distribution of the detection result data of the detection object is obtained as the reverse derivation strategy, and Gaussian model detection is used as the detection formula strategy;
  • all detection result data of the detection object are used as the input data information and the defect distribution boundary information is used as the input data information;
  • the parameters of the Gaussian model detection are determined.
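  • One way to picture reverse derivation for a Gaussian model is sketched below. The text does not say which Gaussian parameters are derived, so this example assumes a one-dimensional detection rule of the form `value > mean + k * sigma` and solves for `k` so that the rule reproduces a known defect boundary value. The names `derive_gaussian_params` and `boundary` are hypothetical.

```python
import statistics
from typing import Sequence, Tuple


def derive_gaussian_params(
    values: Sequence[float], boundary: float
) -> Tuple[float, float, float]:
    """Fit mean and sigma to all detection result values, then solve
    boundary = mean + k * sigma for the multiplier k (assumed rule)."""
    mean = statistics.fmean(values)
    sigma = statistics.pstdev(values)
    if sigma == 0:
        raise ValueError("values have zero spread; k is undefined")
    k = (boundary - mean) / sigma
    return mean, sigma, k


def is_defect(value: float, mean: float, sigma: float, k: float) -> bool:
    """Forward detection rule that the derived parameters plug into."""
    return value > mean + k * sigma
```

  • Because `mean`, `sigma` and `k` are obtained together from the same data and boundary, the parameters stay mutually consistent, in contrast to tuning each one separately.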
  • the preset outlier statistical analysis strategy is a machine learning outlier statistical analysis strategy
  • obtaining the density threshold and distance threshold of the detection result data of the detection object is used as the reverse derivation strategy, and the machine learning model is used as the detection formula strategy;
  • the obtained density and distance of the detection result data of the detection object are used as the input data information
  • the density parameters and distance parameters of the detection strategy of the machine learning model are reversely derived.
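  • The text does not name the machine learning model or define its density and distance parameters. Purely as an assumption for illustration, the sketch below treats a point as a defect when it is isolated: few neighbours within a fixed `radius` (density) and a large distance to its nearest neighbour (distance). The two thresholds are then derived from the points already labelled as true defects. All names are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def nearest_neighbor_distance(i: int, points: Sequence[Point]) -> float:
    """Distance from points[i] to its closest other point."""
    return min(math.dist(points[i], q) for j, q in enumerate(points) if j != i)


def neighbor_count(i: int, points: Sequence[Point], radius: float) -> int:
    """Number of other points within radius of points[i]."""
    return sum(
        1 for j, q in enumerate(points)
        if j != i and math.dist(points[i], q) <= radius
    )


def derive_density_distance_params(
    points: Sequence[Point], is_true_defect: Sequence[bool], radius: float
) -> Tuple[int, float]:
    """Derive (density threshold, distance threshold) so that every labelled
    true defect satisfies count <= density and distance >= distance threshold."""
    defect_idx: List[int] = [i for i, flag in enumerate(is_true_defect) if flag]
    if not defect_idx or len(points) < 2:
        raise ValueError("need at least two points and one true defect")
    density_threshold = max(neighbor_count(i, points, radius) for i in defect_idx)
    distance_threshold = min(nearest_neighbor_distance(i, points) for i in defect_idx)
    return density_threshold, distance_threshold
```

  • Both thresholds come from a single pass over the labelled data, so they are set jointly, not one at a time.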
  • the detection recipe setting and optimization method also includes:
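  • Based on the detection formula and the values of the detection parameters of the detection formula, defect analysis of the object to be detected is performed to obtain defect data information of the object to be detected.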
  • The present invention also provides a detection recipe setting and optimization device, which includes:
  • a true defect and noise marking unit configured to mark the first data sample to obtain a second data sample; wherein the first data sample includes several pieces of detection result data; the second data sample includes the detection result data and the label corresponding to each piece of detection result data;
  • a feature distribution information acquisition unit configured to obtain data feature distribution information of the detection object based on the second data sample;
  • a defect distribution boundary acquisition unit configured to use a preset outlier statistical analysis strategy to perform outlier statistical analysis on the data feature distribution information to obtain defect distribution boundary information, and to determine the detection formula according to the preset outlier statistical analysis strategy;
  • the detection parameter setting and optimization unit is configured to determine or optimize the value of the detection parameter of the detection formula through reverse derivation based on the defect distribution boundary information and the preset outlier statistical analysis strategy.
  • the detection recipe setting and optimization device also includes:
  • the detection recipe application unit is configured to perform defect analysis on the object to be detected based on the detection formula and the values of detection parameters of the detection formula, and obtain defect data information of the object to be detected.
  • The present invention also provides an electronic device, including a processor and a memory.
  • A computer program is stored on the memory.
  • When the computer program is executed by the processor, the above-mentioned detection recipe setting and optimization method is implemented.
  • The present invention also provides a readable storage medium.
  • A computer program is stored in the readable storage medium.
  • When the computer program is executed by a processor, the detection recipe setting and optimization method described above is implemented.
  • the detection recipe setting and optimization method, device, electronic equipment and storage medium provided by the present invention have the following advantages:
  • The detection recipe setting and optimization method provided by the present invention first obtains a second data sample by annotating the first data sample, wherein the first data sample includes several pieces of detection result data and the second data sample includes all the detection result data and the label corresponding to each piece of detection result data; it then obtains the data feature distribution information of the detection object according to the second data sample; next, it uses a preset outlier statistical analysis strategy to perform outlier statistical analysis on the data feature distribution information to obtain defect distribution boundary information, and determines the detection formula according to the preset outlier statistical analysis strategy; finally, according to the defect distribution boundary information and the preset outlier statistical analysis strategy, it determines or optimizes the detection parameters of the detection formula through reverse derivation.
  • The first data sample includes several pieces of detection result data, and the detection result data includes auxiliary parameter adjustment information (such as the basic information and characteristic data information of the detection object, where the characteristic data information includes but is not limited to the grayscale, shape, texture and other information of the defects indicated by the detection results).
  • By annotating the first data sample, true defect data and noise data can be distinguished, so historical information can be used effectively. This gives subsequent data analysis and inference an important basis for obtaining accurate prior knowledge, which can improve the detection accuracy of detection formulas.
  • The detection recipe strategy and detection parameter values are obtained through reverse derivation based on the defect distribution boundary information and the preset outlier statistical analysis strategy.
  • Through reverse derivation, the present invention can deduce a set of detection parameters at the same time (that is, adjust all parameters simultaneously), so the coupling relationship between parameters is also taken into account. This realizes rapid modeling of the detection formula and avoids repeated adjustment of parameters, which can significantly save manpower and time costs. Moreover, for defect detection in new processes, users can set or optimize the strategy of the detection formula and the values of its detection parameters without any algorithm background.
  • The detection recipe setting and optimization device, electronic equipment and storage medium provided by the present invention belong to the same inventive concept as the detection recipe setting and optimization method provided by the present invention. They therefore have all the advantages of the detection recipe setting and optimization method, which will not be described in detail here.
  • Figure 1 is a schematic flow chart of a detection recipe setting and optimization method provided by an embodiment of the present invention
  • Figure 2 is a schematic flow chart of a data sample labeling method provided by an embodiment of the present invention.
  • Figure 3 is a schematic diagram of an interface for defect marking of data samples provided by an embodiment of the present invention.
  • Figure 4 is an example diagram showing the distribution of detection result data in a two-dimensional feature space in one specific example of applying the present invention
  • Figure 5 is a schematic diagram of the principle of outlier statistical analysis provided by an embodiment of the present invention.
  • Figure 6 is a schematic diagram of defect distribution boundary information obtained by applying the outlier statistical analysis model provided by the present invention.
  • FIG. 7 is a detailed flow diagram of step S400 in Figure 1;
  • Figure 8 is a specific example diagram of reverse derivation using the detection formula setting and optimization method provided by the present invention.
  • Figure 9 is a schematic diagram of the data density distribution of one of the detection result data provided by an embodiment of the present invention.
  • Figure 10 is a schematic diagram of true defect data distribution within the average gray level range of the standard segmentation axis provided by an embodiment of the present invention.
  • Figure 11(a) is an example of multiple test charts provided by an embodiment of the present invention.
  • Figure 11(b) is an example of the mean graph generated from multiple test images in Figure 11(a);
  • Figure 11(c) is an example of the standard deviation chart generated from multiple test charts in Figure 11(a);
  • Figure 11(d) is an enlarged example of one of the test images
  • Figure 11(e) is a schematic diagram of defect locations detected using machine learning recipes
  • Figure 12 is a schematic diagram of the grayscale dynamic threshold provided by the application of the present invention.
  • Figure 13 is a schematic diagram comparing the detection result data obtained by applying the detection formula setting and optimization method provided by the present invention and the detection result data obtained by the original detection formula;
  • Figure 14 is a structural block diagram of a detection recipe setting and optimization device in an embodiment of the present invention.
  • FIG. 15 is a schematic block structure diagram of an electronic device in an embodiment of the present invention.
  • 100-True defect and noise marking unit, 200-Feature distribution information acquisition unit, 300-Defect distribution boundary acquisition unit, 400-Inspection parameter setting and optimization unit, 500-Inspection recipe application unit;
  • 601-processor, 602-communication interface, 603-memory, 604-communication bus.
  • FIG. 1 schematically provides a flow chart of the detection recipe setting and optimization method provided by an embodiment of the present invention.
  • the detection recipe setting and optimization method includes the following steps:
  • S100 Annotate the first data sample to obtain a second data sample; wherein the first data sample includes several pieces of detection result data; the second data sample includes the detection result data and the label corresponding to each piece of detection result data;
  • S200 According to the second data sample, obtain the data feature distribution information of the detection object;
  • S300 Use a preset outlier statistical analysis strategy to perform outlier statistical analysis on the data feature distribution information, obtain defect distribution boundary information, and determine the detection formula according to the preset outlier statistical analysis strategy;
  • S400 Based on the defect distribution boundary information and the preset outlier statistical analysis strategy, set or optimize the values of the detection parameters of the detection formula through reverse derivation.
  • The first data sample includes several pieces of detection result data, and the detection result data includes a large amount of auxiliary parameter adjustment information (such as the basic information and characteristic data information of the detection object, where the characteristic data information includes but is not limited to the grayscale, shape, texture and other information of the defects indicated by the detection results).
  • By annotating the first data sample, real defect data and noise data can be distinguished, so the historical information can be used effectively. This gives subsequent data analysis and reasoning an important basis for obtaining accurate prior knowledge, which can improve the detection accuracy of detection formulas.
  • The detection recipe strategy and parameter setting values are obtained through reverse derivation based on the defect distribution boundary information and the preset outlier statistical analysis strategy.
  • Through reverse derivation, the present invention can deduce a set of detection parameters at the same time (that is, adjust all parameters simultaneously), so the coupling relationship between parameters is also taken into account. This realizes rapid modeling of the detection process and avoids repeated parameter adjustment, which can significantly save labor and time costs.
  • Users can set or optimize the detection recipe strategy and detection parameter values without any algorithm background.
  • the detection result data is the historical detection result data of the detection object.
  • This detection result data constitutes the first data sample.
  • The detection result data includes all or part of the historical detection data of the detection formula to be optimized.
  • The detection result data described below are historical detection data of wafer defects. Obviously, this is not a limitation of the present invention: the detection recipe setting and optimization method provided by the present invention can also be adapted to other detection formulas for initial detection of wafer defects, which are not listed one by one here.
  • the detection result data includes basic information and characteristic data information of the detection object; wherein the characteristic data information includes position information of the detection result on the detection object, and one or more of the process flow information of the detection object, the grayscale information, the shape information and the texture information of the data information of the detection result.
  • The data information of the detection result must also include conclusion information (defect data or non-defect data) indicating the detection result.
  • Because the detection result data includes the basic information and characteristic data information of the detection object (such as the grayscale, shape, texture and other information of nuisance) together with other auxiliary parameter adjustment information, and both the subsequent drawing of the defect distribution map and the parameter reverse reasoning process are based on the detection result data, the detection formula setting and optimization method provided by the present invention can improve the detection accuracy of the detection formula.
  • Figure 2 schematically shows a schematic flow chart of the data sample annotation method.
  • the first data sample is annotated to obtain the second data sample, including:
  • S120 For each piece of detection result data, obtain the original information corresponding to the detection result data on the detection object based on the basic information of the detection object and the position information of the detection result on the detection object;
  • S130 Based on the original information, determine whether the defect indicated by the data information of the detection result is a true defect. If so, mark the detection result data as true defect data; if not, mark the detection result data as noise data;
  • S140 Obtain the second data sample based on all the detection result data and the tag corresponding to each detection result data.
  • By labeling the first data sample, the detection recipe setting and optimization method provided by the present invention can accurately distinguish the real defect data from the noise data (nuisance, noise interference) in the detection result data (historical data).
  • This provides accurate prior knowledge for the subsequent acquisition of data feature distribution information, from which defect distribution boundary information is obtained for reverse derivation, thereby improving the detection accuracy of the detection formula.
  • The characteristic data information covers all detection results of defect detection on the detection object, including defect data and non-defect data.
  • Take the case where the detection object is a wafer as an example.
  • The first data sample is the historical detection result data of the wafer.
  • the basic information of the Wafer includes the number of the Wafer, the number of Dies (die) contained, and the basic information of each Die; the basic information of the Die includes the Die number and image information of the Die.
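  • The Wafer and Die basic information described above maps naturally onto two small record types. The sketch below is illustrative only: the class and field names are hypothetical, and storing the Die image information as a file path is an assumption of this example.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Die:
    """Basic information of one Die."""
    die_number: str
    image_path: str  # assumed representation of the Die image information


@dataclass
class Wafer:
    """Basic information of one Wafer and the Dies it contains."""
    wafer_number: str
    dies: Dict[str, Die] = field(default_factory=dict)

    @property
    def die_count(self) -> int:
        """Number of Dies contained in the Wafer."""
        return len(self.dies)

    def add_die(self, die: Die) -> None:
        self.dies[die.die_number] = die

    def get_die(self, die_number: str) -> Optional[Die]:
        """Look up a Die by its number; None if the Wafer has no such Die."""
        return self.dies.get(die_number)
```

  • Keying the Dies by Die number keeps the lookup needed during annotation (Die number first, then position on the Die) a simple dictionary access.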
  • the original information corresponding to the detection result data on the detection object is obtained based on the basic information of the detection object and the position information of the defect on the detection object, including:
  • S121 According to the basic information of the Wafer, obtain the Die number of each Die of the Wafer and the basic information of each Die;
  • The data information of the detection result is the data description of the image information of the detection result in the detection result data, and the image information of the detection result is the original image corresponding to that data information on the detection object.
  • In other words, the data information of the detection result is the data expression of the image information of the detection result.
  • For example, for a texture defect, the data information of the detection result records the texture characteristics of the defect, such as the roughness of the texture, and the image information of the detection result is the original image corresponding to the texture defect. Therefore, based on the image information of the detection result, the corresponding detection result data can be re-judged as true defect data or noise data.
  • FIG. 3 schematically illustrates one of the interface diagrams for defect marking of data samples provided by an embodiment of the present invention.
  • the Wafer display window area is used to graphically display the basic information of the Wafer, including but not limited to the position of each Die on the Wafer and the number of the Die.
  • the user can select the Die number to be marked for defects. According to the Die number selected by the user, the historical detection data results of the Die corresponding to the selected Die number will be refreshed in the detection data list window area.
  • the user can select the detection result data one by one, and the original information corresponding to the detection result data (i.e., the defect display area) will be displayed.
  • the image information of the detection result is the image information indicated by the position information of the detection result on the Die). Therefore, according to various characteristics of the original information (texture, size, curvature, shape, etc.), manual re-judgment or machine re-judgment can further confirm whether the defect indicated by the data information of the detection result is a true defect.
  • the piece of detection result data is marked as true defect data (for example, the label of the detection result data in the detection data list window area is marked as a true defect, and the value in the column recording the manual judgment of whether it is a real defect is set to Yes); if not, the detection result data is marked as noise data (for example, the label of the detection result data in the detection data list window area is marked as a false defect, and the value in the column recording the manual judgment of whether it is a real defect is set to No).
  • although the detection recipe setting and optimization method provided by the present invention is explained with a wafer as the detection object, those skilled in the art will understand that this is only a preferred embodiment.
  • the detection object may also be a product other than a wafer, including but not limited to lenses, display screens, 3D printing products, etc., which are not illustrated one by one here.
  • step S200 obtaining the data feature distribution information of the detection object based on the second data sample includes:
  • S210 Determine the characteristic data axis and the segmented data axis, and establish a feature space based on the characteristic data axis and the segmented data axis; wherein the characteristic data axis represents the characteristic data information of the detection result data, and the segmented data axis represents Segmentation feature information; wherein the segmentation feature information includes other feature data information except for the feature data axis;
  • S220 Arrange the second data samples according to the feature space to obtain data feature distribution information of the detection object.
  • the detection recipe setting and optimization method provided by the present invention arranges the second data samples in the feature space so that the distribution of the detection result data in the feature space shows a certain trend, making the distinction between true defect data and noise data more obvious and thereby facilitating the acquisition of defect distribution boundary information.
  • the feature space includes one or more feature data axes and one or more segmentation data axes.
  • the feature space may include multiple feature data axes and multiple segmentation data axes, and the feature space may be a multi-dimensional feature space.
  • in step S220, arranging the second data samples according to the feature space to obtain the data feature distribution information of the detection object includes:
  • S221 Use the feature data axis as the horizontal axis and the segmented data axis as the vertical axis to establish a rectangular coordinate system;
  • S222 In the rectangular coordinate system, arrange the second data samples along the horizontal axis according to the feature value of the feature data information represented by the feature data axis, and along the vertical axis according to the feature value of the segmentation feature information represented by the segmented data axis, to obtain a defect feature distribution map.
  • Figure 4 schematically shows an example diagram of the distribution of detection result data in a two-dimensional feature space of one specific example.
  • the horizontal axis represents the feature data axis and the vertical axis represents the segmented data axis, which together form a two-dimensional data feature distribution map. That is, the abscissa of each point in the coordinate system represents the size of the feature value, and the ordinate represents the size of the corresponding segmentation feature value.
  • the feature values of all detection result data constitute the entire feature distribution map.
  • the feature data axis and the segmentation data axis may be multi-dimensional. That is, multiple segmentation values can be selected for the segmented data axis to divide the detection result data (ie, the second sample data) into several different feature distributions.
  • the present invention does not limit the specific selection method of the feature space.
  • a feature selection algorithm can be used to select the feature data axis and the segmentation data axis so that the feature space is selected automatically; in other embodiments, the feature data axis and the segmentation data axis can also be selected manually, and the present invention does not impose any limitation on this.
  • the feature data axis can represent information such as color, texture, shape, size, etc.
  • the segmentation axis can be information such as a trained mean map.
  • the criteria for selecting the feature space are: the segmented data axis should distinguish different process areas well, and the feature data axis should produce obvious differences between true defect data and noise data (noise points).
  • the ultimate goal is to make the distribution of the detection result data in the feature space show a certain trend, making the distinction between real defects and noise points more obvious.
  • the detection result data of wafer defects if the shape in the feature data information is used as the feature data axis rather than the texture in the feature data information as the feature data axis, the detection result data can be better positioned in the feature space.
  • the shape in the feature data information is used as the feature data axis instead of the texture in the feature data information as the feature data axis. It can be understood that the shape in the feature data information is no longer used as the segmentation data axis.
  • FIG. 5 schematically provides a flow chart of a detection recipe setting and optimization method provided by an embodiment of the present invention. It can be seen from Figure 5 that in step S300, the preset outlier statistical analysis strategy is used to perform outlier statistical analysis on the data feature distribution information to obtain defect distribution boundary information, including:
  • FIG. 6 is a schematic diagram of defect distribution boundary information obtained by applying the outlier statistical analysis model provided by the present invention.
  • feature1 is the segmentation data axis
  • feature2 is the feature data axis.
  • the defect distribution boundary information 3 is a curve. It can be seen that the detection formula setting and optimization method provided by the present invention determines the preset outlier statistical analysis strategy based on the detection result data and the data feature distribution information, and then performs outlier statistical analysis on the data feature distribution information according to the determined strategy to obtain defect distribution boundary information. This allows the defect distribution boundary information to better separate the true defect data 2 from the noise data 1; that is, the boundary reduces over-detection as much as possible without causing missed defects, so that more noise data is filtered out. This ensures that the detection formula subsequently determined by reverse derivation from the defect distribution boundary information causes neither missed detection nor over-detection, thereby improving the defect detection accuracy of the detection process.
  • the defect distribution boundary information obtained may be different. Therefore, the subsequent reverse derivation strategy and the detection formula strategy are closely related to the outlier statistical analysis strategy.
  • the shape of the defect distribution boundary information is completely different from that in Figure 6; please refer to the description below for details, which are not repeated here to avoid redundancy.
  • training the outlier statistical analysis model includes: training the selected outlier statistical analysis model according to the detection result data and the data feature distribution information until the obtained defect distribution boundary information of the detection object satisfies the first preset condition.
  • the outlier analysis statistical model includes but is not limited to statistics-based outlier algorithms (such as the 3σ principle), distance- and proximity-based clustering algorithms (such as K-means), density-based outlier algorithms (such as DBSCAN), and tree-based outlier analysis algorithms (such as isolation forest). It should be noted that the choice of algorithm model is critical: different algorithm models produce outlier boundaries of different shapes. A well-chosen algorithm model allows training on the data set to neither underfit nor overfit.
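  • As an illustration of the statistics-based option named above, the following minimal Python sketch applies the 3σ principle to a one-dimensional list of feature values. The function name three_sigma_outliers, the parameter k and the sample values are assumptions made for this example and do not come from the original text.

```python
from statistics import mean, pstdev


def three_sigma_outliers(values, k=3.0):
    """Return the indices of values lying more than k standard deviations from the mean."""
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation
    if sigma == 0:
        return []  # all values identical, so nothing can be an outlier
    return [i for i, v in enumerate(values) if abs(v - mu) > k * sigma]


# Hypothetical feature values: 40 ordinary results and one far-away result.
feature_values = [10.0] * 20 + [10.5] * 20 + [30.0]
outliers = three_sigma_outliers(feature_values)
print(outliers)
```

  • The boundary produced this way is a pair of fixed limits around the mean; the clustering, density-based and tree-based models listed above would produce boundaries of other shapes.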
  • the outlier analysis statistical model is preferably a statistics-based outlier algorithm (such as the 3σ principle).
  • the outlier analysis statistical model is preferably a distance- and proximity-based clustering algorithm.
  • the purpose of the outlier analysis statistical model is to find the optimal boundary result.
  • the second sample data should be used to train the selected outlier analysis statistical model; through continuous learning and target optimization, the training result finds the optimal inflection point of the segmented data axis and distinguishes true defects from noise data (interference noise points). Therefore, after the training of the outlier analysis statistical model is completed, a boundary result (i.e., defect distribution boundary information) is obtained.
  • defect distribution boundary curve 3 ie, defect distribution boundary information
  • the first preset condition is that the defect distribution boundary information can distinguish the detection result data labeled as true defect data and the detection result data labeled as noise data in the second sample.
  • using the data segmentation method to perform outlier statistical analysis on the data feature distribution information includes: based on the detection result data and the data feature distribution information, obtaining at least one first segmentation threshold on the feature data axis and/or the segmented data axis; and obtaining the defect boundary information according to the first segmentation threshold until the obtained defect distribution boundary information of the detection object satisfies the second preset condition.
  • the data segmentation method includes manually segmenting the feature space to obtain the first segmentation threshold.
  • the present invention is not limited to the specific implementation of the data segmentation method.
  • the first segmentation threshold can also be obtained through a data segmentation algorithm.
  • S321 Determine the first segmentation threshold of the segmented data axis based on the data feature distribution information and the consistency of the data distribution of the detection results labeled as true defect data and labeled as noise data.
  • S322 Determine the second segmentation threshold of the feature data axis based on the data feature distribution information and the consistency of the data distribution of the detection results labeled as true defect data and labeled as noise data;
  • S323 Obtain the defect distribution boundary information of the detection object based on the first segmentation threshold of the segmentation data axis and the second segmentation threshold of the feature data axis.
  • step S321 the data feature distribution information is used as input, and the segmented data axis is segmented in this feature distribution map.
  • the segmentation standard is the consistency of the detection result data distribution: data with a consistent distribution is regarded as one cluster, and the segmentation values between clusters are found so that the data of different processes can be distinguished.
  • the consistent distribution refers to the distribution law of the characteristic data information of the detection result data, including but not limited to the distribution density in the feature space and the relative position relationship of the spatial points, based on which the thresholds of the segmentation axis and the feature axis are determined. For example, in one of the examples, two first segmentation thresholds segment_value1 and segment_value2 are set.
  • a second segmentation threshold is determined for the feature data axis in the feature distribution. Since the defect data points have been marked in the feature distribution, the principle for determining the second segmentation threshold is to separate the noise data from the real defect data as far as possible, so that no detection result data is missed while over-detection is minimized. That is, the second preset condition is preferably that the defect boundary information can separate the true defect data from the noise data.
  • the defect distribution boundary information of the outlier statistical analysis can be obtained.
  • the figure below still takes the two-dimensional feature data distribution as an example to display the manually segmented defect distribution boundary information.
  • the detection result data is segmented using two first segmentation thresholds segment_value1 and segment_value2 on the segmentation axis, and all the detection result data is divided into three different distributions. In each segmentation threshold interval, three different second segmentation thresholds are used on the feature data axis to distinguish true defects from noise data, and the final defect distribution boundary information is obtained.
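  • The data segmentation idea described above (first segmentation thresholds on the segmentation axis, then one second segmentation threshold per interval on the feature data axis) can be sketched as follows. This is only a minimal Python illustration under stated assumptions: the function names, the sample values, and the rule that a larger feature value indicates a true defect are hypothetical, and picking the smallest labeled true-defect value in each interval is just one way to avoid missed detection while limiting over-detection.

```python
from bisect import bisect_right


def interval_index(segment_value, segment_thresholds):
    """Index of the interval on the segmentation axis that contains segment_value."""
    return bisect_right(segment_thresholds, segment_value)


def fit_feature_thresholds(samples, segment_thresholds):
    """Pick one feature-axis threshold per interval from labeled samples.

    samples: list of (feature_value, segment_value, is_true_defect).
    Assumption: a larger feature value means "more defect-like", so the
    threshold is the smallest feature value of any labeled true defect.
    Intervals without true defects keep an infinite threshold.
    """
    thresholds = [float("inf")] * (len(segment_thresholds) + 1)
    for feature_value, segment_value, is_true_defect in samples:
        if is_true_defect:
            i = interval_index(segment_value, segment_thresholds)
            thresholds[i] = min(thresholds[i], feature_value)
    return thresholds


def is_defect(feature_value, segment_value, segment_thresholds, feature_thresholds):
    """Apply the piecewise boundary to one detection result."""
    i = interval_index(segment_value, segment_thresholds)
    return feature_value >= feature_thresholds[i]


# Hypothetical labeled samples: (feature value, segmentation value, label).
samples = [
    (0.2, 10, False), (0.9, 20, True), (0.3, 30, False),
    (0.1, 60, False), (0.6, 70, True), (0.7, 90, True),
    (0.2, 130, False), (0.4, 150, True),
]
segment_thresholds = [50, 120]  # stand-ins for segment_value1 and segment_value2
feature_thresholds = fit_feature_thresholds(samples, segment_thresholds)
predictions = [is_defect(f, s, segment_thresholds, feature_thresholds)
               for f, s, _ in samples]
print(feature_thresholds, predictions)
```

  • With two first segmentation thresholds the samples fall into three intervals, and each interval receives its own second segmentation threshold, which matches the three-distribution layout described in the text.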
  • the defect distribution boundary information includes two straight lines parallel to the feature data axis feature1 formed by the two first segmentation thresholds segment_value1 and segment_value2, and are respectively located on the feature data axis feature1 and the first segmentation thresholds segment_value1 and segment_value2.
  • the use of a preset outlier statistical analysis strategy further includes: an outlier statistical analysis strategy that combines data segmentation and model learning.
  • the outlier statistical analysis strategy that combines data segmentation and model learning includes: obtaining, based on the data feature distribution information, at least one first segmentation threshold on the segmentation data axis for the detection result data labeled as a true defect; and training the selected outlier statistical analysis model according to the first segmentation threshold and the data feature distribution information until the obtained defect distribution boundary information of the detection object meets the third preset condition.
  • the detection recipe setting and optimization method provided by the present invention can further reduce the uncertainty of machine learning model training through an outlier statistical analysis strategy that combines data segmentation and model learning when obtaining outlier distribution boundary information.
  • the input of the machine learning model has certain constraints, and the results of manual segmentation are used as constraints, which can further improve the efficiency of obtaining defect boundary distribution information.
  • the third preset condition is preferably to ensure that no detection result data is missed while over-detection is minimized. That is, the third preset condition is preferably that the defect boundary information can separate the true defect data from the noise data, or that the number of training iterations of the outlier statistical analysis model reaches a preset value.
  • the defect distribution boundary information obtained by using the outlier statistical analysis strategy that combines data segmentation and model learning is different from the defect distribution boundary information obtained by the above-mentioned data segmentation method.
  • the defect distribution boundary information obtained by the outlier statistical analysis strategy that combines data segmentation and model learning includes two straight lines parallel to the feature data axis feature1 formed by the two first segmentation thresholds segment_value1 and segment_value2, and closed curves surrounding the true defect data, located respectively in the three intervals formed by the feature data axis feature1 and the first segmentation thresholds segment_value1 and segment_value2. Because different outlier statistical analysis strategies are used, the defect boundary distribution information obtained is completely different.
  • the defect boundary distribution information obtained can in every case accurately distinguish true defect data from noise data in the detection result data. As mentioned above, on this basis the present invention does not limit the specific implementation of the outlier statistical analysis strategy.
  • step S400 based on the defect distribution boundary information and the preset outlier statistical analysis strategy, through reverse derivation, the values of the detection parameters for setting or optimizing the detection formula are determined, including:
  • S410 Determine the reverse derivation strategy according to the preset outlier statistical analysis strategy
  • S430 Determine the data distribution model of the detection result data according to the input data information
  • S440 Determine the detection parameters of the detection formula according to the data distribution model and the defect distribution boundary information
  • S450 Set or optimize the value of the detection parameter of the detection recipe according to the strategy of the detection recipe and the input data information of the reverse derivation.
  • the detection recipe setting and optimization method provided by the present invention uses reverse derivation to determine the detection recipe strategy, and reversely infers all parameter settings of the detection recipe (key parameters such as data density, data sparsity distance and/or tolerance range) based on the defect boundary distribution information. The coupling relationship between the parameters of the detection process is also taken into account, thereby avoiding repeated parameter adjustment; and because the parameter adjustment is based on the user's annotation results, a relatively accurate set of parameters can be deduced automatically without the user having prior knowledge.
  • the parameters of the detection process are adjusted to the optimal level at one time, which not only improves the efficiency of parameter adjustment in the detection process, but also improves the detection accuracy of the detection formula.
  • FIG. 8 schematically shows a specific example of reverse derivation using the detection recipe setting and optimization method provided by the present invention.
  • the outlier statistical analysis strategy, the reverse derivation strategy and the parameter setting values of the detection process are closely related; that is, the reverse derivation strategy and the detection recipe strategy are consistent with the core of the outlier statistical analysis strategy used to obtain the defect boundary distribution information.
  • if the outlier segmentation method is used as the outlier statistical analysis strategy, the basic principles of the reverse derivation strategy and the detection recipe strategy should also be consistent with the basic principles of the outlier segmentation method.
  • the following takes the data segmentation method, the outlier statistical analysis strategy based on the Gaussian model, and the machine learning outlier statistical analysis strategy as examples to explain in detail the process of reverse derivation that yields the parameter setting values of the detection formula.
  • FIG. 9 schematically provides a schematic diagram of the data density distribution of one of the detection result data provided in an implementation manner of this embodiment.
  • the basic idea of this method is to define the area where the density of the detection result data points (the characteristic value of the detection result data) in the feature distribution diagram is greater than the first threshold as a normal area, that is, the normal area is expressed as the sum of the data density related functions. Therefore, all data points (feature values of detection result data) whose data density data_density is greater than the first threshold are normal, and then data density data_density is one of the detection parameters that requires reverse inference.
  • an area where the data density is less than or equal to the first threshold and greater than the second threshold is defined as a nuisance area.
  • boundary_threshold is the defect distribution boundary result obtained by the outlier statistical analysis algorithm
  • defect_threshold is the function related to the defect distribution boundary boundary_threshold
  • displacement parameter offset_parameter can be calculated using defect_threshold and nuisance_threshold.
  • the preset outlier statistical analysis strategy is the data segmentation method
  • the displacement parameters of the detection formula are obtained through the following steps:
  • Step A1 According to the data segmentation method, count the data distribution density of the detection result data of the detection object as the reverse derivation strategy.
  • Step A2 According to the reverse derivation strategy of the statistical data distribution density, use all detection result data of the detection object as the input data information.
  • Step A3 Based on all the detection result data, it is assumed that the feature space is divided, according to the data distribution density of the characteristic data information of all the detection result data, into a normal area, a noise area and a true defect area; the normal area is the area where the data distribution density is greater than the first density threshold, the noise area is the area where the data density is less than or equal to the first density threshold and greater than the second density threshold, and the true defect area is the area where the data density is less than or equal to the second density threshold.
  • Step A4 Calculate the first density threshold and the second density threshold according to all detection result data and the labels of all detection result data; wherein the first density threshold is greater than the second density threshold;
  • Step A5 Calculate the displacement parameter of the detection formula according to the first density threshold, the second density threshold and the defect distribution boundary information.
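  • Steps A1 to A4 can be illustrated with the short Python sketch below. It is not the method of the original text: the neighbor-counting density estimate, the radius, the function names, the sample points and the extra "normal" label are all assumptions introduced for the example. The formula for the displacement parameter in Step A5 is not reproduced in the text, so the sketch stops at the two density thresholds.

```python
from math import dist


def local_density(points, radius):
    """Count, for each point, the other points lying within `radius` of it."""
    return [
        sum(1 for j, q in enumerate(points) if j != i and dist(p, q) <= radius)
        for i, p in enumerate(points)
    ]


def density_thresholds(densities, labels):
    """Derive (first, second) density thresholds from labeled points.

    Assumption: the first threshold is the highest density seen among noise
    points and the second is the highest density seen among true defects,
    so every labeled point falls in the region named by its label.
    """
    first = max(d for d, lab in zip(densities, labels) if lab == "noise")
    second = max(d for d, lab in zip(densities, labels) if lab == "true_defect")
    if first <= second:
        raise ValueError("labels cannot be separated by density alone")
    return first, second


def region(density, first_threshold, second_threshold):
    """Map a density value to the normal, noise or true defect area."""
    if density > first_threshold:
        return "normal"
    if density > second_threshold:
        return "noise"
    return "true_defect"


# Hypothetical points in a two-dimensional feature space.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (0.05, 0.05),
          (2.0, 2.0), (2.2, 2.0), (2.0, 2.2), (5.0, 5.0)]
labels = ["normal"] * 5 + ["noise"] * 3 + ["true_defect"]

densities = local_density(points, radius=0.5)
first_threshold, second_threshold = density_thresholds(densities, labels)
regions = [region(d, first_threshold, second_threshold) for d in densities]
print(first_threshold, second_threshold, regions)
```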
  • the data segmentation method is used to obtain defect boundary distribution information, and reverse derivation is performed to obtain the parameter setting values of the inspection process.
  • FIG. 10 is a schematic diagram of the true defect distribution within the average gray level range of the standard segmentation axis, provided by an implementation of this embodiment.
  • Figure 11(a) is an example of multiple test charts provided by an embodiment of the present invention; Figure 11(b) is an example of the average value chart generated from the multiple test charts in Figure 11(a); Figure 11(c) is an example of the standard deviation chart generated from the multiple test charts in Figure 11(a).
  • Figure 12 is a schematic diagram of the grayscale dynamic threshold provided by the present invention. In the figure, pixel A is a pixel in the test image, and pixels A1 and A2 are the corresponding pixels of pixel A in the mean map and the standard deviation map, respectively.
  • Select samples (as shown in Figure 11(a)), and obtain the average value chart and the standard deviation chart from statistics (training) over N test charts.
  • feature1 is the feature data axis in Figure 10
  • test is the gray value of the test image
  • mean is the gray value of the average image obtained by statistics of N test images.
  • defect_threshold = mean ± (sigma × std + gray) (7)
  • mean is the gray value of the average image obtained by statistics of N test images
  • std is the standard deviation corresponding to one of the pixels in the test image
  • sigma is the coefficient of the standard deviation
  • gray is the dynamic threshold.
  • the dynamic threshold gray is equivalent to the displacement parameter offset_parameter mentioned above, which can be defined as any curve.
  • pixels greater than the above threshold defect_threshold are normal points, and pixels less than or equal to the threshold defect_threshold are defective points.
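  • Formula (7) can be evaluated per pixel as in the minimal Python sketch below. Images are represented as flat lists of gray values, gray is taken as a single constant even though the text allows any curve, and the function names and sample values are assumptions made for the example. The sketch only reports which pixels lie outside the band mean ± (sigma × std + gray); how that comparison maps to normal and defective points follows the rule stated in the text.

```python
from statistics import mean, pstdev


def train_maps(test_images):
    """Per-pixel mean and standard deviation over N same-size images."""
    columns = list(zip(*test_images))  # one tuple of N gray values per pixel
    return [mean(c) for c in columns], [pstdev(c) for c in columns]


def threshold_band(mean_map, std_map, sigma, gray):
    """Lower and upper limits of formula (7) for every pixel."""
    lower = [m - (sigma * s + gray) for m, s in zip(mean_map, std_map)]
    upper = [m + (sigma * s + gray) for m, s in zip(mean_map, std_map)]
    return lower, upper


def outside_band(test_image, lower, upper):
    """Indices of pixels whose gray value falls outside the band."""
    return [i for i, v in enumerate(test_image) if v < lower[i] or v > upper[i]]


# Three hypothetical 3-pixel training images and one image to inspect.
training_images = [[100, 100, 50], [102, 98, 50], [98, 102, 50]]
mean_map, std_map = train_maps(training_images)
lower, upper = threshold_band(mean_map, std_map, sigma=3, gray=5)
flagged = outside_band([101, 120, 58], lower, upper)
print(mean_map, flagged)
```

  • Because std enters the band width, pixels that vary strongly across the training images tolerate larger deviations, while gray adds a fixed margin that applies even where std is zero.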
  • Figure 11(d) is an enlarged example of one of the test images
  • Figure 11(e) is a schematic diagram of the defect locations detected using a machine learning algorithm. Comparing Figure 11(d) with Figure 11(e) shows that the detection formula obtained with the detection formula setting and optimization method provided by the present invention can accurately detect the true defects of the object to be detected.
  • the core ideas of using the outlier statistical analysis strategy based on the Gaussian model to reversely derive the new data process and the parameter setting values are explained first.
  • the basic principle of this method is to assume that the distribution of all data points (detection result data) in the feature distribution map obeys Gaussian distribution.
  • the parameters such as mean, variance and variance coefficient that are needed in the detection model (the strategy of the detection process) are reversely inferred to obtain the relevant parameters required for Gaussian model detection.
  • the Gaussian model-based outlier statistical analysis strategy to reversely derive new data processes and parameter settings includes the following steps:
  • Step B1 The preset outlier statistical analysis strategy is an outlier statistical analysis strategy based on Gaussian model
  • Step B2 According to the outlier statistical analysis strategy based on the Gaussian model, use the Gaussian distribution of the detection result data of the detection object as the reverse derivation strategy, and use Gaussian model detection as the detection formula strategy;
  • Step B3 According to the reverse derivation strategy of statistical Gaussian distribution, use all detection result data of the detection object and the defect distribution boundary information as the input data information;
  • Step B4 Based on all the detection result data, it is assumed that the data distribution density of the feature values of all the feature data information of the detection result data in the feature space obeys Gaussian distribution;
  • Step B5 Determine the parameters of the Gaussian model detection based on the input data information and the defect distribution boundary information.
  • boundary_threshold is the defect boundary distribution result obtained by the outlier algorithm.
  • This boundary matrix boundary_threshold can already be obtained.
  • the mean μ can be obtained from the detection result data, by calculating the average gray level of the current detection data image.
  • the variance σ is calculated by taking the differences between the gray values of the pixels of the image to be detected and the mean μ, summing their squares, and then averaging.
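  • The mean and variance computation described above can be written out as the following minimal Python sketch. The second function shows one possible way to reverse-derive a coefficient from a known boundary value, namely choosing k so that the mean plus k standard deviations lands on the boundary; this relation, the function names and the sample values are assumptions for illustration, since the corresponding formula is not reproduced in the text.

```python
from math import sqrt


def gaussian_parameters(gray_values):
    """Mean and variance of the pixel gray values of the image to be detected."""
    mu = sum(gray_values) / len(gray_values)
    variance = sum((g - mu) ** 2 for g in gray_values) / len(gray_values)
    return mu, variance


def sigma_coefficient(boundary_value, mu, variance):
    """Assumed relation: boundary_value = mu + k * sqrt(variance); solve for k."""
    return abs(boundary_value - mu) / sqrt(variance)


# Hypothetical gray values and a hypothetical boundary value on the gray axis.
gray_values = [98, 102, 98, 102]
mu, variance = gaussian_parameters(gray_values)
k = sigma_coefficient(106, mu, variance)
print(mu, variance, k)
```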
  • the outlier statistical analysis strategy based on machine learning to reversely derive new data processes and parameter setting values includes the following steps:
  • Step C1 The preset outlier statistical analysis strategy is a machine learning outlier statistical analysis strategy
  • Step C2 According to the outlier statistical analysis strategy of machine learning, the density threshold and distance threshold for obtaining the detection result data of the detection object are used as the reverse derivation strategy, and the machine learning model is used as the detection formula strategy;
  • Step C3 According to the reverse derivation strategy of obtaining the density threshold and distance threshold of the detection result data of the detection object, use the obtained density and distance of the detection result data of the detection object as the input data information;
  • Step C4 Based on all detection result data and the defect boundary distribution information, reversely derive the density parameters and distance parameters of the detection strategy of the machine learning model.
  • the determination of the parameters of the outlier statistical analysis algorithm based on machine learning directly affects the detection accuracy, for example the initial clustering centers in the k-means algorithm or the neighborhood and number threshold in the DBSCAN algorithm. Therefore, by performing reverse reasoning on these machine learning parameters through the defect boundary distribution information (results) in outlier statistical analysis, a machine learning model with prior knowledge can be obtained, thereby improving the accuracy of model detection.
  • boundary_threshold is the defect boundary distribution information obtained by the outlier algorithm, which is related to the detection result data and has been obtained in the defect boundary analysis process.
  • the two important parameters of the clustering algorithm based on distance and density are the density density_parameters and the distance distance_parameters, both of which are derived from the detection result data and the boundary matrix. By inverting the distance and density parameters, the defects lie exactly at the preset threshold and can be detected, while normal pixels lie within a threshold range of larger density and are filtered out, thereby improving detection accuracy.
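  • One way to picture the reverse derivation of a distance parameter and a density parameter is the small search below. It is a sketch under assumptions, not the procedure of the original text: the candidate distances, the function names and the sample points are hypothetical, and the pair returned is simply the first candidate distance for which every labeled defect has fewer neighbors than every other point. The labeled data must contain at least one defect and one non-defect point.

```python
from math import dist


def neighbor_counts(points, eps):
    """Number of other points within distance eps of each point."""
    return [
        sum(1 for j, q in enumerate(points) if j != i and dist(p, q) <= eps)
        for i, p in enumerate(points)
    ]


def derive_parameters(points, is_defect, eps_candidates):
    """Return (eps, min_neighbors) separating labeled defects, or None.

    A point is treated as normal when it has at least min_neighbors
    neighbors within eps; labeled defects must all have fewer.
    """
    for eps in sorted(eps_candidates):
        counts = neighbor_counts(points, eps)
        defect_max = max(c for c, d in zip(counts, is_defect) if d)
        normal_min = min(c for c, d in zip(counts, is_defect) if not d)
        if normal_min > defect_max:
            return eps, defect_max + 1
    return None


# Hypothetical feature-space points: a dense normal group and two sparse defects.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (3.0, 3.0), (3.3, 3.0)]
is_defect = [False, False, False, False, True, True]
parameters = derive_parameters(points, is_defect, [0.05, 0.2, 0.5])
print(parameters)
```

  • The returned pair plays the role of the neighborhood and number threshold mentioned for DBSCAN-style algorithms: the sparse points fall below the count and are kept as defects, while the dense points are filtered out.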
  • the detection recipe setting and optimization method also includes:
  • S500 Perform defect analysis on the object to be detected according to the detection formula and the values of the detection parameters of the detection formula, and obtain defect data information of the object to be detected.
  • Figure 13 schematically shows a comparison of the detection result data obtained by the detection process using the detection recipe setting and optimization method proposed by the present invention with the detection result data obtained by the original detection process. It can be seen from Figure 13 that, by applying the strategy and parameter setting values of the detection process obtained by the reverse derivation of the present invention, the nuisance noise data is filtered out and the true defect data is retained, and the correctness of the results can be checked visually from the distribution of the detection result data in the feature space.
  • The first data sample includes several pieces of detection result data, and the detection result data contains a large amount of information that assists parameter adjustment. Data annotation provides an important basis for subsequently making effective use of this historical information in data analysis and reasoning to obtain accurate prior knowledge, and can improve the detection accuracy of the detection recipe.
  • The detection recipe strategy and parameter setting values are obtained through reverse derivation based on the defect distribution boundary information and the preset outlier statistical analysis strategy. Therefore, the present invention can infer a whole set of detection parameters at once (adjusting all parameters at the same time) through reverse derivation.
  • The coupling relationship between parameters is also taken into account, which realizes rapid modeling of the detection process and avoids repeated adjustment of parameters, and can therefore significantly save labor and time costs. Moreover, for defect detection in a new process, users can determine the strategy and parameter setting values of the detection process without an algorithm background.
  • FIG. 14 is a schematic structural block diagram of the detection recipe setting and optimization device provided by this embodiment.
  • The detection recipe setting and optimization device provided by this embodiment includes: a true defect and noise marking unit 100, a feature distribution information acquisition unit 200, a defect distribution boundary acquisition unit 300, and a detection parameter setting and optimization unit 400.
  • The true defect and noise marking unit 100 is configured to mark a first data sample to obtain a second data sample; the first data sample includes several pieces of detection result data, and the second data sample includes the detection result data and the label corresponding to each piece of the detection result data.
  • The feature distribution information acquisition unit 200 is configured to obtain data feature distribution information of the detection object based on the second data sample.
  • The defect distribution boundary acquisition unit 300 is configured to use a preset outlier statistical analysis strategy to perform outlier statistical analysis on the data feature distribution information, obtain defect distribution boundary information, and determine the detection recipe according to the preset outlier statistical analysis strategy.
  • The detection parameter setting and optimization unit 400 is configured to set or optimize the values of the detection parameters of the detection recipe through reverse derivation based on the defect distribution boundary information and the preset outlier statistical analysis strategy.
  • The detection recipe setting and optimization device further includes a detection recipe application unit 500.
  • The detection recipe application unit 500 is configured to perform defect analysis on the object to be detected according to the detection recipe and the values of the detection parameters of the detection recipe, and obtain defect data information of the object to be detected.
  • Since the basic principle of the detection recipe setting and optimization device provided by the present invention is similar to that of the detection recipe setting and optimization method provided by the above embodiments, the implementation of the device is described only briefly to avoid redundancy; for details, refer to the detailed description of the method above. Furthermore, since the device and the method belong to the same inventive concept, the device has at least the same beneficial effects as the method; refer to the relevant content of the method above, which is not repeated here.
  • the present invention also provides an electronic device.
  • FIG. 15 schematically shows a block structure diagram of the electronic device provided by an embodiment of the present invention.
  • the electronic device includes a processor 601 and a memory 603.
  • a computer program is stored on the memory 603.
  • When the computer program is executed by the processor 601, the detection recipe setting and optimization method described above is implemented. Since the electronic device provided by the present invention and the detection recipe setting and optimization method described above belong to the same inventive concept, the electronic device has all the advantages of the method, which are not repeated here.
  • The electronic device also includes a communication interface 602 and a communication bus 604, wherein the processor 601, the communication interface 602 and the memory 603 communicate with each other through the communication bus 604.
  • the communication bus 604 may be a Peripheral Component Interconnect (PCI) bus or an Extended Industry Standard Architecture (EISA) bus, etc.
  • the communication bus 604 can be divided into an address bus, a data bus, a control bus, etc. For ease of presentation, only one thick line is used in the figure, but it does not mean that there is only one bus or one type of bus.
  • the communication interface 602 is used for communication between the above-mentioned electronic device and other devices.
  • The processor 601 referred to in the present invention can be a central processing unit (CPU), or another general-purpose processor, a digital signal processor (DSP), an application specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, etc.
  • The general-purpose processor may be a microprocessor, or the processor may be any conventional processor.
  • the processor 601 is the control center of the electronic device and uses various interfaces and lines to connect various parts of the entire electronic device.
  • the memory 603 can be used to store the computer program.
  • The processor 601 implements the various functions of the electronic device by running or executing the computer program stored in the memory 603 and calling the data stored in the memory 603.
  • the memory 603 may include non-volatile and/or volatile memory.
  • Non-volatile memory may include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory.
  • Volatile memory may include random access memory (RAM) or external cache memory.
  • RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).
  • the present invention also provides a readable storage medium.
  • a computer program is stored in the readable storage medium.
  • When the computer program is executed by a processor, the above-mentioned detection recipe setting and optimization method can be implemented. Since the readable storage medium provided by the present invention and the detection recipe setting and optimization method described above belong to the same inventive concept, the readable storage medium has all the advantages of the method, which are not repeated here.
  • the readable storage medium in the embodiment of the present invention may be any combination of one or more computer-readable media.
  • the readable medium may be a computer-readable signal medium or a computer-readable storage medium.
  • The computer-readable storage medium may be, for example, but is not limited to, an electrical, magnetic, optical, electromagnetic, infrared or semiconductor system, apparatus or device, or any combination thereof.
  • a computer-readable storage medium may be any tangible medium that contains or stores a program for use by or in combination with an instruction execution system, apparatus, or device.
  • a computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave carrying computer-readable program code therein. Such propagated data signals may take many forms, including but not limited to electromagnetic signals, optical signals, or any suitable combination of the above.
  • A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium that can send, propagate, or transmit a program for use by or in connection with an instruction execution system, apparatus, or device.
  • Computer program code for performing the operations of the present invention may be written in one or more programming languages, or a combination thereof, including object-oriented programming languages such as Java, Smalltalk and C++, and conventional procedural programming languages such as "C" or similar programming languages.
  • The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server.
  • The remote computer can be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or it can be connected to an external computer (for example, through the Internet using an Internet service provider).
  • The detection recipe setting and optimization method, device, electronic device and storage medium provided by the present invention have the following advantages: the first data sample includes several pieces of detection result data, and the detection result data includes information that assists parameter adjustment (such as the basic information and feature data information of the detection object, where the feature data information includes but is not limited to the grayscale, shape and texture of the defects indicated by the detection results). Data annotation distinguishes true defect data from nuisance data, which provides an important basis for subsequently making effective use of historical information in data analysis and reasoning to obtain accurate prior knowledge, and can improve the detection accuracy of the detection recipe.
  • The detection recipe strategy and detection parameter values are obtained through reverse derivation based on the defect distribution boundary information and the preset outlier statistical analysis strategy. Therefore, the present invention can infer a whole set of detection parameters at once (adjusting all parameters at the same time) through reverse derivation.
  • The coupling relationship between parameters is also taken into account, which realizes rapid modeling of the detection recipe and avoids repeated adjustment of parameters, and can therefore significantly save labor and time costs. Moreover, for defect detection in a new process, the user can determine the strategy of the detection recipe and the values of its detection parameters without an algorithm background.
  • Each block in the flowchart or block diagrams may represent a module, program segment, or portion of code, which contains one or more executable instructions for implementing the specified logical function.
  • The functions noted in the block may occur out of the order noted in the figures.
  • Each block in the block diagram and/or flowchart illustration, and combinations of blocks in the block diagram and/or flowchart illustration, can be implemented by a dedicated hardware-based system that performs the specified functions or acts, or by a combination of dedicated hardware and computer instructions.
  • each functional module in each embodiment of this article can be integrated together to form an independent part, each module can exist alone, or two or more modules can be integrated to form an independent part.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Computing Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Medical Informatics (AREA)
  • General Health & Medical Sciences (AREA)
  • Mathematical Physics (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Health & Medical Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Multimedia (AREA)
  • Computational Linguistics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Quality & Reliability (AREA)
  • Analysing Materials By The Use Of Radiation (AREA)
  • Testing Or Measuring Of Semiconductors Or The Like (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

Provided in the present invention is a detection formula configuration and optimization method and apparatus, an electronic device and a storage medium. The method comprises: labeling a first data sample to obtain a second data sample; wherein the first data sample comprises a plurality of pieces of detection result data, and the second data sample comprises the detection result data and a label corresponding to each piece of data; according to the second data sample, obtaining data feature distribution information of a detection object; using a preset outlier statistical analysis strategy, performing outlier statistical analysis on the data feature distribution information, so as to obtain defect distribution boundary information and determine a detection formula; finally, according to the defect distribution boundary information and the preset outlier statistical analysis strategy, determining or optimizing the values of detection parameters of the detection formula by means of reverse derivation. In the invention, the coupling relationship between the parameters is considered, such that repeated adjustment of parameters can be avoided, and meanwhile, a whole set of detection parameters are inferred, so that rapid modeling of the detection formula is achieved; and manpower and time costs can be saved.

Description

Detection recipe setting and optimization method, device, electronic device and storage medium
Technical field
The invention relates to the field of semiconductor technology, and in particular to a detection recipe setting and optimization method, device, electronic device and storage medium.
Background art
In semiconductor wafer manufacturing, wafer warpage (bow) and wafer surface topography are key parameters that affect process stability and product yield, and have a critical influence on wafer yield. For example, after processes such as etching or thin-film deposition, a wafer may warp to varying degrees or its surface may become uneven; as another example, a robot arm may scratch the wafer during semiconductor integrated circuit manufacturing. Wafer defects are therefore what all chip manufacturers pay most attention to in yield inspection. Once a wafer has a defect, it is difficult to remedy in subsequent processes, so it is crucial to detect wafer surface defects quickly and accurately and to avoid wasting production resources by letting defective products flow into the next process.
In the prior art, the wafer defect detection process is usually tuned in the forward direction. However, because on-site processes are diverse, a large amount of information has to be generated each time, and prior knowledge is lacking, so the detection parameters of the detection process are usually adjusted one by one. Because the coupling relationship between parameters cannot be taken into account, repeatedly adjusting a single parameter may bias the tuning result; to achieve a good detection result, the parameters of the detection recipe have to be adjusted repeatedly, which increases labor and time costs. Moreover, because of process diversity, existing detection recipes are difficult to apply to defect detection in a new process, and adjusting the parameters of a detection recipe requires some algorithm background, which places high demands on users.
It should be noted that the information disclosed in this background section is intended only to deepen understanding of the general background of the invention, and should not be regarded as an acknowledgement or any form of suggestion that this information constitutes prior art already known to those skilled in the art.
Summary of the invention
In view of the defects in the prior art, the purpose of the present invention is to provide a detection recipe setting and optimization method, system, electronic device and storage medium. The detection recipe setting and optimization method provided by the invention is based on prior knowledge of the detection result data and fully considers the coupling relationship between parameters to determine the strategy and parameter setting values of the detection recipe in one pass, which not only makes determining the detection process efficient but also improves the detection accuracy of the detection recipe.
To achieve the above purpose, the present invention provides a detection recipe setting and optimization method, including:
Labeling a first data sample to obtain a second data sample; wherein the first data sample includes several pieces of detection result data, and the second data sample includes the detection result data and the label corresponding to each piece of the detection result data;
Obtaining data feature distribution information of the detection object according to the second data sample;
Using a preset outlier statistical analysis strategy to perform outlier statistical analysis on the data feature distribution information, obtaining defect distribution boundary information, and determining the detection recipe according to the preset outlier statistical analysis strategy;
Setting or optimizing the values of the detection parameters of the detection recipe through reverse derivation, according to the defect distribution boundary information and the preset outlier statistical analysis strategy.
Optionally, the detection result data includes basic information and feature data information of the detection object; wherein the feature data information includes position information of the detection result on the detection object, and one or more of: process flow information of the detection object, and grayscale information, shape information and texture information of the data information of the detection result;
The labeling of the first data sample to obtain the second data sample includes:
Obtaining the basic information of the detection object corresponding to each piece of detection result data in the first data sample;
For each piece of detection result data, obtaining the original information corresponding to that piece of detection result data on the detection object, according to the basic information of the detection object and the position information of the detection result on the detection object;
Judging, according to the original information, whether the defect marked by the data information of the detection result is a true defect; if so, labeling that piece of detection result data as true defect data; if not, labeling that piece of detection result data as nuisance data;
Obtaining the second data sample according to all the detection result data and the label corresponding to each piece of the detection result data.
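The labeling steps above can be sketched as follows. This is a minimal illustration, not the patented implementation: the `DetectionResult` fields, the label strings, and the `review` callback (standing in for the judgment made on the original information) are all assumptions introduced for the example.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class DetectionResult:
    """One piece of detection result data (hypothetical fields)."""
    wafer_id: str
    die_id: int
    x: float      # position of the detection result on the die
    y: float
    gray: float   # example feature value


@dataclass
class LabeledResult:
    result: DetectionResult
    label: str    # "true_defect" or "nuisance"


def label_samples(
    first_sample: List[DetectionResult],
    is_true_defect: Callable[[DetectionResult], bool],
) -> List[LabeledResult]:
    """Build the second data sample: every result plus its label."""
    return [
        LabeledResult(r, "true_defect" if is_true_defect(r) else "nuisance")
        for r in first_sample
    ]


def review(r: DetectionResult) -> bool:
    # Stand-in for the review of the original information (assumption).
    return r.gray > 100.0


first_sample = [
    DetectionResult("W01", 3, 12.0, 7.5, 180.0),
    DetectionResult("W01", 5, 40.2, 9.1, 35.0),
]
second_sample = label_samples(first_sample, review)
print([s.label for s in second_sample])
```

Keeping the judgment in a callback means the same labeling routine could serve a manual review or an automated check; which one the method uses is not fixed by the text.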
Optionally, the detection object includes a Wafer; the basic information of the Wafer includes the number of the Wafer, the number of Dies it contains, and the basic information of each Die; the basic information of a Die includes the Die number and image information of that Die;
The obtaining of the original information corresponding to that piece of detection result data on the detection object, according to the basic information of the detection object and the position information of the detection result on the detection object, includes:
Obtaining the Die number of each Die of the Wafer and the basic information of each Die, according to the basic information of the Wafer;
Obtaining the image information of the detection result corresponding to that piece of detection result data on the Die, according to the position information of the detection result on the Die and the image information of the Die.
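One way to picture the lookup above is cropping a small patch of the Die image around the reported position. The sketch below assumes the Die image is a 2D list of gray values and that the position is given as a (row, column) pixel index; `crop_patch`, `die_images` and the patch size are hypothetical names and choices, not taken from the source.

```python
from typing import List

Image = List[List[int]]


def crop_patch(die_image: Image, row: int, col: int, half: int) -> Image:
    """Return the square patch around (row, col), clipped to the image."""
    top = max(0, row - half)
    bottom = min(len(die_image), row + half + 1)
    left = max(0, col - half)
    right = min(len(die_image[0]), col + half + 1)
    return [r[left:right] for r in die_image[top:bottom]]


# Hypothetical wafer record: Die number -> Die image (5x5 gray values).
die_images = {7: [[r * 5 + c for c in range(5)] for r in range(5)]}

# Image information of the detection result reported at (2, 2) on Die 7.
patch = crop_patch(die_images[7], row=2, col=2, half=1)
print(patch)
```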
Optionally, the obtaining of the data feature distribution information of the detection object according to the second data sample includes:
Determining a feature data axis and a segmentation data axis, and establishing a feature space according to the feature data axis and the segmentation data axis; wherein the feature data axis represents the feature data information of the detection result data, and the segmentation data axis represents segmentation feature information; the segmentation feature information includes feature data information other than that used for the feature data axis;
Arranging the second data sample according to the feature space to obtain the data feature distribution information of the detection object.
Optionally, the feature space includes one or more feature data axes and one or more segmentation data axes.
Optionally, the arranging of the second data sample according to the feature space to obtain the data feature distribution information of the detection object includes:
Establishing a rectangular coordinate system with the feature data axis as the horizontal axis and the segmentation data axis as the vertical axis;
In the rectangular coordinate system, arranging the second data sample along the horizontal axis by the magnitude of the feature value of the feature data information represented by the feature data axis, and along the vertical axis by the magnitude of the feature value of the feature data information represented by the segmentation data axis, to obtain a defect feature distribution map.
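As a rough illustration of the arrangement described above, each labeled detection result can be mapped to a point whose horizontal coordinate is the value on the feature data axis and whose vertical coordinate is the value on the segmentation data axis. The dictionary keys `gray` and `area` are hypothetical feature names chosen for the example.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float, str]  # (feature value, segmentation value, label)


def arrange_in_feature_space(
    samples: List[Dict],
    feature_axis: str,
    segmentation_axis: str,
) -> List[Point]:
    """Place each labeled sample in the 2D feature space, ordered by value."""
    points = [
        (s[feature_axis], s[segmentation_axis], s["label"]) for s in samples
    ]
    # Order along the horizontal axis first, then along the vertical axis.
    return sorted(points, key=lambda p: (p[0], p[1]))


samples = [
    {"gray": 180.0, "area": 4.0, "label": "true_defect"},
    {"gray": 35.0, "area": 9.0, "label": "nuisance"},
    {"gray": 35.0, "area": 2.0, "label": "nuisance"},
]
distribution = arrange_in_feature_space(samples, "gray", "area")
for point in distribution:
    print(point)
```

The resulting list of points is what a scatter plot of the defect feature distribution map would draw, with the label deciding the marker.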
Optionally, the using of the preset outlier statistical analysis strategy to perform outlier statistical analysis on the data feature distribution information and obtain the defect distribution boundary information includes:
Determining whether to search for the defect distribution boundary information automatically; if so, training the selected outlier statistical analysis model to obtain the defect distribution boundary information; if not, using a data segmentation method to perform outlier statistical analysis on the data feature distribution information to obtain the defect distribution boundary information;
Wherein the training of the outlier statistical analysis model includes: training the selected outlier statistical analysis model according to the detection result data and the data feature distribution information, until the obtained defect distribution boundary information of the detection object satisfies a first preset condition;
The using of the data segmentation method to perform outlier statistical analysis on the data feature distribution information includes: obtaining at least one first segmentation threshold on the feature data axis and/or the segmentation data axis according to the detection result data and the data feature distribution information; and obtaining the defect boundary information according to the first segmentation threshold, until the obtained defect distribution boundary information of the detection object satisfies a second preset condition.
Optionally, the segmentation data axis represents process flow information; the performing of threshold segmentation on the feature data axis and/or the segmentation data axis according to the detection result data and the data feature distribution information, until the obtained defect distribution boundary information of the detection object satisfies the second preset condition, includes:
Determining a first segmentation threshold of the segmentation data axis according to the data feature distribution information and the consistency of the distributions of the detection result data labeled as true defect data and labeled as nuisance data;
Determining a second segmentation threshold of the feature data axis according to the data feature distribution information and the consistency of the distributions of the detection result data labeled as true defect data and labeled as nuisance data;
Obtaining the defect distribution boundary information of the detection object according to the first segmentation threshold of the segmentation data axis and the second segmentation threshold of the feature data axis.
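The text does not spell out how the segmentation thresholds are computed from the labeled distributions. As one plausible reading, and purely as an assumption, the sketch below scans candidate cut points on a single axis and keeps the one that best separates samples labeled true defect from samples labeled nuisance; applying it once per axis yields a two-threshold boundary. The function name and the scoring rule are illustrative only.

```python
from typing import List


def best_threshold(values: List[float], labels: List[str]) -> float:
    """Pick the cut on one axis that best separates the two labels.

    Assumption: true defects lie above the cut, nuisance at or below it.
    """
    distinct = sorted(set(values))
    # Candidate cuts are midpoints between neighbouring distinct values.
    candidates = [(a + b) / 2 for a, b in zip(distinct, distinct[1:])]
    if not candidates:
        raise ValueError("need at least two distinct values")

    def score(t: float) -> int:
        # Number of samples that land on the side matching their label.
        return sum(
            (v > t) == (lab == "true_defect")
            for v, lab in zip(values, labels)
        )

    return max(candidates, key=score)


labels = ["nuisance", "nuisance", "nuisance", "true_defect", "true_defect"]
segmentation_values = [1.0, 1.5, 2.0, 6.0, 7.0]   # segmentation data axis
feature_values = [10.0, 12.0, 15.0, 40.0, 45.0]   # feature data axis

boundary = {
    "first_threshold": best_threshold(segmentation_values, labels),
    "second_threshold": best_threshold(feature_values, labels),
}
print(boundary)
```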
Optionally, the preset outlier statistical analysis strategy further includes: an outlier statistical analysis strategy that combines data segmentation and model learning;
The outlier statistical analysis strategy that combines data segmentation and model learning includes: obtaining, according to the data feature distribution information, at least one first segmentation threshold on the segmentation data axis for the detection result data labeled as true defects; and training the selected outlier statistical analysis model according to the first segmentation threshold and the data feature distribution information, until the obtained defect distribution boundary information of the detection object satisfies a third preset condition.
Optionally, the setting or optimizing of the values of the detection parameters of the detection recipe through reverse derivation, according to the defect distribution boundary information and the preset outlier statistical analysis strategy, includes:
Determining a reverse derivation strategy according to the preset outlier statistical analysis strategy;
Determining the input data information of the reverse derivation strategy according to the reverse derivation strategy;
Determining a data distribution model of the detection result data according to the input data information;
Determining the detection parameters of the detection recipe according to the data distribution model and the defect distribution boundary information;
Setting or optimizing the values of the detection parameters of the detection recipe according to the strategy of the detection recipe and the input data information of the reverse derivation.
Optionally, the preset outlier statistical analysis strategy is a data segmentation method;
according to the data segmentation method, computing the data distribution density of the detection result data of the detection object is taken as the reverse derivation strategy;
according to the reverse derivation strategy of computing the data distribution density, all detection result data of the detection object are taken as the input data information;
according to all the detection result data, it is assumed that the data distribution density, in the feature space, of the feature values of the feature data information of all the detection result data is divided into a normal region, a nuisance region and a true defect region; the normal region is the region where the data distribution density is greater than a first density threshold, the nuisance region is the region where the data density is less than or equal to the first density threshold and greater than a second density threshold, and the true defect region is the region where the data density is less than or equal to the second density threshold;
the first density threshold and the second density threshold are calculated according to all the detection result data and the labels of all the detection result data, wherein the first density threshold is greater than the second density threshold;
the displacement parameter of the detection recipe is calculated according to the first density threshold, the second density threshold and the defect distribution boundary information.
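The three density regions described above can be sketched as follows. This is a minimal illustration only: the histogram-based density estimate, the function names (`bin_density`, `estimate_thresholds`, `classify_region`) and the rule used to pick the two thresholds from the labels are assumptions made for the example, since the text does not specify how the thresholds are computed.

```python
from collections import Counter


def bin_density(values, bin_width):
    """Return {bin index: fraction of all samples falling in that bin}."""
    counts = Counter(int(v // bin_width) for v in values)
    total = len(values)
    return {b: c / total for b, c in counts.items()}


def estimate_thresholds(values, labels, bin_width):
    """Assumed rule: each threshold is the highest density seen for that label."""
    density = bin_density(values, bin_width)

    def density_of(v):
        return density[int(v // bin_width)]

    nuisance = [density_of(v) for v, l in zip(values, labels) if l == "nuisance"]
    defects = [density_of(v) for v, l in zip(values, labels) if l == "true_defect"]
    first, second = max(nuisance), max(defects)
    if not first > second:
        raise ValueError("first density threshold must exceed the second")
    return first, second


def classify_region(density, first_threshold, second_threshold):
    """Apply the region definitions from the text to one density value."""
    if density > first_threshold:
        return "normal"
    if density > second_threshold:
        return "nuisance"
    return "true_defect"
```

With the thresholds in hand, `classify_region` reproduces the normal / nuisance / true defect split; how the displacement parameter is then derived from the boundary is left out because the text gives no formula for it.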
Optionally, the preset outlier statistical analysis strategy is an outlier statistical analysis strategy based on a Gaussian model;
according to the Gaussian-model-based outlier statistical analysis strategy, obtaining the Gaussian distribution of the detection result data of the detection object is taken as the reverse derivation strategy, and Gaussian model detection is taken as the strategy of the detection recipe;
according to the reverse derivation strategy of computing the Gaussian distribution, all detection result data of the detection object and the defect distribution boundary information are taken as the input data information;
according to all the detection result data, it is assumed that the data distribution density, in the feature space, of the feature values of the feature data information of all the detection result data follows a Gaussian distribution;
the parameters of the Gaussian model detection are determined according to the input data information and the defect distribution boundary information.
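One way to picture the Gaussian case is sketched below. It assumes, purely for illustration, a single feature and a detection rule of the form "flag a value when it lies more than k standard deviations from the mean"; the sigma multiplier `k` and the function names are hypothetical and are not named in the text.

```python
import statistics


def derive_sigma_multiplier(feature_values, boundary_value):
    """Fit mean/std to the data and solve |boundary - mean| = k * std for k."""
    mean = statistics.fmean(feature_values)
    std = statistics.pstdev(feature_values)
    if std == 0:
        raise ValueError("feature values have zero spread")
    k = abs(boundary_value - mean) / std
    return mean, std, k


def is_outlier(x, mean, std, k):
    """Assumed Gaussian recipe rule: flag values beyond k standard deviations."""
    return abs(x - mean) > k * std
```

Here the boundary value stands in for the defect distribution boundary information, so the recipe parameter is obtained in one step from the data rather than by repeated manual tuning.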
Optionally, the preset outlier statistical analysis strategy is a machine-learning outlier statistical analysis strategy;
according to the machine-learning outlier statistical analysis strategy, obtaining the density threshold and the distance threshold of the detection result data of the detection object is taken as the reverse derivation strategy, and the machine learning model is taken as the strategy of the detection recipe;
according to the reverse derivation strategy of obtaining the density threshold and the distance threshold of the detection result data of the detection object, the obtained density and distance of the detection result data of the detection object are taken as the input data information;
according to all the detection result data and the defect distribution boundary information, the density parameter and the distance parameter of the detection strategy of the machine learning model are reversely derived.
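The density and distance parameters could, for example, be derived as in the sketch below. The text does not define "density" or "distance", so the example assumes a neighbor count within a fixed radius and the distance to the centroid of all samples; these definitions, the threshold rule and every function name are assumptions.

```python
import math


def local_density(points, index, radius):
    """Number of other points within `radius` of points[index]."""
    p = points[index]
    return sum(1 for j, q in enumerate(points)
               if j != index and math.dist(p, q) <= radius)


def centroid(points):
    """Coordinate-wise mean of all points."""
    n = len(points)
    return tuple(sum(c) / n for c in zip(*points))


def derive_thresholds(points, labels, radius):
    """Assumed rule: loosest thresholds that still capture every labeled true defect."""
    center = centroid(points)
    defect_idx = [i for i, l in enumerate(labels) if l == "true_defect"]
    if not defect_idx:
        raise ValueError("no true defect samples to derive thresholds from")
    density_threshold = max(local_density(points, i, radius) for i in defect_idx)
    distance_threshold = min(math.dist(points[i], center) for i in defect_idx)
    return density_threshold, distance_threshold


def flag_defects(points, radius, density_threshold, distance_threshold):
    """Flag points that are both sparse and far from the bulk of the data."""
    center = centroid(points)
    return [local_density(points, i, radius) <= density_threshold
            and math.dist(p, center) >= distance_threshold
            for i, p in enumerate(points)]
```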
Optionally, the detection recipe setting and optimization method further includes:
performing defect analysis on an object to be detected according to the detection recipe and the values of the detection parameters of the detection recipe, to obtain defect data information of the object to be detected.
To achieve the above object, the present invention further provides a detection recipe setting and optimization apparatus, which includes:
a true defect and nuisance marking unit, configured to label a first data sample to obtain a second data sample, wherein the first data sample includes several pieces of detection result data, and the second data sample includes the detection result data and a label corresponding to each piece of detection result data;
a feature distribution information acquisition unit, configured to obtain data feature distribution information of the detection object according to the second data sample;
a defect distribution boundary acquisition unit, configured to perform outlier statistical analysis on the data feature distribution information using a preset outlier statistical analysis strategy to obtain defect distribution boundary information, and to determine a detection recipe according to the preset outlier statistical analysis strategy;
a detection parameter setting and optimization unit, configured to determine or optimize the values of the detection parameters of the detection recipe through reverse derivation, according to the defect distribution boundary information and the preset outlier statistical analysis strategy.
Optionally, the detection recipe setting and optimization apparatus further includes:
a detection recipe application unit, configured to perform defect analysis on an object to be detected according to the detection recipe and the values of the detection parameters of the detection recipe, to obtain defect data information of the object to be detected.
To achieve the above object, the present invention further provides an electronic device including a processor and a memory, the memory storing a computer program which, when executed by the processor, implements the detection recipe setting and optimization method described above.
To achieve the above object, the present invention further provides a readable storage medium storing a computer program which, when executed by a processor, implements the detection recipe setting and optimization method described above.
Compared with the prior art, the detection recipe setting and optimization method, apparatus, electronic device and storage medium provided by the present invention have the following advantages:
The detection recipe setting and optimization method provided by the present invention first labels a first data sample to obtain a second data sample, wherein the first data sample includes several pieces of detection result data, and the second data sample includes the detection result data and a label corresponding to each piece of detection result data. It then obtains data feature distribution information of the detection object according to the second data sample and determines a detection recipe according to the preset outlier statistical analysis strategy. Next, it performs outlier statistical analysis on the data feature distribution information using the preset outlier statistical analysis strategy to obtain defect distribution boundary information. Finally, according to the defect distribution boundary information and the preset outlier statistical analysis strategy, it determines or optimizes the detection parameters of the detection recipe through reverse derivation. In this method, the first data sample includes several pieces of detection result data, and the detection result data include auxiliary parameter tuning information (such as the basic information and feature data information of the detection object, where the feature data information includes but is not limited to the grayscale, shape and texture of the defect indicated by the detection result). Data labeling distinguishes true defect data from nuisance data, which gives the subsequent data analysis and inference over historical information an important basis for obtaining accurate prior knowledge, and thereby improves the detection accuracy of the detection recipe. Furthermore, the strategy of the detection recipe and the values of the detection parameters are obtained through reverse derivation from the defect distribution boundary information and the preset outlier statistical analysis strategy. Reverse derivation infers a full set of detection parameters at once (that is, all parameters are tuned simultaneously), with the coupling between parameters taken into account, which enables rapid modeling of the detection recipe. It avoids repeated parameter adjustment and can significantly save labor and time. Moreover, for defect detection in a new process, the user can set or optimize the strategy of the detection recipe and the values of its detection parameters without any background in algorithms.
Since the detection recipe setting and optimization apparatus, electronic device and storage medium provided by the present invention belong to the same inventive concept as the detection recipe setting and optimization method provided by the present invention, they have all the advantages of that method, which are not repeated here.
Brief Description of the Drawings
Figure 1 is a schematic flow chart of a detection recipe setting and optimization method according to an embodiment of the present invention;
Figure 2 is a schematic flow chart of a data sample labeling method according to an embodiment of the present invention;
Figure 3 is a schematic diagram of one interface for defect labeling of data samples according to an embodiment of the present invention;
Figure 4 is an example of the distribution of detection result data in a two-dimensional feature space in one specific example of applying the present invention;
Figure 5 is a schematic diagram of the principle of outlier statistical analysis according to an embodiment of the present invention;
Figure 6 is a schematic diagram of defect distribution boundary information obtained by applying the outlier statistical analysis model provided by the present invention;
Figure 7 is a detailed flow chart of step S400 in Figure 1;
Figure 8 is a specific example of reverse derivation using the detection recipe setting and optimization method provided by the present invention;
Figure 9 is a schematic diagram of the data density distribution of one set of detection result data according to an embodiment of the present invention;
Figure 10 is a schematic diagram of the distribution of true defect data within the mean gray level range of the standard segmentation axis according to an embodiment of the present invention;
Figure 11(a) is an example of multiple test images according to an embodiment of the present invention;
Figure 11(b) is an example of the mean image generated from the test images in Figure 11(a);
Figure 11(c) is an example of the standard deviation image generated from the test images in Figure 11(a);
Figure 11(d) is an enlarged example of one of the test images;
Figure 11(e) is a schematic diagram of the defect locations detected using the machine learning recipe;
Figure 12 is a schematic diagram of the grayscale dynamic threshold obtained by applying the present invention;
Figure 13 is a schematic comparison of the detection result data obtained with a detection recipe produced by the detection recipe setting and optimization method provided by the present invention and the detection result data obtained with the original detection recipe;
Figure 14 is a structural block diagram of a detection recipe setting and optimization apparatus in an embodiment of the present invention;
Figure 15 is a schematic block diagram of an electronic device in an embodiment of the present invention.
The reference signs are as follows:
1-nuisance data, 2-true defect data, 3-defect distribution boundary curve, segment_value1, segment_value2-first segmentation thresholds, A, A1, A2-pixel points;
100-true defect and nuisance marking unit, 200-feature distribution information acquisition unit, 300-defect distribution boundary acquisition unit, 400-detection parameter setting and optimization unit, 500-detection recipe application unit;
601-processor, 602-communication interface, 603-memory, 604-communication bus.
Detailed Description
The detection recipe setting and optimization method, apparatus, electronic device and storage medium proposed by the present invention are described in further detail below with reference to the drawings and specific embodiments. The advantages and features of the present invention will become clearer from the following description. It should be noted that the drawings are in a highly simplified form and use imprecise proportions, and serve only to assist in explaining the embodiments of the present invention conveniently and clearly. To make the objects, features and advantages of the present invention easier to understand, please refer to the drawings. It should be understood that the structures, proportions, sizes and the like shown in the drawings of this specification are intended only to accompany the content disclosed in the specification, for the understanding and reading of those skilled in the art, and are not intended to limit the conditions under which the present invention can be implemented. Any modification of structure, change of proportional relationship or adjustment of size shall still fall within the scope covered by the technical content disclosed by the present invention, provided that it yields the same or a similar effect and achieves the same or a similar object as the present invention.
It should be noted that, in this document, relational terms such as "first" and "second" are used only to distinguish one entity or operation from another, and do not necessarily require or imply any actual relationship or order between those entities or operations. Furthermore, the terms "include", "comprise" and any variants thereof are intended to cover non-exclusive inclusion, so that a process, method, article or device that includes a list of elements includes not only those elements but also other elements not expressly listed, or elements inherent to the process, method, article or device. Without further limitation, an element defined by the phrase "including a ..." does not exclude the presence of additional identical elements in the process, method, article or device that includes the element.
One embodiment of the present invention provides a detection recipe setting and optimization method. Specifically, please refer to Figure 1, which schematically shows the flow of the detection recipe setting and optimization method according to an embodiment of the present invention. As shown in Figure 1, the detection recipe setting and optimization method includes the following steps:
S100: label a first data sample to obtain a second data sample, wherein the first data sample includes several pieces of detection result data, and the second data sample includes the detection result data and a label corresponding to each piece of detection result data;
S200: obtain data feature distribution information of the detection object according to the second data sample;
S300: perform outlier statistical analysis on the data feature distribution information using a preset outlier statistical analysis strategy to obtain defect distribution boundary information, and determine a detection recipe according to the preset outlier statistical analysis strategy;
S400: set or optimize the values of the detection parameters of the detection recipe through reverse derivation, according to the defect distribution boundary information and the preset outlier statistical analysis strategy.
Thus, in the detection recipe setting and optimization method provided by the present invention, the first data sample includes several pieces of detection result data, and the detection result data include a large amount of auxiliary parameter tuning information (such as the basic information and feature data information of the detection object, where the feature data information includes but is not limited to the grayscale, shape and texture of the defect indicated by the detection result). Data labeling distinguishes true defect data from nuisance data, which gives the subsequent data analysis and inference over historical information an important basis for obtaining accurate prior knowledge, and thereby improves the detection accuracy of the detection recipe. Furthermore, the strategy of the detection recipe and its parameter values are obtained through reverse derivation from the defect distribution boundary information and the preset outlier statistical analysis strategy. Reverse derivation infers a full set of detection parameters at once (that is, all parameters are tuned simultaneously), with the coupling between parameters taken into account, which enables rapid modeling of the detection process, avoids repeated parameter adjustment, and can significantly save labor and time. Moreover, for defect detection in a new process, the user can set or optimize the strategy of the detection recipe and the values of the detection parameters without any background in algorithms.
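Steps S100 to S400 can be sketched end to end on a toy example. Everything concrete here is an assumption introduced for illustration: a single grayscale feature, a midpoint rule for the defect boundary, and a recipe rule of the form `gray - reference_gray > offset`; the class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class DetectionResult:
    die_id: int
    gray: float  # hypothetical single feature: mean grayscale of the reported defect


def label_samples(results, is_true_defect):
    """S100: attach a label to every detection result (the second data sample)."""
    return [(r, "true_defect" if is_true_defect(r) else "nuisance") for r in results]


def feature_distribution(labeled):
    """S200: collect the feature values observed for each label."""
    dist = {"true_defect": [], "nuisance": []}
    for result, label in labeled:
        dist[label].append(result.gray)
    return dist


def defect_boundary(dist):
    """S300 (assumed rule): midpoint between brightest nuisance and darkest true defect."""
    upper_nuisance = max(dist["nuisance"])
    lower_defect = min(dist["true_defect"])
    if upper_nuisance >= lower_defect:
        raise ValueError("labels are not separable on this single feature")
    return (upper_nuisance + lower_defect) / 2


def derive_offset(boundary, reference_gray):
    """S400: invert the assumed recipe rule `gray - reference_gray > offset`."""
    return boundary - reference_gray


def recipe_flags(result, reference_gray, offset):
    """Apply the assumed recipe rule to one detection result."""
    return result.gray - reference_gray > offset
```

The point of the sketch is the direction of the computation: the recipe parameter is solved from the labeled boundary in one pass instead of being adjusted by trial and error.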
It should be particularly noted that the detection result data are historical detection result data of the detection object. For example, when the strategy and detection parameters of the detection recipe used in a defect detection process are set for the first time, the strategy and detection parameters can first be selected randomly or manually to obtain a certain amount of defect detection data, and this defect detection data serves as the detection result data (that is, the first data sample). When the strategy and detection parameters of a detection recipe are being optimized, the detection result data (that is, the first data sample) include all or part of the historical detection data of the detection recipe to be optimized. For ease of understanding and explanation, the detection result data described below are historical wafer defect detection data. Obviously, this does not limit the present invention; in other embodiments, the detection recipe setting and optimization method provided by the present invention can also be applied to other detection recipes for the initial detection of wafer defects, which are not illustrated one by one.
Preferably, in one preferred embodiment, the detection result data include basic information and feature data information of the detection object, wherein the feature data information includes the position information of the detection result on the detection object, and one or more of the process flow information of the detection object and the grayscale information, shape information and texture information of the data information of the detection result. As those skilled in the art will understand, the data information of the detection result necessarily also includes conclusion information indicating the detection result (defect data or non-defect data). For ease of understanding, specific examples of the data information of the detection result are given below in comparison with the image information of the detection result and are not elaborated here. It can be seen that the detection result data contain information that assists parameter tuning, such as the basic information and feature data information of the detection object (for example, the grayscale, shape and texture of a nuisance), and that the subsequent defect distribution plotting and reverse parameter inference are both based on the detection result data. The detection recipe setting and optimization method provided by the present invention can therefore improve the detection accuracy of the detection recipe.
Preferably, in one embodiment, please refer to Figure 2, which schematically shows the flow of the data sample labeling method. As can be seen from Figure 2, in step S100, labeling the first data sample to obtain the second data sample includes:
S110: obtain the basic information of the detection object corresponding to each piece of detection result data in the first data sample;
S120: for each piece of detection result data, obtain the original information corresponding to that piece of detection result data on the detection object, according to the basic information of the detection object and the position information of the detection result on the detection object;
S130: according to the original information, determine whether the defect indicated by the data information of the detection result is a true defect; if so, mark the piece of detection result data as true defect data; if not, mark it as nuisance data;
S140: obtain the second data sample from all the detection result data and the label corresponding to each piece of detection result data.
With this configuration, the detection recipe setting and optimization method provided by the present invention labels the first data sample and thereby accurately separates the true defect data from the nuisance data (noise interference data) in the detection result data (historical data). This provides accurate prior knowledge for subsequently obtaining the data feature distribution information, obtaining the defect distribution boundary information from it, and performing the reverse derivation, and thus improves the detection accuracy of the detection recipe.
It should be particularly noted that, as those skilled in the art will understand, the feature data information covers all detection results of the defect detection performed on the detection object, including defect data and non-defect data.
As one preferred example of applying the detection recipe setting and optimization method provided by the present invention, the following description takes a Wafer as the detection object; the first data sample is then the historical detection result data of the Wafer. More specifically, the basic information of the Wafer includes the number of the Wafer, the number of Dies it contains and the basic information of each Die; the basic information of a Die includes its Die number and image information. Correspondingly, in step S120, obtaining the original information corresponding to the piece of detection result data on the detection object, according to the basic information of the detection object and the position information of the detection result on the detection object, includes:
S121: obtain the Die number and the basic information of each Die of the Wafer according to the basic information of the Wafer;
S122: obtain the image information of the detection result corresponding to the piece of detection result data on the Die, according to the position information of the detection result on the Die and the image information of the Die.
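Steps S121 and S122 amount to looking up a Die by its number and cutting out the image region around the reported position. The sketch below assumes, for illustration only, that a Die image is a 2-D list of pixel values, that the position is a (row, column) pair, and that the Wafer is a nested dictionary; none of these representations comes from the text.

```python
def crop_patch(die_image, row, col, half_size):
    """Return the square patch around (row, col), clipped to the image borders."""
    top = max(row - half_size, 0)
    left = max(col - half_size, 0)
    bottom = min(row + half_size + 1, len(die_image))
    right = min(col + half_size + 1, len(die_image[0]))
    return [line[left:right] for line in die_image[top:bottom]]


def original_info(wafer, die_id, row, col, half_size=1):
    """S121: look up the Die by number; S122: crop its image at the reported position."""
    die_image = wafer["dies"][die_id]["image"]
    return crop_patch(die_image, row, col, half_size)
```

The returned patch plays the role of the image information of the detection result, which a reviewer can then inspect to decide between true defect and nuisance.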
For a more accurate understanding of the present invention, the data information of the detection result and the image information of the detection result are explained below. The data information of the detection result is the description, within the detection result data, of the image information of the detection result, whereas the image information of the detection result is the original image on the detection object that corresponds to that data information. In other words, the data information of the detection result is the data representation of the image information of the detection result. Taking a wafer as the detection object again: if the defect is a texture defect, the data information of the detection result records the texture features of the defect, such as the roughness of the texture, while the image information of the detection result is the original image corresponding to the texture defect. The image information of the detection result can therefore be used to re-judge whether the corresponding detection result data is true defect data or nuisance data.
Specifically, please refer to Figure 3, which schematically shows one interface for defect labeling of data samples according to an embodiment of the present invention. As can be seen from Figure 3, the defect labeling interface has three main functional areas: a Wafer display window area, a detection data list window area and a defect (Defect) display area. The Wafer display window area graphically displays the basic information of the Wafer, including but not limited to the position of each Die on the Wafer and the number of each Die. Below the Wafer display window area, the user can select the number of the Die to be labeled; according to the selected Die number, the detection data list window area refreshes to show the historical detection results of the corresponding Die. From the list of detection result data in the detection data list window area, the user can select the detection result data one by one, and the defect display area displays the original information corresponding to the selected detection result data (that is, the image information of the detection result, which is the image information indicated by the position information of the detection result on the Die). Based on the features of this original information (texture, size, curvature, shape and so on), manual review or machine review can further confirm whether the defect indicated by the data information of the detection result is a true defect. If so, the detection result data is marked as true defect data (for example, its label in the detection data list window area is marked as a true defect, and the value in the column for the manual judgment of whether it is a real defect is set to yes); if not, the detection result data is marked as nuisance data (for example, its label in the detection data list window area is marked as a false defect, and the value in that column is set to no). By repeating this process, selecting each Die number in turn and manually labeling each detection result under the current Die, the labeling of the detection results of the entire Wafer is completed; proceeding in the same way, the first data sample is labeled and the second data sample is obtained.
It should be noted that although manual labeling is used above as an example of how the first data sample is labeled, this is clearly not a limitation of the present invention; in other embodiments, labeling may also be performed by machine learning or other methods, which the present invention does not limit. Further, as mentioned above, although the detection recipe setting and optimization method provided by the present invention is described with a wafer as the detection object, those skilled in the art will understand that this is merely an exemplary description of a preferred embodiment rather than a limitation of the present invention. In other embodiments, the detection object may also be a product other than a wafer, including but not limited to lenses, display screens, and 3D-printed products, which are not illustrated one by one here.
Preferably, in one exemplary embodiment, in step S200, obtaining the data feature distribution information of the detection object according to the second data sample includes:
S210: determining a feature data axis and a segmentation data axis, and establishing a feature space based on the feature data axis and the segmentation data axis; wherein the feature data axis represents feature data information of the detection result data, and the segmentation data axis represents segmentation feature information; wherein the segmentation feature information includes feature data information other than that used for the feature data axis;
S220: arranging the second data sample according to the feature space to obtain the data feature distribution information of the detection object.
With this configuration, the detection recipe setting and optimization method provided by the present invention arranges the second data sample in the feature space so that the distribution of the detection result data in the feature space shows a trend that makes true defect data and noise data easier to tell apart, which in turn makes it easier to obtain the defect distribution boundary information.
Preferably, the feature space includes one or more feature data axes and one or more segmentation data axes.
In the detection recipe setting and optimization method provided by the present invention, the feature space may include multiple feature data axes and multiple segmentation data axes; that is, the feature space may be a multidimensional feature space. For example, there may be two feature data axes, one representing the grayscale information of the defect and the other representing its texture information, and two segmentation data axes, one representing the shape information of the defect and the other representing its size. Because the method thus draws on more feature information about the defect, it lays a good foundation for further improving the detection accuracy of the detection recipe. It should be noted that the above is only an illustrative description and not a limitation of the present invention; in practical applications, the feature data axes, the segmentation data axes, and the number of each should be selected reasonably according to actual needs.
Preferably, in one exemplary embodiment, in step S220, arranging the second data sample according to the feature space to obtain the data feature distribution information of the detection object includes:
S221: establishing a rectangular coordinate system with the feature data axis as the horizontal axis and the segmentation data axis as the vertical axis;
S222: within the rectangular coordinate system, arranging the second data sample along the horizontal axis by the feature value of the feature data information represented by the feature data axis, and along the vertical axis by the feature value of the feature data information represented by the segmentation data axis, to obtain a defect feature distribution map.
Specifically, refer to FIG. 4, which schematically shows an example distribution of detection result data in a two-dimensional feature space. As can be seen from FIG. 4, this example is a two-dimensional data feature distribution map in which the horizontal axis is the feature data axis and the vertical axis is the segmentation data axis. That is, the abscissa of each point in the coordinate system gives the feature value and the ordinate gives the corresponding segmentation feature value, so that the feature values of all detection result data together make up the entire feature distribution map.
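Steps S221 and S222 can be pictured with a minimal sketch. It assumes that each labeled detection result carries one value for the feature data axis and one for the segmentation data axis; the `Detection` class, its field names, and the sample values are hypothetical and are not taken from the embodiment.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Detection:
    """One labeled detection result (hypothetical structure)."""
    feature_value: float   # coordinate on the feature data axis (horizontal)
    segment_value: float   # coordinate on the segmentation data axis (vertical)
    is_true_defect: bool   # label carried by the second data sample


def build_distribution(samples):
    """Order samples by feature value, then by segmentation value."""
    return sorted(samples, key=lambda d: (d.feature_value, d.segment_value))


# Illustrative values only.
samples = [
    Detection(0.9, 12.0, True),
    Detection(0.2, 30.0, False),
    Detection(0.5, 12.0, False),
]
distribution = build_distribution(samples)
points = [(d.feature_value, d.segment_value) for d in distribution]
```

Each tuple in `points` is one point of the two-dimensional distribution map; the label stays attached to the sample so that true defect data and noise data can be drawn differently.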
It should be noted that, as mentioned above, although the example uses a two-dimensional feature space distribution, in practical applications the feature data axis and the segmentation data axis may be multidimensional. That is, multiple segmentation values may be selected on the segmentation data axis to divide the detection result data (that is, the second sample data) into several different feature distributions.
Further, the present invention does not limit how the feature space is selected. In one embodiment, a feature selection algorithm may be used to select the feature data axis and the segmentation data axis, so that the feature space is selected automatically; in other embodiments, the feature data axis and the segmentation data axis may also be selected manually, and the present invention imposes no limitation in this respect. More specifically, the feature data axis may represent information such as color, texture, shape, or size, and the segmentation data axis may be information such as a trained mean map.
Further, as one preferred embodiment, the criteria for selecting the feature space are as follows: the segmentation data axis should distinguish different process regions well, and the feature data axis should make true defect data clearly different from noise data (noise points). The ultimate goal is for the distribution of the detection result data in the feature space to show a trend that makes the distinction between real defects and noise points more obvious. For example, for detection result data of wafer defects, if using the shape in the feature data information as the feature data axis separates real defects from noise data in the feature space more clearly than using the texture does, then the shape, rather than the texture, is used as the feature data axis. Understandably, the shape in the feature data information is then no longer used as the segmentation data axis.
Preferably, in one preferred embodiment, refer to FIG. 5, which schematically shows a flowchart of the detection recipe setting and optimization method according to an embodiment of the present invention. As can be seen from FIG. 5, in step S300, performing outlier statistical analysis on the data feature distribution information using the preset outlier statistical analysis strategy to obtain the defect distribution boundary information includes:
Determining whether to search for the defect distribution boundary information automatically; if so, training the selected outlier statistical analysis model to obtain the defect distribution boundary information; if not, performing outlier statistical analysis on the data feature distribution information using the data segmentation method to obtain the defect distribution boundary information.
Specifically, refer to FIG. 6, which is a schematic diagram of defect distribution boundary information obtained by applying the outlier statistical analysis model provided by the present invention. In FIG. 6, feature1 is the segmentation data axis and feature2 is the feature data axis. As can be seen from FIG. 6, in this example the defect distribution boundary information 3 is a curve. Thus, the detection recipe setting and optimization method provided by the present invention determines the preset outlier statistical analysis strategy based on the detection result data and the data feature distribution information, and then applies that strategy to perform outlier statistical analysis on the data feature distribution information and obtain the defect distribution boundary information. As a result, the defect distribution boundary information separates the true defect data 2 from the noise data 1 well; that is, it reduces over-detection as far as possible without causing missed defects, so that more noise data is filtered out. This in turn ensures that the detection recipe subsequently determined by reverse derivation from the defect distribution boundary information produces neither missed detections nor over-detections, thereby improving the defect detection accuracy of the detection process.
It should be noted that, for the same second sample data in the same feature space, different outlier statistical analysis strategies may yield different defect distribution boundary information. The subsequent reverse derivation and the detection recipe strategy are therefore closely tied to the outlier statistical analysis strategy. For the same second sample data used in FIG. 6, if the data segmentation method is used instead, the shape of the defect distribution boundary information is completely different from that in FIG. 6; see the description below for details, which are not expanded here to avoid repetition.
For ease of understanding and explanation, the following description uses a two-dimensional data distribution throughout. The outlier statistical analysis model is described in detail first, followed by the data segmentation method.
Specifically, training the outlier statistical analysis model includes: training the selected outlier statistical analysis model according to the detection result data and the data feature distribution information until the obtained defect distribution boundary information of the detection object satisfies a first preset condition.
More specifically, as those skilled in the art will understand, the outlier statistical analysis model can be selected through a combined analysis of the detection result data and the data feature distribution information. Outlier statistical analysis models include but are not limited to statistics-based outlier algorithms (such as the 3σ rule), clustering algorithms based on distance and proximity (such as K-means), density-based outlier algorithms (such as DBSCAN), and tree-based outlier analysis algorithms (such as isolation forest). It should be noted that the choice of algorithm model is critical: different models imply differently shaped outlier boundaries, and an optimal model fits the data set without either underfitting or overfitting. For example, if the distribution of the second sample data in the feature space is close to a normal distribution, a statistics-based outlier algorithm (such as the 3σ rule) is preferred. As another example, if, in the feature space, the true defect data lie close to one another and the noise data lie close to one another, while the true defect data and the noise data lie far apart, a clustering algorithm based on distance and proximity is preferred. Those skilled in the art should be able to extend these examples by analogy, so further cases are not listed here.
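As an illustration of the statistics-based option, the following is a minimal sketch of the 3σ rule on a single feature. It assumes that the values of the noise-like population are roughly normally distributed; the function names and sample values are hypothetical, and the embodiment does not prescribe this particular code.

```python
import statistics


def three_sigma_bounds(values, k=3.0):
    """Return (low, high) = mean -/+ k population standard deviations."""
    mu = statistics.mean(values)
    sigma = statistics.pstdev(values)
    return mu - k * sigma, mu + k * sigma


def is_outlier(value, bounds):
    """A value outside the band is treated as an outlier."""
    low, high = bounds
    return value < low or value > high


# Illustrative feature values of a noise-like population.
noise_like = [10.0, 10.5, 9.5, 10.2, 9.8, 10.1, 9.9]
bounds = three_sigma_bounds(noise_like)

far_point_is_outlier = is_outlier(14.0, bounds)    # well outside the band
near_point_is_outlier = is_outlier(10.3, bounds)   # inside the band
```

The band is symmetric about the mean, which is one reason the text ties this choice to near-normal data; clustered or irregular distributions call for one of the other model families listed above.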
Further, those skilled in the art will understand that the purpose of the outlier statistical analysis model is to find the optimal boundary result. After the model has been chosen, it should be trained with the second sample data so that, through continued learning and objective optimization, the training result finds the best inflection points on the segmentation data axis and distinguishes true defects from noise data (interfering noise points) along the feature data axis. Once training is complete, a boundary result (that is, the defect distribution boundary information) is obtained. As shown in FIG. 6, the defect distribution boundary curve 3 (that is, the defect distribution boundary information) separates the defect data from the noise data well, ensuring that the detection results involve neither missed detections nor over-detections. In other words, the first preset condition is that the defect distribution boundary information can distinguish the detection result data labeled as true defect data from the detection result data labeled as noise data in the second sample.
Further, performing outlier statistical analysis on the data feature distribution information using the data segmentation method includes: obtaining at least one first segmentation threshold on the feature data axis and/or the segmentation data axis according to the detection result data and the data feature distribution information; and obtaining the defect boundary information according to the first segmentation threshold, until the obtained defect distribution boundary information of the detection object satisfies a second preset condition.
As one preferred embodiment, the data segmentation method includes segmenting the feature space manually to obtain the first segmentation threshold. As those skilled in the art will understand, the present invention does not limit the specific implementation of the data segmentation method; in other embodiments, the first segmentation threshold may also be obtained by a data segmentation algorithm.
For ease of understanding and explanation, the data segmentation method is described below using a two-dimensional data distribution and manual segmentation as an example:
S321: determining the first segmentation threshold of the segmentation data axis according to the data feature distribution information and the consistency of the distribution of the detection result data labeled as true defect data and labeled as noise data;
S322: determining the second segmentation threshold of the feature data axis according to the data feature distribution information and the consistency of the distribution of the detection result data labeled as true defect data and labeled as noise data;
S323: obtaining the defect distribution boundary information of the detection object according to the first segmentation threshold of the segmentation data axis and the second segmentation threshold of the feature data axis.
Specifically, in step S321, the data feature distribution information is taken as input, and the segmentation data axis is segmented within this feature distribution map. The criterion for segmentation is the consistency of the distribution of the detection result data: data with a consistent distribution are treated as one cluster, and the segmentation values between clusters are found so that data from different processes can be distinguished. The consistent distribution refers to the distribution pattern of the feature data information of the detection result data, including but not limited to the distribution density in the feature space and the relative positions of points in that space, and serves as the basis for determining the thresholds on the segmentation axis and the feature axis. For instance, in one example, two first segmentation thresholds, segment_value1 and segment_value2, are set.
Correspondingly, in step S322, the second segmentation threshold is determined for the feature data axis within the feature distribution. Because the defect data points have already been labeled in the feature distribution, the principle for determining the second segmentation threshold is to separate the noise data from the true defect data as far as possible, which ensures that no detection result data is missed while also minimizing over-detection. That is, the second preset condition is preferably that the defect boundary information can separate the true defect data from the noise data.
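One way to read the principle in step S322 is sketched below. It assumes, purely for illustration, that true defects have larger feature values than noise and that a value at or above the threshold is reported as a defect; the function name and the midpoint rule are hypothetical choices rather than the rule used by the embodiment.

```python
def choose_feature_threshold(noise_values, defect_values):
    """Pick a threshold t; a value >= t is reported as a defect.

    The threshold never exceeds the smallest labeled true defect, so no
    labeled defect is missed. It is placed midway between that defect and
    the largest noise value below it, to keep the two groups apart.
    """
    lowest_defect = min(defect_values)
    noise_below = [v for v in noise_values if v < lowest_defect]
    if not noise_below:
        # No noise below the lowest defect: the defect itself is the cut.
        return lowest_defect
    return (max(noise_below) + lowest_defect) / 2.0


# Illustrative, cleanly separable feature values.
noise_values = [1.0, 2.0, 3.0]
defect_values = [7.0, 9.0]
threshold = choose_feature_threshold(noise_values, defect_values)  # 5.0
```

If some noise values lie above the lowest true defect, this rule still reports every labeled defect, and that noise becomes unavoidable over-detection, which mirrors the priority stated above.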
Thus, once the first segmentation threshold of the segmentation data axis and the second segmentation threshold of the feature data axis have each been determined, the defect distribution boundary information of the outlier statistical analysis is obtained. The figure below, again using a two-dimensional feature data distribution as an example, shows manually segmented defect distribution boundary information. On the segmentation axis, two first segmentation thresholds, segment_value1 and segment_value2, segment the detection result data, dividing all of it into three differently distributed sections. Within each interval delimited by the segmentation thresholds, one of three different second segmentation thresholds on the feature data axis distinguishes true defects from noise data, giving the final defect distribution boundary information. That is, the defect distribution boundary information includes two straight lines parallel to the feature data axis feature1, formed by the two first segmentation thresholds segment_value1 and segment_value2, together with three lines located in the successive intervals: a first line segment between the feature data axis feature1 and the first segmentation threshold segment_value1 that intersects both; a second line segment between the first segmentation thresholds segment_value1 and segment_value2 that intersects both; and a third straight line that intersects the first segmentation threshold segment_value2 and extends upward along the segmentation data axis feature2.
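The piecewise boundary described above, with two first segmentation thresholds and one second segmentation threshold per interval, can be sketched as a lookup. The code assumes that a detection at or above the feature threshold of its interval counts as a true defect; the function name and all numeric values are hypothetical.

```python
from bisect import bisect_right


def make_boundary(segment_thresholds, feature_thresholds):
    """Build a classifier from a piecewise threshold boundary.

    segment_thresholds: ascending cut points on the segmentation data axis.
    feature_thresholds: one cut on the feature data axis per interval, so
    there is exactly one more of them than there are segment thresholds.
    """
    cuts = list(segment_thresholds)
    if cuts != sorted(cuts):
        raise ValueError("segment thresholds must be ascending")
    if len(feature_thresholds) != len(cuts) + 1:
        raise ValueError("need one feature threshold per interval")

    def is_defect(feature_value, segment_value):
        # Find the interval of the segmentation axis that holds the point.
        interval = bisect_right(cuts, segment_value)
        return feature_value >= feature_thresholds[interval]

    return is_defect


# segment_value1 = 10.0 and segment_value2 = 20.0 give three intervals.
is_defect = make_boundary([10.0, 20.0], [5.0, 8.0, 3.0])
low_interval = is_defect(6.0, 4.0)      # compared with 5.0
mid_interval = is_defect(6.0, 15.0)     # compared with 8.0
high_interval = is_defect(6.0, 25.0)    # compared with 3.0
```

The same feature value is classified differently depending on its interval, which is what allows data from different process regions to be given different thresholds.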
Preferably, in one exemplary embodiment, the preset outlier statistical analysis strategy further includes an outlier statistical analysis strategy that combines data segmentation with model learning. This combined strategy includes: obtaining, according to the data feature distribution information, at least one first segmentation threshold on the segmentation data axis for the detection result data labeled as true defects; and training the selected outlier statistical analysis model according to the first segmentation threshold and the data feature distribution information until the obtained defect distribution boundary information of the detection object satisfies a third preset condition.
With this configuration, when obtaining the outlier distribution boundary information, the detection recipe setting and optimization method provided by the present invention uses the combined data segmentation and model learning strategy to further reduce the uncertainty of training the machine learning model. The results of manual segmentation act as constraints on the model's input, which further improves the efficiency of obtaining the defect boundary distribution information.
The third preset condition is preferably that no detection result data is missed while over-detection is minimized; that is, the third preset condition is preferably that the defect boundary information can separate the true defect data from the noise data, or that the number of training iterations of the outlier statistical analysis model reaches a preset value.
As those skilled in the art will understand, the defect distribution boundary information obtained with the combined data segmentation and model learning strategy differs from that obtained with the data segmentation method alone. With the combined strategy, the defect distribution boundary information includes two straight lines parallel to the feature data axis feature1, formed by the two first segmentation thresholds segment_value1 and segment_value2, and closed curves that enclose the true defect data within the three intervals delimited by the feature data axis feature1 and the first segmentation thresholds segment_value1 and segment_value2. Different outlier statistical analysis strategies thus yield markedly different defect boundary distribution information, but whichever strategy is used, the resulting defect boundary distribution information can accurately distinguish the true defect data from the noise data in the detection result data. For this reason, as mentioned above, the present invention does not limit the specific implementation of the outlier statistical analysis strategy.
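The combined strategy can be sketched as grouping the labeled true defects by interval and then fitting one closed curve per interval. Here an axis-aligned ellipse at k standard deviations stands in for the trained outlier statistical analysis model; this stand-in, the function names, and the data are assumptions made for illustration only.

```python
import statistics
from bisect import bisect_right


def fit_ellipse(points, k=3.0, eps=1e-9):
    """Return (cx, cy, rx, ry) of an axis-aligned k-sigma ellipse."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    # eps keeps the radii non-zero when all points share a coordinate.
    return (statistics.mean(xs), statistics.mean(ys),
            k * max(statistics.pstdev(xs), eps),
            k * max(statistics.pstdev(ys), eps))


def fit_per_interval(defect_points, segment_thresholds, k=3.0):
    """Group true-defect points by interval and fit one ellipse to each."""
    groups = {}
    for feature_value, segment_value in defect_points:
        interval = bisect_right(segment_thresholds, segment_value)
        groups.setdefault(interval, []).append((feature_value, segment_value))
    return {i: fit_ellipse(pts, k) for i, pts in groups.items()}


def inside(model, segment_thresholds, feature_value, segment_value):
    """True if the point lies within the closed curve of its interval."""
    interval = bisect_right(segment_thresholds, segment_value)
    if interval not in model:
        return False  # no true defects were labeled in this interval
    cx, cy, rx, ry = model[interval]
    dx = (feature_value - cx) / rx
    dy = (segment_value - cy) / ry
    return dx * dx + dy * dy <= 1.0


# Points are (feature_value, segment_value); thresholds are illustrative.
thresholds = [10.0, 20.0]
defects = [(7.0, 2.0), (8.0, 4.0), (9.0, 6.0), (3.0, 22.0), (4.0, 26.0)]
model = fit_per_interval(defects, thresholds)
```

Because the first segmentation thresholds fix the intervals in advance, each fit only sees the data of one process region, which is the constraint on the model input that the combined strategy relies on.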
In addition, for details of the data segmentation method and the model learning used in the combined strategy, see the detailed descriptions of the data segmentation method and the outlier statistical analysis model above; they are not repeated here.
Preferably, in one exemplary embodiment, refer to FIG. 7, which schematically shows a detailed flowchart of step S400 in FIG. 1. As can be seen from FIG. 7, in step S400, determining, by reverse derivation according to the defect distribution boundary information and the preset outlier statistical analysis strategy, the values of the detection parameters used to set or optimize the detection recipe includes:
S410: determining a reverse derivation strategy according to the preset outlier statistical analysis strategy;
S420: determining the input data information of the reverse derivation strategy according to the reverse derivation strategy;
S430: determining a data distribution model of the detection result data according to the input data information;
S440: determining the detection parameters of the detection recipe according to the data distribution model and the defect distribution boundary information;
S450: setting or optimizing the values of the detection parameters of the detection recipe according to the strategy of the detection recipe and the input data information of the reverse derivation.
Thus, in the prior art, forward parameter setting adjusts the corresponding parameters based on the feedback from forward tuning, possibly adjusting only one or two detection parameters at a time. By contrast, the detection recipe setting and optimization method provided by the present invention determines the detection recipe strategy by reverse derivation and infers the setting values of all parameters of the detection recipe (key parameters such as data density, data sparsity distance, and/or tolerance range) from the defect boundary distribution information, taking the coupling between the parameters of the detection process into account and thereby avoiding repeated tuning. Moreover, because the tuning is driven by the user's labeling results, a relatively accurate set of detection process parameters can be inferred automatically without the user needing prior knowledge, and all detection parameters are tuned to their optimal level in one pass, improving both the efficiency of tuning the detection process and the detection accuracy of the detection recipe.
More specifically, refer to FIG. 8, which schematically shows a specific example of reverse derivation using the detection recipe setting and optimization method provided by the present invention. As can be seen from FIG. 8, in this method the outlier statistical analysis strategy, the reverse derivation strategy and the parameter setting values of the detection process are closely related: the reverse derivation strategy and the detection recipe strategy share the same core principle as the outlier statistical analysis strategy used to obtain the defect distribution boundary information. For example, if the outlier segmentation method is adopted as the outlier statistical analysis strategy, the basic principle of the reverse derivation and of the detection recipe strategy should also be consistent with that of the outlier segmentation method.
To facilitate understanding of the present invention, the process of obtaining the parameter setting values of the detection recipe by reverse derivation is described in detail below, taking as examples the data segmentation method, the Gaussian-model-based outlier statistical analysis strategy and the machine-learning-based outlier statistical analysis strategy.
1. Data segmentation method: reverse derivation of a new data process and parameter setting values
Before the specific steps of reverse derivation based on the data segmentation method are described, the core idea of the method is explained as follows:
To facilitate understanding, refer to FIG. 9, which schematically shows a data density distribution of detection result data according to one implementation of this embodiment. The basic idea of the method is to define the region of the feature distribution map in which the density of detection result data points (the feature values of the detection result data) is greater than a first threshold as the normal region; that is, the normal region is expressed as a function of the data density. All data points whose data density data_density is greater than the first threshold are therefore normal, and data_density is one of the detection parameters to be inferred backwards. Further, the region in which the data density is less than or equal to the first threshold and greater than a second threshold is defined as the nuisance region. Detection result data falling in this region contain noise, but this noise is an allowable error (the nuisance region arises from process errors and noise) and does not constitute defect data. In other words, the nuisance region is described by adding a tolerance value (displacement parameter) to the normal region, expressed as:
nuisance_threshold=f1(data_density)     (1)
Because the true defect data have already been annotated in the outlier statistical analysis (that is, the boundary between the nuisance region and the true defect region has been obtained), the displacement parameter (tolerance value) can be inferred backwards from the defect distribution boundary information. The region in which the data density is less than or equal to the second threshold (that is, outside the nuisance region) is defined as the true defect region, which can be expressed as:
boundary_threshold=f2(inspection_data)      (2)
defect_threshold=f3(boundary_threshold)       (3)
offset_parameter=abs(defect_threshold-nuisance_threshold)    (4)
In the formulas, boundary_threshold is the defect distribution boundary obtained by the outlier statistical analysis algorithm, and defect_threshold is a function of boundary_threshold; the displacement parameter offset_parameter is then calculated from defect_threshold and nuisance_threshold.
From the above analysis, in one preferred implementation, if the preset outlier statistical analysis strategy is the data segmentation method, the displacement parameter of the detection recipe is obtained through the following steps:
Step A1: According to the data segmentation method, take the computation of the data distribution density of the detection result data of the detection object as the reverse derivation strategy.
Step A2: According to the reverse derivation strategy of computing the data distribution density, take all detection result data of the detection object as the input data information.
Step A3: Based on all detection result data, assume that the feature values of the feature data information of all detection result data are divided in the feature space, by data distribution density, into a normal region, a nuisance region and a true defect region; the normal region is the region in which the data distribution density is greater than a first density threshold, the nuisance region is the region in which the data density is less than or equal to the first density threshold and greater than a second density threshold, and the true defect region is the region in which the data density is less than or equal to the second density threshold.
Step A4: Calculate the first density threshold and the second density threshold from all detection result data and the labels of all detection result data, where the first density threshold is greater than the second density threshold;
Step A5: Calculate the displacement parameter of the detection recipe from the first density threshold, the second density threshold and the defect distribution boundary information.
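Steps A3 and A4 can be illustrated with a minimal sketch. The text does not specify how the density is measured or how the thresholds are derived from the labels, so the neighbor-count density (`local_density`), the midpoint rule in `infer_density_thresholds`, and all function names below are assumptions made for illustration only.

```python
import numpy as np

def local_density(points, radius):
    """Number of other points within `radius` of each point (assumed density measure)."""
    diffs = points[:, None, :] - points[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=-1))
    return (dists <= radius).sum(axis=1) - 1  # subtract 1 to exclude the point itself

def infer_density_thresholds(density, labels):
    """Step A4 sketch. Labels: 0 = normal, 1 = nuisance, 2 = true defect.

    Assumed rule: each threshold is the midpoint between adjacent labelled classes.
    """
    first = 0.5 * (density[labels == 0].min() + density[labels == 1].max())
    second = 0.5 * (density[labels == 1].min() + density[labels == 2].max())
    return first, second

def classify_regions(density, first, second):
    """Step A3: density > first is normal, second < density <= first is nuisance, else defect."""
    regions = np.full(density.shape, 2)
    regions[density > second] = 1
    regions[density > first] = 0
    return regions

# Toy feature-space data: a dense cluster, a sparse group, and one isolated point.
points = np.array([
    [0.00, 0.00], [0.10, 0.00], [0.00, 0.10], [0.10, 0.10], [0.05, 0.05], [0.05, 0.00],
    [1.00, 1.00], [1.20, 1.00], [1.00, 1.20],
    [5.00, 5.00],
])
labels = np.array([0, 0, 0, 0, 0, 0, 1, 1, 1, 2])

density = local_density(points, radius=0.5)
first, second = infer_density_thresholds(density, labels)
regions = classify_regions(density, first, second)
```

With these toy labels the inferred thresholds reproduce the annotated regions; real data would generally need a more robust threshold rule than a midpoint between class extremes.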
More specifically, for a clearer understanding of the present invention, wafer macro-defect inspection is taken as an example below to describe in detail how the data segmentation method is used to obtain the defect distribution boundary information and how reverse derivation yields the parameter setting values of the detection process.
Refer to FIG. 10, which schematically shows the distribution of true defects over the average gray-level range of the standard segmentation axis according to one implementation of this embodiment. As shown in FIG. 10, it is assumed that defects exist in every gray-level range of the standard segmentation axis (the segmentation data axis, corresponding to the vertical axis feature2 in the figure), and the true defect data and nuisance data have been annotated. The standard segmentation axis is the mean map obtained statistically from N standard defect-free process data images (N can be set as required, for example N=10 in FIG. 10; the present invention does not limit this), that is, the pixel-wise average gray level of the N standard images is used as the final result. Specifically, refer to FIG. 11(a) to FIG. 11(c) and FIG. 12, where FIG. 11(a) shows examples of multiple test images according to one embodiment of the present invention, FIG. 11(b) shows the mean map generated from the test images in FIG. 11(a), FIG. 11(c) shows the standard deviation map generated from the test images in FIG. 11(a), and FIG. 12 is a schematic diagram of the gray-level dynamic threshold applied in the present invention. In the figures, pixel A is a pixel in a test image, and pixels A1 and A2 are the pixels corresponding to pixel A in the mean map and the standard deviation map, respectively.
a. Select samples: select samples (as shown in FIG. 11(a)), and obtain the mean map and the standard deviation map statistically (by training) from the N test images.
b. Determine the feature data axis and the segmentation data axis. Referring to FIG. 10, feature1 is the feature data axis and feature2 is the segmentation data axis. From the gray value of the test image and the gray value on the segmentation data axis, the value of each test image on the feature data axis feature1 is calculated as follows:
feature1=test-mean         (5)
In the formula, feature1 is the feature data axis in FIG. 10, test is the gray value of the test image, and mean is the gray value of the mean map obtained statistically from the N test images.
As described above, the segmentation data axis feature2 is obtained by:
feature2=mean           (6)
In the formula, mean is the gray value of the mean map obtained statistically from the N test images.
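Formulas (5) and (6) can be sketched with NumPy as follows. The function names are hypothetical, and the use of the population standard deviation (NumPy's default, `ddof=0`) is an assumption, since the text does not say which estimator is used for the standard deviation map.

```python
import numpy as np

def build_reference_maps(images):
    """Pixel-wise mean map and standard deviation map of N equally sized gray images."""
    stack = np.stack(images).astype(np.float64)
    # Assumption: population standard deviation (ddof=0).
    return stack.mean(axis=0), stack.std(axis=0)

def pixel_features(test_image, mean_map):
    """feature1 = test - mean (formula 5); feature2 = mean (formula 6)."""
    feature1 = test_image.astype(np.float64) - mean_map
    feature2 = mean_map
    return feature1, feature2

# Toy example with N = 2 images of 2 x 2 pixels.
images = [np.full((2, 2), 100), np.full((2, 2), 110)]
mean_map, std_map = build_reference_maps(images)

test_image = np.array([[105, 120], [90, 105]])
feature1, feature2 = pixel_features(test_image, mean_map)
```

Each pixel of the test image then becomes one point (feature1, feature2) in the feature distribution map of FIG. 10.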
c. Assume that the defect distribution boundary information (threshold) is expressed as:
defect_threshold=mean+/-(sigma*std+gray)      (7)
In the formula, mean is the gray value of the mean map obtained statistically from the N test images, std is the standard deviation corresponding to a pixel in the test image, sigma is the coefficient of the standard deviation and is a parameter to be solved, and gray is the dynamic threshold. The dynamic threshold gray corresponds to the displacement parameter offset_parameter above and may be defined as an arbitrary curve; it is related to the mean gray value mean as follows:
gray=b+a1*mean+a2*mean^2+a3*mean^3+……+am*mean^m   (8)
When only the first two terms of the polynomial are kept, the dynamic threshold gray=b+a1*mean is a straight line; when higher-order terms are included, it becomes a curve. Substituting the corresponding values of multiple points into formula (7) and rearranging yields a system of equations in sigma and the coefficients b, a1, …, am.
The problem of inferring the detection parameters backwards is thus converted into finding the optimal solution of the above system of equations by the least squares method, where sigma is the coefficient of the standard deviation std in the assumed boundary threshold formula, and [b a1 a2 … am] are all the coefficients of the curve to be fitted.
d. Solve for the coefficients of the polynomial
Solving the system of equations gives the values of sigma and of the coefficients [b a1 a2 … am], so that all parameters involved in the algorithm are resolved.
Converting the above system of equations into matrix form gives the following, where in this matrix notation b denotes the right-hand-side vector rather than the constant term of formula (8):
Ax=b
A'Ax=A'b
x=(A'A)^(-1)*(A'b)
Thus x is the final solution; the matrix operation above yields the vector containing sigma and the coefficients of the dynamic threshold curve.
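The least-squares step can be sketched as below. The exact arrangement of the system is not reproduced in the text, so this sketch assumes each annotated boundary point contributes one row of the form margin = sigma*std + b + a1*mean + … + am*mean^m, where `margin` (a hypothetical name) is the distance of the boundary from the mean gray value. `np.linalg.lstsq` is used in place of the explicit inverse (A'A)^(-1); it solves the same least-squares problem and is numerically safer.

```python
import numpy as np

def fit_threshold_parameters(std, mean, margin, degree=1):
    """Fit sigma and [b, a1, ..., am] so that margin ~= sigma*std + polynomial(mean).

    Assumed row layout of the design matrix A: [std, 1, mean, mean^2, ..., mean^degree].
    """
    std = np.asarray(std, dtype=np.float64)
    mean = np.asarray(mean, dtype=np.float64)
    margin = np.asarray(margin, dtype=np.float64)
    columns = [std] + [mean ** k for k in range(degree + 1)]
    A = np.column_stack(columns)
    x, *_ = np.linalg.lstsq(A, margin, rcond=None)
    return x[0], x[1:]  # sigma, then [b, a1, ..., am]

# Synthetic boundary points generated with sigma = 3, b = 2, a1 = 0.1.
std_values = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
mean_values = np.array([10.0, 50.0, 20.0, 80.0, 40.0])
margins = 3.0 * std_values + 2.0 + 0.1 * mean_values

sigma, coeffs = fit_threshold_parameters(std_values, mean_values, margins, degree=1)
```

For higher polynomial degrees with gray levels up to 255, the columns of A differ by many orders of magnitude, so rescaling mean before fitting may be advisable.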
From this vector, the coefficient sigma of the standard deviation in the detection process and the coefficients required for the dynamic threshold curve are obtained, and hence the curve of the dynamic threshold gray. The true defect threshold of each pixel in the test image is therefore obtained in the detection process by:
defect_threshold=std*sigma+gray         (9)
That is, pixels greater than the threshold defect_threshold are normal points, and pixels less than or equal to the threshold defect_threshold are defect points.
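Formula (9) can be evaluated for every pixel at once, as in the following sketch. The function name and the toy values are assumptions; only the threshold map itself is computed here, using `numpy.polynomial.polynomial.polyval`, which expects coefficients in increasing order [b, a1, …, am] as in formula (8).

```python
import numpy as np
from numpy.polynomial import polynomial as P

def defect_threshold_map(mean_map, std_map, sigma, poly_coeffs):
    """Per-pixel threshold: std*sigma + gray, with gray given by formula (8)."""
    gray = P.polyval(mean_map, poly_coeffs)  # b + a1*mean + ... + am*mean^m
    return std_map * sigma + gray            # formula (9)

# Toy 2 x 2 maps with sigma = 3 and gray = 2 + 0.1*mean.
mean_map = np.array([[10.0, 20.0], [30.0, 40.0]])
std_map = np.array([[1.0, 2.0], [3.0, 4.0]])
threshold = defect_threshold_map(mean_map, std_map, sigma=3.0, poly_coeffs=[2.0, 0.1])
```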
Specifically, refer to FIG. 11(d) and FIG. 11(e), where FIG. 11(d) is an enlarged example of one of the test images and FIG. 11(e) shows the defect positions detected using a machine learning algorithm. Comparing FIG. 11(d) with FIG. 11(e) shows that the detection recipe obtained with the detection recipe setting and optimization method provided by the present invention can accurately detect the true defects of the object under inspection.
2. Outlier statistical analysis strategy based on a Gaussian model: reverse derivation of a new data process and parameter setting values
To facilitate understanding of the present invention, before the reverse derivation based on outlier learning for obtaining a new data process and parameter setting values is described in detail, the core idea of the Gaussian-model-based outlier statistical analysis strategy is explained. The basic principle of the method is to assume that all data points (detection result data) in the feature distribution map follow a Gaussian distribution. The parameters required by the detection model (the strategy of the detection process), such as the mean, the variance and the variance coefficient, are then inferred backwards from the defect distribution boundary information obtained in the outlier statistical analysis, giving the parameters needed for detection with the Gaussian model. Similarly to the procedure for the data segmentation method, the reverse derivation of a new data process and parameter setting values based on the Gaussian model includes the following steps:
Step B1: The preset outlier statistical analysis strategy is the Gaussian-model-based outlier statistical analysis strategy;
Step B2: According to the Gaussian-model-based outlier statistical analysis strategy, take the acquisition of the Gaussian distribution of the detection result data of the detection object as the reverse derivation strategy, and take Gaussian model detection as the strategy of the detection recipe;
Step B3: According to the reverse derivation strategy of computing the Gaussian distribution, take all detection result data of the detection object and the defect distribution boundary information as the input data information;
Step B4: Based on all detection result data, assume that the data distribution density in the feature space of the feature values of the feature data information of all detection result data follows a Gaussian distribution;
Step B5: Determine the parameters of the Gaussian model detection from the input data information and the defect distribution boundary information.
More specifically, this is expressed by the following functional relations:
boundary_threshold=f2(inspection_data)     (2)
μ=f4(inspection_data)           (10)
∑=f5(inspection_data,μ)          (11)
∏=f6(boundary_threshold,inspection_data,μ,∑)      (12)
In the formulas, boundary_threshold is the defect distribution boundary obtained by the outlier algorithm; this boundary matrix is already available. The mean μ is obtained from the detection result data as the average gray level of the current inspection image. The variance ∑ is obtained by summing the squared differences between the gray values of the pixels of the image under inspection and the mean μ, and then averaging. The weight ∏ can be expressed as a function of boundary_threshold, inspection_data, μ and ∑; it acts as the coefficient of the variance ∑, according to:
μ+∏*∑=boundary_threshold        (13)
In formula (13), μ, the variance ∑ and the boundary boundary_threshold have all been calculated, so the weight ∏ can be obtained by solving the equation.
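A minimal sketch of formulas (10), (11) and (13) follows. The function name is hypothetical, and boundary_threshold is treated here as a single scalar for simplicity, whereas the text describes it as a boundary matrix; the same arithmetic would apply element-wise.

```python
import numpy as np

def gaussian_recipe_parameters(inspection_data, boundary_threshold):
    """Return (mu, variance, weight) with mu + weight * variance == boundary_threshold."""
    data = np.asarray(inspection_data, dtype=np.float64)
    mu = data.mean()                                # formula (10): mean gray level
    variance = ((data - mu) ** 2).mean()            # formula (11): mean squared deviation
    weight = (boundary_threshold - mu) / variance   # formula (13) solved for the weight
    return mu, variance, weight

# Toy gray values and a boundary taken from the outlier analysis (assumed value).
mu, variance, weight = gaussian_recipe_parameters([98, 100, 102, 100], boundary_threshold=106.0)
```

The division is undefined when all gray values are identical (zero variance), so a practical implementation would need to guard that case.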
3. Outlier statistical analysis strategy based on machine learning: reverse derivation of a new data process and parameter setting values
Preferably, in one exemplary implementation, the reverse derivation of a new data process and parameter setting values based on the machine-learning-based outlier statistical analysis strategy includes the following steps:
Step C1: The preset outlier statistical analysis strategy is the machine-learning-based outlier statistical analysis strategy;
Step C2: According to the machine-learning-based outlier statistical analysis strategy, take the acquisition of the density threshold and the distance threshold of the detection result data of the detection object as the reverse derivation strategy, and take the machine learning model as the strategy of the detection recipe;
Step C3: According to the reverse derivation strategy of acquiring the density threshold and the distance threshold of the detection result data of the detection object, take the acquired density and distance of the detection result data of the detection object as the input data information;
Step C4: From all detection result data and the defect distribution boundary information, infer backwards the density parameter and the distance parameter of the detection strategy of the machine learning model.
As those skilled in the art will appreciate, a machine learning model requires several parameters to be specified, and for a machine-learning-based outlier statistical analysis algorithm the choice of these parameters directly affects the detection accuracy; examples are the initial cluster centers in the k-means algorithm and the neighborhood and count threshold in the DBSCAN algorithm. By inferring these machine learning parameters backwards from the defect distribution boundary information (the result) of the outlier statistical analysis, a machine learning model with prior knowledge is obtained, which improves the detection accuracy of the model. Specifically, this can be expressed as:
boundary_threshold=f7(inspection_data)
density_parameters=f8(boundary_threshold,inspection_data)
distance_parameters=f9(boundary_threshold,inspection_data)
In the formulas, boundary_threshold is the defect distribution boundary information obtained by the outlier algorithm; it depends on the detection result data and has already been obtained in the defect boundary analysis process. The two key parameters of a clustering algorithm based on distance and density are the density density_parameters and the distance distance_parameters, both derived from the detection result data and the boundary matrix. The distance and density parameters are inferred backwards so that defects fall just outside the preset threshold and can be detected, while normal pixels fall within the higher-density threshold range and are filtered out, thereby improving the detection accuracy.
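The functions f8 and f9 are not specified in the text. The following sketch shows one possible way to infer a distance parameter for a given density parameter k from annotated data, in the spirit of the DBSCAN neighborhood and count threshold: the radius is chosen so that every normal point has at least k neighbors within it and every defect point has fewer. The midpoint rule and all names are assumptions.

```python
import numpy as np

def kth_neighbor_distance(points, k):
    """Distance from each point to its k-th nearest other point."""
    diffs = points[:, None, :] - points[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=-1))
    # After sorting each row, column 0 is the zero distance of a point to itself.
    return np.sort(dists, axis=1)[:, k]

def infer_distance_parameter(points, is_defect, k):
    """Radius separating normal from defect points for density parameter k (assumed rule)."""
    kth = kth_neighbor_distance(points, k)
    normal_max = kth[~is_defect].max()
    defect_min = kth[is_defect].min()
    if defect_min <= normal_max:
        raise ValueError("labels are not separable for this k")
    return 0.5 * (normal_max + defect_min)

# Toy data: four clustered normal points and one isolated, annotated defect.
points = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [0.1, 0.1], [1.0, 1.0]])
is_defect = np.array([False, False, False, False, True])

k = 2
eps = infer_distance_parameter(points, is_defect, k)
```

A radius and count found this way could then seed a density-based clustering step, subject to validation on data that were not used for the inference.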
Preferably, in one exemplary implementation, referring again to FIG. 1, the detection recipe setting and optimization method further includes:
S500: Perform defect analysis on the object to be inspected according to the detection recipe and the values of the detection parameters of the detection recipe, to obtain defect data information of the object to be inspected.
Refer to FIG. 13, which schematically compares the detection result data obtained with the detection process derived by the detection recipe setting and optimization method of the present invention against the detection result data obtained with the original detection process. As can be seen from FIG. 13, when the strategy and parameter setting values of the detection process obtained by reverse derivation are used for detection, the nuisance data are filtered out and the true defect data are retained; the correctness of the result can be checked visually from the distribution of the detection result data in the feature space.
In summary, in the detection recipe setting and optimization method provided by the present invention, the first data sample includes several pieces of detection result data, which carry a large amount of auxiliary tuning information. Annotating these data provides an important basis for subsequently exploiting historical information in data analysis and inference so that accurate prior knowledge can be obtained, which improves the detection accuracy of the detection recipe. Further, the strategy and parameter setting values of the detection recipe are obtained by reverse derivation from the defect distribution boundary information and the preset outlier statistical analysis strategy. Reverse derivation thus infers a complete set of detection parameters simultaneously (all parameters are tuned together), takes the coupling between parameters into account, and enables rapid modeling of the detection process. It avoids repeated parameter adjustment and significantly saves labor and time; moreover, for defect detection on a new process, the strategy and parameter setting values of the detection process can be determined even by a user with no background in the algorithms.
Yet another embodiment of the present invention provides a detection recipe setting and optimization device. Refer to FIG. 14, which schematically shows a structural block diagram of the device provided by this embodiment. As can be seen from FIG. 14, the detection recipe setting and optimization device provided by this embodiment includes: a true defect and nuisance marking unit 100, a feature distribution information acquisition unit 200, a defect distribution boundary acquisition unit 300, and a detection parameter setting and optimization unit 400.
Specifically, the true defect and nuisance marking unit 100 is configured to annotate a first data sample to obtain a second data sample, where the first data sample includes several pieces of detection result data and the second data sample includes the detection result data and the label corresponding to each piece of detection result data. The feature distribution information acquisition unit 200 is configured to obtain data feature distribution information of the detection object from the second data sample. The defect distribution boundary acquisition unit 300 is configured to perform outlier statistical analysis on the data feature distribution information using a preset outlier statistical analysis strategy to obtain defect distribution boundary information, and to determine the detection recipe according to the preset outlier statistical analysis strategy. The detection parameter setting and optimization unit 400 is configured to set or optimize the values of the detection parameters of the detection recipe by reverse derivation, according to the defect distribution boundary information and the preset outlier statistical analysis strategy.
Preferably, in one exemplary implementation, the detection recipe setting and optimization device further includes a detection recipe application unit 500. Specifically, the detection recipe application unit 500 is configured to perform defect analysis on the object to be inspected according to the detection recipe and the values of the detection parameters of the detection recipe, to obtain defect data information of the object to be inspected.
Because the basic principle of the detection recipe setting and optimization device provided by the present invention is similar to that of the detection recipe setting and optimization method provided by the above implementations, the device is described only briefly here to avoid repetition; for details, refer to the detailed description of the method above. Further, because the device and the method belong to the same inventive concept, the device has at least the same beneficial effects as the method, which are likewise described above and are not repeated here.
Based on the same inventive concept, the present invention further provides an electronic device. Refer to FIG. 15, which schematically shows a block diagram of the electronic device provided by one embodiment of the present invention. As shown in FIG. 15, the electronic device includes a processor 601 and a memory 603; a computer program is stored in the memory 603 and, when executed by the processor 601, implements the detection recipe setting and optimization method described above. Because the electronic device belongs to the same inventive concept as the method described above, it has all the advantages of that method, which are not repeated here.
As shown in FIG. 15, the electronic device further includes a communication interface 602 and a communication bus 604. The processor 601, the communication interface 602, and the memory 603 communicate with one another over the communication bus 604. The communication bus 604 may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like, and may be divided into an address bus, a data bus, a control bus, and so on. For ease of illustration, the bus is drawn as a single thick line in the figure, but this does not mean that there is only one bus or only one type of bus. The communication interface 602 is used for communication between the electronic device and other devices.
The processor 601 referred to in the present invention may be a central processing unit (CPU), another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. The general-purpose processor may be a microprocessor or any conventional processor. The processor 601 is the control center of the electronic device and connects the various parts of the electronic device through various interfaces and lines.
The memory 603 may be used to store the computer program. The processor 601 implements the various functions of the electronic device by running or executing the computer program stored in the memory 603 and by calling data stored in the memory 603.
The memory 603 may include non-volatile and/or volatile memory. Non-volatile memory may include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory may include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).
The present invention further provides a readable storage medium storing a computer program which, when executed by a processor, implements the detection recipe setting and optimization method described above. Because the readable storage medium and the method belong to the same inventive concept, the readable storage medium has all the advantages of the method, which are not repeated here.
The readable storage medium of the embodiments of the present invention may be any combination of one or more computer-readable media. A readable medium may be a computer-readable signal medium or a computer-readable storage medium. A computer-readable storage medium may be, for example but not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples (a non-exhaustive list) of computer-readable storage media include: an electrical connection having one or more wires, a portable computer disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above. In this document, a computer-readable storage medium may be any tangible medium that contains or stores a program for use by or in combination with an instruction execution system, apparatus, or device.
A computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, in which computer-readable program code is carried. Such a propagated data signal may take many forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the above. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium that can send, propagate, or transmit a program for use by or in combination with an instruction execution system, apparatus, or device.
Computer program code for carrying out the operations of the present invention may be written in one or more programming languages or a combination thereof, including object-oriented programming languages such as Java, Smalltalk, and C++, and conventional procedural programming languages such as the "C" language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. Where a remote computer is involved, the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider).
In summary, compared with the prior art, the detection recipe setting and optimization method, apparatus, electronic device, and storage medium provided by the present invention have the following advantages. The first data sample includes several pieces of detection result data, and the detection result data include auxiliary parameter-tuning information (for example, the basic information and feature data information of the detection object, where the feature data information includes but is not limited to the grayscale, shape, and texture of the defect indicated by the detection result). Data labeling distinguishes true defect data from nuisance data, which provides an important basis for subsequently using historical information effectively in data analysis and inference to obtain accurate prior knowledge, and can improve the detection accuracy of the detection recipe. Furthermore, in the method provided by the present invention, the strategy of the detection recipe and the values of its detection parameters are obtained by reverse derivation from the defect distribution boundary information and the preset outlier statistical analysis strategy. Reverse derivation can therefore infer a complete set of detection parameters at once (all parameters are tuned simultaneously) while taking the coupling between parameters into account. This enables rapid modeling of the detection recipe, avoids repeated parameter adjustment, and can significantly reduce labor and time costs. Moreover, for defect detection in a new process, the strategy of the detection recipe and the values of its detection parameters can be determined even by a user without an algorithm background.
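As a purely illustrative sketch, and not part of the disclosed embodiments, the reverse derivation for a Gaussian-model detection strategy might look as follows in Python. The sketch assumes a single feature axis, assumes that the defect distribution boundary is already available as one feature value, and assumes that true defects lie above that boundary. The function names `derive_gaussian_threshold` and `detect` are hypothetical.

```python
from statistics import mean, pstdev


def derive_gaussian_threshold(values, boundary):
    """Back-derive the sigma multiplier k so that mu + k * sigma equals the boundary.

    values   -- feature values of all detection results (assumed roughly Gaussian)
    boundary -- feature value separating true defects from nuisance (assumed known)
    """
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        raise ValueError("feature values have zero spread; no threshold can be derived")
    k = (boundary - mu) / sigma
    return mu, sigma, k


def detect(values, mu, sigma, k):
    """Return the values whose standardized deviation from the mean exceeds k."""
    return [v for v in values if (v - mu) / sigma > k]


# Hypothetical feature values; the boundary of 20.0 is assumed to come from
# an earlier outlier analysis on labeled data.
values = [10, 12, 11, 9, 10, 11, 30, 32]
mu, sigma, k = derive_gaussian_threshold(values, 20.0)
print(detect(values, mu, sigma, k))  # -> [30, 32]
```

Here the detection parameter `k` is computed directly from the boundary rather than tuned by trial and error, which is the sense in which the parameter value is "derived in reverse".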
It should be noted that the apparatus and methods disclosed in the embodiments herein may also be implemented in other ways. The apparatus embodiments described above are merely illustrative. For example, the flowcharts and block diagrams in the drawings show the architecture, functions, and operations of possible implementations of apparatus, methods, and computer program products according to the various embodiments herein. In this regard, each block in a flowchart or block diagram may represent a module, a program segment, or a portion of code that contains one or more executable instructions for implementing the specified logical function. It should also be noted that, in some alternative implementations, the functions noted in the blocks may occur in an order different from that noted in the drawings. For example, two consecutive blocks may in fact be executed substantially in parallel, or they may sometimes be executed in the reverse order, depending on the functions involved. It should also be noted that each block in the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, may be implemented by a dedicated hardware-based system that performs the specified functions or actions, or by a combination of dedicated hardware and computer instructions.
In addition, the functional modules in the embodiments herein may be integrated together to form an independent part, each module may exist separately, or two or more modules may be integrated to form an independent part.
The above description is only a description of preferred embodiments of the present invention and does not limit the scope of the present invention in any way. Any changes or modifications made by those of ordinary skill in the art based on the above disclosure fall within the scope of protection of the present invention. Obviously, those skilled in the art can make various changes and variations to the present invention without departing from its spirit and scope. Thus, if these changes and variations fall within the scope of the present invention and its technical equivalents, the present invention is intended to include them as well.

Claims (18)

  1. A detection recipe setting and optimization method, characterized by comprising:
    labeling a first data sample to obtain a second data sample, wherein the first data sample comprises several pieces of detection result data, and the second data sample comprises the detection result data and a label corresponding to each piece of detection result data;
    obtaining data feature distribution information of a detection object according to the second data sample;
    performing outlier statistical analysis on the data feature distribution information using a preset outlier statistical analysis strategy to obtain defect distribution boundary information, and determining a detection recipe according to the preset outlier statistical analysis strategy; and
    setting or optimizing values of detection parameters of the detection recipe by reverse derivation according to the defect distribution boundary information and the preset outlier statistical analysis strategy.
  2. The detection recipe setting and optimization method according to claim 1, characterized in that the detection result data comprise basic information and feature data information of the detection object, wherein the feature data information comprises position information of a detection result on the detection object, and one or more of process flow information of the detection object and grayscale information, shape information, and texture information of the detection result;
    labeling the first data sample to obtain the second data sample comprises:
    obtaining the basic information of the detection object corresponding to each piece of detection result data in the first data sample;
    for each piece of detection result data, obtaining original information corresponding to that piece of detection result data on the detection object according to the basic information of the detection object and the position information of the detection result on the detection object;
    determining, according to the original information, whether the defect indicated by the detection result data is a true defect; if so, labeling that piece of detection result data as true defect data; if not, labeling that piece of detection result data as nuisance data; and
    obtaining the second data sample from all the detection result data and the label corresponding to each piece of detection result data.
  3. The detection recipe setting and optimization method according to claim 2, characterized in that the detection object comprises a wafer; basic information of the wafer comprises a wafer number, the number of dies the wafer contains, and basic information of each die; and the basic information of a die comprises a die number and image information of that die;
    obtaining the original information corresponding to that piece of detection result data on the detection object according to the basic information of the detection object and the position information of the detection result on the detection object comprises:
    obtaining, according to the basic information of the wafer, the die number of each die of the wafer and the basic information of each die; and
    obtaining, according to the position information of the detection result on the die and the image information of the die, image information of the detection result corresponding to that piece of detection result data on the die.
  4. The detection recipe setting and optimization method according to claim 1, characterized in that obtaining the data feature distribution information of the detection object according to the second data sample comprises:
    determining a feature data axis and a segmentation data axis, and establishing a feature space from the feature data axis and the segmentation data axis, wherein the feature data axis represents feature data information of the detection result data, and the segmentation data axis represents segmentation feature information, the segmentation feature information comprising feature data information other than that used for the feature data axis; and
    arranging the second data sample according to the feature space to obtain the data feature distribution information of the detection object.
  5. The detection recipe setting and optimization method according to claim 4, characterized in that the feature space comprises one or more feature data axes and one or more segmentation data axes.
  6. The detection recipe setting and optimization method according to claim 4, characterized in that arranging the second data sample according to the feature space to obtain the data feature distribution information of the detection object comprises:
    establishing a rectangular coordinate system with the feature data axis as the horizontal axis and the segmentation data axis as the vertical axis; and
    within the rectangular coordinate system, arranging the second data sample along the horizontal axis according to the feature values of the feature data information represented by the feature data axis and along the vertical axis according to the feature values of the feature data information represented by the segmentation data axis, to obtain a defect feature distribution map.
  7. The detection recipe setting and optimization method according to claim 6, characterized in that performing outlier statistical analysis on the data feature distribution information using the preset outlier statistical analysis strategy to obtain the defect distribution boundary information comprises:
    determining whether the defect distribution boundary information is to be found automatically; if so, training a selected outlier statistical analysis model to obtain the defect distribution boundary information; if not, performing outlier statistical analysis on the data feature distribution information using a data segmentation method to obtain the defect distribution boundary information;
    wherein training the outlier statistical analysis model comprises: training the selected outlier statistical analysis model according to the detection result data and the data feature distribution information until the obtained defect distribution boundary information of the detection object satisfies a first preset condition; and
    performing outlier statistical analysis on the data feature distribution information using the data segmentation method comprises: obtaining at least one first segmentation threshold on the feature data axis and/or the segmentation data axis according to the detection result data and the data feature distribution information, and obtaining the defect boundary information according to the first segmentation threshold, until the obtained defect distribution boundary information of the detection object satisfies a second preset condition.
  8. The detection recipe setting and optimization method according to claim 7, characterized in that the segmentation data axis represents process flow information; and performing threshold segmentation on the feature data axis and/or the segmentation data axis according to the detection result data and the data feature distribution information until the obtained defect distribution boundary information of the detection object satisfies the second preset condition comprises:
    determining a first segmentation threshold of the segmentation data axis according to the data feature distribution information and the consistency of the distributions of the detection result data labeled as true defect data and the detection result data labeled as nuisance data;
    determining a second segmentation threshold of the feature data axis according to the data feature distribution information and the consistency of the distributions of the detection result data labeled as true defect data and the detection result data labeled as nuisance data; and
    obtaining the defect distribution boundary information of the detection object according to the first segmentation threshold of the segmentation data axis and the second segmentation threshold of the feature data axis.
  9. The detection recipe setting and optimization method according to claim 7, characterized in that the preset outlier statistical analysis strategy further comprises an outlier statistical analysis strategy combining data segmentation and model learning;
    the outlier statistical analysis strategy combining data segmentation and model learning comprises: obtaining, according to the data feature distribution information, at least one first segmentation threshold on the segmentation data axis for the detection result data labeled as true defects; and training the selected outlier statistical analysis model according to the first segmentation threshold and the data feature distribution information until the obtained defect distribution boundary information of the detection object satisfies a third preset condition.
  10. The detection recipe setting and optimization method according to claim 1, characterized in that setting or optimizing the values of the detection parameters of the detection recipe by reverse derivation according to the defect distribution boundary information and the preset outlier statistical analysis strategy comprises:
    determining a reverse derivation strategy according to the preset outlier statistical analysis strategy;
    determining input data information of the reverse derivation strategy according to the reverse derivation strategy;
    determining a data distribution model of the detection result data according to the input data information;
    determining the detection parameters of the detection recipe according to the data distribution model and the defect distribution boundary information; and
    setting or optimizing the values of the detection parameters of the detection recipe according to the strategy of the detection recipe and the input data information of the reverse derivation.
  11. The detection recipe setting and optimization method according to claim 10, characterized in that the preset outlier statistical analysis strategy is a data segmentation method;
    according to the data segmentation method, computing the data distribution density of the detection result data of the detection object is taken as the reverse derivation strategy;
    according to the reverse derivation strategy of computing the data distribution density, all detection result data of the detection object are taken as the input data information;
    according to all the detection result data, the data distribution density, in the feature space, of the feature values of the feature data information of all the detection result data is assumed to be divided into a normal region, a nuisance region, and a true defect region, wherein the normal region is a region whose data distribution density is greater than a first density threshold, the nuisance region is a region whose data density is less than or equal to the first density threshold and greater than a second density threshold, and the true defect region is a region whose data density is less than or equal to the second density threshold;
    the first density threshold and the second density threshold are calculated according to all the detection result data and the labels of all the detection result data, wherein the first density threshold is greater than the second density threshold; and
    a displacement parameter of the detection recipe is calculated according to the first density threshold, the second density threshold, and the defect distribution boundary information.
  12. The detection recipe setting and optimization method according to claim 10, characterized in that the preset outlier statistical analysis strategy is an outlier statistical analysis strategy based on a Gaussian model;
    according to the outlier statistical analysis strategy based on the Gaussian model, obtaining the Gaussian distribution of the detection result data of the detection object is taken as the reverse derivation strategy, and Gaussian model detection is taken as the strategy of the detection recipe;
    according to the reverse derivation strategy of computing the Gaussian distribution, all detection result data of the detection object and the defect distribution boundary information are taken as the input data information;
    according to all the detection result data, the data distribution density, in the feature space, of the feature values of the feature data information of all the detection result data is assumed to follow a Gaussian distribution; and
    parameters of the Gaussian model detection are determined according to the input data information and the defect distribution boundary information.
  13. The detection recipe setting and optimization method according to claim 10, characterized in that the preset outlier statistical analysis strategy is a machine learning outlier statistical analysis strategy;
    according to the machine learning outlier statistical analysis strategy, obtaining a density threshold and a distance threshold of the detection result data of the detection object is taken as the reverse derivation strategy, and a machine learning model is taken as the strategy of the detection recipe;
    according to the reverse derivation strategy of obtaining the density threshold and the distance threshold of the detection result data of the detection object, the obtained density and distance of the detection result data of the detection object are taken as the input data information; and
    a density parameter and a distance parameter of the detection strategy of the machine learning model are derived in reverse according to all the detection result data and the defect boundary distribution information.
  14. The detection recipe setting and optimization method according to any one of claims 1 to 13, characterized by further comprising:
    performing defect analysis on an object to be detected according to the detection recipe and the values of the detection parameters of the detection recipe, to obtain defect data information of the object to be detected.
  15. A detection recipe setting and optimization apparatus, characterized by comprising:
    a true defect and nuisance labeling unit configured to label a first data sample to obtain a second data sample, wherein the first data sample comprises several pieces of detection result data, and the second data sample comprises the detection result data and a label corresponding to each piece of detection result data;
    a feature distribution information obtaining unit configured to obtain data feature distribution information of a detection object according to the second data sample;
    a defect distribution boundary obtaining unit configured to perform outlier statistical analysis on the data feature distribution information using a preset outlier statistical analysis strategy to obtain defect distribution boundary information, and to determine a detection recipe according to the preset outlier statistical analysis strategy; and
    a detection parameter setting and optimization unit configured to set or optimize values of detection parameters of the detection recipe by reverse derivation according to the defect distribution boundary information and the preset outlier statistical analysis strategy.
  16. The detection formula setting and optimization apparatus according to claim 15, further comprising:
    a detection formula application unit, configured to perform defect analysis on an object to be detected according to the detection formula and the values of the detection parameters of the detection formula, to obtain defect data information of the object to be detected.
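
Applying a configured detection formula to new data, as in claims 14 and 16, might look like the following sketch. The `Recipe` class, its fields `mu`, `sigma` and `k`, and the threshold rule are hypothetical stand-ins. The claims do not fix a concrete formula or parameter set.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Recipe:
    """Hypothetical detection formula: flag x when x > mu + k * sigma."""
    mu: float
    sigma: float
    k: float

    def is_defect(self, feature: float) -> bool:
        return feature > self.mu + self.k * self.sigma


def analyze(recipe, features):
    """Return (index, feature) pairs that the recipe flags as defects."""
    return [(i, x) for i, x in enumerate(features) if recipe.is_defect(x)]


recipe = Recipe(mu=2.0, sigma=1.0, k=3.0)         # threshold is 5.0
defects = analyze(recipe, [1.5, 4.9, 5.1, 9.0])   # features of the object to be detected
```

The returned list plays the role of the "defect data information": each entry records which measurement was flagged and its feature value.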
  17. An electronic device, comprising a processor and a memory, wherein a computer program is stored in the memory, and the computer program, when executed by the processor, implements the detection formula setting and optimization method according to any one of claims 1 to 14.
  18. A readable storage medium, wherein a computer program is stored in the readable storage medium, and the computer program, when executed by a processor, implements the detection formula setting and optimization method according to any one of claims 1 to 14.
PCT/CN2023/091070 2022-04-29 2023-04-27 Detection formula configuration and optimization method and apparatus, electronic device and storage medium WO2023208091A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210474912.7A CN117058064A (en) 2022-04-29 2022-04-29 Method, device, electronic equipment and storage medium for setting and optimizing detection formula
CN202210474912.7 2022-04-29

Publications (1)

Publication Number Publication Date
WO2023208091A1 (en) 2023-11-02

Family

ID=88517858

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/091070 WO2023208091A1 (en) 2022-04-29 2023-04-27 Detection formula configuration and optimization method and apparatus, electronic device and storage medium

Country Status (3)

Country Link
CN (1) CN117058064A (en)
TW (1) TW202343613A (en)
WO (1) WO2023208091A1 (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2011191296A (en) * 2010-03-16 2011-09-29 Ngr Inc Pattern inspection device and method
CN108256632A (en) * 2018-01-29 2018-07-06 百度在线网络技术(北京)有限公司 Information processing method and device
CN111523576A (en) * 2020-04-13 2020-08-11 河海大学常州校区 Density peak value clustering outlier detection method suitable for electronic quality detection
CN111881299A (en) * 2020-08-07 2020-11-03 哈尔滨商业大学 Outlier event detection and identification method based on duplicate neural network
CN113822870A (en) * 2021-09-27 2021-12-21 陈博源 AI detection method for surface defects of electroluminescent semiconductor plate

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
CHEN JUN; PENG XIAO-QI; TANG XIU-MING; SONG YAN-PO; LIU ZHENG: "Support Vector Data Description Method with Local Optimization Boundary", ELECTRIC MACHINES AND CONTROL, vol. 19, no. 10, 15 October 2015 (2015-10-15), pages 93 - 99, XP009549884, ISSN: 1007-449X, DOI: 10.15938/j.emc.2015.10.014 *

Also Published As

Publication number Publication date
CN117058064A (en) 2023-11-14
TW202343613A (en) 2023-11-01

Similar Documents

Publication Publication Date Title
CN111241947B (en) Training method and device for target detection model, storage medium and computer equipment
CN111191566B (en) Optical remote sensing image multi-target detection method based on pixel classification
WO2020155518A1 (en) Object detection method and device, computer device and storage medium
CN107784288B (en) Iterative positioning type face detection method based on deep neural network
KR20190063839A (en) Method and System for Machine Vision based Quality Inspection using Deep Learning in Manufacturing Process
CN110930390B (en) Chip pin missing detection method based on semi-supervised deep learning
CN107545263B (en) Object detection method and device
US9846929B2 (en) Fast density estimation method for defect inspection application
WO2020156409A1 (en) Data processing method, defect detection method, computing apparatus, and storage medium
CN112241478B (en) Large-scale data visualization dimension reduction method based on graph neural network
CN111259710B (en) Parking space structure detection model training method adopting parking space frame lines and end points
WO2022082692A1 (en) Lithography hotspot detection method and apparatus, and storage medium and device
CN115880520A (en) Defect grade classification method and system based on template matching and defect segmentation
CN115249321A (en) Method for training neural network, system for training neural network and neural network
WO2023116632A1 (en) Video instance segmentation method and apparatus based on spatio-temporal memory information
CN112906816A (en) Target detection method and device based on optical differential and two-channel neural network
CN116128839A (en) Wafer defect identification method, device, electronic equipment and storage medium
CN110349070B (en) Short video watermark detection method
CN114972268A (en) Defect image generation method and device, electronic equipment and storage medium
WO2023208091A1 (en) Detection formula configuration and optimization method and apparatus, electronic device and storage medium
Wang et al. Welding seam detection and location: Deep learning network-based approach
Zhou et al. An adaptive clustering method detecting the surface defects on linear guide rails
CN112541884A (en) Defect detection method and apparatus, and computer-readable storage medium
CN117132564A (en) YOLOv 3-based sapphire substrate surface defect detection method and system
CN116385466A (en) Method and system for dividing targets in image based on boundary box weak annotation

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 23795523; Country of ref document: EP; Kind code of ref document: A1)