US20220108436A1 - Device and method for detecting defects on wafer - Google Patents
- Publication number
- US20220108436A1 (application US17/465,179)
- Authority
- US
- United States
- Prior art keywords
- image
- defect
- machine learning
- wafer
- combination
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- H—ELECTRICITY
- H01—ELECTRIC ELEMENTS
- H01L—SEMICONDUCTOR DEVICES NOT COVERED BY CLASS H10
- H01L22/00—Testing or measuring during manufacture or treatment; Reliability measurements, i.e. testing of parts without further processing to modify the parts as such; Structural arrangements therefor
- H01L22/30—Structural arrangements specially adapted for testing or measuring during manufacture or treatment, or specially adapted for reliability measurements
-
- H—ELECTRICITY
- H01—ELECTRIC ELEMENTS
- H01L—SEMICONDUCTOR DEVICES NOT COVERED BY CLASS H10
- H01L22/00—Testing or measuring during manufacture or treatment; Reliability measurements, i.e. testing of parts without further processing to modify the parts as such; Structural arrangements therefor
- H01L22/10—Measuring as part of the manufacturing process
- H01L22/12—Measuring as part of the manufacturing process for structural parameters, e.g. thickness, line width, refractive index, temperature, warp, bond strength, defects, optical inspection, electrical measurement of structural dimensions, metallurgic measurement of diffusions
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/0002—Inspection of images, e.g. flaw detection
- G06T7/0004—Industrial image inspection
- G06T7/0006—Industrial image inspection using a design-rule based approach
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/0002—Inspection of images, e.g. flaw detection
- G06T7/0004—Industrial image inspection
- G06T7/0008—Industrial image inspection checking presence/absence
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/10—Segmentation; Edge detection
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/97—Determining parameters from multiple pictures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/98—Detection or correction of errors, e.g. by rescanning the pattern or by human intervention; Evaluation of the quality of the acquired patterns
- G06V10/993—Evaluation of the quality of the acquired pattern
-
- H—ELECTRICITY
- H01—ELECTRIC ELEMENTS
- H01L—SEMICONDUCTOR DEVICES NOT COVERED BY CLASS H10
- H01L21/00—Processes or apparatus adapted for the manufacture or treatment of semiconductor or solid state devices or of parts thereof
- H01L21/67—Apparatus specially adapted for handling semiconductor or electric solid state devices during manufacture or treatment thereof; Apparatus specially adapted for handling wafers during manufacture or treatment of semiconductor or electric solid state devices or components ; Apparatus not specifically provided for elsewhere
- H01L21/67005—Apparatus not specifically provided for elsewhere
- H01L21/67242—Apparatus for monitoring, sorting or marking
- H01L21/67288—Monitoring of warpage, curvature, damage, defects or the like
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10024—Color image
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10056—Microscopic image
- G06T2207/10061—Microscopic image from scanning electron microscope
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20076—Probabilistic image processing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20081—Training; Learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20084—Artificial neural networks [ANN]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20212—Image combination
- G06T2207/20221—Image fusion; Image merging
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
- G06T2207/30108—Industrial image inspection
- G06T2207/30148—Semiconductor; IC; Wafer
Definitions
- Example embodiments of the present disclosure described herein relate to a semiconductor process technology, and more particularly, relate to a system and a method for inferring a defect on a wafer based on machine learning.
- circuit patterns are formed on a surface of the wafer by a process including applying photoresist (PR) onto the wafer on which an oxide film is deposited, and selectively emitting light onto the photoresist (e.g., through a mask containing the circuit patterns).
- the demand for higher degrees of integration of circuits has increased, while the pitch between the circuit patterns has decreased. As such, circuit designs have also become more complicated.
- a defect may occur on the wafer in an exposure step.
- the defect on the wafer may cause a fault of a semiconductor device manufactured by using the wafer. For this reason, the defect on the wafer may be perceived as a critical factor reducing the reliability and productivity of a semiconductor device. Accordingly, there is a great demand for a high-accuracy test process for inferring a wafer defect.
- Example embodiments of the present disclosure provide a system and a method for inferring a defect on a wafer based on machine learning without a separate module or detector.
- a wafer defect inference system includes a test equipment that receives a first image obtained by imaging circuit patterns formed on a semiconductor wafer by using a scanning electron microscope and a second image obtained by imaging a layout image of a mask for implementing the circuit pattern on the semiconductor wafer and combines the first image and the second image to generate a combination image, and at least one computing device that is capable of communicating with the test equipment and infers a defect associated with the circuit pattern formed on the semiconductor wafer.
- the computing device receives the combination image, performs machine learning for inferring the defect based on the combination image, and generates an output image including information about the defect based on the machine learning.
- an operating method of a device configured to infer a defect of circuit patterns formed on a semiconductor wafer includes combining a first image and a second image to generate a combination image, the first image including an imaging of the circuit pattern, and the second image including an imaging of a layout image of a mask for implementing the circuit pattern on the semiconductor wafer; generating, based on a machine learning operation of the device, an output image from the combination image, the output image including defect information about the defect from the combination image; and outputting the output image.
- a non-transitory computer-readable medium storing a program code including an image generation model executable by a processor, the program code, when executed, causing the processor to combine a first image and a second image to generate a combination image, the first image including an imaging of a circuit pattern formed on a semiconductor wafer, and the second image including an imaging of a layout image of a mask for implementing the circuit pattern on the semiconductor wafer; and to generate, based on machine learning, an output image from the combination image, the output image including defect information of the circuit pattern.
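The claimed flow (combine two images, run a learned model, return an output image carrying defect information) can be sketched as a minimal pipeline. The function name, the channel-wise stacking, and the `model` callable are hypothetical illustrations, not the patent's actual implementation; `model` stands in for any learned image-generation model.

```python
import numpy as np

def infer_defects(sem, cad, model):
    """Sketch of the claimed method: combine the SEM image and the CAD
    layout image into one combination image, then run a learned model
    that maps the combination image to an output image with defect
    information. `model` is any callable (hypothetical placeholder)."""
    combination = np.stack([sem, cad], axis=-1)  # channel-wise combination
    return model(combination)
```

For example, with a trivial stand-in model that averages the two channels, a 4x4 SEM image of zeros and a CAD image of ones yield a 4x4 output of 0.5 everywhere.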
- FIG. 1 is a block diagram illustrating a wafer defect inference system according to some embodiments.
- FIG. 2 is a block diagram illustrating a configuration of a computing device according to some embodiments.
- FIG. 3 is a diagram for describing how a neuromorphic processor according to some embodiments performs machine learning based on a GAN.
- FIG. 4 is a diagram for describing how a neuromorphic processor according to some embodiments of the present disclosure performs machine learning based on a CGAN.
- FIGS. 5A and 5B are flowcharts for describing a machine learning operation performed by a neuromorphic processor according to some embodiments.
- FIG. 6 is a diagram for describing how a discriminator network included in an image generation model executable by a neuromorphic processor, according to some embodiments, operates.
- FIG. 7 is a diagram for describing how a generator network included in an image generation model executable by a neuromorphic processor, according to some embodiments, operates.
- FIG. 8 is a diagram illustrating a combination image used in a wafer defect inference system according to some embodiments.
- FIG. 9 is a diagram for describing an output data model of a wafer defect inference system according to some embodiments.
- FIGS. 10A and 10B are diagrams indicating simulation results of a wafer defect inference system according to some embodiments.
- FIG. 11 is a flowchart for describing an operating method of a wafer defect inference system according to some embodiments.
- FIG. 1 is a block diagram illustrating a wafer defect inference system 10 according to some embodiments.
- the wafer defect inference system 10 may also be referred to as a “wafer monitoring system,” a “wafer test system,” a “semiconductor manufacturing process monitoring system,” and/or a “semiconductor manufacturing system.”
- the wafer defect inference system 10 may infer a defect in circuit patterns implemented on a wafer.
- the wafer defect inference system 10 includes test equipment 100 and a computing device 200 .
- the computing device 200 is described as a separate component (e.g., independent of the test equipment 100 ).
- the computing device 200 may be implemented in the form of being embedded in the test equipment 100 .
- the test equipment 100 may detect a defect of circuit patterns on the wafer and may output defect information of the wafer (and/or information about a wafer defect).
- Information about the wafer (and/or wafer defect) may include, for example, at least one of a location of a defect, a size of the defect, a shape of the defect, a color of the defect, a kind of the defect, and/or the like.
- the test equipment 100 may output the information about the wafer (and/or wafer defect) in the form of an image.
- the test equipment 100 may include a geometry verification system (e.g., nano geometry research (NGR) equipment), an image detecting system (e.g., an electron microscope such as a scanning electron microscope (SEM)), and/or the like.
- the test equipment 100 may include and/or be connected to a user interface (not illustrated).
- the user interface may include a user input interface and a user output interface.
- the user input interface may be configured to receive information from a user, and may include at least one of a keyboard, a mouse, a touch pad, a microphone, and/or the like.
- the user output interface may be configured to output information to the user and/or may include at least one of a monitor, a beam projector, a speaker, and/or the like.
- the wafer defect inference system 10 may output information about the defect to the user through the user output interface.
- the test equipment 100 may be supplied with an image for detecting a defect on a wafer.
- the image input to the test equipment 100 may be, for example, an SEM image and/or a computer aided design (CAD) image.
- the image may be referred to as a “wafer image” and, in some example embodiments, may be obtained by scanning circuit patterns formed on a wafer through a mask, by using a scanning electron microscope (SEM).
- the CAD image, which is an image of a mask formed to implement circuit patterns on a wafer, may include a layout image associated with a target pattern produced in and/or modified by a computer system.
- the test equipment 100 may include an image detecting system and/or processing circuitry such that at least one of the SEM image and/or the CAD image is produced by the test equipment 100 .
- the test equipment 100 may combine the input SEM image and CAD image to generate a combination image.
- the combination image may be, for example, generated by overlapping the SEM image and the CAD image around a pattern axis.
- the test equipment 100 may include an align module.
- the align module may perform template matching around the pattern axis of the SEM image and the CAD image, as preprocessing for generating a combination image of the test equipment 100 .
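The alignment-and-combination step above can be sketched in a toy form: exhaustive template matching over small integer shifts using normalized cross-correlation, followed by stacking the SEM image and the aligned CAD image as channels of a combination image. The function name, the shift search range, and the two-channel encoding are assumptions for illustration, not the equipment's actual algorithm.

```python
import numpy as np

def align_and_combine(sem, cad, max_shift=4):
    """Align a CAD layout patch to an SEM patch by exhaustive template
    matching over small shifts (normalized cross-correlation as the
    score), then overlay the two as channels of a combination image."""
    best_shift, best_score = (0, 0), -np.inf
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            shifted = np.roll(np.roll(cad, dy, axis=0), dx, axis=1)
            s = sem - sem.mean()
            c = shifted - shifted.mean()
            denom = np.sqrt((s * s).sum() * (c * c).sum())
            score = (s * c).sum() / denom if denom else 0.0
            if score > best_score:
                best_score, best_shift = score, (dy, dx)
    aligned = np.roll(np.roll(cad, best_shift[0], axis=0),
                      best_shift[1], axis=1)
    # stack SEM and aligned CAD as the two channels of the combination image
    return np.stack([sem, aligned], axis=-1), best_shift
```

Given a CAD patch that is a shifted copy of the SEM patch, the search recovers the inverse shift and the combination image carries both aligned views.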
- the computing device 200 may communicate with the test equipment 100 .
- the computing device 200 may be referred to as an “electronic device” and/or an “image generating device.”
- the computing device 200 may receive input data D 1 from the test equipment 100 .
- the input data D 1 may include the combination image of the SEM image and the CAD image.
- the computing device 200 may perform machine learning on information about a wafer (and/or wafer defect) based on deep learning.
- the computing device 200 may perform learning on information about a wafer (and/or wafer defect) based on a generative adversarial network (hereinafter referred to as a “GAN”).
- the following description assumes an implementation based on the principle of the GAN; however, this is only an example embodiment, and the example embodiments are not limited thereto.
- the machine learning may be based on any other network included in a GAN system.
- the machine learning may be based on a conditional generative adversarial network (hereinafter referred to as a “CGAN”).
- the machine learning may be based on an architecture of a deep neural network (DNN) and/or n-layer neural network.
- the DNN and/or n-layer neural network may correspond to a convolution neural network (CNN), recurrent neural network (RNN), deep belief network, restricted Boltzmann machine, or the like.
- such artificial intelligence architecture systems may include other forms of machine learning models, such as, for example, linear and/or logistic regression, statistical clustering, Bayesian classification, decision trees, dimensionality reduction such as principal component analysis, and expert systems; and/or combinations thereof, including ensembles such as random forests.
- the artificial intelligence architecture systems may include a pooling layer (as described below in more detail), a fully connected layer, and/or the like in addition to a plurality of convolution layers.
- the computing device 200 may also infer a defect on a wafer based on the learned defect information.
- the computing device 200 may transmit image data D 2 including defect information of a wafer to the test equipment 100 .
- the test equipment 100 may output the defect information of the wafer obtained from the computing device 200 to the outside and/or may use the defect information of the wafer in performing a test operation.
- the wafer defect inference system 10 may, in some embodiments, output the defect information of the wafer as an image.
- the defect information output from the wafer defect inference system 10 according to the present disclosure may include various types of defect information such as a location of a defect on a wafer, a size of the defect, a shape of the defect, a color of the defect, a kind of the defect, and/or the like.
- the wafer defect inference system 10 may include a neuromorphic processor, configured to perform machine learning based on the GAN for image conversion. Accordingly, the wafer defect inference system 10 , according to some embodiments, may detect a new type of defect without data associated with all kinds of defects and may reduce false detection (e.g., false positives) and/or undetection (e.g., false negatives) of defects on a wafer.
- FIG. 2 is a block diagram illustrating a configuration of the computing device 200 according to some embodiments.
- the computing device 200 may include a bus 210 , a processor 220 , a neuromorphic processor 230 , a random access memory (RAM) 240 , a modem 250 , and storage 270 .
- the bus 210 may provide a communication channel between the components 220 to 250 and 270 included in the computing device 200 .
- the processor 220 may control the computing device 200 .
- the processor 220 may execute an operating system, firmware, and/or the like for driving the computing device 200 .
- the processor 220 may instruct the neuromorphic processor 230 to perform machine learning and/or may support the machine learning of the neuromorphic processor 230 .
- the processor 220 may control and/or assist in the communication of the neuromorphic processor 230 with the RAM 240 , the modem 250 , or the storage 270 through the bus 210 .
- the neuromorphic processor 230 may perform the machine learning, for example, based on the instruction of the processor 220 .
- the neuromorphic processor 230 may receive images from the test equipment 100 through the modem 250 .
- the neuromorphic processor 230 may perform the machine learning based on the received images.
- the neuromorphic processor 230 is illustrated in FIG. 2 as a component independent of the processor 220 , according to some embodiments, the neuromorphic processor 230 may be included in the processor 220 , and the machine learning performed by the neuromorphic processor 230 may be performed by the processor 220 .
- the test equipment 100 , the processor 220 and/or the neuromorphic processor 230 may include and/or be included in, for example, processing circuitry such as hardware including logic circuits; a hardware/software combination such as a processor executing software; or a combination thereof.
- processing circuitry more specifically may include, but is not limited to, a central processing unit (CPU), an arithmetic logic unit (ALU), a digital signal processor, a microcomputer, a field programmable gate array (FPGA), a hardware accelerator, a programmable logic unit, a microprocessor, an application-specific integrated circuit (ASIC), etc.
- the RAM 240 may function as a working memory of the processor 220 and/or the neuromorphic processor 230 .
- the RAM 240 may include a volatile memory, such as a static random access memory (SRAM) and/or a dynamic random access memory (DRAM), and/or a nonvolatile memory, such as a phase-change random access memory (PRAM), a magnetic random access memory (MRAM), a resistive random access memory (RRAM), and/or a ferroelectric random access memory (FRAM), and/or the like.
- the modem 250 may receive images from the test equipment 100 and may transfer the received images to the neuromorphic processor 230 .
- the test equipment 100 is illustrated as connected with the computing device 200 through the modem 250 , this is only an example embodiment, and in some embodiments, the test equipment 100 and the computing device 200 may be integrated.
- the test equipment 100 may include an image database that stores images.
- the image database may store images associated with circuit patterns on a wafer and images including defect information of the wafer.
- the image database may store combination images input to the computing device 200 and images indicating defect information obtained from the computing device 200 .
- the image database may include and/or be included in a computer-accessible medium (not shown) for example, a non-transitory memory system.
- non-transitory is a limitation of the medium itself (e.g., as tangible, and not a signal) as opposed to a limitation on data storage persistency (e.g., RAM vs. ROM).
- the image database may be included in or separate from the storage 270 and/or the RAM 240 .
- the storage 270 may store data generated by the processor 220 .
- the storage 270 may store codes of an operating system or firmware that the processor 220 executes.
- the storage 270 may include a nonvolatile memory such as a PRAM, an MRAM, an RRAM, an FRAM, and/or a NAND flash memory. Though illustrated as separate from the RAM 240 , this is only an example embodiment.
- the storage 270 , the RAM 240 , and/or the image database may include or be included in separate computer-accessible medium, include or be included in different regions of a shared computer-accessible medium, and/or share a computer-accessible medium.
- the computer-accessible medium may also include instructions for the operation of the wafer defect inference system 10 .
- FIG. 3 is a diagram for describing how the neuromorphic processor 230 (refer to FIG. 2 ) according to some embodiments performs machine learning based on the GAN.
- the neuromorphic processor 230 may obtain, through machine learning, information about a defect capable of occurring on a wafer. For example, the neuromorphic processor 230 may use GAN-based machine learning to infer a defect.
- a GAN-based image generation model that the neuromorphic processor 230 executes may include a generator network 231 and a discriminator network 232 .
- the generator network 231 may be referred to as a “generator” and/or a “generation unit,” and the discriminator network 232 may be referred to as a “discriminator” and/or a “discrimination unit.”
- the discriminator network 232 may receive a real combination image image_real and/or a fake combination image image_fake and may determine whether an input image (e.g., a received image) is real or fake.
- the real combination image image_real refers to an image that is input to the neuromorphic processor 230 and is obtained by combining an SEM image and a CAD image of a wafer targeted for defect detection
- fake combination image image_fake refers to a combination image that is generated by the generator network 231 based on an input vector.
- the neuromorphic processor 230 may perform a first machine learning that allows the discriminator network 232 to determine the real combination image image_real as real and the fake combination image image_fake as fake.
- the discriminator network 232 may perform at least one of the following operations on the input real combination image image_real and/or the input fake combination image image_fake: a convolution operation, a pooling operation, a down sampling operation, a multiplication operation, an addition operation, an activation operation, and/or the like.
- the discriminator network 232 may output a signal indicating whether an input image is real or fake.
- the operations of the discriminator network 232 will be more fully described with reference to FIG. 6 .
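The discriminator operations listed above (convolution, activation, pooling/down sampling, multiplication, addition) can be sketched as a tiny forward pass. The function names, layer sizes, and the single conv-pool stage are illustrative assumptions; the patent's FIG. 6 network would use a plurality of such layers.

```python
import numpy as np

def conv2d(img, kernel):
    """Valid 2-D convolution (really cross-correlation, as in most CNNs)."""
    kh, kw = kernel.shape
    h, w = img.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = (img[i:i + kh, j:j + kw] * kernel).sum()
    return out

def max_pool2d(x, size=2):
    """Non-overlapping max pooling (the down-sampling step)."""
    h, w = x.shape
    x = x[:h - h % size, :w - w % size]
    return x.reshape(h // size, size, w // size, size).max(axis=(1, 3))

def discriminate(image, kernel, weights, bias):
    """Tiny discriminator: conv -> ReLU -> pool -> sigmoid score in (0, 1)."""
    feat = np.maximum(conv2d(image, kernel), 0.0)  # activation operation
    pooled = max_pool2d(feat)                      # pooling / down sampling
    logit = pooled.ravel() @ weights + bias        # multiplication & addition
    return 1.0 / (1.0 + np.exp(-logit))            # real-vs-fake probability
```

The output is a single probability, matching the description of a signal indicating whether the input image is real or fake.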
- the neuromorphic processor 230 may update and/or tune weights and/or biases of nodes (not illustrated) included in the discriminator network 232, for example, in the case where the discriminator network 232 incorrectly determines the real combination image image_real as fake or the fake combination image image_fake as real.
- the neuromorphic processor 230 may also perform a second machine learning on the generator network 231 such that the fake combination image image_fake generated from the generator network 231 is determined as real by the discriminator network 232 .
- the second machine learning may be performed after a first machine learning is complete and/or in parallel to the first machine learning.
- the generator network 231 may perform at least one of the following operations on the input vector: a deconvolution operation, an unpooling operation, an up sampling operation, a multiplication operation, an addition operation, and/or an activation operation. Through the above operations, the generator network 231 may generate the fake combination image image_fake based on the input vector. The operations of the generator network 231 will be more fully described with reference to FIG. 7 .
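The generator operations listed above (deconvolution/unpooling/up sampling, multiplication, addition, activation) can likewise be sketched in toy form: a latent vector is projected to a small feature map and repeatedly up-sampled into a fake image. The sizes and the use of nearest-neighbour up sampling in place of a learned deconvolution are assumptions for illustration.

```python
import numpy as np

def upsample(x, factor=2):
    """Nearest-neighbour up sampling (a simple stand-in for unpooling
    or deconvolution)."""
    return x.repeat(factor, axis=0).repeat(factor, axis=1)

def generate(vector, w1, side=4):
    """Tiny generator: project a latent vector to a small feature map
    (multiplication, addition, activation), then up-sample twice to form
    a fake image with values in (-1, 1)."""
    feat = np.tanh(vector @ w1).reshape(side, side)  # 4x4 feature map
    feat = upsample(feat)                            # 4x4 -> 8x8
    return np.tanh(upsample(feat))                   # 8x8 -> 16x16 fake image
```

An 8-dimensional input vector and an 8x16 projection matrix thus yield a 16x16 fake combination image.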
- the neuromorphic processor 230 may update or tune weights or biases of nodes included in the generator network 231 .
- when the second machine learning is complete, the discriminator network 232 may determine the fake combination image image_fake generated by the generator network 231 as real or as fake, each with a probability of about 50%.
- in other words, the discriminator network 232 may fail to accurately determine whether the fake combination image image_fake output from the generator network 231 is real or fake, and thus, the generator network 231 may deceive the discriminator network 232.
- the fake combination image image_fake generated by the generator network 231 may be output from the computing device 200 (refer to FIG. 1 ) to the test equipment 100 (refer to FIG. 1 ) in the form of image data image_data.
- the image data image_data may include defect information of a wafer targeted for defect detection.
- the wafer defect information capable of being extracted may include various types of defect features such as a location of a defect, a size of the defect, a shape of the defect, a color of the defect, and a kind of the defect.
- FIG. 4 is a diagram for describing how the neuromorphic processor 230 (refer to FIG. 2 ) according to some embodiments performs machine learning based on the CGAN.
- a CGAN-based machine learning may be similar to and/or the same as the GAN-based machine learning (for example, in terms of a driving principle), except that a condition "C" is additionally applied to input data of the generator network 231 and the discriminator network 232.
- the generator network 231 may be provided with an input vector “Vector” and the condition “C,” and the discriminator network 232 may be provided with the real combination image image_real and the condition “C.”
- the condition “C” may be auxiliary information such as and/or related to class labels and/or defect information of the real combination image image_real and/or the fake combination image image_fake.
- the machine learning, the image generating operation, and the image determining operation of each of the generator network 231 and the discriminator network 232 may be performed in a state where the condition “C” is applied to input data.
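For illustration only, applying the condition “C” to the generator input may be sketched as below. This is not taken from the disclosure: the function name, the noise dimension, and the one-hot defect classes are hypothetical, and the Gaussian noise vector corresponds to the Gaussian-distributed input vector described for the generator network.

```python
import random

def make_generator_input(noise_dim, condition, seed=None):
    # Sample a Gaussian noise vector and append a one-hot
    # condition "C" (e.g., encoding a defect class label),
    # yielding the conditioned input vector of the CGAN.
    rng = random.Random(seed)
    noise = [rng.gauss(0.0, 1.0) for _ in range(noise_dim)]
    return noise + condition

# Hypothetical defect classes: bridge / open / particle.
one_hot_bridge = [1, 0, 0]
vec = make_generator_input(8, one_hot_bridge, seed=0)
print(len(vec))  # 11
```

The same concatenation would be applied on the discriminator side, pairing the real combination image with the condition “C.”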
- FIGS. 5A and 5B are flowcharts for describing a machine learning operation performed by the neuromorphic processor 230 (refer to FIG. 2 ) according to some embodiments.
- FIG. 5A is a flowchart for describing an example machine learning operation that the discriminator network 232 (refer to FIGS. 3 and/or 4 ) performs.
- FIG. 5B is a flowchart for describing an example machine learning operation that the generator network 231 (refer to FIGS. 3 and/or 4 ) performs.
- an example machine learning operation of the discriminator network 232 to be described with reference to FIG. 5A may be referred to as “first machine learning”
- an example machine learning operation of the generator network 231 to be described with reference to FIG. 5B may be referred to as “second machine learning.”
- the discriminator network 232 may receive image data.
- the image data may include a real combination image corresponding to a combination of an SEM image and a CAD image, a fake combination image generated from the generator network 231 , and/or a defect image capable of occurring at circuit patterns on a wafer in general.
- the discriminator network 232 may perform the first machine learning.
- the first machine learning may be performed to determine whether data input to the discriminator network 232 is real or fake.
- the discriminator network 232 may perform the first machine learning through various operations of a plurality of convolution layers and a plurality of pooling layers. The operations of the plurality of convolution layers and the plurality of pooling layers will be more fully described with reference to FIG. 6 .
- the discriminator network 232 may determine whether an input image is real or fake. Though operation S 120 and operation S 130 are illustrated as separate steps, this is for convenience of description; in some example embodiments, the determination of whether the input image is real or fake may be made by the machine learning of operation S 120 and/or based on a result of the machine learning of operation S 120 .
- the procedure proceeds to operation S 140 in the following cases: where the discriminator network 232 determines a real combination image input thereto as fake, and/or where the discriminator network 232 determines a fake combination image input thereto as real.
- the discriminator network 232 may update and/or tune weights and/or biases of the nodes included in the discriminator network 232 , based on a determination result in operation S 130 . After the weights and/or biases of the discriminator network 232 are updated and/or tuned, the procedure proceeds to operation S 120 to repeatedly perform the first machine learning.
- the discriminator network 232 may determine whether the probability that a real combination image is determined as real (and a fake combination image is determined as fake) converges to about 50%. When the probability converges to about 50%, the procedure for the first machine learning may be terminated. Meanwhile, when the probability does not converge to about 50%, the discriminator network 232 may return to operation S 120 to repeatedly perform the first machine learning.
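The FIG. 5A loop (learn in S 120, check in S 130/S 150, tune weights in S 140) may be sketched as below. The training dynamics are simulated with a hypothetical accuracy schedule, since the actual discriminator update rule is not specified in this excerpt.

```python
def first_machine_learning(train_step, evaluate, tol=0.02, max_iters=1000):
    # Repeat the first machine learning (S 120) until the probability
    # of telling real from fake converges to about 50% (S 150);
    # otherwise update weights/biases and learn again (S 140).
    for i in range(max_iters):
        train_step()             # S 120 / S 140: learn, tune weights
        p = evaluate()           # probability a real image is judged real
        if abs(p - 0.5) <= tol:  # converged to ~50%: terminate
            return i + 1, p
    return max_iters, p

# Hypothetical stand-ins: each step nudges the discriminator's
# real-vs-fake accuracy from 90% toward the 50% equilibrium.
state = {"k": 0}
def train_step(): state["k"] += 1
def evaluate(): return 0.5 + 0.4 * 0.9 ** state["k"]

steps, p = first_machine_learning(train_step, evaluate)
print(steps)  # 29
```

The second machine learning of FIG. 5B follows the same skeleton, with the generator terminating when its fake images are judged real with a probability of about 50%.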
- the generator network 231 may receive an input vector.
- the generator network 231 may generate a fake combination image based on the input vector.
- the input vector may be distributed depending on a Gaussian distribution.
- the generator network 231 may perform the second machine learning such that a fake combination image generated by the generator network 231 is determined by the discriminator network 232 as real. For example, the generator network 231 may generate a fake combination image based on a false positive and/or a false negative result of the discriminator network 232 .
- the generator network 231 may perform the second machine learning through various operations of a plurality of deconvolution layers and a plurality of unpooling layers. The operations of the plurality of deconvolution layers and the plurality of unpooling layers will be more fully described with reference to FIG. 7 .
- the generator network 231 may generate a fake combination image based on the second machine learning.
- though operation S 220 and operation S 230 are illustrated as separate steps, this is for convenience of description; in some example embodiments, the machine learning of operation S 220 may generate the fake image and/or the fake image may be generated based on a result of the machine learning of operation S 220 .
- the fake combination image thus generated may include pieces of information about defects capable of occurring at circuit patterns on a wafer, and the fake combination image may be transferred to the discriminator network 232 and/or may be output to the outside as output data.
- the generator network 231 may receive an indication of whether the discriminator network 232 determined the fake combination image as a real combination image or fake combination image.
- when the discriminator network 232 determines the fake combination image as fake, the procedure proceeds to operation S 250 .
- when the discriminator network 232 determines the fake combination image as real, the procedure proceeds to operation S 260 .
- the generator network 231 may update and/or tune weights and/or biases of the nodes included in the generator network 231 , based on a determination result in operation S 240 . After the weights and/or biases of the generator network 231 are updated and/or tuned, the procedure proceeds to operation S 220 to repeatedly perform the second machine learning.
- the generator network 231 may determine whether the probability that a fake combination image is determined as real converges to about 50%. When the probability that a fake combination image is determined as real converges to about 50%, the procedure for the second machine learning may be terminated. Meanwhile, when the probability that a fake combination image is determined as real does not converge to about 50%, the generator network 231 may return to operation S 220 to repeatedly perform the second machine learning.
- FIG. 6 is a diagram for describing how the discriminator network 232 (refer to FIG. 3 or 4 ) included in an image generation model executable by the neuromorphic processor 230 , according to some embodiments, operates.
- the neuromorphic processor 230 may input a real combination image and/or a fake combination image to the discriminator network 232 .
- a size of the combination image may be gradually reduced as the combination image passes through a plurality of layers.
- an operation of the discriminator network 232 may be similar to and/or the same as an operation of a convolutional neural network (CNN).
- the discriminator network 232 may extract a feature of the combination image to determine whether the combination image is real or fake.
- the discriminator network 232 may generate a feature image by applying a filter (or a kernel or a matrix) to the combination image and repeatedly performing convolution operations on sampling values of the combination image corresponding to the filter and/or values of the filter.
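The convolution step above (a filter slid over the image, with an elementwise product-and-sum at each position, as is conventional in CNNs) may be sketched in pure Python; the image and filter values here are toy assumptions.

```python
def conv2d(image, kernel):
    # Slide the kernel (filter) over the image and sum the
    # elementwise products at each position, producing the
    # feature image the discriminator network builds up.
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(image[i + di][j + dj] * kernel[di][dj]
                 for di in range(kh) for dj in range(kw))
             for j in range(out_w)]
            for i in range(out_h)]

img = [[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]]
edge = [[1, -1]]  # hypothetical horizontal-difference filter
print(conv2d(img, edge))  # [[-1, -1], [-1, -1], [-1, -1]]
```

In practice each convolution layer applies many such filters, each extracting a different feature of the combination image.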
- the discriminator network 232 may scale down a size (and/or a dimension) of the feature image by repeatedly performing an average pooling operation and/or a maximum pooling operation on the feature image output from the convolution layer, to which the combination image is input, through a pooling layer. Pooling may be referred to as “down-sampling.”
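The down-sampling step may be sketched as below: the feature image is reduced over non-overlapping windows under either the maximum or average pooling described above (window size and values are toy assumptions).

```python
def pool2d(image, size=2, mode="max"):
    # Down-sample the feature image by aggregating each
    # non-overlapping size x size window with max or average.
    h, w = len(image) // size, len(image[0]) // size
    agg = max if mode == "max" else (lambda v: sum(v) / len(v))
    return [[agg([image[i * size + di][j * size + dj]
                  for di in range(size) for dj in range(size)])
             for j in range(w)]
            for i in range(h)]

feat = [[1, 3, 2, 0],
        [4, 2, 1, 5],
        [0, 1, 8, 6],
        [2, 3, 7, 4]]
print(pool2d(feat))              # [[4, 5], [3, 8]]
print(pool2d(feat, mode="avg"))  # [[2.5, 2.0], [1.5, 6.25]]
```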
- the combination image may pass through a plurality of convolution layers and a plurality of pooling layers included in the discriminator network 232 under control of the neuromorphic processor 230 , and the number of layers is not limited to the example illustrated in FIG. 6 .
- One convolution layer and one pooling layer may be collectively regarded as one convolution/pooling layer.
- a condition “C” (refer to FIG. 4 ) may be applied to one of the layers.
- the condition “C” may be applied to a convolution layer, a pooling layer, and/or a convolution/pooling layer.
- the discriminator network 232 may reshape and/or transform a size of output data (or a feature image) passing through the plurality of convolution layers and the plurality of pooling layers. The reshaping of the output data may be omitted if unnecessary.
- the discriminator network 232 may perform the activation operation on the output data passing through the plurality of convolution layers and the plurality of pooling layers and may output a signal indicating whether the combination image is real or fake.
- FIG. 7 is a diagram for describing how the generator network 231 (refer to FIGS. 3 and/or 4 ) included in the image generation model executable by the neuromorphic processor 230 according to some embodiments operates.
- the neuromorphic processor 230 may input an input vector to the generator network 231 .
- a size (and/or dimension) of the input vector may be gradually enlarged (and/or expanded) as the input vector passes through a plurality of layers of the generator network 231 .
- the generator network 231 may correspond to forward propagation, and the discriminator network 232 (refer to FIG. 3 or 4 ) may correspond to backward propagation.
- an operation of the generator network 231 may be similar to an operation of a deconvolution neural network.
- the neuromorphic processor 230 may reshape and/or transform the size (and/or dimension) of the input vector for the purpose of inputting the input vector to a layer of the generator network 231 .
- the reshaping of the input vector may be omitted if unnecessary.
- the generator network 231 may enlarge the size of the input vector by repeatedly performing an unpooling operation on the input vector. Unpooling may be referred to as “up-sampling.”
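The up-sampling step may be sketched as below, enlarging the input by repeating each value over a block. This nearest-neighbor scheme is one of several possible unpooling variants; the disclosure does not fix a particular one.

```python
def unpool2d(image, size=2):
    # Enlarge the image by repeating each value over a
    # size x size block -- a simple "up-sampling" step, the
    # rough inverse of the pooling used by the discriminator.
    return [[image[i // size][j // size]
             for j in range(len(image[0]) * size)]
            for i in range(len(image) * size)]

small = [[1, 2],
         [3, 4]]
print(unpool2d(small))
# [[1, 1, 2, 2], [1, 1, 2, 2], [3, 3, 4, 4], [3, 3, 4, 4]]
```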
- the generator network 231 may generate a feature image by repeatedly performing the deconvolution operation (and/or a transposed convolution operation) (marked by Deconv 1 , Deconv 2 , Deconv 3 , and Deconv 4 in FIG. 7 ) on data output from an unpooling layer.
- the input vector may pass through a plurality of unpooling layers and a plurality of deconvolution layers included in the generator network 231 under the control of the neuromorphic processor 230 , and the number of layers is not limited to the example illustrated in FIG. 7 .
- One unpooling layer and one deconvolution layer may be collectively regarded as one unpooling/deconvolution layer.
- the generator network 231 may output a combination image by allowing the input vector to pass through a plurality of unpooling layers and a plurality of deconvolution layers.
- FIG. 8 is a diagram illustrating a combination image B 3 used in the wafer defect inference system 10 according to some embodiments.
- the combination image B 3 may be generated by overlapping an SEM image B 1 and a CAD image B 2 around a pattern axis (not illustrated).
- the SEM image B 1 may include an image obtained by scanning circuit patterns formed on a wafer using a scanning electron microscope (SEM).
- the CAD image B 2 may include a layout image of a mask for imprinting circuit patterns on a wafer.
- the test equipment 100 included in the wafer defect inference system 10 may be provided with the SEM image B 1 and the CAD image B 2 .
- the test equipment 100 may overlap the SEM image B 1 and the CAD image B 2 to generate the combination image B 3 .
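As one hedged illustration of how the SEM image B 1 and the CAD image B 2 might be overlapped into the combination image B 3, a per-pixel alpha blend over equally sized grayscale rasters can be sketched as below. The blend and the pixel values are assumptions; the disclosure only states that the two images are overlapped around a pattern axis.

```python
def combine(sem, cad, alpha=0.5):
    # Overlap a grayscale SEM raster and a CAD (layout) raster
    # of the same size by per-pixel alpha blending; alpha
    # weights the SEM contribution.
    return [[alpha * s + (1 - alpha) * c
             for s, c in zip(srow, crow)]
            for srow, crow in zip(sem, cad)]

sem = [[200, 40], [40, 200]]  # toy scanned pattern
cad = [[255, 0], [0, 255]]    # toy layout raster
print(combine(sem, cad))      # [[227.5, 20.0], [20.0, 227.5]]
```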
- the test equipment 100 may perform an alignment operation for performing template matching.
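The alignment for template matching can be sketched as an exhaustive offset search minimizing the sum of squared differences, one common matching criterion; the align module's actual method is not specified in this excerpt, and the images below are toy values.

```python
def best_offset(sem, template):
    # Try every placement of the CAD-derived template over the
    # SEM raster and keep the offset with the smallest sum of
    # squared differences (SSD) -- a basic template match.
    th, tw = len(template), len(template[0])
    best, best_pos = None, (0, 0)
    for dy in range(len(sem) - th + 1):
        for dx in range(len(sem[0]) - tw + 1):
            ssd = sum((sem[dy + i][dx + j] - template[i][j]) ** 2
                      for i in range(th) for j in range(tw))
            if best is None or ssd < best:
                best, best_pos = ssd, (dy, dx)
    return best_pos

sem = [[0, 0, 0, 0],
       [0, 0, 9, 9],
       [0, 0, 9, 9],
       [0, 0, 0, 0]]
template = [[9, 9],
            [9, 9]]
print(best_offset(sem, template))  # (1, 2)
```

The returned offset would then be used to shift one image before overlapping, aligning the pattern axes of the SEM and CAD images.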
- the computing device 200 (refer to FIG. 1 ) may be provided with the combination image B 3 from the test equipment 100 .
- the wafer defect inference system 10 may perform machine learning of the generator network 231 (refer to FIGS. 3 and/or 4 ) and the discriminator network 232 (refer to FIGS. 3 and/or 4 ).
- the wafer defect inference system 10 may extract various types of defect information, such as a kind of a defect on a wafer, a size of the defect, a color of the defect, and/or a shape of the defect, as well as a location of the defect.
- the computing device 200 of the wafer defect inference system 10 may draw defect information from the combination image B 3 being input data, based on the machine learning.
- FIG. 9 is a diagram for describing an output data model of the wafer defect inference system 10 according to some embodiments.
- the wafer defect inference system 10 may accurately infer various types of defect information such as a location of a defect, a size of the defect, a shape of the defect, a color of the defect, and/or a kind of the defect. Accordingly, the wafer defect inference system 10 may use a segmentation model and/or a heat map model as an output data model.
- the wafer defect inference system 10 may generate an image indicating a defect of circuit patterns on a wafer based on the segmentation model.
- the segmentation model may be a model predicting a class to which each pixel of the image belongs, and the prediction may be made on all the pixels of the image.
- the segmentation model may be implemented with a gray scale model and/or a red, green, and blue (RGB) model displaying defect information.
- the segmentation model implemented with the gray scale model may be a binary model.
- the segmentation model may generate a defect image segmented from a background in units of pixel, for example, as illustrated in FIG. 9 .
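A per-pixel, binary (gray scale) instance of the segmentation model may be sketched as below; thresholding a hypothetical difference image stands in for the learned per-pixel classifier, and the values are toy assumptions.

```python
def segment(diff_image, threshold=50):
    # Binary segmentation: predict a class for every pixel,
    # assigning 1 (defect) or 0 (background) -- separating the
    # defect from the background in units of pixel.
    return [[1 if px > threshold else 0 for px in row]
            for row in diff_image]

diff = [[10, 80, 12],
        [ 5, 90,  7],
        [ 3,  6,  4]]
print(segment(diff))  # [[0, 1, 0], [0, 1, 0], [0, 0, 0]]
```

An RGB variant would instead assign one channel per defect type, matching the channel-wise inference described above.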
- the segmentation model may infer a defect from specific channel information according to a type of the defect without a separate classification module.
- in the segmentation model, as the number of layers included in the generator network 231 (refer to FIGS. 3 and/or 4 ) and the discriminator network 232 (refer to FIGS. 3 and/or 4 ) increases, the elaborateness of a defect inference image may increase.
- meanwhile, when unrefined defect images are used as learning data, the reliability of a fake combination image generated by the generator network 231 may decrease. Accordingly, in the case where the wafer defect inference system 10 is driven based on the segmentation model, to make the reliability of output data high, the wafer defect inference system 10 may refine conventional defect images and may use a segmentation image associated with a defect portion as learning data.
- a size of the segmentation image may be at least a size of a pixel of a combination image.
- the wafer defect inference system 10 may generate an image indicating a defect of circuit patterns on a wafer based on the heat map model.
- the generator network 231 may generate an output image indicating a defect of circuit patterns on a wafer, in a manner similar to that of the segmentation model.
- the heat map model may, instead of segmenting an image in units of pixel, perform defect prediction based on a Gaussian distribution.
- the heat map model may infer a defect from specific channel information according to a type of the defect without a separate classification module. Even in the heat map model, as the number of layers included in the generator network 231 (refer to FIGS. 3 and/or 4 ) and the discriminator network 232 (refer to FIGS. 3 and/or 4 ) increases, the elaborateness of a defect inference image may increase.
- in the heat map model, learning based on inaccurate defect information may be possible, compared to the segmentation model.
- a fake combination image of low accuracy may be prevented from being generated from the generator network 231 by using the heat map model.
- a fake combination image that makes it possible to infer a defect more accurately may be generated by calculating average locations of defects, generating a Gaussian heat map associated with the average locations, and performing learning based on the Gaussian heat map.
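Generating a Gaussian heat map around average defect locations, as described above, may be sketched as below; the grid size, centers, and sigma are hypothetical.

```python
import math

def gaussian_heat_map(h, w, centers, sigma=1.0):
    # Build a heat map that peaks at each (average) defect
    # location and falls off with a Gaussian, rather than
    # assigning hard per-pixel labels as the segmentation
    # model does.
    return [[max(math.exp(-((i - ci) ** 2 + (j - cj) ** 2)
                          / (2 * sigma ** 2))
                 for ci, cj in centers)
             for j in range(w)]
            for i in range(h)]

hm = gaussian_heat_map(5, 5, centers=[(2, 2)], sigma=1.0)
print(round(hm[2][2], 3), round(hm[2][3], 3))  # 1.0 0.607
```

Training against such soft targets tolerates small location errors in the learning data, which is consistent with the robustness to inaccurate defect information noted above.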
- the wafer defect inference system 10 may generate an image based on one of the segmentation model and the heat map model. Also, the wafer defect inference system 10 according to the present disclosure may generate an image based on an ensemble model of the segmentation model and the heat map model. In the case where the wafer defect inference system 10 is driven based on the ensemble model, the accuracy of defect inference may be improved even more.
- FIGS. 10A and 10B are diagrams indicating simulation results of the wafer defect inference system 10 (refer to FIG. 1 ) according to some embodiments.
- FIGS. 10A and 10B show input images B 4 and B 7 , target images B 5 and B 8 , and output images B 6 and B 9 as a simulation result of the wafer defect inference system 10 .
- FIG. 10A shows a simulation image of the wafer defect inference system 10 based on the segmentation model
- FIG. 10B shows a simulation image of the wafer defect inference system 10 based on the heat map model.
- the input image B 4 is a combination image that is obtained by combining an SEM image of a wafer targeted for defect detection and a CAD image of a mask used in a wafer process.
- the SEM image and the CAD image may be input to the test equipment 100 (refer to FIG. 1 ) included in the wafer defect inference system 10 to be combined at the test equipment 100, and the input image B 4 may be input to the computing device 200 (refer to FIG. 1 ) in the form of a combination image.
- the test equipment 100 may first perform the alignment operation for template matching.
- the target image B 5 refers to the image to be drawn from the wafer defect inference system 10 .
- the generator network 231 (refer to FIGS. 3 and/or 4 ) may generate the output image B 6 similar to the target image B 5 , based on the first machine learning performed by the discriminator network 232 (refer to FIGS. 3 and/or 4 ) and the second machine learning performed by the generator network 231 .
- the output image B 6 may indicate various types of defect information such as a location of a defect on a wafer targeted for defect detection, a size of the defect, a shape of the defect, a color of the defect, and a kind of the defect.
- the reliability of the output image B 6 may be proportional to the level of machine learning of the computing device 200 performed with regard to a defect and the number of layers included in the generator network 231 and the discriminator network 232 of the neuromorphic processor 230 (refer to FIG. 2 ) in the computing device 200 .
- the output image B 9 may be generated based on the target image B 8 .
- a result of FIG. 10B (e.g., the output image B 9 ) may be generated based on the heat map model. Because the heat map model uses the Gaussian distribution associated with defect information, inferring a defect of circuit patterns on a wafer may be easier than with the segmentation model of FIG. 10A .
- FIG. 11 is a flowchart for describing an operating method of the wafer defect inference system 10 (refer to FIG. 1 ) according to some embodiments.
- the wafer defect inference system 10 may receive first image data.
- the first image data may include an SEM image and a CAD image or may include a conventional defect image for learning of defect information.
- the wafer defect inference system 10 may perform the first machine learning.
- the first machine learning means learning for determining whether image data input to the discriminator network 232 (refer to FIGS. 3 and/or 4 ) is real or fake.
- the first machine learning of the discriminator network 232 is described with reference to FIG. 5A in detail.
- the wafer defect inference system 10 may perform the second machine learning.
- the second machine learning means learning for determining image data generated by the generator network 231 (refer to FIGS. 3 and/or 4 ) as real.
- the second machine learning of the generator network 231 is described with reference to FIG. 5B in detail.
- the wafer defect inference system 10 may receive second image data.
- the second image data may include an SEM image and a CAD image of a wafer targeted for defect detection.
- the SEM image and the CAD image input from the outside may be combined by the test equipment 100 (refer to FIG. 1 ).
- the wafer defect inference system 10 may align the SEM image and the CAD image input in operation S 340 around a pattern axis to perform template matching on the SEM image and the CAD image.
- the test equipment 100 may combine the SEM image and the CAD image aligned around the pattern axis to generate a combination image.
- operation S 350 may be omitted.
- the wafer defect inference system 10 may generate an image including information about a defect existing in circuit patterns on a wafer being a check target, based on the combination image.
- the defect information may include a location of a defect, a size of the defect, a color of the defect, a kind of the defect, and/or the like.
- the wafer defect inference system 10 may output the image generated in operation S 360 .
- the image including the defect information about the circuit pattern on the wafer may be visually transferred to the user through the user output interface.
- the defect information may include an indication that the circuit pattern is to be reprocessed and/or discarded.
- the circuit pattern, wafer, and/or mask may be modified to address the inferred defect.
- the circuit pattern and/or wafer may be reprocessed and/or discarded.
- the circuit pattern and/or wafer may be reprocessed (e.g., in a case wherein the defect is fixable) and/or discarded based on the type and/or severity of the defect.
- the mask associated with the circuit pattern may be modified to reduce the potential of the formation of an inferred defect.
- operation S 380 may be omitted.
- the probability that a defect occurs on a wafer may decrease by inferring various types of defect features such as a location of a defect on the wafer, a size of the defect, a shape of the defect, a color of the defect, a kind of the defect, and/or the like.
Abstract
Description
- This application claims priority under 35 U.S.C. § 119 to Korean Patent Application No. 10-2020-0128348 filed on Oct. 5, 2020, in the Korean Intellectual Property Office, the disclosures of which are incorporated by reference herein in their entireties.
- Example embodiments of the present disclosure described herein relate to a semiconductor process technology, and more particularly, relate to a system and a method for inferring a defect on a wafer based on machine learning.
- During lithography, which is a technology for applying circuit patterns onto wafers, circuit patterns are formed on a surface of the wafer by a process including applying photo resist (PR) onto the wafer on which an oxide film is deposited, and selectively emitting a light on the photo resist (e.g., through a mask containing the circuit patterns). With the development of the semiconductor process technology, the demand for higher degrees of integration of circuits has increased, but the pitch between the circuit patterns has decreased. As such, circuit designs have also become more complicated.
- Because a size of a light (e.g., the wavelength) used in the lithography may be large compared to the pitch between circuit patterns, a defect may occur on the wafer in an exposure step. The defect on the wafer may cause a fault of a semiconductor device manufactured by using the wafer. For this reason, the defect on the wafer may be perceived as a critical factor reducing the reliability and productivity of a semiconductor device. Accordingly, there is a great demand on a test process of high accuracy for inferring a wafer defect.
- Example embodiments of the present disclosure provide a system and a method for inferring a defect on a wafer based on machine learning without a separate module or detector.
- According to an embodiment, a wafer defect inference system includes a test equipment that receives a first image obtained by imaging circuit patterns formed on a semiconductor wafer by using a scanning electron microscope and a second image obtained by imaging a layout image of a mask for implementing the circuit pattern on the semiconductor wafer and combines the first image and the second image to generate a combination image, and at least one computing device that is capable of communicating with the test equipment and infers a defect associated with the circuit pattern formed on the semiconductor wafer. The computing device receives the combination image, performs machine learning for inferring the defect based on the combination image, and generates an output image including information about the defect based on the machine learning.
- According to an embodiment, an operating method of a device configured to infer a defect of circuit patterns formed on a semiconductor wafer includes combining a first image and a second image to generate a combination image, the first image including an imaging of the circuit pattern, and the second image including an imaging of a layout image of a mask for implementing the circuit pattern on the semiconductor wafer; generating, based on a machine learning operation of the device, an output image from the combination image, the output image including defect information about the defect from the combination image, and outputting the output image.
- According to an embodiment, a non-transitory computer-readable medium storing a program code including an image generation model executable by a processor, the program code, when executed, causing the processor to combine a first image and a second image to generate a combination image, the first image including an imaging of a circuit pattern formed on a semiconductor wafer, and the second image including an imaging of a layout image of a mask for implementing the circuit pattern on the semiconductor wafer; and to generate, based on machine learning, an output image from the combination image, the output image including defect information of the circuit pattern.
- The above and other objects and features of the present disclosure will become apparent by describing in detail embodiments thereof with reference to the accompanying drawings.
-
FIG. 1 is a block diagram illustrating a wafer defect inference system according to some embodiments. -
FIG. 2 is a block diagram illustrating a configuration of a computing device according to some embodiments. -
FIG. 3 is a diagram for describing how a neuromorphic processor according to some embodiments performs machine learning based on a GAN. -
FIG. 4 is a diagram for describing how a neuromorphic processor according to some embodiments of the present disclosure performs machine learning based on a CGAN. -
FIGS. 5A and 5B are flowcharts for describing a machine learning operation performed by a neuromorphic processor according to some embodiments. -
FIG. 6 is a diagram for describing how a discriminator network included in an image generation model executable by a neuromorphic processor, according to some embodiments, operates. -
FIG. 7 is a diagram for describing how a generator network included in an image generation model executable by a neuromorphic processor, according to some embodiments, operates. -
FIG. 8 is a diagram illustrating a combination image used in a wafer defect inference system according to some embodiments. -
FIG. 9 is a diagram for describing an output data model of a wafer defect inference system according to some embodiments. -
FIGS. 10A and 10B are diagrams indicating simulation results of a wafer defect inference system according to some embodiments. -
FIG. 11 is a flowchart for describing an operating method of a wafer defect inference system according to some embodiments. - Below, embodiments of the present disclosure will be described in detail and clearly to such an extent that one of ordinary skill in the art may easily implement the present disclosure.
- The terms used in the specification are provided to describe the embodiments, not to limit the present disclosure. As used in the specification, the singular terms “a,” “an” and “the” are intended to include the plural forms as well, unless the context clearly indicates otherwise. The terms “comprises” and/or “comprising,” when used in the specification, specify the presence of steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other steps, operations, elements, components, and/or groups thereof.
- Unless otherwise defined, all terms (including technical and scientific terms) used in the specification should have the same meaning as commonly understood by those skilled in the art to which the present disclosure pertains. The terms, such as those defined in commonly used dictionaries, should not be interpreted in an idealized or overly formal sense unless expressly so defined herein. The same reference numerals represent the same elements throughout the specification.
-
FIG. 1 is a block diagram illustrating a wafer defect inference system 10 according to some embodiments. The wafer defect inference system 10 may also be referred to as a “wafer monitoring system,” a “wafer test system,” a “semiconductor manufacturing process monitoring system,” and/or a “semiconductor manufacturing system.” The wafer defect inference system 10 may infer a defect in circuit patterns implemented on a wafer. Referring to FIG. 1 , the wafer defect inference system 10 includes test equipment 100 and a computing device 200. Below, the description will be given with the computing device 200 described as a separate component (e.g., independent of the test equipment 100). However, this is an example embodiment, and the example embodiments should not be limited thereto. For example, the computing device 200 may be implemented in the form of being embedded in the test equipment 100. - The
test equipment 100 may detect a defect of circuit patterns on the wafer and may output defect information of the wafer (and/or information about a wafer defect). Information about the wafer (and/or wafer defect) may include, for example, at least one of a location of a defect, a size of the defect, a shape of the defect, a color of the defect, a kind of the defect, and/or the like. In some example embodiments, the test equipment 100 may output the information about the wafer (and/or wafer defect) in the form of an image. The test equipment 100 may include a geometry verification system (e.g., nano geometry research (NGR) equipment), an image detecting system (e.g., an electron microscope such as a scanning electron microscope (SEM)), and/or the like. - The
test equipment 100 may include and/or be connected to a user interface (not illustrated). The user interface may include a user input interface and a user output interface. For example, the user input interface may be configured to receive information from a user, and may include at least one of a keyboard, a mouse, a touch pad, a microphone, and/or the like. The user output interface may be configured to output information to the user and/or may include at least one of a monitor, a beam projector, a speaker, and/or the like. In some embodiments, when the wafer defect inference system 10 infers a defect associated with a circuit pattern, as described in further detail below, the wafer defect inference system 10 may output information about the defect to the user through the user output interface. - In some example embodiments, for example wherein the
test equipment 100 does not include an image detecting system, the test equipment 100 may be supplied with an image for detecting a defect on a wafer. The image input to the test equipment 100 may be, for example, an SEM image and/or a computer aided design (CAD) image. The image may be referred to as a "wafer image" and, in some example embodiments, may be obtained by scanning circuit patterns formed on a wafer through a mask, by using a scanning electron microscope (SEM). The CAD image, which is an image of a mask formed to implement circuit patterns on a wafer, may include a layout image associated with a target pattern produced in and/or modified by a computer system. However, this is one example embodiment, and the example embodiments should not be limited thereto. For example, as noted above, in some embodiments, the test equipment 100 may include an image detecting system and/or processing circuitry such that at least one of the SEM image and/or the CAD image is produced by the test equipment 100. - The
test equipment 100 may combine the input SEM image and CAD image to generate a combination image. The combination image may be, for example, generated by overlapping the SEM image and the CAD image around a pattern axis. Although not illustrated in FIG. 1, the test equipment 100 may include an align module. The align module may perform template matching around the pattern axis of the SEM image and the CAD image, as preprocessing for the generation of a combination image by the test equipment 100. - The
computing device 200 may communicate with the test equipment 100. The computing device 200 may be referred to as an "electronic device" and/or an "image generating device." The computing device 200 may receive input data D1 from the test equipment 100. The input data D1 may include the combination image of the SEM image and the CAD image. - The
computing device 200 may perform machine learning on information about a wafer (and/or wafer defect) based on deep learning. For example, the computing device 200 may perform learning on information about a wafer (and/or wafer defect) based on a generative adversarial network (hereinafter referred to as a "GAN"). Below, the description will be given as implemented based on the principle of the GAN, but this is an example embodiment, and the example embodiments should not be limited thereto. For example, in some embodiments, the machine learning may be based on any other network included in a GAN system. For example, the machine learning may be based on a conditional generative adversarial network (hereinafter referred to as a "CGAN"). Additionally, according to some embodiments, the machine learning may be based on an architecture of a deep neural network (DNN) and/or an n-layer neural network. The DNN and/or n-layer neural network may correspond to a convolutional neural network (CNN), a recurrent neural network (RNN), a deep belief network, a restricted Boltzmann machine, or the like. - Alternatively and/or additionally, such artificial intelligence architecture systems may include other forms of machine learning models, such as, for example, linear and/or logistic regression, statistical clustering, Bayesian classification, decision trees, dimensionality reduction such as principal component analysis, and expert systems, and/or combinations thereof, including ensembles such as random forests. In some example embodiments, the artificial intelligence architecture systems may include a pooling layer (as described below in more detail), a fully connected layer, and/or the like in addition to a plurality of convolution layers.
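The adversarial objective underlying the GAN-based learning described above can be illustrated with a short sketch. The loss functions shown are the standard GAN formulation, not code from the present disclosure, and the 0.5 equilibrium value mirrors the roughly 50% convergence behavior discussed later in this description.

```python
import math

def discriminator_loss(d_real, d_fake):
    # First machine learning: -log D(real) - log(1 - D(fake)),
    # minimized when real images are scored near 1 and fakes near 0.
    return -math.log(d_real) - math.log(1.0 - d_fake)

def generator_loss(d_fake):
    # Second machine learning (non-saturating form): -log D(fake),
    # minimized when the discriminator scores the fake image as real.
    return -math.log(d_fake)

# At the equilibrium described in the text, the discriminator outputs
# about 0.5 for both real and fake combination images:
print(round(discriminator_loss(0.5, 0.5), 4))  # 1.3863 (= 2 * ln 2)
print(round(generator_loss(0.5), 4))           # 0.6931 (= ln 2)
```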
- The
computing device 200 may also infer a defect on a wafer based on the learned defect information. The computing device 200 may transmit image data D2 including defect information of a wafer to the test equipment 100. The test equipment 100 may output the defect information of the wafer obtained from the computing device 200 to the outside and/or may use the defect information of the wafer in performing a test operation. The wafer defect inference system 10 may, in some embodiments, output the defect information of the wafer as an image. The defect information output from the wafer defect inference system 10 according to the present disclosure may include various types of defect information such as a location of a defect on a wafer, a size of the defect, a shape of the defect, a color of the defect, a kind of the defect, and/or the like. - In addition, the wafer
defect inference system 10 according to some embodiments may include a neuromorphic processor configured to perform machine learning based on the GAN for image conversion. Accordingly, the wafer defect inference system 10, according to some embodiments, may detect a new type of defect without data associated with all kinds of defects and may reduce false detection (e.g., false positives) and/or undetection (e.g., false negatives) of defects on a wafer. -
FIG. 2 is a block diagram illustrating a configuration of the computing device 200 according to some embodiments. Referring to FIG. 2, the computing device 200 may include a bus 210, a processor 220, a neuromorphic processor 230, a random access memory (RAM) 240, a modem 250, and storage 270. The bus 210 may provide a communication channel between the components 220 to 250 and 270 included in the computing device 200. - The
processor 220 may control the computing device 200. For example, the processor 220 may execute an operating system, firmware, and/or the like for driving the computing device 200. The processor 220 may instruct the neuromorphic processor 230 to perform machine learning and/or may support the machine learning of the neuromorphic processor 230. For example, the processor 220 may control and/or assist in the communication of the neuromorphic processor 230 with the RAM 240, the modem 250, or the storage 270 through the bus 210. - The
neuromorphic processor 230 may perform the machine learning, for example, based on the instruction of the processor 220. The neuromorphic processor 230 may receive images from the test equipment 100 through the modem 250. The neuromorphic processor 230 may perform the machine learning based on the received images. Though the neuromorphic processor 230 is illustrated in FIG. 2 as a component independent of the processor 220, according to some embodiments, the neuromorphic processor 230 may be included in the processor 220, and the machine learning performed by the neuromorphic processor 230 may be performed by the processor 220. - The
test equipment 100, the processor 220, and/or the neuromorphic processor 230 may include and/or be included in, for example, processing circuitry such as hardware including logic circuits; a hardware/software combination such as a processor executing software; or a combination thereof. For example, the processing circuitry more specifically may include, but is not limited to, a central processing unit (CPU), an arithmetic logic unit (ALU), a digital signal processor, a microcomputer, a field programmable gate array (FPGA), a hardware accelerator, a programmable logic unit, a microprocessor, an application-specific integrated circuit (ASIC), etc. - The
RAM 240 may function as a working memory of the processor 220 and/or the neuromorphic processor 230. The RAM 240 may include a volatile memory (such as a static random access memory (SRAM) and/or a dynamic random access memory (DRAM)), a nonvolatile memory (such as a phase-change random access memory (PRAM), a magnetic random access memory (MRAM), a resistive random access memory (RRAM), and/or a ferroelectric random access memory (FRAM)), and/or the like. - The
modem 250 may receive images from the test equipment 100 and may transfer the received images to the neuromorphic processor 230. Though the test equipment 100 is illustrated as connected with the computing device 200 through the modem 250, this is only an example embodiment, and in some embodiments, the test equipment 100 and the computing device 200 may be integrated. Also, although not illustrated in FIG. 2, the test equipment 100 may include an image database that stores images. The image database may store images associated with circuit patterns on a wafer and images including defect information of the wafer. For example, the image database may store combination images input to the computing device 200 and images indicating defect information obtained from the computing device 200. In some embodiments, the image database may include and/or be included in a computer-accessible medium (not shown), for example, a non-transitory memory system. The term "non-transitory," as used herein, is a limitation of the medium itself (e.g., as tangible, and not a signal) as opposed to a limitation on data storage persistency (e.g., RAM vs. ROM). The image database may be included in or separate from the storage 270 and/or the RAM 240. - The
storage 270 may store data generated by the processor 220. The storage 270 may store codes of an operating system or firmware that the processor 220 executes. The storage 270 may include a nonvolatile memory such as a PRAM, an MRAM, an RRAM, an FRAM, and/or a NAND flash memory. Though illustrated as separate from the RAM 240, this is only an example embodiment. For example, in some example embodiments, the storage 270, the RAM 240, and/or the image database may include or be included in separate computer-accessible media, include or be included in different regions of a shared computer-accessible medium, and/or share a computer-accessible medium. In some embodiments, the computer-accessible medium may also include instructions for the operation of the wafer defect inference system 10. -
FIG. 3 is a diagram for describing how the neuromorphic processor 230 (refer to FIG. 2) according to some embodiments performs machine learning based on the GAN. The neuromorphic processor 230, according to some embodiments, may obtain information of a defect, which is capable of occurring on a wafer, through machine learning. For example, the neuromorphic processor 230 may use GAN-based machine learning to infer a defect. Referring to FIG. 3, a GAN-based image generation model that the neuromorphic processor 230 executes may include a generator network 231 and a discriminator network 232. The generator network 231 may be referred to as a "generator" and/or a "generation unit," and the discriminator network 232 may be referred to as a "discriminator" and/or a "discrimination unit." - The
discriminator network 232 may receive a real combination image image_real and/or a fake combination image image_fake and may determine whether an input image (e.g., a received image) is real or fake. Herein, the real combination image image_real refers to an image that is input to the neuromorphic processor 230 and is obtained by combining an SEM image and a CAD image of a wafer targeted for defect detection; and the fake combination image image_fake refers to a combination image that is generated by the generator network 231 based on an input vector. - The
neuromorphic processor 230 may perform a first machine learning that allows the discriminator network 232 to determine the real combination image image_real as real and the fake combination image image_fake as fake. For example, the discriminator network 232 may perform at least one of the following operations on the input real combination image image_real and/or the input fake combination image image_fake: a convolution operation, a pooling operation, a down-sampling operation, a multiplication operation, an addition operation, an activation operation, and/or the like. Through the above operations, the discriminator network 232 may output a signal indicating whether an input image is real or fake. The operations of the discriminator network 232 will be more fully described with reference to FIG. 6. - The
neuromorphic processor 230 may update and/or tune weights and/or biases of nodes (not illustrated) included in the discriminator network 232. For example, the weights and/or biases may be updated and/or tuned in the case wherein the discriminator network 232 incorrectly determines the real combination image image_real as fake. Similarly, in the case where the discriminator network 232 determines the fake combination image image_fake as real, the neuromorphic processor 230 may update and/or tune the weights and/or biases of the nodes included in the discriminator network 232. - The
neuromorphic processor 230 may also perform a second machine learning on the generator network 231 such that the fake combination image image_fake generated from the generator network 231 is determined as real by the discriminator network 232. In some embodiments, the second machine learning may be performed after the first machine learning is complete and/or in parallel to the first machine learning. The generator network 231 may perform at least one of the following operations on the input vector: a deconvolution operation, an unpooling operation, an up-sampling operation, a multiplication operation, an addition operation, and/or an activation operation. Through the above operations, the generator network 231 may generate the fake combination image image_fake based on the input vector. The operations of the generator network 231 will be more fully described with reference to FIG. 7. - At the beginning of the second machine learning, in the case where the
discriminator network 232 determines the fake combination image image_fake generated by the generator network 231 as fake, the neuromorphic processor 230 may update or tune weights or biases of nodes included in the generator network 231. When the second machine learning is completed, the discriminator network 232 may determine the fake combination image image_fake generated by the generator network 231 as real with a probability of about 50% and as fake with a probability of about 50%. For example, the discriminator network 232 may fail to accurately determine whether the fake combination image image_fake output from the generator network 231 is real or fake, and thus, the generator network 231 may cheat the discriminator network 232. - The fake combination image image_fake generated by the
generator network 231 may be output from the computing device 200 (refer to FIG. 1) to the test equipment 100 (refer to FIG. 1) in the form of image data image_data. The image data image_data may include defect information of a wafer targeted for defect detection. The wafer defect information capable of being extracted may include various types of defect features such as a location of a defect, a size of the defect, a shape of the defect, a color of the defect, and a kind of the defect. -
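The alternating first and second machine learning described above can be illustrated with a deliberately tiny numerical sketch. The "images" here are single scalars, the discriminator is one sigmoid unit, and the generator is a single shift parameter; all values are hypothetical toy choices and are not the networks of the present disclosure.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# Toy setup: "real" samples sit at 3.0; the generator currently emits -2.0.
w, b = 1.0, 0.0          # discriminator D(x) = sigmoid(w * x + b)
theta = -2.0             # generator output (one scalar "fake image")
lr = 0.05
x_real, x_fake = 3.0, theta

# First machine learning step: update w, b to score the real sample
# higher and the fake sample lower (gradient of the discriminator loss).
d_real, d_fake = sigmoid(w * x_real + b), sigmoid(w * x_fake + b)
w -= lr * (-(1 - d_real) * x_real + d_fake * x_fake)
b -= lr * (-(1 - d_real) + d_fake)

# Second machine learning step: move theta so the discriminator scores
# the generated sample as real (gradient of -log D(theta)).
theta -= lr * (-(1 - sigmoid(w * theta + b)) * w)

print(sigmoid(w * x_real + b) > d_real)   # True: D(real) increased
print(theta > x_fake)                     # True: generator moved toward the data
```

Repeating these two steps in alternation is the adversarial loop; at the equilibrium described in the text, neither update can improve further.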
FIG. 4 is a diagram for describing how the neuromorphic processor 230 (refer to FIG. 2) according to some embodiments performs machine learning based on the CGAN. In some example embodiments, a CGAN-based machine learning may be similar to and/or the same as the GAN-based machine learning (for example, in terms of a driving principle), but may further include a condition "C" applied to input data of the generator network 231 and the discriminator network 232. For example, in performing the CGAN-based machine learning, the generator network 231 may be provided with an input vector "Vector" and the condition "C," and the discriminator network 232 may be provided with the real combination image image_real and the condition "C." In some embodiments, the condition "C" may be auxiliary information such as and/or related to class labels and/or defect information of the real combination image image_real and/or the fake combination image image_fake. The machine learning, the image generating operation, and the image determining operation of each of the generator network 231 and the discriminator network 232 may be performed in a state where the condition "C" is applied to input data. -
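One common way to apply the condition "C" described above is to concatenate it onto the inputs of the generator and the discriminator. The sketch below assumes the condition is a one-hot defect-kind label; the class names are hypothetical examples, not classes defined in the present disclosure.

```python
DEFECT_KINDS = ["bridge", "open", "particle"]   # hypothetical defect classes

def one_hot(kind):
    # Encode the condition "C" as a one-hot vector over defect kinds.
    return [1.0 if k == kind else 0.0 for k in DEFECT_KINDS]

def with_condition(vec, condition):
    # CGAN-style conditioning: append "C" to the input vector "Vector"
    # (for the generator) or to a flattened image representation
    # (for the discriminator) before the first layer.
    return list(vec) + list(condition)

z = [0.3, -1.2, 0.7]                        # generator input vector
g_input = with_condition(z, one_hot("open"))
print(g_input)  # [0.3, -1.2, 0.7, 0.0, 1.0, 0.0]
```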
FIGS. 5A and 5B are flowcharts for describing a machine learning operation performed by the neuromorphic processor 230 (refer to FIG. 2) according to some embodiments. FIG. 5A is a flowchart for describing an example machine learning operation that the discriminator network 232 (refer to FIGS. 3 and/or 4) performs, and FIG. 5B is a flowchart for describing an example machine learning operation that the generator network 231 (refer to FIGS. 3 and/or 4) performs. For convenience of description, an example machine learning operation of the discriminator network 232 to be described with reference to FIG. 5A may be referred to as "first machine learning," and an example machine learning operation of the generator network 231 to be described with reference to FIG. 5B may be referred to as "second machine learning." - Referring to
FIG. 5A, in operation S110, the discriminator network 232 may receive image data. The image data may include a real combination image corresponding to a combination of an SEM image and a CAD image, a fake combination image generated from the generator network 231, and/or a defect image capable of occurring at circuit patterns on a wafer in general. - In operation S120, the
discriminator network 232 may perform the first machine learning. The first machine learning may be performed to determine whether data input to the discriminator network 232 is real or fake. The discriminator network 232 may perform the first machine learning through various operations of a plurality of convolution layers and a plurality of pooling layers. The operations of the plurality of convolution layers and the plurality of pooling layers will be more fully described with reference to FIG. 6. - In operation S130, the
discriminator network 232 may determine whether an input image is real or fake. Though operation S120 and operation S130 are illustrated as separate steps, this is for convenience of description; and in some example embodiments, the determination of whether the input image is real or fake may be made by the machine learning of operation S120 and/or may be based on a result of the machine learning of operation S120. The procedure proceeds to operation S140 in the following cases: where a real combination image input to the discriminator network 232 is determined by the discriminator network 232 as fake and/or where a fake combination image input to the discriminator network 232 is determined by the discriminator network 232 as real. Meanwhile, the procedure proceeds to operation S150 in the following cases: where a real combination image is input to the discriminator network 232 and is determined by the discriminator network 232 as real and/or where a fake combination image is input to the discriminator network 232 and is determined by the discriminator network 232 as fake. - In operation S140, the
discriminator network 232 may update and/or tune weights and/or biases of the nodes included in the discriminator network 232, based on a determination result in operation S130. After the weights and/or biases of the discriminator network 232 are updated and/or tuned, the procedure proceeds to operation S120 to repeatedly perform the first machine learning. - In operation S150, the
discriminator network 232 may determine whether the probability that a real combination image is determined as real converges to about 50%. When the probability that a real combination image is determined as real and a fake combination image is determined as fake converges to about 50%, the procedure for the first machine learning may be terminated. Meanwhile, when the probability that a real combination image is determined as real and a fake combination image is determined as fake does not converge to about 50%, the discriminator network 232 may return to operation S120 to repeatedly perform the first machine learning. - Referring to
FIG. 5B, in operation S210, the generator network 231 may receive an input vector. The generator network 231 may generate a fake combination image based on the input vector. In some embodiments, the input vector may be distributed according to a Gaussian distribution. - In operation S220, the
generator network 231 may perform the second machine learning such that a fake combination image generated by the generator network 231 is determined by the discriminator network 232 as real. For example, the generator network 231 may generate a fake combination image based on a false positive and/or a false negative result of the discriminator network 232. The generator network 231 may perform the second machine learning through various operations of a plurality of deconvolution layers and a plurality of unpooling layers. The operations of the plurality of deconvolution layers and the plurality of unpooling layers will be more fully described with reference to FIG. 7. - In operation S230, the
generator network 231 may generate a fake combination image based on the second machine learning. Though operation S220 and operation S230 are illustrated as separate steps, this is for convenience of description; and in some example embodiments, the machine learning of operation S220 may generate the fake image and/or the fake image may be generated based on a result of the machine learning of operation S220. The fake combination image thus generated may include pieces of information about defects capable of occurring at circuit patterns on a wafer, and the fake combination image may be transferred to the discriminator network 232 and/or may be output to the outside as output data. - In operation S240, the
generator network 231 may receive an indication of whether the discriminator network 232 determined the fake combination image as a real combination image or a fake combination image. When the discriminator network 232 determines the fake combination image as fake, the procedure proceeds to operation S250. When the discriminator network 232 determines the fake combination image as real, the procedure proceeds to operation S260. - In operation S250, the
generator network 231 may update and/or tune weights and/or biases of the nodes included in the generator network 231, based on a determination result in operation S240. After the weights and/or biases of the generator network 231 are updated and/or tuned, the procedure proceeds to operation S220 to repeatedly perform the second machine learning. - In operation S260, the
generator network 231 may determine whether the probability that a fake combination image is determined as real converges to about 50%. When the probability that a fake combination image is determined as real converges to about 50%, the procedure for the second machine learning may be terminated. Meanwhile, when the probability that a fake combination image is determined as real does not converge to about 50%, the generator network 231 may return to operation S220 to repeatedly perform the second machine learning. -
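The convergence checks in operations S150 and S260 described above can be sketched as a simple rate test over a batch of discriminator outputs. The tolerance value is an assumed hyperparameter, not one specified in the present disclosure.

```python
def converged_to_half(d_outputs, tol=0.05):
    # Fraction of images the discriminator classifies as real
    # (output probability above 0.5); learning may stop once this
    # fraction settles near 50%.
    rate = sum(1 for p in d_outputs if p > 0.5) / len(d_outputs)
    return abs(rate - 0.5) <= tol

# Early in training the discriminator rejects most fakes:
print(converged_to_half([0.1, 0.2, 0.1, 0.9]))  # False (25% called real)
# Near equilibrium it calls about half of them real:
print(converged_to_half([0.6, 0.4, 0.7, 0.3]))  # True (50% called real)
```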
FIG. 6 is a diagram for describing how the discriminator network 232 (refer to FIG. 3 or 4) included in an image generation model executable by the neuromorphic processor 230, according to some embodiments, operates. The neuromorphic processor 230 may input a real combination image and/or a fake combination image to the discriminator network 232. A size of the combination image may be gradually reduced as the combination image passes through a plurality of layers. For example, an operation of the discriminator network 232 may be similar to and/or the same as an operation of a convolutional neural network (CNN). The discriminator network 232 may extract a feature of the combination image to determine whether the combination image is real or fake. - The
discriminator network 232 may generate a feature image by applying a filter (or a kernel or a matrix) to the combination image and repeatedly performing convolution operations on sampling values of the combination image corresponding to the filter and/or values of the filter. The discriminator network 232 may scale down a size (and/or a dimension) of the feature image by repeatedly performing an average pooling operation and/or a maximum pooling operation, through a pooling layer, on the feature image output from the convolution layer to which the combination image is input. Pooling may be referred to as "down-sampling." - The combination image may pass through a plurality of convolution layers and a plurality of pooling layers included in the
discriminator network 232 under control of the neuromorphic processor 230, and the number of layers is not limited to the example illustrated in FIG. 6. One convolution layer and one pooling layer may be collectively regarded as one convolution/pooling layer. In some embodiments, a condition "C" (refer to FIG. 4) may be applied to one of the layers. For example, the condition "C" may be applied to a convolution layer, a pooling layer, and/or a convolution/pooling layer. - The
discriminator network 232 may reshape and/or transform a size of output data (or a feature image) passing through the plurality of convolution layers and the plurality of pooling layers. The reshaping of the output data may be omitted if unnecessary. The discriminator network 232 may perform the activation operation on the output data passing through the plurality of convolution layers and the plurality of pooling layers and may output a signal indicating whether the combination image is real or fake. -
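The convolution, pooling, reshaping, and activation steps described above can be sketched end to end as follows. The 5x5 input, the kernel, and the classifier weights are untrained hypothetical values; a real discriminator would stack many such layers with learned parameters.

```python
import math

def conv2d_valid(img, kernel):
    # Single-channel "valid" convolution (really cross-correlation,
    # as in most CNN implementations).
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for i in range(len(img) - kh + 1):
        row = []
        for j in range(len(img[0]) - kw + 1):
            row.append(sum(img[i + a][j + b] * kernel[a][b]
                           for a in range(kh) for b in range(kw)))
        out.append(row)
    return out

def max_pool2(img):
    # 2x2 maximum pooling (the "down-sampling" step).
    return [[max(img[i][j], img[i][j + 1], img[i + 1][j], img[i + 1][j + 1])
             for j in range(0, len(img[0]) - 1, 2)]
            for i in range(0, len(img) - 1, 2)]

def discriminate(img, kernel, w, b):
    # Conv -> pool -> flatten (reshape) -> sigmoid activation:
    # outputs the probability that the combination image is real.
    feat = max_pool2(conv2d_valid(img, kernel))
    flat = [v for row in feat for v in row]
    score = sum(wi * v for wi, v in zip(w, flat)) + b
    return 1.0 / (1.0 + math.exp(-score))

img = [[1, 0, 0, 1, 0],
       [0, 1, 0, 0, 1],
       [0, 0, 1, 0, 0],
       [1, 0, 0, 1, 0],
       [0, 1, 0, 0, 1]]          # toy 5x5 "combination image"
kernel = [[1, 0], [0, 1]]        # responds to diagonal patterns
p = discriminate(img, kernel, w=[0.5, -0.5, 0.25, 0.25], b=0.0)
print(0.0 < p < 1.0)  # True
```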
FIG. 7 is a diagram for describing how the generator network 231 (refer to FIGS. 3 and/or 4) included in the image generation model executable by the neuromorphic processor 230 according to some embodiments operates. The neuromorphic processor 230 may input an input vector to the generator network 231. A size (and/or dimension) of the input vector may be gradually enlarged (and/or expanded) as the input vector passes through a plurality of layers of the generator network 231. The generator network 231 may correspond to forward propagation, and the discriminator network 232 (refer to FIG. 3 or 4) may correspond to backward propagation. For example, an operation of the generator network 231 may be similar to an operation of a deconvolution neural network. - The
neuromorphic processor 230 may reshape and/or transform the size (and/or dimension) of the input vector for the purpose of inputting the input vector to a layer of the generator network 231. The reshaping of the input vector may be omitted if unnecessary. The generator network 231 may enlarge the size of the input vector by repeatedly performing an unpooling operation on the input vector. Unpooling may be referred to as "up-sampling." The generator network 231 may generate a feature image by repeatedly performing the deconvolution operation (and/or a transposed convolution operation) (marked by Deconv1, Deconv2, Deconv3, and Deconv4 in FIG. 7) on data output from an unpooling layer. The input vector may pass through a plurality of unpooling layers and a plurality of deconvolution layers included in the generator network 231 under the control of the neuromorphic processor 230, and the number of layers is not limited to the example illustrated in FIG. 7. One unpooling layer and one deconvolution layer may be collectively regarded as one unpooling/deconvolution layer. The generator network 231 may output a combination image by allowing the input vector to pass through a plurality of unpooling layers and a plurality of deconvolution layers. -
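The unpooling and deconvolution steps described above can be sketched as follows. Nearest-neighbor up-sampling stands in for unpooling, and the stride-1 transposed convolution, the kernel, and the toy 2x2 reshaped input vector are hypothetical simplifications of the layers marked Deconv1 through Deconv4.

```python
def unpool2(img):
    # Nearest-neighbor 2x up-sampling (a simple form of unpooling).
    out = []
    for row in img:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out

def deconv2d(img, kernel):
    # Transposed convolution with stride 1: each input value "stamps"
    # a scaled copy of the kernel onto the output, enlarging it.
    kh, kw = len(kernel), len(kernel[0])
    oh, ow = len(img) + kh - 1, len(img[0]) + kw - 1
    out = [[0.0] * ow for _ in range(oh)]
    for i, row in enumerate(img):
        for j, v in enumerate(row):
            for a in range(kh):
                for b in range(kw):
                    out[i + a][j + b] += v * kernel[a][b]
    return out

vec = [[1.0, 2.0],
       [3.0, 4.0]]                               # reshaped input vector (toy 2x2)
up = unpool2(vec)                                # 4x4 after unpooling
out = deconv2d(up, [[0.5, 0.5], [0.5, 0.5]])     # 5x5 generated "image"
print(len(out), len(out[0]))  # 5 5
```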
FIG. 8 is a diagram illustrating a combination image B3 used in the wafer defect inference system 10 according to some embodiments. According to some embodiments, the combination image B3 may be generated by overlapping an SEM image B1 and a CAD image B2 around a pattern axis (not illustrated). The SEM image B1 may include an image obtained by scanning circuit patterns formed on a wafer using a scanning electron microscope (SEM). The CAD image B2 may include a layout image of a mask for imprinting circuit patterns on a wafer. - The test equipment 100 (refer to
FIG. 1) included in the wafer defect inference system 10 may be provided with the SEM image B1 and the CAD image B2. The test equipment 100 may overlap the SEM image B1 and the CAD image B2 to generate the combination image B3. Before overlapping the SEM image B1 and the CAD image B2, the test equipment 100 may perform an alignment operation for performing template matching. The computing device 200 (refer to FIG. 1) may be provided with the combination image B3 from the test equipment 100. The wafer defect inference system 10 according to the present disclosure may perform machine learning of the generator network 231 (refer to FIGS. 3 and/or 4) and the discriminator network 232 (refer to FIGS. 3 and/or 4) based on the combination image B3. The wafer defect inference system 10 may extract various types of defect information, such as a kind of a defect on a wafer, a size of the defect, a color of the defect, a location of the defect, and/or a shape of the defect. The computing device 200 of the wafer defect inference system 10 may derive defect information from the combination image B3, which is input data, based on the machine learning. -
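The alignment and overlap operations described above can be sketched as follows. The correlation-style matching score, the one-pixel search window, and the toy 3x3 binary images are simplifying assumptions; practical template matching over SEM and CAD images would use larger search windows and normalized scores.

```python
def match_score(sem, cad, dy, dx):
    # Sum of elementwise products over the overlap of the CAD image
    # shifted by (dy, dx) against the SEM image: a bare-bones
    # template-matching score.
    s = 0.0
    for i in range(len(cad)):
        for j in range(len(cad[0])):
            y, x = i + dy, j + dx
            if 0 <= y < len(sem) and 0 <= x < len(sem[0]):
                s += sem[y][x] * cad[i][j]
    return s

def align_and_combine(sem, cad, max_shift=1):
    # Find the shift that best aligns the CAD image to the SEM image,
    # then stack the two as a 2-channel "combination image".
    best = max(((dy, dx) for dy in range(-max_shift, max_shift + 1)
                         for dx in range(-max_shift, max_shift + 1)),
               key=lambda s: match_score(sem, cad, *s))
    dy, dx = best
    h, w = len(sem), len(sem[0])
    combo = [[(sem[y][x],
               cad[y - dy][x - dx] if 0 <= y - dy < h and 0 <= x - dx < w else 0)
              for x in range(w)] for y in range(h)]
    return best, combo

sem = [[0, 0, 0],
       [0, 1, 1],
       [0, 1, 1]]
cad = [[1, 1, 0],
       [1, 1, 0],
       [0, 0, 0]]          # same pattern, offset by one pixel
shift, combo = align_and_combine(sem, cad)
print(shift)  # (1, 1)
```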
FIG. 9 is a diagram for describing an output data model of the wafer defect inference system 10 according to some embodiments. In addition to the classification of a defect location, the wafer defect inference system 10 according to some embodiments may accurately infer various types of defect information such as a location of a defect, a size of the defect, a shape of the defect, a color of the defect, and/or a kind of the defect. Accordingly, the wafer defect inference system 10 may use a segmentation model and/or a heat map model as an output data model. - For example, in some embodiments, the wafer
defect inference system 10 may generate an image indicating a defect of circuit patterns on a wafer based on the segmentation model. The segmentation model may be a model predicting a class to which each pixel of the image belongs, and the prediction may be made on all the pixels of the image. The segmentation model may be implemented with a gray scale model and/or a red, green, and blue (RGB) model displaying defect information. In some embodiments, the segmentation model implemented with the gray scale model may be a binary model. The segmentation model may generate a defect image segmented from a background in units of pixel, for example, as disclosed in FIG. 9. Also, the segmentation model may infer a defect from specific channel information according to a type of the defect without a separate classification module. According to the segmentation model, as the number of layers included in the generator network 231 (refer to FIGS. 3 and/or 4) and the discriminator network 232 (refer to FIGS. 3 and/or 4) increases, the elaborateness of a defect inference image may increase. - In some embodiments wherein the segmentation model is used for the machine learning of the
generator network 231 and the discriminator network 232, in the case where a combination image is not segmented in units of pixels, the reliability of a fake combination image generated by the generator network 231 may decrease. Accordingly, in the case where the wafer defect inference system 10 is driven based on the segmentation model, to increase the reliability of the output data, the wafer defect inference system 10 may refine conventional defect images and may use a segmentation image associated with a defect portion as learning data. The segmentation image may be at least one pixel of the combination image in size. - In some embodiments, the wafer
defect inference system 10 may generate an image indicating a defect of circuit patterns on a wafer based on the heat map model. The generator network 231 may generate an output image indicating a defect of circuit patterns on a wafer, in a manner similar to that of the segmentation model. The heat map model may, instead of segmenting an image in units of pixels, perform defect prediction based on a Gaussian distribution. Also, the heat map model may infer a defect from specific channel information according to a type of the defect, without a separate classification module. Even in the heat map model, as the number of layers included in the generator network 231 (refer to FIGS. 3 and/or 4) and the discriminator network 232 (refer to FIGS. 3 and/or 4) increases, the elaborateness of a defect inference image may increase. - In the case where the heat map model is used in the learning of the
generator network 231 and the discriminator network 232, learning may be possible even with inaccurate defect information, as compared to the segmentation model. For example, even though a shape of a defect object included in a conventional defect image is inaccurate or unclear, the heat map model may prevent a fake combination image of low accuracy from being generated by the generator network 231. In the case of the heat map model, a fake combination image that makes it possible to infer a defect more accurately may be generated by calculating average locations of defects, generating a Gaussian heat map associated with the average locations, and performing learning based on the Gaussian heat map. - The wafer
defect inference system 10 may generate an image based on one of the segmentation model and the heat map model. Also, the wafer defect inference system 10 according to the present disclosure may generate an image based on an ensemble model of the segmentation model and the heat map model. In the case where the wafer defect inference system 10 is driven based on the ensemble model, the accuracy of defect inference may be improved even more. -
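The three output data models discussed above (per-pixel segmentation, Gaussian heat map, and their ensemble) can be illustrated with the following sketch. The function names, the fixed sigma, and the simple weighted average are assumptions chosen for the example; the patent does not specify how the ensemble is formed.

```python
import numpy as np

def binary_mask(prob_map: np.ndarray, thr: float = 0.5) -> np.ndarray:
    """Segmentation model (binary/gray-scale variant): predict a class
    per pixel by thresholding a probability map into a defect mask."""
    return (prob_map >= thr).astype(np.uint8)

def gaussian_heat_map(shape, center, sigma: float = 2.0) -> np.ndarray:
    """Heat map model: a 2D Gaussian centered on the (average) defect
    location, instead of a per-pixel segmentation."""
    ys, xs = np.mgrid[0:shape[0], 0:shape[1]]
    cy, cx = center
    return np.exp(-((ys - cy) ** 2 + (xs - cx) ** 2) / (2.0 * sigma ** 2))

def ensemble(prob_map: np.ndarray, heat: np.ndarray, w: float = 0.5) -> np.ndarray:
    """One possible ensemble: a weighted average of the two model outputs."""
    return w * prob_map + (1.0 - w) * heat
```

In the actual system these maps would be produced by the generator network rather than computed in closed form; the sketch only shows the shape of the output data each model yields.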
FIGS. 10A and 10B are diagrams indicating simulation results of the wafer defect inference system 10 (refer to FIG. 1) according to some embodiments. FIGS. 10A and 10B show input images B4 and B7, target images B5 and B8, and output images B6 and B9 as simulation results of the wafer defect inference system 10. FIG. 10A shows a simulation image of the wafer defect inference system 10 based on the segmentation model, and FIG. 10B shows a simulation image of the wafer defect inference system 10 based on the heat map model. - In
FIG. 10A, the input image B4 is a combination image that is obtained by combining an SEM image of a wafer targeted for defect detection and a CAD image of a mask used in a wafer process. The SEM image and the CAD image may be input to the test equipment 100 (refer to FIG. 1) included in the wafer defect inference system 10, combined at the test equipment 100, and input to the computing device 200 (refer to FIG. 1) in the form of the combination image B4. Before combining the SEM image and the CAD image, the test equipment 100 may first perform the alignment operation for template matching. - The target image B5 is the image to be drawn from the wafer
defect inference system 10. The generator network 231 (refer to FIGS. 3 and/or 4) may generate the output image B6 similar to the target image B5, based on the first machine learning performed by the discriminator network 232 (refer to FIGS. 3 and/or 4) and the second machine learning performed by the generator network 231. The output image B6 may indicate various types of defect information such as a location of a defect on a wafer targeted for defect detection, a size of the defect, a shape of the defect, a color of the defect, and a kind of the defect. The reliability of the output image B6 may be proportional to the level of machine learning performed by the computing device 200 with regard to a defect and to the number of layers included in the generator network 231 and the discriminator network 232 of the neuromorphic processor 230 (refer to FIG. 2) in the computing device 200. - In
FIG. 10B, likewise, when the input image B7, corresponding to a combination image of the SEM image and the CAD image, is input to the computing device 200, the output image B9 may be generated based on the target image B8. Unlike the result of FIG. 10A based on the segmentation model, the result of FIG. 10B (e.g., the output image B9) may be generated based on the heat map model. Because the heat map model uses the Gaussian distribution associated with defect information, inferring a defect of circuit patterns on a wafer may be easier than with the segmentation model of FIG. 10A. -
FIG. 11 is a flowchart for describing an operating method of the wafer defect inference system 10 (refer to FIG. 1) according to some embodiments. In operation S310, the wafer defect inference system 10 may receive first image data. The first image data may include an SEM image and a CAD image or may include a conventional defect image for learning of defect information. - In operation S320, the wafer
defect inference system 10 may perform the first machine learning. In this case, the first machine learning means learning for determining whether image data input to the discriminator network 232 (refer to FIGS. 3 and/or 4) is real or fake. The first machine learning of the discriminator network 232 is described in detail with reference to FIG. 5A. - In operation S330, the wafer
defect inference system 10 may perform the second machine learning. In this case, the second machine learning means learning for causing image data generated by the generator network 231 (refer to FIGS. 3 and/or 4) to be determined as real. The second machine learning of the generator network 231 is described in detail with reference to FIG. 5B. - In operation S340, the wafer
defect inference system 10 may receive second image data. The second image data may include an SEM image and a CAD image of a wafer targeted for defect detection. The SEM image and the CAD image input from the outside may be combined by the test equipment 100 (refer to FIG. 1). - In operation S350, the wafer
defect inference system 10 may align the SEM image and the CAD image input in operation S340 around a pattern axis to perform template matching on the SEM image and the CAD image. The test equipment 100 may combine the SEM image and the CAD image aligned around the pattern axis to generate a combination image. However, in some embodiments, operation S350 may be omitted. - In operation S360, the wafer
defect inference system 10 may generate an image including information about a defect existing in circuit patterns on a wafer being a check target, based on the combination image. The defect information may include a location of a defect, a size of the defect, a color of the defect, a kind of the defect, and/or the like. - In operation S370, the wafer
defect inference system 10 may output the image generated in operation S360. The image including the defect information about the circuit pattern on the wafer may be visually transferred to the user through the user output interface. In some embodiments, the defect information may include an indication that the circuit pattern is to be reprocessed and/or discarded. - In operation S380, based on the defect information about the circuit pattern on the wafer, the circuit pattern, wafer, and/or mask may be modified to address the inferred defect. For example, in some embodiments, the circuit pattern and/or wafer may be reprocessed (e.g., in a case wherein the defect is fixable) and/or discarded, based on the type and/or severity of the defect. In some embodiments, the mask associated with the circuit pattern may be modified to reduce the potential for formation of the inferred defect. In some embodiments, operation S380 may be omitted.
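The two learning phases in operations S320 and S330 follow the usual adversarial pattern: the discriminator network 232 learns to tell real combination images from generated ones, and the generator network 231 learns to make its fake images score as real. A minimal numerical sketch of the two objectives is shown below; the standard binary cross-entropy GAN losses are an assumption here, as the patent does not spell out the exact loss functions.

```python
import numpy as np

def d_loss(d_real: np.ndarray, d_fake: np.ndarray) -> float:
    """First machine learning (S320): the discriminator is trained to
    score real combination images near 1 and fake ones near 0."""
    eps = 1e-7  # avoid log(0)
    return float(-np.mean(np.log(d_real + eps) + np.log(1.0 - d_fake + eps)))

def g_loss(d_fake: np.ndarray) -> float:
    """Second machine learning (S330): the generator is trained so that
    the discriminator scores its fake images as real (near 1)."""
    eps = 1e-7
    return float(-np.mean(np.log(d_fake + eps)))
```

A well-trained discriminator (scoring real high and fake low) achieves a lower `d_loss` than an undecided one, and the generator's loss falls as its fakes are scored closer to real, which is the sense in which the two phases pull against each other.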
- According to the present disclosure, the probability that a defect occurs on a wafer may decrease by inferring various types of defect features such as a location of a defect on the wafer, a size of the defect, a shape of the defect, a color of the defect, a kind of the defect, and/or the like.
- According to the present disclosure, because a new type of defect may be inferred through machine learning without data including information about all types of defects, false detections and missed detections that may occur in detecting a defect on a wafer may decrease.
- While the present disclosure has been described with reference to some example embodiments thereof, it will be apparent to those of ordinary skill in the art that various changes and modifications may be made thereto without departing from the spirit and scope of the present disclosure as set forth in the following claims.
Claims (20)
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
KR1020200128348A KR20220045499A (en) | 2020-10-05 | 2020-10-05 | The device and method for detecting defects on the wafer |
KR10-2020-0128348 | 2020-10-05 |
Publications (1)
Publication Number | Publication Date |
---|---|
US20220108436A1 (en) | 2022-04-07 |
Family
ID=80932560
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US17/465,179 (pending; published as US20220108436A1 (en)) | Device and method for detecting defects on wafer | 2020-10-05 | 2021-09-02 |
Country Status (3)
Country | Link |
---|---|
US (1) | US20220108436A1 (en) |
KR (1) | KR20220045499A (en) |
CN (1) | CN114388380A (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR102513580B1 (en) * | 2022-05-06 | 2023-03-24 | 주식회사 스캐터엑스 | method and apparatus for generating wafer image for training wafer classification model, and method and apparatus for generating wafer classification model using the same |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20210026338A1 (en) * | 2019-07-26 | 2021-01-28 | Kla Corporation | System and Method for Rendering SEM Images and Predicting Defect Imaging Conditions of Substrates Using 3D Design |
US20210133989A1 (en) * | 2019-10-31 | 2021-05-06 | Kla Corporation | BBP Assisted Defect Detection Flow for SEM Images |
US20210209418A1 (en) * | 2020-01-02 | 2021-07-08 | Applied Materials Israel Ltd. | Machine learning-based defect detection of a specimen |
US20210279520A1 (en) * | 2020-03-09 | 2021-09-09 | Nanotronics Imaging, Inc. | Defect Detection System |
US20210334946A1 (en) * | 2020-04-24 | 2021-10-28 | Camtek Ltd. | Method and system for classifying defects in wafer using wafer-defect images, based on deep learning |
US20210364450A1 (en) * | 2020-05-22 | 2021-11-25 | Kla Corporation | Defect size measurement using deep learning methods |
US20220375063A1 (en) * | 2019-09-20 | 2022-11-24 | Asml Netherlands B.V. | System and method for generating predictive images for wafer inspection using machine learning |
- 2020-10-05: KR1020200128348A filed in KR (published as KR20220045499A; active, Search and Examination)
- 2021-09-02: US 17/465,179 filed in US (published as US20220108436A1; pending)
- 2021-10-08: CN 202111170392.2A filed in CN (published as CN114388380A; pending)
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20230030088A1 (en) * | 2021-07-30 | 2023-02-02 | The Boeing Company | Systems and methods for synthetic image generation |
US20230043409A1 (en) * | 2021-07-30 | 2023-02-09 | The Boeing Company | Systems and methods for synthetic image generation |
US11651554B2 (en) * | 2021-07-30 | 2023-05-16 | The Boeing Company | Systems and methods for synthetic image generation |
US11900534B2 (en) * | 2021-07-30 | 2024-02-13 | The Boeing Company | Systems and methods for synthetic image generation |
US20230081300A1 (en) * | 2021-09-12 | 2023-03-16 | Nanya Technology Corporation | Method of measuring a semiconductor device |
US11830176B2 (en) * | 2021-09-12 | 2023-11-28 | Nanya Technology Corporation | Method of measuring a semiconductor device |
CN116433661A (en) * | 2023-06-12 | 2023-07-14 | 锋睿领创(珠海)科技有限公司 | Method, device, equipment and medium for detecting semiconductor wafer by multitasking |
Also Published As
Publication number | Publication date |
---|---|
CN114388380A (en) | 2022-04-22 |
KR20220045499A (en) | 2022-04-12 |
Legal Events

Code | Title | Description
---|---|---
AS | Assignment | Owner: SAMSUNG ELECTRONICS CO., LTD., KOREA, REPUBLIC OF. Assignment of assignors interest; assignors: KANG, MIN-CHEOL; SIM, WOOJOO; signing dates from 2021-06-02 to 2021-06-14; reel/frame: 057561/0210. Owner: SEOUL NATIONAL UNIVERSITY R&DB FOUNDATION, KOREA, REPUBLIC OF. Assignment of assignors interest; assignors: KIM, DO-NYUN; KIM, JAEHOON; effective date: 2021-06-24; reel/frame: 057561/0309.
STPP | Information on status: patent application and granting procedure in general | DOCKETED NEW CASE - READY FOR EXAMINATION
STPP | Information on status: patent application and granting procedure in general | NON FINAL ACTION MAILED
STPP | Information on status: patent application and granting procedure in general | RESPONSE TO NON-FINAL OFFICE ACTION ENTERED AND FORWARDED TO EXAMINER