CN117255972A - Method for determining a random metric related to a lithographic process

Info

Publication number: CN117255972A
Application number: CN202280033121.9A
Authority: CN
Inventors: C·巴蒂斯塔基斯; M·皮萨伦科; M·G·M·M·范克莱杰; V·D·鲁蒂格利安尼; S·A·米德尔布鲁克; C·A·弗舒伦; N·盖佩恩
Original assignee: ASML Holding NV
Current assignee: ASML Holding NV
Priority date: 2021-05-06
Filing date: 2022-04-12
Publication date: 2023-12-19

Abstract

A method of determining a random metric, the method comprising: obtaining a trained model, the model having been trained to correlate training optical metrology data with training random metrology data, wherein the training optical metrology data comprises a plurality of measurement signals related to a distribution of intensity-related parameters across zero-order or higher-order diffraction of radiation scattered from a plurality of training structures, and the training random metrology data comprises random metrology values related to the plurality of training structures, wherein the plurality of training structures have been formed to have a variation in one or more dimensions on which the random metrology depends; obtaining optical metrology data comprising a distribution of intensity-related parameters across zero-order or higher-order diffraction of radiation scattered from the structure; and deducing values of the random metric from the optical metrology data using the trained model.

Description

Method for determining a random metric related to a lithographic process

Cross Reference to Related Applications

The present application claims priority from EP application 21172589.0 filed on May 6, 2021, EP application 21179403.7 filed on June 15, 2021, EP application 21214225.1 filed on December 14, 2021, and EP application 22156035.2 filed on February 10, 2022, the entire contents of which are incorporated herein by reference.

Technical Field

The present invention relates to a method and apparatus for applying a pattern to a substrate during a lithographic process.

Background

A lithographic apparatus is a machine that applies a desired pattern onto a substrate, typically onto a target portion of the substrate. For example, a lithographic apparatus can be used to manufacture Integrated Circuits (ICs). In this example, a patterning device (alternatively referred to as a mask or a reticle) may be used to generate a circuit pattern to be formed on an individual layer of the IC. The pattern may be transferred onto a target portion (e.g., including a portion of one or more dies) on a substrate (e.g., a silicon wafer). Transfer of the pattern is typically via imaging onto a layer of radiation-sensitive material (resist) disposed on the substrate. In general, a single substrate will contain a network of adjacent target portions that are successively patterned. Known lithographic apparatus include so-called steppers, in which each target portion is irradiated by exposing the entire pattern onto the target portion at one time, and so-called scanners, in which each target portion is irradiated by scanning the pattern through a radiation beam in a given direction (the "scanning" direction) while synchronously scanning the substrate parallel or anti-parallel to this direction. The pattern may also be transferred from the patterning device to the substrate by imprinting the pattern onto the substrate.

To monitor the lithographic process, parameters of the patterned substrate are measured. Parameters may include, for example, overlay errors between successive layers formed in or on the patterned substrate and critical line width or Critical Dimension (CD) of the developed photoresist. The measurement may be performed on a product substrate and/or a dedicated metrology target. There are various techniques for measuring microstructures formed during photolithography, including the use of scanning electron microscopes and various specialized tools.

In performing a lithographic process, such as applying a pattern to a substrate or measuring such a pattern, process control and/or quality monitoring methods may rely on random analysis of features formed by the lithographic process. Such random analysis currently requires high resolution metrology, which can typically be achieved using Scanning Electron Microscopy (SEM). However, SEM measurement is slow and therefore not suitable for mass production.

Disclosure of Invention

The present invention is directed to random metrology that may be faster than the SEM-based metrology currently used.

In a first aspect of the invention, there is provided a method of determining a random metric relating to a structure, the method comprising: obtaining a trained model that has been trained to correlate training optical metrology data with training random metrology data, wherein the training optical metrology data comprises a plurality of measurement signals related to a plurality of angle-resolved distributions of intensity-related parameters across zero-order or higher-order diffraction included within radiation scattered from a plurality of training structures on a substrate, and the training random metrology data comprises random metrology values related to the plurality of training structures, wherein the plurality of training structures have been formed with variations in one or more dimensions on which the random metrology is dependent; obtaining optical metrology data comprising an angle-resolved distribution of intensity-related parameters across zero-order or higher-order diffraction included within radiation scattered from the structure; and inferring values of the random metric associated with the structure from the optical metrology data using the trained model.

By using a trained model as described, it is possible to obtain an accurate method of deriving random metrics based on less time-consuming (compared to SEM metrology) optical metrology data.
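The train-then-infer flow of the first aspect can be illustrated with a deliberately small sketch. The text above does not specify the model type, so a nearest-neighbour regressor is used here purely as a stand-in; the function names `train` and `infer`, the three-element "pupil" signals, and all numerical values are hypothetical and chosen only for illustration.

```python
import math

def train(signals, metric_values):
    """'Training' a nearest-neighbour regressor only stores the labelled pairs."""
    if len(signals) != len(metric_values):
        raise ValueError("one random metric value is needed per training signal")
    return list(zip(signals, metric_values))

def infer(model, signal, k=2):
    """Average the metric values of the k training signals closest to `signal`."""
    ranked = sorted(model, key=lambda pair: math.dist(pair[0], signal))
    nearest = ranked[:k]
    return sum(value for _, value in nearest) / len(nearest)

# Hypothetical training data: each signal is a flattened angle-resolved
# intensity distribution (here only 3 pixels) from one training structure,
# paired with a reference random metric value (e.g. an SEM-derived LCDU in nm).
training_signals = [
    (0.90, 0.50, 0.10),
    (0.80, 0.55, 0.15),
    (0.60, 0.60, 0.30),
    (0.50, 0.65, 0.40),
]
training_metric = [1.0, 1.2, 2.0, 2.4]

model = train(training_signals, training_metric)

# Optical measurement of a new structure; no SEM measurement is needed here.
estimate = infer(model, (0.82, 0.54, 0.14), k=2)
print(f"inferred random metric: {estimate:.2f}")
```

In practice the measurement signals would have many more elements and a more capable regression model would likely be used; the sketch only shows how reference random metrology enters at training time and is replaced by optical data at inference time.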

In a second aspect of the invention, there is provided a computing device comprising a processor and configured to perform the method of the first aspect.

In a third aspect of the invention, there is provided a scanning electron microscopy apparatus operable to image a plurality of features on a substrate and comprising the computing device of the second aspect.

In a fourth aspect of the invention, there is provided a computer program comprising program instructions operable to perform the method of the first aspect when run on a suitable device.

In a fifth aspect of the present invention, there is provided an optical metrology apparatus comprising: an optical system operable to obtain optical metrology data comprising at least one measurement signal relating to a structure that has been exposed during a lithographic process; a non-transitory data carrier comprising a trained model that has been trained to infer one or more values of a random metric from optical metrology data, the trained model having been trained on training optical metrology data and training random metrology data, wherein: the training optical metrology data comprises a plurality of measurement signals, each measurement signal relating to scattered radiation that has been scattered by one of a plurality of training structures on a training substrate; and the training random metrology data comprises random metric values related to the training structures, wherein multiple instances of the training structures have been formed with variations in one or more process parameters upon which the random metric depends; and a processor operable to infer values of the random metric from the optical metrology data using the trained model.

Other aspects, features, and advantages of the present invention, as well as the structure and operation of various embodiments of the present invention, are described in detail below with reference to the accompanying drawings. It should be noted that the present invention is not limited to the particular embodiments described herein. Such embodiments are presented herein for illustrative purposes only. Additional embodiments will be apparent to those skilled in the relevant art(s) based on the teachings contained herein.

Drawings

Embodiments of the invention will now be described, by way of example, with reference to the accompanying drawings, in which:

FIG. 1 depicts a lithographic apparatus that forms a production facility for semiconductor devices together with other devices;

FIG. 2 schematically depicts two examples of random variation: (a) line edge roughness (LER) and (b) line width roughness (LWR);

FIG. 3 (a) is a schematic diagram of a first optical metrology device operable to implement a method according to one embodiment; and FIG. 3 (b) is a target that can be measured using such a tool;

FIG. 4 (a) is a schematic diagram of a second optical metrology apparatus operable to implement a method according to one embodiment in which EUV and/or SXR radiation is used and FIG. 4 (b) is a diffraction pattern that may be detected using such metrology apparatus;

FIG. 5 is a flow chart describing a first method of training and using a machine learning model to infer random correlation data from optical metrology data in accordance with an embodiment of the present invention;

FIG. 6 is a plot of the defect rate DR (SEM) measured using a scanning electron microscope versus the defect rate DR (IDM) measured using an optical metrology tool such as the one illustrated in FIG. 3 (a) or FIG. 4 (a); and

FIG. 7 is a flow chart describing a second method of training and using a machine learning model to infer random correlation data from optical metrology data in accordance with one embodiment of the present invention.

Detailed Description

Before describing embodiments of the invention in detail, it is instructive to present an example environment in which embodiments of the invention may be implemented.

FIG. 1 illustrates at 200 a lithographic apparatus LA as part of an industrial production facility that implements a high volume lithographic manufacturing process. In this example, the manufacturing process is adapted to manufacture a semiconductor product (integrated circuit) on a substrate such as a semiconductor wafer. Those skilled in the art will appreciate that a wide variety of products can be manufactured by processing different types of substrates in variations of the process. The production of semiconductor products is used purely as an example, one that has great commercial significance today.

Within the lithographic apparatus (or simply "lithographic tool" 200), a measurement station MEA is shown at 202 and an exposure station EXP is shown at 204. A control unit LACU is shown at 206. In this example, each substrate visits the measurement station and the exposure station to have a pattern applied. In an optical lithographic apparatus, for example, a projection system is used to transfer a product pattern from a patterning device MA onto the substrate using conditioned radiation. This is done by forming an image of the pattern in a layer of radiation-sensitive resist material.

The term "projection system" used herein should be broadly interpreted as encompassing any type of projection system, including refractive, reflective, catadioptric, magnetic, electromagnetic and electrostatic optical systems, or any combination thereof, as appropriate for the exposure radiation being used, or for other factors such as the use of an immersion liquid or the use of a vacuum. The patterning device MA may be a mask or reticle that imparts a pattern to a radiation beam that is transmitted or reflected by the patterning device. Well known modes of operation include a step mode and a scan mode. As is well known, the projection system may cooperate with support and positioning systems for the substrate and patterning device in a variety of ways to apply a desired pattern to a number of target portions across the substrate. A programmable patterning device may be used instead of a reticle with a fixed pattern. The radiation may include, for example, electromagnetic radiation in the Deep Ultraviolet (DUV) or Extreme Ultraviolet (EUV) band. The present disclosure is also applicable to other types of lithographic processes, such as imprint lithography and direct write lithography, e.g., by electron beam.

The lithographic apparatus control unit LACU controls all movements and measurements of the various actuators and sensors to receive the substrate W and the reticle MA and to implement patterning operations. The LACU also includes signal processing and data processing capabilities to achieve desired calculations related to the operation of the device. In practice, the control unit LACU will be implemented as a system of many sub-units, each of which processes real-time data acquisition, processing and control of a sub-system or component within the device.

Before the pattern is applied to a substrate at the exposure station EXP, the substrate is processed at the measurement station MEA so that various preparatory steps can be carried out. The preparatory steps may include mapping the surface height of the substrate using a level sensor and measuring the position of alignment marks on the substrate using an alignment sensor. The alignment marks are nominally arranged in a regular grid pattern. However, the marks deviate from the ideal grid due to inaccuracies in creating the marks and also due to deformation of the substrate that occurs throughout its processing. As a result, in addition to measuring the position and orientation of the substrate, the alignment sensor must in practice also measure in detail the positions of many marks across the substrate area if the device is to print product features at the correct positions with very high accuracy. Measuring the alignment marks is therefore very time consuming. The device may be of the so-called dual stage type, having two substrate stages, each with a positioning system controlled by the control unit LACU. While one substrate on one substrate stage is being exposed at the exposure station EXP, another substrate may be loaded onto the other substrate stage at the measurement station MEA so that the various preparatory steps may be performed; providing two substrate stages thus enables a substantial increase in the throughput of the device. If the position sensor IF is not able to measure the position of a substrate stage both when it is at the measurement station and when it is at the exposure station, a second position sensor may be provided to enable tracking of the position of the substrate stage at both stations. The lithographic apparatus LA may, for example, be of a so-called dual stage type having two substrate stages and two stations (i.e. an exposure station and a measurement station), between which the substrate stages may be interchanged.

Within the production facility, the apparatus 200 forms part of a "lithography unit" or "lithography cluster" that also includes a coating device 208 for applying photoresist and other coatings to the substrate W for patterning by the apparatus 200. At the output side of the apparatus 200, a baking device 210 and a developing device 212 are provided for developing the exposed pattern into a physical resist pattern. Between all of these devices, substrate handling systems are responsible for supporting the substrates and transferring them from one device to the next. These devices, often collectively referred to as the track, are under the control of a track control unit, which is itself controlled by a supervisory control system SCS, which also controls the lithographic apparatus via the lithographic apparatus control unit LACU. Thus, the different devices can be operated to maximize throughput and processing efficiency. The supervisory control system SCS receives recipe information R, which provides a detailed definition of the steps to be performed to create each patterned substrate.

Once the pattern has been applied and developed in the lithography unit, the patterned substrate 220 is transferred to other processing devices, such as the processing devices illustrated at 222, 224, 226. A wide range of processing steps are implemented by various means in a typical manufacturing facility. For purposes of example, the device 222 in this embodiment is an etching station, and the device 224 performs a post-etch annealing step. Other physical and/or chemical processing steps are employed in the further apparatus 226 or the like. Many types of operations may be required to fabricate real devices, such as deposition of materials, modification of surface material properties (oxidation, doping, ion implantation, etc.), chemical-mechanical polishing (CMP), etc. The apparatus 226 may in practice represent a series of different process steps performed in one or more apparatuses. As another example, an apparatus and process steps for achieving self-aligned multiple patterning may be provided to produce a plurality of smaller features based on a precursor pattern laid down by a lithographic apparatus.

As is well known, the fabrication of semiconductor devices involves many repetitions of such processing to build up device structures of appropriate materials and patterns, layer by layer, on a substrate. Thus, the substrates 230 arriving at the lithography cluster may be newly prepared substrates, or they may be substrates that have previously been processed in this cluster or in another device entirely. Similarly, depending on the processing required, the substrates 232 leaving the device 226 may be returned for a subsequent patterning operation in the same lithography cluster, they may be destined for patterning operations in a different cluster, or they may be finished products to be sent for dicing and packaging.

Each layer of the product structure requires a different set of process steps, and the devices 226 used at each layer may be entirely different in type. Further, even where the processing steps to be applied by the apparatus 226 are nominally the same, in a large facility there may be several supposedly identical machines working in parallel to perform step 226 on different substrates. Small differences in set-up or faults between these machines may mean that they affect different substrates in different ways. Even steps that are relatively common to each layer, such as etching (device 222), may be implemented by several etching devices that are nominally identical but work in parallel to maximize throughput. Furthermore, in practice, different layers require different etching processes, e.g. chemical etching or plasma etching, depending on the details of the material to be etched and on special requirements such as, for example, anisotropic etching.

The preceding and/or subsequent processes may be performed in other lithographic apparatus, as just mentioned, and may even be performed in different types of lithographic apparatus. For example, some layers in a very demanding device manufacturing process may be performed in a more advanced lithography tool than other layers that are less demanding in terms of parameters such as resolution and overlay. Thus, some layers may be exposed in an immersion lithography tool, while other layers are exposed in a 'dry' tool. Some layers may be exposed in a tool operating at DUV wavelengths, while other layers are exposed using EUV wavelength radiation.

In order for a substrate exposed by a lithographic apparatus to be correctly and consistently exposed, it is desirable to inspect the exposed substrate to measure characteristics such as overlay error between subsequent layers, line thickness, critical dimension (CD), and the like. Thus, the manufacturing facility in which the lithography unit LC is located also includes a metrology system that receives some or all of the substrates W that have been processed in the lithography unit. The measurement results are directly or indirectly supplied to the supervisory control system SCS. If errors are detected, the exposure of subsequent substrates may be adjusted, especially if the metrology can be completed soon enough and fast enough that other substrates of the same lot are still to be exposed. In addition, substrates that have already been exposed can be stripped and reworked to improve yield, or discarded, thereby avoiding further processing of substrates that are known to be defective. In cases where only some target portions of a substrate are defective, further exposures may be performed on only those target portions that are good.

Also shown in FIG. 1 is a metrology device 240, which is provided for measuring a product parameter at a desired stage in the manufacturing process. A common example of a metrology station in a modern lithographic production facility is a scatterometer, e.g. an angle-resolved scatterometer or a spectroscopic scatterometer, which may be applied to measure the properties of the developed substrate at 220 prior to etching in the apparatus 222. Using the metrology device 240, it may be determined, for example, that important performance parameters such as overlay or critical dimension (CD) do not meet specified accuracy requirements in the developed resist. Prior to the etching step, there is an opportunity to strip the developed resist and reprocess the substrate 220 through the lithography cluster. The measurements 242 from the apparatus 240 may be used to maintain accurate performance of patterning operations in the lithography cluster, with the supervisory control system SCS and/or control unit LACU 206 making small adjustments over time, thereby minimizing the risk that products are made out of specification and require rework.

Another example of a metrology station is a Scanning Electron Microscope (SEM), also known as an electron beam (e-beam) metrology device, which may be included in addition to or as an alternative to a scatterometer. As such, the metrology apparatus 240 may include an e-beam or SEM metrology device, either alone or in addition to a scatterometer. E-beam and SEM metrology devices have the advantage of measuring features directly (i.e., they image the features directly), rather than indirectly as in scatterometry (where parameter values are determined from the reconstruction and/or asymmetry of the diffraction orders of the radiation diffracted by the structure being measured). The main disadvantage of e-beam or SEM metrology devices is that their measurement speed is much slower than that of scatterometry, limiting their potential application to specific off-line monitoring processes.

Additionally, metrology devices 240 and/or other metrology devices (not shown) may be employed to measure characteristics of the processed substrates 232, 234 and the incoming substrate 230. The metrology apparatus may be used on processed substrates to determine important parameters such as overlay or CD.

A lithographic projection apparatus typically projects a patterned image (i.e., through a reticle) to a point directly above the substrate and then ultimately into the resist. The projected image is referred to as an aerial image, which comprises the light intensity distribution as a function of spatial position in the image plane. The aerial image is the source of the information exposed into the resist, forming a gradient in dissolution rate that enables the three-dimensional resist image to form during development.

Prediction of randomly induced failures is typically based on one or more random metrics. Such a random metric may include a measure of the random variation of one or more dimensional parameters, for example one or more of: CD (so-called local CD uniformity (LCDU)), line edge position (so-called line edge roughness (LER)), or line width (so-called line width roughness (LWR)). Accurately measuring the number of failures is cumbersome because the expected failure rate in an optimized process may be low (e.g., approximately one part per million to one part per billion).
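To give a sense of why counting failures directly is cumbersome at such rates, the following sketch estimates how many features would need to be inspected to observe at least one failure with a given probability. It assumes, as a simplification not stated in the text, that each feature fails independently with the same probability; the function name `features_needed` and the 95% level are illustrative choices.

```python
import math

def features_needed(failure_rate, confidence=0.95):
    """Features to inspect so that P(at least one failure) >= confidence.

    Assumes independent, identically distributed failures:
    P(at least one failure in n features) = 1 - (1 - p)**n.
    """
    if not 0.0 < failure_rate < 1.0:
        raise ValueError("failure_rate must lie strictly between 0 and 1")
    # log1p keeps precision for very small failure rates.
    return math.ceil(math.log(1.0 - confidence) / math.log1p(-failure_rate))

for rate in (1e-6, 1e-9):
    print(f"failure rate {rate:g}: inspect about {features_needed(rate):.3g} features")
```

Under these assumptions, roughly three million features must be inspected at a failure rate of one part per million, and roughly three billion at one part per billion, merely to expect to see a single failure.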

Imaging using a lithographic projection apparatus can result in random variations of one or more parameters, such as pronounced line width roughness (LWR) and local CD variation of small two-dimensional features such as holes. Random variations can be attributed to factors such as photon shot noise, photon-generated secondary electrons, photon absorption variations, photon-generated acids in the resist, and the like. In EUV lithography, the small feature sizes for which EUV is required further exacerbate this random variation. Random variation of smaller features is an important factor in production yield and therefore warrants inclusion in the various optimization processes of lithographic projection apparatus.

FIG. 2 (a) schematically depicts a random effect, namely line edge roughness (LER). Assuming that all conditions are identical in three exposures, or simulations of exposure, of an edge 903 of a feature on a design layout, the resist images 903A, 903B, and 903C of the edge 903 may have slightly different shapes and locations. Locations 904A, 904B, and 904C of resist images 903A, 903B, and 903C may be measured by averaging resist images 903A, 903B, and 903C, respectively. The LER of edge 903 can be a measure of the spatial distribution of locations 904A, 904B, and 904C. For example, LER may be 3σ of the spatial distribution (assuming that the distribution is a normal distribution). LER may be derived from multiple exposures or simulations of edge 903.

FIG. 2 (b) schematically depicts LWR. Assuming that all conditions are identical in three exposures, or simulations of exposure, of a long rectangular feature 910 having a width 911 on a design layout, the resist images 910A, 910B, and 910C of the rectangular feature 910 may have slightly different widths 911A, 911B, and 911C, respectively. The LWR of rectangular feature 910 may be a measure of the distribution of widths 911A, 911B, and 911C. For example, LWR may be 3σ of the distribution (assuming that the distribution is a normal distribution). LWR may be derived from multiple exposures or simulations of rectangular feature 910. For short features (e.g., contact holes), the width of the image is not well defined, because no long edge is available over which to average the edge position. A similar quantity, LCDU, may be used to characterize the random variation. LCDU is 3σ of the distribution of measured CDs of images of short features (assuming the distribution is a normal distribution).
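The 3σ definitions of LER, LWR and LCDU given above can be written down in a few lines. The sketch below is illustrative only: the measurement values are invented, and the use of the sample standard deviation (rather than the population standard deviation) is an assumption, since the text does not say which estimator is intended.

```python
import statistics

def three_sigma(values):
    """Return 3 * sigma of a set of measurements (sample standard deviation)."""
    if len(values) < 2:
        raise ValueError("at least two measurements are needed")
    return 3.0 * statistics.stdev(values)

# Hypothetical edge locations (cf. 904A, 904B, 904C) from three exposures, in nm.
edge_positions_nm = [10.2, 9.8, 10.0]
# Hypothetical line widths (cf. 911A, 911B, 911C) from three exposures, in nm.
widths_nm = [20.5, 19.5, 20.0]
# Hypothetical CDs of several contact holes, in nm.
hole_cds_nm = [30.1, 29.7, 30.4, 29.8]

ler = three_sigma(edge_positions_nm)
lwr = three_sigma(widths_nm)
lcdu = three_sigma(hole_cds_nm)
print(f"LER = {ler:.2f} nm, LWR = {lwr:.2f} nm, LCDU = {lcdu:.2f} nm")
```

With only three samples the estimate of σ is of course very uncertain; in practice many more exposures, simulations, or features would be used.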

FIG. 3 (a) illustrates one example of a metrology apparatus 100 suitable for use in the embodiments of the invention disclosed herein. The principle of operation of this type of metrology device, and the intra-die metrology (IDM) techniques for which it may be used, are explained in more detail in patent applications US2006-033921, US2010-201963 and WO2017148982, which are incorporated herein by reference in their entirety. An optical axis, which has several branches throughout the device, is indicated by the dashed line O. In this apparatus, radiation emitted by a source 110 (e.g., a xenon lamp) is directed onto a substrate W via an optical system comprising a lens system 120, an aperture plate 130, a lens system 140, a partially reflective surface 150, and an objective lens 160. In one embodiment, these lens systems 120, 140, 160 are arranged in a double sequence of a 4F arrangement. In one embodiment, the radiation emitted by the radiation source 110 is collimated using the lens system 120. If desired, a different lens arrangement may be used. The angular range of radiation incident on the substrate may be selected by defining a spatial intensity distribution in a plane that presents the spatial spectrum of the substrate plane. In particular, this may be achieved by inserting an aperture plate 130 of suitable form between lenses 120 and 140, in a plane that is a back-projected image of the pupil plane of the objective lens. By using different apertures, different intensity distributions (e.g., annular, dipole, etc.) may be obtained. The angular distribution of the illumination in the radial and peripheral directions, as well as characteristics such as the wavelength, polarization and/or coherence of the radiation, may be adjusted to achieve the desired result. For example, one or more interference filters 130 may be provided between the source 110 and the partially reflective surface 150 to select a wavelength of interest in a range such as 400 nm to 900 nm, or even lower (such as 200 nm to 300 nm). The interference filter may be tunable rather than comprising a set of different filters. A grating may be used instead of an interference filter. In one embodiment, one or more polarizers 170 may be provided between the source 110 and the partially reflective surface 150 to select a polarization of interest. The polarizer may be tunable rather than comprising a set of different polarizers.

The target T is placed with the substrate W normal to the optical axis O of the objective lens 160. Thus, radiation from the source 110 is reflected by the partially reflective surface 150 and focused via the objective lens 160 into an illumination spot S on a target T on the substrate W (see FIG. 3 (b)). In one embodiment, the objective lens 160 has a high Numerical Aperture (NA), desirably at least 0.9 or at least 0.95. An immersion metrology device (using a fluid of relatively high refractive index, such as water) may even have a numerical aperture greater than 1.
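The numerical aperture values quoted above can be related to the largest angle the objective can illuminate or collect through the standard relation NA = n·sin θ, where n is the refractive index of the medium between lens and substrate. The sketch below applies that relation; the function name and the refractive index of about 1.33 used for water are illustrative assumptions, not values taken from the text.

```python
import math

def max_half_angle_deg(numerical_aperture, refractive_index=1.0):
    """Largest half-angle (degrees from the optical axis) for NA = n * sin(theta)."""
    ratio = numerical_aperture / refractive_index
    if not 0.0 < ratio <= 1.0:
        raise ValueError("NA must be positive and cannot exceed the refractive index")
    return math.degrees(math.asin(ratio))

# Dry objective (air, n = 1.0) at the NA values mentioned in the text.
for na in (0.9, 0.95):
    print(f"NA {na}: half-angle {max_half_angle_deg(na):.1f} degrees")

# Immersion case: NA above 1 is only possible because n > 1 (n = 1.33 assumed for water).
print(f"NA 1.2 in water: half-angle {max_half_angle_deg(1.2, 1.33):.1f} degrees")
```

An NA of 0.95 in air thus corresponds to rays up to roughly 72 degrees from the optical axis, which is why a high-NA objective captures a wide angular range of the scattered radiation.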

The illumination rays 170, 172 focused onto the illumination spot at an angle off the axis O give rise to diffracted rays 174, 176. It should be remembered that these rays are just one of many parallel rays covering the area of the substrate that includes the target T. Each element within the illumination spot is within the field of view of the metrology device. Because the aperture in the plate 130 has a finite width (necessary to admit a useful amount of radiation), the incident rays 170, 172 will actually occupy a range of angles, and the diffracted rays 174, 176 will be spread out somewhat. According to the point spread function of a small target, each diffraction order will be further spread over a range of angles rather than forming a single ideal ray as shown.

At least the zeroth order diffracted by the target on the substrate W is collected by the objective lens 160 and directed back through the partially reflective surface 150. The optical element 180 provides at least a portion of the diffracted beam to an optical system 182, which uses the zeroth order and/or first order diffracted beams to form a diffraction spectrum (pupil plane image) of the target T on a sensor 190 (e.g., a CCD sensor or a CMOS sensor). In one embodiment, an aperture 186 is provided to filter out certain diffraction orders so that a particular diffraction order is provided to the sensor 190. In one embodiment, the aperture 186 allows substantially or primarily only the zeroth order radiation to reach the sensor 190. In one embodiment, the sensor 190 may be a two-dimensional detector such that a two-dimensional angular scatter spectrum of the substrate target T may be measured. The sensor 190 may be, for example, an array of CCD or CMOS sensors, and may use an integration time of, for example, 40 milliseconds per frame. The sensor 190 may be used to measure the intensity of redirected radiation at a single wavelength (or a narrow range of wavelengths), the intensity separately at multiple wavelengths, or the intensity integrated over a range of wavelengths. Furthermore, the sensor may be used to separately measure the intensity of radiation with transverse magnetic polarization and/or transverse electric polarization, and/or the phase difference between transverse magnetic and transverse electric polarized radiation.
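Which diffraction orders actually reach the objective depends on the target pitch, the wavelength and the numerical aperture. As a rough illustration, the sketch below applies the standard one-dimensional grating equation, sin θ_m = sin θ_i + m·λ/p, and keeps the orders whose direction falls inside the NA. The function name, the example wavelength and pitches, the sign convention, and the assumption of a dry objective are all illustrative choices rather than details taken from the text.

```python
def collected_orders(wavelength_nm, pitch_nm, na, sin_incidence=0.0, max_order=5):
    """Diffraction orders m whose propagation direction lies within the objective NA.

    Uses the 1-D grating equation sin(theta_m) = sin(theta_i) + m * wavelength / pitch
    (in air; the sign convention is an assumption of this sketch).
    """
    orders = []
    for m in range(-max_order, max_order + 1):
        sin_theta_m = sin_incidence + m * wavelength_nm / pitch_nm
        if abs(sin_theta_m) <= na:
            orders.append(m)
    return orders

# Relatively coarse pitch: zeroth and both first orders are collected.
print(collected_orders(wavelength_nm=500, pitch_nm=600, na=0.95))

# Fine, product-like pitch: only the zeroth order is collected.
print(collected_orders(wavelength_nm=500, pitch_nm=100, na=0.95))
```

This illustrates why, for structures with small pitch measured at visible wavelengths, the pupil image on the sensor may contain only the zeroth order, whose angle-resolved intensity distribution then carries the information used for metrology.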

Optionally, the optical element 180 provides at least a portion of the diffracted beam to the measurement branch 200 to form an image of the target on the substrate W on the sensor 230 (e.g., a CCD sensor or a CMOS sensor). The measurement branch 200 may be used for various auxiliary functions, such as focusing the metrology device (i.e. enabling the substrate W to be focused with the objective lens 160) and/or for dark field imaging in which the image is formed with the zeroth order blocked so that it comprises only a single diffraction order or complementary diffraction order pair.

In order to provide a customized field of view for gratings of different sizes and shapes, an adjustable field stop 300 is provided in the lens system 140 on the path from the source 110 to the objective lens 160. The field stop 300 contains an aperture 302 and is located in a plane conjugate to the plane of the target T such that the illumination spot becomes an image of the aperture 302. The image may be scaled according to a magnification factor, or the aperture and illumination spot may have a 1:1 size relationship. To adapt the illumination to different types of measurements, the aperture plate 300 may include a plurality of aperture patterns formed around a disk that rotates to bring the desired pattern into place. Alternatively or additionally, a set of plates 300 may be provided and exchanged to achieve the same effect. Additionally or alternatively, a programmable aperture device such as a deformable mirror array or a transmissive spatial light modulator may be used.

Typically, the target will be aligned with its periodic structure features extending parallel to either the Y-axis or the X-axis. With respect to its diffraction behavior, a periodic structure with features extending in a direction parallel to the Y-axis has periodicity in the X-direction, while a periodic structure with features extending in a direction parallel to the X-axis has periodicity in the Y-direction. To measure performance in both directions, both types of features are typically provided. Although reference will be made to lines and spaces for simplicity, the periodic structure need not be formed by lines and spaces. Moreover, each line and/or space between lines may be a structure formed of smaller substructures. Further, the periodic structure may be periodic in two dimensions at once, for example where it includes pillars and/or holes.

Fig. 3 (b) illustrates a plan view of a typical target T and the extent of the illumination spot S in the apparatus of fig. 3 (a). To obtain a diffraction spectrum that is free of interference from surrounding structures, in one embodiment, the target T is a periodic structure (e.g., a grating) that is larger than the width (e.g., diameter) of the illumination spot S. The width of the illumination spot S may be smaller than the width and length of the target. In other words, the target is 'underfilled' by the illumination, and the diffraction signal is substantially free of any signal from product features or the like outside the target itself. This simplifies the mathematical reconstruction of the target, as it can be regarded as infinite.

The apparatus depicted in fig. 3 (a) may be used to determine values of one or more variables of interest of the target pattern based on measurement data obtained using metrology measurements. The radiation detected by detector 190 provides a measured radiation profile (or more generally, an angle-resolved parameter profile) for target T.

Fig. 4 (a) depicts a schematic representation of a metrology device 302 in which radiation in the wavelength range 0.01nm to 100nm can be used to measure parameters of structures on a substrate. The metrology device 302 represented in fig. 4 (a) may be adapted for use in the hard X-ray, soft X-ray or EUV fields.

FIG. 4 (a) illustrates a schematic physical arrangement of another metrology device 302 that may be used in the methods disclosed herein. Purely by way of example, the metrology device 302 comprises a spectroscatterometer using (optionally grazing incidence) hard X-rays (HXR) and/or soft X-rays (SXR) and/or EUV radiation. Such an apparatus is referred to herein as an SXR measurement device for performing SXR measurements, and the resulting image will be referred to as an SXR image, irrespective of the actual wavelength used.

The inspection apparatus 302 includes a radiation source or so-called illumination source 310, an illumination system 312, a substrate support 316, detection systems 318, 398, and a Metrology Processing Unit (MPU) 320.

In this example, the illumination source 310 is used to generate EUV, hard X-ray or soft X-ray radiation. The illumination source 310 may be based on High Harmonic Generation (HHG) technology as shown in fig. 4 (a), or may be another type of illumination source, such as a liquid metal jet source, an Inverse Compton Scattering (ICS) source, a plasma channel source, a magnetic undulator source, or a Free Electron Laser (FEL) source.

For the example of a HHG source, as shown in fig. 4 (a), the main components of the radiation source are a pump radiation source 330 operable to emit pump radiation and a gas delivery system 332. Optionally, the pump radiation source 330 is a laser; optionally, the pump radiation source 330 is a pulsed high power infrared or optical laser. The pump radiation source 330 may for example be a fiber-based laser with an optical amplifier that generates pulses of infrared radiation that may last for example less than 1ns (1 nanosecond) per pulse, with pulse repetition rates up to several megahertz if desired. The wavelength of the infrared radiation may for example be in the region of 1 μm (1 micrometer). Optionally, the laser pulses are delivered as first pump radiation 340 to the gas delivery system 332, where a portion of the radiation is converted in the gas to frequencies higher than that of the first radiation, becoming emitted radiation 342. The gas supply 334 supplies a suitable gas to the gas delivery system 332, where the gas is optionally ionized by a power supply 336. The gas delivery system 332 may be a cut tube. The gas provided by the gas delivery system 332 defines a gas target, which may be a gas flow or a static volume. The gas may be, for example, a noble gas such as neon (Ne), helium (He), or argon (Ar). N2, O2, Ar, Kr and Xe gases may also be considered. These may be selectable options within the same device.

The emitted radiation may comprise a plurality of wavelengths. If the emitted radiation were monochromatic, the measurement calculations (e.g. reconstruction) could be simplified, but radiation with several wavelengths is easier to generate. The emission divergence angle of the emitted radiation may be wavelength dependent. Different wavelengths will, for example, provide different levels of contrast when imaging structures of different materials. For example, to inspect metal or silicon structures, wavelengths may be selected that are different from the wavelengths used to image features of the (carbon-based) resist or to detect contamination of these different materials. One or more filtering devices 344 may be provided. For example, a filter such as an aluminum (Al) or zirconium (Zr) film may be used to block the fundamental IR radiation from passing further into the inspection apparatus. A grating (not shown) may be provided to select one or more specific wavelengths from the generated wavelengths. Optionally, some or all of the beam path may be contained in a vacuum environment, bearing in mind that SXR and/or EUV radiation is absorbed when traveling in air. The various components of the radiation source 310 and the illumination optics 312 are adjustable to implement different metrology 'recipes' in the same apparatus. For example, different wavelengths and/or polarizations may be selected.

Depending on the material of the structure being inspected, different wavelengths may provide the desired level of penetration for the lower layers. Short wavelengths may be preferred in order to resolve the smallest device feature and the defects in the smallest device feature. For example, one or more wavelengths in the range of 0.01nm to 20nm, or alternatively in the range of 1nm to 10nm, or alternatively in the range of 10nm to 20nm, may be selected. Critical angles for wavelengths shorter than 5nm can be very low when reflecting off of materials of interest in semiconductor fabrication. Thus, choosing wavelengths greater than 5nm may provide a stronger signal at higher angles of incidence. On the other hand, wavelengths up to 50nm may be useful if the inspection task is to detect the presence of a certain material, e.g. to detect contamination.

The filtered beam 342 enters the inspection chamber 350 from the radiation source 310. In the inspection chamber 350, a substrate W including a structure of interest is held by the substrate support 316 at a measurement position for inspection. The structure of interest is labeled T. Optionally, the atmosphere within the inspection chamber 350 may be maintained near vacuum by a vacuum pump 352 such that SXR and/or EUV radiation may pass through the atmosphere without excessive attenuation. The illumination system 312 has the function of focusing radiation into a focused beam 356, and may include, for example, a two-dimensional curved mirror or a series of one-dimensional curved mirrors, as described in the above-referenced U.S. patent application US2017/0184981A1 (the contents of which are incorporated herein by reference in their entirety). Focusing is performed to obtain a circular or elliptical spot S with a diameter of less than 10 μm when projected onto the structure of interest. The substrate support 316 includes, for example, an X-Y translation stage and a rotation stage by which any portion of the substrate W can be brought to the focal point of the beam in a desired orientation. Thus, a radiation spot S is formed on the structure of interest. Alternatively or additionally, the substrate support 316 includes, for example, a tilt stage that can tilt the substrate W at an angle to control the angle of incidence of the focused beam on the structure of interest T.

Optionally, the illumination system 312 provides a reference radiation beam to the reference detector 314, which may be configured to measure the spectrum and/or the intensities of different wavelengths in the filtered beam 342. The reference detector 314 may be configured to generate a signal 315 that is provided to the processor 320, and the signal may include information about the spectrum of the filtered beam 342 and/or the intensities of the different wavelengths in the filtered beam.

The reflected radiation 360 is captured by the detector 318 and the spectrum is provided to the processor 320 for calculating characteristics of the target structure T. Thus, illumination system 312 and detection system 318 form an inspection device. The inspection device may include a hard X-ray, soft X-ray and/or EUV spectral reflectometer of the kind described in US2016282282A1, the contents of which are incorporated herein by reference in their entirety.

If the target Ta has a certain periodicity, the radiation of the focused beam 356 may also be partially diffracted. In contrast to reflected radiation 360, diffracted radiation 397 follows another path at a well-defined angle relative to the angle of incidence. In fig. 4 (a), the diffracted radiation 397 is depicted schematically, and the diffracted radiation 397 may follow many other paths than the depicted path. Inspection device 302 may also include other detection systems 398 that detect and/or image at least a portion of diffracted radiation 397. A single other detection system 398 is depicted in fig. 4 (a), but embodiments of inspection apparatus 302 may also include more than one other detection system 398 arranged at different positions to detect and/or image diffracted radiation 397 in multiple diffraction directions. In other words, the (higher) diffraction orders of the focused radiation beam impinging on the target Ta are detected and/or imaged by one or more other detection systems 398. The one or more detection systems 398 generate signals 399 that are provided to the metrology processor 320. The signal 399 may comprise information on the diffracted light 397 and/or may comprise an image obtained from the diffracted light 397.
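The "well-defined angle" of a diffraction order can be illustrated with the standard grating equation. The sketch below is a minimal illustration only: the function name, the sign convention (angles measured from the surface normal, with the zeroth order leaving at the incidence angle) and the numerical values are assumptions, not taken from this disclosure.

```python
import math

def diffraction_angle_deg(wavelength_nm, pitch_nm, incidence_deg, order):
    """Return the diffraction angle in degrees, or None if the order does not propagate.

    Grating equation (assumed convention): sin(theta_m) = sin(theta_i) + m * wavelength / pitch,
    with both angles measured from the surface normal.
    """
    s = math.sin(math.radians(incidence_deg)) + order * wavelength_nm / pitch_nm
    if abs(s) > 1.0:
        return None  # evanescent order: no propagating beam reaches a detector
    return math.degrees(math.asin(s))

# Hypothetical values: 13.5 nm radiation, 40 nm pitch, 30 degrees incidence.
for m in (-1, 0, 1):
    print(m, diffraction_angle_deg(13.5, 40.0, 30.0, m))
```

Because the angle depends on wavelength for every non-zero order, a broadband beam is spread out spectrally in the higher orders while the zeroth order is not.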

To assist in aligning and focusing the spot S with the desired product structure, the inspection apparatus 302 may also provide auxiliary optics that use auxiliary radiation under the control of the metrology processor 320. The metrology processor 320 can also be in communication with a position controller 372 that operates the translation stage, the rotation stage, and/or the tilt stage. The processor 320 receives high accuracy feedback regarding the position and orientation of the substrate via sensors. For example, a sensor 374 may include an interferometer, the accuracy of which may be in the picometer range. In operating the inspection apparatus 302, the spectral data 382 captured by the detection system 318 is transferred to the metrology processing unit 320.

Fig. 4 (b) shows a diffraction image that can be obtained by measuring a target (e.g., a target such as that shown in fig. 3 (a)). The light is diffracted and multiple orders are captured on the detector. In this diagram, the zeroth order (specular reflection) and the two first diffraction orders are shown. All orders except the specular reflection are spectrally resolved (thus, a 2D pattern is formed by the first orders). Note that the soft X-ray setup of fig. 4 (a) measures the entire spectrum at once, whereas the metrology device of fig. 3 (a) measures multiple angles at once; that is, the image of fig. 4 (b) is spectrally resolved, whereas the pupil image captured by the metrology device of fig. 3 (a) is angle resolved.

In either metrology apparatus, a substrate support may be provided to hold a substrate W during a measurement operation. In one example, where the metrology apparatus is integrated with a lithographic apparatus, both apparatuses may share the same substrate table. Coarse and fine positioners may be provided to accurately position the substrate relative to the measurement optics. For example, various sensors and actuators are provided to acquire the position of a target of interest and bring it to a position under the objective lens. Typically, many measurements are made on targets at different locations across the substrate W. The substrate support may be movable in the X-direction and the Y-direction to acquire different targets, and may be movable in the Z-direction to obtain a desired position of the targets relative to the focal point of the optical system. It is convenient to think of and describe operations as if the objective lens were being brought to different positions relative to the substrate when, in practice, the optical system may remain substantially stationary (typically in the X-direction and Y-direction, but possibly also in the Z-direction) and only the substrate moves. Provided the relative position of the substrate and the optical system is correct, it does not matter in principle which of them is moving in the real world, whether both are moving, or whether part of the optical system is moving (e.g. in the Z-direction and/or tilt direction) while the rest of the optical system is stationary and the substrate is moving (e.g. in the X-direction and Y-direction, but optionally also in the Z-direction and/or tilt direction).

In one embodiment, the measurement accuracy and/or sensitivity of the target may vary with respect to one or more properties of the radiation beam provided onto the target, such as the wavelength of the radiation beam, the polarization of the radiation beam, the intensity distribution (i.e., angular intensity distribution or spatial intensity distribution) of the radiation beam, and so forth. Thus, a specific measurement strategy may be selected that ideally achieves good measurement accuracy and/or sensitivity of, for example, the target.

To monitor a patterning process (e.g., a device manufacturing process) that includes at least one pattern transfer step (e.g., a photolithography step), the patterned substrate is inspected and one or more parameters of the patterned substrate are measured/determined. The one or more parameters may include, for example, an overlay between successive layers formed in or on the patterned substrate, a Critical Dimension (CD) (e.g., critical line width) of a feature formed in or on the patterned substrate, a focus or focus error of the photolithography step, a dose or dose error of the photolithography step, an optical aberration of the photolithography step, a positional error (e.g., edge positional error), and the like. The measurement may be performed on a target of the product substrate itself and/or a dedicated metrology target provided on the substrate. The measurement may be performed after resist development but before etching, or may be performed after etching.

In one embodiment, the parameter obtained from the measurement process is a parameter derived from a parameter determined directly from the measurement process. As one example, a derived parameter obtained from a measured parameter is the edge position error (EPE) for the patterning process. The edge position error describes the variation in the position of the edges of structures created by the patterning process. In one embodiment, the edge position error is derived from the overlay value. In one embodiment, the edge position error is derived from a combination of the overlay value and at least one random metric. In one embodiment, the edge position error is derived from a combination of the overlay value, at least one CD random metric value (e.g., CDU, LCDU), and (optionally) another random metric (e.g., edge roughness, shape asymmetry, etc. of the individual structures). In one embodiment, the edge position error includes an extreme value (e.g., 3 standard deviations, i.e., 3σ) of the combined overlay error and CD error. In one embodiment, the edge position error has the following form (or includes at least the first two of the following):

where $\sigma_{\mathrm{overlay}}$ corresponds to the standard deviation of the overlay, $\sigma_{\mathrm{CDU\,structures}}$ corresponds to the standard deviation of the critical dimension uniformity (CDU) of structures created during patterning, $\sigma_{\mathrm{OPE,PBA}}$ corresponds to the standard deviation of the optical proximity effect (OPE) and/or the proximity bias average (PBA), which is the difference between the CD at pitch and a reference CD, and $\sigma_{\mathrm{LER,LPE}}$ corresponds to the standard deviation of line edge roughness (LER) and/or local position error (LPE). While the above formula is expressed in terms of standard deviation, it can be formulated in a different comparable statistical manner, such as in terms of variance.
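As a generic illustration only (not the specific formula of this embodiment, whose weighting factors are not assumed here), independent contributions expressed as standard deviations are commonly combined in quadrature, and the same combination can equally be stated as a sum of variances. The function name, the unit weights and the numerical values below are all hypothetical.

```python
import math

def combined_sigma(sigmas):
    """Root-sum-square of independent standard deviations (unit weights are an assumption)."""
    return math.sqrt(sum(s * s for s in sigmas))

# Hypothetical contributions in nanometers, for illustration only.
contributions = {
    "overlay": 1.5,
    "cdu_structures": 1.0,
    "ope_pba": 0.5,
    "ler_lpe": 1.2,
}

sigma_total = combined_sigma(contributions.values())
variance_total = sigma_total ** 2  # equivalent formulation in terms of variance
print(round(sigma_total, 3), round(variance_total, 3))
```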

Various techniques exist for measuring structures formed during patterning, including the use of scanning electron microscopes, image-based measurement tools, and/or various specialized tools. As discussed above, a fast and non-invasive form of specialized metrology tool is one that directs a beam of radiation onto a target on the surface of a substrate and measures the characteristics of the scattered (diffracted/reflected) beam. By evaluating one or more characteristics of radiation scattered by the substrate, one or more characteristics of the substrate may be determined. This may be referred to as diffraction-based metrology. One application of diffraction-based metrology is the measurement of feature asymmetry within a target. This may be used, for example, as a measure of overlay, but other applications are also known. For example, asymmetry may be measured by comparing opposite portions of the diffraction spectrum (e.g., comparing the -1st order and +1st order in the diffraction spectrum of a periodic grating). This may be done as described above and, for example, in U.S. patent application publication No. US2006-066855, the entire contents of which are incorporated herein by reference. Another application of diffraction-based metrology is the measurement of feature widths (CDs) within a target. Such techniques may use the apparatus and methods described above with respect to fig. 3 or fig. 4.
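The comparison of the -1st and +1st orders can be sketched as a simple intensity difference. This is a minimal sketch under stated assumptions: the function names are hypothetical, and the normalized variant is a common convenience rather than something specified in this disclosure.

```python
def asymmetry(i_plus, i_minus):
    """Intensity difference between the +1st and -1st diffraction orders."""
    return i_plus - i_minus

def normalized_asymmetry(i_plus, i_minus):
    """Difference divided by the sum, which removes the overall intensity scale (assumed variant)."""
    total = i_plus + i_minus
    if total == 0:
        raise ValueError("both orders have zero intensity")
    return (i_plus - i_minus) / total

# Hypothetical intensities in arbitrary units.
print(asymmetry(1.2, 1.0), normalized_asymmetry(3.0, 1.0))
```

A perfectly symmetric target gives equal intensities in both orders and therefore zero asymmetry.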

The object or structure measured by the apparatus as shown in fig. 3 (a) or fig. 4 and by the methods disclosed herein may comprise one or more geometrically symmetric unit cells or features. As such, the target T or structure may include only a single physical instance of a unit cell or feature, or may include multiple physical instances of a unit cell or feature.

The target/structure may be a specifically designed target. In one embodiment, the target is for a scribe lane. In one embodiment, the targets may be on-die targets, i.e., targets that are within the device pattern (and thus between scribe lanes). In one embodiment, the target may have a feature width or pitch comparable to the device pattern features. For example, the target feature width or pitch may be less than or equal to 300% of the minimum feature size or pitch of the device pattern, less than or equal to 200% of the minimum feature size or pitch of the device pattern, less than or equal to 150% of the minimum feature size or pitch of the device pattern, or less than or equal to 100% of the minimum feature size or pitch of the device pattern.

The object or structure may be a device structure. For example, the object or structure may be a portion of a memory device (which typically has one or more features that are or may be geometrically symmetric). In the case where the device structure is aperiodic or irregular (e.g., a logic structure), the target may resemble the logic structure (e.g., having similar feature sizes and configurations) such that it includes a regularized abstraction of the logic structure that mimics the exposure behavior of the logic structure.

Ideally, for each structure, a physical instance of a unit cell/feature, or multiple physical instances of a unit cell/feature together, fill the beam spot of the measurement device. In this case, the measured result includes essentially only information from the physical instance of the unit cell (or its multiple physical instances). In one embodiment, the beam spot has a cross-sectional width of 50 microns or less, 40 microns or less, 30 microns or less, 20 microns or less, 15 microns or less, 10 microns or less, 5 microns or less, or 2 microns or less. The pitch of the structural features may be on the scale of 20nm, so if the beam spot is, for example, 5 μm wide, each measurement or capture may comprise 200 to 300 features (e.g., about 250 features). As such, each structure or target may include hundreds of features.
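The figure of about 250 features follows directly from dividing the spot width by the pitch. A minimal sketch of that arithmetic (the function name is hypothetical, and a one-dimensional line pattern across the spot width is assumed):

```python
def features_across_spot(spot_width_um, pitch_nm):
    """Number of periodic features spanned by the spot width (1D pattern assumed)."""
    return round(spot_width_um * 1000.0 / pitch_nm)

# Values from the text: 5 um spot, 20 nm pitch.
print(features_across_spot(5.0, 20.0))
```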

Stochastic effects arise from fluctuations in the absorbed dose, caused by the finite number of (absorbed) photons, and from resist chemical noise. In practice, this manifests as variation of the CD between features and/or along the length of a feature. As such, a random metric in the context of the present disclosure may include a defect rate or other defect metric, or, for example, an average or mean of: line edge roughness (LER), line width roughness (LWR), local CD uniformity (LCDU), contact hole LCDU, circle edge roughness (CER), edge position error (EPE), or a combination thereof.

Currently, failure rates may be determined by counting defects in SEM (e.g., e-beam) images. Typically, this failure rate estimation is performed by collecting multiple measurement points (e.g., contact holes (CH)) and counting the number of failures within the sample. Although e-beam measurement is accurate, it is time consuming and therefore not always practical, nor suitable for measuring large numbers of defects. Because the mean CD is closely related to the pattern defect rate, CD-SEM is useful in high-volume manufacturing (HVM) as a yield indicator. A mean CD calculated from, e.g., several hundred CHs is sufficient to obtain a rough estimate of the failure rate. However, CH arrays with the same mean CD may have different failure rates due to focus fluctuations.
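The counting approach, and the reason it becomes time consuming at low failure rates, can be sketched as follows. The function names and the target of roughly ten observed failures are assumptions chosen for illustration.

```python
import math

def failure_rate(failed_flags):
    """Fraction of inspected features (e.g. contact holes) flagged as failed."""
    flags = list(failed_flags)
    if not flags:
        raise ValueError("no measurement points supplied")
    return sum(1 for f in flags if f) / len(flags)

def min_sample_size(expected_rate, expected_failures=10):
    """Rough number of features to inspect in order to observe about expected_failures failures."""
    return math.ceil(expected_failures / expected_rate)

# Hypothetical SEM sample: one failed contact hole out of four inspected.
print(failure_rate([True, False, False, False]))
# At a hypothetical failure rate of 1e-4, many thousands of features must be imaged.
print(min_sample_size(1e-4))
```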

It is proposed herein to use optical metrology (e.g., scatterometry-based metrology) pupil measurements for fast random defect rate estimation across a wafer. Such metrology may use raw pupil data (e.g., in-device metrology (IDM) raw pupil data) or SXR images (e.g., a 2D spectrally resolved image such as that illustrated in fig. 4 (b), obtained using a metrology tool such as that illustrated in fig. 4 (a)) as input to a trained model (e.g., a machine learning model such as a neural network model or convolutional neural network (CNN)) that is trained to infer defect rate predictions and/or other random metric predictions from raw pupil data/SXR spectrally resolved diffraction image data.

With respect to the IDM embodiment, each measurement may produce a corresponding angle-resolved measurement signal. For example, the in-device metrology may be based on detection of a measurement signal comprising an angle-resolved distribution (e.g., an angle-resolved intensity and/or diffraction efficiency distribution) in a pupil plane from radiation scattered by structures on the wafer after the structures are illuminated. The diffraction efficiency (dimensionless value) describes the relative intensity of the diffracted beam and may include the ratio of the diffracted light intensity to the incident light intensity. Such an angle-resolved distribution measured at a pupil plane will be referred to simply as "pupil" or "pupil measurement" in the following description. The pupil used may be, for example, an original pupil or an unprocessed pupil (optionally, except for any normalization).
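The diffraction efficiency described above is a ratio of diffracted to incident intensity, so a measured pupil can be converted by a per-pixel division. A minimal sketch, assuming the pupil is held as a 2D list of intensities and that a single scalar incident intensity applies to every pixel (both assumptions; the function name is hypothetical):

```python
def diffraction_efficiency(pupil, incident_intensity):
    """Convert a 2D pupil of diffracted intensities into dimensionless efficiencies."""
    if incident_intensity <= 0:
        raise ValueError("incident intensity must be positive")
    return [[value / incident_intensity for value in row] for row in pupil]

# Hypothetical 2x2 pupil in arbitrary intensity units.
raw_pupil = [[1.0, 2.0], [0.5, 0.0]]
print(diffraction_efficiency(raw_pupil, 4.0))
```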

The angle-resolved distribution may be obtained from only the zeroth order of the radiation scattered by the structure, only one or more higher orders of the radiation scattered by the structure, or from a combination of the zeroth order and one or more higher orders of the radiation scattered by the structure. IDM measurements that analyze the zeroth order (intensity distribution) to infer overlay/CD at device resolution (provided the device features are periodic) are described, for example, in the aforementioned WO2017148982.

With respect to SXR embodiments, each measurement may produce a corresponding spectrally resolved measurement signal. For example, SXR measurements may be based on detection of measurement signals comprising bright field images or spectrally resolved distributions (e.g., spectrally resolved intensity and/or diffraction efficiency distributions or images) measured in one or more (e.g., conjugate) pupil planes from radiation scattered by structures on the wafer after illuminating the structures. The diffraction efficiency (dimensionless value) describes the relative intensity of the diffracted beam and may include the ratio of the diffracted light intensity to the incident light intensity. Such a spectrally resolved distribution measured at the pupil plane will be referred to in the following description simply as "SXR image" or "SXR measurement" (independent of the actual wavelength used). The SXR image used may be, for example, an original SXR image or an unprocessed SXR image (optionally, except for any normalization). Features on the order of the device pitch can be resolved due to the small wavelengths used in SXR metrology. Thus, a better correlation with SEM measurements can be expected for SXR measurements than for IDM measurements.

IDM and/or SXR measurements may be performed on objects or structures that include features that are similar or identical in size to product features, and/or may be performed directly on product features if the product features are sufficiently regular (e.g., periodic). IDM/SXR measurement data may be obtained from pre-etch measurements of structures in resist (i.e., after-develop inspection (ADI)) and/or from post-etch measurements (i.e., after-etch inspection (AEI)).

For example, such a method may facilitate hybrid metrology techniques including a combination of optical metrology and SEM metrology, such that, for example, a fast optical wafer scan may be performed, and the results of such optical metrology used to direct slower but more accurate (or at least higher resolution) SEM inspection to several critical locations.
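One way such hybrid guidance could work is to rank the optically predicted defect rates and send only the highest-ranked locations to the SEM. The sketch below is an assumption about how the selection might be implemented; the function name, the location keys and the inspection budget are hypothetical.

```python
def select_sem_sites(predictions, budget):
    """Return the `budget` wafer locations with the highest predicted defect rate.

    predictions: dict mapping a location key (e.g. field/die coordinates) to the
    defect rate predicted from the optical measurement at that location.
    """
    ranked = sorted(predictions.items(), key=lambda item: item[1], reverse=True)
    return [location for location, _ in ranked[:budget]]

# Hypothetical optical predictions for four locations.
optical_predictions = {(0, 0): 1e-6, (0, 1): 3e-4, (1, 0): 2e-5, (1, 1): 1e-3}
print(select_sem_sites(optical_predictions, budget=2))
```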

In optical (e.g., IDM or SXR) metrology, the measurement resolution is not high enough to directly detect individual random variations and defects; for example, only about one defective feature may occur per 10000 or more good features (note that the measurement spot may comprise more than 100 individual features in a single measurement). However, the inventors have determined that such measurements (IDM or SXR) may be used as inputs to a suitably trained model, which can then provide a very accurate estimate of one or more random metrics, such as total defect rate or other defect metrics and/or LCDU, for example when tested across varying process parameters (e.g., varying dose and focus conditions).

Random defects may be generated by both photon shot noise and resist chemical noise. Thus, random variability in resist depends on both aerial image and resist. The inventors have observed that the random nature of a particular pattern correlates well with the average geometry and material properties of the pattern, which information is present in the IDM original pupil or SXR diffraction image. For example, the defect rate and LCDU variation for a given pattern and resist varies with variations in process parameters such as dose and/or focus variation. The IDM pupil or SXR diffraction image contains information about the averaged 3D contour geometry, which also varies with dose and/or focus variation. In this manner, by varying one or more process parameters (e.g., focus exposure matrix) and/or feature sizes between targets on a training structure or training substrate, a machine learning model may be trained on pupil/SXR diffraction images obtained from measurements of these structures. Process parameter (such as focus and/or dose) variations cause changes in the geometry characteristics (sensitivity depends on resist characteristics). These geometric characteristics (which may be measured using SEM/e-beam tools) are related to random metrics and may also be measured by optical (e.g., IDM/SXR) measurements.

It should be appreciated that the effect of focus on the printed pattern may not be captured by the e-beam tool, but may still be captured by the optical tool. As such, variations in dose and/or focus may enable the e-beam tool to capture random pattern changes (e.g., failure rate and LCDU); however, the inventors have determined that some 3D changes affecting the failure rate may actually be better captured via optical metrology.

The machine learning model may be trained on a process window based on known or observed defect rates or other random metrics. The process window defines a process space comprising process parameter values that are expected to produce good or non-defective dies (at least with acceptable probability), such that process parameter values outside of the process window may be expected to result in dies having an unacceptable probability of defects. For example, a machine learning model may be trained on such a defect rate-based process window, measured by an e-beam/SEM tool or any other tool of sufficient resolution to directly measure random metrics/defect rates. In a particular example, the process window may include a focus exposure window, where focus and dose are the process parameters of interest and are varied over the structures measured by the e-beam/SEM tool to define the process window. Optical measurements (e.g., pupil measurements or SXR diffraction images) across all or a portion of the process window or focus exposure window are obtained from the same wafer measured by the e-beam/SEM tool. The optical measurements and the corresponding SEM-based defect rate data/process window may be used together as training input for the machine learning model. For example, each optical measurement may be labeled with its corresponding defect rate data and process parameter value(s) and used to train the machine learning model. In addition to or as an alternative to focus and/or dose, the process parameters may be related to parameters of the reticle used for exposure (e.g., reticle feature sizes on which imaged feature sizes such as CD depend). By varying the reticle feature size, the CD may be intentionally varied across multiple structures on the training wafer, providing a local CD randomness measure (e.g., LCDU) on which the machine learning model may be trained, so that the machine learning model may map IDM pupils/SXR diffraction images to LCDU predictions. As with the focus/dose example, the local CD variation may be associated with a process window that includes LCDU values that are expected to occur with acceptable probabilities.
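The labeling described above can be illustrated with a minimal Python sketch. The `TrainingSample` container, the `build_training_set` helper, the field names, the units and all numerical values are hypothetical placeholders introduced only for illustration; they are not taken from the disclosure.

```python
from dataclasses import dataclass

import numpy as np


@dataclass
class TrainingSample:
    """One labeled optical measurement (hypothetical container)."""
    pupil: np.ndarray     # optical measurement, e.g. shape (channels, height, width)
    focus_nm: float       # focus setting used to expose the structure (assumed unit)
    dose_mj_cm2: float    # dose setting used to expose the structure (assumed unit)
    defect_rate: float    # label obtained from e-beam/SEM metrology


def build_training_set(pupils, conditions, sem_defect_rates):
    """Pair each optical measurement with its process conditions and SEM label."""
    if not (len(pupils) == len(conditions) == len(sem_defect_rates)):
        raise ValueError("every pupil needs exactly one condition and one label")
    return [
        TrainingSample(np.asarray(p, dtype=float), float(f), float(d), float(r))
        for p, (f, d), r in zip(pupils, conditions, sem_defect_rates)
    ]


# Synthetic placeholder data, not measured values.
rng = np.random.default_rng(0)
pupils = [rng.random((2, 8, 8)) for _ in range(3)]
conditions = [(-20.0, 58.0), (0.0, 60.0), (20.0, 62.0)]
sem_defect_rates = [1e-3, 1e-6, 5e-4]
samples = build_training_set(pupils, conditions, sem_defect_rates)
```

Keeping the process parameter values alongside each label allows the same set of samples to be used either for regression to the random metric alone or for analysis per focus/dose condition.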

In addition to pupil data/SXR diffraction image data from the wafer corresponding to the SEM data, the training data may also include nominal information signals (e.g., nominal information pupils/SXR diffraction images) obtained from a reference and/or from simulation. Such nominal information signals may relate to non-defective structures/wafers (e.g., simulated pupils from perfectly formed structures) and/or to structures/wafers having specific examples of specific defects. In this way, the model can learn how to compare the optically measured data with the nominal information signal and better regress their differences to a given failure rate. As such, the training data may include a tensor containing the measured optical data (IDM pupils or SXR diffraction images from the exposed training wafer(s)) and a nominal information signal (e.g., a nominal measured or simulated IDM pupil or SXR diffraction image as described).
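One possible way of forming such a tensor is sketched below in Python. Stacking the measured signal and the nominal signal along a new leading axis is an assumption made for illustration; the disclosure does not specify the tensor layout, and `make_input_tensor` is a hypothetical helper.

```python
import numpy as np


def make_input_tensor(measured, nominal):
    """Stack a measured signal and a nominal signal along a new leading axis."""
    measured = np.asarray(measured, dtype=float)
    nominal = np.asarray(nominal, dtype=float)
    if measured.shape != nominal.shape:
        raise ValueError("measured and nominal signals must share a shape")
    # Index 0 holds the measured signal, index 1 the nominal signal.
    return np.stack([measured, nominal], axis=0)


# Synthetic placeholder pupils, not measured values.
measured_pupil = np.full((4, 4), 0.8)
nominal_pupil = np.full((4, 4), 1.0)
x = make_input_tensor(measured_pupil, nominal_pupil)
```

With this layout, a convolutional model sees both signals at every pixel position, which is one way of letting it learn from their difference.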

The trained machine learning model may then be used to infer defect failure rates and/or other stochastic metrics based on pupil measurement inputs.

The machine learning model may be a CNN. More specifically, the CNN may include an input layer, an output layer, and hidden layers therebetween. The hidden layers may include several repetitions of, for example, a convolutional layer, an activation layer, and a batch normalization layer, followed by one or more dropout layers and one or more fully connected layers. In one embodiment, the activation layer may apply a logarithmic activation function so as to linearly span defect rates that range over several orders of magnitude.
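The motivation for working on a logarithmic scale can be illustrated with a short Python sketch. The exact form of the logarithmic activation function is not specified in the text, so the sketch does not implement the network layer itself; it only shows, using hypothetical helpers `to_log_space` and `from_log_space` and made-up rates, that defect rates spanning several decades become evenly spaced after a base-10 logarithm.

```python
import math


def to_log_space(defect_rate, floor=1e-12):
    """Map a defect rate to log10 space; `floor` (an assumption) avoids log(0)."""
    return math.log10(max(defect_rate, floor))


def from_log_space(log_rate):
    """Invert to_log_space for rates at or above the floor."""
    return 10.0 ** log_rate


# Made-up defect rates, each a factor of 100 apart.
rates = [1e-8, 1e-6, 1e-4, 1e-2]
log_rates = [to_log_space(r) for r in rates]
# Equal ratios between rates become equal steps in log space.
steps = [b - a for a, b in zip(log_rates, log_rates[1:])]
```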

Fig. 5 is a flow chart describing such a method. At step 400, a wafer is exposed using a scanner, wherein at least one process parameter is varied across the wafer. As such, the exposed wafer may include a plurality of training structures, wherein each training structure may include instances of a plurality of features. The training structures may all be similar except that one or more process parameters used in their formation may be different. The process parameters in this context may describe parameters of the lithographic apparatus (e.g., focus and/or dose) used to image the structure from the reticle and/or reticle parameters such as reticle feature sizes (the imaged feature sizes such as CDs depend on the reticle parameters). For example, these structures may be repeated for different focus and/or dose values, e.g., in a similar manner as the focus exposure matrix FEM and/or using different CD values.
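As a minimal Python sketch of such a focus exposure matrix, the following enumerates every combination of a set of focus values and a set of dose values. The helper name, the dictionary keys, the units and the numerical values are hypothetical and serve only to illustrate the idea of repeating the structures over a grid of process parameter values.

```python
from itertools import product


def focus_exposure_matrix(focus_values, dose_values):
    """Return one exposure condition per (focus, dose) combination."""
    return [
        {"focus_nm": f, "dose_mj_cm2": d}
        for f, d in product(focus_values, dose_values)
    ]


# Hypothetical settings: 5 focus offsets x 5 doses = 25 conditions.
fem = focus_exposure_matrix([-40, -20, 0, 20, 40], [56, 58, 60, 62, 64])
```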

The training structures used to train the model may be similar or substantially identical to the structures that will be measured to obtain the optical metrology data from which the trained model will infer random metrics in a production or HVM setting. However, this is not strictly essential and some differences may be tolerated, although such differences may affect the accuracy of the trained model's inference.

At step 410, high resolution metrology data is obtained from measurements of structures on the wafer exposed at step 400 using a high resolution metrology tool, e.g., having a resolution sufficient to enable individual imaging of each feature or structure and/or direct determination of defect rate. As such, the high resolution metrology tool may have a higher resolution than the optical metrology tool and may include an SEM/e-beam tool. Based on such high resolution metrology data, random metrology data describing a process window of a process can be determined. The process window may describe a process space or range of process parameter values within which a number of defects/defect rate or other random metric is acceptable, e.g., below a threshold value. If the process parameters remain within the process window, this indicates that the probability of no defects is acceptable, and outside of this window the number of defects/defect rate or other random metric may be considered unacceptable (i.e., indicating that the probability of no defects is unacceptable).
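A threshold-based process window of this kind can be sketched in Python as follows. The dictionary of defect rates per (focus, dose) condition, the threshold value and the helper names are hypothetical; the values are synthetic and chosen only so that the example is easy to follow.

```python
def process_window(defect_rates, threshold):
    """Return the set of (focus, dose) conditions whose defect rate is below threshold."""
    return {cond for cond, rate in defect_rates.items() if rate < threshold}


def in_window(window, focus, dose):
    """Check whether a given (focus, dose) condition lies inside the window."""
    return (focus, dose) in window


# Synthetic SEM-based defect rates keyed by (focus, dose); not measured values.
sem_defect_rates = {
    (-20, 58): 3e-4, (-20, 60): 2e-6, (-20, 62): 8e-5,
    (0, 58): 5e-6, (0, 60): 1e-7, (0, 62): 4e-6,
    (20, 58): 6e-4, (20, 60): 3e-6, (20, 62): 2e-4,
}
window = process_window(sem_defect_rates, threshold=1e-5)
```

In this sketch the window is simply the set of sampled conditions that pass the threshold; a practical implementation might instead fit a continuous region to the sampled conditions.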

At step 420, the same wafer may be measured using an optical metrology tool, such as in a pupil plane, to obtain pupil measurement or optical metrology data. As such, at step 400, the wafer may be exposed with structures or targets suitable for such optical metrology. In one embodiment, the optical metrology data may include an angle-resolved distribution (e.g., obtained via IDM metrology) or a spectrally resolved distribution (e.g., obtained via SXR metrology), such as an angle-resolved intensity distribution, a spectrally resolved intensity distribution, an angle-resolved diffraction efficiency distribution, or a spectrally resolved diffraction efficiency distribution. The intensity or diffraction efficiency may be normalized (more details are provided below). The optical metrology data can relate to measurements made of the same structure under different illumination conditions (e.g., combinations of one or more of illumination wavelength, illumination polarization, and wafer orientation).

At step 430, a machine learning model, such as a deep convolutional neural network, is trained on the random metrology data and the optical metrology data such that it can map or regress the optical metrology data (e.g., pupil images or SXR diffraction images) to the random metrology data (e.g., specific failure rates or LCDU values). As already mentioned, additional metrology data of a type similar to that of the optical metrology data (e.g. pupil images or SXR diffraction images) may also be included in the training data, the additional metrology data being related to the nominal information signal, e.g. to a specific random metrology example (e.g. zero defect or specific defect type/rate). The additional metrology data may include simulated and/or measured nominal information signals.

As is well known, training may include a validation step, such that the training data is divided into a training set and a validation set. In one embodiment, validation may be performed on a CDU wafer. The CDU wafer may be exposed at an optimal dose and optimal focus (e.g., using values known from the training set), and may also include dose and focus variations that have not previously been seen by the trained model. This further tests the model's ability to "interpolate" between the dose and focus conditions of the training set.

At step 440, the trained model may be used to infer random metric values from one or more optical measurement pupils or SXR diffraction images (e.g., IDM or SXR measurements on targets/regular product structures) associated with a production wafer. This step may include inputting the pupils or SXR diffraction images into the trained model, which can then infer random metric values from the input.

It can be demonstrated that the diffraction efficiency distribution can show better predictive performance when the machine learning model has been validated on validation data including individual defect values for each target compared to the intensity distribution. The intensity distribution embodiment exhibits more acceptable performance when the validation data is averaged over multiple targets, e.g., the validation data includes defect rates for multiple pupils or SXR diffraction images (for multiple targets), which are averaged (e.g., an average of the logarithms of the defect rates).
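The averaging of the logarithms of the defect rates mentioned above can be sketched in Python as follows. The helper name and the two example rates are hypothetical. Averaging in log space is equivalent to taking a geometric mean, which is less dominated by the largest rate than an arithmetic mean.

```python
import math


def log_mean_defect_rate(rates):
    """Average defect rates in log10 space and map the result back to a rate.

    All rates must be strictly positive, since log10 is undefined otherwise.
    """
    logs = [math.log10(r) for r in rates]
    return 10.0 ** (sum(logs) / len(logs))


# Made-up per-target defect rates.
rates = [1e-6, 1e-4]
geo = log_mean_defect_rate(rates)      # mean of logs: (-6 + -4) / 2 = -5
arith = sum(rates) / len(rates)        # arithmetic mean, dominated by 1e-4
```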

Fig. 6 is a plot of defect rates obtained in a conventional manner via high resolution (e.g., SEM) metrology DR (SEM) versus defect rates obtained via optical (e.g., IDM) metrology DR (IDM) using a trained model in accordance with the teachings of the present disclosure. It can be seen that there is a near perfect correlation between the values obtained by the two methods, so that model inference from optical metrology performs substantially as well as conventional SEM metrology for measuring defect rates. Tests have shown that the relationship between mean LCDU based on SEM measurements and mean LCDU values obtained from optical metrology using trained models and the teachings herein is very similar. Similar correlations are expected to be found for other random metrics as well.

As stated, training may be based on intensity optical metrology data or diffraction efficiency optical metrology data. There are some advantages in using diffraction efficiency, including less dependence on the size of the training data set (i.e., smaller training sets perform better, e.g., with less than 50, 40, or 35 targets per field) and better performance when using a trained model that is validated on a single target measurement rather than on an averaged measurement of several targets. In contrast, the intensity-based embodiment shows reliable inferences when validated on averaged metrology data (e.g., more than 10 or more than 20 targets or more than 50 targets).

While ideally, training of the model may be performed on all process parameter values that may be encountered during production, the inventors have determined that the trained model can predict, with good accuracy, from the pupil, a random metric related to the process parameter values that are not used during training and therefore have never been encountered before. In this way, the model can "interpolate" (and possibly extrapolate beyond) the dose and focus conditions for the training set.

Step 420 may include an optional normalization of the measured pupil intensities or diffraction efficiencies. The normalized intensity $\mathrm{Intensity\_norm}_{p,i,j,k}$ may be determined by first determining an intermediate normalized intensity $\mathrm{Intensity\_norm}'_{p,i,j,k}$:

$$\mathrm{Intensity\_norm}'_{p,i,j,k} = \frac{\mathrm{Intensity}_{p,i,j,k}}{\max_{p,i,j}\left(\mathrm{Intensity}_{p,i,j,k}\right)}$$

where $i = 1..n_{\mathrm{fields\_train}}$ is the training data field index, $j = 1..n_{\mathrm{targets\_train}}$ is the training data target index, $k = 1..n_{\mathrm{channels}}$ is the training data channel index (where a channel may relate to the illumination condition (polarization) and the orientation of the substrate), and $p = 1..n_{\mathrm{pixels}}$ is the training data pixel index (wherein each pupil may comprise a plurality of pixels, each pixel having a corresponding intensity (or diffraction efficiency) value $\mathrm{Intensity}_{p,i,j,k}$). The term $\max_{p,i,j}\left(\mathrm{Intensity}_{p,i,j,k}\right)$ describes the maximum intensity value within the dataset. Diffraction efficiency normalization may be performed in the same manner.

An optional second step may comprise determining the normalized intensity $\mathrm{Intensity\_norm}_{p,i,j,k}$ as:

[equation not reproduced in the source text]

where $k = 1..n_{\mathrm{channels}}$ and $p = 1..n_{\mathrm{pixels}}$. Diffraction efficiency normalization may be performed in the same manner. Alternatively, the intermediate normalized intensity $\mathrm{Intensity\_norm}'_{p,i,j,k}$ may be used. Alternatively, an idealized (e.g., simulation-based) parameter-of-interest pupil may be subtracted from the normalized intensity pupil.
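One reading of the first normalization step is sketched below in Python. It assumes that each intensity value is divided by the maximum taken over pixels, fields and targets for its channel, which is what the description of the maximum term suggests; this is an interpretation rather than a confirmed formula. The array layout (pixels, fields, targets, channels) and the helper name are likewise assumptions, and the optional second step is not sketched because its formula cannot be recovered from the text.

```python
import numpy as np


def normalize_per_channel(intensity):
    """Divide by the per-channel maximum over pixels, fields and targets.

    `intensity` is assumed to have shape (n_pixels, n_fields, n_targets, n_channels),
    matching the index order p, i, j, k used in the text.
    """
    intensity = np.asarray(intensity, dtype=float)
    # Maximum over p, i, j for each channel k; result has shape (1, 1, 1, n_channels).
    channel_max = intensity.max(axis=(0, 1, 2), keepdims=True)
    return intensity / channel_max


# Synthetic placeholder intensities: 16 pixels, 3 fields, 4 targets, 2 channels.
rng = np.random.default_rng(0)
raw = rng.uniform(0.1, 5.0, size=(16, 3, 4, 2))
norm = normalize_per_channel(raw)
```

After this step, the largest value in every channel equals one, so channels recorded under different illumination conditions are placed on a comparable scale.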

The above description relates to obtaining optical measurements such as pupils (e.g., imaged at a pupil plane using a tool such as that illustrated in fig. 3 (a)) or SXR diffraction images (e.g., imaged at a pupil plane using a tool such as that illustrated in fig. 4 (a)) and mapping them to random metrics using a suitably trained model.

In another embodiment, a similar technique will be applied to bright field images (as opposed to spectrally resolved SXR images that have been described) obtained using a bright field inspection tool. Bright field inspection tools are used for defect detection in integrated circuit manufacturing processes.

Bright Field Inspection (BFI) images may be obtained by illuminating a sample (e.g., a structure on a substrate) with high angle incident light (e.g., 45 degrees to 90 degrees relative to horizontal) that produces a "bright" field of view, collecting radiation reflected by the structure, and imaging the reflected radiation at an image plane on a camera. By looking at the difference between the acquired BFI image and the reference BFI image of the same pattern, the presence of defects may be detected (with limited accuracy).

In general, in BFI images, defects appear darker against a brighter background. However, a typical BFI image contains much more information than a single dark spot indicative of a defect; it typically contains many other candidate (typically smaller) dark spots, such that the background resembles a white noise image. If the classification algorithm is not sufficiently robust, each of these other dark spots may be erroneously identified as a defect. BFI images may be strongly affected by the surrounding pattern, and random contour changes may be detected.
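The conventional comparison against a reference image can be illustrated with a minimal Python sketch. The helper name, the threshold value and the synthetic images are hypothetical. The sketch flags pixels that are darker than the reference by more than the threshold, and thereby also shows why a fixed threshold alone is fragile: any sufficiently dark noise spot would be flagged as well.

```python
import numpy as np


def candidate_defect_mask(image, reference, threshold):
    """Flag pixels that are darker than the reference by more than `threshold`."""
    image = np.asarray(image, dtype=float)
    reference = np.asarray(reference, dtype=float)
    if image.shape != reference.shape:
        raise ValueError("image and reference must share a shape")
    # Defects appear darker, so only a positive (reference - image) difference counts.
    return (reference - image) > threshold


# Synthetic placeholder images: a uniform bright reference and one dark spot.
reference = np.ones((5, 5))
image = reference.copy()
image[2, 3] = 0.4
mask = candidate_defect_mask(image, reference, threshold=0.3)
```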

It is expected that the BFI image will show sensitivity to line edge roughness. However, current BFI image processing typically classifies defects within the BFI image only in order to extract relevant pattern defect rates, without using valuable surrounding pattern information within the image.

As such, a machine learning model, such as a deep convolutional neural network, may be used to regress the full BFI image to a given pattern defect rate. Thus, the set of images obtained via BFI may be mapped to a pattern failure rate determined during the training phase, e.g., by an SEM tool or the like. The input of the model may be a tensor comprising the measured bright field image(s). Similar to the previous embodiments, the training data may also include one or more nominal information bright field images from a reference and/or simulation (e.g., related to a zero defect image and/or a specific defect). In this way, the model will learn how to compare the measured image with the nominal information image and better regress their differences to a given failure rate.

By combining the predictions for multiple independent targets per field, the method can be extended to provide uncertainty estimates for the defect rates.

Fig. 7 is a flow chart describing such a method. At step 500, a wafer is exposed using a scanner, wherein at least one process parameter is varied across the wafer. As such, the exposed wafer may include a plurality of training structures, each of which may include a plurality of feature instances. The training structures may all be similar except that one or more process parameters used in their formation may be different. The process parameters in this context may describe parameters of the lithographic apparatus (e.g., focus and/or dose) used to image the structure from the reticle and/or reticle parameters such as reticle feature sizes (the imaged feature sizes such as CDs depend on the reticle parameters). For example, these structures may be repeated for different focus and/or dose values, e.g., in a similar manner as the focus exposure matrix FEM and/or using different CD values.

The training structures used to train the model may be similar or substantially identical to the structures that will be measured to obtain the inspection image data (e.g., image plane data or bright field inspection image data such as BFI images) from which the trained model will infer random metrics in a production or HVM setting. However, this is not strictly essential and some differences may be tolerated, although such differences may affect the accuracy of the trained model's inference.

At step 510, high resolution metrology data is obtained from the measurements of the structures on the wafer exposed at step 500 using a high resolution metrology tool, e.g., of a resolution sufficient to enable individual imaging of each feature or structure and/or direct accurate classification of defects, thereby determining pattern defect rates. As such, the high resolution metrology tool may have a higher resolution than the optical metrology tool and may include an SEM/e-beam tool. Based on the high resolution metrology data, random metrology data describing a process window of a process may be determined. The process window may describe a process space or range of process parameter values within which a number of defects/defect rate or other random metric is acceptable, e.g., below a threshold value. If the process parameters remain within the process window, this indicates that the probability of no defects is acceptable, and outside of this window the number of defects/defect rate or other random metric may be considered unacceptable (i.e., indicates that the probability of no defects is unacceptable).

At step 520, the same wafer may be measured using an inspection image tool (e.g., a bright field inspection tool) that captures an image in an image plane to obtain inspection image data. The inspection image data may be related to measurements made of the same structure under different illumination conditions (e.g., combinations of one or more of illumination wavelength, illumination polarization, and wafer orientation).

At step 530, a machine learning model, such as a deep convolutional neural network, is trained on the random metric data and the inspection image data such that it can map or regress the inspection image data to the random metric data (e.g., a particular failure rate or LCDU value). As already mentioned, additional metrology data of a type similar to that of the inspection image data (e.g. BFI image) may also be included in the training data, the additional metrology data being related to the nominal information image, e.g. to a specific random metric example (e.g. zero defect or specific defect type/rate). The additional metrology data may include simulated and/or measured nominal information images.

As with the previous embodiments, a verification step may be performed, which may be the same as the verification step described previously.

At step 540, the trained model may be used to infer random metric values from one or more inspection images (e.g., bright field inspection images) related to a production wafer. This step may include inputting the inspection images into the trained model, which can then infer random metric values from the input.

It can be shown that the diffraction efficiency distribution can show better predictive performance when the machine learning model has been validated on validation data comprising individual defect values for each target compared to the intensity distribution. The intensity distribution embodiment demonstrates more acceptable performance when the validation data is averaged over multiple targets; for example, the verification data includes defect rates of a plurality of pupils (for a plurality of targets) or inspection images, which are averaged (e.g., an average of the logarithms of the defect rates).

The proposed method may combine predictions of several independent objects/structures (e.g. where the independence is related to a specific field of view) in order to obtain uncertainty estimates as well as more accurate predictions.
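A minimal Python sketch of such a combination is given below. Averaging in log10 space and reporting the standard error of the mean as the uncertainty estimate are assumptions made for illustration; the text does not specify how the predictions are combined. The helper name and the per-target values are hypothetical.

```python
import math
import statistics


def combine_predictions(predicted_rates):
    """Combine per-target defect rate predictions into one estimate.

    Returns the log-space mean mapped back to a rate, together with the
    standard error of the mean in log10 units. Needs at least two predictions.
    """
    logs = [math.log10(r) for r in predicted_rates]
    mean_log = statistics.fmean(logs)
    sem_log = statistics.stdev(logs) / math.sqrt(len(logs))
    return 10.0 ** mean_log, sem_log


# Made-up model predictions for four independent targets in one field.
per_target = [2e-6, 1e-6, 4e-6, 2e-6]
rate, sem_log = combine_predictions(per_target)
```

Under these assumptions, the standard error shrinks with the square root of the number of independent targets, which is one way in which combining targets can yield a more accurate prediction.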

The methods disclosed herein may be applied to performing EPE measurements using an optical metrology tool/scatterometer. EPE (described above) is a combined metric that includes both a random metric and a systematic metric. In one embodiment, EPE may be determined by using a trained model to determine LCDU as disclosed herein and combining it with conventional overlay measurements (e.g., diffraction-based overlay or micro-diffraction based overlay methods using optical metrology tools).

It should be appreciated that any defect rate prediction provided by the trained machine learning model is not binary (i.e., defective/non-defective); rather, the model may predict the expected defect rate for the particular pattern or patterns being inspected. For example, the machine learning model may be trained separately for each individual pattern/structure to obtain a dedicated model for that pattern/structure type. For example, such a model may be trained on critical patterns with small process windows. In another embodiment, a single model is trained to map pupils of multiple different features/targets to a single defect rate prediction or a single "global" defect rate. In such an embodiment, it may be sufficient to train on only one pattern to teach the model to infer such a global defect rate.

Other embodiments of the invention are disclosed below in the numbered clause list.

1. A method of determining at least one random metric related to a lithographic process, the method comprising:

obtaining a trained machine learning model that has been trained to infer one or more random metric values for the random metric from optical metrology data, the trained machine learning model having been trained on training optical metrology data and training random metric data, wherein the training optical metrology data comprises a plurality of measurement signals, each measurement signal relating to scattered radiation scattered by one of a plurality of training structures on a training substrate; and the training random metric data includes random metric values related to the training structure, wherein multiple instances of the training structure have been formed with variations in one or more process parameters upon which the random metric depends; obtaining optical metrology data comprising at least one measurement signal relating to a structure that has been exposed during a lithographic process; and inferring a value of the random metric from the optical metrology data using the trained machine learning model.

2. The method of clause 1, wherein each of the measurement signals comprises an angle-resolved parameter distribution.

3. The method of clause 2, wherein each of the angle-resolved parameter distributions comprises an angle-resolved intensity distribution or an angle-resolved diffraction efficiency distribution.

4. The method of clause 2 or 3, wherein each angle-resolved parameter distribution comprises an angle-resolved parameter distribution obtained from a zeroth order of the scattered radiation.

5. The method of clause 2, 3, or 4, wherein each angle-resolved parameter distribution comprises an angle-resolved parameter distribution obtained from one or more higher orders of the scattered radiation, the higher orders comprising diffraction orders different from the zeroth order.

6. The method according to any one of clauses 2 to 5, comprising the step of normalizing the angle-resolved parameter distribution.

7. The method of clause 1, wherein each of the measurement signals comprises a spectrally resolved parameter distribution.

8. The method of clause 7, wherein each of the spectrally resolved parameter distributions comprises a spectrally resolved intensity distribution or a spectrally resolved diffraction efficiency distribution.

9. The method according to clause 7 or 8, wherein each spectrally resolved parameter distribution is obtained from measurement radiation comprising wavelengths between 5 nm and 30 nm, or more specifically between 10 nm and 20 nm.

10. The method of clause 7, 8, or 9, wherein each spectrally resolved parameter distribution comprises a spectrally resolved parameter distribution obtained from at least one or more higher orders of the scattered radiation, the higher orders comprising diffraction orders different from the zeroth order.

11. The method according to any one of clauses 7 to 10, comprising the step of normalizing the spectrally resolved parameter distribution.

12. The method of any preceding clause, wherein the machine learning model comprises a convolutional neural network.

13. The method of clause 12, wherein the convolutional neural network comprises one or more activation layers that apply a logarithmic activation function.

14. The method of any preceding clause, wherein the training random metric data describes an acceptable space or range of random metric values or related dimensional metric values and a corresponding acceptable space or range of process parameter values for the one or more process parameters.

15. The method according to any preceding clause, comprising the initial steps of:

obtaining the training optical metrology data and the training random metric data; and

training the machine learning model on the training optical metrology data and the training random metric data.

16. The method according to clause 15, comprising: obtaining high resolution metrology data; and determining the training random metric data from the high resolution metrology data.

17. The method of clause 16, wherein the high resolution metrology data is obtained from scanning electron microscope metrology.

18. The method of any preceding clause, wherein the training optical metrology data further comprises nominal information metrology data relating to one or both of:

non-defect measurement and/or simulation; and

specific defect measurements or simulations.

19. The method of any preceding clause, comprising using the inferred values of the random metric to decide where and/or when to perform additional high resolution measurements.

20. The method of any preceding clause, wherein the one or more process parameters include one or both of focus and dose when forming the training structure.

21. The method of any preceding clause, wherein the one or more process parameters comprise one or more feature sizes on a patterning device used to expose the training structure.

22. The method of any preceding clause, wherein the random metric comprises one or more of: defect rate or other defect measure, line edge roughness, line width roughness, local critical dimension uniformity, circular edge roughness, or edge placement error.

23. A processing apparatus comprising a processor and configured to perform the method of any preceding clause.

24. An optical inspection device operable to measure and obtain the optical metrology data and comprising the processing apparatus of clause 23.

25. A computer program comprising program instructions operable, when run on a suitable device, to perform the method of any one of clauses 1 to 22.

26. A non-transitory computer program carrier comprising a computer program according to clause 25.

27. An optical metrology apparatus comprising:

an optical system operable to obtain optical metrology data comprising at least one measurement signal relating to a structure that has been exposed during a lithographic process; a non-transitory data carrier comprising a trained machine learning model that has been trained to infer one or more random metric values for the random metric from optical metrology data, the trained machine learning model having been trained on training optical metrology data and training random metric data, wherein the training optical metrology data comprises a plurality of measurement signals, each measurement signal relating to scattered radiation scattered by one of a plurality of training structures on a training substrate; and the training random metric data includes random metric values related to the training structure, wherein multiple instances of the training structure have been formed with variations in one or more process parameters upon which the random metric depends; and a processor operable to infer values of the random metric from the optical metrology data using the trained machine learning model.

28. The optical metrology apparatus of clause 27, wherein each of the measurement signals includes an angle-resolved parameter distribution.

29. The optical metrology apparatus of clause 28, wherein each angle-resolved parameter distribution includes an angle-resolved intensity distribution or an angle-resolved diffraction efficiency distribution.

30. The optical metrology apparatus of clause 28 or 29, wherein each angle-resolved parameter distribution comprises an angle-resolved parameter distribution obtained from a zeroth order of the scattered radiation.

31. The optical metrology apparatus of any one of clauses 28 to 30, wherein each angle-resolved parameter distribution comprises an angle-resolved parameter distribution obtained from one or more higher orders of the scattered radiation, the higher orders comprising diffraction orders different from the zeroth order.

32. The optical metrology apparatus of clause 27, wherein each of the measurement signals includes a spectrally resolved parameter distribution.

33. The optical metrology apparatus of clause 32, wherein each spectrally resolved parameter distribution includes a spectrally resolved intensity distribution or a spectrally resolved diffraction efficiency distribution.

34. The optical metrology apparatus of clause 32 or 33, wherein each spectrally resolved parameter distribution is obtained from measurement radiation comprising wavelengths between 5 nm and 30 nm, or more specifically between 10 nm and 20 nm.

35. The optical metrology apparatus of any one of clauses 32 to 34, wherein each spectrally resolved parameter distribution comprises a spectrally resolved parameter distribution obtained from at least one or more higher orders of the scattered radiation, the higher orders comprising diffraction orders different from the zeroth order.

36. The optical metrology apparatus of any one of clauses 27 to 35, wherein the machine learning model comprises a convolutional neural network.

37. The optical metrology apparatus of clause 36, wherein the convolutional neural network comprises one or more activation layers operable to apply a logarithmic activation function.

38. The optical metrology apparatus of any one of clauses 27 to 37, wherein the training random metrology data describes an acceptable space or range of random metrology values or related dimensional metrology values and a corresponding acceptable space or range of process parameter values for the one or more process parameters.

39. The optical metrology apparatus of any one of clauses 27 to 38, wherein the one or more process parameters include one or both of focus and dose when forming the training structure.

40. The optical metrology apparatus of any one of clauses 27 to 39, wherein the random metric comprises one or more of: defect rate or other defect measure, line edge roughness, line width roughness, local critical dimension uniformity, circular edge roughness, or edge placement error.

41. A method of determining a random metric related to a structure, the method comprising:

obtaining a trained machine learning model, the machine learning model having been trained to correlate training optical metrology data with training random metric data, wherein the training optical metrology data comprises a plurality of measurement signals related to radiation scattered from a plurality of training structures on a substrate, and the training random metric data comprises random metric values related to the plurality of training structures, wherein the plurality of training structures have been formed to have a variation in one or more dimensions on which the random metric depends; obtaining optical metrology data from the structure; and inferring values of the random metric associated with the structure from the optical metrology data using the trained machine learning model.

42. The method of clause 41, wherein the random metric represents a probability of a defect or a variation in CD over a small spatial scale, e.g., a CD variation over a spatial scale of less than 1000 times the CD.

43. The method of clause 41 or 42, wherein the measurement signal is a zero-order pupil intensity distribution of the radiation after being scattered by the structure or training structure.

44. The method of any one of clauses 41 to 43, wherein the training random metric data is obtained using an e-beam metrology tool.

45. The method according to any of clauses 41 to 44, wherein the change in one or more dimensions is associated with a change in a process parameter of the lithographic apparatus, such as exposure dose and/or focus setting.

46. A computer program comprising program instructions operable, when run on a suitable device, to perform the method of any one of clauses 41 to 45.

47. A non-transitory computer program carrier comprising a computer program according to clause 46.

48. A method of determining a random metric related to a structure, the method comprising:

obtaining a trained machine learning model, the machine learning model having been trained to correlate training optical metrology data with training random metric data, wherein the training optical metrology data comprises a plurality of measurement signals related to radiation scattered from a plurality of training structures on a substrate, and the training random metric data comprises random metric values related to the plurality of training structures, wherein the plurality of training structures have been formed to have a variation in one or more dimensions on which the random metric depends; obtaining optical metrology data from the structure; and inferring values of the random metric associated with the structure from the optical metrology data using the trained machine learning model.
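The train-then-infer workflow recited in clauses 41 to 45 can be illustrated with a minimal sketch. It is not the disclosed implementation: an ordinary least-squares line stands in for the trained machine learning model, each measurement signal is reduced to a single hypothetical scalar summary of the zero-order pupil intensity, and all dose, signal, and local CD uniformity values are invented for illustration. The names `fit_line` and `infer` are likewise assumptions.

```python
from statistics import mean


def fit_line(signals, metric_values):
    """Least-squares fit of: metric = slope * signal + offset."""
    s_bar, m_bar = mean(signals), mean(metric_values)
    sxx = sum((s - s_bar) ** 2 for s in signals)
    sxy = sum((s - s_bar) * (m - m_bar) for s, m in zip(signals, metric_values))
    slope = sxy / sxx
    return slope, m_bar - slope * s_bar


def infer(model, signal):
    """Apply the fitted line to a new measurement signal."""
    slope, offset = model
    return slope * signal + offset


# Training structures formed with a deliberate dose variation (relative dose).
doses = [0.90, 0.95, 1.00, 1.05, 1.10]

# Invented optical signal per training structure (scalar pupil-intensity summary).
training_signals = [0.40 + 0.5 * (d - 1.0) for d in doses]

# Invented reference values of the random metric (local CD uniformity, nm),
# standing in for values obtained with an e-beam metrology tool.
training_lcdu_nm = [2.0 - 4.0 * (d - 1.0) for d in doses]

model = fit_line(training_signals, training_lcdu_nm)

# Inference on a new structure measured only optically.
predicted_lcdu_nm = infer(model, 0.42)
print(round(predicted_lcdu_nm, 3))
```

In practice the measurement signal would be a full intensity distribution and the model would typically be a neural network; the sketch only shows how training pairs formed under a process variation lead to a model that is then applied to optical data alone.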

Additional embodiments of the present invention are disclosed below in the numbered clause list.

1. A method of determining a random metric, the method comprising:

obtaining a trained model, the model having been trained to infer one or more random metric values for the random metric from inspection image data, the trained model having been trained on training inspection image data and training random metric data, wherein the training inspection image data comprises a plurality of inspection images, each inspection image relating to reflected radiation that has been reflected by one of a plurality of training structures on a training substrate; and the training random metric data includes random metric values related to the training structure, wherein multiple instances of the training structure have been formed with variations in one or more process parameters upon which the random metric depends; obtaining inspection image data comprising at least one inspection image relating to a structure that has been exposed in a lithographic process; and inferring values of the random metric from the inspection image data using the trained model.

2. The method of clause 1, wherein each of the inspection images comprises an image of a respective structure captured at an image plane of an inspection imaging device used to obtain the inspection image or a conjugate thereof.

3. The method of clause 1 or 2, wherein the inspection image comprises a bright field inspection image.

4. The method of any preceding clause, wherein the model comprises a machine learning model, a neural network, or a convolutional neural network.

5. The method of clause 4, wherein the convolutional neural network comprises one or more activation layers that apply a logarithmic activation function.

6. The method of any preceding clause, wherein the training random metric data describes an acceptable space or range of random metric values or related dimensional metric values and a corresponding acceptable space or range of process parameter values for the one or more process parameters.

7. The method according to any preceding clause, comprising the initial steps of:

acquiring the training inspection image data and the training random metric data; and

training the model on the training inspection image data and the training random metric data.

8. The method according to clause 7, comprising: obtaining high-resolution metrology data; and determining the training random metric data from the high-resolution metrology data.

9. The method of clause 8, wherein the high-resolution metrology data is obtained from scanning electron microscope metrology.

10. The method of any preceding clause, wherein the training inspection image data further comprises nominal information measurement data relating to one or both of:

non-defect inspection and/or simulation; and

specific defect inspection or simulation.

11. The method of any preceding clause, comprising using the inferred values of the random metric to decide where and/or when to perform further high-resolution measurements.

12. The method of any preceding clause, wherein the one or more process parameters include one or both of focus and dose when forming the training structure.

13. The method of any preceding clause, wherein the one or more process parameters include one or more feature sizes on a patterning device used to expose the training structure.

14. The method of any preceding clause, wherein the random metric comprises one or more of: defect rate or other defect measure, line edge roughness, line width roughness, local critical dimension uniformity, circular edge roughness, or edge placement error.

15. A processing apparatus comprising a processor and configured to perform the method according to any of the preceding clauses.

16. An optical inspection apparatus operable to measure and obtain the inspection image data and comprising the processing apparatus according to clause 15.

17. A computer program comprising program instructions operable, when run on a suitable apparatus, to perform the method of any one of clauses 1 to 14.

18. A non-transitory computer program carrier comprising the computer program of clause 17.

19. An inspection imaging apparatus comprising:

an imaging system operable to obtain inspection image data comprising at least one inspection image relating to a structure that has been exposed in a lithographic process; a non-transitory data carrier comprising a trained model that has been trained to infer one or more random metric values for the random metric from inspection image data, the trained model having been trained on training inspection image data and training random metric data, wherein the training inspection image data comprises a plurality of inspection images, each inspection image relating to reflected radiation reflected by one of a plurality of training structures on a training substrate; and the training random metric data includes random metric values related to the training structure, wherein multiple instances of the training structure have been formed with variations in one or more process parameters upon which the random metric depends; and a processor operable to infer values of the random metric from the inspection image data using the trained model.

20. The inspection imaging apparatus of clause 19, comprising a camera, located at an image plane of the inspection imaging apparatus or at a conjugate of the image plane, for capturing the inspection image.

21. The inspection imaging apparatus of clause 19 or 20, wherein the inspection imaging apparatus comprises a bright field imaging apparatus and each inspection image comprises a bright field inspection image.

22. The inspection imaging apparatus of any of clauses 19-21 wherein the model comprises a machine learning model, a neural network, or a convolutional neural network.

23. The inspection imaging apparatus of clause 22 wherein the convolutional neural network comprises one or more activation layers operable to apply a logarithmic activation function.

24. The inspection imaging apparatus of any of clauses 19 to 23 wherein the training random metric data describes an acceptable space or range of random metric values or related dimensional metric values and a corresponding acceptable space or range of process parameter values for the one or more process parameters.

25. The inspection imaging apparatus of any of clauses 19-24, wherein the one or more process parameters comprise one or both of focus and dose when forming the training structure.

26. The inspection imaging apparatus of any of clauses 19-25, wherein the random metric comprises one or more of: defect rate or other defect measure, line edge roughness, line width roughness, local critical dimension uniformity, circular edge roughness, or edge placement error.

27. A method of determining a random metric related to a structure, the method comprising:

obtaining a trained model, the model having been trained to correlate training inspection image data with training random metric data, wherein the training inspection image data comprises a plurality of inspection images related to radiation reflected from a plurality of training structures on a substrate, and the training random metric data comprises random metric values related to the plurality of training structures, wherein the plurality of training structures have been formed to have a change in one or more dimensions on which the random metric depends; obtaining inspection image data from the structure; and inferring values of the random metric associated with the structure from the inspection image data using the trained model.

28. The method of clause 27, wherein the random metric represents a probability of defect or a CD variation of small spatial scale, e.g., a CD variation of less than 1000 times the CD.

29. The method of clause 27 or 28, wherein the inspection image is a brightfield inspection image of the structure or the training structure.

30. The method of any of clauses 27-29, wherein the training random metric data is obtained using an e-beam metrology tool.

31. The method of any of clauses 27 to 30, wherein the change in one or more dimensions is associated with a change in a process parameter of the lithographic apparatus, such as exposure dose and/or focus setting.

32. A computer program comprising program instructions operable, when run on a suitable apparatus, to perform the method of any of clauses 27 to 31.

33. A non-transitory computer program carrier comprising a computer program according to clause 32.

34. A method of determining a random metric related to a structure, the method comprising:

The terms "radiation" and "beam" used with respect to a lithographic apparatus encompass all types of electromagnetic radiation, including ultraviolet (UV) radiation (e.g., having a wavelength of or about 365 nm, 355 nm, 248 nm, 193 nm, 157 nm, or 126 nm) and extreme ultraviolet (EUV) radiation (e.g., having a wavelength in the range of 5 nm to 20 nm), as well as particle beams, such as ion beams or electron beams.

The term "lens", where the context allows, may refer to any one or combination of various types of optical components, including refractive, reflective, magnetic, electromagnetic and electrostatic optical components.

The foregoing description of the specific embodiments will so fully reveal the general nature of the invention that others can, by applying knowledge within the skill of the art, readily modify and/or adapt for various applications such specific embodiments without undue experimentation without departing from the generic concept of the present invention. Accordingly, such changes and modifications are intended to be within the meaning and range of equivalents of the disclosed embodiments based on the teachings and guidance presented herein. It is to be understood that the phraseology or terminology herein is for the purpose of description by way of example and not of limitation, such that the terminology or phraseology of the present specification is to be interpreted by the skilled artisan in light of the teachings and guidance.

The breadth and scope of the present invention should not be limited by any of the above-described exemplary embodiments, but should be defined only in accordance with the following claims and their equivalents.

Claims

1. A method of determining a random metric related to a structure, the method comprising:

obtaining a trained model, the model having been trained to correlate training optical metrology data with training random metric data, wherein the training optical metrology data comprises a plurality of measurement signals related to a plurality of angle-resolved distributions of an intensity-related parameter across zero-order or higher-order diffraction included within radiation scattered from a plurality of training structures on a substrate, and the training random metric data comprises random metric values related to the plurality of training structures, wherein the plurality of training structures have been formed with variations in one or more dimensions on which the random metric depends;

obtaining optical metrology data comprising an angle-resolved distribution of the intensity-related parameter across zero-order or higher-order diffraction included in radiation scattered from a structure; and

inferring values of the random metric associated with the structure from the optical metrology data using the trained model.

2. The method of claim 1, wherein each of the measurement signals further comprises a spectrally resolved distribution of the intensity-related parameter across zero-order or higher-order diffraction included within radiation scattered from the plurality of training structures on the substrate.

3. The method of claim 1, wherein the intensity-related parameter is diffraction efficiency.

4. The method of claim 1, wherein the training optical metrology data further comprises nominal information metrology data relating to one or both of: defect-free measurement and/or simulation; and specific defect measurements or simulations.

5. The method of claim 1, wherein the model comprises a machine learning model, a neural network, or a convolutional neural network.

6. The method of claim 1, wherein the variations in the one or more dimensions are associated with variations in one or more process parameters of a lithographic process used in applying the training structures to the substrate.

7. The method of claim 6, wherein the training random metric data describes an acceptable space or range of random metric values or related dimensional metric values and a corresponding acceptable space or range of values of the one or more process parameters.

8. The method of claim 6, wherein the one or more process parameters are one or more of: dose, focus.

9. The method of claim 1, further comprising the initial steps of:

obtaining the training optical metrology data and the training random metric data; and

training the model on the training optical metrology data and the training random metric data.

10. The method of claim 9, comprising: obtaining high-resolution metrology data; and

determining the training random metric data from the high-resolution metrology data.

11. The method of claim 10, wherein the high-resolution metrology data is obtained from scanning electron microscope metrology.

12. The method of claim 1, further comprising using the inferred value of the random metric to decide where and/or when to perform further high-resolution measurements.

13. The method of claim 1, wherein the random metric comprises one or more of: defect rate or other defect measure, line edge roughness, line width roughness, local critical dimension uniformity, circular edge roughness, or edge placement error.

14. A computer program comprising program instructions operable to perform the method of any one of claims 1 to 13 when run on a suitable device.

15. A non-transitory computer program carrier comprising a computer program according to claim 14.
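As a purely illustrative sketch of the decision step in claim 12, inferred random metric values can be compared against an acceptable range (compare claim 7) to pick out the sites where further high-resolution measurement may be worthwhile. The function name `select_sites_for_followup`, the site labels, the values, and the 2.0 nm limit are all assumptions made for illustration and are not taken from the disclosure.

```python
def select_sites_for_followup(inferred_values, acceptable_max):
    """Return the labels of sites whose inferred metric exceeds acceptable_max."""
    return sorted(
        site for site, value in inferred_values.items() if value > acceptable_max
    )


# Hypothetical inferred local CD uniformity values (nm) per measurement site.
inferred_lcdu_nm = {"site_A": 1.6, "site_B": 2.3, "site_C": 1.9, "site_D": 2.6}

# Assumed upper limit of the acceptable range; a real limit would be derived
# from the acceptable range described by the training random metric data.
flagged = select_sites_for_followup(inferred_lcdu_nm, acceptable_max=2.0)
print(flagged)
```

Only the flagged sites would then be passed to a slower high-resolution tool such as a scanning electron microscope, which is the throughput benefit the optical inference is intended to provide.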