WO2018041696A1 - System for parallel data processing with multi-layer workload management - Google Patents

System for parallel data processing with multi-layer workload management

Info

Publication number
WO2018041696A1
Authority
WO
WIPO (PCT)
Prior art keywords
computing nodes
workloads
scheduler
sub
sets
Prior art date
Application number
PCT/EP2017/071268
Other languages
French (fr)
Inventor
Wei Fang
Peilei ZHANG
Yanan HAN
Original Assignee
Asml Netherlands B.V.
Hermes Microvision, Inc.
Priority date
Filing date
Publication date
Application filed by Asml Netherlands B.V. and Hermes Microvision, Inc.
Publication of WO2018041696A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005 Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5027 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005 Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5027 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
    • G06F9/5044 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals considering hardware capabilities

Definitions

  • Exciting coil 132d and pole piece 132a generate a magnetic field that begins at one end of pole piece 132a and terminates at the other end of pole piece 132a.
  • a part of wafer 150 being scanned by primary electron beam 160 can be immersed in the magnetic field and can be electrically charged, which, in turn, creates an electric field.
  • the electric field reduces the energy of impinging primary electron beam 160 near the surface of the wafer before it collides with the wafer.
  • Control electrode 132b, being electrically isolated from pole piece 132a, controls an electric field on the wafer to prevent micro-arcing of the wafer and to ensure proper beam focus.
  • Backscattered primary electrons and secondary electrons can be emitted from the part of wafer 150 upon receiving primary electron beams 160a, 160b, and 160c.
  • Beam separator 138 can direct the secondary and/or scattered electron beams 170a, 170b, and 170c, comprising backscattered and secondary electrons, to a sensor surface of electron detector 140.
  • FIG. 2 is a schematic diagram illustrating a sensor surface 142 of electron detector 140, according to some embodiments of the present disclosure.
  • the detected electron beams 170a, 170b, and 170c from FIG. 1 can form corresponding beam spots 180a, 180b, and 180c on sensor surface 142 of electron detector 140.
  • Electron detector 140 can generate signals (e.g., voltages, currents, etc.) that represent the intensities of the received beam spots, and provide the signals to data processing system 20 (FIG. 1).
  • sensor surface 142 includes a plurality of sensor regions, labeled as 144a, 144b, 144c, etc.
  • Each sensor region can be designated to receive a beam spot (e.g., beam spots 180a, 180b, and 180c) emitted from a particular location on wafer 150.
  • the number of primary beams used in e-beam tool 10 is not limited to three.
  • the present disclosure does not limit the number of sensor regions on sensor surface 142, and the number of beam spots detectable by electron detector 140.
  • electron detector 140 may include 3x3, 4x5, or 10x10 sensor regions arranged on sensor surface 142 as a matrix.
  • Each sensor region can comprise an array of electron sensing elements 146.
  • Electron sensing elements 146 may comprise, for example, a PIN diode, an electron multiplier tube (EMT), etc. Electron sensing elements 146 can generate current signals commensurate with the electrons received in the sensor regions.
  • the current signals generated by detector 140 may be amplified and digitized before being provided to data processing system 20.
  • electron detector 140 may output the current signals to a preprocessing circuit (not shown) included in or coupled with electron detector 140.
  • the preprocessing circuit can amplify the current signals and convert the amplified current signals into voltage signals (representing the intensities of received electron beam spots 180a, 180b, and 180c).
  • the voltage signals constitute the inspection data to be processed by data processing system 20.
  • the intensity of secondary and/or scattered electron beams 170a, 170b, and 170c, and the resultant beam spots 180a, 180b, and 180c can vary according to the external and/or internal structure of wafer 150.
  • primary electron beams 160a, 160b, and 160c can be projected onto different locations of the top surface of wafer 150 to generate secondary and/or scattered electron beams 170a, 170b, and 170c (and the resultant beam spots) of different intensities. Therefore, by mapping the intensities of the beam spots with the locations of wafer 150, data processing system 20 can reconstruct an image that reflects the internal and/or external structures of wafer 150.
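  • By way of illustration only, the following sketch shows how such an intensity-to-location mapping might be accumulated into an image; the (row, column, intensity) record format and the grid size are assumptions, not details taken from this disclosure.

```python
# Hypothetical illustration: accumulate detected beam-spot intensities into a
# 2-D image indexed by scan location. Grid size and record format are assumed.
from typing import Iterable, List, Tuple

def reconstruct_image(samples: Iterable[Tuple[int, int, float]],
                      rows: int, cols: int) -> List[List[float]]:
    """Place each (row, col, intensity) sample at its scan location."""
    image = [[0.0] * cols for _ in range(rows)]
    for r, c, intensity in samples:
        image[r][c] = intensity
    return image

# Example: three beamlets reporting intensities at different scan locations.
scan_results = [(0, 0, 0.82), (0, 1, 0.79), (1, 0, 0.15)]  # a low value may hint at a defect
img = reconstruct_image(scan_results, rows=2, cols=2)
print(img)
```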
  • FIGs. 1 & 2 show e-beam tool 10 as a multi-beam inspection tool that employs multiple primary electron beamlets to simultaneously scan multiple locations on wafer 150, it is contemplated that e-beam tool 10 may also be a single-beam inspection tool that uses only one primary electron beam to scan one location of wafer 150 at a time. The present application does not limit the number of electron beams used in e-beam tool 10.
  • Communication link 210 may comprise one or more interconnected wired or wireless data networks that transmit the inspection data generated by e-beam tool 10 to data processing system 20.
  • For example, communication link 210 may be implemented as a wired Wide Area Network (WAN), a wired Local Area Network (LAN), a wireless LAN (e.g., Institute of Electrical and Electronics Engineers (IEEE) 802.11, Bluetooth®, etc.), a wireless WAN (e.g., Worldwide Interoperability for Microwave Access (WiMAX)), the Internet, or the like.
  • communication link 210 may be implemented as one or more data cables connecting electron detector 140 (and/or the preprocessing circuit) directly to data processing system 20. Such wired connections may provide more reliable and faster data transmission than wireless connections.
  • Data processing system 20 includes a plurality of computer processors 220 operating collaboratively to run various applications.
  • Each computer processor 220 may be implemented using a variety of different equipment, such as a supercomputer, a personal computer, a server, a mainframe, a mobile device, or the like.
  • Computer processors 220 may be located in the same location as or remotely from e-beam tool 10.
  • Computer processors 220 may be formed into one or more grids or clusters to share resources and workloads.
  • FIG. 3 is a schematic diagram illustrating an exemplary architecture of data processing system 20, consistent with embodiments of the present disclosure.
  • data processing system 20 may include a plurality of computing nodes arranged into multiple clusters, including at least a first cluster 50 having computing nodes 50-1, 50-2, ..., 50-n and a second cluster 60 having computing nodes 60-1, 60-2, ..., 60-n.
  • Data processing system 20 also includes a first sub-scheduler 41 configured to manage and distribute workloads within first cluster 50, and a second sub-scheduler 42 configured to manage and distribute workloads within second cluster 60.
  • Data processing system 20 further includes a main scheduler 30 for coordinating workloads to be processed by first sub-scheduler 41 (and thus first cluster 50) and second sub-scheduler 42 (and thus second cluster 60).
  • main scheduler 30, sub-schedulers 41 and 42, and clusters 50 and 60 are arranged in a multi-layer tree structure.
  • Main scheduler 30 coordinates the processing power of all the available computing nodes in data processing system 20 by allocating workloads between sub-schedulers 41 and 42, while sub-schedulers 41 and 42 are dedicated for managing workloads within clusters 50 and 60, respectively.
  • main scheduler 30 receives an inspection data set from e-beam tool 10 and parses it into multiple workloads.
  • Main scheduler 30 further assigns different workloads to sub-schedulers 41 and 42, which then manage clusters 50 and 60 respectively to process the assigned workloads.
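  • As a rough illustration of this two-layer arrangement, the sketch below models a main scheduler that parses a data set into workloads and hands disjoint subsets to two sub-schedulers, each driving its own pool of worker nodes; all class names and the chunking policy are assumptions made for illustration.

```python
# Illustrative two-layer workload management: a main scheduler parses a data
# set into workloads and assigns them to sub-schedulers, each of which manages
# its own, non-overlapping set of computing nodes. Names are hypothetical.
from concurrent.futures import ThreadPoolExecutor

class SubScheduler:
    def __init__(self, name, node_count):
        self.name = name
        self.pool = ThreadPoolExecutor(max_workers=node_count)  # one worker per node

    def process(self, workloads, handler):
        # Dispatch each workload to a node in this cluster only.
        return [self.pool.submit(handler, w) for w in workloads]

class MainScheduler:
    def __init__(self, sub_schedulers):
        self.sub_schedulers = sub_schedulers

    def parse(self, data_set, chunk_size):
        # Parse the inspection data set into a list of workloads (chunks).
        return [data_set[i:i + chunk_size] for i in range(0, len(data_set), chunk_size)]

    def run(self, data_set, handler, chunk_size=4):
        workloads = self.parse(data_set, chunk_size)
        # Round-robin the workload sets across sub-schedulers; the clusters
        # then process their sets concurrently and independently.
        futures = []
        for i, sub in enumerate(self.sub_schedulers):
            assigned = workloads[i::len(self.sub_schedulers)]
            futures.extend(sub.process(assigned, handler))
        return [f.result() for f in futures]

if __name__ == "__main__":
    main = MainScheduler([SubScheduler("cluster-50", 4), SubScheduler("cluster-60", 4)])
    results = main.run(list(range(32)), handler=lambda chunk: sum(chunk))
    print(results)
```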
  • data processing system 20 may be shared by multiple e-beam tools 10. Accordingly, main scheduler 30 and sub-schedulers 41 and 42 may be configured to simultaneously manage the processing of inspection data sets generated by multiple e-beam tools 10.
  • Data processing system 20 includes features to minimize system interference between clusters 50 and 60, so as to ensure that the workloads can be concurrently processed in real time.
  • clusters 50 and 60 do not share the same computing node.
  • sub-schedulers 41 and 42 only receive workloads from main scheduler 30 and do not communicate with each other, in order to achieve isolation between clusters 50 and 60. This way, the workloads assigned to cluster 50 will not affect the timely processing of the workloads assigned to cluster 60, and vice versa.
  • clusters 50 and 60 may be configured to receive the workloads via different data channels, in order to prevent the transmission of the workloads assigned to cluster 50 from delaying the transmission of the workloads assigned to cluster 60, and vice versa.
  • Data processing system 20 also includes features to enable precise allocation of the workloads between clusters 50 and 60, so as to ensure that the workloads can be concurrently processed in real time.
  • the computing nodes in clusters 50 and 60 are configured to be non-programmable nodes, such that the hardware and/or software configurations of the computing nodes are fixed. This way, the computing power and/or data bandwidth of each computing node are predictable, and thus main scheduler 30 and sub-schedulers 41 and 42 can accurately allocate the workloads to the computing nodes according to the configurations of the computing nodes.
  • although the computing nodes are made non-programmable, the hardware and/or software configurations of clusters 50 and 60 may still change.
  • the number of computing nodes in each of clusters 50 and 60, or the number of computing nodes managed by each of sub-schedulers 41 and 42 may be modified occasionally.
  • the number of workable computing nodes in a cluster may decrease due to malfunctioning of some computing nodes in the cluster.
  • a user may want to add computing nodes to a cluster to increase the cluster's computing power.
  • sub-schedulers 41 and 42 may be made programmable based on system modifications of clusters 50 and 60.
  • the workload management algorithms in sub-schedulers 41 and 42 may be fine-tuned in response to changes to the numbers of computing nodes in clusters 50 and 60, respectively.
  • main scheduler 30 may be configured to dynamically select the number of clusters and sub-schedulers for processing an inspection data set based on, for example, job requirements (e.g., the size of the inspection data set), current availability of the clusters, and/or hardware/software configurations (e.g., data processing capabilities and/or data bandwidths) of the clusters.
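  • A minimal sketch of such dynamic selection, assuming a simple greedy policy that adds available clusters until their combined processing rate covers the incoming data rate; the cluster descriptions and rates shown are illustrative only, not values from this disclosure.

```python
# Hypothetical sketch of dynamic cluster selection: pick just enough available
# clusters so that their combined processing rate covers the incoming data rate.
# The cluster descriptions and the greedy policy are illustrative assumptions.
def select_clusters(clusters, required_rate_gb_s):
    chosen, covered = [], 0.0
    # Prefer the fastest available clusters first (greedy choice).
    for c in sorted(clusters, key=lambda c: c["rate_gb_s"], reverse=True):
        if not c["available"]:
            continue
        chosen.append(c["name"])
        covered += c["rate_gb_s"]
        if covered >= required_rate_gb_s:
            break
    return chosen if covered >= required_rate_gb_s else None  # None: cannot keep up in real time

clusters = [
    {"name": "cluster-50", "available": True,  "rate_gb_s": 60.0},
    {"name": "cluster-60", "available": True,  "rate_gb_s": 80.0},
    {"name": "cluster-70", "available": False, "rate_gb_s": 90.0},
]
print(select_clusters(clusters, required_rate_gb_s=120.0))  # ['cluster-60', 'cluster-50']
```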
  • a computing node used in data processing system 20 may include one or more of a CPU, an image processing unit, an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA), or the like.
  • the computing node may be configured as a hybrid node containing multiple different types of processors, to meet different data processing requirements.
  • FIG. 4A is a block diagram of an exemplary computing node 400A used in data processing system 20.
  • computing node 400A may be used in cluster 50 and/or cluster 60 (FIG. 3).
  • computing node 400A includes a CPU 410 and a special-purpose processor 420, such as a processor dedicated to image processing.
  • CPU 410 and special-purpose processor 420 may be used to fulfill different tasks.
  • CPU 410 is well suited to single-threaded applications, such as defect detection based on the inspection data, while special-purpose processor 420 is more efficient and powerful for image processing tasks, such as image denoising and image enhancement.
  • CPU 410 may be configured to allocate tasks between CPU 410 and special-purpose processor 420 according to attributes of the tasks, so as to maximize the system efficiency.
  • FIG. 4B is a block diagram of an exemplary computing node 400B used in data processing system 20.
  • computing node 400B may be used in cluster 50 and/or cluster 60 (FIG. 3).
  • computing node 400B includes a CPU 410, an FPGA 440, and a multi-core high-performance processor 450, such as an Intel® Xeon Phi™ processor.
  • FPGA 440 may be programmed to handle specific applications such as digital signal processing, image construction, etc.
  • Intel® Xeon Phi™ processor 450 is suitable for performing applications that require high-density calculation, fast memory access, or high multi-thread capacity, such as applications performing image alignment.
  • CPU 410 may be configured to decide which application shall be performed by which type of processor.
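  • One possible way to express such a decision is a small routing table keyed by task type, as sketched below; the attribute names and the mapping itself are assumptions made for illustration, not the disclosed logic.

```python
# Illustrative dispatch inside a hybrid computing node: the CPU inspects a
# task's attributes and routes it to the processor type best suited for it.
# The attribute names and the routing table are assumptions for illustration.
ROUTING_TABLE = {
    "image_processing": "special_purpose_processor",  # e.g. denoising, enhancement
    "signal_processing": "fpga",                      # e.g. digitized detector signals
    "high_density_compute": "many_core_processor",    # e.g. image alignment
}

def route_task(task):
    """Return the processor type a task should run on; default to the CPU."""
    return ROUTING_TABLE.get(task.get("kind"), "cpu")

tasks = [
    {"name": "denoise_tile_7", "kind": "image_processing"},
    {"name": "align_tiles", "kind": "high_density_compute"},
    {"name": "detect_defects", "kind": "defect_detection"},  # single-threaded; stays on the CPU
]
for t in tasks:
    print(t["name"], "->", route_task(t))
```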
  • FIG. 5 is a block diagram of an exemplary workload scheduler 500, consistent with embodiments of the present disclosure.
  • Workload scheduler 500 may be main scheduler 30 or either of sub-schedulers 41 and 42. Functions of main scheduler 30 and sub-schedulers 41 and 42 are discussed in further detail with respect to FIG. 6.
  • workload scheduler 500 may include one or more processors 510.
  • Workload scheduler 500 may be a single server or may be configured as a distributed computer system including multiple servers or computers that interoperate to perform one or more of the processes and functionalities associated with the disclosed embodiments.
  • workload scheduler 500 is specially configured with hardware and/or software modules for performing functions of disclosed methods.
  • workload scheduler 500 may include a job interceptor 512, a job parser 514, and a job director 516.
  • the modules can be implemented as specialized circuitry integrated within processor 510 or in communication with processor 510, and/or specialized software executable by processor 510.
  • job interceptor 512 may be configured to intercept a received inspection data set at an abstraction layer and extract information (e.g., data size) about the inspection data set.
  • Job parser 514 may be configured to parse the inspection data set into a plurality of workloads.
  • job director 516 may be configured to direct different workloads to sub-schedulers 41 and 42.
  • job interceptor 512 may be configured to intercept, at an abstraction layer, a workload forwarded from main scheduler 30 and extract attributes of the workload (e.g., its hardware/software requirements).
  • job parser 514 may be configured to parse the workload into smaller components for evaluating the computation cost of the workload, while job director 516 may be configured to direct the workload to a suitable computing node for processing.
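  • The sketch below illustrates how the interceptor, parser, and director roles described above might fit together; the data formats, the size-based split, and the round-robin assignment are assumptions made for illustration only.

```python
# A minimal sketch of the three scheduler modules described above (interceptor,
# parser, director). The data formats and the size-based split are assumptions.
class JobInterceptor:
    def intercept(self, data_set):
        # Extract coarse information about the incoming job, e.g. its size.
        return {"size": len(data_set)}

class JobParser:
    def parse(self, data_set, n_parts):
        # Split the data set into roughly equal workloads.
        step = max(1, len(data_set) // n_parts)
        return [data_set[i:i + step] for i in range(0, len(data_set), step)]

class JobDirector:
    def direct(self, workloads, targets):
        # Assign workloads to targets (sub-schedulers or computing nodes) round-robin.
        return {t: workloads[i::len(targets)] for i, t in enumerate(targets)}

interceptor, parser, director = JobInterceptor(), JobParser(), JobDirector()
job = list(range(10))
info = interceptor.intercept(job)
workloads = parser.parse(job, n_parts=4)
print(info, director.direct(workloads, targets=["sub-scheduler-41", "sub-scheduler-42"]))
```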
  • Processor 510 may be one or more known or custom processing devices designed to perform functions of the disclosed workload scheduling methods, such as a single core or multiple core processors capable of executing parallel processes simultaneously.
  • processor 510 may be a single core processor configured with virtual processing technologies.
  • processor 510 may use logical processors to simultaneously execute and control multiple processes.
  • Processor 510 may implement virtual machine technologies, or other known technologies to provide the ability to execute, control, run, manipulate, store, etc. multiple software processes, applications, programs, etc.
  • processor 510 may include a multiple-core processor arrangement (e.g., dual core, quad core, etc.) configured to provide parallel processing functionalities to allow workload scheduler 500 to execute multiple processes simultaneously. It is appreciated that other types of processor arrangements could be implemented that provide for the capabilities disclosed herein.
  • Workload scheduler 500 may also include one or more I/O devices 520, which may comprise one or more interfaces for receiving signals or input from other devices and providing signals or output to other devices, thereby allowing data to be received and/or transmitted by workload scheduler 500.
  • workload scheduler 500 may include interface components, which may provide interfaces to one or more input devices, such as one or more keyboards, mouse devices, and the like, that enable workload scheduler 500 to receive input from a user or administrator.
  • Workload scheduler 500 may include one or more storage devices configured to store information used by processor 510 (or other components) to perform certain functions related to the disclosed embodiments.
  • workload scheduler 500 may include memory 530 that includes instructions to enable processor 510 to execute one or more applications, such as workload management, server applications, network communication processes, and any other type of application or software known to be available on computer systems.
  • the instructions, application programs, etc. may be stored in an internal database or an external storage (not shown) in direct communication with workload scheduler 500.
  • the internal database and/or external storage may be a volatile or non-volatile, magnetic, semiconductor, tape, optical, removable, non-removable, or other type of storage device or tangible and/or non-transitory computer-readable medium.
  • non-transitory media include, for example, a floppy disk, a flexible disk, a hard disk, a solid state drive, magnetic tape, or any other magnetic data storage medium, a CD-ROM, any other optical data storage medium, any physical medium with patterns of holes, a RAM, a PROM, an EPROM, a FLASH-EPROM or any other flash memory, NVRAM, a cache, a register, any other memory chip or cartridge, and networked versions of the same.
  • workload scheduler 500 may include memory 530 that includes instructions that, when executed by processor 510, perform one or more processes consistent with the functionalities disclosed herein.
  • workload scheduler 500 may include memory 530 that may include one or more programs 540 to perform one or more functions of the disclosed embodiments.
  • processor 510 may execute one or more programs located remotely from data processing system 20.
  • workload scheduler 500 may access one or more remote programs, that, when executed, perform functions related to disclosed embodiments.
  • Programs 540 stored in memory 530 and executed by processor(s) 510 may include one or more workload management app(s) 542 and operating system 544.
  • Workload management app(s) 542 may be configured to cause processor(s) 510 to execute one or more processes related to intercepting an inspection data set and/or a workload, parsing the inspection data set and/or the workload, determining the cost of processing the workload by a computing node, and directing the workload to a selected computing node.
  • Data 550 may include metadata describing the running status of clusters 50 and 60 at a macro level, and/or the running status of individual computing nodes at a micro level. For example, such metadata may indicate each cluster's current resource availability, file system, data storage locations, etc.
  • Memory 530 may include one or more memory devices that store data and instructions used to perform one or more features of the disclosed embodiments. Memory 530 may also include any combination of one or more relational and/or non-relational databases controlled by memory controller devices (e.g., server(s), etc.) or software, such as document management systems, Microsoft® Structured Query Language (SQL) databases, SharePoint® databases, Oracle® databases, or other relational databases, or non-relational databases such as key-value stores or NoSQL databases.
  • memory 530 may comprise an associative array architecture, such as a key-value storage, for storing and rapidly retrieving large amounts of information about clusters 50 and 60, and/or information about individual computing nodes included in clusters 50 and 60.
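  • A minimal sketch of such a key-value view of the scheduler metadata, with macro-level cluster entries and micro-level node entries; the keys, fields, and values shown are purely illustrative assumptions.

```python
# Hypothetical key-value view of the scheduler metadata described above: the
# running status of clusters at a macro level and of nodes at a micro level.
# Keys, fields, and values are illustrative assumptions only.
metadata = {
    "cluster/50": {"available": True, "free_nodes": 12, "file_system": "nfs://store-a"},
    "cluster/60": {"available": True, "free_nodes": 3,  "file_system": "nfs://store-b"},
    "node/50-1":  {"busy": False, "processors": ["cpu", "image_unit"], "bandwidth_gb_s": 10},
    "node/50-2":  {"busy": True,  "processors": ["cpu", "fpga"],       "bandwidth_gb_s": 10},
}

def idle_nodes(store, cluster_id):
    """Fast lookup of idle nodes belonging to one cluster."""
    prefix = f"node/{cluster_id}-"
    return [k for k, v in store.items() if k.startswith(prefix) and not v["busy"]]

print(idle_nodes(metadata, 50))  # ['node/50-1']
```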
  • FIG. 6 is a flowchart of an exemplary data processing method 600, consistent with embodiments of the present disclosure.
  • method 600 may be performed by a data processing system (e.g., data processing system 20) coupled with an e-beam tool (e.g., e-beam tool 10).
  • the data processing system may include a main scheduler (e.g., main scheduler 30 of FIG. 3), at least a first sub-scheduler and a second sub-scheduler (e.g., first sub-scheduler 41 and second sub-scheduler 42 of FIG. 3), and at least a first cluster and a second cluster (e.g., cluster 50 and cluster 60 of FIG. 3).
  • the first cluster includes a first set of computing nodes
  • the second cluster includes a second set of computing nodes.
  • the main scheduler, first and second sub-schedulers, and first and second clusters may be arranged according to the architecture shown in FIG. 3. Referring to FIG. 6, method 600 may include one or more of the following steps 610-650.
  • In step 610, the main scheduler receives an inspection data set from the e-beam tool.
  • the inspection data set may include intensity data of secondary electrons detected by the e-beam tool.
  • the main scheduler may intercept the inspection data set at an abstraction layer and extract information, such as data size, about the inspection data set.
  • In step 620, the main scheduler selects one or more computer clusters for processing the inspection data set in real time.
  • the data processing system may include multiple clusters. Each of the multiple clusters may be managed by a different sub-scheduler. However, some of the clusters may be currently processing other jobs and thus are not available for processing the inspection data set immediately.
  • the main scheduler may constantly monitor the running status of the clusters and select the currently available clusters for processing the inspection data set. For example, the main scheduler may periodically receive status reports of a cluster from a sub-scheduler managing the cluster, and determine the availability of the cluster accordingly.
  • the main scheduler may select clusters for processing the inspection data set based additionally on the size of the inspection data set, the data processing capacities of the clusters, and/or the data bandwidths of the clusters. Such information may be used by the main scheduler to determine how many available clusters and/or which available clusters are needed for processing the inspection data in real time. For illustrative purposes only, the following description assumes that the main scheduler selects the first and second clusters for processing the inspection data set in real time.
  • In step 630, the main scheduler parses the inspection data set into a plurality of workloads.
  • the main scheduler may use various methods to parse the inspection data set. For example, when the inspection data set is generated by a multi-beam tool, the main scheduler may package inspection data corresponding to different detected electron beams (e.g., detected electron beams 170a, 170b, and 170c of FIG. 1) into different workloads. This way, the processing result of each workload indicates the scan result of a different primary beam.
  • the main scheduler may make the number of workloads equal to the total number of computing nodes, such that each workload corresponds to a different computing node.
  • the main scheduler may further make the data sizes of the workloads proportional to the processing capabilities of the corresponding computing nodes.
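  • For example, assuming node capabilities are expressed as relative throughput figures, a proportional split might look like the sketch below; the numbers are illustrative only, not values from this disclosure.

```python
# Sketch of the proportional split described above: one workload per computing
# node, with each workload's size proportional to that node's relative
# processing capability. Capability figures are made-up illustration values.
def split_proportionally(n_records, capabilities):
    total = sum(capabilities)
    sizes = [int(n_records * c / total) for c in capabilities]
    sizes[-1] += n_records - sum(sizes)  # give any rounding remainder to the last node
    return sizes

node_capabilities = [4.0, 2.0, 1.0, 1.0]              # e.g. relative throughput of four nodes
print(split_proportionally(1000, node_capabilities))  # [500, 250, 125, 125]
```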
  • the plurality of workloads includes at least a first set of workloads and a second set of workloads.
  • the first set of workloads and the second set of workloads are to be concurrently processed by the first and second clusters, respectively.
  • the main scheduler may arrange the first set and second set of workloads based on the data processing capabilities and/or data bandwidths of the first and second clusters, such that the first and second clusters can process the first and second sets of workloads without delay. It is appreciated that there can be multiple sets of workloads, and that these workloads can be distributed across any number of clusters.
  • In step 640, the main scheduler submits the first set of workloads to the first sub-scheduler, and submits the second set of workloads to the second sub-scheduler.
  • In step 650, the first sub-scheduler manages the first cluster to process the first set of workloads in real time, and the second sub-scheduler manages the second cluster to process the second set of workloads in real time.
  • the first and second clusters use different data channels to receive the first and second sets of workloads, so as to minimize interference between the data transmissions for the first and second clusters.
  • the first and second sub-schedulers may manage the respective clusters based on various factors. For example, the first sub-scheduler may distribute the first set of workloads among the first set of computing nodes based on one or more of data attributes of the first set of workloads, data processing capacities of the first set of computing nodes, data bandwidths of the first set of computing nodes, processor types of the first set of computing nodes, and/or current availabilities of the first set of computing nodes.
  • the first sub-scheduler may constantly collect and update metadata of the first set of computing nodes.
  • the metadata may include hardware/software configurations (e.g., processor types, data bandwidths, etc.) of each member of the first set of computing nodes.
  • the metadata may also include current availability of each member of the first set of computing nodes. Based on the metadata, the first sub-scheduler may determine whether one of the first set of computing nodes can process a particular workload.
  • the first sub-scheduler may evaluate the computing cost for one of the first set of computing nodes to process a particular workload received from the main scheduler.
  • the first sub-scheduler may further parse the workload into smaller components for evaluating the processing cost.
  • the first sub-scheduler may evaluate the processing cost based on a cost model, the data attributes of the workload, and/or the hardware/software configurations of the first set of computing nodes.
  • the first sub-scheduler may select a computing node with the lowest cost to process the workload.
  • the first sub-scheduler may assign the workload to a computing node equipped with an image processing unit, which can process the workload more efficiently than other types of processors. As another example, if the workload requires high-density calculation, the first sub-scheduler may assign the workload to a computing node equipped with a multi-core processor (e.g., an Intel® Xeon Phi™ processor).
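  • A sketch of such cost-based selection, assuming a simple cost model (workload size divided by node speed, with a penalty when the preferred processor type is absent); this model and the node descriptions are illustrative assumptions, not the disclosed cost model.

```python
# Illustrative cost-based node selection in a sub-scheduler: estimate a
# processing cost per candidate node and pick the cheapest idle one.
# The cost model is an assumption, not the patent's actual model.
def estimate_cost(workload, node):
    base = workload["size_gb"] / node["speed_gb_s"]
    penalty = 1.0 if workload["preferred_processor"] in node["processors"] else 3.0
    return base * penalty

def pick_node(workload, nodes):
    candidates = [n for n in nodes if not n["busy"]]
    return min(candidates, key=lambda n: estimate_cost(workload, n)) if candidates else None

nodes = [
    {"name": "50-1", "busy": False, "speed_gb_s": 5.0, "processors": ["cpu", "image_unit"]},
    {"name": "50-2", "busy": False, "speed_gb_s": 8.0, "processors": ["cpu"]},
    {"name": "50-3", "busy": True,  "speed_gb_s": 9.0, "processors": ["cpu", "many_core"]},
]
workload = {"size_gb": 4.0, "preferred_processor": "image_unit"}
print(pick_node(workload, nodes)["name"])  # '50-1': the image-capable node wins despite being slower
```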
  • the data processing system may aggregate the processing results of the first and second clusters into a final result.
  • the aggregating may be performed by the main scheduler and/or the sub-schedulers.
  • the aggregating may include: aggregating the processing results to generate an image of a sample currently scanned by the e-beam tool; if the image indicates that the sample includes a defect, storing the image in a memory for further analysis; and if the image indicates that the sample does not include defects, discarding the inspection data set.
  • the data processing system only needs to store those inspection data sets representing defects of the sample.
  • precious storage space can be saved and system speed can be improved by skipping unnecessary data storage.
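  • The aggregation and defect-based filtering described above might be sketched as follows, assuming a simple intensity-threshold defect test; the threshold and data layout are illustrative assumptions only.

```python
# Sketch of the aggregation step described above: combine per-cluster results
# into one image, keep it only if a defect is indicated, otherwise discard the
# inspection data. The defect test (a simple intensity threshold) is assumed.
def aggregate_and_filter(partial_images, defect_threshold=0.3):
    # Merge partial results (here: per-cluster lists of pixel intensities).
    image = [px for part in partial_images for px in part]
    has_defect = any(px < defect_threshold for px in image)  # assumed defect criterion
    if has_defect:
        return image          # store for further analysis
    return None               # discard: nothing worth keeping

stored = aggregate_and_filter([[0.9, 0.88], [0.12, 0.91]])
print("stored" if stored is not None else "discarded")  # 'stored' (0.12 < 0.3)
```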
  • a computer system comprising:
  • a scheduler configured to receive an inspection data set to be processed and to parse the inspection data set into a plurality of workloads including a first set of workloads and a second set of workloads, wherein the inspection data set corresponds to information related to one or more sets of secondary electrons;
  • a first sub-scheduler configured to manage a first set of computing nodes for processing the first set of workloads assigned by the scheduler; and
  • a second sub-scheduler configured to manage a second set of computing nodes for processing the second set of workloads assigned by the scheduler, wherein the first set of workloads and the second set of workloads are concurrently processed by the first and second set of computing nodes.
  • inspection data set includes electron intensity data received from a plurality of electron sensing elements of the electron-beam inspection tool.
  • if the image indicates that the sample includes a defect, storing the image in a memory; and if the image indicates that the sample does not include defects, discarding the inspection data set.
  • each computing node includes at least one of a central processing unit, an image processing unit, an application-specific integrated circuit, or a field-programmable gate array.
  • the first sub-scheduler is configured to distribute the first set of workloads among the first set of computing nodes based on at least one of data attributes of the first set of workloads, data processing capacities of the first set of computing nodes, data bandwidths of the first set of computing nodes, processor types of the first set of computing nodes, or current availabilities of the first set of computing nodes.
  • each of the first and second sub-schedulers is programmable based on a modification to the first set of computing nodes.
  • the scheduler is configured to select at least one of the first and second sub-schedulers based on availability of the first and second sets of computing nodes, a size of the inspection data set, and data processing capacities and data bandwidths of the first and second sets of computing nodes.
  • a system comprising:
  • a scheduler configured to receive an inspection data set to be processed and to parse the inspection data set into a plurality of workloads including a first set of workloads and a second set of workloads, wherein the inspection data set corresponds to information related to one or more sets of secondary electrons;
  • a first sub-scheduler configured to manage the first set of computing nodes for processing the first set of workloads assigned by the scheduler; and
  • a second sub-scheduler configured to manage the second set of computing nodes for processing the second set of workloads assigned by the scheduler, wherein the first set of workloads and the second set of workloads are concurrently processed by the first and second set of computing nodes.
  • the inspection data set includes electron intensity data received from a plurality of electron sensing elements of the electron-beam inspection tool.
  • if the image indicates that the sample includes a defect, storing the image in a memory; and if the image indicates that the sample does not include defects, discarding the inspection data set.
  • each computing node in the first and second sets of computing nodes includes at least one of a central processing unit, an image processing unit, an application-specific integrated circuit, or a field-programmable gate array.
  • the first sub-scheduler is configured to distribute the first set of workloads among the first set of computing nodes based on at least one of data attributes of the first set of workloads, data processing capacities of the first set of computing nodes, data bandwidths of the first set of computing nodes, processor types of the first set of computing nodes, or current availabilities of the first set of computing nodes.
  • each of the first and second sub-schedulers is programmable based on a modification to the first set of computing nodes.
  • a method comprising:
  • parsing, by the scheduler, the inspection data set into a plurality of workloads including a first set of workloads and a second set of workloads, wherein the inspection data set corresponds to information related to one or more sets of secondary electrons;
  • managing, by a second sub-scheduler, a second set of computing nodes for processing the second set of workloads assigned by the scheduler, wherein the first set of workloads and the second set of workloads are concurrently processed by the first and second set of computing nodes.
  • the inspection data set is received from an electron-beam inspection tool configured to scan a sample with one or more primary electron beams and to generate the inspection data set based on the one or more sets of secondary electrons reflected from the sample.
  • the inspection data set includes electron intensity data received from a plurality of electron sensing elements of the electron-beam inspection tool.
  • if the image indicates that the sample includes a defect, storing the image in a memory; and if the image indicates that the sample does not include defects, discarding the inspection data set.
  • each computing node includes at least one of a central processing unit, an image processing unit, an application-specific integrated circuit, or a field-programmable gate array.
  • distributing, by the first sub-scheduler, the first set of workloads among the first set of computing nodes based on at least one of data attributes of the first set of workloads, data processing capacities of the first set of computing nodes, data bandwidths of the first set of computing nodes, processor types of the first set of computing nodes, or current availabilities of the first set of computing nodes.
  • each of the first and second sub-schedulers is programmable based on a modification to the first set of computing nodes.
  • a non-transitory computer-readable medium storing a set of instructions that is executable by one or more processors of one or more devices to cause the one or more devices to perform a method comprising:
  • parsing, by the scheduler, the inspection data set into a plurality of workloads including a first set of workloads and a second set of workloads, wherein the inspection data set corresponds to information related to one or more sets of secondary electrons;
  • managing, by a second sub-scheduler, a second set of computing nodes for processing the second set of workloads assigned by the scheduler, wherein the first set of workloads and the second set of workloads are concurrently processed by the first and second set of computing nodes.
  • each computing node includes at least one of a central processing unit, an image processing unit, an application-specific integrated circuit, or a field-programmable gate array.
  • distributing, by the first sub-scheduler, the first set of workloads among the first set of computing nodes based on at least one of data attributes of the first set of workloads, data processing capacities of the first set of computing nodes, data bandwidths of the first set of computing nodes, processor types of the first set of computing nodes, or current availabilities of the first set of computing nodes.
  • each of the first and second sub-schedulers is programmable based on a modification to the first set of computing nodes.

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Testing Or Measuring Of Semiconductors Or The Like (AREA)

Abstract

A data processing system is disclosed. According to certain embodiments, the system includes a scheduler configured to receive an inspection data set to be processed and to parse the inspection data set into a plurality of workloads including a first set of workloads and a second set of workloads, wherein the inspection data set corresponds to information related to one or more sets of secondary electrons. The system also includes a first sub-scheduler configured to manage a first set of computing nodes for processing the first set of workloads assigned by the scheduler. The system further includes a second sub-scheduler configured to manage a second set of computing nodes for processing the second set of workloads assigned by the scheduler, wherein the first set of workloads and the second set of workloads are concurrently processed by the first and second set of computing nodes.

Description

SYSTEM FOR PARALLEL DATA PROCESSING WITH MULTI-LAYER WORKLOAD
MANAGEMENT
CROSS-REFERENCE TO RELATED APPLICATIONS
[001] This application claims priority of US application 62/380,607, which was filed on August 29, 2016, and US application 62/546,479, which was filed on August 16, 2017, both of which are incorporated herein by reference in their entireties.
TECHNICAL FIELD
[002] The present disclosure generally relates to the fields of data processing and computer workload management, and more particularly, to a system for processing large quantity of data sets in real time, using a multi-layer workload management scheme. The data sets may include inspection data generated by an electron-beam inspection tool during the scanning of a sample, e.g., a semiconductor wafer.
BACKGROUND
[003] In manufacturing processes of integrated circuits (ICs), unfinished or finished circuit components are inspected to ensure that they are manufactured according to the original design and are free of defects. As the physical sizes of IC components continue to shrink to sub-100 or even sub-10 nanometers, conventional optical-microscopy-based inspection systems gradually become inadequate because their resolution is limited by the wavelength of light.
[004] Compared to a photon beam, a charged particle (e.g., electron) beam has a shorter wavelength and thereby can offer superior spatial resolution. Typical electron-beam (e-beam) inspection tools, such as a scanning electron microscope (SEM) or a transmission electron microscope (TEM), focus electrons of a single primary electron beam at predetermined scan locations of a wafer under inspection. The primary electron beam interacts with the wafer and may be backscattered or may cause the wafer to emit secondary electrons. The intensity of the backscattered or secondary electrons may vary based on the properties of the internal and/or external structures of the wafer, and thus indicates whether the wafer has defects.
[005] However, the traditional single e-beam inspection tools have low throughput due to their high resolutions, which limits their application to large-scale wafer inspection. One way to improve the throughput is to use multiple beamlets for scanning multiple separate areas simultaneously. Such a multi-beam inspection tool can drastically improve the speed of scanning a wafer, but it can also generate a huge amount of inspection data. For example, a multi-beam inspection system may output inspection data at a rate of approximately hundreds of gigabytes per second, or hundreds of terabytes per hour. This level of inspection data can be difficult to process without delay.
SUMMARY
[006] Embodiments of the present disclosure relate to a system for parallel data processing with multi-layer workload management. In some embodiments, a computer system is provided. The computer system includes a scheduler configured to receive an inspection data set to be processed and to parse the inspection data set into a plurality of workloads including a first set of workloads and a second set of workloads, wherein the inspection data set corresponds to information related to one or more sets of secondary electrons. The computer system also includes a first sub-scheduler configured to manage a first set of computing nodes for processing the first set of workloads assigned by the scheduler. The computer system further includes a second sub-scheduler configured to manage a second set of computing nodes for processing the second set of workloads assigned by the scheduler, wherein the first set of workloads and the second set of workloads are concurrently processed by the first and second set of computing nodes.
[007] In some embodiments, a system is provided. The system includes a first and a second sets of computing nodes. The system also includes a scheduler configured to receive an inspection data set to be processed and to parse the inspection data set into a plurality of workloads including a first set of workloads and a second set of workloads, wherein the inspection data set corresponds to information related to one or more sets of secondary electrons. The system also includes a first sub-scheduler configured to manage the first set of computing nodes for processing the first set of workloads assigned by the scheduler. The system further includes a second sub-scheduler configured to manage the second set of computing nodes for processing the second set of workloads assigned by the scheduler, wherein the first set of workloads and the second set of workloads are concurrently processed by the first and second set of computing nodes.
[008] In some embodiments, a method is provided. The method includes receiving, by a scheduler, an inspection data set to be processed. The method also includes parsing, by the scheduler, the inspection data set into a plurality of workloads including a first set of workloads and a second set of workloads, wherein the inspection data set corresponds to information related to one or more sets of secondary electrons. The method also includes managing, by a first sub-scheduler, a first set of computing nodes for processing the first set of workloads assigned by the scheduler. The method further includes managing, by a second sub-scheduler, a second set of computing nodes for processing the second set of workloads assigned by the scheduler, wherein the first set of workloads and the second set of workloads are concurrently processed by the first and second set of computing nodes.
[009] In some embodiments, a non-transitory computer-readable medium is provided. The medium stores instructions that, when executed by one or more processors, cause the processors to perform a method including: receiving, by a scheduler, an inspection data set to be processed; parsing, by the scheduler, the inspection data set into a plurality of workloads including a first set of workloads and a second set of workloads, wherein the inspection data set corresponds to information related to one or more sets of secondary electrons; managing, by a first sub-scheduler, a first set of computing nodes for processing the first set of workloads assigned by the scheduler; and managing, by a second sub- scheduler, a second set of computing nodes for processing the second set of workloads assigned by the scheduler, wherein the first set of workloads and the second set of workloads are concurrently processed by the first and second set of computing nodes.
[010] Additional objects and advantages of the disclosed embodiments will be set forth in part in the following description, and in part will be apparent from the description, or may be learned by practice of the embodiments. The objects and advantages of the disclosed embodiments may be realized and attained by the elements and combinations set forth in the claims.
[011] It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosed embodiments, as claimed.
BRIEF DESCRIPTION OF THE DRAWINGS
[012] FIG. 1 is a schematic diagram illustrating an exemplary electron-beam (e-beam) inspection system, consistent with embodiments of the present disclosure.
[013] FIG. 2 is a schematic diagram illustrating a sensor surface of an exemplary electron detector used in the exemplary e-beam inspection system of FIG. 1.
[014] FIG. 3 is a schematic diagram illustrating an exemplary architecture of a data processing system used in the exemplary e-beam inspection system of FIG. 1.
[015] FIG. 4A is a block diagram of a computing node used in the exemplary data processing system of FIG. 3.
[016] FIG. 4B is a block diagram of a computing node used in the exemplary data processing system of FIG. 3.
[017] FIG. 5 is a block diagram of a main scheduler or a sub-scheduler used in the exemplary data processing system of FIG. 3.
[018] FIG. 6 is a flowchart of a data processing method, consistent with embodiments of the present disclosure.
DESCRIPTION OF THE EMBODIMENTS
[019] Reference will now be made in detail to exemplary embodiments, examples of which are illustrated in the accompanying drawings. The following description refers to the accompanying drawings in which the same numbers in different drawings represent the same or similar elements unless otherwise represented. The implementations set forth in the following description of exemplary embodiments do not represent all implementations consistent with the invention. Instead, they are merely examples of apparatuses and methods consistent with aspects related to the invention as recited in the appended claims.
[020] The present application discloses a data processing system configured to process inspection data generated by an electron-beam (e-beam) inspection system in real time. The disclosed data processing system may be used in many technologies, such as in manufacturing processes of integrated circuits (ICs).
[021] FIG. 1 is a schematic diagram illustrating an exemplary e-beam inspection system 1, consistent with embodiments of the present disclosure. As shown in FIG. 1, e-beam inspection system 1 includes an e-beam tool 10 coupled with a data processing system 20. As described in detail below, e-beam tool 10 is configured to scan a sample, e.g., a semiconductor wafer, with one or more electron beams and to generate inspection data sets based on secondary electrons reflected from the sample. The inspection data sets may be provided to and processed by data processing system 20 in real time. Although FIG. 1 shows data processing system 20 coupled to one e-beam tool 10, it is contemplated that data processing system 20 can be shared by multiple e-beam tools 10 and thus can process inspection data sets generated by multiple e-beam tools 10.
[022] E-beam tool 10 includes a motorized stage 134 and a wafer holder 136 supported by motorized stage 134 to hold a wafer 150 to be inspected. E-beam tool 10 further includes a cathode 100, an anode 120, a gun aperture 122, a beam limit aperture 124, a condenser lens 126, a source conversion unit 128, an objective lens assembly 132, a beam separator 138, and an electron detector 140. Source conversion unit 128, in some embodiments, can include a micro-deflectors array 129 and a beamlet-limit plate 130. Objective lens assembly 132, in one embodiment, can include a modified swing objective retarding immersion lens (SORIL), which includes a pole piece 132a, a control electrode 132b, a deflector 132c, and an exciting coil 132d. E-beam tool 10 may additionally include an energy dispersive X-ray spectrometer (EDS) detector (not shown) to characterize the materials on the wafer.
[023] When e-beam tool 10 operates, a wafer 150 to be inspected is mounted or placed on wafer holder 136, which is supported by motorized stage 134. When a voltage is applied between anode 120 and cathode 100, cathode 100 emits an electron beam 160. The emitted electron beam passes through gun aperture 122 and beam limit aperture 124, both of which can determine the size of the electron beam entering condenser lens 126, which resides below beam limit aperture 124. Condenser lens 126 can focus the emitted electron beam 160 before electron beam 160 enters source conversion unit 128. Micro-deflectors array 129 can split the emitted beam into multiple primary electron beams 160a, 160b, and 160c. The number of primary beams is not limited to three, and micro-deflectors array 129 can be configured to split the emitted beam into a greater number of primary electron beams. Beamlet-limit plate 130 can set the size of the multiple primary electron beams before they enter objective lens assembly 132. Deflector 132c deflects the primary electron beams 160a, 160b, and 160c to facilitate beam scanning on the wafer. For example, in a scanning process, deflector 132c can be controlled to deflect primary electron beams 160a, 160b, and 160c simultaneously onto different locations of the top surface of wafer 150 at different time points, to provide data for image reconstruction for different parts of wafer 150.
[024] Exciting coil 132d and pole piece 132a generate a magnetic field that begins at one end of pole piece 132a and terminates at the other end of pole piece 132a. A part of wafer 150 being scanned by primary electron beams 160a, 160b, and 160c can be immersed in the magnetic field and can be electrically charged, which, in turn, creates an electric field. The electric field reduces the energy of the impinging primary electron beams near the surface of the wafer before they collide with the wafer. Control electrode 132b, being electrically isolated from pole piece 132a, controls an electric field on the wafer to prevent micro-arcing of the wafer and to ensure proper beam focus.
[025] Backscattered primary electrons and secondary electrons can be emitted from the part of wafer 150 upon receiving primary electron beams 160a, 160b, and 160c. Beam separator 138 can direct the secondary and/or scattered electron beams 170a, 170b, and 170c, comprising backscattered and secondary electrons, to a sensor surface of electron detector 140.
[026] FIG. 2 is a schematic diagram illustrating a sensor surface 142 of electron detector 140, according to some embodiments of the present disclosure. Referring to FIG. 2, the detected electron beams 170a, 170b, and 170c from FIG. 1 can form corresponding beam spots 180a, 180b, and 180c on sensor surface 142 of electron detector 140. Electron detector 140 can generate signals (e.g., voltages, currents, etc.) that represent the intensities of the received beam spots, and provide the signals to data processing system 20 (FIG. 1). Specifically, sensor surface 142 includes a plurality of sensor regions, labeled as 144a, 144b, 144c, etc. Each sensor region can be designated to receive a beam spot (e.g., beam spots 180a, 180b, and 180c) emitted from a particular location on wafer 150. As described above, the number of primary beams used in e-beam tool 10 is not limited to three. As such, the present disclosure does not limit the number of sensor regions on sensor surface 142, and the number of beam spots detectable by electron detector 140. For example, consistent with the disclosed embodiments, electron detector 140 may include 3x3, 4x5, or 10x10 sensor regions arranged on sensor surface 142 as a matrix. Each sensor region can comprise an array of electron sensing elements 146. Electron sensing elements 146 may comprise, for example, a PIN diode, an electron multiplier tube (EMT), etc. Electron sensing elements 146 can generate current signals commensurate with the electrons received in the sensor regions.
[027] Referring back to FIG. 1, the current signals generated by detector 140 may be amplified and digitized before being provided to data processing system 20. For example, electron detector 140 may output the current signals to a preprocessing circuit (not shown) included in or coupled with electron detector 140. The preprocessing circuit can amplify the current signals and convert the amplified current signals into voltage signals (representing the intensities of received electron beam spots 180a, 180b, and 180c). The voltage signals constitute the inspection data to be processed by data processing system 20.
[028] In exemplary embodiments, the intensity of secondary and/or scattered electron beams 170a, 170b, and 170c, and the resultant beam spots 180a, 180b, and 180c, can vary according to the external and/or internal structure of wafer 150. Moreover, as discussed above with respect to FIG. 1, primary electron beams 160a, 160b, and 160c can be projected onto different locations of the top surface of wafer 150 to generate secondary and/or scattered electron beams 170a, 170b, and 170c (and the resultant beam spots) of different intensities. Therefore, by mapping the intensities of the beam spots with the locations of wafer 150, data processing system 20 can reconstruct an image that reflects the internal and/or external structures of wafer 150.
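Purely as an illustration of this intensity-to-location mapping (not a reconstruction algorithm disclosed in the application), a minimal Python sketch might look as follows; the `reconstruct_image` function, the `(row, col, intensity)` sample format, and the array shape are assumptions.

```python
import numpy as np

def reconstruct_image(samples, height, width):
    """Map beam-spot intensity samples to a 2-D image of the scanned area.

    `samples` is assumed to be an iterable of (row, col, intensity) tuples,
    where (row, col) is the scan location on the wafer surface and
    `intensity` is the detected secondary-electron signal at that location.
    """
    image = np.zeros((height, width), dtype=np.float32)
    for row, col, intensity in samples:
        image[row, col] = intensity
    return image

# Example: three scan locations measured by three primary beamlets.
samples = [(0, 0, 0.82), (0, 1, 0.79), (1, 0, 0.15)]
print(reconstruct_image(samples, height=2, width=2))
```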
[029] Although FIGs. 1 & 2 show e-beam tool 10 as a multi-beam inspection tool that employs multiple primary electron beamlets to simultaneously scan multiple locations on wafer 150, it is contemplated that e-beam tool 10 may also be a single-beam inspection tool that uses only one primary electron beam to scan one location of wafer 150 at a time. The present application does not limit the number of electron beams used in e-beam tool 10.
[030] Still referring to FIG. 1, electron detector 140 and the preprocessing circuit (not shown) are coupled to data processing system 20 via a communication link 210. Communication link 210 may comprise one or more interconnected wired or wireless data networks that transmit the inspection data generated by e-beam tool 10 to data processing system 20. For example, communication link 210 may be implemented as a wired Wide Area Network (WAN), a wired Local Area Network (LAN), a wireless LAN (e.g., Institute of Electrical and Electronics Engineers (IEEE) 802.11, Bluetooth®, etc.), a wireless WAN (e.g., Worldwide Interoperability for Microwave Access (WiMAX)), the Internet, or the like. In some embodiments, communication link 210 may be implemented as one or more data cables connecting electron detector 140 (and/or the preprocessing circuit) directly to data processing system 20. Such wired connections may provide more reliable and faster data transmission than wireless connections.
[031] Data processing system 20 includes a plurality of computer processors 220 operating collaboratively to run various applications. Each computer processor 220 may be implemented using a variety of different equipment, such as a supercomputer, a personal computer, a server, a mainframe, a mobile device, or the like. Computer processors 220 may be located in the same location as or remotely from e-beam tool 10. Multiple computer processors 220 may be organized into one or more grids or clusters to share resources and workloads.
[032] FIG. 3 is a schematic diagram illustrating an exemplary architecture of data processing system 20, consistent with embodiments of the present disclosure. Referring to FIG. 3, data processing system 20 may include a plurality of computing nodes arranged into multiple clusters, including at least a first cluster 50 having computing nodes 50-1, 50-2, . . ., 50-n and a second cluster 60 having computing nodes 60-1, 60-2, . . ., 60-n. Data processing system 20 also includes a first sub-scheduler 41 configured to manage and distribute workloads within first cluster 50, and a second sub-scheduler 42 configured to manage and distribute workloads within second cluster 60. Data processing system 20 further includes a main scheduler 30 for coordinating workloads to be processed by first sub-scheduler 41 (and thus first cluster 50) and second sub-scheduler 42 (and thus second cluster 60).
[033] As shown in FIG. 3, main scheduler 30, sub-schedulers 41 and 42, and clusters 50 and 60 are arranged in a multi-layer tree structure. Main scheduler 30 coordinates the processing power of all the available computing nodes in data processing system 20 by allocating workloads between sub-schedulers 41 and 42, while sub-schedulers 41 and 42 are dedicated to managing workloads within clusters 50 and 60, respectively. Specifically, main scheduler 30 receives an inspection data set from e-beam tool 10 and parses it into multiple workloads. Main scheduler 30 further assigns different workloads to sub-schedulers 41 and 42, which then manage clusters 50 and 60, respectively, to process the assigned workloads.
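For readers who prefer a concrete picture of this two-layer arrangement, the following minimal Python sketch models the dispatch path from a main scheduler to its sub-schedulers; all class names, the naive splitting rule, and the node identifiers are illustrative assumptions and are not recited in the disclosure.

```python
from dataclasses import dataclass, field

@dataclass
class SubScheduler:
    """Manages one cluster of computing nodes (e.g., cluster 50 or 60)."""
    name: str
    nodes: list
    queue: list = field(default_factory=list)

    def submit(self, workloads):
        # Workloads arrive only from the main scheduler; sub-schedulers never
        # exchange workloads with each other, keeping the clusters isolated.
        self.queue.extend(workloads)

@dataclass
class MainScheduler:
    sub_schedulers: list

    def parse(self, inspection_data, n_parts):
        # Placeholder split; a real system would parse by beamlet, node
        # capacity, and so on, as described later in this document.
        size = max(1, len(inspection_data) // n_parts)
        return [inspection_data[i:i + size]
                for i in range(0, len(inspection_data), size)]

    def dispatch(self, inspection_data):
        workloads = self.parse(inspection_data, n_parts=len(self.sub_schedulers))
        for sub, load in zip(self.sub_schedulers, workloads):
            sub.submit([load])

# Two clusters managed by two sub-schedulers under one main scheduler.
main = MainScheduler([SubScheduler("cluster50", nodes=["50-1", "50-2"]),
                      SubScheduler("cluster60", nodes=["60-1", "60-2"])])
main.dispatch(inspection_data=list(range(100)))
```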
[034] As described above, data processing system 20 may be shared by multiple e-beam tools 10. Accordingly, main scheduler 30 and sub-schedulers 41 and 42 may be configured to simultaneously manage the processing of inspection data sets generated by multiple e-beam tools 10.
[035] Data processing system 20 includes features to minimize system interference between clusters 50 and 60, so as to ensure that the workloads are concurrently processed in real time. In the disclosed embodiments, clusters 50 and 60 do not share any computing node. Moreover, sub-schedulers 41 and 42 receive workloads only from main scheduler 30 and do not communicate with each other, in order to achieve isolation between clusters 50 and 60. This way, the workloads assigned to cluster 50 will not affect the timely processing of the workloads assigned to cluster 60, and vice versa. In some embodiments, clusters 50 and 60 may be configured to receive the workloads via different data channels, in order to prevent the transmission of the workloads assigned to cluster 50 from delaying the transmission of the workloads assigned to cluster 60, and vice versa.
[036] Data processing system 20 also includes features to enable precise allocation of the workloads between clusters 50 and 60, so as to ensure that the workloads are concurrently processed in real time. In some embodiments, the computing nodes in clusters 50 and 60 are configured to be non-programmable nodes, such that the hardware and/or software configurations of the computing nodes are fixed. This way, the computing power and/or data bandwidth of each computing node are predictable, and thus main scheduler 30 and sub-schedulers 41 and 42 can accurately allocate the workloads to the computing nodes according to the configurations of the computing nodes.
[037] In these embodiments, although the computing nodes are made non-programmable, the hardware and/or software configurations of clusters 50 and 60 may still change. For example, the number of computing nodes in each of clusters 50 and 60, or the number of computing nodes managed by each of sub-schedulers 41 and 42, may be modified occasionally. For example, the number of workable computing nodes in a cluster may decrease due to malfunctioning of some computing nodes in the cluster. As another example, a user may want to increase the number of computing nodes in a cluster to increase the cluster's computing power. To adapt to these modifications, sub-schedulers 41 and 42 may be made programmable based on system modifications of clusters 50 and 60. For example, the workload management algorithms in sub-schedulers 41 and 42 may be fine-tuned in response to changes to the numbers of computing nodes in clusters 50 and 60, respectively.
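A minimal sketch of such reprogrammability, assuming a hypothetical `reconfigure` hook and an equal-share policy that are not part of the disclosure, might look like this:

```python
class ProgrammableSubScheduler:
    """Sub-scheduler whose distribution policy can be re-tuned when the
    cluster it manages is modified (e.g., nodes added or removed)."""

    def __init__(self, nodes):
        self.nodes = []
        self.share_per_node = 0.0
        self.reconfigure(nodes)

    def reconfigure(self, nodes):
        # Re-derive the workload share assigned to each node whenever the
        # cluster membership changes.
        self.nodes = list(nodes)
        self.share_per_node = 1.0 / len(self.nodes) if self.nodes else 0.0

sub = ProgrammableSubScheduler(["50-1", "50-2", "50-3"])
sub.reconfigure(["50-1", "50-2"])   # one node removed due to malfunction
print(sub.share_per_node)           # 0.5
```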
[038] Although FIG. 3 only shows two sub-schedulers and two clusters, the present disclosure does not limit the number of clusters and thus the number of sub-schedulers. In some embodiments, main scheduler 30 may be configured to dynamically select the number of clusters and sub-schedulers for processing an inspection data set based on, for example, job requirement (e.g., the size of the inspection data set), current availability of the clusters, and/or hardware/software configurations (e.g., data processing capabilities and/or data bandwidths) of the clusters.
[039] A computing node used in data processing system 20 may include one or more of a CPU, an image processing unit, an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA), etc. In some embodiments, the computing node may be configured as a hybrid node containing multiple different types of processors, to meet different data processing requirements.
[040] For example, FIG. 4A is a block diagram of an exemplary computing node 400A used in data processing system 20. For example, computing node 400A may be used in cluster 50 and/or cluster 60 (FIG. 3). Referring to FIG. 4A, computing node 400A includes a CPU 410 and a special-purpose processor 420, such as a processor dedicated to image processing. CPU 410 and special-purpose processor 420 may be used to fulfill different tasks. For example, CPU 410 is well suited to single-threaded applications, such as defect detection based on the inspection data, while special-purpose processor 420 is more efficient and powerful for image processing, such as image denoising and image enhancement. Consistent with the disclosed embodiments, CPU 410 may be configured to allocate tasks between CPU 410 and special-purpose processor 420 according to attributes of the tasks, so as to maximize the system efficiency.
[041] As another example, FIG. 4B is a block diagram of an exemplary computing node 400B used in data processing system 20. For example, computing node 400B may be used in cluster 50 and/or cluster 60 (FIG. 3). Referring to FIG. 4B, computing node 400B includes a CPU 410, an FPGA 440, and a multi-core high performance processor 450, such as an Intel® Xeon Phi™ processor 450. For example, FPGA 440 may be programmed to handle specific applications such as digital signal processing, image construction, etc. Intel® Xeon Phi™ processor 450 is suitable for performing applications that require high-density calculation, fast memory access, or high multi-thread capacity, such as applications performing image alignment. Consistent with the disclosed embodiments, CPU 410 may be configured to decide which application shall be performed by which type of processor.
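As an illustration only, the routing decision described above for a hybrid node could be sketched as follows; the task-attribute names and the mapping itself are assumptions chosen to mirror the examples in paragraphs [040] and [041], not the claimed dispatch logic.

```python
def route_task(task):
    """Pick a processor type on a hybrid computing node for a given task.

    `task` is assumed to be a dict with a 'kind' entry; the mapping below
    follows the examples above: image denoising/enhancement go to the image
    processor, signal processing and image construction go to the FPGA,
    high-density work such as image alignment goes to the many-core
    processor, and single-threaded work stays on the CPU.
    """
    kind = task.get("kind", "")
    if kind in ("image_denoising", "image_enhancement"):
        return "image_processor"
    if kind in ("signal_processing", "image_construction"):
        return "fpga"
    if kind in ("image_alignment", "high_density_calculation"):
        return "many_core_processor"
    return "cpu"

print(route_task({"kind": "image_alignment"}))   # many_core_processor
print(route_task({"kind": "defect_detection"}))  # cpu
```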
[042] FIG. 5 is a block diagram of an exemplary workload scheduler 500, consistent with embodiments of the present disclosure. Workload scheduler 500 may be implemented as main scheduler 30 or as either of sub-schedulers 41 and 42. Functions of main scheduler 30 and sub-schedulers 41 and 42 are discussed in further detail with respect to FIG. 6.
[043] As shown in FIG. 5, workload scheduler 500 may include one or more processors 510, input/output ("I/O") devices 520, and memory 530 storing programs 540 (including, for example, workload management app(s) 542 and operating system 544) and data 550. Workload scheduler 500 may be a single server or may be configured as a distributed computer system including multiple servers or computers that interoperate to perform one or more of the processes and functionalities associated with the disclosed embodiments.
[044] In some embodiments, workload scheduler 500 is specially configured with hardware and/or software modules for performing functions of disclosed methods. For example, workload scheduler 500 may include a job interceptor 512, a job parser 514, and a job director 516. The modules can be implemented as specialized circuitry integrated within processor 510 or in communication with processor 510, and/or specialized software executable by processor 510.
[045] When workload scheduler 500 is implemented as main scheduler 30, job interceptor 512 may be configured to intercept a received inspection data set at an abstraction layer and extract information (e.g., data size) about the inspection data set. Job parser 514 may be configured to parse the inspection data set into a plurality of workloads. Moreover, job director 516 may be configured to direct different workloads to sub-schedulers 41 and 42.
[046] When workload scheduler 500 is implemented as one of sub-schedulers 41 and 42, job interceptor 512 may be configured to intercept, at an abstraction layer, a workload forwarded from main scheduler 30 and extract attributes (e.g., hardware/software requirements of the workload) of the workload. In such embodiments, job parser 514 may be configured to parse the workload into smaller components for evaluating the computation cost of the workload, while job director 516 may be configured to direct the workload to a suitable computing node for processing.
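The division of labor among the three modules can be sketched as below; the method names, the placeholder splitting rule, and the target identifiers are hypothetical and serve only to illustrate the intercept/parse/direct flow described above.

```python
class WorkloadScheduler:
    """Skeleton of the three modules described above; the method bodies are
    placeholders and the method names are assumptions for illustration."""

    def intercept(self, job):
        # Job interceptor: capture the incoming job at an abstraction layer
        # and extract information such as data size or hardware requirements.
        return {"size": len(job), "payload": job}

    def parse(self, job_info, n_parts):
        # Job parser: split the job into smaller workloads/components.
        payload = job_info["payload"]
        size = max(1, job_info["size"] // n_parts)
        return [payload[i:i + size] for i in range(0, job_info["size"], size)]

    def direct(self, workloads, targets):
        # Job director: route each workload to a sub-scheduler (when acting
        # as the main scheduler) or to a computing node (as a sub-scheduler).
        return dict(zip(targets, workloads))

scheduler = WorkloadScheduler()
info = scheduler.intercept(job=list(range(8)))
parts = scheduler.parse(info, n_parts=2)
print(scheduler.direct(parts, targets=["sub-scheduler-41", "sub-scheduler-42"]))
```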
[047] Processor 510 may be one or more known or custom processing devices designed to perform functions of the disclosed workload scheduling methods, such as a single core or multiple core processors capable of executing parallel processes simultaneously. For example, processor 510 may be a single core processor configured with virtual processing technologies. In certain embodiments, processor 510 may use logical processors to simultaneously execute and control multiple processes. Processor 510 may implement virtual machine technologies, or other known technologies to provide the ability to execute, control, run, manipulate, store, etc. multiple software processes, applications, programs, etc. In some embodiments, processor 510 may include a multiple-core processor arrangement (e.g., dual core, quad core, etc.) configured to provide parallel processing functionalities to allow workload scheduler 500 to execute multiple processes simultaneously. It is appreciated that other types of processor arrangements could be implemented that provide for the capabilities disclosed herein.
[048] Workload scheduler 500 may also include one or more I/O devices 520 that may comprise one or more interfaces for receiving signals or input from devices and providing signals or output to one or more devices that allow data to be received and/or transmitted by workload scheduler 500. For example, workload scheduler 500 may include interface components, which may provide interfaces to one or more input devices, such as one or more keyboards, mouse devices, and the like, that enable workload scheduler 500 to receive input from a user or administrator.
[049] Workload scheduler 500 may include one or more storage devices configured to store information used by processor 510 (or other components) to perform certain functions related to the disclosed embodiments. In one example, workload scheduler 500 may include memory 530 that includes instructions to enable processor 510 to execute one or more applications, such as workload management, server applications, network communication processes, and any other type of application or software known to be available on computer systems. Alternatively or additionally, the instructions, application programs, etc. may be stored in an internal database or an external storage (not shown) in direct communication with workload scheduler 500. The internal database and/or external storage may be a volatile or non-volatile, magnetic, semiconductor, tape, optical, removable, non-removable, or other type of storage device or tangible and/or non-transitory computer-readable medium. Common forms of non-transitory media include, for example, a floppy disk, a flexible disk, hard disk, solid state drive, magnetic tape, or any other magnetic data storage medium, a CD-ROM, any other optical data storage medium, any physical medium with patterns of holes, a RAM, a PROM, an EPROM, a FLASH-EPROM or any other flash memory, NVRAM, a cache, a register, any other memory chip or cartridge, and networked versions of the same.
[050] In some embodiments, workload scheduler 500 may include memory 530 that includes instructions that, when executed by processor 510, perform one or more processes consistent with the functionalities disclosed herein. For example, workload scheduler 500 may include memory 530 that may include one or more programs 540 to perform one or more functions of the disclosed embodiments. Moreover, processor 510 may execute one or more programs located remotely from data processing system 20. For example, workload scheduler 500 may access one or more remote programs that, when executed, perform functions related to the disclosed embodiments.
[051] Programs 540 stored in memory 530 and executed by processor(s) 510 may include one or more workload management app(s) 542 and operating system 544. Workload management app(s) 542 may be configured to cause processor(s) 510 to execute one or more processes related to intercepting an inspection data set and/or a workload, parsing the inspection data set and/or the workload, determining the cost of processing the workload by a computing node, and directing the workload to a selected computing node.
[052] Data 550 may include metadata describing the running status of clusters 50 and 60 at a macro level, and/or the running status of individual computing nodes at a micro level. For example, such metadata may indicate each cluster's current resource availability, file system, data storage locations, etc.
[053] Memory 530 may include one or more memory devices that store data and instructions used to perform one or more features of the disclosed embodiments. Memory 530 may also include any combination of one or more relational and/or non-relational databases controlled by memory controller devices (e.g., server(s), etc.) or software, such as document management systems, Microsoft® Structured Query Language (SQL) databases, SharePoint® databases, Oracle® databases, or other relational databases, or non-relational databases such as key-value stores or non-SQL (NoSQL) databases such as Apache™ HBase™. In some embodiments, memory 530 may comprise an associative array architecture, such as a key-value storage, for storing and rapidly retrieving large amounts of information about clusters 50 and 60, and/or information about individual computing nodes included in clusters 50 and 60.
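Purely as an example of how such metadata might be laid out in a key-value store (the keys, values, and helper function below are assumptions, not a schema defined in the disclosure):

```python
# Hypothetical key-value layout for cluster/node metadata; the keys shown
# here are examples consistent with the metadata described above.
cluster_metadata = {
    "cluster50/availability": "idle",
    "cluster50/file_system": "nfs://storage-a/inspection",
    "cluster50/node/50-1": {"processor": "cpu+image_processor", "busy": False},
    "cluster50/node/50-2": {"processor": "cpu+fpga+many_core", "busy": True},
    "cluster60/availability": "busy",
}

def free_nodes(metadata, cluster):
    """Return the identifiers of currently idle nodes in a cluster."""
    prefix = f"{cluster}/node/"
    return [key[len(prefix):] for key, value in metadata.items()
            if key.startswith(prefix) and not value["busy"]]

print(free_nodes(cluster_metadata, "cluster50"))  # ['50-1']
```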
[054] FIG. 6 is a flowchart of an exemplary data processing method 600, consistent with embodiments of the present disclosure. For example, method 600 may be performed by a data processing system (e.g., data processing system 20) coupled with an e-beam tool (e.g., e-beam tool 10). The data processing system may include a main scheduler (e.g., main scheduler 30 of FIG. 3), at least a first sub-scheduler and a second sub-scheduler (e.g., first sub-scheduler 41 and second sub-scheduler 42 of FIG. 3), and at least a first cluster and a second cluster (e.g., cluster 50 and cluster 60 of FIG. 3). The first cluster includes a first set of computing nodes, and the second cluster includes a second set of computing nodes. The main scheduler, first and second sub-schedulers, and first and second clusters may be arranged according to the architecture shown in FIG. 3. Referring to FIG. 6, method 600 may include one or more of the following steps 610-660.
[055] In step 610, the main scheduler receives an inspection data set from the e-beam tool.
The inspection data set may include intensity data of secondary electrons detected by the e-beam tool. The main scheduler may intercept the inspection data set at an abstraction layer and extract information, such as data size, about the inspection data set.
[056] In step 620, the main scheduler selects one or more computer clusters for processing the inspection data set in real time. Consistent with the disclosed embodiments, the data processing system may include multiple clusters. Each of the multiple clusters may be managed by a different sub-scheduler. However, some of the clusters may be currently processing other jobs and thus are not available for processing the inspection data set immediately. As such, to ensure that the inspection data set is to be processed in real time, the main scheduler may constantly monitor the running status of the clusters and select the currently available clusters for processing the inspection data set. For example, the main scheduler may periodically receive status reports of a cluster from a sub-scheduler managing the cluster, and determine the availability of the cluster accordingly.
[057] In some embodiments, the main scheduler may select clusters for processing the inspection data set based additionally on the size of the inspection data set, the data processing capacities of the clusters, and/or the data bandwidths of the clusters. Such information may be used by the main scheduler to determine how many available clusters and/or which available clusters are needed for processing the inspection data in real time. For illustrative purposes only, the following description assumes that the main scheduler selects the first and second clusters for processing the inspection data set in real time.
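One possible selection policy, sketched under assumed status-report fields and a simple greedy rule that are not specified in the disclosure, is shown below.

```python
def select_clusters(clusters, required_gbps):
    """Pick currently available clusters whose combined throughput meets the
    rate at which inspection data arrives.

    `clusters` is assumed to be a list of status reports collected from the
    sub-schedulers, e.g. {"name": ..., "available": ..., "throughput_gbps": ...};
    the greedy rule below is only one possible policy.
    """
    selected, capacity = [], 0.0
    for cluster in sorted(clusters, key=lambda c: c["throughput_gbps"], reverse=True):
        if not cluster["available"]:
            continue  # cluster is busy with another job
        selected.append(cluster["name"])
        capacity += cluster["throughput_gbps"]
        if capacity >= required_gbps:  # enough throughput for real-time processing
            break
    return selected

reports = [{"name": "cluster50", "available": True,  "throughput_gbps": 4.0},
           {"name": "cluster60", "available": True,  "throughput_gbps": 3.0},
           {"name": "cluster70", "available": False, "throughput_gbps": 6.0}]
print(select_clusters(reports, required_gbps=6.0))  # ['cluster50', 'cluster60']
```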
[058] In step 630, the main scheduler parses the inspection data set into a plurality of workloads. The main scheduler may use various methods to parse the inspection data set. For example, when the inspection data set is generated by a multi-beam tool, the main scheduler may package inspection data corresponding to different detected electron beams (e.g., detected electron beams 170a, 170b, and 170c of FIG. 1) into different workloads. This way, the processing result of each workload indicates the scan result of a different primary beam. As another example, the main scheduler may make the number of workloads equal to the total number of computing nodes, wherein each workload corresponds to a different computing node. The main scheduler may further make the data sizes of the workloads proportional to the processing capabilities of the corresponding computing nodes.
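The proportional-sizing idea can be sketched as follows; the capacity scores, node identifiers, and rounding rule are illustrative assumptions rather than a parsing method defined in the disclosure.

```python
def parse_into_workloads(inspection_data, node_capacities):
    """Split an inspection data set into one workload per computing node,
    sized in proportion to each node's processing capability.

    `inspection_data` is assumed to be a flat list of detector samples and
    `node_capacities` a mapping from node id to a relative capability score.
    """
    total = sum(node_capacities.values())
    workloads, start = {}, 0
    for node, capacity in node_capacities.items():
        share = int(round(len(inspection_data) * capacity / total))
        workloads[node] = inspection_data[start:start + share]
        start += share
    # Any rounding remainder goes to the last node.
    if start < len(inspection_data):
        workloads[node].extend(inspection_data[start:])
    return workloads

loads = parse_into_workloads(list(range(100)), {"50-1": 2.0, "50-2": 1.0, "60-1": 1.0})
print({node: len(chunk) for node, chunk in loads.items()})
# {'50-1': 50, '50-2': 25, '60-1': 25}
```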
[059] The plurality of workloads includes at least a first set of workloads and a second set of workloads. The first set of workloads and the second set of workloads are to be concurrently processed by the first and second clusters, respectively. As such, the main scheduler may arrange the first set and second set of workloads based on the data processing capabilities and/or data bandwidths of the first and second clusters, such that the first and second clusters can process the first and second sets of workloads without delay. It is appreciated that there can be multiple sets of workloads, and that these workloads can be distributed across any number of clusters.
[060] In step 640, the main scheduler submits the first set of workloads to the first sub-scheduler, and submits the second set of workloads to the second sub-scheduler.
[061] In step 650, the first sub-scheduler manages the first cluster to process the first set of workloads in real time, and the second sub-scheduler manages the second cluster to process the second set of workloads in real time. Consistent with the disclosed embodiments, the first and second clusters use different data channels to receive the first and second sets of workloads, so as to minimize interference between the data transmissions for the first and second clusters.
[062] The first and second sub-schedulers may manage the respective clusters based on various factors. For example, the first sub-scheduler may distribute the first set of workloads among the first set of computing nodes based on one or more of data attributes of the first set of workloads, data processing capacities of the first set of computing nodes, data bandwidths of the first set of computing nodes, processor types of the first set of computing nodes, and/or current availabilities of the first set of computing nodes.
[063] For example, the first sub-scheduler may constantly collect and update metadata of the first set of computing nodes. The metadata may include hardware/software configurations (e.g., processor types, data bandwidths, etc.) of each member of the first set of computing nodes. The metadata may also include current availability of each member of the first set of computing nodes. Based on the metadata, the first sub-scheduler may determine whether one of the first set of computing nodes can process a particular workload.
[064] As another example, the first sub-scheduler may evaluate the computing cost for one of the first set of computing nodes to process a particular workload received from the main scheduler. The first sub-scheduler may further parse the workload into smaller components for evaluating the processing cost. The first sub-scheduler may evaluate the processing cost based on a cost model, the data attributes of the workload, and/or the hardware/software configurations of the first set of computing nodes. The first sub-scheduler may select the computing node with the lowest cost to process the workload. For example, if the workload requires image denoising and enhancement, the first sub-scheduler may assign the workload to a computing node equipped with an image processing unit, which can process the workload more efficiently than other types of processors. Also for example, if the workload requires high-density calculation, the first sub-scheduler may assign the workload to a computing node equipped with a multi-core processor (e.g., an Intel® Xeon Phi™ processor).
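A minimal sketch of such lowest-cost selection is given below; the cost model (compute time plus transfer time, with a penalty when no matching accelerator is present) and all field names are assumptions for illustration only.

```python
def pick_node(workload, nodes):
    """Choose the node with the lowest estimated cost for a workload.

    `workload` is assumed to carry a 'size_mb' and a 'kind'; each node dict
    carries a throughput figure, a link bandwidth, and the workload kinds it
    can accelerate. The cost model here is illustrative, not the disclosed one.
    """
    def cost(node):
        compute = workload["size_mb"] / node["throughput_mb_s"]
        penalty = 1.0 if workload["kind"] in node["accelerated_kinds"] else 3.0
        transfer = workload["size_mb"] / node["link_mb_s"]
        return compute * penalty + transfer

    available = [n for n in nodes if n["available"]]
    return min(available, key=cost)["name"]

nodes = [
    {"name": "50-1", "available": True, "throughput_mb_s": 200.0,
     "link_mb_s": 1000.0, "accelerated_kinds": {"image_denoising"}},
    {"name": "50-2", "available": True, "throughput_mb_s": 400.0,
     "link_mb_s": 1000.0, "accelerated_kinds": set()},
]
print(pick_node({"size_mb": 800.0, "kind": "image_denoising"}, nodes))  # 50-1
```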
[065] Although the above examples are described with respect to the first sub-scheduler, it is contemplated that the description equally applies to the second sub-scheduler or any other sub-scheduler consistent with the present disclosure, which is not repeated here.
[066] Still referring to FIG. 6, in step 660, the data processing system may aggregate the processing results of the first and second clusters into a final result. For example, the aggregating may be performed by the main scheduler and/or the sub-schedulers. In some embodiments, the aggregating may include: aggregating the processing results to generate an image of a sample currently scanned by the e-beam tool; if the image indicates that the sample includes a defect, storing the image in a memory for further analysis; and if the image indicates that the sample does not include defects, discarding the inspection data set. This way, the data processing system only needs to store those inspection data sets representing defects of the sample. Thus, precious storage space can be saved and system speed can be improved because unnecessary data storage is skipped.

[067] The embodiments may further be described using the following clauses:
1. A computer system comprising:
a scheduler configured to receive an inspection data set to be processed and to parse the inspection data set into a plurality of workloads including a first set of workloads and a second set of workloads, wherein the inspection data set corresponds to information related to one or more sets of secondary electrons;
a first sub-scheduler configured to manage a first set of computing nodes for processing the first set of workloads assigned by the scheduler; and
a second sub-scheduler configured to manage a second set of computing nodes for processing the second set of workloads assigned by the scheduler, wherein the first set of workloads and the second set of workloads are concurrently processed by the first and second set of computing nodes.
2. The computer system of clause 1, wherein the computer system is coupled with an electron-beam inspection tool configured to scan a sample with one or more primary electron beams and to generate the inspection data set based on the one or more sets of secondary electrons reflected from the sample.
3. The computer system of clause 2, wherein the inspection data set includes electron intensity data received from a plurality of electron sensing elements of the electron-beam inspection tool.
4. The computer system of any one of clauses 2 and 3, wherein the processing of the first and second sets of workloads comprises:
generating, in real time, an image of the sample based on the inspection data set;
if the image indicates that the sample includes a defect, storing the image in a memory; and if the image indicates that the sample does not include defects, discarding the inspection data set.
5. The computer system of any one of clauses 1-4, wherein the parsing of the inspection data set is based on data processing capacities and data bandwidths of the first and second sets of computing nodes.
6. The computer system of any one of clauses 1-5, wherein the second set of computing nodes are different from the first set of computing nodes.
7. The computer system of any one of clauses 1-6, wherein each computing node includes at least one of a central processing unit, an image processing unit, an application-specific integrated circuit, or a field-programmable gate array.
8. The computer system of any one of clauses 1-7, wherein the first sub-scheduler is configured to distribute the first set of workloads among the first set of computing nodes based on at least one of data attributes of the first set of workloads, data processing capacities of the first set of computing nodes, data bandwidths of the first set of computing nodes, processor types of the first set of computing nodes, or current availabilities of the first set of computing nodes.
9. The computer system of any one of clauses 1-8, wherein the scheduler is programmable.
10. The computer system of any one of clauses 1-9, wherein each of the first and second sub-schedulers is programmable based on a modification to the first set of computing nodes.
11. The computer system of clause 10, wherein the modification includes modifying the number of computing nodes in the first set of computing nodes.
12. The computer system of any one of clauses 1-11, wherein the scheduler is configured to select at least one of the first and second sub-schedulers based on availability of the first and second sets of computing nodes, a size of the inspection data set, and data processing capacities and data bandwidths of the first and second sets of computing nodes.
13. The computer system of any one of clauses 1-12, wherein the first and second sets of computing nodes respectively receive the first and second sets of workloads via different data channels.
14. The computer system of any one of clauses 1-13, wherein the scheduler, and the first and second sub-schedulers are hosted on one or more processors.
15. The computer system of any one of clauses 1-14, further comprising a non-transitory computer-readable medium storing instructions that, when executed by one or more processors, cause the processors to implement functions of the scheduler and the first and second sub-schedulers.
16. A system comprising:
a first set and a second set of computing nodes;
a scheduler configured to receive an inspection data set to be processed and to parse the inspection data set into a plurality of workloads including a first set of workloads and a second set of workloads, wherein the inspection data set corresponds to information related to one or more sets of secondary electrons;
a first sub-scheduler configured to manage the first set of computing nodes for processing the first set of workloads assigned by the scheduler; and
a second sub-scheduler configured to manage the second set of computing nodes for processing the second set of workloads assigned by the scheduler, wherein the first set of workloads and the second set of workloads are concurrently processed by the first and second set of computing nodes.
17. The system of clause 16, wherein the system is coupled with an electron-beam inspection tool configured to scan a sample with one or more primary electron beams and to generate the inspection data set based on the one or more sets of secondary electrons reflected from the sample.
18. The system of clause 17, wherein the inspection data set includes electron intensity data received from a plurality of electron sensing elements of the electron-beam inspection tool.
19. The system of any one of clauses 17 and 18, wherein the processing of the first and second sets of workloads comprises:
generating, in real time, an image of the sample based on the inspection data set;
if the image indicates that the sample includes a defect, storing the image in a memory; and if the image indicates that the sample does not include defects, discarding the inspection data set.
20. The system of any one of clauses 16-19, wherein the parsing of the inspection data set is based on data processing capacities and data bandwidths of the first and second sets of computing nodes.
21. The system of any one of clauses 16-20, wherein the second set of computing nodes are different from the first set of computing nodes.
22. The system of any one of clauses 16-21, wherein each computing node in the first and second sets of computing nodes includes at least one of a central processing unit, an image processing unit, an application-specific integrated circuit, or a field-programmable gate array.
23. The system of any one of clauses 16-22, wherein the first sub-scheduler is configured to distribute the first set of workloads among the first set of computing nodes based on at least one of data attributes of the first set of workloads, data processing capacities of the first set of computing nodes, data bandwidths of the first set of computing nodes, processor types of the first set of computing nodes, or current availabilities of the first set of computing nodes.
24. The system of any one of clauses 16-23, wherein the scheduler is programmable.
25. The system of any one of clauses 16-24, wherein each of the first and second sub-schedulers is programmable based on a modification to the first set of computing nodes.
26. The system of clause 25, wherein the modification includes modifying the number of computing nodes in the first set of computing nodes.
27. The system of any one of clauses 16-26, wherein the first and second sets of computing nodes respectively receive the first and second sets of workloads via different data channels.
28. The system of any one of clauses 16-27, wherein the scheduler, and the first and second sub-schedulers are hosted on one or more processors.
29. The system of any one of clauses 16-28, further comprising a non-transitory computer-readable medium storing instructions that, when executed by one or more processors, cause the processors to implement functions of the scheduler and the first and second sub-schedulers.
30. A method comprising:
receiving, by a scheduler, an inspection data set to be processed;
parsing, by the scheduler, the inspection data set into a plurality of workloads including a first set of workloads and a second set of workloads, wherein the inspection data set corresponds to information related to one or more sets of secondary electrons;
managing, by a first sub-scheduler, a first set of computing nodes for processing the first set of workloads assigned by the scheduler; and
managing, by a second sub-scheduler, a second set of computing nodes for processing the second set of workloads assigned by the scheduler, wherein the first set of workloads and the second set of workloads are concurrently processed by the first and second set of computing nodes.
31. The method of clause 30, wherein the inspection data set is received from an electron-beam inspection tool configured to scan a sample with one or more primary electron beams and to generate the inspection data set based on the one or more sets of secondary electrons reflected from the sample.
32. The method of clause 31, wherein the inspection data set includes electron intensity data received from a plurality of electron sensing elements of the electron-beam inspection tool.
33. The method of any one of clauses 31 and 32, wherein the processing of the first and second sets of workloads comprises:
generating, in real time, an image of the sample based on the inspection data set;
if the image indicates that the sample includes a defect, storing the image in a memory; and if the image indicates that the sample does not include defects, discarding the inspection data set.
34. The method of any one of clauses 30-33, wherein the parsing of the inspection data set is based on data processing capacities and data bandwidths of the first and second sets of computing nodes.
35. The method of any one of clauses 30-34, wherein the second set of computing nodes are different from the first set of computing nodes.
36. The method of any one of clauses 30-35, wherein each computing node includes at least one of a central processing unit, an image processing unit, an application-specific integrated circuit, or a field-programmable gate array.
37. The method of any one of clauses 30-36, wherein managing, by the first sub-scheduler, the first set of computing nodes for processing the first set of workloads assigned by the scheduler comprises:
distributing, by the first sub-scheduler, the first set of workloads among the first set of computing nodes based on at least one of data attributes of the first set of workloads, data processing capacities of the first set of computing nodes, data bandwidths of the first set of computing nodes, processor types of the first set of computing nodes, or current availabilities of the first set of computing nodes.
38. The method of any one of clauses 30-37, wherein the scheduler is programmable.
39. The method of any one of clauses 30-38, wherein each of the first and second sub-schedulers is programmable based on a modification to the first set of computing nodes.
40. The method of clause 39, wherein the modification includes modifying the number of computing nodes in the first set of computing nodes.
41. The method of any one of clauses 30-40, further comprising: selecting, by the scheduler, at least one of the first and second sub-schedulers based on availability of the first and second sets of computing nodes, a size of the inspection data set, and data processing capacities and data bandwidths of the first and second sets of computing nodes.
42. The method of any one of clauses 30-41, wherein the first and second sets of computing nodes respectively receive the first and second sets of workloads via different data channels.
43. The method of any one of clauses 30-42, wherein the scheduler, and the first and second sub-schedulers are hosted on one or more processors.
44. A non-transitory computer-readable medium storing a set of instructions that is executable by one or more processors of one or more devices to cause the one or more devices to perform a method comprising:
receiving, by a scheduler, an inspection data set to be processed;
parsing, by the scheduler, the inspection data set into a plurality of workloads including a first set of workloads and a second set of workloads, wherein the inspection data set corresponds to information related to one or more sets of secondary electrons;
managing, by a first sub-scheduler, a first set of computing nodes for processing the first set of workloads assigned by the scheduler; and
managing, by a second sub-scheduler, a second set of computing nodes for processing the second set of workloads assigned by the scheduler, wherein the first set of workloads and the second set of workloads are concurrently processed by the first and second set of computing nodes.
45. The medium of clause 44, wherein the inspection data set is received from an electron-beam inspection tool configured to scan a sample with one or more primary electron beams and to generate the inspection data set based on the one or more sets of secondary electrons reflected from the sample.
46. The medium of clause 45, wherein the inspection data set includes electron intensity data received from a plurality of electron sensing elements of the electron-beam inspection tool.
47. The medium of any one of clauses 45 and 46, wherein the processing of the first and second sets of workloads comprises:
generating, in real time, an image of the sample based on the inspection data set; if the image indicates that the sample includes a defect, storing the image in a memory; and if the image indicates that the sample does not include defects, discarding the inspection data set.
48. The medium of any one of clauses 44-47, wherein the parsing of the inspection data set is based on data processing capacities and data bandwidths of the first and second sets of computing nodes.
49. The medium of any one of clauses 44-48, wherein the second set of computing nodes are different from the first set of computing nodes.
50. The medium of any one of clauses 44-49, wherein each computing node includes at least one of a central processing unit, an image processing unit, an application-specific integrated circuit, or a field-programmable gate array.
51. The medium of any one of clauses 44-50, wherein managing, by the first sub-scheduler, the first set of computing nodes for processing the first set of workloads assigned by the scheduler comprises:
distributing, by the first sub-scheduler, the first set of workloads among the first set of computing nodes based on at least one of data attributes of the first set of workloads, data processing capacities of the first set of computing nodes, data bandwidths of the first set of computing nodes, processor types of the first set of computing nodes, or current availabilities of the first set of computing nodes.
52. The medium of any one of clauses 44-51, wherein the scheduler is programmable.
53. The medium of any one of clauses 44-52, wherein each of the first and second sub-schedulers is programmable based on a modification to the first set of computing nodes.
54. The medium of clause 53, wherein the modification includes modifying the number of computing nodes in the first set of computing nodes.
55. The medium of any one of clauses 44-54, wherein the set of instructions is executable by the one or more processors of the one or more devices to cause the one or more devices to further perform:
selecting, by the scheduler, at least one of the first and second sub-schedulers based on availability of the first and second sets of computing nodes, a size of the inspection data set, and data processing capacities and data bandwidths of the first and second sets of computing nodes.
56. The medium of any one of clauses 44-55, wherein the first and second sets of computing nodes respectively receive the first and second sets of workloads via different data channels.
[068] It will be appreciated that the present invention is not limited to the exact construction that has been described above and illustrated in the accompanying drawings, and that various modifications and changes can be made without departing from the scope thereof. It is intended that the scope of the invention should only be limited by the appended claims.

Claims

WHAT IS CLAIMED IS:
1. A computer system comprising:
a scheduler configured to receive an inspection data set to be processed and to parse the inspection data set into a plurality of workloads including a first set of workloads and a second set of workloads, wherein the inspection data set corresponds to information related to one or more sets of secondary electrons;
a first sub-scheduler configured to manage a first set of computing nodes for processing the first set of workloads assigned by the scheduler; and
a second sub-scheduler configured to manage a second set of computing nodes for processing the second set of workloads assigned by the scheduler, wherein the first set of workloads and the second set of workloads are concurrently processed by the first and second set of computing nodes.
2. The computer system of claim 1, wherein the computer system is coupled with an electron-beam inspection tool configured to scan a sample with one or more primary electron beams and to generate the inspection data set based on the one or more sets of secondary electrons reflected from the sample, wherein the inspection data set includes electron intensity data received from a plurality of electron sensing elements of the electron-beam inspection tool, and/or wherein the processing of the first and second sets of workloads comprises:
generating, in real time, an image of the sample based on the inspection data set;
if the image indicates that the sample includes a defect, storing the image in a memory; and if the image indicates that the sample does not include defects, discarding the inspection data set.
3. The computer system of claim 1, wherein the parsing of the inspection data set is based on data processing capacities and data bandwidths of the first and second sets of computing nodes, and/or wherein the second set of computing nodes are different from the first set of computing nodes, and/or wherein each computing node includes at least one of a central processing unit, an image processing unit, an application-specific integrated circuit, or a field-programmable gate array.
4. The computer system of claim 1, wherein the first sub-scheduler is configured to distribute the first set of workloads among the first set of computing nodes based on at least one of data attributes of the first set of workloads, data processing capacities of the first set of computing nodes, data bandwidths of the first set of computing nodes, processor types of the first set of computing nodes, or current availabilities of the first set of computing nodes, and/or wherein the scheduler is programmable, and/or wherein each of the first and second sub-schedulers is programmable based on a modification to the first set of computing nodes, and/or wherein the modification includes modifying the number of computing nodes in the first set of computing nodes.
5. The computer system of claim 1, wherein the scheduler is configured to select at least one of the first and second sub-schedulers based on availability of the first and second sets of computing nodes, a size of the inspection data set, and data processing capacities and data bandwidths of the first and second sets of computing nodes, and/or wherein the first and second sets of computing nodes respectively receive the first and second sets of workloads via different data channels, and/or wherein the scheduler, and the first and second sub-schedulers are hosted on one or more processors.
6. A system comprising:
a first set and a second set of computing nodes;
a scheduler configured to receive an inspection data set to be processed and to parse the inspection data set into a plurality of workloads including a first set of workloads and a second set of workloads, wherein the inspection data set corresponds to information related to one or more sets of secondary electrons;
a first sub-scheduler configured to manage the first set of computing nodes for processing the first set of workloads assigned by the scheduler; and
a second sub-scheduler configured to manage the second set of computing nodes for processing the second set of workloads assigned by the scheduler, wherein the first set of workloads and the second set of workloads are concurrently processed by the first and second set of computing nodes.
7. The system of claim 6, wherein the system is coupled with an electron-beam inspection tool configured to scan a sample with one or more primary electron beams and to generate the inspection data set based on the one or more sets of secondary electrons reflected from the sample, and/or wherein the inspection data set includes electron intensity data received from a plurality of electron sensing elements of the electron-beam inspection tool, wherein the processing of the first and second sets of workloads comprises:
generating, in real time, an image of the sample based on the inspection data set;
if the image indicates that the sample includes a defect, storing the image in a memory; and if the image indicates that the sample does not include defects, discarding the inspection data set and/or wherein the parsing of the inspection data set is based on data processing capacities and data bandwidths of the first and second sets of computing nodes, and/or wherein the second set of computing nodes are different from the first set of computing nodes, wherein each computing node in the first and second sets of computing nodes includes at least one of a central processing unit, an image processing unit, an application-specific integrated circuit, or a field-programmable gate array.
8. The system of claim 6, wherein the first sub-scheduler is configured to distribute the first set of workloads among the first set of computing nodes based on at least one of data attributes of the first set of workloads, data processing capacities of the first set of computing nodes, data bandwidths of the first set of computing nodes, processor types of the first set of computing nodes, or current availabilities of the first set of computing nodes, and/or wherein the scheduler is programmable, wherein each of the first and second sub-schedulers is programmable based on a modification to the first set of computing nodes, and/or wherein the scheduler, and the first and second sub-schedulers are hosted on one or more processors, and/or wherein the modification includes modifying the number of computing nodes in the first set of computing nodes, and/or wherein the first and second sets of computing nodes respectively receive the first and second sets of workloads via different data channels, and/or wherein the system further comprises a non-transitory computer-readable medium storing instructions that, when executed by one or more processors, cause the processors to implement functions of the scheduler and the first and second sub-schedulers.
9. A method comprising:
receiving, by a scheduler, an inspection data set to be processed;
parsing, by the scheduler, the inspection data set into a plurality of workloads including a first set of workloads and a second set of workloads, wherein the inspection data set corresponds to information related to one or more sets of secondary electrons; and
managing, by a first sub-scheduler, a first set of computing nodes for processing the first set of workloads assigned by the scheduler; and
managing, by a second sub-scheduler, a second set of computing nodes for processing the second set of workloads assigned by the scheduler, wherein the first set of workloads and the second set of workloads are concurrently processed by the first and second sets of computing nodes.
10. The method of claim 9, wherein the inspection data set is received from an electron-beam inspection tool configured to scan a sample with one or more primary electron beams and to generate the inspection data set based on the one or more sets of secondary electrons reflected from the sample, and/or wherein the inspection data set includes electron intensity data received from a plurality of electron sensing elements of the electron-beam inspection tool, and/or wherein the processing of the first and second sets of workloads comprises:
generating, in real time, an image of the sample based on the inspection data set;
if the image indicates that the sample includes a defect, storing the image in a memory; and
if the image indicates that the sample does not include defects, discarding the inspection data set.
11. The method of claim 9, wherein the parsing of the inspection data set is based on data processing capacities and data bandwidths of the first and second sets of computing nodes, and/or wherein the second set of computing nodes is different from the first set of computing nodes, and/or wherein the first and second sets of computing nodes respectively receive the first and second sets of workloads via different data channels, and/or wherein each computing node includes at least one of a central processing unit, an image processing unit, an application-specific integrated circuit, or a field-programmable gate array, and/or wherein managing, by the first sub-scheduler, the first set of computing nodes for processing the first set of workloads assigned by the scheduler comprises:
distributing, by the first sub-scheduler, the first set of workloads among the first set of computing nodes based on at least one of data attributes of the first set of workloads, data processing capacities of the first set of computing nodes, data bandwidths of the first set of computing nodes, processor types of the first set of computing nodes, or current availabilities of the first set of computing nodes, and/or
wherein the scheduler is programmable and each of the first and second sub-schedulers is programmable based on a modification to the first set of computing nodes, and/or wherein the modification includes modifying the number of computing nodes in the first set of computing nodes.
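A minimal Python sketch of a sub-scheduler that is reprogrammed when the number of computing nodes in its set is modified is given below; the class and method names, and the even re-distribution, are hypothetical.

class ReconfigurableSubScheduler:
    # Sketch of a sub-scheduler whose node set can be modified at run time.
    def __init__(self, node_ids):
        self.node_ids = list(node_ids)

    def add_nodes(self, new_ids):
        # Growing the set: later distributions automatically include the
        # newly added nodes.
        self.node_ids.extend(new_ids)

    def remove_node(self, node_id):
        # Shrinking the set, e.g. when a node is taken offline.
        self.node_ids.remove(node_id)

    def distribute(self, workloads):
        # Spread the workloads evenly over whatever nodes are configured now.
        return {nid: workloads[i::len(self.node_ids)]
                for i, nid in enumerate(self.node_ids)}

if __name__ == "__main__":
    sub = ReconfigurableSubScheduler(["node-0", "node-1"])
    print(sub.distribute(list(range(6))))
    sub.add_nodes(["node-2"])
    print(sub.distribute(list(range(6))))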
12. The method of claim 9, further comprising:
selecting, by the scheduler, at least one of the first and second sub-schedulers based on availability of the first and second sets of computing nodes, a size of the inspection data set, and data processing capacities and data bandwidths of the first and second sets of computing nodes, and/or wherein the scheduler and the first and second sub-schedulers are hosted on one or more processors.
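The selection recited in claim 12 could, for example, be approximated as in the following Python sketch; the dictionary layout of each sub-scheduler entry and the rule of accumulating capacity until the data-set size is covered are assumptions of this sketch, not the claimed selection criteria.

def select_sub_schedulers(sub_schedulers, data_set_size):
    # Each entry is assumed to expose its name, number of available nodes,
    # aggregate processing capacity and data bandwidth.
    ranked = sorted((s for s in sub_schedulers if s["available_nodes"] > 0),
                    key=lambda s: s["capacity"] * s["bandwidth"],
                    reverse=True)
    selected, covered = [], 0.0
    for s in ranked:
        # Take sub-schedulers in order of throughput until their combined
        # capacity covers the size of the inspection data set.
        selected.append(s["name"])
        covered += s["capacity"]
        if covered >= data_set_size:
            break
    return selected

if __name__ == "__main__":
    pools = [{"name": "first", "available_nodes": 8, "capacity": 40.0, "bandwidth": 2.0},
             {"name": "second", "available_nodes": 4, "capacity": 20.0, "bandwidth": 1.0}]
    print(select_sub_schedulers(pools, data_set_size=50.0))  # ['first', 'second']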
13. A non-transitory computer-readable medium storing a set of instructions that is executable by one or more processors of one or more devices to cause the one or more devices to perform a method comprising:
receiving, by a programmable scheduler, an inspection data set to be processed;
parsing, by the programmable scheduler, the inspection data set into a plurality of workloads including a first set of workloads and a second set of workloads, wherein the inspection data set corresponds to information related to one or more sets of secondary electrons;
managing, by a first sub-scheduler, a first set of computing nodes for processing the first set of workloads assigned by the programmable scheduler; and
managing, by a second sub-scheduler, a second set of computing nodes for processing the second set of workloads assigned by the programmable scheduler, wherein the first set of workloads and the second set of workloads are concurrently processed by the first and second sets of computing nodes.
14. The medium of claim 13, wherein the inspection data set is received from an electron-beam inspection tool configured to scan a sample with one or more primary electron beams and to generate the inspection data set based on the one or more sets of secondary electrons reflected from the sample, wherein the inspection data set includes electron intensity data received from a plurality of electron sensing elements of the electron-beam inspection tool, and/or wherein the processing of the first and second sets of workloads comprises:
generating, in real time, an image of the sample based on the inspection data set;
if the image indicates that the sample includes a defect, storing the image in a memory; and
if the image indicates that the sample does not include defects, discarding the inspection data set, and/or
wherein the parsing of the inspection data set is based on data processing capacities and data bandwidths of the first and second sets of computing nodes, wherein the second set of computing nodes is different from the first set of computing nodes, and/or wherein each computing node includes at least one of a central processing unit, an image processing unit, an application-specific integrated circuit, or a field-programmable gate array.
15. The medium of claim 13, wherein managing, by the first sub-scheduler, the first set of computing nodes for processing the first set of workloads assigned by the programmable scheduler comprises:
distributing, by the first sub-scheduler, the first set of workloads among the first set of computing nodes based on at least one of data attributes of the first set of workloads, data processing capacities of the first set of computing nodes, data bandwidths of the first set of computing nodes, processor types of the first set of computing nodes, or current availabilities of the first set of computing nodes, wherein each of the first and second sub-schedulers is programmable based on a modification to the first set of computing nodes, and/or wherein the modification includes modifying the number of computing nodes in the first set of computing nodes.
PCT/EP2017/071268 2016-08-29 2017-08-24 System for parallel data processing with multi-layer workload management WO2018041696A1 (en)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
US201662380607P 2016-08-29 2016-08-29
US62/380,607 2016-08-29
US201762546479P 2017-08-16 2017-08-16
US62/546,479 2017-08-16

Publications (1)

Publication Number Publication Date
WO2018041696A1 true WO2018041696A1 (en) 2018-03-08

Family

ID=59761935

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/EP2017/071268 WO2018041696A1 (en) 2016-08-29 2017-08-24 System for parallel data processing with multi-layer workload management

Country Status (2)

Country Link
TW (1) TW201812576A (en)
WO (1) WO2018041696A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
TWI719786B (en) 2019-12-30 2021-02-21 財團法人工業技術研究院 Data processing system and method

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP2132763A2 (en) * 2007-02-22 2009-12-16 Applied Materials Israel Ltd. High throughput sem tool
US20100223618A1 (en) * 2009-02-27 2010-09-02 International Business Machines Corporation Scheduling jobs in a cluster
EP2879155A1 (en) * 2013-12-02 2015-06-03 ICT Integrated Circuit Testing Gesellschaft für Halbleiterprüftechnik mbH Multi-beam system for high throughput EBI
WO2016036246A2 (en) * 2014-09-04 2016-03-10 Technische Universiteit Delft Multi electron beam inspection apparatus

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
"Network and Parallel Computing", vol. 1659, 1 January 1999, SPRINGER INTERNATIONAL PUBLISHING, Cham, ISBN: 978-3-642-24784-2, ISSN: 0302-9743, article JÖRN GEHRING ET AL: "Scheduling a Metacomputer with Uncooperative Sub-schedulers", pages: 179 - 201, XP055423248, 032548, DOI: 10.1007/3-540-47954-6_10 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP3882951A1 (en) * 2020-03-19 2021-09-22 FEI Company Charged particle beam device for inspection of a specimen with a plurality of charged particle beamlets
US11676795B2 (en) 2020-03-19 2023-06-13 Fei Company Charged particle beam device for inspection of a specimen with a plurality of charged particle beamlets

Also Published As

Publication number Publication date
TW201812576A (en) 2018-04-01

Similar Documents

Publication Publication Date Title
US11935721B2 (en) System comprising a multi-beam particle microscope and method for operating the same
US11521827B2 (en) Method of imaging a 2D sample with a multi-beam particle microscope
Gallagher et al. Negative‐stain transmission electron microscopy of molecular complexes for image analysis by 2D class averaging
KR102438825B1 Deep Learning with Real-Time Intelligence to Detect Defects Using Electron Beam Inspection and Reduce Nuisance
JP2016004785A (en) Scan type microscope and mathematical image construction
JP7232911B2 (en) Fully automatic SEM sampling system for e-beam image enhancement
Teodoro et al. Accelerating large scale image analyses on parallel, CPU-GPU equipped systems
US10551827B2 (en) Hybrid inspection system for efficient process window discovery
JP2021120686A (en) Dynamic care area generation system to inspection tool, and method
WO2018041696A1 (en) System for parallel data processing with multi-layer workload management
JP2023138870A (en) Multi-beam electronic property evaluation tool with telecentric illumination
US20190378705A1 (en) High Resolution Electron Energy Analyzer
CN109075001A (en) For the method and system of the charge control of floating metal structure to be imaged on nonconductive substrate
US20220059316A1 (en) Scanning Electron Microscope Image Anchoring to Design for Array
Wagner et al. Performance analysis and optimization of the fftxlib on the intel knights landing architecture
US11927549B2 (en) Shielding strategy for mitigation of stray field for permanent magnet array
KR20240038019A (en) Charged particle beam microscope image processing system and its control method
US10755892B2 (en) Reflection-mode electron-beam inspection using ptychographic imaging
Xiao et al. High performance approximate sort algorithm using GPUs
US20220076914A1 (en) Magnetic immersion electron gun
WO2022233591A1 (en) System and method for distributed image recording and storage for charged particle systems
US20230258499A1 (en) Phase Analyzer, Sample Analyzer, and Analysis Method
EP4300087A1 (en) Method of processing data derived from a sample
TW202342974A (en) Data processing device and method, charged particle assessment system and method
Fischer Coordinated Caching for High Performance Calibration using Z→ µµ Events of the CMS Experiment

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 17761212

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 17761212

Country of ref document: EP

Kind code of ref document: A1