WO2023128357A1

WO2023128357A1 - Software-based simulator for disaggregated architecture system, and method therefor

Info

Publication number: WO2023128357A1
Application number: PCT/KR2022/019733
Authority: WO
Inventors: 강병훈; 송용호; 이준수; 김대경; 김건우; 최원우; 김은진; 임창일; 이호준
Original assignee: 한국과학기술원; 성균관대학교산학협력단
Priority date: 2021-12-29
Filing date: 2022-12-06
Publication date: 2023-07-06

Abstract

Various embodiments relate to a software-based simulator for a disaggregated architecture system and a method therefor. The simulator models constituent elements of the disaggregated architecture system on the basis of a configuration file for the disaggregated architecture system, simulates each of the constituent elements in order to measure the function and performance of the disaggregated architecture system, and outputs a result of the simulation.

Description

Software-based simulator for disaggregated architecture system, and method therefor

The present disclosure relates to a software-based simulator for a disaggregated architecture system and a method therefor.

New technologies such as artificial intelligence and the Internet of Things are being actively introduced in all industries. These technologies typically require large-scale data aggregation ("big data") and parallel processing hardware (GPUs, ASICs, etc.) specialized for a particular task for high-speed computation. However, for a typical company, owning all the infrastructure for storing, transmitting, and processing large-scale data is a burden in terms of facility and maintenance costs. Accordingly, more and more companies rent infrastructure from a cloud service or data center provider such as Amazon Web Services.

Cloud service providers, on the other hand, face the burden of accurately predicting demand and preparing various hardware resources accordingly in order to maximize profit relative to cost. In the past, demand was predicted in units of a machine that bundles a CPU, memory, accelerators, and so on. This reflected the traditional computer architecture as it is, but it had two drawbacks. The first drawback is that a single type of resource cannot be added or removed independently during demand forecasting and resource preparation. For example, in the existing machine-unit architecture, in order to add memory or a compute accelerator, the CPU that controls that hardware must also be included in the same machine. Therefore, it is common for cloud service providers to prepare more resources than necessary. The second drawback is that, likewise, cloud service consumers cannot add or remove a single type of resource independently.

The present disclosure provides a software-based simulator for a disaggregated architecture system and a method therefor.

The simulator according to the present disclosure includes a preparation unit configured to model components of a disaggregated architecture system based on a configuration file for the disaggregated architecture system, an execution unit configured to perform a simulation for each of the components in order to measure the function and performance of the disaggregated architecture system, and an output unit configured to output a result of the simulation.

A method of the simulator according to the present disclosure includes modeling components of a disaggregated architecture system based on a configuration file for the disaggregated architecture system, performing a simulation for each of the components in order to measure the function and performance of the disaggregated architecture system, and outputting a result of the simulation.

According to the present disclosure, cloud service providers can easily test the infrastructure they have designed using a simulator before actually implementing it, and the present disclosure can also be used as a tool for estimating potential costs and problems.

FIG. 1 is a diagram illustrating an existing architecture and a general disaggregated architecture.

FIG. 2 is a diagram schematically showing the configuration of a simulator of the present disclosure.

FIG. 3 is a diagram schematically illustrating a generalized disaggregated architecture that is the subject of the present disclosure.

FIG. 4 is a diagram illustrating the compute node of FIG. 3.

FIG. 5 is a diagram for explaining a memory space of the memory controller of FIG. 1.

Hereinafter, various embodiments of the present disclosure will be described with reference to the accompanying drawings.

FIG. 1 is a diagram showing an existing architecture and a general disaggregated architecture. Here, (a) of FIG. 1 shows an existing architecture, and (b) of FIG. 1 shows a disaggregated architecture.

Referring to FIG. 1, a new computer architecture called the "disaggregated architecture" has been proposed to solve the above-mentioned disadvantages of the existing machine-unit architecture. In the disaggregated architecture, the hardware resources allocated to each machine are taken apart, grouped into a pool per hardware resource type, and connected to each other through high-speed interfaces. Since each hardware resource can be flexibly connected according to consumer needs, demand forecasting and resource allocation become flexible.

However, although disaggregated architectures are receiving a lot of attention in the cloud service industry, the technologies disclosed so far are only at the level of proprietary prototypes or technical white papers from some companies. Therefore, cloud service providers have many difficulties in designing their own service infrastructure based on a disaggregated architecture. The present disclosure presents a method capable of simulating the structure of a general disaggregated architecture in software, and Disaggsim, which is an actual implementation of that method. Through this, cloud service providers can easily test their designed infrastructure before actually implementing it, and they can also use the present disclosure as a tool for estimating potential costs and problems.

1. Simulator

This disclosure proposes Disaggsim, a simulator that can change the settings and configuration of a disaggregated system and test its performance without physical hardware.

FIG. 2 is a diagram schematically showing the configuration of the simulator 200 of the present disclosure.

Referring to FIG. 2, the simulator 200 of the present disclosure, that is, Disaggsim 200, is broadly configured to perform three steps 210, 220, and 230. The first step 210 prepares a simulation by reading a configuration file, written in Python or another programming language, that describes the configuration of the system explained in 2. Architecture. The second step 220 executes a program in the prepared simulation environment to check whether the functions of the disaggregated system operate and to calculate performance measures. The third step 230 outputs whether the functions operate and the calculated performance measures. Disaggsim 200 may include a preparation unit, an execution unit, and an output unit configured to perform the three steps 210, 220, and 230, respectively. In some embodiments, at least two of the preparation unit, the execution unit, or the output unit may be integrated into one unit.

1-1. Step 1 (210): Analysis of configuration files and preparation for simulation

In order to estimate the function and performance of the system through simulation, without the hardware that implements an actual disaggregated system, Disaggsim 200 models each component of the disaggregated system in software (each component is defined conceptually in 2. Architecture). The modeled components are connected to each other through a configuration file written in a programming language to form the overall disaggregated system. In addition, several parameters that affect component modeling can be defined in the configuration file.

The pseudo code in Table 1 below shows an example of a configuration file. The beginning of the configuration file declares that it describes a disaggregated system configuration. The file then configures the simulation of a system in which one compute node and a memory pool are connected by a fabric interconnect with a latency of 300 ns. The program used for the simulation is specified at the end.
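Table 1 is not reproduced in this text. Purely as an illustration, a configuration file of the kind described might look like the Python sketch below. All class names, field names, and the workload path are hypothetical and are not taken from Disaggsim; only the topology (one compute node, one memory pool, a fabric interconnect with 300 ns latency, and a program named at the end) follows the description.

```python
from dataclasses import dataclass, field

# Hypothetical modeling classes; the actual Disaggsim interface is not shown in this text.
@dataclass
class ComputeNode:
    name: str

@dataclass
class MemoryPool:
    name: str

@dataclass
class FabricInterconnect:
    latency_ns: int
    links: list = field(default_factory=list)

    def connect(self, a, b):
        # Record that components a and b communicate through this fabric.
        self.links.append((a.name, b.name))

@dataclass
class DisaggregatedSystem:
    nodes: list
    pools: list
    fabric: FabricInterconnect
    program: str = ""

# Declare the disaggregated system: one compute node and one memory pool,
# connected by a fabric interconnect with 300 ns latency.
node = ComputeNode("node0")
pool = MemoryPool("pool0")
fabric = FabricInterconnect(latency_ns=300)
fabric.connect(node, pool)
system = DisaggregatedSystem(nodes=[node], pools=[pool], fabric=fabric)

# Specify the program to simulate at the end (placeholder path).
system.program = "./workload.bin"
```

Writing the configuration in a general-purpose language, as the description suggests, lets the same file both instantiate component models and wire them together.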

1-2. Step 2 (220): Running the simulation

In order to measure the function and performance of the system modeled in the first step 210, Disaggsim 200 simulates the operation of each component in software. For example, it is possible to calculate how many CPU cycles and how much power are required to execute the program specified in the first step 210. The program code for calculating these values must be prepared in advance to fit the system described in 2. Architecture, and must be properly initialized in the first step 210. Since Disaggsim 200 provides a programming interface for writing such code and a means of connecting the models of the system components, a user of the present disclosure can easily write the code for the simulations he or she needs. The simulation result is stored in a separate data structure and reported to the user in the third step 230.

1-3. Step 3 (230): Output of the simulation result

In the third step 230, the result calculated after the end of the simulation is output. The output result may include not only numerical values related to the performance and functions of the modeled system, but also a configuration diagram of the entire system. The output format is usually a text file, but can also be an image file or a binary file that can be analyzed by other programs.

2. Architecture

The present disclosure aims at generalized simulation, based on the design of Disaggsim 200 and a general disaggregated architecture. Since the performance of a disaggregated system can vary greatly depending on its structure and implementation, it is necessary to identify and define the modules that can be expressed in the simulator. Since there is no standardized disaggregated architecture yet, Disaggsim 200 takes compatibility with several previously proposed systems into account.

FIG. 3 is a diagram schematically illustrating a generalized disaggregated architecture 300 that is the subject of the present disclosure.

Referring to FIG. 3, the generalized disaggregated architecture 300 includes a compute node 310, a memory pool 320, and a fabric interconnect 330 connecting them. The compute node 310 accesses the memory pool 320, in which data and executable code are stored, through the fabric interconnect 330 and performs work. To transfer data and code, the compute node 310 generates a fabric packet, which is a data structure containing the related information, and transmits the read/write request to the memory pool 320. FIG. 3 illustrates a process in which a fabric packet generated in compute node 310 #0 passes through the fabric interconnect 330 and reaches the memory pool 320. The simulator 200 of the present disclosure provides an interface for configuring and connecting the compute node 310, the memory pool 320, and the fabric interconnect 330 through a modularization function. Each component is explained in detail below.

2-1. Work group

A work group means a configuration of modules required for actual work. Each work group is composed of a pool including a compute node 310 and one or more memory controllers 321 . The components of the work group are managed using identifiers (Component ID (CID) and sub-component ID (sub CID)).

2-2. Component Management Unit (CMU) (340)

The software modeling code of the CMU 340 is compiled into a binary file before driving the simulator 200, initialized in the first step 210, and connected to other components. In the first starting process of the simulator (second step 220), the CMU 340 directly connected to the fabric interconnect 330 transmits an initialization packet to all components for initialization. The CMU 340 allocates CIDs to all nodes including the compute node 310 and the memory pool 320 and allocates sub-CIDs to all memory controllers 321 . The CMU 340 includes a resource allocation table for matching the CID allocated to the compute node 310 and the sub-CID (memory range information) allocated to the memory controller 321, respectively.
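The identifier assignment performed by the CMU can be sketched as follows. This is a minimal illustration under assumptions: the class and method names, the sequential numbering scheme, and the table layout are hypothetical, and the address ranges standing in for the "memory range information" are invented for the example.

```python
class ComponentManagementUnit:
    """Hypothetical model of the CMU's CID / sub-CID allocation."""

    def __init__(self):
        self.next_cid = 0
        self.cids = {}        # component name -> CID
        self.sub_cids = {}    # (pool name, controller index) -> sub-CID
        self.allocation = {}  # compute-node CID -> [(pool CID, sub-CID, (start, end))]

    def register_component(self, name):
        # Assumption: CIDs are handed out sequentially to every node.
        self.cids[name] = self.next_cid
        self.next_cid += 1
        return self.cids[name]

    def register_controllers(self, pool_name, count):
        # Assumption: sub-CIDs are numbered locally within one pool.
        for index in range(count):
            self.sub_cids[(pool_name, index)] = index

    def allocate(self, node_name, pool_name, controller, addr_range):
        # Resource allocation table entry: match a compute node's CID
        # with a memory controller's sub-CID and its memory range.
        entry = (self.cids[pool_name],
                 self.sub_cids[(pool_name, controller)],
                 addr_range)
        self.allocation.setdefault(self.cids[node_name], []).append(entry)

GIB = 1 << 30
cmu = ComponentManagementUnit()
node_cid = cmu.register_component("node0")
pool_cid = cmu.register_component("pool0")
cmu.register_controllers("pool0", 2)
cmu.allocate("node0", "pool0", 0, (0, 8 * GIB))
cmu.allocate("node0", "pool0", 1, (8 * GIB, 16 * GIB))
```

In this sketch, `cmu.allocation[node_cid]` plays the role of the resource allocation table for one compute node.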

2-3. Compute Node (310)

The software modeling code of the compute node 310 is compiled into a binary file before running the simulator 200, initialized in the first step 210, and connected to other components. Thereafter, in a second step 220 it performs a predefined simulation and interacts with other components such as fabric interconnect 330 and memory pool 320 .

The compute node 310 is the element that actually performs calculations. Most disaggregated systems use a custom System-on-Chip (SoC), because disaggregated architectures require hardware resources to be connected to the interconnect, while conventional processors do not have the ability to connect directly to the interconnect. This disclosure also models a custom SoC in order to connect to the fabric interconnect 330.

FIG. 4 is a diagram illustrating the compute node 310 of FIG. 3 .

Referring to FIG. 4, the compute node 310 may include a core 411 including L1 and L2 caches, a last-level cache (LLC) 413, and interconnect logic 415. The core 411 is the same as a CPU core of the existing architecture. Accordingly, the compute node 310 may run a binary program built for the existing architecture. The simulator 200 of the present disclosure also supports compatibility with binary programs for existing architectures.

2-3-1. Interconnect Logic (415)

Code modeling the interconnect logic 415 in software is compiled into a binary file before running the simulator 200, initialized in the first step 210, and connected to other components. Thereafter, in a second step 220, it performs a predefined simulation and interacts with other components such as the compute node 310 and the like.

The interconnect logic 415 is located between the compute node 310, which is responsible for the actual operation, and the fabric interconnect 330 connecting it. It is assumed that the interconnect logic 415 uses one of the connection technologies that meet common network requirements for connections between nodes and modules. For example, interconnect technologies such as InfiniBand define their own packet specifications for fabric interconnects, while storage components such as DRAM use traditional datapath packets (e.g., DDR4). Accordingly, the interconnect logic 415 must act as a translator between the fabric interconnect 330 and the components to resolve these incompatibilities. Specifically, the interconnect logic 415 may include a fabric memory management unit (fMMU) 416 and a packet translator 417.

2-3-1-1. fMMU (416)

The fabric address space, an additional memory layer in disaggregated systems, requires a management unit similar to the memory management unit (MMU) of conventional processor architectures. Previous studies use a single central MMU, but then all memory access traffic is concentrated in one place, which can cause a serious bottleneck. In the disaggregated system model that is the simulation target of the present disclosure, an fMMU 416 is configured for each compute node 310 to solve this problem. The fMMU 416 is a component of the interconnect logic that can map a single unified fabric memory address space to one or more memory controllers. The fMMU 416 may map the address space to the memory controller 321 through several methods. As an example, the fMMU 416 retrieves a base address from an address translation table based on the physical address of a CPU packet. An offset is then added to this base address to obtain the fabric memory address actually used by the memory module 323.
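The table-based translation in the example above can be sketched as follows. This is a minimal sketch under assumptions: the table layout (physical start, size, fabric base), the class name, and the sample addresses are all hypothetical, and the text allows other mapping methods as well.

```python
from bisect import bisect_right

class FabricMMU:
    """Hypothetical fMMU model: base-address lookup plus offset."""

    def __init__(self, entries):
        # entries: iterable of (phys_start, size, fabric_base) tuples.
        self.entries = sorted(entries)
        self.starts = [entry[0] for entry in self.entries]

    def translate(self, phys_addr):
        # Find the last table entry whose physical start is <= phys_addr.
        index = bisect_right(self.starts, phys_addr) - 1
        if index < 0:
            raise ValueError("address not mapped")
        phys_start, size, fabric_base = self.entries[index]
        offset = phys_addr - phys_start
        if offset >= size:
            raise ValueError("address not mapped")
        # Fabric memory address = base address from the table + offset.
        return fabric_base + offset

# Illustrative table: two 4 KiB physical ranges mapped to two fabric regions.
fmmu = FabricMMU([
    (0x0000, 0x1000, 0x8000_0000),
    (0x1000, 0x1000, 0xC000_0000),
])
fabric_addr = fmmu.translate(0x1234)  # falls in the second range, offset 0x234
```

Keeping one such table per compute node, rather than one central table, is what avoids the single-MMU bottleneck mentioned above.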

The software modeling code of the fMMU 416 is compiled into a binary file before running the simulator 200, initialized in the first step 210, and connected to other components. Thereafter, as a sub-component of the compute node 310, it performs a predefined simulation in the second step 220 and interacts with other components such as the memory pool 320 and the like.

2-3-1-2. Packet Translator (417)

CPU packets generated by the processor have a physical address, but do not carry the information needed to route them through the interconnect and to determine the corresponding memory controller. Accordingly, the interconnect logic 415 of the compute node 310 converts the CPU-generated packets into fabric packets. The detailed structure of the packet is described later in 2-6. Fabric Packet. The software modeling code for packet conversion is compiled into a binary file before running the simulator 200, initialized in the first step 210, and connected to other components. Thereafter, as a sub-component of the compute node 310, it performs a predefined simulation in the second step 220 and interacts with other components such as the memory pool 320.

2-4. Memory Pool (320)

Memory resources of the disaggregated architecture are integrated and configured to facilitate large-scale data computation and flexible management for the compute node 310. Centralized memory resource management is a key feature of memory-centric disaggregated architectures. That is, all memory uses one unified address space. The memory pool 320 includes memory modules 323, such as DRAM and NVMe, and a fabric-compatible component such as a memory pool manager (MPM) 350. Using the MPM 350, the memory pool shown in FIG. 1 can be configured and managed. Each memory module 323 can be accessed through a memory controller 321 connected to the MPM 350. Also, the compute node 310 may use fully disaggregated memory without using local memory.

The software modeling code for the memory pool 320 is compiled into a binary file before running the simulator 200, initialized in the first step 210, and connected with other components. Thereafter, in a second step 220 it performs a predefined simulation and interacts with other components such as fabric interconnect 330 and compute node 310 .

2-4-1. Memory Pool Manager (MPM) (350)

Unlike the existing architecture, in the disaggregated architecture the memory space is located remotely and attached to the fabric interconnect 330. In addition, unlike the compute node 310, which has a computation function, the memory controller 321 and the memory module 323 have limited processing power and are designed to perform only memory I/O operations. Therefore, the model that this disclosure simulates includes the MPM 350, a new component that provides the computing capability needed for remote memory access. The MPM interface is responsible for inter-component communication between the fabric interconnect 330 and the memory controller 321. The MPM 350 includes interconnect logic 351, a packet multiplexer 353, and a memory controller 321. The MPM 350 is identified by a CID, and each memory controller 321 is identified by a sub-CID. The interconnect logic 351 of the MPM 350 receives fabric packets from the fabric interconnect 330 and forwards them to the packet multiplexer 353.

The software modeling code of the MPM 350 is compiled into a binary file before driving the simulator 200, initialized in the first step 210, and connected to other components. Thereafter, in a second step 220 it performs a predefined simulation and interacts with other components such as fabric interconnect 330 and compute node 310 .

2-4-1-1. Packet Multiplexer (353)

Several compute nodes 310 may simultaneously transmit read/write fabric packets, which are modified memory packets including CID and sub-CID fields, to the memory pool 320 connected to the fabric. The interconnect logic 415 of the compute node 310 inserts the CID and sub-CID information into the packet. This encapsulated packet is transmitted over the fabric interconnect 330 to the MPM 350. When a fabric packet has been routed from the compute node to the MPM 350 using the destination CID (dstCID) information, the packet multiplexer 353 parses the encapsulated fabric packet, references the sub-CID included in the packet header, and sends the packet to the memory controller 321 corresponding to that sub-CID. Finally, the memory controller 321 receives the original CPU packet with the CID and sub-CID information removed.
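The encapsulation and dispatch path described above can be sketched as follows. The packet field names, the `PacketMultiplexer` class, and the use of plain lists as controller inboxes are assumptions made for illustration only.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class CpuPacket:
    op: str            # "read" or "write"
    address: int
    data: bytes = b""

@dataclass(frozen=True)
class FabricPacket:
    src_cid: int       # CID of the sending compute node
    dst_cid: int       # CID of the destination MPM
    sub_dst_cid: int   # sub-CID of the target memory controller
    payload: CpuPacket

def encapsulate(packet, src_cid, dst_cid, sub_dst_cid):
    # Interconnect logic of the compute node: wrap the CPU packet.
    return FabricPacket(src_cid, dst_cid, sub_dst_cid, packet)

class PacketMultiplexer:
    def __init__(self, controllers):
        # controllers: sub-CID -> list standing in for a controller's inbox.
        self.controllers = controllers

    def dispatch(self, fabric_packet):
        # Read the sub-CID from the header, strip the header, and hand
        # the original CPU packet to the matching memory controller.
        inbox = self.controllers[fabric_packet.sub_dst_cid]
        inbox.append(fabric_packet.payload)

inboxes = {0: [], 1: []}
mux = PacketMultiplexer(inboxes)
request = CpuPacket("read", 0x40)
mux.dispatch(encapsulate(request, src_cid=0, dst_cid=1, sub_dst_cid=1))
```

After the call, only the inbox for sub-CID 1 holds the request, and what it holds is the bare CPU packet without any CID fields.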

The software modeling code of the packet multiplexer 353 is compiled into a binary file before running the simulator 200, initialized in the first step 210, and connected to other components. Thereafter, as a sub-component of the memory pool manager 350, it performs a predefined simulation in the second step 220 and interacts with other components such as the memory controller 321.

2-4-2. Memory Controller (321)

The memory controller 321 directly accesses the storage structures of each module, such as DRAM rows and columns, according to the physical addresses included in packets received from the MPM 350. The memory controller 321 has read and write queues, so it can process requests concurrently up to the capacity of the queues. As long as the queue is not full, the compute node 310 does not need to resend a packet even if the memory controller 321 is processing another request. The memory controller 321 has its own physical address space, but the compute node 310 sees a contiguous memory space in which this address space is merged with the address spaces provided by the other memory controllers 321. FIG. 5 is a diagram for explaining a memory space of the memory controller 321 of FIG. 1.

For example, assume that two memory controllers 321 each have 8 GB of DRAM modules. As shown in (a) of FIG. 5, each memory controller 321 has its own memory space between 0 and 8 GB. However, as shown in (b) of FIG. 5, the compute node 310 regards these two 8 GB spaces as one 16 GB space. That is, the components of the disaggregated system allow the memory space allocated to one work group to be accessed as a contiguous space. When the compute node 310 sends a memory access request, the MPM 350 routes the memory request between the two memory controllers 321 using the sub-CID included in the request packet. Since the fMMU 416 of the interconnect logic 415 converts each request address into an address of the target memory controller 321, the memory controller 321 can process the request without separate conversion.
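For the two-controller example, the arithmetic can be worked through in a few lines. The sketch assumes the unified 16 GB space is simply the two 8 GB ranges placed back to back, and that sub-CIDs are numbered 0 and 1; the text does not fix either choice, and the function name is hypothetical.

```python
GIB = 1 << 30
CONTROLLER_SIZE = 8 * GIB  # each memory controller exposes 0..8 GB

def locate(unified_addr, controllers=2):
    """Map an address in the unified space to (sub-CID, controller-local address)."""
    if not 0 <= unified_addr < controllers * CONTROLLER_SIZE:
        raise ValueError("address outside the work group's memory space")
    # Quotient selects the controller, remainder is the address it sees.
    sub_cid, local_addr = divmod(unified_addr, CONTROLLER_SIZE)
    return sub_cid, local_addr

first = locate(3 * GIB)    # lands in the first controller
second = locate(10 * GIB)  # lands 2 GB into the second controller
```

Under these assumptions, an access at 10 GB in the compute node's view becomes an access at 2 GB on the controller with sub-CID 1, which is why the controller itself needs no further address conversion.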

The software modeling code for the memory controller 321 is compiled into a binary file before driving the simulator 200, initialized in the first step 210, and connected to other components. Thereafter, as a sub-component of the memory pool 320, it performs a predefined simulation in the second step 220 and interacts with other components such as the memory pool manager 350 and the like.

2-5. Fabric Interconnect (330)

The fabric interconnect 330 is responsible for the connection used for communication between the compute node 310 and the memory pool 320. The fabric interconnect 330 is assumed to be based on existing high-speed interconnect technology and also includes the hardware functions of a fabric interconnect, such as packet routing. The simulator 200 of the present disclosure supports detailed configuration of the fabric interconnect 330, so that parameters affecting latency and throughput can be modified. In addition, for flexible modeling of the system, an interface is provided through which the user can define the topology in which the components are connected.
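How latency and throughput parameters might enter a performance estimate can be illustrated with a simple first-order link model. This is an assumption for illustration, not the model used by Disaggsim: the class name, the formula (fixed latency plus serialization time), and the 100 Gbit/s bandwidth are all invented; only the 300 ns latency echoes the configuration example in 1-1.

```python
from dataclasses import dataclass

@dataclass
class FabricLinkModel:
    latency_ns: float      # fixed per-packet latency of the fabric
    bandwidth_gbps: float  # link throughput in gigabits per second (illustrative)

    def transfer_time_ns(self, size_bytes):
        # 1 Gbit/s equals 1 bit per nanosecond, so bits / Gbps gives nanoseconds.
        serialization_ns = (size_bytes * 8) / self.bandwidth_gbps
        return self.latency_ns + serialization_ns

link = FabricLinkModel(latency_ns=300, bandwidth_gbps=100)
cache_line_time = link.transfer_time_ns(64)  # one 64-byte request
```

With these numbers the fixed latency dominates small transfers, which is the kind of effect a user could explore by changing the parameters.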

Code modeling the fabric interconnect 330 in software is compiled into a binary file before running the simulator 200, initialized in the first step 210, and connected to other components. After that, in the second step 220, it performs a predefined simulation and interacts with other components such as the compute node 310 and the memory pool manager 350.

2-6. Fabric Packet

The fabric packet is a communication means for communication between the compute node 310 and the memory pool 320 . Packets are organized in binary format. For example, a fabric packet may have a structure as shown in Table 2 below.

In this example, the fabric packet includes CIDs (srcCID/dstCID) representing the source and destination and a sub-CID (sub-dstCID) corresponding to the memory controller. Users of this disclosure may include other fields depending on the system model they wish to simulate. For example, a field for an item to be measured or a field related to system control may be additionally defined according to the implemented protocol and the fabric interconnect 330. The code modeling a fabric packet in software is compiled into a binary file before running the simulator 200, initialized in the first step 210, and connected to other components. After that, in the second step 220, it performs a predefined simulation and interacts with other components. Since the fabric packet is a concept applied to the entire simulator, the code modeling every component of the simulator 200 is written to ensure compatibility with the fabric packet.
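Table 2 is not reproduced in this text. As a rough illustration of a binary fabric packet carrying the srcCID, dstCID, and sub-dstCID fields, the sketch below uses Python's standard `struct` module. The field widths, byte order, and the extra `op` and `address` fields are assumptions and do not come from the source.

```python
import struct

# Hypothetical header layout, big-endian, no padding:
#   srcCID (uint16), dstCID (uint16), sub-dstCID (uint8),
#   op (uint8), address (uint64)
HEADER = struct.Struct(">HHBBQ")

def pack_fabric_packet(src_cid, dst_cid, sub_dst_cid, op, address, payload=b""):
    # Prepend the fixed-size header to the raw payload bytes.
    return HEADER.pack(src_cid, dst_cid, sub_dst_cid, op, address) + payload

def unpack_fabric_packet(raw):
    # Decode the header; everything after it is treated as payload.
    src_cid, dst_cid, sub_dst_cid, op, address = HEADER.unpack_from(raw)
    return {
        "srcCID": src_cid,
        "dstCID": dst_cid,
        "sub-dstCID": sub_dst_cid,
        "op": op,
        "address": address,
        "payload": raw[HEADER.size:],
    }

raw = pack_fabric_packet(0, 1, 1, op=1, address=0x2000, payload=b"\xAA\xBB")
decoded = unpack_fabric_packet(raw)
```

Additional measurement or control fields, as mentioned above, would be added by extending the format string and the unpacking code together.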

In short, the present disclosure provides a software-based simulator 200 for a disaggregated architecture system 300 and a method therefor.

The method of the simulator 200 according to the present disclosure includes modeling components of the disaggregated architecture system 300 based on a configuration file for the disaggregated architecture system 300 (210), performing a simulation for each of the components in order to measure the function and performance of the disaggregated architecture system 300 (220), and outputting a result of the simulation (230).

The simulator 200 according to the present disclosure includes a preparation unit configured to model components of the disaggregated architecture system 300 based on a configuration file for the disaggregated architecture system 300, an execution unit configured to perform a simulation for each of the components in order to measure the function and performance of the disaggregated architecture system 300, and an output unit configured to output a result of the simulation.

According to the present disclosure, codes for modeling each component are precompiled, initialized by the preparation unit and connected to each other, and interact with each other while simulation is performed by the execution unit.

According to the present disclosure, the components include at least one compute node 310 configured to perform actual calculations, a memory pool 320 including memory modules 323 required to perform the calculations and memory controllers 321 configured to manage the memory modules 323, and a fabric interconnect 330 supporting packet communication between the compute node 310 and the memory pool 320.

According to the present disclosure, the compute node 310 includes a core 411 and interconnect logic 415 located between the core 411 and the fabric interconnect 330. The interconnect logic 415 includes a fabric memory management unit 416 configured to map a single address space to at least one of the memory controllers 321, and a packet translator 417 configured to convert packets generated in the core 411 into fabric packets.

According to the present disclosure, the components further include a component management unit 340 configured to assign a CID to the compute node 310 and assign a sub CID to each of the memory controllers 321 .

According to the present disclosure, the compute node 310 is configured to include the assigned CID and the sub-CID assigned to at least one of the memory controllers 321 in a fabric packet and to transmit the fabric packet to the fabric interconnect 330.

According to the present disclosure, the components include a memory pool manager 350 comprising interconnect logic 351 configured to receive a fabric packet from the fabric interconnect 330, and a packet multiplexer 353 configured to send the received fabric packet to the at least one memory controller 321 to which the sub-CID of the received fabric packet is assigned.

According to the present disclosure, each of the memory controllers 321 is configured to store a received fabric packet in at least one of the memory modules 323 .

The devices described above may be implemented as hardware components, software components, and/or a combination of hardware components and software components. For example, the devices and components described in the embodiments may be implemented using a processor, a controller, an arithmetic logic unit (ALU), a digital signal processor, a microcomputer, a field programmable gate array (FPGA), a programmable logic unit (PLU), a microprocessor, or any other device capable of executing and responding to instructions. The processing device may run an operating system (OS) and one or more software applications running on the operating system. A processing device may also access, store, manipulate, process, and generate data in response to execution of software. For convenience of understanding, the description sometimes refers to a single processing device, but those skilled in the art will understand that the processing device may include a plurality of processing elements and/or a plurality of types of processing elements. For example, a processing device may include a plurality of processors, or a processor and a controller. Other processing configurations are also possible, such as parallel processors.

Software may include a computer program, code, instructions, or a combination of one or more of the foregoing, and may configure a processing device to operate as desired or may command the processing device independently or collectively. The software and/or data may be embodied in any type of machine, component, physical device, computer storage medium, or device in order to be interpreted by, or to provide instructions or data to, a processing device. Software may be distributed over networked computer systems and stored or executed in a distributed manner. Software and data may be stored on one or more computer-readable media.

Methods according to various embodiments may be implemented in the form of program instructions that can be executed through various computer means and recorded on a computer-readable medium. In this case, the medium may continuously store programs executable by a computer or temporarily store them for execution or download. Also, the medium may be a single or various types of recording means or storage means in the form of a combination of several pieces of hardware. It is not limited to a medium directly connected to a certain computer system, and may be distributed on a network. Examples of the medium include magnetic media such as hard disks, floppy disks and magnetic tapes, optical recording media such as CD-ROM and DVD, magneto-optical media such as floptical disks, and ROM, RAM, flash memory, etc. configured to store program instructions. In addition, examples of other media include recording media or storage media managed by an app store that distributes applications, a site that supplies or distributes various other software, and a server.

Various embodiments of this document and the terms used therein are not intended to limit the technology described in this document to a specific embodiment, and should be understood to include various modifications, equivalents, and/or substitutes of the embodiment. In connection with the description of the drawings, like reference numerals may be used for like elements. Singular expressions may include plural expressions unless the context clearly dictates otherwise. In this document, expressions such as "A or B", "at least one of A and/or B", "A, B or C" or "at least one of A, B and/or C" may include all possible combinations of the items listed together. Expressions such as "first" or "second" may modify the corresponding elements regardless of order or importance; they are used only to distinguish one element from another and do not limit the elements. When a (e.g., first) element is referred to as being "(functionally or communicatively) coupled" or "connected" to another (e.g., second) element, the element may be directly connected to the other element or connected through another element (e.g., a third element).

The term "module" used in this document includes a unit composed of hardware, software, or firmware, and may be used interchangeably with terms such as logic, logic block, component, or circuit. A module may be an integrally formed part, or a minimum unit or portion thereof that performs one or more functions. For example, a module may be implemented as an application-specific integrated circuit (ASIC).

According to various embodiments, each of the described components (e.g., a module or program) may include a single entity or a plurality of entities. According to various embodiments, one or more of the aforementioned components or steps may be omitted, or one or more other components or steps may be added. Alternatively or additionally, a plurality of components (e.g., modules or programs) may be integrated into a single component. In this case, the integrated component may perform one or more functions of each of the plurality of components in the same or a similar manner as they were performed by the corresponding component prior to integration. According to various embodiments, steps performed by a module, program, or other component may be executed sequentially, in parallel, iteratively, or heuristically; one or more of the steps may be executed in a different order or omitted; or one or more other steps may be added.

Claims

A software-based simulator for a disaggregated architecture system, the simulator comprising:

a preparation unit configured to model components of the disaggregated architecture system based on a configuration file for the disaggregated architecture system;

an execution unit configured to perform a simulation for each of the components in order to measure functions and performance of the disaggregated architecture system; and

an output unit configured to output a result of the simulation.
The simulator of claim 1,

wherein codes for modeling each of the components are compiled in advance, are initialized and connected to each other by the preparation unit, and interact with each other while the simulation is performed by the execution unit.
The simulator of claim 1, wherein the components include:

at least one compute node configured to perform actual computation;

a memory pool including memory modules required to perform the computation and memory controllers configured to manage the memory modules; and

a fabric interconnect supporting packet communication between the compute node and the memory pool.
The simulator of claim 3, wherein the compute node includes:

a core; and

interconnect logic positioned between the core and the fabric interconnect, the interconnect logic including a fabric memory management unit (fMMU) configured to map a single address space to at least one of the memory controllers, and a packet translator configured to convert packets generated by the core into fabric packets.
The simulator of claim 4, wherein the components further include:

a component management unit configured to assign a component ID (CID) to the compute node and to assign a sub-CID to each of the memory controllers,

wherein the compute node is configured to include the assigned CID and the sub-CID assigned to at least one of the memory controllers in the fabric packet, and to transmit the fabric packet to the fabric interconnect.
The simulator of claim 5, wherein the components further include:

a memory pool manager comprising interconnect logic configured to receive the fabric packet from the fabric interconnect, and a packet multiplexer configured to direct the received fabric packet to the at least one memory controller to which the sub-CID of the received fabric packet is assigned.
A method of a software-based simulator for a disaggregated architecture system, the method comprising:

modeling components of the disaggregated architecture system based on a configuration file for the disaggregated architecture system;

performing a simulation for each of the components in order to measure functions and performance of the disaggregated architecture system; and

outputting a result of the simulation.
The method of claim 7,

wherein codes for modeling each of the components are compiled in advance, are initialized to model the components and connected to each other, and interact with each other while the simulation is performed.
The method of claim 7, wherein the components include:

at least one compute node configured to perform actual computation;

a memory pool including memory modules required to perform the computation and memory controllers configured to manage the memory modules; and

a fabric interconnect supporting packet communication between the compute node and the memory pool.
A computer program stored in a non-transitory computer-readable recording medium to execute a method of a software-based simulator for a disaggregated architecture system, the method comprising:

modeling components of the disaggregated architecture system based on a configuration file for the disaggregated architecture system;

performing a simulation for each of the components in order to measure functions and performance of the disaggregated architecture system; and

outputting a result of the simulation.
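
For illustration only, the flow recited above (modeling components from a configuration, simulating them, and outputting a result) might be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the implementation of the disclosed simulator: every class, function, and configuration key (for example `FabricPacket`, `prepare`, `region_size`) is hypothetical, the configuration is an in-memory dictionary rather than a file, and the fMMU is assumed to split a single flat address space into equal-sized regions, one per memory controller.

```python
from dataclasses import dataclass

# Hypothetical model; none of these names come from the source document.

@dataclass
class FabricPacket:
    cid: int       # component ID of the sending compute node
    sub_cid: int   # sub-CID of the target memory controller
    offset: int    # address local to that memory controller
    op: str        # "read" or "write"
    data: int = 0

class MemoryController:
    def __init__(self):
        self.cells = {}    # stands in for the managed memory modules
        self.accesses = 0

    def handle(self, pkt):
        self.accesses += 1
        if pkt.op == "write":
            self.cells[pkt.offset] = pkt.data
            return None
        return self.cells.get(pkt.offset, 0)

class MemoryPoolManager:
    """Packet multiplexer: directs each fabric packet by its sub-CID."""
    def __init__(self, controllers):
        self.controllers = controllers  # dict: sub-CID -> MemoryController

    def receive(self, pkt):
        return self.controllers[pkt.sub_cid].handle(pkt)

class FabricInterconnect:
    def __init__(self, pool_manager):
        self.pool_manager = pool_manager
        self.packets = 0

    def send(self, pkt):
        self.packets += 1
        return self.pool_manager.receive(pkt)

class ComputeNode:
    def __init__(self, cid, fabric, region_size, sub_cids):
        self.cid = cid
        self.fabric = fabric
        self.region_size = region_size  # assumed equal-sized regions
        self.sub_cids = sub_cids

    def access(self, op, addr, data=0):
        # fMMU step (assumed policy): flat address -> (controller, offset)
        index, offset = divmod(addr, self.region_size)
        # Packet translator step: wrap the core request in a fabric packet
        pkt = FabricPacket(self.cid, self.sub_cids[index], offset, op, data)
        return self.fabric.send(pkt)

def prepare(config):
    # Component management: assign sub-CIDs 0..n-1 and CID 0 (assumed scheme)
    controllers = {s: MemoryController()
                   for s in range(config["memory_controllers"])}
    fabric = FabricInterconnect(MemoryPoolManager(controllers))
    node = ComputeNode(0, fabric, config["region_size"], sorted(controllers))
    return node, fabric, controllers

def execute(node, trace):
    return [node.access(*step) for step in trace]

def output(fabric, controllers):
    return {"fabric_packets": fabric.packets,
            "accesses_per_controller":
                {s: c.accesses for s, c in controllers.items()}}

config = {"memory_controllers": 2, "region_size": 1024}
node, fabric, controllers = prepare(config)
trace = [("write", 10, 7), ("write", 1500, 9), ("read", 10), ("read", 1500)]
results = execute(node, trace)
report = output(fabric, controllers)
print(results, report)
```

In this sketch, address 1500 falls in the second region, so the compute node tags the packet with sub-CID 1 and the memory pool manager forwards it to the second controller. A real simulator would additionally model timing, such as link latency and controller queuing, in order to measure performance; the counters here only stand in for such measurements.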