WO2018059497A1 - Cache consistency processing method and device - Google Patents

Cache consistency processing method and device

Info

Publication number
WO2018059497A1
WO2018059497A1 PCT/CN2017/104021 CN2017104021W WO2018059497A1 WO 2018059497 A1 WO2018059497 A1 WO 2018059497A1 CN 2017104021 W CN2017104021 W CN 2017104021W WO 2018059497 A1 WO2018059497 A1 WO 2018059497A1
Authority
WO
WIPO (PCT)
Prior art keywords
consistency
processor
processor core
identifier
router
Prior art date
Application number
PCT/CN2017/104021
Other languages
French (fr)
Chinese (zh)
Inventor
崔晓松
陈云
蔡毅
黄勤业
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司 filed Critical 华为技术有限公司
Publication of WO2018059497A1 publication Critical patent/WO2018059497A1/en

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02Addressing or allocation; Relocation
    • G06F12/08Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
    • G06F12/0802Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
    • G06F12/0806Multiuser, multiprocessor or multiprocessing cache systems
    • G06F12/0813Multiuser, multiprocessor or multiprocessing cache systems with a network or matrix configuration
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F15/00Digital computers in general; Data processing equipment in general
    • G06F15/16Combinations of two or more digital computers each having at least an arithmetic unit, a program unit and a register, e.g. for a simultaneous processing of several programs
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44Arrangements for executing specific programs
    • G06F9/455Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44Arrangements for executing specific programs
    • G06F9/455Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
    • G06F9/45533Hypervisors; Virtual machine monitors
    • G06F9/45558Hypervisor-specific management and integration aspects
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44Arrangements for executing specific programs
    • G06F9/455Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
    • G06F9/45533Hypervisors; Virtual machine monitors
    • G06F9/45558Hypervisor-specific management and integration aspects
    • G06F2009/45562Creating, deleting, cloning virtual machine instances

Definitions

  • the present invention relates to the field of cloud computing technologies, and in particular, to a cache coherency processing technique in an on-chip multi-core processor.
  • The chip multi-processor (CMP) architecture has become the mainstream architecture in processor design. It consists of many processor cores connected through routers into a network on chip (NoC).
  • the cache in the chip is designed according to the hierarchy.
  • The level-1 (L1) cache is a private cache. It is located inside the processor core, and the data blocks it stores can be used only by that processor core.
  • The level-2 (L2) cache is a shared cache: the data blocks it stores are shared data blocks that can be accessed by multiple processor cores. It can be designed outside the processor cores in a centralized manner, or in a distributed manner, that is, physically distributed near each processor core, interconnected by the on-chip network, and logically shared.
  • the data block is typically stored in a shared cache for access by one or more processor cores.
  • A copy of the data block is created in the private cache of each processor core that has accessed the data block, so that when such a core needs to access the data block again, it only needs to read the copy in its own private cache. Because copies of the data block are stored in the private caches of one or more processor cores, consistency must be maintained between the copies held in the private caches of the multiple cores.
  • the specific ways to solve the cache consistency include: a directory-based consistency protocol and a snooping-based consistency protocol.
  • The directory-based consistency protocol uses a directory to record which processor cores have accessed the data block.
  • The processor cores storing a copy of the data block are determined by querying the directory entry, and consistency operations are then performed on the data block in the corresponding processor cores.
  • The snooping-based consistency protocol is a method in which all processor cores listen to the bus: after a processor core modifies a data block in its private cache, it broadcasts a consistency maintenance message on the bus, so that the other processor cores storing a copy of the data block perform a consistency operation.
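  • The difference between the two protocol families can be illustrated with a minimal Python sketch. The `Directory` class, the `snoop_write` helper, and the invalidate-style handling of a write are illustrative assumptions, not part of the described method; the sketch only shows which cores receive a consistency message under each scheme.

```python
from collections import defaultdict


class Directory:
    """Toy directory: maps a block address to the set of cores holding a copy."""

    def __init__(self):
        self.sharers = defaultdict(set)

    def record_read(self, block, core):
        # A core that reads the block now holds a private copy.
        self.sharers[block].add(core)

    def write(self, block, writer):
        # Only the cores listed in the directory need a consistency operation.
        targets = sorted(self.sharers[block] - {writer})
        # Assumed invalidate-style handling: the writer becomes the only sharer.
        self.sharers[block] = {writer}
        return targets


def snoop_write(all_cores, writer):
    # Snooping: the message is broadcast to every other core on the bus.
    return sorted(set(all_cores) - {writer})


directory = Directory()
for core in (1, 3, 7):
    directory.record_read(0x40, core)

directory_targets = directory.write(0x40, writer=1)
snoop_targets = snoop_write(range(16), writer=1)
print(directory_targets, len(snoop_targets))
```

  • In this toy run the directory contacts only the two other sharers, while snooping contacts all 15 other cores, which is the communication cost that a per-area scheme aims to reduce.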
  • This document describes a cache consistency processing method and apparatus that balance the communication overhead and the area overhead of cache consistency processing in on-chip multi-core processors.
  • an embodiment of the present application provides a cache coherency processing method.
  • The method is applied to an on-chip multi-core processor and comprises: a first router receives a consistency maintenance request sent by a first processor core directly connected to it, the consistency maintenance request carrying an identifier of the first processor core; the first router queries a preset consistency maintenance entry according to the identifier and generates a consistency maintenance command for the consistency area to which the first processor core belongs; and the first router sends the generated consistency maintenance command to the other processor cores in the consistency area through the on-chip network.
  • consistency maintenance can be completed only in the consistent area where the cache data block write operation occurs, thereby avoiding the overhead caused by global consistency maintenance.
  • The first router queries the preset consistency maintenance entry according to the identifier of the first processor core, and obtains the identifier of the consistency area to which the first processor core belongs and the identifiers of the other processor cores located in the consistency area; the first router then generates a consistency maintenance command for the consistency area according to the identifier of the consistency area and the identifiers of the other processor cores located in the consistency area.
  • The first router determines at least one route transmission path according to the on-chip network topology state and the identifiers of the other processor cores in the consistency area, where each route transmission path is composed of routers connected to the other processor cores in the consistency area; for each route transmission path, the first router reconstructs the consistency maintenance command to generate a consistency maintenance command for that route transmission path; and, using each route transmission path, the first router sends the consistency maintenance command for that path to the processor cores on the path.
  • The first router determines the identifiers of the routers connected to the other processor cores in the consistency area according to the identifiers of those processor cores; then, according to the identifiers of those routers and the topology state of the on-chip network, the first router performs route discovery according to the XY routing algorithm and determines at least one route transmission path. This route discovery process conveniently and quickly determines the route transmission paths included in the consistency area.
  • The preset consistency maintenance entry may be generated as follows: the resource manager receives a processor resource allocation request sent by the virtual machine, the request being used to ask the resource manager to allocate to the virtual machine at least two processors, including the first processor core; the resource manager generates a consistency maintenance entry for the virtual machine according to the processor resource allocation request, where the consistency maintenance entry includes the identifier of the consistency area and the identifiers of the processor cores included in the consistency area.
  • The method further includes: the resource manager receives a processor resource adjustment request sent by the virtual machine, the request being used to ask the resource manager to adjust the processor cores allocated to the virtual machine; the resource manager adjusts the consistency maintenance entry for the virtual machine according to the processor resource adjustment request.
  • When the processor resource adjustment is to reduce a processor core, the resource manager first writes the cache data in the processor core to be reduced back to the memory, then clears the data of that processor core, and deletes its identifier from the consistency maintenance entry; when the processor resource adjustment is to add a processor core, the resource manager records the identifier of the processor core to be added in the consistency maintenance entry.
  • An embodiment of the present invention provides a processor chip, where the processor chip includes: a plurality of processor cores, including a first processor core; and an on-chip network composed of a plurality of routers, including a first router that is directly connected to the first processor core. The first router comprises: a processor core connection port for connecting to the first processor core; at least one output port for connecting to at least one router of the on-chip network; a cache module configured to store a preset consistency maintenance entry; and a processing module configured to receive, through the processor core connection port, a consistency maintenance request sent by the first processor core, the request carrying the identifier of the first processor core, to query the consistency maintenance entry stored in the cache module according to the identifier of the first processor core, to generate a consistency maintenance command for the consistency area according to the consistency maintenance entry, and to send the consistency maintenance command through the output port to the other processor cores in the consistency area.
  • The processing module is further configured to: query the consistency maintenance entry stored by the cache module according to the identifier of the first processor core, and obtain the identifier of the consistency area to which the first processor core belongs and the identifiers of the other processor cores located in the consistency area; and generate a consistency maintenance command for the consistency area according to the identifier of the consistency area and the identifiers of the other processor cores located in the consistency area.
  • The processing module is further configured to: determine at least one route transmission path according to the on-chip network topology state and the identifiers of the other processor cores in the consistency area, where each route transmission path is composed of routers connected to the other processor cores in the consistency area; for each route transmission path, reconstruct the consistency maintenance command to generate a consistency maintenance command for that route transmission path; and, using each route transmission path, send the consistency maintenance command for that path through the output port to the processor cores on the path.
  • the processing module is further configured to: determine, according to the identifiers of other processor cores in the consistency region, identifiers of routers connected to other processor cores in the consistency region; according to other processor cores in the consistency region The identifier of the connected router and the topology state of the on-chip network, route discovery according to the XY routing algorithm, and determine at least one route transmission path.
  • An embodiment of the present invention provides a computer system, including the processor chip disclosed in the foregoing aspect and a resource manager. The resource manager is configured to receive a processor resource allocation request sent by the virtual machine, where the request is used to ask the resource manager to allocate at least two processors to the virtual machine, and to generate a consistency maintenance entry for the virtual machine according to the request, where the consistency maintenance entry includes the identifier of the consistency area and the identifiers of the processor cores included in the consistency area.
  • The function of the resource manager can be implemented by hardware, or by hardware executing corresponding software.
  • the hardware or software includes one or more modules corresponding to the functions described above.
  • the modules can be software and/or hardware.
  • The resource manager is further configured to: receive a processor resource adjustment request sent by the virtual machine, where the request is used to ask the resource manager to adjust the processor cores allocated to the virtual machine; and adjust the consistency maintenance entry for the virtual machine according to the processor resource adjustment request.
  • The resource manager is further configured to: when the processor resource adjustment request is to reduce a processor core, first write the cache data in the processor core to be reduced back to the memory, then clear the data of that processor core, and delete its identifier from the consistency maintenance entry.
  • The resource manager is further configured to: when the processor resource adjustment is to add a processor core, record the identifier of the processor core to be added in the consistency maintenance entry.
  • an embodiment of the present invention provides a computer storage medium for storing computer software instructions for use in the router, including a program designed to perform the above aspects.
  • an embodiment of the present invention provides a computer storage medium for storing computer software instructions for use by the resource manager, including a program designed to execute the above aspects.
  • The solution provided by the present invention can manage the consistency maintenance area more flexibly, balancing the communication overhead and the area overhead of cache consistency processing in on-chip multi-core processors.
  • FIG. 1 is a schematic diagram of a cloud computing architecture applied to the present invention
  • FIG. 2 is a schematic diagram of a processor chip of the present invention
  • FIG. 3 is a schematic flowchart of a cache consistency processing method according to an embodiment of the present disclosure
  • FIG. 4 is a schematic structural diagram of a router module according to an embodiment of the present invention.
  • the cloud computing architecture applied in this application is divided into four levels from top to bottom, namely:
  • The application layer 100, which runs various applications to provide users with the corresponding services.
  • The operating system layer 200, which includes an operating system (OS) responsible for allocating hardware resources (processor, memory, and network IO) to the applications running on it.
  • The operating system and the applications running on it constitute the software architecture of a virtual machine (VM).
  • An operating system belonging to a virtual machine allocates hardware resources (e.g., processor, memory, network IO) to the applications running on it, within the scope of the hardware resources owned by the virtual machine.
  • FIG. 1 shows two virtual machines: virtual machine 1 includes operating system 1, application 1, and application 2; virtual machine 2 includes operating system 2, application 3, and application 4.
  • The resource management layer 300, which runs a resource manager (RM).
  • Multiple virtual machines are created on a physical machine (PM).
  • Hardware resources include: processor, memory, network IO, and so on.
  • The resource manager is also referred to as a virtual machine monitor (VMM) or a hypervisor.
  • the processor layer 400 includes hardware resources that the resource management layer 300 can manage.
  • FIG. 1 shows a processor, a memory, a network IO, and the like.
  • FIG. 2 shows an on-chip multi-core processor composed of 16 processor cores (cores for short) connected through an on-chip network of 16 routers.
  • Each core is directly connected to a router, and an inter-core network formed by a router can implement inter-core communication.
  • the core and the router can be connected by wire (electrical connection, such as copper wire connection; optical connection, such as fiber connection), or wireless connection.
  • the embodiment of the invention is directed to a solution to the cache consistency problem in a cloud computing scenario.
  • In a cloud computing scenario, at least two virtual machines are created on a single server.
  • When a virtual machine is created, the resource manager is required to allocate hardware resources for it.
  • the resource manager allocates a certain number of processor cores to the virtual machine as processing resources of the virtual machine.
  • the hardware resources owned by any two virtual machines are logically completely isolated.
  • The on-chip multi-core processor includes 16 processor cores, and it is assumed that the resource manager allocates them to two virtual machines: processor cores 0-9 and 12-13 are assigned to virtual machine 1, and processor cores 10-11 and 14-15 are assigned to virtual machine 2. Because processor cores assigned to different virtual machines are logically isolated, the processor cores in FIG. 2 are divided into two consistency areas, as shown in Table 1:
  • Table 1
      Consistency area number | Included processor cores
      Consistency area 1      | Cores 0-9 and cores 12-13
      Consistency area 2      | Cores 10-11 and cores 14-15
  • For virtual machine 1, taking processor core 1 as an example: after core 1 writes a cache data block in its private cache, a consistency maintenance operation must be performed on that cache data block within consistency area 1, where processor core 1 is located. The processing procedure is as follows:
  • Step 310 The first router receives a consistency maintenance request sent by the first processor core, where the consistency maintenance request carries an identifier of the first processor core.
  • The consistency maintenance request is sent by the cache controller of processor core 1, which has completed the data block write operation, and carries the identifier of processor core 1.
  • the identifier of the processor core may be the number of the processor core of the system.
  • Step 320 The first router queries the preset consistency maintenance entry according to the identifier of the first processor core, and obtains the identifier of the consistency area to which the first processor core belongs and the identifiers of the other processor cores located in the consistency area.
  • The consistency maintenance entry is generated by the resource manager according to the processor resource allocation request that it receives from the virtual machine after the virtual machine is created.
  • the processor resource allocation request is used to request the resource manager to allocate at least two processors to the virtual machine.
  • The consistency maintenance entry structure includes: a valid bit (Valid), a consistency area identifier (CDID), and the core identifiers (Core IDs) of the consistency area. The valid bit indicates whether the entry is valid; the consistency area identifier identifies the consistency area; the core identifiers of the consistency area take the form of a bit mask, read from right to left.
  • The maintenance entries of the consistency areas are represented as follows. Because the on-chip multi-core processor shown in FIG. 2 includes 16 cores, a 16-bit vector is used (from right to left, the bits represent core 0 through core 15), as shown in Table 2:
  • the resource manager After the resource manager creates the consistency maintenance entry, it sends the consistency maintenance entry to the router in the corresponding consistency zone.
  • The resource manager sends the consistency maintenance entry of consistency area 1 to the routers included in consistency area 1, and each router can store the consistency maintenance entry in its own cache or in a router register.
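  • The entry layout described above can be sketched in Python. The `ConsistencyEntry` type and the `make_entry` and `mask_bits` helpers are hypothetical names introduced for illustration; the sketch assumes only what the text states, namely a valid bit, a CDID, and a 16-bit core mask in which bit i (counted from the right) stands for core i.

```python
from dataclasses import dataclass

NUM_CORES = 16  # the processor in FIG. 2 has 16 cores


@dataclass(frozen=True)
class ConsistencyEntry:
    valid: bool      # whether the entry is valid
    cdid: int        # consistency area identifier
    core_mask: int   # bit i (from the right) is 1 if core i is in the area


def make_entry(cdid, cores):
    """Build an entry whose mask has one bit set per core in the area."""
    mask = 0
    for core in cores:
        if not 0 <= core < NUM_CORES:
            raise ValueError(f"core {core} is out of range")
        mask |= 1 << core
    return ConsistencyEntry(valid=True, cdid=cdid, core_mask=mask)


def mask_bits(entry):
    """Render the mask as a 16-character string, core 15 on the left."""
    return format(entry.core_mask, f"0{NUM_CORES}b")


# Allocation taken from Table 1.
area1 = make_entry(1, [*range(0, 10), 12, 13])
area2 = make_entry(2, [10, 11, 14, 15])
print(mask_bits(area1))  # 0011001111111111
print(mask_bits(area2))  # 1100110000000000
```

  • Because the two areas are logically isolated, their masks share no bit, and together they cover all 16 cores.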
  • The first router queries the consistency maintenance entry according to the identifier of the first processor core. Specifically, it uses the number of the first processor core to look up the core identifiers of the consistency area in the consistency maintenance entry, and thereby determines the identifier of the consistency area to which the first processor core belongs and the identifiers of the other processor cores located in that consistency area.
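  • The lookup in step 320 can be sketched as follows. The dictionary-based entries and the `lookup` function are illustrative assumptions; the masks are the ones implied by Table 1 with core 0 as the rightmost bit.

```python
NUM_CORES = 16

# Hypothetical stored entries; masks follow the allocation in Table 1.
ENTRIES = [
    {"valid": True, "cdid": 1, "core_mask": 0b0011001111111111},
    {"valid": True, "cdid": 2, "core_mask": 0b1100110000000000},
]


def lookup(entries, core_id):
    """Return (cdid, other cores in the same area), or None if no entry matches."""
    for entry in entries:
        if not entry["valid"]:
            continue
        mask = entry["core_mask"]
        if (mask >> core_id) & 1:
            others = [c for c in range(NUM_CORES)
                      if (mask >> c) & 1 and c != core_id]
            return entry["cdid"], others
    return None


cdid, others = lookup(ENTRIES, 1)
print(cdid, others)
```

  • For core 1 the lookup returns consistency area 1 and the other cores of that area, which is the information step 330 needs to build the command.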
  • Step 330 The first router generates a consistency maintenance command for the consistency area according to the identifier of the consistency area where the first processor core is located, and the identifiers of other processor cores located in the consistency area.
  • the first router generates a consistency maintenance command for the consistency zone according to the identifiers of other processor cores located in the consistency zone.
  • The consistency maintenance command adds fields such as the CDID field and the Core IDs field to the routing packet structure. The meaning of each field is shown in Table 3:
  • Flit Type: defines the consistency command type, for example 0x99;
  • Source address: in this scenario, the identifier of the core that initiates the consistency maintenance request (in the example of this embodiment, the initiator is processor core 1, whose number is 1);
  • CDID: the consistency area identifier in the consistency maintenance entry (in the example of this embodiment, the consistency area identifier is 1);
  • Core IDs: the core identifiers of the consistency area in the consistency maintenance entry (in the example of this embodiment, the corresponding processor cores are cores 0-9 and 12-13);
  • Payload: the consistency operation content carried by the consistency maintenance command, for example updating a variable to a certain value (in the example of this embodiment, setting the value of the variable a to 2).
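  • The fields listed above can be gathered into a small Python structure. The `ConsistencyCommand` type, its attribute names, and the dictionary used for the payload are assumptions made for illustration; field widths and the on-wire layout are not specified here and are not modeled.

```python
from dataclasses import dataclass

FLIT_TYPE_CONSISTENCY = 0x99  # example command type value from the text


@dataclass(frozen=True)
class ConsistencyCommand:
    flit_type: int  # consistency command type
    src: int        # identifier of the core that initiated the request
    cdid: int       # consistency area identifier
    core_ids: int   # bit mask of the cores in the consistency area
    payload: dict   # consistency operation content (assumed representation)


def build_command(src_core, cdid, core_mask, payload):
    """Step 330: assemble a command for the area the source core belongs to."""
    return ConsistencyCommand(
        flit_type=FLIT_TYPE_CONSISTENCY,
        src=src_core,
        cdid=cdid,
        core_ids=core_mask,
        payload=payload,
    )


# Example from the embodiment: core 1, area 1, set variable a to 2.
cmd = build_command(1, 1, 0b0011001111111111, {"a": 2})
print(cmd)
```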
  • Step 340 The first router sends the consistency maintenance command to the other processor cores of the consistency area through the on-chip network.
  • To reach the other processor cores in the consistency area (cores 0-9 and cores 12-13), router 1 can broadcast the generated consistency maintenance command over the on-chip network: router 1 sends the command to the other processor cores on the on-chip multi-core processor (core 0 and cores 2-15) through the other routers in the on-chip network (router 0 and routers 2-15). Each processor core that receives the command compares its own identifier with the Core IDs field of the command. When its corresponding bit in Core IDs is 1, the processor core belongs to the consistency area and performs consistency processing according to the command. When the bit is 0, the processor core does not belong to the consistency area; it discards the command and performs no processing.
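  • The receiver-side check in the broadcast case can be sketched in a few lines of Python. The `handle_command` function and the dictionary standing in for a core's private cache are hypothetical; the sketch shows only the bit test on the Core IDs field.

```python
def handle_command(core_id, core_ids_mask, payload, local_cache):
    """Apply the payload if this core's bit is set in Core IDs, else discard it.

    Returns True when the command was applied and False when it was discarded.
    """
    if (core_ids_mask >> core_id) & 1:
        local_cache.update(payload)  # consistency processing (assumed: update)
        return True
    return False


AREA1_MASK = 0b0011001111111111  # cores 0-9 and 12-13

cache_core5, cache_core10 = {}, {}
applied5 = handle_command(5, AREA1_MASK, {"a": 2}, cache_core5)
applied10 = handle_command(10, AREA1_MASK, {"a": 2}, cache_core10)
print(applied5, cache_core5, applied10, cache_core10)
```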
  • Router 1 acquires the topology state of the on-chip network and the identifiers of the other processor cores in the consistency area, and determines at least one route transmission path according to this information. Specifically, router 1 senses the topology state of the on-chip network, calculates the possible next hops, and confirms the next-hop nodes, that is, routers 0, 2, and 5. For example, in a 2D mesh network, according to the XY routing method, the neighbors (x+1, y), (x-1, y), (x, y+1), and (x, y-1) are calculated, and the first-hop nodes of router 1 are determined to be routers 0, 2, and 5.
  • Router 1 obtains the identifiers of the other processor cores in the consistency area from the consistency maintenance entry and determines that the processor cores connected to routers 0, 2, and 5 belong to consistency area 1. It then selects the next-hop nodes in a similar manner: the second-hop nodes of router 1 are routers 3, 4, 6, and 9, and according to the identifiers of the other processor cores in consistency area 1, the processor cores connected to routers 3, 4, 6, and 9 also belong to consistency area 1. The judgment continues until all routing nodes in consistency area 1 have been traversed. Three route transmission paths are thus obtained: 1 → 0 → 4 → 8 → 12 (westward route), 1 → 5 → 9 → 13 (southward route), and 1 → 2 → 3 (6) → 7 (eastward route).
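  • The route discovery above can be reproduced with a short Python sketch of dimension-order (XY) routing. It assumes a 4x4 mesh whose routers are numbered row by row, with router i attached to core i; this numbering is inferred from the first-hop nodes named in the text (routers 0, 2, and 5 for router 1) and is not stated explicitly. The function names are hypothetical.

```python
WIDTH = 4  # assumed 4x4 mesh, routers numbered row by row


def xy_path(src, dst, width=WIDTH):
    """Dimension-order route: move along X first, then along Y."""
    x, y = src % width, src // width
    dst_x, dst_y = dst % width, dst // width
    path = [src]
    while x != dst_x:
        x += 1 if dst_x > x else -1
        path.append(y * width + x)
    while y != dst_y:
        y += 1 if dst_y > y else -1
        path.append(y * width + x)
    return path


def transmission_paths(src, area_cores, width=WIDTH):
    """Route to every other core in the area and keep only the maximal paths."""
    paths = [xy_path(src, dst, width) for dst in sorted(area_cores) if dst != src]
    # A path that is a prefix of a longer path is already covered by it.
    maximal = [p for p in paths
               if not any(q != p and q[:len(p)] == p for q in paths)]
    return sorted(maximal)


area1 = set(range(0, 10)) | {12, 13}
paths = transmission_paths(1, area1)
for path in paths:
    print(" -> ".join(str(r) for r in path))
```

  • Under these assumptions the sketch yields the westward path 1, 0, 4, 8, 12, the southward path 1, 5, 9, 13, and the two eastward branches 1, 2, 3, 7 and 1, 2, 6. In this example every router on these paths is attached to a core of consistency area 1; the sketch does not check that property for other allocations.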
  • For each route transmission path, the first router reconstructs the consistency maintenance command to generate a consistency maintenance command for that route transmission path, that is, it generates a routing directional sending instruction from the consistency maintenance command, as shown below:
  • Routing Directional Sending Instructions are shown in Table 4:
  • Routing Directional Sending Instructions (Etherward Routing) are shown in Table 5:
  • The eastward route branches at router 2 into 1 → 2 → 3 → 7 and 1 → 2 → 6, so its routing directional sending instruction can be further broken down into the following two:
  • the consistency maintenance command for the routing transmission path is sent to the processor core on the routing transmission path by using each routing transmission path.
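  • One possible way to reconstruct the command per route can be sketched as follows. The field layout of the routing directional sending instructions is not reproduced in this text, so the sketch assumes, purely for illustration, that each per-route command keeps the original fields and narrows Core IDs to the cores reached on that route. The `path_commands` function and the dictionary representation are hypothetical.

```python
def path_commands(paths, base_command):
    """Build one command per first hop, with Core IDs limited to that route."""
    groups = {}
    for path in paths:
        first_hop = path[1]
        # Branches that share a first hop (e.g. the eastward route) are merged.
        groups.setdefault(first_hop, set()).update(path[1:])
    commands = {}
    for first_hop, cores in groups.items():
        mask = 0
        for core in cores:
            mask |= 1 << core
        commands[first_hop] = {**base_command, "core_ids": mask}
    return commands


PATHS = [[1, 0, 4, 8, 12], [1, 5, 9, 13], [1, 2, 3, 7], [1, 2, 6]]
BASE = {"flit_type": 0x99, "src": 1, "cdid": 1, "payload": {"a": 2}}

commands = path_commands(PATHS, BASE)
for first_hop, command in sorted(commands.items()):
    print(first_hop, format(command["core_ids"], "016b"))
```

  • With this assumed layout, the three per-route masks together cover every core of consistency area 1 except the source core 1, so each target core receives the command exactly once.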
  • The processor resource adjustment request is used to request the resource manager to adjust the processor cores allocated to the virtual machine; the resource manager adjusts the consistency maintenance entry for the virtual machine according to the processor resource adjustment request.
  • the above adjustment requests include two types:
  • When a processor core is to be added, the resource manager first writes the cache data in the processor core to be added back to the memory (also referred to as main memory), then clears the data in that processor core, and records the identifier of the processor core to be added in the consistency maintenance entry.
  • When a processor core is to be reduced, the resource manager first writes the cache data in the processor core to be reduced back to the memory, then clears the data of that processor core, and deletes its identifier from the consistency maintenance entry.
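  • The two adjustment types can be sketched with a toy resource manager in Python. The `ResourceManager` class, its method names, and the action log that stands in for the real write-back and clear operations are illustrative assumptions; only the order of the steps follows the description.

```python
class ResourceManager:
    """Toy model: keeps one set of core identifiers per consistency area."""

    def __init__(self):
        self.entries = {}   # cdid -> set of core identifiers
        self.actions = []   # log standing in for real cache operations

    def allocate(self, cdid, cores):
        cores = set(cores)
        if len(cores) < 2:
            raise ValueError("an allocation request asks for at least two cores")
        self.entries[cdid] = cores

    def _write_back_and_clear(self, core):
        # Placeholder for writing cached data back to memory, then clearing it.
        self.actions.append(("write_back", core))
        self.actions.append(("clear", core))

    def add_core(self, cdid, core):
        self._write_back_and_clear(core)
        self.entries[cdid].add(core)       # record the identifier in the entry

    def remove_core(self, cdid, core):
        self._write_back_and_clear(core)
        self.entries[cdid].discard(core)   # delete the identifier from the entry


rm = ResourceManager()
rm.allocate(2, [10, 11, 14, 15])
rm.remove_core(2, 15)
rm.add_core(2, 15)
print(sorted(rm.entries[2]), rm.actions)
```

  • In both cases the cache of the affected core is written back and cleared before the entry changes, so no stale data crosses the boundary between consistency areas.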
  • FIG. 2 shows a block diagram of a design of a processor chip 200 involved in the above embodiment (limited to size, only a portion of the processor chip is shown).
  • the processor chip 200 includes:
  • processor cores including a first processor core 210 (by way of example, core 1 as a first processor core);
  • The on-chip network is composed of a plurality of connected routers, including a first router 220 that is directly connected to the first processor core 210;
  • the first router 220 includes:
  • a processor core connection port 221 for connecting to the first processor core 210
  • At least one output port 222 for connecting to at least one router of the on-chip network
  • the cache module 223 is configured to store a preset consistency maintenance entry.
  • The processing module 224 is configured to: receive, through the processor core connection port 221, a consistency maintenance request sent by the first processor core 210, where the consistency maintenance request carries the identifier of the first processor core 210; query the consistency maintenance entry stored by the cache module 223 according to the identifier of the first processor core, and obtain the identifier of the consistency area to which the first processor core 210 belongs and the identifiers of the other processor cores located in the consistency area; generate a consistency maintenance command for the consistency area according to the identifier of the consistency area and the identifiers of the other processor cores located in it; and send the consistency maintenance command through the output port 222 to the other processor cores in the consistency area.
  • Figure 1 also shows a computer system comprising: a resource manager, a virtual machine running on the resource manager, and a processor chip as provided in the previous embodiment.
  • the resource manager is configured to: receive a processor resource allocation request sent by the virtual machine, where the request asks the resource manager to allocate at least two processors to the virtual machine; and generate, according to the processor resource allocation request, a consistency maintenance entry for the virtual machine, where the entry includes the identifier of the consistency region and the identifiers of the processor cores included in that region.
  • the resource manager is further configured to: receive a processor resource adjustment request sent by the virtual machine, where the request asks the resource manager to adjust the processor cores allocated to the virtual machine; and adjust the consistency maintenance entry for the virtual machine according to the processor resource adjustment request.
  • the resource manager is further configured to: when the processor resource adjustment request removes a processor core, write the cached data of the core to be removed back to memory, then clear that core's data, and delete the identifier of that core from the consistency maintenance entry;
  • and, when the processor resource adjustment request adds a processor core, record the identifier of the core to be added in the consistency maintenance entry.
  • the resource manager can be implemented in software, or in a combination of hardware and software.
  • the resource manager used to carry out the present invention may be a central processing unit (CPU), a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or another programmable logic device, a transistor logic device, a hardware component, or any combination thereof. It can implement or execute the various illustrative logical blocks, modules and circuits described in connection with the present disclosure.
  • the processor may also be a combination that implements computing functions, for example a combination of one or more microprocessors, or a combination of a DSP and a microprocessor.
  • the steps of a method or algorithm described in connection with the present disclosure may be implemented in hardware, or by a processor executing software instructions.
  • the software instructions may consist of corresponding software modules, which may be stored in random access memory (RAM), flash memory, read-only memory (ROM), erasable programmable read-only memory (EPROM), electrically erasable programmable read-only memory (EEPROM), registers, a hard disk, a removable hard disk, a CD-ROM, or any other form of storage medium well known in the art.
  • An exemplary storage medium is coupled to the processor to enable the processor to read information from, and write information to, the storage medium.
  • the storage medium can also be an integral part of the processor.
  • the processor and the storage medium can be located in an ASIC. Additionally, the ASIC can be located in the user equipment. Of course, the processor and the storage medium may also reside as discrete components in the user equipment.
  • the functions described herein can be implemented in hardware, software, firmware, or any combination thereof.
  • the functions may be stored in a computer readable medium or transmitted as one or more instructions or code on a computer readable medium.
  • Computer readable media includes both computer storage media and communication media including any medium that facilitates transfer of a computer program from one location to another.
  • a storage medium may be any available media that can be accessed by a general purpose or special purpose computer.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Hardware Design (AREA)
  • Mathematical Physics (AREA)
  • Memory System Of A Hierarchy Structure (AREA)
  • Multi Processors (AREA)

Abstract

A cache consistency method comprises: a first router (220) receives a consistency preserving request sent by a first processor core which is directly connected to the first router (220), the consistency preserving request carrying an identifier of the first processor core; the first router queries a preset consistency preserving table entry according to the identifier of the first processor core, and generates a consistency preserving command for the consistency region according to the consistency preserving table entry; and the first router sends the consistency preserving command to the other processor cores in the consistency region by means of a network on chip. Cache consistency is thus effectively preserved within a defined region, while the communication and area overhead of an on-chip multi-core processor is also reduced.

Description

Cache consistency processing method and device
Technical field
The present invention relates to the field of cloud computing technologies, and in particular to cache coherency processing in on-chip multi-core processors.
Background
The chip multiprocessor (CMP) architecture has become the mainstream architecture in processor design. A CMP consists of many processor cores connected by a network on chip (NoC) that is formed by routers. The caches in the chip are organized in levels. The level-1 (L1) cache is a private cache located inside a processor core, and the data blocks it stores can be accessed only by that core. The level-2 (L2) cache is a shared cache whose data blocks can be accessed by multiple processor cores. The L2 cache is generally either placed outside the processor cores in a centralized manner or distributed near the individual cores; in the distributed case the cache is physically spread out next to the cores, interconnected through the on-chip network, and logically shared.
In applications of on-chip multi-core processors, some data blocks are accessed by one or more processor cores. Such a data block is usually stored in the shared cache so that those cores can access it. To speed up access, a copy of the data block is created in the private cache of each core that has accessed it, so that when such a core needs the data block again it only has to read it from its own private cache. Because copies of the data block exist in the private caches of several cores, the copies must be kept consistent with one another; this is known as the cache coherence problem. The basic principle of the solution is that when the copy of a data block in one core is modified, the other cores that hold a copy must perform a consistency operation (that is, update or delete their copy), which requires determining which cores of the multi-core processor hold a copy of the data block.
The main approaches to cache coherence are directory-based protocols and snooping-based protocols. A directory-based protocol uses a directory to record which processor cores have accessed a data block. When consistency maintenance is required, the directory entry is queried to determine the cores that hold a copy, and the consistency operation is then performed on that data block in those cores. In a snooping-based protocol all processor cores snoop the bus: after a core modifies a data block in its private cache, it broadcasts a consistency maintenance message on the bus so that the other cores holding a copy perform the consistency operation.
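The difference between the two approaches can be illustrated with a toy model. This is a minimal sketch, not part of the patent: the `Directory` class, the function names, the block address `0x80` and the 16-core count are all assumptions chosen for illustration, and real protocols track per-block states that are omitted here.

```python
from collections import defaultdict


class Directory:
    """Toy directory: maps a block address to the set of cores holding a copy."""

    def __init__(self):
        self.sharers = defaultdict(set)

    def record_read(self, core, addr):
        # A core that reads a block keeps a copy in its private cache.
        self.sharers[addr].add(core)

    def targets_on_write(self, writer, addr):
        # Directory-based: only cores that actually hold a copy are notified.
        return sorted(self.sharers[addr] - {writer})


def snoop_targets_on_write(writer, num_cores):
    # Snooping: the message is broadcast, so every other core observes it.
    return [core for core in range(num_cores) if core != writer]


directory = Directory()
for core in (1, 4, 7):
    directory.record_read(core, 0x80)

print(directory.targets_on_write(1, 0x80))   # [4, 7]
print(len(snoop_targets_on_write(1, 16)))    # 15
```

In this sketch a write by core 1 triggers two targeted messages under the directory scheme and fifteen observed messages under snooping, which is the kind of communication cost the patent sets out to limit.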
With the development of cloud computing, the concept of the resource pool has emerged: the resources of a server (processors, memory, network I/O ports, and so on) are allocated to one or more virtual machines, and resource management software performs this allocation. How to achieve cache coherence for the processor resources allocated to virtual machines in a cloud computing scenario is a problem facing the industry.
Summary of the invention
This document describes a cache consistency processing method and device that balance cache consistency processing against the communication and area overhead of an on-chip multi-core processor.
In one aspect, an embodiment of the present application provides a cache consistency processing method. The method is applied to an on-chip multi-core processor and includes the following: a first router receives a consistency maintenance request sent by a first processor core directly connected to it, where the request carries the identifier of the first processor core; the first router queries a preset consistency maintenance entry according to the identifier of the first processor core and generates a consistency maintenance command for the consistency region to which the first processor core belongs; and the first router sends the generated command through the on-chip network to the other processor cores in that consistency region. In this way, consistency maintenance is completed only within the consistency region in which the write to the cached data block occurred, which avoids the overhead of global consistency maintenance.
In a possible design, the first router queries the preset consistency maintenance entry according to the identifier of the first processor core to obtain the identifier of the consistency region to which the first processor core belongs and the identifiers of the other processor cores located in that region; the first router then generates the consistency maintenance command for the region according to the region identifier and the identifiers of the other processor cores in the region.
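The lookup and command generation described above can be sketched as follows. This is an illustrative model only: the field layout (valid bit, region identifier, one membership bit per core) follows the example entry structure given later in the description, while the class names, function names and the command format are hypothetical and not taken from the patent.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass(frozen=True)
class MaintenanceEntry:
    """One consistency maintenance entry (hypothetical in-memory layout)."""
    valid: bool
    region_id: int
    core_mask: int  # bit i set => core i belongs to the region


@dataclass(frozen=True)
class MaintenanceCommand:
    """Hypothetical command format: region, requesting core, target cores."""
    region_id: int
    source_core: int
    target_cores: Tuple[int, ...]


def find_entry(table: Sequence[MaintenanceEntry],
               core_id: int) -> Optional[MaintenanceEntry]:
    """Return the valid entry whose mask contains core_id, if any."""
    for entry in table:
        if entry.valid and (entry.core_mask >> core_id) & 1:
            return entry
    return None


def build_command(table: Sequence[MaintenanceEntry], core_id: int,
                  num_cores: int = 16) -> MaintenanceCommand:
    """Look up the requesting core's region and address its other cores."""
    entry = find_entry(table, core_id)
    if entry is None:
        raise LookupError(f"core {core_id} is not in any consistency region")
    targets = tuple(
        core for core in range(num_cores)
        if core != core_id and (entry.core_mask >> core) & 1
    )
    return MaintenanceCommand(entry.region_id, core_id, targets)


# Partition used in the description: cores 0-9 and 12-13 form region 1,
# cores 10-11 and 14-15 form region 2.
TABLE = [
    MaintenanceEntry(valid=True, region_id=1, core_mask=0x33FF),
    MaintenanceEntry(valid=True, region_id=2, core_mask=0xCC00),
]

command = build_command(TABLE, core_id=1)
print(command.region_id, command.target_cores)
```

A request from core 1 therefore addresses only the other cores of region 1; cores 10, 11, 14 and 15 never see the command.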
In a possible design, the first router determines at least one routing path according to the topology of the on-chip network and the identifiers of the other processor cores in the consistency region, where each routing path consists of routers connected to other processor cores in the region. For each routing path, the first router reconstructs the consistency maintenance command to generate a command specific to that path, and sends that command along the path to the processor cores on it. This directed routing avoids the risk of broadcast storms that broadcasting would create on the on-chip network, and effectively reduces the load on the routers of the on-chip network.
In a possible design, the first router determines, from the identifiers of the other processor cores in the consistency region, the identifiers of the routers connected to those cores; then, based on those router identifiers and the topology of the on-chip network, the first router performs route discovery according to the XY routing algorithm and determines at least one routing path. This route discovery process makes it quick and convenient to determine the routing paths within the consistency region.
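XY routing itself is a standard dimension-ordered scheme: a packet first travels along the X dimension until it reaches the destination column, then along the Y dimension. The sketch below assumes a 4x4 mesh with routers numbered row by row and router i attached to core i; the patent states only that there are 16 cores and 16 routers, so this layout and the function names are assumptions.

```python
MESH_WIDTH = 4  # assumption: 16 routers arranged as a 4x4 mesh, row-major


def coords(router_id, width=MESH_WIDTH):
    """Map a router number to its (x, y) position in the mesh."""
    return router_id % width, router_id // width


def xy_route(src, dst, width=MESH_WIDTH):
    """Return the routers visited from src to dst: X dimension first, then Y."""
    x, y = coords(src, width)
    dst_x, dst_y = coords(dst, width)
    path = [src]
    while x != dst_x:              # resolve the X offset first
        x += 1 if dst_x > x else -1
        path.append(y * width + x)
    while y != dst_y:              # then resolve the Y offset
        y += 1 if dst_y > y else -1
        path.append(y * width + x)
    return path


print(xy_route(1, 13))  # [1, 5, 9, 13]
print(xy_route(1, 8))   # [1, 0, 4, 8]
```

Under these assumptions, the path from router 1 to router 13 passes through routers 5 and 9, so a single command sent along that path could also serve the cores attached to those routers if they belong to the same region.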
In another possible design, the preset consistency maintenance entry is generated as follows: a resource manager receives a processor resource allocation request sent by a virtual machine, where the request asks the resource manager to allocate at least two processors, including the first processor core, to the virtual machine; the resource manager then generates a consistency maintenance entry for the virtual machine according to the request, where the entry includes the identifier of the consistency region and the identifiers of the processor cores included in that region.
In another possible design, after the resource manager generates the consistency maintenance entry for the virtual machine according to the processor resource allocation request, the method further includes: the resource manager receives a processor resource adjustment request sent by the virtual machine, where the request asks the resource manager to adjust the processor cores allocated to the virtual machine; and the resource manager adjusts the consistency maintenance entry for the virtual machine according to the request.
When the adjustment removes a processor core, the resource manager first writes the cached data of the core to be removed back to memory, then clears that core's data, and deletes the identifier of that core from the consistency maintenance entry. When the adjustment adds a processor core, the resource manager records the identifier of the core to be added in the consistency maintenance entry.
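The ordering constraint in the removal case (write back, then clear, then delete the identifier) can be sketched as below. This is a toy model under stated assumptions: the `ResourceManager` class, its method names, and the `write_back` and `clear_cache` callbacks are hypothetical stand-ins for hardware operations, and each entry is reduced to a bit mask with one bit per core.

```python
class ResourceManager:
    """Toy resource manager keeping one core bit mask per consistency region."""

    def __init__(self, write_back, clear_cache):
        self.entries = {}                # region_id -> core bit mask
        self._write_back = write_back    # hypothetical hook: flush a core's cache to memory
        self._clear_cache = clear_cache  # hypothetical hook: clear a core's cached data

    def allocate(self, region_id, cores):
        """Create the entry for a newly allocated set of cores."""
        mask = 0
        for core in cores:
            mask |= 1 << core
        self.entries[region_id] = mask

    def add_core(self, region_id, core):
        """Adding a core only requires recording its identifier."""
        self.entries[region_id] |= 1 << core

    def remove_core(self, region_id, core):
        """Write back, clear, and only then drop the core from the entry."""
        if not (self.entries[region_id] >> core) & 1:
            raise ValueError(f"core {core} is not in region {region_id}")
        self._write_back(core)
        self._clear_cache(core)
        self.entries[region_id] &= ~(1 << core)


log = []
manager = ResourceManager(
    write_back=lambda core: log.append(("write_back", core)),
    clear_cache=lambda core: log.append(("clear", core)),
)
manager.allocate(region_id=2, cores=[10, 11, 14, 15])
manager.remove_core(2, 15)
manager.add_core(2, 9)

print(log)
print(format(manager.entries[2], "016b"))
```

Writing the data back before the identifier is deleted means that no modified data is lost when the core leaves the region.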
In another aspect, an embodiment of the present invention provides a processor chip. The processor chip includes: a plurality of processor cores, including a first processor core; and an on-chip network formed by a plurality of interconnected routers, including a first router that is directly connected to the first processor core. The first router includes: a processor core connection port for connecting to the first processor core; at least one output port for connecting to at least one router of the on-chip network; a cache module for storing a preset consistency maintenance entry; and a processing module configured to receive, through the processor core connection port, a consistency maintenance request sent by the first processor core, where the request carries the identifier of the first processor core; to query the consistency maintenance entry stored in the cache module according to that identifier and generate, according to the entry, a consistency maintenance command for the consistency region; and to send the command through the output port to the other processor cores in the consistency region.
In a possible design, the processing module is further configured to: query the consistency maintenance entry stored in the cache module according to the identifier of the first processor core, to obtain the identifier of the consistency region to which the first processor core belongs and the identifiers of the other processor cores located in that region; and generate the consistency maintenance command for the region according to the region identifier and the identifiers of the other processor cores in the region.
In a possible design, the processing module is further configured to: determine at least one routing path according to the topology of the on-chip network and the identifiers of the other processor cores in the consistency region, where each routing path consists of routers connected to other processor cores in the region; for each routing path, reconstruct the consistency maintenance command to generate a command specific to that path; and send that command through the output port, along the path, to the processor cores on it.
In a possible design, the processing module is further configured to: determine, from the identifiers of the other processor cores in the consistency region, the identifiers of the routers connected to those cores; and, based on those router identifiers and the topology of the on-chip network, perform route discovery according to the XY routing algorithm and determine at least one routing path.
In yet another aspect, an embodiment of the present invention provides a computer system that includes the processor chip disclosed in the preceding aspect and a resource manager. The resource manager is configured to: receive a processor resource allocation request sent by a virtual machine, where the request asks the resource manager to allocate at least two processors to the virtual machine; and generate, according to the request, a consistency maintenance entry for the virtual machine, where the entry includes the identifier of the consistency region and the identifiers of the processor cores included in that region. The functions of the resource manager can be implemented in hardware, or by hardware executing corresponding software. The hardware or software includes one or more modules corresponding to the functions described above. The modules can be software and/or hardware.
In a possible design, the resource manager is further configured to: receive a processor resource adjustment request sent by the virtual machine, where the request asks the resource manager to adjust the processor cores allocated to the virtual machine; and adjust the consistency maintenance entry for the virtual machine according to the request.
In another possible design, the resource manager is further configured to: when the processor resource adjustment request reduces the processor resources, first write the cached data of the processor core to be removed back to memory, then clear that core's data, and delete the identifier of that core from the consistency maintenance entry.
In another possible design, the resource manager is further configured to: when the processor resource adjustment adds a processor core, record the identifier of the core to be added in the consistency maintenance entry.
In a further aspect, an embodiment of the present invention provides a computer storage medium for storing the computer software instructions used by the above router, including a program designed to carry out the above aspects.
In a further aspect, an embodiment of the present invention provides a computer storage medium for storing the computer software instructions used by the above resource manager, including a program designed to carry out the above aspects.
Compared with the prior art, the solution provided by the present invention manages consistency regions more flexibly, balancing cache consistency processing against the communication and area overhead of the on-chip multi-core processor.
Brief description of the drawings
The drawings used in the description of the embodiments or of the prior art are briefly introduced below.
FIG. 1 is a schematic diagram of the cloud computing architecture to which the present invention applies;
FIG. 2 is a schematic diagram of a processor chip of the present invention;
FIG. 3 is a schematic flowchart of a cache consistency processing method according to an embodiment of the present invention;
FIG. 4 is a schematic structural diagram of a router module according to an embodiment of the present invention.
Detailed description
The technical solutions in the embodiments of the present invention are described clearly below with reference to the accompanying drawings.
The system architecture and service scenarios described in the embodiments of the present invention are intended to explain the technical solutions of the embodiments more clearly and do not limit those solutions. A person of ordinary skill in the art will appreciate that, as system architectures evolve and new service scenarios emerge, the technical solutions provided by the embodiments of the present invention apply equally to similar technical problems.
As shown in FIG. 1, the cloud computing architecture to which this application applies is divided from top to bottom into four layers:
Application layer 100: this layer runs various applications and provides the corresponding services to users.
Operating system layer 200: this layer includes the operating system (OS), which allocates hardware resources (processor, memory, and network I/O) to the applications running on it. An operating system and the applications running on it form the software architecture of a virtual machine (VM). The operating system of a virtual machine allocates hardware resources (for example, processor, memory, and network I/O) to the applications running on it, within the limits of the hardware resources owned by that virtual machine. FIG. 1 shows two virtual machines as an example: virtual machine 1 includes operating system 1 together with application 1 and application 2, and virtual machine 2 includes operating system 2 together with application 3 and application 4.
Resource management layer 300: this layer runs the resource manager (RM). In a cloud computing system, multiple virtual machines are created on a physical machine (PM). When a virtual machine is created, the resource manager allocates hardware resources to it, including processors, memory, and network I/O. In practice the resource manager is also called a virtual machine monitor (VMM) or hypervisor.
Resource layer (Processor Layer) 400: this layer includes the hardware resources that the resource management layer 300 can manage. As an example, FIG. 1 shows processors, memory, and network I/O.
As an example of a processor, FIG. 2 shows an on-chip multi-core processor in which 16 processor cores (cores for short) are connected by an on-chip network formed by 16 routers. Each core is directly connected to one router, and the on-chip network formed by the routers enables communication between cores. Note that a core and its router may be connected by wire (an electrical connection such as copper wire, or an optical connection such as optical fiber) or wirelessly.
The embodiments of the present invention provide a solution to the cache coherence problem in cloud computing scenarios. In such a scenario, at least two virtual machines are created on one server. When a virtual machine is created, the resource manager must allocate hardware resources to it, including a certain number of processor cores that serve as its processing resources. The hardware resources owned by any two virtual machines are logically completely isolated from each other. In this scenario, the cache consistency operations of the on-chip multi-core processor therefore do not need a "global consistency" approach that covers the caches of all processor cores; local consistency processing is sufficient.
Taking FIG. 2 as a concrete example, the on-chip multi-core processor includes 16 processor cores and is assumed to be allocated by the resource manager to two virtual machines. Processor cores 0-9 and 12-13 are allocated to virtual machine 1, and processor cores 10-11 and 14-15 are allocated to virtual machine 2. Because processor cores allocated to different virtual machines are logically isolated, the processor cores in FIG. 2 are divided into two consistency regions, as shown in Table 1:
Consistency region number | Processor cores included
Consistency region 1 | Cores 0-9 and cores 12-13
Consistency region 2 | Cores 10-11 and cores 14-15
Table 1
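The partition in Table 1 can be checked mechanically. The sketch below encodes each region as a 16-bit mask with bit i standing for core i, consistent with the bit-mask form the description uses for the core membership field, and verifies that the two regions are disjoint and together cover all 16 cores. The helper name `cores_to_mask` and the `REGIONS` dictionary are hypothetical.

```python
def cores_to_mask(cores):
    """Build a bit mask with bit i set for every core i in the list."""
    mask = 0
    for core in cores:
        mask |= 1 << core
    return mask


# Core lists taken from Table 1.
REGIONS = {
    1: list(range(0, 10)) + [12, 13],
    2: [10, 11, 14, 15],
}

masks = {region: cores_to_mask(cores) for region, cores in REGIONS.items()}

# Logical isolation: no core may belong to both regions.
disjoint = (masks[1] & masks[2]) == 0
# Coverage: in this example every one of the 16 cores is allocated.
covers_all = (masks[1] | masks[2]) == 0xFFFF

for region, mask in masks.items():
    print(region, format(mask, "016b"))
print(disjoint, covers_all)
```

Printed with the most significant bit on the left, the masks read 0011001111111111 for region 1 and 1100110000000000 for region 2, with core 0 as the rightmost bit.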
For virtual machine 1, take processor core 1 as an example. After core 1 writes to a cache data block in its private cache, a consistency maintenance operation must be performed for that cache data block within consistency region 1, to which processor core 1 belongs. The procedure is as follows:
Step 310: A first router receives a consistency maintenance request sent by a first processor core; the request carries the identifier of the first processor core.
As an implementation example, referring to Fig. 2, the consistency maintenance request is sent by the cache controller in processor core 1 after it completes the data block write, and the request carries the identifier of processor core 1. The identifier of a processor core may be the number the system assigns to that core.
Step 320: The first router queries a preset consistency maintenance entry according to the identifier of the first processor core, and obtains the identifier of the consistency region to which the first processor core belongs and the identifiers of the other processor cores in that consistency region.
The consistency maintenance entry is generated when a virtual machine is created: the resource manager receives the processor resource allocation request sent by the virtual machine and generates the entry according to that request. The processor resource allocation request asks the resource manager to allocate at least two processors to the virtual machine. As an example, the entry structure includes a valid bit (Valid), a consistency region identifier (Coherence Domain Identifier, CDID), and the core membership identifier of the consistency region (Core IDs). The valid bit indicates whether the entry is valid. The CDID identifies the consistency region. The Core IDs field is a bit mask in which each bit, read from right to left, indicates whether the corresponding processor core belongs to the consistency region: 1 means the core belongs to the region, and 0 means it does not. For consistency regions 1 and 2 in Fig. 2, the maintenance entries are shown in Table 2. Because the on-chip multi-core processor in Fig. 2 includes 16 cores, Core IDs is a 16-bit vector whose bits, from right to left, represent core 0 through core 15:
Figure PCTCN2017104021-appb-000001
Table 2
After creating the consistency maintenance entries, the resource manager sends each entry to the routers in the corresponding consistency region. For example, referring to Fig. 2, the resource manager sends the entry for consistency region 1 to the routers included in consistency region 1, and each router may store the entry in its own cache or in a router register.
In this step, the first router may query the entry as follows: it uses the number of the first processor core to look up the Core IDs field of the consistency maintenance entries, and thereby determines the identifier of the consistency region to which the first processor core belongs and the identifiers of the other processor cores in that region.
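The entry layout and the lookup in step 320 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `CoherenceEntry`, `make_mask`, and `lookup` are hypothetical and do not come from the patent, and the two masks are derived from the core assignments in Table 1 rather than read from Table 2.

```python
from dataclasses import dataclass

NUM_CORES = 16  # the example chip in Fig. 2 has 16 cores


@dataclass
class CoherenceEntry:
    """Hypothetical model of one consistency maintenance entry."""
    valid: bool
    cdid: int
    core_ids: int  # bit i, counted from the right, is 1 if core i is in the region


def make_mask(cores):
    """Build a Core IDs bit mask from an iterable of core numbers."""
    mask = 0
    for core in cores:
        mask |= 1 << core
    return mask


def lookup(table, core):
    """Return (cdid, other cores in the region) for the region containing `core`."""
    for entry in table:
        if entry.valid and (entry.core_ids >> core) & 1:
            others = [c for c in range(NUM_CORES)
                      if c != core and (entry.core_ids >> c) & 1]
            return entry.cdid, others
    return None  # the core is not in any valid region


# Entries derived from Table 1 (an assumption about what Table 2 contains).
table = [
    CoherenceEntry(True, 1, make_mask(list(range(0, 10)) + [12, 13])),
    CoherenceEntry(True, 2, make_mask([10, 11, 14, 15])),
]

cdid, others = lookup(table, 1)
```

Formatting `table[0].core_ids` with `format(..., "016b")` gives the same 16-bit string that appears in the Core IDs field of the consistency maintenance command later in this description.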
Step 330: The first router generates a consistency maintenance command for the consistency region according to the identifier of the consistency region in which the first processor core is located and the identifiers of the other processor cores in that region.
In this step, the first router generates the consistency maintenance command for the consistency region from the identifiers of the other processor cores in the region. As an example, the command extends the routing packet structure with fields such as a CDID field and a Core IDs field. The fields are shown in Table 3 and described after it:
Flit Type | Src Addr | Dst Addr | CDID | Core IDs | Payload
0x99 | 1 | - | 1 | 0011001111111111 | Set a=2
Table 3
Flit Type: the consistency command type, for example 0x99;
Src Addr: the source address, which in this solution is the identifier of the core that initiated the consistency maintenance request (in this example, processor core 1, whose number is 1);
Dst Addr: the destination address (reserved here);
CDID: the consistency region identifier from the consistency maintenance entry (in this example, 1);
Core IDs: the core membership identifier of the consistency region from the consistency maintenance entry (in this example, it corresponds to cores 0-9 and cores 12-13);
Payload: the consistency operation carried by the command, such as updating a variable to a given value (in this example, setting variable a to 2).
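As a rough illustration of the field list above, the command in Table 3 could be modeled as a small record. The class name `CoherenceCommand`, the helper `build_command`, and the use of `None` for the reserved Dst Addr field are assumptions made for this sketch, not details given in the patent.

```python
from dataclasses import dataclass
from typing import Optional

COHERENCE_FLIT = 0x99  # example consistency command type from Table 3


@dataclass
class CoherenceCommand:
    """Hypothetical model of the consistency maintenance command fields."""
    flit_type: int
    src_addr: int             # number of the core that issued the request
    dst_addr: Optional[int]   # reserved in this scheme, modeled here as None
    cdid: int                 # consistency region identifier
    core_ids: int             # region membership bit mask
    payload: str              # the consistency operation to apply


def build_command(src_core, cdid, region_mask, payload):
    """Assemble a command from the values obtained in step 320."""
    return CoherenceCommand(COHERENCE_FLIT, src_core, None, cdid,
                            region_mask, payload)


cmd = build_command(1, 1, 0b0011001111111111, "Set a=2")
```

Note that in Table 3 the Core IDs mask still has bit 1 set, so the sketch passes the full region mask, including the requesting core's own bit.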
Step 340: The first router sends the consistency maintenance command through the on-chip network to the other processor cores of the consistency region.
This step can be implemented in two ways:
Method 1: broadcast
As an example, in Fig. 2, router 1 broadcasts the generated consistency maintenance command so that it reaches the other processor cores in the consistency region (that is, cores 0-9 and cores 12-13). Over the on-chip network, router 1 sends the command through the other routers (router 0 and routers 2-15) to the other processor cores on the chip (core 0 and cores 2-15). Each processor core that receives the command compares its own identifier with the Core IDs field of the command. If the corresponding bit is 1, the core belongs to the consistency region and performs consistency processing according to the command. If the corresponding bit is 0, the core does not belong to the consistency region, so it discards the command and performs no processing.
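The per-core check in the broadcast method amounts to a single bit test. The following sketch assumes the 16-core example and uses a hypothetical function name, `should_apply`.

```python
REGION_1_MASK = 0b0011001111111111  # Core IDs field from Table 3


def should_apply(core_id, core_ids_mask):
    """Return True if the receiving core belongs to the region, False if it drops the command."""
    return bool((core_ids_mask >> core_id) & 1)


# Every core except the sender (core 1) receives the broadcast.
receivers = [c for c in range(16) if c != 1]
appliers = [c for c in receivers if should_apply(c, REGION_1_MASK)]
droppers = [c for c in receivers if not should_apply(c, REGION_1_MASK)]
```

In this example, cores 10, 11, 14, and 15 receive the command only to discard it, which is the forwarding overhead that the second method tries to avoid.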
Method 2: directed routing
As an example, in Fig. 2, router 1 obtains the topology state of the on-chip network and the identifiers of the other processor cores in the consistency region, and determines at least one routing transmission path from this information. Specifically, router 1 is aware of the on-chip network topology, computes its possible next hops, and identifies the next-hop nodes, namely routers 0, 2, and 5. For example, in a 2D mesh network using the XY routing method, it evaluates (x+1, y), (x-1, y), (x, y+1), and (x, y-1) and determines that its first-hop nodes are routers 0, 2, and 5. Using the identifiers of the other processor cores obtained from the consistency maintenance entry, router 1 determines that the processor cores connected to routers 0, 2, and 5 belong to consistency region 1. It then selects the next hops in the same way: the second-hop nodes of router 1 are routers 3, 4, 6, and 9, and the identifiers of the other cores in consistency region 1 show that the processor cores connected to these routers also belong to consistency region 1. The process continues until all routing nodes in consistency region 1 have been traversed. This yields three routing transmission paths: 1→0→4→8→12 (westward), 1→5→9→13 (southward), and 1→2→3(6)→7 (eastward).
For each routing transmission path, the first router reconstructs the consistency maintenance command to generate a command specific to that path; that is, it generates a directed routing instruction from the consistency maintenance command, as follows:
(1) The westward directed routing instruction is shown in Table 4:
Flit Type | Src Addr | Dst Addr | CDID | Core IDs | Payload
0x99 | 1 | - | 1 | 0001000100010001 | Set a=2
Table 4
(2) The eastward directed routing instruction is shown in Table 5:
Flit Type | Src Addr | Dst Addr | CDID | Core IDs | Payload
0x99 | 1 | - | 1 | 0000000011001100 | Set a=2
Table 5
(3) The southward directed routing instruction is shown in Table 6:
Flit Type | Src Addr | Dst Addr | CDID | Core IDs | Payload
0x99 | 1 | - | 1 | 0010001000100000 | Set a=2
Table 6
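The three per-direction masks can be reproduced with a short traversal. The sketch below is not the patent's route discovery procedure: it substitutes a plain breadth-first search for the XY-based hop-by-hop selection, and it assumes a 4x4 mesh numbered row by row, router i attached to core i, and a west, east, south, north visiting order. All function names are hypothetical.

```python
from collections import deque

MESH_WIDTH = 4  # assumption: 4x4 mesh, routers numbered row by row as in Fig. 2


def neighbors(router):
    """Mesh neighbours in the order west, east, south, north (an assumed tie-break)."""
    x, y = router % MESH_WIDTH, router // MESH_WIDTH
    candidates = [(x - 1, y), (x + 1, y), (x, y + 1), (x, y - 1)]
    return [cy * MESH_WIDTH + cx for cx, cy in candidates
            if 0 <= cx < MESH_WIDTH and 0 <= cy < MESH_WIDTH]


def directed_masks(source, region):
    """Map each first-hop router to the Core IDs mask of the cores reached through it."""
    first_hop = {}  # router -> first hop on its path from the source
    queue = deque()
    for n in neighbors(source):
        if n in region:
            first_hop[n] = n
            queue.append(n)
    # Breadth-first traversal that never leaves the consistency region.
    while queue:
        current = queue.popleft()
        for n in neighbors(current):
            if n in region and n != source and n not in first_hop:
                first_hop[n] = first_hop[current]
                queue.append(n)
    masks = {}
    for router, hop in first_hop.items():
        masks[hop] = masks.get(hop, 0) | (1 << router)
    return masks


region_1 = set(range(0, 10)) | {12, 13}
masks = directed_masks(1, region_1)
```

With these assumptions the masks for first hops 0, 2, and 5 match the Core IDs fields of Tables 4, 5, and 6. A different visiting order could assign shared routers such as 4 or 6 to another branch, and cores that cannot be reached without leaving the region would be missed, so a real design would need an explicit policy for both cases.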
Note that when the eastward directed routing instruction reaches router 2, there are two branches, 1→2→3→7 and 1→2→6. At router 2, the eastward instruction can therefore be further split into the following two instructions:
(4) The directed routing instruction for 2→3→7 is shown in Table 7:
Flit Type | Src Addr | Dst Addr | CDID | Core IDs | Payload
0x99 | 1 | - | 1 | 0000000010001000 | Set a=2
Table 7
(5) The directed routing instruction for 2→6 is shown in Table 8:
Flit Type | Src Addr | Dst Addr | CDID | Core IDs | Payload
0x99 | 1 | - | 1 | 0000000001000000 | Set a=2
Table 8
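The split at router 2 can be viewed as masking the incoming Core IDs field with the set of cores on each outgoing branch. The sketch below assumes the branch membership is already known to router 2; `split_mask` is a hypothetical helper, not a function named in the patent.

```python
def split_mask(mask, branches):
    """Return {next_hop: sub-mask}, keeping only the bits of the cores on each branch."""
    result = {}
    for next_hop, cores in branches.items():
        branch_mask = 0
        for core in cores:
            branch_mask |= 1 << core
        result[next_hop] = mask & branch_mask
    return result


east_mask = 0b0000000011001100  # Core IDs field from Table 5
# Branches leaving router 2: toward router 3 (cores 3 and 7) and toward router 6 (core 6).
parts = split_mask(east_mask, {3: [3, 7], 6: [6]})
```

Core 2's own bit appears in neither sub-mask, since core 2 is served locally by router 2. The two results are 16-bit masks with bits 3 and 7 set for the first branch and bit 6 set for the second.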
After the directed routing instructions have been reconstructed for the routing transmission paths, each path is used to send its consistency maintenance command to the processor cores on that path.
Directed routing avoids the risk of a broadcast storm on the on-chip network and effectively reduces the packet forwarding load on each routing node of the on-chip network.
As an extension of the above embodiment, the resource manager may receive a processor resource adjustment request sent by the virtual machine, which asks the resource manager to adjust the processor cores allocated to the virtual machine. The resource manager then adjusts the consistency maintenance entry for the virtual machine according to the request.
The adjustment request is one of two types:
(1) Adding processor cores
When the processor resource adjustment request adds processor resources, the resource manager first writes the cached data in the processor core to be added back to memory (also called main memory), then clears the data in that core, and records the identifier of the core in the consistency maintenance entry.
(2) Removing processor cores
When the processor resource adjustment request removes processor resources, the resource manager first writes the cached data in the processor core to be removed back to memory, then clears the data of that core, and deletes the identifier of the core from the consistency maintenance entry.
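Both adjustment types follow the same order: write back, clear, then update the entry. A minimal sketch follows, in which `write_back` and `invalidate` are hypothetical callbacks standing in for whatever cache maintenance mechanism the platform provides.

```python
def add_core(core_ids, core, write_back, invalidate):
    """Write back and clear the core's cache, then set its bit in the region mask."""
    write_back(core)
    invalidate(core)
    return core_ids | (1 << core)


def remove_core(core_ids, core, write_back, invalidate):
    """Write back and clear the core's cache, then clear its bit in the region mask."""
    write_back(core)
    invalidate(core)
    return core_ids & ~(1 << core)


log = []  # records the order of the cache maintenance calls


def record_write_back(core):
    log.append(("write_back", core))


def record_invalidate(core):
    log.append(("invalidate", core))


# Hypothetical adjustment: core 10 leaves region 2 and joins region 1.
region_2 = remove_core(0b1100110000000000, 10, record_write_back, record_invalidate)
region_1 = add_core(0b0011001111111111, 10, record_write_back, record_invalidate)
```

The sketch only updates the masks held by the resource manager; distributing the updated entries to the routers of each region would be a separate step.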
Fig. 2 shows a design block diagram of a processor chip 200 involved in the above embodiment (because of size limits, only part of the processor chip is drawn). The processor chip 200 includes:
a plurality of processor cores, including a first processor core 210 (in the example, core 1 serves as the first processor core);
an on-chip network formed by a plurality of interconnected routers, including a first router 220 that is directly connected to the first processor core 210.
The first router 220 includes:
a processor core connection port 221, configured to connect to the first processor core 210;
at least one output port 222, configured to connect to at least one router of the on-chip network;
a cache module 223, configured to store the preset consistency maintenance entry;
a processing module 224, configured to: receive, through the processor core connection port 221, the consistency maintenance request sent by the first processor core 210, where the request carries the identifier of the first processor core 210; query the consistency maintenance entry stored in the cache module 223 according to the identifier of the first processor core, and obtain the identifier of the consistency region to which the first processor core 210 belongs and the identifiers of the other processor cores in that region; generate a consistency maintenance command for the consistency region according to the identifier of the consistency region in which the first processor core 210 is located and the identifiers of the other processor cores in the region; and send the consistency maintenance command through the output port 222 to the other processor cores in the consistency region.
Fig. 1 also shows a computer system that includes a resource manager, a virtual machine running on the resource manager, and the processor chip provided in the preceding embodiment.
The resource manager is configured to receive a processor resource allocation request sent by the virtual machine, where the request asks the resource manager to allocate at least two processors to the virtual machine; and to generate, according to the request, a consistency maintenance entry for the virtual machine, where the entry includes the identifier of the consistency region and the identifiers of the processor cores included in the consistency region.
Further, the resource manager is also configured to receive a processor resource adjustment request sent by the virtual machine, where the request asks the resource manager to adjust the processor cores allocated to the virtual machine; and to adjust the consistency maintenance entry for the virtual machine according to the request.
Further, the resource manager is also configured to: when the processor resource adjustment request removes a processor core, write the cached data in the core to be removed back to memory, then clear the data of that core, and delete its identifier from the consistency maintenance entry; and
when the processor resource adjustment request adds a processor core, record the identifier of the core to be added in the consistency maintenance entry.
The resource manager may be implemented in software or in a combination of hardware and software. When implemented in hardware, the resource manager may be a central processing unit (CPU), a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field programmable gate array (FPGA) or other programmable logic device, a transistor logic device, a hardware component, or any combination thereof. It can implement or execute the various illustrative logical blocks, modules, and circuits described in connection with this disclosure. The processor may also be a combination that implements computing functions, for example a combination of one or more microprocessors, or a combination of a DSP and a microprocessor.
The steps of a method or algorithm described in connection with this disclosure may be implemented in hardware or by a processor executing software instructions. The software instructions may consist of corresponding software modules, which may be stored in random access memory (RAM), flash memory, read-only memory (ROM), erasable programmable read-only memory (EPROM), electrically erasable programmable read-only memory (EEPROM), registers, a hard disk, a removable hard disk, a CD-ROM, or any other form of storage medium well known in the art. An exemplary storage medium is coupled to the processor so that the processor can read information from, and write information to, the storage medium. The storage medium may also be an integral part of the processor. The processor and the storage medium may reside in an ASIC, and the ASIC may reside in user equipment. Alternatively, the processor and the storage medium may reside in user equipment as discrete components.
Those skilled in the art will appreciate that, in one or more of the above examples, the functions described in the present invention may be implemented in hardware, software, firmware, or any combination thereof. When implemented in software, the functions may be stored in a computer-readable medium or transmitted as one or more instructions or code on a computer-readable medium. Computer-readable media include computer storage media and communication media, where communication media include any medium that facilitates transfer of a computer program from one place to another. A storage medium may be any available medium that a general-purpose or special-purpose computer can access.
The specific embodiments above describe the objectives, technical solutions, and beneficial effects of the present invention in further detail. It should be understood that the above are merely specific embodiments of the present invention and are not intended to limit its scope of protection. Any modification, equivalent replacement, or improvement made on the basis of the technical solutions of the present invention shall fall within the scope of protection of the present invention.

Claims (14)

  1. A cache consistency processing method applied to an on-chip multi-core processor, wherein the method comprises:
    receiving, by a first router, a consistency maintenance request sent by a first processor core directly connected to the first router, wherein the consistency maintenance request carries an identifier of the first processor core;
    querying, by the first router, a preset consistency maintenance entry according to the identifier of the first processor core, and generating a consistency maintenance command for the consistency region according to the consistency maintenance entry; and
    sending, by the first router, the consistency maintenance command through the on-chip network to the other processor cores of the consistency region.
  2. The method according to claim 1, wherein querying, by the first router, the preset consistency maintenance entry according to the identifier of the first processor core, and generating the consistency maintenance command for the consistency region according to the consistency maintenance entry comprises:
    querying, by the first router, the preset consistency maintenance entry according to the identifier of the first processor core, and obtaining an identifier of the consistency region to which the first processor core belongs and identifiers of the other processor cores in the consistency region; and
    generating, by the first router, the consistency maintenance command for the consistency region according to the identifier of the consistency region in which the first processor core is located and the identifiers of the other processor cores in the consistency region.
  3. The method according to claim 1 or 2, wherein sending, by the first router, the consistency maintenance command through the on-chip network to the other processor cores of the consistency region comprises:
    determining, by the first router, at least one routing transmission path according to the topology state of the on-chip network and the identifiers of the other processor cores in the consistency region, wherein each routing transmission path consists of routers connected to the other processor cores in the consistency region;
    for each routing transmission path, reconstructing, by the first router, the consistency maintenance command to generate a consistency maintenance command for that routing transmission path; and
    using each routing transmission path to send the consistency maintenance command for that routing transmission path to the processor cores on the path.
  4. The method according to claim 3, wherein determining, by the first router, the at least one routing transmission path according to the topology state of the on-chip network and the identifiers of the other processor cores in the consistency region comprises:
    determining, by the first router, identifiers of the routers connected to the other processor cores in the consistency region according to the identifiers of those processor cores; and
    performing, by the first router, route discovery using an XY routing algorithm according to the identifiers of the routers connected to the other processor cores in the consistency region and the topology state of the on-chip network, and determining the at least one routing transmission path.
  5. The method according to any one of claims 1-4, wherein before the first router queries the preset consistency maintenance entry according to the identifier of the first processor core, the method further comprises:
    receiving, by a resource manager, a processor resource allocation request sent by a virtual machine, wherein the processor resource allocation request is used to request the resource manager to allocate to the virtual machine at least two processors including the first processor core; and
    generating, by the resource manager, a consistency maintenance entry for the virtual machine according to the processor resource allocation request, wherein the consistency maintenance entry includes the identifier of the consistency region and the identifiers of the processor cores included in the consistency region.
  6. The method according to claim 5, wherein after the resource manager generates the consistency maintenance entry for the virtual machine according to the processor resource allocation request, the method further comprises:
    receiving, by the resource manager, a processor resource adjustment request sent by the virtual machine, wherein the processor resource adjustment request is used to request the resource manager to adjust the processor cores allocated to the virtual machine; and
    adjusting, by the resource manager, the consistency maintenance entry for the virtual machine according to the processor resource adjustment request.
  7. The method according to claim 6, wherein adjusting, by the processor resource management unit, the consistency maintenance entry for the virtual machine according to the processor resource adjustment request comprises:
    when the processor resource adjustment removes a processor core, writing, by the resource manager, the cached data in the processor core to be removed back to memory, then clearing the data of that processor core, and deleting the identifier of that processor core from the consistency maintenance entry.
  8. A processor chip, comprising:
    a plurality of processor cores, including a first processor core;
    an on-chip network formed by a plurality of interconnected routers, including a first router, wherein the first router is directly connected to the first processor core;
    wherein the first router comprises:
    a processor core connection port, configured to connect to the first processor core;
    at least one output port, configured to connect to at least one router of the on-chip network;
    a cache module, configured to store a preset consistency maintenance entry; and
    a processing module, configured to: receive, through the processor core connection port, a consistency maintenance request sent by the first processor core, wherein the consistency maintenance request carries an identifier of the first processor core; query the consistency maintenance entry stored in the cache module according to the identifier of the first processor core, and generate a consistency maintenance command for the consistency region according to the consistency maintenance entry; and send the consistency maintenance command through the output port to the other processor cores of the consistency region.
  9. The processor chip according to claim 8, wherein the processing module is further configured to:
    query the consistency maintenance entry stored in the cache module according to the identifier of the first processor core, and obtain the identifier of the consistency region to which the first processor core belongs and the identifiers of the other processor cores in the consistency region; and
    generate the consistency maintenance command for the consistency region according to the identifier of the consistency region in which the first processor core is located and the identifiers of the other processor cores in the consistency region.
  10. The processor chip according to claim 8 or 9, wherein the processing module is further configured to:
    determine, according to the topology state of the on-chip network and the identifiers of the other processor cores in the consistency area, at least one routing transmission path, wherein each routing transmission path consists of routers connected to other processor cores in the consistency area;
    for each routing transmission path, reconstruct the consistency maintenance command to generate a consistency maintenance command for that routing transmission path; and
    send, through the output port and over each routing transmission path, the consistency maintenance command for that routing transmission path to the processor cores on that routing transmission path.
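One possible reading of the per-path reconstruction in claim 10 is that each path receives a copy of the command restricted to the target cores reachable on that path. The sketch below follows that reading; the function name `commands_per_path`, the router labels, and the greedy longest-path-first grouping are assumptions and are not taken from the claims.

```python
def commands_per_path(paths, core_to_router, target_cores):
    """Split one command's targets into per-path target lists.

    paths          -- list of paths, each a list of router identifiers (assumed input)
    core_to_router -- maps a core identifier to the router it is attached to
    target_cores   -- cores of the consistency area other than the requester
    """
    remaining = set(target_cores)
    result = []
    # Assumption: prefer longer paths so that one path covers as many targets as possible.
    for path in sorted(paths, key=len, reverse=True):
        on_path = set(path)
        covered = sorted(c for c in remaining if core_to_router[c] in on_path)
        if covered:
            result.append({"path": list(path), "targets": covered})
            remaining -= set(covered)
    return result
```

A path that covers no still-unserved core produces no command, so each target core is addressed exactly once.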
  11. The processor chip according to claim 10, wherein the processing module is further configured to:
    determine, according to the identifiers of the other processor cores in the consistency area, identifiers of the routers connected to the other processor cores in the consistency area; and
    perform route discovery using an XY routing algorithm, according to the identifiers of the routers connected to the other processor cores in the consistency area and the topology state of the on-chip network, to determine the at least one routing transmission path.
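XY routing, as named in claim 11, is commonly understood as dimension-ordered routing on a 2D mesh: a packet first travels along the X dimension until the column matches, then along the Y dimension. The sketch below assumes such a mesh with routers addressed by `(x, y)` coordinates; the function names and the core-to-router mapping are hypothetical, and the sketch computes one path per target core rather than the shared paths of claim 10.

```python
def xy_route(src, dst):
    """Return the list of (x, y) router coordinates visited from src to dst.

    Assumes a 2D mesh in which every intermediate router exists and is reachable.
    """
    x, y = src
    path = [(x, y)]
    step_x = 1 if dst[0] > x else -1
    while x != dst[0]:          # X dimension first
        x += step_x
        path.append((x, y))
    step_y = 1 if dst[1] > y else -1
    while y != dst[1]:          # then Y dimension
        y += step_y
        path.append((x, y))
    return path


def paths_for_targets(src_router, target_cores, core_to_router):
    """Map each target core to the XY path from the source router to its router."""
    return {core: xy_route(src_router, core_to_router[core]) for core in target_cores}
```

Because the order of dimensions is fixed, the path between two routers is deterministic, which is one reason this scheme is popular in on-chip networks.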
  12. A computer system, wherein the system comprises:
    a resource manager, configured to: receive a processor resource allocation request sent by a virtual machine, the processor resource allocation request being used to request the resource manager to allocate at least two processors to the virtual machine; and generate, according to the processor resource allocation request, a consistency maintenance entry for the virtual machine, the consistency maintenance entry comprising an identifier of the consistency area and identifiers of the processor cores included in the consistency area; and the processor chip according to any one of claims 7 to 9.
  13. The computer system according to claim 12, wherein the resource manager is further configured to: receive a processor resource adjustment request sent by the virtual machine, the processor resource adjustment request being used to request the resource manager to adjust the processor cores allocated to the virtual machine; and adjust, according to the processor resource adjustment request, the consistency maintenance entry for the virtual machine.
  14. The computer system according to claim 13, wherein the resource manager is further configured to: when the processor resource adjustment request is a request to reduce processor cores, write the cached data in the processor core to be removed back to memory, then clear the data of that processor core, and delete the identifier of that processor core from the consistency maintenance entry.
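The resource manager behaviour in claims 12 to 14 can be sketched as a small Python model. The class `ResourceManager`, its dictionaries for memory and per-core caches, and the method names are all assumptions introduced for illustration; real cache write-back is a hardware operation and is only simulated here.

```python
class ResourceManager:
    """Toy model of the resource manager; all data structures are hypothetical."""

    def __init__(self):
        self.entries = {}      # consistency area id -> set of core ids (maintenance entry)
        self.memory = {}       # address -> value, stands in for main memory
        self.core_caches = {}  # core id -> {address: value}, stands in for cached data

    def allocate(self, area_id, core_ids):
        """Create the consistency maintenance entry for a virtual machine's cores."""
        if len(core_ids) < 2:
            raise ValueError("the allocation request asks for at least two processors")
        self.entries[area_id] = set(core_ids)
        for core in core_ids:
            self.core_caches.setdefault(core, {})

    def remove_core(self, area_id, core_id):
        """Shrink an area: write back, clear, then delete the core's identifier."""
        cache = self.core_caches[core_id]
        self.memory.update(cache)               # 1. write cached data back to memory
        cache.clear()                           # 2. clear the core's data
        self.entries[area_id].discard(core_id)  # 3. delete its identifier from the entry
```

The order of the three steps matters: writing back before clearing keeps the removed core's modified data visible to the cores that remain in the area.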
PCT/CN2017/104021 2016-09-30 2017-09-28 Cache consistency processing method and device WO2018059497A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201610878873.1A CN107894914A (en) 2016-09-30 2016-09-30 Buffer consistency treating method and apparatus
CN201610878873.1 2016-09-30

Publications (1)

Publication Number Publication Date
WO2018059497A1 (en) 2018-04-05

Family

ID=61763220

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2017/104021 WO2018059497A1 (en) 2016-09-30 2017-09-28 Cache consistency processing method and device

Country Status (2)

Country Link
CN (1) CN107894914A (en)
WO (1) WO2018059497A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20210409265A1 (en) * 2021-01-28 2021-12-30 Intel Corporation In-network multicast operations

Families Citing this family (2)

Publication number Priority date Publication date Assignee Title
US10698842B1 (en) * 2019-04-10 2020-06-30 Xilinx, Inc. Domain assist processor-peer for coherent acceleration
CN112131174A (en) * 2019-06-25 2020-12-25 北京百度网讯科技有限公司 Method, apparatus, electronic device, and computer storage medium supporting communication between multiple chips

Citations (4)

Publication number Priority date Publication date Assignee Title
CN101859281A (en) * 2009-04-13 2010-10-13 廖鑫 Method for embedded multi-core buffer consistency based on centralized directory
CN102270180A (en) * 2011-08-09 2011-12-07 清华大学 Multicore processor cache and management method thereof
CN104239270A (en) * 2014-07-25 2014-12-24 浪潮(北京)电子信息产业有限公司 High-speed cache synchronization method and high-speed cache synchronization device
US20150286577A1 (en) * 2012-10-25 2015-10-08 Empire Technology Development Llc Multi-granular cache coherence

Family Cites Families (6)

Publication number Priority date Publication date Assignee Title
US7469321B2 (en) * 2003-06-25 2008-12-23 International Business Machines Corporation Software process migration between coherency regions without cache purges
CN103440223B (en) * 2013-08-29 2017-04-05 西安电子科技大学 A kind of hierarchical system and its method for realizing cache coherent protocol
WO2015051488A1 (en) * 2013-10-08 2015-04-16 华为技术有限公司 Memory sharing method, device and system in aggregation virtualization
US9886382B2 (en) * 2014-11-20 2018-02-06 International Business Machines Corporation Configuration based cache coherency protocol selection
CN105740164B (en) * 2014-12-10 2020-03-17 阿里巴巴集团控股有限公司 Multi-core processor supporting cache consistency, reading and writing method, device and equipment
CN104991868B (en) * 2015-06-09 2018-02-02 浪潮(北京)电子信息产业有限公司 A kind of multi-core processor system and caching consistency processing method

Also Published As

Publication number Publication date
CN107894914A (en) 2018-04-10

Similar Documents

Publication Publication Date Title
US8959310B2 (en) Dynamic network adapter memory resizing and bounding for virtual function translation entry storage
US9952975B2 (en) Memory network to route memory traffic and I/O traffic
US9558041B2 (en) Transparent non-uniform memory access (NUMA) awareness
US8937940B2 (en) Optimized virtual function translation entry memory caching
CN117544581A (en) Shared memory for intelligent network interface card
CN103870435B (en) server and data access method
US20120290695A1 (en) Distributed Policy Service
WO2018059497A1 (en) Cache consistency processing method and device
US8347038B2 (en) Optimizing memory copy routine selection for message passing in a multicore architecture
US11710206B2 (en) Session coordination for auto-scaled virtualized graphics processing
US11954528B2 (en) Technologies for dynamically sharing remote resources across remote computing nodes
WO2015135383A1 (en) Data migration method, device, and computer system
CN101257457A (en) Method for network processor to copy packet and network processor
WO2021184551A1 (en) Communication method and apparatus based on plurality of networks, electronic device, and storage medium
WO2020034729A1 (en) Data processing method, related device, and computer storage medium
WO2020134827A1 (en) Path creation method and apparatus for network-on-chip and electronic device
US20170093736A1 (en) Packet size control using maximum transmission units for facilitating packet transmission
WO2016172862A1 (en) Memory management method, device and system
CN108471384B (en) Method and device for forwarding messages for end-to-end communication
US20200403912A1 (en) Serverless packet processing service with configurable exception paths
WO2016119618A1 (en) Remote memory allocation method, device and system
TW202301133A (en) Memory inclusivity management in computing systems
US10970217B1 (en) Domain aware data migration in coherent heterogenous systems
US20190065419A1 (en) Message routing in a main memory arrangement
US11573719B2 (en) PMEM cache RDMA security

Legal Events

Date Code Title Description
121 Ep: the EPO has been informed by WIPO that EP was designated in this application. Ref document number: 17854947; Country of ref document: EP; Kind code of ref document: A1

NENP Non-entry into the national phase. Ref country code: DE

122 Ep: PCT application non-entry in European phase. Ref document number: 17854947; Country of ref document: EP; Kind code of ref document: A1