WO2013185660A1 - Instruction storage device of network processor and instruction storage method for same - Google Patents

Instruction storage device of network processor and instruction storage method for same Download PDF

Info

Publication number
WO2013185660A1
Authority
WO
WIPO (PCT)
Prior art keywords
instruction
cache
memory
instruction data
low speed
Prior art date
Application number
PCT/CN2013/078736
Other languages
French (fr)
Chinese (zh)
Inventor
郝宇
安康
王志忠
刘衡祁
Original Assignee
ZTE Corporation (中兴通讯股份有限公司)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by ZTE Corporation (中兴通讯股份有限公司)
Publication of WO2013185660A1 publication Critical patent/WO2013185660A1/en

Links

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02Addressing or allocation; Relocation

Definitions

  • Instruction storage device of a network processor and instruction storage method of the device
  • The present invention relates to the field of the Internet, and in particular to an instruction storage device of a network processor and an instruction storage method of the instruction storage device.
  • With the rapid development of the Internet, the interface rate of core routers used for backbone-network interconnection has reached 100 Gbps, which requires a core router's line cards to rapidly process the packets passing through them. Most of the industry currently uses a multi-core network-processor architecture, and instruction fetch efficiency is a key factor affecting the performance of multi-core network processors.
  • Some traditional multi-core network processors use a multi-level cache structure: each micro-engine is equipped with a separate level-1 cache, and a group of micro-engines shares a level-2 cache to achieve storage-space sharing, as shown in Figure 1.
  • These caches are given a large capacity to ensure a good hit rate, but because network packets arrive randomly, instruction locality is weak; a large cache therefore does not guarantee fetch efficiency and also wastes a great deal of resources.
  • Other network processors use a polling instruction storage scheme to store the instructions required by a group of microengines in the same number of random access memories (RAMs) as the microengines, as shown in Figure 2.
  • The four micro-engines in the figure poll the instructions in the four RAMs through an arbitration module. Each micro-engine accesses all of the RAMs in turn, and their accesses are always in different "phases", so different micro-engines never collide on the same RAM, realizing storage-space sharing.
  • Embodiments of the present invention provide an instruction storage device of a network processor and an instruction storage method of the instruction storage device, which can save hardware resources.
  • An embodiment of the present invention provides an instruction storage device for a network processor, including: a fast memory (Qmem), a cache (cache), a first low-speed instruction memory, and a second low-speed instruction memory, where:
  • the network processor includes two or more micro-engine large groups, each micro-engine large group includes N micro-engines, and the N micro-engines are divided into two or more micro-engine groups;
  • Each microengine corresponds to a Qmem and a cache, the Qmem is connected to the microengine, and the cache is connected to the Qmem;
  • Each micro-engine group corresponds to a first low-speed instruction memory, and a cache corresponding to each of the micro-engine groups is connected to the first low-speed instruction memory;
  • Each micro-engine large group corresponds to a second low-speed instruction memory, and a cache corresponding to each micro-engine in the micro-engine large group is connected to the second low-speed instruction memory.
  • The Qmem is configured to: after receiving the instruction data request sent by the micro-engine, determine whether the Qmem holds the instruction data; if so, return the instruction data to the micro-engine, and if not, send the instruction data request to the cache.
  • The Qmem stores the instructions of the address segment with the most demanding processing-quality requirements.
  • The cache includes two Cache Lines, each of which stores multiple consecutive instructions. The cache is configured to: after receiving the instruction data request sent by the Qmem, determine whether the cache holds the instruction data; if so, return the instruction data to the micro-engine through the Qmem, and if not, send the instruction data request to the first low-speed instruction memory or the second low-speed instruction memory.
  • the two Cache Lines process the message with a ping-pong operation, and the ping-pong operation is synchronized with the ping-pong operation of the message store.
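  • As a reading aid, the Qmem → Cache → low-speed-memory lookup chain described in the items above can be sketched in a few lines. This is an illustrative model only: the class name, the dictionary-backed memories, and the IMEM address range are assumptions of the sketch, not details from the patent.

```python
# Illustrative model of the hierarchical instruction fetch (assumed names):
# Qmem hit -> returned directly; Cache hit -> returned via Qmem;
# miss -> the request is steered to IMEM or IMEM-COM by address segment.
class InstructionFetchHierarchy:
    def __init__(self, qmem, cache, imem, imem_com, imem_range):
        self.qmem = qmem              # fast memory: address -> instruction
        self.cache = cache            # ping-pong cache: address -> instruction
        self.imem = imem              # per-group low-speed instruction memory
        self.imem_com = imem_com      # per-large-group low-speed instruction memory
        self.imem_range = imem_range  # (lo, hi) address segment served by IMEM

    def fetch(self, addr):
        if addr in self.qmem:         # Qmem hit: returned to the micro-engine
            return self.qmem[addr]
        if addr in self.cache:        # Cache hit: returned via the Qmem
            return self.cache[addr]
        lo, hi = self.imem_range      # miss: steer by address segment
        backing = self.imem if lo <= addr < hi else self.imem_com
        data = backing[addr]
        self.cache[addr] = data       # the Cache updates its content and tag
        return data
```

  • In this model a request that hits the Qmem never reaches the low-speed memories, which is what lets the scheme reserve the fast SRAM for the most performance-critical address segment.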
  • the instruction storage device further includes a first arbitration module, a second arbitration module, and a third arbitration module, wherein:
  • Each microengine corresponds to a first arbitration module, and the first arbitration module is connected to a cache of each microengine;
  • Each micro-engine group corresponds to a second arbitration module, one end of the second arbitration module is connected to the first arbitration module of each micro-engine in the micro-engine group, and the other end is connected to the first low-speed instruction memory;
  • Each microengine large group corresponds to a third arbitration module, one end of the third arbitration module is connected to a first arbitration module of each microengine in the microengine large group, and the other end is connected to the second low speed instruction memory. Connected.
  • The first arbitration module is configured to: when the cache sends the instruction data request, determine whether the requested instruction is located in the first low-speed instruction memory or in the second low-speed instruction memory; when the requested instruction is in the first low-speed instruction memory, send the instruction data request to the first low-speed instruction memory, and when it is in the second low-speed instruction memory, send the instruction data request to the second low-speed instruction memory; and receive the instruction data returned by the first or second low-speed instruction memory and return it to the cache;
  • the second arbitration module is configured to: when receiving instruction data requests sent by one or more first arbitration modules, select one instruction data request and send it to the first low-speed instruction memory for processing, and return the instruction data fetched from the first low-speed instruction memory to the first arbitration module;
  • the third arbitration module is configured to: when receiving instruction data requests sent by one or more first arbitration modules, select one instruction data request and send it to the second low-speed instruction memory for processing, and return the instruction data fetched from the second low-speed instruction memory to the first arbitration module.
  • the cache is further configured to: update the cached content and the tag after receiving the instruction data returned by the first arbitration module.
  • Each microengine large group includes 32 microengines, which are divided into 4 microengine groups, and each microengine group includes 8 microengines.
  • An embodiment of the present invention further provides a method for storing instructions using the instruction storage device described above, the method including:
  • the fast memory (Qmem), after receiving the instruction data request sent by the micro-engine, determines whether it holds the instruction data; if so, it returns the instruction data to the micro-engine, and if not, it sends the instruction data request to the cache;
  • a Cache Line in the cache determines whether the cache holds the instruction data; if so, it returns the instruction data to the micro-engine through the Qmem, and if not, it sends the instruction data request to the first low-speed instruction memory or the second low-speed instruction memory;
  • after receiving the instruction data request sent by the cache, the first low-speed instruction memory looks up the instruction data and returns the found instruction data to the cache; and
  • after receiving the instruction data request sent by the cache, the second low-speed instruction memory looks up the instruction data and returns the found instruction data to the cache.
  • The instruction storage method further includes: a Cache Line in the cache sends the instruction data request to the first arbitration module when it determines that the cache does not hold the instruction data; the first arbitration module sends the instruction data request to the first low-speed instruction memory when it determines that the requested instruction is located there, and sends the instruction data request to the second low-speed instruction memory when it determines that the requested instruction is located in the second low-speed instruction memory.
  • the method for storing an instruction further includes:
  • when the first arbitration module determines that the requested instruction is located in the first low-speed instruction memory, it sends the instruction data request to the second arbitration module; when the second arbitration module receives instruction data requests from one or more first arbitration modules, it selects one instruction data request and sends it to the first low-speed instruction memory; and
  • when the first arbitration module determines that the requested instruction is located in the second low-speed instruction memory, it sends the instruction data request to the third arbitration module; when the third arbitration module receives instruction data requests from one or more first arbitration modules, it selects one instruction data request and sends it to the second low-speed instruction memory.
  • In the fast-memory-and-cache-based instruction storage scheme for a multi-core network processor provided by the embodiment of the present invention, a fast memory, a small-capacity ping-pong cache, and low-speed dynamic RAM (DRAM) are combined, and the memories use a hierarchical grouping strategy. This instruction storage scheme effectively guarantees a high fetch efficiency for some instructions and a high average fetch efficiency, saves a large amount of hardware storage resources, and keeps the compiler very simple to implement.
  • FIG. 1 is a schematic structural diagram of a conventional two-level cache.
  • FIG. 2 is a schematic structural diagram of an instruction storage scheme of a polling mode.
  • FIG. 3 is a block diagram showing the structure of an instruction storage device in accordance with Embodiment 1 of the present invention.
  • FIG. 4 is a schematic structural diagram of a specific instruction storage device according to an embodiment of the present invention.
  • FIG. 5 is a schematic diagram of a ping-pong operation of a message memory and an icache according to an embodiment of the present invention.
  • Figure 6 is a process flow diagram of an instruction storage device in accordance with an embodiment of the present invention.
  • FIG. 7 is a detailed process flow diagram of an instruction storage device in accordance with an embodiment of the present invention.
  • FIG. 8 is a process diagram of a Cache Line operation in a cache module according to an embodiment of the present invention.
  • a fast memory (Quick Memory, referred to as Qmem);
  • a small-capacity ping-pong cache (Cache); and
  • a low-speed RAM, for example, an Instruction Memory (IMEM).
  • the instruction storage device of this embodiment is as shown in Fig. 3, and the following structure is employed.
  • A micro-engine large group includes N micro-engines, and the N micro-engines are divided into two or more micro-engine groups. Each micro-engine corresponds to one Qmem and one Cache, each micro-engine group corresponds to a first low-speed instruction memory (hereinafter IMEM), and the N micro-engines of the large group correspond to a second low-speed instruction memory (hereinafter IMEM-COM).
  • the Qmem is set to: after receiving the instruction data request sent by the microengine, determine whether the Qmem has instruction data, and if so, return the instruction data to the microengine, and if not, send the instruction data request to the cache.
  • The Qmem stores the instructions of the address segment with the most demanding processing-quality requirements and can be implemented with SRAM, whose read/write speed is fast. The content of the Qmem is not updated during message processing.
  • Qmem can return the required instruction data of the micro engine in one clock cycle, which greatly improves the efficiency of indexing.
  • The Cache has two Cache Lines, and each Cache Line can store multiple consecutive instructions. A Cache Line is configured to determine, after receiving the instruction data request sent by the Qmem, whether the cache holds the instruction data; if so, it returns the instruction data to the micro-engine via the Qmem, and if not, it sends the instruction data request to the IMEM or the IMEM-COM.
  • the two Cache Lines process the message with a ping-pong operation, and the ping-pong operation is synchronized with the ping-pong operation of the message memory.
  • The IMEM and the IMEM-COM are respectively configured to store instructions located in different address segments, look up instruction data according to the instruction data request, and return the instruction data.
  • Hierarchical memory can effectively utilize the difference in the probability of instruction execution, thereby optimizing the efficiency of the micro-engine fetching instructions. Since more low-speed memory is used, hardware resources are saved.
  • The apparatus further includes a first arbitration module (arbiter1), a second arbitration module (arbiter2), and a third arbitration module (arbiter3).
  • Each micro-engine corresponds to an arbiter1, and the arbiter1 is connected to the cache of that micro-engine;
  • each micro-engine group corresponds to an arbiter2, one end of which is connected to the arbiter1 of each micro-engine in the group, and the other end is connected to the IMEM; and
  • each micro-engine large group corresponds to an arbiter3, one end of which is connected to the arbiter1 of each micro-engine in the large group, and the other end is connected to the IMEM-COM.
  • The arbiter1 is configured to: when the cache sends an instruction data request, determine whether the requested instruction is located in the IMEM or in the IMEM-COM; when the requested instruction is in the IMEM, send the instruction data request to the IMEM, and when it is in the IMEM-COM, send the instruction data request to the IMEM-COM; and receive the instruction data returned by the IMEM or the IMEM-COM and return it to the cache;
  • the arbiter2 is configured to: when receiving instruction data requests sent by one or more arbiter1 modules, select one instruction data request and send it to the IMEM for processing, and return the instruction data fetched by the IMEM to the arbiter1; and
  • the arbiter3 is configured to: when receiving instruction data requests sent by one or more arbiter1 modules, select one instruction data request and send it to the IMEM-COM for processing, and return the instruction data fetched by the IMEM-COM to the arbiter1.
  • Each large group of 32 micro-engines can be divided into 4 groups, each group including 8 micro-engines.
  • Each micro-engine corresponds to one Qmem and one Cache (comprising two instruction caches (icache)); each group of 8 micro-engines shares an IMEM, and each large group of 32 micro-engines shares an IMEM-COM.
  • A1 represents arbiter1;
  • A2 represents arbiter2; and
  • A3 represents arbiter3.
  • The two icaches correspond to the two message memories in the ME; they work in turn to mask the delay of message storage and instruction fetching.
  • the instruction storage method of the instruction storage device is as shown in FIG. 6, and includes the following steps.
  • Step 1: After receiving the instruction data request sent by the micro-engine, the Qmem determines whether it holds the instruction data; if so, it returns the instruction data to the micro-engine, and if not, it sends the instruction data request to the cache.
  • Step 2: After receiving the instruction data request sent by the Qmem, a Cache Line in the cache determines whether the cache holds the instruction data; if so, it returns the instruction data to the micro-engine through the Qmem, and if not, it sends the instruction data request to the IMEM or the IMEM-COM.
  • Step 3: After receiving the instruction data request sent by the cache, the IMEM looks up the instruction data and returns the found instruction data to the cache; after receiving the instruction data request sent by the cache, the IMEM-COM likewise looks up the instruction data and returns the found instruction data to the cache.
  • Step 110: The micro-engine sends the required instruction address and address enable to its Qmem. When the message memory in the micro-engine receives a message, it sends the instruction address and address enable in the message to the instruction storage device, that is, to the Qmem corresponding to the micro-engine.
  • Step 120: The Qmem determines whether the instruction address is within the address range of the instructions it stores. If yes, step 130 is performed; if not, step 140 is performed.
  • Step 130: The Qmem uses the instruction address and address enable to fetch the instruction data and returns the instruction data to the micro-engine; the fetch process ends.
  • Step 140: The Qmem transmits the instruction address and address enable to the Cache of the micro-engine.
  • Step 150: The Cache determines whether the instruction address is within the address range of the instructions it stores. If yes, step 160 is performed; if not, step 170 is performed. Since each half of the Cache has only one Cache Line, the Cache holds only one piece of tag information, so when an address request arrives it can immediately determine from the tag whether the required data is in the Cache: the tag bits of the instruction address are compared with the tag of the currently working Cache Line; if they are the same, the instruction is in the Cache, and if they differ, it is not.
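  • Because the working half of the Cache holds a single Cache Line, the hit test in step 150 reduces to one tag comparison. A minimal sketch, assuming the tag is simply the instruction address with the in-line offset bits stripped (the function name and bit widths are hypothetical):

```python
def cache_hit(instr_addr, line_tag, offset_bits):
    # With a single working Cache Line there is exactly one tag:
    # strip the in-line offset bits and compare against the stored tag.
    return (instr_addr >> offset_bits) == line_tag
```

  • For example, with 4-instruction lines (offset_bits = 2), addresses 20 through 23 all map to tag 5, so any of them hits a line whose tag is 5.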
  • Step 160: The Cache extracts the instruction data at the corresponding location in the Cache Line based on the address and sends it to the micro-engine through the Qmem; the fetch process ends.
  • Step 170: The Cache sends the instruction address and address enable to the first arbitration module (arbiter1).
  • Step 180: arbiter1 determines, based on the instruction address, whether the instruction is in the IMEM corresponding to the micro-engine's group or in the IMEM-COM corresponding to the micro-engine's large group. If it is in the IMEM, step 190 is performed; if it is in the IMEM-COM, step 210 is performed.
  • Step 190: arbiter1 sends the instruction address and address enable to the second arbitration module (arbiter2).
  • Step 200: arbiter2 selects one instruction request and sends it to the IMEM; the IMEM fetches the instruction data according to the instruction address and address enable in the request and returns the instruction data to the Cache through arbiter1, after which step 230 is performed.
  • When the arbiter1 modules of multiple micro-engines initiate fetch requests to arbiter2, arbiter2 processes the requests of each Cache by polling, selecting one fetch request and sending it to the IMEM for processing; since the data return takes multiple clock cycles, a branch that has already issued a request is no longer polled.
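  • The polling behaviour just described — round-robin selection, with branches that have an outstanding multi-cycle request excluded from polling until their data returns — can be sketched as follows; the class and method names are hypothetical, not from the patent:

```python
class PollingArbiter:
    """Round-robin arbiter that skips branches with an outstanding request
    until their data has returned (the backing RAM takes multiple cycles)."""
    def __init__(self, n_ports):
        self.n = n_ports
        self.last = -1             # last granted port, for round-robin fairness
        self.outstanding = set()   # ports waiting for data to return

    def grant(self, requests):
        # requests: set of port indices currently requesting a fetch
        for i in range(1, self.n + 1):
            port = (self.last + i) % self.n
            if port in requests and port not in self.outstanding:
                self.last = port
                self.outstanding.add(port)  # stop polling this branch
                return port
        return None                # nothing grantable this cycle

    def data_returned(self, port):
        # the low-speed memory delivered data: the branch is pollable again
        self.outstanding.discard(port)
```

  • The same sketch applies to arbiter3, with the IMEM-COM as the backing memory.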
  • Step 210: arbiter1 sends the instruction address and address enable to the third arbitration module (arbiter3).
  • The function of arbiter3 is the same as the function of arbiter2, except that arbiter3 arbitrates the requests from the arbiter1 of each micro-engine in the large group and forwards the selected request to the IMEM-COM.
  • Step 230: The Cache updates the contents of the Cache Line and the Tag and returns the instruction data to the micro-engine through the Qmem; the fetch process ends.
  • Figure 8 shows the structure of the icache in Figure 5. After the icache receives the instruction address sent by the Qmem, it compares the address with the Tag to judge whether there is a hit. If it is a hit, then after decoding, the instruction content is fetched from the physical storage location of the icache according to the address enable and output through the multiplexer. If it is a miss, the request continues to the low-speed instruction memory, which fetches the instruction data, and the returned instruction data is output through the multiplexer.
  • If the Cache Line 1 used by the current message finds the corresponding instruction data in the Cache, no read request is issued to the lower-level low-speed instruction memory (IMEM or IMEM-COM). If Cache Line 2 detects the first instruction address of the next message, Cache Line 2 issues a read request to the lower-level low-speed instruction memory with that first address to obtain the instruction data required by the next message. After the packet of the current Cache Line 1 is processed, the Cache switches to the other half, Cache Line 2, to prepare for processing the next packet.
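  • A minimal sketch of this ping-pong operation between the two Cache Lines, assuming the caller supplies the low-speed fetch function; the class name, method names, and prefetch length are illustrative assumptions:

```python
class PingPongCache:
    def __init__(self, fetch_from_low_speed):
        self.lines = [{}, {}]      # the two Cache Lines
        self.active = 0            # index of the line serving the current message
        self.fetch = fetch_from_low_speed

    def prefetch_next(self, first_addr, n_instr):
        # While the active line serves the current message, the idle line
        # issues a read for the next message's first instructions,
        # hiding the low-speed memory's latency.
        idle = 1 - self.active
        self.lines[idle] = {a: self.fetch(a)
                            for a in range(first_addr, first_addr + n_instr)}

    def switch(self):
        # The current message is done: swap lines for the next message.
        self.active = 1 - self.active
```

  • The switch mirrors the ping-pong operation of the two message memories, which is why the patent keeps the two synchronized.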
  • Processing messages with the ping-pong operation can effectively hide the message storage time and the delay of fetching instructions from the low-speed instruction memory, so the required instructions can be obtained immediately and fetch efficiency is increased, which improves the processing efficiency of the micro-engine.
  • The instruction storage scheme of the embodiment of the invention effectively ensures a high fetch efficiency for a part of the instructions and a high average fetch efficiency, saves a large amount of hardware storage resources, and keeps the compiler very simple to implement.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Memory System Of A Hierarchy Structure (AREA)
  • Multi Processors (AREA)

Abstract

An instruction storage device of a network processor and an instruction storage method for same. The device comprises: quick memories (Qmems), buffers, a first low-speed instruction memory and a second low-speed instruction memory. The network processor comprises more than two micro engine large groups, each micro engine large group comprising N micro engines, and the N micro engines being divided into more than two micro engine subgroups; each micro engine corresponds to one Qmem and one buffer, the Qmem being connected to the micro engine, and the buffer being connected to the Qmem; each micro engine subgroup corresponds to one first low-speed instruction memory, the buffer corresponding to each micro engine in the micro engine subgroup being connected to the first low-speed instruction memory; and each micro engine large group corresponds to one second low-speed instruction memory. In this solution, a high instruction fetch efficiency is ensured, a large amount of hardware storage resources is saved, and the realization of a compiler is made simpler.

Description

Instruction storage device of a network processor and instruction storage method of the device
Technical field
The present invention relates to the field of the Internet, and in particular to an instruction storage device of a network processor and an instruction storage method of the instruction storage device.
Background
With the rapid development of the Internet, the interface rate of core routers used for backbone-network interconnection has reached 100 Gbps, which requires a core router's line cards to rapidly process the packets passing through them. Most of the industry currently uses a multi-core network-processor architecture, and instruction fetch efficiency is a key factor affecting the performance of multi-core network processors.
In a network processor system with a multi-core structure, the micro-engines (Micro Engines, MEs) in the same group have the same instruction requirements. Due to chip-area and process limitations, it is impossible to equip each micro-engine with an exclusive storage space for these instructions. A scheme is therefore needed that lets a group of micro-engines share one instruction storage space while maintaining high fetch efficiency.
Some traditional multi-core network processors use a multi-level cache structure: for example, each micro-engine is equipped with a separate level-1 cache, and a group of micro-engines shares a level-2 cache to achieve storage-space sharing, as shown in Figure 1. These caches are given a large capacity to ensure a good hit rate, but because network packets arrive randomly, instruction locality is weak; a large cache therefore does not guarantee fetch efficiency and also wastes a great deal of resources.
Other network processors use a polling instruction storage scheme, storing the instructions required by a group of micro-engines in the same number of random access memories (RAM) as there are micro-engines. As shown in Figure 2, the four micro-engines in the figure poll the instructions in the four RAMs through an arbitration module. Each micro-engine accesses all of the RAMs in turn, and their accesses are always in different "phases", so different micro-engines never collide on the same RAM, realizing storage-space sharing. However, the instructions contain a large number of jump instructions. Suppose that, for a pipelined micro-engine, n clock cycles elapse from fetching a jump instruction to completing the jump; to guarantee that the target of a jump instruction lies in the (n+1)-th RAM after the RAM holding the jump instruction, empty instructions must be inserted when writing the instructions so that the jump target lands in the correct position. When jump instructions account for a large proportion, many empty instructions must be inserted, wasting a large amount of instruction space and increasing the complexity of the compiler. Moreover, this scheme requires every RAM to return data within one clock cycle, so it must be implemented with static RAM (SRAM), and the heavy use of SRAM also incurs a large resource overhead.
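The rotating-"phase" access pattern of this prior-art polling scheme can be illustrated with a one-line mapping; the function name and the choice of four RAMs are assumptions made for the sketch:

```python
def ram_for_engine(engine_id, cycle, n_rams=4):
    # Engine e reads RAM (e + t) mod N in cycle t: every engine visits every
    # RAM in turn, and within any one cycle the engine-to-RAM mapping is a
    # permutation, so two engines never access the same RAM simultaneously.
    return (engine_id + cycle) % n_rams
```

Because the mapping is a permutation in every cycle, the collision-free sharing follows directly; the NOP-insertion problem the patent criticizes arises because a jump's target must land in a specific RAM in the rotation.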
发明内容 Summary of the invention
本发明实施例提供一种网络处理器的指令存储装置及该指令存储装置的 指令存储方法, 能够节约硬件资源。  Embodiments of the present invention provide an instruction storage device of a network processor and an instruction storage method of the instruction storage device, which can save hardware resources.
本发明实施例提供了一种网络处理器的指令存储装置, 包括: 快速存储 器(Qmem ) 、 緩存(cache ) 、 第一低速指令存储器和第二低速指令存储器, 其中:  An embodiment of the present invention provides an instruction storage device for a network processor, including: a fast memory (Qmem), a cache (cache), a first low-speed instruction memory, and a second low-speed instruction memory, where:
所述网络处理器包括两个以上的微引擎大组, 每个微引擎大组包括 N个 微引擎, 所述 N个微引擎分成两个以上的微引擎小组;  The network processor includes two or more micro-engine large groups, each micro-engine large group includes N micro-engines, and the N micro-engines are divided into two or more micro-engine groups;
每个微引擎对应一个 Qmem和一个緩存,所述 Qmem与所述微引擎连接, 所述緩存与所述 Qmem相连;  Each microengine corresponds to a Qmem and a cache, the Qmem is connected to the microengine, and the cache is connected to the Qmem;
每个微引擎小组对应一个第一低速指令存储器, 所述微引擎小组中每个 微引擎对应的緩存与所述第一低速指令存储器相连; 以及  Each micro-engine group corresponds to a first low-speed instruction memory, and a cache corresponding to each of the micro-engine groups is connected to the first low-speed instruction memory;
每个微引擎大组对应一个第二低速指令存储器, 所述微引擎大组中每个 微引擎对应的緩存与所述第二低速指令存储器相连。  Each micro-engine large group corresponds to a second low-speed instruction memory, and a cache corresponding to each micro-engine in the micro-engine large group is connected to the second low-speed instruction memory.
可选地 ,  Optionally,
所述 Qmem设置成: 在接收到所述微引擎发送的指令数据请求后, 判断 本 Qmem是否有指令数据, 如果有, 则将所述指令数据返回给所述微引擎, 如果没有, 则向所述緩存发送所述指令数据请求。  The Qmem is configured to: after receiving the instruction data request sent by the microengine, determine whether the Qmem has instruction data, and if yes, return the instruction data to the microengine, if not, then The cache sends the instruction data request.
可选地 ,  Optionally,
所述 Qmem中存储对处理质量要求最高的一个地址段的指令。  The Qmem stores an instruction for an address segment that has the highest processing quality.
可选地 , 所述緩存包括两个 Cache Line, 每个 Cache Line存放多条连续的指令; 所述緩存设置成: 在接收到所述 Qmem发送的指令数据请求后, 判断本 緩存是否有所述指令数据, 如果有, 则将所述指令数据通过所述 Qmem返回 给所述微引擎, 如果没有, 则向所述第一低速指令存储器或所述第二低速指 令存储器发送所述指令数据请求。 Optionally, The cache includes two cache lines, each cache line stores a plurality of consecutive instructions; the cache is set to: after receiving the instruction data request sent by the Qmem, determining whether the cache has the instruction data, if If yes, the instruction data is returned to the microengine through the Qmem, and if not, the instruction data request is sent to the first low speed instruction memory or the second low speed instruction memory.
可选地 ,  Optionally,
所述两个 Cache Line釆用乒乓操作处理报文, 且所述乒乓操作与报文存 储器的乒乓操作同步。  The two Cache Lines process the message with a ping-pong operation, and the ping-pong operation is synchronized with the ping-pong operation of the message store.
Optionally,

the instruction storage device further includes a first arbitration module, a second arbitration module and a third arbitration module, wherein:

each microengine corresponds to one first arbitration module, and the first arbitration module is connected to the cache of the corresponding microengine;

each microengine group corresponds to one second arbitration module, one end of the second arbitration module being connected to the first arbitration module of each microengine in the microengine group and the other end being connected to the first low-speed instruction memory; and

each microengine large group corresponds to one third arbitration module, one end of the third arbitration module being connected to the first arbitration module of each microengine in the microengine large group and the other end being connected to the second low-speed instruction memory.
Optionally,

the first arbitration module is configured to: upon receiving the instruction data request from the cache, determine whether the requested instruction is located in the first low-speed instruction memory or in the second low-speed instruction memory; when the requested instruction is located in the first low-speed instruction memory, send the instruction data request to the first low-speed instruction memory; when the requested instruction is located in the second low-speed instruction memory, send the instruction data request to the second low-speed instruction memory; and receive the instruction data returned by the first low-speed instruction memory or the second low-speed instruction memory and return the instruction data to the cache;

the second arbitration module is configured to: when receiving instruction data requests sent by one or more first arbitration modules, select one instruction data request and send it to the first low-speed instruction memory for processing, and return the instruction data fetched by the first low-speed instruction memory to the corresponding first arbitration module; and

the third arbitration module is configured to: when receiving instruction data requests sent by one or more first arbitration modules, select one instruction data request and send it to the second low-speed instruction memory for processing, and return the instruction data fetched by the second low-speed instruction memory to the corresponding first arbitration module.
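The first arbitration module's routing decision is made from the instruction address. A sketch (illustrative only; the concrete address ranges are assumptions — the disclosure only states that the two low-speed memories hold different address segments):

```python
def route_request(addr, imem_base, imem_limit):
    """Decide which lower-level memory a missed request goes to.
    [imem_base, imem_limit) is the segment held by the group's IMEM."""
    if imem_base <= addr < imem_limit:
        return "IMEM"      # first low-speed memory, shared by one group
    return "IMEM_COM"      # second low-speed memory, shared by the large group
```

The second and third arbitration modules then serialize the requests arriving from the several first arbitration modules on each path.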
Optionally,

the cache is further configured to: after receiving the instruction data returned by the first arbitration module, update the cache contents and the tag.
Optionally,

each microengine large group includes 32 microengines, the 32 microengines are divided into 4 microengine groups, and each microengine group includes 8 microengines.
An embodiment of the present invention further provides a method for storing instructions with the instruction storage device described above, the method including:

a fast memory (Qmem), after receiving an instruction data request sent by a microengine, determines whether this Qmem holds the requested instruction data; if it does, it returns the instruction data to the microengine; if it does not, it sends the instruction data request to a cache;

a Cache Line in the cache, after receiving the instruction data request sent by the Qmem, determines whether this cache holds the instruction data; if it does, it returns the instruction data to the microengine through the Qmem; if it does not, it sends the instruction data request to a first low-speed instruction memory or a second low-speed instruction memory;

the first low-speed instruction memory, after receiving the instruction data request sent by the cache, looks up the instruction data and returns the found instruction data to the cache; and

the second low-speed instruction memory, after receiving the instruction data request sent by the cache, looks up the instruction data and returns the found instruction data to the cache.
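The steps of the method form a single walk down the hierarchy, which can be sketched as follows (a behavioral model for illustration; the dict-backed memories and the `(base, limit)` range are assumptions, not part of the disclosure):

```python
def fetch(addr, qmem, cache, imem, imem_com, imem_range):
    """Walk the storage hierarchy in the order of the method:
    Qmem -> cache -> first/second low-speed instruction memory."""
    if addr in qmem:            # Qmem hit: served immediately
        return qmem[addr]
    if addr in cache:           # Cache Line hit
        return cache[addr]
    base, limit = imem_range
    backing = imem if base <= addr < limit else imem_com
    data = backing[addr]        # low-speed memory lookup
    cache[addr] = data          # refill the cache on the way back
    return data
```

Each lower level is consulted only when the level above misses, so the fast paths dominate the average fetch latency.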
Optionally, the method further includes:

when a Cache Line in the cache determines that this cache does not hold the instruction data, it sends the instruction data request to the first arbitration module; when the first arbitration module determines that the requested instruction is located in the first low-speed instruction memory, it sends the instruction data request to the first low-speed instruction memory; when the first arbitration module determines that the requested instruction is located in the second low-speed instruction memory, it sends the instruction data request to the second low-speed instruction memory.
Optionally, the method further includes:

when the first arbitration module determines that the requested instruction is located in the first low-speed instruction memory, it sends the instruction data request to the second arbitration module; when the second arbitration module receives instruction data requests from one or more first arbitration modules, it selects one instruction data request and sends it to the first low-speed instruction memory; and

when the first arbitration module determines that the requested instruction is located in the second low-speed instruction memory, it sends the instruction data request to the third arbitration module; when the third arbitration module receives instruction data requests from one or more first arbitration modules, it selects one instruction data request and sends it to the second low-speed instruction memory.
In the instruction storage scheme based on fast memory and cache for multi-core network processors provided by the embodiments of the present invention, a fast memory, a small-capacity cache operated in ping-pong fashion, and low-speed dynamic RAM (Dynamic RAM, DRAM) are combined, and the memories adopt a hierarchical grouping strategy. This instruction storage scheme effectively guarantees a high instruction fetch efficiency for a portion of the instructions together with a high average fetch efficiency, saves a large amount of hardware storage resources, and keeps the compiler implementation very simple.
Brief Description of the Drawings

FIG. 1 is a schematic structural diagram of a conventional two-level cache.

FIG. 2 is a schematic structural diagram of a polling-based instruction storage scheme.

FIG. 3 is a schematic structural diagram of an instruction storage device according to Embodiment 1 of the present invention.

FIG. 4 is a schematic structural diagram of a specific instruction storage device according to an embodiment of the present invention.

FIG. 5 is a schematic diagram of the ping-pong operation of the packet memory and the icache according to an embodiment of the present invention.

FIG. 6 is a processing flowchart of the instruction storage device according to an embodiment of the present invention.

FIG. 7 is a detailed processing flowchart of an instruction storage device according to an embodiment of the present invention.

FIG. 8 is a process diagram of the operation of one Cache Line in the cache module according to an embodiment of the present invention.
Preferred Embodiments of the Invention

In the embodiments of the present invention, a fast memory (Quick Memory, Qmem for short), a small-capacity cache (Cache) operated in ping-pong fashion, and low-speed RAM (e.g., a low-speed instruction memory (Instruction Memory, IMEM for short)) are combined to serve as the instruction store of the microengines.

The embodiments of the present invention are described in detail below with reference to the accompanying drawings. It should be noted that, provided there is no conflict, the embodiments of the present application and the features in the embodiments may be combined with one another arbitrarily.
Embodiment 1

The instruction storage device of this embodiment, shown in FIG. 3, adopts the following structure.
A microengine large group includes N microengines, and the N microengines are divided into two or more microengine groups. Each microengine corresponds to one Qmem and one Cache; each microengine group corresponds to one first low-speed instruction memory (hereinafter IMEM); and the N microengines of the large group correspond to one second low-speed instruction memory (hereinafter IMEM_COM). As shown in FIG. 3, the Qmem is connected to the microengine, and the cache is connected to the Qmem; the cache corresponding to each microengine in a microengine group is connected to the first low-speed instruction memory; and the cache corresponding to each microengine in the large group is connected to the second low-speed instruction memory.

The Qmem is configured to: after receiving an instruction data request sent by the microengine, determine whether this Qmem holds the instruction data; if it does, return the instruction data to the microengine; if it does not, send the instruction data request to the cache. The Qmem stores the instructions of the one address segment with the highest processing-quality requirements and can be implemented with SRAM, whose read/write speed is high. The contents of the Qmem are never updated during packet processing; when the microengine needs these instructions, the Qmem can return the requested instruction data within one clock cycle, which greatly improves the fetch efficiency.

The Cache has two Cache Lines, and each Cache Line can store a plurality of consecutive instructions. A Cache Line is configured to: after receiving the instruction data request sent by the Qmem, determine whether this cache holds the instruction data; if it does, return the instruction data to the microengine through the Qmem; if it does not, send the instruction data request to the IMEM or the IMEM_COM. The two Cache Lines process packets in a ping-pong manner, and this ping-pong operation is synchronized with the ping-pong operation of the packet memory.

The IMEM and the IMEM_COM are each configured to: store a block of instructions located in a distinct address segment, look up instruction data in response to an instruction data request, and return the instruction data.

Across the four storage locations — Qmem, Cache, IMEM and IMEM_COM — the access speed decreases in turn. The hierarchical memories effectively exploit the differing execution probabilities of instructions, thereby optimizing the efficiency with which the microengines fetch instructions; and because the larger share of storage uses low-speed memory, hardware resources are saved.
Optionally, the device further includes a first arbitration module (arbiter1), a second arbitration module (arbiter2) and a third arbitration module (arbiter3). Each microengine corresponds to one arbiter1, which is connected to the cache of that microengine; each microengine group corresponds to one arbiter2, one end of which is connected to the arbiter1 of each microengine in the group and the other end to the IMEM; and each microengine large group corresponds to one arbiter3, one end of which is connected to the arbiter1 of each microengine in the large group and the other end to the IMEM_COM.

The arbiter1 is configured to: upon receiving the instruction data request from the cache, determine whether the requested instruction is located in the IMEM or in the IMEM_COM; when the requested instruction is located in the IMEM, send the instruction data request to the IMEM; when it is located in the IMEM_COM, send the instruction data request to the IMEM_COM; and receive the instruction data returned by the IMEM or the IMEM_COM and return it to the cache.

The arbiter2 is configured to: when receiving instruction data requests sent by one or more arbiter1 modules, select one instruction data request, send it to the IMEM for processing, and return the instruction data fetched by the IMEM to the corresponding arbiter1.

The arbiter3 is configured to: when receiving instruction data requests sent by one or more arbiter1 modules, select one instruction data request, send it to the IMEM_COM for processing, and return the instruction data fetched by the IMEM_COM to the corresponding arbiter1.
Taking N=32 as an example, the 32 microengines of each large group can be divided into 4 groups of 8 microengines each. As shown in FIG. 4, each microengine corresponds to one Qmem and one Cache (comprising two instruction caches (icache)); the 8 microengines of each group share one IMEM, and the 32 microengines of each large group share one IMEM_COM. In FIG. 4, A1 denotes arbiter1, A2 denotes arbiter2, and A3 denotes arbiter3. As shown in FIG. 5, the two icaches correspond one-to-one to the two packet memories in the ME; they work in turn to hide the latency of packet storage and instruction fetch.
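For the N=32 example, the mapping from a microengine's index to the memories it uses can be written down directly (the naming scheme below is purely illustrative; only the 4×8 grouping comes from the embodiment):

```python
GROUP_SIZE = 8   # microengines per group (from the embodiment)
GROUPS = 4       # groups per large group

def memories_for(me_index):
    """Map a microengine index (0..31) to its private and shared memories."""
    assert 0 <= me_index < GROUP_SIZE * GROUPS
    return {
        "qmem": f"Qmem_{me_index}",                # private, one per microengine
        "cache": f"Cache_{me_index}",              # private, two Cache Lines
        "imem": f"IMEM_{me_index // GROUP_SIZE}",  # shared by the 8 of one group
        "imem_com": "IMEM_COM",                    # shared by all 32
    }
```

The sharing ratios (1:1, 8:1, 32:1) mirror the decreasing access speed of the four levels.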
Embodiment 2

Corresponding to the instruction storage device shown in FIG. 3, the instruction storage method of the device, shown in FIG. 6, includes the following steps.

Step 1: after receiving an instruction data request sent by the microengine, the Qmem determines whether this Qmem holds the instruction data; if it does, it returns the instruction data to the microengine; if it does not, it sends the instruction data request to the cache.

Step 2: after receiving the instruction data request sent by the Qmem, a Cache Line in the cache determines whether this cache holds the instruction data; if it does, it returns the instruction data to the microengine through the Qmem; if it does not, it sends the instruction data request to the IMEM or the IMEM_COM.

Step 3: after receiving the instruction data request sent by the cache, the IMEM looks up the instruction data and returns the found instruction data to the cache; likewise, after receiving the instruction data request sent by the cache, the IMEM_COM looks up the instruction data and returns the found instruction data to the cache.
For any microengine, the instruction fetch process, shown in FIG. 7, includes the following steps.

Step 110: the microengine sends the required instruction address and address enable to its Qmem.

When the packet memory in the microengine receives a packet, it sends the instruction address and address enable carried in the packet to the instruction storage device, i.e., to the Qmem corresponding to that microengine.

Step 120: the Qmem determines whether the instruction address is within the address range of the instructions it stores; if so, step 130 is performed; if not, step 140 is performed.

Step 130: the Qmem fetches the instruction data using the instruction address and address enable and returns the instruction data to the microengine; this fetch process ends.

Step 140: the Qmem forwards the instruction address and address enable to the Cache of the microengine.

Step 150: the Cache determines whether the instruction address is within the address range of the instructions it stores; if so, step 160 is performed; if not, step 170 is performed.

Since each half of the Cache has only one Cache Line, the Cache tag (Tag) holds the information of a single tag; when an address request arrives at the Cache, whether the required data is in the Cache can be determined immediately from the Tag: the relevant bits of the instruction address are compared with the Tag of the currently working Cache Line; if they are equal, the instruction is in the Cache; if they differ, the instruction is not in the Cache.
Step 160: based on the address enable, the Cache reads the instruction data from the corresponding position in the Cache Line and sends it to the microengine through the Qmem; this fetch process ends.

Step 170: the Cache sends the instruction address and address enable to the first arbitration module (arbiter1).

Step 180: arbiter1 determines whether the instruction address lies in the IMEM corresponding to the microengine group of this microengine or in the IMEM_COM corresponding to its large group; if in the IMEM, step 190 is performed; if in the IMEM_COM, step 210 is performed.

arbiter1 determines from the instruction address whether the instruction is in the IMEM or in the IMEM_COM.

Step 190: arbiter1 sends the instruction address and address enable to the second arbitration module (arbiter2).

Step 200: arbiter2 selects one instruction request and sends it to the IMEM; the IMEM fetches the instruction data according to the instruction address and address enable in the request and returns the instruction data to the Cache through arbiter1; step 230 is performed.

When the arbiter1 modules of several microengines issue fetch requests to arbiter2 at the same time, arbiter2 handles the Cache requests by polling and selects one fetch request to send to the IMEM; since the data return takes several clock cycles, a branch that has already issued a request is not polled again until its data returns.
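The polling behavior just described — grant one request per cycle, round-robin, and skip any branch whose reply is still outstanding — can be sketched as follows (an illustrative model; class and method names are assumptions, not from the disclosure):

```python
class RoundRobinArbiter:
    """Sketch of arbiter2/arbiter3: one grant per cycle, round-robin,
    with branches awaiting data masked out of the polling."""

    def __init__(self, n_ports):
        self.n = n_ports
        self.next_port = 0
        self.outstanding = set()   # branches waiting for instruction data

    def grant(self, pending):
        for i in range(self.n):
            port = (self.next_port + i) % self.n
            if port in pending and port not in self.outstanding:
                self.next_port = (port + 1) % self.n
                self.outstanding.add(port)
                return port
        return None                # nothing grantable this cycle

    def complete(self, port):
        # Data returned after several cycles; the branch may request again.
        self.outstanding.discard(port)
```

Masking outstanding branches keeps a single slow reply from being granted twice while its data is still in flight.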
Step 210: arbiter1 sends the instruction address and address enable to the third arbitration module (arbiter3).

Step 220: arbiter3 selects one instruction request and sends it to the IMEM_COM; the IMEM_COM fetches the instruction data according to the instruction address and address enable in the request and returns the instruction data to the Cache through arbiter1; step 230 is performed.

The arbiter corresponding to each microengine functions as arbiter1 does, and arbiter3 functions as arbiter2 does.

Step 230: the Cache updates the contents of the Cache Line and the Tag and returns the instruction data to the microengine through the Qmem; this fetch process ends.

FIG. 8 is a schematic structural diagram of the icache in FIG. 5. After receiving the instruction address sent by the Qmem, the icache compares it with the Tag to determine whether it is a hit. On a hit, after decoding, the instruction content is read from the physical storage location of the icache according to the address enable and output through a multiplexer. On a miss, the request continues to the low-speed instruction memory to fetch the instruction data, and the returned instruction data is output through the multiplexer.
When processing a given packet, only one Cache Line in the Cache is used. While Cache Line 1, used by the current packet, is finding the corresponding instruction data in the Cache without issuing read requests to the lower-level low-speed instruction memory (IMEM or IMEM_COM), if Cache Line 2 detects a request carrying the first instruction address of the next packet, Cache Line 2 issues a read request to the lower-level low-speed instruction memory with that first instruction address so as to obtain the instruction data needed for the next packet. Once the current packet of Cache Line 1 has been processed, the Cache switches to the other half, Cache Line 2, ready to process the next packet. Processing packets with this ping-pong operation effectively hides the packet storage time and the latency of fetching from the low-speed instruction memory; the needed instructions are available the moment the microengine switches to the next packet, which improves the fetch efficiency and hence the processing efficiency of the microengine.
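The role swap between the two Cache Lines can be sketched in a few lines (a behavioral model for illustration; the dict-backed memory and method names are assumptions):

```python
class PingPongCache:
    """Sketch of the two-Cache-Line ping-pong: one line serves the
    current packet while the idle line prefetches the next packet's
    first instructions, hiding the low-speed-memory latency."""

    def __init__(self, backing):
        self.backing = backing     # low-speed memory: first address -> block
        self.lines = [None, None]  # blocks held by the two Cache Lines
        self.active = 0            # index of the line for the current packet

    def prefetch_next(self, first_addr):
        # The idle line fetches the head instructions of the next packet.
        self.lines[1 - self.active] = self.backing[first_addr]

    def switch_packet(self):
        # Swap roles: the prefetched line becomes the working line.
        self.active = 1 - self.active
        return self.lines[self.active]
```

In the model the prefetch is instantaneous; in hardware it overlaps with the processing of the current packet, which is the point of the scheme.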
Those of ordinary skill in the art will understand that all or some of the steps of the above method may be implemented by a program instructing the relevant hardware, the program being stored in a computer-readable storage medium such as a read-only memory, a magnetic disk or an optical disc. Optionally, all or some of the steps of the above embodiments may also be implemented with one or more integrated circuits. Accordingly, each module/unit in the above embodiments may be implemented in the form of hardware or in the form of a software function module. The embodiments of the present invention are not limited to any specific combination of hardware and software.
Of course, the present invention may also have various other embodiments; those skilled in the art can make corresponding changes and variations without departing from the spirit and essence of the present invention, and all such corresponding changes and variations shall fall within the protection scope of the claims appended to the present invention.
Industrial Applicability

The instruction storage scheme of the embodiments of the present invention effectively guarantees a high instruction fetch efficiency for a portion of the instructions and a high average fetch efficiency, saves a large amount of hardware storage resources, and keeps the compiler implementation very simple.

Claims

1. An instruction storage device of a network processor, the network processor comprising two or more microengine large groups, each microengine large group comprising N microengines, the N microengines comprising two or more microengine groups, the instruction storage device comprising: a fast memory (Qmem), a cache, a first low-speed instruction memory and a second low-speed instruction memory, wherein:

each microengine corresponds to one Qmem and one cache, the Qmem is arranged to be connected to the microengine, and the cache is connected to the Qmem;

each microengine group corresponds to one first low-speed instruction memory, and the cache corresponding to each microengine in the microengine group is connected to the first low-speed instruction memory; and

each microengine large group corresponds to one second low-speed instruction memory, and the cache corresponding to each microengine in the microengine large group is connected to the second low-speed instruction memory.
2. The instruction storage device according to claim 1, wherein:

the Qmem is configured to: after receiving an instruction data request sent by the microengine, determine whether this Qmem holds the requested instruction data; if it does, return the instruction data to the microengine; if it does not, send the instruction data request to the cache.
3. The device according to claim 1 or 2, wherein:

the Qmem stores the instructions of the one address segment with the highest processing-quality requirements.
4. The instruction storage device according to claim 1, wherein:

the cache comprises two Cache Lines, each Cache Line storing a plurality of consecutive instructions; and a Cache Line is configured to: after receiving the instruction data request sent by the Qmem, determine whether this cache holds the instruction data; if it does, return the instruction data to the microengine through the Qmem; if it does not, send the instruction data request to the first low-speed instruction memory or the second low-speed instruction memory.
5. The device according to claim 4, wherein:

the two Cache Lines process packets in a ping-pong manner, and the ping-pong operation is synchronized with the ping-pong operation of the packet memory.
6. The instruction storage device according to claim 1, 2, 4 or 5, further comprising a first arbitration module, a second arbitration module and a third arbitration module, wherein:

each microengine corresponds to one first arbitration module, and the first arbitration module is connected to the cache of the corresponding microengine;

each microengine group corresponds to one second arbitration module, one end of the second arbitration module being connected to the first arbitration module of each microengine in the microengine group and the other end being connected to the first low-speed instruction memory; and

each microengine large group corresponds to one third arbitration module, one end of the third arbitration module being connected to the first arbitration module of each microengine in the microengine large group and the other end being connected to the second low-speed instruction memory.
7. The instruction storage device according to claim 6, wherein:

the first arbitration module is configured to: upon receiving the instruction data request from the cache, determine whether the requested instruction is located in the first low-speed instruction memory or in the second low-speed instruction memory; when determining that the requested instruction is located in the first low-speed instruction memory, send the instruction data request to the first low-speed instruction memory; when determining that the requested instruction is located in the second low-speed instruction memory, send the instruction data request to the second low-speed instruction memory; and receive the instruction data returned by the first low-speed instruction memory or the second low-speed instruction memory and return the instruction data to the cache;

the second arbitration module is configured to: upon receiving instruction data requests sent by one or more first arbitration modules, select one instruction data request and send it to the first low-speed instruction memory for processing, and return the instruction data fetched by the first low-speed instruction memory to the first arbitration module; and

the third arbitration module is configured to: upon receiving instruction data requests sent by one or more first arbitration modules, select one instruction data request and send it to the second low-speed instruction memory for processing, and return the instruction data fetched by the second low-speed instruction memory to the first arbitration module.
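The selection step shared by the second and third arbitration modules in claim 7 (several first arbitration modules may compete for the same low-speed instruction memory, and exactly one request is granted) can be illustrated with a minimal arbiter sketch. The claim does not specify an arbitration policy, so the round-robin scheme below is an assumption for illustration only, and the class and method names are hypothetical:

```python
class RoundRobinArbiter:
    """Sketch of the second/third arbitration module's selection step:
    among the requesters with a pending instruction data request, grant
    exactly one per cycle. Round-robin is an illustrative assumption;
    the patent does not fix a particular selection policy."""

    def __init__(self, num_requesters):
        self.num_requesters = num_requesters
        # Start one position "behind" requester 0 so requester 0 wins first.
        self.last_grant = num_requesters - 1

    def grant(self, request_mask):
        """request_mask[i] is True if requester i has a pending request.
        Returns the index of the granted requester, or None if idle."""
        for offset in range(1, self.num_requesters + 1):
            i = (self.last_grant + offset) % self.num_requesters
            if request_mask[i]:
                self.last_grant = i
                return i
        return None
```

With eight first arbitration modules all requesting at once, successive calls rotate the grant (0, then 1, and so on), so no requester is starved while only one request per cycle reaches the low-speed instruction memory.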
8. The instruction storage device according to claim 7, wherein:

the cache is further configured to: after receiving the instruction data returned by the first arbitration module, update the cache content and the tag.
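The refill behavior of claim 8 (on returned instruction data, the cache updates both its content and its tag) can be sketched as follows. The direct-mapped organization and all names are assumptions for illustration; the claims do not fix the cache geometry:

```python
class CacheLine:
    """One line of the instruction cache: a valid bit, a tag, and data."""
    def __init__(self):
        self.valid = False
        self.tag = None
        self.data = None

class InstructionCache:
    """Sketch of the cache refill in claim 8: when instruction data comes
    back via the first arbitration module, the selected line's content and
    tag are updated together. Direct mapping is an illustrative assumption."""

    def __init__(self, num_lines):
        self.num_lines = num_lines
        self.lines = [CacheLine() for _ in range(num_lines)]

    def lookup(self, addr):
        """Return the cached instruction data, or None on a miss
        (a miss would forward the request to the first arbitration module)."""
        line = self.lines[addr % self.num_lines]
        if line.valid and line.tag == addr // self.num_lines:
            return line.data
        return None

    def refill(self, addr, data):
        """Update cache content and tag with returned instruction data."""
        line = self.lines[addr % self.num_lines]
        line.data = data
        line.tag = addr // self.num_lines
        line.valid = True
```

Updating data and tag together is what keeps a later hit check (valid bit plus tag comparison) consistent: a line whose tag was updated without its data would return stale instructions.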
9. The instruction storage device according to claim 1, 2, 4, 5, 7, or 8, wherein each microengine large group comprises 32 microengines, the 32 microengines being divided into 4 microengine groups, and each microengine group comprising 8 microengines.
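Combining the grouping of claim 9 with the per-level arbitration modules of claim 6 gives a fixed wiring budget, which the small sketch below computes. The function and dictionary field names are hypothetical; only the counts follow from the claims:

```python
def build_topology(num_microengines=32, group_size=8):
    """Sketch of the hierarchy in claims 6 and 9: a large group of 32
    microengines split into 4 groups of 8, with one first arbitration
    module per microengine, one second arbitration module per group, and
    one third arbitration module per large group."""
    assert num_microengines % group_size == 0
    num_groups = num_microengines // group_size
    groups = [list(range(g * group_size, (g + 1) * group_size))
              for g in range(num_groups)]
    return {
        "first_arbiters": num_microengines,  # one per microengine
        "second_arbiters": num_groups,       # one per microengine group
        "third_arbiters": 1,                 # one per microengine large group
        "groups": groups,
    }
```

For the numbers in claim 9 this yields 32 first arbitration modules, 4 second arbitration modules, and a single third arbitration module per large group.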
10. A method for storing instructions using the instruction storage device according to claim 1, the method comprising:

after receiving an instruction data request sent by a microengine, the fast memory (Qmem) determining whether the Qmem itself holds the instruction data, and if so, returning the instruction data to the microengine, or if not, sending the instruction data request to the cache;

after receiving the instruction data request sent by the Qmem, a cache line in the cache determining whether the cache holds the instruction data, and if so, returning the instruction data to the microengine through the Qmem, or if not, sending the instruction data request to the first low-speed instruction memory or the second low-speed instruction memory;

after receiving the instruction data request sent by the cache, the first low-speed instruction memory looking up the instruction data and returning the found instruction data to the cache; and

after receiving the instruction data request sent by the cache, the second low-speed instruction memory looking up the instruction data and returning the found instruction data to the cache.
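The lookup order of claim 10 (fast memory first, then cache, then one of the two low-speed instruction memories) can be sketched as a three-level fetch path. Modeling each level as a dictionary, and splitting the two low-speed memories by an address boundary, are assumptions made for illustration; the claims do not specify how addresses map to the two memories:

```python
def fetch(addr, qmem, cache, first_mem, second_mem, boundary):
    """Sketch of the fetch path in claim 10: try the fast memory (Qmem),
    then the cache, and on a cache miss go to the first or second
    low-speed instruction memory, refilling the cache on the way back
    (per claim 8). The address 'boundary' splitting the two low-speed
    memories is an illustrative assumption."""
    if addr in qmem:          # Qmem hit: return directly to the microengine
        return qmem[addr]
    if addr in cache:         # cache hit: returned through the Qmem
        return cache[addr]
    backing = first_mem if addr < boundary else second_mem
    data = backing[addr]      # low-speed instruction memory lookup
    cache[addr] = data        # update cache content with the returned data
    return data
```

The point of the hierarchy is that the common case (a Qmem or cache hit) never touches the shared low-speed memories, so the arbitration of claims 6 and 7 only has to absorb the miss traffic.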
11. The method for storing instructions according to claim 10, further comprising:

when a cache line in the cache determines that the cache does not hold the instruction data, sending the instruction data request to the first arbitration module; when the first arbitration module determines that the requested instruction is located in the first low-speed instruction memory, sending the instruction data request to the first low-speed instruction memory; and when the first arbitration module determines that the requested instruction is located in the second low-speed instruction memory, sending the instruction data request to the second low-speed instruction memory.
12. The method for storing instructions according to claim 11, further comprising:

when the first arbitration module determines that the requested instruction is located in the first low-speed instruction memory, sending the instruction data request to the second arbitration module, and when the second arbitration module receives instruction data requests sent by one or more first arbitration modules, selecting one instruction data request and sending it to the first low-speed instruction memory; and

when the first arbitration module determines that the requested instruction is located in the second low-speed instruction memory, sending the instruction data request to the third arbitration module, and when the third arbitration module receives instruction data requests sent by one or more first arbitration modules, selecting one instruction data request and sending it to the second low-speed instruction memory.
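Claims 11 and 12 together describe a routing decision in the first arbitration module: decide which low-speed instruction memory holds the requested instruction, then forward the request to the second or third arbitration module accordingly. An address-range check is assumed below as the decision rule; the claims do not specify how the determination is made, and the names are hypothetical:

```python
def route_request(addr, first_mem_base, first_mem_size):
    """Sketch of the first arbitration module's decision in claims 11-12:
    a request for an address inside the first low-speed instruction
    memory's (assumed) range goes to the second arbitration module, and
    any other request goes to the third arbitration module."""
    if first_mem_base <= addr < first_mem_base + first_mem_size:
        return "second_arbiter"   # path to the first low-speed memory
    return "third_arbiter"        # path to the second low-speed memory
```

Because the second arbiter is shared only within a microengine group while the third arbiter is shared by the whole large group, this routing decision also determines how much contention a given miss experiences.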
PCT/CN2013/078736 2012-07-06 2013-07-03 Instruction storage device of network processor and instruction storage method for same WO2013185660A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201210233710.XA CN102855213B (en) 2012-07-06 2012-07-06 Instruction storage device of network processor and instruction storage method of the device
CN201210233710.X 2012-07-06

Publications (1)

Publication Number Publication Date
WO2013185660A1 true WO2013185660A1 (en) 2013-12-19

Family

ID=47401809

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2013/078736 WO2013185660A1 (en) 2012-07-06 2013-07-03 Instruction storage device of network processor and instruction storage method for same

Country Status (2)

Country Link
CN (1) CN102855213B (en)
WO (1) WO2013185660A1 (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102855213B (en) * 2012-07-06 2017-10-27 ZTE Corporation Instruction storage device of network processor and instruction storage method of the device
CN106293999B (en) 2015-06-25 2019-04-30 Sanechips Technology Co., Ltd. Method and device for implementing a micro-engine function for snapshotting intermediate data of processed messages
CN108804020B (en) * 2017-05-05 2020-10-09 Huawei Technologies Co., Ltd. Storage processing method and device
CN109493857A (en) * 2018-09-28 2019-03-19 广州智伴人工智能科技有限公司 Automatic sleep/wake-up robot system
EP3893122A4 (en) * 2018-12-24 2022-01-05 Huawei Technologies Co., Ltd. Network processor and message processing method

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1845529A (en) * 2006-01-25 2006-10-11 Huawei Technologies Co., Ltd. Network processing device and method
US20070234310A1 (en) * 2006-03-31 2007-10-04 Wenjie Zhang Checking for memory access collisions in a multi-processor architecture
US20110289034A1 (en) * 2010-05-19 2011-11-24 Palmer Douglas A Neural Processing Unit
CN102270180A (en) * 2011-08-09 2011-12-07 清华大学 Multicore processor cache and management method thereof
CN102855213A (en) * 2012-07-06 2013-01-02 ZTE Corporation Network processor instruction storage device and instruction storage method thereof

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20050102474A1 (en) * 2003-11-06 2005-05-12 Sridhar Lakshmanamurthy Dynamically caching engine instructions
CN100456271C (en) * 2007-03-19 2009-01-28 National University of Defense Technology Stream application-oriented on-chip memory


Also Published As

Publication number Publication date
CN102855213A (en) 2013-01-02
CN102855213B (en) 2017-10-27

Similar Documents

Publication Publication Date Title
US11809321B2 (en) Memory management in a multiple processor system
US7558925B2 (en) Selective replication of data structures
US7555597B2 (en) Direct cache access in multiple core processors
US6772268B1 (en) Centralized look up engine architecture and interface
US10970214B2 (en) Selective downstream cache processing for data access
US9529622B1 (en) Systems and methods for automatic generation of task-splitting code
CN108257078B (en) Memory aware reordering source
US20060179277A1 (en) System and method for instruction line buffer holding a branch target buffer
WO2013185660A1 (en) Instruction storage device of network processor and instruction storage method for same
WO2016101664A1 (en) Instruction scheduling method and device
US9418018B2 (en) Efficient fill-buffer data forwarding supporting high frequencies
US9384131B2 (en) Systems and methods for accessing cache memory
WO2015176315A1 (en) Hash join method, device and database management system
US9697127B2 (en) Semiconductor device for controlling prefetch operation
JP2007510989A (en) Dynamic caching engine instructions
US20140089587A1 (en) Processor, information processing apparatus and control method of processor
CN114924794B (en) Address storage and scheduling method and device for transmission queue of storage component
CN114063923A (en) Data reading method and device, processor and electronic equipment
James-Roxby et al. Time-critical software deceleration in a FCCM
US9811467B2 (en) Method and an apparatus for pre-fetching and processing work for procesor cores in a network processor
WO2021061269A1 (en) Storage control apparatus, processing apparatus, computer system, and storage control method
US10303483B2 (en) Arithmetic processing unit and control method for arithmetic processing unit
CN110674138A (en) Message searching method and device
CN118012510A (en) Network processor, network data processing device and chip
JPWO2012172694A1 (en) Arithmetic processing device, information processing device, and control method of arithmetic processing device

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application
Ref document number: 13803552; Country of ref document: EP; Kind code of ref document: A1
NENP Non-entry into the national phase
Ref country code: DE
122 Ep: pct application non-entry in european phase
Ref document number: 13803552; Country of ref document: EP; Kind code of ref document: A1