CN102855213B - Instruction storage device for a network processor and instruction storage method thereof - Google Patents

Instruction storage device for a network processor and instruction storage method thereof

Info

Publication number
CN102855213B
CN102855213B (application CN201210233710.XA; application publication CN102855213A)
Authority
CN
China
Prior art keywords
low speed
instruction data
micro engine
instruction memory
speed instruction
Prior art date
Legal status
Active
Application number
CN201210233710.XA
Other languages
Chinese (zh)
Other versions
CN102855213A (en)
Inventor
郝宇
安康
王志忠
刘衡祁
Current Assignee
Sanechips Technology Co Ltd
Original Assignee
ZTE Corp
Priority date
Filing date
Publication date
Application filed by ZTE Corp
Priority to CN201210233710.XA
Publication of CN102855213A
Priority to PCT/CN2013/078736
Application granted
Publication of CN102855213B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00 Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02 Addressing or allocation; Relocation

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Memory System Of A Hierarchy Structure (AREA)
  • Multi Processors (AREA)

Abstract

The invention discloses an instruction storage device for a network processor and an instruction storage method for the device, which can save hardware resources. The network processor comprises two or more large groups of micro engines; each large group of micro engines comprises N micro engines, and the N micro engines are divided into two or more micro engine groups. The instruction storage device comprises a Qmem, a cache, a first low-speed instruction memory and a second low-speed instruction memory, wherein: each micro engine corresponds to one Qmem and one cache, the Qmem is connected with the micro engine, and the cache is connected with the Qmem; each micro engine group corresponds to one first low-speed instruction memory, and the cache of each micro engine in the micro engine group is connected with the first low-speed instruction memory; each large group of micro engines corresponds to one second low-speed instruction memory, and the cache of each micro engine in the large group is connected with the second low-speed instruction memory. The scheme saves a large amount of hardware storage resources.

Description

Instruction storage device for a network processor and instruction storage method thereof
Technical field
The present invention relates to the field of the Internet, and in particular to an instruction storage device for a network processor and an instruction storage method for the device.
Background technology
With the rapid development of the Internet, the interface rate of the core routers interconnecting the core network has reached 100Gbps. For the line card of such a router to process packets at this rate, the industry currently mostly adopts the multi-core network processor architecture, and instruction fetch efficiency is one of the key factors affecting multi-core network processor performance.
In a multi-core network processor system, the micro engines (Micro Engine, ME) of the same group have the same instruction demand. Owing to limits on chip area and process technology, it is impossible to equip each micro engine with a dedicated storage space for these instructions. A scheme is therefore needed that lets a group of micro engines share one instruction storage space while still achieving high fetch efficiency.
Some traditional multi-core network processors use a multi-level cache structure: for example, each micro engine is equipped with its own level-1 cache and a group of micro engines shares one level-2 cache, so that the storage space is shared, as shown in Fig. 1. These caches must all be fairly large to guarantee the hit rate, but because the randomness of network packets gives instructions poor locality, even a large cache cannot guarantee fetch efficiency, and a large amount of resource is wasted.
Other network processors adopt a polling-type instruction storage scheme: the instructions needed by a group of micro engines are stored in as many RAMs as there are micro engines. As shown in Fig. 2, four micro engines poll instructions from four RAMs through an arbitration module. Each micro engine accesses all the RAMs in turn, and their accesses always stay in different "phases", so different micro engines never collide on the same RAM and the storage space is shared. However, programs contain a large number of jump instructions. Suppose that, for a pipelined micro engine, n clock cycles elapse from fetching a jump instruction to completing the jump; then the target of a jump instruction in one RAM must reside in the (n+1)-th RAM after it, so empty (NOP) instructions have to be inserted when writing the instructions to keep jump targets at the correct positions. When the proportion of jump instructions is large, a large number of NOPs must be inserted, which wastes much of the instruction space and increases the complexity of the compiler. The scheme also requires every RAM to return data within one clock cycle, which forces the use of SRAM, and the large amount of SRAM in turn causes a large resource overhead.
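For illustration only (this sketch is not part of the patent), the padding cost of this prior-art layout can be made concrete with a small C program; the 4-RAM count, the 3-cycle jump latency and all names are assumptions:

    #include <stdio.h>

    /* Prior-art polled-RAM constraint: the instruction at slot i lives in
     * RAM (i % num_rams); the target of a jump fetched from RAM r must sit
     * in RAM (r + n + 1) % num_rams. Returns the number of NOPs that must
     * be inserted before the target to satisfy this. */
    static int nop_padding(int jump_slot, int target_slot,
                           int num_rams, int n_cycles)
    {
        int required_ram = (jump_slot + n_cycles + 1) % num_rams;
        int actual_ram = target_slot % num_rams;
        return (required_ram - actual_ram + num_rams) % num_rams;
    }

    int main(void)
    {
        /* 4 RAMs, 3-cycle jump latency, jump at slot 5, target at slot 20:
         * one NOP is needed; in the worst case num_rams - 1 are needed. */
        printf("NOPs needed: %d\n", nop_padding(5, 20, 4, 3));
        return 0;
    }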
Summary of the invention
The technical problem to be solved by the present invention is to provide an instruction storage device for a network processor and an instruction storage method for the device, which can save hardware resources.
To solve the above technical problem, the invention provides an instruction storage device for a network processor. The network processor comprises two or more large groups of micro engines; each large group of micro engines comprises N micro engines, and the N micro engines are divided into two or more micro engine groups. The instruction storage device comprises a quick memory (Qmem), a cache, a first low-speed instruction memory and a second low-speed instruction memory, wherein:
each micro engine corresponds to one Qmem and one cache; the Qmem is connected with the micro engine, and the cache is connected with the Qmem;
each micro engine group corresponds to one first low-speed instruction memory, and the cache of each micro engine in the micro engine group is connected with the first low-speed instruction memory;
each large group of micro engines corresponds to one second low-speed instruction memory, and the cache of each micro engine in the large group is connected with the second low-speed instruction memory.
Further, the Qmem is configured to judge, after receiving an instruction data request sent by the micro engine, whether this Qmem holds the instruction data; if so, to return the instruction data to the micro engine; if not, to send an instruction data request to the cache.
Further, the Qmem stores the instructions of the address segment with the highest processing-quality requirement.
Further, the cache comprises two Cache Lines, each of which stores a plurality of consecutive instructions. A Cache Line is configured to judge, after receiving an instruction data request sent by the Qmem, whether this cache holds the instruction data; if so, to return the instruction data to the micro engine through the Qmem; if not, to send an instruction data request to the first low-speed instruction memory or the second low-speed instruction memory.
Further, the two Cache Lines work in ping-pong fashion, synchronized with the ping-pong operation of the packet memory.
Further, the device also comprises a first arbitration module, a second arbitration module and a third arbitration module, wherein:
each micro engine corresponds to one first arbitration module, and the first arbitration module is connected with the cache of the micro engine;
each micro engine group corresponds to one second arbitration module; one end of the second arbitration module is connected with the first arbitration module of each micro engine in the micro engine group, and the other end is connected with the first low-speed instruction memory;
each large group of micro engines corresponds to one third arbitration module; one end of the third arbitration module is connected with the first arbitration module of each micro engine, and the other end is connected with the second low-speed instruction memory.
Further, the first arbitration module is configured to judge, when the cache requests instruction data, whether the requested instruction is located in the first low-speed instruction memory or in the second low-speed instruction memory, and to send the instruction data request to the first or the second low-speed instruction memory accordingly; and to receive the instruction data returned by the first or the second low-speed instruction memory and return the instruction data to the cache;
the second arbitration module is configured to select, on receiving instruction data requests sent by one or more first arbitration modules, one instruction data request and send it to the first low-speed instruction memory for processing, and to return the instruction data fetched by the first low-speed instruction memory to the corresponding first arbitration module;
the third arbitration module is configured to select, on receiving instruction data requests sent by one or more first arbitration modules, one instruction data request and send it to the second low-speed instruction memory for processing, and to return the instruction data fetched by the second low-speed instruction memory to the corresponding first arbitration module.
Further, the cache is also configured to update the cache content and the tag after receiving the instruction data returned by the first arbitration module.
Further, each large group of micro engines comprises 32 micro engines, the 32 micro engines comprising 4 micro engine groups of 8 micro engines each.
To solve the above technical problem, the invention also provides an instruction storage method for an instruction storage device, the instruction storage device being the aforementioned instruction storage device, the method comprising:
the quick memory (Qmem), after receiving an instruction data request sent by the micro engine, judges whether this Qmem holds the instruction data; if so, it returns the instruction data to the micro engine; if not, it sends an instruction data request to the cache;
a Cache Line in the cache, after receiving the instruction data request sent by the Qmem, judges whether this cache holds the instruction data; if so, it returns the instruction data to the micro engine through the Qmem; if not, it sends an instruction data request to the first low-speed instruction memory or the second low-speed instruction memory;
the first low-speed instruction memory, after receiving the instruction data request sent by the cache, looks up the instruction data and returns the found instruction data to the cache;
the second low-speed instruction memory, after receiving the instruction data request sent by the cache, looks up the instruction data and returns the found instruction data to the cache.
Further, the method also comprises:
when a Cache Line in the cache judges that this cache does not hold the instruction data, it sends the instruction data request to a first arbitration module; if the first arbitration module judges that the requested instruction is located in the first low-speed instruction memory, it sends the instruction data request to the first low-speed instruction memory; if the requested instruction is located in the second low-speed instruction memory, it requests the instruction data from the second low-speed instruction memory.
Further, the method also comprises:
if the first arbitration module judges that the requested instruction is located in the first low-speed instruction memory, it sends an instruction data request to a second arbitration module; when the second arbitration module receives instruction data requests sent by one or more first arbitration modules, it selects one instruction data request and sends it to the first low-speed instruction memory;
if the first arbitration module judges that the requested instruction is located in the second low-speed instruction memory, it sends an instruction data request to a third arbitration module; when the third arbitration module receives instruction data requests sent by one or more first arbitration modules, it selects one instruction data request and sends it to the second low-speed instruction memory.
The embodiments of the present invention provide an instruction storage scheme for multi-core network processors based on a quick memory and caches: a quick memory, small caches operated in ping-pong fashion and low-speed DRAM memories are combined, and the memories adopt a hierarchical grouping strategy. This instruction storage scheme effectively guarantees high fetch efficiency for part of the instructions as well as a high average fetch efficiency, saves a large amount of hardware storage resources, and keeps the compiler implementation very simple.
Brief description of the drawings
Fig. 1 is a structural diagram of a traditional two-level cache;
Fig. 2 is a structural diagram of the polling-type instruction storage scheme;
Fig. 3 is a structural diagram of an instruction storage device according to Embodiment 1;
Fig. 4 is a structural diagram of a specific instruction storage device;
Fig. 5 is a schematic diagram of the ping-pong operation of the packet memories and icaches;
Fig. 6 is a processing flow chart of the instruction storage device;
Fig. 7 is a detailed flow chart of an instruction storage device;
Fig. 8 is a diagram of the operation of one Cache Line in the Cache module of the present invention.
Detailed description of the embodiments
The present invention combines a quick memory (Quick Memory, Qmem), a small-capacity cache (Cache) operated in ping-pong fashion and a low-speed RAM memory (for example a low-speed instruction memory (Instruction Memory, IMEM)) to serve as the instruction store of a micro engine.
To make the purpose, technical scheme and advantages of the present invention clearer, the embodiments of the present invention are described in detail below with reference to the accompanying drawings. It should be noted that, where no conflict arises, the embodiments in this application and the features in the embodiments may be combined with each other.
Embodiment 1
The instruction memory of this embodiment, shown in Fig. 3, adopts the following structure:
the N micro engines of one large group are divided into two or more groups, each micro engine corresponds to one Qmem and one Cache, each micro engine group corresponds to one first low-speed instruction memory (hereinafter IMEM), and the N micro engines of the large group correspond to one second low-speed instruction memory (hereinafter IMEM_COM). As shown in Fig. 3, the Qmem is connected with the micro engine and the cache is connected with the Qmem; the cache of each micro engine in a micro engine group is connected with the first low-speed instruction memory; the cache of each micro engine in the large group is connected with the second low-speed instruction memory, wherein:
the Qmem is used to judge, after receiving an instruction data request sent by the micro engine, whether this Qmem holds the instruction data; if so, it returns the instruction data to the micro engine; if not, it sends an instruction data request to the cache. The Qmem preferably stores the instructions of the address segment with the highest processing-quality requirement, and is preferably implemented with fast SRAM. The content of the Qmem is never updated during packet processing, so whenever the micro engine needs this part of the instructions the Qmem can return the required instruction data within one clock cycle, which greatly improves fetch efficiency;
the Cache has two Cache Lines, each of which can store a plurality of consecutive instructions. A Cache Line is used to judge, after receiving an instruction data request sent by the Qmem, whether this cache holds the instruction data; if so, it returns the instruction data to the micro engine through the Qmem; if not, it sends an instruction data request to the IMEM or the IMEM_COM. The two Cache Lines work in ping-pong fashion, synchronized with the ping-pong operation of the packet memory;
the above IMEM and IMEM_COM respectively store instructions located in different address segments, look up instruction data according to instruction data requests, and return the results.
The four storage locations above, Qmem, Cache, IMEM and IMEM_COM, have successively decreasing access speeds. Such a hierarchical memory effectively exploits the different probabilities with which instructions are executed, optimizing the efficiency with which the micro engine fetches instructions. Because more slow memory is used, hardware resources are saved.
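As an illustrative sketch only (not part of the patent text), the four-level lookup cascade can be modeled in C as follows; the types, names and address-range checks are hypothetical stand-ins for the hardware address decoding:

    #include <stdbool.h>
    #include <stdint.h>

    /* Each level either serves the request or forwards it to the next,
     * slower level; IMEM and IMEM_COM cover different address segments. */
    typedef struct {
        uint32_t base, size;  /* address segment covered by this level */
        const uint32_t *data; /* backing instruction storage */
    } level_t;

    static bool level_holds(const level_t *lv, uint32_t addr)
    {
        return addr >= lv->base && addr < lv->base + lv->size;
    }

    static uint32_t fetch(uint32_t addr,
                          const level_t *qmem, const level_t *cache,
                          const level_t *imem, const level_t *imem_com)
    {
        if (level_holds(qmem, addr))   /* served within one clock cycle */
            return qmem->data[addr - qmem->base];
        if (level_holds(cache, addr))  /* hit in the working Cache Line */
            return cache->data[addr - cache->base];
        if (level_holds(imem, addr))   /* group-shared low-speed IMEM */
            return imem->data[addr - imem->base];
        return imem_com->data[addr - imem_com->base]; /* large-group IMEM_COM */
    }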
Preferably, the device also comprises a first arbitration module (arbiter1), a second arbitration module (arbiter2) and a third arbitration module (arbiter3). Each micro engine corresponds to one arbiter1, which is connected with the cache of the micro engine; each micro engine group corresponds to one arbiter2, one end of which is connected with the arbiter1 of each micro engine in the group while the other end is connected with the IMEM; each large group of micro engines corresponds to one arbiter3, one end of which is connected with the arbiter1 of each micro engine while the other end is connected with the IMEM_COM.
The arbiter1 is used to judge, when the cache requests instruction data, whether the requested instruction is located in the IMEM or in the IMEM_COM and to send the instruction data request to the IMEM or the IMEM_COM accordingly, and to receive the instruction data returned by the IMEM or the IMEM_COM and return the instruction data to the cache;
the arbiter2 is used to select, when receiving instruction data requests sent by one or more arbiter1 modules, one instruction data request and send it to the IMEM for processing, and to return the instruction data fetched by the IMEM to the corresponding arbiter1;
the arbiter3 is used to select, when receiving instruction data requests sent by one or more arbiter1 modules, one instruction data request and send it to the IMEM_COM for processing, and to return the instruction data fetched by the IMEM_COM to the corresponding arbiter1.
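Step 200 below notes that arbiter2 serves competing requests by polling and does not poll again a branch whose request is still in flight. A minimal round-robin sketch under those assumptions (all names hypothetical, not the patent's implementation):

    #include <stdbool.h>

    #define NUM_REQ 8  /* e.g. the 8 arbiter1 modules of one micro engine group */

    /* Grant the next requester after the previous grant, skipping branches
     * that already have a request in flight (returned data needs several
     * clock cycles). Returns the granted index, or -1 if none is grantable. */
    static int rr_arbitrate(const bool req[NUM_REQ],
                            const bool in_flight[NUM_REQ],
                            int *last_grant)
    {
        for (int i = 1; i <= NUM_REQ; i++) {
            int cand = (*last_grant + i) % NUM_REQ;
            if (req[cand] && !in_flight[cand]) {
                *last_grant = cand;
                return cand;
            }
        }
        return -1;
    }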
Taking N=32 as an example, the 32 micro engines of one large group can be divided into 4 groups of 8 micro engines each. As shown in Fig. 4, each micro engine corresponds to one Qmem and one Cache (comprising two instruction caches (icache)); every group of 8 micro engines shares one IMEM, and the 32 micro engines of the large group share one IMEM_COM. In Fig. 4, A1 denotes arbiter1, A2 denotes arbiter2 and A3 denotes arbiter3. As shown in Fig. 5, the two icaches correspond one-to-one to the two packet memories in the ME and operate in ping-pong fashion to hide the latency of packet storage and instruction fetch.
Embodiment 2
For the instruction storage device shown in Fig. 3, the corresponding instruction storage method, shown in Fig. 6, comprises:
step 1, the Qmem, after receiving an instruction data request sent by the micro engine, judges whether this Qmem holds the instruction data; if so, it returns the instruction data to the micro engine; if not, it sends an instruction data request to the cache;
step 2, a Cache Line in the cache, after receiving the instruction data request sent by the Qmem, judges whether this cache holds the instruction data; if so, it returns the instruction data to the micro engine through the Qmem; if not, it sends an instruction data request to the IMEM or the IMEM_COM;
step 3, the IMEM, after receiving the instruction data request sent by the cache, looks up the instruction data and returns the found instruction data to the cache; the IMEM_COM, after receiving the instruction data request sent by the cache, looks up the instruction data and returns the found instruction data to the cache.
Specifically, for any one micro engine, the instruction fetch process, shown in Fig. 7, comprises the following steps:
Step 110: the micro engine sends the address of the required instruction and an address enable to the Qmem of the micro engine;
Specifically, when the packet memory in the micro engine receives a packet, it sends the first instruction address of the packet and the address enable to the instruction storage device, i.e. to the Qmem corresponding to the micro engine.
Step 120: the Qmem judges whether the instruction address is within the address range of the instructions it stores; if so, step 130 is executed, otherwise step 140 is executed;
Step 130: the instruction data is fetched according to the instruction address and the address enable and returned to the micro engine; this fetch process ends;
Step 140: the instruction address and the address enable are sent to the Cache of the micro engine;
Step 150: the Cache judges whether the instruction address is within the address range of the instructions it stores; if so, step 160 is executed, otherwise step 170 is executed;
Since the Cache has only one working Cache Line per half, its tag (Tag) holds the information of only one tag, so as soon as an address request reaches the Cache it can be determined immediately from the tag whether the required data is in the Cache: the corresponding bits of the instruction address are compared with the tag of the currently working Cache Line; if they are identical, the instruction is in the Cache, otherwise it is not.
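A minimal sketch of this single-tag hit check (the patent does not specify the address split, so the field width below is an assumption, as are all names):

    #include <stdbool.h>
    #include <stdint.h>

    #define OFFSET_BITS 5u  /* assumed: 32 consecutive instructions per Cache Line */

    typedef struct {
        uint32_t tag;   /* tag of the currently working Cache Line */
        bool valid;
    } line_tag_t;

    /* Only one tag per working Cache Line, so a hit is a single compare of
     * the high bits of the instruction address against that tag. */
    static bool cache_hit(const line_tag_t *t, uint32_t instr_addr)
    {
        return t->valid && (instr_addr >> OFFSET_BITS) == t->tag;
    }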
Step 160: the instruction data at the corresponding position in the Cache Line is read out according to the address enable and handed to the micro engine through the Qmem; this fetch process ends;
Step 170: the Cache delivers the instruction address and the address enable to the first arbitration module (arbiter1);
Step 180: arbiter1 judges whether the instruction address lies in the IMEM corresponding to the group where the micro engine is located or in the IMEM_COM corresponding to the large group where the micro engine is located; if in the IMEM, step 190 is executed; if in the IMEM_COM, step 210 is executed;
Specifically, arbiter1 judges from the instruction address whether the instruction is in the IMEM or in the IMEM_COM;
Step 190: arbiter1 sends the instruction address and the address enable to the second arbitration module (arbiter2);
Step 200: arbiter2 selects one instruction request and sends it to the IMEM; the IMEM fetches the instruction data according to the instruction address and the address enable in the request and returns it to the Cache through arbiter1; step 230 is executed;
When the arbiter1 modules of several micro engines issue fetch requests to arbiter2, arbiter2 handles the cache requests by polling and selects one fetch request to send to the IMEM; since the returned data needs multiple clock cycles, a branch that has already issued a request is not polled again;
Step 210: arbiter1 sends the instruction address and the address enable to the third arbitration module (arbiter3);
Step 220: arbiter3 selects one instruction request and sends it to the IMEM_COM; the IMEM_COM fetches the instruction data according to the instruction address and the address enable in the request and returns it to the Cache through arbiter1; step 230 is executed;
The arbiter corresponding to each micro engine functions in the same way as arbiter1, and arbiter3 functions in the same way as arbiter2.
Step 230: the Cache Line and the tag content are updated, and the instruction data is returned to the micro engine through the Qmem; this fetch process ends.
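For illustration only, the refill performed at step 230 can be sketched as follows, under the same assumed address split as the hit-check sketch above (LINE_WORDS and all names are hypothetical):

    #include <stdint.h>

    #define OFFSET_BITS 5u
    #define LINE_WORDS (1u << OFFSET_BITS)

    /* Refill the working Cache Line with the consecutive instructions
     * returned by the IMEM/IMEM_COM and update its tag to match the address. */
    static void cache_fill(uint32_t *line_data, uint32_t *line_tag,
                           uint32_t instr_addr,
                           const uint32_t returned[LINE_WORDS])
    {
        *line_tag = instr_addr >> OFFSET_BITS;
        for (uint32_t i = 0; i < LINE_WORDS; i++)
            line_data[i] = returned[i];
    }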
Fig. 8 is the structural diagram of the icache in Fig. 5. After the icache receives the instruction address sent by the Qmem, it compares the address with the tag to judge whether there is a hit. On a hit, the instruction content is fetched, after decoding, from the physical storage location of the icache according to the address enable and output through a multiplexer (MUX); on a miss, the instruction data is requested from the low-speed instruction memory, and the returned instruction data is output through the MUX.
Only one of the Cache Lines works while one packet is being processed. When the Cache Line 1 used by the current packet finds the corresponding instruction data in the Cache and sends no read request to the lower-level slow memory (IMEM or IMEM_COM), then, if Cache Line 2 detects the first-address request of the next packet, it sends a read request to the lower-level slow memory with the first instruction address contained in the next packet, so as to obtain the instruction data required by the next packet. After the packet of the current Cache Line 1 has been processed, the Cache switches to the other half, Cache Line 2, to prepare to process the next packet. Handling packets with such ping-pong operation effectively hides the time of packet storage and the latency of fetching from the low-speed instruction memory: when the micro engine switches to the next packet it gets the required instructions at once, which improves fetch efficiency and thus the processing efficiency of the micro engine.
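As an illustrative sketch (the structure and the prefetch trigger are simplified assumptions, not the patent's implementation), the ping-pong switching can be modeled like this:

    #include <stdbool.h>
    #include <stdint.h>

    /* 'active' serves the current packet; the other Cache Line may prefetch
     * for the next packet so its instructions are ready at switch time. */
    typedef struct {
        uint32_t tag;
        bool busy;  /* read request to IMEM/IMEM_COM still in flight */
    } line_t;

    typedef struct {
        line_t line[2];
        int active;  /* index of the Cache Line serving the current packet */
    } icache_t;

    /* Prefetch with the first instruction address of the next packet while
     * the active line is serving the current packet without a pending miss. */
    static void prefetch_next(icache_t *c, uint32_t next_first_addr)
    {
        line_t *idle = &c->line[1 - c->active];
        if (!c->line[c->active].busy && !idle->busy) {
            idle->busy = true;             /* issue read to IMEM/IMEM_COM */
            idle->tag = next_first_addr;   /* simplified tag update */
        }
    }

    /* When the current packet is finished, switch to the other half. */
    static void switch_packet(icache_t *c)
    {
        c->active = 1 - c->active;
    }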
Those of ordinary skill in the art will understand that all or part of the steps of the above method may be completed by a program instructing the relevant hardware, and the program may be stored in a computer-readable storage medium, such as a read-only memory, a magnetic disk or an optical disc. Alternatively, all or part of the steps of the above embodiments may also be implemented with one or more integrated circuits. Correspondingly, each module/unit in the above embodiments may be realized in the form of hardware or in the form of a software functional module. The present invention is not restricted to any particular form of combination of hardware and software.
Of course, the present invention may have various other embodiments, and those skilled in the art may make corresponding changes and variations according to the present invention without departing from its spirit and essence; all such corresponding changes and variations shall fall within the protection scope of the appended claims of the present invention.

Claims (12)

1. An instruction storage device for a network processor, the network processor comprising two or more large groups of micro engines, each large group of micro engines comprising N micro engines, the N micro engines being divided into two or more micro engine groups, the instruction storage device comprising: a quick memory (Qmem), a cache, a first low-speed instruction memory and a second low-speed instruction memory, wherein:
each micro engine corresponds to one Qmem and one cache; the Qmem is connected with the micro engine, and the cache is connected with the Qmem;
each micro engine group corresponds to one first low-speed instruction memory, and the cache of each micro engine in the micro engine group is connected with the first low-speed instruction memory;
each large group of micro engines corresponds to one second low-speed instruction memory, and the cache of each micro engine in the large group is connected with the second low-speed instruction memory.
2. The device according to claim 1, characterized in that:
the Qmem is configured to judge, after receiving an instruction data request sent by the micro engine, whether this Qmem holds the instruction data; if so, to return the instruction data to the micro engine; if not, to send an instruction data request to the cache.
3. The device according to claim 1 or 2, characterized in that:
the Qmem stores the instructions of the address segment with the highest processing-quality requirement.
4. The device according to claim 1, characterized in that:
the cache comprises two Cache Lines, each of which stores a plurality of consecutive instructions; a Cache Line is configured to judge, after receiving an instruction data request sent by the Qmem, whether this cache holds the instruction data; if so, to return the instruction data to the micro engine through the Qmem; if not, to send an instruction data request to the first low-speed instruction memory or the second low-speed instruction memory.
5. The device according to claim 4, characterized in that:
the two Cache Lines work in ping-pong fashion, synchronized with the ping-pong operation of the packet memory.
6. The device according to claim 1, 2, 4 or 5, characterized in that:
the device further comprises a first arbitration module, a second arbitration module and a third arbitration module, wherein:
each micro engine corresponds to one first arbitration module, and the first arbitration module is connected with the cache of the micro engine;
each micro engine group corresponds to one second arbitration module; one end of the second arbitration module is connected with the first arbitration module of each micro engine in the micro engine group, and the other end is connected with the first low-speed instruction memory;
each large group of micro engines corresponds to one third arbitration module; one end of the third arbitration module is connected with the first arbitration module of each micro engine, and the other end is connected with the second low-speed instruction memory.
7. The device according to claim 6, characterized in that:
the first arbitration module is configured to judge, when the cache requests instruction data, whether the requested instruction is located in the first low-speed instruction memory or in the second low-speed instruction memory and to send the instruction data request to the first or the second low-speed instruction memory accordingly, and to receive the instruction data returned by the first or the second low-speed instruction memory and return the instruction data to the cache;
the second arbitration module is configured to select, on receiving instruction data requests sent by one or more first arbitration modules, one instruction data request and send it to the first low-speed instruction memory for processing, and to return the instruction data fetched by the first low-speed instruction memory to the corresponding first arbitration module;
the third arbitration module is configured to select, on receiving instruction data requests sent by one or more first arbitration modules, one instruction data request and send it to the second low-speed instruction memory for processing, and to return the instruction data fetched by the second low-speed instruction memory to the corresponding first arbitration module.
8. The device according to claim 7, characterized in that:
the cache is further configured to update the cache content and the tag after receiving the instruction data returned by the first arbitration module.
9. The device according to claim 1, 2, 4, 5, 7 or 8, characterized in that:
each large group of micro engines comprises 32 micro engines, the 32 micro engines being divided into 4 micro engine groups of 8 micro engines each.
10. An instruction storage method for an instruction storage device, the instruction storage device being the instruction storage device according to claim 1, the method comprising:
the quick memory Qmem, after receiving an instruction data request sent by the micro engine, judges whether this Qmem holds the instruction data; if so, it returns the instruction data to the micro engine; if not, it sends an instruction data request to the cache;
a Cache Line in the cache, after receiving the instruction data request sent by the Qmem, judges whether this cache holds the instruction data; if so, it returns the instruction data to the micro engine through the Qmem; if not, it sends an instruction data request to the first low-speed instruction memory or the second low-speed instruction memory;
the first low-speed instruction memory, after receiving the instruction data request sent by the cache, looks up the instruction data and returns the found instruction data to the cache;
the second low-speed instruction memory, after receiving the instruction data request sent by the cache, looks up the instruction data and returns the found instruction data to the cache.
11. The method according to claim 10, characterized in that the method further comprises:
when a Cache Line in the cache judges that this cache does not hold the instruction data, it sends the instruction data request to a first arbitration module; if the first arbitration module judges that the requested instruction is located in the first low-speed instruction memory, it sends the instruction data request to the first low-speed instruction memory; if the requested instruction is located in the second low-speed instruction memory, it requests the instruction data from the second low-speed instruction memory.
12. The method according to claim 11, characterized in that the method further comprises:
if the first arbitration module judges that the requested instruction is located in the first low-speed instruction memory, it sends an instruction data request to a second arbitration module; when the second arbitration module receives instruction data requests sent by one or more first arbitration modules, it selects one instruction data request and sends it to the first low-speed instruction memory;
if the first arbitration module judges that the requested instruction is located in the second low-speed instruction memory, it sends an instruction data request to a third arbitration module; when the third arbitration module receives instruction data requests sent by one or more first arbitration modules, it selects one instruction data request and sends it to the second low-speed instruction memory.
CN201210233710.XA 2012-07-06 2012-07-06 Instruction storage device for a network processor and instruction storage method thereof Active CN102855213B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201210233710.XA CN102855213B (en) 2012-07-06 2012-07-06 Instruction storage device for a network processor and instruction storage method thereof
PCT/CN2013/078736 WO2013185660A1 (en) 2012-07-06 2013-07-03 Instruction storage device of network processor and instruction storage method for same

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201210233710.XA CN102855213B (en) 2012-07-06 2012-07-06 Instruction storage device for a network processor and instruction storage method thereof

Publications (2)

Publication Number Publication Date
CN102855213A CN102855213A (en) 2013-01-02
CN102855213B (en) 2017-10-27

Family

ID=47401809

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201210233710.XA Active CN102855213B (en) Instruction storage device for a network processor and instruction storage method thereof

Country Status (2)

Country Link
CN (1) CN102855213B (en)
WO (1) WO2013185660A1 (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102855213B (en) * 2012-07-06 2017-10-27 ZTE Corp Instruction storage device for a network processor and instruction storage method thereof
CN106293999B (en) 2015-06-25 2019-04-30 Shenzhen ZTE Microelectronics Technology Co Ltd Method and device for implementing a snapshot function for intermediate data of packets processed by a micro engine
CN108804020B (en) * 2017-05-05 2020-10-09 Huawei Technologies Co Ltd Storage processing method and device
CN109493857A (en) * 2018-09-28 2019-03-19 广州智伴人工智能科技有限公司 An automatic sleep and wake-up robot system
EP3893122A4 (en) * 2018-12-24 2022-01-05 Huawei Technologies Co., Ltd. Network processor and message processing method

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN100454899C * 2006-01-25 2009-01-21 Huawei Technologies Co Ltd Network processing device and method
US7836435B2 (en) * 2006-03-31 2010-11-16 Intel Corporation Checking for memory access collisions in a multi-processor architecture
CA2799167A1 (en) * 2010-05-19 2011-11-24 Douglas A. Palmer Neural processing unit
CN102270180B (en) * 2011-08-09 2014-04-02 清华大学 Multicore processor cache and management method thereof
CN102855213B (en) * 2012-07-06 2017-10-27 ZTE Corp Instruction storage device for a network processor and instruction storage method thereof

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1997973A * 2003-11-06 2007-07-11 Intel Corporation Dynamically caching engine instructions
CN101021818A * 2007-03-19 2007-08-22 National University of Defense Technology Stream application-oriented on-chip memory

Also Published As

Publication number Publication date
CN102855213A (en) 2013-01-02
WO2013185660A1 (en) 2013-12-19


Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant
TR01 Transfer of patent right
TR01 Transfer of patent right

Effective date of registration: 20221116

Address after: 518055 ZTE Industrial Park, Liuxian Avenue, Xili Street, Nanshan District, Shenzhen, Guangdong

Patentee after: SANECHIPS TECHNOLOGY Co.,Ltd.

Address before: 518057 Ministry of justice, Zhongxing building, South Science and technology road, Nanshan District hi tech Industrial Park, Shenzhen, Guangdong

Patentee before: ZTE Corp.