CN104901901B - Micro engine and packet processing method therefor - Google Patents

Micro engine and packet processing method therefor

Info

Publication number
CN104901901B
CN104901901B CN201410084619.5A CN201410084619A
Authority
CN
China
Prior art keywords
message
thread
queue
thread number
instruction
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201410084619.5A
Other languages
Chinese (zh)
Other versions
CN104901901A (en
Inventor
周峰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen ZTE Microelectronics Technology Co Ltd
Original Assignee
Shenzhen ZTE Microelectronics Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen ZTE Microelectronics Technology Co Ltd filed Critical Shenzhen ZTE Microelectronics Technology Co Ltd
Priority to CN201410084619.5A priority Critical patent/CN104901901B/en
Priority to PCT/CN2014/077834 priority patent/WO2015131445A1/en
Publication of CN104901901A publication Critical patent/CN104901901A/en
Application granted granted Critical
Publication of CN104901901B publication Critical patent/CN104901901B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 12/00 Data switching networks
    • H04L 12/64 Hybrid switching systems
    • H04L 12/6418 Hybrid transport

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)
  • Advance Control (AREA)

Abstract

The invention discloses a micro engine (ME) and a packet processing method therefor. The ME allocates threads to received packets through at least five thread management queues, stores the packets in a packet memory with dual read/write ports according to the allocated threads, and controls the allocated threads in an eight-stage pipeline to process the packets stored in the packet memory. The invention also discloses an ME for processing packets.

Description

Micro engine and packet processing method therefor
Technical field
The present invention relates to network processor technology, and in particular to a micro engine (ME) and a packet processing method therefor.
Background technique
In order to meet the needs of future network development and improve router performance, the core routers at the heart of the Internet backbone have undergone one technological change after another. In the high-end router market in particular, the network processor, with its outstanding packet processing performance and programmability, has become an irreplaceable part of the routing and forwarding engine. The industry currently uses mostly multithreaded network processor architectures, and the management and scheduling of multiple threads is a key factor in the performance of a multithreaded network processor.
In a network processor system, the ME is the core component of the network processor. A multithreaded architecture is an effective way to raise the performance of the network processor ME, but it also brings problems such as thread-management complexity and system-frequency bottlenecks. A well-designed scheme is therefore needed to achieve efficient, high-frequency thread scheduling and management in the ME while giving the ME high processing performance.
Some traditional multithreaded network processors use MEs that schedule at coarse granularity. Although such an ME can guarantee that the instructions of a single thread execute at full speed, every thread switch requires data to be loaded and saved, which leaves the core pipeline idle and degrades ME performance.
In addition, because only one thread is executing in the ME pipeline at a time, data hazards must be resolved at design time. A design that forwards data increases logic complexity, and when two consecutive packet instructions with related results are processed, the combinational logic path lengthens, lowering the system frequency.
Summary of the invention
In view of this, embodiments of the present invention aim to provide an ME and a packet processing method therefor that overcome the limited frequency and performance of existing MEs.
The technical solution of the present invention is realized as follows:
The present invention provides a packet processing method for an ME, the method comprising: the ME allocates threads to received packets through at least five thread management queues, stores the packets in a packet memory with dual read/write ports according to the allocated threads, and controls the allocated threads in an eight-stage pipeline to process the packets stored in the packet memory.
In the above scheme, the ME allocating threads to received packets through at least five thread management queues is as follows: when the ME receives a new packet, the idle queue free_queue allocates a thread number to the packet on a first-in-first-out basis, and the allocated thread number together with the instruction fetch address carried by the packet is written into the pending queue rdy_queue; when the ME has an idle pipeline slot, the ME dequeues the thread number of one pending packet and the corresponding fetch address from rdy_queue and writes them into the work queue work_queue, which stores only the thread numbers and fetch addresses of packets the ME is currently processing; when a packet needs a table lookup, its thread number and fetch address are written into the lookup queue srh_queue; when a packet has been fully processed, its thread number and fetch address are written into the packet output queue pkt_out_queue; wherein, when a packet needs a lookup or has been fully processed, its thread number and fetch address are deleted from work_queue.
In the above scheme, controlling the allocated threads in an eight-stage pipeline to process the packets stored in the packet memory is as follows: the eight-stage pipeline supports eight threads working simultaneously, and each stage of the eight-stage pipeline corresponds to one thread; wherein, in the first stage, the thread sends a fetch request for a packet instruction according to the packet's fetch address; in the second stage, the thread receives the packet instruction; in the third stage, the thread decodes the packet instruction and obtains its source operands; in the fourth stage, the thread bit-aligns the source operands; in the fifth stage, the thread performs the arithmetic operation corresponding to the packet instruction on the bit-aligned source operands and calculates the corresponding memory address; in the sixth stage, the thread issues a read/write request according to the memory address; in the seventh stage, the thread obtains the response to the request; in the eighth stage, the thread writes back the result of the arithmetic operation or the response to the request as the processing result of the packet instruction; wherein, after the eighth stage, if it is determined that the packet does not need a table lookup and still contains unprocessed instructions, processing returns to the first stage according to the packet's thread number to handle the packet's remaining instructions.
In the above scheme, after the packet has been processed, the packet's thread number is released.
The present invention provides an ME, comprising a thread management module, a packet storage module with dual read/write ports, and a kernel module; wherein the thread management module is configured to allocate threads to received packets through at least five thread management queues; the packet storage module is configured to store the packets according to the allocated threads; and the kernel module is configured to control the allocated threads in an eight-stage pipeline to process the packets stored in the packet storage module.
In the above scheme, the thread management module is specifically configured to allocate a thread number to a packet through the idle queue free_queue on a first-in-first-out basis and to write the allocated thread number together with the fetch address carried by the packet into the pending queue rdy_queue; when there is an idle pipeline slot, to dequeue the thread number of one pending packet and the corresponding fetch address from rdy_queue and write them into the work queue work_queue, which stores only the thread numbers and fetch addresses of packets being processed; when a packet needs a table lookup, to write its thread number and fetch address into the lookup queue srh_queue; and when a packet has been fully processed, to write its thread number and fetch address into the packet output queue pkt_out_queue; wherein, when a packet needs a lookup or has been fully processed, its thread number and fetch address are deleted from work_queue.
In the above scheme, the kernel module is specifically configured such that each stage of the eight-stage pipeline corresponds to one thread; wherein, in the first stage, the thread sends a fetch request for a packet instruction according to the packet's fetch address; in the second stage, the thread receives the packet instruction; in the third stage, the thread decodes the packet instruction and obtains its source operands; in the fourth stage, the thread bit-aligns the source operands; in the fifth stage, the thread performs the arithmetic operation corresponding to the packet instruction on the bit-aligned source operands and calculates the corresponding memory address; in the sixth stage, the thread issues a read/write request according to the memory address; in the seventh stage, the thread obtains the response to the request; in the eighth stage, the thread writes back the result of the arithmetic operation or the response to the request as the processing result of the packet instruction; wherein, after the eighth stage, if it is determined that the packet does not need a table lookup and still contains unprocessed instructions, processing returns to the first stage according to the packet's thread number to handle the packet's remaining instructions.
In the above scheme, the thread management module is further configured to release the packet's thread number after the packet has been processed.
It can be seen that embodiments of the present invention provide an ME and a packet processing method therefor: the ME allocates threads to received packets through at least five thread management queues, stores the packets in a packet memory with dual read/write ports according to the allocated threads, and controls the allocated threads in an eight-stage pipeline to process the packets stored in the packet storage module. Data hazards are prevented by the hardware structure itself, which simplifies the logic and removes the need for data-hazard detection logic, and access conflicts on the ME's internal resources are avoided, effectively raising the ME's operating frequency and performance and guaranteeing high-frequency, high-performance packet processing. The scheme is also relatively simple to implement, which reduces coding complexity and thus labor cost.
Detailed description of the invention
Fig. 1 is a flow diagram of the packet processing method of the ME provided by Embodiment 1 of the present invention;
Fig. 2 is a flow diagram of the packet processing method of the ME provided by Embodiment 2 of the present invention;
Fig. 3 is a schematic diagram of the ME pipeline processing one packet, as provided by Embodiment 2 of the present invention;
Fig. 4 is a schematic diagram of the ME pipeline processing multiple packets, as provided by Embodiment 2 of the present invention;
Fig. 5 is a structural schematic diagram of the ME provided by Embodiment 3 of the present invention.
Specific embodiment
In embodiments of the present invention, the ME allocates threads to received packets through at least five thread management queues, stores the packets in a packet memory with dual read/write ports according to the allocated threads, and controls the allocated threads in an eight-stage pipeline to process the packets stored in the packet memory.
The technical solution of the present invention is described in further detail below with reference to the accompanying drawings and specific embodiments.
Embodiment 1
Fig. 1 is a flow diagram of the packet processing method of the ME provided by Embodiment 1 of the present invention. As shown in Fig. 1, the method includes the following steps:
Step 101: the ME allocates threads to received packets through at least five thread management queues.
Specifically, taking five thread management queues as an example: when the ME receives a new packet, the idle queue free_queue allocates a thread number to the packet on a first-in-first-out basis, and the allocated thread number together with the instruction fetch address carried by the packet is written into the pending queue rdy_queue. When the ME has an idle pipeline slot, the ME dequeues the thread number of one pending packet and the corresponding fetch address from rdy_queue and writes them into the work queue work_queue; work_queue stores only the thread numbers and fetch addresses of packets the ME is currently processing. When a packet needs a table lookup, its thread number and fetch address are written into the lookup queue srh_queue; when a packet has been fully processed, its thread number and fetch address are written into the packet output queue pkt_out_queue. In either case, at the same time as the thread number and fetch address are written into srh_queue or pkt_out_queue, they are deleted from work_queue.
The thread number allocated by free_queue corresponds one-to-one with the packet itself, so a packet can be identified from its allocated thread number.
When the ME is processing fewer than 8 packets, it has an idle pipeline slot; it then dequeues the thread number of one pending packet and the corresponding fetch address from rdy_queue and writes them into work_queue, assigns the idle pipeline slot to the packet whose thread number has just entered work_queue, and the thread identified by that thread number processes the packet in the idle slot. The total number of thread numbers stored in work_queue is at most 8, matching the eight pipeline stages. When work_queue holds 8 thread numbers, the ME is processing 8 packets, each corresponding to one thread, and the eight-stage pipeline of the ME then has 8 threads cycling through it.
While the eight-stage pipeline processes packets, each stage corresponds to one thread and each thread processes one packet, so the ME pipeline can process 8 packets simultaneously. When one of the 8 packets has been fully processed, its thread number is written into pkt_out_queue and its thread number and fetch address are deleted from work_queue, so work_queue then holds fewer than 8 thread numbers; correspondingly, because that packet has been processed, the pipeline slot that had been allocated to it becomes idle again and is reused to process other packets.
After a packet has been processed, the ME releases the packet's thread number and the corresponding thread; the released thread can later be allocated to a packet the ME receives subsequently.
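To make the queue flow of step 101 concrete, the following is a minimal software sketch, in C++, of the five thread management queues and of the allocate/schedule/lookup/retire transitions described above. It is an illustrative model only, not the patent's hardware implementation; the type and method names (PacketContext, ThreadManager, on_new_packet and so on) are invented for this sketch, while the depth of 16 for free_queue and the limit of 8 in-flight packets follow the text.
#include <cstdint>
#include <deque>
#include <optional>
// One entry as it moves through rdy_queue, work_queue, srh_queue and pkt_out_queue.
struct PacketContext {
    int      thread_no;   // thread number allocated by free_queue, 0..15
    uint32_t fetch_addr;  // instruction fetch address carried by the packet
};
class ThreadManager {
public:
    ThreadManager() {
        for (int t = 0; t < 16; ++t) free_queue.push_back(t);   // depth 16, all idle at reset
    }
    // Step 101: allocate a thread number first-in-first-out and enqueue the packet as pending.
    std::optional<int> on_new_packet(uint32_t fetch_addr) {
        if (free_queue.empty()) return std::nullopt;             // no thread number available
        int t = free_queue.front();
        free_queue.pop_front();
        rdy_queue.push_back({t, fetch_addr});
        return t;
    }
    // Move one pending packet into work_queue when a pipeline slot is idle.
    bool schedule() {
        if (rdy_queue.empty() || work_queue.size() >= 8) return false;  // at most 8 in flight
        work_queue.push_back(rdy_queue.front());
        rdy_queue.pop_front();
        return true;
    }
    // The packet needs a table lookup: move its context from work_queue to srh_queue.
    void on_lookup_needed(int thread_no) { move_from_work(thread_no, srh_queue); }
    // The packet is fully processed: move its context from work_queue to pkt_out_queue.
    void on_packet_done(int thread_no)   { move_from_work(thread_no, pkt_out_queue); }
    // After the packet leaves the ME its thread number returns to the front of free_queue.
    void release_thread(int thread_no)   { free_queue.push_front(thread_no); }
private:
    void move_from_work(int thread_no, std::deque<PacketContext>& dst) {
        for (auto it = work_queue.begin(); it != work_queue.end(); ++it) {
            if (it->thread_no == thread_no) { dst.push_back(*it); work_queue.erase(it); return; }
        }
    }
    std::deque<int>           free_queue;
    std::deque<PacketContext> rdy_queue, work_queue, srh_queue, pkt_out_queue;
};
In hardware these queues would be small FIFOs; the model only makes explicit which queue transition each step of the method corresponds to.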
Step 102: the ME stores the packet in a packet memory with dual read/write ports according to the allocated thread.
Here, after the ME has allocated a thread number to the received packet in step 101, that is, once the packet has been assigned a corresponding thread, the ME first stores the received packet according to the allocated thread, placing it in the packet memory with dual read/write ports.
In practical applications, the packet memory with dual read/write ports is a random access memory (RAM).
Step 103: the ME controls the allocated thread in an eight-stage pipeline to process the packet stored in the packet memory.
Specifically, once the packet's thread number and corresponding fetch address have been written into work_queue, the ME controls the thread allocated in step 101, in an eight-stage pipeline, to process the packet stored in the packet memory.
Here, the eight-stage pipeline supports eight threads working simultaneously, and each stage of the eight-stage pipeline corresponds to one thread, wherein:
In the first stage, the thread sends a fetch request for a packet instruction according to the packet's fetch address;
In the second stage, the thread receives the packet instruction;
In the third stage, the thread decodes the packet instruction and obtains its source operands;
In the fourth stage, the thread bit-aligns the source operands;
In the fifth stage, the thread performs the arithmetic operation corresponding to the packet instruction on the bit-aligned source operands and calculates the corresponding memory address;
In the sixth stage, the thread issues a read/write request according to the memory address;
In the seventh stage, the thread obtains the response to the request;
In the eighth stage, the thread writes back the result of the arithmetic operation or the response to the request as the processing result of the packet instruction;
Wherein, after the eighth stage, if it is determined that the packet does not need a table lookup and still contains unprocessed instructions, processing returns to the first stage according to the packet's thread number to handle the remaining instructions, until all of the packet's instructions have been processed.
Processing a single packet instruction requires passing through the first through the eighth stage in turn. In general a packet contains multiple packet instructions, so after the first through eighth stages have executed, it must be determined whether the packet needs a table lookup and whether it still contains unprocessed instructions. When it is determined that the packet does not need a lookup and still contains unprocessed instructions, the packet's thread number and corresponding fetch address remain stored in work_queue, and the thread identified by that thread number continues to process the remaining instructions through the eight-stage pipeline. When it is determined that the packet needs a lookup, the packet's thread number is written into srh_queue and the corresponding thread is suspended until the ME receives the packet's lookup response, after which the thread continues processing the packet's instructions. When the packet has been fully processed, that is, it needs no lookup and has no unprocessed instructions, the packet's thread number is written into pkt_out_queue.
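The eight stages can also be pictured as a per-thread state walk. The sketch below is a rough functional model in C++ under the stage naming used in Embodiment 2 below (IF1, IF2, ID, EX1, EX2, MA1, MA2, WB); the structure and the comments are assumptions made for illustration, and the per-stage actions are only placeholders for the hardware behaviour described above.
#include <cstdint>
// Hypothetical per-thread stage walk; each call advances one thread by one pipeline stage.
enum class Stage { IF1, IF2, ID, EX1, EX2, MA1, MA2, WB };
struct ThreadState {
    Stage    stage = Stage::IF1;
    uint32_t fetch_addr = 0;   // taken from work_queue
    // instruction register, source operands, ALU result, memory response, ...
};
// Returns true once the WB stage has completed for the current packet instruction.
bool step(ThreadState& t) {
    switch (t.stage) {
        case Stage::IF1: /* send the fetch request for the instruction at fetch_addr   */ break;
        case Stage::IF2: /* receive the packet instruction from the instruction memory */ break;
        case Stage::ID:  /* decode; read the source operands from this thread's RF     */ break;
        case Stage::EX1: /* bit-align the source operands for the ALU                  */ break;
        case Stage::EX2: /* ALU operation and memory-address calculation               */ break;
        case Stage::MA1: /* issue the read/write request to pkt_ram or latch result    */ break;
        case Stage::MA2: /* collect the memory response; select the output via Wb_mux  */ break;
        case Stage::WB:  /* write the selected result back into the RF                 */
            t.stage = Stage::IF1;   // next instruction of the same packet, if any
            return true;
    }
    t.stage = static_cast<Stage>(static_cast<int>(t.stage) + 1);
    return false;
}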
In practical applications, when the sixth pipeline stage sends a read/write request to the dual-port RAM, it uses only one of the two read/write ports; the other read/write port is used for packet reception into and transmission out of the ME when the ME receives a packet load or output request.
Embodiment 2
In Embodiment 2, the ME completes packet management and thread scheduling through the following five queues:
the idle queue free_queue, the pending queue rdy_queue, the work queue work_queue, the lookup queue srh_queue and the packet output queue pkt_out_queue; wherein free_queue allocates thread numbers to new packets entering the ME; rdy_queue stores the thread numbers and fetch addresses of pending packets and can store at most 8 of them; work_queue stores the thread numbers and fetch addresses of packets being processed; srh_queue stores the thread numbers and fetch addresses of packets that need a table lookup; and pkt_out_queue stores the thread numbers and fetch addresses of processed packets waiting to be sent out. The fetch address is carried by the packet itself when it enters the ME.
Fig. 2 shows a packet processing method of the ME provided by Embodiment 2 of the present invention. As shown in Fig. 2, the method includes the following steps:
Step 201: the ME allocates a thread number to the packet through free_queue.
Specifically, when the ME receives a new packet, free_queue allocates a thread number to it. The depth of free_queue is 16, and this depth determines how many thread numbers can be allocated; in other words, free_queue can allocate at most 16 thread numbers to packets. The number of threads in the ME available for packet processing is the same as the number of thread numbers free_queue can allocate, namely 16. The depth of 16 for free_queue was obtained from overall performance testing and resource cost calculations.
free_queue allocates thread numbers to received packets on a first-in-first-out basis. With a depth of 16, the 16 thread numbers can be 0 to 15. When a packet is received, free_queue allocates the thread number at the front of the queue to it. After a packet has been processed and output, the thread number of that packet is released and returned to free_queue, and free_queue places the released thread number at the front of the queue so that it is allocated again when a new packet is received. When the ME is powered on or reset, all 16 thread numbers in free_queue are unallocated; thread numbers are allocated to received packets starting from the front of the queue, and packets received thereafter are allocated thread numbers on a first-in-first-out basis.
By allocating a thread number to each packet, the ME keeps packets independent of one another, and the thread number establishes the mapping between a packet and the storage resources the ME allocates to that packet.
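One way to picture the mapping that the thread number establishes is that every per-packet resource is simply indexed by the thread number. The C++ sketch below is illustrative only; the per-packet slot size of 256 bytes and the 32-entry register file are invented constants, while the count of 16 threads follows the depth of free_queue.
#include <cstdint>
constexpr int kThreads   = 16;    // matches the depth of free_queue
constexpr int kSlotBytes = 256;   // assumed per-packet buffer size, for illustration only
struct RegisterFile { uint32_t r[32]; };   // one RF per thread (read in ID, written in WB)
struct MicroEngineResources {
    RegisterFile rf[kThreads];                      // per-thread register files
    uint8_t      pkt_ram[kThreads * kSlotBytes];    // packet memory, one slot per thread
    // The thread number alone identifies the packet's storage in pkt_ram.
    uint8_t* packet_slot(int thread_no) { return &pkt_ram[thread_no * kSlotBytes]; }
};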
Step 202: the ME writes the thread number allocated to the packet and the corresponding fetch address into rdy_queue.
Here, after a thread number has been allocated to the packet, free_queue writes the allocated thread number and the fetch address into rdy_queue; at this point the packet corresponding to the thread number entering rdy_queue is a pending packet. rdy_queue stores the thread numbers and fetch addresses of pending packets and can store at most 8 of them.
Step 203: when there is an idle pipeline slot, the ME dequeues the thread number of one pending packet and the corresponding fetch address from rdy_queue and writes them into work_queue.
Specifically, when the ME has an idle pipeline slot, it dequeues the thread number and corresponding fetch address of one pending packet from rdy_queue and writes them into work_queue; at this point, the thread identified by the thread number allocated to the packet processes the packet in the idle pipeline slot. work_queue stores only the thread numbers of packets the ME is currently processing. After a packet needs a lookup or has been fully processed, the ME deletes the packet's thread number and corresponding fetch address from work_queue and writes them into srh_queue or pkt_out_queue respectively, so that work_queue can admit the thread number and corresponding fetch address of a pending packet from rdy_queue.
Here, the kernel-level thread processes the packet using an eight-stage pipeline structure. According to the thread number and corresponding fetch address stored in work_queue, the eight-stage pipeline fetches the packet instruction; after the packet instruction has been fetched, the instruction is decoded and the extraction of its source operands is completed. After the source operands have been extracted, the arithmetic logic unit in the ME shifts, concatenates, adds or subtracts the source operands as required by the instruction, and writes the result into the destination register or memory. After processing by the eight-stage pipeline, a packet falls into one of the following three cases (a dispatch sketch follows this list):
First, when the packet needs no lookup and still contains unprocessed instructions, its processing is not yet finished, and the eight-stage pipeline continues with the packet's next instruction;
Second, when the packet needs a lookup, its processing is not yet finished, and step 204 is executed;
Third, when the packet has been fully processed, step 205 is executed.
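The choice among these three cases can be written as a small dispatch function. The C++ sketch below is only a restatement of the list above; the flag names are invented for illustration, and in the queue sketch given in Embodiment 1 case 2 would map to on_lookup_needed() and case 3 to on_packet_done().
// Hypothetical post-pipeline dispatch corresponding to the three cases above.
enum class PostPipeline { Continue, Lookup, Done };
PostPipeline classify(bool needs_lookup, bool has_unprocessed_instructions) {
    if (needs_lookup)                 return PostPipeline::Lookup;    // case 2: go to step 204
    if (has_unprocessed_instructions) return PostPipeline::Continue;  // case 1: next instruction
    return PostPipeline::Done;                                        // case 3: go to step 205
}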
Step 204: the ME writes the thread number and corresponding fetch address of the packet that needs a lookup into srh_queue.
Specifically, when a packet needs a table lookup, its thread number and corresponding fetch address are written into srh_queue to wait for the lookup. At this point the packet is not yet fully processed: its thread number is written into srh_queue, the ME sends out the lookup request, and the lookup request that is sent out carries the packet's thread number, so the thread number remains occupied. Meanwhile, the thread processing the packet is suspended and waits for the lookup response to return. Because the thread of the packet being looked up is suspended, the eight-stage pipeline has an idle slot, and that idle slot is used to process a pending packet whose thread number is stored in rdy_queue.
When the ME receives a lookup response, the response carries the thread number of the packet that was looked up; using that thread number, the packet's fetch address is extracted from the thread's lookup-instruction memory srh_pc_ram, and after the fetch address has been extracted, processing returns to step 202. The only difference between re-entering step 202 on a lookup response and entering it for a newly received packet is the following: when the ME receives a packet's lookup response, it writes the packet's thread number and the extracted fetch address directly into rdy_queue, without allocating a thread number through free_queue; when the ME receives a new packet, the thread number allocated to the packet by free_queue and its fetch address are written into rdy_queue.
Step 205: the ME writes the thread number and corresponding fetch address of the fully processed packet into pkt_out_queue.
Specifically, when a packet has been fully processed, its thread number and corresponding fetch address are written into pkt_out_queue to wait for output. Although the packet has been processed at this point, its thread number, now written into pkt_out_queue, is still in use; only when the ME sends the packet out is the packet's thread number released, so that it can be allocated to a new packet and does not tie up the thread-number resources in the ME.
The thread number released in step 205 is written back into free_queue in queue order, ready to be reallocated.
It should be noted that when rdy_queue holds the thread numbers of fewer than 8 pending packets, the ME automatically generates empty packets so that the number of thread numbers held in rdy_queue and work_queue is kept at 8 and the eight-stage pipeline of the ME processes 8 packets simultaneously, allowing every stage of the eight-stage pipeline to execute normally. An empty packet is handled as in step 205; a padding sketch is given below.
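A minimal sketch of the padding step, under the reading that rdy_queue and work_queue together always hold eight contexts; the Ctx structure and the choice of -1 as the marker of an empty packet are assumptions for illustration.
#include <cstddef>
#include <deque>
struct Ctx { int thread_no; unsigned fetch_addr; };   // as in the queue sketch of Embodiment 1
// Keep eight contexts in flight: pad rdy_queue with empty packets whenever fewer than
// eight real packets are pending, so that every stage of the eight-stage pipeline has work.
void pad_with_empty_packets(std::deque<Ctx>& rdy, std::size_t in_work) {
    while (rdy.size() + in_work < 8) {
        rdy.push_back({ -1, 0u });    // thread_no = -1 marks a hypothetical empty packet
    }
}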
After step 201, the ME stores the packet in the packet memory pkt_ram with dual read/write ports according to the thread number allocated to the packet in step 201, that is, according to the thread allocated to the packet. pkt_ram has dual read/write ports, so the eight-stage pipeline uses the dual read/write ports to process the packets stored in pkt_ram.
While the eight-stage pipeline of the ME is processing packet instructions it needs to access pkt_ram; at the same time, the packets waiting in pkt_out_queue to be output are also stored in pkt_ram, and outputting a packet from the ME likewise requires access to pkt_ram, which creates a read/write conflict on pkt_ram. To avoid this conflict, either the pipeline would have to be stalled or the fetch of the packet to be output would have to be delayed, which lowers ME performance. In the embodiments of the present invention, giving pkt_ram dual read/write ports ensures that the pipeline is never stalled by contention for a pkt_ram port, so the pipeline can run at full speed and the packet processing performance of the ME is improved.
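The role of the two ports can be shown with a toy software model of pkt_ram; it is not the patent's RAM design, and the class and method names are invented. Port A stands for the pipeline's MA1/MA2 accesses and port B for packet input and output, so that in hardware the two kinds of access can proceed in the same cycle without stalling each other.
#include <cstddef>
#include <cstdint>
#include <vector>
class DualPortPktRam {
public:
    explicit DualPortPktRam(std::size_t words) : mem_(words, 0) {}
    // Port A: used by pipeline stages MA1/MA2 for packet-instruction loads and stores.
    uint32_t read_a(std::size_t addr) const        { return mem_[addr]; }
    void     write_a(std::size_t addr, uint32_t v) { mem_[addr] = v; }
    // Port B: used to write incoming packets into the ME and to read out packets being sent.
    uint32_t read_b(std::size_t addr) const        { return mem_[addr]; }
    void     write_b(std::size_t addr, uint32_t v) { mem_[addr] = v; }
private:
    std::vector<uint32_t> mem_;   // in hardware both ports access the same cells concurrently
};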
Fig. 3 is a schematic diagram of the workflow of the pipeline processing one packet in step 203. As shown in Fig. 3, the pipeline processes a packet as follows:
When the packet's thread number and the fetch address carried by the packet enter work_queue, an idle pipeline slot processes the packet. The pipeline can run at most 8 threads simultaneously and can therefore process 8 packets simultaneously.
The packet first enters the first pipeline stage, Instruction Fetch 1 (IF1): the thread processing the packet sends a fetch request for the packet instruction according to the fetch address thread_pc of the packet stored in work_queue; the fetch request is sent to the instruction memory instrmem used to store packet instructions. instrmem is a RAM independent of the RAM that stores the packets, so instruction read/write access is fast and, since accesses do not miss, latency is low.
The second stage, Instruction Fetch 2 (IF2), receives the packet instruction from instrmem and stores the received instruction in the instruction register if_instr used to hold the fetched packet instruction.
The third stage, Instruction Decode (ID), parses and decodes the packet instruction held in if_instr, generates the register file (RF) read command and read address, and obtains from the RF the source operands needed by each execution unit that will execute the packet instruction. The ME allocates a corresponding RF to each thread in the pipeline to store the data associated with that thread.
The fourth stage, Execute 1 (EX1), bit-aligns the source operands. Because the ME supports many operation types, for example arithmetic logic unit (ALU) operations, the values of the source operands must be aligned in preparation for the operation in the fifth stage. This stage mainly guarantees the timing of the arithmetic unit that executes the packet instruction: it does not compute on the source operands, it only bit-aligns them according to the fetched operands and the operation type.
The fifth stage, Execute 2 (EX2), has the ALU compute on the source operands bit-aligned in the fourth stage, performing the arithmetic operation corresponding to the packet instruction and the calculation of the corresponding memory address. This part is pure combinational logic: based on the source operands it completes the arithmetic operation of the packet instruction and the address calculation.
The sixth stage, Memory Access 1 (MA1), issues the request corresponding to the packet instruction: when the packet instruction is an arithmetic operation, the arithmetic result is written into the result unit; when the packet instruction operates on a memory address, a read/write request is issued to pkt_ram through one of pkt_ram's read/write ports.
The seventh stage, Memory Access 2 (MA2), obtains the data read from pkt_ram by the read/write request and, at the same time, sends the result unit and the data read from pkt_ram to the data selection unit Wb_mux as the pipeline output. Before the write-back in the eighth stage, Wb_mux selects the pipeline output, that is, the processing result of this packet instruction, according to which of the three cases of step 203 the packet falls into after pipeline processing.
The eighth stage, Write Back (WB), writes the pipeline output selected by Wb_mux back into the RF, so that the processing result of the packet instruction takes effect.
In short, IF1 and IF2 fetch the packet instruction from instrmem, ID decodes the packet instruction, EX1 completes the extraction and alignment of the instruction's source operands, EX2 has the arithmetic logic unit shift, concatenate, add or subtract the source operands as required by the instruction, and MA1, MA2 and WB write the instruction's result into the RF of the packet instruction; through these eight pipeline stages the processing of one packet instruction is completed.
In practical applications, when IF1 issues the fetch request for a packet instruction, the instruction at the current fetch address is about to be extracted; after the packet instruction has been extracted, the fetch address changes accordingly, advancing by one, so that the next packet instruction can be fetched correctly.
As shown in Fig. 4, packets enter each stage of the pipeline one after another in order. Each pipeline stage corresponds to one thread, and the pipeline supports 8 threads working simultaneously. The first packet fetches its instruction at IF1 and completes the write-back of the processing result at WB, finishing the processing of that instruction; each subsequent packet lags the previous one by one pipeline stage. In the eight-stage pipeline, at any given moment each stage performs a different operation and completes the function of that stage. When 8 threads work simultaneously, each thread works in a different pipeline stage in turn. For example: at time T1, thread Thread0 works in IF1; at time T2, Thread0 works in IF2 and Thread1 works in IF1; at time T3, Thread0 works in ID, Thread1 in IF2 and Thread2 in IF1; and so on, so that at time T8, Thread0 works in WB, Thread1 in MA2, Thread2 in MA1, Thread3 in EX2, Thread4 in EX1, Thread5 in ID, Thread6 in IF2 and Thread7 in IF1.
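The staggered occupancy of Fig. 4 can be reproduced with a few lines of C++; the program below is an illustrative check, assuming thread i enters IF1 at cycle i+1 and that one stage is traversed per cycle, and it prints the thread/stage table for T1 to T8 described above.
#include <cstdio>
// Stage occupied by thread `th` (0..7) at cycle `t` (1 = T1): stage 1 = IF1 ... stage 8 = WB.
int stage_of(int th, int t) {
    int s = t - th;                     // each thread lags the previous one by one stage
    return (s >= 1 && s <= 8) ? s : 0;  // 0 = not yet started (wrap-around is ignored here)
}
int main() {
    const char* names[] = {"-", "IF1", "IF2", "ID", "EX1", "EX2", "MA1", "MA2", "WB"};
    for (int t = 1; t <= 8; ++t) {
        std::printf("T%d:", t);
        for (int th = 0; th < 8; ++th)
            std::printf(" Thread%d=%s", th, names[stage_of(th, t)]);
        std::printf("\n");
    }
    return 0;
}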
When a packet reaches the WB stage, the processing of one of its instructions is complete. At this point, if the packet does not need a lookup and the current packet instruction is not the last instruction of the packet, the ME keeps the packet's thread number and fetch address in work_queue, and the thread processing the packet goes on to process the packet's next instruction.
For two consecutive related ALU packet instructions, that is, where the result calculated by the previous instruction is a source operand of the next instruction, the processing result of the previous instruction takes effect when it is written back into the RF at the WB stage, and the next instruction must obtain that processing result from the RF as a source operand at the ID stage. The write-back of the processing result and its reading are separated by 5 cycles; in other words, the next packet instruction can only use the processing result of the previous instruction after five cycles, otherwise a data hazard would occur. The pipeline has 8 stages, each corresponding to one thread, and after a thread has executed one packet instruction, 8 cycles elapse before it executes the next packet instruction; this interval of 8 cycles is greater than 5 cycles, so data hazards are avoided.
Take thread Thread0 as an example: the first packet instruction executed by Thread0 enters the pipeline at time T1, and it takes 8 cycles for this first instruction to reach the WB stage. Only at that point does the second packet instruction of Thread0 enter the pipeline; that is, for Thread0, the two consecutive instructions executed by Thread0 enter the pipeline 8 cycles apart. For the eight-stage pipeline, the WB stage completes the RF write and the ID stage completes the RF read, and these two stages are 5 cycles apart; the later instruction has therefore not yet reached the ID stage when the earlier instruction writes back, which avoids a data hazard.
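The cycle accounting behind this hazard argument can be checked directly. The short C++ program below restates it, assuming that instruction k of a thread enters IF1 at cycle 0, that one stage is traversed per cycle, and that the same thread issues its next instruction 8 cycles later; the constants are taken from the text (WB is stage 8, ID is stage 3).
#include <cassert>
int main() {
    const int issue_gap = 8;   // the same thread re-enters IF1 only every 8 cycles
    const int wb_stage  = 8;   // the RF write happens in stage 8 (WB)
    const int id_stage  = 3;   // the RF read happens in stage 3 (ID)
    const int write_back_cycle = wb_stage - 1;                // instruction k writes back at cycle 7
    const int next_read_cycle  = issue_gap + (id_stage - 1);  // instruction k+1 reads at cycle 10
    // The 8-cycle issue gap exceeds the 5-stage WB-to-ID distance, so the write
    // strictly precedes the dependent read and no data hazard can occur.
    assert(write_back_cycle < next_read_cycle);
    return 0;
}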
Embodiment 3
Fig. 5 is a structural schematic diagram of an ME provided by Embodiment 3 of the present invention. As shown in Fig. 5, the ME 50 includes a thread management module 51, a packet storage module 52 with dual read/write ports and a kernel module 53.
The thread management module 51, which may be implemented by a central processing unit (CPU) in cooperation with a memory chip, is configured to allocate threads to received packets through at least five thread management queues.
Specifically, taking five thread management queues as an example, the thread management module 51 may allocate a thread number to a packet through the idle queue free_queue on a first-in-first-out basis and write the allocated thread number and the fetch address carried by the packet into the pending queue rdy_queue; when there is an idle pipeline slot, dequeue the thread number of one pending packet and the corresponding fetch address from rdy_queue and write them into the work queue work_queue, which stores only the thread numbers and fetch addresses of packets being processed; when a packet needs a table lookup, write the packet's thread number and fetch address into the lookup queue srh_queue; and when a packet has been fully processed, write the packet's thread number and fetch address into the packet output queue pkt_out_queue; wherein, when a packet needs a lookup or has been fully processed, the packet's thread number and fetch address are deleted from work_queue.
The packet storage module 52 may be implemented by a RAM and is configured to store the packets according to the allocated threads.
The kernel module 53 may be implemented by a CPU and a signal processing chip and is configured to control the allocated threads in an eight-stage pipeline to process the packets.
Specifically, in the kernel module 53 each stage of the eight-stage pipeline corresponds to one thread, wherein:
In the first stage, the thread sends a fetch request for a packet instruction according to the packet's fetch address;
In the second stage, the thread receives the packet instruction;
In the third stage, the thread decodes the packet instruction and obtains its source operands;
In the fourth stage, the thread bit-aligns the source operands;
In the fifth stage, the thread performs the arithmetic operation corresponding to the packet instruction on the bit-aligned source operands and calculates the corresponding memory address;
In the sixth stage, the thread issues a read/write request according to the memory address;
In the seventh stage, the thread obtains the response to the request;
In the eighth stage, the thread writes back the result of the arithmetic operation or the response to the request as the processing result of the packet instruction;
Wherein, after the eighth stage, when it is determined that the packet does not need a table lookup and still contains unprocessed instructions, processing returns to the first stage according to the packet's thread number to handle the packet's remaining instructions.
The thread management module 51 is further configured to release the packet's thread number after the packet has been processed.
The present invention is described with reference to flowcharts and/or block diagrams of the methods, devices (systems) and computer program products according to any one of Embodiments 1 to 3 of the present invention. It should be understood that each flow and/or block of the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general-purpose computer, a special-purpose computer, an embedded processor or another programmable data processing device to produce a machine, so that the instructions executed by the processor of the computer or other programmable data processing device produce an apparatus for implementing the functions specified in one or more flows of the flowcharts and/or in one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing device to operate in a particular manner, so that the instructions stored in the computer-readable memory produce an article of manufacture that includes an instruction apparatus, the instruction apparatus implementing the functions specified in one or more flows of the flowcharts and/or in one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or other programmable data processing device, so that a series of operational steps are executed on the computer or other programmable device to produce computer-implemented processing, and the instructions executed on the computer or other programmable device thus provide steps for implementing the functions specified in one or more flows of the flowcharts and/or in one or more blocks of the block diagrams.
Correspondingly, any one of Embodiments 1 and 2 of the present invention also provides a computer storage medium storing a computer program, the computer program being used to execute the ME packet processing method of any one of Embodiments 1 and 2 of the present invention.
The above are only preferred embodiments of the present invention and are not intended to limit the protection scope of the present invention; any modification, equivalent replacement and improvement made within the spirit and principle of the present invention shall fall within the protection scope of the present invention.

Claims (4)

1. A method for a micro engine (ME) to process packets, characterized in that the method comprises:
the ME allocates threads to received packets through at least five thread management queues, stores the packets in a packet memory with dual read/write ports according to the allocated threads, and controls the allocated threads in an eight-stage pipeline to process the packets stored in the packet memory;
wherein the ME allocating threads to the received packets through at least five thread management queues comprises:
when the ME receives a new packet, the idle queue free_queue allocates a thread number to the packet on a first-in-first-out basis, and the allocated thread number together with the instruction fetch address carried by the packet is written into the pending queue rdy_queue; when the ME has an idle pipeline slot, the ME dequeues the thread number of one pending packet and the corresponding fetch address from rdy_queue and writes them into the work queue work_queue, which stores only the thread numbers and fetch addresses of packets the ME is currently processing; when a packet needs a table lookup, the packet's thread number and fetch address are written into the lookup queue srh_queue; when a packet has been fully processed, the packet's thread number and fetch address are written into the packet output queue pkt_out_queue; wherein, when a packet needs a lookup or has been fully processed, the packet's thread number and fetch address are deleted from work_queue;
controlling the allocated threads in an eight-stage pipeline to process the packets stored in the packet memory comprises:
the eight-stage pipeline supports eight threads working simultaneously, each stage of the eight-stage pipeline corresponding to one thread; wherein,
in the first stage, the thread sends a fetch request for a packet instruction according to the packet's fetch address;
in the second stage, the thread receives the packet instruction;
in the third stage, the thread decodes the packet instruction and obtains its source operands;
in the fourth stage, the thread bit-aligns the source operands;
in the fifth stage, the thread performs the arithmetic operation corresponding to the packet instruction on the bit-aligned source operands and calculates the corresponding memory address;
in the sixth stage, the thread issues a read/write request according to the memory address;
in the seventh stage, the thread obtains the response to the request;
in the eighth stage, the thread writes back the result of the arithmetic operation or the response to the request as the processing result of the packet instruction;
wherein, after the eighth stage, when it is determined that the packet does not need a table lookup and still contains unprocessed instructions, processing returns to the first stage according to the packet's thread number to handle the packet's remaining instructions.
2. The method according to claim 1, characterized in that the method further comprises:
releasing the thread number of the packet after the packet has been processed.
3. An ME, characterized in that the ME comprises: a thread management module, a packet storage module with dual read/write ports, and a kernel module; wherein,
the thread management module is configured to allocate threads to received packets through at least five thread management queues;
the packet storage module is configured to store the packets according to the allocated threads;
the kernel module is configured to control the allocated threads in an eight-stage pipeline to process the packets stored in the packet storage module;
wherein the thread management module is specifically configured to allocate a thread number to a packet through the idle queue free_queue on a first-in-first-out basis and to write the allocated thread number together with the fetch address carried by the packet into the pending queue rdy_queue; when there is an idle pipeline slot, to dequeue the thread number of one pending packet and the corresponding fetch address from rdy_queue and write them into the work queue work_queue, which stores only the thread numbers and fetch addresses of packets being processed; when a packet needs a table lookup, to write the packet's thread number and fetch address into the lookup queue srh_queue; and when a packet has been fully processed, to write the packet's thread number and fetch address into the packet output queue pkt_out_queue; wherein, when a packet needs a lookup or has been fully processed, the packet's thread number and fetch address are deleted from work_queue;
the kernel module is specifically configured such that each stage of the eight-stage pipeline corresponds to one thread; wherein,
in the first stage, the thread sends a fetch request for a packet instruction according to the packet's fetch address;
in the second stage, the thread receives the packet instruction;
in the third stage, the thread decodes the packet instruction and obtains its source operands;
in the fourth stage, the thread bit-aligns the source operands;
in the fifth stage, the thread performs the arithmetic operation corresponding to the packet instruction on the bit-aligned source operands and calculates the corresponding memory address;
in the sixth stage, the thread issues a read/write request according to the memory address;
in the seventh stage, the thread obtains the response to the request;
in the eighth stage, the thread writes back the result of the arithmetic operation or the response to the request as the processing result of the packet instruction;
wherein, after the eighth stage, when it is determined that the packet does not need a table lookup and still contains unprocessed instructions, processing returns to the first stage according to the packet's thread number to handle the packet's remaining instructions.
4. The ME according to claim 3, characterized in that the thread management module is further configured to release the thread number of the packet after the packet has been processed.
CN201410084619.5A 2014-03-07 2014-03-07 Micro engine and packet processing method therefor Active CN104901901B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201410084619.5A CN104901901B (en) 2014-03-07 2014-03-07 Micro engine and packet processing method therefor
PCT/CN2014/077834 WO2015131445A1 (en) 2014-03-07 2014-05-19 Microengine and packet processing method therefor, and computer storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201410084619.5A CN104901901B (en) 2014-03-07 2014-03-07 Micro engine and packet processing method therefor

Publications (2)

Publication Number Publication Date
CN104901901A CN104901901A (en) 2015-09-09
CN104901901B true CN104901901B (en) 2019-03-12

Family

ID=54034300

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410084619.5A Active CN104901901B (en) 2014-03-07 2014-03-07 Micro engine and packet processing method therefor

Country Status (2)

Country Link
CN (1) CN104901901B (en)
WO (1) WO2015131445A1 (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109257280B (en) * 2017-07-14 2022-05-27 深圳市中兴微电子技术有限公司 Micro-engine and message processing method thereof
CN109298923B (en) * 2018-09-14 2019-11-29 中科驭数(北京)科技有限公司 Deep pipeline task processing method and device
CN117331655A (en) * 2022-06-27 2024-01-02 深圳市中兴微电子技术有限公司 Multithreading scheduling method and device

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5560029A (en) * 1991-07-22 1996-09-24 Massachusetts Institute Of Technology Data processing system with synchronization coprocessor for multiple threads
US6829697B1 (en) * 2000-09-06 2004-12-07 International Business Machines Corporation Multiple logical interfaces to a shared coprocessor resource
CN1767502A (en) * 2004-09-29 2006-05-03 英特尔公司 Updating instructions executed by a multi-core processor
CN101763285A (en) * 2010-01-15 2010-06-30 西安电子科技大学 Zero-overhead switching multithread processor and thread switching method thereof

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102752198B (en) * 2012-06-21 2014-10-29 北京星网锐捷网络技术有限公司 Multi-core message forwarding method, multi-core processor and network equipment

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5560029A (en) * 1991-07-22 1996-09-24 Massachusetts Institute Of Technology Data processing system with synchronization coprocessor for multiple threads
US6829697B1 (en) * 2000-09-06 2004-12-07 International Business Machines Corporation Multiple logical interfaces to a shared coprocessor resource
CN1767502A (en) * 2004-09-29 2006-05-03 英特尔公司 Updating instructions executed by a multi-core processor
CN101763285A (en) * 2010-01-15 2010-06-30 西安电子科技大学 Zero-overhead switching multithread processor and thread switching method thereof

Also Published As

Publication number Publication date
CN104901901A (en) 2015-09-09
WO2015131445A1 (en) 2015-09-11

Similar Documents

Publication Publication Date Title
US11003489B2 (en) Cause exception message broadcast between processing cores of a GPU in response to indication of exception event
CN102752198B (en) Multi-core message forwarding method, multi-core processor and network equipment
JP6674384B2 (en) Processor core using dynamic instruction stream mapping, computer system including processor core, and method of executing program instructions by processor core (parallel slice processor using dynamic instruction stream mapping)
US20070143581A1 (en) Superscalar data processing apparatus and method
US11256507B2 (en) Thread transition management
JP2018501564A (en) Execution unit circuit for processor core, processor core, and method of executing program instructions in processor core
CN105426160A (en) Instruction classified multi-emitting method based on SPRAC V8 instruction set
US20070226696A1 (en) System and method for the execution of multithreaded software applications
US20080155197A1 (en) Locality optimization in multiprocessor systems
RU2008138707A (en) DECLARATIVE MODEL FOR MANAGING PARALLEL PERFORMANCE OF LIGHTWEIGHT PERFORMANCE FLOWS
TW200945206A (en) Method for automatic workload distribution on a multicore processor
CN103197916A (en) Methods and apparatus for source operand collector caching
CN109032668A Stream processor with high-bandwidth and low-power vector register file
CN106575220B (en) Multiple clustered VLIW processing cores
CN104901901B Micro engine and packet processing method therefor
CN110308982A Shared memory multiplexing method and device
US20200319893A1 (en) Booting Tiles of Processing Units
CN115129480B (en) Scalar processing unit and access control method thereof
CN106406820B Multi-issue parallel instruction processing method and device for a network processor micro engine
US11875425B2 (en) Implementing heterogeneous wavefronts on a graphics processing unit (GPU)
He et al. Real-time scheduling in mapreduce clusters
CN116414541A (en) Task execution method and device compatible with multiple task working modes
CN109298923B (en) Deep pipeline task processing method and device
CN109257280A Micro engine and packet processing method therefor
US7536674B2 (en) Method and system for configuring network processing software to exploit packet flow data locality

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant
EE01 Entry into force of recordation of patent licensing contract

Application publication date: 20150909

Assignee: Xi'an Chris Semiconductor Technology Co. Ltd.

Assignor: SHENZHEN ZTE MICROELECTRONICS TECHNOLOGY CO., LTD.

Contract record no.: 2019440020036

Denomination of invention: Micro-engine and method for processing message therewith

Granted publication date: 20190312

License type: Common License

Record date: 20190619

EE01 Entry into force of recordation of patent licensing contract