WO2022222756A1 - Chip, method for processing data, and computer device - Google Patents

Chip, method for processing data, and computer device

Info

Publication number
WO2022222756A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
processed
instruction
type
action
Prior art date
Application number
PCT/CN2022/085503
Other languages
English (en)
French (fr)
Inventor
张旭
于乾坤
李楠
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司
Publication of WO2022222756A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F7/00 Methods or arrangements for processing data by operating upon the order or content of the data handled
    • G06F7/38 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
    • G06F7/48 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices
    • G06F7/57 Arithmetic logic units [ALU], i.e. arrangements or devices for performing two or more of the operations covered by groups G06F7/483 – G06F7/556 or for performing logical operations
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30003 Arrangements for executing specific machine instructions
    • G06F9/30007 Arrangements for executing specific machine instructions to perform operations on data operands
    • G06F9/3001 Arithmetic instructions

Definitions

  • the present application relates to the field of chip technology, and more particularly, to a chip, a method for processing data, and a computer device.
  • An arithmetic logic unit is a combinational logic digital circuit that can perform arithmetic or bit operations on binary integers.
  • the ALU is an essential element of many computing circuits. These computing circuits include a central processing unit (CPU), a graphics processing unit (GPU), a network processing unit (NP), and the like.
  • An ALU has various input and output networks that are used to pass digital signals between the ALU and external circuits.
  • the external circuit inputs a signal at the input end of the ALU, and the ALU will generate the operation result and output the signal to the external circuit through its output end.
  • the ALU typically supports many basic arithmetic and bitwise logic functions.
  • Basic general-purpose ALUs typically support the following operations: arithmetic operations (e.g., addition and subtraction), bitwise logical operations (e.g., AND and OR), and shift operations.
  • Figure 1 is a block diagram of an instruction execution unit that includes an ALU.
  • the instruction execution unit 100 shown in FIG. 1 includes two source operand multiplexers, a source operand multiplexer 101 and a source operand multiplexer 102 .
  • the instruction execution unit 100 also includes an ALU 103 and a destination operand demultiplexer 104.
  • Multiple instruction execution units in computing circuits such as CPU, GPU, and NP are all instruction execution units 100 shown in FIG. 1 .
  • the ALU usually supports binary operation instructions such as addition, subtraction, and "and" operations. Therefore, the instruction execution unit 100 needs to include two source operand multiplexers to obtain the data required by the binary operation instruction.
  • the present application provides a chip, a method for processing data, and a computer device, which can be better applied to scenarios with many unary operations.
  • an embodiment of the present application provides a chip. The chip includes a matching module, a first action module, and a second action module. The matching module is configured to determine the type of data to be processed and a first instruction set for processing the data to be processed, wherein the type of the data to be processed is the first type of data to be processed or the second type of data to be processed.
  • the matching module is further configured to indicate the data to be processed and the first instruction set to the first action module when it determines that the type of the data to be processed is the first type of data to be processed, wherein, when the type of the data to be processed is the first type of data to be processed, each of the at least one instruction included in the first instruction set is a unary operation instruction. The matching module is further configured to indicate the data to be processed and the first instruction set to the second action module when it determines that the type of the data to be processed is the second type of data to be processed, wherein, when the type of the data to be processed is the second type of data to be processed, the first instruction set includes at least one multiple operation instruction.
  • the first action module is configured to, when it obtains the data to be processed and the first instruction set, execute the instructions in the first instruction set on the data to be processed to obtain first processed data. The second action module is configured to, when it obtains the data to be processed and the first instruction set, execute the instructions in the first instruction set on the data to be processed to obtain second processed data.
  • the chip provided by the above technical solution includes two kinds of action modules, one of which is specially used for processing unary operation instructions. Compared with an action module that can process both unary operation instructions and multiple operation instructions, the structure of the action module that can only process unary operation instructions is simpler. In this way, the chip can use a smaller area to realize the same function of the existing chip, thereby reducing the manufacturing cost of the chip and improving the performance of the chip.
  • the first instruction set further includes at least one unary operation instruction, wherein each of the at least one unary operation instruction corresponds to one or more of the at least one multiple operation instruction, and each unary operation instruction is executed after the corresponding multiple operation instruction.
  • the output data of the first action module and the output data of the second action module are combined and output.
  • the output data of the first action module and the output data of the matching module are combined and input to the second action module.
  • the matching module is further configured to determine a second instruction set for processing the first processed data, and send the second instruction set to the second action module, the second instruction set including at least one multiple operation instruction; the second action module is further configured to, when it obtains the second instruction set and the first processed data, execute the instructions in the second instruction set on the first processed data to obtain third processed data.
  • the first action module includes at least one instruction execution unit, and each instruction execution unit in the at least one instruction execution unit includes a source operand multiplexer, an instruction executor, and a destination operand demultiplexer. The storage space that the source operand multiplexer can access is the storage space for saving the response data generated by the matching module.
  • because the source operand multiplexer only needs to access this storage space, a simpler source operand multiplexer can be selected, thereby further reducing the complexity of the chip.
  • the unary operation instruction is a data move operation or a data shift operation.
  • the chip includes at least one processing unit, and each processing unit in the at least one processing unit includes: the matching module, the first action module and the second action module.
  • an embodiment of the present application provides a method for processing data, including: determining a type of data to be processed and a first instruction set for processing the data to be processed, wherein the type of the data to be processed is the first type of data to be processed or the second type of data to be processed; in the case that the type of the data to be processed is the first type of data to be processed, each of the at least one instruction included in the first instruction set is a unary operation instruction;
  • in the case that the type of the data to be processed is the second type of data to be processed, the first instruction set includes at least one multiple operation instruction; and executing the instructions in the first instruction set on the data to be processed to obtain first processed data.
  • the above technical solution may be implemented by the chip provided in the first aspect.
  • the chip includes two action modules, one of which is dedicated to processing unary operation instructions. Compared with an action module that can process both unary operation instructions and multiple operation instructions, the structure of the action module that can only process unary operation instructions is simpler. In this way, the chip can use a smaller area to realize the same function of the existing chip, thereby reducing the manufacturing cost of the chip and improving the performance of the chip.
  • the first instruction set further includes at least one unary operation instruction, wherein each of the at least one unary operation instruction corresponds to one or more of the at least one multiple operation instruction, and each unary operation instruction is executed after the corresponding multiple operation instruction.
  • the method further includes: determining a second instruction set for processing the first processed data, and executing the instructions in the second instruction set on the first processed data to obtain second processed data.
  • the unary operation instruction is a data move operation or a data shift operation.
  • an embodiment of the present application provides a computer device, where the computer device includes a chip in the first aspect or any possible design of the first aspect.
  • embodiments of the present application provide a computer-readable storage medium, where program code is stored in the computer-readable storage medium, and when the program code runs on a computer, the computer is caused to execute the method in the second aspect or in any possible design of the second aspect.
  • an embodiment of the present application provides a computer program product, the computer program product comprising computer program code, where when the computer program code runs on a computer, the computer is caused to execute the method in the second aspect or in any possible design of the second aspect.
  • Figure 1 is a block diagram of an instruction execution unit that includes an ALU.
  • FIG. 2 is a schematic structural diagram of an integrated circuit provided according to an embodiment of the present application.
  • FIG. 3 is a schematic structural diagram of another integrated circuit provided according to an embodiment of the present application.
  • FIG. 4 is a schematic structural block diagram of an integrated circuit.
  • FIG. 5 is a schematic structural block diagram of a first action module provided by an embodiment of the present application.
  • FIG. 6 is a schematic diagram of a chip provided according to an embodiment of the present application.
  • FIG. 7 is a schematic structural block diagram of a switch chip provided according to an embodiment of the present application.
  • Figure 8 is a schematic diagram of the original decision diagram.
  • Figure 9 is a schematic flow diagram of the swap operation.
  • Figure 10 is a schematic diagram of an optimized decision diagram.
  • FIG. 11 is a schematic flowchart of a method for processing data according to an embodiment of the present application.
  • the term “at least one” refers to one or more, and "a plurality” refers to two or more.
  • “And/or”, which describes the association relationship of the associated objects, indicates that there can be three kinds of relationships, for example, A and/or B, which can indicate: the existence of A alone, the existence of A and B at the same time, and the existence of B alone, where A, B can be singular or plural.
  • the character “/” generally indicates that the associated objects are an “or” relationship.
  • “At least one item(s) below” or similar expressions refer to any combination of these items, including any combination of single item(s) or plural item(s).
  • For example, at least one of a, b, or c can represent: a, b, c, a-b, a-c, b-c, or a-b-c, where a, b, and c may each be single or multiple.
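  • The enumeration above can be reproduced mechanically. The following is a minimal Python sketch, not part of the application; the function name `at_least_one` is a hypothetical helper that lists every non-empty combination of the given items.

```python
from itertools import combinations


def at_least_one(items):
    """Return every non-empty combination of items, smallest combinations first."""
    return [
        "-".join(combo)
        for size in range(1, len(items) + 1)
        for combo in combinations(items, size)
    ]


print(at_least_one(["a", "b", "c"]))
# ['a', 'b', 'c', 'a-b', 'a-c', 'b-c', 'a-b-c']
```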
  • ordinal numbers such as “first” and “second” mentioned in the embodiments of the present application are used to distinguish multiple objects, and are not used to limit the order, timing, priority, or importance of the multiple objects.
  • FIG. 2 is a schematic structural diagram of an integrated circuit provided according to an embodiment of the present application.
  • the integrated circuit 200 shown in FIG. 2 may include a matching module 201 , a first action module 202 and a second action module 203 .
  • the integrated circuit 200 also includes an input interface 204 and an output interface 205 .
  • the matching module 201 is configured to obtain input data from the input interface 204, and determine the data to be processed according to the input data.
  • the matching module 201 is further configured to determine the type of the data to be processed.
  • the type of the data to be processed may be the first type of data to be processed or the second type of data to be processed.
  • the matching module 201 is further configured to determine the instruction set for processing the data to be processed.
  • the instruction set for processing the data to be processed may be referred to as instruction set 1 .
  • One or more instructions may be included in the instruction set 1 .
  • If the type of the data to be processed is the first type of data to be processed, the instructions included in the instruction set 1 are all unary operation instructions.
  • the unary operation instruction may include a data move operation (move, MOV) and a data shift operation.
  • Data shift operations may include acyclic shifts (e.g., shift left (SHL), shift right (SHR), shift arithmetic left (SAL), and shift arithmetic right (SAR)) and cyclic shifts (e.g., rotate left (ROL), rotate right (ROR), rotate through carry left (RCL), and rotate through carry right (RCR)).
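  • To make the difference between these shift types concrete, the following is a minimal Python sketch of several of them on a fixed-width value. The 8-bit width and the function names are assumptions made for illustration; the application does not specify a register width. SAL behaves like SHL, and RCL/RCR are omitted because they additionally need a carry flag.

```python
WIDTH = 8                 # assumed register width, for illustration only
MASK = (1 << WIDTH) - 1


def shl(x, n):
    """Logical shift left: vacated low bits are filled with zeros."""
    return (x << n) & MASK


def shr(x, n):
    """Logical shift right: vacated high bits are filled with zeros."""
    return (x & MASK) >> n


def sar(x, n):
    """Arithmetic shift right: vacated high bits copy the sign bit."""
    x &= MASK
    n = min(n, WIDTH)
    shifted = x >> n
    if x >> (WIDTH - 1):                      # sign bit is set
        shifted |= (MASK << (WIDTH - n)) & MASK
    return shifted


def rol(x, n):
    """Rotate left: bits shifted out at the top re-enter at the bottom."""
    x &= MASK
    n %= WIDTH
    return ((x << n) | (x >> (WIDTH - n))) & MASK


def ror(x, n):
    """Rotate right: bits shifted out at the bottom re-enter at the top."""
    x &= MASK
    n %= WIDTH
    return ((x >> n) | (x << (WIDTH - n))) & MASK


print(hex(shr(0x96, 1)), hex(sar(0x96, 1)), hex(rol(0x96, 1)))
```

  • Each function takes a single data operand (the shift amount plays the role of an immediate), which is why such instructions need only one source operand.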
  • If the type of the data to be processed is the second type of data to be processed, the instruction set 1 includes at least one multiple operation instruction.
  • the multiple operation instructions may include binary operation instructions and ternary or more multiple operation instructions.
  • In other words, if the instructions needed to process the data to be processed include only unary operation instructions, the data to be processed is the first type of data to be processed; if the instructions needed to process the data to be processed include multiple operation instructions, the data to be processed is the second type of data to be processed.
  • the instruction set 1 may also include at least one unary operation instruction.
  • Each unary operation instruction in the at least one unary operation instruction corresponds to one or more multiple operation instructions, and each unary operation instruction is executed after the corresponding multiple operation instruction.
  • In other words, if the instruction set 1 includes both multiple operation instructions and unary operation instructions, the unary operation instructions in the instruction set 1 need to be executed after the multiple operation instructions.
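  • The classification rule described above can be sketched in a few lines of Python. This is an illustrative sketch only: the opcode names in `MULTI_OPS`, the enum `DataType`, and both function names are assumptions, and the ordering check implements one plain reading of the rule (no unary instruction runs before the first multiple operation instruction).

```python
from enum import Enum

# Unary operations named in the text (data move and data shifts).
UNARY_OPS = {"MOV", "SHL", "SHR", "SAL", "SAR", "ROL", "ROR", "RCL", "RCR"}
# Assumed examples of multiple (two-or-more-operand) operations.
MULTI_OPS = {"ADD", "SUB", "AND", "OR"}


class DataType(Enum):
    FIRST = 1   # only unary operation instructions are needed
    SECOND = 2  # at least one multiple operation instruction is needed


def classify(instruction_set):
    """Return the data type implied by an ordered list of opcode names."""
    unknown = [op for op in instruction_set if op not in UNARY_OPS | MULTI_OPS]
    if unknown:
        raise ValueError(f"unknown opcodes: {unknown}")
    if any(op in MULTI_OPS for op in instruction_set):
        return DataType.SECOND
    return DataType.FIRST


def unary_after_multi(instruction_set):
    """Check that no unary instruction precedes the first multiple one."""
    seen_multi = False
    for op in instruction_set:
        if op in MULTI_OPS:
            seen_multi = True
        elif not seen_multi:
            return False
    return True


print(classify(["MOV", "SHL"]), classify(["ADD", "MOV"]))
```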
  • If the type of the data to be processed is the first type of data to be processed, the matching module 201 may indicate the data to be processed and the instruction set 1 to the first action module 202 .
  • the first action module 202 may execute the instruction set 1 on the acquired data to be processed to obtain the processed data.
  • the processed data determined by the first action module 202 may be referred to as processed data 1 .
  • If the type of the data to be processed is the second type of data to be processed, the matching module 201 may indicate the data to be processed and the instruction set 1 to the second action module 203 .
  • the second action module 203 may execute the instruction set 1 on the acquired data to be processed to obtain the processed data.
  • the processed data determined by the second action module 203 may be referred to as processed data 2 .
  • the matching module 201 can directly send the data to be processed to the first action module 202 or the second action module 203.
  • the matching module 201 may send the identifier of each instruction in the instruction set 1 to the first action module 202 or the second action module 203 . If the instruction set 1 includes multiple instructions, the matching module 201 may also send the execution sequence of the multiple instructions to the first action module 202 or the second action module 203 . The first action module 202 or the second action module 203 can find the corresponding instruction according to the received instruction identifier, and execute the corresponding instruction on the data to be processed.
  • the matching module 201 may only send the identifier of the first instruction to be executed in the instruction set 1 (hereinafter referred to as instruction 1) to the first action module 202 or the second action module 203.
  • the first action module 202 or the second action module 203 may determine the instruction 1 according to the identifier of the instruction 1, and execute the instruction 1 on the data to be processed.
  • the instruction 1 may include a next-step indication, which may indicate either the identifier of the instruction to be executed next (hereinafter referred to as instruction 2) or that the operation is complete. If the next-step indication indicates the identifier of instruction 2, the first action module 202 or the second action module 203 may continue to execute instruction 2 on the data to be processed after executing instruction 1.
  • instruction 2 may also include a next-step indication. If the next-step indication in instruction 1 indicates that the operation is complete, the first action module 202 or the second action module 203 outputs the result of executing instruction 1 on the data to be processed, that is, the processed data 1 or the processed data 2 .
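  • The chained scheme described above, in which only the identifier of the first instruction is sent and each instruction names its successor, can be sketched as follows. This is a hedged Python illustration: the table layout, the identifiers "i1" and "i2", and the use of `None` to mean "operation complete" are all assumptions, not details from the application.

```python
# Hypothetical instruction table held by an action module:
# identifier -> (operation, next-step indication).
# A next-step indication of None stands for "operation complete".
INSTRUCTION_TABLE = {
    "i1": (lambda x: (x << 1) & 0xFF, "i2"),  # shift left by 1, then run i2
    "i2": (lambda x: x >> 2, None),           # shift right by 2, then stop
}


def run_chain(table, first_id, data):
    """Execute instructions starting at first_id, following next-step indications."""
    current = first_id
    while current is not None:
        operation, next_id = table[current]
        data = operation(data)
        current = next_id
    return data


print(run_chain(INSTRUCTION_TABLE, "i1", 0b00001100))  # 6
```

  • With this layout the matching module only has to send one identifier per piece of data, at the cost of storing a successor field with every instruction.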
  • the data output by the first action module 202 may be combined with the data output by the second action module 203 and then output through the output interface 205 .
  • FIG. 3 is a schematic structural diagram of another integrated circuit provided according to an embodiment of the present application.
  • the integrated circuit 300 shown in FIG. 3 may include a matching module 301 , a first action module 302 and a second action module 303 .
  • the integrated circuit 300 also includes an input interface 304 and an output interface 305 .
  • the function of the matching module 301 in the integrated circuit 300 is similar to the function of the matching module 201 shown in FIG. 2 .
  • the matching module 301 in the integrated circuit 300 can also obtain input data from the input interface 304, and determine, according to the input data, the data to be processed, the type of the data to be processed, and the instruction set for processing the data to be processed.
  • For ease of description, the instruction set for processing the data to be processed determined by the matching module 301 may be referred to as instruction set 2 .
  • One or more instructions may be included in the instruction set 2 .
  • the specific implementation manner in which the matching module 301 determines the type of the data to be processed is similar to the specific implementation manner in which the matching module 201 determines the type of the data to be processed, and for brevity, details are not repeated here.
  • instruction set 2 is similar to instruction set 1.
  • If the type of the data to be processed determined by the matching module 301 is the first type of data to be processed, the instructions included in the instruction set 2 are all unary operation instructions; if the type of the data to be processed determined by the matching module 301 is the second type of data to be processed, the instruction set 2 includes at least one multiple operation instruction.
  • the instruction set 2 may also include at least one unary operation instruction, and the unary operation instruction included in the instruction set 2 is executed after the corresponding multiple operation instruction.
  • When the matching module 301 determines that the data to be processed is the first type of data to be processed, it can indicate the data to be processed and the instruction set 2 to the first action module 302; when it determines that the data to be processed is the second type of data to be processed, it can indicate the data to be processed and the instruction set 2 to the second action module 303 .
  • the manner in which the matching module 301 indicates the data to be processed and the instruction set is the same as the manner in which the matching module 201 shown in FIG. 2 indicates the data to be processed and the instruction set, and for brevity, details are not repeated here.
  • the first action module 302 may execute the instruction set 2 on the acquired data to be processed to obtain processed data.
  • the processed data determined by the first action module 302 may be referred to as processed data 3 .
  • the second action module 303 may execute the instruction set 2 on the acquired data to be processed to obtain the processed data.
  • the processed data determined by the second action module 303 may be referred to as processed data 4 .
  • the processed data determined by the first action module 302 may be combined with the data output by the matching module 301 , and the combined data may be sent to the second action module 303 .
  • the second action module 303 can continue to process the processed data 3 .
  • the set of instructions for processing the processed data 3 may be determined by the matching module 301 and sent to the second action module 303 .
  • the set of instructions for processing the processed data 3 may be referred to as the set of instructions 3 .
  • the type of instructions contained in the instruction set 3 is the same as the type of instructions contained in the instruction set 2 in the case where the data to be processed is the second type of data to be processed.
  • the instruction set 3 includes at least one multi-operation instruction.
  • the instruction set 3 may also include at least one unary operation instruction. Each unary operation instruction in the at least one unary operation instruction corresponds to one or more multiple operation instructions, and each unary operation instruction is executed after the corresponding multiple operation instruction.
  • The processed data obtained after the second action module 303 executes the instruction set 3 on the processed data 3 may be referred to as processed data 4 .
  • the data output by the second action module 303 can be output through the output interface 305 .
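  • The data flow of the integrated circuit 300 described above can be sketched as follows. This is a simplified Python model under stated assumptions: header fields are modeled as a dict, instructions as callables, and the "combining" of the first action module's output with the matching module's output as a dict merge. The helper names `mov`, `add`, `run`, and `process`, and the `match_output` parameter, are hypothetical.

```python
def mov(dst, src):
    """Unary instruction: copy one field into another."""
    return lambda fields: {**fields, dst: fields[src]}


def add(dst, a, b):
    """Multiple operation instruction: add two fields."""
    return lambda fields: {**fields, dst: fields[a] + fields[b]}


def run(instructions, fields):
    """Execute an ordered instruction set on a copy of the fields."""
    fields = dict(fields)
    for instruction in instructions:
        fields = instruction(fields)
    return fields


def process(fields, is_first_type, set2, set3=(), match_output=None):
    """Model of FIG. 3: the first action module feeds the second one."""
    if is_first_type:
        processed3 = run(set2, fields)                       # first action module
        combined = {**(match_output or {}), **processed3}    # assumed merge rule
        return run(set3, combined)                           # second action module
    return run(set2, fields)                                 # second action module only


out = process({"a": 1, "b": 2}, True, [mov("c", "a")], [add("d", "c", "b")])
print(out)  # {'a': 1, 'b': 2, 'c': 1, 'd': 3}
```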
  • FIG. 4 is a schematic structural block diagram of an integrated circuit.
  • the integrated circuit 400 shown in FIG. 4 includes a matching module 401 and an action module 402 .
  • the integrated circuit 400 also includes an input interface 403 and an output interface 404 .
  • the integrated circuit shown in FIG. 4 only includes one action module.
  • the matching module 401 no longer distinguishes the types of the data to be processed, but sends all the data to be processed to the action module 402 .
  • FIG. 5 is a schematic structural block diagram of a first action module provided by an embodiment of the present application.
  • the first action module 500 shown in FIG. 5 includes at least one instruction execution unit 510 .
  • Each instruction execution unit in the at least one instruction execution unit includes a source operand multiplexer 511 , an instruction executor 512 and a destination operand demultiplexer 513 .
  • the first action module 500 shown in FIG. 5 may be the first action module 202 shown in FIG. 2 or the first action module 302 shown in FIG. 3 .
  • the instruction execution unit shown in FIG. 5 includes only one source operand multiplexer, while the instruction execution unit 100 shown in FIG. 1 includes two source operand multiplexers. Therefore, each instruction execution unit in the first action module includes fewer source operand multiplexers.
  • the second action module 203 shown in FIG. 2 , the second action module 303 shown in FIG. 3 , and the action module 402 shown in FIG. 4 may include at least one instruction execution unit 100 shown in FIG. 1 .
  • M 1 K 1 +K 2
  • since the instruction executor 512 shown in FIG. 5 only needs to execute unary operation instructions, the instruction executor 512 may be a logic circuit that can only be used for executing unary operation instructions. The instruction executor 512 can use simpler circuits than an ALU that needs to support both unary and multiple operation instructions.
  • the first action module only executes unary operation instructions, and the multiple operation instructions and the unary operation instructions related to the multiple operation instructions are all executed by the second action module. In this way, if there are many unary operation instructions among the instructions to be executed, the integrated circuits shown in FIG. 2 and FIG. 3 can effectively reduce circuit area.
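  • The structural difference between the two kinds of instruction execution unit can be sketched in software. The following Python model is illustrative only: the class names, the `SOURCE_MUXES` attribute, and the list-based model of operand storage are assumptions, and a software model says nothing quantitative about circuit area.

```python
class UnaryExecutionUnit:
    """Model of the FIG. 5 unit: one source mux, an executor, one destination demux."""
    SOURCE_MUXES = 1

    def __init__(self, operation):
        self.operation = operation  # e.g. a move or shift

    def execute(self, sources, src_select, destinations, dst_select):
        operand = sources[src_select]        # source operand multiplexer
        result = self.operation(operand)     # instruction executor
        destinations[dst_select] = result    # destination operand demultiplexer


class BinaryExecutionUnit:
    """Model of the FIG. 1 unit: two source muxes, an ALU, one destination demux."""
    SOURCE_MUXES = 2

    def __init__(self, operation):
        self.operation = operation  # e.g. add, subtract, AND

    def execute(self, sources, src_a, src_b, destinations, dst_select):
        a = sources[src_a]                   # source operand multiplexer 101
        b = sources[src_b]                   # source operand multiplexer 102
        destinations[dst_select] = self.operation(a, b)  # ALU 103, demux 104


sources, destinations = [5, 7], [0, 0]
UnaryExecutionUnit(lambda x: x << 1).execute(sources, 1, destinations, 0)
BinaryExecutionUnit(lambda a, b: a + b).execute(sources, 0, 1, destinations, 1)
print(destinations)  # [14, 12]
```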
  • FIG. 6 is a schematic diagram of a chip provided according to an embodiment of the present application.
  • the chip 600 shown in FIG. 6 includes a plurality of processing units 601 .
  • each processing unit 601 of the plurality of processing units 601 shown in FIG. 6 may be the integrated circuit 200 shown in FIG. 2 .
  • each processing unit in the plurality of processing units shown in FIG. 6 may be the integrated circuit 300 shown in FIG. 3 .
  • the structure of some of the processing units in the plurality of processing units shown in FIG. 6 may be the integrated circuit 200 shown in FIG. 2 , and the structure of another part of the processing units may be the integrated circuit 300 shown in FIG. 3 .
  • the structure of part of the processing units may be the integrated circuit 400 , and the structure of part of the processing units may be the integrated circuit 200 or the integrated circuit 300 .
  • An embodiment of the present application further provides a chip, and the chip may include only one integrated circuit 200 as shown in FIG. 2 or one integrated circuit 300 as shown in FIG. 3 .
  • the integrated circuit 200 shown in FIG. 2 and the integrated circuit 300 shown in FIG. 3 can be considered as one chip.
  • a switch is a typical electronic device that executes many unary operation instructions.
  • the core problem to be solved by a switch is the processing and forwarding of packets from one port to another.
  • the switch needs to determine the destination address, port number and other information by looking up the table, and write the determined information into the packet header.
  • the act of writing information into the header or intermediate data can usually be accomplished with unary operation instructions. Therefore, the chip responsible for processing packets in the switch may be the chip 600 shown in FIG. 6 .
  • the packet referred to in the embodiments of this application may also be referred to as a message or a data packet, and may be a network-layer packet or a link-layer packet, such as an Ethernet frame.
  • FIG. 7 is a schematic structural block diagram of a switch chip provided according to an embodiment of the present application.
  • the switch chip 700 shown in FIG. 7 is a programmable chip.
  • the chip 700 shown in FIG. 7 includes a programmable parser (programmable parser) 701 , a programmable match-action pipeline (programmable match-action pipeline) 702 and a programmable deparser (programmable deparser) 703 .
  • the programmable match-action pipeline includes two match-action units (match-action units, MAUs) 704 and 705 , and the MAUs 704 and 705 are also referred to as the MA node 704 and the MA node 705 .
  • the MA node 704 and the MA node 705 shown in FIG. 7 may both be the integrated circuit shown in FIG. 2 or both be the integrated circuit shown in FIG. 3 , or one of the MA node 704 and the MA node 705 is the integrated circuit shown in FIG. 2 and the other is the integrated circuit shown in FIG. 3 .
  • the programmable parser 701 is responsible for parsing the packet into packet headers that can be identified and processed by the switch chip, such as source/destination Internet Protocol (IP) addresses, source/destination media access control (MAC) addresses, source/destination port numbers, and other information.
  • the packet header parsed by the programmable parser 701 may pass through the programmable match-action pipeline 702 .
  • the MA nodes in the programmable match-action pipeline 702 process the corresponding information in the packet according to a match-action table (MAT).
  • Each MAT contains one or more entries (also called rules).
  • the entry contains the key used for matching.
  • Actions corresponding to entries may include adding, deleting, modifying, or nulling.
  • the fields that can be matched in the MAT and the control flow between them, and the range of allowable actions can be specified by a program pre-written and stored in the chip 700 .
  • the program can also specify the structure of each possible header and a decision graph expressing ordering and dependencies.
  • each MA node in the programmable match-action pipeline 702 can process the data in the packet header according to the MAT specified by the program and the data processing sequence and actions in the decision diagram.
  • the programmable deparser 703 writes the processed header back into the packet before the packet is sent out to the appropriate port.
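  • The match-action step described above can be sketched as a table lookup followed by an action on the parsed header. The following Python sketch is a simplified illustration: the field names, the forwarding table contents, and the helpers `apply_mat` and `set_field` are hypothetical, and only an exact-match key with a "modify" action is modeled.

```python
def apply_mat(table, key_field, header):
    """Look up header[key_field] in the table and apply the matched action."""
    action = table.get(header.get(key_field))
    if action is None:
        return dict(header)  # no entry matched: leave the header unchanged
    return action(dict(header))


def set_field(name, value):
    """Build a 'modify' action that writes value into one header field."""
    def action(header):
        header[name] = value
        return header
    return action


# Hypothetical forwarding table: destination IP -> action writing an egress port.
FORWARDING_MAT = {
    "10.0.0.1": set_field("egress_port", 3),
    "10.0.0.2": set_field("egress_port", 7),
}

header = {"dst_ip": "10.0.0.2", "src_ip": "10.0.0.9"}
result = apply_mat(FORWARDING_MAT, "dst_ip", header)
print(result["egress_port"])  # 7
```

  • Writing a looked-up value into a header field is a data move, which is the kind of unary operation the first action module is intended to handle.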
  • The MA node assigns unary operations and multi-operand operations (including the unary operations that depend on a multi-operand operation) to different action modules for execution.
  • The decision graph can then be divided into three consecutive parts: the matching part, the unary operation part, and the multi-operand operation part.
  • The original decision graph of a packet header often does not have this three-part form (a graph that does can be called the optimized decision graph). The original decision graph can therefore be transformed into an optimized decision graph through a swap operation.
  • Figure 8 is a schematic diagram of the original decision diagram.
  • Figure 9 is a schematic flow diagram of the swap operation.
  • FIG. 10 is a schematic diagram of an optimized decision diagram. The optimized decision diagram shown in FIG. 10 is obtained from the original decision diagram shown in FIG. 8 by applying the swap operation shown in FIG. 9.
  • action 1, action 3, action 4, action 5, and action 6 are actions performed by unary operation instructions
  • action 2 is the action performed by a multi-operand operation instruction.
  • Three paths, path 1, path 2, and path 3 can be obtained through the swap operation as shown in FIG. 9 .
  • Path 1 is executed when condition a and condition b are met, and action 1, action 4, action 2, and action 6 need to be executed in sequence.
  • action 6 is an action performed by a unary operation instruction
  • action 6 needs to be performed after action 2
  • action 2 is an action performed by a multi-operand operation instruction. Therefore, action 6 belongs to the multi-operand operation part.
  • Action 1 and Action 4 are actions performed by the unary operation instruction and the multi-operation instruction does not need to be executed before the execution of Action 1 and Action 4 . Therefore, action 1 and action 4 belong to the unary operation part.
  • Path 2 is executed when condition a is met and condition b is not met, and action 1, action 5, action 2, and action 6 need to be executed in sequence. As in Path 1, Action 6 in Path 2 is executed after Action 2, so Action 6 also belongs to the multi-operand operation part.
  • Path 3 is executed when condition a is not satisfied, and action 1, action 3, and action 6 need to be executed in sequence.
  • action 6 is executed after action 1 and action 3, which are both actions performed by the unary operation instruction. Therefore, action 6 belongs to the unary operation part.
  • According to the decision graph shown in FIG. 10, if the matching module determines that condition a and condition b are both met, each action in path 1 can be executed, where action 1 and action 4 can be executed by the first action module and action 2 and action 6 can be executed by the second action module; if the matching module determines that condition a is met and condition b is not met, each action in path 2 can be executed, where action 1 and action 5 can be executed by the first action module and action 2 and action 6 can be executed by the second action module; if the matching module determines that condition a is not met, each action in path 3 can be executed, where action 1, action 3, and action 6 are all executed by the first action module.
  • FIB forward information database
  • VRF virtual routing forwarding
  • the programmable parser 701 can parse the packet to obtain a packet header including the destination IP address and the VRF, and input the packet header into the programmable match-action pipeline 702 .
  • the data input to and output by the various MA units in the programmable match-action pipeline 702 may be referred to as metadata.
  • The MA node 704 can extract the destination IP address and the VRF from the metadata (that is, the packet header parsed by the programmable parser 701); query the FIB table according to the destination IP address and the VRF to determine the next hop information; write the next hop information to the metadata; and output the metadata.
  • the matching module in the MA node 704 can be used to extract the destination IP address and the VRF from the metadata; according to the destination IP address and the VRF, the FIB table is queried to determine the next hop information.
  • Writing the next hop information to the metadata can be achieved by the MOV instruction. In other words, writing the next hop information to the metadata can be accomplished with a single unary operation instruction. Therefore, the matching module in the MA node 704 can send the determined next hop information to the first action module, which can be used to write the next hop information to the metadata.
  • The MA node 705 obtains the metadata output by the MA node 704 and extracts the next hop information and the TTL value from it; queries the rewrite table according to the next hop information to obtain the destination MAC address of the next hop, and writes the destination MAC address to the metadata; decrements the TTL value by 1, writes it back to the metadata, and outputs the metadata.
  • the matching module in the MA node 705 can be used to extract the next hop information from the metadata; query the rewrite table according to the next hop information to obtain the next hop destination MAC address.
  • Writing the destination MAC address to the metadata can be implemented by the MOV instruction. In other words, writing the destination MAC address to the metadata can be accomplished with a single unary operation instruction. Therefore, the matching module in the MA node 705 may send the determined destination MAC address to the first action module, which may be used to write the destination MAC address to the metadata. Decrementing the TTL value by 1 is a binary operation, so it must be performed by the second action module. In this case, the second action module in the MA node 705 can extract the TTL value from the metadata, decrement it by 1, and write it back to the metadata.
  • the programmable inverse parser 703 determines the message header according to the metadata output by the MA node 705, and rewrites the message header to the message.
  • The switch chip shown in FIG. 7 is only an example to help those skilled in the art better understand the embodiments of the present application. The chips/integrated circuits provided in the embodiments of the present application can also be applied to other switch chips, such as programmable switch chips that include more MA nodes, or to non-programmable switch chips.
  • the chip/integrated circuit in the embodiments of the present application may also be applied to other chips that need to process more unary operations, such as a control chip of a storage device.
  • The chip in the embodiments of the present application may be a network processor (NP), a field programmable gate array (FPGA), an application-specific integrated circuit (ASIC), a system on chip (SoC), a digital signal processor (DSP), a microcontroller unit (MCU), a programmable logic device (PLD), or the like.
  • the first action module may only acquire the response data determined by the matching unit.
  • The response data obtained by the first action module in the MA node 704 is the next hop information determined by the matching module; the response data obtained by the first action module in the MA node 705 is the destination MAC address of the next hop determined by the matching module.
  • the first action module may not need to obtain metadata.
  • the first action module may include at least one instruction execution unit, and each instruction execution unit includes a source operand multiplexer.
  • the source operand multiplexer may be an N-to-1 selector, where the value of N is the maximum value of the storage space corresponding to the response data generated by the matching module.
  • the memory space accessible to the source operand multiplexer is the memory space used to store the response data generated by the matching module.
  • For example, if the chip uses 4 bytes to store the response data, the value of N is 4. Assume that the chip includes 32 physical addresses, namely physical address 0 to physical address 31, of which physical address 0 to physical address 3 are used to store response data and physical address 4 to physical address 31 are used to store metadata.
  • the source operand multiplexer may only need to be able to access physical address 0 to physical address 3. In other words, the source operand multiplexer may not need to access physical address 4 to physical address 31. Since the source operand multiplexer may not need to access the physical address for storing metadata, the structure of the source operand multiplexer may be simpler, thereby further reducing the complexity and cost of the first action module.
  • the second action module may include at least one instruction execution unit as shown in FIG. 1 .
  • the storage space accessible to the source operand multiplexer in the instruction execution unit in the second action module includes a storage space for storing response data generated by the matching module and a storage space for storing metadata.
  • For example, if the chip uses 4 bytes to store the response data, the value of N is 4. Assume that the chip includes 32 physical addresses, namely physical address 0 to physical address 31, of which physical address 0 to physical address 3 are used to store response data and physical address 4 to physical address 31 are used to store metadata. Then all of physical address 0 to physical address 31 can be accessed by this source operand multiplexer.
  • FIG. 11 is a schematic flowchart of a method for processing data according to an embodiment of the present application.
  • 1101. Determine the type of the data to be processed and a first instruction set for processing the data to be processed, where the type of the data to be processed is first-type data to be processed or second-type data to be processed; when the type is the first type, every instruction of the at least one instruction included in the first instruction set is a unary operation instruction, and when the type is the second type, the first instruction set includes at least one multi-operand operation instruction.
  • 1102. Execute the instructions in the first instruction set on the data to be processed to obtain first processed data.
  • the method shown in FIG. 11 may be performed by an integrated circuit as shown in FIG. 2 or FIG. 3 .
  • the matching module in the integrated circuit 200 or the integrated circuit 300 may be responsible for performing step 1101, ie, determining the type of the data to be processed and the first set of instructions for processing the data to be processed.
  • Step 1102 may be performed by the first action module or the second action module.
  • If the type of the data to be processed is the first type, step 1102 is executed by the first action module.
  • the matching module needs to send the data to be processed and the first instruction set to the first action module.
  • the first action module executes the instructions in the first instruction set on the to-be-processed data to obtain the first processed data.
  • If the type of the data to be processed is the second type, step 1102 is executed by the second action module.
  • the matching module needs to send the data to be processed and the first instruction set to the second action module.
  • the second action module executes the instructions in the first instruction set on the to-be-processed data to obtain the first processed data.
  • Optionally, when the type of the data to be processed is the second type, the first instruction set further includes at least one unary operation instruction, where each of the at least one unary operation instruction corresponds to one or more of the at least one multi-operand operation instruction and is executed after the corresponding multi-operand operation instruction.
  • Optionally, if the method shown in FIG. 11 is implemented by the integrated circuit shown in FIG. 3, the method further includes: determining a second instruction set for processing the first processed data, and executing the instructions in the second instruction set on the first processed data to obtain second processed data.
  • Embodiments of the present application further provide a computer device, where the computer device includes the integrated circuit or chip as shown in the foregoing embodiments.
  • the computer device may be a switch including a chip 700 as shown in FIG. 7 .
  • The present application further provides a computer program product. The computer program product includes computer program code that, when run on a computer, causes the computer to execute the steps in the above embodiments.
  • The present application further provides a computer-readable medium. The computer-readable medium stores program code that, when run on a computer, causes the computer to execute the steps in the above embodiments.
  • the disclosed system, apparatus and method may be implemented in other manners.
  • the apparatus embodiments described above are only illustrative.
  • the division of the units is only a logical function division. In actual implementation, there may be other division methods.
  • For example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not implemented.
  • the shown or discussed mutual coupling or direct coupling or communication connection may be through some interfaces, indirect coupling or communication connection of devices or units, and may be in electrical, mechanical or other forms.
  • the units described as separate components may or may not be physically separated, and components displayed as units may or may not be physical units, that is, may be located in one place, or may be distributed to multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution in this embodiment.
  • each functional unit in each embodiment of the present application may be integrated into one processing unit, or each unit may exist physically alone, or two or more units may be integrated into one unit.
  • the functions, if implemented in the form of software functional units and sold or used as independent products, may be stored in a computer-readable storage medium.
  • Based on this understanding, the technical solution of the present application in essence, or the part that contributes to the prior art, or a part of the technical solution, can be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to execute all or some of the steps of the methods described in the embodiments of the present application.
  • The aforementioned storage medium includes a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disc, and other media that can store program code.

Abstract

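This application provides a chip, a data processing method, and a computer device. The chip includes a matching module, a first action module, and a second action module. The first action module is dedicated to processing unary operation instructions, and the second action module processes multi-operand operation instructions. If the matching module determines that all instructions for processing the data to be processed are unary operation instructions, it may send the data to the first action module for processing; if the instructions for processing the data include a multi-operand operation instruction, it may send the data to the second action module for processing. Compared with an action module that can process both unary and multi-operand operation instructions, an action module that can process only unary operation instructions has a simpler structure. The chip can therefore implement the same functions as an existing chip in a smaller area, which can reduce the manufacturing cost of the chip and improve chip performance.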

Description

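Chip, data processing method, and computer device
This application claims priority to Chinese patent application No. 202110435405.8, filed with the China Patent Office on April 22, 2021 and entitled "Chip architecture and device", and to Chinese patent application No. 202110698347.8, filed with the China National Intellectual Property Administration on June 23, 2021 and entitled "Chip, data processing method, and computer device", both of which are incorporated herein by reference in their entirety.
Technical Field
This application relates to the field of chip technologies, and more specifically, to a chip, a data processing method, and a computer device.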
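Background
An arithmetic logic unit (ALU) is a combinational digital logic circuit that can perform arithmetic or bitwise operations on binary integers. The ALU is a basic building block of many computing circuits, including central processing units (CPUs), graphics processing units (GPUs), and network processors (network processing units, NPs).
An ALU has various input and output nets that carry digital signals between the ALU and external circuits. When the ALU operates, an external circuit applies signals to the ALU inputs, and the ALU produces an operation result and outputs it to the external circuit through its outputs.
An ALU usually supports many basic arithmetic and bitwise logic functions. A basic general-purpose ALU usually supports arithmetic operations (for example, addition and subtraction), bitwise logic operations (for example, AND and OR), and shift operations.
FIG. 1 is a structural diagram of an instruction execution unit that contains an ALU. The instruction execution unit 100 shown in FIG. 1 includes two source operand multiplexers, namely source operand multiplexer 101 and source operand multiplexer 102. The instruction execution unit 100 further includes an ALU 103 and a destination operand output multiplexer 104.
The instruction execution units in computing circuits such as CPUs, GPUs, and NPs are all instruction execution units 100 as shown in FIG. 1. An ALU usually supports binary operation instructions such as addition, subtraction, and AND. The instruction execution unit 100 therefore needs two source operand multiplexers to fetch the data required by a binary operation instruction.
However, the computing unit shown in FIG. 1 is not suitable for every application scenario.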
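Summary
This application provides a chip, a data processing method, and a computer device that are better suited to scenarios in which unary operations predominate.
According to a first aspect, an embodiment of this application provides a chip. The chip includes a matching module, a first action module, and a second action module. The matching module is configured to determine a type of data to be processed and a first instruction set for processing the data to be processed, where the type of the data to be processed is first-type data to be processed or second-type data to be processed. The matching module is further configured to: when determining that the type of the data to be processed is the first type, indicate the data to be processed and the first instruction set to the first action module, where in this case every instruction of the at least one instruction included in the first instruction set is a unary operation instruction. The matching module is further configured to: when determining that the type of the data to be processed is the second type, indicate the data to be processed and the first instruction set to the second action module, where in this case the first instruction set includes at least one multi-operand operation instruction. The first action module is configured to: when it obtains the data to be processed and the first instruction set, execute the instructions in the first instruction set on the data to be processed to obtain first processed data. The second action module is configured to: when it obtains the data to be processed and the first instruction set, execute the instructions in the first instruction set on the data to be processed to obtain second processed data.
The chip provided by the above technical solution includes two kinds of action modules, one of which is dedicated to processing unary operation instructions. Compared with an action module that can process both unary and multi-operand operation instructions, an action module that can process only unary operation instructions has a simpler structure. The chip can therefore implement the same functions as an existing chip in a smaller area, which can reduce the manufacturing cost of the chip and improve chip performance.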
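In a possible design, when the type of the data to be processed is the second type, the first instruction set further includes at least one unary operation instruction, where each of the at least one unary operation instruction corresponds to one or more of the at least one multi-operand operation instruction and is executed after the corresponding multi-operand operation instruction.
In a possible design, the output data of the first action module is merged with the output data of the second action module and then output.
In a possible design, the output data of the first action module is merged with the output data of the matching module and then input to the second action module.
In a possible design, the matching module is further configured to determine a second instruction set for processing the first processed data and send the second instruction set to the second action module, where the second instruction set includes at least one multi-operand operation instruction; and the second action module is further configured to: when it obtains the second instruction set, obtain the first processed data and execute the instructions in the second instruction set on the first processed data to obtain third processed data.
In a possible design, the first action module includes at least one instruction execution unit. Each of the at least one instruction execution unit includes one source operand multiplexer, one instruction executor, and one destination operand output multiplexer. The storage space that the source operand multiplexer can access is the storage space used to store the response data generated by the matching module.
Because it only needs to access the storage space used to store the response data generated by the matching module, a simpler source operand multiplexer can be chosen, which can further reduce the complexity of the chip.
In a possible design, the unary operation instruction is a data move operation or a data shift operation.
In a possible design, the chip includes at least one processing unit, and each of the at least one processing unit includes the matching module, the first action module, and the second action module.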
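According to a second aspect, an embodiment of this application provides a data processing method, including: determining a type of data to be processed and a first instruction set for processing the data to be processed, where the type of the data to be processed is first-type data to be processed or second-type data to be processed; when the type is the first type, every instruction of the at least one instruction included in the first instruction set is a unary operation instruction, and when the type is the second type, the first instruction set includes at least one multi-operand operation instruction; and executing the instructions in the first instruction set on the data to be processed to obtain first processed data.
The above technical solution can be implemented by the chip provided in the first aspect. The chip includes two kinds of action modules, one of which is dedicated to processing unary operation instructions. Compared with an action module that can process both unary and multi-operand operation instructions, an action module that can process only unary operation instructions has a simpler structure. The chip can therefore implement the same functions as an existing chip in a smaller area, which can reduce the manufacturing cost of the chip and improve chip performance.
In a possible design, when the type of the data to be processed is the second type, the first instruction set further includes at least one unary operation instruction, where each of the at least one unary operation instruction corresponds to one or more of the at least one multi-operand operation instruction and is executed after the corresponding multi-operand operation instruction.
In a possible design, the method further includes: determining a second instruction set for processing the first processed data, and executing the instructions in the second instruction set on the first processed data to obtain second processed data.
In a possible design, the unary operation instruction is a data move operation or a data shift operation.
According to a third aspect, an embodiment of this application provides a computer device that includes the chip in the first aspect or in any possible design of the first aspect.
According to a fourth aspect, an embodiment of this application provides a computer-readable storage medium that stores program code. When the program code is run on a computer, the computer is caused to perform the method in the second aspect or in any possible design of the second aspect.
According to a fifth aspect, an embodiment of this application provides a computer program (product). The computer program product includes computer program code that, when run on a computer, causes the computer to perform the method in the second aspect or in any possible design of the second aspect.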
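Brief Description of Drawings
FIG. 1 is a structural diagram of an instruction execution unit that contains an ALU.
FIG. 2 is a schematic structural diagram of an integrated circuit according to an embodiment of this application.
FIG. 3 is a schematic structural diagram of another integrated circuit according to an embodiment of this application.
FIG. 4 is a schematic structural block diagram of an integrated circuit.
FIG. 5 is a schematic structural block diagram of the first action module according to an embodiment of this application.
FIG. 6 is a schematic diagram of a chip according to an embodiment of this application.
FIG. 7 is a schematic structural block diagram of a switch chip according to an embodiment of this application.
FIG. 8 is a schematic diagram of an original decision graph.
FIG. 9 is a schematic flowchart of the swap operation.
FIG. 10 is a schematic diagram of an optimized decision graph.
FIG. 11 is a schematic flowchart of a data processing method according to an embodiment of this application.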
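Detailed Description
The technical solutions in this application are described below with reference to the accompanying drawings.
In the embodiments of this application, "at least one" means one or more, and "a plurality of" means two or more. "And/or" describes an association between associated objects and indicates that three relationships may exist. For example, "A and/or B" may indicate that only A exists, that both A and B exist, or that only B exists, where A and B may be singular or plural. The character "/" generally indicates an "or" relationship between the associated objects. "At least one of the following" or a similar expression means any combination of the listed items, including any combination of single or plural items. For example, at least one of a, b, or c may indicate a, b, c, a-b, a-c, b-c, or a-b-c, where a, b, and c may each be single or multiple.
In addition, unless otherwise specified, ordinal terms such as "first" and "second" in the embodiments of this application are used to distinguish between multiple objects and do not limit their order, timing, priority, or importance.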
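FIG. 2 is a schematic structural diagram of an integrated circuit according to an embodiment of this application. The integrated circuit 200 shown in FIG. 2 may include a matching module 201, a first action module 202, and a second action module 203. The integrated circuit 200 further includes an input interface 204 and an output interface 205.
The matching module 201 is configured to obtain input data from the input interface 204 and determine the data to be processed according to the input data.
The matching module 201 is further configured to determine the type of the data to be processed.
The type of the data to be processed may be first-type data to be processed or second-type data to be processed.
The matching module 201 is further configured to determine the instruction set for processing the data to be processed. For ease of description, this instruction set is referred to as instruction set 1. Instruction set 1 may include one or more instructions.
If the type of the data to be processed is the first type, all instructions included in instruction set 1 are unary operation instructions. The unary operation instructions may include a data move operation (move, MOV) and data shift operations.
Data shift operations may include non-circular shifts (for example, logical shift left (SHL), logical shift right (SHR), arithmetic shift left (SAL), and arithmetic shift right (SAR)) and circular shifts (for example, rotate left (ROL), rotate right (ROR), rotate through carry left (RCL), and rotate through carry right (RCR)).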
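To make the listed unary shift operations concrete, the following minimal sketch models a few of them on an 8-bit value. The 8-bit width, the function names, and the choice to leave out the rotate-through-carry variants (RCL, RCR) are assumptions made for illustration; the source does not specify an operand width or an implementation.

```python
WIDTH = 8                    # assumed operand width, for illustration only
MASK = (1 << WIDTH) - 1      # 0xFF for an 8-bit operand


def shl(x: int, n: int) -> int:
    """Logical shift left: bits shifted past the top are dropped."""
    return (x << n) & MASK


def shr(x: int, n: int) -> int:
    """Logical shift right: zeros are shifted in from the top."""
    return (x & MASK) >> n


def sar(x: int, n: int) -> int:
    """Arithmetic shift right: the sign bit is replicated."""
    x &= MASK
    n = min(n, WIDTH)
    result = x >> n
    if x >> (WIDTH - 1):     # sign bit is set, so fill the top n bits with ones
        result |= (MASK << (WIDTH - n)) & MASK
    return result


def rol(x: int, n: int) -> int:
    """Rotate left: bits leaving the top re-enter at the bottom."""
    x &= MASK
    n %= WIDTH
    return ((x << n) | (x >> (WIDTH - n))) & MASK


def ror(x: int, n: int) -> int:
    """Rotate right: bits leaving the bottom re-enter at the top."""
    x &= MASK
    n %= WIDTH
    return ((x >> n) | (x << (WIDTH - n))) & MASK
```

Each function takes a single data operand (the shift amount is treated here as part of the instruction), which is why one source operand multiplexer is enough for such instructions.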
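If the type of the data to be processed is the second type, instruction set 1 includes at least one multi-operand operation instruction. Multi-operand operation instructions may include binary operation instructions as well as ternary or higher-arity operation instructions.
In other words, if all the instructions needed to process the data to be processed are unary operation instructions, the data is first-type data to be processed; if the instructions needed to process the data include a multi-operand operation instruction, the data is second-type data to be processed.
In some embodiments, if the type of the data to be processed is the second type, instruction set 1 may also include at least one unary operation instruction. Each of the at least one unary operation instruction corresponds to one or more multi-operand operation instructions and is executed after the corresponding multi-operand operation instruction. In other words, if instruction set 1 includes both multi-operand and unary operation instructions, the unary operation instructions in instruction set 1 need to be executed after the multi-operand operation instructions.
If the matching module 201 determines that the data to be processed is of the first type, it may indicate the data to be processed and instruction set 1 to the first action module 202. The first action module 202 may execute instruction set 1 on the obtained data to be processed to obtain processed data. For ease of description, the processed data determined by the first action module 202 is referred to as processed data 1.
If the matching module 201 determines that the data to be processed is of the second type, it may indicate the data to be processed and instruction set 1 to the second action module 203. The second action module 203 may execute instruction set 1 on the obtained data to be processed to obtain processed data. For ease of description, the processed data determined by the second action module 203 is referred to as processed data 2.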
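The classification and dispatch rule described above can be sketched as follows. This is only an illustration under stated assumptions: the `Instruction` class, the opcode strings, and the module labels are hypothetical names, not part of the source.

```python
from dataclasses import dataclass
from typing import List

# Opcodes treated as unary here: the move and shift operations named in the text.
UNARY_OPCODES = {"MOV", "SHL", "SHR", "SAL", "SAR", "ROL", "ROR", "RCL", "RCR"}


@dataclass(frozen=True)
class Instruction:
    opcode: str  # hypothetical representation, e.g. "MOV" or "SUB"


def classify(instruction_set: List[Instruction]) -> int:
    """Return 1 if every instruction is unary (first type), otherwise 2."""
    return 1 if all(i.opcode in UNARY_OPCODES for i in instruction_set) else 2


def dispatch(instruction_set: List[Instruction]) -> str:
    """Name the action module that should receive the data and the set."""
    if classify(instruction_set) == 1:
        return "first_action_module"
    return "second_action_module"
```

For example, a set containing only `MOV` goes to the first action module, while a set containing `MOV` and `SUB` goes to the second action module as a whole, including its unary instruction.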
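The matching module 201 may send the data to be processed directly to the first action module 202 or the second action module 203.
In some embodiments, the matching module 201 may send the identifier of each instruction in instruction set 1 to the first action module 202 or the second action module 203. If instruction set 1 includes multiple instructions, the matching module 201 may also send the execution order of these instructions to the first action module 202 or the second action module 203. The first action module 202 or the second action module 203 may look up the corresponding instructions according to the received identifiers and execute them on the data to be processed.
In other embodiments, the matching module 201 may send only the identifier of the first instruction to be executed in instruction set 1 (hereinafter, instruction 1) to the first action module 202 or the second action module 203. The first action module 202 or the second action module 203 may determine instruction 1 according to its identifier and execute instruction 1 on the data to be processed. Instruction 1 may include a next-step indication, which indicates either the identifier of the next instruction to be executed (hereinafter, instruction 2) or that the operation is complete. If the next-step indication indicates the identifier of instruction 2, the first action module 202 or the second action module 203 may go on to execute instruction 2 on the data produced by instruction 1. Similarly, instruction 2 may also include a next-step indication. If a next-step indication indicates that the operation is complete, the first action module 202 or the second action module 203 outputs the result of executing the instructions on the data to be processed, namely processed data 1 or processed data 2.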
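The second scheme, in which only the identifier of the first instruction is sent and each instruction carries a next-step indication, can be sketched as a simple chained lookup. The table layout, the use of `None` for "operation complete", and the sample operations are assumptions for illustration. The sketch does not guard against a cyclic chain.

```python
from typing import Callable, Dict, Optional, Tuple

# Hypothetical instruction table held by an action module:
# identifier -> (operation, next identifier or None for "operation complete").
InstructionTable = Dict[int, Tuple[Callable[[int], int], Optional[int]]]


def run_chain(table: InstructionTable, first_id: int, data: int) -> int:
    """Execute instructions starting at first_id, following next-step indications."""
    current: Optional[int] = first_id
    while current is not None:
        operation, next_id = table[current]
        data = operation(data)   # apply this instruction to the running data
        current = next_id        # None ends the chain
    return data


table: InstructionTable = {
    1: (lambda x: (x << 1) & 0xFF, 2),   # instruction 1: shift left by 1, then instruction 2
    2: (lambda x: x >> 2, None),         # instruction 2: shift right by 2, then done
}

result = run_chain(table, first_id=1, data=0b0000_0110)
```

With this design the matching module transmits a single identifier regardless of how many instructions the set contains, at the cost of storing the chaining information with each instruction.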
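The data output by the first action module 202 may be merged with the data output by the second action module 203 and then output through the output interface 205.
FIG. 3 is a schematic structural diagram of another integrated circuit according to an embodiment of this application. The integrated circuit 300 shown in FIG. 3 may include a matching module 301, a first action module 302, and a second action module 303. The integrated circuit 300 further includes an input interface 304 and an output interface 305.
The function of the matching module 301 in the integrated circuit 300 is similar to that of the matching module 201 shown in FIG. 2. The matching module 301 may likewise obtain input data from the input interface 304, determine the data to be processed according to the input data, and determine the type of the data to be processed and the instruction set for processing it.
For ease of description, the instruction set determined by the matching module 301 for processing the data to be processed is referred to as instruction set 2. Instruction set 2 may include one or more instructions.
The matching module 301 determines the type of the data to be processed in a manner similar to the matching module 201. For brevity, details are not repeated here.
Similarly, instruction set 2 is similar to instruction set 1. In other words, if the matching module 301 determines that the data to be processed is of the first type, all instructions included in instruction set 2 are unary operation instructions; if it determines that the data to be processed is of the second type, instruction set 2 includes at least one multi-operand operation instruction. In the latter case, instruction set 2 may also include at least one unary operation instruction, which is executed after the corresponding multi-operand operation instruction.
Similarly, when the matching module 301 determines that the data to be processed is of the first type, it may indicate the data to be processed and instruction set 2 to the first action module 302; when it determines that the data to be processed is of the second type, it may indicate the data to be processed and instruction set 2 to the second action module 303. The matching module 301 indicates the data to be processed and the instruction set in the same way as the matching module 201 shown in FIG. 2. For brevity, details are not repeated here.
The first action module 302 may execute instruction set 2 on the obtained data to be processed to obtain processed data. For ease of description, the processed data determined by the first action module 302 is referred to as processed data 3.
The second action module 303 may execute instruction set 2 on the obtained data to be processed to obtain processed data. For ease of description, the processed data determined by the second action module 303 is referred to as processed data 4.
The processed data determined by the first action module 302 may be merged with the data output by the matching module 301, and the merged data may be sent to the second action module 303. In this case, the second action module 303 may continue to process processed data 3. The instruction set for processing processed data 3 may be determined by the matching module 301 and sent to the second action module 303. For ease of description, this instruction set is referred to as instruction set 3.
The types of instructions in instruction set 3 are the same as those in instruction set 2 when the data to be processed is of the second type. In other words, instruction set 3 includes at least one multi-operand operation instruction. Instruction set 3 may also include at least one unary operation instruction. Each of the at least one unary operation instruction corresponds to one or more multi-operand operation instructions and is executed after the corresponding multi-operand operation instruction.
The processed data obtained after the second action module 303 executes instruction set 3 on processed data 3 may be referred to as processed data 4.
The data output by the second action module 303 may be output through the output interface 305.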
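FIG. 4 is a schematic structural block diagram of an integrated circuit. The integrated circuit 400 shown in FIG. 4 includes a matching module 401 and an action module 402. The integrated circuit 400 further includes an input interface 403 and an output interface 404.
Compared with the integrated circuit shown in FIG. 2 or FIG. 3, the integrated circuit shown in FIG. 4 includes only one action module. In this case, the matching module 401 no longer distinguishes between types of data to be processed and instead sends all data to be processed to the action module 402.
FIG. 5 is a schematic structural block diagram of the first action module according to an embodiment of this application. The first action module 500 shown in FIG. 5 includes at least one instruction execution unit 510. Each of the at least one instruction execution unit includes one source operand multiplexer 511, one instruction executor 512, and one destination operand output multiplexer 513.
The first action module 500 shown in FIG. 5 may be the first action module 202 shown in FIG. 2 or the first action module 302 shown in FIG. 3.
The instruction execution unit shown in FIG. 5 includes only one source operand multiplexer, whereas the instruction execution unit 100 shown in FIG. 1 includes two. Each instruction execution unit in FIG. 5 therefore contains fewer source operand multiplexers.
The second action module 203 shown in FIG. 2, the second action module 303 shown in FIG. 3, and the action module 402 shown in FIG. 4 may each include at least one instruction execution unit 100 as shown in FIG. 1.
Testing shows that, to complete $K_1$ unary operation instructions and $K_2$ multi-operand operation instructions in the same amount of time, the action module 402 shown in FIG. 4 needs $M_1$ instruction execution units as shown in FIG. 1, where $M_1 = K_1 + K_2$; the first action module shown in FIG. 2 or FIG. 3 needs $M_2$ instruction execution units as shown in FIG. 5, and the second action module shown in FIG. 2 or FIG. 3 needs $M_3$ instruction execution units as shown in FIG. 1, where $M_2 = K_1$ and $M_3 = K_2$. The integrated circuit 400 thus needs $2 \times M_1$ source operand multiplexers in total, whereas the integrated circuit 200 or the integrated circuit 300 needs only $M_2 + 2 \times M_3$, and $2 \times M_1$ is greater than $M_2 + 2 \times M_3$. For the same number of instructions, the integrated circuit shown in FIG. 2 or FIG. 3 therefore needs fewer source operand multiplexers than the integrated circuit shown in FIG. 4. In addition, because the instruction executor 512 shown in FIG. 5 only needs to execute unary operation instructions, it can be a logic circuit that can execute only unary operation instructions. Compared with an ALU that must support both unary and multi-operand operation instructions, the instruction executor 512 can use a simpler circuit.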
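The multiplexer comparison in this paragraph can be checked by substituting the stated relations $M_1 = K_1 + K_2$, $M_2 = K_1$, and $M_3 = K_2$:

```latex
\begin{aligned}
2M_1 &= 2(K_1 + K_2) = 2K_1 + 2K_2,\\
M_2 + 2M_3 &= K_1 + 2K_2,\\
2M_1 - (M_2 + 2M_3) &= K_1 > 0 \quad \text{whenever } K_1 \ge 1.
\end{aligned}
```

The saving therefore equals the number of unary operation instructions. As an illustrative case with assumed values $K_1 = 6$ and $K_2 = 2$ (not taken from the source), the single-module design needs $2 \times 8 = 16$ source operand multiplexers, while the split design needs $6 + 2 \times 2 = 10$.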
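In the integrated circuits shown in FIG. 2 and FIG. 3, the first action module executes only unary operation instructions, while multi-operand operation instructions and the unary operation instructions related to them are executed by the second action module. If many of the instructions to be executed are unary operation instructions, the integrated circuits shown in FIG. 2 and FIG. 3 can therefore effectively reduce the circuit area.
FIG. 6 is a schematic diagram of a chip according to an embodiment of this application. The chip 600 shown in FIG. 6 includes multiple processing units 601.
In some embodiments, each of the multiple processing units 601 shown in FIG. 6 may have the structure of the integrated circuit 200 shown in FIG. 2.
In other embodiments, each of the multiple processing units shown in FIG. 6 may have the structure of the integrated circuit 300 shown in FIG. 3.
In other embodiments, some of the multiple processing units shown in FIG. 6 may have the structure of the integrated circuit 200 shown in FIG. 2, and the others may have the structure of the integrated circuit 300 shown in FIG. 3.
In other embodiments, some of the multiple processing units shown in FIG. 6 may have the structure of the integrated circuit 400, and the others may have the structure of the integrated circuit 200 or the integrated circuit 300.
An embodiment of this application further provides a chip that may include only one integrated circuit 200 as shown in FIG. 2 or one integrated circuit 300 as shown in FIG. 3. In other words, in this case the integrated circuit 200 shown in FIG. 2 or the integrated circuit 300 shown in FIG. 3 can itself be regarded as a chip.
A switch is a typical electronic device that executes many unary operation instructions. The core problem a switch solves is processing and forwarding packets from one port to another. While processing and forwarding a packet, the switch needs to look up tables to determine certain information and write that information into the packet header or into intermediate data produced while the header is processed. For example, the switch determines the destination address, port number, and similar information by table lookup and writes it into the packet header. Writing information into the packet header or intermediate data can usually be done with unary operation instructions. The chip responsible for processing packets in a switch can therefore be the chip 600 shown in FIG. 6.
A packet in the embodiments of this application may also be called a message, a data packet, or a datagram. It may be a network-layer packet or a link-layer packet such as an Ethernet frame.
The chip provided in this application is described below using a switch chip as an example.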
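FIG. 7 is a schematic structural block diagram of a switch chip according to an embodiment of this application. The switch chip 700 shown in FIG. 7 is a programmable chip. The chip 700 includes a programmable parser 701, a programmable match-action pipeline 702, and a programmable deparser 703.
As shown in FIG. 7, the programmable match-action pipeline includes two match-action units (MAUs) 704 and 705, which are also referred to as MA node 704 and MA node 705. MA node 704 and MA node 705 may each be the integrated circuit shown in FIG. 2 or FIG. 3, or one of them may be the integrated circuit shown in FIG. 2 and the other the integrated circuit shown in FIG. 3.
The programmable parser 701 is responsible for parsing a packet into header fields that the switch chip can recognize and process, such as the source/destination Internet Protocol (IP) addresses, the source/destination media access control (MAC) addresses, and the source/destination port numbers.
The packet header parsed by the programmable parser 701 passes through the programmable match-action pipeline 702. The MA nodes in the pipeline process the corresponding information in the packet according to match action tables (MATs). Each MAT contains one or more entries (also called rules). An entry contains a key used for matching. When a piece of information in the packet header matches the key in an entry, the action corresponding to that entry can be performed. The action may be adding, deleting, modifying, or taking no action. The fields that can be matched in a MAT, the control flow between them, and the range of permitted actions can be specified by a program written in advance and stored in the chip 700. The program can also specify the structure of every possible packet header and a decision graph that expresses ordering and dependencies. In other words, each MA node in the programmable match-action pipeline 702 can process the data in the packet header according to the MAT specified by the program and the processing order and actions in the decision graph.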
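As a rough illustration of how a match action table pairs keys with actions, the following sketch looks up one header field in a list of entries and applies the matching entry's action. The dictionary-based header, the helper names `modify`, `no_op`, and `apply_mat`, and the sample addresses are all hypothetical; real switch chips implement matching in hardware tables rather than a linear scan.

```python
from typing import Callable, Dict, List, Tuple

Header = Dict[str, object]
Action = Callable[[Header], Header]


def modify(field: str, value: object) -> Action:
    """Build a 'modify' action that sets one header field."""
    def act(header: Header) -> Header:
        updated = dict(header)   # leave the input header untouched
        updated[field] = value
        return updated
    return act


def no_op(header: Header) -> Header:
    """The 'no action' case: return the header unchanged."""
    return header


def apply_mat(entries: List[Tuple[object, Action]], header: Header, field: str) -> Header:
    """Apply the action of the first entry whose key equals header[field]."""
    for key, action in entries:
        if header.get(field) == key:
            return action(header)
    return header                # no entry matched


mat = [
    ("10.0.0.1", modify("out_port", 3)),
    ("10.0.0.2", no_op),
]
```

For instance, `apply_mat(mat, {"dst_ip": "10.0.0.1"}, "dst_ip")` returns the header with `out_port` set to 3, while a header whose destination matches no entry is returned unchanged.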
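Finally, before the packet is sent out of the appropriate port, the programmable deparser 703 writes the packet header back into the packet.
As described above, in the embodiments of this application an MA node assigns unary operations and multi-operand operations (including the unary operations related to multi-operand operations) to different action modules. In this case, the decision graph can be divided into three consecutive parts: the matching part, the unary operation part, and the multi-operand operation part. However, the original decision graph of a packet header often differs from a decision graph with these three parts (which can be called the optimized decision graph). The original decision graph can therefore be converted into the optimized decision graph through a swap operation.
FIG. 8 is a schematic diagram of an original decision graph. FIG. 9 is a schematic flowchart of the swap operation. FIG. 10 is a schematic diagram of the optimized decision graph. The optimized decision graph shown in FIG. 10 is obtained from the original decision graph shown in FIG. 8 by applying the swap operation shown in FIG. 9.
In the original decision graph shown in FIG. 8, action 1, action 3, action 4, action 5, and action 6 are actions performed by unary operation instructions, and action 2 is an action performed by a multi-operand operation instruction. The swap operation shown in FIG. 9 yields three paths: path 1, path 2, and path 3.
Path 1 is executed when condition a and condition b are both met and requires action 1, action 4, action 2, and action 6 to be executed in sequence. Although action 6 is performed by a unary operation instruction, it must be executed after action 2, and action 2 requires a multi-operand operation instruction. Action 6 therefore belongs to the multi-operand operation part. Action 1 and action 4 are performed by unary operation instructions, and no multi-operand operation instruction needs to be executed before them. Action 1 and action 4 therefore belong to the unary operation part.
Path 2 is executed when condition a is met and condition b is not met and requires action 1, action 5, action 2, and action 6 to be executed in sequence. As in path 1, action 6 in path 2 is executed after action 2, so action 6 also belongs to the multi-operand operation part.
Path 3 is executed when condition a is not met and requires action 1, action 3, and action 6 to be executed in sequence. In path 3, action 6 is executed after action 1 and action 3, both of which are performed by unary operation instructions. Action 6 therefore belongs to the unary operation part.
According to the decision graph shown in FIG. 10, if the matching module determines that condition a and condition b are both met, the actions in path 1 can be executed, where action 1 and action 4 can be executed by the first action module and action 2 and action 6 can be executed by the second action module; if the matching module determines that condition a is met and condition b is not met, the actions in path 2 can be executed, where action 1 and action 5 can be executed by the first action module and action 2 and action 6 can be executed by the second action module; if the matching module determines that condition a is not met, the actions in path 3 can be executed, where action 1, action 3, and action 6 are all executed by the first action module.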
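The assignment of actions to the two action modules for each path can be reproduced with a small sketch. It assumes, as the module assignment given for FIG. 10 implies, that action 2 is the only action requiring a multi-operand operation instruction; the function name and the integer encoding of actions are hypothetical.

```python
from typing import List, Set, Tuple


def split_path(path: List[int], multi_operand: Set[int]) -> Tuple[List[int], List[int]]:
    """Split an ordered path into (unary part, multi-operand part).

    Actions before the first multi-operand action form the unary part
    (first action module). The first multi-operand action and everything
    after it, including later unary actions, form the multi-operand part
    (second action module).
    """
    for index, action in enumerate(path):
        if action in multi_operand:
            return path[:index], path[index:]
    return path, []              # no multi-operand action on this path


MULTI_OPERAND_ACTIONS = {2}      # assumption: only action 2 is multi-operand

path_1 = split_path([1, 4, 2, 6], MULTI_OPERAND_ACTIONS)
path_2 = split_path([1, 5, 2, 6], MULTI_OPERAND_ACTIONS)
path_3 = split_path([1, 3, 6], MULTI_OPERAND_ACTIONS)
```

Under that assumption the split gives actions 1 and 4 to the first module and actions 2 and 6 to the second for path 1, actions 1 and 5 versus 2 and 6 for path 2, and all of actions 1, 3, and 6 to the first module for path 3.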
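Taking the chip 700 shown in FIG. 7 as an example, assume the following operations are performed on a packet: the forwarding information base (FIB) table is queried according to the packet's destination IP address and virtual routing and forwarding (VRF) instance to determine the next-hop information; the next-hop information is used to query the rewrite table to obtain the next-hop destination MAC address; and the time to live (TTL) value is decremented by 1.
The programmable parser 701 may parse the packet to obtain a packet header containing the destination IP address and the VRF, and input the packet header into the programmable match-action pipeline 702. The data input to and output by each MA node in the programmable match-action pipeline 702 may be referred to as metadata.
MA node 704 may extract the destination IP address and the VRF from the metadata (that is, the packet header parsed by the programmable parser 701); query the FIB table according to the destination IP address and the VRF to determine the next-hop information; write the next-hop information into the metadata; and output the metadata.
The matching module in MA node 704 may be used to extract the destination IP address and the VRF from the metadata and query the FIB table according to them to determine the next-hop information. Writing the next-hop information into the metadata can be implemented with a MOV instruction, that is, with a single unary operation instruction. The matching module in MA node 704 may therefore send the determined next-hop information to the first action module, which writes the next-hop information into the metadata.
MA node 705 obtains the metadata output by MA node 704 and extracts the next-hop information and the TTL value from it; queries the rewrite table according to the next-hop information to obtain the next-hop destination MAC address, and writes the destination MAC address into the metadata; decrements the TTL value by 1, writes it back into the metadata, and outputs the metadata.
The matching module in MA node 705 may be used to extract the next-hop information from the metadata and query the rewrite table according to it to obtain the next-hop destination MAC address. Writing the destination MAC address into the metadata can be implemented with a MOV instruction, that is, with a single unary operation instruction. The matching module in MA node 705 may therefore send the determined destination MAC address to the first action module, which writes the destination MAC address into the metadata. Decrementing the TTL value by 1 is a binary operation and therefore must be performed by the second action module. In this case, the second action module in MA node 705 may extract the TTL value from the metadata, decrement it by 1, and write it back into the metadata.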
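The two-node forwarding example can be sketched end to end as follows. The table contents, field names, and addresses are hypothetical sample data; the comments mark which step the text attributes to the matching module, the first action module (a MOV), and the second action module (a binary subtraction).

```python
from typing import Dict, Tuple

Metadata = Dict[str, object]

# Hypothetical table contents, for illustration only.
FIB: Dict[Tuple[str, str], str] = {("10.0.0.1", "vrf-a"): "next-hop-1"}
REWRITE: Dict[str, str] = {"next-hop-1": "aa:bb:cc:dd:ee:ff"}


def ma_node_704(metadata: Metadata) -> Metadata:
    # Matching module: extract destination IP and VRF, query the FIB table.
    next_hop = FIB[(str(metadata["dst_ip"]), str(metadata["vrf"]))]
    # First action module: MOV the response data into the metadata (unary).
    out = dict(metadata)
    out["next_hop"] = next_hop
    return out


def ma_node_705(metadata: Metadata) -> Metadata:
    # Matching module: extract the next hop, query the rewrite table.
    dst_mac = REWRITE[str(metadata["next_hop"])]
    out = dict(metadata)
    # First action module: MOV the destination MAC into the metadata (unary).
    out["dst_mac"] = dst_mac
    # Second action module: TTL - 1 takes two source operands (binary).
    out["ttl"] = int(metadata["ttl"]) - 1
    return out


packet_header: Metadata = {"dst_ip": "10.0.0.1", "vrf": "vrf-a", "ttl": 64}
final_metadata = ma_node_705(ma_node_704(packet_header))
```

In this run the final metadata carries the next hop, the rewritten destination MAC address, and a TTL of 63. Only the TTL update needs an execution unit with two source operand multiplexers.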
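The programmable deparser 703 determines the packet header according to the metadata output by MA node 705 and writes the packet header back into the packet.
It can be understood that the switch chip shown in FIG. 7 is only an example to help those skilled in the art better understand the embodiments of this application. The chips/integrated circuits provided in the embodiments of this application can also be applied to other switch chips, for example programmable switch chips that include more MA nodes, or to non-programmable switch chips. They can also be applied to other chips that need to process many unary operations, such as the control chip of a storage device.
The chip in the embodiments of this application may be a network processor (NP), a field programmable gate array (FPGA), an application-specific integrated circuit (ASIC), a system on chip (SoC), a digital signal processor (DSP), a microcontroller unit (MCU), a programmable logic device (PLD), or the like.
In some technical solutions of this application, the first action module may obtain only the response data determined by the matching module. For example, in the two examples above, the response data obtained by the first action module in MA node 704 is the next-hop information determined by the matching module, and the response data obtained by the first action module in MA node 705 is the next-hop destination MAC address determined by the matching module. In other words, the first action module may not need to obtain the metadata.
As described above, the first action module may include at least one instruction execution unit, and each instruction execution unit includes one source operand multiplexer. The source operand multiplexer may be an N-to-1 selector, where N is the maximum size of the storage space for the response data generated by the matching module. The storage space that the source operand multiplexer can access is the storage space used to store the response data generated by the matching module. For example, if the chip uses 4 bytes to store the response data, the value of N is 4. Assume that the chip includes 32 physical addresses, namely physical address 0 to physical address 31, of which physical address 0 to physical address 3 are used to store response data and physical address 4 to physical address 31 are used to store metadata. The source operand multiplexer then only needs to be able to access physical address 0 to physical address 3. In other words, it does not need to access physical address 4 to physical address 31. Because the source operand multiplexer does not need to access the physical addresses used to store metadata, its structure can be simpler, which further reduces the complexity and cost of the first action module.
The second action module may include at least one instruction execution unit as shown in FIG. 1. The storage space that the source operand multiplexers in the instruction execution units of the second action module can access includes both the storage space used to store the response data generated by the matching module and the storage space used to store metadata. For example, if the chip uses 4 bytes to store the response data, the value of N is 4. Assume that the chip includes 32 physical addresses, namely physical address 0 to physical address 31, of which physical address 0 to physical address 3 are used to store response data and physical address 4 to physical address 31 are used to store metadata. Then all of physical address 0 to physical address 31 can be accessed by these source operand multiplexers.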
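The difference in multiplexer reach between the two action modules can be sketched with the address layout from the example. The sketch assumes the 32 physical addresses are numbered 0 to 31, with 0 to 3 holding response data and 4 to 31 holding metadata; the function names and the list-based memory are hypothetical.

```python
from typing import List

RESPONSE_ADDRESSES = range(0, 4)    # response data from the matching module
METADATA_ADDRESSES = range(4, 32)   # metadata


def mux_inputs(module: str) -> List[int]:
    """Addresses that a source operand multiplexer in the given module can select."""
    if module == "first":
        return list(RESPONSE_ADDRESSES)                             # 4-to-1 selector
    if module == "second":
        return list(RESPONSE_ADDRESSES) + list(METADATA_ADDRESSES)  # 32-to-1 selector
    raise ValueError(f"unknown module: {module}")


def select(memory: List[int], module: str, address: int) -> int:
    """Read one source operand, rejecting addresses the multiplexer is not wired to."""
    if address not in mux_inputs(module):
        raise ValueError(f"address {address} is not reachable from the {module} action module")
    return memory[address]
```

With this layout the first action module's multiplexer has 4 inputs and the second action module's has 32, which is the source of the complexity and cost reduction described above.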
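FIG. 11 is a schematic flowchart of a data processing method according to an embodiment of this application.
1101. Determine the type of the data to be processed and a first instruction set for processing the data to be processed, where the type of the data to be processed is first-type data to be processed or second-type data to be processed; when the type is the first type, every instruction of the at least one instruction included in the first instruction set is a unary operation instruction, and when the type is the second type, the first instruction set includes at least one multi-operand operation instruction.
1102. Execute the instructions in the first instruction set on the data to be processed to obtain first processed data.
The method shown in FIG. 11 may be performed by the integrated circuit shown in FIG. 2 or FIG. 3. The matching module in the integrated circuit 200 or the integrated circuit 300 may be responsible for performing step 1101, that is, determining the type of the data to be processed and the first instruction set for processing it.
Step 1102 may be performed by the first action module or the second action module.
If the type of the data to be processed is the first type, step 1102 is performed by the first action module. In this case, the matching module needs to send the data to be processed and the first instruction set to the first action module. The first action module executes the instructions in the first instruction set on the data to be processed to obtain the first processed data.
If the type of the data to be processed is the second type, step 1102 is performed by the second action module. In this case, the matching module needs to send the data to be processed and the first instruction set to the second action module. The second action module executes the instructions in the first instruction set on the data to be processed to obtain the first processed data.
Optionally, in some embodiments, when the type of the data to be processed is the second type, the first instruction set further includes at least one unary operation instruction, where each of the at least one unary operation instruction corresponds to one or more of the at least one multi-operand operation instruction and is executed after the corresponding multi-operand operation instruction.
Optionally, in some embodiments, if the method shown in FIG. 11 is implemented by the integrated circuit shown in FIG. 3, the method further includes: determining a second instruction set for processing the first processed data, and executing the instructions in the second instruction set on the first processed data to obtain second processed data.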
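For the relationship between the matching module, the first action module, and the second action module, and for the specific functions of each module, refer to the foregoing embodiments. For brevity, details are not repeated here.
An embodiment of this application further provides a computer device that includes the integrated circuit or chip shown in the foregoing embodiments. The computer device may be a switch that includes the chip 700 shown in FIG. 7.
According to the method provided in the embodiments of this application, this application further provides a computer program product. The computer program product includes computer program code that, when run on a computer, causes the computer to perform the steps in the foregoing embodiments.
According to the method provided in the embodiments of this application, this application further provides a computer-readable medium. The computer-readable medium stores program code that, when run on a computer, causes the computer to perform the steps in the foregoing embodiments.
A person of ordinary skill in the art may be aware that the units and algorithm steps of the examples described in the embodiments disclosed herein can be implemented by electronic hardware or by a combination of computer software and electronic hardware. Whether these functions are performed by hardware or software depends on the particular application and design constraints of the technical solution. Skilled persons may use different methods to implement the described functions for each particular application, but such implementation should not be considered to go beyond the scope of this application.
A person skilled in the art can clearly understand that, for convenience and brevity of description, for the specific working processes of the systems, apparatuses, and units described above, reference may be made to the corresponding processes in the foregoing method embodiments. Details are not repeated here.
In the several embodiments provided in this application, it should be understood that the disclosed systems, apparatuses, and methods may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative. The division into units is merely a division by logical function, and other divisions are possible in an actual implementation; for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not performed. In addition, the mutual couplings, direct couplings, or communication connections shown or discussed may be indirect couplings or communication connections through some interfaces, apparatuses, or units, and may be electrical, mechanical, or in other forms.
The units described as separate parts may or may not be physically separate, and parts shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
In addition, the functional units in the embodiments of this application may be integrated into one processing unit, each unit may exist alone physically, or two or more units may be integrated into one unit.
If the functions are implemented in the form of software functional units and sold or used as independent products, they may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of this application in essence, or the part that contributes to the prior art, or a part of the technical solution, can be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or some of the steps of the methods described in the embodiments of this application. The storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
The foregoing descriptions are merely specific implementations of this application, but the protection scope of this application is not limited thereto. Any variation or replacement that a person skilled in the art can readily figure out within the technical scope disclosed in this application shall fall within the protection scope of this application. The protection scope of this application shall therefore be subject to the protection scope of the claims.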

Claims (13)

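  1. A chip, characterized in that the chip comprises a matching module, a first action module, and a second action module, wherein
    the matching module is configured to determine a type of data to be processed and a first instruction set for processing the data to be processed, wherein the type of the data to be processed is first-type data to be processed or second-type data to be processed;
    the matching module is further configured to: when determining that the type of the data to be processed is the first-type data to be processed, indicate the data to be processed and the first instruction set to the first action module, wherein when the type of the data to be processed is the first-type data to be processed, every instruction of the at least one instruction comprised in the first instruction set is a unary operation instruction;
    the matching module is further configured to: when determining that the type of the data to be processed is the second-type data to be processed, indicate the data to be processed and the first instruction set to the second action module, wherein when the type of the data to be processed is the second-type data to be processed, the first instruction set comprises at least one multi-operand operation instruction;
    the first action module is configured to: when obtaining the data to be processed and the first instruction set, execute the instructions in the first instruction set on the data to be processed to obtain first processed data; and
    the second action module is configured to: when obtaining the data to be processed and the first instruction set, execute the instructions in the first instruction set on the data to be processed to obtain second processed data.
  2. The chip according to claim 1, characterized in that when the type of the data to be processed is the second-type data to be processed, the first instruction set further comprises at least one unary operation instruction, wherein each of the at least one unary operation instruction corresponds to one or more of the at least one multi-operand operation instruction, and each unary operation instruction is executed after the corresponding multi-operand operation instruction.
  3. The chip according to claim 1 or 2, characterized in that output data of the first action module is merged with output data of the second action module and then output.
  4. The chip according to claim 1 or 2, characterized in that output data of the first action module is merged with output data of the matching module and then input to the second action module.
  5. The chip according to claim 4, characterized in that the matching module is further configured to determine a second instruction set for processing the first processed data and send the second instruction set to the second action module, wherein the second instruction set comprises at least one multi-operand operation instruction; and
    the second action module is further configured to: when obtaining the second instruction set, obtain the first processed data and execute the instructions in the second instruction set on the first processed data to obtain third processed data.
  6. The chip according to any one of claims 1 to 5, characterized in that the first action module comprises at least one instruction execution unit, each of the at least one instruction execution unit comprises one source operand multiplexer, one instruction executor, and one destination operand output multiplexer, and the storage space that the source operand multiplexer can access is the storage space used to store response data generated by the matching module.
  7. The chip according to any one of claims 1 to 6, characterized in that the unary operation instruction is a data move operation or a data shift operation.
  8. The chip according to any one of claims 1 to 7, characterized in that the chip comprises at least one processing unit, and each of the at least one processing unit comprises the matching module, the first action module, and the second action module.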
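  9. A method for processing data, characterized by comprising:
    determining a type of data to be processed and a first instruction set for processing the data to be processed, wherein the type of the data to be processed is a first type of data to be processed or a second type of data to be processed; when the type of the data to be processed is the first type of data to be processed, each of the at least one instruction included in the first instruction set is a unary operation instruction; and when the type of the data to be processed is the second type of data to be processed, the first instruction set includes at least one multi-operand operation instruction;
    executing the instructions in the first instruction set on the data to be processed to obtain first processed data.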
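  10. The method according to claim 9, characterized in that, when the type of the data to be processed is the second type of data to be processed, the first instruction set further includes at least one unary operation instruction, wherein each of the at least one unary operation instruction corresponds to one or more of the at least one multi-operand operation instruction, and each unary operation instruction is executed after its corresponding multi-operand operation instruction.
  11. The method according to claim 9, characterized in that the method further comprises: determining a second instruction set for processing the first processed data, and executing the instructions in the second instruction set on the first processed data to obtain second processed data.
  12. The method according to any one of claims 9 to 11, characterized in that the unary operation instruction is a data move operation or a data shift operation.
  13. A computer device, characterized in that the computer device comprises the chip according to any one of claims 1 to 8.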
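
For illustration only, the dispatch recited in claims 1, 2, and 9 can be sketched in Python. This is not the claimed hardware. The names `Instruction`, `execute`, and `dispatch`, the modeling of the data to be processed as a dictionary of named integer fields, and the rule that the data type is inferred from the instruction set are all assumptions made for this sketch; the claims only state that the matching module determines both the type and the instruction set.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Data = Dict[str, int]  # assumed model: data to be processed as named integer fields


@dataclass(frozen=True)
class Instruction:
    """Hypothetical instruction: reads `sources`, writes the result to `dest`."""
    name: str
    sources: Tuple[str, ...]
    dest: str
    op: Callable[..., int]

    @property
    def is_unary(self) -> bool:
        # One source operand, e.g. a data move or a data shift (claims 7 and 12).
        return len(self.sources) == 1


def execute(data: Data, instructions: List[Instruction]) -> Data:
    """Run the instructions in order on a copy of the data."""
    out = dict(data)
    for ins in instructions:
        out[ins.dest] = ins.op(*(out[s] for s in ins.sources))
    return out


def dispatch(data: Data, instructions: List[Instruction]) -> Tuple[str, Data]:
    """Assumed matching step: choose the action module from the instruction set."""
    if not instructions:
        raise ValueError("the first instruction set needs at least one instruction")
    if all(ins.is_unary for ins in instructions):
        # First type: every instruction is unary, so the first action module runs it.
        return "first_action_module", execute(data, instructions)
    # Second type: at least one multi-operand instruction is present.
    return "second_action_module", execute(data, instructions)


# Hypothetical example instructions.
MOV = Instruction("mov", ("a",), "b", lambda x: x)             # unary: move a to b
SHL = Instruction("shl", ("a",), "a", lambda x: x << 1)        # unary: shift a left
ADD = Instruction("add", ("a", "b"), "c", lambda x, y: x + y)  # multi-operand
SHR = Instruction("shr", ("c",), "c", lambda x: x >> 1)        # unary, runs after ADD

print(dispatch({"a": 3, "b": 0}, [MOV, SHL]))
print(dispatch({"a": 3, "b": 4}, [ADD, SHR]))
```

In this sketch the set `[MOV, SHL]` contains only unary instructions and is routed to the first action module, while `[ADD, SHR]` contains a multi-operand instruction followed by a unary instruction that consumes its result, which mirrors the ordering in claims 2 and 10, and is routed to the second action module.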
PCT/CN2022/085503 2021-04-22 2022-04-07 Chip, method for processing data, and computer device WO2022222756A1 (zh)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
CN202110435405 2021-04-22
CN202110435405.8 2021-04-22
CN202110698347.8A CN115237374A (zh) 2021-04-22 2021-06-23 Chip, method for processing data, and computer device
CN202110698347.8 2021-06-23

Publications (1)

Publication Number Publication Date
WO2022222756A1 true WO2022222756A1 (zh) 2022-10-27

Family

ID=83666925

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/085503 WO2022222756A1 (zh) 2021-04-22 2022-04-07 Chip, method for processing data, and computer device

Country Status (2)

Country Link
CN (1) CN115237374A (zh)
WO (1) WO2022222756A1 (zh)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102117197A (zh) * 2011-03-04 2011-07-06 中国电子科技集团公司第三十八研究所 Instruction allocation device for a high-performance general-purpose signal processor
US20160094340A1 (en) * 2014-09-26 2016-03-31 Intel Corporation Instructions and logic to provide simd sm4 cryptographic block cipher functionality
CN107211300A (zh) * 2015-01-26 2017-09-26 诺基亚通信公司 Analyzing and classifying signaling sets or calls
US20190102198A1 (en) * 2017-09-29 2019-04-04 Intel Corporation Systems, apparatuses, and methods for multiplication and accumulation of vector packed signed values
CN111047022A (zh) * 2018-10-12 2020-04-21 中科寒武纪科技股份有限公司 Computing device and related product
CN111782580A (zh) * 2020-06-30 2020-10-16 北京百度网讯科技有限公司 Complex computing device and method, artificial intelligence chip, and electronic device

Also Published As

Publication number Publication date
CN115237374A (zh) 2022-10-25

Similar Documents

Publication Publication Date Title
US11677664B2 (en) Apparatus and method of generating lookups and making decisions for packet modifying and forwarding in a software-defined network engine
US6920562B1 (en) Tightly coupled software protocol decode with hardware data encryption
WO2015125801A1 (ja) Network control method, network system, apparatus, and program
US11218574B2 (en) Directed graph traversal using content-addressable memory
US10958770B2 (en) Realization of a programmable forwarding pipeline through packet header summaries in a data processing unit
US9471316B2 (en) Using a single-instruction processor to process messages
CN113037634B (zh) FPGA-based match-action table processing method, logic circuit, and device
US7937495B2 (en) System and method for modifying data transferred from a source to a destination
CN115917473A (zh) System for building data structures with a highly scalable algorithm implemented using distributed LPM
Zolfaghari et al. Flexible software-defined packet processing using low-area hardware
WO2022222756A1 (zh) Chip, method for processing data, and computer device
US11095760B1 (en) Implementing configurable packet parsers for field-programmable gate arrays using hardened resources
Hsu et al. The design of a configurable and low-latency packet parsing system for communication networks
US9781062B2 (en) Using annotations to extract parameters from messages
US11755522B1 (en) Method, electronic device, and computer program product for implementing blockchain system on switch
US7757006B2 (en) Implementing conditional packet alterations based on transmit port
Zolfaghari et al. Run-to-Completion versus Pipelined: The Case of 100 Gbps Packet Parsing
Shukla Low power hardware implementations for network packet processing elements
US11960772B2 (en) Pipeline using match-action blocks
Virtanen et al. TACO IPv6 Router: A Case Study in Protocol Processor Design
US20240118890A1 (en) Data Processing Method and Apparatus, Processor, and Network Device
US20230185490A1 (en) Pipeline using match-action blocks
US8082527B1 (en) Representing the behaviors of a packet processor
Chen et al. A large capacity programmable packet forwarding device
JP5639965B2 (ja) Asynchronous operation search circuit

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22790865

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE