WO2022068503A1 - Circuit, chip and electronic device - Google Patents

Circuit, chip and electronic device

Info

Publication number
WO2022068503A1
Authority
WO
WIPO (PCT)
Prior art keywords
processor
bus
memory
processing module
type
Prior art date
Application number
PCT/CN2021/115618
Other languages
English (en)
French (fr)
Inventor
田太徐
韩冰
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from CN202011176149.7A (CN114327247A)
Application filed by 华为技术有限公司
Priority to EP21874164.3A (EP4209886A4)
Priority to JP2023519636A (JP2023543466A)
Priority to KR1020237014009A (KR20230073317A)
Priority to CA3194399A (CA3194399A1)
Priority to MX2023003629A (MX2023003629A)
Publication of WO2022068503A1
Priority to US18/192,293 (US20230236727A1)

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0602Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/061Improving I/O performance
    • G06F3/0611Improving I/O performance in relation to response time
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02Addressing or allocation; Relocation
    • G06F12/0223User address space allocation, e.g. contiguous or non contiguous base addressing
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0602Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/061Improving I/O performance
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F13/00Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
    • G06F13/38Information transfer, e.g. on bus
    • G06F13/40Bus structure
    • G06F13/4004Coupling between buses
    • G06F13/4009Coupling between buses with data restructuring
    • G06F13/4018Coupling between buses with data restructuring with data-width conversion
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F13/00Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
    • G06F13/38Information transfer, e.g. on bus
    • G06F13/42Bus transfer protocol, e.g. handshake; Synchronisation
    • G06F13/4204Bus transfer protocol, e.g. handshake; Synchronisation on a parallel bus
    • G06F13/4208Bus transfer protocol, e.g. handshake; Synchronisation on a parallel bus being a system bus, e.g. VME bus, Futurebus, Multibus
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0628Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0653Monitoring storage devices or systems
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0668Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F3/0671In-line storage system
    • G06F3/0673Single storage device
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F2212/00Indexing scheme relating to accessing, addressing or allocation within memory systems or architectures
    • G06F2212/10Providing a specific technical effect
    • G06F2212/1016Performance improvement
    • G06F2212/1024Latency reduction
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F30/00Computer-aided design [CAD]
    • G06F30/30Circuit design
    • G06F30/39Circuit design at the physical level
    • G06F30/392Floor-planning or layout, e.g. partitioning or placement

Definitions

  • the present application relates to the field of chip technology, and more particularly, to circuits, chips and electronic devices.
  • the processors in current high-speed network chips are generally arranged in a pipeline manner. After a packet enters the chip, a program state (PS) is generated for the packet to save the context information during the forwarding of the packet.
  • each processor on the pipeline processes the packet, saves the processing result in the PS, and then sends the PS to the next processor.
  • if the design of the connection between the processors in the chip and the memory used to save the PS is unreasonable, reading and writing the PS will incur a high delay.
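  • as a rough illustration only (not part of the disclosure), the following Python sketch models such a pipeline: each stage receives the PS carried with the packet, writes its processing result into the PS, and hands the PS to the next stage. The stage names and PS fields are hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class ProgramState:
    """Context information carried with a packet while it is forwarded through the pipeline."""
    flow_id: str
    fields: dict = field(default_factory=dict)  # results written by each pipeline stage

def stage_parse(ps: ProgramState) -> ProgramState:
    ps.fields["parsed"] = True                    # save this stage's processing result in the PS
    return ps

def stage_lookup(ps: ProgramState) -> ProgramState:
    ps.fields["next_hop"] = hash(ps.flow_id) % 4  # placeholder for a table lookup
    return ps

def run_pipeline(ps: ProgramState, stages) -> ProgramState:
    # each processor processes the packet and then sends the PS to the next processor
    for stage in stages:
        ps = stage(ps)
    return ps

print(run_pipeline(ProgramState("flowA"), [stage_parse, stage_lookup]).fields)
```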
  • the present application provides a circuit, a chip and an electronic device, which can reduce transmission delay.
  • an embodiment of the present application provides a circuit.
  • the circuit includes a first processor and a first processing module connected to the first processor, the first processing module including a second processor connected to a first memory, where the transmission delay generated by the second processor performing read and write operations on the first memory is smaller than the transmission delay generated by the communication between the first processor and the first processing module. Since the transmission delay caused by the second processor performing read and write operations on the first memory is smaller than the transmission delay caused by the communication between the first processor and the first processing module, the delay cost of data transmission over the bus can be reduced.
  • the transmission delay generated by the second processor performing read and write operations on the first memory is less than or equal to 1/10 of the transmission delay generated by the communication between the first processor and the first processing module.
  • in some embodiments, the second processor is a multi-core processor, and the transmission delay generated by the second processor performing read and write operations on the first memory is the transmission delay generated by any core of the multi-core processor included in the second processor performing read and write operations on the first memory.
  • the first processor is connected to the first processing module through a first bus, and the second processor is connected to the first memory through a second bus, wherein the bus width of the second bus is larger than the bus width of the first bus, and/or the length of the second bus is smaller than the length of the first bus. Since the length of the second bus is smaller than that of the first bus, the area of the circuit can be reduced.
  • the length of the second bus may be less than or equal to 1/10 of the length of the first bus.
  • the first processing module further includes a third processor connected to a second memory, and the transmission delay generated by the third processor performing read and write operations on the second memory is smaller than the transmission delay generated by the communication between the first processor and the first processing module.
  • the first processor is connected to the first processing module through a first bus, the second processor is connected to the first memory through a second bus, and the third processor is connected to the second memory through a third bus, wherein the sum of the bus width of the second bus and the bus width of the third bus is larger than the bus width of the first bus.
  • the first processing module further includes a third processor connected to the first memory, and the transmission delay generated by the third processor performing read and write operations on the first memory is smaller than the transmission delay generated by the communication between the first processor and the first processing module.
  • the first processor is connected to the first processing module through a first bus, the second processor is connected to the first memory through a second bus, and the third processor is connected to the first memory through a third bus, wherein the sum of the bus width of the second bus and the bus width of the third bus is larger than the bus width of the first bus.
  • the second processor and the third processor belong to pipeline processors.
  • the circuit further includes a fourth processor and a second processing module connected to the fourth processor, where the second processing module includes N fifth processors connected to M memories, N and M are both integers greater than or equal to 1, and the transmission delay generated by any fifth processor performing read and write operations on the memory connected to it is smaller than the transmission delay generated by the communication between the fourth processor and the second processing module.
  • the second processor is connected to the third processor through a fourth bus, the fourth processor is connected to the first processor through a fifth bus, and the bus width of the fourth bus is smaller than the bus width of the fifth bus.
  • the number of processor cores included in the fourth processor is greater than or equal to the number of processor cores included in the first processor.
  • the fourth processor and the first processor belong to pipeline processors.
  • the first processing module further includes the first memory.
  • an embodiment of the present application further provides a chip, where the chip includes a circuit according to the first aspect or any possible implementation manner of the first aspect.
  • an embodiment of the present application further provides an electronic device, the electronic device includes the chip provided by the embodiment of the present application, and the electronic device further includes a receiver and a transmitter.
  • the receiver is used to receive the message and send the message to the chip.
  • This chip is used to process the message.
  • the transmitter is used for acquiring the message processed by the chip, and sending the processed message to another electronic device.
  • the electronic device may be a switch, a router, or any other electronic device that can be provided with the above-mentioned chip.
  • an embodiment of the present application further provides a processing method, the method comprising: a first processor receiving a first packet, where the first packet includes flow identification information; the first processor determining, according to the flow identification information, a first processing module, where the first processing module corresponds to the flow identification information; and the first processor sending the first packet to the first processing module.
  • the first processor sends the packet that needs to be processed by the first processing module to the first processing module according to the flow identification information carried in the packet, and the processor in the first processing module performs corresponding processing. Since the first processing module is closer to the memory than the first processor, it helps to reduce the transmission delay.
  • the method further includes: the first processor receiving a second packet from the first processing module, where the second packet is a packet obtained after the first processing module processes the first packet according to the flow identification information, and the second packet includes the flow identification information.
  • the method further includes: the first processor sending the second packet to a next processor, where the next processor is the next hop in the pipeline to which the first processor belongs.
  • an embodiment of the present application further provides a processing method, the method including: a second processor in a first processing module receiving a first packet from a first processor, where the first packet includes flow identification information; the second processor obtaining, according to the flow identification information, parameters for processing the first packet from a memory corresponding to the second processor; the second processor processing the first packet according to the parameters, and sending the processed first packet to a third processor in the first processing module, where the processed first packet includes the flow identification information; the third processor in the first processing module obtaining, according to the flow identification information, parameters for processing the processed first packet from a memory corresponding to the third processor; and the third processor processing the processed first packet according to the parameters to obtain a second packet, and sending the second packet to the first processor.
  • the processor in the first processing module performs a read operation on the memory and performs corresponding processing according to the flow identifier in the first packet. Since the first processing module is closer to the memory than the first processor, it helps to reduce the transmission delay.
  • the processing may include look-up table forwarding, the parameter includes one or more of an index, a base address, and a hash value of a forwarding entry, and the parameter corresponds to the flow identifier.
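  • for illustration, a minimal, non-normative Python sketch of the processing method described above, assuming hypothetical per-processor parameter tables keyed by the flow identification information (the table contents and field names such as `stage1_result` are invented for the example):

```python
# hypothetical memories corresponding to the second and third processors:
# parameters keyed by flow identification information
second_proc_memory = {"flow-1": {"index": 7, "base_addr": 0x100}}
third_proc_memory = {"flow-1": {"port_index": 3}}

def second_processor(packet: dict) -> dict:
    params = second_proc_memory[packet["flow_id"]]  # read parameters from this processor's memory
    packet["stage1_result"] = params                # process the packet according to the parameters (placeholder)
    return packet                                   # forwarded to the third processor, flow_id still included

def third_processor(packet: dict) -> dict:
    params = third_proc_memory[packet["flow_id"]]   # read parameters from this processor's memory
    packet["next_hop_port"] = params["port_index"]  # e.g. a look-up-table forwarding result
    return packet                                   # the resulting "second packet" goes back to the first processor

print(third_processor(second_processor({"flow_id": "flow-1"})))
```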
  • FIG. 1 is a schematic diagram of a chip provided according to an embodiment of the present application.
  • FIG. 2 is a schematic diagram of another chip provided according to an embodiment of the present application.
  • Figure 3 shows a schematic diagram of a circuit.
  • Figure 4 is a schematic diagram of another circuit.
  • FIG. 5 is a schematic flowchart of implementing the determination of the next hop port by using the circuit shown in FIG. 4 .
  • FIG. 6 is a schematic flowchart of implementing the determination of the next hop port by using the circuit shown in FIG. 3 .
  • references in this specification to "one embodiment” or “some embodiments” and the like mean that a particular feature, structure, or characteristic described in connection with the embodiment is included in one or more embodiments of the present application.
  • appearances of the phrases "in one embodiment," "in some embodiments," "in other embodiments," etc. in various places in this specification do not necessarily all refer to the same embodiment, but mean "one or more but not all embodiments" unless specifically emphasized otherwise.
  • the terms "including", "comprising", "having" and their variants mean "including but not limited to" unless specifically emphasized otherwise.
  • "at least one" means one or more, and "plurality" means two or more.
  • "and/or" describes the association relationship of associated objects and indicates that three kinds of relationships may exist; for example, A and/or B may indicate that only A exists, both A and B exist, or only B exists, where A and B may be singular or plural.
  • the character “/” generally indicates that the associated objects are an “or” relationship.
  • "at least one item(s) of the following" or similar expressions refer to any combination of these items, including any combination of single item(s) or plural items.
  • for example, at least one item (piece) of a, b, or c may represent: a, b, c, a-b, a-c, b-c, or a-b-c, where a, b, and c may be single or multiple.
  • FIG. 1 is a schematic diagram of a chip provided according to an embodiment of the present application.
  • the chip 100 includes an input/output interface 101 , a processor 111 , a processor 112 , a processor 113 and a processor 114 .
  • the chip 100 further includes a processing module 121 and a processing module 122 .
  • the processing module 121 includes a processor 1211 , a processor 1212 , a memory 1221 and a memory 1222 .
  • the processing module 122 includes a processor 1213 , a processor 1214 , a processor 1215 , a memory 1223 , a memory 1224 and a memory 1225 .
  • the chip 100 also includes a memory 131 and a memory 132 .
  • the processor 111 is connected to the input/output interface 101 through the bus 61 , is connected to the processor 112 through the bus 41 , and is connected to the memory 131 through the bus 51 .
  • the processor 112 is connected to the processor 113 through the bus 42 , and is connected to the processing module 121 through the bus 11 .
  • the processor 113 is connected to the processor 114 through the bus 43 , and is connected to the memory 132 through the bus 52 .
  • the processor 114 is connected to the input/output interface 101 through the bus 62 , and is connected to the processing module 122 through the bus 12 .
  • the processor 1211 is connected to the memory 1221 through the bus 31 , and is connected to the processor 1212 through the bus 21 .
  • the processor 1212 is connected to the memory 1222 through the bus 32 .
  • the processor 1213 is connected to the memory 1223 through the bus 33 , and is connected to the processor 1214 through the bus 22 .
  • the processor 1214 is connected to the memory 1224 through the bus 34 and is connected to the processor 1215 through the bus 23 .
  • the processor 1215 is connected to the memory 1225 through the bus 35 .
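  • to make the device and bus numbers above easier to follow, here is an informal Python model of the connections in the chip 100 as just described; it is an illustrative reading of FIG. 1, not an authoritative netlist.

```python
# each entry: bus -> (endpoint A, endpoint B), following the description of chip 100
chip100_buses = {
    "bus61": ("io101", "proc111"),     "bus41": ("proc111", "proc112"),
    "bus51": ("proc111", "mem131"),    "bus42": ("proc112", "proc113"),
    "bus11": ("proc112", "module121"), "bus43": ("proc113", "proc114"),
    "bus52": ("proc113", "mem132"),    "bus62": ("proc114", "io101"),
    "bus12": ("proc114", "module122"),
    # inside processing module 121
    "bus31": ("proc1211", "mem1221"),  "bus21": ("proc1211", "proc1212"),
    "bus32": ("proc1212", "mem1222"),
    # inside processing module 122
    "bus33": ("proc1213", "mem1223"),  "bus22": ("proc1213", "proc1214"),
    "bus34": ("proc1214", "mem1224"),  "bus23": ("proc1214", "proc1215"),
    "bus35": ("proc1215", "mem1225"),
}

def neighbors(device: str):
    """List the devices directly connected to `device` and the bus used for each connection."""
    return [(bus, a if b == device else b)
            for bus, (a, b) in chip100_buses.items() if device in (a, b)]

print(neighbors("proc112"))  # [('bus41', 'proc111'), ('bus42', 'proc113'), ('bus11', 'module121')]
```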
  • the chip 100 processes the received packets in a pipeline manner.
  • the processor 111 , the processor 112 , the processor 113 and the processor 114 belong to the same pipeline, and the processor 111 , the processor 112 , the processor 113 and the processor 114 are also pipeline processors.
  • the processor 1211 and the processor 1212 belong to the same pipeline, and the processor 1213, the processor 1214 and the processor 1215 belong to the same pipeline.
  • the pipeline may be referred to as the first pipeline. As shown in FIG. 1, some processors in the first pipeline can directly access the memory through the bus.
  • the processor 111 can directly access the memory 131 through the bus 51
  • the processor 113 can directly access the memory 132 through the bus 52
  • the processor in the first pipeline that can directly access the memory may be referred to as a type 1 processor.
  • another portion of the processors in the first pipeline may communicate with the processing module.
  • the processor 112 communicates with the processing module 121 through the bus 11
  • the processor 114 communicates with the processing module 122 through the bus 12 .
  • the processors in the first pipeline that can communicate with the processing modules may be referred to as type 2 processors. Multiple processors included in each processing module can also belong to a pipeline.
  • the processor 1211 and the processor 1212 belong to the same pipeline, and the processor 1213, the processor 1214 and the processor 1215 belong to the same pipeline.
  • the pipeline in the processing module may be referred to as the second pipeline.
  • the processors in the processing module may be referred to as type 3 processors.
  • the processor 1211 , the processor 1212 , the processor 1213 , the processor 1214 , and the processor 1215 shown in FIG. 1 may all be referred to as type 3 processors.
  • any type 1 processor corresponds to one memory
  • any type 3 processor corresponds to one memory
  • Any type 1 processor or any type 3 processor is connected to the corresponding memory through a bus, so as to perform read and write operations on the memory.
  • the memory 131 corresponds to the processor 111
  • the memory 1221 corresponds to the processor 1211 .
  • the above-mentioned one-to-one correspondence between the memory and the processor can also be replaced by a one-to-many or many-to-one correspondence.
  • any type 1 processor or any type 3 processor may correspond to a plurality of memories, so as to perform read and write operations on the plurality of memories.
  • multiple processors of type 1 may correspond to one memory
  • multiple processors of type 3 may correspond to one memory, so as to perform read and write operations on the memory.
  • the memory 131 and the memory 132 in FIG. 1 may be replaced by one memory
  • the processor 111 and the processor 113 correspond to the same memory.
  • the memory 131 in FIG. 1 can also be replaced by multiple memories
  • the processor 111 corresponds to the multiple memories.
  • the memory 1221 and the memory 1222 in FIG. 1 may be replaced by one memory, and the processor 1211 and the processor 1212 correspond to the same memory.
  • the buses in the chip 100 include a type 1 bus, a type 2 bus, a type 3 bus, a type 4 bus, a type 5 bus and a type 6 bus.
  • the type 1 bus is used to connect the type 2 processors and the processing modules corresponding to the type 2 processors.
  • the bus 11 for connecting the processor 112 and the processing module 121 and the bus 12 for connecting the processor 114 and the processing module 122 belong to the type 1 bus.
  • a Type 2 bus is used to connect two Type 3 processors.
  • the bus 21 for connecting the processor 1211 and the processor 1212, the bus 22 for connecting the processor 1213 and the processor 1214, and the bus 23 for connecting the processor 1214 and the processor 1215 belong to the type 2 bus.
  • the type 3 bus is used to connect the type 3 processor and the memory corresponding to the type 3 processor.
  • the bus 31 for connecting the processor 1211 and the memory 1221, the bus 33 for connecting the processor 1213 and the memory 1223, and the like belong to the type 3 bus.
  • a type 4 bus is used to connect the two processors in the first pipeline.
  • the bus 41 for connecting the processor 111 and the processor 112, the bus 42 for connecting the processor 112 and the processor 113, and the bus 43 for connecting the processor 113 and the processor 114 belong to the type 4 bus.
  • a type 5 bus is used to connect a type 1 processor and a memory corresponding to the type 1 processor.
  • the bus 51 for connecting the processor 111 and the memory 131 and the bus 52 for connecting the processor 113 and the memory 132 belong to the type 5 bus.
  • a type 6 bus is used to connect the I/O interface 101 and the processor.
  • the bus 61 for connecting the input-output interface 101 and the processor 111 and the bus 62 for connecting the processor 114 and the input-output interface 101 belong to the type 6 bus.
  • each processor in the first pipeline is a multi-core processor.
  • Each processor in the first pipeline may include multiple processor cores (also referred to as cores).
  • different processors in the first pipeline may include the same number of processor cores.
  • any two processors in the first pipeline include the same number of processor cores.
  • the number of processor cores included in the processor 111 is equal to the number of processor cores included in the processor 112
  • the number of processor cores included in the processor 112 is equal to the number of processor cores included in the processor 113
  • the number of processor cores included in the processor 113 is equal to the number of processor cores included in the processor 114.
  • different processors in the first pipeline may include different numbers of processor cores.
  • any two processors in the first pipeline may include different numbers of processor cores.
  • the number of processor cores included in the processor 111 is greater than the number of processor cores included in the processor 112 .
  • the number of processor cores included in the processor 113 is greater than the number of processor cores included in the processor 114 .
  • the number of processor cores included in the processor 111 is greater than the number of processor cores included in the processor 113
  • the number of processor cores included in the processor 112 is greater than the number of processor cores included in the processor 114 .
  • some processors in the first pipeline include the same number of processor cores.
  • the number of processor cores included in the processor 111 is equal to the number of processor cores included in the processor 113
  • the number of processor cores included in the processor 112 is equal to the number of processor cores included in the processor 114
  • the number of processor cores included in the processor 111 is different from the number of processor cores included in the processor 112 .
  • the processors in the first pipeline can be divided into two types, namely type 1 processors (such as the processor 111 and the processor 113 ) and type 2 processors (such as the processor 112 and the processor 114 ).
  • processors of the same type may include the same number of processor cores, and processors of different types may include different numbers of processor cores.
  • a Type 1 processor may include a greater number of processor cores than a Type 2 processor. Since the type 2 processor communicates with the processing module, and the processor included in the processing module can perform some processing operations, the type 2 processor can use a single-core processor or a processor with fewer cores, which can further reduce hardware cost.
  • the number of processor cores included in the type 2 processor may be 1/2, 1/3, 1/5, or 1/8 of the number of processor cores included in the type 1 processor.
  • the type 1 processor may be a multi-core processor and the type 2 processor may be a single-core processor.
  • the Type 3 processor may also be a multi-core processor.
  • a Type 3 processor may also include multiple processor cores.
  • a Type 3 processor includes fewer processor cores than a Type 1 or Type 2 processor. In other words, both the number of processor cores included in the type 1 processor and the number of processor cores included in the type 2 processor are greater than the number of processor cores included in the type 3 processor.
  • the number of processor cores included in the processor 1211 may be less than the number of processor cores included in the processor 111, and the number of processor cores included in the processor 1211 may also be less than the number of processor cores included in the processor 112.
  • the number of processor cores included in the type 3 processor may be less than the number of processor cores included in the type 1 processor, and the number of processor cores included in the type 3 processor may be equal to or greater than the number of processor cores included in the type 2 processor.
  • the number of processor cores included in the processor 1213 may be less than the number of processor cores included in the processor 111, and the number of processor cores included in the processor 1213 may be equal to or greater than the number of processor cores included in the processor 114.
  • a Type 3 processor may include less than or equal to 1/10 the number of processor cores included in a Type 1 processor.
  • the number of processor cores included in the type 3 processor may be less than or equal to 1/2, 1/3, 1/5, or 1/8, etc., of the number of processor cores included in the type 1 processor.
  • the sum of the number of processor cores included in the type 2 processor and the number of processor cores included in one of the processors in the processing module corresponding to the type 2 processor is equal to the number of processor cores included in the type 1 processor. For example, the sum of the number of processor cores included in the processor 112 and the number of processor cores included in the processor 1212 is equal to the number of processor cores included in the processor 111 .
  • the sum of the number of processor cores included in the processor 114 and the number of processor cores included in the processor 1214 is equal to the number of processor cores included in the processor 113 .
  • the different Type 3 processors may include the same number of processor cores.
  • the number of processor cores included in the processor 1211 is equal to the number of processor cores included in the processor 1212
  • the number of processor cores included in the processor 1212 is equal to the number of processor cores included in the processor 1215 .
  • different Type 3 processors may include different numbers of processor cores.
  • any two processors belonging to the same processing module include the same number of processor cores, and two processors belonging to different processing modules include different numbers of processor cores.
  • the number of processor cores included in the processor 1211 is equal to the number of processor cores included in the processor 1212
  • the number of processor cores included in the processor 1212 is not equal to the number of processor cores included in the processor 1213 .
  • each processing module includes at least two processors.
  • the processing module may also include a multi-core processor.
  • the processing module 121 may only include the processor 1211 and the memory 1221, wherein the processor 1211 is a multi-core processor.
  • the Type 2 processor may also be a single-core processor. If the Type 2 processor is a single-core processor, the processing module corresponding to that processor may include at least two processors. In other words, if the processing module includes multiple processors, the multiple processors may include at least one single-core processor. Taking the processing module 121 as an example, the processor 1211 in the processing module 121 may be a single-core processor, and the processor 1212 may be a single-core processor or a multi-core processor.
  • the length of the Type 1 bus is greater than the length of the Type 3 bus.
  • the length of a type 3 bus may be equal to 1/5, 1/8, or 1/10, etc., of the length of a type 1 bus.
  • the length of the type 3 bus may be less than 1/10, 1/15 or 1/20 of the length of the type 1 bus, and so on.
  • the sum of the length of the Type 1 bus and the length of the Type 3 bus is equal to the length of the Type 5 bus.
  • the length of any two Type 1 buses may be the same. In some embodiments, the length of any two Type 2 buses may be the same. In some embodiments, the length of any two Type 3 buses may be the same. In some embodiments, the length of any two Type 4 buses may be the same. In some embodiments, the length of any two Type 5 buses may be the same. Due to the limitations of the manufacturing process, exactly equal bus lengths may be difficult to achieve. Therefore, "the same length" in the embodiments of the present application can be understood as exactly equal lengths, or as lengths whose difference is within an allowable error range. For example, the sum of the length of the type 1 bus and the length of the type 3 bus is equal to the length of the type 5 bus.
  • the difference between the sum of the length of the type 1 bus and the length of the type 3 bus and the length of the type 5 bus is 0, or is less than or equal to a preset allowable error value.
  • the difference between the length of the bus 51 and the length of the bus 52 is 0 or less than or equal to a preset allowable error value.
  • the sum of the widths of all type 3 buses in the same processing module is greater than the width of one type 1 bus.
  • the sum of the width of the bus 31 and the width of the bus 32 is greater than the width of the bus 11 .
  • the sum of the width of the bus 33 , the width of the bus 34 and the width of the bus 35 is greater than the width of the bus 12 .
  • the number of bits of binary data that a bus can transmit at the same time is called the bus width (also called bit width), measured in bits. The larger the bus width, the better the transmission performance, and the more data can be transmitted in the same amount of time.
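  • as a small numeric illustration of that definition (the transfer rate below is an arbitrary, hypothetical value):

```python
def bytes_per_second(bus_width_bits: int, transfers_per_second: int) -> float:
    # a wider bus carries more bits per transfer, so it moves more data in the same time
    return bus_width_bits * transfers_per_second / 8

print(bytes_per_second(64, 1_000_000), bytes_per_second(128, 1_000_000))  # the 128-bit bus moves twice as much data
```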
  • FIG. 2 is a schematic diagram of another chip provided according to an embodiment of the present application.
  • the chip 200 includes an input/output interface 201 , a processor 211 , a processor 212 , a processor 213 and a processor 214 .
  • the chip 200 further includes a processing module 221 and a processing module 222 .
  • the processing module 221 includes a processor 2211 and a processor 2212 .
  • the processing module 222 includes a processor 2213 , a processor 2214 and a processor 2215 .
  • the processor 211 is connected to the input/output interface 201 through the bus 2411 .
  • the processor 211 is connected to the processor 212 through the bus 2441 .
  • the processor 212 is connected to the processing module 221 through the bus 2421 .
  • the processor 212 is connected to the processor 213 through the bus 2442 .
  • the processor 213 is connected to the input/output interface 201 through a bus.
  • the processor 213 is connected to the processor 214 through a bus 2443 .
  • the processor 214 is connected to the processing module 222 through a bus 2422.
  • the processing module 221 is connected to the input and output interface 201 through the bus 2431 .
  • the processing module 222 is connected to the input and output interface 201 through the bus 2432 .
  • the processor 2211 is connected to the processor 2212 through the bus 2451.
  • the processor 2213 is connected to the processor 2214 through the bus 2452.
  • the processor 2214 is connected to the processor 2215 through the bus 2453.
  • the memories 231 to 237 are memories located outside the chip 200 .
  • the chip 200 can access the memory 231 to the memory 237 through the input-output interface 201 and the corresponding bus.
  • the memory 231 is connected to the chip 200 through the bus 2461; the memory 232 is connected to the chip 200 through the bus 2462; the memory 233 is connected to the chip 200 through the bus 2463; the memory 234 is connected to the chip 200 through the bus 2464; the memory 235 is connected to the chip 200 through the bus 2465; the memory 236 is connected to the chip 200 through the bus 2466; and the memory 237 is connected to the chip 200 through the bus 2467.
  • the chip 200 processes the received packets in a pipeline manner.
  • the processor 211 , the processor 212 , the processor 213 , and the processor 214 in the chip 200 belong to a pipeline, and the pipeline may be called a first pipeline.
  • some processors in the first pipeline can communicate with the input and output interfaces directly through the bus; another part of the processors in the first pipeline are connected to the processing module through the bus.
  • a processor that can communicate directly with the input-output interface (i.e., a processor not connected to a processing module) may be referred to as a type 1 processor, and a processor connected to a processing module may be referred to as a type 2 processor.
  • the type 2 processor may include the processor 212 and the processor 214 .
  • processors included in each processing module can also belong to a pipeline.
  • processor 2211 and processor 2212 belong to the same pipeline
  • processor 2213, processor 2214 and processor 2215 belong to the same pipeline.
  • the pipeline in the processing module may be referred to as the second pipeline.
  • the processors in the processing module may be referred to as type 3 processors.
  • the processor 2211 , the processor 2212 , the processor 2213 , the processor 2214 , and the processor 2215 shown in FIG. 2 may all be referred to as type 3 processors.
  • each type 1 processor and each type 3 processor has a corresponding memory.
  • the processor can read the data stored in the corresponding memory.
  • the processor may also write data into the corresponding memory.
  • the memory corresponding to the processor 211 is the memory 231
  • the memory corresponding to the processor 2211 is the memory 232
  • the memory corresponding to the processor 2212 is the memory 233
  • the memory corresponding to the processor 213 is the memory 234
  • the memory corresponding to the processor 2213 is the memory 235
  • the memory corresponding to the processor 2214 is the memory 236
  • the memory corresponding to the processor 2215 is the memory 237 .
  • the processor 211 may read data stored in the memory 231 and/or write data to the memory 231 .
  • the processor 2213 may read data stored in the memory 235 and/or write data to the memory 235 .
  • a processing module connected to a processor through a bus may be referred to as a processing module corresponding to the processor.
  • the processing module 121 is a processing module corresponding to the processor 112.
  • the bus in the chip 200 may include a type 1 bus, a type 2 bus, a type 3 bus, a type 4 bus, a type 5 bus, and a type 6 bus.
  • a type 1 bus is used to connect a type 2 processor and the processing module corresponding to the type 2 processor.
  • the bus 2411 for connecting the processor 212 and the processing module 221 and the bus 2412 for connecting the processor 214 and the processing module 222 belong to the type 1 bus.
  • a Type 2 bus is used to connect two Type 3 processors.
  • the bus 2421 for connecting the processor 2211 and the processor 2212, the bus 2422 for connecting the processor 2213 and the processor 2214, and the bus 2423 for connecting the processor 2214 and the processor 2215 belong to the type 2 bus.
  • the Type 3 bus is used to connect the processor and the I/O interface in the processing module.
  • the bus 2431, the bus 2432, the bus 2433, the bus 2434 and the bus 2435 all belong to the type 3 bus.
  • the bus 2431 is a type 3 bus for connecting the processor 2211 and the input/output interface 201 .
  • the bus 2432 is a type 3 bus for connecting the processor 2212 and the input/output interface 201 .
  • the bus 2433 is a type 3 bus for connecting the processor 2213 and the input/output interface 201 .
  • the bus 2434 is a type 3 bus for connecting the processor 2214 and the input/output interface 201 .
  • the bus 2435 is a type 3 bus for connecting the processor 2215 and the input/output interface 201 .
  • a type 4 bus is used to connect two processors in the first pipeline.
  • the bus 2441 for connecting the processor 211 and the processor 212, the bus 2442 for connecting the processor 212 and the processor 213, and the bus 2443 for connecting the processor 213 and the processor 214 belong to the type 4 bus.
  • a type 6 bus is used to connect a type 1 processor and the input/output interface.
  • both bus 2461 and bus 2462 are type 6 buses.
  • the chip 200 is also connected to the memory through the bus.
  • the bus 2471 to the bus 2477 are buses for connecting the chip 200 and the memory, and such a bus may be referred to as a type 7 bus.
  • the processor of type 1 can access the corresponding memory through the corresponding bus and input and output interface.
  • the processor 211 can access the memory 231 through the bus 2461 , the I/O interface 201 and the bus 2471 .
  • the processor 213 can access the memory 234 through the bus 2462 , the input-output interface 201 and the bus 2474 .
  • Type 3 processors can access corresponding memories through corresponding buses and input and output interfaces.
  • the processor 2211 can access the memory 232 through the bus 2431 , the I/O interface 201 and the bus 2472 .
  • the processor 2215 can access the memory 237 through the bus 2435 , the I/O interface 201 and the bus 2477 .
  • the processors shown in FIG. 2 may all be multi-core processors.
  • the Type 1 and Type 2 processors may be multi-core processors
  • the Type 3 processors may be single-core processors.
  • the structure of the type 3 processor may be simpler than that of the type 1 processor.
  • a Type 3 processor may include fewer processor cores than a Type 1 processor.
  • a Type 3 processor may include fewer transistors than a Type 1 processor.
  • the length of the Type 1 bus is greater than the length of the Type 3 bus.
  • the length of a type 3 bus may be equal to 1/5, 1/8, or 1/10, etc., of the length of a type 1 bus.
  • the length of the type 3 bus may be less than 1/10, 1/15 or 1/20 of the length of the type 1 bus, and so on.
  • the sum of the length of the Type 1 bus and the length of the Type 3 bus is equal to the length of the Type 6 bus.
  • the length of any two Type 1 buses may be the same.
  • the length of any two Type 2 buses may be the same.
  • the length of any two Type 3 buses may be the same.
  • the length of any two Type 4 buses may be the same.
  • the length of any two Type 6 buses may be the same.
  • for the same processing module, the sum of the widths of the type 3 buses corresponding to the processors in the processing module is greater than the width of one type 1 bus.
  • the sum of the width of the bus 2431 and the width of the bus 2432 is greater than the width of the bus 2411 .
  • the sum of the width of the bus 2433 , the width of the bus 2434 and the width of the bus 2435 is greater than the width of the bus 2412 .
  • the width of a type 2 bus may be less than the width of a type 4 bus.
  • in the chip 100 shown in FIG. 1, each processor that has a corresponding memory (the type 1 processors and the type 3 processors) and the corresponding memory are located inside the chip.
  • in the chip 200 shown in FIG. 2, the memory corresponding to the processor is located outside the chip. In other embodiments, part of the memories corresponding to the processors may be located inside the chip, and another part of the memories corresponding to the processors may be located outside the chip.
  • This embodiment can be considered as a combination of the embodiment shown in FIG. 1 and the embodiment shown in FIG. 2 . It can be seen that the chip shown in FIG. 1 or FIG. 2 includes two structures as shown in FIG. 3 .
  • Figure 3 shows a schematic diagram of a hybrid processor circuit. As shown in FIG. 3 , the processor 301 and the processing module 310 are connected through a bus 331 .
  • the processing module 310 includes three processors, namely a processor 311 , a processor 312 and a processor 313 .
  • the processors in the processing module 310 are connected by a bus 332 .
  • the processor 301 is the first processor, and the processor 311 , the processor 312 and the processor 313 are type 3 processors. More specifically, the processor 301 is a type 2 processor.
  • Each processor in processing module 310 has a corresponding memory.
  • the memory corresponding to the processor 311 is the memory 321
  • the memory corresponding to the processor 312 is the memory 322
  • the memory corresponding to the processor 313 is the memory 323 .
  • Each processor in the processing module 310 is connected to a corresponding memory through a bus.
  • the processor 311 is connected to the memory 321 through a bus 333
  • the processor 312 is connected to the memory 322 through a bus 334
  • the processor 313 is connected to the memory 323 through a bus 335 .
  • the processor 301, the processor 311, the processor 312 and the processor 313 are located in the same chip.
  • the memory corresponding to each processor in the processing module 310 may be located in the same chip as the processing module 310, or may be located outside the chip where the processing module 310 is located. If the memory is located outside the chip where the processing module 310 is located, the bus for connecting the processor in the processing module and the corresponding memory may include a bus from the processor to the input/output interface of the chip and a bus from the input/output interface of the chip to the corresponding memory.
  • the bus 333 may include a bus from the processor 311 to the I/O interface of the chip and a bus from the I/O interface of the chip to the memory 321 .
  • the structure as shown in FIG. 3 may be referred to as a hybrid processor circuit or a hybrid processor structure.
  • the processing module in the hybrid processor structure shown in FIG. 3 includes three processors.
  • the number of processors in the processing module may be a positive integer greater than or equal to 1, for example, may be 1, 2, 4, or 5, etc.
  • the processor may be a multi-core processor. If the processing module includes at least two processors, the at least two processors may include one or more single-core processors.
  • the length of the bus 333, the length of the bus 334, and the length of the bus 335 are the same, the letter L represents the length of the bus 333, and the letter R represents the length of the bus 331.
  • L is less than R.
  • L may be much smaller than R.
  • L may be equal to one-tenth of R, or L may be less than one-tenth of R.
  • the width of bus 333 is represented by letter A
  • the width of bus 334 is represented by letter B
  • the width of bus 335 is represented by letter C
  • the width of bus 331 is represented by letter D.
  • A, B, C, and D satisfy the following relation: D < A + B + C.
  • the data transfer cost of the hybrid processor architecture shown in Figure 3 can be expressed as Equation 3.1:
  • Cost_TX = L × (A + B + C) + R × D (Formula 3.1)
  • Cost_TX represents the data transfer cost
  • L represents the length of the bus 333 (the length of the bus 333, the length of the bus 334 and the length of the bus 335 are equal)
  • R represents the length of the bus 331
  • A represents the width of the bus 333
  • B represents the width of the bus 334
  • C represents the width of the bus 335
  • D represents the width of the bus 331 .
  • a chip using a pipeline structure can generate a program state (PS) for the message.
  • PS is used to store the context information in the forwarding process of the message.
  • the PS passes through each processor in the first pipeline in turn, and the processors in the first pipeline are responsible for the processing.
  • the PS processed by the processor in the first pipeline may be referred to as the first PS.
  • PS_Full_Size is used to represent the size of the first PS.
  • the processing module 310 also generates a PS in the process of processing the packet, and the PS sequentially passes through each processor in the second pipeline, and the processor in the second pipeline is responsible for processing.
  • the PS processed by the processor in the second pipeline may be referred to as the second PS.
  • the size of the second PS is represented by PS_Little_Size.
  • the first PS stores context information in the packet forwarding process.
  • the second PS only holds information processed in the second pipeline. Therefore, the size of the first PS will be larger than the size of the second PS (ie PS_Full_Size>PS_Little_Size).
  • the size of the second PS may be equal to or smaller than 1/5, 1/8, 1/10, 1/15, or 1/20, etc., of the size of the first PS.
  • a simpler processor can be used to process the second PS. Therefore, the structure of the processor inside the processing module (ie, the processor of type 3) can be simpler than that of the processor in the first pipeline.
  • a Type 3 processor may include fewer processor cores than a Type 1 and/or Type 2 processor, and/or the number of transistors included in a Type 3 processor may be less than the number of transistors included in a Type 1 and/or Type 2 processor. The larger the gap between the size of the first PS and the size of the second PS, the simpler the structure of the Type 3 processor can be.
  • in some embodiments, a Type 3 processor may include fewer processor cores than a Type 1 processor, and/or the number of transistors included in a Type 3 processor may be less than the number of transistors included in a Type 1 processor. In other embodiments, the Type 3 processor may include fewer processor cores than the Type 2 processor, and/or the number of transistors included in the Type 3 processor may be less than the number of transistors included in the Type 2 processor.
  • N_Little can be used to represent the number of processor cores included in type 3 processors
  • N_Big2 can be used to represent the number of processor cores included in type 2 processors
  • N_Big1 can be used to represent the number of processor cores included in type 1 processors.
  • Cost_Proc = PS_Little_Size × N_Little + PS_Full_Size × N_Big2 (Formula 3.2)
  • Cost_Proc represents the processor cost, and the meanings of PS_Little_Size, N_Little, PS_Full_Size, and N_Big2 are as described above, and are not repeated here for brevity.
  • Latency_L is used to represent the input/output (I/O) delay of a bus of length L, and Latency_R is used to represent the I/O delay of a bus of length R. Then, the delay cost of the structure shown in FIG. 3 can be expressed as Formula 3.3:
  • Cost_LAT = Latency_L × 3 + Latency_R × 1 (Formula 3.3)
  • Cost_LAT represents the delay cost
  • Latency_L represents the I/O delay of the bus of length L
  • Latency_R represents the I/O delay of the bus of length R.
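  • purely as an informal aid (not part of the disclosure), the following Python sketch evaluates Formulas 3.1 to 3.3 for the hybrid structure of FIG. 3; every parameter value below is hypothetical and chosen only to satisfy the stated assumptions (L < R, D < A + B + C, PS_Little_Size < PS_Full_Size).

```python
def hybrid_costs(L, R, A, B, C, D, ps_full, ps_little, n_little, n_big2,
                 latency_L, latency_R):
    """Costs of the hybrid structure in FIG. 3 according to Formulas 3.1-3.3."""
    cost_tx = L * (A + B + C) + R * D                     # Formula 3.1: data transfer cost
    cost_proc = ps_little * n_little + ps_full * n_big2   # Formula 3.2: processor cost
    cost_lat = latency_L * 3 + latency_R * 1              # Formula 3.3: delay cost
    return cost_tx, cost_proc, cost_lat

print(hybrid_costs(L=1, R=10, A=64, B=64, C=64, D=128,
                   ps_full=512, ps_little=64, n_little=2, n_big2=4,
                   latency_L=1, latency_R=10))
```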
  • FIG. 4 is a schematic diagram of another circuit including a processor.
  • the circuit shown in FIG. 4 includes three processors, namely, a processor 401 , a processor 402 and a processor 403 .
  • the processor 401, the processor 402 and the processor 403 are type 1 processors (the processor 401 to the processor 403 are all processors in the same pipeline, and what each of the processor 401 to the processor 403 is connected to through a bus is a memory rather than a processing module).
  • Each of the three processors has a corresponding memory.
  • the memory corresponding to the processor 401 is the memory 411
  • the memory corresponding to the processor 402 is the memory 412
  • the memory corresponding to the processor 403 is the memory 413 .
  • the processor 401 is connected to the memory 411 through the bus 421 , the processor 402 is connected to the memory 412 through the bus 422 , and the processor 403 is connected to the memory 413 through the bus 423 .
  • the processor 401 is connected to the processor 402 through the bus 424 , and the processor 402 is connected to the processor 403 through the bus 424 .
  • the length of the bus 421, the length of the bus 422, and the length of the bus 423 are the same.
  • the length of bus 421 may be equal to L+R, which is the sum of the length of bus 333 and the length of bus 331 as shown in FIG. 3 .
  • the width of bus 421 may be equal to the width of bus 333
  • the width of bus 422 may be equal to the width of bus 334
  • the width of bus 423 may be equal to the width of bus 335 .
  • the data transfer cost of adopting the structure shown in Figure 4 can be as shown in Equation 4.1:
  • Cost_TX = (L + R) × (A + B + C) (Formula 4.1)
  • Cost_TX represents the data transfer cost
  • L+R is the length of the bus 421 (the length of the bus 422 is equal to the length of the bus 421, and the length of the bus 423 is equal to the length of the bus 421)
  • A represents the width of the bus 421
  • B represents the width of the bus 422, and C represents the width of the bus 423 .
  • comparing Formula 4.1 and Formula 3.1, it can be found that when L is less than R and D is less than A+B+C, the data transfer cost of the structure shown in FIG. 3 is smaller than the data transfer cost of the structure shown in FIG. 4.
  • the data transfer cost of the structure shown in FIG. 3 is smaller.
  • since the processors 401 to 403 are all type 1 processors, the PS passing through the processors 401 to 403 is PS_Full.
  • the size of PS_Full is PS_Full_Size, and the number of processor cores included in a type 1 processor is N_Big1. Then, the processor cost of the structure shown in FIG. 4 can be expressed as Formula 4.2:
  • Cost_Proc = PS_Full_Size × N_Big1 (Formula 4.2)
  • Cost_Proc represents the processor cost
  • PS_Full_Size is the size of PS passing through the processor 401
  • N_Big1 is the number of processor cores included in the processor 401 .
  • compared with the structure shown in FIG. 4, the processor cost of the structure shown in FIG. 3 is reduced by (PS_Full_Size − PS_Little_Size) × N_Little. The larger the difference between PS_Full_Size and PS_Little_Size, the larger the processor cost saving (i.e., the smaller the processor cost). The larger the difference between N_Big1 and N_Little, the larger the processor cost saving (i.e., the smaller the processor cost).
  • Latency_L is used to represent the I/O delay of a bus of length L, and Latency_R is used to represent the I/O delay of a bus of length R. Then, the delay cost of the structure shown in FIG. 4 can be expressed as Formula 4.3:
  • Cost_LAT = (Latency_L + Latency_R) × 3 (Formula 4.3)
  • Cost_LAT represents the delay cost
  • Latency_L represents the I/O delay of the bus of length L
  • Latency_R represents the I/O delay of the bus of length R.
  • compared with the structure shown in FIG. 4, the structure shown in FIG. 3 can save an I/O delay of Latency_R × 2.
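  • the comparison drawn above can be checked with a small, self-contained Python sketch; the numbers are hypothetical and only need to satisfy the stated assumptions (L < R, D < A + B + C, PS_Little_Size < PS_Full_Size, and the number of cores of a type 1 processor equal to the cores of a type 2 processor plus the cores of a type 3 processor).

```python
# hypothetical parameters
L, R = 1, 10                      # bus lengths, L < R
A, B, C, D = 64, 64, 64, 128      # bus widths, D < A + B + C
ps_full, ps_little = 512, 64      # PS_Full_Size > PS_Little_Size
n_little, n_big2 = 2, 4
n_big1 = n_big2 + n_little        # assumption: type 1 cores = type 2 cores + type 3 cores
lat_L, lat_R = 1, 10              # I/O delays of buses of length L and R

# FIG. 3 (hybrid) costs, Formulas 3.1-3.3
fig3 = (L * (A + B + C) + R * D,
        ps_little * n_little + ps_full * n_big2,
        lat_L * 3 + lat_R * 1)

# FIG. 4 costs, Formulas 4.1-4.3
fig4 = ((L + R) * (A + B + C),
        ps_full * n_big1,
        (lat_L + lat_R) * 3)

assert all(h < f for h, f in zip(fig3, fig4))                 # FIG. 3 is cheaper on every cost
assert fig4[1] - fig3[1] == (ps_full - ps_little) * n_little  # processor cost saving
assert fig4[2] - fig3[2] == lat_R * 2                         # delay cost saving
print("FIG. 3 costs:", fig3, "FIG. 4 costs:", fig4)
```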
  • the technical solutions provided by the embodiments of the present application can implement corresponding functions with less cost (data transfer cost, processor cost, and delay cost).
  • since the bus length required inside the processing module is short and the bus width between the type 2 processor and the processing module is small, compared with a chip that implements the same function, the chip using the technical solution of the present application has a smaller area.
  • the basic process for equal-cost multi-path (ECMP) routing to determine the next-hop port is as follows: determine a hash value according to the flow identification information of the packet (such as the 5-tuple or the flow label), and then determine an entry according to the ECMP routing table and the hash value; the port included in the entry is the next-hop port used to send the packet.
  • the ECMP routing table can be divided into multiple tables, for example, into three tables, which are respectively called routing entry table 1, routing entry table 2 and routing entry table 3.
  • an entry corresponding to the flow identification information is determined from the routing entry table 1 according to the flow identification information of the message, and the entry includes a base address and an index of the routing entry table.
  • the routing entry table 2 is then determined according to the index of the routing entry table, and the entry corresponding to the base address and to the hash value determined according to the flow identification information of the message is queried from the routing entry table 2.
  • the entry includes a port index and an index to the routing entry table.
  • the routing entry table 3 is determined according to the index of the routing entry table, and the entry corresponding to the port index is queried from the routing entry table 3, and the entry includes the next hop port of the packet.
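  • a simplified Python sketch of this three-table lookup chain follows; the table contents, the stand-in hash function, and the field names are hypothetical and only illustrate how each entry indexes the next table.

```python
# hypothetical tables; in the circuit these would be stored in the memories read by the processors
routing_entry_table_1 = {"flow-1": {"table2_index": "t2-a", "base_addr": 0}}
routing_entry_table_2 = {"t2-a": {(0, 2): {"port_index": 5, "table3_index": "t3-a"}}}
routing_entry_table_3 = {"t3-a": {5: {"next_hop_port": "eth2"}}}

def ecmp_next_hop(flow_id: str, num_paths: int = 4) -> str:
    h = sum(map(ord, flow_id)) % num_paths            # stand-in for the hash of the flow identification info
    e1 = routing_entry_table_1[flow_id]               # step 1: base address + index of routing entry table 2
    e2 = routing_entry_table_2[e1["table2_index"]][(e1["base_addr"], h)]  # step 2: port index + index of table 3
    e3 = routing_entry_table_3[e2["table3_index"]][e2["port_index"]]      # step 3: entry with the next-hop port
    return e3["next_hop_port"]

print(ecmp_next_hop("flow-1"))  # -> eth2
```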
  • FIG. 5 shows a schematic flow chart of implementing the determination of the next hop port by using the circuit shown in FIG. 4 .
  • the processor 401 obtains the index of a routing entry table (hereinafter referred to as the routing table index 1) from the received PS.
  • the processor 401 sends the routing table index 1 to the memory 411 .
  • the processor 401 receives the routing entry table 1 corresponding to the routing table index 1 from the memory 411.
  • the processor 401 determines an entry in the routing entry table 1 corresponding to the flow identification information of the packet.
  • the entry includes an index of the routing entry table (hereinafter referred to as routing table index 2) and a base address, and the routing table index 2 and the base address are written into the PS.
  • the processor 401 sends the PS (that is, the PS after the routing table index 2 and the base address are written) to the processor 402 .
  • the processor 402 obtains the routing table index 2, the base address and a hash value from the received PS.
  • the hash value is determined according to the flow identification information of the packet.
  • the hash value may be determined by the upstream node of the processor 401 and written into the PS.
  • the processor 402 sends the routing table index 2 to the memory 412 .
  • the processor 402 receives the routing entry table 2 corresponding to the routing table index 2 from the memory 412 .
  • the processor 402 queries, from the routing entry table 2, the entry corresponding to the base address and the hash value; the entry includes a port index and an index of a routing entry table (hereinafter referred to as routing table index 3), and the processor 402 writes the port index and the routing table index 3 into the PS.
  • the processor 402 sends the PS (that is, the PS after the port index and the routing table index 3 are written) to the processor 403 .
  • the processor 403 obtains the routing table index 3 and the port index from the received PS.
  • the processor 403 sends the routing table index 3 to the memory 413 .
  • the processor 403 receives the routing entry table 3 corresponding to the routing table index 3 from the memory 413.
  • the processor 403 queries the entry corresponding to the port index from the routing entry table 3, where the content included in the entry is the next hop port of the packet.
  • the processor 403 writes the next-hop port of the packet to the PS, and sends the PS to the next node in the pipeline, and the next node continues to process the packet.
  • FIG. 6 is a schematic flowchart of implementing the determination of the next hop port by using the circuit shown in FIG. 3 .
  • the processor 301 obtains, from the received PS, an index of a routing entry table (hereinafter referred to as routing table index 1), the flow identification information of the packet, and the hash value determined according to the flow identification information of the packet.
  • the processor 301 sends the routing table index 1, the flow identification information of the packet, and the hash value determined according to the flow identification information of the packet to the processor 311.
  • the processor 311 sends the routing table index 1 to the memory 321 .
  • the processor 311 receives the routing entry table 1 corresponding to the routing table index 1 from the memory 321.
  • the processor 311 determines an entry in the routing entry table 1 corresponding to the flow identification information of the packet.
  • the entry includes an index of the routing entry table (hereinafter referred to as routing table index 2) and a base address, and the routing table index 2 and the base address are written into the PS.
  • the PS may further include a hash value determined according to the flow identification information of the packet.
  • the processor 311 sends the PS (that is, the PS after the routing table index 2 and the base address are written) to the processor 312 .
  • the processor 312 obtains the routing table index 2, the base address and the hash value from the received PS.
  • the processor 312 sends the routing table index 2 to the memory 322 .
  • the processor 312 receives the routing entry table 2 corresponding to the routing table index 2 from the memory 322.
  • the processor 312 queries, from the routing entry table 2, the entry corresponding to the base address and the hash value; the entry includes a port index and an index of a routing entry table (hereinafter referred to as routing table index 3), and the processor 312 writes the port index and the routing table index 3 into the PS.
  • the processor 312 sends the PS (that is, the PS after the port index and the routing table index 3 are written) to the processor 313 .
  • the processor 313 obtains the routing table index 3 and the port index from the received PS.
  • the processor 313 sends the routing table index 3 to the memory 323 .
  • the processor 313 receives the routing entry table 3 corresponding to the routing table index 3 from the memory 323.
  • the processor 313 queries the entry corresponding to the port index from the routing entry table 3, where the content included in the entry is the next hop port of the packet.
  • the processor 313 sends the next hop port of the packet to the processor 301 .
  • the processor 301 writes the next hop port of the packet into the PS, and sends the PS to the next node in the pipeline, and the next node continues to process the packet.
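  • as a minimal sketch of the division of work in FIG. 6, the Python snippet below models the flow: only routing-related fields cross the bus between the processor 301 and the processing module 310, the module's processors pass a small PS among themselves, and only the resulting next-hop port is returned. The function names, field names, and stubbed tables are illustrative assumptions, not part of the embodiments.

```python
# Hypothetical sketch of the FIG. 6 division of work. The full PS stays with the
# processor 301; the processing module 310 only ever sees a small, routing-only PS.

ROUTING_ENTRY_TABLE_2 = {0: {"port_index": 5, "table3_index": "t3"},
                         1: {"port_index": 6, "table3_index": "t3"}}   # keyed by base + hash (stub)
ROUTING_ENTRY_TABLE_3 = {("t3", 5): "port-1/0/5", ("t3", 6): "port-1/0/6"}

def processor_311(ps):
    # reads routing entry table 1 from memory 321 (stubbed here), writes the
    # table-2 index and the base address into the small PS
    ps.update(table2_index="t2", base=0)
    return ps

def processor_312(ps):
    # reads routing entry table 2 from memory 322, looks up base + hash,
    # writes the port index and the table-3 index into the small PS
    entry = ROUTING_ENTRY_TABLE_2[ps["base"] + ps["hash"]]
    ps.update(port_index=entry["port_index"], table3_index=entry["table3_index"])
    return ps

def processor_313(ps):
    # reads routing entry table 3 from memory 323, looks up the port index
    ps["next_hop_port"] = ROUTING_ENTRY_TABLE_3[(ps["table3_index"], ps["port_index"])]
    return ps

def processing_module_310(small_ps):
    for stage in (processor_311, processor_312, processor_313):
        small_ps = stage(small_ps)        # the small PS travels over the module-internal buses
    return small_ps

def processor_301(full_ps):
    # only routing-related fields cross bus 331; the large full PS
    # (the packet-forwarding context) never enters the processing module
    small_ps = {k: full_ps[k] for k in ("table1_index", "flow_id", "hash")}
    full_ps["next_hop_port"] = processing_module_310(small_ps)["next_hop_port"]
    return full_ps

full_ps = {"table1_index": 7, "flow_id": "10.0.0.1,10.0.0.2,6,1000,80",
           "hash": 1, "forwarding_context": "..."}
print(processor_301(full_ps)["next_hop_port"])   # -> port-1/0/6
```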
  • the processors 401 to 403 belong to the same pipeline as other processors in the chip.
  • the PS processed by the processor in the pipeline is used to save the context information in the message forwarding process. Therefore, the PS sent by the processor 401 to the processor 402 and the PS sent by the processor 402 to the processor 403 need to include the information required by the subsequent node in addition to the information required for querying the next hop port. Therefore, the size of the PS may be relatively large, for example, the size of the PS may be 512 bytes. Correspondingly, the width of the bus between processors is also larger.
  • the processors 311 to 313 only care about the routing function, and the transmitted PS only needs to include the information required for routing. Therefore, setting a smaller PS can satisfy the requirements of the processors 311 to 313 . For example, a 64-byte PS can meet the routing requirements. Correspondingly, the bus width between processors can be set smaller.
  • the information required by the processor 312 and the processor 313 comes from the previous node, and there is no need to obtain information from the processor 301.
  • in addition, the processor 301 only cares about what the determined next-hop port is; it does not need to obtain the routing entry table 3 used to determine that port, nor does it need to send information unrelated to determining the next-hop port to the processing module 310. Therefore, the width of the bus between the processor 301 and the processing module 310 can be set smaller.
  • the width of the bus 331 may be 128 bits (bits).
  • the bus between a processor and a memory requires a larger width because more information (e.g., a routing entry table) needs to be transmitted over it.
  • bus 333 to bus 335 may be 256 bits wide
  • bus 421 to bus 423 may be 256 bits wide.
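  • as a rough illustration of why the narrower bus 331 is still sufficient, the sketch below counts the bus cycles needed to move one PS across buses of the widths mentioned above; the one-bus-width-word-per-cycle model and the 256-bit inter-processor bus are assumptions made only for this example.

```python
from math import ceil

def transfer_cycles(ps_bytes: int, bus_width_bits: int) -> int:
    """Bus cycles needed to move one PS, assuming one bus-width word per cycle."""
    return ceil(ps_bytes * 8 / bus_width_bits)

# 512-byte full PS between pipeline processors, over an assumed 256-bit inter-processor bus
print(transfer_cycles(512, 256))   # 16 cycles
# 64-byte routing-only PS over the 128-bit bus 331 between the processor 301 and the processing module 310
print(transfer_cycles(64, 128))    # 4 cycles
```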
  • the embodiment of the present application also provides a circuit.
  • the circuit includes a first processor and a first processing module connected to the first processor; the first processing module includes a second processor connected to a first memory, and the transmission delay generated by the second processor performing read and write operations on the first memory is smaller than the transmission delay generated by the communication between the first processor and the first processing module.
  • for example, assume that the processing module 121 shown in FIG. 1 includes only the processor 1211 and the memory 1221.
  • the first processor may be equivalent to the processor 112
  • the first processing module may be equivalent to the processing module 121
  • the second processor may be equivalent to the processor 1211
  • the first memory may be equivalent to the memory 1221 .
  • the transmission delay caused by the processor 1211 performing read and write operations on the memory 1221 is smaller than the transmission delay caused by the communication between the processor 112 and the processing module 121.
  • the processing module 221 as shown in FIG. 2 only includes the processor 2211 .
  • the first processor may be equivalent to the processor 212
  • the first processing module may be equivalent to the processing module 221
  • the second processor may be equivalent to the processor 2211
  • the first memory may be equivalent to the memory 232.
  • the second processor is a multi-core processor
  • the transmission delay generated by the second processor performing read and write operations on the first memory is the transmission delay generated by any one core of the multi-core processor included in the second processor performing read and write operations on the first memory.
  • the first processor is connected to the first processing module through a first bus
  • the second processor is connected to the first memory through a second bus, wherein the bus width of the second bus is greater than the bus width of the first bus, and/or the length of the second bus is smaller than the length of the first bus.
  • still assuming that the processing module 121 shown in FIG. 1 includes only the processor 1211 and the memory 1221, the first bus corresponds to the bus 11 connecting the processor 112 and the processing module 121, and the second bus corresponds to the bus 31 connecting the processor 1211 and the memory 1221.
  • likewise, assume that the processing module 221 shown in FIG. 2 includes only the processor 2211.
  • the first bus corresponds to the bus 2411 for connecting the processor 212 and the processing module 221
  • the second bus may correspond to the bus for connecting the processor 2211 and the memory 232 , including the bus 2431 and the bus 2472 .
  • the second bus may also correspond to the bus 2431 for connecting the processor 2211 and the input/output interface 201 .
  • the first processing module further includes a third processor connected to a second memory, and the transmission delay generated by the third processor performing read and write operations on the second memory is smaller than the transmission delay generated by the communication between the first processor and the first processing module.
  • the first processor may be equivalent to the processor 112
  • the first processing module may be equivalent to the processing module 121
  • the second processor may be equivalent to the processor 1211
  • the third processor may be equivalent to the processor 1212
  • the first memory may be equivalent to the memory 1221
  • the second memory may be equivalent to the memory 1222.
  • the transmission delay caused by the processor 1211's read and write operations to the memory 1221 is smaller than the transmission delay caused by the communication between the processor 112 and the processing module 121
  • the transmission delay caused by the processor 1212's read and write operations to the memory 1222 is smaller than the transmission delay caused by the communication between the processor 112 and the processing module 121
  • the first processor may be equivalent to the processor 212
  • the first processing module may be equivalent to the processing module 221
  • the second processor may be equivalent to the processor 2211
  • the third processor may be equivalent to the processor 2212
  • the first memory may correspond to the memory 232
  • the second memory may correspond to the memory 233 .
  • the first processor is connected to the first processing module through a first bus
  • the second processor is connected to the first memory through a second bus
  • the third processor is connected to the second memory through a third bus, and the sum of the bus width of the second bus and the bus width of the third bus is greater than the bus width of the first bus.
  • the first bus may be equivalent to bus 11
  • the second bus may be equivalent to bus 31
  • the third bus may be equivalent to bus 32 .
  • the first bus may be equivalent to bus 2411
  • the second bus may be equivalent to bus 2431 and bus 2472
  • the third bus may be equivalent to bus 2432 and bus 2473 .
  • the second bus may also correspond to the bus 2431
  • the third bus may also correspond to the bus 2432 .
  • the first processing module further includes a third processor connected to the first memory, and the transmission delay caused by the third processor performing read and write operations on the first memory is smaller than the transmission delay generated by the communication between the first processor and the first processing module.
  • the first processor is connected to the first processing module through a first bus
  • the second processor is connected to the first memory through a second bus
  • the third processor is connected to the first memory through a third bus, and the sum of the bus width of the second bus and the bus width of the third bus is greater than the bus width of the first bus.
  • the second processor and the third processor belong to pipeline processors.
  • the circuit further includes a fourth processor and a third memory connected to the fourth processor.
  • the processor 111 may be equivalent to the fourth processor, and the memory 131 may be equivalent to the third memory.
  • the processor 211 may be equivalent to the fourth processor, and the memory 231 may be equivalent to the third memory.
  • the circuit further includes a fourth processor and a second processing module connected to the fourth processor, the second processing module including N fifth processors connected to the M memories , the N and M are both integers greater than or equal to 1, and the transmission delay generated by any fifth processor performing read and write operations on the memory connected to it is smaller than the transmission delay generated by the communication between the fourth processor and the second processing module time delay.
  • the processor 114 may be equivalent to the fourth processor, and the processing module 122 may be equivalent to the second processing module.
  • the processor 214 may be equivalent to the fourth processor, and the processing module 222 may be equivalent to the second processing module.
  • the second processor is connected to the third processor through a fourth bus
  • the fourth processor is connected to the first processor through a fifth bus
  • the bus width of the fourth bus is smaller than the bus width of the fifth bus.
  • the bus 21 may correspond to the fourth bus, and the bus 41 may correspond to the fifth bus.
  • the bus 2421 may be equivalent to the fourth bus, and the bus 2441 may be equivalent to the fifth bus.
  • the number of processor cores included in the fourth processor is greater than or equal to the number of processor cores included in the first processor.
  • the fourth processor and the first processor belong to pipeline processors.
  • the first processing module further includes the first memory.
  • An embodiment of the present application further provides an electronic device, the electronic device includes the chip provided by the embodiment of the present application, and the electronic device further includes a receiver and a transmitter.
  • the receiver is used to receive the message and send the message to the chip.
  • This chip is used to process the message.
  • the transmitter is used for acquiring the message processed by the chip, and sending the processed message to another electronic device.
  • the electronic device may be a switch, a router, or any other electronic device that can be provided with the above-mentioned chip.
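  • a minimal structural sketch of such an electronic device is given below; the class and method names are hypothetical, and the chip's processing is stubbed out.

```python
class Chip:
    """Stand-in for the chip described above (for example, an SoC or an NP)."""
    def process(self, packet: bytes) -> bytes:
        # placeholder for the pipeline / processing-module handling of the packet
        return packet

class ElectronicDevice:
    """Illustrative switch or router: receiver -> chip -> transmitter."""
    def __init__(self) -> None:
        self.chip = Chip()

    def receive(self, packet: bytes) -> None:
        processed = self.chip.process(packet)   # the receiver hands the packet to the chip
        self.transmit(processed)                # the transmitter forwards the processed packet

    def transmit(self, packet: bytes) -> None:
        print("forwarding", len(packet), "bytes to another electronic device")

ElectronicDevice().receive(b"\x00" * 64)
```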
  • the chip referred to in the embodiments of the present application may be a system on chip (system on chip, SoC), and may also be a network processor (network processor, NP) or the like.
  • the memory in the embodiments of the present application may be volatile memory or non-volatile memory, or may include both volatile and non-volatile memory.
  • the non-volatile memory may be a read-only memory (ROM), a programmable read-only memory (PROM), an erasable programmable read-only memory (EPROM), an electrically erasable programmable read-only memory (EEPROM), or a flash memory.
  • Volatile memory may be random access memory (RAM), which acts as an external cache.
  • by way of example and not limitation, many forms of RAM are available, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), synchlink DRAM (SLDRAM), and direct rambus RAM (DR RAM). The memories of the systems and methods described herein are intended to include, but not be limited to, these and any other suitable types of memory.
  • the processor in this embodiment of the present application may be an integrated circuit chip, which has a signal processing capability.
  • each step of the above method embodiments may be completed by a hardware integrated logic circuit in a processor or an instruction in the form of software.
  • the processor may be a microprocessor or the processor may be any conventional processor or the like.
  • each step of the above-mentioned method can be completed by a hardware integrated logic circuit in a processor or an instruction in the form of software.
  • the steps of the methods disclosed in conjunction with the embodiments of the present application may be directly embodied as executed by a hardware processor, or executed by a combination of hardware and software modules in the processor.
  • the software modules may be located in random access memory, flash memory, read-only memory, programmable read-only memory or electrically erasable programmable memory, registers and other storage media mature in the art.
  • the storage medium is located in the memory, and the processor reads the information in the memory, and completes the steps of the above method in combination with its hardware. To avoid repetition, detailed description is omitted here.
  • the disclosed system, apparatus and method may be implemented in other manners.
  • the apparatus embodiments described above are only illustrative.
  • the division of the units is only a logical function division. In actual implementation, there may be other division methods.
  • multiple units or components may be combined or integrated into another system, or some features may be ignored or not implemented.
  • the shown or discussed mutual coupling or direct coupling or communication connection may be through some interfaces, indirect coupling or communication connection of devices or units, and may be in electrical, mechanical or other forms.
  • the units described as separate components may or may not be physically separated, and components displayed as units may or may not be physical units, that is, may be located in one place, or may be distributed to multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution in this embodiment.
  • each functional unit in each embodiment of the present application may be integrated into one processing unit, or each unit may exist physically alone, or two or more units may be integrated into one unit.
  • the functions, if implemented in the form of software functional units and sold or used as independent products, may be stored in a computer-readable storage medium.
  • in essence, the technical solution of the present application, or the part of it that contributes to the prior art, or a part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions used to cause a computer device (which may be a personal computer, a server, a network device, or the like) to execute all or part of the steps of the methods described in the embodiments of the present application.
  • the aforementioned storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.


Abstract

This application provides a circuit, a chip, and an electronic device. The circuit includes a first processor and a first processing module connected to the first processor. The first processing module includes a second processor connected to a first memory, and the transmission delay generated by the second processor performing read and write operations on the first memory is smaller than the transmission delay generated by the communication between the first processor and the first processing module. Because the transmission delay generated by the second processor performing read and write operations on the first memory is smaller than the transmission delay generated by the communication between the first processor and the first processing module, the cost of the transmission delay of data on the bus can be reduced.

Description

电路、芯片和电子设备
本申请要求于2020年9月30日提交中国国家知识产权局、申请号为202011060780.0、申请名称为“处理器架构、设备和方法”的中国专利申请的优先权,以及,2020年10月28日提交中国国家知识产权局、申请号为202011176149.7、申请名称为“电路、芯片和电子设备”的中国专利申请的优先权,其全部内容通过引用结合在本申请中。
技术领域
本申请涉及芯片技术领域,更具体地,涉及电路、芯片和电子设备。
背景技术
当前的高速网络芯片中的处理器一般采用流水线方式设置。一个报文进入芯片后,会为这个报文生成一份程序状态(program state,PS)来保存这个报文转发过程中的上下文信息。流水线上的处理器对报文进行处理,并将处理结果保存到PS中再送往下一个处理器。目前芯片中处理器与保存PS的存储器之间的设计不合理,会导致读写PS所产生的时延较高。
发明内容
本申请提供一种电路、芯片和电子设备,能够降低传输时延。
第一方面,本申请实施例提供一种电路。该电路包括第一处理器和与该第一处理器相连的第一处理模块,该第一处理模块包括与第一存储器相连的第二处理器,该第二处理器对该第一存储器执行读写操作产生的传输时延小于该第一处理器与该第一处理模块间通信产生的传输时延。由于该第二处理器对该第一存储器执行读写操作产生的传输时延小于该第一处理器与该第一处理模块间通信产生的传输时延,那么可以减少数据在总线中的传输时延的代价。
结合第一方面,在一种可能的实现方式,该第二处理器对该第一存储器执行读写操作产生的传输时延小于或等于该第一处理器与该第一处理模块间通信产生的传输时延的1/10。
结合第一方面,在一种可能的实现方式,该第二处理器为多核处理器,该第二处理器对该第一存储器执行读写操作产生的传输时延为该第二处理器包括的多核处理器中的任一核处理器对该第一存储器执行读写操作产生的传输时延。
结合第一方面,在一种可能的实现方式,该第二处理器为多核处理器,
该第二处理器对该第一存储器执行读写操作产生的传输时延为该第二处理器包括的多核处理器中的任一核处理器对该第一存储器执行读写操作产生的传输时延。
结合第一方面,在一种可能的实现方式,该第一处理器通过第一总线与该第一处理模块相连,该第二处理器通过第二总线与该第一存储器相连,其中,该第二总线的总线位宽 大于该第一总线的总线位宽,和/或,该第二总线的长度小于该第一总线的长度。由于第二总线的长度小于第一总线的长度,可以缩小电路的面积。
结合第一方面,在一种可能的实现方式,该第二总线的长度可以小于或等于第一总线的长度的1/10。上述技术方案可以更进一步缩小电路的面积。
结合第一方面,在一种可能的实现方式,该第一处理模块还包括与第二存储器相连的第三处理器,该第三处理器对该第二存储器执行读写操作产生的传输时延小于该第一处理器与该第一处理模块通信产生的传输时延。
结合第一方面,在一种可能的实现方式,第一处理器通过第一总线与该第一处理模块相连,该第二处理器通过第二总线与该第一存储器相连,该第三处理器通过第三总线与该第二存储器相连,该第二总线的总线位宽与该第三总线的总线宽度之和大于该第一总线的总线位宽。
结合第一方面,在一种可能的实现方式,该第一处理模块还包括与该第一存储器相连的第三处理器,该第三处理器与该第一存储器执行读写操作产生的传输时延小于该第一处理器与该第一处理模块间通信产生的传输时延。
结合第一方面,在一种可能的实现方式,第一处理器通过第一总线与该第一处理模块相连,该第二处理器通过第二总线与该第一存储器相连,该第三处理器通过第三总线与该第一存储器相连,该第二总线的总线位宽与该第三总线的总线宽度之和大于该第一总线的总线位宽。
结合第一方面,在一种可能的实现方式,该第二处理器和该第三处理器属于流水线pipeline处理器。
结合第一方面,在一种可能的实现方式,该电路还包括第四处理器和与该第四处理器相连的第二处理模块,该第二处理模块包括与M个存储器相连的N个第五处理器,该N和M均为大于或等于1的整数,任一第五处理器对与其相连的存储器执行读写操作产生的传输时延小于该第四处理器与该第二处理模块通信产生的传输时延。
结合第一方面,在一种可能的实现方式,该第二处理器通过第四总线与该第三处理器相连,该第四处理器通过第五总线与该第一处理器相连,该第四总线的总线位宽小于该第五总线的总线位宽。
结合第一方面,在一种可能的实现方式,该第四处理器包括的处理器核数大于或等于该第一处理器包括的处理器核数。。
结合第一方面,在一种可能的实现方式,该第四处理器和该第一处理器属于pipeline处理器。
结合第一方面,在一种可能的实现方式中,该第一处理模块还包括该第一存储器。
第二方面,本本申请实施例还提供一种芯片,该芯片包括如第一方面或第一方面任一种可能的实现方式的电路。
第三方面,本申请实施例还提供一种电子设备,该电子设备包括本申请实施例提供的芯片,该电子设备还包括接收器和发送器。该接收器,用于接收报文并将报文发送至该芯片。该芯片,用于处理该报文。该发送器,用于获取该芯片处理后的报文,并将该处理后的报文发送至另一电子设备。该电子设备可以是交换机、路由器或者其他任何能够设置有上述芯片的电子设备。
第四方面,本申请实施例还提供一种处理方法,该方法包括:第一处理器接收到第一报文,所述第一报文包括流标识信息;所述第一处理器根据所述流标识信息,确定第一处理模块,所述第一处理模块与所述流标识信息对应;所述第一处理器向所述第一处理模块发送所述第一报文。
上述方法中,第一处理器根据报文中携带的流标识信息,将需要第一处理模块进行处理的报文发送至第一处理模块,由第一处理模块中的处理器进行相应的处理。由于第一处理模块相对于第一处理器更靠近存储器,有助于降低传输时延。
可选地,所述方法还包括:所述第一处理器接收来自所述第一处理模块的第二报文,所述第二报文是所述第一处理模块根据所述流标识信息进行处理后的报文,所述第二报文包括所述流标识信息。
可选地,所述方法还包括:所述第一处理器向下一处理器发送所述第二报文,所述下一处理器为所述第一处理器在其所属流水线上的下一跳。
第五方面,本申请实施例还提供一种处理方法,所述方法包括:第一处理模块中的第二处理器接收来自第一处理器的第一报文,所述第一报文包括流标识信息;所述第二处理器根据所述流标识信息,从与第二处理器对应的存储器获取用于对所述第一报文进行处理的参数;所述第二处理器根据所述参数对所述第一报文进行处理,并向第一处理模块中的第三处理器发送处理后的第一报文,所述处理后的第一报文包括所述流标识信息;所述第一处理模块中的第三处理器根据所述流标识信息,从与第三处理器对应的存储器获取用于对所述处理后的第一报文进行处理的参数;所述第三处理器根据所述参数对所述处理后的第一报文进行处理,获得第二报文并发送给所述第一处理器。
上述方法中,第一处理模块中的处理器根据第一报文中的流标识,对存储器进行读取操作并进行相应的处理。由于第一处理模块相对于第一处理器更靠近存储器,有助于降低传输时延。
可选地,所述处理可以包括查表转发,所述参数包括转发表项的索引、基地址和哈希值中的一个或多个,所述参数与所述流标识对应。
附图说明
图1是根据本申请实施例提供的芯片的示意图。
图2是根据本申请实施例提供的另一种芯片的示意图。
图3示出了一种电路的示意图。
图4是另一种电路的示意图。
图5是采用如图4所示的电路实现确定下一跳端口的示意性流程图。
图6是采用如图3所示的电路实现确定下一跳端口的示意性流程图。
具体实施方式
下面将结合附图,对本申请中的技术方案进行描述。
本申请将围绕可包括多个设备、组件、模块等的系统来呈现各个方面、实施例或特征。应当理解和明白的是,各个系统可以包括另外的设备、组件、模块等,并且/或者可以并不包括结合附图讨论的所有设备、组件、模块等。此外,还可以使用这些方案的组合。
另外,在本申请实施例中,“示例的”、“例如”等词用于表示作例子、例证或说明。本申请中被描述为“示例”的任何实施例或设计方案不应被解释为比其它实施例或设计方案更优选或更具优势。确切而言,使用示例的一词旨在以具体方式呈现概念。
本申请实施例中,“相应的(corresponding,relevant)”和“对应的(corresponding)”有时可以混用,应当指出的是,在不强调其区别时,其所要表达的含义是一致的。
本申请实施例中,有时候下标如W 1可能会笔误为非下标的形式如W1,在不强调其区别时,其所要表达的含义是一致的。
本申请实施例描述的网络架构以及业务场景是为了更加清楚的说明本申请实施例的技术方案,并不构成对于本申请实施
例提供的技术方案的限定,本领域普通技术人员可知,随着网络架构的演变和新业务场景的出现,本申请实施例提供的技术方案对于类似的技术问题,同样适用。
在本说明书中描述的参考“一个实施例”或“一些实施例”等意味着在本申请的一个或多个实施例中包括结合该实施例描述的特定特征、结构或特点。由此,在本说明书中的不同之处出现的语句“在一个实施例中”、“在一些实施例中”、“在其他一些实施例中”、“在另外一些实施例中”等不是必然都参考相同的实施例,而是意味着“一个或多个但不是所有的实施例”,除非是以其他方式另外特别强调。术语“包括”、“包含”、“具有”及它们的变形都意味着“包括但不限于”,除非是以其他方式另外特别强调。
本申请中,“至少一个”是指一个或者多个,“多个”是指两个或两个以上。“和/或”,描述关联对象的关联关系,表示可以存在三种关系,例如,A和/或B,可以表示:单独存在A,同时存在A和B,单独存在B的情况,其中A,B可以是单数或者复数。字符“/”一般表示前后关联对象是一种“或”的关系。“以下至少一项(个)”或其类似表达,是指的这些项中的任意组合,包括单项(个)或复数项(个)的任意组合。例如,a,b,或c中的至少一项(个),可以表示:a,b,c,a-b,a-c,b-c,或a-b-c,其中a,b,c可以是单个,也可以是多个。
图1是根据本申请实施例提供的芯片的示意图。如图1所示,芯片100包括输入输出接口101,处理器111,处理器112,处理器113和处理器114。芯片100还包括处理模块121和处理模块122。处理模块121包括处理器1211、处理器1212、存储器1221和存储器1222。处理模块122包括处理器1213、处理器1214、处理器1215、存储器1223、存储器1224和存储器1225。芯片100还包括存储器131和存储器132。处理器111通过总线61与输入输出接口101相连,通过总线41与处理器112相连,通过总线51与存储器131相连。处理器112通过总线42与处理器113相连,通过总线11与处理模块121相连。处理器113通过总线43与处理器114相连,通过总线52与存储器132相连。处理器114通过总线62与输入输出接口101相连,通过总线12与处理模块122相连。处理器1211通过总线31与存储器1221相连,通过总线21与处理器1212相连。处理器1212通过总线32与存储器1222相连。处理器1213通过总线33与存储器1223相连,通过总线22与处理器1214相连。处理器1214通过总线34与存储器1224相连,通过总线23与处理器1215相连。处理器1215通过总线35与存储器1225相连。
芯片100通过流水线(pipeline)方式处理接收到的报文。如图1所示,处理器111、处理器112、处理器113和处理器114属于同一流水线,亦可称处理器111、处理器112、 处理器113和处理器114属于流水线处理器。可选地,处理器1211和处理器1212属于同一流水线,处理器1213、处理器1214和处理器1215属于同一流水线。为了便于描述,可以将该流水线称为第一流水线。如图1所示,第一流水线中的部分处理器可以通过总线直接访问存储器。例如,处理器111可以通过总线11直接访问存储器131,处理器113可以通过总线13直接访问存储器132。为了便于描述,可以将第一流水线中可以直接访问存储器的处理器称为类型1的处理器。如图1所示,第一流水线中的另一部分处理器可以与处理模块通信。例如,处理器112通过总线11与处理模块121通信,处理器114通过总线12与处理模块122通信。为了便于描述,可以将第一流水线中可以与处理模块通信的处理器称为类型2的处理器。每个处理模块中包括的多个处理器也可以属于一条流水线。例如,处理器1211和处理器1212属于同一流水线,处理器1213、处理器1214和处理器1215属于同一流水线。为了便于描述,可以将处理模块中的流水线称为第二流水线。处理模块中的处理器可以称为类型3的处理器。换句话说,图1所示的处理器1211、处理器1212、处理器1213、处理器1214和处理器1215都可以称为类型3的处理器。
如图1所示,任一类型1的处理器与一个存储器对应,任一类型3的处理器与一个存储器对应。任一类型1的处理器或者任一类型3的处理器通过总线与对应的存储器相连,以对存储器进行读写操作。例如,存储器131对应于处理器111,存储器1221对应于处理器1211。上述存储器和处理器之间的一对一的对应关系,也可被一对多或者多对一的对应关系所替代,例如:任一类型1的处理器或者任一类型3的处理器可与多个存储器对应,以对多个存储器进行读写操作。或者多个类型1的处理器可对应一个存储器,多个类型3的处理器可对应一个存储器,以对存储器进行读写操作。例如,图1中的存储器131和存储器132可被一个存储器替代,处理器111和处理器113对应同一存储器。图1中的存储器131也可被多个存储器替代,处理器111对应多个存储器。图1中的存储器1221和存储器1222可被一个存储器替代,处理器1211和处理器1212对应同一存储器。
根据连接对象的不同,芯片100中的总线包括类型1总线、类型2总线、类型3总线、类型4总线、类型5总线和类型6总线。类型1总线用于连接类型2的处理器和与类型2的处理器对应的处理模块。例如,用于连接处理器112和处理模块121的总线11和用于连接处理器114和处理模块122的总线12都属于类型1总线。类型2总线用于连接两个类型3的处理器。例如,用于连接处理器1211和处理器1212的总线21,用于连接处理器1213和处理器1214的总线22,用于连接处理器1214和处理器1215的总线23都属于类型2总线。类型3总线用于连接类型3的处理器和与类型3的处理器对应的存储器。例如,用于连接处理器1211和存储器1221的总线31,用于连接处理器1213和存储器1223的总线33等都属于类型3总线。类型4总线用于连接第一流水线中的两个处理器。例如,用于连接处理器111和处理器112的总线41,用于连接处理器112和处理器113的总线42,用于连接处理器113和处理器114的总线43都属于类型4总线。类型5总线用于连接类型1的处理器和与该类型1的处理器对应的存储器。例如,用于连接处理器111和存储器131的总线51和用于连接处理器113和存储器132的总线52都属于类型5总线。类型6总线用于连接输入输出接口101和处理器。例如,用于连接输入输出接口101和处理器111的总线61和用于连接处理器114和输入输出接口101的总线62都属于类型6的总线。
在一些实施例中,第一流水线中的每个处理器是多核心处理器。第一流水线中的每个处理器可以包括多个处理器核(也可以称为核心(core))。在一些实施例中,第一流水线中的不同的处理器包括的处理器核的数目可以相同。换句话说,第一流水线中的任意两个处理器包括的处理器核的数目均相同。还以图1为例,处理器111中包括的处理器核的数目等于处理器112中包括的处理器核的数目,处理器112中包括的处理器核的数目等于处理器113中包括的处理器核的数据,处理器113中包括的处理器核的数目等于处理器114。在另一些实施例中,第一流水线中的不同的处理器包括的处理器核的数目可以不相同。换句话说,第一流水线中的任意两个处理器包括的处理器核的数目可以不相同。例如,处理器111包括的处理器核的数目大于处理器112包括的处理器核的数目。处理器113包括的处理器核的数目大于处理器114包括的处理器核的数目。处理器111包括的处理器核的数目大于处理器113包括的处理器核的数目,处理器112包括的处理器核的数目大于处理器114包括的处理器核的数目。
在另一些实施例中,第一流水线中的部分处理器包括的处理器核的数目相同。例如,处理器111中包括的处理器核的数目等于处理器113中包括的处理器核的数目,处理器112中包括的处理器核的数目等于处理器核114中包括的处理器核的数目,但是处理器111中包括的处理器核的数目不同于处理器112中包括的处理器核的数目。如上所述,根据连接的对象不同,第一流水线中的处理器可以被分为两类,分别为类型1的处理器(如处理器111和处理器113)和类型2的处理器(如处理器112和处理器114)。在一些实施例中,相同类型的处理器包括的处理器核的数目是相同的,不同类型的处理器包括的处理器核的数目可以是不同的。在一些实施例中,类型1的处理器包括的处理器核的数目可以大于类型2的处理器核的数目。由于类型2的处理器与处理模块通信,而处理模块包括的处理器能够进行部分处理操作,这样,类型2的处理器可采用单核处理器或者核数较少的处理器,能够进一步降低硬件成本。例如,类型2的处理器包括的处理器核的数目可以是类型1的处理器包括的处理器核的数目的1/2、1/3、1/5,或1/8等。
在另一些实施例中,类型1的处理器可以是多核处理器,类型2的处理器可以是单核处理器。在一些实施例中,类型3的处理器也可以是一个多核处理器。换句话说,类型3的处理器也可以包括多个处理器核。在一些实施例中,类型3的处理器包括的处理器核的数目少于类型1或类型2的处理器包括的处理器核的数目。换句话说,类型1的处理器包括的处理器核的数目和类型2的处理器包括的处理器核的数目都大于类型3的处理器包括的处理器核的数目。例如,处理器1211包括的处理器核的数目可以少于处理器111包括的处理器核的数目,且处理器1211包括的处理器核的数目也可以少于处理器112包括的处理器核的数目。在另一些实施例中,类型3的处理器包括的处理器核的数目可以少于类型1的处理器包括的处理器核的数目,类型3的处理器包括的处理器核的数目可以等于或大于类型2的处理器包括的处理器核的数目。例如,处理器1213包括的处理器核的数目可以少于处理器111包括的处理器核的数目,且处理器1213包括的处理器核的数目可以等于或大于处理器114包括的处理器核的数目。例如,在一些实施例中,类型3的处理器包括的处理器核的数目可以小于或等于类型1的处理器包括的处理器核的数目的1/10。又如,在另一些实施例中,类型3的处理器包括的处理器核的数目可以小于或等于类型1的处理器包括的处理器核的数目的1/2、1/3、1/5,或1/8等。
在另一些实施例中,类型2的处理器包括的处理器核的数目与该处理器对应的处理模块中的一个类型3的处理器包括的处理器核的数目等于类型1的处理器包括的处理器核的数目。例如,处理器112包括的处理器核的数目与处理器1212包括的处理器核的数目之和等于处理器111包括的处理器核的数目。又如,处理器114包括的处理器核的数目与处理器1214包括的处理器核的数目之和等于处理器113包括的处理器核的数目。在一些实施例中,不同的类型3的处理器包括的处理器核的数目可以是相同的。例如,处理器1211包括的处理器核的数目等于处理器1212包括的处理器核的数目,处理器1212包括的处理器核的数目等于处理器1215包括的处理器核的数目。
在另一些实施例中,不同的类型3的处理器包括的处理器核的数目可以不同。
在另一些实施例中,属于同一个处理模块的任意两个处理器包括的处理器核的数目相同,属于不同的处理模块的两个处理器包括的处理器核的数目不同。例如,处理器1211包括的处理器核的数目等于处理器1212包括的处理器核的数目,处理器1212包括的处理器核的数目不等于处理器1213包括的处理器核的数目。在图1所示的芯片中,每个处理模块都包括至少两个处理器。在另一些实施例中,处理模块也可以包括一个多核处理器。例如,处理模块121中可以只包括处理器1211和存储器1221,其中处理器1211是一个多核处理器。
在一些实施例中,类型2的处理器也可以是单核处理器。如果类型2的处理器是单核处理器,那么包含该处理器的处理模块可以包括至少两个处理器。换句话说,如果处理模块包括多个处理器,那么该多个处理器中可以包括至少一个单核处理器。以处理模块121为例,处理模块121中的处理器1211可以是单核处理器,处理器1212可以是单核处理器也可以是多核处理器。
在一些实施例中,类型1总线的长度大于类型3总线的长度。例如,类型3总线的长度可以等于类型1总线的长度的1/5、1/8或1/10等。又如,类型3总线的长度可以小于类型1总线的长度的1/10、1/15或1/20等。在一些实施例中,类型1总线的长度和类型3总线的长度之和等于类型5总线的长度。
在一些实施例中,任意两个类型1总线的长度可以是相同的。在一些实施例中,任意两个类型2总线的长度可以是相同的。在一些实施例中,任意两个类型3总线的长度可以是相同的。在一些实施例中,任意两个类型4总线的长度可以是相同的。在一些实施例中,任意两个类型5总线的长度可以是相同的。由于制造工艺的限制,总线的长度完全相同可能很难实现。因此,本申请实施例中的长度相同可以理解为长度完全相同,也可以理解为长度差在允许的误差范围内。例如,类型1总线的长度和类型3总线的长度之和等于类型5总线的长度可以理解为类型1总线的长度和类型3总线的长度之和与类型5总线的长度之差为0或者小于或等于预设的允许误差值。又如,总线51的长度和总线52的长度(即两个类型5总线的长度)之差为0或者小于或等于预设的允许误差值。
在一些实施例中,同一处理模块中的所有第三总线的宽度之和大于一个第一总线的宽度。例如,总线31的宽度和总线32的宽度大于总线11的宽度。又如,总线33的宽度,总线34的宽度和总线35的宽度之和大于总线12的宽度。总线可同时传输的二进制数据的位数就称为宽度(width)(也可以称为位宽),以比特为单位,总线宽度愈大,传输性能就愈佳,相同时间内可传输的数据量越多。总线的带宽(即单位时间内可以传输的总 数据数)为:总线带宽=频率×宽度(Bytes/sec)。
在一些实施例中,类型2总线的宽度可以小于类型4总线的宽度。图2是根据本申请实施例提供的另一种芯片的示意图。如图2所示,芯片200包括输入输出接口201,处理器211,处理器212,处理器213和处理器214。芯片200还包括处理模块221和处理模块222。处理模块221包括处理器2211和处理器2212。处理模块222包括处理器2213、处理器2214和处理器2215。
处理器211通过总线2411与输入输出接口201相连。处理器211通过总线2441与处理器212相连。处理器212通过总线2421与处理模块221相连。处理器212通过总线2442与处理器213相连。处理器213通过总线与输入输出接口201相连。处理器213通过总线2443与处理器214相连。处理器214通过总线2422与处理模块222相连。处理模块221通过总线2431与输入输出接口201相连。处理模块222通过总线2432与输入输出接口201相连。处理器2221通过总线2451与处理器2212相连。处理器2213通过总线2452与处理器2214相连。处理器2214通过总线2453与处理器2215相连。存储器231至存储器237是位于芯片200外部的存储器。芯片200可以通过输入输出接口201和相应的总线访问存储器231至存储器237。具体地,存储器231通过总线2461与芯片200相连;存储器232通过总线2462与芯片200相连;存储器233通过总线2463与芯片200相连;存储器234通过总线2464与芯片200相连;存储器235通过总线2465与芯片200相连;存储器236通过总线2466与芯片200相连;存储器237通过总线2467与芯片200相连。
芯片200通过流水线(pipeline)方式处理接收到的报文。如图2所示,芯片200中的处理器211、处理器212、处理器213和处理器214属于一条流水线,该流水线可以称为第一流水线。如图2所示,第一流水线中的部分处理器可以直接通过总线与输入输出接口通信;第一流水线中的另一部分处理器通过总线与处理模块连接。为了便于描述,可以直接与输入输出接口通信的处理器(即没有与处理模块连接的处理器)可以称为类型1的处理器;与处理模块连接的处理器可以称为类型2的处理器。例如,图2中的类型1的处理器可以包括处理器211和处理器213,类型2的处理器可以包括处理器212和处理器214。
每个处理模块中包括的多个处理器也可以属于一条流水线。例如,处理器2211和处理器2212属于同一流水线,处理器2213、处理器2214和处理器2215属于同一流水线。为了便于描述,可以将处理模块中的流水线称为第二流水线。处理模块中的处理器可以称为类型3的处理器。换句话说,图2所示的处理器2211、处理器2212、处理器2213、处理器2214和处理器2215都可以称为类型3的处理器。
每个类型1的处理器和每个处理器都有一个对应的存储器。处理器可以读取对应的存储器中保存的数据。处理器也可以将数据写入对应的存储器中。在图2中,与处理器211对应的存储器为存储器231,与处理器2211对应的存储器为存储器232,与处理器2212对应的存储器为存储器233,与处理器213对应的存储器为存储器234,与处理器2213对应的存储器为存储器235,与处理器2214对应的存储器为存储器236,与处理器2215对应的存储器为存储器237。例如,处理器211可以读取存储器231中保存的数据,和/或,将数据写入存储器231。又如,处理器2213可以读取存储器235中保存的数据,和/或,将数据写入存储器235。
通过总线与处理器连接的处理模块可以称为对应于该处理器的处理模块。例如,处理 模块121是对应于处理器112的处理模块。
根据连接对象的不同,芯片200中的总线可以包括类型1总线、类型2总线、类型3总线、类型4总线、类型5总线和类型6总线。类型1总线用于连接类型2的处理器和与该类型2的处理器对应的处理器模块。例如,用于连接处理器212和处理模块221的总线2411和用于连接处理器214和处理模块222的总线2412都属于类型1总线。类型2总线用于连接两个类型3的处理器。例如,用于连接处理器2221和处理器2212的总线2421,用于连接处理器2213和处理器2214的总线2422,用于连接处理器2214和处理器2215的总线2423都属于类型2总线。类型3总线用于连接处理模块中的处理器和输入输出接口。例如,包括总线2431、总线2432、总线2433、总线2434和总线2435都属于类型3总线。总线2431是用于连接处理器2211和输入输出接口201的类型3总线。总线2432是用于连接处理器2212和输入输出接口201的类型3总线。总线2433是用于连接处理器2213和输入输出接口201的类型3总线。总线2434是用于连接处理器2214和输入输出接口201的类型3总线。总线2435是用于连接处理器2215和输入输出接口201的类型3总线。类型4总线用于连接两个第一流水线中的处理器。例如,用于连接处理器211和处理器212的总线2441,用于连接处理器212和处理器213的总线2442,用于连接处理器213和处理器214的总线2443都属于类型4总线。类型6总线用于连接第一处理器和输入输出接口。例如,总线2461和总线2462都属于类型6总线。
除了芯片200内的总线以外,芯片200还通过总线与存储器相连。总线2471至总线2477是用于连接芯片200和存储器的总线,这种总线可以称为类型7总线。类型1的处理器可以通过对应总线和输入输出接口访问对应的存储器。例如,处理器211可以通过总线2461、输入输出接口201和总线2471访问存储器231。又如,处理器213可以通过总线2462、输入输出接口201和总线2474访问存储器234。类型3的处理器可以通过对应的总线和输入输出接口访问对应的存储器。例如,处理器2211可以通过总线2431、输入输出接口201和总线2472访问存储器232。又如,处理器2215可以通过总线2435、输入输出接口201和总线2477访问存储器237。
与图1所示的芯片类似,在一些实施例中,图2所示的处理器可以都是多核处理器。在另一些实施例中,类型1和类型2的处理器可以是多核处理器,类型3的处理器可以是单核处理器。类型3的处理器的结构可以比类型1的处理器的结构简单。例如,类型3的处理器包括的处理器核的数目可以比类型1处理器包括的处理器核的数目少。又如,类型3的处理器包括的晶体管数目可以少于类型1处理器包括的晶体管数目。关于类型1的处理器、类型2的处理器和类型3处理器的具体情况可以参考如图1所示的芯片100中的描述,为了简洁,在此就不再赘述。
在一些实施例中,类型1总线的长度大于类型3总线的长度。例如,类型1总线的长度可以等于类型3总线长度的1/5、1/8或1/10等。又如,类型3总线的长度可以小于类型1总线的长度的1/10、1/15或1/20等。在一些实施例中,类型1总线的长度和类型3总线的长度之和等于类型6总线的长度。在一些实施例中,任意两个类型1总线的长度可以是相同的。在一些实施例中,任意两个类型2总线的长度可以是相同的。在一些实施例中,任意两个类型3总线的长度可以是相同的。在一些实施例中,任意两个类型4总线的长度可以是相同的。在一些实施例中,任意两个类型6总线的长度可以是相同的。
在一些实施例中,对应于同一处理模块中的存储器与芯片的总线宽度之和大于一个第一总线的宽度。例如,总线2431的宽度和总线2432的宽度大于总线2411的宽度。又如,总线2433的宽度,总线2434的宽度和总线2435的宽度之和大于总线2412的宽度。在一些实施例中,类型2总线的宽度可以小于类型4总线的宽度。如图1所示的实施例中,每个有对应存储器的处理器(类型3的处理器和类型1的处理器)与对应的存储器都位于芯片内部。
如图2所示的实施例中,与处理器对应的存储器位于芯片外部。在另一些实施例中,部分与处理器对应的存储器可以位于芯片内部,另一部分与处理器器对应的存储器可以位于芯片外部。该实施例可以认为是如图1所示的实施例和如图2所示的实施例相结合。可以看出,如图1或图2所示的芯片中都包括两个如图3所示的结构。图3示出了一种混合处理器电路的示意图。如图3所示,处理器301与处理模块310通过总线331连接。处理模块310中包括3个处理器,分别为处理器311,处理器312和处理器313。处理模块310中的处理器通过总线332相连。处理器301为第一处理器,处理器311、处理器312和处理器313为类型3的处理器。更具体地,处理器311为类型2的处理器。处理模块310中的每个处理器有一个对应的存储器。处理器311对应的存储器为存储器321,处理器312对应的存储器为存储器322,处理器313对应的存储器为存储器323。处理模块310中的每个处理器与对应的存储器通过总线连接。处理器311通过总线333与存储器321连接,处理器312通过总线334与存储器322连接,处理器313通过总线335与存储器323连接。处理器310、处理器311,处理器312和处理器313位于同一个芯片内。处理模块310中的每个处理器与对应的存储器可以与处理模块310位于同一个芯片内,也可以位于处理模块310所在的芯片外。如果存储器位于处理模块310所在的芯片外,那么用于连接处理模块内的处理器和对应的存储器的总线可以包括处理器到该芯片的输入输出接口的总线以及该芯片到对应的存储器的总线。例如,总线333可以包括处理器311到芯片的输入输出接口的总线以及该芯片的输入输出接口到存储器321的总线。
为了便于描述,可以将如图3这种结构成为混合处理器电路或者混合处理器结构。如图3所示的混合处理器结构中的处理模块中包括三个处理器。在另一些实施例中,处理模块中处理器的数目可以是大于或等于1的正整数,例如,可以为1、2、4,或5等。如上所述,如果处理模块中的包括的处理器的数目为1,那么该处理器可以是一个多核处理器。如果处理模块中包括至少两个处理器,那么该至少两个处理器中可以包括一个或多个单核处理器。
为了便于描述,假设总线333的长度、总线334的长度以及总线335的长度相同,以字母L表示总线333的长度,以字母R表示总线331的长度。如上述实施例所述,在一些实施例中,L小于R。在另一些实施例中,L可以远远小于R。例如,L可以等于R的十分之一,或者,L小于R的十分之一。假设以字母A表示总线333的宽度,以字母B表示总线334的宽度,以字母C表示总线335的宽度,以字母D表示总线331的宽度。那么,A、B、C和D满足以下关系:D<A+B+C。这样,如图3所示的混合处理器结构的数据传递代价可以如公式3.1所示:
Cost_TX=L×(A+B+C)+R×D,公式3.1
其中,Cost_TX表示数据传递代价,L表示总线333的长度(总线333的长度、总线 334的长度和总线335的长度相等),R表示总线331的长度,A表示总线333的宽度,B表示总线334的宽度,C表示总线335的宽度,D表示总线331的宽度。
采用流水线结构的芯片(例如如图1所示的芯片100或如图2所示的芯片200)在接收到报文后,可以为这个报文生成一份程序状态(program state,PS)。该PS用来保存这个报文转发过程中的上下文信息。该PS以此经过第一流水线中的各个处理器,由第一流水线中的处理器负责处理。为了便于描述,可以将第一流水线中的处理器处理的PS称为第一PS。假设用PS_Full_Size表示经过第一PS的大小。处理模块310在对该报文处理的过程中也会生成一个PS,该PS依次经过第二流水线中的各个处理器,由第二流水线中的处理器负责处理。为了便于描述,可以将第二流水线中的处理器处理的PS称为第二PS。假设用PS_Little_Size表示第二PS的大小。第一PS保存有报文转发过程中的上下文信息。第二PS只保存第二流水线中处理的信息。因此,第一PS的大小会大于第二PS的大小(即PS_Full_Size>PS_Little_Size)。在一些实施例中,第二PS的大小可以等于或者小于第一PS的大小的1/5、1/8、1/10、1/15或1/20等。
由于与第一PS相比,第二PS大小更小,所以可以使用更为简单的处理器来处理第二PS。因此,处理模块内部的处理器(即类型3的处理器)的结构可以比第一流水线中的处理器的结构简单。换句话说,类型3处理器包括的处理器核的数目可以少于类型1和/或类型2的第一处理器包括的处理器核的数目,和/或,类型3的处理器包括的晶体管的数目可以少于类型1和/或类型2的处理器包括的晶体管的数目。第一PS的大小与第二PS的大小的差距越大,类型3的处理器的结构可以越简单。
在一些实施例中,类型3的处理器包括的处理器核的数目可以少于类型1的处理器包括的处理器核的数目,和/或,类型3的处理器包括的晶体管的数目可以少于类型1的处理器包括的晶体管的数目。在另一些实施例中,类型3的处理器包括的处理器核的数目可以少于类型2的处理器包括的处理器核的数目,和/或,类型3的处理器包括的晶体管的数目可以少于类型2的6处理器包括的晶体管的数目。
以处理器核的数目为例,可以用N_Little表示类型3的处理器包括的处理器核的数目,用N_Big2表示类型2的处理器包括的处理器核的数目,以N_Big1表示类型1的处理器核的数目。
这样,如图3所示的混合处理器结构的处理器代价可以如公式3.2所示:
Cost_Proc=PS_Little_Size×N_Little+PS_Full_Size×N_Big2,公式3.2
其中,Cost_Proc表示处理器代价,PS_Little_Size、N_Little、PS_Full_Size和N_Big2的含义如上所述,为了简洁,在此不再重复。
在一些实施例中,N_Little、N_Big2和N_Big1可以满足以下关系:N_Big1=N_Little+N_Big2。
如果用Latency_L表示长度为L的总线的输入/输出(Input/Output,I/O)时延,用Lattency_R表示长度为R的总线的I/O时延,那么采用如图3所示的混合处理器结构的时延代价可以如公式3.3所示:
Cost_LAT=Lattency_L×3+Lattency_R×1,公式3.3
其中,Cost_LAT表示时延代价,Latency_L表示长度为L的总线的I/O时延,Lattency_R表示长度为R的总线的I/O时延。
如果都以一条流水线中的处理器实现如图3所示的混合处理结构实现的功能,那么需要如图4所示的结构。
图4是另一种包含处理器电路的示意图。如图4所示的电路中包括三个处理器,分别为处理器401,处理器402和处理器403。此外,处理器401、处理器402和处理器403为类型1的处理器(处理器401至处理器403都是同一条流水线中的处理器且处理器401至处理器403通过总线连接的是存储器而非处理模块)。这三个处理器中的每个处理器有一个对应的存储器。处理器401对应的存储器为存储器411,处理器402对应的存储器为存储器412,处理器403对应的存储器为存储器413。处理器401通过总线421与存储器411相连,处理器402通过总线422与存储器412相连,处理器402通过总线423与存储器413相连。处理器401通过总线424与处理器402相连,处理器402通过总线424与处理器403相连。
总线421的长度、总线422的长度和总线423的长度相同。总线421的长度可以等于L+R,即如图3所示的总线333的长度和总线331的长度之和。总线421的宽度可以等于总线333的宽度,总线422的宽度可以等于总线334的宽度,总线423的宽度可以等于总线335的宽度。在此情况下,采用如图4所示的结构的数据传递代价可以如公式4.1所示:
Cost_TX=(L+R)×(A+B+C),公式4.1
其中,Cost_TX表示数据传递代价,L+R为总线421的长度(总线422的长度等于总线421的长度,总线423的长度等于总线421的长度),A表示总线421的宽度,B表示总线422的宽度,C表示总线423的宽度。
比较公式4.1与公式3.l可以发现,在L小于R且D小于A+B+C的情况下,采用如图3所示的结构的数据传递代价小于采用如图4所示结构的数据传递代价。
在一些实施例中,如果R与L的差越大,图3所示的结构的数据传递代价越小。
如上所述,由于处理器401至处理器403都是类型1的处理器,所以经过处理器401至处理器403的PS为PS_Full,相应的,PS_Full的大小为PS_Full_Size,类型1的处理器包括的处理器核的数目为N_Big1。那么,采用如图4所示结构的处理器代价可以如公式4.2所示:
Cost_Proc=PS_Full_Size×N_Big1,公式4.2
其中,Cost_Proc表示处理器代价,PS_Full_Size为经过处理器401的PS的大小,N_Big1为处理器401包括的处理器核数目。
如果N_Big1=N_Little+N_Big2,那么与如图4所示的结构相比,采用如图3所示的结构的处理器代价可以节省(PS_Full_Size-PS_Little_Size)×N_Little。如果PS_Full_Size与PS_Little_Size的差越大,节省的处理器代价越大(即处理器代价越小)。如果N_Big1与N_Little1的差越大,节省的处理器代价越大(即处理器代价越小)。
如果用Latency_L表示长度为L的总线的I/O时延,用Lattency_R表示长度为R的总线的I/O时延,那么采用如图4结构的时延代价可以如公式4.3所示:
Cost_LAT=(Lattency_L+Lattency_R)×3,公式4.3
其中,Cost_LAT表示时延代价,Latency_L表示长度为L的总线的I/O时延,Lattency_R表示长度为R的总线的I/O时延。
可以看出,与如图4所示的结构相比,如图3所示的结构能够节省Lattency_R×2的 I/O时延。R与L的差越大,节省的I/O时延越多。
综上所述,本申请实施例提供的技术方案可以使用更少的代价(数据传递代价、处理器代价和时延代价)实现相应的功能。此外,由于处理模块内部需要的总线长度较短以及类型2的第一处理器和处理模块之间的总线宽度较小,因此,与实现同样功能的芯片相比,利用本申请技术方案的芯片的面积更小。
下面以等价多路径路由(equal-cost multi-path routing,ECMP)为例,对图3和图4的两种结构进行介绍。
ECMP确定下一跳端口的基本过程如下:根据报文的流标识信息(例如五元组或者流标签(flow label))确定出一个哈希值,然后根据ECMP路由表和该哈希值确定一个表项,该表项中包括的端口即为用于发送该报文的下一跳端口。
在一些情况下,为了减少ECMP路由表保存的条目以及提高查找效率,ECMP路由表可以被划分为多个表,例如可以划分为三个表,分别称为选路入口表1、选路表入口2和选路入口表3。首先,根据报文的流标识信息从选路入口表1中确定与该流标识信息对应的表项,该表项包括一个基地址和一个选路入口表的索引。然后,根据该选路入口表的索引确定选路入口表2,从选路入口表2中查询与该基地址和根据该报文的流标识信息确定的哈希值对应的表项,该表项包括一个端口索引和一个选路入口表的索引。最后,根据该选路入口表的索引确定选路入口表3,从选路入口表3中查询与该端口索引对应的表项,该表项包括该报文的下一跳端口。
图5示出了采用如图4所示的电路实现确定下一跳端口的示意性流程图。
501,处理器401从接收到的PS中获取一个入选路入口表的索引(以下称为选路表索引1)。
502,处理器401将该选路表索引1发送至存储器411。
503,处理器401接收来自于存储器411的与该选路表索引1对应的选路入口表1。
504,处理器401确定选路入口表1中对应于该报文的流标识信息的表项。该表项包括一个选路入口表的索引(以下称为选路表索引2)和一个基地址,并将该选路表索引2和该基地址写入到PS中。
505,处理器401将该PS(即写入了选路表索引2和该基地址之后的PS)发送至处理器402。
506,处理器402从接收到的PS中获取该选路表索引2、该基地址和一个哈希值。该哈希值是根据该报文的流标识信息确定的。该哈希值可以是由处理器401的上游节点确定并写入到PS中的。
507,处理器402将该选路表索引2发送至存储器412。
508,处理器402接收来自于存储器412的与该选路表索引2对应的选路入口表2。
509,处理器402从选路入口表2中查询与该基地址和该哈希值对应的表项,该表项包括一个端口索引和一个选路入口表的索引(以下称为选路表索引3),并将该端口索引和该选路表索引3写入到PS中。
510,处理器402将该PS(即写入了该端口索引和该选路表索引3之后的PS)发送至处理器403。
511,处理器403从接收到的PS中获取选路表索引3和该端口索引。
512,处理器403将该选路表索引3发送至存储器413。
513,处理器403接收来自于存储器412的与该选路表索引3对应的选路入口表3。
514,处理器403从选路入口表3中查询与该端口索引对应的表项,该表项包括的内容为该报文的下一跳端口。
515,处理器403将该报文的下一跳端口写入到PS,并将该PS发送至流水线中的下一个节点,由下一个节点继续处理该报文。
图6是采用如图3所示的电路实现确定下一跳端口的示意性流程图。
601,处理器301从接收到的PS中获取一个入选路入口表的索引(以下称为选路表索引1)、报文的流标识信息和根据该报文的流标识信息确定的哈希值。
602,处理器301将该选路表索引1、报文的流标识信息和根据该报文的流标识信息确定的哈希值发送至处理器311。
603,处理器311向存储器321发送该选路表索引1。
604,处理器311接收来自于存储器321的与该选路表索引1对应的选路入口表1。
605,处理器311确定选路入口表1中对应于该报文的流标识信息的表项。该表项包括一个选路入口表的索引(以下称为选路表索引2)和一个基地址,并将该选路表索引2和该基地址写入到PS中。该PS中还可以包括根据该报文的流标识信息确定的哈希值。
606,处理器311将该PS(即写入了选路表索引2和该基地址之后的PS)发送至处理器312。
607,处理器312从接收到的PS中获取该选路表索引2、该基地址和哈希值。
608,处理器312将该选路表索引2发送至存储器322。
609,处理器312接收来自于存储器322的与该选路表索引2对应的选路入口表2。
610,处理器312从选路入口表2中查询与该基地址和该哈希值对应的表项,该表项包括一个端口索引和一个选路入口表的索引(以下称为选路表索引3),并将该端口索引和该选路表索引3写入到PS中。
611,处理器312将该PS(即写入了该端口索引和该选路表索引3之后的PS)发送至处理器313。
612,处理器313从接收到的PS中获取选路表索引3和该端口索引。
613,处理器313将该选路表索引3发送至存储器323。
614,处理器313接收来自于存储器323的与该选路表索引3对应的选路入口表3。
615,处理器313从选路入口表3中查询与该端口索引对应的表项,该表项包括的内容为该报文的下一跳端口。
616,处理器313将该报文的下一跳端口发送至处理器301。
617,处理器301将该报文的下一跳端口写入PS,并将该PS发送至流水线中的下一个节点,由下一个节点继续处理该报文。
在图5所示的流程中,处理器401至处理器403与芯片中的其他处理器都属于同一个流水线上。该流水线中的处理器处理的PS是用来保存报文转发过程中的上下文信息。因此,处理器401向处理器402发送的PS,处理器402向处理器403发送的PS中除了查询下一跳端口所需的信息外,还需要包括后续节点需要的信息。所以,该PS的大小会比较大,例如,该PS的大小可能为512字节。相应的,处理器之间的总线的宽度也较大。
但是在图6所示的流程中,处理器311至处理器313只关心选路功能,传递的PS中只需要包括选路所需的信息。因此,设置较小的PS就可以满足处理器311至处理器313的需求。例如,64字节的PS就可以满足选路需求。相应的,处理器之间的总线宽度可以设置的较小。此外,处理器312和处理器313所需的信息都来自于前一个节点,不需要再从处理器301获取信息。另外,对于处理器301而言,处理器301只关心确定的下一跳端口是什么,处理器301可以不需要获取用于确定下一跳端口的选路入口表3,并且处理器301也无需将与确定下一跳端口无关的信息发送给处理模块310。因此,处理器301和处理模块310直接的总线宽度可以设置的较小。例如,总线331的宽度可以为128比特(bit)。相比之下,处理器和存储器之间的总线由于需要传输较多的信息(例如选路入口表),则需要较大的宽度。例如,总线333至总线335的宽度可能为256比特,总线421至总线423的宽度可能为256比特。
本申请实施例还提供一种电路。该电路包括第一处理器和与该第一处理器相连的第一处理模块,该第一处理模块包括与第一存储器相连的第二处理器,该第二处理器对该第一存储器执行读写操作产生的传输时延小于该第一处理器与该第一处理模块间通信产生的传输时延。
例如,假设如图1所示的处理器模块121中只包括处理器1221和存储器1221。那么该第一处理器可以相当于处理器112,第一处理模块可以相当于处理模块121,第二处理器可以相当于处理器1211,第一存储器可以相当于存储器1221。处理器1211对存储器1221执行读写操作产生的传输时延小于处理器112和处理模块1间通信产生的传输时延。
又如,假设如图2所示的处理模块221中只包含处理器2211。那么该第一处理器可以相当于是处理器212,第一处理模块可以相当于是处理模块221,第二处理器可以相当于是处理器2211,第一存储器可以相当于是存储器323。
可选的,在一些实施例中,该第二处理器为多核处理器,该第二处理器对该第一存储器执行读写操作产生的传输时延为该第二处理器包括的多核处理器中的任一核处理器对该第一存储器执行读写操作产生的传输时延。
可选的,在一些实施例中,该第一处理器通过第一总线与该第一处理模块相连,该第二处理器通过第二总线与该第一存储器相连,其中,该第二总线的总线位宽大于该第一总线的总线位宽,和/或,该第二总线的长度小于该第一总线的长度。
例如,依然假设如图1所示的处理模块121中只包括处理器1221和存储器1221,第一总线相当于用于连接处理器112和处理模块121的总线11,第二总线相当于用于连接处理器1211和存储器1221的总线31。
又如,依然加深如图2所示的处理模块221中只包括处理器221。第一总线相当于用于连接处理器212和处理模块221的总线2411,第二总线可以相当于用于连接处理器2211和存储器232的总线,包括总线2431和总线2472。第二总线也可以相当于用于连接处理器2211和输入输出接口201的总线2431。
可选的,在一些实施例中,该第一处理模块还包括与第二存储器相连的第三处理器,该第三处理器对该第二存储器执行读写操作产生的传输时延小于该第一处理器与该第一处理模块通信产生的传输时延。
以图1为例,该第一处理器可以相当于处理器112,第一处理模块可以相当于处理模 块121,第二处理器可以相当于处理器1211,第三处理器可以相当于处理器1212,第一存储器可以相当于存储器1221,第二存储器可以相当于存储器1222。处理器1211对存储器1221执行读写操作产生的传输时延小于处理器112和处理模块121间通信产生的传输时延,处理器1212对存储器1222执行读写操作产生的传输时延小于处理器1122和处理模块121通信产生的传输时延。
以图2为例,第一处理器可以相当于处理器212,第一处理模块可以相当于处理模块221,第二处理器可以相当于处理器2211,第三处理器可以相当于处理器2212,第一存储器可以相当于存储器232,第二存储器可以相当于存储器233。
可选的,在一些实施例中,第一处理器通过第一总线与该第一处理模块相连,该第二处理器通过第二总线与该第一存储器相连,该第三处理器通过第三总线与该第二存储器相连,该第二总线的总线位宽与该第三总线的总线宽度之和大于该第一总线的总线位宽。
还以图1为例,第一总线可以相当于总线11,第二总线可以相当于总线31,第三总线可以相当于总线32。
以图2为例,第一总线可以相当于总线2411,第二总线可以相当于总线2431和总线2472,第三总线可以相当于总线2432和总线2473。第二总线也可以相当于总线2431,第三总线也可以相当于总线2432。
可选的,在一些实施例中,该第一处理模块还包括与该第一存储器相连的第三处理器,该第三处理器与该第一存储器执行读写操作产生的传输时延小于该第一处理器与该第一处理模块间通信产生的传输时延。
可选的,在一些实施例中,第一处理器通过第一总线与该第一处理模块相连,该第二处理器通过第二总线与该第一存储器相连,该第三处理器通过第三总线与该第一存储器相连,该第二总线的总线位宽与该第三总线的总线宽度之和大于该第一总线的总线位宽。
可选的,在一些实施例中,该第二处理器和该第三处理器属于流水线pipeline处理器。
可选的,在一些实施例中,该电路还包括第四处理器和与该第四处理器相连的第三存储器。
还以图1为例,处理器111可以相当于是该第四处理器,存储器113可以相当于是该第三存储器。
以图2为例,处理器211可以相当于是该第四处理器,存储器231可以相当于是该第三存储器。
可选的,在一些实施例中,该电路还包括第四处理器和与该第四处理器相连的第二处理模块,该第二处理模块包括与M个存储器相连的N个第五处理器,该N和M均为大于或等于1的整数,任一第五处理器对与其相连的存储器执行读写操作产生的传输时延小于该第四处理器与该第二处理模块通信产生的传输时延。
还以图1为例,处理器114可以相当于是该第四处理器,处理模块122可以相当于该第二处理模块。
以图2为例,处理器214可以相当于是该第四处理器,处理模块222可以相当于是该第二处理模块。
可选的,在一些实施例中,该第二处理器通过第四总线与该第三处理器相连,该第四处理器通过第五总线与该第一处理器相连,该第四总线的总线位宽小于该第五总线的总线 位宽。
还以图1为例,总线21可以相当于该第四总线,总线41可以相当于是该第五总线。
以图2为例,总线2421可以相当于是该第四总线,总线2441可以相当于是该第五总线。
可选的,在一些实施例中,该第四处理器包括的处理器核数大于或等于该第一处理器包括的处理器核数。
可选的,在一些实施例中,该第四处理器和该第一处理器属于pipeline处理器。
可选的,在一些实施例中,该第一处理模块还包括该第一存储器。
本申请实施例还提供一种电子设备,该电子设备包括本申请实施例提供的芯片,该电子设备还包括接收器和发送器。该接收器,用于接收报文并将报文发送至该芯片。该芯片,用于处理该报文。该发送器,用于获取该芯片处理后的报文,并将该处理后的报文发送至另一电子设备。该电子设备可以是交换机、路由器或者其他任何能够设置有上述芯片的电子设备。
本申请实施例中的所称的芯片可以是系统芯片(system on chip,SoC),还可以是网络处理器(network processor,NP)等。
本申请实施例中的存储器可以是易失性存储器或非易失性存储器,或可包括易失性和非易失性存储器两者。其中,非易失性存储器可以是只读存储器(read-only memory,ROM)、可编程只读存储器(programmable ROM,PROM)、可擦除可编程只读存储器(erasable PROM,EPROM)、电可擦除可编程只读存储器(electrically EPROM,EEPROM)或闪存。易失性存储器可以是随机存取存储器(random access memory,RAM),其用作外部高速缓存。通过示例性但不是限制性说明,许多形式的RAM可用,例如静态随机存取存储器(static RAM,SRAM)、动态随机存取存储器(dynamic RAM,DRAM)、同步动态随机存取存储器(synchronous DRAM,SDRAM)、双倍数据速率同步动态随机存取存储器(double data rate SDRAM,DDR SDRAM)、增强型同步动态随机存取存储器(enhanced SDRAM,ESDRAM)、同步连接动态随机存取存储器(synchlink DRAM,SLDRAM)和直接内存总线随机存取存储器(direct rambus RAM,DR RAM)。应注意,本文描述的系统和方法的存储器旨在包括但不限于这些和任意其它适合类型的存储器。
应注意,本申请实施例中的处理器可以是一种集成电路芯片,具有信号的处理能力。在实现过程中,上述方法实施例的各步骤可以通过处理器中的硬件的集成逻辑电路或者软件形式的指令完成。该处理器可以是微处理器或者该处理器也可以是任何常规的处理器等。
在实现过程中,上述方法的各步骤可以通过处理器中的硬件的集成逻辑电路或者软件形式的指令完成。结合本申请实施例所公开的方法的步骤可以直接体现为硬件处理器执行完成,或者用处理器中的硬件及软件模块组合执行完成。软件模块可以位于随机存储器,闪存、只读存储器,可编程只读存储器或者电可擦写可编程存储器、寄存器等本领域成熟的存储介质中。该存储介质位于存储器,处理器读取存储器中的信息,结合其硬件完成上述方法的步骤。为避免重复,这里不再详细描述。
本领域普通技术人员可以意识到,结合本文中所公开的实施例描述的各示例的单元及算法步骤,能够以电子硬件、或者计算机软件和电子硬件的结合来实现。这些功能究竟以 硬件还是软件方式来执行,取决于技术方案的特定应用和设计约束条件。专业技术人员可以对每个特定的应用来使用不同方法来实现所描述的功能,但是这种实现不应认为超出本申请的范围。
所属领域的技术人员可以清楚地了解到,为描述的方便和简洁,上述描述的系统、装置和单元的具体工作过程,可以参考前述方法实施例中的对应过程,在此不再赘述。
在本申请所提供的几个实施例中,应该理解到,所揭露的系统、装置和方法,可以通过其它的方式实现。例如,以上所描述的装置实施例仅仅是示意性的,例如,所述单元的划分,仅仅为一种逻辑功能划分,实际实现时可以有另外的划分方式,例如多个单元或组件可以结合或者可以集成到另一个系统,或一些特征可以忽略,或不执行。另一点,所显示或讨论的相互之间的耦合或直接耦合或通信连接可以是通过一些接口,装置或单元的间接耦合或通信连接,可以是电性,机械或其它的形式。
所述作为分离部件说明的单元可以是或者也可以不是物理上分开的,作为单元显示的部件可以是或者也可以不是物理单元,即可以位于一个地方,或者也可以分布到多个网络单元上。可以根据实际的需要选择其中的部分或者全部单元来实现本实施例方案的目的。
另外,在本申请各个实施例中的各功能单元可以集成在一个处理单元中,也可以是各个单元单独物理存在,也可以两个或两个以上单元集成在一个单元中。
所述功能如果以软件功能单元的形式实现并作为独立的产品销售或使用时,可以存储在一个计算机可读取存储介质中。基于这样的理解,本申请的技术方案本质上或者说对现有技术做出贡献的部分或者该技术方案的部分可以以软件产品的形式体现出来,该计算机软件产品存储在一个存储介质中,包括若干指令用以使得一台计算机设备(可以是个人计算机,服务器,或者网络设备等)执行本申请各个实施例所述方法的全部或部分步骤。而前述的存储介质包括:U盘、移动硬盘、只读存储器(Read-Only Memory,ROM)、随机存取存储器(Random Access Memory,RAM)、磁碟或者光盘等各种可以存储程序代码的介质。
以上所述,仅为本申请的具体实施方式,但本申请的保护范围并不局限于此,任何熟悉本技术领域的技术人员在本申请揭露的技术范围内,可轻易想到变化或替换,都应涵盖在本申请的保护范围之内。因此,本申请的保护范围应以所述权利要求的保护范围为准。

Claims (16)

  1. 一种电路,其特征在于,所述电路包括第一处理器和与所述第一处理器相连的第一处理模块,所述第一处理模块包括与第一存储器相连的第二处理器,所述第二处理器对所述第一存储器执行读写操作产生的传输时延小于所述第一处理器与所述第一处理模块间通信产生的传输时延。
  2. 根据权利要求1所述的电路,其特征在于,所述第二处理器为多核处理器,所述第二处理器对所述第一存储器执行读写操作产生的传输时延为所述第二处理器包括的多核处理器中的任一核处理器对所述第一存储器执行读写操作产生的传输时延。
  3. 根据权利要求1或2所述的电路,其特征在于,所述第一处理器通过第一总线与所述第一处理模块相连,所述第二处理器通过第二总线与所述第一存储器相连,其中,所述第二总线的总线位宽大于所述第一总线的总线位宽,和/或,所述第二总线的长度小于所述第一总线的长度。
  4. 根据权利要求1或2所述的电路,其特征在于,所述第一处理模块还包括与第二存储器相连的第三处理器,所述第三处理器对所述第二存储器执行读写操作产生的传输时延小于所述第一处理器与所述第一处理模块通信产生的传输时延。
  5. 根据权利要求4所述的电路,其特征在于,所述第一处理器通过第一总线与所述第一处理模块相连,所述第二处理器通过第二总线与所述第一存储器相连,所述第三处理器通过第三总线与所述第二存储器相连,所述第二总线的总线位宽与所述第三总线的总线宽度之和大于所述第一总线的总线位宽。
  6. 根据权利要求1或2所述的电路,其特征在于,所述第一处理模块还包括与所述第一存储器相连的第三处理器,所述第三处理器与所述第一存储器执行读写操作产生的传输时延小于所述第一处理器与所述第一处理模块间通信产生的传输时延。
  7. 根据权利要求6所述的电路,其特征在于,所述第一处理器通过第一总线与所述第一处理模块相连,所述第二处理器通过第二总线与所述第一存储器相连,所述第三处理器通过第三总线与所述第一存储器相连,所述第二总线的总线位宽与所述第三总线的总线宽度之和大于所述第一总线的总线位宽。
  8. 根据权利要求4至7任一所述的电路,其特征在于,所述第二处理器和所述第三处理器属于流水线pipeline处理器。
  9. 根据权利要求1至3任一所述的电路,其特征在于,所述电路还包括第四处理器和与所述第四处理器相连的第三存储器;或者
    所述电路还包括第四处理器和与所述第四处理器相连的第二处理模块,所述第二处理模块包括与M个存储器相连的N个第五处理器,所述N和M均为大于或等于1的整数,任一第五处理器对与其相连的存储器执行读写操作产生的传输时延小于所述第四处理器与所述第二处理模块通信产生的传输时延。
  10. 根据权利要求4至8任一所述的电路,其特征在于,所述电路还包括第四处理器和与所述第四处理器相连的第三存储器;或者
    所述电路还包括第四处理器和与所述第四处理器相连的第二处理模块,所述第二处理 模块包括与M个存储器相连的N个第五处理器,所述N和M均为大于或等于1的整数,任一第五处理器对与其相连的存储器执行读写操作产生的传输时延小于所述第四处理器与所述第二处理模块通信产生的传输时延。
  11. 根据权利要求10所述的电路,其特征在于,所述第二处理器通过第四总线与所述第三处理器相连,所述第四处理器通过第五总线与所述第一处理器相连,所述第四总线的总线位宽小于所述第五总线的总线位宽。
  12. 根据权利要求9至11任一所述的电路,其特征在于,所述第四处理器包括的处理器核数大于或等于所述第一处理器包括的处理器核数。
  13. 根据权利要求9至12任一所述的电路,其特征在于,所述第四处理器和所述第一处理器属于pipeline处理器。
  14. 根据权利要求1至13任一所述的电路,其特征在于,所述第一处理模块还包括所述第一存储器。
  15. 一种芯片,其特征在于,所述芯片包括如权利要求1至14中任一项所述的电路。
  16. 一种电子设备,其特征在于,所述电子设备包括如权利要求15所述的芯片,所述电子设备还包括接收器和发送器,所述接收器,用于接收报文并将报文发送至所述芯片;
    所述芯片,用于处理所述报文;
    所述发送器,用于获取所述芯片处理后的报文,并将所述处理后的报文发送至另一电子设备。
PCT/CN2021/115618 2020-09-30 2021-08-31 电路、芯片和电子设备 WO2022068503A1 (zh)

Priority Applications (6)

Application Number Priority Date Filing Date Title
EP21874164.3A EP4209886A4 (en) 2020-09-30 2021-08-31 CIRCUIT, CHIP AND ELECTRONIC DEVICE
JP2023519636A JP2023543466A (ja) 2020-09-30 2021-08-31 回路、チップ、および電子デバイス
KR1020237014009A KR20230073317A (ko) 2020-09-30 2021-08-31 회로, 칩 및 전자 디바이스
CA3194399A CA3194399A1 (en) 2020-09-30 2021-08-31 Circuit, chip, and electronic device
MX2023003629A MX2023003629A (es) 2020-09-30 2021-08-31 Circuito, chip y dispositivo electronico.
US18/192,293 US20230236727A1 (en) 2020-09-30 2023-03-29 Circuit, Chip, and Electronic Device

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
CN202011060780 2020-09-30
CN202011060780.0 2020-09-30
CN202011176149.7A CN114327247A (zh) 2020-09-30 2020-10-28 电路、芯片和电子设备
CN202011176149.7 2020-10-28

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US18/192,293 Continuation US20230236727A1 (en) 2020-09-30 2023-03-29 Circuit, Chip, and Electronic Device

Publications (1)

Publication Number Publication Date
WO2022068503A1 true WO2022068503A1 (zh) 2022-04-07

Family

ID=80949607

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/115618 WO2022068503A1 (zh) 2020-09-30 2021-08-31 电路、芯片和电子设备

Country Status (7)

Country Link
US (1) US20230236727A1 (zh)
EP (1) EP4209886A4 (zh)
JP (1) JP2023543466A (zh)
KR (1) KR20230073317A (zh)
CA (1) CA3194399A1 (zh)
MX (1) MX2023003629A (zh)
WO (1) WO2022068503A1 (zh)

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20040215926A1 (en) * 2003-04-28 2004-10-28 International Business Machines Corp. Data processing system having novel interconnect for supporting both technical and commercial workloads
CN106462498A (zh) * 2014-06-23 2017-02-22 利奇德股份有限公司 用于数据存储系统的模块化交换架构
US9588898B1 (en) * 2015-06-02 2017-03-07 Western Digital Technologies, Inc. Fullness control for media-based cache operating in a steady state
CN107291392A (zh) * 2017-06-21 2017-10-24 郑州云海信息技术有限公司 一种固态硬盘及其读写方法
CN108139971A (zh) * 2016-09-29 2018-06-08 华为技术有限公司 一种可扩展内存的芯片
WO2018188084A1 (zh) * 2017-04-14 2018-10-18 华为技术有限公司 一种数据访问方法及装置
CN108920111A (zh) * 2018-07-27 2018-11-30 中国联合网络通信集团有限公司 数据共享方法及分布式数据共享系统
CN109582215A (zh) * 2017-09-29 2019-04-05 华为技术有限公司 硬盘操作命令的执行方法、硬盘及存储介质

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH08212185A (ja) * 1995-01-31 1996-08-20 Mitsubishi Electric Corp マイクロコンピュータ
CN1979461A (zh) * 2005-11-29 2007-06-13 泰安电脑科技(上海)有限公司 多处理器模块
US9432298B1 (en) * 2011-12-09 2016-08-30 P4tents1, LLC System, method, and computer program product for improving memory systems


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See also references of EP4209886A4

Also Published As

Publication number Publication date
MX2023003629A (es) 2023-06-23
EP4209886A4 (en) 2024-02-14
CA3194399A1 (en) 2022-04-07
JP2023543466A (ja) 2023-10-16
EP4209886A1 (en) 2023-07-12
KR20230073317A (ko) 2023-05-25
US20230236727A1 (en) 2023-07-27

Similar Documents

Publication Publication Date Title
US9025495B1 (en) Flexible routing engine for a PCI express switch and method of use
TWI406133B (zh) 資料處理設備及資料傳送方法
US8930593B2 (en) Method for setting parameters and determining latency in a chained device system
US6732184B1 (en) Address table overflow management in a network switch
US8477777B2 (en) Bridge apparatus and communication method
US20080069094A1 (en) Urgent packet latency control of network on chip (NOC) apparatus and method of the same
CN101902401A (zh) 一种搜索处理装置及网络系统
US7995567B2 (en) Apparatus and method for network control
CN1953418A (zh) 处理信息分组的方法和使用该方法的电信设备
WO2022068503A1 (zh) 电路、芯片和电子设备
US8645620B2 (en) Apparatus and method for accessing a memory device
WO2024021801A1 (zh) 报文转发装置及方法、通信芯片及网络设备
US8325768B2 (en) Interleaving data packets in a packet-based communication system
US6957309B1 (en) Method and apparatus for re-accessing a FIFO location
CN116032837A (zh) 一种流表卸载方法及装置
WO2022166854A1 (zh) 一种数据查找方法、装置及集成电路
CN114327247A (zh) 电路、芯片和电子设备
US20050044261A1 (en) Method of operating a network switch
US7739423B2 (en) Bulk transfer of information on network device
TWI817914B (zh) 實體層模組與網路模組
JP3189784B2 (ja) レイヤ3マルチキャスト送信方式
CN115134311B (zh) RapidIO端点控制器及端点设备
TWI649985B (zh) 結合快速周邊元件互連匯流排與乙太網路的網路通訊方法、系統及控制器
CN116886605B (zh) 一种流表卸载系统、方法、设备以及存储介质
WO2023088226A1 (zh) 转发报文的方法以及相关设备

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21874164

Country of ref document: EP

Kind code of ref document: A1

ENP Entry into the national phase

Ref document number: 2023519636

Country of ref document: JP

Kind code of ref document: A

ENP Entry into the national phase

Ref document number: 3194399

Country of ref document: CA

ENP Entry into the national phase

Ref document number: 2021874164

Country of ref document: EP

Effective date: 20230404

ENP Entry into the national phase

Ref document number: 20237014009

Country of ref document: KR

Kind code of ref document: A

NENP Non-entry into the national phase

Ref country code: DE