EP4209886A1 - Circuit, chip, and electronic device - Google Patents
- Publication number
- EP4209886A1 (application EP21874164.3A)
- Authority
- EP
- European Patent Office
- Prior art keywords
- processor
- bus
- memory
- processing module
- type
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
- G06F3/0601—Interfaces specially adapted for storage systems
- G06F3/0602—Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
- G06F3/061—Improving I/O performance
- G06F3/0611—Improving I/O performance in relation to response time
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F12/00—Accessing, addressing or allocating within memory systems or architectures
- G06F12/02—Addressing or allocation; Relocation
- G06F12/0223—User address space allocation, e.g. contiguous or non contiguous base addressing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
- G06F3/0601—Interfaces specially adapted for storage systems
- G06F3/0602—Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
- G06F3/061—Improving I/O performance
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F13/00—Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
- G06F13/38—Information transfer, e.g. on bus
- G06F13/40—Bus structure
- G06F13/4004—Coupling between buses
- G06F13/4009—Coupling between buses with data restructuring
- G06F13/4018—Coupling between buses with data restructuring with data-width conversion
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F13/00—Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
- G06F13/38—Information transfer, e.g. on bus
- G06F13/42—Bus transfer protocol, e.g. handshake; Synchronisation
- G06F13/4204—Bus transfer protocol, e.g. handshake; Synchronisation on a parallel bus
- G06F13/4208—Bus transfer protocol, e.g. handshake; Synchronisation on a parallel bus being a system bus, e.g. VME bus, Futurebus, Multibus
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
- G06F3/0601—Interfaces specially adapted for storage systems
- G06F3/0628—Interfaces specially adapted for storage systems making use of a particular technique
- G06F3/0653—Monitoring storage devices or systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
- G06F3/0601—Interfaces specially adapted for storage systems
- G06F3/0668—Interfaces specially adapted for storage systems adopting a particular infrastructure
- G06F3/0671—In-line storage system
- G06F3/0673—Single storage device
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2212/00—Indexing scheme relating to accessing, addressing or allocation within memory systems or architectures
- G06F2212/10—Providing a specific technical effect
- G06F2212/1016—Performance improvement
- G06F2212/1024—Latency reduction
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F30/00—Computer-aided design [CAD]
- G06F30/30—Circuit design
- G06F30/39—Circuit design at the physical level
- G06F30/392—Floor-planning or layout, e.g. partitioning or placement
Definitions
- This application relates to the field of chip technologies, and more specifically, to a circuit, a chip, and an electronic device.
- Processors in a current high-speed network chip are generally disposed in a pipeline manner. After a packet enters the chip, a program state (program state, PS) is generated for the packet to store context information during packet forwarding. A processor on the pipeline processes the packet, saves a processing result in the PS, and then sends the processing result to a next processor.
- However, the connection between the processor and the memory that stores the PS in the chip is improperly designed. Consequently, a high latency is generated when the PS is read and written.
- This application provides a circuit, a chip, and an electronic device, to reduce a transmission latency.
- According to a first aspect, an embodiment of this application provides a circuit.
- The circuit includes a first processor and a first processing module connected to the first processor.
- The first processing module includes a second processor connected to a first memory.
- A transmission latency generated when the second processor performs read and write operations on the first memory is less than a transmission latency generated when the first processor communicates with the first processing module. As a result, the transmission latency cost of data on a bus can be reduced.
- the transmission latency generated when the second processor performs the read and write operations on the first memory is less than or equal to 1/10 of the transmission latency generated when the first processor communicates with the first processing module.
- The second processor may be a multi-core processor. In this case, the transmission latency generated when the second processor performs the read and write operations on the first memory is the transmission latency generated when any core of the multi-core processor performs read and write operations on the first memory.
- The first processor is connected to the first processing module through a first bus, and the second processor is connected to the first memory through a second bus, where a bus bit width of the second bus is greater than a bus bit width of the first bus, and/or a length of the second bus is less than a length of the first bus. Because the length of the second bus is less than the length of the first bus, an area of the circuit can be reduced.
- The length of the second bus may be less than or equal to 1/10 of the length of the first bus. In this way, the area of the circuit can be further reduced.
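The bus relationship above can be expressed as a simple design check. The following is a minimal sketch; the `Bus` type, the function names, and all numeric values are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Bus:
    name: str
    bit_width: int   # bits carried in parallel
    length: float    # arbitrary length unit (hypothetical)


def meets_bus_constraint(first: Bus, second: Bus) -> bool:
    """Second bus is wider and/or shorter than the first bus."""
    return second.bit_width > first.bit_width or second.length < first.length


def meets_length_ratio(first: Bus, second: Bus, ratio: float = 0.1) -> bool:
    """Second bus is at most `ratio` (default 1/10) of the first bus length."""
    return second.length <= ratio * first.length


# Hypothetical values, not taken from the application.
first_bus = Bus("first", bit_width=128, length=10.0)
second_bus = Bus("second", bit_width=256, length=0.5)
```

With these example values both checks hold, because the second bus is wider than the first bus and its length is below 1/10 of the first bus length.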
- the first processing module further includes a third processor connected to a second memory, and a transmission latency generated when the third processor performs read and write operations on the second memory is less than the transmission latency generated when the first processor communicates with the first processing module.
- The first processor is connected to the first processing module through a first bus, the second processor is connected to the first memory through a second bus, and the third processor is connected to the second memory through a third bus, where a sum of a bus bit width of the second bus and a bus bit width of the third bus is greater than a bus bit width of the first bus.
- the first processing module further includes a third processor connected to the first memory, and a transmission latency generated when the third processor performs read and write operations on the first memory is less than the transmission latency generated when the first processor communicates with the first processing module.
- The first processor is connected to the first processing module through a first bus, the second processor is connected to the first memory through a second bus, and the third processor is connected to the first memory through a third bus, where a sum of a bus bit width of the second bus and a bus bit width of the third bus is greater than a bus bit width of the first bus.
- The second processor and the third processor are pipeline processors.
- the circuit further includes a fourth processor and a second processing module connected to the fourth processor.
- the second processing module includes N fifth processors connected to M memories, where both N and M are integers greater than or equal to 1.
- a transmission latency generated when any fifth processor performs read and write operations on the memory connected to the fifth processor is less than a transmission latency generated when the fourth processor communicates with the second processing module.
- The second processor is connected to the third processor through a fourth bus, and the fourth processor is connected to the first processor through a fifth bus, where a bus bit width of the fourth bus is less than a bus bit width of the fifth bus.
- a quantity of processor cores included in the fourth processor is greater than or equal to a quantity of processor cores included in the first processor.
- the fourth processor and the first processor are pipeline processors.
- the first processing module further includes the first memory.
- an embodiment of this application further provides a chip.
- the chip includes the circuit according to any one of the first aspect or the possible implementations of the first aspect.
- an embodiment of this application further provides an electronic device.
- the electronic device includes the chip according to embodiments of this application, and the electronic device further includes a receiver and a transmitter.
- the receiver is configured to receive a packet and send the packet to the chip.
- the chip is configured to process the packet.
- the transmitter is configured to: obtain a packet processed by the chip, and send the processed packet to another electronic device.
- the electronic device may be a switch, a router, or any other electronic device on which the foregoing chip can be disposed.
- an embodiment of this application further provides a processing method.
- the method includes: A first processor receives a first packet, where the first packet includes flow identifier information; the first processor determines a first processing module based on the flow identifier information, where the first processing module corresponds to the flow identifier information; and the first processor sends the first packet to the first processing module.
- the first processor sends, to the first processing module based on the flow identifier information carried in the packet, the packet that needs to be processed by the first processing module, and a processor in the first processing module performs corresponding processing. Because the first processing module is closer to a memory than the first processor, a transmission latency can be reduced.
- the method further includes: The first processor receives a second packet from the first processing module, where the second packet is a packet that is obtained through processing performed by the first processing module based on the flow identifier information, and the second packet includes the flow identifier information.
- the method further includes: The first processor sends the second packet to a next processor, where the next processor is a next hop of the first processor on a pipeline to which the first processor belongs.
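The dispatch steps of this method can be sketched in software as follows. This is only an illustrative model, not the circuit itself; the class names, the `modules` mapping, and the placeholder processing are assumptions introduced for the example.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class Packet:
    flow_id: int      # flow identifier information carried in the packet
    payload: bytes


class ProcessingModule:
    """Hypothetical stand-in for a processing module located near its memory."""

    def __init__(self, name: str) -> None:
        self.name = name

    def process(self, packet: Packet) -> Packet:
        # Placeholder processing: tag the payload and keep the flow identifier.
        return Packet(packet.flow_id, packet.payload + b"@" + self.name.encode())


class FirstProcessor:
    def __init__(self, modules: Dict[int, ProcessingModule],
                 next_hop: Callable[[Packet], None]) -> None:
        self.modules = modules      # flow identifier -> processing module
        self.next_hop = next_hop    # next processor on the same pipeline

    def handle(self, first_packet: Packet) -> None:
        module = self.modules[first_packet.flow_id]   # determine the module
        second_packet = module.process(first_packet)  # send, then receive result
        self.next_hop(second_packet)                  # forward to the next hop


forwarded = []
processor = FirstProcessor({7: ProcessingModule("module121")}, forwarded.append)
processor.handle(Packet(flow_id=7, payload=b"data"))
```

After the call, `forwarded` holds the second packet, which still carries flow identifier 7.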
- an embodiment of this application further provides a processing method.
- the method includes: A second processor in a first processing module receives a first packet from a first processor, where the first packet includes flow identifier information; the second processor obtains, from a memory corresponding to the second processor based on the flow identifier information, a parameter used for processing the first packet; the second processor processes the first packet based on the parameter, and sends a processed first packet to a third processor in the first processing module, where the processed first packet includes the flow identifier information; the third processor in the first processing module obtains, from a memory corresponding to the third processor based on the flow identifier information, a parameter used for processing the processed first packet; and the third processor processes the processed first packet based on the parameter, to obtain a second packet and send the second packet to the first processor.
- the processor in the first processing module performs a read operation on the memory based on a flow identifier in the first packet, and performs corresponding processing. Because the first processing module is closer to the memory than the first processor, a transmission latency can be reduced.
- the processing may include table lookup for forwarding, the parameter includes one or more of an index of a forwarding entry, a base address, and a hash value, and the parameter corresponds to the flow identifier.
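The per-stage lookup inside the first processing module can be sketched as follows. The dictionary-based memories, the parameter strings, and the `trace` field are hypothetical; they only illustrate that each processor reads its own memory by flow identifier and passes the packet on.

```python
def make_stage(memory):
    """Build a stage that reads a parameter from its own memory by flow identifier."""
    def stage(packet):
        parameter = memory[packet["flow_id"]]  # read operation on the local memory
        # Placeholder processing: record the parameter and keep the flow identifier.
        return {"flow_id": packet["flow_id"], "trace": packet["trace"] + [parameter]}
    return stage


# Hypothetical memory contents keyed by flow identifier.
second_processor = make_stage({7: "base_address=0x1000"})
third_processor = make_stage({7: "entry_index=42"})


def first_processing_module(first_packet):
    processed = second_processor(first_packet)  # second processor handles it first
    return third_processor(processed)           # third processor yields the second packet


second_packet = first_processing_module({"flow_id": 7, "trace": []})
```

The resulting `second_packet` keeps the flow identifier and records both parameters in the order in which the two processors applied them.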
- The terms “example” and “for example” are used to represent giving an example, an illustration, or a description. Any embodiment or design scheme described as an “example” in this application should not be construed as being more preferred or having more advantages than another embodiment or design scheme. Specifically, the term “example” is used to present a concept in a specific manner.
- Network architectures and service scenarios described in embodiments of this application are intended to describe the technical solutions in embodiments of this application more clearly, and do not constitute any limitation on the technical solutions according to embodiments of this application.
- “At least one” means one or more, and “a plurality of” means two or more.
- The term “and/or” describes an association relationship between associated objects and represents that three relationships may exist.
- For example, A and/or B may represent the following three cases: only A exists, both A and B exist, and only B exists.
- A and B each may be singular or plural.
- The character “/” generally represents an “or” relationship between the associated objects.
- At least one of the following items (pieces) or a similar expression thereof refers to any combination of these items, including any combination of singular items (pieces) or plural items (pieces).
- At least one item (piece) of a, b, or c may indicate: a, b, c, a and b, a and c, b and c, or a, b, and c, where a, b, and c may be singular or plural.
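The seven cases listed for “at least one of a, b, or c” are exactly the non-empty combinations of the three items. A short sketch (the function name is introduced here for illustration):

```python
from itertools import combinations


def at_least_one(items):
    """Return every non-empty combination of the given items."""
    return [combo
            for size in range(1, len(items) + 1)
            for combo in combinations(items, size)]


cases = at_least_one(["a", "b", "c"])
```

`cases` contains seven tuples, in the same order as the enumeration above: the three single items, the three pairs, and the full set.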
- FIG. 1 is a schematic diagram of a chip according to an embodiment of this application.
- the chip 100 includes an input/output interface 101, a processor 111, a processor 112, a processor 113, and a processor 114.
- the chip 100 further includes a processing module 121 and a processing module 122.
- the processing module 121 includes a processor 1211, a processor 1212, a memory 1221, and a memory 1222.
- the processing module 122 includes a processor 1213, a processor 1214, a processor 1215, a memory 1223, a memory 1224, and a memory 1225.
- the chip 100 further includes a memory 131 and a memory 132.
- the processor 111 is connected to the input/output interface 101 through a bus 61, is connected to the processor 112 through a bus 41, and is connected to the memory 131 through a bus 51.
- the processor 112 is connected to the processor 113 through a bus 42, and is connected to the processing module 121 through a bus 11.
- the processor 113 is connected to the processor 114 through a bus 43, and is connected to the memory 132 through a bus 52.
- the processor 114 is connected to the input/output interface 101 through a bus 62, and is connected to the processing module 122 through a bus 12.
- the processor 1211 is connected to the memory 1221 through a bus 31, and is connected to the processor 1212 through a bus 21.
- the processor 1212 is connected to the memory 1222 through a bus 32.
- the processor 1213 is connected to the memory 1223 through a bus 33, and is connected to the processor 1214 through a bus 22.
- the processor 1214 is connected to the memory 1224 through a bus 34, and is connected to the processor 1215 through a bus 23.
- the processor 1215 is connected to the memory 1225 through a bus 35.
- the chip 100 processes a received packet in a pipeline (pipeline) manner.
- the processor 111, the processor 112, the processor 113, and the processor 114 belong to a same pipeline, and the processor 111, the processor 112, the processor 113, and the processor 114 may also be referred to as pipeline processors.
- the processor 1211 and the processor 1212 belong to a same pipeline, and the processor 1213, the processor 1214, and the processor 1215 belong to a same pipeline.
- The pipeline to which the processor 111, the processor 112, the processor 113, and the processor 114 belong may be referred to as a first pipeline.
- Some processors in the first pipeline may directly access memories through buses.
- For example, the processor 111 may directly access the memory 131 through the bus 51, and the processor 113 may directly access the memory 132 through the bus 52.
- a processor in the first pipeline that can directly access a memory may be referred to as a type 1 processor.
- other processors in the first pipeline may communicate with processing modules.
- the processor 112 communicates with the processing module 121 through the bus 11, and the processor 114 communicates with the processing module 122 through the bus 12.
- a processor in the first pipeline that can communicate with a processing module may be referred to as a type 2 processor.
- a plurality of processors included in each processing module may also belong to a single pipeline.
- the processor 1211 and the processor 1212 belong to the same pipeline, and the processor 1213, the processor 1214, and the processor 1215 belong to the same pipeline.
- a pipeline in a processing module may be referred to as a second pipeline.
- a processor in the processing module may be referred to as a type 3 processor.
- the processor 1211, the processor 1212, the processor 1213, the processor 1214, and the processor 1215 shown in FIG. 1 may all be referred to as type 3 processors.
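The classification into type 1, type 2, and type 3 processors can be summarized with a small lookup over the FIG. 1 topology. The reference numerals come from the description; the dictionary layout and the function name are assumptions made for this sketch.

```python
# First pipeline of FIG. 1, in order.
FIRST_PIPELINE = ["111", "112", "113", "114"]

# First-pipeline processors that communicate with a processing module.
MODULE_OF = {"112": "121", "114": "122"}

# Processors contained in each processing module (second pipelines).
MODULE_MEMBERS = {
    "121": ["1211", "1212"],
    "122": ["1213", "1214", "1215"],
}


def processor_type(processor: str) -> int:
    """Return 1, 2, or 3 according to the classification in the description."""
    if processor in FIRST_PIPELINE:
        # Type 2 talks to a processing module; type 1 accesses a memory directly.
        return 2 if processor in MODULE_OF else 1
    if any(processor in members for members in MODULE_MEMBERS.values()):
        return 3
    raise ValueError(f"unknown processor {processor}")


types = {p: processor_type(p) for p in FIRST_PIPELINE + ["1211", "1215"]}
```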
- any type 1 processor corresponds to one memory
- any type 3 processor corresponds to one memory
- Any type 1 processor or any type 3 processor is connected to a corresponding memory through a bus, to perform read and write operations on the memory.
- the memory 131 corresponds to the processor 111
- the memory 1221 corresponds to the processor 1211.
- a one-to-one correspondence between the memory and the processor may also be replaced by a one-to-many or many-to-one correspondence.
- any type 1 processor or any type 3 processor may correspond to a plurality of memories, to perform read and write operations on a plurality of memories.
- Alternatively, a plurality of type 1 processors may correspond to one memory, or a plurality of type 3 processors may correspond to one memory, to perform read and write operations on the memory.
- For example, the memory 131 and the memory 132 in FIG. 1 may be replaced with one memory, so that the processor 111 and the processor 113 correspond to the same memory.
- The memory 131 in FIG. 1 may alternatively be replaced with a plurality of memories, so that the processor 111 corresponds to the plurality of memories.
- the memory 1221 and the memory 1222 in FIG. 1 may be replaced with one memory, and the processor 1211 and the processor 1212 correspond to the same memory.
- the bus in the chip 100 includes a type 1 bus, a type 2 bus, a type 3 bus, a type 4 bus, a type 5 bus, and a type 6 bus.
- the type 1 bus is configured to connect the type 2 processor and a processing module corresponding to the type 2 processor.
- the bus 11 configured to connect the processor 112 and the processing module 121 and the bus 12 configured to connect the processor 114 and the processing module 122 are both type 1 buses.
- the type 2 bus is configured to connect two type 3 processors.
- For example, the bus 21 configured to connect the processor 1211 and the processor 1212, the bus 22 configured to connect the processor 1213 and the processor 1214, and the bus 23 configured to connect the processor 1214 and the processor 1215 are all type 2 buses.
- the type 3 bus is configured to connect the type 3 processor and a memory corresponding to the type 3 processor.
- For example, the bus 31 configured to connect the processor 1211 and the memory 1221, the bus 33 configured to connect the processor 1213 and the memory 1223, and the like are all type 3 buses.
- the type 4 bus is configured to connect two processors in the first pipeline.
- For example, the bus 41 configured to connect the processor 111 and the processor 112, the bus 42 configured to connect the processor 112 and the processor 113, and the bus 43 configured to connect the processor 113 and the processor 114 are all type 4 buses.
- the type 5 bus is configured to connect the type 1 processor and a memory corresponding to the type 1 processor.
- the bus 51 configured to connect the processor 111 and the memory 131 and the bus 52 configured to connect the processor 113 and the memory 132 are both type 5 buses.
- the type 6 bus is configured to connect the input/output interface 101 and a processor.
- the bus 61 configured to connect the input/output interface 101 and the processor 111 and the bus 62 configured to connect the processor 114 and the input/output interface 101 are both type 6 buses.
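The six bus types can be tabulated from the connections described for FIG. 1. In the reference numerals used here, the tens digit of each bus number happens to match its bus type; the sketch below relies on that observation, which is a convenience of the numbering and not a rule stated in the application.

```python
from collections import defaultdict

# Bus number -> the two endpoints it connects, as described for FIG. 1.
BUSES = {
    11: ("112", "121"), 12: ("114", "122"),
    21: ("1211", "1212"), 22: ("1213", "1214"), 23: ("1214", "1215"),
    31: ("1211", "1221"), 32: ("1212", "1222"), 33: ("1213", "1223"),
    34: ("1214", "1224"), 35: ("1215", "1225"),
    41: ("111", "112"), 42: ("112", "113"), 43: ("113", "114"),
    51: ("111", "131"), 52: ("113", "132"),
    61: ("101", "111"), 62: ("114", "101"),
}


def bus_type(bus_number: int) -> int:
    """Derive the bus type from the tens digit of the reference numeral."""
    return bus_number // 10


buses_by_type = defaultdict(list)
for number in sorted(BUSES):
    buses_by_type[bus_type(number)].append(number)
```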
- each processor in the first pipeline is a multi-core processor.
- Each processor in the first pipeline may include a plurality of processor cores (which may also be referred to as cores (cores)).
- different processors in the first pipeline may include a same quantity of processor cores.
- any two processors in the first pipeline include a same quantity of processor cores.
- FIG. 1 is still used as an example.
- For example, a quantity of processor cores included in the processor 111 is equal to a quantity of processor cores included in the processor 112, the quantity of processor cores included in the processor 112 is equal to a quantity of processor cores included in the processor 113, and the quantity of processor cores included in the processor 113 is equal to a quantity of processor cores included in the processor 114.
- different processors in the first pipeline may include different quantities of processor cores.
- any two processors in the first pipeline may include different quantities of processor cores.
- a quantity of processor cores included in the processor 111 is greater than a quantity of processor cores included in the processor 112.
- a quantity of processor cores included in the processor 113 is greater than a quantity of processor cores included in the processor 114.
- the quantity of processor cores included in the processor 111 is greater than the quantity of processor cores included in the processor 113
- the quantity of processor cores included in the processor 112 is greater than the quantity of processor cores included in the processor 114.
- Alternatively, only some processors in the first pipeline may include a same quantity of processor cores.
- For example, a quantity of processor cores included in the processor 111 is equal to a quantity of processor cores included in the processor 113, and a quantity of processor cores included in the processor 112 is equal to a quantity of processor cores included in the processor 114, but the quantity of processor cores included in the processor 111 is different from the quantity of processor cores included in the processor 112.
- processors in the first pipeline may be classified into two types: a type 1 processor (for example, the processor 111 and the processor 113) and a type 2 processor (for example, the processor 112 and the processor 114).
- processors of a same type include a same quantity of processor cores, and processors of different types may include different quantities of processor cores.
- a quantity of processor cores included in the type 1 processor may be greater than a quantity of processor cores in the type 2 processor.
- the type 2 processor communicates with a processing module, and a processor included in the processing module can perform some processing operations.
- the type 2 processor may be a single-core processor or a processor with a small quantity of cores, so that hardware costs can be further reduced.
- the quantity of processor cores included in the type 2 processor may be 1/2, 1/3, 1/5, or 1/8 of the quantity of processor cores included in the type 1 processor.
- For example, the type 1 processor may be a multi-core processor, and the type 2 processor may be a single-core processor.
- The type 3 processor may also be a multi-core processor.
- the type 3 processor may also include a plurality of processor cores.
- a quantity of processor cores included in the type 3 processor is less than a quantity of processor cores included in the type 1 processor or a quantity of processor cores included in the type 2 processor.
- the quantity of processor cores included in the type 1 processor and the quantity of processor cores included in the type 2 processor are both greater than the quantity of processor cores included in the type 3 processor.
- a quantity of processor cores included in the processor 1211 may be less than the quantity of processor cores included in the processor 111, and the quantity of processor cores included in the processor 1211 may also be less than the quantity of processor cores included in the processor 112.
- a quantity of processor cores included in the type 3 processor may be less than a quantity of processor cores included in the type 1 processor, and the quantity of processor cores included in the type 3 processor may be equal to or greater than a quantity of processor cores included in the type 2 processor.
- a quantity of processor cores included in the processor 1213 may be less than the quantity of processor cores included in the processor 111, and the quantity of processor cores included in the processor 1213 may be equal to or greater than the quantity of processor cores included in the processor 114.
- the quantity of processor cores included in the type 3 processor may be less than or equal to 1/10 of the quantity of processor cores included in the type 1 processor.
- the quantity of processor cores included in the type 3 processor may be less than or equal to 1/2, 1/3, 1/5, 1/8, or the like of the quantity of processor cores included in the type 1 processor.
- A sum of a quantity of processor cores included in the type 2 processor and a quantity of processor cores included in one type 3 processor in the processing module corresponding to the type 2 processor is equal to a quantity of processor cores included in the type 1 processor.
- a sum of the quantity of processor cores included in the processor 112 and a quantity of processor cores included in the processor 1212 is equal to the quantity of processor cores included in the processor 111.
- a sum of the quantity of processor cores included in the processor 114 and a quantity of processor cores included in the processor 1214 is equal to the quantity of processor cores included in the processor 113.
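The core-count relationship in this implementation can be written as a one-line check. The core counts below are hypothetical; the application does not give concrete numbers.

```python
def cores_balance(type1_cores: int, type2_cores: int, type3_cores: int) -> bool:
    """True if the type 2 and type 3 core counts add up to the type 1 core count."""
    return type2_cores + type3_cores == type1_cores


# Hypothetical core counts keyed by processor reference numeral.
CORES = {"111": 8, "112": 6, "1212": 2, "113": 8, "114": 4, "1214": 4}

first_pair_ok = cores_balance(CORES["111"], CORES["112"], CORES["1212"])
second_pair_ok = cores_balance(CORES["113"], CORES["114"], CORES["1214"])
```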
- different type 3 processors may include a same quantity of processor cores.
- For example, the quantity of processor cores included in the processor 1211 is equal to the quantity of processor cores included in the processor 1212, and the quantity of processor cores included in the processor 1212 is equal to a quantity of processor cores included in the processor 1215.
- different type 3 processors may include different quantities of processor cores.
- any two processors belonging to a same processing module include a same quantity of processor cores, and two processors belonging to different processing modules include different quantities of processor cores.
- the quantity of processor cores included in the processor 1211 is equal to the quantity of processor cores included in the processor 1212, and the quantity of processor cores included in the processor 1212 is not equal to the quantity of processor cores included in the processor 1213.
- each processing module includes at least two processors.
- the processing module may alternatively include one multi-core processor.
- the processing module 121 may include only the processor 1211 and the memory 1221, where the processor 1211 is a multi-core processor.
- The type 3 processor may also be a single-core processor. If a type 3 processor is a single-core processor, the processing module including that processor may include at least two processors. In other words, if the processing module includes a plurality of processors, the plurality of processors may include at least one single-core processor.
- the processing module 121 is used as an example.
- the processor 1211 in the processing module 121 may be a single-core processor, and the processor 1212 may be a single-core processor or a multi-core processor.
- a length of the type 1 bus is greater than a length of the type 3 bus.
- the length of the type 3 bus may be equal to 1/5, 1/8, 1/10, or the like of the length of the type 1 bus.
- the length of the type 3 bus may be less than 1/10, 1/15, 1/20, or the like of the length of the type 1 bus.
- a sum of the length of the type 1 bus and the length of the type 3 bus is equal to a length of the type 5 bus.
- any two type 1 buses may have a same length.
- any two type 2 buses may have a same length.
- any two type 3 buses may have a same length.
- any two type 4 buses may have a same length.
- any two type 5 buses may have a same length. Due to limitations of a manufacturing process, it may be difficult to obtain buses of a completely same length. Therefore, in this embodiment of this application, that lengths are the same may be understood as that the lengths are completely the same, or may be understood as that a length difference is within an allowed error range.
- that a sum of the length of the type 1 bus and the length of the type 3 bus is equal to a length of the type 5 bus may be understood as follows: a difference between the sum of the length of the type 1 bus and the length of the type 3 bus and the length of the type 5 bus is 0, or is less than or equal to a preset allowed error value.
- a difference between a length of the bus 51 and a length of the bus 52 is 0, or is less than or equal to a preset allowed error value.
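
The "same length within an allowed error range" rule above can be illustrated with a short check. This is a minimal sketch only: the function names, the length units, and the tolerance values are hypothetical and are not specified in this application.

```python
def same_length(length_a: float, length_b: float, allowed_error: float) -> bool:
    """Return True if two bus lengths count as 'the same'.

    The lengths are the same if their difference is 0 or does not exceed
    the preset allowed error value (hypothetical units).
    """
    return abs(length_a - length_b) <= allowed_error


def sum_matches(type1_len: float, type3_len: float, type5_len: float,
                allowed_error: float) -> bool:
    """Check that type 1 length + type 3 length equals the type 5 length
    within the preset allowed error value."""
    return abs((type1_len + type3_len) - type5_len) <= allowed_error
```

For example, under these assumptions `same_length(10.0, 10.05, 0.1)` holds, while `same_length(10.0, 10.5, 0.1)` does not.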
- a sum of widths of all type 3 buses in a same processing module is greater than a width of one type 1 bus.
- a sum of a width of the bus 31 and a width of the bus 32 is greater than a width of the bus 11.
- a sum of a width of the bus 33, a width of the bus 34, and a width of the bus 35 is greater than a width of the bus 12.
- a quantity of bits of binary data that can be simultaneously transmitted through the bus is referred to as a width (which may also be referred to as a bit width), and the width is measured in bits.
- a greater bus width indicates better transmission performance and a larger amount of data that can be transmitted within a same period.
- FIG. 2 is a schematic diagram of another chip according to an embodiment of this application.
- the chip 200 includes an input/output interface 201, a processor 211, a processor 212, a processor 213, and a processor 214.
- the chip 200 further includes a processing module 221 and a processing module 222.
- the processing module 221 includes a processor 2211 and a processor 2212.
- the processing module 222 includes a processor 2213, a processor 2214, and a processor 2215.
- the processor 211 is connected to the input/output interface 201 through a bus 2411.
- the processor 211 is connected to the processor 212 through a bus 2441.
- the processor 212 is connected to the processing module 221 through a bus 2421.
- the processor 212 is connected to the processor 213 through a bus 2442.
- the processor 213 is connected to the input/output interface 201 through a bus.
- the processor 213 is connected to the processor 214 through a bus 2443.
- the processor 214 is connected to the processing module 222 through a bus 2422.
- the processing module 221 is connected to the input/output interface 201 through a bus 2431.
- the processing module 222 is connected to the input/output interface 201 through a bus 2432.
- the processor 2211 is connected to the processor 2212 through a bus 2451.
- the processor 2213 is connected to the processor 2214 through a bus 2452.
- the processor 2214 is connected to the processor 2215 through a bus 2453.
- a memory 231 to a memory 237 are memories located outside the chip 200.
- the chip 200 may access the memory 231 to the memory 237 through the input/output interface 201 and corresponding buses.
- the memory 231 is connected to the chip 200 through a bus 2461
- the memory 232 is connected to the chip 200 through a bus 2462
- the memory 233 is connected to the chip 200 through a bus 2463
- the memory 234 is connected to the chip 200 through a bus 2464
- the memory 235 is connected to the chip 200 through a bus 2465
- the memory 236 is connected to the chip 200 through a bus 2466
- the memory 237 is connected to the chip 200 through a bus 2467.
- the chip 200 processes a received packet in a pipeline manner.
- the processor 211, the processor 212, the processor 213, and the processor 214 in the chip 200 belong to a single pipeline, and the pipeline may be referred to as a first pipeline.
- some processors in the first pipeline can directly communicate with the input/output interface through buses, and other processors in the first pipeline are connected to processing modules through buses.
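- a processor that can directly communicate with the input/output interface (namely, a processor that is not connected to a processing module) may be referred to as a type 1 processor, and a processor that is connected to a processing module may be referred to as a type 2 processor.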
- a type 1 processor may include the processor 211 and the processor 213, and the type 2 processor may include the processor 212 and the processor 214.
- a plurality of processors included in each processing module may also belong to a single pipeline.
- the processor 2211 and the processor 2212 belong to a same pipeline
- the processor 2213, the processor 2214, and the processor 2215 belong to a same pipeline.
- a pipeline in a processing module may be referred to as a second pipeline.
- a processor in the processing module may be referred to as a type 3 processor.
- the processor 2211, the processor 2212, the processor 2213, the processor 2214, and the processor 2215 shown in FIG. 2 may all be referred to as type 3 processors.
- Each type 1 processor and each type 3 processor have one corresponding memory.
- the processor may read data stored in the corresponding memory.
- the processor may also write the data into the corresponding memory.
- a memory corresponding to the processor 211 is the memory 231
- a memory corresponding to the processor 2211 is the memory 232
- a memory corresponding to the processor 2212 is the memory 233
- a memory corresponding to the processor 213 is the memory 234
- a memory corresponding to the processor 2213 is the memory 235
- a memory corresponding to the processor 2214 is the memory 236, and
- a memory corresponding to the processor 2215 is the memory 237.
- the processor 211 may read data stored in the memory 231, and/or write the data into the memory 231.
- the processor 2213 may read data stored in the memory 235, and/or write the data into the memory 235.
- a processing module connected to a processor through a bus may be referred to as a processing module corresponding to the processor.
- the processing module 221 is a processing module corresponding to the processor 212.
- the bus in the chip 200 may include a type 1 bus, a type 2 bus, a type 3 bus, a type 4 bus, a type 5 bus, and a type 6 bus.
- the type 1 bus is configured to connect the type 2 processor and a processing module corresponding to the type 2 processor.
- the bus 2411 configured to connect the processor 212 and the processing module 221 and the bus 2412 configured to connect the processor 214 and the processing module 222 are both type 1 buses.
- the type 2 bus is configured to connect two type 3 processors.
- the bus 2421 configured to connect the processor 2211 and the processor 2212, the bus 2422 configured to connect the processor 2213 and the processor 2214, and the bus 2423 configured to connect the processor 2214 and the processor 2215 are all type 2 buses.
- the type 3 bus is configured to connect a processor in a processing module and the input/output interface.
- the bus 2431, the bus 2432, the bus 2433, the bus 2434, and the bus 2435 are all type 3 buses.
- the bus 2431 is a type 3 bus configured to connect the processor 2211 and the input/output interface 201.
- the bus 2432 is a type 3 bus configured to connect the processor 2212 and the input/output interface 201.
- the bus 2433 is a type 3 bus configured to connect the processor 2213 and the input/output interface 201.
- the bus 2434 is a type 3 bus configured to connect the processor 2214 and the input/output interface 201.
- the bus 2435 is a type 3 bus configured to connect the processor 2215 and the input/output interface 201.
- the type 4 bus is configured to connect two processors in the first pipeline.
- the bus 2441 configured to connect the processor 211 and the processor 212, the bus 2442 configured to connect the processor 212 and the processor 213, and the bus 2443 configured to connect the processor 213 and the processor 214 are all type 4 buses.
- the type 6 bus is configured to connect the type 1 processor and the input/output interface.
- the bus 2461 and the bus 2462 are both type 6 buses.
- the type 1 processor may access a corresponding memory through corresponding buses and the input/output interface.
- the processor 211 may access the memory 231 through the bus 2461, the input/output interface 201, and the bus 2471.
- the processor 213 may access the memory 234 through the bus 2462, the input/output interface 201, and the bus 2474.
- the type 3 processor may access a corresponding memory through corresponding buses and the input/output interface.
- the processor 2211 may access the memory 232 through the bus 2431, the input/output interface 201, and the bus 2472.
- the processor 2215 may access the memory 237 through the bus 2435, the input/output interface 201, and the bus 2477.
- processors shown in FIG. 2 may all be multi-core processors.
- the type 1 processor and the type 2 processor may be multi-core processors
- the type 3 processor may be a single-core processor.
- a structure of the type 3 processor may be simpler than a structure of the type 1 processor.
- a quantity of processor cores included in the type 3 processor may be less than a quantity of processor cores included in the type 1 processor.
- a quantity of transistors included in the type 3 processor may be less than a quantity of transistors included in the type 1 processor.
- for the type 1 processor, the type 2 processor, and the type 3 processor, refer to the descriptions of the chip 100 shown in FIG. 1. For brevity, details are not described herein again.
- a length of the type 1 bus is greater than a length of the type 3 bus.
- the length of the type 3 bus may be equal to 1/5, 1/8, 1/10, or the like of the length of the type 1 bus.
- the length of the type 3 bus may be less than 1/10, 1/15, 1/20, or the like of the length of the type 1 bus.
- a sum of the length of the type 1 bus and the length of the type 3 bus is equal to a length of the type 6 bus.
- any two type 1 buses may have a same length.
- any two type 2 buses may have a same length.
- any two type 3 buses may have a same length.
- any two type 4 buses may have a same length.
- any two type 6 buses may have a same length.
- a sum of widths of buses between the chip and memories corresponding to a same processing module is greater than a width of one type 1 bus.
- a sum of a width of the bus 2431 and a width of the bus 2432 is greater than a width of the bus 2411.
- a sum of a width of the bus 2433, a width of the bus 2434, and a width of the bus 2435 is greater than a width of the bus 2412.
- a width of the type 2 bus may be less than a width of the type 4 bus.
- each processor (the type 3 processor and the type 1 processor) that has a corresponding memory and the corresponding memory are located inside the chip.
- FIG. 3 is a schematic diagram of a hybrid processor circuit.
- a processor 301 is connected to a processing module 310 through a bus 331.
- the processing module 310 includes three processors, which are respectively a processor 311, a processor 312, and a processor 313.
- the processors in the processing module 310 are connected through a bus 332.
- the processor 301 is a first processor, and the processor 311, the processor 312, and the processor 313 are type 3 processors. More specifically, the processor 301 is a type 2 processor.
- Each processor in the processing module 310 has one corresponding memory.
- a memory corresponding to the processor 311 is a memory 321, a memory corresponding to the processor 312 is a memory 322, and a memory corresponding to the processor 313 is a memory 323.
- Each processor in the processing module 310 is connected to the corresponding memory through a bus.
- the processor 311 is connected to the memory 321 through a bus 333
- the processor 312 is connected to the memory 322 through a bus 334
- the processor 313 is connected to the memory 323 through a bus 335.
- the processor 301, the processor 311, the processor 312, and the processor 313 are located in a same chip.
- the corresponding memory of each processor in the processing module 310 may be located in a same chip as the processing module 310, or may be located outside a chip in which the processing module 310 is located. If the memory is located outside the chip in which the processing module 310 is located, the bus configured to connect the processor in the processing module and the corresponding memory may include a bus from the processor to an input/output interface of the chip and a bus from the chip to the corresponding memory.
- the bus 333 may include a bus from the processor 311 to the input/output interface of the chip and a bus from the input/output interface of the chip to the memory 321.
- the structure shown in FIG. 3 may be referred to as a hybrid processor circuit or a hybrid processor structure.
- the processing module in the hybrid processor structure shown in FIG. 3 includes three processors.
- a quantity of processors in the processing module may be a positive integer greater than or equal to 1.
- the quantity may be 1, 2, 4, 5, or the like.
- the processing module may be a multi-core processor. If the processing module includes at least two processors, the at least two processors may include one or more single-core processors.
- L (the length of the bus 333) is less than R (the length of the bus 331).
- L may be far less than R.
- L may be equal to one tenth of R, or L is less than one tenth of R.
- a letter A is used to represent a width of the bus 333
- a letter B is used to represent a width of the bus 334
- a letter C is used to represent a width of the bus 335
- a letter D is used to represent a width of the bus 331.
- A, B, C, and D meet the following relationship: $D < A + B + C$.
- $\text{Cost\_TX} = L \times (A + B + C) + R \times D$
- Cost_TX represents the data transfer cost
- L represents the length of the bus 333 (the length of the bus 333, the length of the bus 334, and the length of the bus 335 are equal)
- R represents the length of the bus 331
- A represents the width of the bus 333
- B represents the width of the bus 334
- C represents the width of the bus 335
- D represents the width of the bus 331.
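
The data transfer cost of the structure shown in FIG. 3 can be evaluated with a few lines of code. This is a minimal sketch that reads the cost as L × (A + B + C) + R × D; the function name and the numeric values (lengths in arbitrary units, widths in bits) are hypothetical and chosen only for illustration.

```python
def cost_tx_hybrid(length_l, length_r, width_a, width_b, width_c, width_d):
    """Data transfer cost of the FIG. 3 structure: L * (A + B + C) + R * D.

    length_l: length L of the buses 333 to 335 (assumed equal)
    length_r: length R of the bus 331
    width_a, width_b, width_c: widths of the buses 333, 334, 335
    width_d: width of the bus 331
    """
    # The description requires D < A + B + C.
    if not width_d < width_a + width_b + width_c:
        raise ValueError("expected D < A + B + C")
    return length_l * (width_a + width_b + width_c) + length_r * width_d


# Hypothetical example: L is one tenth of R.
print(cost_tx_hybrid(2, 20, 64, 64, 64, 32))  # 2*192 + 20*32 = 1024
```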
- a chip (for example, the chip 100 shown in FIG. 1 or the chip 200 shown in FIG. 2) that uses a pipeline structure may generate a program state (PS) for the packet.
- the PS is used to store context information during packet forwarding.
- the PS passes through each processor in a first pipeline, and the processor in the first pipeline is responsible for processing.
- the PS processed by the processor in the first pipeline may be referred to as a first PS.
- PS_Full_Size represents a size of the first PS.
- the processing module 310 also generates a PS in a process of processing the packet.
- the PS sequentially passes through each processor in a second pipeline, and the processors in the second pipeline are responsible for processing.
- the PS processed by the processor in the second pipeline may be referred to as a second PS.
- PS_Little_Size represents a size of the second PS.
- the first PS stores the context information during the packet forwarding.
- the second PS stores only information processed in the second pipeline. Therefore, the size of the first PS is greater than the size of the second PS (that is, PS_Full_Size>PS_Little_Size).
- the size of the second PS may be equal to or less than 1/5, 1/8, 1/10, 1/15, 1/20, or the like of the size of the first PS.
- a simpler processor may be used to process the second PS. Therefore, a structure of the processor (namely, the type 3 processor) inside the processing module may be simpler than a structure of the processor in the first pipeline.
- a quantity of processor cores included in the type 3 processor may be less than a quantity of processor cores included in the type 1 first processor and/or a quantity of processor cores included in the type 2 first processor, and/or a quantity of transistors included in the type 3 processor may be less than a quantity of transistors included in the type 1 processor and/or a quantity of transistors included in the type 2 processor.
- a greater difference between the size of the first PS and the size of the second PS indicates a simpler structure of the type 3 processor.
- the quantity of processor cores included in the type 3 processor may be less than the quantity of processor cores included in the type 1 processor, and/or the quantity of transistors included in the type 3 processor may be less than the quantity of transistors included in the type 1 processor. In other embodiments, the quantity of processor cores included in the type 3 processor may be less than the quantity of processor cores included in the type 2 processor, and/or the quantity of transistors included in the type 3 processor may be less than the quantity of transistors included in the type 2 processor.
- N_Little may be used to represent the quantity of processor cores included in the type 3 processor
- N_Big2 may be used to represent the quantity of processor cores included in the type 2 processor
- N_Big1 may be used to represent the quantity of processor cores included in the type 1 processor.
- $\text{Cost\_Proc} = \text{PS\_Little\_Size} \times \text{N\_Little} + \text{PS\_Full\_Size} \times \text{N\_Big2}$
- Cost_Proc represents the processor cost, and meanings of PS_Little_Size, N_Little, PS_Full_Size, and N_Big2 are described above. For brevity, details are not described herein again.
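
As a quick numeric illustration of the processor cost, the following minimal sketch reads the cost as PS_Little_Size × N_Little + PS_Full_Size × N_Big2. The function name and the core counts are hypothetical; the PS sizes of 64 and 512 bytes are used only as example magnitudes.

```python
def cost_proc_hybrid(ps_little_size, n_little, ps_full_size, n_big2):
    """Processor cost of the hybrid structure:
    PS_Little_Size * N_Little + PS_Full_Size * N_Big2."""
    return ps_little_size * n_little + ps_full_size * n_big2


# Hypothetical core counts: 4 cores in the type 3 processor,
# 12 cores in the type 2 processor.
print(cost_proc_hybrid(64, 4, 512, 12))  # 64*4 + 512*12 = 6400
```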
- Latency_L is used to represent an input/output (I/O) latency of a bus whose length is L
- Latency_R is used to represent an I/O latency of a bus whose length is R
- $\text{Cost\_LAT} = \text{Latency\_L} \times 3 + \text{Latency\_R} \times 1$
- Cost_LAT represents the latency cost
- Latency_L represents the I/O latency of the bus whose length is L
- Latency_R represents the I/O latency of the bus whose length is R.
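
The latency cost of the structure shown in FIG. 3 can be sketched the same way. The code reads the cost as Latency_L × 3 + Latency_R × 1; interpreting the factor 3 as one memory access by each of the three processors in the processing module is an assumption, and the function name and latency values (arbitrary time units) are hypothetical.

```python
def cost_lat_hybrid(latency_l, latency_r, memory_accesses=3):
    """Latency cost of the FIG. 3 structure: Latency_L * 3 + Latency_R * 1.

    memory_accesses defaults to 3, assumed here to correspond to one
    memory access over a bus of length L per processor in the module.
    """
    return latency_l * memory_accesses + latency_r * 1


# Hypothetical latencies in arbitrary time units.
print(cost_lat_hybrid(2, 20))  # 2*3 + 20*1 = 26
```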
- if processors in one pipeline implement the functions implemented by the hybrid processor structure shown in FIG. 3, a structure shown in FIG. 4 is required.
- FIG. 4 is a schematic diagram of another circuit including processors.
- the circuit shown in FIG. 4 includes three processors: a processor 401, a processor 402, and a processor 403.
- the processor 401, the processor 402, and the processor 403 are type 1 processors (the processor 401 to the processor 403 are all processors in a same pipeline, and the processor 401 to the processor 403 are connected to memories rather than processing modules through buses).
- Each of the three processors has a corresponding memory.
- a memory corresponding to the processor 401 is a memory 411
- a memory corresponding to the processor 402 is a memory 412
- a memory corresponding to the processor 403 is a memory 413.
- the processor 401 is connected to the memory 411 through a bus 421, the processor 402 is connected to the memory 412 through a bus 422, and the processor 403 is connected to the memory 413 through a bus 423.
- the processor 401 is connected to the processor 402 through a bus 424, and the processor 402 is connected to the processor 403 through a bus 424.
- the bus 421, the bus 422, and the bus 423 have a same length.
- the length of the bus 421 may be equal to L+R, namely, a sum of the length of the bus 333 and the length of the bus 331 shown in FIG. 3 .
- a width of the bus 421 may be equal to the width of the bus 333, a width of the bus 422 may be equal to the width of the bus 334, and a width of the bus 423 may be equal to the width of the bus 335.
- Cost_TX represents the data transfer cost
- L+R is the length of the bus 421 (the length of the bus 422 is equal to the length of the bus 421, and the length of the bus 423 is equal to the length of the bus 421)
- A represents the width of the bus 421
- B represents the width of the bus 422
- C represents the width of the bus 423.
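
The definitions above suggest how the data transfer cost of FIG. 4 compares with that of FIG. 3. The following derivation is a reconstruction under stated assumptions, not a formula quoted from this application: it assumes that each of the three buses in FIG. 4 has length L + R and widths A, B, and C, and that the FIG. 3 cost is L × (A + B + C) + R × D.

```latex
\begin{align*}
\text{Cost\_TX}_{\text{FIG. 4}} &= (L + R) \times (A + B + C) \\
\text{Cost\_TX}_{\text{FIG. 4}} - \text{Cost\_TX}_{\text{FIG. 3}}
  &= (L + R)(A + B + C) - \bigl(L\,(A + B + C) + R\,D\bigr) \\
  &= R \times (A + B + C - D)
\end{align*}
```

Because D is less than A + B + C, the difference is positive under these assumptions, and it grows with R.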
- a greater difference between R and L indicates a lower data transfer cost of the structure shown in FIG. 3 .
- a PS passing through the processor 401 to the processor 403 is PS_Full.
- a size of PS_Full is PS_Full_Size
- a quantity of processor cores included in the type 1 processor is N_Big1.
- Cost_Proc represents the processor cost
- PS_Full_Size is the size of the PS that passes through the processor 401
- N_Big1 is the quantity of processor cores included in the processor 401.
- the processor cost generated when using the structure shown in FIG. 3 may be reduced by (PS_Full_Size − PS_Little_Size) × N_Little.
- A greater difference between PS_Full_Size and PS_Little_Size indicates a more reduced processor cost (that is, a lower processor cost).
- a greater difference between N_Big1 and N_Little indicates a more reduced processor cost (that is, a lower processor cost).
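
The stated reduction of the processor cost can be checked with a short derivation. It is a sketch under two assumptions that are not spelled out at this point in the text: the processor cost of FIG. 4 is PS_Full_Size × N_Big1, and N_Big1 = N_Big2 + N_Little (the core-count relationship described for the processors of the chip 100).

```latex
\begin{align*}
\text{Cost\_Proc}_{\text{FIG. 4}} &= \text{PS\_Full\_Size} \times \text{N\_Big1}
  = \text{PS\_Full\_Size} \times (\text{N\_Big2} + \text{N\_Little}) \\
\text{Cost\_Proc}_{\text{FIG. 3}} &= \text{PS\_Little\_Size} \times \text{N\_Little}
  + \text{PS\_Full\_Size} \times \text{N\_Big2} \\
\text{Cost\_Proc}_{\text{FIG. 4}} - \text{Cost\_Proc}_{\text{FIG. 3}}
  &= (\text{PS\_Full\_Size} - \text{PS\_Little\_Size}) \times \text{N\_Little}
\end{align*}
```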
- Latency_L is used to represent an I/O latency of a bus whose length is L
- Latency_R is used to represent an I/O latency of a bus whose length is R
- $\text{Cost\_LAT} = (\text{Latency\_L} + \text{Latency\_R}) \times 3$
- Cost_LAT represents the latency cost
- Latency_L represents the I/O latency of the bus whose length is L
- Latency_R represents the I/O latency of the bus whose length is R.
- the structure shown in FIG. 3 can reduce an I/O latency of Latency_R × 2.
- a greater difference between R and L indicates a more reduced I/O latency.
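
The latency saving follows directly from the two latency costs. The derivation below assumes that the FIG. 4 cost is (Latency_L + Latency_R) × 3 and that the FIG. 3 cost is Latency_L × 3 + Latency_R × 1:

```latex
\begin{align*}
\text{Cost\_LAT}_{\text{FIG. 4}} - \text{Cost\_LAT}_{\text{FIG. 3}}
  &= (\text{Latency\_L} + \text{Latency\_R}) \times 3
   - (\text{Latency\_L} \times 3 + \text{Latency\_R} \times 1) \\
  &= \text{Latency\_R} \times 2
\end{align*}
```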
- corresponding functions can be implemented using lower costs (a lower data transfer cost, a lower processor cost, and a lower latency cost).
- because the length of a bus required inside a processing module is short and the width of a bus between a type 2 processor and the processing module is small, the area of a chip using the technical solutions of this application is smaller than that of a chip that implements a same function.
- a basic process of determining a next-hop port by the ECMP is as follows: A hash value is determined based on flow identifier information (for example, a quintuple or a flow label) of a packet, and then an entry is determined based on an ECMP routing table and the hash value, where a port included in the entry is a next-hop port for sending the packet.
- the ECMP routing table may be divided into a plurality of tables, for example, may be divided into three tables, which are respectively referred to as a routing entry table 1, a routing entry table 2, and a routing entry table 3.
- the routing entry table 2 is determined based on the index of the routing entry table, and an entry corresponding to the base address and the hash value determined based on the flow identifier information of the packet is queried from the routing entry table 2, where the entry includes one port index and an index of one routing entry table.
- the routing entry table 3 is determined based on the index of the routing entry table, and an entry corresponding to the port index is queried from the routing entry table 3, where the entry includes a next-hop port for the packet.
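
The three-table lookup can be sketched as follows. This is an illustrative sketch, not the layout used in this application: the dictionary-based tables, the table names, and the sample entries are hypothetical, and because the text does not specify how the base address and the hash value are combined, the sketch simply uses the pair as a dictionary key.

```python
def ecmp_next_hop(tables, table_index_1, flow_id, hash_value):
    """Resolve a next-hop port through three routing entry tables."""
    # Routing entry table 1: flow identifier information ->
    # (index of routing entry table 2, base address).
    table_index_2, base_address = tables[table_index_1][flow_id]

    # Routing entry table 2: (base address, hash value) ->
    # (port index, index of routing entry table 3).
    port_index, table_index_3 = tables[table_index_2][(base_address, hash_value)]

    # Routing entry table 3: port index -> next-hop port.
    return tables[table_index_3][port_index]


# Hypothetical sample tables for illustration only.
tables = {
    "t1": {"flow-x": ("t2", 100)},
    "t2": {(100, 0): (7, "t3"), (100, 1): (8, "t3")},
    "t3": {7: "port-A", 8: "port-B"},
}
print(ecmp_next_hop(tables, "t1", "flow-x", 1))  # port-B
```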
- FIG. 5 is a schematic flowchart of determining a next-hop port by using the circuit shown in FIG. 4 .
- the processor 401 obtains an index (referred to as a routing table index 1 below) of one routing entry table from a received PS.
- the processor 401 sends the routing table index 1 to the memory 411.
- the processor 401 receives, from the memory 411, a routing entry table 1 corresponding to the routing table index 1.
- the processor 401 determines, from the routing entry table 1, an entry corresponding to flow identifier information of a packet.
- the entry includes an index (referred to as a routing table index 2 below) of one routing entry table and one base address, and the routing table index 2 and the base address are written into the PS.
- the processor 401 sends the PS (namely, the PS into which the routing table index 2 and the base address are written) to the processor 402.
- the processor 402 obtains the routing table index 2, the base address, and one hash value from the received PS.
- the hash value is determined based on the flow identifier information of the packet.
- the hash value may be determined by an upstream node of the processor 401 and written into the PS.
- the processor 402 sends the routing table index 2 to the memory 412.
- the processor 402 receives, from the memory 412, a routing entry table 2 corresponding to the routing table index 2.
- the processor 402 queries, from the routing entry table 2, an entry corresponding to the base address and the hash value, where the entry includes one port index and an index (referred to as a routing table index 3 below) of one routing entry table, and writes the port index and the routing table index 3 into the PS.
- the processor 402 sends the PS (namely, the PS into which the port index and the routing table index 3 are written) to the processor 403.
- the processor 403 obtains the routing table index 3 and the port index from the received PS.
- the processor 403 sends the routing table index 3 to the memory 413.
- the processor 403 receives, from the memory 413, a routing entry table 3 corresponding to the routing table index 3.
- the processor 403 queries, from the routing entry table 3, an entry corresponding to the port index, where content included in the entry is a next-hop port for the packet.
- the processor 403 writes the next-hop port for the packet into the PS, and sends the PS to a next node in a pipeline, so that the next node continues to process the packet.
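
The flow of FIG. 5 can be sketched as three pipeline stages that each consult their own memory and pass the full PS on. This is a minimal model under stated assumptions: the PS is represented as a dictionary, each memory as a dictionary of tables, and all field names and sample entries are hypothetical.

```python
def stage_401(ps, memory_411):
    """Processor 401: look up routing entry table 1 by flow identifier."""
    table_1 = memory_411[ps["table_index_1"]]
    ps["table_index_2"], ps["base_address"] = table_1[ps["flow_id"]]
    return ps


def stage_402(ps, memory_412):
    """Processor 402: look up routing entry table 2 by base address and hash."""
    table_2 = memory_412[ps["table_index_2"]]
    key = (ps["base_address"], ps["hash"])  # keying by the pair is an assumption
    ps["port_index"], ps["table_index_3"] = table_2[key]
    return ps


def stage_403(ps, memory_413):
    """Processor 403: look up routing entry table 3 by port index."""
    table_3 = memory_413[ps["table_index_3"]]
    ps["next_hop_port"] = table_3[ps["port_index"]]
    return ps


# Hypothetical memory contents.
memory_411 = {"t1": {"flow-x": ("t2", 100)}}
memory_412 = {"t2": {(100, 1): (8, "t3")}}
memory_413 = {"t3": {8: "port-B"}}

# The full PS also carries context needed by later pipeline nodes.
ps = {"table_index_1": "t1", "flow_id": "flow-x", "hash": 1,
      "other_context": "context for later nodes"}
for stage, memory in ((stage_401, memory_411),
                      (stage_402, memory_412),
                      (stage_403, memory_413)):
    ps = stage(ps, memory)
print(ps["next_hop_port"])  # port-B
```

Note that in this model the whole PS, including fields unrelated to routing, travels through every stage.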
- FIG. 6A and FIG. 6B are a schematic flowchart of determining a next-hop port by using the circuit shown in FIG. 3 .
- the processor 301 obtains, from a received PS, an index (referred to as a routing table index 1 below) of one routing entry table, flow identifier information of a packet, and a hash value determined based on the flow identifier information of the packet.
- the processor 301 sends, to the processor 311, the routing table index 1, the flow identifier information of the packet, and the hash value determined based on the flow identifier information of the packet.
- the processor 311 sends the routing table index 1 to the memory 321.
- the processor 311 receives, from the memory 321, a routing entry table 1 corresponding to the routing table index 1.
- the processor 311 determines, from the routing entry table 1, an entry corresponding to the flow identifier information of the packet.
- the entry includes an index (referred to as a routing table index 2 below) of one routing entry table and one base address, and the routing table index 2 and the base address are written into the PS.
- the PS may further include the hash value determined based on the flow identifier information of the packet.
- the processor 311 sends the PS (namely, the PS into which the routing table index 2 and the base address are written) to the processor 312.
- the processor 312 obtains the routing table index 2, the base address, and the hash value from the received PS.
- the processor 312 sends the routing table index 2 to the memory 322.
- the processor 312 receives, from the memory 322, a routing entry table 2 corresponding to the routing table index 2.
- the processor 312 queries, from the routing entry table 2, an entry corresponding to the base address and the hash value, where the entry includes one port index and an index (referred to as a routing table index 3 below) of one routing entry table, and writes the port index and the routing table index 3 into the PS.
- the processor 312 sends the PS (namely, the PS into which the port index and the routing table index 3 are written) to the processor 313.
- the processor 313 obtains the routing table index 3 and the port index from the received PS.
- the processor 313 sends the routing table index 3 to the memory 323.
- the processor 313 receives, from the memory 323, a routing entry table 3 corresponding to the routing table index 3.
- the processor 313 queries, from the routing entry table 3, an entry corresponding to the port index, where content included in the entry is a next-hop port for the packet.
- the processor 313 sends the next-hop port for the packet to the processor 301.
- the processor 301 writes the next-hop port for the packet into the PS, and sends the PS to a next node in a pipeline, so that the next node continues to process the packet.
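
The flow of FIG. 6A and FIG. 6B can be sketched in the same style. In this minimal model (dictionaries for the PS and the memories, hypothetical field names and sample entries), the processor 301 hands only the routing-related fields to the processing module 310, which works on a small PS and returns just the next-hop port.

```python
def processing_module_310(request, memory_321, memory_322, memory_323):
    """Run the three lookups inside the processing module on a small PS."""
    small_ps = dict(request)  # holds routing information only

    # Processor 311: routing entry table 1, keyed by flow identifier.
    table_1 = memory_321[small_ps["table_index_1"]]
    small_ps["table_index_2"], small_ps["base_address"] = table_1[small_ps["flow_id"]]

    # Processor 312: routing entry table 2, keyed by (base address, hash);
    # keying by the pair is an assumption of this sketch.
    table_2 = memory_322[small_ps["table_index_2"]]
    key = (small_ps["base_address"], small_ps["hash"])
    small_ps["port_index"], small_ps["table_index_3"] = table_2[key]

    # Processor 313: routing entry table 3, keyed by port index.
    table_3 = memory_323[small_ps["table_index_3"]]
    return table_3[small_ps["port_index"]]


def processor_301(full_ps, memory_321, memory_322, memory_323):
    """Processor 301: send only routing fields to the module, then update the PS."""
    request = {name: full_ps[name] for name in ("table_index_1", "flow_id", "hash")}
    full_ps["next_hop_port"] = processing_module_310(
        request, memory_321, memory_322, memory_323)
    return full_ps


# Hypothetical memory contents and full PS.
memory_321 = {"t1": {"flow-x": ("t2", 100)}}
memory_322 = {"t2": {(100, 1): (8, "t3")}}
memory_323 = {"t3": {8: "port-B"}}
full_ps = {"table_index_1": "t1", "flow_id": "flow-x", "hash": 1,
           "other_context": "context for later nodes"}

result = processor_301(full_ps, memory_321, memory_322, memory_323)
print(result["next_hop_port"])  # port-B
```

Unlike the single-pipeline model, the intermediate fields (routing table index 2, base address, port index) never enter the full PS here.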
- the processor 401 to the processor 403 and other processors in a chip all belong to the same pipeline.
- the PS processed by the processor in the pipeline is used to store context information during packet forwarding. Therefore, the PS sent by the processor 401 to the processor 402 and the PS sent by the processor 402 to the processor 403 need to include information required by a subsequent node in addition to information required for querying a next-hop port. Therefore, a size of the PS is large. For example, the size of the PS may be 512 bytes. Correspondingly, a width of a bus between processors is also large.
- the processor 311 to the processor 313 only care about a routing function, and the transferred PS only needs to include information required for routing. Therefore, setting a PS of a small size may meet requirements of the processor 311 to the processor 313. For example, a 64-byte PS may meet routing requirements. Correspondingly, a bus of a small width may be set between processors. In addition, information required by the processor 312 and information required by the processor 313 are both from previous nodes, and the information does not need to be obtained from the processor 301.
- a bus of a small width may be set between the processor 301 and the processing module 310.
- a width of the bus 331 may be 128 bits.
- widths of the bus 333 to the bus 335 may be 256 bits, and widths of the bus 421 to the bus 423 may be 256 bits.
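
With these example widths, the data transfer costs of the two structures can be compared numerically. The sketch below assumes the cost expressions L × (A + B + C) + R × D for FIG. 3 and (L + R) × (A + B + C) for FIG. 4; the latter is reconstructed from the bus definitions for FIG. 4 rather than quoted. The function names and the lengths L = 1 and R = 10 (arbitrary units) are hypothetical.

```python
def cost_tx_hybrid(length_l, length_r, a, b, c, d):
    """FIG. 3: three short buses of length L plus one bus of length R."""
    return length_l * (a + b + c) + length_r * d


def cost_tx_pipeline(length_l, length_r, a, b, c):
    """FIG. 4 (assumed form): three buses, each of length L + R."""
    return (length_l + length_r) * (a + b + c)


L_LEN, R_LEN = 1, 10  # hypothetical lengths, with L one tenth of R

hybrid = cost_tx_hybrid(L_LEN, R_LEN, 256, 256, 256, 128)
pipeline = cost_tx_pipeline(L_LEN, R_LEN, 256, 256, 256)
print(hybrid, pipeline, pipeline - hybrid)  # 2048 8448 6400
```

Under these assumptions the saving equals R × (A + B + C − D) = 10 × (768 − 128) = 6400.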
- An embodiment of this application further provides a circuit.
- the circuit includes a first processor and a first processing module connected to the first processor.
- the first processing module includes a second processor connected to a first memory. A transmission latency generated when the second processor performs read and write operations on the first memory is less than a transmission latency generated when the first processor communicates with the first processing module.
- the processing module 121 shown in FIG. 1 includes only the processor 1211 and the memory 1221.
- the first processor may be equivalent to the processor 112
- the first processing module may be equivalent to the processing module 121
- the second processor may be equivalent to the processor 1211
- the first memory may be equivalent to the memory 1221.
- a transmission latency generated when the processor 1211 performs read and write operations on the memory 1221 is less than a transmission latency generated when the processor 112 communicates with the processing module 121.
- the processing module 221 shown in FIG. 2 includes only the processor 2211.
- the first processor may be equivalent to the processor 212
- the first processing module may be equivalent to the processing module 221
- the second processor may be equivalent to the processor 2211
- the first memory may be equivalent to the memory 232.
- if the second processor is a multi-core processor, the transmission latency generated when the second processor performs the read and write operations on the first memory is a transmission latency generated when any processor core of the second processor performs read and write operations on the first memory.
- the first processor is connected to the first processing module through a first bus
- the second processor is connected to the first memory through a second bus, where a bus bit width of the second bus is greater than a bus bit width of the first bus, and/or a length of the second bus is less than a length of the first bus.
- the processing module 121 shown in FIG. 1 includes only the processor 1211 and the memory 1221.
- the first bus is equivalent to the bus 11 configured to connect the processor 112 and the processing module 121
- the second bus is equivalent to the bus 31 configured to connect the processor 1211 and the memory 1221.
- the processing module 221 shown in FIG. 2 includes only the processor 2211.
- the first bus is equivalent to the bus 2411 configured to connect the processor 212 and the processing module 221
- the second bus may be equivalent to buses configured to connect the processor 2211 and the memory 232, including the bus 2431 and the bus 2472.
- the second bus may also be equivalent to the bus 2431 configured to connect the processor 2211 and the input/output interface 201.
- the first processing module further includes a third processor connected to a second memory, and a transmission latency generated when the third processor performs read and write operations on the second memory is less than the transmission latency generated when the first processor communicates with the first processing module.
- FIG. 1 is used as an example.
- the first processor may be equivalent to the processor 112
- the first processing module may be equivalent to the processing module 121
- the second processor may be equivalent to the processor 1211
- the third processor may be equivalent to the processor 1212
- the first memory may be equivalent to the memory 1221
- the second memory may be equivalent to the memory 1222.
- a transmission latency generated when the processor 1211 performs read and write operations on the memory 1221 is less than a transmission latency generated when the processor 112 communicates with the processing module 121
- a transmission latency generated when the processor 1212 performs read and write operations on the memory 1222 is less than the transmission latency generated when the processor 112 communicates with the processing module 121.
- FIG. 2 is used as an example.
- the first processor may be equivalent to the processor 212
- the first processing module may be equivalent to the processing module 221
- the second processor may be equivalent to the processor 2211
- the third processor may be equivalent to the processor 2212
- the first memory may be equivalent to the memory 232
- the second memory may be equivalent to the memory 233.
- the first processor is connected to the first processing module through a first bus
- the second processor is connected to the first memory through a second bus
- the third processor is connected to the second memory through a third bus
- a sum of a bus bit width of the second bus and a bus bit width of the third bus is greater than a bus bit width of the first bus.
- FIG. 1 is still used as an example.
- the first bus may be equivalent to the bus 11
- the second bus may be equivalent to the bus 31
- the third bus may be equivalent to the bus 32.
- FIG. 2 is still used as an example.
- the first bus may be equivalent to the bus 2411
- the second bus may be equivalent to the bus 2431 and the bus 2472
- the third bus may be equivalent to the bus 2432 and the bus 2473.
- the second bus may also be equivalent to the bus 2431
- the third bus may also be equivalent to the bus 2432.
- the first processing module further includes a third processor connected to the first memory, and a transmission latency generated when the third processor performs read and write operations on the first memory is less than the transmission latency generated when the first processor communicates with the first processing module.
- the first processor is connected to the first processing module through a first bus
- the second processor is connected to the first memory through a second bus
- the third processor is connected to the first memory through a third bus
- a sum of a bus bit width of the second bus and a bus bit width of the third bus is greater than a bus bit width of the first bus.
- the second processor and the third processor are pipeline processors.
- the circuit further includes a fourth processor and a third memory connected to the fourth processor.
- FIG. 1 is still used as an example.
- the processor 111 may be equivalent to the fourth processor, and the memory 131 may be equivalent to the third memory.
- FIG. 2 is still used as an example.
- the processor 211 may be equivalent to the fourth processor, and the memory 231 may be equivalent to the third memory.
- the circuit further includes a fourth processor and a second processing module connected to the fourth processor.
- the second processing module includes N fifth processors connected to M memories, where both N and M are integers greater than or equal to 1.
- a transmission latency generated when any fifth processor performs read and write operations on the memory connected to the fifth processor is less than a transmission latency generated when the fourth processor communicates with the second processing module.
- FIG. 1 is still used as an example.
- the processor 114 may be equivalent to the fourth processor, and the processing module 122 may be equivalent to the second processing module.
- FIG. 2 is still used as an example.
- the processor 214 may be equivalent to the fourth processor, and the processing module 222 may be equivalent to the second processing module.
- the second processor is connected to the third processor through a fourth bus
- the fourth processor is connected to the first processor through a fifth bus
- a bus bit width of the fourth bus is less than a bus bit width of the fifth bus
- FIG. 1 is still used as an example.
- the bus 21 may be equivalent to the fourth bus, and the bus 41 may be equivalent to the fifth bus.
- FIG. 2 is still used as an example.
- the bus 2421 may be equivalent to the fourth bus, and the bus 2441 may be equivalent to the fifth bus.
- a quantity of processor cores included in the fourth processor is greater than or equal to a quantity of processor cores included in the first processor.
- the fourth processor and the first processor are pipeline processors.
- the first processing module further includes the first memory.
- An embodiment of this application further provides an electronic device.
- the electronic device includes the chip according to embodiments of this application, and the electronic device further includes a receiver and a transmitter.
- the receiver is configured to receive a packet and send the packet to the chip.
- the chip is configured to process the packet.
- the transmitter is configured to: obtain a packet processed by the chip, and send the processed packet to another electronic device.
- the electronic device may be a switch, a router, or any other electronic device on which the foregoing chip can be disposed.
- the chip in embodiments of this application may be a system on chip (SoC), a network processor (NP), or the like.
- the memory in embodiments of this application may be a volatile memory or a nonvolatile memory, or may include both a volatile memory and a nonvolatile memory.
- the nonvolatile memory may be a read-only memory (ROM), a programmable read-only memory (PROM), an erasable programmable read-only memory (EPROM), an electrically erasable programmable read-only memory (EEPROM), or a flash memory.
- the volatile memory may be a random access memory (RAM), used as an external cache.
- many forms of RAM may be used, for example, a static random access memory (SRAM), a dynamic random access memory (DRAM), a synchronous dynamic random access memory (SDRAM), a double data rate synchronous dynamic random access memory (DDR SDRAM), an enhanced synchronous dynamic random access memory (ESDRAM), a synchlink dynamic random access memory (SLDRAM), and a direct rambus random access memory (DR RAM).
- the processor in embodiments of this application may be an integrated circuit chip, and has a signal processing capability.
- steps in the foregoing method embodiments can be implemented by using a hardware integrated logical circuit in the processor, or by using instructions in a form of software.
- the processor may be a microprocessor, or the processor may be any conventional processor, or the like.
- steps in the foregoing methods can be implemented by using a hardware integrated logical circuit in the processor, or by using instructions in a form of software.
- the steps of the method disclosed with reference to embodiments of this application may be directly performed by a hardware processor, or may be performed by using a combination of hardware in the processor and a software module.
- the software module may be located in a mature storage medium in the art, such as a random access memory, a flash memory, a read-only memory, a programmable read-only memory, an electrically erasable programmable memory, or a register.
- the storage medium is located in the memory, and a processor reads information in the memory and completes the steps in the foregoing methods in combination with hardware of the processor. To avoid repetition, details are not described herein again.
- the disclosed system, apparatus, and method may be implemented in other manners.
- the described apparatus embodiments are merely examples.
- division into the units is merely logical function division and there may be other division during actual implementation.
- a plurality of units or components may be combined or integrated into another system, or some features may be ignored or not performed.
- the displayed or discussed mutual couplings or direct couplings or communication connections may be implemented by using some interfaces.
- the indirect couplings or communication connections between the apparatuses or units may be implemented in electronic, mechanical, or other forms.
- the units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one position, or may be distributed on a plurality of network units. Some or all of the units may be selected according to actual requirements to achieve the objectives of the solutions of embodiments.
- When the functions are implemented in a form of a software functional unit and sold or used as an independent product, the functions may be stored in a computer-readable storage medium. Based on such an understanding, the technical solutions of this application essentially, or the part contributing to the conventional technology, or some of the technical solutions may be implemented in a form of a software product.
- the computer software product is stored in a storage medium and includes several instructions for instructing a computer device (which may be a personal computer, a server, or a network device) to perform all or some of the steps of the methods described in embodiments of this application.
- the foregoing storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- General Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Human Computer Interaction (AREA)
- Computer Hardware Design (AREA)
- Multi Processors (AREA)
- Information Transfer Systems (AREA)
- Advance Control (AREA)
Abstract
Description
- This application claims priority to Chinese Patent Application No. 202011060780.0, filed with the China National Intellectual Property Administration on September 30, 2020, and to Chinese Patent Application No. 202011176149.7, filed with the China National Intellectual Property Administration on October 28, 2020.
- This application relates to the field of chip technologies, and more specifically, to a circuit, a chip, and an electronic device.
- Processors in a current high-speed network chip are generally arranged in a pipeline. After a packet enters the chip, a program state (PS) is generated for the packet to store context information during packet forwarding. A processor on the pipeline processes the packet, saves the processing result in the PS, and then sends the processing result to the next processor. Currently, the arrangement of the processors and the memory that stores the PS in the chip is poorly designed. Consequently, a high latency is generated when the PS is read and written.
- This application provides a circuit, a chip, and an electronic device, to reduce a transmission latency.
- According to a first aspect, an embodiment of this application provides a circuit. The circuit includes a first processor and a first processing module connected to the first processor. The first processing module includes a second processor connected to a first memory. A transmission latency generated when the second processor performs read and write operations on the first memory is less than a transmission latency generated when the first processor communicates with the first processing module. Because the second processor reaches the first memory with a lower transmission latency than the first processor reaches the first processing module, the latency cost of transferring data over a bus can be reduced.
- With reference to the first aspect, in a possible implementation, the transmission latency generated when the second processor performs the read and write operations on the first memory is less than or equal to 1/10 of the transmission latency generated when the first processor communicates with the first processing module.
- With reference to the first aspect, in a possible implementation, the second processor is a multi-core processor, and the transmission latency generated when the second processor performs the read and write operations on the first memory is a transmission latency generated when any core processor of the multi-core processor included in the second processor performs read and write operations on the first memory.
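- The latency conditions above can be expressed as a simple check. The following is a minimal sketch, not part of the described circuit: the class name, the field names, and the use of nanoseconds are assumptions made for illustration. It treats the second processor as a multi-core processor, so the condition must hold for every core.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class LatencyCheck:
    """Hypothetical container; the unit (ns) and the names are assumptions."""

    per_core_memory_ns: Sequence[float]  # each core of the second processor <-> first memory
    module_link_ns: float                # first processor <-> first processing module

    def meets_basic_condition(self) -> bool:
        # Every core must reach the first memory with a lower latency than
        # the first processor needs to reach the first processing module.
        return all(t < self.module_link_ns for t in self.per_core_memory_ns)

    def meets_one_tenth_condition(self) -> bool:
        # Stricter possible implementation: at most 1/10 of the link latency.
        return all(t <= self.module_link_ns / 10 for t in self.per_core_memory_ns)
```

For example, with a link latency of 20 and per-core memory latencies of 1.5 and 5.0, the basic condition holds but the 1/10 condition does not.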
- With reference to the first aspect, in a possible implementation, the first processor is connected to the first processing module through a first bus, and the second processor is connected to the first memory through a second bus, where a bus bit width of the second bus is greater than a bus bit width of the first bus, and/or a length of the second bus is less than a length of the first bus. Because the length of the second bus is less than the length of the first bus, an area of the circuit can be reduced.
- With reference to the first aspect, in a possible implementation, a length of the second bus may be less than or equal to 1/10 of a length of the first bus. In the foregoing technical solution, an area of the circuit can be further reduced.
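- The bus relationships above can be sketched as follows. This is an illustrative model only: the `Bus` type, the micrometre unit, the example widths and lengths, and the 1024-bit payload are assumptions, and the beat count ignores clocking and protocol overhead.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Bus:
    """Hypothetical bus description; the fields are illustrative assumptions."""

    name: str
    bit_width: int
    length_um: float


def beats_needed(bus: Bus, payload_bits: int) -> int:
    # Number of bus-width transfers needed to move the payload.
    return math.ceil(payload_bits / bus.bit_width)


def second_bus_condition(first: Bus, second: Bus) -> bool:
    # "and/or": at least one of the two relations must hold.
    wider = second.bit_width > first.bit_width
    shorter = second.length_um < first.length_um
    return wider or shorter


def length_within_one_tenth(first: Bus, second: Bus) -> bool:
    # Stricter possible implementation for the bus length.
    return second.length_um <= first.length_um / 10
```

With an assumed 128-bit first bus and a 256-bit second bus, a 1024-bit payload takes 8 transfers on the first bus and 4 on the second.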
- With reference to the first aspect, in a possible implementation, the first processing module further includes a third processor connected to a second memory, and a transmission latency generated when the third processor performs read and write operations on the second memory is less than the transmission latency generated when the first processor communicates with the first processing module.
- With reference to the first aspect, in a possible implementation, the first processor is connected to the first processing module through a first bus, the second processor is connected to the first memory through a second bus, the third processor is connected to the second memory through a third bus, and a sum of a bus bit width of the second bus and a bus bit width of the third bus is greater than a bus bit width of the first bus.
- With reference to the first aspect, in a possible implementation, the first processing module further includes a third processor connected to the first memory, and a transmission latency generated when the third processor performs read and write operations on the first memory is less than the transmission latency generated when the first processor communicates with the first processing module.
- With reference to the first aspect, in a possible implementation, the first processor is connected to the first processing module through a first bus, the second processor is connected to the first memory through a second bus, the third processor is connected to the first memory through a third bus, and a sum of a bus bit width of the second bus and a bus bit width of the third bus is greater than a bus bit width of the first bus.
- With reference to the first aspect, in a possible implementation, the second processor and the third processor are pipeline processors.
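- The aggregate-width condition can be checked in one line. The sketch below is illustrative; the function name and the example widths are assumptions. Note that the condition constrains the sum, so each memory-side bus may individually be narrower than the first bus.

```python
from typing import Iterable


def aggregate_width_exceeds(first_bus_bits: int, memory_bus_bits: Iterable[int]) -> bool:
    # True when the memory-side buses together are wider than the first bus.
    return sum(memory_bus_bits) > first_bus_bits
```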
- With reference to the first aspect, in a possible implementation, the circuit further includes a fourth processor and a second processing module connected to the fourth processor. The second processing module includes N fifth processors connected to M memories, where both N and M are integers greater than or equal to 1. A transmission latency generated when any fifth processor performs read and write operations on the memory connected to the fifth processor is less than a transmission latency generated when the fourth processor communicates with the second processing module.
- With reference to the first aspect, in a possible implementation, the second processor is connected to the third processor through a fourth bus, the fourth processor is connected to the first processor through a fifth bus, and a bus bit width of the fourth bus is less than a bus bit width of the fifth bus.
- With reference to the first aspect, in a possible implementation, a quantity of processor cores included in the fourth processor is greater than or equal to a quantity of processor cores included in the first processor.
- With reference to the first aspect, in a possible implementation, the fourth processor and the first processor are pipeline processors.
- With reference to the first aspect, in a possible implementation, the first processing module further includes the first memory.
- According to a second aspect, an embodiment of this application further provides a chip. The chip includes the circuit according to any one of the first aspect or the possible implementations of the first aspect.
- According to a third aspect, an embodiment of this application further provides an electronic device. The electronic device includes the chip according to embodiments of this application, and the electronic device further includes a receiver and a transmitter. The receiver is configured to receive a packet and send the packet to the chip. The chip is configured to process the packet. The transmitter is configured to: obtain a packet processed by the chip, and send the processed packet to another electronic device. The electronic device may be a switch, a router, or any other electronic device on which the foregoing chip can be disposed.
- According to a fourth aspect, an embodiment of this application further provides a processing method. The method includes: A first processor receives a first packet, where the first packet includes flow identifier information; the first processor determines a first processing module based on the flow identifier information, where the first processing module corresponds to the flow identifier information; and the first processor sends the first packet to the first processing module.
- In the foregoing method, the first processor sends, to the first processing module based on the flow identifier information carried in the packet, the packet that needs to be processed by the first processing module, and a processor in the first processing module performs corresponding processing. Because the first processing module is closer to a memory than the first processor, a transmission latency can be reduced.
- Optionally, the method further includes: The first processor receives a second packet from the first processing module, where the second packet is a packet that is obtained through processing performed by the first processing module based on the flow identifier information, and the second packet includes the flow identifier information.
- Optionally, the method further includes: The first processor sends the second packet to a next processor, where the next processor is a next hop of the first processor on a pipeline to which the first processor belongs.
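- The method of the fourth aspect can be sketched as a dispatch by flow identifier. This is a minimal software illustration under stated assumptions, not the hardware behavior: `Packet`, `FirstProcessor`, `tagging_module`, and the dictionary that maps flow identifiers to processing modules are all hypothetical names.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class Packet:
    flow_id: int    # flow identifier information carried in the packet
    payload: bytes


# A processing module is modeled as a function from packet to packet.
ProcessingModule = Callable[[Packet], Packet]


class FirstProcessor:
    def __init__(
        self,
        modules_by_flow: Dict[int, ProcessingModule],
        next_hop: Callable[[Packet], None],
    ) -> None:
        self._modules_by_flow = modules_by_flow
        self._next_hop = next_hop  # next processor on the same pipeline

    def handle(self, first_packet: Packet) -> Packet:
        # Determine the processing module that corresponds to the flow identifier.
        module = self._modules_by_flow[first_packet.flow_id]
        # Send the first packet and receive the second (processed) packet.
        second_packet = module(first_packet)
        if second_packet.flow_id != first_packet.flow_id:
            raise ValueError("processed packet must keep the flow identifier")
        # Forward the second packet to the next hop on the pipeline.
        self._next_hop(second_packet)
        return second_packet


def tagging_module(tag: bytes) -> ProcessingModule:
    # Stand-in for real processing: appends a tag to the payload.
    def process(packet: Packet) -> Packet:
        return Packet(packet.flow_id, packet.payload + tag)

    return process
```

A usage example: constructing `FirstProcessor({7: tagging_module(b"|A")}, forwarded.append)` with an empty list `forwarded` and calling `handle(Packet(7, b"data"))` leaves the processed packet in `forwarded`.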
- According to a fifth aspect, an embodiment of this application further provides a processing method. The method includes: A second processor in a first processing module receives a first packet from a first processor, where the first packet includes flow identifier information; the second processor obtains, from a memory corresponding to the second processor based on the flow identifier information, a parameter used for processing the first packet; the second processor processes the first packet based on the parameter, and sends a processed first packet to a third processor in the first processing module, where the processed first packet includes the flow identifier information; the third processor in the first processing module obtains, from a memory corresponding to the third processor based on the flow identifier information, a parameter used for processing the processed first packet; and the third processor processes the processed first packet based on the parameter, to obtain a second packet and send the second packet to the first processor.
- In the foregoing method, the processor in the first processing module performs a read operation on the memory based on a flow identifier in the first packet, and performs corresponding processing. Because the first processing module is closer to the memory than the first processor, a transmission latency can be reduced.
- Optionally, the processing may include table lookup for forwarding, the parameter includes one or more of an index of a forwarding entry, a base address, and a hash value, and the parameter corresponds to the flow identifier.
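- The method of the fifth aspect can be sketched as two stages inside the first processing module, each reading a parameter from its own memory based on the flow identifier. The sketch is illustrative only: the class names, the `context` field, the parameter names, and the use of a dictionary as the memory are assumptions.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class Packet:
    flow_id: int
    payload: bytes
    # Parameters gathered so far, as (name, value) pairs.
    context: Tuple[Tuple[str, int], ...] = ()


class ModuleProcessor:
    """Hypothetical processor inside a processing module with its own memory."""

    def __init__(self, parameter_name: str, local_memory: Dict[int, int]) -> None:
        self.parameter_name = parameter_name
        self._local_memory = local_memory  # flow_id -> parameter

    def process(self, packet: Packet) -> Packet:
        # Read the parameter that corresponds to the flow identifier.
        parameter = self._local_memory[packet.flow_id]
        # "Processing" here only records the parameter; the flow identifier is kept.
        entry = (self.parameter_name, parameter)
        return Packet(packet.flow_id, packet.payload, packet.context + (entry,))


def run_processing_module(
    first_packet: Packet,
    second_processor: ModuleProcessor,
    third_processor: ModuleProcessor,
) -> Packet:
    processed_first_packet = second_processor.process(first_packet)
    # The result of the third processor is the second packet returned
    # to the first processor.
    return third_processor.process(processed_first_packet)
```

In this sketch the second processor could hold a base address per flow and the third processor an index of a forwarding entry per flow, which matches the kinds of parameters listed above.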
- FIG. 1 is a schematic diagram of a chip according to an embodiment of this application;
- FIG. 2 is a schematic diagram of another chip according to an embodiment of this application;
- FIG. 3 is a schematic diagram of a circuit;
- FIG. 4 is a schematic diagram of another circuit;
- FIG. 5 is a schematic flowchart of determining a next-hop port by using the circuit shown in FIG. 4; and
- FIG. 6A and FIG. 6B are a schematic flowchart of determining a next-hop port by using the circuit shown in FIG. 3.
- The following describes technical solutions of this application with reference to accompanying drawings.
- All aspects, embodiments, or features are presented in this application by describing a system that may include a plurality of devices, components, modules, and the like. It should be appreciated and understood that, each system may include another device, component, module, and the like, and/or may not include all devices, components, modules, and the like discussed with reference to the accompanying drawings. A combination of these solutions may also be used.
- In addition, in embodiments of this application, terms such as "example" and "for example" are used to represent giving an example, an illustration, or a description. Any embodiment or design scheme described as an "example" in this application should not be explained as being more preferred or having more advantages than another embodiment or design scheme. Specifically, the term "example" is used to present a concept in a specific manner.
- In embodiments of this application, "corresponding (or relevant)" and "corresponding" may sometimes be used interchangeably. It should be noted that the meanings expressed by the terms are consistent when differences are not emphasized.
- In embodiments of this application, sometimes a subscript form such as W₁ may be written in a non-subscript form such as W1. Expressed meanings are consistent when differences are not emphasized.
- Network architectures and service scenarios described in embodiments of this application are intended to describe the technical solutions in embodiments of this application more clearly, and do not constitute any limitation on the technical solutions according to embodiments of this application.
- A person of ordinary skill in the art may learn that the technical solutions according to embodiments of this application are also applicable to a similar technical problem as a network architecture evolves and a new service scenario emerges.
- Reference to "an embodiment", "some embodiments", or the like described in this specification indicates that one or more embodiments of this application include a specific feature, structure, or characteristic described with reference to the embodiments. Therefore, statements such as "in an embodiment", "in some embodiments", "in some other embodiments", and "in other embodiments" that appear at different places in this specification do not necessarily mean referring to a same embodiment. Instead, the statements mean "one or more but not all of embodiments", unless otherwise specifically emphasized in another manner. Terms "include", "have", and their variants all mean "include but are not limited to", unless otherwise specifically emphasized in another manner.
- In this application, "at least one" means one or more, and "a plurality of" means two or more. The term "and/or" describes an association relationship between associated objects and represents that three relationships may exist. For example, A and/or B may represent the following three cases: Only A exists, both A and B exist, and only B exists. A and B each may be singular or plural. The character "/" generally represents an "or" relationship between the associated objects. "At least one of the following items (pieces)" or a similar expression thereof refers to any combination of these items, including any combination of singular items (pieces) or plural items (pieces). For example, at least one item (piece) of a, b, or c may indicate: a, b, c, a and b, a and c, b and c, or a, b, and c, where a, b, and c may be singular or plural.
-
FIG. 1 is a schematic diagram of a chip according to an embodiment of this application. As shown inFIG. 1 , thechip 100 includes an input/output interface 101, aprocessor 111, aprocessor 112, aprocessor 113, and aprocessor 114. Thechip 100 further includes aprocessing module 121 and aprocessing module 122. Theprocessing module 121 includes aprocessor 1211, aprocessor 1212, amemory 1221, and amemory 1222. Theprocessing module 122 includes aprocessor 1213, aprocessor 1214, aprocessor 1215, amemory 1223, amemory 1224, and amemory 1225. Thechip 100 further includes amemory 131 and amemory 132. Theprocessor 111 is connected to the input/output interface 101 through abus 61, is connected to theprocessor 112 through a bus 41, and is connected to thememory 131 through abus 51. Theprocessor 112 is connected to theprocessor 113 through abus 42, and is connected to theprocessing module 121 through abus 11. Theprocessor 113 is connected to theprocessor 114 through abus 43, and is connected to thememory 132 through abus 52. Theprocessor 114 is connected to the input/output interface 101 through abus 62, and is connected to theprocessing module 122 through abus 12. Theprocessor 1211 is connected to thememory 1221 through abus 31, and is connected to theprocessor 1212 through abus 21. Theprocessor 1212 is connected to thememory 1222 through abus 32. Theprocessor 1213 is connected to thememory 1223 through abus 33, and is connected to theprocessor 1214 through abus 22. Theprocessor 1214 is connected to thememory 1224 through abus 34, and is connected to theprocessor 1215 through abus 23. Theprocessor 1215 is connected to thememory 1225 through abus 35. - The
chip 100 processes a received packet in a pipeline (pipeline) manner. As shown inFIG. 1 , theprocessor 111, theprocessor 112, theprocessor 113, and theprocessor 114 belong to a same pipeline, and theprocessor 111, theprocessor 112, theprocessor 113, and theprocessor 114 may also be referred to as pipeline processors. Optionally, theprocessor 1211 and theprocessor 1212 belong to a same pipeline, and theprocessor 1213, theprocessor 1214, and theprocessor 1215 belong to a same pipeline. For ease of description, the pipeline may be referred to as a first pipeline. As shown inFIG. 1 , some processors in the first pipeline may directly access memories through buses. For example, theprocessor 111 may directly access thememory 131 through thebus 11, and theprocessor 113 may directly access thememory 132 through the bus 13. For ease of description, a processor in the first pipeline that can directly access a memory may be referred to as atype 1 processor. As shown inFIG. 1 , other processors in the first pipeline may communicate with processing modules. For example, theprocessor 112 communicates with theprocessing module 121 through thebus 11, and theprocessor 114 communicates with theprocessing module 122 through thebus 12. For ease of description, a processor in the first pipeline that can communicate with a processing module may be referred to as atype 2 processor. A plurality of processors included in each processing module may also belong to a single pipeline. For example, theprocessor 1211 and theprocessor 1212 belong to the same pipeline, and theprocessor 1213, theprocessor 1214, and theprocessor 1215 belong to the same pipeline. For ease of description, a pipeline in a processing module may be referred to as a second pipeline. A processor in the processing module may be referred to as atype 3 processor. To be specific, theprocessor 1211, theprocessor 1212, theprocessor 1213, theprocessor 1214, and theprocessor 1215 shown inFIG. 
1 may all be referred to astype 3 processors. - As shown in
FIG. 1 , anytype 1 processor corresponds to one memory, and anytype 3 processor corresponds to one memory. Anytype 1 processor or anytype 3 processor is connected to a corresponding memory through a bus, to perform read and write operations on the memory. For example, thememory 131 corresponds to theprocessor 111, and thememory 1221 corresponds to theprocessor 1211. A one-to-one correspondence between the memory and the processor may also be replaced by a one-to-many or many-to-one correspondence. For example, anytype 1 processor or anytype 3 processor may correspond to a plurality of memories, to perform read and write operations on a plurality of memories. Alternatively, a plurality oftype 1 processors may correspond to one memory, and a plurality oftype 3 processors may correspond to one memory, to perform read and write operations on the memory. For example, thememory 131 and thememory 132 inFIG. 1 may be replaced with one memory, and theprocessor 111 and theprocessor 113 correspond to the same memory. Thememory 131 inFIG. 1 may alternatively be replaced with a plurality of memories, and theprocessor 111 corresponds to the plurality of memories. Thememory 1221 and thememory 1222 inFIG. 1 may be replaced with one memory, and theprocessor 1211 and theprocessor 1212 correspond to the same memory. - Based on different connected objects, the bus in the
chip 100 includes a type 1 bus, a type 2 bus, a type 3 bus, a type 4 bus, a type 5 bus, and a type 6 bus. The type 1 bus is configured to connect the type 2 processor and a processing module corresponding to the type 2 processor. For example, the bus 11 configured to connect the processor 112 and the processing module 121 and the bus 12 configured to connect the processor 114 and the processing module 122 are both type 1 buses. The type 2 bus is configured to connect two type 3 processors. For example, the bus 21 configured to connect the processor 1211 and the processor 1212, the bus 22 configured to connect the processor 1213 and the processor 1214, and the bus 23 configured to connect the processor 1214 and the processor 1215 are all type 2 buses. The type 3 bus is configured to connect the type 3 processor and a memory corresponding to the type 3 processor. For example, the bus 31 configured to connect the processor 1211 and the memory 1221, the bus 33 configured to connect the processor 1213 and the memory 1223, and the like are all type 3 buses. The type 4 bus is configured to connect two processors in the first pipeline. For example, the bus 41 configured to connect the processor 111 and the processor 112, the bus 42 configured to connect the processor 112 and the processor 113, and the bus 43 configured to connect the processor 113 and the processor 114 are all type 4 buses. The type 5 bus is configured to connect the type 1 processor and a memory corresponding to the type 1 processor. For example, the bus 51 configured to connect the processor 111 and the memory 131 and the bus 52 configured to connect the processor 113 and the memory 132 are both type 5 buses. The type 6 bus is configured to connect the input/output interface 101 and a processor. For example, the bus 61 configured to connect the input/output interface 101 and the processor 111 and the bus 62 configured to connect the processor 114 and the input/output interface 101 are both type 6 buses.
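The classification above depends only on what kind of object sits at each end of a bus. A minimal sketch of that rule for the chip 100 follows; the names `Kind` and `bus_type` are hypothetical and are not part of the described circuit.

```python
from enum import Enum


class Kind(Enum):
    """Kinds of objects a bus in the chip 100 can connect (illustrative names)."""
    TYPE1 = "type 1 processor"
    TYPE2 = "type 2 processor"
    TYPE3 = "type 3 processor"
    MODULE = "processing module"
    MEMORY = "memory"
    IO = "input/output interface"


# Type 1 and type 2 processors are the processors of the first pipeline.
FIRST_PIPELINE = {Kind.TYPE1, Kind.TYPE2}


def bus_type(a: Kind, b: Kind) -> int:
    """Return the bus type (1 to 6) for a bus connecting objects of kinds a and b."""
    ends = {a, b}
    if ends == {Kind.TYPE2, Kind.MODULE}:
        return 1  # type 2 processor <-> its processing module
    if a == b == Kind.TYPE3:
        return 2  # two type 3 processors
    if ends == {Kind.TYPE3, Kind.MEMORY}:
        return 3  # type 3 processor <-> its memory
    if a in FIRST_PIPELINE and b in FIRST_PIPELINE:
        return 4  # two processors in the first pipeline
    if ends == {Kind.TYPE1, Kind.MEMORY}:
        return 5  # type 1 processor <-> its memory
    if Kind.IO in ends and ends & FIRST_PIPELINE:
        return 6  # input/output interface <-> a pipeline processor
    raise ValueError(f"no bus type defined for {a.value} and {b.value}")
```

For example, `bus_type(Kind.TYPE2, Kind.MODULE)` corresponds to the bus 11, and `bus_type(Kind.TYPE1, Kind.MEMORY)` corresponds to the bus 51.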
- In some embodiments, each processor in the first pipeline is a multi-core processor. Each processor in the first pipeline may include a plurality of processor cores (which may also be referred to as cores). In some embodiments, different processors in the first pipeline may include a same quantity of processor cores. In other words, any two processors in the first pipeline include a same quantity of processor cores.
FIG. 1 is still used as an example. A quantity of processor cores included in the processor 111 is equal to a quantity of processor cores included in the processor 112, the quantity of processor cores included in the processor 112 is equal to a quantity of processor cores included in the processor 113, and the quantity of processor cores included in the processor 113 is equal to a quantity of processor cores included in the processor 114. In other embodiments, different processors in the first pipeline may include different quantities of processor cores. In other words, any two processors in the first pipeline may include different quantities of processor cores. For example, a quantity of processor cores included in the processor 111 is greater than a quantity of processor cores included in the processor 112. A quantity of processor cores included in the processor 113 is greater than a quantity of processor cores included in the processor 114. The quantity of processor cores included in the processor 111 is greater than the quantity of processor cores included in the processor 113, and the quantity of processor cores included in the processor 112 is greater than the quantity of processor cores included in the processor 114. - In other embodiments, some processors in the first pipeline include a same quantity of processor cores. For example, a quantity of processor cores included in the
processor 111 is equal to a quantity of processor cores included in the processor 113, and a quantity of processor cores included in the processor 112 is equal to a quantity of processor cores included in the processor 114, but the quantity of processor cores included in the processor 111 is different from the quantity of processor cores included in the processor 112. As described above, based on different connected objects, processors in the first pipeline may be classified into two types: a type 1 processor (for example, the processor 111 and the processor 113) and a type 2 processor (for example, the processor 112 and the processor 114). In some embodiments, processors of a same type include a same quantity of processor cores, and processors of different types may include different quantities of processor cores. In some embodiments, a quantity of processor cores included in the type 1 processor may be greater than a quantity of processor cores in the type 2 processor. The type 2 processor communicates with a processing module, and a processor included in the processing module can perform some processing operations. In this way, the type 2 processor may be a single-core processor or a processor with a small quantity of cores, so that hardware costs can be further reduced. For example, the quantity of processor cores included in the type 2 processor may be 1/2, 1/3, 1/5, or 1/8 of the quantity of processor cores included in the type 1 processor. - In other embodiments, the
type 1 processor may be a multi-core processor, and the type 2 processor may be a single-core processor. In some embodiments, the type 3 processor may also be a multi-core processor. In other words, the type 3 processor may also include a plurality of processor cores. In some embodiments, a quantity of processor cores included in the type 3 processor is less than both a quantity of processor cores included in the type 1 processor and a quantity of processor cores included in the type 2 processor. In other words, the quantity of processor cores included in the type 1 processor and the quantity of processor cores included in the type 2 processor are both greater than the quantity of processor cores included in the type 3 processor. For example, a quantity of processor cores included in the processor 1211 may be less than the quantity of processor cores included in the processor 111, and the quantity of processor cores included in the processor 1211 may also be less than the quantity of processor cores included in the processor 112. In other embodiments, a quantity of processor cores included in the type 3 processor may be less than a quantity of processor cores included in the type 1 processor, and the quantity of processor cores included in the type 3 processor may be equal to or greater than a quantity of processor cores included in the type 2 processor. For example, a quantity of processor cores included in the processor 1213 may be less than the quantity of processor cores included in the processor 111, and the quantity of processor cores included in the processor 1213 may be equal to or greater than the quantity of processor cores included in the processor 114. For example, in some embodiments, the quantity of processor cores included in the type 3 processor may be less than or equal to 1/10 of the quantity of processor cores included in the type 1 processor.
For another example, in other embodiments, the quantity of processor cores included in the type 3 processor may be less than or equal to 1/2, 1/3, 1/5, 1/8, or the like of the quantity of processor cores included in the type 1 processor. - In other embodiments, a sum of a quantity of processor cores included in the
type 2 processor and a quantity of processor cores included in one type 3 processor in a processing module corresponding to the type 2 processor is equal to a quantity of processor cores included in the type 1 processor. For example, a sum of the quantity of processor cores included in the processor 112 and a quantity of processor cores included in the processor 1212 is equal to the quantity of processor cores included in the processor 111. For another example, a sum of the quantity of processor cores included in the processor 114 and a quantity of processor cores included in the processor 1214 is equal to the quantity of processor cores included in the processor 113. In some embodiments, different type 3 processors may include a same quantity of processor cores. For example, the quantity of processor cores included in the processor 1211 is equal to the quantity of processor cores included in the processor 1212, and the quantity of processor cores included in the processor 1212 is equal to a quantity of processor cores included in the processor 1215. - In other embodiments,
different type 3 processors may include different quantities of processor cores. - In other embodiments, any two processors belonging to a same processing module include a same quantity of processor cores, and two processors belonging to different processing modules include different quantities of processor cores. For example, the quantity of processor cores included in the
processor 1211 is equal to the quantity of processor cores included in the processor 1212, and the quantity of processor cores included in the processor 1212 is not equal to the quantity of processor cores included in the processor 1213. In the chip shown in FIG. 1, each processing module includes at least two processors. In other embodiments, the processing module may alternatively include one multi-core processor. For example, the processing module 121 may include only the processor 1211 and the memory 1221, where the processor 1211 is a multi-core processor. - In some embodiments, the
type 3 processor may also be a single-core processor. If the type 3 processor is a single-core processor, a processing module including the processor may include at least two processors. In other words, if the processing module includes a plurality of processors, the plurality of processors may include at least one single-core processor. The processing module 121 is used as an example. The processor 1211 in the processing module 121 may be a single-core processor, and the processor 1212 may be a single-core processor or a multi-core processor. - In some embodiments, a length of the
type 1 bus is greater than a length of the type 3 bus. For example, the length of the type 3 bus may be equal to 1/5, 1/8, 1/10, or the like of the length of the type 1 bus. For another example, the length of the type 3 bus may be less than 1/10, 1/15, 1/20, or the like of the length of the type 1 bus. In some embodiments, a sum of the length of the type 1 bus and the length of the type 3 bus is equal to a length of the type 5 bus. - In some embodiments, any two
type 1 buses may have a same length. In some embodiments, any two type 2 buses may have a same length. In some embodiments, any two type 3 buses may have a same length. In some embodiments, any two type 4 buses may have a same length. In some embodiments, any two type 5 buses may have a same length. Due to limitations of a manufacturing process, it may be difficult to obtain buses of exactly the same length. Therefore, in this embodiment of this application, that lengths are the same may be understood as that the lengths are exactly the same, or may be understood as that a length difference is within an allowed error range. For example, that a sum of the length of the type 1 bus and the length of the type 3 bus is equal to a length of the type 5 bus may be understood as that a difference between the sum of the length of the type 1 bus and the length of the type 3 bus and the length of the type 5 bus is 0, or is less than or equal to a preset allowed error value. For another example, a difference between a length of the bus 51 and a length of the bus 52 (that is, lengths of two type 5 buses) is 0, or is less than or equal to a preset allowed error value. - In some embodiments, a sum of widths of all type 3 buses in a same processing module is greater than a width of one type 1 bus. For example, a sum of a width of the
bus 31 and a width of the bus 32 is greater than a width of the bus 11. For another example, a sum of a width of the bus 33, a width of the bus 34, and a width of the bus 35 is greater than a width of the bus 12. A quantity of bits of binary data that can be simultaneously transmitted through the bus is referred to as a width (which may also be referred to as a bit width), and the width is measured in bits. A greater bus width indicates better transmission performance and a larger amount of data that can be transmitted within a same period. A formula for calculating a bus bandwidth (a total amount of data that can be transmitted per unit time) is as follows: Bus bandwidth=Frequency×Width. With the width expressed in bits, this gives bits per second; dividing by 8 gives bytes per second. - In some embodiments, a width of the
type 2 bus may be less than a width of the type 4 bus. FIG. 2 is a schematic diagram of another chip according to an embodiment of this application. As shown in FIG. 2, the chip 200 includes an input/output interface 201, a processor 211, a processor 212, a processor 213, and a processor 214. The chip 200 further includes a processing module 221 and a processing module 222. The processing module 221 includes a processor 2211 and a processor 2212. The processing module 222 includes a processor 2213, a processor 2214, and a processor 2215. - The
processor 211 is connected to the input/output interface 201 through a bus 2461. The processor 211 is connected to the processor 212 through a bus 2441. The processor 212 is connected to the processing module 221 through a bus 2411. The processor 212 is connected to the processor 213 through a bus 2442. The processor 213 is connected to the input/output interface 201 through a bus 2462. The processor 213 is connected to the processor 214 through a bus 2443. The processor 214 is connected to the processing module 222 through a bus 2412. The processors in the processing module 221 are connected to the input/output interface 201 through a bus 2431 and a bus 2432. The processors in the processing module 222 are connected to the input/output interface 201 through a bus 2433, a bus 2434, and a bus 2435. The processor 2211 is connected to the processor 2212 through a bus 2421. The processor 2213 is connected to the processor 2214 through a bus 2422. The processor 2214 is connected to the processor 2215 through a bus 2423. A memory 231 to a memory 237 are memories located outside the chip 200. The chip 200 may access the memory 231 to the memory 237 through the input/output interface 201 and corresponding buses. Specifically, the memory 231 is connected to the chip 200 through a bus 2471, the memory 232 is connected to the chip 200 through a bus 2472, the memory 233 is connected to the chip 200 through a bus 2473, the memory 234 is connected to the chip 200 through a bus 2474, the memory 235 is connected to the chip 200 through a bus 2475, the memory 236 is connected to the chip 200 through a bus 2476, and the memory 237 is connected to the chip 200 through a bus 2477. - The
chip 200 processes a received packet in a pipeline manner. As shown in FIG. 2, the processor 211, the processor 212, the processor 213, and the processor 214 in the chip 200 belong to a single pipeline, and the pipeline may be referred to as a first pipeline. As shown in FIG. 2, some processors in the first pipeline can directly communicate with the input/output interface through buses, and other processors in the first pipeline are connected to processing modules through buses. For ease of description, a processor that can directly communicate with the input/output interface (namely, a processor that is not connected to a processing module) may be referred to as a type 1 processor, and a processor connected to a processing module may be referred to as a type 2 processor. For example, in FIG. 2, the type 1 processor may include the processor 211 and the processor 213, and the type 2 processor may include the processor 212 and the processor 214. - A plurality of processors included in each processing module may also belong to a single pipeline. For example, the
processor 2211 and the processor 2212 belong to a same pipeline, and the processor 2213, the processor 2214, and the processor 2215 belong to a same pipeline. For ease of description, a pipeline in a processing module may be referred to as a second pipeline. A processor in the processing module may be referred to as a type 3 processor. To be specific, the processor 2211, the processor 2212, the processor 2213, the processor 2214, and the processor 2215 shown in FIG. 2 may all be referred to as type 3 processors. - Each
type 1 processor and each type 3 processor have one corresponding memory. The processor may read data stored in the corresponding memory. The processor may also write data into the corresponding memory. In FIG. 2, a memory corresponding to the processor 211 is the memory 231, a memory corresponding to the processor 2211 is the memory 232, a memory corresponding to the processor 2212 is the memory 233, a memory corresponding to the processor 213 is the memory 234, a memory corresponding to the processor 2213 is the memory 235, a memory corresponding to the processor 2214 is the memory 236, and a memory corresponding to the processor 2215 is the memory 237. For example, the processor 211 may read data stored in the memory 231, and/or write data into the memory 231. For another example, the processor 2213 may read data stored in the memory 235, and/or write data into the memory 235. - A processing module connected to a processor through a bus may be referred to as a processing module corresponding to the processor. For example, the
processing module 121 is a processing module corresponding to the processor 112. - Based on different connected objects, the bus in the
chip 200 may include a type 1 bus, a type 2 bus, a type 3 bus, a type 4 bus, a type 5 bus, and a type 6 bus. The type 1 bus is configured to connect the type 2 processor and a processing module corresponding to the type 2 processor. For example, the bus 2411 configured to connect the processor 212 and the processing module 221 and the bus 2412 configured to connect the processor 214 and the processing module 222 are both type 1 buses. The type 2 bus is configured to connect two type 3 processors. For example, the bus 2421 configured to connect the processor 2211 and the processor 2212, the bus 2422 configured to connect the processor 2213 and the processor 2214, and the bus 2423 configured to connect the processor 2214 and the processor 2215 are all type 2 buses. The type 3 bus is configured to connect a processor in a processing module and the input/output interface. For example, the bus 2431, the bus 2432, the bus 2433, the bus 2434, and the bus 2435 are all type 3 buses. The bus 2431 is a type 3 bus configured to connect the processor 2211 and the input/output interface 201. The bus 2432 is a type 3 bus configured to connect the processor 2212 and the input/output interface 201. The bus 2433 is a type 3 bus configured to connect the processor 2213 and the input/output interface 201. The bus 2434 is a type 3 bus configured to connect the processor 2214 and the input/output interface 201. The bus 2435 is a type 3 bus configured to connect the processor 2215 and the input/output interface 201. The type 4 bus is configured to connect two processors in the first pipeline. For example, the bus 2441 configured to connect the processor 211 and the processor 212, the bus 2442 configured to connect the processor 212 and the processor 213, and the bus 2443 configured to connect the processor 213 and the processor 214 are all type 4 buses. The type 6 bus is configured to connect the type 1 processor and the input/output interface. For example, the bus 2461 and the bus 2462 are both type 6 buses. - In addition to the buses in the
chip 200, the chip 200 is further connected to memories through buses. The bus 2471 to the bus 2477 are buses configured to connect the chip 200 and the memories, and each of these buses may be referred to as a type 7 bus. The type 1 processor may access a corresponding memory through corresponding buses and the input/output interface. For example, the processor 211 may access the memory 231 through the bus 2461, the input/output interface 201, and the bus 2471. For another example, the processor 213 may access the memory 234 through the bus 2462, the input/output interface 201, and the bus 2474. The type 3 processor may access a corresponding memory through corresponding buses and the input/output interface. For example, the processor 2211 may access the memory 232 through the bus 2431, the input/output interface 201, and the bus 2472. For another example, the processor 2215 may access the memory 237 through the bus 2435, the input/output interface 201, and the bus 2477. - Similar to the chip shown in
FIG. 1, in some embodiments, processors shown in FIG. 2 may all be multi-core processors. In other embodiments, the type 1 processor and the type 2 processor may be multi-core processors, and the type 3 processor may be a single-core processor. A structure of the type 3 processor may be simpler than a structure of the type 1 processor. For example, a quantity of processor cores included in the type 3 processor may be less than a quantity of processor cores included in the type 1 processor. For another example, a quantity of transistors included in the type 3 processor may be less than a quantity of transistors included in the type 1 processor. For specific cases of the type 1 processor, the type 2 processor, and the type 3 processor, refer to descriptions of the chip 100 shown in FIG. 1. For brevity, details are not described herein again. - In some embodiments, a length of the
type 1 bus is greater than a length of the type 3 bus. For example, the length of the type 3 bus may be equal to 1/5, 1/8, 1/10, or the like of the length of the type 1 bus. For another example, the length of the type 3 bus may be less than 1/10, 1/15, 1/20, or the like of the length of the type 1 bus. In some embodiments, a sum of the length of the type 1 bus and the length of the type 3 bus is equal to a length of the type 6 bus. In some embodiments, any two type 1 buses may have a same length. In some embodiments, any two type 2 buses may have a same length. In some embodiments, any two type 3 buses may have a same length. In some embodiments, any two type 4 buses may have a same length. In some embodiments, any two type 6 buses may have a same length. - In some embodiments, a sum of widths of buses between the chip and memories corresponding to a same processing module is greater than a width of one type 1 bus. For example, a sum of a width of the
bus 2431 and a width of the bus 2432 is greater than a width of the bus 2411. For another example, a sum of a width of the bus 2433, a width of the bus 2434, and a width of the bus 2435 is greater than a width of the bus 2412. In some embodiments, a width of the type 2 bus may be less than a width of the type 4 bus. In the embodiment shown in FIG. 1, each processor (the type 3 processor and the type 1 processor) that has a corresponding memory and the corresponding memory are located inside the chip. - In the embodiment shown in
FIG. 2, a memory corresponding to a processor is located outside the chip. In other embodiments, a part of memories corresponding to processors may be located inside the chip, and the other part of memories corresponding to processors may be located outside the chip. This embodiment may be considered as a combination of the embodiment shown in FIG. 1 and the embodiment shown in FIG. 2. It can be learned that the chip shown in FIG. 1 or FIG. 2 includes two structures shown in FIG. 3. FIG. 3 is a schematic diagram of a hybrid processor circuit. As shown in FIG. 3, a processor 301 is connected to a processing module 310 through a bus 331. The processing module 310 includes three processors, which are respectively a processor 311, a processor 312, and a processor 313. The processors in the processing module 310 are connected through a bus 332. The processor 301 is a processor in the first pipeline, and the processor 311, the processor 312, and the processor 313 are type 3 processors. More specifically, the processor 301 is a type 2 processor. Each processor in the processing module 310 has one corresponding memory. A memory corresponding to the processor 311 is a memory 321, a memory corresponding to the processor 312 is a memory 322, and a memory corresponding to the processor 313 is a memory 323. Each processor in the processing module 310 is connected to the corresponding memory through a bus. The processor 311 is connected to the memory 321 through a bus 333, the processor 312 is connected to the memory 322 through a bus 334, and the processor 313 is connected to the memory 323 through a bus 335. The processor 301, the processor 311, the processor 312, and the processor 313 are located in a same chip. The corresponding memory of each processor in the processing module 310 may be located in a same chip as the processing module 310, or may be located outside a chip in which the processing module 310 is located.
If the memory is located outside the chip in which the processing module 310 is located, the bus configured to connect the processor in the processing module and the corresponding memory may include a bus from the processor to an input/output interface of the chip and a bus from the chip to the corresponding memory. For example, the bus 333 may include a bus from the processor 311 to the input/output interface of the chip and a bus from the input/output interface of the chip to the memory 321. - For ease of description, the structure shown in
FIG. 3 may be referred to as a hybrid processor circuit or a hybrid processor structure. The processing module in the hybrid processor structure shown in FIG. 3 includes three processors. In other embodiments, a quantity of processors in the processing module may be a positive integer greater than or equal to 1. For example, the quantity may be 1, 2, 4, 5, or the like. As described above, if the processing module includes one processor, the processor may be a multi-core processor. If the processing module includes at least two processors, the at least two processors may include one or more single-core processors. - For ease of description, it is assumed that a length of the
bus 333, a length of the bus 334, and a length of the bus 335 are the same. A letter L is used to represent the length of the bus 333, and a letter R is used to represent a length of the bus 331. As described in the foregoing embodiments, in some embodiments, L is less than R. In other embodiments, L may be far less than R. For example, L may be equal to one tenth of R, or L is less than one tenth of R. It is assumed that a letter A is used to represent a width of the bus 333, a letter B is used to represent a width of the bus 334, a letter C is used to represent a width of the bus 335, and a letter D is used to represent a width of the bus 331. In this case, A, B, C, and D meet the following relationship: D<A+B+C. In this way, a data transfer cost of the hybrid processor structure shown in FIG. 3 may be shown in formula 3.1: - Cost_TX represents the data transfer cost, L represents the length of the bus 333 (the length of the
bus 333, the length of the bus 334, and the length of the bus 335 are equal), R represents the length of the bus 331, A represents the width of the bus 333, B represents the width of the bus 334, C represents the width of the bus 335, and D represents the width of the bus 331. - After receiving a packet, a chip (for example, the
chip 100 shown in FIG. 1 or the chip 200 shown in FIG. 2) that uses a pipeline structure may generate a program state (PS) for the packet. The PS is used to store context information during packet forwarding. The PS passes through each processor in a first pipeline, and the processor in the first pipeline is responsible for processing. For ease of description, the PS processed by the processor in the first pipeline may be referred to as a first PS. It is assumed that PS_Full_Size represents a size of the first PS. The processing module 310 also generates a PS in a process of processing the packet. The PS sequentially passes through each processor in a second pipeline, and the processors in the second pipeline are responsible for processing. For ease of description, the PS processed by the processor in the second pipeline may be referred to as a second PS. It is assumed that PS_Little_Size represents a size of the second PS. The first PS stores the context information during the packet forwarding. The second PS stores only information processed in the second pipeline. Therefore, the size of the first PS is greater than the size of the second PS (that is, PS_Full_Size>PS_Little_Size). In some embodiments, the size of the second PS may be equal to or less than 1/5, 1/8, 1/10, 1/15, 1/20, or the like of the size of the first PS. - Since the size of the second PS is less than the size of the first PS, a simpler processor may be used to process the second PS. Therefore, a structure of the processor (namely, the
type 3 processor) inside the processing module may be simpler than a structure of the processor in the first pipeline. To be specific, a quantity of processor cores included in the type 3 processor may be less than a quantity of processor cores included in the type 1 processor and/or a quantity of processor cores included in the type 2 processor, and/or a quantity of transistors included in the type 3 processor may be less than a quantity of transistors included in the type 1 processor and/or a quantity of transistors included in the type 2 processor. A greater difference between the size of the first PS and the size of the second PS indicates a simpler structure of the type 3 processor. - In some embodiments, the quantity of processor cores included in the
type 3 processor may be less than the quantity of processor cores included in the type 1 processor, and/or the quantity of transistors included in the type 3 processor may be less than the quantity of transistors included in the type 1 processor. In other embodiments, the quantity of processor cores included in the type 3 processor may be less than the quantity of processor cores included in the type 2 processor, and/or the quantity of transistors included in the type 3 processor may be less than the quantity of transistors included in the type 2 processor. - A quantity of processor cores is used as an example. N_Little may be used to represent the quantity of processor cores included in the
type 3 processor, N_Big2 may be used to represent the quantity of processor cores included in the type 2 processor, and N_Big1 may be used to represent the quantity of processor cores included in the type 1 processor. -
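The processor cost of the hybrid processor structure can be written in terms of these quantities. The expression below is an assumed reconstruction, not a quotation of the original formula: it is inferred from the symbols explained in the next paragraph and from the later statement that, when N_Big1=N_Little+N_Big2, the hybrid structure reduces the processor cost by (PS_Full_Size-PS_Little_Size)×N_Little.

```latex
\[
\mathrm{Cost\_Proc}
  = \mathrm{PS\_Little\_Size} \times \mathrm{N\_Little}
  + \mathrm{PS\_Full\_Size} \times \mathrm{N\_Big2}
\]
```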
- Cost_Proc represents the processor cost, and meanings of PS_Little_Size, N_Little, PS_Full_Size, and N_Big2 are described above. For brevity, details are not described herein again.
- In some embodiments, N_Little, N_Big2, and N_Big1 may meet the following relationship: N_Big1=N_Little+N_Big2.
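A minimal numeric sketch of what this relationship implies. It assumes, as a cost model that is not quoted from the text, that each group of cores contributes (size of the PS it handles) × (quantity of cores), and it compares the hybrid structure with a pipeline whose N_Big1 cores all handle the full-size PS. The function names and sample values are hypothetical.

```python
def hybrid_processor_cost(ps_little_size: int, n_little: int,
                          ps_full_size: int, n_big2: int) -> int:
    """Assumed cost of the hybrid structure: type 3 cores carry the small PS,
    type 2 cores carry the full-size PS."""
    return ps_little_size * n_little + ps_full_size * n_big2


def single_pipeline_processor_cost(ps_full_size: int, n_big1: int) -> int:
    """Assumed cost when every core is a type 1 core carrying the full-size PS."""
    return ps_full_size * n_big1


# Hypothetical sample values (arbitrary units).
ps_full_size, ps_little_size = 512, 64
n_big2, n_little = 4, 12
n_big1 = n_little + n_big2  # the relationship N_Big1 = N_Little + N_Big2

saving = (single_pipeline_processor_cost(ps_full_size, n_big1)
          - hybrid_processor_cost(ps_little_size, n_little, ps_full_size, n_big2))

# Under this model the saving equals (PS_Full_Size - PS_Little_Size) x N_Little.
assert saving == (ps_full_size - ps_little_size) * n_little
print(saving)
```

Under this model the saving grows with both the PS size gap and the quantity of type 3 cores, which is why a small second PS is what makes the simpler type 3 processors worthwhile.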
-
- Cost_LAT represents the latency cost, Latency_L represents the I/O latency of the bus whose length is L, and Latency_R represents the I/O latency of the bus whose length is R.
- If processors in one pipeline implement functions implemented by the hybrid processing structure shown in
FIG. 3, a structure shown in FIG. 4 is required. -
FIG. 4 is a schematic diagram of another circuit including processors. The circuit shown in FIG. 4 includes three processors: a processor 401, a processor 402, and a processor 403. In addition, the processor 401, the processor 402, and the processor 403 are type 1 processors (the processor 401 to the processor 403 are all processors in a same pipeline, and the processor 401 to the processor 403 are connected to memories rather than processing modules through buses). Each of the three processors has a corresponding memory. A memory corresponding to the processor 401 is a memory 411, a memory corresponding to the processor 402 is a memory 412, and a memory corresponding to the processor 403 is a memory 413. The processor 401 is connected to the memory 411 through a bus 421, the processor 402 is connected to the memory 412 through a bus 422, and the processor 403 is connected to the memory 413 through a bus 423. The processor 401 is connected to the processor 402 through a bus 424, and the processor 402 is connected to the processor 403 through a bus 424. - The
bus 421, thebus 422, and thebus 423 have a same length. The length of thebus 421 may be equal to L+R, namely, a sum of the length of thebus 333 and the length of thebus 331 shown inFIG. 3 . A width of thebus 421 may be equal to the width of thebus 333, a width of thebus 422 may be equal to the width of thebus 334, and a width of thebus 423 may be equal to the width of thebus 335. In this case, a data transfer cost generated when using a structure shown inFIG. 4 may be shown in formula 4.1: - Cost_TX represents the data transfer cost, L+R is the length of the bus 421 (the length of the
bus 422 is equal to the length of thebus 421, and the length of thebus 423 is equal to the length of the bus 421), A represents the width of thebus 421, B represents the width of thebus 422, and C represents the width of thebus 423. - Through comparison between formula 4.1 and formula 3.1, it can be found that, in a case in which L is less than R and D is less than A+B+C, the data transfer cost generated when using the structure shown in
FIG. 3 is less than the data transfer cost generated when using the structure shown inFIG. 4 . - In some embodiments, a greater difference between R and L indicates a lower data transfer cost of the structure shown in
FIG. 3 . - As described above, because the
processor 401 to theprocessor 403 are alltype 1 processors, a PS passing through theprocessor 401 to theprocessor 403 is PS_Full. Correspondingly, a size of PS_Full is PS_Full_Size, and a quantity of processor cores included in thetype 1 processor of is N_Big1. In this case, a processor cost generated when using the structure shown inFIG. 4 may be shown in formula 4.2: - Cost_Proc represents the processor cost, PS_Full_Size is the size of the PS that passes through the
processor 401, and N_Big1 is the quantity of processor cores included in theprocessor 401. - If N_Big1=N_Little+N_Big2, compared with the structure shown in
FIG. 4 , the processor cost generated when using the structure shown inFIG. 3 may be reduced by (PS_Full_Size-PS_Little_Size)×N_Little. A greater difference between PS_Full_Size and PS_Little_Size indicates a more reduced processor cost (that is, a lower processor cost). A greater difference between N_Big1 and N_Little1 indicates a more reduced processor cost (that is, a lower processor cost). -
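The stated reduction of (PS_Full_Size-PS_Little_Size)×N_Little can be checked with a short derivation. The sketch below is an assumption, because the formula images are not reproduced in this text: it supposes that the processor cost of the structure shown in FIG. 3 is PS_Little_Size×N_Little + PS_Full_Size×N_Big2 and that the processor cost of the structure shown in FIG. 4 is PS_Full_Size×N_Big1. Both forms are inferred from the symbol definitions and are not quoted from the formulas themselves.

```latex
\begin{aligned}
\mathrm{Cost\_Proc}_{\mathrm{FIG.\,3}} &= \mathrm{PS\_Little\_Size}\times N_{\mathrm{Little}} + \mathrm{PS\_Full\_Size}\times N_{\mathrm{Big2}} \\
\mathrm{Cost\_Proc}_{\mathrm{FIG.\,4}} &= \mathrm{PS\_Full\_Size}\times N_{\mathrm{Big1}}
  = \mathrm{PS\_Full\_Size}\times\left(N_{\mathrm{Little}} + N_{\mathrm{Big2}}\right) \\
\mathrm{Cost\_Proc}_{\mathrm{FIG.\,4}} - \mathrm{Cost\_Proc}_{\mathrm{FIG.\,3}}
  &= \left(\mathrm{PS\_Full\_Size} - \mathrm{PS\_Little\_Size}\right)\times N_{\mathrm{Little}}
\end{aligned}
```

Under these assumed forms, the PS_Full_Size×N_Big2 terms cancel, which is why only N_Little and the difference between the two PS sizes appear in the reduction.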
- Cost_LAT represents the latency cost, Latency_L represents the I/O latency of the bus whose length is L, and Latency_R represents the I/O latency of the bus whose length is R.
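One possible reading of the two latency costs is sketched below. It rests on assumptions that are not stated in the text: the latency of a bus of length L+R is taken to be Latency_L + Latency_R, each of the three memory accesses in the structure shown in FIG. 4 crosses a bus of length L+R, and in the structure shown in FIG. 3 the three memory accesses cross buses of length L while the exchange between the processor 301 and the processing module 310 crosses the bus of length R once.

```latex
\begin{aligned}
\mathrm{Cost\_LAT}_{\mathrm{FIG.\,3}} &= 3\times\mathrm{Latency\_L} + \mathrm{Latency\_R} \\
\mathrm{Cost\_LAT}_{\mathrm{FIG.\,4}} &= 3\times\left(\mathrm{Latency\_L} + \mathrm{Latency\_R}\right) \\
\mathrm{Cost\_LAT}_{\mathrm{FIG.\,4}} - \mathrm{Cost\_LAT}_{\mathrm{FIG.\,3}} &= 2\times\mathrm{Latency\_R}
\end{aligned}
```

With these assumed forms, the difference between the two structures is Latency_R×2.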
- It can be learned that, compared with the structure shown in FIG. 4, the structure shown in FIG. 3 can reduce the I/O latency by Latency_R×2. A greater difference between R and L indicates a more reduced I/O latency.
- In conclusion, in the technical solutions according to embodiments of this application, corresponding functions can be implemented at lower costs (a lower data transfer cost, a lower processor cost, and a lower latency cost). In addition, because a length of a bus required inside a processing module is short and a width of a bus between a type 2 first processor and the processing module is small, compared with a chip that implements a same function, an area of a chip using the technical solutions of this application is small.
- The following describes two structures in FIG. 3 and FIG. 4 by using equal-cost multi-path routing (ECMP) as an example.
- A basic process of determining a next-hop port by the ECMP is as follows: A hash value is determined based on flow identifier information (for example, a quintuple or a flow label) of a packet, and then an entry is determined based on an ECMP routing table and the hash value, where a port included in the entry is a next-hop port for sending the packet.
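The basic ECMP selection described above can be sketched in a few lines of Python. This is a minimal illustration, not the implementation of this application: the `FiveTuple` type, the use of CRC32 as the hash function, the modulo mapping from hash value to entry, and the port names are all assumptions chosen for the example.

```python
import zlib
from typing import NamedTuple


class FiveTuple(NamedTuple):
    """Flow identifier information (a quintuple); field names are illustrative."""
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    protocol: int


def flow_hash(flow: FiveTuple) -> int:
    """Hash the flow identifier. CRC32 is an assumed, illustrative choice."""
    key = "|".join(str(field) for field in flow).encode()
    return zlib.crc32(key)


def select_next_hop(flow: FiveTuple, ecmp_table: list[str]) -> str:
    """Pick the entry selected by the hash value; modulo mapping is an assumption."""
    return ecmp_table[flow_hash(flow) % len(ecmp_table)]


# Hypothetical ECMP routing table with four equal-cost next-hop ports.
ecmp_table = ["port0", "port1", "port2", "port3"]
flow = FiveTuple("10.0.0.1", "10.0.0.2", 12345, 80, 6)
port = select_next_hop(flow, ecmp_table)
```

Because the hash depends only on the flow identifier information, every packet of the same flow maps to the same next-hop port in this sketch.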
- In some cases, to reduce entries stored in the ECMP routing table and improve lookup efficiency, the ECMP routing table may be divided into a plurality of tables, for example, may be divided into three tables, which are respectively referred to as a routing entry table 1, a routing entry table 2, and a routing entry table 3. First, based on the flow identifier information of the packet, an entry corresponding to the flow identifier information is determined from the routing entry table 1, where the entry includes one base address and an index of one routing entry table. Then, the routing entry table 2 is determined based on the index of the routing entry table, and an entry corresponding to the base address and the hash value determined based on the flow identifier information of the packet is queried from the routing entry table 2, where the entry includes one port index and an index of one routing entry table. Finally, the routing entry table 3 is determined based on the index of the routing entry table, and an entry corresponding to the port index is queried from the routing entry table 3, where the entry includes a next-hop port for the packet.
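The three-table lookup described above can be sketched as follows. The sketch is illustrative only and makes assumptions that the text does not state: the entry in the routing entry table 2 is located at `base_address + hash_value % GROUP_SIZE`, `GROUP_SIZE` is a fixed hypothetical constant, and the class names, table contents, and port names are invented for the example.

```python
from dataclasses import dataclass

GROUP_SIZE = 4  # hypothetical: number of equal-cost entries per base address


@dataclass(frozen=True)
class Table1Entry:
    base_address: int   # base address used in routing entry table 2
    table2_index: int   # index selecting routing entry table 2


@dataclass(frozen=True)
class Table2Entry:
    port_index: int     # port index used in routing entry table 3
    table3_index: int   # index selecting routing entry table 3


def lookup_next_hop(flow_id, hash_value, table1, tables2, tables3):
    # Stage 1: flow identifier -> base address + index of routing entry table 2.
    e1 = table1[flow_id]
    # Stage 2: base address + hash value -> port index + index of table 3.
    # Combining them as base + (hash mod GROUP_SIZE) is an assumption.
    table2 = tables2[e1.table2_index]
    e2 = table2[e1.base_address + hash_value % GROUP_SIZE]
    # Stage 3: port index -> next-hop port.
    return tables3[e2.table3_index][e2.port_index]


# Hypothetical table contents.
table1 = {"flow-a": Table1Entry(base_address=0, table2_index=7)}
tables2 = {7: [Table2Entry(port_index=i, table3_index=9) for i in range(GROUP_SIZE)]}
tables3 = {9: ["eth0", "eth1", "eth2", "eth3"]}

next_hop = lookup_next_hop("flow-a", 6, table1, tables2, tables3)
```

With the hypothetical tables above, a hash value of 6 selects offset 2 in the routing entry table 2, so `next_hop` is `"eth2"`. Each stage reads only the small result of the previous stage, which is the property the three-stage processing flows rely on.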
-
FIG. 5 is a schematic flowchart of determining a next-hop port by using the circuit shown in FIG. 4.
- 501: The processor 401 obtains an index (referred to as a routing table index 1 below) of one routing entry table from a received PS.
- 502: The processor 401 sends the routing table index 1 to the memory 411.
- 503: The processor 401 receives, from the memory 411, a routing entry table 1 corresponding to the routing table index 1.
- 504: The processor 401 determines, from the routing entry table 1, an entry corresponding to flow identifier information of a packet. The entry includes an index (referred to as a routing table index 2 below) of one routing entry table and one base address, and the routing table index 2 and the base address are written into the PS.
- 505: The processor 401 sends the PS (namely, the PS into which the routing table index 2 and the base address are written) to the processor 402.
- 506: The processor 402 obtains the routing table index 2, the base address, and one hash value from the received PS. The hash value is determined based on the flow identifier information of the packet. The hash value may be determined by an upstream node of the processor 401 and written into the PS.
- 507: The processor 402 sends the routing table index 2 to the memory 412.
- 508: The processor 402 receives, from the memory 412, a routing entry table 2 corresponding to the routing table index 2.
- 509: The processor 402 queries, from the routing entry table 2, an entry corresponding to the base address and the hash value, where the entry includes one port index and an index (referred to as a routing table index 3 below) of one routing entry table, and writes the port index and the routing table index 3 into the PS.
- 510: The processor 402 sends the PS (namely, the PS into which the port index and the routing table index 3 are written) to the processor 403.
- 511: The processor 403 obtains the routing table index 3 and the port index from the received PS.
- 512: The processor 403 sends the routing table index 3 to the memory 413.
- 513: The processor 403 receives, from the memory 413, a routing entry table 3 corresponding to the routing table index 3.
- 514: The processor 403 queries, from the routing entry table 3, an entry corresponding to the port index, where content included in the entry is a next-hop port for the packet.
- 515: The processor 403 writes the next-hop port for the packet into the PS, and sends the PS to a next node in a pipeline, so that the next node continues to process the packet.
-
FIG. 6A and FIG. 6B are a schematic flowchart of determining a next-hop port by using the circuit shown in FIG. 3.
- 601: The processor 301 obtains, from a received PS, an index (referred to as a routing table index 1 below) of one routing entry table, flow identifier information of a packet, and a hash value determined based on the flow identifier information of the packet.
- 602: The processor 301 sends, to the processor 311, the routing table index 1, the flow identifier information of the packet, and the hash value determined based on the flow identifier information of the packet.
- 603: The processor 311 sends the routing table index 1 to the memory 321.
- 604: The processor 311 receives, from the memory 321, a routing entry table 1 corresponding to the routing table index 1.
- 605: The processor 311 determines, from the routing entry table 1, an entry corresponding to the flow identifier information of the packet. The entry includes an index (referred to as a routing table index 2 below) of one routing entry table and one base address, and the routing table index 2 and the base address are written into the PS. The PS may further include the hash value determined based on the flow identifier information of the packet.
- 606: The processor 311 sends the PS (namely, the PS into which the routing table index 2 and the base address are written) to the processor 312.
- 607: The processor 312 obtains the routing table index 2, the base address, and the hash value from the received PS.
- 608: The processor 312 sends the routing table index 2 to the memory 322.
- 609: The processor 312 receives, from the memory 322, a routing entry table 2 corresponding to the routing table index 2.
- 610: The processor 312 queries, from the routing entry table 2, an entry corresponding to the base address and the hash value, where the entry includes one port index and an index (referred to as a routing table index 3 below) of one routing entry table, and writes the port index and the routing table index 3 into the PS.
- 611: The processor 312 sends the PS (namely, the PS into which the port index and the routing table index 3 are written) to the processor 313.
- 612: The processor 313 obtains the routing table index 3 and the port index from the received PS.
- 613: The processor 313 sends the routing table index 3 to the memory 323.
- 614: The processor 313 receives, from the memory 323, a routing entry table 3 corresponding to the routing table index 3.
- 615: The processor 313 queries, from the routing entry table 3, an entry corresponding to the port index, where content included in the entry is a next-hop port for the packet.
- 616: The processor 313 sends the next-hop port for the packet to the processor 301.
- 617: The processor 301 writes the next-hop port for the packet into the PS, and sends the PS to a next node in a pipeline, so that the next node continues to process the packet.
- In a procedure shown in FIG. 5, the processor 401 to the processor 403 and other processors in a chip all belong to the same pipeline. The PS processed by the processor in the pipeline is used to store context information during packet forwarding. Therefore, the PS sent by the processor 401 to the processor 402 and the PS sent by the processor 402 to the processor 403 need to include information required by a subsequent node in addition to information required for querying a next-hop port. Therefore, a size of the PS is large. For example, the size of the PS may be 512 bytes. Correspondingly, a width of a bus between processors is also large.
- However, in a procedure shown in FIG. 6A and FIG. 6B, the processor 311 to the processor 313 only care about a routing function, and the transferred PS only needs to include information required for routing. Therefore, a PS of a small size may meet requirements of the processor 311 to the processor 313. For example, a 64-byte PS may meet routing requirements. Correspondingly, a bus of a small width may be set between processors. In addition, information required by the processor 312 and information required by the processor 313 are both from previous nodes, and the information does not need to be obtained from the processor 301. Furthermore, the processor 301 only cares about the determined next-hop port: the processor 301 may not need to obtain the routing entry table 3 for determining the next-hop port, and does not need to send, to the processing module 310, information irrelevant to the determining of the next-hop port. Therefore, a bus of a small width may be set between the processor 301 and the processing module 310. For example, a width of the bus 331 may be 128 bits. In comparison, because a large amount of information (for example, a routing entry table) needs to be transmitted through a bus between a processor and a memory, a large width is required. For example, widths of the bus 333 to the bus 335 may be 256 bits, and widths of the bus 421 to the bus 423 may be 256 bits.
- An embodiment of this application further provides a circuit. The circuit includes a first processor and a first processing module connected to the first processor. The first processing module includes a second processor connected to a first memory. A transmission latency generated when the second processor performs read and write operations on the first memory is less than a transmission latency generated when the first processor communicates with the first processing module.
- For example, it is assumed that the processing module 121 shown in FIG. 1 includes only the processor 1211 and the memory 1221. In this case, the first processor may be equivalent to the processor 112, the first processing module may be equivalent to the processing module 121, the second processor may be equivalent to the processor 1211, and the first memory may be equivalent to the memory 1221. A transmission latency generated when the processor 1211 performs read and write operations on the memory 1221 is less than a transmission latency generated when the processor 112 communicates with the processing module 121.
- For another example, it is assumed that the processing module 221 shown in FIG. 2 includes only the processor 2211. In this case, the first processor may be equivalent to the processor 212, the first processing module may be equivalent to the processing module 221, the second processor may be equivalent to the processor 2211, and the first memory may be equivalent to the memory 232.
- Optionally, in some embodiments, the second processor is a multi-core processor, and the transmission latency generated when the second processor performs the read and write operations on the first memory is a transmission latency generated when any core processor of the multi-core processor included in the second processor performs read and write operations on the first memory.
- Optionally, in some embodiments, the first processor is connected to the first processing module through a first bus, and the second processor is connected to the first memory through a second bus, where a bus bit width of the second bus is greater than a bus bit width of the first bus, and/or a length of the second bus is less than a length of the first bus.
- For example, it is still assumed that the processing module 121 shown in FIG. 1 includes only the processor 1211 and the memory 1221. The first bus is equivalent to the bus 11 configured to connect the processor 112 and the processing module 121, and the second bus is equivalent to the bus 31 configured to connect the processor 1211 and the memory 1221.
- For another example, it is still assumed that the processing module 221 shown in FIG. 2 includes only the processor 2211. The first bus is equivalent to the bus 2411 configured to connect the processor 212 and the processing module 221, and the second bus may be equivalent to buses configured to connect the processor 2211 and the memory 232, including the bus 2431 and the bus 2472. The second bus may also be equivalent to the bus 2431 configured to connect the processor 2211 and the input/output interface 201.
- Optionally, in some embodiments, the first processing module further includes a third processor connected to a second memory, and a transmission latency generated when the third processor performs read and write operations on the second memory is less than the transmission latency generated when the first processor communicates with the first processing module.
-
FIG. 1 is used as an example. The first processor may be equivalent to the processor 112, the first processing module may be equivalent to the processing module 121, the second processor may be equivalent to the processor 1211, the third processor may be equivalent to the processor 1212, the first memory may be equivalent to the memory 1221, and the second memory may be equivalent to the memory 1222. A transmission latency generated when the processor 1211 performs read and write operations on the memory 1221 is less than a transmission latency generated when the processor 112 communicates with the processing module 121, and a transmission latency generated when the processor 1212 performs read and write operations on the memory 1222 is less than the transmission latency generated when the processor 112 communicates with the processing module 121.
- FIG. 2 is used as an example. The first processor may be equivalent to the processor 212, the first processing module may be equivalent to the processing module 221, the second processor may be equivalent to the processor 2211, the third processor may be equivalent to the processor 2212, the first memory may be equivalent to the memory 232, and the second memory may be equivalent to the memory 233.
- Optionally, in some embodiments, the first processor is connected to the first processing module through a first bus, the second processor is connected to the first memory through a second bus, the third processor is connected to the second memory through a third bus, and a sum of a bus bit width of the second bus and a bus bit width of the third bus is greater than a bus bit width of the first bus.
-
FIG. 1 is still used as an example. The first bus may be equivalent to the bus 11, the second bus may be equivalent to the bus 31, and the third bus may be equivalent to the bus 32.
- FIG. 2 is still used as an example. The first bus may be equivalent to the bus 2411, the second bus may be equivalent to the bus 2431 and the bus 2472, and the third bus may be equivalent to the bus 2432 and the bus 2473. The second bus may also be equivalent to the bus 2431, and the third bus may also be equivalent to the bus 2432.
- Optionally, in some embodiments, the first processing module further includes a third processor connected to the first memory, and a transmission latency generated when the third processor performs read and write operations on the first memory is less than the transmission latency generated when the first processor communicates with the first processing module.
- Optionally, in some embodiments, the first processor is connected to the first processing module through a first bus, the second processor is connected to the first memory through a second bus, the third processor is connected to the first memory through a third bus, and a sum of a bus bit width of the second bus and a bus width of the third bus is greater than a bus bit width of the first bus.
- Optionally, in some embodiments, the second processor and the third processor are pipeline processors.
- Optionally, in some embodiments, the circuit further includes a fourth processor and a third memory connected to the fourth processor.
-
FIG. 1 is still used as an example. The processor 111 may be equivalent to the fourth processor, and the memory 113 may be equivalent to the third memory.
- FIG. 2 is still used as an example. The processor 211 may be equivalent to the fourth processor, and the memory 231 may be equivalent to the third memory.
- Optionally, in some embodiments, the circuit further includes a fourth processor and a second processing module connected to the fourth processor. The second processing module includes N fifth processors connected to M memories, where both N and M are integers greater than or equal to 1. A transmission latency generated when any fifth processor performs read and write operations on the memory connected to the fifth processor is less than a transmission latency generated when the fourth processor communicates with the second processing module.
-
FIG. 1 is still used as an example. The processor 114 may be equivalent to the fourth processor, and the processing module 122 may be equivalent to the second processing module.
- FIG. 2 is still used as an example. The processor 214 may be equivalent to the fourth processor, and the processing module 222 may be equivalent to the second processing module.
- Optionally, in some embodiments, the second processor is connected to the third processor through a fourth bus, the fourth processor is connected to the first processor through a fifth bus, and a bus bit width of the fourth bus is less than a bus bit width of the fifth bus.
-
FIG. 1 is still used as an example. The bus 21 may be equivalent to the fourth bus, and the bus 41 may be equivalent to the fifth bus.
- FIG. 2 is still used as an example. The bus 2421 may be equivalent to the fourth bus, and the bus 2441 may be equivalent to the fifth bus.
- Optionally, in some embodiments, a quantity of processor cores included in the fourth processor is greater than or equal to a quantity of processor cores included in the first processor.
- Optionally, in some embodiments, the fourth processor and the first processor are pipeline processors.
- Optionally, in some embodiments, the first processing module further includes the first memory.
- An embodiment of this application further provides an electronic device. The electronic device includes the chip according to embodiments of this application, and the electronic device further includes a receiver and a transmitter. The receiver is configured to receive a packet and send the packet to the chip. The chip is configured to process the packet. The transmitter is configured to: obtain a packet processed by the chip, and send the processed packet to another electronic device. The electronic device may be a switch, a router, or any other electronic device on which the foregoing chip can be disposed.
- The chip in embodiments of this application may be a system on chip (SoC), a network processor (NP), or the like.
- The memory in embodiments of this application may be a volatile memory or a nonvolatile memory, or may include both a volatile memory and a nonvolatile memory. The nonvolatile memory may be a read-only memory (read-only memory, ROM), a programmable read-only memory (programmable ROM, PROM), an erasable programmable read-only memory (erasable PROM, EPROM), an electrically erasable programmable read-only memory (electrically EPROM, EEPROM), or a flash memory. The volatile memory may be a random access memory (random access memory, RAM), used as an external cache. By way of example and not limitation, many forms of RAMs may be used, for example, a static random access memory (static RAM, SRAM), a dynamic random access memory (dynamic RAM, DRAM), a synchronous dynamic random access memory (synchronous DRAM, SDRAM), a double data rate synchronous dynamic random access memory (double data rate SDRAM, DDR SDRAM), an enhanced synchronous dynamic random access memory (enhanced SDRAM, ESDRAM), a synchlink dynamic random access memory (synchlink DRAM, SLDRAM), and a direct rambus dynamic random access memory (direct rambus RAM, DR RAM). It should be noted that the memory of systems and methods described in this specification includes but is not limited to these and any memory of another proper type.
- It should be noted that, the processor in embodiments of this application may be an integrated circuit chip, and has a signal processing capability. In an implementation process, steps in the foregoing method embodiments can be implemented by using a hardware integrated logical circuit in the processor, or by using instructions in a form of software. The processor may be a microprocessor, or the processor may be any conventional processor, or the like.
- In an implementation process, steps in the foregoing methods can be implemented by using a hardware integrated logical circuit in the processor, or by using instructions in a form of software. The steps of the method disclosed with reference to embodiments of this application may be directly performed by a hardware processor, or may be performed by using a combination of hardware in the processor and a software module. The software module may be located in a mature storage medium in the art, such as a random access memory, a flash memory, a read-only memory, a programmable read-only memory, an electrically erasable programmable memory, or a register. The storage medium is located in the memory, and a processor reads information in the memory and completes the steps in the foregoing methods in combination with hardware of the processor. To avoid repetition, details are not described herein again.
- A person of ordinary skill in the art may be aware that, in combination with the examples described in embodiments disclosed in this specification, units and algorithm steps may be implemented by using electronic hardware or a combination of computer software and electronic hardware. Whether the functions are performed by using hardware or software depends on particular applications and design constraint conditions of the technical solutions. A person skilled in the art may use different methods to implement the described functions for each particular application, but it should not be considered that the implementation goes beyond the scope of this application.
- It may be clearly understood by a person skilled in the art that, for the purpose of convenient and brief description, for a detailed working process of the foregoing system, apparatus, and unit, refer to a corresponding process in the foregoing method embodiments. Details are not described herein again.
- In several embodiments according to this application, it should be understood that the disclosed system, apparatus, and method may be implemented in other manners. For example, the described apparatus embodiments are merely examples. For example, division into the units is merely logical function division and there may be other division during actual implementation. For example, a plurality of units or components may be combined or integrated into another system, or some features may be ignored or not performed. In addition, the displayed or discussed mutual couplings or direct couplings or communication connections may be implemented by using some interfaces. The indirect couplings or communication connections between the apparatuses or units may be implemented in electronic, mechanical, or other forms.
- The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one position, or may be distributed on a plurality of network units. Some or all of the units may be selected according to actual requirements to achieve the objectives of the solutions of embodiments.
- In addition, functional units in embodiments of this application may be integrated into one processing unit, each of the units may exist alone physically, or two or more units may be integrated into one unit.
- When the functions are implemented in a form of a software functional unit and sold or used as an independent product, the functions may be stored in a computer-readable storage medium. Based on such an understanding, the technical solutions of this application essentially, or the part contributing to the conventional technology, or some of the technical solutions may be implemented in a form of a software product. The computer software product is stored in a storage medium and includes several instructions for instructing a computer device (which may be a personal computer, a server, or a network device) to perform all or some of the steps of the methods described in embodiments of this application. The foregoing storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (Read-Only Memory, ROM), a random access memory (Random Access Memory, RAM), a magnetic disk, or an optical disc.
- The foregoing descriptions are merely specific implementations of this application, but are not intended to limit the protection scope of this application. Any variation or replacement readily figured out by a person skilled in the art within the technical scope disclosed in this application shall fall within the protection scope of this application. Therefore, the protection scope of this application shall be subject to the protection scope of the claims.
Claims (16)
- A circuit, wherein the circuit comprises a first processor and a first processing module connected to the first processor, the first processing module comprises a second processor connected to a first memory, and a transmission latency generated when the second processor performs read and write operations on the first memory is less than a transmission latency generated when the first processor communicates with the first processing module.
- The circuit according to claim 1, wherein the second processor is a multi-core processor, and the transmission latency generated when the second processor performs the read and write operations on the first memory is a transmission latency generated when any core processor of the multi-core processor comprised in the second processor performs read and write operations on the first memory.
- The circuit according to claim 1 or 2, wherein the first processor is connected to the first processing module through a first bus, and the second processor is connected to the first memory through a second bus, wherein a bus bit width of the second bus is greater than a bus bit width of the first bus, and/or a length of the second bus is less than a length of the first bus.
- The circuit according to claim 1 or 2, wherein the first processing module further comprises a third processor connected to a second memory, and a transmission latency generated when the third processor performs read and write operations on the second memory is less than the transmission latency generated when the first processor communicates with the first processing module.
- The circuit according to claim 4, wherein the first processor is connected to the first processing module through a first bus, the second processor is connected to the first memory through a second bus, the third processor is connected to the second memory through a third bus, and a sum of a bus bit width of the second bus and a bus width of the third bus is greater than a bus bit width of the first bus.
- The circuit according to claim 1 or 2, wherein the first processing module further comprises a third processor connected to the first memory, and a transmission latency generated when the third processor performs read and write operations on the first memory is less than the transmission latency generated when the first processor communicates with the first processing module.
- The circuit according to claim 6, wherein the first processor is connected to the first processing module through a first bus, the second processor is connected to the first memory through a second bus, the third processor is connected to the first memory through a third bus, and a sum of a bus bit width of the second bus and a bus width of the third bus is greater than a bus bit width of the first bus.
- The circuit according to any one of claims 4 to 7, wherein the second processor and the third processor are pipeline processors.
- The circuit according to any one of claims 1 to 3, wherein the circuit further comprises a fourth processor and a third memory connected to the fourth processor; or
the circuit further comprises a fourth processor and a second processing module connected to the fourth processor; the second processing module comprises N fifth processors connected to M memories, wherein both N and M are integers greater than or equal to 1; and a transmission latency generated when any fifth processor performs read and write operations on the memory connected to the fifth processor is less than a transmission latency generated when the fourth processor communicates with the second processing module.
- The circuit according to any one of claims 4 to 8, wherein the circuit further comprises a fourth processor and a third memory connected to the fourth processor; or
the circuit further comprises a fourth processor and a second processing module connected to the fourth processor; the second processing module comprises N fifth processors connected to M memories, wherein both N and M are integers greater than or equal to 1; and a transmission latency generated when any fifth processor performs read and write operations on the memory connected to the fifth processor is less than a transmission latency generated when the fourth processor communicates with the second processing module.
- The circuit according to claim 10, wherein the second processor is connected to the third processor through a fourth bus, the fourth processor is connected to the first processor through a fifth bus, and a bus bit width of the fourth bus is less than a bus bit width of the fifth bus.
- The circuit according to any one of claims 9 to 11, wherein a quantity of processor cores comprised in the fourth processor is greater than or equal to a quantity of processor cores comprised in the first processor.
- The circuit according to any one of claims 9 to 12, wherein the fourth processor and the first processor are pipeline processors.
- The circuit according to any one of claims 1 to 13, wherein the first processing module further comprises the first memory.
- A chip, wherein the chip comprises the circuit according to any one of claims 1 to 14.
- An electronic device, wherein the electronic device comprises the chip according to claim 15, and the electronic device further comprises a receiver and a transmitter, wherein the receiver is configured to receive a packet and send the packet to the chip; the chip is configured to process the packet; and the transmitter is configured to: obtain a packet processed by the chip, and send the processed packet to another electronic device.
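The bus bit width and transmission latency conditions recited in the claims above can be read as simple inequalities. The following is a minimal illustrative sketch only, not part of the claimed subject matter: the `Bus` type, the function names, and all numeric values are hypothetical, since the claims give no concrete widths or latencies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Bus:
    """A bus described only by its bit width (hypothetical helper type)."""
    name: str
    bit_width: int


def width_condition_holds(first: Bus, second: Bus, third: Bus) -> bool:
    """Sum of the second and third bus bit widths must exceed the first bus bit width."""
    return second.bit_width + third.bit_width > first.bit_width


def latency_condition_holds(memory_latency_ns: float, link_latency_ns: float) -> bool:
    """Local read/write latency must be less than the processor-to-module latency."""
    return memory_latency_ns < link_latency_ns


# Assumed example values, chosen only to exercise the two checks.
first_bus = Bus("first", 128)
second_bus = Bus("second", 96)
third_bus = Bus("third", 96)

print(width_condition_holds(first_bus, second_bus, third_bus))
print(latency_condition_holds(memory_latency_ns=5.0, link_latency_ns=40.0))
```

Both comparisons are strict, matching the "greater than" and "less than" wording of the claims; equal values would not satisfy either condition.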
Applications Claiming Priority (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011060780 | 2020-09-30 | ||
CN202011176149.7A CN114327247A (en) | 2020-09-30 | 2020-10-28 | Circuit, chip and electronic equipment |
PCT/CN2021/115618 WO2022068503A1 (en) | 2020-09-30 | 2021-08-31 | Circuit, chip, and electronic device |
Publications (2)
Publication Number | Publication Date |
---|---|
EP4209886A1 (en) | 2023-07-12 |
EP4209886A4 (en) | 2024-02-14 |
Family
ID=80949607
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
EP21874164.3A Pending EP4209886A4 (en) | 2020-09-30 | 2021-08-31 | Circuit, chip, and electronic device |
Country Status (7)
Country | Link |
---|---|
US (1) | US20230236727A1 (en) |
EP (1) | EP4209886A4 (en) |
JP (1) | JP7556606B2 (en) |
KR (1) | KR20230073317A (en) |
CA (1) | CA3194399A1 (en) |
MX (1) | MX2023003629A (en) |
WO (1) | WO2022068503A1 (en) |
Family Cites Families (24)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US4207609A (en) * | 1978-05-08 | 1980-06-10 | International Business Machines Corporation | Method and means for path independent device reservation and reconnection in a multi-CPU and shared device access system |
US6928500B1 (en) * | 1990-06-29 | 2005-08-09 | Hewlett-Packard Development Company, L.P. | High speed bus system that incorporates uni-directional point-to-point buses |
US5708668A (en) * | 1992-05-06 | 1998-01-13 | International Business Machines Corporation | Method and apparatus for operating an array of storage devices |
JP3231429B2 (en) * | 1992-11-06 | 2001-11-19 | 株式会社日立製作所 | Semiconductor integrated circuit device having central processing unit and multiplier |
JPH08212185A (en) * | 1995-01-31 | 1996-08-20 | Mitsubishi Electric Corp | Microcomputer |
JP3655403B2 (en) * | 1995-10-09 | 2005-06-02 | 株式会社ルネサステクノロジ | Data processing device |
JP3769413B2 (en) | 1999-03-17 | 2006-04-26 | 株式会社日立製作所 | Disk array controller |
JPWO2002061591A1 (en) | 2001-01-31 | 2004-06-03 | 株式会社ルネサステクノロジ | Data processing system and data processor |
US20030023793A1 (en) * | 2001-07-30 | 2003-01-30 | Mantey Paul J. | Method and apparatus for in-system programming through a common connection point of programmable logic devices on multiple circuit boards of a system |
US7526631B2 (en) * | 2003-04-28 | 2009-04-28 | International Business Machines Corporation | Data processing system with backplane and processor books configurable to support both technical and commercial workloads |
CN1979461A (en) * | 2005-11-29 | 2007-06-13 | 泰安电脑科技(上海)有限公司 | Multi-processor module |
JP5079342B2 (en) * | 2007-01-22 | 2012-11-21 | ルネサスエレクトロニクス株式会社 | Multiprocessor device |
JP2008299610A (en) * | 2007-05-31 | 2008-12-11 | Toshiba Corp | Multiprocessor |
US9432298B1 (en) * | 2011-12-09 | 2016-08-30 | P4tents1, LLC | System, method, and computer program product for improving memory systems |
US20130021350A1 (en) * | 2011-07-19 | 2013-01-24 | Advanced Micro Devices, Inc. | Apparatus and method for decoding using coefficient compression |
EP3158455B1 (en) * | 2014-06-23 | 2020-03-18 | Liqid Inc. | Modular switched fabric for data storage systems |
US20150370673A1 (en) * | 2014-06-24 | 2015-12-24 | Qualcomm Incorporated | System and method for providing a communication channel to a power management integrated circuit in a pcd |
US9588898B1 (en) * | 2015-06-02 | 2017-03-07 | Western Digital Technologies, Inc. | Fullness control for media-based cache operating in a steady state |
EP3511837B1 (en) * | 2016-09-29 | 2023-01-18 | Huawei Technologies Co., Ltd. | Chip having extensible memory |
CN110419034B (en) | 2017-04-14 | 2021-01-08 | 华为技术有限公司 | Data access method and device |
CN107291392A (en) | 2017-06-21 | 2017-10-24 | 郑州云海信息技术有限公司 | A kind of solid state hard disc and its reading/writing method |
CN109582215B (en) * | 2017-09-29 | 2020-10-09 | 华为技术有限公司 | Hard disk operation command execution method, hard disk and storage medium |
US10841243B2 (en) | 2017-11-08 | 2020-11-17 | Mellanox Technologies, Ltd. | NIC with programmable pipeline |
CN108920111B (en) * | 2018-07-27 | 2021-05-28 | 中国联合网络通信集团有限公司 | Data sharing method and distributed data sharing system |
2021
- 2021-08-31 JP JP2023519636A (JP7556606B2), status: Active
- 2021-08-31 WO PCT/CN2021/115618 (WO2022068503A1), status: Application Filing
- 2021-08-31 MX MX2023003629A (MX2023003629A), status: unknown
- 2021-08-31 EP EP21874164.3A (EP4209886A4), status: Pending
- 2021-08-31 CA CA3194399A (CA3194399A1), status: Pending
- 2021-08-31 KR KR1020237014009A (KR20230073317A), status: Search and Examination

2023
- 2023-03-29 US US18/192,293 (US20230236727A1), status: Pending
Also Published As
Publication number | Publication date |
---|---|
WO2022068503A1 (en) | 2022-04-07 |
KR20230073317A (en) | 2023-05-25 |
EP4209886A4 (en) | 2024-02-14 |
JP7556606B2 (en) | 2024-09-26 |
CA3194399A1 (en) | 2022-04-07 |
JP2023543466A (en) | 2023-10-16 |
MX2023003629A (en) | 2023-06-23 |
US20230236727A1 (en) | 2023-07-27 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US9025495B1 (en) | Flexible routing engine for a PCI express switch and method of use | |
US8908564B2 (en) | Method for Media Access Control address learning and learning rate suppression | |
US6700894B1 (en) | Method and apparatus for shared buffer packet switching | |
US7308523B1 (en) | Flow-splitting and buffering PCI express switch to reduce head-of-line blocking | |
KR100280642B1 (en) | Memory management device of Ethernet controller and its control method | |
CN102047619B (en) | Methods, systems, and computer readable media for dynamically rate limiting slowpath processing of exception packets | |
CN105791126B (en) | Ternary Content Addressable Memory (TCAM) table look-up method and device | |
US20230153264A1 (en) | Data transmission method, chip, and device | |
EP3327993A1 (en) | Route management | |
CN111683017B (en) | Multi-level congestion control method, device, system and medium in high-speed interconnection network | |
US8199648B2 (en) | Flow control in a variable latency system | |
US20110126070A1 (en) | Resending Control Circuit, Sending Device, Resending Control Method and Resending Control Program | |
CN108259348B (en) | Message transmission method and device | |
EP4209886A1 (en) | Circuit, chip, and electronic device | |
CN104509043A (en) | Phase-based packet prioritization | |
US20230016684A1 (en) | Communications Method and Related Apparatus | |
CN113923061B (en) | GPU network communication method based on intelligent network card, medium and equipment | |
WO2023273946A1 (en) | Seamless bidirectional forwarding detection method, system, node, and storage medium | |
CN116032837A (en) | Flow table unloading method and device | |
WO2019240602A1 (en) | Technologies for sharing packet replication resources in a switching system | |
CN109951365B (en) | Network communication method, system and controller combining PCIe bus and Ethernet | |
US6747978B1 (en) | Direct memory access packet router method and apparatus | |
US10523457B2 (en) | Network communication method, system and controller of PCIe and Ethernet hybrid networks | |
CN114327247A (en) | Circuit, chip and electronic equipment | |
CN112583709A (en) | Link aggregation routing method, system, switching equipment and medium |
Legal Events
Code | Title | Description |
---|---|---|
STAA | Information on the status of an ep patent application or granted ep patent | Free format text: STATUS: THE INTERNATIONAL PUBLICATION HAS BEEN MADE |
PUAI | Public reference made under article 153(3) epc to a published international application that has entered the european phase | Free format text: ORIGINAL CODE: 0009012 |
STAA | Information on the status of an ep patent application or granted ep patent | Free format text: STATUS: REQUEST FOR EXAMINATION WAS MADE |
17P | Request for examination filed | Effective date: 20230404 |
AK | Designated contracting states | Kind code of ref document: A1; Designated state(s): AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MK MT NL NO PL PT RO RS SE SI SK SM TR |
DAV | Request for validation of the european patent (deleted) | |
DAX | Request for extension of the european patent (deleted) | |
REG | Reference to a national code | Ref country code: DE; Ref legal event code: R079; Free format text: PREVIOUS MAIN CLASS: G06F0003060000; Ipc: G06F0013400000 |
A4 | Supplementary search report drawn up and despatched | Effective date: 20240115 |
RIC1 | Information provided on ipc code assigned before grant | Ipc: G06F 30/392 20200101ALI20240109BHEP; Ipc: G06F 13/42 20060101ALI20240109BHEP; Ipc: G06F 13/40 20060101AFI20240109BHEP |
STAA | Information on the status of an ep patent application or granted ep patent | Free format text: STATUS: EXAMINATION IS IN PROGRESS |