WO2017012564A1

WO2017012564A1 - Data processing device and optical transport network switch

Info

Publication number: WO2017012564A1
Application number: PCT/CN2016/090852
Authority: WO
Inventors: 向俊凌; 董立民; 李昆; 丁炽武
Original assignee: 华为技术有限公司
Priority date: 2015-07-22
Filing date: 2016-07-21
Publication date: 2017-01-26
Also published as: CN106375243A; CN106375243B

Abstract

The present invention discloses a data processing device and an OTN switch, which can improve data processing performance. The data processing device comprises a plurality of processing elements; and each processing element in the plurality of processing elements comprises a bit interleaving unit and at least one ALU, wherein the bit interleaving unit is used for, according to current frame header offset information about a plurality of continuous bits, determining a target output port corresponding to each bit group in at least one bit group consisting of the plurality of continuous bits, and outputting the bit group from the corresponding target output port, with each bit group in the at least one bit group comprising at least one continuous bit in the plurality of continuous bits, and at least one target ALU in the at least one ALU is used for receiving at least one first bit group in the at least one bit group transmitted by the bit interleaving unit, and executing an instruction on the at least one first bit group to obtain an instruction execution result.

Description

Data processing device and optical transport network switch

Technical field

The present invention relates to the field of communications and, more particularly, to data processing devices and optical transport network switches.

Background art

To reduce network capital and operating expenditure (CapEx and OpEx), the industry has proposed Software Defined Networking (SDN) and Network Function Virtualization (NFV). These technologies separate the data plane from the control plane of communication devices and rely on standardized hardware architectures, open interfaces, and programmability to simplify device implementation and operation and maintenance, to accelerate the innovation and deployment of network services, and to take advantage of the economies of scale of Information Technology (IT).

In existing SDN, the upper-layer service functions of the network are implemented in software and can run on a range of industry-standard server hardware. They can be migrated, instantiated, and deployed at different locations in the network without installing new devices, and are generally implemented on x86-based servers. At the network forwarding layer, standardized interfaces are used; the forwarding plane includes only a basic instruction set and table resources, and the forwarding processes and services are loaded and deployed by a remote controller. This layer is generally implemented on a network processor (NP) or a protocol independent forwarding (PIF) processor. The network L1 layer is mainly responsible for clock and data recovery and synchronization of the physical-layer bit stream, rate adaptation, mapping, multiplexing, framing, and Forward Error Correction (FEC) processing, and is generally implemented by an Application Specific Integrated Circuit (ASIC). Such a device is a black box to the user, who can only perform some configuration and management work.

With the development of technology, the L1 layer data plane needs to move beyond fixed-function implementations and break the black-box state of the device, and the industry has proposed using NP or PIF chips to implement L1 service functions. The NP uses a Reduced Instruction Set Computer (RISC) processor optimized for packet data plane processing as its processing engine and performs service processing through microcode programming. The programming granularity of the NP is the RISC processor, which executes program instructions under the control of a program counter and accesses a data storage unit to complete service processing. The memory wall in this structure becomes the biggest obstacle to data bit stream processing, so the NP cannot meet the performance requirements of the L1 layer for bit stream processing.

Summary of the invention

The embodiment of the invention provides a data processing device and an OTN switch, which can improve data processing performance.

In a first aspect, a data processing device is provided, comprising a plurality of processing elements, each of the plurality of processing elements comprising a bit interleaving unit and at least one ALU, wherein at least one output port of the bit interleaving unit corresponds to the at least one ALU. The bit interleaving unit is configured to determine, according to current frame header offset information of a plurality of consecutive bits, a target output port corresponding to each bit group of at least one bit group consisting of the plurality of consecutive bits, and to output each bit group from the corresponding target output port, wherein each bit group of the at least one bit group includes at least one consecutive bit of the plurality of consecutive bits. At least one target ALU of the at least one ALU is configured to receive at least one first bit group of the at least one bit group transmitted by the bit interleaving unit, and to execute an instruction on the at least one first bit group to obtain an instruction execution result, wherein the at least one target ALU corresponds to at least one target output port corresponding to the at least one bit group.

In a first possible implementation manner, the device stores a correspondence between preset frame header offset values and output ports, and the bit interleaving unit is specifically configured to determine the target output port of each bit group of the at least one bit group according to the current frame header offset information of the plurality of consecutive bits and the correspondence between the preset frame header offset values and the output ports.

With reference to the foregoing possible implementation manner, in a second possible implementation manner, the preset frame header offset value is in units of M bits, where M ≥ 1, and the bit interleaving unit is specifically configured to: determine, according to the current frame header offset information of the plurality of consecutive bits, a frame header offset value of each bit group of the at least one bit group, wherein each bit group includes M consecutive bits; determine, in the correspondence between the preset frame header offset values and the output ports, the output port corresponding to the frame header offset value of each bit group; and determine the corresponding output port as the target output port of that bit group.

In combination with the foregoing possible implementation manner, in a third possible implementation manner, each processing element of the multiple processing elements stores a correspondence between the preset frame header offset value and an output port.

With reference to the foregoing possible implementation manners, in a fourth possible implementation manner, the device further stores a plurality of instruction parameters. The bit interleaving unit is further configured to determine an instruction parameter storage address of each bit group of the at least one bit group according to the current frame header offset information of the plurality of consecutive bits, and to send indication information to each target ALU of the at least one target ALU through the target output port corresponding to that target ALU, where the indication information indicates the instruction parameter storage address of the first bit group received by that target ALU. Each target ALU of the at least one target ALU is further configured to, before executing the instruction on the received first bit group, acquire an instruction parameter from the instruction parameter storage address indicated by the indication information sent by the bit interleaving unit, and to execute the instruction on the received first bit group according to the acquired instruction parameter.

In combination with the foregoing possible implementation manner, in a fifth possible implementation, the bit interleaving unit is further configured to output current frame header offset information of the multiple consecutive bits by an output end of the processing element to which the bit interleaving unit belongs.

With reference to the foregoing possible implementation manners, in a sixth possible implementation manner, each of the plurality of processing elements further includes a conversion unit, wherein an input end of the conversion unit is connected to at least one output port of the bit interleaving unit, and an output end of the conversion unit is connected to an output end of the processing element to which the conversion unit belongs. The bit interleaving unit is further configured to determine that the target output port of at least one second bit group of the at least one bit group corresponds to the conversion unit, and to transmit the at least one second bit group to the conversion unit through that target output port. The conversion unit is configured to transmit the received at least one second bit group to the output end of the processing element to which it belongs.

In combination with the foregoing possible implementation manners, in a seventh possible implementation manner, the multiple processing elements are in a Mesh structure.

In combination with the foregoing possible implementation manner, in an eighth possible implementation, the multiple processing elements include at least one first processing element and at least one second processing element, wherein each of the at least one first processing element An output of the first processing element is coupled to an input of all of the at least one second processing element.

With reference to the foregoing possible implementation manners, in a ninth possible implementation manner, the device further includes an input unit, where an output end of the input unit is connected to an input end of a third processing element of the plurality of processing elements. The input unit is configured to perform framing processing on a parallel bit stream to determine a frame header position of the parallel bit stream, and is further configured to send, to the third processing element, a plurality of consecutive bits in the parallel bit stream and first frame header offset information of the plurality of consecutive bits. The bit interleaving unit of the third processing element is configured to receive the plurality of consecutive bits and the first frame header offset information of the plurality of consecutive bits transmitted by the input unit, and to determine the received first frame header offset information as the current frame header offset information of the plurality of consecutive bits.

With reference to the foregoing possible implementation manners, in a tenth possible implementation manner, the plurality of processing elements include at least one fourth processing element and a fifth processing element, and an output end of each of the at least one fourth processing element is connected to an input end of the fifth processing element, wherein the bit interleaving unit of the fifth processing element is specifically configured to: receive a plurality of consecutive bits transmitted by the at least one fourth processing element, where the plurality of consecutive bits are obtained by the at least one fourth processing element by processing at least one received consecutive bit; determine, according to at least one input port of the plurality of consecutive bits, at least one slot position corresponding to the plurality of consecutive bits among a plurality of local slot positions; and determine the current frame header offset information of the plurality of consecutive bits according to frame header offset information of the at least one slot position.

With reference to the foregoing possible implementation manners, in an eleventh possible implementation manner, the current frame header offset information of the plurality of consecutive bits includes an offset value of the first bit of the plurality of consecutive bits relative to the frame header of the frame to which the plurality of consecutive bits belong.

In a second aspect, an optical transport network (OTN) switch is provided, comprising a first photoelectric conversion unit, the data processing device in the first aspect or any of its possible implementation manners, and a second photoelectric conversion unit. The first photoelectric conversion unit is configured to perform photoelectric conversion on an input first optical signal to obtain a bit stream corresponding to the first optical signal, and to transmit the bit stream to the data processing device. The data processing device is configured to receive the bit stream transmitted by the first photoelectric conversion unit, process the bit stream to obtain a processed bit stream, and transmit the processed bit stream to the second photoelectric conversion unit. The second photoelectric conversion unit is configured to receive the processed bit stream transmitted by the data processing device, perform electro-optical conversion on the processed bit stream to obtain a second optical signal corresponding to the processed bit stream, and output the second optical signal.

Therefore, the data processing device and the OTN switch of the embodiments of the present invention include a plurality of processing elements, each processing element including a bit interleaving unit and at least one ALU, wherein the bit interleaving unit is configured to determine, according to frame header offset information of a plurality of consecutive bits, a target output port corresponding to each bit group of at least one bit group consisting of the plurality of consecutive bits, and to output each bit group from the corresponding target output port, and at least one target ALU of the at least one ALU is configured to receive at least one first bit group of the at least one bit group transmitted by the bit interleaving unit and to execute an instruction on the at least one first bit group to obtain an instruction execution result. This can improve bit stream processing performance such as processing delay.

Brief description of the drawings

To illustrate the technical solutions of the embodiments of the present invention more clearly, the drawings used in the embodiments of the present invention or in the description of the prior art are briefly described below. Obviously, the drawings described below show only some embodiments of the present invention, and those of ordinary skill in the art may obtain other drawings from them without creative effort.

FIG. 1 is a schematic block diagram of a data processing device according to an embodiment of the present invention.

FIG. 2 is another schematic block diagram of a data processing device according to an embodiment of the present invention.

FIG. 3 is another schematic block diagram of a data processing device according to an embodiment of the present invention.

FIG. 4 is a schematic block diagram of an input unit in a data processing device according to an embodiment of the present invention.

FIG. 5 is a schematic block diagram of an output unit in a data processing device according to an embodiment of the present invention.

FIG. 6 is a schematic block diagram of processing elements in a data processing device according to an embodiment of the present invention.

FIG. 7 is a schematic block diagram of an OTN switch according to an embodiment of the present invention.

FIG. 8 is a schematic structural diagram of a system for applying a data processing device to a signal multiplexing scenario according to an embodiment of the present invention.

FIG. 9 is a schematic diagram of the workflow of the system architecture shown in FIG. 8.

FIG. 10 is a schematic diagram of the processing flow of the PE 332 of FIG. 8 for each of the first 8 bytes received.

FIG. 11 is a schematic diagram of the processing flow of the PE 333 of FIG. 8 for each of the received 8 bytes.

FIG. 12 is a schematic diagram of another system architecture of a data processing device according to an embodiment of the present invention applied to a signal multiplexing scenario.

FIG. 13 is a schematic diagram of a combination of data processing device applications according to an embodiment of the present invention.

FIG. 14 is another schematic diagram of a combination of data processing device applications according to an embodiment of the present invention.

FIG. 15 is another schematic diagram of a combination of data processing device applications according to an embodiment of the present invention.

Detailed description

The technical solutions in the embodiments of the present invention are described below with reference to the accompanying drawings. Clearly, the described embodiments are some, but not all, of the embodiments of the present invention. All other embodiments obtained by those skilled in the art based on the embodiments of the present invention without creative effort shall fall within the scope of protection of the present invention.

It should be understood that the technical solutions of the embodiments of the present invention may be applied to various SDN architectures, such as an Open Radio architecture. The technical solutions can be applied to the L1 layer of the SDN architecture, and can also be applied to any one or more of the L2 to L7 layers of the SDN architecture. The technical solutions provided by the present invention can also be applied to other network architectures, which is not limited in the embodiments of the present invention.

The data processing device provided by the invention adopts a dataflow machine model based on a dataflow architecture, is optimized for data plane processing, and provides programmability of the data plane. In the dataflow architecture, instruction execution is data-driven: as soon as all operands required by an instruction are ready, the instruction is started, and its result is passed to the next instruction as an operand, which in turn drives the execution of that next instruction. Specifically, the processing program is converted by the compiler into a directed instruction graph, and the directed instruction graph is mapped onto the processing nodes in the data processing device, wherein each processing node implements one instruction in the directed instruction graph, so that a processing pipeline is eventually formed.
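The data-driven firing rule described above can be illustrated with a minimal sketch. It is not the device's implementation: the function `run_dataflow`, the token format, and the two-node example graph are hypothetical names chosen only to show an instruction starting once all of its operands have arrived and forwarding its result to the next instruction.

```python
from collections import deque

def run_dataflow(nodes, edges, inputs):
    """Execute a directed instruction graph in a data-driven way.

    nodes:  name -> (function, number of operands)
    edges:  name -> list of (consumer name, operand slot)
    inputs: list of (node name, operand slot, value) tokens
    All three structures are illustrative assumptions.
    """
    pending = {name: {} for name in nodes}   # operands that have arrived so far
    results = {}
    queue = deque(inputs)
    while queue:
        name, slot, value = queue.popleft()
        pending[name][slot] = value
        func, arity = nodes[name]
        if len(pending[name]) == arity:      # all operands ready: the instruction fires
            args = [pending[name][i] for i in range(arity)]
            results[name] = func(*args)
            pending[name] = {}
            # The result becomes an operand of each downstream instruction.
            for consumer, consumer_slot in edges.get(name, []):
                queue.append((consumer, consumer_slot, results[name]))
    return results

# Hypothetical graph: an XOR whose result feeds an AND with a mask.
nodes = {
    "xor": (lambda a, b: a ^ b, 2),
    "mask": (lambda x, m: x & m, 2),
}
edges = {"xor": [("mask", 0)]}
tokens = [("xor", 0, 0b1100), ("mask", 1, 0b0110), ("xor", 1, 0b1010)]
results = run_dataflow(nodes, edges, tokens)
```

Note that the order in which tokens arrive does not matter here: the `mask` node simply waits until the `xor` result is delivered, which mirrors the pipeline behaviour the paragraph describes.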

FIG. 1 shows a data processing device 100 provided by an embodiment of the present invention. The data processing device 100 includes a plurality of processing elements (PEs) 110, each of the plurality of PEs 110 including a bit interleaving unit 112 and at least one Arithmetic Logic Unit (ALU) 114, the at least one ALU 114 being in one-to-one correspondence with at least one output port of the bit interleaving unit 112, wherein:

The bit interleaving unit 112 is configured to determine, according to current frame header offset information of a plurality of consecutive bits, a target output port corresponding to each bit group of at least one bit group composed of the plurality of consecutive bits, and to output each bit group from the corresponding target output port;

At least one target ALU 114 of the at least one ALU 114 is configured to receive at least one first bit group of the at least one bit group transmitted by the bit interleaving unit 112, and to execute an instruction on the at least one first bit group to obtain an instruction execution result, wherein the at least one target ALU 114 corresponds to at least one target output port corresponding to the at least one bit group.

The plurality of consecutive bits may be received by the bit interleaving unit 112, or may be generated by the bit interleaving unit 112 according to the received at least one consecutive bit, which is not limited by the embodiment of the present invention.

The bit interleaving unit 112 may have a plurality of output ports, wherein at least one of the plurality of output ports is connected to an input of the at least one ALU 114, that is, at least one of the plurality of output ports is in one-to-one correspondence with the at least one ALU 114. The at least one output port may be all or some of the output ports of the bit interleaving unit 112. For example, there may be a plurality of ALUs 114 whose number equals the number of output ports of the bit interleaving unit 112, in which case the output ports of the bit interleaving unit 112 are in one-to-one correspondence with the ALUs 114. Alternatively, there may be one or more ALUs 114 whose number is smaller than the number of output ports of the bit interleaving unit 112, in which case some of the output ports are in one-to-one correspondence with the at least one ALU 114, and the remaining output ports may be connected directly to the output of the PE 110 (or serve as output ports of the PE 110) or may be connected to the inputs of other units included in the PE 110, which is not limited in the embodiments of the present invention.

Prior-art data processing devices are memory-centric. Specifically, the bit stream input to the data processing device is first stored in memory; the ALU must read the required bits and instructions from memory before it can execute the instruction on those bits, and it then writes the operation result of the instruction back to memory. Because memory read/write speed lags far behind processor computing speed, the repeated memory reads and writes in this process further increase the processing delay. In the data processing device provided by the embodiments of the present invention, the bit interleaving unit interleaves and distributes each bit group in the bit stream directly to the corresponding ALU, so the ALU does not need to read and write memory repeatedly, which improves processing performance such as data processing delay and jitter.

For convenience of description, the following describes the number of the plurality of consecutive bits as N, but the embodiment of the present invention is not limited thereto.

The bit interleaving unit 112 may determine current frame header offset information of the N consecutive bits, determine, according to that information, a target output port corresponding to each bit group of the at least one bit group composed of the N consecutive bits, and output each bit group through its target output port. Specifically, the N consecutive bits constitute one or more bit groups, each bit group includes one or more consecutive bits of the N consecutive bits, and the number of bits included in different bit groups may be the same or different, which is not limited in the embodiments of the present invention.
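As a rough illustration of this routing step, the following sketch splits N consecutive bits into groups and looks up an output port for each group from its frame header offset. It assumes equal-sized groups and a preset offset-to-port table (as in the first and second possible implementation manners); the function name `route_bit_groups`, the port names, and the 32-bit frame are hypothetical.

```python
def route_bit_groups(bits, first_bit_offset, group_size, port_table, frame_length):
    """Split `bits` into groups and pick a target output port for each group.

    bits:             list of 0/1 values (the N consecutive bits)
    first_bit_offset: offset of bits[0] from the frame header, in bits
    group_size:       bits per group (uniform here, which is an assumption)
    port_table:       maps a group's offset, in units of group_size, to a port
    frame_length:     frame length in bits; offsets wrap at the frame boundary
    """
    routed = {}
    for start in range(0, len(bits), group_size):
        group = bits[start:start + group_size]
        offset = (first_bit_offset + start) % frame_length
        port = port_table[offset // group_size]
        routed.setdefault(port, []).append(group)
    return routed

# Hypothetical 32-bit frame handled in 8-bit groups on a 16-bit data path.
port_table = {0: "alu0", 1: "alu1", 2: "alu0", 3: "bypass"}
bits = [1, 0, 1, 0, 1, 0, 1, 0, 1, 1, 1, 1, 0, 0, 0, 0]
routed = route_bit_groups(bits, first_bit_offset=16, group_size=8,
                          port_table=port_table, frame_length=32)
```

Here the first group sits at offset 16 and goes to the port named `alu0`, while the second group at offset 24 goes to a port that bypasses the ALUs, corresponding to an output port that is not connected to an ALU.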

The N consecutive bits may correspond to one or more target output ports among the plurality of output ports of the bit interleaving unit 112, and each of the one or more target output ports may be connected to one ALU 114 of the at least one ALU 114, to another unit in the PE 110, or directly to the output of the PE 110. Correspondingly, at least one target output port of the one or more target output ports may be in one-to-one correspondence with at least one target ALU 114 of the at least one ALU 114, wherein the at least one target output port may be some or all of the one or more target output ports, and the at least one target ALU 114 may be some or all of the at least one ALU 114, which is not limited in the embodiments of the present invention.

The at least one bit group may include at least one first bit group, wherein the target output port of each first bit group corresponds to one target ALU 114 of the at least one ALU 114, and each first bit group is transmitted to its corresponding target ALU 114. Each target ALU 114 of the at least one target ALU 114 may receive one or more first bit groups transmitted by the bit interleaving unit 112 through the target output port corresponding to that target ALU 114, and may execute an instruction on each received first bit group to obtain an instruction execution result corresponding to that first bit group.

The instruction executed by each target ALU 114 of the at least one target ALU 114 may be determined by a compiler. Specifically, the compiler may generate a directed instruction graph according to the functions that the data processing device needs to implement, and map the directed instruction graph onto the data processing device, wherein each PE 110 of some or all of the plurality of PEs 110 may be used to implement one or more instructions, and each ALU 114 in a PE 110 may correspond to one instruction, for example an exclusive OR or an assignment, but the embodiments of the present invention are not limited thereto.
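A minimal sketch of one ALU per instruction might look as follows. The helper names (`make_xor_alu`, `make_assign_alu`, `execute_on_ports`) are hypothetical, and the reading of "assignment" as overwriting the bit group with a constant is an assumption made only for illustration.

```python
def make_xor_alu(operand):
    """Return an ALU that XORs each received bit group with a fixed operand."""
    def alu(group):
        return [bit ^ op for bit, op in zip(group, operand)]
    return alu

def make_assign_alu(value):
    """Return an ALU that overwrites the bit group with a constant
    (an assumed reading of the 'assignment' instruction)."""
    def alu(group):
        return list(value[:len(group)])
    return alu

def execute_on_ports(alus, routed_groups):
    """Apply the ALU bound to each output port to every group sent to that port."""
    return {port: [alus[port](group) for group in groups]
            for port, groups in routed_groups.items()}

# One ALU per output port, each configured with a single instruction.
alus = {
    0: make_xor_alu([1, 1, 1, 1]),     # inverts every bit of the group
    1: make_assign_alu([0, 0, 0, 0]),  # replaces the group with zeros
}
routed_groups = {0: [[1, 0, 1, 0]], 1: [[1, 1, 0, 1]]}
results = execute_on_ports(alus, routed_groups)
```

Binding a fixed instruction to each ALU at configuration time reflects the idea that the compiler, not a program counter, decides what each ALU executes.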

In the embodiments of the present invention, the plurality of PEs 110 may form an on-chip network and may be arranged in various topologies. For example, the plurality of PEs 110 may have a mesh structure as shown in FIG. 1, or may have a Clos structure or a butterfly structure; alternatively, some or all of the plurality of PEs 110 may be fully interconnected. For example, as shown in FIG. 2, the plurality of PEs 110 may include at least one first PE 110 and at least one second PE 110, wherein an output of each first PE 110 of the at least one first PE 110 is coupled to the inputs of all second PEs 110 of the at least one second PE 110, that is, any first PE 110 may be connected to any second PE 110, but the embodiments of the present invention are not limited thereto.

In the embodiments of the present invention, the frame header offset information of the N consecutive bits may include an offset value of the first bit of the N consecutive bits relative to the frame header position of the frame to which the N consecutive bits belong, or a frame header offset value of each bit group of the at least one bit group composed of the N consecutive bits, that is, an offset value of the first bit in each bit group relative to the frame header position of the frame to which that bit group belongs, but the embodiments of the present invention are not limited thereto.
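The two forms of offset information are related by simple arithmetic, sketched below. The function name `group_header_offsets` is hypothetical, and wrapping the offset to zero at the end of a frame (so that the following bits count from the next frame header) is an assumption of this sketch.

```python
def group_header_offsets(first_bit_offset, group_sizes, frame_length):
    """Derive the frame header offset of each bit group from the offset of
    the first of the N consecutive bits.

    group_sizes lists the number of bits in each group, in order.
    Offsets wrap at frame_length (assumed: the next frame starts immediately).
    """
    offsets = []
    position = first_bit_offset
    for size in group_sizes:
        offsets.append(position % frame_length)
        position += size
    return offsets

# 16 consecutive bits starting 56 bits into a hypothetical 64-bit frame,
# split into groups of 8, 4, and 4 bits.
offsets = group_header_offsets(first_bit_offset=56, group_sizes=[8, 4, 4],
                               frame_length=64)
```

In this example the first group starts at offset 56, and the remaining two groups fall at offsets 0 and 4 of the following frame.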

The frame in the embodiments of the present invention may specifically be a Synchronous Transport Module (STM) frame in a Synchronous Digital Hierarchy (SDH), a GPON Transmission Convergence (GTC) frame in a Gigabit Passive Optical Network (GPON), a 66-bit block, or the like, which is not limited in the embodiments of the present invention.

The bit interleaving unit 112 can determine the current frame header offset information of the N consecutive bits in a variety of ways. In one way, the current frame header offset information is received by the bit interleaving unit 112. For example, the input end of the PE 110 to which the bit interleaving unit 112 belongs may be connected to the output end of another component in the data processing device, such as another PE 110 or an input unit. In this case, the bit interleaving unit 112 can receive the frame header offset information transmitted by the other component and determine the received frame header offset information as the current frame header offset information of the N consecutive bits, wherein the frame header offset information of the plurality of consecutive bits may be transmitted in the same clock cycle as the plurality of consecutive bits. In another way, the current frame header offset information of the N consecutive bits is generated locally by the bit interleaving unit 112. For example, the bit interleaving unit 112 may determine, according to the input ports of the N consecutive bits, at least one local slot position corresponding to the N consecutive bits, and determine the current frame header offset information of the N consecutive bits according to the frame header offset information of the corresponding at least one slot position, but the embodiments of the present invention are not limited thereto.

As an optional embodiment, as shown in FIG. 3, the data processing device further includes an input unit 120. An input port of the input unit 120 can be connected to an input end of the data processing device, and an output end of the input unit 120 can be connected to the input ends of one or more input PEs 110 of the plurality of PEs 110. The input unit 120 can be configured to perform framing processing on a parallel bit stream to determine a frame header position of the parallel bit stream. The parallel bit stream may include one or more frames, and correspondingly the input unit 120 may determine one or more frame header positions of the parallel bit stream. Further, the input unit 120 may transmit the parallel bit stream, in units of a plurality of consecutive bits (for example, N consecutive bits), to at least one third PE 110 connected to the input unit 120, where the at least one third PE 110 may be some or all of the one or more input PEs 110 and may be predetermined by a compiler, which is not limited in the embodiments of the present invention.

In addition, when the input unit 120 transmits the at least one consecutive bit to the third PE 110, it may determine first frame header offset information (that is, initial frame header offset information) of the at least one consecutive bit according to the frame header position of the frame to which the at least one consecutive bit belongs, and transmit the first frame header offset information to the third PE 110. Optionally, the input unit 120 can transmit the at least one consecutive bit and its first frame header offset information to the third PE 110 in the same clock cycle. As an optional embodiment, the bit interleaving unit 112 of the third PE 110 may process the at least one consecutive bit to generate the N consecutive bits and determine the current frame header offset information (that is, local frame header offset information) of the N consecutive bits. For example, the bit interleaving unit 112 of the third PE 110 may perform bit stuffing on the at least one consecutive bit to generate the N consecutive bits, and determine the current frame header offset information of the N consecutive bits according to the position of the at least one consecutive bit within the N consecutive bits after bit stuffing and/or the first frame header offset information of the at least one consecutive bit, but the embodiments of the present invention are not limited thereto.

As another optional embodiment, the input unit 120 may be specifically configured to transmit N consecutive bits and first frame header offset information of the N consecutive bits to a third PE 110 connected to it. Correspondingly, the bit interleaving unit 112 of the third PE 110 may be further configured to receive the N consecutive bits and their first frame header offset information transmitted by the input unit 120, and to determine the received first frame header offset information as the current frame header offset information of the N consecutive bits.
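The framing step and the hand-off of bits together with their first frame header offset can be sketched as follows. The text does not specify how the frame header is located, so this sketch assumes a fixed alignment word at the start of each frame and uses the SDH-style byte pair 0xF6 0x28 purely as an illustrative pattern; the function names are hypothetical.

```python
def to_bits(data):
    """Expand bytes into a list of bits, most significant bit first."""
    return [(byte >> shift) & 1 for byte in data for shift in range(7, -1, -1)]

def find_frame_headers(bits, pattern):
    """Return every index at which the alignment pattern starts in the stream."""
    width = len(pattern)
    return [i for i in range(len(bits) - width + 1) if bits[i:i + width] == pattern]

def words_with_offsets(bits, header_index, width):
    """Cut the stream into `width`-bit words starting at the frame header and
    tag each word with the offset of its first bit from that header."""
    return [(bits[i:i + width], i - header_index)
            for i in range(header_index, len(bits) - width + 1, width)]

# Assumed alignment word; real framing rules depend on the frame format in use.
ALIGNMENT = to_bits(bytes([0xF6, 0x28]))
stream = to_bits(bytes([0x00, 0xF6, 0x28, 0x55, 0xAA]))
headers = find_frame_headers(stream, ALIGNMENT)
words = words_with_offsets(stream, headers[0], 8)
```

Each `(word, offset)` pair stands in for the N consecutive bits and the first frame header offset information that the input unit would send to a PE in the same clock cycle.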

Optionally, the input unit 120 may further receive the serial bit stream before performing the framing processing on the parallel bit stream, and perform serial-to-parallel conversion processing on the received serial bit stream to obtain the parallel bit stream. However, embodiments of the invention are not limited thereto.

FIG. 4 exemplarily shows the structure of the input unit 120, wherein the input unit 120 may include p1 first input/output (I/O) sub-units 121-1, ..., 121-p1, p1 framing sub-units 122-1, ..., 122-p1, and a first conversion sub-unit 123, where p1 ≥ 1. Specifically, the i-th (1 ≤ i ≤ p1) first I/O sub-unit 121-i may be configured to receive a first serial bit stream, perform serial-to-parallel conversion on the received first serial bit stream to obtain a first parallel bit stream, and transmit the first parallel bit stream to the i-th framing sub-unit 122-i connected thereto. The i-th framing sub-unit 122-i may perform framing processing on the first parallel bit stream transmitted by the i-th first I/O sub-unit 121-i to obtain at least one frame header position of the first parallel bit stream, and output the first parallel bit stream in units of at least one consecutive bit, where the number of the at least one consecutive bit may be the processing bit width, but the embodiment of the present invention is not limited thereto. The first conversion sub-unit 123 may receive the at least one consecutive bit transmitted by the i-th framing sub-unit 122-i and transmit the at least one consecutive bit to at least one third PE 110 of the at least one PE 110 connected to the input unit 120. It should be understood that the input unit 120 in the embodiment of the present invention may also have other structures. For example, the input unit 120 may not include the first conversion sub-unit 123, or may further include other sub-units, which is not limited by the embodiment of the present invention.
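The serial-to-parallel conversion and framing steps described for the input unit can be illustrated with a minimal Python sketch. It is not the patented circuit: the function names, the MSB-first bit order, the byte-aligned search, and the alignment pattern (three 0xF6 bytes followed by three 0x28 bytes) are all assumptions made for illustration, since the text does not specify them.

```python
from typing import List, Optional

# Assumed alignment pattern used to locate the frame header (not given in the text).
ALIGN_PATTERN = bytes([0xF6, 0xF6, 0xF6, 0x28, 0x28, 0x28])


def serial_to_parallel(bits: List[int], width: int) -> List[int]:
    """Pack a serial bit list (MSB first) into words of `width` bits.

    Trailing bits that do not fill a whole word are dropped.
    """
    words = []
    for start in range(0, len(bits) - width + 1, width):
        word = 0
        for bit in bits[start:start + width]:
            word = (word << 1) | (bit & 1)
        words.append(word)
    return words


def find_frame_header(data: bytes, pattern: bytes = ALIGN_PATTERN) -> Optional[int]:
    """Return the byte position of the first frame header, or None if absent.

    Simplification: the search is byte aligned; real framing works on bits.
    """
    pos = data.find(pattern)
    return pos if pos >= 0 else None
```

Once the header position is known, each unit of consecutive bits can be tagged with its distance from that position, which is the role of the initial frame header offset information.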

Optionally, as shown in FIG. 3, the data processing device may further include an output unit 130, wherein an input end of the output unit 130 is connected to an output end of one or more output PEs 110 of the plurality of PEs 110. The output unit 130 may be configured to receive a second parallel bit stream transmitted by at least one sixth PE 110 of the one or more output PEs 110, perform parallel-to-serial conversion on the second parallel bit stream to obtain a second serial bit stream, and output the second serial bit stream. The second parallel bit stream may include a plurality of consecutive bits, and the at least one sixth PE 110 may be some or all of the one or more output PEs 110, which is not limited by the embodiment of the present invention.

FIG. 5 exemplarily shows the structure of the output unit 130, wherein the output unit 130 may include a second conversion sub-unit 131 and p2 second I/O sub-units 132-1, ..., 132-p2, where p2 ≥ 1, and p2 may or may not be equal to p1. Specifically, the second conversion sub-unit 131 is configured to receive the second parallel bit stream sent by the at least one sixth PE 110 connected to the output unit 130, and transmit the second parallel bit stream to the i-th second I/O sub-unit 132-i of the p2 second I/O sub-units. The i-th second I/O sub-unit 132-i may receive the second parallel bit stream transmitted by the second conversion sub-unit 131, perform parallel-to-serial conversion on the second parallel bit stream to obtain a second serial bit stream corresponding to the second parallel bit stream, and output the second serial bit stream. It should be understood that the output unit 130 in the embodiment of the present invention may also have other structures. For example, the output unit 130 may not include the second conversion sub-unit 131, or may further include other sub-units, which is not limited by the embodiment of the present invention.
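The parallel-to-serial conversion performed by a second I/O sub-unit can be sketched in a few lines of Python. The function name and the MSB-first bit order are assumptions for illustration only.

```python
from typing import List


def parallel_to_serial(words: List[int], width: int) -> List[int]:
    """Unpack words of `width` bits into a serial bit list, MSB first."""
    bits = []
    for word in words:
        for shift in range(width - 1, -1, -1):
            bits.append((word >> shift) & 1)
    return bits
```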

As another optional embodiment, the plurality of PEs 110 may include at least one fourth PE 110 and a fifth PE 110, wherein an output end of each fourth PE 110 of the at least one fourth PE 110 is connected to an input end of the fifth PE 110. In this case, a fourth PE 110 of the at least one fourth PE 110 may execute a first instruction on the received at least one consecutive bit to obtain L consecutive bits, L ≥ 1, transmit the L consecutive bits to the fifth PE 110, and optionally transmit second frame header offset information of the L consecutive bits. The second frame header offset information may be received by the fourth PE 110, for example, from a PE 110 or an input unit connected to the input end of the fourth PE 110, or may be generated locally by the fourth PE 110, which is not limited by the embodiment of the present invention.

The bit interleaving unit 112 of the fifth PE 110 may perform mapping multiplexing processing on the N consecutive bits after receiving the N consecutive bits transmitted by the at least one fourth PE 110. Optionally, if the bit interleaving unit 112 receives second frame header offset information of the N consecutive bits, it may terminate the second frame header offset information and determine local frame header offset information (that is, current frame header offset information) of the N consecutive bits, but the embodiment of the present invention is not limited thereto.

Specifically, the bit interleaving unit 112 of the fifth PE 110 may determine a plurality of local slot positions, determine, according to at least one input port corresponding to the N consecutive bits, at least one slot position corresponding to the N consecutive bits among the plurality of slot positions, and determine the current frame header offset information of the N consecutive bits according to the frame header offset information of the at least one slot position.

The plurality of slot positions may be basic bit units local to the fifth PE. Optionally, the fifth PE may store a correspondence between input ports and preset slot positions. Correspondingly, the bit interleaving unit may determine, according to the correspondence and the at least one input port corresponding to the N consecutive bits, the at least one slot position corresponding to the N consecutive bits among the plurality of slot positions, but the embodiment of the present invention is not limited thereto. Optionally, the bit interleaving unit may determine the frame header offset value of each slot position of the at least one slot position as the frame header offset value of the at least one consecutive bit, among the N consecutive bits, that corresponds to that slot position, but the embodiment of the present invention is not limited thereto.
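As a rough illustration of how a local frame header offset could be derived from the input port, the following Python sketch looks up the slot position for each input port and then takes that slot's offset value. Both tables and the function name are hypothetical; the text only states that such a correspondence may be stored in the fifth PE.

```python
from typing import Dict, Iterable

# Hypothetical correspondence: input ports 0-7 feed slot 0, ports 8-15 feed slot 1.
PORT_TO_SLOT: Dict[int, int] = {port: 0 for port in range(0, 8)}
PORT_TO_SLOT.update({port: 1 for port in range(8, 16)})

# Hypothetical frame header offset value of each local slot position.
SLOT_HEADER_OFFSET: Dict[int, int] = {0: 0, 1: 1, 2: 2, 3: 3}


def current_offsets(input_ports: Iterable[int]) -> Dict[int, int]:
    """Return {slot position: local frame header offset} for the given ports."""
    slots = sorted({PORT_TO_SLOT[port] for port in input_ports})
    return {slot: SLOT_HEADER_OFFSET[slot] for slot in slots}
```

The bits arriving on a port then inherit the offset of the slot they map to, regardless of any offset information received from the upstream PE.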

When the at least one target ALU 114 of the fifth PE 110 receives the at least one first bit group transmitted by the bit interleaving unit 112 of the fifth PE 110, each target ALU 114 of the at least one target ALU 114 may execute a second instruction on the received first bit group to obtain a second instruction execution result. Thus, the fifth PE 110 further processes the instruction execution result of the fourth PE 110 as an operand, thereby forming a processing pipeline.

In the embodiment of the present invention, a compiler may convert the processing program into a directed instruction flow graph with data dependencies, and map the instructions and related information to the corresponding PEs 110 according to the available PE 110 resources. At the same time, according to the instruction corresponding to each PE 110 node, the format information of the bit stream to be processed by that PE 110 node is also mapped, finally forming a processing pipeline for processing the bit stream. Specifically, after the original bit stream is processed by the input unit 120, each PE 110 node receives the bit stream processed by the previous-stage PE 110 node, executes its local instruction, and sends the execution result to the next-stage PE 110 node, until processing of the data stream is completed and the result is output by the output unit 130.
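The pipeline idea can be sketched in Python as a chain of stages, where each stage stands in for one PE node executing its local instruction on the output of the previous node. This is a simplification under stated assumptions: a linear chain replaces the directed instruction flow graph, and the XOR stage and all names are hypothetical.

```python
from typing import Callable, List

# A stage models one PE node: it consumes a block of bytes and produces another.
Stage = Callable[[bytes], bytes]


def xor_stage(mask: int) -> Stage:
    """Build a stage that XORs every byte with a fixed mask (illustrative only)."""
    def run(data: bytes) -> bytes:
        return bytes(b ^ mask for b in data)
    return run


def run_pipeline(stages: List[Stage], data: bytes) -> bytes:
    """Pass the data through each stage in order, like PE nodes in series."""
    for stage in stages:
        data = stage(data)
    return data
```

For example, `run_pipeline([xor_stage(0xFF), xor_stage(0xFF)], data)` returns `data` unchanged, because the second stage undoes the first.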

After determining the current frame header offset information of the N consecutive bits, the bit interleaving unit 112 may determine, according to that information, the target output port of each bit group of the at least one bit group consisting of the N consecutive bits. As an optional embodiment, the data processing device stores a correspondence between preset frame header offset values and output ports. Specifically, the data processing device may include a first storage unit configured to store the correspondence. The first storage unit may be deployed independently of the plurality of PEs 110, or may be deployed in at least one PE 110 of the plurality of PEs 110. For example, each PE 110 of the plurality of PEs 110 may include a first storage unit, that is, each PE 110 may store a correspondence between preset frame header offset values and output ports, where the correspondences stored by different PEs 110 may be the same or different and may be pre-configured by the compiler, but the embodiment of the present invention is not limited thereto.

In this case, the bit interleaving unit 112 may determine the target output port of each bit group of the at least one bit group consisting of the N consecutive bits according to the current frame header offset information of the N consecutive bits and the stored correspondence between the preset frame header offset value and the output port.

Specifically, the bit interleaving unit 112 may send the current frame header offset information of the N consecutive bits to the storage unit, which exists independently or is deployed in the PE 110, and receive from the storage unit information about the target output port of each bit group, as determined by the storage unit according to the current frame header offset information of the N consecutive bits. Alternatively, the bit interleaving unit 112 may acquire the correspondence, determine the current frame header offset value of each bit group of the at least one bit group consisting of the N consecutive bits according to the current frame header offset information of the N consecutive bits, and determine the target output port of each bit group by looking up the current frame header offset value of that bit group in the acquired correspondence. Optionally, each bit group of the at least one bit group includes M consecutive bits, where M is an integer greater than or equal to 1 and N is an integer multiple of M. For example, M = 8, that is, each bit group includes one byte, but the embodiment of the present invention is not limited thereto.

As another optional embodiment, the preset frame header offset value is in units of M bits, 1 ≤ M ≤ N. In this case, the bit interleaving unit 112 is specifically configured to:

determine, according to the frame header offset information of the N consecutive bits, a frame header offset value of each bit group of the at least one bit group, wherein each bit group of the at least one bit group includes M consecutive bits;

determine, in the correspondence between the preset frame header offset value and the output port, the output port corresponding to the frame header offset value of each bit group; and

determine the corresponding output port as the target output port of each bit group.
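The three steps above amount to a table lookup per bit group. A minimal Python sketch follows, assuming M = 8 (one byte per bit group) and a dictionary as the stored correspondence; the function name, the port labels, and the table contents are hypothetical.

```python
from typing import Dict, List


def route_bit_groups(data: bytes, first_offset: int,
                     table: Dict[int, str]) -> Dict[str, List[int]]:
    """Group the bytes of `data` by target output port.

    `first_offset` is the frame header offset (in bytes) of data[0];
    `table` maps a preset frame header offset value to an output port.
    """
    routed: Dict[str, List[int]] = {}
    for index, byte in enumerate(data):
        offset = first_offset + index          # step 1: offset of this bit group
        port = table[offset]                   # step 2: look up the output port
        routed.setdefault(port, []).append(byte)  # step 3: use it as the target
    return routed
```

With a table that maps offsets 0 and 1 to one ALU port and offsets 2 and 3 to another, a 4-byte input is split into two 2-byte groups, one per port.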

As another optional embodiment, as shown in FIG. 6, each PE 110 of the plurality of PEs 110 further includes a conversion unit 116, wherein an input end of the conversion unit 116 is connected to at least one output port of the bit interleaving unit 112, and an output end of the conversion unit 116 is connected to the output end of the PE 110. In this case, some of the plurality of output ports of the bit interleaving unit 112 may correspond to the at least one ALU 114, and others may correspond to the conversion unit 116.

The bit interleaving unit 112 is further configured to: when determining that the target output port of at least one second bit group of the at least one bit group corresponds to the conversion unit 116, transmit the at least one second bit group to the conversion unit 116 through the target output port corresponding to the conversion unit 116.

Correspondingly, the conversion unit 116 is configured to transmit the received at least one second bit group to the output end of the PE 110 to which the conversion unit 116 belongs.

Optionally, as shown in FIG. 6, the correspondence between the preset frame header offset value and the output port may be stored in the PE in the form of a format information table. Table 1 shows an example of a format information table, in which it is assumed that the PE includes a bit interleaving unit, a conversion unit, and three ALUs, namely ALU 1, ALU 2, and ALU 3. Correspondingly, the bit interleaving unit has four output ports: a conversion unit port corresponding to the conversion unit, an ALU 1 port corresponding to the ALU 1, an ALU 2 port corresponding to the ALU 2, and an ALU 3 port corresponding to the ALU 3. The frame header offset value in the format information table is in units of bytes. Accordingly, the bit interleaving unit 112 may determine the target output port of each byte included in the N consecutive bits according to the format information table, but the embodiment of the present invention is not limited thereto.

Table 1 format information example

After receiving the one or more first bit groups transmitted by the bit interleaving unit 112, the target ALU 114 may execute an instruction on the one or more first bit groups to obtain an instruction execution result. The instruction may be pre-configured in the target ALU 114 by the compiler or obtained by the target ALU 114 from a second storage unit of the data processing device. The instruction parameters required by the target ALU 114 when executing the instruction may be obtained by the target ALU 114 from the second storage unit of the data processing device, where the first storage unit and the second storage unit may be the same or different, and the second storage unit may be deployed in some or all of the plurality of PEs 110, which is not limited by the embodiment of the present invention.

As an optional embodiment, the bit interleaving unit 112 may determine, according to the frame header offset information of the N consecutive bits, the instruction parameter or instruction parameter information (for example, the storage address of the instruction parameter) corresponding to the first bit group transmitted to each target ALU 114 of the at least one target ALU 114, and transmit that instruction parameter or instruction parameter information to the target ALU 114 together with the first bit group. For example, Table 1 may further include a correspondence between the preset frame header offset value and the instruction parameter. Alternatively, the correspondence between the preset frame header offset value and the instruction parameter and the correspondence between the preset frame header offset value and the output port may be stored in different tables, which is not limited in the embodiment of the present invention.

As an optional embodiment, the bit interleaving unit 112 is further configured to obtain a correspondence between preset frame header offset information and instruction parameter storage addresses. In this case, the bit interleaving unit 112 may be further configured to determine, according to the current frame header offset information of the N consecutive bits, the instruction parameter storage address of each first bit group of the at least one first bit group, and transmit indication information to each target ALU 114 of the at least one target ALU 114 through the target output port corresponding to that target ALU 114, the indication information being used to indicate the instruction parameter storage address of the first bit group received by that target ALU 114.

Correspondingly, each target ALU 114 of the at least one target ALU 114 is further configured to, before executing the instruction on the received first bit group, acquire the instruction parameter from the instruction parameter storage address indicated by the indication information sent by the bit interleaving unit 112, and execute the instruction on the received first bit group according to the acquired instruction parameter.
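The address-based handoff between the bit interleaving unit and a target ALU can be sketched as two lookups: offset to storage address, then address to parameter. In the Python sketch below, the memory contents, the addresses, the choice of XOR as the instruction, and the function names are all hypothetical and serve only to make the flow concrete.

```python
from typing import Dict

# Hypothetical instruction parameter memory: storage address -> parameter value.
PARAM_MEMORY: Dict[int, int] = {0x10: 0xAA, 0x11: 0x55}

# Hypothetical correspondence: frame header offset value -> storage address.
OFFSET_TO_PARAM_ADDR: Dict[int, int] = {6: 0x10, 7: 0x11}


def interleaver_indication(offset: int) -> int:
    """Bit interleaving unit side: derive the indicated storage address."""
    return OFFSET_TO_PARAM_ADDR[offset]


def alu_execute(bit_group: int, param_addr: int) -> int:
    """ALU side: fetch the parameter at the indicated address, then execute.

    XOR is used as the example instruction (an assumption).
    """
    param = PARAM_MEMORY[param_addr]
    return bit_group ^ param
```

Passing an address instead of the parameter itself keeps the information sent with each bit group small, at the cost of one memory read in the ALU.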

As another optional embodiment, if the output end of the PE 110 to which the bit interleaving unit belongs is connected to the input end of another PE 110, the bit interleaving unit may output the current frame header offset information of the N consecutive bits through the output end of the PE to which it belongs, so that the other PE 110 continues processing, according to the current frame header offset information, the N consecutive bits output by the PE to which the bit interleaving unit belongs.

Specifically, the bit interleaving unit may output the current frame header offset information of the N consecutive bits to the output end of the PE through the conversion unit. Alternatively, the bit interleaving unit may have at least one output port connected to the output end of the PE and output the current frame header offset information of the N consecutive bits directly to the output end of the PE, but the embodiment of the present invention is not limited thereto.

Therefore, the data processing device of the embodiment of the present invention includes a plurality of processing elements, each of which includes a bit interleaving unit and at least one ALU. The bit interleaving unit is configured to determine, according to the frame header offset information of a plurality of consecutive bits, the target output port corresponding to each bit group of at least one bit group consisting of the plurality of consecutive bits, and output each bit group from the corresponding target output port. At least one target ALU of the at least one ALU is configured to receive at least one first bit group of the at least one bit group transmitted by the bit interleaving unit, and execute an instruction on the at least one first bit group to obtain an instruction execution result. This can improve the performance of bit stream processing, such as delay.

In addition, by processing the bit stream in a programmable manner, the data processing device provided by the embodiment of the present invention can implement L1-layer functions on the physical-layer bit stream, such as clock data recovery and synchronization, rate matching, mapping, multiplexing, framing, and FEC. This standardizes the hardware implementation, simplifies device implementation, and improves the flexibility and maintainability of the device. Moreover, introducing programmability into the data plane processing of the L1 layer lays the foundation for the white-box trend of the L1 layer.

The application of the data processing device provided by the embodiment of the present invention is specifically described below. FIG. 7 shows an OTN switch 200 provided by an embodiment of the present invention. The OTN switch 200 may include:

a first photoelectric conversion unit 210, a data processing device 220, and a second photoelectric conversion unit 230, wherein

The first photoelectric conversion unit 210 is configured to perform photoelectric conversion on an input optical signal to obtain a bit stream corresponding to the optical signal, and transmit the bit stream to the data processing device 220;

The data processing device 220 is configured to receive the bit stream transmitted by the first photoelectric conversion unit 210, process the bit stream to obtain a processed bit stream, and transmit the processed bit stream to the second photoelectric conversion unit 230;

The second photoelectric conversion unit 230 is configured to receive the processed bit stream transmitted by the data processing device 220, perform electro-optical conversion on the processed bit stream to obtain an optical signal corresponding to the processed bit stream, and output the optical signal.

The OTN switch may include one or more first photoelectric conversion units 210 and one or more second photoelectric conversion units 230. As shown in FIG. 7, the OTN switch may include k1 first photoelectric conversion units 210-1, ..., 210-k1 and k2 second photoelectric conversion units 230-1, ..., 230-k2, where k1 ≥ 1 and k2 ≥ 1, which is not limited by the embodiment of the present invention.

For the structure and working principle of the data processing device 220, refer to the description above; for brevity, details are not repeated here. The OTN switch in this embodiment does not require separate tributary, line, and cross-connect structures, and the bit stream processor alone can perform the main functions of the OTN switch.

FIGS. 8 and 9 exemplarily show, respectively, a system architecture and a processing flow for implementing signal multiplexing with the above data processing device. For convenience of description, in the present embodiment it is assumed that the data processing device is a bit stream processor applied to an Optical Transport Network (OTN) and used to multiplex the optical signals of two parallel Optical Transport Units (OTU) 1 onto an OTU 2, where the transmission rate of the OTU 1 is assumed to be 2.5 Gbps and the transmission rate of the OTU 2 is assumed to be 10 Gbps, but the embodiment of the present invention is not limited thereto.

As shown in FIG. 8, the bit stream processing system 300 includes a first optical/electrical conversion (O/E) unit 310, a second O/E unit 320, a bit stream processor 330, and a third O/E unit 340, wherein an input end of the bit stream processor 330 is connected to the output ends of the first O/E unit 310 and the second O/E unit 320, and an output end of the bit stream processor 330 is connected to an input end of the third O/E unit 340. Optionally, the bit stream processor 330 may have any of the structures described above (for example, the structure shown in FIG. 3); for brevity, only the portion related to the present embodiment is shown in FIG. 8.

As shown in FIG. 9, multiplexing of optical signals can be implemented by the following processes: photoelectric conversion, serial-to-parallel conversion, framing processing, descrambling processing, mapping multiplexing processing, framing processing, scrambling processing, parallel-to-serial conversion, and electro-optical conversion. Referring to FIG. 8, the plurality of PEs included in the bit stream processor 330 can be specifically used to implement functions such as descrambling, mapping multiplexing, framing, and scrambling.

Specifically, the first O/E unit 310 may be configured to perform photoelectric conversion on a first optical signal to obtain a first serial bit stream, and transmit the first serial bit stream to the bit stream processor 330. The second O/E unit 320 may be configured to perform photoelectric conversion on a second optical signal to obtain a second serial bit stream, and transmit the second serial bit stream to the bit stream processor 330.

The input unit 331 of the bit stream processor 330 may perform serial-to-parallel conversion on the received first serial bit stream to obtain a first parallel bit stream corresponding to the first serial bit stream, perform framing processing on the first parallel bit stream to obtain at least one frame header position in the first parallel bit stream, and transmit to the PE 332, in units of L consecutive bits, the first parallel bit stream together with the initial frame header offset information of each L consecutive bits, where L may be the processing bit width of the bit stream processor 330.

Similarly, the input unit 331 of the bit stream processor 330 may perform serial-to-parallel conversion on the received second serial bit stream to obtain a second parallel bit stream corresponding to the second serial bit stream, perform framing processing on the second parallel bit stream to obtain at least one frame header position in the second parallel bit stream, and transmit to the PE 333, in units of L consecutive bits, the second parallel bit stream together with the initial frame header offset information of each L consecutive bits.

For convenience of description, it is assumed in the following that L is 64, which corresponds to 8 bytes, but the processing bit width in the embodiment of the present invention may also take other values, which is not limited in the embodiment of the present invention.
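To make the beat-by-beat transfer concrete, the following Python sketch cuts a framed byte stream into beats of 8 bytes (L = 64) and pairs each beat with its initial frame header offset in bytes. The function name and the byte-aligned header position are assumptions for illustration.

```python
from typing import Iterator, Tuple


def beats_with_offsets(frame: bytes, header_pos: int = 0,
                       width_bytes: int = 8) -> Iterator[Tuple[int, bytes]]:
    """Yield (initial frame header offset, beat) pairs.

    The offset is the distance in bytes from the frame header at
    `header_pos` to the first byte of the beat. Incomplete trailing
    beats are not emitted.
    """
    for start in range(header_pos, len(frame) - width_bytes + 1, width_bytes):
        yield start - header_pos, frame[start:start + width_bytes]
```

With the header at position 0, the first beat carries offset 0 and the second beat carries offset 8, matching the offsets used for the first and second beats in this embodiment.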

The PE 332 and the PE 333 may be specifically configured to implement a descrambling function, that is, to execute an exclusive OR (XOR) instruction. Specifically, the PE 332 and the PE 333 may each have a format information table as shown in Table 1. Upon receiving 8 consecutive bytes and the initial frame header offset information of the 8 consecutive bytes from the input unit 331, the bit interleaving units in the PE 332 and the PE 333 may determine the received initial frame header offset information as the current frame header offset information of the 8 consecutive bytes, and determine the target output port corresponding to each of the 8 bytes according to Table 1.

Specifically, the bit interleaving unit of the PE 332 may receive the first-beat bit stream of a frame (that is, the first 8 bytes) and the initial frame header offset information of the first-beat bit stream, and determine the target output port corresponding to each of the 8 bytes (each byte corresponding to one bit group) formed by the received 64 consecutive bits (that is, N₁ = L = 64). The offset of the first-beat bit stream relative to the frame header is 0, so the frame header offset values of the first six bytes are 0 to 5 bytes, respectively. As shown in FIG. 10, the bit interleaving unit may, according to Table 1, transmit the first six bytes to the conversion unit through the output port corresponding to the conversion unit, and the conversion unit outputs the received bytes directly without operating on them. The frame header offset values of the last two bytes are 6 and 7 bytes, respectively, and the bit interleaving unit may, according to Table 1, transmit the last two bytes to the ALU 3 through the output port corresponding to the ALU 3. In addition, the bit interleaving unit may further determine, according to Table 1, the scrambling code matrix information (such as a scrambling code matrix value or the storage address of a scrambling code matrix value) corresponding to each of the last two bytes, and transmit it to the ALU 3 through the output port corresponding to the ALU 3, where the scrambling code matrix information corresponding to each byte may be the same or different. After receiving the last two bytes transmitted by the bit interleaving unit and their corresponding scrambling code matrix information, the ALU 3 may XOR each of the last two bytes with its corresponding scrambling code matrix value to obtain two consecutive XOR-processed bytes, and output the two consecutive XOR-processed bytes.
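The handling of the first beat can be sketched in Python as follows: bytes at frame header offsets 0 to 5 pass through unchanged (the conversion unit path), and bytes at other offsets are XORed with a per-offset value (the ALU path). Because Table 1 is not reproduced here, the scrambling code matrix values below are hypothetical placeholders, as are the names.

```python
from typing import Dict

# Per the example: bytes at header offsets 0-5 bypass the ALUs.
PASS_THROUGH_OFFSETS = range(0, 6)

# Hypothetical scrambling code matrix values per frame header offset.
SCRAMBLE_VALUES: Dict[int, int] = {6: 0xFF, 7: 0xFE}


def descramble_beat(beat: bytes, beat_offset: int) -> bytes:
    """Descramble one beat whose first byte sits at `beat_offset` bytes."""
    out = bytearray()
    for index, byte in enumerate(beat):
        offset = beat_offset + index
        if offset in PASS_THROUGH_OFFSETS:
            out.append(byte)                            # conversion unit: no operation
        else:
            out.append(byte ^ SCRAMBLE_VALUES[offset])  # ALU: XOR instruction
    return bytes(out)
```

Since XOR is its own inverse, applying `descramble_beat` twice to the same beat returns the original bytes, which is why the same kind of operation can later serve as the scrambling step.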

Similarly, the bit interleaving unit of the PE 333 can receive the second-beat bit stream of the frame (that is, the 9th to 16th bytes) and the initial frame header offset information of the second-beat bit stream, and determine the target output port corresponding to each of the 8 bytes (each byte corresponding to one bit group) formed by the received 64 consecutive bits (that is, N₂ = L = 64). The offset of the second-beat bit stream relative to the frame header is 8 bytes. In this case, as shown in FIG. 11, the bit interleaving unit of the PE 333 can, according to Table 1, transmit the first and second bytes of the second-beat bit stream to the ALU 0 through the output port corresponding to the ALU 0, the third and fourth bytes to the ALU 1 through the output port corresponding to the ALU 1, the fifth and sixth bytes to the ALU 2 through the output port corresponding to the ALU 2, and the seventh and eighth bytes to the ALU 3 through the output port corresponding to the ALU 3. Optionally, the bit interleaving unit may further transmit the scrambling code matrix information corresponding to each byte to the corresponding ALU. Each ALU may XOR each of its two received bytes with the scrambling code matrix value corresponding to that byte, and output the result of the XOR operation.

In addition, the bit interleaving units of the PE 332 and the PE 333 may also output the current frame header offset information (that is, the initial frame header offset information) of the 64 consecutive bits to the output end of the PE to which they belong, so that the current frame header offset information is transmitted to the PE 335.

The 8 bytes (64 consecutive bits) output by the PE 332 and their corresponding current frame header offset information can be transmitted to the PE 335, and the 8 bytes (64 consecutive bits) output by the PE 333 and their corresponding current frame header offset information can be transmitted to the PE 335 via the PE 334. The PE 335 can be specifically configured to implement a mapping multiplexing function. The PE 335 can terminate the received frame header offset information, generate local frame header offset information, and determine a target output port for each byte based on the local frame header offset information.

Specifically, the PE 335 may have the format information table shown in Table 2. Since the transmission rate of the OTU 2 is 4 times that of the OTU 1, each beat of the bit stream on the OTU 2 side includes 4 slot positions, and each slot position can accommodate at least 8 bytes. Correspondingly, the bit interleaving unit in the PE 335 can determine, according to Table 2, the bits and the target output port corresponding to each of the four slot positions. For the first slot position, the frame header offset value is 0. As can be seen from Table 2, the first slot position corresponds to input ports 0 to 7 and the ALU 0 port. If input ports 0 to 7 of the PE 335 correspond to the PE 332, the bit interleaving unit can transmit the 8 bytes transmitted by the PE 332 to the ALU 0 through the output port corresponding to the ALU 0, and the ALU 0 can assign the received 8 bytes to the first slot position. For the second slot position, the frame header offset value is 1. As can be seen from Table 2, the second slot position corresponds to input ports 8 to 15 and the ALU 1 port. If input ports 8 to 15 of the PE 335 correspond to the PE 334 or the PE 333, the bit interleaving unit can transmit the 8 bytes transmitted by the PE 333 to the ALU 1 through the output port corresponding to the ALU 1, and the ALU 1 can assign the received 8 bytes to the second slot position. For the third slot position, the frame header offset value is 2. As can be seen from Table 2, the third slot position has no corresponding input, so the bit interleaving unit may transmit a feature code indicating no input to the ALU 2 through the output port corresponding to the ALU 2. The ALU 2 can determine, according to the feature code, that the third slot position has no corresponding input, and fill the third slot position with 8 bytes, where the 8 padding bytes may be generated locally by the ALU 2 or acquired by the ALU 2 from the instruction parameter memory. For the fourth slot position, the frame header offset value is 3. As can be seen from Table 2, the fourth slot position also has no corresponding input, so the bit interleaving unit may transmit the feature code to the ALU 3 through the output port corresponding to the ALU 3, and the ALU 3 may, similarly to the ALU 2, fill the fourth slot position with 64 bits.
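The slot assignment just described can be sketched in Python as a four-entry multiplexer: two slots take the 8 bytes from the two upstream PEs, and the two slots without input are filled. The marker used for "no input", the fill values, and the function name are hypothetical; the text leaves the concrete feature code and padding open.

```python
from typing import Dict, List, Optional

# Stand-in for the feature code that signals "no input" (an assumption).
NO_INPUT = None

# Hypothetical fill values for the slots that have no input.
FILL_VALUES: Dict[int, bytes] = {2: b"\x55" * 8, 3: b"\xAA" * 8}


def multiplex_beat(from_pe332: bytes, from_pe333: bytes) -> bytes:
    """Build one 32-byte output beat from four 8-byte slot positions."""
    slots: List[Optional[bytes]] = [from_pe332, from_pe333, NO_INPUT, NO_INPUT]
    out = bytearray()
    for slot, data in enumerate(slots):
        if data is NO_INPUT:
            data = FILL_VALUES[slot]   # ALU 2 / ALU 3: pad the empty slot
        out += data                    # ALU 0 / ALU 1: assign received bytes
    return bytes(out)
```

The resulting beat is 4 × 8 = 32 bytes, which reflects the 4:1 rate ratio between the OTU 2 side and each OTU 1 input.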

It should be understood that the feature code in the embodiment of the present invention can be distinguished from a bit stream carrying data, and is used to indicate that there is no data input. For example, the feature code may consist of multiple binary bits that are all set to 0, but the embodiment of the present invention does not limit its specific form.

The at least 32 bytes output by the PE 335 can be transmitted to the PE 336. The PE 336 can be specifically configured to perform overhead insertion to implement a framing function. The PE 337 can be specifically configured to perform operations similar to those of the PE 332 and the PE 333 to implement the scrambling function, and to transmit the obtained third parallel bit stream to the output unit 338.

The output unit 338 may perform parallel-to-serial conversion processing on the received third parallel bit stream to obtain a third serial data stream, and transmit the third serial data stream to the third O/E unit 340. The third O/E unit 340 may perform electro-optical conversion on the received third serial data stream to obtain a third optical signal, and output the third optical signal.

Table 2 Example of format information table

Frame header offset value O	Input port	Output port	Instruction data
0 to 15 / 4080 to 4095 / ... / 12240 to 12255	NULL	NULL	NULL
O mod 4 = 0	0 to 7	ALU 0 port	NULL
O mod 4 = 1	8 to 15	ALU 1 port	NULL
O mod 4 = 2	NULL	ALU 2 port	Fill value 1
O mod 4 = 3	NULL	ALU 3 port	Fill value 2
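The lookup described by the "O mod 4" rows of Table 2 can be illustrated with a minimal Python sketch. It is not the implementation of the embodiment: the names FORMAT_TABLE, route_slot and assemble_beat are hypothetical, the fill bytes 0x00 and 0xFF merely stand in for "fill value 1" and "fill value 2" (which Table 2 does not specify), and the overhead row of Table 2 is not modeled.

```python
from typing import Dict, Optional, Tuple

SLOT_BYTES = 8  # each slot position carries 8 bytes in the described example

# One entry per value of (frame header offset mod 4), mirroring Table 2.
# Tuple layout: (input port range or None, target ALU index, fill byte or None).
# The fill bytes are placeholders chosen for this sketch only.
FORMAT_TABLE: Dict[int, Tuple[Optional[range], int, Optional[int]]] = {
    0: (range(0, 8), 0, None),
    1: (range(8, 16), 1, None),
    2: (None, 2, 0x00),
    3: (None, 3, 0xFF),
}


def route_slot(offset: int, port_bytes: Dict[int, int]) -> Tuple[int, bytes]:
    """Return (ALU index, 8-byte slot content) for one slot position."""
    ports, alu, fill = FORMAT_TABLE[offset % 4]
    if ports is None:
        # No input for this slot: the ALU is told so and pads the slot itself.
        return alu, bytes([fill]) * SLOT_BYTES
    # Input present: forward one byte from each listed input port.
    return alu, bytes(port_bytes[p] for p in ports)


def assemble_beat(port_bytes: Dict[int, int]) -> bytes:
    """Build one 32-byte beat from slot positions with offsets 0 to 3."""
    return b"".join(route_slot(offset, port_bytes)[1] for offset in range(4))
```

For example, with one byte present on each of input ports 0 to 15, assemble_beat returns 32 bytes: the 8 bytes from ports 0 to 7, the 8 bytes from ports 8 to 15, and two padded slots.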

It should be understood that the foregoing embodiment describes the bit stream processing flow in the bit stream processor 330 by using FIG. 8 as an example. Alternatively, the bit stream processing flow in the bit stream processor 330 may also be as shown in FIG. The 8 bytes of the output of the PE 333 are transmitted to the PE 335 through the PE 332, but the embodiment of the present invention is not limited thereto.

The bit stream processing system 300 in the above embodiment can be applied to the transmitting end. The bit stream processor provided by the embodiment of the present invention can also be applied to a receiving end. The difference from the transmitting end is that the receiving end demultiplexes one channel of signals into two channels of signals, and directly performs framing processing without performing descrambling processing.
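As a rough illustration of the receive-side demultiplexing, the following Python sketch splits one 32-byte beat back into two 8-byte tributary slots and discards the padded slots. It assumes that the receive side simply mirrors the slot layout of Table 2, which the description does not state explicitly; the function name demultiplex_beat is hypothetical.

```python
from typing import Tuple


def demultiplex_beat(beat: bytes, slot_bytes: int = 8) -> Tuple[bytes, bytes]:
    """Split one beat of four slot positions into two tributary slots.

    Assumption for this sketch: slot positions 0 and 1 carry the two
    tributaries and slot positions 2 and 3 carry padding only.
    """
    if len(beat) != 4 * slot_bytes:
        raise ValueError("a beat must contain exactly four slot positions")
    slots = [beat[i * slot_bytes:(i + 1) * slot_bytes] for i in range(4)]
    return slots[0], slots[1]
```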

The data processing device provided by the embodiment of the present invention can also be used to implement the cross function of OTN fixed particles (also referred to as OTN rigid pipes). For convenience of description, in this embodiment, it is assumed that the data processing device is a bit stream processor, wherein a plurality of intermediate PEs of the bit stream processor have a fully connected relationship. Optionally, the bit stream processor may have the structure shown in FIG. 2, but the embodiment of the invention is not limited thereto. Specifically, the multiple PEs may demultiplex the bit stream into to-be-crossed particles of the same size, read the value of the overhead position, and assign the value of the overhead position to implement the cross function. The specific process is similar to that of the above signal multiplexing embodiment and, for the sake of brevity, is not repeated here.

A plurality of the data processing devices provided by the embodiments of the present invention may also be combined in any manner to achieve more powerful service processing capabilities. FIG. 13 to FIG. 15 respectively illustrate possible ways of combining a plurality of data processing devices provided by embodiments of the present invention. The plurality of data processing devices in FIG. 13 are connected in series, the plurality of data processing devices in FIG. 14 are connected in parallel, and the plurality of data processing devices in FIG. 15 are distributed in a Mesh structure. The plurality of data processing devices may be independent of each other or may interact with each other to exchange certain information or data, and the embodiments of the present invention are not limited thereto.
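The series and parallel combinations can be pictured with a short Python sketch in which each data processing device is modeled as a function from a bit stream to a bit stream. This modeling, and the names Device, in_series and in_parallel, are assumptions made for illustration only; the Mesh arrangement and any information exchange between devices are not modeled.

```python
from typing import Callable, List, Sequence

# Illustrative model only: a device maps an input bit stream to an output one.
Device = Callable[[bytes], bytes]


def in_series(devices: Sequence[Device]) -> Device:
    """Chain devices so that each one processes the previous one's output."""
    def run(stream: bytes) -> bytes:
        for device in devices:
            stream = device(stream)
        return stream
    return run


def in_parallel(devices: Sequence[Device]) -> Callable[[Sequence[bytes]], List[bytes]]:
    """Let each device process its own independent bit stream."""
    def run(streams: Sequence[bytes]) -> List[bytes]:
        if len(streams) != len(devices):
            raise ValueError("one input stream is required per device")
        return [device(stream) for device, stream in zip(devices, streams)]
    return run
```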

Optionally, the multiple data processing devices in the embodiment of the present invention may also be deployed in any combination of the foregoing manners. For example, some of the data processing devices in the system are connected in a Mesh manner, and another plurality of data processing devices are connected in series or in parallel, and the embodiment of the invention is not limited thereto.

It should be noted that the examples of FIG. 8 to FIG. 15 are intended to help those skilled in the art to better understand the embodiments of the present invention, and not to limit the scope of the embodiments of the present invention. A person skilled in the art will be able to make various modifications and changes in accordance with the example of FIG. 2, and such modifications or variations are also within the scope of the embodiments of the present invention.

It should be understood that, in the embodiments of the present invention, the term "and/or" merely describes an association relationship between associated objects and indicates that three relationships may exist. For example, "A and/or B" may indicate three cases: A exists alone, A and B exist simultaneously, and B exists alone. In addition, the character "/" in this document generally indicates an "or" relationship between the objects before and after it.

Those skilled in the art will appreciate that the various method steps and elements described in connection with the embodiments disclosed herein can be implemented in electronic hardware, computer software, or a combination of both. In order to clearly illustrate the interchangeability of hardware and software, the steps and composition of the various embodiments have been generally described in terms of function in the foregoing description. Whether these functions are performed in hardware or software depends on the specific application and design constraints of the solution. Different methods may be used to implement the described functionality for each particular application, but such implementation should not be considered to be beyond the scope of the present invention.

A person skilled in the art can clearly understand that, for the convenience and brevity of the description, the specific working process of the system, the device and the unit described above can refer to the corresponding process in the foregoing method embodiment, and details are not described herein again.

In the several embodiments provided by the present application, it should be understood that the disclosed systems, devices, and methods may be implemented in other manners. For example, the device embodiments described above are merely illustrative. For example, the division of the units is only a logical function division, and there may be another division manner in actual implementation. For example, multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the mutual coupling or direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection through some interface, device or unit, or an electrical, mechanical or other form of connection.

The units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, that is, may be located in one place, or may be distributed to multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the embodiments of the present invention.

In addition, each functional unit in each embodiment of the present invention may be integrated into one processing element, or each unit may exist physically separately, or two or more units may be integrated into one unit. The above integrated unit can be implemented in the form of hardware or in the form of a software functional unit.

The integrated unit, if implemented in the form of a software functional unit and sold or used as a standalone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present invention in essence, or the part contributing to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The software product is stored in a storage medium and includes a number of instructions to cause a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or part of the steps of the methods described in various embodiments of the present invention. The foregoing storage medium includes various media that can store program code, such as a USB flash drive, a mobile hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk.

The above is only a specific embodiment of the present invention, but the scope of protection of the present invention is not limited thereto. Any equivalent modification or substitution that can be easily conceived by a person skilled in the art within the technical scope disclosed by the present invention is intended to be included within the scope of protection of the invention. Therefore, the scope of protection of the present invention should be determined by the scope of the claims.

Claims

A data processing apparatus, comprising: a plurality of processing elements, each of the plurality of processing elements comprising a bit interleaving unit and at least one arithmetic logic unit (ALU), the at least one ALU being in one-to-one correspondence with at least one output port of the bit interleaving unit, wherein

The bit interleaving unit is configured to determine, according to current frame header offset information of a plurality of consecutive bits, a target output port corresponding to each bit group of at least one bit group consisting of the plurality of consecutive bits, and output each bit group from the corresponding target output port, wherein each of the at least one bit group includes at least one consecutive bit of the plurality of consecutive bits;

At least one target ALU of the at least one ALU is configured to receive at least one first bit group of the at least one bit group transmitted by the bit interleaving unit, and execute an instruction on the at least one first bit group to obtain an instruction execution result, wherein the at least one target ALU corresponds to at least one target output port corresponding to the at least one first bit group.
The device according to claim 1, wherein the device stores a correspondence between a preset frame header offset value and an output port;

The bit interleaving unit is configured to determine the target output port of each of the at least one bit group according to the current frame header offset information of the plurality of consecutive bits and the correspondence between the preset frame header offset value and the output port.
The device according to claim 2, wherein the preset frame header offset value is in units of M bits, M ≥ 1, and the bit interleaving unit is specifically configured to:

Determining a frame header offset value of each of the at least one bit group according to current frame header offset information of the plurality of consecutive bits, wherein each of the at least one bit group includes M consecutive bits;

Determining, in the correspondence between the preset frame header offset value and the output port, an output port corresponding to the frame header offset value of each of the bit groups;

Determining the corresponding output port as the target output port of each of the bit groups.
The device according to claim 2 or 3, wherein each of the plurality of processing elements stores a correspondence between the preset frame header offset value and an output port.
The device according to any one of claims 1 to 4, wherein the device further stores a plurality of instruction parameters;

The bit interleaving unit is further configured to determine, according to current frame header offset information of the plurality of consecutive bits, an instruction parameter storage address of each of the at least one bit group, and transmit indication information to each target ALU, where the indication information is used to indicate the instruction parameter storage address of the first bit group received by each target ALU;

Each of the at least one target ALU is further configured to, before executing the instruction on the received first bit group, acquire an instruction parameter from the instruction parameter storage address indicated by the indication information sent by the bit interleaving unit, and execute the instruction on the received first bit group according to the acquired instruction parameter.
The device according to any one of claims 1 to 5, wherein the bit interleaving unit is further configured to output the current frame header offset information of the plurality of consecutive bits through an output end of the processing element to which the bit interleaving unit belongs.
The apparatus according to any one of claims 1 to 6, wherein each of the plurality of processing elements further comprises a conversion unit, wherein an input end of the conversion unit is connected to at least one output port of the bit interleaving unit, and an output end of the conversion unit is connected to an output end of the processing element to which the conversion unit belongs;

The bit interleaving unit is further configured to: when determining that a target output port of at least one second bit group of the at least one bit group corresponds to the conversion unit, transmit the at least one second bit group to the conversion unit;

The conversion unit is configured to transmit the received at least one second bit group to the output end of the processing element to which the conversion unit belongs.
The apparatus according to any one of claims 1 to 7, wherein the plurality of processing elements are in a Mesh structure.
The apparatus according to any one of claims 1 to 7, wherein the plurality of processing elements comprise at least one first processing element and at least one second processing element, wherein an output end of each of the at least one first processing element is connected to input ends of all of the at least one second processing element.
The device according to any one of claims 1 to 9, characterized in that the device further comprises an input unit, an output end of the input unit being connected to an input end of a third processing element of the plurality of processing elements, wherein

The input unit is configured to perform framing processing on a parallel bit stream to determine a frame header position of the parallel bit stream;

The input unit is further configured to send, to the third processing element, a plurality of consecutive bits in the parallel bitstream and first frame header offset information of the multiple consecutive bits;

The bit interleaving unit of the third processing element is specifically configured to receive the plurality of consecutive bits transmitted by the input unit and the first frame header offset information of the plurality of consecutive bits, and determine the received first frame header offset information of the plurality of consecutive bits as the current frame header offset information of the plurality of consecutive bits.
The apparatus according to any one of claims 1 to 10, wherein the plurality of processing elements comprise at least one fourth processing element and a fifth processing element, an output end of each of the at least one fourth processing element being connected to an input end of the fifth processing element, wherein

The bit interleaving unit of the fifth processing element is specifically configured to:

Receiving a plurality of consecutive bits transmitted by the at least one fourth processing element, wherein the plurality of consecutive bits are obtained by the at least one fourth processing element by processing at least one received consecutive bit;

Determining at least one slot position corresponding to the plurality of consecutive bits among the plurality of local slot positions according to the at least one input port corresponding to the plurality of consecutive bits;

Determining current frame header offset information of the plurality of consecutive bits according to the frame header offset information of the at least one slot position.
The apparatus according to any one of claims 1 to 11, wherein the current frame header offset information of the plurality of consecutive bits comprises an offset value of a first bit of the plurality of consecutive bits relative to the frame header of the frame to which the plurality of consecutive bits belong.
An optical transport network switch, comprising: a first photoelectric conversion unit, the data processing device according to any one of claims 1 to 12, and a second photoelectric conversion unit, wherein

The first photoelectric conversion unit is configured to perform photoelectric conversion processing on an input first optical signal to obtain a bit stream corresponding to the first optical signal, and transmit the bit stream to the data processing device;

The data processing device is configured to receive the bit stream transmitted by the first photoelectric conversion unit, process the bit stream to obtain a processed bit stream, and transmit the processed bit stream to the second photoelectric conversion unit;

The second photoelectric conversion unit is configured to receive the processed bit stream transmitted by the data processing device, perform electro-optical conversion on the processed bit stream to obtain a second optical signal corresponding to the processed bit stream, and output the second optical signal.