WO2023207952A1 - Data processing method and apparatus, chip, electronic device, and medium - Google Patents

Data processing method and apparatus, chip, electronic device, and medium

Info

Publication number
WO2023207952A1
Authority
WO
WIPO (PCT)
Prior art keywords
node
group
nodes
default
working
Prior art date
Application number
PCT/CN2023/090512
Other languages
English (en)
French (fr)
Inventor
冷祥纶
李冰
赵月新
王海生
Original Assignee
上海商汤智能科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 上海商汤智能科技有限公司
Publication of WO2023207952A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/16 Error detection or correction of the data by redundancy in hardware
    • G06F11/20 Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements
    • G06F11/2002 Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements where interconnections or communication control functionality are redundant
    • G06F11/202 Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements where processing functionality is redundant

Definitions

  • the present disclosure relates to the technical field of data processing devices, and in particular to data processing methods and devices, chips, electronic equipment, and media.
  • redundant logic can be added to improve the availability of the data processing device.
  • redundant logic in related technologies often requires major changes to the topology of each node.
  • an embodiment of the present disclosure provides a data processing device; the data processing device includes a node array, and the node array includes a plurality of node groups; for each of the plurality of node groups, adjacent nodes in the node group are connected, and each node in the node group is connected to multiple nodes in other node groups; the multiple nodes include default nodes and backup nodes, and the default node of a node is a backup node of at least one other node in the node group where the node is located.
  • when the default node of the first node in a node group is in a normal state, the connection between the first node and the corresponding default node is enabled; when the default node of the first node in a node group is in an abnormal state, the connections from the first node, and from at least one second node in the same node group as the first node, to their corresponding backup nodes are enabled.
  • each node in the node array includes a processing core and a router connected to the processing core.
  • the router of each node in one node group is used to connect the routers of multiple nodes of another node group; when both the router and the processing core of a node are in a normal state, the node is in a normal state; when at least one of the router and the processing core of a node is in an abnormal state, the node is in an abnormal state.
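The status rule above can be sketched in Python as follows. This is only an illustration: the `Node` class and its field names are hypothetical and do not appear in the disclosure.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """Hypothetical model of one node: a processing core plus its router."""
    core_ok: bool
    router_ok: bool

    @property
    def is_normal(self) -> bool:
        # A node is normal only when both its router and its core are normal;
        # a fault in either one puts the whole node in the abnormal state.
        return self.core_ok and self.router_ok
```

For example, `Node(core_ok=True, router_ok=False).is_normal` evaluates to `False`, because a faulty router alone is enough to make the node abnormal.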
  • each of the plurality of node groups includes at least one redundant node and working nodes other than the redundant node, the redundant node of one node group is a backup node of at least one working node of another node group, and the first node is a working node.
  • when each working node of a node group is in a normal state, the redundant node of the node group is disabled.
  • the backup node of each working node is the default node of the next working node in the group where the node is located;
  • the connections between all working nodes, from the working node to the last working node of the node group, and their corresponding backup nodes are respectively enabled, wherein the redundant node is the node following the last working node in the node group.
  • at least two redundant nodes are provided and are distributed at both ends of the node group; in response to determining that the default node of at least one working node in the node group is in an abnormal state,
  • the connections between all working nodes, from the working node to the working node preceding the target redundant node of the working node, and their corresponding backup nodes are enabled, wherein the number of nodes in an abnormal state in any node group is less than or equal to the number of redundant nodes in the node group, the target redundant node and the at least one working node are in the same node group, and, for each of the at least one working node, the default node of every working node between the working node following the working node and the working node preceding the target redundant node is in a normal state.
  • at least two redundant nodes are provided and are distributed at one end of the node group; in response to determining that the default node of at least one working node in the node group is in an abnormal state, for each of the at least one working node, all working nodes from the working node to the working node preceding the redundant node of the node group are determined.
  • the connection between the node and the corresponding backup node is enabled, where the number of nodes in an abnormal state in any node group is less than or equal to the number of redundant nodes in the node group; for each third node whose connection to the corresponding backup node is enabled, the default node of the third node is not adjacent to the backup node of the third node.
  • the backup node of each working node in the node array is the default node of the next working node in the node group where the working node is located.
  • the backup node of each working node in the node array is the default node of the working node set apart from the working node in the node group where the working node is located.
  • the redundant node of each node group in the plurality of node groups includes the Nth node of the node group, and the jth node of the i-th node group is the i-1th node group.
  • the j-th node and the i+1-th node group are the default nodes of the j-th node, and the j+1-th node of the i-th node group is the j-th node and the i-th node group
  • the connection between the v+1-th node of is enabled, and the connection between the v-th node of the i+1-th node group and the v+1-th node of the i-th node group is enabled; where
  • the redundant nodes of each node group in the plurality of node groups include the 1st node and the Nth node of the node group; the jth node of the i-th node group is the i-th node.
  • the default node of the j-th node of the -1 node group and the j-th node of the i+1-th node group, the j+1-th node of the i-th node group and the j-1 of the i-th node group node is the backup node of the j-th node in the i-1th node group, and the j+1th node in the i-th node group and the j-1th node in the i-th node group are the i+1th node.
  • the backup node of the j-th node of the i-th node group when the j-th node and the k-th node of the i-th node group are both in an abnormal state, the v-th node of the i-1 node group and the The connection between the v+1-th node of the i-th node group is enabled, and the connection between the v-th node of the i+1-th node group and the v+1-th node of the i-th node group is enabled, The connection between the u-th node of the i-1th node group and the u-1th node of the i-th node group is enabled, and the u-th node of the i+1th node group is connected to the i-th node group.
  • N is the total number of nodes in each node group in the plurality of node groups.
  • the redundant nodes of each node group in the plurality of node groups include the N-1th node and the Nth node of the node group; the jth node of the i-th node group is The default node of the jth node of the i-1th node group and the jth node of the i+1th node group, the j+1th node of the ith node group and the jth node of the ith node group
  • the +2 nodes are the backup nodes of the j-th node in the i-1 node group, and the j+1-th node in the i-th node group and the j+2-th node in the i-th node group are the i-th node.
  • the connection between the node and the v+2-th node of the i-th node group is enabled, and the connection between the v-th node of the i+1-th node group and the v+2-th node of the i-th node group is enabled.
  • the connection is enabled; where, 1 ⁇ j ⁇ N-1, j ⁇ v ⁇ N-1, v, i, j and N are all positive integers, and N is the number of nodes.
  • the redundant node of each node group in the plurality of node groups includes the Nth node of the node group, and the jth node of the i-th node group is the i-1th node group.
  • the j-th node and the i+1-th node group are the default nodes of the j-th node, and the j+1-th node of the i-th node group is the j-th node and the i-th node group
  • the connection between the v-th node of the i-th node group and the v+1-th node of the i-th node group is enabled, and the v-th node of the i+2-th node group is connected to the
  • the target nodes in each node group in the plurality of node groups are all bypassed, so that the number of abnormal nodes that are not bypassed in any node group is less than or equal to the number of redundant nodes in the node group; wherein the target nodes include a node in the abnormal state, and the target node in one node group is the default node of the target node in another node group.
  • the number of redundant nodes in a node group is determined based on at least one of the following conditions: the area of the data processing device, the probability that the node is in an abnormal state, and the number of nodes in the node array.
  • the connection between a node and the corresponding default node or backup node is enabled based on the corresponding preset identification information; wherein each default node and each backup node of a node corresponds to different preset identification information.
  • the data processing device further includes a control unit configured to: for each node in the node array, obtain the working status of the node; and set the preset identification information of the node based on the working status of the node.
  • the abnormal state includes a first abnormal state caused by a process defect; the data processing device further includes: a storage unit configured to store first location information of the node in the first abnormal state, so that the control unit sets the preset identification information of the node in the first abnormal state based on the first location information.
  • the abnormal state includes a second abnormal state caused by the working environment; the data processing device further includes: a detection unit for detecting, in real time during the operation of the data processing device, second location information of the node in the second abnormal state, so that the control unit sets the preset identification information of the node in the second abnormal state based on the second location information.
  • the control unit is further configured to: in the event that at least one node switches from a normal state to an abnormal state, pause the tasks currently executed by each node in the node array before setting the preset identification information of each node in the node array based on the working status of each node.
  • the output end of the node in each node group of the plurality of node groups is connected to a multiplexer, and the input end of the node in each node group is connected to a demultiplexer.
  • the multiplexer of a node is used to output the output signal of the node to the default node or backup node of the node through different channels;
  • the demultiplexer of a node is used to input to this node, through different channels, the output signals of other nodes.
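The output-side selection described above can be sketched as follows. The function name `route_output`, the integer `select` value (standing in for the preset identification information) and the convention that `targets[0]` is the default node are all assumptions made for illustration; the disclosure does not specify an encoding.

```python
def route_output(signal, select, targets):
    """Deliver `signal` to exactly one entry of `targets`.

    Assumptions: targets[0] is the default node, targets[1:] are the
    backup nodes, and `select` is the index of the enabled channel.
    """
    if not 0 <= select < len(targets):
        raise ValueError("select does not address any channel")
    # Only the selected channel carries the signal; the others stay idle.
    return {t: (signal if k == select else None) for k, t in enumerate(targets)}
```

With `targets = [(1, 1), (2, 1)]`, a `select` of 0 drives the default node (1,1) and a `select` of 1 drives the backup node (2,1).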
  • the data processing device further includes: multiple interfaces for connecting nodes of other data processing devices.
  • each node in the node array includes a bypass unit; when the node is in an abnormal state, the bypass unit bypasses the node so that the two nodes adjacent to the node in the node group where the node is located are directly connected.
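A minimal sketch of the effect of the bypass unit inside one node group, assuming the group is modelled as a list of booleans (`True` for a normal node). The helper name `chain_links` is hypothetical.

```python
def chain_links(group_status):
    """Return the links between consecutive non-bypassed nodes of one group.

    group_status[k] is True when node k of the group is normal.
    Abnormal nodes are bypassed, so their two neighbours link directly.
    """
    alive = [k for k, ok in enumerate(group_status) if ok]
    return list(zip(alive, alive[1:]))
```

For a group of four nodes in which the second node is abnormal, `chain_links([True, False, True, True])` returns `[(0, 2), (2, 3)]`: nodes 0 and 2 are connected directly across the bypassed node.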
  • an embodiment of the disclosure provides a chip including the data processing device described in any embodiment of the disclosure.
  • an embodiment of the present disclosure provides an electronic device, including the data processing device according to any embodiment of the present disclosure.
  • an embodiment of the disclosure provides a data processing method for adjusting the connection relationship of each node in the data processing device according to any embodiment of the first aspect of the disclosure; the method includes:
  • the status of the default node of each node includes normal status and abnormal status
  • when the default node of the first node in a node group is in an abnormal state, enable the connections from the first node, and from at least one second node in the same node group as the first node, to their corresponding backup nodes.
  • an embodiment of the present disclosure provides a data processing device for adjusting the connection relationship of each node in the data processing device according to any embodiment of the first aspect of the present disclosure; the device includes:
  • the acquisition module is used to obtain the status of the default node of each node, and the status includes normal status and abnormal status;
  • the adjustment module is used to adjust the connection relationship of multiple nodes in the node array based on the status of the default node of each node;
  • when the default node of the first node in a node group is in an abnormal state, enable the connections from the first node, and from at least one second node in the same node group as the first node, to their corresponding backup nodes.
  • the nodes of each node group can be connected to multiple nodes including the default node and the backup node.
  • the backup node can be used as the redundant node of the default node to implement redundant logic.
  • when the default node of the first node is in an abnormal state, the connections from the first node, and from at least one second node in the same node group as the first node, to their corresponding backup nodes are enabled, and the default node of each second node is another second node or a backup node of the first node. In this way, the overall topology of the node array stays as close as possible to the original topology, which reduces the changes to the node topology of the data processing device after redundant logic is added.
  • FIG. 1 is a schematic diagram of a data processing device according to an embodiment of the present disclosure.
  • FIG. 2A is a schematic diagram of a connection mode of nodes in a normal state according to an embodiment of the present disclosure.
  • FIG. 2B and FIG. 2C are respectively schematic diagrams of different connection modes of nodes in abnormal states according to embodiments of the present disclosure.
  • FIGS. 3 to 8 are respectively schematic diagrams of redundant logic in different situations according to embodiments of the present disclosure.
  • FIGS. 9 and 10 are respectively schematic diagrams of a node array including a processing core and a router according to an embodiment of the present disclosure.
  • FIG. 11 is a schematic diagram of the connection method of nodes in an abnormal state in the node array shown in FIG. 10 .
  • Figure 12 is a schematic diagram of a data processing device including an interface according to an embodiment of the present disclosure.
  • FIG. 13 is a schematic diagram of a data processing apparatus including a multiplexer and a demultiplexer according to an embodiment of the present disclosure.
  • Figure 14 is a schematic diagram when the number of nodes in an abnormal state is greater than the number of redundant nodes according to an embodiment of the present disclosure.
  • Figure 15 is a flowchart of a data processing method according to an embodiment of the present disclosure.
  • FIG. 16 is a block diagram of a data processing device according to another embodiment of the present disclosure.
  • although the terms first, second, third, etc. may be used in this disclosure to describe various information, the information should not be limited to these terms. These terms are only used to distinguish information of the same type from each other.
  • first information may also be called second information, and similarly, second information may also be called first information.
  • the word "if" as used herein may be interpreted as "when", "upon", or "in response to determining".
  • in a data processing device that includes multiple nodes, when one of the nodes is in an abnormal state, the entire data processing device may fail.
  • data processing devices are used in chips.
  • the chip area is getting larger and larger.
  • the larger the chip area the lower the yield.
  • yield refers to the ratio between the number of qualified chips and the total number of chips produced.
  • redundant logic is generally added to the chip. For example, in a chip including a node array of X rows and Y columns, an additional row or column of nodes can be added as a redundant node.
  • the redundant node does not work; but when there is a node in an abnormal state in the node array, the redundant node can be used to replace the node in the abnormal state.
  • redundant logic in related technologies often requires major changes to the topology of each node in the chip.
  • the present disclosure provides a data processing device.
  • the data processing device includes a node array 101, and the node array 101 includes a plurality of node groups 101a;
  • adjacent nodes in a node group 101a are connected, and each node in the node group 101a is connected to multiple nodes of other node groups 101a; the multiple nodes include default nodes and backup nodes, and the default node of a node is the backup node of at least one other node in the node group 101a where the node is located;
  • the node array 101 may include multiple nodes, and the nodes in each node group are used for data processing, and may also be called data processing nodes or processing nodes.
  • the multiple nodes may form an array with P rows and Q columns, where P and Q are both positive integers.
  • each block represents a node, and the values of P and Q are both 5.
  • P and Q can also take other values, and the values of P and Q do not need to be equal.
  • the numbers in each square represent the logical coordinates of the corresponding node in the node array 101. The first number in the brackets represents the row where the node is located, and the subsequent number represents the column where the node is located.
  • (0,0) represents the node in the first row and the first column of the node array 101
  • (0,1) represents the node in the first row and the second column of the node array 101.
  • the node is represented by its coordinates below.
  • a node with coordinates (0,0) is represented as node (0,0).
  • the naming of other functional units is similar to the naming of nodes.
  • the node array 101 may include multiple node groups 101a, and the node groups may also be referred to as groups for short.
  • a node group 101a may be a row or a column in the node array 101.
  • node (0,0), node (1,0), node (2,0), node (3,0) and node (4,0) may form one of the node groups 101a. Adjacent nodes in the same node group are connected. For example, node (0,0) is connected to node (1,0), node (2,0) is connected to node (1,0) and node (3,0) respectively.
  • the nodes in each node group 101a are used to connect multiple nodes of another (for example, adjacent) node group 101a. Take as an example the first node group, where the first column of the node array 101 is located, and the second node group, where the second column of the node array 101 is located.
  • the node (0,0) in the first node group is used to connect the node (0,1) and the node (1,1) in the second node group.
  • the node (1,0) is used to connect the node (0,1), the node (1,1) and the node (2,1) in the second node group.
  • the multiple nodes used to connect another node group 101a include default nodes and backup nodes, and the number of default nodes and backup nodes of the same node may be greater than or equal to 1.
  • the adjacent node in another node group that is in the same row as the node (that is, has the same row coordinate) can be used as the default node of the node, and the nodes in other node groups that are in different rows from the node (that is, have different row coordinates) and have a connection relationship with the node are used as the backup nodes of the node.
  • the default node of node (0,0) is node (0,1), and the default nodes of node (0,1) include node (0,0) and node (0,2);
  • the backup node of node (0,0) is node (1,1), and the backup nodes of node (0,1) include node (1,0) and node (1,2).
  • a node can also have a larger number of backup nodes.
  • the backup nodes of node (0,0) can include node (1,1) and node (2,1)
  • the backup nodes of node (0,1) can include node (1,0), node (2,0), node (1,2) and node (2,2). That is to say, the backup node of a node may include one or more nodes that are in the same node group as the default node of the node and whose row position is below the default node of the node.
  • a node's backup node may also include one or more nodes that are in the same node group as the node's default node and whose row position is above the node's default node.
  • the backup node of node (2,0) can be node (1,1), or include node (1,1) and node (0,1).
  • a node's backup nodes may include nodes that meet any of the above conditions.
  • backup nodes for node (2,0) may include node (1,1) and node (3,1), or backup nodes for node (2,0) may include node (0,1), node (1, 1) and node (3,1).
  • the above implementation shows a situation where the row distance between a node and the corresponding backup node is less than or equal to 2. In other embodiments, the row distance between a node and the corresponding backup node can also be greater than 2, which is not described further here.
  • the default node of a node can be the backup node of at least one other node in the same node group.
  • the default node of node (1,0) is node (1,1),
  • and node (1,1) can be the backup node of node (0,0) and node (2,0).
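The default-node and backup-node relationships for column-shaped node groups can be sketched as below. This is a hedged illustration, not the disclosed implementation: the function `neighbors` and its `below` and `above` parameters (how many rows under and over the default node are wired as backups) are hypothetical, and the sketch assumes a node only connects to the two adjacent columns.

```python
def neighbors(row, col, rows, cols, below=1, above=0):
    """Return (default_nodes, backup_nodes) of node (row, col).

    Assumes each node group is one column of a rows x cols array.
    """
    defaults, backups = [], []
    for c in (col - 1, col + 1):          # the two adjacent node groups
        if not 0 <= c < cols:
            continue
        defaults.append((row, c))         # same row: default node
        offsets = list(range(-above, 0)) + list(range(1, below + 1))
        for d in offsets:                 # other rows: backup nodes
            if 0 <= row + d < rows:
                backups.append((row + d, c))
    return defaults, backups
```

With the 5 x 5 array of the example, `neighbors(0, 1, 5, 5)` returns the default nodes (0,0) and (0,2) and the backup nodes (1,0) and (1,2), and `neighbors(2, 0, 5, 5, below=1, above=2)` returns the backup nodes (0,1), (1,1) and (3,1), matching the cases listed in the text.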
  • each node group 101a includes a column in the node array 101.
  • each node group 101a may also include a row in the node array 101.
  • node (2,0), node (2,1), node (2,2), node (2,3), and node (2,4) may form one of the node groups 101a. Adjacent nodes in the same node group are connected. For example, node (2,0) is connected to node (2,1), and node (2,2) is connected to node (2,1) and node (2,3) respectively. A solid line with an arrow indicates a connection relationship, and the arrow indicates the data flow direction.
  • the nodes in each node group 101a are used to connect multiple nodes of another node group 101a. Take as an example the third node group, where the first row of the node array 101 is located, and the fourth node group, where the second row of the node array 101 is located.
  • node (0,0) in the third node group is used to connect node (1,0) and node (1,1) in the fourth node group
  • node (0,1) in the third node group is used to connect node (1,0), node (1,1) and node (1,2) in the fourth node group.
  • the multiple nodes include default nodes and backup nodes, and the number of default nodes and backup nodes of the same node may be greater than or equal to 1.
  • the adjacent node in another node group that is in the same column as the node (that is, has the same column coordinate) can be used as the default node of the node,
  • and the nodes in other node groups that are in different columns from the node (that is, have different column coordinates) and have a connection relationship with the node are used as the backup nodes of the node.
  • the default node of node (0,0) is node (1,0), and the default nodes of node (1,0) include node (0,0) and node (2,0);
  • the backup node of node (0,0) is node (1,1), and the backup nodes of node (1,0) include node (0,1) and node (2,1).
  • a node can also have a larger number of backup nodes.
  • the backup nodes of node (0,0) can include node (1,1) and node (1,2)
  • the backup nodes of node (1,0) can include node (0,1), node (0,2), node (2,1) and node (2,2). That is to say, the backup node of a node may include one or more nodes that are in the same node group as the default node of the node and whose column position is after the default node of the node.
  • a node's backup node may also include one or more nodes that are in the same node group as the node's default node and whose column position is before the node's default node.
  • the backup node of node (0,2) can be node (1,1), or include node (1,1) and node (1,0).
  • a node's backup nodes may include nodes that meet any of the above conditions.
  • backup nodes for node (0,2) may include node (1,1) and node (1,3), or backup nodes for node (0,2) may include node (1,0), node (1, 1) and node (1,3).
  • the above embodiment shows a situation where the column distance between a node and the corresponding backup node is less than or equal to 2. In other embodiments, the column distance between a node and the corresponding backup node can also be greater than 2, which is not described further here.
  • the default node of a node can be the backup node of at least one other node in the same node group.
  • the default node of node (0,1) is node (1,1),
  • and node (1,1) can be the backup node of node (0,0) and node (0,2).
  • of the connections between a node and its default node and each of its backup nodes, only one may be enabled at a time. Further, the connection between the first node and the default node can be enabled preferentially. That is to say, as long as the default node of the first node is in a normal state, the first node is preferentially connected to the corresponding default node.
  • the backup node can serve as a redundant node of the default node to establish a connection with the corresponding first node when the default node is in an abnormal state, so as to ensure that the entire node array 101 is in a normal working state.
  • for the case where each node group 101a includes one column in the node array 101, assume that the default node of node (1,0) is node (1,1) and the backup nodes are node (2,1) and node (3,1); then when node (1,1) is in a normal state, the connection between node (1,0) and node (1,1) is enabled, the connection between node (1,0) and node (2,1) is disabled, and the connection between node (1,0) and node (3,1) is also disabled.
  • when node (1,1) is in an abnormal state, the connection between node (1,0) and node (1,1) is disabled, the connection between node (1,0) and node (2,1) is enabled, and the connection between node (1,0) and node (3,1) is disabled. Alternatively, when node (1,1) is in an abnormal state, the connection between node (1,0) and node (1,1) is disabled, the connection between node (1,0) and node (2,1) is disabled, and the connection between node (1,0) and node (3,1) is enabled.
  • the specific connection to which backup node is enabled can be determined based on the actual situation.
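The rule that only one connection is enabled at a time, with the default node preferred, might be sketched as follows. The text leaves the choice among backup nodes to the actual situation, so picking the first normal backup node here is purely an assumption, as are the function name `select_connection` and the `is_normal` callback.

```python
def select_connection(default, backups, is_normal):
    """Return the one target whose connection is enabled, or None.

    The default node is preferred. Choosing the first normal backup
    node is an assumed policy, not one stated in the disclosure.
    """
    if is_normal(default):
        return default
    for candidate in backups:
        if is_normal(candidate):
            return candidate
    return None  # no usable target among the wired connections
```

For node (1,0) with default node (1,1) and backup nodes (2,1) and (3,1), calling `select_connection((1, 1), [(2, 1), (3, 1)], lambda n: n != (1, 1))` returns `(2, 1)`.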
  • for the case where each node group 101a includes one row in the node array 101, assume that the default node of node (0,1) is node (1,1) and the backup nodes are node (1,2) and node (1,3); then when node (1,1) is in a normal state, the connection between node (0,1) and node (1,1) is enabled, the connection between node (0,1) and node (1,2) is disabled, and the connection between node (0,1) and node (1,3) is also disabled.
  • when node (1,1) is in an abnormal state, the connection between node (0,1) and node (1,1) is disabled, the connection between node (0,1) and node (1,2) is enabled, and the connection between node (0,1) and node (1,3) is disabled. Alternatively, when node (1,1) is in an abnormal state, the connection between node (0,1) and node (1,1) is disabled, the connection between node (0,1) and node (1,2) is disabled, and the connection between node (0,1) and node (1,3) is enabled.
  • the specific connection to which backup node is enabled can be determined based on the actual situation.
  • the case where each node group 101a includes a row in the node array 101 is equivalent to the case where each node group 101a includes a column in the node array 101 with the node array rotated by 90 degrees;
  • the following description mainly takes the example that each node group 101a includes a column in the node array 101.
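The equivalence between the row case and the column case can be illustrated with a small sketch. It uses a coordinate swap (transposition) as a stand-in for the 90-degree rotation, which is a simplifying assumption, and it only models the single backup node one position past the default node. All function names are hypothetical.

```python
def column_case_backup(node):
    """Column groups: backup is one row below the default node (next column)."""
    row, col = node
    return (row + 1, col + 1)


def swap(node):
    """Exchange the row and column coordinates of a node."""
    row, col = node
    return (col, row)


def row_case_backup(node):
    # Move into the column view, apply the column rule, then move back.
    return swap(column_case_backup(swap(node)))
```

Here `row_case_backup((0, 1))` yields `(1, 2)`, the node one column after the default node (1,1), which agrees with the row-group example given earlier.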
  • the embodiment of the present disclosure also enables the connection between at least one second node, which is in the same node group as the first node, and the corresponding backup node.
  • the default node of each second node is another second node or a backup node of the first node.
  • the adjacent node in the same row as node A in Figure 1 is the default node of node A, and the node that is in the row below the default node of node A and in the same column as the default node of node A is the backup node of node A.
  • when each node is in a normal state, what is enabled is the connection between each node and the corresponding default node.
  • the connection method in this case is shown in Figure 2A. Because connections between each node and its backup node are disabled, connections exist only between adjacent nodes in the same row and between adjacent nodes in the same column.
  • when the default node of node (1,0), i.e. node (1,1), is in an abnormal state,
  • the connections between node (1,0), node (2,0), node (3,0), node (4,0), node (1,2), node (2,2), node (3,2) and node (4,2) and their respective backup nodes are all enabled, and the connections between these nodes and their respective default nodes are disabled; the connection mode in this case is shown in Figure 2B. It should be noted that under the above redundancy logic, the last row of nodes does not have a corresponding backup node.
  • node (4,0), node (4,2), node (4,3) and node (4,4) in the last row of nodes can all be set to a non-working state, that is, node (4,0), node (4,2), node (4,3) and node (4,4) are deactivated.
  • the topological structure of each node in the node array can be kept basically unchanged, and only one row of nodes is reduced in the node array.
  • node (4,1) replaces node (3,1) as the node connecting node (3,0) and node (3,2),
  • node (3,1) replaces node (2,1) as the node connecting node (2,0) and node (2,2),
  • and node (2,1) replaces node (1,1) as the node connecting node (1,0) and node (1,2);
  • the topology of the first 4 rows in the node array remains unchanged.
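The row-by-row substitution described above can be pictured as a mapping from logical rows to physical rows within one column. The following is a minimal sketch under stated assumptions: the function name `physical_row` and the 0-based row indexing are illustrative and do not come from the disclosure, and the column is assumed to contain at most one abnormal node with a single redundant node in its last row.

```python
from typing import Optional


def physical_row(logical_row: int, faulty_row: Optional[int]) -> int:
    """Return the physical row that serves a logical row of one column.

    Rows above the abnormal node keep their position; the abnormal row and
    every row below it are served by the node one row further down.
    """
    if faulty_row is None or logical_row < faulty_row:
        return logical_row
    return logical_row + 1


# Column 1 of a 5-row array whose last row (row 4) is redundant,
# with node (1,1) in an abnormal state, as in the Figure 2B example.
WORKING_ROWS = 4
mapping = {r: physical_row(r, faulty_row=1) for r in range(WORKING_ROWS)}
print(mapping)
```

In this sketch the four logical rows of the column are still all served, which is one way to read the statement that the topology of the first 4 rows remains unchanged.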
  • the node in the z-th row below the default node of node A and in the same column as the default node of node A can also be regarded as the backup node of node A, where z is an integer greater than 2.
  • nodes located in multiple rows below the default node of node A and in the same column as the default node of node A can also be used as backup nodes of node A.
  • each node group includes at least one redundant node and a working node other than the redundant node, and the redundant node of one node group is a backup node of at least one working node of another node group,
  • the first node is the working node.
  • the last node of each node group may be used as a redundant node, that is, the last row of nodes in the node array 101 are all redundant nodes.
  • When each working node of a node group is in a normal state, the redundant node of the node group can be disabled, that is, set to a non-working state. The redundant nodes of a node group are enabled only when a working node in that node group is in an abnormal state. Still taking the embodiment shown in Figure 2B as an example, the last row of nodes in the node array are redundant nodes, which are in a non-working state under normal circumstances. From a certain moment, node (1,1) is in an abnormal state; therefore, in order to implement the redundancy logic, node (4,1) among the redundant nodes is enabled, while the other redundant nodes remain in a non-working state.
  • each worker node can remain unchanged.
  • node (4,0), node (4,2), node (4,3) and node (4,4) are all set to a non-working state.
  • the last row of nodes is in the working state like other nodes. Only when there is an abnormal node, the last row of nodes will be set to the non-working state.
  • the redundant node includes a row of nodes in the node array 101.
  • the redundant node may also include a column of nodes in the node array 101, or include at least two rows and/or at least two columns of nodes in the node array 101.
  • the redundant nodes include two rows of nodes in the node array 101.
  • the at least two rows of nodes may be two consecutive rows in the node array 101, for example, row 1 and row 2, or the last row and the second-to-last row; they can also be two discontinuous rows in the node array 101, for example, the first row and the last row.
  • the number of redundant nodes in a node group is determined based on at least one of the following conditions: the area of the data processing device, the probability that the node is in an abnormal state, and the number of nodes in the node array.
  • the area of the data processing device, the probability that the node is in an abnormal state, and the number of nodes in the node array are all positively related to the number of redundant nodes in a node group.
  • the larger the area of the data processing device the greater the number of redundant nodes in a node group.
  • Each node group includes one redundant node.
  • the connections between each node, from the working node to the last working node of the node group where the working node is located, and its corresponding backup node can be respectively enabled, wherein the number of nodes in an abnormal state in any node group is less than or equal to the number of redundant nodes in the group, and the redundant node is the node next to the last working node in the group to which it belongs.
  • the backup node of each working node is the default node of the next working node in the group where the node is located; or, optionally, the backup node of each working node is the default node of a working node that follows the working node in the same group at a distance greater than or equal to 2.
  • the group where node A is located is group T
  • the nodes in group T are {node B, node C, node A, node D, node E, node F, redundant node}
  • the backup node of node A is the default node of the next working node of node A (i.e. node D)
  • the backup node of node D is the default node of the next working node of node D (ie, node E), and so on.
  • All nodes from the working node A to the last working node F in the node group where the working node A is located include node A, node D, node E and node F.
  • the connections between the above nodes A, D, E, and F and the corresponding backup nodes are all enabled.
  • the redundant nodes of each node group include the N-th node of the node group; the j-th node of the i-th node group is the default node of the j-th node of the (i-1)-th node group and of the j-th node of the (i+1)-th node group, and the (j+1)-th node of the i-th node group is the backup node of the j-th node of the (i-1)-th node group and of the j-th node of the (i+1)-th node group.
  • the connection between the v-th node of the i-1 node group and the v+1-th node of the i-th node group is enabled,
  • the connection between the v-th node of the i+1-th node group and the v+1-th node of the i-th node group is enabled; where 1≤j<N, j≤v<N, v, j and N are all positive integers, and N is the total number of nodes in each node group.
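The index rule of this case can be written out as a short enumeration. This is only a sketch under stated assumptions: the function name `case1_connections` is hypothetical, positions are 1-based as in the text, each endpoint is written as a pair (node group index, position in group), and v is taken to run from j to N-1 so that position v+1 never exceeds N. Boundary groups (no group i-1 or i+1) are not handled.

```python
def case1_connections(i, j, n):
    """Enumerate connections enabled when the j-th node of group i is abnormal.

    Each endpoint is (group index, position in group), both 1-based.
    The redundant node is assumed to be the n-th node of every group.
    """
    if not 1 <= j < n:
        raise ValueError("j must identify a working node: 1 <= j < n")
    enabled = []
    for v in range(j, n):  # v = j, ..., n-1
        enabled.append(((i - 1, v), (i, v + 1)))  # left neighbour group
        enabled.append(((i + 1, v), (i, v + 1)))  # right neighbour group
    return enabled


# Second node of the second group abnormal, five nodes per group.
for pair in case1_connections(i=2, j=2, n=5):
    print(pair)
```

With i=2, j=2 and N=5 the sketch lists six connections, the last of which reaches the redundant fifth node of group 2.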
  • This embodiment is similar to the embodiment shown in Figure 2B.
  • the only difference is that the redundant node is in a disabled state when there is no abnormal working node in the node group to which it belongs.
  • the above difference has been explained in the previous embodiment and will not be described again here.
  • the redundant nodes of each node group can also be replaced by the 1st node of the node group.
  • the j-th node of the i-th node group is the default node of the j-th node of the (i-1)-th node group and of the j-th node of the (i+1)-th node group, and the (j-1)-th node of the i-th node group is the backup node of the j-th node of the (i-1)-th node group and of the j-th node of the (i+1)-th node group.
  • the connection between the v-th node of the i-1 node group and the v-1-th node of the i-th node group is enabled,
  • the connection between the v-th node of the i+1-th node group and the v-1-th node of the i-th node group is enabled; where 1<j≤N, 1<v≤j, v, i, j and N are all positive integers, and N is the total number of nodes in each node group.
  • Each node group is provided with at least two redundant nodes, and the at least two redundant nodes are distributed at both ends of the node group.
  • the connections between each node, from the working node to the previous working node of the target redundant node of the working node, and its corresponding backup node can be respectively enabled,
  • the number of nodes in an abnormal state in any node group is less than or equal to the number of redundant nodes in the group, the target redundant node and the working node are in the same node group, and the latter of the working node
  • the default node of each working node between the working node and the previous working node of the target redundant node is in a normal state.
  • the backup node of each working node is the default node of the next working node in the node group where the node is located; or, optionally, the backup node of each working node is the default node of the previous working node in the node group where the node is located.
  • the group where node A is located is group T
  • the nodes in group T are {redundant node 1, node C, node D, node A, node E, node B, node F, redundant node 2}
  • the target redundant node of node A is redundant node 1
  • the target redundant node of node B is redundant node 2.
  • for node A, all the nodes from the working node A to the previous working node C of the target redundant node (redundant node 1) of the working node A include node C, node D and node A; for node B, all the nodes from the working node B to the previous working node F of the target redundant node (redundant node 2) of the working node B include node B and node F.
  • the redundant nodes in each node group include the 1st node and the N-th node of the node group; the j-th node of the i-th node group is the default node of the j-th node of the (i-1)-th node group and of the j-th node of the (i+1)-th node group; the (j+1)-th node and the (j-1)-th node of the i-th node group are the backup nodes of the j-th node of the (i-1)-th node group, and the (j+1)-th node and the (j-1)-th node of the i-th node group are also the backup nodes of the j-th node of the (i+1)-th node group.
  • the connection between the v-th node of the (i-1)-th node group and the (v+1)-th node of the i-th node group is enabled, the connection between the v-th node of the (i+1)-th node group and the (v+1)-th node of the i-th node group is enabled, the connection between the u-th node of the (i-1)-th node group and the (u-1)-th node of the i-th node group is enabled, and the connection between the u-th node of the (i+1)-th node group and the (u-1)-th node of the i-th node group is enabled.
  • 1<j<k<N, k≤v<N, 1<u≤j, u, v, i, j, k and N are all positive integers, and N is the total number of nodes in each node group.
  • the node array 101 includes 6 rows and 5 columns, in which a row of nodes with a row coordinate of 0 and a row of nodes with a row coordinate of 5 are redundant nodes.
  • When node (4,1) is in an abnormal state, the connection between node (4,0) and node (5,1) can be enabled, and the connection between node (4,2) and node (5,1) can be enabled.
  • When node (2,1) is in an abnormal state, the connection between node (2,0) and node (1,1) can be enabled, the connection between node (2,2) and node (1,1) can be enabled, the connection between node (1,0) and node (0,1) can be enabled, and the connection between node (1,2) and node (0,1) can be enabled.
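The two examples above can be reproduced with a small enumeration in which nodes at or above the upper abnormal node shift towards the first row and nodes at or below the lower abnormal node shift towards the last row. This is a sketch only: the function name `two_ended_connections` is hypothetical, coordinates are (row, column) with 0-based indices as in the examples, the column is assumed to be an interior column, and exactly two abnormal nodes at rows j < k are assumed.

```python
def two_ended_connections(col, j, k, rows):
    """Connections enabled when rows j < k of column `col` are abnormal.

    Row 0 and row rows-1 are assumed to be the redundant rows.
    Each connection is ((row, col) of a neighbour, (row, col) of its backup).
    """
    if not 0 < j < k < rows - 1:
        raise ValueError("expected 0 < j < k < rows - 1")
    enabled = []
    for u in range(j, 0, -1):  # rows j, j-1, ..., 1 look one row up
        for neighbour in (col - 1, col + 1):
            enabled.append(((u, neighbour), (u - 1, col)))
    for v in range(k, rows - 1):  # rows k, ..., rows-2 look one row down
        for neighbour in (col - 1, col + 1):
            enabled.append(((v, neighbour), (v + 1, col)))
    return enabled


# 6 rows, rows 0 and 5 redundant, nodes (2,1) and (4,1) abnormal.
for pair in two_ended_connections(col=1, j=2, k=4, rows=6):
    print(pair)
```

Under these assumptions the sketch yields the six connections named in the two examples; row 3 of the neighbouring columns keeps its default connection because it lies strictly between the two abnormal nodes.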
  • Each node group is provided with at least two redundant nodes, and the at least two redundant nodes are distributed at one end of the node group.
  • the connections between each node, from the working node to the previous working node of the redundant node of the group where the working node is located, and its corresponding backup node can be respectively enabled.
  • the number of nodes in an abnormal state in any node group is less than or equal to the number of redundant nodes in the group.
  • the third node's default node is not adjacent to the third node's backup node.
  • the backup node of each working node is the default node of a working node that is in the same group as the working node and is spaced apart from the working node.
  • two nodes being spaced apart may include the case in which one or more nodes lie between the two nodes.
  • the default nodes of node A and node B are in an abnormal state
  • the node group where node A is located is group T
  • the nodes in group T are {node C, node D, node A, node B, node E, node F, redundant node 1, redundant node 2}
  • node A takes redundant node 1 as its target redundant node; all nodes from the working node A to the previous working node F of redundant node 1 in the node group where the working node A is located include node A, node B, node E and node F.
  • the backup node of node A is the default node of node E, and the default node of node B lies between the default node of node A and the backup node of node A. Therefore, the default node of node A and the backup node of node A are not adjacent.
  • the backup node of node B is the default node of node F, and the default node of node E lies between the default node of node B and the backup node of node B. Therefore, the default node of node B and the backup node of node B are not adjacent.
  • the redundant nodes of each node group include the (N-1)-th node and the N-th node of the node group; the j-th node of the i-th node group is the default node of the j-th node of the (i-1)-th node group and of the j-th node of the (i+1)-th node group; the (j+1)-th node and the (j+2)-th node of the i-th node group are the backup nodes of the j-th node of the (i-1)-th node group, and the (j+1)-th node and the (j+2)-th node of the i-th node group are also the backup nodes of the j-th node of the (i+1)-th node group.
  • the connection between the v-th node of the (i-1)-th node group and the (v+2)-th node of the i-th node group is enabled, and the connection between the v-th node of the (i+1)-th node group and the (v+2)-th node of the i-th node group is enabled; where 1≤j<N-1, j≤v<N-1, v, i, j and N are all positive integers, and N is the total number of nodes in each node group.
  • the node array 101 includes 6 rows and 5 columns, in which a row of nodes with a row coordinate of 4 and a row of nodes with a row coordinate of 5 are redundant nodes.
  • the connection between node (2,0) and node (4,1) can be enabled, the connection between node (2,2) and node (4,1) is enabled, the connection between node (3,0) and node (5,1) is enabled, and the connection between node (3,2) and node (5,1) is enabled.
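When the two redundant nodes sit at the same end of a group, the backup lies two rows away instead of one. The sketch below enumerates the connections of the example above; the function name `shift_two_connections` is hypothetical, coordinates are (row, column) with 0-based indices, the column is assumed to be an interior column, and it is assumed (the example suggests this but does not state it here) that rows j and j+1 of the column are the abnormal ones.

```python
def shift_two_connections(col, j, rows):
    """Connections enabled when the backup is two rows below the default.

    The last two rows (rows-2 and rows-1) are assumed to be redundant.
    Each connection is ((row, col) of a neighbour, (row, col) of its backup).
    """
    if not 0 <= j < rows - 2:
        raise ValueError("j must be a working row: 0 <= j < rows - 2")
    enabled = []
    for v in range(j, rows - 2):  # every working row from j downwards
        for neighbour in (col - 1, col + 1):
            enabled.append(((v, neighbour), (v + 2, col)))
    return enabled


# 6 rows, rows 4 and 5 redundant, shift starting at row 2 of column 1.
for pair in shift_two_connections(col=1, j=2, rows=6):
    print(pair)
```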
  • the redundant nodes of each node group include the N-th node of the node group, the j-th node of the i-th node group is the j-th node and the i+1-th node of the i-1 node group
  • the v-th node in the i-1-th node group and the i-th node group are both in an abnormal state
  • the connection between the v-th node of the (i-1)-th node group and the (v+1)-th node of the i-th node group is enabled, and the connection between the v-th node of the (i+2)-th node group and the (v+1)-th node of the (i+1)-th node group is enabled; where 1≤j<N, j≤v<N, v, i, j and N are all positive integers, and N is the total number of nodes in each node group.
  • Case 4 is actually a special manifestation of Case 1 or Case 2 or Case 3.
  • When each node group includes one redundant node, Case 4 is a special manifestation of Case 1; when each node group includes two redundant nodes and the redundant nodes are distributed at both ends of the node group, Case 4 is a special manifestation of Case 2; when each node group includes two redundant nodes and the redundant nodes are distributed at one end of the node group, Case 4 is a special manifestation of Case 3.
  • Referring to FIG. 5, taking the case where each node group includes one redundant node as an example, assume that the node array 101 includes 5 rows and 5 columns, in which the row of nodes with a row coordinate of 4 are redundant nodes.
  • the connection between node (2,0) and node (3,1) can be enabled, the connection between node (2,3) and node (3,2) is enabled, the connection between node (3,0) and node (4,1) is enabled, and the connection between node (3,3) and node (4,2) is enabled.
  • Figures 6 and 7 respectively show two combination methods.
  • the redundant nodes include the last two rows of nodes in the node array 101. Since node (3,1) in the second column and node (3,2) in the third column are both in an abnormal state, the nodes in the first column look for backup nodes in the second column, while the nodes in the fourth column look for backup nodes in the third column.
  • the second column and the third column both include two nodes in an abnormal state, and the redundant nodes are two consecutive rows in the node array 101, therefore, node (2,0), node (3,0), The row distance between node (2,3) and node (3,3) and their respective backup nodes is 2.
  • the redundant nodes include the first row and the last row in the node array 101. Since node (2,1) in the second column and node (2,2) in the third column are both in an abnormal state, the nodes in the first column look for backup nodes in the second column, while the nodes in the fourth column look for backup nodes in the third column.
  • both the second column and the third column include two nodes in an abnormal state, and the redundant nodes are two discontinuous rows in the node array 101; therefore, node (1,0) and node (1,3) look upward for backup nodes, while node (2,0), node (3,0), node (2,3) and node (3,3) look downward for backup nodes.
  • redundant nodes are provided in both the row direction and the column direction of the node array 101 .
  • node (4,1) among the redundant nodes set in the row direction can be used to replace node (3,1); when node (1,3) is in an abnormal state, node (1,3) can be replaced by node (1,4) among the redundant nodes set in the column direction.
  • each node in the node array includes a processing core and a router connected to the processing core.
  • the router of each node in one node group is used to connect to the routers of multiple nodes of another node group. That is to say, in the above embodiment, the connection between nodes is realized through the connection between the routers of the nodes.
  • the router can be used to transmit data between nodes, and can also be used to send the data received by the node to the processing core connected to the node, so that the processing core can process the data received by the node.
  • the router can also receive the data returned by the processing core.
  • each square represents a router and each ellipse represents a processing core.
  • the coordinates in the squares represent the coordinates of the router, and the coordinates in the ellipse represent the coordinates of the processing core.
  • Figures 9 and 10 only show the connection between each router and adjacent routers, and the connections between each router and non-adjacent routers are omitted.
  • Each dotted box in Figure 9 and Figure 10 represents a node. It can be seen that in Figure 9, each node includes a router and a processing core; each processing core is connected to one router, and different routers are connected to different processing cores.
  • each node in the first and last columns includes a router and a processing core, and other nodes include a router and two processing cores, and adjacent nodes can share a processing core.
  • when both the router and the processing core of a node are in a normal state, the node is in a normal state; when at least one of the router and the processing core of a node is in an abnormal state, the node is in an abnormal state.
  • node (0,0) includes router (0,0) and processing core (0,0). When router (0,0) and processing core (0,0) are both in a normal state, node (0,0) can be considered to be in a normal state; when either router (0,0) or processing core (0,0) is in an abnormal state, node (0,0) is in an abnormal state.
  • any processing core or router belonging to a node is in an abnormal state
  • the node is deemed to be in an abnormal state.
  • the nodes including the processing core (1,1), the router (1,2), and the processing core (1,2) are determined to be in an abnormal state
  • the nodes including the processing core (1,0), the router (1,1), and the processing core (1,1) are determined to be in an abnormal state.
  • a node is considered to be in a normal state only when all processing cores and routers belonging to it are in a normal state.
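The rule that a node is normal only when every router and processing core belonging to it is normal can be expressed as a simple predicate. This is an illustrative sketch: the `Node` class, the `node_is_normal` function and the way units are labelled are assumptions made for the example, not structures defined by the disclosure.

```python
from dataclasses import dataclass
from typing import Set, Tuple

Unit = Tuple[str, Tuple[int, int]]  # e.g. ("core", (1, 1)) or ("router", (1, 2))


@dataclass(frozen=True)
class Node:
    routers: Tuple[Unit, ...]
    cores: Tuple[Unit, ...]


def node_is_normal(node: Node, abnormal_units: Set[Unit]) -> bool:
    """A node is normal only if none of its routers or cores is abnormal."""
    return not any(unit in abnormal_units for unit in node.routers + node.cores)


# Two adjacent nodes that share processing core (1,1), as in the shared-core layout.
left = Node(routers=(("router", (1, 1)),),
            cores=(("core", (1, 0)), ("core", (1, 1))))
right = Node(routers=(("router", (1, 2)),),
             cores=(("core", (1, 1)), ("core", (1, 2))))

abnormal = {("core", (1, 1))}
print(node_is_normal(left, abnormal), node_is_normal(right, abnormal))
```

Because the abnormal core is shared, both nodes that include it are reported as abnormal in this sketch, which mirrors the shared-core example above.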
  • When each processing core is connected to only one router and a node is in an abnormal state, there is no need to adjust the connection mode of the processing core; only the connection mode of the router needs to be adjusted.
  • the connection method of the router connected to the node and the processing core needs to be adjusted.
  • the last row of routers and their connected processing cores are redundant routers and redundant processing cores respectively, as shown by the white squares and white ovals in Figure 11.
  • Black squares represent routers in an abnormal state
  • black ovals represent processing cores in an abnormal state
  • gray squares and gray ovals represent routers in a normal state and processing cores in a normal state respectively.
  • the data processing device further includes a plurality of interfaces for connecting nodes of other data processing devices. Multiple interfaces can be provided on the periphery of the data processing device. One arrangement method is shown in Figure 12. In actual applications, an interface may be set for each node; in other embodiments, multiple nodes may share an interface.
  • the interface can send data output by other data processing devices to the router in this data processing device, and can also output data sent by the router of this data processing device to other data processing devices.
  • the interface may adopt SerDes, a GPIO bus interface, an I²C interface, etc.
  • the connection between a node A and the corresponding default node or backup node is enabled based on the preset identification information of this node (that is, the node A).
  • the preset identification information may be a string of binary numbers including a plurality of data bits, and the number of data bits is determined based on the total number of default nodes and backup nodes of a node. For example, when the total number of default nodes and backup nodes does not exceed 4, the number of data bits is 2; when the total number of default nodes and backup nodes is greater than 4 and does not exceed 8, the number of data bits is 3.
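The relation between the number of selectable nodes and the number of data bits in the example above can be sketched as follows. The function name `id_bits` is hypothetical, and the lower bound of 2 bits is taken directly from the example (any total up to 4 uses 2 bits); whether fewer bits could be used for totals of 1 or 2 is not stated in the text.

```python
def id_bits(total_nodes: int) -> int:
    """Number of data bits needed to tell `total_nodes` choices apart.

    Follows the example in the text: up to 4 choices use 2 bits,
    5 to 8 choices use 3 bits. The floor of 2 bits is an assumption
    read off that example.
    """
    if total_nodes < 1:
        raise ValueError("a node needs at least one default node")
    return max(2, (total_nodes - 1).bit_length())


for total in (3, 4, 5, 8):
    print(total, id_bits(total))
```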
  • each default node and backup node of a node correspond to different preset identification information.
  • a node includes a default node and two backup nodes
  • the connection of each node to its default node or backup node can be selectively enabled based on different identification information. For example, when the identification information of a node is 00, the node's connection to its default node is enabled.
  • the data processing device further includes a control unit configured to obtain the working status of each node in the node array; and set the preset identification information of each node based on the working status of each node.
  • the abnormal state includes a first abnormal state caused by a process defect.
  • Abnormal states caused by process defects are generally fixed and irreversible. Therefore, the location of the node in the first abnormal state can be determined before the data processing device leaves the factory.
  • the data processing device may further include a storage unit for storing first location information of the node in the first abnormal state, so that the control unit sets, based on the first location information, the preset identification information of the node in the first abnormal state.
  • the node in the first abnormal state can be determined through Design for Testability (DFT) detection, and the storage unit can be a one-time programmable memory such as an electrically programmed fuse (efuse).
  • the location of the defective cores pre-stored in the efuse can be read through the Micro-Controller Unit (MCU) or special hardware, a core replacement strategy is selected based on the number and location of the defective cores, and the preset identification information of the corresponding nodes is configured according to the replacement strategy to complete the reconstruction of the node array.
  • the abnormal state includes a second abnormal state caused by the working environment.
  • Abnormal states caused by the working environment for example, high temperature, high pressure
  • a detection unit can be provided in the data processing device for detecting, in real time during the operation of the data processing device, second location information of the node in the second abnormal state, so that the control unit sets, based on the second location information, the preset identification information of the node in the second abnormal state.
  • the detection unit may be a failure detection circuit or detection software.
  • the failure detection circuit may be implemented by one or more sensors.
  • the control unit is also configured to, when at least one node switches from a normal state to an abnormal state, pause the tasks currently performed by each node in the node array before setting the preset identification information of each node based on the working status of each node.
  • that is to say, whenever a new node is in an abnormal state, the currently executing tasks of each node can be suspended first, the replacement node can then be re-determined, and the preset identification information can be configured according to the re-determined replacement node, thereby completing the reconstruction of the node array.
  • an output of a node in each node group is connected to a multiplexer, and an input of a node in each node group is connected to a demultiplexer; the multiplexer of a node is used to output the output signal of the node through different channels, and the demultiplexer of a node is used to input, to the node, the signals output to the node through different channels.
  • the multiplexer (Multiplexer) is marked as MUX
  • the de-multiplexer (De-multiplexer) is marked as DMUX.
  • Each square represents a node.
  • the left side of the dotted line is the signal flow direction when each node is in a normal state.
  • the right side of the dotted line is the signal flow direction when node (1,1) is in an abnormal state.
  • the solid line with arrows represents the connection relationship between nodes.
  • It can be seen that in the node array on the left, the connections between node (1,0), node (1,1) and node (1,2) are enabled, as shown by the thicker solid line; in the node array on the right, the connections between node (1,0), node (0,1) and node (1,2) are enabled, as shown by the thicker solid line in the node array on the right. In the same way, the connection between node (0,0) and the previous node of node (0,1) is enabled, and the connection between node (0,2) and the previous node of node (0,1) is also enabled.
  • each multiplexer and each demultiplexer corresponds to one of a set of preset identification information.
  • the node (1,1) is in an abnormal state, and the preset identification information of each multiplexer and each demultiplexer connected by the thicker solid line can be configured.
  • a multiplexer and a demultiplexer involve a total of three signals. Therefore, a set of preset identification information may include 00, 01 and 11, where 00 indicates connecting to the node in the same row, 01 indicates connecting to the node in the previous row, and 11 indicates connecting to the node in the next row.
  • the default identification information corresponding to the MUX on the right side of 0,1), the DMUX on the right side of node (0,1), the DMUX connected to node (1,2), and the MUX connected to node (1,2) are respectively set to: 01 ,01,11,11,11,11,01,01.
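The meaning of the three codes (00 for the same row, 01 for the previous row, 11 for the next row) can be sketched as a lookup that returns the peer selected by a multiplexer or demultiplexer. The table name `ROW_OFFSET`, the function name `selected_peer` and the `side` parameter are illustrative assumptions; coordinates are (row, column) as in the examples, and the "previous" row is taken to be the row with the smaller row coordinate.

```python
# Row offset selected by each preset identification code.
ROW_OFFSET = {"00": 0, "01": -1, "11": +1}


def selected_peer(row: int, col: int, side: int, code: str):
    """Return the (row, col) of the node selected on one side of a node.

    side is +1 for the MUX/DMUX facing the next column and -1 for the
    one facing the previous column.
    """
    if side not in (-1, +1):
        raise ValueError("side must be -1 or +1")
    return (row + ROW_OFFSET[code], col + side)


# Node (1,1) abnormal: node (1,0) selects the previous row on its right side,
# and node (0,1) selects the next row on its left side.
print(selected_peer(1, 0, +1, "01"))
print(selected_peer(0, 1, -1, "11"))
```

In this sketch the two printed peers are node (0,1) and node (1,0), that is, the two ends of the connection between node (1,0) and node (0,1) described for the node array on the right.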
  • each node includes a bypass unit; when the node is in an abnormal state, the bypass unit bypasses the node so that the two nodes adjacent to the node in the node group where the node is located are directly connected.
  • each node is connected to each other not only in the horizontal direction, but also in the vertical direction.
  • When node (1,1) is in an abnormal state, node (1,1) needs to be bypassed through the bypass unit in node (1,1), so that node (0,1) is directly connected to node (2,1) in the vertical direction.
  • the target nodes in each node group are bypassed, so that the number of nodes in an abnormal state that are not bypassed in any node group is less than or equal to the number of redundant nodes in the node group; wherein the target nodes in each node group include the nodes in an abnormal state, and the target node in one node group is the default node of the target node in another node group.
  • the node grouping includes grouping 1, grouping 2, grouping 3 and grouping 4, and the target nodes in the above four node groups are sequentially recorded as target node 1, target node 2, target node 3 and target node 4, then the target node 1 is the default node of target node 2, target node 2 is the default node of target node 3, and target node 3 is the default node of target node 4.
  • when a node group includes a column of the node array, and the default node of a node in a group is in the same row as the node, the above target node 1, target node 2, target node 3 and target node 4 are nodes in the same row of the node array.
  • when a node group includes a row of the node array, and the default node of a node in a group is in the same column as the node, the above target node 1, target node 2, target node 3 and target node 4 are nodes in the same column of the node array.
  • nodes in the last row are redundant nodes. Since there is only one redundant node in each node group, there are two nodes in an abnormal state in the node group in the second column, namely node (0, 1) and node (1,1), at this time, the redundant nodes are not enough to provide redundant logic for nodes in abnormal status. Therefore, the row where node (0,1) is located, that is, each node in the dotted box, can be removed to ensure that the number of redundant nodes is enough to provide redundant logic for nodes in abnormal states.
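The decision of which rows to remove can be sketched with a simple greedy loop: as long as some column holds more unbypassed abnormal nodes than it has redundant nodes, one row containing an abnormal node of that column is bypassed. The greedy choice (lowest column, then lowest row) and the function name `rows_to_bypass` are assumptions made for illustration; the disclosure does not prescribe a particular selection order.

```python
from collections import Counter


def rows_to_bypass(abnormal, redundant_per_column):
    """Greedily pick rows to bypass until every column can be repaired.

    abnormal: iterable of (row, col) coordinates of abnormal nodes.
    redundant_per_column: number of redundant nodes in each column.
    """
    remaining = set(abnormal)
    bypassed = []
    while True:
        per_column = Counter(col for _, col in remaining)
        overloaded = [c for c, count in per_column.items()
                      if count > redundant_per_column]
        if not overloaded:
            return bypassed
        col = min(overloaded)
        row = min(r for r, c in remaining if c == col)
        bypassed.append(row)
        # Bypassing a row removes every node of that row from consideration.
        remaining = {(r, c) for r, c in remaining if r != row}


# Nodes (0,1) and (1,1) abnormal, one redundant node per column.
print(rows_to_bypass({(0, 1), (1, 1)}, redundant_per_column=1))
```

For the example above the sketch bypasses row 0, after which the second column contains a single abnormal node, which its one redundant node can cover.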
  • the present disclosure also provides an electronic device, which includes the data processing apparatus described in any embodiment of the present disclosure.
  • the present disclosure also provides a data processing method for adjusting the connection relationship of each node in the data processing device according to any embodiment; the method includes:
  • Step 1501 Obtain the status of the default node of each node, the status includes normal status and abnormal status;
  • Step 1502 Adjust the connection relationship of multiple nodes in the node array based on the status of the default node of each node; wherein:
  • Step 15021 When the default node of the first node in a node group is in a normal state, enable the connection between the first node and the corresponding default node;
  • Step 15022 When the default node of the first node in a node group is in an abnormal state, enable the connections between the first node, and at least one second node in the same node group as the first node, and their corresponding backup nodes.
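Steps 1501 and 1502 can be sketched for a single node group as follows. This is a minimal sketch under stated assumptions: the function name `select_connections` is hypothetical, the group is represented only by the states of the default nodes of its working nodes in order, and the neighbouring group is assumed to have one redundant node at its end and at most one abnormal default node, so that every working node from the first affected one onwards switches to its backup node.

```python
from typing import List


def select_connections(default_ok: List[bool]) -> List[str]:
    """Choose, per working node of one group, which connection to enable.

    default_ok[r] is True when the default node of the r-th working node
    is in a normal state (Step 1501). The result lists "default" or
    "backup" for each working node (Step 1502).
    """
    choices = []
    shifted = False
    for ok in default_ok:
        # Once one default node is abnormal, this node and all later
        # working nodes of the group use their backup nodes.
        shifted = shifted or not ok
        choices.append("backup" if shifted else "default")
    return choices


# Working nodes (0,0)..(3,0) of column 0 when node (1,1) is abnormal.
print(select_connections([True, False, True, True]))
```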
  • the methods of the embodiments of the present disclosure can be executed by processing units such as MCUs and CPUs or dedicated processing hardware.
  • the details of how to enable the connection between nodes can be found in the foregoing embodiments of the data processing device, and will not be described again here.
  • the present disclosure also provides a data processing device for adjusting the connection relationship of each node in the data processing device according to any embodiment of the present disclosure; the device includes:
  • the acquisition module 1601 is used to obtain the status of the default node of each node, the status includes normal status and abnormal status;
  • the adjustment module 1602 is used to adjust the connection relationship of multiple nodes in the node array based on the status of the default node of each node; wherein:
  • When the default node of the first node in a node group is in an abnormal state, enable the connections between the first node, and at least one second node in the same node group as the first node, and their corresponding backup nodes.
  • the functions or modules included in the device provided by the embodiments of the present disclosure can be used to execute the methods described in the above method embodiments.
  • An embodiment of the present disclosure also provides a computer-readable storage medium on which a computer program is stored.
  • When the program is executed by a processor, the method described in any of the foregoing embodiments is implemented.
  • Computer-readable media include permanent and non-permanent, removable and non-removable media, in which information storage can be implemented by any method or technology.
  • Information may be computer-readable instructions, data structures, modules of programs, or other data.
  • Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile disc (DVD) or other optical storage, magnetic cassettes, magnetic tape and disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible to a computing device.
  • computer-readable media does not include transitory media, such as modulated data signals and carrier waves.
  • The embodiments of this specification can be implemented by means of software plus a necessary general-purpose hardware platform. Based on this understanding, the technical solutions of the embodiments of this specification, in essence or in the part that contributes to the prior art, can be embodied in the form of a software product.
  • The computer software product can be stored in a storage medium, such as ROM/RAM, a magnetic disk, or an optical disc, and includes a number of instructions that cause a computer device (which can be a personal computer, a server, a network device, etc.) to execute the methods described in the embodiments, or in parts of the embodiments, of this specification.
  • A typical implementation device is a computer, which may take the form of a personal computer, a laptop computer, a cellular phone, a camera phone, a smart phone, a personal digital assistant, a media player, a navigation device, an email transceiver, a game console, a tablet computer, a wearable device, or a combination of any of these devices.
  • each embodiment in this specification is described in a progressive manner.
  • the same and similar parts between the various embodiments can be referred to each other.
  • Each embodiment focuses on its differences from other embodiments.
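  • In particular, because the device embodiments are substantially similar to the method embodiments, they are described relatively briefly; for related details, see the corresponding parts of the method embodiments.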
  • the device embodiments described above are only illustrative.
  • the modules described as separate components may or may not be physically separated.
  • The functions of the modules may be implemented in the same piece, or in multiple pieces, of software and/or hardware. Some or all of the modules can also be selected according to actual needs to achieve the purpose of the solution of this embodiment. Persons of ordinary skill in the art can understand and implement it without creative effort.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Quality & Reliability (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Hardware Redundancy (AREA)

Abstract

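Embodiments of the present disclosure provide a data processing method and device, a chip, an electronic device, and a medium. The data processing device includes a node array, and the node array includes a plurality of node groups. For each of the node groups, adjacent nodes in the node group are connected, and each node in the node group is connected to a plurality of nodes of other node groups. The plurality of nodes includes a default node and a backup node, and the default node of one node is the backup node of at least one other node in the same node group. When the default node of a first node in a node group is in a normal state, the connection between the first node and its default node is enabled. When the default node of a first node in a node group is in an abnormal state, the connections between the first node, together with at least one second node in the same node group as the first node, and their corresponding backup nodes are enabled.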

Description

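Data processing method and device, chip, electronic device, and medium
Cross-reference to related applications
This application claims priority to Chinese patent application No. 202210473209.4, filed on April 29, 2022, which is incorporated herein by reference.
Technical field
The present disclosure relates to the technical field of data processing devices, and in particular to a data processing method and device, a chip, an electronic device, and a medium.
Background
For a data processing device that includes a plurality of nodes, a single node in an abnormal state may render the entire device unqualified. Redundant logic can therefore be added to improve the availability of the data processing device. However, the redundant logic in the related art often requires substantial changes to the topology of the nodes.
Summary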
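In a first aspect, embodiments of the present disclosure provide a data processing device. The data processing device includes a node array, and the node array includes a plurality of node groups. For each of the node groups, adjacent nodes in the node group are connected, and each node in the node group is connected to a plurality of nodes of other node groups; the plurality of nodes includes a default node and a backup node, and the default node of one node is the backup node of at least one other node in the same node group. When the default node of a first node in a node group is in a normal state, the connection between the first node and its default node is enabled. When the default node of a first node in a node group is in an abnormal state, the connections between the first node, together with at least one second node in the same node group as the first node, and their corresponding backup nodes are enabled.
In some embodiments, each node in the node array includes a processing core and a router connected to the processing core, and the router of each node in one node group is used to connect to the routers of a plurality of nodes of another node group. A node is in a normal state when both its router and its processing core are in a normal state, and in an abnormal state when at least one of its router and its processing core is in an abnormal state.
In some embodiments, each of the node groups includes at least one redundant node and working nodes other than the redundant node; a redundant node of one node group is the backup node of at least one working node of another node group, and the first node is a working node. When all working nodes of a node group are in a normal state, the redundant node of that node group is disabled.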
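In some embodiments, where each of the node groups is provided with one redundant node, the backup node of each working node is the default node of the next working node in the same group. In response to determining that the default node of a working node in the node group is in an abnormal state, the connections between all working nodes from that working node to the last working node of the node group and their corresponding backup nodes are enabled, where the redundant node is the node following the last working node in the node group.
In some embodiments, where each of the node groups is provided with at least two redundant nodes distributed at both ends of the node group, in response to determining that the default node of at least one working node in the node group is in an abnormal state, for each such working node, the connections between all working nodes from that working node to the working node preceding its target redundant node and their corresponding backup nodes are enabled. Here, the number of nodes in an abnormal state in any node group is less than or equal to the number of redundant nodes in that node group; the target redundant node is in the same node group as the at least one working node; and, for each such working node, the default node of every working node between the working node following it and the working node preceding the target redundant node is in a normal state.
In some embodiments, where each of the node groups is provided with at least two redundant nodes distributed at one end of the node group, in response to determining that the default node of at least one working node in the node group is in an abnormal state, for each such working node, the connections between all nodes from that working node to the working node preceding the redundant nodes of the node group and their corresponding backup nodes are enabled. Here, the number of nodes in an abnormal state in any node group is less than or equal to the number of redundant nodes in that node group, and, for each third node whose connection to its backup node is enabled, the default node of the third node is not adjacent to the backup node of the third node.
In some embodiments, the backup node of each working node in the node array is the default node of the next working node in the same node group.
In some embodiments, the backup node of each working node in the node array is the default node of a working node that is spaced apart from that working node in the same node group.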
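In some embodiments, the redundant node of each of the node groups includes the N-th node of the node group; the j-th node of the i-th node group is the default node of the j-th node of the (i-1)-th node group and of the j-th node of the (i+1)-th node group, and the (j+1)-th node of the i-th node group is the backup node of the j-th node of the (i-1)-th node group and of the j-th node of the (i+1)-th node group. When the j-th node of the i-th node group is in an abnormal state, the connection between the v-th node of the (i-1)-th node group and the (v+1)-th node of the i-th node group is enabled, and the connection between the v-th node of the (i+1)-th node group and the (v+1)-th node of the i-th node group is enabled, where 1≤j<N, j≤v<N, v, i, j and N are positive integers, and N is the total number of nodes in each of the node groups.
In some embodiments, the redundant nodes of each of the node groups include the 1st node and the N-th node of the node group; the j-th node of the i-th node group is the default node of the j-th node of the (i-1)-th node group and of the j-th node of the (i+1)-th node group; the (j+1)-th node and the (j-1)-th node of the i-th node group are the backup nodes of the j-th node of the (i-1)-th node group, and are also the backup nodes of the j-th node of the (i+1)-th node group. When the j-th node and the k-th node of the i-th node group are both in an abnormal state, the connection between the v-th node of the (i-1)-th node group and the (v+1)-th node of the i-th node group is enabled, the connection between the v-th node of the (i+1)-th node group and the (v+1)-th node of the i-th node group is enabled, the connection between the u-th node of the (i-1)-th node group and the (u-1)-th node of the i-th node group is enabled, and the connection between the u-th node of the (i+1)-th node group and the (u-1)-th node of the i-th node group is enabled, where 1<j<k<N, k<v<N, 1<u<j, u, v, i, j, k and N are positive integers, and N is the total number of nodes in each of the node groups.
In some embodiments, the redundant nodes of each of the node groups include the (N-1)-th node and the N-th node of the node group; the j-th node of the i-th node group is the default node of the j-th node of the (i-1)-th node group and of the j-th node of the (i+1)-th node group; the (j+1)-th node and the (j+2)-th node of the i-th node group are the backup nodes of the j-th node of the (i-1)-th node group, and are also the backup nodes of the j-th node of the (i+1)-th node group. When the j-th node and the (j+1)-th node of the i-th node group are both in an abnormal state, the connection between the v-th node of the (i-1)-th node group and the (v+2)-th node of the i-th node group is enabled, and the connection between the v-th node of the (i+1)-th node group and the (v+2)-th node of the i-th node group is enabled, where 1≤j<N-1, j≤v<N-1, v, i, j and N are positive integers, and N is the total number of nodes in each of the node groups.
In some embodiments, the redundant node of each of the node groups includes the N-th node of the node group; the j-th node of the i-th node group is the default node of the j-th node of the (i-1)-th node group and of the j-th node of the (i+1)-th node group, and the (j+1)-th node of the i-th node group is the backup node of the j-th node of the (i-1)-th node group and of the j-th node of the (i+1)-th node group. When the j-th node of the i-th node group and the j-th node of the (i+1)-th node group are both in an abnormal state, the connection between the v-th node of the (i-1)-th node group and the (v+1)-th node of the i-th node group is enabled, and the connection between the v-th node of the (i+2)-th node group and the (v+1)-th node of the (i+1)-th node group is enabled, where 1≤j<N, j≤v<N, v, i, j and N are positive integers, and N is the total number of nodes in each of the node groups.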
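In some embodiments, when the number of nodes in an abnormal state in any node group is greater than the number of redundant nodes in that node group, the target node in each of the node groups is bypassed, so that the number of non-bypassed nodes in an abnormal state in any node group is less than or equal to the number of redundant nodes in that node group. The target nodes include the node in the abnormal state, and the target node in one node group is the default node of the target node in another node group.
In some embodiments, the number of redundant nodes in a node group is determined based on at least one of the following: the area of the data processing device, the probability that a node is in an abnormal state, and the number of nodes in the node array.
In some embodiments, the connection between a node and its default node or backup node is enabled based on corresponding preset identification information, where each default node and backup node of a node corresponds to different preset identification information.
In some embodiments, the data processing device further includes a control unit configured to: for each node in the node array, obtain the working state of the node; and set the preset identification information of the node based on its working state.
In some embodiments, the abnormal state includes a first abnormal state caused by a process defect; the data processing device further includes a storage unit configured to store first position information of nodes in the first abnormal state, so that the control unit sets the preset identification information of the nodes in the first abnormal state based on the first position information.
In some embodiments, the abnormal state includes a second abnormal state caused by the working environment; the data processing device further includes a detection unit configured to detect, in real time while the data processing device is working, second position information of nodes in the second abnormal state, so that the control unit sets the preset identification information of the nodes in the second abnormal state based on the second position information.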
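In some embodiments, the control unit is further configured to: when at least one node switches from the normal state to the abnormal state, suspend the tasks currently being executed by the nodes in the node array before setting the preset identification information of the nodes based on the working states of the nodes in the node array.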
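In some embodiments, the output of each node in each of the node groups is connected to a multiplexer, and the input of each node in each node group is connected to a demultiplexer. The multiplexer of a node is used to output the node's output signal, through different channels, to the default node or a backup node of the node; the demultiplexer of a node is used to input into the node the output signals sent to it through different channels.
In some embodiments, the data processing device further includes a plurality of interfaces for connecting to nodes of other data processing devices.
In some embodiments, each node in the node array includes a bypass unit; when the node is in an abnormal state, the bypass unit bypasses the node so that the two nodes adjacent to it in its node group are directly connected.
In a second aspect, embodiments of the present disclosure provide a chip including the data processing device according to any embodiment of the present disclosure.
In a third aspect, embodiments of the present disclosure provide an electronic device including the data processing device according to any embodiment of the present disclosure.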
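In a fourth aspect, embodiments of the present disclosure provide a data processing method for adjusting the connection relationships of the nodes in the data processing device according to any embodiment of the first aspect of the present disclosure; the method includes:
obtaining the state of the default node of each node, the state being either a normal state or an abnormal state;
adjusting the connection relationships of a plurality of nodes in the node array based on the state of the default node of each node; wherein:
when the default node of a first node in a node group is in a normal state, the connection between the first node and its default node is enabled;
when the default node of a first node in a node group is in an abnormal state, the connections between the first node, together with at least one second node in the same node group as the first node, and their corresponding backup nodes are enabled, where the default node of each of the at least one second node is the backup node of another second node or of the first node.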
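In a fifth aspect, embodiments of the present disclosure provide a data processing device for adjusting the connection relationships of the nodes in the data processing device according to any embodiment of the first aspect of the present disclosure; the device includes:
an acquisition module, configured to obtain the state of the default node of each node, the state being either a normal state or an abnormal state;
an adjustment module, configured to adjust the connection relationships of a plurality of nodes in the node array based on the state of the default node of each node; wherein:
when the default node of a first node in a node group is in a normal state, the connection between the first node and its default node is enabled;
when the default node of a first node in a node group is in an abnormal state, the connections between the first node, together with at least one second node in the same node group as the first node, and their corresponding backup nodes are enabled, where the default node of each of the at least one second node is the backup node of another second node or of the first node.
In the embodiments of the present disclosure, each node of a node group can connect to a plurality of nodes including a default node and a backup node. When the default node is in a normal state, only the connection to the default node needs to be enabled; when the default node is in an abnormal state, the backup node can serve as the redundant node of the default node, which implements the redundant logic. In addition, when the default node of the first node is in an abnormal state, the connections between the first node, together with at least one second node in the same node group, and their corresponding backup nodes are enabled, and the default node of each second node is the backup node of another second node or of the first node. As a result, the overall topology of the node array stays as close as possible to the original topology, which reduces the change in node topology caused by applying the redundant logic.
It should be understood that the foregoing general description and the following detailed description are exemplary and explanatory only and do not limit the present disclosure.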
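Brief description of the drawings
The accompanying drawings are incorporated in and constitute a part of this specification; they illustrate embodiments consistent with the present disclosure and, together with the description, serve to explain the technical solutions of the present disclosure.
FIG. 1 is a schematic diagram of a data processing device according to an embodiment of the present disclosure.
FIG. 2A is a schematic diagram of how nodes are connected in the normal state according to an embodiment of the present disclosure.
FIG. 2B and FIG. 2C are schematic diagrams of different ways of connecting nodes when a node is in an abnormal state according to embodiments of the present disclosure.
FIG. 3 to FIG. 8 are schematic diagrams of the redundant logic in different cases according to embodiments of the present disclosure.
FIG. 9 and FIG. 10 are schematic diagrams of node arrays including processing cores and routers according to embodiments of the present disclosure.
FIG. 11 is a schematic diagram of how nodes are connected when a node in the node array shown in FIG. 10 is in an abnormal state.
FIG. 12 is a schematic diagram of a data processing device including interfaces according to an embodiment of the present disclosure.
FIG. 13 is a schematic diagram of a data processing device including multiplexers and demultiplexers according to an embodiment of the present disclosure.
FIG. 14 is a schematic diagram of the case where the number of nodes in an abnormal state exceeds the number of redundant nodes according to an embodiment of the present disclosure.
FIG. 15 is a flowchart of a data processing method according to an embodiment of the present disclosure.
FIG. 16 is a block diagram of a data processing device according to another embodiment of the present disclosure.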
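Detailed description
Exemplary embodiments are described in detail here, with examples shown in the accompanying drawings. Where the following description refers to the drawings, the same numerals in different drawings denote the same or similar elements unless otherwise indicated. The implementations described in the following exemplary embodiments do not represent all implementations consistent with the present disclosure. Rather, they are merely examples of devices and methods consistent with some aspects of the present disclosure as detailed in the appended claims.
The terms used in the present disclosure are for the purpose of describing particular embodiments only and are not intended to limit the present disclosure. The singular forms "a", "said" and "the" used in the present disclosure and the appended claims are also intended to include the plural forms, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used herein refers to and includes any or all possible combinations of one or more of the associated listed items. In addition, the term "at least one" herein means any one of a plurality, or any combination of at least two of a plurality.
It should be understood that although the terms first, second, third, etc. may be used in the present disclosure to describe various pieces of information, the information should not be limited by these terms. These terms are only used to distinguish information of the same type from each other. For example, without departing from the scope of the present disclosure, first information may also be called second information, and similarly, second information may also be called first information. Depending on the context, the word "if" as used herein can be interpreted as "when", "upon" or "in response to determining".
To help those skilled in the art better understand the technical solutions in the embodiments of the present disclosure, and to make the above objects, features and advantages of the embodiments more apparent, the technical solutions in the embodiments are described in further detail below with reference to the drawings.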
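For a data processing device that includes a plurality of nodes, a single node in an abnormal state may render the entire device unqualified. In some application scenarios, the data processing device is used in a chip. With the rapid development of artificial intelligence, chip area keeps growing, and the larger the chip area, the lower the yield, where yield is the ratio of the number of qualified chips to the total number of chips produced. To improve yield, redundant logic is generally added to the chip. For example, in a chip with a node array of X rows and Y columns, an extra row or column of nodes can be added as redundant nodes. Under normal conditions the redundant nodes do not work; when a node in the array is in an abnormal state, a redundant node can work in its place. However, the redundant logic in the related art often requires substantial changes to the topology of the nodes in the chip.
In view of this, the present disclosure provides a data processing device. Referring to FIG. 1, the data processing device includes a node array 101, and the node array 101 includes a plurality of node groups 101a;
where adjacent nodes in a node group 101a are connected, and each node in the node group 101a is connected to a plurality of nodes of other node groups 101a; the plurality of nodes includes a default node and a backup node, and the default node of one node is the backup node of at least one other node in the same node group 101a;
when the default node of a first node in a node group 101a is in a normal state, the connection between the first node and its default node is enabled;
when the default node of a first node in a node group 101a is in an abnormal state, the connections between the first node, together with at least one second node in the same node group as the first node, and their corresponding backup nodes are enabled.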
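In the data processing device provided by the embodiments of the present disclosure, the node array 101 may include a plurality of nodes. The nodes in each node group are used for data processing and may also be called data processing nodes or processing nodes. The nodes may form an array of P rows and Q columns, where P and Q are positive integers. In the embodiment shown in FIG. 1, each square represents a node, and P and Q are both 5. In practice, P and Q may take other values and need not be equal. In FIG. 1, the numbers in each square are the logical coordinates of the node in the node array 101: the first number in the parentheses is the row of the node and the second is its column. For example, (0,0) denotes the node in row 1, column 1 of the node array 101, and (0,1) denotes the node in row 1, column 2. For ease of description, a node is referred to below by its coordinates, for example node (0,0) for the node at (0,0); other functional units are named in the same way.
The node array 101 may include a plurality of node groups 101a, also simply called groups. A node group 101a may be a row or a column of the node array 101. For example, when each column of the node array 101 is a node group 101a, node (0,0), node (1,0), node (2,0), node (3,0) and node (4,0) form one node group 101a. Adjacent nodes in the same node group are connected: node (0,0) is connected to node (1,0), and node (2,0) is connected to node (1,0) and node (3,0). Solid lines with arrows indicate connections, and the arrows indicate the direction of data flow. Each node in a node group 101a is used to connect to a plurality of nodes of another (for example, adjacent) node group 101a. Taking the first node group (the first column of the node array 101) and the second node group (the second column) as an example, node (0,0) in the first node group is used to connect to node (0,1) and node (1,1) in the second node group, and node (1,0) in the first node group is used to connect to node (0,1), node (1,1) and node (2,1) in the second node group.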
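The plurality of nodes of another node group 101a to which a node connects includes a default node and a backup node; a node may have one or more default nodes and one or more backup nodes. When each column of the node array 101 is a node group 101a, for a node in a node group, the adjacent node in another node group that is in the same row (same row coordinate) can serve as its default node, and a node in another node group that is in a different row (different row coordinate) and is connected to it can serve as its backup node. For example, the default node of node (0,0) is node (0,1), and the default nodes of node (0,1) include node (0,0) and node (0,2); the backup node of node (0,0) is node (1,1), and the backup nodes of node (0,1) include node (1,0) and node (1,2).
Beyond the case shown in FIG. 1, a node may have more backup nodes. For example, the backup nodes of node (0,0) may include node (1,1) and node (2,1), and the backup nodes of node (0,1) may include node (1,0), node (2,0), node (1,2) and node (2,2). That is, the backup nodes of a node may include one or more nodes that are in the same node group as its default node and in rows below the default node. The backup nodes of a node may instead include one or more nodes that are in the same node group as its default node and in rows above the default node; for example, the backup node of node (2,0) may be node (1,1), or may include node (1,1) and node (0,1). A node's backup nodes may also include nodes of both kinds; for example, the backup nodes of node (2,0) may include node (1,1) and node (3,1), or node (0,1), node (1,1) and node (3,1). The above embodiments show row distances of at most 2 between a node and its backup nodes; in other embodiments the row distance may be greater than 2, which is not elaborated here.
The default node of a node may be the backup node of at least one other node in the same node group. For example, the default node of node (1,0) is node (1,1), and node (1,1) can serve as the backup node of node (0,0) and node (2,0).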
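The examples above describe the case where each node group 101a is a column of the node array 101. In other examples, each node group 101a may instead be a row of the node array 101. For example, in FIG. 1, node (2,0), node (2,1), node (2,2), node (2,3) and node (2,4) form one node group 101a. Adjacent nodes in the same node group are connected: node (2,0) is connected to node (2,1), and node (2,2) is connected to node (2,1) and node (2,3). Solid lines with arrows indicate connections, and the arrows indicate the direction of data flow. Each node in a node group 101a is used to connect to a plurality of nodes of another node group 101a. Taking the third node group (the first row of the node array 101) and the fourth node group (the second row) as an example, node (0,0) in the third node group is used to connect to node (1,0) and node (1,1) in the fourth node group, and node (0,1) in the third node group is used to connect to node (1,0), node (1,1) and node (1,2) in the fourth node group.
Likewise, the plurality of nodes includes a default node and a backup node, and a node may have one or more default nodes and one or more backup nodes. When each row of the node array 101 is a node group 101a, for a node in a node group, the adjacent node in another node group that is in the same column (same column coordinate) can serve as its default node, and a node in another node group that is in a different column (different column coordinate) and is connected to it can serve as its backup node. For example, the default node of node (0,0) is node (1,0), and the default nodes of node (1,0) include node (0,0) and node (2,0); the backup node of node (0,0) is node (1,1), and the backup nodes of node (1,0) include node (0,1) and node (2,1).
Beyond the case shown in FIG. 1, a node may have more backup nodes. For example, the backup nodes of node (0,0) may include node (1,1) and node (1,2), and the backup nodes of node (1,0) may include node (0,1), node (0,2), node (2,1) and node (2,2). That is, the backup nodes of a node may include one or more nodes that are in the same node group as its default node and in columns after the default node. The backup nodes of a node may instead include one or more nodes that are in the same node group as its default node and in columns before the default node; for example, the backup node of node (0,2) may be node (1,1), or may include node (1,1) and node (1,0). A node's backup nodes may also include nodes of both kinds; for example, the backup nodes of node (0,2) may include node (1,1) and node (1,3), or node (1,0), node (1,1) and node (1,3). The above embodiments show column distances of at most 2 between a node and its backup nodes; in other embodiments the column distance may be greater than 2, which is not elaborated here.
The default node of a node may be the backup node of at least one other node in the same node group. For example, the default node of node (0,1) is node (1,1), and node (1,1) can serve as the backup node of node (0,0) and node (0,2).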
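In some embodiments, when a node is in a normal state, it is connected to its adjacent nodes in the same group; when a node is in an abnormal state, it can be bypassed so that its two adjacent nodes in the same group are connected. In some embodiments, only one of a node's connections to its default node and backup nodes is enabled at any one time. Further, the connection between a first node and its default node can take priority: as long as the default node of the first node is in a normal state, the first node connects to its default node, and only when the default node is in an abnormal state does the first node connect to a backup node. In this way, a backup node can act as the redundant node of the default node and connect to the first node when the default node is abnormal, so that the node array 101 as a whole keeps working normally.
For example, when each node group 101a is a column of the node array 101, suppose the default node of node (1,0) is node (1,1) and its backup nodes are node (2,1) and node (3,1). When node (1,1) is in a normal state, the connection between node (1,0) and node (1,1) is enabled, and the connections between node (1,0) and node (2,1) and between node (1,0) and node (3,1) are disabled. When node (1,1) is in an abnormal state, the connection between node (1,0) and node (1,1) is disabled, the connection between node (1,0) and node (2,1) is enabled, and the connection between node (1,0) and node (3,1) is disabled. Alternatively, when node (1,1) is in an abnormal state, the connections between node (1,0) and node (1,1) and between node (1,0) and node (2,1) are disabled, and the connection between node (1,0) and node (3,1) is enabled. Which backup connection is enabled can be decided according to the actual situation.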
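When each node group 101a is a row of the node array 101, suppose the default node of node (0,1) is node (1,1) and its backup nodes are node (1,2) and node (1,3). When node (1,1) is in a normal state, the connection between node (0,1) and node (1,1) is enabled, and the connections between node (0,1) and node (1,2) and between node (0,1) and node (1,3) are disabled. When node (1,1) is in an abnormal state, the connection between node (0,1) and node (1,1) is disabled, the connection between node (0,1) and node (1,2) is enabled, and the connection between node (0,1) and node (1,3) is disabled. Alternatively, when node (1,1) is in an abnormal state, the connections between node (0,1) and node (1,1) and between node (0,1) and node (1,2) are disabled, and the connection between node (0,1) and node (1,3) is enabled. Which backup connection is enabled can be decided according to the actual situation.
Those skilled in the art will understand that the case where each node group 101a is a row of the node array 101 is equivalent to rotating the node array by 90 degrees relative to the case where each node group 101a is a column. For ease of explanation, the description below mainly takes each node group 101a to be a column of the node array 101.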
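In addition to enabling the connection between the first node and its backup node, when the default node of the first node is in an abnormal state, the embodiments of the present disclosure also enable the connections between at least one second node in the same node group as the first node and its corresponding backup node. The default node of each second node is the backup node of another second node or of the first node. For example, for any node A in FIG. 1 other than those in the last row, suppose the adjacent node in the same row as node A is the default node of node A, and the node in the row below the default node of node A, in the same column as that default node, is the backup node of node A.
When all nodes are in a normal state, the enabled connections are those between each node and its default node, as shown in FIG. 2A. Because the connection between each node and its backup node is disabled, connections exist only between adjacent nodes in the same row and between adjacent nodes in the same column.
Suppose the default node of node (1,0), namely node (1,1), is in an abnormal state. Then the connections between node (1,0), node (2,0), node (3,0), node (4,0), node (1,2), node (2,2), node (3,2) and node (4,2) and their respective backup nodes are all enabled, and the connections between these nodes and their respective default nodes are disabled; the resulting connections are shown in FIG. 2B. Note that under this redundant logic the nodes in the last row have no corresponding backup nodes, so node (4,0), node (4,2), node (4,3) and node (4,4) in the last row can be set to a non-working state, that is, deactivated.
In this way, the topology of the nodes in the node array stays essentially unchanged; the array simply has one row fewer. For example, in the embodiment shown in FIG. 2B, node (4,1) replaces node (3,1) as the node connecting node (3,0) and node (3,2), node (3,1) replaces node (2,1) as the node connecting node (2,0) and node (2,2), and node (2,1) replaces node (1,1) as the node connecting node (1,0) and node (1,2); the topology of the first 4 rows of the node array is unchanged.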
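The row shift described for FIG. 2B can be sketched as a mapping from logical rows to physical rows within one column. This is only an illustration: the function name `remap_column` is hypothetical, and the sketch assumes a single spare in the last physical row and at most one abnormal node in the column.

```python
def remap_column(num_rows, faulty_row=None):
    """Map each logical row of one column to the physical row that serves it.

    Assumes one spare in the last physical row (num_rows - 1) and at most
    one abnormal node in the column; both are assumptions of this sketch.
    """
    mapping = {}
    for logical in range(num_rows - 1):      # the last physical row is the spare
        if faulty_row is None or logical < faulty_row:
            mapping[logical] = logical       # the default node is still used
        else:
            mapping[logical] = logical + 1   # shifted down to the backup node
    return mapping


# FIG. 2B: node (1,1) is abnormal in a column of 5 physical rows.
print(remap_column(5, faulty_row=1))  # {0: 0, 1: 2, 2: 3, 3: 4}
```

Rows above the abnormal node keep their default node, while every row from the abnormal node downward moves one row toward the spare, which is why the first four logical rows keep the same neighbours.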
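Of course, the above is only an example and is not the only way to implement the solution of the present disclosure. For example, in the embodiment shown in FIG. 2C, for any node A in FIG. 1 other than those in the last two rows, suppose the adjacent node in the same row as node A is the default node of node A, and the node in the second row below the default node of node A, in the same column as that default node, is the backup node of node A. In other embodiments, the node in the z-th row below the default node of node A, in the same column as that default node, can be the backup node of node A, where z is an integer greater than 2. Alternatively, the nodes in several rows below the default node of node A, in the same column as that default node, can all be backup nodes of node A.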
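In some embodiments, each node group includes at least one redundant node and working nodes other than the redundant node; a redundant node of one node group is the backup node of at least one working node of another node group, and the first node is a working node. For example, in the embodiments shown in FIG. 2A and FIG. 2B, the last node of each node group can serve as the redundant node, that is, all nodes in the last row of the node array 101 are redundant nodes.
When all working nodes of a node group are in a normal state, the redundant node of that node group can be disabled, that is, set to a non-working state. The redundant node of a node group is enabled only when the node group contains a working node that is not in a normal state. Taking the embodiment shown in FIG. 2B again, the nodes in the last row of the node array are redundant nodes and are normally in a non-working state. From some moment on, node (1,1) is in an abnormal state, so node (4,1) among the redundant nodes is enabled to implement the redundant logic, while the other redundant nodes remain in a non-working state. In this way, the topology of all working nodes stays unchanged. This embodiment differs from the earlier embodiment, in which node (4,0), node (4,2), node (4,3) and node (4,4) are set to a non-working state, as follows: in the earlier embodiment the nodes in the last row work like all other nodes and are set to a non-working state only when some node is in an abnormal state.
The above embodiment shows redundant nodes forming one row of the node array 101. Redundant nodes may instead form one column of the node array 101, or at least two rows and/or at least two columns. Taking two rows of redundant nodes as an example, the two rows may be consecutive rows of the node array 101, such as rows 1 and 2, or the last row and the second-to-last row; or they may be non-consecutive rows, such as row 1 and the last row.
The number of redundant nodes in a node group is determined based on at least one of the following: the area of the data processing device, the probability that a node is in an abnormal state, and the number of nodes in the node array. In general, each of these is positively correlated with the number of redundant nodes in a node group. The larger the area of the data processing device, the more nodes a node group contains and the more of them may be abnormal, so more redundant nodes are needed in the node group. Likewise, the higher the probability that a node is abnormal, or the more nodes the node array 101 contains, the more nodes in a node group may be abnormal, so more redundant nodes are again needed.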
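One way to make the relationship between fault probability, group size and spare count concrete is a simple binomial estimate. This is an illustrative model only: the source does not give a formula, and the sketch assumes that nodes fail independently with the same probability and that a group is usable whenever its number of abnormal nodes does not exceed its number of redundant nodes. The function names are hypothetical.

```python
from math import comb


def group_ok_probability(n_nodes, n_spares, p_fault):
    """Probability that at most n_spares of n_nodes are abnormal (binomial model)."""
    return sum(
        comb(n_nodes, k) * p_fault**k * (1 - p_fault) ** (n_nodes - k)
        for k in range(n_spares + 1)
    )


def min_spares(n_working, p_fault, target, max_spares=64):
    """Smallest spare count for which a group reaches the target probability."""
    for spares in range(max_spares + 1):
        if group_ok_probability(n_working + spares, spares, p_fault) >= target:
            return spares
    raise ValueError("target not reachable within max_spares")


# With 4 working nodes per group and a 5% per-node fault probability,
# how many spares are needed for a 99% chance that the group is usable?
print(min_spares(4, 0.05, 0.99))
```

Under this model, raising either the per-node fault probability or the number of working nodes never lowers the required spare count, which matches the positive correlation described above.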
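Possible numbers and positions of redundant nodes, and the redundant logic in each case, are illustrated below.
Case one: each node group includes one redundant node. In this case, in response to the default node of at least one working node being in an abnormal state, the connections between the nodes from that working node to the last working node of its node group and their corresponding backup nodes are enabled, where the number of nodes in an abnormal state in any node group is less than or equal to the number of redundant nodes in that group, and the redundant node is the node following the last working node in its group. Optionally, the backup node of each working node is the default node of the next working node in the same group; or, optionally, the backup node of each working node is the default node of a working node that comes after it in the same group at a distance of 2 or more.
For example, suppose the default node of node A is in an abnormal state, node A belongs to group T, and the nodes of group T are {node B, node C, node A, node D, node E, node F, redundant node}. The backup node of node A is the default node of the next working node after node A (node D), the backup node of node D is the default node of the next working node after node D (node E), and so on. The nodes from working node A to the last working node F of the group are node A, node D, node E and node F, and the connections between nodes A, D, E and F and their corresponding backup nodes are all enabled.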
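In some embodiments, the redundant node of each node group includes the N-th node of the node group; the j-th node of the i-th node group is the default node of the j-th node of the (i-1)-th node group and of the j-th node of the (i+1)-th node group, and the (j+1)-th node of the i-th node group is the backup node of the j-th node of the (i-1)-th node group and of the j-th node of the (i+1)-th node group.
When the j-th node of the i-th node group is in an abnormal state, the connection between the v-th node of the (i-1)-th node group and the (v+1)-th node of the i-th node group is enabled, and the connection between the v-th node of the (i+1)-th node group and the (v+1)-th node of the i-th node group is enabled, where 1≤j<N, j≤v<N, v, j and N are positive integers, and N is the total number of nodes in each node group.
This embodiment is similar to the one shown in FIG. 2B. The only difference is that the redundant node is disabled while its node group contains no working node in an abnormal state; this difference has been explained in the preceding embodiments and is not repeated here.
Note that the redundant node of each node group may instead be the 1st node of the node group. In that case, the j-th node of the i-th node group is the default node of the j-th node of the (i-1)-th node group and of the j-th node of the (i+1)-th node group, and the (j-1)-th node of the i-th node group is the backup node of the j-th node of the (i-1)-th node group and of the j-th node of the (i+1)-th node group. When the j-th node of the i-th node group is in an abnormal state, the connection between the v-th node of the (i-1)-th node group and the (v-1)-th node of the i-th node group is enabled, and the connection between the v-th node of the (i+1)-th node group and the (v-1)-th node of the i-th node group is enabled, where 1<j≤N, 1<v≤j, v, i, j and N are positive integers, and N is the total number of nodes in each node group. This case is similar to the case where the redundant node is the N-th node of the node group and is equivalent to flipping the data processing device upside down.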
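Case two: each node group is provided with at least two redundant nodes, distributed at both ends of the node group. In this case, in response to the default node of at least one working node being in an abnormal state, the connections between the nodes from that working node to the working node preceding its target redundant node and their corresponding backup nodes are enabled. Here, the number of nodes in an abnormal state in any node group is less than or equal to the number of redundant nodes in that group; the target redundant node is in the same node group as the working node; and the default node of every working node between the working node following the working node and the working node preceding the target redundant node is in a normal state.
Optionally, the backup node of each working node is the default node of the next working node in the same node group; or, optionally, the backup node of each working node is the default node of a working node that comes after it in the same node group at a distance of 2 or more.
For example, suppose the default nodes of node A and node B are in an abnormal state, node A belongs to group T, and the nodes of group T are {redundant node 1, node C, node D, node A, node E, node B, node F, redundant node 2}. Then the target redundant node of node A is redundant node 1, and the target redundant node of node B is redundant node 2. For node A, the nodes from working node A to the working node C preceding its target redundant node (redundant node 1) are node C, node D and node A; for node B, the nodes from working node B to the working node F preceding its target redundant node (redundant node 2) are node B and node F.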
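Taking two redundant nodes per node group as an example, suppose the redundant nodes of each node group include the 1st node and the N-th node of the node group; the j-th node of the i-th node group is the default node of the j-th node of the (i-1)-th node group and of the j-th node of the (i+1)-th node group; the (j+1)-th node and the (j-1)-th node of the i-th node group are the backup nodes of the j-th node of the (i-1)-th node group, and are also the backup nodes of the j-th node of the (i+1)-th node group.
When the j-th node and the k-th node of the i-th node group are both in an abnormal state, the connection between the v-th node of the (i-1)-th node group and the (v+1)-th node of the i-th node group is enabled, the connection between the v-th node of the (i+1)-th node group and the (v+1)-th node of the i-th node group is enabled, the connection between the u-th node of the (i-1)-th node group and the (u-1)-th node of the i-th node group is enabled, and the connection between the u-th node of the (i+1)-th node group and the (u-1)-th node of the i-th node group is enabled, where 1<j<k<N, k<v<N, 1<u<j, u, v, i, j, k and N are positive integers, and N is the total number of nodes in each node group.
Referring to FIG. 3, suppose the node array 101 has 6 rows and 5 columns, where the row with row coordinate 0 and the row with row coordinate 5 are redundant nodes. When node (4,1) is in an abnormal state, the connection between node (4,0) and node (5,1) and the connection between node (4,2) and node (5,1) can be enabled. When node (2,1) is in an abnormal state, the connection between node (2,0) and node (1,1), the connection between node (2,2) and node (1,1), the connection between node (1,0) and node (0,1), and the connection between node (1,2) and node (0,1) can be enabled.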
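The FIG. 3 example, with one spare at each end of a column, can be sketched as follows. The function name `remap_two_end_spares` is hypothetical, and the rule used for a single abnormal node (shift toward whichever spare displaces fewer rows) is an assumption of this sketch rather than something the source specifies.

```python
def remap_two_end_spares(num_rows, faulty_rows):
    """Map working rows 1..num_rows-2 to physical rows; rows 0 and num_rows-1 are spares."""
    faults = sorted(faulty_rows)
    if len(faults) > 2:
        raise ValueError("more abnormal nodes than spares in this column")
    last = num_rows - 1
    mapping = {row: row for row in range(1, last)}  # default nodes to start with
    up = down = None
    if len(faults) == 2:
        up, down = faults          # lower index shifts up, higher index shifts down
    elif len(faults) == 1:
        fault = faults[0]
        # Assumption: pick the direction that moves fewer rows.
        if fault <= last - fault:
            up = fault
        else:
            down = fault
    if up is not None:
        for row in range(1, up + 1):
            mapping[row] = row - 1     # rows up to the abnormal node shift toward row 0
    if down is not None:
        for row in range(down, last):
            mapping[row] = row + 1     # rows from the abnormal node shift toward the last row
    return mapping


# FIG. 3: 6 rows, node (2,1) and node (4,1) are abnormal in column 1.
print(remap_two_end_spares(6, [2, 4]))  # {1: 0, 2: 1, 3: 3, 4: 5}
```

In the printed result, logical rows 1 and 2 move up by one row, logical row 4 moves down to the bottom spare, and row 3, which lies between the two abnormal nodes, keeps its default node.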
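Case three: each node group is provided with at least two redundant nodes, distributed at one end of the node group. In this case, in response to the default node of at least one working node being in an abnormal state, the connections between the nodes from that working node to the working node preceding the redundant nodes of its group and their corresponding backup nodes are enabled, where the number of nodes in an abnormal state in any node group is less than or equal to the number of redundant nodes in that group.
In some embodiments, for each third node whose connection to its backup node is enabled, the default node of the third node is not adjacent to the backup node of the third node. For example, the backup node of each working node is the default node of a working node that is spaced apart from it in the same group, where two nodes being spaced apart covers the case of one or more nodes lying between them.
For example, suppose the default nodes of node A and node B are in an abnormal state, node A belongs to group T, and the nodes of group T are {node C, node D, node A, node B, node E, node F, redundant node 1, redundant node 2}. Node A takes redundant node 1 as its target redundant node, and the nodes from working node A to the working node F preceding redundant node 1 are node A, node B, node E and node F. The backup node of node A is node E, and the default node of node B lies between the default node of node A and the backup node of node A, so the default node and the backup node of node A are not adjacent. Similarly, the backup node of node B is node F, and the default node of node E lies between the default node of node B and the backup node of node B, so the default node and the backup node of node B are not adjacent.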
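For example, the redundant nodes of each node group include the (N-1)-th node and the N-th node of the node group; the j-th node of the i-th node group is the default node of the j-th node of the (i-1)-th node group and of the j-th node of the (i+1)-th node group; the (j+1)-th node and the (j+2)-th node of the i-th node group are the backup nodes of the j-th node of the (i-1)-th node group, and are also the backup nodes of the j-th node of the (i+1)-th node group.
When the j-th node and the (j+1)-th node of the i-th node group are both in an abnormal state, the connection between the v-th node of the (i-1)-th node group and the (v+2)-th node of the i-th node group is enabled, and the connection between the v-th node of the (i+1)-th node group and the (v+2)-th node of the i-th node group is enabled, where 1≤j<N-1, j≤v<N-1, v, i, j and N are positive integers, and N is the total number of nodes in each node group.
Referring to FIG. 4, suppose the node array 101 has 6 rows and 5 columns, where the row with row coordinate 4 and the row with row coordinate 5 are redundant nodes. When node (2,1) and node (3,1) are both in an abnormal state, the connection between node (2,0) and node (4,1), the connection between node (2,2) and node (4,1), the connection between node (3,0) and node (5,1), and the connection between node (3,2) and node (5,1) can be enabled.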
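The FIG. 4 example can be sketched by assigning logical rows, in order, to the healthy physical rows of a column. The function name `remap_skip_faulty` is hypothetical. The source only spells out the case of two adjacent abnormal nodes; applying the same assignment to non-adjacent abnormal nodes is an assumption of this sketch.

```python
def remap_skip_faulty(num_rows, num_spares, faulty_rows):
    """Assign logical rows, in order, to the healthy physical rows of one column.

    The spares are assumed to occupy the last num_spares physical rows.
    """
    faulty = set(faulty_rows)
    healthy = [row for row in range(num_rows) if row not in faulty]
    working = num_rows - num_spares
    if len(healthy) < working:
        raise ValueError("more abnormal nodes than spares in this column")
    return {logical: healthy[logical] for logical in range(working)}


# FIG. 4: 6 rows, rows 4 and 5 are spares, node (2,1) and node (3,1) are abnormal.
print(remap_skip_faulty(6, 2, [2, 3]))  # {0: 0, 1: 1, 2: 4, 3: 5}
```

Logical rows 2 and 3 land two rows below their default nodes, which matches the enabled connections from node (2,0) to node (4,1) and from node (3,0) to node (5,1). Because at most `num_spares` rows are skipped, no logical row ever moves further than `num_spares` rows from its default position.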
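Case four: a node and its default node are both in an abnormal state. For example, the redundant node of each node group includes the N-th node of the node group; the j-th node of the i-th node group is the default node of the j-th node of the (i-1)-th node group and of the j-th node of the (i+1)-th node group, and the (j+1)-th node of the i-th node group is the backup node of the j-th node of the (i-1)-th node group and of the j-th node of the (i+1)-th node group.
When the j-th node of the i-th node group and the j-th node of the (i+1)-th node group are both in an abnormal state, the connection between the v-th node of the (i-1)-th node group and the (v+1)-th node of the i-th node group is enabled, and the connection between the v-th node of the (i+2)-th node group and the (v+1)-th node of the (i+1)-th node group is enabled, where 1≤j<N, j≤v<N, v, i, j and N are positive integers, and N is the total number of nodes in each node group.
Case four is in fact a special form of case one, case two or case three: it is a special form of case one when each node group includes one redundant node; of case two when each node group includes two redundant nodes distributed at both ends of the node group; and of case three when each node group includes two redundant nodes distributed at one end of the node group. Referring to FIG. 5, taking one redundant node per node group as an example, suppose the node array 101 has 5 rows and 5 columns, where the row with row coordinate 4 is the row of redundant nodes. When node (2,1) and node (2,2) are both in an abnormal state, the connection between node (2,0) and node (3,1), the connection between node (2,3) and node (3,2), the connection between node (3,0) and node (4,1), and the connection between node (3,3) and node (4,2) can be enabled.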
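In addition, two or more of the above cases can be combined; FIG. 6 and FIG. 7 show two such combinations. FIG. 6 combines case three and case four. Here the redundant nodes are the last two rows of the node array 101. Because node (3,1) in the second column and node (3,2) in the third column are both in an abnormal state, the nodes of the first column look for backup nodes in the second column, and the nodes of the fourth column look for backup nodes in the third column. Furthermore, because the second and third columns each contain two nodes in an abnormal state and the redundant nodes are two consecutive rows of the node array 101, the row distance between each of node (2,0), node (3,0), node (2,3) and node (3,3) and its backup node is 2. For example, the backup node of node (2,0) is node (4,1), and their row distance is 4-2=2.
FIG. 7 combines case two and case four. Here the redundant nodes are the first row and the last row of the node array 101. Because node (2,1) in the second column and node (2,2) in the third column are both in an abnormal state, the nodes of the first column look for backup nodes in the second column, and the nodes of the fourth column look for backup nodes in the third column. Furthermore, because the second and third columns each contain two nodes in an abnormal state and the redundant nodes are two non-consecutive rows of the node array 101, node (1,0) and node (1,3) look upward for backup nodes, while node (2,0), node (3,0), node (2,3) and node (3,3) look downward for backup nodes.
FIG. 8 shows redundant nodes provided in both the row direction and the column direction of the node array 101. In this case, when node (3,1) is in an abnormal state, node (4,1) among the redundant nodes provided in the row direction can replace node (3,1); when node (1,3) is in an abnormal state, node (1,4) among the redundant nodes provided in the column direction can replace node (1,3).
Beyond the cases listed above, the positions and number of redundant nodes and the implementation of the redundant logic can be adjusted to other arrangements as needed, which are not listed one by one here.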
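In some embodiments, each node in the node array includes a processing core and a router connected to the processing core, and the router of each node in one node group is used to connect to the routers of a plurality of nodes of another node group. That is, in the above embodiments the connections between nodes are implemented through connections between the nodes' routers. A router can transfer data between nodes, can send data received by a node to the processing core connected to it so that the processing core processes the data, and can receive data returned by the processing core.
As shown in FIG. 9 and FIG. 10, each square represents a router and each ellipse represents a processing core. The coordinates in a square are the coordinates of the router, and the coordinates in an ellipse are the coordinates of the processing core. For brevity, FIG. 9 and FIG. 10 show only the connections between each router and its adjacent routers and omit the connections between routers that are not adjacent. Each dashed box in FIG. 9 and FIG. 10 represents a node. In FIG. 9, each node includes one router and one processing core; each processing core is connected to one router, and different routers are connected to different processing cores. In FIG. 10, each node in the first and last columns includes one router and one processing core, every other node includes one router and two processing cores, and adjacent nodes can share a processing core.
In some embodiments, a node is in a normal state when both its router and its processing core are in a normal state, and in an abnormal state when at least one of its router and its processing core is in an abnormal state. For example, in the embodiment shown in FIG. 9, suppose node (0,0) includes router (0,0) and processing core (0,0). Node (0,0) is considered normal when router (0,0) and processing core (0,0) are both normal, and abnormal when either router (0,0) or processing core (0,0) is abnormal. In the embodiment shown in FIG. 10, a node is regarded as abnormal if any processing core or router belonging to it is abnormal. For example, when processing core (1,1) is abnormal, the node consisting of processing core (1,1), router (1,2) and processing core (1,2) is determined to be abnormal, and so is the node consisting of processing core (1,0), router (1,1) and processing core (1,1). A node is regarded as normal only when all processing cores and routers belonging to it are normal.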
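The state rule above, including the shared processing cores of FIG. 10, can be sketched as follows. The `Node` class and the function name are hypothetical; the coordinates in the example are the ones named in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    router: tuple   # coordinates of the node's router
    cores: tuple    # coordinates of every processing core belonging to the node


def node_is_normal(node, faulty_routers, faulty_cores):
    """A node is normal only if its router and all of its cores are normal."""
    if node.router in faulty_routers:
        return False
    return not any(core in faulty_cores for core in node.cores)


# FIG. 10 example: processing core (1,1) is abnormal and is shared by two nodes.
faulty_cores = {(1, 1)}
node_a = Node(router=(1, 2), cores=((1, 1), (1, 2)))
node_b = Node(router=(1, 1), cores=((1, 0), (1, 1)))
node_c = Node(router=(1, 0), cores=((1, 0),))
for node in (node_a, node_b, node_c):
    print(node.router, node_is_normal(node, set(), faulty_cores))
```

Because a shared core belongs to two nodes, a single abnormal core marks both of those nodes as abnormal, while the node that only uses core (1,0) stays normal.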
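In the embodiment shown in FIG. 9, each processing core is connected to only one router, so when a node is abnormal only the router connections need to be adjusted, not the processing core connections. In the embodiment shown in FIG. 10, a processing core can be connected to two routers, so when a node is abnormal, the connections of both the routers and the processing cores connected to that node need to be adjusted. As shown in FIG. 11, the last row of routers and the processing cores connected to them are redundant routers and redundant processing cores, drawn as white squares and white ellipses. Black squares are abnormal routers, black ellipses are abnormal processing cores, and gray squares and gray ellipses are normal routers and normal processing cores. Because router (3,1) is abnormal, the connections of router (3,0) and processing core (3,0), which connect to router (3,1), both need to be adjusted; after adjustment, router (3,0) and processing core (3,0) connect to router (4,1). Similarly, the connections of router (3,3) and processing core (3,2), which connect to router (3,2), both need to be adjusted; after adjustment, router (3,3) and processing core (3,2) connect to router (4,2). After adjustment, router (4,1), router (4,2) and processing core (4,1) work normally as backup nodes, so they are drawn in the figure as gray squares and a gray ellipse.
In some embodiments, the data processing device further includes a plurality of interfaces for connecting to nodes of other data processing devices. Multiple interfaces can be placed around the periphery of the data processing device; one arrangement is shown in FIG. 12. In practice, each node can have its own interface; in other embodiments, several nodes can share one interface. An interface can send data output by other data processing devices to a router in this data processing device, and can output data sent by a router of this data processing device to other data processing devices. In some embodiments, the interface can be a SerDes interface, a GPIO bus interface, an I2C interface, or the like.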
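In some embodiments, the connection between a node A and its default node or backup node is enabled based on the preset identification information of that node (node A itself). The preset identification information can be a binary string of several data bits, where the number of bits is determined by the total number of default nodes and backup nodes of a node. For example, when the total number of default nodes and backup nodes does not exceed 4, 2 bits are used; when the total is greater than 4 and does not exceed 8, 3 bits are used.
Each default node and backup node of a node corresponds to different preset identification information. For example, when a node has one default node and two backup nodes, the identification information of the default node can be set to 00, that of one backup node to 01, and that of the other backup node to 11. The connection between each node and its default node or a backup node can then be enabled selectively according to the identification information. For example, when the identification information of a node is 00, the connection between the node and its default node is enabled.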
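The identification scheme can be sketched as a bit-width calculation plus a code lookup. The names `id_width`, `CODE_TO_ROLE` and `enabled_target` are hypothetical. The text only gives the widths for totals up to 8; extending the same rule to larger totals is an assumption of this sketch.

```python
def id_width(num_targets):
    """Bits needed to give every default/backup node a distinct code (at least 2)."""
    return max(2, (num_targets - 1).bit_length())


# Codes given in the text for one default node and two backup nodes.
CODE_TO_ROLE = {0b00: "default", 0b01: "backup_1", 0b11: "backup_2"}


def enabled_target(code, targets):
    """Return the node whose link is enabled for a given identification code.

    `targets` maps a role name to the coordinates of the node on that link.
    """
    return targets[CODE_TO_ROLE[code]]


# Node (1,0) with default node (1,1) and backup nodes (2,1) and (3,1).
targets = {"default": (1, 1), "backup_1": (2, 1), "backup_2": (3, 1)}
print(id_width(len(targets)), enabled_target(0b01, targets))  # 2 (2, 1)
```

Only one code is active per node at a time, which matches the statement that only one of a node's connections to its default and backup nodes is enabled at any moment.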
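In some embodiments, the data processing device further includes a control unit configured to obtain the working state of each node in the node array and to set the preset identification information of each node based on those working states.
In some embodiments, the abnormal state includes a first abnormal state caused by a process defect. An abnormal state caused by a process defect is generally fixed and irreversible, so the positions of nodes in the first abnormal state can be determined before the data processing device leaves the factory. Specifically, the data processing device may further include a storage unit configured to store first position information of nodes in the first abnormal state, so that the control unit sets the preset identification information of the nodes in the first abnormal state based on the first position information.
Nodes in the first abnormal state can be identified by design-for-testability (Design for Testability, DFT) testing, and the storage unit can be a one-time programmable memory such as an electrically programmable fuse (efuse). After the data processing device powers on, a micro-controller unit (Micro-Controller Unit, MCU) or dedicated hardware can read the defective core positions stored in advance in the efuse, choose a core replacement strategy according to the number and positions of the defective cores, and configure the preset identification information of the corresponding nodes according to that strategy, thereby completing the reconstruction of the node array.
In some embodiments, the abnormal state includes a second abnormal state caused by the working environment. Abnormal states caused by the working environment (for example, high temperature or high voltage) are often unpredictable and may be reversible or irreversible. The positions of nodes in the second abnormal state therefore cannot simply be stored in the storage unit. To address this, a detection unit can be provided in the data processing device to detect, in real time while the device is working, second position information of nodes in the second abnormal state, so that the control unit sets the preset identification information of the nodes in the second abnormal state based on the second position information.
The detection unit can be a failure detection circuit or detection software; for example, a failure detection circuit can be implemented with one or more sensors.
In some embodiments, the control unit is further configured to, when at least one node switches from the normal state to the abnormal state, suspend the tasks currently being executed by the nodes in the node array before setting the preset identification information of the nodes based on their working states. That is, whenever a new node becomes abnormal, the tasks currently executed by the nodes can first be suspended, the replacement nodes re-determined, and the preset identification information configured according to the re-determined replacement nodes, thereby completing the reconstruction of the node array.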
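In some embodiments, the output of each node in each node group is connected to a multiplexer, and the input of each node in each node group is connected to a demultiplexer. The multiplexer of a node is used to output the node's output signal, through different channels, to the default node or a backup node of the node; the demultiplexer of a node is used to input into the node the output signals sent to it through different channels.
As shown in FIG. 13, the multiplexer (Multiplexer) is denoted MUX and the demultiplexer (De-multiplexer) is denoted DMUX. Each square represents a node. The left of the dashed line shows the signal flow when all nodes are normal, and the right of the dashed line shows the signal flow when node (1,1) is abnormal; solid lines with arrows indicate connections between nodes. In the node array on the left, the connections between node (1,0), node (1,1) and node (1,2) are enabled, as shown by the thicker solid lines in the left array. In the node array on the right, the connections between node (1,0), node (0,1) and node (1,2) are enabled, as shown by the thicker solid lines in the right array. Similarly, the connection between node (0,0) and the node above node (0,1) is enabled, and the connection between node (0,2) and the node above node (0,1) is also enabled.
In the above embodiment, each multiplexer and each demultiplexer corresponds to one item of a set of preset identification information. With node (1,1) abnormal, the preset identification information of the multiplexers and demultiplexers on the thicker solid lines can be configured. For example, in the above embodiment a multiplexer and a demultiplexer handle three signal paths in total, so a set of preset identification information can consist of 00, 01 and 11, where 00 means connecting to the node in the same row, 01 means connecting to the node in the row above, and 11 means connecting to the node in the row below. In this case, the preset identification information of the MUX connected to node (1,0), the DMUX connected to node (1,0), the DMUX on the left of node (0,1), the MUX on the left of node (0,1), the MUX on the right of node (0,1), the DMUX on the right of node (0,1), the DMUX connected to node (1,2) and the MUX connected to node (1,2) is set to 01, 01, 11, 11, 11, 11, 01 and 01, respectively.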
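The select codes listed for FIG. 13 can be reproduced with a small helper. The function name `select_codes_for_fault` is hypothetical, and the sketch covers only the three nodes named in the example (the two horizontal neighbours of the abnormal node and the substitute in the row above); it does not model the rest of the column.

```python
SAME_ROW, UPPER_ROW, LOWER_ROW = 0b00, 0b01, 0b11  # codes from the text


def select_codes_for_fault(faulty):
    """Select code for the MUX/DMUX pairs of the nodes around one abnormal node.

    Assumes, as in FIG. 13, that the substitute is the node directly above
    the abnormal node.
    """
    row, col = faulty
    return {
        (row, col - 1): UPPER_ROW,   # left neighbour reaches up to the substitute
        (row, col + 1): UPPER_ROW,   # right neighbour reaches up to the substitute
        (row - 1, col): LOWER_ROW,   # substitute reaches down to both neighbours
    }


codes = select_codes_for_fault((1, 1))
for node, code in sorted(codes.items()):
    print(node, format(code, "02b"))
```

For node (1,1) this gives 01 for node (1,0) and node (1,2), and 11 for node (0,1), in line with the settings listed above.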
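In some embodiments, each node includes a bypass unit; when the node is in an abnormal state, the bypass unit bypasses the node so that the two nodes adjacent to it in its node group are directly connected.
Taking the embodiment shown in FIG. 2B as an example, the nodes are connected to each other both horizontally and vertically. When node (1,1) is abnormal, the bypass unit in node (1,1) must bypass node (1,1) so that node (0,1) connects directly to node (2,1) in the vertical direction.
In some embodiments, when the number of nodes in an abnormal state in any node group is greater than the number of redundant nodes in that node group, the target node in each node group is bypassed, so that the number of non-bypassed nodes in an abnormal state in any node group is less than or equal to the number of redundant nodes in that node group. The target nodes in the node groups include the node in the abnormal state, and the target node in one node group is the default node of the target node in another node group. For example, suppose the node groups are group 1, group 2, group 3 and group 4, and their target nodes are target node 1, target node 2, target node 3 and target node 4 respectively; then target node 1 is the default node of target node 2, target node 2 is the default node of target node 3, and target node 3 is the default node of target node 4.
In some embodiments, a node group is a column of the node array and the default node of a node is in the same row as the node, so target node 1, target node 2, target node 3 and target node 4 are the nodes of one row of the node array. In other embodiments, a node group is a row of the node array and the default node of a node is in the same column as the node, so target node 1, target node 2, target node 3 and target node 4 are the nodes of one column of the node array. That is, in the above embodiments, once the number of abnormal nodes in some node group exceeds the number of redundant nodes in that node group, a row or column of the node array that contains an abnormal node is removed.
As shown in FIG. 14, suppose the last row of nodes are redundant nodes. Each node group has only one redundant node, but the node group formed by the second column contains two abnormal nodes, node (0,1) and node (1,1), so the redundant node is not enough to provide redundant logic for the abnormal nodes. The row containing node (0,1), that is, the nodes in the dashed box, can therefore be removed so that the redundant nodes are sufficient to provide redundant logic for the remaining abnormal nodes.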
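The row removal of FIG. 14 can be sketched as a greedy selection of rows to bypass. The function name `rows_to_bypass` is hypothetical, and the order in which rows are chosen is an assumption of this sketch; the source only requires that, after bypassing, no column has more non-bypassed abnormal nodes than redundant nodes.

```python
def rows_to_bypass(faulty, spares_per_column):
    """Pick whole rows to bypass until every column has enough spares.

    `faulty` is a collection of (row, col) coordinates of abnormal nodes,
    with each column treated as one node group.
    """
    bypassed = set()
    while True:
        per_column = {}
        for row, col in sorted(faulty):
            if row not in bypassed:
                per_column.setdefault(col, []).append(row)
        overloaded = [rows for rows in per_column.values()
                      if len(rows) > spares_per_column]
        if not overloaded:
            return bypassed
        # Bypass the topmost abnormal row of the first overloaded column.
        bypassed.add(overloaded[0][0])


# FIG. 14: node (0,1) and node (1,1) are abnormal, one spare per column.
print(rows_to_bypass({(0, 1), (1, 1)}, spares_per_column=1))  # {0}
```

Each pass removes one row that contains an abnormal node, so the loop ends after at most as many passes as there are abnormal nodes; here the remaining abnormal node (1,1) can then be covered by the single spare in its column.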
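In some embodiments, the present disclosure further provides an electronic device that includes the data processing device according to any embodiment of the present disclosure.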
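In some embodiments, as shown in FIG. 15, the present disclosure further provides a data processing method for adjusting the connection relationships of the nodes in the data processing device according to any embodiment of the present disclosure; the method includes:
Step 1501: obtain the state of the default node of each node, the state being either a normal state or an abnormal state;
Step 1502: adjust the connection relationships of a plurality of nodes in the node array based on the state of the default node of each node; wherein:
Step 15021: when the default node of a first node in a node group is in a normal state, enable the connection between the first node and its default node;
Step 15022: when the default node of a first node in a node group is in an abnormal state, enable the connections between the first node, together with at least one second node in the same node group as the first node, and their corresponding backup nodes, where the default node of each of the at least one second node is the backup node of another second node or of the first node.
The method of the embodiments of the present disclosure can be executed by a processing unit such as an MCU or a CPU, or by dedicated processing hardware. For how the connections between nodes are enabled, see the foregoing embodiments of the data processing device; the details are not repeated here.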
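Steps 1501 and 1502 can be sketched for a single node group as follows. The function name `adjust_group_links` is hypothetical, and the sketch assumes the simplest arrangement described earlier: one redundant node at the end of each group, with each working node's backup node being the default node of the next working node.

```python
def adjust_group_links(default_states):
    """Choose, for each working node of one group, which link to enable.

    `default_states[k]` is True when the default node of the k-th working
    node is normal (step 1501 supplies these states).
    """
    use_backup = False
    choices = []
    for default_ok in default_states:
        if not default_ok:
            use_backup = True    # step 15022: this is a "first node"
        # Nodes after a first node are "second nodes" and also switch,
        # because their default node now serves the node before them.
        choices.append("backup" if use_backup else "default")  # step 15021 otherwise
    return choices


# Default node of the second working node is abnormal (as for node (1,0) in FIG. 2B).
print(adjust_group_links([True, False, True, True]))
# ['default', 'backup', 'backup', 'backup']
```

The output shows why the topology is preserved: every node from the first affected one onward moves to its backup node together, so neighbouring relationships among the working nodes stay the same.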
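In some embodiments, as shown in FIG. 16, the present disclosure further provides a data processing device for adjusting the connection relationships of the nodes in the data processing device according to any embodiment of the present disclosure; the device includes:
an acquisition module 1601, configured to obtain the state of the default node of each node, the state being either a normal state or an abnormal state;
an adjustment module 1602, configured to adjust the connection relationships of a plurality of nodes in the node array based on the state of the default node of each node; wherein:
when the default node of a first node in a node group is in a normal state, the connection between the first node and its default node is enabled;
when the default node of a first node in a node group is in an abnormal state, the connections between the first node, together with at least one second node in the same node group as the first node, and their corresponding backup nodes are enabled, where the default node of each of the at least one second node is the backup node of another second node or of the first node.
In some embodiments, the functions or modules of the device provided by the embodiments of the present disclosure can be used to execute the methods described in the method embodiments above. For their specific implementation, refer to the description of those method embodiments; for brevity, it is not repeated here.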
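Embodiments of the present disclosure further provide a computer-readable storage medium on which a computer program is stored; when the program is executed by a processor, the method of any of the foregoing embodiments is implemented.
Computer-readable media include permanent and non-permanent, removable and non-removable media, in which information storage can be implemented by any method or technology. The information can be computer-readable instructions, data structures, program modules or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile disc (DVD) or other optical storage, magnetic cassettes, magnetic tape and disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible to a computing device. As defined herein, computer-readable media do not include transitory computer-readable media (transitory media), such as modulated data signals and carrier waves.
From the description of the above implementations, those skilled in the art can clearly understand that the embodiments of this specification can be implemented by software plus a necessary general-purpose hardware platform. Based on this understanding, the technical solutions of the embodiments of this specification, in essence or in the part that contributes to the prior art, can be embodied in the form of a software product. The computer software product can be stored in a storage medium, such as ROM/RAM, a magnetic disk, or an optical disc, and includes a number of instructions that cause a computer device (which can be a personal computer, a server, a network device, etc.) to execute the methods described in the embodiments, or in parts of the embodiments, of this specification.
The systems, devices, modules or units described in the above embodiments can be implemented by a computer data processing device or entity, or by a product having a certain function. A typical implementation device is a computer, which may take the form of a personal computer, a laptop computer, a cellular phone, a camera phone, a smart phone, a personal digital assistant, a media player, a navigation device, an email transceiver, a game console, a tablet computer, a wearable device, or a combination of any of these devices.
The embodiments in this specification are described in a progressive manner. The same or similar parts of the embodiments can be referred to each other, and each embodiment focuses on its differences from the other embodiments. In particular, because the device embodiments are substantially similar to the method embodiments, they are described relatively briefly; for related details, see the corresponding parts of the method embodiments. The device embodiments described above are merely illustrative. The modules described as separate components may or may not be physically separate, and when implementing the solutions of the embodiments of this specification, the functions of the modules can be implemented in the same piece, or in multiple pieces, of software and/or hardware. Some or all of the modules can also be selected according to actual needs to achieve the purpose of the solution of this embodiment. Persons of ordinary skill in the art can understand and implement it without creative effort.
The above are only specific implementations of the embodiments of this specification. It should be noted that those of ordinary skill in the art can make improvements and refinements without departing from the principles of the embodiments of this specification, and such improvements and refinements should also be regarded as falling within the scope of protection of the embodiments of this specification.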

Claims (26)

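  1. A data processing device, the data processing device comprising a node array, the node array comprising a plurality of node groups;
    wherein, for each of the plurality of node groups, adjacent nodes in the node group are connected, and each node in the node group is connected to a plurality of nodes of other node groups; the plurality of nodes comprises a default node and a backup node, and the default node of one node is the backup node of at least one other node in the same node group;
    when the default node of a first node in a node group is in a normal state, the connection between the first node and its default node is enabled;
    when the default node of a first node in a node group is in an abnormal state, the connections between the first node, together with at least one second node in the same node group as the first node, and their corresponding backup nodes are enabled.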
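  2. The data processing device according to claim 1, wherein each node in the node array comprises a processing core and a router connected to the processing core, and the router of each node in one node group is used to connect to the routers of a plurality of nodes of another node group;
    when the router and the processing core of a node are both in a normal state, the node is in a normal state;
    when at least one of the router and the processing core of a node is in an abnormal state, the node is in an abnormal state.
  3. The data processing device according to claim 1 or 2, wherein each of the plurality of node groups comprises at least one redundant node and working nodes other than the redundant node, a redundant node of one node group is the backup node of at least one working node of another node group, and the first node is a working node;
    when all working nodes of a node group are in a normal state, the redundant node of the node group is disabled.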
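  4. The data processing device according to claim 3, wherein, where each of the plurality of node groups is provided with one redundant node, in response to determining that the default node of a working node in the node group is in an abnormal state, the connections between all working nodes from that working node to the last working node of the node group and their corresponding backup nodes are enabled, wherein the redundant node is the node following the last working node in the node group.
  5. The data processing device according to claim 3, wherein, where each of the plurality of node groups is provided with at least two redundant nodes distributed at both ends of the node group, in response to determining that the default node of at least one working node in the node group is in an abnormal state, for each of the at least one working node, the connections between all working nodes from that working node to the working node preceding the target redundant node of that working node and their corresponding backup nodes are enabled, wherein the number of nodes in an abnormal state in any node group is less than or equal to the number of redundant nodes in that node group, the target redundant node is in the same node group as the at least one working node, and, for each of the at least one working node, the default node of every working node between the working node following that working node and the working node preceding the target redundant node is in a normal state.
  6. The data processing device according to claim 3, wherein, where each of the plurality of node groups is provided with at least two redundant nodes distributed at one end of the node group, in response to determining that the default node of at least one working node in the node group is in an abnormal state, for each of the at least one working node, the connections between all nodes from that working node to the working node preceding the redundant nodes of the node group and their corresponding backup nodes are enabled, wherein the number of nodes in an abnormal state in any node group is less than or equal to the number of redundant nodes in that node group, and, for each third node whose connection to its backup node is enabled, the default node of the third node is not adjacent to the backup node of the third node.
  7. The data processing device according to claim 4 or 5, wherein the backup node of each working node in the node array is the default node of the next working node in the same node group.
  8. The data processing device according to claim 6, wherein the backup node of each working node in the node array is the default node of a working node that is spaced apart from that working node in the same node group.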
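  9. The data processing device according to claim 3 or 4, wherein the redundant node of each of the plurality of node groups comprises the N-th node of the node group, the j-th node of the i-th node group is the default node of the j-th node of the (i-1)-th node group and of the j-th node of the (i+1)-th node group, and the (j+1)-th node of the i-th node group is the backup node of the j-th node of the (i-1)-th node group and of the j-th node of the (i+1)-th node group;
    when the j-th node of the i-th node group is in an abnormal state, the connection between the v-th node of the (i-1)-th node group and the (v+1)-th node of the i-th node group is enabled, and the connection between the v-th node of the (i+1)-th node group and the (v+1)-th node of the i-th node group is enabled;
    wherein 1≤j<N, j≤v<N, v, i, j and N are positive integers, and N is the total number of nodes in each of the plurality of node groups.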
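  10. The data processing device according to claim 3 or 5, wherein the redundant nodes of each of the plurality of node groups comprise the 1st node and the N-th node of the node group; the j-th node of the i-th node group is the default node of the j-th node of the (i-1)-th node group and of the j-th node of the (i+1)-th node group; the (j+1)-th node and the (j-1)-th node of the i-th node group are the backup nodes of the j-th node of the (i-1)-th node group, and are also the backup nodes of the j-th node of the (i+1)-th node group;
    when the j-th node and the k-th node of the i-th node group are both in an abnormal state, the connection between the v-th node of the (i-1)-th node group and the (v+1)-th node of the i-th node group is enabled, the connection between the v-th node of the (i+1)-th node group and the (v+1)-th node of the i-th node group is enabled, the connection between the u-th node of the (i-1)-th node group and the (u-1)-th node of the i-th node group is enabled, and the connection between the u-th node of the (i+1)-th node group and the (u-1)-th node of the i-th node group is enabled;
    wherein 1<j<k<N, k<v<N, 1<u<j, u, v, i, j, k and N are positive integers, and N is the total number of nodes in each of the plurality of node groups.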
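  11. The data processing device according to claim 3, 6 or 8, wherein the redundant nodes of each of the plurality of node groups comprise the (N-1)-th node and the N-th node of the node group; the j-th node of the i-th node group is the default node of the j-th node of the (i-1)-th node group and of the j-th node of the (i+1)-th node group; the (j+1)-th node and the (j+2)-th node of the i-th node group are the backup nodes of the j-th node of the (i-1)-th node group, and are also the backup nodes of the j-th node of the (i+1)-th node group;
    when the j-th node and the (j+1)-th node of the i-th node group are both in an abnormal state, the connection between the v-th node of the (i-1)-th node group and the (v+2)-th node of the i-th node group is enabled, and the connection between the v-th node of the (i+1)-th node group and the (v+2)-th node of the i-th node group is enabled;
    wherein 1≤j<N-1, j≤v<N-1, v, i, j and N are positive integers, and N is the total number of nodes in each of the plurality of node groups.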
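  12. The data processing device according to claim 3 or 4, wherein the redundant node of each of the plurality of node groups comprises the N-th node of the node group, the j-th node of the i-th node group is the default node of the j-th node of the (i-1)-th node group and of the j-th node of the (i+1)-th node group, and the (j+1)-th node of the i-th node group is the backup node of the j-th node of the (i-1)-th node group and of the j-th node of the (i+1)-th node group;
    when the j-th node of the i-th node group and the j-th node of the (i+1)-th node group are both in an abnormal state, the connection between the v-th node of the (i-1)-th node group and the (v+1)-th node of the i-th node group is enabled, and the connection between the v-th node of the (i+2)-th node group and the (v+1)-th node of the (i+1)-th node group is enabled;
    wherein 1≤j<N, j≤v<N, v, i, j and N are positive integers, and N is the total number of nodes in each of the plurality of node groups.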
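  13. The data processing apparatus according to any one of claims 3 to 12, wherein, in the case where the number of nodes in an abnormal state in any one node group is greater than the number of redundant nodes in that node group, the target nodes in each of the plurality of node groups are all bypassed, so that the number of non-bypassed nodes in an abnormal state in any one node group is less than or equal to the number of redundant nodes in that node group;
    wherein the target nodes in each of the plurality of node groups include the nodes in the abnormal state, and a target node in one node group is the default node of a target node in another node group.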
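
Because a target node in one group is the default node of a target node in another group, the bypass of claim 13 can be read as bypassing the same position in every group. The sketch below illustrates that reading; the greedy selection rule and the name `choose_bypass_positions` are assumptions made for illustration, since the claim does not state how target positions are chosen.

```python
from __future__ import annotations


def choose_bypass_positions(faults: list[set[int]], redundant: int) -> set[int]:
    """Pick positions to bypass in every group (hypothetical greedy rule).

    faults[g] holds the 1-based positions of abnormal nodes in group g.
    Positions are added until no group has more non-bypassed abnormal
    nodes than it has redundant nodes.
    """
    bypassed: set[int] = set()
    while True:
        remaining = [group - bypassed for group in faults]
        over = [r for r in remaining if len(r) > redundant]
        if not over:
            return bypassed
        # Count how many over-limit groups each abnormal position appears in.
        counts: dict[int, int] = {}
        for r in over:
            for pos in r:
                counts[pos] = counts.get(pos, 0) + 1
        # Bypass the position that relieves the most groups (lowest index on ties).
        bypassed.add(min(counts, key=lambda p: (-counts[p], p)))


# Three groups, one redundant node each; groups 0 and 1 each have two faults.
print(choose_bypass_positions([{2, 4}, {3, 4}, set()], redundant=1))  # {4}
```

After position 4 is bypassed in all groups, each group has at most one non-bypassed abnormal node, which its single redundant node can cover.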
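  14. The data processing apparatus according to any one of claims 3 to 13, wherein the number of redundant nodes in a node group is determined based on at least one of the following conditions: the area of the data processing apparatus, the probability that a node is in an abnormal state, and the number of nodes in the node array.
  15. The data processing apparatus according to any one of claims 1 to 14, wherein the connection between a node and its corresponding default node or backup node is enabled based on preset identification information of that node; wherein the default nodes and backup nodes of a node correspond to different preset identification information.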
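  16. The data processing apparatus according to claim 15, further comprising a control unit configured to:
    for each node in the node array, obtain the working state of the node; and
    set the preset identification information of the node based on the working state of the node.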
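  17. The data processing apparatus according to claim 16, wherein the abnormal state comprises a first abnormal state caused by a manufacturing process defect; the data processing apparatus further comprising:
    a storage unit configured to store first position information of the nodes in the first abnormal state, so that the control unit sets the preset identification information of the nodes in the first abnormal state based on the first position information.
  18. The data processing apparatus according to claim 16 or 17, wherein the abnormal state comprises a second abnormal state caused by the working environment; the data processing apparatus further comprising:
    a detection unit configured to detect, in real time while the data processing apparatus is operating, second position information of the nodes in the second abnormal state, so that the control unit sets the preset identification information of the nodes in the second abnormal state based on the second position information.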
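  19. The data processing apparatus according to any one of claims 16 to 18, wherein the control unit is further configured to:
    in the case where at least one node switches from the normal state to the abnormal state, suspend the tasks currently being executed by the nodes in the node array before setting the preset identification information of the nodes based on the working states of the nodes in the node array.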
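  20. The data processing apparatus according to any one of claims 1 to 19, wherein the output of each node in each of the plurality of node groups is connected to a multiplexer, and the input of each node in each node group is connected to a demultiplexer;
    the multiplexer of a node is configured to output the output signal of the node, through different channels, to the default node or the backup node of the node; and
    the demultiplexer of a node is configured to input into the node the output signals delivered to the node through different channels.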
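  21. The data processing apparatus according to any one of claims 1 to 20, further comprising:
    a plurality of interfaces for connecting to nodes of other data processing apparatuses.
  22. The data processing apparatus according to any one of claims 1 to 21, wherein each node in the node array comprises a bypass unit;
    in the case where the node is in an abnormal state, the bypass unit bypasses the node, so that the two nodes adjacent to the node in the node group in which the node is located are directly connected.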
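  23. A chip, comprising the data processing apparatus according to any one of claims 1 to 22.
  24. An electronic device, comprising the data processing apparatus according to any one of claims 1 to 22 or the chip according to claim 23.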
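  25. A data processing method for adjusting the connection relationships of the nodes in the data processing apparatus according to any one of claims 1 to 22, the method comprising:
    obtaining the state of the default node of each node, the state being either a normal state or an abnormal state;
    adjusting the connection relationships of a plurality of nodes in the node array based on the states of the default nodes of the nodes; wherein:
    in the case where the default node of a first node in a node group is in the normal state, the connection between the first node and its corresponding default node is enabled;
    in the case where the default node of a first node in a node group is in the abnormal state, the connections between the first node, as well as at least one second node in the same node group as the first node, and their corresponding backup nodes are enabled, wherein the default node of each of the at least one second node is the backup node of another second node or of the first node.
  26. A computer-readable storage medium having a computer program stored thereon, wherein the program, when executed by a processor, implements the method according to claim 25.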
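
The chain condition in claim 25 (each second node's default node is the backup node of the first node or of another second node) can be sketched as a fixed-point propagation over the default and backup relations. This is a minimal illustration under assumptions: `adjust_links` and the node labels are hypothetical, every node is given exactly one backup node, and the number of abnormal nodes is assumed not to exceed the available redundancy.

```python
from __future__ import annotations


def adjust_links(default: dict[str, str], backup: dict[str, str],
                 abnormal: set[str]) -> dict[str, str]:
    """Return the node each source node should connect to (hypothetical helper)."""
    # First nodes: their default node is in the abnormal state.
    switched = {n for n in default if default[n] in abnormal}
    changed = True
    while changed:
        changed = False
        taken = {backup[n] for n in switched}  # backup nodes now in use
        for n in default:
            # Second nodes: their default node has been taken over as a backup.
            if n not in switched and default[n] in taken:
                switched.add(n)
                changed = True
    return {n: backup[n] if n in switched else default[n] for n in default}


# Source group a1..a3, target group b1..b4 (b4 redundant), b2 abnormal.
default = {"a1": "b1", "a2": "b2", "a3": "b3"}
backup = {"a1": "b2", "a2": "b3", "a3": "b4"}
print(adjust_links(default, backup, abnormal={"b2"}))
# {'a1': 'b1', 'a2': 'b3', 'a3': 'b4'}
```

Here `a2` is the first node and `a3` is a second node: its default node `b3` is the backup node of `a2`, so its backup connection is enabled as well.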
PCT/CN2023/090512 2022-04-29 2023-04-25 Data processing method and apparatus, chip, electronic device, and medium WO2023207952A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210473209.4A CN114860511A (zh) 2022-04-29 2022-04-29 Data processing method and apparatus, chip, electronic device, and medium
CN202210473209.4 2022-04-29

Publications (1)

Publication Number Publication Date
WO2023207952A1 true WO2023207952A1 (zh) 2023-11-02

Family

ID=82636188

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/090512 WO2023207952A1 (zh) 2022-04-29 2023-04-25 Data processing method and apparatus, chip, electronic device, and medium

Country Status (2)

Country Link
CN (1) CN114860511A (zh)
WO (1) WO2023207952A1 (zh)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114860511A (zh) * 2022-04-29 2022-08-05 上海阵量智能科技有限公司 Data processing method and apparatus, chip, electronic device, and medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101663649A (zh) * 2007-04-18 2010-03-03 国际商业机器公司 Dynamically rerouting node traffic on a parallel computer system
CN109117322A (zh) * 2018-08-28 2019-01-01 郑州云海信息技术有限公司 Control method, system, device and storage medium for server active/standby redundancy
US20210173732A1 (en) * 2019-12-04 2021-06-10 Industrial Technology Research Institute Redundant processing node changing method and processor capable of changing redundant processing node
CN114860511A (zh) * 2022-04-29 2022-08-05 上海阵量智能科技有限公司 Data processing method and apparatus, chip, electronic device, and medium

Also Published As

Publication number Publication date
CN114860511A (zh) 2022-08-05

Legal Events

Date Code Title Description
121 EP: The EPO has been informed by WIPO that EP was designated in this application

Ref document number: 23795384

Country of ref document: EP

Kind code of ref document: A1