WO2023273146A1 - Data processing apparatus, system, method, and board card - Google Patents

Data processing apparatus, system, method, and board card

Info

Publication number
WO2023273146A1
Authority
WO
WIPO (PCT)
Prior art keywords
data processing
processing device
mode
chip
controller
Prior art date
Application number
PCT/CN2021/134517
Other languages
French (fr)
Chinese (zh)
Inventor
黄炎坡
Original Assignee
深圳市商汤科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 深圳市商汤科技有限公司
Publication of WO2023273146A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T1/00 General purpose image data processing
    • G06T1/20 Processor architectures; Processor configuration, e.g. pipelining
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches

Definitions

  • the present disclosure relates to the field of computer technology, in particular, to a data processing device, system, method and board.
  • A central processing unit (CPU) is usually used as the main control chip, supplemented by artificial intelligence (AI) acceleration chips serving as accelerator cards, to build a "CPU + AI accelerator card" hardware architecture for data processing, thereby effectively improving data processing speed.
  • Embodiments of the present disclosure at least provide a data processing device, system, method and board.
  • An embodiment of the present disclosure provides a data processing device, including: a data processing chip and a multiplexer; wherein the multiplexer is configured to, in response to receiving a port gating signal, gate a first transmission path or a second transmission path used by the data processing chip to obtain configuration information; the data processing chip is configured to, in response to the first transmission path being gated, obtain first configuration information and determine its own mode as the master mode based on the first configuration information; and, in response to the second transmission path being gated, obtain second configuration information and determine its own mode as the controlled mode based on the second configuration information.
  • The data processing chip is further configured to, when its own mode is the master mode, decompose a data processing task into multiple subtasks in response to receiving the data processing task, and deliver the subtasks to other data processing devices in the controlled mode; or the data processing chip is further configured to, when its own mode is the controlled mode, execute subtasks delivered by another data processing device in the master mode in response to receiving those subtasks.
  • the data processing chip whose own mode is in the master control mode can play a controlling role, and deliver subtasks corresponding to data processing tasks to other data processing devices in the controlled mode.
  • A data processing chip whose own mode is the controlled mode performs the specific data processing work.
  • Because the data processing chips in the controlled mode can process their respective subtasks in parallel, data processing efficiency is high.
  • the data processing device further includes: a first memory and a second memory; wherein, the first memory and the second memory are respectively connected to the multiplexer;
  • the multiplexer is configured to gate the first transmission path between the data processing chip and the first memory, or gate the second transmission path between the data processing chip and the second memory.
  • The data processing device further includes: a signal converter; the signal converter is connected to the multiplexer and to a controller; the signal converter is configured to receive, based on a preset first communication protocol, a control command sent by the controller, convert the control command into a port gating signal, and send the port gating signal to the multiplexer.
  • In this way, the control command sent by the controller can be received, effectively converted into a port gating signal, and sent to the multiplexer.
  • The data processing device further includes: a monitoring chip; the monitoring chip is connected to the data processing chip and to the signal converter; the monitoring chip is configured to monitor the working state of the data processing chip and send a monitoring signal corresponding to the working state to the signal converter; the signal converter is further configured to receive the monitoring signal sent by the monitoring chip and send the monitoring signal to the controller based on a preset second communication protocol.
  • the data processing device further includes: a register; the register is used to store type information of the data processing chip.
  • In this way, the type information of the data processing chip can be stored in the register in advance, so that the controller can learn the type of the corresponding data processing chip directly from the register without re-detecting it every time the type is checked, which is more efficient.
  • An embodiment of the present disclosure also provides a data processing system, including: the data processing device provided in the embodiments of the present disclosure, and a controller; wherein there are multiple data processing devices, including a first data processing device in the master mode and a second data processing device in the controlled mode; the controller is configured to monitor the state of the first data processing device and, in response to the state indicating that the first data processing device is to be switched to the controlled mode, send a first control instruction to the first data processing device and a second control instruction to a target second data processing device; the first data processing device is configured to switch its own mode to the controlled mode in response to receiving the first control instruction; the target second data processing device is configured to switch its own mode to the master mode in response to receiving the second control instruction.
  • the state of the data processing device includes at least one of the following: working state; data path state.
  • When monitoring the state of the first data processing device, the controller is configured to: receive a monitoring signal sent by the first data processing device; determine the working state of the first data processing device based on the monitoring signal; and determine, based on that working state, whether to switch the first data processing device to the controlled mode.
  • The data processing system further includes: a communication switch; the multiple data processing devices are each connected to the controller through the communication switch; when monitoring the state of the first data processing device, the controller is configured to: monitor the state of the data path between the first data processing device and the communication switch; and determine, based on that data path state, whether to switch the first data processing device to the controlled mode.
  • the state of the data path between the first data processing device and the communication switch can be monitored, and when the state of the data path is abnormal, it is determined to switch the first data processing device to the controlled mode.
  • Switching the mode of the first data processing device in this case ensures that the data processing device in the master mode has a normal data path and can therefore deliver subtasks to the other data processing devices through that data path.
  • When monitoring the state of the first data processing device, the controller is configured to: in response to the state of the first data processing device being an abnormal state, determine that the first data processing device is to be switched to the controlled mode.
  • Before sending the second control instruction to the target second data processing device, the controller is further configured to: determine, based on the states of the second data processing devices, the target second data processing device to be switched to the master mode from among the second data processing devices.
  • When determining, based on the states of the second data processing devices, the target second data processing device to be switched to the master mode, the controller is configured to: when the state of the first data processing device is abnormal, determine a candidate data processing device from among the second data processing devices and detect the state of the candidate data processing device; and, in response to the state of the candidate data processing device being normal, determine the candidate data processing device as the target second data processing device; the controller is further configured to: in response to the state of the candidate data processing device being abnormal, determine a new candidate data processing device from among the second data processing devices, and return to the step of detecting whether the state of the candidate data processing device is normal.
  • In this way, the case where the selected data processing device still cannot normally decompose and distribute data processing tasks is avoided: a data processing device that can switch to the master mode and perform data processing tasks in that mode is determined accurately, so repeated switching among multiple data processing devices is avoided, which is more efficient.
  • When determining, from among the second data processing devices, the target second data processing device to be switched to the master mode, the controller is further configured to: read type information from the candidate data processing device; and, in response to the read type information being preset type information, determine the candidate data processing device as the target second data processing device.
  • In this way, the controller can also check the type information of the candidate data processing device, which prevents a data processing device that cannot perform data processing tasks in the master mode from switching to the master mode, and thus avoids the data processing system being unable to continue working normally or candidate data processing devices being switched into the master mode repeatedly.
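  • The candidate selection described above can be sketched as follows. This is a minimal illustration, not the disclosed implementation: the function name, the callback parameters, and the choice to apply the state check and the type check together are all assumptions made for the example.

```python
def select_target(candidates, is_normal, read_type, preset_types):
    """Pick the first controlled-mode device that may become the master.

    candidates   -- second data processing devices, in the order they are tried
    is_normal    -- hypothetical callback: True if the device's state is normal
    read_type    -- hypothetical callback: reads the type information (e.g. from a register)
    preset_types -- type information values allowed to run in the master mode
    """
    for device in candidates:
        if not is_normal(device):
            # Abnormal candidate: move on to a new candidate and check again.
            continue
        if read_type(device) in preset_types:
            return device
    # No suitable device was found; the caller decides what to do next.
    return None


# Usage with plain dictionaries standing in for devices (illustrative data only).
devices = [
    {"name": "IC_2", "normal": False, "type": "typeA"},
    {"name": "IC_3", "normal": True, "type": "typeB"},
    {"name": "IC_4", "normal": True, "type": "typeA"},
]
target = select_target(
    devices,
    is_normal=lambda d: d["normal"],
    read_type=lambda d: d["type"],
    preset_types={"typeA"},
)
```

  • Checking the candidate before any mode switch is what avoids switching several devices in turn: a device is only promoted once both checks pass.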
  • When sending the first control instruction to the first data processing device and the second control instruction to the target second data processing device, the controller is configured to: send a reset signal to the first data processing device and the target second data processing device; and, after the reset signal is sent successfully, send the first control instruction to the first data processing device and the second control instruction to the target second data processing device; the first data processing device, when switching its own mode to the controlled mode in response to receiving the first control instruction, is configured to: perform a reset in response to receiving the reset signal; and, after the reset is completed, switch its own mode to the controlled mode in response to receiving the first control instruction; the target second data processing device, when switching its own mode to the master mode in response to receiving the second control instruction, is configured to: perform a reset in response to receiving the reset signal; and, after the reset is completed, switch its own mode to the master mode in response to receiving the second control instruction.
  • In this way, because the controller sends a reset signal to the first data processing device and the target second data processing device, both devices can release their current modes. When the first data processing device and the target second data processing device then switch their own modes, the switch can be achieved directly by gating the transmission path between the corresponding data processing chip and the first memory or the second memory, so both devices can complete the mode switch relatively simply, and switching failures are effectively reduced.
  • The controller is further configured to: in response to receiving switch-success signals sent by the first data processing device and the target second data processing device, send a reset release signal to the first data processing device and the target second data processing device; the first data processing device releases the reset in response to receiving the reset release signal; the target second data processing device releases the reset in response to receiving the reset release signal.
  • After the reset is released, the target second data processing device, now in the master mode, can continue to receive data processing tasks and decompose them into subtasks; similarly, the first data processing device, now in the controlled mode, can process the subtasks it receives, so that the data processing system can go on completing the corresponding data processing tasks and its normal data processing state is restored.
  • The controller is also connected to the data processing devices through a bus; when sending a reset signal to the first data processing device and the target second data processing device, the controller is configured to: send the reset signal to the first data processing device and the target second data processing device through the bus.
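  • The reset, switch, and reset-release sequence described above can be sketched as a small state model. This is only an illustration under stated assumptions: the class and method names are hypothetical, the bus transport is omitted, and the switch-success signal is modeled as a boolean return value.

```python
class Device:
    """Hypothetical stand-in for one data processing device."""

    def __init__(self, name, mode):
        self.name = name
        self.mode = mode          # "master" or "controlled"
        self.in_reset = False

    def on_reset(self):
        # Entering reset releases the current mode so it can be changed.
        self.in_reset = True

    def on_control_instruction(self, new_mode):
        # A control instruction is only honored after the reset has been performed.
        if not self.in_reset:
            return False
        self.mode = new_mode
        return True               # models the switch-success signal

    def on_reset_release(self):
        self.in_reset = False


def switch_master(first, target):
    """Controller-side sequence: reset both, switch both, then release the reset."""
    for device in (first, target):
        device.on_reset()
    first_ok = first.on_control_instruction("controlled")
    target_ok = target.on_control_instruction("master")
    if first_ok and target_ok:
        for device in (first, target):
            device.on_reset_release()
        return True
    return False


ic_1 = Device("IC_1", "master")
ic_2 = Device("IC_2", "controlled")
switched = switch_master(ic_1, ic_2)
```

  • Keeping both devices in reset until both report success means there is no moment at which two devices act as master.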
  • The embodiments of the present disclosure further provide a board, the board including the data processing device provided in the first aspect of the present disclosure or any implementation thereof, or the data processing system provided in the second aspect of the present disclosure or any implementation thereof.
  • the embodiments of the present disclosure further provide a data processing method applied to a data processing device; the data processing method includes:
  • in response to receiving a port gating signal, the multiplexer gates the first transmission path or the second transmission path used by the data processing chip to obtain configuration information;
  • the data processing chip obtains first configuration information in response to the first transmission path being gated, and determines its own mode as the master mode based on the first configuration information; or, in response to the second transmission path being gated, obtains second configuration information and determines its own mode as the controlled mode based on the second configuration information.
  • the embodiments of the present disclosure also provide another data processing method, which is applied to a data processing system; the data processing method includes:
  • the controller monitors the state of the first data processing device and, in response to the state indicating that the first data processing device is to be switched to the controlled mode, sends a first control instruction to the first data processing device and a second control instruction to a target second data processing device;
  • the first data processing device switches its own mode to the controlled mode in response to receiving the first control instruction;
  • the target second data processing device switches its own mode to the master mode in response to receiving the second control instruction.
  • FIG. 1 shows a schematic diagram of a data processing device provided by an embodiment of the present disclosure;
  • FIG. 2 shows a schematic structural diagram of a data processing device provided by an embodiment of the present disclosure;
  • FIG. 3 shows a schematic diagram of a data processing system provided by an embodiment of the present disclosure;
  • FIG. 4 shows a schematic structural diagram of a data processing system provided by an embodiment of the present disclosure;
  • FIG. 5 shows a schematic diagram of determining a target second data processing device provided by an embodiment of the present disclosure;
  • FIG. 6 shows a schematic diagram of a controller provided by an embodiment of the present disclosure sending a data reset signal to a data processing device;
  • FIG. 7A and FIG. 7B show a flowchart of a data processing system provided by an embodiment of the present disclosure performing a data processing task;
  • FIG. 8 shows a schematic diagram of a board provided by an embodiment of the present disclosure;
  • FIG. 9 shows a flowchart of a data processing method provided by an embodiment of the present disclosure;
  • FIG. 10 shows a flowchart of another data processing method provided by an embodiment of the present disclosure.
  • The CPU can be used as the main control chip to receive data processing tasks sent from outside, decompose each data processing task into multiple subtasks, and assign different subtasks to multiple AI accelerator cards; the subtasks are then processed in parallel by the AI accelerator cards to improve data processing efficiency.
  • In the CPU + AI accelerator card hardware architecture, however, once the CPU serving as the main control chip fails, the entire data processing flow is affected, so the hardware architecture has poor stability.
  • The present disclosure provides a data processing device in which the data processing chip can be switched to the master mode or the controlled mode under different circumstances, removing the functional limitation placed on the data processing chip in traditional data processing devices. The data processing chip can switch its own mode according to actual needs, which improves the stability of the data processing device.
  • the data processing device 100 includes: a data processing chip 10 and a multiplexer 20; wherein,
  • the multiplexer 20 is configured to, in response to receiving a port gating signal, gate the first transmission path or the second transmission path used by the data processing chip 10 to obtain configuration information;
  • the data processing chip 10 is configured to, in response to the first transmission path being gated, obtain first configuration information and determine its own mode as the master mode based on the first configuration information; and, in response to the second transmission path being gated, obtain second configuration information and determine its own mode as the controlled mode based on the second configuration information.
  • In response to receiving the port gating signal, the multiplexer may gate the first transmission path between the data processing chip and the first memory, so that the data processing chip switches its own mode to the master mode; or it may gate the second transmission path between the data processing chip and the second memory, so that the data processing chip switches its own mode to the controlled mode.
  • the data processing chip can switch its own mode according to actual needs, improving the stability of the data processing device.
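  • The relationship between path gating and mode selection can be sketched as below. This is a simplified software model under stated assumptions, not the disclosed hardware: the class names, the port labels "A" and "B", and the dictionary layout of the configuration information (a single "mode" field) are hypothetical.

```python
from enum import Enum


class Mode(Enum):
    MASTER = "master"
    CONTROLLED = "controlled"


class Multiplexer:
    """Two-way selector: port "A" leads to the first memory, port "B" to the second."""

    def __init__(self, first_memory, second_memory):
        self._ports = {"A": first_memory, "B": second_memory}
        self.selected = None

    def gate(self, port):
        # Corresponds to receiving a port gating signal.
        if port not in self._ports:
            raise ValueError(f"unknown port: {port!r}")
        self.selected = port

    def read_config(self):
        if self.selected is None:
            raise RuntimeError("no transmission path has been gated")
        return self._ports[self.selected]


class DataProcessingChip:
    def __init__(self, mux):
        self._mux = mux
        self.mode = None

    def load_config(self):
        # The chip only sees whichever memory the multiplexer currently exposes.
        config = self._mux.read_config()
        self.mode = Mode(config["mode"])
        return self.mode


mux = Multiplexer({"mode": "master"}, {"mode": "controlled"})
chip = DataProcessingChip(mux)
mux.gate("A")
mode_after_a = chip.load_config()
mux.gate("B")
mode_after_b = chip.load_config()
```

  • The chip itself contains no switching logic in this model: its mode follows entirely from which configuration it is allowed to read.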
  • The data processing chip 10 in the data processing device 100 may include, but is not limited to, at least one of the following: an AI chip, a graphics processing unit (GPU), a field programmable gate array (FPGA), and an application specific integrated circuit (ASIC).
  • the data processing apparatus 100 correspondingly including the data processing chip 10 may include hardware devices for accelerating data processing, for example.
  • a network adapter for interfacing with an external network may also be included, for example, the network adapter may receive data sent by an external network, such as a related data.
  • The data processing chip 10 has different functions in different modes. In the master mode, the data processing chip 10 can receive data processing tasks sent from outside, decompose them into multiple subtasks, and send the subtasks to other data processing chips 10 in the controlled mode, thereby controlling the other data processing devices 100; in the controlled mode, the data processing chip 10 can receive subtasks sent by another data processing chip 10 in the master mode and process them, thereby actually executing the data processing task.
  • the data processing task may include classifying and recognizing multiple images, where after decomposing the data processing task into multiple subtasks, each subtask includes: classifying and recognizing one of the multiple images.
  • The data processing task may also include performing multiple kinds of data enhancement processing on an image, where after the data processing task is decomposed into multiple subtasks, each subtask includes: performing data enhancement processing on the image using one of the multiple data enhancement processing methods; the data enhancement processing includes, for example, smoothing, Gaussian blur, random erasing, and boundary detection.
  • the above data processing tasks only show two examples, and the data processing apparatus 100 provided in the embodiments of the present disclosure may also be used to perform other data processing tasks, which are not limited in the embodiments of the present disclosure.
  • the data processing chip 10 has different functions when its own mode is in the master mode and its own mode is in the controlled mode.
  • When its own mode is the master mode, the data processing chip 10 may, in response to receiving a data processing task, decompose the task into multiple subtasks and deliver the subtasks to other data processing devices 100 in the controlled mode.
  • When its own mode is the controlled mode, the data processing chip 10 may, in response to receiving subtasks delivered by another data processing device 100 in the master mode, execute those subtasks.
  • the data processing chip 10 in the data processing device 100 may be represented by an IC (integrated circuit), for example.
  • For example, with three data processing devices 100, the corresponding data processing chips 10 may be denoted as IC_1, IC_2, and IC_3 respectively.
  • The own mode of IC_1 is the master mode, and the own modes of IC_2 and IC_3 are the controlled mode.
  • When IC_1 receives a data processing task sent by the host side or another upper computer device, it decomposes the received data processing task into multiple subtasks.
  • the plurality of subtasks may include, for example, a subtask M_1 for classifying and identifying the first image, and a subtask M_2 for classifying and identifying the second image.
  • IC_1 may, for example, send subtask M_1 to IC_2 in the controlled mode, and send subtask M_2 to IC_3 also in the controlled mode.
  • the data processing chip 10 in the controlled mode may also send the processing result to the data processing chip 10 in the master mode after executing the subtask. For example, when IC_2 determines that the classification recognition result is C1, and IC_3 determines that the classification recognition result is C2, the recognition result C1 and the recognition result C2 may also be sent to IC_1 respectively.
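  • The IC_1, IC_2, IC_3 example above can be sketched as follows. This is an illustrative model only: the function names are hypothetical, the classifier is a placeholder that returns a made-up label string, and threads stand in for separate chips working in parallel.

```python
from concurrent.futures import ThreadPoolExecutor


def decompose(images):
    """Master-mode step: one classification subtask (M_1, M_2, ...) per image."""
    return [{"id": f"M_{i}", "image": image} for i, image in enumerate(images, start=1)]


def make_controlled_chip(name):
    """Build a stand-in for a chip in the controlled mode."""

    def execute(subtask):
        # Placeholder for real classification work on the image.
        result = f"class-of-{subtask['image']}"
        return subtask["id"], name, result

    return execute


def run_as_master(images, controlled_chips):
    """Decompose the task, dispatch subtasks round-robin, and collect the results."""
    subtasks = decompose(images)
    with ThreadPoolExecutor(max_workers=len(controlled_chips)) as pool:
        futures = [
            pool.submit(controlled_chips[i % len(controlled_chips)], subtask)
            for i, subtask in enumerate(subtasks)
        ]
        # Results are gathered in submission order, so they line up with the subtasks.
        return [future.result() for future in futures]


results = run_as_master(
    ["first.png", "second.png"],
    [make_controlled_chip("IC_2"), make_controlled_chip("IC_3")],
)
```

  • Here `results` plays the role of the recognition results C1 and C2 that the controlled chips send back to the chip in the master mode.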
  • In the data processing device 100, if the data processing chip 10 is in the master mode and fails, its current mode can be switched from the master mode to the controlled mode; meanwhile, the data processing chip 10 of another data processing device 100, currently in the controlled mode, is switched from the controlled mode to the master mode. This prevents multiple data processing chips 10 from being in the master mode at the same time. Moreover, because of this mode switching, once the data processing chip 10 in the master mode fails, a data processing chip 10 in another data processing device takes over its master function, which keeps the system stable.
  • Alternatively, the data processing chip 10 in the fault state may not be switched from the master mode to the controlled mode; instead, in response to a failure of the data processing chip 10 currently in the master mode, the data transmission path between that data processing chip 10 and the other data processing chips 10 is disconnected. In this case, the data processing chip 10 in another data processing device 100 still needs to be switched from the controlled mode to the master mode.
  • The data processing device 100 including the data processing chip 10 can be integrated into an integrated circuit, a circuit board or a chip; that is, the data processing device 100 is hot-swappable. When the data processing chip 10 fails, the data processing device 100 that includes it can therefore be removed and replaced with a new data processing device 100. During this process the other data processing devices 100 do not need to be suspended, which reduces the impact of the repair on them and maintains the stability of the hardware architecture.
  • the switching of the data processing chip 10 may be implemented by utilizing the first memory and the second memory further included in the data processing device 100 , for example.
  • Referring to FIG. 2, which shows a specific structural diagram of a data processing device, the device includes a first memory 30 and a second memory 40, which are connected to port A and port B of the multiplexer 20, respectively.
  • the first memory 30 and the second memory 40 may be flash memory (flash).
  • In response to receiving the port gating signal, the multiplexer 20 may gate the first transmission path between the data processing chip 10 and the first memory 30, or gate the second transmission path between the data processing chip 10 and the second memory 40, so that the data processing chip 10 determines its own mode as the master mode in response to the first transmission path being gated, or determines its own mode as the controlled mode in response to the second transmission path being gated.
  • the first configuration information is stored in the first memory 30 .
  • After the data processing chip 10 obtains the first configuration information, it can, for example, receive data processing tasks sent by a host or another upper computer device, or perform other data configuration on the data processing chips 10 in other data processing devices 100.
  • the data processing chip 10 receiving the first configuration information can determine its own mode as the master mode.
  • a first transmission path is included between the data processing chip 10 and the first memory 30, and the first transmission path may include, for example, a serial peripheral interface (Serial Peripheral Interface, SPI) bus.
  • The multiplexer 20 is connected to the first memory 30, so when the own mode of the data processing chip 10 is to be the master mode, this can be implemented by gating the data path between the multiplexer 20 and the first memory 30.
  • the communication protocol between the multiplexer 20 and the first memory 30 may also include an SPI bus, or a different protocol may be selected according to actual conditions, and details will not be repeated here.
  • the data processing chip 10 may, for example, receive subtasks sent by the data processing chips 10 in other data processing devices 100 , such as the aforementioned subtask M_1 and subtask M_2 .
  • The data processing chip 10 that receives the second configuration information does not need to know which data processing chip 10 sends subtasks to it; it only needs to execute the corresponding data processing work after receiving a subtask. That is, the data processing chip 10 that receives the second configuration information can switch its own mode to the controlled mode and go on executing subtasks sent by the data processing chip 10 currently in the master mode, which reduces interruptions while data processing tasks are being executed and gives better processing continuity.
  • a second transmission path may be included between the data processing chip 10 and the second memory 40 , for example.
  • The second transmission path can, for example, also be an SPI bus, the same as the first transmission path, so that the data processing chip 10 does not need to change the communication protocol when switching its own mode; the switch is relatively simple, and efficiency is also improved to some extent.
  • The multiplexer 20 is connected to the second memory 40, so when the data processing chip 10 currently in the master mode is to switch its own mode to the controlled mode, this can be implemented by gating the data path between the multiplexer 20 and the second memory 40.
  • Because the multiplexer 20 mainly acts as a switch, it cannot judge whether the data processing chip 10 in the data processing device 100 needs to switch its own mode, so the data processing device 100 also includes a signal converter and a monitoring chip.
  • Referring to FIG. 2, it shows the circuit connection relationship of the signal converter 50 and the monitoring chip 60 in the data processing device 100. See the description of the signal converter 50 and the monitoring chip 60 below for details.
  • the monitoring chip 60 can be used to monitor the state of the data processing chip 10 to determine whether the data processing chip 10 can work normally, so as to determine whether the current mode of the data processing chip 10 needs to be switched.
  • The signal converter 50 is also connected to a controller outside the data processing device. Because the controller cannot use the bus (here, the bus includes a System Management Bus (SMBUS)) to directly read the monitoring results determined by the monitoring chip 60, the controller instead reads, from the signal converter 50, the monitoring results that the monitoring chip 60 has sent to and stored in the signal converter 50. The controller can further send control instructions to the signal converter 50 to control the multiplexer 20 to gate the first transmission path or the second transmission path, so as to switch the own mode of the data processing chip 10.
  • The monitoring chip 60 is connected to the data processing chip 10 and to the signal converter 50; for the specific connection relationship, refer to the description of the signal converter 50 below.
  • The monitoring chip 60 is configured to monitor the working state of the data processing chip 10 and send a monitoring signal corresponding to the working state to the signal converter 50.
  • The monitoring chip 60 can be, for example, an SP706 chip or a MAX706 chip.
  • The monitoring chip 60 includes multiple pins, which can receive or send different signals.
  • The monitoring chip 60 may include, for example, a watchdog input (Watch Dog Input, WDI) pin, which is connected to the data processing chip 10 and used to monitor the working state of the data processing chip 10.
  • When the output signal of the watchdog output (Watch Dog Output, WDO) pin of the monitoring chip 60 is at a high level, the working state of the data processing chip 10 is normal. If there is no level change at the I/O port of the WDI pin of the monitoring chip 60 within 1.6 seconds, the output signal of the WDO pin is at a low level, indicating that the working state of the data processing chip 10 is abnormal.
  • A monitoring signal corresponding to the working state can then be determined.
  • For example, the output signal of the WDO pin, represented by high and low levels, can be used directly as the monitoring signal. Alternatively, separate level logic can be set to determine the monitoring signal: for example, when the working state of the data processing chip 10 is determined to be normal, the corresponding monitoring signal is set to a low level, and when the working state of the data processing chip 10 is determined to be abnormal, the corresponding monitoring signal is set to a high level.
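  • As an illustration only, the watchdog behaviour described above can be modelled with the following minimal Python sketch. The class `WatchdogModel`, the function `monitoring_signal`, and the `inverted` flag are hypothetical names introduced here; only the 1.6-second window and the two level conventions come from the description, and the model ignores all other behaviour of a real supervisor chip.

```python
from dataclasses import dataclass

WATCHDOG_TIMEOUT_S = 1.6  # timeout window stated in the description


@dataclass
class WatchdogModel:
    """Toy model of a WDI/WDO watchdog; all times are in seconds."""
    last_toggle_s: float = 0.0

    def toggle_wdi(self, now_s: float) -> None:
        # The monitored chip changes the level on WDI to show it is alive.
        self.last_toggle_s = now_s

    def wdo_is_high(self, now_s: float) -> bool:
        # WDO stays high while WDI has changed level within the window.
        return (now_s - self.last_toggle_s) < WATCHDOG_TIMEOUT_S


def monitoring_signal(wdo_high: bool, inverted: bool = False) -> int:
    """Map the WDO level to the monitoring signal (1 = high, 0 = low).

    inverted=False: WDO is used directly (high means normal).
    inverted=True: the alternative level logic (low means normal).
    """
    level = 1 if wdo_high else 0
    return 1 - level if inverted else level
```

  • With `inverted=True`, a chip that stops toggling WDI yields a high-level monitoring signal, which matches the convention used later for the controller 70.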
  • After the monitoring chip 60 determines the monitoring signal, it can send the monitoring signal to the signal converter 50.
  • A controller can be provided outside the data processing device 100 to judge, according to the monitoring signal, whether the own mode of the data processing chip 10 needs to be switched. That is, the controller may include, for example, an electronic device outside the data processing device that plays a decision-making role.
  • When the controller reads the monitoring signal determined by the monitoring chip 60, it can use SMBUS. However, when the SP706 chip is selected as the monitoring chip 60, the controller cannot read the monitoring signal determined by the SP706 chip directly through SMBUS. The level on any input pin of the signal converter 50, in contrast, can be made to correspond to the monitoring signal and can be read by the controller through SMBUS. The monitoring chip 60 therefore sends the monitoring signal to the signal converter 50, and the controller then reads the monitoring signal stored in the signal converter 50 through SMBUS.
  • Because the number of SMBUS buses that the controller can support is limited, the controller can read, through SMBUS, the monitoring signals of only a limited number of data processing devices 100.
  • An expansion chip, such as a PCA9548, can also be provided for the controller, so that the controller can read the monitoring signal of the data processing device 100 currently to be monitored.
  • When the controller sends other signals to the data processing device 100, an expansion chip likewise allows it to send signals to a larger number of data processing devices 100.
  • For example, when the controller sends a reset signal to the data processing device 100 and releases the reset signal, an expansion chip can also be provided for the controller, so that signals can be transmitted to other data processing devices 100 that could not receive them with a single controller alone. This will not be repeated below.
  • The signal converter 50 is further configured to receive the monitoring signal sent by the monitoring chip 60 and to send the monitoring signal to the controller based on a preset second communication protocol.
  • The signal converter 50 can be, for example, a PCA9555 chip.
  • The signal converter 50 may include, for example, two input/output pins, denoted I/O_1 and I/O_2, and an Inter-Integrated Circuit (I2C) synchronous serial bus interface.
  • I/O_1 of the signal converter 50 is connected to the WDO pin of the monitoring chip 60.
  • The monitoring chip 60 can, for example, use the WDO pin to transmit the monitoring signal to the signal converter 50. I/O_1 receives the monitoring signal sent by the WDO pin of the monitoring chip 60, a register in the signal converter 50 is then updated with the monitoring signal, and the controller can read the updated monitoring signal through the I2C interface to determine whether the data processing chip 10 can work normally.
  • The data path between the I2C interface of the signal converter 50 and the controller indicated in FIG. 2 refers to the data path through which the controller reads signals from the I2C interface.
  • The signal converter 50 may send the monitoring signal to the controller based on a preset second communication protocol.
  • The second communication protocol may include, for example, the communication protocol used by SMBUS.
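  • The read path described above can be sketched as follows, as a rough illustration rather than the implementation of the disclosure. The sketch uses a stand-in bus object instead of real hardware. The command byte 0x00 is the Input Port 0 register of a PCA9555; the device address 0x20, the wiring of I/O_1 to bit 0 of port 0, and the names `FakeBus` and `read_monitoring_signal` are assumptions made for the example.

```python
INPUT_PORT_0 = 0x00  # PCA9555 command byte for the Input Port 0 register
IO_1_BIT = 0         # assumption: I/O_1 is wired to bit 0 of port 0


class FakeBus:
    """Stand-in for an SMBUS master; returns canned register values."""

    def __init__(self, registers):
        # registers maps (device address, register) to a byte value
        self.registers = registers

    def read_byte_data(self, address, register):
        return self.registers[(address, register)]


def read_monitoring_signal(bus, address):
    """Return 1 if the pin wired to WDO reads high, otherwise 0."""
    port_value = bus.read_byte_data(address, INPUT_PORT_0)
    return (port_value >> IO_1_BIT) & 1


# Usage: the hypothetical device at 0x20 reports a high level on I/O_1.
bus = FakeBus({(0x20, INPUT_PORT_0): 0b0000_0001})
level = read_monitoring_signal(bus, 0x20)
```

  • On real hardware the stand-in would be replaced by whatever SMBUS master interface the controller provides; the bit extraction stays the same.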
  • When the own mode of the data processing chip 10 is switched, the data processing chip 10 needs to be able to adapt to the mode after switching.
  • For example, the types of the data processing chip 10 include computing chips that can only provide computing power, and other chips that can both provide computing power and distribute subtasks. A computing chip cannot undertake the function of distributing subtasks, so if its own mode were switched from the controlled mode to the master mode, the data processing task could not be executed normally afterwards. That is, the own mode of a computing chip should not be switched to the master mode.
  • A register may therefore also be included in the data processing device 100.
  • FIG. 2 shows a schematic circuit connection diagram of the register.
  • The register 90 can be, for example, an electrically erasable programmable read-only memory (E2PROM), specifically a field-replaceable unit E2PROM (FRU E2PROM). Since the tasks that the data processing chip 10 can undertake are fixed, that is, its type is fixed, the type information of the data processing chip 10 can be predetermined.
  • A corresponding register 90 can be provided for the data processing chip 10 to store the type information of the data processing chip 10. Since the type information corresponding to the data processing chip 10 is fixed, the type information can be programmed directly into the register 90 when the data processing device 100 is used, and the type information stored in the register 90 remains unchanged.
  • The register 90 can, for example, be connected to the corresponding data processing chip 10, or exist as a separate register in the data processing device 100 with no connection to other modules in the data processing chip 10; as shown in FIG. 2, the register 90 is not connected to other modules in the data processing chip 10.
  • A controller outside the data processing device 100 is connected to the register 90. Specifically, the controller can also use SMBUS to read the type information stored in the register 90 and judge whether the data processing chip 10 can switch its own mode.
  • The controller can determine the data processing chip 10 whose own mode is currently the controlled mode and then monitor that data processing chip 10; specifically, after receiving the monitoring signal stored in the signal converter 50, it judges from the monitoring signal whether the corresponding data processing chip 10 can work normally. In one possible situation, if the monitoring signal indicates that the data processing chip 10 can work normally, the current mode of the data processing chip 10 is kept unchanged and the data processing task continues to be executed.
  • The controller may, for example, control the signal converter 50 in the data processing device 100 where the data processing chip 10 is located, so that the signal converter 50 sends a port gating signal to the port gating interface of the multiplexer 20.
  • Referring to FIG. 2, it shows a circuit connection diagram of the multiplexer.
  • The signal converter 50 is used to receive the control instruction sent by the controller based on a preset first communication protocol, convert the control instruction into a port gating signal, and send the port gating signal to the multiplexer 20.
  • The first communication protocol can be, for example, the same as the second communication protocol, such as SMBUS; alternatively, a communication protocol different from the second communication protocol can be selected according to actual conditions.
  • I/O_2 of the signal converter 50 can be used to send a port gating signal to the port gating port of the multiplexer 20 connected to it. The port gating signal instructs the multiplexer 20 to gate the first transmission path between the multiplexer 20 and the first memory 30, or the second transmission path between the multiplexer 20 and the second memory 40.
  • The controller can, for example, send a control instruction to the Inter-Integrated Circuit (I2C) interface of the signal converter 50 in the corresponding data processing device 100, so that after receiving the control instruction, the signal converter 50 sends the port gating signal PO_1 to the port gating port of the multiplexer 20 to instruct the multiplexer 20 to select the second transmission path.
  • The data path between the I2C interface of the signal converter 50 and the controller indicated in FIG. 2 also refers to the data path through which the controller sends a control instruction to the I2C interface.
  • The own mode of the data processing chip 10 is then switched from the current master control mode to the controlled mode.
  • After the controller reads the type information of the data processing chip 10 stored in the register 90 of the data processing device 100 where the data processing chip 10 is located and determines its type, a control instruction can be sent to the I2C interface of the corresponding signal converter 50, so that after receiving the control instruction, the signal converter 50 sends to the port gating port of the multiplexer 20 a port gating signal PO_2 instructing the multiplexer 20 to gate the first transmission path. The own mode of the data processing chip 10 is then switched from the current controlled mode to the master mode.
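  • The chain from control instruction to port gating signal to gated transmission path can be summarised in a short Python sketch. It is a simplified model under stated assumptions: the signal names PO_1 and PO_2 and the path each selects come from the description, while the enums `Path` and `Mode` and the functions `convert_instruction`, `gate`, and `switch_mode` are hypothetical names introduced for illustration.

```python
from enum import Enum


class Path(Enum):
    FIRST = "first transmission path (to the first memory 30)"
    SECOND = "second transmission path (to the second memory 40)"


class Mode(Enum):
    MASTER = "master"
    CONTROLLED = "controlled"


# Port gating signals named in the description and the path each selects.
GATE_TO_PATH = {"PO_1": Path.SECOND, "PO_2": Path.FIRST}
# Mode the chip ends up in once a path is gated, per the description.
PATH_TO_MODE = {Path.FIRST: Mode.MASTER, Path.SECOND: Mode.CONTROLLED}


def convert_instruction(target_mode: Mode) -> str:
    """Signal converter step: control instruction -> port gating signal."""
    return "PO_2" if target_mode is Mode.MASTER else "PO_1"


def gate(port_gating_signal: str) -> Path:
    """Multiplexer step: port gating signal -> gated transmission path."""
    return GATE_TO_PATH[port_gating_signal]


def switch_mode(target_mode: Mode) -> Mode:
    """Chain both steps and return the resulting mode of the chip."""
    return PATH_TO_MODE[gate(convert_instruction(target_mode))]
```

  • Keeping the two lookup tables separate mirrors the division of roles in the text: the signal converter only translates instructions, and the multiplexer only selects a path.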
  • An embodiment of the present disclosure also provides a data processing system.
  • Referring to FIG. 3, it is a schematic diagram of a data processing system 200 provided by an embodiment of the present disclosure. The data processing system 200 includes the data processing device 100 provided by an embodiment of the present disclosure and a controller 70. There are multiple data processing devices 100, which include a first data processing device 101 in the master mode and a second data processing device 102 in the controlled mode, wherein:
  • the controller 70 is configured to monitor the state of the first data processing device and, in response to the state indicating that the first data processing device is to be switched to the controlled mode, send a first control instruction to the first data processing device and send a second control instruction to a target second data processing device;
  • the first data processing device 101 is configured to switch its own mode to the controlled mode in response to receiving the first control instruction; and
  • the target second data processing device 103 is configured to switch its own mode to the master mode in response to receiving the second control instruction.
  • The data processing system 200 may use the controller to determine the state of the first data processing device and to determine whether to switch its mode.
  • When the controller determines to switch the own mode of the first data processing device, it can send a first control instruction to the first data processing device so that the first data processing device switches its own mode to the controlled mode, and send a second control instruction to the target second data processing device so that the target second data processing device switches its own mode to the master mode.
  • In this way, when the data processing device in the master mode in the data processing system 200 fails, another data processing device in the controlled mode can switch to the master mode and take over the functions of the data processing device in the master mode, so as to ensure normal and stable operation of the data processing system 200.
  • There is one first data processing device 101, whose own mode is the master mode; correspondingly, the own mode of the data processing chip in the first data processing device 101 is the master mode.
  • There may be one or more second data processing devices 102, according to actual data processing task requirements or requirements on the working stability of the data processing system 200.
  • An appropriate target second data processing device 103 can be selected from the second data processing devices 102 to switch to the master mode and replace the abnormal first data processing device 101, thereby ensuring normal and stable operation of the data processing system 200.
  • When the controller 70 monitors the state of the first data processing device 101, it may, for example: receive a monitoring signal sent by the first data processing device 101; determine the working state of the first data processing device 101 based on the monitoring signal; and determine, based on the working state of the first data processing device 101, whether to switch the first data processing device 101 to the controlled mode.
  • For the specific process by which the first data processing device 101 sends the monitoring signal to the controller 70, refer to the description of the data processing device 100 above, which will not be repeated here.
  • When the controller 70 determines the state of the first data processing device 101 based on the monitoring signal, the state it determines may be, for example, the working state and/or the data path state of the first data processing device 101.
  • When the first data processing device 101 works normally in the master mode, for example, it splits a received data processing task into multiple subtasks and distributes the subtasks to the second data processing devices 102.
  • The state of the first data processing device 101 can then be regarded as a working state, that is, the first data processing device 101 is executing a data processing task.
  • The state of the first data processing device 101 may also include a data path state.
  • The state of the first data processing device 101 is then determined as the data path state.
  • The state of the first data processing device 101 can be used directly to determine whether it needs to be switched to the controlled mode.
  • The following takes the case in which the state of the first data processing device 101 includes the working state as an example.
  • The controller 70 can simply determine the state of the first data processing device 101 according to a preset correspondence between monitoring signals and state judgment results. For example, when the working state of the data processing chip in the first data processing device 101 is abnormal, a high-level monitoring signal can be sent to the controller 70; after receiving the high-level monitoring signal, the controller 70 may determine that the state of the first data processing device 101 is an abnormal working state. Conversely, when the working state of the data processing chip in the first data processing device 101 is normal, a low-level monitoring signal can be sent to the controller 70; after receiving the low-level monitoring signal, the controller 70 may determine that the state of the first data processing device 101 is a normal working state.
  • The controller 70 may also determine, based on the determined working state of the first data processing device 101, whether to switch the first data processing device 101 to the controlled mode. Specifically, when the controller 70 determines that the working state of the first data processing device 101 is abnormal, it determines that the state of the first data processing device 101 is an abnormal state and determines to switch the first data processing device 101 to the controlled mode; when the controller 70 determines that the working state of the first data processing device 101 is normal, it keeps the own mode of the first data processing device 101 as the master mode.
  • The following takes the case in which the state of the first data processing device 101 includes the data path state as an example to illustrate how to determine whether the own mode of the first data processing device 101 needs to be switched to the controlled mode.
  • The controller 70 may monitor the state of the data path between the first data processing device 101 and a communication switch, and determine, based on that state, whether to switch the first data processing device 101 to the controlled mode.
  • The communication switch may include, for example, a high-speed serial computer expansion bus standard switch (Peripheral Component Interconnect Express Switch, PCIE Switch).
  • Referring to FIG. 4, it is a schematic structural diagram of a data processing system provided by an embodiment of the present disclosure, in which multiple data processing devices are each connected to the controller 70 through the communication switch 80.
  • FIG. 4 shows the high-speed serial computer expansion bus standard card slots (PCIE Slots) corresponding to a plurality of data processing devices (each data processing device is denoted "DP"; the devices shown in the figure are DP_1, DP_2, ..., DP_n), namely PCIE Slot#1, PCIE Slot#2, PCIE Slot#3, ..., PCIE Slot#n. In this way, the data processing devices can use the PCIE protocol when communicating with the communication switch 80.
  • The controller 70 may determine that the current data transmission path of the first data processing device 101 is in a normal state from a data path normal signal actively sent by the communication switch 80, such as a PORT_GOOD# signal.
  • The communication switch 80 may, for example, send to the controller PORT_GOOD# signals respectively corresponding to the data processing devices DP_1 to DP_n, for example PORT_GOOD#1 to PORT_GOOD#n shown in FIG. 4.
  • If the controller 70 fails to receive from the communication switch 80 the data path normal signal corresponding to any data processing device, it can determine that the state of the data path between that data processing device and the communication switch 80 is abnormal. For example, if the controller 70 fails to receive PORT_GOOD#1, it can determine that the state of the data transmission path between the data processing device DP_1 and the communication switch 80 is abnormal, that is, the state of the data path corresponding to PCIE Slot#1 is abnormal.
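  • The rule above, under which a missing data path normal signal marks the corresponding data path as abnormal, can be expressed as a small Python sketch. The function name `abnormal_data_paths` and the representation of received signals as a set of indices are assumptions made for illustration; how the controller actually samples the PORT_GOOD# signals is not specified here.

```python
def abnormal_data_paths(device_count, received_port_good):
    """Return the indices k of devices DP_k whose PORT_GOOD#k is missing.

    device_count: number n of data processing devices DP_1 .. DP_n.
    received_port_good: set of indices k for which PORT_GOOD#k arrived.
    """
    return [k for k in range(1, device_count + 1)
            if k not in received_port_good]


# Usage: PORT_GOOD#1 was not received, so the data path of DP_1
# (PCIE Slot#1) is treated as abnormal.
missing = abnormal_data_paths(4, {2, 3, 4})
```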
  • When the controller 70 determines the state of the data path between the corresponding data processing device and the communication switch 80 based on the PORT_GOOD# signal sent by the communication switch 80, the direction of data transmission is the direction in which the data processing device transmits data to the controller.
  • When the first data processing device 101 in the master mode is determined among the multiple data processing devices, the first data processing device 101 may also send subtasks to the other second data processing devices 102 in the controlled mode (for example, when DP_1 serves as the first data processing device 101, the remaining DP_2 to DP_n can all serve as second data processing devices 102).
  • The data transmission path of the first data processing device 101 can include, for example, a data transmission path corresponding to the PCIE communication protocol; other data transmission paths that can be used for subtask transmission can also be selected, in which case the communication switch 80 used is replaced accordingly.
  • The controller 70 may also determine whether to switch the first data processing device 101 to the controlled mode based on the state of the data path. Specifically, when the controller 70 determines that the state of the data path between the first data processing device 101 and the communication switch 80 is abnormal, it determines that the state of the first data processing device 101 is an abnormal state and determines to switch the first data processing device 101 to the controlled mode; when the controller 70 determines that the state of the data path between the first data processing device 101 and the communication switch 80 is normal, it keeps the own mode of the first data processing device 101 as the master mode.
  • The following takes the case in which the state of the first data processing device 101 includes both the working state and the data path state as an example to explain how to determine whether the own mode of the first data processing device 101 needs to be switched to the controlled mode.
  • After the controller 70 determines the state of the data path between the first data processing device 101 and the communication switch 80, it determines, based on the data path state and the working state of the first data processing device 101, whether to switch the first data processing device 101 to the controlled mode. Specifically, the controller 70 determines that the state of the first data processing device 101 is an abnormal state, and determines to switch the first data processing device 101 to the controlled mode, in each of the following cases: the working state is normal but the data path state is abnormal; the working state is abnormal and the data path state is abnormal; or the working state is abnormal but the data path state is normal. Otherwise, the controller 70 keeps the own mode of the first data processing device 101 as the master mode.
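  • The combined decision reduces to a single condition: the first data processing device stays in the master mode only when both the working state and the data path state are normal. A minimal Python sketch follows; the function names are hypothetical and the two booleans stand for the results of the checks described above.

```python
def state_is_normal(working_ok: bool, data_path_ok: bool) -> bool:
    """A device is in a normal state only if both checks pass."""
    return working_ok and data_path_ok


def should_switch_to_controlled(working_ok: bool, data_path_ok: bool) -> bool:
    """Decide whether the master-mode device must be switched."""
    return not state_is_normal(working_ok, data_path_ok)


# Truth table over the four combinations of the two checks.
decisions = {
    (working_ok, data_path_ok): should_switch_to_controlled(working_ok, data_path_ok)
    for working_ok in (True, False)
    for data_path_ok in (True, False)
}
```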
  • The controller 70 may also monitor the state of the data path between the first data processing device 101 and the communication switch 80 after determining the working state of the first data processing device 101, and then determine, based on the working state and the state of the data path between the first data processing device 101 and the communication switch 80, whether to switch the first data processing device 101 to the controlled mode.
  • The specific switching method is similar to the description above and will not be repeated here.
  • The present disclosure does not limit the order in which the controller 70 determines the working state and the data path state of the first data processing device 101. That is, the controller 70 may first determine the data path state of the first data processing device 101 and then its working state, may proceed in the reverse order, or may determine both at the same time.
  • When the controller 70 determines to switch the first data processing device 101 to the controlled mode, the first data processing device 101 currently in use is in an abnormal state and cannot work normally. To ensure that the data processing system 200 can continue to decompose data processing tasks and distribute the resulting subtasks, a target second data processing device 103 can be determined among the second data processing devices 102 and used as the new first data processing device 101 to complete the corresponding data processing tasks. The controller 70 may therefore also determine, from among the second data processing devices 102 and based on their states, the target second data processing device 103 to switch to the master mode.
  • The controller 70 may, in response to the state of the first data processing device 101 being an abnormal state, determine a candidate data processing device from the second data processing devices 102 and detect the state of the candidate data processing device; and, in response to the candidate data processing device being in a normal state, determine the candidate data processing device as the target second data processing device 103.
  • The normal state of the candidate data processing device means that both its working state and its data path state are normal.
  • FIG. 5 is a schematic diagram of determining the target second data processing device provided by an embodiment of the present disclosure.
  • FIG. 5 includes the data processing devices DP_1, DP_2, ..., DP_n usable in the data processing system 200.
  • DP_1 is the current first data processing device 101.
  • The second data processing device 102 sequentially adjacent to DP_1 may be determined as the candidate data processing device.
  • The second data processing device 102 sequentially adjacent to the first data processing device 101 may be determined, for example, according to the sequence numbers of the multiple data processing devices.
  • For example, if n data processing devices, DP_1, DP_2, ..., DP_n, are determined in sequence when the multiple data processing devices in the data processing system 200 are determined, the second data processing device sequentially adjacent to the first data processing device DP_1, that is, DP_2, serves as the candidate data processing device.
  • The second data processing device 102 sequentially adjacent to the first data processing device 101 may also be determined according to the current states of the data processing devices other than the first data processing device 101.
  • For example, the operating temperatures of the multiple data processing devices can be monitored first, and the ordering of the multiple data processing devices can then be determined from their respective operating temperatures, in the direction from lower operating temperature to higher operating temperature.
  • For example, when the multiple data processing devices include DP_2, DP_3, and DP_4, the corresponding ordering is, for example, DP_3, DP_4, DP_2.
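  • The temperature-based ordering can be illustrated with a brief Python sketch. The temperature values below are invented purely so that the result reproduces the example ordering DP_3, DP_4, DP_2; the function name `order_candidates` and the use of degrees Celsius are likewise assumptions.

```python
def order_candidates(temperatures_c):
    """Order device names from lowest to highest operating temperature."""
    return sorted(temperatures_c, key=temperatures_c.get)


# Hypothetical readings for the second data processing devices.
readings_c = {"DP_2": 71.0, "DP_3": 52.5, "DP_4": 60.0}
ordering = order_candidates(readings_c)
```

  • The first entry of the ordering is the first candidate data processing device to be checked; the later entries are tried only if earlier candidates are rejected.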
  • Depending on the ordering used, the determined candidate data processing device may also differ.
  • The candidate data processing device may be determined according to actual conditions, which is not limited here.
  • The controller 70 can also monitor the state of the candidate data processing device.
  • The controller 70 may also, in response to the state of the candidate data processing device being an abnormal state, determine a new candidate data processing device from the second data processing devices 102 and return to the step of detecting whether the state of the candidate data processing device is normal.
  • The manner of determining a new candidate data processing device is similar to the above-mentioned manner of determining a candidate data processing device from the second data processing devices 102, and will not be repeated here.
  • When determining, from the second data processing devices 102, the target second data processing device 103 to switch to the master mode, the controller 70 is also configured to read type information from the candidate data processing device, and to determine the candidate data processing device as the target second data processing device 103 in response to the read type information being preset type information.
  • The controller 70 can also read the type information of the candidate data processing device to determine whether it can be selected as the target second data processing device 103 and work as the data processing device in the master mode.
  • For the manner in which the controller 70 reads the type information of the candidate data processing device, refer to the relevant description in the embodiment corresponding to FIG. 2 above, which will not be repeated here.
  • In this way, the data processing system 200 can be used normally after the own mode of the candidate data processing device is switched to the master mode, and the efficiency is high. Once the state of a candidate data processing device is determined to be normal, screening of the second data processing devices 102 stops, and that candidate data processing device is used as the target second data processing device 103.
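  • The screening loop described above can be sketched in Python as follows. This is one possible reading, not the definitive procedure: the `Device` record, the constant `MASTER_CAPABLE` standing for the preset type information, and the choice to move on to the next candidate when the type information does not match are all assumptions introduced for the example.

```python
from dataclasses import dataclass
from typing import Iterable, Optional

MASTER_CAPABLE = "master_capable"  # hypothetical preset type information


@dataclass
class Device:
    name: str
    working_ok: bool
    data_path_ok: bool
    type_info: str


def select_target(candidates: Iterable[Device]) -> Optional[Device]:
    """Return the first candidate that can take over the master mode."""
    for candidate in candidates:
        # A normal state requires both checks to pass.
        if not (candidate.working_ok and candidate.data_path_ok):
            continue  # abnormal state: pick a new candidate
        # Assumption: a candidate of the wrong type is also skipped.
        if candidate.type_info != MASTER_CAPABLE:
            continue
        return candidate  # screening stops at the first suitable device
    return None


# Usage with three hypothetical second data processing devices.
devices = [
    Device("DP_2", working_ok=False, data_path_ok=True, type_info=MASTER_CAPABLE),
    Device("DP_3", working_ok=True, data_path_ok=True, type_info="compute_only"),
    Device("DP_4", working_ok=True, data_path_ok=True, type_info=MASTER_CAPABLE),
]
target = select_target(devices)
```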
  • After the controller 70 determines the target second data processing device 103, a new data processing system 200 can be established by switching the own modes of the first data processing device 101 and the target second data processing device 103 respectively.
  • the controller 70 may send a reset signal to the first data processing device 101 and the target second data processing device 103; after the reset signal is sent successfully, it sends the first control instruction to the first data processing device 101 and sends the second control instruction to the target second data processing device 103.
  • referring to FIG. 6, it is a schematic diagram of a controller sending a reset signal to a data processing device according to an embodiment of the present disclosure.
  • as the reset signal, for example, a device reset signal (PCIE Reset, PERST) can be used.
  • the controller 70 may also send a device reset signal to the communication switch 80 . After the communication switch 80 receives the reset signal, it performs a reset operation.
  • the controller 70 cannot send a reset signal to the communication switch 80, and then the communication switch 80 sends a reset signal to the data processing device.
  • FIG. 6 does not show the connection relationship between the communication switch 80 and the data processing device; it directly expresses the data transmission paths through which the controller 70 sends a reset signal to the communication switch 80 and to the data processing device.
  • the controller 70 is also connected to the data processing device through a bus; when the controller 70 sends a reset signal to the first data processing device 101 and the target second data processing device 103, It is configured to: send a reset signal to the first data processing device 101 and the target second data processing device 103 through the bus.
  • the bus may include SMBUS, for example.
  • the controller 70 can send the reset signal to the communication switch 80, and the communication switch 80 then sends reset signals to the data processing devices DP_1 to DP_n connected to it.
  • the controller 70 can use the structure shown in FIG. The I2C in the converter sends a reset signal, thereby resetting the corresponding data processing device.
  • the first data processing device 101 and the target second data processing device 103 can cancel their current own modes, so that when their respective own modes are switched, the switch can be realized directly by gating the transmission path between the corresponding data processing chip and the first memory or the second memory. In this way, the first data processing device 101 and the target data processing device complete the switching of their own modes, and the occurrence of switching failures is effectively reduced.
  • the communication switch 80 can also stop sending the PORT_GOOD# signal to the controller 70, that is, stop monitoring the data path corresponding to the data processing device, so as to reduce the power consumption of the communication switch 80 and the controller 70 during switching.
  • the controller 70 can send the first control instruction to the first data processing device 101 and send the second control instruction to the target second data processing device 103.
  • the first control instruction is used to instruct the first data processing device 101 to switch its own mode to the controlled mode;
  • the second control instruction is used to instruct the target data processing device to switch its own mode to the master mode.
  • when the first data processing device 101 switches its own mode to the controlled mode in response to receiving the first control instruction, it is configured to: perform a reset in response to receiving the reset signal; and, after completing the reset, switch its own mode to the controlled mode in response to receiving the first control instruction.
  • when the target data processing device switches its own mode to the master mode in response to receiving the second control instruction, it is configured to: perform a reset in response to receiving the reset signal; and, after completing the reset, switch its own mode to the master mode in response to receiving the second control instruction.
  • the first control instruction can be sent to the signal converter in the first data processing device 101; the signal converter sends a port gating signal to the multiplexer, so that the data processing chip in the first data processing device 101 can obtain the second configuration information in the second memory, thereby switching the own mode of the first data processing device 101 to the controlled mode.
  • the target second data processing device 103 may switch its own mode to the master mode in a similar manner, which will not be repeated here.
  • after the first data processing device 101 and the target second data processing device 103 complete the switching, each of them may send a switching success signal to the controller 70.
  • if the controller 70 does not receive the switching success signals sent by the two devices, it can, for example, directly report an error, wait for staff to check, and keep the first data processing device 101 and the target second data processing device 103 in the reset state; alternatively, the solution can be determined according to the actual situation, which is not limited here.
  • the controller 70 may send a reset release signal to the first data processing device 101 and the target second data processing device 103, and may also send a reset release signal to the communication switch 80.
  • the first data processing device 101 may release the reset in response to the received reset release signal; similarly, the target second data processing device 103 may also release the reset in response to receiving the reset release signal.
  • the manner in which the controller 70 sends the reset release signal to the first data processing device 101 and the target second data processing device 103 is similar to the manner in which the reset signal is sent to them, and will not be repeated here.
  • after the communication switch 80 receives the reset release signal, it can resume monitoring the data paths corresponding to the multiple data processing devices in the new data processing system 200. In this way, the target second data processing device 103 can, with its own mode being the master mode, continue to receive data processing tasks and decompose them into subtasks; similarly, the first data processing device 101 can, with its own mode being the controlled mode, process the received subtasks, so that the data processing system 200 can continue to complete the corresponding data processing tasks.
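The reset, control instruction, switching success signal and reset release sequence described above can be sketched in Python as follows. This is a simplified model under stated assumptions, not the disclosed hardware behaviour: the `Device` class, the `faulty` flag used to simulate a failed switch, and the function `switch_roles` are hypothetical names introduced here.

```python
from enum import Enum


class Mode(Enum):
    MASTER = "master"
    CONTROLLED = "controlled"


class Device:
    def __init__(self, name: str, mode: Mode, faulty: bool = False):
        self.name = name
        self.mode = mode
        self.in_reset = False
        self.faulty = faulty  # hypothetical flag used to simulate a failed switch

    def reset(self) -> None:
        self.in_reset = True

    def apply_instruction(self, new_mode: Mode) -> bool:
        """Switch mode while held in reset; the return value models the success signal."""
        if not self.in_reset or self.faulty:
            return False
        self.mode = new_mode
        return True

    def release_reset(self) -> None:
        self.in_reset = False


def switch_roles(first: Device, target: Device) -> bool:
    # Step 1: the controller resets both devices.
    first.reset()
    target.reset()
    # Step 2: first control instruction -> controlled mode,
    #         second control instruction -> master mode.
    ok_first = first.apply_instruction(Mode.CONTROLLED)
    ok_target = target.apply_instruction(Mode.MASTER)
    # Step 3: without both success signals, keep both devices in reset
    #         so that the error can be reported and checked.
    if not (ok_first and ok_target):
        return False
    # Step 4: release the reset so the new system can resume work.
    first.release_reset()
    target.release_reset()
    return True


dp_1 = Device("DP_1", Mode.MASTER)
dp_3 = Device("DP_3", Mode.CONTROLLED)
print(switch_roles(dp_1, dp_3), dp_1.mode.value, dp_3.mode.value)
```

Keeping both devices in reset on failure follows one of the options mentioned in the text; other recovery strategies are equally possible.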
  • referring to FIG. 7A and FIG. 7B, they show a flowchart of a data processing system performing a data processing task provided by an embodiment of the present disclosure; wherein,
  • S701 The controller determines the first data processing device whose current own mode is the master mode
  • S702 The data processing system executes a data processing task
  • S703 The controller monitors whether the status of the first data processing device is normal; wherein, S703 includes the following S7031 to S7034;
  • S7031 The controller monitors the state of the data path between the first data processing device and the communication switch, and determines whether the state of the first data processing device is normal; if yes, execute S7032; if not, execute S7033;
  • S7032 The controller monitors whether the data processing chip in the first data processing device can work normally; wherein, S7032 includes the following S70321 and S70322;
  • S70321 The monitoring chip monitors the working state of the data processing chip in the first data processing device, and sends a monitoring signal corresponding to the working state to the signal converter;
  • S70322 The controller reads the monitoring signal stored in the signal converter, and determines whether the state of the first data processing device is normal; if yes, execute S7034; if not, execute S7033;
  • S7031 and S7032 can be executed in reverse order, for example, S7032 is executed first, and then S7031 is executed; or, S7031 and S7032 can be executed synchronously.
  • S7033 Determine that the state of the first data processing device is an abnormal state
  • S7034 Determine that the state of the first data processing device is normal; return to execute S702;
  • S704 The controller determines a candidate data processing device from the second data processing device
  • S705 The controller monitors whether the state of the candidate data processing device is normal; if yes, execute S706; if not, execute S707;
  • S706 The controller determines that the candidate data processing device is the target second data processing device
  • S707 The controller determines a new candidate data processing device from the second data processing device; return to execute S705;
  • S708 The controller sends a reset signal to the first data processing device and the target second data processing device to reset the first data processing device and the target second data processing device;
  • S709 The controller sends the first control instruction to the first data processing device, and sends the second control instruction to the target second data processing device;
  • S710 The first data processing device and the target second data processing device switch their own modes; wherein, S710 includes S7101 and S7102;
  • S7101 The first data processing device switches its own mode to the controlled mode in response to receiving the first control instruction
  • S7102 The target second data processing device switches its own mode to the master mode in response to receiving the second control instruction
  • S711 The controller sends a reset release signal to the first data processing device and the target second data processing device; the first data processing device and the target second data processing device release reset.
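The state check in S7031 to S7034 can be sketched in Python as follows. This is a minimal illustration, assuming the two checks are available to the controller as booleans; the function names `device_state` and `next_step` and the boolean encoding are hypothetical.

```python
def device_state(data_path_ok: bool, chip_ok: bool) -> str:
    """Combine the data path check (S7031) and the working state check (S7032).

    Both checks must pass for the device to be normal, so the order in
    which they are evaluated does not change the result.
    """
    if not data_path_ok:
        return "abnormal"  # S7033
    if not chip_ok:
        return "abnormal"  # S7033
    return "normal"        # S7034


def next_step(state: str) -> str:
    """Return the flowchart step that follows the state check."""
    # Normal: keep executing the task (S702).
    # Abnormal: start choosing a candidate data processing device (S704).
    return "S702" if state == "normal" else "S704"


for path_ok, chip_ok in [(True, True), (True, False), (False, True)]:
    state = device_state(path_ok, chip_ok)
    print(path_ok, chip_ok, state, next_step(state))
```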
  • an embodiment of the present disclosure also provides a board.
  • the board provided by the embodiments of the present disclosure may include any data processing device or any data processing system disclosed in the embodiments of the present disclosure. Refer to FIG. 1 and FIG. 2 for the board including the data processing device, and refer to FIG. 8 for the board including the data processing system.
  • FIG. 8 is a schematic diagram of a board provided by an embodiment of the present disclosure; the board 300 includes a plurality of data processing devices 100 provided by an embodiment of the present disclosure (n data processing devices 100 are shown in FIG. 8, including DP_1 to DP_n), and a controller 70; wherein, a plurality of said data processing devices 100 are respectively connected to said controller 70 through a communication switch 80;
  • the controller 70 is configured to monitor the state of each of the data processing devices 100 and, according to the state of each of the data processing devices 100, switch one of the data processing devices 100 to the master mode and switch the other data processing devices 100 to the controlled mode.
  • the data processing device 100 in the master control mode can undertake tasks such as receiving, splitting, and distributing data processing tasks.
  • the master control mode is also called the root processing mode (Root Complex, RC).
  • the data processing apparatus 100 in the controlled mode can undertake tasks such as processing subtasks, and the controlled mode is also called a node processing mode (End Point, EP).
  • CPU#1 and CPU#2 respectively connect the data processing device 100 in RC mode and the data processing device 100 in EP mode
  • CPU+data processing device hardware architecture
  • data processing devices in different modes correspond to different CPUs to complete related data processing tasks.
  • the whole cannot be used normally, and thus the data processing task cannot be performed normally.
  • the mode of the data processing device 100 can be switched within the board, that is, the data processing device 100 can be switched to the RC mode or the EP mode as needed. Therefore, when the data processing device 100 in the RC mode fails, one of the other data processing devices 100 in the EP mode can quickly be switched to the RC mode to continue the data processing task, so the board is more flexible and more stable when performing data processing tasks.
  • the embodiment of the present disclosure also provides a data processing method corresponding to the data processing device. Since the problem-solving principle of the method in the embodiment of the present disclosure is similar to that of the above-mentioned data processing device, the implementation of the method may refer to the implementation of the device, and repeated descriptions are omitted.
  • referring to FIG. 9, it is a flowchart of a data processing method provided by an embodiment of the present disclosure; the data processing method is applied to the data processing device provided by the embodiment of the present disclosure; the data processing method includes:
  • the multiplexer selects the first transmission path or the second transmission path used by the data processing chip to obtain the configuration information in response to receiving the port strobe signal;
  • the data processing chip acquires first configuration information in response to the first transmission path being gated, and determines its own mode as the master mode based on the first configuration information; in response to the second transmission path is gated, acquires second configuration information, and determines its own mode as the controlled mode based on the second configuration information.
  • the data processing method further includes: when the own mode of the data processing chip is the master mode, in response to receiving a data processing task, decomposing the data processing task into multiple subtasks and delivering the subtasks to other data processing devices in the controlled mode; or, when the own mode of the data processing chip is the controlled mode, in response to receiving a subtask delivered by another data processing device in the master mode, executing the subtask delivered by that other data processing device.
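The decomposition of a data processing task into subtasks can be sketched in Python as follows. The disclosure does not specify how a task is split, so contiguous near-equal chunking is an assumption made here, and `decompose`, `dispatch` and the sample item names are hypothetical.

```python
from typing import Dict, List, Sequence


def decompose(task: Sequence[str], parts: int) -> List[List[str]]:
    """Split a task (a list of work items) into `parts` near-equal subtasks."""
    if parts <= 0:
        raise ValueError("at least one controlled device is required")
    size, extra = divmod(len(task), parts)
    subtasks, start = [], 0
    for i in range(parts):
        # The first `extra` subtasks take one additional item each.
        end = start + size + (1 if i < extra else 0)
        subtasks.append(list(task[start:end]))
        start = end
    return subtasks


def dispatch(task: Sequence[str], controlled_devices: Sequence[str]) -> Dict[str, List[str]]:
    """Assign one subtask to each data processing device in the controlled mode."""
    subtasks = decompose(task, len(controlled_devices))
    return dict(zip(controlled_devices, subtasks))


frames = ["img0", "img1", "img2", "img3", "img4"]
plan = dispatch(frames, ["DP_2", "DP_3"])
print(plan)
```

Each controlled device can then process its own subtask independently of the others, which is what allows the subtasks to run in parallel.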
  • the data processing device further includes: a first memory and a second memory; wherein, the first memory and the second memory are respectively connected to the multiplexer;
  • the data processing method further includes: the multiplexer, in response to receiving the port gating signal, gates the first transmission path between the data processing chip and the first memory, or gates the second transmission path between the data processing chip and the second memory.
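The relationship between the port gating signal, the two memories and the own mode of the chip can be sketched in Python as follows. This is only a software model under assumptions: the 0/1 encoding of the port gating signal, the dictionary layout of the configuration information and the class names are hypothetical and are not taken from the disclosure.

```python
class Multiplexer:
    """Gates one of two transmission paths between the chip and a memory."""

    def __init__(self, first_memory: dict, second_memory: dict):
        # Assumed encoding: 0 gates the first path, 1 gates the second path.
        self._paths = {0: first_memory, 1: second_memory}
        self._selected = 0

    def gate(self, port_gating_signal: int) -> None:
        if port_gating_signal not in self._paths:
            raise ValueError("unknown port gating signal")
        self._selected = port_gating_signal

    def read_config(self) -> dict:
        # The chip only sees the memory behind the currently gated path.
        return self._paths[self._selected]


class DataProcessingChip:
    def __init__(self, mux: Multiplexer):
        self.mux = mux
        self.mode = None

    def load_mode(self) -> str:
        """Acquire configuration information and determine the own mode from it."""
        config = self.mux.read_config()
        self.mode = config["mode"]
        return self.mode


first_memory = {"mode": "master"}       # first configuration information
second_memory = {"mode": "controlled"}  # second configuration information
mux = Multiplexer(first_memory, second_memory)
chip = DataProcessingChip(mux)

mux.gate(1)
print(chip.load_mode())
```

Because the mode is derived entirely from which memory is reachable, switching modes in this model needs no change to the chip itself, only a different gating signal.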
  • the data processing device further includes: a signal converter; the signal converter is connected to the multiplexer and to a controller; the data processing method further includes: the signal converter receives, based on a preset first communication protocol, the control instruction sent by the controller, converts the control instruction into a port gating signal, and sends the port gating signal to the multiplexer.
  • the data processing device further includes: a monitoring chip; the monitoring chip is respectively connected to the data processing chip and the signal converter; the data processing method further includes: the monitoring chip monitors the working state of the data processing chip and sends a monitoring signal corresponding to the working state to the signal converter; the signal converter receives the monitoring signal sent by the monitoring chip and, based on a preset second communication protocol, sends the monitoring signal to the controller.
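The two roles of the signal converter described above can be sketched in Python as follows. The sketch assumes that control instructions arrive as the strings "master" and "controlled", that these map to port gating signals 0 and 1, and that the controller polls the latest monitoring signal from the converter; all of these choices, and the class and method names, are hypothetical.

```python
from typing import Callable, Optional


class SignalConverter:
    # Assumed mapping from control instruction to port gating signal.
    MODE_TO_PORT = {"master": 0, "controlled": 1}

    def __init__(self, drive_mux: Callable[[int], None]):
        self._drive_mux = drive_mux  # callback standing in for the link to the multiplexer
        self._monitor_signal: Optional[bool] = None

    def on_control_instruction(self, instruction: str) -> int:
        """Convert a control instruction from the controller into a port gating signal."""
        port = self.MODE_TO_PORT[instruction]
        self._drive_mux(port)
        return port

    def on_monitor_signal(self, chip_ok: bool) -> None:
        """Store the latest monitoring signal reported by the monitoring chip."""
        self._monitor_signal = chip_ok

    def read_monitor_signal(self) -> Optional[bool]:
        """Let the controller read the stored monitoring signal (None if none yet)."""
        return self._monitor_signal


gated_ports = []  # records what the multiplexer would receive
converter = SignalConverter(gated_ports.append)

converter.on_control_instruction("controlled")
converter.on_monitor_signal(False)
print(gated_ports, converter.read_monitor_signal())
```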
  • the data processing device further includes: a register; the register is used to store type information of the data processing chip.
  • the embodiment of the present disclosure also provides a data processing method corresponding to the data processing system. Since the problem-solving principle of the method in the embodiment of the present disclosure is similar to that of the above-mentioned data processing system, the implementation of the method may refer to the implementation of the system, and repeated descriptions are omitted.
  • referring to FIG. 10, it is a flowchart of another data processing method provided by an embodiment of the present disclosure.
  • the data processing method is applied to the data processing system provided by the embodiment of the present disclosure; the data processing method includes:
  • S1001 The controller monitors the state of the first data processing device; in response to the state indicating that the first data processing device is to be switched to the controlled mode, it sends a first control instruction to the first data processing device and sends a second control instruction to the target second data processing device;
  • S1002 The first data processing device switches its own mode to the controlled mode in response to receiving the first control instruction;
  • S1003 The target second data processing device switches its own mode to the master mode in response to receiving the second control instruction.
  • the state of the data processing device includes at least one of the following: working state; data path state.
  • the controller monitoring the state of the first data processing device includes: receiving a monitoring signal sent by the first data processing device; determining the working state of the first data processing device based on the monitoring signal; and determining, based on the working state of the first data processing device, whether to switch the first data processing device to the controlled mode.
  • the data processing system further includes: a communication switch; multiple data processing devices are respectively connected to the controller through the communication switch; the controller monitoring the state of the first data processing device includes: monitoring the state of the data path between the first data processing device and the communication switch; and determining, based on the state of the data path between the first data processing device and the communication switch, whether to switch the first data processing device to the controlled mode.
  • the controller monitoring the state of the first data processing device includes: in response to the state of the first data processing device being an abnormal state, determining to switch the first data processing device to the controlled mode.
  • before the controller sends the second control instruction to the target second data processing device, the method further includes: determining, based on the state of the second data processing device, a target second data processing device to be switched to the master mode from among the second data processing devices.
  • the controller determining, based on the state of the second data processing device, the target second data processing device to be switched to the master mode from among the second data processing devices includes: in response to the state of the first data processing device being an abnormal state, determining a candidate data processing device from the second data processing device and detecting the state of the candidate data processing device; in response to the state of the candidate data processing device being a normal state, determining the candidate data processing device as the target second data processing device; and in response to the state of the candidate data processing device being an abnormal state, determining a new candidate data processing device from the second data processing device and returning to the step of detecting whether the state of the candidate data processing device is normal.
  • the controller determining, from the second data processing device, the target second data processing device to be switched to the master mode includes: reading type information from the candidate data processing device; and in response to the read type information being preset type information, determining the candidate data processing device as the target second data processing device.
  • the controller sending the first control instruction to the first data processing device and sending the second control instruction to the target second data processing device includes: sending a reset signal to the first data processing device and the target second data processing device; and, after the reset signal is sent successfully, sending the first control instruction to the first data processing device and sending the second control instruction to the target second data processing device;
  • the first data processing device switching its own mode to the controlled mode in response to receiving the first control instruction includes: performing a reset in response to receiving the reset signal; and, after completing the reset, switching its own mode to the controlled mode in response to receiving the first control instruction;
  • the target second data processing device switching its own mode to the master mode in response to receiving the second control instruction includes: performing a reset in response to receiving the reset signal; and, after completing the reset, switching its own mode to the master mode in response to receiving the second control instruction.
  • the data processing method further includes: the controller, in response to receiving the switching success signals sent by the first data processing device and the target second data processing device, sends a reset release signal to the first data processing device and the target second data processing device; the first data processing device releases the reset in response to receiving the reset release signal; and the target second data processing device releases the reset in response to receiving the reset release signal.
  • the controller is further connected to the data processing device through a bus; the controller sends a reset signal to the first data processing device and the target second data processing device, including : sending a reset signal to the first data processing device and the target second data processing device through the bus.
  • the writing order of the steps does not imply a strict execution order, nor does it constitute any limitation on the implementation process; the specific execution order of each step should be determined by its function and possible internal logic.
  • the embodiment of the present disclosure also provides an electronic device, including: an instruction memory and the data processing device provided in the embodiment of the present disclosure, or including the data processing system provided in the embodiment of the present disclosure, or including the board provided in the embodiment of the present disclosure.
  • the data processing device, data processing system, or board provided by the embodiments of the present disclosure may include a chip, an AI chip, and the like.
  • the electronic device provided by the embodiment of the present disclosure may include a smart terminal such as a mobile phone, or may also be another device capable of data processing, a server, etc., which is not limited here.
  • Boards include, for example, printed circuit boards.
  • the embodiment of the present disclosure also provides a computer-readable storage medium on which a computer program is stored; the program is used by a multiplexer or a data processing chip to execute the method provided in any data processing method embodiment of the present disclosure, or is used by the controller, the first data processing device, and the target second data processing device to execute the method provided in any data processing method embodiment of the present disclosure.
  • the embodiment of the present disclosure also provides a computer program product; the computer program product carries program code, and the instructions included in the program code can be used to execute the steps of the data processing method described in the above method embodiment. For details, please refer to the above method embodiment, which will not be repeated here.
  • the above-mentioned computer program product may be specifically implemented by means of hardware, software or a combination thereof.
  • the computer program product is embodied as a computer storage medium, and in another optional embodiment, the computer program product is embodied as a software product, such as a software development kit (Software Development Kit, SDK), etc.
  • the units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, that is, they may be located in one place, or may be distributed to multiple network units. Part or all of the units can be selected according to actual needs to achieve the purpose of the solution of this embodiment.
  • each functional unit in each embodiment of the present disclosure may be integrated into one processing unit, each unit may exist separately physically, or two or more units may be integrated into one unit.
  • if the functions are realized in the form of software functional units and sold or used as independent products, they can be stored in a processor-executable non-volatile computer-readable storage medium.
  • the technical solution of the present disclosure in essence, or the part that contributes to the prior art, or a part of the technical solution, can be embodied in the form of a software product; the computer software product is stored in a storage medium and includes several instructions that cause an electronic device (which may be a personal computer, a server, or a network device, etc.) to execute all or part of the steps of the methods described in the various embodiments of the present disclosure.
  • the aforementioned storage media include: a USB flash drive, a removable hard disk, read-only memory (Read-Only Memory, ROM), random access memory (Random Access Memory, RAM), a magnetic disk, an optical disc, and other media that can store program code.

Abstract

A data processing apparatus, system, method, and board card. The data processing apparatus (100) comprises: a multiplexer (20) which, in response to receiving a port gating signal, is used to gate a first transmission path or a second transmission path of a data processing chip (10) which are used to acquire configuration information; and the data processing chip, which is used to acquire first configuration information in response to the first transmission path being gated, and determine its own mode to be a master control mode on the basis of the first configuration information, and to acquire second configuration information in response to the second transmission path being gated, and determine its own mode to be a controlled mode on the basis of the second configuration information. In this way, a functional limitation on a data processing chip can be eliminated, so that the data processing chip can switch its own mode according to actual need, improving stability of a data processing apparatus.

Description

Data processing device, system, method and board
Cross-reference to related applications
This patent application claims priority to the Chinese patent application No. 202110727748.1, filed on June 29, 2021 and entitled "Data processing device, system, board, method, electronic equipment and storage medium", which is incorporated herein by reference.
Technical field
The present disclosure relates to the field of computer technology, and in particular to a data processing device, system, method and board.
Background
When processing data such as images, a central processing unit (CPU) is usually used as the main control chip, supplemented by an artificial intelligence (AI) accelerator card chip as an accelerator card, to build a "CPU + AI accelerator card" hardware architecture for data processing, which effectively increases the speed of data processing. However, once the CPU serving as the main control chip fails, data processing can no longer be performed normally, so the stability of the hardware architecture is poor.
Summary
Embodiments of the present disclosure provide at least a data processing device, system, method and board.
In a first aspect, an embodiment of the present disclosure provides a data processing device, including: a data processing chip and a multiplexer; wherein the multiplexer is configured to, in response to receiving a port gating signal, gate a first transmission path or a second transmission path used by the data processing chip to acquire configuration information; and the data processing chip is configured to: in response to the first transmission path being gated, acquire first configuration information and, based on the first configuration information, determine its own mode as the master mode; and in response to the second transmission path being gated, acquire second configuration information and, based on the second configuration information, determine its own mode as the controlled mode.
In this way, by switching the data processing chip in the data processing device to the master mode or the controlled mode under different circumstances, the functional restrictions imposed on the data processing chip in a conventional data processing device can be removed; the data processing chip can switch its own mode according to actual needs, which improves the stability of the data processing device.
In an optional implementation, the data processing chip is further configured to, when its own mode is the master mode and in response to receiving a data processing task, decompose the data processing task into multiple subtasks and deliver the subtasks to other data processing devices in the controlled mode; or the data processing chip is further configured to, when its own mode is the controlled mode and in response to receiving a subtask delivered by another data processing device in the master mode, execute that subtask.
In this way, a data processing chip in the master mode plays a controlling role and delivers the subtasks corresponding to a data processing task to other data processing devices in the controlled mode. A data processing chip in the controlled mode performs the actual data processing. When there are multiple data processing chips in the controlled mode, they can process their respective subtasks in parallel, so data processing is more efficient.
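A minimal software analogy of this division of labor is shown below. It is a sketch only: worker threads stand in for chips in the controlled mode, and the task format and the function names `decompose`, `execute_subtask`, and `run_task` are assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def decompose(task):
    """Master-mode step: split a task into one subtask per item."""
    return [{"op": task["op"], "item": item} for item in task["items"]]


def execute_subtask(subtask):
    """Controlled-mode step: a stand-in for the real processing."""
    return f"{subtask['op']}({subtask['item']})"


def run_task(task, num_controlled):
    """Decompose the task, then let the controlled workers run subtasks in parallel."""
    subtasks = decompose(task)
    with ThreadPoolExecutor(max_workers=num_controlled) as pool:
        # map() returns results in the same order as the subtasks.
        return list(pool.map(execute_subtask, subtasks))
```

For example, `run_task({"op": "classify", "items": ["img_1", "img_2"]}, num_controlled=2)` processes the two images concurrently and returns one result per image.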
In an optional implementation, the data processing device further includes a first memory and a second memory, each connected to the multiplexer. The multiplexer is configured to, in response to receiving the port gating signal, gate the first transmission path between the data processing chip and the first memory, or gate the second transmission path between the data processing chip and the second memory.
In this way, with the first memory and the second memory provided, switching the mode of the data processing chip only requires gating the transmission path to the memory that corresponds to the desired mode. This is convenient and makes mode switching more efficient.
In an optional implementation, the data processing device further includes a signal converter. The signal converter is connected to the multiplexer and to a controller. The signal converter is configured to receive, based on a preset first communication protocol, a control instruction sent by the controller, convert the control instruction into the port gating signal, and send the port gating signal to the multiplexer.
In this way, the signal converter receives the control instruction sent by the controller, converts it into the port gating signal, and sends that signal to the multiplexer.
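The conversion step can be pictured as a simple lookup. The sketch below is hypothetical throughout: the disclosure does not name the control instructions or define the signal levels, so the strings `SET_MASTER` and `SET_CONTROLLED` and the values 0 and 1 are assumptions.

```python
# Hypothetical mapping from controller instructions to port gating signals.
CONTROL_TO_GATING = {
    "SET_MASTER": 0,      # assumed to gate the first transmission path
    "SET_CONTROLLED": 1,  # assumed to gate the second transmission path
}


def to_port_gating_signal(control_instruction: str) -> int:
    """Convert a control instruction into the signal sent to the multiplexer."""
    try:
        return CONTROL_TO_GATING[control_instruction]
    except KeyError:
        raise ValueError(f"unsupported control instruction: {control_instruction}") from None
```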
In an optional implementation, the data processing device further includes a monitoring chip connected to the data processing chip and to the signal converter. The monitoring chip is configured to monitor the working state of the data processing chip and to send a monitoring signal corresponding to the working state to the signal converter. The signal converter is further configured to receive the monitoring signal sent by the monitoring chip and to send the monitoring signal to the controller based on a preset second communication protocol.
In this way, the monitoring chip makes it possible to monitor the data processing chip conveniently and accurately and thereby determine its working state.
In an optional implementation, the data processing device further includes a register configured to store type information of the data processing chip.
In this way, the type information of the data processing chip can be stored in the register in advance, so that the controller can learn the type of the corresponding data processing chip directly from the register instead of repeating a detection each time the type is checked, which is more efficient.
In a second aspect, an embodiment of the present disclosure further provides a data processing system, including a controller and multiple data processing devices as provided by the embodiments of the present disclosure. The multiple data processing devices include a first data processing device in the master mode and second data processing devices in the controlled mode. The controller is configured to monitor the state of the first data processing device and, in response to the state indicating that the first data processing device is to be switched to the controlled mode, send a first control instruction to the first data processing device and send a second control instruction to a target second data processing device. The first data processing device is configured to switch its own mode to the controlled mode in response to receiving the first control instruction. The target second data processing device is configured to switch its own mode to the master mode in response to receiving the second control instruction.
In this way, when the data processing device in the master mode fails, another data processing device in the controlled mode can be switched to the master mode and take over the functions of the master-mode device, which keeps the data processing system running normally and stably.
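The role swap performed by the controller can be sketched as below. This is a simplified model under assumptions: the `Device` record, its `healthy` flag, and the policy of taking the first healthy controlled device are illustrative choices and not details taken from the disclosure.

```python
from dataclasses import dataclass


@dataclass
class Device:
    name: str
    mode: str            # "master" or "controlled"
    healthy: bool = True


def failover(devices):
    """Swap roles if the master is abnormal; return the name of the current master."""
    master = next(d for d in devices if d.mode == "master")
    if master.healthy:
        return master.name  # nothing to do

    target = next((d for d in devices if d.mode == "controlled" and d.healthy), None)
    if target is None:
        raise RuntimeError("no healthy controlled device is available")

    master.mode = "controlled"  # effect of the first control instruction
    target.mode = "master"      # effect of the second control instruction
    return target.name
```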
In an optional implementation, the state of the data processing device includes at least one of the following: a working state; a data path state.
In an optional implementation, when monitoring the state of the first data processing device, the controller is configured to: receive a monitoring signal sent by the first data processing device; determine the working state of the first data processing device based on the monitoring signal; and determine, based on the working state of the first data processing device, whether to switch the first data processing device to the controlled mode.
In this way, monitoring the working state of the first data processing device makes it possible to determine quickly whether the data processing system can currently work normally. If the data processing system cannot work because the first data processing device has failed, it is determined that the first data processing device is to be switched to the controlled mode. The system can thus respond promptly to a failure and is more stable.
In an optional implementation, the data processing system further includes a communication switch, and the multiple data processing devices are each connected to the controller through the communication switch. When monitoring the state of the first data processing device, the controller is configured to: monitor the state of the data path between the first data processing device and the communication switch; and determine, based on that data path state, whether to switch the first data processing device to the controlled mode.
In this way, with the communication switch provided, the state of the data path between the first data processing device and the communication switch can be monitored, and when the data path state is abnormal, it is determined that the first data processing device is to be switched to the controlled mode. Switching the mode when the data path is abnormal ensures that the device acting in the master mode has a working data path, so that the master-mode device can deliver subtasks to the other data processing devices over that path.
In an optional implementation, when monitoring the state of the first data processing device, the controller is configured to: in response to the state of the first data processing device being an abnormal state, determine that the first data processing device needs to be switched to the controlled mode.
In an optional implementation, before sending the second control instruction to the target second data processing device, the controller is further configured to: determine, from among the second data processing devices and based on their states, the target second data processing device to be switched to the master mode.
In an optional implementation, when determining the target second data processing device based on the states of the second data processing devices, the controller is configured to: in response to the state of the first data processing device being an abnormal state, determine a candidate data processing device from among the second data processing devices and detect the state of the candidate data processing device; and in response to the state of the candidate data processing device being a normal state, determine the candidate data processing device to be the target second data processing device. The controller is further configured to: in response to the state of the candidate data processing device being an abnormal state, determine a new candidate data processing device from among the second data processing devices and return to the step of detecting whether the state of the candidate data processing device is normal.
In this way, the controller first checks whether the state of a candidate data processing device is normal and only then decides whether it can serve as the target second data processing device. This avoids the case in which a candidate is switched to the master mode and still cannot decompose and deliver data processing tasks. Because the controller identifies a device that can actually switch to the master mode and perform the master-mode tasks, repeated switching among multiple data processing devices is avoided, which is more efficient.
In an optional implementation, when determining, from among the second data processing devices, the target second data processing device to be switched to the master mode, the controller is further configured to: read type information from the candidate data processing device; and in response to the read type information being preset type information, determine the candidate data processing device to be the target second data processing device.
In this way, the controller can also check the type information of the candidate data processing device. This prevents a data processing device that cannot perform master-mode tasks from being switched to the master mode, which would either leave the data processing system unable to continue working normally or cause frequent switching as successive candidates are tried as the master-mode device.
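The candidate selection described above, combining the state check with the type check, can be sketched as a single loop. The callbacks `is_normal` and `read_type` and the value `preset_type` are hypothetical stand-ins for the controller's state detection and register read; the disclosure does not define these interfaces.

```python
def select_target(candidates, is_normal, read_type, preset_type):
    """Return the first candidate that is in a normal state and has the preset type.

    Returns None if no candidate qualifies.
    """
    for candidate in candidates:
        if not is_normal(candidate):
            continue  # abnormal: move on to a new candidate and check again
        if read_type(candidate) != preset_type:
            continue  # this device cannot act in the master mode
        return candidate
    return None
```

A usage example with dictionaries as devices:

```python
devices = [
    {"name": "dev_2", "state": "abnormal", "type": "AI"},
    {"name": "dev_3", "state": "normal", "type": "AI"},
]
target = select_target(
    devices,
    is_normal=lambda d: d["state"] == "normal",
    read_type=lambda d: d["type"],
    preset_type="AI",
)
```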
In an optional implementation, when sending the first control instruction to the first data processing device and the second control instruction to the target second data processing device, the controller is configured to: send a reset signal to the first data processing device and to the target second data processing device; and after the reset signal has been sent successfully, send the first control instruction to the first data processing device and the second control instruction to the target second data processing device. When switching its own mode to the controlled mode in response to receiving the first control instruction, the first data processing device is configured to: perform a reset in response to receiving the reset signal; and after the reset is complete, switch its own mode to the controlled mode in response to receiving the first control instruction. When switching its own mode to the master mode in response to receiving the second control instruction, the target second data processing device is configured to: perform a reset in response to receiving the reset signal; and after the reset is complete, switch its own mode to the master mode in response to receiving the second control instruction.
In this way, when the controller delivers the reset signal to the first data processing device and the target second data processing device, both devices leave their current modes. Each device can then switch its mode simply by gating the transmission path between its data processing chip and the first memory or the second memory. This makes it relatively simple for the first data processing device and the target second data processing device to complete the mode switch, and it reduces the occurrence of switching failures.
In an optional implementation, the controller is further configured to: in response to receiving successful-switch signals sent by the first data processing device and the target second data processing device, send a reset-release signal to the first data processing device and to the target second data processing device. The first data processing device releases the reset in response to receiving the reset-release signal, and the target second data processing device releases the reset in response to receiving the reset-release signal.
In this way, after the reset is released, the target second data processing device, now in the master mode, can continue to receive data processing tasks and decompose them into subtasks. Likewise, the first data processing device, now in the controlled mode, can process the subtasks it receives. The data processing system can therefore go on completing its data processing tasks and return to its normal data processing state.
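The ordering of reset, control instruction, and reset release can be sketched as follows. The class `ResettableDevice`, its method names, and the boolean return values that model acknowledgement signals are assumptions introduced for illustration only.

```python
class ResettableDevice:
    """Toy model of a device that only switches mode while held in reset."""

    def __init__(self, name, mode):
        self.name = name
        self.mode = mode
        self.in_reset = False

    def reset(self):
        self.in_reset = True
        return True  # models "reset signal sent successfully"

    def apply_control_instruction(self, new_mode):
        if not self.in_reset:
            raise RuntimeError("mode switch requested outside reset")
        self.mode = new_mode
        return True  # models the "successful switch" signal

    def release_reset(self):
        self.in_reset = False


def switch_roles(first, target):
    """Controller-side sequence: reset both, switch both, then release both."""
    first_reset_ok = first.reset()
    target_reset_ok = target.reset()
    if not (first_reset_ok and target_reset_ok):
        return False

    first_ok = first.apply_control_instruction("controlled")  # first control instruction
    target_ok = target.apply_control_instruction("master")    # second control instruction

    if first_ok and target_ok:
        first.release_reset()
        target.release_reset()
        return True
    return False
```

The reset is released only after both devices have reported a successful switch, which matches the order described above.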
In an optional implementation, the controller is further connected to the data processing devices through a bus. When sending the reset signal to the first data processing device and the target second data processing device, the controller is configured to send the reset signal to the first data processing device and the target second data processing device through the bus.
In a third aspect, an embodiment of the present disclosure further provides a board card. The board card includes the data processing device provided by the first aspect of the present disclosure or any implementation thereof, or the data processing system provided by the second aspect of the present disclosure or any implementation thereof.
In a fourth aspect, an embodiment of the present disclosure further provides a data processing method applied to a data processing device. The data processing method includes:
a multiplexer, in response to receiving a port gating signal, gating a first transmission path or a second transmission path through which a data processing chip obtains configuration information;
the data processing chip, in response to the first transmission path being gated, obtaining first configuration information and, based on the first configuration information, determining its own mode to be a master mode; and in response to the second transmission path being gated, obtaining second configuration information and, based on the second configuration information, determining its own mode to be a controlled mode.
In a fifth aspect, an embodiment of the present disclosure further provides another data processing method, applied to a data processing system. The data processing method includes:
a controller monitoring the state of a first data processing device and, in response to the state indicating that the first data processing device is to be switched to a controlled mode, sending a first control instruction to the first data processing device and sending a second control instruction to a target second data processing device;
the first data processing device switching its own mode to the controlled mode in response to receiving the first control instruction;
the target second data processing device switching its own mode to a master mode in response to receiving the second control instruction.
For the effects of the board card and the data processing methods above, refer to the description of the data processing device and the data processing system above; they are not repeated here.
To make the above objects, features, and advantages of the present disclosure clearer and easier to understand, preferred embodiments are described in detail below with reference to the accompanying drawings.
Description of drawings
To explain the technical solutions of the embodiments of the present disclosure more clearly, the drawings used in the embodiments are briefly introduced below. The drawings are incorporated into and form part of this specification; they show embodiments consistent with the present disclosure and, together with the specification, serve to explain its technical solutions. It should be understood that the following drawings show only some embodiments of the present disclosure and should therefore not be regarded as limiting its scope. A person of ordinary skill in the art can obtain other related drawings from these drawings without creative effort.
FIG. 1 shows a schematic diagram of a data processing device provided by an embodiment of the present disclosure;
FIG. 2 shows a schematic diagram of the specific structure of a data processing device provided by an embodiment of the present disclosure;
FIG. 3 shows a schematic diagram of a data processing system provided by an embodiment of the present disclosure;
FIG. 4 shows a schematic diagram of the specific structure of a data processing system provided by an embodiment of the present disclosure;
FIG. 5 shows a schematic diagram of determining a target second data processing device, provided by an embodiment of the present disclosure;
FIG. 6 shows a schematic diagram of a controller sending a reset signal to data processing devices, provided by an embodiment of the present disclosure;
FIG. 7A and FIG. 7B show a flowchart of a data processing system performing a data processing task, provided by an embodiment of the present disclosure;
FIG. 8 shows a schematic diagram of a board card provided by an embodiment of the present disclosure;
FIG. 9 shows a flowchart of a data processing method provided by an embodiment of the present disclosure;
FIG. 10 shows a flowchart of another data processing method provided by an embodiment of the present disclosure.
Detailed description
To make the objects, technical solutions, and advantages of the embodiments of the present disclosure clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some of the embodiments of the present disclosure, not all of them. The components of the embodiments of the present disclosure, as generally described and illustrated here, may be arranged and designed in a variety of different configurations. Accordingly, the following detailed description of the embodiments is not intended to limit the scope of the claimed disclosure, but merely represents selected embodiments. All other embodiments obtained by a person skilled in the art based on the embodiments of the present disclosure without creative effort fall within the scope of protection of the present disclosure.
Research has found that, when a CPU is used to process data such as images, multiple AI accelerator cards are usually added to process the data in parallel in order to speed up processing and improve efficiency. For example, when processing multiple images, the CPU may serve as the main control chip: it receives a data processing task sent from outside, decomposes the task into multiple subtasks, and assigns different subtasks to multiple AI accelerator cards, which process the subtasks in parallel to improve data processing efficiency. However, under this "CPU + AI accelerator card" hardware architecture, once the CPU serving as the main control chip fails, the entire data processing flow is affected, so the hardware architecture has poor stability.
Based on the above research, the present disclosure provides a data processing device in which the data processing chip can be switched to the master mode or the controlled mode as circumstances require, which frees the chip from the functional restriction imposed on data processing chips in conventional data processing devices. The data processing chip can switch its own mode according to actual needs, which improves the stability of the data processing device.
The defects of the above solutions were all identified by the inventor through practice and careful study. Therefore, both the process of discovering the above problems and the solutions that the present disclosure proposes for them below should be regarded as contributions made by the inventor in the course of the present disclosure.
It should be noted that similar reference numerals and letters denote similar items in the following drawings. Therefore, once an item is defined in one drawing, it need not be further defined or explained in subsequent drawings.
To facilitate understanding of the embodiments, a data processing device disclosed in an embodiment of the present disclosure is first described in detail.
FIG. 1 is a schematic diagram of a data processing device provided by an embodiment of the present disclosure. The data processing device 100 includes a data processing chip 10 and a multiplexer 20, wherein:
the multiplexer 20 is configured to, in response to receiving a port gating signal, gate a first transmission path or a second transmission path through which the data processing chip 10 obtains configuration information;
the data processing chip 10 is configured to: in response to the first transmission path being gated, obtain first configuration information and, based on the first configuration information, determine its own mode to be a master mode; and in response to the second transmission path being gated, obtain second configuration information and, based on the second configuration information, determine its own mode to be a controlled mode.
In the data processing device provided by the embodiments of the present disclosure, the multiplexer may, in response to receiving the port gating signal, gate the first transmission path between the data processing chip and a first memory, so that the data processing chip switches its own mode to the master mode; or gate the second transmission path between the data processing chip and a second memory, so that the data processing chip switches its own mode to the controlled mode. Because the data processing chip can be switched to the master mode or the controlled mode as circumstances require, it is freed from the functional restriction imposed on data processing chips in conventional data processing devices. The data processing chip can switch its own mode according to actual needs, which improves the stability of the data processing device.
In a specific implementation, the data processing chip 10 in the data processing device 100 may include, without limitation, at least one of the following: an AI chip, a graphics processing unit (GPU), a field programmable gate array (FPGA), and an application-specific integrated circuit (ASIC). Specifically, when the data processing chip 10 includes an AI chip, the data processing device 100 containing that chip may, for example, include a hardware device for accelerating data processing. The data processing chip 10 may also include, for example, a network adapter for interfacing with an external network. The network adapter may receive data sent by the external network, such as data related to a data processing task.
In the embodiments of the present disclosure, the data processing chip 10 has different functions in different modes. In the master mode, the data processing chip 10 can receive a data processing task sent from outside, decompose it into multiple subtasks, and send the subtasks to other data processing chips 10 in the controlled mode, thereby controlling the other data processing devices 100. In the controlled mode, the data processing chip 10 can receive subtasks sent by another data processing chip 10 in the master mode and process them, thereby actually executing the data processing task.
By way of example, a data processing task may include classifying and recognizing multiple images. After this task is decomposed into multiple subtasks, each subtask consists of classifying and recognizing one of the images. Alternatively, a data processing task may include applying multiple kinds of data augmentation to one image. After this task is decomposed into multiple subtasks, each subtask consists of applying one of the augmentation methods to the image. The augmentation methods include, for example, smoothing, Gaussian blurring, random erasing, and boundary detection. These are only two examples; the data processing device 100 provided by the embodiments of the present disclosure may also be used to perform other data processing tasks, which are not limited here.
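The two decomposition strategies in these examples can be sketched as follows. The function names and the tuple representation of a subtask are assumptions made for illustration; the disclosure does not prescribe a subtask format.

```python
def split_by_image(images):
    """One subtask per image: classify each image separately."""
    return [("classify", image) for image in images]


def split_by_augmentation(image, augmentations):
    """One subtask per augmentation method applied to the same image."""
    return [(augmentation, image) for augmentation in augmentations]
```

For instance, `split_by_augmentation("img_1", ["smooth", "gaussian_blur", "random_erase", "boundary_detect"])` yields four subtasks that all operate on the same image, whereas `split_by_image(["img_1", "img_2"])` yields one classification subtask per image.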
In a specific implementation, the data processing chip 10 has different functions depending on whether its own mode is the master mode or the controlled mode.
Specifically, a data processing chip 10 whose own mode is the master mode may, in response to receiving a data processing task, decompose the data processing task into multiple subtasks and deliver the subtasks to other data processing devices 100 that are in the controlled mode.
A data processing chip 10 whose own mode is the controlled mode may, in response to receiving a subtask delivered by another data processing device 100 that is in the master mode, execute that subtask.
Exemplarily, the data processing chip 10 in a data processing device 100 may be denoted IC (integrated circuit). For a plurality of data processing devices 100 (for example, three), the corresponding data processing chips 10 may be denoted IC_1, IC_2, and IC_3. Suppose that IC_1 is currently in the master mode and that IC_2 and IC_3 are in the controlled mode. After receiving a data processing task sent by the host side or by another host computer device, IC_1 decomposes the received task into multiple subtasks. The subtasks may include, for example, a subtask M_1 for classifying and recognizing a first image and a subtask M_2 for classifying and recognizing a second image. After determining the subtasks M_1 and M_2, IC_1 may, for example, send subtask M_1 to IC_2 and send subtask M_2 to IC_3, both of which are in the controlled mode.
After receiving subtask M_1, IC_2 executes it and classifies and recognizes the first image; similarly, after receiving subtask M_2, IC_3 executes it and classifies and recognizes the second image. In a possible implementation, a data processing chip 10 in the controlled mode may also send its processing result to the data processing chip 10 in the master mode after executing the subtask. For example, if IC_2 determines a classification result C1 and IC_3 determines a classification result C2, the results C1 and C2 may each be sent to IC_1.
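The IC_1 / IC_2 / IC_3 example can be sketched in a few lines of Python. This is only an illustrative model: the `Chip` class, the round-robin assignment of subtasks, and the placeholder `label_for(...)` result are assumptions made for the sketch and are not specified by the disclosure.

```python
from dataclasses import dataclass, field
from enum import Enum


class Mode(Enum):
    MASTER = "master"
    CONTROLLED = "controlled"


@dataclass
class Chip:
    name: str
    mode: Mode
    results: dict = field(default_factory=dict)

    def execute(self, subtask: str) -> str:
        """Controlled-mode role: process one subtask and return its result."""
        if self.mode is not Mode.CONTROLLED:
            raise RuntimeError(f"{self.name} is not in controlled mode")
        return f"label_for({subtask})"  # stand-in for real image classification

    def dispatch(self, images, workers):
        """Master-mode role: one subtask per image, handed out round-robin."""
        if self.mode is not Mode.MASTER:
            raise RuntimeError(f"{self.name} is not in master mode")
        if not workers:
            raise ValueError("no controlled-mode chips available")
        for index, image in enumerate(images):
            worker = workers[index % len(workers)]
            # The worker reports its result back to the master chip.
            self.results[image] = worker.execute(image)
        return self.results


ic_1 = Chip("IC_1", Mode.MASTER)
ic_2 = Chip("IC_2", Mode.CONTROLLED)
ic_3 = Chip("IC_3", Mode.CONTROLLED)
results = ic_1.dispatch(["image_1", "image_2"], [ic_2, ic_3])
```

Here `image_1` plays the role of subtask M_1 (handled by IC_2) and `image_2` the role of M_2 (handled by IC_3); the mode checks make explicit that a chip only performs the role that matches its current mode.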
In a data processing device 100 whose data processing chip 10 is in the master mode, if the data processing chip 10 fails, its current mode may be switched from the master mode to the controlled mode; at the same time, a data processing chip 10 that is in the controlled mode in another data processing device 100 is switched from the controlled mode to the master mode. This avoids having multiple data processing chips 10 in the master mode at the same time. Moreover, this mode switching ensures that once the data processing chip 10 in the master mode fails, a data processing chip 10 in another data processing device takes over its master function, which keeps the system stable.
In another possible implementation, to prevent a failed data processing chip 10 from affecting the normal execution of the data processing task, the failed chip need not be switched from the master mode to the controlled mode. Instead, in response to a failure of the data processing chip 10 currently in the master mode, the data transmission paths between that chip and the other data processing chips 10 are disconnected. In this case, the data processing chip 10 in another data processing device 100 still needs to be switched from the controlled mode to the master mode.
In addition, because a data processing device 100 containing a data processing chip 10 can be integrated on its own as an integrated circuit, a circuit board, or a chip, the data processing device 100 is hot-swappable. When a data processing chip 10 fails, the data processing device 100 containing it can therefore be unplugged and replaced with a new data processing device 100. The other data processing devices 100 do not need to be paused during this process, which reduces the impact of the repair on them and preserves the stability of the hardware architecture.
In another embodiment of the present disclosure, mode switching of the data processing chip 10 may be implemented, for example, using a first memory and a second memory that are also included in the data processing device 100. FIG. 2 shows a schematic diagram of a specific structure of a data processing device that includes a first memory 30 and a second memory 40; the first memory 30 and the second memory 40 are connected to port A and port B of the multiplexer 20, respectively.
Specifically, the first memory 30 and the second memory 40 may be, for example, flash memory. In response to receiving a port gating signal, the multiplexer 20 may gate either the first transmission path between the data processing chip 10 and the first memory 30 or the second transmission path between the data processing chip 10 and the second memory 40. The data processing chip 10 determines its own mode to be the master mode in response to the first transmission path being gated, and determines its own mode to be the controlled mode in response to the second transmission path being gated.
The first memory 30 stores first configuration information. After obtaining the first configuration information, the data processing chip 10 may, for example, receive data processing tasks sent by the host or by another host computer device, or may perform other data configuration on the data processing chips 10 in other data processing devices 100. Accordingly, a data processing chip 10 that receives the first configuration information determines its own mode to be the master mode.
Exemplarily, a first transmission path is provided between the data processing chip 10 and the first memory 30; the first transmission path may include, for example, a Serial Peripheral Interface (SPI) bus. Because the multiplexer 20 is connected to the first memory 30, setting the data processing chip 10 to the master mode can be implemented by gating the first data path between the multiplexer 20 and the first memory 30. The multiplexer 20 and the first memory 30 may likewise communicate over an SPI bus, or a different protocol may be selected according to the actual situation; details are not repeated here.
The second memory 40 stores second configuration information. After obtaining the second configuration information, the data processing chip 10 may, for example, receive subtasks sent by the data processing chips 10 in other data processing devices 100, such as the subtasks M_1 and M_2 described above. Specifically, in the embodiments of the present disclosure, a data processing chip 10 that receives the second configuration information is unaware of which data processing chip 10 sends it subtasks; it simply executes the corresponding processing after receiving a subtask. That is, a data processing chip 10 that receives the second configuration information can switch its own mode to the controlled mode and continue to execute subtasks sent by whichever data processing chip 10 is currently in the master mode, which reduces interruptions during execution of the data processing task and gives better processing continuity.
Exemplarily, a second transmission path may be provided between the data processing chip 10 and the second memory 40. Like the first transmission path, the second transmission path may be an SPI bus, so that the data processing chip 10 does not need to change communication protocols when switching its own mode; this keeps switching simple and can improve efficiency to some extent. When the data processing chip 10 currently in the master mode fails and another data processing chip 10 is available to be switched to the master mode, the multiplexer 20 associated with the failed chip is connected to the second memory 40. Switching the failed chip's own mode to the controlled mode can therefore be implemented by gating the data path between that multiplexer 20 and the second memory 40.
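The relationship between the gated port and the resulting mode can be modelled as a small lookup. The sketch below is a toy model under stated assumptions: the `Multiplexer` class, the `Port` enum, the dictionary standing in for the contents of the two memories, and the default of port B are all hypothetical.

```python
from enum import Enum


class Port(Enum):
    A = "A"  # leads to the first memory 30
    B = "B"  # leads to the second memory 40


# Hypothetical stand-in for the configuration stored behind each port.
MEMORY_BEHIND_PORT = {
    Port.A: {"memory": "first memory 30", "mode": "master"},
    Port.B: {"memory": "second memory 40", "mode": "controlled"},
}


class Multiplexer:
    """Toy model of the multiplexer 20: exactly one port is gated at a time."""

    def __init__(self, gated: Port = Port.B):
        self.gated = gated

    def on_port_gating_signal(self, port: Port) -> None:
        self.gated = port


def determine_mode(mux: Multiplexer) -> str:
    """The chip reads its configuration through whichever path is gated."""
    return MEMORY_BEHIND_PORT[mux.gated]["mode"]


mux = Multiplexer()
mode_before = determine_mode(mux)   # second path gated
mux.on_port_gating_signal(Port.A)
mode_after = determine_mode(mux)    # first path gated
```

Because the chip only ever sees one memory at a time, it cannot end up with a mixture of master and controlled configuration.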
In another embodiment, because the multiplexer 20 mainly acts as a changeover switch, it cannot take on the task of judging whether the data processing chip 10 in the data processing device 100 needs to switch its own mode. The data processing device 100 therefore also includes a signal converter and a monitoring chip. FIG. 2 shows how the signal converter 50 and the monitoring chip 60 are connected in the data processing device 100; see the description of the signal converter 50 and the monitoring chip 60 below for details.
The monitoring chip 60 may be used to monitor the state of the data processing chip 10 to judge whether the data processing chip 10 can work normally, and thus whether its current mode needs to be switched. In addition, the signal converter 50 is connected to a controller outside the data processing device. The controller cannot read the monitoring result determined by the monitoring chip 60 directly over the bus (here, the bus includes a System Management Bus, SMBUS). Instead, the controller reads from the signal converter 50 the monitoring result that the monitoring chip 60 sent to, and that is stored in, the signal converter 50. The controller then sends a control instruction to the signal converter 50 to make the multiplexer 20 gate the first transmission path or the second transmission path, thereby switching the mode of the data processing chip 10.
Specifically, the monitoring chip 60 is connected to both the data processing chip 10 and the signal converter 50; for the specific connections, see the description of the signal converter 50 below. The monitoring chip 60 is configured to monitor the working state of the data processing chip 10 and to send a monitoring signal corresponding to the working state to the signal converter 50.
Exemplarily, the monitoring chip 60 may be, for example, an SP706 chip or a MAX706 chip. The monitoring chip 60 has multiple pins that can receive or send different signals. Specifically, the monitoring chip 60 may include a watchdog input (Watch Dog Input, WDI) pin, which is connected to the data processing chip 10 and is used to monitor the working state of the data processing chip 10.
Exemplarily, if the level at the input/output (I/O) port of the WDI pin of the monitoring chip 60 changes within 1.6 seconds, the watchdog output (Watch Dog Output, WDO) pin of the monitoring chip 60 outputs a high level, indicating that the data processing chip 10 is working normally. If the level at the I/O port of the WDI pin does not change within 1.6 seconds, the WDO pin outputs a low level, indicating that the working state of the data processing chip 10 is abnormal.
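The watchdog rule above amounts to a timeout check on the most recent WDI level change. The following is a simplified software model of that rule, not a description of the SP706 or MAX706 internals; the function name and the representation of level changes as a list of timestamps are assumptions made for illustration.

```python
WATCHDOG_TIMEOUT_S = 1.6  # timeout quoted in the text


def wdo_is_high(wdi_toggle_times, now, timeout=WATCHDOG_TIMEOUT_S):
    """Return True (WDO high, chip working normally) if the WDI level
    changed at least once during the `timeout` seconds leading up to `now`."""
    return any(0 <= now - t < timeout for t in wdi_toggle_times)


# The chip toggled WDI at 0 s, 1 s and 2 s, then is checked at 3 s.
healthy = wdo_is_high([0.0, 1.0, 2.0], now=3.0)
# The chip stopped toggling after 1 s, so by 3 s the watchdog has expired.
stalled = wdo_is_high([0.0, 1.0], now=3.0)
```

In this model a chip that hangs simply stops producing level changes, so no cooperation from the failed chip is needed to detect the fault.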
After the monitoring chip 60 monitors the working state of the data processing chip 10, a monitoring signal corresponding to the working state can be determined. Exemplarily, the high or low output of the WDO pin may be used directly as the monitoring signal. Alternatively, a different level logic may be defined for the monitoring signal; for example, the monitoring signal may be low when the working state of the data processing chip 10 is normal and high when it is abnormal. This may be decided according to the actual situation and is not limited here.
After determining the monitoring signal, the monitoring chip 60 may send it to the signal converter 50. Here, to keep the data processing device 100 lightweight and to allow a single controller to manage multiple data processing devices 100, the controller may be placed outside the data processing device 100 and judge from the monitoring signal whether the mode of the data processing chip 10 needs to be switched. That is, the controller may include, for example, an electronic component outside the data processing device that makes this decision.
Here, the controller may use SMBUS to read the monitoring signal determined by the monitoring chip 60. However, when an SP706 chip is selected as the monitoring chip 60, the controller cannot read the monitoring signal determined by the SP706 chip directly over SMBUS. It can, however, use SMBUS to obtain from any input pin of the signal converter 50 a level signal that corresponds to the monitoring signal and is readable by the controller. The monitoring chip 60 may therefore send the monitoring signal to the signal converter 50, and the controller then reads the monitoring signal stored in the signal converter 50 over SMBUS.
When there are many data processing devices 100, the number of SMBUS connections that the controller can read is limited, so the controller can read the monitoring signals of only a limited number of data processing devices 100 over SMBUS. In this case, an expansion chip such as a PCA9548 may be provided for the controller so that the controller can read the monitoring signal of whichever data processing device 100 currently needs to be monitored. Similarly, when the controller sends other signals to the data processing devices 100, an expansion chip allows it to send signals to a larger number of data processing devices 100. For example, in the embodiments below, when the controller sends a reset signal or a reset-release signal to a data processing device 100, an expansion chip may also be provided so that signals reach data processing devices 100 that a controller on its own could not reach. This is not repeated below.
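One way to picture the role of the expansion chip is as a switch that connects the controller's single bus to one data processing device at a time. The sketch below is a pure software model: the `BusSwitch` class, the eight channels, and the one-hot `control` mask are modelling assumptions and are not taken from the disclosure or from any driver API.

```python
class BusSwitch:
    """Toy bus switch: routes the controller's single bus to one channel."""

    def __init__(self, channel_count=8):
        self.channel_count = channel_count
        self.control = 0      # one-hot mask of the enabled channel
        self._readers = {}    # channel -> callable returning a monitoring level

    def attach(self, channel, reader):
        self._check(channel)
        self._readers[channel] = reader

    def select(self, channel):
        self._check(channel)
        self.control = 1 << channel

    def read(self):
        for channel, reader in self._readers.items():
            if self.control & (1 << channel):
                return reader()
        raise RuntimeError("no device attached to the selected channel")

    def channels(self):
        return sorted(self._readers)

    def _check(self, channel):
        if not 0 <= channel < self.channel_count:
            raise ValueError(f"channel {channel} out of range")


def poll_all(switch):
    """Read the monitoring level of every attached device, one channel at a time."""
    status = {}
    for channel in switch.channels():
        switch.select(channel)
        status[channel] = switch.read()
    return status


switch = BusSwitch()
switch.attach(0, lambda: True)   # device on channel 0 reports a normal state
switch.attach(1, lambda: False)  # device on channel 1 reports a fault
status = poll_all(switch)
```

The controller still issues one read at a time; the switch only changes which device that read reaches.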
In another embodiment of the present disclosure, the signal converter 50 is further configured to receive the monitoring signal sent by the monitoring chip 60 and to send the monitoring signal to the controller based on a preset second communication protocol.
The signal converter 50 may be, for example, a PCA9555 chip.
Exemplarily, as shown in FIG. 2, the signal converter 50 may include two input/output pins, denoted I/O_1 and I/O_2, and an Inter-Integrated Circuit (I2C) synchronous serial bus interface. I/O_1 of the signal converter 50 is connected to the WDO pin of the monitoring chip. The monitoring chip 60 may, for example, use WDO to transmit the monitoring signal to the signal converter 50: I/O_1 receives the monitoring signal sent by WDO, a register in the signal converter 50 is updated with that signal, and the controller reads the updated monitoring signal through the I2C interface to judge whether the data processing chip 10 can work normally. In this case, the data path between the I2C interface of the signal converter 50 and the controller shown in FIG. 2 is the path over which the controller reads signals from the I2C interface.
The signal converter 50 may send the monitoring signal to the controller based on the preset second communication protocol. The second communication protocol may include, for example, the communication protocol used by SMBUS.
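The monitoring path (WDO to I/O_1, into a register, then read by the controller) can be modelled as follows. This is a toy model only: the `SignalConverter` class and the choice of bit 0 for I/O_1 are assumptions, and the method `read_input_register` merely stands in for a bus read rather than naming any real PCA9555 register or driver call.

```python
IO_1_BIT = 0x01  # assumed bit position for pin I/O_1


class SignalConverter:
    """Toy model of the signal converter 50 on the monitoring path."""

    def __init__(self):
        self._input_register = 0

    def drive_io_1(self, wdo_high: bool) -> None:
        """Monitoring-chip side: record the WDO level in the register."""
        if wdo_high:
            self._input_register |= IO_1_BIT
        else:
            self._input_register &= ~IO_1_BIT

    def read_input_register(self) -> int:
        """Controller side: stands in for a read over the bus."""
        return self._input_register


def chip_works_normally(converter: SignalConverter) -> bool:
    """Controller-side interpretation, using WDO high = normal."""
    return bool(converter.read_input_register() & IO_1_BIT)


converter = SignalConverter()
converter.drive_io_1(True)
state_when_high = chip_works_normally(converter)
converter.drive_io_1(False)
state_when_low = chip_works_normally(converter)
```

The controller never talks to the monitoring chip directly; it only sees the level that the converter has stored.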
In another embodiment of the present disclosure, switching the mode of a data processing chip 10 also requires that the chip be able to operate in the mode it is switched to. For example, the types of data processing chip 10 include computing chips that can only provide computing power, and other chips that can both provide computing power and distribute subtasks. If a computing chip were switched from the controlled mode to the master mode, it could not take on the function of distributing subtasks, so the data processing task could not be executed normally after the switch. In other words, a computing chip should not be switched to the master mode.
Therefore, the data processing device 100 may also include a register. FIG. 2 shows a schematic circuit connection of the register. The register 90 may be, for example, an electrically erasable programmable read-only memory (E2PROM), and specifically a field-replaceable unit E2PROM (FRU E2PROM). Because the tasks that a data processing chip 10 can undertake are fixed, that is, its type is fixed, the type information of the data processing chip 10 can be determined in advance. Specifically, a corresponding register 90 may be provided for the data processing chip 10 to store its type information; and because this type information is fixed, it can be burned directly into the register 90 when the data processing device 100 is manufactured and remains unchanged thereafter. The register 90 may, for example, be connected to its corresponding data processing chip 10, or it may exist as a standalone register in the data processing device 100 with no connection to the other modules; FIG. 2 shows the latter arrangement.
In addition, the controller outside the data processing device 100 is connected to the register 90. Specifically, the controller may use SMBUS to read the type information stored in the register 90 and judge whether the data processing chip 10 is able to switch its own mode.
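The eligibility check based on type information can be sketched as a simple lookup. The type codes `"compute_only"` and `"compute_and_dispatch"` are hypothetical labels chosen for this sketch; the disclosure only states that type information is stored in the register 90. Treating unknown types as ineligible is likewise an assumption.

```python
# Hypothetical type codes and whether each type can distribute subtasks.
TYPE_CAN_DISPATCH = {
    "compute_only": False,
    "compute_and_dispatch": True,
}


def may_switch_to_master(type_info: str) -> bool:
    """Unknown types are conservatively treated as not eligible."""
    return TYPE_CAN_DISPATCH.get(type_info, False)


def eligible_candidates(type_by_device: dict) -> list:
    """Names of devices whose stored type allows the master mode."""
    return [name for name, type_info in type_by_device.items()
            if may_switch_to_master(type_info)]


candidates = eligible_candidates({
    "IC_2": "compute_only",
    "IC_3": "compute_and_dispatch",
})
```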
Exemplarily, the controller may identify the data processing chip 10 whose own mode is currently the controlled mode and then monitor that chip. Specifically, after receiving the monitoring signal stored in the signal converter 50, the controller may use it to judge whether the corresponding data processing chip 10 can work normally. In one possible case, if the monitoring signal indicates that the data processing chip 10 can work normally, the chip's current mode is kept unchanged and execution of the data processing task continues.
In another possible case, if the monitoring signal indicates that the data processing chip 10 cannot work normally, the controller may, for example, control the signal gate in the data processing device 100 containing that chip so that the signal gate sends a port gating signal to the port gating interface of the multiplexer 20. FIG. 2 shows the circuit connections of the multiplexer.
Specifically, the signal converter 50 is configured to receive, based on a preset first communication protocol, a control instruction sent by the controller, convert the control instruction into a port gating signal, and send the port gating signal to the multiplexer 20. The first communication protocol may be the same as the second communication protocol, for example SMBUS, or a different communication protocol may be selected according to the actual situation.
I/O_2 of the signal converter 50 may be used to send a port gating signal to the port gating port of the multiplexer 20 connected to it. The port gating signal instructs the multiplexer 20 to gate either the first transmission path to the first memory 30 or the second transmission path to the second memory 40.
For a data processing chip 10 that currently cannot work normally, the controller may, for example, send a control instruction to the I2C interface of the signal converter 50 in the corresponding data processing device 100. After receiving the control instruction, the signal converter 50 sends to the port gating port of the multiplexer 20 a port gating signal PO_1 that instructs the multiplexer 20 to gate the second transmission path. In this case, the data path between the I2C interface of the signal converter 50 and the controller shown in FIG. 2 is the path over which the controller sends control instructions to the I2C interface. The mode of this data processing chip 10 is then switched from the current master mode to the controlled mode.
For a data processing chip 10 that currently works normally, the controller reads the type information of the chip stored in the register 90 of the data processing device 100 containing it. If the type information indicates that the chip can serve as a data processing chip 10 in the master mode, the controller may send a control instruction to the I2C interface of the corresponding signal converter 50. After receiving the control instruction, the signal converter 50 sends to the port gating port of the multiplexer 20 a port gating signal PO_2 that instructs the multiplexer 20 to gate the first transmission path. The mode of this data processing chip 10 is then switched from the current controlled mode to the master mode.
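The chain from control instruction to port gating signal to resulting mode can be written as two lookups. The signal names PO_1 and PO_2 come from the text; the instruction names `"to_controlled"` and `"to_master"` and the function names are hypothetical, since the disclosure does not specify how a control instruction is encoded.

```python
PO_1 = "PO_1"  # port gating signal: gate the second transmission path
PO_2 = "PO_2"  # port gating signal: gate the first transmission path

# Hypothetical control-instruction names sent by the controller.
INSTRUCTION_TO_SIGNAL = {"to_controlled": PO_1, "to_master": PO_2}
SIGNAL_TO_MODE = {PO_1: "controlled", PO_2: "master"}


def convert(instruction: str) -> str:
    """Signal converter 50: control instruction -> port gating signal."""
    try:
        return INSTRUCTION_TO_SIGNAL[instruction]
    except KeyError:
        raise ValueError(f"unknown control instruction: {instruction!r}") from None


def mode_after(instruction: str) -> str:
    """Mode the chip ends up in once the multiplexer gates the selected path."""
    return SIGNAL_TO_MODE[convert(instruction)]


failed_master_mode = mode_after("to_controlled")  # via PO_1, second path
new_master_mode = mode_after("to_master")         # via PO_2, first path
```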
An embodiment of the present disclosure further provides a data processing system. FIG. 3 is a schematic diagram of a data processing system 200 provided by an embodiment of the present disclosure. The data processing system 200 includes a controller 70 and a plurality of the data processing devices 100 provided by the embodiments of the present disclosure. The plurality of data processing devices 100 include a first data processing device 101 in the master mode and a second data processing device 102 in the controlled mode, wherein:
the controller 70 is configured to monitor the state of the first data processing device and, in response to the state indicating that the first data processing device is to be switched to the controlled mode, send a first control instruction to the first data processing device and send a second control instruction to a target second data processing device;
the first data processing device 101 is configured to switch its own mode to the controlled mode in response to receiving the first control instruction;
the target second data processing device 103 is configured to switch its own mode to the master mode in response to receiving the second control instruction.
In the data processing system 200 provided by the embodiments of the present disclosure, the controller can determine the state of the first data processing device and decide whether its mode needs to be switched. When the controller decides to switch the mode of the first data processing device, it may send the first control instruction to the first data processing device so that the first data processing device switches to the controlled mode; at the same time, it may send the second control instruction to a target second data processing device among the second data processing devices so that the target second data processing device switches to the master mode. In this way, when the data processing device in the master mode in the data processing system 200 fails, another data processing device in the controlled mode can be switched to the master mode and take over the functions of the master-mode device, ensuring normal and stable operation of the data processing system 200.
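The system-level switchover can be sketched as one monitoring pass of the controller 70. This is a minimal sketch under stated assumptions: the `Device` class, the string encoding of the two control instructions, the "first healthy and eligible device" selection rule, and the decision to leave modes unchanged when no replacement exists are all illustrative choices, not details given by the disclosure.

```python
from dataclasses import dataclass

FIRST_CONTROL_INSTRUCTION = "first"    # switch to the controlled mode
SECOND_CONTROL_INSTRUCTION = "second"  # switch to the master mode


@dataclass
class Device:
    name: str
    mode: str                   # "master" or "controlled"
    healthy: bool = True
    can_be_master: bool = True  # derived from the stored type information

    def on_instruction(self, instruction: str) -> None:
        if instruction == FIRST_CONTROL_INSTRUCTION:
            self.mode = "controlled"
        elif instruction == SECOND_CONTROL_INSTRUCTION:
            self.mode = "master"
        else:
            raise ValueError(f"unknown instruction: {instruction!r}")


def controller_step(devices):
    """One monitoring pass; returns the newly promoted device, or None."""
    master = next((d for d in devices if d.mode == "master"), None)
    if master is None or master.healthy:
        return None
    target = next(
        (d for d in devices
         if d.mode == "controlled" and d.healthy and d.can_be_master),
        None,
    )
    if target is None:
        return None  # no eligible replacement; modes are left unchanged
    master.on_instruction(FIRST_CONTROL_INSTRUCTION)
    target.on_instruction(SECOND_CONTROL_INSTRUCTION)
    return target


devices = [
    Device("device_101", "master", healthy=False),
    Device("device_102a", "controlled", can_be_master=False),
    Device("device_102b", "controlled"),
]
promoted = controller_step(devices)
```

Sending both instructions in the same pass keeps the number of master-mode devices at exactly one before and after the switchover.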
In a specific implementation, there is one first data processing device 101, whose own mode is the master mode; accordingly, the data processing chip in the first data processing device 101 is also in master mode. There may be one or more second data processing devices 102, depending on the actual data processing task requirements or on the stability requirements of the data processing system 200. When the working state of the first data processing device 101 becomes abnormal, a suitable target second data processing device 103 can be selected from the second data processing devices 102 and switched to master mode to replace the abnormal first data processing device 101, thereby ensuring normal and stable operation of the data processing system 200.
Specifically, when monitoring the state of the first data processing device 101, the controller 70 may, for example, receive a monitoring signal sent by the first data processing device 101; determine the working state of the first data processing device 101 based on the monitoring signal; and determine, based on that working state, whether to switch the first data processing device 101 to the controlled mode.
For the specific process by which the first data processing device 101 sends the monitoring signal to the controller 70, refer to the description of the data processing device 100 above; it is not repeated here.
In a specific implementation, when the controller 70 determines the state of the first data processing device 101 based on the monitoring signal, the state so determined may be, for example, a working state and/or a data path state.
When the first data processing device 101 works normally in master mode, for example when it splits a received data processing task into multiple subtasks and distributes them to the second data processing devices 102, its state can be regarded as the working state; that is, the first data processing device 101 is executing the data processing task.
Alternatively, the state of the first data processing device 101 may also include a data path state. For example, a first data processing device 101 that currently cannot work normally cannot distribute subtasks to the second data processing devices 102, so only its connection on the data path remains. In this case, the state of the first data processing device is determined to be the data path state.
In this way, the state of the first data processing device 101 gives a fairly direct basis for deciding whether it needs to be switched to the controlled mode. The following description takes the case where the state of the first data processing device 101 includes the working state as an example.
The controller 70 can determine the state of the first data processing device 101 simply from a preset correspondence between monitoring signals and the possible state determination results. For example, when the working state of the data processing chip in the first data processing device 101 is abnormal, a high-level monitoring signal can be sent to the controller 70; upon receiving the high-level monitoring signal, the controller 70 determines that the working state of the first data processing device 101 is abnormal. Conversely, when the working state of the data processing chip in the first data processing device 101 is normal, a low-level monitoring signal can be sent to the controller 70; upon receiving the low-level monitoring signal, the controller 70 determines that the working state of the first data processing device 101 is normal.
After determining the working state of the first data processing device 101, the controller 70 can further decide, based on that working state, whether to switch the first data processing device 101 to the controlled mode. Specifically, if the controller 70 determines that the working state of the first data processing device 101 is abnormal, it determines that the state of the first data processing device 101 is an abnormal state and decides to switch the first data processing device 101 to the controlled mode; if the controller 70 determines that the working state is normal, it keeps the first data processing device 101 in master mode.
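The working-state check described above can be sketched in a few lines. This is a minimal illustration only: the integer encoding of signal levels (1 for high, 0 for low) and all names are assumptions made for the example, not part of the disclosure.

```python
from enum import Enum


class WorkingState(Enum):
    NORMAL = "normal"
    ABNORMAL = "abnormal"


# Preset correspondence from the text: a high-level monitoring signal means
# the working state is abnormal, a low-level signal means it is normal.
# Encoding levels as 1/0 is an assumption made for this sketch.
LEVEL_TO_STATE = {1: WorkingState.ABNORMAL, 0: WorkingState.NORMAL}


def working_state_from_signal(level: int) -> WorkingState:
    """Map a received monitoring-signal level to a working state."""
    return LEVEL_TO_STATE[level]


def should_switch_to_controlled(level: int) -> bool:
    """The controller switches the master device only when it is abnormal."""
    return working_state_from_signal(level) is WorkingState.ABNORMAL
```

A lookup table keeps the "preset correspondence" explicit, so other signal encodings could be added without touching the decision function.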
The following takes the case where the state of the first data processing device 101 includes the data path state as an example to explain how to decide whether the mode of the first data processing device 101 needs to be switched to the controlled mode.
Specifically, the controller 70 may monitor the state of the data path between the first data processing device 101 and a communication switch, and determine, based on that data path state, whether to switch the first data processing device 101 to the controlled mode.
The communication switch may include, for example, a Peripheral Component Interconnect Express switch (PCIE Switch). FIG. 4 is a schematic diagram of a specific structure of a data processing system provided by an embodiment of the present disclosure, in which multiple data processing devices are each connected to the controller 70 through the communication switch 80. The data processing devices may be connected to the communication switch 80 through PCIE slots (PCIE Slot); FIG. 4 shows the PCIE slots corresponding to the data processing devices (denoted "DP"; the devices shown are DP_1, DP_2, ..., DP_n), namely PCIE Slot#1, PCIE Slot#2, PCIE Slot#3, ..., PCIE Slot#n. The data processing devices can thus use the PCIE protocol to communicate with the communication switch 80.
Specifically, the controller 70 may determine that the data transmission path of the current first data processing device 101 is in a normal state from a data-path-normal signal, for example a PORT_GOOD# signal, that the communication switch 80 actively sends.
For example, as shown in FIG. 4, the communication switch 80 may send the controller a PORT_GOOD# signal for each of the data processing devices DP_1 to DP_n, such as PORT_GOOD#1 to PORT_GOOD#n shown in FIG. 4.
Here, if the controller 70 fails to receive the data-path-normal signal that the communication switch 80 sends for a given data processing device, it can determine that the data path between that data processing device and the communication switch 80 is abnormal. For example, if the controller 70 fails to receive PORT_GOOD#1, it can determine that the data transmission path between the data processing device DP_1 and the communication switch 80 is abnormal, that is, the data path corresponding to PCIE Slot#1 is abnormal.
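The rule that a missing PORT_GOOD# signal marks a data path as abnormal can be sketched as follows. The function name and the representation of received signals as a set of slot numbers are assumptions made for illustration; the disclosure does not specify how the controller records them.

```python
def data_path_states(slot_count: int, port_good_received: set[int]) -> dict[int, str]:
    """Return the data path state for PCIE Slot#1..#slot_count.

    `port_good_received` holds the slot numbers whose PORT_GOOD# signal
    reached the controller. A slot with no signal is treated as abnormal.
    """
    return {
        slot: "normal" if slot in port_good_received else "abnormal"
        for slot in range(1, slot_count + 1)
    }


# Example: PORT_GOOD#1 was not received, so the path of DP_1 is abnormal.
states = data_path_states(slot_count=3, port_good_received={2, 3})
```

Treating absence of the signal as the fault indicator means a failed link does not have to be able to report its own failure.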
In addition, when the controller 70 determines the data path state between a data processing device and the communication switch 80 based on the PORT_GOOD# signal sent by the communication switch 80, data flows in the direction from the data processing device toward the controller. However, once the first data processing device 101 in master mode has been determined among the multiple data processing devices, it may also send subtasks to the second data processing devices 102 in controlled mode (for example, when DP_1 is the first data processing device 101, the remaining DP_2 to DP_n can all serve as second data processing devices 102). The data transmission paths shown in FIG. 4 may therefore also carry data sent to the second data processing devices 102 (that is, DP_2 to DP_n), in the opposite direction to that used when determining the data path state. For this reason, FIG. 4 does not show a specific direction of data transmission; in any particular data processing scenario, each path can represent the transmission direction that corresponds to the data processing task.
Here, the data transmission path of the first data processing device 101 may include, for example, a data transmission path corresponding to the PCIE communication protocol. Other data transmission paths capable of carrying subtasks may also be chosen, in which case the communication switch 80 is replaced accordingly.
After determining the state of the data path between the first data processing device 101 and the communication switch 80, the controller 70 can further decide, based on that data path state, whether to switch the first data processing device 101 to the controlled mode. Specifically, if the controller 70 determines that the data path between the first data processing device 101 and the communication switch 80 is abnormal, it determines that the state of the first data processing device 101 is an abnormal state and decides to switch the first data processing device 101 to the controlled mode; if the controller 70 determines that the data path is normal, it keeps the first data processing device 101 in master mode.
The following takes the case where the state of the first data processing device 101 includes both the working state and the data path state as an example to explain how to decide whether the mode of the first data processing device 101 needs to be switched to the controlled mode.
Specifically, after determining the state of the data path between the first data processing device 101 and the communication switch 80, the controller 70 can decide whether to switch the first data processing device 101 to the controlled mode based on both that data path state and the working state of the first data processing device 101. If the working state is normal but the data path state is abnormal, if the working state is abnormal but the data path state is normal, or if both are abnormal, the controller 70 determines that the state of the first data processing device 101 is an abnormal state and decides to switch the first data processing device 101 to the controlled mode. Only if both the data path state and the working state are normal does the controller 70 keep the first data processing device 101 in master mode.
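The combined decision reduces to a simple rule: the master device stays in master mode only when both checks pass. A minimal sketch, with hypothetical names chosen for the example:

```python
def switch_first_device_to_controlled(working_ok: bool, path_ok: bool) -> bool:
    """Return True when the master device should be switched to controlled mode.

    The device is considered abnormal if either its working state or its
    data path state is abnormal.
    """
    return not (working_ok and path_ok)


# Enumerate all four combinations of (working state, data path state).
decision_table = {
    (working_ok, path_ok): switch_first_device_to_controlled(working_ok, path_ok)
    for working_ok in (True, False)
    for path_ok in (True, False)
}
```

Because the rule is symmetric in its two inputs, the order in which the controller obtains the two states does not affect the outcome.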
Alternatively, the controller 70 may first determine the working state of the first data processing device 101 and then monitor the state of the data path between the first data processing device 101 and the communication switch 80, and decide, based on the working state and the data path state, whether to switch the first data processing device 101 to the controlled mode. The switching logic is similar to that described above and is not repeated here.
It should be noted that the present disclosure does not limit the order in which the controller 70 determines the working state and the data path state of the first data processing device 101. The controller 70 may determine the data path state first and the working state second, may proceed in the reverse order, or may determine both at the same time.
When the controller 70 decides to switch the first data processing device 101 to the controlled mode, the first data processing device 101 currently in use is in an abnormal state and cannot work normally. To ensure that the data processing system 200 can continue to decompose data processing tasks and dispatch the resulting subtasks, a target second data processing device 103 can be determined among the second data processing devices 102 and used as the new first data processing device 101 to carry out the corresponding data processing tasks. The controller 70 can therefore also determine, based on the states of the second data processing devices 102, which of them is the target second data processing device 103 to be switched to master mode.
In a specific implementation, in response to the state of the first data processing device 101 being an abnormal state, the controller 70 may determine a candidate data processing device from the second data processing devices 102 and check the state of the candidate; in response to the state of the candidate being a normal state, the controller 70 determines the candidate to be the target second data processing device 103. The normal state of a candidate data processing device means that both its working state and its data path state are normal.
Specifically, FIG. 5 is a schematic diagram of determining a target second data processing device, provided by an embodiment of the present disclosure. FIG. 5 includes the data processing devices DP_1, DP_2, ..., DP_n available in the data processing system 200, where DP_1 is the current first data processing device 101. After the state of DP_1 is determined to be abnormal, the candidate data processing device may be chosen from the second data processing devices 102 as, for example, the second data processing device 102 that is next in order after DP_1.
The second data processing device 102 that is next in order after the first data processing device 101 may be determined, for example, from the sequence numbers of the data processing devices. For example, if n data processing devices DP_1, DP_2, ..., DP_n were determined in sequence when configuring the data processing system 200, the second data processing device next in order after the first data processing device DP_1, namely DP_2, is taken as the candidate data processing device.
Alternatively, the second data processing device 102 that is next in order after the first data processing device 101 may be determined from the current states of the data processing devices other than the first data processing device 101. For example, when determining the candidate, the operating temperatures of the data processing devices may first be monitored, and the devices ranked from lowest to highest operating temperature. For example, if the data processing devices are DP_2, DP_3 and DP_4, the resulting order might be DP_3, DP_4, DP_2. The second data processing device 102 with the lowest operating temperature, namely DP_3, is then taken as the candidate data processing device.
Different ways of ordering the data processing devices may yield different candidate data processing devices. The candidate can be determined according to the actual situation, which is not limited here.
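The two ordering strategies described above can be sketched as follows. The function names and the temperature values are hypothetical; the temperatures are chosen only so that the result matches the DP_3, DP_4, DP_2 example. Sorting the remaining devices in ascending number order is one possible reading of "next in order" and is an assumption of this sketch.

```python
def order_by_sequence(device_numbers: list[int], master: int) -> list[int]:
    """Order candidates by sequence number, excluding the current master."""
    return sorted(n for n in device_numbers if n != master)


def order_by_temperature(temperatures: dict[str, float]) -> list[str]:
    """Order candidates from lowest to highest operating temperature."""
    return sorted(temperatures, key=temperatures.get)


# Sequence-number strategy: DP_1 is the master, so DP_2 is the first candidate.
by_sequence = order_by_sequence([1, 2, 3, 4], master=1)

# Temperature strategy with made-up readings in degrees Celsius.
by_temperature = order_by_temperature({"DP_2": 71.0, "DP_3": 48.5, "DP_4": 60.0})
```

In both cases the first element of the returned list plays the role of the candidate data processing device, and the rest form the fallback order.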
After the candidate data processing device is determined, the controller 70 may also monitor its state. This avoids the situation where the candidate is itself faulty, so that the data processing system 200 still cannot work normally after the candidate switches to master mode, and it also keeps the switchover between the first data processing device 101 and the second data processing device 102 efficient.
In one possible implementation, in response to the state of the candidate data processing device being an abnormal state, the controller 70 may determine a new candidate data processing device from the second data processing devices 102 and return to the step of checking whether the state of the candidate is normal.
The new candidate data processing device is determined in a manner similar to that described above for determining a candidate from the second data processing devices 102, which is not repeated here.
In addition, when determining from the second data processing devices 102 the target second data processing device 103 to be switched to master mode, the controller 70 is further configured to read type information from the candidate data processing device and, in response to the type information read being preset type information, to determine the candidate to be the target second data processing device 103.
Here, as in the description of the data processing device corresponding to FIG. 2 above, if the type information of the candidate data processing device indicates that it cannot act as a master-mode data processing device, then even if its state is normal, its own mode cannot be switched to master mode and it cannot be used as the target second data processing device 103. The controller 70 can therefore also read the type information of the candidate data processing device to determine whether it can be selected as the target second data processing device 103 and work as the master-mode data processing device. For the specific way in which the controller 70 reads the type information of the candidate, refer to the description in the embodiment corresponding to FIG. 2 above; it is not repeated here.
In this way, a candidate data processing device can be identified with which the data processing system 200 can operate normally once the candidate's own mode is switched to master mode, and the procedure is efficient. Once the state of a candidate data processing device is determined to be normal, screening of the second data processing devices 102 stops, and that candidate is taken as the target second data processing device 103.
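The screening procedure described above (take a candidate, check its state and type information, fall back to the next candidate if it fails, stop at the first one that passes) can be sketched as a short loop. The `Candidate` record, its field names and the `"master_capable"` type value are all hypothetical stand-ins for information the controller would obtain from the hardware.

```python
from dataclasses import dataclass
from typing import Iterable, Optional

# Hypothetical value standing in for the "preset type information".
PRESET_TYPE = "master_capable"


@dataclass
class Candidate:
    name: str
    working_ok: bool   # working state is normal
    path_ok: bool      # data path state is normal
    type_info: str     # type information read by the controller


def select_target(candidates: Iterable[Candidate],
                  preset_type: str = PRESET_TYPE) -> Optional[Candidate]:
    """Return the first candidate that is healthy and of the preset type."""
    for candidate in candidates:
        # A candidate is in a normal state only if both checks pass.
        if not (candidate.working_ok and candidate.path_ok):
            continue
        # A healthy device of the wrong type still cannot become the master.
        if candidate.type_info != preset_type:
            continue
        return candidate  # stop screening at the first suitable device
    return None  # no second device can take over


ordered = [
    Candidate("DP_2", working_ok=True, path_ok=False, type_info=PRESET_TYPE),
    Candidate("DP_3", working_ok=True, path_ok=True, type_info="controlled_only"),
    Candidate("DP_4", working_ok=True, path_ok=True, type_info=PRESET_TYPE),
]
target = select_target(ordered)
```

Returning `None` when no candidate qualifies is a choice made for this sketch; the text does not say what the controller does in that case.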
After determining the target second data processing device 103, the controller 70 can establish a new first data processing device 101 for the data processing system 200 by switching the respective modes of the first data processing device 101 and the target second data processing device 103.
In a specific implementation, the controller 70 may send a reset signal to the first data processing device 101 and the target second data processing device 103; after the reset signal has been sent successfully, it sends the first control instruction to the first data processing device 101 and the second control instruction to the target second data processing device 103.
FIG. 6 is a schematic diagram of a controller sending a reset signal to data processing devices, provided by an embodiment of the present disclosure. The reset signal may be, for example, the device reset signal (PCIE Reset, PERST). In addition, the controller 70 may also send the device reset signal to the communication switch 80, which performs a reset operation upon receiving it.
In another embodiment, it may not be possible for the controller 70 to send the reset signal to the communication switch 80 and have the communication switch 80 forward it to the data processing devices. For that case, FIG. 6 does not show the connection between the communication switch 80 and the data processing devices; instead, it directly shows the data transmission paths over which the controller 70 sends the reset signal to the communication switch 80 and to the data processing devices.
In a specific implementation, the controller 70 is also connected to the data processing devices through a bus. When sending the reset signal to the first data processing device 101 and the target second data processing device 103, the controller 70 is configured to send the reset signal to the first data processing device 101 and the target second data processing device 103 through the bus.
For example, the bus may include SMBUS.
In one possible case, corresponding to the circuit structure shown in FIG. 4, if the communication switch 80 can use SMBUS to send reset signals to the data processing devices DP_1 to DP_n, the controller 70 can send the reset signal to the communication switch 80, which then sends reset signals to the data processing devices DP_1 to DP_n connected to it. In another possible case, if the communication switch cannot use SMBUS to send reset signals to the data processing devices DP_1 to DP_n, the controller 70 can use the structure shown in FIG. 2 above and send the reset signal over SMBUS to the I2C in the signal converter of the data processing device, thereby resetting the corresponding data processing device.
After the controller 70 sends the reset signal to the first data processing device 101 and the target second data processing device 103, both devices can release their current modes. Their modes can then be switched directly by gating the transmission path between each device's data processing chip and the first memory or the second memory. This gives a relatively simple way for the first data processing device 101 and the target second data processing device 103 to complete the mode switch, and effectively reduces the likelihood of switching faults.
Specifically, after receiving the reset signal, the communication switch 80 may also stop sending PORT_GOOD# signals to the controller 70, that is, stop monitoring the data paths of the data processing devices, to reduce the power consumption of the communication switch 80 and the controller 70 during the switchover. After the first data processing device 101 and the target second data processing device 103 have received the reset signal and completed the reset, the controller 70 can send the first control instruction to the first data processing device 101 and the second control instruction to the target second data processing device 103. The first control instruction instructs the first data processing device 101 to switch its own mode to the controlled mode; the second control instruction instructs the target second data processing device 103 to switch its own mode to the master mode.
Correspondingly, when switching its own mode to the controlled mode in response to receiving the first control instruction, the first data processing device 101 is configured to perform a reset in response to receiving the reset signal and, after the reset is complete, to switch its own mode to the controlled mode in response to receiving the first control instruction. Likewise, when switching its own mode to the master mode in response to receiving the second control instruction, the target second data processing device 103 is configured to perform a reset in response to receiving the reset signal and, after the reset is complete, to switch its own mode to the master mode in response to receiving the second control instruction.
Taking the first data processing device 101 as an example, and with reference to the description of the data processing device provided in the embodiments of the present disclosure: after the first data processing device 101 receives the first control instruction, the instruction may be delivered to the signal converter in the first data processing device 101. The signal converter then sends a port gating signal to the multiplexer, so that the data processing chip in the first data processing device 101 can obtain the second configuration information from the second memory and thereby switch the mode of the first data processing device 101 to the controlled mode. The target second data processing device 103 may switch its own mode to the master mode in a similar manner, which is not repeated here.
After the first data processing device 101 and the target second data processing device 103 have finished switching their own modes, each may, for example, send a successful switching signal to the controller 70. In one possible case, if the controller 70 does not receive the successful switching signals from both devices, it may, for example, directly report an error, wait for staff to inspect the system, and keep the first data processing device 101 and the target second data processing device 103 in the reset state; alternatively, a solution may be determined according to the actual situation, which is not limited here.
In another possible case, if the controller 70 receives the successful switching signals from the first data processing device 101 and the target second data processing device 103, it may send a reset release signal to the first data processing device 101 and the target second data processing device 103, and may also send a reset release signal to the communication switch 80. Correspondingly, the first data processing device 101 may release its reset in response to receiving the reset release signal; likewise, the target second data processing device 103 may release its reset in response to receiving the reset release signal. The controller 70 sends the reset release signal to the first data processing device 101 and the target second data processing device 103 in a manner similar to that used for the reset signal, which is not repeated here. After receiving the reset release signal, the communication switch 80 may resume monitoring the data paths corresponding to the multiple data processing devices in the new data processing system 200. In this way, the target second data processing device 103, now in the master mode, can continue to receive data processing tasks and decompose them into subtasks; likewise, the first data processing device 101, now in the controlled mode, can process the subtasks it receives, so that the data processing system 200 can continue to complete the corresponding data processing tasks.
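The reset, switch, confirm, and release sequence described above can be sketched in Python as follows. This is only an illustrative model, not the disclosed hardware: the class `Device`, the function `switch_roles`, and the rule that a mode change is accepted only while the device is held in reset are assumptions introduced for the example.

```python
from enum import Enum


class Mode(Enum):
    MASTER = "master"          # master mode
    CONTROLLED = "controlled"  # controlled mode


class Device:
    """Hypothetical stand-in for a data processing device."""

    def __init__(self, name, mode):
        self.name = name
        self.mode = mode
        self.in_reset = False

    def reset(self):
        self.in_reset = True

    def apply_control_instruction(self, target_mode):
        # Assumption for this sketch: a mode change is only accepted
        # once the device has completed its reset.
        if not self.in_reset:
            return False
        self.mode = target_mode
        return True  # plays the role of the "successful switching signal"

    def release_reset(self):
        self.in_reset = False


def switch_roles(first, target):
    """Swap the roles of the two devices; True if both confirmed the switch."""
    for dev in (first, target):
        dev.reset()
    ok_first = first.apply_control_instruction(Mode.CONTROLLED)
    ok_target = target.apply_control_instruction(Mode.MASTER)
    if not (ok_first and ok_target):
        # Both devices stay in reset; the caller reports the error.
        return False
    for dev in (first, target):
        dev.release_reset()
    return True
```

In this sketch a failed confirmation leaves both devices in reset, mirroring the "report an error and keep the reset state" case; any other recovery policy would replace the early return.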
Another embodiment of the present disclosure further provides a specific example of the data processing system 200 processing a data processing task. FIG. 7A and FIG. 7B show a flowchart of a data processing system executing a data processing task according to an embodiment of the present disclosure, in which:
S701: The controller determines the first data processing device, whose own mode is currently the master mode;
S702: The data processing system executes a data processing task;
S703: The controller monitors whether the state of the first data processing device is normal; S703 includes S7031 to S7034 below;
S7031: The controller monitors the state of the data path between the first data processing device and the communication switch, and determines whether the state of the first data processing device is normal; if yes, execute S7032; if no, execute S7033;
S7032: The controller monitors whether the data processing chip in the first data processing device can work normally; S7032 includes S70321 and S70322 below;
S70321: The monitoring chip monitors the working state of the data processing chip in the first data processing device and sends a monitoring signal corresponding to the working state to the signal converter;
S70322: The controller reads the monitoring signal stored in the signal converter and determines whether the state of the first data processing device is normal; if yes, execute S7034; if no, execute S7033;
Here, S7031 and S7032 may be executed in the reverse order, for example S7032 first and then S7031; alternatively, S7031 and S7032 may be executed at the same time.
S7033: Determine that the state of the first data processing device is abnormal;
S7034: Determine that the state of the first data processing device is normal; return to S702;
S704: The controller determines a candidate data processing device from among the second data processing devices;
S705: The controller checks whether the state of the candidate data processing device is normal; if yes, execute S706; if no, execute S707;
S706: The controller determines the candidate data processing device to be the target second data processing device;
S707: The controller determines a new candidate data processing device from among the second data processing devices; return to S705;
S708: The controller sends a reset signal to the first data processing device and the target second data processing device, so that the first data processing device and the target second data processing device are reset;
S709: The controller sends the first control instruction to the first data processing device and the second control instruction to the target second data processing device;
S710: The first data processing device and the target second data processing device switch their own modes; S710 includes S7101 and S7102;
S7101: The first data processing device switches its own mode to the controlled mode in response to receiving the first control instruction;
S7102: The target second data processing device switches its own mode to the master mode in response to receiving the second control instruction;
S711: The controller sends a reset release signal to the first data processing device and the target second data processing device; the first data processing device and the target second data processing device release their reset.
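The monitoring branch S7031 to S7034 amounts to requiring that both checks pass before the master device is treated as normal. A minimal Python sketch follows; the function name `next_step` and the boolean inputs are hypothetical, and the jump from the abnormal result to S704 is inferred from the order of the steps rather than stated explicitly.

```python
def next_step(data_path_ok: bool, chip_ok: bool) -> str:
    """Return the label of the step that follows the monitoring in S703.

    data_path_ok: result of the data path check (S7031).
    chip_ok: result of the chip working-state check (S7032).
    The two checks are independent, so their order does not matter.
    """
    if data_path_ok and chip_ok:
        return "S702"  # S7034: state is normal, keep executing the task
    return "S704"      # S7033: state is abnormal, start choosing a replacement
```

Because the result depends only on the conjunction of the two checks, the sketch is consistent with running S7031 and S7032 in either order or at the same time.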
Based on the same inventive concept, an embodiment of the present disclosure further provides a board. The board provided by the embodiments of the present disclosure may include any data processing device, or any data processing system, disclosed in the embodiments of the present disclosure. For a board including the data processing device, see FIG. 1 and FIG. 2; for a board including the data processing system, see FIG. 8.
FIG. 8 is a schematic diagram of a board provided by an embodiment of the present disclosure. The board 300 includes a plurality of the data processing devices 100 provided by the embodiments of the present disclosure (FIG. 8 shows n data processing devices 100, DP_1 to DP_n) and a controller 70, wherein the plurality of data processing devices 100 are each connected to the controller 70 through a communication switch 80;
The controller 70 is configured to monitor the state of each data processing device 100 and, according to those states, to switch one of the data processing devices 100 to the master mode and the other data processing devices 100 to the controlled mode.
No separate CPU is present on this board. For the board shown in FIG. 8, the data processing device 100 in the master mode can take on tasks such as receiving, splitting, and distributing data processing tasks; the master mode is also called the root processing mode (Root Complex, RC). A data processing device 100 in the controlled mode can take on tasks such as processing subtasks; the controlled mode is also called the node processing mode (End Point, EP).
In the related art, different CPUs, for example CPU#1 and CPU#2, are connected to the data processing device 100 in RC mode and the data processing device 100 in EP mode respectively. In such a "CPU + data processing device" hardware architecture, data processing devices in different modes correspond to different CPUs in order to complete the relevant data processing tasks. If either the CPU or the data processing device is damaged, the combination as a whole can no longer be used normally, and the data processing task cannot be executed normally. In addition, when the hardware in RC mode is replaced, board-level verification must be performed on the CPU connected to the new data processing device together with that device as a whole, for example verifying the function of "CPU#3 + newly selected data processing device" to determine whether it can serve as the new hardware architecture in RC mode, which is also cumbersome.
By contrast, in the board provided by the embodiments of the present disclosure, the mode of each data processing device 100 can be switched, that is, switched to RC mode or EP mode. Therefore, when the data processing device 100 in RC mode fails, another data processing device 100 in EP mode can quickly be switched to RC mode so that the data processing task continues. The board is therefore more flexible and more stable when executing data processing tasks.
Based on the same inventive concept, an embodiment of the present disclosure further provides a data processing method corresponding to the data processing device. Because the method solves the problem on a principle similar to that of the data processing device described above, the implementation of the method may refer to the implementation of the device, and repeated details are omitted.
FIG. 9 is a flowchart of a data processing method provided by an embodiment of the present disclosure. The data processing method is applied to the data processing device provided by the embodiments of the present disclosure and includes:
S901: In response to receiving a port gating signal, the multiplexer gates either the first transmission path or the second transmission path through which the data processing chip obtains configuration information;
S902: In response to the first transmission path being gated, the data processing chip obtains first configuration information and, based on the first configuration information, determines its own mode to be the master mode; in response to the second transmission path being gated, the data processing chip obtains second configuration information and, based on the second configuration information, determines its own mode to be the controlled mode.
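Steps S901 and S902 can be modelled in a few lines of Python. This is a sketch under stated assumptions: each transmission path is assumed to lead to a memory holding one configuration, and the names `Multiplexer`, `DataProcessingChip`, `FIRST_PATH`, `SECOND_PATH`, and the `"mode"` key are hypothetical; the actual layout of the configuration information is not specified in the disclosure.

```python
from enum import Enum


class Mode(Enum):
    MASTER = "master"
    CONTROLLED = "controlled"


FIRST_PATH, SECOND_PATH = 1, 2


class Multiplexer:
    """Gates one of two transmission paths to a configuration source."""

    def __init__(self, first_memory, second_memory):
        self._sources = {FIRST_PATH: first_memory, SECOND_PATH: second_memory}
        self.selected = FIRST_PATH

    def on_port_gating_signal(self, path):
        if path not in self._sources:
            raise ValueError(f"unknown transmission path: {path}")
        self.selected = path

    def read_config(self):
        return self._sources[self.selected]


class DataProcessingChip:
    """Determines its own mode from whichever configuration is reachable."""

    def __init__(self, mux):
        self.mux = mux
        self.mode = None

    def load_config(self):
        config = self.mux.read_config()
        self.mode = Mode(config["mode"])
        return self.mode
```

The chip never chooses a mode itself in this model; it only reads whatever configuration the gated path exposes, which is what makes the mode externally switchable.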
In an optional implementation, the data processing method further includes: when its own mode is the master mode, the data processing chip, in response to receiving a data processing task, decomposes the data processing task into multiple subtasks and delivers the subtasks to other data processing devices in the controlled mode; or, when its own mode is the controlled mode, the data processing chip, in response to receiving a subtask delivered by another data processing device in the master mode, executes that subtask.
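The disclosure does not say how a task is decomposed. As one possible illustration, the Python sketch below splits a list of work items into contiguous, nearly equal subtasks, one per controlled device; the function `split_task` and the device names are hypothetical.

```python
def split_task(items, workers):
    """Split work items into one contiguous subtask per controlled device.

    Sizes differ by at most one item; earlier workers receive the extras.
    """
    if not workers:
        raise ValueError("at least one controlled device is required")
    chunk, extra = divmod(len(items), len(workers))
    subtasks = {}
    start = 0
    for i, worker in enumerate(workers):
        end = start + chunk + (1 if i < extra else 0)
        subtasks[worker] = items[start:end]
        start = end
    return subtasks
```

For example, seven items spread over three controlled devices yield subtasks of sizes 3, 2, and 2.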
In an optional implementation, the data processing device further includes a first memory and a second memory, each connected to the multiplexer; the data processing method further includes: in response to receiving the port gating signal, the multiplexer gates the first transmission path between the data processing chip and the first memory, or gates the second transmission path between the data processing chip and the second memory.
In an optional implementation, the data processing device further includes a signal converter, which is connected to the multiplexer and to a controller; the data processing method further includes: the signal converter receives, based on a preset first communication protocol, a control instruction sent by the controller, converts the control instruction into a port gating signal, and sends the port gating signal to the multiplexer.
In an optional implementation, the data processing device further includes a monitoring chip, which is connected to the data processing chip and to the signal converter; the data processing method further includes: the monitoring chip monitors the working state of the data processing chip and sends a monitoring signal corresponding to the working state to the signal converter; the signal converter receives the monitoring signal sent by the monitoring chip and sends the monitoring signal to the controller based on a preset second communication protocol.
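The two roles of the signal converter (translating control instructions into port gating signals, and relaying monitoring signals toward the controller) can be sketched in Python as below. The class and method names and the string encodings of instructions and signals are assumptions for illustration; the first and second communication protocols themselves are not modelled.

```python
FIRST_PATH, SECOND_PATH = 1, 2

# The first control instruction requests the controlled mode, whose
# configuration is reached through the second path; the second control
# instruction requests the master mode via the first path.
_INSTRUCTION_TO_PATH = {
    "first_control": SECOND_PATH,
    "second_control": FIRST_PATH,
}


class SignalConverter:
    """Hypothetical model of the signal converter."""

    def __init__(self, send_port_gating_signal):
        # Callback standing in for the wire to the multiplexer.
        self._send = send_port_gating_signal
        self._latest_monitoring_signal = None

    def on_control_instruction(self, instruction):
        self._send(_INSTRUCTION_TO_PATH[instruction])

    def on_monitoring_signal(self, signal):
        # Stored so that the controller can read it later.
        self._latest_monitoring_signal = signal

    def read_monitoring_signal(self):
        return self._latest_monitoring_signal
```

Storing the latest monitoring signal rather than pushing it reflects the flow in which the controller reads the signal held in the converter; a push model would work equally well in this sketch.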
In an optional implementation, the data processing device further includes a register, which is used to store type information of the data processing chip.
In addition, based on the same inventive concept, an embodiment of the present disclosure further provides a data processing method corresponding to the data processing system. Because the method solves the problem on a principle similar to that of the data processing system described above, the implementation of the method may refer to the implementation of the system, and repeated details are omitted.
FIG. 10 is a flowchart of another data processing method provided by an embodiment of the present disclosure. The data processing method is applied to the data processing system provided by the embodiments of the present disclosure and includes:
S1001: The controller monitors the state of the first data processing device; in response to the state indicating that the first data processing device is to be switched to the controlled mode, the controller sends a first control instruction to the first data processing device and a second control instruction to the target second data processing device;
S1002: The first data processing device switches its own mode to the controlled mode in response to receiving the first control instruction;
S1003: The target second data processing device switches its own mode to the master mode in response to receiving the second control instruction.
In an optional implementation, the state of the data processing device includes at least one of the following: a working state; a data path state.
In an optional implementation, the controller monitoring the state of the first data processing device includes: receiving a monitoring signal sent by the first data processing device; determining the working state of the first data processing device based on the monitoring signal; and determining, based on the working state of the first data processing device, whether to switch the first data processing device to the controlled mode.
In an optional implementation, the data processing system further includes a communication switch, and the plurality of data processing devices are each connected to the controller through the communication switch; the controller monitoring the state of the first data processing device includes: monitoring the state of the data path between the first data processing device and the communication switch; and determining, based on the state of that data path, whether to switch the first data processing device to the controlled mode.
In an optional implementation, the controller monitoring the state of the first data processing device includes: in response to the state of the first data processing device being abnormal, determining to switch the first data processing device to the controlled mode.
In an optional implementation, before the controller sends the second control instruction to the target second data processing device, the method includes: determining, from among the second data processing devices and based on their states, the target second data processing device to be switched to the master mode.
In an optional implementation, the controller determining, from among the second data processing devices and based on their states, the target second data processing device to be switched to the master mode includes: in response to the state of the first data processing device being abnormal, determining a candidate data processing device from among the second data processing devices and checking the state of the candidate data processing device; in response to the state of the candidate data processing device being normal, determining the candidate data processing device to be the target second data processing device; and in response to the state of the candidate data processing device being abnormal, determining a new candidate data processing device from among the second data processing devices and returning to the step of checking whether the state of the candidate data processing device is normal.
In an optional implementation, the controller determining, from among the second data processing devices, the target second data processing device to be switched to the master mode includes: reading type information from the candidate data processing device; and in response to the read type information being preset type information, determining the candidate data processing device to be the target second data processing device.
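The candidate loop and the type check from the two preceding implementations can be combined into one selection routine. The Python sketch below is an assumption-laden illustration: the callbacks `is_normal` and `read_type`, the dictionary representation of a device, and the decision to require both conditions at once are choices made for this example, since the disclosure presents the state check and the type check as separate optional implementations.

```python
def select_target(candidates, is_normal, read_type, preset_type):
    """Return the first candidate that is healthy and of the preset type.

    candidates: second data processing devices, in the order they are tried.
    is_normal: callback reporting whether a candidate's state is normal.
    read_type: callback reading the type information from a candidate.
    Returns None when no candidate qualifies.
    """
    for device in candidates:
        if is_normal(device) and read_type(device) == preset_type:
            return device
    return None
```

Returning `None` when every candidate is rejected leaves the error handling to the caller; the disclosure does not describe what happens in that case.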
In an optional implementation, the controller sending the first control instruction to the first data processing device and the second control instruction to the target second data processing device includes: sending a reset signal to the first data processing device and the target second data processing device; and, after the reset signal has been sent successfully, sending the first control instruction to the first data processing device and the second control instruction to the target second data processing device. The first data processing device switching its own mode to the controlled mode in response to receiving the first control instruction includes: performing a reset in response to receiving the reset signal and, after completing the reset, switching its own mode to the controlled mode in response to receiving the first control instruction. The target second data processing device switching its own mode to the master mode in response to receiving the second control instruction includes: performing a reset in response to receiving the reset signal and, after completing the reset, switching its own mode to the master mode in response to receiving the second control instruction.
In an optional implementation, the data processing method further includes: in response to receiving the successful switching signals sent by the first data processing device and the target second data processing device, the controller sends a reset release signal to the first data processing device and the target second data processing device; the first data processing device releases its reset in response to receiving the reset release signal; and the target second data processing device releases its reset in response to receiving the reset release signal.
In an optional implementation, the controller is further connected to the data processing devices through a bus; the controller sending the reset signal to the first data processing device and the target second data processing device includes: sending the reset signal to the first data processing device and the target second data processing device through the bus.
Those skilled in the art will understand that, in the methods of the specific implementations above, the order in which the steps are written does not imply a strict order of execution and does not limit the implementation in any way; the specific order of execution of the steps should be determined by their functions and possible internal logic.
An embodiment of the present disclosure further provides an electronic device, including an instruction memory and the data processing device provided by the embodiments of the present disclosure, or including the data processing system provided by the embodiments of the present disclosure, or including the board provided by the embodiments of the present disclosure.
The data processing device, data processing system, or board provided by the embodiments of the present disclosure may include a chip, an AI chip, and the like. The electronic device provided by the embodiments of the present disclosure may include a smart terminal such as a mobile phone, or may be another device capable of data processing, a server, and so on, which is not limited here.
The board includes, for example, a printed circuit board.
An embodiment of the present disclosure further provides a computer-readable storage medium storing a computer program which, when executed by a multiplexer and a data processing chip, performs the method provided by any data processing method embodiment of the present disclosure; or which, when executed by a controller, a first data processing device, and a target second data processing device, performs the method provided by any data processing method embodiment of the present disclosure.
An embodiment of the present disclosure further provides a computer program product carrying program code. The instructions included in the program code can be used to execute the steps of the data processing method described in the method embodiments above; for details, refer to those method embodiments, which are not repeated here.
The computer program product may be implemented in hardware, software, or a combination of the two. In one optional embodiment, the computer program product is embodied as a computer storage medium; in another optional embodiment, the computer program product is embodied as a software product, such as a software development kit (Software Development Kit, SDK).
Those skilled in the art will clearly understand that, for convenience and brevity of description, the specific working processes of the systems and devices described above may refer to the corresponding processes in the foregoing method embodiments, which are not repeated here. It should be understood that the systems, devices, and methods disclosed in the embodiments provided in the present disclosure may be implemented in other ways. The device embodiments described above are merely illustrative. For example, the division into units is only a division by logical function, and other divisions are possible in an actual implementation; as another example, multiple units or components may be combined or integrated into another system, or some features may be omitted or not executed. Furthermore, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection between devices or units through some communication interfaces, and may be electrical, mechanical, or of another form.
The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of the present disclosure may be integrated into one processing unit, may each exist physically on their own, or two or more of them may be integrated into one unit.
If the functions are implemented in the form of software functional units and sold or used as an independent product, they may be stored in a processor-executable, non-volatile computer-readable storage medium. Based on this understanding, the technical solution of the present disclosure in essence, or the part that contributes to the prior art, or a part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that cause an electronic device (which may be a personal computer, a server, a network device, or the like) to execute all or some of the steps of the methods described in the embodiments of the present disclosure. The aforementioned storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (Read-Only Memory, ROM), a random access memory (Random Access Memory, RAM), a magnetic disk, or an optical disc.
Finally, it should be noted that the above embodiments are only specific implementations of the present disclosure, intended to illustrate its technical solutions rather than to limit them, and the protection scope of the present disclosure is not limited thereto. Although the present disclosure has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that anyone skilled in the art may, within the technical scope disclosed herein, still modify the technical solutions described in the foregoing embodiments, readily conceive of changes, or make equivalent replacements of some of the technical features. Such modifications, changes, or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present disclosure, and shall all fall within the protection scope of the present disclosure. Therefore, the protection scope of the present disclosure shall be subject to the protection scope of the claims.

Claims (20)

  1. A data processing device, comprising: a data processing chip and a multiplexer; wherein,
    the multiplexer is configured to, in response to receiving a port gating signal, gate a first transmission path or a second transmission path through which the data processing chip obtains configuration information;
    the data processing chip is configured to: in response to the first transmission path being gated, obtain first configuration information and, based on the first configuration information, determine its own mode as a master mode; and in response to the second transmission path being gated, obtain second configuration information and, based on the second configuration information, determine its own mode as a controlled mode.
  2. The data processing device according to claim 1, wherein the data processing chip is further configured to, when its own mode is the master mode and in response to receiving a data processing task, decompose the data processing task into a plurality of subtasks and issue the subtasks to other data processing devices in the controlled mode; or
    the data processing chip is further configured to, when its own mode is the controlled mode and in response to receiving a subtask issued by another data processing device in the master mode, execute the subtask issued by the other data processing device.
  3. The data processing device according to claim 1 or 2, further comprising:
    a first memory and a second memory;
    wherein the first memory and the second memory are each connected to the multiplexer;
    the multiplexer is configured to, in response to receiving the port gating signal, gate the first transmission path between the data processing chip and the first memory, or gate the second transmission path between the data processing chip and the second memory.
  4. The data processing device according to any one of claims 1-3, further comprising: a signal converter;
    the signal converter is connected to the multiplexer and to a controller;
    the signal converter is configured to receive, based on a preset first communication protocol, a control instruction sent by the controller, convert the control instruction into the port gating signal, and send the port gating signal to the multiplexer.
  5. The data processing device according to any one of claims 1-4, further comprising: a monitoring chip;
    the monitoring chip is connected to the data processing chip and to the signal converter;
    the monitoring chip is configured to monitor a working state of the data processing chip and send, to the signal converter, a monitoring signal corresponding to the working state;
    the signal converter is further configured to receive the monitoring signal sent by the monitoring chip and send the monitoring signal to the controller based on a preset second communication protocol.
  6. The data processing device according to any one of claims 1-5, further comprising: a register, wherein the register is configured to store type information of the data processing chip.
  7. A data processing system, comprising: the data processing device according to any one of claims 1-6, and a controller;
    wherein there are a plurality of the data processing devices, including a first data processing device in the master mode and a second data processing device in the controlled mode;
    the controller is configured to monitor a state of the first data processing device and, in response to the state indicating that the first data processing device is to be switched to the controlled mode, send a first control instruction to the first data processing device and send a second control instruction to a target second data processing device;
    the first data processing device is configured to switch its own mode to the controlled mode in response to receiving the first control instruction;
    the target second data processing device is configured to switch its own mode to the master mode in response to receiving the second control instruction.
  8. The data processing system according to claim 7, wherein the state of the data processing device includes at least one of the following:
    a working state; a data path state.
  9. The data processing system according to claim 7 or 8, wherein the controller, when monitoring the state of the first data processing device, is configured to:
    receive a monitoring signal sent by the first data processing device;
    determine a working state of the first data processing device based on the monitoring signal;
    determine, based on the working state of the first data processing device, whether to switch the first data processing device to the controlled mode.
  10. The data processing system according to any one of claims 7-9, further comprising: a communication switch;
    the plurality of data processing devices are each connected to the controller through the communication switch;
    the controller, when monitoring the state of the first data processing device, is configured to: monitor a data path state between the first data processing device and the communication switch;
    determine, based on the data path state between the first data processing device and the communication switch, whether to switch the first data processing device to the controlled mode.
  11. The data processing system according to any one of claims 7-10, wherein the controller, when monitoring the state of the first data processing device, is configured to:
    in response to the state of the first data processing device being an abnormal state, determine to switch the first data processing device to the controlled mode.
  12. The data processing system according to any one of claims 7-11, wherein the controller, before sending the second control instruction to the target second data processing device, is further configured to:
    determine, based on a state of the second data processing devices, the target second data processing device to be switched to the master mode from among the second data processing devices.
  13. The data processing system according to claim 12, wherein the controller, when determining, based on the state of the second data processing devices, the target second data processing device to be switched to the master mode from among the second data processing devices, is configured to:
    in response to the state of the first data processing device being an abnormal state, determine a candidate data processing device from among the second data processing devices and detect a state of the candidate data processing device; and in response to the state of the candidate data processing device being a normal state, determine the candidate data processing device as the target second data processing device;
    the controller is further configured to: in response to the state of the candidate data processing device being an abnormal state, determine a new candidate data processing device from among the second data processing devices and return to the step of detecting whether the state of the candidate data processing device is normal.
  14. The data processing system according to claim 12 or 13, wherein
    the controller, when determining the target second data processing device to be switched to the master mode from among the second data processing devices, is further configured to: read type information from a candidate data processing device; and in response to the read type information being preset type information, determine the candidate data processing device as the target second data processing device.
  15. The data processing system according to any one of claims 7-14, wherein the controller, when sending the first control instruction to the first data processing device and sending the second control instruction to the target second data processing device, is configured to:
    send a reset signal to the first data processing device and the target second data processing device; and, after the reset signal has been sent successfully, send the first control instruction to the first data processing device and send the second control instruction to the target second data processing device;
    the first data processing device, when switching its own mode to the controlled mode in response to receiving the first control instruction, is configured to: perform a reset in response to receiving the reset signal; and, after completing the reset, switch its own mode to the controlled mode in response to receiving the first control instruction;
    the target second data processing device, when switching its own mode to the master mode in response to receiving the second control instruction, is configured to: perform a reset in response to receiving the reset signal; and, after completing the reset, switch its own mode to the master mode in response to receiving the second control instruction.
  16. The data processing system according to claim 15, wherein the controller is further configured to: in response to receiving switching success signals sent by the first data processing device and the target second data processing device, send a reset release signal to the first data processing device and the target second data processing device;
    the first data processing device releases the reset in response to receiving the reset release signal;
    the target second data processing device releases the reset in response to receiving the reset release signal.
  17. The data processing system according to claim 15 or 16, wherein the controller is further connected to the data processing devices through a bus;
    the controller, when sending the reset signal to the first data processing device and the target second data processing device, is configured to: send the reset signal to the first data processing device and the target second data processing device through the bus.
  18. A board card, comprising: the data processing device according to any one of claims 1-6, or the data processing system according to any one of claims 7-17.
  19. A data processing method, applied to the data processing device according to any one of claims 1-6, the data processing method comprising:
    the multiplexer, in response to receiving a port gating signal, gating a first transmission path or a second transmission path through which the data processing chip obtains configuration information;
    the data processing chip, in response to the first transmission path being gated, obtaining first configuration information and, based on the first configuration information, determining its own mode as the master mode; and, in response to the second transmission path being gated, obtaining second configuration information and, based on the second configuration information, determining its own mode as the controlled mode.
  20. A data processing method, applied to the data processing system according to any one of claims 7-17, the data processing method comprising:
    the controller monitoring a state of the first data processing device and, in response to the state indicating that the first data processing device is to be switched to the controlled mode, sending a first control instruction to the first data processing device and sending a second control instruction to a target second data processing device;
    the first data processing device switching its own mode to the controlled mode in response to receiving the first control instruction;
    the target second data processing device switching its own mode to the master mode in response to receiving the second control instruction.
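
For illustration only, the master/controlled switchover recited in claims 1, 7 and 13-16 might be modelled in software roughly as follows. This is a minimal sketch and not the claimed hardware: the names Device, select_path, pick_target and fail_over, the boolean healthy flag, and the "typeA" type string are all assumptions introduced here, and combining the health check of claim 13 with the type check of claim 14 in one loop is a simplification.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Iterable, Optional


class Mode(Enum):
    MASTER = "master"
    CONTROLLED = "controlled"


@dataclass
class Device:
    """Hypothetical stand-in for one data processing device."""
    name: str
    mode: Mode
    healthy: bool = True        # assumed summary of the monitored state
    chip_type: str = "typeA"    # assumed content of the type register (claim 6)
    in_reset: bool = False

    def select_path(self, path: int) -> None:
        # Models the multiplexer: path 1 yields the first configuration
        # (master mode), path 2 yields the second (controlled mode).
        if path not in (1, 2):
            raise ValueError("path must be 1 or 2")
        self.mode = Mode.MASTER if path == 1 else Mode.CONTROLLED


def pick_target(candidates: Iterable[Device], required_type: str) -> Optional[Device]:
    # Try candidates one by one; skip any that is abnormal or whose
    # type information does not match the preset type.
    for dev in candidates:
        if dev.healthy and dev.chip_type == required_type:
            return dev
    return None


def fail_over(master: Device, others: Iterable[Device],
              required_type: str = "typeA") -> Optional[Device]:
    """Return the new master, or None if no switchover took place."""
    if master.healthy:
        return None
    target = pick_target(others, required_type)
    if target is None:
        return None
    # Hold both devices in reset, switch their modes, then release reset.
    for dev in (master, target):
        dev.in_reset = True
    master.select_path(2)   # first control instruction: become controlled
    target.select_path(1)   # second control instruction: become master
    for dev in (master, target):
        dev.in_reset = False
    return target
```

In this sketch a caller would invoke fail_over with the current master and the list of controlled devices whenever monitoring reports an abnormal state; a real controller would instead exchange control instructions, reset signals and switching success signals with the devices over the interfaces described in the claims.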
PCT/CN2021/134517 2021-06-29 2021-11-30 Data processing apparatus, system, method, and board card WO2023273146A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110727748.1 2021-06-29
CN202110727748.1A CN113344767A (en) 2021-06-29 2021-06-29 Data processing device, system, board card, method, electronic device and storage medium

Publications (1)

Publication Number Publication Date
WO2023273146A1 true WO2023273146A1 (en) 2023-01-05

Family

ID=77481480

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/134517 WO2023273146A1 (en) 2021-06-29 2021-11-30 Data processing apparatus, system, method, and board card

Country Status (2)

Country Link
CN (1) CN113344767A (en)
WO (1) WO2023273146A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113344767A (en) * 2021-06-29 2021-09-03 深圳市商汤科技有限公司 Data processing device, system, board card, method, electronic device and storage medium

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102082667A (en) * 2010-11-17 2011-06-01 北京曙光天演信息技术有限公司 Method for switching master mode and slave mode of encryption card and encryption card
CN104965168A (en) * 2015-07-23 2015-10-07 北京华峰测控技术有限公司 FPGA configuration system and method for testing of integrated circuit
CN106528244A (en) * 2016-11-25 2017-03-22 迈普通信技术股份有限公司 Automatic loading system and method of FPGA (Field-Programmable Gate Array) configuration file
US20180277213A1 (en) * 2017-03-22 2018-09-27 Kabushiki Kaisha Toshiba Semiconductor integrated circuit
CN108983695A (en) * 2018-07-23 2018-12-11 郑州云海信息技术有限公司 A kind of master-slave switching method and device based on Complex Programmable Logic Devices
CN112272024A (en) * 2020-10-29 2021-01-26 国核自仪系统工程有限公司 Method and circuit for refreshing configuration data of FPGA device and storage medium
CN113344767A (en) * 2021-06-29 2021-09-03 深圳市商汤科技有限公司 Data processing device, system, board card, method, electronic device and storage medium

Also Published As

Publication number Publication date
CN113344767A (en) 2021-09-03

Similar Documents

Publication Publication Date Title
TWI446161B (en) Apparatus and method for handling a failed processor of a multiprocessor information handling system
KR100968641B1 (en) Point-to-point link negotiation method and apparatus
TWI803715B (en) System and method for hardware management and configuration in a datacenter using augmented reality and available sensor data
CN107479721B (en) Storage device, system and method for remote multicomputer switching technology
CN113475172B (en) System and method for positioning and navigating data center through augmented reality and available sensor data
JP4558519B2 (en) Information processing apparatus and system bus control method
CN103827834B (en) A kind of moving method of internal storage data, computing machine and device
US20060233204A1 (en) Redundant I/O interface management
CN107368401B (en) Management system and management method
US10691562B2 (en) Management node failover for high reliability systems
US20200252302A1 (en) System and Method for Remote Hardware Support Using Augmented Reality and Available Sensor Data
WO2021004256A1 (en) Node switching method in node failure and related device
CN1324463C (en) Method and apparatus for enumeration of a multi-node computer system
WO2023273146A1 (en) Data processing apparatus, system, method, and board card
CN117389790B (en) Firmware detection system, method, storage medium and server capable of recovering faults
US20200257994A1 (en) Inference processing system, inference processing device, and computer program product
CN115905094A (en) Electronic equipment and PCIe topology configuration method and device thereof
US10964405B2 (en) Memory initialization reporting and control
US11093422B2 (en) Processor/endpoint communication coupling configuration system
CN111858187A (en) Electronic equipment and service switching method and device
KR101056759B1 (en) Recording medium recording an information processing system, a method of controlling the information processing system, and a control program of the information processing system
CN115905072A (en) Computer system, control method based on PCIe device and related device
JP5217128B2 (en) Emulation apparatus and emulation method
US9639438B2 (en) Methods and systems of managing an interconnection
CN111258763A (en) Server system and control method and device of server system

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21948072

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 04/04/2024)