CN109947605A - Method for diagnosing faults - Google Patents
Publication number: CN109947605A
Application number: CN201711402304.0A
Authority: CN (China)
Prior art keywords: data, chip, node, node chip, working state
Legal status: Pending (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abstract
The invention discloses a fault diagnosis method. The method includes: sending a working-state query command to the node chips of a data processing device; each node chip of the data processing device forwarding the working-state query command in turn; determining whether the chip address of each node chip matches the chip address specified in the working-state query command; if the chip address of a node chip matches the chip address specified in the working-state query command, returning register data; and determining the working state of the node chip according to the register data it returns. Embodiments of the invention enable rapid fault diagnosis of a group of series-connected node chips.
Description
Technical field
The present invention relates to the technical field of data processing, and in particular to a fault diagnosis method.
Background
At present, as machine learning, and deep learning in particular, is applied and developed in various fields, higher demands are placed on the data processing capability of computing devices. GPU chips, whose graphics processing and parallel computing capabilities exceed those of traditional CPUs, are widely used for data computation tasks in many fields and have become a general-purpose deep learning computing platform.
However, the computing capability of a single GPU architecture is still limited and cannot meet the demand of deep learning, hash computation and similar workloads for high-intensity data computation. For this reason, Chinese invention patent application No. CN201610312586.4 proposes a scheme for expanding the computing capability of a data processing device, as shown in Fig. 1. That scheme proposes a data processing device formed by connecting multiple node chips in series. The device receives a data processing task through the external interface of the first node chip in the downlink communication direction, processes the task with the serially connected node chips at each stage, and returns the data processing result through the external interface of the first node chip. The number of node chips in the scheme can be extended according to the computing capability that the data processing task requires, and only one node chip needs to be communicatively connected to the external equipment, so no further communication interfaces of the external equipment are occupied. The scheme can therefore provide stronger data processing capability that is easy to extend.
Although the above prior art connects node chips in series, with each node chip responsible for part of the computation, and thereby speeds up data processing, data transmissions between the node chips are prone to conflict. Moreover, when the data processing device receives a data processing task from the external equipment, the task must be distributed among the node chips; how to distribute tasks among multiple node chips while reducing signaling interaction is also a problem to be considered. In addition, when the series-connected node chips process the same data processing task, one node chip may crash, so that the entire node chip group cannot work normally; how to quickly diagnose node chip faults is also a problem to be solved.
Summary of the invention
To solve the above problems, the present invention proposes a fault diagnosis method.
According to one aspect of the invention, a fault diagnosis method is proposed. The method is applied to a data processing device having multiple node chips connected in series, and includes the following steps:
sending a working-state query command to the node chips of the data processing device;
each node chip of the data processing device forwarding the working-state query command in turn;
determining whether the chip address of each node chip matches the chip address specified in the working-state query command;
if the chip address of a node chip matches the chip address specified in the working-state query command, returning register data;
determining the working state of the node chip according to the register data returned by the node chip.
Optionally, determining the working state of the node chip according to the register data returned by the node chip includes:
if no register data is received from the node chip whose address matches the chip address specified in the working-state query command, determining that the node chip has failed.
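The addressed query described above can be sketched as a small simulation. This is a minimal illustration, not the patented implementation: the `NodeChip` class, its `registers` and `alive` fields, and the assumption that a crashed chip neither answers nor forwards the command are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class NodeChip:
    address: int
    registers: int = 0x1   # hypothetical register snapshot
    alive: bool = True     # False models a crashed chip (assumption)


def query_state(chain: List[NodeChip], target: int) -> Optional[int]:
    """Forward the query down the chain; the matching chip returns its registers."""
    for chip in chain:
        if not chip.alive:
            return None            # assumed: a dead chip neither answers nor forwards
        if chip.address == target:
            return chip.registers  # address match: return register data
    return None                    # no chip with that address


def is_faulty(chain: List[NodeChip], target: int) -> bool:
    """No register data received for the addressed chip: judge it faulty."""
    return query_state(chain, target) is None
```

Under this model a chip located behind a crashed chip would also appear faulty, so a caller would probably query addresses in chain order and stop at the first failure.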
According to another aspect of the invention, a fault diagnosis method is proposed. The method is applied to a data processing device having multiple node chips connected in series, and includes the following steps:
sending a working-state query command to the node chips of the data processing device;
each node chip of the data processing device forwarding the working-state query command in turn;
determining whether the working-state query command specifies that the working state of all node chips is to be queried;
if the working-state query command specifies that the working state of all node chips is to be queried, each node chip returning register data in turn;
determining the working state of the node chips according to the register data returned by the node chips.
Optionally, determining the working state of the node chips according to the register data returned by the node chips includes:
when the working-state query command specifies that the working state of all node chips is to be queried, determining which node chip has failed according to the number of register data items received from the node chips.
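One way to read the count-based check is sketched below. The text only says that the faulty chip is determined from the number of register data items received; the rule used here (replies arrive for every chip in front of the first crashed one, so the reply count equals the zero-based position of the suspect chip) is an assumption, and `first_faulty_index` is a hypothetical helper name.

```python
from typing import List, Optional


def first_faulty_index(replies: List[int], chain_length: int) -> Optional[int]:
    """Return the zero-based position of the suspected faulty chip, or None.

    Assumes a crashed chip blocks every chip behind it, so only the chips
    in front of it manage to return register data.
    """
    received = len(replies)
    if received >= chain_length:
        return None       # every chip answered: no fault detected
    return received       # first chip that failed to answer
```

For example, three replies from a five-chip chain would point at the fourth chip (index 3).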
Optionally, the data processing device includes multiple node chips connected in series. The data output unit of the first-stage node chip is connected to the data input unit of an external control device, for returning the operation results of the data processing device to the external control device; the data input unit of an upper-stage node chip is connected to the data output unit of the lower-stage node chip, for receiving the data obtained by the lower-stage node chip's operation. One or more data input units of the first-stage node chip are connected to one or more data output units of the external control device, to receive data input or command input from the external control device; one or more data output units of an upper-stage node chip are connected to one or more data input units of the lower-stage node chip, for sending data input or command input to the lower-stage node chip.
Optionally, the node chip includes a control unit and multiple computing operators. The computing operators are divided into two or more groups, each group includes two or more computing operators connected in series, and the first-stage computing operator in each group is connected to the control unit.
Optionally, the computing operator includes an arithmetic unit and a storage unit, wherein:
the arithmetic unit is connected to the storage unit of the upper-stage computing operator, for reading the data stored in that storage unit and performing an operation on it;
the arithmetic unit is connected to its own storage unit, for storing the data obtained by the operation in the storage unit, to be called by the lower-stage computing operator.
Optionally, the data processing device further includes a signal conversion unit that connects two node chips and performs signal voltage adaptation.
Optionally, the data processing device further includes one or more clock crystals; the clock signal output interface of a clock crystal is connected to the clock signal input interface of one node chip in the data processing device.
Optionally, the node chip is provided with a busy-signal input pin and a busy-signal output pin, which are used to control data transmission by the respective node chip in the uplink communication direction.
Optionally, when the busy-signal output pin is at low/high level, it indicates that data returned by the next-stage node chip may be forwarded; when the busy-signal output pin is at high/low level, it indicates that the current-stage node chip or an upper-stage node chip is about to send, or is sending, data.
Optionally, when the busy-signal input pin of a node chip is at high/low level, the busy-signal output pin of that node chip is also at high/low level.
Optionally, when the current-stage node chip has data waiting to be sent and detects that the busy-signal input pin is at high/low level, it waits until the pin switches to low/high level before sending the data; when it detects that the busy-signal input pin is at low/high level, it sends the data immediately.
Optionally, when the current-stage node chip has data waiting to be sent, it drives the busy-signal output pin to high/low level, and after the data has been sent it drives the pin to low/high level.
Optionally, if the current-stage node chip detects that the busy-signal input pin is at high/low level while it is sending data, it continues sending until all data in the buffer queue has been sent.
Optionally, after the busy-signal output pin of the current-stage node chip is driven to high/low level, the chip waits for a predetermined guard interval before sending data.
Optionally, the guard interval is set separately depending on whether synchronous or asynchronous communication is used between node chips.
Compared with the prior art, some embodiments of the invention control data transmission between series-connected node chips by configuring busy-signal input and output pins, which effectively avoids data transmission conflicts between node chips; they distribute computing tasks to the series-connected node chips with few command interactions between the control unit and the node chips, which makes full use of the node chips' computing capability; and they achieve rapid fault diagnosis of the series-connected node chip group.
Brief description of the drawings
Fig. 1 is a structural schematic diagram of a data processing device in the prior art;
Fig. 2 is a structural schematic diagram of a data processing device according to an embodiment of the invention;
Fig. 3 is a structural block diagram of a node chip according to an embodiment of the disclosure;
Fig. 4 is a structural block diagram of a computing operator according to an embodiment of the disclosure;
Fig. 5 is a flow chart of a data transmission method according to an embodiment of the invention;
Fig. 6 is a flow chart of a data transmission method according to another embodiment of the invention;
Fig. 7 is a flow chart of a data transmission method according to yet another embodiment of the invention;
Fig. 8 is a flow chart of a task allocation method according to an embodiment of the invention;
Fig. 9 is a flow chart of a task allocation method according to another embodiment of the invention;
Fig. 10 is a flow chart of a task allocation method according to yet another embodiment of the invention;
Fig. 11 is a flow chart of a fault diagnosis method according to an embodiment of the invention;
Fig. 12 is a flow chart of a fault diagnosis method according to another embodiment of the invention;
Fig. 13 is a structural schematic diagram of a computing device according to an embodiment of the invention.
Detailed description
To make the objectives, technical solutions and advantages of the invention clearer, the invention is described in more detail below with reference to specific embodiments and the accompanying drawings.
Fig. 2 is a structural schematic diagram of a data processing device 10 according to an embodiment of the invention. As shown in Fig. 2, the data processing device 10 includes multiple node chips 20 connected in series, wherein:
the first-stage node chip in the downlink communication direction receives command signals from an external control device through an external interface, passes them to one or more node chips for processing, and returns the computed data to the external control device through the external interface;
each node chip 20 is provided with a busy-signal input pin BI and a busy-signal output pin BO. The busy-signal output pin BO of a node chip 20 in the downlink communication direction is coupled to the busy-signal input pin BI of the next-stage node chip. The busy-signal input and output pins are used to control data transmission by the respective node chip 20 in the uplink communication direction.
In some embodiments, the node chip 20 may be implemented as an ASIC (application-specific integrated circuit), GPU, DSP or FPGA chip.
In some embodiments, when the busy-signal output pin BO is at low/high level, it indicates that data returned by the next-stage node chip may be forwarded; when BO is at high/low level, it indicates that the current-stage node chip or an upper-stage node chip is about to send, or is sending, data.
In some embodiments, when the busy-signal input pin BI of a node chip 20 is at high/low level, the busy-signal output pin BO of that node chip is correspondingly also at high/low level.
In some embodiments, when data is waiting to be sent in the buffer queue (FIFO) of the current-stage node chip 20 and the chip detects that its own busy-signal input pin BI is at high/low level, it must wait until BI switches to low/high level before it can send the data; when it detects that its own BI is at low/high level, it can send the data immediately.
In some embodiments, if the current-stage node chip 20 detects that its own busy-signal input pin BI is at high/low level while it is sending data, the transmission is unaffected: the chip continues sending until all data in the FIFO has been sent.
In some embodiments, when data is waiting to be sent in the FIFO of the current-stage node chip 20, the chip drives its own busy-signal output pin BO to high/low level, and after the data has been sent it drives its BO to low/high level. In embodiments of the invention, when a node chip receives a reset signal, its busy-signal output pin BO becomes low/high level.
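The pin behaviour described in these embodiments can be summarised as a simple propagation rule. The sketch below is illustrative only: it represents the busy level as `True` regardless of polarity, and it assumes that the BI pin of the first-stage chip is tied to the idle level, which the text does not state.

```python
from typing import List


def propagate_busy(pending: List[bool]) -> List[bool]:
    """Return each chip's BO level (True = busy level), first-stage chip first.

    pending[i] is True when chip i has data waiting in its FIFO.
    Assumption: the first-stage chip's BI is tied to the idle level.
    """
    bo_levels = []
    bi = False
    for has_data in pending:
        bo = bi or has_data   # BO follows BI, or is asserted for the chip's own data
        bo_levels.append(bo)
        bi = bo               # BO is coupled to the next-stage chip's BI
    return bo_levels
```

With this rule, a chip that asserts BO silences every chip further down the chain, which is what keeps the uplink path free for its own transmission.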
In some embodiments, after the busy-signal output pin BO of the current-stage node chip 20 is driven to high/low level, the chip may wait for a predetermined guard interval GAP before sending data.
The guard interval GAP is set to ensure that, when the current-stage node chip needs to send data, the next-stage node chip is not sending data at the same time. The next-stage node chip may be sending data in two cases: it is sending its own data, or it is forwarding data sent by a node chip further down the chain.
If the current-stage node chip drives BO to high/low level when it needs to send data, it must still wait one guard interval GAP even if the next-stage node chip is not forwarding data at that moment. This avoids the case where, just as BO is set to high/low level, the next-stage node chip has already started sending data. In a specific implementation, GAP must be at least the transmission time of 8 bits of data; for example, GAP may be set to the transmission time of 16 bits of data.
If the next-stage node chip is sending data, the current-stage node chip likewise waits one guard interval GAP to ensure that the next-stage chip's transmission has finished, and then starts sending its own data. Because the BO of the current-stage node chip has already been driven to high/low level, the next-stage node chip cannot go on to send further data.
Another situation arises in practice. When the first-stage node chip of the data processing device needs to send data, it drives its busy-signal output pin BO to high/low level; by the time the last-stage node chip on the chain detects that its busy-signal input pin BI is at high/low level, the signal has passed through N stages of delay. If the last-stage node chip starts sending data in the uplink communication direction before it detects that BI is at high/low level, its data signal also needs N stages of delay to reach the first-stage node chip. Therefore, when the guard interval GAP is set, both of these N-stage delays must fall within GAP.
In some embodiments, the guard interval GAP must also be set according to the communication mode used between node chips. Taking 256 node chips connected in series as an example:
1) When the node chips use asynchronous serial communication (UART), taking into account the wiring delay inside the node chips and the PCB delay, waiting for the UART transmission time of 16 bits as the guard interval is sufficient.
2) When the node chips use synchronous serial communication with 256 node chips cascaded, consider the delay from the busy-signal output pin BO to the busy-signal input pin BI and the fact that forwarding data through each stage takes one clock cycle, so the total delay is 256 clock cycles. The actual circuit can therefore be set to wait 512 clock cycles.
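The two sizing rules for the protection (guard) interval GAP reduce to short arithmetic, sketched below. The function names and the `cycles_per_hop` parameter are hypothetical, and the baud rate used in the usage note is only an example value that does not come from the text.

```python
def uart_gap_seconds(baud_rate: int, gap_bits: int = 16) -> float:
    """Guard interval expressed as the time needed to send gap_bits at baud_rate."""
    if gap_bits < 8:
        raise ValueError("the text requires at least 8 bit times")
    return gap_bits / baud_rate


def sync_gap_cycles(num_chips: int, cycles_per_hop: int = 1) -> int:
    """Cover two chain traversals: busy signal out to the last chip, data back."""
    return 2 * num_chips * cycles_per_hop
```

For 256 cascaded chips, `sync_gap_cycles(256)` gives the 512 clock cycles mentioned above; at an assumed 115200 baud, `uart_gap_seconds(115200)` is roughly 139 microseconds.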
In some embodiments, the guard interval is set according to the transmission delay of the signal or command and/or the computing speed of the chip.
Fig. 3 is a structural block diagram of a node chip according to an embodiment of the disclosure. As shown in Fig. 3, in this embodiment the node chip includes a control unit 201, two or more groups of computing operators 202, and one or more input/output interfaces 203, wherein:
the control unit 201 is connected to the input/output interface 203, for exchanging data with the outside;
each group of computing operators includes two or more computing operators connected in series.
A node chip usually needs to carry multiple computing operators. To save wiring space, reduce wiring complexity and make control by the control unit more convenient, the computing operators may be divided into two or more groups according to the usable area of the node chip, the working characteristics, performance or function of the computing operators, or other factors, with the computing operators in each group connected to one another in series.
The above is only illustrative. In practice, those skilled in the art may group the computing operators according to the needs of the application; the disclosure does not specifically limit the grouping method.
In an embodiment of the disclosure, the first-stage computing operator in each group is connected to the control unit.
Because the computing operators in each group are connected to one another in series, it is sufficient for one computing operator in each group to be connected to the control unit. In an embodiment of the disclosure, the first-stage computing operator in each group may be connected to the control unit; the first-stage computing operator is usually the one nearest the control unit, which further saves wiring space and reduces wiring complexity.
The above is only illustrative. In practice, those skilled in the art may select which computing operator is connected to the control unit according to the needs of the application; the disclosure does not specifically limit this.
In an embodiment of the disclosure, there are two input/output interfaces 203, located at the two ends of the node chip. Both input/output interfaces are connected to the control unit, so that the control unit exchanges data with the outside through them.
The above is only illustrative. In practice, those skilled in the art may select the placement of the input/output interfaces according to the needs of the application; the disclosure does not specifically limit this.
Fig. 4 is a structural block diagram of a computing operator 202 according to an embodiment of the disclosure. As shown in Fig. 4, in an embodiment of the disclosure the computing operator 202 includes one or more arithmetic units 2021, one or more storage units 2022, and a clock input interface 2023, wherein:
the arithmetic unit 2021 is connected to the storage unit 2022 of the upper-stage computing operator, for reading the data stored in that storage unit 2022 and performing an operation on it;
the arithmetic unit 2021 is connected to its own storage unit 2022, for storing the data obtained by the operation in the storage unit 2022, to be called by the lower-stage computing operator;
the clock input interface 2023 is connected to the clock output interface of the control unit.
In this embodiment, data is passed stage by stage through the series-connected computing operators, so each computing operator can obtain the data it needs, and the series structure saves wiring space and reduces wiring complexity.
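The stage-by-stage data flow through a group of operators can be pictured with a short software analogy. This is only a sketch: the `Operator` class and `run_chain` function are hypothetical, and a real chip would clock all stages in parallel rather than evaluate them one after another.

```python
from typing import Callable, List, Optional


class Operator:
    """One stage: an arithmetic function plus a storage unit for its result."""

    def __init__(self, func: Callable[[int], int]) -> None:
        self.func = func
        self.storage: Optional[int] = None   # plays the role of storage unit 2022

    def step(self, upstream_value: int) -> None:
        # Read the upper stage's stored value, compute, and store the result.
        self.storage = self.func(upstream_value)


def run_chain(operators: List[Operator], seed: int) -> int:
    """Push one value through the series-connected stages and return the last result."""
    value = seed
    for op in operators:
        op.step(value)
        value = op.storage
    return value
```

Each stage reads only from its immediate predecessor, which is why a single series connection per group is enough.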
Further, the computing operator is composed of microelectronic circuits, which are built from CMOS and NMOS transistors.
In practice, those skilled in the art may select computing operators and storage units suited to the computing purpose according to the needs of the application; the disclosure does not specifically limit the selection or models of the computing operators and storage units.
In some embodiments, the data output unit of the first-stage node chip is connected to the data input unit of the external control device, for returning the operation results of the data processing device to the external control device; the data input unit of an upper-stage node chip is connected to the data output unit of the lower-stage node chip, for receiving the data obtained by the lower-stage node chip's operation. One or more data input units of the first-stage node chip are connected to one or more data output units of the external control device, to receive data input or command input from the external control device; one or more data output units of an upper-stage node chip are connected to one or more data input units of the lower-stage node chip, for sending data input or command input to the lower-stage node chip.
In some embodiments, the data processing device further includes a signal conversion unit that connects two node chips and performs signal voltage adaptation.
In some embodiments, the data processing device further includes one or more clock crystals; the clock signal output interface of a clock crystal is connected to the clock signal input interface of one node chip in the data processing device.
Fig. 5 is a flow chart of an embodiment of the data transmission method based on the data processing device 10 of the invention. As shown in Fig. 5, the data transmission method includes the following steps:
Step S1: the current-stage node chip determines whether data is waiting to be sent in its buffer queue;
Step S2: if so, it detects whether the busy-signal input pin is at high/low level;
Step S3: if the busy-signal input pin is at high/low level, it waits until the pin changes from high/low level to low/high level and then starts sending the data in the buffer queue;
Step S4: if the busy-signal input pin is at low/high level, it sends the data in the buffer queue immediately.
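Steps S1 to S4 can be sketched as a short polling routine. This is an illustrative model rather than the chip's logic: `read_bi` is a hypothetical callback that returns `True` while the busy-signal input pin is at the busy level, and the busy-wait loop stands in for whatever waiting mechanism the hardware uses.

```python
from collections import deque
from typing import Callable, Deque, List


def send_when_idle(fifo: Deque[int], read_bi: Callable[[], bool]) -> List[int]:
    """Send everything in the FIFO once the busy-signal input pin is idle."""
    sent: List[int] = []
    if not fifo:              # S1: nothing waiting to be sent
        return sent
    while read_bi():          # S2/S3: pin at the busy level, keep waiting
        pass
    while fifo:               # S3/S4: pin idle, send the queued data
        sent.append(fifo.popleft())
    return sent
```

Once sending has begun the pin is no longer polled, which matches the embodiments in which an ongoing transmission is not interrupted.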
In some embodiments, the data transmission method further includes:
when the current-stage node chip detects that the busy-signal input pin is at high/low level, also driving the busy-signal output pin to high/low level.
Fig. 6 is a flow chart of another embodiment of the data transmission method based on the data processing device 10 of the invention. As shown in Fig. 6, the data transmission method includes the following steps:
Step S11: the current-stage node chip determines whether data is waiting to be sent in its buffer queue;
Step S12: if so, it drives the busy-signal output pin to high/low level;
Step S13: it detects whether the busy-signal input pin is at high/low level;
Step S14: if the busy-signal input pin is at low/high level, it sends the data in the buffer queue immediately;
Step S15: it determines whether the data in the buffer queue has all been sent;
Step S16: if the data in the buffer queue has all been sent, it drives the busy-signal output pin to low/high level;
Step S17: if the busy-signal input pin is detected at high/low level before the data in the buffer queue has all been sent, it continues sending until all data in the buffer queue has been sent.
Fig. 7 is a flow chart of yet another embodiment of the data transmission method based on the data processing device 10 of the invention. As shown in Fig. 7, building on the embodiment of Fig. 6, the data transmission method further includes the following step after step S12:
Step S18: after the busy-signal output pin of the current-stage node chip is driven to high/low level, the chip waits for a predetermined guard interval before sending data, to ensure that the next-stage node chip does not send data at the same time.
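The combined flow of Fig. 6 and Fig. 7 can be sketched as a routine that records what the chip does at each step. The `transmit` function, the `read_bi` callback and the tick-based `gap_ticks` parameter are hypothetical names chosen for illustration; the text does not prescribe how the waiting is implemented.

```python
from collections import deque
from typing import Callable, Deque, List


def transmit(fifo: Deque[int], read_bi: Callable[[], bool], gap_ticks: int) -> List[str]:
    """Return a trace of the sender's actions for steps S11 to S18."""
    trace: List[str] = []
    if not fifo:                          # S11: nothing waiting
        return trace
    trace.append("BO=busy")               # S12: assert the busy-signal output pin
    trace.extend(["wait"] * gap_ticks)    # S18: wait out the protection interval
    while read_bi():                      # S13: input pin still at the busy level
        trace.append("wait")
    while fifo:                           # S14, S15, S17: send until the FIFO is empty
        trace.append(f"send {fifo.popleft()}")
    trace.append("BO=idle")               # S16: release the busy-signal output pin
    return trace
```

The input pin is deliberately not re-checked during the send loop, reflecting step S17.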
In some embodiments, the guard interval is set separately depending on whether synchronous or asynchronous communication is used between node chips.
As described above, some embodiments of the invention control data transmission between series-connected node chips by configuring busy-signal input and output pins and, combined with the guard interval, effectively avoid data transmission conflicts between node chips.
Fig. 8 is a flow chart of an embodiment of the task allocation method based on the data processing device 10 of the invention. As shown in Fig. 8, the task allocation method applies to the control unit and includes the following steps:
Step S21: sending a command to the node chips of the data processing device so that the node chips enter the inactive state;
Step S22: sending address assignment commands to the node chips of the data processing device, assigning a chip address to each node chip in turn;
Step S23: assigning a computing task to each node chip according to its chip address.
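Seen from the control unit, steps S21 to S23 amount to three passes over the chain. The sketch below is a simplified model under stated assumptions: the `Chip` class, the sequential addresses starting at 0, and the `tasks` dictionary keyed by address are all hypothetical.

```python
from typing import Dict, List, Optional


class Chip:
    def __init__(self) -> None:
        self.active = True
        self.address: Optional[int] = None
        self.task: Optional[str] = None


def allocate_tasks(chips: List[Chip], tasks: Dict[int, str]) -> None:
    """Deactivate the chain, hand out addresses in order, then assign tasks."""
    for chip in chips:                        # S21: every chip enters the inactive state
        chip.active = False
    for address, chip in enumerate(chips):    # S22: one address per chip, in chain order
        chip.address = address
        chip.active = True                    # a chip becomes active once addressed
    for chip in chips:                        # S23: task chosen by chip address
        chip.task = tasks[chip.address]
```

Because the task is looked up by address, the control unit does not need a separate negotiation with each chip about which share of the work it owns.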
In some embodiments, the step of sending a command to the node chips of the data processing device so that the node chips enter the inactive state includes:
sending a command to each node chip of the data processing device separately, so that the node chips enter the inactive state one after another.
In some embodiments, the step of sending a command to the node chips of the data processing device so that the node chips enter the inactive state includes:
sending a single command to all node chips of the data processing device, so that the node chips enter the inactive state simultaneously.
Fig. 9 is a flow chart of another embodiment of the task allocation method based on the data processing device 10 of the invention. As shown in Fig. 9, building on the embodiment of Fig. 8, the task allocation method further includes the following step after step S23:
Step S24: the node chip performs one or more hash operations according to the computing task assigned to it.
Specifically, the hash operation may include a hash value computation or a hash collision computation.
Fig. 10 is a flowchart of yet another embodiment of the task allocation method based on the data processing device 10 of the present invention. As shown in Fig. 10, the task allocation method is performed by a node chip and comprises the following steps:
Step S31: receive the address assignment command sent by the control unit;
Step S32: determine whether the node chip is currently in the inactive state;
Step S33: if the node chip is in the inactive state, parse the address assignment command sent by the control unit, save the assigned chip address to a register, and switch to the active state;
Step S34: if the node chip is in the active state, do not parse the address assignment command sent by the control unit, and forward it directly to the next-stage node chip.
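Steps S31 to S34 amount to a two-state machine on each node chip. The following sketch shows one way to model it; the class and method names are hypothetical, and it assumes that a chip which consumes an address assignment command does not forward it, which the text does not state explicitly.

```python
from enum import Enum
from typing import Optional


class State(Enum):
    INACTIVE = "inactive"
    ACTIVE = "active"


class NodeChip:
    """Node-side handling of the address assignment command (illustrative)."""

    def __init__(self) -> None:
        self.state = State.ACTIVE                    # assumed power-on state
        self.address_register: Optional[int] = None  # holds the assigned address

    def deactivate(self) -> None:
        """Enter the inactive state on request of the control unit."""
        self.state = State.INACTIVE

    def on_set_address(self, chip_addr: int) -> bool:
        """Handle one command; return True if it must be forwarded downstream."""
        if self.state is State.INACTIVE:
            # S33: parse the command, save the address, become active.
            self.address_register = chip_addr
            self.state = State.ACTIVE
            return False  # assumption: a consumed command is not forwarded
        # S34: an active chip does not parse the command, it only forwards it.
        return True
```

An active chip never overwrites its stored address, so a later command passes through it unchanged.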
The task input command format used by the data processing device 10 includes an HCN field and a starting nonce offset (SNO) field. The HCN field controls the number of computations performed by each node chip. For example, suppose the computing task requires 2^32 computations, i.e., the random number (nonce) is incremented from an initial value and traverses 2^32 values; if 32 node chips are connected in series, each node chip only needs to compute 2^27 steps (2^32 / 32 = 2^27). The SNO field contains a single number.
The SetAddress address assignment command format used by the data processing device 10 includes a chip address (ChipAddr) field, which specifies the chip address of a single node chip. The computing task of each node chip is determined by the values in the SNO and ChipAddr fields.
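The text does not give the exact rule by which SNO and ChipAddr determine a chip's task. One plausible reading, shown here purely as an assumption, is that chip addresses run from 0 to N-1 and each chip takes a contiguous slice of the nonce space starting at an offset derived from SNO. The function name `nonce_slice` and its parameters are hypothetical.

```python
from typing import Tuple


def nonce_slice(sno: int, chip_addr: int, num_chips: int,
                total: int = 2 ** 32) -> Tuple[int, int]:
    """Return (start_nonce, count) for one chip.

    Assumes chip addresses 0..num_chips-1 and an even, contiguous split of
    the nonce space; the patent text does not specify this mapping.
    """
    per_chip = total // num_chips  # would correspond to the per-chip count
    start = (sno + chip_addr * per_chip) % total
    return start, per_chip


# With 32 chips, every chip covers 2**27 of the 2**32 nonce values.
slices = [nonce_slice(0, addr, 32) for addr in range(32)]
```

Under this assumption the 32 slices are disjoint and together cover all 2^32 values exactly once.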
To assign chip addresses, the control unit first issues a ChainInactive command to put a node chip into the inactive state. In a specific implementation, all node chips may enter the inactive state together, or they may be set to the inactive state one by one. The control unit then sends a SetAddress address assignment command to this node chip, assigning it an arbitrary address.
In the inactive state, a node chip parses the SetAddress address assignment command, saves its address in a register, and then switches to the active state. In a specific implementation, a node chip that has been assigned an address may enter the active state in response to a command from the control unit, or it may enter the active state automatically after parsing the address assignment command.
In the active state, a node chip does not parse the SetAddress address assignment command but forwards it directly to the next-stage node chip. For N node chips connected in series, the control unit must issue N SetAddress commands in succession to assign an arbitrary address to each node chip one by one. For example, for a mining machine with 256 cascaded node chips, the control CPU must issue 256 consecutive SetAddress address assignment commands to complete the address setup of all node chips.
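The control-unit side of this sequence can be illustrated with a short simulation of the chain. This is a sketch under stated assumptions: all chips are deactivated by one command, each SetAddress command is consumed by the first inactive chip it reaches, and the names `Chip`, `chain_inactive`, `set_address`, and `assign_addresses` are hypothetical.

```python
from typing import List, Optional


class Chip:
    """Minimal model of one node chip in the series chain."""

    def __init__(self) -> None:
        self.active = True
        self.addr: Optional[int] = None


def chain_inactive(chain: List[Chip]) -> None:
    """Model of the ChainInactive command applied to every chip at once."""
    for chip in chain:
        chip.active = False


def set_address(chain: List[Chip], addr: int) -> Optional[int]:
    """Pass one SetAddress command down the chain.

    Active chips forward it; the first inactive chip stores the address and
    becomes active. Returns that chip's position, or None if every chip
    forwarded the command.
    """
    for position, chip in enumerate(chain):
        if not chip.active:
            chip.addr = addr
            chip.active = True
            return position
    return None


def assign_addresses(chain: List[Chip], addresses: List[int]) -> int:
    """Issue one SetAddress command per address; return the command count."""
    chain_inactive(chain)
    for addr in addresses:
        set_address(chain, addr)
    return len(addresses)
```

For a chain of N chips the loop issues exactly N commands, and the k-th command lands on the k-th chip because all earlier chips are already active.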
Fig. 11 is a flowchart of one embodiment of the fault diagnosis method based on the data processing device 10 of the present invention. As shown in Fig. 11, the fault diagnosis method is performed by the control unit and comprises the following steps:
Step S41: send a working-status query command to the node chips of the data processing device;
Step S42: each node chip of the data processing device forwards the working-status query command in turn;
Step S43: determine whether the chip address of each node chip matches the chip address specified in the working-status query command;
Step S44: if the chip address of a node chip matches the chip address specified in the working-status query command, that node chip returns its register data;
Step S45: determine the working status of the node chip according to the register data it returns.
In some embodiments, the fault diagnosis method further comprises:
if no register data is received from the node chip whose address matches the chip address specified in the working-status query command, determining that this node chip has failed.
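The addressed query of Fig. 11 can be sketched as a loop over the assigned chip addresses. The transport is abstracted as a callable that returns the register value, or `None` when no reply arrives; that callable, the timeout behaviour it implies, and the function name `diagnose_by_address` are assumptions made for illustration.

```python
from typing import Callable, Dict, Iterable, List, Optional


def diagnose_by_address(addresses: Iterable[int],
                        query: Callable[[int], Optional[int]]) -> List[int]:
    """Query each chip address in turn and collect the ones that stay silent.

    `query(addr)` stands for sending a working-status query command for
    `addr` and waiting for the register data; it returns None when no reply
    is received (a hypothetical timeout).
    """
    faulty = []
    for addr in addresses:
        if query(addr) is None:
            faulty.append(addr)  # no register data returned: chip is faulty
    return faulty


# Simulated chain: the chip at address 2 never answers.
registers: Dict[int, int] = {0: 0x1A, 1: 0x2B, 3: 0x4D}
faulty_chips = diagnose_by_address(range(4), registers.get)
```

Here `registers.get` plays the role of the link to the chain, so the silent chip shows up as the only entry in `faulty_chips`.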
Fig. 12 is a flowchart of another embodiment of the fault diagnosis method based on the data processing device 10 of the present invention. As shown in Fig. 12, the fault diagnosis method is performed by the control unit and comprises the following steps:
Step S51: send a working-status query command to the node chips of the data processing device;
Step S52: each node chip of the data processing device forwards the working-status query command in turn;
Step S53: determine whether the working-status query command specifies that the working status of all node chips is to be queried;
Step S54: if so, each node chip returns its register data in turn;
Step S55: determine the working status of the node chips according to the register data they return.
In some embodiments, the fault diagnosis method further comprises:
when the working-status query command specifies that the working status of all node chips is to be queried, the control unit determines which node chips have failed according to the number of register data items returned by the node chips.
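For the query-all case, the control unit can compare the replies it receives against the chips it expects. The text only says that the number of replies is used; the sketch below additionally assumes that each reply carries the sender's chip address so that the missing chips can be named. The function name `diagnose_broadcast` and the reply format are hypothetical.

```python
from typing import Iterable, List, Tuple


def diagnose_broadcast(expected_addresses: Iterable[int],
                       responses: List[Tuple[int, int]]) -> Tuple[int, List[int]]:
    """Return (number of replies, sorted addresses that did not reply).

    Each response is a (chip_addr, register_value) pair; carrying the address
    in the reply is an assumption made for this sketch.
    """
    seen = {addr for addr, _ in responses}
    missing = sorted(set(expected_addresses) - seen)
    return len(responses), missing


# Four chips expected, three replies received.
count, failed = diagnose_broadcast(range(4), [(0, 0x1A), (1, 0x2B), (3, 0x4D)])
```

A reply count lower than the number of assigned addresses signals a fault, and the set difference identifies which chip stayed silent.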
After a node chip receives the working-status query command from the control unit, it first forwards the command to the next-stage node chip. A node chip whose chip address matches the address in the command returns the value of the corresponding register (for example, the value currently being computed) to the control unit through the UART interface. The control unit can determine the number of chips in the device from the number of node chip responses received. The command can therefore be used for fault diagnosis of the node chip group: if a node chip fails during operation, this can be roughly judged from the hash rate of the data processing device, and the faulty chip can also be identified as the one that has not returned a computed value for a long time.
Fig. 13 is a schematic structural diagram of a computing device 40 according to an embodiment of the present invention. As shown in Fig. 13, the computing device 40 includes the aforementioned data processing device 10 and a control unit 30; the data processing device 10 is communicatively connected to the control unit 30 through an external interface.
The specific embodiments described above further explain the purpose, technical solutions, and beneficial effects of the present invention in detail. It should be understood that the above are only specific embodiments of the present invention and are not intended to limit it; any modification, equivalent substitution, or improvement made within the spirit and principles of the present invention shall fall within the scope of protection of the present invention.
Claims (17)
1. A fault diagnosis method applied to a data processing device having a plurality of node chips connected in series, characterized in that the method comprises the following steps:
sending a working-status query command to the node chips of the data processing device;
each node chip of the data processing device forwarding the working-status query command in turn;
determining whether the chip address of each node chip matches the chip address specified in the working-status query command;
if the chip address of a node chip matches the chip address specified in the working-status query command, returning register data;
determining the working status of the node chip according to the register data returned by the node chip.
2. The fault diagnosis method according to claim 1, characterized in that determining the working status of the node chip according to the register data returned by the node chip comprises:
if no register data is received from the node chip whose address matches the chip address specified in the working-status query command, determining that this node chip has failed.
3. A fault diagnosis method applied to a data processing device having a plurality of node chips connected in series, characterized in that the method comprises the following steps:
sending a working-status query command to the node chips of the data processing device;
each node chip of the data processing device forwarding the working-status query command in turn;
determining whether the working-status query command specifies that the working status of all node chips is to be queried;
if the working-status query command specifies that the working status of all node chips is to be queried, each node chip returning register data in turn;
determining the working status of the node chips according to the register data returned by the node chips.
4. The fault diagnosis method according to claim 3, characterized in that determining the working status of the node chips according to the register data returned by the node chips comprises:
when the working-status query command specifies that the working status of all node chips is to be queried, determining the failed node chips according to the number of register data items returned by the node chips.
5. The fault diagnosis method according to claim 1 or 3, characterized in that the data processing device includes a plurality of node chips connected in series; the data output unit of the first-stage node chip is connected to the data input unit of an external control device, for returning the operation result of the data processing device to the external control device; the data input unit of an upper-level node chip is connected to the data output unit of the lower-level node chip, for receiving the data obtained by the operation of the lower-level node chip; one or more data input units of the first-stage node chip are connected to one or more data output units of the external control device, to receive data input or command input from the external control device; and one or more data output units of an upper-level node chip are connected to one or more data input units of the lower-level node chip, for sending data input or command input to the lower-level node chip.
6. The fault diagnosis method according to claim 1 or 3, characterized in that the node chip includes a control unit and a plurality of operation operators; the operation operators are divided into two or more groups, each group includes two or more operation operators connected in series, and the first-stage operation operator in each group is connected to the control unit.
7. The fault diagnosis method according to claim 1 or 3, characterized in that the operation operator includes an arithmetic unit and a storage unit, wherein:
the arithmetic unit is connected to the storage unit of the upper-level operation operator, for reading the data stored in the storage unit of the upper-level operation operator and performing an operation on it;
the arithmetic unit is connected to the storage unit, for storing the data obtained by the operation in the storage unit, to be called by the lower-level operation operator.
8. The fault diagnosis method according to claim 1 or 3, characterized in that the data processing device further includes a signal conversion unit that connects two node chips and is used for signal voltage adaptation.
9. The fault diagnosis method according to claim 1 or 3, characterized in that the data processing device further includes one or more clock crystals, and the clock signal output interface of the clock crystal is connected to the clock signal input interface of a node chip in the data processing device.
10. The fault diagnosis method according to claim 5, characterized in that the node chip is provided with a busy-signal input pin and a busy-signal output pin, the busy-signal input pin and the busy-signal output pin being used to control data transmission of the respective node chip in the uplink communication direction.
11. The fault diagnosis method according to claim 10, characterized in that when the busy-signal output pin is at a low/high level, it indicates that data returned by the next-stage node chip can be forwarded; when the busy-signal output pin is at a high/low level, it indicates that the current-stage node chip or the upper-level node chip is about to send or is sending data.
12. The fault diagnosis method according to claim 10 or 11, characterized in that when the busy-signal input pin of a node chip is at a high/low level, the busy-signal output pin of that node chip is also at a high/low level.
13. The fault diagnosis method according to claim 11, characterized in that when the current-stage node chip has data waiting to be sent: if the busy-signal input pin is detected to be at a high/low level, the node chip waits until the busy-signal input pin switches to a low/high level and then sends the data; if the busy-signal input pin is detected to be at a low/high level, the node chip sends the data immediately.
14. The fault diagnosis method according to claim 11, characterized in that when the current-stage node chip has data waiting to be sent, it drives the busy-signal output pin to a high/low level and, after the data has been sent, drives the busy-signal output pin to a low/high level.
15. The fault diagnosis method according to claim 11, characterized in that while the current-stage node chip is sending data, if the busy-signal input pin is detected to be at a high/low level, the node chip continues sending until all the data in the buffer queue has been sent.
16. The fault diagnosis method according to claim 14, characterized in that after the busy-signal output pin of the current-stage node chip outputs a high/low level, the node chip waits for a predetermined guard interval before sending data.
17. The fault diagnosis method according to claim 16, characterized in that the guard interval is set separately according to whether synchronous or asynchronous communication is used between the node chips.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201711402304.0A CN109947605A (en) | 2017-12-21 | 2017-12-21 | Method for diagnosing faults |
Publications (1)
Publication Number | Publication Date |
---|---|
CN109947605A true CN109947605A (en) | 2019-06-28 |
Family
ID=67006296
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201711402304.0A Pending CN109947605A (en) | 2017-12-21 | 2017-12-21 | Method for diagnosing faults |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109947605A (en) |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111324070A (en) * | 2020-03-04 | 2020-06-23 | 明峰医疗系统股份有限公司 | Debugging method of CT serial detector module cluster based on FPGA |
CN112557882A (en) * | 2021-02-19 | 2021-03-26 | 深圳市明微电子股份有限公司 | Chip initial address self-adaptive detection method, device, equipment and storage medium |
CN112732629A (en) * | 2020-12-31 | 2021-04-30 | 明峰医疗系统股份有限公司 | CT detector data transmission structure and data transmission method based on source synchronous LVDS-SERDES |
CN112860622A (en) * | 2021-02-08 | 2021-05-28 | 山东云海国创云计算装备产业创新中心有限公司 | Processing system and system on chip |
CN117093523A (en) * | 2023-10-20 | 2023-11-21 | 合肥为国半导体有限公司 | Chip array, fault positioning method thereof and electronic equipment |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20060031593A1 (en) * | 2004-08-09 | 2006-02-09 | Sinclair Alan W | Ring bus structure and its use in flash memory systems |
CN102163184A (en) * | 2011-03-22 | 2011-08-24 | 中兴通讯股份有限公司 | Master-slave transmission system and method based on special multi-chip serial interconnection interface |
CN102981992A (en) * | 2012-11-28 | 2013-03-20 | 中国人民解放军国防科学技术大学 | On-chip communication method and device of integrated circuit based on asynchronous structure |
CN105760324A (en) * | 2016-05-11 | 2016-07-13 | 北京比特大陆科技有限公司 | Data processing device and server |
CN107037791A (en) * | 2017-03-07 | 2017-08-11 | 佛山华数机器人有限公司 | A kind of producing line device visualization method for diagnosing faults |
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111324070A (en) * | 2020-03-04 | 2020-06-23 | 明峰医疗系统股份有限公司 | Debugging method of CT serial detector module cluster based on FPGA |
CN112732629A (en) * | 2020-12-31 | 2021-04-30 | 明峰医疗系统股份有限公司 | CT detector data transmission structure and data transmission method based on source synchronous LVDS-SERDES |
CN112860622A (en) * | 2021-02-08 | 2021-05-28 | 山东云海国创云计算装备产业创新中心有限公司 | Processing system and system on chip |
CN112557882A (en) * | 2021-02-19 | 2021-03-26 | 深圳市明微电子股份有限公司 | Chip initial address self-adaptive detection method, device, equipment and storage medium |
CN112557882B (en) * | 2021-02-19 | 2021-05-28 | 深圳市明微电子股份有限公司 | Chip initial address self-adaptive detection method, device, equipment and storage medium |
CN117093523A (en) * | 2023-10-20 | 2023-11-21 | 合肥为国半导体有限公司 | Chip array, fault positioning method thereof and electronic equipment |
CN117093523B (en) * | 2023-10-20 | 2024-01-26 | 合肥为国半导体有限公司 | Chip array, fault positioning method thereof and electronic equipment |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109947605A (en) | Method for diagnosing faults | |
JP5793690B2 (en) | Interface device and memory bus system | |
CN101383712B (en) | Routing node microstructure for on-chip network | |
CN102970247B (en) | Effective communication time scheduling method of time-triggered network | |
CN103595627A (en) | NoC router based on multicast dimension order routing algorithm and routing algorithm thereof | |
CN110995598A (en) | Variable-length message data processing method and scheduling device | |
CN103312614B (en) | A kind of multicast message processing method, line card and communication equipment | |
CN109947555A (en) | Data processing equipment, data transmission method for uplink and calculating equipment | |
CN104717152A (en) | Method and device for achieving interface caching dynamic allocation | |
CN109194430A (en) | A kind of C6678 distribution type system time synchronous method and system based on SRIO | |
CN105786734B (en) | Data transmission method, expansion device, peripheral equipment and system | |
CN116150051A (en) | Command processing method, device and system | |
CN110825210B (en) | Method, apparatus, device and medium for designing clock tree structure of system on chip | |
CN109933433B (en) | GPU resource scheduling system and scheduling method thereof | |
CN109947556A (en) | Method for allocating tasks | |
CN109101451A (en) | Chip-in series circuit calculates equipment and communication means | |
CN110519145B (en) | Multi-master 485 route communication method and system based on bidirectional ring network | |
CN103152275A (en) | Router suitable for network on chip and allowable for configuring switching mechanisms | |
CN105893321A (en) | Path diversity-based crossbar switch fine-grit fault-tolerant module in network on chip and method | |
WO2010012172A1 (en) | Data processing method, controller and system | |
US20160085706A1 (en) | Methods And Systems For Controlling Ordered Write Transactions To Multiple Devices Using Switch Point Networks | |
CN115757251A (en) | Data transmission system, method, device and medium | |
CN112818183B (en) | Data synthesis method, device, computer equipment and storage medium | |
US20230305976A1 (en) | Data flow-based neural network multi-engine synchronous calculation system | |
TW202147141A (en) | A computing device and computing system for digital currency |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | ||
Application publication date: 20190628 |