CN109144793B - Fault correction device and method based on data flow driving calculation - Google Patents
- Publication number: CN109144793B (application CN201811044090.9A)
- Authority: CN (China)
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Fee Related
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/16—Error detection or correction of the data by redundancy in hardware
- G06F11/1629—Error detection by comparing the output of redundant processing systems
- G06F11/1641—Error detection by comparing the output of redundant processing systems where the comparison is not performed by the redundant processing components
- G06F11/165—Error detection by comparing the output of redundant processing systems with continued operation after detection of the error
Abstract
The application discloses a fault correction device and method based on data flow driven computation. The device comprises: a data storage unit, arranged at the input end of the fault correction device, for simultaneously sending original data to two calculation units according to a data address; the calculation units, for performing mirror image calculation on the original data; a judging unit, for sending the mirror image calculation result to a data saving unit when it judges that the two mirror image calculation results are equal, and for generating a fault breakpoint address and sending it to an analysis module when it judges that the two results are not equal; the data saving unit, for generating a storage address according to the data address and storing the mirror image calculation result at that storage address; and the analysis module, for obtaining the data address from the fault breakpoint address and sending the data address to the data storage unit. This technical scheme reduces the time consumed by data flow driven computation on the chip, reduces the chip's resource consumption, and lowers the probability that the data flow driven computation is interrupted.
Description
Technical Field
The present disclosure relates to the field of chip operations, and in particular, to a fault correction device and a fault correction method based on data flow driven computation.
Background
Cosmic rays, which are made up of various atomic nuclei and individual heavy ions, are highly penetrating and can severely damage electronic systems. A single event upset, in which a single particle strike causes an electronic element in a chip to fail, is one of the main causes of failure and abnormal operation of electronic systems in the space environment. Mainstream hardware accelerators generally use data flow driven computation. Because single event upsets often cause electronic elements in a chip to fail in high-altitude and outer-space environments, and thereby affect the correctness of the chip's calculation results, fault detection and correction must be performed on the result data of data flow driven computation.
In the prior art, the commonly used hardening approaches are mainly the following:
1) Temporal triple modular redundancy: the same data is computed three times and the successive results are compared. As long as errors do not occur in two of the three computations, the error is masked and the correct result is output. However, the computation takes three times as long as the original one, which greatly reduces the performance of the hardware accelerator and is not feasible for a high-speed calculation unit.
2) Hardware triple modular redundancy: three mirror modules execute the same operation at the same time, and the result output by the majority is taken as the correct result of the voting system, usually by two-out-of-three voting. Because the three modules are independent, the probability that two modules fail at the same time is extremely low, so the reliability of the system is greatly improved; however, hardware resources and system consumption also increase greatly.
3) Conventional hardware dual modular redundancy: two mirror modules execute the same operation at the same time, and the result data is then compared and checked. If the results differ, the data flow computation is interrupted and restarted. The resource consumption of this method is reduced by half compared with hardware triple modular redundancy, but the processing speed drops sharply, and as the single event upset probability increases, the probability that the data flow driven computation is interrupted becomes relatively high.
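The difference between the voting scheme in 2) and the compare-and-restart scheme in 3) can be sketched in a few lines of Python. This is an illustrative software model only, not circuitry from the application; the function names `vote_2_of_3` and `dmr_check` are hypothetical.

```python
from collections import Counter


def vote_2_of_3(r1, r2, r3):
    """Model of hardware TMR: return the majority result of three mirror modules.

    Raises ValueError when all three results differ, since no majority exists.
    """
    value, count = Counter([r1, r2, r3]).most_common(1)[0]
    if count < 2:
        raise ValueError("no majority among the three results")
    return value


def dmr_check(r1, r2):
    """Model of conventional DMR: True if the results agree, False if a restart is needed."""
    return r1 == r2


# One module is hit by an upset: TMR masks the error, DMR can only detect it.
masked = vote_2_of_3(42, 42, 7)
agree = dmr_check(42, 7)
```

The model shows why dual modular redundancy needs a separate recovery path: a mismatch says that something went wrong but not which module is right.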
Disclosure of Invention
The purpose of this application is to ensure the accuracy of the system's calculation results, reduce the time the system spends on data computation, reduce the system's hardware consumption, and improve the efficiency and performance of the whole computing system.
The technical scheme of the first aspect of the application is as follows: a fault correction device based on data flow driven computation is provided, the device including: a data storage unit, two calculation units, a judging unit, a data saving unit and an analysis module. The data storage unit is arranged at the input end of the fault correction device and is used for simultaneously sending the corresponding original data to the two calculation units according to the data address. The input ends of the two calculation units are each connected to the output end of the data storage unit, and the calculation units are used for performing mirror image calculation on the original data. The two input ends of the judging unit are connected to the output ends of the two calculation units respectively; the judging unit is used for sending the mirror image calculation result to the data saving unit when it judges that the two mirror image calculation results are equal, and for generating a fault breakpoint address and sending it to the analysis module when it judges that the two results are not equal. The data saving unit is connected to the first output end of the judging unit and is used for generating a storage address according to the data address and storing the mirror image calculation result of the first calculation unit at that storage address. The input end of the analysis module is connected to the second output end of the judging unit, and its output end is connected to the input end of the data storage unit; the analysis module is used for obtaining the data address from the fault breakpoint address and sending the data address to the data storage unit.
In any of the above technical solutions, further, the parsing module specifically includes: a breakpoint storage unit and an address resolution unit; the breakpoint storage unit is arranged at the input end of the analysis module and used for generating a breakpoint list according to the fault breakpoint address; the address analysis unit is arranged at the output end of the analysis module and used for analyzing the breakpoint list and sending the data address to the data storage unit.
In any one of the above technical solutions, further, the device further includes a data masking unit. The data masking unit is arranged between the first output end of the judging unit and the data saving unit, and is used for sending a low-level enable signal to the data saving unit when the two mirror image calculation results are judged to be unequal.
In any of the above technical solutions, further, the computing unit is a butterfly operator, and the butterfly operator includes: a first selection switch unit, two radix-2 butterfly operators, a first complex adder and a second complex adder.
The technical scheme of the second aspect of the application is as follows: a fault correction method based on data flow driven computation is provided, the method comprising: step S10, two identical calculation units compute the original data according to the data address to obtain a first calculation result and a second calculation result respectively; step S20, judging whether the first calculation result is equal to the second calculation result, and performing step S30 if so, or step S40 if not; step S30, storing the first calculation result; step S40, generating a breakpoint list according to the data address corresponding to the first calculation result; step S50, recovering the data address from the breakpoint list and performing step S10 again.
In any one of the above technical solutions, further, step S30 specifically includes: step S31, generating a storage address according to the data address corresponding to the first calculation result; in step S32, a calculation result storage table is generated based on the storage address and the first calculation result.
In any one of the above technical solutions, further, step S50 specifically includes: step S51, acquiring a fault breakpoint address in the breakpoint list; step S52, acquiring a corresponding data address according to the fault breakpoint address; step S53, according to the preset period, sending the original data corresponding to the data address according to the data address, and executing step S10.
In any one of the above technical solutions, further, step S50 specifically further includes: step S54, when the first calculation result is equal to the second calculation result, generating an insertion storage address according to the data address corresponding to the first calculation result; in step S55, the recalculated first calculation result is inserted into the calculation result storage table based on the insertion storage address and the calculation result storage table.
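The flow of steps S10 to S55 can be sketched as a small software model. This is a minimal sketch under stated assumptions, not the hardware described in the application: the storage address is assumed to equal the data address, the `max_passes` bound is added so the loop terminates, and the names `run_with_fault_correction` and `make_flaky` are hypothetical.

```python
def run_with_fault_correction(raw_data, unit_a, unit_b, max_passes=10):
    """Model of steps S10-S55.

    raw_data: dict mapping data address -> original data.
    unit_a, unit_b: the two mirror calculation units (callables).
    Returns the calculation result storage table as a dict keyed by address.
    """
    results = {}               # calculation result storage table
    pending = list(raw_data)   # data addresses still to be computed
    for _ in range(max_passes):
        breakpoints = []       # breakpoint list built in S40
        for addr in pending:
            first = unit_a(raw_data[addr])    # S10: first calculation result
            second = unit_b(raw_data[addr])   # S10: second calculation result
            if first == second:               # S20: compare
                results[addr] = first         # S30, or S54-S55 on a recomputation pass
            else:
                breakpoints.append(addr)      # S40: record the faulty address
        if not breakpoints:
            return dict(sorted(results.items()))
        pending = breakpoints  # S50-S53: recover the addresses and recompute them
    raise RuntimeError("faults persisted after max_passes passes")


def make_flaky(fail_on):
    """Hypothetical unit that returns a wrong square once for each value in fail_on."""
    seen = set()

    def unit(x):
        if x in fail_on and x not in seen:
            seen.add(x)
            return x * x + 1   # simulated single event upset
        return x * x

    return unit


data = {0: 1, 1: 2, 2: 3}
table = run_with_fault_correction(data, lambda x: x * x, make_flaky({2}))
```

In this model the stream is never interrupted: a mismatch only records an address, and the recomputed result is written into the table at its own address afterwards.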
The beneficial effects of this application are as follows: a fault breakpoint address is generated by comparing the mirror image calculation results, and after the data flow driven computation is completed, the fault breakpoint address is analyzed and the corresponding original data is recalculated. While accuracy is guaranteed, this helps reduce the time consumed by data flow driven computation on the chip, reduces the chip's resource consumption and manufacturing cost, improves the chip's processing speed, helps reduce the probability of a single event upset fault, and reduces the probability that the data flow driven computation is interrupted.
By providing the first selection switch unit, the two radix-2 butterfly operators, the first complex adder and the second complex adder, the butterfly operator can be reconfigured: the two radix-2 butterfly operators are reconfigured into one radix-3 butterfly operator, which improves the utilization of the electronic elements. Computing with the radix-3 butterfly operator helps reduce the possibility of introducing noise during computation and reduces the computation delay.
Drawings
The above and/or additional aspects and advantages of the present application will become apparent and readily appreciated from the following description of the embodiments, taken in conjunction with the accompanying drawings of which:
FIG. 1 is a schematic block diagram of a fault correction apparatus based on data flow driven computation according to an embodiment of the present application;
FIG. 2 is a circuit diagram of a butterfly operator according to an embodiment of the present application;
FIG. 3 is a circuit diagram of a radix-2 butterfly operator, according to one embodiment of the present application;
FIG. 4 is a circuit diagram of a radix-3 butterfly according to one embodiment of the present application;
FIG. 5 is a schematic diagram of an address resolution unit storing data according to an embodiment of the present application;
FIG. 6 is a schematic flow chart diagram of a fault correction method based on data flow driven computation according to an embodiment of the present application;
FIG. 7 is a schematic timing diagram of fault correction for data stream driven computing according to one embodiment of the present application.
Detailed Description
In order that the above objects, features and advantages of the present application can be more clearly understood, the present application will be described in further detail with reference to the accompanying drawings and detailed description. It should be noted that the embodiments and features of the embodiments of the present application may be combined with each other without conflict.
In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present application, however, the present application may be practiced in other ways than those described herein, and therefore the scope of the present application is not limited by the specific embodiments disclosed below.
Embodiment one:
The fault correction device of this embodiment is described below with reference to FIG. 1 to FIG. 5.
As shown in FIG. 1, the fault correction device based on data flow driven computation in this embodiment includes: a data storage unit 10, two calculation units, a judging unit 30, a data saving unit 40 and an analysis module 50. The data storage unit 10 is arranged at the input end of the fault correction device and is used for simultaneously sending the corresponding original data to the two calculation units according to the data address.
Specifically, the data storage unit 10 consists of a plurality of random access memories (RAMs) and, driven by data addresses, sends original data to the two calculation units. That is, a data list is stored in the data storage unit 10; the data storage unit 10 retrieves the original data corresponding to each data address in the data list and then sends that original data to the two calculation units.
In this embodiment, the input ends of the two calculating units are respectively connected to the output end of the data storage unit 10, and the calculating units are used for performing mirror image calculation on the original data;
Specifically, the two calculation units are a first calculation unit 21 and a second calculation unit 22, which are mirror calculation units with the same structure and function. Having both units compute the same data verifies whether either calculation unit has been affected by a single event upset.
Further, the calculation unit is a butterfly operator, and the butterfly operator includes: a first selection switch unit, two radix-2 butterfly operators, a first complex adder and a second complex adder;
the first selection switch unit is used for selecting a corresponding conduction mode according to the operation instruction;
Specifically, as shown in FIG. 2, the first selection switch unit consists of seven two-to-one selectors: selector 14, selector 16, selector 17, selector 18, selector 19, selector 20 and selector 21. Taking selector 14 as an example: when the operation instruction supplies the instruction "01" to selector 14, which corresponds to the radix-2 butterfly operator, the conduction mode of selector 14 is to pass the input at input end A2 of the second radix-2 butterfly operator to complex adder 8. When the operation instruction supplies the instruction "10" to selector 14, which corresponds to the radix-3 butterfly operator, the conduction mode of selector 14 is to pass the operation result of complex adder 7 (the third complex adder of the first radix-2 butterfly operator) to complex adder 8.
The radix-2 butterfly operator includes a first complex multiplier, a third complex adder and a complex subtracter. The first input end of the radix-2 butterfly operator is connected, through the first selection switch unit, to the first input end of the third complex adder and to the first input end of the complex subtracter. The output end of the third complex adder is connected to the first output end of the radix-2 butterfly operator through the first selection switch unit, and the output end of the complex subtracter is connected to the second output end of the radix-2 butterfly operator through the first selection switch unit. The second input end of the radix-2 butterfly operator and the twiddle factor are each connected to the input end of the first complex multiplier, whose output end is connected to the second input end of the third complex adder and to the second input end of the complex subtracter. The radix-2 butterfly operator performs a radix-2 butterfly operation on the data to be operated on according to the corresponding conduction mode of the first selection switch unit.
Specifically, as shown in FIG. 3, input end A1 of the radix-2 butterfly operator (the first radix-2 butterfly operator), that is, its first input end, is connected to complex adder 7 (the third complex adder) and to selector 17. Selector 17 passes the corresponding data to be operated on to complex subtracter 10, where complex subtracter 10 is a complex adder that performs complex subtraction. The output end of complex adder 7 is connected through selector 18 to the first output end X1 of the radix-2 butterfly operator, and the output end of complex subtracter 10 is connected through selector 19 to the second output end X2 of the radix-2 butterfly operator. The second input end B and the twiddle factor input W1 of the radix-2 butterfly operator are connected to complex multiplier 1 (the first complex multiplier), which sends the computed product to complex adder 7 and to complex subtracter 10. Let the digital signal to be computed at input end A1 be $A_1' = a_{r1} + j\,a_{i1}$, the digital signal to be computed at input end B be $B' = b_r + j\,b_i$, and the twiddle factor be $W_1'$. The corresponding output results are then:

$$X_1 = A_1' + B' W_1' = \left[a_{r1} + (b_r w_{r1} - b_i w_{i1})\right] + j\left[a_{i1} + (b_i w_{r1} + b_r w_{i1})\right]$$

$$X_2 = A_1' - B' W_1' = \left[a_{r1} - (b_r w_{r1} - b_i w_{i1})\right] + j\left[a_{i1} - (b_i w_{r1} + b_r w_{i1})\right]$$

where $W_1' = w_{r1} + j\,w_{i1}$ and $W_2' = w_{r2} + j\,w_{i2}$.
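The radix-2 outputs X1 and X2 given above can be checked numerically. The sketch below is a software illustration rather than the circuit itself; the function names are hypothetical, and the comments map each operation to the element numbers used in the text.

```python
def radix2_butterfly(a, b, w):
    """Return (X1, X2) = (a + b*w, a - b*w) for complex a, b and twiddle factor w."""
    product = b * w                    # complex multiplier 1
    return a + product, a - product    # complex adder 7, complex subtracter 10


def radix2_butterfly_parts(ar, ai, br, bi, wr, wi):
    """The same butterfly written out in real and imaginary parts."""
    pr = br * wr - bi * wi             # real part of B*W
    pi = bi * wr + br * wi             # imaginary part of B*W
    return (ar + pr, ai + pi), (ar - pr, ai - pi)


x1, x2 = radix2_butterfly(1 + 2j, 3 - 1j, 0.5 + 0.5j)
(x1r, x1i), (x2r, x2i) = radix2_butterfly_parts(1, 2, 3, -1, 0.5, 0.5)
```

Both functions give the same values, which confirms that the expanded real and imaginary terms agree with the compact complex form.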
Complex adder 9 is a complex adder that performs complex subtraction. Input end A2 of the second radix-2 butterfly operator is connected in a manner similar to input end A1, and the second input end C of the second radix-2 butterfly operator is connected in a manner similar to the second input end B; the details are not repeated here.
The two radix-2 butterfly operators are further configured to form, together with the first complex adder and the second complex adder, a radix-3 butterfly operator according to the corresponding conduction mode of the first selection switch unit. The radix-3 butterfly operator includes a second complex multiplier, a third complex multiplier, a complex adder group, the first complex adder and the second complex adder, connected as follows.
The first input end of the radix-3 butterfly operator is connected to the first output end of the radix-3 butterfly operator through the complex adder group and the first selection switch unit. The first input end is also connected to the first input end of the first complex adder, whose output end is connected to the second output end of the radix-3 butterfly operator through the first selection switch unit, and to the first input end of the second complex adder, whose output end is connected to the third output end of the radix-3 butterfly operator through the first selection switch unit.
The second input end of the radix-3 butterfly operator is connected to the input end of the second complex multiplier. The first output end of the second complex multiplier is connected to the first output end of the radix-3 butterfly operator through the complex adder group and the first selection switch unit; its second output end is connected to the second input end of the first complex adder through the shift calculation unit, the first selection switch unit and the complex adder group; and its third output end is connected to the second input end of the second complex adder through the shift calculation unit, the first selection switch unit and the complex adder group.
The third input end of the radix-3 butterfly operator is connected to the input end of the third complex multiplier. The first output end of the third complex multiplier is connected to the first output end of the radix-3 butterfly operator through the complex adder group and the first selection switch unit; its second output end is connected to the second input end of the first complex adder through the shift calculation unit and the complex adder group; and its third output end is connected to the second input end of the second complex adder through the shift calculation unit, the first selection switch unit and the complex adder group.
The radix-3 butterfly operator performs a radix-3 butterfly operation on the data to be operated on. The second complex multiplier is the first complex multiplier of one radix-2 butterfly operator, the third complex multiplier is the first complex multiplier of the other radix-2 butterfly operator, and the complex adder group comprises the third complex adders and the complex subtracters of the two radix-2 butterfly operators.
Specifically, as shown in FIG. 4, the second complex multiplier is complex multiplier 1, the third complex multiplier is complex multiplier 2, the first complex adder is complex adder 11, the second complex adder is complex adder 12, and the complex adder group includes complex adder 7, complex adder 8, complex adder 9 and complex adder 10.
The first input end of the radix-3 butterfly operator is connected to the first output end of the radix-3 butterfly operator through the complex adder group and the first selection switch unit. The first input end is also connected to the first input end of the first complex adder, whose output end is connected to the second output end of the radix-3 butterfly operator through the first selection switch unit, and to the first input end of the second complex adder, whose output end is connected to the third output end of the radix-3 butterfly operator through the first selection switch unit;
Specifically, as shown in FIG. 4, input end A1 (the first input end of the radix-3 butterfly operator) is connected to the first input end of complex adder 7, the first input end of complex adder 11 (the first complex adder) and the first input end of complex adder 12 (the second complex adder). The output end of complex adder 7 is connected to the first input end of complex adder 8 through selector 14, and the output end of complex adder 8 is connected through selector 18 to output end X1 (the first output end of the radix-3 butterfly operator). The output end of complex adder 11 is connected through selector 19 to output end X2 (the second output end of the radix-3 butterfly operator), and the output end of complex adder 12 is connected through selector 20 to output end X3 (the third output end of the radix-3 butterfly operator).
The second input end of the radix-3 butterfly operator is connected with the input end of a second complex multiplier, the first output end of the second complex multiplier is connected with the first output end of the radix-3 butterfly operator through a complex adder group and a first selection switch unit, the second output end of the second complex multiplier is connected with the second input end of the first complex adder through a shift calculation unit 103, a first selection switch unit and a complex adder group, and the third output end of the second complex multiplier is connected with the second input end of the second complex adder through the shift calculation unit 103, the first selection switch unit and the complex adder group;
specifically, as shown in fig. 4, the input terminal B (the second input terminal of the radix-3 butterfly operator) is connected to the complex multiplier 1, and the first output terminal of the complex multiplier 1 is connected to the second input terminal of the complex adder 7. The second output terminal of the complex multiplier 1 is connected to the input terminal of the selector 16 through the complex multiplier 3 and the shift register 22, the output terminal of the selector 16 is connected to the first input terminal of the complex adder 9, and the output terminal of the complex adder 9 is connected to the second input terminal of the complex adder 11. The third output terminal of the complex multiplier 1 is connected to the first input terminal of the complex adder 10 through the complex multiplier 4 and the shift register 23, and the output terminal of the complex adder 10 is connected to the second input terminal of the complex adder 12.
The third input end of the radix-3 butterfly operator is connected to the input end of the third complex multiplier. The first output end of the third complex multiplier is connected to the first output end of the radix-3 butterfly operator through the complex adder group and the first selection switch unit; its second output end is connected to the second input end of the first complex adder through the shift calculation unit 103 and the complex adder group; and its third output end is connected to the second input end of the second complex adder through the shift calculation unit 103, the first selection switch unit and the complex adder group;
specifically, as shown in fig. 4, the input terminal C (the third input terminal of the radix-3 butterfly operator) is connected to the complex multiplier 2, the first output terminal of the complex multiplier 2 is connected to the second input terminal of the complex adder 8, the second output terminal of the complex multiplier 2 is connected to the second input terminal of the complex adder 9 through the complex multiplier 5 and the shift register 24, the third output terminal of the complex multiplier 2 is connected to the selector 17 through the complex multiplier 6 and the shift register 25, and the output terminal of the selector 17 is connected to the second input terminal of the complex adder 10.
At this time, let the data to be operated on at input terminal A1 be $A_1 = a_{r1} + a_{i1}j$, the data to be operated on at input terminal B be $B = b_r + b_i j$, and the data to be operated on at input terminal C be $C = c_r + c_i j$. The corresponding output result is:
Configuring the calculation unit as a butterfly operator widens the range of data that the calculation unit can process and increases the flexibility of digital signal processing. In particular, when performing a discrete Fourier transform (DFT), it allows a DFT with a large number of points to be converted into DFTs with small numbers of points, which reduces the computational complexity.
Providing the first selection switch unit, the two radix-2 butterfly operators, the first complex adder and the second complex adder makes the butterfly operator reconfigurable: the two radix-2 butterfly operators can be reconfigured into one radix-3 butterfly operator, which improves the utilization of the electronic components. Calculating with the radix-3 butterfly operator helps reduce the possibility of introducing noise during calculation and reduces the calculation delay.
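As an illustration of converting a large-point DFT into small-point DFTs, the following Python sketch applies a textbook radix-3 decimation-in-time recursion to an input whose length is a power of 3. It is a sketch under stated assumptions only: the function names (`dft`, `radix3_butterfly`, `fft_radix3`) are hypothetical, and the butterfly is the standard 3-point DFT, not the specific multiplier and adder circuit of fig. 4.

```python
import cmath


def dft(x):
    """Direct O(n^2) DFT, used here only as a reference."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def radix3_butterfly(a, b, c):
    """Standard 3-point DFT of the inputs (a, b, c)."""
    w = cmath.exp(-2j * cmath.pi / 3)  # primitive third root of unity
    w2 = w * w
    return (a + b + c, a + w * b + w2 * c, a + w2 * b + w * c)


def fft_radix3(x):
    """Radix-3 decimation-in-time FFT; len(x) must be a power of 3."""
    n = len(x)
    if n == 1:
        return list(x)
    if n == 0 or n % 3 != 0:
        raise ValueError("length must be a power of 3")
    m = n // 3
    # Three smaller DFTs over the samples with index = 0, 1, 2 (mod 3).
    subs = [fft_radix3(x[r::3]) for r in range(3)]
    out = [0j] * n
    for k in range(m):
        t1 = cmath.exp(-2j * cmath.pi * k / n)  # twiddle factor W_n^k
        t2 = t1 * t1                            # twiddle factor W_n^(2k)
        y0, y1, y2 = radix3_butterfly(subs[0][k], t1 * subs[1][k], t2 * subs[2][k])
        out[k], out[k + m], out[k + 2 * m] = y0, y1, y2
    return out
```

A 9-point transform, for example, is computed here as three 3-point transforms followed by three twiddled radix-3 butterflies.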
In this embodiment, two input ends of the determining unit 30 are respectively connected to output ends of the two calculating units, the determining unit 30 is configured to send the mirror image calculation result to the data saving unit 40 when it is determined that the two mirror image calculation results are equal, and the determining unit 30 is further configured to generate and send a fault breakpoint address to the parsing module 50 when it is determined that the two mirror image calculation results are not equal;
specifically, since the first calculating unit 21 and the second calculating unit 22 have the same structure and function and receive the same original data, their calculation results are identical when neither unit is affected by a single event upset. In that case, the judging unit 30 judges that the two mirror image calculation results are equal and sends the mirror image calculation result of the first calculating unit 21 or the second calculating unit 22 to the data saving unit 40 for storage.
When the first calculating unit 21 and/or the second calculating unit 22 is affected by a single event upset, the corresponding mirror image calculation result deviates, and the judging unit 30 judges that the two mirror image calculation results are not equal. The judging unit 30 then obtains the data address of the corresponding original data according to the mirror image calculation result of the first calculating unit 21 or the second calculating unit 22, generates a fault breakpoint address according to that data address, and sends the fault breakpoint address to the parsing module 50 for storage.
Preferably, when it is determined that the two mirror calculation results are equal, the determination unit 30 sends the mirror calculation result of the first calculation unit 21 to the data holding unit 40.
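The behaviour of the judging unit 30 can be sketched in Python as a simple comparison of the two mirror image calculation results. This is a minimal illustration, not the hardware design: the names `Verdict` and `judge` are hypothetical, and the sketch simply carries the data address forward on a mismatch, leaving open how the fault breakpoint address is derived from it.

```python
from typing import NamedTuple, Optional


class Verdict(NamedTuple):
    store_result: Optional[complex]    # result forwarded to the data saving unit 40
    fault_data_address: Optional[int]  # data address reported to the parsing module 50


def judge(result_1, result_2, data_address):
    """Compare the two mirror image results for one item of original data."""
    if result_1 == result_2:
        # Equal results: forward the first unit's result for storage.
        return Verdict(store_result=result_1, fault_data_address=None)
    # Unequal results: nothing is stored; the data address is reported so
    # that the same original data can be calculated again later.
    return Verdict(store_result=None, fault_data_address=data_address)
```

Only the first unit's result is forwarded because, when both results agree, either one is equally valid.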
In this embodiment, the data saving unit 40 is connected to the first output end of the judging unit 30, and the data saving unit 40 is configured to generate a storage address according to the data address and store the mirror image calculation result of the first calculating unit according to the storage address;
further, the data saving unit 40 is a random access memory;
specifically, the data saving unit 40 is a RAM. When the judging unit 30 sends a mirror image calculation result to the data saving unit 40, the data saving unit 40 stores the received result at addresses that advance according to an incrementing rule. When a mirror image calculation result is to be output or looked up, only the corresponding address in the RAM needs to be obtained, which helps reduce the delay of data lookup.
Further, the fault correction apparatus further includes: the data masking unit 60; the data masking unit 60 is disposed between the first output end of the determining unit 30 and the data saving unit 40, and the data masking unit 60 is configured to send a low-level enable signal to the data saving unit 40 when it is determined that the two mirror image calculation results are not equal.
Specifically, the data saving unit 40 is a RAM that is written only when its enable signal is at a high level. When it is determined that the two mirror image calculation results are not equal, the data masking unit 60 sends a low-level enable signal to the data saving unit 40, so that no data can be written to the corresponding storage address of the RAM. The mirror image calculation result at this time is thus masked, which ensures the accuracy of the mirror image calculation results stored in the data saving unit 40.
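The masking mechanism can be modelled in Python as a RAM whose write port is gated by an enable signal. This is a hedged sketch: the class `WriteEnableRam` and the function `masked_store` are hypothetical names, and a boolean stands in for the high-level and low-level enable signal.

```python
class WriteEnableRam:
    """Toy model of a RAM that accepts a write only when enable is high."""

    def __init__(self, size):
        self.cells = [None] * size

    def write(self, address, value, enable):
        # enable=True models a high-level signal, enable=False a low-level one.
        if enable:
            self.cells[address] = value
            return True
        return False


def masked_store(ram, storage_address, result_1, result_2):
    """Model of the masking unit: drive enable low when the results differ."""
    enable = result_1 == result_2
    return ram.write(storage_address, result_1, enable)
```

With this gating, a mismatching result never reaches the RAM, so the cell at that storage address stays empty until a corrected result is available.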
In this embodiment, an input end of the parsing module 50 is connected to the second output end of the determining unit 30, an output end of the parsing module 50 is connected to an input end of the data storage unit 10, and the parsing module 50 is configured to obtain a data address according to the fault breakpoint address and send the data address to the data storage unit 10.
Further, the parsing module 50 specifically includes: a breakpoint storage unit 51 and an address resolution unit 52; the breakpoint storage unit 51 is arranged at the input end of the parsing module 50, and the breakpoint storage unit 51 is configured to generate a breakpoint list according to the fault breakpoint address;
specifically, the breakpoint storage unit 51 is composed of a plurality of registers, each of which can store one fault breakpoint address, thereby forming the breakpoint list, so that the fault breakpoint address in the corresponding register can be called according to the breakpoint list.
The address resolution unit 52 is disposed at an output end of the parsing module 50, and the address resolution unit 52 is configured to resolve the breakpoint list and send the data address to the data storage unit 10.
Further, the address resolution unit 52 is a random access memory.
Specifically, the address resolution unit 52 is an address synchronizer composed of a RAM, into which the same data addresses as those in the data list are written in order. As shown in fig. 5, AD is the address at which data is stored in the address resolution unit 52, and D is the data stored there, which is the same data address as in the data list. The breakpoint list in the breakpoint storage unit 51 is equivalent to a pointer list: the address resolution unit 52 resolves the corresponding data address according to a fault breakpoint address in the breakpoint list and then sends the data address to the data storage unit 10, so that the data storage unit 10 sends the corresponding original data to the calculation units according to the data address.
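The pointer-list relationship between the breakpoint storage unit 51 and the address resolution unit 52 can be sketched in Python as follows. The sketch rests on assumptions that should be checked against fig. 5: the class name `ParsingModule` and its methods are hypothetical, and the offset of 1 between a fault breakpoint address and the data address it resolves to is taken from the timing described for the method (g(k+1) holding a(k)).

```python
class ParsingModule:
    """Toy model: breakpoint registers plus an address-resolution RAM."""

    def __init__(self, data_addresses, offset=1):
        # Address-resolution RAM: cell AD holds data address D.
        # Assumption: AD = D + offset, following g(k+1) -> a(k).
        self.resolution_ram = {d + offset: d for d in data_addresses}
        self.breakpoint_list = []  # one register per fault breakpoint address

    def record(self, fault_breakpoint_address):
        """Store a fault breakpoint address in the next free register."""
        self.breakpoint_list.append(fault_breakpoint_address)

    def resolve_all(self):
        """Dereference every stored breakpoint into its data address."""
        resolved = [self.resolution_ram[g] for g in self.breakpoint_list]
        self.breakpoint_list.clear()
        return resolved
```

The returned data addresses are what would be sent back to the data storage unit 10 so that the affected original data can be calculated again.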
Example two:
the following describes a failure correction method in the present embodiment with reference to fig. 6 to 7.
As shown in fig. 6, a method for correcting a fault based on data flow driving calculation in the present embodiment includes:
step S10, calculating original data by two same calculating units according to the data address to respectively obtain a first calculating result and a second calculating result;
specifically, a crossbar control instruction is generated according to a received calculation instruction, and a calculation unit conforming to the calculation instruction is controlled to be generated, so that the original data is calculated, and a first calculation result and a second calculation result are obtained.
A step S20 of determining whether the first calculation result is equal to the second calculation result, and when the determination is yes, performing a step S30, and when the determination is no, performing a step S40;
step S30, storing the first calculation result;
in step S30, the method specifically includes:
step S31, generating a storage address according to the data address corresponding to the first calculation result;
in step S32, a calculation result storage table is generated based on the storage address and the first calculation result.
Specifically, as shown in fig. 7, the preset calculation duration consists of N unit calculation durations of equal length. In each unit calculation duration, the corresponding calculation result can be computed from the original data, and the numbers of the addresses in the chip are determined by the serial number of the unit calculation duration; that is, the data address, the storage address of the calculation result, and the fault breakpoint address corresponding to the k-th unit calculation duration are all numbered k. Since the address numbers in the chip increase sequentially, once a unit calculation duration has ended, the calculation result r1(k) corresponding to the original data s(k) is stored at storage address number k+1 in the calculation result storage table, which is 1 greater than the number k of the data address of the original data; that is, the storage address is generated by increasing the address number of the data address of the original data by 1.
For example, for the second unit calculation time length, the corresponding data address is a (1), the corresponding original data is s (1), the corresponding first calculation result is r1(1), the second calculation result is r2(1), and r1(1) and r2(1) are stored in the storage address corresponding to the third unit calculation time length.
Step S40, generating a breakpoint list according to the data address corresponding to the first calculation result;
specifically, the number of the fault breakpoint address in the breakpoint list is likewise 1 greater than the number of the data address of the corresponding original data. As shown in fig. 7, when it is determined in the (k+1)-th unit calculation duration that the first calculation result r1(k) is not equal to the second calculation result r2(k), the data address a(k) corresponding to the first calculation result r1(k) is stored at the fault breakpoint address g(k+1) corresponding to the (k+1)-th unit calculation duration in the breakpoint list.
Before step S40, the method further includes: generating a data masking instruction.
Specifically, as shown in fig. 7, when it is determined in the (k+1)-th unit calculation duration that the first calculation result r1(k) is not equal to the second calculation result r2(k), a data masking instruction is generated to mask the storage space corresponding to the storage address b(k+1), so that no data is stored there.
In step S50, the data address is restored according to the breakpoint list, and step S10 is performed.
In step S50, the method specifically includes:
step S51, acquiring a fault breakpoint address in the breakpoint list;
step S52, acquiring a data address corresponding to the fault breakpoint address according to the fault breakpoint address;
step S53, according to the preset period, sending the original data corresponding to the data address according to the data address, and executing step S10.
Specifically, as shown in fig. 7, after the preset calculation duration is over (the N-th unit calculation duration), the fault breakpoint address g(k+1) is obtained from the breakpoint list and resolved to obtain the corresponding data address a(k); the original data s(k) corresponding to the data address a(k) is then sent to the two identical calculation units, and step S10 is executed again.
Step S50 further includes:
step S54, when the recalculated first calculation result is equal to the recalculated second calculation result, generating an insertion storage address according to the data address corresponding to the recalculated first calculation result;
in step S55, the recalculated first calculation result is inserted into the calculation result storage table based on the insertion storage address and the calculation result storage table.
Specifically, as shown in fig. 7, at the (N+1)-th time, when it is determined that the recalculated first calculation result r'1(k) and the recalculated second calculation result r'2(k) are equal, an insertion storage address corresponding to the storage address b(k+1) is generated. According to the generated insertion storage address, the recalculated first calculation result r'1(k) is stored at the position corresponding to the storage address b(k+1); r'1(k) is then the fault-corrected mirror image calculation result corresponding to the original data s(k).
A comparison test was carried out between the technical solution of the present application and two existing technical solutions; the results are shown in Table 1.
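Steps S10 to S55 can be simulated end to end with a short Python sketch. It is an illustration under stated assumptions rather than the patented implementation: the function name `run_with_correction` and the `faulty_passes` fault-injection parameter are hypothetical, a single event upset is modelled by adding 1 to the second unit's result, and each loop iteration stands for one pass over the pending data addresses rather than for individual unit calculation durations.

```python
def run_with_correction(original_data, compute, faulty_passes=()):
    """Dual-unit calculation with breakpoint-based recalculation.

    original_data: list where original_data[k] is s(k) at data address a(k).
    compute:       the calculation performed by both identical units.
    faulty_passes: set of (pass_index, k) pairs at which the second unit
                   is upset (hypothetical fault-injection hook).
    """
    result_table = {}                         # storage address b(k+1) -> result
    pending = list(range(len(original_data)))  # data addresses still to process
    pass_index = 0
    while pending:
        breakpoint_list = []
        for k in pending:
            r1 = compute(original_data[k])    # S10: first calculation unit
            r2 = compute(original_data[k])    # S10: second calculation unit
            if (pass_index, k) in faulty_passes:
                r2 = r2 + 1                   # injected single event upset
            if r1 == r2:                      # S20: compare the two results
                result_table[k + 1] = r1      # S30-S32, or S54-S55 on a retry
            else:
                breakpoint_list.append(k + 1)  # S40: record g(k+1); write masked
        # S50-S53: resolve each fault breakpoint address back to a(k).
        pending = [g - 1 for g in breakpoint_list]
        pass_index += 1
    return result_table
```

For example, with `original_data = [1, 2, 3, 4]`, a squaring `compute`, and an upset injected at `(0, 2)`, the first pass leaves storage address 3 empty and the second pass inserts the recalculated result there.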
TABLE 1
Compared with the existing triple modular redundancy fault correction method, the present method needs only 2 calculation units while maintaining the accuracy and the proportion of time consumed, which reduces resource consumption in the chip, helps reduce the chip size, and lowers the manufacturing cost of the chip.
Compared with the traditional hardware dual-mode redundancy correction method, on the premise of ensuring that the hardware resource consumption and the accuracy are the same, the method reduces the time consumption in the calculation process, improves the processing speed of the chip, is beneficial to reducing the probability of generating single event upset faults, and reduces the interruption probability of data flow driving calculation.
The steps in the present application may be reordered, combined, and omitted according to actual requirements.
The units in the device can be merged, divided and deleted according to actual requirements.
Although the present application has been disclosed in detail with reference to the accompanying drawings, it is to be understood that such description is merely illustrative and not restrictive of the application of the present application. The scope of the present application is defined by the appended claims and may include various modifications, adaptations, and equivalents of the invention without departing from the scope and spirit of the application.
Claims (6)
1. A device for fault correction based on data stream driven computation, the device comprising: a data storage unit (10), two calculation units, a judgment unit (30), a data storage unit (40) and an analysis module (50);
the data storage unit (10) is arranged at the input end of the fault correction device, and the data storage unit (10) is used for simultaneously sending corresponding original data to the two calculation units according to data addresses; wherein, the computing unit is a butterfly arithmetic unit;
the input ends of the two computing units are respectively connected to the output end of the data storage unit (10), and the computing units are used for performing mirror image computation on the original data;
the two input ends of the judging unit (30) are respectively connected to the output ends of the two computing units, the judging unit (30) is used for sending the mirror image computing result to the data storage unit (40) when judging that the two mirror image computing results are equal, and the judging unit (30) is also used for generating and sending a fault breakpoint address to the analysis module (50) when judging that the two mirror image computing results are not equal;
the data storage unit (40) is connected to a first output end of the judging unit (30), and the data storage unit (40) is used for generating a storage address according to the data address and storing the mirror image calculation result of the first calculating unit according to the storage address;
the input end of the analysis module (50) is connected to the second output end of the judgment unit (30), the output end of the analysis module (50) is connected to the input end of the data storage unit (10), the analysis module (50) is configured to obtain the data address according to the fault breakpoint address and send the data address to the data storage unit (10), wherein the analysis module (50) specifically includes: a breakpoint storage unit (51) and an address resolution unit (52);
the breakpoint storage unit (51) is arranged at an input end of the analysis module (50), and the breakpoint storage unit (51) is used for generating a breakpoint list according to the fault breakpoint address;
the address resolution unit (52) is arranged at an output end of the resolution module (50), and the address resolution unit (52) is used for resolving the breakpoint list and sending the data address to the data storage unit (10).
2. The apparatus for fault correction based on data-flow driven computation of claim 1, further comprising: a data masking unit (60);
the data shielding unit (60) is disposed between the first output end of the judging unit (30) and the data saving unit (40), and the data shielding unit (60) is configured to send a low level enable signal to the data saving unit (40) when it is determined that the two mirror image calculation results are not equal.
3. The apparatus for correcting a failure based on data flow driven computation of claim 1, wherein the computation unit is a butterfly operator, the butterfly operator comprising: a first selection switch unit, two radix-2 butterfly operators, a first complex adder and a second complex adder.
4. A method for fault correction based on data stream driven computation, the method comprising:
step S10, calculating original data by two same calculating units according to the data address to respectively obtain a first calculating result and a second calculating result; wherein, the computing unit is a butterfly arithmetic unit;
a step S20 of determining whether the first calculation result is equal to the second calculation result, and when the determination is yes, performing step S30, and when the determination is no, performing step S40;
step S30, storing the first calculation result;
step S40, generating a breakpoint list according to the data address corresponding to the first calculation result;
step S50, restoring the data address according to the breakpoint list, and executing step S10, where step S50 specifically includes:
step S51, acquiring the fault breakpoint address in the breakpoint list;
step S52, acquiring the corresponding data address according to the fault breakpoint address;
and step S53, sending the original data corresponding to the data address according to a preset period, and executing the step S10.
5. The method for correcting a fault based on data flow driving calculation according to claim 4, wherein the step S30 specifically includes:
step S31, generating a storage address according to the data address corresponding to the first calculation result;
step S32, generating a calculation result storage table according to the storage address and the first calculation result.
6. The method for correcting a fault based on data flow driving calculation according to claim 4, wherein the step S50 further includes:
step S54, when it is determined that the recalculated first calculation result is equal to the recalculated second calculation result, generating an insertion storage address based on the data address corresponding to the recalculated first calculation result;
step S55, insert the recalculated first calculation result into the calculation result storage table according to the insertion storage address and the calculation result storage table.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811044090.9A CN109144793B (en) | 2018-09-07 | 2018-09-07 | Fault correction device and method based on data flow driving calculation |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811044090.9A CN109144793B (en) | 2018-09-07 | 2018-09-07 | Fault correction device and method based on data flow driving calculation |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109144793A CN109144793A (en) | 2019-01-04 |
CN109144793B true CN109144793B (en) | 2021-12-31 |
Family
ID=64823741
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201811044090.9A Expired - Fee Related CN109144793B (en) | 2018-09-07 | 2018-09-07 | Fault correction device and method based on data flow driving calculation |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109144793B (en) |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109991531B (en) * | 2019-03-28 | 2021-12-24 | 西北核技术研究所 | Method for measuring atmospheric neutron single event effect cross section under low probability condition |
CN111047034B (en) * | 2019-11-26 | 2023-09-15 | 中山大学 | On-site programmable neural network array based on multiplier-adder unit |
Family Cites Families (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8886994B2 (en) * | 2009-12-07 | 2014-11-11 | Space Micro, Inc. | Radiation hard and fault tolerant multicore processor and method for ionizing radiation environment |
CN104615510B (en) * | 2015-03-09 | 2017-01-25 | 中国科学院自动化研究所 | Programmable device-based dual-mode redundant fault-tolerant method |
CN105045766B (en) * | 2015-06-29 | 2019-07-19 | 深圳市中兴微电子技术有限公司 | Data processing method and processor based on the transformation of 3072 point quick Fouriers |
CN105320579B (en) * | 2015-10-27 | 2018-03-23 | 首都师范大学 | Towards the selfreparing dual redundant streamline and fault-tolerance approach of SPARC V8 processors |
CN105260256B (en) * | 2015-10-27 | 2018-03-23 | 首都师范大学 | A kind of fault detect of duplication redundancy streamline and backing method |
Also Published As
Publication number | Publication date |
---|---|
CN109144793A (en) | 2019-01-04 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 20211231 |