CN115459204B - Parallel HSC chip fault processing device, method, equipment and medium - Google Patents
- Publication number
- CN115459204B (application CN202211404113.9A)
- Authority
- CN
- China
- Prior art keywords
- resistor
- hsc
- chip
- fault
- switch tube
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- H—ELECTRICITY
- H02—GENERATION; CONVERSION OR DISTRIBUTION OF ELECTRIC POWER
- H02H—EMERGENCY PROTECTIVE CIRCUIT ARRANGEMENTS
- H02H3/00—Emergency protective circuit arrangements for automatic disconnection directly responsive to an undesired change from normal electric working condition with or without subsequent reconnection ; integrated protection
-
- G—PHYSICS
- G01—MEASURING; TESTING
- G01R—MEASURING ELECTRIC VARIABLES; MEASURING MAGNETIC VARIABLES
- G01R31/00—Arrangements for testing electric properties; Arrangements for locating electric faults; Arrangements for electrical testing characterised by what is being tested not provided for elsewhere
- G01R31/28—Testing of electronic circuits, e.g. by signal tracer
- G01R31/2851—Testing of integrated circuits [IC]
-
- H—ELECTRICITY
- H02—GENERATION; CONVERSION OR DISTRIBUTION OF ELECTRIC POWER
- H02H—EMERGENCY PROTECTIVE CIRCUIT ARRANGEMENTS
- H02H1/00—Details of emergency protective circuit arrangements
- H02H1/0007—Details of emergency protective circuit arrangements concerning the detecting means
-
- H—ELECTRICITY
- H02—GENERATION; CONVERSION OR DISTRIBUTION OF ELECTRIC POWER
- H02H—EMERGENCY PROTECTIVE CIRCUIT ARRANGEMENTS
- H02H3/00—Emergency protective circuit arrangements for automatic disconnection directly responsive to an undesired change from normal electric working condition with or without subsequent reconnection ; integrated protection
- H02H3/02—Details
- H02H3/04—Details with warning or supervision in addition to disconnection, e.g. for indicating that protective apparatus has functioned
- H02H3/042—Details with warning or supervision in addition to disconnection, e.g. for indicating that protective apparatus has functioned combined with means for locating the fault
-
- H—ELECTRICITY
- H02—GENERATION; CONVERSION OR DISTRIBUTION OF ELECTRIC POWER
- H02H—EMERGENCY PROTECTIVE CIRCUIT ARRANGEMENTS
- H02H3/00—Emergency protective circuit arrangements for automatic disconnection directly responsive to an undesired change from normal electric working condition with or without subsequent reconnection ; integrated protection
- H02H3/08—Emergency protective circuit arrangements for automatic disconnection directly responsive to an undesired change from normal electric working condition with or without subsequent reconnection ; integrated protection responsive to excess current
- H02H3/087—Emergency protective circuit arrangements for automatic disconnection directly responsive to an undesired change from normal electric working condition with or without subsequent reconnection ; integrated protection responsive to excess current for dc applications
-
- H—ELECTRICITY
- H02—GENERATION; CONVERSION OR DISTRIBUTION OF ELECTRIC POWER
- H02H—EMERGENCY PROTECTIVE CIRCUIT ARRANGEMENTS
- H02H3/00—Emergency protective circuit arrangements for automatic disconnection directly responsive to an undesired change from normal electric working condition with or without subsequent reconnection ; integrated protection
- H02H3/08—Emergency protective circuit arrangements for automatic disconnection directly responsive to an undesired change from normal electric working condition with or without subsequent reconnection ; integrated protection responsive to excess current
- H02H3/10—Emergency protective circuit arrangements for automatic disconnection directly responsive to an undesired change from normal electric working condition with or without subsequent reconnection ; integrated protection responsive to excess current additionally responsive to some other abnormal electrical conditions
-
- H—ELECTRICITY
- H02—GENERATION; CONVERSION OR DISTRIBUTION OF ELECTRIC POWER
- H02H—EMERGENCY PROTECTIVE CIRCUIT ARRANGEMENTS
- H02H3/00—Emergency protective circuit arrangements for automatic disconnection directly responsive to an undesired change from normal electric working condition with or without subsequent reconnection ; integrated protection
- H02H3/20—Emergency protective circuit arrangements for automatic disconnection directly responsive to an undesired change from normal electric working condition with or without subsequent reconnection ; integrated protection responsive to excess voltage
- H02H3/202—Emergency protective circuit arrangements for automatic disconnection directly responsive to an undesired change from normal electric working condition with or without subsequent reconnection ; integrated protection responsive to excess voltage for dc systems
-
- H—ELECTRICITY
- H02—GENERATION; CONVERSION OR DISTRIBUTION OF ELECTRIC POWER
- H02H—EMERGENCY PROTECTIVE CIRCUIT ARRANGEMENTS
- H02H5/00—Emergency protective circuit arrangements for automatic disconnection directly responsive to an undesired change from normal non-electric working conditions with or without subsequent reconnection
- H02H5/04—Emergency protective circuit arrangements for automatic disconnection directly responsive to an undesired change from normal non-electric working conditions with or without subsequent reconnection responsive to abnormal temperature
Landscapes
- Engineering & Computer Science (AREA)
- Computer Hardware Design (AREA)
- Microelectronics & Electronic Packaging (AREA)
- General Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Emergency Protection Circuit Devices (AREA)
- Power Sources (AREA)
Abstract
The invention relates to the field of circuit design, and in particular to a device, a method, equipment and a medium for processing faults of parallel HSC chips. The device comprises: multiple HSC chips connected in parallel, each HSC chip providing a plurality of preset fault state signals; a conversion circuit corresponding to each HSC chip, which converts the plurality of preset fault state signals into a fault alarm signal and outputs it; and a controller that analyzes the fault alarm signal output by each conversion circuit to determine which HSC chip has failed and the fault type of that chip. The scheme realizes automatic fault location and fault diagnosis, avoids the manpower and material cost of on-site analysis by research and development personnel, and improves fault location efficiency.
Description
Technical Field
The invention relates to the field of circuit design, and in particular to a device, a method, equipment and a medium for processing faults of parallel HSC chips.
Background
With the continued rise of cloud computing in recent years, internet traffic keeps increasing, which places higher requirements on the data processing and storage capacity of machine-room servers. In a traditional machine room organized by cabinets, the server computing nodes deployed in each cabinet are required to deliver ever stronger data processing capacity at ever higher deployment density. As internet user services grow, network data throughput grows as well, and data centers demand still higher server deployment density. To improve deployment density and maintainability, servers are generally deployed in the cabinet as single nodes, and each single-node server plugs directly onto a power bus bar (BUSBAR) through a power clip (POWER CLIP) connector to draw power.
As the basic data processing units of a data center, servers also carry increasingly heavy loads. In particular, the load current of the CPU chips in a server keeps growing, and the working current of a single CPU is as high as 100 to 200 A. The current at the power input of a single server node therefore keeps rising, and so does the current requirement on the hot-plug line at that input. For this reason, a hot swap unit often uses multiple Hot Swap Controller (HSC) chips connected in parallel to meet high-current requirements. Connecting multiple HSCs in parallel carries the following risks in practice. First, when a single path among the parallel HSCs triggers Over-Current Protection (OCP), Short-Circuit Protection (SCP), Over-Temperature Protection (OTP) or Over-Voltage Protection (OVP), the whole hot-plug line may be powered down, and the fault location cannot be determined quickly and accurately. Second, when multiple HSCs are used in parallel, the current through each HSC chip differs because of chip-to-chip variation and differences in the board layout; when the difference becomes large, the HSC currents are no longer shared evenly. Under prolonged high load, breakdown failure is then likely, which creates a risk of power loss or a burned board caused by the HSC chip shorting to ground. This disrupts the customer's services, and a burned board also poses a fire-safety hazard to the data center machine room. Moreover, once a board burns, research and development engineers usually have to spend resources analyzing the problem on site; because the board is burned, the failure scene is hard to reproduce, and locating the root cause of the burn is very difficult.
Fig. 1 is a schematic diagram of a traditional multi-path HSC parallel application scenario. When multiple HSC chips are connected in parallel, their input terminals Vin are joined at the POWER CLIP connector to draw power from the BUSBAR, and their output terminals Vout are joined to supply power to the load. A Baseboard Management Controller (BMC) on the server node motherboard samples the HSC output voltage to detect abnormal output-voltage states. This approach has the following drawbacks. On the one hand, when a single path among the parallel HSCs triggers OCP, SCP, OTP or OVP, fault location is difficult. On the other hand, when chip-to-chip variation and layout differences cause uneven current sharing among the HSCs, there is no automatic protection, so safety is poor and improvement is urgently needed.
Disclosure of Invention
In view of the above, it is desirable to provide a device, a method, an apparatus and a medium for processing a fault of a parallel HSC chip.
According to a first aspect of the present invention, there is provided a device for processing faults of parallel HSC chips, the device comprising:
multiple HSC chips connected in parallel, wherein each HSC chip provides a plurality of preset fault state signals;
a conversion circuit corresponding to each HSC chip, configured to convert the plurality of preset fault state signals into a fault alarm signal and output the fault alarm signal;
and a controller configured to analyze the fault alarm signal output by each conversion circuit to determine which HSC chip has failed and the fault type of that HSC chip.
In some embodiments, the conversion circuit comprises: a first switch tube, a second switch tube, a third switch tube, a fourth switch tube, a first resistor, a second resistor, a third resistor, a fourth resistor and a pull-up resistor, wherein the first, second, third and fourth resistors have different resistance values;
the first switch tube is connected in series with the first resistor to form a first branch, the second switch tube is connected in series with the second resistor to form a second branch, the third switch tube is connected in series with the third resistor to form a third branch, and the fourth switch tube is connected in series with the fourth resistor to form a fourth branch; one end of the pull-up resistor is connected to the power supply; the first, second, third and fourth branches are connected in parallel between the other end of the pull-up resistor and ground, with the first, second, third and fourth switch tubes all grounded; and the fault alarm signal is the voltage at the other end of the pull-up resistor;
the preset fault state signals comprise an overcurrent protection signal, an over-temperature protection signal, a short-circuit protection signal and an overvoltage protection signal; the overcurrent protection signal drives the first switch tube on or off, the over-temperature protection signal drives the second switch tube on or off, the short-circuit protection signal drives the third switch tube on or off, and the overvoltage protection signal drives the fourth switch tube on or off; and at most one of the first, second, third and fourth switch tubes is on at any one time.
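For this differing-resistance variant, the alarm voltage when only the k-th switch tube conducts follows from an ordinary voltage divider. The relation below is a sketch that assumes ideal switch tubes (zero on-resistance); the symbols $V_{CC}$ (supply voltage), $R_{pu}$ (pull-up resistor) and $R_k$ (resistor of the k-th branch) are introduced here for illustration and are not named in the text:

```latex
V_{\mathrm{alarm},k} \;=\; V_{CC}\,\frac{R_k}{R_{pu} + R_k}, \qquad k = 1, 2, 3, 4
```

Because the four values $R_k$ differ, the four possible alarm voltages differ as well, which is what allows the controller to tell the fault types apart from a single analog reading.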
In some embodiments, the conversion circuit comprises: a first switch tube, a second switch tube, a third switch tube, a fourth switch tube, a first resistor, a second resistor, a third resistor, a fourth resistor, a fifth resistor, a sixth resistor, a seventh resistor and a pull-up resistor, wherein the first through seventh resistors and the pull-up resistor all have the same resistance value;
the first switch tube is connected in series with the first resistor to form a first branch, the second switch tube is connected in series with the second resistor to form a second branch, the third switch tube is connected in series with the third resistor to form a third branch, and the fourth switch tube is connected in series with the fourth resistor to form a fourth branch; one end of the pull-up resistor is connected to the power supply; the first, second, third and fourth branches are connected in parallel between the other end of the pull-up resistor and ground, with the first, second, third and fourth switch tubes all grounded; the fifth resistor is connected in series between the first resistor and the second resistor, the sixth resistor is connected in series between the second resistor and the third resistor, and the seventh resistor is connected in series between the third resistor and the fourth resistor; and the fault alarm signal is the voltage at the other end of the pull-up resistor;
the preset fault state signals comprise an overcurrent protection signal, an over-temperature protection signal, a short-circuit protection signal and an overvoltage protection signal; the overcurrent protection signal drives the first switch tube on or off, the over-temperature protection signal drives the second switch tube on or off, the short-circuit protection signal drives the third switch tube on or off, and the overvoltage protection signal drives the fourth switch tube on or off; and at most one of the first, second, third and fourth switch tubes is on at any one time.
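The equal-resistance variant can be checked numerically. The sketch below rests on one reading of the wiring that is an assumption, not a statement from the text: the fifth, sixth and seventh resistors form a series ladder, so that when only switch tube k conducts, the path from the pull-up node to ground contains k - 1 ladder resistors plus the branch resistor. Switch tubes are treated as ideal, and the function name `ladder_alarm_ratio` is hypothetical.

```python
from fractions import Fraction


def ladder_alarm_ratio(k: int, r: Fraction = Fraction(1)) -> Fraction:
    """Alarm voltage as a fraction of the supply when only switch tube k conducts.

    Assumed topology (an interpretation of the description): every resistor,
    including the pull-up, has the same value r, and branch k is reached
    through k - 1 series ladder resistors.
    """
    if k not in (1, 2, 3, 4):
        raise ValueError("k must be 1, 2, 3 or 4")
    r_pullup = r
    # k - 1 ladder resistors in series with the branch resistor itself.
    r_to_ground = (k - 1) * r + r
    # Plain voltage divider between the pull-up and the path to ground.
    return r_to_ground / (r_pullup + r_to_ground)


ratios = [ladder_alarm_ratio(k) for k in range(1, 5)]
print(ratios)
```

Under these assumptions the four ratios come out as 1/2, 2/3, 3/4 and 4/5 of the supply voltage, independent of the common resistor value.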
In some embodiments, the controller is further configured to:
in response to the fault alarm signal output by a conversion circuit being equal to one half of the power supply voltage, confirm that the HSC chip corresponding to that conversion circuit has failed, the fault type being HSC chip overcurrent protection;
in response to the fault alarm signal output by a conversion circuit being equal to two thirds of the power supply voltage, confirm that the HSC chip corresponding to that conversion circuit has failed, the fault type being HSC chip over-temperature protection;
in response to the fault alarm signal output by a conversion circuit being equal to three quarters of the power supply voltage, confirm that the HSC chip corresponding to that conversion circuit has failed, the fault type being HSC chip short-circuit protection;
in response to the fault alarm signal output by a conversion circuit being equal to four fifths of the power supply voltage, confirm that the HSC chip corresponding to that conversion circuit has failed, the fault type being HSC chip overvoltage protection;
and in response to the fault alarm signal output by a conversion circuit being equal to zero, determine that the HSC chip corresponding to that conversion circuit has not failed.
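The decision rules above amount to a lookup from voltage ratio to fault type. The following is a minimal sketch of that lookup, not the patented controller firmware. The function names, the 3.3 V supply in the usage lines, the `tol` matching tolerance (a real ADC reading is never exactly equal to a fraction) and the `"UNKNOWN"` result are all assumptions added for illustration.

```python
from typing import Dict

# Ratios of alarm voltage to supply voltage, as listed in the text.
FAULT_RATIOS: Dict[str, float] = {
    "OCP": 1 / 2,  # overcurrent protection
    "OTP": 2 / 3,  # over-temperature protection
    "SCP": 3 / 4,  # short-circuit protection
    "OVP": 4 / 5,  # overvoltage protection
}


def decode_fault(v_alarm: float, v_supply: float, tol: float = 0.02) -> str:
    """Map one alarm voltage to "NONE", a fault name, or "UNKNOWN".

    tol is a hypothetical tolerance on the ratio; it must stay below half
    the smallest gap between adjacent ratios (4/5 - 3/4 = 0.05).
    """
    if v_supply <= 0:
        raise ValueError("v_supply must be positive")
    ratio = v_alarm / v_supply
    if abs(ratio) <= tol:
        return "NONE"
    for name, expected in FAULT_RATIOS.items():
        if abs(ratio - expected) <= tol:
            return name
    return "UNKNOWN"


def locate_faults(readings: Dict[str, float], v_supply: float) -> Dict[str, str]:
    """Return only the chips whose conversion circuit reports something other than "NONE"."""
    decoded = {chip: decode_fault(v, v_supply) for chip, v in readings.items()}
    return {chip: fault for chip, fault in decoded.items() if fault != "NONE"}


# Example with made-up readings on an assumed 3.3 V supply.
print(locate_faults({"HSC1": 0.0, "HSC2": 2.2, "HSC3": 1.65}, 3.3))
```

Keying the result by chip gives both pieces of information the controller is meant to produce: which HSC chip failed and what type of fault it reported.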
In some embodiments, each HSC chip further provides a current detection signal, and the controller is further configured to collect the current detection signal of each HSC chip, process the collected current detection signals, and generate a current-sharing alarm signal and a current-sharing control signal according to a preset determination rule;
the device further comprises an AND gate, wherein a first input of the AND gate receives a start signal, a second input of the AND gate receives the current-sharing control signal, and the output of the AND gate is connected to the enable signal pin of each HSC chip.
In some embodiments, the controller is further configured to:
calculate the average value of all the collected current detection signals;
in response to the current detection signal of any HSC chip being less than 0.6 times or more than 1.6 times the average value, output a current-sharing alarm signal indicating that an alarm is required and a current-sharing control signal that turns off the HSC chips;
in response to the current detection signals of all HSC chips being greater than or equal to 0.6 times and less than or equal to 1.6 times the average value, and the current detection signal of any HSC chip being greater than or equal to 0.6 times and less than 0.9 times the average value, output a current-sharing alarm signal indicating that an alarm is required and a current-sharing control signal that does not turn off the HSC chips;
in response to the current detection signals of all HSC chips being greater than or equal to 0.6 times and less than or equal to 1.6 times the average value, and the current detection signal of any HSC chip being greater than 1.1 times and less than or equal to 1.6 times the average value, output a current-sharing alarm signal indicating that an alarm is required and a current-sharing control signal that does not turn off the HSC chips;
and in response to the current detection signals of all HSC chips being greater than or equal to 0.9 times and less than or equal to 1.1 times the average value, output a current-sharing alarm signal indicating that no alarm is required and a current-sharing control signal that does not turn off the HSC chips.
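The current-sharing rule and the AND gate can be sketched together as follows. This is an illustrative model only: the function names are hypothetical, the current detection signals are treated as plain numbers in any consistent unit, and the control signal is assumed to be active-high (True means "keep the HSC chips enabled"), which the text does not specify.

```python
from typing import Sequence, Tuple


def current_sharing_decision(currents: Sequence[float]) -> Tuple[bool, bool]:
    """Return (alarm, control) for one set of current detection signals.

    alarm   -- True when a current-sharing alarm is required
    control -- True when the HSC chips may stay on (assumed active-high)
    """
    if not currents:
        raise ValueError("at least one current detection signal is required")
    mean = sum(currents) / len(currents)
    # Any chip below 0.6x or above 1.6x the mean: alarm and turn off.
    if any(i < 0.6 * mean or i > 1.6 * mean for i in currents):
        return True, False
    # All chips within [0.6x, 1.6x], but at least one outside [0.9x, 1.1x]:
    # alarm only, the chips stay on.
    if any(i < 0.9 * mean or i > 1.1 * mean for i in currents):
        return True, True
    # All chips within [0.9x, 1.1x]: no alarm, the chips stay on.
    return False, True


def hsc_enable(start_signal: bool, sharing_control: bool) -> bool:
    """Model of the AND gate that drives the enable pin of every HSC chip."""
    return start_signal and sharing_control


alarm, control = current_sharing_decision([10.0, 10.0, 10.0, 4.0])
print(alarm, control, hsc_enable(True, control))
```

Because the AND gate output feeds the enable pin of every chip, a single badly unbalanced chip disables the whole parallel group in this model, while the start signal can still hold the group off independently.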
In some embodiments, the controller is a baseboard management controller.
According to a second aspect of the present invention, there is provided a method for processing faults of parallel HSC chips, the method comprising:
providing multiple HSC chips connected in parallel, wherein each HSC chip provides a plurality of preset fault state signals;
converting the plurality of preset fault state signals into a fault alarm signal by means of a conversion circuit corresponding to each HSC chip, and outputting the fault alarm signal;
and analyzing, by a controller, the fault alarm signal output by each conversion circuit to determine which HSC chip has failed and the fault type of that HSC chip.
In some embodiments, the conversion circuit comprises: a first switch tube, a second switch tube, a third switch tube, a fourth switch tube, a first resistor, a second resistor, a third resistor, a fourth resistor and a pull-up resistor, wherein the first, second, third and fourth resistors have different resistance values;
the first switch tube is connected in series with the first resistor to form a first branch, the second switch tube is connected in series with the second resistor to form a second branch, the third switch tube is connected in series with the third resistor to form a third branch, and the fourth switch tube is connected in series with the fourth resistor to form a fourth branch; one end of the pull-up resistor is connected to the power supply; the first, second, third and fourth branches are connected in parallel between the other end of the pull-up resistor and ground, with the first, second, third and fourth switch tubes all grounded; and the fault alarm signal is the voltage at the other end of the pull-up resistor;
the preset fault state signals comprise an overcurrent protection signal, an over-temperature protection signal, a short-circuit protection signal and an overvoltage protection signal; the overcurrent protection signal drives the first switch tube on or off, the over-temperature protection signal drives the second switch tube on or off, the short-circuit protection signal drives the third switch tube on or off, and the overvoltage protection signal drives the fourth switch tube on or off; and at most one of the first, second, third and fourth switch tubes is on at any one time.
In some embodiments, the conversion circuit comprises: a first switch tube, a second switch tube, a third switch tube, a fourth switch tube, a first resistor, a second resistor, a third resistor, a fourth resistor, a fifth resistor, a sixth resistor, a seventh resistor and a pull-up resistor, wherein the first through seventh resistors and the pull-up resistor all have the same resistance value;
the first switch tube is connected in series with the first resistor to form a first branch, the second switch tube is connected in series with the second resistor to form a second branch, the third switch tube is connected in series with the third resistor to form a third branch, and the fourth switch tube is connected in series with the fourth resistor to form a fourth branch; one end of the pull-up resistor is connected to the power supply; the first, second, third and fourth branches are connected in parallel between the other end of the pull-up resistor and ground, with the first, second, third and fourth switch tubes all grounded; the fifth resistor is connected in series between the first resistor and the second resistor, the sixth resistor is connected in series between the second resistor and the third resistor, and the seventh resistor is connected in series between the third resistor and the fourth resistor; and the fault alarm signal is the voltage at the other end of the pull-up resistor;
the preset fault state signals comprise an overcurrent protection signal, an over-temperature protection signal, a short-circuit protection signal and an overvoltage protection signal; the overcurrent protection signal drives the first switch tube on or off, the over-temperature protection signal drives the second switch tube on or off, the short-circuit protection signal drives the third switch tube on or off, and the overvoltage protection signal drives the fourth switch tube on or off; and at most one of the first, second, third and fourth switch tubes is on at any one time.
In some embodiments, analyzing, by the controller, the fault alarm signal output by each conversion circuit to determine which HSC chip has failed and the fault type of that HSC chip comprises:
in response to the fault alarm signal output by a conversion circuit being equal to one half of the power supply voltage, confirming that the HSC chip corresponding to that conversion circuit has failed, the fault type being HSC chip overcurrent protection;
in response to the fault alarm signal output by a conversion circuit being equal to two thirds of the power supply voltage, confirming that the HSC chip corresponding to that conversion circuit has failed, the fault type being HSC chip over-temperature protection;
in response to the fault alarm signal output by a conversion circuit being equal to three quarters of the power supply voltage, confirming that the HSC chip corresponding to that conversion circuit has failed, the fault type being HSC chip short-circuit protection;
in response to the fault alarm signal output by a conversion circuit being equal to four fifths of the power supply voltage, confirming that the HSC chip corresponding to that conversion circuit has failed, the fault type being HSC chip overvoltage protection;
and in response to the fault alarm signal output by a conversion circuit being equal to zero, determining that the HSC chip corresponding to that conversion circuit has not failed.
In some embodiments, each HSC chip further provides a current detection signal, the controller is further configured to collect the current detection signal of each HSC chip, process the collected current detection signals, and generate a current-sharing alarm signal and a current-sharing control signal according to a preset determination rule, and the method further comprises:
connecting a first input of an AND gate to a start signal, connecting a second input of the AND gate to the current-sharing control signal, and connecting the output of the AND gate to the enable signal pin of each HSC chip.
In some embodiments, processing the collected current detection signals and generating the current-sharing alarm signal and the current-sharing control signal according to a preset determination rule comprises:
calculating the average value of all the collected current detection signals;
in response to the current detection signal of any HSC chip being less than 0.6 times or more than 1.6 times the average value, outputting a current-sharing alarm signal indicating that an alarm is required and a current-sharing control signal that turns off the HSC chips;
in response to the current detection signals of all HSC chips being greater than or equal to 0.6 times and less than or equal to 1.6 times the average value, and the current detection signal of any HSC chip being greater than or equal to 0.6 times and less than 0.9 times the average value, outputting a current-sharing alarm signal indicating that an alarm is required and a current-sharing control signal that does not turn off the HSC chips;
in response to the current detection signals of all HSC chips being greater than or equal to 0.6 times and less than or equal to 1.6 times the average value, and the current detection signal of any HSC chip being greater than 1.1 times and less than or equal to 1.6 times the average value, outputting a current-sharing alarm signal indicating that an alarm is required and a current-sharing control signal that does not turn off the HSC chips;
and in response to the current detection signals of all HSC chips being greater than or equal to 0.9 times and less than or equal to 1.1 times the average value, outputting a current-sharing alarm signal indicating that no alarm is required and a current-sharing control signal that does not turn off the HSC chips.
In some embodiments, the controller is a baseboard management controller.
According to a third aspect of the present invention, there is also provided a computer apparatus comprising:
at least one processor; and
a memory storing a computer program executable on the processor, wherein the processor performs the above method for processing faults of parallel HSC chips when executing the program.
According to a fourth aspect of the present invention, there is also provided a computer-readable storage medium storing a computer program which, when executed by a processor, performs the above method for processing faults of parallel HSC chips.
According to the fault processing device for the parallel HSC chips, the preset fault state signals inside the HSC chips are converted into the fault alarm signals through the conversion circuit, and the fault alarm signals are analyzed and judged through the controller to locate specific fault positions and fault reasons, so that automatic fault location and fault diagnosis are achieved, manpower and material resources input brought by field analysis of research and development personnel are avoided, and the fault location efficiency is improved.
In addition, the invention also provides a method for processing faults of the parallel HSC chips, computer equipment and a computer readable storage medium, which can also realize the technical effects and are not described again.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and it is obvious for those skilled in the art that other embodiments can be obtained by using the drawings without creative efforts.
Fig. 1 is a schematic diagram of a conventional multi-HSC parallel application scenario;
fig. 2 is a schematic diagram of a device for handling a fault of parallel HSC chips according to an embodiment of the present invention;
fig. 3A is a schematic structural diagram of a conversion circuit according to an embodiment of the present invention;
fig. 3B is a schematic structural diagram of a conversion circuit according to another embodiment of the present invention;
fig. 4 is a schematic diagram illustrating a parallel HSC current sharing detection and determination principle according to another embodiment of the present invention;
fig. 5 is a schematic diagram illustrating a method for handling a fault of parallel HSC chips according to an embodiment of the present invention;
fig. 6 is an internal structural view of a computer device in another embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the following embodiments of the present invention are described in further detail with reference to the accompanying drawings.
It should be noted that all expressions using "first" and "second" in the embodiments of the present invention are used to distinguish two entities or parameters that share the same name but are not identical. "First" and "second" are used only for convenience of expression and should not be construed as limiting the embodiments of the present invention; this is not repeated in the following embodiments.
In an embodiment, referring to fig. 2, the present invention provides a device for handling failures of parallel HSC chips, specifically, the device includes:
multiple HSC chips connected in parallel, wherein each HSC chip is provided with a plurality of preset fault state signals;
a conversion circuit corresponding to each HSC chip, configured to convert the plurality of preset fault state signals into a fault alarm signal and output the fault alarm signal;
and the controller analyzes the fault alarm signal output by each conversion circuit to determine the HSC chip with the fault and the fault type of the HSC chip.
According to the parallel HSC chip fault processing device, the preset fault state signal inside the HSC chip is converted into the fault alarm signal through the conversion circuit, and the fault alarm signal is analyzed and judged through the controller to locate the specific fault position and fault reason, so that automatic fault location and fault diagnosis are realized, the manpower and material resources input brought by field analysis of research and development personnel is avoided, and the fault location efficiency is improved.
In some embodiments, referring to fig. 3A, the conversion circuit includes: a first switching tube S1, a second switching tube S2, a third switching tube S3, a fourth switching tube S4, a first resistor R1, a second resistor R2, a third resistor R3, a fourth resistor R4 and a pull-up resistor, wherein the first resistor R1, the second resistor R2, the third resistor R3 and the fourth resistor R4 have different resistance values;
the first switch tube S1 is connected in series with the first resistor R1 to form a first branch, the second switch tube S2 is connected in series with the second resistor R2 to form a second branch, the third switch tube S3 is connected in series with the third resistor R3 to form a third branch, the fourth switch tube S4 is connected in series with the fourth resistor R4 to form a fourth branch, one end of the pull-up resistor is connected to the power supply, the first branch, the second branch, the third branch and the fourth branch are sequentially connected in parallel between the other end of the pull-up resistor and ground, the first switch tube S1, the second switch tube S2, the third switch tube S3 and the fourth switch tube S4 are all grounded, and the fault alarm signal is the voltage value at the other end of the pull-up resistor;
the preset fault state signals comprise an overcurrent protection signal, an overtemperature protection signal, a short-circuit protection signal and an overvoltage protection signal, the overcurrent protection signal is used for driving the first switching tube S1 to be switched on or switched off, the overtemperature protection signal is used for driving the second switching tube S2 to be switched on or switched off, the short-circuit protection signal is used for driving the third switching tube S3 to be switched on or switched off, and the overvoltage protection signal is used for driving the fourth switching tube S4 to be switched on or switched off, wherein at most one of the first switching tube S1, the second switching tube S2, the third switching tube S3 and the fourth switching tube S4 is switched on at the same time.
In some embodiments, referring to fig. 3B, the conversion circuit includes: a first switching tube S1, a second switching tube S2, a third switching tube S3, a fourth switching tube S4, a first resistor R1, a second resistor R2, a third resistor R3, a fourth resistor R4, a fifth resistor R5, a sixth resistor R6, a seventh resistor R7 and a pull-up resistor, wherein the first resistor R1, the second resistor R2, the third resistor R3, the fourth resistor R4, the fifth resistor R5, the sixth resistor R6, the seventh resistor R7 and the pull-up resistor have the same resistance value;
the first switch tube S1 is connected in series with the first resistor R1 to form a first branch, the second switch tube S2 is connected in series with the second resistor R2 to form a second branch, the third switch tube S3 is connected in series with the third resistor R3 to form a third branch, the fourth switch tube S4 is connected in series with the fourth resistor R4 to form a fourth branch, one end of the pull-up resistor is connected to the power supply, the first branch, the second branch, the third branch and the fourth branch are sequentially connected in parallel between the other end of the pull-up resistor and ground, the first switch tube S1, the second switch tube S2, the third switch tube S3 and the fourth switch tube S4 are all grounded, the fifth resistor R5 is connected in series between the first resistor R1 and the second resistor R2, the sixth resistor R6 is connected in series between the second resistor R2 and the third resistor R3, the seventh resistor R7 is connected in series between the third resistor R3 and the fourth resistor R4, and the fault alarm signal is the voltage value at the other end of the pull-up resistor;
the preset fault state signals comprise an overcurrent protection signal, an overtemperature protection signal, a short-circuit protection signal and an overvoltage protection signal, the overcurrent protection signal is used for driving the first switching tube S1 to be switched on or switched off, the overtemperature protection signal is used for driving the second switching tube S2 to be switched on or switched off, the short-circuit protection signal is used for driving the third switching tube S3 to be switched on or switched off, and the overvoltage protection signal is used for driving the fourth switching tube S4 to be switched on or switched off, wherein at most one of the first switching tube S1, the second switching tube S2, the third switching tube S3 and the fourth switching tube S4 is switched on at the same time.
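The ALERT voltage produced by either conversion circuit follows from a simple resistive divider between the pull-up resistor and the single conducting branch. The sketch below is illustrative only. It assumes ideal switch tubes (zero on-resistance), exactly one conducting branch, and a reading of the fig. 3B ladder in which closing the k-th switch tube leaves k equal resistors in series to ground. The function names and the 10 kΩ value are hypothetical and do not come from the patent.

```python
def alert_voltage(v_supply: float, r_pullup: float, r_branch: float) -> float:
    """Voltage at the lower end of the pull-up when a single branch conducts."""
    # Plain resistive divider: pull-up on top, conducting branch below.
    return v_supply * r_branch / (r_pullup + r_branch)


def ladder_levels(v_supply: float, r: float = 10_000.0) -> list[float]:
    """ALERT level for each of the four switch tubes in the equal-resistor ladder."""
    # Assumed reading of fig. 3B: switch k sees k equal resistors in series
    # (R1; R5+R2; R5+R6+R3; R5+R6+R7+R4), so the level is k/(k+1) of the supply.
    return [alert_voltage(v_supply, r, k * r) for k in range(1, 5)]


if __name__ == "__main__":
    for name, level in zip(("OCP", "OTP", "SCP", "OVP"), ladder_levels(3.3)):
        print(f"{name}: {level:.3f} V")
```

Under these assumptions the four levels are one half, two thirds, three quarters and four fifths of the supply voltage. The fig. 3A variant would give the same levels if its four branch resistors were chosen as one, two, three and four times the pull-up value, which is again an assumption rather than something the text states.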
In some embodiments, the controller is further configured to:
in response to the fault alarm signal output by a conversion circuit being equal to one half of the power supply voltage, determining that the HSC chip corresponding to that conversion circuit has a fault, the fault type being HSC chip overcurrent protection;
in response to the fault alarm signal output by a conversion circuit being equal to two thirds of the power supply voltage, determining that the HSC chip corresponding to that conversion circuit has a fault, the fault type being HSC chip over-temperature protection;
in response to the fault alarm signal output by a conversion circuit being equal to three quarters of the power supply voltage, determining that the HSC chip corresponding to that conversion circuit has a fault, the fault type being HSC chip short-circuit protection;
in response to the fault alarm signal output by a conversion circuit being equal to four fifths of the power supply voltage, determining that the HSC chip corresponding to that conversion circuit has a fault, the fault type being HSC chip overvoltage protection;
and in response to the fault alarm signal output by a conversion circuit being equal to zero, determining that the HSC chip corresponding to that conversion circuit has no fault.
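The judgment rule above can be sketched as a small lookup on the ratio of the sampled ALERT voltage to the supply voltage. This is a minimal sketch, not the patented firmware. The function name `decode_alert`, the `"UNKNOWN"` result and the tolerance `tol` are assumptions introduced here, since a real ADC reading will never be exactly equal to a fraction of the supply.

```python
from typing import Optional

# Fractions of the supply voltage taken from the rules in the text.
FAULT_LEVELS = {
    "OCP": 1 / 2,  # overcurrent protection
    "OTP": 2 / 3,  # over-temperature protection
    "SCP": 3 / 4,  # short-circuit protection
    "OVP": 4 / 5,  # overvoltage protection
}


def decode_alert(v_alert: float, v_supply: float, tol: float = 0.02) -> Optional[str]:
    """Map one ALERT sample to a fault type, or None when no fault is reported."""
    ratio = v_alert / v_supply
    # Per the text, a level of zero means the chip has no fault.
    if abs(ratio) <= tol:
        return None
    for fault, level in FAULT_LEVELS.items():
        # tol (assumed) is below half the smallest gap between levels (0.05).
        if abs(ratio - level) <= tol:
            return fault
    return "UNKNOWN"  # assumed fallback for readings that match no rule
```

Calling `decode_alert` once per conversion circuit identifies both the faulty chip (by which input was sampled) and the fault type.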
In some embodiments, referring to fig. 4, each HSC chip further has a current detection signal, and the controller is further configured to collect the current detection signal of each HSC chip, process the collected current detection signal, and generate a current sharing alarm signal and a current sharing control signal according to a preset determination rule;
the device further comprises an AND gate, wherein a first input end of the AND gate is connected with a starting signal, a second input end of the AND gate is connected with the current sharing control signal, and an output end of the AND gate is connected to an enabling signal pin of each HSC chip.
In some embodiments, the controller is further configured to:
calculating the average value of all the collected current detection signals;
in response to the current detection signal of any HSC chip being less than 0.6 times or greater than 1.6 times the mean value, outputting a current-sharing alarm signal indicating that an alarm is required and outputting a current-sharing control signal that turns off the HSC chips;
in response to the current detection signals of all HSC chips being greater than or equal to 0.6 times and less than or equal to 1.6 times the mean value, and the current detection signal of any HSC chip being greater than or equal to 0.6 times and less than 0.9 times the mean value, outputting a current-sharing alarm signal indicating that an alarm is required and outputting a current-sharing control signal that does not turn off the HSC chips;
in response to the current detection signals of all HSC chips being greater than or equal to 0.6 times and less than or equal to 1.6 times the mean value, and the current detection signal of any HSC chip being greater than 1.1 times and less than or equal to 1.6 times the mean value, outputting a current-sharing alarm signal indicating that an alarm is required and outputting a current-sharing control signal that does not turn off the HSC chips;
and in response to the current detection signals of all HSC chips being greater than or equal to 0.9 times and less than or equal to 1.1 times the mean value, outputting a current-sharing alarm signal indicating that no alarm is required and outputting a current-sharing control signal that does not turn off the HSC chips.
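The four rules above reduce to two threshold checks against the mean. The following sketch shows one way to express them. The function name and the `(alarm, keep_enabled)` return convention are assumptions made for illustration, with `keep_enabled` standing for a high current-sharing control signal.

```python
from statistics import fmean
from typing import Sequence, Tuple


def current_sharing_decision(imon: Sequence[float]) -> Tuple[bool, bool]:
    """Return (alarm, keep_enabled) for one set of current detection samples."""
    mean = fmean(imon)
    # Severe imbalance: any chip below 0.6x or above 1.6x the mean.
    # Raise the alarm and drive the control signal low to turn the chips off.
    if any(i < 0.6 * mean or i > 1.6 * mean for i in imon):
        return True, False
    # Moderate imbalance: all chips within [0.6x, 1.6x], but at least one
    # outside [0.9x, 1.1x]. Raise the alarm and keep the chips running.
    if any(i < 0.9 * mean or i > 1.1 * mean for i in imon):
        return True, True
    # Balanced: every chip within [0.9x, 1.1x] of the mean.
    return False, True
```

For example, samples of 8 A, 11 A and 11 A have a mean of 10 A. The 8 A chip sits at 0.8 times the mean, so the alarm is raised but the chips stay on.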
In some embodiments, the controller is a baseboard management controller.
In another embodiment, to facilitate understanding of the solution of the present invention, a data center server is taken as an example of the load, and the load is powered by n (n is greater than or equal to 2) HSC chips in parallel. Referring again to fig. 2 to fig. 4, another parallel HSC chip fault handling device is provided, including: a POWER CLIP connector, HSC0, HSC1, …, HSCn, conversion circuits, a load (LOAD) and a BMC. The device differs from the conventional approach (in which the BMC only monitors the voltage signal transmitted by the HSC) in that: (1) the OCP, SCP, OTP and OVP fault states inside the HSC are mapped to voltage values of the alarm signal ALERT, and the BMC locates the specific fault information by evaluating the voltage value of the alarm signal; (2) the IMON signals of the HSC chips are routed to the BMC, and after the IMON signals are digitally processed in the BMC, a current-sharing alarm signal and a current-sharing control signal are generated according to a judgment rule. The current-sharing control signal controls whether the HSCs are turned off, which effectively avoids board burning caused by uneven current sharing. The function of each part of the device is explained in detail below:
The input ends of HSC0, HSC1, …, HSCn are connected in parallel and draw Vin from the BUSBAR through the POWER CLIP connector, and their output ends are connected in parallel to converge and output Vout, which powers the LOAD.
The currents flowing through the HSC chips are, in order: I0, I1, …, In. Each HSC chip brings out two signals, [IMON0, IMON1, …, IMONn] and [ALERT0, ALERT1, …, ALERTn], which are connected to the ADC I/O ports of the BMC. [IMON0, IMON1, …, IMONn] denotes the current detection signals led out from the HSC0, HSC1, …, HSCn chips, representing the magnitude of the current flowing through each HSC chip.
[ALERT0, ALERT1, …, ALERTn] denotes the alarm signals led out from the HSC0, HSC1, …, HSCn chips, representing the specific fault triggered by each HSC chip.
The BMC makes its judgment based on the collected current signals and the voltage values of the alarm signals, gives a fault location result, and triggers the current-sharing protection signal Current Balance.
As shown in fig. 3B, the HSC fault location, classification and processing circuit structure provided in this embodiment takes four common core faults of the HSC chip as an example (when S0, S1, S2 or S3 is low, the corresponding switch of the conversion circuit is closed), where S0 indicates that OCP (overcurrent protection) is triggered; S1 indicates that OTP (over-temperature protection) is triggered; S2 indicates that SCP (short-circuit protection) is triggered; and S3 indicates that OVP (overvoltage protection) is triggered. S0, S1, S2 and S3 are logic level signals. These 4 logic level signals control the conversion circuit at the lower right of fig. 3B, which outputs ALERT signals with different voltage values. The conversion relationship is shown in Table 1 below.
TABLE 1 Fault Warning Signal Voltage value and Fault type correspondence
FIG. 4 is a schematic diagram of the current-sharing diagnostic structure for multiple HSCs in parallel. The current detection signals [IMON0, IMON1, …, IMONn] generated by HSC0, HSC1, …, HSCn are passed to the ADC I/O ports [ADC0, ADC1, …, ADCn] of the BMC. After the current detection signals are digitally processed inside the BMC chip, the current mean value is calculated according to the following formula:
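Assuming the mean is the plain arithmetic mean of the sampled signals, as the phrase "calculating the average value of all the collected current detection signals" suggests, it can be written as follows. The symbols I_avg and N (the number of parallel HSC chips) are notation introduced here, not taken from the patent.

```latex
I_{\mathrm{avg}} = \frac{1}{N} \sum_{i=0}^{N-1} \mathrm{IMON}_i
```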
TABLE 2 IMONi, current sharing alarm signal, current sharing control signal corresponding relation
The BMC compares each IMONi with the mean value to generate the Current Balance ALERT signal and the Current Balance SHUT control signal. The correspondence and the corresponding actions are shown in Table 2 above. When the Current Balance SHUT control signal is low, the output of the AND gate that combines it with the power-on signal PSON goes low and the HSCs are turned off, thereby preventing HSC burnout caused by severely uneven current sharing.
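The gating described here is a two-input logical AND. A minimal sketch follows, with `hsc_enable` as a hypothetical name and Python booleans standing in for logic levels (True for high).

```python
def hsc_enable(pson: bool, balance_shut: bool) -> bool:
    """Level driven onto the HSC EN pins: PSON AND Current Balance SHUT."""
    # The chips run only while the node is powered on and the BMC has not
    # pulled the current-sharing control signal low.
    return pson and balance_shut
```

A low Current Balance SHUT therefore disables the chips even while PSON is high, and the chips also stay off whenever PSON is low.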
The specific working procedure of the parallel HSC chip fault handling device is described in detail below:
Step one: according to the server system configuration (CPU, memory, hard disk array and system fan module), calculate the total power consumption P_T of the node mainboard, and determine the maximum current value I_T of the power supply path from the BUSBAR to the HSCs according to the 80% derating standard.
Step two: according to the maximum current value I_T, select a POWER CLIP connector that meets the current-carrying requirement, an HSC chip of a suitable model (such as MP5991 GLU-Z), and the number of HSC chips to be connected in parallel.
Step three: connect the input ends of the HSC chips in parallel with one another, and likewise the output ends. Connect the IMON pin of each HSC chip to an ADC I/O pin of the BMC. Connect the ALERT pin of each HSC chip to an ADC I/O pin of the BMC.
Step four: import the judgment rule for alarm-signal fault information and the judgment rule for current-sharing alarm and control into the BMC firmware.
Step five: connect the Current Balance SHUT signal generated by the BMC and the node power-on signal PSON to the EN pins of the HSC chips through a logic AND gate chip.
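Steps one and two amount to a short sizing calculation. The sketch below shows one possible reading of it. It assumes that the 80% derating means the load current may use at most 80% of the path rating, so that I_T = P_T / V_bus / 0.8. The bus voltage, the per-chip current rating and the function name are hypothetical inputs that the text does not specify.

```python
import math
from typing import Tuple


def size_hsc_bank(p_total_w: float, v_bus: float, i_chip_rated: float,
                  derating: float = 0.8) -> Tuple[float, int]:
    """Return (I_T in amperes, number of HSC chips to place in parallel)."""
    i_load = p_total_w / v_bus   # current drawn by the node mainboard
    i_t = i_load / derating      # assumed reading of the 80% derating rule
    # The text considers at least two chips in parallel (n >= 2).
    n_chips = max(2, math.ceil(i_t / i_chip_rated))
    return i_t, n_chips
```

With a 1920 W node on a 12 V bus and a hypothetical 60 A chip rating, this gives I_T = 200 A and four chips in parallel. The actual rating should be taken from the datasheet of the chosen HSC chip.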
The parallel HSC chip fault handling device has at least the following beneficial technical effects. First, the OCP, SCP, OTP and OVP fault states inside the HSC are mapped to voltage values of the alarm signal ALERT, and the BMC locates the specific fault information by evaluating the alarm-signal voltage value, which avoids the manpower and material investment of on-site analysis by research and development personnel and improves fault location efficiency. Second, the IMON signals of the multiple HSC chips are routed to the BMC, and after the IMON signals are digitally processed inside the BMC, a current-sharing alarm signal and a current-sharing control signal are generated according to a judgment rule; the current-sharing control signal controls whether the HSCs are turned off. This effectively avoids board burning caused by uneven current sharing and enhances the power supply safety of the server.
In another embodiment, referring to fig. 5, the present invention further provides a method 100 for processing failures of parallel HSC chips, the method comprising:
101, connecting multiple HSC chips in parallel, wherein each HSC chip is provided with a plurality of preset fault state signals;
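102, converting the plurality of preset fault state signals into a fault alarm signal by using a conversion circuit corresponding to each HSC chip and outputting the fault alarm signal;
103, analyzing, by a controller, the fault alarm signal output by each conversion circuit to determine the faulty HSC chip and the fault type of the HSC chip.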
According to the method for processing the fault of the parallel HSC chip, the preset fault state signal inside the HSC chip is converted into the fault alarm signal through the conversion circuit, and the fault alarm signal is analyzed and judged through the controller to position the specific fault position and fault reason, so that automatic fault positioning and fault diagnosis are realized, the manpower and material resource input brought by field analysis of research and development personnel is avoided, and the fault positioning efficiency is improved.
In some embodiments, the conversion circuit comprises: a first switching tube S1, a second switching tube S2, a third switching tube S3, a fourth switching tube S4, a first resistor R1, a second resistor R2, a third resistor R3, a fourth resistor R4 and a pull-up resistor, wherein the first resistor R1, the second resistor R2, the third resistor R3 and the fourth resistor R4 have different resistance values;
the first switch tube S1 is connected in series with the first resistor R1 to form a first branch, the second switch tube S2 is connected in series with the second resistor R2 to form a second branch, the third switch tube S3 is connected in series with the third resistor R3 to form a third branch, the fourth switch tube S4 is connected in series with the fourth resistor R4 to form a fourth branch, one end of the pull-up resistor is connected to a power supply, the first branch, the second branch, the third branch and the fourth branch are sequentially connected in parallel between the other end of the pull-up resistor and ground, the first switch tube S1, the second switch tube S2, the third switch tube S3 and the fourth switch tube S4 are all grounded, and the fault alarm signal is the voltage value at the other end of the pull-up resistor;
the preset fault state signals comprise an overcurrent protection signal, an overtemperature protection signal, a short-circuit protection signal and an overvoltage protection signal, the overcurrent protection signal is used for driving the first switching tube S1 to be switched on or switched off, the overtemperature protection signal is used for driving the second switching tube S2 to be switched on or switched off, the short-circuit protection signal is used for driving the third switching tube S3 to be switched on or switched off, and the overvoltage protection signal is used for driving the fourth switching tube S4 to be switched on or switched off, wherein at most one of the first switching tube S1, the second switching tube S2, the third switching tube S3 and the fourth switching tube S4 is switched on at the same time.
In some embodiments, the conversion circuit comprises: a first switching tube S1, a second switching tube S2, a third switching tube S3, a fourth switching tube S4, a first resistor R1, a second resistor R2, a third resistor R3, a fourth resistor R4, a fifth resistor R5, a sixth resistor R6, a seventh resistor R7 and a pull-up resistor, wherein the first resistor R1, the second resistor R2, the third resistor R3, the fourth resistor R4, the fifth resistor R5, the sixth resistor R6, the seventh resistor R7 and the pull-up resistor have the same resistance value;
the first switch tube S1 is connected in series with the first resistor R1 to form a first branch, the second switch tube S2 is connected in series with the second resistor R2 to form a second branch, the third switch tube S3 is connected in series with the third resistor R3 to form a third branch, the fourth switch tube S4 is connected in series with the fourth resistor R4 to form a fourth branch, one end of the pull-up resistor is connected to the power supply, the first branch, the second branch, the third branch and the fourth branch are sequentially connected in parallel between the other end of the pull-up resistor and ground, the first switch tube S1, the second switch tube S2, the third switch tube S3 and the fourth switch tube S4 are all grounded, the fifth resistor R5 is connected in series between the first resistor R1 and the second resistor R2, the sixth resistor R6 is connected in series between the second resistor R2 and the third resistor R3, the seventh resistor R7 is connected in series between the third resistor R3 and the fourth resistor R4, and the fault alarm signal is the voltage value at the other end of the pull-up resistor;
the preset fault state signals comprise an overcurrent protection signal, an overtemperature protection signal, a short-circuit protection signal and an overvoltage protection signal, the overcurrent protection signal is used for driving the first switching tube S1 to be switched on or switched off, the overtemperature protection signal is used for driving the second switching tube S2 to be switched on or switched off, the short-circuit protection signal is used for driving the third switching tube S3 to be switched on or switched off, and the overvoltage protection signal is used for driving the fourth switching tube S4 to be switched on or switched off, wherein at most one of the first switching tube S1, the second switching tube S2, the third switching tube S3 and the fourth switching tube S4 is switched on at the same time.
In some embodiments, the analyzing, by the controller in step 103, of the fault alarm signal output by each conversion circuit to determine the faulty HSC chip and the fault type of the HSC chip includes:
in response to the fault alarm signal output by a conversion circuit being equal to one half of the power supply voltage, determining that the HSC chip corresponding to that conversion circuit has a fault, the fault type being HSC chip overcurrent protection;
in response to the fault alarm signal output by a conversion circuit being equal to two thirds of the power supply voltage, determining that the HSC chip corresponding to that conversion circuit has a fault, the fault type being HSC chip over-temperature protection;
in response to the fault alarm signal output by a conversion circuit being equal to three quarters of the power supply voltage, determining that the HSC chip corresponding to that conversion circuit has a fault, the fault type being HSC chip short-circuit protection;
in response to the fault alarm signal output by a conversion circuit being equal to four fifths of the power supply voltage, determining that the HSC chip corresponding to that conversion circuit has a fault, the fault type being HSC chip overvoltage protection;
and in response to the fault alarm signal output by a conversion circuit being equal to zero, determining that the HSC chip corresponding to that conversion circuit has no fault.
In some embodiments, each HSC chip further has a current detection signal, and the controller is further configured to collect the current detection signal of each HSC chip, process the collected current detection signals, and generate a current sharing alarm signal and a current sharing control signal according to a preset determination rule; the method further includes:
and connecting the first input end of the AND gate with a starting signal, connecting the second input end of the AND gate with the current-sharing control signal, and connecting the output end of the AND gate to an enabling signal pin of each HSC chip.
In some embodiments, the processing the collected current detection signals and then generating the current sharing alarm signal and the current sharing control signal according to a preset determination rule includes:
calculating the average value of all the acquired current detection signals;
in response to the current detection signal of any HSC chip being less than 0.6 times or greater than 1.6 times the mean value, outputting a current-sharing alarm signal indicating that an alarm is required and outputting a current-sharing control signal that turns off the HSC chips;
in response to the current detection signals of all HSC chips being greater than or equal to 0.6 times and less than or equal to 1.6 times the mean value, and the current detection signal of any HSC chip being greater than or equal to 0.6 times and less than 0.9 times the mean value, outputting a current-sharing alarm signal indicating that an alarm is required and outputting a current-sharing control signal that does not turn off the HSC chips;
in response to the current detection signals of all HSC chips being greater than or equal to 0.6 times and less than or equal to 1.6 times the mean value, and the current detection signal of any HSC chip being greater than 1.1 times and less than or equal to 1.6 times the mean value, outputting a current-sharing alarm signal indicating that an alarm is required and outputting a current-sharing control signal that does not turn off the HSC chips;
and in response to the current detection signals of all HSC chips being greater than or equal to 0.9 times and less than or equal to 1.1 times the mean value, outputting a current-sharing alarm signal indicating that no alarm is required and outputting a current-sharing control signal that does not turn off the HSC chips.
In some embodiments, the controller is a baseboard management controller.
According to another aspect of the present invention, a computer device is provided, and the computer device may be a server, and its internal structure is shown in fig. 6. The computer device includes a processor, a memory, a network interface, and a database connected by a system bus. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device comprises a nonvolatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, a computer program, and a database. The internal memory provides an environment for the operation of an operating system and computer programs in the non-volatile storage medium. The database of the computer device is used for storing data. The network interface of the computer device is used for communicating with an external terminal through a network connection. When executed by a processor, the computer program implements the method for processing faults of the parallel HSC chips, specifically, the method comprises the following steps:
connecting multiple HSC chips in parallel, wherein each HSC chip is provided with a plurality of preset fault state signals;
converting a plurality of preset fault state signals into fault alarm signals by using a conversion circuit corresponding to each HSC chip and outputting the fault alarm signals;
the controller analyzes the fault alarm signal output by each conversion circuit to determine the HSC chip with fault and the fault type of the HSC chip.
According to still another aspect of the present invention, there is provided a computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements the method for processing a fault of a parallel HSC chip as described above, specifically comprising performing the steps of:
connecting multiple HSC chips in parallel, wherein each HSC chip is provided with a plurality of preset fault state signals;
converting a plurality of preset fault state signals into fault alarm signals by using a conversion circuit corresponding to each HSC chip and outputting the fault alarm signals;
the controller analyzes the fault alarm signal output by each conversion circuit to determine the HSC chip with fault and the fault type of the HSC chip.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above may be implemented by a computer program instructing the relevant hardware. The program may be stored in a non-volatile computer-readable storage medium and, when executed, may include the processes of the method embodiments described above. Any reference to memory, storage, database, or other medium used in the embodiments provided herein may include non-volatile and/or volatile memory. Non-volatile memory can include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory can include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), synchronous link DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).
The technical features of the above embodiments can be arbitrarily combined, and for the sake of brevity, all possible combinations of the technical features in the above embodiments are not described, but should be considered as the scope of the present specification as long as there is no contradiction between the combinations of the technical features.
The above-mentioned embodiments only express several embodiments of the present application, and the description thereof is more specific and detailed, but not construed as limiting the scope of the invention. It should be noted that, for a person skilled in the art, several variations and modifications can be made without departing from the concept of the present application, and these are all within the scope of protection of the present application. Therefore, the protection scope of the present patent shall be subject to the appended claims.
Claims (16)
1. A parallel HSC chip failure handling device, the device comprising:
multiple HSC chips connected in parallel, wherein each HSC chip is provided with a plurality of preset fault state signals;
a conversion circuit corresponding to each HSC chip, configured to convert the plurality of preset fault state signals into a fault alarm signal and output the fault alarm signal;
a controller configured to analyze the fault alarm signal output by each conversion circuit to determine the HSC chip with the fault and the fault type of the HSC chip;
wherein the conversion circuit comprises: a first switch tube, a second switch tube, a third switch tube, a fourth switch tube, a first resistor, a second resistor, a third resistor, a fourth resistor and a pull-up resistor;
the first switch tube is connected with the first resistor in series to form a first branch, the second switch tube is connected with the second resistor in series to form a second branch, the third switch tube is connected with the third resistor in series to form a third branch, the fourth switch tube is connected with the fourth resistor in series to form a fourth branch, one end of the pull-up resistor is connected with the power supply, the first branch, the second branch, the third branch and the fourth branch are sequentially connected between the other end of the pull-up resistor and the ground in parallel, the first switch tube, the second switch tube, the third switch tube and the fourth switch tube are all grounded, and the fault alarm signal is a voltage value at the other end of the pull-up resistor;
the preset fault state signals comprise an overcurrent protection signal, an overtemperature protection signal, a short-circuit protection signal and an overvoltage protection signal, the overcurrent protection signal is used for driving the first switch tube to be switched on or switched off, the overtemperature protection signal is used for driving the second switch tube to be switched on or switched off, the short-circuit protection signal is used for driving the third switch tube to be switched on or switched off, the overvoltage protection signal is used for driving the fourth switch tube to be switched on or switched off, and at most one of the first switch tube, the second switch tube, the third switch tube and the fourth switch tube is switched on at the same time.
2. The parallel HSC chip fault handling device of claim 1, wherein the first resistor, the second resistor, the third resistor and the fourth resistor have different resistance values.
3. The parallel HSC chip failure handling device of claim 1, wherein the conversion circuit further comprises a fifth resistor, a sixth resistor and a seventh resistor, and the first resistor, the second resistor, the third resistor, the fourth resistor, the fifth resistor, the sixth resistor, the seventh resistor and the pull-up resistor all have the same resistance value;
the first switch tube is connected with the first resistor in series to form a first branch circuit, the second switch tube is connected with the second resistor in series to form a second branch circuit, the third switch tube is connected with the third resistor in series to form a third branch circuit, the fourth switch tube is connected with the fourth resistor in series to form a fourth branch circuit, one end of the pull-up resistor is connected with the power supply, the first branch circuit, the second branch circuit, the third branch circuit and the fourth branch circuit are sequentially connected between the other end of the pull-up resistor and the ground in parallel, the first switch tube, the second switch tube, the third switch tube and the fourth switch tube are all grounded, the fifth resistor is connected between the first resistor and the second resistor in series, the sixth resistor is connected between the second resistor and the third resistor in series, the seventh resistor is connected between the third resistor and the fourth resistor in series, and the fault alarm signal is a voltage value at the other end of the pull-up resistor.
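As an illustration of why equal resistance values yield distinct alarm voltages, the following worked equation assumes one reading of the ladder recited in claim 3: when only the k-th switch tube is on, the path from the output node (the other end of the pull-up resistor) to ground contains k equal resistors R in series, namely the branch resistor plus k - 1 of the fifth to seventh resistors, and the on-resistance of the switch tube is neglected. The symbols V_out, V_CC, R and k are notation introduced here for illustration and do not appear in the claims.

```latex
V_{\text{out}} \;=\; V_{CC}\cdot\frac{kR}{R + kR} \;=\; \frac{k}{k+1}\,V_{CC},
\qquad k = 1, 2, 3, 4
\quad\Longrightarrow\quad
V_{\text{out}} \in \left\{\tfrac{1}{2},\ \tfrac{2}{3},\ \tfrac{3}{4},\ \tfrac{4}{5}\right\} V_{CC}
```

Under this assumed topology, the four values coincide with the one-half, two-thirds, three-fourths and four-fifths levels that claim 4 assigns to the four fault types.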
4. The parallel HSC chip failure handling device of claim 3, wherein the controller is further configured to:
in response to the fault alarm signal output by a given conversion circuit being equal to one half of the power supply voltage, confirming that the HSC chip corresponding to that conversion circuit has a fault, wherein the fault type is HSC chip overcurrent protection;
in response to the fault alarm signal output by a given conversion circuit being equal to two thirds of the power supply voltage, confirming that the HSC chip corresponding to that conversion circuit has a fault, wherein the fault type is HSC chip over-temperature protection;
in response to the fault alarm signal output by a given conversion circuit being equal to three fourths of the power supply voltage, confirming that the HSC chip corresponding to that conversion circuit has a fault, wherein the fault type is HSC chip short-circuit protection;
in response to the fault alarm signal output by a given conversion circuit being equal to four fifths of the power supply voltage, confirming that the HSC chip corresponding to that conversion circuit has a fault, wherein the fault type is HSC chip overvoltage protection;
and in response to the fault alarm signal output by a given conversion circuit being equal to zero, determining that the HSC chip corresponding to that conversion circuit has no fault.
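The mapping recited in claim 4 can be sketched in software as a lookup from the measured alarm voltage to a fault type. The sketch below is only illustrative: the function names `decode_fault` and `diagnose`, the chip labels, and in particular the matching tolerance `tol` are assumptions introduced here (the claim speaks of exact equality, whereas a sampled voltage would in practice need some margin).

```python
from fractions import Fraction
from typing import Dict, Optional

# Alarm voltage as a fraction of the supply voltage for each fault type (per claim 4).
FAULT_LEVELS: Dict[str, Fraction] = {
    "overcurrent protection": Fraction(1, 2),
    "over-temperature protection": Fraction(2, 3),
    "short-circuit protection": Fraction(3, 4),
    "overvoltage protection": Fraction(4, 5),
}


def decode_fault(v_alarm: float, v_supply: float, tol: float = 0.02) -> Optional[str]:
    """Return the fault type, None for "no fault", or "unknown".

    `tol` is a hypothetical matching margin on the voltage ratio. It must stay
    below half the smallest gap between levels (4/5 - 3/4 = 0.05).
    """
    if v_supply <= 0:
        raise ValueError("supply voltage must be positive")
    ratio = v_alarm / v_supply
    if abs(ratio) <= tol:
        return None  # alarm voltage of zero: the chip has no fault
    for fault, level in FAULT_LEVELS.items():
        if abs(ratio - float(level)) <= tol:
            return fault
    return "unknown"  # voltage does not match any level in claim 4


def diagnose(readings: Dict[str, float], v_supply: float) -> Dict[str, str]:
    """Map each chip label to its decoded result, omitting chips with no fault."""
    faults: Dict[str, str] = {}
    for chip, v_alarm in readings.items():
        fault = decode_fault(v_alarm, v_supply)
        if fault is not None:
            faults[chip] = fault
    return faults
```

For example, with an assumed 3.3 V supply, `diagnose({"HSC1": 0.0, "HSC2": 2.2}, 3.3)` reports only HSC2, with the over-temperature protection type, because 2.2 V is two thirds of the supply.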
5. The device for processing the faults of the parallel HSC chips according to claim 1, wherein each HSC chip is further provided with a current detection signal, the controller is further used for acquiring the current detection signal of each HSC chip, and generating a current-sharing alarm signal and a current-sharing control signal according to a preset determination rule after processing the acquired current detection signal;
the device further comprises an AND gate, wherein a first input end of the AND gate is connected with a starting signal, a second input end of the AND gate is connected with the current-sharing control signal, and an output end of the AND gate is connected to an enabling signal pin of each HSC chip.
6. The parallel HSC chip fault handling device of claim 5, wherein the controller is further configured to:
calculating the average value of all the acquired current detection signals;
in response to the current detection signal of any HSC chip being less than 0.6 times the average value or more than 1.6 times the average value, outputting a current-sharing alarm signal indicating that an alarm is needed and outputting a current-sharing control signal for turning off the HSC chip;
in response to the current detection signals of all HSC chips being greater than or equal to 0.6 times the average value and less than or equal to 1.6 times the average value, and the current detection signal of any HSC chip being greater than or equal to 0.6 times the average value and less than 0.9 times the average value, outputting a current-sharing alarm signal indicating that an alarm is needed and outputting a current-sharing control signal that does not turn off the HSC chips;
in response to the current detection signals of all HSC chips being greater than or equal to 0.6 times the average value and less than or equal to 1.6 times the average value, and the current detection signal of any HSC chip being greater than 1.1 times the average value and less than or equal to 1.6 times the average value, outputting a current-sharing alarm signal indicating that an alarm is needed and outputting a current-sharing control signal that does not turn off the HSC chips;
and in response to the current detection signals of all HSC chips being greater than or equal to 0.9 times the average value and less than or equal to 1.1 times the average value, outputting a current-sharing alarm signal indicating that no alarm is needed and outputting a current-sharing control signal that does not turn off the HSC chips.
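The determination rule of claim 6, together with the AND gate of claim 5, can be sketched as follows. This is a minimal illustration under stated assumptions: the names `SharingDecision`, `evaluate_current_sharing` and `enable_pin` are hypothetical, the signals are modeled as booleans, and the polarity (a control signal of `False` meaning "turn off") is an assumption chosen so that the AND gate pulls the enable pin low.

```python
from statistics import mean
from typing import NamedTuple, Sequence


class SharingDecision(NamedTuple):
    alarm: bool    # current-sharing alarm signal: True means an alarm is needed
    keep_on: bool  # current-sharing control signal: False means turn the chips off


def evaluate_current_sharing(currents: Sequence[float]) -> SharingDecision:
    """Apply the thresholds of claim 6 to one sample of per-chip currents."""
    avg = mean(currents)  # average value of all current detection signals
    # Any chip below 0.6x or above 1.6x the average: alarm and turn off.
    if any(i < 0.6 * avg or i > 1.6 * avg for i in currents):
        return SharingDecision(alarm=True, keep_on=False)
    # All chips within [0.6x, 1.6x], but some chip outside [0.9x, 1.1x]:
    # alarm without turning off.
    if any(i < 0.9 * avg or i > 1.1 * avg for i in currents):
        return SharingDecision(alarm=True, keep_on=True)
    # All chips within [0.9x, 1.1x]: no alarm, chips stay on.
    return SharingDecision(alarm=False, keep_on=True)


def enable_pin(power_on: bool, decision: SharingDecision) -> bool:
    """Model the AND gate of claim 5 that drives every chip's enable pin."""
    return power_on and decision.keep_on
```

With four chips reading 10, 10, 10 and 2 (arbitrary units), the average is 8 and the last chip falls below 0.6 times the average, so the sketch raises the alarm and the modeled enable pin goes low even while the power-on signal is high.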
7. The device for handling faults of HSC chips in parallel according to any one of claims 1-6, wherein the controller is a baseboard management controller.
8. A parallel HSC chip fault handling method is characterized by comprising the following steps:
providing multiple HSC chips connected in parallel, wherein each HSC chip is provided with a plurality of preset fault state signals;
converting a plurality of preset fault state signals into fault alarm signals by using a conversion circuit corresponding to each HSC chip and outputting the fault alarm signals;
analyzing, by a controller, the fault alarm signal output by each conversion circuit to determine the HSC chip with the fault and the fault type of the HSC chip;
wherein the conversion circuit comprises: a first switch tube, a second switch tube, a third switch tube, a fourth switch tube, a first resistor, a second resistor, a third resistor, a fourth resistor and a pull-up resistor;
the first switch tube is connected with the first resistor in series to form a first branch circuit, the second switch tube is connected with the second resistor in series to form a second branch circuit, the third switch tube is connected with the third resistor in series to form a third branch circuit, the fourth switch tube is connected with the fourth resistor in series to form a fourth branch circuit, one end of the pull-up resistor is connected with a power supply, the first branch circuit, the second branch circuit, the third branch circuit and the fourth branch circuit are sequentially connected between the other end of the pull-up resistor and the ground in parallel, the first switch tube, the second switch tube, the third switch tube and the fourth switch tube are all grounded, and the fault alarm signal is a voltage value at the other end of the pull-up resistor;
the plurality of preset fault state signals comprise an overcurrent protection signal, an overtemperature protection signal, a short-circuit protection signal and an overvoltage protection signal, the overcurrent protection signal is used for driving the first switch tube to be switched on or switched off, the overtemperature protection signal is used for driving the second switch tube to be switched on or switched off, the short-circuit protection signal is used for driving the third switch tube to be switched on or switched off, the overvoltage protection signal is used for driving the fourth switch tube to be switched on or switched off, and at most one of the first switch tube, the second switch tube, the third switch tube and the fourth switch tube is switched on at the same moment.
9. The method of claim 8, wherein the first resistor, the second resistor, the third resistor and the fourth resistor have different resistances.
10. The parallel HSC chip failure handling method of claim 9, wherein the conversion circuit further comprises a fifth resistor, a sixth resistor and a seventh resistor, and the resistance values of the first resistor, the second resistor, the third resistor, the fourth resistor, the fifth resistor, the sixth resistor, the seventh resistor and the pull-up resistor are the same;
the first switch tube is connected with the first resistor in series to form a first branch circuit, the second switch tube is connected with the second resistor in series to form a second branch circuit, the third switch tube is connected with the third resistor in series to form a third branch circuit, the fourth switch tube is connected with the fourth resistor in series to form a fourth branch circuit, one end of the pull-up resistor is connected with a power supply, the first branch circuit, the second branch circuit, the third branch circuit and the fourth branch circuit are sequentially connected between the other end of the pull-up resistor and the ground in parallel, the first switch tube, the second switch tube, the third switch tube and the fourth switch tube are all grounded, the fifth resistor is connected between the first resistor and the second resistor in series, the sixth resistor is connected between the second resistor and the third resistor in series, the seventh resistor is connected between the third resistor and the fourth resistor in series, and the fault alarm signal is a voltage value at the other end of the pull-up resistor.
11. The method of claim 10, wherein the analyzing, by the controller, of the fault alarm signal output by each conversion circuit to determine the HSC chip with the fault and the fault type of the HSC chip comprises:
in response to the fault alarm signal output by a given conversion circuit being equal to one half of the power supply voltage, confirming that the HSC chip corresponding to that conversion circuit has a fault, wherein the fault type is HSC chip overcurrent protection;
in response to the fault alarm signal output by a given conversion circuit being equal to two thirds of the power supply voltage, confirming that the HSC chip corresponding to that conversion circuit has a fault, wherein the fault type is HSC chip over-temperature protection;
in response to the fault alarm signal output by a given conversion circuit being equal to three fourths of the power supply voltage, confirming that the HSC chip corresponding to that conversion circuit has a fault, wherein the fault type is HSC chip short-circuit protection;
in response to the fault alarm signal output by a given conversion circuit being equal to four fifths of the power supply voltage, confirming that the HSC chip corresponding to that conversion circuit has a fault, wherein the fault type is HSC chip overvoltage protection;
and in response to the fault alarm signal output by a given conversion circuit being equal to zero, determining that the HSC chip corresponding to that conversion circuit has no fault.
12. The method of handling faults in parallel HSC chips of claim 8, wherein each HSC chip further has a current detection signal, and the controller is further configured to collect the current detection signal of each HSC chip and, after processing the collected current detection signals, generate a current-sharing alarm signal and a current-sharing control signal according to a preset determination rule, the method further comprising:
connecting a first input end of an AND gate with a power-on signal, connecting a second input end of the AND gate with the current-sharing control signal, and connecting an output end of the AND gate to an enable signal pin of each HSC chip.
13. The method of processing faults of HSC chips connected in parallel according to claim 12, wherein the step of generating a current sharing alarm signal and a current sharing control signal according to a preset determination rule after processing the collected current detection signals comprises:
calculating the average value of all the collected current detection signals;
in response to the current detection signal of any HSC chip being less than 0.6 times the average value or more than 1.6 times the average value, outputting a current-sharing alarm signal indicating that an alarm is needed and outputting a current-sharing control signal for turning off the HSC chip;
in response to the current detection signals of all HSC chips being greater than or equal to 0.6 times the average value and less than or equal to 1.6 times the average value, and the current detection signal of any HSC chip being greater than or equal to 0.6 times the average value and less than 0.9 times the average value, outputting a current-sharing alarm signal indicating that an alarm is needed and outputting a current-sharing control signal that does not turn off the HSC chips;
in response to the current detection signals of all HSC chips being greater than or equal to 0.6 times the average value and less than or equal to 1.6 times the average value, and the current detection signal of any HSC chip being greater than 1.1 times the average value and less than or equal to 1.6 times the average value, outputting a current-sharing alarm signal indicating that an alarm is needed and outputting a current-sharing control signal that does not turn off the HSC chips;
and in response to the current detection signals of all HSC chips being greater than or equal to 0.9 times the average value and less than or equal to 1.1 times the average value, outputting a current-sharing alarm signal indicating that no alarm is needed and outputting a current-sharing control signal that does not turn off the HSC chips.
14. The method for processing the failure of the HSC chips in parallel according to any one of claims 8-13, wherein the controller is a baseboard management controller.
15. A computer device, comprising:
at least one processor; and
a memory storing a computer program operable in the processor, the processor when executing the program performing the method of any of claims 8-13.
16. A computer-readable storage medium, in which a computer program is stored which, when being executed by a processor, is adapted to carry out the method of any one of claims 8-13.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211404113.9A CN115459204B (en) | 2022-11-10 | 2022-11-10 | Parallel HSC chip fault processing device, method, equipment and medium |
PCT/CN2023/100842 WO2024098750A1 (en) | 2022-11-10 | 2023-06-16 | Parallel hot swap controller chip fault processing apparatus and method, and device and medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211404113.9A CN115459204B (en) | 2022-11-10 | 2022-11-10 | Parallel HSC chip fault processing device, method, equipment and medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN115459204A (en) | 2022-12-09 |
CN115459204B (en) | 2023-02-28 |
Family
ID=84295572
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202211404113.9A Active CN115459204B (en) | 2022-11-10 | 2022-11-10 | Parallel HSC chip fault processing device, method, equipment and medium |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN115459204B (en) |
WO (1) | WO2024098750A1 (en) |
Families Citing this family (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115459204B (en) * | 2022-11-10 | 2023-02-28 | 苏州浪潮智能科技有限公司 | Parallel HSC chip fault processing device, method, equipment and medium |
CN115808640B (en) * | 2023-02-09 | 2023-05-16 | 苏州浪潮智能科技有限公司 | Power failure detection circuit, method, system, electronic device and storage medium |
CN116298635A (en) * | 2023-03-30 | 2023-06-23 | 海信家电集团股份有限公司 | IPM fault detection system, IPM fault detection method, IPM fault detection device and storage medium |
Family Cites Families (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2872089B2 (en) * | 1995-10-27 | 1999-03-17 | 群馬日本電気株式会社 | Hot-swap device |
US8847438B2 (en) * | 2008-07-14 | 2014-09-30 | Texas Instruments Incorporated | Minimum loss and wiring circuit and method for paralleling hot swap controllers |
CN103227583B (en) * | 2013-04-26 | 2015-09-02 | 中国能源建设集团江苏省电力设计院有限公司 | A kind of hot swap type converter system for new forms of energy and energy-storage system |
CN105656183A (en) * | 2014-11-15 | 2016-06-08 | 北京航天万源科技公司 | Modularized intelligent power supply and distribution device |
US9917437B2 (en) * | 2015-05-06 | 2018-03-13 | Cisco Technology, Inc. | Hot swap controller with individually controlled parallel current paths |
CN108880230B (en) * | 2018-06-28 | 2019-09-17 | 烽火通信科技股份有限公司 | Power sources in parallel control module and parallel system based on Switching Power Supply chopping voltage |
CN110377138A (en) * | 2019-06-29 | 2019-10-25 | 苏州浪潮智能科技有限公司 | A kind of multipath server power supply circuit and method for controlling power supply |
CN112069104A (en) * | 2020-05-28 | 2020-12-11 | 苏州浪潮智能科技有限公司 | Chip hot-plug protection circuit |
CN213072104U (en) * | 2020-08-28 | 2021-04-27 | 苏州浪潮智能科技有限公司 | Circuit for current sharing, chip for current sharing and circuit for hot plug current sharing control |
CN112181123B (en) * | 2020-09-28 | 2022-07-12 | 苏州浪潮智能科技有限公司 | Device and method for improving output current sharing of hot-plug chip |
CN115459204B (en) * | 2022-11-10 | 2023-02-28 | 苏州浪潮智能科技有限公司 | Parallel HSC chip fault processing device, method, equipment and medium |
- 2022-11-10: CN202211404113.9A, patent CN115459204B, status: Active
- 2023-06-16: PCT/CN2023/100842, WO2024098750A1, status: Unknown
Also Published As
Publication number | Publication date |
---|---|
CN115459204A (en) | 2022-12-09 |
WO2024098750A1 (en) | 2024-05-16 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN115459204B (en) | Parallel HSC chip fault processing device, method, equipment and medium | |
CN110045182B (en) | Low-voltage transformer area power supply loop abnormity analysis method based on impedance calculation | |
EP2981902B1 (en) | Automatic configuration of alarm aggregations | |
CN112286709B (en) | Diagnosis method, diagnosis device and diagnosis equipment for server hardware faults | |
CN101685333B (en) | Electronic device and power connection module thereof | |
US8054599B2 (en) | Apparatus, system, and method for detecting a power system component failure | |
CN114815946B (en) | Current output equipment, method, device, system and medium | |
Mokoka et al. | Reliability evaluation of distribution networks using NEPLAN & DIgSILENT power factory | |
CN113595246B (en) | Micro-grid state online monitoring method and device, computer equipment and storage medium | |
CN112838668B (en) | Circuit breaker identification method, device and equipment | |
CN115808640B (en) | Power failure detection circuit, method, system, electronic device and storage medium | |
CN112054484A (en) | High-reliability multi-phase power supply system and method | |
CN106647958A (en) | Server rack | |
CN115728665A (en) | Power failure detection circuit, method and system | |
CN210351233U (en) | Circuit breaker, cabinet and system | |
CN210604876U (en) | Fault detection circuit and equipment | |
CN112769968B (en) | Circuit breaker, cabinet, system, address acquisition method and equipment | |
CN113533891A (en) | Fault diagnosis system and device | |
CN112001588A (en) | Accident event online pre-judging method and device based on N-1 state | |
CN220913297U (en) | Blade board installation detection circuit, blade server and detection equipment | |
CN116626540B (en) | Method, system, terminal and storage medium for judging broken line fault interval | |
CN114356617B (en) | Error injection testing method, device, system and computing equipment | |
TWI754941B (en) | Electronic equipment and operation method thereof | |
CN114928159A (en) | Regional power consumption monitoring method, system, terminal and storage medium | |
CN216599106U (en) | Power supply device and power supply system |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant |