CN115459204A - Device, method, equipment and medium for processing faults of parallel HSC chips - Google Patents
- Publication number
- CN115459204A (application No. CN202211404113.9A)
- Authority
- CN
- China
- Prior art keywords
- resistor
- hsc
- switch tube
- chip
- fault
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- H—ELECTRICITY
- H02—GENERATION; CONVERSION OR DISTRIBUTION OF ELECTRIC POWER
- H02H—EMERGENCY PROTECTIVE CIRCUIT ARRANGEMENTS
- H02H3/00—Emergency protective circuit arrangements for automatic disconnection directly responsive to an undesired change from normal electric working condition with or without subsequent reconnection ; integrated protection
-
- G—PHYSICS
- G01—MEASURING; TESTING
- G01R—MEASURING ELECTRIC VARIABLES; MEASURING MAGNETIC VARIABLES
- G01R31/00—Arrangements for testing electric properties; Arrangements for locating electric faults; Arrangements for electrical testing characterised by what is being tested not provided for elsewhere
- G01R31/28—Testing of electronic circuits, e.g. by signal tracer
- G01R31/2851—Testing of integrated circuits [IC]
-
- H—ELECTRICITY
- H02—GENERATION; CONVERSION OR DISTRIBUTION OF ELECTRIC POWER
- H02H—EMERGENCY PROTECTIVE CIRCUIT ARRANGEMENTS
- H02H1/00—Details of emergency protective circuit arrangements
- H02H1/0007—Details of emergency protective circuit arrangements concerning the detecting means
-
- H—ELECTRICITY
- H02—GENERATION; CONVERSION OR DISTRIBUTION OF ELECTRIC POWER
- H02H—EMERGENCY PROTECTIVE CIRCUIT ARRANGEMENTS
- H02H3/00—Emergency protective circuit arrangements for automatic disconnection directly responsive to an undesired change from normal electric working condition with or without subsequent reconnection ; integrated protection
- H02H3/02—Details
- H02H3/04—Details with warning or supervision in addition to disconnection, e.g. for indicating that protective apparatus has functioned
- H02H3/042—Details with warning or supervision in addition to disconnection, e.g. for indicating that protective apparatus has functioned combined with means for locating the fault
-
- H—ELECTRICITY
- H02—GENERATION; CONVERSION OR DISTRIBUTION OF ELECTRIC POWER
- H02H—EMERGENCY PROTECTIVE CIRCUIT ARRANGEMENTS
- H02H3/00—Emergency protective circuit arrangements for automatic disconnection directly responsive to an undesired change from normal electric working condition with or without subsequent reconnection ; integrated protection
- H02H3/08—Emergency protective circuit arrangements for automatic disconnection directly responsive to an undesired change from normal electric working condition with or without subsequent reconnection ; integrated protection responsive to excess current
- H02H3/087—Emergency protective circuit arrangements for automatic disconnection directly responsive to an undesired change from normal electric working condition with or without subsequent reconnection ; integrated protection responsive to excess current for dc applications
-
- H—ELECTRICITY
- H02—GENERATION; CONVERSION OR DISTRIBUTION OF ELECTRIC POWER
- H02H—EMERGENCY PROTECTIVE CIRCUIT ARRANGEMENTS
- H02H3/00—Emergency protective circuit arrangements for automatic disconnection directly responsive to an undesired change from normal electric working condition with or without subsequent reconnection ; integrated protection
- H02H3/08—Emergency protective circuit arrangements for automatic disconnection directly responsive to an undesired change from normal electric working condition with or without subsequent reconnection ; integrated protection responsive to excess current
- H02H3/10—Emergency protective circuit arrangements for automatic disconnection directly responsive to an undesired change from normal electric working condition with or without subsequent reconnection ; integrated protection responsive to excess current additionally responsive to some other abnormal electrical conditions
-
- H—ELECTRICITY
- H02—GENERATION; CONVERSION OR DISTRIBUTION OF ELECTRIC POWER
- H02H—EMERGENCY PROTECTIVE CIRCUIT ARRANGEMENTS
- H02H3/00—Emergency protective circuit arrangements for automatic disconnection directly responsive to an undesired change from normal electric working condition with or without subsequent reconnection ; integrated protection
- H02H3/20—Emergency protective circuit arrangements for automatic disconnection directly responsive to an undesired change from normal electric working condition with or without subsequent reconnection ; integrated protection responsive to excess voltage
- H02H3/202—Emergency protective circuit arrangements for automatic disconnection directly responsive to an undesired change from normal electric working condition with or without subsequent reconnection ; integrated protection responsive to excess voltage for dc systems
-
- H—ELECTRICITY
- H02—GENERATION; CONVERSION OR DISTRIBUTION OF ELECTRIC POWER
- H02H—EMERGENCY PROTECTIVE CIRCUIT ARRANGEMENTS
- H02H5/00—Emergency protective circuit arrangements for automatic disconnection directly responsive to an undesired change from normal non-electric working conditions with or without subsequent reconnection
- H02H5/04—Emergency protective circuit arrangements for automatic disconnection directly responsive to an undesired change from normal non-electric working conditions with or without subsequent reconnection responsive to abnormal temperature
Landscapes
- Engineering & Computer Science (AREA)
- Computer Hardware Design (AREA)
- Microelectronics & Electronic Packaging (AREA)
- General Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Emergency Protection Circuit Devices (AREA)
- Power Sources (AREA)
Abstract
The invention relates to the field of circuit design, and in particular to a device, a method, equipment and a medium for processing faults of parallel HSC chips. The device comprises: multiple HSC chips connected in parallel, each provided with a plurality of preset fault state signals; a conversion circuit corresponding to each HSC chip, which converts the plurality of preset fault state signals into a fault alarm signal and outputs it; and a controller, which analyzes the fault alarm signal output by each conversion circuit to determine which HSC chip has failed and the fault type. The scheme realizes automatic fault location and fault diagnosis, avoids the manpower and material cost of on-site analysis by research and development personnel, and improves fault location efficiency.
Description
Technical Field
The invention relates to the field of circuit design, in particular to a device, a method, equipment and a medium for processing faults of parallel HSC chips.
Background
With the continued rise of cloud computing in recent years, internet traffic keeps increasing, which places higher demands on the data processing and storage capacity of servers in the machine room. In a traditional machine room organized by cabinet, the server computing nodes deployed inside a cabinet are required to provide ever stronger data processing capacity at ever higher deployment density. As internet user services grow, network data throughput also grows, and data center machine rooms deploy servers more and more densely. To improve deployment density and maintainability, servers are generally deployed in the cabinet as single nodes, and each single-node server is plugged directly onto a power bus bar (BUSBAR) through a POWER CLIP connector to draw power.
As the basic data processing units of a data center, servers also carry increasingly heavy loads. In particular, the load current of the CPU chips in a server keeps growing, and the working current of a single CPU reaches 100 to 200 A. The current at the power input of a single server node therefore keeps rising, and so does the current requirement on the hot-plug line at that input. For this reason, several Hot Swap Controller (HSC) chips are often connected in parallel in the hot swap unit to meet high-current requirements. In practice, connecting multiple HSCs in parallel carries the following risks. First, when one or more of the HSCs individually trigger Over-Current Protection (OCP), Short-Circuit Protection (SCP), Over-Temperature Protection (OTP) or Over-Voltage Protection (OVP), the whole hot-plug line is powered off, and the fault cannot be located quickly and accurately. Second, when several HSCs are used in parallel, the current through each HSC chip differs because of differences between individual chips and in the board layout; when the difference becomes large, the HSCs share current unevenly. Under long-term high load, breakdown failure then occurs easily, with the risk of power loss or board burning caused by an HSC chip shorting to ground. This disrupts normal customer service, and board burning also poses a fire-safety hazard to the data center machine room. Moreover, once a board has burned, research and development engineers usually have to spend resources analyzing the problem on site; because the board is burned, the fault scene is hard to reproduce, and locating the root cause of the burn is very difficult.
Fig. 1 is a schematic diagram of a traditional multi-HSC parallel application scenario. When several HSC chips are connected in parallel, their input terminals Vin are joined at the POWER CLIP connector to draw power from the BUSBAR, and their output terminals Vout are joined to supply power to the load. A Baseboard Management Controller (BMC) on the server node motherboard samples the HSC output voltage to detect abnormal output voltage. This approach has the following drawbacks. On one hand, when one or more of the HSCs individually trigger OCP, SCP, OTP or OVP, fault location is difficult. On the other hand, because of differences between individual HSC chips and in the line layout, there is no automatic protection when the HSCs share current unevenly, so safety is poor and improvement is urgently needed.
Disclosure of Invention
In view of the above, there is a need to provide a device, a method, an apparatus and a medium for processing failures of parallel HSC chips.
According to a first aspect of the present invention, there is provided a device for processing faults of parallel HSC chips, the device comprising:
multiple HSC chips connected in parallel, wherein each HSC chip is provided with a plurality of preset fault state signals;
a conversion circuit corresponding to each HSC chip, which is used for converting the plurality of preset fault state signals into a fault alarm signal and outputting the fault alarm signal;
and a controller, which analyzes the fault alarm signal output by each conversion circuit to determine the HSC chip with the fault and the fault type of the HSC chip.
In some embodiments, the conversion circuit comprises: a first switch tube, a second switch tube, a third switch tube, a fourth switch tube, a first resistor, a second resistor, a third resistor, a fourth resistor and a pull-up resistor, wherein the first resistor, the second resistor, the third resistor and the fourth resistor have different resistances;
the first switch tube is connected in series with the first resistor to form a first branch, the second switch tube is connected in series with the second resistor to form a second branch, the third switch tube is connected in series with the third resistor to form a third branch, the fourth switch tube is connected in series with the fourth resistor to form a fourth branch, one end of the pull-up resistor is connected to a power supply, the first branch, the second branch, the third branch and the fourth branch are connected in parallel, in sequence, between the other end of the pull-up resistor and ground, the first switch tube, the second switch tube, the third switch tube and the fourth switch tube are all grounded, and the fault alarm signal is the voltage value at the other end of the pull-up resistor;
the plurality of preset fault state signals comprise an overcurrent protection signal, an overtemperature protection signal, a short-circuit protection signal and an overvoltage protection signal, the overcurrent protection signal is used for driving the first switch tube to be switched on or switched off, the overtemperature protection signal is used for driving the second switch tube to be switched on or switched off, the short-circuit protection signal is used for driving the third switch tube to be switched on or switched off, the overvoltage protection signal is used for driving the fourth switch tube to be switched on or switched off, and at most one of the first switch tube, the second switch tube, the third switch tube and the fourth switch tube is switched on at the same moment.
In some embodiments, the conversion circuit comprises: a first switch tube, a second switch tube, a third switch tube, a fourth switch tube, a first resistor, a second resistor, a third resistor, a fourth resistor, a fifth resistor, a sixth resistor, a seventh resistor and a pull-up resistor, wherein the first resistor, the second resistor, the third resistor, the fourth resistor, the fifth resistor, the sixth resistor, the seventh resistor and the pull-up resistor all have the same resistance;
the first switch tube is connected with the first resistor in series to form a first branch circuit, the second switch tube is connected with the second resistor in series to form a second branch circuit, the third switch tube is connected with the third resistor in series to form a third branch circuit, the fourth switch tube is connected with the fourth resistor in series to form a fourth branch circuit, one end of the pull-up resistor is connected with a power supply, the first branch circuit, the second branch circuit, the third branch circuit and the fourth branch circuit are sequentially connected between the other end of the pull-up resistor and the ground in parallel, the first switch tube, the second switch tube, the third switch tube and the fourth switch tube are all grounded, the fifth resistor is connected between the first resistor and the second resistor in series, the sixth resistor is connected between the second resistor and the third resistor in series, the seventh resistor is connected between the third resistor and the fourth resistor in series, and the fault alarm signal is a voltage value at the other end of the pull-up resistor;
the preset fault state signals comprise an overcurrent protection signal, an overtemperature protection signal, a short-circuit protection signal and an overvoltage protection signal, the overcurrent protection signal is used for driving the first switch tube to be switched on or switched off, the overtemperature protection signal is used for driving the second switch tube to be switched on or switched off, the short-circuit protection signal is used for driving the third switch tube to be switched on or switched off, the overvoltage protection signal is used for driving the fourth switch tube to be switched on or switched off, and at most one of the first switch tube, the second switch tube, the third switch tube and the fourth switch tube is switched on at the same moment.
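The two conversion-circuit variants can be checked with a short voltage-divider calculation. The sketch below is illustrative only. It assumes ideal switch tubes with zero on-resistance and exactly one tube conducting. For the variant with different resistances, it assumes the k-th branch resistor is k times the pull-up value, which the text does not state. For the equal-resistance variant, it assumes the fifth to seventh resistors form a series ladder, so that the k-th branch is reached through k-1 of them. The names `divider_ratio`, `ratios_distinct_resistors` and `ratios_equal_resistors` are hypothetical.

```python
from fractions import Fraction

# Fault signal driving the first..fourth switch tube, in order.
FAULTS = ("OCP", "OTP", "SCP", "OVP")


def divider_ratio(r_pullup, r_to_ground):
    """Fraction of the supply voltage at the lower end of the pull-up resistor."""
    return Fraction(r_to_ground, r_pullup + r_to_ground)


def ratios_distinct_resistors(r=1):
    # Assumption: branch resistor k equals k * pull-up resistance.
    return {name: divider_ratio(r, (k + 1) * r) for k, name in enumerate(FAULTS)}


def ratios_equal_resistors(r=1):
    # Assumption: ladder reading of the text. Reaching branch k means passing
    # through k series resistors (fifth..seventh) before the branch resistor.
    ratios = {}
    for k, name in enumerate(FAULTS):
        series_path = k * r          # ladder resistors crossed on the way
        ratios[name] = divider_ratio(r, series_path + r)
    return ratios


if __name__ == "__main__":
    for name, ratio in ratios_equal_resistors().items():
        print(name, ratio)
```

Under these assumptions both variants give one distinct fraction of the supply voltage per fault signal, so a single analog input per HSC chip is enough to tell the four faults apart.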
In some embodiments, the controller is further configured to:
in response to the fault alarm signal output by a conversion circuit being equal to one half of the power supply voltage, confirming that the HSC chip corresponding to that conversion circuit has a fault, the fault type being HSC chip overcurrent protection;
in response to the fault alarm signal output by a conversion circuit being equal to two thirds of the power supply voltage, confirming that the HSC chip corresponding to that conversion circuit has a fault, the fault type being HSC chip over-temperature protection;
in response to the fault alarm signal output by a conversion circuit being equal to three quarters of the power supply voltage, confirming that the HSC chip corresponding to that conversion circuit has a fault, the fault type being HSC chip short-circuit protection;
in response to the fault alarm signal output by a conversion circuit being equal to four fifths of the power supply voltage, confirming that the HSC chip corresponding to that conversion circuit has a fault, the fault type being HSC chip overvoltage protection;
and in response to the fault alarm signal output by a conversion circuit being equal to zero, determining that the HSC chip corresponding to that conversion circuit has no fault.
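The mapping above from alarm voltage to fault type could be implemented in controller firmware roughly as follows. This is a minimal sketch, not the patented implementation. The tolerance `tol`, the `"UNKNOWN"` result for voltages matching no listed level, and the names `decode_fault` and `locate_faults` are assumptions added for illustration. A real controller would read each voltage from an ADC channel.

```python
# Expected alarm level as a fraction of the supply voltage, per the text.
LEVELS = (
    (1 / 2, "OCP"),  # overcurrent protection
    (2 / 3, "OTP"),  # over-temperature protection
    (3 / 4, "SCP"),  # short-circuit protection
    (4 / 5, "OVP"),  # overvoltage protection
)


def decode_fault(v_alarm, v_supply, tol=0.02):
    """Return the fault type for one conversion circuit, or None if no fault.

    tol is an assumed matching tolerance on the voltage ratio; it must stay
    below half the smallest gap between levels (0.8 - 0.75 = 0.05).
    """
    ratio = v_alarm / v_supply
    if abs(ratio) <= tol:
        return None                      # zero volts: no fault, per the text
    for level, name in LEVELS:
        if abs(ratio - level) <= tol:
            return name
    return "UNKNOWN"                     # assumption: flag unlisted levels


def locate_faults(readings, v_supply):
    """Map chip name -> fault type for every chip reporting a fault."""
    faults = {}
    for chip, v_alarm in readings.items():
        fault = decode_fault(v_alarm, v_supply)
        if fault is not None:
            faults[chip] = fault
    return faults


if __name__ == "__main__":
    print(locate_faults({"HSC1": 0.0, "HSC2": 2.64, "HSC3": 1.65}, 3.3))
```

Because each HSC chip has its own conversion circuit, the dictionary key identifies the faulty chip and the decoded level identifies the fault type.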
In some embodiments, each HSC chip further has a current detection signal, and the controller is further configured to collect the current detection signal of each HSC chip, process the collected current detection signal, and generate a current sharing alarm signal and a current sharing control signal according to a preset determination rule;
the device further comprises an AND gate, wherein a first input end of the AND gate is connected with a starting signal, a second input end of the AND gate is connected with the current sharing control signal, and an output end of the AND gate is connected to an enabling signal pin of each HSC chip.
In some embodiments, the controller is further configured to:
calculating the mean value of all the collected current detection signals;
in response to the current detection signal of any HSC chip being less than 0.6 times the mean value or greater than 1.6 times the mean value, outputting a current-sharing alarm signal that raises an alarm and a current-sharing control signal that turns off the HSC chips;
in response to the current detection signals of all HSC chips being greater than or equal to 0.6 times and less than or equal to 1.6 times the mean value, and the current detection signal of any HSC chip being greater than or equal to 0.6 times and less than 0.9 times the mean value, outputting a current-sharing alarm signal that raises an alarm and a current-sharing control signal that does not turn off the HSC chips;
in response to the current detection signals of all HSC chips being greater than or equal to 0.6 times and less than or equal to 1.6 times the mean value, and the current detection signal of any HSC chip being greater than 1.1 times and less than or equal to 1.6 times the mean value, outputting a current-sharing alarm signal that raises an alarm and a current-sharing control signal that does not turn off the HSC chips;
and in response to the current detection signals of all HSC chips being greater than or equal to 0.9 times and less than or equal to 1.1 times the mean value, outputting a current-sharing alarm signal that does not raise an alarm and a current-sharing control signal that does not turn off the HSC chips.
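The current-sharing rule and the AND gate on the enable pins can be sketched together. The code below is an illustration under stated assumptions. The control signal is modeled as a boolean `keep_on` that is false when the chips must be turned off, a polarity chosen here so that the AND gate pulls the enable pins low, since the text does not give the signal polarity. The function names `current_sharing` and `hsc_enable` are hypothetical.

```python
from statistics import fmean


def current_sharing(currents):
    """Return (alarm, keep_on) for the sampled per-chip currents.

    alarm   -> True when a current-sharing alarm must be raised
    keep_on -> False when the control signal must turn the HSC chips off
               (assumed active-low so the AND gate disables the chips)
    """
    mean = fmean(currents)
    # Outside [0.6, 1.6] x mean: alarm and shut down.
    if any(i < 0.6 * mean or i > 1.6 * mean for i in currents):
        return True, False
    # Inside the shutdown limits but outside [0.9, 1.1] x mean: alarm only.
    if any(i < 0.9 * mean or i > 1.1 * mean for i in currents):
        return True, True
    # Every chip within [0.9, 1.1] x mean: balanced, no action.
    return False, True


def hsc_enable(start_signal, keep_on):
    """AND gate feeding the enable pin of every HSC chip."""
    return start_signal and keep_on


if __name__ == "__main__":
    alarm, keep_on = current_sharing([10.0, 10.0, 8.0])
    print(alarm, keep_on, hsc_enable(True, keep_on))
```

Gating the enable pins in hardware means a severe imbalance disables all parallel HSC chips at once rather than leaving the remaining chips to carry the full load.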
In some embodiments, the controller is a baseboard management controller.
According to a second aspect of the present invention, there is provided a method for processing faults of parallel HSC chips, the method comprising:
multiple HSC chips connected in parallel, wherein each HSC chip is provided with a plurality of preset fault state signals;
converting a plurality of preset fault state signals into fault alarm signals by using a conversion circuit corresponding to each HSC chip and outputting the fault alarm signals;
the controller analyzes the fault alarm signal output by each conversion circuit to determine the HSC chip with fault and the fault type of the HSC chip.
In some embodiments, the conversion circuit comprises: a first switch tube, a second switch tube, a third switch tube, a fourth switch tube, a first resistor, a second resistor, a third resistor, a fourth resistor and a pull-up resistor, wherein the first resistor, the second resistor, the third resistor and the fourth resistor have different resistances;
the first switch tube is connected in series with the first resistor to form a first branch, the second switch tube is connected in series with the second resistor to form a second branch, the third switch tube is connected in series with the third resistor to form a third branch, the fourth switch tube is connected in series with the fourth resistor to form a fourth branch, one end of the pull-up resistor is connected to a power supply, the first branch, the second branch, the third branch and the fourth branch are connected in parallel, in sequence, between the other end of the pull-up resistor and ground, the first switch tube, the second switch tube, the third switch tube and the fourth switch tube are all grounded, and the fault alarm signal is the voltage value at the other end of the pull-up resistor;
the preset fault state signals comprise an overcurrent protection signal, an overtemperature protection signal, a short-circuit protection signal and an overvoltage protection signal, the overcurrent protection signal is used for driving the first switch tube to be switched on or switched off, the overtemperature protection signal is used for driving the second switch tube to be switched on or switched off, the short-circuit protection signal is used for driving the third switch tube to be switched on or switched off, the overvoltage protection signal is used for driving the fourth switch tube to be switched on or switched off, and at most one of the first switch tube, the second switch tube, the third switch tube and the fourth switch tube is switched on at the same time.
In some embodiments, the conversion circuit comprises: a first switch tube, a second switch tube, a third switch tube, a fourth switch tube, a first resistor, a second resistor, a third resistor, a fourth resistor, a fifth resistor, a sixth resistor, a seventh resistor and a pull-up resistor, wherein the first resistor, the second resistor, the third resistor, the fourth resistor, the fifth resistor, the sixth resistor, the seventh resistor and the pull-up resistor all have the same resistance;
the first switch tube is connected with the first resistor in series to form a first branch circuit, the second switch tube is connected with the second resistor in series to form a second branch circuit, the third switch tube is connected with the third resistor in series to form a third branch circuit, the fourth switch tube is connected with the fourth resistor in series to form a fourth branch circuit, one end of the pull-up resistor is connected with a power supply, the first branch circuit, the second branch circuit, the third branch circuit and the fourth branch circuit are sequentially connected between the other end of the pull-up resistor and the ground in parallel, the first switch tube, the second switch tube, the third switch tube and the fourth switch tube are all grounded, the fifth resistor is connected between the first resistor and the second resistor in series, the sixth resistor is connected between the second resistor and the third resistor in series, the seventh resistor is connected between the third resistor and the fourth resistor in series, and the fault alarm signal is a voltage value at the other end of the pull-up resistor;
the preset fault state signals comprise an overcurrent protection signal, an overtemperature protection signal, a short-circuit protection signal and an overvoltage protection signal, the overcurrent protection signal is used for driving the first switch tube to be switched on or switched off, the overtemperature protection signal is used for driving the second switch tube to be switched on or switched off, the short-circuit protection signal is used for driving the third switch tube to be switched on or switched off, the overvoltage protection signal is used for driving the fourth switch tube to be switched on or switched off, and at most one of the first switch tube, the second switch tube, the third switch tube and the fourth switch tube is switched on at the same moment.
In some embodiments, the analyzing, by the controller, the fault alarm signal output by each conversion circuit to determine the failed HSC chip and the type of fault of the HSC chip comprises:
in response to the fault alarm signal output by a conversion circuit being equal to one half of the power supply voltage, confirming that the HSC chip corresponding to that conversion circuit has a fault, the fault type being HSC chip overcurrent protection;
in response to the fault alarm signal output by a conversion circuit being equal to two thirds of the power supply voltage, confirming that the HSC chip corresponding to that conversion circuit has a fault, the fault type being HSC chip over-temperature protection;
in response to the fault alarm signal output by a conversion circuit being equal to three quarters of the power supply voltage, confirming that the HSC chip corresponding to that conversion circuit has a fault, the fault type being HSC chip short-circuit protection;
in response to the fault alarm signal output by a conversion circuit being equal to four fifths of the power supply voltage, confirming that the HSC chip corresponding to that conversion circuit has a fault, the fault type being HSC chip overvoltage protection;
and in response to the fault alarm signal output by a conversion circuit being equal to zero, determining that the HSC chip corresponding to that conversion circuit has no fault.
In some embodiments, each HSC chip further has a current detection signal, and the controller is further configured to collect the current detection signal of each HSC chip, process the collected current detection signals, and generate a current-sharing alarm signal and a current-sharing control signal according to a preset determination rule; the method further comprises:
connecting a first input end of an AND gate to a start signal, connecting a second input end of the AND gate to the current-sharing control signal, and connecting an output end of the AND gate to the enable signal pin of each HSC chip.
In some embodiments, the processing the collected current detection signals and then generating the current sharing alarm signal and the current sharing control signal according to a preset determination rule includes:
calculating the mean value of all the collected current detection signals;
in response to the current detection signal of any HSC chip being less than 0.6 times the mean value or greater than 1.6 times the mean value, outputting a current-sharing alarm signal that raises an alarm and a current-sharing control signal that turns off the HSC chips;
in response to the current detection signals of all HSC chips being greater than or equal to 0.6 times and less than or equal to 1.6 times the mean value, and the current detection signal of any HSC chip being greater than or equal to 0.6 times and less than 0.9 times the mean value, outputting a current-sharing alarm signal that raises an alarm and a current-sharing control signal that does not turn off the HSC chips;
in response to the current detection signals of all HSC chips being greater than or equal to 0.6 times and less than or equal to 1.6 times the mean value, and the current detection signal of any HSC chip being greater than 1.1 times and less than or equal to 1.6 times the mean value, outputting a current-sharing alarm signal that raises an alarm and a current-sharing control signal that does not turn off the HSC chips;
and in response to the current detection signals of all HSC chips being greater than or equal to 0.9 times and less than or equal to 1.1 times the mean value, outputting a current-sharing alarm signal that does not raise an alarm and a current-sharing control signal that does not turn off the HSC chips.
In some embodiments, the controller is a baseboard management controller.
According to a third aspect of the present invention, there is also provided a computer apparatus comprising:
at least one processor; and
a memory storing a computer program executable on the processor, wherein the processor performs the above method for processing faults of parallel HSC chips when executing the program.
According to a fourth aspect of the present invention, there is also provided a computer-readable storage medium storing a computer program, which when executed by a processor, performs the method for processing the failure of the parallel HSC chips.
According to the parallel HSC chip fault processing device, the conversion circuit converts the preset fault state signals inside the HSC chip into a fault alarm signal, and the controller analyzes the fault alarm signal to locate the specific fault position and fault cause. Automatic fault location and fault diagnosis are thereby realized, the manpower and material cost of on-site analysis by research and development personnel is avoided, and the fault location efficiency is improved.
In addition, the invention also provides a method for processing faults of parallel HSC chips, a computer device and a computer-readable storage medium, which achieve the same technical effects; these are not described again here.
Drawings
In order to illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings used in the description of the embodiments or the prior art are briefly introduced below. The drawings described below are obviously only some embodiments of the present invention, and those skilled in the art can obtain other embodiments from these drawings without creative effort.
Fig. 1 is a schematic diagram of a conventional multi-HSC parallel application scenario;
Fig. 2 is a schematic diagram of a device for handling a fault of parallel HSC chips according to an embodiment of the present invention;
Fig. 3A is a schematic structural diagram of a conversion circuit according to an embodiment of the present invention;
Fig. 3B is a schematic structural diagram of a conversion circuit according to another embodiment of the present invention;
Fig. 4 is a schematic diagram illustrating a parallel HSC current sharing detection and determination principle according to another embodiment of the present invention;
Fig. 5 is a schematic diagram illustrating a method for handling a fault of a parallel HSC chip according to an embodiment of the present invention;
Fig. 6 is an internal structural view of a computer device in another embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the following embodiments of the present invention are described in further detail with reference to the accompanying drawings.
It should be noted that all uses of "first" and "second" in the embodiments of the present invention serve to distinguish two non-identical entities or parameters that share the same name. "First" and "second" are used merely for convenience of description, should not be construed as limiting the embodiments of the present invention, and are not explained again in the following embodiments.
In an embodiment, referring to fig. 2, the present invention provides a device for handling failures of parallel HSC chips, specifically, the device includes:
multiple HSC chips connected in parallel, wherein each HSC chip is provided with a plurality of preset fault state signals;
a conversion circuit corresponding to each HSC chip, configured to convert the plurality of preset fault state signals into a fault alarm signal and output the fault alarm signal;
and a controller configured to analyze the fault alarm signal output by each conversion circuit to determine the HSC chip that has failed and the fault type of that HSC chip.
According to the parallel HSC chip fault processing device, the conversion circuit converts the preset fault state signals inside the HSC chip into a fault alarm signal, and the controller analyzes the fault alarm signal to locate the specific fault position and fault cause. Automatic fault location and fault diagnosis are thereby realized, the manpower and material cost of on-site analysis by research and development personnel is avoided, and the fault location efficiency is improved.
In some embodiments, referring to fig. 3A, the conversion circuit includes: a first switching tube S1, a second switching tube S2, a third switching tube S3, a fourth switching tube S4, a first resistor R1, a second resistor R2, a third resistor R3, a fourth resistor R4 and a pull-up resistor, wherein the first resistor R1, the second resistor R2, the third resistor R3 and the fourth resistor R4 have different resistance values;
the first switch tube S1 is connected in series with the first resistor R1 to form a first branch, the second switch tube S2 is connected in series with the second resistor R2 to form a second branch, the third switch tube S3 is connected in series with the third resistor R3 to form a third branch, the fourth switch tube S4 is connected in series with the fourth resistor R4 to form a fourth branch, one end of the pull-up resistor is connected to the power supply, the first branch, the second branch, the third branch and the fourth branch are connected in parallel, in sequence, between the other end of the pull-up resistor and ground, the first switch tube S1, the second switch tube S2, the third switch tube S3 and the fourth switch tube S4 are all grounded, and the fault alarm signal is the voltage value at the other end of the pull-up resistor;
the preset fault state signals comprise an overcurrent protection signal, an overtemperature protection signal, a short-circuit protection signal and an overvoltage protection signal, the overcurrent protection signal is used for driving the first switching tube S1 to be switched on or switched off, the overtemperature protection signal is used for driving the second switching tube S2 to be switched on or switched off, the short-circuit protection signal is used for driving the third switching tube S3 to be switched on or switched off, and the overvoltage protection signal is used for driving the fourth switching tube S4 to be switched on or switched off, wherein at most one of the first switching tube S1, the second switching tube S2, the third switching tube S3 and the fourth switching tube S4 is switched on at the same time.
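Because at most one switch tube conducts at a time, the circuit of fig. 3A behaves as a two-resistor voltage divider whose lower leg is whichever branch resistor is switched in. A minimal Python sketch of that relationship follows; the supply voltage, the pull-up value and the four branch values are hypothetical numbers chosen only for illustration, and the switch tubes are assumed to be ideal (zero on-resistance).

```python
def alert_voltage(vcc, r_pullup, r_branch):
    """Voltage at the lower end of the pull-up when exactly one branch conducts.

    Assumes an ideal switch tube, so the branch contributes only r_branch.
    """
    return vcc * r_branch / (r_pullup + r_branch)


# Hypothetical component values (not taken from the patent).
VCC = 3.3            # volts
R_PULLUP = 10_000.0  # ohms
BRANCH_OHMS = {
    "OCP": 2_000.0,   # branch driven by the overcurrent protection signal
    "OTP": 5_000.0,   # branch driven by the over-temperature protection signal
    "SCP": 10_000.0,  # branch driven by the short-circuit protection signal
    "OVP": 20_000.0,  # branch driven by the overvoltage protection signal
}

# One distinct alarm level per fault, because the branch resistances differ.
levels = {name: alert_voltage(VCC, R_PULLUP, r) for name, r in BRANCH_OHMS.items()}

for name, volts in levels.items():
    print(f"{name}: {volts:.3f} V")
```

Distinct branch resistances are what make the four alarm levels distinguishable; in practice the spacing between levels would also have to exceed the measurement error of the ADC that reads them.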
In some embodiments, referring to fig. 3B, the conversion circuit includes: a first switching tube S1, a second switching tube S2, a third switching tube S3, a fourth switching tube S4, a first resistor R1, a second resistor R2, a third resistor R3, a fourth resistor R4, a fifth resistor R5, a sixth resistor R6, a seventh resistor R7 and a pull-up resistor, wherein the first resistor R1, the second resistor R2, the third resistor R3, the fourth resistor R4, the fifth resistor R5, the sixth resistor R6, the seventh resistor R7 and the pull-up resistor have the same resistance value;
the first switch tube S1 is connected in series with the first resistor R1 to form a first branch, the second switch tube S2 is connected in series with the second resistor R2 to form a second branch, the third switch tube S3 is connected in series with the third resistor R3 to form a third branch, the fourth switch tube S4 is connected in series with the fourth resistor R4 to form a fourth branch, one end of the pull-up resistor is connected to the power supply, the first branch, the second branch, the third branch and the fourth branch are connected in parallel, in sequence, between the other end of the pull-up resistor and ground, the first switch tube S1, the second switch tube S2, the third switch tube S3 and the fourth switch tube S4 are all grounded, the fifth resistor R5 is connected in series between the first resistor R1 and the second resistor R2, the sixth resistor R6 is connected in series between the second resistor R2 and the third resistor R3, the seventh resistor R7 is connected in series between the third resistor R3 and the fourth resistor R4, and the fault alarm signal is the voltage value at the other end of the pull-up resistor;
the preset fault state signals comprise an overcurrent protection signal, an overtemperature protection signal, a short-circuit protection signal and an overvoltage protection signal, the overcurrent protection signal is used for driving the first switch tube S1 to be switched on or switched off, the overtemperature protection signal is used for driving the second switch tube S2 to be switched on or switched off, the short-circuit protection signal is used for driving the third switch tube S3 to be switched on or switched off, and the overvoltage protection signal is used for driving the fourth switch tube S4 to be switched on or switched off, wherein at most one of the first switch tube S1, the second switch tube S2, the third switch tube S3 and the fourth switch tube S4 is switched on at the same time.
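With all resistors equal, one reading of the fig. 3B topology is a ladder: when only the k-th switch tube conducts, the path from the pull-up node to ground passes through k equal resistors (the k-1 series resistors before the branch, plus the branch resistor itself). The Python sketch below works out the resulting divider ratios under that assumed reading, with ideal switch tubes; the function name `ladder_ratio` is hypothetical.

```python
from fractions import Fraction


def ladder_ratio(k, r=Fraction(1), r_pullup=Fraction(1)):
    """Ratio of alarm voltage to supply voltage when only switch k conducts.

    Assumed topology: k resistors of value r in series between the pull-up
    node and ground, so the lower leg of the divider is k * r.
    """
    r_path = k * r
    return r_path / (r_pullup + r_path)


# Switches 1..4 correspond to S1..S4 in the text.
ratios = [ladder_ratio(k) for k in range(1, 5)]

for k, ratio in enumerate(ratios, start=1):
    print(f"S{k} conducting: ALERT = {ratio} of the supply voltage")
```

Under this reading the ratios come out as k/(k+1), which yields four well-separated levels from a single resistor value.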
In some embodiments, the controller is further configured to:
in response to the fault alarm signal output by a conversion circuit being equal to one half of the power supply voltage, determining that the HSC chip corresponding to that conversion circuit has failed, the fault type being HSC chip overcurrent protection;
in response to the fault alarm signal output by a conversion circuit being equal to two thirds of the power supply voltage, determining that the HSC chip corresponding to that conversion circuit has failed, the fault type being HSC chip over-temperature protection;
in response to the fault alarm signal output by a conversion circuit being equal to three quarters of the power supply voltage, determining that the HSC chip corresponding to that conversion circuit has failed, the fault type being HSC chip short-circuit protection;
in response to the fault alarm signal output by a conversion circuit being equal to four fifths of the power supply voltage, determining that the HSC chip corresponding to that conversion circuit has failed, the fault type being HSC chip overvoltage protection;
and in response to the fault alarm signal output by a conversion circuit being equal to zero, determining that the HSC chip corresponding to that conversion circuit has not failed.
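The judgment rules above can be sketched as a small lookup in Python. The function names and the matching tolerance are assumptions introduced here for illustration (a sampled voltage is never exactly equal to a nominal fraction); only the five ratio-to-fault pairs come from the text.

```python
from fractions import Fraction

# Ratio of alarm voltage to supply voltage, as listed in the rules above.
FAULT_BY_RATIO = {
    Fraction(1, 2): "overcurrent protection (OCP)",
    Fraction(2, 3): "over-temperature protection (OTP)",
    Fraction(3, 4): "short-circuit protection (SCP)",
    Fraction(4, 5): "overvoltage protection (OVP)",
    Fraction(0, 1): "no fault",
}


def decode_alert(v_alert, vcc, tolerance=0.02):
    """Map one measured alarm voltage to a fault label.

    tolerance is a hypothetical matching window expressed as a fraction of
    vcc; it is narrower than half the smallest gap between nominal levels.
    """
    ratio = v_alert / vcc
    for nominal, label in FAULT_BY_RATIO.items():
        if abs(ratio - float(nominal)) <= tolerance:
            return label
    return "unrecognized level"


def locate_faults(alerts, vcc):
    """Return {chip index: fault label} for every chip that reports a fault."""
    faults = {}
    for index, v_alert in enumerate(alerts):
        label = decode_alert(v_alert, vcc)
        if label != "no fault":
            faults[index] = label
    return faults


# Example: three chips on a 3.3 V supply, the second reporting 3/4 of supply.
print(locate_faults([0.0, 2.475, 0.0], 3.3))
```

Because each conversion circuit feeds its own ADC input, the index of the reading identifies the failed chip and the voltage level identifies the fault type.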
In some embodiments, referring to fig. 4, each HSC chip further has a current detection signal, and the controller is further configured to collect the current detection signal of each HSC chip, process the collected current detection signal, and generate a current sharing alarm signal and a current sharing control signal according to a preset determination rule;
the device further comprises an AND gate, wherein a first input end of the AND gate is connected with a starting signal, a second input end of the AND gate is connected with the current sharing control signal, and an output end of the AND gate is connected to an enabling signal pin of each HSC chip.
In some embodiments, the controller is further configured to:
calculating the average value of all the collected current detection signals;
in response to the current detection signal of any HSC chip being less than 0.6 times the average value or more than 1.6 times the average value, outputting a current-sharing alarm signal indicating that an alarm is required and a current-sharing control signal that turns off the HSC chips;
in response to the current detection signals of all HSC chips being greater than or equal to 0.6 times and less than or equal to 1.6 times the average value, and the current detection signal of any HSC chip being greater than or equal to 0.6 times and less than 0.9 times the average value, outputting a current-sharing alarm signal indicating that an alarm is required and a current-sharing control signal that does not turn off the HSC chips;
in response to the current detection signals of all HSC chips being greater than or equal to 0.6 times and less than or equal to 1.6 times the average value, and the current detection signal of any HSC chip being greater than 1.1 times and less than or equal to 1.6 times the average value, outputting a current-sharing alarm signal indicating that an alarm is required and a current-sharing control signal that does not turn off the HSC chips;
and in response to the current detection signals of all HSC chips being greater than or equal to 0.9 times and less than or equal to 1.1 times the average value, outputting a current-sharing alarm signal indicating that no alarm is required and a current-sharing control signal that does not turn off the HSC chips.
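The four current-sharing rules collapse into two nested threshold checks. The following Python sketch is one way to express them; the function name and the boolean return convention are assumptions (True for "alarm required" and for "turn off"), and the electrical polarity of the actual control signal is not modelled.

```python
def current_sharing_decision(imon):
    """Return (alarm, shut_off) for a non-empty list of per-chip readings.

    alarm    -- True when a current-sharing alarm is required
    shut_off -- True when the HSC chips should be turned off
    """
    mean = sum(imon) / len(imon)

    # Any reading outside [0.6, 1.6] x mean: alarm and turn off.
    if any(i < 0.6 * mean or i > 1.6 * mean for i in imon):
        return True, True

    # All readings inside [0.6, 1.6] x mean, but at least one outside
    # [0.9, 1.1] x mean: alarm only.
    if any(i < 0.9 * mean or i > 1.1 * mean for i in imon):
        return True, False

    # All readings inside [0.9, 1.1] x mean: balanced.
    return False, False


print(current_sharing_decision([10.0, 10.0, 10.0, 10.0]))  # balanced
print(current_sharing_decision([10.0, 10.0, 10.0, 8.0]))   # mild imbalance
print(current_sharing_decision([10.0, 10.0, 10.0, 2.0]))   # severe imbalance
```

Checking the widest band first means the later checks never need to restate the 0.6 and 1.6 bounds.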
In some embodiments, the controller is a baseboard management controller.
In another embodiment, to facilitate understanding of the solution of the present invention, a data center server is taken as an example of the load, which is powered by n (n is greater than or equal to 2) HSC chips in parallel. Referring again to fig. 2 to 4, another parallel HSC chip fault processing device is provided, including: a POWER CLIP connector, HSC0, HSC1, …, HSCn, conversion circuits, a load LOAD, and a BMC. The device differs from the traditional approach (in which the BMC only monitors the voltage signal transmitted by the HSC) in that: (1) the OCP, SCP, OTP and OVP fault states inside the HSC are mapped to voltage values of the alarm signal ALERT, and the specific fault information is located by judging the alarm signal voltage value in the BMC; (2) the IMON signals of the multiple HSC chips are introduced into the BMC, and after the IMON signals are digitally processed inside the BMC, a current-sharing alarm signal and a current-sharing control signal are generated according to a judgment rule. The current-sharing control signal controls whether the HSCs are turned off, which effectively avoids board burning caused by uneven current. The functions of the various parts of the device are explained in detail below:
The input ends of HSC0, HSC1, …, HSCn are connected in parallel and draw Vin from the BUSBAR through the POWER CLIP connector; the output ends are connected in parallel and combined to output Vout, which supplies power to the LOAD.
The currents flowing through the HSC chips are, in turn, I0, I1, …, In. The signals led out of the HSC chips, [IMON0, IMON1, …, IMONn] and [ALERT0, ALERT1, …, ALERTn], are connected to the ADC I/O ports of the BMC. [IMON0, IMON1, …, IMONn] denotes the current detection signals led out of the HSC0, HSC1, …, HSCn chips and represents the magnitude of the current flowing through each HSC chip.
[ALERT0, ALERT1, …, ALERTn] denotes the alarm signals led out of the HSC0, HSC1, …, HSCn chips and represents the specific fault triggered by each HSC chip.
Based on the collected current signals and the voltage values of the alarm signals, the BMC performs judgment and processing, gives a fault location result, and triggers the current-sharing protection signal Current Balance.
As shown in fig. 3B, the HSC fault location and processing circuit provided in this embodiment takes four common faults of the HSC chip as an example (when S0, S1, S2 or S3 is low, the corresponding switch of the conversion circuit is closed), where S0 represents triggering of OCP (overcurrent protection); S1 represents triggering of OTP (over-temperature protection); S2 represents triggering of SCP (short-circuit protection); and S3 represents triggering of OVP (overvoltage protection). These four logic-level signals control the conversion circuit at the lower right of fig. 3B, which outputs ALERT signals with different voltage values. The conversion relationship for S0, S1, S2 and S3 is given in Table 1 below.
Table 1. Correspondence between fault alarm signal voltage values and fault types
Fig. 4 is a schematic diagram of the current-sharing diagnostic structure for multiple HSCs connected in parallel. The current detection signals [IMON0, IMON1, …, IMONn] generated by HSC0, HSC1, …, HSCn are passed to the ADC I/O ports [ADC0, ADC1, …, ADCn] of the BMC. After the current detection signals are digitally processed inside the BMC chip, the average current value is calculated according to the following formula:
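Consistent with the averaging step of the judgment rules (the average value of all collected current detection signals), the mean can be written as follows. The symbol for the mean is an assumption introduced here, and the sum takes the indexing HSC0 to HSCn at face value, that is, n + 1 sampled channels.

```latex
I_{\mathrm{avg}} = \frac{1}{n+1} \sum_{i=0}^{n} \mathrm{IMON}_i
```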
Table 2. Correspondence between IMONi, the current-sharing alarm signal and the current-sharing control signal
The BMC compares each IMONi with the average value and generates the Current Balance ALERT signal and the Current Balance SHUT control signal accordingly. The correspondence and the resulting actions are shown in Table 2 above. When the Current Balance SHUT control signal is low, it passes through the AND gate together with the power-on signal PSON and turns the HSCs off, thereby preventing HSC burnout caused by severely uneven current.
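The gating described above reduces to a single AND operation. A minimal Python sketch follows; the function name `hsc_enable` and the encoding of a high level as True are assumptions made for illustration.

```python
def hsc_enable(pson: bool, balance_shut: bool) -> bool:
    """Level driven onto the EN pin of every HSC chip.

    pson         -- power-on signal (True = high)
    balance_shut -- Current Balance SHUT line (True = high, False = low,
                    where low requests a shutdown)
    """
    return pson and balance_shut


# Truth table: EN is high only when both inputs are high.
for pson in (False, True):
    for balance_shut in (False, True):
        print(f"PSON={pson!s:5} SHUT={balance_shut!s:5} -> EN={hsc_enable(pson, balance_shut)}")
```

A low level on either input is enough to disable the chips, so the BMC can force a shutdown without touching the power-on signal.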
The specific working process of the parallel HSC chip fault processing device is described in detail below:
Step one: according to the server system configuration (CPU, memory, hard disk array and system fan module), the total power consumption P_T of the node mainboard is calculated, and the maximum current value I_T of the power supply path from the BUSBAR to the HSC is determined according to an 80% derating standard.
Step two: according to the maximum current value I_T, a POWER CLIP connector that meets the current-carrying requirement, an HSC chip of a suitable model (such as MP5991 GLU-Z), and the number of HSC chips to be connected in parallel are selected.
Step three: the input ends of the HSC chips are connected in parallel with each other, and the output ends are connected in parallel with each other. The IMON pin of each HSC chip is connected to an ADC I/O pin of the BMC, and the ALERT pin of each HSC chip is connected to an ADC I/O pin of the BMC.
Step four: the judgment rule for alarm-signal fault information and the judgment rule for current-sharing alarm and control are imported into the BMC firmware.
Step five: the Current Balance SHUT signal generated by the BMC and the node power-on signal PSON are connected to the EN pins of the HSC chips through a logic AND gate chip.
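Steps one and two amount to a short sizing calculation, sketched below in Python. Several things here are assumptions rather than statements from the text: the 80% derating is read as "the path is sized so that the load current is 80% of its capacity", the input voltage and per-chip current rating are hypothetical numbers, and the function names are invented for illustration. Exact fractions are used so that the rounding step is not disturbed by floating-point error.

```python
import math
from fractions import Fraction


def max_path_current(p_total_w, v_in, derating=Fraction(4, 5)):
    """I_T in amperes: load current P_T / Vin divided by the derating factor.

    The interpretation of the derating standard is an assumption.
    """
    return Fraction(p_total_w) / Fraction(v_in) / derating


def chips_needed(i_t, chip_rating_a):
    """Smallest number of parallel HSC chips whose combined rating covers I_T."""
    return math.ceil(Fraction(i_t) / Fraction(chip_rating_a))


# Hypothetical node: 2400 W total power on a 12 V bus, 60 A per HSC chip.
i_t = max_path_current(2400, 12)
print(f"I_T = {float(i_t):.1f} A")
print(f"HSC chips in parallel: {chips_needed(i_t, 60)}")
```

The chip count obtained this way assumes ideal current sharing; the current-sharing check performed by the BMC exists precisely because real parallel chips do not share current perfectly.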
The parallel HSC chip fault processing device has at least the following beneficial technical effects. First, the OCP, SCP, OTP and OVP fault states inside the HSC are mapped to voltage values of the alarm signal ALERT, and the specific fault information is located by judging the alarm signal voltage value in the BMC, which avoids the manpower and material cost of on-site analysis by research and development personnel and improves the fault location efficiency. Second, the IMON signals of the multiple HSC chips are introduced into the BMC, and after the IMON signals are digitally processed inside the BMC, a current-sharing alarm signal and a current-sharing control signal are generated according to a judgment rule; the current-sharing control signal controls whether the HSCs are turned off. This effectively avoids board burning caused by uneven current and enhances the power supply safety of the server.
In another embodiment, referring to fig. 5, the present invention further provides a method 100 for processing failures of parallel HSC chips, the method comprising:
101, connecting multiple HSC chips in parallel, wherein each HSC chip is provided with a plurality of preset fault state signals;
102, converting a plurality of preset fault state signals into fault alarm signals by using a conversion circuit corresponding to each HSC chip and outputting the fault alarm signals;
103, analyzing, by a controller, the fault alarm signal output by each conversion circuit to determine the HSC chip that has failed and the fault type of that HSC chip.
According to the method for processing faults of parallel HSC chips, the conversion circuit converts the preset fault state signals inside the HSC chip into a fault alarm signal, and the controller analyzes the fault alarm signal to locate the specific fault position and fault cause. Automatic fault location and fault diagnosis are thereby realized, the manpower and material cost of on-site analysis by research and development personnel is avoided, and the fault location efficiency is improved.
In some embodiments, the conversion circuit comprises: a first switching tube S1, a second switching tube S2, a third switching tube S3, a fourth switching tube S4, a first resistor R1, a second resistor R2, a third resistor R3, a fourth resistor R4 and a pull-up resistor, wherein the first resistor R1, the second resistor R2, the third resistor R3 and the fourth resistor R4 have different resistance values;
the first switch tube S1 is connected in series with the first resistor R1 to form a first branch, the second switch tube S2 is connected in series with the second resistor R2 to form a second branch, the third switch tube S3 is connected in series with the third resistor R3 to form a third branch, the fourth switch tube S4 is connected in series with the fourth resistor R4 to form a fourth branch, one end of the pull-up resistor is connected to a power supply, the first branch, the second branch, the third branch and the fourth branch are connected in parallel, in sequence, between the other end of the pull-up resistor and ground, the first switch tube S1, the second switch tube S2, the third switch tube S3 and the fourth switch tube S4 are all grounded, and the fault alarm signal is the voltage value at the other end of the pull-up resistor;
the preset fault state signals comprise an overcurrent protection signal, an overtemperature protection signal, a short-circuit protection signal and an overvoltage protection signal, the overcurrent protection signal is used for driving the first switching tube S1 to be switched on or switched off, the overtemperature protection signal is used for driving the second switching tube S2 to be switched on or switched off, the short-circuit protection signal is used for driving the third switching tube S3 to be switched on or switched off, and the overvoltage protection signal is used for driving the fourth switching tube S4 to be switched on or switched off, wherein at most one of the first switching tube S1, the second switching tube S2, the third switching tube S3 and the fourth switching tube S4 is switched on at the same time.
In some embodiments, the conversion circuit comprises: a first switching tube S1, a second switching tube S2, a third switching tube S3, a fourth switching tube S4, a first resistor R1, a second resistor R2, a third resistor R3, a fourth resistor R4, a fifth resistor R5, a sixth resistor R6, a seventh resistor R7 and a pull-up resistor, wherein the first resistor R1, the second resistor R2, the third resistor R3, the fourth resistor R4, the fifth resistor R5, the sixth resistor R6, the seventh resistor R7 and the pull-up resistor have the same resistance value;
the first switch tube S1 is connected in series with the first resistor R1 to form a first branch, the second switch tube S2 is connected in series with the second resistor R2 to form a second branch, the third switch tube S3 is connected in series with the third resistor R3 to form a third branch, the fourth switch tube S4 is connected in series with the fourth resistor R4 to form a fourth branch, one end of the pull-up resistor is connected to the power supply, the first branch, the second branch, the third branch and the fourth branch are connected in parallel, in sequence, between the other end of the pull-up resistor and ground, the first switch tube S1, the second switch tube S2, the third switch tube S3 and the fourth switch tube S4 are all grounded, the fifth resistor R5 is connected in series between the first resistor R1 and the second resistor R2, the sixth resistor R6 is connected in series between the second resistor R2 and the third resistor R3, the seventh resistor R7 is connected in series between the third resistor R3 and the fourth resistor R4, and the fault alarm signal is the voltage value at the other end of the pull-up resistor;
the preset fault state signals comprise an overcurrent protection signal, an overtemperature protection signal, a short-circuit protection signal and an overvoltage protection signal, the overcurrent protection signal is used for driving the first switch tube S1 to be switched on or switched off, the overtemperature protection signal is used for driving the second switch tube S2 to be switched on or switched off, the short-circuit protection signal is used for driving the third switch tube S3 to be switched on or switched off, and the overvoltage protection signal is used for driving the fourth switch tube S4 to be switched on or switched off, wherein at most one of the first switch tube S1, the second switch tube S2, the third switch tube S3 and the fourth switch tube S4 is switched on at the same time.
In some embodiments, in step 103, the analyzing, by the controller, the failure warning signal output by each conversion circuit to determine the failed HSC chip and the failure type of the HSC chip includes:
in response to the fault alarm signal output by a conversion circuit being equal to one half of the power supply voltage, determining that the HSC chip corresponding to that conversion circuit has failed, the fault type being HSC chip overcurrent protection;
in response to the fault alarm signal output by a conversion circuit being equal to two thirds of the power supply voltage, determining that the HSC chip corresponding to that conversion circuit has failed, the fault type being HSC chip over-temperature protection;
in response to the fault alarm signal output by a conversion circuit being equal to three quarters of the power supply voltage, determining that the HSC chip corresponding to that conversion circuit has failed, the fault type being HSC chip short-circuit protection;
in response to the fault alarm signal output by a conversion circuit being equal to four fifths of the power supply voltage, determining that the HSC chip corresponding to that conversion circuit has failed, the fault type being HSC chip overvoltage protection;
and in response to the fault alarm signal output by a conversion circuit being equal to zero, determining that the HSC chip corresponding to that conversion circuit has not failed.
In some embodiments, each HSC chip further has a current detection signal, and the controller is further configured to collect the current detection signal of each HSC chip, process the collected current detection signals, and generate a current sharing alarm signal and a current sharing control signal according to a preset determination rule; the method further includes:
and connecting the first input end of the AND gate with a starting signal, connecting the second input end of the AND gate with the current-sharing control signal, and connecting the output end of the AND gate to an enabling signal pin of each HSC chip.
In some embodiments, the processing the collected current detection signals and then generating the current sharing alarm signal and the current sharing control signal according to a preset determination rule includes:
calculating the average value of all the collected current detection signals;
in response to the current detection signal of any HSC chip being less than 0.6 times the average value or more than 1.6 times the average value, outputting a current-sharing alarm signal indicating that an alarm is required and a current-sharing control signal that turns off the HSC chips;
in response to the current detection signals of all HSC chips being greater than or equal to 0.6 times and less than or equal to 1.6 times the average value, and the current detection signal of any HSC chip being greater than or equal to 0.6 times and less than 0.9 times the average value, outputting a current-sharing alarm signal indicating that an alarm is required and a current-sharing control signal that does not turn off the HSC chips;
in response to the current detection signals of all HSC chips being greater than or equal to 0.6 times and less than or equal to 1.6 times the average value, and the current detection signal of any HSC chip being greater than 1.1 times and less than or equal to 1.6 times the average value, outputting a current-sharing alarm signal indicating that an alarm is required and a current-sharing control signal that does not turn off the HSC chips;
and in response to the current detection signals of all HSC chips being greater than or equal to 0.9 times and less than or equal to 1.1 times the average value, outputting a current-sharing alarm signal indicating that no alarm is required and a current-sharing control signal that does not turn off the HSC chips.
In some embodiments, the controller is a baseboard management controller.
According to another aspect of the present invention, a computer device is provided, which may be a server, and an internal structure thereof is shown in fig. 6. The computer device includes a processor, a memory, a network interface, and a database connected by a system bus. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device comprises a nonvolatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, a computer program, and a database. The internal memory provides an environment for the operation of an operating system and computer programs in the non-volatile storage medium. The database of the computer device is used for storing data. The network interface of the computer device is used for communicating with an external terminal through a network connection. When executed by a processor, the computer program implements the method for processing faults of the parallel HSC chips, and specifically, the method comprises the following steps:
connecting multiple HSC chips in parallel, wherein each HSC chip is provided with a plurality of preset fault state signals;
converting the plurality of preset fault state signals into a fault alarm signal by using a conversion circuit corresponding to each HSC chip, and outputting the fault alarm signal;
analyzing, by a controller, the fault alarm signal output by each conversion circuit to determine the HSC chip that has failed and the fault type of that HSC chip.
According to still another aspect of the present invention, there is provided a computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements the method for processing a fault of a parallel HSC chip as described above, specifically comprising performing the steps of:
connecting multiple HSC chips in parallel, wherein each HSC chip is provided with a plurality of preset fault state signals;
converting the plurality of preset fault state signals into a fault alarm signal by using a conversion circuit corresponding to each HSC chip, and outputting the fault alarm signal;
analyzing, by a controller, the fault alarm signal output by each conversion circuit to determine the HSC chip that has failed and the fault type of that HSC chip.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by hardware instructions of a computer program, which can be stored in a non-volatile computer-readable storage medium, and when executed, can include the processes of the embodiments of the methods described above. Any reference to memory, storage, database or other medium used in the embodiments provided herein can include non-volatile and/or volatile memory. Non-volatile memory can include read-only memory (ROM), programmable ROM (PROM), electrically Programmable ROM (EPROM), electrically Erasable Programmable ROM (EEPROM), or flash memory. Volatile memory can include Random Access Memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms such as Static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double Data Rate SDRAM (DDRSDRAM), enhanced SDRAM (ESDRAM), synchronous Link DRAM (SLDRAM), rambus (Rambus) direct RAM (RDRAM), direct memory bus dynamic RAM (DRDRAM), and memory bus dynamic RAM (RDRAM).
The technical features of the above embodiments can be arbitrarily combined. For the sake of brevity, not all possible combinations of these technical features are described; however, as long as a combination of technical features contains no contradiction, it should be considered within the scope of the present specification.
The above embodiments express only several implementations of the present application, and their description is relatively specific and detailed, but they should not be construed as limiting the scope of the invention. It should be noted that a person skilled in the art can make several variations and modifications without departing from the concept of the present application, all of which fall within the scope of protection of the present application. Therefore, the protection scope of the present patent shall be subject to the appended claims.
Claims (16)
1. A parallel HSC chip failure handling device, the device comprising:
multiple HSC chips connected in parallel, wherein each HSC chip is provided with a plurality of preset fault state signals;
a conversion circuit corresponding to each HSC chip, configured to convert the plurality of preset fault state signals into a fault alarm signal and output the fault alarm signal;
and a controller configured to analyze the fault alarm signal output by each conversion circuit to determine the faulty HSC chip and the fault type of that HSC chip.
2. The parallel HSC chip failure handling device of claim 1, wherein the conversion circuit comprises: a first switch tube, a second switch tube, a third switch tube, a fourth switch tube, a first resistor, a second resistor, a third resistor, a fourth resistor and a pull-up resistor, wherein the first resistor, the second resistor, the third resistor and the fourth resistor have different resistances;
the first switch tube is connected with the first resistor in series to form a first branch circuit, the second switch tube is connected with the second resistor in series to form a second branch circuit, the third switch tube is connected with the third resistor in series to form a third branch circuit, the fourth switch tube is connected with the fourth resistor in series to form a fourth branch circuit, one end of the pull-up resistor is connected with a power supply, the first branch circuit, the second branch circuit, the third branch circuit and the fourth branch circuit are connected in parallel between the other end of the pull-up resistor and ground, the first switch tube, the second switch tube, the third switch tube and the fourth switch tube are all grounded, and the fault alarm signal is the voltage value at the other end of the pull-up resistor;
the preset fault state signals comprise an overcurrent protection signal, an overtemperature protection signal, a short-circuit protection signal and an overvoltage protection signal, the overcurrent protection signal is used for driving the first switch tube to be switched on or switched off, the overtemperature protection signal is used for driving the second switch tube to be switched on or switched off, the short-circuit protection signal is used for driving the third switch tube to be switched on or switched off, the overvoltage protection signal is used for driving the fourth switch tube to be switched on or switched off, and at most one of the first switch tube, the second switch tube, the third switch tube and the fourth switch tube is switched on at the same time.
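The divider described in this claim can be illustrated with a minimal sketch. It assumes an ideal switch tube (zero on-resistance) and that exactly one branch conducts, so the fault alarm voltage is the supply voltage times R_branch / (R_pullup + R_branch). The supply voltage, the resistor values, and the function name `alarm_voltage` are hypothetical and are not taken from the patent.

```python
def alarm_voltage(supply_voltage, pull_up_ohms, branch_ohms):
    """Voltage at the lower end of the pull-up resistor when one branch conducts.

    Assumes an ideal switch tube, so the branch is a pure resistance to ground.
    """
    return supply_voltage * branch_ohms / (pull_up_ohms + branch_ohms)


# Hypothetical values chosen only for illustration.
VCC = 3.3          # supply voltage in volts
PULL_UP = 10_000   # pull-up resistor in ohms
BRANCH_OHMS = {
    "overcurrent": 4_700,
    "overtemperature": 10_000,
    "short-circuit": 22_000,
    "overvoltage": 47_000,
}

# Because the four branch resistances differ, each fault gives its own level.
levels = {name: alarm_voltage(VCC, PULL_UP, ohms) for name, ohms in BRANCH_OHMS.items()}

for name, volts in levels.items():
    print(f"{name}: {volts:.3f} V")
```

Because at most one switch tube conducts at a time, each fault state maps to a single, distinct voltage that a controller could read on one analog input.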
3. The parallel HSC chip failure handling device of claim 1, wherein the conversion circuit comprises: a first switch tube, a second switch tube, a third switch tube, a fourth switch tube, a first resistor, a second resistor, a third resistor, a fourth resistor, a fifth resistor, a sixth resistor, a seventh resistor and a pull-up resistor, wherein the first resistor, the second resistor, the third resistor, the fourth resistor, the fifth resistor, the sixth resistor, the seventh resistor and the pull-up resistor all have the same resistance;
the first switch tube is connected with the first resistor in series to form a first branch circuit, the second switch tube is connected with the second resistor in series to form a second branch circuit, the third switch tube is connected with the third resistor in series to form a third branch circuit, the fourth switch tube is connected with the fourth resistor in series to form a fourth branch circuit, one end of the pull-up resistor is connected with a power supply, the first branch circuit, the second branch circuit, the third branch circuit and the fourth branch circuit are sequentially connected between the other end of the pull-up resistor and the ground in parallel, the first switch tube, the second switch tube, the third switch tube and the fourth switch tube are all grounded, the fifth resistor is connected between the first resistor and the second resistor in series, the sixth resistor is connected between the second resistor and the third resistor in series, the seventh resistor is connected between the third resistor and the fourth resistor in series, and the fault alarm signal is a voltage value at the other end of the pull-up resistor;
the preset fault state signals comprise an overcurrent protection signal, an overtemperature protection signal, a short-circuit protection signal and an overvoltage protection signal, the overcurrent protection signal is used for driving the first switch tube to be switched on or switched off, the overtemperature protection signal is used for driving the second switch tube to be switched on or switched off, the short-circuit protection signal is used for driving the third switch tube to be switched on or switched off, the overvoltage protection signal is used for driving the fourth switch tube to be switched on or switched off, and at most one of the first switch tube, the second switch tube, the third switch tube and the fourth switch tube is switched on at the same moment.
4. The parallel HSC chip failure handling device of claim 3, wherein the controller is further configured to:
in response to the fault alarm signal output by a conversion circuit being equal to one half of the power supply voltage, determine that the HSC chip corresponding to that conversion circuit has a fault, the fault type being HSC chip overcurrent protection;
in response to the fault alarm signal output by a conversion circuit being equal to two thirds of the power supply voltage, determine that the HSC chip corresponding to that conversion circuit has a fault, the fault type being HSC chip overtemperature protection;
in response to the fault alarm signal output by a conversion circuit being equal to three quarters of the power supply voltage, determine that the HSC chip corresponding to that conversion circuit has a fault, the fault type being HSC chip short-circuit protection;
in response to the fault alarm signal output by a conversion circuit being equal to four fifths of the power supply voltage, determine that the HSC chip corresponding to that conversion circuit has a fault, the fault type being HSC chip overvoltage protection;
and in response to the fault alarm signal output by a conversion circuit being equal to zero, determine that the HSC chip corresponding to that conversion circuit has no fault.
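The mapping in this claim can be sketched as a small decoder. The fractions 1/2, 2/3, 3/4 and 4/5 have the form k/(k+1), which is what a divider with pull-up resistance R and a path resistance of kR to ground would produce; that reading of the equal-resistance network is an inference from the claimed fractions. The function name `decode_fault`, the tolerance band, and the use of `None` for "no fault" are assumptions made for illustration, since a real measurement will never match a fraction exactly.

```python
# Expected ratio of alarm voltage to supply voltage for each fault type,
# as listed in the claim.
FAULT_BY_RATIO = {
    1 / 2: "overcurrent protection",
    2 / 3: "overtemperature protection",
    3 / 4: "short-circuit protection",
    4 / 5: "overvoltage protection",
}


def decode_fault(alarm_voltage, supply_voltage, tolerance=0.02):
    """Map a measured alarm voltage to a fault type, or None for no fault.

    `tolerance` is a hypothetical band on the voltage ratio; 0.02 keeps the
    bands around 3/4 and 4/5 (which differ by 0.05) from overlapping.
    """
    ratio = alarm_voltage / supply_voltage
    if abs(ratio) <= tolerance:
        return None  # the claim treats a zero reading as "no fault"
    for expected, fault in FAULT_BY_RATIO.items():
        if abs(ratio - expected) <= tolerance:
            return fault
    raise ValueError(f"unrecognized alarm ratio {ratio:.3f}")


# One reading per conversion circuit, keyed by a hypothetical chip name.
readings = {"HSC1": 0.0, "HSC2": 2.2, "HSC3": 1.65}
faults = {chip: decode_fault(volts, 3.3) for chip, volts in readings.items()}
print(faults)
```

Running the decoder once per conversion circuit identifies both which HSC chip reported a fault and which protection was triggered.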
5. The device of claim 1, wherein each HSC chip further has a current detection signal, the controller is further configured to collect the current detection signal of each HSC chip, and generate a current sharing alarm signal and a current sharing control signal according to a preset determination rule after processing the collected current detection signal;
the device further comprises an AND gate, wherein a first input end of the AND gate is connected with a starting signal, a second input end of the AND gate is connected with the current sharing control signal, and an output end of the AND gate is connected to an enabling signal pin of each HSC chip.
6. The parallel HSC chip failure handling device of claim 5, wherein the controller is further configured to:
calculate the mean value of all the collected current detection signals;
in response to the current detection signal of any HSC chip being less than 0.6 times the mean value or more than 1.6 times the mean value, output a current-sharing alarm signal indicating that an alarm is needed and output a current-sharing control signal that turns off the HSC chip;
in response to the current detection signals of all HSC chips being greater than or equal to 0.6 times the mean value and less than or equal to 1.6 times the mean value, and the current detection signal of any HSC chip being greater than or equal to 0.6 times the mean value and less than 0.9 times the mean value, output a current-sharing alarm signal indicating that an alarm is needed and output a current-sharing control signal that does not turn off the HSC chips;
in response to the current detection signals of all HSC chips being greater than or equal to 0.6 times the mean value and less than or equal to 1.6 times the mean value, and the current detection signal of any HSC chip being greater than 1.1 times the mean value and less than or equal to 1.6 times the mean value, output a current-sharing alarm signal indicating that an alarm is needed and output a current-sharing control signal that does not turn off the HSC chips;
and in response to the current detection signals of all HSC chips being greater than or equal to 0.9 times the mean value and less than or equal to 1.1 times the mean value, output a current-sharing alarm signal indicating that no alarm is needed and output a current-sharing control signal that does not turn off the HSC chips.
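The determination rule above, together with the AND gate of claim 5, can be sketched as follows. The function names and the signal polarity are assumptions: the control signal is modeled as `True` when the HSC chips should stay on, so that the AND gate output (start signal AND control signal) drives the enable pins high only when both are high. The patent text does not state the polarity in this section.

```python
from statistics import mean


def current_sharing_decision(currents):
    """Return (alarm, keep_enabled) for a list of current detection values.

    Thresholds (0.6, 0.9, 1.1, 1.6 times the mean) follow the claim.
    """
    avg = mean(currents)
    if avg == 0:
        raise ValueError("mean current is zero; ratios are undefined")
    ratios = [c / avg for c in currents]
    if any(r < 0.6 or r > 1.6 for r in ratios):
        return True, False   # alarm and turn off
    if any(r < 0.9 or r > 1.1 for r in ratios):
        return True, True    # alarm but keep running
    return False, True       # balanced: no alarm, keep running


def hsc_enable(start_signal, keep_enabled):
    """Model of the AND gate feeding every HSC chip's enable pin."""
    return start_signal and keep_enabled


alarm, keep = current_sharing_decision([8.0, 10.0, 12.0])
print(alarm, keep, hsc_enable(True, keep))
```

With three chips drawing 8 A, 10 A and 12 A, the ratios to the mean are 0.8, 1.0 and 1.2, so the sketch raises the alarm but leaves the chips enabled.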
7. The device for processing the fault of the parallel HSC chips as claimed in any one of claims 1-6, wherein the controller is a baseboard management controller.
8. A parallel HSC chip fault handling method is characterized by comprising the following steps:
providing multiple HSC chips connected in parallel, wherein each HSC chip is provided with a plurality of preset fault state signals;
converting the plurality of preset fault state signals into a fault alarm signal by using a conversion circuit corresponding to each HSC chip, and outputting the fault alarm signal;
analyzing, by a controller, the fault alarm signal output by each conversion circuit to determine the faulty HSC chip and the fault type of that HSC chip.
9. The parallel HSC chip failure handling method of claim 8, wherein the conversion circuit comprises: a first switch tube, a second switch tube, a third switch tube, a fourth switch tube, a first resistor, a second resistor, a third resistor, a fourth resistor and a pull-up resistor, wherein the first resistor, the second resistor, the third resistor and the fourth resistor have different resistances;
the first switch tube is connected with the first resistor in series to form a first branch circuit, the second switch tube is connected with the second resistor in series to form a second branch circuit, the third switch tube is connected with the third resistor in series to form a third branch circuit, the fourth switch tube is connected with the fourth resistor in series to form a fourth branch circuit, one end of the pull-up resistor is connected with a power supply, the first branch circuit, the second branch circuit, the third branch circuit and the fourth branch circuit are connected in parallel between the other end of the pull-up resistor and ground, the first switch tube, the second switch tube, the third switch tube and the fourth switch tube are all grounded, and the fault alarm signal is the voltage value at the other end of the pull-up resistor;
the preset fault state signals comprise an overcurrent protection signal, an overtemperature protection signal, a short-circuit protection signal and an overvoltage protection signal, the overcurrent protection signal is used for driving the first switch tube to be switched on or switched off, the overtemperature protection signal is used for driving the second switch tube to be switched on or switched off, the short-circuit protection signal is used for driving the third switch tube to be switched on or switched off, the overvoltage protection signal is used for driving the fourth switch tube to be switched on or switched off, and at most one of the first switch tube, the second switch tube, the third switch tube and the fourth switch tube is switched on at the same time.
10. The parallel HSC chip failure handling method of claim 9, wherein the conversion circuit comprises: a first switch tube, a second switch tube, a third switch tube, a fourth switch tube, a first resistor, a second resistor, a third resistor, a fourth resistor, a fifth resistor, a sixth resistor, a seventh resistor and a pull-up resistor, wherein the first resistor, the second resistor, the third resistor, the fourth resistor, the fifth resistor, the sixth resistor, the seventh resistor and the pull-up resistor all have the same resistance;
the first switch tube is connected with the first resistor in series to form a first branch circuit, the second switch tube is connected with the second resistor in series to form a second branch circuit, the third switch tube is connected with the third resistor in series to form a third branch circuit, the fourth switch tube is connected with the fourth resistor in series to form a fourth branch circuit, one end of the pull-up resistor is connected with a power supply, the first branch circuit, the second branch circuit, the third branch circuit and the fourth branch circuit are sequentially connected between the other end of the pull-up resistor and the ground in parallel, the first switch tube, the second switch tube, the third switch tube and the fourth switch tube are all grounded, the fifth resistor is connected between the first resistor and the second resistor in series, the sixth resistor is connected between the second resistor and the third resistor in series, the seventh resistor is connected between the third resistor and the fourth resistor in series, and the fault alarm signal is a voltage value at the other end of the pull-up resistor;
the preset fault state signals comprise an overcurrent protection signal, an overtemperature protection signal, a short-circuit protection signal and an overvoltage protection signal, the overcurrent protection signal is used for driving the first switch tube to be switched on or switched off, the overtemperature protection signal is used for driving the second switch tube to be switched on or switched off, the short-circuit protection signal is used for driving the third switch tube to be switched on or switched off, the overvoltage protection signal is used for driving the fourth switch tube to be switched on or switched off, and at most one of the first switch tube, the second switch tube, the third switch tube and the fourth switch tube is switched on at the same moment.
11. The method of claim 10, wherein the analyzing, by the controller, of the fault alarm signal output by each conversion circuit to determine the faulty HSC chip and the fault type of that HSC chip comprises:
in response to the fault alarm signal output by a conversion circuit being equal to one half of the power supply voltage, determining that the HSC chip corresponding to that conversion circuit has a fault, the fault type being HSC chip overcurrent protection;
in response to the fault alarm signal output by a conversion circuit being equal to two thirds of the power supply voltage, determining that the HSC chip corresponding to that conversion circuit has a fault, the fault type being HSC chip overtemperature protection;
in response to the fault alarm signal output by a conversion circuit being equal to three quarters of the power supply voltage, determining that the HSC chip corresponding to that conversion circuit has a fault, the fault type being HSC chip short-circuit protection;
in response to the fault alarm signal output by a conversion circuit being equal to four fifths of the power supply voltage, determining that the HSC chip corresponding to that conversion circuit has a fault, the fault type being HSC chip overvoltage protection;
and in response to the fault alarm signal output by a conversion circuit being equal to zero, determining that the HSC chip corresponding to that conversion circuit has no fault.
12. The method of claim 8, wherein each HSC chip further has a current detection signal, the controller is further configured to collect the current detection signal of each HSC chip and, after processing the collected current detection signals, generate a current-sharing alarm signal and a current-sharing control signal according to a preset determination rule, the method further comprising:
connecting a first input end of an AND gate to a start signal, connecting a second input end of the AND gate to the current-sharing control signal, and connecting an output end of the AND gate to an enable signal pin of each HSC chip.
13. The method of claim 12, wherein the processing the collected current detection signals and then generating the current sharing alarm signal and the current sharing control signal according to the preset determination rule comprises:
calculating the mean value of all the collected current detection signals;
in response to the current detection signal of any HSC chip being less than 0.6 times the mean value or more than 1.6 times the mean value, outputting a current-sharing alarm signal indicating that an alarm is needed and outputting a current-sharing control signal that turns off the HSC chip;
in response to the current detection signals of all HSC chips being greater than or equal to 0.6 times the mean value and less than or equal to 1.6 times the mean value, and the current detection signal of any HSC chip being greater than or equal to 0.6 times the mean value and less than 0.9 times the mean value, outputting a current-sharing alarm signal indicating that an alarm is needed and outputting a current-sharing control signal that does not turn off the HSC chips;
in response to the current detection signals of all HSC chips being greater than or equal to 0.6 times the mean value and less than or equal to 1.6 times the mean value, and the current detection signal of any HSC chip being greater than 1.1 times the mean value and less than or equal to 1.6 times the mean value, outputting a current-sharing alarm signal indicating that an alarm is needed and outputting a current-sharing control signal that does not turn off the HSC chips;
and in response to the current detection signals of all HSC chips being greater than or equal to 0.9 times the mean value and less than or equal to 1.1 times the mean value, outputting a current-sharing alarm signal indicating that no alarm is needed and outputting a current-sharing control signal that does not turn off the HSC chips.
14. The method for processing the fault of the parallel HSC chips as claimed in any one of the claims 8-13, wherein the controller is a baseboard management controller.
15. A computer device, comprising:
at least one processor; and
a memory storing a computer program operable in the processor, the processor when executing the program performing the method of any of claims 8-13.
16. A computer-readable storage medium, in which a computer program is stored which, when being executed by a processor, is adapted to carry out the method of any one of claims 8-13.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211404113.9A CN115459204B (en) | 2022-11-10 | 2022-11-10 | Parallel HSC chip fault processing device, method, equipment and medium |
PCT/CN2023/100842 WO2024098750A1 (en) | 2022-11-10 | 2023-06-16 | Parallel hot swap controller chip fault processing apparatus and method, and device and medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211404113.9A CN115459204B (en) | 2022-11-10 | 2022-11-10 | Parallel HSC chip fault processing device, method, equipment and medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN115459204A (en) | 2022-12-09 |
CN115459204B (en) | 2023-02-28 |
Family
ID=84295572
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202211404113.9A Active CN115459204B (en) | 2022-11-10 | 2022-11-10 | Parallel HSC chip fault processing device, method, equipment and medium |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN115459204B (en) |
WO (1) | WO2024098750A1 (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115808640A (en) * | 2023-02-09 | 2023-03-17 | 苏州浪潮智能科技有限公司 | Power failure detection circuit, method, system, electronic device, and storage medium |
CN116298635A (en) * | 2023-03-30 | 2023-06-23 | 海信家电集团股份有限公司 | IPM fault detection system, IPM fault detection method, IPM fault detection device and storage medium |
WO2024098750A1 (en) * | 2022-11-10 | 2024-05-16 | 苏州元脑智能科技有限公司 | Parallel hot swap controller chip fault processing apparatus and method, and device and medium |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103227583A (en) * | 2013-04-26 | 2013-07-31 | 江苏省电力设计院 | Hot plug type conversion system for new energy and energy storage system |
CN105656183A (en) * | 2014-11-15 | 2016-06-08 | 北京航天万源科技公司 | Modularized intelligent power supply and distribution device |
CN108880230A (en) * | 2018-06-28 | 2018-11-23 | 烽火通信科技股份有限公司 | Power sources in parallel control module and parallel system based on Switching Power Supply chopping voltage |
CN112069104A (en) * | 2020-05-28 | 2020-12-11 | 苏州浪潮智能科技有限公司 | Chip hot-plug protection circuit |
Family Cites Families (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2872089B2 (en) * | 1995-10-27 | 1999-03-17 | 群馬日本電気株式会社 | Hot-swap device |
US8847438B2 (en) * | 2008-07-14 | 2014-09-30 | Texas Instruments Incorporated | Minimum loss and wiring circuit and method for paralleling hot swap controllers |
US9917437B2 (en) * | 2015-05-06 | 2018-03-13 | Cisco Technology, Inc. | Hot swap controller with individually controlled parallel current paths |
CN110377138A (en) * | 2019-06-29 | 2019-10-25 | 苏州浪潮智能科技有限公司 | A kind of multipath server power supply circuit and method for controlling power supply |
CN213072104U (en) * | 2020-08-28 | 2021-04-27 | 苏州浪潮智能科技有限公司 | Circuit for current sharing, chip for current sharing and circuit for hot plug current sharing control |
CN112181123B (en) * | 2020-09-28 | 2022-07-12 | 苏州浪潮智能科技有限公司 | Device and method for improving output current sharing of hot-plug chip |
CN115459204B (en) * | 2022-11-10 | 2023-02-28 | 苏州浪潮智能科技有限公司 | Parallel HSC chip fault processing device, method, equipment and medium |
- 2022-11-10: CN application CN202211404113.9A, patent CN115459204B (en), status: Active
- 2023-06-16: WO application PCT/CN2023/100842, patent WO2024098750A1 (en), status: unknown
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2024098750A1 (en) * | 2022-11-10 | 2024-05-16 | 苏州元脑智能科技有限公司 | Parallel hot swap controller chip fault processing apparatus and method, and device and medium |
CN115808640A (en) * | 2023-02-09 | 2023-03-17 | 苏州浪潮智能科技有限公司 | Power failure detection circuit, method, system, electronic device, and storage medium |
CN115808640B (en) * | 2023-02-09 | 2023-05-16 | 苏州浪潮智能科技有限公司 | Power failure detection circuit, method, system, electronic device and storage medium |
CN116298635A (en) * | 2023-03-30 | 2023-06-23 | 海信家电集团股份有限公司 | IPM fault detection system, IPM fault detection method, IPM fault detection device and storage medium |
Also Published As
Publication number | Publication date |
---|---|
WO2024098750A1 (en) | 2024-05-16 |
CN115459204B (en) | 2023-02-28 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN115459204B (en) | Parallel HSC chip fault processing device, method, equipment and medium | |
CN110045182B (en) | Low-voltage transformer area power supply loop abnormity analysis method based on impedance calculation | |
EP2981902B1 (en) | Automatic configuration of alarm aggregations | |
US8054599B2 (en) | Apparatus, system, and method for detecting a power system component failure | |
JP2019537701A (en) | Method and apparatus for detecting failure of distribution network with high reliability, and storage medium | |
CN116500487B (en) | Fault detection system and method for switching power supply, terminal equipment and medium | |
JP2015018838A (en) | Fault detector of backflow prevention diode for solar cell, fault detection system of backflow prevention diode for solar cell, and fault detection method of backflow prevention diode for solar cell | |
CN113595246B (en) | Micro-grid state online monitoring method and device, computer equipment and storage medium | |
CN112054484B (en) | High-reliability multi-phase power supply system and method | |
CN115808640B (en) | Power failure detection circuit, method, system, electronic device and storage medium | |
CN112838668B (en) | Circuit breaker identification method, device and equipment | |
CN106647958A (en) | Server rack | |
CN113238172B (en) | Current transformer neutral wire abnormity judgment method based on neutral wire resistance | |
CN210350849U (en) | Security protection power supply is equipped with electric circuit and power panel | |
CN210604876U (en) | Fault detection circuit and equipment | |
CN112001588A (en) | Accident event online pre-judging method and device based on N-1 state | |
CN112398226A (en) | Power supply system electricity stealing prevention method, system, terminal and storage medium | |
CN111865700A (en) | Information node screening method and related device for electric power information physical system | |
CN220913297U (en) | Blade board installation detection circuit, blade server and detection equipment | |
CN116626540B (en) | Method, system, terminal and storage medium for judging broken line fault interval | |
CN218445813U (en) | Mainboard detection device and microelectronic equipment | |
TWI754941B (en) | Electronic equipment and operation method thereof | |
CN216599106U (en) | Power supply device and power supply system | |
CN114356617B (en) | Error injection testing method, device, system and computing equipment | |
CN113253153B (en) | Neutral line abnormity judgment method based on non-fault phase second harmonic component ratio |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||