CN117896191A - DPU card powering-on and powering-off method, system, device, equipment and storage medium

Publication number
CN117896191A
Authority
CN
China
Prior art keywords
server
server node
dpu
power
card
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311278085.5A
Other languages
Chinese (zh)
Inventor
郑冠儒
韩威
薛广营
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Suzhou Metabrain Intelligent Technology Co Ltd
Original Assignee
Suzhou Metabrain Intelligent Technology Co Ltd
Application filed by Suzhou Metabrain Intelligent Technology Co Ltd
Priority to CN202311278085.5A
Publication of CN117896191A

Abstract

The invention provides a DPU card power-on and power-off method, system, device, equipment and storage medium. The method is applied to any server node in a DPU card power-on and power-off system that comprises at least two server nodes, and includes the following steps: receiving state change information sent by an adjacent server node in the same network, wherein the state change information includes local operation information of a server, and the local operation information indicates whether the server is powered on or powered off; if the current server node and the adjacent server node are in a correlated state, determining, according to the state change information, whether to send a state change command to the DPU card; and if so, controlling the DPU card and the server in the current server node to power on or power off cooperatively. Through this adaptive, cooperative power-on and power-off flow, devices respond to load changes more promptly, power consumption is substantially reduced, and operators are relieved of a large amount of per-device maintenance.

Description

DPU card powering-on and powering-off method, system, device, equipment and storage medium
Technical Field
The present invention relates to the field of server technologies, and in particular, to a DPU card powering-on and powering-off method, a system, a device, equipment, and a storage medium.
Background
DPU (Data Processing Unit) technology is currently one of the most active directions in the cloud computing field. In a data center, the DPU mainly handles data processing at the access side of a node: simple, repetitive operations within a data center node are handed to the inexpensive DPU, so that the expensive CPU resources are freed for more valuable work. The main application scenarios of the DPU include bare-metal, virtualization and storage scenarios. Typically, one DPU card is deployed in each server node, and the DPU cards of all server nodes in the data center are networked to form a large resource pool that the service side can schedule and use.
Current DPU development is constrained by board area and device power consumption. Because resources inside a server are limited, the DPU card cannot be given unlimited space, power supply or heat dissipation capacity. In the related art, an administrator must operate each device point to point, so responses are not timely, the DPU card remains in a high-power-consumption state for long periods, and the power consumption of the overall system becomes excessive.
Disclosure of Invention
In view of this, the present invention aims to provide a DPU card power-on and power-off method, system, device, equipment and storage medium, which address two problems: the DPU card does not respond to power consumption changes in a timely manner, and point-to-point management of the devices in the network imposes a heavy workload.
According to a first aspect of the present invention, there is provided a DPU card power-on and power-off method applied to any server node in a DPU card power-on and power-off system, where the system includes at least two server nodes and each server node is interconnected with its adjacent server nodes by a network cable, the method including:
receiving state change information sent by an adjacent server node in the same network, wherein the state change information includes local operation information of the server, and the local operation information indicates whether the server is powered on or powered off;
if the current server node and the adjacent server node are in a correlated state, determining, according to the state change information, whether to send a state change command to the DPU card;
if so, controlling the DPU card and the server in the current server node to power on or power off cooperatively.
Optionally, before the step of receiving state change information sent by an adjacent server node in the same network, the method includes:
collecting state data of the servers in adjacent server nodes in the same network, wherein the state data includes at least one of the running state of the server, the power consumption of the server, and the error reporting information of the server;
processing the state data through a preset algorithm to determine whether the current server node and the adjacent server node are in a correlated state.
Optionally, processing the state data through a preset algorithm to determine whether the current server node and the adjacent server node are in a correlated state includes:
analyzing the state data through a preset AI algorithm to obtain either a correlation degree between the current server node and the adjacent server node, or an output value for the current server node and the adjacent server node;
determining whether the current server node and the adjacent server node are in a correlated state according to the correlation degree, or according to the output value;
determining whether the current server node and the adjacent server node are in a correlated state according to the correlation degree includes:
when the correlation degree is detected to be within a target preset range, or greater than or equal to a target threshold, the current server node and the adjacent server node are in a correlated state;
when the correlation degree is detected to be outside the target preset range, or smaller than the target threshold, the current server node and the adjacent server node are in an uncorrelated state;
determining whether the current server node and the adjacent server node are in a correlated state according to the output value includes:
when the output value is detected to be 1, the current server node and the adjacent server node are in a correlated state;
when the output value is detected to be 0, the current server node and the adjacent server node are in an uncorrelated state.
According to a second aspect of the present invention, there is provided a DPU card power-on and power-off system that performs the DPU card power-on and power-off method as claimed in any one of claims 1 to 3.
Optionally, the DPU card power-on and power-off system includes at least two server nodes interconnected by network cables, each server node including a feature extraction unit, a decision unit and a feedback unit;
the feature extraction unit is configured to collect state data sent by a target server, the state data including at least one of the running state of the server, its power consumption state, and the display state of its fault indicator lamp;
the decision unit is configured to determine the correlation degree between server nodes according to the state data;
the feedback unit is configured to collect the local operation information of the current server.
Optionally, each server node further includes a DPU card and a server;
the DPU card and the server are connected to each other.
Optionally, the DPU card power-on and power-off system further includes a management end communicatively connected to the server nodes;
the management end is configured to monitor the server nodes and to send a power-on or power-off command to any one of the server nodes, so that this server node sends state change information to its adjacent server nodes.
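As an illustration of how a single command from the management end could spread through the network, the following minimal sketch simulates the propagation. It assumes, beyond what the text states, that each node that changes state forwards the change to its own neighbours in turn, and that correlation is known per pair of nodes. The function name `propagate` and the node identifiers are hypothetical.

```python
from collections import deque


def propagate(start, powered_on, neighbours, related):
    """Simulate spreading a power state from `start` to correlated neighbours.

    neighbours: dict mapping a node id to the ids of its adjacent nodes.
    related:    set of frozenset({a, b}) pairs considered correlated.
    Returns a dict of node id -> power state for every node that changed.
    """
    state = {start: powered_on}          # the node commanded by the management end
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nb in neighbours.get(node, []):
            if nb in state:
                continue                 # already handled
            if frozenset((node, nb)) in related:
                state[nb] = powered_on   # correlated neighbour follows
                queue.append(nb)         # assumption: it forwards the change too
    return state


neighbours = {"n1": ["n2"], "n2": ["n1", "n3"], "n3": ["n2", "n4"], "n4": ["n3"]}
related = {frozenset(("n1", "n2")), frozenset(("n2", "n3"))}
result = propagate("n1", False, neighbours, related)
```

In this example `n4` is left untouched because the pair (`n3`, `n4`) is not marked as correlated, so the administrator issues one command instead of three.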
According to a third aspect of the present invention, there is provided a DPU card power-on and power-off apparatus, the apparatus including:
a receiving module, configured to receive state change information sent by an adjacent server node in the same network, wherein the state change information includes local operation information of the server, and the local operation information indicates whether the server is powered on or powered off;
a determining module, configured to determine, according to the state change information, whether to send a state change command to the DPU card if the current server node and the adjacent server node are in a correlated state;
a control module, configured to, if so, control the DPU card and the server in the current server node to power on or power off cooperatively.
Optionally, the apparatus further includes:
an acquisition module, configured to collect state data of the servers in adjacent server nodes in the same network, wherein the state data includes at least one of the running state of the server, the power consumption of the server, and the error reporting information of the server;
an algorithm processing module, configured to process the state data through a preset algorithm and determine whether the current server node and the adjacent server node are in a correlated state.
Optionally, the algorithm processing module includes:
a first algorithm processing sub-module, configured to analyze the state data through a preset AI algorithm to obtain either a correlation degree between the current server node and the adjacent server node, or an output value for the current server node and the adjacent server node;
a second algorithm processing sub-module, configured to determine whether the current server node and the adjacent server node are in a correlated state according to the correlation degree, or according to the output value;
the second algorithm processing sub-module includes:
a first determining unit, configured to determine that the current server node and the adjacent server node are in a correlated state when the correlation degree is detected to be within a target preset range, or greater than or equal to a target threshold;
a second determining unit, configured to determine that the current server node and the adjacent server node are in an uncorrelated state when the correlation degree is detected to be outside the target preset range, or smaller than the target threshold;
or the second algorithm processing sub-module includes:
a third determining unit, configured to determine that the current server node and the adjacent server node are in a correlated state when the output value is detected to be 1;
a fourth determining unit, configured to determine that the current server node and the adjacent server node are in an uncorrelated state when the output value is detected to be 0.
According to still another aspect of the present invention, there is also provided an electronic apparatus including:
a processor;
a memory for storing the processor-executable instructions;
wherein the processor is configured to execute the instructions to implement the DPU card power-up and power-down method as described above.
According to yet another aspect of the present invention, there is also provided a readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of the DPU card power-up and power-down method as described above.
The DPU card power-on and power-off method provided by the embodiments of the present invention is applied to any server node in a DPU card power-on and power-off system, where the system includes at least two server nodes and each server node is interconnected with its adjacent server nodes by a network cable. The method includes: receiving state change information sent by an adjacent server node in the same network, where the state change information includes local operation information of the server, and the local operation information indicates whether the server is powered on or powered off; if the current server node and the adjacent server node are in a correlated state, determining, according to the state change information, whether to send a state change command to the DPU card; and if so, controlling the DPU card and the server in the current server node to power on or power off cooperatively. The embodiments of the present application rest on the premise that devices on the same network behave consistently and, by combining an AI algorithm with feature extraction, remove the need in the traditional architecture for administrators to respond to load changes and issue actions point to point. The response time of the system is greatly shortened, the power wasted by long periods without a response is avoided, and the workload of managing the system is reduced. This benefits the operation of the data center: through the adaptive, cooperative power-on and power-off flow, devices respond to load changes more promptly, power consumption is substantially reduced, and operators are relieved of a large amount of per-device maintenance.
Drawings
Various other advantages and benefits will become apparent to those of ordinary skill in the art upon reading the following detailed description of the preferred embodiments. The drawings are only for purposes of illustrating the preferred embodiments and are not to be construed as limiting the invention. Also, like reference numerals are used to designate like parts throughout the figures. In the drawings:
FIG. 1 is a flowchart of the steps of a DPU card power-on and power-off method according to an embodiment of the present invention;
FIG. 2 is a flowchart of the steps of another DPU card power-on and power-off method according to an embodiment of the present invention;
FIG. 3 is a flowchart of step 202 of the DPU card power-on and power-off method shown in FIG. 2;
FIG. 4 is a schematic diagram of a DPU card power-on and power-off system according to an embodiment of the present invention;
FIG. 5 is a schematic structural diagram of a DPU card power-on and power-off apparatus according to an embodiment of the present invention;
FIG. 6 is a schematic structural diagram of an electronic device according to an embodiment of the present invention;
FIG. 7 is a schematic diagram of a DPU card architecture in the related art.
Detailed Description
To make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the embodiments are described in detail below with reference to the accompanying drawings. Those of ordinary skill in the art will understand that numerous technical details are set forth in the various embodiments to give the reader a better understanding of the present application; the technical solutions claimed in the present application can nevertheless be implemented without these details, and with various changes and modifications based on the following embodiments. The embodiments are divided for convenience of description only; this division should not be construed as limiting the specific implementation of the present invention, and the embodiments may be combined with and refer to one another where they do not contradict.
Referring to FIG. 1, a flowchart of the steps of a DPU card power-on and power-off method provided by an embodiment of the present invention is shown.
Step 101, receiving state change information sent by an adjacent server node in the same network, wherein the state change information includes local operation information of the server, and the local operation information indicates whether the server is powered on or powered off;
It should be noted that the method of this embodiment is applied to any server node in a DPU card power-on and power-off system, where the system includes at least two server nodes and each server node is interconnected with its adjacent server nodes by a network cable.
The DPU card power-on and power-off system may include a plurality of server nodes. Each server node includes a feature extraction unit, a decision unit, a feedback unit, a DPU card and a server, and adjacent server nodes among the plurality of server nodes are connected in pairs.
It should be noted that an adjacent server node is defined relative to a given server node in the system. For example, if the server node where server 1 is located and the server node where server 2 is located are adjacent, then the node of server 2 is an adjacent server node of the node of server 1; similarly, the server node where server 2 is located and the server node where server 3 is located are adjacent to each other.
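The adjacency relation described above can be sketched in a few lines. This is only an illustration under an assumed topology: the nodes are taken to be cabled in a simple chain (server 1, server 2, server 3, ...), which the text suggests but does not require. The function name `adjacent_nodes` is hypothetical.

```python
def adjacent_nodes(node_ids, current):
    """Return the neighbours of `current`, assuming a linear chain topology.

    node_ids: node identifiers in cabling order (an assumption of this sketch).
    """
    i = node_ids.index(current)
    neighbours = []
    if i > 0:
        neighbours.append(node_ids[i - 1])      # node cabled before `current`
    if i < len(node_ids) - 1:
        neighbours.append(node_ids[i + 1])      # node cabled after `current`
    return neighbours


chain = ["server1", "server2", "server3"]
middle = adjacent_nodes(chain, "server2")
first = adjacent_nodes(chain, "server1")
```

With this ordering, the node of server 2 has the nodes of server 1 and server 3 as neighbours, while the node of server 1 has only the node of server 2.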
The same network refers to a network environment in which the nodes are connected by network cables and can communicate with one another.
Therefore, once the large cluster system in which the servers are located has been started and system initialization and other configuration have been completed, a server node can receive state change information sent by adjacent server nodes in the same network. Each server node includes a feedback unit, which collects the local operation information of the current device and sends it to the other adjacent nodes on the network; the current device may include the server and/or the DPU card in the server node.
Thus, the state change information includes local operation information of the server, which indicates whether the server in the sending (adjacent) server node is powered on or powered off.
It should be noted that the local operation information may indicate whether the server in the sending (adjacent) server node is powered on or powered off, and may also indicate the power consumption state of that server node.
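One possible shape for the state change information exchanged between feedback units is sketched below. The patent does not define a message format, so the field names, the JSON encoding and the helper names `encode` and `decode` are all assumptions made for illustration.

```python
import json
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass(frozen=True)
class StateChangeInfo:
    """Hypothetical message sent by a node's feedback unit to its neighbours."""
    node_id: str                       # sender of the message
    powered_on: bool                   # True = powered on, False = powered off
    power_state: Optional[str] = None  # optional: "high" or "low" power consumption


def encode(info):
    """Serialize the message to bytes for sending over the network."""
    return json.dumps(asdict(info)).encode("utf-8")


def decode(payload):
    """Rebuild the message from bytes received from a neighbour."""
    return StateChangeInfo(**json.loads(payload.decode("utf-8")))


sent = StateChangeInfo(node_id="node2", powered_on=False, power_state="low")
received = decode(encode(sent))
```

The optional `power_state` field reflects the remark that the local operation information may also carry the power consumption state.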
Step 102, if the current server node and the adjacent server node are in a correlated state, determining, according to the state change information, whether to send a state change command to the DPU card;
Being in a correlated state means that the current server node and the adjacent server node share the same target power consumption state: when one of them enters the target power consumption state, the other enters it as well.
Specifically, in this embodiment, if two devices are correlated, then when one device enters a low power consumption state, the other device automatically enters the low power consumption state with it, so that the power-on and power-off of the networked devices are controlled cooperatively and intelligently.
Therefore, cooperative control between the devices in the two server nodes can be achieved by judging whether the current server node and the adjacent server node are in a correlated state.
If they are in a correlated state, whether to send a state change command to the DPU card is determined according to the state change information.
The state change information may include server power-on or server power-off information. If the state change information indicates a change in the state of the server or of another device, it is determined that a state change command is to be sent to the DPU card.
Step 103, if so, controlling the DPU card and the server in the current server node to power on or power off cooperatively.
After it is determined that the state change command is to be sent, the server and the DPU card can be controlled to perform the cooperative power-on or power-off action.
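Steps 101 to 103 can be sketched as a single handler. This is a minimal sketch, not the patented implementation: the classes `Device` and `Node`, the function `on_state_change`, and the rule that a command is issued only when the reported state differs from the local state are assumptions introduced here.

```python
from dataclasses import dataclass


@dataclass
class Device:
    name: str
    powered_on: bool = True


@dataclass
class Node:
    server: Device
    dpu: Device
    related: dict  # neighbour id -> True if that neighbour is correlated


def on_state_change(node, neighbour_id, neighbour_powered_on):
    """Handle state change information received from a neighbour (step 101).

    Returns True if a state change command was issued to the DPU card.
    """
    # Step 102: ignore neighbours that are not in a correlated state.
    if not node.related.get(neighbour_id, False):
        return False
    # Assumption: a command is needed only if the local state actually differs.
    if (node.server.powered_on == neighbour_powered_on
            and node.dpu.powered_on == neighbour_powered_on):
        return False
    # Step 103: power the server and the DPU card on or off together.
    node.server.powered_on = neighbour_powered_on
    node.dpu.powered_on = neighbour_powered_on
    return True


node = Node(Device("server1"), Device("dpu1"), {"node2": True, "node3": False})
followed = on_state_change(node, "node2", False)   # correlated neighbour powers off
ignored = on_state_change(node, "node3", True)     # uncorrelated neighbour powers on
```

Here the node follows `node2` into the powered-off state and ignores the later message from the uncorrelated `node3`.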
Referring to FIG. 2, a flowchart of the steps of another DPU card power-on and power-off method provided by an embodiment of the present invention is shown.
Step 201, collecting state data of the servers in adjacent server nodes in the same network, wherein the state data includes at least one of the running state of the server, the power consumption of the server, and the error reporting information of the server.
It should be noted that, in this embodiment, the feature extraction unit of each server node on the same network can collect the state data of adjacent nodes in the network. The state data contains behavior features: the running state of the server, the power consumption of the server, and the error reporting information of the server. The running state of the server may be powered on or powered off; the power consumption of the server may be a high power consumption state or a low power consumption state; and the error reporting information of the server may be whether the server currently has a fault or error, together with the corresponding display state of the fault indicator lamp.
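The behavior features listed above could be represented as a small record that is flattened into a numeric vector before analysis. The type names `PowerLevel` and `StateData`, the field names and the 0/1 encoding are assumptions of this sketch; the text does not prescribe a representation.

```python
from dataclasses import dataclass
from enum import Enum


class PowerLevel(Enum):
    LOW = 0
    HIGH = 1


@dataclass(frozen=True)
class StateData:
    """Hypothetical record collected by the feature extraction unit."""
    powered_on: bool         # running state: powered on or powered off
    power_level: PowerLevel  # high or low power consumption state
    has_fault: bool          # whether the server reports a fault or error
    fault_lamp_on: bool      # display state of the fault indicator lamp


def to_features(sample):
    """Flatten one sample into a list of 0/1 features for later analysis."""
    return [
        int(sample.powered_on),
        sample.power_level.value,
        int(sample.has_fault),
        int(sample.fault_lamp_on),
    ]


features = to_features(StateData(True, PowerLevel.LOW, False, False))
```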
Step 202, processing the state data through a preset algorithm to determine whether the current server node and the adjacent server node are in a correlated state.
It should be noted that, after the required state data has been collected, it can be processed by a preset algorithm to determine whether the current server node and the adjacent server node are in a correlated state.
Further, as shown in FIG. 3, which is a flowchart of step 202 of the method shown in FIG. 2, step 202 (processing the state data through a preset algorithm to determine whether the current server node and the adjacent server node are in a correlated state) includes:
Step 2021, analyzing the state data through a preset AI algorithm to obtain either a correlation degree between the current server node and the adjacent server node, or an output value for the current server node and the adjacent server node;
Step 2022, determining whether the current server node and the adjacent server node are in a correlated state according to the correlation degree, or according to the output value.
It should be noted that the correlation between the current device and its adjacent devices on the network is analyzed by an AI algorithm on the basis of the behavior features of the devices in the network. If two devices are correlated, then when one device enters a low power consumption state, the other device automatically enters the low power consumption state with it, thereby realizing intelligent cooperative control of the power-on and power-off of the networked devices.
The AI algorithm is not specifically limited here; any model or algorithm suitable for analyzing the correlation between two devices may be used. For the meaning of correlation, refer to the foregoing discussion.
Therefore, the state data is analyzed through the preset AI algorithm to obtain either the correlation degree between the current server node and the adjacent server node, or the output value for the two nodes.
That is, the analysis may yield either a correlation degree or an output value, and the judgment is then made under a different rule depending on which of the two is obtained.
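Because the AI algorithm is left open, the following sketch substitutes a plain Pearson correlation coefficient between the power-state histories of two nodes as a stand-in for the correlation degree. This is a swapped-in, simple statistic chosen only for illustration, not the algorithm of the patent; the function name `pearson` and the 0/1 history encoding are assumptions.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equally long numeric histories.

    Returns 0.0 when either history is constant, since the coefficient
    is undefined in that case (a choice made for this sketch).
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("histories must have the same length (at least 2)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


# 1 = powered on, 0 = powered off, sampled at the same instants on both nodes.
current_history = [1, 1, 0, 0, 1]
neighbour_history = [1, 1, 0, 0, 1]
degree = pearson(current_history, neighbour_history)
```

Two nodes that always switch together yield a degree close to 1, while nodes that switch in opposition yield a degree close to -1.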
Further, determining whether the current server node and the adjacent server node are in a correlated state according to the correlation degree includes:
when the correlation degree is detected to be within a target preset range, or greater than or equal to a target threshold, the current server node and the adjacent server node are in a correlated state;
when the correlation degree is detected to be outside the target preset range, or smaller than the target threshold, the current server node and the adjacent server node are in an uncorrelated state;
determining whether the current server node and the adjacent server node are in a correlated state according to the output value includes:
when the output value is detected to be 1, the current server node and the adjacent server node are in a correlated state;
when the output value is detected to be 0, the current server node and the adjacent server node are in an uncorrelated state.
It should be noted that, if the analysis yields a correlation degree between the current server node and the adjacent server node, the two nodes are in a correlated state when the correlation degree is within the target preset range or is greater than or equal to the target threshold.
Alternatively, if the analysis yields an output value for the current server node and the adjacent server node, whether the two nodes are correlated is determined by whether the output value is 1 or 0.
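The two decision rules above can be written down directly. In this sketch the function names and the example threshold and range values are hypothetical; the text only states that a target threshold or a target preset range exists, not what its value is.

```python
def related_by_degree(degree, threshold=None, target_range=None):
    """Decide correlation from a correlation degree.

    Exactly one of `threshold` or `target_range` (low, high) is expected;
    if both are given, the threshold takes precedence in this sketch.
    """
    if threshold is not None:
        return degree >= threshold          # greater than or equal to the target threshold
    if target_range is not None:
        low, high = target_range
        return low <= degree <= high        # within the target preset range
    raise ValueError("a threshold or a target range is required")


def related_by_output(output):
    """Decide correlation from a binary output value: 1 = correlated, 0 = not."""
    if output not in (0, 1):
        raise ValueError("output value must be 0 or 1")
    return output == 1


by_threshold = related_by_degree(0.92, threshold=0.8)         # 0.8 is an example value
by_range = related_by_degree(0.5, target_range=(0.7, 1.0))    # example range
by_output = related_by_output(1)
```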
Step 203, receiving state change information sent by an adjacent server node in the same network, wherein the state change information includes local operation information of the server, and the local operation information indicates whether the server is powered on or powered off;
Step 204, if the current server node and the adjacent server node are in a correlated state, determining, according to the state change information, whether to send a state change command to the DPU card;
Step 205, if so, controlling the DPU card and the server in the current server node to power on or power off cooperatively.
It should be noted that, for steps 203 to 205, refer to the foregoing description of steps 101 to 103; the details are not repeated here.
In addition, in the embodiment of the application, the state data is analyzed and processed by a preset AI algorithm to obtain either the correlation degree between the current server node and the adjacent server node or an output value for the two nodes, from which it is determined whether the two adjacent server nodes are in a relevant state. Power-on and power-off are then coordinated accordingly, so that devices respond to load changes more promptly, power consumption is greatly reduced, and operators are freed from a large amount of per-device maintenance.
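The correlation-degree decision described above can be sketched as follows. The text does not specify the preset AI algorithm, so this sketch substitutes a plain Pearson correlation over power-consumption samples as a stand-in; the function names, the sample values and the threshold of 0.8 are all assumptions made for illustration only.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences.

    Stand-in for the unspecified "preset AI algorithm" (an assumption).
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 samples")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant series carries no trend to compare
    return cov / sqrt(var_x * var_y)


TARGET_THRESHOLD = 0.8  # hypothetical target threshold


def is_relevant(local_power_w, neighbor_power_w, threshold=TARGET_THRESHOLD):
    """Relevant state when the correlation degree reaches the target threshold."""
    return pearson(local_power_w, neighbor_power_w) >= threshold
```

Two nodes whose power draw rises and falls together score close to 1 and are treated as being in a relevant state; nodes whose draw moves in opposite directions score close to -1 and are not.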
Referring to fig. 4, a schematic diagram of a power-on and power-off system of a DPU card according to an embodiment of the present invention is shown.
It should be noted that fig. 7 shows a conventional DPU card architecture. A DPU card and a server form a node, each node accesses the network as an independent individual, and the system administrator is responsible for managing every device. The administrator must operate on each device point-to-point, including collecting the current running state of the device and controlling the state of each device, for example shutting down DPU card 1 and server 1, or restarting DPU card 2 and server 2. However, a data center contains thousands of devices, and depending on its load a device may be running in a low power consumption state. In that case, to save energy, the load of the current device can be transferred to other devices and the current device can then be shut down. If these operations are performed by the administrator, two problems arise: first, the response is not timely, so the server may already be in a low power state while the DPU card is still working in a high power state; second, the large number of devices places a heavy burden on the administrator, which reduces execution efficiency.
Therefore, in the embodiment of the application, a DPU card power-on and power-off system includes at least two server nodes, where the server nodes are interconnected through network cables and each server node comprises a feature extraction unit, a decision unit and a feedback unit;
the feature extraction unit is used for collecting state data sent by the target server, wherein the state data comprises at least one of a server running state, a power consumption state and the display state of a fault-reporting indicator lamp;
the decision unit is used for determining the correlation degree between the server nodes according to the state data;
the feedback unit is used for collecting the local operation information of the current server.
The server node also comprises a DPU card and a server;
the DPU card and the server are connected with each other.
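The composition of a server node described above can be sketched as a small set of types. This is only an illustrative model: the class and field names, the use of watts for power consumption, and the scoring function `running_state_agreement` are assumptions, since the text does not define how the decision unit computes the correlation degree.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class StateData:
    running: bool        # server running state
    power_w: float       # power consumption state (watts is an assumed unit)
    fault_lamp_on: bool  # fault-reporting indicator lamp state


class FeatureExtractionUnit:
    """Collects state data sent by target servers, keyed by server id."""

    def __init__(self) -> None:
        self.samples: Dict[str, List[StateData]] = {}

    def collect(self, server_id: str, data: StateData) -> None:
        self.samples.setdefault(server_id, []).append(data)


def running_state_agreement(a: List[StateData], b: List[StateData]) -> float:
    """Hypothetical score: fraction of paired samples with the same running state."""
    pairs = list(zip(a, b))
    if not pairs:
        return 0.0
    return sum(x.running == y.running for x, y in pairs) / len(pairs)


class DecisionUnit:
    """Determines the correlation degree between nodes from state data."""

    def __init__(self, score: Callable[[List[StateData], List[StateData]], float]
                 = running_state_agreement) -> None:
        self.score = score

    def correlation_degree(self, a: List[StateData], b: List[StateData]) -> float:
        return self.score(a, b)


class FeedbackUnit:
    """Collects the local operation information of the current server."""

    def __init__(self) -> None:
        self.server_powered_on = True

    def local_operation_info(self) -> str:
        return "power_on" if self.server_powered_on else "power_off"


@dataclass
class ServerNode:
    name: str
    features: FeatureExtractionUnit = field(default_factory=FeatureExtractionUnit)
    decision: DecisionUnit = field(default_factory=DecisionUnit)
    feedback: FeedbackUnit = field(default_factory=FeedbackUnit)
```

Passing the scoring function into `DecisionUnit` keeps the sketch neutral about the actual algorithm: any function that maps two state-data histories to a number can be swapped in.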
It should be noted that the power-on and power-off system of the DPU card comprises a management end, a server node, a feature extraction unit in the server node, a decision unit, a feedback unit, the DPU card, a server and the like.
The server node can be implemented on an Intel x86 system, with the server system built on the x86 platform; the feature extraction unit and the feedback unit of the server node are implemented by a network adapter and an ARM SoC, and the decision unit runs on the ARM SoC; the server nodes and the management end are interconnected through network cables so that each device in the network can be managed.
The DPU card powering-on and powering-off system further comprises a management end which is in communication connection with the server node;
the management end is used for monitoring the server nodes, and the management end sends power-on and power-off commands to any one of the server nodes so that any one of the server nodes sends state change information to adjacent server nodes.
In this embodiment of the present application, the management end may send a power-on/power-off command to any one of the server nodes, so that any one of the server nodes sends state change information to the adjacent server nodes, and further, performs subsequent coordinated power-on/power-off.
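The flow from the management end to the adjacent nodes can be sketched as below. This is a simplified model under stated assumptions: the `Node` class, its method names and the `relevant` set are hypothetical, messages are modelled as direct method calls rather than network traffic, and propagation stops after one hop.

```python
class Node:
    """Simplified server node: one server plus one DPU card."""

    def __init__(self, name):
        self.name = name
        self.server_on = True
        self.dpu_on = True
        self.neighbors = []    # adjacent nodes on the same network
        self.relevant = set()  # names of neighbors judged to be in a relevant state

    def _apply(self, powered_on):
        # Server and DPU card change state together (cooperative power on/off).
        self.server_on = powered_on
        self.dpu_on = powered_on

    def on_management_command(self, powered_on):
        """Command from the management end: act locally, then notify neighbors."""
        self._apply(powered_on)
        for neighbor in self.neighbors:
            neighbor.on_state_change(self.name, powered_on)

    def on_state_change(self, sender, powered_on):
        """State change information received from an adjacent node."""
        if sender not in self.relevant:
            return  # not in a relevant state: ignore the change
        if self.server_on == powered_on and self.dpu_on == powered_on:
            return  # already in the target state: no command needed
        self._apply(powered_on)
```

With nodes `a`, `b` and `c`, where only `b` treats `a` as relevant, a single power-off command sent to `a` also powers off `b` and leaves `c` running, so the administrator issues one command instead of one per device.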
Referring to fig. 5, a schematic structural diagram of a power-on/power-off device of a DPU card according to an embodiment of the present invention is shown. The device comprises:
A receiving module 501, configured to receive state change information sent by the adjacent server nodes in the same network, where the state change information includes local operation information of the server, where the local operation information characterizes whether the server is powered on or powered off;
a determining module 502, configured to determine whether to send a state change command to the DPU card according to the state change information if the current server node and the neighboring server node are in a relevant state;
and a control module 503, configured to control the DPU card and the server in the current server node to power on or power off cooperatively if it is determined that the state change command is to be sent.
Optionally, the apparatus further comprises:
an acquisition module, configured to collect state data of servers in adjacent server nodes in the same network, wherein the state data comprises at least one of running states of the servers, power consumption of the servers and error reporting information of the servers;
and the algorithm processing module is used for processing the state data through a preset algorithm and determining whether the current server node and the adjacent server node are in a relevant state or not.
Optionally, the algorithm processing module includes:
a first algorithm processing sub-module, configured to analyze and process the state data through a preset AI algorithm to obtain the correlation degree between the current server node and the adjacent server node, or to obtain the output value between the current server node and the adjacent server node;
a second algorithm processing sub-module, configured to determine whether the current server node and the adjacent server node are in a relevant state according to the correlation degree, or to determine whether the current server node and the adjacent server node are in a relevant state according to the output value;
the second algorithm processing sub-module includes:
a first determining unit, configured to determine that the current server node and the adjacent server node are in a relevant state when the correlation degree is detected to be within a target preset range or greater than or equal to a target threshold;
a second determining unit, configured to determine that the current server node and the adjacent server node are in a non-relevant state when the correlation degree is detected to be outside the target preset range or smaller than the target threshold;
the second algorithm processing sub-module further includes:
a third determining unit, configured to determine that the current server node and the adjacent server node are in a relevant state when the output value is detected to be 1;
and a fourth determining unit, configured to determine that the current server node and the adjacent server node are in a non-relevant state when the output value is detected to be 0.
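The receiving, determining and control modules, together with the output-value variant of the relevance check, can be sketched as one small class. Everything named here is an assumption made for illustration: the message layout (`sender`, `powered_on`), the class and method names, and the idea that the per-neighbor output values have already been computed and stored in a dictionary.

```python
from dataclasses import dataclass

RELEVANT, NON_RELEVANT = 1, 0  # output values as described in the text


@dataclass
class StateChangeInfo:
    sender: str       # name of the adjacent server node
    powered_on: bool  # local operation information of the sender's server


class DpuPowerDevice:
    def __init__(self, output_values):
        self.output_values = output_values  # neighbor name -> 1 or 0
        self.dpu_on = True
        self.server_on = True

    def receive(self, message):
        """Receiving module 501: parse incoming state change information."""
        return StateChangeInfo(message["sender"], bool(message["powered_on"]))

    def determine(self, info):
        """Determining module 502: send a command only for a relevant
        neighbor whose new state differs from the DPU card's current state."""
        is_relevant = self.output_values.get(info.sender, NON_RELEVANT) == RELEVANT
        return is_relevant and info.powered_on != self.dpu_on

    def control(self, info):
        """Control module 503: power the DPU card and server on or off together."""
        self.dpu_on = info.powered_on
        self.server_on = info.powered_on

    def handle(self, message):
        info = self.receive(message)
        if self.determine(info):
            self.control(info)
            return True
        return False
```

A neighbor missing from `output_values` is treated as non-relevant, which is a conservative default chosen for this sketch so that an unknown node cannot power the local DPU card off.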
The embodiment of the invention also provides an electronic device, as shown in fig. 6, which comprises a processor 601, a communication interface 602, a memory 603 and a communication bus 604, wherein the processor 601, the communication interface 602 and the memory 603 complete communication with each other through the communication bus 604,
a memory 603 for storing a computer program;
the processor 601 is configured to execute the program stored in the memory 603, and implement the following steps:
receiving state change information sent by the adjacent server nodes in the same network, wherein the state change information comprises local operation information of the server, and the local operation information characterizes whether the server is powered on or powered off;
if the current server node and the adjacent server node are in a relevant state, determining whether to send a state change command to the DPU card according to the state change information;
if yes, the DPU card in the current server node and the server are controlled to cooperatively power up and power down.
The communication bus of the above electronic device may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The communication bus may be divided into an address bus, a data bus, a control bus, and so on. For ease of illustration, only one bold line is drawn in the figure, but this does not mean that there is only one bus or only one type of bus.
The communication interface is used for communication between the electronic device and other devices.
The memory may include random access memory (RAM) or non-volatile memory, such as at least one disk memory. Optionally, the memory may also be at least one storage device located remotely from the aforementioned processor.
The processor may be a general-purpose processor, such as a central processing unit (CPU) or a network processor (NP); it may also be a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
In yet another embodiment of the present invention, a computer readable storage medium is provided, where instructions are stored, which when executed on a computer, cause the computer to perform the DPU card power-up and power-down method of any one of the above embodiments.
The above embodiments may be implemented in whole or in part by software, hardware, firmware, or any combination thereof. When implemented in software, they may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer instructions are loaded and executed on a computer, the flows or functions according to the embodiments of the present invention are produced in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable apparatus. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another, for example by wired means (e.g., coaxial cable, optical fiber, digital subscriber line (DSL)) or wireless means (e.g., infrared, radio, microwave). The computer-readable storage medium may be any available medium that can be accessed by a computer, or a data storage device such as a server or data center that integrates one or more available media. The available medium may be a magnetic medium (e.g., floppy disk, hard disk, magnetic tape), an optical medium (e.g., DVD), or a semiconductor medium (e.g., solid state disk (SSD)).
It is noted that relational terms such as first and second, and the like are used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising one … …" does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.
In this specification, the embodiments are described in a related manner; for identical or similar parts, the embodiments may be referred to one another, and each embodiment focuses on its differences from the others. In particular, because the system embodiments are substantially similar to the method embodiments, they are described relatively briefly; for relevant details, see the corresponding description of the method embodiments.
The foregoing description is only of the preferred embodiments of the present invention and is not intended to limit the scope of the present invention. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present invention are included in the protection scope of the present invention.

Claims (10)

1. The DPU card powering-on and powering-off method is characterized by being applied to any one server node in a DPU card powering-on and powering-off system, wherein the DPU card powering-on and powering-off system comprises at least two server nodes, and the server nodes and adjacent server nodes are interconnected through network cables, and the method comprises the following steps:
receiving state change information sent by the adjacent server nodes in the same network, wherein the state change information comprises local operation information of the server, and the local operation information characterizes whether the server is powered on or powered off;
if the current server node and the adjacent server node are in a relevant state, determining whether to send a state change command to the DPU card according to the state change information;
if yes, the DPU card in the current server node and the server are controlled to cooperatively power up and power down.
2. The DPU card power-up and power-down method of claim 1, wherein, prior to the step of receiving the state change information sent by the adjacent server node in the same network, the state change information indicating whether the server is in a power-up state, the method further comprises:
Collecting state data of servers in adjacent server nodes under the same network, wherein the state data comprises at least one of running states of the servers, power consumption of the servers and error reporting information of the servers;
and processing the state data through a preset algorithm to determine whether the current server node and the adjacent server node are in a relevant state.
3. The DPU card power-up and power-down method of claim 2, wherein the processing the status data by a preset algorithm to determine whether a current server node and an adjacent server node are in a relevant state comprises:
analyzing and processing the state data through a preset AI algorithm to obtain the correlation degree between the current server node and the adjacent server node, or obtain the output value between the current server node and the adjacent server node;
determining whether the current server node and the adjacent server node are in a relevant state according to the correlation degree, or determining whether the current server node and the adjacent server node are in a relevant state according to the output value;
the determining whether the current server node and the adjacent server node are in a relevant state according to the correlation degree comprises:
when the correlation degree is detected to be within a target preset range or greater than or equal to a target threshold, determining that the current server node and the adjacent server node are in a relevant state;
when the correlation degree is detected to be outside the target preset range or smaller than the target threshold, determining that the current server node and the adjacent server node are in a non-relevant state;
the determining whether the current server node and the adjacent server node are in a relevant state according to the output value comprises:
when the output value is detected to be 1, determining that the current server node and the adjacent server node are in a relevant state;
when the output value is detected to be 0, determining that the current server node and the adjacent server node are in a non-relevant state.
4. A DPU card power-on/power-off system, wherein the DPU card power-on/power-off system performs the DPU card power-on/power-off method of any one of claims 1 to 3.
5. The DPU card power-on-off system of claim 1, wherein the DPU card power-on-off system comprises: the system comprises at least two server nodes, wherein the server nodes are interconnected through a network cable and comprise a feature extraction unit, a decision unit and a feedback unit;
The feature extraction unit is used for collecting state data sent by the target server, wherein the state data comprises at least one of a server running state, a power consumption state and a fault reporting state lamp display state;
the decision unit is used for determining the correlation degree between the server nodes according to the state data;
the feedback unit is used for collecting the local operation information of the current server.
6. The DPU card power-on-off system of claim 1, wherein the server node further comprises a DPU card and a server;
the DPU card and the server are connected with each other.
7. The DPU card power-on-off system of claim 1, further comprising a management end communicatively coupled to the server node;
the management end is used for monitoring the server nodes, and the management end sends power-on and power-off commands to any one of the server nodes so that any one of the server nodes sends state change information to adjacent server nodes.
8. A DPU card power-on and power-off device, the device comprising:
the receiving module is used for receiving state change information sent by the adjacent server nodes in the same network, wherein the state change information comprises local operation information of the server, and the local operation information characterizes whether the server is powered on or powered off;
The determining module is used for determining whether to send a state change command to the DPU card according to the state change information if the current server node and the adjacent server node are in a relevant state;
and the control module is used for controlling the DPU card and the server in the current server node to cooperatively power up and power down if it is determined that the state change command is to be sent.
9. An electronic device, comprising:
a processor;
a memory for storing processor-executable instructions;
wherein the processor is configured to execute the instructions to implement the DPU card power-up and power-down method of any one of claims 1 to 3.
10. A readable storage medium, wherein a computer program is stored on the readable storage medium, which when executed by a processor, implements the DPU card power-up and power-down method of any one of claims 1 to 3.
CN202311278085.5A 2023-09-28 2023-09-28 DPU card powering-on and powering-off method, system, device, equipment and storage medium Pending CN117896191A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311278085.5A CN117896191A (en) 2023-09-28 2023-09-28 DPU card powering-on and powering-off method, system, device, equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311278085.5A CN117896191A (en) 2023-09-28 2023-09-28 DPU card powering-on and powering-off method, system, device, equipment and storage medium

Publications (1)

Publication Number Publication Date
CN117896191A true CN117896191A (en) 2024-04-16

Family

ID=90643468

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311278085.5A Pending CN117896191A (en) 2023-09-28 2023-09-28 DPU card powering-on and powering-off method, system, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN117896191A (en)


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination