WO2021128084A1 - Data processing, acquisition, model training and power consumption control methods, system and device - Google Patents

Info

Publication number
WO2021128084A1
Authority
WO
WIPO (PCT)
Prior art keywords
power consumption
processing unit
measurement information
frequency
inference model
Prior art date
Application number
PCT/CN2019/128400
Other languages
French (fr)
Chinese (zh)
Inventor
王加龙
张云
朱昊
李栈
任志星
宋军
Original Assignee
阿里巴巴集团控股有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 阿里巴巴集团控股有限公司
Priority to PCT/CN2019/128400
Publication of WO2021128084A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F1/00 Details not covered by groups G06F3/00 - G06F13/00 and G06F21/00
    • G06F1/26 Power supply means, e.g. regulation thereof
    • G06F1/32 Means for saving power

Definitions

  • This application relates to the field of computer technology, and in particular to a method, system, and equipment for data processing, acquisition, model training, and power consumption control.
  • the embodiments of the present application provide a data processing, acquisition, model training, and power consumption control method, system, and equipment to solve or improve the problems in the prior art.
  • a data processing method includes:
  • the frequency of the processing unit of the first device is determined.
  • in another embodiment, a data processing system includes:
  • the first device is used to generate measurement information during work
  • the first management device is configured to obtain measurement information related to the first device, and determine the frequency of the processing unit of the first device according to the measurement information.
  • a data processing method includes:
  • the measurement information is used as an input of an inference model, and the inference model is executed to obtain the processing unit frequency of the first device.
  • a data acquisition method includes:
  • the processing unit frequency and the measurement information are used as a sample pair in the training samples used to train the inference model to be trained.
  • model training methods include:
  • the training samples including multiple sample pairs, the sample pairs including test information and processing unit frequencies corresponding to the test information;
  • the trained inference model is used to determine the frequency of the processing unit of the first device according to the measurement information related to the first device.
  • a test system includes:
  • the second device for testing has the same hardware structure and performance as the first device and is used to run a test program that applies the corresponding test load;
  • the second management device for testing is connected to the second device and is used to obtain the processing unit frequency of the second device under the test load and the measurement information related to the second device, and to use the processing unit frequency and the measurement information as a sample pair in the training samples used to train the inference model to be trained.
  • a device power consumption control method includes:
  • the power consumption capping control for the first device is executed by using the first power consumption capping value after resetting.
  • an electronic device includes: a memory and a processor, among which,
  • the memory is used to store programs
  • the processor is coupled with the memory, and is configured to execute the program stored in the memory for:
  • the frequency of the processing unit of the first device is determined.
  • an electronic device includes: a memory and a processor, among which,
  • the memory is used to store programs
  • the processor is coupled with the memory, and is configured to execute the program stored in the memory for:
  • the measurement information is used as an input of an inference model, and the inference model is executed to obtain the processing unit frequency of the first device.
  • an electronic device includes: a memory and a processor, among which,
  • the memory is used to store programs
  • the processor is coupled with the memory, and is configured to execute the program stored in the memory for:
  • the processing unit frequency and the measurement information are used as a sample pair in the training samples used to train the inference model to be trained.
  • an electronic device includes: a memory and a processor, among which,
  • the memory is used to store programs
  • the processor is coupled with the memory, and is configured to execute the program stored in the memory for:
  • the training samples including multiple sample pairs, the sample pairs including test information and processing unit frequencies corresponding to the test information;
  • the trained inference model is used to determine the frequency of the processing unit of the first device according to the measurement information related to the first device.
  • an electronic device includes: a memory and a processor, among which,
  • the memory is used to store programs
  • the processor is coupled with the memory, and is configured to execute the program stored in the memory for:
  • the power consumption capping control for the first device is executed by using the first power consumption capping value after resetting.
  • through extensive creative work, the inventors of the technical solutions provided by the embodiments of the present application found that the processing unit frequency of a device is related to measurement information, where the measurement information is measurable information related to the device.
  • the processing unit frequency mentioned in this document, that is, the clock frequency of the processing unit, is the operating frequency of the processing unit during operation. Therefore, in the technical solution provided by an embodiment of the present application, measurement information related to the first device is acquired, and the processing unit frequency of the first device is determined based on the measurement information.
  • the processing unit frequency obtained by using the technical solution provided in this embodiment is more accurate than the processing unit frequency read from the BMC (baseboard management controller) in existing approaches, which helps improve the power management capabilities of the device.
  • the self-learning ability of the model being trained is used to learn the correlation between the measurement information and the processing unit frequency; the trained inference model then takes the measurement information related to the first device as its input, and the processing unit frequency of the first device is obtained by executing the inference model.
  • the processing unit frequency obtained by the technical solution provided in this embodiment has a higher accuracy rate.
  • a load is applied to the second device for testing to simulate the processing unit frequency of the second device and the related measurement information under different load conditions, and the processing unit frequency and the measurement information are used as training samples of the inference model. The inference model can thereby learn the relationship between the measurement information and the processing unit frequency more accurately, which improves the accuracy of the calculated processing unit frequency.
  • the measurement information related to the first device is used to determine the processing unit frequency of the first device, and then, based on the determined processing unit frequency, the first power consumption cap value referenced during power consumption capping control is reset. Because the processing unit frequency is highly accurate, the first power consumption cap value reset with it is closer to the actual situation of the first device; the technical solution provided by the embodiments of this application therefore helps improve the power management capability of the device.
  • FIG. 1 is a schematic flowchart of a data processing method provided by an embodiment of this application
  • FIG. 2 is a schematic structural diagram of a data processing system provided by an embodiment of this application.
  • FIG. 3 is a schematic flowchart of a data processing method provided by another embodiment of this application.
  • FIG. 4 is a schematic flowchart of a data acquisition method provided by an embodiment of the application.
  • FIG. 5 is a schematic flowchart of a model training method provided by an embodiment of this application.
  • FIG. 6 is a schematic structural diagram of a data processing system provided by another embodiment of this application.
  • FIG. 7 is a schematic diagram of a training sample generation process provided by an embodiment of the application.
  • FIG. 8 is a schematic diagram of an inference model training process provided by an embodiment of the application.
  • FIG. 9 is a schematic diagram of a real-time processing unit frequency inference process when using an inference model to perform power capping control on a server according to an embodiment of the application.
  • FIG. 10 is a schematic flowchart of a method for controlling power consumption of a device according to an embodiment of the application.
  • FIG. 11 is a schematic structural diagram of a data processing device provided by an embodiment of the application.
  • FIG. 12 is a schematic structural diagram of a data processing device provided by another embodiment of this application.
  • FIG. 13 is a schematic structural diagram of a data acquisition device provided by an embodiment of this application.
  • FIG. 14 is a schematic structural diagram of a model training device provided by an embodiment of the application.
  • FIG. 15 is a schematic structural diagram of a device power consumption control apparatus provided by an embodiment of the application.
  • FIG. 16 is a schematic structural diagram of an electronic device provided by an embodiment of the application.
  • the power consumption management in the prior art includes three main parts: setting a capping value, monitoring operating power consumption, and performing power capping actions. That is, the power consumption cap value of each rack server is first set according to the power distribution requirements of the cabinet, the actual power consumption of the server in normal operation, business pressure requirements, etc., and the cap value is then written into the out-of-band management device as the upper power consumption limit for server operation.
  • the out-of-band management device monitors the power consumption of the whole machine, and if it finds that the power consumption exceeds the capping value, it performs the capping action.
  • the out-of-band management device has limited access to the server status. For example, the out-of-band management device cannot accurately and directly obtain the frequency of the processing unit. In power management, it is crucial to accurately obtain the processing unit frequency.
  • each embodiment of the present application provides a solution to obtain a processing unit frequency with a higher accuracy rate, so as to perform power consumption management more accurately.
  • Fig. 1 shows a schematic flowchart of a data processing method provided by an embodiment of the present application. As shown in Figure 1, the data processing method includes:
  • the measurement information related to the first device can be understood as: all measurable information in the working process of the first device, including but not limited to: processing unit power consumption, processing unit temperature value, processing unit utilization rate, motherboard temperature information, fan speed, etc.
  • the foregoing step 101 "obtain measurement information related to the first device" may include:
  • in a case where the power consumption of the first device exceeds the first power consumption cap value (power capping threshold value), start power consumption capping control.
  • the power consumption capping control can be simply understood as: monitoring the power consumption of the first device, and controlling the power consumption of the first device to not exceed the first power consumption capping value.
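  • a minimal sketch of the monitoring side of the power consumption capping control described above. It assumes the power readings are already available as a sequence of watt values; the function name `first_capping_trigger` and the sample numbers are illustrative and do not come from the application.

```python
from typing import Iterable


def first_capping_trigger(power_readings_w: Iterable[float], cap_w: float) -> int:
    """Return the index of the first reading that exceeds the cap, or -1 if none does.

    Hypothetical helper: in a real system the readings would come from the
    out-of-band management device rather than from an in-memory sequence.
    """
    for index, power_w in enumerate(power_readings_w):
        if power_w > cap_w:
            # This is the point at which capping control would be started.
            return index
    return -1


# Example: with a 350 W cap, the third reading (index 2) is the first to exceed it.
trigger_index = first_capping_trigger([300.0, 340.0, 355.0, 360.0], cap_w=350.0)
```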
  • the processing unit frequency, that is, the clock frequency of the processing unit, is the operating frequency (the number of synchronization pulses generated in 1 second) of the processing unit during operation; it determines the operating speed of the computer.
  • step 102 determine the frequency of the processing unit of the first device according to the measurement information.
  • the "obtaining inference model" in the above step 1021 may specifically include:
  • training samples include: processing unit frequencies and measurement information corresponding to the processing unit frequency samples;
  • the inference model completes training
  • through extensive creative work, the inventors of the technical solutions provided by the embodiments of the present application found that the processing unit frequency of a device is related to measurement information, where the measurement information is measurable information related to the device. Therefore, in the technical solution provided by an embodiment of the present application, measurement information related to the first device is acquired, and the processing unit frequency of the first device is determined based on the measurement information. Practice has shown that the processing unit frequency obtained by using the technical solution provided in this embodiment is more accurate than the processing unit frequency read from the BMC (baseboard management controller) in existing approaches, which helps improve the power management capabilities of the device.
  • the method provided in this embodiment may further include the following steps:
  • the first preset condition may be: whether it is within a set value range. It is assumed that the value range is between a first preset value and a second preset value, where the first preset value is smaller than the second preset value.
  • the above step 103 "resetting the first power consumption cap value when the frequency of the processing unit does not meet the first preset condition" may specifically include:
  • the first preset value and the second preset value may be empirical values or values obtained through multiple experiments, etc., which are not specifically limited in this embodiment.
  • the increase and decrease of the first power consumption cap value may be determined based on a preset reset rule.
  • the reset rule is: each adjustment is increased or decreased by a fixed value or a fixed ratio.
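  • the reset rule above can be sketched as follows. This is an illustration under stated assumptions, not the applicants' implementation: the function name `reset_cap`, the MHz and watt units, and the concrete numbers are hypothetical. The direction of adjustment (lower the cap when the frequency is above the range, raise it when below) follows the description given later in this document for adjusting the existing cap value based on the inferred frequency.

```python
from typing import Optional


def reset_cap(cap_w: float, freq_mhz: float, low_mhz: float, high_mhz: float,
              step_w: Optional[float] = None, ratio: Optional[float] = None) -> float:
    """Return the new cap value after applying a fixed-value or fixed-ratio rule.

    low_mhz and high_mhz stand for the first and second preset values
    (low_mhz < high_mhz). Exactly one of step_w or ratio must be given.
    """
    if (step_w is None) == (ratio is None):
        raise ValueError("specify exactly one of step_w or ratio")
    if low_mhz <= freq_mhz <= high_mhz:
        # Frequency meets the preset condition: keep the current cap.
        return cap_w
    delta = step_w if step_w is not None else cap_w * ratio
    if freq_mhz > high_mhz:
        # Frequency too high: lower the cap.
        return cap_w - delta
    # Frequency too low: raise the cap.
    return cap_w + delta
```

  • for instance, `reset_cap(400.0, 3500.0, 2000.0, 3000.0, step_w=10.0)` lowers a 400 W cap by a fixed 10 W, while passing `ratio=0.05` instead would change it by 5% of the current cap.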
  • the processing unit mentioned in this document can be a general-purpose processor, such as a CPU; it can also be a dedicated processor or a heterogeneous computing unit, such as a DSP (Digital Signal Processor), an ASIC (Application-Specific Integrated Circuit), a GPU (Graphics Processing Unit), an FPGA (Field-Programmable Gate Array), a network card acceleration chip, etc.
  • the data processing system includes:
  • the first device 201 is used to generate measurement information during work
  • the first management device 202 is configured to obtain measurement information related to the first device, and determine the frequency of the processing unit of the first device according to the measurement information.
  • the first management device 202 is further configured to send a power consumption capping control instruction to the first device when the power consumption of the first device exceeds the first power consumption cap value.
  • the first device 201 is also configured to perform a power capping operation according to the control instruction.
  • the first management device 202 is further configured to obtain measurement information related to the first device when the power consumption capping control is activated by the first device, and determine the processing unit frequency of the first device according to the measurement information.
  • the first device is a server in a server cluster; the first management device is an out-of-band management device.
  • the server includes multiple pieces of hardware, including but not limited to the following: motherboard, processing unit, power supply, temperature sensor, and fan.
  • the measurement information related to the first device acquired by the out-of-band management apparatus may include: power consumption of the processing unit, temperature value of the processing unit, utilization of the processing unit, temperature information of the main board, fan speed, and so on.
  • a first interface may be provided in the first device, and the first interface is used to connect to the BMC.
  • BMC can realize the collection function of part or all of the measurement information.
  • the first interface may include, but is not limited to: a USB interface and a PCI (Peripheral Component Interconnect) slot.
  • the out-of-band management implemented by the out-of-band management device may include: out-of-band management of the power consumption of the device according to the out-of-band information sent by the server (that is, all measurable measurement information related to the first device), and/or out-of-band management of hardware shared by at least one server, such as a fan.
  • out-of-band management device provided in this embodiment can implement other functions in addition to the above functions.
  • FIG. 3 shows a schematic flowchart of a data processing method provided by another embodiment of the present application. As shown in the figure, the data processing method includes:
  • the above step 301 "obtain measurement information related to the first device" may include:
  • in the case where the power consumption of the first device exceeds the first power consumption cap value, start power consumption capping control;
  • the measurement information related to the first device can be understood as: all the information that can be measured during the working process of the first device, including but not limited to: processing unit power consumption, processing unit temperature value, processing unit utilization rate, main board temperature information, fan speed, etc.
  • step 302 "obtain an inference model that uses training samples to complete training" may include:
  • the inference model completes training
  • the inference model to be trained in this embodiment can be a neural network model in the prior art, such as a convolutional neural network (CNN), a long short-term memory network (LSTM), etc., which is not specifically limited in this embodiment.
  • the model training process can also refer to related content in the prior art.
  • FIG. 4 shows a schematic flowchart of a data acquisition method provided by an embodiment of the present application.
  • the second device in this embodiment is a test device, which has the same type of processing unit and other hardware as the actual server (that is, a server that is used on site and requires power consumption management), and it has an in-band sensor that can accurately read the processing unit frequency.
  • the method includes:
  • processing unit frequency and the measurement information as a sample pair in the training samples used to train the inference model to be trained.
  • a test program, such as a Benchmark tool, can be loaded and run on the second device. For details of the Benchmark tool, refer to the prior art; they are not repeated here.
  • the frequency of the processing unit can be obtained by an in-band sensor.
  • the measurement information related to the first device can be understood as: all the information that can be measured during the working process of the first device, including but not limited to: processing unit power consumption, processing unit temperature value, processing unit utilization rate, main board temperature information, fan speed, etc.
  • the measurement information related to the first device is information that can be obtained by the out-of-band management device.
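  • one way to picture a sample pair is as a small record that couples one set of out-of-band measurements with the processing unit frequency read at the same moment. The sketch below is illustrative only: the `SamplePair` class, the field names, and the two reader callables are hypothetical stand-ins for the out-of-band readings and the in-band frequency sensor.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class SamplePair:
    """One training sample: measurement information plus the matching frequency."""
    measurement: Dict[str, float]   # e.g. power consumption, temperature, utilization
    frequency_mhz: float            # value read from the in-band sensor


def collect_sample_pairs(read_measurement: Callable[[], Dict[str, float]],
                         read_frequency_mhz: Callable[[], float],
                         count: int) -> List[SamplePair]:
    """Read `count` sample pairs, taking one measurement and one frequency per pair."""
    return [SamplePair(read_measurement(), read_frequency_mhz()) for _ in range(count)]
```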
  • the method provided in this embodiment may further include the following steps:
  • step 402 obtaining the frequency of the processing unit of the second device under test load and measurement information related to the second device.
  • the "acquiring the frequency of the processing unit of the second device and the measurement information related to the second device" in the foregoing 4022 may specifically be:
  • the test period is the period from when the power consumption capping control is activated until the test load of the second device reaches a preset maximum load; or the period from when the power consumption capping control is activated until the test load of the second device has reached the preset maximum load and continued for a set duration.
  • the aforementioned set duration may be an empirical value, which is not specifically limited in this embodiment.
  • the method provided in this embodiment may further include the following steps:
  • the next round of testing returns to the above step 401 to repeat the above process again on the basis of resetting the first power consumption cap value to obtain a new sample pair.
  • the foregoing threshold may be equal to or greater than the rated power consumption of the second device.
  • Each round of resetting can increase a certain step length, this step length can be a fixed value, or it can be changed appropriately.
  • for example, the rated power consumption of the second device (such as a server) is 500W. In the first round of testing, the first power consumption cap value is set to 350W; in the second round of testing, the first power consumption cap value is reset to 360W; ...; in the Nth round of testing, the first power consumption cap value is reset to 520W.
  • the last threshold can be a little higher than the device's rated power consumption of 500W, because the rated power consumption does not represent the maximum power consumption of the machine in actual operation.
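  • the round-by-round cap values in this example can be generated with a few lines of code. The sketch assumes a fixed step length of 10W, which matches the 350W to 360W change in the example; the function name `cap_schedule` is hypothetical, and a variable step length, which the text also allows, is not covered.

```python
from typing import List


def cap_schedule(start_w: int, final_w: int, step_w: int) -> List[int]:
    """List the first power consumption cap value used in each test round."""
    if step_w <= 0:
        raise ValueError("step_w must be positive")
    caps = []
    cap = start_w
    while cap <= final_w:   # the final threshold may exceed the rated power
        caps.append(cap)
        cap += step_w
    return caps


# Rated power 500 W, caps swept from 350 W up to 520 W in 10 W steps.
caps = cap_schedule(350, 520, 10)
```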
  • the "resetting the first power consumption cap value" in the above steps 405 and 405' may specifically be:
  • the first power consumption cap value is updated to the fourth power consumption cap value.
  • FIG. 5 shows a schematic flowchart of a model training method provided by an embodiment of the present application.
  • the model training method includes:
  • the training sample includes a plurality of sample pairs, and the sample pairs include test information and a processing unit frequency corresponding to the test information.
  • the trained inference model is used to determine the frequency of the processing unit of the first device according to the measurement information related to the first device.
  • test system includes:
  • the second device 601 for testing has the same hardware structure and performance as the first device, and is used to run the test program to apply the corresponding test load;
  • the second management device 602 for testing is connected to the second device, and is used to obtain the frequency of the processing unit of the second device under the test load and the measurement information related to the second device, and to use the processing unit frequency and the measurement information as a sample pair in the training samples used to train the inference model to be trained.
  • the second management device for testing has the same hardware structure and functions as the first management device. In addition to this, it also has a function that the first management device does not have, that is, the frequency of the processing unit of the second device is acquired. In specific implementation, the frequency of the processing unit of the second device can be obtained through an in-band sensor.
  • the second management device 602 is further configured to set a first power consumption cap value for the second device, and, when the test load is increased until the power consumption of the second device reaches the first power consumption cap value, to send a power consumption capping control instruction to the second device;
  • the second device 601 is also configured to perform a power consumption capping operation according to the control instruction
  • the second management device 602 is further configured to obtain the processing unit frequency of the second device and the measurement information related to the second device when the power consumption capping control is activated by the second device;
  • the processing unit frequency and the measurement information are used as a sample pair in the training samples used to train the inference model to be trained.
  • test system may further include:
  • the model training device is used to obtain training samples, where the training samples include multiple sample pairs and the sample pairs include test information and processing unit frequencies corresponding to the test information, and to train the inference model to be trained based on the multiple sample pairs, so as to provide the trained inference model to the first management device.
  • the second management device provided in this embodiment may implement other functions in addition to the above functions.
  • the first part is to generate training samples.
  • the second part is to train the inference model.
  • the third part is the real-time processing unit frequency inference when using the inference model to control the power capping of the server.
  • test server with the same hardware structure and performance as the server in the actual application scenario and a test out-of-band management device with the same hardware structure and function as the out-of-band management device in the actual application scenario.
  • the processing unit is the CPU.
  • the set threshold is equal to the rated power consumption of the test server or 120% of the rated power consumption.
  • the Benchmark can continue for a period of time to allow enough time to record multiple pairs of samples when the power consumption cap takes effect.
  • the data processing process can refer to the corresponding content in the prior art, which is not specifically limited in this embodiment.
  • a machine learning model or any regression equation (ie, the above-mentioned inference model to be trained) is used to establish the relationship between the frequency of the processing unit and the measurement information.
  • the training samples processed by the above S21 are used to train the inference model to be trained.
  • the training process can refer to the related content of the prior art, which is not specifically limited here.
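  • since the text allows "any regression equation" as the inference model, the training and inference steps can be illustrated with ordinary least-squares linear regression. This is a stand-in chosen for brevity, not the model the applicants used; the function names, the two example features (processing unit power consumption and temperature), and the synthetic numbers are all assumptions.

```python
from typing import List, Sequence


def fit_linear_model(features: Sequence[Sequence[float]],
                     frequencies: Sequence[float]) -> List[float]:
    """Fit frequency ~ bias + w . features by solving the normal equations."""
    rows = [[1.0, *row] for row in features]          # prepend a bias column
    n = len(rows[0])
    # Augmented matrix [X^T X | X^T y].
    aug = [[sum(r[i] * r[j] for r in rows) for j in range(n)]
           + [sum(r[i] * y for r, y in zip(rows, frequencies))]
           for i in range(n)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("features are linearly dependent; cannot fit")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def infer_frequency(weights: Sequence[float], measurement: Sequence[float]) -> float:
    """Run the fitted model on one set of measurement information."""
    return weights[0] + sum(w * x for w, x in zip(weights[1:], measurement))


# Synthetic sample pairs: (power in W, temperature in deg C) -> frequency in MHz.
train_x = [(100.0, 50.0), (120.0, 55.0), (150.0, 70.0), (90.0, 40.0)]
train_y = [1000.0 + 10.0 * p - 5.0 * t for p, t in train_x]
model = fit_linear_model(train_x, train_y)
estimate = infer_frequency(model, (110.0, 60.0))
```

  • a neural network such as the CNN or LSTM mentioned earlier would replace `fit_linear_model` and `infer_frequency` while keeping the same interface: sample pairs in, an estimated processing unit frequency out.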
  • the real-time processing unit frequency is inferred when the inference model is used to control the power capping of the server.
  • the measurement information includes at least: processing unit power consumption, processing unit temperature value, processing unit utilization rate, temperature information of the main board, fan speed, and so on.
  • the inferred processing unit frequency can be used as a basis for adjusting the existing power consumption cap value. For example, if the inferred frequency of the processing unit is too high, the existing power consumption cap value can be lowered; if the inferred frequency of the processing unit is too low, the existing power consumption cap value can be adjusted higher.
  • FIG. 10 shows a schematic flowchart of a method for controlling power consumption of a device according to an embodiment of the present application.
  • the device power consumption control method includes:
  • step 703 “resetting the first power consumption cap value based on the frequency of the processing unit” may specifically include:
  • the first preset value is less than the second preset value.
  • the above-mentioned first preset value and second preset value may be preset values, which may be empirical values, or calculated through corresponding algorithms, or obtained through multiple experiments, and so on.
  • the increase and decrease of the first power consumption cap value can be determined based on a preset reset rule.
  • the reset rule is: each adjustment is increased or decreased by a fixed value or a fixed ratio.
  • in the method provided in the embodiment of the present application, the precondition for executing the foregoing step 701 may be: when the power consumption capping control for the first device is started, the action of acquiring the measurement information related to the first device is triggered.
  • step 702 determine the frequency of the processing unit of the first device according to the measurement information.
  • the measurement information is used as the input of the inference model, and the inference model is executed to obtain the processing unit frequency.
  • the measurement information may include but is not limited to at least one of the following: processing unit power consumption, processing unit temperature value, and processing unit utilization rate.
  • the measurement information related to the first device is used to determine the frequency of the processing unit of the first device, and then, based on the determined frequency of the processing unit, the first power consumption cap value referenced during power capping control is reset. Because the processing unit frequency is highly accurate, the first power consumption cap value reset with it is closer to the actual situation of the first device; the technical solution provided by the embodiment of this application therefore helps to improve the power management capability of the device.
  • FIG. 11 shows a schematic structural diagram of a data processing device provided by an embodiment of the present application.
  • the data processing device includes: an acquisition module 11 and a determination module 12.
  • the obtaining module 11 is used to obtain measurement information related to the first device;
  • the determining module 12 is used to determine the frequency of the processing unit of the first device according to the measurement information.
  • the acquisition module 11 is also used for:
  • the data processing device provided in this embodiment may further include a reset module.
  • the reset module is configured to reset the first power consumption cap value when the frequency of the processing unit does not meet the first preset condition.
  • reset module is also used for:
  • the first preset value is smaller than the second preset value.
  • determining module 12 is also used for:
  • the measurement information is used as the input of the inference model, and the inference model is executed to obtain the processing unit frequency.
  • the acquisition module 11 is also used for:
  • training samples include: processing unit frequencies and measurement information corresponding to the processing unit frequency samples;
  • the inference model completes the training
  • the parameters in the inference model are optimized.
  • the measurement information includes at least one of the following: processing unit power consumption, processing unit temperature value, and processing unit utilization rate.
  • FIG. 12 shows a schematic structural diagram of a data processing device provided by another embodiment of the present application.
  • the data processing device includes: an acquisition module 21 and an inference module 22.
  • the acquisition module 21 is used to acquire measurement information related to the first device, and to acquire an inference model that has been trained with training samples, wherein the training samples include multiple sample pairs, and each sample pair includes measurement information and a processing unit frequency.
  • the inference module 22 is configured to use the measurement information as an input of an inference model, and execute the inference model to obtain the processing unit frequency of the first device.
  • the self-learning capability of model training is used to learn the association between the measurement information and the processing unit frequency; the trained inference model then takes the measurement information related to the first device as its input, and executing the inference model yields the processing unit frequency of the first device.
  • the processing unit frequency obtained by the technical solution provided in this embodiment has a higher accuracy rate.
  • the acquisition module 21 is also used for:
  • the acquisition module 21 is also used for:
  • the inference model completes the training
  • the parameters in the inference model are optimized; and the next training process is entered.
  • FIG. 13 shows a schematic structural diagram of a data acquisition device provided by an embodiment of the present application.
  • the data acquisition device includes: a loading module 31 and an acquisition module 32.
  • the loading module is used to increase the test load for the second device for testing.
  • the acquisition module is used to acquire the processing unit frequency of the second device under a test load and measurement information related to the second device, and to use the processing unit frequency and the measurement information as one sample pair in the training samples used to train the inference model to be trained.
  • a load is added to the second device used for testing in order to simulate the processing unit frequency of the second device, and the measurement information related to it, under different load conditions; using the processing unit frequency and the measurement information as training samples for the inference model allows the association between measurement information and processing unit frequency to be learned more accurately, thereby improving the accuracy with which the processing unit frequency is calculated.
  • the data acquisition device provided in this embodiment may further include a setting module.
  • the setting module is used to set a first power consumption cap value for the second device.
  • the acquisition module is also used for:
  • the processing unit frequency of the second device and the measurement information related to the second device are acquired.
  • the acquisition module 32 is also used for:
  • the test period is the period from when power consumption capping control is started until the test load of the second device has been raised to a preset maximum load; or the period from when power consumption capping control is started until the test load of the second device has been raised to the preset maximum load and held there for a set duration.
  • the data acquisition device provided in this embodiment may further include a reset module.
  • the reset module is used for:
  • reset module is also used for:
  • the first power consumption cap value is updated to the fourth power consumption cap value.
  • the data acquisition device provided in the foregoing embodiment can implement the technical solutions described in the foregoing method embodiments.
  • for the specific implementation principles of the foregoing modules or units, please refer to the corresponding content in the foregoing method embodiments, which is not repeated here.
  • FIG. 14 shows a schematic structural diagram of a model training device provided by an embodiment of the present application.
  • the model training device includes: an acquisition module 41 and a training module 42.
  • the acquisition module 41 is used to acquire training samples
  • the training samples include multiple sample pairs
  • the sample pairs include test information and processing unit frequencies corresponding to the test information.
  • the training module 42 is configured to train the inference model to be trained based on the multiple sample pairs; wherein the trained inference model is used to determine the processing unit frequency of the first device according to the measurement information related to the first device.
  • FIG. 15 shows a schematic structural diagram of an apparatus for controlling power consumption of a device according to an embodiment of the present application.
  • the device power consumption control device includes: an acquisition module 51, a determination module 52, a reset module 53 and an execution module 54.
  • the acquisition module 51 is used to obtain measurement information related to the first device;
  • the determination module 52 is used to determine the processing unit frequency of the first device according to the measurement information;
  • the reset module 53 is used to reset the first power consumption cap value based on the processing unit frequency;
  • the execution module 54 is configured to use the reset first power consumption cap value to execute power consumption capping control for the first device.
  • reset module 53 is also used for:
  • the first preset value is smaller than the second preset value.
  • determining module 52 is also used for:
  • the measurement information is used as the input of the inference model, and the inference model is executed to obtain the processing unit frequency.
  • the measurement information includes at least one of the following: processing unit power consumption, processing unit temperature value, and processing unit utilization rate.
  • FIG. 16 shows a schematic structural diagram of an electronic device provided by an embodiment of the present application.
  • the electronic device includes: a memory 61 and a processor 62, where:
  • the memory 61 is used to store programs
  • the processor 62 is coupled with the memory 61, and is configured to execute the program stored in the memory 61 for:
  • the frequency of the processing unit of the first device is determined.
  • the aforementioned memory 61 may be configured to store other various data to support operations on the electronic device. Examples of such data include instructions for any application or method operating on the electronic device.
  • the memory 61 can be implemented by any type of volatile or non-volatile storage device or a combination thereof, such as static random access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, or a magnetic or optical disk.
  • when the processor 62 executes the program in the memory 61, it may implement other functions in addition to those described above. For details, please refer to the description of the previous embodiments.
  • the electronic device further includes: a communication component 63, a display 64, a power supply component 65, an audio component 66 and other components. Only some components are schematically shown in FIG. 16, which does not mean that the electronic device only includes the components shown in FIG. 16.
  • the structure of the electronic device is similar to the above-mentioned electronic device embodiment, which can be referred to as shown in FIG. 16 above.
  • the electronic device includes a memory and a processor, among which,
  • the memory is used to store programs
  • the processor is coupled with the memory, and is configured to execute the program stored in the memory for:
  • the measurement information is used as an input of an inference model, and the inference model is executed to obtain the processing unit frequency of the first device.
  • an embodiment of the present application also provides a computer-readable storage medium storing a computer program, which can implement the steps or functions of the data processing method provided by the foregoing embodiments when the computer program is executed by a computer.
  • the structure of the electronic device is similar to the above-mentioned electronic device embodiment, which can be referred to as shown in FIG. 16 above.
  • the electronic device includes a memory and a processor, among which,
  • the memory is used to store programs
  • the processor is coupled with the memory, and is configured to execute the program stored in the memory for:
  • the processing unit frequency and the measurement information are used as a sample pair in the training samples used to train the inference model to be trained.
  • an embodiment of the present application also provides a computer-readable storage medium storing a computer program, which when executed by a computer can implement the steps or functions of the data acquisition method provided in the foregoing embodiments.
  • the structure of the electronic device is similar to the above-mentioned electronic device embodiment, which can be referred to as shown in FIG.
  • the electronic device includes a memory and a processor, among which,
  • the memory is used to store programs
  • the processor is coupled with the memory, and is configured to execute the program stored in the memory for:
  • the training samples including multiple sample pairs, the sample pairs including test information and processing unit frequencies corresponding to the test information;
  • the trained inference model is used to determine the processing unit frequency of the first device according to the measurement information related to the first device.
  • an embodiment of the present application also provides a computer-readable storage medium storing a computer program, which can implement the steps or functions of the model training method provided in the foregoing embodiments when the computer program is executed by a computer.
  • the structure of the electronic device is similar to the above-mentioned electronic device embodiment, which can be referred to as shown in FIG. 16 above.
  • the electronic device includes a memory and a processor, among which,
  • the memory is used to store programs
  • the processor is coupled with the memory, and is configured to execute the program stored in the memory for:
  • the power consumption capping control for the first device is executed by using the first power consumption capping value after resetting.
  • an embodiment of the present application also provides a computer-readable storage medium storing a computer program, which when executed by a computer can implement the steps or functions of the device power consumption control method provided by the foregoing embodiments.
  • the device embodiments described above are merely illustrative.
  • the units described as separate components may or may not be physically separate, and the components displayed as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units.
  • Some or all of the modules can be selected according to actual needs to achieve the objectives of the solutions of the embodiments. Those of ordinary skill in the art can understand and implement it without creative work.
  • each implementation manner can be implemented by means of software plus a necessary general hardware platform, and of course, it can also be implemented by hardware.
  • the above technical solution, in essence or in the part that contributes to the prior art, can be embodied in the form of a software product; the computer software product can be stored in a computer-readable storage medium, such as ROM/RAM, a magnetic disk, or an optical disc, and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, etc.) to execute the methods described in each embodiment or in some parts of the embodiments.

Abstract

Data processing, acquisition, model training and power consumption control methods, a system and a device. The data processing method comprises: acquiring measurement information related to a first device (201) (101); and according to the measurement information, determining a processing unit frequency of the first device (201) (102). Compared to the existing processing unit frequency read from a BMC, the processing unit frequency obtained by means of said solution has higher accuracy, thereby helping to improve the power consumption management capability of devices.

Description

Data processing, acquisition, model training and power consumption control methods, system and device
Technical field
This application relates to the field of computer technology, and in particular to methods, systems, and devices for data processing, data acquisition, model training, and power consumption control.
Background
Accurate power management is key to improving system reliability and reducing cost. When a server is under low load, there should be a mechanism to reduce unnecessary power consumption. When power consumption exceeds a safe level, there should be a way to limit it. For power consumption management and control, it is vital to accurately perceive the state of the server's processing unit, such as the frequency of the CPU (Central Processing Unit).
In practical applications, however, the processing unit frequency cannot be detected accurately.
Summary of the invention
The embodiments of the present application provide data processing, data acquisition, model training, and power consumption control methods, systems, and devices to solve or mitigate the problems in the prior art.
In an embodiment of the present application, a data processing method is provided. The method includes:
acquiring measurement information related to a first device;
determining the processing unit frequency of the first device according to the measurement information.
In another embodiment of the present application, a data processing system is provided. The data processing system includes:
a first device, configured to generate measurement information during operation;
a first management apparatus, configured to acquire the measurement information related to the first device and to determine the processing unit frequency of the first device according to the measurement information.
In yet another embodiment of the present application, a data processing method is provided. The method includes:
acquiring measurement information related to a first device;
acquiring an inference model that has been trained with training samples, where the training samples include multiple sample pairs and each sample pair contains measurement information and a processing unit frequency;
using the measurement information as the input of the inference model, and executing the inference model to obtain the processing unit frequency of the first device.
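The inference step of this method can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that the trained inference model is linear in three measurements; the application does not state the model type. The names `Measurement`, `LinearInferenceModel`, and `infer_frequency_mhz`, as well as the parameter values, are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Measurement:
    """Measurement information for one device (field names are assumptions)."""
    power_watts: float    # processing unit power consumption
    temperature_c: float  # processing unit temperature value
    utilization: float    # processing unit utilization rate, 0.0 to 1.0


@dataclass(frozen=True)
class LinearInferenceModel:
    """Hypothetical trained model: frequency = bias + sum(weight * feature)."""
    bias: float
    weights: tuple  # one weight per Measurement field, in field order

    def infer_frequency_mhz(self, m: Measurement) -> float:
        features = (m.power_watts, m.temperature_c, m.utilization)
        return self.bias + sum(w * x for w, x in zip(self.weights, features))


# Placeholder parameters; a real model would obtain them from training.
model = LinearInferenceModel(bias=1000.0, weights=(10.0, -2.0, 500.0))
sample = Measurement(power_watts=100.0, temperature_c=50.0, utilization=0.5)
frequency_mhz = model.infer_frequency_mhz(sample)
```

Keeping the model behind a single `infer_frequency_mhz` call means a different model family could be substituted without changing the caller.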
In yet another embodiment of the present application, a data acquisition method is provided. The method includes:
increasing the test load on a second device used for testing;
acquiring the processing unit frequency of the second device under the test load, and measurement information related to the second device;
using the processing unit frequency and the measurement information as one sample pair in the training samples used to train the inference model to be trained.
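As a rough illustration of the acquisition loop, the sketch below raises the load on a simulated test device in equal steps and records one sample pair per step. `SimulatedTestDevice` and its formulas are invented stand-ins for real hardware, and `collect_sample_pairs` is a hypothetical helper; none of these names come from the application.

```python
from typing import List, Tuple

# Hypothetical layout: (power in W, temperature in C, utilization 0..1)
Measurement = Tuple[float, float, float]
SamplePair = Tuple[Measurement, float]  # (measurement information, frequency in MHz)


class SimulatedTestDevice:
    """Stand-in for the second (test) device; every formula here is made up."""

    def __init__(self) -> None:
        self.load = 0.0

    def set_load(self, load: float) -> None:
        # Clamp the requested load to the range 0..1.
        self.load = max(0.0, min(1.0, load))

    def read_measurement(self) -> Measurement:
        return (50.0 + 150.0 * self.load, 35.0 + 40.0 * self.load, self.load)

    def read_frequency_mhz(self) -> float:
        # Ground-truth frequency; how it is read on real hardware is outside this sketch.
        return 1200.0 + 1800.0 * self.load


def collect_sample_pairs(device: SimulatedTestDevice, steps: int) -> List[SamplePair]:
    """Raise the test load step by step and record one sample pair per step."""
    pairs = []
    for i in range(steps + 1):
        device.set_load(i / steps)
        pairs.append((device.read_measurement(), device.read_frequency_mhz()))
    return pairs


pairs = collect_sample_pairs(SimulatedTestDevice(), steps=4)
```

Each element of `pairs` couples the measurement information with the frequency observed at the same load level, which is the shape the training step expects.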
In yet another embodiment of the present application, a model training method is provided. The model training method includes:
acquiring training samples, where the training samples contain multiple sample pairs and each sample pair includes test information and the processing unit frequency corresponding to that test information;
training the inference model to be trained based on the multiple sample pairs;
where the trained inference model is used to determine the processing unit frequency of the first device according to measurement information related to the first device.
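One simple way to train such a model is ordinary least squares over the sample pairs, sketched below in plain Python. The application does not specify a training algorithm, so the linear form, the helpers `solve_linear_system`, `train_linear_model`, and `predict`, and the data are all assumptions. The sample pairs are synthetic, generated from the rule frequency = 1000 + 5·power − 3·temperature + 800·utilization so that the fit can be checked.

```python
from typing import List, Sequence, Tuple


def solve_linear_system(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular system: sample pairs are not informative enough")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def train_linear_model(pairs: Sequence[Tuple[Sequence[float], float]]) -> List[float]:
    """Fit [bias, w1, ..., wk] by least squares using the normal equations."""
    rows = [[1.0, *features] for features, _ in pairs]
    targets = [freq for _, freq in pairs]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def predict(params: Sequence[float], features: Sequence[float]) -> float:
    """Apply the fitted parameters to one measurement."""
    return params[0] + sum(w * x for w, x in zip(params[1:], features))


# Synthetic sample pairs: (power W, temperature C, utilization) -> frequency MHz.
training_pairs = [
    ((60.0, 40.0, 0.1), 1260.0),
    ((90.0, 45.0, 0.4), 1635.0),
    ((120.0, 60.0, 0.5), 1820.0),
    ((150.0, 62.0, 0.9), 2284.0),
    ((200.0, 80.0, 1.0), 2560.0),
    ((110.0, 70.0, 0.2), 1500.0),
]
params = train_linear_model(training_pairs)
```

On real measurements the relationship is unlikely to be exactly linear, so a held-out set of sample pairs would be needed to judge whether a richer model is warranted.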
In yet another embodiment of the present application, a test system is provided. The test system includes:
a second device used for testing, which has the same hardware structure and performance as the first device and is configured to run a test program to apply a corresponding test load;
a second management apparatus used for testing, which is connected to the second device and is configured to acquire the processing unit frequency of the second device under the test load and measurement information related to the second device, and to use the processing unit frequency and the measurement information as one sample pair in the training samples used to train the inference model to be trained.
In yet another embodiment of the present application, a device power consumption control method is provided. The method includes:
acquiring measurement information related to a first device;
determining the processing unit frequency of the first device according to the measurement information;
resetting a first power consumption cap value based on the processing unit frequency;
executing power consumption capping control for the first device using the reset first power consumption cap value.
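The four steps of this method can be sketched as a single control cycle. The reset policy below is an assumption made only to give the example something concrete to do: the cap is relaxed when the inferred frequency falls below a lower threshold and tightened when it exceeds an upper threshold. The application's actual reset conditions are not reproduced here, and every name, threshold, and step size is hypothetical.

```python
def reset_cap_watts(cap_watts, frequency_mhz, low_mhz, high_mhz,
                    step_watts=10.0, min_cap_watts=100.0, max_cap_watts=400.0):
    """Return a new cap value given the inferred frequency.

    Assumed policy (not taken from the application): below `low_mhz` the cap
    is relaxed, above `high_mhz` it is tightened, in between it is kept.
    """
    if low_mhz >= high_mhz:
        raise ValueError("low_mhz must be smaller than high_mhz")
    if frequency_mhz < low_mhz:
        return min(cap_watts + step_watts, max_cap_watts)
    if frequency_mhz > high_mhz:
        return max(cap_watts - step_watts, min_cap_watts)
    return cap_watts


def capping_action(measured_power_watts, cap_watts):
    """Decide the capping action for one control cycle."""
    return "throttle" if measured_power_watts > cap_watts else "none"


def control_cycle(measurement, infer_frequency_mhz, cap_watts, low_mhz, high_mhz):
    """One pass: measure, infer the frequency, reset the cap, apply capping."""
    frequency = infer_frequency_mhz(measurement)
    new_cap = reset_cap_watts(cap_watts, frequency, low_mhz, high_mhz)
    return new_cap, capping_action(measurement["power_watts"], new_cap)


def toy_model(m):
    """Hypothetical inference function standing in for the trained model."""
    return 1000.0 + 10.0 * m["power_watts"]


new_cap, action = control_cycle(
    {"power_watts": 250.0}, toy_model, cap_watts=240.0,
    low_mhz=1500.0, high_mhz=3000.0,
)
```

Separating `reset_cap_watts` from `capping_action` mirrors the split between resetting the cap value and executing capping control in the method.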
In yet another embodiment of the present application, an electronic device is provided. The electronic device includes a memory and a processor, where:
the memory is used to store a program;
the processor is coupled with the memory and is configured to execute the program stored in the memory in order to:
acquire measurement information related to a first device;
determine the processing unit frequency of the first device according to the measurement information.
In yet another embodiment of the present application, an electronic device is provided. The electronic device includes a memory and a processor, where:
the memory is used to store a program;
the processor is coupled with the memory and is configured to execute the program stored in the memory in order to:
acquire measurement information related to a first device;
acquire an inference model that has been trained with training samples, where the training samples include multiple sample pairs and each sample pair contains measurement information and a processing unit frequency;
use the measurement information as the input of the inference model, and execute the inference model to obtain the processing unit frequency of the first device.
In yet another embodiment of the present application, an electronic device is provided. The electronic device includes a memory and a processor, where:
the memory is used to store a program;
the processor is coupled with the memory and is configured to execute the program stored in the memory in order to:
increase the test load on a second device used for testing;
acquire the processing unit frequency of the second device under the test load, and measurement information related to the second device;
use the processing unit frequency and the measurement information as one sample pair in the training samples used to train the inference model to be trained.
In yet another embodiment of the present application, an electronic device is provided. The electronic device includes a memory and a processor, where:
the memory is used to store a program;
the processor is coupled with the memory and is configured to execute the program stored in the memory in order to:
acquire training samples, where the training samples contain multiple sample pairs and each sample pair includes test information and the processing unit frequency corresponding to that test information;
train the inference model to be trained based on the multiple sample pairs;
where the trained inference model is used to determine the processing unit frequency of the first device according to measurement information related to the first device.
In yet another embodiment of the present application, an electronic device is provided. The electronic device includes a memory and a processor, where:
the memory is used to store a program;
the processor is coupled with the memory and is configured to execute the program stored in the memory in order to:
acquire measurement information related to a first device;
determine the processing unit frequency of the first device according to the measurement information;
reset a first power consumption cap value based on the processing unit frequency;
execute power consumption capping control for the first device using the reset first power consumption cap value.
Through extensive creative work, the inventors of the technical solutions provided by the embodiments of the present application found that the processing unit frequency of a device is correlated with measurement information, where measurement information is measurable information related to the device. It should be added that the processing unit frequency mentioned here is the clock frequency of the processing unit, that is, the operating frequency at which the processing unit performs computation. Accordingly, in the technical solution provided by one embodiment of the present application, measurement information related to a first device is acquired, and the processing unit frequency of the first device is determined based on the measurement information. Practice has shown that the processing unit frequency obtained with this technical solution is more accurate than the processing unit frequency read from a BMC (baseboard management controller) in existing approaches, which in turn helps improve the power consumption management capability of the device.
In the technical solution provided by another embodiment of the present application, the self-learning capability of model training is used to learn the association between measurement information and processing unit frequency; the trained inference model then takes the measurement information related to the first device as its input, and executing the inference model yields the processing unit frequency of the first device. Compared with the processing unit frequency read from the BMC in existing approaches, the processing unit frequency obtained by this technical solution is more accurate.
In the technical solution provided by yet another embodiment of the present application, a load is added to the second device used for testing in order to simulate the processing unit frequency of the second device, and the measurement information related to it, under different load conditions. Using the processing unit frequency and the measurement information as training samples for the inference model allows the association between measurement information and processing unit frequency to be learned more accurately, thereby improving the accuracy with which the processing unit frequency is calculated.
In the technical solution provided by yet another embodiment of the present application, the measurement information related to the first device is used to determine the processing unit frequency of the first device, and then, based on the determined processing unit frequency, the first power consumption cap value referenced during power consumption capping control is reset. Because the processing unit frequency is highly accurate, a first power consumption cap value reset with it is closer to the actual situation of the first device. The technical solution provided by the embodiments of the present application therefore helps improve the power consumption management capability of the device.
Description of the drawings
The drawings described here are provided for a further understanding of the present application and constitute a part of it. The exemplary embodiments of the present application and their descriptions are used to explain the application and do not constitute an improper limitation of it. In the drawings:
FIG. 1 is a schematic flowchart of a data processing method provided by an embodiment of the present application;
FIG. 2 is a schematic structural diagram of a data processing system provided by an embodiment of the present application;
FIG. 3 is a schematic flowchart of a data processing method provided by another embodiment of the present application;
FIG. 4 is a schematic flowchart of a data acquisition method provided by an embodiment of the present application;
FIG. 5 is a schematic flowchart of a model training method provided by an embodiment of the present application;
FIG. 6 is a schematic structural diagram of a data processing system provided by another embodiment of the present application;
FIG. 7 is a schematic diagram of a training sample generation process provided by an embodiment of the present application;
FIG. 8 is a schematic diagram of an inference model training process provided by an embodiment of the present application;
FIG. 9 is a schematic diagram of a real-time processing unit frequency inference process when an inference model is used to perform power capping control on a server, provided by an embodiment of the present application;
FIG. 10 is a schematic flowchart of a device power consumption control method provided by an embodiment of the present application;
FIG. 11 is a schematic structural diagram of a data processing device provided by an embodiment of the present application;
FIG. 12 is a schematic structural diagram of a data processing device provided by another embodiment of the present application;
FIG. 13 is a schematic structural diagram of a data acquisition device provided by an embodiment of the present application;
FIG. 14 is a schematic structural diagram of a model training device provided by an embodiment of the present application;
FIG. 15 is a schematic structural diagram of a device power consumption control apparatus provided by an embodiment of the present application;
FIG. 16 is a schematic structural diagram of an electronic device provided by an embodiment of the present application.
具体实施方式Detailed ways
现有技术中的功耗管理包括三个主要部分:设置封顶值、监视运行功耗、执行功耗封顶动作。即先根据机柜配电要求、服务器正常运行的实际功耗、业务压力需求等设置各台机架服务器的功耗封顶值,然后将该封顶值写入带外管理装置作为服务器运行的上限功耗。在服务器运行过程中,带外管理装置监测整机功耗,如果发现功耗超过封顶值,则执行封顶动作。 其中,带外管理装置对服务器状态的访问是有限的,比如带外管理装置就无法准确而直接的获得处理单元频率。而在功耗管理时,准确地获得处理单元频率是至关重要的。The power consumption management in the prior art includes three main parts: setting a capping value, monitoring operating power consumption, and performing power capping actions. That is, first set the power consumption cap value of each rack server according to the power distribution requirements of the cabinet, the actual power consumption of the normal operation of the server, business pressure requirements, etc., and then write the cap value into the out-of-band management device as the upper limit power consumption of the server operation . During the operation of the server, the out-of-band management device monitors the power consumption of the whole machine, and if it finds that the power consumption exceeds the capping value, it performs the capping action. Among them, the out-of-band management device has limited access to the server status. For example, the out-of-band management device cannot accurately and directly obtain the frequency of the processing unit. In power management, it is crucial to accurately obtain the processing unit frequency.
Currently, one method of obtaining the processing unit frequency is to read it from a BMC (baseboard management controller). The BMC estimates the frequency from the instruction count within a sampling period. However, extensive practice has shown that this estimate is quite rough: it is reliable only when all processing unit cores are fully utilized. When the processing unit is not running at full load, the error can be very large.
To this end, the embodiments of this application provide a solution that obtains the processing unit frequency with higher accuracy, so that power consumption management can be performed more accurately.
To enable those skilled in the art to better understand the solutions of this application, the technical solutions in the embodiments of this application are described clearly and completely below with reference to the accompanying drawings.
Some of the processes described in the specification, the claims, and the above drawings of this application contain multiple operations that appear in a specific order; these operations may be performed out of the order in which they appear herein, or in parallel. Operation numbers such as 101 and 102 are used only to distinguish the operations from one another and do not themselves represent any execution order. In addition, these processes may include more or fewer operations, and the operations may be performed sequentially or in parallel. It should be noted that descriptions such as "first" and "second" herein are used to distinguish different messages, devices, modules, and so on; they do not represent a sequence, nor do they require that the "first" and the "second" be of different types. Furthermore, the embodiments described below are only some of the embodiments of this application, not all of them. All other embodiments obtained by those skilled in the art based on the embodiments of this application without creative work fall within the protection scope of this application.
FIG. 1 shows a schematic flowchart of a data processing method provided by an embodiment of this application. As shown in FIG. 1, the data processing method includes:
101. Acquire measurement information related to a first device.
102. Determine the processing unit frequency of the first device according to the measurement information.
In step 101 above, the measurement information related to the first device can be understood as all information that can be measured while the first device is working, including but not limited to: processing unit power consumption, processing unit temperature, processing unit utilization, motherboard temperature information, fan speed, and so on.
In a specific implementation, step 101, "acquire measurement information related to the first device", may include:
1011. When the power consumption of the first device exceeds a first power consumption cap value (power capping threshold value), start power consumption capping control.
Here, power consumption capping control can be simply understood as monitoring the power consumption of the first device and keeping it from exceeding the first power consumption cap value.
1012. With power consumption capping control started on the first device, acquire measurement information related to the first device.
In step 102 above, the processing unit frequency is the clock frequency of the processing unit, that is, its operating frequency during computation (the number of synchronization pulses generated in one second); it determines the operating speed of the computer.
In one feasible technical solution, step 102, "determine the processing unit frequency of the first device according to the measurement information", may include:
1021. Obtain an inference model that has been trained using training samples, where the training samples include multiple sample pairs, each containing measurement information and a processing unit frequency.
1022. Use the measurement information as the input of the inference model, and execute the inference model to obtain the processing unit frequency.
Step 1021, "obtain an inference model", may specifically include:
10211. Obtain training samples, where the training samples include processing unit frequencies and the measurement information corresponding to each processing unit frequency;
10212. Use the measurement information as the input of the inference model to be trained, and execute the inference model to obtain an output result;
10213. If it is determined, based on the output result and the processing unit frequency, that the convergence condition is satisfied, the inference model has completed training;
10214. If it is determined, based on the output result and the processing unit frequency, that the convergence condition is not satisfied, optimize the parameters of the inference model.
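Steps 10211 to 10214, followed by the inference of step 1022, can be sketched roughly as follows. This is only an illustration under stated assumptions: the text does not fix the model type, so a linear model trained by gradient descent is swapped in, the convergence condition is assumed to be a mean-squared-error tolerance, and the names `train_inference_model`, `infer_frequency`, `lr`, and `tol` are hypothetical.

```python
def train_inference_model(samples, lr=0.01, tol=1e-6, max_epochs=10_000):
    """samples: list of (features, frequency) pairs; returns (weights, bias).

    Assumption: a linear model stands in for the inference model.
    """
    n_features = len(samples[0][0])
    weights, bias = [0.0] * n_features, 0.0
    for _ in range(max_epochs):
        # Step 10212: run the model on the measurement information.
        outputs = [sum(w * x for w, x in zip(weights, feats)) + bias
                   for feats, _ in samples]
        errors = [out - target for out, (_, target) in zip(outputs, samples)]
        loss = sum(e * e for e in errors) / len(samples)
        # Step 10213: training is complete once the convergence condition holds.
        if loss < tol:
            break
        # Step 10214: otherwise optimize the parameters (gradient descent here).
        for j in range(n_features):
            grad = 2 * sum(e * feats[j]
                           for e, (feats, _) in zip(errors, samples)) / len(samples)
            weights[j] -= lr * grad
        bias -= lr * 2 * sum(errors) / len(samples)
    return weights, bias


def infer_frequency(model, features):
    """Step 1022: feed measurement information to the trained model."""
    weights, bias = model
    return sum(w * x for w, x in zip(weights, features)) + bias


# Made-up sample pairs: one normalized measurement and a frequency in GHz.
samples = [((0.0,), 1.0), ((0.5,), 2.0), ((1.0,), 3.0)]
model = train_inference_model(samples, lr=0.1)
```

The features are normalized to the range 0 to 1 in this example because plain gradient descent is sensitive to feature scale; a real implementation might use a different model and optimizer entirely.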
Through extensive creative work, the inventors of the technical solutions provided by the embodiments of this application found that the processing unit frequency of a device is correlated with measurement information, where measurement information is measurable information related to the device. Therefore, in the technical solution provided by an embodiment of this application, measurement information related to the first device is acquired, and the processing unit frequency of the first device is determined based on that measurement information. Practice has shown that the processing unit frequency obtained with the technical solution of this embodiment is more accurate than the processing unit frequency read from a BMC (baseboard management controller), which in turn helps improve the power consumption management capability of the device.
Further, the method provided in this embodiment may also include the following step:
103. If the processing unit frequency does not satisfy a first preset condition, reset the first power consumption cap value.
The first preset condition may be that the frequency lies within a set value range. Assume the value range runs from a first preset value to a second preset value, where the first preset value is smaller than the second preset value. Accordingly, step 103, "if the processing unit frequency does not satisfy the first preset condition, reset the first power consumption cap value", may specifically include:
If the processing unit frequency is lower than the first preset value, raising the first power consumption cap value to a second power consumption cap value;
If the processing unit frequency is higher than the second preset value, lowering the first power consumption cap value to a third power consumption cap value.
In a specific implementation, the first preset value and the second preset value may be empirical values, values obtained through repeated experiments, and so on; this embodiment does not specifically limit them. The amount by which the first power consumption cap value is raised or lowered may be determined based on a preset reset rule. For example, the reset rule may be that each adjustment raises or lowers the cap by a fixed value or a fixed ratio.
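The reset rule of step 103 can be sketched as a small function. The function name `reset_power_cap`, the parameter names, and the 10 W default step are illustrative assumptions; the text only says that the adjustment may be a fixed value or a fixed ratio.

```python
def reset_power_cap(cap_watts, freq_ghz, low_ghz, high_ghz, step_watts=10.0):
    """Return the cap after applying the fixed-step reset rule once.

    low_ghz and high_ghz play the role of the first and second preset values.
    """
    if freq_ghz < low_ghz:
        # Frequency below the range: raise the cap (second cap value).
        return cap_watts + step_watts
    if freq_ghz > high_ghz:
        # Frequency above the range: lower the cap (third cap value).
        return cap_watts - step_watts
    # First preset condition satisfied: leave the cap unchanged.
    return cap_watts


raised = reset_power_cap(400.0, freq_ghz=1.5, low_ghz=2.0, high_ghz=3.0)
lowered = reset_power_cap(400.0, freq_ghz=3.4, low_ghz=2.0, high_ghz=3.0)
unchanged = reset_power_cap(400.0, freq_ghz=2.5, low_ghz=2.0, high_ghz=3.0)
```

A fixed-ratio rule would replace the addition and subtraction with multiplication by factors slightly above and below 1.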
It should be noted here that the processing unit mentioned herein may be a general-purpose processor, such as a CPU; it may also be a dedicated processor or a heterogeneous computing unit, such as a DSP (Digital Signal Processor), an ASIC (application-specific integrated circuit), a GPU (Graphics Processing Unit), an FPGA (Field-Programmable Gate Array), a network card acceleration chip, and so on.
The method provided in the foregoing embodiment can be implemented based on the data processing system shown in FIG. 2. Specifically, as shown in FIG. 2, the data processing system includes:
A first device 201, configured to generate measurement information during operation;
A first management device 202, configured to acquire measurement information related to the first device and to determine the processing unit frequency of the first device according to the measurement information.
Further, the first management device 202 is also configured to send a power consumption capping control instruction to the first device when the power consumption of the first device exceeds the first power consumption cap value. The first device 201 is also configured to perform a power consumption capping operation according to the control instruction. The first management device 202 is also configured to acquire the measurement information related to the first device once power consumption capping control has been started on the first device, and to determine the processing unit frequency of the first device according to that measurement information.
In a specific application scenario, the first device is a server in a server cluster, and the first management device is an out-of-band management device. The server includes multiple pieces of hardware, which may include but are not limited to: a motherboard, a processing unit, power consumption and temperature sensors, and fans. The measurement information related to the first device that the out-of-band management device acquires may include: processing unit power consumption, processing unit temperature, processing unit utilization, motherboard temperature information, fan speed, and so on. Using the server's out-of-band management device to collect and monitor the status information of the first device helps ensure that the first device operates normally.
To reduce the production cost of the out-of-band management device, a first interface for connecting a BMC may be provided in the first device. The BMC can then perform the collection of some or all of the measurement information. The first interface may include, but is not limited to, a USB interface and a PCI (Peripheral Component Interconnect) slot.
The out-of-band management implemented by the out-of-band management device may include: performing out-of-band management of device power consumption according to the out-of-band information sent by the server (that is, all measurable measurement information related to the first device), and/or performing out-of-band management of hardware shared by at least one server, such as fans.
It should be noted here that, in addition to the functions above, the out-of-band management device provided in this embodiment can also implement other functions; for details, refer to the description in the method embodiment above.
FIG. 3 shows a schematic flowchart of a data processing method provided by another embodiment of this application. As shown in FIG. 3, the data processing method includes:
301. Acquire measurement information related to a first device.
302. Obtain an inference model that has been trained using training samples, where the training samples include multiple sample pairs, each containing measurement information and a processing unit frequency.
303. Use the measurement information as the input of the inference model, and execute the inference model to obtain the processing unit frequency of the first device.
In a specific implementation, step 301, "acquire measurement information related to the first device", may include:
3011. When the power consumption of the first device exceeds the first power consumption cap value, start power consumption capping control;
3012. With power consumption capping control started on the first device, acquire measurement information related to the first device.
The measurement information related to the first device can be understood as all information that can be measured while the first device is working, including but not limited to: processing unit power consumption, processing unit temperature, processing unit utilization, motherboard temperature information, fan speed, and so on.
In one feasible technical solution, step 302, "obtain an inference model that has been trained using training samples", may include:
3021. Use the measurement information in a sample pair as the input of the inference model to be trained, and execute the inference model to obtain an output result;
3022. If it is determined, based on the output result and the processing unit frequency in the sample pair, that a preset end condition is met, the inference model has completed training;
3023. If it is determined, based on the output result and the processing unit frequency in the sample pair, that the preset end condition is not met, optimize the parameters of the inference model and enter the next training iteration.
It should be noted here that the inference model to be trained in this embodiment may be a neural network model from the prior art, such as a convolutional neural network (CNN) or a long short-term memory network (LSTM); this embodiment does not specifically limit it. For the model training process, reference may likewise be made to related content in the prior art.
FIG. 4 shows a schematic flowchart of a data acquisition method provided by an embodiment of this application. The second device in this embodiment is a test device: it has the same type of processing unit and other hardware as the actual server (that is, the server used in the field whose power consumption needs to be managed), and it has an in-band sensor that can read the processing unit frequency accurately. Specifically, as shown in FIG. 4, the method includes:
401. Increase the test load on the second device used for testing.
402. Acquire the processing unit frequency of the second device under the test load, together with measurement information related to the second device.
403. Use the processing unit frequency and the measurement information as one sample pair in the training samples used to train the inference model to be trained.
In step 401 above, a test program, such as a Benchmark tool, can be loaded and run on the second device. For details of Benchmark tools, refer to the prior art; they are not repeated here.
In step 402 above, the processing unit frequency can be obtained by the in-band sensor. The measurement information related to the second device is understood in the same way as that of the first device: all information that can be measured while the device is working, including but not limited to processing unit power consumption, processing unit temperature, processing unit utilization, motherboard temperature information, fan speed, and so on. It is information that the out-of-band management device is able to obtain.
Further, the method provided in this embodiment may also include the following step:
404. Set a first power consumption cap value for the second device.
Accordingly, step 402, "acquire the processing unit frequency of the second device under the test load, together with measurement information related to the second device", includes:
4021. When the test load has increased to the point where the power consumption of the second device reaches the first power consumption cap value, start power consumption capping control;
4022. With power consumption capping control started on the second device, acquire the processing unit frequency of the second device and the measurement information related to the second device.
In step 4022 above, "acquire the processing unit frequency of the second device and the measurement information related to the second device" may specifically be:
After power consumption capping control is started, continue increasing the test load on the second device until a preset maximum load is reached;
Record the processing unit frequency and the measurement information of the second device during the test period;
Here, the test period is the time from the start of power consumption capping control until the test load of the second device reaches the preset maximum load, or the time from the start of power consumption capping control until the test load of the second device has reached the preset maximum load and been held there for a set duration.
Specifically, the set duration may be an empirical value; this embodiment does not specifically limit it.
Further, the method provided in this embodiment may also include the following step:
405. After the test load on the second device has been increased to the preset maximum load, reset the first power consumption cap value and enter the next round of testing, until the reset first power consumption cap value is greater than or equal to a threshold; or
405'. After the load on the second device has been increased to the preset maximum load and held for the set duration, reset the first power consumption cap value and enter the next round of testing, until the reset first power consumption cap value is greater than or equal to the threshold.
Here, the next round of testing means returning to step 401 above and repeating the process with the reset first power consumption cap value to obtain new sample pairs. The threshold may be equal to or greater than the rated power consumption of the second device. Each reset may increase the cap by a certain step, which may be fixed or may vary as appropriate. In a typical example, the rated power consumption of the second device (such as a server) is 500 W: in round 1 the first power consumption cap value is set to 350 W; in round 2 it is reset to 360 W; and so on, until in round N it is reset to 520 W. The final threshold may be slightly higher than the rated power consumption of 500 W, because the rated power consumption does not represent the maximum power consumption of the machine in actual operation.
That is, "reset the first power consumption cap value" in steps 405 and 405' above may specifically be:
Adding a set value to the first power consumption cap value to obtain a fourth power consumption cap value;
Updating the first power consumption cap value to the fourth power consumption cap value.
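The per-round cap values from the 500 W example can be generated with a short sketch. The function name `cap_schedule` is hypothetical, a fixed step is assumed, and the sketch assumes that the round whose cap first reaches the threshold is still tested, as the 520 W round in the example suggests.

```python
def cap_schedule(initial_watts, threshold_watts, step_watts):
    """List the first power consumption cap value used in each test round."""
    caps = []
    cap = initial_watts
    while True:
        caps.append(cap)
        # Stop once the (reset) cap is greater than or equal to the threshold.
        if cap >= threshold_watts:
            break
        # Reset rule: add the set value to obtain the next cap.
        cap += step_watts
    return caps


# Values from the example: start at 350 W, 10 W steps, threshold 520 W.
caps = cap_schedule(350, 520, 10)
```

With these values the schedule contains 18 rounds, from 350 W up to 520 W.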
FIG. 5 shows a schematic flowchart of a model training method provided by an embodiment of this application. As shown in FIG. 5, the model training method includes:
501. Obtain training samples, where the training samples contain multiple sample pairs, each including test information (the measurement information collected under test) and the processing unit frequency corresponding to that test information.
502. Train the inference model to be trained based on the multiple sample pairs.
The trained inference model is used to determine the processing unit frequency of the first device according to the measurement information related to the first device.
The data acquisition method provided by the foregoing embodiment can be implemented based on the test system shown in FIG. 6. Specifically, as shown in FIG. 6, the test system includes:
A second device 601 for testing, which has the same hardware structure and performance as the first device and is used to run a test program that applies the corresponding test load;
A second management device 602 for testing, connected to the second device and configured to acquire the processing unit frequency of the second device under the test load together with the measurement information related to the second device, and to use the processing unit frequency and the measurement information as one sample pair in the training samples used to train the inference model to be trained.
The second management device for testing has the same hardware structure and functions as the first management device; in addition, it has a function that the first management device lacks, namely acquiring the processing unit frequency of the second device. In a specific implementation, the processing unit frequency of the second device can be obtained through an in-band sensor.
Further, the second management device 602 is also configured to set a first power consumption cap value for the second device and, when the test load has increased to the point where the power consumption of the second device reaches the first power consumption cap value, to send a power consumption capping control instruction to the second device;
The second device 601 is also configured to perform a power consumption capping operation according to the control instruction;
The second management device 602 is also configured to acquire, once power consumption capping control has been started on the second device, the processing unit frequency of the second device and the measurement information related to the second device, and to use the processing unit frequency and the measurement information as one sample pair in the training samples used to train the inference model to be trained.
Further, the test system may also include:
A model training device, configured to obtain training samples, where the training samples contain multiple sample pairs, each including test information and the processing unit frequency corresponding to that test information, and to train the inference model to be trained based on the multiple sample pairs so as to provide the first management device with a trained inference model.
It should be noted here that, in addition to the functions above, the second management device provided in this embodiment can also implement other functions; for details, refer to the description in the method embodiment above.
The three major parts involved in the embodiments of this application are described below with reference to FIG. 7, FIG. 8, and FIG. 9.
Part 1: generating training samples.
Part 2: training the inference model.
Part 3: real-time inference of the processing unit frequency, using the inference model, while power capping control is applied to the server.
As shown in FIG. 7, training samples are generated as follows.
Build a test system: select a test server with the same hardware structure and performance as the server in the actual application scenario, and a test out-of-band management device with the same hardware structure and functions as the out-of-band management device in the actual application scenario. A Benchmark tool is used to increase the load on the test server. The processing unit is a CPU.
Specifically, as shown in FIG. 7, the procedure includes:
S11. Set a first power consumption cap value for the test server.
S12. Run the Benchmark test program to gradually increase the load on the test server.
S13. When the power consumption of the test server reaches the first power consumption cap value, enable power consumption capping control.
S14. By sampling, record all measurement information related to the test server after power consumption capping control has started, together with the CPU frequency of the test server.
S15. After the load on the test server has reached the maximum load of the Benchmark, reset the first power consumption cap value and return to S11, until the reset first power consumption cap value is greater than or equal to a set threshold.
Here, the set threshold equals the rated power consumption of the test server or 120% of it. In addition, after the Benchmark has loaded to the maximum load, that load can be held for a period of time so that there is enough time to record multiple sample pairs while power capping is in effect.
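The loop S11 to S15 can be sketched as below. This is a simplified illustration, not the test system itself: `read_power`, `read_measurements`, and `read_frequency` are hypothetical callables standing in for the out-of-band readings and the in-band frequency sensor, and the load levels and cap values are made-up numbers.

```python
def collect_samples(caps, loads, read_power, read_measurements, read_frequency):
    """Return a list of (measurement information, CPU frequency) sample pairs."""
    samples = []
    for cap in caps:                       # S11 / S15: one round per cap value
        capping = False
        for load in loads:                 # S12: gradually increase the load
            if not capping and read_power(load) >= cap:
                capping = True             # S13: power reached the cap
            if capping:                    # S14: record only while capping is on
                samples.append((read_measurements(load, cap),
                                read_frequency(load, cap)))
    return samples


# Toy stand-ins for the test server (assumptions, not real sensor models).
loads = [0, 25, 50, 75, 100]                       # percent of Benchmark load
read_power = lambda load: 300 + 2 * load           # uncapped power in watts
read_measurements = lambda load, cap: {"cpu_power_w": min(read_power(load), cap),
                                       "cpu_util": load / 100}
read_frequency = lambda load, cap: 3.0 * min(1.0, cap / read_power(load))

samples = collect_samples([400, 450], loads, read_power,
                          read_measurements, read_frequency)
```

In this toy setup the 400 W round records three sample pairs (loads 50, 75, and 100) and the 450 W round records two, so five pairs are collected in total.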
As shown in FIG. 8, the inference model is trained as follows.
S21. Data processing.
Clean the data collected by the method shown in FIG. 7: for example, remove incomplete samples and abnormal samples; merge several sample values into one feature, or derive features from the samples. For example, when a processing unit has multiple cores, the frequencies of the individual cores should be combined into a single value per processing unit.
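A minimal sketch of the cleaning in S21 is given below. The text does not say how per-core frequencies are combined, so the arithmetic mean is used here purely as an assumption; the function name `clean_samples` and the measurement keys are likewise hypothetical.

```python
def clean_samples(raw_samples, required_keys):
    """Drop incomplete samples and reduce per-core frequencies to one value."""
    cleaned = []
    for measurements, core_freqs in raw_samples:
        # Remove incomplete samples: a required measurement is missing.
        if any(measurements.get(key) is None for key in required_keys):
            continue
        # Remove samples that carry no frequency reading at all.
        if not core_freqs:
            continue
        features = [float(measurements[key]) for key in required_keys]
        # Assumption: combine the per-core frequencies by taking their mean.
        cleaned.append((features, sum(core_freqs) / len(core_freqs)))
    return cleaned


raw = [
    ({"power_w": 200.0, "temp_c": 60.0}, [2.0, 3.0]),   # complete sample
    ({"power_w": None, "temp_c": 55.0}, [2.5]),         # missing power reading
    ({"power_w": 180.0, "temp_c": 50.0}, []),           # missing frequency
]
cleaned = clean_samples(raw, ["power_w", "temp_c"])
```

Only the first raw sample survives, with its two core frequencies reduced to a single value of 2.5 GHz.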
It should be noted here that the data processing procedure can refer to the corresponding content in the prior art; this embodiment does not specifically limit it.
S22. Model training.
Use a machine learning model or any regression equation (that is, the inference model to be trained mentioned above) to establish the relationship between processing unit frequency and measurement information. Train the inference model to be trained with the training samples processed in S21. The training process can refer to related content in the prior art and is not specifically limited here.
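As one possible instance of "any regression equation", the sketch below fits a straight line by ordinary least squares between a single measurement (processing unit power) and the processing unit frequency. Using only one feature and the name `fit_line` are simplifying assumptions, and the data points are made up for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up sample pairs: processing unit power in watts, frequency in GHz.
power_w = [100.0, 200.0, 300.0]
freq_ghz = [1.5, 2.0, 2.5]
slope, intercept = fit_line(power_w, freq_ghz)
predicted = slope * 250.0 + intercept   # inferred frequency at 250 W
```

With several measurements per sample, the same idea generalizes to multiple linear regression or to the neural network models mentioned earlier.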
As shown in Figure 9, the inference model is used to infer the processing unit frequency in real time while power consumption capping control is applied to a server:
S31. When the power consumption of the server reaches the power consumption cap value, or a power consumption capping control instruction is received, start power consumption capping control.
S32. Obtain the measurement information of the server.
The measurement information includes at least: processing unit power, processing unit temperature, processing unit usage rate, mainboard temperature information, fan speed, and so on.
S33. Input the measurement information into the inference model, and execute the inference model to obtain the CPU frequency.
The inferred processing unit frequency can serve as the basis for deciding whether to adjust the existing power consumption cap value. For example, if the inferred processing unit frequency is too high, the existing power consumption cap value can be lowered; if the inferred processing unit frequency is too low, the existing power consumption cap value can be raised.
FIG. 10 shows a schematic flowchart of a device power consumption control method provided by an embodiment of the present application. As shown in FIG. 10, the device power consumption control method includes:
701. Acquire measurement information related to a first device;
702. Determine the processing unit frequency of the first device according to the measurement information;
703. Reset a first power consumption cap value based on the processing unit frequency;
704. Perform power consumption capping control on the first device using the reset first power consumption cap value.
For steps 701 and 702 above, refer to the corresponding content in the foregoing embodiments; details are not repeated here.
Step 703 above, "reset the first power consumption cap value based on the processing unit frequency", may specifically include:
when the processing unit frequency is lower than a first preset value, raising the first power consumption cap value to a second power consumption cap value;
when the processing unit frequency is higher than a second preset value, lowering the first power consumption cap value to a third power consumption cap value;
where the first preset value is less than the second preset value.
The first preset value and the second preset value may be set in advance; they may be empirical values, values calculated by a corresponding algorithm, values obtained through repeated experiments, and so on. The amount by which the first power consumption cap value is raised or lowered can be determined by a preset reset rule. For example, the reset rule may be that each adjustment raises or lowers the cap by a fixed value or by a fixed ratio.
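The reset rule of step 703 can be illustrated with a small function. This is a sketch under assumptions: the function name, the units, and the choice to express the two example rules (fixed value, fixed ratio) as alternative keyword arguments are not taken from the application.

```python
def reset_power_cap(freq_mhz, cap_w, low_mhz, high_mhz, step_w=None, ratio=None):
    """Return the reset cap for one inferred processing unit frequency.

    low_mhz and high_mhz stand for the first and second preset values.
    Exactly one of step_w (fixed value) or ratio (fixed ratio) must be given.
    """
    if low_mhz >= high_mhz:
        raise ValueError("the first preset value must be less than the second")
    if (step_w is None) == (ratio is None):
        raise ValueError("give exactly one of step_w or ratio")
    delta = step_w if step_w is not None else cap_w * ratio
    if freq_mhz < low_mhz:
        return cap_w + delta  # frequency too low: raise the cap
    if freq_mhz > high_mhz:
        return cap_w - delta  # frequency too high: lower the cap
    return cap_w              # within the band: leave the cap unchanged
```

Keeping a band between the two preset values means that small fluctuations in the inferred frequency do not change the cap on every evaluation.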
In addition, in the method provided in this embodiment of the present application, the precondition for executing step 701 above may be: when power consumption capping control for the first device is started, the action of acquiring the measurement information related to the first device is triggered.
Step 702 above, "determine the processing unit frequency of the first device according to the measurement information", may include:
acquiring an inference model that has been trained with training samples, where the training samples include multiple sample pairs, and each sample pair contains measurement information and a processing unit frequency;
using the measurement information as the input of the inference model, and executing the inference model to obtain the processing unit frequency.
Further, the measurement information may include, but is not limited to, at least one of the following: processing unit power consumption, processing unit temperature, and processing unit utilization rate.
In the technical solution provided in this embodiment, the measurement information related to the first device is used to determine the processing unit frequency of the first device, and the first power consumption cap value referenced during power consumption capping control is then reset based on the determined processing unit frequency. Because the processing unit frequency is highly accurate, a first power consumption cap value reset from it is closer to the actual condition of the first device. The technical solution provided in the embodiments of the present application therefore helps improve the power consumption management capability of the device.
FIG. 11 shows a schematic structural diagram of a data processing apparatus provided by an embodiment of the present application. As shown in FIG. 11, the data processing apparatus includes an acquisition module 11 and a determination module 12. The acquisition module 11 is configured to acquire measurement information related to a first device; the determination module 12 is configured to determine the processing unit frequency of the first device according to the measurement information.
Practice has shown that the processing unit frequency obtained with the technical solution provided in this embodiment is more accurate than the processing unit frequency read from the BMC (baseboard management controller) in existing approaches, which in turn helps improve the power consumption management capability of the device.
Further, the acquisition module 11 is further configured to:
start power consumption capping control when the power consumption of the first device exceeds the first power consumption cap value;
acquire the measurement information related to the first device when the first device has started power consumption capping control.
Further, the data processing apparatus provided in this embodiment also includes a reset module. The reset module is configured to reset the first power consumption cap value when the processing unit frequency does not meet a first preset condition.
Further, the reset module is further configured to:
raise the first power consumption cap value to a second power consumption cap value when the processing unit frequency is lower than a first preset value;
lower the first power consumption cap value to a third power consumption cap value when the processing unit frequency is higher than a second preset value;
where the first preset value is less than the second preset value.
Further, the determination module 12 is further configured to:
acquire an inference model that has been trained with training samples, where the training samples include multiple sample pairs, and each sample pair contains measurement information and a processing unit frequency;
use the measurement information as the input of the inference model, and execute the inference model to obtain the processing unit frequency.
Further, the acquisition module 11 is further configured to:
acquire training samples, where the training samples include processing unit frequencies and the measurement information corresponding to the processing unit frequency samples;
use the measurement information as the input of the inference model to be trained, and execute the inference model to obtain an output result;
when it is determined, based on the output result and the processing unit frequency, that a convergence condition is satisfied, the inference model completes training;
when it is determined, based on the output result and the processing unit frequency, that the convergence condition is not satisfied, optimize the parameters of the inference model.
Further, the measurement information includes at least one of the following: processing unit power consumption, processing unit temperature, and processing unit utilization rate.
It should be noted that the data processing apparatus provided in the above embodiment can implement the technical solutions described in the foregoing method embodiments. For the principles by which the above modules or units are implemented, refer to the corresponding content in the foregoing method embodiments; details are not repeated here.
FIG. 12 shows a schematic structural diagram of a data processing apparatus provided by another embodiment of the present application. As shown in FIG. 12, the data processing apparatus includes an acquisition module 21 and an inference module 22. The acquisition module 21 is configured to acquire measurement information related to a first device, and to acquire an inference model that has been trained with training samples, where the training samples include multiple sample pairs, and each sample pair contains measurement information and a processing unit frequency. The inference module 22 is configured to use the measurement information as the input of the inference model and execute the inference model to obtain the processing unit frequency of the first device.
In the technical solution provided in this embodiment, the self-learning capability of the model under training is used to learn the association between measurement information and processing unit frequency. The trained inference model then takes the measurement information related to the first device as its input, and executing it yields the processing unit frequency of the first device. The processing unit frequency obtained with the technical solution provided in this embodiment is more accurate than the processing unit frequency read from the BMC in existing approaches.
Further, the acquisition module 21 is further configured to:
start power consumption capping control when the power consumption of the first device exceeds the first power consumption cap value;
acquire the measurement information related to the first device when the first device has started power consumption capping control.
Further, the acquisition module 21 is further configured to:
use the measurement information in a sample pair as the input of the inference model to be trained, and execute the inference model to obtain an output result;
when it is determined, based on the output result and the processing unit frequency in the sample pair, that a preset end condition is met, the inference model completes training;
when it is determined, based on the output result and the processing unit frequency in the sample pair, that the preset end condition is not met, optimize the parameters of the inference model and enter the next training pass.
It should be noted that the data processing apparatus provided in the above embodiment can implement the technical solutions described in the foregoing method embodiments. For the principles by which the above modules or units are implemented, refer to the corresponding content in the foregoing method embodiments; details are not repeated here.
FIG. 13 shows a schematic structural diagram of a data acquisition apparatus provided by an embodiment of the present application. As shown in FIG. 13, the data acquisition apparatus includes a loading module 31 and an acquisition module 32. The loading module is configured to increase the test load on a second device used for testing. The acquisition module is configured to acquire the processing unit frequency of the second device under the test load and the measurement information related to the second device, and to use the processing unit frequency and the measurement information as one sample pair in the training samples used to train the inference model to be trained.
In the technical solution provided in this embodiment, the load on the second device used for testing is increased so as to obtain the processing unit frequency of the second device, and the measurement information related to it, under different load states. Using the processing unit frequency and the measurement information as training samples for the inference model allows the association between measurement information and processing unit frequency to be learned more accurately, which improves the accuracy with which the processing unit frequency is calculated.
Further, the data acquisition apparatus provided in this embodiment may also include a setting module. The setting module is configured to set a first power consumption cap value for the second device. Correspondingly, the acquisition module is further configured to:
start power consumption capping control when the test load has increased to the point where the power consumption of the second device reaches the first power consumption cap value;
acquire the processing unit frequency of the second device and the measurement information related to the second device when the second device has started power consumption capping control.
Further, the acquisition module 32 is further configured to:
after power consumption capping control is started, continue to increase the test load on the second device until a preset maximum load is reached;
record the processing unit frequency and the measurement information of the second device during the test period;
where the test period is the period from when power consumption capping control is started until the test load of the second device reaches the preset maximum load; or the period from when power consumption capping control is started until the test load of the second device has reached the preset maximum load and been held there for a set duration.
Further, the data acquisition apparatus provided in this embodiment may also include a reset module. The reset module is configured to:
after the test load on the second device has been increased to the preset maximum load, reset the first power consumption cap value and enter the next round of testing, until the reset first power consumption cap value is greater than or equal to a threshold; or
after the load on the second device has been increased to the preset maximum load and held for a set duration, reset the first power consumption cap value and enter the next round of testing, until the reset first power consumption cap value is greater than or equal to the threshold.
Further, the reset module is further configured to:
add a set value to the first power consumption cap value to obtain a fourth power consumption cap value;
update the first power consumption cap value to the fourth power consumption cap value.
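The round structure described above (run a test round under a cap, add a set value to the cap, stop once the reset cap reaches the threshold) can be sketched as a loop. The names below are hypothetical, and `run_round` is an injected stand-in for loading the second device and recording sample pairs. The sketch assumes that no further round is run once the reset cap is greater than or equal to the threshold, which is one reading of the text.

```python
def collect_training_pairs(first_cap_w, step_w, threshold_w, run_round):
    """Run test rounds under increasing power caps and gather sample pairs.

    run_round(cap_w) is a hypothetical callable that loads the second device
    up to the preset maximum load under the given cap and returns the
    (measurement information, processing unit frequency) pairs it recorded.
    """
    if step_w <= 0:
        raise ValueError("step_w must be positive")
    pairs = []
    cap_w = first_cap_w
    while True:
        pairs.extend(run_round(cap_w))
        # Fourth cap value = first cap value + set value; it becomes the new first cap.
        cap_w += step_w
        if cap_w >= threshold_w:
            break  # assumed: stop without testing at or above the threshold
    return pairs
```

With a first cap of 200 W, a set value of 50 W, and a threshold of 350 W, this runs rounds at 200 W, 250 W, and 300 W.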
It should be noted that the data acquisition apparatus provided in the above embodiment can implement the technical solutions described in the foregoing method embodiments. For the principles by which the above modules or units are implemented, refer to the corresponding content in the foregoing method embodiments; details are not repeated here.
FIG. 14 shows a schematic structural diagram of a model training apparatus provided by an embodiment of the present application. As shown in FIG. 14, the model training apparatus includes an acquisition module 41 and a training module 42. The acquisition module 41 is configured to acquire training samples, where the training samples contain multiple sample pairs, and each sample pair includes test information and the processing unit frequency corresponding to that test information. The training module 42 is configured to train the inference model to be trained based on the multiple samples, where the trained inference model is used to determine the processing unit frequency of a first device according to measurement information related to the first device.
FIG. 15 shows a schematic structural diagram of a device power consumption control apparatus provided by an embodiment of the present application. As shown in FIG. 15, the device power consumption control apparatus includes an acquisition module 51, a determination module 52, a reset module 53, and an execution module 54. The acquisition module 51 is configured to acquire measurement information related to a first device; the determination module 52 is configured to determine the processing unit frequency of the first device according to the measurement information; the reset module 53 is configured to reset a first power consumption cap value based on the processing unit frequency; the execution module 54 is configured to perform power consumption capping control on the first device using the reset first power consumption cap value.
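One way to picture how the four modules of FIG. 15 cooperate is a small class that wires them together. The class name, the callables passed to it, and the `step` method are illustrative assumptions; the application does not prescribe this structure.

```python
class PowerCapController:
    """Composes four hypothetical callables that stand in for modules 51 to 54."""

    def __init__(self, acquire, determine, reset, execute, cap_w):
        self.acquire = acquire      # module 51: () -> measurement information
        self.determine = determine  # module 52: measurement -> processing unit frequency
        self.reset = reset          # module 53: (frequency, cap) -> reset cap
        self.execute = execute      # module 54: applies the cap to the first device
        self.cap_w = cap_w

    def step(self):
        """Run steps 701 to 704 once and return (frequency, reset cap)."""
        measurement = self.acquire()
        freq = self.determine(measurement)
        self.cap_w = self.reset(freq, self.cap_w)
        self.execute(self.cap_w)
        return freq, self.cap_w
```

Passing the modules in as callables keeps the sketch independent of how measurements are actually read or how the cap is actually enforced on a given platform.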
Further, the reset module 53 is further configured to:
raise the first power consumption cap value to a second power consumption cap value when the processing unit frequency is lower than a first preset value;
lower the first power consumption cap value to a third power consumption cap value when the processing unit frequency is higher than a second preset value;
where the first preset value is less than the second preset value.
Further, the determination module 52 is further configured to:
acquire an inference model that has been trained with training samples, where the training samples include multiple sample pairs, and each sample pair contains measurement information and a processing unit frequency;
use the measurement information as the input of the inference model, and execute the inference model to obtain the processing unit frequency.
Further, the measurement information includes at least one of the following: processing unit power consumption, processing unit temperature, and processing unit utilization rate.
FIG. 16 shows a schematic structural diagram of an electronic device provided by an embodiment of the present application. As shown in FIG. 16, the electronic device includes a memory 61 and a processor 62, where:
the memory 61 is configured to store a program;
the processor 62 is coupled with the memory 61 and is configured to execute the program stored in the memory 61, so as to:
acquire measurement information related to a first device;
determine the processing unit frequency of the first device according to the measurement information.
The memory 61 may be configured to store various other data to support operations on the electronic device. Examples of such data include instructions for any application or method operating on the electronic device. The memory 61 can be implemented by any type of volatile or non-volatile storage device or a combination thereof, such as static random access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic disk, or optical disc.
When executing the program in the memory 61, the processor 62 may implement other functions in addition to those above. For details, refer to the description of the foregoing embodiments.
Further, as shown in FIG. 16, the electronic device also includes other components such as a communication component 63, a display 64, a power supply component 65, and an audio component 66. FIG. 16 shows only some components schematically, which does not mean that the electronic device includes only the components shown in FIG. 16.
Another embodiment of the present application provides an electronic device. Its structure is similar to that of the electronic device embodiment above; see FIG. 16. The electronic device includes a memory and a processor, where:
the memory is configured to store a program;
the processor is coupled with the memory and is configured to execute the program stored in the memory, so as to:
acquire measurement information related to a first device;
acquire an inference model that has been trained with training samples, where the training samples include multiple sample pairs, and each sample pair contains measurement information and a processing unit frequency;
use the measurement information as the input of the inference model, and execute the inference model to obtain the processing unit frequency of the first device.
When executing the program in the memory, the processor may implement other functions in addition to those above. For details, refer to the description of the foregoing embodiments.
Correspondingly, an embodiment of the present application also provides a computer-readable storage medium storing a computer program which, when executed by a computer, can implement the steps or functions of the data processing method provided in the foregoing embodiments.
Another embodiment of the present application provides an electronic device. Its structure is similar to that of the electronic device embodiment above; see FIG. 16. The electronic device includes a memory and a processor, where:
the memory is configured to store a program;
the processor is coupled with the memory and is configured to execute the program stored in the memory, so as to:
increase the test load on a second device used for testing;
acquire the processing unit frequency of the second device under the test load and the measurement information related to the second device;
use the processing unit frequency and the measurement information as one sample pair in the training samples used to train the inference model to be trained.
When executing the program in the memory, the processor may implement other functions in addition to those above. For details, refer to the description of the foregoing embodiments.
Correspondingly, an embodiment of the present application also provides a computer-readable storage medium storing a computer program which, when executed by a computer, can implement the steps or functions of the data acquisition method provided in the foregoing embodiments.
Another embodiment of the present application provides an electronic device. Its structure is similar to that of the electronic device embodiment above; see FIG. 16. The electronic device includes a memory and a processor, where:
the memory is configured to store a program;
the processor is coupled with the memory and is configured to execute the program stored in the memory, so as to:
acquire training samples, where the training samples contain multiple sample pairs, and each sample pair includes test information and the processing unit frequency corresponding to that test information;
train the inference model to be trained based on the multiple samples;
where the trained inference model is used to determine the processing unit frequency of a first device according to measurement information related to the first device.
When executing the program in the memory, the processor may implement other functions in addition to those above. For details, refer to the description of the foregoing embodiments.
Correspondingly, an embodiment of the present application also provides a computer-readable storage medium storing a computer program which, when executed by a computer, can implement the steps or functions of the model training method provided in the foregoing embodiments.
Another embodiment of the present application provides an electronic device. Its structure is similar to that of the electronic device embodiment above; see FIG. 16. The electronic device includes a memory and a processor, where:
the memory is configured to store a program;
the processor is coupled with the memory and is configured to execute the program stored in the memory, so as to:
acquire measurement information related to a first device;
determine the processing unit frequency of the first device according to the measurement information;
reset a first power consumption cap value based on the processing unit frequency;
perform power consumption capping control on the first device using the reset first power consumption cap value.
When executing the program in the memory, the processor may implement other functions in addition to those above. For details, refer to the description of the foregoing embodiments.
Correspondingly, an embodiment of the present application also provides a computer-readable storage medium storing a computer program which, when executed by a computer, can implement the steps or functions of the device power consumption control method provided in the foregoing embodiments.
以上所描述的装置实施例仅仅是示意性的,其中所述作为分离部件说明的单元可以是或者也可以不是物理上分开的,作为单元显示的部件可以是或者也可以不是物理单元,即可以位于一个地方,或者也可以分布到多个网络单元上。可以根据实际的需要选择其中的部分或者全部模块来实现本实施例方案的目的。本领域普通技术人员在不付出创造性的劳动的情况下,即可以理解并实施。The device embodiments described above are merely illustrative. The units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, that is, they may be located in One place, or it can be distributed to multiple network units. Some or all of the modules can be selected according to actual needs to achieve the objectives of the solutions of the embodiments. Those of ordinary skill in the art can understand and implement it without creative work.
通过以上的实施方式的描述,本领域的技术人员可以清楚地了解到各实施方式可借助软件加必需的通用硬件平台的方式来实现,当然也可以通过硬件。基于这样的理解,上述技术方案本质上或者说对现有技术做出贡献的部分可以以软件产品的形式体现出来,该计算机软件产品可以存储在计算机可读存储介质中,如ROM/RAM、磁碟、光盘等,包括若干指令用以使得一台计算机第一设备(可以是个人计算机,服务器,或者网络第一设备等)执行各个实施例或者实施例的某些部分所述的方法。Through the description of the above implementation manners, those skilled in the art can clearly understand that each implementation manner can be implemented by means of software plus a necessary general hardware platform, and of course, it can also be implemented by hardware. Based on this understanding, the above technical solution essentially or the part that contributes to the existing technology can be embodied in the form of a software product, and the computer software product can be stored in a computer-readable storage medium, such as ROM/RAM, magnetic A disc, an optical disc, etc., include several instructions to make a computer first device (which may be a personal computer, a server, or a network first device, etc.) execute the methods described in each embodiment or some parts of the embodiment.
Finally, it should be noted that the above embodiments are intended only to illustrate the technical solutions of the present application, not to limit them. Although the application has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions recorded in the foregoing embodiments can still be modified, or some of their technical features can be replaced by equivalents, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present application.

Claims (31)

  1. A data processing method, characterized by comprising:
    acquiring measurement information related to a first device;
    determining a processing unit frequency of the first device according to the measurement information.
  2. The method according to claim 1, wherein acquiring the measurement information related to the first device comprises:
    starting power consumption capping control when the power consumption of the first device exceeds a first power consumption cap value;
    acquiring the measurement information related to the first device when the first device has started power consumption capping control.
  3. The method according to claim 2, further comprising:
    resetting the first power consumption cap value when the processing unit frequency does not satisfy a first preset condition.
  4. The method according to claim 3, wherein resetting the first power consumption cap value when the processing unit frequency does not satisfy the first preset condition comprises:
    raising the first power consumption cap value to a second power consumption cap value when the processing unit frequency is lower than a first preset value;
    lowering the first power consumption cap value to a third power consumption cap value when the processing unit frequency is higher than a second preset value;
    wherein the first preset value is smaller than the second preset value.
  5. The method according to any one of claims 1 to 4, wherein determining the processing unit frequency of the first device according to the measurement information comprises:
    acquiring an inference model that has been trained with training samples, wherein the training samples comprise multiple sample pairs, each sample pair containing measurement information and a processing unit frequency;
    using the measurement information as the input of the inference model and executing the inference model to obtain the processing unit frequency.
  6. The method according to claim 5, wherein acquiring the inference model comprises:
    acquiring training samples, wherein the training samples comprise a processing unit frequency and measurement information corresponding to the processing unit frequency sample;
    using the measurement information as the input of an inference model to be trained and executing the inference model to obtain an output result;
    completing training of the inference model when it is determined, based on the output result and the processing unit frequency, that a convergence condition is satisfied;
    optimizing the parameters of the inference model when it is determined, based on the output result and the processing unit frequency, that the convergence condition is not satisfied.
  7. The method according to any one of claims 1 to 4, wherein the measurement information comprises at least one of the following: processing unit power consumption, processing unit temperature, and processing unit utilization.
  8. A data processing system, characterized by comprising:
    a first device, configured to generate measurement information during operation;
    a first management apparatus, configured to acquire the measurement information related to the first device and to determine a processing unit frequency of the first device according to the measurement information.
  9. The system according to claim 8, wherein:
    the first management apparatus is further configured to send a power consumption capping control instruction to the first device when the power consumption of the first device exceeds a first power consumption cap value;
    the first device is further configured to perform a power consumption capping operation according to the control instruction;
    the first management apparatus is further configured to acquire the measurement information related to the first device when the first device has started power consumption capping control, and to determine the processing unit frequency of the first device according to the measurement information.
  10. The system according to claim 8 or 9, wherein the first device is a server in a server cluster, and the first management apparatus is an out-of-band management apparatus.
  11. A data processing method, characterized by comprising:
    acquiring measurement information related to a first device;
    acquiring an inference model that has been trained with training samples, wherein the training samples comprise multiple sample pairs, each sample pair containing measurement information and a processing unit frequency;
    using the measurement information as the input of the inference model and executing the inference model to obtain a processing unit frequency of the first device.
  12. The method according to claim 11, wherein acquiring the measurement information related to the first device comprises:
    starting power consumption capping control when the power consumption of the first device exceeds a first power consumption cap value;
    acquiring the measurement information related to the first device when the first device has started power consumption capping control.
  13. The method according to claim 11 or 12, wherein acquiring the inference model that has been trained with training samples comprises:
    using the measurement information in a sample pair as the input of an inference model to be trained and executing the inference model to obtain an output result;
    completing training of the inference model when it is determined, based on the output result and the processing unit frequency in the first sample pair, that a preset end condition is met;
    optimizing the parameters of the inference model and entering the next training pass when it is determined, based on the output result and the processing unit frequency in the sample pair, that the preset end condition is not met.
  14. A data acquisition method, characterized by comprising:
    increasing a test load on a second device used for testing;
    acquiring a processing unit frequency of the second device under the test load and measurement information related to the second device;
    using the processing unit frequency and the measurement information as one sample pair among the training samples used to train an inference model to be trained.
  15. The method according to claim 14, further comprising:
    setting a first power consumption cap value for the second device;
    and wherein acquiring the processing unit frequency of the second device under the test load and the measurement information related to the second device comprises:
    starting power consumption capping control when the test load has increased to the point where the power consumption of the second device reaches the first power consumption cap value;
    acquiring the processing unit frequency of the second device and the measurement information related to the second device when the second device has started power consumption capping control.
  16. The method according to claim 15, wherein acquiring the processing unit frequency of the second device and the measurement information related to the second device comprises:
    after power consumption capping control is started, continuing to increase the test load on the second device until a preset maximum load is reached;
    recording the processing frequency and the measurement information of the second device during a test period;
    wherein the test period is the period from when power consumption capping control is started until the test load of the second device reaches the preset maximum load; or the period from when power consumption capping control is started until the test load of the second device has reached the preset maximum load and been held there for a set duration.
  17. The method according to claim 15 or 16, further comprising:
    after increasing the test load on the second device to the preset maximum load, resetting the first power consumption cap value and entering the next round of testing, until the reset first power consumption cap value is greater than or equal to a threshold; or
    after increasing the load on the second device to the preset maximum load and holding it for a set duration, resetting the first power consumption cap value and entering the next round of testing, until the reset first power consumption cap value is greater than or equal to the threshold.
  18. The method according to claim 17, wherein resetting the first power consumption cap value comprises:
    adding a set value to the first power consumption cap value to obtain a fourth power consumption cap value;
    updating the first power consumption cap value to the fourth power consumption cap value.
  19. A model training method, characterized by comprising:
    acquiring training samples, the training samples comprising multiple sample pairs, each sample pair comprising test information and a processing unit frequency corresponding to the test information;
    training an inference model to be trained based on the multiple samples;
    wherein the trained inference model is used to determine a processing unit frequency of a first device according to measurement information related to the first device.
  20. A test system, comprising:
    a second device for testing, having the same hardware structure and performance as a first device, configured to run a test program so as to apply a corresponding test load;
    a second management apparatus for testing, connected to the second device, configured to acquire a processing unit frequency of the second device under the test load and measurement information related to the second device, and to use the processing unit frequency and the measurement information as one sample pair among the training samples used to train an inference model to be trained.
  21. The system according to claim 20, wherein:
    the second management apparatus is further configured to set a first power consumption cap value for the second device, and to send a power consumption capping control instruction to the second device when the test load has increased to the point where the power consumption of the second device reaches the first power consumption cap value;
    the second device is further configured to perform a power consumption capping operation according to the control instruction;
    the second management apparatus is further configured to acquire the processing unit frequency of the second device and the measurement information related to the second device when the second device has started power consumption capping control, and to use the processing unit frequency and the measurement information as one sample pair among the training samples used to train the inference model to be trained.
  22. The system according to claim 20 or 21, further comprising:
    a model training apparatus, configured to acquire training samples, the training samples comprising multiple sample pairs, each sample pair comprising test information and a processing unit frequency corresponding to the test information; and to train the inference model to be trained based on the multiple samples, so as to provide a first management apparatus with the trained inference model.
  23. A device power consumption control method, characterized by comprising:
    acquiring measurement information related to a first device;
    determining a processing unit frequency of the first device according to the measurement information;
    resetting a first power consumption cap value based on the processing unit frequency;
    performing power consumption capping control on the first device using the reset first power consumption cap value.
  24. The method according to claim 23, wherein resetting the first power consumption cap value based on the processing unit frequency comprises:
    raising the first power consumption cap value to a second power consumption cap value when the processing unit frequency is lower than a first preset value;
    lowering the first power consumption cap value to a third power consumption cap value when the processing unit frequency is higher than a second preset value;
    wherein the first preset value is smaller than the second preset value.
  25. The method according to claim 23, wherein determining the processing unit frequency of the first device according to the measurement information comprises:
    acquiring an inference model that has been trained with training samples, wherein the training samples comprise multiple sample pairs, each sample pair containing measurement information and a processing unit frequency;
    using the measurement information as the input of the inference model and executing the inference model to obtain the processing unit frequency.
  26. The method according to any one of claims 23 to 25, wherein the measurement information comprises at least one of the following: processing unit power consumption, processing unit temperature, and processing unit utilization.
  27. An electronic device, characterized by comprising a memory and a processor, wherein:
    the memory is configured to store a program;
    the processor is coupled to the memory and configured to execute the program stored in the memory, so as to:
    acquire measurement information related to a first device;
    determine a processing unit frequency of the first device according to the measurement information.
  28. An electronic device, characterized by comprising a memory and a processor, wherein:
    the memory is configured to store a program;
    the processor is coupled to the memory and configured to execute the program stored in the memory, so as to:
    acquire measurement information related to a first device;
    acquire an inference model that has been trained with training samples, wherein the training samples comprise multiple sample pairs, each sample pair containing measurement information and a processing unit frequency;
    use the measurement information as the input of the inference model and execute the inference model to obtain a processing unit frequency of the first device.
  29. An electronic device, characterized by comprising a memory and a processor, wherein:
    the memory is configured to store a program;
    the processor is coupled to the memory and configured to execute the program stored in the memory, so as to:
    increase a test load on a second device used for testing;
    acquire a processing unit frequency of the second device under the test load and measurement information related to the second device;
    use the processing unit frequency and the measurement information as one sample pair among the training samples used to train an inference model to be trained.
  30. An electronic device, characterized by comprising a memory and a processor, wherein:
    the memory is configured to store a program;
    the processor is coupled to the memory and configured to execute the program stored in the memory, so as to:
    acquire training samples, the training samples comprising multiple sample pairs, each sample pair comprising test information and a processing unit frequency corresponding to the test information;
    train an inference model to be trained based on the multiple samples;
    wherein the trained inference model is used to determine a processing unit frequency of a first device according to measurement information related to the first device.
  31. An electronic device, characterized by comprising a memory and a processor, wherein:
    the memory is configured to store a program;
    the processor is coupled to the memory and configured to execute the program stored in the memory, so as to:
    acquire measurement information related to a first device;
    determine a processing unit frequency of the first device according to the measurement information;
    reset a first power consumption cap value based on the processing unit frequency;
    perform power consumption capping control on the first device using the reset first power consumption cap value.
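
The control flow recited in claims 3, 4, 23 and 24 (infer the processing unit frequency from measurement information, then raise or lower the power consumption cap) can be sketched as follows. This is an illustrative sketch only, not the applicant's implementation: the function names, the GHz and watt units, the default preset values, the fixed adjustment step `step_w`, and the stand-in `toy_model` are all assumptions, since the claims say only that the cap moves to a "second" or "third" cap value.

```python
def reset_power_cap(freq_ghz, cap_w, low_ghz, high_ghz, step_w):
    """Return the reset cap for an inferred processing unit frequency.

    Below `low_ghz` (first preset value) the cap is raised; above `high_ghz`
    (second preset value) it is lowered; otherwise it is left unchanged.
    The fixed `step_w` is an assumption made for this sketch.
    """
    if not low_ghz < high_ghz:
        raise ValueError("the first preset value must be smaller than the second")
    if freq_ghz < low_ghz:
        return cap_w + step_w  # frequency throttled too far: relax the cap
    if freq_ghz > high_ghz:
        return cap_w - step_w  # headroom available: tighten the cap
    return cap_w


def control_step(measurement, model, cap_w, low_ghz=1.8, high_ghz=2.6, step_w=20.0):
    """One pass of the claimed loop: infer the frequency, then reset the cap."""
    freq_ghz = model(measurement)
    return reset_power_cap(freq_ghz, cap_w, low_ghz, high_ghz, step_w)


def toy_model(measurement):
    """Hypothetical stand-in for a trained inference model (power in watts first)."""
    return 3.0 - 0.005 * measurement[0]


# Measurement layout (power W, temperature C, utilization) is assumed.
new_cap = control_step((300.0, 70.0, 0.9), toy_model, cap_w=300.0)
```

Here the stand-in model infers roughly 1.5 GHz, which is below the assumed first preset value of 1.8 GHz, so the cap is raised by one step. In a real deployment the reset cap would then be handed to whatever mechanism enforces capping on the first device.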
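
The training procedure of claims 6, 13 and 19 (run the model on the measurement information, compare the output with the recorded processing unit frequency, stop when a convergence condition holds, otherwise optimize the parameters) can be illustrated with a small example. The claims do not fix a model type or optimizer; the linear model, batch gradient descent, mean-squared-error convergence test, and the synthetic sample pairs below are all assumptions chosen to keep the sketch self-contained.

```python
from itertools import product


def predict(params, x):
    """Linear inference model: frequency = weights . x + bias."""
    weights, bias = params
    return sum(w * xi for w, xi in zip(weights, x)) + bias


def train(pairs, lr=0.1, tol=1e-6, max_epochs=10_000):
    """Fit the model on (measurement, frequency) sample pairs."""
    n = len(pairs)
    weights, bias = [0.0] * len(pairs[0][0]), 0.0
    for _ in range(max_epochs):
        grad_w, grad_b, loss = [0.0] * len(weights), 0.0, 0.0
        for x, freq in pairs:
            err = predict((weights, bias), x) - freq  # output result vs. sample
            loss += err * err / n
            grad_b += 2 * err / n
            for i, xi in enumerate(x):
                grad_w[i] += 2 * err * xi / n
        if loss < tol:  # convergence condition satisfied: training completes
            break
        # Convergence condition not satisfied: optimize the parameters.
        weights = [w - lr * g for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b
    return weights, bias


def true_freq(x):
    """Synthetic ground truth (GHz), invented purely for this illustration."""
    power, temp, util = x
    return 3.0 - 0.8 * power - 0.4 * temp + 0.2 * util


# Sample pairs: measurements normalized to [0, 1] (power, temperature, utilization).
pairs = [(x, true_freq(x)) for x in product((0.0, 1.0), repeat=3)]
model = train(pairs)
estimate = predict(model, (0.5, 0.5, 0.5))
```

Because the synthetic relationship is itself linear, the fitted model reproduces it closely; real measurement data would likely call for a richer model and a held-out validation set.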
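
The data acquisition rounds of claims 14 to 18 (ramp the test load, start capping once power reaches the cap, record sample pairs until the maximum load, then add a set value to the cap and repeat until it reaches a threshold) might look like the sketch below. `SimulatedDevice` and its power and frequency formulas are invented stand-ins for the second device; a real setup would read these values from the test hardware, for example through its management apparatus.

```python
class SimulatedDevice:
    """Toy stand-in for the second (test) device; all formulas are assumptions."""

    MAX_FREQ_GHZ = 3.0

    def __init__(self):
        self.load_pct = 0
        self.cap_w = None  # None means capping control has not been started

    def demand_w(self):
        # Power the device would draw at full frequency for the current load.
        return 100.0 + 2.0 * self.load_pct

    def power_w(self):
        if self.cap_w is None:
            return self.demand_w()
        return min(self.demand_w(), self.cap_w)

    def frequency_ghz(self):
        # Assume frequency is scaled down in proportion to the power cut.
        return self.MAX_FREQ_GHZ * self.power_w() / self.demand_w()


def collect_samples(device, first_cap_w, cap_step_w, cap_threshold_w,
                    load_step_pct=10, max_load_pct=100):
    """Return (measurement, frequency) sample pairs gathered over all rounds."""
    samples = []
    cap_w = first_cap_w
    while True:
        device.load_pct, device.cap_w = 0, None  # begin a new test round
        for load_pct in range(load_step_pct, max_load_pct + 1, load_step_pct):
            device.load_pct = load_pct
            if device.cap_w is None and device.demand_w() >= cap_w:
                device.cap_w = cap_w  # power reached the cap: start capping
            if device.cap_w is not None:
                measurement = (device.power_w(), load_pct)
                samples.append((measurement, device.frequency_ghz()))
        cap_w += cap_step_w  # reset the cap by adding a set value
        if cap_w >= cap_threshold_w:
            return samples


samples = collect_samples(SimulatedDevice(), 150.0, 50.0, 300.0)
```

With these assumed numbers the sweep runs three rounds (caps of 150 W, 200 W and 250 W) and records only the readings taken while capping is active, which matches the test period described in claim 16.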
PCT/CN2019/128400 2019-12-25 2019-12-25 Data processing, acquisition, model training and power consumption control methods, system and device WO2021128084A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/CN2019/128400 WO2021128084A1 (en) 2019-12-25 2019-12-25 Data processing, acquisition, model training and power consumption control methods, system and device

Publications (1)

Publication Number Publication Date
WO2021128084A1 true WO2021128084A1 (en) 2021-07-01

Family

ID=76575081

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/128400 WO2021128084A1 (en) 2019-12-25 2019-12-25 Data processing, acquisition, model training and power consumption control methods, system and device

Country Status (1)

Country Link
WO (1) WO2021128084A1 (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2014032250A1 (en) * 2012-08-30 2014-03-06 华为终端有限公司 Method and device for controlling central processing unit
CN104204825A (en) * 2012-03-30 2014-12-10 英特尔公司 Dynamically measuring power consumption in a processor
CN107861606A (en) * 2017-11-21 2018-03-30 北京工业大学 A kind of heterogeneous polynuclear power cap method by coordinating DVFS and duty mapping
CN108599966A (en) * 2018-03-13 2018-09-28 山东超越数控电子股份有限公司 A kind of net peace equipment power dissipation dynamic debugging system and method
CN108615071A (en) * 2018-05-10 2018-10-02 阿里巴巴集团控股有限公司 The method and device of model measurement

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19957679

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19957679

Country of ref document: EP

Kind code of ref document: A1