WO2017124953A1 - Processing method for machine abnormality, and method and apparatus for adjusting learning rate - Google Patents

Processing method for machine abnormality, and method and apparatus for adjusting learning rate

Info

Publication number
WO2017124953A1
WO2017124953A1 (PCT application PCT/CN2017/070906)
Authority
WO
WIPO (PCT)
Prior art keywords
gradient
learning rate
time
target machine
consumption time
Prior art date
Application number
PCT/CN2017/070906
Other languages
English (en)
French (fr)
Inventor
周俊
Original Assignee
阿里巴巴集团控股有限公司
周俊
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 阿里巴巴集团控股有限公司 and 周俊
Priority to EP17740984.4A priority Critical patent/EP3407211A4/en
Publication of WO2017124953A1 publication Critical patent/WO2017124953A1/zh
Priority to US16/043,006 priority patent/US10748090B2/en

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06Q - INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 - Administration; Management
    • G06Q10/04 - Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 - Error detection; Error correction; Monitoring
    • G06F11/30 - Monitoring
    • G06F11/34 - Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment
    • G06F11/3409 - Recording or statistical evaluation of computer activity for performance assessment
    • G06F11/3419 - Recording or statistical evaluation of computer activity for performance assessment by assessing time
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 - Error detection; Error correction; Monitoring
    • G06F11/30 - Monitoring
    • G06F11/34 - Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment
    • G06F11/3466 - Performance evaluation by tracing or monitoring
    • G06F11/3495 - Performance evaluation by tracing or monitoring for systems
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 - Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 - Querying
    • G06F16/245 - Query processing
    • G06F16/2458 - Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2465 - Query processing support for facilitating data mining operations in structured databases
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 - Machine learning

Definitions

  • the present application relates to the field of the Internet, and in particular to a processing method for machine abnormality and a method and apparatus for adjusting a learning rate.
  • Internet companies hold large amounts of user behavior data and typically use machine learning methods to extract useful information from these data, such as user preferences; mining this information improves the user experience and the companies' revenue.
  • the core task of machine learning is to minimize the loss function (a function that measures the degree of loss or error; in search advertising, for example, the smaller the loss function, the more likely the user is to click on the search ad).
  • the gradient descent method (the gradient, a vector, is the derivative of the loss function with respect to the weights) is the most widely used method for minimizing the loss function in machine learning; because it is simple to implement and fast to compute, it is applied to a wide variety of optimization problems.
  • the learning rate (usually denoted Eta) is an important parameter in the weight update (the weight, a vector, can be understood as the independent variable of the loss function) and affects the convergence of the training process. If Eta is too large, each iteration steps too far and easily misses the optimal solution; if Eta is too small, training proceeds too slowly, reducing the convergence speed.
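The effect described here can be illustrated with a minimal sketch (not part of the patent; the loss function, step count, and values of Eta are hypothetical): plain gradient descent on the one-dimensional loss f(w) = w^2, whose gradient is 2w.

```python
def gradient_descent(eta, steps=50, w=10.0):
    """Iterate w <- w - eta * gradient(w) for the loss f(w) = w**2."""
    for _ in range(steps):
        w = w - eta * 2.0 * w  # gradient of w**2 is 2*w
    return w

w_small = gradient_descent(eta=0.01)  # Eta too small: still far from the optimum 0
w_good = gradient_descent(eta=0.1)    # reasonable Eta: converges close to 0
w_large = gradient_descent(eta=1.1)   # Eta too large: every step overshoots, diverges
```

With eta=0.1 the weight approaches the optimum 0 within 50 steps, with eta=0.01 it is still far from 0, and with eta=1.1 its magnitude grows without bound.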
  • the embodiments of the present application provide a processing method for machine abnormality and a method and apparatus for adjusting a learning rate, so as to solve at least the technical problem that training costs are high because some machines in a cluster compute or communicate slowly.
  • a method for processing a machine abnormality includes: acquiring a gradient consumption time of a target machine, where the gradient consumption time indicates the gradient-related time consumed by the target machine during training; determining whether the gradient consumption time satisfies a predetermined condition when compared with a pre-acquired consumption time mean, where the consumption time mean represents the average gradient-related time consumed during training by all machines in the cluster other than the target machine; and, if the gradient consumption time satisfies the predetermined condition when compared with the consumption time mean, determining that the target machine is abnormal.
  • a method for adjusting a learning rate includes: acquiring a gradient calculated by a target machine; calculating a learning rate corresponding to the gradient according to the gradient; and determining whether the learning rate is less than a preset threshold. If the learning rate is less than the preset threshold, the update weight operation is stopped; if the learning rate is greater than or equal to the preset threshold, the update weight operation is performed.
  • a processing apparatus for machine abnormality includes: a first acquiring unit, configured to acquire a gradient consumption time of a target machine, where the gradient consumption time indicates the gradient-related time consumed by the target machine during training; a determining unit, configured to determine whether the gradient consumption time satisfies a predetermined condition when compared with a pre-acquired consumption time mean, where the consumption time mean represents the average gradient-related time consumed during training by all machines in the cluster other than the target machine; and a detecting unit, configured to determine that the target machine is abnormal if the gradient consumption time satisfies the predetermined condition when compared with the consumption time mean.
  • a learning rate adjustment apparatus includes: a second acquiring unit, configured to acquire a gradient calculated by a target machine; a calculating unit, configured to calculate a learning rate corresponding to the gradient according to the gradient; and a processing unit, configured to determine whether the learning rate is less than a preset threshold, to stop performing the update weight operation if the learning rate is less than the preset threshold, and to perform the update weight operation if the learning rate is greater than or equal to the preset threshold.
  • in the embodiments of the present application, the gradient consumption time of the target machine is acquired, where the gradient consumption time indicates the gradient-related time consumed by the target machine during training, and it is determined whether the gradient consumption time satisfies a predetermined condition when compared with a pre-acquired consumption time mean, where the consumption time mean represents the average gradient-related time consumed during training by all machines in the cluster other than the target machine; if the predetermined condition is satisfied, the target machine is determined to be abnormal.
  • by comparing the gradient consumption time of the target machine with the consumption time mean of all the machines other than the target machine, it can be determined whether the target machine is abnormal, and when it is, the training strategy can be adjusted in time. This avoids the increased training cost caused by some machines computing or communicating slowly, achieves the purpose of identifying abnormal machines in the cluster in time, and thereby achieves the technical effect of reducing training cost, solving the technical problem that training costs are high because some machines in the cluster compute or communicate slowly.
  • FIG. 1 is a block diagram showing the hardware structure of a computer terminal for processing a machine abnormality according to an embodiment of the present application.
  • FIG. 2 is a schematic flowchart of an optional method for processing machine abnormalities according to an embodiment of the present application.
  • FIG. 3 is a schematic flow chart of another optional method for processing machine abnormalities according to an embodiment of the present application.
  • FIG. 4 is a schematic flowchart of an optional learning rate adjustment method according to an embodiment of the present application.
  • FIG. 5 is a schematic structural diagram of an optional machine abnormality processing apparatus according to an embodiment of the present application.
  • FIG. 6 is a schematic structural diagram of another optional machine abnormality processing apparatus according to an embodiment of the present application.
  • FIG. 7 is a schematic structural diagram of an optional processing unit according to an embodiment of the present application.
  • FIG. 8 is a schematic structural diagram of still another optional machine abnormality processing apparatus according to an embodiment of the present application.
  • FIG. 9 is a schematic structural diagram of an optional learning rate adjusting apparatus according to an embodiment of the present application.
  • FIG. 10 is a structural block diagram of a computer terminal according to an embodiment of the present application.
  • a method embodiment of a method for processing machine abnormality is also provided. It should be noted that the steps shown in the flowcharts of the accompanying drawings may be performed in a computer system, such as by a set of computer-executable instructions, and, although a logical order is shown in the flowcharts, in some cases the steps shown or described may be performed in an order different from that described herein.
  • FIG. 1 is a hardware structural block diagram of a computer terminal for processing a machine abnormality according to an embodiment of the present application.
  • the computer terminal 10 may include one or more processors 102 (only one is shown; the processor 102 may include, but is not limited to, a processing device such as a microprocessor (MCU) or a programmable logic device (FPGA)),
  • a memory 104 for storing data
  • a transmission device 106 for communication functions.
  • the computer terminal 10 may also include more or fewer components than those shown in FIG. 1, or have a different configuration than that shown in FIG. 1.
  • the memory 104 can be used to store software programs and modules of application software, such as the program instructions/modules corresponding to the processing method for machine abnormality in the embodiments of the present application; by running the software programs and modules stored in the memory 104, the processor 102 executes various functional applications and data processing, that is, implements the processing method for machine abnormality of the above-described application.
  • Memory 104 may include high speed random access memory, and may also include non-volatile memory such as one or more magnetic storage devices, flash memory, or other non-volatile solid state memory.
  • memory 104 may further include memory remotely located relative to processor 102, which may be coupled to computer terminal 10 via a network. Examples of such networks include, but are not limited to, the Internet, intranets, local area networks, mobile communication networks, and combinations thereof.
  • Transmission device 106 is for receiving or transmitting data via a network.
  • specific examples of the above network may include a wireless network provided by the communication provider of the computer terminal 10.
  • the transmission device 106 includes a Network Interface Controller (NIC) that can be connected to other network devices through a base station so as to communicate with the Internet.
  • the transmission device 106 can be a radio frequency (RF) module for communicating with the Internet wirelessly.
  • FIG. 2 is a flow chart of a method for processing machine abnormalities according to Embodiment 1 of the present application.
  • Step S202 acquiring a gradient consumption time of the target machine.
  • the gradient consumption time is used to indicate the gradient-related time consumed by the target machine during the training.
  • the gradient consumption time includes a first time consumed by the target machine to calculate the gradient and/or a second time consumed to send the gradient.
  • step S204 it is determined whether the gradient consumption time satisfies a predetermined condition compared to the pre-acquired consumption time average value.
  • the consumption time average is used to represent the average of the time related to the gradient consumed by all the machines except the target machine in the cluster during the training.
  • the consumption time mean includes a first mean of the time consumed by all machines in the system other than the target machine to calculate the gradient and/or a second mean of the time consumed to transmit the gradient.
  • determining whether the gradient consumption time satisfies the predetermined condition when compared with the pre-acquired consumption time mean includes: determining whether the first time is greater than the product of the first mean and a first preset coefficient, where, if the first time is greater than that product, it is determined that the gradient consumption time satisfies the predetermined condition when compared with the consumption time mean, and if the first time is less than or equal to that product, it is determined that the gradient consumption time does not satisfy the predetermined condition; and/or, determining whether the second time is greater than the product of the second mean and a second preset coefficient, where, if the second time is greater than that product, it is determined that the gradient consumption time satisfies the predetermined condition when compared with the consumption time mean, and if the second time is less than or equal to that product, it is determined that the gradient consumption time does not satisfy the predetermined condition.
  • step S206: if the gradient consumption time satisfies the predetermined condition when compared with the consumption time mean, the target machine is determined to be abnormal.
  • in step S206 of the present application, if the first time is greater than the product of the first mean and the first preset coefficient, and/or the second time is greater than the product of the second mean and the second preset coefficient, the target machine is determined to be abnormal.
  • the processing method for machine abnormality includes, but is not limited to, making the determination along two dimensions: the first time consumed by each machine to calculate the gradient and the second time consumed to send the gradient. For the sending dimension, the predetermined condition is that, after a time equal to the mean of all machines' gradient-sending times multiplied by the second preset coefficient (for example, 2) has elapsed, the target machine has still not sent its gradient; the target machine is then considered abnormal (for example, if 9 of 10 machines have sent their gradients within 5 seconds, but the target machine has still not sent its gradient after more than 10 seconds, the target machine is considered a slow machine). The calculation dimension is handled analogously with the first preset coefficient (which may be, for example, 3): if the first time exceeds the product of the first mean and the first preset coefficient, the target machine is considered abnormal.
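The check just described can be sketched as follows (a minimal illustration; the function name, coefficients, and timings are assumptions, not taken from the patent):

```python
def is_abnormal(target_time, other_times, coefficient):
    """Flag the target machine when its gradient-related time exceeds the
    mean of the other machines' times multiplied by a preset coefficient."""
    mean_time = sum(other_times) / len(other_times)
    return target_time > mean_time * coefficient

# The sending-time example above: 9 of 10 machines sent their gradients
# within 5 seconds, while the target machine has taken more than 10 seconds
# (second preset coefficient 2).
other_send_times = [5.0] * 9
slow = is_abnormal(11.0, other_send_times, coefficient=2)  # True: 11 > 5 * 2
ok = is_abnormal(8.0, other_send_times, coefficient=2)     # False: 8 <= 10
```

The same function covers the calculation-time dimension by passing the first preset coefficient (e.g. 3) and the calculation times instead of the sending times.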
  • the solution provided in the first embodiment of the present application determines whether the target machine is abnormal by comparing the gradient consumption time of the target machine with the consumption time mean of all the machines other than the target machine. When the target machine is abnormal, the training strategy is adjusted in time, avoiding the increased training cost caused by some machines computing or communicating slowly, and achieving the purpose of identifying abnormal machines in the cluster in time, thereby achieving the technical effect of reducing training cost and solving the technical problem that training costs are high because some machines in the cluster compute or communicate slowly.
  • the processing method of the machine abnormality in this embodiment may further include:
  • Step S302 obtaining a gradient calculated by all the machines except the target machine.
  • in step S302 of the present application, after the target machine is determined to be abnormal, the processing method for machine abnormality of this embodiment does not wait for the gradient of the abnormal target machine, but directly acquires the gradients that have already been returned (i.e., the gradients calculated by all machines other than the target machine).
  • Step S304 calculating a learning rate corresponding to the gradient according to the gradient.
  • here, sum((i-th gradient)^2) refers to summing the square of the i-th gradient component of the current iteration with the squares of the i-th gradient components of the M iterations before the current round, where M may be, for example, 20, which is not limited in this embodiment.
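A sketch of this per-dimension rule (an AdaGrad-style rule; the base rate, the epsilon term, and the function name are illustrative assumptions, not taken from the patent):

```python
import math

def per_dimension_learning_rates(gradient_history, base_eta=1.0, eps=1e-8):
    """gradient_history: gradient vectors from the current iteration and the
    previous M iterations, newest last. The learning rate of dimension i is
    base_eta divided by the square root of the summed squares of the i-th
    gradient components over that window (eps guards against division by zero)."""
    dims = len(gradient_history[0])
    return [base_eta / math.sqrt(sum(g[i] ** 2 for g in gradient_history) + eps)
            for i in range(dims)]

# Two iterations, two dimensions: dimension 0 has seen small gradients and
# therefore receives a larger learning rate than dimension 1.
rates = per_dimension_learning_rates([[0.5, 3.0], [0.5, 4.0]])
```

This is how "different learning rates for the weights of different dimensions" can arise: dimensions with a history of large gradients are stepped more cautiously.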
  • Step S306 determining whether to perform the update weight operation according to the learning rate.
  • determining whether to perform the update weight operation according to the learning rate includes: determining whether the learning rate is less than a preset threshold; if the learning rate is less than the preset threshold, stopping performing the update weight operation; if the learning rate is greater than or equal to the preset threshold, Perform an update weight operation.
  • the preset threshold of this embodiment may be 1e-5, that is, 10 to the power of -5.
  • performing the update weight operation includes: calculating update weights according to the learning rate, the gradient, and the historical weight, wherein the historical weight refers to the weight used by the target machine during the training.
  • the historical weight refers to the weight used by the target machine in the current iteration
  • the update weight refers to the weight to be used in the next iteration of the machine.
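The update-weight calculation can be sketched as follows (a minimal illustration under the assumption of a per-dimension update w_i <- w_i - eta_i * g_i; names and values are hypothetical):

```python
def update_weights(historical_weights, learning_rates, gradients):
    """Compute the update weights (to be used in the next iteration) from the
    historical weights, the per-dimension learning rates, and the gradient."""
    return [w - eta * g
            for w, eta, g in zip(historical_weights, learning_rates, gradients)]

new_weights = update_weights([1.0, -2.0], [0.1, 0.05], [4.0, -10.0])
# new_weights is [0.6, -1.5] (up to floating-point rounding)
```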
  • the processing method of the machine exception of the embodiment may further include:
  • step S10: the update weight is sent to the target machine and to all machines other than the target machine, to instruct the target machine and all machines other than the target machine to train according to the update weight.
  • in step S10 of the present application, after the target machine is determined to be abnormal, the processing method for machine abnormality of this embodiment does not wait for the gradient returned by the target machine; it directly performs the update weight operation according to the gradients that have already been returned, and then sends the update weight to all machines, notifying them to proceed to the next iteration, thereby saving time.
  • in the processing method for machine abnormality of this embodiment, first, by detecting abnormal machines in the cluster, machines with abnormalities are avoided, preventing training from being aborted or stalling while waiting for a slow machine, thereby speeding up training and saving cost; second, by fully utilizing information such as the gradient and the iteration round, the learning rate is adjusted automatically, and different learning rates are used for the weights of different dimensions, so that better convergence can be achieved in each dimension, further accelerating training and saving cost.
  • in summary, the present application proposes a processing method for machine abnormality that determines whether the target machine is abnormal by comparing the gradient consumption time of the target machine with the consumption time mean of all machines other than the target machine. When the target machine is abnormal, the training strategy is adjusted in time, avoiding the increased training cost caused by some machines computing or communicating slowly, and achieving the purpose of identifying abnormal machines in the cluster in time, thereby achieving the technical effect of reducing training cost.
  • a method embodiment of a method for adjusting a learning rate is also provided. It should be noted that the steps shown in the flowchart of the accompanying drawings may be executed in a computer system such as a set of computer executable instructions. And, although the logical order is shown in the flowcharts, in some cases the steps shown or described may be performed in a different order than the ones described herein.
  • FIG. 4 is a flowchart of a method for adjusting a learning rate according to Embodiment 2 of the present application.
  • Step S402 obtaining a gradient calculated by the target machine.
  • the gradient is the value obtained by taking the derivative of the loss function.
  • the loss function is a function that maps an event (an element of a sample space) to a real number that expresses the economic or opportunity cost associated with that event.
  • Step S404 calculating a learning rate corresponding to the gradient according to the gradient.
  • here, sum((i-th gradient)^2) refers to summing the square of the i-th gradient component of the current iteration with the squares of the i-th gradient components of the M iterations before the current round, where M may be, for example, 20, which is not limited in this embodiment.
  • Step S406 determining whether the learning rate is less than a preset threshold.
  • the preset threshold may be 1e-5, that is, 10 to the power of -5.
  • Step S408 if the learning rate is less than the preset threshold, stop performing the update weight operation.
  • Step S410 If the learning rate is greater than or equal to a preset threshold, perform an update weight operation.
  • performing the update weight operation includes: calculating the update weight according to the learning rate, the gradient, and the historical weight, where the historical weight refers to the weight used by the target machine during training.
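Steps S406 to S410 can be sketched together as follows (an illustrative interpretation in which the check is applied per weight; the names are assumptions, and only the 1e-5 threshold value comes from the embodiment above):

```python
PRESET_THRESHOLD = 1e-5  # the preset threshold from the embodiment

def gated_update(historical_weight, learning_rate, gradient,
                 threshold=PRESET_THRESHOLD):
    """Perform the update weight operation only when the learning rate has
    not fallen below the preset threshold (steps S406 to S410)."""
    if learning_rate < threshold:
        return historical_weight  # step S408: stop the update
    return historical_weight - learning_rate * gradient  # step S410

kept = gated_update(1.0, 1e-7, 5.0)   # rate below threshold: weight unchanged
moved = gated_update(1.0, 0.1, 5.0)   # rate above threshold: 1.0 - 0.1 * 5.0
```

Skipping the update for dimensions whose learning rate has shrunk below the threshold is what shortens the training time, since those dimensions have effectively converged.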
  • the method of this embodiment may further include: sending the update weight to the target machine to instruct the target machine to train according to the update weight.
  • the method of this embodiment can fully utilize information such as the gradient and the iteration round to automatically adjust the learning rate, using different learning rates for the weights of different dimensions, so that better convergence can be achieved in each dimension, further accelerating training and saving cost.
  • the solution provided by the foregoing Embodiment 2 of the present application calculates the corresponding learning rate according to the gradient of the target machine. When the learning rate is less than the preset threshold, execution of the update weight operation is stopped, shortening the training time and thereby achieving the technical effect of reducing training cost.
  • the method according to the above embodiments can be implemented by software plus a necessary general-purpose hardware platform, or of course by hardware, but in many cases the former is the better implementation.
  • the technical solution of the present application, or the part that contributes to the prior art, may be embodied in the form of a software product stored in a storage medium (such as ROM/RAM, a magnetic disk, or an optical disc) and including a number of instructions for causing a terminal device (which may be a mobile phone, a computer, a server, a network device, or the like) to perform the methods described in the various embodiments of the present application.
  • an apparatus embodiment for implementing the foregoing method for processing an abnormality of a machine is provided.
  • the apparatus provided by the foregoing embodiment of the present application may be run on a computer terminal.
  • FIG. 5 is a schematic structural diagram of a processing device for machine abnormality according to an embodiment of the present application.
  • the processing device of the machine abnormality may include a first acquiring unit 502, a determining unit 504, and a detecting unit 506.
  • the first acquiring unit 502 is configured to acquire a gradient consumption time of the target machine, where the gradient consumption time indicates the gradient-related time consumed by the target machine during training; the determining unit 504 is configured to determine whether the gradient consumption time satisfies a predetermined condition when compared with a pre-acquired consumption time mean, where the consumption time mean represents the average gradient-related time consumed during training by all machines in the cluster other than the target machine; and the detecting unit 506 is configured to determine that the target machine is abnormal if the gradient consumption time satisfies the predetermined condition when compared with the consumption time mean.
  • the solution provided in the above third embodiment of the present application determines whether the target machine is abnormal by comparing the gradient consumption time of the target machine with the consumption time mean of all the machines other than the target machine. When the target machine is abnormal, the training strategy is adjusted in time, avoiding the increased training cost caused by some machines computing or communicating slowly, and achieving the purpose of identifying abnormal machines in the cluster in time, thereby achieving the technical effect of reducing training cost and solving the technical problem that training costs are high because some machines in the cluster compute or communicate slowly.
  • the foregoing first acquiring unit 502, determining unit 504, and detecting unit 506 correspond to steps S202 to S206 in the first embodiment; the examples and application scenarios implemented by the three modules are the same as those of the corresponding steps, but are not limited to the contents disclosed in the first embodiment. It should be noted that the foregoing modules may run, as part of the apparatus, in the computer terminal 10 provided in the first embodiment, and may be implemented by software or by hardware.
  • optionally, the gradient consumption time includes a first time consumed by the target machine to calculate the gradient and/or a second time consumed to send the gradient; the consumption time mean includes a first mean of the time consumed by all machines in the system other than the target machine to calculate the gradient and/or a second mean of the time consumed to transmit the gradient.
  • optionally, the determining unit 504 determines whether the gradient consumption time satisfies the predetermined condition when compared with the pre-acquired consumption time mean as follows: determining whether the first time is greater than the product of the first mean and a first preset coefficient, where, if the first time is greater than that product, it is determined that the gradient consumption time satisfies the predetermined condition when compared with the consumption time mean, and if the first time is less than or equal to that product, it is determined that the gradient consumption time does not satisfy the predetermined condition; and/or, determining whether the second time is greater than the product of the second mean and a second preset coefficient, where, if the second time is greater than that product, it is determined that the gradient consumption time satisfies the predetermined condition when compared with the consumption time mean, and if the second time is less than or equal to that product, it is determined that the gradient consumption time does not satisfy the predetermined condition.
  • the processing device of the machine abnormality may further include: a second obtaining unit 602, a calculating unit 604, and a processing unit 606.
  • the second acquiring unit 602 is configured to acquire the gradients calculated by all the machines other than the target machine; the calculating unit 604 is configured to calculate a learning rate corresponding to the gradient according to the gradient; and the processing unit 606 is configured to determine, according to the learning rate, whether to perform the update weight operation.
  • the foregoing second acquiring unit 602, calculating unit 604, and processing unit 606 correspond to steps S302 to S306 in the first embodiment; the examples and application scenarios implemented by the three modules are the same as those of the corresponding steps, but are not limited to the contents disclosed in the first embodiment. It should be noted that the foregoing modules may run, as part of the apparatus, in the computer terminal 10 provided in the first embodiment, and may be implemented by software or by hardware.
  • the processing unit 606 includes: a determining module 702 and an executing module 704.
  • the determining module 702 is configured to determine whether the learning rate is less than a preset threshold; the executing module 704 is configured to stop performing the update weight operation if the learning rate is less than the preset threshold, and to perform the update weight operation if the learning rate is greater than or equal to the preset threshold.
  • the executing module 704 is configured to perform the following steps: performing the update weight operation: calculating an update weight according to the learning rate, the gradient, and a historical weight, where Historical weight refers to the weight used by the target machine during the training process.
  • the processing device of the machine abnormality may further include: a sending unit 802.
  • the sending unit 802 is configured to send the update weight to the target machine and all machines except the target machine to indicate that the target machine and all machines except the target machine are updated according to the target Weight training.
  • an apparatus embodiment for implementing the foregoing method for adjusting a learning rate is further provided.
  • the apparatus provided by the foregoing embodiment of the present application may be run on a computer terminal.
  • FIG. 9 is a schematic structural diagram of an apparatus for adjusting a learning rate according to an embodiment of the present application.
  • the apparatus for adjusting the learning rate may include: a second acquisition unit 902, a calculation unit 904, and a processing unit 906.
  • the second acquisition unit 902 is configured to acquire a gradient calculated by the target machine; the calculation unit 904 is configured to calculate, according to the gradient, the learning rate corresponding to the gradient; the processing unit 906 is configured to determine whether the learning rate is less than a preset threshold, stop the update-weight operation if the learning rate is less than the preset threshold, and perform the update-weight operation if the learning rate is greater than or equal to the preset threshold.
  • the solution provided in Embodiment 4 of the present application calculates the corresponding learning rate according to the gradient of the target machine, and stops the update-weight operation when the learning rate is less than the preset threshold, shortening the training time.
  • Embodiments of the present application may provide a computer terminal, which may be any computer terminal device in a computer terminal group.
  • the foregoing computer terminal may also be replaced with a terminal device such as a mobile terminal.
  • the computer terminal may be located in at least one network device of the plurality of network devices of the computer network.
  • the computer terminal may execute the program code of the following steps in the method for handling machine abnormalities: acquiring the gradient consumption time of the target machine, where the gradient consumption time represents the gradient-related time consumed by the target machine during training; determining whether the gradient consumption time satisfies a predetermined condition compared with a pre-acquired average consumption time, where the average consumption time represents the average of the gradient-related time consumed during the training by all machines in the cluster other than the target machine; and if the gradient consumption time satisfies the predetermined condition compared with the average consumption time, determining that the target machine is abnormal.
  • FIG. 10 is a structural block diagram of a computer terminal according to an embodiment of the present application.
  • the computer terminal A may include one or more (only one shown in the figure) processor 1002, memory 1004, and transmission device 1006.
  • the memory 1004 can be used to store software programs and modules, such as the program instructions/modules corresponding to the method and apparatus for handling machine abnormalities in the embodiments of the present application; the processor 1002 runs the software programs and modules stored in the memory 1004, thereby performing various function applications and data processing, that is, implementing the above method for handling machine abnormalities.
  • the memory 1004 may include a high-speed random access memory, and may also include a non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory.
  • memory 1004 can further include memory remotely located relative to the processor, which can be connected to terminal A over a network. Examples of such networks include, but are not limited to, the Internet, intranets, local area networks, mobile communication networks, and combinations thereof.
  • the transmission device 1006 described above is for receiving or transmitting data via a network.
  • Specific examples of the above network may include a wired network and a wireless network.
  • transmission device 1006 includes a Network Interface Controller (NIC) that can be connected to other network devices and routers via a network cable to communicate with the Internet or a local area network.
  • the transmission device 1006 is a Radio Frequency (RF) module for communicating with the Internet wirelessly.
  • the memory 1004 is configured to store preset action conditions, information of preset authorized users, and applications.
  • the processor 1002 can call the information and applications stored in the memory via the transmission device to perform the following steps: acquiring the message type and destination address of a message to be detected; acquiring, according to the message type, a first attack type set corresponding to the message type from a preset configuration file, and acquiring a second attack type set according to the destination address, where the second attack type set includes the attack types suffered, within a preset time period, by the device pointed to by the destination address; generating, according to the first attack type set and the second attack type set, a detection policy chain corresponding to the message to be detected; and detecting the message to be detected according to the detection policy chain.
  • the solution provided in Embodiment 5 of the present application determines whether the target machine is abnormal by comparing the gradient consumption time of the target machine with the average consumption time of all machines other than the target machine; when the target machine is abnormal, the training strategy is adjusted in time, avoiding the increased training cost caused by slow computation or communication of some machines, achieving the purpose of identifying abnormal machines in the cluster in time, and realizing the technical effect of reducing the training cost, thereby solving the technical problem of high training cost caused by slow computation or communication of some machines in the cluster.
  • FIG. 10 is only an illustration; the computer terminal can also be a terminal device such as a smart phone (such as an Android phone or an iOS phone), a tablet computer, a palmtop computer, a mobile Internet device (MID), or a PAD.
  • FIG. 10 does not limit the structure of the above electronic device.
  • computer terminal 10 may also include more or fewer components (such as a network interface, display device, etc.) than shown in FIG. 10, or have a different configuration than that shown in FIG. 10.
  • the embodiment of the present application may further provide a computer terminal, which may be any computer terminal device in a computer terminal group.
  • the foregoing computer terminal may also be replaced with a terminal device such as a mobile terminal.
  • the computer terminal may be located in at least one network device of the plurality of network devices of the computer network.
  • the computer terminal may execute the program code of the following steps in the method for adjusting the learning rate: acquiring a gradient calculated by the target machine; calculating, according to the gradient, a learning rate corresponding to the gradient; determining whether the learning rate is less than a preset threshold; if the learning rate is less than the preset threshold, stopping the update-weight operation; and if the learning rate is greater than or equal to the preset threshold, performing the update-weight operation.
  • the computer terminal can include one or more processors, memory, and transmission devices.
  • the memory can be used to store software programs and modules, such as the program instructions/modules corresponding to the method and apparatus for adjusting the learning rate in the embodiments of the present application; the processor runs the software programs and modules stored in the memory, thereby performing various function applications and data processing, that is, implementing the above method for adjusting the learning rate.
  • the memory may include a high-speed random access memory, and may also include a non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory.
  • the memory can further include memory located remotely from the processor, which can be connected to terminal A via a network. Examples of such networks include, but are not limited to, the Internet, intranets, local area networks, mobile communication networks, and combinations thereof.
  • the above transmission device is for receiving or transmitting data via a network.
  • Specific examples of the above network may include a wired network and a wireless network.
  • the transmission device includes a Network Interface Controller (NIC) that can be connected to other network devices and routers via a network cable to communicate with the Internet or a local area network.
  • the transmission device is a Radio Frequency (RF) module for communicating with the Internet wirelessly.
  • the memory is used to store preset action conditions, information of preset authorized users, and applications.
  • the processor may call the information and applications stored in the memory via the transmission device to perform the following steps: acquiring a gradient calculated by the target machine; calculating, according to the gradient, a learning rate corresponding to the gradient; determining whether the learning rate is less than a preset threshold; if the learning rate is less than the preset threshold, stopping the update-weight operation; and if the learning rate is greater than or equal to the preset threshold, performing the update-weight operation.
  • Embodiment 6 of the present application calculates the corresponding learning rate according to the gradient of the target machine, and stops the execution of the update weight operation and shortens the training time when the learning rate is less than the preset threshold.
  • the computer terminal can also be a terminal device such as a smart phone (such as an Android phone or an iOS phone), a tablet computer, a palmtop computer, a mobile Internet device (MID), or a PAD.
  • Embodiments of the present application also provide a storage medium.
  • the foregoing storage medium may be used to save the program code executed by the processing method of the machine abnormality provided in the first embodiment.
  • the foregoing storage medium may be located in any one of the computer terminal groups in the computer network, or in any one of the mobile terminal groups.
  • the storage medium is configured to store program code for performing the following steps: acquiring the gradient consumption time of the target machine, where the gradient consumption time represents the gradient-related time consumed by the target machine during training; determining whether the gradient consumption time satisfies a predetermined condition compared with a pre-acquired average consumption time, where the average consumption time represents the average of the gradient-related time consumed during the training by all machines in the cluster other than the target machine; and if the gradient consumption time satisfies the predetermined condition compared with the average consumption time, determining that the target machine is abnormal.
  • the storage medium is further configured to store program code for performing the following steps: determining whether the first time is greater than the product of the first average value and the first preset coefficient, where, if the first time is greater than the product, it is determined that the gradient consumption time satisfies the predetermined condition compared with the average consumption time, and if the first time is less than or equal to the product, it is determined that the gradient consumption time does not satisfy the predetermined condition compared with the average consumption time; and/or determining whether the second time is greater than the product of the second average value and the second preset coefficient, where, if the second time is greater than the product, it is determined that the gradient consumption time satisfies the predetermined condition compared with the average consumption time, and if the second time is less than or equal to the product, it is determined that the gradient consumption time does not satisfy the predetermined condition compared with the average consumption time.
  • the storage medium is further configured to store program code for performing the following steps: acquiring the gradient calculated by all machines except the target machine; according to the gradient, calculating the Determining a learning rate corresponding to the gradient; determining whether to perform an update weight operation according to the learning rate.
  • the storage medium is further configured to store program code for performing: determining whether the learning rate is less than a preset threshold; if the learning rate is less than the preset threshold, stopping Performing the update weight operation; if the learning rate is greater than or equal to the preset threshold, performing the update weight operation.
  • the storage medium is further configured to store program code for performing the following steps: calculating an update weight according to the learning rate, the gradient, and the historical weight, wherein the historical weight is Refers to the weight used by the target machine during the training process.
  • the storage medium is further configured to store program code for performing the following steps: sending the updated weight to the target machine and all machines other than the target machine, to instruct the target machine and all machines other than the target machine to train according to the updated weight.
  • Embodiments of the present application also provide a storage medium.
  • the foregoing storage medium may be used to store the program code executed by the method for adjusting the learning rate provided in Embodiment 2.
  • the foregoing storage medium may be located in any one of the computer terminal groups in the computer network, or in any one of the mobile terminal groups.
  • the storage medium is configured to store program code for performing the following steps: acquiring a gradient calculated by the target machine; calculating, according to the gradient, a learning rate corresponding to the gradient; determining whether the learning rate is less than a preset threshold; if the learning rate is less than the preset threshold, stopping the update-weight operation; and if the learning rate is greater than or equal to the preset threshold, performing the update-weight operation.
  • the storage medium is further configured to store program code for performing the following steps: calculating an update weight according to the learning rate, the gradient, and the historical weight, wherein the historical weight is Refers to the weight used by the target machine during the training process.
  • the foregoing storage medium may include, but is not limited to, various media that can store program code, such as a USB flash drive, a read-only memory (ROM), a random access memory (RAM), a removable hard disk, a magnetic disk, or an optical disc.
  • in the embodiments provided in the present application, it should be understood that the disclosed apparatus may be implemented in other manners.
  • the device embodiments described above are merely illustrative.
  • the division of units is only a logical functional division; multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed.
  • the mutual coupling or direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection through some interface, unit or module, and may be electrical or otherwise.
  • the units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, that is, may be located in one place, or may be distributed to multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of the embodiment.
  • each functional unit in each embodiment of the present application may be integrated into one processing unit, or each unit may exist physically separately, or two or more units may be integrated into one unit.
  • the above integrated unit can be implemented in the form of hardware or in the form of a software functional unit.
  • the integrated unit, if implemented in the form of a software functional unit and sold or used as a standalone product, may be stored in a computer-readable storage medium.
  • the computer-readable storage medium includes a number of instructions to cause a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the methods described in the embodiments of the present application.
  • the foregoing storage medium includes various media that can store program code, such as a USB flash drive, a read-only memory (ROM), a random access memory (RAM), a removable hard disk, a magnetic disk, or an optical disc.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Business, Economics & Management (AREA)
  • Quality & Reliability (AREA)
  • Strategic Management (AREA)
  • Economics (AREA)
  • Human Resources & Organizations (AREA)
  • Software Systems (AREA)
  • Computer Hardware Design (AREA)
  • Tourism & Hospitality (AREA)
  • Development Economics (AREA)
  • Marketing (AREA)
  • Operations Research (AREA)
  • Game Theory and Decision Science (AREA)
  • General Business, Economics & Management (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Mathematical Physics (AREA)
  • Artificial Intelligence (AREA)
  • Medical Informatics (AREA)
  • Computing Systems (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Fuzzy Systems (AREA)
  • Probability & Statistics with Applications (AREA)
  • Computational Linguistics (AREA)
  • Debugging And Monitoring (AREA)
  • Testing And Monitoring For Control Systems (AREA)
  • General Factory Administration (AREA)

Abstract

A method for handling machine abnormalities, and a method and apparatus for adjusting a learning rate. The method includes: acquiring the gradient consumption time of a target machine (S202), where the gradient consumption time represents the gradient-related time consumed by the target machine during training; determining whether the gradient consumption time satisfies a predetermined condition compared with a pre-acquired average consumption time (S204), where the average consumption time represents the average of the gradient-related time consumed during the training by all machines in the cluster other than the target machine; and if the gradient consumption time satisfies the predetermined condition compared with the average consumption time, determining that the target machine is abnormal (S206). The method solves the technical problem of high training cost caused by slow computation or communication of some machines in a cluster.

Description

Method for Handling Machine Abnormalities, Method for Adjusting a Learning Rate, and Apparatus

Technical Field

The present application relates to the field of the Internet, and in particular to a method for handling machine abnormalities and a method and apparatus for adjusting a learning rate.

Background

Internet companies hold large amounts of user behavior data and typically use machine learning to mine useful information from it, such as user preferences, in order to improve the user experience and the company's revenue.

The core of machine learning is minimizing a loss function (a function measuring the degree of loss or error; taking search advertising as an example, the smaller the loss function, the more likely the user is to click the advertisement). Gradient descent (the gradient is a vector, the derivative of the loss function with respect to the weights) is the most widely used method in machine learning for minimizing a loss function; because it is simple to implement and fast to compute, it is used in a wide range of optimization problems. The learning rate (usually denoted Eta), an important parameter of the weight update (the weight is a vector, which can be understood as the argument of the loss function), affects the convergence of training: if Eta is too large, each iteration steps too far and may overshoot the optimal solution; if Eta is too small, progress is too slow and convergence suffers.

At present, large-scale machine learning problems are trained on clusters containing many machines. However, machines carry different loads at different times: some compute quickly, some have light communication loads and thus communicate efficiently, but many machines are heavily loaded and compute very slowly, and some machines communicate very slowly because of low-end configurations. This makes the whole training process very slow and consumes large amounts of machine resources, leading to huge financial cost (for example, training one user-preference model may require 800 machines; if one machine costs C per hour and training takes T hours in total, the cost is 800×C×T; with C greater than 1000 and T greater than 100, a single successful training run costs at least 80 million, and if training fails midway and must restart, the cost is even higher).

No effective solution to the above problem has yet been proposed.
Summary

Embodiments of the present application provide a method for handling machine abnormalities and a method and apparatus for adjusting a learning rate, so as to solve at least the technical problem of high training cost caused by slow computation or communication of some machines in a cluster.

According to one aspect of the embodiments of the present application, a method for handling machine abnormalities is provided, including: acquiring the gradient consumption time of a target machine, where the gradient consumption time represents the gradient-related time consumed by the target machine during training; determining whether the gradient consumption time satisfies a predetermined condition compared with a pre-acquired average consumption time, where the average consumption time represents the average of the gradient-related time consumed during the training by all machines in the cluster other than the target machine; and if the gradient consumption time satisfies the predetermined condition compared with the average consumption time, determining that the target machine is abnormal.

According to another aspect of the embodiments of the present application, a method for adjusting a learning rate is provided, including: acquiring a gradient calculated by a target machine; calculating, according to the gradient, a learning rate corresponding to the gradient; determining whether the learning rate is less than a preset threshold; if the learning rate is less than the preset threshold, stopping the update-weight operation; and if the learning rate is greater than or equal to the preset threshold, performing the update-weight operation.

According to another aspect of the embodiments of the present application, an apparatus for handling machine abnormalities is provided, including: a first acquisition unit, configured to acquire the gradient consumption time of a target machine, where the gradient consumption time represents the gradient-related time consumed by the target machine during training; a judging unit, configured to determine whether the gradient consumption time satisfies a predetermined condition compared with a pre-acquired average consumption time, where the average consumption time represents the average of the gradient-related time consumed during the training by all machines in the cluster other than the target machine; and a detection unit, configured to determine that the target machine is abnormal if the gradient consumption time satisfies the predetermined condition compared with the average consumption time.

According to another aspect of the embodiments of the present application, an apparatus for adjusting a learning rate is provided, including: a second acquisition unit, configured to acquire a gradient calculated by a target machine; a calculation unit, configured to calculate, according to the gradient, a learning rate corresponding to the gradient; and a processing unit, configured to determine whether the learning rate is less than a preset threshold, stop the update-weight operation if the learning rate is less than the preset threshold, and perform the update-weight operation if the learning rate is greater than or equal to the preset threshold.

In the embodiments of the present application, whether the target machine is abnormal is determined by comparing the gradient consumption time of the target machine with the average consumption time of all machines other than the target machine. When the target machine is abnormal, the training strategy is adjusted in time, avoiding the increased training cost caused by slow computation or communication of some machines. This achieves the purpose of identifying abnormal machines in the cluster in time, realizes the technical effect of reducing the training cost, and thus solves the technical problem of high training cost caused by slow computation or communication of some machines in the cluster.
Brief Description of the Drawings

The drawings described here are provided for a further understanding of the present application and constitute a part of it; the exemplary embodiments of the present application and their descriptions are used to explain the present application and do not constitute an improper limitation on it. In the drawings:

FIG. 1 is a hardware block diagram of a computer terminal running a method for handling machine abnormalities according to an embodiment of the present application;

FIG. 2 is a flowchart of an optional method for handling machine abnormalities according to an embodiment of the present application;

FIG. 3 is a flowchart of another optional method for handling machine abnormalities according to an embodiment of the present application;

FIG. 4 is a flowchart of an optional method for adjusting a learning rate according to an embodiment of the present application;

FIG. 5 is a schematic structural diagram of an optional apparatus for handling machine abnormalities according to an embodiment of the present application;

FIG. 6 is a schematic structural diagram of another optional apparatus for handling machine abnormalities according to an embodiment of the present application;

FIG. 7 is a schematic structural diagram of an optional processing unit according to an embodiment of the present application;

FIG. 8 is a schematic structural diagram of yet another optional apparatus for handling machine abnormalities according to an embodiment of the present application;

FIG. 9 is a schematic structural diagram of an optional apparatus for adjusting a learning rate according to an embodiment of the present application;

FIG. 10 is a structural block diagram of a computer terminal according to an embodiment of the present application.
Detailed Description

To help those skilled in the art better understand the solutions of the present application, the technical solutions in the embodiments of the present application will be described clearly and completely below with reference to the drawings. Obviously, the described embodiments are only a part of the embodiments of the present application, not all of them. All other embodiments obtained by those of ordinary skill in the art based on the embodiments of the present application without creative effort shall fall within the protection scope of the present application.

It should be noted that the terms "first", "second", and the like in the specification, claims, and drawings of the present application are used to distinguish similar objects, and are not necessarily used to describe a particular order or sequence. It should be understood that data so used are interchangeable where appropriate, so that the embodiments of the present application described here can be implemented in orders other than those illustrated or described. In addition, the terms "include" and "have" and any variants of them are intended to cover non-exclusive inclusion; for example, a process, method, system, product, or device that includes a series of steps or units is not necessarily limited to the steps or units clearly listed, but may include other steps or units that are not clearly listed or that are inherent to the process, method, product, or device.
Embodiment 1

According to an embodiment of the present application, a method embodiment of a method for handling machine abnormalities is provided. It should be noted that the steps shown in the flowcharts of the drawings may be executed in a computer system such as a set of computer-executable instructions, and, although a logical order is shown in the flowcharts, in some cases the steps shown or described may be executed in an order different from the one here.

The method embodiment provided in Embodiment 1 of the present application may be executed in a mobile terminal, a computer terminal, or a similar computing device. Taking a computer terminal as an example, FIG. 1 is a hardware block diagram of a computer terminal for the method for handling machine abnormalities according to an embodiment of the present application. As shown in FIG. 1, the computer terminal 10 may include one or more (only one is shown in the figure) processors 102 (the processor 102 may include but is not limited to a processing device such as a microprocessor MCU or a programmable logic device FPGA), a memory 104 for storing data, and a transmission device 106 for communication. Those of ordinary skill in the art can understand that the structure shown in FIG. 1 is only schematic and does not limit the structure of the above electronic device. For example, the computer terminal 10 may include more or fewer components than shown in FIG. 1, or have a configuration different from that shown in FIG. 1.

The memory 104 may be used to store software programs and modules of application software, such as the program instructions/modules corresponding to the method for handling machine abnormalities in the embodiments of the present application. By running the software programs and modules stored in the memory 104, the processor 102 executes various function applications and data processing, that is, implements the above method for handling machine abnormalities. The memory 104 may include a high-speed random access memory, and may also include a non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some instances, the memory 104 may further include memory located remotely from the processor 102, and such remote memory may be connected to the computer terminal 10 over a network. Examples of the above network include, but are not limited to, the Internet, an intranet, a local area network, a mobile communication network, and combinations thereof.

The transmission device 106 is used to receive or send data via a network. A specific example of the above network may include a wireless network provided by the communication provider of the computer terminal 10. In one example, the transmission device 106 includes a network interface controller (NIC) that can be connected to other network devices via a base station so as to communicate with the Internet. In one example, the transmission device 106 may be a radio frequency (RF) module, which is used to communicate with the Internet wirelessly.
In the above running environment, the present application provides a method for handling machine abnormalities as shown in FIG. 2. FIG. 2 is a flowchart of the method for handling machine abnormalities according to Embodiment 1 of the present application.

Step S202: acquire the gradient consumption time of a target machine.

In step S202, the gradient consumption time represents the gradient-related time consumed by the target machine during training. In this embodiment, the gradient consumption time includes a first time consumed by the target machine to compute the gradient and/or a second time consumed to send the gradient.

Step S204: determine whether the gradient consumption time satisfies a predetermined condition compared with a pre-acquired average consumption time.

In step S204, the average consumption time represents the average of the gradient-related time consumed during training by all machines in the cluster other than the target machine. In this embodiment, the average consumption time includes a first average of the time consumed by all machines in the system other than the target machine to compute the gradient and/or a second average of the time consumed to send the gradient.

Optionally, determining whether the gradient consumption time satisfies the predetermined condition compared with the pre-acquired average consumption time includes: determining whether the first time is greater than the product of the first average and a first preset coefficient, where, if the first time is greater than the product, it is determined that the gradient consumption time satisfies the predetermined condition compared with the average consumption time, and if the first time is less than or equal to the product, it is determined that the predetermined condition is not satisfied; and/or determining whether the second time is greater than the product of the second average and a second preset coefficient, where, if the second time is greater than the product, it is determined that the gradient consumption time satisfies the predetermined condition compared with the average consumption time, and if the second time is less than or equal to the product, it is determined that the predetermined condition is not satisfied.

Step S206: if the gradient consumption time satisfies the predetermined condition compared with the average consumption time, determine that the target machine is abnormal.

In step S206, if the first time is greater than the product of the first average and the first preset coefficient, and/or the second time is greater than the product of the second average and the second preset coefficient, the target machine is determined to be abnormal.

The method for handling machine abnormalities provided in this embodiment makes the determination based on, but not limited to, two dimensions: the first time each machine spends computing the gradient and the second time it spends sending the gradient. The predetermined condition adopted is: if the target machine still has not sent out its gradient after exceeding the average gradient sending time of all machines multiplied by the second preset coefficient (which may be, for example, 2), the target machine is regarded as abnormal (that is, supposing that among 10 machines, 9 machines send out their gradients within 5 seconds, but the target machine still has not sent out its gradient after more than 10 seconds, the target machine is regarded as a slow machine); and/or, if the target machine still has not finished computing its gradient after exceeding the average gradient computation time of all machines multiplied by the first preset coefficient (which may be, for example, 3), the target machine is regarded as abnormal.
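The two-dimensional check described above can be sketched as follows. This is a minimal illustrative sketch: the function name, argument layout, and default coefficients (3 for computation, 2 for sending, matching the examples in the text) are assumptions, not part of the claimed method.

```python
def is_abnormal(compute_time, send_time,
                other_compute_times, other_send_times,
                compute_coeff=3.0, send_coeff=2.0):
    """Return True if the target machine's gradient consumption time
    exceeds the cluster average times the preset coefficient."""
    avg_compute = sum(other_compute_times) / len(other_compute_times)
    avg_send = sum(other_send_times) / len(other_send_times)
    # First condition: computing the gradient took too long.
    if compute_time > compute_coeff * avg_compute:
        return True
    # Second condition: sending the gradient took too long.
    if send_time > send_coeff * avg_send:
        return True
    return False
```

With the 5-second example from the text, a machine that has not sent its gradient after more than 10 seconds (2 × 5) would be flagged as a slow machine.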
Once the target machine is determined to be abnormal, the update-weight operation can be performed with the gradients of the machines that have already returned, without waiting for the gradients of the abnormal machines, and then all machines are notified to enter the next iteration. In this way there is no need to wait for the abnormal machines, which saves a large amount of time; the specific implementation will be described in detail in subsequent embodiments and is not repeated here.

It can be seen from the above that the solution provided in Embodiment 1 of the present application determines whether the target machine is abnormal by comparing the gradient consumption time of the target machine with the average consumption time of all machines other than the target machine. When the target machine is abnormal, the training strategy is adjusted in time, avoiding the increased training cost caused by slow computation or communication of some machines. This achieves the purpose of identifying abnormal machines in the cluster in time, realizes the technical effect of reducing the training cost, and thus solves the technical problem of high training cost caused by slow computation or communication of some machines in the cluster.
As an optional implementation, as shown in FIG. 3, after the target machine is determined to be abnormal, the method for handling machine abnormalities in this embodiment may further include:

Step S302: acquire the gradients calculated by all machines other than the target machine.

In step S302, after the target machine is determined to be abnormal, the method no longer waits for the gradient of the abnormal target machine and directly acquires the gradients of the machines that have already returned (that is, the gradients calculated by all machines other than the target machine).

Step S304: calculate, according to the gradient, the learning rate corresponding to the gradient.

In step S304, the method for calculating the learning rate corresponding to the gradient may include: calculating the learning rate by the formula Eta(i) = A × (gradient in dimension i) / (B + sqrt(sum((gradient in dimension i) × (gradient in dimension i)))), where Eta(i) is the learning rate, A is a first preset coefficient, B is a second preset coefficient, the gradient is a vector composed of n dimension-i gradients, n is the number of dimensions of the gradient, and 0 < i ≤ n.

Here, sum((gradient in dimension i) × (gradient in dimension i)) means summing the squares of the dimension-i gradient of the current iteration and of the M iterations before the current one, where M may be, for example, 20, which is not limited in this embodiment.
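The per-dimension learning-rate formula above can be sketched as follows. This is an illustrative reading of the formula, keeping a sliding window of the current and previous M squared gradients per dimension; the class name, its structure, and the default parameter values are assumptions.

```python
import math
from collections import deque

class PerDimensionEta:
    """Sketch of Eta(i) = A * g_i / (B + sqrt(sum of g_i^2 over the
    current iteration and the previous `window` iterations))."""

    def __init__(self, n_dims, A=1.0, B=1.0, window=20):
        self.A, self.B = A, B
        # One history per dimension: current round plus `window` earlier rounds.
        self.history = [deque(maxlen=window + 1) for _ in range(n_dims)]

    def rates(self, gradient):
        etas = []
        for i, g in enumerate(gradient):
            self.history[i].append(g * g)
            denom = self.B + math.sqrt(sum(self.history[i]))
            etas.append(self.A * g / denom)
        return etas
```

Because each dimension accumulates its own squared-gradient history, dimensions with large recent gradients get smaller rates, which is what allows a different learning rate per dimension.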
Step S306: determine, according to the learning rate, whether to perform the update-weight operation.

In step S306, determining whether to perform the update-weight operation according to the learning rate includes: determining whether the learning rate is less than a preset threshold; if the learning rate is less than the preset threshold, stopping the update-weight operation; and if the learning rate is greater than or equal to the preset threshold, performing the update-weight operation.

Optionally, the preset threshold in this embodiment may be 1e-5, that is, 1 × 10⁻⁵.

Further, performing the update-weight operation includes: calculating an updated weight according to the learning rate, the gradient, and a historical weight, where the historical weight is the weight used by the target machine during training. Optionally, the updated weight is calculated by the formula: updated weight = historical weight + (-learning rate × gradient).

Here, the historical weight is the weight used by the target machine in the current iteration, and the updated weight is the weight to be used by the machines in the next iteration.
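The update formula above, applied per dimension with the per-dimension learning rates, can be sketched as a one-line rule; the function and argument names are illustrative.

```python
def update_weights(historical_weights, learning_rates, gradient):
    """updated weight = historical weight + (-learning rate * gradient),
    applied independently in each dimension."""
    return [w - eta * g
            for w, eta, g in zip(historical_weights, learning_rates, gradient)]
```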
As an optional implementation, after the update-weight operation is performed, the method for handling machine abnormalities in this embodiment may further include:

Step S10: send the updated weight to the target machine and all machines other than the target machine, to instruct the target machine and all machines other than the target machine to train according to the updated weight.

In step S10, after the target machine is determined to be abnormal, the method does not wait for the gradient returned by the target machine; it performs the update-weight operation directly with the gradients of the machines that have already returned, then sends the updated weight to all machines and notifies them to enter the next iteration, thus saving a large amount of time.

The method for handling machine abnormalities in this embodiment first detects abnormal machines in the cluster and works around them, avoiding situations such as training aborting midway or waiting for slow machines, thereby accelerating training and saving cost. Second, it makes full use of information such as the gradient and the iteration round to adjust the learning rate automatically, using different learning rates for the weights of different dimensions, which achieves better convergence in every dimension, further accelerating training and saving cost.

It can be seen from the above that, in the prior art, many machines in a cluster are heavily loaded and compute very slowly, and some machines communicate very slowly because of low-end configurations, which makes the whole training process very slow, consumes large amounts of machine resources, and leads to huge financial cost. The present application proposes a method for handling machine abnormalities that determines whether the target machine is abnormal by comparing the gradient consumption time of the target machine with the average consumption time of all machines other than the target machine. When the target machine is abnormal, the training strategy is adjusted in time, avoiding the increased training cost caused by slow computation or communication of some machines, achieving the purpose of identifying abnormal machines in the cluster in time, and realizing the technical effect of reducing the training cost.
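Putting the pieces together, one possible server-side round under this scheme (gather gradients only from machines judged normal, combine them, update the weights, broadcast the result so no round waits on a straggler) might look like the following sketch. The function names, the report layout, and the use of a simple average to combine the collected gradients are all assumptions for illustration; the patent only states that the update is performed with the gradients already returned.

```python
def run_round(server_weights, reports, is_abnormal_fn, eta_fn):
    """One iteration. `reports` maps machine id -> (gradient, compute_time,
    send_time); `eta_fn` maps a gradient vector to per-dimension rates."""
    # Skip machines flagged as abnormal instead of waiting for them.
    gradients = [g for mid, (g, ct, st) in reports.items()
                 if not is_abnormal_fn(mid, ct, st)]
    if not gradients:
        return server_weights  # nothing usable this round
    n = len(gradients)
    # Combine the returned gradients (simple average, an assumption here).
    avg_grad = [sum(dims) / n for dims in zip(*gradients)]
    etas = eta_fn(avg_grad)
    new_weights = [w - e * g for w, e, g in zip(server_weights, etas, avg_grad)]
    return new_weights  # broadcast to all machines for the next iteration
```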
Embodiment 2

According to an embodiment of the present application, a method embodiment of a method for adjusting a learning rate is also provided. It should be noted that the steps shown in the flowcharts of the drawings may be executed in a computer system such as a set of computer-executable instructions, and, although a logical order is shown in the flowcharts, in some cases the steps shown or described may be executed in an order different from the one here.

The present application provides a method for adjusting a learning rate as shown in FIG. 4. FIG. 4 is a flowchart of the method for adjusting a learning rate according to Embodiment 2 of the present application.

Step S402: acquire a gradient calculated by a target machine.

In step S402, the gradient is the value obtained by differentiating the loss function; a loss function is a function that maps an event (an element of a sample space) onto a real number expressing the economic cost or opportunity cost associated with the event.

Step S404: calculate, according to the gradient, the learning rate corresponding to the gradient.

In step S404, calculating the learning rate corresponding to the gradient includes: calculating the learning rate by the formula Eta(i) = A × (gradient in dimension i) / (B + sqrt(sum((gradient in dimension i) × (gradient in dimension i)))), where Eta(i) is the learning rate, A is a first preset coefficient, B is a second preset coefficient, the gradient is a vector composed of n dimension-i gradients, n is the number of dimensions of the gradient, and 0 < i ≤ n.

Here, sum((gradient in dimension i) × (gradient in dimension i)) means summing the squares of the dimension-i gradient of the current iteration and of the M iterations before the current one, where M may be, for example, 20, which is not limited in this embodiment.

Step S406: determine whether the learning rate is less than a preset threshold.

In step S406, the preset threshold may be 1e-5, that is, 1 × 10⁻⁵.

Step S408: if the learning rate is less than the preset threshold, stop the update-weight operation.

Step S410: if the learning rate is greater than or equal to the preset threshold, perform the update-weight operation.

In step S410, performing the update-weight operation includes: calculating an updated weight according to the learning rate, the gradient, and a historical weight, where the historical weight is the weight used by the target machine during training. Optionally, the updated weight is calculated by the formula: updated weight = historical weight + (-learning rate × gradient).
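The threshold gate of steps S406 to S410 can be sketched as a single predicate. One detail the text leaves open is how a vector of per-dimension rates is compared against the scalar threshold; the sketch below takes the reading that updating stops only when every dimension's rate has shrunk below the threshold, which is an assumption, as are the names.

```python
def should_update(learning_rates, threshold=1e-5):
    """True if the update-weight operation should still be performed,
    i.e. at least one per-dimension rate is at or above the threshold."""
    return any(abs(eta) >= threshold for eta in learning_rates)
```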
Further, after the update-weight operation is performed, the method may further include: sending the updated weight to the target machine to instruct the target machine to train according to the updated weight.

The method of this embodiment makes full use of information such as the gradient and the iteration round to adjust the learning rate automatically, using different learning rates for the weights of different dimensions, which achieves better convergence in every dimension, further accelerating training and saving cost.

It can be seen from the above that the solution provided in Embodiment 2 of the present application calculates the corresponding learning rate according to the gradient of the target machine, and stops the update-weight operation when the learning rate is less than the preset threshold, shortening the training time and thereby realizing the technical effect of reducing the training cost.

It should be noted that, for brevity, the foregoing method embodiments are all described as a series of action combinations, but those skilled in the art should know that the present application is not limited by the described order of actions, because according to the present application some steps may be performed in other orders or simultaneously. Furthermore, those skilled in the art should also know that the embodiments described in the specification are all preferred embodiments, and the actions and modules involved are not necessarily required by the present application.

Through the description of the above implementations, those skilled in the art can clearly understand that the methods according to the above embodiments may be implemented by software plus a necessary general-purpose hardware platform, or of course by hardware, but in many cases the former is the better implementation. Based on this understanding, the part of the technical solutions of the present application that is essential or that contributes to the prior art may be embodied in the form of a software product. The computer software product is stored in a storage medium (such as a ROM/RAM, a magnetic disk, or an optical disc) and includes a number of instructions to cause a terminal device (which may be a mobile phone, a computer, a server, a network device, or the like) to perform the methods described in the embodiments of the present application.
Embodiment 3

According to an embodiment of the present application, an apparatus embodiment for implementing the above method for handling machine abnormalities is also provided. The apparatus provided in the above embodiments of the present application may run on a computer terminal.

FIG. 5 is a schematic structural diagram of an apparatus for handling machine abnormalities according to an embodiment of the present application.

As shown in FIG. 5, the apparatus for handling machine abnormalities may include a first acquisition unit 502, a judging unit 504, and a detection unit 506.

The first acquisition unit 502 is configured to acquire the gradient consumption time of a target machine, where the gradient consumption time represents the gradient-related time consumed by the target machine during training; the judging unit 504 is configured to determine whether the gradient consumption time satisfies a predetermined condition compared with a pre-acquired average consumption time, where the average consumption time represents the average of the gradient-related time consumed during the training by all machines in the cluster other than the target machine; the detection unit 506 is configured to determine that the target machine is abnormal if the gradient consumption time satisfies the predetermined condition compared with the average consumption time.

It can be seen from the above that the solution provided in Embodiment 3 of the present application determines whether the target machine is abnormal by comparing the gradient consumption time of the target machine with the average consumption time of all machines other than the target machine. When the target machine is abnormal, the training strategy is adjusted in time, avoiding the increased training cost caused by slow computation or communication of some machines. This achieves the purpose of identifying abnormal machines in the cluster in time, realizes the technical effect of reducing the training cost, and thus solves the technical problem of high training cost caused by slow computation or communication of some machines in the cluster.

It should be noted here that the first acquisition unit 502, the judging unit 504, and the detection unit 506 correspond to steps S202 to S206 in Embodiment 1; the three modules are the same as the corresponding steps in the examples and application scenarios they implement, but are not limited to the content disclosed in Embodiment 1. It should be noted that the above modules, as a part of the apparatus, may run in the computer terminal 10 provided in Embodiment 1, and may be implemented by software or by hardware.

Optionally, the gradient consumption time includes a first time consumed by the target machine to compute the gradient and/or a second time consumed to send the gradient; the average consumption time includes a first average of the time consumed by all machines in the system other than the target machine to compute the gradient and/or a second average of the time consumed to send the gradient.

Optionally, the judging unit 504 is configured to determine whether the gradient consumption time satisfies the predetermined condition compared with the pre-acquired average consumption time by performing the following steps: determining whether the first time is greater than the product of the first average and a first preset coefficient, where, if the first time is greater than the product, it is determined that the gradient consumption time satisfies the predetermined condition compared with the average consumption time, and if the first time is less than or equal to the product, it is determined that the predetermined condition is not satisfied; and/or determining whether the second time is greater than the product of the second average and a second preset coefficient, where, if the second time is greater than the product, it is determined that the gradient consumption time satisfies the predetermined condition compared with the average consumption time, and if the second time is less than or equal to the product, it is determined that the predetermined condition is not satisfied.

Optionally, as shown in FIG. 6, the apparatus for handling machine abnormalities may further include: a second acquisition unit 602, a calculation unit 604, and a processing unit 606.

The second acquisition unit 602 is configured to acquire the gradients calculated by all machines other than the target machine; the calculation unit 604 is configured to calculate, according to the gradient, the learning rate corresponding to the gradient; the processing unit 606 is configured to determine, according to the learning rate, whether to perform the update-weight operation.

It should be noted here that the second acquisition unit 602, the calculation unit 604, and the processing unit 606 correspond to steps S302 to S306 in Embodiment 1; the three modules are the same as the corresponding steps in the examples and application scenarios they implement, but are not limited to the content disclosed in Embodiment 1. The above modules, as a part of the apparatus, may run in the computer terminal 10 provided in Embodiment 1, and may be implemented by software or by hardware.

Optionally, the calculation unit 604 is configured to calculate the learning rate corresponding to the gradient by the formula Eta(i) = A × (gradient in dimension i) / (B + sqrt(sum((gradient in dimension i) × (gradient in dimension i)))), where Eta(i) is the learning rate, A is a first preset coefficient, B is a second preset coefficient, the gradient is a vector composed of n dimension-i gradients, n is the number of dimensions of the gradient, and 0 < i ≤ n.

Optionally, as shown in FIG. 7, the processing unit 606 includes a judging module 702 and an execution module 704.

The judging module 702 is configured to determine whether the learning rate is less than a preset threshold; the execution module 704 is configured to stop the update-weight operation if the learning rate is less than the preset threshold, and to perform the update-weight operation if the learning rate is greater than or equal to the preset threshold.

Optionally, the execution module 704 is configured to perform the update-weight operation by: calculating an updated weight according to the learning rate, the gradient, and a historical weight, where the historical weight is the weight used by the target machine during training.

Optionally, the execution module 704 is configured to calculate the updated weight by the formula: updated weight = historical weight + (-learning rate × gradient).

Optionally, as shown in FIG. 8, the apparatus for handling machine abnormalities may further include a sending unit 802.

The sending unit 802 is configured to send the updated weight to the target machine and all machines other than the target machine, to instruct the target machine and all machines other than the target machine to train according to the updated weight.
Embodiment 4

According to an embodiment of the present application, an apparatus embodiment for implementing the above method for adjusting a learning rate is also provided. The apparatus provided in the above embodiments of the present application may run on a computer terminal.

FIG. 9 is a schematic structural diagram of an apparatus for adjusting a learning rate according to an embodiment of the present application.

As shown in FIG. 9, the apparatus for adjusting the learning rate may include: a second acquisition unit 902, a calculation unit 904, and a processing unit 906.

The second acquisition unit 902 is configured to acquire a gradient calculated by a target machine; the calculation unit 904 is configured to calculate, according to the gradient, the learning rate corresponding to the gradient; the processing unit 906 is configured to determine whether the learning rate is less than a preset threshold, stop the update-weight operation if the learning rate is less than the preset threshold, and perform the update-weight operation if the learning rate is greater than or equal to the preset threshold.

It can be seen from the above that the solution provided in Embodiment 4 of the present application calculates the corresponding learning rate according to the gradient of the target machine, and stops the update-weight operation when the learning rate is less than the preset threshold, shortening the training time and thereby realizing the technical effect of reducing the training cost.

Optionally, the calculation unit 904 is configured to calculate the learning rate corresponding to the gradient by the formula Eta(i) = A × (gradient in dimension i) / (B + sqrt(sum((gradient in dimension i) × (gradient in dimension i)))), where Eta(i) is the learning rate, A is a first preset coefficient, B is a second preset coefficient, the gradient is a vector composed of n dimension-i gradients, n is the number of dimensions of the gradient, and 0 < i ≤ n.
实施例5
本申请的实施例可以提供一种计算机终端,该计算机终端可以是计算机终端群中的任意一个计算机终端设备。可选地,在本实施例中,上述计算机终端也可以替换为移动终端等终端设备。
可选地,在本实施例中,上述计算机终端可以位于计算机网络的多个网络设备中的至少一个网络设备。
在本实施例中,上述计算机终端可以执行机器异常的处理方法中以下步骤的程序代码:获取目标机器的梯度消耗时间,其中,所述梯度消耗时间用于表示所述目标机器在训练过程中消耗的与梯度相关的时间;判断所述梯度消耗时间与预先获取的消耗时间均值相比,是否满足预定条件,其中,所述消耗时间均值用于表示集群内的除所述目标机器以外的所有机器,在所述训练过程中消耗的与所述梯度相关的时间的平均值;若所述梯度消耗时间与所述消耗时间均值相比满足所述预定条件,确定所述目标机器异常。
可选地,图10是根据本申请实施例的一种计算机终端的结构框图。如图10所示,该计算机终端A可以包括:一个或多个(图中仅示出一个)处理器1002、存储器1004、以及传输装置1006。
其中,存储器1004可用于存储软件程序以及模程序块,如本申请实施例中的机器异常的处理方法和装置对应的程序指令/模程序块,处理器1002通过运行存储在存储器1004内的软件程序以及模程序块,从而执行各种功能应用以及数据处理,即实现上述的机器异常的处理方法。存储器1004可包括高速随机存储器,还可以包括非易失性存储器,如一个或者多 个磁性存储装置、闪存、或者其他非易失性固态存储器。在一些实例中,存储器1004可进一步包括相对于处理器远程设置的存储器,这些远程存储器可以通过网络连接至终端A。上述网络的实例包括但不限于互联网、企业内部网、局域网、移动通信网及其组合。
上述的传输装置1006用于经由一个网络接收或者发送数据。上述的网络具体实例可包括有线网络及无线网络。在一个实例中,传输装置1006包括一个网络适配器(Network Interface Controller,NIC),其可通过网线与其他网络设备与路由器相连从而可与互联网或局域网进行通讯。在一个实例中,传输装置1006为射频(Radio Frequency,RF)模块,其用于通过无线方式与互联网进行通讯。
其中,具体地,存储器1004用于存储预设动作条件和预设权限用户的信息、以及应用程序。
处理器1002可以通过传输装置调用存储器存储的信息及应用程序,以执行下述步骤:获取待检测报文的报文类型以及目的地址;根据所述报文类型,从预设的配置文件中获取与所述报文类型对应的第一攻击类型集合,并根据所述目的地址获取第二攻击类型集合,其中,所述第二攻击类型集合包含所述目的地址所指向的设备在预设时间段内受到的攻击类型;根据所述第一攻击类型集合和所述第二攻击类型集合,生成对应于所述待检测报文的检测策略链;依据所述检测策略链,检测所述待检测报文。
As can be seen from the above, in the solution provided by Embodiment 5 of the present application, whether the target machine is abnormal is determined by comparing the target machine's gradient consumption time with the mean consumption time of all machines other than the target machine. When the target machine is abnormal, the training strategy is adjusted in time, avoiding the increased training cost caused by some machines computing or communicating slowly. This achieves the purpose of identifying abnormal machines in the cluster in time, realizes the technical effect of reducing the training cost, and thereby solves the technical problem of high training cost caused by the slow computation or communication of some machines in the cluster.
A person of ordinary skill in the art can understand that the structure shown in FIG. 10 is merely illustrative. The computer terminal may also be a terminal device such as a smartphone (e.g., an Android or iOS phone), a tablet computer, a palmtop computer, a mobile Internet device (MID), or a PAD. FIG. 10 does not limit the structure of the above electronic apparatus. For example, the computer terminal 10 may further include more or fewer components than shown in FIG. 10 (such as a network interface or a display apparatus), or have a configuration different from that shown in FIG. 10.
A person of ordinary skill in the art can understand that all or some of the steps in the various methods of the above embodiments may be completed by a program instructing hardware related to a terminal device. The program may be stored in a computer-readable storage medium, and the storage medium may include: a flash drive, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disc, or the like.
Embodiment 6
An embodiment of the present application may further provide a computer terminal, which may be any computer terminal device in a group of computer terminals. Optionally, in this embodiment, the computer terminal may also be replaced by a terminal device such as a mobile terminal.
Optionally, in this embodiment, the computer terminal may be located on at least one of a plurality of network devices in a computer network.
In this embodiment, the computer terminal may execute program code for the following steps of the method for adjusting a learning rate: acquiring a gradient computed by a target machine; computing, from the gradient, a learning rate corresponding to the gradient; determining whether the learning rate is less than a preset threshold; if the learning rate is less than the preset threshold, stopping performing an update-weight operation; and if the learning rate is greater than or equal to the preset threshold, performing the update-weight operation.
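The threshold logic described above can be sketched as a small function. Comparing the magnitude of every per-dimension rate against a single threshold, and all constants below, are illustrative assumptions rather than details fixed by the specification.

```python
def adjust_and_maybe_update(weights, gradient, squared_sums,
                            threshold=1e-8, A=0.01, B=1e-6):
    """Compute per-dimension learning rates from the gradient; skip the
    update-weight operation when the rates fall below the threshold."""
    etas = [A * g / (B + s ** 0.5) for g, s in zip(gradient, squared_sums)]
    if all(abs(eta) < threshold for eta in etas):
        return weights, False                 # update-weight operation stopped
    new_weights = [w + (-eta * g) for w, eta, g in zip(weights, etas, gradient)]
    return new_weights, True                  # update-weight operation performed

# A zero gradient yields zero rates, so the update is skipped:
print(adjust_and_maybe_update([1.0], [0.0], [1.0]))  # -> ([1.0], False)
```

Skipping the update when the rate is negligible is what shortens training time in this scheme: weights that would barely move are simply left untouched.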
The computer terminal may include: one or more processors, a memory, and a transmission apparatus.
The memory may be used to store software programs and modules, such as the program instructions/modules corresponding to the method and apparatus for adjusting a learning rate in the embodiments of the present application. By running the software programs and modules stored in the memory, the processor executes various functional applications and data processing, i.e., implements the above method for adjusting a learning rate. The memory may include high-speed random access memory, and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some examples, the memory may further include memory remotely located relative to the processor, and such remote memory may be connected to the terminal over a network. Examples of the network include, but are not limited to, the Internet, an intranet, a local area network, a mobile communication network, and combinations thereof.
The transmission apparatus is used to receive or send data via a network. Specific examples of the network may include wired networks and wireless networks. In one example, the transmission apparatus includes a network interface controller (NIC), which can be connected to other network devices and a router via a network cable so as to communicate with the Internet or a local area network. In another example, the transmission apparatus is a radio frequency (RF) module, which is used to communicate with the Internet wirelessly.
Specifically, the memory is used to store preset action conditions, information of preset authorized users, and application programs.
The processor may call, via the transmission apparatus, the information and application programs stored in the memory to execute the following steps: acquiring a gradient computed by a target machine; computing, from the gradient, a learning rate corresponding to the gradient; determining whether the learning rate is less than a preset threshold; if the learning rate is less than the preset threshold, stopping performing an update-weight operation; and if the learning rate is greater than or equal to the preset threshold, performing the update-weight operation.
As can be seen from the above, in the solution provided by Embodiment 6 of the present application, the learning rate corresponding to the gradient of the target machine is computed from that gradient, and the update-weight operation is stopped when the learning rate is less than the preset threshold. This shortens the training time and thereby achieves the technical effect of reducing the training cost.
A person of ordinary skill in the art can understand that the computer terminal may also be a terminal device such as a smartphone (e.g., an Android or iOS phone), a tablet computer, a palmtop computer, a mobile Internet device (MID), or a PAD.
A person of ordinary skill in the art can understand that all or some of the steps in the various methods of the above embodiments may be completed by a program instructing hardware related to a terminal device. The program may be stored in a computer-readable storage medium, and the storage medium may include: a flash drive, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disc, or the like.
Embodiment 7
An embodiment of the present application further provides a storage medium. Optionally, in this embodiment, the storage medium may be used to store the program code executed by the method for handling a machine exception provided in Embodiment 1 above.
Optionally, in this embodiment, the storage medium may be located in any computer terminal in a group of computer terminals in a computer network, or in any mobile terminal in a group of mobile terminals.
Optionally, in this embodiment, the storage medium is configured to store program code for executing the following steps: acquiring a gradient consumption time of a target machine, wherein the gradient consumption time represents gradient-related time consumed by the target machine during a training process; determining whether the gradient consumption time, compared with a pre-acquired mean consumption time, satisfies a predetermined condition, wherein the mean consumption time represents an average of the gradient-related time consumed during the training process by all machines in the cluster other than the target machine; and if the gradient consumption time satisfies the predetermined condition compared with the mean consumption time, determining that the target machine is abnormal.
Optionally, in this embodiment, the storage medium is further configured to store program code for executing the following steps: determining whether the first time is greater than a product of the first average and a first preset coefficient, wherein if the first time is greater than the product of the first average and the first preset coefficient, the gradient consumption time is determined to satisfy the predetermined condition compared with the mean consumption time, and if the first time is less than or equal to the product of the first average and the first preset coefficient, the gradient consumption time is determined not to satisfy the predetermined condition compared with the mean consumption time; and/or, determining whether the second time is greater than a product of the second average and a second preset coefficient, wherein if the second time is greater than the product of the second average and the second preset coefficient, the gradient consumption time is determined to satisfy the predetermined condition compared with the mean consumption time, and if the second time is less than or equal to the product of the second average and the second preset coefficient, the gradient consumption time is determined not to satisfy the predetermined condition compared with the mean consumption time.
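The "and/or" condition above can be sketched as a single predicate. The coefficient values are illustrative assumptions; the specification only requires that each time be compared with the product of the corresponding cluster average and a preset coefficient.

```python
def is_abnormal(first_time, second_time, first_avg, second_avg,
                first_coeff=2.0, second_coeff=2.0):
    """Flag the target machine when its gradient-computation time exceeds
    first_coeff * cluster average and/or its gradient-send time exceeds
    second_coeff * cluster average (coefficient values are illustrative)."""
    return (first_time > first_avg * first_coeff
            or second_time > second_avg * second_coeff)

# 5s of computation against a 2s cluster average (coefficient 2.0) -> abnormal:
print(is_abnormal(5.0, 1.0, 2.0, 2.0))  # -> True
```

Either branch alone is enough to flag the machine, matching the "and/or" wording: a machine that computes quickly but sends its gradient slowly is still treated as abnormal.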
Optionally, in this embodiment, the storage medium is further configured to store program code for executing the following steps: acquiring the gradients computed by all machines other than the target machine; computing, from the gradient, a learning rate corresponding to the gradient; and determining, according to the learning rate, whether to perform an update-weight operation.
Optionally, in this embodiment, the storage medium is further configured to store program code for executing the following step: computing the learning rate by the formula Eta(i) = A × (i-th dimension gradient) / (B + sqrt(sum((i-th dimension gradient) × (i-th dimension gradient)))), wherein Eta(i) is the learning rate, A is a first preset coefficient, B is a second preset coefficient, the gradient is a vector composed of the n i-th dimension gradients, n is the number of dimensions of the gradient, and 0 < i ≤ n.
Optionally, in this embodiment, the storage medium is further configured to store program code for executing the following steps: determining whether the learning rate is less than a preset threshold; if the learning rate is less than the preset threshold, stopping performing the update-weight operation; and if the learning rate is greater than or equal to the preset threshold, performing the update-weight operation.
Optionally, in this embodiment, the storage medium is further configured to store program code for executing the following step: computing an updated weight from the learning rate, the gradient, and a historical weight, wherein the historical weight refers to the weight used by the target machine during the training process.
Optionally, in this embodiment, the storage medium is further configured to store program code for executing the following step: computing the updated weight by the formula: updated weight = historical weight + (-learning rate × gradient).
Optionally, in this embodiment, the storage medium is further configured to store program code for executing the following step: sending the updated weight to the target machine and to all machines other than the target machine, to instruct the target machine and all machines other than the target machine to train according to the updated weight.
Embodiment 8
An embodiment of the present application further provides a storage medium. Optionally, in this embodiment, the storage medium may be used to store the program code executed by the method for adjusting a learning rate provided in Embodiment 2 above.
Optionally, in this embodiment, the storage medium may be located in any computer terminal in a group of computer terminals in a computer network, or in any mobile terminal in a group of mobile terminals.
Optionally, in this embodiment, the storage medium is configured to store program code for executing the following steps: acquiring a gradient computed by a target machine; computing, from the gradient, a learning rate corresponding to the gradient; determining whether the learning rate is less than a preset threshold; if the learning rate is less than the preset threshold, stopping performing an update-weight operation; and if the learning rate is greater than or equal to the preset threshold, performing the update-weight operation.
Optionally, in this embodiment, the storage medium is further configured to store program code for executing the following step: computing the learning rate by the formula Eta(i) = A × (i-th dimension gradient) / (B + sqrt(sum((i-th dimension gradient) × (i-th dimension gradient)))), wherein Eta(i) is the learning rate, A is a first preset coefficient, B is a second preset coefficient, the gradient is a vector composed of the n i-th dimension gradients, n is the number of dimensions of the gradient, and 0 < i ≤ n.
Optionally, in this embodiment, the storage medium is further configured to store program code for executing the following step: computing an updated weight from the learning rate, the gradient, and a historical weight, wherein the historical weight refers to the weight used by the target machine during training.
Optionally, in this embodiment, the storage medium is further configured to store program code for executing the following step: computing the updated weight by the formula: updated weight = historical weight + (-learning rate × gradient).
Optionally, in this embodiment, the storage medium may include, but is not limited to, various media that can store program code, such as a USB flash drive, a read-only memory (ROM), a random access memory (RAM), a removable hard disk, a magnetic disk, or an optical disc.
Optionally, for specific examples in this embodiment, reference may be made to the examples described in Embodiment 1 above, which are not repeated here.
The serial numbers of the above embodiments of the present application are for description only and do not represent the superiority or inferiority of the embodiments.
In the above embodiments of the present application, the description of each embodiment has its own emphasis; for a part not detailed in one embodiment, reference may be made to the related descriptions of other embodiments.
In the several embodiments provided in the present application, it should be understood that the disclosed apparatus may be implemented in other ways. The apparatus embodiments described above are merely illustrative; for example, the division into units is only a division by logical function, and in actual implementation there may be other divisions, e.g., multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed. Furthermore, the mutual couplings, direct couplings, or communication connections shown or discussed may be indirect couplings or communication connections through some interfaces, units, or modules, and may be electrical or take other forms.
The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; they may be located in one place or distributed across multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of the present application may be integrated into one processing unit, each unit may exist physically alone, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present application, in essence or in the part contributing to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to execute all or some of the steps of the methods described in the embodiments of the present application. The aforementioned storage medium includes various media that can store program code, such as a USB flash drive, a read-only memory (ROM), a random access memory (RAM), a removable hard disk, a magnetic disk, or an optical disc.
The above are only preferred implementations of the present application. It should be noted that a person of ordinary skill in the art may make several improvements and refinements without departing from the principles of the present application, and such improvements and refinements shall also fall within the protection scope of the present application.

Claims (25)

  1. A method for handling a machine exception, comprising:
    acquiring a gradient consumption time of a target machine, wherein the gradient consumption time represents gradient-related time consumed by the target machine during a training process;
    determining whether the gradient consumption time, compared with a pre-acquired mean consumption time, satisfies a predetermined condition, wherein the mean consumption time represents an average of the gradient-related time consumed during the training process by all machines in a cluster other than the target machine; and
    if the gradient consumption time satisfies the predetermined condition compared with the mean consumption time, determining that the target machine is abnormal.
  2. The method according to claim 1, wherein the gradient consumption time comprises a first time consumed by the target machine to compute the gradient and/or a second time consumed by the target machine to send the gradient; and the mean consumption time comprises a first average of time consumed by all machines in the system other than the target machine to compute the gradient and/or a second average of time consumed by those machines to send the gradient.
  3. The method according to claim 2, wherein the determining whether the gradient consumption time, compared with the pre-acquired mean consumption time, satisfies the predetermined condition comprises:
    determining whether the first time is greater than a product of the first average and a first preset coefficient, wherein if the first time is greater than the product of the first average and the first preset coefficient, the gradient consumption time is determined to satisfy the predetermined condition compared with the mean consumption time, and if the first time is less than or equal to the product of the first average and the first preset coefficient, the gradient consumption time is determined not to satisfy the predetermined condition compared with the mean consumption time;
    and/or,
    determining whether the second time is greater than a product of the second average and a second preset coefficient, wherein if the second time is greater than the product of the second average and the second preset coefficient, the gradient consumption time is determined to satisfy the predetermined condition compared with the mean consumption time, and if the second time is less than or equal to the product of the second average and the second preset coefficient, the gradient consumption time is determined not to satisfy the predetermined condition compared with the mean consumption time.
  4. The method according to any one of claims 1 to 3, wherein after the determining that the target machine is abnormal, the method further comprises:
    acquiring the gradients computed by all machines other than the target machine;
    computing, from the gradient, a learning rate corresponding to the gradient; and
    determining, according to the learning rate, whether to perform an update-weight operation.
  5. The method according to claim 4, wherein the computing, from the gradient, the learning rate corresponding to the gradient comprises:
    computing the learning rate by the formula Eta(i) = A × (i-th dimension gradient) / (B + sqrt(sum((i-th dimension gradient) × (i-th dimension gradient)))), wherein Eta(i) is the learning rate, A is a first preset coefficient, B is a second preset coefficient, the gradient is a vector composed of the n i-th dimension gradients, n is the number of dimensions of the gradient, and 0 < i ≤ n.
  6. The method according to claim 5, wherein the determining, according to the learning rate, whether to perform the update-weight operation comprises:
    determining whether the learning rate is less than a preset threshold;
    if the learning rate is less than the preset threshold, stopping performing the update-weight operation; and
    if the learning rate is greater than or equal to the preset threshold, performing the update-weight operation.
  7. The method according to claim 6, wherein the performing the update-weight operation comprises:
    computing an updated weight from the learning rate, the gradient, and a historical weight, wherein the historical weight refers to the weight used by the target machine during the training process.
  8. The method according to claim 7, wherein the computing the updated weight from the learning rate, the gradient, and the historical weight comprises:
    computing the updated weight by the formula: updated weight = historical weight + (-learning rate × gradient).
  9. The method according to claim 7 or 8, wherein after the performing the update-weight operation, the method further comprises:
    sending the updated weight to the target machine and to all machines other than the target machine, to instruct the target machine and all machines other than the target machine to train according to the updated weight.
  10. A method for adjusting a learning rate, comprising:
    acquiring a gradient computed by a target machine;
    computing, from the gradient, a learning rate corresponding to the gradient;
    determining whether the learning rate is less than a preset threshold;
    if the learning rate is less than the preset threshold, stopping performing an update-weight operation; and
    if the learning rate is greater than or equal to the preset threshold, performing the update-weight operation.
  11. The method according to claim 10, wherein the computing, from the gradient, the learning rate corresponding to the gradient comprises:
    computing the learning rate by the formula Eta(i) = A × (i-th dimension gradient) / (B + sqrt(sum((i-th dimension gradient) × (i-th dimension gradient)))), wherein Eta(i) is the learning rate, A is a first preset coefficient, B is a second preset coefficient, the gradient is a vector composed of the n i-th dimension gradients, n is the number of dimensions of the gradient, and 0 < i ≤ n.
  12. The method according to claim 10, wherein the performing the update-weight operation comprises:
    computing an updated weight from the learning rate, the gradient, and a historical weight, wherein the historical weight refers to the weight used by the target machine during training.
  13. The method according to claim 12, wherein the computing the updated weight from the learning rate, the gradient, and the historical weight comprises:
    computing the updated weight by the formula: updated weight = historical weight + (-learning rate × gradient).
  14. The method according to claim 12 or 13, wherein after the performing the update-weight operation, the method further comprises:
    sending the updated weight to the target machine, to instruct the target machine to train according to the updated weight.
  15. An apparatus for handling a machine exception, comprising:
    a first acquisition unit, configured to acquire a gradient consumption time of a target machine, wherein the gradient consumption time represents gradient-related time consumed by the target machine during a training process;
    a determination unit, configured to determine whether the gradient consumption time, compared with a pre-acquired mean consumption time, satisfies a predetermined condition, wherein the mean consumption time represents an average of the gradient-related time consumed during the training process by all machines in a cluster other than the target machine; and
    a detection unit, configured to determine that the target machine is abnormal if the gradient consumption time satisfies the predetermined condition compared with the mean consumption time.
  16. The apparatus according to claim 15, wherein the gradient consumption time comprises a first time consumed by the target machine to compute the gradient and/or a second time consumed by the target machine to send the gradient; and the mean consumption time comprises a first average of time consumed by all machines in the system other than the target machine to compute the gradient and/or a second average of time consumed by those machines to send the gradient.
  17. The apparatus according to claim 16, wherein the determination unit is configured to determine whether the gradient consumption time, compared with the pre-acquired mean consumption time, satisfies the predetermined condition by performing the following steps:
    determining whether the first time is greater than a product of the first average and a first preset coefficient, wherein if the first time is greater than the product of the first average and the first preset coefficient, the gradient consumption time is determined to satisfy the predetermined condition compared with the mean consumption time, and if the first time is less than or equal to the product of the first average and the first preset coefficient, the gradient consumption time is determined not to satisfy the predetermined condition compared with the mean consumption time;
    and/or,
    determining whether the second time is greater than a product of the second average and a second preset coefficient, wherein if the second time is greater than the product of the second average and the second preset coefficient, the gradient consumption time is determined to satisfy the predetermined condition compared with the mean consumption time, and if the second time is less than or equal to the product of the second average and the second preset coefficient, the gradient consumption time is determined not to satisfy the predetermined condition compared with the mean consumption time.
  18. The apparatus according to any one of claims 15 to 17, further comprising:
    a second acquisition unit, configured to acquire the gradients computed by all machines other than the target machine;
    a computation unit, configured to compute, from the gradient, a learning rate corresponding to the gradient; and
    a processing unit, configured to determine, according to the learning rate, whether to perform an update-weight operation.
  19. The apparatus according to claim 18, wherein the computation unit is configured to compute the learning rate corresponding to the gradient by performing the following step:
    computing the learning rate by the formula Eta(i) = A × (i-th dimension gradient) / (B + sqrt(sum((i-th dimension gradient) × (i-th dimension gradient)))), wherein Eta(i) is the learning rate, A is a first preset coefficient, B is a second preset coefficient, the gradient is a vector composed of the n i-th dimension gradients, n is the number of dimensions of the gradient, and 0 < i ≤ n.
  20. The apparatus according to claim 19, wherein the processing unit comprises:
    a determination module, configured to determine whether the learning rate is less than a preset threshold; and
    an execution module, configured to stop performing the update-weight operation if the learning rate is less than the preset threshold, and to perform the update-weight operation if the learning rate is greater than or equal to the preset threshold.
  21. The apparatus according to claim 20, wherein the execution module is configured to perform the update-weight operation by performing the following step:
    computing an updated weight from the learning rate, the gradient, and a historical weight, wherein the historical weight refers to the weight used by the target machine during the training process.
  22. The apparatus according to claim 21, wherein the execution module is configured to compute the updated weight from the learning rate, the gradient, and the historical weight by performing the following step:
    computing the updated weight by the formula: updated weight = historical weight + (-learning rate × gradient).
  23. The apparatus according to claim 21 or 22, further comprising:
    a sending unit, configured to send the updated weight to the target machine and to all machines other than the target machine, to instruct the target machine and all machines other than the target machine to train according to the updated weight.
  24. An apparatus for adjusting a learning rate, comprising:
    a second acquisition unit, configured to acquire a gradient computed by a target machine;
    a computation unit, configured to compute, from the gradient, a learning rate corresponding to the gradient; and
    a processing unit, configured to determine whether the learning rate is less than a preset threshold, to stop performing an update-weight operation if the learning rate is less than the preset threshold, and to perform the update-weight operation if the learning rate is greater than or equal to the preset threshold.
  25. The apparatus according to claim 24, wherein the computation unit is configured to compute the learning rate corresponding to the gradient by performing the following step:
    computing the learning rate by the formula Eta(i) = A × (i-th dimension gradient) / (B + sqrt(sum((i-th dimension gradient) × (i-th dimension gradient)))), wherein Eta(i) is the learning rate, A is a first preset coefficient, B is a second preset coefficient, the gradient is a vector composed of the n i-th dimension gradients, n is the number of dimensions of the gradient, and 0 < i ≤ n.
PCT/CN2017/070906 2016-01-21 2017-01-11 Machine exception handling method, learning rate adjustment method, and apparatus WO2017124953A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
EP17740984.4A EP3407211A4 (en) 2016-01-21 2017-01-11 PROCESS FOR PROCESSING MACHINE ANOMALIES, METHOD FOR SETTING THE LEARNING RATE AND DEVICE
US16/043,006 US10748090B2 (en) 2016-01-21 2018-07-23 Method and apparatus for machine-exception handling and learning rate adjustment

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201610041708.0 2016-01-21
CN201610041708.0A CN106991095B (zh) 2016-01-21 Machine exception handling method, learning rate adjustment method, and apparatus

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US16/043,006 Continuation US10748090B2 (en) 2016-01-21 2018-07-23 Method and apparatus for machine-exception handling and learning rate adjustment

Publications (1)

Publication Number Publication Date
WO2017124953A1 true WO2017124953A1 (zh) 2017-07-27

Family

ID=59361362

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2017/070906 WO2017124953A1 (zh) 2016-01-21 2017-01-11 Machine exception handling method, learning rate adjustment method, and apparatus

Country Status (5)

Country Link
US (1) US10748090B2 (zh)
EP (1) EP3407211A4 (zh)
CN (1) CN106991095B (zh)
TW (1) TW201732695A (zh)
WO (1) WO2017124953A1 (zh)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10748090B2 (en) 2016-01-21 2020-08-18 Alibaba Group Holding Limited Method and apparatus for machine-exception handling and learning rate adjustment
CN114461568A * 2022-04-14 2022-05-10 Suzhou Inspur Intelligent Technology Co., Ltd. Data processing method, system, device, and readable storage medium

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108154237B * 2016-12-06 2022-04-05 Huawei Technologies Co., Ltd. Data processing system and method
US11972429B2 (en) 2019-01-24 2024-04-30 Walmart Apollo, Llc Methods and apparatus for fraud detection
US11605085B2 (en) * 2019-01-24 2023-03-14 Walmart Apollo, Llc Methods and apparatus for fraud detection
CN113485805B * 2021-07-01 2024-02-06 Zhongke Shuguang (Nanjing) Computing Technology Co., Ltd. Distributed computing adjustment method, apparatus, and device based on a heterogeneous acceleration platform
CN114528914B * 2022-01-10 2024-05-14 Peng Cheng Laboratory Human-in-the-loop chiller condition monitoring method, terminal, and storage medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20020157035A1 (en) * 2001-04-23 2002-10-24 Wong Joseph D. Systems and methods for providing an automated diagnostic audit for cluster computer systems
CN101505243A * 2009-03-10 2009-08-12 Institute of Software, Chinese Academy of Sciences Web application performance anomaly detection method
CN104063747A * 2014-06-26 2014-09-24 Shanghai Jiao Tong University Performance anomaly prediction method and system in a distributed system
CN104644143A * 2015-03-09 2015-05-27 Geng Xihua Contactless vital-sign monitoring system

Family Cites Families (21)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6934673B2 (en) * 2001-05-25 2005-08-23 Hewlett-Packard Development Company, L.P. Method and apparatus for predicting multi-part performability
US9123077B2 (en) * 2003-10-07 2015-09-01 Hospira, Inc. Medication management system
US8332411B2 (en) * 2007-10-19 2012-12-11 Microsoft Corporation Boosting a ranker for improved ranking accuracy
US8005774B2 (en) * 2007-11-28 2011-08-23 Yahoo! Inc. Determining a relevance function based on a query error derived using a structured output learning technique
US8725667B2 (en) * 2008-03-08 2014-05-13 Tokyo Electron Limited Method and system for detection of tool performance degradation and mismatch
DE102008049714A1 * 2008-09-30 2010-04-01 Siemens Enterprise Communications Gmbh & Co. Kg Method and arrangement for operating an electronic system
US8175377B2 (en) * 2009-06-30 2012-05-08 Xerox Corporation Method and system for training classification and extraction engine in an imaging solution
CN102622366B * 2011-01-28 2014-07-30 Alibaba Group Holding Limited Similar image recognition method and apparatus
CN103020947B * 2011-09-23 2016-04-06 Alibaba Group Holding Limited Image quality analysis method and apparatus
CN102663100B * 2012-04-13 2014-01-15 Xidian University Two-stage hybrid particle swarm optimization clustering method
JP5978993B2 * 2012-12-28 2016-08-24 Fujitsu Limited Information processing system control apparatus, program, and method
US10354187B2 (en) * 2013-01-17 2019-07-16 Hewlett Packard Enterprise Development Lp Confidentiality of files using file vectorization and machine learning
US20140310218A1 (en) * 2013-04-11 2014-10-16 Nec Laboratories America, Inc. High-Order Semi-RBMs and Deep Gated Neural Networks for Feature Interaction Identification and Non-Linear Semantic Indexing
US9625274B2 (en) * 2014-03-28 2017-04-18 Mitsubishi Electric Research Laboratories, Inc. Time-varying extremum seeking for controlling vapor compression systems
CN106662867B * 2014-04-16 2019-03-15 Siemens AG Transferring failure samples using conditional models for machine condition monitoring
JP6334282B2 * 2014-06-11 2018-05-30 Toshiba Corporation Information processing apparatus and operation curve creation method
CN104036451B * 2014-06-20 2018-12-11 Shenzhen Tencent Computer Systems Co., Ltd. Model parallel processing method and apparatus based on multiple graphics processors
US10318882B2 (en) * 2014-09-11 2019-06-11 Amazon Technologies, Inc. Optimized training of linear machine learning models
CN104731709B * 2015-03-31 2017-09-29 Beijing Institute of Technology Software defect prediction method based on the jcudasa_bp algorithm
US10452995B2 (en) * 2015-06-29 2019-10-22 Microsoft Technology Licensing, Llc Machine learning classification on hardware accelerators with stacked memory
CN106991095B (zh) 2016-01-21 2021-09-28 Alibaba Group Holding Limited Machine exception handling method, learning rate adjustment method, and apparatus

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See also references of EP3407211A4 *

Also Published As

Publication number Publication date
US20180329798A1 (en) 2018-11-15
CN106991095B (zh) 2021-09-28
EP3407211A1 (en) 2018-11-28
US10748090B2 (en) 2020-08-18
EP3407211A4 (en) 2019-01-09
CN106991095A (zh) 2017-07-28
TW201732695A (zh) 2017-09-16

Similar Documents

Publication Publication Date Title
WO2017124953A1 (zh) Machine exception handling method, learning rate adjustment method, and apparatus
US11704221B2 (en) Systems and methods for collecting, tracking, and storing system performance and event data for computing devices
US20200272889A1 (en) Optimizing data center controls using neural networks
CN104809051B Method and apparatus for predicting anomalies and failures in computer applications
WO2020082973A1 Neural-network-based load prediction method and apparatus
TW202131661A Apparatus and method for network optimization, and non-transitory computer-readable medium
CN113157422A Cloud data center cluster resource scheduling method and apparatus based on deep reinforcement learning
CN102340543A Method and device for selecting a system master node
CN111784472B Consumption-data-based risk control method, apparatus, system, and readable storage medium
CN110875838B Resource deployment method, apparatus, and storage medium
CN103365727A Host load prediction method in a cloud computing environment
CN112954707B Energy-saving method and apparatus for a base station, base station, and computer-readable storage medium
CN112487210A Abnormal device identification method, electronic device, and medium
CN104484222A Virtual machine scheduling method based on a hybrid genetic algorithm
CN104375621A Dynamic weighted load evaluation method based on adaptive thresholds in cloud computing
CN109783221A Virtual machine resource allocation method, apparatus, and resource server
CN107665349A Method and apparatus for training multiple targets in a classification model
CN117741442A Battery cell temperature prediction method, apparatus, device, storage medium, and program product
CN108073449B Dynamic virtual machine placement method
CN105205723A Modeling method and apparatus based on a social application
CN116437341A Joint optimization method for computation offloading and privacy protection in a mobile blockchain network
CN106765867B Control method and system for an air-conditioning chilled water unit
CN113487041B Horizontal federated learning method, apparatus, and storage medium
CN103873388A Network content control method and network device
CN112489663A Voice wake-up method, apparatus, medium, and device

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 17740984

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

WWE Wipo information: entry into national phase

Ref document number: 2017740984

Country of ref document: EP

ENP Entry into the national phase

Ref document number: 2017740984

Country of ref document: EP

Effective date: 20180821