WO2022111403A1 - Machine learning method, device, and system - Google Patents

Machine learning method, device, and system Download PDF

Info

Publication number
WO2022111403A1
Authority
WO
WIPO (PCT)
Prior art keywords: fusion, parameters, local, node, parameter
Prior art date
Application number
PCT/CN2021/132048
Other languages
French (fr)
Chinese (zh)
Inventor
张朝阳
杨禹志
于天航
李榕
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司
Publication of WO2022111403A1 publication Critical patent/WO2022111403A1/en

Classifications

    • G06N3/045 Combinations of networks
    • G06F18/00 Pattern recognition
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2415 Classification techniques based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • G06N20/00 Machine learning
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/047 Probabilistic or stochastic networks
    • G06N3/048 Activation functions
    • G06N3/08 Learning methods
    • G06N3/084 Backpropagation, e.g. using gradient descent
    • G06V10/764 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • G06V10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G06V30/32 Digital ink

Definitions

  • the present application relates to the technical field of machine learning, and in particular, to a machine learning method, apparatus and system.
  • in the field of artificial intelligence (AI), computing nodes (such as base stations) can perform machine learning based on data collected by each local node (such as mobile phones).
  • the base station can deliver the model that meets the preset conditions obtained through machine learning to each mobile phone, so that each mobile phone can adjust its own communication process according to the model and realize intelligent communication.
  • each local node needs to upload all the data collected by it to the computing node, so that the computing node can perform training based on the data.
  • the amount of data being transferred can be very large. Since each local node needs to transmit data to the computing node, the pressure of data transmission between the local node and the computing node is relatively high.
  • since the data is directly transmitted by the local node to the central node, it may include content related to user privacy, which also makes that private information insecure.
  • the embodiments of the present application provide a machine learning method, device and system, which can significantly reduce the amount of data transmission in the machine learning process, reducing the proportion of time spent on data transmission in the entire learning process and thereby effectively improving the efficiency of machine learning.
  • in a first aspect, a machine learning method is provided. The method is applied to a sub-node in which a binary neural network (BNN) model is set, and includes: the sub-node performs BNN-based machine learning on a collected local data set to obtain local model parameters corresponding to the local data set; the sub-node sends a first message to the central node, where the first message includes the local model parameters.
  • a scheme combining BNN with a distributed machine learning architecture is provided.
  • the BNN-based binarized machine learning can be performed locally, and the parameters of the local neural network model obtained thereby can be binarized parameters.
  • sending the binarized model parameters to the central node can significantly reduce the required data transmission bandwidth compared with directly sending a high-precision neural network model or high-precision model parameters to the central node, so the transmission time is correspondingly reduced. It is understandable that the child nodes do not perform machine learning while data is being transmitted. Therefore, reducing the time consumed by data transmission can significantly increase the proportion of time spent on machine learning in the entire learning process, thereby improving learning efficiency.
  • the local model parameters included in the first message are binarized local model parameters. Based on this scheme, the form of the local model parameters during transmission is constrained.
  • the transmitted local model parameters are binarized parameters, that is, they contain only +1 and -1 values, so each parameter occupies 1 bit of transmission bandwidth. Under the traditional FL architecture, by contrast, high-precision data must be transmitted, and a single high-precision element may require 16 bits or more of transmission bandwidth, so a model parameter matrix containing many such elements consumes far more transmission resources. Therefore, the solution provided by this example can effectively reduce the demand for data transmission resources and shorten the transmission time, thereby improving the learning efficiency of the system, as illustrated by the sketch below.
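  • To make the bandwidth comparison above concrete, the following Python sketch (an illustration only; the helper names are assumptions, not terms from the patent) packs a vector of ±1 parameters into one bit per element and compares the payload size with a 32-bit floating-point representation of the same vector.

```python
import numpy as np

def pack_binary_params(params: np.ndarray) -> bytes:
    """Pack a vector of +1/-1 parameters into 1 bit per element."""
    bits = (params > 0).astype(np.uint8)                  # +1 -> 1, -1 -> 0
    return np.packbits(bits).tobytes()

def unpack_binary_params(payload: bytes, n: int) -> np.ndarray:
    """Recover the +1/-1 vector from the packed payload."""
    bits = np.unpackbits(np.frombuffer(payload, dtype=np.uint8))[:n]
    return bits.astype(np.float32) * 2.0 - 1.0            # 1 -> +1, 0 -> -1

if __name__ == "__main__":
    w = np.where(np.random.randn(10_000) > 0, 1.0, -1.0).astype(np.float32)
    packed = pack_binary_params(w)
    print(len(packed), "bytes as 1-bit parameters vs", w.nbytes, "bytes as 32-bit floats")
    assert np.array_equal(unpack_binary_params(packed, w.size), w)
```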
  • the method further includes: the child node receives fusion parameters from the central node, and the fusion parameters are obtained by the central node through fusion according to local model parameters.
  • the child node updates the local model parameters according to the fusion parameters to obtain the updated local model parameters.
  • a method for the sub-nodes to update the model by receiving information is presented.
  • the central node can perform data aggregation processing.
  • the model parameters transmitted by each sub-node can be obtained, and these model parameters can be fused to obtain model parameters with higher adaptability to each sub-node, such as the above-mentioned fusion parameters.
  • the sub-nodes can obtain the fusion parameters from the central node, so that local fusion can be performed according to the fusion parameters in combination with the local parameters, thereby realizing the updating of the local neural network model.
  • the fusion parameters are binarized model parameters, or the fusion parameters are high-precision model parameters. Based on this scheme, different forms of fusion parameters are given.
  • the fusion parameter may be a binarized model parameter.
  • the fusion parameter may be a high-precision model parameter.
  • the child nodes can transmit the binarized model parameters to the central node.
  • the central node can perform fusion according to multiple binarized model parameters (for example, average weighting according to weights), thereby obtaining corresponding fusion parameters.
  • the model parameters after the fusion process no longer consist only of +1 and -1 elements; that is, the fusion model obtained after the weighted average should be high-precision model parameters.
  • the binarized fusion parameters obtained by binarizing the high-precision parameters are used for downloading, so that the data download rate is faster, and the sub-nodes can also complete the update of the local model faster.
  • when receiving binarized fusion parameters, local fusion can be performed according to the local fusion method corresponding to binarized fusion parameters, and when receiving high-precision fusion parameters, local fusion is performed according to the local fusion method corresponding to high-precision fusion parameters, as in the sketch below.
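  • The patent leaves the exact local fusion rule open. As one hedged possibility only, the sketch below blends the received fusion parameters (binarized or high-precision) into the sub-node's high-precision copy with a mixing coefficient and then refreshes the binarized working copy; the coefficient, the mapping of zero to +1, and the function name are illustrative assumptions.

```python
import numpy as np

def local_fusion(local_hp: np.ndarray, fusion: np.ndarray, alpha: float = 0.5):
    """Blend received fusion parameters into the local high-precision parameters.

    `fusion` may hold +1/-1 values (binarized fusion parameters) or real
    values (high-precision fusion parameters); the same rule covers both.
    """
    updated_hp = (1.0 - alpha) * local_hp + alpha * fusion
    updated_bin = np.where(updated_hp > 0, 1.0, -1.0)      # refreshed binarized parameters
    return updated_hp, updated_bin
```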
  • the first message further includes: accuracy information corresponding to the local model parameters.
  • the accuracy information is obtained by the sub-node according to the local model parameters and the test data set. Based on this solution, a specific content of the first message is provided.
  • the child node may send the accuracy information corresponding to the learning result to the central node, so that the central node can determine the accuracy of the system according to the accuracy information reported by each sub-node, and then determine whether to issue binarized fusion parameters or high-precision fusion parameters.
  • the accuracy information may also be obtained by the child node through verification according to the local model parameters and the verification data set.
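  • As a small illustration of how the accuracy information in the first message could be produced, the sketch below evaluates the local model on a held-out test (or verification) data set; `predict_fn`, the data arrays and the message layout shown in the comment are placeholders, not elements defined by the patent.

```python
import numpy as np

def local_accuracy(predict_fn, test_x: np.ndarray, test_y: np.ndarray) -> float:
    """Fraction of test samples the local (binarized) model classifies correctly."""
    return float(np.mean(predict_fn(test_x) == test_y))

# Illustrative first message carrying both items (structure is hypothetical):
# first_message = {"local_params": packed_params,
#                  "accuracy": local_accuracy(model.predict, test_x, test_y)}
```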
  • the method further includes: the child node continues machine learning based on the BNN according to the updated local model parameters.
  • the child node updates the local model according to the fusion parameters.
  • the child node can continue to perform the second or subsequent rounds of learning on the local model based on the existing data set or in combination with a newly added data set, and repeat the method of the above example until the learning result converges and the machine learning is completed.
  • the updated local model can be used to guide the current business of the child node, such as predicting the direction of data, etc.
  • in a second aspect, a machine learning method is provided. The method is applied to a central node and includes: the central node receives N first messages from N child nodes respectively, where the first messages include local model parameters, the local model parameters are binarized local model parameters, and N is an integer greater than or equal to 1. The central node fuses the local model parameters included in the N first messages to obtain the fusion parameters. The central node sends a second message to M child nodes, where the second message includes the fusion parameters, and M is an integer greater than or equal to 1.
  • the central node can receive local model parameters from multiple sub-nodes, and perform fusion based on these local model parameters, thereby obtaining a fusion model with stronger adaptability.
  • the central node can distribute the fusion model to each sub-node, so that the sub-nodes can perform local fusion according to the fusion model to complete a round of learning.
  • the model parameters received by the central node from the child nodes may be binarized model parameters. It can be understood that the data volume of the binarized model parameters is significantly smaller than the data volume of ordinary model parameters (such as high-precision model parameters), so the uploading process is more efficient.
  • the obtained fusion parameters can be adapted to the data set type corresponding to each sub-node, so it has more accurate and adaptable features.
  • there may be some reference sub-nodes among the N sub-nodes and these sub-nodes can be used to provide local model parameters, but do not need fusion parameters from the central node.
  • the nodes that need to update the local model according to the fusion parameters may not be included in the N child nodes. Therefore, in some implementations, the M child nodes may also include child nodes that are not among the N child nodes, or the M child nodes may be a part of the N child nodes. The specific determination of the M sub-nodes can be flexibly configured according to the actual implementation process.
  • the central node fuses N local model parameters to obtain the fusion parameters, including: the central node performs a weighted average on the N local model parameters to obtain the fusion parameters.
  • a scheme for obtaining fusion parameters is provided.
  • the central node can process the N local model parameters through a simple weighted average. The weight in the weighted average can be determined according to the size of the input data set used in the local training process that produced the local model parameters.
  • the central node can obtain the size of the data set used in the current round of learning from each child node, and can also obtain the size of the data set used by each child node in the local learning process from other nodes.
  • the central node may also adjust the weights in combination with other factors. For example, for some frequently used sub-nodes, their weights can be appropriately increased, while for sub-nodes whose neural network models are used less often, the corresponding weights can be appropriately decreased (see the weighted-average sketch below).
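  • The weighted-average fusion described above could look like the following sketch, where the weights are proportional to each sub-node's local data set size and any extra per-node adjustment can be folded into those weights; whether to binarize the result before download is left as a flag. This is an illustrative implementation, not the patent's exact procedure.

```python
import numpy as np

def fuse_local_params(local_params, dataset_sizes, binarize_download=False):
    """Weighted average of N binarized local model parameters.

    local_params: list of np.ndarray, each holding +1/-1 parameters from one sub-node.
    dataset_sizes: list of int, the size of each sub-node's local data set.
    """
    weights = np.asarray(dataset_sizes, dtype=np.float64)
    weights /= weights.sum()
    fused = sum(w * p for w, p in zip(weights, local_params))   # high-precision fusion parameters
    if binarize_download:                                       # optional binarized download form
        fused = np.where(fused > 0, 1.0, -1.0)
    return fused
```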
  • the fusion parameter included in the second message is a high-precision fusion parameter, or the fusion parameter included in the second message is a binarized fusion parameter.
  • the central node may directly send the high-precision parameters obtained after the fusion process to the child nodes through the second message.
  • the central node can also binarize the fused high-precision parameters and then send them to the child nodes through the second message. It can be understood that when the data transmission rate needs to be improved, this can be achieved by issuing binarized fusion parameters, and when the accuracy needs to be improved, this can be achieved by issuing high-precision fusion parameters.
  • before the central node sends the second message to the M sub-nodes, the method further includes: the central node determines system accuracy information according to the N first messages, and the central node determines, according to the system accuracy information, whether the fusion parameters included in the second message are high-precision fusion parameters or binarized fusion parameters. Based on this scheme, a mechanism is provided for the central node to switch between issuing high-precision fusion parameters and binarized fusion parameters. In this example, the central node may determine whether to issue high-precision or binarized fusion parameters according to the system accuracy information.
  • for example, when the system accuracy is low, the binarized fusion parameters can be sent, and when the system accuracy is high, the high-precision fusion parameters can be sent.
  • the system accuracy may be determined according to the accuracy of each sub-node, or may be determined by the central node spontaneously verifying according to the model parameters of each sub-node.
  • the first message further includes: accuracy information.
  • the accuracy information corresponds to the accuracy obtained by verifying the local model parameters included in the first message at the corresponding child node.
  • the central node determining the system accuracy information according to the N first messages includes: the central node determining the system accuracy information according to the accuracy information included in the N messages. Based on the solution, a method for the central node to determine the system accuracy information is provided.
  • each sub-node can send the verification accuracy of the model parameters obtained in this round of learning to the central node, and the central node can determine the system accuracy according to the accuracy uploaded by each sub-node, and then adjust the form of the delivered fusion parameters accordingly.
  • when the central node determines that the system accuracy is less than or equal to the first threshold, the central node determines that the fusion parameters are binarized fusion parameters and can issue binarized fusion parameters, thereby increasing the data transmission rate.
  • when the central node determines that the system accuracy is greater than or equal to the second threshold, it considers that the learning in the current system is close to convergence and the adjustment space of the model parameters is small, so data transmission with higher accuracy is required. Therefore, in this case the central node determines that the fusion parameters are high-precision fusion parameters and can issue high-precision fusion parameters, thereby improving the accuracy of the model parameters.
  • the first threshold and the second threshold may be preset, and in different implementations, the first threshold and the second threshold may be the same or different.
  • when the central node sends the second message to the M sub-nodes: when the number of iteration rounds is less than or equal to the third threshold, the central node sends a second message including the binarized fusion parameters to the M sub-nodes; when the number of iteration rounds is greater than or equal to the fourth threshold, the central node sends a second message including the high-precision fusion parameters to the M sub-nodes.
  • another mechanism is provided for the central node to determine the form of the fusion parameters to be issued. In this example, the central node may determine the form of delivering the fusion parameter according to the number of iteration rounds.
  • when the number of iteration rounds is small, that is, less than or equal to the third threshold, the central node can consider that the current state should focus on improving data transmission efficiency, so it can choose to issue binarized fusion parameters to increase the data transfer rate.
  • when the number of iteration rounds is large, that is, greater than or equal to the fourth threshold, the central node can consider that the current state should prioritize accuracy, so it can choose to issue high-precision fusion parameters to improve the accuracy of the local fusion process. Both selection mechanisms are sketched below.
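  • The two selection mechanisms above (system accuracy versus a threshold, or iteration round versus a threshold) can be sketched as a single decision helper. The threshold values, and the collapsing of the first/second and third/fourth thresholds into one value each, are assumptions made only for illustration.

```python
from typing import Optional

def choose_fusion_form(system_accuracy: Optional[float] = None,
                       iteration_round: Optional[int] = None,
                       accuracy_threshold: float = 0.9,
                       round_threshold: int = 50) -> str:
    """Decide whether to download binarized or high-precision fusion parameters."""
    if system_accuracy is not None:
        # low accuracy -> favour transmission rate; high accuracy -> favour precision
        return "binarized" if system_accuracy <= accuracy_threshold else "high_precision"
    if iteration_round is not None:
        # early rounds -> favour transmission rate; late rounds -> favour precision
        return "binarized" if iteration_round <= round_threshold else "high_precision"
    return "binarized"  # default when no accuracy or round information is available
```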
  • the central node sends the second message to the M child nodes through broadcasting.
  • a method for the central node to deliver the second message is provided.
  • the central node can deliver the second message in the form of broadcasting, without the need to deliver the second message to each child node separately. It can be understood that the content of the data delivered to each child node is similar, therefore, the data can be delivered to each child node at the same time in the form of broadcasting.
  • whether the transmission carries binarized fusion parameters or high-precision fusion parameters, the broadcast form of transmission will not affect information security.
  • in a third aspect, a machine learning apparatus is provided. The apparatus can be applied to a sub-node in which a binary neural network (BNN) model is set, and the apparatus includes: an acquisition unit, configured to perform BNN-based machine learning on a collected local data set to obtain local model parameters corresponding to the local data set.
  • the sending unit is configured to send a first message to the central node, where the first message includes local model parameters.
  • the local model parameters included in the first message are binarized local model parameters.
  • the apparatus further includes: a receiving unit, configured to receive fusion parameters from the central node, where the fusion parameters are obtained by the central node fusion according to local model parameters.
  • the fusion unit is used to fuse according to the fusion parameters and the local model parameters to obtain the updated local model parameters.
  • the fusion parameters are binarized model parameters, or the fusion parameters are high-precision model parameters.
  • the first message further includes: accuracy information corresponding to the local model parameters.
  • the acquisition unit is further configured to obtain the accuracy information through verification according to the local model parameters and the test data set.
  • the apparatus further includes: a learning unit for continuing machine learning based on the BNN according to the updated local model parameters.
  • in a fourth aspect, a machine learning apparatus is provided, applied to a central node. The apparatus includes: a receiving unit, configured to receive N first messages from N sub-nodes respectively, where the first messages include local model parameters, the local model parameters are binarized local model parameters, and N is an integer greater than or equal to 1.
  • the fusion unit is configured to fuse the local model parameters included in the N first messages to obtain fusion parameters.
  • a sending unit configured to send a second message to the M sub-nodes, where the second message includes a fusion parameter, and M is a positive integer greater than or equal to 1.
  • the fusion unit is specifically used to perform a weighted average of N local model parameters to obtain fusion parameters.
  • the fusion parameter included in the second message is a high-precision fusion parameter, or the fusion parameter included in the second message is a binarized fusion parameter.
  • the apparatus further includes: a determining unit, configured to determine system accuracy information according to the N first messages, and to determine, according to the system accuracy information, whether the fusion parameters included in the second message are high-precision fusion parameters or binarized fusion parameters.
  • the first message further includes: accuracy information.
  • the accuracy information corresponds to the accuracy obtained by verifying the local model parameters included in the first message at the corresponding child node.
  • the determining unit is specifically configured to determine the system accuracy information according to the accuracy information included in the N messages.
  • the determining unit is configured to determine that the fusion parameter is a binarized fusion parameter when the system accuracy information is less than or equal to the first threshold.
  • the determining unit is further configured to determine that the fusion parameter is a high-precision fusion parameter when the system accuracy information is greater than or equal to the second threshold.
  • the sending unit is configured to send the second message including the binarized fusion parameter to the M sub-nodes when the number of iteration rounds is less than or equal to the third threshold.
  • the sending unit is further configured to send a second message including a high-precision fusion parameter to the M sub-nodes when the number of iteration rounds is greater than or equal to the fourth threshold.
  • the sending unit is specifically configured to send the second message to the M sub-nodes through broadcasting.
  • in a fifth aspect, a child node is provided, and the child node may include one or more processors and one or more memories.
  • One or more memories are coupled to the one or more processors, and the one or more memories store computer instructions.
  • when the child node executes the computer instructions, the child node is caused to perform the machine learning method of any one of the first aspect and its possible designs.
  • in a sixth aspect, a central node is provided, and the central node may include one or more processors and one or more memories.
  • One or more memories are coupled to the one or more processors, and the one or more memories store computer instructions.
  • when the central node executes the computer instructions, the central node is caused to perform the machine learning method of any one of the second aspect and its possible designs.
  • in a seventh aspect, a machine learning system is provided, which includes one or more sub-nodes as provided in the fifth aspect and one or more central nodes as provided in the sixth aspect.
  • in an eighth aspect, a chip system is provided, which includes an interface circuit and a processor; the interface circuit and the processor are interconnected through a line; the interface circuit is used to receive a signal from a memory and send the signal to the processor, and the signal includes computer instructions stored in the memory.
  • when the processor executes the computer instructions, the chip system executes the machine learning method described in any one of the above first aspect and its various possible designs, or executes the machine learning method described in any one of the above second aspect and its various possible designs.
  • in a ninth aspect, a computer-readable storage medium is provided, which includes computer instructions. When the computer instructions are executed, the machine learning method described in any one of the above first aspect and its various possible designs is executed, or the machine learning method described in any one of the above second aspect and its various possible designs is executed.
  • in a tenth aspect, a computer program product is provided. The computer program product includes instructions, and when the computer program product runs on a computer, the computer can execute, according to the instructions, the machine learning method of any one of the above first aspect and its various possible designs.
  • FIG. 1 is a schematic diagram of an implementation of machine learning in a communication process
  • FIG. 2 is a schematic working diagram of an FL architecture
  • FIG. 3 is a schematic diagram of a comparison between a BNN and an ordinary neural network based on high-precision parameters
  • FIG. 4 is a schematic diagram of the composition of a machine learning system according to an embodiment of the present application.
  • FIG. 5 is a schematic diagram of the composition of another machine learning system provided by an embodiment of the present application.
  • FIG. 6 is a schematic working logic diagram of a machine learning system provided by an embodiment of the present application.
  • FIG. 7 is a schematic working logic diagram of another machine learning system provided by an embodiment of the present application.
  • FIG. 8 is a schematic working logic diagram of another machine learning system provided by an embodiment of the present application.
  • FIG. 9 is a schematic logical diagram of a machine learning method provided by an embodiment of the present application.
  • FIG. 10 is a schematic diagram of a comparison of simulation results provided by an embodiment of the present application.
  • FIG. 11 is a schematic diagram of a comparison of another simulation result provided by an embodiment of the present application.
  • FIG. 12 is a schematic diagram of a comparison of another simulation result provided by an embodiment of the present application.
  • FIG. 13 is a schematic diagram of the composition of a machine learning apparatus provided by an embodiment of the present application.
  • FIG. 14 is a schematic diagram of the composition of another machine learning apparatus provided by an embodiment of the present application.
  • FIG. 15 is a schematic diagram of the composition of a child node according to an embodiment of the present application.
  • FIG. 16 is a schematic diagram of the composition of a chip system provided by an embodiment of the present application.
  • FIG. 17 is a schematic diagram of the composition of a central node according to an embodiment of the present application.
  • FIG. 18 is a schematic diagram of the composition of another chip system provided by an embodiment of the present application.
  • words such as “exemplary” or “for example” are used to represent examples, illustrations or descriptions. Any embodiment or design described in the embodiments of the present application as “exemplary” or “such as” should not be construed as preferred or advantageous over other embodiments or designs. Rather, the use of words such as “exemplary” or “such as” is intended to present the related concepts in a specific manner.
  • first and second are only used for description purposes, and cannot be understood as indicating or implying relative importance or implying the number of indicated technical features.
  • a feature defined as “first” or “second” may expressly or implicitly include one or more of that feature.
  • plural means two or more.
  • the meaning of the term “at least one” refers to one or more, and the meaning of the term “plurality” in this application refers to two or more.
  • a plurality of second messages refers to two or more second messages.
  • the size of the sequence number of each process does not imply the order of execution; the execution order of each process should be determined by its function and internal logic, and should not constitute any limitation on the implementation of the embodiments of the present application.
  • determining B according to A does not mean that B is only determined according to A, and B may also be determined according to A and/or other information.
  • the term “if” may be interpreted to mean “when” or “upon” or “in response to determining” or “in response to detecting.”
  • the phrases “if it is determined" or “if a [statement or event] is detected” can be interpreted to mean “when determining" or “in response to determining... ” or “on detection of [recited condition or event]” or “in response to detection of [recited condition or event]”.
  • references throughout the specification to “one embodiment,” “an embodiment,” and “one possible implementation” mean that a particular feature, structure, or characteristic related to the embodiment or implementation is included in at least one embodiment of the present application.
  • appearances of “in one embodiment” or “in an embodiment” or “one possible implementation” in various places throughout this specification are not necessarily referring to the same embodiment.
  • the particular features, structures or characteristics may be combined in any suitable manner in one or more embodiments.
  • the connection mentioned in the embodiments of the present application may be a direct connection, an indirect connection, a wired connection, or a wireless connection; that is, the embodiments of the present application do not limit the connection method.
  • the local node can collect relevant data, and upload the data to the computing node respectively, so that the computing node can learn based on the data, thereby obtaining the corresponding training model.
  • the computing node can issue the training model to each local node, so that each local node can predict and guide its work in the communication process according to the training model.
  • FIG. 1 it is an implementation example of machine learning in a communication process.
  • three local nodes perform machine learning through computing nodes as an example.
  • the local node 1 can upload the data set 1 composed of the collected data to the computing node.
  • the local node 2 can upload the data set 2 composed of the collected data to the computing node.
  • the local node 3 can upload the data set 3 composed of the collected data to the computing node.
  • Compute nodes can perform machine learning based on these datasets (eg dataset 1 - dataset 3).
  • a basic neural network model can be preset in the computing node, and the basic neural network model can be iteratively trained according to data set 1 to data set 3; the model parameters (such as weights and biases) of the basic neural network model are optimized to obtain iteratively converged model parameters, thereby completing a round of machine learning.
  • the computing node can deliver the iteratively converged model parameters to each local node. For example, send model parameters to local node 1, local node 2 and local node 3.
  • the local node may also be preset with a model of the same type as the basic neural network model in the computing node.
  • the local node can update the locally maintained model according to the received model parameters, thereby obtaining the training model after machine learning.
  • the local node can predict and guide its work based on the trained model.
  • the corresponding training model can be used to judge and predict the corresponding parameters according to the above scheme, thereby greatly improving the working performance of the local node.
  • each local node needs to send the data set collected by it to the computing node respectively.
  • the data volume of the data sets that the computing node needs to collect is very large, which places a great burden on the communication link between the local nodes and the computing node while the data sets are being transmitted.
  • since the data set is directly sent to the computing node, when some private information of the user is included in the data set, that private information will be directly exposed to the communication link and the computing node, resulting in an information privacy hazard.
  • a Federated Learning (FL) architecture with a distributed architecture can be used to couple machine learning and communication, reduce the amount of data transmission, and at the same time properly protect information privacy.
  • in the FL architecture, a central node (or called a central server) and multiple sub-nodes can be provided.
  • the number of child nodes can range from several to several thousand according to different tasks and data distribution.
  • a neural network model can be preset locally.
  • the neural network model in each child node is the same.
  • the child nodes can obtain the corresponding dataset to learn in the neural network model.
  • each child node performs several iterations (usually once for all data in the dataset), and then uploads the local model with converged model parameters to the central node.
  • the central node will weight and average all the sent local models according to the proportion of the data volume of each child node (this process may also be called fusion of training models), thereby obtaining the fusion training model. Then, the central node can send the obtained training model to all sub-nodes, so that the sub-nodes can continue to train according to the new data set according to the fused training model, or directly use it for calculation and prediction of related scenarios. It should be noted that, in different implementations, the transmission of the training model between the child nodes and the central node may be to directly transmit all the data of the training model, or it may be to transmit only the parameters of the training model.
  • the architecture includes three sub-nodes (such as local node 1, local node 2 and local node 3) and one central node as an example.
  • for local node 1, relevant data can be collected to obtain data set 1.
  • the local training model corresponding to the data set 1 may be stored in the local node 1 .
  • the local node 1 can input the data set 1 into the local training model, and perform local training, thereby obtaining the converged local training model parameter 1 (eg, marked as W1).
  • for local node 2 and local node 3, processing similar to that of local node 1 can also be performed, and the corresponding local training model parameters 2 (eg, identified as W2) and local training model parameters 3 (eg, identified as W3) can be obtained. Understandably, since the local training model parameters are obtained through learning, they are strongly correlated with the input data set. For example, when the input data sets are different, the obtained local training model parameters may also be different.
  • the three local nodes can respectively send the acquired local training model parameters (such as W1-W3) to the central node.
  • the central node can fuse the obtained W1-W3, and then obtain the fused training model parameters (such as W0).
  • the central node can deliver the fused training model parameters to the local node 1, the local node 2 and the local node 3 respectively.
  • Each local node can update the local training model according to the received W0.
  • the data set is processed locally without being sent to the central node, so that the information security of the data set can be guaranteed.
  • the data volume of the training model or the training model parameters is significantly smaller than the data volume of the data set itself, the data transmission pressure between the child nodes and the central node can be effectively reduced by transmitting the training model.
  • the training efficiency and data transmission volume under the FL architecture still cannot meet the needs of all scenarios. Take the child node as the mobile phone and the central node as the base station as an example. There is information exchange between the mobile phone and the base station at any time.
  • under the FL architecture, although only the training model or the training model parameters need to be transmitted, a single parameter often requires a transmission bandwidth of 16 bits or more, and a set of training model parameters includes many parameters, so the requirement on transmission bandwidth is still very high. Further, during the transmission of the training model (or training model parameters), the mobile phone does not continue local training. Therefore, because the transmission of the training model (or training model parameters) takes a long time, the training efficiency of the entire FL architecture will be low.
  • the solutions provided by the embodiments of the present application can combine a binarized data processing solution and a distributed neural network learning solution to achieve the effect of reducing the pressure on data transmission while improving the efficiency of machine learning.
  • the neural network applying the binarized data processing scheme may also be referred to as a binarized neural network (Binary Neural Network, BNN).
  • the node that sends the data needs to quantize the data before transmission. For example, take a child node sending data to the central node as an example.
  • the child node can quantize the data to be sent into a sequence consisting of 0s and 1s, and then transmit the sequence through the uplink data transmission channel.
  • after quantization, the corresponding data may not be completely equivalent to the data before quantization.
  • the wider the bit width of the quantized sequence corresponding to a piece of data to be transmitted (such as so-called full-precision data), the higher the precision of the parameters obtained after quantization.
  • quantized data with a wider quantization bit width may be referred to as high-precision data, or high-precision parameters.
  • high-precision data or high-precision parameters may refer to data with a sequence bit width greater than or equal to 32 bits.
  • the parameters of the neural network consist of binarized parameters of +1 and -1.
  • the parameters of the training model are identified by the binarized parameters. Therefore, when learning BNN, compared with model learning based on high-precision parameters without binarization, it can effectively reduce the amount of calculation when the neural network is used for deduction, and accelerate the convergence of the learning process.
  • BNN can also reduce the amount of storage required to store the parameters of the neural network, thereby reducing the amount of communication required to send the entire neural network.
  • FIG. 3 is a schematic diagram of a comparison between a BNN and an ordinary neural network based on high-precision parameters.
  • the corresponding high-precision parameters can be W1, W2, and W3 respectively.
  • the corresponding binarization parameters can be Wb1, Wb2, and Wb3 respectively.
  • W1 and Wb1 as examples.
  • the corresponding relationship between W1 and Wb1 can be as follows: W1 can be converted into corresponding Wb1 through binarization conversion.
  • the binarization conversion may be: for any element in the calculation matrix corresponding to W1, if the element is greater than 0, the element corresponding to the corresponding position in the Wb1 matrix is denoted as +1. Correspondingly, if the element is less than 0, the element corresponding to the corresponding position in the Wb1 matrix is marked as -1.
  • the corresponding W1 can be obtained through gradient accumulation. It can be understood that in a typical BNN learning process, the binarized parameters can be used for forward calculation and gradient calculation, and the gradients are accumulated on the corresponding high-precision parameters. When a sufficiently large gradient is accumulated on the high-precision parameters, the binarization parameters jump.
  • the BNN performs the above process multiple times through iteration, gradually updating the parameters, and finally converges after enough iterations. Therefore, when using a learned BNN, it is only necessary to use the binarized parameters for deduction, and the final output is the deduction result of the BNN. The binarization conversion and flipping behaviour are illustrated in the sketch below.
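  • The binarization conversion and sign-flip behaviour described above can be illustrated with a few lines of Python; mapping an exact zero to +1 is an assumption, since the description only covers elements greater than or less than 0, and the numbers below are arbitrary.

```python
import numpy as np

def binarize(w_hp: np.ndarray) -> np.ndarray:
    """Binarization conversion: elements > 0 become +1, elements < 0 become -1."""
    return np.where(w_hp > 0, 1.0, -1.0)

w_hp = np.array([0.7, -0.2, 0.05])
w_b = binarize(w_hp)                       # array([ 1., -1.,  1.])
w_hp -= 0.1 * np.array([0.0, 0.0, 1.0])    # a large enough accumulated gradient ...
w_b_new = binarize(w_hp)                   # ... flips the third binarized parameter to -1
```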
  • the following example illustrates BNN learning in combination with an actual scenario, taking an image classification problem and the stochastic gradient descent method as an example. Let bs be the batch size of each learning step, let the binarized parameters be W_i^b, and let the high-precision parameters be W_i, where the subscript i represents the user index.
  • in each iteration, the child node draws bs samples (x, y) from the local data set and computes loss = lossfunc(L(W_i^b, x), y), where L represents the structure of the neural network used, W_i^b is the current binarized parameter, lossfunc(·,·) is the loss function of the neural network, and loss is the final loss value.
  • after calculating the loss, the child node performs back-propagation to calculate the gradient of the binarized parameters, that is, grad = ∂loss/∂W_i^b, and the gradient is accumulated onto the high-precision parameters, W_i ← W_i − η·grad, where η is the learning rate, which can be preset before the calculation. Finally, if a certain high-precision parameter changes sign in this iteration, the corresponding binarized parameter's sign is also flipped (eg, flipped from +1 to -1). After each such iteration, another bs samples are drawn from the remaining data (if fewer than bs remain, all of them are drawn), and the iteration is repeated. A round of learning ends when all data have been drawn and data extraction cannot continue.
  • the convergence of the method can be determined by comparing the current loss value with the loss values of the previous one or several iterations. For example, after this loss calculation, if the difference between this loss value and the previous three loss values is within a preset range, the method is considered to have converged.
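  • The iteration described above can be put together into a minimal, self-contained numpy sketch. It is not the patent's algorithm: a single binarized linear layer stands in for the network structure L, a hinge loss stands in for lossfunc, and the data are synthetic; only the flow (forward pass with W_i^b, gradient accumulation on W_i, sign flips) follows the description.

```python
import numpy as np

rng = np.random.default_rng(0)

# synthetic binary-classification data standing in for the local data set
d, n, bs, eta = 20, 512, 32, 0.01            # feature dimension, samples, batch size bs, learning rate eta
X = rng.standard_normal((n, d))
y = np.sign(X @ rng.standard_normal(d))      # labels in {-1, +1}

W = 0.1 * rng.standard_normal(d)             # high-precision parameters W_i
Wb = np.where(W > 0, 1.0, -1.0)              # binarized parameters W_i^b

for epoch in range(20):                      # each epoch corresponds to one round of local learning
    order = rng.permutation(n)
    for start in range(0, n, bs):            # draw bs samples until the data set is exhausted
        idx = order[start:start + bs]
        xb, yb = X[idx], y[idx]
        margin = yb * (xb @ Wb)              # forward pass uses the binarized parameters W_i^b
        loss = np.mean(np.maximum(0.0, 1.0 - margin))
        active = (margin < 1.0).astype(float)
        grad = -(active * yb) @ xb / len(idx)    # gradient of the hinge loss w.r.t. W_i^b
        W -= eta * grad                      # accumulate on the high-precision copy: W_i <- W_i - eta*grad
        flip = np.sign(W) != Wb              # a sign change on W_i flips the binarized parameter
        Wb[flip] = -Wb[flip]
    if epoch % 5 == 0:
        print(f"epoch {epoch}: last-batch loss {loss:.3f}")

print("training accuracy:", np.mean(np.sign(X @ Wb) == y))
```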
  • BNN has the characteristics of fast convergence and small data transmission.
  • however, if the BNN is directly applied to an existing distributed-learning-based system (such as the FL architecture), since all data transmission is carried out in binarized form, the learning accuracy of the entire system may become too low to be usable.
  • therefore, the above-mentioned BNN can be used in a learning framework based on a distributed machine learning system such as FL. With the machine learning method provided by the embodiments of the present application, the data transmission pressure of the entire machine learning system is significantly relieved, and at the same time the learning efficiency of the entire machine learning system is improved in combination with the needs of different scenarios.
  • local training in each sub-node can be performed locally based on a binarized neural network model.
  • the neural network model can be selected flexibly.
  • the neural network model can have network structures such as a fully connected network and a convolutional neural network.
  • the machine learning method provided by the embodiments of the present application can be applied to wireless communication systems including 3G/4G/5G/6G, or satellite communication.
  • the wireless communication system is usually composed of cells, each cell includes a base station (Base Station, BS), and the base station provides communication services to multiple mobile stations (Mobile Station, MS).
  • the base station includes a BBU (Baseband Unit) and an RRU (Remote Radio Unit).
  • the BBU and RRU can be placed in different places, for example, the RRU is far away and placed in an area with high traffic volume, and the BBU is placed in the central computer room.
  • BBU and RRU can also be placed in the same computer room.
  • the BBU and RRU can also be different components under one rack.
  • the wireless communication systems mentioned in the solution of the present invention include but are not limited to: Narrow Band-Internet of Things (NB-IoT), Global System for Mobile Communications (GSM), Enhanced Data Rate for GSM Evolution (EDGE), Wideband Code Division Multiple Access (WCDMA), Code Division Multiple Access 2000 (CDMA2000), Time Division-Synchronization Code Division Multiple Access (TD-SCDMA), Long Term Evolution (LTE), and the eMBB, URLLC and eMTC scenarios of the 5G mobile communication system.
  • the base station is a device deployed in a radio access network to provide a wireless communication function for an MS.
  • the base stations may include various forms of macro base stations, micro base stations (also called small cells), relay stations, access points, and the like.
  • the names of devices with base station functions may be different.
  • for example, in LTE systems it is called an evolved NodeB (evolved NodeB, eNB or eNodeB), and in 3rd Generation (3G) systems it is called a Node B (NodeB).
  • the above-mentioned apparatuses for providing wireless communication functions for MSs are collectively referred to as network equipment or base stations or BSs.
  • the MS involved in the solution of the present invention may include various handheld devices, vehicle-mounted devices, wearable devices, computing devices, or other processing devices connected to a wireless modem with wireless communication capabilities.
  • the MS may also be referred to as a terminal (terminal), and the MS may be a subscriber unit (subscriber unit), a cellular phone (cellular phone), a smart phone (smart phone), a wireless data card, a personal digital assistant (Personal Digital Assistant, PDA), a computer, a tablet computer, a wireless modem (modem), a handheld device (handset), a laptop computer (laptop computer), a machine type communication (Machine Type Communication, MTC) terminal, etc.
  • the machine learning system may include a central node and multiple sub-nodes (eg, sub-node 1-sub-node N).
  • the child node may also be called a local node.
  • the central node can communicate with each sub-node in a wired or wireless manner.
  • the central node can receive the local training results uploaded by each sub-node in the uplink transmission channel.
  • the local training result may include parameters of the local training model, or the local training model itself.
  • the central node may deliver the merged local training model parameters or the local training model itself to each sub-node through a broadcast or downlink transmission channel.
  • the transmission of local training model parameters or local training model can be realized by transmitting the binarized parameters corresponding to each parameter, or by directly transmitting the high-precision parameters corresponding to each parameter.
  • the parameters of the local training model may include one or more of parameters such as the weights and biases of the neural network, and the parameters of the local training model may also be their corresponding gradients.
  • the machine learning system may include a base station and N mobile phones (such as mobile phone 1 - mobile phone N).
  • the same basic training model may be pre-stored in the N mobile phones and base stations.
  • the mobile phone 1 can collect data of the corresponding scene, thereby forming the corresponding data set 1 .
  • Mobile phone 1 can input the data set 1 into the basic training model for local training. It can be understood that, in this embodiment of the present application, the mobile phone can perform local training through the BNN.
  • the mobile phone 1 can input the data set 1 into the basic training model, and obtain the local model parameter 1 according to the machine learning method of high-precision parameters.
  • the local model parameter 1 may be a high precision parameter.
  • the local model parameters 1 may include high precision weights and biases.
  • the mobile phone 1 can perform reverse deduction based on high-precision weights and biases, thereby completing the learning of a part of the data in the data set 1.
  • the mobile phone 1 can perform binarization conversion on the corresponding high-precision weights and biases to obtain corresponding binarization parameters.
  • the mobile phone 1 can perform training and learning based on the binarization parameter. Since the data volume of the binarization parameters is significantly smaller than the high-precision data volume, the mobile phone 1 can quickly acquire and complete the local training to acquire the converged weights and biases. Since the local training is based on the binarized parameters, the weight and bias results obtained by the mobile phone 1 can be the binarized parameters.
  • the binarized parameters obtained by local training of each mobile phone may be referred to as local parameters.
  • the binarized weights and biases obtained by mobile phone 1 after one round of learning can be called local parameters 1.
  • the binarized weights and biases obtained by mobile phone 2 after one round of learning can be called local parameters 2.
  • the binarized weights and biases obtained by the mobile phone N after one round of learning can be called local parameters N.
  • Each mobile phone can send its corresponding local parameters to the base station respectively.
  • the base station may fuse the acquired N local parameters (eg, local parameter 1-local parameter N) to obtain a normalized fusion parameter. Then, the base station can distribute the fusion parameters to each mobile phone respectively. After receiving the fusion parameters, the mobile phone can update the local basic training model accordingly, and perform the next round of learning or directly use it for data prediction in actual scenarios.
  • the following describes the processing and transmission of data in the sub-nodes and the central node by taking a machine learning system including three sub-nodes as an example.
  • a central fusion module may be set in the central node, and a learning module and a local fusion module may be set in each sub-node.
  • the learning module in the child node may be used to perform local training on the data set to obtain the binarized local parameters.
  • Subnode 1 can send the local parameter to the central node.
  • the child node 2 and the child node 3 may also send their corresponding local parameters to the central node.
  • the central fusion module in the central node can be used to fuse all received local parameters to obtain fusion parameters.
  • the central node may deliver the fusion parameter to the sub-node 1 to the sub-node 3 respectively.
  • the local fusion module can be used to update the local training model according to the received fusion parameters, so as to obtain a local training model based on the fusion parameters.
  • the local fusion module may be an independent module that is not included in the child node.
  • the central node only needs to send the fusion parameters to the local fusion module, and the local fusion module can be used to update the local training model and distribute the updated local training model to child node 1 to child node 3 respectively.
  • the performance requirements for the sub-nodes can be reduced, and at the same time, since the central node only needs to send the fusion parameters to the local fusion module, the signaling overhead of the central node can be reduced.
  • the local fusion module may also be set in some sub-nodes.
  • the local fusion module is integrated in the sub-node 3, and the local fusion modules of the sub-node 1 and the sub-node 2 can be set independently of the sub-nodes.
  • the central node can send the fusion parameters to the local fusion modules corresponding to the sub-node 1 and the sub-node 2, and the sub-node 3 respectively.
  • the local fusion modules corresponding to the sub-node 1 and the sub-node 2 can deliver the local training model updated based on the fusion parameters to the sub-node 1 and the sub-node 2.
  • the local fusion module integrated therein can be used to update the local training model according to the received fusion parameters, thereby obtaining the updated local training model.
  • The composition of the learning system shown in FIG. 6, FIG. 7 and FIG. 8 in this example is only an example, and in other implementations of the present application, the system may also include multiple independently configured local fusion modules.
  • Take a system configured with 5 sub-nodes (eg, sub-node 1 to sub-node 5) and 3 local fusion modules (eg, local fusion module 1 to local fusion module 3) as an example.
  • local fusion module 1 can provide local fusion services for sub-node 1 and sub-node 2
  • local fusion module 2 can provide local fusion services for sub-node 3 and sub-node 4
  • local fusion module 3 can provide local fusion services for sub-node 5.
  • the corresponding relationship between the local fusion module and the child nodes can also be reconfigured.
  • For example, after the reconfiguration, sub-node 5 may be provided with local fusion services by a different local fusion module.
  • local fusion module 3 can provide local fusion services for sub-node 4.
  • Alternatively, only one or some of the three local fusion modules may provide local fusion services to the sub-nodes.
  • For example, local fusion module 1 can provide local fusion services for sub-node 1 and sub-node 3, local fusion module 2 can provide local fusion services for sub-node 2, sub-node 5, and sub-node 4, and local fusion module 3 can be in a dormant state such as sleep.
  • FIG. 9 shows a logical schematic diagram of a machine learning method provided by an embodiment of the present application. As shown in Figure 9, the method may include:
  • S901: the child node 1 performs local learning.
  • S902: the child node 1 acquires local parameters.
  • S903: the child node 1 sends the local parameters to the central node.
  • For the other child nodes (eg, child node 2 to child node N), the above-mentioned S901-S903 can also be executed respectively, so that the central node can obtain N local parameters.
  • the local parameter may be a binarized local parameter.
  • S904: the central node fuses the N local parameters.
  • S905: the central node obtains fusion parameters.
  • S906: the central node sends the fusion parameters to the child node 1.
  • S907: the child node 1 updates the local model parameters according to the fusion parameters.
  • For the other sub-nodes, the central node can also execute the above S906 correspondingly, and the corresponding sub-nodes can also execute the above S907, so as to update their local training models.
  • Since the sub-nodes can send binarized local parameters to the central node, it is not necessary to send high-precision local parameters with a large amount of data to the central node, which can significantly reduce the communication pressure between the sub-nodes and the central node. Since the amount of transmitted data is small, the transmission time is correspondingly reduced, which increases the proportion of local training in the entire learning time and thereby improves learning efficiency.
  • the central node can send the fusion parameters to each sub-node by broadcasting, so that the central node does not need to send the fusion parameters to each sub-node one by one, thereby saving signaling overhead at the central node.
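  • As an illustration of the flow of S901-S907, the following is a minimal Python/NumPy sketch of one learning round. The node interface (local_train, high_precision_w) and the fusion callbacks are illustrative assumptions and are not taken from the embodiment itself.

```python
import numpy as np

def binarize(w):
    # Map high-precision weights to {-1, +1}; zeros are broken randomly.
    s = np.sign(w)
    zeros = (s == 0)
    s[zeros] = np.random.choice([-1.0, 1.0], size=np.count_nonzero(zeros))
    return s

def one_round(nodes, central_fuse, local_fuse):
    """One learning round: S901-S903 upload, S904-S905 central fusion, S906-S907 update."""
    uploads = []
    for node in nodes:
        node.local_train()                               # S901: local BNN training
        uploads.append(binarize(node.high_precision_w))  # S902-S903: binarized upload
    fused = central_fuse(uploads)                        # S904-S905: central fusion
    for node in nodes:                                   # S906: broadcast fusion parameters
        node.high_precision_w = local_fuse(node.high_precision_w, fused)  # S907
```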
  • In order to adapt to the learning needs of different scenarios, the embodiment of the present application further provides three modes for the method shown in FIG. 9, so that in scenarios with different requirements on learning efficiency and learning accuracy, the machine learning system can select the corresponding mode to achieve rapid convergence or high-accuracy learning.
  • Mode 1: binarized parameters are used when uploading model parameters, and high-precision parameters are used when the central node model is downloaded.
  • Since high-precision parameters are used for the download, the child nodes can update the local training model more accurately. Since binarized parameters are used for the upload, the overall learning efficiency is still significantly higher than that of the existing FL architecture, which uses high-precision parameters for both uplink and downlink. This mode can be used in scenarios that have certain requirements for both learning efficiency and learning accuracy.
  • Mode 2: binarized parameters are used for both uploading and downloading model parameters.
  • In this mode, the model parameters uploaded and downloaded are all binarized, so the data transmission pressure of the system is minimal. This mode can be used in scenarios that require high learning efficiency.
  • Mode 3: high-precision parameters are used for both uploading and downloading model parameters.
  • Since high-precision parameters are used for the download, the child nodes can more accurately update and obtain the local training model. This mode can be used in scenarios that require high learning accuracy.
  • the sub-node trains a local model based on the existing model and local data, and uploads the resulting local parameters to the central server.
  • After the central server receives the local parameters uploaded from all child nodes, it executes the central model fusion method to obtain the central parameters and distributes them to all child nodes in the form of a broadcast.
  • After receiving the central parameters, the child node immediately fuses them with the local high-precision parameters W_i using the local model fusion method.
  • The sub-node then re-binarizes the fused parameters and starts the next round of local training.
  • the central model fusion method and the local model fusion method differ according to the parameter transmission mode (eg mode 1-mode 3).
  • If the amount of data on each node is not equal, then, because of the large number of system nodes, the fused result can still be considered to reflect the proportion of positive or negative cumulative gradients in the data of all nodes. For node i, taking into account the proportion of its local data in the total, each node can be treated as if the total number of nodes and the amount of data per node were the same; the equivalent total number of nodes M' is then different for each node, but this does not affect the specific implementation details.
  • the following descriptions are given by taking the same size of each child node data set as an example. It can be understood that, in other implementation manners of the present application, the size of each sub-node data set may also be different. When the size of the data set is inconsistent, the calculation process is similar, which will not be repeated here.
  • After the local fusion module receives the fusion parameters, it performs the local fusion calculation. Because the high-precision parameters of all child nodes are accumulated based on their local data sets, they are strongly correlated, and because of the large number of nodes, it can be assumed that the high-precision parameters of all nodes are sampled values of approximately the same normal distribution. Further, it is assumed that the covariance matrix of this normal distribution is a diagonal matrix, that is, parameters at different positions are uncorrelated. Therefore, it can be assumed that each child node parameter satisfies W_i,j ~ N(μ_j, σ_j), where W_i,j represents the value of the j-th parameter of the i-th node.
  • the problem is thus transformed into how, under the above known conditions, to estimate the distribution parameters according to W_i,j and the received fusion parameter.
  • the local high-precision parameters of each node are likely to be different, and when estimated completely independently, the results will also be different.
  • all subscripts are omitted, and all symbols represent the parameters corresponding to the jth parameter of the ith node.
  • the problem is constructed as:
  • the optimization objective is an increasing and then decreasing function of y.
  • the value of this function is only related to ρ and M. Since M can be known in advance, the optimal solution can be drawn in advance as a relationship curve between μ and ρ.
  • this relationship can then be fitted with a specific function and saved locally, to reduce the complexity of solving the optimization problem.
  • In this example, a logarithmic function is used to perform a least-squares fit on this curve, where ρ is in the range [0,1] and c>0 ensures that the function is well defined.
  • For a given M, this fitted function can be used as an approximate expression for μ and stored locally. The solution of the original problem when W>0 is then obtained from it:
  • Here α>1 represents a magnification factor, which is a parameter selected in advance, generally between 1.5 and 2.5. (For the sake of system stability, in order to make the system converge more smoothly, it should be ensured that, as the computation proceeds, W tends to stay away from 0 but does not become too large. Since the absolute magnitude of the high-precision parameters is of little significance while their relative magnitude matters, all parameters can be proportionally enlarged and then limited.)
  • the local fusion module can update the local training model according to the fusion parameters obtained.
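  • Since the original formulas are not reproduced above, the following Python sketch only illustrates one way the described precomputation and magnification could be organized: tabulate a relation between the sign proportion ρ and the mean μ (for simplicity the large-M limit Φ(μ/σ) is used here), fit it with an assumed logarithmic form by least squares, and apply the magnification factor α with a limiting step at run time. The full estimator in the embodiment also involves the local high-precision parameter, whose exact role is not reproduced here; all names and the fitting form are assumptions.

```python
import numpy as np
from scipy.optimize import curve_fit
from scipy.stats import norm

def log_model(rho, c1, c2, c):
    # Assumed logarithmic fitting form; c > 0 keeps the argument positive on [0, 1].
    return c1 * np.log(rho + c) + c2

def fit_rho_to_mu(sigma=1.0):
    """Tabulate the mapping from the expected proportion of +1 signs (rho) to the
    mean mu of the assumed normal distribution, then least-squares fit a log curve."""
    mus = np.linspace(-3 * sigma, 3 * sigma, 200)
    rhos = norm.cdf(mus / sigma)          # large-M limit: E[rho] = Phi(mu / sigma)
    params, _ = curve_fit(log_model, rhos, mus, p0=[1.0, 0.0, 0.1],
                          bounds=([-np.inf, -np.inf, 1e-6], np.inf))
    return params

def local_estimate(rho_received, fit_params, alpha=2.0, limit=1.0):
    # alpha in [1.5, 2.5] magnifies the estimate; a limiting step keeps it bounded.
    mu_hat = log_model(rho_received, *fit_params)
    return np.clip(alpha * mu_hat, -limit, limit)
```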
  • After the central server receives the binarized parameters uploaded by all sub-nodes, it sums them and takes the sign to obtain the central parameters. If the result is exactly 0, a -1 or +1 is issued randomly. In this case, the meaning of the central parameter is that, for a given position, if the binarized parameter is positive for the larger proportion of all nodes, it takes +1; otherwise, it takes -1. This parameter therefore only reflects a general overall trend and contains less information.
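  • A minimal sketch of the sum-and-sign central fusion just described, with a random -1 or +1 issued when the sum is exactly 0 (names are illustrative):

```python
import numpy as np

def central_fuse_sign(binary_uploads):
    """binary_uploads: list of {-1, +1} arrays uploaded by the sub-nodes."""
    total = np.sum(binary_uploads, axis=0)
    fused = np.sign(total)
    ties = (fused == 0)
    # A parameter whose positive and negative votes cancel exactly gets a random -1 or +1.
    fused[ties] = np.random.choice([-1.0, 1.0], size=np.count_nonzero(ties))
    return fused
```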
  • After the child node receives the central parameters, the local fusion calculation uses a simple linear fusion method, where sign(·) represents the sign function and β, which is between 0 and 1, is a parameter selected in advance.
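  • The exact linear rule is not reproduced above; one plausible reading, in which the received ±1 central parameters are blended with the local high-precision parameters, is sketched below (the formula is an assumption):

```python
def local_fuse_linear(w_local, w_central_sign, beta=0.5):
    # Assumed form of the linear fusion: blend the local high-precision parameters
    # with the received {-1, +1} central parameters; beta in (0, 1) is chosen in advance.
    return (1.0 - beta) * w_local + beta * w_central_sign
```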
  • After the central server receives the high-precision parameters uploaded by all the sub-nodes, it averages them to obtain the central parameters. In this case, the central parameter represents the weighted average of the cumulative gradients calculated by all child nodes according to their local data sets.
  • After receiving the central parameters, the child node directly updates the local high-precision parameters with them.
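  • For completeness, a corresponding sketch for this mode, in which the central server averages the high-precision uploads and each sub-node simply adopts the result (names are illustrative):

```python
import numpy as np

def central_fuse_average(hp_uploads):
    # Average the high-precision parameters uploaded by all sub-nodes.
    return np.mean(hp_uploads, axis=0)

def local_fuse_replace(w_local, w_central):
    # The sub-node directly adopts the central parameters as its new local parameters.
    return w_central
```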
  • This mode is basically the same as the fusion method under the traditional FL framework; the only difference is that the local model is a BNN, so the complexity of forward computation and inference during training is lower than that of ordinary neural networks.
  • any one of the above mode 1, mode 2, or mode 3 may be selected as the transmission mode, so as to obtain the corresponding beneficial effects. It should be noted that no matter whether mode 1, mode 2, or mode 3 is adopted, since a BNN is used for local training at the child nodes, results can be obtained iteratively faster than with the existing FL architecture.
  • the above-mentioned mode 1 and mode 2, or mode 1 and mode 3, or mode 2 and mode 3, or mode 1, mode 2, and mode 3 can also be combined to implement the method shown in FIG. 9.
  • For example, mode 2 can be used at the beginning of learning.
  • As learning progresses, mode 1 can be used to continue learning, thereby appropriately improving the accuracy of the parameters.
  • In the later stage, mode 3 can be used to continue learning, thereby obtaining the result with the highest accuracy.
  • Table 1 shows a correspondence between accuracy and mode selection.
  • When the system accuracy is low (eg, not higher than the lower threshold in Table 1), the central node determines to continue to use mode 2 for learning.
  • When the accuracy falls in the intermediate range, the central node can determine to use mode 1 for learning.
  • When the accuracy is high (eg, not lower than the higher threshold), the central node can determine to learn through mode 3.
  • the accuracy may be calculated by the central node according to the accuracy uploaded by each sub-node.
  • the central node can calculate the system accuracy based on the uploaded accuracies, and determine the transmission mode according to the correspondence shown in Table 1. It should be noted that 0.65/0.8 in Table 1 are only examples of threshold settings; in other implementations of the present application, the thresholds may be set to other values, or adjusted flexibly according to the environment.
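  • Assuming Table 1 maps low accuracy to mode 2, intermediate accuracy to mode 1, and high accuracy to mode 3, with the example thresholds 0.65 and 0.8, the selection logic could be sketched as follows:

```python
def select_mode_by_accuracy(system_accuracy, low=0.65, high=0.8):
    """Example mapping from system accuracy to the parameter transmission mode."""
    if system_accuracy <= low:
        return 2   # favor fast convergence: binarized uplink and downlink
    if system_accuracy >= high:
        return 3   # favor accuracy: high-precision uplink and downlink
    return 1       # intermediate: binarized uplink, high-precision downlink
```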
  • at each sub-node, the local training model corresponding to the local parameters can be verified using the test set stored therein, thereby obtaining the corresponding accuracy, which is then sent to the central node.
  • the operation of verifying the acquisition accuracy may also be completed at the central node.
  • the training model corresponding to the local node may be stored in the central node.
  • the central node can update the training model according to the local parameters, and verify the accuracy corresponding to the local parameters based on the updated training model and the test set stored in the central node. In this way, the corresponding accuracy can be obtained.
  • the central node can also obtain the corresponding accuracy.
  • the central node can calculate the accuracy of the corresponding system based on the accuracy corresponding to each local parameter, and then determine the data transmission mode.
  • the accuracy may also be obtained by the central node by verifying, with the test set or the verification data set, the training model updated with the fusion parameters after central fusion.
  • the method for determining the accuracy may be flexibly determined, which is not limited in this embodiment of the present application.
  • the central node may also determine the transmission mode according to other methods. For example, the central node may determine the transmission mode according to the number of iteration rounds N. Table 2 shows a possible correspondence between the number of iteration rounds N and the transmission mode.
  • when the number of iteration rounds is within 5, the central node can determine that, in the current learning, the need to improve the convergence speed is higher, so mode 2 can be used for learning.
  • when the number of iteration rounds is between 5 and 50, the central node can determine that the accuracy needs to be appropriately improved in the current learning, and then adopt mode 1 to learn.
  • when the number of iteration rounds is greater than 50, the central node can consider that the learning is about to end and that parameter transmission needs to be performed with the highest accuracy, that is, mode 3 is used for learning.
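  • The round-based rule of Table 2 can be sketched in the same way (the boundary values 5 and 50 are taken from the text; the exact table boundaries are otherwise assumed):

```python
def select_mode_by_round(iteration_round):
    """Example mapping from the iteration round to the parameter transmission mode."""
    if iteration_round <= 5:
        return 2   # early rounds: prioritize convergence speed
    if iteration_round > 50:
        return 3   # final rounds: prioritize accuracy
    return 1       # middle rounds: balance efficiency and accuracy
```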
  • the central node may instruct each sub-node to adjust the parameter transmission mode.
  • three modes can be indicated by 2 bits, for example, 00 indicates that the next round parameter transmission mode is mode 1, 01 indicates mode 2, and 10 indicates mode 3.
  • the parameter transmission mode field can be delivered together with the central fusion model, or delivered through a dedicated control channel.
  • a 4-layer convolutional neural network consisting of two convolutional layers and two fully connected layers is used.
  • the training set in the MNIST dataset is evenly distributed on 100 child nodes.
  • Each node has a total of 600 pairs of data, including 60 pairs of each type of data, and the test set is only stored on the central server.
  • the final training-set-related results are the mean of the results of the 100 nodes, and the test-set-related results are computed by the central server based on the binarized local parameters. High-precision parameters are quantized with 32 bits.
  • the specific structure is: a 3*3*16 convolutional layer, a normalization layer, a 2*2 max pooling layer, a tanh activation function, a 3*3*16 convolutional layer, a normalization layer, a 2*2 max pooling layer, a tanh activation function, a 784*100 fully connected layer, a normalization layer, a tanh activation function, a 100*10 fully connected layer, a softmax activation function, and finally a cross-entropy loss function.
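  • For reference, the layer sequence above could be written as follows in PyTorch. This is only a high-precision rendering of the layer order; the binarization of weights and activations used in the actual BNN training is omitted, and the convolutional padding is assumed so that the flattened feature size matches the 784-input fully connected layer.

```python
import torch.nn as nn

model = nn.Sequential(
    nn.Conv2d(1, 16, kernel_size=3, padding=1),   # 3*3*16 convolutional layer
    nn.BatchNorm2d(16),                           # normalization layer
    nn.MaxPool2d(2),                              # 2*2 max pooling layer
    nn.Tanh(),                                    # tanh activation
    nn.Conv2d(16, 16, kernel_size=3, padding=1),  # 3*3*16 convolutional layer
    nn.BatchNorm2d(16),
    nn.MaxPool2d(2),
    nn.Tanh(),
    nn.Flatten(),                                 # 7*7*16 = 784 features for a 28*28 MNIST input
    nn.Linear(784, 100),                          # 784*100 fully connected layer
    nn.BatchNorm1d(100),
    nn.Tanh(),
    nn.Linear(100, 10),                           # 100*10 fully connected layer
)
loss_fn = nn.CrossEntropyLoss()                   # softmax + cross-entropy loss
```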
  • the initial value of the learning rate ⁇ is 0.05, and then decreases every 30 iterations to 0.02, 0.01, 0.005, and 0.002.
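  • The stepwise learning-rate schedule can be expressed as a small helper (values taken from the text; the indexing convention is assumed):

```python
def learning_rate(iteration):
    # Starts at 0.05 and drops every 30 iterations to 0.02, 0.01, 0.005, 0.002.
    schedule = [0.05, 0.02, 0.01, 0.005, 0.002]
    return schedule[min(iteration // 30, len(schedule) - 1)]
```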
  • FIG. 11 shows the curve of the accuracy rate of the test set changing with time when the present invention is applied to the MNIST handwritten digit recognition data set.
  • centralized training means that all data is collected at a central node for training, and this curve is used as a baseline for comparison. It can be seen that, in terms of test set accuracy, both mode 1 and mode 3 can come close to the centralized result. Mode 3 has a slight advantage in training effect, but mode 1 requires only a small amount of communication per iteration, so its communication cost is much lower than that of mode 3; moreover, mode 3 is not very stable.
  • Although the final performance of mode 2 is poorer, it is highly competitive in the early stage of training due to the extremely low amount of communication required; as training progresses, however, its performance cannot match that of the first two modes.
  • mode 1 in the present invention is suitable for most practical situations
  • mode 2 is suitable for application in the early stage of training, or when communication resources are very tight and the requirements for learning effect are very low
  • mode 3 is suitable for the later stage of training, to fine-tune a well-trained model.
  • Table 3 gives a comparison of the amount of computation and communication per child node required to achieve 90% and 95% accuracy on the test set for the first time.
  • the total number of parameters of the system is 82242.
  • When the system works in mode 1, it needs to upload 10.04 KB and download 66.70 KB of data in each iteration; when it works in mode 2, it needs to upload and download 10.04 KB of data each time (since mode 2 cannot reach the 95% accuracy rate, the corresponding entry is left blank); when working in mode 3, 321.26 KB of data needs to be uploaded and downloaded each time.
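  • These per-iteration figures are consistent with simple bit counting over the 82242 parameters, assuming 1 bit per binarized parameter and 32 bits per high-precision parameter:

$$82242 \times 1\,\text{bit} \approx 10.04\,\text{KB}, \qquad 82242 \times 32\,\text{bits} \approx 321.26\,\text{KB}.$$

The 66.70 KB downlink figure for mode 1 lies between these two values, indicating that the high-precision download is not transmitted as a plain 32-bit representation.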
  • the communication amount required for each iteration of the mode 1 method in the present invention is greatly reduced, which brings an order-of-magnitude reduction in the total time required for distributed machine learning tasks.
  • a 4-layer convolutional neural network consisting of two convolutional layers and two fully connected layers is used.
  • the network structure is the same as that in Section 2.4.1.
  • the initial value of the learning rate is 0.02, and it then decreases every 30 iterations to 0.01, 0.005, and 0.002.
  • the training set in the MNIST data set is unevenly distributed on 100 sub-nodes.
  • the specific distribution method is as follows: first, the data set is divided into 10 parts according to the type, and then each part is divided into 100 parts to obtain 1000 sub-data sets. The 1000 subdatasets are randomly assigned to 100 child nodes, and each child node is randomly assigned to 10 subdatasets. The test set is only saved on the central server.
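  • The partitioning just described can be reproduced with a few lines of Python (an index-based sketch over the MNIST labels; names are illustrative):

```python
import numpy as np

def partition_non_iid(labels, num_nodes=100, shards_per_node=10,
                      num_classes=10, shards_per_class=100, seed=0):
    """Split sample indices by class into 10 parts, cut each class into 100 shards
    (1000 shards in total), then randomly hand 10 shards to each of the 100 sub-nodes."""
    rng = np.random.default_rng(seed)
    shards = []
    for c in range(num_classes):
        idx = np.where(labels == c)[0]
        shards.extend(np.array_split(idx, shards_per_class))  # 100 shards per class
    order = rng.permutation(len(shards))                       # shuffle the 1000 shards
    return [np.concatenate([shards[j] for j in
                            order[i * shards_per_node:(i + 1) * shards_per_node]])
            for i in range(num_nodes)]
```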
  • FIG. 12 is a curve showing the change of the accuracy rate of the test set when the present invention is applied to the non-IID MNIST handwritten digit recognition data set.
  • the mode switching in the hybrid mode is judged by the number of iteration rounds. Initially, mode 2 is used for training. After 5 iterations, it is changed to mode 1, and then it is changed to mode 3 after 50 iterations. In this case, due to the non-IID characteristics of the dataset, the system will suffer a certain performance loss. It can be seen that the mixed mode can also achieve better results in the end.
  • Table 4 shows the communication and computational costs required to achieve a certain accuracy for 5 consecutive times for the three modes and the hybrid mode.
  • each functional module may be divided according to each function, or two or more functions may be integrated into one processing module.
  • the above-mentioned integrated modules can be implemented in the form of hardware, and can also be implemented in the form of software function modules. It should be noted that, the division of modules in the embodiments of the present application is schematic, and is only a logical function division, and there may be other division manners in actual implementation.
  • a machine learning device 1300 is provided in an embodiment of the present application.
  • the device can be applied to a sub-node, and a binary neural network model BNN is set in the sub-node.
  • the device includes: an obtaining unit 1301, configured to perform BNN-based machine learning on the collected local data set to obtain local model parameters corresponding to the local data set.
  • the sending unit 1302 is configured to send a first message to the central node, where the first message includes local model parameters.
  • the local model parameters included in the first message are binarized local model parameters.
  • the apparatus further includes: a receiving unit 1303, configured to receive fusion parameters from the central node, where the fusion parameters are obtained by the central node by fusion according to local model parameters.
  • the fusion unit 1304 is configured to fuse the fusion parameters and the local model parameters to obtain updated local model parameters.
  • the fusion parameters are binarized model parameters, or the fusion parameters are high-precision model parameters.
  • the first message further includes: accuracy information corresponding to the local model parameters.
  • the obtaining unit 1301 is further configured to verify and obtain the accuracy information according to the local model parameters and the test data set.
  • the apparatus further includes: a learning unit 1305, configured to continue machine learning based on the BNN according to the updated local model parameters.
  • a machine learning apparatus 1400 is provided in an embodiment of the present application.
  • the apparatus is applied to a central node.
  • the apparatus includes: a receiving unit 1401, configured to receive N first messages from N child nodes, respectively.
  • the message includes local model parameters, which are binarized local model parameters.
  • N is an integer greater than or equal to 1.
  • the fusion unit 1402 is configured to fuse the local model parameters included in the N first messages to obtain fusion parameters.
  • the sending unit 1403 is configured to send a second message to the M sub-nodes, where the second message includes a fusion parameter, and M is a positive integer greater than or equal to 1.
  • the M child nodes are included in the N child nodes.
  • the fusion unit 1402 is specifically configured to perform a weighted average on the N local model parameters to obtain the fusion parameters.
  • the fusion parameter included in the second message is a high-precision fusion parameter, or the fusion parameter included in the second message is a binarized fusion parameter.
  • the apparatus further includes: a determining unit 1404, configured to determine the system accuracy information according to the N first messages; the central node determines, according to the system accuracy information, whether the fusion parameter included in the second message is a high-precision fusion parameter or a binarized fusion parameter.
  • the first message further includes: accuracy information.
  • the accuracy information corresponds to the accuracy obtained by verifying the local model parameters included in the first message at the corresponding child node.
  • the determining unit 1404 is specifically configured to determine the system accuracy information according to the accuracy information included in the N messages.
  • the determining unit 1404 is configured to determine that the fusion parameter is a binarized fusion parameter when the system accuracy information is less than or equal to the first threshold.
  • the determining unit 1404 is further configured to determine that the fusion parameter is a high-precision fusion parameter when the system accuracy information is greater than or equal to the second threshold.
  • the sending unit 1403 is configured to send the second message including the binarized fusion parameter to the M sub-nodes when the number of iteration rounds is less than or equal to the third threshold.
  • the sending unit 1403 is further configured to send a second message including a high-precision fusion parameter to the M sub-nodes when the number of iteration rounds is greater than or equal to the fourth threshold.
  • the sending unit 1403 is specifically configured to send the second message to the M sub-nodes through broadcasting.
  • FIG. 15 shows a schematic composition diagram of a child node 1500 .
  • the child node 1500 may include: a processor 1501 and a memory 1502 .
  • the memory 1502 is used to store computer-executable instructions. Exemplarily, in some embodiments, when the processor 1501 executes the instructions stored in the memory 1502, the child node 1500 may be caused to execute the data processing method shown in any of the foregoing embodiments.
  • FIG. 16 shows a schematic diagram of the composition of a chip system 1600 .
  • the chip system may be applied to any sub-node involved in the embodiments of this application.
  • the chip system 1600 may include: a processor 1601 and a communication interface 1602, which are used to support related devices to implement the functions involved in the above embodiments.
  • the chip system further includes a memory for storing necessary program instructions and data of the child nodes.
  • the chip system may be composed of chips, or may include chips and other discrete devices.
  • the communication interface 1602 may also be referred to as an interface circuit.
  • FIG. 17 shows a schematic diagram of the composition of a central node 1700 .
  • the central node 1700 may include: a processor 1701 and a memory 1702 .
  • the memory 1702 is used to store computer-executable instructions.
  • the processor 1701 executes the instructions stored in the memory 1702
  • the central node 1700 can be caused to execute the data processing method shown in any one of the foregoing embodiments.
  • FIG. 18 shows a schematic composition diagram of a chip system 1800 .
  • the chip system may be applied to any central node involved in the embodiments of this application.
  • the chip system 1800 may include: a processor 1801 and a communication interface 1802, which are used to support related devices to implement the functions involved in the above embodiments.
  • the chip system further includes a memory for storing necessary program instructions and data of the central node.
  • the chip system may be composed of chips, or may include chips and other discrete devices.
  • the communication interface 1802 may also be referred to as an interface circuit.
  • the functions or actions or operations or steps in the above embodiments may be implemented in whole or in part by software, hardware, firmware or any combination thereof.
  • When implemented using a software program, the above may be implemented in whole or in part in the form of a computer program product.
  • the computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on the computer, all or part of the processes or functions described in the embodiments of the present application are generated.
  • the computer may be a general purpose computer, special purpose computer, computer network, or other programmable device.
  • the computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another computer-readable storage medium; for example, the computer instructions may be transmitted from a website, computer, server, or data center to another website, computer, server, or data center by wire (eg, coaxial cable, optical fiber, digital subscriber line (DSL)) or wirelessly (eg, infrared, radio, microwave, etc.).
  • the computer-readable storage medium can be any available medium that can be accessed by a computer, or a data storage device, such as a server or a data center, that integrates one or more available media.
  • the usable media may be magnetic media (e.g., floppy disk, hard disk, magnetic tape), optical media (e.g., DVD), or semiconductor media (e.g., solid state drive (SSD)), and the like.

Abstract

Embodiments of the present application relate to the technical field of machine learning. Disclosed are a machine learning method, device, and system, which are capable of significantly reducing the amount of data transmitted during machine learning, thereby reducing the proportion of time consumed by data transmission in the entire learning process and effectively improving the efficiency of machine learning. The specific solution comprises: a sub-node performs BNN-based machine learning on a collected local data set, so as to obtain local model parameters corresponding to the local data set; and the sub-node sends a first message to a central node, the first message comprising the local model parameters.

Description

A machine learning method, apparatus and system
This application claims priority to the Chinese patent application No. 202011365069.6, entitled "A Machine Learning Method, Device and System", filed with the State Intellectual Property Office on November 27, 2020, which is incorporated herein by reference in its entirety.
Technical Field
The present application relates to the technical field of machine learning, and in particular, to a machine learning method, apparatus and system.
Background
With the development of artificial intelligence (AI) technology, research on the combination of neural networks and communication systems has received extensive attention. For example, in scenarios such as intelligent communication, the Internet of Things, the Internet of Vehicles, and smart cities, machine learning can be used to improve the flexibility of the network.
Illustratively, take intelligent communication as an example. Through machine learning, a computing node (such as a base station) can perform machine learning based on the data uploaded by each local node (such as a mobile phone), combined with the neural network model set therein. The base station can deliver a model that meets preset conditions, obtained through machine learning, to each mobile phone, so that each mobile phone can adjust its own communication process according to the model to realize intelligent communication.
It can be seen that, in the process of machine learning based on the combination of a neural network and a communication system, each local node needs to upload all the data it collects to the computing node, so that the computing node can perform training based on the data. Depending on the scenario, the amount of data to be transmitted may be very large. Since each local node needs to transmit data to the computing node, the data transmission pressure between the local nodes and the computing node is relatively high. In addition, since the data is transmitted directly from the local nodes to the central node, it may include content related to users' private information, which also makes that private information insecure.
Summary of the Invention
The embodiments of the present application provide a machine learning method, apparatus and system, which can significantly reduce the amount of data transmitted in the machine learning process, thereby reducing the proportion of time consumed by data transmission in the entire learning process and effectively improving the efficiency of machine learning.
In order to achieve the above purpose, the embodiments of the present application adopt the following technical solutions:
In a first aspect, a machine learning method is provided. The method is applied to a sub-node in which a binarized neural network model (BNN) is set, and the method includes: the sub-node performs BNN-based machine learning on a collected local data set to obtain local model parameters corresponding to the local data set; and the sub-node sends a first message to a central node, where the first message includes the local model parameters.
Based on this solution, a scheme combining a BNN with a distributed machine learning architecture is provided. In this example, BNN-based binarized machine learning can be performed locally, and the parameters of the local neural network model obtained thereby can be binarized parameters. Compared with directly sending a high-precision neural network model or model parameters to the central node, sending binarized model parameters to the central node can significantly reduce the requirement on data transmission bandwidth, and since the amount of data to be transmitted is significantly smaller, the transmission time is correspondingly reduced. It can be understood that the sub-node does not perform machine learning during data transmission; therefore, reducing the data transmission time can significantly increase the proportion of time spent on machine learning in the entire learning process, thereby improving learning efficiency.
In a possible design, the local model parameters included in the first message are binarized local model parameters. Based on this solution, the form of the local model parameters during transmission is specified. In this example, the transmitted local model parameters are binarized parameters, that is, they include only the two values +1 and -1; obviously, each parameter then corresponds to only 1 bit of bandwidth. In contrast, data transmission under the conventional FL architecture is based on high-precision data, and a single high-precision element may correspond to a transmission bandwidth of 16 bits or more, so transmitting a model parameter matrix that includes many high-precision elements consumes more transmission resources. Therefore, the solution provided in this example can effectively reduce the demand for data transmission resources and shorten the transmission time, thereby improving the learning efficiency of the system.
In a possible design, the method further includes: the sub-node receives fusion parameters from the central node, where the fusion parameters are obtained by the central node through fusion of local model parameters; and the sub-node updates the local model parameters according to the fusion parameters to obtain updated local model parameters. Based on this solution, a method is provided for a sub-node to update its model by receiving information. It can be understood that, in a distributed learning architecture, data can be aggregated at the central node; for example, the central node can obtain the model parameters transmitted by each sub-node and fuse them, thereby obtaining highly adaptable model parameters (such as the above fusion parameters) that fit the situation of each sub-node. The sub-node can obtain the fusion parameters from the central node and then perform local fusion according to the fusion parameters in combination with the local parameters, thereby updating the local neural network model.
In a possible design, the fusion parameters are binarized model parameters, or the fusion parameters are high-precision model parameters. Based on this solution, different forms of the fusion parameters are given. For example, in some implementations, the fusion parameters may be binarized model parameters; in other implementations, the fusion parameters may be high-precision model parameters. It can be understood that, in the present application, the sub-nodes can transmit binarized model parameters to the central node, and the central node can fuse multiple binarized model parameters (for example, by weighted averaging), thereby obtaining the corresponding fusion parameters. Taking weighted averaging as an example of the fusion processing, the model parameters obtained after fusion obviously do not only include the elements +1 and -1; that is, the fusion model obtained after weighted averaging consists of high-precision model parameters. In the present application, according to different scenarios, the fusion parameters can be delivered either as the directly obtained high-precision parameters, so that the sub-nodes can update the local model more accurately according to the high-precision fusion parameters, or as binarized fusion parameters obtained by binarizing the high-precision parameters, so that the data can be downloaded faster and the sub-nodes can complete the update of the local model more quickly. For a sub-node, when receiving binarized fusion parameters, local fusion can be performed according to the local fusion method corresponding to binarized fusion parameters; when receiving high-precision fusion parameters, local fusion can be performed according to the local fusion method corresponding to high-precision fusion parameters. For the specific implementation, refer to the description in the embodiments, which is not repeated here.
In a possible design, the first message further includes accuracy information corresponding to the local model parameters, where the accuracy information is obtained by the sub-node through verification according to the local model parameters and a test data set. Based on this solution, specific content of the first message is provided. In this example, after completing a round of local learning, the sub-node can send the accuracy information corresponding to the learning result to the central node, so that the central node can determine the system accuracy according to the accuracy information reported by each sub-node, and then determine whether to deliver binarized fusion parameters or high-precision fusion parameters. It should be noted that, in other implementations of the present application, the accuracy information may also be obtained by the sub-node through verification according to the local model parameters and a verification data set.
In a possible design, the method further includes: the sub-node continues machine learning based on the BNN according to the updated local model parameters. Based on this solution, an example is provided of what the sub-node does after updating the local model according to the fusion parameters. In this example, after updating the local model parameters, the sub-node can continue the second or subsequent rounds of learning on the local model based on the existing data set, or in combination with a newly added data set, and repeat the method in the above example until the learning result converges, thereby completing the machine learning. It should be noted that, in other implementations of the present application, regardless of whether the learning result converges, the updated local model can be used to guide the current service of the sub-node, for example, to predict data trends.
In a second aspect, a machine learning method is provided. The method is applied to a central node and includes: the central node receives N first messages respectively from N sub-nodes, where each first message includes local model parameters, the local model parameters are binarized local model parameters, and N is an integer greater than or equal to 1; the central node fuses the local model parameters included in the N first messages to obtain fusion parameters; and the central node sends a second message to M sub-nodes, where the second message includes the fusion parameters and M is a positive integer greater than or equal to 1.
Based on this solution, the central node can receive local model parameters from multiple sub-nodes and perform fusion based on these local model parameters, thereby obtaining a fusion model with stronger adaptability. The central node can deliver the fusion model to each sub-node, so that the sub-nodes can perform local fusion according to the fusion model to complete a round of learning. Compared with the distributed architecture in the existing FL framework, in this example, the model parameters received by the central node from the sub-nodes can be binarized model parameters. It can be understood that the data volume of binarized model parameters is significantly smaller than that of ordinary model parameters (such as high-precision model parameters), so the uploading process is more efficient. The fusion parameters obtained after the central node fuses the local model parameters can be adapted to the data set types of the sub-nodes, and are therefore more accurate and more adaptable. It should be noted that, in some implementations of the present application, some of the N sub-nodes may be reference sub-nodes, which can be used to provide local model parameters but do not need the fusion parameters from the central node, while nodes that need to update their local models according to the fusion parameters may not be included in the N sub-nodes. Therefore, in some implementations, the M sub-nodes may also include sub-nodes that are not among the N sub-nodes, or the M sub-nodes may be a part of the N sub-nodes. The specific determination of the M sub-nodes can be flexibly configured in the actual implementation.
In a possible design, the central node fusing the N local model parameters to obtain the fusion parameters includes: the central node performs a weighted average on the N local model parameters to obtain the fusion parameters. Based on this solution, a way of obtaining the fusion parameters is provided. In this example, the central node can process the N local parameter models through a simple weighted average, where the weights can be determined according to the size of the data set input during local training of each local parameter model. The central node can obtain, from each sub-node, the size of the data set used in the current round of learning, or obtain from other nodes the size of the data set used by each sub-node during local learning. Of course, in some other implementations of the present application, the central node may also adjust the weights in combination with other factors; for example, the weight of some frequently used sub-nodes may be appropriately increased, while the weight of sub-nodes whose neural network model is used less may be appropriately decreased.
In a possible design, the fusion parameters included in the second message are high-precision fusion parameters, or the fusion parameters included in the second message are binarized fusion parameters. Based on this solution, an example of how the central node delivers the fusion parameters is provided. In this example, the central node may directly deliver the high-precision parameters obtained after fusion to the sub-nodes through the second message, or may binarize the high-precision parameters obtained after fusion and then deliver them to the sub-nodes through the second message. It can be understood that, when the data transmission rate needs to be improved, binarized fusion parameters can be delivered, and when the accuracy needs to be improved, high-precision fusion parameters can be delivered.
In a possible design, before the central node sends the second message to the M sub-nodes, the method further includes: the central node determines system accuracy information according to the N first messages, and the central node determines, according to the system accuracy information, whether the fusion parameters included in the second message are high-precision fusion parameters or binarized fusion parameters. Based on this solution, a mechanism is provided for the central node to adjust between delivering high-precision fusion parameters and binarized fusion parameters. In this example, the central node can determine, according to the system accuracy information, whether to deliver high-precision fusion parameters or binarized fusion parameters; for example, binarized fusion parameters can be delivered when the system accuracy is low, and high-precision fusion parameters can be delivered when the system accuracy is high. The system accuracy can be determined according to the accuracy of each sub-node, or can be determined by the central node itself through verification according to the model parameters of each sub-node.
In a possible design, the first message further includes accuracy information, where the accuracy information corresponds to the accuracy obtained by verifying, at the corresponding sub-node, the local model parameters included in the first message. The central node determining the system accuracy information according to the N first messages includes: the central node determines the system accuracy information according to the accuracy information included in the N messages. Based on this solution, a method is provided for the central node to determine the system accuracy information. In this example, each sub-node can send to the central node the accuracy obtained by verifying the model parameters acquired in the current round of learning, and the central node can determine the system accuracy according to the accuracy uploaded by each sub-node, and then adjust the form of the delivered fusion parameters accordingly.
In a possible design, when the system accuracy information is less than or equal to a first threshold, the central node determines that the fusion parameters are binarized fusion parameters; when the system accuracy information is greater than or equal to a second threshold, the central node determines that the fusion parameters are high-precision fusion parameters. Based on this solution, a specific example is provided of how the central node determines the form of the delivered fusion parameters. In this example, when the central node determines that the system accuracy is less than or equal to the first threshold, it considers that learning in the current system is at a preliminary stage and the model parameters still have a large adjustment space, so high-precision data transmission is not needed and the improvement of data transmission efficiency should take priority; therefore, the central node can deliver binarized fusion parameters, thereby increasing the data transmission rate. Correspondingly, when the central node determines that the system accuracy is greater than or equal to the second threshold, it considers that learning in the current system is close to convergence and the adjustment space of the model parameters is small, so data transmission with higher precision is needed; therefore, the central node can deliver high-precision fusion parameters, thereby improving the accuracy of the model parameters. The first threshold and the second threshold may be preset, and in different implementations they may be the same or different.
In a possible design, the central node sending the second message to the M sub-nodes includes: when the number of iteration rounds is less than or equal to a third threshold, the central node sends a second message including binarized fusion parameters to the M sub-nodes; when the number of iteration rounds is greater than or equal to a fourth threshold, the central node sends a second message including high-precision fusion parameters to the M sub-nodes. Based on this solution, another mechanism is provided for the central node to determine the form of the delivered fusion parameters. In this example, the central node can determine the form of the delivered fusion parameters according to the number of iteration rounds. For example, when the number of iteration rounds is small, that is, less than or equal to the third threshold, the central node may consider that the improvement of data transmission efficiency should take priority in the current state, and may therefore choose to deliver binarized fusion parameters to increase the data transmission rate. Correspondingly, when the number of iteration rounds is large, that is, greater than or equal to the fourth threshold, the central node may consider that accuracy should take priority in the current state, and may therefore choose to deliver high-precision fusion parameters to improve the accuracy of the local fusion process.
In a possible design, the central node sends the second message to the M sub-nodes by broadcasting. Based on this solution, a way for the central node to deliver the second message is provided. In this example, the central node can deliver the second message in the form of a broadcast without delivering it to each sub-node separately. It can be understood that the content delivered to each sub-node is similar, so the data can be delivered to all sub-nodes at the same time by broadcasting. Moreover, since what is transmitted is binarized fusion parameters or high-precision fusion parameters, the broadcast form of transmission does not affect information security.
In a third aspect, a machine learning apparatus is provided. The apparatus can be applied to a sub-node in which a binarized neural network model (BNN) is set, and the apparatus includes: an obtaining unit, configured to perform BNN-based machine learning on a collected local data set to obtain local model parameters corresponding to the local data set; and a sending unit, configured to send a first message to a central node, where the first message includes the local model parameters.
In a possible design, the local model parameters included in the first message are binarized local model parameters.
In a possible design, the apparatus further includes: a receiving unit, configured to receive fusion parameters from the central node, where the fusion parameters are obtained by the central node through fusion of local model parameters; and a fusion unit, configured to perform fusion according to the fusion parameters and the local model parameters to obtain updated local model parameters.
In a possible design, the fusion parameters are binarized model parameters, or the fusion parameters are high-precision model parameters.
In a possible design, the first message further includes accuracy information corresponding to the local model parameters. The obtaining unit is further configured to verify and obtain the accuracy information according to the local model parameters and a test data set.
In a possible design, the apparatus further includes: a learning unit, configured to continue machine learning based on the BNN according to the updated local model parameters.
第四方面,提供一种机器学习装置,该装置应用于中心节点,装置包括:接收单元,用于接收分别来自N个子节点的N个第一消息,第一消息包括本地模型参数,本地模型参数是二值化的本地模型参数。N为大于或等于1的整数。融合单元,用于对N个第一消息包括的本地模型参数进行融合,获取融合参数。发送单元,用于向M个子节点发送第二消息,第二消息包括融合参数,M为大于或等于1的正整数。In a fourth aspect, a machine learning device is provided, the device is applied to a central node, and the device includes: a receiving unit, configured to receive N first messages from N sub-nodes respectively, the first messages include local model parameters, local model parameters are the binarized local model parameters. N is an integer greater than or equal to 1. The fusion unit is configured to fuse the local model parameters included in the N first messages to obtain fusion parameters. A sending unit, configured to send a second message to the M sub-nodes, where the second message includes a fusion parameter, and M is a positive integer greater than or equal to 1.
在一种可能的设计中,融合单元,具体用于对N个本地模型参数,进行加权平均,获取融合参数。In a possible design, the fusion unit is specifically used to perform a weighted average of N local model parameters to obtain fusion parameters.
在一种可能的设计中,第二消息包括的融合参数是高精度的融合参数,或者,第二消息包括的融合参数是二值化的融合参数。In a possible design, the fusion parameter included in the second message is a high-precision fusion parameter, or the fusion parameter included in the second message is a binarized fusion parameter.
在一种可能的设计中，该装置还包括：确定单元，用于根据N个第一消息，确定系统准确度信息，中心节点根据系统准确度信息，确定第二消息中包括的融合参数为高精度的融合参数或者二值化的融合参数。In a possible design, the apparatus further includes: a determining unit, configured to determine system accuracy information according to the N first messages, where the central node determines, according to the system accuracy information, whether the fusion parameter included in the second message is a high-precision fusion parameter or a binarized fusion parameter.
在一种可能的设计中,第一消息还包括:准确度信息。准确度信息与第一消息中包括的本地模型参数在对应子节点处校验获取的准确度对应。确定单元,具体用于根据N个消息中包括的准确度信息,确定系统准确度信息。In a possible design, the first message further includes: accuracy information. The accuracy information corresponds to the accuracy obtained by verifying the local model parameters included in the first message at the corresponding child node. The determining unit is specifically configured to determine the system accuracy information according to the accuracy information included in the N messages.
在一种可能的设计中,确定单元,用于在系统准确度信息小于或等于第一阈值时,确定融合参数为二值化的融合参数。确定单元,还用于在系统准确度信息大于或等于第二阈值时,确定融合参数为高精度的融合参数。In a possible design, the determining unit is configured to determine that the fusion parameter is a binarized fusion parameter when the system accuracy information is less than or equal to the first threshold. The determining unit is further configured to determine that the fusion parameter is a high-precision fusion parameter when the system accuracy information is greater than or equal to the second threshold.
在一种可能的设计中,发送单元,用于在迭代轮数小于或等于第三阈值时,向M个子节点发送包括二值化的融合参数的第二消息。发送单元,还用于在迭代轮数大于或等于第四阈值时,向M个子节点发送包括高精度的融合参数的第二消息。In a possible design, the sending unit is configured to send the second message including the binarized fusion parameter to the M sub-nodes when the number of iteration rounds is less than or equal to the third threshold. The sending unit is further configured to send a second message including a high-precision fusion parameter to the M sub-nodes when the number of iteration rounds is greater than or equal to the fourth threshold.
在一种可能的设计中,发送单元,具体用于通过广播,向M个子节点发送第二消息。In a possible design, the sending unit is specifically configured to send the second message to the M sub-nodes through broadcasting.
第五方面,提供一种子节点,该子节点可以包括一个或多个处理器和一个或多个存储器。一个或多个存储器与一个或多个处理器耦合,一个或多个存储器存储有计算机指令。当一个或多个处理器执行计算机指令时,使得子节点执行如第一方面及其可能的设计中任一项所述的机器学习方法。In a fifth aspect, a child node is provided, the child node may include one or more processors and one or more memories. One or more memories are coupled to the one or more processors, and the one or more memories store computer instructions. When one or more processors execute the computer instructions, the child nodes are caused to perform the machine learning method of any of the first aspect and possible designs thereof.
第六方面,提供一种中心节点,该中心节点可以包括一个或多个处理器和一个或多个存储器。一个或多个存储器与一个或多个处理器耦合,一个或多个存储器存储有计算机指令。当一个或多个处理器执行计算机指令时,使得中心节点执行如第二方面及其可能的设计中任一项所述的机器学习方法。In a sixth aspect, a central node is provided, and the central node may include one or more processors and one or more memories. One or more memories are coupled to the one or more processors, and the one or more memories store computer instructions. When the one or more processors execute the computer instructions, the central node is caused to perform the machine learning method of any one of the second aspect and possible designs thereof.
第七方面,提供一种机器学习系统,机器学习系统包括一个或多个第五方面提供的 子节点,以及一个或多个如第六方面提供的中心节点。In a seventh aspect, a machine learning system is provided, the machine learning system includes one or more sub-nodes provided in the fifth aspect, and one or more central nodes as provided in the sixth aspect.
第八方面，提供一种芯片系统，芯片系统包括接口电路和处理器；接口电路和处理器通过线路互联；接口电路用于从存储器接收信号，并向处理器发送信号，信号包括存储器中存储的计算机指令；当处理器执行计算机指令时，芯片系统执行如上述第一方面以及各种可能的设计中任一种所述的机器学习方法，或者，执行如上述第二方面以及各种可能的设计中任一种所述的机器学习方法。In an eighth aspect, a chip system is provided. The chip system includes an interface circuit and a processor that are interconnected through a line. The interface circuit is configured to receive a signal from a memory and send the signal to the processor, where the signal includes computer instructions stored in the memory. When the processor executes the computer instructions, the chip system performs the machine learning method described in any one of the first aspect and its possible designs, or performs the machine learning method described in any one of the second aspect and its possible designs.
第九方面，提供一种计算机可读存储介质，计算机可读存储介质包括计算机指令，当计算机指令运行时，执行如上述第一方面以及各种可能的设计中任一种所述的机器学习方法，或者，执行如上述第二方面以及各种可能的设计中任一种所述的机器学习方法。In a ninth aspect, a computer-readable storage medium is provided. The computer-readable storage medium includes computer instructions, and when the computer instructions are run, the machine learning method described in any one of the first aspect and its possible designs is performed, or the machine learning method described in any one of the second aspect and its possible designs is performed.
第十方面，提供一种计算机程序产品，计算机程序产品中包括指令，当计算机程序产品在计算机上运行时，使得计算机可以根据指令执行如上述第一方面以及各种可能的设计中任一种所述的机器学习方法，或者，执行如上述第二方面以及各种可能的设计中任一种所述的机器学习方法。In a tenth aspect, a computer program product is provided. The computer program product includes instructions, and when the computer program product runs on a computer, the computer is enabled to perform, according to the instructions, the machine learning method described in any one of the first aspect and its possible designs, or the machine learning method described in any one of the second aspect and its possible designs.
应当理解的是，上述第三方面，第四方面，第五方面，第六方面，第七方面，第八方面，第九方面以及第十方面提供的技术方案，其技术特征均可对应到第一方面及其可能的设计中，或者第二方面及其可能的设计中提供的机器学习方法，因此能够达到的有益效果类似，此处不再赘述。It should be understood that the technical features of the technical solutions provided in the third to tenth aspects can all correspond to the machine learning methods provided in the first aspect and its possible designs, or in the second aspect and its possible designs; therefore, similar beneficial effects can be achieved, and details are not repeated here.
附图说明Description of drawings
图1为一种机器学习在通信过程中的实现示意图；FIG. 1 is a schematic diagram of an implementation of machine learning in a communication process;
图2为一种FL架构的工作示意图；FIG. 2 is a schematic working diagram of an FL architecture;
图3为一种BNN与基于高精度参数的普通神经网络的对比示意图；FIG. 3 is a schematic diagram of a comparison between a BNN and an ordinary neural network based on high-precision parameters;
图4为本申请实施例提供的一种机器学习系统的组成示意图;FIG. 4 is a schematic diagram of the composition of a machine learning system according to an embodiment of the present application;
图5为本申请实施例提供的又一种机器学习系统的组成示意图;FIG. 5 is a schematic diagram of the composition of another machine learning system provided by an embodiment of the present application;
图6为本申请实施例提供的一种机器学习系统的工作逻辑示意图;FIG. 6 is a schematic working logic diagram of a machine learning system provided by an embodiment of the present application;
图7为本申请实施例提供的又一种机器学习系统的工作逻辑示意图;FIG. 7 is a schematic working logic diagram of another machine learning system provided by an embodiment of the present application;
图8为本申请实施例提供的又一种机器学习系统的工作逻辑示意图;FIG. 8 is a schematic working logic diagram of another machine learning system provided by an embodiment of the present application;
图9为本申请实施例提供的一种机器学习方法的逻辑示意图;FIG. 9 is a schematic logical diagram of a machine learning method provided by an embodiment of the present application;
图10为本申请实施例提供的一种仿真结果的对比示意图;10 is a schematic diagram of a comparison of simulation results provided by an embodiment of the present application;
图11为本申请实施例提供的又一种仿真结果的对比示意图;11 is a schematic diagram of a comparison of another simulation result provided by an embodiment of the present application;
图12为本申请实施例提供的又一种仿真结果的对比示意图;FIG. 12 is a schematic diagram of a comparison of another simulation result provided by an embodiment of the present application;
图13为本申请实施例提供的一种机器学习装置的组成示意图;FIG. 13 is a schematic diagram of the composition of a machine learning apparatus provided by an embodiment of the present application;
图14为本申请实施例提供的又一种机器学习装置的组成示意图;FIG. 14 is a schematic diagram of the composition of another machine learning apparatus provided by an embodiment of the present application;
图15为本申请实施例提供的一种子节点的组成示意图;FIG. 15 is a schematic diagram of the composition of a child node according to an embodiment of the present application;
图16为本申请实施例提供的一种芯片系统的组成示意图;FIG. 16 is a schematic diagram of the composition of a chip system provided by an embodiment of the present application;
图17为本申请实施例提供的一种中心节点的组成示意图;FIG. 17 is a schematic diagram of the composition of a central node according to an embodiment of the present application;
图18为本申请实施例提供的又一种芯片系统的组成示意图。FIG. 18 is a schematic diagram of the composition of another chip system provided by an embodiment of the present application.
具体实施方式Detailed Description of Embodiments
在本申请实施例中,“示例性的”或者“例如”等词用于表示作例子、例证或说明。本申请实施例中被描述为“示例性的”或者“例如”的任何实施例或设计方案不应被解释为比其它实施例或设计方案更优选或更具优势。确切而言,使用“示例性的”或者“例 如”等词旨在以具体方式呈现相关概念。In the embodiments of the present application, words such as "exemplary" or "for example" are used to represent examples, illustrations or illustrations. Any embodiments or designs described in the embodiments of the present application as "exemplary" or "such as" should not be construed as preferred or advantageous over other embodiments or designs. Rather, the use of words such as "exemplary" or "such as" is intended to present the related concepts in a specific manner.
在本申请的实施例中,术语“第一”、“第二”仅用于描述目的,而不能理解为指示或暗示相对重要性或者隐含指明所指示的技术特征的数量。由此,限定有“第一”、“第二”的特征可以明示或者隐含地包括一个或者更多个该特征。在本申请的描述中,除非另有说明,“多个”的含义是两个或两个以上。In the embodiments of the present application, the terms "first" and "second" are only used for description purposes, and cannot be understood as indicating or implying relative importance or implying the number of indicated technical features. Thus, a feature defined as "first" or "second" may expressly or implicitly include one or more of that feature. In the description of this application, unless stated otherwise, "plurality" means two or more.
本申请中术语“至少一个”的含义是指一个或多个,本申请中术语“多个”的含义是指两个或两个以上,例如,多个第二报文是指两个或两个以上的第二报文。In this application, the meaning of the term "at least one" refers to one or more, and the meaning of the term "plurality" in this application refers to two or more. For example, a plurality of second messages refers to two or more more than one second message.
应理解，在本文中对各种所述示例的描述中所使用的术语只是为了描述特定示例，而并非旨在进行限制。如在对各种所述示例的描述和所附权利要求书中所使用的那样，单数形式“一个（“a”，“an”）”和“该”旨在也包括复数形式，除非上下文另外明确地指示。It is to be understood that the terminology used in describing the various described examples herein is for the purpose of describing particular examples only and is not intended to be limiting. As used in the description of the various described examples and the appended claims, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise.
还应理解，本文中所使用的术语“和/或”是指并且涵盖相关联的所列出的项目中的一个或多个项目的任何和全部可能的组合。术语“和/或”，是一种描述关联对象的关联关系，表示可以存在三种关系，例如，A和/或B，可以表示：单独存在A，同时存在A和B，单独存在B这三种情况。另外，本申请中的字符“/”，一般表示前后关联对象是一种“或”的关系。It will also be understood that the term "and/or" as used herein refers to and encompasses any and all possible combinations of one or more of the associated listed items. The term "and/or" describes an association relationship between associated objects and indicates that three relationships may exist; for example, A and/or B can represent three cases: A exists alone, both A and B exist, and B exists alone. In addition, the character "/" in this application generally indicates an "or" relationship between the associated objects.
还应理解，在本申请的各个实施例中，各个过程的序号的大小并不意味着执行顺序的先后，各过程的执行顺序应以其功能和内在逻辑确定，而不应对本申请实施例的实施过程构成任何限定。It should also be understood that, in the embodiments of the present application, the sequence numbers of the processes do not imply an execution order; the execution order of the processes should be determined by their functions and internal logic, and shall not constitute any limitation on the implementation of the embodiments of the present application.
应理解,根据A确定B并不意味着仅仅根据A确定B,还可以根据A和/或其它信息确定B。It should be understood that determining B according to A does not mean that B is only determined according to A, and B may also be determined according to A and/or other information.
还应理解,术语“包括”(也称“includes”、“including”、“comprises”和/或“comprising”)当在本说明书中使用时指定存在所陈述的特征、整数、步骤、操作、元素、和/或部件,但是并不排除存在或添加一个或多个其他特征、整数、步骤、操作、元素、部件、和/或其分组。It will also be understood that the term "includes" (also referred to as "includes", "including", "comprises" and/or "comprising") when used in this specification designates the presence of stated features, integers, steps, operations, elements , and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groupings thereof.
还应理解,术语“如果”可被解释为意指“当...时”(“when”或“upon”)或“响应于确定”或“响应于检测到”。类似地,根据上下文,短语“如果确定...”或“如果检测到[所陈述的条件或事件]”可被解释为意指“在确定...时”或“响应于确定...”或“在检测到[所陈述的条件或事件]时”或“响应于检测到[所陈述的条件或事件]”。It should also be understood that the term "if" may be interpreted to mean "when" or "upon" or "in response to determining" or "in response to detecting." Similarly, depending on the context, the phrases "if it is determined..." or "if a [statement or event] is detected" can be interpreted to mean "when determining..." or "in response to determining... ” or “on detection of [recited condition or event]” or “in response to detection of [recited condition or event]”.
应理解，说明书通篇中提到的“一个实施例”、“一实施例”、“一种可能的实现方式”意味着与实施例或实现方式有关的特定特征、结构或特性包括在本申请的至少一个实施例中。因此，在整个说明书各处出现的“在一个实施例中”或“在一实施例中”、“一种可能的实现方式”未必一定指相同的实施例。此外，这些特定的特征、结构或特性可以任意适合的方式结合在一个或多个实施例中。It should be understood that references throughout the specification to "one embodiment", "an embodiment", and "one possible implementation" mean that a particular feature, structure, or characteristic related to the embodiment or implementation is included in at least one embodiment of the present application. Thus, appearances of "in one embodiment", "in an embodiment", or "one possible implementation" in various places throughout this specification do not necessarily refer to the same embodiment. Furthermore, the particular features, structures, or characteristics may be combined in any suitable manner in one or more embodiments.
还应理解，本申请实施例中提到的“连接”，可以是直接连接，也可以是间接连接，可以是有线连接，也可以是无线连接，也就是说，本申请实施例对设备之间的连接方式不作限定。It should also be understood that the "connection" mentioned in the embodiments of the present application may be a direct connection or an indirect connection, and may be a wired connection or a wireless connection; that is, the connection manner between devices is not limited in the embodiments of the present application.
以下对本申请实施例提供的方案进行详细说明。The solutions provided in the embodiments of the present application will be described in detail below.
由于基于神经网络的机器学习与通信系统的耦合可以显著提升通信过程的灵活性,因此,神经网络在通信过程中的使用机制对于通信效果影响显著。Since the coupling of neural network-based machine learning and communication systems can significantly improve the flexibility of the communication process, the use mechanism of neural networks in the communication process has a significant impact on the communication effect.
目前的方案中,本地节点可以采集相关数据,并将这些数据分别上传给计算节点, 以便计算节点基于这些数据进行学习,由此获取对应的训练模型。计算节点可以将训练模型下发给各个本地节点,以使得各个本地节点可以根据训练模型,对其在通信过程中的工作进行预测指导。In the current solution, the local node can collect relevant data, and upload the data to the computing node respectively, so that the computing node can learn based on the data, thereby obtaining the corresponding training model. The computing node can issue the training model to each local node, so that each local node can predict and guide its work in the communication process according to the training model.
示例性的,结合图1,为一种机器学习在通信过程中的实现示例。其中,以3个本地节点通过计算节点进行机器学习为例。如图1所示,本地节点1可以将采集获取的数据组成的数据集1上传给计算节点。类似的,本地节点2可以将采集获取的数据组成的数据集2上传给计算节点。本地节点3可以将采集获取的数据组成的数据集3上传给计算节点。计算节点可以根据这些数据集(如数据集1-数据集3)进行机器学习。比如,计算节点中可以预设有基础神经网络模型,根据数据集1-数据集3对基础神经网络模型进行迭代学习,优化基础神经网络模型的各个模型参数(如权重和偏置等),获取迭代收敛的模型参数,由此完成一轮机器学习。此后,计算节点可以将迭代收敛的模型参数下发给各个本地节点。比如,将模型参数发送给本地节点1,本地节点2以及本地节点3。需要说明的是,在本示例中,本地节点也可预设有与计算节点中基础神经网络模型相同类型的模型。本地节点可以根据接收到的模型参数,对本地维护的模型进行更新,由此获取经过机器学习后的训练模型。Exemplarily, with reference to FIG. 1 , it is an implementation example of machine learning in a communication process. Among them, three local nodes perform machine learning through computing nodes as an example. As shown in FIG. 1 , the local node 1 can upload the data set 1 composed of the collected data to the computing node. Similarly, the local node 2 can upload the data set 2 composed of the collected data to the computing node. The local node 3 can upload the data set 3 composed of the collected data to the computing node. Compute nodes can perform machine learning based on these datasets (eg dataset 1 - dataset 3). For example, a basic neural network model can be preset in the computing node, and the basic neural network model can be iteratively learned according to data set 1 to data set 3, and various model parameters (such as weight and bias, etc.) of the basic neural network model are optimized to obtain Iterate over the converged model parameters, thereby completing a round of machine learning. After that, the computing node can deliver the iteratively converged model parameters to each local node. For example, send model parameters to local node 1, local node 2 and local node 3. It should be noted that, in this example, the local node may also be preset with a model of the same type as the basic neural network model in the computing node. The local node can update the locally maintained model according to the received model parameters, thereby obtaining the training model after machine learning.
这样,本地节点就可以根据该训练模型,对其工作进行预测指导。比如,在涉及边缘计算的过程中,车联网,自动驾驶,以及对用户输入习惯的预测等场景下,都可以根据上述方案,采用对应的训练模型对相应的参数进行判断预测,由此大幅提升本地节点的工作性能。In this way, the local node can predict and guide its work based on the trained model. For example, in the process involving edge computing, in scenarios such as Internet of Vehicles, automatic driving, and prediction of user input habits, the corresponding training model can be used to judge and predict the corresponding parameters according to the above scheme, thereby greatly improving The working performance of the local node.
可以看到，如图1所示的方案中，各个本地节点都需要将其采集的数据集分别发送给计算节点。为了使得机器学习的结果足够精确，一般而言，计算节点所需要收集的数据集的数据量是非常大的，这就使得本地节点在向计算节点传输数据集的过程中，会对二者之间的通信链路造成很大的负担。另外，由于数据集是直接被发送给计算节点的，因此，在数据集中包括用户的一些隐私信息时，就会导致隐私信息被直接暴露在通信链路以及计算节点中，由此造成信息隐私性的隐患。It can be seen that, in the solution shown in FIG. 1, each local node needs to send its collected data set to the computing node. To make the machine learning result sufficiently accurate, the amount of data that the computing node needs to collect is generally very large, so transmitting the data sets from the local nodes to the computing node places a heavy burden on the communication links between them. In addition, since the data sets are sent directly to the computing node, any private user information contained in a data set is directly exposed on the communication link and at the computing node, which creates an information privacy risk.
为了解决上述问题,目前,可以采用具有分布式架构的联邦学习(Federated Learning,FL)架构进行机器学习和通信的结耦,降低数据传输量,同时对信息隐私起到适当的保护。In order to solve the above problems, at present, a Federated Learning (FL) architecture with a distributed architecture can be used to couple machine learning and communication, reduce the amount of data transmission, and at the same time properly protect information privacy.
在FL架构中,可以设置有中心节点(或称为中心服务器)以及子节点。其中,根据任务和数据分布不同,子节点数可以为几个到几千个不等。对于每一个子节点,本地都可以预设有神经网络模型。各个子节点中的神经网络模型相同。子节点可以获取对应的数据集在神经网络模型中进行学习。在学习过程中,每个子节点都进行若干次迭代(通常为遍历数据集中所有数据各一次),然后将具有收敛的模型参数的本地模型上传到中心节点。中心节点会将所有发来的本地模型按照各子节点的数据量比例进行加权求均值(该过程也可称为训练模型的融合),由此获取融合后的训练模型。接着,中心节点可以把得到的训练模型下发给所有子节点,以便子节点可以根据融合后的训练模型继续根据新的数据集进行训练,或者,直接用于进行相关场景的计算以及预测。需要说明都是,在不同的实现中,子节点与中心节点之间的训练模型的传输,可以是直接传输训练模型的所有数据,也可以是只对训练模型的参数进行传输即可。In the FL architecture, a central node (or called a central server) and sub-nodes can be provided. Among them, the number of child nodes can range from several to several thousand according to different tasks and data distribution. For each child node, a neural network model can be preset locally. The neural network model in each child node is the same. The child nodes can obtain the corresponding dataset to learn in the neural network model. During the learning process, each child node performs several iterations (usually once for all data in the dataset), and then uploads the local model with converged model parameters to the central node. The central node will weight and average all the sent local models according to the proportion of the data volume of each child node (this process may also be called fusion of training models), thereby obtaining the fusion training model. Then, the central node can send the obtained training model to all sub-nodes, so that the sub-nodes can continue to train according to the new data set according to the fused training model, or directly use it for calculation and prediction of related scenarios. It should be noted that, in different implementations, the transmission of the training model between the child nodes and the central node may be to directly transmit all the data of the training model, or it may be to transmit only the parameters of the training model.
示例性的,结合图2,为一种FL架构的工作示意。其中,以该架构中包括3个子节点(如本地节点1,本地节点2以及本地节点3),1个中心节点为例。对于本地节点1, 可以采集相关数据,以获取数据集1。在本地节点1中可以存储有与数据集1对应的本地训练模型。本地节点1可以将数据集1输入到本地训练模型中,进行本地训练,由此就可以获取收敛后的本地训练模型参数1(如标识为W1)。类似的,在其他本地节点中,也可进行如上述本地节点1类似的处理,并获取对应的本地训练模型参数2(如标识为W2)和本地训练模型参数3(如标识为W3)。可以理解的是,由于本地训练模型参数的学习获取,与输入的数据集强相关。比如,在输入的数据集不同时,得到的本地训练模型参数也可能不同。3个本地节点可以分别将获取的本地训练模型参数(如W1-W3)发送给中心节点。中心节点可以对获取的W1-W3进行融合,进而得到融合后的训练模型参数(如W0)。中心节点可以将该融合后的训练模型参数分别下发给本地节点1、本地节点2以及本地节点3。各个本地节点就可以根据接收到的W0,对本地训练模型进行更新。Exemplarily, with reference to FIG. 2 , it is a working schematic diagram of a FL architecture. Wherein, the architecture includes three sub-nodes (such as local node 1, local node 2 and local node 3) and one central node as an example. For local node 1, relevant data can be collected to obtain data set 1. The local training model corresponding to the data set 1 may be stored in the local node 1 . The local node 1 can input the data set 1 into the local training model, and perform local training, thereby obtaining the converged local training model parameter 1 (eg, marked as W1). Similarly, in other local nodes, processing similar to the above local node 1 can also be performed, and corresponding local training model parameters 2 (eg, identified as W2) and local training model parameters 3 (eg, identified as W3) can be obtained. Understandably, due to the learned acquisition of locally trained model parameters, there is a strong correlation with the input dataset. For example, when the input data sets are different, the obtained local training model parameters may also be different. The three local nodes can respectively send the acquired local training model parameters (such as W1-W3) to the central node. The central node can fuse the obtained W1-W3, and then obtain the fused training model parameters (such as W0). The central node can deliver the fused training model parameters to the local node 1, the local node 2 and the local node 3 respectively. Each local node can update the local training model according to the received W0.
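To make the fusion step above concrete, the following is a minimal sketch of weighted parameter fusion at the central node, assuming each local node i reports its trained parameters together with its data-set size; the function and variable names (fuse_local_parameters, data_sizes) are illustrative and not taken from the application.

```python
# Minimal sketch of weighted fusion of local training model parameters,
# assuming weights proportional to each node's data volume.
import numpy as np

def fuse_local_parameters(local_params, data_sizes):
    """Weighted average of local model parameters, weighted by data-set size."""
    total = sum(data_sizes)
    weights = [n / total for n in data_sizes]
    # Element-wise weighted sum over all local parameter vectors.
    return sum(w * p for w, p in zip(weights, local_params))

# Example: three local nodes (W1-W3) with different amounts of training data.
W1, W2, W3 = np.array([0.2, -0.5]), np.array([0.1, -0.4]), np.array([0.3, -0.6])
W0 = fuse_local_parameters([W1, W2, W3], data_sizes=[100, 200, 100])
```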
可以看到,在FL架构中,数据集在本地就被处理,而不需要被发送给中心节点,这样就可以保证数据集的信息安全得以保证。同时,由于训练模型或者训练模型参数的数据量显著小于数据集本身的数据量,因此,通过传输训练模型,能够有效地降低子节点与中心节点之间的数据传输压力。It can be seen that in the FL architecture, the data set is processed locally without being sent to the central node, so that the information security of the data set can be guaranteed. At the same time, since the data volume of the training model or the training model parameters is significantly smaller than the data volume of the data set itself, the data transmission pressure between the child nodes and the central node can be effectively reduced by transmitting the training model.
但是，随着通信技术的发展，FL架构下的训练效率和数据传输量依然不能满足所有场景下的需求。以子节点为手机，中心节点为基站为例。手机和基站之间随时都进行着信息的交互。在FL架构下，虽然只需传输学习训练模型或者训练模型参数，但是，由于一个参数往往需要16比特（bit）或更高的传输带宽，而一组训练模型参数包括多个参数，因此，对于传输带宽的要求依然很高。进一步的，手机在进行训练模型（或训练模型参数）的传输过程中，不会继续进行本地训练，因此，由于训练模型（或训练模型参数）传输时间长就会导致整个FL架构的训练效率较低。However, with the development of communication technologies, the training efficiency and the data transmission volume under the FL architecture still cannot meet the requirements of all scenarios. Take the child node being a mobile phone and the central node being a base station as an example. The mobile phone and the base station exchange information at all times. Under the FL architecture, although only the training model or the training model parameters need to be transmitted, one parameter often requires 16 bits or more of transmission bandwidth, and a set of training model parameters includes many parameters, so the transmission bandwidth requirement is still high. Further, the mobile phone does not continue local training while the training model (or the training model parameters) is being transmitted; therefore, the long transmission time of the training model (or the training model parameters) leads to relatively low training efficiency of the entire FL architecture.
为了解决上述问题,本申请实施例提供的方案,能够结合二值化的数据处理方案,以及分布式的神经网络学习方案,达到在提升机器学习效率的同时,降低对于数据传输压力的效果。In order to solve the above problems, the solutions provided by the embodiments of the present application can combine a binarized data processing solution and a distributed neural network learning solution to achieve the effect of reducing the pressure on data transmission while improving the efficiency of machine learning.
首先,对本申请所涉及的二值化的数据处理方案进行说明。需要说明的是,本申请中,应用二值化的数据处理方案的神经网络也可称为二值化神经网络(Binary Neural Network,BNN)。First, the binarized data processing scheme according to the present application will be described. It should be noted that, in this application, the neural network applying the binarized data processing scheme may also be referred to as a binarized neural network (Binary Neural Network, BNN).
需要说明的是,数据在进行传输之前(如在子节点和中心节点之间的传输之前),发出数据的节点需要将数据进行量化才能传输。比如,以子节点将数据发送给中心节点为例。子节点可以将需要发送的数据量化为由0或1组成的序列,然后通过上行数据传输通道传输该序列。可以理解的是,在数据量化后,其对应的数据与量化前的数据可能不完全对等。量化的过程中,一个需要传输的数据(如称为全精度的数据)对应到量化后的序列位宽越宽,则量化后获取的参数的精度越高。在本申请实施例中,可以将量化位宽较宽的量化数据称为高精度数据,或者高精度参数。在一些实现中,高精度数据或者高精度参数可以是指序列位宽大于或等于32比特的数据。It should be noted that, before the data is transmitted (for example, before the transmission between the child node and the central node), the node that sends the data needs to quantify the data before transmission. For example, take the child node sending data to the central node as an example. The child node can quantify the data to be sent into a sequence consisting of 0 or 1, and then transmit the sequence through the uplink data transmission channel. It can be understood that after the data is quantized, the corresponding data may not be completely equivalent to the data before the quantization. In the process of quantization, the wider the bit width of a sequence after quantization corresponds to a data to be transmitted (such as data called full precision), the higher the precision of the parameters obtained after quantization. In this embodiment of the present application, quantized data with a wider quantization bit width may be referred to as high-precision data, or high-precision parameters. In some implementations, high-precision data or high-precision parameters may refer to data with a sequence bit width greater than or equal to 32 bits.
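As an illustration of the quantization described above, the short sketch below maps a full-precision value to a fixed-width sequence of 0s and 1s before transmission; the uniform quantizer, the value range, and the chosen bit widths are assumptions used only to show that a wider sequence gives higher precision.

```python
# Toy illustration: quantize a full-precision value into a 0/1 bit sequence.
def quantize_to_bits(value, bits=16, lo=-1.0, hi=1.0):
    """Uniformly quantize `value` in [lo, hi] and return its bit string."""
    levels = (1 << bits) - 1
    clipped = min(max(value, lo), hi)
    code = round((clipped - lo) / (hi - lo) * levels)
    return format(code, f"0{bits}b")

print(quantize_to_bits(0.37))          # 16-bit sequence: finer quantization
print(quantize_to_bits(0.37, bits=4))  # coarser 4-bit sequence: lower precision
```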
不同于基于高精度参数的神经网络,在BNN中,神经网络的参数由+1和-1的二值化参数组成。在各个子节点中,进行神经网络的学习(或称为训练)过程中,用二值化的参数标识训练模型的参数。因此,在对BNN进行学习时,相较于基于未进行二值化处理的高精度参数的模型学习,能够达到有效减少神经网络用于推演时的计算量、使得学习过程加快收敛的效果。同时,BNN也可以减少储存神经网络参数所需的储存量、进而减少发送整个神经网络所需的通信量。Unlike neural networks based on high-precision parameters, in BNN, the parameters of the neural network consist of binarized parameters of +1 and -1. In each child node, during the learning (or training) process of the neural network, the parameters of the training model are identified by the binarized parameters. Therefore, when learning BNN, compared with model learning based on high-precision parameters without binarization, it can effectively reduce the amount of calculation when the neural network is used for deduction, and accelerate the convergence of the learning process. At the same time, BNN can also reduce the amount of storage required to store the parameters of the neural network, thereby reducing the amount of communication required to send the entire neural network.
示例性的,图3为一种BNN与基于高精度参数的普通神经网络的对比示意。如图3所示,在普通神经网络中,在进行3次迭代计算的情况下,分别对应的高精度参数可以为W1、W2以及W3。将该过程对应到BNN中,那么,在进行3次迭代计算的情况下,分别对应的二值化参数可以为Wb1、Wb2以及Wb3。对于同样的迭代计算过程,以W1和Wb1为例。W1和Wb1的对应关系可以为,W1通过二值化转换,即可得到对应的Wb1。比如,该二值化转换可以为:对于W1对应计算矩阵中的任意一个元素,如果该元素大于0,则对应Wb1矩阵中对应位置的元素记为+1。对应的,如果该元素小于0,则对应Wb1矩阵中对应位置的元素记为-1。而对于Wb1,可以通过梯度累积获取对应的W1。可以理解的是,在一个典型的BNN学习过程中,可以使用二值化参数进行前向计算和梯度计算,并在对应的高精度参数上累积梯度。当高精度参数上累积了足够大的梯度时,二值化参数就会发生跳变。BNN通过迭代进行多次以上过程,逐步更新参数,并在足够多次迭代后最终收敛。因此,在使用一个学习好的BNN时,只需要使用二值化参数进行推演,最终的输出即为BNN的推演结果。Exemplarily, FIG. 3 is a schematic diagram of a comparison between a BNN and an ordinary neural network based on high-precision parameters. As shown in Fig. 3, in a common neural network, in the case of performing three iterations of calculation, the corresponding high-precision parameters can be W1, W2, and W3 respectively. Corresponding this process to BNN, then, in the case of performing three iterations of calculation, the corresponding binarization parameters can be Wb1, Wb2, and Wb3 respectively. For the same iterative calculation process, take W1 and Wb1 as examples. The corresponding relationship between W1 and Wb1 can be as follows: W1 can be converted into corresponding Wb1 through binarization conversion. For example, the binarization conversion may be: for any element in the calculation matrix corresponding to W1, if the element is greater than 0, the element corresponding to the corresponding position in the Wb1 matrix is denoted as +1. Correspondingly, if the element is less than 0, the element corresponding to the corresponding position in the Wb1 matrix is marked as -1. For Wb1, the corresponding W1 can be obtained through gradient accumulation. It can be understood that in a typical BNN learning process, the binarized parameters can be used for forward calculation and gradient calculation, and the gradients are accumulated on the corresponding high-precision parameters. When a sufficiently large gradient is accumulated on the high-precision parameters, the binarization parameters jump. BNN performs the above process multiple times through iteration, gradually updating the parameters, and finally converges after enough iterations. Therefore, when using a learned BNN, it is only necessary to use the binarized parameters for deduction, and the final output is the deduction result of the BNN.
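The binarization conversion and the sign-flip behaviour described above can be sketched as follows; the helper name binarize and the concrete numbers are illustrative only.

```python
# Sketch of the binarization rule (element > 0 -> +1, element < 0 -> -1) and of the
# sign flip that occurs when gradients accumulated on the high-precision parameter
# change its sign.
import numpy as np

def binarize(w):
    """Map every element of a high-precision parameter matrix to +1 or -1 by its sign."""
    return np.where(w > 0, 1.0, -1.0)

W1 = np.array([[0.3, -0.7], [0.1, -0.2]])    # high-precision parameters
Wb1 = binarize(W1)                           # [[+1, -1], [+1, -1]]

# Accumulate a gradient on W1; if an entry crosses zero, its binary value flips.
eta = 0.5
grad = np.array([[1.0, 0.0], [0.0, -0.1]])
W1 -= eta * grad                             # entry (0, 0) becomes -0.2
Wb1 = binarize(W1)                           # its binary value flips from +1 to -1
```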
以下结合实际场景，对BNN的学习进行举例说明。以图像分类问题和随机梯度下降方法为例。记bs为每次学习的批数据大小，二值化参数为$W_i^b$，高精度参数为$W_i$。其中，下角标i表示的是用户index。在进行学习时，每次以不放回的方式从数据集$(x_1,y_1),(x_2,y_2),\ldots,(x_N,y_N)$中抽取bs个数据，记为$(x'_1,y'_1),(x'_2,y'_2),\ldots,(x'_{bs},y'_{bs})$，N为数据集中的数据数量。然后，子节点可以计算

$$loss = \frac{1}{bs}\sum_{j=1}^{bs} \mathrm{lossfunc}\big(L(W_i^b, x'_j),\, y'_j\big)$$

在该loss计算公式中，L表示使用的神经网络的结构，$W_i^b$为当前的二值化参数，lossfunc(·,·)为神经网络的损失函数，loss为最后的损失值。在计算出loss后，子节点进行反向传播算出二值化参数的梯度值，即

$$grad = \frac{\partial\, loss}{\partial W_i^b}$$

并将梯度累积到高精度参数上，$W_i \leftarrow W_i - \eta\, grad$，其中η为学习率，该学习率可以是在进行计算之前预先设置的。最后，如果某个高精度参数在本次迭代中符号改变，则对应的二值化参数符号也翻转（比如，从+1翻转为-1）。在每次上述迭代结束后，从余下的数据集中继续抽取bs个数据（不足则全部抽取），重复上述迭代。直到全部数据均已被抽取过，无法继续抽取数据时，一轮学习结束。在二值神经网络学习过程中，需要多次重复这一过程，直到方法收敛。在本示例中，方法的收敛可以根据loss的计算结果与前1次或前几次的计算结果相对比确定。比如，在本次loss计算后，如果本次loss的计算结果与前3次的计算结果差值均在预设的范围之内，那么认为方法收敛。
The following illustrates BNN learning with an example based on a practical scenario, taking an image classification problem and the stochastic gradient descent method as an example. Let bs denote the batch size of each learning step, $W_i^b$ the binarized parameters, and $W_i$ the high-precision parameters, where the subscript i denotes the user index. During learning, bs samples are drawn each time without replacement from the data set $(x_1,y_1),(x_2,y_2),\ldots,(x_N,y_N)$, denoted $(x'_1,y'_1),(x'_2,y'_2),\ldots,(x'_{bs},y'_{bs})$, where N is the number of samples in the data set. The child node can then compute

$$loss = \frac{1}{bs}\sum_{j=1}^{bs} \mathrm{lossfunc}\big(L(W_i^b, x'_j),\, y'_j\big)$$

In this loss formula, L denotes the structure of the neural network used, $W_i^b$ denotes the current binarized parameters, lossfunc(·,·) is the loss function of the neural network, and loss is the resulting loss value. After the loss is computed, the child node back-propagates to obtain the gradient with respect to the binarized parameters, that is,

$$grad = \frac{\partial\, loss}{\partial W_i^b}$$

and accumulates the gradient on the high-precision parameters, $W_i \leftarrow W_i - \eta\, grad$, where η is the learning rate, which may be preset before the computation. Finally, if the sign of a high-precision parameter changes in this iteration, the sign of the corresponding binarized parameter is also flipped (for example, from +1 to -1). After each such iteration, another bs samples are drawn from the remaining data (all remaining samples if fewer than bs are left), and the iteration is repeated. When all data have been drawn and no more data can be drawn, one round of learning ends. In binary neural network learning, this process needs to be repeated many times until the method converges. In this example, convergence can be determined by comparing the current loss value with the loss values of the previous one or several computations; for example, if the differences between the current loss and each of the previous three loss values are all within a preset range, the method is considered to have converged.
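A self-contained sketch of one such local learning round is given below. The tiny one-layer model, the squared-error loss, and all variable names are assumptions chosen only to illustrate the loop structure described above (forward pass and gradient on the binarized parameters, gradient accumulation on the high-precision parameters, sign flips, and batching without replacement); it is not the implementation from the application.

```python
# Simplified sketch of one local BNN learning round on a child node.
import numpy as np

rng = np.random.default_rng(0)
N, d, bs, eta = 64, 8, 16, 0.1
X = rng.normal(size=(N, d))                       # local data set x_1..x_N
y = (X @ rng.normal(size=d) > 0).astype(float)    # labels y_1..y_N

W = rng.normal(size=d) * 0.1                      # high-precision parameters W_i
Wb = np.sign(W)                                   # binarized parameters W_i^b

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

order = rng.permutation(N)                        # draw batches without replacement
for start in range(0, N, bs):
    idx = order[start:start + bs]
    xb, yb = X[idx], y[idx]
    z = xb @ Wb                                   # forward pass with binarized parameters
    pred = sigmoid(z)
    loss = np.mean((pred - yb) ** 2)              # lossfunc(L(W_i^b, x'), y')
    # Gradient of the loss with respect to the binarized parameters.
    grad = (2.0 / len(idx)) * xb.T @ ((pred - yb) * pred * (1.0 - pred))
    W -= eta * grad                               # accumulate the gradient on W_i
    Wb = np.sign(W)                               # sign changes flip the binary parameters
```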
可以看到,BNN相较于普通的神经网络,具有快速收敛,数据传输量小的特点。但是,目前尚未有将BNN使用在基于分布式机器学习系统的方案。而如果直接将BNN应用在现有的基于分布式学习的系统(如FL架构)中,由于所有数据传输都是以二值化的形式进行的,因此会导致整个系统的学习准确度过低而无法使用的情况。It can be seen that compared with ordinary neural networks, BNN has the characteristics of fast convergence and small data transmission. However, there is currently no solution to using BNN in distributed machine learning systems. However, if BNN is directly applied to the existing distributed learning-based system (such as FL architecture), since all data transmission is carried out in the form of binarization, the learning accuracy of the entire system will be too low and Unusable situation.
在本申请实施例提供的机器学习方法中,可以将上述BNN使用在诸如FL等基于分布式机器学习系统的学习框架下,并根据本申请实施例提供的机器学习方法,使得整个FL框架下的机器学习系统的数据传输压力得到显著缓解,同时结合不同场景的需求,提升整个机器学习系统下的学习效率。需要说明的是,在本申请中,各个子节点中的本地训练都可以进行基于二值化的神经网络模型进行本地训练。在不同的使用场景下,该神经网络模型可以灵活选取,比如,该神经网络模型可以具有全连接网络、卷积神经网络等 网络结构。In the machine learning method provided by the embodiment of the present application, the above-mentioned BNN can be used in a learning framework based on a distributed machine learning system such as FL, and according to the machine learning method provided by the embodiment of the present application, the entire FL framework can be used in the machine learning method. The data transmission pressure of the machine learning system is significantly relieved, and at the same time, the learning efficiency of the entire machine learning system is improved by combining the needs of different scenarios. It should be noted that, in this application, local training in each sub-node can be performed locally based on a binarized neural network model. In different usage scenarios, the neural network model can be selected flexibly. For example, the neural network model can have network structures such as a fully connected network and a convolutional neural network.
本申请实施例提供的机器学习方法,可以应用于包括3G/4G/5G/6G,或者卫星通信等无线通信系统中。The machine learning method provided by the embodiments of the present application can be applied to wireless communication systems including 3G/4G/5G/6G, or satellite communication.
其中,无线通信系统通常由小区组成,每个小区包含一个基站(Base Station,BS),基站向多个移动台(Mobile Station,MS)提供通信服务。其中基站包含BBU(Baseband Unit,中文:基带单元)和RRU(Remote Radio Unit,中文:远端射频单元)。BBU和RRU可以放置在不同的地方,例如:RRU拉远,放置于高话务量的区域,BBU放置于中心机房。BBU和RRU也可以放置在同一机房。BBU和RRU也可以为一个机架下的不同部件。The wireless communication system is usually composed of cells, each cell includes a base station (Base Station, BS), and the base station provides communication services to multiple mobile stations (Mobile Station, MS). The base station includes BBU (Baseband Unit, Chinese: Baseband Unit) and RRU (Remote Radio Unit, Chinese: Remote Radio Unit). The BBU and RRU can be placed in different places, for example, the RRU is far away and placed in an area with high traffic volume, and the BBU is placed in the central computer room. BBU and RRU can also be placed in the same computer room. The BBU and RRU can also be different components under one rack.
需要说明的是，本发明方案提及的无线通信系统包括但不限于：窄带物联网系统（Narrow Band-Internet of Things，NB-IoT）、全球移动通信系统（Global System for Mobile Communications，GSM）、增强型数据速率GSM演进系统（Enhanced Data rate for GSM Evolution，EDGE）、宽带码分多址系统（Wideband Code Division Multiple Access，WCDMA）、码分多址2000系统（Code Division Multiple Access，CDMA2000）、时分同步码分多址系统（Time Division-Synchronization Code Division Multiple Access，TD-SCDMA），长期演进系统（Long Term Evolution，LTE）以及下一代5G移动通信系统的三大应用场景eMBB，URLLC和eMTC。It should be noted that the wireless communication systems mentioned in the solutions of the present invention include but are not limited to: the Narrow Band-Internet of Things (NB-IoT) system, the Global System for Mobile Communications (GSM), the Enhanced Data rate for GSM Evolution (EDGE) system, the Wideband Code Division Multiple Access (WCDMA) system, the Code Division Multiple Access 2000 (CDMA2000) system, the Time Division-Synchronization Code Division Multiple Access (TD-SCDMA) system, the Long Term Evolution (LTE) system, and the three major application scenarios of the next-generation 5G mobile communication system: eMBB, URLLC, and eMTC.
在本示例中,基站是一种部署在无线接入网中为MS提供无线通信功能的装置。述基站可以包括各种形式的宏基站,微基站(也称为小站),中继站,接入点等。在采用不同的无线接入技术的系统中,具备基站功能的设备的名称可能会有所不同,例如,在LTE系统中,称为演进的节点B(evolved NodeB,eNB或者eNodeB),在第三代(3rd Generation,3G)系统中,称为节点B(Node B)等。为方便描述,本申请所有实施例中,上述为MS提供无线通信功能的装置统称为网络设备或基站或BS。In this example, the base station is a device deployed in a radio access network to provide a wireless communication function for an MS. The base stations may include various forms of macro base stations, micro base stations (also called small cells), relay stations, access points, and the like. In systems using different radio access technologies, the names of devices with base station functions may be different. For example, in LTE systems, it is called an evolved NodeB (evolved NodeB, eNB or eNodeB). In the 3rd Generation (3G) system, it is called a Node B (Node B) and so on. For convenience of description, in all the embodiments of this application, the above-mentioned apparatuses for providing wireless communication functions for MSs are collectively referred to as network equipment or base stations or BSs.
本发明方案中所涉及到的MS可以包括各种具有无线通信功能的手持设备、车载设备、可穿戴设备、计算设备或连接到无线调制解调器的其它处理设备。所述MS也可以称为终端(terminal),还MS可以是用户单元(subscriber unit)、蜂窝电话(cellular phone)、智能手机(smart phone)、无线数据卡、个人数字助理(Personal Digital Assistant,PDA)电脑、平板型电脑、无线调制解调器(modem)、手持设备(handset)、膝上型电脑(laptop computer)、机器类型通信(Machine Type Communication,MTC)终端等。The MS involved in the solution of the present invention may include various handheld devices, vehicle-mounted devices, wearable devices, computing devices, or other processing devices connected to a wireless modem with wireless communication capabilities. The MS may also be referred to as a terminal (terminal), and the MS may be a subscriber unit (subscriber unit), a cellular phone (cellular phone), a smart phone (smart phone), a wireless data card, a personal digital assistant (Personal Digital Assistant, PDA) ) computer, tablet computer, wireless modem (modem), handheld device (handset), laptop computer (laptop computer), machine type communication (Machine Type Communication, MTC) terminal, etc.
请参考图4,为本申请实施例提供的一种机器学习系统的组成。如图4所示,该机器学习系统中可以包括中心节点,以及多个子节点(如子节点1-子节点N)。其中,子节点也可称为本地节点。中心节点可以与各个子节点通过有线或无线的方式进行通信。比如,中心节点可以接收各个子节点在上行传输通道中上传的本地训练结果。其中,该本地训练结果可以包括本地训练模型参数,或者本地训练模型本身。又如,中心节点可以将融合后的本地训练模型参数或者本地训练模型本身通过广播或者下行传输通道下发给各个子节点。为了便于对本申请实施例提供的方案进行说明,以下以在子节点和中心节点传输的数据对应于本地训练模型参数,以及融合训练模型参数为例。在不同的场景中,对于本地训练模型参数或者本地训练模型的传输,可以是通过传输各个参数对应的二值化参数实现的,也可以是通过直接传输各个参数对应的高精度参数实现的。需要说明的是,该本地训练模型参数可以包括参数可以为神经网络的权重、偏置等参数中第一项或多项,该本地训练模型参数也可以为它们对应的梯度。Please refer to FIG. 4 , which shows the composition of a machine learning system provided by an embodiment of the present application. As shown in FIG. 4 , the machine learning system may include a central node and multiple sub-nodes (eg, sub-node 1-sub-node N). The child node may also be called a local node. The central node can communicate with each sub-node in a wired or wireless manner. For example, the central node can receive the local training results uploaded by each sub-node in the uplink transmission channel. Wherein, the local training result may include parameters of the local training model, or the local training model itself. For another example, the central node may deliver the merged local training model parameters or the local training model itself to each sub-node through a broadcast or downlink transmission channel. In order to facilitate the description of the solutions provided by the embodiments of the present application, the following takes the data transmitted at the child node and the central node corresponding to the local training model parameters and the fusion training model parameters as an example. In different scenarios, the transmission of local training model parameters or local training model can be realized by transmitting the binarized parameters corresponding to each parameter, or by directly transmitting the high-precision parameters corresponding to each parameter. It should be noted that the parameters of the local training model may include the first item or multiple parameters of the parameters such as the weight and bias of the neural network, and the parameters of the local training model may also be their corresponding gradients.
示例性的,以中心节点为基站,与该中心节点结耦的子节点为手机为例。如图5所 示,该机器学习系统中可以包括一个基站,以及N个手机(如手机1-手机N)。N个手机与基站中可以预先存储有相同的基础训练模型。手机1可以采集对应场景的数据,由此形成对应的数据集1。手机1可以将该数据集1输入到基础训练模型中进行本地训练。可以理解的是,本申请实施例中,手机可以通过BNN进行本地训练。比如,手机1可以将该数据集1输入到基础训练模型中,按照高精度参数的机器学习方法,获取本地模型参数1。该本地模型参数1可以是高精度参数。在一些实现中,该本地模型参数1可以包括高精度的权重以及偏置。手机1可以基于高精度的权重和偏置进行反向推演,由此完成数据集1中一部分数据的学习。接着,手机1可以将对应的高精度的权重和偏置经过二值化转换,获取对应的二值化参数。在对于后续数据的学习过程中,手机1可以基于该二值化参数,进行训练学习。由于二值化参数的数据量显著的小于高精度的数据量,因此手机1可以快速地获取完成本地训练,以获取收敛后的权重和偏置。由于是基于二值化参数进行的本地训练,因此,手机1获取的权重和偏置结果可以为二值化的参数。Exemplarily, take the central node as the base station and the child node coupled with the central node as the mobile phone as an example. As shown in Figure 5, the machine learning system may include a base station and N mobile phones (such as mobile phone 1 - mobile phone N). The same basic training model may be pre-stored in the N mobile phones and base stations. The mobile phone 1 can collect data of the corresponding scene, thereby forming the corresponding data set 1 . Mobile phone 1 can input the data set 1 into the basic training model for local training. It can be understood that, in this embodiment of the present application, the mobile phone can perform local training through the BNN. For example, the mobile phone 1 can input the data set 1 into the basic training model, and obtain the local model parameter 1 according to the machine learning method of high-precision parameters. The local model parameter 1 may be a high precision parameter. In some implementations, the local model parameters 1 may include high precision weights and biases. The mobile phone 1 can perform reverse deduction based on high-precision weights and biases, thereby completing the learning of a part of the data in the data set 1. Next, the mobile phone 1 can perform binarization conversion on the corresponding high-precision weights and biases to obtain corresponding binarization parameters. In the learning process of the subsequent data, the mobile phone 1 can perform training and learning based on the binarization parameter. Since the data volume of the binarization parameters is significantly smaller than the high-precision data volume, the mobile phone 1 can quickly acquire and complete the local training to acquire the converged weights and biases. Since the local training is based on the binarized parameters, the weight and bias results obtained by the mobile phone 1 can be the binarized parameters.
类似于手机1中的处理,其他手机,如手机2-手机N也可分别进行类似的本地训练,以获得对应的二值化参数。本申请中,可以将各个手机在本地训练获取的二值化参数称为本地参数。比如,手机1经过1轮学习后获取的二值化的权重和偏置可以称为本地参数1。手机2经过1轮学习后获取的二值化的权重和偏置可以称为本地参数2。手机N经过1轮学习后获取的二值化的权重和偏置可以称为本地参数N。Similar to the processing in mobile phone 1, other mobile phones, such as mobile phone 2-mobile phone N, can also perform similar local training respectively to obtain corresponding binarization parameters. In this application, the binarized parameters obtained by local training of each mobile phone may be referred to as local parameters. For example, the binarized weights and biases obtained by mobile phone 1 after one round of learning can be called local parameters 1. The binarized weights and biases obtained by mobile phone 2 after one round of learning can be called local parameters 2. The binarized weights and biases obtained by the mobile phone N after one round of learning can be called local parameters N.
各个手机可以将其对应的本地参数分别发送给基站。基站可以将获取的N个本地参数(如本地参数1-本地参数N)进行融合,以获取归一的融合参数。接着,基站可以将该融合参数分别下发给各个手机,手机在接收到融合参数之后可以据此更新本地的基础训练模型,并进行下一轮学习或者直接用于实际场景中的数据预测。Each mobile phone can send its corresponding local parameters to the base station respectively. The base station may fuse the acquired N local parameters (eg, local parameter 1-local parameter N) to obtain a normalized fusion parameter. Then, the base station can distribute the fusion parameters to each mobile phone respectively. After receiving the fusion parameters, the mobile phone can update the local basic training model accordingly, and perform the next round of learning or directly use it for data prediction in actual scenarios.
结合图4,以下以机器学习系统中包括3个子节点,对数据在子节点与中心节点中的处理与传输进行示例性的说明。With reference to FIG. 4 , the following describes the processing and transmission of data in the sub-nodes and the central node by including three sub-nodes in the machine learning system.
请参考图6。在中心节点中可以设置有中心融合模块,在各个子节点中可以设置有学习模块和本地融合模块。Please refer to Figure 6. A central fusion module may be set in the central node, and a learning module and a local fusion module may be set in each sub-node.
以子节点1为例。子节点在执行本申请实施例提供的机器学习方法时,其中的学习模块可以用于对数据集进行本地训练,以获取二值化的本地参数。子节点1可以将该本地参数发送给中心节点。类似的,子节点2和子节点3也可以将其分别对应的本地参数发送给中心节点。中心节点中的中心融合模块,可以用于将接收到的所有本地参数进行融合,以获取融合参数。接着,中心节点可以将该融合参数分别下发给子节点1-子节点3。在子节点1中,本地融合模块可以用于根据接收到的融合参数,对本地训练模型进行更新,以获取基于融合参数的本地训练模型。类似的,在子节点2中,本地融合模块可以用于根据接收到的融合参数,对本地训练模型进行更新,以获取基于融合参数的本地训练模型。在子节点3中,本地融合模块可以用于根据接收到的融合参数,对本地训练模型进行更新,以获取基于融合参数的本地训练模型。Take child node 1 as an example. When the child node executes the machine learning method provided in the embodiment of the present application, the learning module in the child node may be used to perform local training on the data set to obtain the binarized local parameters. Subnode 1 can send the local parameter to the central node. Similarly, the child node 2 and the child node 3 may also send their corresponding local parameters to the central node. The central fusion module in the central node can be used to fuse all received local parameters to obtain fusion parameters. Next, the central node may deliver the fusion parameter to the sub-node 1 to the sub-node 3 respectively. In sub-node 1, the local fusion module can be used to update the local training model according to the received fusion parameters, so as to obtain a local training model based on the fusion parameters. Similarly, in sub-node 2, the local fusion module can be used to update the local training model according to the received fusion parameters, so as to obtain the local training model based on the fusion parameters. In sub-node 3, the local fusion module can be used to update the local training model according to the received fusion parameters, so as to obtain the local training model based on the fusion parameters.
在一些实现方式中,该本地融合模块可以是不包括在子节点中,而是独立的模块。例如,结合图7,在本地融合模块独立于各个子节点时,则中心节点可以将融合参数只发送给本地融合模块,本地融合模块可以用于对本地训练模型进行更新,并将更新后的本地训练模型分别下发给子节点1-子节点3。由此,可以降低对子节点的性能要求,同时由于中心节点只需要将融合参数发送给本地融合模块,因此可以降低中心节点的信令开销。In some implementations, the local fusion module may be an independent module that is not included in the child node. For example, referring to Fig. 7, when the local fusion module is independent of each sub-node, the central node can only send the fusion parameters to the local fusion module, and the local fusion module can be used to update the local training model, and the updated local The training model is distributed to child node 1 to child node 3 respectively. In this way, the performance requirements for the sub-nodes can be reduced, and at the same time, since the central node only needs to send the fusion parameters to the local fusion module, the signaling overhead of the central node can be reduced.
在本申请的另一些实现方式中,本地融合模块也可是设置在部分子节点中的。比如,结合图8。其中,子节点3中集成有本地融合模块,而子节点1和子节点2的本地融合模块可以独立于子节点而单独设置的。在该架构下,中心节点可以在获取融合参数后,分别将该融合参数发送给与子节点1和子节点2对应的本地融合模块,以及子节点3。这样,子节点1和子节点2对应的本地融合模块可以将基于融合参数更新的本地训练模型下发给子节点1和子节点2。而对于子节点3,可以根据接收到的融合参数,使用其中集成的本地融合模块更新本地训练模型,由此获取更新后的本地训练模型。In other implementation manners of the present application, the local fusion module may also be set in some sub-nodes. For example, in conjunction with Figure 8. Wherein, the local fusion module is integrated in the sub-node 3, and the local fusion modules of the sub-node 1 and the sub-node 2 can be set independently of the sub-nodes. Under this architecture, after acquiring the fusion parameters, the central node can send the fusion parameters to the local fusion modules corresponding to the sub-node 1 and the sub-node 2, and the sub-node 3 respectively. In this way, the local fusion modules corresponding to the sub-node 1 and the sub-node 2 can deliver the local training model updated based on the fusion parameters to the sub-node 1 and the sub-node 2. As for the sub-node 3, the local fusion module integrated therein can be used to update the local training model according to the received fusion parameters, thereby obtaining the updated local training model.
理解的是,在本示例中示出如图6、图7以及图8的及其学习系统的组成仅为一种示例,在本申请的另一些实现中,该系统中也可以包括多个独立配置的本地融合模块。比如,以在系统中配置有5个子节点(如子节点1-子节点5),以及3个本地融合模块(如本地融合模块1-本地融合模块3)为例。在一些场景下,本地融合模块1可以为子节点1和子节点2提供本地融合服务,本地融合模块2可以为子节点3和子节点4提供本地融合服务,本地融合模块3可以为子节点5提供本地融合服务。在另一些场景下,本地融合模块与子节点的对应关系也可以进行重新配置,比如,本地融合模块1可以为子节点1和子节点3提供本地融合服务,本地融合模块2可以为子节点2、子节点5提供本地融合服务,本地融合模块3可以为子节点4提供本地融合服务。当然,在一些场景下,也可以通过3个本地融合模块中的一个或部分向子节点提供本地融合服务,比如本地融合模块1可以为子节点1和子节点3提供本地融合服务,本地融合模块2可以为子节点2、子节点5以及子节点4提供本地融合服务,本地融合模块3则可以处于休眠等不工作的状态。It should be understood that the composition of the learning system shown in FIG. 6 , FIG. 7 and FIG. 8 in this example is only an example, and in other implementations of the present application, the system may also include multiple independent Configured local fusion module. For example, take the system configured with 5 sub-nodes (eg, sub-node 1-sub-node 5) and 3 local fusion modules (eg, local fusion module 1-local fusion module 3) as an example. In some scenarios, local fusion module 1 can provide local fusion services for sub-node 1 and sub-node 2, local fusion module 2 can provide local fusion services for sub-node 3 and sub-node 4, and local fusion module 3 can provide local fusion services for sub-node 5 Fusion Services. In other scenarios, the corresponding relationship between the local fusion module and the child nodes can also be reconfigured. Sub-node 5 provides local fusion service, and local fusion module 3 can provide local fusion service for sub-node 4. Of course, in some scenarios, one or part of the three local fusion modules can also provide local fusion services to sub-nodes. For example, local fusion module 1 can provide local fusion services for sub-node 1 and sub-node 3, and local fusion module 2 Sub-node 2, sub-node 5, and sub-node 4 can be provided with local fusion services, and local fusion module 3 can be in a dormant state such as sleep.
为了便于说明,以下以本地融合模块集成在子节点中为例。图9示出了本申请实施例提供的一种机器学习方法的逻辑示意。如图9所示,该方法可以包括:For the convenience of description, the following takes the integration of the local fusion module in the sub-node as an example. FIG. 9 shows a logical schematic diagram of a machine learning method provided by an embodiment of the present application. As shown in Figure 9, the method may include:
S901、子节点1进行本地学习。S901, the child node 1 performs local learning.
S902、子节点1获取本地参数。S902, the child node 1 acquires local parameters.
S903、子节点1将本地参数发送给中心节点。S903, the child node 1 sends the local parameter to the central node.
可以理解的是,结合上述说明,对于机器学习系统中的其他子节点,也可分别执行上述S901-S903,这样中心节点就可以获取N个本地参数。其中,在一些实现中,该本地参数可以为二值化的本地参数。It can be understood that, in combination with the above description, for other sub-nodes in the machine learning system, the above-mentioned S901-S903 can also be executed respectively, so that the central node can obtain N local parameters. Wherein, in some implementations, the local parameter may be a binarized local parameter.
S904、中心节点对N个本地参数进行融合。S904, the central node fuses the N local parameters.
S905、中心节点获取融合参数。S905, the central node obtains fusion parameters.
S906、中心节点将融合参数发送给子节点1。S906, the central node sends the fusion parameter to the child node 1.
S907、子节点1根据融合参数,更新本地模型参数。S907, the child node 1 updates the local model parameters according to the fusion parameters.
可以理解的是,对于机器学习系统中的其他子节点,中心节点也可对应执行上述S906,而对应的子节点也可执行上述S907,以便对其本地训练模型进行更新。It can be understood that, for other sub-nodes in the machine learning system, the central node can also execute the above S906 correspondingly, and the corresponding sub-nodes can also execute the above S907, so as to update its local training model.
可以看到,本申请实施例提供的机器学习方法,由于子节点可以将二值化的本地参数发送给中心节点,而不需要将具有较大数据量的高精度本地参数发送给中心节点,因此能够显著地降低子节点与中心节点之间的通信压力。由于传输的数据量很少,因此耗时也相应降低,由此能够提升本地训练在整个学习时长的占比,由此提高学习效率。另外,在本申请的一些实现方式中,中心节点可以通过广播的方式将融合参数下发给各个子节点,这样中心节点就可以不需一一向各个子节点发送融合参数,由此节省中心节点的信令开销。It can be seen that, in the machine learning method provided by the embodiments of the present application, since the sub-nodes can send the binarized local parameters to the central node, it is not necessary to send the high-precision local parameters with a large amount of data to the central node, so It can significantly reduce the communication pressure between the child nodes and the central node. Since the amount of transmitted data is small, the time-consuming is correspondingly reduced, which can increase the proportion of local training in the entire learning time, thereby improving the learning efficiency. In addition, in some implementations of the present application, the central node can send the fusion parameters to each sub-node by broadcasting, so that the central node does not need to send the fusion parameters to each sub-node one by one, thereby saving the central node signaling overhead.
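As a schematic illustration of steps S901 to S907, the sketch below runs one round with an in-memory exchange between objects standing in for the child nodes and the central node; the class names, the stubbed local learning, and the simple averaging used for the local update are assumptions for illustration, since the specific fusion rules depend on the mode used.

```python
# Schematic one-round message flow for S901-S907 (illustrative only).
import numpy as np

class ChildNode:
    def __init__(self, dim, rng):
        self.W = rng.normal(size=dim)            # high-precision local parameters
    def local_learning(self):                    # S901/S902: local BNN learning (stubbed)
        return np.sign(self.W)                   # binarized local parameters
    def update(self, fused):                     # S907: fuse received parameters locally
        self.W = 0.5 * (self.W + fused)          # assumed local fusion rule

class CentralNode:
    def fuse(self, local_params):                # S904/S905: fuse N local parameters
        return np.mean(local_params, axis=0)

rng = np.random.default_rng(1)
children = [ChildNode(4, rng) for _ in range(3)]
center = CentralNode()
uploads = [c.local_learning() for c in children]  # S903: first messages (binarized)
fused = center.fuse(uploads)                      # S905: fusion parameter
for c in children:                                # S906: second message to each child
    c.update(fused)                               # S907: update local model parameters
```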
It should be noted that, in order to adapt to the learning requirements of different scenarios, the embodiments of this application further provide three modes for the method shown in FIG. 9, so that the machine learning system can select the corresponding mode in scenarios with different requirements on learning efficiency and learning accuracy, thereby achieving fast convergence or high-accuracy learning.
Mode 1: binarized parameters are used when the model parameters are uploaded, and high-precision parameters are used when the central node delivers the model.
Because high-precision parameters are used for the downlink, the child nodes can update and obtain the local training model more accurately. Because binarized parameters are used for the uplink, the overall learning efficiency remains significantly higher than that of the existing FL architecture in which high-precision parameters are used in both uplink and downlink. This mode can be applied in scenarios that have requirements on both learning efficiency and learning accuracy.
Mode 2: binarized parameters are used when the model parameters are both uploaded and delivered.
Because both the uploaded and the delivered model parameters are binarized, the data transmission pressure on the system is the smallest. This mode can be applied in scenarios with high requirements on learning efficiency.
Mode 3: high-precision parameters are used when the model parameters are both uploaded and delivered.
Because high-precision parameters are used for the downlink, the child nodes can update and obtain the local training model more accurately. This mode can be applied in scenarios with high requirements on learning accuracy.
It should be understood that, with reference to FIG. 6 and FIG. 9, in each round of iteration, each child node trains a local model based on the existing model and local data and uploads it to the central server. For each child node, let W_i^b (i = 1, 2, ..., M, where M is the number of child nodes) denote all the binarized parameters of the i-th child node. After the central server receives the local parameters W_1^b, ..., W_M^b uploaded by all child nodes, it executes the central model fusion method to obtain the central-side parameters, and delivers them to all child nodes in the form of a broadcast. Upon receiving the central-side parameters, a child node immediately uses the local model fusion method to fuse them with its local high-precision parameters W_i. After fusion, the child node performs binary quantization again to obtain a new W_i^b and then starts the next round of local training. The central model fusion method and the local model fusion method differ according to the parameter transmission mode (for example, mode 1 to mode 3).
The specific calculation processes of central fusion and local fusion under each mode are described below.
1. For central fusion in mode 1:
After the central server receives the binarized parameters uploaded by all child nodes, it computes their weighted average to obtain the central-side parameter. It should be noted that when the amount of data on every node is equal, this fusion formula reduces to the simple mean of the binarized parameters over the M nodes. In this case the value of the central-side parameter itself has no direct physical meaning, but when M is known it reflects, for a given parameter position, how many nodes have a positive or a negative high-precision parameter. For example, when the number of nodes is M = 100 and the value received from the central server at some position is -0.2, this means that at this position 60 child nodes have a negative high-precision parameter and uploaded -1, while the remaining 40 child nodes have a positive high-precision parameter and uploaded +1. When the amount of data on the nodes is unequal, because the system contains many nodes, the central-side parameter can be regarded as reflecting the proportion of positive or negative accumulated gradients over the data of all nodes. Considering that the local data of node i accounts for a certain fraction of the total data, each node can be treated as if there were an equivalent total number of nodes M' with equal data amounts per node. Although the equivalent M' then differs from node to node, this does not affect the specific implementation details. To simplify the calculation and the explanation of the derivation of the fusion formula, the following description assumes that the data sets of all child nodes have the same size. It can be understood that in other implementations of this application the sizes of the child-node data sets may differ; when the sizes are inconsistent, the calculation process is similar and is not repeated here.
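By way of illustration only, a minimal sketch of this mode 1 central fusion is given below. The published fusion formula appears as an image, so the exact notation is not reproduced; the sketch assumes weights proportional to each node's data-set size, reducing to a plain mean when the sizes are equal, and includes the -0.2 worked example from the text as a check. All names are illustrative.

```python
import numpy as np

def central_fusion_mode1(binary_params, data_sizes=None):
    """Mode 1 central fusion: weighted average of the +/-1 uploads.

    binary_params: list of M arrays with entries in {-1, +1}.
    data_sizes: optional per-node data-set sizes; equal sizes assumed if None.
    """
    W = np.stack(binary_params)              # shape (M, num_params)
    if data_sizes is None:
        return W.mean(axis=0)                # equal-data case: simple mean
    w = np.asarray(data_sizes, dtype=float)
    w = w / w.sum()                          # weights n_i / n
    return np.tensordot(w, W, axes=1)

# Worked check of the example above: M = 100, 60 nodes upload -1 and 40 upload +1
uploads = [np.array([-1.0])] * 60 + [np.array([+1.0])] * 40
print(central_fusion_mode1(uploads))         # -> [-0.2]
```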
2. For local fusion in mode 1:
After receiving the central-side parameter, the local fusion module performs the local fusion calculation. Because the high-precision parameters of all child nodes are accumulated from their local data sets, they are strongly correlated; and because the number of nodes is large, it can be assumed that the high-precision parameters of all nodes are approximately samples of the same normal distribution. It is further assumed that the covariance matrix of this normal distribution is diagonal, that is, the values of parameters at different positions are uncorrelated. Therefore, each child-node parameter can be assumed to satisfy W_{i,j} ~ N(μ_j, σ_j), where W_{i,j} denotes the value of the j-th parameter of the i-th node. The problem then becomes: under the above known conditions, how to estimate the value of μ_j from W_{i,j} and the corresponding central-side value, and how to use the estimated value, taken as an estimate of the mean of W_{k,j} (k = 1, 2, ..., M) over all nodes, to replace the original W_{i,j}. Note that the local high-precision parameters of the nodes are very likely to differ, so when the estimation is carried out completely independently the results will also differ. In the analysis and solution of this problem, for brevity, all subscripts are omitted and every symbol refers to the quantity associated with the j-th parameter of the i-th node.
From the way the central-side parameter is computed in this mode, it can be found that the number of child nodes with W > 0 can be deduced from the received central-side value (with equal data amounts it equals M(1 + v)/2, where v is the received value at this position). First, assume that the local W > 0, and let θ, with 0 ≤ θ ≤ 1, denote the proportion of nodes, among all nodes other than this one, whose W > 0. Then, among the remaining M - 1 nodes, there are (M - 1)θ nodes with W_{k,j} > 0. This is equivalent to the following situation: among a group of M observation samples of some normal distribution N(μ, σ), there are (M - 1)θ positive observations of unknown specific value, (M - 1)(1 - θ) negative observations of unknown specific value, and one known positive observation W. The mean μ of this normal distribution can then be regarded as the mean, over all child nodes, associated with the local observation W > 0, which to some extent reflects the magnitude of the global accumulated gradient at this position. Therefore, the maximum-likelihood principle can be used to find the maximum-likelihood estimate of this mean.
The problem is constructed as the maximum-likelihood estimation shown in formula image appb-000023, in which the referenced function, defined for x ~ N(0, 1), denotes the upper quantile (tail) function of the standard normal distribution. Taking the logarithm of this expression yields the equivalent problem shown in formula image appb-000024. Performing the parameter substitution y = μ/σ and x = W/σ, and noting that the specific value of W does not affect the optimization objective, the problem can be transformed again into the form shown in formula image appb-000025.
To solve the above optimization problem, x is optimized first. Because of the constraint W > 0, there must be x > 0. Extracting all terms related to x yields the sub-problem shown in formula image appb-000026. It can easily be solved by taking the derivative, giving the closed-form value of x shown in formula image appb-000027. Substituting this value back into the original problem, the problem can be equivalently transformed into the form shown in formula image appb-000028.
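For readability, one way of writing out the estimation problem described above is sketched below. The published formulas appear only as images, so this reconstruction is assembled from the verbal description alone; the exact notation, constant terms and normalization of the original may differ.

```latex
% Sketch of the likelihood described above (an assumption, not a copy of the
% published formula images). One exact observation W>0, (M-1)\theta observations
% known only to be positive and (M-1)(1-\theta) known only to be negative,
% all i.i.d. N(\mu,\sigma^2), with Q(\alpha)=P\{x>\alpha \mid x\sim N(0,1)\}:
\max_{\mu,\,\sigma>0}\;
  \frac{1}{\sigma}\,\phi\!\left(\frac{W-\mu}{\sigma}\right)
  Q\!\left(-\frac{\mu}{\sigma}\right)^{(M-1)\theta}
  Q\!\left(\frac{\mu}{\sigma}\right)^{(M-1)(1-\theta)} .
% Taking logarithms, dropping terms that do not depend on the variables, and
% substituting y=\mu/\sigma, x=W/\sigma gives
\max_{x>0,\;y}\;\;\ln x-\frac{(x-y)^2}{2}
  +(M-1)\theta\,\ln Q(-y)+(M-1)(1-\theta)\,\ln Q(y) .
% Optimizing over x alone (setting 1/x-(x-y)=0) gives
\hat{x}=\frac{y+\sqrt{y^{2}+4}}{2},
% which, substituted back, leaves a one-dimensional objective in y whose maximizer
% \hat{y}(\theta) is the curve that the text fits with a logarithmic function.
```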
The optimization objective is a function of y that first increases and then decreases. The value of this function depends only on θ and M. Since M can be known in advance, the curve relating the optimal solution ŷ to θ can be drawn in advance, this relationship can be fitted with a specific function, and the fit can be stored locally, which reduces the complexity of solving the optimization problem. In this embodiment, a logarithmic function containing a constant c (see formula images appb-000030 and appb-000031) is used to perform a least-squares fit of this curve, where θ takes values in [0, 1] and c > 0 is imposed to ensure that the function is defined. This fitted function can then serve, for the given M, as an approximate expression of ŷ as a function of θ, and is stored locally. The solution of the original problem when W > 0 is then obtained as shown in formula image appb-000033.
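As an illustration of this offline fitting step, the sketch below numerically maximizes the reduced objective from the reconstruction above for M = 100 and fits the resulting ŷ(θ) curve by least squares. Both the use of that reduced objective and the fitted form a·ln(θ + c) + b are assumptions, since the original expressions are given only as formula images; all names and starting values are illustrative.

```python
import numpy as np
from scipy.optimize import minimize_scalar, curve_fit
from scipy.stats import norm

M = 100

def objective(y, theta):
    # Reduced one-dimensional objective in y (assumed form, see the sketch above).
    x_hat = (y + np.sqrt(y**2 + 4)) / 2
    logQ = norm.logsf                     # log of the upper-tail probability Q(.)
    return (np.log(x_hat) - (x_hat - y)**2 / 2
            + (M - 1) * theta * logQ(-y)
            + (M - 1) * (1 - theta) * logQ(y))

thetas = np.linspace(0.001, 0.999, 200)
y_opt = np.array([minimize_scalar(lambda y: -objective(y, t),
                                  bounds=(-10, 10), method="bounded").x
                  for t in thetas])

# Least-squares fit with an assumed logarithmic form a*ln(theta + c) + b, c > 0.
def log_model(theta, a, b, c):
    return a * np.log(theta + c) + b

(a, b, c), _ = curve_fit(log_model, thetas, y_opt, p0=(1.0, 0.0, 0.1),
                         bounds=([-np.inf, -np.inf, 1e-6], np.inf))
print(a, b, c)   # coefficients stored locally and reused during local fusion
```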
Considering the case W < 0 and carrying out the same derivation, the solution of the original problem is obtained as shown in formula image appb-000034.
In the local fusion process, for each W, the value of ŷ is first computed from the logarithmic approximate expression obtained above; the corresponding intermediate quantities (formula images appb-000036 to appb-000038) are then computed in turn; and finally the resulting value (formula image appb-000039) is used to replace W. Here α > 1 denotes a magnification factor, a parameter selected in advance, which generally takes a value between 1.5 and 2.5. (For the sake of system stability, in order to make the system converge more stably, it should be ensured that, as the computation proceeds, W always tends to move away from 0, but not by too much. Since the absolute magnitude of the high-precision parameters carries little meaning while their relative magnitude is important, all parameters can be scaled up proportionally and then clipped.) clamp(·) denotes the clipping operation: clamp(x, a, b) = a when x < a, clamp(x, a, b) = x when a ≤ x ≤ b, and clamp(x, a, b) = b when x > b.
Finally, the local fusion formula is as shown in formula image appb-000040. According to this local fusion formula, the local fusion module can update the local training model based on the received fusion parameters.
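A per-parameter sketch of this mode 1 local fusion is given below. Because the intermediate formulas are published only as images, several details are assumptions made for illustration: how θ is recovered from the broadcast average, the closed form of x̂ taken from the reconstruction above, the handling of the negative case by symmetry, and the clipping range [-1, 1]. The coefficients a, b, c are those of the locally stored logarithmic fit.

```python
import numpy as np

def local_fusion_mode1(W_local, w_avg, a, b, c, M=100, alpha=2.0, clip=1.0):
    """Mode 1 local fusion for a single parameter position (illustrative sketch).

    W_local: local high-precision value W at this position.
    w_avg:   broadcast central average of the +/-1 uploads at this position.
    a, b, c: coefficients of the stored fit y_hat = a*ln(theta + c) + b.
    """
    n_pos = M * (1.0 + w_avg) / 2.0               # nodes that uploaded +1 (equal-data case)
    if W_local > 0:
        theta = (n_pos - 1) / (M - 1)             # positive fraction among the other nodes
    else:
        theta = ((M - n_pos) - 1) / (M - 1)       # negative case, by symmetry (assumed)
    theta = min(max(theta, 0.0), 1.0)             # guard against edge cases
    y_mag = a * np.log(theta + c) + b
    y_hat = y_mag if W_local > 0 else -y_mag
    x_hat = (abs(y_hat) + np.sqrt(y_hat ** 2 + 4.0)) / 2.0   # from the derivation sketch
    sigma_hat = abs(W_local) / x_hat              # since x = W / sigma
    mu_hat = y_hat * sigma_hat                    # estimated global mean at this position
    return float(np.clip(alpha * mu_hat, -clip, clip))       # magnify by alpha, then clamp
```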
3. For central fusion in mode 2:
After the central server receives the binarized parameters uploaded by all child nodes, it sums them and takes the sign, obtaining the central-side parameter as the sign of the sum; if this result happens to be 0, either -1 or +1 is delivered at random. In this case, the meaning of the central-side value is as follows: for a given parameter, if the binarized parameter is positive over a larger proportion of all node data, the value +1 is taken, and otherwise -1 is taken. At this time, this parameter only reflects a rough overall trend and carries relatively little information.
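A minimal sketch of this sign-of-sum fusion, including the random tie-break described above, is given below; function and variable names are illustrative.

```python
import numpy as np

def central_fusion_mode2(binary_params, rng=np.random.default_rng()):
    """Mode 2 central fusion: sum the +/-1 uploads and keep only the sign.
    Positions where the sum is exactly 0 are broken randomly with -1 or +1."""
    s = np.sum(np.stack(binary_params), axis=0).astype(float)
    out = np.sign(s)
    ties = (out == 0)
    out[ties] = rng.choice([-1.0, 1.0], size=int(ties.sum()))
    return out
```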
4. For local fusion in mode 2:
After receiving the broadcast value, the child node performs the local fusion calculation using a simple linear fusion method (formula images appb-000044 and appb-000045), where sign(·) denotes the sign function and β, a parameter selected in advance, lies between 0 and 1.
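The exact linear combination is given only as a formula image; the sketch below shows one plausible reading that is consistent with the description (a β-weighted blend of the broadcast value with the sign of the local high-precision parameter). The specific form is an assumption for illustration only.

```python
def local_fusion_mode2(W_local, w_broadcast, beta=0.3):
    """Mode 2 local fusion (assumed form): blend the broadcast +/-1 value with
    the sign of the local high-precision parameter, weighted by beta in (0, 1).
    beta = 0.3 is the value used in the simulations reported later."""
    local_sign = 1.0 if W_local >= 0 else -1.0
    return (1.0 - beta) * local_sign + beta * w_broadcast
```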
5. For central fusion in mode 3:
After the central server receives the high-precision parameters uploaded by all child nodes, it averages them to obtain the central-side parameter. In this case, the meaning of the central-side parameter is the weighted-average accumulated gradient calculated by all child nodes from their local data sets.
6. For local fusion in mode 3:
After receiving the central-side parameter, the child node directly uses it to update its local high-precision parameters, that is, the local high-precision parameters are replaced by the received value.
This mode is basically the same as the fusion method under the traditional FL framework; the only difference is that the local model is a BNN, so the complexity of the forward computation during training and of inference is lower than that of an ordinary neural network.
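For completeness, a minimal sketch of the two mode 3 steps described above (a plain average on the central side, direct replacement on the child side); parameter names are illustrative.

```python
import numpy as np

def central_fusion_mode3(high_precision_params):
    """Mode 3 central fusion: plain average of the high-precision uploads."""
    return np.mean(np.stack(high_precision_params), axis=0)

def local_fusion_mode3(W_local, W_central):
    """Mode 3 local fusion: the broadcast central parameters simply replace
    the local high-precision parameters."""
    return np.array(W_central, copy=True)
```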
Based on the above description of items 1 to 6, it can be seen that, for different transmission modes, the corresponding local fusion and central fusion methods can be used so that the corresponding learning computation proceeds smoothly.
As explained above, in different implementation scenarios, any one of mode 1, mode 2 or mode 3 can be selected as the transmission mode in the method shown in FIG. 9, so as to obtain the corresponding beneficial effects. It should be noted that, regardless of whether mode 1, mode 2 or mode 3 is adopted, since the child nodes perform their local training with a BNN, results can be obtained through iteration faster than with the existing FL architecture.
In other implementations of this application, the method shown in FIG. 9 can also be implemented by combining mode 1 and mode 2, or mode 1 and mode 3, or mode 2 and mode 3, or mode 1, mode 2 and mode 3.
For example, mode 2 can be used at the beginning of learning. In this way, although training results of high accuracy cannot be obtained, the first few rounds of learning can converge quickly. It can be understood that, in general, a learning process requires multiple rounds of learning to complete. During the first few rounds, the parameters in the model will very likely change in the subsequent learning process, so the accuracy requirements for these rounds are not high; if their convergence speed can be improved, this contributes considerably to the overall learning efficiency. When the accuracy reaches a certain level, mode 1 can be used to continue learning, thereby appropriately improving the accuracy of the parameters. When the accuracy has improved further, mode 3 can be used to continue learning, thereby obtaining the most accurate result.
As an example, Table 1 shows a correspondence between accuracy and mode selection.
Table 1
Accuracy                      Parameter transmission mode
Accuracy < 0.65               Mode 2
0.65 ≤ Accuracy < 0.8         Mode 1
Accuracy ≥ 0.8                Mode 3
As shown in Table 1, when the accuracy is less than 0.65, the central node determines to continue learning in mode 2. When the accuracy lies between 0.65 and 0.8, the central node can determine to learn in mode 1. When the accuracy is greater than or equal to 0.8, the central node can determine to learn in mode 3.
The accuracy may be calculated by the central node from the accuracies uploaded by the child nodes. For example, the central node may compute the accuracy of the system according to the relationship shown in formula image appb-000052 and determine the transmission mode according to the correspondence shown in Table 1. It should be noted that the values 0.65 and 0.8 in Table 1 are merely examples of threshold settings; in other implementations of this application, these thresholds may be set to other values or adjusted flexibly according to the environment.
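For illustration, a sketch of this accuracy-driven mode selection is given below. The aggregation of the per-node accuracies is shown in the source only as a formula image, so a (data-size-weighted) mean is assumed here; the 0.65 and 0.8 thresholds are the example values from Table 1, and all names are illustrative.

```python
def select_mode_by_accuracy(node_accuracies, node_data_sizes=None,
                            low=0.65, high=0.8):
    """Pick the parameter transmission mode from the reported accuracies."""
    if node_data_sizes is None:
        system_acc = sum(node_accuracies) / len(node_accuracies)
    else:
        total = sum(node_data_sizes)
        system_acc = sum(a * n for a, n in zip(node_accuracies, node_data_sizes)) / total
    if system_acc < low:
        return 2   # fast convergence: binarized uplink and downlink
    if system_acc < high:
        return 1   # binarized uplink, high-precision downlink
    return 3       # high-precision uplink and downlink
```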
It should be noted that each child node can verify the local training model corresponding to its local parameters with the test set stored on it, thereby obtaining the corresponding accuracy, and send this accuracy to the central node. In other implementations of this application, the verification operation for obtaining the accuracy may also be completed at the central node. For example, a training model corresponding to each local node may be stored in the central node. After receiving the local parameters sent by a local node, the central node can update the training model according to these local parameters and, based on the updated training model and the test set stored in the central node, verify the accuracy corresponding to these local parameters, thereby obtaining the corresponding accuracy. Similarly, for the local parameters uploaded by other nodes, the central node can also obtain the corresponding accuracies. The central node can then calculate the system accuracy based on the accuracies corresponding to the individual local parameters, and further determine the data transmission mode.
In other implementations of this application, the accuracy may also be obtained by the central node through verification based on the fusion parameters after central fusion and the training model, in combination with a test set or a validation data set. In different implementations, the method for determining the accuracy can be chosen flexibly, which is not limited in the embodiments of this application.
In the above description, selecting the transmission mode according to accuracy is used as an example. In other implementations of this application, the central node may also determine the transmission mode by other methods. For example, the central node may determine the transmission mode according to the iteration round number N. Table 2 shows one possible correspondence between the iteration round number N and the transmission mode.
Table 2
Iteration round number N     Parameter transmission mode
N ≤ 5                        Mode 2
5 < N ≤ 50                   Mode 1
N > 50                       Mode 3
As shown in Table 2, when the number of iteration rounds is within 5, the central node can determine that the current learning has a higher demand for improving convergence speed, and mode 2 can be used. When the number of iteration rounds is greater than 5 and within 50, the central node can determine that the accuracy needs to be appropriately improved in the current learning, and mode 1 is then used. When the number of iteration rounds is greater than 50, the central node can consider that the learning is about to end and the parameters need to be transmitted with the highest precision, that is, mode 3 is used.
It should be noted that, in some implementations of this application, when switching among the three modes occurs, the central node may instruct the child nodes to adjust the parameter transmission mode. For example, a parameter transmission mode field may be added for this indication; the three modes can be indicated with 2 bits, e.g., 00 indicates that the transmission mode for the next round is mode 1, 01 indicates mode 2, and 10 indicates mode 3. The parameter transmission mode field may be delivered together with the central fusion model, or delivered over a dedicated control channel.
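A minimal sketch of the 2-bit mode field described above, using the example mapping from the text; the handling of the unused codepoint 11 and all names are illustrative.

```python
# 2-bit encoding of the parameter transmission mode field
# (00 -> mode 1, 01 -> mode 2, 10 -> mode 3; 11 unused in the example mapping).
MODE_TO_BITS = {1: 0b00, 2: 0b01, 3: 0b10}
BITS_TO_MODE = {v: k for k, v in MODE_TO_BITS.items()}

def encode_mode_field(mode: int) -> int:
    return MODE_TO_BITS[mode]

def decode_mode_field(bits: int) -> int:
    return BITS_TO_MODE[bits & 0b11]

assert decode_mode_field(encode_mode_field(3)) == 3
```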
From the above description of the solution, it can be understood that, in the machine learning system shown in FIG. 4 to FIG. 8, using the machine learning method shown in FIG. 9 can significantly increase the learning efficiency of the system. Combined with the selective use of mode 1, mode 2 and mode 3, the corresponding mode can be adaptively selected according to the different requirements on accuracy and convergence speed in the current learning process, so as to obtain the optimal learning result.
To illustrate the effects that can be achieved by the solutions provided in the embodiments of this application, these effects are described below by way of example in combination with simulation data.
Taking MNIST handwritten digit recognition as an example, a 4-layer convolutional neural network consisting of two convolutional layers and two fully connected layers is used in the simulation. The training set of the MNIST data set is evenly distributed over 100 child nodes; each node has 600 pairs of data in total, with 60 pairs of each class, and the test set is kept only on the central server. The final training-set results are the mean of the results of the 100 nodes, and the test-set results are obtained by the central server based on the binarized local parameters. High-precision parameters are quantized with 32 bits.
A 4-layer convolutional neural network is used for training, with the following structure: 3*3*16 convolutional layer, normalization layer, 2*2 max-pooling layer, tanh activation; 3*3*16 convolutional layer, normalization layer, 2*2 max-pooling layer, tanh activation; 784*100 fully connected layer, normalization layer, tanh activation; 100*10 fully connected layer, softmax activation; finally the cross-entropy loss function is used. The Adam gradient update method is used; the initial value of the learning rate η is 0.05, which is subsequently reduced every 30 iterations to 0.02, 0.01, 0.005 and 0.002 in turn.
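As an illustration of the network just described, a sketch in PyTorch is given below (the source does not name a framework). The normalization type, padding, and the omission of the weight/activation binarization machinery of the BNN are assumptions; only the layer sizes, the tanh/softmax activations, the cross-entropy loss and the Adam optimizer with initial learning rate 0.05 come from the text.

```python
import torch
import torch.nn as nn

class MnistBackboneSketch(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.BatchNorm2d(16),
            nn.MaxPool2d(2), nn.Tanh(),              # 28x28 -> 14x14
            nn.Conv2d(16, 16, 3, padding=1), nn.BatchNorm2d(16),
            nn.MaxPool2d(2), nn.Tanh(),              # 14x14 -> 7x7, and 7*7*16 = 784
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(784, 100), nn.BatchNorm1d(100), nn.Tanh(),
            nn.Linear(100, 10),                       # softmax is folded into the loss
        )

    def forward(self, x):
        return self.classifier(self.features(x))

model = MnistBackboneSketch()
criterion = nn.CrossEntropyLoss()                     # cross-entropy loss, as in the text
optimizer = torch.optim.Adam(model.parameters(), lr=0.05)
```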
For the local fusion method of mode 1, assuming M = 100, the use of a logarithmic function to fit the ŷ(θ) curve is verified: the ŷ(θ) curve and the approximate logarithmic curve obtained from the fit are both plotted, and the result is shown in FIG. 10. It can be seen that the approximate curve produced by the local fusion method of mode 1 (whose fitted expression is shown in formula images appb-000055 and appb-000056) essentially coincides with the true curve. Therefore, the local fusion method of mode 1 described above can approximate the real situation well.
FIG. 11 shows the curves of test-set accuracy over time when the invention is applied to the MNIST handwritten digit recognition data set. Centralized training, in which all data are collected at one central node for training, is used as the baseline curve for comparison. It can be seen that, in terms of test-set accuracy, both mode 1 and mode 3 achieve an effect close to the centralized one; mode 3 has a slight advantage in training effect, but mode 1 requires very little communication per iteration, so its communication cost is far smaller than that of mode 3, and mode 3 is not very stable. In mode 2, although the final performance is poorer, the extremely low communication requirement makes it highly competitive in the early stage of training; as training deepens, its performance cannot be compared with the first two modes. In general, mode 1 of the invention is suitable for most practical situations; mode 2 is suitable for the early stage of training, or when communication resources are very tight and the requirements on learning effect are very low; and mode 3 is suitable for fine-tuning an essentially trained model at a later stage.
Table 3 below compares the computation and communication required per child node for the test-set accuracy to reach 90% and 95% for the first time. The results obtained with α = 1.5 and β = 0.3 are taken as the results of mode 1 and mode 2, respectively. The computation amount is counted such that one forward computation plus one back-propagation on a child node counts as 1; each round of child-node training requires 10 forward computations and back-propagations, that is, a computation amount of 10.
Table 3
(The detailed figures of Table 3 are provided as table image PCTCN2021132048-appb-000057.)
It can be seen that the total number of parameters in the system is 82242. When the system works in mode 1, 10.04 KB must be uploaded and 66.70 KB downloaded in each round; when it works in mode 2, 10.04 KB must be uploaded and 10.04 KB downloaded in each round; the mode 2 entry for 95% is left blank because mode 2 cannot reach 95% accuracy; when it works in mode 3, 321.26 KB must be uploaded and 321.26 KB downloaded in each round. Compared with the training mode of ordinary federated learning (mode 3), the mode 1 method of the present invention requires far less communication per iteration; the method greatly reduces the communication volume of the distributed machine learning system and can therefore substantially reduce the total time required for a distributed machine learning task.
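The per-round upload figures quoted above can be checked directly from the parameter count; the short computation below reproduces the 1-bit and 32-bit sizes (the 66.70 KB mode 1 downlink figure is not re-derived here, since the text does not state how that payload is encoded).

```python
# 82242 parameters encoded with 1 bit each vs 32 bits each.
num_params = 82242
one_bit_kb = num_params / 8 / 1024              # ~10.04 KB, matches the binarized upload
thirty_two_bit_kb = num_params * 32 / 8 / 1024  # ~321.26 KB, matches mode 3
print(round(one_bit_kb, 2), round(thirty_two_bit_kb, 2))
```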
In the scenario where the data sets on the child nodes are not independent and identically distributed (non-IID), the simulation results of the machine learning method provided by the embodiments of this application are as follows.
Again taking MNIST handwritten digit recognition as an example, a 4-layer convolutional neural network consisting of two convolutional layers and two fully connected layers is used in the simulation; the network structure is the same as in Section 2.4.1, and the initial value of the learning rate η is 0.02, subsequently reduced every 30 iterations to 0.01, 0.005 and 0.002 in turn. The training set of the MNIST data set is distributed unevenly over 100 child nodes. The specific distribution method is as follows: the data set is first divided into 10 parts by class, each part is then divided equally into 100 shares, giving 1000 sub-data sets; these 1000 sub-data sets are randomly assigned to the 100 child nodes, with each child node randomly assigned 10 sub-data sets. The test set is kept only on the central server.
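For illustration, the non-IID partition just described can be sketched as follows; the function name, the seed and the return format are illustrative choices.

```python
import numpy as np

def partition_non_iid(labels, num_nodes=100, shards_per_node=10, num_classes=10,
                      seed=0):
    """Split the training set by class into 10 parts, split each part into 100
    shards (1000 shards in total), then give each of the 100 nodes 10 randomly
    chosen shards, following the description above."""
    rng = np.random.default_rng(seed)
    shards = []
    for c in range(num_classes):
        idx = np.flatnonzero(labels == c)
        rng.shuffle(idx)
        shards.extend(np.array_split(idx, 100))   # 100 shards of this class
    order = rng.permutation(len(shards))           # shuffle the 1000 shards
    return [np.concatenate([shards[j] for j in
                            order[i * shards_per_node:(i + 1) * shards_per_node]])
            for i in range(num_nodes)]
```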
FIG. 12 shows the curves of test-set accuracy when the invention is applied to the non-IID MNIST handwritten digit recognition data set. In the hybrid mode, mode switching is decided by the iteration round number: training initially uses mode 2, switches to mode 1 after 5 iterations, and switches to mode 3 after a further 50 iterations. In this case, because of the non-IID nature of the data set, the system suffers a certain performance loss. It can be seen that the hybrid mode can also eventually achieve a good result.
Table 4 gives the communication and computation costs required for the three modes and the hybrid mode to reach a given accuracy five consecutive times.
Table 4
(The detailed figures of Table 4 are provided as table image PCTCN2021132048-appb-000058.)
Combining the simulation results of Table 4, it can be seen that it is difficult to meet the 85% accuracy requirement using mode 1 or mode 2 alone, whereas the hybrid mode can meet it, and its communication overhead is more advantageous than that of mode 3.
The foregoing mainly describes the solutions provided by the embodiments of this application from the perspective of the child nodes and the central node. To implement the above functions, corresponding hardware structures and/or software modules for performing the functions are included. Those skilled in the art should readily appreciate that, in combination with the units and algorithm steps of the examples described in the embodiments disclosed herein, this application can be implemented in hardware or in a combination of hardware and computer software. Whether a function is performed by hardware or by computer software driving hardware depends on the particular application and design constraints of the technical solution. Skilled artisans may use different methods to implement the described functions for each particular application, but such implementations should not be considered to go beyond the scope of this application.
In the embodiments of this application, the devices involved may be divided into functional modules according to the above method examples. For example, each functional module may be divided corresponding to each function, or two or more functions may be integrated into one processing module. The above integrated module may be implemented in the form of hardware or in the form of a software functional module. It should be noted that the division of modules in the embodiments of this application is schematic and is merely a logical functional division; there may be other division manners in actual implementation.
Referring to FIG. 13, an embodiment of this application provides a machine learning apparatus 1300. The apparatus may be applied to a child node in which a binarized neural network model (BNN) is provided. The apparatus includes: an obtaining unit 1301, configured to perform BNN-based machine learning on a collected local data set to obtain local model parameters corresponding to the local data set; and a sending unit 1302, configured to send a first message to the central node, the first message including the local model parameters.
In a possible design, the local model parameters included in the first message are binarized local model parameters.
In a possible design, the apparatus further includes: a receiving unit 1303, configured to receive fusion parameters from the central node, the fusion parameters being obtained by the central node through fusion based on the local model parameters; and a fusion unit 1304, configured to fuse the fusion parameters and the local model parameters to obtain updated local model parameters.
In a possible design, the fusion parameters are binarized model parameters, or the fusion parameters are high-precision model parameters.
In a possible design, the first message further includes accuracy information corresponding to the local model parameters, and the obtaining unit 1301 is further configured to verify and obtain the accuracy information according to the local model parameters and a test data set.
In a possible design, the apparatus further includes a learning unit 1305, configured to continue machine learning based on the BNN according to the updated local model parameters.
It should be noted that all relevant content of the steps involved in the above method embodiments can be cited in the functional description of the corresponding functional module, and is not repeated here.
Referring to FIG. 14, an embodiment of this application provides a machine learning apparatus 1400. The apparatus is applied to a central node and includes: a receiving unit 1401, configured to receive N first messages respectively from N child nodes, the first messages including local model parameters, the local model parameters being binarized local model parameters, and N being an integer greater than or equal to 1; a fusion unit 1402, configured to fuse the local model parameters included in the N first messages to obtain fusion parameters; and a sending unit 1403, configured to send a second message to M child nodes, the second message including the fusion parameters, M being a positive integer greater than or equal to 1. The M child nodes are included in the N child nodes.
In a possible design, the fusion unit 1402 is specifically configured to perform a weighted average on the N local model parameters to obtain the fusion parameters.
In a possible design, the fusion parameters included in the second message are high-precision fusion parameters, or the fusion parameters included in the second message are binarized fusion parameters.
In a possible design, the apparatus further includes a determining unit 1404, configured to determine system accuracy information according to the N first messages, so that the central node determines, according to the system accuracy information, whether the fusion parameters included in the second message are high-precision fusion parameters or binarized fusion parameters.
In a possible design, the first message further includes accuracy information, the accuracy information corresponding to the accuracy obtained by verifying, at the corresponding child node, the local model parameters included in the first message; and the determining unit 1404 is specifically configured to determine the system accuracy information according to the accuracy information included in the N messages.
In a possible design, the determining unit 1404 is configured to determine that the fusion parameters are binarized fusion parameters when the system accuracy information is less than or equal to a first threshold, and to determine that the fusion parameters are high-precision fusion parameters when the system accuracy information is greater than or equal to a second threshold.
In a possible design, the sending unit 1403 is configured to send a second message including binarized fusion parameters to the M child nodes when the number of iteration rounds is less than or equal to a third threshold, and to send a second message including high-precision fusion parameters to the M child nodes when the number of iteration rounds is greater than or equal to a fourth threshold.
In a possible design, the sending unit 1403 is specifically configured to send the second message to the M child nodes by broadcast.
It should be noted that all relevant content of the steps involved in the above method embodiments can be cited in the functional description of the corresponding functional module, and is not repeated here.
FIG. 15 is a schematic diagram of the composition of a child node 1500. As shown in FIG. 15, the child node 1500 may include a processor 1501 and a memory 1502. The memory 1502 is configured to store computer-executable instructions. For example, in some embodiments, when the processor 1501 executes the instructions stored in the memory 1502, the child node 1500 can be caused to perform the data processing method shown in any of the above embodiments.
It should be noted that all relevant content of the steps involved in the above method embodiments can be cited in the functional description of the corresponding functional module, and is not repeated here.
FIG. 16 is a schematic diagram of the composition of a chip system 1600. The chip system may be applied to any child node involved in the embodiments of this application. The chip system 1600 may include a processor 1601 and a communication interface 1602, which are configured to support the related device in implementing the functions involved in the above embodiments. In a possible design, the chip system further includes a memory for storing the program instructions and data necessary for the child node. The chip system may consist of chips, or may include chips and other discrete devices. It should be noted that, in some implementations of this application, the communication interface 1602 may also be referred to as an interface circuit.
It should be noted that all relevant content of the steps involved in the above method embodiments can be cited in the functional description of the corresponding functional module, and is not repeated here.
FIG. 17 is a schematic diagram of the composition of a central node 1700. As shown in FIG. 17, the central node 1700 may include a processor 1701 and a memory 1702. The memory 1702 is configured to store computer-executable instructions. For example, in some embodiments, when the processor 1701 executes the instructions stored in the memory 1702, the central node 1700 can be caused to perform the data processing method shown in any of the above embodiments.
It should be noted that all relevant content of the steps involved in the above method embodiments can be cited in the functional description of the corresponding functional module, and is not repeated here.
FIG. 18 is a schematic diagram of the composition of a chip system 1800. The chip system may be applied to any central node involved in the embodiments of this application. The chip system 1800 may include a processor 1801 and a communication interface 1802, which are configured to support the related device in implementing the functions involved in the above embodiments. In a possible design, the chip system further includes a memory for storing the program instructions and data necessary for the central node. The chip system may consist of chips, or may include chips and other discrete devices. It should be noted that, in some implementations of this application, the communication interface 1802 may also be referred to as an interface circuit.
It should be noted that all relevant content of the steps involved in the above method embodiments can be cited in the functional description of the corresponding functional module, and is not repeated here.
The functions, actions, operations or steps in the above embodiments may be implemented in whole or in part by software, hardware, firmware or any combination thereof. When implemented with a software program, they may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, the processes or functions described in the embodiments of this application are generated in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or other programmable apparatus. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, the computer instructions may be transmitted from one website, computer, server or data center to another website, computer, server or data center in a wired manner (for example, coaxial cable, optical fiber, digital subscriber line (DSL)) or a wireless manner (for example, infrared, radio, microwave). The computer-readable storage medium may be any usable medium accessible to a computer, or a data storage device, such as a server or a data center, integrating one or more usable media. The usable medium may be a magnetic medium (for example, a floppy disk, a hard disk, or a magnetic tape), an optical medium (for example, a DVD), or a semiconductor medium (for example, a solid state drive (SSD)), or the like.
Although this application has been described with reference to specific features and embodiments thereof, it is apparent that various modifications and combinations may be made without departing from the spirit and scope of this application. Accordingly, the specification and the accompanying drawings are merely exemplary descriptions of this application as defined by the appended claims, and are deemed to cover any and all modifications, variations, combinations or equivalents within the scope of this application. Obviously, a person skilled in the art can make various changes and modifications to this application without departing from its spirit and scope. Thus, if these modifications and variations of this application fall within the scope of the claims of this application and their equivalent technologies, this application is also intended to include them.

Claims (36)

1. A machine learning method, wherein the method is applied to a child node, the child node being provided with a binarized neural network model (BNN), and the method comprises:
performing, by the child node, BNN-based machine learning on a collected local data set to obtain local model parameters corresponding to the local data set; and
sending, by the child node, a first message to a central node, the first message comprising the local model parameters.
2. The method according to claim 1, wherein the local model parameters included in the first message are binarized local model parameters.
3. The method according to claim 1 or 2, wherein the method further comprises:
receiving, by the child node, fusion parameters from the central node, the fusion parameters being obtained by the central node through fusion based on the local model parameters; and
performing, by the child node, fusion processing based on the fusion parameters and the local model parameters to obtain updated local model parameters.
4. The method according to claim 3, wherein the fusion parameters are binarized model parameters, or the fusion parameters are high-precision model parameters.
5. The method according to any one of claims 1-4, wherein the first message further comprises:
accuracy information corresponding to the local model parameters,
wherein the accuracy information is obtained by the child node through verification based on the local model parameters and a test data set.
6. The method according to any one of claims 1-5, wherein the method further comprises:
continuing, by the child node, machine learning based on the BNN according to the updated local model parameters.
7. A machine learning method, wherein the method is applied to a central node, and the method comprises:
receiving, by the central node, N first messages respectively from N child nodes, the first messages comprising local model parameters, the local model parameters being binarized local model parameters, and N being an integer greater than or equal to 1;
fusing, by the central node, the local model parameters included in the N first messages to obtain fusion parameters; and
sending, by the central node, a second message to M child nodes, the second message comprising the fusion parameters, M being a positive integer greater than or equal to 1.
8. The method according to claim 7, wherein the fusing, by the central node, of the N local model parameters to obtain the fusion parameters comprises:
performing, by the central node, a weighted average on the N local model parameters to obtain the fusion parameters.
9. The method according to claim 7 or 8, wherein the fusion parameters included in the second message are high-precision fusion parameters, or
the fusion parameters included in the second message are binarized fusion parameters.
10. The method according to any one of claims 7-9, wherein, before the sending, by the central node, of the second message to the M child nodes, the method further comprises:
determining, by the central node, system accuracy information according to the N first messages; and
determining, by the central node according to the system accuracy information, whether the fusion parameters included in the second message are high-precision fusion parameters or binarized fusion parameters.
11. The method according to claim 10, wherein the first message further comprises accuracy information, the accuracy information corresponding to the accuracy obtained by verifying, at the corresponding child node, the local model parameters included in the first message; and
the determining, by the central node, of the system accuracy information according to the N first messages comprises:
determining, by the central node, the system accuracy information according to the accuracy information included in the N messages.
12. The method according to claim 10 or 11, wherein, when the system accuracy information is less than or equal to a first threshold, the central node determines that the fusion parameters are binarized fusion parameters; and
when the system accuracy information is greater than or equal to a second threshold, the central node determines that the fusion parameters are high-precision fusion parameters.
13. The method according to any one of claims 7-9, wherein the sending, by the central node, of the second message to the M child nodes comprises:
when the number of iteration rounds is less than or equal to a third threshold, sending, by the central node, a second message comprising binarized fusion parameters to the M child nodes; and
when the number of iteration rounds is greater than or equal to a fourth threshold, sending, by the central node, a second message comprising high-precision fusion parameters to the M child nodes.
14. The method according to any one of claims 7-13, wherein the central node sends the second message to the M child nodes by broadcast.
  15. A machine learning apparatus, wherein the apparatus is applied to a child node, the child node is provided with a binarized neural network (BNN) model, and the apparatus comprises: an obtaining unit, configured to perform BNN-based machine learning on a collected local data set to obtain local model parameters corresponding to the local data set; and a sending unit, configured to send a first message to a central node, wherein the first message includes the local model parameters.
  16. The apparatus according to claim 15, wherein the local model parameters included in the first message are binarized local model parameters.
  17. The apparatus according to claim 15 or 16, wherein the apparatus further comprises: a receiving unit, configured to receive a fusion parameter from the central node, wherein the fusion parameter is obtained by the central node through fusion based on the local model parameters; and
    a fusion unit, configured to perform fusion according to the fusion parameter and the local model parameters to obtain updated local model parameters.
  18. The apparatus according to claim 17, wherein the fusion parameter is a binarized model parameter, or the fusion parameter is a high-precision model parameter.
  19. The apparatus according to any one of claims 15 to 18, wherein the first message further comprises accuracy information corresponding to the local model parameters; and
    the obtaining unit is further configured to obtain the accuracy information through verification according to the local model parameters and a test data set.
  20. The apparatus according to any one of claims 15 to 19, wherein the apparatus further comprises: a learning unit, configured to continue BNN-based machine learning according to the updated local model parameters.
  21. A machine learning apparatus, wherein the apparatus is applied to a central node, and the apparatus comprises: a receiving unit, configured to receive N first messages respectively from N child nodes, wherein each first message includes local model parameters, the local model parameters are binarized local model parameters, and N is an integer greater than or equal to 1;
    a fusion unit, configured to fuse the local model parameters included in the N first messages to obtain a fusion parameter; and
    a sending unit, configured to send a second message to M child nodes, wherein the second message includes the fusion parameter, and M is a positive integer greater than or equal to 1.
  22. The apparatus according to claim 21, wherein the fusion unit is specifically configured to perform weighted averaging on the N local model parameters to obtain the fusion parameter.
  23. The apparatus according to claim 21 or 22, wherein the fusion parameter included in the second message is a high-precision fusion parameter, or the fusion parameter included in the second message is a binarized fusion parameter.
  24. The apparatus according to any one of claims 21 to 23, wherein the apparatus further comprises: a determining unit, configured to determine system accuracy information according to the N first messages, wherein the central node determines, according to the system accuracy information, that the fusion parameter included in the first message is a high-precision fusion parameter or a binarized fusion parameter.
  25. The apparatus according to claim 24, wherein the first message further comprises accuracy information, and the accuracy information corresponds to accuracy obtained by verifying, at the corresponding child node, the local model parameters included in the first message; and the determining unit is specifically configured to determine the system accuracy information according to the accuracy information included in the N messages.
  26. The apparatus according to claim 24 or 25, wherein the determining unit is configured to determine, when the system accuracy information is less than or equal to a first threshold, that the fusion parameter is a binarized fusion parameter; and the determining unit is further configured to determine, when the system accuracy information is greater than or equal to a second threshold, that the fusion parameter is a high-precision fusion parameter.
  27. The apparatus according to any one of claims 21 to 23, wherein the sending unit is configured to send, when a quantity of iteration rounds is less than or equal to a third threshold, the second message including a binarized fusion parameter to the M child nodes; and the sending unit is further configured to send, when the quantity of iteration rounds is greater than or equal to a fourth threshold, the second message including a high-precision fusion parameter to the M child nodes.
  28. The apparatus according to any one of claims 21 to 27, wherein the sending unit is specifically configured to send the second message to the M child nodes by broadcasting.
  29. A machine learning system, wherein the machine learning system comprises one or more machine learning apparatuses according to any one of claims 15 to 20, and one or more machine learning apparatuses according to any one of claims 21 to 28.
  30. A child node, wherein the child node comprises one or more processors and one or more memories, the one or more memories are coupled to the one or more processors, and the one or more memories store computer instructions; and
    when the one or more processors execute the computer instructions, the child node is caused to perform the machine learning method according to any one of claims 1 to 6.
  31. A central node, wherein the central node comprises one or more processors and one or more memories, the one or more memories are coupled to the one or more processors, and the one or more memories store computer instructions; and
    when the one or more processors execute the computer instructions, the central node is caused to perform the machine learning method according to any one of claims 7 to 14.
  32. A machine learning system, wherein the machine learning system comprises one or more central nodes according to claim 31 and one or more child nodes according to claim 30.
  33. A chip system, wherein the chip system comprises an interface circuit and a processor, the interface circuit and the processor are interconnected through a line, and the interface circuit is configured to receive a signal from a memory and send the signal to the processor, wherein the signal includes computer instructions stored in the memory; and when the processor executes the computer instructions, the chip system performs the machine learning method according to any one of claims 1 to 6, or the chip system performs the machine learning method according to any one of claims 7 to 14.
  34. A computer-readable storage medium, wherein the computer-readable storage medium comprises computer instructions, and when the computer instructions are run, the machine learning method according to any one of claims 1 to 6 is performed, or the machine learning method according to any one of claims 7 to 14 is performed.
  35. A computer program, wherein when the computer program runs on a computer, the machine learning method according to any one of claims 1 to 6 is performed, or the machine learning method according to any one of claims 7 to 14 is performed.
  36. A computer program product, wherein the computer program product comprises instructions, and when the computer program product runs on a computer, the machine learning method according to any one of claims 1 to 6 is performed, or the machine learning method according to any one of claims 7 to 14 is performed.
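
For illustration only, the child-node behaviour recited in claims 15 to 20 can be sketched as a short Python routine. This is a minimal sketch under assumed choices, not the claimed implementation: the sign-based binarization, the helper names child_node_round, local_train and evaluate, and the dictionary layout of the first message are assumptions introduced for the example.

```python
import numpy as np

def binarize(params):
    # Map real-valued parameters to {+1, -1}; sign-based binarization is an
    # assumed choice, the claims only require "binarized" parameters.
    return np.where(np.asarray(params, dtype=float) >= 0.0, 1.0, -1.0)

def child_node_round(local_params, local_data, test_data, local_train, evaluate):
    """One child-node round in the sense of claims 15-20 (sketch only).

    local_train and evaluate stand in for the BNN training and validation
    routines of whatever model is deployed; they are placeholders, not part
    of the claims.
    """
    # Obtaining unit: BNN-based machine learning on the collected local data set.
    local_params = local_train(local_params, local_data)

    # Binarize the learned parameters before reporting them (claim 16).
    binary_params = binarize(local_params)

    # Verify against a test data set to obtain accuracy information (claim 19).
    accuracy = evaluate(binary_params, test_data)

    # Sending unit: the first message carries the parameters and the accuracy.
    return {"local_model_params": binary_params, "accuracy": accuracy}
```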
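
Similarly, the central-node behaviour recited in claims 21 to 28 (weighted-average fusion, accuracy-driven selection between binarized and high-precision fusion parameters, and broadcasting of the second message) can be sketched as follows. Averaging the reported accuracies into the system accuracy information and the behaviour between the first and second thresholds are assumptions; the claims leave both open.

```python
import numpy as np

def fuse(local_params_list, weights=None):
    # Fusion unit: weighted average of the N binarized local model parameters
    # (claim 22); uniform weights are used if none are supplied.
    stacked = np.stack([np.asarray(p, dtype=float) for p in local_params_list])
    return np.average(stacked, axis=0, weights=weights)

def select_fusion_param(fused, system_accuracy, first_threshold, second_threshold):
    # Determining unit (claims 24-26): low system accuracy -> binarized fusion
    # parameter, high system accuracy -> high-precision fusion parameter.
    if system_accuracy <= first_threshold:
        return np.where(fused >= 0.0, 1.0, -1.0)   # binarized fusion parameter
    if system_accuracy >= second_threshold:
        return fused                               # high-precision fusion parameter
    return fused  # behaviour between the two thresholds is not fixed by the claims

def central_node_round(first_messages, first_threshold, second_threshold):
    """One central-node round in the sense of claims 21-28 (sketch only)."""
    # System accuracy information derived from the reported accuracies (claim 25);
    # taking the mean is an assumption, the claims do not fix the aggregation rule.
    system_accuracy = float(np.mean([m["accuracy"] for m in first_messages]))
    fused = fuse([m["local_model_params"] for m in first_messages])
    payload = select_fusion_param(fused, system_accuracy,
                                  first_threshold, second_threshold)
    # Sending unit: the second message is broadcast to the M child nodes (claim 28).
    return {"fusion_param": payload}
```

A plausible reading of the accuracy- and round-based switching in claims 12, 13, 26 and 27 is a bandwidth/precision trade-off: binarized fusion parameters keep the downlink payload near one bit per weight while the system is still coarse, and high-precision fusion parameters refine the model later; the claims themselves do not state this motivation.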

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202011365069.6A CN114548356A (en) 2020-11-27 2020-11-27 Machine learning method, device and system
CN202011365069.6 2020-11-27

Publications (1)

Publication Number Publication Date
WO2022111403A1 (en) 2022-06-02

Family

ID=81668422

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/132048 WO2022111403A1 (en) 2020-11-27 2021-11-22 Machine learning method, device, and system

Country Status (2)

Country Link
CN (1) CN114548356A (en)
WO (1) WO2022111403A1 (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108932124A (en) * 2018-06-26 2018-12-04 Oppo广东移动通信有限公司 neural network model compression method, device, terminal device and storage medium
US10152676B1 (en) * 2013-11-22 2018-12-11 Amazon Technologies, Inc. Distributed training of models using stochastic gradient descent
WO2019117646A1 (en) * 2017-12-15 2019-06-20 한국전자통신연구원 Method and device for providing compression and transmission of training parameters in distributed processing environment
CN110084378A (en) * 2019-05-07 2019-08-02 南京大学 A kind of distributed machines learning method based on local learning strategy
CN110795477A (en) * 2019-09-20 2020-02-14 平安科技(深圳)有限公司 Data training method, device and system
CN111709533A (en) * 2020-08-19 2020-09-25 腾讯科技(深圳)有限公司 Distributed training method and device of machine learning model and computer equipment

Also Published As

Publication number Publication date
CN114548356A (en) 2022-05-27

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application
    Ref document number: 21896904
    Country of ref document: EP
    Kind code of ref document: A1
NENP Non-entry into the national phase
    Ref country code: DE
122 Ep: pct application non-entry in european phase
    Ref document number: 21896904
    Country of ref document: EP
    Kind code of ref document: A1