WO2021068926A1 - Model update method, working node, and model update system - Google Patents

Model update method, working node, and model update system

Info

Publication number
WO2021068926A1
Authority
WO
WIPO (PCT)
Prior art keywords
working node
model
working
information
node
Prior art date
Application number
PCT/CN2020/120135
Other languages
English (en)
Chinese (zh)
Inventor
朱越
张宝峰
王成录
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司
Publication of WO2021068926A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00 Arrangements for software engineering
    • G06F8/60 Software deployment
    • G06F8/65 Updates

Definitions

  • This application relates to the field of artificial intelligence (AI), and mainly relates to a model update method, work node, and model update system.
  • the machine learning system is the most important branch of the AI system.
  • Distributed machine learning (DML) systems are currently commonly used systems for processing large-scale artificial intelligence applications.
  • Traditional distributed machine learning systems are centralized systems that use computing clusters to train prediction models on massive amounts of user data.
  • the central node schedules each working node to calculate the gradient of the loss function with respect to the model; after the calculation is completed, all the working nodes are allowed to upload the gradient to the central node; the central node updates the model after receiving the uploaded gradient.
  • the user needs to request service from the system.
  • the system makes a decision according to the model trained by the central node in response to the user's request, and then transmits the result of the decision to the user for execution.
  • the embodiments of the present application provide a model update method, work node, and model update system, which can provide a flexible and efficient optimization technology for a decentralized distributed AI system without relying on a central node.
  • In a first aspect, an embodiment of the present application provides a model update method, including: a first working node receives first information sent by at least one second working node, where the at least one second working node and the first working node are working nodes belonging to the same subnet; and the first working node updates the model according to the first information and the model and data stored by the first working node. Each piece of first information includes at least one of the following items of the second working node that sends it: at least one parameter of the model, the gradient, and the impulse (that is, the momentum term). The model stored by the first working node and the model stored by the at least one second working node have at least partially the same model structure.
  • the first working node or the second working node may be any smart terminal device, or may be a virtual node composed of several smart terminal devices.
  • the model stored by the first working node and the model stored by the second working node have at least part of the same model structure.
  • the first working node may update the model using the first information sent by the second working nodes in the same neighborhood as the first working node, together with the data saved by the first working node.
  • the first information may include at least one of the following information of the second working node that sends the first information: at least one parameter of the model, the gradient, and the impulse of the model.
  • the model update method provided in the embodiments of this application is a decentralized model update method, that is, the method does not depend on a central node: the first working node uses the model-related information transmitted by one or more second working nodes in the same subnet to update the model on the first working node.
  • this method does not need to upload user data to the cloud, and only needs to update the model of the first working node through parameters such as the model, gradient, and impulse of the second working node, which can ensure the privacy of users.
  • each working node can update its model by combining its saved user data with the first information. Updating the model according to the saved user data customizes a personalized model for the working node, thereby providing users with more accurate services.
  • the first working node can directly use its saved data to perform the update without waiting for the calculation results of other working nodes, so the calculation is fast and the latency is low.
  • the model stored in the first working node and the model stored in the at least one second working node have the same model structure.
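  • The content of the first information described above can be pictured as a small record with three optional fields. The following is a minimal sketch, not the applicant's data format: the class name `FirstInformation`, the field names, and the helper `carried_fields` are all hypothetical, and parameters are represented as plain lists of floats for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

Vector = List[float]


@dataclass
class FirstInformation:
    """Hypothetical record for what a second working node sends."""
    sender_id: str
    parameters: Optional[Dict[str, Vector]] = None  # one or more model parameters
    gradient: Optional[Dict[str, Vector]] = None
    impulse: Optional[Dict[str, Vector]] = None

    def __post_init__(self) -> None:
        # The text requires at least one of the three items to be present.
        if self.parameters is None and self.gradient is None and self.impulse is None:
            raise ValueError("first information must carry parameters, a gradient, or an impulse")


def carried_fields(info: FirstInformation) -> List[str]:
    """List which of the three optional items this message carries."""
    names = ("parameters", "gradient", "impulse")
    return [name for name in names if getattr(info, name) is not None]
```

  • No user data appears in the record, which mirrors the privacy argument in the text: only model-related quantities leave a working node.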
  • the number of second working nodes is k. Before the first working node receives the first information sent by the at least one second working node, the method further includes: when an update condition is satisfied, the first working node sends a first request to the other working nodes that belong to the same subnet as the first working node, where the first request asks the other working nodes to participate in the model update of the first working node; and the first working node receives the first responses sent by the other working nodes, where each first response indicates whether the responding working node participates in the model update of the first working node, and the number of other working nodes that participate in the model update of the first working node is
  • the update condition is that the first working node is currently in an idle state, that is, the first working node is not currently used by the user. In this way, it can be ensured that the first working node does not affect the user's use during model update.
  • the update condition is that the network environment where the first working node is currently located is a non-mobile data network environment. This can ensure that the first working node saves the user's cost when updating the model, and does not incur additional network costs.
  • the update condition is that the first working node is currently in an idle state and the network environment where the first working node is currently located is a non-mobile data network environment. In this way, it can be ensured that the first working node does not affect the use of the user when the model is updated, and does not generate additional network expenses, thereby saving the cost of the user.
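  • The three update conditions above differ only in which checks are enabled. A minimal sketch follows, assuming the device can report two booleans; the names `NodeStatus`, `in_use_by_user`, `on_mobile_data`, and `update_condition_met` are hypothetical and not taken from the application.

```python
from dataclasses import dataclass


@dataclass
class NodeStatus:
    """Hypothetical snapshot of the first working node's state."""
    in_use_by_user: bool   # False means the node is idle
    on_mobile_data: bool   # False means a non-mobile data network, e.g. Wi-Fi


def update_condition_met(status: NodeStatus,
                         require_idle: bool = True,
                         require_non_mobile: bool = True) -> bool:
    """Return True when the enabled checks all pass."""
    if require_idle and status.in_use_by_user:
        return False
    if require_non_mobile and status.on_mobile_data:
        return False
    return True
```

  • With both flags enabled the function corresponds to the combined condition; disabling one flag gives the idle-only or network-only variant.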
  • the method further includes: the first working node sends the impulse of the updated model to the at least one second working node, so that the at least one second working node performs a model update.
  • the embodiment of the application can send the impulse of the model updated by the first working node to the second working node, so that the second working node also updates its model, which helps keep the output of the second working node's model accurate and provides users with more accurate services.
  • before the first working node receives the first information sent by the at least one second working node, the method further includes: the first working node sends, to the at least one second working node, a request to obtain the first information of the at least one second working node.
  • the first working node may send a request to obtain the first information of the second working node to update the model according to the first information of the second working node.
  • In the entire model update process, calculations are performed only on the first working node, and there is no need to wait for the calculation process of other working nodes, so the calculation is fast and the delay is small.
  • the first information includes at least one parameter and gradient of the model of the second working node that sends the first information.
  • the first working node performs model update on the model according to the first information and the model and data stored in the first working node, including:
  • the first working node uses the average value of any parameter of the received model of each second working node and the corresponding parameter in the first working node, together with the received gradient of each second working node, to perform one or more gradient descents to update the model; or,
  • the first working node uses at least one parameter of the received model of each second working node and the average value of the received gradients of the second working nodes to perform one or more gradient descents to update the model; or,
  • the first working node uses the average value of any parameter of the received model of each second working node and the corresponding parameter in the first working node, together with the average value of the received gradients of the second working nodes, to perform one or more gradient descents to update the model.
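  • Two of the alternatives above can be sketched as follows. This is one possible reading, not the claimed procedure: parameters and gradients are flattened into lists of floats, the learning rate `lr` is an assumed hyperparameter, the gradient computed from the node's own local data is left out for brevity, and all function names are hypothetical.

```python
from typing import List, Sequence

Vector = List[float]


def mean_vector(vectors: Sequence[Vector]) -> Vector:
    """Element-wise average of equally sized vectors."""
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]


def descend(params: Vector, gradient: Vector, lr: float) -> Vector:
    """One plain gradient descent step: p <- p - lr * g."""
    return [p - lr * g for p, g in zip(params, gradient)]


def averaged_params_then_each_gradient(local: Vector,
                                       neighbor_params: Sequence[Vector],
                                       neighbor_grads: Sequence[Vector],
                                       lr: float) -> Vector:
    """Average local and received parameters, then take one step per received gradient."""
    params = mean_vector([local, *neighbor_params])
    for gradient in neighbor_grads:
        params = descend(params, gradient, lr)
    return params


def averaged_params_and_averaged_gradient(local: Vector,
                                          neighbor_params: Sequence[Vector],
                                          neighbor_grads: Sequence[Vector],
                                          lr: float) -> Vector:
    """Average local and received parameters, then take one step with the averaged gradient."""
    params = mean_vector([local, *neighbor_params])
    return descend(params, mean_vector(neighbor_grads), lr)
```

  • The first function applies several descent steps and the second applies a single one, which matches the "one or more gradient descents" wording.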
  • the first information includes the gradient of the second working node that sends the first information.
  • the first working node performs model update on the model according to the first information and the model and data stored in the first working node, including:
  • the above-mentioned first working node uses the received gradient of each second working node to perform one or more gradient descents to update the model.
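  • When only gradients are received, the update still draws on the data stored by the first working node. The sketch below assumes a linear model with a mean squared error loss purely for illustration; the application does not fix a model or loss here, and `local_gradient`, `update_with_gradients`, and `lr` are hypothetical names.

```python
from typing import List, Sequence, Tuple

Vector = List[float]
Sample = Tuple[Vector, float]  # (features, target)


def local_gradient(weights: Vector, samples: Sequence[Sample]) -> Vector:
    """Gradient of the mean squared error of y = w . x on the node's own data."""
    count = len(samples)
    grad = [0.0] * len(weights)
    for features, target in samples:
        error = sum(w * x for w, x in zip(weights, features)) - target
        for i, x in enumerate(features):
            grad[i] += 2.0 * error * x / count
    return grad


def update_with_gradients(weights: Vector,
                          samples: Sequence[Sample],
                          received: Sequence[Vector],
                          lr: float) -> Vector:
    """One step on the local-data gradient, then one step per received gradient."""
    weights = [w - lr * g for w, g in zip(weights, local_gradient(weights, samples))]
    for gradient in received:
        weights = [w - lr * g for w, g in zip(weights, gradient)]
    return weights
```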
  • the embodiment of the application provides multiple ways to update the model based on the first information.
  • the first working node can update the model of the first working node through the model and impulse of the second working node without obtaining user data. Implicit data such as impulse can be used to update the model, which can ensure the privacy of users.
  • the above-mentioned first information is information after compression or truncation processing.
  • the data transferred between the first working node and the second working node is compressed or truncated, which can reduce network communication overhead.
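  • The application does not name a specific compression or truncation scheme. As one illustrative possibility only, the sketch below keeps the k largest-magnitude gradient entries and sends them as index/value pairs; `truncate_top_k` and `restore` are hypothetical helpers.

```python
from typing import Dict, List


def truncate_top_k(gradient: List[float], k: int) -> Dict[int, float]:
    """Keep only the k entries with the largest absolute value."""
    by_magnitude = sorted(range(len(gradient)),
                          key=lambda i: abs(gradient[i]),
                          reverse=True)
    return {i: gradient[i] for i in sorted(by_magnitude[:k])}


def restore(sparse: Dict[int, float], length: int) -> List[float]:
    """Rebuild a dense vector on the receiver, filling dropped entries with 0."""
    dense = [0.0] * length
    for index, value in sparse.items():
        dense[index] = value
    return dense
```

  • Sending 2 of 4 entries, as in `truncate_top_k([0.1, -3.0, 0.5, 2.0], 2)`, halves the number of values transferred at the cost of dropping the small components.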
  • the above-mentioned first information is information that is encrypted using homomorphic or semi-homomorphic encryption, or the above-mentioned first information is information that is encrypted using a differential privacy method in a trusted computing environment.
  • encrypting the data transferred between the first working node and the second working node can further ensure data security, thereby ensuring the privacy of users.
  • the model stored by the first working node is a machine learning model.
  • the above-mentioned machine learning model is a machine learning model based on gradient update
  • the above-mentioned machine learning model based on gradient update includes a deep neural network model or a support vector machine (SVM) model.
  • In a second aspect, an embodiment of the present application provides a working node, which includes a processor and a memory. The memory is used to store computer program instructions, and the processor executes the computer program instructions in the memory so that the working node performs the method provided by the first aspect or any implementation of the first aspect of the embodiments of the present application.
  • the working node is an electronic device or a part of the electronic device.
  • the working node is a system on a chip (SoC).
  • In a third aspect, an embodiment of the present application provides a model update system.
  • the system includes a first working node and at least one second working node.
  • the first working node is a working node provided by the second aspect or any implementation manner of the second aspect of the embodiments of the present application, and the second working node is a working node that belongs to the same sub-network as the first working node.
  • In a fourth aspect, an embodiment of the present application provides a computer-readable storage medium.
  • the computer-readable storage medium stores instructions that, when run on an electronic device, cause the electronic device to execute the method provided by the first aspect or any implementation of the first aspect.
  • In a fifth aspect, the embodiments of the present application provide a computer program product which, when running on an electronic device, causes the electronic device to execute the method provided by the first aspect or any implementation of the first aspect.
  • In a sixth aspect, an embodiment of the present application provides a device. The device includes a processing system configured to execute program instructions so that the device performs the method provided by the first aspect or any implementation of the first aspect.
  • the processing system includes at least one processor.
  • the device further includes at least one memory, the at least one memory is configured to store the program instructions, and the at least one memory is coupled with the processing system.
  • the device is an electronic device or a part of the electronic device.
  • the device is a system on a chip (SoC).
  • the work node provided in the second aspect, the model update system provided in the third aspect, the computer storage medium provided in the fourth aspect, the computer program product provided in the fifth aspect, or the device provided in the sixth aspect can be used to implement the model update method provided in the first aspect. Therefore, the beneficial effects that can be achieved can refer to the beneficial effects in the corresponding method, which will not be repeated here.
  • FIG. 1 is a schematic structural diagram of a model update system provided by an embodiment of the application;
  • FIG. 2 is a schematic structural diagram of an electronic device involved in an embodiment of the application;
  • FIG. 3 is a schematic diagram of a process of selecting a second working node according to an embodiment of the application;
  • FIG. 4 is a schematic diagram of the process of updating and storing the first working node and the second working node respectively according to an embodiment of the application;
  • FIG. 5 is a schematic flowchart of a model update method provided by an embodiment of the application.
  • the embodiments of the present application provide a model update method, work node, and model update system. This method does not depend on the central node.
  • a working node can combine its saved data (local data) with information from other working nodes in its neighborhood (those that belong to the same subnet as the working node) to update its machine learning model (for example, a deep neural network model or a support vector machine (SVM) model).
  • the working nodes in this method are updated in combination with their local data, so that the model of each working node can be a personalized model, which conforms to the usage habits of users who use the working node, and provides users with accurate services.
  • the use of local data for updating can eliminate the need to wait for the calculation results of other working nodes, so that the model of the working node is updated quickly and the time delay is small.
  • the updated model of the first working node can be used for prediction or reasoning.
  • the updated model can be applied to reasoning in scenarios such as image recognition and speech recognition to improve the accuracy of image recognition or speech recognition.
  • the updated model can be used to predict the application (APP) that the user will use, and the predicted APP is loaded into the memory in advance to improve the response speed of the device. It is not limited to the use scenarios listed above.
  • the updated model of the first working node can also be applied to other use scenarios, which are not limited in the embodiments of the present application.
  • FIG. 1 exemplarily shows a schematic structural diagram of a model update system provided by an embodiment of the present application.
  • the model update system may include a first working node 11 and at least one second working node 12 (the second working nodes 12a, 12b, 12c, 12d, 12e are exemplarily shown in Fig. 1).
  • Data can be transmitted between each second working node and the first working node.
  • the data transmission method may specifically be wired transmission or wireless transmission. Wherein, wired transmission may be data transmission through a data line, and wireless transmission may be data transmission through a mobile data network or short-distance wireless transmission (such as but not limited to Bluetooth, wireless fidelity (Wi-Fi), etc.).
  • each first working node or second working node may be a virtual node, and the virtual node may be composed of one or more electronic devices (in FIG. 1, the first virtual node is shown as including one electronic device as an example). If the virtual node is composed of multiple electronic devices, one of the electronic devices can act as a gateway to handle the data interaction between the virtual node and other virtual nodes.
  • the first working node and the second working node belong to the same subnet.
  • the range of the subnet can be the connection range of the same router or the connection range of the same base station.
  • the second working node and the first working node are directly connected in the same subnet, and these directly connected working nodes form a neighborhood N.
  • Other working nodes belonging to the same subnet as the first working node may be referred to as neighboring nodes.
  • the above-mentioned first working node may store a model (which may be referred to as the local model of the first working node), and the second working node may also store a model (which may be referred to as the local model of the second working node).
  • the local model of the first working node and the local model of the second working node have the same model type, and the local model of the first working node and the local model of the second working node have at least part of the same model structure.
  • the local model of the first working node may be a machine learning model, specifically, a machine learning model based on gradient update, such as but not limited to a deep neural network model or an SVM model.
  • Convolutional neural networks can include input layers, convolutional layers, activation functions, pooling layers, and fully connected layers.
  • the input layer can be used to input user data;
  • the convolutional layer can be used to extract local features;
  • the activation function can be used to add nonlinear factors;
  • the pooling layer is used to downsample the features, extract the main features, reduce the computational complexity of the network, and increase robustness;
  • the fully connected layer is used to connect all the features and send the output value to the classifier.
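  • The layer roles listed above can be traced through a toy one-dimensional forward pass. This is an illustrative sketch only: real models use multi-channel tensors and a framework, the convolution is written as the sliding dot product customary in deep learning, and all function names and numeric values are made up for the example.

```python
from typing import List


def conv1d(signal: List[float], kernel: List[float]) -> List[float]:
    """Convolutional layer: sliding dot product that extracts local features."""
    width = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(width))
            for i in range(len(signal) - width + 1)]


def relu(values: List[float]) -> List[float]:
    """Activation function: adds the nonlinearity."""
    return [max(0.0, v) for v in values]


def max_pool(values: List[float], size: int) -> List[float]:
    """Pooling layer: downsample by keeping the maximum of each window."""
    return [max(values[i:i + size]) for i in range(0, len(values) - size + 1, size)]


def fully_connected(values: List[float], weights: List[float], bias: float) -> float:
    """Fully connected layer: combine all features into one output value."""
    return sum(v * w for v, w in zip(values, weights)) + bias


def forward(signal: List[float], kernel: List[float],
            fc_weights: List[float], fc_bias: float) -> float:
    """Input -> convolution -> activation -> pooling -> fully connected."""
    features = max_pool(relu(conv1d(signal, kernel)), size=2)
    return fully_connected(features, fc_weights, fc_bias)
```

  • Only `kernel`, `fc_weights`, and `fc_bias` are trainable here; the activation and pooling steps carry no weights.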
  • the model structure shared by the local model of the first working node and the local model of the second working node in the present application may at least be the layers containing updatable weight parameters, such as a convolutional layer and a fully connected layer.
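  • When two local models share only part of their structure, information can be exchanged only for the layers they have in common. A minimal sketch follows, assuming each model is a dictionary from layer name to a flat weight list and that a layer is shared when the name and weight count match; this matching rule and the helper names are assumptions, not taken from the application.

```python
from typing import Dict, List

Weights = Dict[str, List[float]]


def shared_layers(local: Weights, remote: Weights) -> List[str]:
    """Names of layers present in both models with the same number of weights."""
    return [name for name in local
            if name in remote and len(local[name]) == len(remote[name])]


def average_shared(local: Weights, remote: Weights) -> Weights:
    """Average the shared layers and leave the node's other layers untouched."""
    merged = dict(local)
    for name in shared_layers(local, remote):
        merged[name] = [(a + b) / 2 for a, b in zip(local[name], remote[name])]
    return merged
```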
  • the first working node or the second working node involved in this application may be an electronic device, or the working node may be a part of the electronic device (such as a processor).
  • the first working node or the second working node involved in this application may also be an SoC.
  • the electronic devices in the embodiments of this application may be mobile phones, tablet computers, desktops, laptops, notebook computers, ultra-mobile personal computers (UMPCs), handheld computers, netbooks, personal digital assistants (PDAs), wearable electronic equipment, or virtual reality equipment, etc.
  • The following description uses an electronic device as an example of the working node.
  • the structure of the working node (electronic device) provided by the embodiment of the present application is described below in conjunction with FIG. 2.
  • FIG. 2 shows a schematic diagram of the structure of the electronic device 100.
  • the electronic device 100 may include a processor 110, an external memory interface 120, an internal memory 121, a universal serial bus (USB) interface 130, a charging management module 140, a power management module 141, a battery 142, an antenna 1, an antenna 2, a mobile communication module 150, a wireless communication module 160, an audio module 170, a speaker 170A, a receiver 170B, a microphone 170C, an earphone jack 170D, a sensor module 180, buttons 190, a motor 191, an indicator 192, a camera 193, a display screen 194, and a subscriber identification module (SIM) card interface 195, etc.
  • the sensor module 180 may include a pressure sensor 180A, a gyroscope sensor 180B, an air pressure sensor 180C, a magnetic sensor 180D, an acceleration sensor 180E, a distance sensor 180F, a proximity sensor 180G, a fingerprint sensor 180H, a temperature sensor 180J, a touch sensor 180K, an ambient light sensor 180L, a bone conduction sensor 180M, etc.
  • the structure illustrated in the embodiment of the present invention does not constitute a specific limitation on the electronic device 100.
  • the electronic device 100 may include more or fewer components than shown, or combine certain components, or split certain components, or arrange different components.
  • the illustrated components can be implemented in hardware, software, or a combination of software and hardware.
  • the processor 110 may include one or more processing units.
  • the processor 110 may include an application processor (AP), a modem processor, a graphics processing unit (GPU), an image signal processor (ISP), a controller, a video codec, a digital signal processor (DSP), a baseband processor, and/or a neural-network processing unit (NPU), etc.
  • the different processing units may be independent devices or integrated in one or more processors.
  • the controller can generate operation control signals according to the instruction operation code and timing signals to complete the control of fetching and executing instructions.
  • a memory may also be provided in the processor 110 to store instructions and data.
  • the memory in the processor 110 is a cache memory.
  • the memory can store instructions or data that the processor 110 has just used or used cyclically. If the processor 110 needs to use the instruction or data again, it can be directly called from the memory. Repeated accesses are avoided, the waiting time of the processor 110 is reduced, and the efficiency of the system is improved.
  • the processor 110 may include one or more interfaces.
  • Interfaces can include an inter-integrated circuit (I2C) interface, an inter-integrated circuit sound (I2S) interface, a pulse code modulation (PCM) interface, a universal asynchronous receiver/transmitter (UART) interface, a mobile industry processor interface (MIPI), a general-purpose input/output (GPIO) interface, a subscriber identity module (SIM) interface, and/or a universal serial bus (USB) interface, etc.
  • the I2C interface is a bidirectional synchronous serial bus, which includes a serial data line (SDA) and a serial clock line (SCL).
  • the processor 110 may include multiple sets of I2C buses.
  • the processor 110 may couple the touch sensor 180K, the charger, the flash, the camera 193, etc., respectively through different I2C bus interfaces.
  • the processor 110 may couple the touch sensor 180K through an I2C interface, so that the processor 110 and the touch sensor 180K communicate through an I2C bus interface to implement the touch function of the electronic device 100.
  • the I2S interface can be used for audio communication.
  • the processor 110 may include multiple sets of I2S buses.
  • the processor 110 may be coupled with the audio module 170 through an I2S bus to implement communication between the processor 110 and the audio module 170.
  • the audio module 170 may transmit audio signals to the wireless communication module 160 through an I2S interface, so as to realize the function of answering calls through a Bluetooth headset.
  • the PCM interface can also be used for audio communication to sample, quantize and encode analog signals.
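  • The three PCM steps named above (sample, quantize, encode) can be illustrated in software. The sketch assumes signed 16-bit little-endian samples and an 8 kHz sampling rate, neither of which is stated in the application; `pcm16_encode` is a hypothetical helper.

```python
import math
import struct
from typing import Callable


def pcm16_encode(signal: Callable[[float], float],
                 sample_rate: int,
                 n_samples: int) -> bytes:
    """Sample an analog signal, quantize to 16-bit integers, and pack as bytes."""
    max_level = 2 ** 15 - 1  # largest positive code of a signed 16-bit sample
    codes = []
    for n in range(n_samples):
        amplitude = signal(n / sample_rate)           # sampling at t = n / fs
        amplitude = max(-1.0, min(1.0, amplitude))    # clip to the valid range
        codes.append(round(amplitude * max_level))    # quantization
    return struct.pack("<%dh" % n_samples, *codes)    # encoding, little-endian


def tone_1khz(t: float) -> float:
    """A 1 kHz sine wave with unit amplitude, used as the example input."""
    return math.sin(2 * math.pi * 1000 * t)
```

  • Calling `pcm16_encode(tone_1khz, 8000, 4)` produces 8 bytes, two per sample.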
  • the audio module 170 and the wireless communication module 160 may be coupled through a PCM bus interface.
  • the audio module 170 may also transmit audio signals to the wireless communication module 160 through the PCM interface, so as to realize the function of answering calls through the Bluetooth headset. Both the I2S interface and the PCM interface can be used for audio communication.
  • the UART interface is a universal serial data bus used for asynchronous communication.
  • the bus can be a two-way communication bus. It converts the data to be transmitted between serial communication and parallel communication.
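  • The serial/parallel conversion mentioned above can be shown with a single byte. The sketch assumes the common 8N1 framing (one start bit, eight data bits sent least significant bit first, no parity, one stop bit); the application does not specify a frame format, and `uart_frame` and `uart_parse` are hypothetical names.

```python
from typing import List


def uart_frame(byte: int) -> List[int]:
    """Parallel to serial: start bit 0, eight data bits LSB first, stop bit 1."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("expected a value in 0..255")
    data_bits = [(byte >> i) & 1 for i in range(8)]
    return [0] + data_bits + [1]


def uart_parse(frame: List[int]) -> int:
    """Serial to parallel: check the framing bits and rebuild the byte."""
    if len(frame) != 10 or frame[0] != 0 or frame[-1] != 1:
        raise ValueError("invalid 8N1 frame")
    return sum(bit << i for i, bit in enumerate(frame[1:9]))
```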
  • the UART interface is generally used to connect the processor 110 and the wireless communication module 160.
  • the processor 110 communicates with the Bluetooth module in the wireless communication module 160 through the UART interface to realize the Bluetooth function.
  • the audio module 170 may transmit audio signals to the wireless communication module 160 through a UART interface, so as to realize the function of playing music through a Bluetooth headset.
  • the MIPI interface can be used to connect the processor 110 with the display screen 194, the camera 193 and other peripheral devices.
  • the MIPI interface includes a camera serial interface (CSI), a display serial interface (DSI), and so on.
  • the processor 110 and the camera 193 communicate through a CSI interface to implement the shooting function of the electronic device 100.
  • the processor 110 and the display screen 194 communicate through a DSI interface to realize the display function of the electronic device 100.
  • the GPIO interface can be configured through software.
  • the GPIO interface can be configured as a control signal or as a data signal.
  • the GPIO interface can be used to connect the processor 110 with the camera 193, the display screen 194, the wireless communication module 160, the audio module 170, the sensor module 180, and so on.
  • the GPIO interface can also be configured as an I2C interface, I2S interface, UART interface, MIPI interface, etc.
  • the USB interface 130 is an interface that complies with the USB standard specification, and specifically may be a Mini USB interface, a Micro USB interface, a USB Type C interface, and so on.
  • the USB interface 130 can be used to connect a charger to charge the electronic device 100, and can also be used to transfer data between the electronic device 100 and peripheral devices. It can also be used to connect earphones and play audio through earphones. This interface can also be used to connect to other electronic devices, such as AR devices.
  • the interface connection relationship between the modules illustrated in the embodiment of the present invention is merely a schematic description, and does not constitute a structural limitation of the electronic device 100.
  • the electronic device 100 may also adopt different interface connection modes in the foregoing embodiments, or a combination of multiple interface connection modes.
  • the charging management module 140 is used to receive charging input from the charger.
  • the charger can be a wireless charger or a wired charger.
  • the charging management module 140 may receive the charging input of the wired charger through the USB interface 130.
  • the charging management module 140 may receive the wireless charging input through the wireless charging coil of the electronic device 100. While the charging management module 140 charges the battery 142, it can also supply power to the electronic device through the power management module 141.
  • the power management module 141 is used to connect the battery 142, the charging management module 140 and the processor 110.
  • the power management module 141 receives input from the battery 142 and/or the charging management module 140, and supplies power to the processor 110, the internal memory 121, the display screen 194, the camera 193, and the wireless communication module 160.
  • the power management module 141 can also be used to monitor parameters such as battery capacity, battery cycle times, and battery health status (leakage, impedance).
  • the power management module 141 may also be provided in the processor 110.
  • the power management module 141 and the charging management module 140 may also be provided in the same device.
  • the wireless communication function of the electronic device 100 can be implemented by the antenna 1, the antenna 2, the mobile communication module 150, the wireless communication module 160, the modem processor, and the baseband processor.
  • the antenna 1 and the antenna 2 are used to transmit and receive electromagnetic wave signals.
  • Each antenna in the electronic device 100 can be used to cover a single or multiple communication frequency bands. Different antennas can also be reused to improve antenna utilization.
  • Antenna 1 can be multiplexed as a diversity antenna of a wireless local area network.
  • the antenna can be used in combination with a tuning switch.
  • the mobile communication module 150 can provide a wireless communication solution including 2G/3G/4G/5G and the like applied to the electronic device 100.
  • the mobile communication module 150 may include at least one filter, switch, power amplifier, low noise amplifier (LNA), etc.
  • the mobile communication module 150 can receive electromagnetic waves through the antenna 1, perform processing such as filtering and amplification on the received electromagnetic waves, and transmit them to the modem processor for demodulation.
  • the mobile communication module 150 can also amplify the signal modulated by the modem processor, which is then converted into electromagnetic waves and radiated via the antenna 1.
  • at least part of the functional modules of the mobile communication module 150 may be provided in the processor 110.
  • the mobile communication module 150 and at least part of the modules of the processor 110 may be provided in the same device.
  • the mobile communication module 150 may be used for data transmission with other working nodes, for example, receiving requests sent by other working nodes, or sending first information to other working nodes.
  • the modem processor may include a modulator and a demodulator.
  • the modulator is used to modulate the low-frequency baseband signal to be sent into a medium-high-frequency signal.
  • the demodulator is used to demodulate the received electromagnetic wave signal into a low-frequency baseband signal. Then the demodulator transmits the demodulated low-frequency baseband signal to the baseband processor for processing. After the low-frequency baseband signal is processed by the baseband processor, it is passed to the application processor.
  • the application processor outputs a sound signal through an audio device (not limited to the speaker 170A, the receiver 170B, etc.), or displays an image or video through the display screen 194.
  • the modem processor may be an independent device. In other embodiments, the modem processor may be independent of the processor 110 and be provided in the same device as the mobile communication module 150 or other functional modules.
  • the wireless communication module 160 can provide wireless communication solutions applied to the electronic device 100, including wireless local area networks (WLAN) (such as wireless fidelity (Wi-Fi) networks), Bluetooth (BT), global navigation satellite system (GNSS), frequency modulation (FM), near field communication (NFC) technology, and infrared (IR) technology.
  • the wireless communication module 160 may be one or more devices integrating at least one communication processing module.
  • the wireless communication module 160 receives electromagnetic waves via the antenna 2, frequency modulates and filters the electromagnetic wave signals, and sends the processed signals to the processor 110.
  • the wireless communication module 160 may also receive the signal to be sent from the processor 110, perform frequency modulation, amplify it, and convert it into electromagnetic waves to radiate through the antenna 2.
  • the wireless communication module 160 may be used for data transmission with other working nodes, for example, receiving requests sent by other working nodes, or sending first information to other working nodes.
  • the antenna 1 of the electronic device 100 is coupled with the mobile communication module 150, and the antenna 2 is coupled with the wireless communication module 160, so that the electronic device 100 can communicate with the network and other devices through wireless communication technology.
  • the wireless communication technology may include global system for mobile communications (GSM), general packet radio service (GPRS), code division multiple access (CDMA), wideband code division multiple access (WCDMA), time-division synchronous code division multiple access (TD-SCDMA), long term evolution (LTE), BT, GNSS, WLAN, NFC, FM, and/or IR technology, etc.
  • the GNSS may include the global positioning system (GPS), the global navigation satellite system (GLONASS), the Beidou navigation satellite system (BDS), the quasi-zenith satellite system (QZSS), and/or satellite-based augmentation systems (SBAS).
  • the electronic device 100 implements a display function through a GPU, a display screen 194, an application processor, and the like.
  • the GPU is an image processing microprocessor, which is connected to the display screen 194 and the application processor.
  • the GPU is used to perform mathematical and geometric calculations for graphics rendering.
  • the processor 110 may include one or more GPUs, which execute program instructions to generate or change display information.
  • the display screen 194 is used to display images, videos, and the like.
  • the display screen 194 includes a display panel.
  • the display panel can adopt a liquid crystal display (LCD), an organic light-emitting diode (OLED), an active-matrix organic light-emitting diode (AMOLED), a flexible light-emitting diode (FLED), Mini-LED, Micro-LED, Micro-OLED, quantum dot light-emitting diodes (QLED), etc.
  • the electronic device 100 may include one or N display screens 194, and N is a positive integer greater than one.
  • the electronic device 100 can realize a shooting function through an ISP, a camera 193, a video codec, a GPU, a display screen 194, and an application processor.
  • the ISP is used to process the data fed back by the camera 193. For example, when taking a picture, the shutter is opened, the light is transmitted to the photosensitive element of the camera through the lens, the light signal is converted into an electrical signal, and the photosensitive element of the camera transmits the electrical signal to the ISP for processing and is converted into an image visible to the naked eye. ISP can also optimize the image noise, brightness, and skin color. ISP can also optimize the exposure, color temperature and other parameters of the shooting scene. In some embodiments, the ISP may be provided in the camera 193. In this embodiment of the present application, the ISP may process the images collected by the camera 193 that include user gestures.
  • the camera 193 is used to capture still images or videos.
  • the object generates an optical image through the lens and is projected to the photosensitive element.
  • the photosensitive element may be a charge coupled device (CCD) or a complementary metal-oxide-semiconductor (CMOS) phototransistor.
  • the photosensitive element converts the optical signal into an electrical signal, and then transfers the electrical signal to the ISP to convert it into a digital image signal.
  • ISP outputs digital image signals to DSP for processing.
  • DSP converts digital image signals into standard RGB, YUV and other formats of image signals.
  • the electronic device 100 may include one or N cameras 193, and N is a positive integer greater than one.
  • the camera 193 may be used to collect an image stream at a first frequency, and when the NPU recognizes that the image stream contains the initial part of a gesture supported by the electronic device, the image stream is collected at the second frequency. Among them, the first frequency is lower than the second frequency.
  • the camera 193 involved in the embodiment of the present application may be a front camera.
  • Digital signal processors are used to process digital signals. In addition to digital image signals, they can also process other digital signals. For example, when the electronic device 100 selects a frequency point, the digital signal processor is used to perform Fourier transform on the energy of the frequency point.
  • Video codecs are used to compress or decompress digital video.
  • the electronic device 100 may support one or more video codecs. In this way, the electronic device 100 can play or record videos in multiple encoding formats, such as moving picture experts group (MPEG) 1, MPEG2, MPEG3, and MPEG4.
  • the NPU is a neural-network (NN) computing processor.
  • the NPU can be used to process the image collected by the camera 193, and analyze whether the gesture contained in the image is the initial part of a gesture supported by the electronic device 100, or whether the gesture contained in the image is a gesture supported by the electronic device 100.
  • the NPU may be used to store the model of the working node, that is, the local model of the working node, and to update the model.
  • the external memory interface 120 may be used to connect an external memory card, such as a Micro SD card, to expand the storage capacity of the electronic device 100.
  • the external memory card communicates with the processor 110 through the external memory interface 120 to realize the data storage function. For example, save music, video and other files in an external memory card.
  • the internal memory 121 may be used to store computer executable program code, where the executable program code includes instructions.
  • the internal memory 121 may include a storage program area and a storage data area.
  • the storage program area can store an operating system and an application program required by at least one function (such as a sound playback function, an image playback function, etc.).
  • the data storage area can store data (such as audio data, phone book, etc.) created during the use of the electronic device 100.
  • the internal memory 121 may include a high-speed random access memory, and may also include a non-volatile memory, such as at least one magnetic disk storage device, a flash memory device, a universal flash storage (UFS), and the like.
  • the processor 110 executes various functional applications and data processing of the electronic device 100 by running instructions stored in the internal memory 121 and/or instructions stored in a memory provided in the processor.
  • the internal memory 121 may be used to store the model of the gesture supported by the electronic device 100 and the function corresponding to each supported gesture. Possibly, in the application interface of different applications, the gestures supported may be different, and the functions implemented by the same gesture may also be different.
  • the electronic device 100 can implement audio functions through the audio module 170, the speaker 170A, the receiver 170B, the microphone 170C, the earphone interface 170D, and the application processor. For example, music playback, recording, etc.
  • the audio module 170 is used to convert digital audio information into an analog audio signal for output, and is also used to convert an analog audio input into a digital audio signal.
  • the audio module 170 can also be used to encode and decode audio signals.
  • the audio module 170 may be provided in the processor 110, or part of the functional modules of the audio module 170 may be provided in the processor 110.
  • the speaker 170A, also called a “loudspeaker”, is used to convert audio electrical signals into sound signals.
  • the electronic device 100 can listen to music through the speaker 170A, or listen to a hands-free call.
  • the receiver 170B, also called an “earpiece”, is used to convert audio electrical signals into sound signals.
  • when the electronic device 100 answers a call or a voice message, the user can listen to the voice by bringing the receiver 170B close to the ear.
  • the microphone 170C, also called a “mic”, is used to convert sound signals into electrical signals.
  • the user can speak with the mouth close to the microphone 170C to input the sound signal into the microphone 170C.
  • the electronic device 100 may be provided with at least one microphone 170C. In other embodiments, the electronic device 100 may be provided with two microphones 170C, which can implement noise reduction functions in addition to collecting sound signals. In other embodiments, the electronic device 100 may also be provided with three, four or more microphones 170C to collect sound signals, reduce noise, identify sound sources, and realize directional recording functions.
  • the earphone interface 170D is used to connect wired earphones.
  • the earphone interface 170D may be the USB interface 130, a 3.5 mm open mobile terminal platform (OMTP) standard interface, or a cellular telecommunications industry association of the USA (CTIA) standard interface.
  • the pressure sensor 180A is used to sense the pressure signal and can convert the pressure signal into an electrical signal.
  • the pressure sensor 180A may be provided on the display screen 194.
  • the capacitive pressure sensor may include at least two parallel plates with conductive material. When a force is applied to the pressure sensor 180A, the capacitance between the electrodes changes.
  • the electronic device 100 determines the intensity of the pressure according to the change in capacitance.
  • the electronic device 100 detects the intensity of the touch operation according to the pressure sensor 180A.
  • the electronic device 100 may also calculate the touched position according to the detection signal of the pressure sensor 180A.
  • touch operations that act on the same touch position but have different touch operation strengths may correspond to different operation instructions. For example, when a touch operation whose intensity of the touch operation is less than the first pressure threshold is applied to the short message application icon, an instruction to view the short message is executed. When a touch operation with a touch operation intensity greater than or equal to the first pressure threshold acts on the short message application icon, an instruction to create a new short message is executed.
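  • the threshold rule in the short message example can be sketched as follows. This is only an illustration: the threshold value, its unit, and the instruction names are hypothetical and are not given in the application.

```python
# Hypothetical threshold in arbitrary units; the application gives no value.
FIRST_PRESSURE_THRESHOLD = 0.5


def sms_icon_instruction(touch_intensity: float,
                         threshold: float = FIRST_PRESSURE_THRESHOLD) -> str:
    """Map the intensity of a touch on the short message icon to an instruction."""
    if touch_intensity < threshold:
        # Intensity below the first pressure threshold: view the short message.
        return "view_short_message"
    # Intensity greater than or equal to the threshold: create a new short message.
    return "create_short_message"
```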
  • the gyro sensor 180B may be used to determine the movement posture of the electronic device 100.
  • the angular velocity of the electronic device 100 around three axes (i.e., the x, y, and z axes) can be determined by the gyro sensor 180B.
  • the gyro sensor 180B can be used for image stabilization.
  • the gyro sensor 180B detects the shake angle of the electronic device 100, calculates the distance that the lens module needs to compensate according to the angle, and allows the lens to counteract the shake of the electronic device 100 through reverse movement to achieve anti-shake.
  • the gyro sensor 180B can also be used for navigation and somatosensory game scenes.
  • the air pressure sensor 180C is used to measure air pressure.
  • the electronic device 100 calculates the altitude based on the air pressure value measured by the air pressure sensor 180C to assist positioning and navigation.
  • the magnetic sensor 180D includes a Hall sensor.
  • the electronic device 100 may use the magnetic sensor 180D to detect the opening and closing of the flip holster.
  • the electronic device 100 can detect the opening and closing of the flip according to the magnetic sensor 180D.
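  • features such as automatic unlocking when the flip cover is opened can then be set according to the detected opening or closing state of the holster or the flip cover.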
  • the acceleration sensor 180E can detect the magnitude of the acceleration of the electronic device 100 in various directions (generally three axes). When the electronic device 100 is stationary, the magnitude and direction of gravity can be detected. It can also be used to identify the posture of electronic devices, and be used in applications such as horizontal and vertical screen switching, pedometers and so on.
  • the electronic device 100 can measure the distance by infrared or laser. In some embodiments, when shooting a scene, the electronic device 100 may use the distance sensor 180F to measure the distance to achieve fast focusing.
  • the proximity light sensor 180G may include, for example, a light emitting diode (LED) and a light detector such as a photodiode.
  • the light emitting diode may be an infrared light emitting diode.
  • the electronic device 100 emits infrared light to the outside through the light emitting diode.
  • the electronic device 100 uses a photodiode to detect infrared reflected light from nearby objects. When sufficient reflected light is detected, it can be determined that there is an object near the electronic device 100. When insufficient reflected light is detected, the electronic device 100 can determine that there is no object near the electronic device 100.
  • the electronic device 100 can use the proximity light sensor 180G to detect that the user holds the electronic device 100 close to the ear to talk, so as to automatically turn off the screen to save power.
  • the proximity light sensor 180G can also be used in leather case mode and pocket mode to automatically unlock and lock the screen.
  • the ambient light sensor 180L is used to sense the brightness of the ambient light.
  • the electronic device 100 can adaptively adjust the brightness of the display screen 194 according to the perceived brightness of the ambient light.
  • the ambient light sensor 180L can also be used to automatically adjust the white balance when taking pictures.
  • the ambient light sensor 180L can also cooperate with the proximity light sensor 180G to detect whether the electronic device 100 is in the pocket to prevent accidental touch.
  • the fingerprint sensor 180H is used to collect fingerprints.
  • the electronic device 100 can use the collected fingerprint characteristics to realize fingerprint unlocking, application lock access, fingerprint-triggered photographing, fingerprint-based call answering, and so on.
  • the temperature sensor 180J is used to detect temperature.
  • the electronic device 100 uses the temperature detected by the temperature sensor 180J to execute a temperature processing strategy. For example, when the temperature reported by the temperature sensor 180J exceeds a threshold value, the electronic device 100 reduces the performance of the processor located near the temperature sensor 180J, so as to reduce power consumption and implement thermal protection.
  • when the temperature is lower than another threshold, the electronic device 100 heats the battery 142 to avoid abnormal shutdown of the electronic device 100 due to low temperature.
  • the electronic device 100 may also boost the output voltage of the battery 142 at low temperature to avoid abnormal shutdown.
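  • the temperature processing strategy described above can be sketched as a simple threshold check. The threshold values, the Celsius unit, and the action names below are assumptions made for illustration; the application does not specify them.

```python
def temperature_strategy(temp_c: float,
                         high_threshold: float = 45.0,   # hypothetical value
                         low_threshold: float = 0.0,     # hypothetical value
                         low_temp_measure: str = "heat_battery") -> str:
    """Pick an action from the temperature reported by the temperature sensor.

    low_temp_measure selects which low-temperature embodiment is used,
    e.g. "heat_battery" or "boost_battery_output_voltage".
    """
    if temp_c > high_threshold:
        # Thermal protection: throttle the processor near the sensor.
        return "reduce_nearby_processor_performance"
    if temp_c < low_threshold:
        # Avoid abnormal shutdown caused by low temperature.
        return low_temp_measure
    return "no_action"
```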
  • the touch sensor 180K is also called a “touch device”.
  • the touch sensor 180K may be disposed on the display screen 194, and the touch screen is composed of the touch sensor 180K and the display screen 194, which is also called a “touch screen”.
  • the touch sensor 180K is used to detect touch operations acting on or near it.
  • the touch sensor can pass the detected touch operation to the application processor to determine the type of touch event.
  • the visual output related to the touch operation can be provided through the display screen 194.
  • the touch sensor 180K may also be disposed on the surface of the electronic device 100, which is different from the position of the display screen 194.
  • the bone conduction sensor 180M can acquire vibration signals.
  • the bone conduction sensor 180M can obtain the vibration signal of the bone mass vibrated by the human vocal part.
  • the bone conduction sensor 180M can also contact the human pulse and receive the blood pressure pulse signal.
  • the bone conduction sensor 180M may also be provided in the earphone, combined with the bone conduction earphone.
  • the audio module 170 can parse out the voice signal based on the vibration signal of the bone mass vibrated by the vocal part, as obtained by the bone conduction sensor 180M, to realize the voice function.
  • the application processor can parse heart rate information based on the blood pressure pulse signal obtained by the bone conduction sensor 180M, to realize the heart rate detection function.
  • the button 190 includes a power-on button, a volume button, and so on.
  • the button 190 may be a mechanical button. It can also be a touch button.
  • the electronic device 100 may receive key input, and generate key signal input related to user settings and function control of the electronic device 100.
  • the motor 191 can generate vibration prompts.
  • the motor 191 can be used for incoming call vibration notification, and can also be used for touch vibration feedback.
  • touch operations that act on different applications can correspond to different vibration feedback effects.
  • touch operations acting on different areas of the display screen 194 can also correspond to different vibration feedback effects of the motor 191.
  • different application scenarios (for example, time reminders, receiving information, alarm clocks, and games) can also correspond to different vibration feedback effects.
  • the touch vibration feedback effect can also support customization.
  • the indicator 192 may be an indicator light, which may be used to indicate the charging status, power change, or to indicate messages, missed calls, notifications, and so on.
  • the SIM card interface 195 is used to connect to the SIM card.
  • the SIM card can be inserted into the SIM card interface 195 or pulled out from the SIM card interface 195 to achieve contact and separation with the electronic device 100.
  • the electronic device 100 may support 1 or N SIM card interfaces, and N is a positive integer greater than 1.
  • the SIM card interface 195 can support Nano SIM cards, Micro SIM cards, SIM cards, etc.
  • multiple cards can be inserted into the same SIM card interface 195 at the same time. The types of the multiple cards can be the same or different.
  • the SIM card interface 195 can also be compatible with different types of SIM cards.
  • the SIM card interface 195 may also be compatible with external memory cards.
  • the electronic device 100 interacts with the network through the SIM card to implement functions such as call and data communication.
  • the electronic device 100 adopts an eSIM, that is, an embedded SIM card.
  • the eSIM card can be embedded in the electronic device 100 and cannot be separated from the electronic device 100.
  • the model update method provided by the embodiment of the present application can be roughly divided into two parts: first, the first working node selects the second working node participating in the update; second, the first working node and the second working node respectively update their own stored models.
  • Fig. 3 exemplarily shows the process in which the first working node selects the second working node participating in the update.
  • the process can include the following steps:
  • the first request is used to request the above-mentioned other working nodes to participate in the model update of the first working node.
  • the number of other working nodes can be one or more.
  • the aforementioned update condition may be that the first working node is currently in an idle state, that is, the first working node is not currently being used by the user. In this way, it can be ensured that the first working node does not affect the user's use during model update.
  • the above update condition may be that the network environment where the first working node is currently located is a non-mobile data network environment. This can ensure that the first working node saves the user's cost when updating the model, and does not incur additional network costs.
  • the aforementioned update condition may be that the first working node is currently in an idle state and the network environment where the first working node is currently located is a non-mobile data network environment. In this way, it can be ensured that the first working node does not affect the use of the user when the model is updated, and does not generate additional network expenses, thereby saving the cost of the user.
  • the updatable state may be that the working node is not currently involved in the model update of other working nodes, and the working node meets the update condition.
  • the update condition refer to the description of the update condition of the first working node in S301, which is not repeated here.
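  • the update condition and the updatable state can be sketched as two predicates. This sketch uses the combined variant of the update condition (idle and not on a mobile data network); the `NodeStatus` type and its field names are hypothetical and only serve the illustration.

```python
from dataclasses import dataclass


@dataclass
class NodeStatus:
    idle: bool                          # the node is not currently being used by the user
    on_mobile_data: bool                # the current network is a mobile data network
    helping_other_update: bool = False  # already involved in another node's model update


def meets_update_condition(status: NodeStatus) -> bool:
    """Combined variant: the node is idle and is not on a mobile data network."""
    return status.idle and not status.on_mobile_data


def is_updatable(status: NodeStatus) -> bool:
    """Updatable state: not involved in another update and the update condition holds."""
    return (not status.helping_other_update) and meets_update_condition(status)
```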
  • S303 The other working nodes randomly select whether to participate in the model update of the first working node.
  • the other working node may receive model update requests initiated by multiple first working nodes at the same time.
  • in that case, the other working node can randomly select one of the multiple first working nodes and participate in that node's model update.
  • the first response is sent to the first working node.
  • the first response is used to characterize that the other working node can participate in the model update of the first working node.
  • the other working node may also send a response to the first working node indicating that the other working node does not participate in the model update of the first working node.
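  • the behaviour of a working node that receives one or more first requests (S303 and the responses that follow) might be sketched as below. The function name, the `participate_probability` parameter, and the use of `None` to stand for declining are assumptions for illustration; the application only states that the choice is random.

```python
import random


def choose_request(requester_ids, participate_probability=0.5, rng=None):
    """Return the id of the first working node to help, or None to decline.

    requester_ids: ids of the first working nodes whose requests arrived.
    participate_probability: hypothetical probability of taking part at all.
    """
    rng = rng or random.Random()
    if not requester_ids:
        return None
    # Randomly decide whether to participate in any model update.
    if rng.random() >= participate_probability:
        return None
    # Several requests arrived at once: pick one requester at random.
    return rng.choice(list(requester_ids))
```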
  • the first working node selects k other working nodes from the m first responses as the second working node.
  • the first working node can receive m first responses, indicating that m other working nodes can participate in the model update of the first working node, and the first working node can mark these m other working nodes as nodes participating in the update.
  • the first response may carry the identification information of the other working node.
  • the first working node may mark the working node as a node participating in the update through the identification information carried in the first response.
  • the first working node may wait for a time t after sending the above-mentioned first request. That is, a working node that returns the first response within the time t is a node that participates in the update.
  • the value of t mentioned above can be, for example, but not limited to, 5 milliseconds (ms), 10 ms, 100 ms, and so on.
  • the first working node may randomly select k from the m nodes participating in the update as the second working node.
  • m and k are both positive integers, and m is greater than or equal to k.
  • the first working node may randomly sample k of the m first responses and respond to them again. For details on how to respond again, see S306.
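  • the selection of the second working nodes can be sketched as follows: keep the nodes whose first response arrived within the time t, then randomly sample k of those m nodes. The representation of responses as (node id, arrival delay) pairs and the error raised when m is smaller than k are assumptions made for this sketch.

```python
import random


def select_second_nodes(responses, t, k, rng=None):
    """Pick k second working nodes from the nodes that responded within t.

    responses: list of (node_id, arrival_delay_seconds) pairs (hypothetical format).
    t: waiting time in seconds after the first request was sent.
    k: number of second working nodes to select.
    """
    rng = rng or random.Random()
    # Nodes that returned a first response within t participate in the update.
    participants = [node_id for node_id, delay in responses if delay <= t]
    m = len(participants)
    if m < k:
        # The application requires m >= k; this sketch simply reports the violation.
        raise ValueError(f"only {m} responses arrived in time, need at least {k}")
    # Random sampling without replacement.
    return rng.sample(participants, k)
```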
  • the first working node can be denoted as $n_i$.
  • the neighboring nodes of the first working node $n_i$ (including the selected second working nodes) can be denoted as $n_j$, where $n_j \in N(n_i)$.
  • Fig. 4 exemplarily shows a process in which the first working node and the second working node respectively update their stored models. As shown in Figure 4, the process can include the following steps:
  • S306 The first working node sends a second request to the second working node.
  • the operation in which the first working node randomly samples k of the m first responses and responds again, denoted $S(\cdot)$, may specifically be that the first working node sends the second request to the second working node.
  • the second request may be a request to obtain at least one model parameter of the second working node and the gradient of the model.
  • the parameter of the at least one model may be a parameter of a part of the model stored in the first working node and the second working node with the same structure.
  • S307 The second working node calculates at least one parameter of its model and the gradient of the model.
  • the second working node calculates at least one parameter of its local model (that is, the model stored by the second working node) and the gradient of the model.
  • at least one parameter of the model of the second working node is denoted as $w_j$.
  • the model gradient calculated by the second working node may be a gradient based on mini-batch data, and the gradient is denoted as $g_j$.
  • because the amount of user data (or sample data) of each working node is relatively large, the entire user data (or sample data) subset can be divided into multiple mini-batches.
  • the gradient of the second working node model can be calculated based on one of the small batches of data, which can reduce the calculation time, increase the speed of model training, and ensure the accuracy of the calculation results.
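  • a mini-batch gradient can be illustrated with a deliberately small example. The model here (a one-parameter linear model with a squared-error loss) is a stand-in chosen for the sketch and is not the model of the application; the function names are likewise hypothetical.

```python
def split_into_minibatches(samples, batch_size):
    """Divide the node's data subset into consecutive mini-batches."""
    return [samples[i:i + batch_size] for i in range(0, len(samples), batch_size)]


def minibatch_gradient(w, batch):
    """Gradient of the mean of 0.5 * (w * x - y) ** 2 over one mini-batch.

    Stand-in model: y is approximated by w * x with a single parameter w.
    """
    return sum((w * x - y) * x for x, y in batch) / len(batch)
```

  • computing the gradient on one mini-batch touches only `batch_size` samples instead of the whole data subset, which is where the reduction in calculation time comes from.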
  • S308 The second working node sends at least one parameter and the gradient of its model to the first working node.
  • S309 The first working node updates the model for the first time and calculates the first impulse.
  • the parameter expression of the model of the first working node is as follows:
  • is a constant, which represents the learning rate of the model.
  • k is the number of second working nodes. It can be known that the number of model parameters can be multiple, and different parameters have their own corresponding gradients, and the above expressions can be applied to each model parameter.
  • the parameters of the model may include parameter 1, parameter 2, and parameter 3.
  • Both the model of the first working node and the model of the second working node include parameter 1, parameter 2, and parameter 3.
  • the original parameter 1 of the first working node is denoted as $w_{i,1}$.
  • the parameter 1 of the second working node is denoted as $w_{j,1}$.
  • the gradient of the parameter 1 $w_{i,1}$ in the first working node is $g_{i,1}$.
  • the gradient of the parameter 1 $w_{j,1}$ in the second working node is $g_{j,1}$.
  • the updated parameter 1 of the first working node can be calculated based on $w_{i,1}$, $w_{j,1}$, $g_{i,1}$ and $g_{j,1}$.
  • the impulse is the difference between the model parameters after the update and the model parameters before the update.
  • the first working node may not perform the first model update, and only calculate the value of the first impulse.
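  • the first update and the first impulse can be sketched for a single scalar parameter. The exact update expression is not recoverable from this text, so the rule below is an assumption: it averages the parameter over the first working node and the k second working nodes, then takes one gradient step with the averaged gradient and the learning rate. Only the definition of the impulse (updated value minus original value) comes directly from the description.

```python
def first_update(w_i, g_i, neighbor_params, neighbor_grads, lr):
    """Return (updated parameter, first impulse) for one scalar parameter.

    w_i, g_i: parameter and gradient of the first working node.
    neighbor_params, neighbor_grads: values received from the k second working nodes.
    lr: learning rate (a constant).
    ASSUMED rule: average the parameters, then step along the averaged gradient.
    """
    k = len(neighbor_params)
    avg_w = (w_i + sum(neighbor_params)) / (k + 1)
    avg_g = (g_i + sum(neighbor_grads)) / (k + 1)
    w_new = avg_w - lr * avg_g
    # The impulse is the difference between the parameter after and before the update.
    impulse = w_new - w_i
    return w_new, impulse
```

  • a node that only needs the impulse, without applying the first update locally, can simply discard the first return value.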
  • S310 The first working node sends the first impulse to the second working node.
  • S311 The second working node updates the model for the first time according to the first impulse.
  • specifically, the second working node updates at least one parameter $w_j$ of its model according to the first impulse.
  • S312 The first working node sends a third request to the second working node.
  • the first working node may send the third request to all neighboring nodes participating in the update (including the above k second working nodes).
  • the third request can be used to request a new gradient of the model of the second working node
  • S313 The second working node calculates the gradient on the entire data subset.
  • the neighborhood node (including the above k second working nodes) that received the third request calculates a new gradient on its entire data subset
  • S314 The second working node sends the gradient on the entire data subset to the first working node.
  • all the neighboring nodes participating in the update can return their new gradients to the first working node
  • S315 The first working node updates the model for the second time, and calculates the second impulse.
  • the first working node can perform the second update of its local model after collecting the new gradients returned by a set percentage (such as, but not limited to, 50% or 80%) of the neighborhood nodes (denoted as P).
  • the first working node can use the average value of the received gradients of the second working nodes to perform gradient descent, obtain the impulse of the model update of the first working node, and update the model of the first working node according to the impulse.
  • the model parameters obtained after the first working node updates the model for the second time replace the model parameters $w_i$ before the update.
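  • the second update can be sketched in the same scalar setting. The quorum check follows the description (wait until a given fraction of the neighborhood nodes has returned a new gradient); the concrete impulse formula, the negative learning rate times the average received gradient, is an assumption, since the original expression is missing from this text. The function and parameter names are hypothetical.

```python
def second_update(w_i, received_grads, num_neighbors, lr, min_fraction=0.5):
    """Return (updated parameter, second impulse), or None if too few gradients arrived.

    received_grads: new gradients returned so far by neighborhood nodes.
    num_neighbors: number of neighborhood nodes that were sent the third request.
    min_fraction: fraction of neighbors that must answer before updating (e.g. 0.5, 0.8).
    ASSUMED rule: impulse = -lr * average of the received gradients.
    """
    if num_neighbors <= 0 or len(received_grads) / num_neighbors < min_fraction:
        return None  # keep waiting for more gradients
    avg_g = sum(received_grads) / len(received_grads)
    impulse = -lr * avg_g
    # The updated parameter replaces the parameter before the update.
    return w_i + impulse, impulse
```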
  • S317 The second working node updates the model for the second time according to the second impulse.
  • the second working node updates its model parameters according to the second impulse; the model parameter w_j held before the update is replaced with the updated value.
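The second update in S312 to S317 can be sketched as follows. This is a minimal illustration and not the formulas of the application, which are not reproduced in this text. It assumes a standard momentum-style rule in which the impulse is `beta * previous_impulse - lr * mean_gradient` and the parameters are moved by adding the impulse. The names `second_update`, `apply_impulse`, `quorum`, `lr`, and `beta` are hypothetical.

```python
from typing import List, Optional, Tuple

Vector = List[float]


def mean_gradient(gradients: List[Vector]) -> Vector:
    """Element-wise average of the gradients returned by neighboring nodes."""
    n = len(gradients)
    return [sum(column) / n for column in zip(*gradients)]


def second_update(
    w_i: Vector,
    prev_impulse: Vector,
    gradients: List[Vector],
    num_neighbors: int,
    quorum: float = 0.8,  # assumed fraction of neighbors that must reply
    lr: float = 0.1,      # assumed learning rate
    beta: float = 0.9,    # assumed impulse coefficient
) -> Optional[Tuple[Vector, Vector]]:
    """Return (new parameters, second impulse), or None if too few replies."""
    if len(gradients) < quorum * num_neighbors:
        return None  # not enough neighbors have answered yet (S314)
    g = mean_gradient(gradients)
    impulse = [beta * v - lr * gk for v, gk in zip(prev_impulse, g)]
    new_w = [w + v for w, v in zip(w_i, impulse)]
    return new_w, impulse


def apply_impulse(w_j: Vector, impulse: Vector) -> Vector:
    """A second working node applies the impulse it received (S317)."""
    return [w + v for w, v in zip(w_j, impulse)]
```

Returning `None` below the quorum keeps the waiting policy outside the update rule, so the caller can decide how long to wait for slow neighbors.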
  • the first working node can update the model by using at least one parameter of the received model of each second working node and the average value of the received gradient of each second working node.
  • performing the model update through multiple gradient descents may mean that the second working node calculates the gradient g_j on its mini-batch data multiple times in S307, so that the first working node executes S309 multiple times and thereby performs multiple gradient updates.
  • it may also mean that the second working node calculates its new gradient on the entire data subset multiple times in S313, so that the first working node executes S315 multiple times and thereby performs multiple gradient updates.
  • it may also be a combination of the two cases above (that is, the gradient g_j is calculated multiple times in S307 and the new gradient is calculated multiple times in S313).
  • the number of gradient descents in the two cases can be the same or different.
  • the embodiments of the present application do not limit the number of gradient descents.
  • the first working node can use the received gradient of each second working node to perform one or more gradient descents to update the model.
  • the result of the last gradient descent can be recorded as the updated model parameter of the first working node.
  • the first working node averages each parameter of the models received from the second working nodes with the corresponding parameter of its own model, and uses this average together with the gradients received from the second working nodes to perform one or more gradient descents to update the model.
  • the first working node receives the parameters of the three second working node models.
  • the model of the first working node includes parameter 1, parameter 2, and parameter 3, and the parameters of the model sent by each second working node also include parameter 1, parameter 2, and parameter 3.
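  • in this case, each of parameter 1, parameter 2, and parameter 3 of the first working node is averaged with the corresponding parameter received from the three second working nodes, and the result of the last gradient descent is recorded as the updated parameter.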
  • the first working node averages each parameter of the models received from the second working nodes with the corresponding parameter of its own model, and uses this average together with the mean of the gradients received from the second working nodes to perform one or more gradient descents to update the model.
  • the parameters of the model of the first working node are obtained in the same way, with the mean of the received gradients used in the last gradient descent.
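The three-neighbor example above can be sketched numerically. This is a minimal sketch under stated assumptions: the parameter average gives equal weight to the first working node's own value and to each received value, a single plain gradient-descent step with a hypothetical learning rate `lr` is applied, and every number is made up for illustration.

```python
from typing import Dict, List

Params = Dict[str, float]


def average_with_neighbors(own: Params, neighbors: List[Params]) -> Params:
    """Average each parameter of the first node with the neighbors' values."""
    count = len(neighbors) + 1
    return {
        name: (value + sum(nb[name] for nb in neighbors)) / count
        for name, value in own.items()
    }


def mean_gradient(gradients: List[Params]) -> Params:
    """Mean of the gradients received from the second working nodes."""
    return {
        name: sum(g[name] for g in gradients) / len(gradients)
        for name in gradients[0]
    }


def update(own: Params, neighbors: List[Params], gradients: List[Params],
           lr: float = 0.5) -> Params:
    """One gradient-descent step starting from the averaged parameters."""
    averaged = average_with_neighbors(own, neighbors)
    grad = mean_gradient(gradients)
    return {name: averaged[name] - lr * grad[name] for name in averaged}


# Illustrative values only: one first working node and three second working nodes.
own = {"parameter 1": 1.0, "parameter 2": 2.0, "parameter 3": 3.0}
neighbors = [
    {"parameter 1": 3.0, "parameter 2": 2.0, "parameter 3": 1.0},
    {"parameter 1": 5.0, "parameter 2": 2.0, "parameter 3": 7.0},
    {"parameter 1": 3.0, "parameter 2": 6.0, "parameter 3": 5.0},
]
gradients = [
    {"parameter 1": 1.0, "parameter 2": 0.0, "parameter 3": 2.0},
    {"parameter 1": 2.0, "parameter 2": 0.0, "parameter 3": 2.0},
    {"parameter 1": 3.0, "parameter 2": 0.0, "parameter 3": 2.0},
]
new_params = update(own, neighbors, gradients)
```

Repeating `update` with fresh gradients would correspond to performing more than one gradient descent.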
  • when the second working node transmits the first information to the first working node (such as in S308 or S314), the transmission may occupy a relatively large amount of bandwidth.
  • the first information can be compressed or truncated.
  • the second working node may sort the parameters by the absolute value of their gradients in descending order, select the leading parameters that fit within the network communication overhead budget for transmission, and then update only those parameters. In this way, the most important or most active part of the model is selected for transmission, thereby reducing network communication overhead.
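The gradient-based truncation described above can be sketched as follows. This is an illustrative sketch only: it assumes the communication budget is expressed as a number of parameters (`budget`), and the function names and the sparse index-to-value format are hypothetical.

```python
from typing import Dict, List


def select_top_gradients(gradient: List[float], budget: int) -> Dict[int, float]:
    """Keep the `budget` entries with the largest absolute gradient.

    Returns a sparse {parameter index: gradient value} mapping to transmit.
    """
    if budget <= 0:
        return {}
    ranked = sorted(range(len(gradient)), key=lambda i: abs(gradient[i]), reverse=True)
    return {i: gradient[i] for i in ranked[:budget]}


def apply_sparse_gradient(params: List[float], sparse: Dict[int, float],
                          lr: float = 0.1) -> List[float]:
    """Receiver side: update only the parameters that were transmitted."""
    updated = list(params)
    for i, g in sparse.items():
        updated[i] -= lr * g
    return updated
```

For example, `select_top_gradients([0.1, -2.0, 0.5, 1.5], 2)` keeps indices 1 and 3, so only those two parameters are sent and updated.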
  • encryption algorithms can be used to process the transmitted data to ensure data security and further increase the security of the system, forming a decentralized horizontal federated learning method that protects user privacy.
  • the aforementioned encryption algorithm may be homomorphic encryption, semi-homomorphic encryption, or a differential privacy method in a trusted computing environment.
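As a rough illustration of the differential-privacy option only, a gradient can be clipped and perturbed with random noise before it is sent. This sketch does not implement homomorphic encryption or a trusted computing environment, and it computes no formal privacy guarantee. The function name and the `clip_norm` and `noise_std` parameters are hypothetical choices, not values from the application.

```python
import math
import random
from typing import List, Optional


def privatize_gradient(
    gradient: List[float],
    clip_norm: float = 1.0,   # assumed bound on the gradient's L2 norm
    noise_std: float = 0.1,   # assumed standard deviation of the added noise
    rng: Optional[random.Random] = None,
) -> List[float]:
    """Clip the gradient to `clip_norm` and add Gaussian noise to each entry."""
    if rng is None:
        rng = random.Random()
    norm = math.sqrt(sum(g * g for g in gradient))
    scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
    return [g * scale + rng.gauss(0.0, noise_std) for g in gradient]
```

Clipping bounds how much any single node's data can influence the transmitted value, and the noise masks the exact gradient; choosing the two parameters for a specific privacy target is outside the scope of this sketch.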
  • the model stored by the first working node and the model stored by the second working node have at least part of the same model structure.
  • the first working node may update its model using the first information sent by second working nodes that belong to the same neighborhood as the first working node, together with the data saved by the first working node.
  • the first information may include at least one of the following information of the second working node that sends the first information: at least one parameter of the model, the gradient, and the impulse of the model.
  • the model update method provided in the embodiments of this application is a decentralized model update method, that is, the method does not depend on a central node: the first working node updates its model using the model-related information transmitted by one or more second working nodes that belong to the same sub-network as the first working node.
  • this method does not need to upload user data to the cloud; the model of the first working node is updated only through information such as the model parameters, gradient, and impulse of the second working node, which can protect user privacy.
  • each working node can update its model by combining its saved user data with the first information, and updating the model according to the saved user data yields a personalized model for that working node, thereby providing users with more accurate services.
  • the first working node can directly use its saved data to perform the update without waiting for the calculation results of other working nodes, so computation is fast and latency is low.
  • the model update method can include the following steps:
  • S501 The first working node receives first information sent by at least one second working node.
  • the first working node updates the model according to the first information and the model and data stored by the first working node.
  • the above-mentioned at least one second working node is a working node that belongs to the same sub-network as the first working node.
  • the first information may include at least one of the following information of the second working node that sends the first information: at least one parameter of the model, the gradient, and the impulse of the model.
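The first information received in S501 can be modeled as a small message type. The sketch below is hypothetical: the class name, the field names, and the `sender_id` field are assumptions made for illustration. The only rule taken from the text is that the message carries at least one of model parameters, a gradient, or an impulse.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class FirstInformation:
    """Hypothetical container for what a second working node sends."""

    sender_id: str                            # assumed identifier of the sender
    parameters: Optional[List[float]] = None  # at least one model parameter
    gradient: Optional[List[float]] = None
    impulse: Optional[List[float]] = None

    def __post_init__(self) -> None:
        # The text requires at least one of the three kinds of information.
        if self.parameters is None and self.gradient is None and self.impulse is None:
            raise ValueError(
                "first information must carry parameters, a gradient, or an impulse"
            )
```

Validating at construction time means a receiving node never has to handle an empty message during the update step.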
  • the model stored in the first working node and the model stored in the second working node have at least partially the same model structure.
  • the model stored in the first working node and the model stored in the second working node have the same model structure.
  • the method may further include a process of selecting a second working node. For the process, reference may be made to the related description of FIG. 3, which is not repeated here.
  • the method may further include: the first working node sends the impulse of the updated model to the at least one second working node, so that the second working node can update the model.
  • the impulse of the model updated by the first working node is the first impulse mentioned in S309; for details, refer to the related description of S309, which is not repeated here.
  • the method may further include: the first working node sends a request to the second working node to obtain its first information.
  • the first information includes at least one parameter and gradient of the model of the second working node that sends the first information.
  • the manners in which the first working node updates the model according to the first information and the model and data stored in the first working node may include the following:
  • Manner 1 The first working node uses the received gradient of each second working node to perform one or more gradient descents to update the model.
  • Manner 2 The first working node averages each parameter of the models received from the second working nodes with the corresponding parameter of its own model, and uses this average together with the gradients received from the second working nodes to perform one or more gradient descents to obtain the impulse of the model update of the first working node; the model of the first working node is then updated according to the impulse.
  • Manner 3 The first working node averages each parameter of the models received from the second working nodes with the corresponding parameter of its own model, and uses this average together with the mean of the gradients received from the second working nodes to perform one or more gradient descents to update the model.
  • Manner 4 The first working node uses at least one parameter of the received model of each second working node and the average value of the received gradient of each second working node to perform one or more gradient descents to obtain the impulse of the model update of the first working node; the model of the first working node is then updated according to the impulse.
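For orientation, the first three manners can be written in symbols under assumed notation. None of the following is taken from the application, whose formulas are not reproduced in this text: N is the set of second working nodes that replied, w_j and g_j are the parameters and gradient received from node j, η is a step size, β is an impulse coefficient, and v_i is the impulse of the first working node. Averaging the gradients in Manners 1 and 2 and the momentum-style impulse rule are assumptions.

```latex
% Illustrative forms only; all symbols are assumed notation.
\begin{align*}
\bar{g} &= \frac{1}{|N|}\sum_{j \in N} g_j, &
\bar{w}_i &= \frac{1}{|N|+1}\Bigl(w_i + \sum_{j \in N} w_j\Bigr) \\
\text{Manner 1:}\quad & w_i \leftarrow w_i - \eta\,\bar{g} \\
\text{Manner 2:}\quad & v_i \leftarrow \beta\,v_i - \eta\,\bar{g}, \qquad w_i \leftarrow \bar{w}_i + v_i \\
\text{Manner 3:}\quad & w_i \leftarrow \bar{w}_i - \eta\,\bar{g}
\end{align*}
```

Manner 4 is left out because the text does not say how the received parameters enter the update.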
  • the above-mentioned first information is information that has been compressed or truncated.
  • the above-mentioned first information is information that is encrypted using homomorphic or semi-homomorphic encryption, or the first information is information that is encrypted using a differential privacy method in a trusted computing environment.
  • the program can be stored in a computer-readable storage medium. When executed, the program may include the procedures of the above method embodiments.
  • the storage medium can be a magnetic disk, an optical disc, a read-only memory (ROM), a random access memory (RAM), or the like.
  • the methods provided in the embodiments of the present application can execute various steps through corresponding units or modules.
  • the modules in the devices in the embodiments of the present application can be combined, divided, and deleted according to actual needs.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Artificial Intelligence (AREA)
  • Computer Security & Cryptography (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Medical Informatics (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Telephone Function (AREA)
  • Mobile Radio Communication Systems (AREA)

Abstract

A model update method that can provide a flexible, high-efficiency optimization technology for a decentralized distributed AI system. The method may use the interaction of working nodes within a neighborhood to update a local model. The method may include receiving one or more of a model parameter, a gradient, and the impulse of the model transmitted by another working node in the neighborhood, and updating the model by combining the local model with data.
PCT/CN2020/120135 2019-10-12 2020-10-10 Model update method, working node, and model update system WO2021068926A1 (fr)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
CN201910969404.4 2019-10-12
CN201910969404 2019-10-12
CN201911025363.XA CN112651510A (zh) 2019-10-12 2019-10-25 Model update method, working node, and model update system
CN201911025363.X 2019-10-25

Publications (1)

Publication Number Publication Date
WO2021068926A1 true WO2021068926A1 (fr) 2021-04-15

Family

ID=75343254

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/120135 WO2021068926A1 (fr) 2019-10-12 2020-10-10 Model update method, working node, and model update system

Country Status (2)

Country Link
CN (1) CN112651510A (fr)
WO (1) WO2021068926A1 (fr)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113033828B (zh) * 2021-04-29 2022-03-22 江苏超流信息技术有限公司 Model training method, usage method, system, trusted node, and device
CN114301573B (zh) * 2021-11-24 2023-05-23 超讯通信股份有限公司 Federated learning model parameter transmission method and system
CN116527561A (zh) * 2022-01-20 2023-08-01 北京邮电大学 Residual propagation method and residual propagation apparatus for a network model
WO2024036567A1 (fr) * 2022-08-18 2024-02-22 Huawei Technologies Co., Ltd. Methods and apparatuses for training an artificial intelligence or machine learning model

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
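CN107169513A (zh) * 2017-05-05 2017-09-15 第四范式(北京)技术有限公司 Distributed machine learning system for controlling the order of data use, and method thereof
CN108280522A (zh) * 2018-01-03 2018-07-13 北京大学 Plug-in distributed machine learning computing framework and data processing method thereof
CN108446770A (zh) * 2017-02-16 2018-08-24 中国科学院上海高等研究院 Sampling-based system and method for handling slow nodes in distributed machine learning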
US20190073587A1 (en) * 2017-09-04 2019-03-07 Kabushiki Kaisha Toshiba Learning device, information processing device, learning method, and computer program product

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11488008B2 (en) * 2017-05-05 2022-11-01 Intel Corporation Hardware implemented point to point communication primitives for machine learning
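CN109754060B (zh) * 2017-11-06 2023-08-25 阿里巴巴集团控股有限公司 Training method and apparatus for a neural network machine learning model
CN108491928B (zh) * 2018-03-29 2019-10-25 腾讯科技(深圳)有限公司 Model parameter sending method, apparatus, server, and storage medium
CN108924910B (zh) * 2018-07-25 2021-03-09 Oppo广东移动通信有限公司 AI model update method and related products
CN109409125B (zh) * 2018-10-12 2022-05-31 南京邮电大学 Privacy-preserving data collection and regression analysis method
CN109299781B (zh) * 2018-11-21 2021-12-03 安徽工业大学 Distributed deep learning system based on momentum and pruning
CN109740755B (zh) * 2019-01-08 2023-07-18 深圳市网心科技有限公司 Gradient-descent-based data processing method and related apparatus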

Also Published As

Publication number Publication date
CN112651510A (zh) 2021-04-13

Similar Documents

Publication Publication Date Title
WO2020211733A1 (fr) Procédé, dispositif et système de connexion bluetooth
WO2020133183A1 (fr) Dispositif et procédé de synchronisation de données audio
WO2021068926A1 (fr) Procédé de mise à jour de modèle, nœud de travail et système de mise à jour de modèle
WO2021185141A1 (fr) Procédé et système d'établissement de liaison sensible au wi-fi, dispositif électronique et support de stockage
WO2021104104A1 (fr) Procédé de traitement d'affichage écoénergétique, et appareil
WO2021169515A1 (fr) Procédé d'échange de données entre dispositifs, et dispositif associé
WO2020173379A1 (fr) Procédé et dispositif de groupement de photographies
CN111132234A (zh) 一种数据传输方法及对应的终端
WO2021190314A1 (fr) Procédé et appareil de commande de réponse au glissement d'un écran tactile, et dispositif électronique
CN113676339B (zh) 组播方法、装置、终端设备及计算机可读存储介质
WO2022262492A1 (fr) Procédé et appareil de téléchargement de données, et dispositif terminal
WO2022022319A1 (fr) Procédé et système de traitement d'image, dispositif électronique et système de puce
CN113596919B (zh) 数据下载方法、装置和终端设备
WO2020062304A1 (fr) Procédé de transmission de fichier et dispositif électronique
CN114915721A (zh) 建立连接的方法与电子设备
WO2020078267A1 (fr) Procédé et dispositif de traitement de données vocales dans un processus de traduction en ligne
WO2022135144A1 (fr) Procédé d'affichage auto-adaptatif, dispositif électronique et support de stockage
CN115665632A (zh) 音频电路、相关装置和控制方法
WO2021204036A1 (fr) Procédé de surveillance du risque de sommeil, dispositif électronique et support de stockage
WO2022037405A1 (fr) Procédé de vérification d'informations, dispositif électronique et support d'enregistrement lisible par ordinateur
WO2021110115A1 (fr) Procédé d'abonnement à un événement et dispositif électronique
WO2021110117A1 (fr) Procédé d'inscription à un événement et dispositif électronique
CN114116610A (zh) 获取存储信息的方法、装置、电子设备和介质
CN113453327A (zh) 一种发送功率控制方法、终端、芯片系统与系统
WO2022143158A1 (fr) Procédé de sauvegarde de données, dispositif électronique, système de sauvegarde de données et système de puce

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20875043

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20875043

Country of ref document: EP

Kind code of ref document: A1