WO2021179196A1 - A model training method based on federated learning, electronic device, and storage medium - Google Patents

A model training method based on federated learning, electronic device, and storage medium

Info

Publication number
WO2021179196A1
WO2021179196A1 PCT/CN2020/078721 CN2020078721W WO2021179196A1 WO 2021179196 A1 WO2021179196 A1 WO 2021179196A1 CN 2020078721 W CN2020078721 W CN 2020078721W WO 2021179196 A1 WO2021179196 A1 WO 2021179196A1
Authority
WO
WIPO (PCT)
Prior art keywords
model
local
local model
node device
training
Prior art date
Application number
PCT/CN2020/078721
Other languages
English (en)
French (fr)
Inventor
田文强
沈嘉
Original Assignee
Oppo广东移动通信有限公司
Priority date
Filing date
Publication date
Application filed by Oppo广东移动通信有限公司
Priority to CN202080098459.3A (CN115280338A)
Priority to PCT/CN2020/078721 (WO2021179196A1)
Publication of WO2021179196A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00: Machine learning
    • G06N20/20: Ensemble learning

Definitions

  • This application relates to the field of wireless communication technology, and in particular to a model training method, electronic equipment and storage medium based on federated learning.
  • the embodiments of the present application provide a model training method, electronic equipment, and storage medium based on federated learning, with which a high-performance global model can be trained.
  • an embodiment of the present application provides a model training method based on federated learning, including: a child node device sends model parameters of a local model and weight information corresponding to the local model; the model parameters and the weight information are used by the master node device to train a global model.
  • an embodiment of the present application provides a model training method based on federated learning, including: a master node device receives model parameters of local models and weight information corresponding to the local models sent by at least two child node devices; the master node device trains a global model based on the model parameters and the weight information.
  • an embodiment of the present application provides a sub-node device, and the sub-node device includes:
  • the first sending unit is configured to send the model parameters of the local model and the weight information corresponding to the local model; the model parameters and the weight information are used for the master node device to train the global model.
  • an embodiment of the present application provides a master node device, where the master node device includes:
  • the first receiving unit is configured to receive the model parameters of the local models and the weight information corresponding to the local models sent by at least two child node devices; the processing unit is configured to train the global model based on the model parameters and the weight information.
  • an embodiment of the present application provides a child node device, including a processor and a memory for storing a computer program that can run on the processor, wherein the processor is configured to, when running the computer program, execute the steps of the model training method based on federated learning performed by the above-mentioned child node device.
  • an embodiment of the present application provides a master node device, including a processor and a memory for storing a computer program that can run on the processor, wherein the processor is configured to, when running the computer program, execute the steps of the model training method based on federated learning performed by the above-mentioned master node device.
  • an embodiment of the present application provides a chip, including: a processor, configured to call and run a computer program from a memory, so that the device installed with the chip executes the model training method based on federated learning performed by the above-mentioned child node device.
  • an embodiment of the present application provides a chip, including: a processor, configured to call and run a computer program from a memory, so that the device installed with the chip executes the model training method based on federated learning performed by the above-mentioned master node device.
  • an embodiment of the present application provides a storage medium that stores an executable program, and when the executable program is executed by a processor, the method for training a model based on federated learning executed by the child node device described above is implemented.
  • an embodiment of the present application provides a storage medium storing an executable program, and when the executable program is executed by a processor, the model training method based on federated learning executed by the above-mentioned master node device is implemented.
  • an embodiment of the present application provides a computer program product, including computer program instructions that cause a computer to execute the model training method based on federated learning executed by the aforementioned child node device.
  • an embodiment of the present application provides a computer program product, including computer program instructions that cause a computer to execute the above-mentioned model training method based on federated learning executed by the master node device.
  • an embodiment of the present application provides a computer program that enables a computer to execute the model training method based on federated learning executed by the above-mentioned child node device.
  • an embodiment of the present application provides a computer program that enables a computer to execute the model training method based on federated learning executed by the above-mentioned master node device.
  • the model training method, electronic device, and storage medium based on federated learning include: a child node device sends model parameters of a local model and weight information corresponding to the local model; the model parameters and the weight information are used by the master node device to train the global model.
  • In this way, the child node device reports the weight information corresponding to its local model to the master node device, so that the master node device can train the global model based on the weight information of the different local models. The global model can thus reflect the characteristics of the training data represented by each local model, which ensures that, when the master node device uses the local models reported by the child node devices to train the global model, the performance of the global model is not affected by low-reliability local models.
  • FIG. 1 is a schematic diagram of the basic structure of a simple neural network model of this application;
  • FIG. 2 is a schematic diagram of the basic structure of a deep neural network model of this application;
  • FIG. 3a is a schematic diagram of the training process of a neural network model of this application;
  • FIG. 3b is a schematic diagram of the inference process of a neural network model of this application;
  • FIG. 4 is a schematic diagram of the training process of a neural network model based on federated learning of this application;
  • FIG. 5 is a schematic diagram of the composition structure of a communication system according to an embodiment of this application;
  • FIG. 6 is a schematic diagram of an optional processing flow of a model training method based on federated learning according to an embodiment of this application;
  • FIG. 7 is a schematic diagram of another optional processing flow of the model training method based on federated learning according to an embodiment of this application;
  • FIG. 8 is a schematic diagram of a detailed processing flow of the model training method based on federated learning according to an embodiment of this application;
  • FIG. 9 is a schematic diagram of another detailed processing flow of the model training method based on federated learning according to an embodiment of this application;
  • FIG. 10 is a schematic diagram of an optional composition structure of a child node device according to an embodiment of this application;
  • FIG. 11 is a schematic diagram of an optional composition structure of a master node device according to an embodiment of this application;
  • FIG. 12 is a schematic diagram of the hardware composition structure of an electronic device according to an embodiment of this application.
  • the basic structure of the deep neural network model is shown in Figure 2.
  • the deep neural network model includes multiple hidden layers; a deep neural network model that includes multiple hidden layers can greatly improve data processing capability, and such models are widely used in pattern recognition, signal processing, combinatorial optimization, and anomaly detection.
  • the application of a neural network model involves two phases: a training phase and an inference phase.
  • In the training phase, a large amount of data is first obtained as a training set (also called a sample set); the training set is used as the input data of the neural network model to be trained, and, based on a specific training algorithm, the undetermined parameters of the neural network model to be trained are determined through a large number of training and parameter iterations, so that the training process of the neural network model is completed and a trained neural network model is obtained.
  • a neural network model that recognizes a puppy can be trained through a large number of pictures, as shown in Figure 3a.
  • the trained neural network model can be used to perform inference or verification operations such as identification, classification, and information recovery. This process is called the inference process of the neural network model.
  • the puppy in the image can be identified through the trained neural network model, as shown in Figure 3b.
  • One method of neural network model training is "federated learning", which is characterized in that, during the training process of the neural network model, the training set is distributed across the child node devices.
  • The training process of a neural network model based on federated learning is shown in Figure 4 and includes three steps. First, each child node device generates a local neural network model and uploads the local neural network model to the master node device. Second, the master node device synthesizes the current global neural network model from all of the received local neural network models and transmits the global neural network model to each child node device. Finally, each child node device uses the new global neural network model for the next training iteration. The training of the neural network model is thus completed through the cooperation of the master node device and multiple child node devices.
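As a point of reference only, the three steps above can be sketched in a few lines of Python. This is a minimal illustration, not the method of this application: the class names, the toy `local_update` rule, and the plain (unweighted) parameter averaging are assumptions made for the sketch, and the weighting that this application introduces is illustrated in a later example.

```python
# Minimal sketch of one federated-learning round (illustrative only; names are hypothetical).
import numpy as np

class ChildNode:
    def __init__(self, data):
        self.data = data                # local training set held by this child node device

    def local_update(self, global_params):
        # Step 1: start from the latest global model, train locally on the local data,
        # then return the updated local model parameters for upload to the master node.
        params = global_params.copy()
        for sample in self.data:
            params += 0.01 * (sample - params)   # toy update rule standing in for real training
        return params

class MasterNode:
    def aggregate(self, local_params_list):
        # Step 2: synthesize the current global model from all received local models
        # (plain average here; the weighted variants of this application come later).
        return np.mean(local_params_list, axis=0)

# Step 3: the new global model is sent back and used for the next training iteration.
master = MasterNode()
children = [ChildNode(np.random.randn(200, 4)), ChildNode(np.random.randn(1000, 4))]
global_params = np.zeros(4)
for _ in range(3):
    uploads = [child.local_update(global_params) for child in children]
    global_params = master.aggregate(uploads)
```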
  • different child node devices may obtain different training sets; for example, the sizes of the training sets differ and/or the types of the training sets differ.
  • the local models determined by user A and user B should not be treated as local models with the same degree of credibility.
  • different sub-node devices may use the training set to obtain the local model in different ways.
  • User A uses 200 pieces of data as a batch of training data when training a local model, and uses this batch of training data to update the local model parameters to complete one local model training.
  • User B uses 1000 pieces of data as a batch of training data when training the local model, and uses this batch of training data to update the local model parameters to complete one local model training.
  • In this case, the training data used for a single local model training by user B is more than the training data used for a single local model training by user A; accordingly, a single local model from user B represents more training-set information than a single local model from user A.
  • In another example, user A and user B both use 200 pieces of data as a batch of training data when training their local models, and user A and user B each update their own local model parameters to complete one local model training.
  • the channel environment of user A is poor, and the transmission rate is low.
  • User A cannot report the updated local model parameters to the master node device after every completion of local model training; for example, user A updates the local model parameters 10 times locally and then transmits the local model parameters to the master node device once.
  • the channel environment conditions of user B are better than those of user A and can support a relatively higher transmission rate.
  • User B performs local model parameter updates twice locally and then transmits local model parameters to the master node device once.
  • In this case, the local model parameters transmitted by user A to the master node device and the local model parameters transmitted by user B to the master node device represent different numbers of local model training iterations, which can also be understood as corresponding to different amounts of training-set information.
  • It can be seen that the weights of the local models generated by different child node devices should be different in the process of federated learning.
  • If the training data of a local model A is less than the training data of a local model B, and local model A and local model B are combined into a global model using the same treatment strategy, the global model training results will be affected too strongly by the small amount of training data, and the performance of the global model will therefore suffer.
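As a hypothetical worked example (the sample counts are invented for illustration): suppose local model A was trained on N_A = 100 samples and local model B on N_B = 10000 samples, yielding parameters R_A and R_B. Treating the two equally gives R = (R_A + R_B) / 2, so the 100-sample model has the same influence on the global model as the 10000-sample model. Weighting by sample size instead gives R = (N_A × R_A + N_B × R_B) / (N_A + N_B) ≈ 0.01 × R_A + 0.99 × R_B, which limits the influence of the small training set to about 1%.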
  • the embodiments of the application provide a model training method based on federated learning.
  • The technical solutions of the embodiments of the application can be applied to various communication systems, such as the Global System for Mobile Communications (GSM), Code Division Multiple Access (CDMA) systems, Wideband Code Division Multiple Access (WCDMA) systems, General Packet Radio Service (GPRS), Long Term Evolution (LTE) systems, LTE Frequency Division Duplex (FDD) systems, LTE Time Division Duplex (TDD) systems, LTE-Advanced (LTE-A) systems, New Radio (NR) systems, evolutions of NR systems, LTE-based access to unlicensed spectrum (LTE-U) systems, NR-based access to unlicensed spectrum (NR-U) systems, the Universal Mobile Telecommunications System (UMTS), Worldwide Interoperability for Microwave Access (WiMAX) communication systems, Wireless Local Area Networks (WLAN), Wireless Fidelity (WiFi), next-generation communication systems, and other communication systems.
  • The network equipment involved in the embodiments of this application may be a common base station (such as a NodeB, an eNB, or a gNB), a new radio controller (NR controller), a centralized unit, a new radio base station, a radio remote module, a micro base station, a relay, a distributed unit, a transmission reception point (TRP), a transmission point (TP), or any other equipment.
  • the terminal device may be any terminal.
  • The terminal device may be user equipment for machine-type communication. That is to say, the terminal device may also be referred to as user equipment (UE), a mobile station (MS), a mobile terminal, a terminal, and so on, and the terminal device can communicate with one or more core networks via a radio access network (RAN).
  • For example, the terminal device may be a mobile phone (also called a "cellular" phone) or a computer with a mobile terminal; for example, it may also be a portable, pocket-sized, handheld, computer-built-in, or vehicle-mounted mobile device that exchanges voice and/or data with the radio access network.
  • network equipment and terminal equipment can be deployed on land, including indoor or outdoor, handheld or vehicle-mounted; they can also be deployed on water; they can also be deployed on airborne aircraft, balloons, and satellites.
  • the embodiments of the present application do not limit the application scenarios of network equipment and terminal equipment.
  • Communication between network equipment and terminal equipment, and between terminal equipment and terminal equipment, can be carried out over licensed spectrum, over unlicensed spectrum, or over both licensed and unlicensed spectrum at the same time.
  • Communication between network equipment and terminal equipment, and between terminal equipment and terminal equipment, can be carried out over the frequency spectrum below 7 gigahertz (GHz), over the frequency spectrum above 7 GHz, or over both the frequency spectrum below 7 GHz and the frequency spectrum above 7 GHz.
  • the embodiment of the present application does not limit the spectrum resource used between the network device and the terminal device.
  • D2D: device to device; M2M: machine to machine; MTC: machine type communication; V2V: vehicle to vehicle.
  • the communication system 100 applied in the embodiment of the present application is shown in FIG. 5.
  • the communication system 100 may include a network device 110, and the network device 110 may be a device that communicates with a terminal device 120 (or called a communication terminal or terminal).
  • the network device 110 may provide communication coverage for a specific geographic area, and may communicate with terminal devices located in the coverage area.
  • The network device 110 may be a Base Transceiver Station (BTS) in a GSM system or a CDMA system, a base station (NodeB, NB) in a WCDMA system, an evolved base station (Evolutional Node B, eNB or eNodeB) in an LTE system, or a wireless controller in a Cloud Radio Access Network (CRAN); or the network equipment may be a mobile switching center, a relay station, an access point, a vehicle-mounted device, a wearable device, a hub, a switch, a bridge, a router, a network-side device in a 5G network, or a network device in a future evolved Public Land Mobile Network (PLMN), etc.
  • the communication system 100 also includes at least one terminal device 120 located within the coverage area of the network device 110.
  • the "terminal equipment” used here includes but is not limited to connection via wired lines, such as via Public Switched Telephone Networks (PSTN), Digital Subscriber Line (DSL), digital cable, and direct cable connection ; And/or another data connection/network; and/or via a wireless interface, such as for cellular networks, wireless local area networks (WLAN), digital TV networks such as DVB-H networks, satellite networks, AM- FM broadcast transmitter; and/or another terminal device that is set to receive/send communication signals; and/or Internet of Things (IoT) equipment.
  • a terminal device set to communicate through a wireless interface may be referred to as a "wireless communication terminal", a “wireless terminal” or a “mobile terminal”.
  • Examples of mobile terminals include, but are not limited to, satellite or cellular phones; Personal Communications System (PCS) terminals that combine a cellular radio phone with data processing, fax, and data communication capabilities; PDAs that can include a radio phone, a pager, Internet/intranet access, a web browser, a memo pad, a calendar, and/or a Global Positioning System (GPS) receiver; and conventional laptop and/or palmtop receivers or other electronic devices that include a radio telephone transceiver.
  • Terminal equipment can refer to an access terminal, user equipment (UE), a user unit, a user station, a mobile station, a mobile, a remote station, a remote terminal, a mobile device, a user terminal, a terminal, a wireless communication device, a user agent, or a user device.
  • The access terminal can be a cellular phone, a cordless phone, a Session Initiation Protocol (SIP) phone, a Wireless Local Loop (WLL) station, a Personal Digital Assistant (PDA), a handheld device with wireless communication capability, a computing device or other processing device connected to a wireless modem, an in-vehicle device, a wearable device, a terminal device in a 5G network, or a terminal device in a future evolved PLMN, etc.
  • The terminal devices 120 may perform device-to-device (D2D) communication with each other.
  • the 5G system or 5G network may also be referred to as a New Radio (NR) system or NR network.
  • Figure 5 exemplarily shows one network device and two terminal devices.
  • the communication system 100 may include multiple network devices and the coverage of each network device may include other numbers of terminal devices. The embodiment does not limit this.
  • the communication system 100 may also include other network entities such as a network controller and a mobility management entity, which are not limited in the embodiment of the present application.
  • the devices with communication functions in the network/system in the embodiments of the present application may be referred to as communication devices.
  • the communication device may include a network device 110 having a communication function and a terminal device 120.
  • the network device 110 and the terminal device 120 may be the specific devices described above, which will not be repeated here.
  • the communication device may also include other devices in the communication system 100, such as network controllers, mobility management entities and other network entities, which are not limited in the embodiment of the present application.
  • An optional processing procedure of the model training method based on federated learning provided by the embodiment of the present application, as shown in FIG. 6, includes the following steps:
  • Step S201 The child node device sends the model parameters of the local model and the weight information corresponding to the local model.
  • the child node device sends the model parameters of the local model and the weight information corresponding to the local model to the master node device.
  • the model parameters and the weight information are used for the master node equipment to train a global model.
  • The child node device may send the model parameters through service layer data, or uplink control information (UCI), or radio resource control (RRC) signaling; the model parameters can also be carried on the Physical Uplink Control Channel (PUCCH) or the Physical Uplink Shared Channel (PUSCH).
  • the child node device may send the weight information corresponding to the local model through service layer data, or UCI, or RRC signaling; the weight information corresponding to the local model may also be carried on the PUCCH or PUSCH.
  • The weight information corresponding to the local model may be: the data characteristics of the samples used to train the local model; in this case, the child node device sends the model parameters of the local model and the data characteristics of the samples used to train the local model.
  • the data features of the samples used to train the local model include at least one of the following: the size of all sample data used to train the local model, the size of the sample data used to train the local model each time, and the number of times the local model is trained.
  • the weight information corresponding to the local model may be: a weight factor value corresponding to the data feature of the sample used for training the local model. Then the child node device sends the model parameters of the local model and the weight factor value corresponding to the data feature of the sample used to train the local model.
  • the data features of the samples used to train the local model include at least one of the following: the size of all sample data used to train the local model, the size of the sample data used to train the local model each time, and the number of times the local model is trained.
  • The correspondence between the data features of the samples used to train the local model and the weight factor values is configured by the master node device; or, the correspondence between the data features of the samples used to train the local model and the weight factor values is pre-agreed.
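For illustration, the information a child node device reports can be pictured as a small record bundling the model parameters with whichever form of weight information is used; the field names below are hypothetical and chosen only for this sketch.

```python
# Hypothetical report structure: model parameters plus weight information.
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class LocalModelReport:
    model_params: List[float]              # model parameters of the trained local model
    total_samples: Optional[int] = None    # size of all sample data used to train the local model
    batch_size: Optional[int] = None       # size of the sample data used for each training
    num_trainings: Optional[int] = None    # number of times the local model was trained
    weight_factor: Optional[float] = None  # alternatively, the weight factor value looked up from
                                           # the configured or pre-agreed correspondence
```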
  • When the data features of the samples used to train the local model include the size of all sample data used to train the local model, the correspondence between the data features of the samples used to train the local model and the weight factor values is as shown in Table 1 below: when the size of all sample data used to train the local model is Ni, the weight factor value corresponding to Ni is Mi.
  • Here, Nimin is the minimum value of the range of the size of all sample data used to train the local model, and Nimax is the maximum value of that range; when the size of all sample data used to train the local model falls between Nimin and Nimax, the corresponding weight factor value is Mi.
  • When the data features of the samples used to train the local model include the size of the sample data for each training of the local model, the correspondence between the data features of the samples used to train the local model and the weight factor values is as shown in Table 2 below: when the size of the sample data for each training of the local model is Bi, the weight factor value corresponding to Bi is Mi.
  • Here, Bimin is the minimum value of the range of the size of the sample data for each training of the local model, and Bimax is the maximum value of that range; when the size of the sample data for each training of the local model falls between Bimin and Bimax, the corresponding weight factor value is Mi.
  • When the data features of the samples used to train the local model include the number of times the local model is trained, the correspondence between the data features of the samples used to train the local model and the weight factor values is as shown in Table 3 below: when the number of times the local model is trained is Ki, the weight factor value corresponding to Ki is Mi.
  • Here, Kimin is the minimum number of times the local model is trained in a given range, and Kimax is the maximum number of times the local model is trained in that range; as shown in Table 3, when the number of times the local model is trained is between K1min and K1max, the corresponding weight factor is M1; when the number of times the local model is trained is between K2min and K2max, the corresponding weight factor is M2; and when the number of times the local model is trained is between K3min and K3max, the corresponding weight factor is M3.
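A minimal sketch of the lookup implied by Tables 1 to 3: the master node device configures (or the nodes pre-agree on) a set of ranges, and a child node device maps its data feature, here the number of times the local model is trained, to the weight factor of the range it falls in. The concrete boundaries and factor values below are invented placeholders, not values from this application.

```python
# Hypothetical range-to-factor table mirroring the Table 1/2/3 structure:
# a data feature falling in [lo, hi] maps to the weight factor Mi.
WEIGHT_TABLE = [
    (1, 10, 0.5),      # K1min..K1max -> M1   (placeholder values)
    (11, 100, 1.0),    # K2min..K2max -> M2
    (101, 1000, 2.0),  # K3min..K3max -> M3
]

def lookup_weight_factor(feature_value, table=WEIGHT_TABLE):
    for lo, hi, factor in table:
        if lo <= feature_value <= hi:
            return factor
    raise ValueError("feature value outside all configured ranges")

# A child node device that trained its local model 42 times would report factor 1.0.
print(lookup_weight_factor(42))
```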
  • The master node device may use service layer data, or RRC signaling, or a broadcast message, or Downlink Control Information (DCI), or a Media Access Control-Control Element (MAC CE), or Physical Downlink Control Channel (PDCCH) signaling to configure, for the child node device, the correspondence between the data features of the samples used to train the local model and the weight factor values.
  • the child node device may be a first terminal device
  • the master node device may be a second terminal device or a network device.
  • The child node device may send the model parameters of the local model and the weight information corresponding to the local model to the second terminal device, and the second terminal device, acting as the master node, processes the received model parameters of the local model and the weight information corresponding to the local model.
  • the second terminal device may send the received model parameters of the local model and the weight information corresponding to the local model to the master node device.
  • Another optional processing procedure of the model training method based on federated learning provided by the embodiment of the present application, as shown in FIG. 7, includes the following steps:
  • Step S301 The master node device receives the model parameters of the local model and the weight information corresponding to the local model sent by at least two child node devices.
  • the description of the model parameters and the weight information is the same as that in the above step S201, and will not be repeated here.
  • The description of the master node device receiving the model parameters and the weight information is the same as the description of the child node device sending the model parameters and the weight information in step S201, and will not be repeated here.
  • In a case where the weight information is the weight factor value corresponding to the data features of the samples used to train the local model, and the correspondence between the data features of the samples used to train the local model and the weight factor values is configured by the master node device, the method may further include:
  • Step S300 The master node device sends first configuration information, where the first configuration information is used to determine the correspondence between the data feature of the sample used for training the local model and the weight factor value.
  • The first configuration information may be carried in any one of the following: service layer data, RRC signaling, a broadcast message, DCI, a MAC CE, or PDCCH signaling.
  • Step S302 The master node device trains a global model based on the model parameters and the weight information.
  • In a case where the weight information includes the number of times the local model is trained, the master node device determines that the value of the model parameter of the global model is equal to the sum, over all local models, of the value of the model parameter of each local model multiplied by the number of times that local model is trained, divided by the sum of the numbers of times all the local models are trained.
  • the child node device 1 and the child node device 2 report model parameters and weight information to the master node device.
  • the model parameter reported by the child node device 1 is R1
  • the number of times the child node device 1 trains the local model is N1
  • the model parameter reported by the child node device 2 is R2
  • the number of times the child node device 2 trains the local model is N2
  • the child node device 2 reports the model parameter once after training the local model N2 times.
  • In this case, the model parameter R of the global model can be expressed as: R = (N1 × R1 + N2 × R2) / (N1 + N2).
  • In a case where the weight information includes the size of all sample data used to train the local model, or the size of the sample data used for each training of the local model, the master node device determines that the value of the model parameter of the global model is equal to the sum of the value of the model parameter of each local model multiplied by the parameter factor of that local model; the parameter factor of a local model is equal to the ratio of the data feature of the samples of that local model to the sum of the data features of the samples of all local models.
  • the model parameter reported by the child node device 1 is R1
  • the size of all sample data for training the local model by the child node device 1 is N1
  • the model parameter reported by the child node device 2 is R2
  • the size of all sample data used by the child node device 2 to train the local model is N2
  • the model parameter reported by the child node device k is Rk
  • the size of all sample data used by the child node device k to train the local model is Nk. In this case, the model parameter R of the global model can be expressed as: R = (N1 × R1 + N2 × R2 + ... + Nk × Rk) / (N1 + N2 + ... + Nk).
  • In a case where the weight information includes the weight factor value corresponding to the number of times the local model is trained, the master node device determines that the value of the model parameter of the global model is equal to the sum, over all local models, of the value of the model parameter of each local model multiplied by the weight factor value corresponding to the number of times that local model is trained, divided by the sum of the weight factor values corresponding to the numbers of times all the local models are trained.
  • For example, the child node device 1 and the child node device 2 report model parameters and weight information to the master node device; the model parameter reported by the child node device 1 is R1, and the weight factor value corresponding to the number of times it trains the local model is M1; the model parameter reported by the child node device 2 is R2, and the weight factor value corresponding to the number of times it trains the local model is M2. The master node device then determines that the model parameter R of the global model is: R = (M1 × R1 + M2 × R2) / (M1 + M2).
  • In a case where the weight information includes a weight factor value corresponding to the size of all sample data used to train the local model, or a weight factor value corresponding to the size of the sample data used for each training of the local model, the master node device determines that the value of the model parameter of the global model is equal to the sum of the value of the model parameter of each local model multiplied by the parameter factor of that local model; the parameter factor of a local model is equal to the ratio of the weight factor value of that local model to the sum of the weight factor values of all local models.
  • For example, the model parameter reported by the child node device 1 is R1, and the weight factor value corresponding to the size of all sample data used by the child node device 1 to train the local model is M1;
  • the model parameter reported by the child node device 2 is R2, and the weight factor value corresponding to the size of all sample data used by the child node device 2 to train the local model is M2;
  • the model parameter reported by the child node device k is Rk, and the weight factor value corresponding to the size of all sample data used by the child node device k to train the local model is Mk.
  • In this case, the model parameter R of the global model can be expressed as: R = (M1 × R1 + M2 × R2 + ... + Mk × Rk) / (M1 + M2 + ... + Mk).
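All of the aggregation rules above reduce to a weighted average of the reported model parameters, where the per-model weight is either the raw data feature (training count or sample-data size) or the corresponding weight factor value. A minimal sketch under that reading, with hypothetical helper names and toy numbers:

```python
# Weighted synthesis of the global model parameters from (params, weight) reports,
# where weight is Ni (training count / sample size) or Mi (weight factor value).
import numpy as np

def aggregate_global_model(reports):
    params = np.array([p for p, _ in reports], dtype=float)
    weights = np.array([w for _, w in reports], dtype=float)
    # R = (w1*R1 + w2*R2 + ... + wk*Rk) / (w1 + w2 + ... + wk)
    return (weights[:, None] * params).sum(axis=0) / weights.sum()

# Example with two child node devices: R1 was trained 10 times, R2 only twice.
R1, R2 = np.array([0.2, 0.4]), np.array([0.8, 0.6])
print(aggregate_global_model([(R1, 10), (R2, 2)]))   # result lies much closer to R1
```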
  • the child node device may be a first terminal device
  • the master node device may be a second terminal device or a network device.
  • Taking, as an example, the case where the weight information corresponding to the local model includes the data features of the samples used to train the local model, a detailed processing flow of the model training method based on federated learning provided in the embodiment of the present application is shown in FIG. 8 and includes:
  • Step S401 The child node device sends the model parameters of the local model and the data characteristics of the samples used to train the local model to the master node device.
  • the child node device sends the model parameters of the local model and the data characteristics of the samples used to train the local model to the master node device through service layer data, or UCI, or RRC signaling.
  • the data features of the samples used to train the local model include at least one of the following: the size of all sample data used to train the local model, the size of the sample data used to train the local model each time, and The number of times the local model is trained.
  • Step S402 The master node device synthesizes the global model based on the model parameters of the local model sent by the child node device and the data characteristics of the samples used to train the local model.
  • In a case where the data features include the number of times the local model is trained, the master node device determines that the value of the model parameter of the global model is equal to the sum, over all local models, of the value of the model parameter of each local model multiplied by the number of times that local model is trained, divided by the sum of the numbers of times all the local models are trained.
  • In a case where the data features include the size of the sample data, the master node device determines that the value of the model parameter of the global model is equal to the sum of the value of the model parameter of each local model multiplied by the parameter factor of that local model; the parameter factor of a local model is equal to the ratio of the data feature of the samples of that local model to the sum of the data features of the samples of all local models.
  • Step S403 The master node device sends the global model to the child node device.
  • Step S404 The child node device sends the model parameters of the local model and the data characteristics of the samples used to train the local model to the master node device.
  • That is, the child node device repeats the operation of step S401. The model parameters sent in step S401 and step S404 may be the same or different; the data features of the samples used to train the local model sent in step S401 and step S404 may also be the same or different.
  • The master node device then repeats the operations of step S402 and step S403 above, until the training of the global model is completed.
  • Taking, as an example, the case where the weight information corresponding to the local model includes the weight factor value corresponding to the data features of the samples used to train the local model, a detailed processing flow of the model training method based on federated learning provided in the embodiment of the present application is shown in FIG. 9 and includes:
  • Step S501 The child node device obtains the corresponding relationship between the data feature of the sample used for training the local model and the weight factor value.
  • The child node device may determine the correspondence between the data features of the samples used to train the local model and the weight factor values according to a pre-agreement; the child node device may also receive first configuration information sent by the network device, where the first configuration information is used to determine the correspondence between the data features of the samples used to train the local model and the weight factor values.
  • Step S502 The child node device determines a weight factor value according to the data characteristics of the samples used to train the local model.
  • the child node device searches for the weight factor value corresponding to the data feature of the sample used to train the local model in the correspondence between the data feature of the sample used to train the local model and the weight factor value.
  • Step S503 The child node device sends the model parameter and weight factor value of the local model to the master node device.
  • the child node device sends the model parameter and weight factor value of the local model to the master node device through service layer data, or UCI, or RRC signaling.
  • Step S504 The master node device synthesizes a global model based on the model parameters and weight factor values of the local models sent by the child node devices.
  • In a case where the weight factor value corresponds to the number of times the local model is trained, the master node device determines, based on the corresponding formula above, that the value of the model parameter of the global model is equal to the sum, over all local models, of the value of the model parameter of each local model multiplied by the weight factor value corresponding to the number of times that local model is trained, divided by the sum of the weight factor values corresponding to the numbers of times all the local models are trained.
  • In a case where the weight factor value corresponds to the size of the sample data, the master node device determines, based on the corresponding formula above, that the value of the model parameter of the global model is equal to the sum of the value of the model parameter of each local model multiplied by the parameter factor of that local model.
  • Step S505 The master node device sends the global model to the child node device.
  • Step S506 The child node device sends the model parameter and weight factor value of the local model to the master node device.
  • the child node device repeats the operation of step S503.
  • In step S503 and step S506, the model parameters sent by the child node device to the master node device may be the same or different; the weight factor values sent by the child node device to the master node device in step S503 and step S506 may also be the same or different.
  • The master node device then repeats the operations of step S504 and step S505 above, until the training of the global model is completed.
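On the child node device side, steps S501 to S503 (and their repetition in step S506) amount to obtaining the configured or pre-agreed range-to-factor mapping, looking up the factor for the current training statistics, and sending the model parameters together with that factor. A compact sketch under the same assumptions as the earlier lookup example; the helper names and values are hypothetical:

```python
# Hypothetical child-side flow for steps S501-S503 / S506.
def child_report(model_params, num_trainings, weight_table, send):
    # S501: weight_table comes from the first configuration information or a pre-agreement.
    # S502: determine the weight factor value from the data feature (training count here).
    factor = next(m for lo, hi, m in weight_table if lo <= num_trainings <= hi)
    # S503 / S506: send the local model parameters and the weight factor to the master node device.
    send({"model_params": model_params, "weight_factor": factor})

# Example: with ranges [(1, 10, 0.5), (11, 100, 1.0)], 12 local trainings report factor 1.0.
child_report([0.2, 0.4], 12, [(1, 10, 0.5), (11, 100, 1.0)], print)
```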
  • It should be understood that the sequence numbers of the above processes do not imply an order of execution; the execution order of each process should be determined by its function and internal logic, and should not constitute any limitation on the implementation of the embodiments of the present application.
  • The model training method based on federated learning provided in the embodiments of the present application can be applied at least to scenarios such as channel model generation, service user prediction, and intelligent transportation decision-making.
  • Taking channel model generation as an example, the terminal device obtains channel quality data and uses the channel quality data as samples to train a local model; the terminal device sends the model parameters of the trained local model to the network device, and the network device determines the model parameters of the global model according to the model parameters of the local models respectively sent by the terminal devices; the global model is then used to determine channel quality.
  • an embodiment of the present application provides a sub-node device.
  • An optional composition structure of the child node device 600, as shown in FIG. 10, includes:
  • the first sending unit 601 is configured to send model parameters of a local model and weight information corresponding to the local model;
  • the model parameters and the weight information are used for the master node device to train a global model.
  • the weight information includes: data features of samples used to train the local model.
  • the weight information corresponding to the local model includes: a weight factor value corresponding to the data feature of the sample used to train the local model.
  • The correspondence between the data features of the samples used to train the local model and the weight factor values is configured by the master node device; or, the correspondence between the data features of the samples used to train the local model and the weight factor values is pre-agreed.
  • the correspondence between the data features of the samples used to train the local model and the weight factor values is configured through any one of the following:
  • service layer data, RRC signaling, a broadcast message, DCI, a MAC CE, or PDCCH signaling.
  • the data characteristics of the samples used to train the local model include at least one of the following: the size of all sample data used to train the local model, the size of the sample data used for each training of the local model, and the number of times the local model is trained.
  • the weight information is transmitted through service layer data, or UCI, or RRC signaling; and/or, the weight information is carried on PUCCH or PUSCH.
  • the model parameters are transmitted through service layer data, or UCI, or RRC signaling; and/or, the model parameters are carried on PUCCH or PUSCH.
  • the child node device 600 includes: a first terminal device.
  • the master node device includes: a second terminal device or a network device.
  • An optional composition structure of the master node device 800, as shown in FIG. 11, includes:
  • the first receiving unit 801 is configured to receive model parameters of local models and weight information corresponding to the local models sent by at least two child node devices;
  • the processing unit 802 is configured to train a global model based on the model parameters and the weight information.
  • the weight information includes: data features of samples used to train the local model.
  • the weight information corresponding to the local model includes: a weight factor value corresponding to the data feature of the sample used to train the local model.
  • the corresponding relationship between the data feature of the sample used for training the local model and the value of the weighting factor is predetermined.
  • the master node device 800 further includes:
  • the second sending unit 803 is configured to send first configuration information, where the first configuration information is used to determine the correspondence between the data feature of the sample used for training the local model and the weight factor value.
  • the first configuration information is carried in any one of the following:
  • service layer data, RRC signaling, a broadcast message, DCI, a MAC CE, or PDCCH signaling.
  • the data characteristics of the samples used to train the local model include at least one of the following: the size of all sample data used to train the local model, the size of the sample data used for each training of the local model, and the number of times the local model is trained.
  • The processing unit 802 is configured to, in a case where the weight information includes the number of times the local model is trained, determine that the value of the model parameter of the global model is equal to the sum, over all local models, of the value of the model parameter of each local model multiplied by the number of times that local model is trained, divided by the sum of the numbers of times all the local models are trained.
  • The processing unit 802 is configured to, in a case where the weight information includes the size of all sample data used to train the local model or the size of the sample data used for each training of the local model, determine that the value of the model parameter of the global model is equal to the sum of the value of the model parameter of each local model multiplied by the parameter factor of that local model;
  • the parameter factor of a local model is equal to the ratio of the data features of the samples of that local model to the sum of the data features of the samples of all the local models.
  • The processing unit 802 is configured to, in a case where the weight information includes the weight factor value corresponding to the number of times the local model is trained, determine that the value of the model parameter of the global model is equal to the sum, over all local models, of the value of the model parameter of each local model multiplied by the weight factor value corresponding to the number of times that local model is trained, divided by the sum of the weight factor values corresponding to the numbers of times all the local models are trained.
  • The processing unit 802 is configured to, in a case where the weight information includes a weight factor value corresponding to the size of all sample data used to train the local model or a weight factor value corresponding to the size of the sample data used for each training of the local model, determine that the value of the model parameter of the global model is equal to the sum of the value of the model parameter of each local model multiplied by the parameter factor of that local model;
  • the parameter factor of a local model is equal to the ratio of the weight factor value of that local model to the sum of the weight factor values of all local models.
  • the weight information is transmitted through service layer data, or UCI, or RRC signaling; and/or, the weight information is carried on PUCCH or PUSCH.
  • the model parameters are transmitted through service layer data, or UCI, or RRC signaling; and/or, the model parameters are carried on PUCCH or PUSCH.
  • the child node device includes: a first terminal device.
  • the master node device includes: a second terminal device or a network device.
  • An embodiment of the present application also provides a child node device, including a processor and a memory for storing a computer program that can run on the processor, where the processor is configured to, when running the computer program, execute the steps of the model training method based on federated learning performed by the above-mentioned child node device.
  • An embodiment of the present application also provides a master node device, including a processor and a memory for storing a computer program that can run on the processor, where the processor is configured to, when running the computer program, execute the steps of the model training method based on federated learning performed by the above-mentioned master node device.
  • An embodiment of the present application also provides a chip, including a processor, configured to call and run a computer program from a memory, so that a device installed with the chip executes the model training method based on federated learning performed by the above-mentioned child node device.
  • An embodiment of the present application also provides a chip, including a processor, configured to call and run a computer program from a memory, so that the device installed with the chip executes the model training method based on federated learning performed by the above-mentioned master node device.
  • the embodiment of the present application also provides a storage medium storing an executable program, and when the executable program is executed by a processor, the method for training a model based on federated learning executed by the above-mentioned child node device is implemented.
  • The embodiment of the present application also provides a storage medium storing an executable program, and when the executable program is executed by a processor, the model training method based on federated learning executed by the above-mentioned master node device is implemented.
  • An embodiment of the present application also provides a computer program product, including computer program instructions, which cause a computer to execute the model training method based on federated learning executed by the above-mentioned child node device.
  • the embodiment of the present application also provides a computer program product, including computer program instructions, which cause a computer to execute the model training method based on federated learning executed by the above-mentioned master node device.
  • The embodiment of the present application also provides a computer program that enables the computer to execute the model training method based on federated learning executed by the above-mentioned child node device.
  • An embodiment of the present application also provides a computer program that enables a computer to execute the model training method based on federated learning executed by the above-mentioned master node device.
  • FIG. 12 is a schematic diagram of the hardware composition structure of an electronic device (a master node device or a child node device) according to an embodiment of the present application.
  • the electronic device 700 includes: at least one processor 701, a memory 702, and at least one network interface 704.
  • the various components in the electronic device 700 are coupled together through the bus system 705.
  • the bus system 705 is used to implement connection and communication between these components.
  • the bus system 705 also includes a power bus, a control bus, and a status signal bus.
  • various buses are marked as the bus system 705 in FIG. 12.
  • the memory 702 may be a volatile memory or a non-volatile memory, and may also include both volatile and non-volatile memory.
  • The non-volatile memory can be a Read-Only Memory (ROM), a Programmable Read-Only Memory (PROM), an Erasable Programmable Read-Only Memory (EPROM), an Electrically Erasable Programmable Read-Only Memory (EEPROM), a ferromagnetic random access memory (FRAM), a flash memory, a magnetic surface memory, an optical disc, or a Compact Disc Read-Only Memory (CD-ROM); the magnetic surface memory can be a magnetic disk memory or a magnetic tape memory.
  • the volatile memory may be a random access memory (RAM, Random Access Memory), which is used as an external cache.
  • Many forms of RAM may be used, such as Static Random Access Memory (SRAM), Synchronous Static Random Access Memory (SSRAM), Dynamic Random Access Memory (DRAM), Synchronous Dynamic Random Access Memory (SDRAM), Double Data Rate Synchronous Dynamic Random Access Memory (DDRSDRAM), Enhanced Synchronous Dynamic Random Access Memory (ESDRAM), SyncLink Dynamic Random Access Memory (SLDRAM), and Direct Rambus Random Access Memory (DRRAM).
  • the memory 702 described in the embodiment of the present application is intended to include, but is not limited to, these and any other suitable types of memory.
  • the memory 702 in the embodiment of the present application is used to store various types of data to support the operation of the electronic device 700.
  • Examples of such data include: any computer program used to operate on the electronic device 700, such as the application program 7022.
  • the program for implementing the method of the embodiment of the present application may be included in the application program 7022.
  • the method disclosed in the foregoing embodiment of the present application may be applied to the processor 701 or implemented by the processor 701.
  • the processor 701 may be an integrated circuit chip with signal processing capabilities. In the implementation process, the steps of the foregoing method can be completed by an integrated logic circuit of hardware in the processor 701 or instructions in the form of software.
  • the aforementioned processor 701 may be a general-purpose processor, a digital signal processor (DSP, Digital Signal Processor), or other programmable logic devices, discrete gates or transistor logic devices, discrete hardware components, and the like.
  • the processor 701 may implement or execute the methods, steps, and logical block diagrams disclosed in the embodiments of the present application.
  • the general-purpose processor may be a microprocessor or any conventional processor or the like.
  • the steps of the method disclosed in the embodiments of the present application may be directly embodied as being executed and completed by a hardware decoding processor, or executed and completed by a combination of hardware and software modules in the decoding processor.
  • the software module may be located in a storage medium, and the storage medium is located in the memory 702.
  • the processor 701 reads the information in the memory 702 and completes the steps of the foregoing method in combination with its hardware.
  • In an exemplary embodiment, the electronic device 700 may be implemented by one or more Application Specific Integrated Circuits (ASICs), DSPs, Programmable Logic Devices (PLDs), Complex Programmable Logic Devices (CPLDs), FPGAs, general-purpose processors, controllers, MCUs, MPUs, or other electronic components, to perform the foregoing method.
  • These computer program instructions can also be stored in a computer-readable memory that can guide a computer or other programmable data processing equipment to work in a specific manner, so that the instructions stored in the computer-readable memory produce an article of manufacture including an instruction device, where the instruction device implements the functions specified in one or more processes of the flowchart and/or one or more blocks of the block diagram.
  • These computer program instructions can also be loaded on a computer or other programmable data processing equipment, so that a series of operation steps are executed on the computer or other programmable equipment to produce computer-implemented processing, so as to execute on the computer or other programmable equipment.
  • the instructions provide steps for implementing the functions specified in one process or multiple processes in the flowchart and/or one block or multiple blocks in the block diagram.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Medical Informatics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Artificial Intelligence (AREA)
  • Mobile Radio Communication Systems (AREA)

Abstract

一种基于联邦学习的模型训练方法,包括:子节点设备发送局部模型的模型参数和所述局部模型对应的权重信息(S201);所述模型参数和所述权重信息用于主节点设备训练全局模型。还公开了另一种基于联邦学习的模型训练方法、电子设备及存储介质。

Description

一种基于联邦学习的模型训练方法、电子设备及存储介质 技术领域
本申请涉及无线通信技术领域,尤其涉及一种基于联邦学习的模型训练方法、电子设备及存储介质。
背景技术
基于联邦学习进行模型训练时，主节点设备如何基于子节点设备上报的局部模型训练得到高性能的全局模型，尚未被明确。
发明内容
本申请实施例提供一种基于联邦学习的模型训练方法、电子设备及存储介质,能够训练得到高性能的全局模型。
第一方面,本申请实施例提供一种基于联邦学习的模型训练方法,包括:子节点设备发送局部模型的模型参数和所述局部模型对应的权重信息;所述模型参数和所述权重信息用于主节点设备训练全局模型。
第二方面,本申请实施例提供一种基于联邦学习的模型训练方法,包括:主节点设备接收至少两个子节点设备发送的局部模型的模型参数和所述局部模型对应的权重信息;所述主节点设备基于所述模型参数和所述权重信息,训练全局模型。
第三方面,本申请实施例提供一种子节点设备,所述子节点设备包括:
第一发送单元,配置为发送局部模型的模型参数和所述局部模型对应的权重信息;所述模型参数和所述权重信息用于主节点设备训练全局模型。
第四方面,本申请实施例提供一种主节点设备,所述主节点设备包括:
第一接收单元,配置为接收至少两个子节点设备发送的局部模型的模型参数和所述局部模型对应的权重信息;处理单元,配置为基于所述模型参数和所述权重信息,训练全局模型。
第五方面,本申请实施例提供一种子节点设备,包括处理器和用于存储能够在处理器上运行的计算机程序的存储器,其中,所述处理器用于运行所述计算机程序时,执行上述子节点设备执行的基于联邦学习的模型训练方法的步骤。
第六方面,本申请实施例提供一种主节点设备,包括处理器和用于存储能够在处理器上运行的计算机程序的存储器,其中,所述处理器用于运行所述计算机程序时,执行上述主节点设备执行的基于联邦学习的模型训练方法的步骤。
第七方面,本申请实施例提供一种芯片,包括:处理器,用于从存储器中调用并运行计算机程序,使得安装有所述芯片的设备执行上述子节点设备执行的基于联邦学习的模型训练方法。
第八方面,本申请实施例提供一种芯片,包括:处理器,用于从存储器中调用并运行计算机程序,使得安装有所述芯片的设备执行上述主节点设备执行的基于联邦学习的模型训练方法。
第九方面,本申请实施例提供一种存储介质,存储有可执行程序,所述可执行程序被处理器执行时,实现上述子节点设备执行的基于联邦学习的模型训练方法。
第十方面,本申请实施例提供一种存储介质,存储有可执行程序,所述可执行程序被处理器执行时,实现上述主节点设备执行的基于联邦学习的模型训练方法。
第十一方面,本申请实施例提供一种计算机程序产品,包括计算机程序指令,该计算机程序指 令使得计算机执行上述子节点设备执行的基于联邦学习的模型训练方法。
第十二方面,本申请实施例提供一种计算机程序产品,包括计算机程序指令,该计算机程序指令使得计算机执行上述主节点设备执行的基于联邦学习的模型训练方法。
第十三方面,本申请实施例提供一种计算机程序,所述计算机程序使得计算机执行上述子节点设备执行的基于联邦学习的模型训练方法。
第十四方面,本申请实施例提供一种计算机程序,所述计算机程序使得计算机执行上述主节点设备执行的基于联邦学习的模型训练方法。
本申请实施例提供的基于联邦学习的模型训练方法、电子设备及存储介质,包括:子节点设备发送局部模型的模型参数和所述局部模型对应的权重信息;所述模型参数和所述权重信息用于主节点设备训练全局模型。如此,通过子节点设备向主节点设备上报局部模型对应的权重信息,使得主节点设备能够基于不同的局部模型的权重信息训练全局模型;使得全局模型能够反映局部模型所代表的训练数据的特征,能够保证主节点设备利用各子节点设备上报的局部模型训练全局模型时,全局模型的性能不受低可靠度的局部模型影响。
附图说明
图1为本申请简单的神经网络模型的基本结构示意图;
图2为本申请深度神经网络模型的基本结构示意图;
图3a为本申请神经网络模型的训练过程示意图;
图3b为本申请神经网络模型的推理过程示意图;
图4为本申请基于联邦学习的神经网络模型的训练过程示意图;
图5为本申请实施例通信系统的组成结构示意图;
图6为本申请实施例基于联邦学习的模型训练方法的一种可选处理流程示意图;
图7为本申请实施例基于联邦学习的模型训练方法的另一种可选处理流程示意图;
图8为本申请实施例基于联邦学习的模型训练方法的一种详细处理流程示意图;
图9为本申请实施例基于联邦学习的模型训练方法的另一种详细处理流程示意图;
图10为本申请实施例子节点设备的一种可选组成结构示意图;
图11为本申请实施例主节点设备的一种可选组成结构示意图;
图12为本申请实施例电子设备的硬件组成结构示意图。
具体实施方式
为了能够更加详尽地了解本申请实施例的特点和技术内容,下面结合附图对本申请实施例的实现进行详细阐述,所附附图仅供参考说明之用,并非用来限定本申请实施例。
在对本申请实施例进行详细描述之前,对人工智能进行简要说明。
人工智能已经成为人们解决问题、处理问题的新路径。其中,基于神经网络的人工智能具有广泛的应用。一个简单的神经网络模型的基本结构如图1所示,包括:输入层,隐藏层和输出层;其中,输入层用于接收数据,隐藏层用于对数据进行处理,输出层用于产生神经网络模型的计算结果。
随着对神经网络模型研究的不断发展,又提出了神经网络深度学习算法,深度神经网络模型的基本结构如图2所示,深度神经网络模型包括多个隐藏层,包括多个隐藏层的深度神经网络模型能够极大地提高对数据的处理能力,在模式识别、信号处理、优化组合以及异常探测等方面被广泛应用。
神经网络模型的应用包括训练阶段和推理阶段两个过程。在训练阶段,首先需要获得大量的数据作为训练集合(也称为样本集合),将训练集合作为待训练的神经网络模型的输入数据,并基于特定的训练算法,通过大量的训练和参数迭代,确定待训练的神经网络模型的待确定参数,如此便完 成了神经网络模型的训练过程,得到一个训练好的神经网络模型。例如可通过大量的图片训练一个识别小狗的神经网络模型,如图3a所示。对于一个神经网络来说,当神经网络模型训练完毕之后,便可以应用训练好的神经网络模型进行识别、分类、信息恢复等推理或验证操作,这一过程称之为神经网络模型的推理过程。例如可通过训练好的神经网络模型识别出图像中的小狗,如图3b所示。
神经网络模型训练的一种方式是“联邦学习”，其特征是在神经网络模型的训练过程中，训练集合分布在各个子节点设备上。基于联邦学习的神经网络模型的训练过程如图4所示，包括三个步骤：首先，各个子节点设备生成本地局部神经网络模型后，将本地局部神经网络模型上传至主节点设备；其次，主节点设备可根据获得的全部本地局部神经网络模型合成当前全局神经网络模型，并将全局神经网络模型传输至各个子节点设备；最后，子节点设备继续使用新的全局神经网络模型进行下一次训练迭代。如此，在主节点设备和多个子节点设备的协作下完成神经网络模型的训练。
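为便于理解上述三个步骤，下面给出联邦学习训练循环的一个极简示意（Python 示例代码，仅作说明；其中的函数名 local_train、aggregate_equal、数据的生成方式与具体数值均为示意性假设，并非本申请所限定的实现）。该示例采用最简单的等权平均方式合成全局模型，后文将进一步说明在各局部模型权重不同时如何改进合成方式。

```python
import numpy as np

def local_train(global_params, local_data, lr=0.1, epochs=1):
    """子节点设备：以全局模型参数为初始值，用本地训练集合更新局部模型（以线性模型的梯度下降为例，仅作示意）。"""
    w = global_params.copy()
    x, y = local_data
    for _ in range(epochs):
        grad = 2 * x.T @ (x @ w - y) / len(y)   # 均方误差损失的梯度
        w -= lr * grad
    return w

def aggregate_equal(local_params_list):
    """主节点设备：对各子节点上报的局部模型参数做等权平均（未利用权重信息的基线做法）。"""
    return np.mean(np.stack(local_params_list), axis=0)

# 假设有 3 个子节点设备，各自持有数量不同的本地训练集合（数据随机生成，仅作示意）
rng = np.random.default_rng(0)
true_w = np.array([2.0, -1.0])
local_datasets = []
for n in (1000, 10, 200):
    x = rng.normal(size=(n, 2))
    y = x @ true_w + 0.01 * rng.normal(size=n)
    local_datasets.append((x, y))

global_w = np.zeros(2)
for round_id in range(20):
    # 步骤1：各子节点设备基于当前全局模型训练局部模型并上报
    locals_w = [local_train(global_w, d) for d in local_datasets]
    # 步骤2：主节点设备合成当前全局模型
    global_w = aggregate_equal(locals_w)
    # 步骤3：主节点设备将全局模型下发给各子节点设备，进入下一次训练迭代
print("训练得到的全局模型参数:", global_w)
```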
但是,在基于联邦学习的全局神经网络模型训练的过程中,可能会存在如下问题:
1、根据实际环境不同，不同的子节点设备所能够获得的训练集合可能不同；如训练集合的数量不同和/或训练集合的类别不同。
例如,当A用户有1000个数据作为训练集,B用户有10个数据作为训练集时,A用户和B用户确定的局部模型不应该当作同等可信程度的局部模型同等处理。
再例如:当A用户有1000个数据,且1000个数据都属于一类数据;B用户有1000个数据,但1000个数据属于不同类别的数据。在该场景下,A用户和B用户的训练集合中的数据数量虽然相同,但是训练集合所反映的样本信息是不同的;B用户的训练集合能反映出更多类别的样本信息,B用户对应的局部模型的泛化能力也会较A用户对应的局部模型的泛化能力强。
2、根据实际环境不同,不同的子节点设备利用训练集合获得局部模型的方式可能不同。
例如:A用户训练局部模型时用200个数据作为一批训练数据处理,并用这一批训练数据更新局部模型参数,完成一次局部模型的训练。B用户训练局部模型时用1000个训练数据作为一批训练数据处理,并用这一批训练数据更新局部模型参数,完成一次局部模型的训练。该场景下,B用户单次局部模型训练所使用的训练数据相比于A用户单次局部模型训练所使用的训练数据多;相应地,B用户所对应的单次局部模型相比于A用户所对应的单次局部模型代表了更多训练集合信息。
3、不同节点对应的信道情况和传输能力在实际环境下是不同的。
例如:A用户和B用户训练局部模型时都使用200个训练数据作为一批处理,并A用户和B用户分别更新自身的局部模型参数,完成一次局部模型的训练。在上述条件的基础上,A用户所处的信道环境较差,传输速率较低,A用户无法实现每次本地局部模型训练完成后,都将更新的局部模型参数上报至主节点设备;如A用户在本地做10次局部模型参数更新后传输一次局部模型参数至主节点。B用户所处的信道环境条件较A用户好,能支持相对较高的传输速率,B用户在本地做2次局部模型参数更新后传输一次局部模型参数至主节点设备。该场景下,A用户传输至主节点设备的局部模型参数与B用户传输至主节点设备的局部模型参数代表了不同的局部模型训练次数的信息,也可以理解为对应了不同大小的训练集合信息。
综上，由于训练集合的特征、确定局部模型的方式以及无线传输条件等因素的影响，在联邦学习的过程中可能存在不同的子节点设备生成的局部模型所对应的权重不同的情况；在该场景下，如果一个局部模型A的训练数据少于局部模型B的训练数据，如利用同等对待的策略将局部模型A和局部模型B合并为一个全局模型，则会造成全局模型训练结果受小规模训练数据的影响过大的问题，从而降低全局模型的性能。
针对上述问题,本申请实施例提供一种基于联邦学习的模型训练方法,本申请实施例的技术方案可以应用于各种通信系统,例如:全球移动通讯(global system of mobile communication,GSM)系统、码分多址(code division multiple access,CDMA)系统、宽带码分多址(wideband code division multiple access,WCDMA)系统、通用分组无线业务(general packet radio service,GPRS)、长期演进(long term evolution,LTE)系统、LTE频分双工(frequency division duplex,FDD)系统、LTE 时分双工(time division duplex,TDD)系统、先进的长期演进(advanced long term evolution,LTE-A)系统、新无线(new radio,NR)系统、NR系统的演进系统、非授权频段上的LTE(LTE-based access to unlicensed spectrum,LTE-U)系统、非授权频段上的NR(NR-based access to unlicensed spectrum,NR-U)系统、通用移动通信系统(universal mobile telecommunication system,UMTS)、全球互联微波接入(worldwide interoperability for microwave access,WiMAX)通信系统、无线局域网(wireless local area networks,WLAN)、无线保真(wireless fidelity,WiFi)、下一代通信系统或其他通信系统等。
本申请实施例描述的系统架构以及业务场景是为了更加清楚的说明本申请实施例的技术方案,并不构成对于本申请实施例提供的技术方案的限定,本领域普通技术人员可知,随着网络架构的演变和新业务场景的出现,本申请实施例提供的技术方案对于类似的技术问题,同样适用。
本申请实施例中涉及的网络设备,可以是普通的基站(如NodeB或eNB或者gNB)、新无线控制器(new radio controller,NR controller)、集中式网元(centralized unit)、新无线基站、射频拉远模块、微基站、中继(relay)、分布式网元(distributed unit)、接收点(transmission reception point,TRP)、传输点(transmission point,TP)或者任何其它设备。本申请的实施例对网络设备所采用的具体技术和具体设备形态不做限定。为方便描述,本申请所有实施例中,上述为终端设备提供无线通信功能的装置统称为网络设备。
在本申请实施例中,终端设备可以是任意的终端,比如,终端设备可以是机器类通信的用户设备。也就是说,该终端设备也可称之为用户设备UE、移动台(mobile station,MS)、移动终端(mobile terminal)、终端(terminal)等,该终端设备可以经无线接入网(radio access network,RAN)与一个或多个核心网进行通信,例如,终端设备可以是移动电话(或称为“蜂窝”电话)、具有移动终端的计算机等,例如,终端设备还可以是便携式、袖珍式、手持式、计算机内置的或者车载的移动装置,它们与无线接入网交换语言和/或数据。本申请实施例中不做具体限定。
可选的,网络设备和终端设备可以部署在陆地上,包括室内或室外、手持或车载;也可以部署在水面上;还可以部署在空中的飞机、气球和人造卫星上。本申请的实施例对网络设备和终端设备的应用场景不做限定。
可选的,网络设备和终端设备之间以及终端设备和终端设备之间可以通过授权频谱(licensed spectrum)进行通信,也可以通过非授权频谱(unlicensed spectrum)进行通信,也可以同时通过授权频谱和非授权频谱进行通信。网络设备和终端设备之间以及终端设备和终端设备之间可以通过7吉兆赫(gigahertz,GHz)以下的频谱进行通信,也可以通过7GHz以上的频谱进行通信,还可以同时使用7GHz以下的频谱和7GHz以上的频谱进行通信。本申请的实施例对网络设备和终端设备之间所使用的频谱资源不做限定。
通常来说,传统的通信系统支持的连接数有限,也易于实现,然而,随着通信技术的发展,移动通信系统将不仅支持传统的通信,还将支持例如,设备到设备(device to device,D2D)通信,机器到机器(machine to machine,M2M)通信,机器类型通信(machine type communication,MTC),以及车辆间(vehicle to vehicle,V2V)通信等,本申请实施例也可以应用于这些通信系统。
示例性的,本申请实施例应用的通信系统100,如图5所示。该通信系统100可以包括网络设备110,网络设备110可以是与终端设备120(或称为通信终端、终端)通信的设备。网络设备110可以为特定的地理区域提供通信覆盖,并且可以与位于该覆盖区域内的终端设备进行通信。可选地,该网络设备110可以是GSM系统或CDMA系统中的基站(Base Transceiver Station,BTS),也可以是WCDMA系统中的基站(NodeB,NB),还可以是LTE系统中的演进型基站(Evolutional Node B,eNB或eNodeB),或者是云无线接入网络(Cloud Radio Access Network,CRAN)中的无线控制器,或者该网络设备可以为移动交换中心、中继站、接入点、车载设备、可穿戴设备、集线器、交换机、网桥、路由器、5G网络中的网络侧设备或者未来演进的公共陆地移动网络(Public Land Mobile Network,PLMN)中的网络设备等。
该通信系统100还包括位于网络设备110覆盖范围内的至少一个终端设备120。作为在此使用的“终端设备”包括但不限于经由有线线路连接,如经由公共交换电话网络(Public Switched Telephone Networks,PSTN)、数字用户线路(Digital Subscriber Line,DSL)、数字电缆、直接电缆连接;和/或另一数据连接/网络;和/或经由无线接口,如,针对蜂窝网络、无线局域网(Wireless Local Area Network,WLAN)、诸如DVB-H网络的数字电视网络、卫星网络、AM-FM广播发送器;和/或另一终端设备的被设置成接收/发送通信信号的装置;和/或物联网(Internet of Things,IoT)设备。被设置成通过无线接口通信的终端设备可以被称为“无线通信终端”、“无线终端”或“移动终端”。移动终端的示例包括但不限于卫星或蜂窝电话;可以组合蜂窝无线电电话与数据处理、传真以及数据通信能力的个人通信系统(Personal Communications System,PCS)终端;可以包括无线电电话、寻呼机、因特网/内联网接入、Web浏览器、记事簿、日历以及/或全球定位系统(Global Positioning System,GPS)接收器的PDA;以及常规膝上型和/或掌上型接收器或包括无线电电话收发器的其它电子装置。终端设备可以指接入终端、用户设备(User Equipment,UE)、用户单元、用户站、移动站、移动台、远方站、远程终端、移动设备、用户终端、终端、无线通信设备、用户代理或用户装置。接入终端可以是蜂窝电话、无绳电话、会话启动协议(Session Initiation Protocol,SIP)电话、无线本地环路(Wireless Local Loop,WLL)站、个人数字处理(Personal Digital Assistant,PDA)、具有无线通信功能的手持设备、计算设备或连接到无线调制解调器的其它处理设备、车载设备、可穿戴设备、5G网络中的终端设备或者未来演进的PLMN中的终端设备等。
可选地,终端设备120之间可以进行终端直连(Device to Device,D2D)通信。
可选地,5G系统或5G网络还可以称为新无线(New Radio,NR)系统或NR网络。
图5示例性地示出了一个网络设备和两个终端设备,可选地,该通信系统100可以包括多个网络设备并且每个网络设备的覆盖范围内可以包括其它数量的终端设备,本申请实施例对此不做限定。
可选地,该通信系统100还可以包括网络控制器、移动管理实体等其他网络实体,本申请实施例对此不作限定。
应理解,本申请实施例中网络/系统中具有通信功能的设备可称为通信设备。以图5示出的通信系统100为例,通信设备可包括具有通信功能的网络设备110和终端设备120,网络设备110和终端设备120可以为上文所述的具体设备,此处不再赘述;通信设备还可包括通信系统100中的其他设备,例如网络控制器、移动管理实体等其他网络实体,本申请实施例中对此不做限定。
本申请实施例提供的基于联邦学习的模型训练方法的一种可选处理流程,如图6所示,包括以下步骤:
步骤S201,子节点设备发送局部模型的模型参数和所述局部模型对应的权重信息。
在一些实施例中,子节点设备向主节点设备发送局部模型的模型参数和所述局部模型对应的权重信息。其中,所述模型参数和所述权重信息用于主节点设备训练全局模型。
在具体实施时，所述子节点设备可以通过业务层数据、或上行控制信令(Uplink Control Information,UCI)、或无线资源控制(Radio Resource Control,RRC)信令发送所述模型参数；所述模型参数也可以承载于物理上行控制信道(Physical Uplink Control Channel,PUCCH)或物理上行共享信道(Physical Uplink Shared Channel,PUSCH)上。所述子节点设备可以通过业务层数据、或UCI、或RRC信令发送所述局部模型对应的权重信息；所述局部模型对应的权重信息也可以承载于PUCCH或PUSCH上。
在一些实施例中,所述局部模型对应的权重信息可以为:用于训练所述局部模型的样本的数据特征;则子节点设备发送局部模型的模型参数和用于训练所述局部模型的样本的数据特征。所述用于训练所述局部模型的样本的数据特征包括下述中的至少一项:用于训练所述局部模型的全部样本数据的大小、每次训练所述局部模型的样本数据的大小和训练所述局部模型的次数。
在另一些实施例中,所述局部模型对应的权重信息可以为:与用于训练所述局部模型的样本的数据特征对应的权重因子值。则子节点设备发送局部模型的模型参数和与用于训练所述局部模型的 样本的数据特征对应的权重因子值。所述用于训练所述局部模型的样本的数据特征包括下述中的至少一项:用于训练所述局部模型的全部样本数据的大小、每次训练所述局部模型的样本数据的大小和训练所述局部模型的次数。
其中,用于训练所述局部模型的样本的数据特征与所述权重因子值的对应关系,由所述主节点设备配置;或者,用于训练所述局部模型的样本的数据特征与所述权重因子值的对应关系为预先约定。
举例来说,用于训练所述局部模型的样本的数据特征包括用于训练所述局部模型的全部样本数据的大小,用于训练所述局部模型的样本的数据特征与权重因子值的对应关系,如下表1所示,用于训练所述局部模型的全部样本数据的大小为Ni,与Ni对应的权重因子值为Mi。其中,Nimin是用于训练所述局部模型的全部样本数据的最小值,Nimax是用于训练所述局部模型的全部样本数据的最大值;如表1中用于训练所述局部模型的全部样本数据大小在N1min至N1max之间时,对应的权重因子为M1;用于训练所述局部模型的全部样本数据大小在N2min至N2max之间时,对应的权重因子为M2;用于训练所述局部模型的全部样本数据大小在N3min至N3max之间时,对应的权重因子为M3。
用于训练所述局部模型的全部样本数据的大小 权重因子
N1min至N1max M1
N2min至N2max M2
N3min至N3max M3
表1
再举例来说,用于训练所述局部模型的样本的数据特征包括每次训练所述局部模型的样本数据的大小,用于训练所述局部模型的样本的数据特征与权重因子值的对应关系,如下表2所示,每次训练所述局部模型的样本数据的大小为Bi,与Bi对应的权重因子值为Mi。其中,Bimin是每次训练所述局部模型的样本数据的最小值,Bimax是每次训练所述局部模型的样本数据的最大值;如表2中每次训练所述局部模型的样本数据的大小在B1min至B1max之间时,对应的权重因子为M1;每次训练所述局部模型的样本数据的大小在B2min至B2max之间时,对应的权重因子为M2;每次训练所述局部模型的样本数据的大小在B3min至B3max之间时,对应的权重因子为M3。
每次训练所述局部模型的样本数据的大小 权重因子
B1min至B1max M1
B2min至B2max M2
B3min至B3max M3
表2
又举例来说,用于训练所述局部模型的样本的数据特征包括训练所述局部模型的次数,用于训练所述局部模型的样本的数据特征与权重因子值的对应关系,如下表3所示,训练所述局部模型的次数为Ki,与Ki对应的权重因子值为Mi。其中,Kimin是训练所述局部模型的次数的最小值,Kimax是训练所述局部模型的次数的最大值;如表3中训练所述局部模型的次数在K1min至K1max之间时,对应的权重因子为M1;训练所述局部模型的次数在K2min至K2max之间时,对应的权重因子为M2;训练所述局部模型的次数在K3min至K3max之间时,对应的权重因子为M3。
训练局部模型的次数 局部模型权重因子
K1min至K1max M1
K2min至K2max M2
K3min至K3max M3
表3
在用于训练所述局部模型的样本的数据特征与所述权重因子值的对应关系由所述主节点设备配置的情况下,所述主节点设备可以通过业务层数据、或RRC信令、或广播消息、或下行控制信令 (Downlink Control Information,DCI)、或媒体接入控制单元(Media Access Control-Control Element,MAC CE)、或者物理下行控制信道(Physical Downlink Control CHannel,PDCCH)信令将用于训练所述局部模型的样本的数据特征与所述权重因子值的对应关系发送至子节点设备。子节点设备再根据所述对应关系,查找与自身训练局部模型所使用的样本的数据特征对应的权重因子值,将查找得到的权重因子值上报至主节点设备。
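结合上述表1至表3，下面给出子节点设备根据配置的对应关系确定并上报权重因子值的一个极简示意（Python 示例代码，仅作说明；其中区间边界与权重因子的具体取值、report 的数据结构均为示意性假设，并非本申请所限定的实现）。

```python
# 主节点设备配置（或预先约定）的对应关系：全部样本数据的大小所在区间 -> 权重因子值
# 区间边界与权重因子取值仅为示例，对应正文表1中的 (Nimin, Nimax) -> Mi
SAMPLE_SIZE_TO_FACTOR = [
    ((1,    100),     0.2),   # N1min~N1max -> M1
    ((101,  1000),    0.6),   # N2min~N2max -> M2
    ((1001, 10**9),   1.0),   # N3min~N3max -> M3
]

def lookup_weight_factor(total_sample_size, table=SAMPLE_SIZE_TO_FACTOR):
    """子节点设备：在对应关系中查找与自身训练局部模型所用样本的数据特征对应的权重因子值。"""
    for (low, high), factor in table:
        if low <= total_sample_size <= high:
            return factor
    raise ValueError("样本数据的大小未落入任何已配置的区间")

def build_report(model_params, total_sample_size):
    """组装上报给主节点设备的内容：局部模型的模型参数 + 对应的权重因子值。"""
    return {
        "model_params": model_params,
        "weight_factor": lookup_weight_factor(total_sample_size),
    }

# 使用示例：某子节点设备用 1000 个样本训练得到局部模型参数 [1.9, -0.8]
report = build_report([1.9, -0.8], total_sample_size=1000)
print(report)   # {'model_params': [1.9, -0.8], 'weight_factor': 0.6}
```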
本申请实施例中,所述子节点设备可以是第一终端设备,所述主节点设备可以是第二终端设备或网络设备。在所述主节点设备为第二终端设备的情况下,所述子节点设备可以将局部模型的模型参数和所述局部模型对应的权重信息发送至第二终端设备,第二终端设备可以作为主节点处理接收到的局部模型的模型参数和所述局部模型对应的权重信息。或第二终端设备可以将接收到的局部模型的模型参数和所述局部模型对应的权重信息发送至主节点设备。
本申请实施例提供的基于联邦学习的模型训练方法的另一种可选处理流程,如图7所示,包括以下步骤:
步骤S301,主节点设备接收至少两个子节点设备发送的局部模型的模型参数和所述局部模型对应的权重信息。
在一些实施例中,针对所述模型参数和所述权重信息的说明,与上述步骤S201中的相同,这里不再赘述。
在一些实施例中,针对所述主节点设备接收所述模型参数和所述权重信息的说明,与上述步骤S201中子节点设备发送所述模型参数和所述权重信息的说明相同,这里不再赘述。
需要说明的是,在所述权重信息为用于训练所述局部模型的样本的数据特征与所述权重因子值的对应关系,且所述用于训练所述局部模型的样本的数据特征与所述权重因子值的对应关系由所述主节点设备配置的情况下,所述方法还可以包括:
步骤S300,主节点设备发送第一配置信息,所述第一配置信息用于确定所述用于训练所述局部模型的样本的数据特征与所述权重因子值的对应关系。
在一些实施例中，所述第一配置信息可携带于下述中的任意一项：业务层数据、RRC信令、广播消息、DCI、MAC CE和PDCCH信令。
步骤S302,主节点设备基于所述模型参数和所述权重信息,训练全局模型。
在一些实施例中,在所述权重信息包括训练所述局部模型的次数的情况下,所述主节点设备确定所述全局模型的模型参数的值等于,每个局部模型的模型参数的值与训练所述局部模型的次数的相乘之后,再与训练全部局部模型的次数相除得到的数值相加之和。
举例来说,子节点设备1和子节点设备2向主节点设备上报模型参数和权重信息。若子节点设备1上报的模型参数为R1,子节点设备1训练所述局部模型的次数为N1,即子节点设备1训练N1次局部模型便上报一次模型参数;子节点设备2上报的模型参数为R2,子节点设备2训练所述局部模型的次数为N2,即子节点设备2训练N2次局部模型便上报一次模型参数。则全局模型的模型参数R可以表示为:
R=(R1*N1+R2*N2)/(N1+N2)    (1)
在另一些实施例中,在所述权重信息包括训练所述局部模型的全部样本数据的大小或每次训练所述局部模型的样本数据的大小的情况下,所述主节点设备确定所述全局模型的模型参数的值等于,每个局部模型的模型参数的值与所述局部模型的参数因子相乘得到的数值相加之和;其中,所述局部模型的参数因子等于所述局部模型的样本的数据特征与全部局部模型的样本的数据特征之和的比值。
举例来说,若子节点设备1上报的模型参数为R1,子节点设备1训练所述局部模型的全部样本数据的大小为N1;子节点设备2上报的模型参数为R2,子节点设备2训练所述局部模型的全部样本数据的大小为N2,子节点设备k上报的模型参数为Rk,子节点设备k训练所述局部模型的全部样本数据的大小为Nk。则全局模型的模型参数R可以表示为:
R=a1*R1+a2*R2+…+ak*Rk    (2)
其中，ai=Ni/(N1+N2+…+Nk)，i=1,2,…,k    (3)
还有一些实施例中,在所述权重信息包括与训练所述局部模型的次数对应的权重因子值的情况下,所述主节点设备确定所述全局模型的模型参数的值等于,每个局部模型的模型参数的值与训练所述局部模型的次数对应的权重因子值相乘之后,再与训练全部局部模型的次数对应的权重因子值之和相除得到的数值相加之和。
举例来说,子节点设备1和子节点设备2向主节点设备上报模型参数和权重信息。若子节点设备1上报的模型参数为R1,训练所述局部模型的次数对应的权重因子值为M1;若子节点设备2上报的模型参数为R2,训练所述局部模型的次数对应的权重因子值为M2。则主节点设备确定全局模型的模型参数为:
R=(R1*M1+R2*M2)/(M1+M2)      (4)
又一些实施例中,在所述权重信息包括与训练所述局部模型的全部样本数据的大小对应的权重因子值、或与每次训练所述局部模型的样本数据的大小对应的权重因子值的情况下,所述主节点设备确定所述全局模型的模型参数的值等于,每个局部模型的模型参数的值与所述局部模型的参数因子相乘得到的数值数据相加之和;其中,所述局部模型的参数因子等于所述局部模型的权重因子值与全部局部模型的权重因子值之和的比值。
举例来说，若子节点设备1上报的模型参数为R1，子节点设备1训练所述局部模型的全部样本数据的大小对应的权重因子值为M1；子节点设备2上报的模型参数为R2，子节点设备2训练所述局部模型的全部样本数据的大小对应的权重因子值为M2，子节点设备k上报的模型参数为Rk，子节点设备k训练所述局部模型的全部样本数据的大小对应的权重因子值为Mk。则全局模型的模型参数R可以表示为：
R=b1*R1+b2*R2+…+bk*Rk    (5)
其中，bi=Mi/(M1+M2+…+Mk)，i=1,2,…,k    (6)
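下面给出主节点设备按照上述公式(1)至公式(6)的思路进行加权合成的一个极简示意（Python 示例代码，仅作说明；函数名与示例数值均为示意性假设）。无论权重信息是训练次数、样本数据的大小，还是对应的权重因子值，合成过程都可以统一理解为“各局部模型参数按归一化后的权重加权求和”。

```python
import numpy as np

def aggregate_weighted(local_params_list, weights):
    """主节点设备：基于各局部模型的模型参数及其权重信息合成全局模型的模型参数。

    weights 可以取各子节点的训练次数（对应公式(1)）、全部样本数据的大小
    （对应公式(2)、(3)），或上报的权重因子值（对应公式(4)至(6)）。
    """
    params = np.stack([np.asarray(p, dtype=float) for p in local_params_list])
    w = np.asarray(weights, dtype=float)
    alpha = w / w.sum()                       # 参数因子：各权重与全部权重之和的比值
    return (alpha[:, None] * params).sum(axis=0)

# 使用示例：两个子节点设备分别上报模型参数 R1、R2 及训练次数 N1、N2
R1, R2 = np.array([2.0, -1.0]), np.array([1.0, 0.0])
N1, N2 = 10, 2
R = aggregate_weighted([R1, R2], [N1, N2])    # 等价于 (R1*N1 + R2*N2) / (N1 + N2)
print(R)
```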
本申请实施例中,所述子节点设备可以是第一终端设备,所述主节点设备可以是第二终端设备或网络设备。
下面以所述局部模型对应的权重信息包括:用于训练所述局部模型的样本的数据特征为例,本申请实施例提供的基于联邦学习的模型训练方法的一种详细处理流程示意图,如图8所示,包括:
步骤S401,子节点设备向主节点设备发送局部模型的模型参数和用于训练所述局部模型的样本的数据特征。
其中,子节点设备通过业务层数据、或UCI、或RRC信令向主节点设备发送局部模型的模型参数和用于训练所述局部模型的样本的数据特征。所述用于训练所述局部模型的样本的数据特征包括下述中的至少一项:用于训练所述局部模型的全部样本数据的大小、每次训练所述局部模型的样本数据的大小和训练所述局部模型的次数。
步骤S402,主节点设备基于子节点设备发送的局部模型的模型参数和用于训练所述局部模型的样本的数据特征,合成全局模型。
在具体实施时,若用于训练所述局部模型的样本的数据特征包括训练所述局部模型的次数的情况下,如上述公式(1)所示,所述主节点设备确定所述全局模型的模型参数的值等于,每个局部模型的模型参数的值与训练所述局部模型的次数的相乘之后,再与训练全部局部模型的次数相除得到的数值相加之和。
在具体实施时,若用于训练所述局部模型的样本的数据特征包括训练所述局部模型的全部样本数据的大小或每次训练所述局部模型的样本数据的大小的情况下,如上述公式(2)和公式(3)所示,所述主节点设备确定所述全局模型的模型参数的值等于,每个局部模型的模型参数的值与所述局部模型的参数因子相乘得到的数值相加之和;其中,所述局部模型的参数因子等于所述局部模型的样本的数据特征与全部局部模型的样本的数据特征之和的比值。
步骤S403,主节点设备向子节点设备发送全局模型。
步骤S404,子节点设备向主节点设备发送局部模型的模型参数和用于训练所述局部模型的样本的数据特征。
这里,所述子节点设备重复上述步骤S401的操作,步骤S401与步骤S404中发送的模型参数可能不同,也可能相同;步骤S401与步骤S404中发送的用于训练所述局部模型的样本的数据特征可能不同,也可能相同。相应的,所述主节点设备重复上述步骤S402至步骤S403的操作;直至全局模型训练完成。
下面以所述局部模型对应的权重信息包括:与用于训练所述局部模型的样本的数据特征对应的权重因子值为例,本申请实施例提供的基于联邦学习的模型训练方法的一种详细处理流程示意图,如图9所示,包括:
步骤S501,子节点设备获取用于训练所述局部模型的样本的数据特征与所述权重因子值的对应关系。
在一些实施例中,所述子节点设备可以根据预先约定来确定用于训练所述局部模型的样本的数据特征与所述权重因子值的对应关系;所述子节点设备也可以通过接收网络设备发送的第一配置信息来确定用于训练所述局部模型的样本的数据特征与所述权重因子值的对应关系。
步骤S502，子节点设备根据自身训练局部模型的样本的数据特征，确定权重因子值。
在具体实施时，子节点设备在用于训练所述局部模型的样本的数据特征与所述权重因子值的对应关系中，查找与自身训练局部模型的样本的数据特征对应的权重因子值。
步骤S503,子节点设备向主节点设备发送局部模型的模型参数和权重因子值。
在一些实施例中,子节点设备通过业务层数据、或UCI、或RRC信令向主节点设备发送局部模型的模型参数和权重因子值。
步骤S504,主节点设备基于子节点设备发送的局部模型的模型参数和权重因子值,合成全局模型。
在具体实施时,若所述权重因子值为与训练所述局部模型的次数对应的权重因子值的情况下,所述主节点设备基于上述公式(4)确定全局模型的模型参数等于,每个局部模型的模型参数的值与训练所述局部模型的次数对应的权重因子值相乘之后,再与训练全部局部模型的次数对应的权重因子值之和相除得到的数值相加之和。
在具体实施时,若所述权重因子值为与训练所述局部模型的全部样本数据的大小对应的权重因子值、或与每次训练所述局部模型的样本数据的大小对应的权重因子值的情况下,所述主节点设备基于上述公式(5)和公式(6)确定全局模型的模型参数的值等于,每个局部模型的模型参数的值与所述局部模型的参数因子相乘得到的数值数据相加之和;其中,所述局部模型的参数因子等于所述局部模型的权重因子值与全部局部模型的权重因子值之和的比值。
步骤S505,主节点设备向子节点设备发送全局模型。
步骤S506,子节点设备向主节点设备发送局部模型的模型参数和权重因子值。
这里,所述子节点设备重复上述步骤S503的操作,步骤S503与步骤S506中子节点设备向主节点设备发送的模型参数可能相同,也可能不同;步骤S503与步骤S506中子节点设备向主节点设备发送的权重因子值可能相同,也可能不同。相应的,所述主节点设备重复上述步骤S504至步骤S505的操作;直至全局模型训练完成。
应理解,在本申请的各种实施例中,上述各过程的序号的大小并不意味着执行顺序的先后,各 过程的执行顺序应以其功能和内在逻辑确定,而不应对本申请实施例的实施过程构成任何限定。
在具体实施时,本申请实施例提供的基于联邦学习的模型训练方法至少可以应用于下述场景:如信道模型生成,服务用户预测,智能交通决策。
以本申请实施例提供的基于联邦学习的模型训练方法应用于信道模型生成场景为例,所述终端设备获取信道质量数据,以该信道质量数据作为样本训练局部模型;终端设备将训练得到的局部模型的模型参数发送至网络设备,网络设备根据各终端设备分别发送的局部模型的模型参数确定全局模型的模型参数;所述全局模型用于确定信道质量。
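以该场景为例，下面给出一个端到端的极简示意（Python 示例代码，仅作说明；其中信道质量样本的生成方式、局部模型的形式均为示意性假设，并非本申请所限定的实现）：各终端设备用本地信道质量样本训练一个简单的局部模型，并连同全部样本数据的大小一起上报；网络设备按样本数量加权合成用于确定信道质量的全局模型。

```python
import numpy as np

rng = np.random.default_rng(1)

def train_local_channel_model(samples):
    """终端设备：以（测量特征, 信道质量）样本训练一个线性局部模型（最小二乘，仅作示意）。"""
    x, y = samples
    w, *_ = np.linalg.lstsq(x, y, rcond=None)
    return w, len(y)                          # 返回局部模型的模型参数及全部样本数据的大小

def aggregate(reports):
    """网络设备：按各终端设备上报的样本数量加权合成全局模型的模型参数。"""
    params = np.stack([w for w, _ in reports])
    sizes = np.array([n for _, n in reports], dtype=float)
    return (sizes[:, None] * params).sum(axis=0) / sizes.sum()

# 三个终端设备各自采集到数量不同的信道质量样本（此处用随机数据模拟）
true_w = np.array([0.5, 1.5])
reports = []
for n in (800, 50, 300):
    x = rng.normal(size=(n, 2))
    y = x @ true_w + 0.05 * rng.normal(size=n)
    reports.append(train_local_channel_model((x, y)))

global_w = aggregate(reports)
print("用于确定信道质量的全局模型参数:", global_w)
```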
为实现上述基于联邦学习的模型训练方法,本申请实施例提供一种子节点设备,所述子节点设备600的一种可选组成结构示意图,如图10所示,包括:
第一发送单元601,配置为发送局部模型的模型参数和所述局部模型对应的权重信息;
所述模型参数和所述权重信息用于主节点设备训练全局模型。
在一些实施例中,所述权重信息包括:用于训练所述局部模型的样本的数据特征。
在一些实施例中,所述局部模型对应的权重信息包括:与用于训练所述局部模型的样本的数据特征对应的权重因子值。
在一些实施例中,所述用于训练所述局部模型的样本的数据特征与所述权重因子值的对应关系,由所述主节点设备配置;或者,所述用于训练所述局部模型的样本的数据特征与所述权重因子值的对应关系为预先约定。
在一些实施例中,所述用于训练所述局部模型的样本的数据特征与所述权重因子值的对应关系,通过下述中的任意一项配置:
业务层数据、RRC信令、广播消息、DCI、MAC CE和PDCCH信令。
在一些实施例中,所述用于训练所述局部模型的样本的数据特征包括下述中的至少一项:用于训练所述局部模型的全部样本数据的大小、每次训练所述局部模型的样本数据的大小和训练所述局部模型的次数。
在一些实施例中,所述权重信息通过业务层数据、或UCI、或RRC信令传输;和/或,所述权重信息承载于PUCCH或PUSCH上。
在一些实施例中,所述模型参数通过业务层数据、或UCI、或RRC信令传输;和/或,所述模型参数承载于PUCCH或PUSCH上。
在一些实施例中,所述子节点设备600包括:第一终端设备。
在一些实施例中,所述主节点设备包括:第二终端设备或网络设备。
为实现上述基于联邦学习的模型训练方法,本申请实施例提供一种主节点设备,所述主节点设备800的可选组成结构示意图,如图11所示,包括:
第一接收单元801,配置为接收至少两个子节点设备发送的局部模型的模型参数和所述局部模型对应的权重信息;
处理单元802,配置为基于所述模型参数和所述权重信息,训练全局模型。
在一些实施例中,所述权重信息包括:用于训练所述局部模型的样本的数据特征。
在一些实施例中,所述局部模型对应的权重信息包括:与用于训练所述局部模型的样本的数据特征对应的权重因子值。
在一些实施例中,所述用于训练所述局部模型的样本的数据特征与所述权重因子值的对应关系为预先约定。
在一些实施例中,所述主节点设备800还包括:
第二发送单元803,配置为发送第一配置信息,所述第一配置信息用于确定所述用于训练所述局部模型的样本的数据特征与所述权重因子值的对应关系。
在一些实施例中，所述第一配置信息携带于下述中的任意一项：
业务层数据、RRC信令、广播消息、DCI、MAC CE和PDCCH信令。
在一些实施例中,所述用于训练所述局部模型的样本的数据特征包括下述中的至少一项:用于训练所述局部模型的全部样本数据的大小、每次训练所述局部模型的样本数据的大小和训练所述局部模型的次数。
在一些实施例中,所述处理单元802,配置为在所述权重信息包括训练所述局部模型的次数的情况下,确定所述全局模型的模型参数的值等于,每个局部模型的模型参数的值与训练所述局部模型的次数的相乘之后,再与训练全部局部模型的次数相除得到的数值相加之和。
在一些实施例中,所述处理单元802,配置为在所述权重信息包括训练所述局部模型的全部样本数据的大小或每次训练所述局部模型的样本数据的大小的情况下,确定所述全局模型的模型参数的值等于,每个局部模型的模型参数的值与所述局部模型的参数因子相乘得到的数值相加之和;
其中,所述局部模型的参数因子等于所述局部模型的样本的数据特征与全部局部模型的样本的数据特征之和的比值。
在一些实施例中,所述处理单元802,配置为在所述权重信息包括与训练所述局部模型的次数对应的权重因子值的情况下,确定所述全局模型的模型参数的值等于,每个局部模型的模型参数的值与训练所述局部模型的次数对应的权重因子值相乘之后,再与训练全部局部模型的次数对应的权重因子值之和相除得到的数值相加之和。
在一些实施例中,所述处理单元802,配置为在所述权重信息包括与训练所述局部模型的全部样本数据的大小对应的权重因子值、或与每次训练所述局部模型的样本数据的大小对应的权重因子值的情况下,确定所述全局模型的模型参数的值等于,每个局部模型的模型参数的值与所述局部模型的参数因子相乘得到的数值数据相加之和;
其中,所述局部模型的参数因子等于所述局部模型的权重因子值与全部局部模型的权重因子值之和的比值。
在一些实施例中,所述权重信息通过业务层数据、或UCI、或RRC信令传输;和/或,所述权重信息承载于PUCCH或PUSCH上。
在一些实施例中,所述模型参数通过业务层数据、或UCI、或RRC信令传输;
和/或,所述模型参数承载于PUCCH或PUSCH上。
在一些实施例中,所述子节点设备包括:第一终端设备。
在一些实施例中,所述主节点设备包括:第二终端设备或网络设备。
本申请实施例还提供一种子节点设备,包括处理器和用于存储能够在处理器上运行的计算机程序的存储器,其中,所述处理器用于运行所述计算机程序时,执行上述子节点执行的基于联邦学习的模型训练方法的步骤。
本申请实施例还提供一种主节点设备,包括处理器和用于存储能够在处理器上运行的计算机程序的存储器,其中,所述处理器用于运行所述计算机程序时,执行上述主节点设备执行的基于联邦学习的模型训练方法的步骤。
本申请实施例还提供一种芯片,包括:处理器,用于从存储器中调用并运行计算机程序,使得安装有所述芯片的设备执行上述子节点设备执行的基于联邦学习的模型训练方法。
本申请实施例还提供一种芯片,包括:处理器,用于从存储器中调用并运行计算机程序,使得安装有所述芯片的设备执行上述主节点设备执行的基于联邦学习的模型训练方法。
本申请实施例还提供一种存储介质,存储有可执行程序,所述可执行程序被处理器执行时,实现上述子节点设备执行的基于联邦学习的模型训练方法。
本申请实施例还提供一种存储介质,存储有可执行程序,所述可执行程序被处理器执行时,实现上述主节点设备执行的基于联邦学习的模型训练方法。
本申请实施例还提供一种计算机程序产品,包括计算机程序指令,该计算机程序指令使得计算机执行上述子节点设备执行的基于联邦学习的模型训练方法。
本申请实施例还提供一种计算机程序产品,包括计算机程序指令,该计算机程序指令使得计算 机执行上述主节点设备执行的基于联邦学习的模型训练方法。
本申请实施例还提供一种计算机程序,所述计算机程序使得计算机执行上述子节点执行的基于联邦学习的模型训练方法。
本申请实施例还提供一种计算机程序,所述计算机程序使得计算机执行上述主节点设备执行的基于联邦学习的模型训练方法。
图12是本申请实施例的电子设备(主节点设备或子节点设备)的硬件组成结构示意图,电子设备700包括:至少一个处理器701、存储器702和至少一个网络接口704。电子设备700中的各个组件通过总线系统705耦合在一起。可理解,总线系统705用于实现这些组件之间的连接通信。总线系统705除包括数据总线之外,还包括电源总线、控制总线和状态信号总线。但是为了清楚说明起见,在图12中将各种总线都标为总线系统705。
可以理解,存储器702可以是易失性存储器或非易失性存储器,也可包括易失性和非易失性存储器两者。其中,非易失性存储器可以是ROM、可编程只读存储器(PROM,Programmable Read-Only Memory)、可擦除可编程只读存储器(EPROM,Erasable Programmable Read-Only Memory)、电可擦除可编程只读存储器(EEPROM,Electrically Erasable Programmable Read-Only Memory)、磁性随机存取存储器(FRAM,ferromagnetic random access memory)、快闪存储器(Flash Memory)、磁表面存储器、光盘、或只读光盘(CD-ROM,Compact Disc Read-Only Memory);磁表面存储器可以是磁盘存储器或磁带存储器。易失性存储器可以是随机存取存储器(RAM,Random Access Memory),其用作外部高速缓存。通过示例性但不是限制性说明,许多形式的RAM可用,例如静态随机存取存储器(SRAM,Static Random Access Memory)、同步静态随机存取存储器(SSRAM,Synchronous Static Random Access Memory)、动态随机存取存储器(DRAM,Dynamic Random Access Memory)、同步动态随机存取存储器(SDRAM,Synchronous Dynamic Random Access Memory)、双倍数据速率同步动态随机存取存储器(DDRSDRAM,Double Data Rate Synchronous Dynamic Random Access Memory)、增强型同步动态随机存取存储器(ESDRAM,Enhanced Synchronous Dynamic Random Access Memory)、同步连接动态随机存取存储器(SLDRAM,SyncLink Dynamic Random Access Memory)、直接内存总线随机存取存储器(DRRAM,Direct Rambus Random Access Memory)。本申请实施例描述的存储器702旨在包括但不限于这些和任意其它适合类型的存储器。
本申请实施例中的存储器702用于存储各种类型的数据以支持电子设备700的操作。这些数据的示例包括:用于在电子设备700上操作的任何计算机程序,如应用程序7022。实现本申请实施例方法的程序可以包含在应用程序7022中。
上述本申请实施例揭示的方法可以应用于处理器701中,或者由处理器701实现。处理器701可能是一种集成电路芯片,具有信号的处理能力。在实现过程中,上述方法的各步骤可以通过处理器701中的硬件的集成逻辑电路或者软件形式的指令完成。上述的处理器701可以是通用处理器、数字信号处理器(DSP,Digital Signal Processor),或者其他可编程逻辑器件、分立门或者晶体管逻辑器件、分立硬件组件等。处理器701可以实现或者执行本申请实施例中的公开的各方法、步骤及逻辑框图。通用处理器可以是微处理器或者任何常规的处理器等。结合本申请实施例所公开的方法的步骤,可以直接体现为硬件译码处理器执行完成,或者用译码处理器中的硬件及软件模块组合执行完成。软件模块可以位于存储介质中,该存储介质位于存储器702,处理器701读取存储器702中的信息,结合其硬件完成前述方法的步骤。
在示例性实施例中,电子设备700可以被一个或多个应用专用集成电路(ASIC,Application Specific Integrated Circuit)、DSP、可编程逻辑器件(PLD,Programmable Logic Device)、复杂可编程逻辑器件(CPLD,Complex Programmable Logic Device)、FPGA、通用处理器、控制器、MCU、MPU、或其他电子元件实现,用于执行前述方法。
本申请是参照根据本申请实施例的方法、设备(系统)、和计算机程序产品的流程图和/或方框图来描述的。应理解可由计算机程序指令实现流程图和/或方框图中的每一流程和/或方框、以及流程 图和/或方框图中的流程和/或方框的结合。可提供这些计算机程序指令到通用计算机、专用计算机、嵌入式处理机或其他可编程数据处理设备的处理器以产生一个机器,使得通过计算机或其他可编程数据处理设备的处理器执行的指令产生用于实现在流程图一个流程或多个流程和/或方框图一个方框或多个方框中指定的功能的装置。
这些计算机程序指令也可存储在能引导计算机或其他可编程数据处理设备以特定方式工作的计算机可读存储器中,使得存储在该计算机可读存储器中的指令产生包括指令装置的制造品,该指令装置实现在流程图一个流程或多个流程和/或方框图一个方框或多个方框中指定的功能。
这些计算机程序指令也可装载到计算机或其他可编程数据处理设备上,使得在计算机或其他可编程设备上执行一系列操作步骤以产生计算机实现的处理,从而在计算机或其他可编程设备上执行的指令提供用于实现在流程图一个流程或多个流程和/或方框图一个方框或多个方框中指定的功能的步骤。
应理解,本申请中术语“系统”和“网络”在本文中常被可互换使用。本申请中术语“和/或”,仅仅是一种描述关联对象的关联关系,表示可以存在三种关系,例如,A和/或B,可以表示:单独存在A,同时存在A和B,单独存在B这三种情况。另外,本申请中字符“/”,一般表示前后关联对象是一种“或”的关系。
以上所述,仅为本申请的较佳实施例而已,并非用于限定本申请的保护范围,凡在本申请的精神和原则之内所作的任何修改、等同替换和改进等,均应包含在本申请的保护范围之内。

Claims (60)

  1. 一种基于联邦学习的模型训练方法,所述方法包括:
    子节点设备发送局部模型的模型参数和所述局部模型对应的权重信息;
    所述模型参数和所述权重信息用于主节点设备训练全局模型。
  2. 根据权利要求1所述的方法,其中,所述局部模型对应的权重信息包括:
    用于训练所述局部模型的样本的数据特征。
  3. 根据权利要求1所述的方法,其中,所述局部模型对应的权重信息包括:
    与用于训练所述局部模型的样本的数据特征对应的权重因子值。
  4. 根据权利要求3所述的方法,其中,所述用于训练所述局部模型的样本的数据特征与所述权重因子值的对应关系,由所述主节点设备配置;
    或者,所述用于训练所述局部模型的样本的数据特征与所述权重因子值的对应关系为预先约定。
  5. 根据权利要求4所述的方法,其中,所述用于训练所述局部模型的样本的数据特征与所述权重因子值的对应关系,通过下述中的任意一项配置:
    业务层数据、无线资源控制RRC信令、广播消息、下行控制信令DCI、媒体接入控制单元MAC CE和物理下行控制信道PDCCH信令。
  6. 根据权利要求2至5任一项所述的方法,其中,所述用于训练所述局部模型的样本的数据特征包括下述中的至少一项:
    用于训练所述局部模型的全部样本数据的大小、每次训练所述局部模型的样本数据的大小和训练所述局部模型的次数。
  7. 根据权利要求1至6任一项所述的方法,其中,所述权重信息通过业务层数据、或上行控制信令UCI、或RRC信令传输;
    和/或,所述权重信息承载于PUCCH或物理上行共享信道PUSCH上。
  8. 根据权利要求1至7任一项所述的方法,其中,所述模型参数通过业务层数据、或UCI、或RRC信令传输;
    和/或,所述模型参数承载于PUCCH或PUSCH上。
  9. 根据权利要求1至8任一项所述的方法,其中,所述子节点设备包括:第一终端设备。
  10. 根据权利要求1至9任一项所述的方法,其中,所述主节点设备包括:第二终端设备或网络设备。
  11. 一种基于联邦学习的模型训练方法,所述方法包括:
    主节点设备接收至少两个子节点设备发送的局部模型的模型参数和所述局部模型对应的权重信息;
    所述主节点设备基于所述模型参数和所述权重信息,训练全局模型。
  12. 根据权利要求11所述的方法,其中,所述权重信息包括:
    用于训练所述局部模型的样本的数据特征。
  13. 根据权利要求11所述的方法,其中,所述局部模型对应的权重信息包括:
    与用于训练所述局部模型的样本的数据特征对应的权重因子值。
  14. 根据权利要求13所述的方法,其中,所述用于训练所述局部模型的样本的数据特征与所述权重因子值的对应关系由所述主节点设备配置;
    或者，所述用于训练所述局部模型的样本的数据特征与所述权重因子值的对应关系为预先约定。
  15. 根据权利要求13所述的方法,其中,所述方法还包括:
    所述主节点设备发送第一配置信息,所述第一配置信息用于确定所述用于训练所述局部模型的样本的数据特征与所述权重因子值的对应关系。
  16. 根据权利要求15所述的方法，其中，所述第一配置信息携带于下述中的任意一项：
    业务层数据、无线资源控制RRC信令、广播消息、下行控制信令DCI、媒体接入控制单元MAC CE和物理下行控制信道PDCCH信令。
  17. 根据权利要求12至16任一项所述的方法,其中,所述用于训练所述局部模型的样本的数据特征包括下述中的至少一项:
    用于训练所述局部模型的全部样本数据的大小、每次训练所述局部模型的样本数据的大小和训练所述局部模型的次数。
  18. 根据权利要求12所述的方法,其中,所述主节点设备基于所述模型参数和所述权重信息,训练全局模型,包括:
    在所述权重信息包括训练所述局部模型的次数的情况下,确定所述全局模型的模型参数的值等于,每个局部模型的模型参数的值与训练所述局部模型的次数的相乘之后,再与训练全部局部模型的次数相除得到的数值相加之和。
  19. 根据权利要求12所述的方法,其中,所述主节点设备基于所述模型参数和所述权重信息,训练全局模型,包括:
    在所述权重信息包括训练所述局部模型的全部样本数据的大小或每次训练所述局部模型的样本数据的大小的情况下,确定所述全局模型的模型参数的值等于,每个局部模型的模型参数的值与所述局部模型的参数因子相乘得到的数值相加之和;
    其中,所述局部模型的参数因子等于所述局部模型的样本的数据特征与全部局部模型的样本的数据特征之和的比值。
  20. 根据权利要求13至16任一项所述的方法,其中,所述主节点设备基于所述模型参数和所述权重信息,训练全局模型,包括:
    在所述权重信息包括与训练所述局部模型的次数对应的权重因子值的情况下,确定所述全局模型的模型参数的值等于,每个局部模型的模型参数的值与训练所述局部模型的次数对应的权重因子值相乘之后,再与训练全部局部模型的次数对应的权重因子值之和相除得到的数值相加之和。
  21. 根据权利要求13至16任一项所述的方法,其中,所述主节点设备基于所述模型参数和所述权重信息,训练全局模型,包括:
    在所述权重信息包括与训练所述局部模型的全部样本数据的大小对应的权重因子值、或与每次训练所述局部模型的样本数据的大小对应的权重因子值的情况下,确定所述全局模型的模型参数的值等于,每个局部模型的模型参数的值与所述局部模型的参数因子相乘得到的数值数据相加之和;
    其中,所述局部模型的参数因子等于所述局部模型的权重因子值与全部局部模型的权重因子值之和的比值。
  22. 根据权利要求11至21任一项所述的方法,其中,所述权重信息通过业务层数据、或上行控制信令UCI、或RRC信令传输;
    和/或,所述权重信息承载于PUCCH或物理上行共享信道PUSCH上。
  23. 根据权利要求11至22任一项所述的方法,其中,所述模型参数通过业务层数据、或上行控制信令UCI、或RRC信令传输;
    和/或,所述模型参数承载于PUCCH或PUSCH上。
  24. 根据权利要求11至23任一项所述的方法,其中,所述子节点设备包括:第一终端设备。
  25. 根据权利要求11至24任一项所述的方法,其中,所述主节点设备包括:第二终端设备或网络设备。
  26. 一种子节点设备,所述子节点设备包括:
    第一发送单元,配置为发送局部模型的模型参数和所述局部模型对应的权重信息;
    所述模型参数和所述权重信息用于主节点设备训练全局模型。
  27. 根据权利要求26所述的子节点设备,其中,所述局部模型对应的权重信息包括:
    用于训练所述局部模型的样本的数据特征。
  28. 根据权利要求26所述的子节点设备,其中,所述局部模型对应的权重信息包括:
    与用于训练所述局部模型的样本的数据特征对应的权重因子值。
  29. 根据权利要求28所述的子节点设备,其中,所述用于训练所述局部模型的样本的数据特征与所述权重因子值的对应关系,由所述主节点设备配置;
    或者,所述用于训练所述局部模型的样本的数据特征与所述权重因子值的对应关系为预先约定。
  30. 根据权利要求29所述的子节点设备,其中,所述用于训练所述局部模型的样本的数据特征与所述权重因子值的对应关系,通过下述中的任意一项配置:
    业务层数据、无线资源控制RRC信令、广播消息、下行控制信令DCI、媒体接入控制单元MAC CE和物理下行控制信道PDCCH信令。
  31. 根据权利要求27至30任一项所述的子节点设备,其中,所述用于训练所述局部模型的样本的数据特征包括下述中的至少一项:
    用于训练所述局部模型的全部样本数据的大小、每次训练所述局部模型的样本数据的大小和训练所述局部模型的次数。
  32. 根据权利要求26至31任一项所述的子节点设备,其中,所述权重信息通过业务层数据、或上行控制信令UCI、或RRC信令传输;
    和/或,所述权重信息承载于PUCCH或物理上行共享信道PUSCH上。
  33. 根据权利要求26至32任一项所述的子节点设备,其中,所述模型参数通过业务层数据、或UCI、或RRC信令传输;
    和/或,所述模型参数承载于PUCCH或PUSCH上。
  34. 根据权利要求26至33任一项所述的子节点设备,其中,所述子节点设备包括:第一终端设备。
  35. 根据权利要求26至33任一项所述的子节点设备,其中,所述主节点设备包括:第二终端设备或网络设备。
  36. 一种主节点设备,所述主节点设备包括:
    第一接收单元,配置为接收至少两个子节点设备发送的局部模型的模型参数和所述局部模型对应的权重信息;
    处理单元,配置为基于所述模型参数和所述权重信息,训练全局模型。
  37. 根据权利要求36所述的主节点设备,其中,所述权重信息包括:
    用于训练所述局部模型的样本的数据特征。
  38. 根据权利要求36所述的主节点设备,其中,所述局部模型对应的权重信息包括:
    与用于训练所述局部模型的样本的数据特征对应的权重因子值。
  39. 根据权利要求38所述的主节点设备,其中,所述用于训练所述局部模型的样本的数据特征与所述权重因子值的对应关系由所述主节点设备配置;
    或者,所述用于训练所述局部模型的样本的数据特征与所述权重因子值的对应关系为预先约定。
  40. 根据权利要求38所述的主节点设备,其中,所述主节点设备还包括:
    第二发送单元,配置为发送第一配置信息,所述第一配置信息用于确定所述用于训练所述局部模型的样本的数据特征与所述权重因子值的对应关系。
  41. 根据权利要求40所述的主节点设备，其中，所述第一配置信息携带于下述中的任意一项：
    业务层数据、无线资源控制RRC信令、广播消息、下行控制信令DCI、媒体接入控制单元MAC CE和物理下行控制信道PDCCH信令。
  42. 根据权利要求37至41任一项所述的主节点设备,其中,所述用于训练所述局部模型的样 本的数据特征包括下述中的至少一项:
    用于训练所述局部模型的全部样本数据的大小、每次训练所述局部模型的样本数据的大小和训练所述局部模型的次数。
  43. 根据权利要求37所述的主节点设备,其中,所述处理单元,配置为在所述权重信息包括训练所述局部模型的次数的情况下,确定所述全局模型的模型参数的值等于,每个局部模型的模型参数的值与训练所述局部模型的次数的相乘之后,再与训练全部局部模型的次数相除得到的数值相加之和。
  44. 根据权利要求37所述的主节点设备,其中,所述处理单元,配置为在所述权重信息包括训练所述局部模型的全部样本数据的大小或每次训练所述局部模型的样本数据的大小的情况下,确定所述全局模型的模型参数的值等于,每个局部模型的模型参数的值与所述局部模型的参数因子相乘得到的数值相加之和;
    其中,所述局部模型的参数因子等于所述局部模型的样本的数据特征与全部局部模型的样本的数据特征之和的比值。
  45. 根据权利要求38至41任一项所述的主节点设备,其中,所述处理单元,配置为在所述权重信息包括与训练所述局部模型的次数对应的权重因子值的情况下,确定所述全局模型的模型参数的值等于,每个局部模型的模型参数的值与训练所述局部模型的次数对应的权重因子值相乘之后,再与训练全部局部模型的次数对应的权重因子值之和相除得到的数值相加之和。
  46. 根据权利要求38至41任一项所述的主节点设备,其中,所述处理单元,配置为在所述权重信息包括与训练所述局部模型的全部样本数据的大小对应的权重因子值、或与每次训练所述局部模型的样本数据的大小对应的权重因子值的情况下,确定所述全局模型的模型参数的值等于,每个局部模型的模型参数的值与所述局部模型的参数因子相乘得到的数值数据相加之和;
    其中,所述局部模型的参数因子等于所述局部模型的权重因子值与全部局部模型的权重因子值之和的比值。
  47. 根据权利要求36至46任一项所述的主节点设备,其中,所述权重信息通过业务层数据、或上行控制信令UCI、或RRC信令传输;
    和/或,所述权重信息承载于PUCCH或物理上行共享信道PUSCH上。
  48. 根据权利要求36至47任一项所述的主节点设备,其中,所述模型参数通过业务层数据、或UCI、或RRC信令传输;
    和/或,所述模型参数承载于PUCCH或PUSCH上。
  49. 根据权利要求36至47任一项所述的主节点设备,其中,所述子节点设备包括:第一终端设备。
  50. 根据权利要求36至48任一项所述的主节点设备,其中,所述主节点设备包括:第二终端设备或网络设备。
  51. 一种终端设备,包括处理器和用于存储能够在处理器上运行的计算机程序的存储器,其中,
    所述处理器用于运行所述计算机程序时,执行权利要求1至10任一项所述的基于联邦学习的模型训练方法的步骤。
  52. 一种网络设备,包括处理器和用于存储能够在处理器上运行的计算机程序的存储器,其中,
    所述处理器用于运行所述计算机程序时,执行权利要求11至25任一项所述的基于联邦学习的模型训练方法的步骤。
  53. 一种存储介质,存储有可执行程序,所述可执行程序被处理器执行时,实现权利要求1至10任一项所述的基于联邦学习的模型训练方法。
  54. 一种存储介质,存储有可执行程序,所述可执行程序被处理器执行时,实现权利要求11至25任一项所述的基于联邦学习的模型训练方法。
  55. 一种计算机程序产品,包括计算机程序指令,该计算机程序指令使得计算机执行如权利要 求1至10任一项所述的基于联邦学习的模型训练方法。
  56. 一种计算机程序产品,包括计算机程序指令,该计算机程序指令使得计算机执行如权利要求11至25任一项所述的基于联邦学习的模型训练方法。
  57. 一种计算机程序,所述计算机程序使得计算机执行如权利要求1至10任一项所述的基于联邦学习的模型训练方法。
  58. 一种计算机程序,所述计算机程序使得计算机执行如权利要求11至25任一项所述的基于联邦学习的模型训练方法。
  59. 一种芯片,包括:处理器,用于从存储器中调用并运行计算机程序,使得安装有所述芯片的设备执行如权利要求1至10任一项所述的基于联邦学习的模型训练方法。
  60. 一种芯片,包括:处理器,用于从存储器中调用并运行计算机程序,使得安装有所述芯片的设备执行如权利要求11至25任一项所述的基于联邦学习的模型训练方法。
PCT/CN2020/078721 2020-03-11 2020-03-11 一种基于联邦学习的模型训练方法、电子设备及存储介质 WO2021179196A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202080098459.3A CN115280338A (zh) 2020-03-11 2020-03-11 一种基于联邦学习的模型训练方法、电子设备及存储介质
PCT/CN2020/078721 WO2021179196A1 (zh) 2020-03-11 2020-03-11 一种基于联邦学习的模型训练方法、电子设备及存储介质

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2020/078721 WO2021179196A1 (zh) 2020-03-11 2020-03-11 一种基于联邦学习的模型训练方法、电子设备及存储介质

Publications (1)

Publication Number Publication Date
WO2021179196A1 true WO2021179196A1 (zh) 2021-09-16

Family

ID=77670362

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/078721 WO2021179196A1 (zh) 2020-03-11 2020-03-11 一种基于联邦学习的模型训练方法、电子设备及存储介质

Country Status (2)

Country Link
CN (1) CN115280338A (zh)
WO (1) WO2021179196A1 (zh)

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113708982A (zh) * 2021-10-28 2021-11-26 华中科技大学 一种基于群体学习的服务功能链部署方法及系统
CN113869528A (zh) * 2021-12-02 2021-12-31 中国科学院自动化研究所 共识表征提取和多样性传播的解纠缠个性化联邦学习方法
CN113935469A (zh) * 2021-10-26 2022-01-14 城云科技(中国)有限公司 基于去中心化联邦学习的模型训练方法
CN114118447A (zh) * 2021-12-15 2022-03-01 湖南红普创新科技发展有限公司 新型联邦学习系统、方法、装置、计算机设备及存储介质
CN114662706A (zh) * 2022-03-24 2022-06-24 支付宝(杭州)信息技术有限公司 一种模型训练方法、装置及设备
CN116346863A (zh) * 2023-05-29 2023-06-27 湘江实验室 基于联邦学习的车载网数据处理方法、装置、设备及介质
WO2024026583A1 (zh) * 2022-07-30 2024-02-08 华为技术有限公司 一种通信方法和通信装置
WO2024032453A1 (zh) * 2022-08-10 2024-02-15 索尼集团公司 用于频谱管理装置的电子设备和方法、存储介质
WO2024036526A1 (zh) * 2022-08-17 2024-02-22 华为技术有限公司 一种模型调度方法和装置

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109871702A (zh) * 2019-02-18 2019-06-11 深圳前海微众银行股份有限公司 联邦模型训练方法、系统、设备及计算机可读存储介质
US20190227980A1 (en) * 2018-01-22 2019-07-25 Google Llc Training User-Level Differentially Private Machine-Learned Models
CN110442457A (zh) * 2019-08-12 2019-11-12 北京大学深圳研究生院 基于联邦学习的模型训练方法、装置及服务器
CN110490335A (zh) * 2019-08-07 2019-11-22 深圳前海微众银行股份有限公司 一种计算参与者贡献率的方法及装置
CN110610242A (zh) * 2019-09-02 2019-12-24 深圳前海微众银行股份有限公司 一种联邦学习中参与者权重的设置方法及装置
CN110795477A (zh) * 2019-09-20 2020-02-14 平安科技(深圳)有限公司 数据的训练方法及装置、系统

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190227980A1 (en) * 2018-01-22 2019-07-25 Google Llc Training User-Level Differentially Private Machine-Learned Models
CN109871702A (zh) * 2019-02-18 2019-06-11 深圳前海微众银行股份有限公司 联邦模型训练方法、系统、设备及计算机可读存储介质
CN110490335A (zh) * 2019-08-07 2019-11-22 深圳前海微众银行股份有限公司 一种计算参与者贡献率的方法及装置
CN110442457A (zh) * 2019-08-12 2019-11-12 北京大学深圳研究生院 基于联邦学习的模型训练方法、装置及服务器
CN110610242A (zh) * 2019-09-02 2019-12-24 深圳前海微众银行股份有限公司 一种联邦学习中参与者权重的设置方法及装置
CN110795477A (zh) * 2019-09-20 2020-02-14 平安科技(深圳)有限公司 数据的训练方法及装置、系统

Cited By (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113935469A (zh) * 2021-10-26 2022-01-14 城云科技(中国)有限公司 基于去中心化联邦学习的模型训练方法
CN113935469B (zh) * 2021-10-26 2022-06-24 城云科技(中国)有限公司 基于去中心化联邦学习的模型训练方法
CN113708982A (zh) * 2021-10-28 2021-11-26 华中科技大学 一种基于群体学习的服务功能链部署方法及系统
CN113869528A (zh) * 2021-12-02 2021-12-31 中国科学院自动化研究所 共识表征提取和多样性传播的解纠缠个性化联邦学习方法
CN113869528B (zh) * 2021-12-02 2022-03-18 中国科学院自动化研究所 共识表征提取和多样性传播的解纠缠个性化联邦学习方法
CN114118447A (zh) * 2021-12-15 2022-03-01 湖南红普创新科技发展有限公司 新型联邦学习系统、方法、装置、计算机设备及存储介质
CN114662706A (zh) * 2022-03-24 2022-06-24 支付宝(杭州)信息技术有限公司 一种模型训练方法、装置及设备
WO2024026583A1 (zh) * 2022-07-30 2024-02-08 华为技术有限公司 一种通信方法和通信装置
WO2024032453A1 (zh) * 2022-08-10 2024-02-15 索尼集团公司 用于频谱管理装置的电子设备和方法、存储介质
WO2024036526A1 (zh) * 2022-08-17 2024-02-22 华为技术有限公司 一种模型调度方法和装置
CN116346863A (zh) * 2023-05-29 2023-06-27 湘江实验室 基于联邦学习的车载网数据处理方法、装置、设备及介质
CN116346863B (zh) * 2023-05-29 2023-08-01 湘江实验室 基于联邦学习的车载网数据处理方法、装置、设备及介质

Also Published As

Publication number Publication date
CN115280338A (zh) 2022-11-01

Similar Documents

Publication Publication Date Title
WO2021179196A1 (zh) 一种基于联邦学习的模型训练方法、电子设备及存储介质
WO2021147001A1 (zh) 功率控制参数确定方法、终端、网络设备及存储介质
US20210136739A1 (en) Method and device for transmitting uplink signal
EP4131088A1 (en) Machine learning model training method, electronic device and storage medium
WO2021035494A1 (zh) 一种信道状态信息处理方法、电子设备及存储介质
WO2021237423A1 (zh) 一种信道状态信息传输方法、电子设备及存储介质
WO2021237715A1 (zh) 一种信道状态信息处理方法、电子设备及存储介质
WO2020118574A1 (zh) 一种上行传输的功率控制方法及终端设备
WO2021016770A1 (zh) 一种信息处理方法、网络设备、用户设备
WO2020073257A1 (zh) 无线通信方法和终端设备
US11425663B2 (en) Wireless communication method, terminal device, and network device
WO2021035492A1 (zh) 一种信道状态信息处理方法、电子设备及存储介质
WO2021227069A9 (zh) 一种模型更新方法及装置、通信设备
WO2021087827A1 (zh) 激活或者更新pusch路损rs的方法和设备
WO2018218519A1 (zh) 无线通信方法和设备
WO2022061891A1 (zh) 一种重复传输方法、通信设备及存储介质
US20230007529A1 (en) Data transmission method, electronic device, and storage medium
WO2021088066A1 (zh) 一种上行传输方法、电子设备及存储介质
WO2020118722A1 (zh) 一种数据处理方法、设备及存储介质
WO2022021371A1 (zh) 一种会话建立方法、电子设备及存储介质
WO2022151189A1 (zh) 一种信道传输方法、电子设备及存储介质
CN112673668A (zh) 基于复制数据的传输方法和设备
WO2023001060A1 (zh) 一种通信方法及相关装置
WO2021203391A1 (zh) 一种数据传输方法、发送设备及存储介质
WO2023102706A1 (zh) 信息指示方法、信息处理方法和设备

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20924520

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20924520

Country of ref document: EP

Kind code of ref document: A1