CN115280338A - Model training method based on federated learning, electronic device and storage medium - Google Patents

Model training method based on federated learning, electronic device and storage medium

Info

Publication number
CN115280338A
Authority
CN
China
Prior art keywords
model
training
local
local model
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202080098459.3A
Other languages
Chinese (zh)
Inventor
田文强
沈嘉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangdong Oppo Mobile Telecommunications Corp Ltd
Original Assignee
Guangdong Oppo Mobile Telecommunications Corp Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangdong Oppo Mobile Telecommunications Corp Ltd
Publication of CN115280338A

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 - Machine learning
    • G06N20/20 - Ensemble learning

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Medical Informatics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Artificial Intelligence (AREA)
  • Mobile Radio Communication Systems (AREA)

Abstract

A method for model training based on federated learning includes: a child node device sends model parameters of a local model and weight information corresponding to the local model (S201); the model parameters and the weight information are used by a master node device to train a global model. Another method for model training based on federated learning, an electronic device, and a storage medium are also disclosed.

Description

Model training method based on federated learning, electronic device and storage medium
Technical Field
The present application relates to the field of wireless communication technologies, and in particular, to a model training method based on federated learning, an electronic device, and a storage medium.
Background
When model training is performed based on federated learning, it is not clear how the master node device can obtain a high-performance global model when it trains the global model based on the local models reported by the child node devices.
Disclosure of Invention
The embodiments of the present application provide a model training method based on federated learning, an electronic device, and a storage medium, with which a high-performance global model can be obtained through training.
In a first aspect, an embodiment of the present application provides a method for model training based on federated learning, including: a child node device sends model parameters of a local model and weight information corresponding to the local model; the model parameters and the weight information are used by a master node device to train a global model.
In a second aspect, an embodiment of the present application provides a method for model training based on federated learning, including: a master node device receives model parameters of local models and weight information corresponding to the local models, sent by at least two child node devices; the master node device trains a global model based on the model parameters and the weight information.
In a third aspect, an embodiment of the present application provides a child node device, where the child node device includes:
a first sending unit, configured to send model parameters of a local model and weight information corresponding to the local model; the model parameters and the weight information are used by the master node device to train a global model.
In a fourth aspect, an embodiment of the present application provides a master node device, where the master node device includes:
the first receiving unit is configured to receive model parameters of a local model and weight information corresponding to the local model, which are sent by at least two child node devices; a processing unit configured to train a global model based on the model parameters and the weight information.
In a fifth aspect, an embodiment of the present application provides a child node device, including a processor and a memory for storing a computer program capable of running on the processor, where the processor is configured to execute, when running the computer program, the steps of the above federated learning-based model training method performed by the child node device.
In a sixth aspect, an embodiment of the present application provides a master node device, including a processor and a memory for storing a computer program capable of running on the processor, where the processor is configured to execute, when running the computer program, the steps of the above federated learning-based model training method performed by the master node device.
In a seventh aspect, an embodiment of the present application provides a chip, including: a processor, configured to call and run a computer program from a memory, so that a device provided with the chip executes the federated learning-based model training method performed by the child node device.
In an eighth aspect, an embodiment of the present application provides a chip, including: a processor, configured to call and run a computer program from a memory, so that a device provided with the chip executes the federated learning-based model training method performed by the master node device.
In a ninth aspect, an embodiment of the present application provides a storage medium storing an executable program which, when executed by a processor, implements the federated learning-based model training method performed by the child node device.
In a tenth aspect, an embodiment of the present application provides a storage medium storing an executable program which, when executed by a processor, implements the federated learning-based model training method performed by the master node device.
In an eleventh aspect, an embodiment of the present application provides a computer program product, including computer program instructions that cause a computer to execute the federated learning-based model training method performed by the child node device.
In a twelfth aspect, an embodiment of the present application provides a computer program product, including computer program instructions that cause a computer to execute the federated learning-based model training method performed by the master node device.
In a thirteenth aspect, an embodiment of the present application provides a computer program that causes a computer to execute the federated learning-based model training method performed by the child node device.
In a fourteenth aspect, an embodiment of the present application provides a computer program that causes a computer to execute the federated learning-based model training method performed by the master node device.
The federated learning-based model training method, electronic device, and storage medium provided by the embodiments of the present application include: a child node device sends model parameters of a local model and weight information corresponding to the local model; the model parameters and the weight information are used by the master node device to train a global model. In this way, since each child node device reports the weight information corresponding to its local model to the master node device, the master node device can train the global model based on the weight information of the different local models; the global model can therefore reflect the characteristics of the training data represented by each local model, and when the master node device trains the global model using the local models reported by the child node devices, the performance of the global model is not degraded by local models with low reliability.
Drawings
FIG. 1 is a schematic diagram of the basic structure of a simple neural network model of the present application;
FIG. 2 is a schematic diagram of a basic structure of a deep neural network model according to the present application;
FIG. 3a is a schematic diagram of a training process of a neural network model according to the present application;
FIG. 3b is a schematic diagram of the inference process of the neural network model of the present application;
FIG. 4 is a schematic diagram of a training process of the neural network model based on federated learning according to the present application;
fig. 5 is a schematic structural diagram of a communication system according to an embodiment of the present application;
FIG. 6 is a schematic view of an alternative processing flow of a model training method based on federated learning according to an embodiment of the present application;
FIG. 7 is a schematic view of an alternative process flow of a federated learning-based model training method according to an embodiment of the present application;
FIG. 8 is a detailed process flow diagram of a federated learning-based model training method according to an embodiment of the present application;
FIG. 9 is a schematic diagram illustrating another detailed process flow of a federated learning-based model training method according to an embodiment of the present application;
FIG. 10 is a schematic diagram of an alternative configuration of a child node device according to an embodiment of the present application;
fig. 11 is a schematic diagram of an alternative configuration of a master node device according to an embodiment of the present application;
fig. 12 is a schematic diagram of a hardware component structure of an electronic device according to an embodiment of the present application.
Detailed Description
So that the manner in which the features and elements of the present embodiments can be understood in detail, a more particular description of the embodiments, briefly summarized above, may be had by reference to embodiments, some of which are illustrated in the appended drawings.
Before describing embodiments of the present application in detail, artificial intelligence is briefly described.
Artificial intelligence has become a new way for people to solve and handle problems, and artificial intelligence based on neural networks in particular has found wide application. The basic structure of a simple neural network model is shown in fig. 1 and includes an input layer, a hidden layer and an output layer; the input layer receives data, the hidden layer processes the data, and the output layer generates the calculation result of the neural network model.
With the continuous development of research on neural network models, neural network deep learning algorithms have also been proposed. The basic structure of a deep neural network model is shown in fig. 2: it contains multiple hidden layers, which greatly improves its data processing capability, and such models are widely applied in pattern recognition, signal processing, combinatorial optimization, anomaly detection and the like.
The application of a neural network model involves two phases: a training phase and an inference phase. In the training phase, a large amount of data is first collected as a training set (also called a sample set); the training set is used as input data of the neural network model to be trained, and the undetermined parameters of the model are determined through extensive training and parameter iteration based on a specific training algorithm. This completes the training process and yields the trained neural network model. For example, a neural network model for recognizing puppies can be trained from a large number of pictures, as shown in fig. 3a. After training is completed, the trained neural network model can be used for inference or verification operations such as recognition, classification and information recovery; this is called the inference process of the neural network model. For example, the puppy in an image can be recognized by the trained neural network model, as shown in fig. 3b.
One way of training a neural network model is federated learning, which is characterized in that the training set is distributed over the child node devices during training. The training process of a neural network model based on federated learning is shown in fig. 4 and includes three steps. First, after each child node device generates a local neural network model, it uploads the local neural network model to the master node device. Second, the master node device synthesizes a current global neural network model from all of the obtained local neural network models and delivers the global neural network model to each child node device. Finally, each child node device continues the next training iteration using the new global neural network model; the training of the neural network model is thus completed through the cooperation of the master node device and the plurality of child node devices.
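For illustration only (this sketch is not part of the application as filed), the three-step procedure of fig. 4 can be written out in Python. The toy linear model, the gradient function and the plain parameter averaging used here are assumptions made purely to keep the sketch self-contained; the weighting described later in this application refines the averaging step.

```python
import numpy as np

def local_gradient(params, x, y):
    # Squared-error gradient of a toy linear model y ≈ params @ x, standing in
    # for the neural network training described above.
    return 2.0 * (params @ x - y) * x

def train_local_model(global_params, local_dataset, learning_rate=0.01):
    # Step 1: each child node device starts from the current global parameters
    # and updates them on its own training set to obtain a local model.
    params = global_params.copy()
    for x, y in local_dataset:
        params -= learning_rate * local_gradient(params, x, y)
    return params

def federated_round(global_params, child_datasets):
    # The child node devices upload their local models ...
    local_models = [train_local_model(global_params, ds) for ds in child_datasets]
    # Step 2: ... and the master node device synthesizes the current global
    # model (here a plain average over the local models).
    new_global = np.mean(local_models, axis=0)
    # Step 3: the new global model is delivered back to the child node devices
    # for the next training iteration.
    return new_global

# Minimal usage example with two child node devices and toy data.
rng = np.random.default_rng(0)
true_w = np.array([1.0, -2.0])
make_dataset = lambda n: [(x, float(true_w @ x)) for x in rng.normal(size=(n, 2))]

global_w = np.zeros(2)
for _ in range(20):
    global_w = federated_round(global_w, [make_dataset(50), make_dataset(50)])
print(global_w)  # approaches true_w after a few rounds
```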
However, during global neural network model training based on federated learning, the following problems may exist:
1. Depending on their actual environments, different child node devices may have different available training sets, e.g., different numbers of training samples and/or different classes of samples in the training sets.
For example, when user A has 1000 data items as its training set and user B has 10 data items as its training set, the local models determined by user A and user B should not be treated as local models of equal confidence.
As another example, suppose user A has 1000 data items that all belong to one class of data, while user B has 1000 data items that belong to different classes of data. In this scenario, although the numbers of data items in the training sets of user A and user B are the same, the sample information reflected by the training sets is different: the training set of user B reflects more classes of sample information, so the generalization capability of the local model corresponding to user B is stronger than that of the local model corresponding to user A.
2. Depending on their actual environments, different child node devices may use their training sets in different ways when obtaining local models.
For example: when the user A trains the local model, 200 data are used as a batch of training data for processing, and the batch of training data is used for updating the parameters of the local model, thereby completing the training of the local model once. And B, when the user trains the local model, 1000 training data are used as a batch of training data for processing, and the batch of training data is used for updating the parameters of the local model to finish the training of the local model. Under the scene, the training data used by the user B for the single local model training is more than the training data used by the user A for the single local model training; accordingly, the one-time local model corresponding to the B user represents more training set information than the one-time local model corresponding to the a user.
3. The channel conditions and transmission capabilities of different child node devices differ in actual environments.
For example: when the user A and the user B train the local models, 200 pieces of training data are used as a batch of processing, and the user A and the user B respectively update the local model parameters of the user A and the user B to finish the training of the local models once. On the basis of the above conditions, the channel environment where the user a is located is poor, the transmission rate is low, and the user a reports the updated local model parameters to the master node device after the local model training is not completed each time; if the user A locally updates the local model parameters for 10 times, the local model parameters are transmitted to the main node for one time. The channel environment condition of the user B is better than that of the user A, the user B can support relatively higher transmission rate, and the user B locally updates the local model parameters for 2 times and then transmits the local model parameters to the main node equipment for one time. In this scenario, the local model parameter transmitted to the master node device by the user a and the local model parameter transmitted to the master node device by the user B represent different information on the number of local model training times, and may also be understood as training set information corresponding to different sizes.
In summary, due to the influence of factors such as the characteristics of the training sets, the way the local models are determined and the wireless transmission conditions, the weights that should be given to the local models generated by different child node devices may differ in the federated learning process. In this scenario, if the training data of a local model A is less than the training data of a local model B, and local model A and local model B are combined into one global model using an equal-treatment strategy, the global model training result may be unduly affected by the small-scale training data, which is not conducive to improving the performance of the global model.
In view of the foregoing problems, an embodiment of the present application provides a model training method based on federated learning. The technical solutions of the embodiments of the present application may be applied to various communication systems, for example: a Global System for Mobile communications (GSM) system, a Code Division Multiple Access (CDMA) system, a Wideband Code Division Multiple Access (WCDMA) system, a General Packet Radio Service (GPRS) system, a Long Term Evolution (LTE) system, an LTE Frequency Division Duplex (FDD) system, an LTE Time Division Duplex (TDD) system, an Advanced Long Term Evolution (LTE-A) system, a New Radio (NR) system, an evolution system of the NR system, an LTE-based access to unlicensed spectrum (LTE-U) system, an NR-based access to unlicensed spectrum (NR-U) system, a Universal Mobile Telecommunications System (UMTS), a Worldwide Interoperability for Microwave Access (WiMAX) communication system, a Wireless Local Area Network (WLAN), Wireless Fidelity (WiFi), a next generation communication system, or other communication systems.
The system architectures and service scenarios described in the embodiments of the present application are intended to illustrate the technical solutions of the embodiments more clearly and do not limit the technical solutions provided by the embodiments of the present application. A person of ordinary skill in the art will appreciate that, with the evolution of network architectures and the emergence of new service scenarios, the technical solutions provided by the embodiments of the present application are also applicable to similar technical problems.
The network device involved in the embodiments of the present application may be an ordinary base station (e.g., a NodeB, an eNB or a gNB), a new radio controller (NR controller), a centralized unit, a new radio base station, a remote radio module, a micro base station, a relay, a distributed unit, a transmission reception point (TRP), a transmission point (TP), or any other device. The embodiments of the present application do not limit the specific technologies and specific device forms used by the network device. For convenience of description, in all embodiments of the present application, the above-mentioned apparatuses that provide a wireless communication function for a terminal device are collectively referred to as network devices.
In the embodiments of the present application, the terminal device may be any terminal; for example, the terminal device may be a user equipment for machine type communication. That is, the terminal device may also be referred to as user equipment (UE), a mobile station (MS), a mobile terminal, a terminal, etc. The terminal device may communicate with one or more core networks via a Radio Access Network (RAN); for example, the terminal device may be a mobile phone (or a "cellular" phone) or a computer with a mobile terminal, and may also be a portable, pocket-sized, hand-held, computer-built-in, or vehicle-mounted mobile device that exchanges voice and/or data with the RAN. The embodiments of the present application are not particularly limited in this respect.
Optionally, the network device and the terminal device may be deployed on land, including indoors or outdoors, hand-held or vehicle-mounted; can also be deployed on the water surface; it may also be deployed on airborne airplanes, balloons and satellite vehicles. The embodiment of the application does not limit the application scenarios of the network device and the terminal device.
Optionally, the network device and the terminal device may communicate via a licensed spectrum (licensed spectrum), may communicate via an unlicensed spectrum (unlicensed spectrum), and may communicate via both the licensed spectrum and the unlicensed spectrum. The network device and the terminal device may communicate with each other through a frequency spectrum of less than 7 gigahertz (GHz), may communicate through a frequency spectrum of more than 7GHz, and may communicate using both a frequency spectrum of less than 7GHz and a frequency spectrum of more than 7 GHz. The embodiments of the present application do not limit the spectrum resources used between the network device and the terminal device.
Generally, conventional communication systems support a limited number of connections and are easy to implement, however, with the development of communication technology, mobile communication systems will support not only conventional communication, but also, for example, device to device (D2D) communication, machine to machine (M2M) communication, machine Type Communication (MTC), and vehicle to vehicle (V2V) communication, and the embodiments of the present application can also be applied to these communication systems.
Illustratively, the embodiment of the present application is applied to a communication system 100, as shown in fig. 5. The communication system 100 may include a network device 110, and the network device 110 may be a device that communicates with a terminal device 120 (or referred to as a communication terminal, a terminal). Network device 110 may provide communication coverage for a particular geographic area and may communicate with terminal devices located within that coverage area. Optionally, the Network device 110 may be a Base Transceiver Station (BTS) in a GSM system or a CDMA system, a Base Station (NodeB, NB) in a WCDMA system, an evolved Node B (eNB or eNodeB) in an LTE system, or a wireless controller in a Cloud Radio Access Network (CRAN), or may be a Network device in a Mobile switching center, a relay Station, an Access point, a vehicle-mounted device, a wearable device, a hub, a switch, a bridge, a router, a Network-side device in a 5G Network, or a Network device in a Public Land Mobile Network (PLMN) for future evolution, or the like.
The communication system 100 further comprises at least one terminal device 120 located within the coverage area of the network device 110. As used herein, "terminal equipment" includes, but is not limited to, connections via wireline, such as Public Switched Telephone Network (PSTN), digital Subscriber Line (DSL), digital cable, direct cable connection; and/or another data connection/network; and/or via a Wireless interface, e.g., for a cellular Network, a Wireless Local Area Network (WLAN), a digital television Network such as a DVB-H Network, a satellite Network, an AM-FM broadcast transmitter; and/or means of another terminal device arranged to receive/transmit communication signals; and/or Internet of Things (IoT) devices. A terminal device arranged to communicate over a wireless interface may be referred to as a "wireless communication terminal", "wireless terminal", or "mobile terminal". Examples of mobile terminals include, but are not limited to, satellite or cellular telephones; personal Communications Systems (PCS) terminals that may combine cellular radiotelephones with data processing, facsimile, and data Communications capabilities; PDAs that may include radiotelephones, pagers, internet/intranet access, web browsers, notepads, calendars, and/or Global Positioning System (GPS) receivers; and conventional laptop and/or palmtop receivers or other electronic devices that include a radiotelephone transceiver. A terminal Equipment may refer to an access terminal, user Equipment (UE), subscriber unit, subscriber station, mobile, remote station, remote terminal, mobile device, user terminal, wireless communication device, user agent, or User Equipment. An access terminal may be a cellular telephone, a cordless telephone, a Session Initiation Protocol (SIP) phone, a Wireless Local Loop (WLL) station, a Personal Digital Assistant (PDA), a handheld device having Wireless communication capabilities, a computing device or other processing device connected to a Wireless modem, a vehicle mounted device, a wearable device, a terminal device in a 5G network, or a terminal device in a future evolved PLMN, etc.
Optionally, a Device to Device (D2D) communication may be performed between the terminal devices 120.
Alternatively, the 5G system or the 5G network may also be referred to as a New Radio (NR) system or an NR network.
Fig. 5 exemplarily shows one network device and two terminal devices, alternatively, the communication system 100 may include a plurality of network devices and each network device may include other numbers of terminal devices within the coverage area, which is not limited in this embodiment of the present invention.
Optionally, the communication system 100 may further include other network entities such as a network controller, a mobility management entity, and the like, which is not limited in this embodiment.
It should be understood that a device having a communication function in a network/system in the embodiments of the present application may be referred to as a communication device. Taking the communication system 100 shown in fig. 5 as an example, the communication device may include a network device 110 and a terminal device 120 having a communication function, and the network device 110 and the terminal device 120 may be the specific devices described above, which are not described herein again; the communication device may also include other devices in the communication system 100, such as other network entities, for example, a network controller, a mobility management entity, and the like, which is not limited in this embodiment.
An optional processing flow of the model training method based on federal learning provided in the embodiment of the present application, as shown in fig. 6, includes the following steps:
step S201, the child node device sends model parameters of a local model and weight information corresponding to the local model.
In some embodiments, the child node device transmits, to the master node device, model parameters of the local model and weight information corresponding to the local model. Wherein the model parameters and the weight information are used for the master node device to train a global model.
In specific implementation, the child node device may send the model parameter through service layer data, uplink Control Information (UCI), or Radio Resource Control (RRC) signaling; the model parameters may also be carried on a Physical Uplink Control Channel (PUCCH) or a Physical Uplink Shared Channel (PUSCH). The child node equipment can send the weight information corresponding to the local model through service layer data, UCI or RRC signaling; the weight information corresponding to the local model may also be carried on the PUCCH or PUSCH.
In some embodiments, the weight information corresponding to the local model may be: data features of samples used to train the local model; the child node device transmits model parameters of the local model and data characteristics of the samples used to train the local model. The data characteristics of the samples used to train the local model include at least one of: the size of all sample data used for training the local model, the size of the sample data used for training the local model each time and the number of times of training the local model.
In other embodiments, the weight information corresponding to the local model may be: a weighting factor value corresponding to a data characteristic of a sample used to train the local model. The child node device transmits model parameters of the local model and weight factor values corresponding to data characteristics of samples used to train the local model. The data characteristics of the samples used to train the local model include at least one of: the size of all sample data used for training the local model, the size of the sample data used for training the local model each time and the number of times of training the local model.
Wherein the correspondence between the data characteristics of the samples used for training the local model and the weighting factor values is configured by the master node device; or, the corresponding relation between the data characteristics of the samples for training the local model and the weight factor values is predetermined.
For example, suppose the data characteristic of the samples used for training the local model is the size of all sample data used for training the local model. The correspondence between this data characteristic and the weight factor value is shown in table 1 below: when the size of all sample data used for training the local model is Ni, the corresponding weight factor value is Mi, where Nimin is the lower bound and Nimax is the upper bound of the i-th range of sample data sizes. As shown in table 1, when the size of all sample data used for training the local model is between N1min and N1max, the corresponding weight factor is M1; when it is between N2min and N2max, the corresponding weight factor is M2; and when it is between N3min and N3max, the corresponding weight factor is M3.
Size of all sample data used to train the local model | Weight factor
N1min to N1max | M1
N2min to N2max | M2
N3min to N3max | M3
TABLE 1
As another example, suppose the data characteristic of the samples used for training the local model is the size of the sample data used for each training of the local model. The correspondence between this data characteristic and the weight factor value is shown in table 2 below: when the size of the sample data used for each training of the local model is Bi, the corresponding weight factor value is Mi, where Bimin is the lower bound and Bimax is the upper bound of the i-th range of per-training sample sizes. As shown in table 2, when the size of the sample data used for each training of the local model is between B1min and B1max, the corresponding weight factor is M1; when it is between B2min and B2max, the corresponding weight factor is M2; and when it is between B3min and B3max, the corresponding weight factor is M3.
Size of sample data for each training of the local model | Weight factor
B1min to B1max | M1
B2min to B2max | M2
B3min to B3max | M3
TABLE 2
As another example, suppose the data characteristic of the samples used for training the local model is the number of times the local model is trained. The correspondence between this data characteristic and the weight factor value is shown in table 3 below: when the number of times the local model is trained is Ki, the corresponding weight factor value is Mi, where Kimin is the lower bound and Kimax is the upper bound of the i-th range of training counts. As shown in table 3, when the number of times the local model is trained is between K1min and K1max, the corresponding weight factor is M1; when it is between K2min and K2max, the corresponding weight factor is M2; and when it is between K3min and K3max, the corresponding weight factor is M3.
Number of times the local model is trained | Weight factor
K1min to K1max | M1
K2min to K2max | M2
K3min to K3max | M3
TABLE 3
When the correspondence between the data characteristics of the samples used for training the local model and the weight factor values is configured by the master node device, the master node device may send this correspondence to a child node device through service layer data, RRC signaling, a broadcast message, Downlink Control Information (DCI), a Medium Access Control Control Element (MAC CE), or Physical Downlink Control Channel (PDCCH) signaling. The child node device then looks up, according to the correspondence, the weight factor value corresponding to the data characteristics of the samples used for training its local model, and reports the found weight factor value to the master node device.
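As a non-normative sketch of the table lookup described above, the following Python fragment shows how a child node device might map the size of all sample data used for training its local model to a weight factor value. The concrete range boundaries and factor values are hypothetical, since the application leaves Ni and Mi unspecified.

```python
# Hypothetical table 1-style correspondence configured by the master node
# device (or agreed in advance): (Nimin, Nimax, Mi) triples.
WEIGHT_FACTOR_TABLE = [
    (0,    500,   0.5),   # N1min..N1max -> M1
    (501,  2000,  1.0),   # N2min..N2max -> M2
    (2001, 10000, 2.0),   # N3min..N3max -> M3
]

def lookup_weight_factor(total_sample_size, table=WEIGHT_FACTOR_TABLE):
    # The child node device searches the configured correspondence and reports
    # the matching weight factor value to the master node device.
    for low, high, factor in table:
        if low <= total_sample_size <= high:
            return factor
    raise ValueError("sample size outside the configured ranges")

# Example: a child node device that trained on 1000 samples reports factor 1.0.
print(lookup_weight_factor(1000))
```

The same lookup applies unchanged to tables 2 and 3 by replacing the sample-size ranges with per-training batch-size ranges or training-count ranges.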
In this embodiment, the child node device may be a first terminal device, and the master node device may be a second terminal device or a network device. When the master node device is a second terminal device, the child node device may send the model parameters of the local model and the weight information corresponding to the local model to the second terminal device, and the second terminal device, acting as the master node, may process the received model parameters of the local model and the weight information corresponding to the local model. Alternatively, the second terminal device may forward the received model parameters of the local model and the weight information corresponding to the local model to the master node device.
Another optional processing flow of the model training method based on federated learning provided in the embodiment of the present application, as shown in fig. 7, includes the following steps:
step S301, the master node device receives model parameters of local models and weight information corresponding to the local models, sent by at least two child node devices.
In some embodiments, the description of the model parameters and the weight information is the same as that in step S201, and is not repeated here.
In some embodiments, the description of how the master node device receives the model parameters and the weight information is the same as the description of how the child node device sends the model parameters and the weight information in step S201, and details are not repeated here.
In addition, when the weight information is a weight factor value corresponding to the data characteristics of the samples used for training the local model, and the correspondence between the data characteristics of the samples used for training the local model and the weight factor values is configured by the master node device, the method may further include:
step S300, the master node device sends first configuration information, where the first configuration information is used to determine a correspondence between the data features of the samples used for training the local model and the weight factor values.
In some embodiments, the first configuration information may carry any one of: service layer data, RRC signaling, broadcast messages, DCI, MAC CE, and PDCCH signaling.
Step S302, the master node device trains a global model based on the model parameters and the weight information.
In some embodiments, when the weight information includes the number of times the local model is trained, the master node device determines that the value of a model parameter of the global model is equal to the sum, over all local models, of the value of that model parameter of each local model multiplied by the number of times that local model was trained, divided by the total number of times all local models were trained.
For example, child node device 1 and child node device 2 report model parameters and weight information to the master node device. Suppose the model parameter reported by child node device 1 is R1 and the number of times child node device 1 trained its local model is N1, i.e., child node device 1 reports its model parameter once after training its local model N1 times; the model parameter reported by child node device 2 is R2 and the number of times child node device 2 trained its local model is N2, i.e., child node device 2 reports its model parameter once after training its local model N2 times. The model parameter R of the global model can then be expressed as:
R=(R1*N1+R2*N2)/(N1+N2) (1)
In other embodiments, when the weight information includes the size of all sample data used for training the local model or the size of the sample data used for each training of the local model, the master node device determines that the value of a model parameter of the global model is equal to the sum, over all local models, of the value of that model parameter of each local model multiplied by the parameter factor of that local model; the parameter factor of a local model is equal to the ratio of the data characteristic of the samples of that local model to the sum of the data characteristics of the samples of all local models.
For example, if the model parameter reported by the child node device 1 is R1, the size of all sample data for the child node device 1 to train the local model is N1; the model parameter reported by the child node device 2 is R2, the size of all sample data for training the local model by the child node device 2 is N2, the model parameter reported by the child node device k is Rk, and the size of all sample data for training the local model by the child node device k is Nk. The model parameters R of the global model can be expressed as:
R=a1*R1+a2*R2+...+ak*Rk (2)
wherein ai is the parameter factor of the local model reported by child node device i,
ai=Ni/(N1+N2+...+Nk) (3)
In still other embodiments, when the weight information includes a weight factor value corresponding to the number of times the local model is trained, the master node device determines that the value of a model parameter of the global model is equal to the sum, over all local models, of the value of that model parameter of each local model multiplied by the weight factor value corresponding to the number of times that local model was trained, divided by the sum of the weight factor values corresponding to the numbers of times all local models were trained.
For example, the child node device 1 and the child node device 2 report the model parameters and the weight information to the master node device. If the model parameter reported by the child node device 1 is R1, the weight factor value corresponding to the number of times of training the local model is M1; if the model parameter reported by the child node device 2 is R2, the weight factor value corresponding to the number of times of training the local model is M2. The master node device determines the model parameters of the global model as:
R=(R1*M1+R2*M2)/(M1+M2) (4)
In still other embodiments, when the weight information includes a weight factor value corresponding to the size of all sample data used for training the local model, or a weight factor value corresponding to the size of the sample data used for each training of the local model, the master node device determines that the value of a model parameter of the global model is equal to the sum, over all local models, of the value of that model parameter of each local model multiplied by the parameter factor of that local model; the parameter factor of a local model is equal to the ratio of the weight factor value of that local model to the sum of the weight factor values of all local models.
For example, if the model parameter reported by the child node device 1 is R1, the weight factor value corresponding to the size of all sample data of the child node device 1 for training the local model is M1; the model parameter reported by the child node device 2 is R2, the weight factor value corresponding to the size of all sample data of the local model trained by the child node device 2 is M2, the model parameter reported by the child node device k is Rk, and the weight factor value corresponding to the size of all sample data of the local model trained by the child node device k is Mk. The model parameters R of the global model can be expressed as:
R=b1*R1+b2*R2+...+bk*Rk (5)
wherein bi is the parameter factor of the local model reported by child node device i,
bi=Mi/(M1+M2+...+Mk) (6)
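The aggregation rules of formulas (1) to (6) all take the same form: a weighted average of the reported model parameters, with the weights taken either from the reported data characteristics (Nk) or from the reported weight factor values (Mk). The following Python sketch, with hypothetical numbers, illustrates that form; the use of NumPy arrays for the model parameters is an assumption made for the sketch only.

```python
import numpy as np

def aggregate_global_model(local_params, weights):
    # R = sum_k Rk * (wk / sum_j wj): formulas (1)-(3) when wk is the data
    # characteristic Nk, formulas (4)-(6) when wk is the weight factor value Mk.
    local_params = [np.asarray(p, dtype=float) for p in local_params]
    weights = np.asarray(weights, dtype=float)
    factors = weights / weights.sum()            # the parameter factors ai / bi
    return sum(f * p for f, p in zip(factors, local_params))

# Example matching formula (1): child node device 1 trained its local model
# N1 = 10 times, child node device 2 trained its local model N2 = 2 times.
R1 = np.array([0.2, 0.8])
R2 = np.array([0.5, 0.1])
print(aggregate_global_model([R1, R2], [10, 2]))  # == (R1*10 + R2*2) / 12
```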
in this embodiment, the child node device may be a first terminal device, and the master node device may be a second terminal device or a network device.
Taking as an example the case in which the weight information corresponding to the local model includes the data characteristics of the samples used for training the local model, a detailed processing flow of the model training method based on federated learning provided in the embodiments of the present application is shown in fig. 8 and includes:
step S401, the child node device sends model parameters of the local model and data characteristics of samples used for training the local model to the main node device.
The child node device sends the model parameters of the local model and the data characteristics of the samples used for training the local model to the master node device through service layer data, UCI or RRC signaling. The data characteristics of the samples used to train the local model include at least one of: the size of all sample data used for training the local model, the size of the sample data used for training the local model each time, and the number of times of training the local model.
Step S402, the master node device synthesizes a global model based on the model parameters of the local models sent by the child node devices and the data characteristics of the samples used for training the local models.
In specific implementation, if the data characteristics of the samples used for training the local models include the number of times of training the local models, then, as shown in formula (1) above, the master node device determines that the value of a model parameter of the global model is equal to the sum, over all local models, of the value of that model parameter multiplied by the number of times the local model was trained, divided by the total number of times all local models were trained.
In specific implementation, if the data characteristics of the samples used for training the local model include the size of all sample data used for training the local model or the size of the sample data used for each training of the local model, then, as shown in formulas (2) and (3) above, the master node device determines that the value of a model parameter of the global model is equal to the sum, over all local models, of the value of that model parameter multiplied by the parameter factor of the local model; the parameter factor of a local model is equal to the ratio of the data characteristic of the samples of that local model to the sum of the data characteristics of the samples of all local models.
In step S403, the master node device transmits the global model to the child node devices.
Step S404, the child node device sends model parameters of the local model and data characteristics of samples used for training the local model to the main node device.
Here, the child node device repeats the operation of step S401; the model parameters sent in step S401 and step S404 may be different or the same, and the data characteristics of the samples used for training the local model sent in step S401 and step S404 may be different or the same. Correspondingly, the master node device repeats the operations of steps S402 to S403 until the training of the global model is completed.
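For illustration, the round trip of steps S401 to S404 can be sketched as a short simulation. The message layout (plain Python dictionaries) and the random perturbation standing in for real local training are assumptions made only to keep the sketch runnable.

```python
import numpy as np

def run_rounds(child_sample_counts, rounds=3, dim=2, seed=0):
    rng = np.random.default_rng(seed)
    global_model = np.zeros(dim)
    for _ in range(rounds):
        reports = []
        for n_samples in child_sample_counts:
            # Steps S401/S404: each child node device produces an updated local
            # model (a random perturbation stands in for real training) and
            # reports it together with the data characteristic of its samples.
            local_model = global_model + rng.normal(scale=1.0 / n_samples, size=dim)
            reports.append({"model_parameters": local_model,
                            "all_sample_data_size": n_samples})
        # Step S402: the master node device synthesizes the global model with
        # the data characteristics as weights (formulas (2) and (3)).
        total = sum(r["all_sample_data_size"] for r in reports)
        global_model = sum(r["all_sample_data_size"] / total * r["model_parameters"]
                           for r in reports)
        # Step S403: the global model is delivered to the child node devices and
        # reused as the starting point of the next round.
    return global_model

# Two child node devices with very different training set sizes.
print(run_rounds([1000, 10]))
```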
Taking as an example the case in which the weight information corresponding to the local model includes the weight factor value corresponding to the data characteristics of the samples used for training the local model, another detailed processing flow of the model training method based on federated learning provided in the embodiments of the present application is shown in fig. 9 and includes:
step S501, the child node device obtains the corresponding relation between the data characteristics of the samples used for training the local model and the weight factor value.
In some embodiments, the child node device may determine, according to a predetermined agreement, a correspondence between data characteristics of samples used for training the local model and the weight factor value; the child node device may also determine a correspondence relationship between the data characteristics of the samples used for training the local model and the weight factor value by receiving first configuration information sent by a network device.
Step S502, the child node device determines a weight factor value according to the data characteristics of the sample of the local model trained by the child node device.
In specific implementation, the child node device looks up, in the correspondence between the data characteristics of the samples used for training the local model and the weight factor values, the weight factor value corresponding to the data characteristics of the samples it used to train its own local model.
In step S503, the child node device transmits the model parameters and the weight factor values of the local model to the master node device.
In some embodiments, the child node device transmits the model parameters and the weight factor values of the local model to the master node device through service layer data, or UCI, or RRC signaling.
Step S504, the master node device synthesizes a global model based on the model parameters and the weight factor values of the local model transmitted by the child node devices.
In a specific implementation, if the weight factor value is the weight factor value corresponding to the number of times the local model is trained, the master node device determines, based on formula (4) above, that the value of a model parameter of the global model is equal to the sum, over all local models, of the value of that model parameter multiplied by the weight factor value corresponding to the number of times the local model was trained, divided by the sum of the weight factor values corresponding to the numbers of times all local models were trained.
In specific implementation, if the weight factor value is the weight factor value corresponding to the size of all sample data used for training the local model or the weight factor value corresponding to the size of the sample data used for each training of the local model, the master node device determines, based on formulas (5) and (6) above, that the value of a model parameter of the global model is equal to the sum, over all local models, of the value of that model parameter multiplied by the parameter factor of the local model; the parameter factor of a local model is equal to the ratio of the weight factor value of that local model to the sum of the weight factor values of all local models.
Step S505, the master node device sends the global model to the child node devices.
In step S506, the child node device transmits the model parameters and the weight factor values of the local model to the master node device.
Here, the child node device repeats the operation of step S503; the model parameters sent by the child node device to the master node device in step S503 and step S506 may be the same or different, and the weight factor values sent in step S503 and step S506 may be the same or different. Correspondingly, the master node device repeats the operations of steps S504 to S505 until the training of the global model is completed.
It should be understood that, in the various embodiments of the present application, the sequence numbers of the above-mentioned processes do not mean the execution sequence, and the execution sequence of each process should be determined by its function and inherent logic, and should not constitute any limitation to the implementation process of the embodiments of the present application.
In specific implementation, the model training method based on federated learning provided in the embodiments of the present application can be applied at least to the following scenarios: channel model generation, service user prediction, and intelligent traffic decision.
Taking the application of the model training method based on federated learning provided in the embodiments of the present application to a channel model generation scenario as an example: the terminal device collects channel quality data and uses the channel quality data as samples to train a local model; the terminal device sends the model parameters of the trained local model to the network device, and the network device determines the model parameters of the global model according to the model parameters of the local models sent by the respective terminal devices; the global model is used to determine channel quality.
In order to implement the above federated learning-based model training method, an embodiment of the present application provides a child node device. An optional structural diagram of the child node device 600 is shown in fig. 10 and includes:
a first transmitting unit 601 configured to transmit model parameters of a local model and weight information corresponding to the local model;
the model parameters and the weight information are used for the master node device to train a global model.
In some embodiments, the weight information comprises: data features of samples used to train the local model.
In some embodiments, the weight information corresponding to the local model includes: a weight factor value corresponding to a data characteristic of a sample used to train the local model.
In some embodiments, the correspondence of the data characteristics of the samples used to train the local model to the weight factor values is configured by the master node device; or, the corresponding relation between the data characteristics of the samples for training the local model and the weight factor values is predetermined.
In some embodiments, the correspondence between the data characteristics of the samples used for training the local model and the weight factor values is configured by any one of the following:
service layer data, RRC signaling, broadcast messages, DCI, MAC CE, and PDCCH signaling.
In some embodiments, the data characteristics of the samples used to train the local model include at least one of: the size of all sample data used for training the local model, the size of the sample data used for training the local model each time and the number of times of training the local model.
In some embodiments, the weight information is transmitted via service layer data, or UCI, or RRC signaling; and/or, the weight information is carried on PUCCH or PUSCH.
In some embodiments, the model parameters are transmitted via service layer data, or UCI, or RRC signaling; and/or, the model parameters are carried on PUCCH or PUSCH.
In some embodiments, the child node apparatus 600 includes: a first terminal device.
In some embodiments, the master node device comprises: a second terminal device or a network device.
In order to implement the above federated learning-based model training method, an embodiment of the present application provides a master node device. An optional structural diagram of the master node device 800 is shown in fig. 11 and includes:
a first receiving unit 801 configured to receive model parameters of a local model and weight information corresponding to the local model, which are sent by at least two child node devices;
a processing unit 802 configured to train a global model based on the model parameters and the weight information.
In some embodiments, the weight information comprises: data features of samples used to train the local model.
In some embodiments, the weight information corresponding to the local model includes: a weighting factor value corresponding to a data characteristic of a sample used to train the local model.
In some embodiments, the correspondence between the data characteristics of the samples used for training the local model and the weight factor values is predetermined.
In some embodiments, the master node apparatus 800 further comprises:
a second sending unit 803, configured to send first configuration information, where the first configuration information is used to determine a correspondence relationship between the data features of the samples used for training the local model and the weight factor values.
In some embodiments, the first configuration information carries any one of:
service layer data, RRC signaling, broadcast messages, DCI, MAC CE, and PDCCH signaling.
In some embodiments, the data characteristics of the samples used to train the local model include at least one of: the size of all sample data used for training the local model, the size of the sample data used for training the local model each time and the number of times of training the local model.
In some embodiments, the processing unit 802 is configured to determine that, in a case that the weight information includes the number of times of training the local models, the value of the model parameter of the global model is equal to the sum of the values obtained by multiplying the value of the model parameter of each local model by the number of times of training the local models and dividing the sum by the number of times of training all the local models.
In some embodiments, the processing unit 802 is configured to determine that, in a case that the weight information includes the size of all sample data used to train the local model or the size of the sample data used for each training of the local model, the value of a model parameter of the global model is equal to the sum, over all local models, of the value of the corresponding model parameter of each local model multiplied by the parameter factor of that local model;
wherein the parameter factor of a local model is equal to the ratio of the data characteristic of the samples of that local model to the sum of the data characteristics of the samples of all the local models.
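To make the two aggregation rules above concrete, the following Python sketch gives one possible reading of them: each local model contributes to a global parameter in proportion to its reported data characteristic (the number of training passes, or the sample size). The function name and data layout are assumptions for illustration, not a normative implementation of the present application.

```python
from typing import List

def aggregate_by_characteristic(local_params: List[List[float]],
                                characteristics: List[float]) -> List[float]:
    """Weighted average: each local model contributes in proportion to its
    reported data characteristic (training count or sample size)."""
    total = sum(characteristics)
    num_dims = len(local_params[0])
    return [
        sum(params[d] * c for params, c in zip(local_params, characteristics)) / total
        for d in range(num_dims)
    ]

# Two local models with two parameters each, weighted by training passes (10 vs 30):
local_params = [[0.2, 0.4], [0.6, 0.8]]
print(aggregate_by_characteristic(local_params, [10, 30]))       # -> [0.5, 0.7]
# The same models weighted by total sample size (5000 vs 15000) give the same
# result, because the 1:3 ratio between the two local models is unchanged.
print(aggregate_by_characteristic(local_params, [5000, 15000]))
```

With these numbers, the second local model, trained three times as often (or on three times as much data), pulls the global parameters three times as strongly toward its own values.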
In some embodiments, the processing unit 802 is configured to determine that, in a case that the weight information includes a weight factor value corresponding to the number of times the local model is trained, the value of a model parameter of the global model is equal to the sum, over all local models, of the value of the corresponding model parameter of each local model multiplied by the weight factor value corresponding to the number of times that local model is trained, divided by the sum of the weight factor values corresponding to the number of times all the local models are trained.
In some embodiments, the processing unit 802 is configured to determine that, in a case that the weight information includes a weight factor value corresponding to the size of all sample data used to train the local model or a weight factor value corresponding to the size of the sample data used for each training of the local model, the value of a model parameter of the global model is equal to the sum, over all local models, of the value of the corresponding model parameter of each local model multiplied by the parameter factor of that local model;
wherein the parameter factor of a local model is equal to the ratio of the weight factor value of that local model to the sum of the weight factor values of all the local models.
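Where the correspondence between data characteristics and weight factor values is configured by the master node device (for example via the first configuration information), the aggregation can be read as in the following sketch. The mapping from training counts to weight factor values is a made-up example, and all function names are assumptions for illustration only.

```python
from typing import List

def weight_factor_for_count(training_count: int) -> float:
    """Assumed example of a configured correspondence: ranges of training
    counts mapped to weight factor values (purely illustrative numbers)."""
    if training_count < 10:
        return 0.5
    if training_count < 50:
        return 1.0
    return 2.0

def aggregate_by_weight_factor(local_params: List[List[float]],
                               weight_factors: List[float]) -> List[float]:
    """Each local model is scaled by its weight factor value, and the result
    is normalised by the sum of all weight factor values."""
    total = sum(weight_factors)
    return [
        sum(params[d] * w for params, w in zip(local_params, weight_factors)) / total
        for d in range(len(local_params[0]))
    ]

counts = [8, 60]                                          # reported training passes
factors = [weight_factor_for_count(c) for c in counts]    # -> [0.5, 2.0]
# Approximately [0.52, 0.72] with the illustrative factors above.
print(aggregate_by_weight_factor([[0.2, 0.4], [0.6, 0.8]], factors))
```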
In some embodiments, the weight information is transmitted via service layer data, UCI, or RRC signaling; and/or the weight information is carried on a PUCCH or a PUSCH.
In some embodiments, the model parameters are transmitted via service layer data, UCI, or RRC signaling;
and/or the model parameters are carried on a PUCCH or a PUSCH.
In some embodiments, the child node device comprises: a first terminal device.
In some embodiments, the master node device comprises: a second terminal device or a network device.
An embodiment of the present application further provides a child node device, which includes a processor and a memory for storing a computer program capable of running on the processor, wherein the processor is configured to execute, when running the computer program, the steps of the model training method based on federal learning performed by the child node device.
The embodiment of the present application further provides a master node device, which includes a processor and a memory for storing a computer program capable of running on the processor, wherein the processor is configured to execute the steps of the model training method based on federal learning, executed by the master node device, when the computer program is run.
An embodiment of the present application further provides a chip, including: a processor configured to call and run a computer program from a memory, so that a device in which the chip is installed executes the model training method based on federal learning performed by the child node device.
An embodiment of the present application further provides a chip, including: a processor configured to call and run a computer program from a memory, so that a device in which the chip is installed executes the model training method based on federal learning performed by the master node device.
The embodiment of the application also provides a storage medium, which stores an executable program, and when the executable program is executed by a processor, the method for model training based on federal learning executed by the child node device is realized.
An embodiment of the present application further provides a storage medium, in which an executable program is stored, and when the executable program is executed by a processor, the method for model training based on federal learning performed by a master node device is implemented.
An embodiment of the present application further provides a computer program product, which includes computer program instructions, where the computer program instructions enable a computer to execute the federate learning-based model training method executed by the child node device.
An embodiment of the present application further provides a computer program product, which includes computer program instructions, where the computer program instructions enable a computer to execute the method for model training based on federal learning, which is executed by the master node device.
The embodiment of the application also provides a computer program, where the computer program enables a computer to execute the model training method based on federal learning performed by the child node device.
Embodiments of the present application further provide a computer program, where the computer program enables a computer to execute the method for model training based on federal learning, which is executed by the master node device.
Fig. 12 is a schematic diagram of a hardware component structure of an electronic device (a master node device or a child node device) according to an embodiment of the present application, where the electronic device 700 includes: at least one processor 701, a memory 702, and at least one network interface 704. The various components in the electronic device 700 are coupled together by a bus system 705. It is understood that the bus system 705 is used to enable connection and communication between these components. In addition to a data bus, the bus system 705 includes a power bus, a control bus, and a status signal bus. However, for clarity of illustration, the various buses are labeled in Fig. 12 as the bus system 705.
It will be appreciated that the memory 702 can be either volatile memory or non-volatile memory, and can include both volatile and non-volatile memory. The non-volatile memory may be a Read-Only Memory (ROM), a Programmable Read-Only Memory (PROM), an Erasable Programmable Read-Only Memory (EPROM), an Electrically Erasable Programmable Read-Only Memory (EEPROM), a ferromagnetic random access memory (FRAM), a flash memory, a magnetic surface memory, an optical disc, or a Compact Disc Read-Only Memory (CD-ROM); the magnetic surface memory may be a disk memory or a tape memory. The volatile memory may be a Random Access Memory (RAM), which serves as an external cache. By way of illustration and not limitation, many forms of RAM are available, such as Static Random Access Memory (SRAM), Synchronous Static Random Access Memory (SSRAM), Dynamic Random Access Memory (DRAM), Synchronous Dynamic Random Access Memory (SDRAM), Double Data Rate Synchronous Dynamic Random Access Memory (DDRSDRAM), Enhanced Synchronous Dynamic Random Access Memory (ESDRAM), SyncLink Dynamic Random Access Memory (SLDRAM), and Direct Rambus Random Access Memory (DRRAM). The memory 702 described in the embodiments of the present application is intended to comprise, without being limited to, these and any other suitable types of memory.
The memory 702 in the present embodiment is used to store various types of data to support the operation of the electronic device 700. Examples of such data include: any computer program for operating on electronic device 700, such as application 7022. A program for implementing the methods according to embodiments of the present application may be included in application 7022.
The method disclosed in the embodiments of the present application may be applied to the processor 701, or implemented by the processor 701. The processor 701 may be an integrated circuit chip having signal processing capabilities. In implementation, the steps of the above method may be implemented by integrated logic circuits of hardware or instructions in the form of software in the processor 701. The Processor 701 may be a general purpose Processor, a Digital Signal Processor (DSP), or other programmable logic device, discrete gate or transistor logic device, discrete hardware components, or the like. The processor 701 may implement or perform the methods, steps, and logic blocks disclosed in the embodiments of the present application. A general purpose processor may be a microprocessor or any conventional processor or the like. The steps of the method disclosed in the embodiments of the present application may be directly implemented by a hardware decoding processor, or implemented by a combination of hardware and software modules in the decoding processor. The software modules may be located in a storage medium located in the memory 702, and the processor 701 may read the information in the memory 702 and perform the steps of the aforementioned methods in conjunction with its hardware.
In an exemplary embodiment, the electronic device 700 may be implemented by one or more Application Specific Integrated Circuits (ASICs), DSPs, Programmable Logic Devices (PLDs), Complex Programmable Logic Devices (CPLDs), FPGAs, general purpose processors, controllers, MCUs, MPUs, or other electronic components for performing the foregoing methods.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
It should be understood that the terms "system" and "network" are often used interchangeably in this application. The term "and/or" in this application describes only an association relationship between associated objects, and indicates that three relationships may exist; for example, A and/or B may mean: A exists alone, both A and B exist, or B exists alone. In addition, the character "/" in this application generally indicates that the former and latter associated objects are in an "or" relationship.
The above description is only exemplary of the present application and should not be taken as limiting the scope of the present application, as any modifications, equivalents, improvements, etc. made within the spirit and principle of the present application should be included in the scope of the present application.

Claims (60)

  1. A method of federal learning based model training, the method comprising:
    the method comprises the steps that child node equipment sends model parameters of a local model and weight information corresponding to the local model;
    the model parameters and the weight information are used for the master node device to train a global model.
  2. The method of claim 1, wherein the weight information corresponding to the local model comprises:
    data features of samples used to train the local model.
  3. The method of claim 1, wherein the weight information corresponding to the local model comprises:
    a weight factor value corresponding to a data characteristic of a sample used to train the local model.
  4. The method of claim 3, wherein the correspondence of the data characteristics of the samples used to train the local model to the weight factor values is configured by the master node device;
    or, the corresponding relation between the data characteristics of the samples for training the local model and the weight factor values is predetermined.
  5. The method of claim 4, wherein the correspondence of the data features of the samples used to train the local model to the weight factor values is configured by any one of:
    service layer data, radio resource control (RRC) signaling, broadcast messages, downlink control information (DCI), a media access control control element (MAC CE), and physical downlink control channel (PDCCH) signaling.
  6. The method of any of claims 2 to 5, wherein the data characteristics of the samples used to train the local model include at least one of:
    the size of all sample data used for training the local model, the size of the sample data used for training the local model each time and the number of times of training the local model.
  7. The method according to any one of claims 1 to 6, wherein the weight information is transmitted by service layer data, or uplink control information (UCI), or RRC signaling;
    and/or the weight information is carried on PUCCH or PUSCH.
  8. The method according to any of claims 1 to 7, wherein the model parameters are transmitted by traffic layer data, or UCI, or RRC signaling;
    and/or the model parameters are carried on PUCCH or PUSCH.
  9. The method of any of claims 1 to 8, wherein the child node device comprises: a first terminal device.
  10. The method of any of claims 1 to 9, wherein the master node device comprises: a second terminal device or a network device.
  11. A method of federal learning based model training, the method comprising:
    the method comprises the steps that a master node device receives model parameters of a local model and weight information corresponding to the local model, which are sent by at least two child node devices;
    the master node device trains a global model based on the model parameters and the weight information.
  12. The method of claim 11, wherein the weight information comprises:
    data features of samples used to train the local model.
  13. The method of claim 11, wherein the weight information corresponding to the local model comprises:
    a weighting factor value corresponding to a data characteristic of a sample used to train the local model.
  14. The method of claim 13, wherein the correspondence of the data features of the samples used to train the local model to the weight factor values is configured by the master node device;
    or, the corresponding relation between the data characteristics of the samples for training the local model and the weight factor values is agreed in advance.
  15. The method of claim 13, wherein the method further comprises:
    the master node device sends first configuration information, wherein the first configuration information is used to determine the correspondence between the data characteristics of the samples for training the local model and the weight factor values.
  16. The method of claim 15, wherein the first configuration information is carried by any one of the following:
    service layer data, radio resource control (RRC) signaling, broadcast messages, downlink control information (DCI), a media access control control element (MAC CE), and physical downlink control channel (PDCCH) signaling.
  17. The method of any of claims 12 to 16, wherein the data characteristics of the samples used to train the local model include at least one of:
    the size of all sample data used for training the local model, the size of the sample data used for training the local model each time and the number of times of training the local model.
  18. The method of claim 12, wherein the master node device trains a global model based on the model parameters and the weight information, comprising:
    under the condition that the weight information comprises the number of times of training the local model, determining that the value of the model parameter of the global model is equal to the sum, over all local models, of the value of the model parameter of each local model multiplied by the number of times that local model is trained, divided by the total number of times all the local models are trained.
  19. The method of claim 12, wherein the master node device trains a global model based on the model parameters and the weight information, comprising:
    under the condition that the weight information comprises the size of all sample data for training the local model or the size of the sample data for training the local model each time, determining that the value of the model parameter of the global model is equal to the sum, over all local models, of the value of the model parameter of each local model multiplied by the parameter factor of that local model;
    wherein the parameter factor of a local model is equal to the ratio of the data characteristic of the samples of that local model to the sum of the data characteristics of the samples of all the local models.
  20. The method of any of claims 13 to 16, wherein the master node device training a global model based on the model parameters and the weight information comprises:
    under the condition that the weight information comprises a weight factor value corresponding to the number of times of training the local model, determining that the value of the model parameter of the global model is equal to the sum, over all local models, of the value of the model parameter of each local model multiplied by the weight factor value corresponding to the number of times that local model is trained, divided by the sum of the weight factor values corresponding to the number of times all the local models are trained.
  21. The method of any of claims 13 to 16, wherein the master node device training a global model based on the model parameters and the weight information comprises:
    under the condition that the weight information comprises a weight factor value corresponding to the size of all sample data for training the local model or a weight factor value corresponding to the size of the sample data for training the local model each time, determining that the value of the model parameter of the global model is equal to the sum, over all local models, of the value of the model parameter of each local model multiplied by the parameter factor of that local model;
    wherein the parameter factor of the local model is equal to the ratio of the weight factor value of the local model to the sum of the weight factor values of all local models.
  22. The method according to any one of claims 11 to 21, wherein the weight information is transmitted by service layer data, or uplink control information (UCI), or RRC signaling;
    and/or, the weight information is carried on a PUCCH or a Physical Uplink Shared Channel (PUSCH).
  23. The method according to any one of claims 11 to 22, wherein the model parameters are transmitted by service layer data, or UCI, or RRC signaling;
    and/or the model parameters are carried on PUCCH or PUSCH.
  24. The method of any of claims 11 to 23, wherein the child node device comprises: a first terminal device.
  25. The method of any of claims 11 to 24, wherein the master node device comprises: a second terminal device or a network device.
  26. A child node device, the child node device comprising:
    a first sending unit configured to send model parameters of a local model and weight information corresponding to the local model;
    the model parameters and the weight information are used for the master node device to train a global model.
  27. The child node device of claim 26, wherein the weight information corresponding to the local model comprises:
    data features of samples used to train the local model.
  28. The child node device of claim 26, wherein the weight information corresponding to the local model comprises:
    a weighting factor value corresponding to a data characteristic of a sample used to train the local model.
  29. The child node device of claim 28, wherein the correspondence of the data characteristics of the samples used to train the local model to the weight factor values is configured by the master node device;
    or, the corresponding relation between the data characteristics of the samples for training the local model and the weight factor values is predetermined.
  30. The child node device of claim 29, wherein the correspondence of the data characteristics of the samples used to train the local model to the weight factor values is configured by any one of:
    service layer data, radio resource control (RRC) signaling, broadcast messages, downlink control information (DCI), a media access control control element (MAC CE), and physical downlink control channel (PDCCH) signaling.
  31. The child node device of any of claims 27 to 30, wherein the data characteristics of the samples used to train the local model comprise at least one of:
    the size of all sample data used for training the local model, the size of the sample data used for training the local model each time and the number of times of training the local model.
  32. The child node device of any one of claims 26 to 31, wherein the weight information is transmitted via service layer data, or uplink control information (UCI), or RRC signaling;
    and/or the weight information is carried on PUCCH or PUSCH.
  33. The child node device of any one of claims 26 to 32, wherein the model parameters are transmitted via service layer data, or UCI, or RRC signaling;
    and/or, the model parameters are carried on PUCCH or PUSCH.
  34. The child node device of any one of claims 26 to 33, wherein the child node device comprises: a first terminal device.
  35. The child node device of any one of claims 26 to 33, wherein the master node device comprises: a second terminal device or a network device.
  36. A master node apparatus, the master node apparatus comprising:
    a first receiving unit configured to receive model parameters of a local model and weight information corresponding to the local model, which are sent by at least two child node devices;
    a processing unit configured to train a global model based on the model parameters and the weight information.
  37. The master node apparatus of claim 36, wherein the weight information comprises:
    data features of samples used to train the local model.
  38. The master node apparatus of claim 36, wherein the weight information corresponding to the local model comprises:
    a weight factor value corresponding to a data characteristic of a sample used to train the local model.
  39. The master node apparatus of claim 38, wherein the correspondence of the data features of the samples used to train the local model to the weight factor values is configured by the master node apparatus;
    or, the corresponding relation between the data characteristics of the samples for training the local model and the weight factor values is predetermined.
  40. The master node apparatus of claim 38, wherein the master node apparatus further comprises:
    a second sending unit configured to send first configuration information, where the first configuration information is used to determine a correspondence relationship between data features of the samples used for training the local model and the weight factor values.
  41. The master node device of claim 40, wherein the first configuration information carries any one of:
    service layer data, radio resource control (RRC) signaling, broadcast messages, downlink control information (DCI), a media access control control element (MAC CE), and physical downlink control channel (PDCCH) signaling.
  42. The master node apparatus of any of claims 37 to 41, wherein the data characteristics of the samples used to train the local model comprise at least one of:
    the size of all sample data used for training the local model, the size of the sample data used for training the local model each time and the number of times of training the local model.
  43. The master node apparatus of claim 37, wherein the processing unit is configured to determine that, in a case where the weight information includes the number of times the local model is trained, the value of the model parameter of the global model is equal to the sum, over all local models, of the value of the model parameter of each local model multiplied by the number of times that local model is trained, divided by the total number of times all the local models are trained.
  44. The master node apparatus of claim 37, wherein the processing unit is configured to determine that, in a case that the weight information includes the size of all sample data for training the local model or the size of the sample data for each training of the local model, the value of the model parameter of the global model is equal to the sum, over all local models, of the value of the model parameter of each local model multiplied by the parameter factor of that local model;
    wherein the parameter factor of a local model is equal to the ratio of the data characteristic of the samples of that local model to the sum of the data characteristics of the samples of all the local models.
  45. The master node apparatus of any one of claims 38 to 41, wherein the processing unit is configured to determine that, in a case where the weight information includes a weight factor value corresponding to the number of times the local models are trained, the values of the model parameters of the global model are equal to the sum of values obtained by multiplying the value of the model parameter of each local model by the weight factor value corresponding to the number of times the local model is trained and dividing the sum by the sum of the weight factor values corresponding to the number of times all the local models are trained.
  46. The master node apparatus according to any one of claims 38 to 41, wherein the processing unit is configured to determine that, in a case where the weight information includes a weight factor value corresponding to the size of all sample data for training the local model or a weight factor value corresponding to the size of the sample data for training the local model each time, the value of the model parameter of the global model is equal to the sum, over all local models, of the value of the model parameter of each local model multiplied by the parameter factor of that local model;
    wherein the parameter factor of the local model is equal to the ratio of the weight factor value of the local model to the sum of the weight factor values of all local models.
  47. The master node apparatus of any one of claims 36 to 46, wherein the weight information is transmitted via service layer data, or uplink control information (UCI), or RRC signaling;
    and/or, the weight information is carried on a PUCCH or a Physical Uplink Shared Channel (PUSCH).
  48. The master node apparatus of any one of claims 36 to 47, wherein the model parameters are transmitted via service layer data, or UCI, or RRC signaling;
    and/or, the model parameters are carried on PUCCH or PUSCH.
  49. The master node apparatus of any one of claims 36 to 47, wherein the child node apparatus comprises: a first terminal device.
  50. The master node apparatus of any of claims 36 to 48, wherein the master node apparatus comprises: a second terminal device or a network device.
  51. A terminal device comprising a processor and a memory for storing a computer program capable of running on the processor, wherein,
    the processor is configured to execute the steps of the method for model training based on federated learning of any one of claims 1 to 10 when the computer program is executed.
  52. A network device comprising a processor and a memory for storing a computer program capable of running on the processor, wherein,
    the processor, when executing the computer program, is configured to perform the steps of the method for model training based on federated learning of any one of claims 11 to 25.
  53. A storage medium storing an executable program which, when executed by a processor, implements the federal learning based model training method of any one of claims 1 to 10.
  54. A storage medium storing an executable program which, when executed by a processor, implements the federal learning based model training method of any of claims 11 to 25.
  55. A computer program product comprising computer program instructions to cause a computer to perform a method of federal learning based model training as claimed in any of claims 1 to 10.
  56. A computer program product comprising computer program instructions to cause a computer to perform the method of federal learning based model training as claimed in any of claims 11 to 25.
  57. A computer program for causing a computer to perform the method of federal learning based model training as claimed in any one of claims 1 to 10.
  58. A computer program for causing a computer to perform the method of federal learning based model training as claimed in any one of claims 11 to 25.
  59. A chip, comprising: a processor for calling and running a computer program from a memory so that a device on which the chip is installed performs the federal learning based model training method as claimed in any one of claims 1 to 10.
  60. A chip, comprising: a processor for calling and running a computer program from a memory so that a device on which the chip is installed performs the federal learning based model training method as claimed in any one of claims 11 to 25.
CN202080098459.3A 2020-03-11 2020-03-11 Model training method based on federal learning, electronic equipment and storage medium Pending CN115280338A (en)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2020/078721 WO2021179196A1 (en) 2020-03-11 2020-03-11 Federated learning-based model training method, electronic device, and storage medium

Publications (1)

Publication Number Publication Date
CN115280338A true CN115280338A (en) 2022-11-01

Family

ID=77670362

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202080098459.3A Pending CN115280338A (en) 2020-03-11 2020-03-11 Model training method based on federal learning, electronic equipment and storage medium

Country Status (2)

Country Link
CN (1) CN115280338A (en)
WO (1) WO2021179196A1 (en)

Also Published As

Publication number Publication date
WO2021179196A1 (en) 2021-09-16

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination