WO2022041947A1 - Method for updating machine learning model, and communication apparatus


Publication number
WO2022041947A1
Authority
WO
WIPO (PCT)
Prior art keywords
terminal device
model
network device
training
machine learning
Application number
PCT/CN2021/100003
Other languages
French (fr)
Chinese (zh)
Inventor
杨水根
晋英豪
秦东润
周彧
Original Assignee
华为技术有限公司
Application filed by 华为技术有限公司
Publication of WO2022041947A1


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 20/00 Machine learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks

Definitions

  • the present application relates to the technical field of artificial intelligence, and in particular, to a method and a communication device for updating a machine learning model.
  • Wireless communication networks are developing toward diversification, broader bandwidth, integration, and intelligence. Wireless transmission uses ever higher frequency spectrum, wider bandwidths, and more antennas, and traditional communication methods are becoming too complex to guarantee performance. Furthermore, with the explosive growth of smart terminals and applications, the behavior and performance factors of wireless communication networks are more dynamic and unpredictable than in the past. Operating increasingly complex wireless communication networks at low cost and high efficiency is a challenge that operators face today.
  • AI: artificial intelligence; ML: machine learning.
  • The data in terminal devices generally involves user privacy.
  • To avoid leakage of user privacy data, each terminal device generally uses its own data locally to train the machine learning model pre-distributed by the network device, and then sends the obtained model update parameters to the network device. The network device aggregates the model update parameters sent by each participant (that is, each terminal device that performs model training) and then directly updates its local machine learning model.
  • However, the user data in each terminal device is generally different, the capability of each terminal device to perform model training is generally different, and the configuration information each terminal device uses for model training is configured by the terminal itself or by its user.
  • Moreover, the model update requires the model update parameters reported by all participants. Because there are time differences between the model update parameters reported by the respective terminal devices, the time the network device needs to update the local model increases, the convergence speed is slower, and the update efficiency is lower.
  • Embodiments of the present application provide a method and a communication device for updating a machine learning model, which are used to improve the convergence speed of updating the machine learning model, so as to improve the updating efficiency of the machine learning model.
  • a method for updating a machine learning model is provided, and the method can be applied to a network device or a chip inside the network device.
  • In this method, the network device determines the corresponding model training configuration information for the terminal device according to the computing capability of the terminal device, sends the model training configuration information to the terminal device, receives the model update parameter sent by the terminal device, and then updates the second machine learning model in the network device according to the received model update parameter.
  • the model update parameter sent by the terminal device is an update parameter obtained by the terminal device performing local training on the local first machine learning model according to the model training configuration information sent by the network device.
  • the machine learning model local to the terminal device is called the first machine learning model
  • the machine learning model local to the network device is called the second machine learning model
  • the first machine learning model is distributed by the network device for the terminal device.
  • the first machine learning model and the second machine learning model are of the same type of machine learning model, or the first machine learning model and the second machine learning model are different types of machine learning models.
  • the model update parameter information is used for local model update, and the first machine learning model and the second machine learning model are the same type of machine learning model.
  • In this method, the network device allocates corresponding model training configuration information to each terminal device according to the computing capability of that terminal device, so that the model training configuration information each terminal device uses to train its local machine learning model matches its own computing capability. In contrast to the related art, in which each terminal device independently selects its model training configuration information, in this solution the network side uniformly configures the model training configuration information for each terminal device according to that device's computing capability.
  • This reduces the time differences caused by the differing capabilities of the terminal devices during model training, helping to ensure that the terminal devices complete model training within roughly the same time and report their respective model update parameters at roughly the same time. The differences between the times at which the terminal devices report their model update parameters are therefore reduced, and so is the time difference between the network device receiving the model update parameters sent by the respective terminal devices, so that the network device can complete the model update based on the reported parameters in as short a time as possible, improving the convergence speed of the model update and thereby the update efficiency of the machine learning model.
  • In one possible implementation, the network device may receive the first computing power indication information from the terminal device, or may receive the second computing power indication information from the terminal device after sending a computing capability acquisition request to the terminal device, or may receive third computing power indication information from another network device.
  • The first computing power indication information, the second computing power indication information, and the third computing power indication information are all used to indicate the computing capability of the terminal device; that is to say, this embodiment provides three ways of acquiring the computing capability of the terminal device, which improves the flexibility of acquiring the computing capability.
  • the model training configuration information includes at least one of hyperparameters, precision, and training time information.
  • the network device can configure one or more model training configuration information for the terminal device according to the computing capability of the terminal device, and the configuration flexibility is high.
  • The configured model training configuration information covers items that terminal devices routinely use for model training, which generally meets the configuration requirements of most terminal devices for local model training and offers good versatility.
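  • As a concrete illustration of this idea (not taken from the application itself), the Python sketch below assumes a simple capability metric and illustrative field names, and shows how a network side could scale the local workload to each terminal's reported computing capability while giving every terminal the same wall-clock training deadline, so that terminals finish local training at roughly the same time.

      from dataclasses import dataclass

      @dataclass
      class ModelTrainingConfig:
          # at least one of: hyperparameters, precision, training time information
          local_epochs: int           # hyperparameter: number of local training passes
          batch_size: int             # hyperparameter
          learning_rate: float        # hyperparameter
          target_accuracy: float      # precision requirement for local training
          training_deadline_s: float  # training time information

      def allocate_config(compute_capability_flops: float,
                          dataset_size: int,
                          flops_per_sample: float = 1e6,
                          common_deadline_s: float = 60.0) -> ModelTrainingConfig:
          """Hypothetical allocation: every terminal gets the same deadline, and the
          number of local epochs is sized to its compute power so that all terminals
          finish local training at about the same time."""
          samples_per_second = compute_capability_flops / flops_per_sample
          epochs = max(1, int(samples_per_second * common_deadline_s / max(dataset_size, 1)))
          return ModelTrainingConfig(local_epochs=epochs, batch_size=32, learning_rate=0.01,
                                     target_accuracy=0.9, training_deadline_s=common_deadline_s)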
  • the network device further sends training feature information to the terminal device, where the training feature information is used to indicate the training feature set used by the terminal device to train the first machine learning model in the terminal device.
  • The network device sends the training feature information to the terminal device, so that each terminal device participating in the local training can use the same training feature information to perform local training, thereby reducing the differences in time spent that would arise if the terminal devices performed local training based on different training feature information.
  • the network device further sends accuracy evaluation information to the terminal device, where the accuracy evaluation information includes at least one of a method for evaluating the accuracy or a test sample for evaluating the accuracy.
  • In this way, each terminal device participating in the local model training can use the same accuracy evaluation information to evaluate the accuracy of its locally trained machine learning model. Because the same accuracy evaluation method is used, each terminal device can meet the specified accuracy requirement under the same accuracy evaluation standard, thereby reducing the differences in the time spent by the terminal devices on local training.
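  • As a minimal sketch of this common evaluation standard (the model interface and metric are assumptions for illustration, not taken from the application): each terminal could evaluate its locally trained model on the test samples provided by the network device, so that all participants measure accuracy in the same way.

      import numpy as np

      def evaluate_accuracy(model, test_features: np.ndarray, test_labels: np.ndarray) -> float:
          """Evaluate the locally trained model on the network-provided test samples,
          so every terminal reports accuracy under the same evaluation standard."""
          predictions = model.predict(test_features)          # assumed model interface
          return float(np.mean(predictions == test_labels))   # fraction of correct predictions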
  • In one possible implementation, the network device further receives accuracy indication information from the terminal device, where the accuracy indication information is used to indicate the accuracy achieved after the terminal device trains its local first machine learning model using the model training configuration information sent by the network device.
  • In this way, in addition to feeding back the model update parameters, the terminal device can also feed back the accuracy of the corresponding model training to the network device, so that the network device knows the training effect of the terminal device and can use it as a reference when subsequently configuring model training configuration information for the terminal device, so as to maximize the training effect.
  • In one possible implementation, before the network device receives the model update parameters sent by the terminal device, it also determines a time point for acquiring the model update parameters of the terminal device and sends an acquisition request to the terminal device at that time point, where the acquisition request is used to instruct the terminal device to send its model update parameters to the network device.
  • In this way, the network device can explicitly control when it requests model update parameters from each terminal device. On top of reducing, through the model training configuration information, the time differences with which the terminal devices complete local training, this further reduces the time differences with which the terminal devices report their model update parameters, and therefore the time differences with which the network device actually obtains them.
  • In one possible implementation, before the network device receives the model update parameters sent by the terminal device, it also determines a time point for acquiring the model update parameters of the terminal device and sends reporting time information to the terminal device, where the reporting time information is used to indicate that the model update parameters are to be sent to the network device at the determined time point.
  • In this way, the network device can explicitly control the specific time at which each terminal device reports its model update parameters. On top of reducing, through the model training configuration information, the time differences with which the terminal devices complete local training, this further reduces the time differences with which the terminal devices report their model update parameters, and therefore the time differences with which the network device actually obtains them.
  • In one possible implementation, the network device determines the transmission duration each of multiple terminal devices needs to send its respective model update parameters to the network device, and determines the time point for acquiring the model update parameters of the terminal device according to the transmission duration corresponding to each terminal device.
  • the plurality of terminal devices may include the aforementioned terminal devices, or may not include the aforementioned terminal devices.
  • the network device actively requests model update parameters from each terminal device, and sends an acquisition request for requesting model update parameters to each corresponding terminal device at a time matching each terminal device.
  • In this way, the time most or even all participants need to transmit their model update parameters can be considered comprehensively, and the time at which each terminal device reports its model update parameters can be controlled more accurately, reducing the time differences with which the network device acquires the model update parameters sent by the terminal devices, thereby improving the convergence speed of the local model update and the model update efficiency.
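  • A minimal sketch of this scheduling idea, assuming the network device has an estimate of each terminal's uplink transmission duration (all identifiers are illustrative): requests are sent earlier to terminals with longer transmission durations, so that the model update parameters from all terminals arrive at roughly the same target time.

      def schedule_acquisition_times(target_arrival_time: float,
                                     transmission_duration: dict[str, float]) -> dict[str, float]:
          """For each terminal, send the acquisition request early enough that its model
          update parameters arrive at the common target arrival time."""
          return {ue_id: target_arrival_time - duration
                  for ue_id, duration in transmission_duration.items()}

      # Terminals with longer transmission durations are asked earlier.
      send_times = schedule_acquisition_times(
          target_arrival_time=100.0,
          transmission_duration={"ue1": 2.0, "ue2": 8.0, "ue3": 5.0},
      )
      # -> {"ue1": 98.0, "ue2": 92.0, "ue3": 95.0}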
  • In one possible implementation, the above acquisition request is further used to indicate the specified model update parameters that need to be obtained.
  • In this way, the network device can instruct the terminal device to upload specific model update parameters rather than necessarily all model update parameters, which reduces the amount of data and the time the terminal device needs to transmit the model update parameters to the network device, minimizing invalid transmission, improving transmission effectiveness, saving network transmission resources, and reducing air-interface resource overhead.
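  • For illustration only (the parameter naming and request format are assumptions, not defined by the application): the terminal might upload only the model update parameters named in the acquisition request rather than its full set of updates.

      def select_requested_parameters(all_updates: dict, requested_names: list) -> dict:
          """Return only the model update parameters named in the acquisition request,
          reducing the data volume and time needed to transmit updates over the air."""
          return {name: all_updates[name] for name in requested_names if name in all_updates}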
  • the network device further receives parameter availability indication information from the terminal device, where the parameter availability indication information is used to indicate the availability of the model update parameter in the terminal device.
  • the availability of the model update parameters in the terminal device can be indicated by the parameter availability indication information.
  • In this way, the network device can learn the availability of the various model update parameters in the terminal device from the parameter availability indication information sent by the terminal device, which keeps the network device and the terminal device consistent in their understanding and allows the network device to be more targeted when acquiring the model update parameters sent by the terminal device.
  • a method for updating a machine learning model is provided, and the method can be applied to a terminal device or a chip inside the terminal device.
  • In this method, the terminal device receives the model training configuration information sent by the network device, where the model training configuration information is determined according to the computing capability of the terminal device. The terminal device performs local training on the first machine learning model in the terminal device according to the received model training configuration information to obtain model update parameters, and then sends the obtained model update parameters to the network device, so that the network device locally updates the second machine learning model in the network device according to the model update parameters.
  • The first machine learning model and the second machine learning model can be understood according to their description in the first aspect.
  • In this method, the model training configuration information the terminal device uses for local machine learning model training is determined by the network device according to the computing capability of the terminal device itself, so the model training configuration information matches the terminal device's computing capability, helping to ensure that each terminal device completes the model training within roughly the same time.
  • In this way, the network device can configure corresponding model training configuration information for each terminal device participating in the local training according to that device's computing capability, so that the terminal devices report their model update parameters at roughly the same time, reducing the differences between the times at which the terminal devices report the model update parameters and therefore the time differences between the network device receiving the model update parameters sent by the respective terminal devices. The network device can then complete the model update based on the reported model update parameters in as short a time as possible, improving the convergence speed of the model update and thereby the update efficiency of the machine learning model.
  • In one possible implementation, before receiving the model training configuration information sent by the network device, the terminal device receives a computing capability acquisition request sent by the network device and, according to the computing capability acquisition request, sends to the network device second computing power indication information used to indicate the computing capability of the terminal device.
  • In one possible implementation, in addition to receiving the model training configuration information sent by the network device, the terminal device also receives training feature information from the network device, where the training feature information is used to indicate the training feature set used to train the machine learning model in the terminal device, and then performs local training on the first machine learning model in the terminal device according to the model training configuration information and the training feature information.
  • In one possible implementation, the terminal device further receives accuracy evaluation information from the network device, where the accuracy evaluation information includes at least one of a method for evaluating accuracy or a test sample for evaluating accuracy, and then determines, according to the accuracy evaluation information, the accuracy achieved by the trained first machine learning model.
  • In one possible implementation, the terminal device also sends accuracy indication information to the network device, where the accuracy indication information is used to indicate the accuracy achieved by the terminal device's local machine learning model after training using the model training configuration information sent by the network device.
  • In one possible implementation, the terminal device further receives an acquisition request from the network device, where the acquisition request is used to instruct the terminal device to send its model update parameters to the network device, and then sends the model update parameters to the network device according to the acquisition request.
  • the terminal device further receives reporting time information from the network device, and sends the model update parameter to the network device at the time point indicated by the reporting time information.
  • the terminal device further sends parameter availability indication information to the network device, where the parameter availability indication information is used to indicate the availability of the model update parameter in the terminal device.
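  • The terminal-side flow of this aspect can be sketched as follows (a simplified linear-regression example under assumed interfaces; the actual model type and signalling are whatever the network device configures): the terminal trains its local first machine learning model with the hyperparameters received in the model training configuration information and reports the resulting parameter difference as its model update parameter.

      import numpy as np

      def local_training(weights: np.ndarray, features: np.ndarray, labels: np.ndarray,
                         local_epochs: int, learning_rate: float) -> np.ndarray:
          """Train a simple linear model locally with the hyperparameters configured by
          the network device and return the model update (weight difference)."""
          w = weights.copy()
          for _ in range(local_epochs):
              predictions = features @ w                                    # forward pass
              gradient = features.T @ (predictions - labels) / len(labels)  # MSE gradient
              w -= learning_rate * gradient                                 # local SGD step
          return w - weights  # model update parameter reported to the network device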
  • a method for updating a machine learning model is provided, and the method can be applied to a network device, or can be applied to a chip in the network device.
  • In this method, the network device selects the time point for acquiring the model update parameters of the first terminal device according to the transmission duration each of multiple terminal devices needs to send its respective model update parameters to the network device, and then sends an acquisition request to the first terminal device at the selected time point, or sends to the first terminal device reporting time information used to instruct the first terminal device to send its model update parameters at that time point; the network device receives the model update parameters sent by the first terminal device and then locally updates the second machine learning model in the network device according to the model update parameters.
  • In this method, the network device actively requests the model update parameters from each terminal device, and the time point of each request is determined according to the amount of data the terminal device actually needs to transmit and the transmission link conditions (quality). Specifically, the network device selects the time point for acquiring each terminal device's model update parameters according to the transmission duration that terminal device needs to send its model update parameters to the network device, and requests the model update parameters from the terminal devices at different time points according to the differences between their transmission durations. This minimizes the time differences, caused by differing transmission times, with which the network device receives the model update parameters sent by the terminal devices, so that the model update parameters sent by the terminal devices reach the network device at roughly the same time (or within roughly the same short period), reducing the time differences with which the terminal devices transmit the model update parameters to the network device and thereby the time differences with which the network device acquires them.
  • the network device further receives parameter availability indication information sent by the first terminal device.
  • the availability of the model update parameters in the terminal device can be indicated by the parameter availability indication information.
  • In this way, the network device can learn the availability of the various model update parameters in the terminal device from the parameter availability indication information sent by the terminal device, which keeps the network device and the terminal device consistent in their understanding and allows the network device to be more targeted when acquiring the model update parameters sent by the terminal device.
  • the network device indicates the specified model update parameter to the first terminal device.
  • the model update parameters specified by the network device may be instructed through the acquisition request.
  • In this way, the network device can instruct the terminal device to upload specific model update parameters rather than necessarily all model update parameters, which reduces the amount of data and the time the terminal device needs to transmit the model update parameters to the network device, minimizing invalid transmission, improving transmission effectiveness, saving network transmission resources, and reducing air-interface resource overhead.
  • a method for updating a machine learning model is provided, and the method can be applied to a terminal device, or can be applied to a chip in the terminal device.
  • In this method, the first terminal device receives the acquisition request sent by the network device, where the time point at which the acquisition request is sent is determined by the network device according to the transmission duration each of multiple terminal devices needs to send its respective model update parameters to the network device; the first terminal device then sends its model update parameters to the network device according to the acquisition request, or receives reporting time information sent by the network device and sends its model update parameters to the network device at the time point indicated by the reporting time information, so that the network device locally updates its local machine learning model according to the model update parameters sent by the first terminal device.
  • the first terminal device further sends parameter availability indication information to the network device.
  • the first terminal device further receives indication information sent by the network device, where the indication information is used to indicate a model update parameter specified by the network device.
  • the indication information used to indicate the model update parameter specified by the network device is the above-mentioned acquisition request.
  • a fifth aspect provides a communication apparatus
  • The communication apparatus may be a network device or a chip set inside the network device, and the communication apparatus includes modules for performing the method described in the first aspect or any possible implementation manner of the first aspect.
  • the communication device includes a processing unit and a communication unit, wherein:
  • a processing unit configured to determine model training configuration information corresponding to the terminal device according to the computing capability of the terminal device
  • a communication unit configured to send the model training configuration information to the terminal device and to receive the model update parameter sent by the terminal device, where the model update parameter is obtained by the terminal device after training the first machine learning model according to the model training configuration information;
  • the processing unit is further configured to update the second machine learning model according to the model update parameter.
  • In one possible implementation, the communication unit is further used for: receiving first computing power indication information from the terminal device; or sending a computing capability acquisition request to the terminal device and receiving second computing power indication information from the terminal device; or receiving third computing power indication information from another network device, where the first, second, and third computing power indication information are all used to indicate the computing capability of the terminal device.
  • the model training configuration information includes at least one of hyperparameters, precision, and training time information.
  • In one possible implementation, the communication unit is further configured to send training feature information to the terminal device, where the training feature information is used to indicate the training feature set used by the terminal device to train the first machine learning model.
  • In one possible implementation, the communication unit is further configured to send accuracy evaluation information to the terminal device, where the accuracy evaluation information includes at least one of a method for evaluating accuracy or a test sample for evaluating accuracy.
  • In one possible implementation, the communication unit is further configured to receive accuracy indication information from the terminal device, where the accuracy indication information is used to indicate the accuracy achieved by the first machine learning model after the terminal device trains it using the model training configuration information.
  • In one possible implementation, the processing unit is further configured to determine a time point for acquiring the model update parameter of the terminal device; the communication unit is further configured to send an acquisition request to the terminal device at the time point, or to send reporting time information to the terminal device, where the reporting time information is used to indicate that the model update parameters are to be sent to the network device at the time point.
  • In one possible implementation, the processing unit is specifically used for determining the transmission duration for each of multiple terminal devices to send its respective model update parameters to the network device, and determining the time point according to the transmission duration corresponding to each terminal device.
  • In one possible implementation, the acquisition request is further used to indicate the specified model update parameters that need to be obtained.
  • the communication unit is further configured to receive parameter availability indication information from the terminal device, where the parameter availability indication information is used to indicate the availability of the model update parameter in the terminal device.
  • In a sixth aspect, a communication device is provided. The communication device may be a terminal device or a chip provided inside the terminal device, and the communication device includes modules for performing the method described in the second aspect or any possible implementation manner of the second aspect.
  • the communication device includes a communication unit and a processing unit, wherein:
  • a communication unit configured to receive model training configuration information sent by the network device, where the model training configuration information is determined according to the computing capability of the terminal device;
  • a processing unit configured to train the first machine learning model according to the model training configuration information to obtain model update parameters
  • the communication unit is further configured to send the model update parameter to the network device, where the model update parameter is used by the network device to update the second machine learning model.
  • In one possible implementation, the communication unit is further used for receiving a computing capability acquisition request from the network device and sending second computing power indication information to the network device, where the second computing power indication information is used to indicate the computing capability of the terminal device.
  • In one possible implementation, the communication unit is further configured to receive training feature information from the network device, where the training feature information is used to indicate the training feature set used by the terminal device to train the first machine learning model; the processing unit is further configured to train the first machine learning model according to the model training configuration information and the training feature information.
  • In one possible implementation, the communication unit is further configured to receive accuracy evaluation information from the network device, where the accuracy evaluation information includes at least one of a method for evaluating accuracy or a test sample for evaluating accuracy; the processing unit is further configured to determine, according to the accuracy evaluation information, the accuracy achieved by the trained machine learning model.
  • In one possible implementation, the communication unit is further configured to send accuracy indication information to the network device, where the accuracy indication information is used to indicate the accuracy achieved by the first machine learning model after the terminal device trains it using the model training configuration information.
  • In one possible implementation, the communication unit is further configured to receive an acquisition request from the network device and send the model update parameter to the network device according to the acquisition request, where the acquisition request is used to instruct the terminal device to send the model update parameter of the terminal device to the network device.
  • In one possible implementation, the communication unit is further configured to receive reporting time information from the network device and send the model update parameter to the network device at the time point indicated by the reporting time information.
  • the communication unit is further configured to send parameter availability indication information to the network device, where the parameter availability indication information is used to indicate the availability of the model update parameter in the terminal device.
  • In a seventh aspect, a communication device is provided. The communication device may be a network device or a chip set inside the network device, and the communication device includes modules for performing the method described in the third aspect or any possible implementation manner of the third aspect.
  • the communication device includes a processing unit and a communication unit, wherein:
  • a processing unit configured to select a time point for acquiring the model update parameters of the first terminal device according to the transmission duration of each terminal device in the plurality of terminal devices sending their respective model update parameters to the network device;
  • a communication unit configured to send an acquisition request to the first terminal device at the time point, or to send to the first terminal device reporting time information used to instruct the first terminal device to send the model update parameters at the time point, where the acquisition request is used to request the first terminal device to send model update parameters to the network device; the communication unit is further configured to receive the model update parameters sent by the first terminal device;
  • the processing unit is further configured to update the second machine learning model in the network device according to the model update parameter.
  • the communication unit is further configured to receive parameter availability indication information from the first terminal device.
  • In one possible implementation, the communication unit is further configured to send, to the first terminal device, indication information used to indicate the model update parameter specified by the network device.
  • the indication information is carried in the acquisition request.
  • In an eighth aspect, a communication device is provided. The communication device may be a terminal device or a chip set inside the terminal device, and the communication device includes modules for performing the method described in the fourth aspect or any possible implementation manner of the fourth aspect.
  • the communication device includes a communication unit and a processing unit, wherein:
  • a communication unit configured to receive an acquisition request or reporting time information sent by a network device, where the time point at which the acquisition request is sent is determined by the network device according to the transmission duration each of multiple terminal devices needs to send its respective model update parameters to the network device;
  • a processing unit configured to determine model update parameters to be sent according to the acquisition request
  • the communication unit is further configured to send the determined model update parameter to the network device, or to send the model update parameter to the network device at the time point indicated by the reporting time information, where the model update parameter is used by the network device to update the second machine learning model in the network device.
  • the communication unit is further configured to send parameter availability indication information to the network device.
  • the communication unit is further configured to receive indication information from the network device for indicating the specified model update parameter.
  • the indication information is carried in the acquisition request.
  • In a ninth aspect, a communication device is provided, comprising: at least one processor; and a communication interface communicatively connected to the at least one processor; the at least one processor executes instructions stored in a memory, so that the communication device performs, through the communication interface, the method described in the first aspect or any possible implementation of the first aspect.
  • the memory is located outside the device.
  • the apparatus includes the memory connected to the at least one processor, the memory storing instructions executable by the at least one processor.
  • In a tenth aspect, a communication device is provided, comprising: at least one processor; and a communication interface communicatively connected to the at least one processor; the at least one processor executes instructions stored in a memory, so that the communication device performs, through the communication interface, the method described in the second aspect or any possible implementation of the second aspect.
  • the memory is located outside the device.
  • the apparatus includes the memory connected to the at least one processor, the memory storing instructions executable by the at least one processor.
  • In an eleventh aspect, a communication device is provided, comprising: at least one processor; and a communication interface communicatively connected to the at least one processor; the at least one processor executes instructions stored in a memory, so that the communication device performs, through the communication interface, the method described in the third aspect or any possible implementation manner of the third aspect.
  • the memory is located outside the device.
  • the apparatus includes the memory connected to the at least one processor, the memory storing instructions executable by the at least one processor.
  • A twelfth aspect provides a communication device, comprising: at least one processor; and a communication interface communicatively connected to the at least one processor; the at least one processor executes instructions stored in a memory, so that the communication device performs, through the communication interface, the method described in the fourth aspect or any possible implementation manner of the fourth aspect.
  • the memory is located outside the device.
  • the apparatus includes the memory connected to the at least one processor, the memory storing instructions executable by the at least one processor.
  • A thirteenth aspect provides a computer-readable storage medium comprising a program or instructions which, when run on a computer, cause the method described in the first aspect or any possible implementation manner of the first aspect to be performed.
  • A fourteenth aspect provides a computer-readable storage medium comprising a program or instructions which, when run on a computer, cause the method described in the second aspect or any possible implementation manner of the second aspect to be performed.
  • A fifteenth aspect provides a computer-readable storage medium comprising a program or instructions which, when run on a computer, cause the method described in the third aspect or any possible implementation manner of the third aspect to be performed.
  • A sixteenth aspect provides a computer-readable storage medium comprising a program or instructions which, when run on a computer, cause the method described in the fourth aspect or any possible implementation manner of the fourth aspect to be performed.
  • A seventeenth aspect provides a chip, which is coupled to a memory and configured to read and execute program instructions stored in the memory, so that the method described in the first aspect or any possible implementation manner of the first aspect is performed.
  • An eighteenth aspect provides a chip, which is coupled to a memory and configured to read and execute program instructions stored in the memory, so that the method described in the second aspect or any possible implementation manner of the second aspect is performed.
  • A nineteenth aspect provides a chip, which is coupled to a memory and configured to read and execute program instructions stored in the memory, so that the method described in the third aspect or any possible implementation manner of the third aspect is performed.
  • A twentieth aspect provides a chip, which is coupled to a memory and configured to read and execute program instructions stored in the memory, so that the method described in the fourth aspect or any possible implementation manner of the fourth aspect is performed.
  • A twenty-first aspect provides a computer program product comprising instructions which, when run on a computer, cause the method described in the first aspect or any of the possible implementations of the first aspect to be performed.
  • A twenty-second aspect provides a computer program product comprising instructions which, when run on a computer, cause the method described in the second aspect or any of the possible implementations of the second aspect to be performed.
  • A twenty-third aspect provides a computer program product comprising instructions which, when run on a computer, cause the method described in the third aspect or any of the possible implementations of the third aspect to be performed.
  • A twenty-fourth aspect provides a computer program product comprising instructions which, when run on a computer, cause the method described in the fourth aspect or any of the possible implementations of the fourth aspect to be performed.
  • FIG. 1 is a schematic diagram of applying federated learning to ML model training.
  • FIG. 2 is a schematic diagram of an application scenario of an embodiment of the present application
  • FIG. 3 is a schematic diagram of a device architecture of a separate access network according to an embodiment of the present application.
  • FIG. 4 is a flowchart of a method for updating a machine learning model provided by an embodiment of the present application
  • FIG. 5 is a flowchart of another method for updating a machine learning model provided by an embodiment of the present application.
  • FIG. 6 is a schematic structural diagram of a communication device in an embodiment of the present application.
  • FIG. 7 is a schematic structural diagram of another communication device in an embodiment of the present application.
  • FIG. 8 is a schematic structural diagram of another communication device in an embodiment of the present application.
  • FIG. 9 is a schematic structural diagram of another communication device in an embodiment of the present application.
  • FIG. 10 is a schematic structural diagram of another communication device in an embodiment of the present application.
  • FIG. 11 is a schematic structural diagram of another communication device in an embodiment of the present application.
  • Terminal devices, that is, devices that provide voice and/or data connectivity to users, may include, for example, handheld devices with wireless connection capability or processing devices connected to wireless modems.
  • the terminal equipment may communicate with the core network via a radio access network (RAN), and exchange voice and/or data with the RAN.
  • the terminal equipment may include user equipment (UE), terminal, wireless terminal equipment, mobile terminal equipment, device-to-device (D2D) terminal equipment, vehicle-to-everything (vehicle-to-everything, V2X) terminal equipment, machine-to-machine/machine-type communications (M2M/MTC) terminal equipment, Internet of things (IoT) terminal equipment, subscriber unit (subscriber unit), Subscriber station (subscriber station), mobile station (mobile station), remote station (remote station), access point (access point, AP), remote terminal (remote terminal), access terminal (access terminal), user terminal (user terminal), user agent, or user device, etc.
  • these may include mobile telephones (or "cellular" telephones), computers with mobile terminal equipment, portable, pocket-sized, hand-held, computer-embedded mobile devices, and the like.
  • Terminal devices may also include personal communication service (PCS) phones, session initiation protocol (SIP) phones, wireless local loop (WLL) stations, personal digital assistants (PDAs), and similar devices.
  • constrained devices such as devices with lower power consumption, or devices with limited storage capacity, or devices with limited computing power, etc.
  • it includes information sensing devices such as barcodes, radio frequency identification (RFID), sensors, global positioning system (GPS), and laser scanners.
  • the terminal device may also be a wearable device.
  • Wearable devices may also be called wearable smart devices or smart wearable devices, a general term for devices developed by applying wearable technology to the intelligent design of daily wear, such as glasses, gloves, watches, clothing, and shoes.
  • A wearable device is a portable device that is worn directly on the body or integrated into the user's clothing or accessories. A wearable device is not only a hardware device but also realizes powerful functions through software support, data interaction, and cloud interaction.
  • Broadly, wearable smart devices include devices that are fully functional, large in size, and able to implement complete or partial functions without relying on a smartphone, such as smart watches or smart glasses, as well as devices that focus on only a certain type of application function and need to be used together with other devices such as smartphones, for example, various smart bracelets, smart helmets, and smart jewelry for physical sign monitoring.
  • The various terminal devices described above, if located on a vehicle (for example, placed in or installed in the vehicle), can be considered on-board terminal devices. On-board terminal devices are also called on-board units (OBUs).
  • Network devices include, for example, access network (AN) devices, such as a base station (for example, an access point), which may refer to a device in the access network that communicates with wireless terminal devices over the air interface through one or more cells.
  • For example, an access network device in vehicle-to-everything (V2X) technology may be a road side unit (RSU).
  • the base station may be used to convert received air frames to and from Internet Protocol (IP) packets and act as a router between the terminal device and the rest of the access network, which may include the IP network.
  • the RSU can be a fixed infrastructure entity supporting V2X applications and can exchange messages with other entities supporting V2X applications.
  • the access network equipment can also coordinate the attribute management of the air interface.
  • The access network device may include an evolved NodeB (NodeB, eNB, or e-NodeB) in a long term evolution (LTE) system or a long term evolution-advanced (LTE-A) system, or may include a next generation NodeB (gNB) or a next generation evolved NodeB (ng-eNB), or may include a central unit (CU) and a distributed unit (DU) in a separated access network system, which is not limited in the embodiments of the present application.
  • network equipment can also include core network equipment, which can be an access and mobility management function (AMF), which is mainly responsible for functions such as access control, mobility management, attachment and detachment, and gateway selection.
  • the core network device may also be a network data analytics function (NWDAF), which is mainly responsible for functions such as data collection and analysis.
  • the core network device may also be other devices.
  • AI refers to the technology of presenting human intelligence through computer programs. It is a theory, method, technique, and application system that uses digital computers, or machines controlled by digital computers, to simulate, extend, and expand human intelligence, perceive the environment, acquire knowledge, and use knowledge to obtain the best results. In other words, AI is a comprehensive technique of computer science that attempts to understand the essence of intelligence and produce a new kind of intelligent machine that can respond in a way similar to human intelligence. Artificial intelligence studies the design principles and implementation methods of various intelligent machines, so that the machines have the functions of perception, reasoning, and decision-making.
  • Artificial intelligence technology is a comprehensive discipline, involving a wide range of fields, including both hardware-level technology and software-level technology.
  • the basic technologies of artificial intelligence generally include technologies such as sensors, special artificial intelligence chips, cloud computing, distributed storage, big data processing technology, operation/interaction systems, and mechatronics.
  • Artificial intelligence software technology mainly includes computer vision technology, speech processing technology, natural language processing technology, and machine learning/deep learning.
  • ML is a multi-field interdisciplinary subject involving probability theory, statistics, approximation theory, convex analysis, algorithm complexity theory and other subjects. It specializes in how computers simulate or realize human learning behaviors to acquire new knowledge or skills, and to reorganize existing knowledge structures to continuously improve their performance.
  • Machine learning is the core of artificial intelligence and the fundamental way to make computers intelligent, and its applications are in all fields of artificial intelligence.
  • Machine learning and deep learning usually include artificial neural networks, belief networks, reinforcement learning, transfer learning, inductive learning, teaching learning and other technologies.
  • a machine learning model In the embodiments of this application, no distinction is made between artificial intelligence and machine learning, and the machine learning model may be represented as an ML model or an AI model.
  • the machine learning model in the embodiments of the present application generally refers to AI models and ML models in the AI field and ML field.
  • The machine learning model includes, for example, linear regression, logistic regression, decision tree, naive Bayes, k-nearest neighbors, support vector machine, deep neural network, random forest, and the like.
  • Federated learning (FL) is an emerging basic artificial intelligence technology and an encrypted distributed ML technology. Its design goal is to carry out efficient machine learning among multiple parties or computing nodes on the premise of ensuring information security during big data exchange, protecting terminal data and personal data privacy, and complying with laws and regulations. The machine learning algorithms usable in federated learning are not limited to neural networks and also include algorithms such as random forests. Federated learning is expected to be the basis for the next generation of collaborative AI algorithms and collaborative networks.
  • Federated learning is a machine learning framework designed on the premise of meeting data privacy, security, and regulatory requirements, allowing artificial intelligence systems to use their own data more efficiently and accurately while protecting user privacy and data security.
  • Features of federated learning include keeping each participant's data local, so that raw data does not need to be uploaded and data privacy is preserved.
  • At least one means one or more, and “plurality” means two or more.
  • The term "and/or" describes an association relationship between associated objects and indicates that three relationships may exist. For example, A and/or B may indicate that only A exists, both A and B exist, or only B exists, where A and B may be singular or plural.
  • the character "/" generally indicates that the associated objects are an "or” relationship.
  • The ordinal numbers such as "first" and "second" mentioned in the embodiments of the present application are used to distinguish multiple objects and are not intended to limit the order, sequence, priority, or importance of those objects.
  • the first information and the second information are only for distinguishing different signaling, and do not indicate the difference in content, priority, transmission order, or importance of the two kinds of information.
  • In order to avoid leakage of user privacy data, the network device first distributes the initial machine learning model to each terminal device, and each terminal device generally uses its own data locally to train the machine learning model distributed by the network device. The model update parameters obtained after training are then sent to the network device, which aggregates the model update parameters sent by each participant (that is, each of the aforementioned terminal devices) and then directly updates its local machine learning model to obtain an updated machine learning model.
  • FL can be used to update the machine learning model.
  • The main feature of FL is that each participant's data is kept locally and does not need to be uploaded to the network device, so data privacy is not revealed and the network overhead required for uploading massive data is reduced.
  • the network device sends the initial ML model to the participant terminal devices 1-N.
  • The terminal devices 1 to N train the ML model based on their respective local training data sets, that is, they update the ML model to obtain the updated model parameters W_1^0, ..., W_N^0, or obtain the updated model parameter differences g_1, ..., g_N.
  • The model parameter differences g_1, ..., g_N are also called gradients.
  • The terminal devices 1 to N send the updated model update parameters or gradients to the network device, for example, send W_1^0, ..., W_N^0 to the network device, or send g_1, ..., g_N to the network device.
  • After receiving the model update parameters sent by all the participants (that is, terminal devices 1 to N), the network device performs a weighted average over the model update parameters of each participant to obtain the aggregated ML model update parameters, and then updates its local ML model with the aggregated parameters.
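  • A minimal sketch of this aggregation step (a standard federated-averaging computation; weighting by each participant's number of local training samples is an assumption, since the text only states that a weighted average is used):

      import numpy as np

      def aggregate_updates(updates: list[np.ndarray], sample_counts: list[int]) -> np.ndarray:
          """Weighted average of the model update parameters reported by the participants
          (terminal devices 1 to N), weighted here by local training sample counts."""
          weights = np.array(sample_counts, dtype=float)
          weights /= weights.sum()
          return sum(w * u for w, u in zip(weights, updates))

      def update_global_model(global_weights: np.ndarray, updates: list[np.ndarray],
                              sample_counts: list[int]) -> np.ndarray:
          """Apply the aggregated update to the network device's local (second) ML model."""
          return global_weights + aggregate_updates(updates, sample_counts)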
  • However, because the data each of the terminal devices 1 to N uses locally for ML model training is different, the capability of each terminal device to perform ML model training is generally different. In addition, the configuration information each terminal device uses for training is configured by the terminal device itself or manually by its user, that is, the configuration information used for training is independent for each terminal device. As a result, the times at which the terminal devices complete the ML model training are generally different, and the differences can be large.
  • To help the network device update the ML model as soon as possible, each terminal device generally reports promptly after obtaining its model update parameters.
  • Because of the time differences between the terminal devices reporting their model update parameters, and because the network device must obtain the model update parameters reported by all participants before the model update can be performed, the network device has to wait until the last model update parameter is reported. By the time the model is updated, a long time may have passed since the first model update parameters were received, which increases the time the network device takes to update the model, slows the convergence of the machine learning model update, and lowers the update efficiency of the machine learning model.
  • an embodiment of the present application provides a method for updating a machine learning model.
  • In this method, the network side uniformly configures the configuration information used for training (referred to as model training configuration information in the embodiments of the present application) for each terminal device. Specifically, the network device allocates corresponding model training configuration information to each terminal device according to the computing capability of that terminal device, so that the model training configuration information each terminal device uses when training the local machine learning model pre-distributed by the network device matches its own computing capability. This minimizes the time differences caused by the differing capabilities of the terminal devices during model training, helping to ensure that the terminal devices complete model training within roughly the same time and report their respective model update parameters at roughly the same time. The differences between the times at which the terminal devices report the model update parameters are thereby reduced, so that the network device can complete the model update based on the reported parameters in as short a time as possible, improving the convergence speed of the model update and thereby the update efficiency of the machine learning model.
• The technical solutions provided in the embodiments of this application can be applied to the fourth generation mobile communication technology (the 4th generation, 4G) system, such as the LTE system, or to the 5G system, such as the NR system, or to a next-generation mobile communication system or other similar communication systems, which is not specifically limited.
  • the machine learning model to be trained and updated in this embodiment of the present application may be some general models in the AI field, such as the aforementioned linear regression, logistic regression, decision tree, naive Bayes, K-nearest neighbor, support vector machine, and deep neural network , random forest, etc., which are not limited in the embodiments of the present application.
• In the embodiments of the present application, the machine learning models trained in the terminal devices participating in the model training are uniformly distributed in advance by the network device; that is to say, the machine learning model trained by each terminal device is of the same type, and the machine learning model that the network device updates using the model update parameters reported by the terminal devices is of the same type as the machine learning model trained in each terminal device.
  • FIG. 2 is a schematic diagram of a system architecture provided by an embodiment of the present application.
  • the communication system includes a core network device, a first access network device, a second access network device, and a terminal device.
  • the first access network device or the second access network device can communicate with the core network device;
• The terminal device can communicate with the first access network device or the second access network device, and the terminal device can also communicate with the first access network device and the second access network device at the same time, that is, multi-radio dual connectivity (MR-DC).
• The first access network device may be the primary access network device and the second access network device may be the secondary access network device; or, the second access network device may be the primary access network device and the first access network device may be the secondary access network device.
• The first access network device and the second access network device may be access network devices of different communication modes, or may be access network devices of the same communication mode.
  • the communication system may also include other devices, such as network control devices.
  • the network control device may be an operation management and maintenance (operation administration and maintenance, OAM) system, also called a network management system.
  • the network control device may manage the aforementioned first access network device, second access network device, and core network device.
  • the core network device in FIG. 2 may be AMF or NWDAF, but is not limited to AMF and NWDAF.
• The access network device in FIG. 2, also known as a radio access network (RAN) device, is a device that connects a terminal device to the wireless network and can provide radio resource management and services for the terminal device.
  • the access network equipment may include the following equipment:
• gNB: provides the NR control plane and/or user plane protocols and functions for terminal equipment, and accesses the 5G core network (5th generation core, 5GC);
• ng-eNB: provides the control plane and/or user plane protocols and functions of evolved universal terrestrial radio access (E-UTRA) for terminal equipment, and accesses the 5G core network;
• CU: mainly includes the radio resource control (RRC) layer, the service data adaptation protocol (SDAP) layer and the packet data convergence protocol (PDCP) layer of the gNB, or the RRC layer and PDCP layer of the ng-eNB;
• DU: mainly includes the radio link control (RLC) layer, the media access control (MAC) layer and the physical layer of the gNB or ng-eNB;
• CU-CP (central unit-control plane): the control plane of the CU, mainly including the RRC layer in the gNB-CU or ng-eNB-CU, and the control plane part of the PDCP layer;
• CU-UP (central unit-user plane): the user plane of the CU, mainly including the SDAP layer in the gNB-CU or ng-eNB-CU, and the user plane part of the PDCP layer;
• DAM (data analysis and management): mainly responsible for data collection, ML model training, ML model generation, ML model update, ML model distribution and other functions.
  • FIG. 3 is a schematic structural diagram of a separate access network device.
  • the access network equipment is divided into one CU and one or more DUs according to functions, wherein the CU and the DU are connected through the F1 interface.
  • one CU may include one CU-CP and one or more CU-UPs.
• The CU-CP and the CU-UP can be connected through the E1 interface, the CU-CP and the DU can be connected through the F1 control plane interface (F1-C), and the CU-UP and the DU can be connected through the F1 user plane interface (F1-U).
  • the CU, DU or CU-CP can be connected to the DAM through the G1 interface, respectively.
• Alternatively, the DAM can be used as an internal function of the CU, DU, or CU-CP (in this case there is no G1 interface, or the G1 interface is an internal interface that is invisible to the outside).
  • FIG. 4 is a flowchart of the method.
  • the network device in the following introduction process may be the aforementioned access network device, or core network device, or network control device.
  • FIG. 4 uses a terminal device as an example to illustrate the technical solution of the present application.
• The processing of the other participants in the model training can be understood with reference to the process shown in FIG. 4.
  • the network device acquires the computing capability of the terminal device.
• The computing power of the terminal device can be understood as an ability used to indicate or evaluate the speed at which the terminal device processes data, such as the output speed when the terminal device calculates a hash function. For example, it can be represented by the number of floating point operations per second (FLOPS).
  • the computing power of a terminal device is positively related to the speed of processing data. For example, the greater the computing power, the faster the data processing speed, and the faster the model training speed.
  • the computing power of the terminal device is related to the hardware configuration performance of the terminal device itself, the smoothness of the operating system and other factors.
  • the network device may acquire the computing capability of the terminal device in any of the following manners.
  • the terminal device actively reports its computing capability to the network device.
  • the terminal device may send first computing power indication information for indicating the computing capability of the terminal device to the network device, and the first computing power indication information may also include the identifier of the terminal device.
• In this way, the network device can determine the computing capability corresponding to the terminal device according to the first computing power indication information.
  • the terminal device may report the computing capability of the terminal device to the network device through a UE assistance information (UE assistance information) message.
• The terminal device can actively report its own computing capability to the network device when it registers with the network device, or when it receives the initial machine learning model distributed by the network device, or at other times, which is not limited in this embodiment of the present application.
  • the network device can obtain the computing power of each terminal device in advance, so that the corresponding model training configuration information can be allocated to each terminal device in a timely manner, and the allocation efficiency can be improved.
  • the terminal device reports its computing capability to the network device according to the request of the network device.
  • the network device can send a computing capability acquisition request to the terminal device, and the computing capability acquisition request is used to instruct the terminal device to report its own computing capability to the network device.
  • the terminal device may send the second computing power indication information used to indicate the computing capability of the terminal device to the network device.
• In this way, the network device can obtain the computing capability of the terminal device.
• For example, the network device sends a UE capability enquiry message to the terminal device, which is used to request the computing capability of the terminal device. Further, the terminal device sends a UE capability information (UE capability information) message to the network device, which contains the computing capability of the terminal device.
• In this manner, the network device requests the computing capability of a terminal device only when it is needed, so the computing capability of each terminal device does not need to be stored locally in advance, which can reduce storage consumption to a certain extent while still making effective use of the computing power of the terminal device.
  • the network device obtains the computing capability of the terminal device from other network devices.
• Other network devices can actively send the computing capability of the terminal device to the network device, or the network device can first send a request to the other network devices, and the other network devices return the computing capability of the terminal device to the network device based on that request; for example, the other network devices indicate the computing capability of the terminal device to the network device through third computing capability indication information.
• For example, if the network device is an access network device, the computing capability of the terminal device can be obtained from other access network devices, core network devices, or network control devices; if the network device is a core network device, the computing capability of the terminal device can be obtained from an access network device or a network control device.
  • the premise of implementing this method is that other network devices themselves have the computing capability of the terminal device, or other network devices can acquire the computing capability of the terminal device.
  • the network device determines model training configuration information corresponding to the terminal device according to the computing capability of the terminal device.
  • the model training configuration information is the configuration information required by the terminal device to train the local machine learning model.
  • the model training configuration information is used for the terminal device to perform model training on the local machine learning model.
  • the speed at which the terminal device processes data can be evaluated based on the computing capability of the terminal device.
• The computing power of the terminal device is therefore used to allocate the corresponding model training configuration information to each terminal device. Based on this allocation mechanism, it can be ensured that each terminal device completes the model training within approximately the same time period. For example, for a terminal device with poor computing power, model training configuration information with lower requirements can be configured; for a terminal device with better computing capability, model training configuration information with relatively high requirements can be configured. In this way, the times taken by terminal devices with poor computing capability and terminal devices with better computing capability to complete the model training are roughly the same, thereby reducing the difference between the times at which the terminal devices complete the model training.
  • the network device sends the determined model training configuration information to the terminal device.
• After the corresponding model training configuration information is determined according to the computing capability of the terminal device, the network device sends it to the corresponding terminal device, so that the terminal device can receive the model training configuration information sent by the network device.
  • the terminal device trains the first machine learning model in the terminal device according to the model training configuration information to obtain model update parameters.
  • the machine learning model for local training in each terminal device is pre-distributed by the network device.
• In the embodiments of the present application, the local machine learning model of the terminal device is called the first machine learning model, and the local machine learning model of the network device is called the second machine learning model; the first machine learning model is distributed by the network device to the terminal device.
• In general, the first machine learning model and the second machine learning model may be of the same type of machine learning model, or may be different types of machine learning models. In the embodiments of the present application, because the model update parameters reported by the terminal device are used for the local model update of the network device, the first machine learning model and the second machine learning model are the same type of machine learning model.
• After receiving the model training configuration information sent by the network device, the terminal device can locally train the first machine learning model in the terminal device according to the model training configuration information, and after the model training is completed, the corresponding model update parameters are obtained; the model update parameters here are therefore the update parameters obtained after training the first machine learning model using the model training configuration information.
  • the model update parameters are model parameters of the trained first machine learning model.
• Optionally, the network device may also use the model training configuration information to indicate to the terminal device the time to start local training, for example instructing the terminal device to start local training at a specific moment, or instructing the terminal device to start local training a predetermined period of time after receiving the model training configuration information, or instructing it to start local training at some other time (for example, 15:00:00). In this way, the network device can more strictly control the time at which each terminal device starts local training, which ensures that the times at which the terminal devices start local training are as consistent as possible and further reduces the time difference between the terminal devices completing local training.
  • the terminal device sends the obtained model update parameters to the network device.
• The model update parameters in the embodiments of the present application include both the model update parameters themselves and their corresponding parameter values. For example, if there are three model update parameters a, b, and c, the terminal device sends the three model update parameters a, b and c together with the parameter value of each model update parameter to the network device, for example, the parameter value of a is 1.5, the parameter value of b is 2.6, and the parameter value of c is 2.4.
  • Each terminal device uses the local training data to perform local training using the model training configuration configured by the network device according to its own computing capability.
• Therefore, the types of model update parameters obtained by training are generally the same. For example, after terminal device 1, terminal device 2 and terminal device 3 each complete local training, each obtains the model update parameter a, the model update parameter b, and the model update parameter c. The parameter values of the three model update parameters obtained by terminal device 1 are 1.3, 1.8, and 2.4, respectively; the parameter values obtained by terminal device 2 are 1.6, 1.4, and 2.8, respectively; and the parameter values obtained by terminal device 3 are 1.9, 1.3, and 2.7, respectively. It can be seen that the three terminal devices obtain the same types of model update parameters after local training, but the parameter value corresponding to each model update parameter is different (a weighted-average aggregation over such values is sketched below).
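• The following is a minimal sketch, not part of the patent text, of how per-device parameter values like those above might be combined by weighted averaging; the device values reproduce the example, and the equal weights are an assumption made purely for illustration.

```python
# Minimal sketch of weighted averaging of model update parameters (illustrative only).
# The per-device values reproduce the example above; equal weights are an assumption.

def aggregate(updates, weights):
    """Weighted average of per-device parameter dictionaries."""
    total = sum(weights)
    return {
        name: sum(w * u[name] for u, w in zip(updates, weights)) / total
        for name in updates[0]
    }

device_updates = [
    {"a": 1.3, "b": 1.8, "c": 2.4},  # terminal device 1
    {"a": 1.6, "b": 1.4, "c": 2.8},  # terminal device 2
    {"a": 1.9, "b": 1.3, "c": 2.7},  # terminal device 3
]
equal_weights = [1.0, 1.0, 1.0]

print(aggregate(device_updates, equal_weights))
# roughly {'a': 1.6, 'b': 1.5, 'c': 2.63}
```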
• After obtaining the model update parameters corresponding to the model training configuration information, the terminal device sends them to the network device, and the network device can then receive the model update parameters sent by the terminal device.
  • the terminal device actively sends the model update parameters to the network device immediately after obtaining the model update parameters, so that the network device can obtain the model update parameters fed back by the terminal device as soon as possible.
• Because the network device uses the model training configuration information to make the time spent by each terminal device on local training as consistent as possible, after each terminal device completes its local training and obtains its model update parameters, all terminal devices can report them to the network device promptly, thereby reducing the time difference in the network device obtaining the model update parameters sent by the terminal devices.
  • the terminal device does not actively send the model update parameters to the network device immediately after obtaining the model update parameters, but only sends the model update parameters to the network device when a specific trigger condition is met, which will be described below with an example.
• In one case, the terminal device sends the obtained model update parameters to the network device only after receiving an acquisition request sent by the network device to instruct the terminal device to send the model update parameters. That is to say, a possible trigger condition is that the terminal device receives the acquisition request sent by the network device; under this trigger condition, the terminal device sends the model update parameters to the network device only when the network device actively requests them.
• Specifically, the network device may first determine the time point for requesting the model update parameters from the terminal device, that is, first determine the time point for sending the aforementioned acquisition request to the terminal device, and then send the acquisition request to the terminal device at the determined time point to actively request the model update parameters obtained by the terminal device.
• In this way, the network device can explicitly control the time at which the model update parameters are requested from each terminal device. On the basis of using the model training configuration information to reduce the difference in the time each terminal device takes to complete local training, the time difference between the terminal devices reporting their model update parameters can be further reduced, thereby reducing the time difference in the network device actually acquiring the model update parameters sent by the terminal devices.
• In another case, the network device may directly indicate to the terminal device the time point for reporting the model update parameters. Specifically, the network device may send to the terminal device reporting time information used to instruct the terminal device when to send the model update parameters to the network device; after the terminal device receives the reporting time information, it can send the model update parameters to the network device at the time point indicated by the reporting time information. That is to say, another possible trigger condition is that the time indicated by the reporting time information sent by the network device arrives; under this trigger condition, the terminal device reports the model update parameters to the network device at the scheduled time according to the instruction of the network device. In this embodiment, the network device can explicitly control the specific time at which each terminal device reports its model update parameters.
• On the basis of using the model training configuration information to reduce the difference in the time each terminal device takes to complete local training, this can further reduce the time difference between the terminal devices reporting their model update parameters, thereby reducing the time difference in the network device actually obtaining the model update parameters sent by the terminal devices.
• Optionally, the network device can determine the time point at which each terminal device completes its own local training, and on this basis can directly determine the time point for sending the acquisition request in the first case above and the time point indicated by the reporting time information in the second case. For example, if the network device determines that each terminal device completes local training at about 16:05:00, it can send an acquisition request to each terminal device at 16:06:00, or it can instruct each terminal device to send the model update parameters to the network device at 16:06:30. This not only reduces the time difference between the terminal devices reporting their own model update parameters, but also allows the model update parameters to be obtained as soon as possible, which lays the basis for improving the efficiency of the local model update.
• Specifically, the network device may first determine, for each of most or all of the terminal devices participating in the local training, the transmission duration needed by that terminal device to send its model update parameters to the network device, and then determine, according to the transmission duration corresponding to each terminal device, the acquisition time for acquiring the model update parameters of that terminal device, that is, the aforementioned time point for sending the acquisition request.
  • the transmission duration can be understood as the interval between sending the model update parameter from the terminal device to the network device receiving the model update parameter, which is related to the quality of the communication link between each terminal device and the network device.
• For example, the network device may obtain the uplink transmission rate of the terminal device according to the channel quality indicator (CQI) sent by the terminal device, and then determine the transmission duration corresponding to the terminal device according to the data volume of the model update parameters and the uplink transmission rate.
• In this way, the transmission duration corresponding to each terminal device can be determined, and then the time point at which the acquisition request is sent can be determined according to the transmission durations corresponding to most of the terminal devices (for example, 80%) or all of the terminal devices, or the time point at which the acquisition request is sent to each terminal device can be determined separately according to the transmission duration corresponding to that terminal device.
• Here, q represents the data amount of the model update parameters sent by the corresponding terminal device to the network device, and the transmission duration can be estimated by dividing q by the uplink transmission rate. Since the machine learning model for local training in each terminal device is uniformly distributed in advance by the network device, the model update parameters obtained after training by each terminal device are known to the network device, and the network device can know the data volume of each model update parameter.
• If the terminal device transmits all the model update parameters to the network device, the network device can estimate the total data volume of all the model update parameters on the basis of knowing the data volume of each model update parameter, and thereby obtain the aforementioned q. Alternatively, the network device can request a specified type of model update parameter from the terminal device, and on the basis of knowing the data amount of each specified model update parameter, estimate the total data volume of the specified model update parameters and obtain the aforementioned q (a rough estimate of the transmission duration is sketched below).
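• As a rough illustration of the relationship described above, the sketch below estimates a per-device transmission duration from the data volume q of the model update parameters and an uplink rate derived from channel quality. The CQI-to-rate mapping and the numeric values are placeholder assumptions, not values taken from the patent or from any 3GPP table.

```python
# Illustrative estimate of transmission duration: duration = data volume / uplink rate.
# The CQI-to-rate mapping below is a made-up placeholder, not a standardized table.

CQI_TO_RATE_BPS = {5: 1e6, 10: 5e6, 15: 20e6}  # hypothetical uplink rates per CQI

def estimate_transmission_duration(param_bytes: int, cqi: int) -> float:
    """Return the estimated time (seconds) to upload the model update parameters."""
    rate_bps = CQI_TO_RATE_BPS.get(cqi, 1e6)      # fall back to a conservative rate
    return (param_bytes * 8) / rate_bps           # q in bits divided by uplink rate

# Example: 4 MB of model update parameters reported by a device with CQI 10.
print(estimate_transmission_duration(4 * 1024 * 1024, cqi=10))  # ~6.7 seconds
```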
• In this way, the network device can actively request the model update parameters from each terminal device, sending the acquisition request to each corresponding terminal device at a time point matched to that terminal device, or instructing each terminal device to report its model update parameters to the network device at a time point matched to that terminal device.
• With this implementation, the transmission time required for the model update parameters of most or even all participants can be comprehensively considered, and the time at which each terminal device reports its own model update parameters can be controlled more accurately, which reduces the time difference between the terminal devices reporting the model update parameters to the network device, thereby reducing the time difference in the network device acquiring the model update parameters sent by the terminal devices, improving the convergence speed of the local model update and improving the model update efficiency.
  • the transmission duration corresponding to terminal device 1 is 10 minutes
• the transmission duration corresponding to terminal device 2 is 15 minutes
  • the transmission duration corresponding to terminal device 3 is 22 minutes
  • the transmission duration corresponding to terminal device 4 is 28 minutes
• The network device can send an acquisition request to terminal device 4 at 10:00, send an acquisition request to terminal device 3 at 10:06, send an acquisition request to terminal device 2 at 10:13, and send an acquisition request to terminal device 1 at 10:18. That is to say, the acquisition request can be sent earlier to a terminal device with a longer transmission duration and later to a terminal device with a shorter transmission duration; in this way, the terminal device with the longer transmission duration receives the acquisition request earlier (a scheduling sketch is given below).
  • the terminal device with longer transmission duration can be instructed to send the model update parameter earlier, and the terminal device with shorter transmission duration can be instructed to send the model update parameter later.
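• The staggering in this example can be expressed as a small scheduling rule: start from a common target arrival time and send the acquisition request to each device earlier by that device's own transmission duration. The sketch below reproduces the four durations from the example; the target arrival time of 10:28 follows from those durations and is used purely for illustration.

```python
# Sketch of staggered acquisition requests so that all model update parameters
# arrive at the network device at (approximately) the same time.
from datetime import datetime, timedelta

transmission_minutes = {"device1": 10, "device2": 15, "device3": 22, "device4": 28}

# Target arrival time: the longest duration after the earliest request time (10:00).
start = datetime(2021, 1, 1, 10, 0)
target_arrival = start + timedelta(minutes=max(transmission_minutes.values()))

request_times = {
    dev: target_arrival - timedelta(minutes=dur)
    for dev, dur in transmission_minutes.items()
}
for dev, t in sorted(request_times.items(), key=lambda kv: kv[1]):
    print(dev, t.strftime("%H:%M"))
# device4 10:00, device3 10:06, device2 10:13, device1 10:18
```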
• It can be understood that the types of model update parameters obtained by each terminal device are the same, and the network device is also aware of them, because the initial model of the machine learning model locally trained by each terminal device is distributed to each terminal device by the network device.
• For the model update parameters obtained by the terminal device through local training with the model training configuration information sent by the network device, the terminal device can send parameter availability indication information to the network device. The parameter availability indication information can indicate the availability of the model update parameters in the terminal device; since the model update parameters of the terminal device are all available after the terminal device completes the local training of the machine learning model, the parameter availability indication information can also indicate that the terminal device has completed the training of the local machine learning model. That is, the terminal device can inform the network device through the parameter availability indication information that it has completed the local training of the machine learning model.
• For example, the parameter availability indication information may be carried in an RRC reestablishment complete (RRC reestablishment complete) message, an RRC reconfiguration complete (RRC reconfiguration complete) message, an RRC resume complete (RRC resume complete) message, or an RRC establishment complete message sent by the terminal device to the network device.
• Based on the parameter availability indication information, the network device can learn all the model update parameters available in the terminal device, and then, by comparing them with the model update parameters actually reported by the terminal device, determine whether the terminal device has missed some model update parameters or whether the model update parameters were incompletely obtained due to abnormal transmission, thereby improving the integrity and accuracy of acquiring the model update parameters.
• In addition, when the network device requests the model update parameters from the terminal device through the acquisition request, it can request a specified type of model update parameter according to its actual needs; therefore, optionally, the aforementioned acquisition request can also be used to indicate the specified model update parameters that the network device needs to acquire. For example, the model update parameters a, b, c, and d in the terminal device are all available, but the network device only requests the model update parameter a.
• In this way, the model update parameters transmitted by the terminal device can be reduced, the time spent by the terminal device in transmitting the model update parameters can be reduced, invalid transmission can be avoided, network transmission resources can be saved, and the overhead of air interface resources can be reduced.
  • the specified model update parameters required by the network device may be indicated to the terminal device in other manners, for example, the specified model update parameters may be indicated by a message other than the acquisition request.
  • the network device updates the machine learning model in the network device according to the model update parameter.
• In general, the network device configures corresponding model training configuration information for multiple terminal devices, so the network device can receive the model update parameters sent by the above-mentioned terminal device as well as the model update parameters sent by other terminal devices. By configuring the model training configuration information for each terminal device according to its computing capability, the difference in the time at which the network device receives the model update parameters fed back by the multiple terminal devices can be minimized.
• For example, the network device configures the first model training configuration information, the second model training configuration information, and the third model training configuration information for terminal device 1, terminal device 2, and terminal device 3, respectively, according to their respective computing capabilities. Terminal device 1 obtains first model update parameters after locally training the machine learning model in terminal device 1 according to the first model training configuration information, and terminal device 2 obtains second model update parameters after locally training the machine learning model in terminal device 2 according to the second model training configuration information.
  • terminal device 3 obtains third model update parameters after locally training the machine learning model in terminal device 3 according to the third model training configuration information.
• Since the model training configuration information used by each terminal device for local training is allocated by the network device according to their respective computing capabilities, the times at which terminal device 1, terminal device 2, and terminal device 3 complete local training can be roughly the same. Further, each terminal device sends the model update parameters it obtains to the network device after the training is completed, so the network device can receive the model update parameters sent by the terminal devices within roughly the same time, thereby reducing the variation in the time it takes the network device to receive the model update parameters of the multiple terminal devices.
• After receiving the model update parameters sent by the terminal devices, the network device can aggregate all the model update parameters according to the method described in the embodiment corresponding to FIG. 1, and then update the local machine learning model of the network device with the aggregated model update parameters, that is, perform a local update of the machine learning model in the network device. Because configuring the model training configuration information according to computing capability reduces the difference in the time it takes the network device to receive the model update parameters from the multiple terminal devices, the network device can converge quickly when updating the local machine learning model, improving the model update efficiency.
• In the above process, the network device uses the obtained model update parameters to update the parameters of the local machine learning model, the training data remains local to each terminal device, and each terminal device does not need to transmit training data to the network device; that is, the FL (federated learning) method can be used to update the parameters of the local machine learning model.
• The amount of data required to transmit the model update parameters is generally much smaller than that of the training data, which can reduce network transmission overhead to a large extent, thereby saving network transmission resources.
  • the model training configuration information is the information used by the terminal device to train the local machine learning model. It can be understood that the model training configuration information is the information instructing the terminal device how to train the local model.
  • the model training configuration information in this embodiment of the present application may include one type of information or a combination of multiple types of information.
• For example, the model training configuration information includes one of, or a combination of, hyperparameters, accuracy, and training time information.
  • the model training configuration information configured by the network device for the terminal device is routinely used by the terminal device for model training, which generally meets the configuration requirements of most terminal devices for local model training, and has good versatility.
• The following describes the cases in which the model training configuration information includes different kinds of information.
• Model training configuration information is a hyperparameter. That is, the network device selects appropriate hyperparameters for the terminal device according to the computing capability of the terminal device and then sends the selected hyperparameters to the terminal device; the terminal device performs local training on the local machine learning model according to the hyperparameters and then feeds back the obtained model update parameters to the network device.
  • Machine learning models involve two basic concepts, one is parameters and the other is hyperparameters.
  • parameters are variables obtained by the model through learning, such as weight w and bias b; hyperparameters are set based on experience and affect the size of model parameters (such as weight w and bias b).
• A hyperparameter is a parameter whose value is set before the learning process starts, rather than parameter data obtained through model training. In other words, a hyperparameter is also a parameter and has the characteristics of a parameter, but it is not obtained through learning; for example, the user can specify its value based on existing experience. That is to say, hyperparameters are parameters that can affect the model parameters, so the values set for the hyperparameters can directly affect the effect of model training. Therefore, setting appropriate hyperparameters for the terminal device according to the computing power of the terminal device can control, as far as possible, the time the terminal device spends on local training (a small illustration of the distinction between parameters and hyperparameters follows).
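• The sketch below is a simple illustration, under assumed numeric values, of the distinction drawn above: the weight w and bias b are parameters learned from data, while the learning rate and number of epochs are hyperparameters fixed before training starts.

```python
# Parameters (w, b) are learned; hyperparameters (learning_rate, epochs) are set beforehand.
learning_rate = 0.01   # hyperparameter: chosen before training
epochs = 1000          # hyperparameter: chosen before training

xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]      # underlying relation y = 2x + 1

w, b = 0.0, 0.0                # parameters: learned during training
for _ in range(epochs):
    for x, y in zip(xs, ys):
        err = (w * x + b) - y          # prediction error
        w -= learning_rate * err * x   # gradient step on the weight
        b -= learning_rate * err       # gradient step on the bias

print(round(w, 2), round(b, 2))  # prints approximately 2.0 1.0
```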
• The hyperparameters in the embodiments of the present application may include at least one of a learning rate (learning rate), a batch size (batch size), a number of iterations (iteration), and a number of training epochs (epoch), that is, may include one or more of the aforementioned specific hyperparameters.
• In one implementation, the network device sets a certain threshold (for example, called the first threshold) according to the computing power required by the machine learning model. If the computing power of the terminal device is greater than or equal to the first threshold, a smaller learning rate, such as 0.0001, is selected for the terminal device; if the computing capability of the terminal device is less than the first threshold, a larger learning rate, such as 0.01, is selected for the terminal device.
  • Batch size refers to the number of samples fed into the machine learning model during each training, that is, the number of samples required for one training. For example, if 100 samples are used in one training, the batch size is 100.
• In one implementation, the network device sets a certain threshold (for example, called the second threshold) according to the computing power required by the machine learning model. If the computing power of the terminal device is greater than or equal to the second threshold, a larger batch size, such as 128, is selected for the terminal device; if the computing capability of the terminal device is less than the second threshold, a smaller batch size, such as 16, is selected for the terminal device.
  • the number of iterations refers to the number of times the entire training set is input to the machine learning model for training.
• In one implementation, the network device sets a certain threshold (for example, called the third threshold) according to the computing power required by the machine learning model. If the computing power of the terminal device is greater than or equal to the third threshold, a larger number of iterations, such as 10000, is selected for the terminal device; if the computing capability of the terminal device is less than the third threshold, a smaller number of iterations, such as 1000, is selected for the terminal device.
  • Number of training rounds refers to the number of rounds in which the entire training set is input to the machine learning model for training.
• In one implementation, the network device sets a certain threshold (for example, called the fourth threshold) according to the computing power required by the machine learning model. If the computing power of the terminal device is greater than or equal to the fourth threshold, a larger number of training epochs, such as 10, is selected for the terminal device; if the computing capability of the terminal device is less than the fourth threshold, a smaller number of training epochs, such as 5, is selected for the terminal device (a selection sketch covering these four thresholds is given below).
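• The four threshold rules above can be collected into a single selection step. The sketch below is only illustrative: the FLOPS threshold values are invented placeholders, while the candidate hyperparameter values (0.0001/0.01, 128/16, 10000/1000, 10/5) follow the examples in the preceding paragraphs.

```python
# Illustrative capability-based hyperparameter selection; thresholds are placeholders.
def select_hyperparameters(device_flops: float) -> dict:
    TH1 = TH2 = TH3 = TH4 = 1e9   # hypothetical first to fourth thresholds (FLOPS)
    return {
        "learning_rate": 0.0001 if device_flops >= TH1 else 0.01,
        "batch_size":    128    if device_flops >= TH2 else 16,
        "iterations":    10000  if device_flops >= TH3 else 1000,
        "epochs":        10     if device_flops >= TH4 else 5,
    }

print(select_hyperparameters(5e9))   # a more capable device
print(select_hyperparameters(1e8))   # a less capable device
```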
• It can be understood that hyperparameters express the basic training requirements for model training, and different values of the different types of hyperparameters can be quantified into a corresponding training time. For example, for local training over 1000 samples with 10 training epochs and a batch size of 20, it may take terminal device 1 about 2 minutes to complete the training according to its computing power, while it may take terminal device 2 only about 1.5 minutes according to its computing power; that is, for the aforementioned hyperparameters with specific values, a training time of 2 minutes can be quantified for terminal device 1 and a training time of 1.5 minutes for terminal device 2.
• In this way, based on the computing power of the terminal device, the time at which the terminal device will complete the local training can be determined more clearly, so the training time of each terminal device participating in the local training can be better controlled, thereby reducing the time difference for the terminal devices to complete the local training (a rough quantification is sketched below).
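• To show how specific hyperparameter values can be quantified into a training time, the sketch below estimates the time from the number of update steps and a per-sample compute cost. The per-sample FLOP count and the two device capabilities are invented, illustrative numbers chosen only so that the results are on the order of the 2-minute and 1.5-minute figures above.

```python
# Rough quantification of training time from hyperparameters and device capability.
# flops_per_sample and the device FLOPS values are invented, illustrative numbers.

def estimate_training_seconds(num_samples, epochs, batch_size,
                              flops_per_sample, device_flops):
    steps = (num_samples // batch_size) * epochs          # number of update steps
    flops_total = steps * batch_size * flops_per_sample   # total compute for training
    return flops_total / device_flops

num_samples, epochs, batch_size = 1000, 10, 20            # values from the example above
flops_per_sample = 1.2e8                                  # assumed cost per sample

print(estimate_training_seconds(num_samples, epochs, batch_size,
                                flops_per_sample, device_flops=1e10))    # ~120 s (2 min)
print(estimate_training_seconds(num_samples, epochs, batch_size,
                                flops_per_sample, device_flops=1.33e10)) # ~90 s (1.5 min)
```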
• In another case, the model training configuration information is the accuracy required when training the machine learning model. That is, the network device selects an appropriate accuracy for the terminal device according to the computing capability of the terminal device and then informs the terminal device of the selected accuracy; the terminal device performs local training on the local machine learning model according to the accuracy required by the network device and then feeds back the obtained model update parameters to the network device.
  • the accuracy required when training the machine learning model refers to the difference between the actual predicted output of the machine learning model and the actual output of the sample.
• The accuracy in this embodiment of the present application may include at least one of an error rate, a correct rate, a precision rate, and a recall rate. Specifically:
  • Error rate refers to the ratio of the number of samples with classification errors (or prediction errors) to the total number of samples based on the updated machine learning model (that is, the trained machine learning model).
• In one implementation, the network device sets a certain threshold (for example, called the fifth threshold) according to the computing power required by the machine learning model. If the computing power of the terminal device is greater than or equal to the fifth threshold, a smaller error rate is selected for the terminal device; if the computing capability of the terminal device is less than the fifth threshold, a larger error rate is selected for the terminal device.
  • Correct rate refers to the proportion of the number of correctly classified (or correctly predicted) samples to the total number of samples based on the updated machine learning model (that is, the trained machine learning model).
• In one implementation, the network device sets a certain threshold (for example, called the sixth threshold) according to the computing power required by the machine learning model. If the computing power of the terminal device is greater than or equal to the sixth threshold, a higher correct rate is selected for the terminal device; if the computing capability of the terminal device is less than the sixth threshold, a smaller correct rate is selected for the terminal device. For example, for a neural network, the machine learning model produces a probability prediction for each test sample.
• In this case, the correct rate can be the Top-1 accuracy rate, that is, the rate at which the category with the highest predicted probability is consistent with the actual result; or the correct rate can be the Top-5 accuracy rate, that is, the rate at which the five categories with the highest predicted probabilities contain the actual result.
• In one implementation, the network device sets a certain threshold (for example, referred to as the seventh threshold) according to the computing power required by the machine learning model. If the computing power of the terminal device is greater than or equal to the seventh threshold, a larger precision rate is selected for the terminal device; if the computing capability of the terminal device is less than the seventh threshold, a smaller precision rate is selected for the terminal device.
  • Recall rate refers to how many positive examples in the sample are predicted correctly based on the updated machine learning model.
• In one implementation, the network device sets a certain threshold (for example, referred to as the eighth threshold) according to the computing power required by the machine learning model. If the computing power of the terminal device is greater than or equal to the eighth threshold, a lower recall rate is selected for the terminal device; if the computing capability of the terminal device is less than the eighth threshold, a higher recall rate is selected for the terminal device (an illustration of computing these accuracy metrics is sketched below).
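• The error rate, correct rate, precision rate and recall rate mentioned above can be computed from a set of test labels and model predictions in the usual way; the sketch below uses made-up binary labels purely as an example and is not taken from the patent text.

```python
# Computing the accuracy metrics discussed above on a binary test set (made-up data).
labels      = [1, 0, 1, 1, 0, 1, 0, 0]   # actual outputs of the test samples
predictions = [1, 0, 0, 1, 1, 1, 0, 0]   # outputs predicted by the trained model

tp = sum(1 for y, p in zip(labels, predictions) if y == 1 and p == 1)
fp = sum(1 for y, p in zip(labels, predictions) if y == 0 and p == 1)
fn = sum(1 for y, p in zip(labels, predictions) if y == 1 and p == 0)
correct = sum(1 for y, p in zip(labels, predictions) if y == p)

error_rate   = 1 - correct / len(labels)   # misclassified samples / total samples
correct_rate = correct / len(labels)       # correctly classified samples / total samples
precision    = tp / (tp + fp)              # predicted positives that are truly positive
recall       = tp / (tp + fn)              # true positives that were predicted correctly

print(error_rate, correct_rate, precision, recall)  # 0.25 0.75 0.75 0.75
```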
  • the model training configuration information is the training time information required when training the machine learning model, and the training time information is used to indicate the time used for training.
• For example, the training time information may indicate the training end time and the training duration, or may only indicate the training duration (e.g., 5 minutes or 10 minutes). That is, the network device selects appropriate training time information for the terminal device according to the computing capability of the terminal device and then informs the terminal device of the selected training time information; the terminal device performs local training on the local machine learning model according to the training time information required by the network device, and then feeds back the obtained model update parameters to the network device.
• Specifically, the network device can calculate the time required by each terminal device to perform local training according to the computing capability of each terminal device, and then use the maximum of the training times required by most (for example, 90%) of the terminal devices or by all of the terminal devices as the training time of each participant.
• In this way, a sufficiently long training time can be configured for each terminal device, so that each terminal device can complete the local training within the specified training time as far as possible, and the time taken by most terminal devices to complete the local training is approximately the same. This can reduce the time difference for the terminal devices to complete local training, thereby reducing the time difference for the network device to acquire the model update parameters fed back by the terminal devices (a selection sketch is given below).
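• A minimal sketch of the selection described above: from each device's estimated local-training time, configure either the maximum over all devices or a value covering most of them as the common training duration. The per-device estimates below are arbitrary example values, not figures from the patent.

```python
# Choosing a common training duration from per-device estimates (example values).
estimated_minutes = {"device1": 2.0, "device2": 1.5, "device3": 2.4, "device4": 1.8}

# Option A: the maximum over all devices, so every device can finish in time.
common_duration_all = max(estimated_minutes.values())            # 2.4 minutes

# Option B: a value covering most (roughly 90%) of the devices.
durations = sorted(estimated_minutes.values())
common_duration_most = durations[int(0.9 * len(durations)) - 1]  # 2.0 minutes

print(common_duration_all, common_duration_most)
```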
  • the model training configuration information is the hyperparameters and accuracy required to train the machine learning model.
  • the terminal device may not be able to absolutely meet these two requirements at the same time during local training.
• For example, the hyperparameters configured by the network device for the terminal device are a batch size of 50 and 10 training epochs, and the configured accuracy is 96%; when the terminal device trains for 10 epochs with a batch size of 50, the accuracy may not reach 96%.
• In this case, the terminal device can configure other hyperparameters by itself in an attempt to achieve the 96% accuracy requirement; for example, in addition to the aforementioned batch size of 50 and number of training epochs configured by the network device, it can also configure by itself a larger learning rate and a larger number of iterations for local training, so as to try to meet the accuracy requirement of the network device.
• Another approach is to find a balance between the two kinds of model training configuration information configured on the network side; for example, some values of the hyperparameters can be appropriately increased and the required accuracy can be appropriately reduced, but the adjustment range should not be too large, so as to meet the training requirements of the network device as far as possible.
• That is to say, the terminal device may make appropriate adjustments, within a small range, to the model training configuration information configured by the network device, so as to satisfy the requirements of multiple kinds of model training configuration information at the same time, without undermining the network device's use of the model training configuration information to limit the time difference for the terminal devices to complete local training, while at the same time obtaining a better training effect.
  • the model training configuration information is the hyperparameter, accuracy and training time information required when training the machine learning model.
  • the network device can simultaneously configure three (or more) types of model training configuration information for the terminal device.
• In this case, when the terminal device performs local training, it can refer to the approaches listed in the fourth case above; that is, it may not perform local training strictly according to the configuration of the network device, but instead adjust one or more kinds of model training configuration information within an appropriately small range, in order to achieve a better training effect while keeping the change in training time as small as possible.
• Considering that the purpose of the various kinds of model training configuration information configured by the network device according to the computing capability of the terminal device is to reduce the time difference between the terminal devices participating in local training completing that training, the training time information configured by the network device for each terminal device is the most direct expression of this requirement. For this reason, when the various kinds of model training configuration information include training time information, priority can be given to keeping the training time information unchanged while slightly adjusting the other kinds of model training configuration information in order to achieve a better training effect, or, preferably, the training time information is kept unchanged and local training is performed strictly according to the other kinds of model training configuration information.
• In other words, the training time information has the highest priority, and the corresponding time requirement is kept unchanged, so that the network device's training time requirement on each terminal device is satisfied as far as possible, thereby better reducing the time difference of the local training performed by the terminal devices.
• The foregoing describes how the network device allocates corresponding model training configuration information to each terminal device participating in the local training according to the computing capability of the terminal device. In addition, the network device and the terminal device can also interact further in order to achieve a better training effect and obtain more accurate model update parameters, while at the same time better reducing the time difference for the terminal devices to complete local training, thereby reducing the time difference for the network device to obtain the model update parameters fed back by the terminal devices, facilitating rapid convergence when the network device updates the local machine learning model, and improving the update efficiency of the machine learning model.
• Optionally, the network device may also send training feature information to the terminal device, where the training feature information is used to indicate the training features used by the terminal device when performing local training.
• For example, the training feature information includes one or more of: channel quality indicator (CQI), channel state information reference signal (channel state information reference signal, CSI-RS) measurement results, synchronization signal and physical broadcast channel block (synchronization signal and physical broadcast channel block, SSB) measurement results, and packet delay.
  • the terminal device can use the samples in the corresponding training features to locally train the local machine learning model, for example, use the samples in the SSB measurement result to locally train the machine learning model.
  • the network device sends the training feature information to the terminal device, so that each participant can use the same training feature information for local training, thereby reducing the difference in the time spent by each terminal device for local training based on different training feature information.
• Optionally, the network device may also send accuracy evaluation information to the terminal device, where the accuracy evaluation information is used by the terminal device to evaluate the accuracy of the locally trained machine learning model, and the accuracy evaluation information includes at least one of a method for evaluating the accuracy or test samples for evaluating the accuracy (simple sketches of such evaluation splits are given after the descriptions below).
  • the method for evaluating the accuracy may be any one of hold-out, cross validation, bootstrapping or other methods.
• The hold-out method divides the samples into two mutually exclusive sets, one of which is used as the training samples of the machine learning model, and the other of which is used as the test samples of the machine learning model.
• The cross-validation method divides the samples into k mutually exclusive subsets of similar size, and then each time uses the union of k-1 subsets as the training samples and the remaining subset as the test samples, so that k rounds of training and testing can be performed, and finally the mean of the k test results is returned.
  • the bootstrap method is to give a data set D of m samples, randomly select a sample from D each time, copy it to E, and then put the sample back into the initial data set D. This process is repeated m times to obtain a data set E containing m samples.
  • the samples in data set E are used as training samples, and the samples in data set D that are different from data set E are used as test samples.
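• The sketch below illustrates the hold-out and bootstrap splits described above over a toy data set of m = 10 samples (cross-validation would similarly rotate one of k subsets as the test set each round); the data and the 7/3 hold-out ratio are arbitrary illustrative choices, not values from the patent.

```python
# Illustrative hold-out and bootstrap splits over a toy data set D of m samples.
import random

D = list(range(10))            # data set D with m = 10 samples (toy example)
random.seed(0)

# Hold-out: two mutually exclusive sets, one for training and one for testing.
shuffled = D[:]
random.shuffle(shuffled)
train_holdout, test_holdout = shuffled[:7], shuffled[7:]

# Bootstrapping: draw m samples from D with replacement to form E (training),
# and use the samples of D that never appear in E as the test set.
E = [random.choice(D) for _ in range(len(D))]
test_bootstrap = [s for s in D if s not in E]

print(train_holdout, test_holdout)
print(E, test_bootstrap)
```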
• In this way, each participant can use the same accuracy evaluation information to evaluate the accuracy of the locally trained machine learning model. Since the same accuracy evaluation method is used, each terminal device meets the specified accuracy requirement under the same accuracy evaluation standard, thereby reducing the difference in the time spent by the terminal devices on local training.
• As described above, after completing the local training, the terminal device needs to feed back the obtained model update parameters to the network device. Optionally, the terminal device may also send accuracy indication information to the network device, where the accuracy indication information is used to indicate the accuracy achieved by the terminal device when performing the local training of the machine learning model using the model training configuration information configured by the network device. That is, the terminal device can feed back to the network device not only the model update parameters but also the accuracy of the corresponding model training, so that the network device can know the training effect of the terminal device and can use it as a reference when subsequently configuring model training configuration information for the terminal device.
• For example, if the accuracy indication information sent by the terminal device indicates that the training accuracy is poor, then when the network device subsequently selects model training configuration information for the terminal device, it can make a directional adjustment on the basis of the previous model training configuration information.
  • the role of the model update parameters fed back by the terminal device in the local model update can be determined according to the judgment of its training effect.
  • for example, suppose the accuracy indication information fed back by terminal device 1 indicates that the accuracy of its local training is 97%, while the accuracy indication information fed back by terminal device 2 indicates that the accuracy of its local training is 85%. The accuracy of the local training of terminal device 1 is higher than that of terminal device 2; in other words, the training effect of terminal device 1 should be better than that of terminal device 2.
  • the network device may therefore give a larger weight to the model update parameters fed back by terminal device 1 and a relatively small weight to the model update parameters fed back by terminal device 2. In this way, the effectiveness and accuracy of the local model update performed by the network device can be improved.
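  • as a hedged sketch of this weighting, the following Python fragment aggregates per-terminal model update parameters with weights derived from the reported accuracies; normalizing the accuracies into weights is only one possible choice, since this application only states that a higher reported accuracy should receive a larger weight.

```python
from typing import Dict, List

def aggregate_updates(
    updates: List[Dict[str, List[float]]],   # per-terminal model update parameters
    reported_accuracies: List[float],        # e.g., [0.97, 0.85]
) -> Dict[str, List[float]]:
    """Weighted aggregation: a terminal reporting higher accuracy contributes more.
    The normalization below is an illustrative weighting rule."""
    total = sum(reported_accuracies)
    weights = [acc / total for acc in reported_accuracies]
    aggregated: Dict[str, List[float]] = {}
    for name in updates[0]:
        aggregated[name] = [
            sum(w * u[name][i] for w, u in zip(weights, updates))
            for i in range(len(updates[0][name]))
        ]
    return aggregated

# Example: two participants with accuracies 0.97 and 0.85.
merged = aggregate_updates(
    updates=[{"weights": [1.0, 2.0]}, {"weights": [3.0, 4.0]}],
    reported_accuracies=[0.97, 0.85],
)
```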
  • the method in which the network device configures model training configuration information for each terminal device to perform local training according to the computing capability of the terminal device has been introduced above.
  • with this method, each terminal device can complete the training of its local machine learning model within the same (or approximately the same) time, reducing the time differences between the terminal devices sending model update parameters to the network device and thereby reducing the time differences with which the network device receives the model update parameters sent by the terminal devices. The network device can then use the model update parameters fed back by the terminal devices to update its local machine learning model within a short period of time, thereby improving the convergence speed of the local update and improving the update efficiency of the machine learning model.
  • another method for updating a machine learning model is also provided.
  • in this method, the network device actively requests the model update parameters from each terminal device, and the time point of each request is determined by the network device according to the amount of data that the terminal device actually needs to transmit and the condition (quality) of the transmission link.
  • specifically, the network device selects the time point for requesting the model update parameters from each terminal device according to the time required for that terminal device to send its model update parameters to the network device (this duration is referred to as the transmission duration).
  • by requesting the model update parameters from the terminal devices at different time points according to the differences in their transmission durations, the time differences with which the network device receives the model update parameters that are caused by differences in transmission duration can be minimized, so that the model update parameters sent by the terminal devices reach the network device at the same time (or within approximately the same short period), thereby reducing the time differences with which the network device acquires the model update parameters of the terminal devices.
  • as a result, the network device can update its local machine learning model according to the model update parameters of the terminal devices within a short period of time, thereby improving the convergence speed of the local update and improving the update efficiency of the machine learning model.
  • in the following, a first terminal device is used as an example for description, where the first terminal device is any one of the multiple terminal devices participating in the federated learning (FL).
  • S51 The first terminal device sends parameter availability indication information to the network device.
  • the parameter availability indication information can be used to indicate the availability of the model update parameters in the first terminal device; because the model update parameters of a terminal device become available once it completes the local training of the machine learning model, the parameter availability indication information can also indicate that the terminal device has completed the training of the local machine learning model. In other words, the first terminal device can inform the network device, through the parameter availability indication information, of the event that it has completed the local training of the machine learning model.
  • the parameter availability indication information may be carried in an RRC re-establishment complete message, an RRC reconfiguration complete message, an RRC recovery complete message, an RRC establishment complete message, a UE information response message, or a NAS message, and the first terminal device may notify the network device of the availability of the model update parameters in the first terminal device through any of the aforementioned messages.
  • S51 is not a necessary step, so it is represented by a dotted line in FIG. 5; that is to say, the first terminal device may send the parameter availability indication information to the network device, or may not send it, which is not limited in the embodiments of the present application.
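  • as a hedged illustration only, the following Python sketch shows one possible shape of the parameter availability indication and when a terminal might construct it; the field names are assumptions, and this application leaves the exact encoding open while listing RRC and NAS messages as possible carriers.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class ParameterAvailabilityIndication:
    """Illustrative payload only; the exact encoding and carrying message are not
    fixed here (RRC/NAS messages are listed as possible carriers in the text)."""
    terminal_id: str
    training_complete: bool                  # local training of the ML model has finished
    available_parameters: List[str] = field(default_factory=list)  # e.g., ["weights", "gradients"]

def after_local_training(terminal_id: str, trained_parameter_names: List[str]) -> ParameterAvailabilityIndication:
    # Sending this indication is optional (S51 is drawn with a dotted line in FIG. 5).
    return ParameterAvailabilityIndication(
        terminal_id=terminal_id,
        training_complete=True,
        available_parameters=trained_parameter_names,
    )
```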
  • S52: the network device determines the transmission duration for each terminal device in the plurality of terminal devices to send its respective model update parameters to the network device.
  • the plurality of terminal devices may include the first terminal device, or may not include the first terminal device.
  • each of the multiple terminal devices is a terminal device to which the network device pre-distributed the initial machine learning model; the multiple terminal devices may be most (e.g., 80%) or the vast majority (e.g., 95%) of all the terminal devices participating in the local training of the machine learning model distributed by the network device.
  • S53: the network device selects a time point for acquiring the model update parameters of the first terminal device according to the transmission duration of each terminal device sending its respective model update parameters to the network device.
  • before the network device requests the model update parameters from each terminal device (including the first terminal device), each terminal device has completed the training of its own local machine learning model and obtained the corresponding model update parameters; in this way, the network device can determine the time point for acquiring the model update parameters in each terminal device according to the transmission duration of each terminal device sending its respective model update parameters to the network device. For example, the selected time point is called the acquisition time.
  • the acquisition time may be the time at which the network device sends, to the first terminal device, the acquisition request for requesting the model update parameters, that is, the network device may send the acquisition request to the first terminal device at the acquisition time; or, the acquisition time may be the time, indicated by the network device, at which the first terminal device is to send the model update parameters to the network device, that is, the first terminal device may send its local model update parameters to the network device at the acquisition time.
  • for each terminal device (or at least for most terminal devices, e.g., 80% of them), the transmission duration can be estimated from the data amount to be transmitted and the quality of the transmission link, where q represents the data amount of the model update parameters that the corresponding terminal device sends to the network device. Since the machine learning model trained locally in each terminal device is uniformly pre-distributed by the network device, the model update parameters obtained after training by each terminal device are known to the network device, and the network device can therefore know the data amount of each model update parameter.
  • if the terminal device transmits all of its model update parameters to the network device, the network device can estimate the total data amount of all model update parameters on the basis of knowing the data amount of each model update parameter, thereby obtaining the aforementioned q; alternatively, the network device can request a specified type of model update parameter from the terminal device, and, on the basis of knowing the data amount of each specified model update parameter, estimate the total data amount of all the specified model update parameters, thereby obtaining the aforementioned q.
  • for example, based on the estimated transmission durations, the network device can instruct terminal device 4 to start sending its model update parameters to the network device at 13:02, instruct terminal device 3 to start sending at 13:08, and instruct terminal device 2 to start sending at 13:15.
  • in this way, a long transmission duration is compensated for by starting to send the model update parameters earlier, and the time differences in transmitting the model update parameters of the terminal devices are reduced, so that the network device can receive the model update parameters of all the terminal devices within the same (or as nearly the same as possible) time window, thereby reducing the time differences with which the network device receives the model update parameters of the terminal devices.
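  • a hedged sketch of this scheduling is shown below: the transmission duration of each terminal is estimated as its data amount q divided by an estimated link rate, and its send (or request) time is the common target arrival time minus that duration. The link rates, data amounts, and target time are invented for illustration and merely reproduce the flavor of the 13:02 / 13:08 / 13:15 example above.

```python
from datetime import datetime, timedelta
from typing import Dict

def schedule_send_times(
    data_amounts_bits: Dict[str, float],   # q per terminal (bits to report)
    link_rates_bps: Dict[str, float],      # estimated link rate per terminal (assumption)
    target_arrival: datetime,              # when all updates should reach the network device
) -> Dict[str, datetime]:
    """Terminals with longer estimated transmission durations start earlier,
    so all model update parameters arrive at roughly the same time."""
    send_times = {}
    for terminal, q in data_amounts_bits.items():
        transmission_duration = timedelta(seconds=q / link_rates_bps[terminal])
        send_times[terminal] = target_arrival - transmission_duration
    return send_times

# Illustrative numbers: a terminal with a slower link starts around 13:02, a faster
# one around 13:08, and the fastest just before 13:15.
times = schedule_send_times(
    data_amounts_bits={"terminal_4": 8e6, "terminal_3": 8e6, "terminal_2": 8e6},
    link_rates_bps={"terminal_4": 1e4, "terminal_3": 2e4, "terminal_2": 1e6},
    target_arrival=datetime(2021, 1, 1, 13, 15, 0),
)
```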
  • with this method, the time required for transmitting the model update parameters of most or even all participants can be comprehensively considered, and the time at which each terminal device reports its own model update parameters can be controlled more precisely, thereby reducing the time differences with which the network device acquires the model update parameters sent by the terminal devices, improving the convergence speed of the local model update, and improving the model update efficiency.
  • S54: the network device sends an acquisition request to the first terminal device.
  • having determined the time point for acquiring the model update parameters of the first terminal device, the network device may send an acquisition request to the first terminal device at the determined time point, where the acquisition request is used to instruct the first terminal device to send the model update parameters in the first terminal device to the network device.
  • if the network device does not explicitly indicate to the first terminal device which model update parameters need to be acquired, then, according to the default agreement between the network device and the first terminal device, the first terminal device may send all of the model update parameters it has obtained to the network device.
  • alternatively, the network device may select only the required part of the model update parameters available in the first terminal device; in this manner, the acquisition request may also be used to indicate the model update parameters specified by the network device, which means that the network device only needs to obtain the model update parameters indicated by the request. The first terminal device then only needs to feed back the specified model update parameters requested by the network device, which can reduce the amount of data transmitted and reduce the network transmission overhead.
  • S55: the first terminal device determines, according to the acquisition request, the model update parameters that need to be sent to the network device.
  • the model update parameters determined by the first terminal device according to the acquisition request may be all model update parameters or part of the model update parameters in the first terminal device.
  • S56 The first terminal device sends the determined model update parameters to the network device.
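  • a minimal sketch of the S55/S56 behaviour is given below, under the assumption that the acquisition request may optionally carry a list of requested parameter names: if it does, only those parameters are reported; otherwise, per the default agreement, all locally obtained model update parameters are reported.

```python
from typing import Dict, List, Optional

def select_parameters_to_report(
    local_updates: Dict[str, list],              # all model update parameters from local training
    requested_names: Optional[List[str]] = None, # names carried in the acquisition request, if any
) -> Dict[str, list]:
    """If the acquisition request specifies parameters, report only those;
    otherwise report everything obtained locally (the default agreement)."""
    if not requested_names:
        return dict(local_updates)
    return {name: local_updates[name] for name in requested_names if name in local_updates}
```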
  • the above S51 to S56 show an example in which the network device determines a time point according to the transmission duration of each terminal device and sends an acquisition request to the first terminal device at the determined time point to request the model update parameters from the first terminal device.
  • in this way, the network device can explicitly control the time at which it requests the model update parameters from each terminal device; on the basis of reducing, through the model training configuration information, the time differences with which the terminal devices complete local training, it can further reduce the time differences with which the terminal devices report their model update parameters, thereby reducing the time differences with which the network device actually obtains the model update parameters sent by the terminal devices.
  • S57 The network device sends reporting time information to the first terminal device.
  • the reporting time information is used to instruct the first terminal device to send the model update parameters in the first terminal device to the network device at the acquisition time determined by the network device; that is, the network device can explicitly indicate, to each terminal device, the specific time at which that terminal device is to report its model update parameters to the network device.
  • S58: the first terminal device sends the model update parameters to the network device at the acquisition time indicated by the reporting time information.
  • that is, when the acquisition time indicated by the reporting time information arrives, the first terminal device sends the model update parameters obtained through local training to the network device.
  • the above S51, S52, S53, S57, S58 (may not include S51) show that the network device determines a time point according to the transmission duration of each terminal device, and instructs the first terminal device to report the model update parameter to the network device at this time point.
  • in this way, the network device can explicitly control the specific time at which each terminal device reports its model update parameters; on the basis of reducing, through the model training configuration information, the time differences with which the terminal devices complete local training, the time differences with which the terminal devices report their model update parameters can be further reduced, thereby reducing the time differences with which the network device actually obtains the model update parameters sent by the terminal devices.
  • S59: the network device updates its local machine learning model according to the model update parameters sent by the first terminal device.
  • similarly, the network device can obtain the model update parameters of the other terminal devices; because the acquisition requests are sent to the terminal devices at time points differentiated according to the transmission durations of their model update parameters, the network device can receive the respective model update parameters sent by the terminal devices at almost the same time, reducing the time differences with which the network device receives the model update parameters sent by the terminal devices. Further, the network device can use the model update parameters of all the terminal devices to update its local machine learning model within a short period of time, so that the update converges quickly, thereby improving the update efficiency of the machine learning model.
  • the terminal device and the network device may send the related information based on the existing protocol stack, for example, the related information is sent between the terminal device and the access network device based on RRC messages, or the related information is sent between the terminal device and the core network device based on NAS messages.
  • when the access network device is split into a CU and a DU, the information exchanged between the terminal device and the CU can be forwarded through the DU, that is, the terminal device first sends the information intended for the CU to the DU, and the DU then forwards the information to the CU based on the F1 interface between the DU and the CU; the information exchanged between the terminal device and the DU can be sent directly, that is, the information that the terminal device needs to send to the DU can be sent directly to the DU, and the information that the DU needs to send to the terminal device can also be sent directly to the terminal device.
  • for example, assuming that the network device in FIG. 4 is a CU, the information sent by the CU to the terminal device is forwarded to the terminal device by the DU, and the model update parameters in S45 are first sent by the terminal device to the DU and then forwarded by the DU to the CU; assuming that the network device in FIG. 4 is a DU, S41, S42, and S46 are performed by the DU, S44 is performed by the terminal device, the model training configuration information in S43 is sent directly by the DU to the terminal device, and the model update parameters in S45 are sent directly by the terminal device to the DU.
  • similarly, assuming that the network device in FIG. 5 is a CU, S52, S53, and S59 are executed by the CU, and S55 is executed by the terminal device; the parameter availability indication information in S51, the model update parameters that need to be sent in S56, and the model update parameters sent at the time point indicated by the reporting time information in S58 are first sent by the terminal device to the DU and then forwarded by the DU to the CU; and the acquisition request in S54 and the reporting time information in S57 are sent by the CU and forwarded to the terminal device through the DU.
  • for the cases in which the network device in FIG. 4 and FIG. 5 is a CU or a DU, the specific steps performed by the CU and by the DU can refer to the descriptions of the embodiments in the aforementioned FIG. 4 and FIG. 5, and are not repeated here.
  • an embodiment of the present application provides a communication device, where the communication device may be a network device or a chip provided inside the network device.
  • the communication apparatus has the function of implementing the network equipment in the embodiments shown in FIG. 4 to FIG. 5 .
  • specifically, the communication apparatus includes modules, units, or means corresponding to the steps performed by the network device in the embodiments shown in FIG. 4 to FIG. 5; the functions, units, or means can be implemented by software, by hardware, or by hardware executing corresponding software.
  • the communication apparatus in this embodiment of the present application includes a processing unit 601 and a communication unit 602, wherein:
  • a processing unit 601 configured to determine model training configuration information corresponding to the terminal device according to the computing capability of the terminal device;
  • the communication unit 602 is configured to send the model training configuration information to the terminal device and receive the model update parameters sent by the terminal device, where the model update parameters are parameters updated after the terminal device trains the first machine learning model according to the model training configuration information;
  • the processing unit 601 is further configured to update the second machine learning model according to the model update parameter.
  • the communication unit 602 is also used for:
  • the model training configuration information includes at least one of hyperparameters, precision, and training time information.
  • the communication unit 602 is further configured to send training feature information to the terminal device, where the training feature information is used to indicate the training feature set used by the terminal device to train the first machine learning model.
  • the communication unit 602 is further configured to send accuracy evaluation information to the terminal device, where the accuracy evaluation information includes at least one of a method for evaluating the accuracy or a test sample for evaluating the accuracy.
  • the communication unit 602 is further configured to receive accuracy indication information from the terminal device, where the accuracy indication information is used to indicate the accuracy achieved by the terminal device after training the first machine learning model using the model training configuration information.
  • the processing unit 601 is further configured to determine a time point for acquiring the model update parameters of the terminal device; correspondingly, the communication unit 602 is further configured to send an acquisition request to the terminal device at the aforementioned time point, or to send reporting time information to the terminal device, where the reporting time information is used to indicate that the model update parameters are to be sent to the network device at the aforementioned time point.
  • the processing unit 601 is specifically configured to determine the transmission duration for each terminal device in the multiple terminal devices to send its respective model update parameters to the network device, and to determine the aforementioned time point according to the transmission duration corresponding to each terminal device.
  • the acquisition request is further used to indicate that the specified model update parameters need to be acquired.
  • the communication unit 602 is further configured to receive parameter availability indication information from the terminal device, where the parameter availability indication information is used to indicate the availability of the model update parameter in the terminal device.
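  • as a hedged, purely illustrative Python sketch of how the processing unit 601 and the communication unit 602 might divide the work described above; the capability-to-configuration mapping, method names, and placeholder update rule are assumptions, not part of this application.

```python
from typing import Any, Dict

class ProcessingUnit601:
    """Determines model training configuration from the terminal's computing capability
    and updates the second (local) machine learning model."""

    def determine_training_config(self, computing_capability: float) -> Dict[str, Any]:
        # Illustrative mapping only: a weaker device gets fewer epochs.
        epochs = 5 if computing_capability < 1.0 else 20
        return {
            "hyperparameters": {"epochs": epochs, "batch_size": 32},
            "precision": 0.90,                 # required training accuracy
            "training_time_budget_s": 600,     # training time information
        }

    def update_second_model(self, model: Dict[str, list], update: Dict[str, list]) -> None:
        # Placeholder for the actual aggregation/update rule applied to the local model.
        model.update(update)

class CommunicationUnit602:
    """Sends the configuration to the terminal device and receives its model update parameters."""

    def send_training_config(self, terminal_id: str, config: Dict[str, Any]) -> None:
        print(f"send to {terminal_id}: {config}")  # stand-in for the real air-interface message

    def receive_model_update(self, terminal_id: str) -> Dict[str, list]:
        return {"weights": [0.1, 0.2, 0.3]}        # stand-in for a received report
```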
  • an embodiment of the present application provides a communication device, where the communication device may be a terminal device or a chip provided inside the terminal device.
  • the communication device has the function of implementing the terminal device in the embodiment shown in FIG. 4, or the communication device has the function of implementing the first terminal device in the embodiment shown in FIG. 5.
  • specifically, the communication apparatus includes modules, units, or means corresponding to the steps performed by the terminal device or the first terminal device in the embodiments shown in FIG. 4 to FIG. 5; the functions, units, or means may be implemented by software, by hardware, or by hardware executing corresponding software.
  • the communication apparatus in this embodiment of the present application includes a communication unit 701 and a processing unit 702, wherein:
  • a communication unit 701 configured to receive model training configuration information sent by a network device, where the model training configuration information is determined according to the computing capability of the terminal device;
  • a processing unit 702 configured to train the first machine learning model according to the model training configuration information to obtain model update parameters
  • the communication unit 701 is further configured to send the model update parameter to the network device, where the model update parameter is used by the network device to update the second machine learning model.
  • the communication unit 701 is further configured to receive a computing capability acquisition request sent by the network device; and send second computing power indication information to the network device according to the computing capability acquisition request, where the second computing power indication information Used to indicate the computing capability of the terminal device.
  • the communication unit 701 is further configured to receive training feature information from the network device, where the training feature information is used to indicate the training feature set used by the terminal device to train the first machine learning model; correspondingly, the processing unit 702 is further configured to train the first machine learning model according to the model training configuration information and the training feature information.
  • the communication unit 701 is further configured to receive accuracy evaluation information from the network device, where the accuracy evaluation information includes at least one of a method for evaluating the accuracy or test samples for evaluating the accuracy; correspondingly, the processing unit 702 is further configured to determine, according to the accuracy evaluation information, the accuracy achieved by the trained first machine learning model.
  • the communication unit 701 is configured to send accuracy indication information to the network device, where the accuracy indication information is used to indicate the accuracy achieved by the terminal device after training the first machine learning model by using the model training configuration information.
  • the communication unit 701 is further configured to receive an acquisition request from the network device and send the model update parameters to the network device according to the acquisition request, where the acquisition request is used to instruct the terminal device to send the model update parameters in the terminal device to the network device.
  • the communication unit 701 is further configured to receive reporting time information from the network device, and send model update parameters to the network device at the time point indicated by the reporting time information.
  • the communication unit 701 is further configured to send parameter availability indication information to the network device, where the parameter availability indication information is used to indicate the availability of the model update parameter in the terminal device.
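  • correspondingly, the following hedged sketch illustrates a terminal-side step in the spirit of the processing unit 702: training a toy one-parameter model under the network-configured hyperparameters and returning both the model update and the achieved accuracy for the communication unit 701 to report; the toy model and the accuracy proxy are assumptions for illustration only.

```python
from typing import Any, Dict, List, Tuple

def terminal_side_training(
    training_config: Dict[str, Any],   # received from the network device via communication unit 701
    features: List[float],
    labels: List[float],
) -> Tuple[Dict[str, float], float]:
    """Train a toy one-parameter linear model under the configured hyperparameters and
    return the model update together with an achieved-accuracy proxy (1 - mean squared error)."""
    hp = training_config.get("hyperparameters", {})
    epochs, lr = hp.get("epochs", 10), hp.get("learning_rate", 0.01)
    w = 0.0
    for _ in range(epochs):
        for x, y in zip(features, labels):
            w -= lr * (w * x - y) * x               # gradient step for squared error
    error = sum((w * x - y) ** 2 for x, y in zip(features, labels)) / max(len(labels), 1)
    return {"w": w}, max(0.0, 1.0 - error)
```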
  • an embodiment of the present application provides a communication device, where the communication device may be a network device or a chip provided inside the network device.
  • the communication apparatus has the function of implementing the network equipment in the embodiments shown in FIG. 4 to FIG. 5 .
  • specifically, the communication apparatus includes modules, units, or means corresponding to the steps performed by the network device in the embodiments shown in FIG. 4 to FIG. 5; the functions, units, or means can be implemented by software, by hardware, or by hardware executing corresponding software.
  • the communication apparatus in this embodiment of the present application includes a processing unit 801 and a communication unit 802, wherein:
  • a processing unit 801, configured to select a time point for acquiring the model update parameters of the first terminal device according to the transmission duration of each terminal device in the plurality of terminal devices sending its respective model update parameters to the network device;
  • a communication unit 802, configured to send an acquisition request to the first terminal device at the above-mentioned time point and receive the model update parameters sent by the first terminal device, where the acquisition request is used to request the first terminal device to send the model update parameters to the network device; or configured to send reporting time information to the first terminal device and receive the model update parameters sent by the first terminal device, where the reporting time information is used to instruct the first terminal device to send the model update parameters to the network device at the above-mentioned time point;
  • the processing unit 801 is further configured to update the second machine learning model according to the model update parameter.
  • the communication unit 802 is further configured to receive parameter availability indication information from the first terminal device.
  • the communication unit 802 is further configured to send, to the first terminal device, indication information for indicating the specified model update parameters.
  • the indication information is carried in the acquisition request.
  • an embodiment of the present application provides a communication device, where the communication device may be a terminal device or a chip provided inside the terminal device.
  • the communication device has the function of implementing the terminal device in the embodiment shown in FIG. 4, or the communication device has the function of implementing the first terminal device in the embodiment shown in FIG. 5.
  • specifically, the communication apparatus includes modules, units, or means corresponding to the steps performed by the terminal device or the first terminal device in the embodiments shown in FIG. 4 to FIG. 5; the functions, units, or means may be implemented by software, by hardware, or by hardware executing corresponding software.
  • the communication apparatus in this embodiment of the present application includes a communication unit 901 and a processing unit 902, wherein:
  • the communication unit 901 is configured to receive an acquisition request sent by a network device, wherein the time point at which the acquisition request is sent is determined by the network device according to the transmission duration of each terminal device in the plurality of terminal devices sending their respective model update parameters to the network device ; Or used to receive the reporting time information sent by the network device, the reporting time information is used to indicate that the model update parameters are sent to the network device at a time point;
  • a processing unit 902 configured to determine model update parameters to be sent according to the acquisition request
  • the communication unit 901 is further configured to send the determined model update parameters to the network device, or to send the model update parameters to the network device at the time point indicated by the reporting time information, where the model update parameters are used by the network device to update the second machine learning model.
  • the communication unit 901 is further configured to send parameter availability indication information to the network device.
  • the communication unit 901 is further configured to receive indication information from the network device for indicating the specified model update parameter.
  • the indication information is carried in the acquisition request.
  • an embodiment of the present application further provides a communication apparatus, including at least one processor 1001 and a communication interface 1003.
  • optionally, the memory 1002 is located outside the communication apparatus; alternatively, the communication apparatus includes the memory 1002, the memory 1002 is connected to the at least one processor 1001, and the memory 1002 stores instructions that can be executed by the at least one processor 1001.
  • since the memory 1002 is optional to the communication apparatus, it is indicated by dashed lines in FIG. 10.
  • processor 1001 and the memory 1002 may be coupled through an interface circuit, or may be integrated together, which is not limited here.
  • the specific connection medium between the processor 1001 , the memory 1002 , and the communication interface 1003 is not limited in the embodiments of the present application.
  • as an example, the processor 1001, the memory 1002, and the communication interface 1003 are connected through a bus 1004 in FIG. 10; the bus is represented by a thick line in FIG. 10, and the manner of connection between other components is only illustrated schematically and is not limited thereto.
  • the bus can be divided into an address bus, a data bus, a control bus, and the like. For ease of presentation, only one thick line is used in FIG. 10, but it does not mean that there is only one bus or one type of bus.
  • an embodiment of the present application further provides a communication apparatus, including at least one processor 1101 and a communication interface 1103, configured to perform the steps of the method executed by the terminal device in the embodiment shown in FIG. 4, or the steps of the method executed by the first terminal device in the embodiment shown in FIG. 5.
  • optionally, the memory 1102 is located outside the communication apparatus; alternatively, the communication apparatus includes the memory 1102, the memory 1102 is connected to the at least one processor 1101, and the memory 1102 stores instructions that can be executed by the at least one processor 1101.
  • Figure 11 shows in dashed lines that the memory 1102 is optional to the communication device.
  • the processor 1101 and the memory 1102 may be coupled through an interface circuit, or may be integrated together, which is not limited here.
  • the specific connection medium between the processor 1101 , the memory 1102 , and the communication interface 1103 is not limited in the embodiments of the present application.
  • as an example, the processor 1101, the memory 1102, and the communication interface 1103 are connected through a bus 1104 in FIG. 11; the bus is represented by a thick line in FIG. 11, and the manner of connection between other components is only illustrated schematically and is not limited thereto.
  • the bus can be divided into an address bus, a data bus, a control bus, and the like. For ease of presentation, only one thick line is used in FIG. 11, but it does not mean that there is only one bus or one type of bus.
  • the processor mentioned in the embodiments of the present application may be implemented by hardware or software.
  • when implemented in hardware, the processor may be a logic circuit, an integrated circuit, or the like.
  • when implemented in software, the processor may be a general-purpose processor implemented by reading software codes stored in a memory.
  • the processor may be a central processing unit (central processing unit, CPU), or other general-purpose processors, digital signal processors (digital signal processors, DSP), application specific integrated circuits (application specific integrated circuit, ASIC) , off-the-shelf programmable gate array (field programmable gate array, FPGA) or other programmable logic devices, discrete gate or transistor logic devices, discrete hardware components, etc.
  • a general purpose processor may be a microprocessor or the processor may be any conventional processor or the like.
  • the memory mentioned in the embodiments of the present application may be volatile memory or non-volatile memory, or may include both volatile and non-volatile memory.
  • the non-volatile memory may be a read-only memory (Read-Only Memory, ROM), a programmable read-only memory (programmable ROM, PROM), an erasable programmable read-only memory (Erasable PROM, EPROM), an electrically programmable Erase programmable read-only memory (electrically EPROM, EEPROM) or flash memory.
  • Volatile memory may be random access memory (RAM), which acts as an external cache.
  • by way of example and not limitation, many forms of RAM are available, such as static random access memory (static RAM, SRAM), dynamic random access memory (dynamic RAM, DRAM), synchronous dynamic random access memory (synchronous DRAM, SDRAM), double data rate synchronous dynamic random access memory (double data rate SDRAM, DDR SDRAM), enhanced synchronous dynamic random access memory (enhanced SDRAM, ESDRAM), synchronous link dynamic random access memory (synchlink DRAM, SLDRAM), and direct rambus random access memory (direct rambus RAM, DR RAM).
  • it should be noted that when the processor is a general-purpose processor, DSP, ASIC, FPGA or other programmable logic device, discrete gate or transistor logic device, or discrete hardware component, the memory (storage module) may be integrated into the processor.
  • it should also be noted that the memory described herein is intended to include, but not be limited to, these and any other suitable types of memory.
  • an embodiment of the present application further provides a communication system, the communication system includes the communication device in FIG. 6 and the communication device in FIG. 7 , or includes the communication device in FIG. 8 and the communication device in FIG. 9 , Or include the communication device in FIG. 10 and the communication device in FIG. 11 .
  • an embodiment of the present application further provides a computer-readable storage medium, including a program or an instruction, which, when run on a computer, causes the method executed by the network device in the embodiments shown in FIG. 4 to FIG. 5 to be performed.
  • an embodiment of the present application further provides a computer-readable storage medium, including a program or an instruction, which, when run on a computer, causes the method executed by the terminal device or the first terminal device in the embodiments shown in FIG. 4 to FIG. 5 to be performed.
  • an embodiment of the present application further provides a chip, which is coupled to a memory and is configured to read and execute program instructions stored in the memory, so that the method performed by the network device in the embodiments shown in FIG. 4 to FIG. 5 is performed.
  • an embodiment of the present application further provides a chip, which is coupled to a memory and is configured to read and execute program instructions stored in the memory, so that the method performed by the terminal device or the first terminal device in the embodiments shown in FIG. 4 to FIG. 5 is performed.
  • an embodiment of the present application further provides a computer program product, including instructions, which, when running on a computer, cause the methods performed by the network devices in the embodiments shown in FIG. 4 to FIG. 5 to be executed.
  • an embodiment of the present application also provides a computer program product, including instructions, which, when run on a computer, cause the method performed by the terminal device or the first terminal device in the embodiments shown in FIG. 4 to FIG. 5 to be executed.
  • the above-mentioned embodiments may be implemented in whole or in part by software, hardware, firmware, or any combination thereof.
  • when implemented by software, they may be implemented in whole or in part in the form of a computer program product.
  • the computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, all or part of the processes or functions described in the embodiments of the present application are generated.
  • the computer may be a general purpose computer, special purpose computer, computer network, or other programmable device.
  • the computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by wired (e.g., coaxial cable, optical fiber, digital subscriber line (DSL)) or wireless (e.g., infrared, radio, microwave) means.
  • the computer-readable storage medium may be any available medium that can be accessed by a computer or a data storage device such as a server, data center, etc. that includes an integration of one or more available media.
  • the available media may be magnetic media (e.g., floppy disk, hard disk, magnetic tape), optical media (e.g., digital versatile disc (DVD)), or semiconductor media (e.g., solid state disk (SSD)), etc.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Artificial Intelligence (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Medical Informatics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Mobile Radio Communication Systems (AREA)

Abstract

A method for updating a machine learning model, and a communication apparatus, which relate to the technical field of artificial intelligence and the technical field of communications. The method comprises: a network device determining corresponding model training configuration information for a terminal device according to the computing capability of the terminal device; receiving, after sending the model training configuration information to the terminal device, a model update parameter sent by the terminal device, wherein the model update parameter is a model parameter that is updated after the terminal device trains a first machine learning model according to the model training configuration information; and updating a second machine learning model according to the received model update parameter. Therefore, according to the computing capability of each terminal device, the difference in the time for each terminal device to report a model update parameter to a network device is reduced, and the network device can finish model updating in as short a time as possible on the basis of the model update parameter reported by each terminal device, thereby improving the convergence speed of model updating, and improving the update efficiency of a machine learning model.

Description

A method and communication device for updating a machine learning model
CROSS-REFERENCE TO RELATED APPLICATIONS
This application claims priority to the Chinese patent application filed with the Chinese Patent Office on August 24, 2020 with application number 202010858858.7 and entitled "A method and communication device for updating a machine learning model", the entire contents of which are incorporated by reference in this application.
TECHNICAL FIELD
The present application relates to the technical field of artificial intelligence, and in particular, to a method and a communication device for updating a machine learning model.
BACKGROUND
Wireless communication networks are developing in the direction of network diversification, broadbandization, integration, and intelligence. Wireless transmission uses ever higher frequency spectrum, ever wider bandwidth, and ever more antennas, and traditional communication methods are too complex and their performance is difficult to guarantee. In addition, with the explosive development of smart terminals and various applications, the behavior and performance factors of wireless communication networks are more dynamic and unpredictable than in the past. Operating increasingly complex wireless communication networks at low cost and with high efficiency is a challenge currently facing operators.
With the development of artificial intelligence (AI) technology and machine learning (ML) technology, AI/ML will also take on increasingly important tasks in wireless communication networks. Currently, in wireless communication networks, AI/ML is being introduced between terminal devices and the network side in fields such as the physical layer, medium access control, radio resource control, radio resource management, and operation and maintenance. Terminal devices and network devices (such as base stations), as parts of the wireless communication network, can both introduce AI/ML to process related communication transactions; specifically, they can use their respective local data to train machine learning models, and then process related communication transactions through the trained machine learning models.
The data in terminal devices generally involves user privacy. To avoid leakage of users' private data, each terminal device generally trains, locally and using its own data, the machine learning model pre-distributed by the network device, and then sends the model update parameters obtained after training to the network device; the network device then aggregates the model update parameters sent by the participants (that is, the aforementioned terminal devices that perform model training) and directly updates its local machine learning model accordingly. In this process, the user data in the terminal devices is generally different, the capabilities of the terminal devices for model training generally also differ, and the configuration information used by each terminal device for model training is configured by the terminal itself or selected by the end user, so the time each terminal device spends training the machine learning model is generally different. As a result, the times at which the network device receives the model update parameters uploaded by the terminal devices generally differ considerably, while the local model update performed by the network device requires the model update parameters reported by all participants. Therefore, the time differences with which the terminal devices report their respective model update parameters also affect the time required by the network device to perform the local model update, so that the convergence speed of the machine learning model update is slow and the update efficiency is low.
发明内容SUMMARY OF THE INVENTION
本申请实施例提供一种更新机器学习模型的方法及通信装置,用于提高对机器学习模型更新的收敛速度,以提高机器学习模型的更新效率。Embodiments of the present application provide a method and a communication device for updating a machine learning model, which are used to improve the convergence speed of updating the machine learning model, so as to improve the updating efficiency of the machine learning model.
第一方面,提供一种更新机器学习模型的方法,该方法可以应用于网络设备,也可以应用于网络设备内部的芯片。以该方法应用于网络设备为例,在该方法中,网络设备根据终端设备的计算能力为该终端设备确定对应的模型训练配置信息,并在将该模型训练配置信息发送给终端设备后接收到该终端设备发送的模型更新参数,再根据接收到的模型更新参数对网络设备中的第二机器学习模型进行更新。其中,终端设备发送的模型更新参数是该终端设备根据网络设备发送的模型训练配置信息对本地的第一机器学习模型进行本地训练后得到的更新参数。In a first aspect, a method for updating a machine learning model is provided, and the method can be applied to a network device or a chip inside the network device. Taking the method applied to a network device as an example, in this method, the network device determines the corresponding model training configuration information for the terminal device according to the computing capability of the terminal device, and receives the model training configuration information after sending the model training configuration information to the terminal device. The model update parameter sent by the terminal device is then updated according to the received model update parameter to the second machine learning model in the network device. The model update parameter sent by the terminal device is an update parameter obtained by the terminal device performing local training on the local first machine learning model according to the model training configuration information sent by the network device.
其中,终端设备本地的机器学习模型称作第一机器学习模型,网络设备本地的机器学习模型称作第二机器学习模型,第一机器学习模型是由网络设备为终端设备分发的。第一机器学习模型和第二机器学习模型是同一种类型的机器学习模型,或者,第一机器学习模型和第二机器学习模型是不同类型的机器学习模型,为便于网络设备根据各个终端设备上报的模型更新参数信息进行本地的模型更新,第一机器学习模型和第二机器学习模型是相同类型的机器学习模型。The machine learning model local to the terminal device is called the first machine learning model, the machine learning model local to the network device is called the second machine learning model, and the first machine learning model is distributed by the network device for the terminal device. The first machine learning model and the second machine learning model are of the same type of machine learning model, or the first machine learning model and the second machine learning model are different types of machine learning models. The model update parameter information is used for local model update, and the first machine learning model and the second machine learning model are the same type of machine learning model.
在本申请实施例中,网络设备根据各个终端设备的计算能力为各个终端设备分配对应的模型训练配置信息,使得各个终端设备对本地的机器学习模型进行训练时所使用的模型训练配置信息是与自身的计算能力相匹配的,相对于相关技术中的由各个终端设备自行相互独立的选择模型训练配置信息的方式,本方案中由网络侧统一根据各个终端设备自身的计算能力为各个终端设备差异化地配置对应的模型训练配置信息,这样可减少各个终端设备在进行模型训练时由于能力差异而导致的时间差异,进而确保各个终端设备能够尽量在相同时间内完成模型训练,使得各个终端设备上报各自的模型更新参数的时间是大致相同的,减少各个终端设备上报模型更新参数时间上的差异性,从而减少网络设备接收各个终端设备发送的模型更新参数的时间差异性,以便于网络设备基于各个终端设备上报的模型更新参数进行模型更新时能够尽量在短时间内完成,提高模型更新的收敛速度,从而提高机器学习模型的更新效率。In the embodiment of the present application, the network device allocates corresponding model training configuration information to each terminal device according to the computing capability of each terminal device, so that the model training configuration information used by each terminal device to train the local machine learning model is the same as the In contrast to the way in which each terminal device independently selects the model training configuration information in the related art, in this solution, the network side uniformly calculates the difference of each terminal device according to the computing capability of each terminal device itself. The corresponding model training configuration information can be configured so as to reduce the time difference caused by different capabilities of each terminal device during model training, thereby ensuring that each terminal device can complete the model training within the same time as possible, so that each terminal device can report The time of the respective model update parameters is roughly the same, which reduces the difference in the time when each terminal device reports the model update parameters, thereby reducing the time difference between the network device receiving the model update parameters sent by each terminal device, so that the network device based on each The model update parameters reported by the terminal device can be completed in a short time as much as possible to improve the convergence speed of the model update, thereby improving the update efficiency of the machine learning model.
在一种可能的实现方式中,网络设备可以接收来自终端设备的第一算力指示信息,或者可以在向终端设备发送计算能力获取请求后接收来自终端设备的第二算力指示信息,或者可以接收来自其它网络设备的第三算力指示信息。In a possible implementation manner, the network device may receive the first computing power indication information from the terminal device, or may receive the second computing power indication information from the terminal device after sending a computing capability acquisition request to the terminal device, or may Receive third computing power indication information from other network devices.
其中,第一算力指示信息、第二算力指示信息、第三算力指示信息均是用于指示终端设备的计算能力的信息,也就是说,该实施方式中提供了三种获取终端设备的计算能力的方式,如此可以提高获取终端设备的计算能力的方式的灵活性。Wherein, the first computing power indication information, the second computing power indication information, and the third computing power indication information are all information used to indicate the computing capability of the terminal device, that is to say, in this embodiment, three types of acquisition terminal equipment are provided. In this way, the flexibility of the method for acquiring the computing power of the terminal device can be improved.
在一种可能的实现方式中,模型训练配置信息包括超参数、精度、训练时间信息中的至少一种。In a possible implementation manner, the model training configuration information includes at least one of hyperparameters, precision, and training time information.
在该方案中,网络设备根据终端设备的计算能力可以为该终端设备为配置一种或多种模型训练配置信息,配置的灵活性较高。并且,配置的模型训练配置信息是终端设备进行模型训练常规使用的,这样一般可以满足大多数终端设备进行本地模型训练的配置需求,通用性较好。In this solution, the network device can configure one or more model training configuration information for the terminal device according to the computing capability of the terminal device, and the configuration flexibility is high. In addition, the configured model training configuration information is routinely used by terminal devices for model training, which generally meets the configuration requirements of most terminal devices for local model training, and has good versatility.
在一种可能的实现方式中,网络设备还向终端设备发送训练特征信息,该训练特征信息用于指示终端设备对该终端设备中的第一机器学习模型进行训练所使用的训练特征集。In a possible implementation manner, the network device further sends training feature information to the terminal device, where the training feature information is used to indicate the training feature set used by the terminal device to train the first machine learning model in the terminal device.
在该方案中,网络设备向终端设备发送训练特征信息,可以让各个参与本地训练的终端设备均使用相同的训练特征信息进行本地训练,从而减少各个终端设备基于不同训练特 征信息进行本地训练时所花时间的差异。In this solution, the network device sends the training feature information to the terminal device, so that each terminal device participating in the local training can use the same training feature information to perform local training, thereby reducing the time when each terminal device performs local training based on different training feature information. difference in time spent.
在一种可能的实现方式中,网络设备还向终端设备发送精度评估信息,该精度评估信息包括用于评估精度的方法或用于评估精度的测试样本中的至少一种。In a possible implementation manner, the network device further sends accuracy evaluation information to the terminal device, where the accuracy evaluation information includes at least one of a method for evaluating the accuracy or a test sample for evaluating the accuracy.
在该方案中,通过向终端设备指定精度评估信息,可以让各个参与本地模型训练的终端设备均使用相同的精度评估信息对本地训练后的机器学习模型进行精度评估,由于采用的是相同的精度评估方式,可以尽量使得各个终端设备在同一精度评估标准下达到规定的精度要求,从而可以减少各个终端设备进行本地训练所花时间的差异。In this solution, by specifying the accuracy evaluation information to the terminal device, each terminal device participating in the local model training can use the same accuracy evaluation information to evaluate the accuracy of the locally trained machine learning model, because the same accuracy is used. The evaluation method can try to make each terminal device meet the specified accuracy requirements under the same accuracy evaluation standard, thereby reducing the difference in the time spent by each terminal device for local training.
在一种可能的实现方式中,网络设备还接收来自终端设备的精度指示信息,该精度指示信息用于指示该终端设备利用网络设备发送的模型训练配置信息对本地的第一机器学习模型进行本地训练后达到的精度。In a possible implementation manner, the network device further receives accuracy indication information from the terminal device, where the accuracy indication information is used to instruct the terminal device to perform a local first machine learning model on the local first machine learning model by using the model training configuration information sent by the network device. Accuracy achieved after training.
在该方案中,终端设备除了向网络设备反馈模型更新参数,同时还可以将对应的模型训练的精度反馈给网络设备,这样,以便于网络设备知晓终端设备的训练效果,可以作为网络设备后续再为终端设备配置模型训练配置信息时作为参考依据,以尽量提高训练效果。In this solution, in addition to feeding back the model update parameters to the network device, the terminal device can also feed back the accuracy of the corresponding model training to the network device, so that the network device can know the training effect of the terminal device, which can be used as a follow-up When configuring the model training configuration information for the terminal device, it is used as a reference to maximize the training effect.
In a possible implementation manner, before receiving the model update parameters sent by the terminal device, the network device further determines a time point for acquiring the model update parameters of the terminal device, and sends an acquisition request to the terminal device at that time point, where the acquisition request is used to instruct the terminal device to send its model update parameters to the network device.
In this solution, the network device can explicitly control when it requests the model update parameters from each terminal device. On top of using the model training configuration information to reduce the differences in when the terminal devices complete local training, this further reduces the differences in when the terminal devices report their model update parameters, and therefore the differences in when the network device actually obtains the model update parameters sent by the terminal devices.
In a possible implementation manner, before receiving the model update parameters sent by the terminal device, the network device further determines a time point for acquiring the model update parameters of the terminal device, and sends reporting time information to the terminal device, where the reporting time information indicates that the model update parameters are to be sent to the network device at the determined time point.
In this solution, the network device can explicitly control the specific time at which each terminal device reports its model update parameters. On top of using the model training configuration information to reduce the differences in when the terminal devices complete local training, this further reduces the differences in when the terminal devices report their model update parameters, and therefore the differences in when the network device actually obtains the model update parameters sent by the terminal devices.
In a possible implementation manner, the network device determines, for each of multiple terminal devices, the transmission duration needed by that terminal device to send its model update parameters to the network device, and determines the time point for acquiring the model update parameters of the above terminal device according to the transmission duration corresponding to each terminal. The multiple terminal devices may or may not include the above terminal device.
In this solution, the network device actively requests the model update parameters from each terminal device, and sends the acquisition request to each terminal device at a time point matched to that terminal device. In addition, the transmission durations needed by most or even all participants to deliver their model update parameters can be considered together, so the time at which each terminal device reports its parameters can be controlled more precisely. This reduces the differences in when the network device obtains the model update parameters sent by the terminal devices, which in turn increases the convergence speed of the local model update and improves the model update efficiency. An illustrative scheduling sketch is given below.
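As an illustrative, non-limiting sketch of this idea (the names `Terminal`, `upload_duration_s` and `schedule_acquisition_requests`, and the simple "slowest upload sets the common deadline" rule, are assumptions for illustration and not part of the disclosure), the network device could stagger its acquisition requests so that uploads with different transmission durations arrive at roughly the same time:

```python
# Hypothetical sketch: stagger the acquisition requests so that uploads from
# terminals with different transmission durations arrive at about the same time.
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Terminal:
    terminal_id: str
    upload_duration_s: float  # estimated time to transmit its model update parameters


def schedule_acquisition_requests(terminals: List[Terminal]) -> Dict[str, float]:
    """Return, per terminal, the offset in seconds from now at which the network
    device sends the acquisition request so that all uploads finish together."""
    # The slowest upload determines the common target arrival time.
    target = max(t.upload_duration_s for t in terminals)
    # A terminal with a shorter upload is asked later; a slower one is asked earlier.
    return {t.terminal_id: target - t.upload_duration_s for t in terminals}


if __name__ == "__main__":
    plan = schedule_acquisition_requests([
        Terminal("UE-1", upload_duration_s=0.8),
        Terminal("UE-2", upload_duration_s=2.0),
        Terminal("UE-3", upload_duration_s=1.2),
    ])
    print(plan)  # UE-2 is requested immediately, UE-1 and UE-3 later
```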
在一种可能的实现方式中,上述的获取请求还用于指示需要获取指定的模型更新参数。In a possible implementation manner, the above obtaining request is also used to indicate that the specified model update parameters need to be obtained.
In this solution, the network device can instruct the terminal to upload specific model update parameters rather than necessarily all of them. This reduces the amount of data and the time needed for the terminal device to transmit model update parameters to the network device, minimizes ineffective transmissions, improves transmission efficiency, saves network transmission resources, and reduces air-interface resource overhead.
在一种可能的实现方式中,网络设备还接收来自终端设备的参数可用性指示信息,该参数可用性指示信息用于指示该终端设备中的模型更新参数的可用性。In a possible implementation manner, the network device further receives parameter availability indication information from the terminal device, where the parameter availability indication information is used to indicate the availability of the model update parameter in the terminal device.
In this solution, the parameter availability indication information indicates the availability of the model update parameters in the terminal device. Based on this indication, the network device knows which model update parameters in the terminal device are available, which improves the consistency of understanding between the network device and the terminal device and makes the acquisition of the model update parameters sent by the terminal device more explicit. A minimal encoding sketch is given below.
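A minimal sketch of one possible encoding, assuming the model update is a dictionary of named parameters (the per-name boolean map and the names themselves are illustrative assumptions, not the format defined by this application):

```python
# Hypothetical encoding of the parameter availability indication: the terminal
# reports, per parameter name, whether it currently holds a usable value.
from typing import Dict, List, Set


def build_availability_indication(model_update: Dict[str, List[float]],
                                  stale: Set[str]) -> Dict[str, bool]:
    """Map each parameter name to True if an up-to-date value is available."""
    return {name: name not in stale for name in model_update}


update = {"layer1.weight": [0.1, -0.2], "layer1.bias": [0.01]}
print(build_availability_indication(update, stale={"layer1.bias"}))
# {'layer1.weight': True, 'layer1.bias': False}
```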
In a second aspect, a method for updating a machine learning model is provided. The method may be applied to a terminal device or to a chip inside the terminal device. Taking application to a terminal device as an example, the terminal device receives model training configuration information sent by a network device, where the model training configuration information is determined according to the computing capability of the terminal device. The terminal device then performs local training on the first machine learning model in the terminal device according to the received model training configuration information to obtain model update parameters, and sends the obtained model update parameters to the network device, so that the network device locally updates the second machine learning model in the network device according to the model update parameters.
The first machine learning model and the second machine learning model here can be understood according to their descriptions in the first aspect.
In the embodiments of this application, the model training configuration information that the terminal device uses for local machine learning model training is determined by the network device according to the computing capability of that terminal device, so the configuration matches the terminal device's own computing capability and helps ensure that the terminal devices can complete model training within roughly the same time. In this way, the network device can configure, for each terminal device participating in local training, model training configuration information based on that device's own computing capability, so that the terminal devices report their model update parameters at approximately the same time. This reduces the differences in when the terminal devices report their model update parameters, and therefore the differences in when the network device receives them, so that the network device can complete the model update based on the reported parameters in as short a time as possible, increasing the convergence speed of the model update and improving the update efficiency of the machine learning model. A configuration sketch follows.
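As an illustrative, non-limiting sketch of how a network device might derive such a configuration (the capability metric in FLOPS, the reference values, and the fields `local_epochs`, `batch_size`, `target_accuracy` and `time_budget_s` are assumed here purely for illustration):

```python
# Sketch under assumptions: scale the amount of local work with the terminal's
# relative compute power so that all participants share the same time budget.
from dataclasses import dataclass


@dataclass
class ModelTrainingConfig:
    local_epochs: int        # hyperparameter
    batch_size: int          # hyperparameter
    target_accuracy: float   # accuracy the local training should reach
    time_budget_s: float     # training time information signalled to the terminal


def derive_training_config(capability_flops: float,
                           reference_flops: float = 1e9,
                           reference_epochs: int = 10,
                           target_accuracy: float = 0.9,
                           time_budget_s: float = 5.0) -> ModelTrainingConfig:
    ratio = capability_flops / reference_flops
    epochs = max(1, round(reference_epochs * ratio))
    batch_size = 32 if ratio >= 1.0 else 16
    return ModelTrainingConfig(epochs, batch_size, target_accuracy, time_budget_s)


print(derive_training_config(2e9))   # stronger terminal: more local epochs
print(derive_training_config(5e8))   # weaker terminal: fewer epochs, same time budget
```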
In a possible implementation manner, before receiving the model training configuration information sent by the network device, the terminal device receives a computing capability acquisition request from the network device, and, according to that request, sends second computing power indication information to the network device to indicate the computing capability of the terminal device.
In a possible implementation manner, in addition to receiving the model training configuration information sent by the network device, the terminal device also receives training feature information from the network device, where the training feature information indicates the training feature set to be used for training the machine learning model in the terminal device. The terminal device then locally trains the first machine learning model according to the model training configuration information and the training feature information.
In a possible implementation manner, the terminal device further receives accuracy evaluation information from the network device, where the accuracy evaluation information includes at least one of a method for evaluating accuracy or test samples for evaluating accuracy, and then determines, according to the accuracy evaluation information, the accuracy achieved by the trained first machine learning model.
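A minimal sketch of one way the terminal could evaluate accuracy on network-provided test samples (the `predict` interface and the feature/label sample format are assumptions for illustration only; the actual evaluation method is whatever the accuracy evaluation information specifies):

```python
# Illustrative sketch: fraction of network-provided test samples classified correctly.
from typing import Callable, List, Tuple


def evaluate_accuracy(predict: Callable[[List[float]], int],
                      test_samples: List[Tuple[List[float], int]]) -> float:
    if not test_samples:
        return 0.0
    correct = sum(1 for features, label in test_samples if predict(features) == label)
    return correct / len(test_samples)


# Toy usage: a trivial "model" that thresholds the first feature.
samples = [([0.2], 0), ([0.9], 1), ([0.7], 1), ([0.1], 0)]
print(evaluate_accuracy(lambda x: int(x[0] > 0.5), samples))  # 1.0
```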
In a possible implementation manner, the terminal device further sends accuracy indication information to the network device, where the accuracy indication information indicates the accuracy achieved by the terminal device after it trains its local machine learning model using the model training configuration information sent by the network device.
In a possible implementation manner, the terminal device further receives an acquisition request from the network device, where the acquisition request instructs the terminal device to send its model update parameters to the network device, and the terminal device then sends the model update parameters to the network device according to the acquisition request.
在一种可能的实现方式中,终端设备还接收来自网络设备的上报时间信息,并在该上报时间信息所指示的时间点向网络设备发送模型更新参数。In a possible implementation manner, the terminal device further receives reporting time information from the network device, and sends the model update parameter to the network device at the time point indicated by the reporting time information.
在一种可能的实现方式中,终端设备还向网络设备发送参数可用性指示信息,该参数可用性指示信息用于指示该终端设备中的模型更新参数的可用性。In a possible implementation manner, the terminal device further sends parameter availability indication information to the network device, where the parameter availability indication information is used to indicate the availability of the model update parameter in the terminal device.
上述第二方面中的任一实现方式可以达到的技术效果可以参照上述第一方面中有益效果的描述,此处不再重复赘述。For the technical effects that can be achieved by any implementation manner of the foregoing second aspect, reference may be made to the description of the beneficial effects in the foregoing first aspect, which will not be repeated here.
In a third aspect, a method for updating a machine learning model is provided. The method may be applied to a network device or to a chip in the network device. Taking application to a network device as an example, the network device selects, according to the transmission duration needed by each of multiple terminal devices to send its model update parameters to the network device, a time point for acquiring the model update parameters of a first terminal device. At the selected time point, the network device sends an acquisition request to the first terminal device, or sends to the terminal device reporting time information instructing the first terminal device to send its model update parameters at that time point. The network device then receives the model update parameters sent by the first terminal device and locally updates the second machine learning model in the network device according to those parameters.
In the embodiments of this application, the network device actively requests the model update parameters held by each terminal device, and the time point of each request is determined by the network device according to the amount of data that the terminal device actually needs to transmit and the condition (quality) of the transmission link. Specifically, the network device selects the time point for acquiring each terminal device's model update parameters according to the transmission duration that terminal device needs to send them, and requests the parameters from different terminal devices at different time points according to the differences in their transmission durations. This minimizes the differences, caused by differing transmission durations, in when the network device receives the model update parameters sent by the terminal devices, so that the parameters sent by the terminal devices reach the network device at roughly the same time (or within approximately the same short interval). The differences in when the terminal devices deliver their model update parameters to the network device are therefore reduced, and the network device can locally update its machine learning model according to the terminal devices' model update parameters within a short time, which increases the convergence speed of the local update and improves the update efficiency of the machine learning model. A duration-estimation sketch follows.
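As an illustrative assumption (the simple payload-over-rate model below is not part of the disclosure), the transmission duration used to pick the request time could be estimated from the amount of data each terminal must upload and the quality (rate) of its uplink, and then fed into a scheduling rule such as the one sketched earlier:

```python
# Sketch under assumptions: estimate how long a terminal needs to upload its
# model update from the payload size and the current uplink rate.
def estimated_upload_duration_s(payload_bytes: int, uplink_rate_bps: float) -> float:
    """Time needed to transmit the model update parameters over the air interface."""
    return payload_bytes * 8 / uplink_rate_bps


# A 400 kB update on a 2 Mbit/s uplink needs about 1.6 s, so the network device
# would request this terminal's parameters earlier than a faster terminal's.
print(estimated_upload_duration_s(400_000, 2_000_000.0))  # 1.6
```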
在一种可能的实现方式中,网络设备还接收第一终端设备发送的参数可用性指示信息。In a possible implementation manner, the network device further receives parameter availability indication information sent by the first terminal device.
In this solution, the parameter availability indication information indicates the availability of the model update parameters in the terminal device. Based on this indication, the network device knows which model update parameters in the terminal device are available, which improves the consistency of understanding between the network device and the terminal device and makes the acquisition of the model update parameters sent by the terminal device more explicit.
在一种可能的实现方式中,网络设备向第一终端设备指示指定的模型更新参数。可选的,可以通过获取请求指示网络设备指定的模型更新参数。In a possible implementation manner, the network device indicates the specified model update parameter to the first terminal device. Optionally, the model update parameters specified by the network device may be instructed through the acquisition request.
In this solution, the network device can instruct the terminal to upload specific model update parameters rather than necessarily all of them. This reduces the amount of data and the time needed for the terminal device to transmit model update parameters to the network device, minimizes ineffective transmissions, improves transmission efficiency, saves network transmission resources, and reduces air-interface resource overhead. A selection sketch is given below.
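A minimal sketch of the terminal-side selection, assuming the model update is a dictionary of named parameters and the acquisition request carries an optional list of requested names (both assumptions made only for illustration):

```python
# Illustrative sketch: upload only the parameters named in the acquisition
# request; if nothing specific is requested, upload the full model update.
from typing import Dict, List, Optional


def select_requested_parameters(model_update: Dict[str, List[float]],
                                requested: Optional[List[str]]) -> Dict[str, List[float]]:
    if not requested:
        return model_update
    return {name: value for name, value in model_update.items() if name in requested}


update = {"layer1.weight": [0.1, -0.2], "layer1.bias": [0.01], "layer2.weight": [0.3]}
print(select_requested_parameters(update, ["layer1.weight"]))
# {'layer1.weight': [0.1, -0.2]}
```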
In a fourth aspect, a method for updating a machine learning model is provided. The method may be applied to a terminal device or to a chip in the terminal device. Taking application to a first terminal device as an example, the first terminal device receives an acquisition request sent by a network device, where the time point at which the acquisition request is sent is determined by the network device according to the transmission duration needed by each of multiple terminal devices to send its model update parameters to the network device, and the acquisition request is sent at that determined time point; the first terminal device then sends its model update parameters to the network device according to the acquisition request. Alternatively, the first terminal device receives reporting time information sent by the network device and sends its model update parameters to the network device at the time point indicated by the reporting time information. In either case, the network device locally updates its local machine learning model according to the model update parameters sent by the first terminal device.
在一种可能的实现方式中,第一终端设备还向网络设备发送参数可用性指示信息。In a possible implementation manner, the first terminal device further sends parameter availability indication information to the network device.
在一种可能的实现方式中,第一终端设备还接收网络设备发送的指示信息,该指示信息用于指示网络设备指定的模型更新参数。In a possible implementation manner, the first terminal device further receives indication information sent by the network device, where the indication information is used to indicate a model update parameter specified by the network device.
在一种可能的实现方式中,用于指示网络设备指定的模型更新参数的指示信息为上述的获取请求。In a possible implementation manner, the indication information used to indicate the model update parameter specified by the network device is the above-mentioned acquisition request.
上述第四方面中的任一实现方式可以达到的技术效果可以参照上述第三方面中有益效果的描述,此处不再重复赘述。For the technical effects that can be achieved by any of the implementation manners of the above-mentioned fourth aspect, reference may be made to the description of the beneficial effects in the above-mentioned third aspect, which will not be repeated here.
In a fifth aspect, a communication apparatus is provided. The communication apparatus may be a network device or a chip arranged inside the network device, and includes modules for performing the method described in the first aspect or any possible implementation manner of the first aspect. Exemplarily, the communication apparatus includes a processing unit and a communication unit, where:
处理单元,用于根据终端设备的计算能力,确定该终端设备对应的模型训练配置信息;a processing unit, configured to determine model training configuration information corresponding to the terminal device according to the computing capability of the terminal device;
通信单元,用于将模型训练配置信息发送给终端设备,以及接收终端设备发送的模型更新参数,其中,模型更新参数是终端设备根据模型训练配置信息对第一机器学习模型训练后更新的模型参数;A communication unit, configured to send the model training configuration information to the terminal device, and receive the model update parameter sent by the terminal device, wherein the model update parameter is the model parameter updated by the terminal device after training the first machine learning model according to the model training configuration information ;
处理单元,还用于根据模型更新参数对第二机器学习模型进行更新。The processing unit is further configured to update the second machine learning model according to the model update parameter.
在一种可能的实现方式中,所述通信单元还用于:In a possible implementation manner, the communication unit is further used for:
接收来自所述终端设备的第一算力指示信息,所述第一算力指示信息用于指示所述终端设备的计算能力;或者,receiving first computing power indication information from the terminal device, where the first computing power indication information is used to indicate the computing capability of the terminal device; or,
在向所述终端设备发送计算能力获取请求后,接收来自所述终端设备的第二算力指示信息,所述第二算力指示信息用于指示所述终端设备的计算能力;或者,After sending a computing capability acquisition request to the terminal device, receive second computing power indication information from the terminal device, where the second computing power indication information is used to indicate the computing capability of the terminal device; or,
接收来自其它网络设备的第三算力指示信息,所述第三算力指示信息用于指示所述终端设备的计算能力。Receive third computing power indication information from other network devices, where the third computing power indication information is used to indicate the computing capability of the terminal device.
在一种可能的实现方式中,所述模型训练配置信息包括超参数、精度、训练时间信息中的至少一种。In a possible implementation manner, the model training configuration information includes at least one of hyperparameters, precision, and training time information.
In a possible implementation manner, the communication unit is further configured to send training feature information to the terminal device, where the training feature information indicates the training feature set to be used by the terminal device for training the first machine learning model.
In a possible implementation manner, the communication unit is further configured to send accuracy evaluation information to the terminal device, where the accuracy evaluation information includes at least one of a method for evaluating accuracy or test samples for evaluating accuracy.
In a possible implementation manner, the communication unit is further configured to receive accuracy indication information from the terminal device, where the accuracy indication information indicates the accuracy achieved by the terminal device after training the first machine learning model using the model training configuration information.
在一种可能的实现方式中,所述处理单元还用于确定用于获取所述终端设备的模型更新参数的时间点;则,所述通信单元还用于:In a possible implementation manner, the processing unit is further configured to determine a time point for acquiring the model update parameter of the terminal device; then, the communication unit is further configured to:
在所述时间点向所述终端设备发送获取请求,所述获取请求用于指示所述终端设备向所述网络设备发送所述终端设备的模型更新参数;或者,Send an acquisition request to the terminal device at the time point, where the acquisition request is used to instruct the terminal device to send the model update parameter of the terminal device to the network device; or,
向所述终端设备发送上报时间信息,所述上报时间信息用于指示在所述时间点向所述 网络设备发送模型更新参数。Sending reporting time information to the terminal device, where the reporting time information is used to indicate that model update parameters are sent to the network device at the time point.
在一种可能的实现方式中,所述处理单元具体用于:In a possible implementation manner, the processing unit is specifically used for:
确定多个终端设备中的各个终端设备向所述网络设备发送各自的模型更新参数的传输时长;determining the transmission duration for each terminal device in the plurality of terminal devices to send the respective model update parameters to the network device;
determining the time point for the acquisition according to the transmission duration corresponding to each terminal device.
在一种可能的实现方式中,所述获取请求还用于指示需要获取指定的模型更新参数。In a possible implementation manner, the obtaining request is further used to indicate that the specified model update parameters need to be obtained.
在一种可能的实现方式中,所述通信单元还用于接收来自所述终端设备的参数可用性指示信息,所述参数可用性指示信息用于指示所述终端设备中的模型更新参数的可用性。In a possible implementation manner, the communication unit is further configured to receive parameter availability indication information from the terminal device, where the parameter availability indication information is used to indicate the availability of the model update parameter in the terminal device.
上述第五方面中的任一实现方式可以达到的技术效果可以参照上述第一方面中有益效果的描述,此处不再重复赘述。For the technical effects that can be achieved by any of the implementation manners of the above fifth aspect, reference may be made to the description of the beneficial effects in the above first aspect, which will not be repeated here.
In a sixth aspect, a communication apparatus is provided. The communication apparatus may be a terminal device or a chip arranged inside the terminal device, and includes modules for performing the method described in the second aspect or any possible implementation manner of the second aspect. Exemplarily, the communication apparatus includes a communication unit and a processing unit, where:
通信单元,用于接收网络设备发送的模型训练配置信息,所述模型训练配置信息是根据终端设备的计算能力确定的;a communication unit, configured to receive model training configuration information sent by the network device, where the model training configuration information is determined according to the computing capability of the terminal device;
处理单元,用于根据所述模型训练配置信息对第一机器学习模型进行训练,以得到模型更新参数;a processing unit, configured to train the first machine learning model according to the model training configuration information to obtain model update parameters;
所述通信单元,还用于将所述模型更新参数发送给所述网络设备,所述模型更新参数用于所述网络设备对第二机器学习模型进行更新。The communication unit is further configured to send the model update parameter to the network device, where the model update parameter is used by the network device to update the second machine learning model.
在一种可能的实现方式中,所述通信单元还用于:In a possible implementation manner, the communication unit is further used for:
接收所述网络设备发送的计算能力获取请求;receiving a computing capability acquisition request sent by the network device;
根据所述计算能力获取请求,向所述网络设备发送第二算力指示信息,所述第二算力指示信息用于指示所述终端设备的计算能力。According to the computing capability acquisition request, second computing power indication information is sent to the network device, where the second computing power indication information is used to indicate the computing capability of the terminal device.
In a possible implementation manner, the communication unit is further configured to receive training feature information from the network device, where the training feature information indicates the training feature set to be used by the terminal device for training the first machine learning model; the processing unit is then further configured to train the first machine learning model according to the model training configuration information and the training feature information.
In a possible implementation manner, the communication unit is further configured to receive accuracy evaluation information from the network device, where the accuracy evaluation information includes at least one of a method for evaluating accuracy or test samples for evaluating accuracy; the processing unit is then further configured to determine, according to the accuracy evaluation information, the accuracy achieved by the trained machine learning model.
In a possible implementation manner, the communication unit is configured to send accuracy indication information to the network device, where the accuracy indication information indicates the accuracy achieved by the terminal device after training the first machine learning model using the model training configuration information.
In a possible implementation manner, the communication unit is further configured to receive an acquisition request from the network device and send the model update parameters to the network device according to the acquisition request, where the acquisition request instructs the terminal device to send its model update parameters to the network device.
在一种可能的实现方式中,所述通信单元还用于接收来自所述网络设备的上报时间信息,并在所述上报时间信息所指示的获取时间向所述网络设备发送所述模型更新参数。In a possible implementation manner, the communication unit is further configured to receive report time information from the network device, and send the model update parameter to the network device at the acquisition time indicated by the report time information .
在一种可能的实现方式中,所述通信单元还用于向所述网络设备发送参数可用性指示信息,所述参数可用性指示信息用于指示所述终端设备中的模型更新参数的可用性。In a possible implementation manner, the communication unit is further configured to send parameter availability indication information to the network device, where the parameter availability indication information is used to indicate the availability of the model update parameter in the terminal device.
上述第六方面中的任一实现方式可以达到的技术效果可以参照上述第二方面中有益效果的描述,此处不再重复赘述。For the technical effects that can be achieved by any of the implementation manners of the above sixth aspect, reference may be made to the description of the beneficial effects in the above second aspect, which will not be repeated here.
In a seventh aspect, a communication apparatus is provided. The communication apparatus may be a network device or a chip arranged inside the network device, and includes modules for performing the method described in the third aspect or any possible implementation manner of the third aspect. Exemplarily, the communication apparatus includes a processing unit and a communication unit, where:
处理单元,用于根据多个终端设备中的各个终端设备向网络设备发送各自的模型更新参数的传输时长,选择获取第一终端设备的模型更新参数的时间点;a processing unit, configured to select a time point for acquiring the model update parameters of the first terminal device according to the transmission duration of each terminal device in the plurality of terminal devices sending their respective model update parameters to the network device;
a communication unit, configured to send an acquisition request to the first terminal device at the time point, or to send to the terminal device reporting time information instructing the first terminal device to send its model update parameters at the acquisition time, where the acquisition request is used to request the first terminal device to send model update parameters to the network device; and to receive the model update parameters sent by the first terminal device;
所述处理单元,还用于根据所述模型更新参数对网络设备中的第二机器学习模型进行更新。The processing unit is further configured to update the second machine learning model in the network device according to the model update parameter.
在一种可能的实现方式中,所述通信单元还用于接收来自所述第一终端设备的参数可用性指示信息。In a possible implementation manner, the communication unit is further configured to receive parameter availability indication information from the first terminal device.
在一种可能的实现方式中,所述通信单元还用于接收来自所述网络设备的用于指示指定的模型更新参数的指示信息。In a possible implementation manner, the communication unit is further configured to receive indication information from the network device for indicating the specified model update parameter.
在一种可能的实现方式中,所述指示信息携带在所述获取请求中。In a possible implementation manner, the indication information is carried in the acquisition request.
上述第七方面中的任一实现方式可以达到的技术效果可以参照上述第三方面中有益效果的描述,此处不再重复赘述。For the technical effects that can be achieved by any one of the implementation manners of the above seventh aspect, reference may be made to the description of the beneficial effects in the above third aspect, which will not be repeated here.
In an eighth aspect, a communication apparatus is provided. The communication apparatus may be a terminal device or a chip arranged inside the terminal device, and includes modules for performing the method described in the fourth aspect or any possible implementation manner of the fourth aspect. Exemplarily, the communication apparatus includes a communication unit and a processing unit, where:
a communication unit, configured to receive an acquisition request or reporting time information sent by a network device, where the time point at which the acquisition request is sent is determined by the network device according to the transmission duration needed by each of multiple terminal devices to send its model update parameters to the network device;
处理单元,用于根据所述获取请求确定需要发送的模型更新参数;a processing unit, configured to determine model update parameters to be sent according to the acquisition request;
the communication unit is further configured to send the determined model update parameters to the network device, or to send the model update parameters to the network device at the time point indicated by the reporting time information, where the model update parameters are used by the network device to update the second machine learning model in the network device.
在一种可能的实现方式中,所述通信单元还用于向所述网络设备发送参数可用性指示信息。In a possible implementation manner, the communication unit is further configured to send parameter availability indication information to the network device.
在一种可能的实现方式中,所述通信单元还用于接收来自所述网络设备的用于指示指定的模型更新参数的指示信息。In a possible implementation manner, the communication unit is further configured to receive indication information from the network device for indicating the specified model update parameter.
在一种可能的实现方式中,所述指示信息携带在所述获取请求中。In a possible implementation manner, the indication information is carried in the acquisition request.
上述第八方面中的任一实现方式可以达到的技术效果可以参照上述第四方面中有益效果的描述,此处不再重复赘述。For the technical effects that can be achieved by any one of the implementation manners of the above-mentioned eighth aspect, reference may be made to the description of the beneficial effects in the above-mentioned fourth aspect, which will not be repeated here.
In a ninth aspect, a communication apparatus is provided, including: at least one processor; and a communication interface communicatively connected to the at least one processor. By executing instructions stored in a memory, the at least one processor causes the communication apparatus to perform, through the communication interface, the method described in the first aspect or any possible implementation manner of the first aspect.
可选的,所述存储器位于所述装置之外。Optionally, the memory is located outside the device.
可选的,所述装置包括所述存储器,所述存储器与所述至少一个处理器相连,所述存储器存储有可被所述至少一个处理器执行的指令。Optionally, the apparatus includes the memory connected to the at least one processor, the memory storing instructions executable by the at least one processor.
In a tenth aspect, a communication apparatus is provided, including: at least one processor; and a communication interface communicatively connected to the at least one processor. By executing instructions stored in a memory, the at least one processor causes the communication apparatus to perform, through the communication interface, the method described in the second aspect or any possible implementation manner of the second aspect.
可选的,所述存储器位于所述装置之外。Optionally, the memory is located outside the device.
可选的,所述装置包括所述存储器,所述存储器与所述至少一个处理器相连,所述存储器存储有可被所述至少一个处理器执行的指令。Optionally, the apparatus includes the memory connected to the at least one processor, the memory storing instructions executable by the at least one processor.
In an eleventh aspect, a communication apparatus is provided, including: at least one processor; and a communication interface communicatively connected to the at least one processor. By executing instructions stored in a memory, the at least one processor causes the communication apparatus to perform, through the communication interface, the method described in the third aspect or any possible implementation manner of the third aspect.
可选的,所述存储器位于所述装置之外。Optionally, the memory is located outside the device.
可选的,所述装置包括所述存储器,所述存储器与所述至少一个处理器相连,所述存储器存储有可被所述至少一个处理器执行的指令。Optionally, the apparatus includes the memory connected to the at least one processor, the memory storing instructions executable by the at least one processor.
In a twelfth aspect, a communication apparatus is provided, including: at least one processor; and a communication interface communicatively connected to the at least one processor. By executing instructions stored in a memory, the at least one processor causes the communication apparatus to perform, through the communication interface, the method described in the fourth aspect or any possible implementation manner of the fourth aspect.
可选的,所述存储器位于所述装置之外。Optionally, the memory is located outside the device.
可选的,所述装置包括所述存储器,所述存储器与所述至少一个处理器相连,所述存储器存储有可被所述至少一个处理器执行的指令。Optionally, the apparatus includes the memory connected to the at least one processor, the memory storing instructions executable by the at least one processor.
In a thirteenth aspect, a computer-readable storage medium is provided, including a program or instructions. When the program or instructions are run on a computer, the method described in the first aspect or any possible implementation manner of the first aspect is performed.
In a fourteenth aspect, a computer-readable storage medium is provided, including a program or instructions. When the program or instructions are run on a computer, the method described in the second aspect or any possible implementation manner of the second aspect is performed.
In a fifteenth aspect, a computer-readable storage medium is provided, including a program or instructions. When the program or instructions are run on a computer, the method described in the third aspect or any possible implementation manner of the third aspect is performed.
In a sixteenth aspect, a computer-readable storage medium is provided, including a program or instructions. When the program or instructions are run on a computer, the method described in the fourth aspect or any possible implementation manner of the fourth aspect is performed.
In a seventeenth aspect, a chip is provided. The chip is coupled to a memory and is configured to read and execute program instructions stored in the memory, so that the method described in the first aspect or any possible implementation manner of the first aspect is performed.
In an eighteenth aspect, a chip is provided. The chip is coupled to a memory and is configured to read and execute program instructions stored in the memory, so that the method described in the second aspect or any possible implementation manner of the second aspect is performed.
In a nineteenth aspect, a chip is provided. The chip is coupled to a memory and is configured to read and execute program instructions stored in the memory, so that the method described in the third aspect or any possible implementation manner of the third aspect is performed.
In a twentieth aspect, a chip is provided. The chip is coupled to a memory and is configured to read and execute program instructions stored in the memory, so that the method described in the fourth aspect or any possible implementation manner of the fourth aspect is performed.
In a twenty-first aspect, a computer program product is provided, including instructions that, when run on a computer, cause the method described in the first aspect or any possible implementation manner of the first aspect to be performed.
In a twenty-second aspect, a computer program product is provided, including instructions that, when run on a computer, cause the method described in the second aspect or any possible implementation manner of the second aspect to be performed.
In a twenty-third aspect, a computer program product is provided, including instructions that, when run on a computer, cause the method described in the third aspect or any possible implementation manner of the third aspect to be performed.
In a twenty-fourth aspect, a computer program product is provided, including instructions that, when run on a computer, cause the method described in the fourth aspect or any possible implementation manner of the fourth aspect to be performed.
应当理解的是,以上的一般描述和后文的细节描述仅是示例性和解释性的,并不能限制本公开。It is to be understood that the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the present disclosure.
附图说明Description of drawings
图1为将联邦学习应用到ML模型训练的示意图;Figure 1 is a schematic diagram of applying federated learning to ML model training;
图2为本申请实施例的一种应用场景示意图;FIG. 2 is a schematic diagram of an application scenario of an embodiment of the present application;
图3为本申请实施例的分离式接入网设备架构示意图;FIG. 3 is a schematic diagram of a device architecture of a separate access network according to an embodiment of the present application;
图4为本申请实施例提供的一种更新机器学习模型的方法的流程图;4 is a flowchart of a method for updating a machine learning model provided by an embodiment of the present application;
图5为本申请实施例提供的另一种更新机器学习模型的方法的流程图;5 is a flowchart of another method for updating a machine learning model provided by an embodiment of the present application;
图6为本申请实施例中的通信装置的结构示意图;FIG. 6 is a schematic structural diagram of a communication device in an embodiment of the present application;
图7为本申请实施例中的另一通信装置的结构示意图;7 is a schematic structural diagram of another communication device in an embodiment of the present application;
图8为本申请实施例中的另一通信装置的结构示意图;FIG. 8 is a schematic structural diagram of another communication device in an embodiment of the present application;
图9为本申请实施例中的另一通信装置的结构示意图;FIG. 9 is a schematic structural diagram of another communication device in an embodiment of the present application;
图10为本申请实施例中的另一通信装置的结构示意图;10 is a schematic structural diagram of another communication device in an embodiment of the present application;
图11为本申请实施例中的另一通信装置的结构示意图。FIG. 11 is a schematic structural diagram of another communication device in an embodiment of the present application.
具体实施方式detailed description
为了使本申请实施例的目的、技术方案和优点更加清楚,下面将结合附图对本申请实施例作进一步地详细描述。In order to make the objectives, technical solutions and advantages of the embodiments of the present application more clear, the embodiments of the present application will be described in further detail below with reference to the accompanying drawings.
以下,对本申请实施例中的部分用语进行解释说明,以便于本领域技术人员理解。Hereinafter, some terms in the embodiments of the present application will be explained, so as to facilitate the understanding of those skilled in the art.
(1)终端设备,包括向用户提供语音和/或数据连通性的设备,例如可以包括具有无线连接功能的手持式设备、或连接到无线调制解调器的处理设备。该终端设备可以经无线接入网(radio access network,RAN)与核心网进行通信,与RAN交换语音和/或数据。该终端设备可以包括用户设备(user equipment,UE)、终端、无线终端设备、移动终端设备、设备到设备通信(device-to-device,D2D)终端设备、车到一切(vehicle-to-everything,V2X)终端设备、机器到机器/机器类通信(machine-to-machine/machine-type communications,M2M/MTC)终端设备、物联网(internet of things,IoT)终端设备、订户单元(subscriber unit)、订户站(subscriber station),移动站(mobile station)、远程站(remote station)、接入点(access point,AP)、远程终端(remote terminal)、接入终端(access terminal)、用户终端(user terminal)、用户代理(user agent)、或用户装备(user device)等。例如,可以包括移动电话(或称为“蜂窝”电话),具有移动终端设备的计算机,便携式、袖珍式、手持式、计算机内置的移动装置等。例如,个人通信业务(personal communication service,PCS)电话、无绳电话、会话发起协议(session initiation protocol,SIP)话机、无线本地环路(wireless local loop,WLL)站、个人数字助理(personal digital assistant,PDA)、等 设备。还包括受限设备,例如功耗较低的设备,或存储能力有限的设备,或计算能力有限的设备等。例如包括条码、射频识别(radio frequency identification,RFID)、传感器、全球定位系统(global positioning system,GPS)、激光扫描器等信息传感设备。(1) Terminal devices, including devices that provide voice and/or data connectivity to users, may include, for example, handheld devices with wireless connectivity, or processing devices connected to wireless modems. The terminal equipment may communicate with the core network via a radio access network (RAN), and exchange voice and/or data with the RAN. The terminal equipment may include user equipment (UE), terminal, wireless terminal equipment, mobile terminal equipment, device-to-device (D2D) terminal equipment, vehicle-to-everything (vehicle-to-everything, V2X) terminal equipment, machine-to-machine/machine-type communications (M2M/MTC) terminal equipment, Internet of things (IoT) terminal equipment, subscriber unit (subscriber unit), Subscriber station (subscriber station), mobile station (mobile station), remote station (remote station), access point (access point, AP), remote terminal (remote terminal), access terminal (access terminal), user terminal (user terminal), user agent, or user device, etc. For example, these may include mobile telephones (or "cellular" telephones), computers with mobile terminal equipment, portable, pocket-sized, hand-held, computer-embedded mobile devices, and the like. For example, personal communication service (PCS) phones, cordless phones, session initiation protocol (SIP) phones, wireless local loop (WLL) stations, personal digital assistants (personal digital assistants), PDA), etc. Also includes constrained devices, such as devices with lower power consumption, or devices with limited storage capacity, or devices with limited computing power, etc. For example, it includes information sensing devices such as barcodes, radio frequency identification (RFID), sensors, global positioning system (GPS), and laser scanners.
As an example rather than a limitation, in the embodiments of this application the terminal device may also be a wearable device. Wearable devices, also called wearable smart devices or smart wearables, are a general term for wearable devices developed by applying wearable technology to the intelligent design of everyday wear, such as glasses, gloves, watches, clothing and shoes. A wearable device is a portable device that is worn directly on the body or integrated into the user's clothing or accessories. A wearable device is not merely a piece of hardware; it also provides powerful functions through software support, data interaction and cloud interaction. In a broad sense, wearable smart devices include devices that are fully functional, large in size and able to realize all or part of their functions without relying on a smartphone, such as smart watches or smart glasses, as well as devices that focus on a specific type of application function and need to work with another device such as a smartphone, for example various smart bracelets, smart helmets and smart jewelry for monitoring physical signs.
Any of the terminal devices described above, if located on a vehicle (for example, placed or installed in the vehicle), can be regarded as a vehicle-mounted terminal device. A vehicle-mounted terminal device is also called, for example, an on-board unit (OBU).
(2) Network device, for example including an access network (AN) device such as a base station (for example, an access point), which may refer to a device in the access network that communicates with wireless terminal devices over the air interface through one or more cells; or, for example, an access network device in V2X technology, namely a road side unit (RSU). The base station may be used to convert received air-interface frames into Internet Protocol (IP) packets and vice versa, acting as a router between the terminal device and the rest of the access network, where the rest of the access network may include an IP network. The RSU may be a fixed infrastructure entity supporting V2X applications and may exchange messages with other entities supporting V2X applications. The access network device also coordinates attribute management of the air interface. For example, the access network device may include an evolved NodeB (NodeB, eNB or e-NodeB, evolutional Node B) in a long term evolution (LTE) system or a long term evolution-advanced (LTE-A) system, or may include a next generation node B (gNB) and a next generation evolutional Node B (ng-eNB) in a new radio (NR) system of the fifth generation (5G) mobile communication technology, or may include a central unit (CU) and a distributed unit (DU) in a split access network system, which is not limited in the embodiments of this application.
当然网络设备还可以包括核心网设备,可以是接入和移动性管理功能(access and mobility management function,AMF),主要负责接入控制、移动性管理、附着与去附着以及网关选择等功能。核心网设备还可以是网络数据分析功能(network data analytics function,NWDAF),主要负责数据的收集、分析等功能。核心网设备还可以是其它设备。Of course, network equipment can also include core network equipment, which can be an access and mobility management function (AMF), which is mainly responsible for functions such as access control, mobility management, attachment and detachment, and gateway selection. The core network device may also be a network data analytics function (NWDAF), which is mainly responsible for functions such as data collection and analysis. The core network device may also be other devices.
(3) AI refers to technology that exhibits human-like intelligence through computer programs. It covers the theories, methods, techniques and application systems that use digital computers, or machines controlled by digital computers, to simulate, extend and expand human intelligence, perceive the environment, acquire knowledge, and use knowledge to obtain the best results. In other words, AI is a comprehensive branch of computer science that attempts to understand the essence of intelligence and to produce a new kind of intelligent machine that can respond in a way similar to human intelligence. Artificial intelligence studies the design principles and implementation methods of various intelligent machines, so that machines have the capabilities of perception, reasoning and decision-making.
人工智能技术是一门综合学科,涉及领域广泛,既有硬件层面的技术也有软件层面的技术。人工智能基础技术一般包括如传感器、专用人工智能芯片、云计算、分布式存储、大数据处理技术、操作/交互系统、机电一体化等技术。人工智能软件技术主要包括计算机视觉技术、语音处理技术、自然语言处理技术以及机器学习/深度学习等几大方向。Artificial intelligence technology is a comprehensive discipline, involving a wide range of fields, including both hardware-level technology and software-level technology. The basic technologies of artificial intelligence generally include technologies such as sensors, special artificial intelligence chips, cloud computing, distributed storage, big data processing technology, operation/interaction systems, and mechatronics. Artificial intelligence software technology mainly includes computer vision technology, speech processing technology, natural language processing technology, and machine learning/deep learning.
(4)ML,是一门多领域交叉学科,涉及概率论、统计学、逼近论、凸分析、算法复杂度理论等多门学科。专门研究计算机怎样模拟或实现人类的学习行为,以获取新的知识或技能,重新组织已有的知识结构使之不断改善自身的性能。机器学习是人工智能的核心,是使计算机具有智能的根本途径,其应用遍及人工智能的各个领域。机器学习和深度学习通常包括人工神经网络、置信网络、强化学习、迁移学习、归纳学习、式教学习等技术。(4) ML is a multi-field interdisciplinary subject involving probability theory, statistics, approximation theory, convex analysis, algorithm complexity theory and other subjects. It specializes in how computers simulate or realize human learning behaviors to acquire new knowledge or skills, and to reorganize existing knowledge structures to continuously improve their performance. Machine learning is the core of artificial intelligence and the fundamental way to make computers intelligent, and its applications are in all fields of artificial intelligence. Machine learning and deep learning usually include artificial neural networks, belief networks, reinforcement learning, transfer learning, inductive learning, teaching learning and other technologies.
The essence of machine learning is to let a computer simulate the above process, that is, to let the computer "learn" so as to acquire a certain kind of cognition and use that cognition to judge new things. This cognition can be embodied in a "machine learning model", which mathematically can be understood as a function.
(5)机器学习模型,本申请实施例中对其不加以区分人工智能和机器学习,可以将机器学习模型表示为ML模型或者AI模型。本申请实施例中的机器学习模型是泛指AI领域和ML领域中的AI模型和ML模型,举例来说,机器学习模型例如包括线性回归(linear regression)、逻辑回归(logistic regression)、决策树(decision tree)、朴素贝叶斯(naive bayes)、K-近邻(k-nearest neighbors)、支持向量机(support vector machines)、深度神经网络(deep neutral network)、随机森林(random forest)等。(5) A machine learning model. In the embodiments of this application, no distinction is made between artificial intelligence and machine learning, and the machine learning model may be represented as an ML model or an AI model. The machine learning model in the embodiments of the present application generally refers to AI models and ML models in the AI field and ML field. For example, the machine learning model includes, for example, linear regression, logistic regression, and decision tree. (decision tree), naive bayes, k-nearest neighbors, support vector machines, deep neural network, random forest, etc.
(6) Federated learning (FL) is an emerging basic artificial intelligence technology and an encrypted, distributed ML technique. Its design goal is to carry out efficient machine learning among multiple participants or computing nodes on the premise of ensuring information security during big data exchange, protecting terminal data and personal data privacy, and guaranteeing legal compliance. The machine learning algorithms usable in federated learning are not limited to neural networks and also include algorithms such as random forests. Federated learning is expected to become the basis of the next generation of collaborative artificial intelligence algorithms and collaborative networks.
联邦学习是在满足数据隐私、安全和监管要求的前提下,设计的一个机器学习框架,让人工智能系统能够更加高效、准确的共同使用各自的数据,满足用户的隐私保护和数据安全。联邦学习的特点包括:Federated learning is a machine learning framework designed on the premise of meeting data privacy, security and regulatory requirements, allowing artificial intelligence systems to use their own data more efficiently and accurately, to meet users' privacy protection and data security. Features of federated learning include:
数据隔离,数据不会泄露到外部,满足用户隐私保护和数据安全的需求;Data isolation, data will not be leaked to the outside, to meet the needs of user privacy protection and data security;
能够保证模型质量无损,不会出现负迁移,保证联邦模型比割裂的独立模型效果好;It can ensure that the quality of the model is not damaged, and there will be no negative migration, and that the federated model is better than the split independent model;
各参与者地位对等,能够实现公平合作;All participants have equal status and can achieve fair cooperation;
能够保证各参与方在保持独立性的情况下,进行信息与模型参数的加密交换,并同时获得成长。It can ensure that each participant can carry out encrypted exchange of information and model parameters while maintaining independence, and grow at the same time.
(7) "At least one" means one or more, and "a plurality of" means two or more. "And/or" describes an association relationship between associated objects and indicates that three relationships may exist; for example, "A and/or B" may indicate that only A exists, both A and B exist, or only B exists, where A and B may be singular or plural. The character "/" generally indicates an "or" relationship between the associated objects.
In addition, unless stated otherwise, ordinal numbers such as "first" and "second" mentioned in the embodiments of this application are used to distinguish multiple objects and are not intended to limit the order, timing, priority, or importance of those objects. For example, the first information and the second information are merely used to distinguish different signaling and do not indicate a difference in content, priority, transmission order, or importance between the two kinds of information.
The foregoing describes some concepts involved in the embodiments of this application; the technical features of the embodiments of this application are described below.
As described above, for the machine learning model in the related art, to avoid leakage of users' private data, the network device first distributes an initial machine learning model to each terminal device. Each terminal device generally trains the distributed machine learning model locally using its own data and then sends the model update parameters obtained after training to the network device. The network device aggregates the model update parameters sent by the participants (that is, the aforementioned terminal devices) and directly updates its local machine learning model to obtain an updated machine learning model. For example, the update may be performed in an FL manner. The main feature of FL is that each participant's data stays local and does not need to be uploaded to the network device, so data privacy is not disclosed and the network overhead required for uploading massive data is reduced.
In the related art, the procedure of using FL to update a machine learning model (for example, an ML model) in a wireless communication network is shown in Figure 1:
S11. The network device sends the initial ML model to the participant terminal devices 1 to N. For convenience of description, the ML model is denoted as y = W_0·x + b, where W_0 is the initial parameter of the ML model.
S12. Terminal devices 1 to N train (that is, update) the ML model based on their respective local training data sets, obtaining updated model parameters W_1^0 to W_N^0, or updated model parameter differences g_1 = W_1^0 − W_0, …, g_N = W_N^0 − W_0. The model parameter differences g_1, …, g_N are also called gradients.
S13. Terminal devices 1 to N send the updated model update parameters or gradients to the network device, for example, send W_1^0 to W_N^0, or g_1 to g_N, to the network device.
S14. After receiving the model update parameters sent by all participants (that is, terminal devices 1 to N), the network device performs a weighted average over the participants' model update parameters to obtain the aggregated ML model update parameter, and then updates its local ML model with the aggregated parameter. For example, the network device computes the updated ML model parameter W = (W_1^0·p_1 + W_2^0·p_2 + … + W_N^0·p_N)/N and replaces the original W_0 with W. Alternatively, the network device computes the aggregated parameter difference g = (g_1·p_1 + g_2·p_2 + … + g_N·p_N)/N, obtains the updated ML model parameter W = g + W_0, and replaces the original W_0 with W. Here p_1, p_2, …, p_N are weights: p_1 is the weight of W_1^0 or g_1, p_2 is the weight of W_2^0 or g_2, …, p_N is the weight of W_N^0 or g_N, and the weights sum to 1, that is, p_1 + p_2 + … + p_N = 1.
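As an illustration only, the following sketch walks through one round of the FL flow in S11–S14 for a linear model y = W·x + b. It assumes the weights p_i are normalized to sum to 1 (so the aggregate is a plain weighted average, dropping the additional division by N shown in the formula above); the function names and toy data are assumptions for the example, not part of this application.

```python
# Minimal sketch of one federated-learning round (S11-S14), assuming a linear
# model y = W·x + b and weights p_i normalized to sum to 1.
import numpy as np

def local_training(W0, b0, x, y, lr=0.01, epochs=5):
    """Terminal-side step S12: gradient-descent training on local data,
    returning the updated parameters and the difference (gradient) W_i^0 - W_0."""
    W, b = W0.copy(), b0
    for _ in range(epochs):
        pred = x @ W + b
        err = pred - y
        W -= lr * x.T @ err / len(y)
        b -= lr * err.mean()
    return W, b, W - W0

def aggregate(updates, weights):
    """Network-side step S14: weighted average of the participants' parameters."""
    return sum(p * W for W, p in zip(updates, weights))

# Toy example with N = 3 participants, each holding private local data (S11
# corresponds to every participant starting from the same W0, b0).
rng = np.random.default_rng(0)
W0, b0 = np.zeros(4), 0.0
local_params = []
for _ in range(3):
    x = rng.normal(size=(32, 4))
    y = x @ np.array([1.0, -2.0, 0.5, 3.0]) + 0.1 * rng.normal(size=32)
    W_i, b_i, g_i = local_training(W0, b0, x, y)
    local_params.append(W_i)              # S13: report W_i^0 (or the gradient g_i)

p = [1/3, 1/3, 1/3]                        # equal weights, summing to 1
W_new = aggregate(local_params, p)         # replaces the original W_0
```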
In the procedure shown in Figure 1, the data each of terminal devices 1 to N uses locally for ML model training differs, the capability of each terminal device to perform ML model training generally also differs, and during training the configuration information each terminal device uses is configured by the terminal device itself or manually by its user; in other words, the training configuration information of the terminal devices is configured independently of one another. As a result, the times at which the terminal devices finish ML model training generally also differ, and the differences can be large. To allow the network device to update the ML model as soon as possible, each terminal device generally reports its model update parameters promptly after obtaining them. Because the terminal devices report their model update parameters at different times, and the network device can update the model only after obtaining the model update parameters reported by all participants, it must wait for the last reported model update parameters before updating the model. By that time a long interval may have elapsed since the first model update parameters were received, which increases the time the network device spends on model updating, slows the convergence of the machine learning model update, and lowers the update efficiency.
By analyzing the related art, the inventors found that the main reason for the low efficiency of the network device's local update of the machine learning model is that the configuration information used by the terminal devices for training does not take the differences among terminal devices into account, for example differences in device capability and in training data; each terminal device configures its local training configuration information in complete isolation from the others, so the times at which the terminal devices finish local training differ considerably. In view of this, an embodiment of this application provides a method for updating a machine learning model, in which the network side uniformly configures the training configuration information (referred to as model training configuration information in the embodiments of this application) for each terminal device. Specifically, the network device allocates corresponding model training configuration information to each terminal device according to that terminal device's computing capability, so that the model training configuration information each terminal device uses to train the machine learning model pre-distributed by the network device matches its own computing capability. This minimizes the time differences among terminal devices during model training caused by capability differences, helps ensure that the terminal devices complete model training within roughly the same time, and makes the times at which they report their model update parameters roughly the same. Reducing the differences in reporting times allows the network device to complete the model update based on the reported model update parameters in as short a time as possible, improving the convergence speed of the model update and hence the update efficiency of the machine learning model.
The technical solutions provided in the embodiments of this application can be applied to fourth-generation (the 4th generation, 4G) mobile communication systems, for example the LTE system, to 5G systems, for example the NR system, or to next-generation mobile communication systems or other similar communication systems, which is not specifically limited.
The machine learning model trained and updated in the embodiments of this application may be a general-purpose model in the AI field, for example the aforementioned linear regression, logistic regression, decision tree, naive Bayes, k-nearest neighbors, support vector machine, deep neural network, or random forest, which is not limited in the embodiments of this application. In addition, the machine learning model trained in each terminal device participating in model training (terminal devices 1 to N in Figure 1) is uniformly distributed in advance by the network device; that is, the terminal devices train the same type of machine learning model, and the machine learning model that the network device updates using the model update parameters reported by the terminal devices is of the same type as the machine learning model trained in the terminal devices.
The following describes a network architecture to which the embodiments of this application are applied; refer to Figure 2.
Figure 2 is a schematic diagram of a system architecture provided by an embodiment of this application. As shown in Figure 2, the communication system includes a core network device, a first access network device, a second access network device, and a terminal device. The first access network device or the second access network device can communicate with the core network device; the terminal device can communicate with the first access network device or the second access network device, and can also communicate with both at the same time, that is, multi-radio dual connectivity (MR-DC). In an MR-DC scenario, the first access network device may be the master access network device and the second access network device the secondary access network device, or the second access network device may be the master and the first the secondary. The first access network device and the second access network device may use different communication standards or the same communication standard.
It can be understood that the communication system shown in Figure 2 is merely intended to describe the technical solutions of the embodiments of this application more clearly and does not constitute a limitation on them. For example, the communication system may further include other devices, such as a network control device. The network control device may be an operation, administration and maintenance (OAM) system, also called a network management system, and may manage the aforementioned first access network device, second access network device, and core network device.
In addition, persons of ordinary skill in the art will appreciate that, as the network architecture evolves and new service scenarios emerge, the technical solutions provided in the embodiments of this application are equally applicable to similar technical problems.
The core network device in Figure 2 may be an AMF or an NWDAF, but is not limited to these. The access network device in Figure 2, also called a radio access network (RAN) device, is a device that connects terminal devices to the wireless network and can provide functions such as radio resource management, quality-of-service management, and data encryption and compression for terminal devices. Exemplarily, the access network device may include the following:
gNB: provides NR control-plane and/or user-plane protocols and functions for terminal devices, and accesses the 5G core network (5th generation core, 5GC);
ng-eNB: provides evolved universal terrestrial radio access (E-UTRA) control-plane and/or user-plane protocols and functions for terminal devices, and accesses the 5G core network;
CU: mainly includes the radio resource control (RRC) layer, the service data adaptation protocol (SDAP) layer, and the packet data convergence protocol (PDCP) layer of a gNB, or the RRC layer and PDCP layer of an ng-eNB;
DU: mainly includes the radio link control (RLC) layer, the media access control (MAC) layer, and the physical layer of a gNB or ng-eNB;
Central unit–control plane (CU-CP): the control plane of the CU, mainly including the RRC layer of the gNB-CU or ng-eNB-CU and the control-plane part of the PDCP layer;
Central unit–user plane (CU-UP): the user plane of the CU, mainly including the SDAP layer of the gNB-CU or ng-eNB-CU and the user-plane part of the PDCP layer;
Data analysis and management (DAM): mainly responsible for functions such as data collection, ML model training, ML model generation, ML model updating, and ML model distribution.
Figure 3 is a schematic architecture diagram of a disaggregated access network device. The access network device is split by function into one CU and one or more DUs, where the CU and a DU are connected through an F1 interface. Further, one CU may include one CU-CP and one or more CU-UPs. The CU-CP and a CU-UP may be connected through an E1 interface, the CU-CP and a DU may be connected through the F1 control-plane interface (F1-C), and a CU-UP and a DU may be connected through the F1 user-plane interface (F1-U). Further, the CU, the DU, or the CU-CP may each be connected to the DAM through a G1 interface. Optionally, the DAM may be an internal function of the CU, the DU, or the CU-CP, in which case there is no G1 interface (or the G1 interface is an internal interface that is not visible externally).
To further describe the technical solutions provided in the embodiments of this application, a detailed description is given below with reference to the accompanying drawings and specific implementations. Although the embodiments of this application provide the method operation steps shown in the following embodiments or accompanying drawings, more or fewer operation steps may be included in the method as a matter of routine or without creative effort. For steps between which no necessary causal relationship logically exists, the execution order of those steps is not limited to that provided in the embodiments of this application. In actual processing, or when the method is executed by an apparatus, the steps may be performed sequentially or in parallel according to the order shown in the embodiments or the accompanying drawings.
The technical solutions provided by the embodiments of this application are described below with reference to the accompanying drawings.
An embodiment of this application provides a method for updating a machine learning model; refer to Figure 4, which is a flowchart of the method. In the following description, the method applied to the network architecture shown in Figure 2 is taken as an example; the network device in the following description may be the aforementioned access network device, core network device, or network control device. It should be noted that Figure 4 describes the technical solution of this application using one terminal device as an example; in a specific implementation, each of the other participants in model training can be understood according to the procedure shown in Figure 4.
S41. The network device acquires the computing capability of the terminal device.
The computing capability of the terminal device, also called computing power, can be understood as an indicator or evaluation of the speed at which the terminal device processes data, for example the output speed when the terminal device computes a hash function, and can be expressed, for example, in floating point operations per second (FLOPS). The computing capability of the terminal device is positively correlated with the speed of processing data: the greater the computing capability, the faster the data is processed, and generally the faster the model training. The computing capability of the terminal device is related to factors such as the hardware configuration and performance of the terminal device itself and the smoothness of its operating system.
In a specific implementation, the network device may acquire the computing capability of the terminal device in any of the following manners.
Manner 1
The terminal device actively reports its computing capability to the network device. In this manner, the terminal device may send first computing-power indication information indicating its computing capability to the network device; the first computing-power indication information may also include the identifier of the terminal device. After receiving the first computing-power indication information, the network device can determine the computing capability corresponding to the terminal device. In a possible implementation, the terminal device may report its computing capability to the network device through a UE assistance information message.
The terminal device may actively report its computing capability when registering with the network device, when receiving the initial machine learning model distributed by the network device, or at another moment, which is not limited in this embodiment of this application. Through active reporting by each terminal device, the network device can learn the computing power of each terminal device in advance, so that it can subsequently allocate the corresponding model training configuration information to each terminal device in a timely manner, improving allocation efficiency.
Manner 2
The terminal device reports its computing capability to the network device upon request. In this manner, when the computing capability of the terminal device needs to be acquired, the network device may send a computing-capability acquisition request to the terminal device, instructing it to report its computing capability. After receiving the request, the terminal device may send second computing-power indication information indicating its computing capability, and the network device obtains the computing capability of the terminal device upon receiving that information.
In a specific implementation, the network device sends a UE capability enquiry message to the terminal device to request its computing capability; the terminal device then sends a UE capability information message to the network device, which contains the computing capability of the terminal device.
In this manner, the network device requests the computing capability of a terminal device only when it needs it, so the computing capabilities of the terminal devices do not need to be stored locally in advance. This reduces storage consumption to a certain extent and allows the terminal devices' computing capabilities to be used effectively.
Manner 3
The network device obtains the computing capability of the terminal device from another network device. In this manner, the other network device may actively send the computing capability of the terminal device to the network device, or the network device may first send a request to the other network device, which then returns the computing capability of the terminal device based on that request, for example by indicating it through third computing-power indication information.
For example, if the network device is an access network device, it may obtain the computing capability of the terminal device from another access network device, a core network device, or a network control device; if the network device is a core network device, it may obtain it from an access network device or a network control device. Of course, this manner presupposes that the other network device itself stores, or is able to obtain, the computing capability of the terminal device.
This embodiment provides three manners of acquiring the computing capability of the terminal device, which improves the flexibility of acquisition.
S42. The network device determines, according to the computing capability of the terminal device, the model training configuration information corresponding to the terminal device.
The model training configuration information is the configuration information the terminal device needs to train its local machine learning model; in other words, it is used by the terminal device to perform model training on the local machine learning model.
As described above, the speed at which a terminal device processes data can be evaluated from its computing capability. To reduce the differences in the time the terminal devices take to train their local machine learning models, in this embodiment of this application the network device allocates corresponding model training configuration information to each terminal device according to its computing capability. Based on such an allocation mechanism, each terminal device can, as far as possible, complete model training within roughly the same duration. For example, a terminal device with weaker computing capability may be configured with less demanding model training configuration information, and a terminal device with stronger computing capability may be configured with relatively more demanding model training configuration information, so that the two finish model training at roughly the same time, reducing the differences among the terminal devices' completion times.
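As an illustration only (not a scheme mandated by this application), the following sketch shows one way a network device might derive per-device training configuration from reported computing capability: it scales the number of local training epochs so that the estimated local training time, computed from an assumed per-sample cost in floating-point operations, is roughly the same for every participant. The per-sample cost, the target duration, and the field names are all assumptions for the example.

```python
# Hypothetical allocation of model training configuration information based on
# each terminal device's reported computing capability (FLOPS). Assumes the
# network device knows (or estimates) the per-sample training cost of the
# distributed model and targets a common local-training duration per round.
FLOPS_PER_SAMPLE = 2e6        # assumed cost of one training sample (forward + backward)
TARGET_TRAIN_SECONDS = 30.0   # assumed target duration for local training

def training_config(device_flops: float, local_samples: int) -> dict:
    """Return a training configuration whose estimated run time on this device
    is close to TARGET_TRAIN_SECONDS."""
    time_per_epoch = local_samples * FLOPS_PER_SAMPLE / device_flops
    epochs = max(1, int(TARGET_TRAIN_SECONDS // time_per_epoch))
    return {"epochs": epochs, "batch_size": 32, "learning_rate": 0.01}

# A weaker device is assigned fewer local epochs than a stronger one, so both
# are expected to finish local training at roughly the same time.
print(training_config(device_flops=5e9,  local_samples=2000))
print(training_config(device_flops=50e9, local_samples=2000))
```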
S43. The network device sends the determined model training configuration information to the terminal device.
After determining the corresponding model training configuration information according to the computing capability of the terminal device, the network device sends it to the corresponding terminal device, so that the terminal device receives the model training configuration information sent by the network device.
S44. The terminal device trains the first machine learning model in the terminal device according to the model training configuration information to obtain model update parameters.
As described above, the machine learning model trained locally in each terminal device is distributed in advance by the network device. In the embodiments of this application, the terminal device's local machine learning model is called the first machine learning model and the network device's local machine learning model is called the second machine learning model, so the first machine learning model is distributed by the network device to the terminal device. The first machine learning model and the second machine learning model may be the same type of machine learning model or different types; to make it easier for the network device to perform a local model update based on the model update parameters reported by the terminal devices, the first machine learning model and the second machine learning model are the same type of machine learning model.
After receiving the model training configuration information sent by the network device, the terminal device can locally train the first machine learning model according to that configuration information, and obtains the corresponding model update parameters after training is completed. The model update parameters here are therefore the update parameters obtained after training the first machine learning model using the model training configuration information, that is, the model parameters of the trained first machine learning model.
In another possible implementation, in addition to sending the model training configuration information, the network device may also indicate to the terminal device the time at which local training using the configuration information should start, for example instructing the terminal device to start local training only at a specific moment, to start a predetermined duration after receiving the model training configuration information, or to start at some other moment (for example 15:00:00). In this way, the network device can more strictly control when each terminal device starts local training, ensuring that the start times are as consistent as possible and further reducing the differences in the times at which the terminal devices finish local training.
For the specific manner of model training, various training manners in the related art may be used, which is not limited in this embodiment of this application.
S45. The terminal device sends the obtained model update parameters to the network device.
It should be noted that a model update parameter in the embodiments of this application includes both the parameter itself and its corresponding parameter value. For example, if there are three model update parameters a, b, and c, then when the terminal device sends these three model update parameters to the network device it sends both the parameters a, b, and c and their parameter values, for example a = 1.5, b = 2.6, and c = 2.4.
The model update parameters each terminal device obtains by training locally on its own data, using the model training configuration that the network device configured according to that device's computing capability, differ in their parameter values, while the types of model update parameters obtained by the terminal devices are generally the same. For example, after terminal device 1, terminal device 2, and terminal device 3 each perform local training using the model training configuration information configured for them by the network device, all three obtain model update parameters a, b, and c. The values obtained by terminal device 1 are 1.3, 1.8, and 2.4; those obtained by terminal device 2 are 1.6, 1.4, and 2.8; and those obtained by terminal device 3 are 1.9, 1.3, and 2.7. It can be seen that the three terminal devices obtain the same types of model update parameters after local training, but the values corresponding to each parameter differ.
After obtaining the model update parameters corresponding to the model training configuration information, the terminal device sends them to the network device, and the network device can then receive them.
In the embodiments of this application, the model update parameters sent by the terminal device to the network device may be of several kinds, for example the specific updated model parameters W_i^0, or the updated model parameter differences g_i = W_i^0 − W_0, where i is any value from 1 to N and N is the total number of terminal devices for which the network device configures model training configuration information. W_i^0 thus denotes the model update parameters obtained by the i-th terminal device for which the network device configured model training configuration information, and g_i denotes the model update parameter difference obtained by that terminal device.
In a first implementation, the terminal device actively sends the model update parameters to the network device immediately after obtaining them, so that the network device can obtain them as soon as possible. In this manner, the network device can use the model training configuration information to make the time each terminal device spends on local training as consistent as possible, so that after each terminal device finishes local training and obtains its model update parameters it reports them promptly, thereby reducing the differences in the times at which the network device obtains the model update parameters sent by the terminal devices.
In a second implementation, the terminal device does not actively send the model update parameters immediately after obtaining them, but sends them only when a specific trigger condition is met, as illustrated by the following examples.
Scenario 1: the terminal device sends the obtained model update parameters to the network device only after receiving an acquisition request sent by the network device instructing it to do so. In other words, one possible trigger condition is that the terminal device receives the acquisition request sent by the network device; under this condition, the terminal device sends the model update parameters only when the network device actively requests them. In this implementation, the network device may first determine the point in time at which to request the model update parameters from the terminal device, that is, the point in time at which to send the aforementioned acquisition request, and then send the acquisition request at that time to actively request the model update parameters obtained by the terminal device. In this way, the network device can explicitly control when it requests the model update parameters from each terminal device. On top of reducing, through the model training configuration information, the differences in the times at which the terminal devices finish local training, this further reduces the differences in the times at which the terminal devices report their model update parameters, and hence the differences in the times at which the network device actually obtains them.
Scenario 2: the network device may directly indicate to the terminal device the point in time at which to report the model update parameters. Specifically, the network device may send the terminal device reporting-time information indicating when the terminal device should send the model update parameters; after receiving it, the terminal device sends the model update parameters at the indicated time. In other words, another possible trigger condition is that the time indicated by the reporting-time information sent by the network device arrives; under this condition, the terminal device reports the model update parameters at the scheduled time according to the network device's instruction. In this implementation, the network device can explicitly control the specific time at which each terminal device reports its model update parameters, which, on top of reducing the differences in local-training completion times through the model training configuration information, further reduces the differences in reporting times and hence in the times at which the network device actually obtains the model update parameters sent by the terminal devices.
Based on the model training configuration information corresponding to each terminal device, the network device can determine when each terminal device finishes its local training and, on that basis, directly determine the point in time in Scenario 1 above and the point in time indicated by the reporting-time information in Scenario 2. For example, if the network device determines that the terminal devices finish local training at around 16:05:00, it may send the acquisition request to them at 16:06:00 and may instruct them to send the model update parameters at 16:06:30. This not only reduces the differences in the times at which the terminal devices report their model update parameters, but also allows the parameters to be obtained as soon as possible, so that on top of improving the efficiency of the local model update, the update can be performed as early as possible, improving its timeliness. For Scenario 1 and Scenario 2 above, in another implementation, the network device may first determine the transmission duration each of most or all of the terminal devices participating in local training needs to send its model update parameters to the network device, and then determine, according to each terminal device's transmission duration, the acquisition time for obtaining that terminal device's model update parameters, that is, the aforementioned point in time for sending the acquisition request.
The transmission duration can be understood as the interval between the terminal device sending the model update parameters and the network device receiving them, which is related to the quality of the communication link between the terminal device and the network device. In a possible implementation, therefore, the network device may obtain the uplink transmission rate of the terminal device according to the channel quality indicator (CQI) sent by the terminal device, and then determine the terminal device's transmission duration from the data volume of the model update parameters and the uplink transmission rate. For example, if w denotes the uplink transmission rate of the terminal device and q denotes the data volume of its model update parameters, the corresponding transmission duration is T = q/w. In this way, the transmission duration of each terminal device can be determined, and the point in time for sending the acquisition request to each terminal device, or the point in time at which each terminal device should send its model update parameters, can then be determined according to the transmission durations of most (for example 80%) or all of the terminal devices.
In the above formula T = q/w, q denotes the data volume of the model update parameters the corresponding terminal device sends to the network device. Because the machine learning model trained locally in each terminal device is uniformly distributed in advance by the network device, the model update parameters obtained after training are known to the network device, and the network device therefore knows the data volume of each model update parameter. In one implementation, the terminal device transmits all model update parameters to the network device, and the network device, knowing the data volume of each model update parameter, can estimate the total data volume of all of them to obtain the aforementioned q. In another implementation, the network device may request specified types of model update parameters from the terminal device and, knowing the data volume of each specified model update parameter, estimate the total data volume of the specified parameters to obtain q.
In other words, the network device can actively request the model update parameters from each terminal device and send the acquisition request, or indicate the reporting time, to each terminal device at a point in time matched to that device. By taking into account the time most or even all participants need to transmit their model update parameters, the network device can control more precisely when each terminal device reports its model update parameters, reducing the differences in the times at which the terminal devices report them and hence in the times at which the network device obtains them, thereby improving the convergence speed of the local model update and the model update efficiency.
For example, if the transmission duration of terminal device 1 is 10 minutes, that of terminal device 2 is 15 minutes, that of terminal device 3 is 22 minutes, and that of terminal device 4 is 28 minutes, the network device may send the acquisition request to terminal device 4 at 10:00, to terminal device 3 at 10:06, to terminal device 2 at 10:13, and to terminal device 1 at 10:18. That is, the acquisition request may be sent earlier to terminal devices with longer transmission durations and later to those with shorter transmission durations, so that a terminal device with a longer transmission duration receives the request earlier and starts sending its model update parameters to the network device earlier. Similarly, a terminal device with a longer transmission duration may be instructed to send its model update parameters earlier, and one with a shorter transmission duration later. In this way, starting the transmission of the model update parameters earlier compensates for the longer transmission duration and reduces the differences in the times the terminal devices take to transmit them, so that the network device can receive the model update parameters of the terminal devices within the same (or as nearly the same as possible) time, reducing the differences in the times at which it receives them.
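The following sketch illustrates this scheduling idea under stated assumptions: the uplink rate is taken as given (for example derived elsewhere from CQI reports), T = q/w is computed per device, and request times are staggered so that all reports are expected to arrive by a common deadline. The function names, field names, and numbers are illustrative only; the data sizes and rates are chosen so the resulting schedule matches the 10/15/22/28-minute example above.

```python
# Hypothetical staggered scheduling of acquisition requests so that all model
# update parameters are expected to arrive at the same deadline. The uplink
# rate w (bytes/s) is assumed to be derived elsewhere (e.g. from CQI reports),
# and q is the known size of the requested model update parameters in bytes.
from datetime import datetime, timedelta

def request_schedule(devices: dict, deadline: datetime) -> dict:
    """devices maps device id -> (q_bytes, w_bytes_per_sec).
    Returns device id -> time at which to send the acquisition request."""
    schedule = {}
    for dev, (q, w) in devices.items():
        transmission = timedelta(seconds=q / w)   # T = q / w
        schedule[dev] = deadline - transmission   # longer T -> earlier request
    return schedule

devices = {
    "UE1": (6.0e6, 10_000),   # ~10 min at 10 kB/s -> request at 10:18
    "UE2": (9.0e6, 10_000),   # ~15 min            -> request at 10:13
    "UE3": (6.6e6, 5_000),    # ~22 min            -> request at 10:06
    "UE4": (8.4e6, 5_000),    # ~28 min            -> request at 10:00
}
deadline = datetime(2021, 6, 1, 10, 28)
for dev, t in sorted(request_schedule(devices, deadline).items()):
    print(dev, t.strftime("%H:%M"))
```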
In the embodiments of this application, generally speaking, when the terminal devices perform local model training in the FL manner, the types of model update parameters they obtain are the same, and the network device knows them, because the initial machine learning model each terminal device trains locally was distributed to it by the network device. Regarding the model update parameters obtained through local training with the model training configuration information sent by the network device, the terminal device may send parameter availability indication information to the network device, which indicates the availability of the model update parameters in the terminal device. After the terminal device has completed local training of the machine learning model, all of its model update parameters are available, so the parameter availability indication information can also indicate that the terminal device has completed training of its local machine learning model; in other words, the terminal device can use it to inform the network device of the event that it has finished local training. For example, the parameter availability indication information may be carried in any one of an RRC reestablishment complete message, an RRC reconfiguration complete message, an RRC resume complete message, an RRC setup complete message, a UE information response message, or a non-access stratum (NAS) message sent by the terminal device to the network device; that is, the terminal device may inform the network device of the availability of its model update parameters through any of these messages.
For the first implementation described above, that is, the manner in which the terminal device actively reports the model update parameters, the parameter availability indication information lets the network device know all the model update parameters available in the terminal device; by comparing them with the model update parameters the terminal device actually reports, the network device can determine whether the terminal device omitted some model update parameters or whether some were not fully obtained because of a transmission abnormality, improving the completeness and accuracy of the acquired model update parameters.
For the second implementation described above, based on the parameter availability indication information, when the network device requests the model update parameters from the terminal device through the acquisition request, it can request specified types of model update parameters according to its actual needs. Optionally, the acquisition request may therefore also indicate the specified model update parameters the network device needs to obtain; for example, model update parameters a, b, c, and d are all available in the terminal device, but the network device requests only a, b, and c. This reduces the model update parameters the terminal device transmits and the time it spends transmitting them, reduces ineffective transmission, saves network transmission resources, and reduces air-interface resource overhead. In another implementation, the specified model update parameters needed by the network device may be indicated to the terminal device in another manner, for example through a message other than the acquisition request.
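As a purely illustrative sketch (the structures and names here are assumptions, not the RRC, NAS, or any standardized encoding), the following shows the idea of the terminal device advertising which model update parameters are available and the network device requesting only the subset it needs:

```python
# Hypothetical exchange: the terminal device advertises available model update
# parameters (parameter availability indication), and the network device
# requests only the subset it needs, reducing the data the terminal transmits.
available = {"a": 1.5, "b": 2.6, "c": 2.4, "d": 0.9}   # held by the terminal device

def availability_indication(params: dict) -> list:
    """Terminal side: report only the names of the available parameters."""
    return sorted(params.keys())

def build_report(params: dict, requested: list) -> dict:
    """Terminal side: return only the parameters named in the acquisition request."""
    return {name: params[name] for name in requested if name in params}

indication = availability_indication(available)   # ["a", "b", "c", "d"]
requested = ["a", "b", "c"]                        # chosen by the network device
report = build_report(available, requested)       # {"a": 1.5, "b": 2.6, "c": 2.4}
print(indication, report)
```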
S46、网络设备根据模型更新参数对该网络设备中的机器学习模型进行更新。S46. The network device updates the machine learning model in the network device according to the model update parameter.
如前面图1介绍的那样,网络设备是向多个终端设备分别配置了对应的模型训练配置信息,所以网络设备除了接收到上述终端设备发送的模型更新参数之外,还可以接收其他终端设备发送的其它模型更新参数,并且通过计算能力为各个终端设备配置模型训练配置 信息的方式,可以尽量减少网络设备接收多个终端设备反馈的模型更新参数所需要时间的差异性。As described in Figure 1 above, the network device is configured with corresponding model training configuration information to multiple terminal devices, so the network device can receive the model update parameters sent by the above-mentioned terminal devices, as well as other terminal devices. other model update parameters, and by configuring the model training configuration information for each terminal device through the computing power, the difference in time required for the network device to receive model update parameters fed back by multiple terminal devices can be minimized.
For example, the network device configures first, second, and third model training configuration parameters for terminal device 1, terminal device 2, and terminal device 3 according to their respective computing capabilities. Terminal device 1 obtains first model update parameters after locally training the machine learning model in terminal device 1 according to the first model training configuration information, terminal device 2 obtains second model update parameters after locally training the machine learning model in terminal device 2 according to the second model training configuration information, and terminal device 3 obtains third model update parameters after locally training the machine learning model in terminal device 3 according to the third model training configuration information. Since the model training configuration information used by each terminal device for local training is allocated by the network device according to each device's computing capability, the times at which terminal device 1, terminal device 2, and terminal device 3 complete local training can be roughly the same. Further, each terminal device sends its model update parameters to the network device as soon as its training is completed, so the network device can receive the model update parameters sent by the terminal devices at roughly the same time, thereby reducing the difference in the time required for the network device to obtain the model update parameters of the multiple terminal devices.
Further, after obtaining the model update parameters fed back by each participant (that is, each terminal device to which the network device sent model training configuration information), the network device may aggregate all of the model update parameters according to the method described in the embodiment corresponding to FIG. 1, and then update the local machine learning model of the network device with the aggregated model update parameters, that is, perform a local update of the machine learning model in the network device. Because configuring the model training parameters according to computing capability reduces the difference in the time required for the network device to receive the model update parameters of the multiple terminal devices, the network device can converge quickly when updating its local machine learning model, which improves model update efficiency.
It should be noted that the network device uses the obtained model update parameters to update the parameters of its local machine learning model, while the training data remains local to each terminal device; the terminal devices do not need to transmit training data to the network device. Therefore, the FL approach can ensure user data security and avoid leakage of user privacy. In addition, the amount of data required to transmit the model update parameters is generally far smaller than the training data, which greatly reduces network transmission overhead and saves network transmission resources.
As described above, the model training configuration information is information used by the terminal device to train its local machine learning model; it can be understood as information that instructs the terminal device how to perform local model training. The model training configuration information in this embodiment of the present application may include one type of information or a combination of multiple types of information. For example, the model training configuration information includes one of hyperparameters, accuracy, and training time information, or a combination thereof. In this embodiment of the present application, the model training configuration information configured by the network device for the terminal device is information routinely used by terminal devices for model training, which can generally meet the configuration requirements of most terminal devices for local model training and therefore has good generality.
For ease of understanding, the following describes the cases in which the model training configuration information includes different types of information.
First case
The model training configuration information is a hyperparameter. That is, the network device selects an appropriate hyperparameter for the terminal device according to the computing capability of the terminal device and then sends the selected hyperparameter to the terminal device; the terminal device locally trains its local machine learning model according to the hyperparameter and then feeds back the obtained model update parameters to the network device.
A machine learning model involves two basic concepts: parameters and hyperparameters. Parameters are variables obtained by the model through learning, such as the weight w and the bias b. Hyperparameters are set based on experience and affect the values of the model parameters (such as the weight w and the bias b). A hyperparameter is a parameter whose value is set before the learning process starts, rather than parameter data obtained through model training. In plain terms, a hyperparameter is also a kind of parameter: it has the characteristics of a parameter, but it is not obtained through learning; for example, a user may specify its value based on prior experience. In other words, hyperparameters are parameters that can influence the model parameters, so the values of the hyperparameters directly affect the training result. Therefore, setting appropriate hyperparameters for a terminal device according to its computing capability makes it possible to control, as far as possible, the time the terminal device spends on local training.
The hyperparameters in the embodiments of the present application may include at least one of a learning rate, a batch size, a number of iterations, and a number of training epochs, that is, one or more of the foregoing specific hyperparameters. Specifically:
Learning rate: when the machine learning model is updated, for example many random decision trees may be generated, each with a different weight, and the learning rate determines the magnitude of the weight update. If the learning rate is too small, the machine learning model converges slowly and requires a longer training time. For example, the network device sets a certain threshold (referred to as, for example, the first threshold) according to the computing capability required by the machine learning model. If the computing capability of the terminal device is greater than or equal to the first threshold, a smaller learning rate, such as 0.0001, is selected for the terminal device; if the computing capability of the terminal device is less than the first threshold, a larger learning rate, such as 0.01, is selected for the terminal device.
Batch size: refers to the number of samples fed into the machine learning model in each training pass, that is, the number of samples required for one training pass. For example, if 100 samples are used in one pass, the batch size is 100. The network device sets a certain threshold (referred to as, for example, the second threshold) according to the computing capability required by the machine learning model. If the computing capability of the terminal device is greater than or equal to the second threshold, a larger batch size, such as 128, is selected for the terminal device; if the computing capability of the terminal device is less than the second threshold, a smaller batch size, such as 16, is selected for the terminal device.
Number of iterations: refers to the number of batches of training data input into the machine learning model for training. Implementation 1: the network device sets a certain threshold (referred to as, for example, the third threshold) according to the computing capability required by the machine learning model. If the computing capability of the terminal device is greater than or equal to the third threshold, a larger number of iterations, such as 10000, is selected for the terminal device; if the computing capability of the terminal device is less than the third threshold, a smaller number of iterations, such as 1000, is selected. Implementation 2: the network device calculates the number of iterations for the terminal device according to the computing capability required by the machine learning model and the computing capability of the terminal device. For example, if the computing capability required for one iteration of the machine learning model is M and the computing capability of the terminal device is P, the number of iterations of the terminal device is N = P/M.
Number of training epochs: refers to the number of rounds in which the entire training set is input into the machine learning model for training. Implementation 1: the network device sets a certain threshold (referred to as, for example, the fourth threshold) according to the computing capability required by the machine learning model. If the computing capability of the terminal device is greater than or equal to the fourth threshold, a larger number of epochs, such as 10, is selected for the terminal device; if the computing capability of the terminal device is less than the fourth threshold, a smaller number of epochs, such as 5, is selected. Implementation 2: the network device calculates the number of epochs for the terminal device according to the computing capability required by the machine learning model and the computing capability of the terminal device. For example, if the computing capability required for one epoch of the machine learning model is M and the computing capability of the terminal device is P, the number of epochs of the terminal device is N = P/M.
As an example of the relationship among the batch size, the number of iterations, and the number of training epochs: assuming that the entire training set has 1000 samples, the number of training epochs is 10, and the batch size is 20, the number of iterations is 10 × (1000/20) = 500.
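To make the above concrete, the following is a minimal sketch, under the assumption of a single abstract capability score and placeholder threshold values (the function and variable names are illustrative only, not defined by this application), of how a network device might derive the hyperparameters described in this case:

```python
def select_hyperparameters(device_capability, capability_threshold, num_samples):
    """Illustrative hyperparameter selection based on an abstract capability score.

    capability_threshold stands in for the first to fourth thresholds mentioned
    in the text; a real implementation could use a separate threshold per
    hyperparameter.
    """
    strong = device_capability >= capability_threshold

    learning_rate = 0.0001 if strong else 0.01
    batch_size = 128 if strong else 16
    epochs = 10 if strong else 5

    # Iterations follow from epochs, dataset size and batch size
    # (e.g. 10 * (1000 / 20) = 500 in the worked example above).
    iterations = epochs * (num_samples // batch_size)

    return {
        "learning_rate": learning_rate,
        "batch_size": batch_size,
        "epochs": epochs,
        "iterations": iterations,
    }

# Example: a device whose capability meets the threshold and holds 1000 local samples.
print(select_hyperparameters(device_capability=8, capability_threshold=5, num_samples=1000))
```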
Because the computing capabilities of the terminal devices differ, and hyperparameters are a fairly basic requirement for model training, the different values of the different types of hyperparameters can all be quantified into a corresponding training time. For example, for local training with 1000 samples, 10 training epochs, and a batch size of 20, terminal device 1 may need roughly 2 minutes to complete the training given its computing capability, whereas terminal device 2 may need only about 1.5 minutes given its computing capability. That is, for the foregoing hyperparameters with specific values, the training time can be quantified as 2 minutes for terminal device 1 and 1.5 minutes for terminal device 2. Therefore, by configuring hyperparameters in combination with the terminal device's own computing capability, the time at which the terminal device completes local training can be determined fairly precisely, so the training time of each terminal device participating in local training can be better controlled, thereby reducing the difference in the times at which the terminal devices complete local training.
Second case
The model training configuration information is the accuracy required when training the machine learning model. That is, the network device selects an appropriate accuracy for the terminal device according to the computing capability of the terminal device and then informs the terminal device of the selected accuracy; the terminal device locally trains its local machine learning model according to the accuracy required by the network device and then feeds back the obtained model update parameters to the network device. The accuracy required when training the machine learning model refers to the difference between the actual predicted output of the machine learning model and the true output of the samples.
In this embodiment of the present application, the accuracy may include at least one of an error rate, a correct rate, a precision rate, and a recall rate. Specifically:
Error rate: refers to the proportion of incorrectly classified (or incorrectly predicted) samples to the total number of samples, based on the updated machine learning model (that is, the trained machine learning model). The network device sets a certain threshold (referred to as, for example, the fifth threshold) according to the computing capability required by the machine learning model. If the computing capability of the terminal device is greater than or equal to the fifth threshold, a smaller error rate is selected for the terminal device; if the computing capability of the terminal device is less than the fifth threshold, a larger error rate is selected for the terminal device.
Correct rate: refers to the proportion of correctly classified (or correctly predicted) samples to the total number of samples, based on the updated machine learning model (that is, the trained machine learning model). The network device sets a certain threshold (referred to as, for example, the sixth threshold) according to the computing capability required by the machine learning model. If the computing capability of the terminal device is greater than or equal to the sixth threshold, a larger correct rate is selected for the terminal device; if the computing capability of the terminal device is less than the sixth threshold, a smaller correct rate is selected for the terminal device. For example, for a neural network, the machine learning model produces a probability prediction for each test sample. In this case, the correct rate may be the Top-1 accuracy, that is, the proportion of cases in which the top-ranked category of the probability prediction matches the actual result; or the correct rate may be the Top-5 accuracy, that is, the proportion of cases in which the top five categories of the probability prediction contain the actual result.
Precision rate: refers to how many of the samples predicted to be positive, based on the updated machine learning model, are truly positive samples. The network device sets a certain threshold (referred to as, for example, the seventh threshold) according to the computing capability required by the machine learning model. If the computing capability of the terminal device is greater than or equal to the seventh threshold, a larger precision rate is selected for the terminal device; if the computing capability of the terminal device is less than the seventh threshold, a smaller precision rate is selected for the terminal device.
Recall rate: refers to how many of the positive samples are predicted correctly, based on the updated machine learning model. The network device sets a certain threshold (referred to as, for example, the eighth threshold) according to the computing capability required by the machine learning model. If the computing capability of the terminal device is greater than or equal to the eighth threshold, a higher recall rate is selected for the terminal device; if the computing capability of the terminal device is less than the eighth threshold, a lower recall rate is selected for the terminal device.
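As an illustration of how these four quantities relate, the following sketch (not part of this application) computes them for a simple binary-classification setting, assuming label lists of 0/1 values:

```python
def classification_metrics(y_true, y_pred):
    """Error rate, correct rate, precision and recall for binary labels (0/1)."""
    total = len(y_true)
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)

    return {
        "correct_rate": correct / total,                   # correct predictions / all samples
        "error_rate": 1 - correct / total,                 # incorrect predictions / all samples
        "precision": tp / (tp + fp) if tp + fp else 0.0,   # true positives among predicted positives
        "recall": tp / (tp + fn) if tp + fn else 0.0,      # true positives among actual positives
    }

print(classification_metrics([1, 0, 1, 1, 0], [1, 0, 0, 1, 1]))
```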
Third case
The model training configuration information is the training time information required when training the machine learning model. The training time information indicates the time used for training; for example, it indicates a training start time and a training end time, or a training start time and a training duration, or a training end time and a training duration, or only a training duration (for example, 5 minutes or 10 minutes). That is, the network device selects appropriate training time information for the terminal device according to the computing capability of the terminal device and then informs the terminal device of the selected training time information; the terminal device locally trains its local machine learning model according to the training time information required by the network device and then feeds back the obtained model update parameters to the network device.
In a specific implementation, for the multiple or all terminal devices that need to participate in local training, the network device may calculate, according to the computing capability of each terminal device, the time each terminal device needs to complete local training, and then take the maximum of the training times required by most (for example, 90%) or all of the terminal devices as the training time of each participant. In this way, a relatively long training time is configured for each terminal device, so that each terminal device can complete local training within the specified training time as far as possible, and the time spent by most terminal devices to complete local training is approximately the same. This reduces the difference in the times at which the terminal devices complete local training, thereby reducing the difference in the times at which the network device obtains the model update parameters fed back by the terminal devices. A sketch of this selection is given below.
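The following is a minimal sketch, assuming a hypothetical per-device estimate of local training time, of how the common training time described above might be chosen; the 90% coverage fraction follows the example in the text:

```python
def select_common_training_time(estimated_times, coverage=0.9):
    """Pick one training time for all participants.

    estimated_times: per-device estimates (e.g. in seconds) of local training
    time, derived from each device's computing capability.
    coverage: fraction of devices that should be able to finish in time
    (0.9 follows the 90% example in the text; 1.0 covers all devices).
    """
    ordered = sorted(estimated_times)
    # Number of (fastest) devices that the configured time must cover.
    covered = max(1, int(coverage * len(ordered)))
    return ordered[covered - 1]

# Example: estimated local training times in seconds for five devices.
times = [95, 110, 120, 130, 300]
print(select_common_training_time(times, coverage=0.9))   # 130
print(select_common_training_time(times, coverage=1.0))   # 300
```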
Fourth case
The model training configuration information is the hyperparameters and accuracy required when training the machine learning model. In a specific implementation, the terminal device may not be able to strictly satisfy both the hyperparameters and the accuracy set by the network device at the same time during local training. For example, if the hyperparameters configured by the network device are a batch size of 50 and 10 training epochs, and the configured accuracy is a correct rate of 96%, then when the terminal device trains for 10 epochs with a batch size of 50, the accuracy may still not reach 96%. In this case, one approach is for the terminal device to configure other hyperparameters itself to get as close as possible to the 96% accuracy requirement; for example, in addition to the batch size of 50 and 10 epochs configured by the network device, it may also configure a larger learning rate and a larger number of iterations for local training, so as to meet the accuracy requirement of the network device as far as possible. Another approach is to find a balance between the two types of model training configuration information configured by the network side; for example, on the basis of the hyperparameters and accuracy configured by the network device, some hyperparameter values may be appropriately increased and the accuracy requirement appropriately relaxed, but the adjustment should not be too large, so that the training requirements of the network device are met as far as possible.
In other words, when performing local training, the terminal device does not have to strictly satisfy the requirements of the model training configuration information configured by the network device. In a possible implementation, the terminal device may make small, appropriate adjustments to the model training configuration information configured by the network device, so that the requirements of multiple types of model training configuration information can be satisfied at the same time as far as possible. In this way, a better training result can be obtained without affecting the network device's ability to limit, through the model training configuration information, the difference in the times at which the terminal devices complete local training.
Fifth case
The model training configuration information is the hyperparameters, accuracy, and training time information required when training the machine learning model.
That is, the network device may configure three (or more) types of model training configuration information for the terminal device at the same time. In this case, when performing local training, the terminal device may proceed in the manner described for the fourth case, that is, it does not have to strictly follow the configuration of the network device but may make small adjustments to one or more types of model training configuration information, so as to achieve a better training result while keeping the change to the training time as small as possible.
In another implementation, considering that the purpose of configuring multiple types of model training configuration information according to the computing capabilities of the terminal devices is to reduce the difference in the times at which the terminal devices participating in local training complete that training, the training time information configured by the network device for each terminal device is the most direct means of reducing this time difference. Therefore, when the multiple types of model training configuration information include training time information, the terminal device may give priority to keeping the training time information unchanged and slightly adjust only the other types of model training configuration information in order to achieve a better training result, or it may keep the training time information unchanged and perform local training strictly according to the other types of model training configuration information. In other words, when training time information is included among the multiple types of model training configuration information, the training time information has the highest priority and its corresponding time requirement is kept unchanged, so that the training time requirement of the network device for each terminal device is satisfied as far as possible, thereby better reducing the difference in the local training times of the terminal devices.
The foregoing describes embodiments in which the network device allocates corresponding model training configuration information to each terminal device participating in local training according to the computing capability of the terminal device. On the basis of the foregoing embodiments, the network device and the terminal devices may also perform further interactions, so as to achieve a better training result and obtain more accurate model update parameters, and at the same time better reduce the difference in the times at which the terminal devices complete local training, thereby reducing the difference in the times at which the network device obtains the model update parameters fed back by the terminal devices. This allows the network device to converge quickly when updating its local machine learning model, improving the efficiency with which the network device updates the machine learning model.
In an implementation, on the basis that the network device has configured the model training configuration information for the terminal device, the network device may further send training feature information to the terminal device, where the training feature information indicates the training feature set to be used by the terminal device for local training. The training feature information includes, for example, one or more of: a channel quality indicator (CQI), channel state information reference signal (CSI-RS) measurement results, synchronization signal and physical broadcast channel block (SSB) measurement results, and packet delay. After receiving the training feature information, the terminal device may locally train its local machine learning model using the samples of the corresponding training features, for example, using samples from the SSB measurement results. By sending the training feature information to the terminal devices, the network device enables all participants to perform local training using the same training features, thereby reducing the difference in the time spent by the terminal devices on local training that would arise from training on different training features.
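As a purely illustrative sketch (the measurement and field names below are assumptions, not message fields defined in this application), the training feature information could be applied on the terminal side roughly as follows:

```python
# Hypothetical representation of training feature information and its use on the
# terminal side; the measurement names are illustrative only.
TRAINING_FEATURES = ["cqi", "csi_rs_measurement", "ssb_measurement", "packet_delay"]

def build_training_set(local_measurements, selected_features):
    """Keep only the feature columns indicated by the network device."""
    return [
        {name: sample[name] for name in selected_features if name in sample}
        for sample in local_measurements
    ]

local_measurements = [
    {"cqi": 12, "csi_rs_measurement": -95.0, "ssb_measurement": -88.5, "packet_delay": 23.0},
    {"cqi": 9,  "csi_rs_measurement": -101.3, "ssb_measurement": -92.1, "packet_delay": 41.0},
]

# The network device indicates that only SSB measurements and packet delay are used.
print(build_training_set(local_measurements, ["ssb_measurement", "packet_delay"]))
```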
In an implementation, on the basis that the network device has configured the model training configuration information for the terminal device, the network device may further send accuracy evaluation information to the terminal device. The accuracy evaluation information is used by the terminal device to evaluate the accuracy of the locally trained machine learning model, and includes at least one of a method for evaluating accuracy or test samples for evaluating accuracy. The method for evaluating accuracy may be any one of the hold-out method, cross-validation, bootstrapping, or other methods.
In the hold-out method, the samples are divided into two mutually exclusive sets, one of which is used as the training samples of the machine learning model and the other as the test samples. After the machine learning model is trained with the training samples, it is tested with the test samples.
In cross-validation, the samples are divided into k mutually exclusive subsets of similar size. Each time, the union of k−1 subsets is used as the training samples and the remaining subset is used as the test samples, so that k rounds of training and testing are performed, and the mean of the k test results is finally returned.
In bootstrapping, given a data set D of m samples, a sample is randomly selected from D each time and copied into a data set E, and the sample is then put back into the initial data set D. This process is repeated m times to obtain a data set E containing m samples. The samples in data set E are used as training samples, and the samples in data set D that do not appear in data set E are used as test samples.
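The following is a minimal sketch, not part of this application, of the three sample-splitting methods just described, using plain Python lists:

```python
import random

def hold_out(samples, test_fraction=0.3):
    """Split samples into two mutually exclusive training/test sets."""
    shuffled = random.sample(samples, len(samples))
    cut = int(len(shuffled) * (1 - test_fraction))
    return shuffled[:cut], shuffled[cut:]

def k_fold_splits(samples, k):
    """Yield (train, test) pairs: each fold serves as the test set once."""
    folds = [samples[i::k] for i in range(k)]
    for i in range(k):
        train = [s for j, fold in enumerate(folds) if j != i for s in fold]
        yield train, folds[i]

def bootstrap(samples):
    """Sample with replacement; samples never drawn become the test set."""
    train = [random.choice(samples) for _ in range(len(samples))]
    test = [s for s in samples if s not in train]
    return train, test

data = list(range(10))
print(hold_out(data))
print(next(iter(k_fold_splits(data, k=5))))
print(bootstrap(data))
```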
By specifying the accuracy evaluation information to the terminal devices, all participants can use the same accuracy evaluation information to evaluate the accuracy of the locally trained machine learning model. Since the same accuracy evaluation method is used, each terminal device can be made to meet the specified accuracy requirement under the same evaluation standard as far as possible, thereby reducing the difference in the time spent by the terminal devices on local training.
In this embodiment of the present application, after completing local training, the terminal device needs to feed back the obtained model update parameters to the network device. On this basis, the terminal device may also send accuracy indication information to the network device. The accuracy indication information indicates the accuracy achieved by the machine learning model after the terminal device performed local training using the model training configuration information configured by the network device. That is, in addition to feeding back the model update parameters, the terminal device may also feed back the corresponding training accuracy to the network device. In this way, the network device learns the training result of the terminal device and can use it as a reference when subsequently configuring model training configuration information for that terminal device. For example, if the accuracy indication information sent by the terminal device indicates that the accuracy achieved by training is poor, the network device may, when next selecting model training configuration information for that terminal device, make targeted adjustments on the basis of the previous model training configuration information.
Further, the role that the model update parameters fed back by a terminal device play in the local model update can be determined according to the assessment of its training result. For example, the accuracy indication information fed back by terminal device 1 indicates that the accuracy of its local training is 97%, while the accuracy indication information fed back by terminal device 2 indicates that the accuracy of its local training is 85%. The accuracy of terminal device 1's local training is therefore higher than that of terminal device 2; in other words, the training result of terminal device 1 should be better than that of terminal device 2. Accordingly, when the network device performs the local update using the model update parameters fed back by the terminal devices, it may give a larger weight to the model update parameters fed back by terminal device 1 and a relatively smaller weight to those fed back by terminal device 2. In this way, the effectiveness and accuracy of the local model update performed by the network device can be improved. A sketch of such accuracy-weighted aggregation follows.
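The following is a minimal sketch, under the assumption that each model update is a flat list of parameter values and that the weights are simply proportional to the reported accuracies (one of many possible weighting rules, not mandated by this application):

```python
def accuracy_weighted_aggregate(updates, accuracies):
    """Aggregate per-device model updates, weighting by reported training accuracy.

    updates:    list of equal-length parameter vectors, one per terminal device
    accuracies: reported local training accuracies, e.g. [0.97, 0.85]
    """
    total = sum(accuracies)
    weights = [a / total for a in accuracies]
    num_params = len(updates[0])
    return [
        sum(w * update[i] for w, update in zip(weights, updates))
        for i in range(num_params)
    ]

# Terminal device 1 (accuracy 97%) and terminal device 2 (accuracy 85%).
update_1 = [0.10, -0.20, 0.05]
update_2 = [0.30,  0.10, 0.00]
print(accuracy_weighted_aggregate([update_1, update_2], [0.97, 0.85]))
```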
The foregoing describes the manner in which the network device configures model training configuration information for each terminal device according to its computing capability for local training. In this manner, the model training configuration information is configured according to the different computing capabilities of the terminal devices, so that each terminal device can complete the training of its local machine learning model within the same (or approximately the same) time as far as possible. This reduces the difference in the times at which the terminal devices send their model update parameters to the network device and therefore the difference in the times at which the network device receives them, so that the network device can use the model update parameters fed back by the terminal devices to perform a local update of its local machine learning model within a short time, which improves the convergence speed of the local update and therefore the update efficiency of the machine learning model.
An embodiment of the present application further provides another method for updating a machine learning model. In this method, the network device actively requests the model update parameters from each terminal device, and the time point of the request is determined by the network device according to the amount of data each terminal device actually needs to transmit and the condition of the transmission link (that is, the quality of the transmission link). Specifically, the network device selects the time point at which to request model update parameters from each terminal device according to the time each terminal device needs to send its model update parameters to the network device (this duration is referred to as, for example, the transmission duration), and requests the model update parameters from the terminal devices at different time points, differentiated according to the differences among their transmission durations. In this way, the difference among the times at which the network device receives the model update parameters sent by the terminal devices, caused by the differing transmission durations, is minimized, so that the model update parameters sent by the terminal devices reach the network device at the same time (or within approximately the same short period of time) as far as possible. This reduces the difference among the times at which the network device obtains the model update parameters of the terminal devices, so that the network device can locally update its local machine learning model according to the model update parameters of the terminal devices within a short time, improving the convergence speed of the local update and therefore the update efficiency of the machine learning model.
For ease of understanding, another method for updating a machine learning model provided by an embodiment of the present application is described below with reference to FIG. 5. In the description of FIG. 5, a first terminal device is used as an example, where the first terminal device is any one of the multiple terminal devices participating in FL.
S51: The first terminal device sends parameter availability indication information to the network device.
As described above, the parameter availability indication information may indicate the availability of the model update parameters in the first terminal device. Since all of the model update parameters of a terminal device are available after it completes the local training of the machine learning model, the parameter availability indication information can also indicate that the terminal device has completed the training of the local machine learning model; in other words, the first terminal device may use the parameter availability indication information to inform the network device of the event that it has completed the local training of the machine learning model. In a specific implementation, the parameter availability indication information may be carried in any one of the following messages sent by the first terminal device to the network device: an RRC reestablishment complete message, an RRC reconfiguration complete message, an RRC resume complete message, an RRC setup complete message, a UE information response message, or a NAS message. That is, the first terminal device may use any of the foregoing messages to inform the network device of the availability of the model update parameters in the first terminal device.
In a specific implementation, S51 is not a mandatory step, so S51 is shown with a dashed line in FIG. 5. That is, the first terminal device may send the parameter availability indication information to the network device, or it may not; this is not limited in the embodiments of the present application.
S52: The network device determines the transmission duration for each of the multiple terminal devices to send its model update parameters to the network device.
The multiple terminal devices may or may not include the first terminal device. Each of the multiple terminal devices is a terminal device to which the network device has distributed the initial machine learning model in advance, and the multiple terminal devices constitute most (for example, 80%) or the vast majority (for example, 95%) of all the terminal devices participating in the local training of the machine learning model distributed by the network device.
S53: The network device selects the time point for obtaining the model update parameters of the first terminal device according to the transmission duration for each terminal device to send its model update parameters to the network device.
In this embodiment of the present application, before the network device requests the model update parameters from the terminal devices (including the first terminal device), each terminal device has already completed the training of its local machine learning model and obtained its corresponding model update parameters. Therefore, the network device can determine, according to the transmission duration for each terminal device to send its model update parameters to the network device, the time point for obtaining the model update parameters of each terminal device; the selected time point is referred to as, for example, the acquisition time. The acquisition time may be the time at which the network device sends an acquisition request to the first terminal device to request the model update parameters, that is, the network device may send the acquisition request to the first terminal device at the acquisition time; or the acquisition time may be a time, indicated by the network device to the first terminal device, at which the first terminal device is to send the model update parameters to the network device, that is, the first terminal device may send its local model update parameters to the network device at the acquisition time.
The transmission duration can be understood as the interval between a terminal device sending its model update parameters and the network device receiving those model update parameters, which is related to the quality of the communication link between that terminal device and the network device. Therefore, in a possible implementation, the network device may obtain the uplink transmission rate of the terminal device according to the CQI sent by the terminal device, and then determine the transmission duration corresponding to the terminal device according to the data volume of the model update parameters and the uplink transmission rate. For example, if w denotes the uplink transmission rate of the terminal device and q denotes the data volume of the model update parameters of the terminal device, the transmission duration corresponding to the terminal device is T = q/w. Using this method, the transmission duration corresponding to each terminal device can be determined, and the time point for obtaining the model update parameters of a terminal device can then be determined according to the transmission durations corresponding to most (for example, 80%) or all of the terminal devices.
In the foregoing formula for the transmission duration, T = q/w, q denotes the amount of data the corresponding terminal device sends to the network device as model update parameters. Since the machine learning model locally trained in each terminal device was uniformly distributed in advance by the network device, the model update parameters obtained after training are known to the network device, and the network device can therefore know the data volume of each model update parameter. In one implementation, the terminal device transmits all of its model update parameters to the network device, so the network device, knowing the data volume of each model update parameter, can estimate the total data volume of all model update parameters and obtain the foregoing q. In another implementation, the network device may request model update parameters of specified types from the terminal device; knowing the data volume of each specified model update parameter, the network device can estimate the total data volume of all specified model update parameters and obtain the foregoing q.
For example, if the transmission duration corresponding to terminal device 1 is 10 minutes, the transmission duration corresponding to terminal device 2 is 15 minutes, the transmission duration corresponding to terminal device 3 is 22 minutes, and the transmission duration corresponding to terminal device 4 is 28 minutes, the network device may instruct terminal device 4 to send its model update parameters at 13:02, instruct terminal device 3 to send its model update parameters at 13:08, instruct terminal device 2 to send its model update parameters at 13:15, and instruct terminal device 1 to send its model update parameters at 13:20. That is, the longer a terminal device's transmission duration, the earlier it is instructed to send its model update parameters, and the shorter the transmission duration, the later it sends them; likewise, the acquisition request is sent earlier to terminal devices with longer transmission durations and later to those with shorter ones. In this way, longer transmission durations are compensated for by starting the transmission of model update parameters earlier, and the difference among the times at which the terminal devices' model update parameters reach the network device is reduced, so that the network device can receive the model update parameters of the terminal devices within the same (or as nearly the same as possible) time, thereby reducing the difference among the times at which the network device receives them.
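The following is a minimal sketch of this scheduling idea, assuming transmission durations estimated as T = q/w and a common target arrival time; the concrete values mirror the example above and are illustrative only:

```python
from datetime import datetime, timedelta

def schedule_request_times(transmission_minutes, target_arrival):
    """Start each device earlier by its own transmission duration so that all
    model update parameters arrive at roughly the same target time."""
    return {
        device: target_arrival - timedelta(minutes=minutes)
        for device, minutes in transmission_minutes.items()
    }

# Transmission durations T = q / w for four terminal devices, in minutes.
durations = {"terminal_1": 10, "terminal_2": 15, "terminal_3": 22, "terminal_4": 28}
target = datetime(2021, 1, 1, 13, 30)

for device, start in schedule_request_times(durations, target).items():
    print(device, start.strftime("%H:%M"))
# terminal_1 13:20, terminal_2 13:15, terminal_3 13:08, terminal_4 13:02
```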
In this way, the time required by most or even all participants to transmit their model update parameters is taken into account, and the time at which each terminal device reports its model update parameters can be controlled more precisely, thereby reducing the difference among the times at which the network device obtains the model update parameters sent by the terminal devices, which in turn improves the convergence speed of the local model update and the model update efficiency.
S54: At the selected time point, the network device sends an acquisition request to the first terminal device.
Following the method described in S52 and S53, the network device determines the time point for obtaining the model update parameters of the first terminal device and may send an acquisition request to the first terminal device at the determined time point, where the acquisition request instructs the first terminal device to send the model update parameters in the first terminal device to the network device.
In a specific implementation, if the network device does not explicitly indicate to the first terminal device which model update parameters need to be obtained, then, by a default convention between the network device and the first terminal device, the first terminal device may send all of the model update parameters it has obtained to the network device. In another possible implementation, the network device may select the needed subset of model update parameters from those available in the first terminal device; in this case, the acquisition request may further indicate the model update parameters specified by the network device, meaning that the network device only needs the model update parameters indicated in the request. The first terminal device then only needs to feed back the specified model update parameters requested by the network device, which reduces the amount of data transmitted and lowers the network transmission overhead.
S55: The first terminal device determines, according to the acquisition request, the model update parameters that need to be sent to the network device.
In the two cases described above, the model update parameters that the first terminal device determines, according to the acquisition request, to be needed by the network device may be all of the model update parameters in the first terminal device or a subset of them.
S56: The first terminal device sends the determined model update parameters to the network device.
The foregoing S51 to S56 (S51 may be omitted) show an embodiment in which the network device determines a time point according to the transmission durations of the terminal devices and sends an acquisition request to the first terminal device at the determined time point to request the model update parameters from the first terminal device. In this embodiment, the network device can explicitly control the time at which it requests the model update parameters from each terminal device. On the basis of reducing, through the model training configuration information, the difference in the times at which the terminal devices complete local training, the difference in the times at which the terminal devices report their model update parameters can be further reduced, thereby reducing the difference in the times at which the network device actually obtains the model update parameters sent by the terminal devices.
S57: The network device sends reporting time information to the first terminal device.
The reporting time information instructs the first terminal device to send the model update parameters in the first terminal device to the network device at the acquisition time determined by the network device. In other words, the network device may explicitly indicate to each terminal device the specific time at which that terminal device is to report its model update parameters to the network device.
S58: The first terminal device sends the model update parameters to the network device at the acquisition time indicated by the reporting time information.
As indicated by the reporting time information, when the acquisition time indicated by the reporting time information arrives, the first terminal device sends the model update parameters obtained through local training to the network device.
The foregoing S51, S52, S53, S57, and S58 (S51 may be omitted) show an embodiment in which the network device determines a time point according to the transmission durations of the terminal devices and instructs the first terminal device to report its model update parameters to the network device at that time point. In this embodiment, the network device can explicitly control the specific time at which each terminal device reports its model update parameters. On the basis of reducing, through the model training configuration information, the difference in the times at which the terminal devices complete local training, the difference in the times at which the terminal devices report their model update parameters can be further reduced, thereby reducing the difference in the times at which the network device actually obtains the model update parameters sent by the terminal devices. It should be noted that, in a specific implementation, either the procedure shown in S51 to S56 or the procedure shown in S51, S52, S53, S57, and S58 may be implemented; this is not limited in the embodiments of the present application. FIG. 5 takes the implementation of the procedure corresponding to S51 to S56 as an example, so the steps corresponding to S57 and S58 are shown with dashed lines in FIG. 5, indicating that they may not be performed.
S59:网络设备根据第一终端设备发送的模型更新参数对本地的机器学习模型进行更新。S59: The network device updates the local machine learning model according to the model update parameter sent by the first terminal device.
上述只是以第一终端设备为例介绍了网络设备获取一个终端设备中的模型更新参数的实施方式,按照前述介绍的方法,网络设备可以获得其它终端设备中的模型更新参数,并且由于是根据各个终端设备传输模型更新参数的传输时长差异化地向各个终端设备发送获取请求的,所以网络设备可以在近乎相同的时间接收到各个终端设备发送的各自的模型更新参数,减少了网络设备接收各个终端设备发送的模型更新参数的时间差异性。进一步地,网络设备可以在短时间内利用所有终端设备的模型更新参数对本地的机器学习模型进行本地更新,使得更新可以快速收敛,从而提高机器学习模型的更新效率。The above only takes the first terminal device as an example to introduce the implementation manner in which the network device obtains the model update parameters in one terminal device. According to the method described above, the network device can obtain the model update parameters in other terminal devices, and because it is based on each The transmission duration of the terminal device transmission model update parameters is differentiated and the acquisition request is sent to each terminal device, so the network device can receive the respective model update parameters sent by each terminal device at almost the same time, reducing the network device receiving each terminal device. Time variance of model update parameters sent by the device. Further, the network device can use the model update parameters of all terminal devices to locally update the local machine learning model in a short period of time, so that the update can be quickly converged, thereby improving the update efficiency of the machine learning model.
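As an illustrative sketch of the aggregation step in S59 (the patent does not prescribe a particular aggregation rule; the weighting by local sample count and all names below are assumptions), the network device could combine the collected parameters as follows:

```python
import numpy as np

# Illustrative aggregation step. This sketch assumes each terminal reports its
# updated parameter vector plus its number of local training samples, and that
# the network device combines them by sample-weighted averaging (in the style
# of federated averaging).

def aggregate_updates(reports):
    """reports: list of (updated_params: np.ndarray, num_samples: int)."""
    total = sum(n for _, n in reports)
    return sum(params * (n / total) for params, n in reports)


reports = [(np.array([1.0, 0.0, 2.0, 1.0]), 50),
           (np.array([3.0, 1.0, 0.0, 1.0]), 150)]
new_second_model_params = aggregate_updates(reports)
print(new_second_model_params)  # approx. [2.5, 0.75, 0.5, 1.0]
```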
In all of the foregoing embodiments of the method for updating a machine learning model, for example in both the embodiment of FIG. 4 and the embodiment of FIG. 5, the terminal device and the network device may exchange the relevant information on the basis of the existing protocol stack; for example, the relevant information may be carried in RRC messages between the terminal device and an access network device, or in NAS messages between the terminal device and a core network device.
In addition, when the network device is an access network device under the CU-DU architecture, information exchanged between the terminal device and the CU may be forwarded by the DU: the terminal device first sends the information intended for the CU to the DU, and the DU then forwards it to the CU over the F1 interface between the DU and the CU. Information exchanged between the terminal device and the DU may be sent directly: the terminal device sends information intended for the DU directly to the DU, and the DU sends information intended for the terminal device directly to the terminal device.
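A minimal sketch of this forwarding rule, assuming a simplified message format rather than the actual RRC/F1AP encodings (the class and field names are invented for illustration):

```python
# Hypothetical DU relay under the CU-DU split described above. "F1" here is
# just a placeholder transport object; real F1 signalling is more involved.

class FakeF1Link:
    def send(self, msg):
        print("forwarded to CU over F1:", msg["payload"])


class DU:
    def __init__(self, f1_link_to_cu):
        self.f1 = f1_link_to_cu

    def on_uplink_message(self, msg):
        if msg["destination"] == "CU":
            # Terminal-to-CU traffic is relayed over the F1 interface.
            self.f1.send(msg)
        else:
            # Terminal-to-DU traffic is consumed locally.
            print("DU handles:", msg["payload"])


du = DU(FakeF1Link())
du.on_uplink_message({"destination": "CU", "payload": "model update parameters"})
du.on_uplink_message({"destination": "DU", "payload": "model update parameters"})
```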
Taking FIG. 4 as an example: if the network device in FIG. 4 is the CU, then S41, S42, and S46 are performed by the CU, S44 is performed by the terminal device, the model training configuration information in S43 is first sent by the CU to the DU and then forwarded by the DU to the terminal device, and the model update parameters in S45 are first sent by the terminal device to the DU and then forwarded by the DU to the CU. If instead the network device in FIG. 4 is the DU, then S41, S42, and S46 are performed by the DU, S44 is performed by the terminal device, the model training configuration information in S43 is sent by the DU directly to the terminal device, and the model update parameters in S45 are sent by the terminal device directly to the DU.
Taking FIG. 5 as another example: if the network device in FIG. 5 is the CU, then S52, S53, and S59 are performed by the CU and S55 is performed by the terminal device; the parameter availability indication information in S51, the model update parameters to be sent in S56, and the model update parameters sent in S58 at the time point indicated by the reporting time information are first sent by the terminal device to the DU and then forwarded by the DU to the CU, while the acquisition request in S54 and the reporting time information in S57 are first sent by the CU to the DU and then forwarded by the DU to the terminal device. If instead the network device in FIG. 5 is the DU, then S52, S53, and S59 are performed by the DU and S55 is performed by the terminal device; the parameter availability indication information in S51, the model update parameters to be sent in S56, and the model update parameters sent in S58 at the time point indicated by the reporting time information are sent by the terminal device directly to the DU, while the acquisition request in S54 and the reporting time information in S57 are sent by the DU directly to the terminal device.
When the network device in FIG. 4 or FIG. 5 is a CU or a DU as in the examples above, specific embodiments of the steps performed by the CU and the DU can be found in the foregoing descriptions of the embodiments of FIG. 4 and FIG. 5 and are not repeated here.
Based on the same inventive concept, an embodiment of this application provides a communication apparatus, which may be a network device or a chip provided inside a network device. The communication apparatus is capable of implementing the functions of the network device in the embodiments shown in FIG. 4 and FIG. 5; for example, it includes modules, units, or means corresponding to the steps performed by the network device in those embodiments, and these functions, units, or means may be implemented by software, by hardware, or by hardware executing corresponding software. For example, as shown in FIG. 6, the communication apparatus in this embodiment of this application includes a processing unit 601 and a communication unit 602, where:
the processing unit 601 is configured to determine, according to the computing capability of a terminal device, model training configuration information corresponding to the terminal device;
the communication unit 602 is configured to send the model training configuration information to the terminal device and to receive model update parameters sent by the terminal device, where the model update parameters are parameters updated by the terminal device after training the first machine learning model according to the model training configuration information; and
the processing unit 601 is further configured to update the second machine learning model according to the model update parameters.
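A minimal structural sketch of these two units, assuming an in-memory message queue and invented capability thresholds (none of the class names, fields, or values below come from the disclosure):

```python
# Hypothetical sketch of the network-side apparatus of FIG. 6.

class ProcessingUnit:
    def make_training_config(self, compute_capability_gflops):
        # Weaker devices get a lighter workload so that all devices finish
        # local training at roughly the same time.
        if compute_capability_gflops < 10:
            return {"batch_size": 16, "epochs": 1, "target_accuracy": 0.85}
        return {"batch_size": 64, "epochs": 3, "target_accuracy": 0.90}

    def apply_update(self, current_params, update_params):
        # Placeholder update rule; a real device could aggregate several
        # terminals' updates (see the earlier aggregation sketch).
        return update_params


class CommunicationUnit:
    def __init__(self):
        self.outbox, self.inbox = [], []

    def send(self, device_id, message):
        self.outbox.append((device_id, message))

    def receive(self):
        return self.inbox.pop(0) if self.inbox else None


proc, comm = ProcessingUnit(), CommunicationUnit()
comm.send("ue1", {"type": "train_config",
                  "config": proc.make_training_config(compute_capability_gflops=8)})
comm.inbox.append(("ue1", {"type": "model_update", "params": [0.1, 0.2]}))
device_id, msg = comm.receive()
new_params = proc.apply_update(current_params=[0.0, 0.0], update_params=msg["params"])
print(device_id, new_params)
```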
In a possible implementation, the communication unit 602 is further configured to:
receive first computing power indication information from the terminal device, where the first computing power indication information indicates the computing capability of the terminal device; or
after sending a computing capability acquisition request to the terminal device, receive second computing power indication information from the terminal device, where the second computing power indication information indicates the computing capability of the terminal device; or
receive third computing power indication information from another network device, where the third computing power indication information indicates the computing capability of the terminal device.
In a possible implementation, the model training configuration information includes at least one of hyperparameters, accuracy, or training time information.
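For illustration, such configuration information could be carried in a structure like the following; the concrete field names and values are assumptions, and only at least one of the three categories is required:

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical container for the model training configuration information.

@dataclass
class ModelTrainingConfig:
    # Hyperparameters
    learning_rate: Optional[float] = None
    batch_size: Optional[int] = None
    local_epochs: Optional[int] = None
    # Accuracy the trained model is expected to reach
    target_accuracy: Optional[float] = None
    # Training time information, e.g. a deadline for finishing local training
    training_deadline_s: Optional[float] = None


cfg = ModelTrainingConfig(learning_rate=0.01, batch_size=32, local_epochs=2,
                          target_accuracy=0.9, training_deadline_s=30.0)
print(cfg)
```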
In a possible implementation, the communication unit 602 is further configured to send training feature information to the terminal device, where the training feature information indicates the training feature set to be used by the terminal device to train the first machine learning model.
In a possible implementation, the communication unit 602 is further configured to send accuracy evaluation information to the terminal device, where the accuracy evaluation information includes at least one of a method for evaluating accuracy or test samples for evaluating accuracy.
In a possible implementation, the communication unit 602 is further configured to receive accuracy indication information from the terminal device, where the accuracy indication information indicates the accuracy reached by the terminal device after training the first machine learning model with the model training configuration information.
In a possible implementation, the processing unit 601 is further configured to determine a time point for obtaining the model update parameters of the terminal device; correspondingly, the communication unit 602 is further configured to:
send an acquisition request to the terminal device at the aforementioned time point, where the acquisition request instructs the terminal device to send its model update parameters to the network device; or
send reporting time information to the terminal device, where the reporting time information indicates that the model update parameters are to be sent to the network device at the aforementioned time point.
In a possible implementation, the processing unit 601 is specifically configured to determine the transmission duration each of multiple terminal devices needs to send its model update parameters to the network device, and to determine the aforementioned time point according to the transmission durations corresponding to the terminal devices.
In a possible implementation, the acquisition request further indicates that specified model update parameters need to be obtained.
In a possible implementation, the communication unit 602 is further configured to receive parameter availability indication information from the terminal device, where the parameter availability indication information indicates the availability of the model update parameters in the terminal device.
All relevant content of the steps in the foregoing method embodiments can be incorporated by reference into the functional descriptions of the corresponding functional modules, and is not repeated here.
Based on the same inventive concept, an embodiment of this application provides a communication apparatus, which may be a terminal device or a chip provided inside a terminal device. The communication apparatus is capable of implementing the functions of the terminal device in the embodiment shown in FIG. 4, or of the first terminal device in the embodiment shown in FIG. 5; for example, it includes modules, units, or means corresponding to the steps performed by the terminal device or the first terminal device in the embodiments shown in FIG. 4 and FIG. 5, and these functions, units, or means may be implemented by software, by hardware, or by hardware executing corresponding software. For example, as shown in FIG. 7, the communication apparatus in this embodiment of this application includes a communication unit 701 and a processing unit 702, where:
the communication unit 701 is configured to receive model training configuration information sent by a network device, where the model training configuration information is determined according to the computing capability of the terminal device;
the processing unit 702 is configured to train the first machine learning model according to the model training configuration information to obtain model update parameters; and
the communication unit 701 is further configured to send the model update parameters to the network device, where the model update parameters are used by the network device to update the second machine learning model.
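A minimal sketch of this terminal-side flow, assuming (purely for illustration) a linear model trained by gradient descent under the received configuration; the model, loss, and config fields are not taken from the disclosure:

```python
import numpy as np

# Illustrative terminal-side step: train the first ML model locally under the
# received configuration and return the updated parameters.

def local_train(initial_params, features, labels, config):
    w = initial_params.copy()
    lr = config.get("learning_rate", 0.01)
    for _ in range(config.get("local_epochs", 1)):
        preds = features @ w
        grad = features.T @ (preds - labels) / len(labels)  # MSE gradient
        w -= lr * grad
    return w  # these are the "model update parameters" reported uplink


rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
true_w = np.array([1.0, -2.0, 0.5])
y = X @ true_w
config = {"learning_rate": 0.1, "local_epochs": 50}
updated = local_train(np.zeros(3), X, y, config)
print(np.round(updated, 2))  # approaches [1.0, -2.0, 0.5]
```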
In a possible implementation, the communication unit 701 is further configured to receive a computing capability acquisition request sent by the network device, and to send second computing power indication information to the network device according to the computing capability acquisition request, where the second computing power indication information indicates the computing capability of the terminal device.
In a possible implementation, the communication unit 701 is further configured to receive training feature information from the network device, where the training feature information indicates the training feature set to be used by the terminal device to train the first machine learning model; correspondingly, the processing unit 702 is further configured to train the first machine learning model according to the model training configuration information and the training feature information.
In a possible implementation, the communication unit 701 is further configured to receive accuracy evaluation information from the network device, where the accuracy evaluation information includes at least one of a method for evaluating accuracy or test samples for evaluating accuracy; correspondingly, the processing unit 702 is further configured to determine, according to the accuracy evaluation information, the accuracy reached by the trained first machine learning model.
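For example (a sketch under assumptions: classification accuracy on the indicated test samples is only one possible evaluation method, and the sample format is invented), the terminal could compute the accuracy to be reported as follows:

```python
import numpy as np

# Illustrative accuracy evaluation using network-provided test samples.
# Here "accuracy" is the fraction of correctly classified samples.

def evaluate_accuracy(predict_fn, test_samples, test_labels):
    predictions = np.array([predict_fn(x) for x in test_samples])
    return float(np.mean(predictions == np.array(test_labels)))


def predict(x):
    # Toy model: classify by the sign of the first feature.
    return int(x[0] > 0)


samples = [np.array([0.3, 1.0]), np.array([-0.2, 0.5]), np.array([1.5, -1.0])]
labels = [1, 0, 0]
accuracy = evaluate_accuracy(predict, samples, labels)
print(accuracy)  # about 0.67 — this value would be reported as accuracy indication
```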
In a possible implementation, the communication unit 701 is configured to send accuracy indication information to the network device, where the accuracy indication information indicates the accuracy reached by the terminal device after training the first machine learning model with the model training configuration information.
In a possible implementation, the communication unit 701 is further configured to receive an acquisition request from the network device and to send the model update parameters to the network device according to the acquisition request, where the acquisition request instructs the terminal device to send its model update parameters to the network device.
In a possible implementation, the communication unit 701 is further configured to receive reporting time information from the network device and to send the model update parameters to the network device at the time point indicated by the reporting time information.
In a possible implementation, the communication unit 701 is further configured to send parameter availability indication information to the network device, where the parameter availability indication information indicates the availability of the model update parameters in the terminal device.
All relevant content of the steps in the foregoing method embodiments can be incorporated by reference into the functional descriptions of the corresponding functional modules, and is not repeated here.
Based on the same inventive concept, an embodiment of this application provides a communication apparatus, which may be a network device or a chip provided inside a network device. The communication apparatus is capable of implementing the functions of the network device in the embodiments shown in FIG. 4 and FIG. 5; for example, it includes modules, units, or means corresponding to the steps performed by the network device in those embodiments, and these functions, units, or means may be implemented by software, by hardware, or by hardware executing corresponding software. For example, as shown in FIG. 8, the communication apparatus in this embodiment of this application includes a processing unit 801 and a communication unit 802, where:
the processing unit 801 is configured to select, according to the transmission duration each of multiple terminal devices needs to send its model update parameters to the network device, a time point for obtaining the update parameters obtained by the first terminal device;
the communication unit 802 is configured to send an acquisition request to the first terminal device at the aforementioned time point and to receive the model update parameters sent by the first terminal device, where the acquisition request requests the first terminal device to send its model update parameters to the network device; or is configured to send reporting time information to the first terminal device and to receive the model update parameters sent by the first terminal device to the network device, where the reporting time information instructs the first terminal device to send the model update parameters to the network device at the aforementioned time point; and
the processing unit 801 is further configured to update the second machine learning model according to the model update parameters.
In a possible implementation, the communication unit 802 is further configured to receive parameter availability indication information from the first terminal device.
In a possible implementation, the communication unit 802 is further configured to receive, from the network device, indication information indicating the specified model update parameters.
In a possible implementation, the indication information is carried in the acquisition request.
All relevant content of the steps in the foregoing method embodiments can be incorporated by reference into the functional descriptions of the corresponding functional modules, and is not repeated here.
Based on the same inventive concept, an embodiment of this application provides a communication apparatus, which may be a terminal device or a chip provided inside a terminal device. The communication apparatus is capable of implementing the functions of the terminal device in the embodiment shown in FIG. 4, or of the first terminal device in the embodiment shown in FIG. 5; for example, it includes modules, units, or means corresponding to the steps performed by the terminal device or the first terminal device in the embodiments shown in FIG. 4 and FIG. 5, and these functions, units, or means may be implemented by software, by hardware, or by hardware executing corresponding software. For example, as shown in FIG. 9, the communication apparatus in this embodiment of this application includes a communication unit 901 and a processing unit 902, where:
the communication unit 901 is configured to receive an acquisition request sent by a network device, where the time point at which the acquisition request is sent is determined by the network device according to the transmission duration each of multiple terminal devices needs to send its model update parameters to the network device; or is configured to receive reporting time information sent by the network device, where the reporting time information indicates that the model update parameters are to be sent to the network device at the time point;
the processing unit 902 is configured to determine, according to the acquisition request, the model update parameters to be sent; and
the communication unit 901 is further configured to send the determined model update parameters to the network device, or to send the model update parameters to the network device at the time point indicated by the reporting time information, where the model update parameters are used by the network device to update the second machine learning model.
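To make the "specified model update parameters" option concrete, the following sketch assumes a hypothetical request field listing parameter names; the disclosure does not define a message format:

```python
# Hypothetical handling of an acquisition request that names specific model
# update parameters. The message layout ("requested" listing parameter names)
# is an assumption for illustration only.

def build_report(local_updates, acquisition_request):
    requested = acquisition_request.get("requested")
    if requested is None:
        return dict(local_updates)  # no restriction: report everything available
    # Report only the parameters the network device explicitly asked for.
    return {name: local_updates[name] for name in requested if name in local_updates}


local_updates = {"layer1.weight": [0.1, 0.2], "layer1.bias": [0.05],
                 "layer2.weight": [0.3]}
request = {"type": "acquisition_request",
           "requested": ["layer1.weight", "layer2.weight"]}
print(build_report(local_updates, request))
# {'layer1.weight': [0.1, 0.2], 'layer2.weight': [0.3]}
```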
In a possible implementation, the communication unit 901 is further configured to send parameter availability indication information to the network device.
In a possible implementation, the communication unit 901 is further configured to receive, from the network device, indication information indicating the specified model update parameters.
In a possible implementation, the indication information is carried in the acquisition request.
All relevant content of the steps in the foregoing method embodiments can be incorporated by reference into the functional descriptions of the corresponding functional modules, and is not repeated here.
Based on the same inventive concept, referring to FIG. 10, an embodiment of this application further provides a communication apparatus, including:
at least one processor 1001, and a communication interface 1003 communicatively connected to the at least one processor 1001, where the at least one processor 1001 executes instructions stored in a memory 1002 so that the communication apparatus performs, through the communication interface 1003, the method steps performed by the network device in the embodiments shown in FIG. 4 and FIG. 5.
Optionally, the memory 1002 is located outside the communication apparatus.
Optionally, the communication apparatus includes the memory 1002, the memory 1002 is connected to the at least one processor 1001, and the memory 1002 stores instructions executable by the at least one processor 1001. The dashed lines in FIG. 10 indicate that the memory 1002 is optional for the communication apparatus.
The at least one processor 1001 and the memory 1002 may be coupled through an interface circuit or may be integrated together; this is not limited here.
The specific connection medium among the processor 1001, the memory 1002, and the communication interface 1003 is not limited in the embodiments of this application. In FIG. 10, the processor 1001, the memory 1002, and the communication interface 1003 are connected by a bus 1004, which is drawn as a thick line; the connections between other components are shown only schematically and are not limiting. The bus may be divided into an address bus, a data bus, a control bus, and so on. For ease of representation, only one thick line is drawn in FIG. 10, but this does not mean that there is only one bus or only one type of bus.
Based on the same inventive concept, referring to FIG. 11, an embodiment of this application further provides a communication apparatus, including:
at least one processor 1101, and a communication interface 1103 communicatively connected to the at least one processor 1101, where the at least one processor 1101 executes instructions stored in a memory 1102 so that the communication apparatus performs, through the communication interface 1103, the method steps performed by the terminal device in the embodiment shown in FIG. 4, or the method steps performed by the first terminal device in the embodiment shown in FIG. 5.
Optionally, the memory 1102 is located outside the communication apparatus.
Optionally, the communication apparatus includes the memory 1102, the memory 1102 is connected to the at least one processor 1101, and the memory 1102 stores instructions executable by the at least one processor 1101. The dashed lines in FIG. 11 indicate that the memory 1102 is optional for the communication apparatus.
The processor 1101 and the memory 1102 may be coupled through an interface circuit or may be integrated together; this is not limited here.
The specific connection medium among the processor 1101, the memory 1102, and the communication interface 1103 is not limited in the embodiments of this application. In FIG. 11, the processor 1101, the memory 1102, and the communication interface 1103 are connected by a bus 1104, which is drawn as a thick line; the connections between other components are shown only schematically and are not limiting. The bus may be divided into an address bus, a data bus, a control bus, and so on. For ease of representation, only one thick line is drawn in FIG. 11, but this does not mean that there is only one bus or only one type of bus.
It should be understood that the processor mentioned in the embodiments of this application may be implemented by hardware or by software. When implemented by hardware, the processor may be a logic circuit, an integrated circuit, or the like. When implemented by software, the processor may be a general-purpose processor that operates by reading software code stored in a memory.
For example, the processor may be a central processing unit (CPU), another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field programmable gate array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor.
It should be understood that the memory mentioned in the embodiments of this application may be a volatile memory or a non-volatile memory, or may include both volatile and non-volatile memory. The non-volatile memory may be a read-only memory (ROM), a programmable read-only memory (PROM), an erasable programmable read-only memory (EPROM), an electrically erasable programmable read-only memory (EEPROM), or a flash memory. The volatile memory may be a random access memory (RAM), which is used as an external cache. By way of example and not limitation, many forms of RAM are available, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), synchlink DRAM (SLDRAM), and direct Rambus RAM (DR RAM).
It should be noted that when the processor is a general-purpose processor, a DSP, an ASIC, an FPGA or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component, the memory (storage module) may be integrated in the processor.
It should be noted that the memory described herein is intended to include, without being limited to, these and any other suitable types of memory.
Based on the same inventive concept, an embodiment of this application further provides a communication system, which includes the communication apparatus in FIG. 6 and the communication apparatus in FIG. 7, or the communication apparatus in FIG. 8 and the communication apparatus in FIG. 9, or the communication apparatus in FIG. 10 and the communication apparatus in FIG. 11.
Based on the same inventive concept, an embodiment of this application further provides a computer-readable storage medium including a program or instructions that, when run on a computer, cause the method performed by the network device in the embodiments shown in FIG. 4 and FIG. 5 to be performed.
Based on the same inventive concept, an embodiment of this application further provides a computer-readable storage medium including a program or instructions that, when run on a computer, cause the method performed by the terminal device or the first terminal device in the embodiments shown in FIG. 4 and FIG. 5 to be performed.
Based on the same inventive concept, an embodiment of this application further provides a chip coupled to a memory, where the chip is configured to read and execute program instructions stored in the memory, so that the method performed by the network device in the embodiments shown in FIG. 4 and FIG. 5 is performed.
Based on the same inventive concept, an embodiment of this application further provides a chip coupled to a memory, where the chip is configured to read and execute program instructions stored in the memory, so that the method performed by the terminal device or the first terminal device in the embodiments shown in FIG. 4 and FIG. 5 is performed.
Based on the same inventive concept, an embodiment of this application further provides a computer program product including instructions that, when run on a computer, cause the method performed by the network device in the embodiments shown in FIG. 4 and FIG. 5 to be performed.
Based on the same inventive concept, an embodiment of this application further provides a computer program product including instructions that, when run on a computer, cause the method performed by the terminal device or the first terminal device in the embodiments shown in FIG. 4 and FIG. 5 to be performed.
Since the communication apparatuses shown in FIG. 6 to FIG. 11 provided in the embodiments of this application can be used to perform the methods provided by the corresponding embodiments shown in FIG. 4 and FIG. 5, the technical effects they can achieve can be found in the foregoing method embodiments and are not repeated here.
The embodiments of this application are described with reference to flowcharts and/or block diagrams of the methods, devices (systems), and computer program products according to the embodiments of this application. It should be understood that each procedure and/or block in the flowcharts and/or block diagrams, and combinations of procedures and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general-purpose computer, a special-purpose computer, an embedded processor, or another programmable data processing device to produce a machine, so that the instructions executed by the processor of the computer or other programmable data processing device produce an apparatus for implementing the functions specified in one or more procedures of the flowcharts and/or one or more blocks of the block diagrams.
The foregoing embodiments may be implemented wholly or partly by software, hardware, firmware, or any combination thereof. When implemented by software, they may be implemented wholly or partly in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, the procedures or functions described in the embodiments of this application are produced wholly or partly. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable apparatus. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center in a wired manner (for example, coaxial cable, optical fiber, or digital subscriber line (DSL)) or a wireless manner (for example, infrared, radio, or microwave). The computer-readable storage medium may be any usable medium accessible to a computer, or a data storage device such as a server or data center that integrates one or more usable media. The usable medium may be a magnetic medium (for example, a floppy disk, a hard disk, or a magnetic tape), an optical medium (for example, a digital versatile disc (DVD)), a semiconductor medium (for example, a solid state disk (SSD)), or the like.
Obviously, a person skilled in the art can make various changes and modifications to the embodiments of this application without departing from the spirit and scope of this application. This application is intended to cover these changes and modifications provided that they fall within the scope of the claims of this application and their equivalent technologies.

Claims (32)

1. A method for updating a machine learning model, wherein the method comprises:
    determining, by a network device according to a computing capability of a terminal device, model training configuration information corresponding to the terminal device;
    sending, by the network device, the model training configuration information to the terminal device;
    receiving, by the network device, a model update parameter sent by the terminal device, wherein the model update parameter is a model parameter updated by the terminal device after training a first machine learning model according to the model training configuration information; and
    updating, by the network device, a second machine learning model according to the model update parameter.
2. The method according to claim 1, wherein the model training configuration information comprises at least one of the following:
    hyperparameters;
    accuracy;
    training time information.
3. The method according to claim 1 or 2, wherein the method further comprises:
    sending, by the network device, training feature information to the terminal device, wherein the training feature information indicates a training feature set used by the terminal device to train the first machine learning model.
4. The method according to claim 1 or 2, wherein the method further comprises:
    sending, by the network device, accuracy evaluation information to the terminal device, wherein the accuracy evaluation information comprises at least one of a method for evaluating accuracy or a test sample for evaluating accuracy.
5. The method according to claim 1 or 2, wherein the method further comprises:
    receiving, by the network device, accuracy indication information from the terminal device, wherein the accuracy indication information indicates an accuracy reached by the terminal device after training the first machine learning model by using the model training configuration information.
6. The method according to claim 1 or 2, wherein the method further comprises:
    determining, by the network device, a time point for obtaining the model update parameter of the terminal device; and
    sending an acquisition request to the terminal device at the time point, wherein the acquisition request instructs the terminal device to send the model update parameter of the terminal device to the network device; or
    sending reporting time information to the terminal device, wherein the reporting time information indicates that the model update parameter is to be sent to the network device at the time point.
7. The method according to claim 6, wherein the determining, by the network device, a time point for obtaining the model update parameter of the terminal device comprises:
    determining, by the network device, a transmission duration for each of a plurality of terminal devices to send its model update parameter to the network device; and
    determining, by the network device, the time point according to the transmission durations corresponding to the terminal devices.
8. The method according to claim 6, wherein the acquisition request further indicates that a specified model update parameter needs to be obtained.
9. The method according to claim 6, wherein the method further comprises:
    receiving, by the network device, parameter availability indication information from the terminal device, wherein the parameter availability indication information indicates availability of the model update parameter in the terminal device.
10. A method for updating a machine learning model, wherein the method comprises:
    receiving, by a terminal device, model training configuration information sent by a network device, wherein the model training configuration information is determined according to a computing capability of the terminal device;
    training, by the terminal device, a first machine learning model according to the model training configuration information to obtain a model update parameter; and
    sending, by the terminal device, the model update parameter to the network device, wherein the model update parameter is used by the network device to update a second machine learning model.
11. The method according to claim 10, wherein the method further comprises:
    receiving, by the terminal device, training feature information from the network device, wherein the training feature information indicates a training feature set used by the terminal device to train the first machine learning model; and
    the training, by the terminal device, a first machine learning model according to the model training configuration information comprises:
    training, by the terminal device, the first machine learning model according to the model training configuration information and the training feature information.
12. The method according to claim 10, wherein the method further comprises:
    receiving, by the terminal device, accuracy evaluation information from the network device, wherein the accuracy evaluation information comprises at least one of a method for evaluating accuracy or a test sample for evaluating accuracy; and
    determining, by the terminal device according to the accuracy evaluation information, an accuracy reached by the trained first machine learning model.
13. The method according to any one of claims 10 to 12, wherein the method further comprises:
    sending, by the terminal device, accuracy indication information to the network device, wherein the accuracy indication information indicates an accuracy reached by the terminal device after training the first machine learning model by using the model training configuration information.
14. The method according to any one of claims 10 to 12, wherein the method further comprises:
    receiving, by the terminal device, an acquisition request from the network device, and sending the model update parameter to the network device according to the acquisition request, wherein the acquisition request instructs the terminal device to send the model update parameter of the terminal device to the network device; or
    receiving, by the terminal device, reporting time information from the network device, and sending the model update parameter to the network device at a time point indicated by the reporting time information.
15. The method according to any one of claims 10 to 12, wherein the method further comprises:
    sending, by the terminal device, parameter availability indication information to the network device, wherein the parameter availability indication information indicates availability of the model update parameter in the terminal device.
16. A communication apparatus, comprising:
    a processing unit, configured to determine, according to a computing capability of a terminal device, model training configuration information corresponding to the terminal device; and
    a communication unit, configured to send the model training configuration information to the terminal device and to receive a model update parameter sent by the terminal device, wherein the model update parameter is a model parameter updated by the terminal device after training a first machine learning model according to the model training configuration information;
    wherein the processing unit is further configured to update a second machine learning model according to the model update parameter.
17. The apparatus according to claim 16, wherein the model training configuration information comprises at least one of the following:
    hyperparameters;
    accuracy;
    training time information.
18. The apparatus according to claim 16 or 17, wherein the communication unit is further configured to:
    send training feature information to the terminal device, wherein the training feature information indicates a training feature set used by the terminal device to train the first machine learning model.
19. The apparatus according to claim 16 or 17, wherein the communication unit is further configured to:
    send accuracy evaluation information to the terminal device, wherein the accuracy evaluation information comprises at least one of a method for evaluating accuracy or a test sample for evaluating accuracy.
20. The apparatus according to claim 16 or 17, wherein the communication unit is further configured to:
    receive accuracy indication information from the terminal device, wherein the accuracy indication information indicates an accuracy reached by the terminal device after training the first machine learning model by using the model training configuration information.
21. The apparatus according to claim 16 or 17, wherein
    the processing unit is further configured to:
    determine a time point for obtaining the model update parameter of the terminal device; and
    correspondingly, the communication unit is further configured to:
    send an acquisition request to the terminal device at the time point, wherein the acquisition request instructs the terminal device to send the model update parameter of the terminal device to the network device; or
    send reporting time information to the terminal device, wherein the reporting time information indicates that the model update parameter is to be sent to the network device at the time point.
22. The apparatus according to claim 21, wherein the processing unit is specifically configured to:
    determine a transmission duration for each of a plurality of terminal devices to send its model update parameter to the network device; and
    determine the time point according to the transmission durations corresponding to the terminal devices.
23. The apparatus according to claim 21, wherein the acquisition request further indicates that a specified model update parameter needs to be obtained.
24. The apparatus according to claim 21, wherein the communication unit is further configured to:
    receive parameter availability indication information from the terminal device, wherein the parameter availability indication information indicates availability of the model update parameter in the terminal device.
25. A communication apparatus, comprising:
    a communication unit, configured to receive model training configuration information sent by a network device, wherein the model training configuration information is determined according to a computing capability of a terminal device; and
    a processing unit, configured to train a first machine learning model according to the model training configuration information to obtain a model update parameter;
    wherein the communication unit is further configured to send the model update parameter to the network device, and the model update parameter is used by the network device to update a second machine learning model.
26. The apparatus according to claim 25, wherein the communication unit is further configured to:
    receive training feature information from the network device, wherein the training feature information indicates a training feature set used by the terminal device to train the first machine learning model; and
    correspondingly, the processing unit is further configured to:
    train the first machine learning model according to the model training configuration information and the training feature information.
27. The apparatus according to claim 25, wherein the communication unit is further configured to:
    receive accuracy evaluation information from the network device, wherein the accuracy evaluation information comprises at least one of a method for evaluating accuracy or a test sample for evaluating accuracy; and
    correspondingly, the processing unit is further configured to:
    determine, according to the accuracy evaluation information, an accuracy reached by the trained first machine learning model.
28. The apparatus according to any one of claims 25 to 27, wherein the communication unit is configured to:
    send accuracy indication information to the network device, wherein the accuracy indication information indicates an accuracy reached by the terminal device after training the first machine learning model by using the model training configuration information.
29. The apparatus according to any one of claims 25 to 27, wherein the communication unit is further configured to:
    receive an acquisition request from the network device and send the model update parameter to the network device according to the acquisition request, wherein the acquisition request instructs the terminal device to send the model update parameter of the terminal device to the network device; or
    receive reporting time information from the network device and send the model update parameter to the network device at a time point indicated by the reporting time information.
30. The apparatus according to any one of claims 25 to 27, wherein the communication unit is further configured to:
    send parameter availability indication information to the network device, wherein the parameter availability indication information indicates availability of the model update parameter in the terminal device.
  31. A communication apparatus, comprising:
    at least one processor; and a memory and a communication interface that are communicatively connected to the at least one processor;
    wherein the memory stores instructions executable by the at least one processor, and the at least one processor executes the instructions stored in the memory, to cause the apparatus to perform the method according to any one of claims 1 to 9 or claims 10 to 15.
  32. A computer-readable storage medium, comprising a program or instructions, wherein when the program or instructions are run on a computer, the method according to any one of claims 1 to 9 or claims 10 to 15 is performed.
PCT/CN2021/100003 2020-08-24 2021-06-15 Method for updating machine learning model, and communication apparatus WO2022041947A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010858858.7 2020-08-24
CN202010858858.7A CN114091679A (en) 2020-08-24 2020-08-24 Method for updating machine learning model and communication device

Publications (1)

Publication Number Publication Date
WO2022041947A1 true WO2022041947A1 (en) 2022-03-03

Family

ID=80295726

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/100003 WO2022041947A1 (en) 2020-08-24 2021-06-15 Method for updating machine learning model, and communication apparatus

Country Status (2)

Country Link
CN (1) CN114091679A (en)
WO (1) WO2022041947A1 (en)

Families Citing this family (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2023173434A1 (en) * 2022-03-18 2023-09-21 北京小米移动软件有限公司 Channel estimation method, apparatus and device, and storage medium
CN117178579A (en) * 2022-03-31 2023-12-05 北京小米移动软件有限公司 Method and device for determining model used by terminal equipment
WO2023221000A1 (en) * 2022-05-18 2023-11-23 北京小米移动软件有限公司 Authentication and authorization method and apparatus for ai function in core network
WO2024031246A1 (en) * 2022-08-08 2024-02-15 Nec Corporation Methods for communication
WO2024031697A1 (en) * 2022-08-12 2024-02-15 Zte Corporation Device capability and performance monitoring for a model
WO2024036605A1 (en) * 2022-08-19 2024-02-22 Lenovo (Beijing) Ltd. Support of ue centric ai based temporal beam prediction
CN117714309A (en) * 2022-09-13 2024-03-15 华为技术有限公司 Data transmission method, device, equipment and storage medium
WO2024055306A1 (en) * 2022-09-16 2024-03-21 Oppo广东移动通信有限公司 Communication device, method and apparatus, storage medium, chip, product, and program
CN117834427A (en) * 2022-09-26 2024-04-05 维沃移动通信有限公司 Method and device for updating AI model parameters and communication equipment
WO2024065681A1 (en) * 2022-09-30 2024-04-04 Shenzhen Tcl New Technology Co., Ltd. Communication devices and methods for machine learning model monitoring
CN117997738A (en) * 2022-11-03 2024-05-07 展讯通信(上海)有限公司 AI model updating method and communication device
WO2024103271A1 (en) * 2022-11-16 2024-05-23 华为技术有限公司 Communication methods and related apparatuses

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2017156791A1 (en) * 2016-03-18 2017-09-21 Microsoft Technology Licensing, Llc Method and apparatus for training a learning machine
CN110263908A (en) * 2019-06-20 2019-09-20 深圳前海微众银行股份有限公司 Federal learning model training method, equipment, system and storage medium
CN111310932A (en) * 2020-02-10 2020-06-19 深圳前海微众银行股份有限公司 Method, device and equipment for optimizing horizontal federated learning system and readable storage medium
CN111401552A (en) * 2020-03-11 2020-07-10 浙江大学 Federal learning method and system based on batch size adjustment and gradient compression rate adjustment

Cited By (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114513270A (en) * 2022-03-07 2022-05-17 苏州大学 Heterogeneous wireless network spectrum resource sensing method and system based on federal learning
CN114513270B (en) * 2022-03-07 2022-12-02 苏州大学 Heterogeneous wireless network spectrum resource sensing method and system based on federal learning
WO2023184310A1 (en) * 2022-03-31 2023-10-05 Qualcomm Incorporated Centralized machine learning model configurations
WO2023206437A1 (en) * 2022-04-29 2023-11-02 富士通株式会社 Information transmission method and apparatus
WO2023207026A1 (en) * 2022-04-29 2023-11-02 富士通株式会社 Information indication method and apparatus and information processing method and apparatus
WO2023206583A1 (en) * 2022-04-30 2023-11-02 Qualcomm Incorporated Techniques for training devices for machine learning-based channel state information and channel state feedback
WO2023231620A1 (en) * 2022-06-02 2023-12-07 华为技术有限公司 Communication method and apparatus
WO2024007989A1 (en) * 2022-07-05 2024-01-11 维沃移动通信有限公司 Information reporting method and apparatus, terminal, and access network device
WO2024020747A1 (en) * 2022-07-25 2024-02-01 北京小米移动软件有限公司 Model generation method and apparatus
WO2024026844A1 (en) * 2022-08-05 2024-02-08 Nokia Shanghai Bell Co., Ltd. Monitoring data events for updating model
WO2024041563A1 (en) * 2022-08-24 2024-02-29 中国电信股份有限公司 Model acquisition method, apparatus and system
EP4346177A1 (en) * 2022-09-29 2024-04-03 Nokia Technologies Oy Ai/ml operation in single and multi-vendor scenarios
WO2024065709A1 (en) * 2022-09-30 2024-04-04 华为技术有限公司 Communication method and related device
WO2024078615A1 (en) * 2022-10-14 2024-04-18 维沃移动通信有限公司 Model selection method, terminal and network-side device
WO2024088119A1 (en) * 2022-10-25 2024-05-02 维沃移动通信有限公司 Data processing method and apparatus, and terminal and network-side device
WO2024094038A1 (en) * 2022-11-01 2024-05-10 华为技术有限公司 Method for switching or updating ai model, and communication apparatus

Also Published As

Publication number Publication date
CN114091679A (en) 2022-02-25

Similar Documents

Publication Publication Date Title
WO2022041947A1 (en) Method for updating machine learning model, and communication apparatus
US20230179490A1 (en) Artificial intelligence-based communication method and communication apparatus
US20230016595A1 (en) Performing a handover procedure
US20230209390A1 (en) Intelligent Radio Access Network
US20210385682A1 (en) Configuration of a neural network for a radio access network (ran) node of a wireless network
WO2022141295A1 (en) Communication method and apparatus
US20230217308A1 (en) Traffic flow prediction in a wireless network using heavy-hitter encoding and machine learning
US20230224752A1 (en) Communication method, apparatus, and system
WO2022121804A1 (en) Method for semi-asynchronous federated learning and communication apparatus
WO2022226713A1 (en) Method and apparatus for determining policy
US20230100253A1 (en) Network-based artificial intelligence (ai) model configuration
WO2022082356A1 (en) Communication method and apparatus
US20240015534A1 (en) Model processing method, communication apparatus, and system
US20230289615A1 (en) Training a machine learning model
US11863354B2 (en) Model transfer within wireless networks for channel estimation
CN117716674A (en) Network resource model-based solution for AI-ML model training
WO2022183362A1 (en) Communication method, device and storage medium
WO2024031535A1 (en) Wireless communication method, terminal device, and network device
WO2023185452A1 (en) Communication method and communication apparatus
WO2024027427A1 (en) Anomaly detection method and communication apparatus
CN114143832B (en) Service processing method, device and storage medium
US12010571B2 (en) Spectral efficiency prediction with artificial intelligence for enhancing carrier aggregation and proactive radio resource management
WO2023226004A1 (en) Network device and method for prediction operation
WO2023236774A1 (en) Intent management method and apparatus
US20230353326A1 (en) Nr framework for beam prediction in spatial domain

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21859791

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21859791

Country of ref document: EP

Kind code of ref document: A1