WO2024044881A1 - Data processing method, training method and related apparatus - Google Patents

Data processing method, training method and related apparatus

Info

Publication number
WO2024044881A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
information
machine learning
processing
learning model
Prior art date
Application number
PCT/CN2022/115466
Other languages
English (en)
French (fr)
Inventor
王坚
滕伟
李榕
Original Assignee
Huawei Technologies Co., Ltd. (华为技术有限公司)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co., Ltd.
Priority to PCT/CN2022/115466 priority Critical patent/WO2024044881A1/zh
Publication of WO2024044881A1 publication Critical patent/WO2024044881A1/zh

Links

Images

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G06F21/62 Protecting access to data via a platform, e.g. using keys or access control rules
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning

Definitions

  • This application relates to the technical field of Artificial Intelligence (AI), and in particular to a data processing method, training method and related devices.
  • AI: Artificial Intelligence
  • the process of processing original data to obtain high-quality data can be regarded as a data noise reduction process.
  • by performing multiple denoising passes on the original data through a machine learning model, high-quality data can be obtained.
  • the entire process can be modeled using a Markov chain.
  • the training process of the machine learning model used for noise reduction is highly complex, and using the machine learning model for inference, that is, denoising the data, often requires performing a large number of denoising passes.
  • the data processing process is therefore relatively complex, so a large amount of computing resources is required on the device to support model training and inference, making it difficult to implement on most devices with weak computing capabilities.
  • This application provides a data processing method. By deploying machine learning models with the same structure on different devices, multiple devices jointly complete data processing, continuously improving the quality of the obtained data and reducing the data processing pressure on each device, ensuring that devices with weak computing power can also process data.
  • a first aspect of the present application provides a data processing method.
  • the data processing method is executed by a first device.
  • the first device may be a terminal device or a network device, or some components of the terminal device or network device (for example, a processor, chip or chip system).
  • the first device may also be a logic module and/or software that can realize all or part of the functions of the terminal device.
  • the method includes: the first device receives first data from the second device, where the first data is data processed by the first machine learning model; that is, after the second device obtains the first data through processing by the first machine learning model, it sends the first data to the first device. Then, the first device processes the first data through the second machine learning model to obtain second data.
  • the structure of the first machine learning model is the same as the structure of the second machine learning model, and the first device and the second device are used to jointly perform data processing.
  • the first device and the second device may also be referred to as distributed devices. Different distributed devices realize joint processing of data by exchanging data.
  • after the second device processes the data through a certain machine learning model, it sends the processed data to the first device; the first device continues to process the data with a machine learning model of the same structure, so that the two devices jointly perform data processing.
  • the second machine learning model is a diffusion model
  • the data processing process using the diffusion model can be modeled as a Markov chain
  • the second machine learning model is used to denoise the first data.
  • the diffusion model can be implemented through neural networks, such as fully connected neural networks, convolutional neural networks, residual neural networks, etc.
  • the process of processing data through the diffusion model refers to repeatedly feeding the output of the previous processing pass back into the diffusion model, thereby achieving multi-step processing based on the same diffusion model and finally obtaining high-quality output data.
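The multi-step loop described above can be sketched as follows. This is an illustrative toy, not the patent's actual model: `denoise_step` stands in for one pass through a diffusion model (in practice a neural network), and each call removes part of the remaining noise, forming the Markov chain of denoising steps.

```python
def denoise_step(x, alpha=0.5):
    # Toy update: shrink the remaining noise component by a constant factor.
    return [v * alpha for v in x]

def denoise(x, num_steps):
    # Feed the previous output back into the same model, num_steps times.
    for _ in range(num_steps):
        x = denoise_step(x)
    return x

noisy = [8.0, -4.0]
clean = denoise(noisy, num_steps=3)  # each component shrunk by 0.5 ** 3
```

The point of the sketch is only the control flow: one model, applied serially, with each pass consuming the previous pass's output.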
  • diffusion models with the same structure are deployed on different devices, and the different devices serially implement joint processing of data.
  • on the basis of obtaining high-quality data, this reduces the data processing pressure on each device.
  • the method further includes: the first device receiving first information from the second device, the first information being used to request the first device to perform processing on the first data. For example, when the second device needs to obtain high-quality data but cannot complete the entire processing itself, the second device uses the first machine learning model to partially process the original data to obtain the first data; the second device then sends the first data and the first information to the first device, requesting the first device to assist in completing the data processing.
  • in this way, devices with weak data processing capabilities can request assistance from other devices, making full use of each device's processing capability and ensuring that devices with relatively weak data processing capabilities can also obtain data of the quality they require.
  • the first information sent by the second device to the first device is used to indicate that the number of times the first data needs to be processed is the first number.
  • the first device processes the first data the first number of times through the second machine learning model to obtain the second data.
  • the capability of the first device supports completing the first number of processing on the first data.
  • since the second device requested the first device's assistance, after the first device obtains the second data, it sends the second data back to the second device, feeding the processed result back to the requester.
  • the first information is also used to indicate information about the source device.
  • the source device is the device that originally requested assistance in processing the data; the second device is one device that assists in processing the data, and the first device is another device that assists in processing the data.
  • the first device sends the second data to the source device, ensuring that the source device can obtain the final processed data.
  • the first information sent by the second device to the first device is used to indicate that the number of times the first data needs to be processed is the first number.
  • the first device processes the first data a second number of times through the second machine learning model to obtain the second data, where the first number is greater than the second number; that is, the capability of the first device does not support completing the first number of processing passes on the first data.
  • the first device sends the second data and the second information to the third device, where the second information is used to indicate that the number of times the second data needs to be processed is the third number, the third number being the difference between the first number and the second number.
  • the third device is used to assist the first device in performing data processing.
  • for example, the second device requests the first device to assist in processing the first data 1,000 times, but the capability of the first device only supports 600 processing passes. The first device therefore processes the data 600 times to obtain the second data, then sends the second data and the second information to the third device, requesting the third device to continue processing the second data the remaining 400 times.
  • in this way, each device completes part of the data processing according to its own capability and forwards the partially processed data to the next device, which continues the processing.
  • multiple devices can thus be coordinated to jointly complete data processing, fully utilizing each device's processing capability and ensuring that devices with weak computing capabilities can also obtain data of the required quality.
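The relay in the 1,000-step example above can be sketched as a simple allocation over a chain of assisting devices. This is a hypothetical illustration: the function name and the per-device capability numbers are not from the patent, which only describes the behavior in prose.

```python
def split_steps(total_steps, capabilities):
    """Assign remaining processing passes to devices in order.

    capabilities: max passes each assisting device can perform, in relay order.
    Returns (per-device pass counts, passes still unassigned).
    """
    remaining = total_steps
    plan = []
    for cap in capabilities:
        done = min(cap, remaining)   # each device does as much as it can
        plan.append(done)
        remaining -= done
        if remaining == 0:
            break
    return plan, remaining

# Example from the text: 1000 passes requested; the first assisting device
# can do 600, so a second device must finish the remaining 400.
plan, left = split_steps(1000, [600, 400])
```

The second element of the return value makes it easy to detect the case where the chain of devices still cannot finish the request and another assisting device is needed.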
  • the method further includes: the first device sending assistance request information to the second device, where the assistance request information is used to request the second device to assist in processing the data.
  • in this way, the first device can proactively request the second device to assist: the second device processes the data first and then hands the processed data to the first device for continued processing, which avoids exchanging the data twice between the first device and the second device.
  • the method further includes: the first device sends third information to the central device, the third information being used to indicate the number of data processing passes required by the first device; the first device receives feedback information from the central device, the feedback information being used to indicate that the second device is an assisting node. That is, the first device first reports to the central device the number of data processing passes it requires, and the central device coordinates among multiple devices, instructing the second device to assist the first device in processing data.
  • the second device may also feed back to the central device the number of processing times of the data required by the second device.
  • in this way, the central device can determine that the second device processes the data first and that the first device then continues from the second device's output, effectively reusing the data processed by the second device and improving the first device's data processing efficiency.
  • the central device is used to coordinate the joint processing of data among various distributed devices, and the data processing tasks of each distributed device can be determined based on the needs of each distributed device, thereby improving the efficiency of joint processing of data by distributed devices.
  • the method further includes: the first device receiving fourth information from the central device, where the fourth information is used to indicate the number of times the first device needs to perform processing on the data received from the second device.
  • the first device processes the first data through the second machine learning model according to the fourth information to obtain the second data required by the first device.
  • the central device can determine the data processing order and the number of processing passes for each distributed device based on their data processing needs, so that each distributed device can determine the number of processing passes to perform after receiving data sent by other distributed devices.
  • the fourth information is also used to indicate information of a third device
  • the third device is a device to receive data processed by the first device.
  • the first device sends the second data to the third device according to the fourth information. That is, after the first device obtains the second data with the assistance of the second device, it can send the second data to the third device so that the third device can use the second data or continue processing it.
  • the central device can indicate in the information fed back to the distributed device which distributed device the distributed device needs to receive data from, the number of times to process the data after receiving it, and which distributed device the processed data needs to be sent to. This effectively achieves coordinated data joint processing between distributed devices.
  • the method further includes: the first device receiving fifth information from the second device, where the fifth information is used to indicate the number of times the first data has been processed.
  • the first device processes the first data through the second machine learning model based on the number of times of processing corresponding to the first data and the number of times of processing of the data required by the first device, to obtain the second data required by the first device.
  • the preceding distributed device indicates how many times it has already processed the data, so the subsequent distributed device can determine from this count how many further processing passes the data still needs. This ensures joint processing without the central device having to specify the processing counts, allowing each distributed device to dynamically adjust its number of processing passes according to actual operating conditions.
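The bookkeeping implied by the fifth information is a one-line subtraction: the sender reports how many passes it has already performed, and the receiver derives the passes it still has to run to meet its own quality requirement. A minimal sketch, with illustrative names not taken from the patent:

```python
def remaining_steps(already_processed, required_total):
    """Passes the receiving device must still perform.

    already_processed: count reported by the preceding device (fifth information).
    required_total: total passes the receiving device needs for its quality target.
    """
    # Clamp at zero: if the sender already exceeded the target, nothing remains.
    return max(required_total - already_processed, 0)

# Continuing the earlier example: 600 passes done upstream, 1000 required.
steps = remaining_steps(already_processed=600, required_total=1000)
```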
  • the second aspect of the present application provides a data processing method.
  • the data processing method is executed by a first device.
  • the first device may be a terminal device or a network device, or some components of the terminal device or network device (for example, a processor, chip or chip system).
  • the first device may also be a logic module and/or software that can realize all or part of the functions of the terminal device.
  • the method includes: first, the first device processes the original data through the first machine learning model to obtain the first data. Then, the first device sends the first data to the second device. Finally, the first device receives the second data sent by the second device or another device, the second data being obtained by processing the first data based on the second machine learning model.
  • the structure of the first machine learning model is the same as that of the second machine learning model.
  • the first device and the second device can interact in advance, so that after the second device receives the data sent by the first device, it can determine that the data needs to be processed a certain number of times.
  • the first device first determines the number of times the original data needs to be processed, and performs a certain number of times of processing on the original data to obtain the first data. Since the number of times the first device performs processing on the original data is less than the number of times the original data needs to be processed, the first device sends the first data to the second device, and by default requests the second device to assist in processing the first data.
  • in this way, multiple devices jointly complete data processing, continuously improving the quality of the obtained data, reducing the data processing pressure on each device, and ensuring that devices with weaker computing capability can also obtain data of the quality they require.
  • the first machine learning model is a diffusion model
  • the data processing process using the diffusion model can be modeled as a Markov chain
  • the first machine learning model is used to perform denoising processing on the original data.
  • the first device may also send first information to the second device, where the first information is used to request the second device to process the first data, and/or to indicate the number of times the first data needs to be processed; this number is determined based on the number of times the original data needs to be processed and the number of times the first device has already processed the original data.
  • in addition to sending the first data to the second device, the first device also sends the first information to instruct the second device how to process the first data.
  • the third aspect of the present application provides a data processing method.
  • the data processing method is executed by a central device.
  • the central device may be a terminal device or a network device, or some components of the terminal device or network device (such as a processor, chip or chip system).
  • the central device may also be a logic module and/or software that can realize all or part of the functions of the terminal device.
  • the method includes: the central device receives first information from the first device and second information from the second device, where the first information is used to indicate the first number of data processing passes required by the first device, and the second information is used to indicate the second number of data processing passes required by the second device.
  • the data processing model corresponding to the first number of processing times is the same as the data processing model corresponding to the second number of processing times.
  • the central device sends the third information to the second device.
  • the third information is used to instruct the second device to send the processed data to the first device.
  • the second processing number of data required by the second device is less than or equal to the first processing number of data required by the first device.
  • the first device and the second device can also be called distributed devices, and both the first device and the second device feed back the number of required data processing times to the central device.
  • the central device determines the data processing order of the first device and the second device based on the number of processing passes each requires, and accordingly instructs the second device to send its processed data to the first device. That is, the second device processes the data first, and then sends the processed data to the first device.
  • the central device is used to coordinate the joint processing of data among various distributed devices, and the data processing tasks of each distributed device can be determined based on the needs of each distributed device, thereby improving the efficiency of joint processing of data by distributed devices.
  • the method further includes: the central device sending fourth information to the first device, where the fourth information is used to indicate the number of times the first device needs to process the data received from the second device.
  • for example, the fourth information sent by the central device may instruct the first device that the data received from the second device needs to be processed 400 more times.
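The central device's scheduling logic in this aspect can be sketched as follows. This is a hypothetical illustration of the ordering rule the text describes (the device needing fewer passes runs first, and later devices only run the difference); the function and device names are not from the patent.

```python
def schedule(requirements):
    """Order devices so each reuses the previous device's output.

    requirements: {device_name: total processing passes that device needs}.
    Returns (processing order, extra passes each device runs itself).
    """
    # The device needing the fewest passes processes first; a device needing
    # more passes can then start from that output.
    order = sorted(requirements, key=requirements.get)
    prev = 0
    extra = {}
    for dev in order:
        extra[dev] = requirements[dev] - prev  # only the difference is new work
        prev = requirements[dev]
    return order, extra

# The first device needs 1000 passes, the second only 600: the second runs
# first, and the first adds 400 passes on top of the second's output.
order, extra = schedule({"first": 1000, "second": 600})
```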
  • the fourth aspect of the present application provides a model training method, which is applied to a first device in the training system.
  • the first device may be a terminal device or a network device, or some components of the terminal device or network device (such as a processor, chip or chip system).
  • the first device may also be a logic module and/or software that can realize all or part of the functions of the terminal device.
  • the training system includes multiple distributed devices. Specifically, the method includes: first, the first device obtains a training sample set, the training sample set includes first data and second data, the first data is obtained based on the second data, and the second data is the training label of the first data.
  • the first device trains the first machine learning model based on the training sample set to obtain the trained first machine learning model.
  • the first data is used as input data of the first machine learning model
  • the first machine learning model processes the first data; a loss function is calculated based on the processed data and the second data, and the parameters of the first machine learning model are updated based on the loss function, thereby training the first machine learning model.
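The process/compare/update loop above can be sketched with a deliberately tiny one-parameter "model" so the loss and update steps stay visible. The real first machine learning model is a neural-network diffusion model; this stand-in only illustrates the loop structure (the learning rate, epoch count, and toy data are assumptions for the sketch).

```python
def train(pairs, w=0.0, lr=0.01, epochs=200):
    """Gradient-descent loop: process input, compare with label, update."""
    for _ in range(epochs):
        for x, y in pairs:
            pred = w * x                # model output for the noisier sample
            grad = 2.0 * (pred - y) * x  # derivative of squared loss w.r.t. w
            w -= lr * grad              # parameter update from the loss
    return w

# (first_data, second_data) pairs: input sample and its training label.
pairs = [(2.0, 1.0), (4.0, 2.0)]
w = train(pairs)  # converges toward w = 0.5, where w * x matches every label
```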
  • the first device sends the trained first machine learning model to the second device.
  • the second device is a device for aggregating machine learning models with the same structure and different parameters that are trained by multiple devices.
  • the second device may also be called an aggregation device.
  • the method further includes: the first device sending first information to the third device, where the first information is used to indicate capabilities related to model training on the first device.
  • the third device is used to determine the training content that the multiple devices are responsible for based on the capabilities of the multiple devices participating in the machine learning model training. Therefore, the third device can also be called a central device.
  • the first device receives second information from the third device, and the second information is used to indicate the number of times the input data is processed by the first machine learning model trained on the first device.
  • the second information is also used to indicate requirements for input data of the first machine learning model. For example, when the input data of the first machine learning model is obtained by subjecting the target data to noise processing, the requirement for the input data of the first machine learning model may be the number of times the target data is subject to noise processing.
  • the process includes: the first device processes the target data according to the input-data requirement indicated by the second information and the number of times the first machine learning model processes the input data, obtaining the second data and the first data.
  • for example, the input-data requirement may be to perform M-N to M rounds of noise addition on the target data to obtain a set of data, from which the first data and the second data are taken, where the second data is the training label of the first data, and the number of noise-addition rounds used to obtain the second data is smaller than the number used to obtain its corresponding first data.
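Building a training pair under the requirement above can be sketched as follows. `add_noise` is a deterministic toy stand-in for one noise-addition round (real diffusion training adds random Gaussian noise), and all names here are illustrative, not from the patent.

```python
def add_noise(x, scale=0.1):
    # Toy stand-in: each round shifts the data by a fixed amount.
    return [v + scale for v in x]

def noised_sequence(target, m_minus_n, m):
    """Apply m rounds of noise; keep the samples at rounds m_minus_n .. m."""
    seq, x = {}, list(target)
    for k in range(1, m + 1):
        x = add_noise(x)
        if k >= m_minus_n:
            seq[k] = list(x)  # snapshot the requested noise levels
    return seq

seq = noised_sequence([1.0], m_minus_n=2, m=4)
second_data = seq[2]  # fewer noise rounds: serves as the training label
first_data = seq[3]   # more noise rounds: serves as the model input
```

The pairing rule matches the text: the label (second data) always comes from fewer noise rounds than its corresponding input (first data), so the model learns to undo one or more rounds of noise.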
  • the method further includes: the first device receives a second machine learning model from the second device; the first device trains the second machine learning model based on the training sample set to obtain the trained a second machine learning model; the first device sends the trained second machine learning model to the second device.
  • after the second device aggregates the machine learning models of multiple distributed devices, it sends the aggregated second machine learning model back to the first device so that the first device can continue training the second machine learning model.
  • the fifth aspect of the present application provides a model training method, which method is applied to a first device.
  • the first device may be a terminal device or a network device, or some components of the terminal device or network device (such as a processor, chip or chip system).
  • the first device may also be a logic module and/or software that can realize all or part of the functions of the terminal device.
  • the method includes: the first device receives a plurality of capability information items from a plurality of different devices, each item indicating capabilities related to model training on the corresponding device. Then, the first device sends different training configuration information to the different devices according to the capability information.
  • the training configuration information is used to indicate the number of times the machine learning model trained on the device processes the input data.
  • the training configuration information is also used to indicate the input-data requirements of the machine learning model trained on the device.
  • the machine learning models trained by multiple different devices have the same structure.
  • the first device may also be called a central device.
  • the process of using a machine learning model to continuously process data can be regarded as a Markov chain, and using a machine learning model to process data once can be regarded as a link in the Markov chain.
  • the central device can split the Markov chain into multiple sub-chains and assign the sub-chains to different distributed devices according to each device's capability; that is, different distributed devices perform different training tasks.
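The sub-chain split can be sketched as a proportional partition of a T-step chain. This is a hypothetical allocation rule (the patent does not prescribe a formula); sizing segments in proportion to a single scalar capability score is an assumption for the sketch, as are the names.

```python
def split_chain(total_steps, capabilities):
    """Partition steps [0, total_steps) into contiguous per-device segments.

    capabilities: {device_name: relative capability score}.
    Returns {device_name: (start_step, end_step)}.
    """
    total_cap = sum(capabilities.values())
    segments, start = {}, 0
    devs = list(capabilities)
    for i, dev in enumerate(devs):
        if i == len(devs) - 1:
            size = total_steps - start  # last device absorbs rounding slack
        else:
            size = total_steps * capabilities[dev] // total_cap
        segments[dev] = (start, start + size)
        start += size
    return segments

# Device "b" is three times as capable as "a", so it trains three quarters
# of a 1000-step chain.
segs = split_chain(1000, {"a": 1, "b": 3})
```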
  • the capabilities related to model training on the distributed device may include the computing capability, storage capability, communication capability and other capabilities of the distributed device.
  • computing power can be measured by the number of operations that a distributed device can perform per second
  • storage capacity can be measured by the amount of storage space allocated to model training on a distributed device
  • communication capability can be measured by the data transfer rate of the distributed device.
  • the capabilities related to model training on the distributed device may also include other capabilities that can affect model training, which are not specifically limited here.
  • a sixth aspect of the present application provides a communication device.
  • the communication device includes: a transceiver module, configured to receive first data from a second device, where the first data is data processed by a first machine learning model; and a processing module, configured to process the first data through a second machine learning model to obtain second data.
  • the structure of the first machine learning model is the same as that of the second machine learning model, and the communication device and the second device are used to jointly perform data processing.
  • the second machine learning model is a diffusion model
  • the data processing process using the diffusion model can be modeled as a Markov chain
  • the second machine learning model is used to denoise the first data.
  • the transceiver module is also configured to receive first information from the second device, where the first information is used to request the communication device to perform processing on the first data.
  • the first information is used to indicate that the number of times the first data needs to be processed is the first number; the processing module is also configured to process the first data the first number of times through the second machine learning model to obtain the second data, where the capability of the first device supports completing the first number of processing passes on the first data.
  • the transceiver module is also used to send the second data to the second device; or, the transceiver module is used to send the second data to the source device, where the first information is also used to indicate the information of the source device.
  • the source device is the first device that requests assistance in processing data.
  • the first information is used to indicate that the number of times the first data needs to be processed is the first number; the processing module is also configured to process the first data a second number of times through the second machine learning model to obtain the second data, where the first number is greater than the second number and the capability of the first device does not support completing the first number of processing passes on the first data; the transceiver module is also configured to send the second data and the second information to the third device, where the second information is used to indicate that the number of times the second data needs to be processed is the third number, the third number being the difference between the first number and the second number, and the third device is used to assist the communication device in performing the data processing.
  • the transceiver module is also configured to send assistance request information to the second device, where the assistance request information is used to request the second device to assist in processing data.
  • the transceiver module is also configured to send third information to the central device, the third information being used to indicate the number of data processing passes required by the communication device; the transceiver module is also configured to receive feedback information from the central device, the feedback information being used to indicate that the second device is an assisting node.
  • the transceiver module is also configured to receive fourth information from the central device, the fourth information being used to indicate the number of times the communication device needs to process the data received from the second device; the processing module is also configured to process the first data through the second machine learning model according to the fourth information to obtain the second data required by the communication device.
  • the fourth information is also used to indicate information about a third device
  • the third device is a device to receive data processed by the first device; the transceiver module is also configured to send the second data to the third device according to the fourth information.
  • the transceiver module is also configured to receive fifth information from the second device, the fifth information being used to indicate the number of times the first data has already been processed; the processing module is also configured to process the first data through the second machine learning model according to that number and the number of processing passes required by the communication device, to obtain the second data required by the communication device.
  • the transceiver module is a transceiver
  • the processing module is a processor
  • a seventh aspect of the present application provides a communication device, including: a processing module, configured to process original data through a first machine learning model to obtain first data; a transceiver module, configured to send the first data to a second device; transceiver The module is also used to receive second data sent by the second device or other devices. The second data is obtained by processing the first data based on the second machine learning model.
• the structure of the first machine learning model is the same as the structure of the second machine learning model.
  • the first machine learning model is a diffusion model
  • the data processing process using the diffusion model can be modeled as a Markov chain
  • the first machine learning model is used to perform denoising processing on the original data.
• the transceiver module is also configured to send first information to the second device, where the first information is used to request the second device to process the first data, and/or the first information is used to indicate the number of times the first data needs to be processed, where the number of times the first data needs to be processed is determined based on the number of times the original data needs to be processed and the number of times the first device has processed the original data.
  • the transceiver module is a transceiver
  • the processing module is a processor
• an eighth aspect of the present application provides a communication device, including: a transceiver module configured to receive first information from a first device and second information from a second device, where the first information is used to indicate a first number of processing times of data required by the first device, the second information is used to indicate a second number of processing times of data required by the second device, and the data processing model corresponding to the first number of processing times is the same as the data processing model corresponding to the second number of processing times. The transceiver module is also used to send third information to the second device, where the third information is used to instruct the second device to send processed data to the first device, and the second number of processing times of data required by the second device is less than or equal to the first number of processing times of data required by the first device.
  • the transceiver module is further configured to send fourth information to the first device, where the fourth information is used to indicate the number of times the first device needs to perform processing on data received from the second device.
  • the transceiver module is a transceiver.
  • a ninth aspect of the present application provides a model training device, including a transceiver module for obtaining a training sample set.
  • the training sample set includes first data and second data.
  • the first data is obtained based on the second data, and the second data is the training label of the first data;
  • the processing module is used to train the first machine learning model based on the training sample set to obtain the trained first machine learning model, where the first machine learning model is used to process the first data ;
  • the sending module is used to send the trained first machine learning model to the second device.
  • the second device is a device used to aggregate machine learning models with the same structure and different parameters that are trained by multiple devices.
  • the sending module is also configured to send first information to a third device.
  • the first information is used to indicate capabilities related to model training on the model training device.
• the third device is used to determine, based on the capabilities of multiple devices participating in model training, the training content that the multiple devices are responsible for; the transceiver module is also used to receive second information from the third device, where the second information is used to indicate the number of times the first machine learning model trained on the model training device processes the input data, and is also used to indicate the input data requirements of the first machine learning model; the processing module is also used to: process the original data according to the input data requirements indicated by the second information and the number of times the first machine learning model processes the input data, to obtain the second data; and process the second data according to the number of times the first machine learning model processes the input data indicated by the second information, to obtain the first data.
• the transceiver module is also used to receive the second machine learning model from the second device; the processing module is also used to train the second machine learning model based on the training sample set, to obtain the trained second machine learning model; and the sending module is also used to send the trained second machine learning model to the second device.
  • the transceiver module is a transceiver
  • the processing module is a processor
  • a tenth aspect of the present application provides a model training device, including a transceiver module for receiving multiple capability information.
• the multiple capability information comes from multiple different devices, and each capability information in the multiple capability information is used to indicate capabilities related to model training on the corresponding device; the transceiver module is used to send different training configuration information to the multiple different devices according to the multiple capability information, where the training configuration information is used to indicate the number of times the machine learning model trained on a device processes the input data, and is also used to indicate the input data requirements of the machine learning model trained on the device.
  • the machine learning models trained by multiple different devices have the same structure.
  • the transceiver module is a transceiver.
• the eleventh aspect of the embodiments of the present application provides a communication device, including at least one processor, the at least one processor being coupled to a memory; the memory is used to store programs or instructions; and the at least one processor is used to execute the programs or instructions, so that the device implements the method described in the aforementioned first aspect or any possible implementation of the first aspect, or the method described in the aforementioned second aspect or any possible implementation of the second aspect, or the method described in the aforementioned third aspect or any possible implementation of the third aspect, or the method described in the aforementioned fourth aspect or any possible implementation of the fourth aspect, or the method described in the aforementioned fifth aspect or any possible implementation of the fifth aspect.
  • the communication device further includes the above-mentioned memory.
  • the memory and the processor are integrated together, or the memory and the processor are provided separately.
  • the communication device further includes a transceiver for sending and receiving data or signaling.
• a twelfth aspect of the embodiments of the present application provides a computer-readable storage medium that stores one or more computer-executable instructions; when the instructions are executed by a processor, the processor performs the method described in the first aspect or any possible implementation of the first aspect, or the method described in the second aspect or any possible implementation of the second aspect, or the method described in the third aspect or any possible implementation of the third aspect, or the method described in the fourth aspect or any possible implementation of the fourth aspect, or the method described in the fifth aspect or any possible implementation of the fifth aspect.
• a thirteenth aspect of the embodiments of the present application provides a computer program product (or computer program) that stores one or more computer-executable instructions; when the instructions are executed by a processor, the processor performs the method described in the first aspect or any possible implementation of the first aspect, or the method described in the second aspect or any possible implementation of the second aspect, or the method described in the third aspect or any possible implementation of the third aspect, or the method described in the fourth aspect or any possible implementation of the fourth aspect, or the method described in the fifth aspect or any possible implementation of the fifth aspect.
• a fourteenth aspect of the embodiments of the present application provides a chip system, which includes at least one processor and is used to support a communication device in implementing the functions involved in the first aspect or any possible implementation of the first aspect, or to support the communication device in implementing the functions involved in the second aspect or any possible implementation of the second aspect, or to support the communication device in implementing the functions involved in the third aspect or any possible implementation of the third aspect.
  • the chip system may also include a memory for storing necessary program instructions and data of the communication device.
  • the chip system may be composed of chips, or may include chips and other discrete devices.
  • the chip system further includes an interface circuit that provides program instructions and/or data to the at least one processor.
  • the fifteenth aspect of the embodiments of the present application provides a communication system.
• the communication system includes the communication devices according to the sixth and seventh aspects above, and/or the communication system includes the communication devices according to the sixth, seventh, and eighth aspects above, and/or the communication system includes the communication devices according to the ninth and tenth aspects above.
  • Figure 1 is a schematic diagram of the process of data processing by a diffusion model provided by an embodiment of the present application
  • Figure 2 is a partial structural schematic diagram of a fully connected neural network provided by an embodiment of the present application.
  • Figure 3 is a schematic diagram of a neural network training process provided by an embodiment of the present application.
  • Figure 4 is a schematic diagram of the process of backpropagation performed by a neural network provided by an embodiment of the present application
  • Figure 5 is a schematic architectural diagram of a wireless communication system provided by an embodiment of the present application.
  • Figure 6 is a schematic architectural diagram of a smart home communication system provided by an embodiment of the present application.
  • Figure 7 is a schematic flow chart of a model training method provided by an embodiment of the present application.
  • Figure 8 is another schematic flowchart of a model training method provided by an embodiment of the present application.
  • Figure 9A is a schematic flowchart of a data processing method 900 provided by an embodiment of the present application.
  • Figure 9B is another schematic flowchart of a data processing method 900 provided by an embodiment of the present application.
  • Figure 10A is a schematic flowchart of a data processing method 1000 provided by an embodiment of the present application.
  • Figure 10B is another schematic flowchart of a data processing method 1000 provided by an embodiment of the present application.
  • Figure 11 is a schematic flowchart of a data processing method 1100 provided by an embodiment of the present application.
  • Figure 12 is a schematic flowchart of a data processing method 1200 provided by an embodiment of the present application.
  • Figure 13 is a schematic structural diagram of a communication device 1300 provided by an embodiment of the present application.
  • Figure 14 is a schematic structural diagram of a model training device 1400 provided by an embodiment of the present application.
  • Figure 15 is a schematic structural diagram of a communication device 1500 provided by an embodiment of the present application.
  • Figure 16 is a schematic structural diagram of a communication device 1600 provided by an embodiment of the present application.
  • Figure 17 is a schematic structural diagram of a communication device 1700 provided by an embodiment of the present application.
• the terms “first”, “second”, etc. in the description, claims, and drawings of this application are used to distinguish similar objects (for example, to distinguish objects in the same embodiment), and are not necessarily used to describe a specific order or sequence; in different embodiments, the objects defined by “first”, “second”, etc. may refer to different objects.
  • the “first device” in the first embodiment may refer to a distributed node
• the “first device” in the second embodiment may refer to a central node. It is to be understood that the terms so used are interchangeable under appropriate circumstances, so that the embodiments described herein can be practiced in sequences other than those illustrated or described herein.
• terminal equipment refers to wireless terminal equipment that can receive scheduling information and instruction information sent by network equipment. A wireless terminal device may be a device that provides voice and/or data connectivity to a user, a handheld device with wireless connection capabilities, or another processing device connected to a wireless modem.
  • Terminal equipment can communicate with one or more core networks or the Internet via a wireless access network (RAN).
• the terminal device may be a mobile terminal device, such as a mobile phone (also known as a “cellular” phone), a computer with a data card, or a portable, pocket-sized, handheld, computer built-in, or vehicle-mounted mobile device that exchanges voice and/or data with the radio access network.
• the terminal device may also be a personal communication service (PCS) phone, a cordless phone, a Session Initiation Protocol (SIP) phone, a wireless local loop (WLL) station, a personal digital assistant (PDA), a tablet computer (Tablet PC), a computer with wireless transceiver functions, or another such device.
• terminal equipment may also be called a system, subscriber unit, subscriber station, mobile station (MS), remote station, access point (AP), remote terminal, access terminal, user terminal, user agent, subscriber station (SS), customer premises equipment (CPE), terminal, user equipment (UE), mobile terminal (MT), and so on.
  • the terminal device may also be a wearable device.
• wearable devices may also be called wearable smart devices or smart wearable devices, a general term for items of daily wear that are intelligently designed and developed using wearable technology, such as glasses, gloves, watches, clothing, and shoes.
  • a wearable device is a portable device that is worn directly on the body or integrated into the user's clothing or accessories. Wearable devices are not just hardware devices, but also achieve powerful functions through software support, data interaction, and cloud interaction.
• wearable smart devices include full-featured, large-sized devices that can achieve complete or partial functions without relying on a smartphone, such as smart watches or smart glasses, as well as devices that focus on only one type of application function and need to be used together with other devices such as smartphones, such as various smart bracelets, smart helmets, and smart jewelry for physical sign monitoring.
• the terminal may also be a drone, a robot, a terminal in device-to-device (D2D) communication, a terminal in vehicle-to-everything (V2X), a virtual reality (VR) terminal device, an augmented reality (AR) terminal device, a wireless terminal in industrial control, a wireless terminal in self-driving, a wireless terminal in remote medical, a wireless terminal in smart grid, a wireless terminal in transportation safety, a wireless terminal in smart city, a wireless terminal in smart home, etc.
• the terminal device may also be a terminal device in a fifth generation (5G) communication system, a later evolved communication system (such as a sixth generation (6G) communication system), or a future evolved public land mobile network (PLMN), etc.
  • 6G networks can further expand the form and function of 5G communication terminals.
  • 6G terminals include but are not limited to cars, cellular network terminals (integrated with satellite terminal functions), drones, and Internet of Things (IoT) devices.
  • the above-mentioned terminal device may have AI processing capabilities and be able to use an AI model to process data.
  • Network equipment may refer to equipment that provides wireless access services in a wireless network.
  • the network device may be a RAN node (or RAN device) that connects the terminal device to the wireless network, which may also be called a base station.
• examples of RAN equipment are: the next generation Node B (gNB), transmission reception point (TRP), evolved Node B (eNB), radio network controller (RNC), Node B (NB), home base station (e.g., home evolved Node B or home Node B, HNB), baseband unit (BBU), or wireless fidelity (Wi-Fi) access point (AP), etc.
  • the network device may include a centralized unit (CU) node, a distributed unit (DU) node, or a RAN device including a CU node and a DU node.
  • the network device may be other devices that provide wireless communication functions for terminal devices.
• the embodiments of this application do not limit the specific technology or specific equipment form used by the network equipment.
  • the network equipment may also include core network equipment.
• the core network equipment may include, for example, network elements such as a mobility management entity (MME), a home subscriber server (HSS), a serving gateway (S-GW), a policy and charging rules function (PCRF), or a public data network gateway (PDN gateway, P-GW) in a fourth generation (4G) network, and network elements such as the access and mobility management function (AMF), user plane function (UPF), or session management function (SMF) in 5G networks.
  • the core network equipment may also include other core network equipment in the 5G network and the next generation or future network of the 5G network.
  • the above-mentioned network device may also be a network node with AI capabilities, and may provide AI services for terminals or other network devices.
• for example, it may be an AI node or a computing power node on the network side (access network or core network), a RAN node with AI capabilities, or a core network element with AI capabilities.
  • the device used to implement the function of the network device may be a network device, or may be a device that can support the network device to implement the function, such as a chip system, and the device may be installed in the network device.
  • Markov chain refers to a stochastic process in probability theory and mathematical statistics that has Markov properties and exists in a discrete index set and state space.
• a Markov chain is a set of discrete random variables with Markov properties. Specifically, consider a probability space (Ω, F, P), where Ω is a non-empty set, also called the sample space, F is a non-empty subset of the power set of the sample space Ω, and P is the probability measure. If the values of the random variables in a random variable set X all lie in a countable set S, and the conditional probability satisfies P(X_{t+1} | X_t, …, X_1) = P(X_{t+1} | X_t), then:
  • X is called a Markov chain
  • the countable set S is called the state space
  • the value of the Markov chain in the state space is called the state.
  • X i is a random variable in the random variable set X, i can be any integer greater than 0;
  • S i is an element in the countable set S.
  • a countable set S is a set in which each element in the set can establish a one-to-one correspondence with each element of the natural number set.
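As an illustrative sketch of the definition above (the two-element state space, transition matrix, and random seed are arbitrary choices for demonstration, not part of this application), a small Markov chain can be simulated as follows; the key point is that each step depends only on the current state:

```python
import random

# State space S = {0, 1} (a countable set) and a single-step transition
# matrix P[i][j] = P(X_{t+1} = j | X_t = i): the next state depends only
# on the current state, which is the Markov property.
P = [[0.9, 0.1],
     [0.4, 0.6]]

def simulate_chain(x0, steps, rng):
    """Draw a trajectory X_0, X_1, ..., X_steps of the Markov chain."""
    states = [x0]
    for _ in range(steps):
        current = states[-1]
        # Sample the next state from the row of P for the current state.
        states.append(0 if rng.random() < P[current][0] else 1)
    return states

rng = random.Random(0)
trajectory = simulate_chain(x0=0, steps=10, rng=rng)
print(trajectory)
```

Each row of P sums to 1, so every step is a valid conditional distribution over the state space.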
  • the diffusion model is an artificial intelligence model that uses the Markov chain principle to process data. For data with noise, the diffusion model can be used to denoise the data, thereby obtaining higher quality data. The following will introduce the principle of denoising data using the diffusion model.
• Figure 1 is a schematic diagram of the process of data processing by a diffusion model provided by an embodiment of the present application.
  • the data X 0 obeys the distribution q(X 0 )
• the data X 0 is gradually noised in order from right to left in Figure 1; each time noise is added, a new piece of data is obtained, yielding a total of T pieces of data from X 1 to X T. This noising process can be regarded as a Markov process.
• the conditional probability of obtaining X 1 to X T from the data X 0 is: q(X_{1:T} | X_0) = ∏_{t=1}^{T} q(X_t | X_{t-1}), where the value range of t is 1 ≤ t ≤ T and ∏ denotes the product. The single-step transition probability q(X_t | X_{t-1}) = N(X_t; √(1−β_t) · X_{t-1}, β_t I) means that going from X_{t-1} to X_t is a Gaussian transformation with mean √(1−β_t) · X_{t-1} and variance β_t; the variance parameter β_t is a designable parameter. The final X T conforms to a Gaussian distribution with mean 0 and variance I, that is, X_T ~ N(0, I).
• the data can be denoised through a reverse process (that is, the process of processing data from left to right in Figure 1). Since the single-step transition probability q(X_{t-1} | X_t) of the reverse process is difficult to calculate, it can be approximated by a neural network as p_θ(X_{t-1} | X_t) = N(X_{t-1}; μ_θ(X_t, t), σ_t² I).
• σ_t is a parameter that is not learned, that is, σ_t is a designable parameter.
• the mean μ_θ(X_t, t) is learned through the neural network; that is, the input of the neural network is the data X_t at step t together with the step index t, and the trainable parameter is θ.
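The forward (noising) chain described above can be sketched numerically; the variance schedule β_t, the step count T, and the toy data below are illustrative choices, not values from this application:

```python
import numpy as np

T = 50                                  # number of diffusion steps
betas = np.linspace(1e-4, 0.2, T)       # designable variance schedule beta_t

def forward_diffusion(x0, rng):
    """Apply the single-step transition
    q(X_t | X_{t-1}) = N(sqrt(1 - beta_t) * X_{t-1}, beta_t * I)
    for t = 1..T and return all intermediate samples X_0..X_T."""
    xs = [x0]
    for t in range(T):
        prev = xs[-1]
        noise = rng.standard_normal(prev.shape)
        xs.append(np.sqrt(1.0 - betas[t]) * prev + np.sqrt(betas[t]) * noise)
    return xs

rng = np.random.default_rng(0)
x0 = np.ones(1000)                      # toy "clean" data
xs = forward_diffusion(x0, rng)
# After enough steps, X_T is close to N(0, I): mean near 0, variance near 1.
print(np.mean(xs[-1]), np.var(xs[-1]))
```

The reverse (denoising) direction would run the chain in the opposite order, replacing the unknown reverse transition with the learned mean μ_θ(X_t, t).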
  • the diffusion model is a neural network model, such as a fully connected neural network, a convolutional neural network, a residual neural network, etc. This embodiment does not limit the specific model structure of the diffusion model.
• as shown in Figure 2, a fully connected neural network, also called a multilayer perceptron (MLP), consists of an input layer (left), an output layer (right), and multiple hidden layers (middle).
  • the data corresponding to the input layer in Figure 2 can be called input data, and the input data can include data required for training or data required for inference, such as the data X t in Figure 1 above.
  • Multiple hidden layers in Figure 2 are deployed with corresponding model parameters for processing input data based on these model parameters.
  • the data corresponding to the output layer in Figure 2 can be called output data, which is the data obtained after multiple hidden layers process the input data, such as the data X 0 in Figure 1 above.
• each layer of the above MLP contains several nodes, called neurons, and neurons in two adjacent layers are connected in pairs.
• the output h of a neuron in the next layer is the weighted sum of the outputs x of all neurons in the previous layer connected to it, passed through the activation function, which can be expressed as the following formula 2: h = f(wx + b), where w is the weight matrix, b is the bias vector, and f is the activation function. For an n-layer neural network, the overall output is obtained by applying formula 2 layer by layer, where n is the number of layers of the neural network and the value of n is an integer greater than 1.
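Formula 2 can be sketched as a forward pass through fully connected layers; the layer sizes, the ReLU activation, and the random weights below are illustrative assumptions, not values prescribed by this application:

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

def dense_forward(x, w, b, f=relu):
    """Formula 2: h = f(w @ x + b) -- each next-layer neuron takes the
    weighted sum of all connected previous-layer outputs, plus a bias,
    passed through the activation function f."""
    return f(w @ x + b)

rng = np.random.default_rng(0)
# A small n = 3 layer network: 4 inputs -> 5 hidden -> 2 outputs.
w1, b1 = rng.standard_normal((5, 4)), np.zeros(5)
w2, b2 = rng.standard_normal((2, 5)), np.zeros(2)

x = rng.standard_normal(4)
h = dense_forward(x, w1, b1)                  # hidden layer output
y = dense_forward(h, w2, b2, f=lambda z: z)   # linear output layer
print(y.shape)
```

Stacking the same transformation layer by layer is exactly the nested application of formula 2 described above.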
  • a neural network can be understood as a mapping relationship from an input data set to an output data set.
• the parameters of a neural network are initialized randomly, and the process of using existing data to obtain this mapping relationship starting from the random w and b is called neural network training.
  • the specific method of training is to use a loss function to evaluate the output results of the neural network.
  • Figure 3 is a schematic diagram of a neural network training process provided by an embodiment of the present application.
  • the error can be back-propagated, and the neural network parameters (including w and b) can be iteratively optimized through the gradient descent method until the loss function reaches the minimum value, which is the "optimal point" in Figure 3.
  • the neural network parameters corresponding to the "optimal point" in Figure 3 can be used as neural network parameters in the trained AI model information.
• the gradient descent process can be expressed as the following formula 4: θ ← θ − η · ∂L/∂θ, where θ is the parameter to be optimized (including w and b), L is the loss function, and η is the learning rate, which controls the step size of gradient descent; ∂L/∂θ denotes the partial derivative of the loss function L with respect to the parameter to be optimized.
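Formula 4 can be illustrated with a one-parameter toy loss; the quadratic loss, learning rate, and step count below are arbitrary choices for demonstration:

```python
# Formula 4: theta <- theta - eta * dL/dtheta, iterated until the loss
# stops decreasing (the "optimal point" in Figure 3).
def grad_descent(grad, theta0, eta, steps):
    theta = theta0
    for _ in range(steps):
        theta = theta - eta * grad(theta)
    return theta

# Toy loss L(theta) = (theta - 3)^2, whose gradient is 2 * (theta - 3);
# the minimum is at theta = 3.
theta = grad_descent(grad=lambda t: 2.0 * (t - 3.0), theta0=0.0,
                     eta=0.1, steps=100)
print(theta)
```

With eta = 0.1 the error shrinks by a factor of 0.8 per step, so theta converges to the minimum at 3.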
  • FIG. 4 is a schematic diagram of a process of backpropagation performed by a neural network according to an embodiment of the present application.
• the gradient of the previous layer's parameters can be calculated recursively from the gradient of the subsequent layer's parameters, which can be expressed as the following formula 5: ∂L/∂S_j = f′(S_j) · Σ_i W_ij · ∂L/∂S_i, where W_ij is the weight of the connection from node j to node i, and S_i is the weighted sum of the inputs on node i.
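The backpropagation recursion of formula 5 can be checked on a tiny two-layer network by comparing the chain-rule gradient against a finite-difference estimate; the network sizes, tanh activation, and squared-error loss are illustrative assumptions:

```python
import numpy as np

def forward(w1, w2, x):
    s1 = w1 @ x            # weighted sums S_i of the hidden layer
    h = np.tanh(s1)        # hidden activations
    y = w2 @ h             # linear output
    return s1, h, y

def backprop(w1, w2, x, target):
    """Gradient of L = 0.5 * (y - target)^2 w.r.t. w1, computed by
    propagating dL/dS from the output layer back to the hidden layer."""
    s1, h, y = forward(w1, w2, x)
    dL_dy = y - target                                        # output-layer error
    dL_ds1 = (w2 * dL_dy).ravel() * (1.0 - np.tanh(s1) ** 2)  # formula 5 recursion
    return np.outer(dL_ds1, x)                                # dL/dW_ij = dL/dS_i * x_j

rng = np.random.default_rng(0)
w1 = rng.standard_normal((3, 2))
w2 = rng.standard_normal((1, 3))
x = rng.standard_normal(2)

g = backprop(w1, w2, x, target=0.5)

# Finite-difference check of one entry of the gradient.
eps = 1e-6
w1p = w1.copy(); w1p[0, 0] += eps
loss = lambda w: 0.5 * (forward(w, w2, x)[2].item() - 0.5) ** 2
num = (loss(w1p) - loss(w1)) / eps
print(abs(num - g[0, 0]))
```

The analytic and numerical gradients agree to within the finite-difference error, which is the practical meaning of the recursion: later-layer gradients are reused to compute earlier-layer gradients.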
  • the wireless communication system includes a network device 501 and a terminal device 502.
  • the terminal device 502 may include one or more terminal devices, such as a smart bracelet, a smart phone, a smart TV, and a laptop shown in FIG. 5 .
  • the network device 501 establishes a wireless connection with each terminal device in the terminal device 502, and a wireless connection may also be established between the terminal devices in the terminal device 502.
• the network device 501 can send downlink data to the terminal device 502, such as a model that needs to be trained; each terminal device in the terminal device 502 can send uplink data to the network device 501, such as a trained model.
  • each terminal device in the terminal device 502 can also send data to each other, such as data required in the model training process or data required in the model inference process.
• the wireless communication systems mentioned in the embodiments of this application include but are not limited to: the fifth generation mobile communication technology (5G) communication system, 6G communication systems, satellite communication systems, short-distance communication systems, Narrow Band Internet of Things (NB-IoT), Global System for Mobile Communications (GSM), Enhanced Data rates for GSM Evolution (EDGE), Wideband Code Division Multiple Access (WCDMA), Code Division Multiple Access 2000 (CDMA2000), Time Division-Synchronous Code Division Multiple Access (TD-SCDMA), Long Term Evolution (LTE), and other communication systems.
  • FIG. 6 is a schematic architectural diagram of a smart home communication system provided by an embodiment of the present application.
  • various smart home products are connected through wireless networks to enable smart home products to transmit data to each other.
• smart home products such as smart TVs, smart air purifiers, smart water dispensers, smart speakers, and sweeping robots are taken as examples; these smart home products are all connected to the same wireless network through a wireless router, thereby realizing data interaction between the various smart home products.
• other types of smart home products may also be included in practical applications, such as smart refrigerators, smart range hoods, and smart curtains; this embodiment does not limit the types of smart home products.
  • smart home products can also be connected wirelessly directly without having to access the same wireless network through a wireless router.
  • various smart home products are connected wirelessly through Bluetooth.
  • the method provided by the embodiment of the present application can also be applied to other communication system scenarios.
• for example, scenarios involving different devices such as smart robots, lathes, and transport vehicles.
  • the embodiments of this application do not limit the specific scenarios in which the data processing method is applied.
  • FIG. 7 is a schematic flowchart of a model training method provided by an embodiment of the present application. As shown in Figure 7, the model training method includes the following steps 701-708.
  • Step 701 The central device receives multiple capability information.
  • the central device is used to obtain the capability information of each distributed device, and allocate training tasks to each distributed device based on the capability information of each distributed device.
  • the multiple capability information received by the central device comes from multiple different distributed devices (for example, distributed device 1 to distributed device N shown in Figure 7).
  • each of the plurality of capability information is used to indicate capabilities related to model training on the distributed device. Simply put, each distributed device collects capabilities related to model training on its own device, and feeds back its own capabilities related to model training to the central device by sending capability information to the central device.
  • the central device is, for example, the terminal device or network device introduced above; or, the central device is a device used to implement the functions of the above-mentioned terminal device or network device, for example, the central device is a chip or chip in the terminal device or network device. system.
  • the distributed device is, for example, the above-mentioned terminal device, or a device used to implement the functions of the above-mentioned terminal device.
  • the central device may be the above-mentioned base station, and the distributed device may be a terminal device such as a smart bracelet, a smart watch, a smart TV, a smart phone, or a laptop computer.
  • the capabilities related to model training on the distributed device may include one or more of the computing capabilities, storage capabilities, and communication capabilities of the distributed device.
  • computing power can be measured by the number of operations that a distributed device can perform per second
  • storage capacity can be measured by the amount of storage space allocated to model training on a distributed device
  • communication capability can be measured by the data transfer rate achievable by the distributed device.
  • the capabilities related to model training on the distributed device may also include other capabilities that can affect model training, which are not specifically limited in this embodiment.
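The capability report described above can be pictured as a simple structured message. The sketch below is illustrative only; the field names and units are assumptions, not part of this application:

```python
from dataclasses import dataclass

@dataclass
class CapabilityInfo:
    """Capability information a distributed device reports to the central device."""
    device_id: int
    compute_ops_per_sec: float   # computing capability: operations per second
    storage_bytes: int           # storage capability: space allocated to model training
    link_rate_bps: float         # communication capability: data transfer rate

# each device collects its own capabilities and sends them to the central device
reports = [
    CapabilityInfo(1, 2e12, 4 * 2**30, 100e6),  # e.g. a smartphone
    CapabilityInfo(2, 5e11, 1 * 2**30, 20e6),   # e.g. a smart bracelet
]
```

The central device can then rank or weight devices by these fields when allocating training tasks.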
  • Step 702 The central device sends corresponding training configuration information to multiple different distributed devices based on multiple capability information.
  • After receiving the plurality of capability information, the central device can obtain the capability of each distributed device participating in model training to perform model training. Based on the ability of each distributed device to perform model training, the central device can determine the training configuration information of each distributed device for the entire model training phase. Here, the training configuration information refers to the specific training tasks that the distributed device needs to perform during the model training phase.
  • the process of denoising data through the diffusion model actually uses the same machine learning model to continuously process the data, thereby gradually reducing the noise in the data and obtaining high-quality data.
  • generally, the more times the diffusion model performs denoising on the data, the higher the quality of the resulting data.
  • the central device can split the Markov chain into multiple sub-chains, and configure the split sub-chains into different distributed devices according to the capabilities of each distributed device. After the central device configures the split sub-chains to different distributed devices, the different distributed devices are used to perform different training tasks. For example, distributed device 1 executes training task 1, distributed device 2 executes training task 2, and distributed device 3 executes training task 3. Alternatively, some of the distributed devices may be used to perform the same training task. For example, distributed device 1 executes training task 1, distributed device 2 also executes training task 1, and distributed device 3 executes training task 2.
  • the training configuration information sent by the central device is used to indicate the number of times that the machine learning model trained on the distributed device processes the input data. Moreover, since each distributed device is responsible for different aspects of the training phase, the quality requirements for the input data of the machine learning model trained on each distributed device are also different. Therefore, training configuration information is also used to indicate input data requirements for machine learning models trained on distributed devices.
  • for example, the training task in the entire training phase is: perform T processing steps, through the machine learning model, on the data X T (input data) that has been noise-added T times, to obtain the data X 0 . This overall task can be split into, for example, three different sub-training tasks, and these three sub-training tasks are deployed in different distributed devices.
  • for example, the first sub-training task can be to use the machine learning model to process the data X T , which has been noise-processed T times, n times to obtain the data X T-n ; the remaining sub-training tasks process the resulting data step by step in the same way until the data X 0 is obtained.
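As an illustrative sketch of how a T-step denoising chain might be split into contiguous sub-chains in proportion to device capability (the function name, the capability scores, and the proportional allocation rule are assumptions for illustration, not the embodiment's actual allocation method):

```python
def split_chain(total_steps, capabilities):
    """Split a T-step denoising Markov chain into contiguous sub-chains,
    sized roughly in proportion to each device's capability score."""
    total_cap = sum(capabilities)
    # provisional allocation, at least 1 step per device
    steps = [max(1, round(total_steps * c / total_cap)) for c in capabilities]
    # fix rounding drift so the step counts sum exactly to total_steps
    steps[-1] += total_steps - sum(steps)
    # convert step counts into (start, end) index ranges on the chain X_T .. X_0
    tasks, hi = [], total_steps
    for n in steps:
        tasks.append((hi, hi - n))  # this device denoises X_hi down to X_{hi-n}
        hi -= n
    return tasks

# e.g. T = 1000 split across three devices with capability scores 5 : 3 : 2
print(split_chain(1000, [5, 3, 2]))  # [(1000, 500), (500, 200), (200, 0)]
```

Each tuple is one sub-training task: the device's model takes data noised `start` times as input and produces data noised `end` times.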
  • Step 703 The aggregation device sends the machine learning model and target data samples to multiple distributed devices.
  • the aggregation device is used to aggregate models trained by each distributed device, and feed back the aggregated model to each distributed device.
  • the aggregation device and the aforementioned central device may be the same device or may be different devices, which is not specifically limited in this embodiment.
  • the aggregation device is, for example, the terminal device or network device introduced above; or, the aggregation device is a device used to implement the functions of the above-mentioned terminal device or network device, for example, the aggregation device is a chip or chip system in the terminal device or network device.
  • the central device may be the above-mentioned base station
  • the aggregation device may be the above-mentioned base station or server
  • the distributed device may be a terminal device such as a smart bracelet, smart watch, smart TV, smart phone or laptop computer.
  • the machine learning model sent by the aggregation device to multiple distributed devices is the same, so that multiple distributed devices perform model training on the same machine learning model.
  • the parameters in the machine learning model sent by the aggregation device may be initial parameters obtained after random initialization.
  • the machine learning model sent by the aggregation device may be a diffusion model, and the machine learning model is used to perform denoising processing on the data.
  • the target data samples sent by the aggregation device to multiple distributed devices may also be the same, so that different distributed devices generate corresponding training sample sets according to the training configuration information.
  • the target data sample sent by the aggregation device to the multiple distributed devices may be data with higher quality.
  • the target data sample (for example, X 0 in FIG. 1 ) may be an image without noise.
  • the process of the distributed device training the machine learning model is: the distributed device first obtains lower-quality training samples by adding noise to the target data samples; then, the distributed device inputs the training samples into the machine learning model, performs denoising on the training samples through the machine learning model, and trains the machine learning model based on the output results of the machine learning model.
  • the target data sample sent by the aggregation device may be determined by the type of machine learning model that needs to be trained. This embodiment does not specifically limit the specific type of the target data sample.
  • when the machine learning model that needs to be trained is the machine learning model of a module of the transceiver in the communication system, such as a transmitter machine learning model, a receiver machine learning model, a channel estimation machine learning model, a channel compression feedback machine learning model, a precoding machine learning model, a beam management machine learning model or a positioning machine learning model, the data sample can be channel data.
  • when the machine learning model that needs to be trained is an image processing model, such as an image classification model, an image enhancement model, an image compression model, or an image detection model, the data sample can be image data.
  • the data sample can be speech data.
  • the execution order of steps 702 and 703 is not specifically limited in this embodiment. Step 703 may be executed before step 702, or step 703 may be executed simultaneously with step 702.
  • Step 704 The distributed device generates a training sample set based on the target data sample.
  • since the training configuration information sent by the central device to the distributed device indicates the number of times the machine learning model trained on the distributed device processes the input data (such as X T in Figure 1) and the requirements for the input data, the distributed device can generate a training sample set based on the target data sample (for example, X 0 in Figure 1) for subsequent training of the machine learning model. Each training sample in the training sample set generated by the distributed device meets the requirements for the input data indicated in the training configuration information.
  • the training sample set generated by the distributed device based on the target data sample may include first data and second data, where the first data is the input data of the machine learning model during the training process, and the second data is the training label of the first data. The training label is used to generate a loss function in combination with the output result of the machine learning model, so as to update the machine learning model based on the loss function and complete the training of the machine learning model.
  • the first data is input into the machine learning model to obtain the output result of the machine learning model; then, by calculating the output result of the machine learning model and The difference between the second data (that is, the training labels of the input data) is used to construct a loss function; finally, the parameters in the machine learning model are updated based on the value of the loss function, thereby completing a round of iterative training of the machine learning model.
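One round of iterative training as described above (forward pass, loss between the output and the training label, parameter update) can be sketched with a toy one-parameter model; the scalar-gain "model", the synthetic label, and the learning rate are stand-ins for illustration, not the embodiment's actual network:

```python
import numpy as np

rng = np.random.default_rng(0)

# toy "machine learning model": a single scalar gain applied to the data
w = np.array(0.5)

def model(x, w):
    return w * x

first_data = rng.normal(size=8)   # input data (the noisier data)
second_data = 0.9 * first_data    # training label (the less-noisy data), toy target

# one round of iterative training: forward pass, MSE loss, gradient step
out = model(first_data, w)
loss = np.mean((out - second_data) ** 2)
grad = np.mean(2 * (out - second_data) * first_data)  # dL/dw for the MSE loss
w = w - 0.1 * grad                                    # update the parameter
```

After the update, evaluating the same loss with the new parameter should yield a smaller value, which is exactly the per-round improvement the text describes.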
  • specifically, the distributed device processes the target data sample (for example, by adding noise) according to the input data requirements indicated by the training configuration information and the number of times the machine learning model processes the input data, to obtain the second data. Then, the distributed device processes the second data (for example, by adding noise) according to the number of times the machine learning model processes the input data indicated by the training configuration information, to obtain the first data.
  • the input data requirements are used to indicate what kind of data the input data is.
  • the input data is data obtained by performing noise processing on the target data sample for a specified number of times.
  • to obtain the second data, the number of times the distributed device performs noise-adding processing on the target data sample is the difference between the number of times indicated in the requirement for the input data and the number of times the machine learning model processes the input data.
  • the distributed device processes the second data according to the number of times the machine learning model processes the input data indicated by the training configuration information to obtain the first data.
  • for example, assume the input data requirement indicates the input data is the target data sample noised M times, and the machine learning model processes the input data N times. The distributed device can then perform noise-adding processing on the target data sample X 0 from M−N times up to M times to obtain the data {X M−N , ..., X M }. That is, the first data (input data) is the data X M obtained by performing M noise-adding steps on the target data sample X 0 , and the second data, i.e., the training label, is the data X M−N obtained by performing M−N noise-adding steps on the target data sample X 0 . In a round of iterative training of the machine learning model (such as training of the denoising processing), the training label is used as the object for comparison with the output result obtained by inputting the first data into the machine learning model, so as to determine the loss function and update the parameters in the machine learning model, thereby completing a round of iterative training of the machine learning model.
  • in other words, based on the input data requirements indicated by the training configuration information and the number of times the machine learning model processes the input data, the distributed device first performs M−N noise-adding steps on the target data sample to obtain the second data, and then performs N further noise-adding steps on the second data to obtain the first data.
  • for example, assume the training configuration information indicates that the number of times the machine learning model trained on the distributed device processes the input data is 5, and the input data requirements indicate that the input data is the data obtained after adding noise to the target data sample 15 times.
  • the distributed device can first perform 10 to 15 times of noise processing on the target data sample X 0 obtained from the aggregation device to obtain data ⁇ X 10 , X 11 , X 12 , X 13 , X 14 , X 15 ⁇ .
  • the data X 10 refers to the data obtained by performing the noise-adding process 10 times on the target data sample X 0 , i.e., the second data (the training label), and the data X 15 , i.e., the first data (the input data), may be the data obtained by performing the noise-adding process on X 10 five more times. Then, during the training process, the data X 15 is input into the machine learning model, and the model's output is compared against the data X 10 to construct the loss function. Optionally, the distributed device may perform noise processing on the data samples according to the conditional probability distribution q(X t |X 0 ) = N(X t ; √ᾱ t ·X 0 , (1−ᾱ t )·I) (the standard diffusion forward process), which allows the input data to be sampled directly.
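The direct sampling mentioned above can be sketched with the standard diffusion forward process, q(X t |X 0 ) = N(√ᾱ t ·X 0 , (1−ᾱ t )·I); the noise schedule values below are assumptions for illustration only:

```python
import numpy as np

rng = np.random.default_rng(0)
T = 15
betas = np.linspace(1e-4, 0.02, T)      # per-step noise schedule (assumed values)
alpha_bar = np.cumprod(1.0 - betas)     # cumulative products ᾱ_t

def add_noise(x0, t):
    """Sample X_t directly from q(X_t | X_0) = N(√ᾱ_t·X_0, (1-ᾱ_t)·I),
    instead of applying t single noise-adding steps one by one."""
    eps = rng.normal(size=x0.shape)
    return np.sqrt(alpha_bar[t - 1]) * x0 + np.sqrt(1.0 - alpha_bar[t - 1]) * eps

x0 = np.ones(4)          # stand-in for the target data sample X_0
x10 = add_noise(x0, 10)  # second data (training label), noised 10 times
x15 = add_noise(x0, 15)  # first data (model input), noised 15 times
```

Because ᾱ_t is decreasing in t, X 15 carries more noise than X 10, matching the example's quality ordering.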
  • Step 705 The distributed device trains the machine learning model based on the training sample set to obtain the trained machine learning model.
  • after generating the training sample set, the distributed device inputs the first data in the training sample set into the machine learning model as input data to obtain the output data of the machine learning model; then, the distributed device calculates the loss function based on the training label of the input data (i.e., the second data) and the output data, and updates the machine learning model based on the value of the loss function to obtain the trained machine learning model.
  • the specific process of updating the machine learning model based on the loss function can be referred to the above introduction, and will not be repeated here.
  • Step 706 The distributed device sends the trained machine learning model to the aggregation device.
  • after each distributed device trains the machine learning model based on the training samples generated on the device and obtains the trained machine learning model, each distributed device sends the trained machine learning model to the aggregation device so that the aggregation device can aggregate the machine learning models.
  • Step 707 The aggregation device aggregates the trained machine learning models sent by each distributed device to obtain an aggregation model.
  • since the machine learning models trained by the distributed devices share the same structure, the trained machine learning models sent back by the distributed devices are models with the same structure but different parameters.
  • for example, the aggregation device can perform a weighted summation of the parameters in the multiple trained machine learning models to obtain new parameters, where the new parameters are the parameters of the aggregation model. That is to say, after the aggregation device aggregates the trained machine learning models sent by each distributed device, the structure of the resulting aggregation model does not change, but the parameters in the aggregation model do change; these parameters are obtained based on the trained machine learning models sent by each distributed device.
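The weighted summation of parameters described above can be sketched as follows; representing each model as a name-to-array dict (in the style of federated averaging) is an assumption for illustration:

```python
import numpy as np

def aggregate(models, weights):
    """Weighted summation of same-structure model parameters.
    Each model is a dict mapping parameter name -> ndarray."""
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()  # normalize the aggregation weights
    return {k: sum(wi * m[k] for wi, m in zip(w, models)) for k in models[0]}

# two trained models with the same structure but different parameters
m1 = {"layer.weight": np.array([1.0, 2.0])}
m2 = {"layer.weight": np.array([3.0, 4.0])}
agg = aggregate([m1, m2], weights=[1, 1])
print(agg["layer.weight"])  # [2. 3.]
```

The aggregation model keeps the same parameter names (structure) while its values become the weighted combination of the devices' parameters.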
  • the aggregation device can also implement model aggregation in other ways, which is not specifically limited in this embodiment.
  • Step 708 The aggregation device sends the aggregation model to each distributed device.
  • after realizing the aggregation of the models, the aggregation device sends the aggregation model to each distributed device so that each distributed device can continue to train the aggregation model.
  • each distributed device may require multiple rounds of iterative training of machine learning models. Therefore, after each distributed device receives the aggregation model sent by the aggregation device, each distributed device and the aggregation device perform the above steps 704-708 in a loop until the condition for terminating the machine learning model training is reached.
  • the condition for terminating the training of the machine learning model may be that the number of rounds of iterative training of the machine learning model by the distributed device reaches a preset number of rounds; or, the performance of the machine learning model trained by the distributed device reaches the preset requirement.
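The loop over steps 704-708 with both termination conditions (a preset number of rounds reached, or the performance reaching a preset requirement) can be sketched as below; every helper callable here is a toy stand-in, not the embodiment's actual function:

```python
# minimal sketch of the iterative loop over steps 704-708
def run_rounds(model, devices, max_rounds, target_metric, evaluate, local_train, aggregate):
    for rnd in range(max_rounds):
        trained = [local_train(model, dev) for dev in devices]  # steps 704-706
        model = aggregate(trained)                              # step 707
        # termination: preset round count exhausted, or performance meets requirement
        if evaluate(model) >= target_metric:
            break
    return model                                                # step 708's final model

# toy stand-ins: the "model" is a number that each round nudges toward 1.0
final = run_rounds(
    model=0.0,
    devices=[1, 2, 3],
    max_rounds=50,
    target_metric=0.95,
    evaluate=lambda m: m,
    local_train=lambda m, d: m + 0.1,         # each device improves the model a bit
    aggregate=lambda ms: sum(ms) / len(ms),   # the aggregation of step 707
)
```

The loop exits early once the performance check passes, which corresponds to the second termination condition in the text.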
  • model training method provided by the embodiments of the present application will be introduced in detail below with reference to specific examples.
  • the process of the model training method is introduced in detail.
  • the process of model training method includes the following four stages.
  • the base station sends the target data sample X 0 and the machine learning model to multiple terminal devices participating in model training.
  • the machine learning model sent by the base station to multiple terminal devices may be a machine learning model with randomly initialized parameters.
  • the machine learning model sent by the base station to multiple terminal devices is an aggregate model obtained by aggregating multiple machine learning models obtained in the previous round of iterative training.
  • stage 2 multiple terminal devices generate training samples based on the target data sample X 0 and train the machine learning model based on the training samples.
  • since terminal devices such as smartphones, smart watches, smart bracelets, and laptops are assigned different training content, each terminal device generates training samples matching its training content based on the target data samples, and trains the machine learning model sent by the base station based on these training samples.
  • for example, the input data requirements of the machine learning model trained in the smartphone indicate that the input data is the data obtained by adding noise to the target data sample X 0 T times; and the machine learning model in the smartphone needs to process the input data three times to obtain the output data.
  • therefore, the smartphone performs T-3 to T noise-adding processes on the target data sample X 0 to obtain the data {X T-3 , X T-2 , X T-1 , X T }; based on the data {X T-3 , X T-2 , X T-1 , X T }, the smartphone can construct a training sample (X T , X T-3 ), where the data X T is the input data and the data X T-3 is the training label.
  • during training, the smartphone reuses the machine learning model to be trained and obtains a total model consisting of three identical machine learning models to be trained, connected in sequence, where the input of each machine learning model in the total model is the output of the previous one. Then, the smartphone inputs the input data from the training sample into the total model to obtain the output data of the total model, so as to construct a loss function based on the output data and the training label in the training sample and update the machine learning model.
  • the parameters of the machine learning model are updated based on the calculated value of the loss function.
  • each machine learning model in the total model will actually have corresponding output data.
  • when training the machine learning model, in addition to constructing a loss function based on the output data of the total model (that is, the output data of the third machine learning model in the total model), the loss function can also be constructed jointly based on the output data of the other machine learning models in the total model.
  • for example, during training, the first machine learning model in the total model outputs the data X T-1 ', the second machine learning model outputs the data X T-2 ', and the third machine learning model outputs the data X T-3 '. A loss function 1 can be built based on the training label X T-1 and the output data X T-1 ', a loss function 2 based on the training label X T-2 and the output data X T-2 ', and a loss function 3 based on the training label X T-3 and the output data X T-3 '. The total loss function is then calculated based on loss function 1, loss function 2 and loss function 3, and the machine learning model is updated based on the value of the total loss function, thereby obtaining the trained machine learning model.
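The total model with per-step losses described above can be sketched with a shared-parameter toy model; the scalar-gain step and the stand-in intermediate labels are assumptions for illustration, not the embodiment's actual network or data:

```python
import numpy as np

def chained_forward(x, w, steps=3):
    """Apply the SAME (shared-parameter) model `steps` times in sequence,
    keeping every intermediate output, as in the 'total model'."""
    outs = []
    for _ in range(steps):
        x = w * x          # toy denoising step: a scalar-gain model
        outs.append(x)
    return outs

x_T = np.array([4.0, -2.0])                    # input noised T times
# stand-ins for the intermediate training labels X_{T-1}, X_{T-2}, X_{T-3}
labels = [0.9 * x_T, 0.81 * x_T, 0.729 * x_T]

outs = chained_forward(x_T, w=0.5)
# loss 1..3: one term per intermediate output, summed into the total loss
total_loss = sum(np.mean((o - y) ** 2) for o, y in zip(outs, labels))
```

Supervising every intermediate output gives the shared model a training signal at each of its three chained applications, not just at the final one.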
  • for another example, the input data of the machine learning model trained in the smart watch is X T-3 , that is, the data obtained after adding noise to the target data sample X 0 T-3 times; and the machine learning model in the smart watch needs to process the input data twice to obtain the output data.
  • therefore, the smart watch performs T-5 to T-3 noise-adding processes on the target data sample X 0 to obtain the data {X T-5 , X T-4 , X T-3 }; based on the data {X T-5 , X T-4 , X T-3 }, the smart watch can construct the training samples (X T-5 , X T-4 ), (X T-4 , X T-3 ) and (X T-5 , X T-3 ).
  • the smart watch trains the machine learning model based on the generated training data and can also obtain the trained machine learning model.
  • similarly, the input data of the machine learning model trained in the smart bracelet is X 4 , which is the data obtained after adding noise to the target data sample X 0 four times; and the machine learning model in the smart bracelet needs to process the input data once to obtain the output data.
  • the smart bracelet performs 3 to 4 times of noise processing on the data sample X 0 to obtain data ⁇ X 3 , X 4 ⁇ , thereby constructing the training sample ⁇ X 3 , X 4 ⁇ .
  • the smart bracelet trains the machine learning model based on the generated training samples ⁇ X 3 , X 4 ⁇ to obtain the trained machine model.
  • the input data of the machine learning model trained in the laptop is X 3 , that is, the data obtained after adding noise to the target data sample X 0 three times; and, the machine learning model in the laptop needs Perform noise reduction processing on the input data three times to obtain the output data.
  • therefore, the laptop computer performs 0 to 3 noise-adding processes on the data sample X 0 to obtain the data {X 0 , X 3 }.
  • the smart bracelet trains the machine learning model based on the generated training samples ⁇ X 0 , X 3 ⁇ to obtain the trained machine model.
  • other terminal devices can also be responsible for training the machine learning model based on the data between data X T-5 and data X 4 ; these are not shown one by one in Figure 8.
  • the terminal devices can send the generated training data to each other, thereby preventing each terminal device from independently repeatedly generating the same training data and improving the efficiency of training data generation.
  • for example, the smartphone needs to generate the data {X T-3 , X T-2 , X T-1 , X T }, and the smart watch also needs to generate the data X T-3 , so the smartphone can send the generated data X T-3 to the smart watch, thereby eliminating the smart watch's process of generating the data X T-3 .
  • stage 3 multiple terminal devices send trained machine learning models to the base station respectively.
  • since each terminal device is responsible for different training content and the training data used to train the machine learning model is also different, the machine learning models trained by the terminal devices are often different. After each terminal device completes one or more rounds of iterative training of the model and obtains the trained machine learning model, the terminal device sends the trained machine learning model to the base station so that the base station can aggregate the trained machine learning models.
  • the base station aggregates multiple trained machine learning models to obtain an aggregate model.
  • after the base station obtains the aggregate model by aggregating the trained machine learning models sent by each terminal device, the base station can continue to send the aggregate model to each terminal device, so that each terminal device continues to iteratively train the aggregate model. Finally, after the machine learning model trained by the terminal device reaches the model training termination condition, the terminal device no longer trains the aggregation model sent by the base station, but uses the last aggregation model sent by the base station as the model for model inference; that is, the aggregation model is used to perform subsequent data processing tasks.
  • the multiple distributed devices that jointly process data through the machine learning model introduced in this embodiment may be multiple distributed devices that jointly train the machine learning model; that is, the multiple distributed devices first obtain the machine learning model through joint training, and then jointly process the data through the same machine learning model.
  • multiple distributed devices may jointly process data through a preset machine learning model, that is, multiple distributed devices do not perform the process of jointly training the machine learning model.
  • the joint training process of the machine learning model and the joint data processing process may be integrated or independent, and this embodiment does not specifically limit this.
  • FIG. 9A is a schematic flowchart of a data processing method 900 provided by an embodiment of the present application. As shown in Figure 9A, the data processing method 900 includes the following steps 901-905.
  • Step 901 The distributed device 1 determines data processing requirements.
  • the data processing requirement in the distributed device 1 is to process raw data with noise to obtain the target data required by the distributed device 1 .
  • the quality of the target data is higher than the quality of the original data, that is, the noise in the target data is smaller than the noise in the original data.
  • the target data expected by the distributed device 1 may often be data used to perform other model training tasks. Therefore, the distributed device 1 can determine the data processing requirements according to the requirements of other model training tasks for the required data, and the data processing requirements are used to indicate the degree of processing of the original data.
  • for example, the distributed device 1 expects to obtain higher-quality image data so that it can subsequently train an image classification model based on the higher-quality image data; therefore, the distributed device 1 can determine the extent to which the raw image data is processed according to the input data requirements of the image classification model.
  • the target data expected by the distributed device 1 may be data used to train models such as a transmitter machine learning model, a receiver machine learning model, or a channel estimation machine learning model.
  • the target data expected by the distributed device 1 may be data used to train models such as image classification models, image enhancement models, image compression models, or image detection models. Therefore, the distributed device 1 can determine the quality requirements of the target data based on the accuracy requirements of the model used to perform training using the target data, and further determine the data processing requirements based on the quality gap between the target data and the original data.
  • the data processing requirement may be the number of times the machine learning model is used to process the original data.
  • for example, when the quality requirement for the required data is high, the distributed device 1 can determine the data processing requirement to be processing the original data 10,000 times in sequence; for another example, when the quality requirement for the required data is not high, the distributed device 1 can determine the data processing requirement to be processing the original data step by step 1,000 times.
  • Step 902 The distributed device 1 processes the original data through the machine learning model to obtain the first data.
  • after determining the data processing requirements, the distributed device 1 processes the original data through the machine learning model based on the data processing capability of the device to obtain the first data. Here, the data processing capability of the distributed device 1 cannot meet the data processing needs of the distributed device 1, so the first data obtained by the distributed device 1 is not yet the target data expected by the distributed device 1.
  • the data processing capability of the distributed device may be related to the processing resources and storage resources on the distributed device, which is not specifically limited in this embodiment.
  • the distributed device 1 gradually processes the original data 200 times through the machine learning model to obtain the first data.
  • the first data needs to be processed 800 times to obtain data that meets the data processing requirements of the distributed device 1 .
  • the process of the distributed device 1 gradually processing the original data 200 times through the machine learning model may refer to the following: the distributed device 1 reuses the machine learning model to obtain a total model generated by sequentially connecting 200 copies of the machine learning model; then, the distributed device 1 inputs the original data into the total model, and the 200 machine learning models in the total model process the data in sequence to obtain the first data.
  • the input of any machine learning model in the total model is the output of the previous machine learning model.
  • the process of the distributed device 1 gradually processing the original data 200 times through the machine learning model can also refer to the following: the distributed device 1 processes the original data once through the machine learning model to obtain the once-processed data; then, the distributed device 1 inputs the once-processed data into the machine learning model to obtain the twice-processed data; next, the distributed device 1 continues to input the twice-processed data into the machine learning model to obtain the three-times-processed data. This cycle continues until the distributed device 1 has processed the data 200 times through the machine learning model and obtained the first data. That is to say, the distributed device 1 uses the data output by the machine learning model in each processing round as the input of the machine learning model in the next round, thereby realizing multiple rounds of processing based on the same machine learning model.
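The feed-the-output-back-in procedure described above can be sketched as follows; the single "denoising" step here is a toy stand-in for the diffusion model, not the embodiment's actual model:

```python
import numpy as np

def step(x, w=0.8):
    """One pass of the machine learning model (toy stand-in: shrink toward 0,
    standing in for one denoising step of the diffusion model)."""
    return w * x

def process_n_times(x, n):
    """Feed each output back in as the next input, n times, using the SAME model."""
    for _ in range(n):
        x = step(x)
    return x

raw = np.array([10.0, -10.0])                    # "original data" carrying noise
first_data = process_n_times(raw, 200)           # what distributed device 1 produces
second_data = process_n_times(first_data, 800)   # device 2 continues for 800 steps
```

Because the same `step` is reused, only one copy of the model needs to be held in memory, which is the point of this second variant compared with chaining 200 copies into a total model.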
  • the machine learning model used to process the original data on the distributed device 1 is, for example, the above-mentioned diffusion model.
  • Step 903 Distributed device 1 sends the first data to distributed device 2.
  • since the data processing capability of the distributed device 1 cannot support the distributed device 1 in completing the data processing, that is, the first data processed by the distributed device 1 cannot meet the data processing requirements of the distributed device 1, the distributed device 1 sends the first data to the distributed device 2 to request the distributed device 2 to assist the distributed device 1 in continuing to process the first data.
  • the distributed device 1 can also send the first information to the distributed device 2.
  • the first information is used to request the distributed device 2 to process the first data.
  • the first information may also indicate the number of times the first data needs to be processed, that is, the number of times the distributed device 2 processes the first data. For example, assume that the data processing requirement of the distributed device 1 is to gradually process the original data 1,000 times through the machine learning model, and the distributed device 1 only gradually processes the original data 200 times. Therefore, the distributed device 1 can indicate in the first information that the number of times the first data needs to be processed is 800.
  • the sending of the first data and the sending of the first information can be performed separately.
  • the distributed device 1 may also negotiate with the distributed device 2 in advance so that the distributed device 2 can determine the number of times the data received from the distributed device 1 needs to be processed.
  • the distributed device 1 only needs to send the first data to the distributed device 2, and does not need to send the above-mentioned first information to the distributed device 2.
  • the distributed device 1 negotiates with the distributed device 2 in advance so that the distributed device 2 can determine the number of times the received data needs to be processed.
• In this way, the signaling overhead incurred by the distributed device 1 repeatedly notifying the distributed device 2 of the number of processing times can be avoided.
  • the distributed device 1 and the distributed device 2 are, for example, the above-mentioned terminal equipment, or devices used to implement the functions of the above-mentioned terminal equipment.
  • the distributed device 1 may be a smart watch, and the distributed device 2 may be a smartphone.
  • Step 904 The distributed device 2 processes the first data through the machine learning model to obtain the second data.
• the data processing capability of the distributed device 2 can support the distributed device 2 in assisting the distributed device 1 to complete the data processing; that is, the second data obtained by the distributed device 2 can meet the data processing requirements of the distributed device 1, and the second data is the data expected by the distributed device 1.
• the distributed device 2 can gradually process the first data multiple times through the machine learning model according to the number of processing times indicated by the first information, to obtain the second data.
• For example, when the distributed device 1 instructs the distributed device 2 in the first information to process the first data 800 times, the distributed device 2 gradually processes the first data 800 times through the machine learning model to obtain the second data.
  • the data processing capability of the distributed device 2 can support the distributed device 2 to complete the processing of the first data a specified number of times by the distributed device 1 .
• the machine learning model used by the distributed device 2 to process the first data may be the same as the machine learning model used by the distributed device 1 to process the original data, so as to guarantee the noise reduction performance of the distributed device 2 on the first data and ensure that the second data can meet the data processing requirements of the distributed device 1.
  • Step 905 Distributed device 2 sends second data to distributed device 1.
• After obtaining the second data, since the second data can meet the data processing requirements of the distributed device 1, the distributed device 2 sends the second data to the distributed device 1, thereby assisting the distributed device 1 in completing the data processing. In this way, after receiving the second data, the distributed device 1 can perform other data processing tasks based on the second data, for example, training tasks of other models.
• In this way, multiple devices jointly complete the data processing, which continuously improves the quality of the data obtained, reduces the data processing pressure on each device, and ensures that devices with weaker computing capabilities can also obtain data of the quality they require.
  • FIG. 9B is another schematic flowchart of a data processing method 900 provided by an embodiment of the present application.
  • the data processing method 900 may include the following steps 906-910. Among them, steps 906-910 are not sequentially related to the above-mentioned steps 901-905. Steps 906-910 and the above-mentioned steps 901-905 may be two independent sets of steps.
• The distributed device 1 and the distributed device 2 can complete the joint processing of data by executing the above steps 901-905; the distributed device 1 and the distributed device 2 can also complete the joint processing of data by executing the above steps 906-910.
  • Step 906 Distributed device 1 determines data processing requirements.
• Step 906 is similar to the above-mentioned step 901. For details, please refer to the above-mentioned step 901, which will not be described again here.
  • Step 907 Distributed device 1 sends assistance request information to distributed device 2.
• the distributed device 1 can determine whether the data processing capability of the device can meet the data processing requirements. When the distributed device 1 determines that the data processing capability of the device cannot meet the data processing requirements, the distributed device 1 can send assistance request information to the distributed device 2 to request the distributed device 2 to assist the distributed device 1 in the data processing.
  • the assistance request information sent by the distributed device 1 may indicate the number of times the original data needs to be processed, that is, the number of times the distributed device 2 processes the original data.
• For example, assume that the data processing requirement of the distributed device 1 is to gradually process the original data 1,000 times through the machine learning model, and the data processing capability of the distributed device 1 only supports it in gradually processing the data 200 times. The distributed device 1 can therefore indicate in the assistance request information that the number of times the original data needs to be processed is 800, that is, the distributed device 1 instructs the distributed device 2 to process the original data 800 times.
  • Step 908 The distributed device 2 processes the original data through the machine learning model to obtain the first data.
• After receiving the assistance request information sent by the distributed device 1, the distributed device 2 processes the original data through the machine learning model based on the indication of the assistance request information to obtain the first data.
  • the distributed device 2 processes the original data 800 times through the machine learning model to obtain the first data.
  • Step 909 Distributed device 2 sends the first data to distributed device 1.
• the distributed device 2 can also send indication information to the distributed device 1. The indication information is used to indicate the number of times the distributed device 2 has processed the original data. For example, when the distributed device 2 processes the original data 800 times through the machine learning model to obtain the first data, the indication information sent by the distributed device 2 indicates that the first data is obtained after processing the original data 800 times. It can be understood that the number of times the distributed device 2 can process the original data according to its own capabilities may be less than the number of times to be processed indicated by the distributed device 1.
  • the sending of the first data and the sending of the indication information can be performed separately.
  • Step 910 The distributed device 1 processes the first data through the machine learning model to obtain the second data.
• After receiving the first data, the distributed device 1 can determine the number of times to process the first data based on the data processing requirements of the device and the number of times the distributed device 2 has processed the original data, and then process the first data through the machine learning model to obtain the second data.
• For example, when the distributed device 1 instructs the distributed device 2 through the assistance request information to process the original data 800 times, the distributed device 1 can determine, according to its own data processing requirement of processing the original data 1,000 times, that the received first data still needs to be processed 200 times. Therefore, the distributed device 1 continues to process the first data 200 times through the machine learning model to obtain the second data.
  • the distributed device 2 can assist the distributed device 1 in completing the data processing.
• When the data processing requirements of the distributed device 1 are high or the data processing capability of the distributed device 2 is low, it may be difficult for the distributed device 2 to assist the distributed device 1 in completing the data processing on its own. In this case, the distributed device 2 can also send the data processed by this device to other distributed devices, so that the other distributed devices can continue to assist in completing the data processing; optionally, the distributed device 1 can also request other distributed devices to continue assisting in the processing. This process is similar to requesting the distributed device 2 to assist in the processing, and will not be described again here.
  • FIG. 10A is a schematic flowchart of a data processing method 1000 provided by an embodiment of the present application. As shown in Figure 10A, the data processing method 1000 includes the following steps 1001-1007.
  • Step 1001 The distributed device 1 determines data processing requirements.
  • Step 1002 The distributed device 1 processes the original data through the machine learning model to obtain the first data.
  • Step 1003 Distributed device 1 sends first data to distributed device 2.
• Steps 1001-1003 are similar to the above-mentioned steps 901-903. For details, please refer to the above-mentioned steps 901-903, which will not be described again here.
  • Step 1004 The distributed device 2 processes the first data through the machine learning model to obtain the second data.
  • the data processing capability of the distributed device 2 cannot support the distributed device 2 to assist the distributed device 1 in completing data processing.
• For example, if the data processing requirement of the distributed device 1 is to gradually process the original data 1,000 times through the machine learning model and the distributed device 1 has only gradually processed the original data 200 times, the distributed device 1 may instruct the distributed device 2 to gradually process the first data 800 times; however, the data processing capability of the distributed device 2 is not enough to support the distributed device 2 in gradually processing the first data 800 times, and the distributed device 2 may only be able to gradually process the first data 200 times to obtain the second data.
• Therefore, the second data obtained by the distributed device 2 after processing the first data through the machine learning model still does not meet the data processing requirements of the distributed device 1; that is, the second data is not the target data expected by the distributed device 1.
  • Step 1005 Distributed device 2 sends second data to distributed device 3.
  • the distributed device 2 can continue to request other distributed devices to assist in completing the data processing.
  • the distributed device 2 sends the second data to the distributed device 3, so that the distributed device 3 continues to process the second data, so as to assist the distributed device 1 in completing the data processing.
  • the distributed device 2 can also send second information to the distributed device 3.
• the second information is used to indicate the number of times the second data sent by the distributed device 2 needs to be processed. The number of times the second data needs to be processed may be calculated based on the number of times the first data needs to be processed indicated in the first information and the number of times the distributed device 2 actually processed the first data.
  • the sending of the second data and the sending of the second information can be performed separately.
• For example, the distributed device 1 indicates through the first information that the number of times the first data needs to be processed is 800; after the distributed device 2 receives the first data and the first information, the distributed device 2 processes the first data 200 times to obtain the second data; therefore, the distributed device 2 may send the second data and the second information to the distributed device 3, where the second information is used to indicate that the number of times the second data needs to be processed is 600 (800-200).
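The relay step described above can be sketched as follows. This is a hypothetical illustration: each assisting device performs as many passes as its capability allows, subtracts them from the indicated remaining count, and forwards the data with the updated count; all names are illustrative.

```python
def assist_and_forward(data, times_to_process, capability, model_step):
    """Process as many passes as possible, return the data and remaining count."""
    done = min(capability, times_to_process)
    for _ in range(done):
        data = model_step(data)
    remaining = times_to_process - done
    return data, remaining  # remaining > 0 means another device must continue

def step(x):
    # Toy stand-in for one pass of the machine learning model: count the passes.
    return x + 1

# Device 2's share of the example: 800 passes requested, capability of 200.
data, remaining = assist_and_forward(0, 800, 200, step)
```

A `remaining` of 600 here corresponds to the second information that device 2 forwards to device 3 in the example.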
  • Step 1006 distributed device 3-distributed device N assist in processing data in sequence.
• After receiving the second data sent by the distributed device 2, the distributed device 3 continues to process the second data. Moreover, if the number of times the distributed device 3 processes the second data is still less than the number of times the second data needs to be processed, the distributed device 3 continues to send its processed data to the next distributed device after the distributed device 3, to instruct the subsequent distributed devices to continue to assist in completing the data processing, until the distributed device N processes and obtains target data that can meet the data processing requirements of the distributed device 1.
  • the distributed device 3 and the distributed device N may be the same distributed device, or they may be different distributed devices.
  • distributed device 3 and distributed device N are drawn as different distributed devices.
  • Step 1007 Distributed device N sends target data to distributed device 1.
• After processing and obtaining the target data required by the distributed device 1, since the target data can meet the data processing requirements of the distributed device 1, the distributed device N sends the target data to the distributed device 1, thereby completing the assistance to the distributed device 1 in the data processing. In this way, after receiving the target data, the distributed device 1 can perform other data processing tasks based on the target data, for example, training tasks of other models.
• the distributed device N can directly send the target data to the distributed device 1; the distributed device N can also send the target data to the previous distributed device of the distributed device N (that is, the distributed device N-1, which requested the distributed device N to assist in processing the data), so that the target data can be sent to the distributed device 1 hop by hop.
  • FIG. 10B is another schematic flowchart of a data processing method 1000 provided by an embodiment of the present application.
  • the data processing method 1000 may include the following steps 1008-1014. Among them, steps 1008-1014 are not sequentially related to the above-mentioned steps 1001-1007. Steps 1008-1014 and the above-mentioned steps 1001-1007 may be two independent sets of steps.
• The distributed device 1 to the distributed device N can complete the joint processing of data by executing the above steps 1001-1007; the distributed device 1 to the distributed device N can also complete the joint processing of data by executing the above steps 1008-1014.
  • Step 1008 Distributed device 1 determines data processing requirements.
  • Step 1009 Distributed device 1 sends assistance request information to distributed device 2.
• Steps 1008-1009 are similar to the above-mentioned steps 906-907. For details, please refer to the above-mentioned steps 906-907, which will not be described again here.
• Step 1010 The distributed device 2 processes the original data through the machine learning model to obtain intermediate data.
• the data processing capability of the distributed device 2 is not enough to support the distributed device 2 in completing the number of data processing times indicated by the distributed device 1 in the assistance request information. Therefore, the distributed device 2 processes the original data through the machine learning model according to the data processing capability of the device to obtain the intermediate data.
  • the number of data processing times corresponding to the intermediate data is smaller than the number of data processing times indicated by the distributed device 1 in the request for assistance information.
• For example, the distributed device 2 processes the original data 300 times through the machine learning model; the intermediate data obtained after processing the original data 300 times cannot meet the requirements of the distributed device 1, that is, the intermediate data is not the data expected by the distributed device 1.
  • Step 1011 Distributed device 2 sends intermediate data to distributed device 3.
  • the distributed device 2 can continue to request other distributed devices to assist in completing the data processing.
  • the distributed device 2 sends intermediate data to the distributed device 3 so that the distributed device 3 continues to process the intermediate data to assist the distributed device 1 in completing the data processing.
  • the distributed device 2 can also send indication information to the distributed device 3.
• the indication information is used to indicate the number of times the intermediate data sent by the distributed device 2 needs to be processed. The number of times the intermediate data needs to be processed may be calculated based on the number of times the original data needs to be processed indicated in the assistance request information and the number of times the distributed device 2 actually processed the original data.
  • the sending of intermediate data and the sending of indication information can be performed separately.
  • Step 1012 distributed device 3-distributed device N assist in processing data in sequence to obtain the first data required by distributed device 1.
• After receiving the intermediate data sent by the distributed device 2, the distributed device 3 continues to process the intermediate data. Moreover, if the number of times the distributed device 3 processes the intermediate data is still less than the number of times the intermediate data needs to be processed, the distributed device 3 continues to send the data it has processed to the next distributed device after the distributed device 3, to instruct the subsequent distributed devices to continue to assist in completing the data processing, until the distributed device N processes and obtains the first data that can meet the data processing requirements of the distributed device 1.
  • Step 1013 Distributed device N sends the first data to distributed device 1.
  • the first data can meet the data processing requirements indicated by the distributed device 1 in the request for assistance information, so the distributed device N sends the first data to the distributed device 1 .
  • Step 1014 The distributed device 1 processes the first data through the machine learning model to obtain the second data.
  • the above method 900 and method 1000 introduce the process of a certain distributed device requesting other distributed devices to assist in completing data processing.
  • different distributed devices may need to process the same type of data, and the data processing requirements of different distributed devices are different.
  • a central device can be used to coordinate the data processing needs of each distributed device, thereby achieving joint processing of data among different distributed devices.
  • Figure 11 is a schematic flowchart of a data processing method 1100 provided by an embodiment of the present application. As shown in Figure 11, the data processing method 1100 includes the following steps 1101-1108.
  • Step 1101 Multiple distributed devices send data processing requirements to the central device respectively.
  • the multiple distributed devices include distributed device 1 , distributed device 2 and distributed device 3 .
  • the data required by multiple distributed devices is the same type of data, but different distributed devices have different quality requirements for the required data. That is, the number of times different distributed devices need to perform noise reduction processing on the original data is not the same.
  • distributed device 1 needs to use image data to train an image classification model.
• the image classification model does not have high requirements for the quality of the image data, so the data processing requirement of the distributed device 1 may specifically be to perform noise reduction processing on the original image data 1,000 times.
  • the distributed device 2 may need image data to train a semantic segmentation model.
• the semantic segmentation model is used to identify each object in the image, and therefore has higher requirements for the quality of the image data; the data processing requirement of the distributed device 2 may specifically be to perform noise reduction processing on the original image data 5,000 times.
  • the distributed device 3 may need image data to train an image enhancement model.
• the image enhancement model is used to identify specific objects in the image and enhance the clarity of the recognized objects, and therefore has the highest requirements for the quality of the image data.
  • the data processing requirements of the distributed device 3 may specifically be to perform noise reduction processing on the original image data 10,000 times.
  • Step 1102 The central device determines the order in which each distributed device processes data.
• the central device can determine the order in which each distributed device processes data based on the number of data processing times in the data processing requirements of each distributed device. For example, the central device first determines the number of data processing times in the data processing requirements of each distributed device, and then determines the order in which each distributed device processes data in ascending order of the number of data processing times. The smaller the number of data processing times in a distributed device's data processing requirements, the earlier that distributed device processes the data; the greater the number of data processing times, the later that distributed device processes the data.
• For example, if the data processing requirement of the distributed device 1 is to perform noise reduction processing on the original image data 1,000 times, the data processing requirement of the distributed device 2 is to perform noise reduction processing on the original image data 5,000 times, and the data processing requirement of the distributed device 3 is to perform noise reduction processing on the original image data 10,000 times, then the order in which the three distributed devices process the data is: distributed device 1 → distributed device 2 → distributed device 3.
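The ordering rule of step 1102 can be sketched as a sort by required pass count. This is a hypothetical illustration; the device names and dictionary representation are assumptions for the example.

```python
def processing_order(requirements):
    """requirements: dict mapping device name -> required number of passes.
    Devices with fewer required passes process the data earlier, so each
    device can reuse the passes already performed upstream."""
    return sorted(requirements, key=requirements.get)

order = processing_order({"device3": 10000, "device1": 1000, "device2": 5000})
```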
  • Step 1103 The central device sends instruction information to each distributed device to instruct each distributed device in the order in which the data is processed.
• After determining the order in which each distributed device processes data, the central device sends indication information to each distributed device to indicate to each distributed device the order in which it processes data. In this way, after each distributed device receives the indication information sent by the central device, it can determine from which distributed device to receive processed data and to which distributed device to send the data processed on its own device.
• the indication information sent by the central device to each distributed device may also indicate the number of times each distributed device needs to process the data.
• For example, the instruction information 1 sent by the central device to the distributed device 1 may specifically be: the previous hop node is empty, the number of local data processing times is 1000, and the next hop node is the distributed device 2. That is, the distributed device 1 is the first node to start processing the data; the distributed device 1 needs to process the data 1,000 times through the machine learning model and send the processed data to the distributed device 2.
• the instruction information 2 sent by the central device to the distributed device 2 may specifically be: the previous hop node is the distributed device 1, the number of local data processing times is 4000 (i.e., 5000-1000), and the next hop node is the distributed device 3.
  • the instruction information 3 sent by the central device to the distributed device 3 may specifically be: the previous hop node is the distributed device 2, the number of local data processing times is 5000, and the next hop node is empty.
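When every device can process its full share, the per-device local counts above are simply differences between consecutive requirements along the sorted order. The following sketch constructs such instruction information; the field names (`prev_hop`, `local_times`, `next_hop`) are illustrative, not from the application.

```python
def build_instructions(ordered_requirements):
    """ordered_requirements: list of (device, required_passes), ascending.
    Each device's local count is its requirement minus its predecessor's,
    i.e. the passes still missing when the data arrives."""
    instructions = {}
    prev_device, prev_total = None, 0
    for i, (device, total) in enumerate(ordered_requirements):
        nxt = ordered_requirements[i + 1][0] if i + 1 < len(ordered_requirements) else None
        instructions[device] = {
            "prev_hop": prev_device,
            "local_times": total - prev_total,
            "next_hop": nxt,
        }
        prev_device, prev_total = device, total
    return instructions

ins = build_instructions([("device1", 1000), ("device2", 5000), ("device3", 10000)])
```

With the example requirements of 1,000 / 5,000 / 10,000 passes, this reproduces the local counts 1000, 4000 and 5000 given in the instruction information above.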
• In this example, the data processing capability of each distributed device meets the data processing requirements of that device. Therefore, after any distributed device receives data that has been processed a certain number of times from another distributed device, it can continue to process that data to obtain data that meets the data processing requirements of its own device.
• In another example, the data processing capabilities of the distributed devices may not be able to meet their own data processing requirements. If the central device determines the manner in which the distributed devices jointly process data based only on the relationship between the numbers of data processing times in the data processing requirements of the distributed devices, some distributed devices may be unable to complete the processing of the data. Therefore, in this example, the central device can determine the manner in which the distributed devices jointly process data based on both the relationship between the numbers of data processing times in the data processing requirements of the distributed devices and the data processing capabilities of the distributed devices.
• For example, assume that the data processing requirements of the distributed device 1, the distributed device 2 and the distributed device 3 are to perform noise reduction processing on the original image data 1,000 times, 5,000 times and 10,000 times respectively; the data processing capability of the distributed device 1 supports it in performing noise reduction processing on data 1,000 times, the data processing capability of the distributed device 2 supports it in performing noise reduction processing on data 2,000 times, and the data processing capability of the distributed device 3 supports it in performing noise reduction processing on data 9,000 times.
  • the instruction information 1 sent by the central device to the distributed device 1 may specifically be: the previous hop node is empty, the number of local data processing times is 1000, and the next hop node is the distributed device 2.
• the instruction information 2 sent by the central device to the distributed device 2 may specifically be: the previous hop node is the distributed device 1, the number of local data processing times is 2000, and the next hop node is the distributed device 3; the previous hop node is the distributed device 3, the number of local data processing times is 0, and the next hop node is empty.
• the instruction information 3 sent by the central device to the distributed device 3 may specifically be: the previous hop node is the distributed device 2; when the number of local data processing times reaches 2000, the next hop node is the distributed device 2; when the number of local data processing times reaches 7000, the next hop node is empty.
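The capability-constrained schedule above can be checked with a short sketch: along the route, the cumulative pass count must reach each device's requirement, and no device may exceed its capability. This is a hypothetical verification of the example, with the route encoded as (device, passes performed there) segments; device 3's two segments (2,000 passes to satisfy device 2's requirement of 5,000, then 5,000 more, for a local total of 7,000) are an assumed reading of the instruction information.

```python
requirements = {"device1": 1000, "device2": 5000, "device3": 10000}
capabilities = {"device1": 1000, "device2": 2000, "device3": 9000}

# Route from the instruction information: (device, passes performed there).
route = [("device1", 1000), ("device2", 2000), ("device3", 2000), ("device3", 5000)]

def verify_schedule(route, requirements, capabilities):
    total, per_device, reached = 0, {}, {}
    for device, passes in route:
        total += passes
        per_device[device] = per_device.get(device, 0) + passes
        reached[total] = device  # cumulative pass counts attained along the way
    caps_ok = all(per_device[d] <= capabilities[d] for d in per_device)
    reqs_ok = all(req in reached for req in requirements.values())
    return caps_ok and reqs_ok

ok = verify_schedule(route, requirements, capabilities)
```

The cumulative counts 1,000 / 5,000 / 10,000 each appear along the route, so every device's requirement is met while device 3 performs only 7,000 of its supported 9,000 passes.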
  • Step 1104 The distributed device 1 processes the original data T1 times through the machine learning model to obtain the first data.
• After receiving the instruction information sent by the central device, the distributed device 1 can determine that it is the first device to process the data. Therefore, the distributed device 1 processes the original data T1 times through the machine learning model to obtain the first data.
  • the central device can specify the number of times each distributed device processes data through instruction information.
  • the number of times the distributed device 1 processes the original data may match the data processing requirements of the distributed device 1 . That is, the data processing requirement of the distributed device 1 is to process the original data T1 times, and the number of times the distributed device 1 actually processes the original data is also T1 times.
  • the central device does not specify the number of times each distributed device processes data.
  • the number of times the distributed device 1 processes the original data may not match the data processing requirements of the distributed device 1 . That is, the number of times the distributed device 1 actually processes the original data may be greater or less than the number of data processing times required by the distributed device 1 .
  • the data processing requirement of the distributed device 1 is to process the original data N1 times, and the number of times the distributed device 1 actually processes the original data is T1 times, where N1 can be greater or less than T1.
• For example, when the computing resources or storage resources on the distributed device 1 are relatively abundant, the number of times T1 that the distributed device 1 actually processes the original data can be greater than the required number of data processing times N1; when the computing resources or storage resources on the distributed device 1 are relatively tight, the number of times T1 that the distributed device 1 actually processes the original data may be less than the required number of data processing times N1.
  • Step 1105 Distributed device 1 sends the first data to distributed device 2.
  • the instruction information received by the distributed device 1 from the central device also indicates that the distributed device 1 needs to send processed data to the distributed device 2 . Therefore, after the distributed device 1 processes the original data and obtains the first data, the distributed device 1 sends the first data to the distributed device 2 .
• the distributed device 1 can send information to the distributed device 2 to indicate the number of times the distributed device 1 has processed the original data.
  • Step 1106 The distributed device 2 processes the first data T2 times through the machine learning model to obtain the second data.
• After receiving the instruction information sent by the central device, the distributed device 2 can determine that it needs to receive data from the distributed device 1 and continue to process the received data. Therefore, the distributed device 2 processes the first data T2 times through the machine learning model to obtain the second data. When the second data obtained by the distributed device 2 processing the first data T2 times can meet the data processing requirements of the distributed device 2, the second data is the data required by the distributed device 2.
• In one case, the central device can specify the number of times the distributed device 2 processes data through the instruction information, and the distributed device 2 obtains the data it requires after processing the first data T2 times according to the instructions of the central device. For example, assume that the data processing requirement of the distributed device 2 is to process the original data T1+T2 times. Since the first data received by the distributed device 2 is the data obtained after processing the original data T1 times, the second data obtained after the distributed device 2 processes the first data T2 times according to the instructions of the central device is the data required by the distributed device 2.
  • alternatively, the distributed device 2 may process the first data S1 times to obtain the second data.
  • the second data is not the data required by the distributed device 2.
  • at this time, the distributed device 2 can send the second data to other distributed devices to request that they assist the distributed device 2 in continuing to process the second data.
  • the distributed device 2 may process the first data S2 times to obtain the second data.
  • the data obtained when the distributed device 2 processes the first data T2 times is the data required by the distributed device 2.
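The split of an iterative processing task across two devices, as in steps 1104-1106, can be sketched as follows. Here `denoise_step` is a hypothetical stand-in for one pass of the shared machine learning model; the embodiments do not fix its concrete form.

```python
# Sketch of steps 1104-1106: device 1 runs T1 passes of the shared model,
# device 2 resumes with T2 more passes. `denoise_step` is a hypothetical
# stand-in for one pass of the machine learning model.
def denoise_step(data):
    # Illustrative only: damp each value toward zero, mimicking noise removal.
    return [v * 0.9 for v in data]

def process_n_times(data, n):
    for _ in range(n):
        data = denoise_step(data)
    return data

T1, T2 = 3, 5
raw = [1.0, -2.0, 0.5]
first_data = process_n_times(raw, T1)          # done on distributed device 1
second_data = process_n_times(first_data, T2)  # continued on distributed device 2
# Because device 2 resumes exactly where device 1 stopped, the joint result
# equals running all T1 + T2 passes on a single device.
assert second_data == process_n_times(raw, T1 + T2)
```

The final assertion captures why the split is lossless: the chain of passes is the same, only its execution is divided between devices.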
  • Step 1107 Distributed device 2 sends second data to distributed device 3.
  • the instruction information received by the distributed device 2 from the central device also indicates that the distributed device 2 needs to send processed data to the distributed device 3 . Therefore, after the distributed device 2 processes the first data and obtains the second data, the distributed device 2 sends the second data to the distributed device 3 .
  • Step 1108 The distributed device 3 processes the second data T3 times through the machine learning model to obtain the third data.
  • after receiving the instruction information sent by the central device, the distributed device 3 can determine that it needs to receive data from the distributed device 2 and continue to process the received data. Therefore, the distributed device 3 uses the machine learning model to process the second data T3 times to obtain the third data. Wherein, when the third data obtained by the distributed device 3 processing the second data T3 times meets the data processing requirements of the distributed device 3, the third data is the data required by the distributed device 3.
  • the process of the distributed device 3 processing the second data under various circumstances is similar to the process of the distributed device 2 processing the first data.
  • the above method 1100 is explained by taking three distributed devices jointly processing data as an example. In actual applications, two or more distributed devices may jointly process data based on the above process; this embodiment does not limit the number of distributed devices that jointly process data.
  • the above method 1100 introduces how to coordinate the joint processing of data by distributed devices when the data processing requirements of each distributed device are different.
  • some distributed devices may have the same data processing requirements, and the central device can allocate data processing tasks to these distributed devices based on the data processing capabilities of the distributed devices.
  • Figure 12 is a schematic flowchart of a data processing method 1200 provided by an embodiment of the present application. As shown in Figure 12, the data processing method 1200 includes the following steps 1201-1207.
  • Step 1201 Distributed device 1 and distributed device 2 send data processing requirements and data processing capabilities to the central device respectively.
  • the data processing requirements of distributed device 1 and distributed device 2 are the same.
  • the data processing requirements of distributed device 1 and distributed device 2 are to perform noise reduction processing on the original data 1,000 times.
  • the data processing capabilities of the distributed device 1 and the data processing capabilities of the distributed device 2 may be the same or different, which is not specifically limited in this embodiment.
  • Step 1202 The central device determines the order in which each distributed device processes data and the number of times the data is processed.
  • the central device can allocate part of the data processing process to the distributed device 1 and allocate another part of the data processing process to the distributed device 2.
  • the central device determines the order in which the distributed devices process data.
  • the central device may randomly determine the order in which distributed device 1 and distributed device 2 process data.
  • the central device may determine the order in which the distributed device 1 and the distributed device 2 process the data according to the source of the data that they need to process. For example, assuming that the data that distributed device 1 and distributed device 2 need to process comes from distributed device 0, and distributed device 1 is located between distributed device 0 and distributed device 2, then the central device can determine that the distributed device 1 processes the data first, and the distributed device 2 then continues to process the data processed by the distributed device 1.
  • the central device may determine the order in which distributed device 1 and distributed device 2 process the data based on the next-hop node of the processed data. For example, assuming that the data finally processed by distributed device 1 and distributed device 2 needs to be sent to distributed device 3, and distributed device 1 is located between distributed device 2 and distributed device 3, then the central device can determine that the distributed device 2 processes the data first, and the distributed device 1 then continues to process the data processed by the distributed device 2, so that the distributed device 1 can send the final processed data to the distributed device 3 more quickly.
  • the number of times each distributed device processes data may be determined by the central device based on the data processing capabilities of each distributed device. The higher the data processing capability of the distributed device, the greater the number of times the central device can determine that the distributed device processes data; the lower the data processing capability of the distributed device, the smaller the number of times the central device can determine that the distributed device processes data.
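As a sketch, capability-proportional allocation might look like the following. The capability scores and the rule for handling rounding remainders are illustrative assumptions; the embodiments only state that a more capable device is assigned more passes.

```python
# Sketch: the central device splits a total processing count among devices
# in proportion to hypothetical capability scores. How capability is scored
# and how rounding remainders are handled are illustrative assumptions.
def allocate_passes(total, capabilities):
    total_cap = sum(capabilities.values())
    shares = {dev: cap * total // total_cap for dev, cap in capabilities.items()}
    # Give any rounding remainder to the most capable device.
    shares[max(capabilities, key=capabilities.get)] += total - sum(shares.values())
    return shares

# With device 2 four times as capable, a 1000-pass task splits 200 / 800,
# matching the example in the following steps.
print(allocate_passes(1000, {"device1": 1, "device2": 4}))
```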
  • Step 1203 The central device sends instruction information to each distributed device to indicate the order and number of times each distributed device processes data.
  • for example, the central device can send instruction information 1 to the distributed device 1 to instruct the distributed device 1 to process the data 200 times first and send the processed data to the distributed device 2.
  • the central device also sends instruction information 2 to the distributed device 2 to instruct the distributed device 2 to receive data from the distributed device 1, process the received data 800 times, and send the processed data back to the distributed device 1.
  • Step 1204 The distributed device 1 processes the data to be processed N1 times through the machine learning model to obtain the first data.
  • based on the instructions of the central device, the distributed device 1 processes the data to be processed N1 times through the machine learning model.
  • the data to be processed may be original data pre-stored on the distributed device 1, or it may be processed data sent to the distributed device 1 by other distributed devices; this embodiment does not specifically limit the data to be processed.
  • Step 1205 Distributed device 1 sends the first data to distributed device 2.
  • Step 1206 The distributed device 2 processes the first data N2 times through the machine learning model to obtain the second data.
  • the data processing requirements of the distributed device 1 and the distributed device 2 are to process the data to be processed N1 + N2 times. Therefore, after the distributed device 2 processes the first data N2 times, the second data obtained is the data required by distributed device 1 and distributed device 2. At this time, the distributed device 2 can use the obtained second data to perform other tasks, such as using the second data to train an image processing model.
  • Step 1207 Distributed device 2 sends second data to distributed device 1.
  • since the central device instructs the distributed device 2 to send the processed data to the distributed device 1, the distributed device 2 sends the second data to the distributed device 1 after obtaining it, so that the distributed device 1 can perform other tasks based on the second data.
  • this application also provides a communication device.
  • the embodiment of the present application provides a communication device 1300.
  • the communication device 1300 can realize the functions of the terminal device (or network device) in the above method embodiment, and therefore can also achieve the beneficial effects of the above method embodiment.
  • the communication device 1300 may be a terminal device (or network device), or may be an integrated circuit or component within the terminal device (or network device), such as a chip.
  • the following embodiments will be described by taking the communication device 1300 as a terminal device or a network device as an example.
  • the communication device 1300 includes: a transceiver module 1301 and a processing module 1302.
  • the transceiver module 1301 is used to receive the first data from the second device.
  • the first data is data processed by the first machine learning model;
  • the processing module 1302 is used to process the first data through the second machine learning model to obtain the second data, where the structure of the first machine learning model is the same as the structure of the second machine learning model, and the communication device and the second device are used to jointly perform data processing.
  • the second machine learning model is a diffusion model, and the second machine learning model is used to perform denoising processing on the first data.
  • the transceiver module 1301 is also used to receive first information from the second device, where the first information is used to request the communication device to perform processing on the first data.
  • the first information is used to indicate that the number of times the first data needs to be processed is the first count; the processing module 1302 is also used to perform the first count of processing on the first data through the second machine learning model to obtain the second data, wherein the capability of the first device supports completing the first count of processing on the first data.
  • the transceiver module 1301 is also used to send the second data to the second device; or, the transceiver module 1301 is used to send the second data to a source device, where the first information is also used to indicate the source device, and the source device is the device that first requested assistance in processing the data.
  • the first information is used to indicate that the first data needs to be processed the first number of times; the processing module 1302 is also used to process the first data the second number of times through the second machine learning model to obtain the second data, wherein the first number is greater than the second number, and the capability of the first device does not support completing the first number of processes on the first data; the transceiver module 1301 is also used to send the second data and second information to the third device; wherein the second information is used to indicate that the number of times the second data needs to be processed is the third number, the third number is the difference between the first number and the second number, and the third device is used to assist the communication device in processing data.
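The relay behavior described here can be sketched as follows. `step` stands in for one pass of the machine learning model, and the function and variable names are illustrative, not taken from the embodiments.

```python
# Sketch: a device performs as many passes (the second number) as its
# capability allows, then forwards the partially processed data together
# with the remaining count (third number = first number - second number)
# so a third device can finish the job. `step` is one hypothetical model pass.
def relay_process(data, first_count, capability, step):
    second_count = min(first_count, capability)
    for _ in range(second_count):
        data = step(data)
    third_count = first_count - second_count  # forwarded as "second information"
    return data, third_count

# Example: 10 passes requested, capability allows 4, so 6 passes remain
# for the assisting device.
data, remaining = relay_process(0, 10, 4, lambda x: x + 1)
```

When the capability covers the full count, the remaining count is zero and no assistance is needed, matching the non-relay case above.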
  • the transceiver module 1301 is also configured to send assistance request information to the second device, where the assistance request information is used to request the second device to assist in processing data.
  • the transceiver module 1301 is also used to send third information to the central device.
  • the third information is used to indicate the number of data processing times required by the communication device; the transceiver module 1301 is also used to receive feedback information from the central device, where the feedback information is used to indicate that the second device is an assisting node.
  • the transceiver module 1301 is also used to receive fourth information from the central device.
  • the fourth information is used to indicate the number of times the communication device needs to process the data received from the second device; the processing module 1302 is also used to process the first data through the second machine learning model according to the fourth information to obtain the second data required by the communication device.
  • the fourth information is also used to indicate information about a third device, which is a device to receive the processed data; the transceiver module 1301 is also used to send the second data to the third device according to the fourth information.
  • the transceiver module 1301 is also configured to receive fifth information from the second device, where the fifth information is used to indicate the number of times the first data has been processed; the processing module 1302 is also configured to process the first data through the second machine learning model according to that number of processing times and the number of data processing times required by the communication device, to obtain the second data required by the communication device.
  • the processing module 1302 is used to process the original data through the first machine learning model to obtain the first data; the transceiver module 1301 is used to send the first data to the second device to request that the second device assist in processing the first data; the transceiver module 1301 is used to receive the second data sent by the second device or other devices, wherein the second data is obtained by processing the first data through the second machine learning model, and the structure of the first machine learning model is the same as the structure of the second machine learning model.
  • the first machine learning model is a diffusion model, and the first machine learning model is used to perform denoising processing on the original data.
  • the transceiver module is also configured to send first information to the second device, the first information is used to request the second device to perform processing on the first data, and the first information is also used to indicate the number of times the first data needs to be processed, which is determined based on the number of times the original data needs to be processed and the number of times the first device has performed processing on the original data.
  • the transceiver module 1301 is configured to receive first information from the first device and second information from the second device, where the first information is used to indicate the first data required by the first device. The number of processing times.
  • the second information is used to indicate the second number of processing times of data required by the second device.
  • the data processing model corresponding to the first number of processing times is the same as the data processing model corresponding to the second number of processing times.
  • the transceiver module 1301 is used to send third information to the second device, and the third information is used to instruct the second device to send the processed data to the first device, wherein the second number of data processing times required by the second device is less than or equal to the first number of data processing times required by the first device.
  • the transceiver module 1301 is also configured to send fourth information to the first device, where the fourth information is used to indicate the number of times the first device needs to perform processing on the data received from the second device.
  • the transceiver module 1301 in the communication device 1300 may be a transceiver, and the processing module 1302 may be a processor.
  • when the communication device 1300 is an integrated circuit or component inside a terminal device or a network device, the transceiver module 1301 in the communication device 1300 can be an input or output pin on the chip, and the processing module 1302 may be a computing component on the chip.
  • the transceiver module 1301 in the communication device 1300 can be a communication interface on the chip system, and the processing module 1302 can be a processing core on the chip system.
  • the model training device 1400 can realize the functions of the terminal device (or network device) in the above method embodiment, and therefore can also achieve the beneficial effects of the above method embodiment.
  • the model training device 1400 may be a terminal device (or network device), or may be an integrated circuit or component within the terminal device (or network device), such as a chip. The following embodiments will be described by taking the model training device 1400 as a terminal device or a network device as an example.
  • the model training device 1400 includes: a transceiver module 1401 and a processing module 1402.
  • the transceiver module 1401 is used to obtain a training sample set.
  • the training sample set includes first data and second data.
  • the first data is obtained based on the second data, and the second data is the training label of the first data;
  • the processing module 1402 is used to train the first machine learning model based on the training sample set to obtain the trained first machine learning model, where the first machine learning model is used to process the first data;
  • the transceiver module 1401 is used to send the trained first machine learning model to the second device, where the second device is used to aggregate machine learning models with the same structure and different parameters that are trained by multiple devices.
  • the transceiver module 1401 is also used to send first information to a third device.
  • the first information is used to indicate capabilities related to model training on the model training device.
  • the third device is used to determine, based on the capabilities of the multiple devices participating in machine learning model training, the training content that the multiple devices are responsible for;
  • the transceiver module 1401 is also used to receive second information from the third device, where the second information is used to indicate the number of times the machine learning model trained on the model training device processes input data.
  • the second information is also used to indicate the input data requirements of the first machine learning model; the processing module 1402 is also used to: process the original data according to the input data requirements indicated by the second information, to obtain the second data; and process the second data according to the number of times the first machine learning model processes the input data indicated by the second information, to obtain the first data.
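The training-pair construction described above can be sketched as follows. Treating each "processing" pass as adding Gaussian noise is an assumption borrowed from diffusion-model training (the first machine learning model is elsewhere described as a diffusion model); the embodiments do not fix the concrete operation, and the function names are illustrative.

```python
import random

# Sketch: the original data is first processed to satisfy the model's
# input-data requirement, giving the label (second data); further passes,
# whose count is indicated by the second information, give the model input
# (first data). Adding Gaussian noise per pass is an illustrative assumption.
def add_noise(data, sigma=0.1):
    return [v + random.gauss(0.0, sigma) for v in data]

def make_training_pair(original, requirement_passes, training_passes):
    second_data = original
    for _ in range(requirement_passes):   # meet the input-data requirement
        second_data = add_noise(second_data)
    first_data = second_data
    for _ in range(training_passes):      # count indicated by second information
        first_data = add_noise(first_data)
    return first_data, second_data        # (model input, its training label)

x, y = make_training_pair([0.0, 1.0, 2.0], requirement_passes=2, training_passes=5)
```

The model is then trained to map the noisier first data back toward its label, the second data, which is consistent with the label relationship stated above.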
  • the transceiver module 1401 is also used to receive the second machine learning model from the second device; the processing module 1402 is also used to train the second machine learning model based on the training sample set, to obtain The trained second machine learning model; the transceiving module 1401 is also used to send the trained second machine learning model to the second device.
  • the transceiver module 1401 is configured to receive multiple pieces of capability information from multiple different devices, where each piece of capability information is used to indicate capabilities related to model training on the corresponding device; the transceiver module 1401 is also used to send different training configuration information to the multiple different devices according to the multiple pieces of capability information, where the training configuration information is used to indicate the number of times the machine learning model trained on a device processes the input data, and is also used to indicate the input data requirements of that machine learning model.
  • the machine learning models trained by multiple different devices have the same structure.
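Because the models trained by the different devices share one structure, they can be aggregated position by position. Element-wise parameter averaging (FedAvg-style) is an assumed aggregation rule for illustration; the embodiments only require that the structures match.

```python
# Sketch: aggregating machine learning models that share one structure but
# carry different parameters. Averaging each parameter across devices is an
# illustrative assumption; matching structures are what make this position-
# wise aggregation well defined.
def aggregate(models):
    n = len(models)
    return [sum(params) / n for params in zip(*models)]

# Two devices report the same 2-parameter structure with different values.
merged = aggregate([[1.0, 2.0], [3.0, 4.0]])
# merged == [2.0, 3.0]
```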
  • the transceiver module 1401 in the model training device 1400 can be a transceiver, and the processing module 1402 can be a processor.
  • when the model training device 1400 is an integrated circuit or component inside a terminal device or a network device, the transceiver module 1401 in the model training device 1400 can be an input or output pin on the chip, and the processing module 1402 may be a computing component on the chip.
  • the transceiver module 1401 in the model training device 1400 can be a communication interface on the chip system, and the processing module 1402 can be a processing core on the chip system.
  • FIG. 15 is another schematic structural diagram of a communication device 1500 provided in this application.
  • the communication device 1500 at least includes an input and output interface 1502 .
  • the communication device 1500 may be a chip or an integrated circuit.
  • the communication device further includes a logic circuit 1501.
  • the transceiver module 1301 shown in Figure 13 can be a communication interface, and the communication interface can be the input and output interface 1502 in Figure 15.
  • the input and output interface 1502 can include an input interface and an output interface.
  • the communication interface may also be a transceiver circuit, and the transceiver circuit may include an input interface circuit and an output interface circuit.
  • the input and output interface 1502 is used to obtain the AI model information of the first network device; the logic circuit 1501 is used to determine the AI performance information of the first network device based on the AI model information of the first network device.
  • the logic circuit 1501 and the input-output interface 1502 can also perform other steps performed by the terminal device in any of the foregoing embodiments and achieve corresponding beneficial effects, which will not be described again here.
  • the logic circuit 1501 is used to generate AI model information of the first network device; the input and output interface 1502 is used to send the AI model information of the first network device.
  • the logic circuit 1501 and the input-output interface 1502 can also perform other steps performed by the network device in any embodiment and achieve corresponding beneficial effects, which will not be described again here.
  • the processing module 1302 shown in FIG. 13 may be the logic circuit 1501 in FIG. 15 .
  • the logic circuit 1501 may be a processing device, and the functions of the processing device may be partially or fully implemented through software.
  • the processing device may include a memory and a processor, wherein the memory is used to store a computer program, and the processor reads and executes the computer program stored in the memory to perform corresponding processing and/or steps in any method embodiment.
  • the processing means may comprise only a processor.
  • the memory for storing computer programs is located outside the processing device, and the processor is connected to the memory through circuits/wires to read and execute the computer programs stored in the memory.
  • the memory and processor can be integrated together, or they can also be physically independent of each other.
  • the processing device may be one or more chips, or one or more integrated circuits.
  • the processing device may be one or more field-programmable gate arrays (FPGA), application-specific integrated circuits (ASIC), systems on chip (SoC), central processing units (CPU), network processors (NP), digital signal processors (DSP), microcontroller units (MCU), programmable logic devices (PLD), or other integrated chips, or any combination of the above chips or processors, etc.
  • FIG. 16 shows the communication device 1600 involved in the above embodiment, provided by an embodiment of the present application.
  • the communication device 1600 can specifically be a communication device serving as a terminal device in the above embodiment.
  • the example shown in FIG. 16 is that the communication device is implemented by a terminal device (or a component in the terminal device).
  • the communication device 1600 may include but is not limited to at least one processor 1601 and a communication port 1602.
  • the device may also include at least one of a memory 1603 and a bus 1604.
  • the at least one processor 1601 is used to control the actions of the communication device 1600.
  • the processor 1601 may be a central processing unit, a general purpose processor, a digital signal processor, an application specific integrated circuit, a field programmable gate array or other programmable logic device, a transistor logic device, a hardware component, or any combination thereof. It may implement or execute the various illustrative logical blocks, modules and circuits described in connection with this disclosure.
  • the processor may also be a combination that implements computing functions, such as a combination of one or more microprocessors, a combination of a digital signal processor and a microprocessor, and so on.
  • the communication device 1600 shown in Figure 16 can be specifically used to implement the steps implemented by the terminal device in the foregoing method embodiment, and to achieve the corresponding technical effects of the terminal device.
  • for the specific implementation of the communication device shown in Figure 16, reference may be made to the descriptions in the foregoing method embodiments, which will not be repeated here.
  • FIG. 17 is a schematic structural diagram of the communication device 1700 involved in the above embodiment provided by the embodiment of the present application.
  • the communication device 1700 can specifically be the communication device serving as a network device in the above embodiment. The example shown in Figure 17 is that the communication device is implemented by a network device (or a component in the network device), wherein the structure of the communication device may refer to the structure shown in FIG. 17.
  • the communication device 1700 includes at least one processor 1711 and at least one network interface 1714. Further optionally, the communication device further includes at least one memory 1717, at least one transceiver 1713 and one or more antennas 1715.
  • the processor 1711, the memory 1717, the transceiver 1713 and the network interface 1714 are connected, for example, through a bus. In the embodiment of the present application, the connection may include various interfaces, transmission lines or buses, etc., which is not limited in this embodiment.
  • Antenna 1715 is connected to transceiver 1713.
  • the network interface 1714 is used to enable the communication device to communicate with other communication devices through communication links.
  • the network interface 1714 may include a network interface between the communication device and a core network device, such as an S1 interface, and may also include a network interface between the communication device and other communication devices (such as other network devices or core network devices), such as an X2 or Xn interface.
  • the processor 1711 is mainly used to process communication protocols and communication data, control the entire communication device, execute software programs, and process data of the software programs, for example, to support the communication device to perform actions described in the embodiments.
  • the communication device may include a baseband processor and a central processing unit.
  • the baseband processor is mainly used to process communication protocols and communication data.
  • the central processing unit is mainly used to control the entire terminal device, execute software programs, and process data of the software programs.
  • the processor 1711 in Figure 17 can integrate the functions of the baseband processor and the central processor. Those skilled in the art can understand that the baseband processor and the central processor can also be independent processors, interconnected through technologies such as buses.
  • the terminal device may include multiple baseband processors to adapt to different network standards, the terminal device may include multiple central processors to enhance its processing capabilities, and various components of the terminal device may be connected through various buses.
  • the baseband processor can also be expressed as a baseband processing circuit or a baseband processing chip.
  • the central processing unit can also be expressed as a central processing circuit or a central processing chip.
  • the function of processing communication protocols and communication data can be built into the processor, or can be stored in the memory in the form of a software program, and the processor executes the software program to implement the baseband processing function.
  • Memory is mainly used to store software programs and data.
  • the memory 1717 may exist independently and be connected to the processor 1711. Alternatively, the memory 1717 may be integrated with the processor 1711, for example, within a chip. Among them, the memory 1717 can store the program code for executing the technical solution of the embodiment of the present application, and the execution is controlled by the processor 1711. The various computer program codes executed can also be regarded as the driver of the processor 1711.
  • Figure 17 shows only one memory and one processor. In an actual terminal device, there may be multiple processors and multiple memories. Memory can also be called storage media or storage devices.
  • the memory may be a storage element on the same chip as the processor, that is, an on-chip storage element, or an independent storage element, which is not limited in the embodiments of the present application.
  • the transceiver 1713 may be used to support the reception or transmission of radio frequency signals between the communication device and the terminal, and the transceiver 1713 may be connected to the antenna 1715.
  • Transceiver 1713 includes a transmitter Tx and a receiver Rx.
  • one or more antennas 1715 can receive radio frequency signals
  • the receiver Rx of the transceiver 1713 is used to receive radio frequency signals from the antennas, convert the radio frequency signals into digital baseband signals or digital intermediate frequency signals, and provide the digital baseband signals or digital intermediate frequency signals to the processor 1711, so that the processor 1711 performs further processing on them, such as demodulation and decoding.
  • the transmitter Tx in the transceiver 1713 is also used to receive a modulated digital baseband signal or digital intermediate frequency signal from the processor 1711, convert the modulated digital baseband signal or digital intermediate frequency signal into a radio frequency signal, and transmit the radio frequency signal through the one or more antennas 1715.
  • the receiver Rx can selectively perform one or more levels of down-mixing processing and analog-to-digital conversion processing on the radio frequency signal to obtain a digital baseband signal or a digital intermediate frequency signal; the order of the down-mixing processing and the analog-to-digital conversion processing is adjustable.
  • the transmitter Tx can selectively perform one or more levels of up-mixing processing and digital-to-analog conversion processing on the modulated digital baseband signal or digital intermediate frequency signal to obtain a radio frequency signal; the order of the up-mixing processing and the digital-to-analog conversion processing is adjustable.
  • Digital baseband signals and digital intermediate frequency signals can be collectively referred to as digital signals.
  • the transceiver 1713 may also be called a transceiver unit, a transceiver, a transceiver device, etc.
  • the device used to implement the receiving function in the transceiver unit can be regarded as the receiving unit
  • the device used to implement the transmitting function in the transceiver unit can be regarded as the transmitting unit; that is, the transceiver unit includes a receiving unit and a transmitting unit. The receiving unit may also be called a receiver, an input port, a receiving circuit, etc.
  • the sending unit may be called a transmitter or a transmitting circuit, etc.
  • the communication device 1700 shown in Figure 17 can be used to implement the steps implemented by the network equipment in the foregoing method embodiments, and to achieve the corresponding technical effects of the network equipment.
  • for the specific implementation of the communication device 1700 shown in Figure 17, reference may be made to the descriptions in the foregoing method embodiments, and details are not repeated here.
  • Embodiments of the present application also provide a computer-readable storage medium that stores one or more computer-executable instructions.
  • when the computer-executable instructions are executed by a processor, the processor performs the methods of the possible implementations of the terminal device in the foregoing embodiments.
  • Embodiments of the present application also provide a computer-readable storage medium that stores one or more computer-executable instructions.
  • when the computer-executable instructions are executed by a processor, the processor performs the methods of the possible implementations of the network device in the foregoing embodiments.
  • Embodiments of the present application also provide a computer program product (or computer program) that stores one or more computer instructions.
  • when the computer program product is executed by a processor, the processor performs the methods of the possible implementations of the above terminal device.
  • Embodiments of the present application also provide a computer program product that stores one or more computer instructions.
  • when the computer program product is executed by a processor, the processor performs the methods of the possible implementations of the above network device.
  • Embodiments of the present application also provide a chip system, which includes at least one processor and is used to support the communication device in implementing the functions involved in the possible implementation manners of the communication device.
  • the chip system further includes an interface circuit that provides program instructions and/or data to the at least one processor.
  • the chip system may also include a memory for storing necessary program instructions and data of the communication device.
  • the chip system may be composed of chips, or may include chips and other discrete devices, where the communication device may specifically be the terminal equipment in the foregoing method embodiment.
  • Embodiments of the present application also provide a chip system, which includes at least one processor and is used to support the communication device in implementing the functions involved in the possible implementation manners of the communication device.
  • the chip system further includes an interface circuit that provides program instructions and/or data to the at least one processor.
  • the chip system may also include a memory, which is used to store necessary program instructions and data for the communication device.
  • the chip system may be composed of chips, or may include chips and other discrete devices.
  • the communication device may specifically be the network device in the aforementioned method embodiment.
  • An embodiment of the present application also provides a communication system.
  • the network system architecture includes the terminal device and network device in any of the above embodiments.
  • the disclosed systems, devices and methods can be implemented in other ways.
  • the device embodiments described above are only illustrative.
  • the division of the units is only a logical function division. In actual implementation, there may be other division methods.
  • multiple units or components may be combined or integrated into another system, or some features may be ignored or not implemented.
  • the couplings or direct couplings or communication connections between components shown or discussed may be implemented through some interfaces, and the indirect couplings or communication connections between devices or units may be electrical, mechanical, or in other forms.
  • the units described as separate components may or may not be physically separated, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
  • each functional unit in each embodiment of the present application can be integrated into one processing unit, each unit can exist physically alone, or two or more units can be integrated into one unit.
  • the above integrated units can be implemented in the form of hardware or software functional units. If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium.
  • the technical solution of the present application, in essence, or the part contributing to the existing technology, or all or part of the technical solution, can be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions to cause a computer device (which may be a personal computer, a server, a network device, etc.) to execute all or part of the steps of the methods described in the various embodiments of this application.
  • the aforementioned storage media include: USB flash drives, removable hard disks, read-only memory (ROM), random access memory (RAM), magnetic disks, optical disks, and other media that can store program code.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Security & Cryptography (AREA)
  • Computer Hardware Design (AREA)
  • General Health & Medical Sciences (AREA)
  • Bioethics (AREA)
  • Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Mathematical Physics (AREA)
  • Computing Systems (AREA)
  • Medical Informatics (AREA)
  • Data Mining & Analysis (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Artificial Intelligence (AREA)
  • Mobile Radio Communication Systems (AREA)

Abstract

This application provides a data processing method applied in a wireless communication system with artificial intelligence (AI) processing capability. In the method, machine learning models with the same structure are deployed on different communication apparatuses, and multiple apparatuses jointly complete the processing of data so as to continuously improve the quality of the resulting data. This relieves the data processing pressure on each apparatus and ensures that apparatuses with weaker computing capability can also obtain data of the quality they require.

Description

Data processing method, training method, and related apparatus — Technical Field
This application relates to the field of artificial intelligence (AI) technologies, and in particular to a data processing method, a training method, and related apparatuses.
Background
With the arrival of the big-data era, the scale of data is growing rapidly, and how to mine valuable information from large-scale data has become an urgent problem in most application scenarios. At present, machine learning models are usually used to process data so as to mine the valuable information it contains.
Before a machine learning model is used to process data, a large amount of data is usually needed to train the model so that a model with high accuracy can be obtained. In practice, however, high-quality data that can be used to train machine learning models is often hard to obtain. In wireless communication networks in particular, because of interference and noise, the acquired data often has to go through a complex processing procedure before high-quality data is obtained. How to obtain, at low complexity, high-quality data for machine learning model training is therefore a popular research direction.
In the related art, the process of processing raw data to obtain high-quality data can be regarded as a data denoising process. High-quality data can be obtained by performing denoising on the raw data multiple times with a machine learning model, and the whole process can be modeled as a Markov chain. However, the training of the machine learning model used for denoising has high complexity, and inference with the model, i.e. denoising the data, usually requires a large number of denoising passes, so the data processing procedure is rather complex. A device therefore needs substantial computing resources to support the training and inference of the model, which is difficult for most devices with weak computing capability.
Summary
This application provides a data processing method in which machine learning models with the same structure are deployed on different apparatuses and multiple apparatuses jointly complete the processing of data, so that the quality of the resulting data is continuously improved, the data processing pressure on each apparatus is relieved, and apparatuses with weaker computing capability can also complete the processing of data.
A first aspect of this application provides a data processing method. Taking the method being performed by a first apparatus as an example, the first apparatus may be a terminal device or a network device, or a component of a terminal device or network device (for example a processor, a chip, or a chip system), or a logical module and/or software capable of implementing all or part of the functions of a terminal device. Specifically, the method includes: the first apparatus receives first data from a second apparatus, where the first data is data that has been processed by a first machine learning model; that is, after the second apparatus obtains the first data through processing by the first machine learning model, it sends the first data to the first apparatus. The first apparatus then processes the first data through a second machine learning model to obtain second data. The structure of the first machine learning model is the same as that of the second machine learning model, and the first apparatus and the second apparatus jointly perform the processing of the data.
In some cases, the first apparatus and the second apparatus may also be called distributed apparatuses. Different distributed apparatuses exchange data with one another to realize joint processing of the data.
Simply put, after the second apparatus processes the data with a machine learning model, it sends the processed data to the first apparatus; the first apparatus then continues to process the data based on a machine learning model with the same structure, so that the two apparatuses jointly perform the processing of the data.
In this solution, machine learning models are deployed on different apparatuses and multiple apparatuses jointly complete the processing of data, so that the quality of the resulting data is continuously improved, the data processing pressure on each apparatus is relieved, and apparatuses with weaker computing capability can also obtain data of the quality they require.
In a possible implementation, the second machine learning model is a diffusion model, the data processing process performed with the diffusion model can be modeled as a Markov chain, and the second machine learning model is used to denoise the first data. The diffusion model may be implemented by a neural network, for example a fully connected neural network, a convolutional neural network, or a residual neural network. Processing data with the diffusion model means that the diffusion model repeatedly processes the data output by its previous processing step, so that the data is processed step by step in multiple passes based on the same diffusion model, finally yielding high-quality output data.
In this solution, based on the characteristics of the diffusion model, diffusion models with the same structure are deployed on different apparatuses, and the different apparatuses realize joint processing of the data serially, which relieves the data processing pressure on each apparatus while high-quality data can still be obtained.
In a possible implementation, the method further includes: the first apparatus receives first information from the second apparatus, where the first information is used to request the first apparatus to process the first data. For example, when the second apparatus needs to obtain high-quality data but cannot complete the entire data processing procedure by itself, the second apparatus performs part of the processing on the raw data through the first machine learning model to obtain the first data, and sends the first data and the first information to the first apparatus to request the first apparatus to assist the second apparatus in completing the processing of the data.
In this solution, by exchanging request information between apparatuses, an apparatus with weaker data processing capability can request other apparatuses to assist in completing the processing of data. The data processing capability on each apparatus is fully utilized, and it is ensured that an apparatus with weaker data processing capability can also obtain data of the quality it requires.
In a possible implementation, the first information sent by the second apparatus to the first apparatus indicates that the number of times the first data is still to be processed is a first number. The first apparatus processes the first data the first number of times through the second machine learning model to obtain the second data, where the capability of the first apparatus supports completing the first number of processing passes on the first data.
In a possible implementation, since the second apparatus requested the first apparatus to assist in completing the processing, after the first apparatus obtains the second data it sends the second data to the second apparatus, thereby feeding the processed second data back to the second apparatus.
Alternatively, when the second apparatus is not the first apparatus to request assistance with the processing, the first information further indicates information about a source apparatus, where the source apparatus is the apparatus that first requested assistance with processing the data, the second apparatus is one of the apparatuses assisting with the processing, and the first apparatus is another assisting apparatus. In this case, after the first apparatus obtains the second data, it sends the second data to the source apparatus, ensuring that the source apparatus obtains the finally processed data.
In a possible implementation, the first information sent by the second apparatus to the first apparatus indicates that the number of times the first data is still to be processed is a first number. The first apparatus processes the first data a second number of times through the second machine learning model to obtain the second data, where the first number is greater than the second number and the capability of the first apparatus does not support completing the first number of processing passes on the first data. Moreover, the first apparatus sends the second data and second information to a third apparatus, where the second information indicates that the number of times the second data is still to be processed is a third number, the third number being the difference between the first number and the second number, and the third apparatus assists the first apparatus in performing the processing of the data.
Simply put, suppose the second apparatus requests the first apparatus to assist in processing the first data 1000 times, but the capability of the first apparatus only supports completing 600 processing passes on the first data. The first apparatus then completes 600 passes on the first data to obtain the second data, and sends the second information to a third apparatus to request the third apparatus to perform the remaining 400 passes on the second data.
In this solution, during the joint processing of data, each of the multiple apparatuses completes part of the processing according to its own capability and sends the not-yet-fully-processed data to the next apparatus, which continues the processing. In this way, multiple apparatuses are coordinated to jointly complete the processing while the data processing capability of each apparatus is taken into account; the data processing capability of each apparatus is fully utilized, and it is ensured that apparatuses with weaker computing capability can also obtain data of the quality they require.
In a possible implementation, the method further includes: the first apparatus sends assistance request information to the second apparatus, where the assistance request information is used to request the second apparatus to assist in processing data.
That is, the first apparatus may actively request the second apparatus to assist in processing data, with the second apparatus processing the data first and then handing the processed data to the first apparatus for further processing, which avoids exchanging the data twice between the first apparatus and the second apparatus.
In a possible implementation, the method further includes: the first apparatus sends third information to a central apparatus, where the third information indicates the number of processing passes required for the data the first apparatus needs; the first apparatus receives feedback information from the central apparatus, where the feedback information indicates that the second apparatus is an assisting node. That is, the first apparatus may first report to the central apparatus the number of processing passes of the data it needs, and the central apparatus coordinates among the multiple apparatuses, i.e. the central apparatus instructs the second apparatus to assist the first apparatus in processing the data.
The second apparatus may likewise report to the central apparatus the number of processing passes of the data it needs. In this case, the central apparatus may decide that the second apparatus processes the data first and the first apparatus then continues processing on the basis of the data obtained by the second apparatus, so that the data processed by the second apparatus is used effectively and the efficiency with which the first apparatus processes data is improved.
In this solution, the central apparatus coordinates the joint processing of data among the distributed apparatuses, and can determine the data processing tasks of each distributed apparatus based on the needs of the distributed apparatuses, improving the efficiency of joint data processing by the distributed apparatuses.
In a possible implementation, the method further includes: the first apparatus receives fourth information from the central apparatus, where the fourth information indicates the number of processing passes to be performed on the data the first apparatus receives from the second apparatus. The first apparatus processes the first data through the second machine learning model according to the fourth information to obtain the second data the first apparatus needs.
That is, when the central apparatus coordinates the joint processing of data by the distributed apparatuses, the central apparatus may determine, based on the data processing needs of each distributed apparatus, the order in which the distributed apparatuses process the data and the number of processing passes each performs, so that each distributed apparatus can determine the number of processing passes after receiving data sent by another distributed apparatus.
In a possible implementation, the fourth information further indicates information about a third apparatus, where the third apparatus is the apparatus that is to receive the data processed by the first apparatus. The first apparatus sends the second data to the third apparatus according to the fourth information. That is, after the first apparatus obtains the second data with the assistance of the second apparatus, the first apparatus may send the second data to the third apparatus so that the third apparatus can use the second data or continue to process it.
Simply put, in the information it feeds back to a distributed apparatus, the central apparatus can indicate which distributed apparatus the apparatus should receive data from, the number of processing passes to perform after receiving the data, and which distributed apparatus the processed data should be sent to, thereby effectively coordinating the joint data processing among the distributed apparatuses.
In a possible implementation, the method further includes: the first apparatus receives fifth information from the second apparatus, where the fifth information indicates the number of times the first data has already been processed. The first apparatus processes the first data through the second machine learning model according to the number of times the first data has been processed and the number of processing passes required for the data the first apparatus needs, to obtain the second data the first apparatus needs.
In this solution, when the preceding distributed apparatus sends processed data to the following distributed apparatus, it indicates the number of times it has already processed the data, which allows the following apparatus to determine the number of remaining processing passes from the number already performed. This guarantees joint processing of the data without the central apparatus having to specify the number of processing passes, and allows each distributed apparatus to adjust the number of passes dynamically according to actual operating conditions.
A second aspect of this application provides a data processing method. Taking the method being performed by a first apparatus as an example, the first apparatus may be a terminal device or a network device, or a component of a terminal device or network device (for example a processor, a chip, or a chip system), or a logical module and/or software capable of implementing all or part of the functions of a terminal device.
Specifically, the method includes: first, the first apparatus processes raw data through a first machine learning model to obtain first data. The first apparatus then sends the first data to a second apparatus. Finally, the first apparatus receives second data sent by the second apparatus or another apparatus, where the second data is obtained through processing based on a second machine learning model, and the structure of the first machine learning model is the same as that of the second machine learning model.
Optionally, the first apparatus and the second apparatus may interact in advance, so that upon receiving the data sent by the first apparatus, the second apparatus knows that it needs to process that data a certain number of times.
That is, the first apparatus first determines the number of processing passes the raw data requires, performs a certain number of passes on the raw data, and obtains the first data. Since the number of passes the first apparatus performs on the raw data is less than the number the raw data requires, the first apparatus sends the first data to the second apparatus, by default requesting the second apparatus to assist in processing the first data.
In this solution, machine learning models are deployed on different apparatuses and multiple apparatuses jointly complete the processing of data, so that the quality of the resulting data is continuously improved, the data processing pressure on each apparatus is relieved, and apparatuses with weaker computing capability can also obtain data of the quality they require.
In a possible implementation, the first machine learning model is a diffusion model, the data processing process performed with the diffusion model can be modeled as a Markov chain, and the first machine learning model is used to denoise the raw data.
In a possible implementation, the first apparatus may also send first information to the second apparatus, where the first information is used to request the second apparatus to process the first data, and/or the first information indicates the number of times the first data is still to be processed, that number being determined based on the number of processing passes the raw data requires and the number of passes the first apparatus has performed on the raw data.
That is, in addition to sending the to-be-processed first data to the second apparatus, the first apparatus also sends the first information to indicate to the second apparatus how to process the first data.
A third aspect of this application provides a data processing method. Taking the method being performed by a central apparatus as an example, the central apparatus may be a terminal device or a network device, or a component of a terminal device or network device (for example a processor, a chip, or a chip system), or a logical module and/or software capable of implementing all or part of the functions of a terminal device.
Specifically, the method includes: the central apparatus receives first information from a first apparatus and second information from a second apparatus, where the first information indicates a first number of processing passes required for the data the first apparatus needs, the second information indicates a second number of processing passes required for the data the second apparatus needs, and the data processing model corresponding to the first number of passes is the same as the data processing model corresponding to the second number of passes; the central apparatus sends third information to the second apparatus, where the third information instructs the second apparatus to send processed data to the first apparatus. The second number of processing passes required by the second apparatus is less than or equal to the first number of processing passes required by the first apparatus.
Simply put, the first apparatus and the second apparatus, which may also be called distributed apparatuses, both report to the central apparatus the number of processing passes of the data they need. Based on these numbers, the central apparatus determines the order in which the first and second apparatuses process the data during joint processing, and accordingly instructs the second apparatus to send processed data to the first apparatus; that is, the second apparatus processes the data first and then sends the processed data to the first apparatus.
In this solution, the central apparatus coordinates the joint processing of data among the distributed apparatuses, and can determine the data processing tasks of each distributed apparatus based on the needs of the distributed apparatuses, improving the efficiency of joint data processing by the distributed apparatuses.
In a possible implementation, the method further includes: the central apparatus sends fourth information to the first apparatus, where the fourth information indicates the number of processing passes to be performed on the data the first apparatus receives from the second apparatus.
For example, if the first apparatus needs 1000 processing passes on its data and the second apparatus needs 600, the fourth information sent by the central apparatus to the first apparatus may indicate that the data received from the second apparatus is to be processed 400 more times.
A fourth aspect of this application provides a model training method applied to a first apparatus in a training system, where the first apparatus may be a terminal device or a network device, or a component of a terminal device or network device (for example a processor, a chip, or a chip system), or a logical module and/or software capable of implementing all or part of the functions of a terminal device. The training system includes multiple distributed apparatuses. Specifically, the method includes: first, the first apparatus obtains a training sample set including first data and second data, where the first data is obtained based on the second data and the second data is the training label of the first data. Then, the first apparatus trains a first machine learning model based on the training sample set to obtain a trained first machine learning model. During training, the first data is used as input to the first machine learning model, the model processes the first data, a loss function is computed from the processed data and the second data, and the parameters of the first machine learning model are updated based on the loss function, thereby training the first machine learning model. After training ends, the first apparatus sends the trained first machine learning model to a second apparatus, where the second apparatus is an apparatus used to aggregate machine learning models that have the same structure but different parameters and have been trained by multiple apparatuses. The second apparatus may also be called an aggregation apparatus.
In this solution, the same machine learning model is deployed on the different distributed apparatuses and trained there, and the aggregation apparatus finally aggregates the machine learning models trained by the distributed apparatuses. The training process can thus be split across different distributed apparatuses for execution, relieving the model training pressure on each distributed apparatus.
In a possible implementation, the method further includes: the first apparatus sends first information to a third apparatus, where the first information indicates the capabilities of the first apparatus that are relevant to model training. The third apparatus determines, based on the capabilities of the multiple apparatuses participating in the machine learning model training, the training content each apparatus is responsible for, and may therefore also be called a central apparatus. After sending the first information, the first apparatus receives second information from the third apparatus, where the second information indicates the number of times the first machine learning model trained on the first apparatus processes its input data. The second information also indicates the requirement on the input data of the first machine learning model. For example, when the input data of the first machine learning model is obtained by adding noise to target data, the requirement on the input data may be the number of noise-adding passes performed on the target data.
The first apparatus obtains the training sample set as follows: according to the input data requirement indicated by the second information and the number of times the first machine learning model processes its input data, the first apparatus processes the target data to obtain the second data and the first data. For example, suppose the input data requirement is to perform M-N to M noise-adding passes on the target data to obtain a set of data, from which the first data and the second data are taken, where the second data is the training label of the first data and the number of noise-adding passes needed to obtain the second data is smaller than that needed to obtain its corresponding first data.
In a possible implementation, the method further includes: the first apparatus receives a second machine learning model from the second apparatus; the first apparatus trains the second machine learning model based on the training sample set to obtain a trained second machine learning model; and the first apparatus sends the trained second machine learning model to the second apparatus.
That is, after the second apparatus has aggregated the machine learning models of the multiple distributed apparatuses, the second apparatus sends the aggregated second machine learning model back to the first apparatus so that the first apparatus continues to train the second machine learning model.
A fifth aspect of this application provides a model training method applied to a first apparatus, where the first apparatus may be a terminal device or a network device, or a component of a terminal device or network device (for example a processor, a chip, or a chip system), or a logical module and/or software capable of implementing all or part of the functions of a terminal device. The method includes: the first apparatus receives multiple pieces of capability information from multiple different apparatuses, where each piece of capability information indicates the capabilities of an apparatus that are relevant to model training. Then, according to the multiple pieces of capability information, the first apparatus sends different training configuration information to the multiple different apparatuses, where the training configuration information indicates the number of times the machine learning model trained on an apparatus processes its input data and also indicates the requirement on the input data of the machine learning model trained on the apparatus, and the machine learning models trained by the multiple different apparatuses have the same structure. The first apparatus may also be called a central apparatus.
Simply put, the process of repeatedly processing data with a machine learning model can be regarded as a Markov chain, and performing one processing pass with the machine learning model can be regarded as one link in the chain. The central apparatus can split the Markov chain into multiple sub-chains and, according to the capabilities of the distributed apparatuses, assign the resulting sub-chains to different distributed apparatuses, i.e. different distributed apparatuses perform different training tasks.
Optionally, the model-training-related capabilities of a distributed apparatus may include its computing capability, storage capability, and communication capability, among others. The computing capability may be measured by the number of operations the distributed apparatus can perform per second; the storage capability by the storage space allocated to model training on the distributed apparatus; and the communication capability by the data transmission rate allocated to the model training process on the distributed apparatus. Besides these, the model-training-related capabilities of a distributed apparatus may include other capabilities that can affect model training, which is not specifically limited here.
A sixth aspect of this application provides a communication apparatus, including: a transceiver module, configured to receive first data from a second apparatus, where the first data is data that has been processed by a first machine learning model; and a processing module, configured to process the first data through a second machine learning model to obtain second data, where the structure of the first machine learning model is the same as that of the second machine learning model, and the communication apparatus and the second apparatus jointly perform the processing of the data.
In a possible implementation, the second machine learning model is a diffusion model, the data processing process performed with the diffusion model can be modeled as a Markov chain, and the second machine learning model is used to denoise the first data.
In a possible implementation, the transceiver module is further configured to receive first information from the second apparatus, where the first information is used to request the communication apparatus to process the first data.
In a possible implementation, the first information indicates that the number of times the first data is to be processed is a first number; the processing module is further configured to process the first data the first number of times through the second machine learning model to obtain the second data, where the capability of the communication apparatus supports completing the first number of processing passes on the first data.
In a possible implementation, the transceiver module is further configured to send the second data to the second apparatus; or the transceiver module is configured to send the second data to a source apparatus, where the first information further indicates information about the source apparatus, the source apparatus being the apparatus that first requested assistance with processing the data.
In a possible implementation, the first information indicates that the number of times the first data is to be processed is a first number; the processing module is further configured to process the first data a second number of times through the second machine learning model to obtain the second data, where the first number is greater than the second number and the capability of the communication apparatus does not support completing the first number of processing passes on the first data; the transceiver module is further configured to send the second data and second information to a third apparatus, where the second information indicates that the number of times the second data is to be processed is a third number, the third number being the difference between the first number and the second number, and the third apparatus assists the communication apparatus in performing the processing of the data.
In a possible implementation, the transceiver module is further configured to send assistance request information to the second apparatus, the assistance request information being used to request the second apparatus to assist in processing data.
In a possible implementation, the transceiver module is further configured to send third information to a central apparatus, where the third information indicates the number of processing passes required for the data the communication apparatus needs; and the transceiver module is further configured to receive feedback information from the central apparatus, where the feedback information indicates that the second apparatus is an assisting node.
In a possible implementation, the transceiver module is further configured to receive fourth information from the central apparatus, where the fourth information indicates the number of processing passes to be performed on the data the communication apparatus receives from the second apparatus; and the processing module is further configured to process the first data through the second machine learning model according to the fourth information to obtain the second data the communication apparatus needs.
In a possible implementation, the fourth information further indicates information about a third apparatus, the third apparatus being the apparatus that is to receive the data processed by the communication apparatus; the transceiver module is further configured to send the second data to the third apparatus according to the fourth information.
In a possible implementation, the transceiver module is further configured to receive fifth information from the second apparatus, where the fifth information indicates the number of times the first data has already been processed; the processing module is further configured to process the first data through the second machine learning model according to that number and the number of processing passes required for the data the communication apparatus needs, to obtain the second data the communication apparatus needs.
In a possible implementation, the transceiver module is a transceiver and the processing module is a processor.
A seventh aspect of this application provides a communication apparatus, including: a processing module, configured to process raw data through a first machine learning model to obtain first data; and a transceiver module, configured to send the first data to a second apparatus, and further configured to receive second data sent by the second apparatus or another apparatus, where the second data is obtained by processing the first data based on a second machine learning model and the structure of the first machine learning model is the same as that of the second machine learning model.
In a possible implementation, the first machine learning model is a diffusion model, the data processing process performed with the diffusion model can be modeled as a Markov chain, and the first machine learning model is used to denoise the raw data.
In a possible implementation, the transceiver module is further configured to send first information to the second apparatus, where the first information is used to request the second apparatus to process the first data, and/or the first information indicates the number of times the first data is to be processed, that number being determined based on the number of processing passes the raw data requires and the number of passes the communication apparatus has performed on the raw data.
In a possible implementation, the transceiver module is a transceiver and the processing module is a processor.
An eighth aspect of this application provides a communication apparatus, including: a transceiver module, configured to receive first information from a first apparatus and second information from a second apparatus, where the first information indicates a first number of processing passes required for the data the first apparatus needs, the second information indicates a second number of processing passes required for the data the second apparatus needs, and the data processing model corresponding to the first number of passes is the same as the data processing model corresponding to the second number of passes; the transceiver module is configured to send third information to the second apparatus, where the third information instructs the second apparatus to send processed data to the first apparatus, and the second number of processing passes required by the second apparatus is less than or equal to the first number of processing passes required by the first apparatus.
In a possible implementation, the transceiver module is further configured to send fourth information to the first apparatus, where the fourth information indicates the number of processing passes to be performed on the data the first apparatus receives from the second apparatus.
In a possible implementation, the transceiver module is a transceiver.
A ninth aspect of this application provides a model training apparatus, including: a transceiver module, configured to obtain a training sample set including first data and second data, where the first data is obtained based on the second data and the second data is the training label of the first data; a processing module, configured to train a first machine learning model based on the training sample set to obtain a trained first machine learning model, where the first machine learning model is used to process the first data; and a sending module, configured to send the trained first machine learning model to a second apparatus, where the second apparatus is an apparatus used to aggregate machine learning models that have the same structure but different parameters and have been trained by multiple apparatuses.
In a possible implementation, the sending module is further configured to send first information to a third apparatus, where the first information indicates the capabilities of the model training apparatus that are relevant to model training, and the third apparatus determines, based on the capabilities of the multiple apparatuses participating in the machine learning model training, the training content each apparatus is responsible for; the transceiver module is further configured to receive second information from the third apparatus, where the second information indicates the number of times the first machine learning model trained on the model training apparatus processes its input data and also indicates the requirement on the input data of the first machine learning model; the processing module is further configured to: process raw data according to the input data requirement indicated by the second information and the number of times the first machine learning model processes its input data, to obtain the second data; and process the second data according to the number of times, indicated by the second information, that the first machine learning model processes its input data, to obtain the first data.
In a possible implementation, the transceiver module is further configured to receive a second machine learning model from the second apparatus; the processing module is further configured to train the second machine learning model based on the training sample set to obtain a trained second machine learning model; and the sending module is further configured to send the trained second machine learning model to the second apparatus.
In a possible implementation, the transceiver module is a transceiver and the processing module is a processor.
A tenth aspect of this application provides a model training apparatus, including: a transceiver module, configured to receive multiple pieces of capability information from multiple different apparatuses, where each piece of capability information indicates the capabilities of an apparatus that are relevant to model training; the transceiver module is configured to send, according to the multiple pieces of capability information, different training configuration information to the multiple different apparatuses, where the training configuration information indicates the number of times the machine learning model trained on an apparatus processes its input data and also indicates the requirement on the input data of the machine learning model trained on the apparatus, and the machine learning models trained by the multiple different apparatuses have the same structure.
In a possible implementation, the transceiver module is a transceiver.
An eleventh aspect of embodiments of this application provides a communication apparatus, including at least one processor coupled to a memory; the memory is configured to store a program or instructions; and the at least one processor is configured to execute the program or instructions so that the apparatus implements the method of the first aspect or any possible implementation of the first aspect, or the method of the second aspect or any possible implementation of the second aspect, or the method of the third aspect or any possible implementation of the third aspect, or the method of the fourth aspect or any possible implementation of the fourth aspect, or the method of the fifth aspect or any possible implementation of the fifth aspect.
In a possible implementation, the communication apparatus further includes the above memory. Optionally, the memory is integrated with the processor, or the memory and the processor are disposed separately.
In a possible implementation, the communication apparatus further includes a transceiver configured to receive and send data or signaling.
A twelfth aspect of embodiments of this application provides a computer-readable storage medium storing one or more computer-executable instructions. When the computer-executable instructions are executed by a processor, the processor performs the method of the first aspect or any possible implementation of the first aspect, or the method of the second aspect or any possible implementation of the second aspect, or the method of the third aspect or any possible implementation of the third aspect, or the method of the fourth aspect or any possible implementation of the fourth aspect, or the method of the fifth aspect or any possible implementation of the fifth aspect.
A thirteenth aspect of embodiments of this application provides a computer program product (or computer program) storing one or more computer instructions. When the computer program product is executed by a processor, the processor performs the method of the first aspect or any possible implementation of the first aspect, or the method of the second aspect or any possible implementation of the second aspect, or the method of the third aspect or any possible implementation of the third aspect, or the method of the fourth aspect or any possible implementation of the fourth aspect, or the method of the fifth aspect or any possible implementation of the fifth aspect.
A fourteenth aspect of embodiments of this application provides a chip system, including at least one processor configured to support a communication apparatus in implementing the functions involved in the first aspect or any possible implementation of the first aspect, or in the second aspect or any possible implementation of the second aspect, or in the third aspect or any possible implementation of the third aspect, or in the fourth aspect or any possible implementation of the fourth aspect, or in the fifth aspect or any possible implementation of the fifth aspect.
In a possible design, the chip system may further include a memory configured to store the program instructions and data necessary for the communication apparatus. The chip system may be composed of chips, or may include chips and other discrete devices. Optionally, the chip system further includes an interface circuit that provides program instructions and/or data to the at least one processor.
A fifteenth aspect of embodiments of this application provides a communication system, including the communication apparatuses involved in the sixth and seventh aspects, and/or the communication apparatuses involved in the sixth, seventh, and eighth aspects, and/or the apparatuses of the ninth and tenth aspects.
For the technical effects brought by any design of the sixth to fifteenth aspects, reference may be made to the technical effects brought by the different implementations of the first to fifth aspects, which are not repeated here.
Brief Description of Drawings
Figure 1 is a schematic diagram of a process in which a diffusion model processes data according to an embodiment of this application;
Figure 2 is a schematic diagram of part of the structure of a fully connected neural network according to an embodiment of this application;
Figure 3 is a schematic diagram of a neural network training process according to an embodiment of this application;
Figure 4 is a schematic diagram of a process in which a neural network performs backpropagation according to an embodiment of this application;
Figure 5 is a schematic architecture diagram of a wireless communication system according to an embodiment of this application;
Figure 6 is a schematic architecture diagram of a smart home communication system according to an embodiment of this application;
Figure 7 is a schematic flowchart of a model training method according to an embodiment of this application;
Figure 8 is another schematic flowchart of a model training method according to an embodiment of this application;
Figure 9A is a schematic flowchart of a data processing method 900 according to an embodiment of this application;
Figure 9B is another schematic flowchart of the data processing method 900 according to an embodiment of this application;
Figure 10A is a schematic flowchart of a data processing method 1000 according to an embodiment of this application;
Figure 10B is another schematic flowchart of the data processing method 1000 according to an embodiment of this application;
Figure 11 is a schematic flowchart of a data processing method 1100 according to an embodiment of this application;
Figure 12 is a schematic flowchart of a data processing method 1200 according to an embodiment of this application;
Figure 13 is a schematic structural diagram of a communication apparatus 1300 according to an embodiment of this application;
Figure 14 is a schematic structural diagram of a model training apparatus 1400 according to an embodiment of this application;
Figure 15 is a schematic structural diagram of a communication apparatus 1500 according to an embodiment of this application;
Figure 16 is a schematic structural diagram of a communication apparatus 1600 according to an embodiment of this application;
Figure 17 is a schematic structural diagram of a communication apparatus 1700 according to an embodiment of this application.
Detailed Description
The embodiments of this application are described below with reference to the accompanying drawings. Obviously, the described embodiments are only some rather than all of the embodiments of this application. A person of ordinary skill in the art will appreciate that, as technology develops and new scenarios emerge, the technical solutions provided in the embodiments of this application are equally applicable to similar technical problems.
The terms "first", "second", and the like in the specification, claims, and drawings of this application are used to distinguish similar objects (for example, to distinguish objects within the same embodiment) and are not necessarily used to describe a particular order or sequence. Objects qualified by "first", "second", etc. (such as "first information", "first apparatus", "second information", "second apparatus") may refer to different objects in different embodiments; for example, the "first apparatus" in one embodiment may refer to a distributed node, while the "first apparatus" in another embodiment may refer to a central node. It should be understood that data used in this way is interchangeable where appropriate, so that the embodiments described here can be implemented in an order other than that illustrated or described.
In addition, the terms "include" and "have" and any variants thereof are intended to cover non-exclusive inclusion; for example, a process, method, system, product, or device including a series of steps or modules is not necessarily limited to those steps or modules explicitly listed, but may include other steps or modules that are not explicitly listed or that are inherent to such a process, method, product, or device. The naming or numbering of steps in this application does not mean that the steps in a method flow must be performed in the temporal or logical order indicated by the naming or numbering; named or numbered process steps may be executed in a different order according to the technical objective to be achieved, as long as the same or a similar technical effect is attained.
For ease of understanding, the technical terms involved in the embodiments of this application are introduced first.
为了便于理解,以下先介绍本申请实施例所涉及的技术术语。
(1)终端设备
终端设备是指能够接收网络设备发送的调度信息和指示信息的无线终端设备。无线终端设备可以是指向用户提供语音和/或数据连通性的设备,或具有无线连接功能的手持式设备,或连接到无线调制解调器的其他处理设备。
终端设备可以经无线接入网(wireless access network,RAN)与一个或多个核心网或者互联网进行通信。示例性地,终端设备可以是移动终端设备,如移动电话(或称为“蜂窝”电话,手机(mobile phone))、计算机和数据卡,例如,可以是便携式、袖珍式、手持式、计算机内置的或者车载的移动装置,它们与无线接入网交换语音和/或数据。又例如,终端设备还可以是个人通信业务(personal communication service,PCS)电话、无绳电话、会话发起协议(Session initialization Protocol,SIP)话机、无线本地环路(wireless local loop,WLL)站、个人数字助理(personal digital assistant,PDA)、 平板电脑(Tablet Personal Computer,Tablet PC)、带无线收发功能的电脑等设备。一般地,终端设备也可以称为系统、订户单元(subscriber unit)、订户站(subscriber station),移动站(mobile station)、移动台(mobile station,MS)、远程站(remote station)、接入点(access point,AP)、远程终端设备(remote terminal)、接入终端设备(access terminal)、用户终端设备(user terminal)、用户代理(user agent)、用户站(subscriber station,SS)、用户端设备(customer premises equipment,CPE)、终端(terminal)、用户设备(user equipment,UE)、移动终端(mobile terminal,MT)等。
作为示例而非限定,在本申请实施例中,终端设备还可以是可穿戴设备。可穿戴设备也可以称为穿戴式智能设备或智能穿戴式设备等,是应用穿戴式技术对日常穿戴进行智能化设计、开发出可以穿戴的设备的总称,如眼镜、手套、手表、服饰及鞋等。可穿戴设备即直接穿在身上,或是整合到用户的衣服或配件的一种便携式设备。可穿戴设备不仅仅是一种硬件设备,更是通过软件支持以及数据交互、云端交互来实现强大的功能。广义穿戴式智能设备包括功能全、尺寸大、可不依赖智能手机实现完整或者部分的功能,例如:智能手表或智能眼镜等,以及只专注于某一类应用功能,需要和其它设备如智能手机配合使用,如各类进行体征监测的智能手环、智能头盔、智能首饰等。
终端还可以是无人机、机器人、设备到设备通信(device-to-device,D2D)中的终端、车到一切(vehicle to everything,V2X)中的终端、虚拟现实(virtual reality,VR)终端设备、增强现实(augmented reality,AR)终端设备、工业控制(industrial control)中的无线终端、无人驾驶(self driving)中的无线终端、远程医疗(remote medical)中的无线终端、智能电网(smart grid)中的无线终端、运输安全(transportation safety)中的无线终端、智慧城市(smart city)中的无线终端、智慧家庭(smart home)中的无线终端等。
此外,终端设备也可以是第五代(5th generation,5G)通信系统、之后演进的通信系统(例如第六代(6th generation,6G)通信系统等)中的终端设备或者未来演进的公共陆地移动网络(public land mobile network,PLMN)中的终端设备等。示例性的,6G网络可以进一步扩展5G通信终端的形态和功能,6G终端包括但不限于车、蜂窝网络终端(融合卫星终端功能)、无人机、物联网(internet of things,IoT)设备。
在本申请实施例中,上述终端设备可以是具有AI处理能力,能够采用AI模型对数据进行处理。
(2) Network device
A network device may refer to a device that provides wireless access services in a wireless network. For example, a network device may be a RAN node (or RAN device) that connects terminal devices to the wireless network, also called a base station. Examples of current RAN devices include: the next generation Node B (gNB) in 5G communication systems, transmission reception points (TRP), evolved Node B (eNB), radio network controllers (RNC), Node B (NB), home base stations (for example, home evolved Node B or home Node B, HNB), baseband units (BBU), and wireless fidelity (Wi-Fi) access points (AP). In addition, in one network structure, a network device may include a centralized unit (CU) node, a distributed unit (DU) node, or a RAN device including a CU node and a DU node.
A network device may also be another apparatus that provides wireless communication functions for terminal devices. For ease of description, the embodiments of this application do not limit the specific technology or specific device form adopted by the network device.
A network device may also include a core network device, which for example includes network elements such as the mobility management entity (MME), home subscriber server (HSS), serving gateway (S-GW), policy and charging rules function (PCRF), and public data network gateway (PDN gateway, P-GW) in fourth-generation (4G) networks, and the access and mobility management function (AMF), user plane function (UPF), and session management function (SMF) in 5G networks. In addition, the core network device may also include other core network devices in 5G networks and in the next-generation or future networks beyond 5G.
In the embodiments of this application, the above network device may also be a network node with AI capability that can provide AI services for terminals or other network devices, for example an AI node or computing-power node on the network side (access network or core network), a RAN node with AI capability, or a core network element with AI capability.
In the embodiments of this application, the apparatus used to implement the functions of the network device may be a network device, or may be an apparatus capable of supporting the network device in implementing those functions, for example a chip system, and the apparatus may be installed in the network device.
(3) Markov chain (MC)
A Markov chain is a stochastic process in probability theory and mathematical statistics that has the Markov property and exists on a discrete index set and state space.
Simply put, a Markov chain is a collection of discrete random variables with the Markov property. Specifically, consider a set X of random variables on a probability space (Ω, F, P) whose index set is a one-dimensional countable set, where the probability space (Ω, F, P) is a measure space with total measure 1; Ω is a non-empty set, also called the sample space; F is a non-empty subset of the power set of the sample space Ω; and P is the probability. If the values of the random variables in X all lie in a countable set, X_i = s_i with s_i ∈ S, and the conditional probabilities of the random variables satisfy the relation shown in Formula 1:
p(X_{t+1} | X_t, ..., X_1) = p(X_{t+1} | X_t)     (Formula 1)
then X is called a Markov chain, the countable set S is called the state space, and the values the Markov chain takes in the state space are called states. Here, X_i is a random variable in the set X, i may be any integer greater than 0, and s_i is an element of the countable set S. A countable set S is a set whose elements can be put in one-to-one correspondence with the elements of the set of natural numbers.
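As an illustration of Formula 1, the sketch below simulates a small Markov chain in which the next state is drawn using only the current state. The two-state transition matrix is an invented example for illustration only, not taken from this application.

```python
import random

def simulate_markov_chain(transition, states, start, steps, rng):
    """Simulate a Markov chain: the next state is sampled from a distribution
    that depends only on the current state (the Markov property)."""
    idx = start
    path = [states[idx]]
    for _ in range(steps):
        r = rng.random()
        cumulative = 0.0
        for j, p in enumerate(transition[idx]):
            cumulative += p
            if r < cumulative:
                idx = j
                break
        path.append(states[idx])
    return path

# State space S = {"A", "B"}; each row is a transition distribution p(X_{t+1} | X_t).
transition = [[0.9, 0.1],   # transition probabilities from state "A"
              [0.5, 0.5]]   # transition probabilities from state "B"
path = simulate_markov_chain(transition, ["A", "B"], start=0, steps=20,
                             rng=random.Random(0))
```

Every state in `path` lies in the state space, and each step used only the current state, matching the Markov property of Formula 1.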
(4) Diffusion model
A diffusion model is an artificial intelligence model that processes data using the principle of Markov chains. For data containing noise, a diffusion model can be used to denoise the data, thereby obtaining higher-quality data. The principle by which a diffusion model denoises data is introduced below.
Refer to Figure 1, a schematic diagram of a process in which a diffusion model processes data according to an embodiment of this application. As shown in Figure 1, given data X_0 obeying a distribution q(X_0), noise is added to X_0 step by step in the right-to-left order in Figure 1, with each noise-adding step producing a new piece of data. After T steps of noise addition to X_0, a total of T pieces of data X_1 to X_T are obtained.
Specifically, the noise-adding process can be regarded as a Markov process. The conditional probability of obtaining X_1 to X_T from X_0 is:
q(X_{1:T} | X_0) = ∏_{t=1}^{T} q(X_t | X_{t-1})
where t ranges over 1 ≤ t ≤ T and ∏ denotes a product. The single-step transition probability is
q(X_t | X_{t-1}) = N(X_t; √(1-β_t) X_{t-1}, β_t I)
which means that going from X_{t-1} to X_t is a Gaussian transformation with mean √(1-β_t) X_{t-1} and variance β_t; the variance parameter β_t is a design parameter. In the noise-adding process, when the number of noise-adding steps T is large enough and the β_t are chosen reasonably, the final X_T follows a Gaussian distribution with mean 0 and variance I, i.e. X_T ~ N(0, I).
In this case, the data can be denoised through a reverse process (the left-to-right processing of data in Figure 1). Specifically, starting from a sample drawn from a Gaussian distribution, data samples conforming to the given distribution are generated step by step. Since the transition probability q(X_{t-1} | X_t) of the reverse process is hard to compute, it can be approximated by a neural network. For example, p_θ(X_{t-1} | X_t) is used to approximate the transition probability of the reverse process, and p_θ(X_{t-1} | X_t) is made to follow a Gaussian distribution:
p_θ(X_{t-1} | X_t) = N(X_{t-1}; μ_θ(X_t, t), Σ_θ(X_t, t))
which means that going from X_t to X_{t-1} is a Gaussian transformation with mean μ_θ(X_t, t) and variance Σ_θ(X_t, t); both the mean μ_θ(X_t, t) and the variance Σ_θ(X_t, t) can be approximated by a neural network. Experiments show that the variance term has little influence and can generally be fixed, for example
Σ_θ(X_t, t) = σ_t² I
where σ_t is a parameter that is not learned, i.e. σ_t is a design parameter. The mean μ_θ(X_t, t) is learned by a neural network whose input is the data X_t at step t and the step index t, with trainable parameters θ.
Experiments also show that a neural network can instead be used to approximate the noise term ε_θ(X_t, t), where ε_θ(X_t, t) represents the noise-adding parameter from X_{t-1} to X_t, and the generated data X_0 is then obtained by iterating
X_{t-1} = (1/√α_t) · (X_t − ((1−α_t)/√(1−ᾱ_t)) · ε_θ(X_t, t)) + σ_t z
where α_t = 1 − β_t, ᾱ_t = ∏_{s=1}^{t} α_s, z = 0 when t = 1, and z ~ N(0, I) when t > 1.
Simply put, as shown in Figure 1, by adding noise to the raw data X_0 step by step, data X_T whose noise follows a Gaussian distribution is finally obtained. By using a diffusion model to denoise X_T step by step, the data X_0 can finally be recovered. The diffusion model is a neural network model, for example a fully connected neural network, a convolutional neural network, or a residual neural network; this embodiment does not limit the specific model structure of the diffusion model.
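The reverse (denoising) iteration described above can be sketched as follows. This is a hedged, minimal sketch: `eps_theta` stands in for the trained neural network ε_θ(X_t, t) (here a placeholder that predicts zero noise, purely for illustration), steps are indexed 0..T-1 instead of 1..T, and σ_t is fixed to √β_t, one common design choice.

```python
import math
import random

def denoise_step(x_t, t, eps_theta, betas, rng):
    """One reverse diffusion step:
    X_{t-1} = (1/sqrt(a_t)) * (X_t - (1-a_t)/sqrt(1-abar_t) * eps_theta(X_t, t)) + sigma_t * z
    with a_t = 1 - beta_t and abar_t the running product of the a_s."""
    alpha_t = 1.0 - betas[t]
    alpha_bar_t = 1.0
    for s in range(t + 1):
        alpha_bar_t *= 1.0 - betas[s]
    eps = eps_theta(x_t, t)                       # predicted noise term
    coeff = (1.0 - alpha_t) / math.sqrt(1.0 - alpha_bar_t)
    mean = [(xi - coeff * ei) / math.sqrt(alpha_t) for xi, ei in zip(x_t, eps)]
    sigma_t = math.sqrt(betas[t])
    # z = 0 at the final step (t = 0 here), otherwise z ~ N(0, I)
    z = [rng.gauss(0.0, 1.0) if t > 0 else 0.0 for _ in x_t]
    return [m + sigma_t * zi for m, zi in zip(mean, z)]

T = 10
betas = [0.01 * (i + 1) for i in range(T)]        # design parameters beta_t
rng = random.Random(0)
x = [rng.gauss(0.0, 1.0) for _ in range(4)]        # start from X_T ~ N(0, I)
for t in reversed(range(T)):                       # iterate X_T -> ... -> X_0
    x = denoise_step(x, t, lambda xt, step: [0.0] * len(xt), betas, rng)
```

In a real diffusion model, the zero-noise lambda would be replaced by the trained network's noise prediction.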
For ease of understanding, the neural network and its training method are introduced below taking a fully connected neural network as an example. A fully connected neural network is also called a multilayer perceptron (MLP). Refer to Figure 2, a schematic diagram of part of the structure of a fully connected neural network according to an embodiment of this application. As shown in Figure 2, an MLP contains an input layer (left), an output layer (right), and multiple hidden layers (middle).
Optionally, the data corresponding to the input layer in Figure 2 may be called input data, which may include data needed for training or data needed for inference, for example the data X_t in Figure 1 above. The multiple hidden layers in Figure 2 are deployed with corresponding model parameters and process the input data based on these model parameters. The data corresponding to the output layer in Figure 2 may be called output data, which is obtained after the multiple hidden layers process the input data, for example the data X_0 in Figure 1 above.
In addition, each layer of the above MLP contains several nodes, called neurons, and the neurons of two adjacent layers are pairwise connected.
Optionally, considering two adjacent layers, the output h of a neuron in the next layer is the weighted sum of the outputs x of all the neurons in the previous layer connected to it, passed through an activation function, which can be expressed as Formula 2:
h = f(wx + b)        (Formula 2)
where w is the weight matrix, b is the bias vector, and f is the activation function.
Further, the output of the neural network can be expressed as Formula 3:
y = f_n(w_n f_{n-1}(...) + b_n)      (Formula 3)
where n is the number of layers of the neural network and n is an integer greater than 1. In other words, a neural network can be understood as a mapping from a set of input data to a set of output data. Neural networks are usually randomly initialized, and the process of obtaining this mapping from random w and b using existing data is called training of the neural network.
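Formulas 2 and 3 can be made concrete with a tiny sketch; the weights, biases, and activation functions below are invented for illustration only.

```python
import math

def layer_forward(x, w, b, f):
    """One fully connected layer: h = f(w x + b) (Formula 2)."""
    return [f(sum(wij * xj for wij, xj in zip(row, x)) + bi)
            for row, bi in zip(w, b)]

def relu(v):
    return max(0.0, v)

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

# A 3-input -> 2-hidden -> 1-output MLP, i.e. Formula 3 with n = 2.
x = [1.0, 2.0, 3.0]
h = layer_forward(x, [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]], [0.0, 0.0], relu)
y = layer_forward(h, [[1.0, -1.0]], [0.0], sigmoid)
```

Training (introduced next) amounts to choosing the entries of `w` and `b` so that `y` matches the desired outputs on existing data.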
Optionally, the specific training method is to evaluate the output of the neural network with a loss function. Refer to Figure 3, a schematic diagram of a neural network training process according to an embodiment of this application. As shown in Figure 3, the error can be backpropagated, and the neural network parameters (including w and b) can be iteratively optimized by gradient descent until the loss function reaches its minimum, i.e. the "optimal point" in Figure 3. It can be understood that the neural network parameters corresponding to the "optimal point" in Figure 3 can serve as the neural network parameters in the trained AI model information.
Optionally, the gradient descent process can be expressed as Formula 4:
θ ← θ − η · ∂L/∂θ        (Formula 4)
where θ is the parameter to be optimized (including w and b), L is the loss function, η is the learning rate, which controls the step size of gradient descent, ∂ denotes taking a partial derivative, and ∂L/∂θ is the partial derivative of the loss function L with respect to the parameter to be optimized.
Further, the backpropagation process can use the chain rule for partial derivatives. Refer to Figure 4, a schematic diagram of a process in which a neural network performs backpropagation according to an embodiment of this application. As shown in Figure 4, the gradient of a preceding layer's parameters can be computed recursively from the gradient of the following layer's parameters, which can be expressed as Formula 5:
∂L/∂S_i = f′(S_i) · Σ_j W_ij · ∂L/∂S_j        (Formula 5)
where W_ij is the weight connecting node j to node i, and S_i is the weighted input sum at node i.
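The iterative update in Formula 4 can be sketched on a toy problem; the quadratic loss below is an invented example with a known minimiser, used only to illustrate the update rule.

```python
def gradient_descent(grad, theta0, eta, steps):
    """Iterate theta <- theta - eta * dL/dtheta (Formula 4)."""
    theta = theta0
    for _ in range(steps):
        theta -= eta * grad(theta)
    return theta

# Toy loss L(theta) = (theta - 3)^2 with gradient dL/dtheta = 2 * (theta - 3);
# gradient descent converges to the minimiser theta = 3 (the "optimal point").
theta = gradient_descent(lambda t: 2.0 * (t - 3.0), theta0=0.0, eta=0.1, steps=100)
```

With learning rate 0.1, the distance to the minimiser shrinks by a factor of 0.8 per step, so 100 steps bring `theta` very close to 3.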
The technical terms involved in the embodiments of this application have been introduced above. The scenarios to which the methods provided in the embodiments of this application apply are introduced below.
Refer to Figure 5, a schematic architecture diagram of a wireless communication system according to an embodiment of this application. As shown in Figure 5, the wireless communication system includes a network device 501 and terminal devices 502. The terminal devices 502 may include one or more terminal devices, for example the smart band, smartphone, smart TV, and laptop shown in Figure 5. The network device 501 establishes a wireless connection with each of the terminal devices 502, and wireless connections may also be established between the terminal devices 502. In the wireless communication system shown in Figure 5, the network device 501 can send downlink data to the terminal devices 502, for example a model to be trained; each of the terminal devices 502 can send uplink data to the network device 501, for example a trained model. In addition, the terminal devices 502 can also send data to one another, for example data needed during model training or data needed during model inference.
It should be noted that the wireless communication systems mentioned in the embodiments of this application include but are not limited to: fifth-generation mobile communication (5G) systems, 6G communication systems, satellite communication systems, short-range communication systems, Narrowband Internet of Things (NB-IoT) systems, the Global System for Mobile Communications (GSM), Enhanced Data rates for GSM Evolution (EDGE) systems, Wideband Code Division Multiple Access (WCDMA) systems, Code Division Multiple Access 2000 (CDMA2000) systems, Time Division-Synchronous Code Division Multiple Access (TD-SCDMA) systems, and Long Term Evolution (LTE) systems. The embodiments of this application do not limit the specific architecture of the wireless communication system.
Refer to Figure 6, a schematic architecture diagram of a smart home communication system according to an embodiment of this application. As shown in Figure 6, in a smart home scenario, various smart home products are connected through a wireless network so that data can be transferred between them. Figure 6 takes smart home products such as a smart TV, a smart air purifier, a smart water dispenser, a smart speaker, and a robot vacuum as examples; these smart home products are all connected to the same wireless network through a wireless router, enabling data exchange between the smart home products. Besides the smart home products exemplified above, other types of smart home products may also be included in practical applications, for example smart refrigerators, smart range hoods, and smart curtains; this embodiment does not limit the types of smart home products.
In addition, different smart home products may also be connected wirelessly to each other directly, without accessing the same wireless network through a wireless router; for example, the smart home products may be connected wirelessly via Bluetooth.
Besides the scenarios introduced in Figures 5 and 6 above, the methods provided in the embodiments of this application can also be applied to other communication system scenarios. For example, in a smart factory scenario, different devices (such as intelligent robots, machine tools, and transport vehicles) are connected through a wireless network and transfer data to one another through the wireless network. The embodiments of this application do not limit the specific scenario to which the data processing method applies.
The scenarios to which the methods provided in the embodiments of this application apply have been introduced above. For ease of understanding, the model training method performed in the model training phase is introduced first below, followed by the data processing method performed in the model inference phase.
Refer to Figure 7, a schematic flowchart of a model training method according to an embodiment of this application. As shown in Figure 7, the model training method includes the following steps 701 to 708.
Step 701: the central apparatus receives multiple pieces of capability information.
In this embodiment, the central apparatus is configured to obtain the capability information of each distributed apparatus and assign training tasks to the distributed apparatuses based on their capability information. The multiple pieces of capability information received by the central apparatus come from multiple different distributed apparatuses (for example, distributed apparatus 1 to distributed apparatus N shown in Figure 7). Moreover, each of the multiple pieces of capability information indicates the capabilities of a distributed apparatus that are relevant to model training. Simply put, each distributed apparatus collects its own model-training-related capabilities and reports them to the central apparatus by sending capability information.
Specifically, the central apparatus is, for example, the terminal device or network device introduced above, or an apparatus used to implement the functions of the terminal device or network device, for example a chip or chip system in a terminal device or network device. A distributed apparatus is, for example, the terminal device described above, or an apparatus used to implement the functions of the terminal device. In a specific example, the central apparatus may be the base station described above, and the distributed apparatuses may be terminal devices such as smart bands, smart watches, smart TVs, smartphones, or laptops.
Optionally, the model-training-related capabilities of a distributed apparatus may include one or more of its computing capability, storage capability, and communication capability, among others. The computing capability may be measured by the number of operations the distributed apparatus can perform per second; the storage capability by the storage space allocated to model training on the distributed apparatus; and the communication capability by the data transmission rate allocated to the model training process on the distributed apparatus. Besides these, the model-training-related capabilities of a distributed apparatus may include other capabilities that can affect model training, which is not specifically limited in this embodiment.
Step 702: the central apparatus sends corresponding training configuration information to the multiple different distributed apparatuses according to the multiple pieces of capability information.
After receiving the multiple pieces of capability information, the central apparatus knows the model training capability of each distributed apparatus participating in the model training. Based on the capability of each distributed apparatus to perform model training, the central apparatus can determine the training configuration information of each distributed apparatus for the whole model training phase, where the training configuration information refers to the specific training task a distributed apparatus is to perform in the model training phase.
For example, from the introduction to the diffusion model above, the process of denoising data with a diffusion model is in fact using the same machine learning model to process the data repeatedly, gradually reducing the noise in the data and thus obtaining high-quality data. The more denoising passes the diffusion model performs on the data, the higher the quality of the resulting data.
Simply put, the process of repeatedly processing data with a machine learning model can be regarded as a Markov chain, and performing one processing pass with the machine learning model can be regarded as one link in the chain. The central apparatus can split the Markov chain into multiple sub-chains and, according to the capabilities of the distributed apparatuses, assign the resulting sub-chains to different distributed apparatuses. After the central apparatus assigns the sub-chains to different distributed apparatuses, the different distributed apparatuses perform different training tasks. For example, distributed apparatus 1 performs training task 1, distributed apparatus 2 performs training task 2, and distributed apparatus 3 performs training task 3. Alternatively, some distributed apparatuses may perform the same training task; for example, distributed apparatus 1 performs training task 1, distributed apparatus 2 also performs training task 1, and distributed apparatus 3 performs training task 2.
For example, the training configuration information sent by the central apparatus indicates the number of times the machine learning model trained on a distributed apparatus processes its input data. Moreover, since each distributed apparatus is responsible for a different part of the training phase, the quality requirements on the input data of the machine learning models trained on the distributed apparatuses also differ; the training configuration information therefore also indicates the requirement on the input data of the machine learning model trained on the distributed apparatus.
For example, suppose the training task of the whole training phase is: process the data X_T that has undergone T noise-adding passes (the input data) T times through the machine learning model to obtain data X_0. The training task can then be divided into three different sub-tasks deployed on different distributed apparatuses. The first sub-task may be to process the data X_T, which has undergone T noise-adding passes, n times through the machine learning model to obtain data X_{T-n}; the second sub-task may be to process the data X_{T-m}, which has undergone T-m noise-adding passes, k times through the machine learning model to obtain data X_{T-m-k}; and the third sub-task may be to process the data X_w, which has undergone w noise-adding passes, w times through the machine learning model to obtain data X_0.
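One simple way a central apparatus could split the T-step chain according to capability is a proportional allocation. This is only an illustrative sketch under that assumption, not the allocation rule specified by this application.

```python
def split_chain(total_steps, capabilities):
    """Split total_steps processing steps among devices in proportion to their
    reported capabilities, so that the assigned sub-chains together cover the
    whole Markov chain exactly once."""
    total_cap = sum(capabilities)
    shares = [total_steps * c // total_cap for c in capabilities]
    # Assign any remainder from integer division to the most capable device.
    shares[capabilities.index(max(capabilities))] += total_steps - sum(shares)
    return shares

# T = 1000 steps split among three distributed apparatuses with capability 6:3:1.
assignments = split_chain(1000, [6, 3, 1])
```

The returned shares always sum to the total step count, so no link of the chain is duplicated or dropped.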
步骤703,聚合装置向多个分布式装置发送机器学习模型和目标数据样本。
本实施例中,聚合装置用于聚合各个分布式装置训练得到的模型,并向各个分布式装置反馈聚合后的模型。其中,聚合装置和前述的中心装置可以为同一个装置,也可以为不同的装置,本实施例对此不做具体限定。
示例性地,聚合装置例如为上述所介绍的终端设备或网络设备;或者,聚合装置为用于实现上述终端设备或网络设备的功能的装置,例如聚合装置为终端设备或网络设备中的芯片或芯片系统。在一个具体的示例中,中心装置可以为上述的基站,聚合装置可以为上述的基站或服务器,分布式装置可以为智能手环、智能手表、智慧电视、智能手机或笔记本电脑等终端设备。
可选的,聚合装置向多个分布式装置所发送的机器学习模型是相同的,以便于多个分布式装置对同一个机器学习模型执行模型训练。并且,聚合装置所发送的机器学习模型中的参数可以是随机初始化后所得到的初始参数。示例性地,聚合装置所发送的机器学习模型可以为扩散模型,该机器学习模型用于对数据进行降噪处理。
此外,聚合装置向多个分布式装置所发送的目标数据样本也可以是相同的,以便于不同的分布式装置根据训练配置信息生成相应的训练样本集合。其中,聚合装置向多个分布式装置所发送的目标数据样本可以为具有较高质量的数据,例如目标数据样本(例如图1中的X 0)为一个没有噪声的图像。分布式装置对机器学习模型进行训练的过程是:分布式装置先通过对目标数据样本进行加噪处理,得到质量较低的训练样本;然后,分布式装置将训练样本输入机器学习模型,通过机器学习模型对训练样本进行降噪处理,并基于机器学习模型的输出结果对机器学习模型进行训练。
可选的,聚合装置所发送的目标数据样本可以是由需要训练的机器学习模型的类型来确定的,本实施例对目标数据样本的具体类型并不做具体限定。
例如,在需要训练的机器学习模型为通信系统中收发机各模块的机器学习模型时,比如发送机机器学习模型、接收机机器学习模型、信道估计机器学习模型、信道压缩反馈机器学习模型、预编码机器学习模型、波束管理机器学习模型或定位机器学习模型,数据样本则可以为信道数据。
又例如,在需要训练的机器学习模型为图像处理模型时,比如图像分类模型、图像增强模型、图像压缩模型或图像检测模型,数据样本则可以为图像数据。
再例如,在需要训练的机器学习模型为语音处理模型时,比如语音识别模型或语音生成模型,数据样本则可以为语音数据。
需要说明的是,本实施例中对步骤702和703之间的执行顺序并不做具体限定,步骤703可以是在步骤702之前执行,或者步骤703与步骤702同时执行。
步骤704,分布式装置基于目标数据样本生成训练样本集合。
由于中心装置向分布式装置所发送的训练配置信息中指示了分布式装置上训练的机器学习模型对输入数据(例如图1中的X T)进行处理的次数以及输入数据的需求,因此分布式装置可以基于目标数据样本(例如图1中的X 0)来生成训练样本集合,以用于后续机器学习模型的训练。其中,分布式装置所生成的训练样本集合中的各个训练样本均满足训练配置信息中所指示的输入数据的需求。
可选的,分布式装置基于目标数据样本所生成的训练样本集合可以包括第一数据和第二数据,其中第一数据为机器学习模型在训练过程中的输入数据,第二数据则为第一数据的训练标签。其中,机器学习模型的输入数据的训练标签用于结合机器学习模型的输出结果来生成损失函数,以便于基于损失函数更新机器学习模型,完成机器学习模型的训练。以第一数据和第二数据为例,在机器学习模型的一轮迭代训练中,将第一数据输入机器学习模型,得到机器学习模型的输出结果;然后,通过计算机器学习模型的输出结果与第二数据(即输入数据的训练标签)之间的差异来构建损失函数;最后,基于损失函数的值来更新机器学习模型中的参数,从而完成机器学习模型的一轮迭代训练。
在第一数据和第二数据的生成过程中,分布式装置根据训练配置信息所指示的输入数据的需求以及机器学习模型对输入数据进行处理的次数,对目标数据样本进行处理(例如加噪处理),得到第二数据。然后,分布式装置根据训练配置信息所指示的机器学习模型对输入数据进行处理的次数,对第二数据进行处理(例如加噪处理),得到第一数据。
其中,输入数据的需求用于指示输入数据是什么样的数据,例如输入数据为对目标数据样本执行指定次数加噪处理后得到的数据。分布式装置对目标数据样本进行处理后,得到第二数据,且分布式装置对目标数据样本进行处理的次数为输入数据的需求中所指示的次数与机器学习模型对输入数据进行处理的次数之间的差值。
例如,假设输入数据的需求指示输入数据为对目标数据样本执行M次加噪处理所得到的数据,且机器学习模型对输入数据进行处理的次数为N次,那么分布式装置可以对目标数据样本X 0执行M-N次至M次加噪处理,得到数据{X M-N,X M-N+1,…,X M},将其中的数据X M-N作为第二数据,X M作为第一数据。即,第一数据(输入数据)为对目标数据样本X 0执行M次加噪处理后的数据X M;第二数据为对目标数据样本X 0执行M-N次加噪处理后的数据X M-N,即训练标签:在机器学习模型的一轮迭代训练中(如降噪处理的训练),将第一数据输入机器学习模型得到输出结果,训练标签用于与该输出结果进行差异比较,以确定损失函数来更新机器学习模型中的参数,从而完成机器学习模型的一轮迭代训练。
也就是说,在实际应用中,分布式装置是基于训练配置信息所指示的输入数据的需求以及机器学习模型对输入数据进行处理的次数,先对目标数据样本进行M-N次的加噪处理,得到第二数据后,再对第二数据进行N次的加噪处理,得到第一数据。
示例性地,假设分布式装置从聚合装置处获得的目标数据样本为数据X 0,训练配置信息指示分布式装置上训练的机器学习模型对输入数据进行处理的次数为5,且机器学习模型的输入数据的需求指示输入数据为对目标数据样本加噪15次后的数据。那么,分布式装置可以先对从聚合装置处获得的目标数据样本X 0执行10次至15次加噪处理,得到数据{X 10,X 11,X 12,X 13,X 14,X 15}。其中,数据X 10是指对目标数据样本X 0执行10次加噪处理所得到的数据(即第二数据);数据X 15是指对目标数据样本X 0执行15次加噪处理所得到的数据(即第一数据),数据X 15可以是对X 10进行5次加噪处理后得到的数据。然后,在训练过程中,将数据X 15作为机器学习模型的输入数据(即上述的第一数据),并将数据X 10作为输入数据的训练标签(即上述的第二数据)。可选的,分布式装置对数据样本进行加噪处理的方式具体可以是根据公式
q(X t|X t-1)=N(X t;√(1-β t)·X t-1,β t·I)
所示的条件概率分布来进行采样,从而获得输入数据。
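按上述条件概率分布逐次加噪、进而生成训练样本对(第一数据与第二数据)的过程,可以用如下示意代码勾勒。其中固定的 beta 取值、函数名以及将数据简化为数值列表的写法均为此处为说明而作的假设(实际扩散模型中β t通常随步数变化):

```python
import random


def add_noise_once(x, beta, rng):
    # 按 q(x_t | x_{t-1}) = N(sqrt(1-beta)*x_{t-1}, beta*I) 采样一次加噪
    return [((1.0 - beta) ** 0.5) * v + (beta ** 0.5) * rng.gauss(0.0, 1.0)
            for v in x]


def make_training_pair(x0, m, n, beta=0.02, seed=0):
    """对目标数据样本 x0 逐次加噪:先加噪 M-N 次得到第二数据(训练标签),
    再继续加噪 N 次得到第一数据(模型的输入数据)。"""
    rng = random.Random(seed)
    x = list(x0)
    for _ in range(m - n):        # 先得到 X_{M-N}
        x = add_noise_once(x, beta, rng)
    second = list(x)              # 第二数据:训练标签
    for _ in range(n):            # 再继续加噪 N 次得到 X_M
        x = add_noise_once(x, beta, rng)
    first = list(x)               # 第一数据:模型输入
    return first, second
```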
步骤705,分布式装置基于训练样本集合训练机器学习模型,得到训练后的机器学习模型。
示例性地,在生成训练样本集合后,分布式装置将训练样本集合中作为输入数据的第一数据输入至机器学习模型中,得到机器学习模型的输出数据;然后,分布式装置基于输入数据的训练标签(即第二数据)和输出数据计算损失函数,并基于损失函数的值对机器学习模型进行更新,得到训练后的机器学习模型。其中,机器学习模型基于损失函数进行 更新的过程具体可以参考上述的介绍,在此不再赘述。
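上述一轮迭代训练(输入第一数据、与第二数据构建损失、更新参数)的流程,可以用一个玩具化的线性"模型"y = w·x 示意如下。该模型形式、均方误差损失以及手工求梯度的写法均为此处为说明而作的简化假设:

```python
def train_step(w, first, second, lr=0.1):
    """一轮迭代训练的示意:将第一数据输入模型 y = w*x 得到输出数据,
    与第二数据(训练标签)构建均方误差损失,再用梯度下降更新参数 w。
    返回更新后的参数和本轮损失值。"""
    outputs = [w * x for x in first]
    # 损失函数:输出数据与训练标签之间的均方误差
    loss = sum((o - t) ** 2 for o, t in zip(outputs, second)) / len(first)
    # dL/dw = 2/N * sum((w*x - t) * x)
    grad = 2.0 * sum((o - t) * x
                     for o, x, t in zip(outputs, first, second)) / len(first)
    return w - lr * grad, loss
```

连续调用 train_step 即对应多轮迭代训练,可观察到损失值逐步下降。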
步骤706,分布式装置向聚合装置发送训练后的机器学习模型。
在各个分布式装置基于本装置上所生成的训练样本对机器学习模型进行训练,并得到训练后的机器学习模型后,各个分布式装置向聚合装置发送训练后的机器学习模型,以便于聚合装置对机器学习模型进行聚合。
步骤707,聚合装置聚合各个分布式装置发送的训练后的机器学习模型,得到聚合模型。
可选的,由于各个分布式装置所训练的机器学习模型是结构相同的模型,因此各个分布式装置训练得到的机器学习模型为结构相同但参数不同的模型。在聚合装置接收到分布式装置所发送的多个训练后的机器学习模型后,聚合装置可以对多个训练后的机器学习模型中的参数进行加权求和,得到新的参数,其中新的参数则为聚合模型的参数。也就是说,在聚合装置对各个分布式装置发送的训练后的机器学习模型进行聚合后,所得到的聚合模型的结构不发生变化,但聚合模型中的参数发生了变化,且聚合模型中的参数是基于各个分布式装置所发送的训练后的机器学习模型得到的。
需要说明的是,除了上述对多个训练后的机器学习模型中的参数进行加权求和,以得到聚合模型的方式之外,聚合装置还可以通过其他的方式来实现模型的聚合,本实施例对此不做具体限定。
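其中"对多个训练后的机器学习模型中的参数进行加权求和"的聚合方式,可以用如下示意代码勾勒。将每个模型的参数简化为一个数值列表、默认取等权平均,均为此处为说明而作的假设:

```python
def aggregate(models, weights=None):
    """聚合多个结构相同、参数不同的模型:对各模型同一位置的参数
    按权重求和,得到聚合模型的参数。models 为各模型的参数列表。"""
    if weights is None:
        weights = [1.0 / len(models)] * len(models)  # 默认等权平均
    assert abs(sum(weights) - 1.0) < 1e-9
    n_params = len(models[0])
    return [sum(w * m[i] for m, w in zip(models, weights))
            for i in range(n_params)]
```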
步骤708,聚合装置向各个分布式装置发送聚合模型。
在实现模型的聚合后,聚合装置则向各个分布式装置发送聚合模型,以便于各个分布式装置对聚合模型继续进行训练。
可以理解的是,以上步骤701-708介绍了分布式装置对机器学习模型进行第一轮迭代训练的过程。在实际应用中,分布式装置可能需要对机器学习模型进行多轮迭代训练。因此,在各个分布式装置接收到聚合装置发送的聚合模型后,各个分布式装置和聚合装置循环执行上述的步骤704-708,直至达到机器学习模型训练终止的条件。其中,机器学习模型训练终止的条件可以为分布式装置对机器学习模型进行迭代训练的轮数达到预设轮次数;或者,分布式装置训练得到的机器学习模型的性能达到了预设要求。
为了便于理解,以下将结合具体例子详细介绍本申请实施例提供的模型训练方法。如图8所示,图8中以基站为聚合装置,智能手机、智能手表、智能手环以及笔记本电脑等终端设备为多个分布式装置为例,对模型训练方法的过程进行详细介绍。如图8所示,模型训练方法的过程包括以下的四个阶段。
阶段1,基站向参与模型训练的多个终端设备发送目标数据样本X 0和机器学习模型。
在机器学习模型的第一轮迭代训练过程中,基站向多个终端设备发送的机器学习模型可以为随机初始化参数后的机器学习模型。在第N(N大于1)轮迭代训练过程中,基站向多个终端设备发送的机器学习模型则为聚合上一轮迭代训练得到的多个机器学习模型后的聚合模型。
阶段2,多个终端设备分别基于目标数据样本X 0生成训练样本,并基于训练样本训练 机器学习模型。
由于智能手机、智能手表、智能手环以及笔记本电脑等终端设备被分配了不同的训练内容,因此各个终端设备分别基于目标数据样本生成匹配训练内容的训练样本,并基于训练样本来训练基站所发送的机器学习模型。
示例性地,如图8中所示,智能手机中所训练的机器学习模型的输入数据的需求指示输入数据为对目标数据样本X 0进行T次加噪后所得到的数据;并且,智能手机中的机器学习模型需要对输入数据进行3次处理,得到输出数据。因此,在生成训练样本的过程中,智能手机对目标数据样本X 0执行T-3次至T次加噪处理,得到数据{X T-3,X T-2,X T-1,X T};基于数据{X T-3,X T-2,X T-1,X T},智能手机可以构建得到一组包括输入数据和训练标签的训练样本(X T,X T-3),其中训练样本中的数据X T为输入数据,数据X T-3为训练标签。在训练过程中,智能手机通过复用待训练的机器学习模型,得到由3个相同的待训练的机器学习模型依次连接构成的总模型,其中总模型中后一个机器学习模型的输入为前一个机器学习模型的输出;然后,智能手机将训练样本中的输入数据输入总模型中,得到总模型所输出的输出数据,以便于基于输出数据和训练样本中的训练标签构建用于更新机器学习模型的损失函数。
例如,对于训练样本(X T,X T-3),将数据X T输入总模型中,得到总模型的输出数据X T-3’,根据输出数据X T-3’和训练标签X T-3计算损失函数的值。然后,再基于计算得到的损失函数的值来更新机器学习模型的参数。
可选的,由于总模型是由3个相同的机器学习模型依次连接得到的,因此实际上总模型中的每个机器学习模型都会具有对应的输出数据。在训练机器学习模型时,除了基于总模型的输出数据(即总模型中第三个机器学习模型的输出数据)来构建损失函数之外,还可以是基于总模型中其他的机器学习模型的输出数据来一并构建损失函数。
例如,将数据X T输入总模型后,得到总模型中第一个机器学习模型所输出的输出数据X T-1’、总模型中第二个机器学习模型所输出的输出数据X T-2’、总模型中第三个机器学习模型所输出的输出数据X T-3’。然后,基于训练标签X T-1和输出数据X T-1’构建损失函数1,基于训练标签X T-2和输出数据X T-2’构建损失函数2,基于训练标签X T-3和输出数据X T-3’构建损失函数3。最后,基于损失函数1、损失函数2和损失函数3计算得到总损失函数,并基于总损失函数的值更新机器学习模型,从而得到训练后的机器学习模型。
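上述"复用同一个机器学习模型依次连接成总模型,并基于各级输出分别构建损失再合成总损失"的做法,可以用如下示意代码勾勒。此处仍以 y = w·x 作玩具模型、以各级损失直接求和作为总损失,均为为说明而作的假设:

```python
def total_model_losses(w, x_input, labels):
    """将同一个机器学习模型(玩具模型 y = w*x)复用多次依次连接成总模型:
    后一级模型的输入为前一级模型的输出。每一级的输出与其对应的训练标签
    构建一个平方误差损失,最后将各级损失求和得到总损失。
    labels 为按处理顺序排列的各级训练标签。"""
    losses = []
    x = x_input
    for label in labels:
        x = w * x                        # 第 i 级机器学习模型的输出
        losses.append((x - label) ** 2)  # 第 i 级输出对应的损失
    return sum(losses), losses
```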
类似地,对于智能手表而言,智能手表中所训练的机器学习模型的输入数据为X T-3,即对目标数据样本X 0进行T-3次加噪后所得到的数据;并且,智能手表中的机器学习模型需要对输入数据进行2次处理,得到输出数据。在生成训练样本的过程中,智能手表对目标数据样本X 0执行T-5次至T-3次加噪处理,得到数据{X T-5,X T-4,X T-3};基于数据{X T-5,X T-4,X T-3}获得多组训练样本,该多组训练样本中的每组训练样本包括输入数据和训练标签,该多组训练样本例如为(X T-4,X T-5)、(X T-3,X T-4)、(X T-3,X T-5)。参照智能手机基于训练数据对机器学习模型进行训练的过程,智能手表基于生成的训练数据对机器学习模型进行训练,同样能够得到训练后的机器学习模型。
对于智能手环而言,智能手环中所训练的机器学习模型的输入数据为X 4,即对目标数据样本X 0进行4次加噪后所得到的数据;并且,智能手环中的机器学习模型需要对输入数据进行1次处理,得到输出数据。在生成训练样本的过程中,智能手环对数据样本X 0执行3次至4次加噪处理,得到数据{X 3,X 4},从而构建得到训练样本(X 4,X 3)。智能手环基于生成的训练样本(X 4,X 3)对机器学习模型进行训练,得到训练后的机器学习模型。
对于笔记本电脑而言,笔记本电脑中所训练的机器学习模型的输入数据为X 3,即对目标数据样本X 0进行3次加噪后所得到的数据;并且,笔记本电脑中的机器学习模型需要对输入数据进行3次降噪处理,得到输出数据。在生成训练样本的过程中,笔记本电脑对数据样本X 0执行0次至3次加噪处理,得到数据{X 0,X 1,X 2,X 3},从而构建得到训练样本(X 3,X 0)。笔记本电脑基于生成的训练样本(X 3,X 0)对机器学习模型进行训练,得到训练后的机器学习模型。除了图8中所示出的智能手机、智能手表、智能手环以及笔记本电脑之外,还可以是由其他的终端设备基于数据X T-5与数据X 4之间的数据来负责训练机器学习模型,图8中并不一一示出。
可选的,在各个终端设备生成训练数据的过程中,终端设备之间可以相互发送已生成的训练数据,从而避免各个终端设备独立重复生成相同的训练数据,提高训练数据的生成效率。例如,对于智能手机而言,智能手机需要生成数据{X T-3,X T-2,X T-1,X T},而智能手表也需要生成数据X T-3,因此智能手机可以将所生成的数据X T-3发送给智能手表,从而使得智能手表免去生成数据X T-3的过程。
阶段3,多个终端设备分别向基站发送训练后的机器学习模型。
由于每个终端设备所负责的训练内容并不一样,且用于训练机器学习模型的训练数据也不一样,因此每个终端设备训练得到的机器学习模型往往也是不一样的。在各个终端设备结束一轮或多轮模型的迭代训练,得到训练后的机器学习模型之后,终端设备则向基站发送训练后的机器学习模型,以便于基站聚合各个训练后的机器学习模型。
阶段4,基站聚合多个训练后的机器学习模型,得到聚合模型。
在基站通过聚合各个终端设备所发送的训练后的机器学习模型并得到聚合模型之后,基站可以继续向各个终端设备发送聚合模型,以便于各个终端设备继续对聚合模型进行迭代训练。最终,在终端设备所训练的机器学习模型达到模型训练终止条件之后,终端设备不再对基站所发送的聚合模型进行训练,而是将基站最后一次所发送的聚合模型作为模型推理时所采用的模型,即将该聚合模型用于执行后续的数据处理任务。
以上介绍了多个分布式装置对机器学习模型进行联合训练的过程,以下将介绍多个分布式装置通过机器学习模型对数据进行联合处理的过程。其中,多个分布式装置通过机器学习模型对数据进行联合处理的场景有多种,以下将结合附图详细介绍多种场景下的数据联合处理过程。
需要说明的是,本实施例中所介绍的通过机器学习模型对数据进行联合处理的多个分布式装置可以是对机器学习模型进行联合训练的多个分布式装置,即多个分布式装置先联合训练得到机器学习模型后,再通过相同的机器学习模型对数据进行联合处理。或者,本实施例中多个分布式装置可以是通过预置的机器学习模型对数据进行联合处理,即多个分布式装置并没有执行联合训练机器学习模型的过程。简单来说,机器学习模型的联合训练过程和数据联合处理过程这两个过程可以是融合的,也可以是独立的,本实施例对此并不做具体限定。
请参阅图9A,图9A为本申请实施例提供的一种数据处理方法900的流程示意图。如图9A所示,数据处理方法900包括以下的步骤901-905。
步骤901,分布式装置1确定数据处理需求。
本实施例中,分布式装置1中的数据处理需求是对具有噪声的原始数据进行处理,以得到分布式装置1所需的目标数据。其中,目标数据的质量高于原始数据的质量,即目标数据中的噪声小于原始数据中的噪声。一般来说,分布式装置1期望得到的目标数据往往可以是用于执行其他的模型训练任务的数据。因此,分布式装置1可以根据其他模型训练任务针对于所需数据的需求,来确定数据处理需求,该数据处理需求用于指示对原始数据进行处理的程度。例如,分布式装置1期望得到质量较高的图像数据,以便于后续能够基于质量较高的图像数据来训练图像分类模型;因此,分布式装置1可以根据图像分类模型对输入数据的需求,来确定对原始图像数据进行处理的程度。
例如,在原始数据为信道数据的情况下,分布式装置1期望得到的目标数据可以是用于训练发送机机器学习模型、接收机机器学习模型或信道估计机器学习模型等模型的数据。又例如,在原始数据为图像数据的情况下,分布式装置1期望得到的目标数据可以是用于训练图像分类模型、图像增强模型、图像压缩模型或图像检测模型等模型的数据。因此,分布式装置1可以是根据采用目标数据来执行训练的模型的精度需求来确定目标数据的质量需求,进而基于目标数据与原始数据之间的质量差距来确定数据处理需求。
可选的,在采用机器学习模型(例如上述的扩散模型)对原始数据进行处理的情况下,数据处理需求可以为采用机器学习模型对原始数据进行处理的次数。例如,在所需数据的质量需求较高的情况下,分布式装置1可以确定数据处理需求为对原始数据依次处理10000次;又例如,在所需数据的质量需求不高的情况下,分布式装置1可以确定数据处理需求为对原始数据逐步处理1000次。
步骤902,分布式装置1通过机器学习模型处理原始数据,得到第一数据。
在确定数据处理需求后,分布式装置1基于本装置的数据处理能力,通过机器学习模型处理原始数据,得到第一数据。其中,分布式装置1的数据处理能力并不能满足分布式装置1的数据处理需求,因此分布式装置1所得到的第一数据也并非是分布式装置1所期望得到的目标数据。其中,分布式装置的数据处理能力可以是与分布式装置上的处理资源以及存储资源相关,本实施例对此不做具体限定。
示例性地,假设分布式装置1的数据处理需求为通过机器学习模型对原始数据逐步处理1000次,而分布式装置1的数据处理能力仅支持分布式装置1对原始数据逐步处理200次,分布式装置1则通过机器学习模型对原始数据逐步处理200次,得到第一数据。其中,第一数据还需要被处理800次才能够得到满足分布式装置1的数据处理需求的数据。
可选的,分布式装置1通过机器学习模型对原始数据逐步处理200次的过程可以是指:分布式装置1通过将机器学习模型进行复用,得到由200个机器学习模型依次连接而 成的总模型;然后,分布式装置1将原始数据输入至总模型中,由总模型中的200个机器学习模型依次处理数据,得到第一数据。其中,总模型中的任意一个机器学习模型的输入即为前一个机器学习模型的输出。
或者,分布式装置1通过机器学习模型对原始数据逐步处理200次的过程还可以是指:分布式装置1通过机器学习模型对原始数据进行一次处理,得到一次处理后的数据;然后,分布式装置1再将一次处理后的数据输入机器学习模型中,得到二次处理后的数据;其次,分布式装置1继续将二次处理后的数据输入机器学习模型中,得到三次处理后的数据,以此循环,直至分布式装置1通过机器学习模型对数据执行200次处理,得到第一数据。也就是说,分布式装置1是将机器学习模型每次对数据进行处理后所输出的数据作为下一次数据处理过程中机器学习模型的输入,从而实现基于同一个机器学习模型依次对数据进行多次处理。
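上述"将机器学习模型每次处理后输出的数据作为下一次处理的输入"的循环方式,可以用如下示意代码勾勒。其中 model 为任意可调用的处理函数,函数名为此处的假设:

```python
def process_repeatedly(model, data, times):
    """依次对数据执行 times 次处理:每次处理后的输出
    作为下一次处理时机器学习模型的输入。"""
    for _ in range(times):
        data = model(data)
    return data
```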
可选的,分布式装置1上用于处理原始数据的机器学习模型例如为上述的扩散模型。
步骤903,分布式装置1向分布式装置2发送第一数据。
由于分布式装置1的数据处理能力并不能够支持分布式装置1完成数据的处理,即分布式装置1处理得到的第一数据达不到分布式装置1的数据处理需求,因此分布式装置1向分布式装置2发送第一数据,以请求分布式装置2协助分布式装置1继续处理第一数据。
可选的,在分布式装置1向分布式装置2发送第一数据的同时,分布式装置1还可以向分布式装置2发送第一信息,该第一信息用于请求分布式装置2对第一数据进行处理。其中,第一信息还可以指示第一数据待处理的次数,即分布式装置2对第一数据进行处理的次数。例如,假设分布式装置1的数据处理需求为通过机器学习模型对原始数据逐步处理1000次,而分布式装置1仅对原始数据逐步处理了200次,因此分布式装置1可以在第一信息中指示第一数据的待处理次数为800次。可选的,第一数据的发送和第一信息的发送可以分开执行。
可选的,分布式装置1也可以是提前与分布式装置2进行协商,以使得分布式装置2能够确定从分布式装置1接收到的数据需要进行处理的次数。在这种情况下,分布式装置1只需要向分布式装置2发送第一数据,而不需要再向分布式装置2发送上述的第一信息。例如,在分布式装置1需要对大量数据进行逐步处理的情况下,分布式装置1提前与分布式装置2进行协商,以使得分布式装置2能够确定接收到的数据需要执行处理的次数。这样,分布式装置1每次对一个数据进行处理特定次数后,则将处理得到的数据发送给分布式装置2,以使得分布式装置2根据提前协商的内容对数据继续进行处理,从而避免分布式装置1反复通知分布式装置2所带来的信令开销。
在本实施例中,分布式装置1和分布式装置2例如为上述的终端设备,或者用于实现上述终端设备的功能的装置。在一个具体的示例中,分布式装置1可以为智能手表,分布式装置2可以为智能手机。
步骤904,分布式装置2通过机器学习模型处理第一数据,得到第二数据。
本实施例中,分布式装置2的数据处理能力能够支持分布式装置2协助分布式装置1 完成数据的处理,即分布式装置2处理得到的第二数据能够满足分布式装置1的数据处理需求,第二数据即为分布式装置1期望得到的数据。
示例性地,在分布式装置1通过第一信息指示分布式装置2对第一数据进行处理的次数的情况下,分布式装置2可以根据第一信息所指示的处理次数,通过机器学习模型对第一数据逐步进行多次处理,从而得到第二数据。例如,在分布式装置1在第一信息中指示分布式装置2对第一数据处理800次的情况下,分布式装置2则通过机器学习模型对第一数据逐步处理800次,从而得到第二数据。
需要说明的是,在本实施例中,分布式装置2的数据处理能力能够支持分布式装置2对第一数据完成分布式装置1所指定次数的处理。
其中,分布式装置2用于处理第一数据的机器学习模型可以是与分布式装置1中处理得到第一数据的机器学习模型相同,以便于保证分布式装置2对第一数据进行降噪处理的性能,确保第二数据能够满足分布式装置1的数据处理需求。
步骤905,分布式装置2向分布式装置1发送第二数据。
在处理得到第二数据后,由于第二数据能够满足分布式装置1的数据处理需求,因此分布式装置2向分布式装置1发送第二数据,从而完成协助分布式装置1处理数据。这样一来,分布式装置1在接收到第二数据后,则能够基于第二数据执行其他数据处理任务,例如基于第二数据执行其他模型的训练任务。
本方案中,通过在不同的装置上部署机器学习模型,由多个装置的设备联合完成数据的处理,以不断地提高所得到的数据的质量,减轻了每个装置的数据处理压力,保证计算能力较弱的装置也能够获得其所需质量的数据。
请参阅图9B,图9B为本申请实施例提供的一种数据处理方法900的另一流程示意图。如图9B所示,在另一个可能的实施例中,数据处理方法900可以是包括以下的步骤906-910。其中,步骤906-910与上述的步骤901-905并没有顺序关联,步骤906-910与上述的步骤901-905可以是独立的两套步骤。分布式装置1和分布式装置2可以是通过执行上述的步骤901-905来完成数据的联合处理;分布式装置1和分布式装置2也可以是通过执行上述的步骤906-910来完成数据的联合处理。
步骤906,分布式装置1确定数据处理需求。
本实施例中,步骤906与上述的步骤901类似,具体请参考上述的步骤901,在此不再赘述。
步骤907,分布式装置1向分布式装置2发送请求协助信息。
分布式装置1在确定数据处理需求后,分布式装置1可以判断本装置的数据处理能力是否能够满足数据处理需求。在分布式装置1确定本装置的数据处理能力无法满足数据处理需求的情况下,分布式装置1可以向分布式装置2发送请求协助信息,以请求分布式装置2协助分布式装置1进行数据的处理。
可选的,分布式装置1所发送的请求协助信息中可以指示原始数据的待处理次数,即分布式装置2对原始数据进行处理的次数。例如,假设分布式装置1的数据处理需求为通 过机器学习模型对原始数据逐步处理1000次,而分布式装置1的数据处理能力仅支持其对数据逐步处理200次,因此分布式装置1可以在请求协助信息中指示原始数据的待处理次数为800次,即分布式装置1指示分布式装置2对原始数据处理800次。
步骤908,分布式装置2通过机器学习模型处理原始数据,得到第一数据。
在接收到分布式装置1发送的请求协助信息后,分布式装置2则基于请求协助信息的指示,通过机器学习模型处理原始数据,得到第一数据。
例如,假设请求协助信息中指示原始数据的待处理次数为800次,分布式装置2则通过机器学习模型对原始数据进行处理800次,得到第一数据。
步骤909,分布式装置2向分布式装置1发送第一数据。
可选的,在分布式装置2向分布式装置1发送第一数据的同时,分布式装置2还可以向分布式装置1发送指示信息,该指示信息用于指示分布式装置2对原始数据进行处理的次数。例如,分布式装置2在通过机器学习模型对原始数据进行处理800次得到第一数据的情况下,分布式装置2所发送的指示信息则指示第一数据是对原始数据处理800次后得到的。可以理解,分布式装置2可以根据自身能力对原始数据处理的次数小于分布式装置1指示的待处理次数。
可选的,第一数据的发送和指示信息的发送可以分开执行。
步骤910,分布式装置1通过机器学习模型处理第一数据,得到第二数据。
在接收到第一数据后,分布式装置1可以基于本装置的数据处理需求,以及分布式装置2对原始数据进行处理的次数,确定分布式装置1对第一数据进行处理的次数,从而通过机器学习模型处理第一数据,得到第二数据。
例如,在分布式装置1通过请求协助信息指示分布式装置2对原始数据处理800次的情况下,分布式装置1可以根据自身对原始数据进行处理1000次的数据处理需求,确定还需要对接收到的第一数据处理200次。因此,分布式装置1通过机器学习模型继续对第一数据进行200次处理,得到第二数据。
在上述的数据处理方法900中,分布式装置1向分布式装置2发送第一数据或请求协助信息后,分布式装置2则能够协助分布式装置1完成数据的处理。然而,在一些场景下,例如分布式装置1的数据处理需求较高或者是分布式装置2的数据处理能力较低,分布式装置2可能难以独自协助分布式装置1完成数据的处理,因此分布式装置2还可以将本装置处理后的数据发送给其他的分布式装置,让其他的分布式装置继续协助完成数据的处理;可选的,分布式装置1还可以请求其他分布式装置继续协助处理,与请求分布式装置2协助处理类似,在此不再赘述。
请参阅图10A,图10A为本申请实施例提供的一种数据处理方法1000的流程示意图。如图10A所示,数据处理方法1000包括以下的步骤1001-1007。
步骤1001,分布式装置1确定数据处理需求。
步骤1002,分布式装置1通过机器学习模型处理原始数据,得到第一数据。
步骤1003,分布式装置1向分布式装置2发送第一数据。
本实施例中,步骤1001-1003与上述的步骤901-903类似,具体请参考上述的步骤901-903,在此不再赘述。
步骤1004,分布式装置2通过机器学习模型处理第一数据,得到第二数据。
在本实施例中,分布式装置2的数据处理能力并不能够支持分布式装置2协助分布式装置1完成数据的处理。例如,假设分布式装置1的数据处理需求为通过机器学习模型对原始数据逐步处理1000次,而分布式装置1仅对原始数据逐步处理了200次,因此分布式装置1可以指示分布式装置2对第一数据逐步处理800次;然而,分布式装置2的数据处理能力并不足以支持分布式装置2对第一数据逐步处理800次,分布式装置2可能只能够对第一数据逐步处理200次,以得到第二数据。
也就是说,分布式装置2通过机器学习模型处理第一数据后所得到的第二数据仍不满足分布式装置1的数据处理需求,即第二数据并不是分布式装置1期望得到的目标数据。
步骤1005,分布式装置2向分布式装置3发送第二数据。
由于分布式装置2处理第一数据后所得到的第二数据并非是分布式装置1所期望得到的目标数据,因此分布式装置2可以继续请求其他的分布式装置来协助完成数据的处理。
具体地,分布式装置2向分布式装置3发送第二数据,以使得分布式装置3继续对第二数据进行处理,以协助分布式装置1完成数据的处理。
可选的,在分布式装置2向分布式装置3发送第二数据的同时,分布式装置2还可以向分布式装置3发送第二信息,该第二信息用于指示分布式装置2所发送的第二数据待处理的次数。其中,第二数据待处理的次数可以是根据第一信息中所指示的第一数据待处理的次数以及分布式装置2实际对第一数据进行处理的次数计算得到的。可选的,第二数据的发送和第二信息的发送可以分开执行。
例如,假设分布式装置1的数据处理需求为通过机器学习模型对原始数据逐步处理1000次,而分布式装置1仅对原始数据逐步处理了200次,因此分布式装置1通过第一信息指示第一数据待处理的次数为800次;分布式装置2接收到第一数据和第一信息后,分布式装置2对第一数据执行了200次处理,得到第二数据;因此,分布式装置2可以向分布式装置3发送第二数据和第二信息,该第二信息用于指示第二数据待处理的次数为600次(800-200)。
步骤1006,分布式装置3-分布式装置N依次协助处理数据。
类似地,分布式装置3接收到分布式装置2所发送的第二数据后,继续对第二数据进行处理。并且,如果分布式装置3对第二数据进行处理的次数仍然小于第二数据的待处理次数时,分布式装置3则继续向分布式装置3的下一个分布式装置发送分布式装置3处理后的数据,以指示后续的分布式装置继续协助完成数据的处理,直至分布式装置N处理得到能够满足分布式装置1的数据处理需求的目标数据。
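上述多个分布式装置依次接力、每个装置按自身能力处理一部分并把剩余待处理次数交给下一跳的过程,可以用如下示意代码勾勒。其中以"能力"直接表示装置可处理的次数,为此处为说明而作的假设:

```python
def relay_process(remaining, capabilities):
    """多个分布式装置依次协助处理数据的示意:每个装置处理
    min(自身能力, 待处理次数) 次,并把剩余待处理次数传给下一跳。
    返回各装置实际处理的次数列表;要求各装置能力之和覆盖待处理次数。"""
    done = []
    for cap in capabilities:
        steps = min(cap, remaining)
        done.append(steps)
        remaining -= steps
        if remaining == 0:
            break
    assert remaining == 0, "现有装置的能力不足以完成全部处理"
    return done
```

例如 relay_process(800, [200, 300, 500]) 返回 [200, 300, 300]:第一个协助装置处理200次后,剩余600次由后续装置依次完成。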
在本实施例中,分布式装置3与分布式装置N可以为同一个分布式装置,也可以为不同的分布式装置。在图10A中,分布式装置3与分布式装置N被绘制为不同的分布式装置。
步骤1007,分布式装置N向分布式装置1发送目标数据。
在处理得到分布式装置1所需的目标数据后,由于该目标数据能够满足分布式装置1的数据处理需求,因此分布式装置N向分布式装置1发送目标数据,从而完成协助分布式装置1处理数据。这样一来,分布式装置1在接收到目标数据后,则能够基于目标数据执行其他数据处理任务,例如基于第二数据执行其他模型的训练任务。
可选的,分布式装置N可以是直接向分布式装置1发送目标数据;分布式装置N也可以是向分布式装置N的前一个分布式装置(即请求分布式装置N协助处理数据的分布式装置N-1)发送目标数据,以使得目标数据能够被逐跳发送至分布式装置1。
请参阅图10B,图10B为本申请实施例提供的一种数据处理方法1000的另一流程示意图。如图10B所示,在另一个可能的实施例中,数据处理方法1000可以是包括以下的步骤1008-1014。其中,步骤1008-1014与上述的步骤1001-1007并没有顺序关联,步骤1008-1014与上述的步骤1001-1007可以是独立的两套步骤。分布式装置1-分布式装置N可以是通过执行上述的步骤1001-1007来完成数据的联合处理;分布式装置1-分布式装置N也可以是通过执行上述的步骤1008-1014来完成数据的联合处理。
步骤1008,分布式装置1确定数据处理需求。
步骤1009,分布式装置1向分布式装置2发送请求协助信息。
本实施例中,步骤1008-1009与上述的步骤906-907类似,具体请参考上述的步骤906-907,在此不再赘述。
步骤1010,分布式装置2通过机器学习模型处理原始数据,得到中间数据。
与上述步骤908不同的是,分布式装置2的数据处理能力并不足以支持分布式装置2完成分布式装置1在请求协助信息中所指示的数据处理次数。因此,分布式装置2根据本装置的数据处理能力,通过机器学习模型处理原始数据,得到中间数据。其中,中间数据对应的数据处理次数小于分布式装置1在请求协助信息中所指示的数据处理次数。
例如,假设分布式装置1在请求协助信息中指示原始数据的待处理次数为800次,而分布式装置2的数据处理能力仅支持其对原始数据处理300次,那么分布式装置2通过机器学习模型对原始数据进行300次处理后所得到的中间数据并不能够满足分布式装置1的需求,即中间数据并不是分布式装置1期望得到的数据。
步骤1011,分布式装置2向分布式装置3发送中间数据。
由于分布式装置2处理原始数据后所得到的中间数据并非是分布式装置1所期望得到的数据,因此分布式装置2可以继续请求其他的分布式装置来协助完成数据的处理。
具体地,分布式装置2向分布式装置3发送中间数据,以使得分布式装置3继续对中间数据进行处理,以协助分布式装置1完成数据的处理。
可选的,在分布式装置2向分布式装置3发送中间数据的同时,分布式装置2还可以向分布式装置3发送指示信息,该指示信息用于指示分布式装置2所发送的中间数据待处理的次数。其中,中间数据待处理的次数可以是根据请求协助信息中所指示的原始数据待处理的次数以及分布式装置2实际对原始数据进行处理的次数计算得到的。可选的,中间数据的发送和指示信息的发送可以分开执行。
步骤1012,分布式装置3-分布式装置N依次协助处理数据,得到分布式装置1所需的第一数据。
类似地,分布式装置3接收到分布式装置2所发送的中间数据后,继续对中间数据进行处理。并且,如果分布式装置3对中间数据进行处理的次数仍然小于中间数据的待处理次数时,分布式装置3则继续向分布式装置3的下一个分布式装置发送分布式装置3处理后的数据,以指示后续的分布式装置继续协助完成数据的处理,直至分布式装置N处理得到能够满足分布式装置1的数据处理需求的第一数据。
步骤1013,分布式装置N向分布式装置1发送第一数据。
其中,第一数据是能够满足分布式装置1在请求协助信息中所指示的数据处理需求,因此分布式装置N向分布式装置1发送第一数据。
步骤1014,分布式装置1通过机器学习模型处理第一数据,得到第二数据。
以上的方法900和方法1000介绍了某个分布式装置请求其他的分布式装置协助完成数据处理的过程。在一些场景下,不同的分布式装置可能需要对相同类型的数据进行处理,且不同分布式装置的数据处理需求是不相同的。在这种情况下,可以采用中心装置来统筹各个分布式装置的数据处理需求,从而实现在不同分布式装置之间联合处理数据。
请参阅图11,图11为本申请实施例提供的一种数据处理方法1100的流程示意图。如图11所示,数据处理方法1100包括以下的步骤1101-1108。
步骤1101,多个分布式装置分别向中心装置发送数据处理需求。
在本实施例中,以多个分布式装置为3个为例,示例性的,多个分布式装置包括分布式装置1、分布式装置2和分布式装置3。多个分布式装置的所需数据是相同类型的数据,但不同的分布式装置对所需数据的质量需求是不同的,即不同的分布式装置对原始数据进行降噪处理的次数的需求是不相同的。
例如,假设多个分布式装置的所需数据均为图像数据,分布式装置1需要采用图像数据来训练图像分类模型,该图像分类模型对图像数据的质量要求并不高,因此分布式装置1的数据处理需求具体可以为对原始图像数据进行降噪处理1000次。
分布式装置2可以是需要图像数据来训练语义分割模型,该语义分割模型用于识别图像中的各个物体,因此语义分割模型对图像数据的质量要求较高;分布式装置2的数据处理需求具体可以为对原始图像数据进行降噪处理5000次。
此外,分布式装置3可以是需要图像数据来训练图像增强模型,该图像增强模型用于识别图像中的特定物体并增强识别到的特定物体的清晰度,因此图像增强模型对图像数据的质量要求最高。分布式装置3的数据处理需求具体可以为对原始图像数据进行降噪处理10000次。
步骤1102,中心装置确定各个分布式装置处理数据的顺序。
由于各个分布式装置的数据处理需求并不相同,中心装置可以根据各个分布式装置针对于数据的处理次数需求,确定各个分布式装置处理数据的顺序。示例性地,中心装置先确定各个分布式装置的数据处理需求中的数据处理次数,然后按照数据处理次数从小到大 的顺序来确定各个分布式装置处理数据的顺序。分布式装置的数据处理需求中的数据处理次数越小,则分布式装置处理数据的顺序越靠前;分布式装置的数据处理需求中的数据处理次数越大,则分布式装置处理数据的顺序越靠后。
例如,假设分布式装置1的数据处理需求为对原始图像数据进行降噪处理1000次,分布式装置2的数据处理需求为对原始图像数据进行降噪处理5000次,分布式装置3的数据处理需求为对原始图像数据进行降噪处理10000次,那么这三个分布式装置处理数据的顺序为:分布式装置1→分布式装置2→分布式装置3。
步骤1103,中心装置向各个分布式装置发送指示信息,以指示各个分布式装置处理数据的顺序。
在确定各个分布式装置处理数据的顺序之后,中心装置则向各个分布式装置发送指示信息,以指示各个分布式装置处理数据的顺序。这样,各个分布式装置在收到中心装置发送的指示信息之后,则能够确定从哪个分布式装置接收处理后的数据以及将本装置上处理后的数据发送给哪个分布式装置。
可选的,在各个分布式装置上处理数据的能力较为稳定的情况下,即各个分布式装置上分配给数据处理的计算资源和存储资源较为稳定时,中心装置还可以在向各个分布式装置发送的指示信息中指示各个分布式装置需要对数据进行处理的次数。
在一个可能的示例中,在分布式装置1、分布式装置2以及分布式装置3的数据处理需求分别为对原始图像数据进行降噪处理1000次、5000次以及10000次的情况下,中心装置向分布式装置1发送的指示信息1具体可以为:上一跳节点为空,本地处理数据次数为1000,下一跳节点为分布式装置2。即分布式装置1为开始处理数据的首个节点,且分布式装置1需要通过机器学习模型对数据处理1000次,并将处理后的数据发送给分布式装置2。此外,中心装置向分布式装置2发送的指示信息2具体可以为:上一跳节点为分布式装置1,本地处理数据次数为4000(即5000-1000),下一跳节点为分布式装置3。中心装置向分布式装置3发送的指示信息3具体可以为:上一跳节点为分布式装置2,本地处理数据次数为5000,下一跳节点为空。其中,在本示例中,各个分布式装置的数据处理能力均满足本装置的数据处理需求,因此任意一个分布式装置从其他分布式装置接收到已处理一定次数的数据后,均能够对已处理的数据继续进行处理,从而得到满足本装置数据处理需求的数据。
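在各装置数据处理能力均满足自身需求的情况下,中心装置"按需求次数从小到大排序、取相邻需求之差作为各跳本地处理次数"的做法,可以用如下示意代码勾勒(函数名为此处的假设):

```python
def hop_local_counts(demands):
    """中心装置按各装置所需的数据处理次数从小到大排序,
    并为每一跳计算本地处理次数:即相邻需求之间的差值。
    返回 (排序后的需求列表, 各跳本地处理次数列表)。"""
    ordered = sorted(demands)
    prev = 0
    counts = []
    for d in ordered:
        counts.append(d - prev)  # 本跳只需补足与上一跳需求的差值
        prev = d
    return ordered, counts
```

例如对需求 [5000, 1000, 10000],各跳本地处理次数为 [1000, 4000, 5000],与上例中指示信息1至指示信息3所指示的处理次数一致。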
在另外一些情况下,分布式装置的数据处理能力可能并不能够满足自身的数据处理需求,如果中心装置继续按照各个分布式装置的数据处理需求中数据处理次数的大小关系来确定分布式装置联合处理数据的方式,可能会导致部分分布式装置无法完成对数据的处理。因此,在本示例中,中心装置可以基于各个分布式装置的数据处理需求中数据处理次数的大小关系以及各个分布式装置的数据处理能力来确定分布式装置联合处理数据的方式。
示例性地,在分布式装置1、分布式装置2以及分布式装置3的数据处理需求分别为对原始图像数据进行降噪处理1000次、5000次以及10000次,且分布式装置1的数据处理能力支持其对数据进行1000次降噪处理,分布式装置2的数据处理能力支持其对数据进 行2000次降噪处理,以及分布式装置3的数据处理能力支持其对数据进行9000次降噪处理的情况下,中心装置向分布式装置1发送的指示信息1具体可以为:上一跳节点为空,本地处理数据次数为1000,下一跳节点为分布式装置2。中心装置向分布式装置2发送的指示信息2具体可以为:上一跳节点为分布式装置1,本地处理数据次数为2000,下一跳节点为分布式装置3;上一跳节点为分布式装置3,本地处理数据次数为0,下一跳节点为空。中心装置向分布式装置3发送的指示信息3具体可以为:上一跳节点为分布式装置2,本地处理数据次数为2000时下一跳节点为分布式装置2,本地处理数据次数为7000时下一跳节点为空。
步骤1104,分布式装置1通过机器学习模型对原始数据进行T1次处理,得到第一数据。
在接收到中心装置所发送的指示信息后,分布式装置1可以确定自身为第一个处理数据的装置,因此分布式装置1通过机器学习模型对原始数据进行T1次处理,得到第一数据。
可选的,在各个分布式装置上处理数据的能力较为稳定的情况下,中心装置可以通过指示信息指定各个分布式装置处理数据的次数。其中,分布式装置1对原始数据进行处理的次数可以是与分布式装置1的数据处理需求匹配。即,分布式装置1的数据处理需求为对原始数据进行T1次处理,且分布式装置1实际对原始数据进行处理的次数也为T1次。
可选的,在各个分布式装置上处理数据的能力有波动的情况下,中心装置并不指定各个分布式装置处理数据的次数。分布式装置1对原始数据进行处理的次数可以是不与分布式装置1的数据处理需求匹配。即,分布式装置1实际对原始数据进行处理的次数可以是大于或小于分布式装置1所需求的数据处理次数。例如,分布式装置1的数据处理需求为对原始数据进行N1次处理,而分布式装置1实际对原始数据进行处理的次数为T1次,其中N1可以是大于或小于T1。当分布式装置1上的计算资源以及存储资源较为充裕的情况下,分布式装置1实际对原始数据进行处理的次数T1可以是大于需求的数据处理次数N1;当分布式装置1上的计算资源或存储资源较为紧张的情况下,分布式装置1实际对原始数据进行处理的次数T1可以是小于需求的数据处理次数N1。
步骤1105,分布式装置1向分布式装置2发送第一数据。
本实施例中,在分布式装置1从中心装置处所接收的指示信息中,还指示了分布式装置1需要向分布式装置2发送处理后的数据。因此,分布式装置1在对原始数据进行处理并得到第一数据之后,分布式装置1向分布式装置2发送第一数据。
可选的,在中心装置并没有指示各个分布式装置需要对数据进行处理的次数的情况下,分布式装置1可以向分布式装置2发送信息,以指示分布式装置1对原始数据已进行处理的次数。
步骤1106,分布式装置2通过机器学习模型对第一数据进行T2次处理,得到第二数据。
在接收到中心装置所发送的指示信息后,分布式装置2可以确定自身需要从分布式装置1接收数据,并继续对接收到的数据进行处理,因此分布式装置2通过机器学习模型对 接收到的第一数据进行T2次处理,得到第二数据。其中,在分布式装置2对第一数据进行T2次处理所得到的第二数据能够满足分布式装置2的数据处理需求的情况下,第二数据则为分布式装置2所需要的数据。
可选的,在各个分布式装置上处理数据的能力较为稳定的情况下,中心装置可以通过指示信息指定分布式装置2处理数据的次数。分布式装置2根据中心装置的指示,对第一数据进行T2次处理后,即可得到分布式装置2所需的数据。例如,假设分布式装置2的数据处理需求为对原始数据进行T1+T2次处理,由于分布式装置2所接收到的第一数据是对原始数据进行T1次处理后所得到的数据,因此分布式装置2根据中心装置的指示对第一数据进行T2次处理后所得到的第二数据即为分布式装置2所需的数据。
可选的,在各个分布式装置上处理数据的能力有波动的情况下,分布式装置2对第一数据进行处理的次数可以是根据第一数据对应的数据处理次数以及分布式装置2的数据处理需求来确定的。例如,在分布式装置1向分布式装置2指示分布式装置1实际对原始数据进行处理的次数为T1次的情况下,分布式装置2可以根据自身的数据处理需求为对原始数据处理N2次,确定分布式装置2需要对第一数据进行处理的次数可以为N2-T1=T2次。
可选的,在分布式装置2的数据处理能力仅支持其对第一数据处理S1次(S1<T2),即分布式装置2不支持对第一数据处理T2次的情况下,分布式装置2可以是对第一数据处理S1次,得到第二数据。其中,该第二数据并非为分布式装置2所需的数据,分布式装置2可以向其他的分布式装置发送第二数据,以请求其他的分布式装置协助分布式装置2继续对第二数据进行处理。
又或者,在分布式装置2的数据处理能力支持其对第一数据处理S2次(S2>T2),即分布式装置2支持对第一数据进行处理的次数大于T2次的情况下,分布式装置2可以是对第一数据处理S2次,得到第二数据。其中,在分布式装置2可以对第一数据处理S2次的过程中,分布式装置2对第一数据处理的次数为T2次时所得到的数据即为分布式装置2所需的数据。
步骤1107,分布式装置2向分布式装置3发送第二数据。
本实施例中,在分布式装置2从中心装置处所接收的指示信息中,还指示了分布式装置2需要向分布式装置3发送处理后的数据。因此,分布式装置2在对第一数据进行处理并得到第二数据之后,分布式装置2向分布式装置3发送第二数据。
步骤1108,分布式装置3通过机器学习模型对第二数据进行T3次处理,得到第三数据。
在接收到中心装置所发送的指示信息后,分布式装置3可以确定自身需要从分布式装置2接收数据,并继续对接收到的数据进行处理,因此分布式装置3通过机器学习模型对第二数据进行T3次处理,得到第三数据。其中,在分布式装置3对第二数据进行T3次处理所得到的第三数据能够满足分布式装置3的数据处理需求的情况下,第三数据则为分布式装置3所需要的数据。
其中,分布式装置3在各种情况下对第二数据进行处理的过程与分布式装置2对第一数据进行处理的过程类似,具体可以参考上述步骤1106中分布式装置2对第一数据进行处 理的过程,在此不再赘述。
可以理解的是,以上方法1100是以三个分布式装置联合处理数据为例进行了说明,在实际应用中,可以是两个或两个以上的分布式装置基于上述的流程来联合处理数据,本实施例并不对联合处理数据的分布式装置的数量进行限定。
此外,以上方法1100介绍了在各个分布式装置的数据处理需求不相同时,如何统筹分布式装置联合处理数据。在一些特殊的场景下,部分分布式装置的数据处理需求可能会相同,中心装置可以根据分布式装置的数据处理能力来为这部分分布式装置分配数据处理任务。
请参阅图12,图12为本申请实施例提供的一种数据处理方法1200的流程示意图。如图12所示,数据处理方法1200包括以下的步骤1201-1207。
步骤1201,分布式装置1和分布式装置2分别向中心装置发送数据处理需求以及数据处理能力。
本实施例中,分布式装置1和分布式装置2的数据处理需求是相同的。例如,分布式装置1和分布式装置2的数据处理需求均是对原始数据进行降噪处理1000次。
此外,分布式装置1的数据处理能力与分布式装置2的数据处理能力可以是相同的,也可以是不相同的,本实施例对此不做具体限定。
步骤1202,中心装置确定各个分布式装置处理数据的顺序以及处理数据的次数。
本实施例中,由于分布式装置1和分布式装置2的数据处理需求是相同的,代表分布式装置1所需的数据和分布式装置2所需的数据是相同的,因此中心装置可以将处理数据的部分流程分配至分布式装置1中,并将处理数据的另一部分流程分配至分布式装置2中。
可选的,中心装置确定分布式装置处理数据的顺序的方式可以有多种。示例性地,中心装置可以随机确定分布式装置1和分布式装置2处理数据的顺序。或者,中心装置可以根据分布式装置1和分布式装置2需要处理的数据的来源来确定分布式装置1和分布式装置2处理数据的顺序。例如,假设分布式装置1和分布式装置2需要处理的数据来源于分布式装置0,且分布式装置1位于分布式装置0与分布式装置2之间,则中心装置可以确定分布式装置1先处理数据,然后由分布式装置2继续处理分布式装置1所处理得到的数据。又或者,中心装置可以根据分布式装置1和分布式装置2处理得到的数据的下一跳节点来确定分布式装置1和分布式装置2处理数据的顺序。例如,假设分布式装置1和分布式装置2最终处理得到的数据需要发送给分布式装置3,且分布式装置1位于分布式装置2与分布式装置3之间,则中心装置可以确定分布式装置2先处理数据,然后由分布式装置1继续处理分布式装置2所处理得到的数据,以便于分布式装置1能够以更快的速度将最终处理得到的数据发送给分布式装置3。
可选的,各个分布式装置处理数据的次数可以是由中心装置根据各个分布式装置的数据处理能力来决定的。分布式装置的数据处理能力越高,中心装置可以确定该分布式装置处理数据的次数越大;分布式装置的数据处理能力越低,中心装置可以确定该分布式装置处理数据的次数越小。
步骤1203,中心装置向各个分布式装置发送指示信息,以指示各个分布式装置处理数据的顺序以及次数。
示例性地,在分布式装置1的数据处理能力较低,而分布式装置2的数据处理能力较高的情况下,中心装置可以向分布式装置1发送指示信息1,以指示分布式装置1先处理数据200次并将处理得到的数据发送给分布式装置2。此外,中心装置还向分布式装置2发送指示信息2,以指示分布式装置2从分布式装置1接收数据并对接收到的数据进行800次处理,以及将处理后得到的数据发送给分布式装置1。
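当多个装置的数据处理需求相同时,按数据处理能力为各装置分配处理次数的一种可能做法是按能力成比例地切分总次数。按能力线性分配、余数由最后一个装置补齐,均为此处为说明而作的假设,实际分配策略由中心装置决定:

```python
def split_by_capability(total, capabilities):
    """数据处理需求相同的多个装置之间,按数据处理能力
    成比例地分配各装置的处理次数;余数由最后一个装置补齐,
    以保证各装置处理次数之和等于 total。"""
    cap_sum = sum(capabilities)
    counts = [total * c // cap_sum for c in capabilities]
    counts[-1] += total - sum(counts)
    return counts
```

例如 split_by_capability(1000, [200, 800]) 返回 [200, 800],与上例中分布式装置1处理200次、分布式装置2处理800次的分配一致。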
步骤1204,分布式装置1通过机器学习模型对待处理数据进行N1次处理,得到第一数据。
基于中心装置的指示,分布式装置1通过机器学习模型对待处理数据进行N1次处理。其中,待处理数据可以为预先存储于分布式装置1上的原始数据;待处理数据也可以是其他分布式装置发送给分布式装置1的已处理过的数据,本实施例并不对待处理数据进行具体限定。
步骤1205,分布式装置1向分布式装置2发送第一数据。
步骤1206,分布式装置2通过机器学习模型对第一数据进行N2次处理,得到第二数据。
本实施例中,分布式装置1和分布式装置2的数据处理需求为对待处理数据进行N1+N2次处理,因此在分布式装置2对第一数据进行N2次处理后,所得到的第二数据则为分布式装置1和分布式装置2所需的数据。此时,分布式装置2可以采用得到的第二数据来执行其他的任务,例如采用第二数据来训练图像处理模型。
步骤1207,分布式装置2向分布式装置1发送第二数据。
由于中心装置指示了分布式装置2需要向分布式装置1发送处理后的数据,因此分布式装置2在得到第二数据后向分布式装置1发送第二数据,以便于分布式装置1基于第二数据来执行其他的任务。
为了实现上述方法实施例,本申请还提供了一种通信装置。请参阅图13,本申请实施例提供了一种通信装置1300,该通信装置1300可以实现上述方法实施例中终端设备(或网络设备)的功能,因此也能实现上述方法实施例所具备的有益效果。在本申请实施例中,该通信装置1300可以是终端设备(或网络设备),也可以是终端设备(或网络设备)内部的集成电路或者元件等,例如芯片。下文实施例以该通信装置1300为终端设备或网络设备为例进行说明。
在一个可能的实施例中,该通信装置1300包括:收发模块1301和处理模块1302。
收发模块1301,用于接收来自于第二装置的第一数据,第一数据为经过第一机器学习模型处理后的数据;处理模块1302,用于通过第二机器学习模型对第一数据进行处理,得到第二数据,第一机器学习模型的结构与第二机器学习模型的结构相同,通信装置和第二装置用于联合执行数据的处理。
在一种可能的实现方式中,第二机器学习模型是扩散模型,第二机器学习模型用于对 第一数据进行降噪处理。
在一种可能的实现方式中,收发模块1301,还用于接收来自于第二装置的第一信息,第一信息用于请求通信装置对第一数据执行处理。
在一种可能的实现方式中,第一信息用于指示第一数据待处理的次数为第一次数;处理模块1302还用于通过第二机器学习模型对第一数据进行第一次数的处理,得到第二数据,其中第一装置的能力支持对第一数据完成第一次数的处理。
在一种可能的实现方式中,收发模块1301,还用于向第二装置发送第二数据;或者,收发模块1301用于向源装置发送第二数据,其中第一信息还用于指示源装置的信息,源装置为首个请求协助处理数据的装置。
在一种可能的实现方式中,第一信息用于指示第一数据待处理的次数为第一次数;处理模块1302还用于通过第二机器学习模型对第一数据进行第二次数的处理,得到第二数据,其中第一次数大于第二次数,第一装置的能力不支持对第一数据完成第一次数的处理;收发模块1301,还用于向第三装置发送第二数据以及第二信息;其中,第二信息用于指示第二数据待处理的次数为第三次数,第三次数为第一次数与第二次数的差值,第三装置用于协助通信装置执行数据的处理。
在一种可能的实现方式中,收发模块1301,还用于向第二装置发送请求协助信息,请求协助信息用于请求第二装置协助处理数据。
在一种可能的实现方式中,收发模块1301,还用于向中心装置发送第三信息,第三信息用于指示通信装置所需的数据的处理次数;收发模块1301,还用于接收中心装置的反馈信息,反馈信息用于指示第二装置为协助节点。
在一种可能的实现方式中,收发模块1301,还用于接收来自于中心装置的第四信息,第四信息用于指示通信装置从第二装置接收到的数据需执行的处理次数;处理模块1302,还用于根据第四信息,通过第二机器学习模型对第一数据进行处理,得到通信装置所需的第二数据。
在一种可能的实现方式中,第四信息还用于指示第三装置的信息,第三装置为待接收处理后的数据的装置;收发模块1301,还用于根据第四信息,向第三装置发送第二数据。
在一种可能的实现方式中,收发模块1301,还用于接收来自于第二装置的第五信息,第五信息用于指示第一数据对应的已处理次数;处理模块1302,还用于根据处理次数以及通信装置所需的数据的处理次数,通过第二机器学习模型对第一数据进行处理,得到通信装置所需的第二数据。
在另一个可能的实施例中,处理模块1302,用于通过第一机器学习模型对原始数据执行处理,得到第一数据;收发模块1301,用于向第二装置发送第一数据,以请求第二装置协助处理第一数据;收发模块1301,用于接收第二装置或其他装置发送的第二数据,第二数据是基于第二机器学习模型对第一数据执行处理得到的,第一机器学习模型的结构与第二机器学习模型的结构相同。
在一种可能的实现方式中,第一机器学习模型是扩散模型,第一机器学习模型用于对 原始数据进行降噪处理。
在一种可能的实现方式中,收发模块,还用于向第二装置发送第一信息,第一信息用于请求第二装置对第一数据执行处理,且第一信息还用于指示第一数据待处理的次数,第一数据待处理的次数是基于原始数据需执行的处理次数以及第一装置对原始数据执行处理的次数确定的。
在另一个可能的实施例中,收发模块1301,用于接收来自于第一装置的第一信息和第二装置的第二信息,第一信息用于指示第一装置所需的数据的第一处理次数,第二信息用于指示第二装置所需的数据的第二处理次数,第一处理次数对应的数据处理模型与第二处理次数对应的数据处理模型相同;收发模块1301,用于向第二装置发送第三信息,第三信息用于指示第二装置向第一装置发送执行处理后的数据,其中第二装置所需的数据的第二处理次数小于或等于第一装置所需的数据的第一处理次数。
在一种可能的实现方式中,收发模块1301,还用于向第一装置发送第四信息,第四信息用于指示第一装置从第二装置接收到的数据需执行的处理次数。
可选的,在上述的通信装置1300为终端设备或网络设备时,该通信装置1300中的收发模块1301可以为收发器,处理模块1302可以为处理器。在通信装置1300为终端设备或网络设备内部的集成电路或者元件等的情况下,例如通信装置1300为芯片时,该通信装置1300中的收发模块1301可以为芯片上的输出、输入管脚,处理模块1302可以为芯片上的运算部件。又例如,在通信装置1300为芯片系统时,该通信装置1300中的收发模块1301可以为芯片系统上的通信接口,处理模块1302可以为芯片系统上的处理核。
请参阅图14,本申请实施例提供了一种模型训练装置1400,该模型训练装置1400可以实现上述方法实施例中终端设备(或网络设备)的功能,因此也能实现上述方法实施例所具备的有益效果。在本申请实施例中,该模型训练装置1400可以是终端设备(或网络设备),也可以是终端设备(或网络设备)内部的集成电路或者元件等,例如芯片。下文实施例以该模型训练装置1400为终端设备或网络设备为例进行说明。
如图14所示,模型训练装置1400包括:收发模块1401和处理模块1402。收发模块1401,用于获取训练样本集合,训练样本集合包括第一数据和第二数据,第一数据是基于第二数据得到的,且第二数据为第一数据的训练标签;处理模块1402,用于基于训练样本集合对第一机器学习模型进行训练,得到训练后的第一机器学习模型,其中第一机器学习模型用于对第一数据进行处理;收发模块1401,用于向第二装置发送训练后的第一机器学习模型,第二装置是用于聚合由多个装置训练得到的结构相同且参数不同的机器学习模型的装置。
在一种可能的实现方式中,收发模块1401,还用于向第三装置发送第一信息,第一信息用于指示模型训练装置上与模型训练相关的能力,第三装置用于基于参与机器学习模型训练的多个装置的能力确定多个装置所负责的训练内容;收发模块1401,还用于接收来自 于第三装置的第二信息,第二信息用于指示模型训练装置上训练的第一机器学习模型对输入数据进行处理的次数,第二信息还用于指示第一机器学习模型的输入数据的需求;处理模块1402,还用于:根据第二信息所指示的输入数据的需求以及第一机器学习模型对输入数据进行处理的次数,对原始数据进行处理,得到第二数据;根据第二信息所指示的第一机器学习模型对输入数据进行处理的次数,对第二数据进行处理,得到第一数据。
在一种可能的实现方式中,收发模块1401,还用于接收来自于第二装置的第二机器学习模型;处理模块1402,还用于基于训练样本集合对第二机器学习模型进行训练,得到训练后的第二机器学习模型;收发模块1401,还用于向第二装置发送训练后的第二机器学习模型。
在另一个可能的实施例中,收发模块1401,用于接收多个能力信息,多个能力信息来自于多个不同的装置,且多个能力信息中的每个能力信息均用于指示装置上与模型训练相关的能力;发送模块1403,用于根据多个能力信息分别向多个不同的装置发送不同的训练配置信息,训练配置信息用于指示装置上训练的机器学习模型对输入数据进行处理的次数,训练配置信息还用于指示装置上训练的机器学习模型的输入数据的需求,多个不同的装置所训练的机器学习模型为结构相同的模型。
可选的,在上述的模型训练装置1400为终端设备或网络设备时,该模型训练装置1400中的收发模块1401可以为收发器,处理模块1402可以为处理器。在模型训练装置1400为终端设备或网络设备内部的集成电路或者元件等的情况下,例如模型训练装置1400为芯片时,该模型训练装置1400中的收发模块1401可以为芯片上的输出、输入管脚,处理模块1402可以为芯片上的运算部件。又例如,在模型训练装置1400为芯片系统时,该模型训练装置1400中的收发模块1401可以为芯片系统上的通信接口,处理模块1402可以为芯片系统上的处理核。
请参阅图15,为本申请提供的通信装置1500的另一种示意性结构图,通信装置1500至少包括输入输出接口1502。其中,通信装置1500可以为芯片或集成电路。
可选地,该通信装置还包括逻辑电路1501。
其中,图13所示收发模块1301可以为通信接口,该通信接口可以是图15中的输入输出接口1502,该输入输出接口1502可以包括输入接口和输出接口。或者,该通信接口也可以是收发电路,该收发电路可以包括输入接口电路和输出接口电路。
可选地,输入输出接口1502用于获取第一网络设备的AI模型信息;逻辑电路1501用于基于该第一网络设备的AI模型信息确定该第一网络设备的AI性能信息。其中,逻辑电路1501和输入输出接口1502还可以执行前述任一实施例中终端设备执行的其他步骤并实现对应的有益效果,此处不再赘述。
可选地,逻辑电路1501用于生成第一网络设备的AI模型信息;输入输出接口1502用于发送第一网络设备的AI模型信息。其中,逻辑电路1501和输入输出接口1502还可以执行任一实施例中网络设备执行的其他步骤并实现对应的有益效果,此处不再赘述。
在一种可能的实现方式中,图13所示的处理模块1302可以为图15中的逻辑电路1501。
可选地,逻辑电路1501可以是一个处理装置,处理装置的功能可以部分或全部通过软件实现。
可选地,处理装置可以包括存储器和处理器,其中,存储器用于存储计算机程序,处理器读取并执行存储器中存储的计算机程序,以执行任意一个方法实施例中的相应处理和/或步骤。
可选地,处理装置可以仅包括处理器。用于存储计算机程序的存储器位于处理装置之外,处理器通过电路/电线与存储器连接,以读取并执行存储器中存储的计算机程序。其中,存储器和处理器可以集成在一起,或者也可以是物理上互相独立的。
可选地,该处理装置可以是一个或多个芯片,或一个或多个集成电路。例如,处理装置可以是一个或多个现场可编程门阵列(field-programmable gate array,FPGA)、专用集成芯片(application specific integrated circuit,ASIC)、系统芯片(system on chip,SoC)、中央处理器(central processor unit,CPU)、网络处理器(network processor,NP)、数字信号处理电路(digital signal processor,DSP)、微控制器(micro controller unit,MCU),可编程控制器(programmable logic device,PLD)或其它集成芯片,或者上述芯片或者处理器的任意组合等。
请参阅图16,为本申请的实施例提供的上述实施例中所涉及的通信装置1600的结构示意图,该通信装置1600具体可以为上述实施例中的作为终端设备的通信装置,图16所示示例为该通信装置通过终端设备(或者终端设备中的部件)实现。
其中,该通信装置1600的一种可能的逻辑结构示意图,该通信装置1600可以包括但不限于至少一个处理器1601以及通信端口1602。
进一步可选地,该装置还可以包括存储器1603、总线1604中的至少一个,在本申请的实施例中,该至少一个处理器1601用于对通信装置1600的动作进行控制处理。
此外,处理器1601可以是中央处理器单元,通用处理器,数字信号处理器,专用集成电路,现场可编程门阵列或者其他可编程逻辑器件、晶体管逻辑器件、硬件部件或者其任意组合。其可以实现或执行结合本申请公开内容所描述的各种示例性的逻辑方框,模块和电路。该处理器也可以是实现计算功能的组合,例如包含一个或多个微处理器组合,数字信号处理器和微处理器的组合等等。所属领域的技术人员可以清楚地了解到,为描述的方便和简洁,上述描述的系统,装置和单元的具体工作过程,可以参考前述方法实施例中的对应过程,在此不再赘述。
需要说明的是,图16所示通信装置1600具体可以用于实现前述方法实施例中终端设备所实现的步骤,并实现终端设备对应的技术效果,图16所示通信装置的具体实现方式,均可以参考前述方法实施例中的叙述,此处不再一一赘述。
请参阅图17,为本申请的实施例提供的上述实施例中所涉及的通信装置1700的结构示意图,该通信装置1700具体可以为上述实施例中的作为网络设备的通信装置,图17所示示例为该通信装置通过网络设备(或者网络设备中的部件)实现。
通信装置1700包括至少一个处理器1711以及至少一个网络接口1714。进一步可选地,该通信装置还包括至少一个存储器1717、至少一个收发器1713和一个或多个天线1715。处理器1711、存储器1717、收发器1713和网络接口1714相连,例如通过总线相连,在本申请实施例中,该连接可包括各类接口、传输线或总线等,本实施例对此不做限定。天线1715与收发器1713相连。网络接口1714用于使得通信装置通过通信链路,与其它通信设备通信。例如网络接口1714可以包括通信装置与核心网设备之间的网络接口,例如S1接口,网络接口可以包括通信装置和其他通信装置(例如其他网络设备或者核心网设备)之间的网络接口,例如X2或者Xn接口。
处理器1711主要用于对通信协议以及通信数据进行处理,以及对整个通信装置进行控制,执行软件程序,处理软件程序的数据,例如用于支持通信装置执行实施例中所描述的动作。通信装置可以包括基带处理器和中央处理器,基带处理器主要用于对通信协议以及通信数据进行处理,中央处理器主要用于对整个终端设备进行控制,执行软件程序,处理软件程序的数据。图17中的处理器1711可以集成基带处理器和中央处理器的功能,本领域技术人员可以理解,基带处理器和中央处理器也可以是各自独立的处理器,通过总线等技术互联。本领域技术人员可以理解,终端设备可以包括多个基带处理器以适应不同的网络制式,终端设备可以包括多个中央处理器以增强其处理能力,终端设备的各个部件可以通过各种总线连接。该基带处理器也可以表述为基带处理电路或者基带处理芯片。该中央处理器也可以表述为中央处理电路或者中央处理芯片。对通信协议以及通信数据进行处理的功能可以内置在处理器中,也可以以软件程序的形式存储在存储器中,由处理器执行软件程序以实现基带处理功能。
存储器主要用于存储软件程序和数据。存储器1717可以是独立存在,与处理器1711相连。可选地,存储器1717可以和处理器1711集成在一起,例如集成在一个芯片之内。其中,存储器1717能够存储执行本申请实施例的技术方案的程序代码,并由处理器1711来控制执行,被执行的各类计算机程序代码也可被视为是处理器1711的驱动程序。
图17仅示出了一个存储器和一个处理器。在实际的终端设备中,可以存在多个处理器和多个存储器。存储器也可以称为存储介质或者存储设备等。存储器可以为与处理器处于同一芯片上的存储元件,即片内存储元件,或者为独立的存储元件,本申请实施例对此不做限定。
收发器1713可以用于支持通信装置与终端之间射频信号的接收或者发送,收发器1713可以与天线1715相连。收发器1713包括发射机Tx和接收机Rx。具体地,一个或多个天线1715可以接收射频信号,该收发器1713的接收机Rx用于从天线接收该射频信号,并将射频信号转换为数字基带信号或数字中频信号,并将该数字基带信号或数字中频信号提供给该处理器1711,以便处理器1711对该数字基带信号或数字中频信号做进一步的处理,例如解调处理和译码处理。此外,收发器1713中的发射机Tx还用于从处理器1711接收经过调制的数字基带信号或数字中频信号,并将该经过调制的数字基带信号或数字中频信号转换为射频信号,并通过一个或多个天线1715发送该射频信号。具体地,接收机Rx可以选择性地对射频信号进行一级或多级下混频处理和模数转换处理以得到数字基带信号或数字中频信号,该 下混频处理和模数转换处理的先后顺序是可调整的。发射机Tx可以选择性地对经过调制的数字基带信号或数字中频信号时进行一级或多级上混频处理和数模转换处理以得到射频信号,该上混频处理和数模转换处理的先后顺序是可调整的。数字基带信号和数字中频信号可以统称为数字信号。
收发器1713也可以称为收发单元、收发机、收发装置等。可选地,可以将收发单元中用于实现接收功能的器件视为接收单元,将收发单元中用于实现发送功能的器件视为发送单元,即收发单元包括接收单元和发送单元,接收单元也可以称为接收机、输入口、接收电路等,发送单元可以称为发射机、发射器或者发射电路等。
需要说明的是,图17所示通信装置1700具体可以用于实现前述方法实施例中网络设备所实现的步骤,并实现网络设备对应的技术效果,图17所示通信装置1700的具体实现方式,均可以参考前述方法实施例中的叙述,此处不再一一赘述。
本申请实施例还提供一种存储一个或多个计算机执行指令的计算机可读存储介质,当计算机执行指令被处理器执行时,该处理器执行如前述实施例中终端设备可能的实现方式所述的方法。
本申请实施例还提供一种存储一个或多个计算机执行指令的计算机可读存储介质,当计算机执行指令被处理器执行时,该处理器执行如前述实施例中网络设备可能的实现方式所述的方法。
本申请实施例还提供一种存储一个或多个计算机的计算机程序产品(或称计算机程序),当计算机程序产品被该处理器执行时,该处理器执行上述终端设备可能实现方式的方法。
本申请实施例还提供一种存储一个或多个计算机的计算机程序产品,当计算机程序产品被该处理器执行时,该处理器执行上述网络设备可能实现方式的方法。
本申请实施例还提供了一种芯片系统,该芯片系统包括至少一个处理器,用于支持通信装置实现上述通信装置可能的实现方式中所涉及的功能。可选地,所述芯片系统还包括接口电路,所述接口电路为所述至少一个处理器提供程序指令和/或数据。在一种可能的设计中,该芯片系统还可以包括存储器,存储器,用于保存该通信装置必要的程序指令和数据。该芯片系统,可以由芯片构成,也可以包含芯片和其他分立器件,其中,该通信装置具体可以为前述方法实施例中终端设备。
本申请实施例还提供了一种芯片系统,该芯片系统包括至少一个处理器,用于支持通信装置实现上述通信装置可能的实现方式中所涉及的功能。可选地,所述芯片系统还包括接口电路,所述接口电路为所述至少一个处理器提供程序指令和/或数据。在一种可能的设计中,芯片系统还可以包括存储器,存储器,用于保存该通信装置必要的程序指令和数据。该芯片系统,可以由芯片构成,也可以包含芯片和其他分立器件,其中,该通信装置具体可以为前述方法实施例中网络设备。
本申请实施例还提供了一种通信系统,该网络系统架构包括上述任一实施例中的终端设备和网络设备。
在本申请所提供的几个实施例中,应该理解到,所揭露的系统,装置和方法,可以通过其它的方式实现。例如,以上所描述的装置实施例仅仅是示意性的,例如,所述单元的划分,仅仅为一种逻辑功能划分,实际实现时可以有另外的划分方式,例如多个单元或组件可以结合或者可以集成到另一个系统,或一些特征可以忽略,或不执行。另一点,所显示或讨论的相互之间的耦合或直接耦合或通信连接可以是通过一些接口,装置或单元的间接耦合或通信连接,可以是电性,机械或其它的形式。
所述作为分离部件说明的单元可以是或者也可以不是物理上分开的,作为单元显示的部件可以是或者也可以不是物理单元,即可以位于一个地方,或者也可以分布到多个网络单元上。可以根据实际的需要选择其中的部分或者全部单元来实现本实施例方案的目的。
另外,在本申请各个实施例中的各功能单元可以集成在一个处理单元中,也可以是各个单元单独物理存在,也可以两个或两个以上单元集成在一个单元中。上述集成的单元既可以采用硬件的形式实现,也可以采用软件功能单元的形式实现。所述集成的单元如果以软件功能单元的形式实现并作为独立的产品销售或使用时,可以存储在一个计算机可读取存储介质中。基于这样的理解,本申请的技术方案本质上或者说对现有技术做出贡献的部分或者该技术方案的全部或部分可以以软件产品的形式体现出来,该计算机软件产品存储在一个存储介质中,包括若干指令用以使得一台计算机设备(可以是个人计算机,服务器,或者网络设备等)执行本申请各个实施例所述方法的全部或部分步骤。而前述的存储介质包括:U盘、移动硬盘、只读存储器(ROM,Read-Only Memory)、随机存取存储器(RAM,Random Access Memory)、磁碟或者光盘等各种可以存储程序代码的介质。

Claims (52)

  1. 一种数据处理方法,其特征在于,包括:
    第一装置接收来自于第二装置的第一数据,所述第一数据为经过第一机器学习模型处理后的数据;
    所述第一装置通过第二机器学习模型对所述第一数据进行处理,得到第二数据,所述第一机器学习模型的结构与所述第二机器学习模型的结构相同,所述第一装置和所述第二装置用于联合执行数据的处理。
  2. 根据权利要求1所述的方法,其特征在于,所述第二机器学习模型是扩散模型,所述第二机器学习模型用于对所述第一数据进行降噪处理。
  3. 根据权利要求1或2所述的方法,其特征在于,所述方法还包括:
    所述第一装置接收来自于所述第二装置的第一信息,所述第一信息用于请求所述第一装置对所述第一数据执行处理。
  4. 根据权利要求3所述的方法,其特征在于,所述第一信息用于指示所述第一数据待处理的次数为第一次数;
    所述第一装置通过第二机器学习模型对所述第一数据进行处理,得到第二数据,包括:
    所述第一装置通过所述第二机器学习模型对所述第一数据进行所述第一次数的处理,得到所述第二数据,其中所述第一装置的能力支持对所述第一数据完成所述第一次数的处理。
  5. 根据权利要求4所述的方法,其特征在于,所述方法还包括:
    所述第一装置向所述第二装置发送所述第二数据;
    或者,所述第一装置向源装置发送所述第二数据,其中所述第一信息还用于指示所述源装置的信息,所述源装置为首个请求协助处理数据的装置。
  6. 根据权利要求3所述的方法,其特征在于,所述第一信息用于指示所述第一数据待处理的次数为第一次数;
    所述第一装置通过第二机器学习模型对所述第一数据进行处理,得到第二数据,包括:
    所述第一装置通过所述第二机器学习模型对所述第一数据进行第二次数的处理,得到所述第二数据,其中所述第一次数大于所述第二次数,所述第一装置的能力不支持对所述第一数据完成所述第一次数的处理;
    所述方法还包括:
    所述第一装置向第三装置发送所述第二数据以及第二信息;
    其中,所述第二信息用于指示所述第二数据待处理的次数为第三次数,所述第三次数为所述第一次数与所述第二次数的差值,所述第三装置用于协助所述第一装置执行数据的 处理。
  7. 根据权利要求1或2所述的方法,其特征在于,所述方法还包括:
    所述第一装置向所述第二装置发送请求协助信息,所述请求协助信息用于请求所述第二装置协助处理数据。
  8. 根据权利要求1、2或7所述的方法,其特征在于,所述方法还包括:
    所述第一装置向中心装置发送第三信息,所述第三信息用于指示所述第一装置所需的数据的处理次数;
    所述第一装置接收所述中心装置的反馈信息,所述反馈信息用于指示所述第二装置为协助节点。
  9. 根据权利要求8所述的方法,其特征在于,所述方法还包括:
    所述第一装置接收来自于所述中心装置的第四信息,所述第四信息用于指示所述第一装置从所述第二装置接收到的数据需执行的处理次数;
    所述第一装置通过第二机器学习模型对所述第一数据进行处理,得到第二数据,包括:
    所述第一装置根据所述第四信息,通过所述第二机器学习模型对所述第一数据进行处理,得到所述第一装置所需的所述第二数据。
  10. 根据权利要求9所述的方法,其特征在于,所述第四信息还用于指示第三装置的信息,所述第三装置为待接收处理后的数据的装置;
    所述方法还包括:
    所述第一装置根据所述第四信息,向所述第三装置发送所述第二数据。
  11. 根据权利要求7或8所述的方法,其特征在于,所述方法还包括:
    所述第一装置接收来自于所述第二装置的第五信息,所述第五信息用于指示所述第一数据对应的已处理次数;
    所述第一装置根据所述处理次数以及所述第一装置所需的数据的处理次数,通过所述第二机器学习模型对所述第一数据进行处理,得到所述第一装置所需的所述第二数据。
  12. 一种数据处理方法,其特征在于,包括:
    第一装置通过第一机器学习模型对原始数据执行处理,得到第一数据;
    所述第一装置向第二装置发送所述第一数据;
    所述第一装置接收第二装置或其他装置发送的第二数据,所述第二数据是基于第二机器学习模型对所述第一数据处理得到的,所述第一机器学习模型的结构与所述第二机器学习模型的结构相同。
  13. 根据权利要求12所述的方法,其特征在于,所述第一机器学习模型是扩散模型,所述第一机器学习模型用于对所述原始数据进行降噪处理。
  14. 根据权利要求12或13所述的方法,其特征在于,所述方法还包括:
    所述第一装置向所述第二装置发送第一信息,所述第一信息用于请求所述第二装置对所述第一数据执行处理,和/或所述第一信息用于指示所述第一数据待处理的次数,所述第一数据待处理的次数是基于所述原始数据需执行的处理次数以及所述第一装置对所述原始数据执行处理的次数确定的。
  15. 一种数据处理方法,其特征在于,包括:
    中心装置接收来自于第一装置的第一信息和第二装置的第二信息,所述第一信息用于指示所述第一装置所需的数据的第一处理次数,所述第二信息用于指示所述第二装置所需的数据的第二处理次数,所述第一处理次数对应的数据处理模型与所述第二处理次数对应的数据处理模型相同;
    所述中心装置向所述第二装置发送第三信息,所述第三信息用于指示所述第二装置向所述第一装置发送执行处理后的数据,其中所述第二装置所需的数据的第二处理次数小于或等于所述第一装置所需的数据的第一处理次数。
  16. 根据权利要求15所述的方法,其特征在于,所述方法还包括:
    所述中心装置向所述第一装置发送第四信息,所述第四信息用于指示所述第一装置从所述第二装置接收到的数据需执行的处理次数。
  17. A model training method, comprising:
    a first apparatus obtaining a training sample set, the training sample set comprising first data and second data, the first data being obtained based on the second data, and the second data being the training label of the first data;
    the first apparatus training a first machine learning model based on the training sample set to obtain a trained first machine learning model, wherein the first machine learning model is used to process the first data;
    the first apparatus sending the trained first machine learning model to a second apparatus, the second apparatus being an apparatus configured to aggregate machine learning models that are trained by a plurality of apparatuses and have the same structure but different parameters.
  18. The method according to claim 17, wherein the method further comprises:
    the first apparatus sending first information to a third apparatus, the first information indicating a capability related to model training on the first apparatus, the third apparatus being configured to determine, based on the capabilities of a plurality of apparatuses participating in machine learning model training, the training content for which each of the plurality of apparatuses is responsible;
    the first apparatus receiving second information from the third apparatus, the second information indicating the number of times the first machine learning model trained on the first apparatus processes input data, the second information further indicating a requirement on the input data of the first machine learning model;
    the first apparatus processing original data according to the input data requirement indicated by the second information and the number of times the first machine learning model processes input data, to obtain the second data;
    the first apparatus processing the second data according to the number of times, indicated by the second information, that the first machine learning model processes input data, to obtain the first data.
  19. The method according to claim 17 or 18, wherein the method further comprises:
    the first apparatus receiving a second machine learning model from the second apparatus;
    the first apparatus training the second machine learning model based on the training sample set to obtain a trained second machine learning model;
    the first apparatus sending the trained second machine learning model to the second apparatus.
  20. A model training method, comprising:
    a first apparatus receiving a plurality of pieces of capability information from a plurality of different apparatuses, each of the plurality of pieces of capability information indicating a capability related to model training on an apparatus;
    the first apparatus sending, according to the plurality of pieces of capability information, different training configuration information to the plurality of different apparatuses respectively, the training configuration information indicating the number of times a machine learning model trained on an apparatus processes input data, the training configuration information further indicating a requirement on the input data of the machine learning model trained on the apparatus, wherein the machine learning models trained by the plurality of different apparatuses are models having the same structure.
  21. A communication apparatus, wherein the communication apparatus is a first apparatus, and the communication apparatus comprises:
    a transceiver module, configured to receive first data from a second apparatus, the first data being data processed by a first machine learning model;
    a processing module, configured to process the first data by using a second machine learning model to obtain second data, wherein the structure of the first machine learning model is the same as the structure of the second machine learning model, and the first apparatus and the second apparatus are configured to jointly perform data processing.
  22. The apparatus according to claim 21, wherein the second machine learning model is a diffusion model, and the second machine learning model is used to perform denoising processing on the first data.
  23. The apparatus according to claim 21 or 22, wherein the transceiver module is further configured to receive first information from the second apparatus, the first information being used to request the first apparatus to process the first data.
  24. The apparatus according to claim 23, wherein the first information indicates that the number of times the first data is to be processed is a first number of times;
    the processing module is further configured to process the first data the first number of times by using the second machine learning model to obtain the second data, wherein a capability of the first apparatus supports completing the processing of the first data the first number of times.
  25. The apparatus according to claim 24, further comprising:
    a sending module, configured to send the second data to the second apparatus;
    or, the sending module, configured to send the second data to a source apparatus, wherein the first information further indicates information about the source apparatus, and the source apparatus is the apparatus that first requests assistance in processing data.
  26. The apparatus according to claim 23, wherein the first information indicates that the number of times the first data is to be processed is a first number of times;
    the processing module is further configured to process the first data a second number of times by using the second machine learning model to obtain the second data, wherein the first number of times is greater than the second number of times, and a capability of the first apparatus does not support completing the processing of the first data the first number of times;
    the transceiver module is further configured to send the second data and second information to a third apparatus;
    wherein the second information indicates that the number of times the second data is to be processed is a third number of times, the third number of times is the difference between the first number of times and the second number of times, and the third apparatus is configured to assist the first apparatus in processing data.
  27. The apparatus according to claim 21 or 22, wherein
    the transceiver module is further configured to send assistance request information to the second apparatus, the assistance request information being used to request the second apparatus to assist in processing data.
  28. The apparatus according to claim 21, 22 or 27, wherein the transceiver module is further configured to:
    send third information to a central apparatus, the third information indicating the number of processing times of the data required by the first apparatus; and
    receive feedback information from the central apparatus, the feedback information indicating that the second apparatus is an assisting node.
  29. The apparatus according to claim 28, wherein
    the transceiver module is further configured to receive fourth information from the central apparatus, the fourth information indicating the number of processing times to be performed on the data that the first apparatus receives from the second apparatus; and
    the processing module is further configured to process the first data by using the second machine learning model according to the fourth information, to obtain the second data required by the first apparatus.
  30. The apparatus according to claim 29, wherein the fourth information further indicates information about a third apparatus, the third apparatus being an apparatus that is to receive the processed data; and
    the transceiver module is further configured to send the second data to the third apparatus according to the fourth information.
  31. The apparatus according to claim 27 or 28, wherein
    the transceiver module is further configured to receive fifth information from the second apparatus, the fifth information indicating the number of times the first data has already been processed; and
    the processing module is further configured to process the first data by using the second machine learning model according to the already-performed number of processing times and the number of processing times of the data required by the first apparatus, to obtain the second data required by the first apparatus.
  32. The apparatus according to any one of claims 21 to 31, wherein the transceiver module is a transceiver and the processing module is a processor.
  33. A communication apparatus, wherein the communication apparatus is a first apparatus and comprises:
    a processing module, configured to process original data by using a first machine learning model to obtain first data;
    a transceiver module, configured to send the first data and first information to a second apparatus;
    the transceiver module being further configured to receive second data sent by the second apparatus or another apparatus, the second data being obtained by processing the first data based on a second machine learning model, wherein the structure of the first machine learning model is the same as the structure of the second machine learning model.
  34. The apparatus according to claim 33, wherein the first machine learning model is a diffusion model, and the first machine learning model is used to perform denoising processing on the original data.
  35. The apparatus according to claim 33 or 34, wherein
    the transceiver module is further configured to send first information to the second apparatus, the first information being used to request the second apparatus to process the first data, and/or the first information indicating the number of times the first data is to be processed, wherein the number of times the first data is to be processed is determined based on the number of processing times to be performed on the original data and the number of times the first apparatus has processed the original data.
  36. The apparatus according to any one of claims 33 to 35, wherein the transceiver module is a transceiver and the processing module is a processor.
  37. A communication apparatus, comprising:
    a transceiver module, configured to receive first information from a first apparatus and second information from a second apparatus, the first information indicating a first number of processing times of the data required by the first apparatus, the second information indicating a second number of processing times of the data required by the second apparatus, wherein the data processing model corresponding to the first number of processing times is the same as the data processing model corresponding to the second number of processing times;
    the transceiver module being further configured to send third information to the second apparatus, the third information instructing the second apparatus to send processed data to the first apparatus, wherein the second number of processing times of the data required by the second apparatus is less than or equal to the first number of processing times of the data required by the first apparatus.
  38. The apparatus according to claim 37, wherein the transceiver module is further configured to send fourth information to the first apparatus, the fourth information indicating the number of processing times to be performed on the data that the first apparatus receives from the second apparatus.
  39. The apparatus according to claim 37 or 38, wherein the transceiver module is a transceiver.
  40. A model training apparatus, comprising:
    a transceiver module, configured to obtain a training sample set, the training sample set comprising first data and second data, the first data being obtained based on the second data, and the second data being the training label of the first data;
    a processing module, configured to train a first machine learning model based on the training sample set to obtain a trained first machine learning model, wherein the first machine learning model is used to process the first data;
    the transceiver module being further configured to send the trained first machine learning model to a second apparatus, the second apparatus being an apparatus configured to aggregate machine learning models that are trained by a plurality of apparatuses and have the same structure but different parameters.
  41. The apparatus according to claim 40, wherein
    the transceiver module is further configured to send first information to a third apparatus, the first information indicating a capability related to model training on the first apparatus, the third apparatus being configured to determine, based on the capabilities of a plurality of apparatuses participating in machine learning model training, the training content for which each of the plurality of apparatuses is responsible;
    the transceiver module is further configured to receive second information from the third apparatus, the second information indicating the number of times the first machine learning model trained on the first apparatus processes input data, the second information further indicating a requirement on the input data of the first machine learning model;
    the processing module is further configured to process original data according to the input data requirement indicated by the second information and the number of times the first machine learning model processes input data, to obtain the second data;
    the processing module is further configured to process the second data according to the number of times, indicated by the second information, that the first machine learning model processes input data, to obtain the first data.
  42. The apparatus according to claim 40 or 41, wherein
    the transceiver module is further configured to receive a second machine learning model from the second apparatus;
    the processing module is further configured to train the second machine learning model based on the training sample set to obtain a trained second machine learning model;
    the transceiver module is further configured to send the trained second machine learning model to the second apparatus.
  43. The apparatus according to any one of claims 40 to 42, wherein the transceiver module is a transceiver and the processing module is a processor.
  44. A model training apparatus, comprising:
    a transceiver module, configured to receive a plurality of pieces of capability information from a plurality of different apparatuses, each of the plurality of pieces of capability information indicating a capability related to model training on an apparatus;
    the transceiver module being further configured to send, according to the plurality of pieces of capability information, different training configuration information to the plurality of different apparatuses respectively, the training configuration information indicating the number of times a machine learning model trained on an apparatus processes input data, the training configuration information further indicating a requirement on the input data of the machine learning model trained on the apparatus, wherein the machine learning models trained by the plurality of different apparatuses are models having the same structure.
  45. The apparatus according to claim 44, wherein the transceiver module is a transceiver.
  46. A communication apparatus, comprising at least one processor, the at least one processor being coupled to a memory, wherein
    the memory is configured to store a program or instructions; and
    the at least one processor is configured to execute the program or instructions, so that the apparatus implements the method according to any one of claims 1 to 11; or implements the method according to any one of claims 12 to 14; or implements the method according to claim 15 or 16; or implements the method according to any one of claims 17 to 19; or implements the method according to claim 20.
  47. A communication apparatus, comprising at least one processor, wherein
    the at least one processor is configured to execute a program or instructions, so that the apparatus implements the method according to any one of claims 1 to 11; or implements the method according to any one of claims 12 to 14; or implements the method according to claim 15 or 16; or implements the method according to any one of claims 17 to 19; or implements the method according to claim 20.
  48. A communication system, comprising: the communication apparatus according to any one of claims 21 to 32 and the communication apparatus according to any one of claims 33 to 36.
  49. The system according to claim 48, wherein the system further comprises: the communication apparatus according to any one of claims 37 to 39.
  50. A communication system, comprising: the model training apparatus according to any one of claims 40 to 43 and the model training apparatus according to claim 44 or 45.
  51. A computer-readable storage medium, wherein the readable storage medium stores instructions, and when the instructions are executed by a computer, the method according to any one of claims 1 to 11 is performed; or the method according to any one of claims 12 to 14 is performed; or the method according to claim 15 or 16 is performed; or the method according to any one of claims 17 to 19 is performed; or the method according to claim 20 is performed.
  52. A computer program product, wherein the program product comprises instructions, and when the instructions are run on a computer, the method according to any one of claims 1 to 11 is performed; or the method according to any one of claims 12 to 14 is performed; or the method according to claim 15 or 16 is performed; or the method according to any one of claims 17 to 19 is performed; or the method according to claim 20 is performed.
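The cooperative hand-off recited in the claims above (a requesting apparatus runs as many processing passes of a shared-structure model as its capability supports, then forwards the intermediate data together with the remaining count to an assisting apparatus) can be sketched as a toy simulation. This is an illustrative sketch only, not part of the application: the names `Device` and `denoise_step`, and the halving update standing in for one reverse step of a diffusion model, are all hypothetical.

```python
# Toy sketch of the claimed cooperative processing chain. Each device runs
# min(capability, remaining) passes of the shared model, then forwards the
# intermediate data and the leftover count (the "third number of times" of
# claim 6) to its assisting device.
from dataclasses import dataclass


def denoise_step(data: float) -> float:
    """Stand-in for one pass of the shared machine learning model
    (e.g. one reverse step of a diffusion model); halving the value
    makes the progress of the chain observable."""
    return data / 2.0


@dataclass
class Device:
    name: str
    capability: int                    # max passes this device can run
    next_hop: "Device | None" = None   # assisting device, if any

    def process(self, data: float, remaining: int) -> float:
        # Run as many of the requested passes as capability supports.
        done = min(self.capability, remaining)
        for _ in range(done):
            data = denoise_step(data)
        remaining -= done
        if remaining > 0:
            # Forward the partially processed data plus the leftover count.
            if self.next_hop is None:
                raise RuntimeError(
                    f"{self.name}: no assistant for the remaining {remaining} passes"
                )
            return self.next_hop.process(data, remaining)
        return data


# A chain of three devices jointly completing 4 passes on a noisy value.
chain = Device("A", 2, Device("B", 1, Device("C", 1)))
result = chain.process(16.0, 4)  # 16 -> 8 -> 4 at A, -> 2 at B, -> 1 at C
```

A central apparatus (claims 15 and 16) would play the role of wiring up `next_hop` for each device; here the chain is built by hand for brevity.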
PCT/CN2022/115466 2022-08-29 2022-08-29 Data processing method, training method and related apparatus WO2024044881A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/CN2022/115466 WO2024044881A1 (zh) Data processing method, training method and related apparatus

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2022/115466 WO2024044881A1 (zh) Data processing method, training method and related apparatus

Publications (1)

Publication Number Publication Date
WO2024044881A1 true WO2024044881A1 (zh) 2024-03-07

Family

Family ID: 90100092

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/115466 WO2024044881A1 (zh) 2022-08-29 2022-08-29 一种数据处理方法、训练方法及相关装置

Country Status (1)

Country Link
WO (1) WO2024044881A1 (zh)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109711556A (zh) * 2018-12-24 2019-05-03 China Southern Power Grid Co., Ltd. Machine-inspection data processing method and apparatus, network-level server and provincial-level server
CN111276120A (zh) * 2020-01-21 2020-06-12 Huawei Technologies Co., Ltd. Speech synthesis method and apparatus, and computer-readable storage medium
CN111598139A (zh) * 2020-04-24 2020-08-28 Beijing QIYI Century Science & Technology Co., Ltd. Data processing method and system
CN113436208A (zh) * 2021-06-30 2021-09-24 Industrial and Commercial Bank of China Image processing method, apparatus, device and medium based on device-edge-cloud collaboration
CN113657613A (zh) * 2021-08-23 2021-11-16 Beijing Yizhen Xuesi Education Technology Co., Ltd. Prediction model training method, data processing apparatus and system
WO2021232832A1 (zh) * 2020-05-19 2021-11-25 Huawei Technologies Co., Ltd. Data processing method, federated learning training method, and related apparatus and device



Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application

Ref document number: 22956717

Country of ref document: EP

Kind code of ref document: A1