WO2023280176A1 - Model training method and related apparatus - Google Patents

Model training method and related apparatus

Info

Publication number
WO2023280176A1
Authority
WO
WIPO (PCT)
Prior art keywords
communication device
data
machine learning
learning model
control layer
Prior art date
Application number
PCT/CN2022/103985
Other languages
English (en)
French (fr)
Inventor
胡斌
王坚
徐晨
张公正
李榕
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司
Priority to EP22836925.2A priority Critical patent/EP4358446A4/en
Publication of WO2023280176A1 publication Critical patent/WO2023280176A1/zh
Priority to US18/405,019 priority patent/US20240152766A1/en

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/096 Transfer learning
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L1/00 Arrangements for detecting or preventing errors in the information received
    • H04L1/0001 Systems modifying transmission characteristics according to link quality, e.g. power backoff
    • H04L1/0009 Systems modifying transmission characteristics according to link quality, e.g. power backoff by adapting the channel coding
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G06N3/0455 Auto-encoder networks; Encoder-decoder networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/084 Backpropagation, e.g. using gradient descent
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N7/00 Computing arrangements based on specific mathematical models
    • G06N7/01 Probabilistic graphical models, e.g. probabilistic networks
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L1/00 Arrangements for detecting or preventing errors in the information received
    • H04L1/0001 Systems modifying transmission characteristics according to link quality, e.g. power backoff
    • H04L1/0006 Systems modifying transmission characteristics according to link quality, e.g. power backoff by adapting the transmission format
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L1/00 Arrangements for detecting or preventing errors in the information received
    • H04L1/0001 Systems modifying transmission characteristics according to link quality, e.g. power backoff
    • H04L1/0023 Systems modifying transmission characteristics according to link quality, e.g. power backoff characterised by the signalling
    • H04L1/0026 Transmission of channel quality indication
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/0464 Convolutional networks [CNN, ConvNet]

Definitions

  • the present application relates to the communication field, and in particular to a model training method and related devices.
  • each sub-module needs to be optimized individually.
  • this method introduces additional interference effects, such as amplifier distortion and channel impairment, and each module has its own control factors and parameters, which makes end-to-end optimization highly complex.
  • both the sending end and the receiving end can process communication signals through machine learning models such as autoencoders.
  • the end-to-end communication quality can be improved.
  • the signal sent by the sending end needs to pass through the channel to reach the receiving end, and the channel will interfere with the communication signal, which increases the difficulty of autoencoder training.
  • the interference that the channel imposes on the communication signal is generally difficult to model, which increases the training difficulty of the autoencoder and affects the end-to-end communication quality.
  • This application provides a model training method and related devices which, when the channel is not modeled, help increase the feasibility of training machine learning models, improve the convergence speed of training, and optimize the robustness of the machine learning models, thereby improving the end-to-end communication quality.
  • the present application provides a model training method, which can be applied to a communication system including a first communication device and a second communication device, where there is at least one first communication device, and a first machine learning model is deployed on the first communication device. The method includes: the first communication device sends first data to the second communication device through a channel, where the first data is the output obtained by inputting first training data into the first machine learning model, the first machine learning model includes a control layer, and the control layer is at least one layer of the first machine learning model; the first communication device receives a second loss function through a feedback channel, where the feedback channel is determined according to an observation error, and the second loss function is obtained after a first loss function sent by the second communication device is transmitted through the feedback channel; the first communication device updates the parameters of the control layer based on Kalman filtering according to the second loss function to obtain updated parameters of the control layer, where the updated parameters of the control layer are used to update the parameters of the first machine learning model.
  • control layer is the last layer of the first machine learning model.
  • control layer is at least one layer of network selected in the first machine learning model in the embodiment of the present application.
  • the control layer is only an example name; other names with the same characteristics also fall within the scope of protection of the embodiments of the present application.
  • the above-mentioned second loss function may be cross entropy or minimum mean square error or the like.
  • the above type of Kalman filtering may be cubature Kalman filtering, extended Kalman filtering, or the like, and the embodiments of the present application do not limit the type of Kalman filtering.
  • when the channel is not modeled, the first communication device can receive the second loss function over a power-controlled channel, so that the first communication device can use Kalman filtering to update the parameters of the control layer of the first machine learning model.
  • the accuracy of model training can still be guaranteed, the impact of channel errors on model training can be reduced, the feasibility of training machine learning models for end-to-end communication can be increased, the convergence speed of training can be improved, and the robustness of the machine learning models can be optimized, thereby improving the quality of end-to-end communication.
  • updating the parameters of the control layer based on the above-mentioned Kalman filtering to obtain the updated parameters of the control layer includes: the first communication device obtains the Kalman gain according to the prior parameters of the control layer, the second loss function, and the error covariance of the second loss function; the first communication device updates the parameters of the control layer according to the Kalman gain to obtain the updated parameters of the control layer.
  • the prior parameters of the control layer may be initial parameters of the control layer.
  • the prior parameters of the control layer may be parameters of the changed control layer. It should be understood that the a priori parameters of the control layer may vary according to the updated parameters of the control layer.
  • the model training method provided in the embodiment of the present application can reduce the influence of channel error on updating the parameters of the control layer by calculating the Kalman gain to update the parameters of the control layer, and improve the accuracy of updating the parameters of the control layer.
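
As a rough illustration of the Kalman-gain update described above, the following Python sketch treats the control-layer parameters as the state and the received loss as a scalar observation. The function name `kalman_update`, the diagonal prior covariance, the equally weighted sample points, and the target loss of 0 are all assumptions made for this sketch; the source does not specify them.

```python
def kalman_update(theta, var, points, losses, obs_var, target=0.0):
    """One scalar-observation Kalman update of control-layer parameters.

    theta   : prior parameters (list of n floats)
    var     : prior variance per parameter (diagonal covariance, assumed)
    points  : sampled parameter vectors, equally weighted (assumed)
    losses  : loss value received for each sampled vector
    obs_var : error variance of the received loss
    target  : loss value the update steers toward (assumed to be 0)
    """
    m = len(points)
    n = len(theta)
    z_mean = sum(losses) / m
    # Innovation variance: spread of the received losses plus observation noise.
    p_zz = sum((z - z_mean) ** 2 for z in losses) / m + obs_var
    # Cross-covariance between each parameter and the received loss.
    p_xz = [
        sum((pt[j] - theta[j]) * (z - z_mean) for pt, z in zip(points, losses)) / m
        for j in range(n)
    ]
    # Kalman gain for a scalar observation.
    gain = [c / p_zz for c in p_xz]
    new_theta = [t + k * (target - z_mean) for t, k in zip(theta, gain)]
    # Diagonal approximation of the posterior covariance, floored for stability.
    new_var = [max(v - k * k * p_zz, 1e-12) for v, k in zip(var, gain)]
    return new_theta, new_var, gain


# Toy usage: one parameter, loss = x**2 evaluated at two sample points.
theta, var = [1.0], [0.25]
points = [[1.5], [0.5]]
losses = [p[0] ** 2 for p in points]
new_theta, new_var, gain = kalman_update(theta, var, points, losses, obs_var=0.1)
```

In this toy case the update moves the parameter from 1.0 toward the minimum of the loss and shrinks its variance. A larger `obs_var` yields a smaller gain, which is how a noisier feedback channel would be given less influence on the update.
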
  • the above method further includes: the first communication device updates, based on gradient backpropagation, the parameters of a first network layer in the first machine learning model according to the updated parameters of the control layer and the Kalman gain, to obtain updated parameters of the first network layer, where the first network layer includes the network layers before the control layer; the first communication device obtains the updated first machine learning model according to the updated parameters of the control layer and the updated parameters of the first network layer.
  • the first communication device can extract features of the first training data by updating parameters of the control layer and the first network layer, so as to better realize functions of source coding, channel coding, and modulation.
  • the model training method provided in the embodiment of the present application facilitates extracting the relationships within the training data by updating the parameters of the control layer and the first network layer, and updates the parameters of the first machine learning model based on the Kalman gain and the parameters of the control layer, which reduces the computational complexity of the parameter update.
  • the above method further includes: the first communication device sends fourth data to the second communication device through a channel, where the fourth data is the output obtained by inputting second training data into the first machine learning model; the first communication device receives instruction information from the second communication device, where the instruction information is used to instruct the first communication device to stop the training of the first machine learning model; the first communication device stops the training of the first machine learning model according to the instruction information.
  • the second communication device may also send the third loss function to the first communication device through a channel, and the first communication device judges whether the third loss function is lower than a preset threshold; if it is lower than the preset threshold, the first communication device stops the training of the first machine learning model and sends instruction information to the second communication device, where the instruction information is used to instruct the second communication device to stop the training of the second machine learning model.
  • the second communication device determines the third loss function; if the third loss function is lower than a preset threshold, the second communication device stops sending the third loss function to the first communication device, and if the first communication device does not receive the third loss function within a period of time, it stops the training of the first machine learning model.
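
The stopping conditions above can be sketched in a few lines of Python. The message format, the function names `receiver_message` and `sender_should_stop`, and the timeout handling are hypothetical and only illustrate two of the variants: an explicit stop indication, and a timeout when no further loss arrives.

```python
def receiver_message(third_loss, threshold):
    """Second device: decide what to send after evaluating the third loss."""
    if third_loss < threshold:
        # Loss is low enough: tell the first device to stop training.
        return {"type": "stop"}
    # Otherwise keep feeding the loss back.
    return {"type": "loss", "value": third_loss}


def sender_should_stop(message, seconds_since_last_loss, timeout):
    """First device: stop on an explicit indication or on a feedback timeout."""
    if message is not None and message["type"] == "stop":
        return True
    # No message at all: stop only once the waiting period has elapsed.
    return message is None and seconds_since_last_loss >= timeout
```

For example, `receiver_message(0.01, 0.05)` yields a stop indication, and `sender_should_stop(None, 6.0, 5.0)` models the first device giving up after 6 seconds without feedback when the assumed timeout is 5 seconds.
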
  • the first data includes N sets of data, where N is a positive integer, and the value of N is determined according to the type of Kalman filtering and the dimension of the parameters of the control layer.
  • the first communication device may sample the parameters of the control layer to obtain sampling points of the parameters of the control layer.
  • the number of samples can be determined according to the type of Kalman filter and the dimension of the parameters of the control layer.
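
To make the dependence of the sample count on the filter type and parameter dimension concrete, the sketch below generates sampling points using the standard third-degree cubature rule, which produces 2n points for an n-dimensional parameter vector. The function name `cubature_points` and the diagonal covariance are assumptions for illustration; the source does not state which sampling rule is used for each filter type.

```python
import math


def cubature_points(theta, var):
    """Return the 2n cubature points for parameters `theta`.

    theta : current control-layer parameters (list of n floats)
    var   : variance of each parameter (diagonal covariance, assumed)
    """
    n = len(theta)
    points = []
    for i in range(n):
        # With a diagonal covariance, the square-root factor is sqrt(var[i]),
        # scaled by sqrt(n) as in the third-degree cubature rule.
        step = math.sqrt(n * var[i])
        for sign in (1.0, -1.0):
            p = list(theta)
            p[i] += sign * step
            points.append(p)
    return points


pts = cubature_points([0.0, 1.0], [1.0, 4.0])
```

With two parameters this yields four points, so under this assumed rule the first data would contain N = 4 sets of data, one model output per sampled parameter vector.
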
  • the first data includes M sets of data, where M is a positive integer, the value of M is determined by the first communication device and other first communication devices according to preset rules, and the sum of M and the number of sets of data sent by the other first communication devices is determined according to the type of Kalman filtering and the dimension of the parameters of the control layer.
  • the above-mentioned first communication device may determine the quantity M of the first data with other first communication devices according to preset rules.
  • the preset rule may be that in the communication system, the number of output results of the machine learning models in each first communication device is greater than or equal to 1, and the number of output results of the machine learning models in all first communication devices The sum of the numbers is determined by the type of Kalman filter and the dimension of the parameters of the control layer.
  • All the first communication devices in the communication system can determine their own sampling points by communicating with each other.
  • the first communication device sends the first data to the second communication device through the channel, and the first data may include M sets of data, if the sum of the output results of the machine learning models in all the first communication devices is P, then the number of second loss functions received by the first communication device through the channel is P.
  • the first communication device updates the parameters of the control layer according to the P second loss functions, and obtains the updated parameters of the control layer.
  • the first communication device may transmit the updated parameters of the control layer to other first communication devices through mutual communication.
  • first communication devices in the communication system adopt a central distributed training method, and the sampling of the control layer is divided into multiple subtasks, which are jointly completed by multiple first communication devices.
  • the above-mentioned first communication device can serve as the central communication device, which can receive the second loss function sent by the second communication device, train to obtain the parameters of the control layer, and then send them to the other first communication devices.
  • the model training method provided by the embodiment of the present application divides the sampling of the control layer into multiple subtasks that are jointly completed by multiple first communication devices, which reduces the computation and therefore the computational burden of each first communication device and helps make online training deployable.
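
A minimal sketch of how the sampling could be divided among several first communication devices is given below. The source only requires that each device contribute at least one output and that the contributions sum to the required total; the even split and the function name `split_samples` are assumptions.

```python
def split_samples(total, num_devices):
    """Split `total` sampling points across devices, at least one each."""
    if num_devices < 1 or total < num_devices:
        raise ValueError("need at least one sampling point per device")
    base, extra = divmod(total, num_devices)
    # The first `extra` devices take one additional point.
    return [base + (1 if i < extra else 0) for i in range(num_devices)]


shares = split_samples(8, 3)
```

Here `shares` is `[3, 3, 2]`: each device's share plays the role of its M, and the shares add up to the total number of sampling points P.
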
  • the foregoing method further includes: the first communication device sends updated parameters of the control layer to other first communication devices.
  • the first communication device may also transmit the updated parameters of the first machine learning model to other first communication devices through mutual communication.
  • the first communication device may also transmit the updated parameters of the control layer and the Kalman gain to other first communication devices through mutual communication, and the other first communication devices may update, based on gradient backpropagation, the parameters of the first network layer in the first machine learning model according to the received updated parameters of the control layer and the Kalman gain, and then update the parameters of the first machine learning model.
  • the above-mentioned first communication device may transmit the prior parameters of the control layer, the second loss function, the error covariance of the second loss function, and the updated parameters of the control layer to other first communication devices through mutual communication.
  • the other first communication devices may first determine the Kalman gain according to the received prior parameters of the control layer, the second loss function, and the error covariance of the second loss function, then update, based on gradient backpropagation, the parameters of the first network layer in the first machine learning model according to the updated parameters of the control layer and the Kalman gain, and then update the parameters of the first machine learning model.
  • the model training method provided by the embodiment of the present application adopts a central distributed training method. After the central first communication device completes training, it can send the updated model parameters to the other first communication devices, which saves the training cost of the other first communication devices and reduces their amount of computation.
  • the above method further includes: the first communication device determines the degree of nonlinearity of the channel in a first time period according to the variance of multiple loss functions received within the first time period, where the multiple loss functions include the second loss function; the first communication device determines the type of Kalman filtering according to the degree of nonlinearity of the channel in the first time period.
  • the first communication device may determine the type of Kalman filtering by judging the degree of nonlinearity of the channel in the first time period.
  • the model training method provided in the embodiment of the present application can judge the impact of the environment on the channel through the degree of nonlinearity in the first time period, and reduce that impact by changing the type of Kalman filtering, so that the complexity and the precision of updating the first machine learning model are balanced.
  • if the variance of the second loss function is greater than or equal to a first threshold, the degree of nonlinearity of the channel in the first time period is strongly nonlinear; or, if the variance of the second loss function is smaller than the first threshold, the degree of nonlinearity of the channel in the first time period is weakly nonlinear.
  • if the degree of nonlinearity of the channel in the first time period is strongly nonlinear, the type of Kalman filtering is cubature Kalman filtering; or, if the degree of nonlinearity of the channel in the first time period is weakly nonlinear, the type of Kalman filtering is extended Kalman filtering.
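
The threshold rule above maps directly to a small selection function. The sketch below uses the population variance from Python's `statistics` module; the choice of population rather than sample variance, the function name `select_filter`, and the string labels are assumptions.

```python
from statistics import pvariance


def select_filter(losses, threshold):
    """Pick the Kalman filter type from losses received in one time window."""
    # Variance at or above the threshold is treated as a strongly nonlinear
    # channel, which selects cubature Kalman filtering.
    if pvariance(losses) >= threshold:
        return "cubature"
    # Otherwise the channel is treated as weakly nonlinear.
    return "extended"


strong = select_filter([0.1, 0.9, 0.2, 1.5], threshold=0.1)
weak = select_filter([0.50, 0.51, 0.49], threshold=0.1)
```

In this example the widely scattered losses select the cubature filter, while the nearly constant losses select the extended filter.
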
  • the present application provides a model training method, which can be applied to a communication system including a first communication device and a second communication device, where there is at least one first communication device, a first machine learning model is deployed on the first communication device, and a second machine learning model is deployed on the second communication device. The method includes: the second communication device receives second data through a channel, where the second data is obtained after first data sent by the first communication device is transmitted through the channel, the first data is the output obtained by inputting first training data into the first machine learning model, the first machine learning model includes a control layer, and the control layer is at least one layer of the first machine learning model; the second communication device inputs the second data into the second machine learning model to obtain third data; the second communication device determines a first loss function according to the third data and the first training data; the second communication device sends the first loss function to the first communication device through a feedback channel, where the feedback channel is determined according to an observation error, and the first loss function is used to update the parameters of the control layer of the first machine learning model.
  • when the channel is not modeled, the second communication device can determine the observation error according to the error between the predicted value and the real value within a period of time, and construct a feedback channel whose variance is the observation error, so that the first communication device can update the parameters of the first machine learning model based on Kalman filtering, which can reduce the influence of channel errors on model training, increase the feasibility of model training, improve the convergence speed of training autoencoders, and optimize the robustness of autoencoders, thereby improving the end-to-end communication quality.
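
One possible reading of the feedback-channel construction above is sketched below. It assumes that the observation error is the mean squared error between predicted and real values over a window, and that the feedback channel adds zero-mean Gaussian noise with that variance to the loss. Both modeling choices and both function names are assumptions, since the source only states that the channel's variance equals the observation error.

```python
import random


def observation_error(predicted, actual):
    """Mean squared error between predictions and real values (assumed metric)."""
    return sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(predicted)


def feedback_channel(loss, obs_error, rng):
    """Return the loss as seen after the feedback channel.

    Zero-mean Gaussian noise with variance `obs_error` is an assumed model.
    """
    return loss + rng.gauss(0.0, obs_error ** 0.5)


rng = random.Random(0)  # fixed seed so the sketch is repeatable
err = observation_error([1.0, 2.0], [1.5, 2.5])
received = feedback_channel(0.8, err, rng)
```

Under this reading, `err` would also serve as the error variance of the received loss when the first communication device computes its Kalman gain.
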
  • the above method further includes: the second communication device updates the parameters of the second machine learning model based on gradient backpropagation according to the first loss function, to obtain the updated second machine learning model.
  • the above method further includes: the second communication device receives fifth data through a channel, where the fifth data is obtained after the fourth data sent by the first communication device is transmitted through the channel, and the fourth data is the output obtained by inputting second training data into the first machine learning model; the second communication device inputs the fifth data into the second machine learning model to obtain sixth data; the second communication device determines the third loss function according to the sixth data and the second training data; if the third loss function is lower than the preset threshold, the second communication device sends instruction information to the first communication device, where the instruction information is used to instruct the first communication device to stop the training of the first machine learning model.
  • the present application provides a device related to model training, which can be used in the first communication device of the first aspect. The device can be a terminal device or a network device, a device within a terminal device or a network device (for example, a chip, a chip system, or a circuit), or a device that can be used together with a terminal device or a network device.
  • the communication device may include modules or units in one-to-one correspondence with the methods/operations/steps/actions described in the first aspect; each module or unit may be a hardware circuit, software, or a combination of a hardware circuit and software.
  • the device includes a transceiver unit and a processing unit.
  • the transceiver unit is used to: send first data to the second communication device through a channel, where the first data is the output obtained by inputting first training data into the first machine learning model, the first machine learning model includes a control layer, and the control layer is at least one layer of the first machine learning model; and receive a second loss function through a channel, where the second loss function is obtained after the first loss function sent by the second communication device is transmitted through the channel.
  • the processing unit is configured to: update the parameters of the control layer based on Kalman filtering according to the second loss function to obtain updated parameters of the control layer, where the updated parameters of the control layer are used to update parameters of the first machine learning model.
  • the above processing unit is further configured to: obtain the Kalman gain according to the prior parameters of the control layer, the second loss function, and the error covariance of the second loss function; and update the parameters of the control layer according to the Kalman gain to obtain the updated parameters of the control layer.
  • the above-mentioned transceiver unit is further configured to: send fourth data to the second communication device through a channel, where the fourth data is the output obtained by inputting second training data into the first machine learning model; and receive instruction information from the second communication device, where the instruction information is used to instruct the device to stop the training of the first machine learning model.
  • the above processing unit is further configured to: stop the training of the first machine learning model according to the instruction information.
  • the first data includes N sets of data, where N is a positive integer, and the value of N is determined according to the type of Kalman filtering and the dimension of the parameters of the control layer.
  • the first data includes M sets of data, where M is a positive integer, the value of M is determined by the device and other first communication devices according to preset rules, and the sum of M and the number of sets of data sent by the other first communication devices is determined according to the type of Kalman filtering and the dimension of the parameters of the control layer.
  • the above-mentioned transceiver unit is further configured to: send the updated parameters of the control layer to other first communication devices.
  • the above processing unit is further configured to: judge the degree of nonlinearity of the channel in the first time period according to the variance of multiple loss functions received in the first time period, where the multiple loss functions include a first loss function; and determine the type of Kalman filtering according to the degree of nonlinearity of the channel in the first time period.
  • if the variance of the second loss function is greater than or equal to the first threshold, the degree of nonlinearity of the channel in the first time period is strongly nonlinear; or, if the variance of the second loss function is smaller than the first threshold, the degree of nonlinearity of the channel in the first time period is weakly nonlinear.
  • if the degree of nonlinearity of the channel in the first time period is strongly nonlinear, the type of Kalman filtering is cubature Kalman filtering; or, if the degree of nonlinearity of the channel in the first time period is weakly nonlinear, the type of Kalman filtering is extended Kalman filtering.
  • the present application provides a device related to model training, which can be used in the second communication device of the second aspect. The device can be a terminal device or a network device, a device within a terminal device or a network device (for example, a chip, a chip system, or a circuit), or a device that can be used together with a terminal device or a network device.
  • the communication device may include modules or units in one-to-one correspondence with the methods/operations/steps/actions described in the second aspect; each module or unit may be a hardware circuit, software, or a combination of a hardware circuit and software.
  • the device includes a transceiver unit and a processing unit.
  • the transceiver unit is used to: receive second data through a channel, where the second data is obtained after the first data sent by the first communication device is transmitted through the channel, the first data is the output obtained by inputting first training data into the first machine learning model, the first machine learning model includes a control layer, and the control layer is at least one layer of the first machine learning model.
  • the processing unit is used for: inputting the second data into the second machine learning model to obtain third data; and determining the first loss function according to the third data and the first training data.
  • the transceiver unit is also used for: sending the first loss function to the first communication device through the feedback channel, the feedback channel is determined according to the observation error, and the first loss function is used for updating the parameters of the control layer of the first machine learning model.
  • the above processing unit is further configured to: update the parameters of the second machine learning model based on gradient backpropagation according to the first loss function, to obtain an updated second machine learning model.
  • the above-mentioned transceiver unit is further configured to: receive fifth data through a channel, where the fifth data is obtained after the fourth data sent by the first communication device is transmitted through the channel, and the fourth data is the output obtained by inputting second training data into the first machine learning model; the above processing unit is further used to: input the fifth data into the second machine learning model to obtain sixth data, and determine the third loss function according to the sixth data and the second training data; the above-mentioned transceiver unit is also used for: if the third loss function is lower than the preset threshold, sending instruction information to the first communication device, where the instruction information is used to instruct the first communication device to stop the training of the first machine learning model.
  • the present application provides yet another device related to model training, including a processor, which is coupled to a memory and can be used to execute instructions in the memory, so as to realize any of the possible implementations in the above aspects.
  • the device further includes a memory.
  • the device further includes a communication interface, and the processor is coupled to the communication interface for communicating with other communication devices.
  • the present application provides a processing device, including a processor and a memory.
  • the processor is used to read instructions stored in the memory, and can receive signals through the receiver and transmit signals through the transmitter, so as to execute the method in any possible implementation manner of the above-mentioned aspects.
  • there are one or more processors and one or more memories.
  • the memory may be integrated with the processor, or the memory may be separated from the processor.
  • the memory and the processor can be integrated on the same chip, or they can be arranged on different chips respectively. This application does not limit the type of the memory and the arrangement of the memory and the processor.
  • sending the first data may be a process of outputting the first data from the processor, and receiving the second data may be a process of the processor receiving the input second data.
  • the data output by the processor may be output to the transmitter, and the input data received by the processor may come from the receiver.
  • the transmitter and the receiver may be collectively referred to as a transceiver.
  • the processing device in the sixth aspect above may be a chip, and the processor may be implemented by hardware or by software. When implemented by hardware, the processor may be a logic circuit, an integrated circuit, or the like; when implemented by software, the processor may be a general-purpose processor that is realized by reading software code stored in a memory, and the memory may be integrated in the processor or exist independently outside the processor.
  • the present application provides a computer program product, which includes a computer program (also called code or instructions); when the computer program is run, it causes a computer to execute the method in any possible implementation of the above aspects.
  • the present application provides a computer-readable storage medium, which stores a computer program (also called code or instructions); when the computer program runs on a computer, it causes the computer to execute the method in any possible implementation of the above aspects.
  • the present application provides a computer program, which, when running on a computer, enables the methods in the possible implementation manners in the above-mentioned aspects to be executed.
  • the present application provides a communication system, including the device in the third aspect and its various possible implementations above, and the device in the fourth aspect and its various possible implementations above.
  • FIG. 1 is a schematic diagram of an end-to-end signal transmission process.
  • FIG. 2 is a schematic diagram of an end-to-end signal transmission process based on an autoencoder.
  • FIG. 3 is a schematic flowchart of a model training method provided by an embodiment of the present application.
  • FIG. 4 is a schematic diagram of an end-to-end signal transmission process provided by an embodiment of the present application.
  • FIG. 5 is a schematic flowchart of another model training method provided by an embodiment of the present application.
  • FIG. 6 is a schematic diagram of updating first network layer parameters provided by an embodiment of the present application.
  • FIG. 7 is a schematic diagram of the cross-entropy loss based on the model training method provided by an embodiment of the present application.
  • FIG. 8 is a schematic diagram of changes in the bit error rate based on the model training method provided by an embodiment of the present application.
  • FIG. 9 is a schematic flowchart of another model training method provided by an embodiment of the present application.
  • FIG. 10 is a schematic block diagram of a device related to model training provided by an embodiment of the present application.
  • FIG. 11 is a schematic block diagram of another device related to model training provided by an embodiment of the present application.
  • 5G systems usually include the following three application scenarios: enhanced mobile broadband (eMBB), ultra-reliable and low latency communications (URLLC), and massive machine type communications (mMTC).
  • the communication device in this embodiment of the present application may be a network device or a terminal device. It should be understood that the terminal device may be replaced with a device or chip capable of performing functions similar to the terminal device, and the network device may also be replaced with a device or chip capable of performing similar functions as the network device, and the embodiment of the present application does not limit its name.
  • the terminal device in the embodiment of the present application may also be referred to as user equipment (UE), mobile station (MS), mobile terminal (MT), access terminal, subscriber unit, subscriber station, remote station, remote terminal, mobile device, user terminal, terminal, wireless communication device, user agent, or user device.
  • a terminal device may be a device that provides voice/data connectivity to users, for example, a handheld device with a wireless connection function, a vehicle-mounted device, and the like.
  • examples of terminal devices include: mobile phones, tablet computers, notebook computers, palmtop computers, mobile internet devices (MID), wearable devices, virtual reality (VR) devices, augmented reality (AR) devices, wireless terminals in industrial control, wireless terminals in self driving, wireless terminals in remote medical surgery, wireless terminals in smart grid, wireless terminals in transportation safety, wireless terminals in smart city, wireless terminals in smart home, cellular phones, cordless phones, session initiation protocol (SIP) phones, wireless local loop (WLL) stations, personal digital assistants (PDA), handheld devices with wireless communication functions, computing devices or other processing devices connected to a wireless modem, vehicle-mounted devices, terminal devices in a 5G network, and terminal devices in a future evolved public land mobile network (PLMN); this is not limited in the embodiments of the present application.
  • the terminal device can also be a terminal device in the Internet of Things (IoT) system.
  • IoT is an important part of the future development of information technology; its main technical feature is that objects are connected to the network through communication technology, realizing an intelligent network of human-machine interconnection and object-to-object interconnection.
  • the network device in the embodiment of the present application may be a device that provides a wireless communication function for a terminal device.
  • the network device may also be called an access network device or a radio access network device. It may be a transmission reception point (TRP), an evolved NodeB (eNB or eNodeB) in an LTE system, a home base station (for example, a home evolved NodeB or home Node B, HNB), a baseband unit (BBU), or a wireless controller in a cloud radio access network (CRAN) scenario; alternatively, the network device may be a relay station, an access point, a vehicle-mounted device, a wearable device, a network device in a 5G network, or a network device in a future evolved PLMN, and may be an access point (AP) in a wireless local area network (WLAN), or the gNB in the
  • the network device may include a centralized unit (CU) node, or a distributed unit (DU) node, or a radio access network (RAN) device including a CU node and a DU node, or a RAN device including a control plane CU node (CU-CP node), a user plane CU node (CU-UP node), and a DU node.
  • the network device provides services for the terminal devices in the cell, and the terminal device communicates with the network device or other devices corresponding to the cell through the transmission resources (for example, frequency domain resources, or spectrum resources) allocated by the network device.
  • the network device may be a macro base station (for example, a macro eNB or macro gNB), or a base station corresponding to a small cell, where small cells may include metro cells, micro cells, pico cells, femto cells, and the like. These small cells have small coverage and low transmission power, and are suitable for providing high-speed data transmission services.
  • the embodiment of the present application does not specifically limit the structure of the execution subject of the method provided in the embodiment of the present application, as long as it can run a program that records the code of the method and thereby communicate according to the method provided in the embodiment of the present application.
  • the execution body of the method provided by the embodiment of the present application may be a terminal device or a network device, or a functional module in a terminal device or a network device that can call a program and execute the program.
  • various aspects or features of the present application may be implemented as a method, apparatus, or article of manufacture using standard programming and/or engineering techniques.
  • the term "article of manufacture" as used in this application covers a computer program accessible from any computer-readable device, carrier, or medium.
  • computer-readable media may include, but are not limited to: magnetic storage devices (e.g., hard disk, floppy disk, or tape, etc.), optical disks (e.g., compact disc (compact disc, CD), digital versatile disc (digital versatile disc, DVD) etc.), smart cards and flash memory devices (for example, erasable programmable read-only memory (EPROM), card, stick or key drive, etc.).
  • various storage media described herein can represent one or more devices and/or other machine-readable media for storing information.
  • the term "machine-readable medium” may include, but is not limited to, various other media capable of storing, containing and/or carrying instructions and/or data.
  • in traditional end-to-end communication systems, the processing of communication signals is generally divided into a series of sub-modules, such as source coding, channel coding, modulation, and channel estimation. To improve end-to-end communication quality, each sub-module needs to be optimized individually. Each sub-module is modeled based on a specific signal processing algorithm, which is usually approximated as a simplified linear model. However, optimizing each sub-module individually cannot guarantee end-to-end optimization of the entire communication system and instead introduces more interference effects, such as amplifier distortion and channel impairment; moreover, each module has its own control factors and parameters, which makes end-to-end optimization with this traditional method very complex.
  • the communication device may be a terminal device or a network device. If the sending end in the communication system is a terminal device, the receiving end may be a network device or another terminal device. Alternatively, if the sending end in the communication system is a network device, the receiving end may be a terminal device or another network device. That is, the embodiment of the present application may be applied to end-to-end communication systems in various scenarios, such as between network devices, between a network device and a terminal device, and between terminal devices.
  • FIG. 1 shows a schematic diagram of a traditional end-to-end signal transmission process.
  • the transmission process of communication signals can be divided into sub-modules such as source coding, channel coding, modulation, channel, demodulation, channel decoding, and source decoding.
  • the sending end can send the communication signal u to the receiving end. Specifically, the sending end can first convert the communication signal u into a communication signal x through sub-modules such as source coding, channel coding, and modulation, and then send the communication signal x to the receiving end through a channel.
  • the communication signal x carries channel errors after passing through the channel, so the communication signal received by the receiving end through the channel is y, and the receiving end obtains the communication signal u* through sub-modules such as demodulation, channel decoding, and source decoding.
  • to optimize the communication system end to end, that is, to make the error between the communication signal u* obtained by the receiving end and the communication signal u sent by the sending end as small as possible, each sub-module must be optimized, which makes end-to-end optimization very complex and still cannot guarantee end-to-end optimization of the entire communication system.
  • both the sending end and the receiving end can process the communication signal through an auto encoder.
  • both the sending end and the receiving end can be modeled in the form of a neural network, and learn the distribution of data through a large number of training samples, and then use it to predict the result.
  • such an end-to-end learning method can achieve joint optimization and can achieve better results than the traditional end-to-end communication method.
  • FIG. 2 shows a schematic diagram of an end-to-end signal transmission process based on an autoencoder.
  • the transmission process of communication signals can be divided into an encoding autoencoder and a decoding autoencoder, which reduces the number of sub-modules.
  • the sending end can send the communication signal u to the receiving end. Specifically, the sending end can convert the communication signal u into a communication signal x through the encoding autoencoder, and then send the communication signal x to the receiving end through a channel.
  • the communication signal x carries channel errors after passing through the channel, so the communication signal received by the receiving end through the channel is y, and the receiving end obtains the communication signal u* through the decoding autoencoder.
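  • the signal flow u → x → y → u* described above can be illustrated with a minimal sketch. The functions `encode`, `channel`, and `decode` are hypothetical stand-ins chosen for this example (a fixed bit-to-symbol mapping, additive Gaussian noise, and a threshold detector); they are not the neural networks of the embodiment and are not trained.

```python
import random

def encode(bits):
    """Hypothetical stand-in for the encoding autoencoder: map each bit to +1.0 or -1.0."""
    return [1.0 if b else -1.0 for b in bits]

def channel(x, noise_std, rng):
    """Toy channel (an assumption): add zero-mean Gaussian noise to every symbol."""
    return [s + rng.gauss(0.0, noise_std) for s in x]

def decode(y):
    """Hypothetical stand-in for the decoding autoencoder: threshold each received symbol."""
    return [1 if s > 0.0 else 0 for s in y]

rng = random.Random(0)
u = [1, 0, 1, 1, 0]                      # communication signal u at the sending end
x = encode(u)                            # signal x sent into the channel
y = channel(x, noise_std=0.1, rng=rng)   # signal y received with channel error
u_star = decode(y)                       # recovered signal u* at the receiving end
```

  • with low noise the recovered signal matches the transmitted one; raising `noise_std` makes u* diverge from u, which is the error that training is meant to reduce.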
  • the channel in this communication system is generally difficult to be represented by a model, which will affect the calculation of the loss function of the decoding autoencoder, thereby affecting the training of the autoencoder, increasing the training difficulty of the autoencoder, and affecting the end-to-end communication quality.
  • the embodiment of the present application provides a model training method and a related device which, when the channel is not modeled, help increase the feasibility of training the machine learning model, improve the convergence speed of training, and optimize the robustness of the machine learning model, thereby improving end-to-end communication quality.
  • control layer and network layer are illustrative examples given for convenience of description, and should not constitute any limitation to the present application. This application does not exclude the possibility of defining other terms that can achieve the same or similar functions in existing or future agreements.
  • the terms first, second, and other numbers are used only for convenience of description, for example to distinguish different communication devices or different machine learning models, and are not used to limit the scope of the embodiments of the present application.
  • “at least one” means one or more, and “multiple” means two or more.
  • “And/or” describes the association relationship of associated objects, indicating that there may be three types of relationships, for example, A and/or B, which can mean: A exists alone, A and B exist simultaneously, and B exists alone, where A, B can be singular or plural.
  • the character “/” generally indicates that the contextual objects are an “or” relationship.
  • "at least one of the following" or similar expressions refer to any combination of these items, including any combination of single or plural items.
  • for example, at least one of a, b, and c may represent: a, or b, or c, or a and b, or a and c, or b and c, or a, b, and c, where each of a, b, and c may be single or multiple.
  • the first communication device may be the above-mentioned terminal device or network device
  • the second communication device may be the above-mentioned terminal device or network device. It should be understood that the first communication device is equivalent to the above-mentioned sending end, and the second communication device is equivalent to the above-mentioned receiving end.
  • FIG. 3 is a schematic diagram of a model training method 300 provided by an embodiment of the present application.
  • the method 300 may be applied to a communication system including a first communication device and a second communication device, the number of the first communication device is at least one, and the first communication device may be deployed with a first machine learning model. As shown in FIG. 3, the method 300 may include the following steps:
  • the first communication device may send first data to the second communication device through a channel, where the first data is the output obtained by inputting first training data into the first machine learning model, the first machine learning model includes a control layer, and the control layer is at least one layer of the first machine learning model.
  • the control layer may be the last one or more network layers in the first machine learning model, or one or more network layers at any position in the first machine learning model; this is not limited in the embodiment of the present application.
  • FIG. 4 shows a schematic diagram of an end-to-end signal transmission process provided by an embodiment of the present application.
  • the control layer is the last layer of network in the first machine learning model.
  • the number of network layers in the first machine learning model in FIG. 4 is only an example, which is not limited in this embodiment of the present application.
  • the control layer is at least one network layer selected in the first machine learning model in the embodiment of the present application.
  • "control layer" is only an example of a name, and other names with the same characteristics fall within the protection scope of the embodiment of the present application.
  • the first data above may be x in FIG. 4, the first training data may be u in FIG. 4, and the first machine learning model may be understood as the encoding autoencoder in FIG. 2 above or as a neural network model.
  • the second communication device receives second data through a channel, where the second data is obtained after the first data sent by the first communication device is transmitted through the channel.
  • the second data may be y in FIG. 4 .
  • the second communication device inputs the second data into the second machine learning model to obtain third data.
  • the second machine learning model may be understood as the decoding autoencoder in FIG. 2 above or as a neural network model.
  • the third data may be u* in FIG. 4 .
  • the second communication device determines a first loss function according to the third data and the first training data.
  • the second communication device may determine the first loss function by using the third data as the predicted value and the first training data as the real value.
  • the first loss function may also be called an objective function, which is not limited in this embodiment of the present application.
  • the first training data is sample data, which may be preset or sent by other communication devices. It should be understood that if the second communication device receives the first training data sent from other communication devices, the first training data does not pass through a channel of unknown error or unknown noise.
  • the first communication device sends the first training data u, but the second communication device finally obtains the third data u*. The second communication device can therefore use the third data u* as the predicted value and the first training data u as the real value to determine the first loss function, which is an error function between the third data u* and the first training data u.
  • the error between the third data and the first training data is caused by the channel.
  • the first loss function may be cross entropy or minimum mean square error.
  • the first loss function can be used as an observation in Kalman filtering.
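  • the two loss choices named above can be sketched as follows, with the third data u* as the predicted value and the first training data u as the real value. The function names and the binary form of the cross entropy are assumptions made for this example; the source does not specify the exact form.

```python
import math

def mean_squared_error(predicted, real):
    """Mean of squared differences between predicted values and real values."""
    return sum((p - t) ** 2 for p, t in zip(predicted, real)) / len(real)

def binary_cross_entropy(predicted, real, eps=1e-12):
    """Cross entropy for per-bit probabilities; eps keeps log() away from zero."""
    total = 0.0
    for p, t in zip(predicted, real):
        p = min(max(p, eps), 1.0 - eps)
        total -= t * math.log(p) + (1 - t) * math.log(1.0 - p)
    return total / len(real)

u = [1, 0, 1]                 # first training data (real value)
u_star = [0.9, 0.2, 0.7]      # third data (predicted value), as probabilities
first_loss = binary_cross_entropy(u_star, u)
```

  • either value can then serve as the scalar observation that is fed back to the first communication device.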
  • the second communication device sends the first loss function to the first communication device through a feedback channel, the feedback channel is determined according to the observation error, and the first loss function is used to update parameters of the first machine learning model.
  • the variance of the feedback channel may be an observation error.
  • the feedback channel may be an additive white Gaussian noise (AWGN) channel with a mean value of 0 and a variance of an observation error.
  • the second communication device may change the signal-to-noise ratio of the transmitted signal by controlling the transmission power of the signal that feeds back the first loss function, and thereby construct an AWGN channel whose variance is the observation error.
  • the observation error may be determined by the second communication device according to the error between the predicted value and the real value within a period of time.
  • the period of time may be any period of time, and the embodiment of the present application does not limit the length of the period of time.
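  • the construction of the feedback channel can be sketched as follows. The estimator (mean squared error over a window), the power-control rule (effective noise variance equals receiver noise power divided by transmit power), and all function names are assumptions made for this example; the source only states that the feedback channel is AWGN with mean 0 and variance equal to the observation error.

```python
import math
import random

def estimate_observation_error(predicted, real):
    """Assumed estimator: mean squared error between predicted and real values in a window."""
    return sum((p - t) ** 2 for p, t in zip(predicted, real)) / len(real)

def required_tx_power(noise_power, observation_error):
    """Assumed rule: after the receiver rescales the signal, the effective noise
    variance is noise_power / tx_power, so solve for the power giving the target variance."""
    return noise_power / observation_error

def feedback_channel(first_loss, observation_error, rng):
    """AWGN feedback channel: mean 0, variance equal to the observation error."""
    return first_loss + rng.gauss(0.0, math.sqrt(observation_error))

rng = random.Random(0)
r = estimate_observation_error([0.9, 0.2, 0.7], [1.0, 0.0, 1.0])
second_loss = feedback_channel(0.35, r, rng)  # what the first communication device receives
```

  • a larger observation error therefore yields a noisier second loss function, which is the quantity the Kalman filter later treats as less trustworthy.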
  • the first loss function can also be used to update the parameters of the second machine learning model.
  • the second communication device may update parameters of the second machine learning model based on reverse gradient propagation according to the first loss function to obtain an updated second machine learning model.
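  • a gradient-based update of the second machine learning model can be illustrated with a deliberately small example. Here the second model is reduced to a single hypothetical weight `w` with u* = w · y and a mean squared error loss; both choices are assumptions made so that the gradient can be written out by hand.

```python
def gradient_step(w, y, u, learning_rate):
    """One gradient-descent step on the MSE loss mean((w*y - u)^2) with respect to w."""
    n = len(y)
    grad = sum(2.0 * (w * yi - ui) * yi for yi, ui in zip(y, u)) / n
    return w - learning_rate * grad

w = 0.0
y = [1.0, 1.0]   # second data received through the channel
u = [1.0, 1.0]   # first training data (real value)
w = gradient_step(w, y, u, learning_rate=0.25)
```

  • in a real multi-layer model the same gradient would be propagated backward through every layer rather than computed in closed form.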
  • the first communication device receives the second loss function through the feedback channel, and the second loss function is obtained after the first loss function sent by the second communication device is transmitted through the feedback channel.
  • the second loss function may include the observation error.
  • the first communication device updates the parameters of the control layer based on the Kalman filter according to the second loss function to obtain the updated parameters of the control layer, and the updated parameters of the control layer are used to update the parameters of the first machine learning model.
  • the second loss function is the observation in the Kalman filtering method. If the observation error is larger, the confidence in the posterior second loss function (the observation) is lower, and the estimate of the updated control-layer parameters leans more on the prior parameters of the control layer; if the observation error is smaller, the confidence in the posterior second loss function (the observation) is higher, and the update of the control-layer parameters leans more on the posterior second loss function.
  • the parameters of the a priori control layer are calculated by the first communication device according to the second loss function and the Kalman filter, and the parameters of the a priori control layer are parameters of the control layer before each update.
  • the Kalman filter may be a cubature Kalman filter or an extended Kalman filter; the embodiment of the present application does not limit the type of the Kalman filter.
  • the cubature Kalman filter can be described with the following quantities:
  • k may be the number of training rounds or the training time
  • u_k may be the above-mentioned first training data
  • θ_k may be the parameters of the above-mentioned control layer
  • h(u_k; θ_k) may be an end-to-end nonlinear function that represents the nonlinear relationship among the above-mentioned first machine learning model, the channel, and the second machine learning model
  • r_k is the observation error
  • d_k is the observation.
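  • a state-space form consistent with the symbols defined above can be sketched as follows. The observation equation follows directly from the definitions of the observation, the end-to-end function h, and the observation error; the state equation, which treats the control-layer parameters as constant between rounds with no process noise, is an assumption made for this sketch.

```latex
\begin{aligned}
\theta_{k+1} &= \theta_k \\
d_k &= h(u_k;\,\theta_k) + r_k
\end{aligned}
```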
  • when the channel is not modeled, the second communication device can determine the observation error according to the error between the predicted value and the real value within a period of time, and construct a feedback channel whose variance is the observation error, so that the first communication device updates the parameters of the first machine learning model based on Kalman filtering. This can reduce the influence of channel errors on model training, increase the feasibility of model training, and improve the convergence speed of training the machine learning model. The observation-based update in the Kalman filtering method can also optimize the robustness of the machine learning model, thereby improving end-to-end communication quality.
  • updating the parameters of the control layer based on the Kalman filter to obtain the updated parameters of the control layer includes: the first communication device obtains the Kalman gain according to the prior parameters of the control layer, the second loss function, and the error covariance of the second loss function; the first communication device updates the parameters of the control layer according to the Kalman gain to obtain the updated parameters of the control layer.
  • the prior parameters of the control layer may be initial parameters of the control layer.
  • the prior parameters of the control layer may be parameters of the changed control layer. It should be understood that the a priori parameters of the control layer may vary according to the updated parameters of the control layer.
  • the first communication device can obtain the Kalman gain according to the prior parameters of the control layer, the second loss function determined from the third data and the first training data, and the error covariance of the second loss function, and then update the parameters of the control layer according to the Kalman gain to obtain the updated parameters of the control layer.
  • the model training method provided in the embodiment of the present application can reduce the influence of channel error on updating the parameters of the control layer by calculating the Kalman gain to update the parameters of the control layer, and improve the accuracy of updating the parameters of the control layer.
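  • the gain-then-update sequence can be illustrated with a standard Kalman measurement update for a scalar observation. This sketch uses a linearised (extended-Kalman-style) update rather than cubature points, and the names `H` (an assumed Jacobian of h with respect to the control-layer parameters), `innovation`, and `kalman_update` are hypothetical; how the source maps the second loss function and its error covariance onto these quantities is not specified and is assumed here.

```python
def kalman_update(theta, P, H, innovation, r):
    """One Kalman measurement update for a scalar observation.

    theta: prior control-layer parameters (list of n floats)
    P: prior error covariance (n x n list of lists)
    H: linearised observation row (list of n floats)
    innovation: observed value minus predicted value (scalar)
    r: observation error variance (scalar)
    """
    n = len(theta)
    PHt = [sum(P[i][j] * H[j] for j in range(n)) for i in range(n)]  # P H^T
    s = sum(H[i] * PHt[i] for i in range(n)) + r                     # innovation variance
    K = [v / s for v in PHt]                                         # Kalman gain
    new_theta = [theta[i] + K[i] * innovation for i in range(n)]     # posterior parameters
    HP = [sum(H[i] * P[i][j] for i in range(n)) for j in range(n)]   # H P
    new_P = [[P[i][j] - K[i] * HP[j] for j in range(n)] for i in range(n)]
    return new_theta, new_P, K

theta, P, K = kalman_update([0.0], [[1.0]], [1.0], innovation=1.0, r=1.0)
```

  • increasing `r` shrinks the gain, so the updated parameters stay closer to the prior parameters; decreasing `r` lets the observation dominate.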
  • the above method 300 further includes: the first communication device updates the parameters of the first network layer in the first machine learning model based on reverse gradient propagation according to the updated parameters of the control layer and the Kalman gain, to obtain the updated parameters of the first network layer, where the first network layer includes the network layers before the control layer; the first communication device obtains the updated first machine learning model according to the updated parameters of the control layer and the updated parameters of the first network layer.
  • the first network layer includes the network layer before the control layer.
  • the first machine learning model has 8 network layers, and if the control layer is located in the fifth layer of the first machine learning model, the first network layer includes the first 4 network layers of the first machine learning model.
  • the first machine learning model has a total of 12 network layers. If the control layer is located on the 10th to 12th layers of the first machine learning model, the first network layer includes the first 9 network layers of the first machine learning model.
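  • the two layer-partition examples above can be checked with a short sketch. The helper name `first_network_layers` and the 1-based layer numbering are assumptions made for this example.

```python
def first_network_layers(num_layers, control_layers):
    """Return the 1-based indices of the layers that precede the earliest control layer."""
    first_control = min(control_layers)
    if not 1 <= first_control <= num_layers:
        raise ValueError("control layer index out of range")
    return list(range(1, first_control))

eight_layer_case = first_network_layers(8, [5])             # control layer at layer 5
twelve_layer_case = first_network_layers(12, [10, 11, 12])  # control layers 10 to 12
```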
  • the first network layer may be based on a network structure such as a fully connected layer, a convolutional layer, or a residual network (ResNet).
  • the first communication device can extract features of the first training data by updating parameters of the control layer and the first network layer, so as to better realize functions of source coding, channel coding, and modulation.
  • the model training method provided in the embodiment of the present application facilitates the extraction of relationships within the training data by updating the parameters of the control layer and the first network layer, and updates the parameters of the first machine learning model based on the Kalman gain and the parameters of the control layer, which reduces the computational complexity of parameter updates.
  • the above method 300 further includes: the first communication device sends fourth data to the second communication device through the channel, where the fourth data is the output obtained by inputting second training data into the first machine learning model; the second communication device receives fifth data through the channel, where the fifth data is obtained after the fourth data sent by the first communication device is transmitted through the channel; the second communication device inputs the fifth data into the second machine learning model to obtain sixth data; the second communication device determines a third loss function according to the sixth data and the second training data; if the third loss function is lower than a preset threshold, the second communication device sends instruction information to the first communication device, where the instruction information is used to instruct the first communication device to stop the training of the first machine learning model; correspondingly, the first communication device receives the instruction information from the second communication device and stops the training of the first machine learning model according to the instruction information.
  • after obtaining the updated first machine learning model, the first communication device starts a new round of training, that is, the first communication device inputs the second training data into the first machine learning model, obtains the fourth data as the output, and sends the fourth data to the second communication device through the channel. Because the fourth data passes through the channel, it carries channel errors, so the second communication device receives the fifth data.
  • the second communication device can obtain the sixth data, use the sixth data as the predicted value and the second training data as the real value, and determine the third loss function. If the third loss function is lower than the preset threshold, the second communication device can determine that the updated first machine learning model obtained in the previous round of training satisfies the condition and that the current round of training no longer needs to be performed, so it sends instruction information to the first communication device.
  • the instruction information is used to instruct the first communication device to stop the training of the first machine learning model.
  • Otherwise, the second communication device repeats the steps of the previous round and continues training.
  • the second communication device may periodically determine whether the third loss function is lower than a preset threshold, and if the third loss function is lower than the threshold, send indication information to the first communication device.
  • the second communication device may determine whether the third loss function is lower than a preset threshold every certain time or every certain number of training rounds.
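  • The periodic threshold check described above can be sketched as follows. This is a minimal illustration only: the names `should_stop`, `run_training`, and `check_every` are hypothetical, and checking every fixed number of rounds is just one of the options the description allows.

```python
def should_stop(loss_history, threshold, check_every=10):
    """Decide whether the second device would send the stop instruction.

    The third loss function is examined only every `check_every` rounds
    (an assumed schedule); training stops when it is below `threshold`.
    """
    rounds = len(loss_history)
    if rounds == 0 or rounds % check_every != 0:
        return False
    return loss_history[-1] < threshold


def run_training(losses, threshold, check_every=10):
    """Replay a sequence of third-loss values, one per training round.

    Returns the round number at which the stop instruction would be sent,
    or None if the threshold is never met at a check round.
    """
    history = []
    for loss in losses:
        history.append(loss)
        if should_stop(history, threshold, check_every):
            return len(history)
    return None
```

  • With `check_every=2`, a loss that first drops below the threshold in an odd-numbered round is only detected at the next even-numbered round, which trades a little extra training for fewer checks.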
  • With the model training method provided by the embodiment of the present application, in the process of repeatedly updating the parameters of the first machine learning model, the update can be stopped once the third loss function is detected to be lower than the preset threshold. This helps to reduce unnecessary training, save computing resources, and reduce the power consumption of the first communication device.
  • the second communication device may also send the third loss function to the first communication device through a channel, and the first communication device judges whether the third loss function is lower than a preset threshold , if it is lower than the preset threshold, the first communication device stops the training of the first machine learning model, and sends instruction information to the second communication device, where the instruction information is used to instruct the second communication device to stop the training of the second machine learning model.
  • After the second communication device determines the third loss function, if the third loss function is lower than a preset threshold, the second communication device stops sending the third loss function to the first communication device; if the first communication device does not receive the third loss function within a period of time, it stops the training of the first machine learning model.
  • The above method 300 further includes: the first communication device determines the degree of nonlinearity of the channel within a first time period according to the variance of the multiple loss functions received within the first time period, where the multiple loss functions include the second loss function; the first communication device determines the type of Kalman filtering according to the degree of nonlinearity of the channel within the first time period.
  • The variance σ² can be expressed by the following formula:
  • L_k is the second loss function at time k
  • T is the duration of the first time period
  • The first communication device can determine the degree of nonlinearity of the channel through the value of σ².
  • the first time period is any continuous period of time, and this embodiment of the present application does not limit the length of the first time period.
  • the first communication device may determine the type of Kalman filtering by judging the degree of nonlinearity of the channel in the first time period.
  • The model training method provided in the embodiment of the present application can judge the impact of the environment on the channel through the degree of nonlinearity of the channel within the first time period, and reduce that impact by changing the type of Kalman filter, so that the complexity and the precision of updating the first machine learning model are balanced.
  • The first communication device may preset a first threshold. When the variance of the second loss function is greater than or equal to the first threshold, the degree of nonlinearity of the channel within the first time period is strong nonlinearity; when the variance of the second loss function is smaller than the first threshold, the degree of nonlinearity of the channel within the first time period is weak nonlinearity.
  • the value of the first threshold and the number of first thresholds may be determined by the first communication device according to the calculation accuracy of the Kalman filter.
  • The first communication device can obtain second-order estimation accuracy by using the third-order integration method of the volumetric Kalman filter; in this case, one first threshold is set to classify the degree of nonlinearity into strong nonlinearity and weak nonlinearity.
  • The first threshold is a value greater than 0 and less than 1. It should be understood that if the first communication device adopts a higher-order integration method in the volumetric Kalman filter, higher calculation accuracy can be obtained, the value of the first threshold can be different, and there may be one or more first thresholds.
  • When the degree of nonlinearity of the channel within the first time period is strong nonlinearity, the type of Kalman filtering may be volumetric Kalman filtering; when the degree of nonlinearity of the channel within the first time period is weak nonlinearity, the type of Kalman filtering may be extended Kalman filtering.
  • When the degree of nonlinearity of the channel is weak nonlinearity, the first communication device may select an extended Kalman filter with low complexity to update the parameters of the first machine learning model; when the degree of nonlinearity of the channel is strong nonlinearity, the first communication device may select a volumetric Kalman filter with relatively high complexity to update the parameters of the first machine learning model.
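  • The threshold-based selection of the filter type can be sketched as follows. The variance formula itself is not reproduced in this text, so the sketch assumes the population variance of the loss values received within the first time period; the function names and the labels "CKF" and "EKF" are illustrative only.

```python
import statistics


def classify_nonlinearity(received_losses, first_threshold):
    """Classify the channel as 'strong' or 'weak' nonlinear.

    Assumption: the variance is the population variance of the second
    loss functions received during the first time period.
    """
    variance = statistics.pvariance(received_losses)
    return "strong" if variance >= first_threshold else "weak"


def select_filter(received_losses, first_threshold):
    """Pick the Kalman filter type from the degree of nonlinearity.

    Strong nonlinearity -> volumetric (cubature) Kalman filter, 'CKF'.
    Weak nonlinearity   -> extended Kalman filter, 'EKF'.
    """
    degree = classify_nonlinearity(received_losses, first_threshold)
    return "CKF" if degree == "strong" else "EKF"
```

  • A single threshold gives two classes; with several first thresholds, the same comparison could be extended to more classes and higher-order integration rules.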
  • the first communication device may update the parameters of the first machine learning model in a higher-order integral manner.
  • The number of sampling points may be n² + n + 1; the calculation accuracy is higher, and this is more suitable for strongly nonlinear channel estimation.
  • When the degree of nonlinearity of the channel is weak nonlinearity, the first communication device may reduce the number of layers of the control layer; when the degree of nonlinearity of the channel is strong nonlinearity, the first communication device may increase the number of layers of the control layer.
  • The above-mentioned parameter of the control layer is θ_c
  • The first communication device can adaptively change the number of layers covered by the parameter θ_c of the control layer according to the degree of nonlinearity of the channel.
  • When the nonlinearity of the channel is weak, fewer control layer parameters can eliminate the influence of channel errors and reduce the complexity of updating the control layer parameters. At the same time, when the parameters of the first network layer are updated, the calculation amount of reverse gradient propagation is reduced, thereby reducing the complexity of training the first machine learning model.
  • When the nonlinearity of the channel is strong, more control layer parameters can eliminate the influence of the strongly nonlinear channel error and improve the update accuracy of the control layer parameters.
  • the above-mentioned first data may include N sets of data, where N is a positive integer, and the value of N is determined according to the type of Kalman filter and the dimension of the parameters of the control layer.
  • the number of the first data can be determined according to the type of Kalman filter and the dimension of the parameters of the control layer.
  • the first training data is a set of data, and the first communication device respectively inputs the set of data into the first machine learning model, 12 sets of first data can be obtained.
  • the first data may be a set of data, that is, there is no need to sample the parameters of the control layer.
  • the above method of updating the parameters of the control layer is still applicable.
  • model training method provided by the embodiment of the present application will be described in detail by taking the first communication device sampling the parameters of the control layer and then performing model training as an example.
  • FIG. 5 shows a schematic flowchart of another model training method 500 provided by the embodiment of the present application. As shown in Figure 5, the method may include the following steps:
  • the first communication device samples parameters of the control layer to obtain sampling points of the parameters of the control layer.
  • The control layer may be the last one or more network layers of the first machine learning model.
  • The first communication device may first initialize the parameter θ_0 of the control layer in the first machine learning model and the error covariance P_0|0 of the parameter θ_0, where the error covariance is set using the identity matrix I. Then, the first communication device may sample θ_0.
  • The sampling points at time k, where k ≥ 1, and the sampling points at time 0 are recorded with their own notation. It should be understood that the moment can be understood as the moment of sampling or the number of times of sampling.
  • k refers to the moment when the Kalman filter is updated, or the number of training times.
  • the value range of k is determined by the entire training process, that is, the training is terminated when the above-mentioned first loss function is lower than the preset threshold.
  • P_k|k-1 is the error covariance between the sampling parameters of the control layer at time k-1 and the sampling parameters of the control layer at time k.
  • Q_k-1 is the system noise, and the relationship between Q_k-1 and the error covariance is set by the forgetting factor.
  • The forgetting factor means that an exponentially decaying weight is applied to the past data, and its value lies between 0 and 1.
  • the first communication device may use the volume method to calculate the Gaussian weight integral, as shown in the following formula (5):
  • e i represents the unit column vector whose i-th element is 1.
  • The sampling points of the parameters of the control layer can be calculated by generating 2n sampling points, where n is a positive integer greater than or equal to 1.
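  • The generation of 2n sampling points can be sketched as follows. Formulas (5) and (6) are not reproduced in this text, so the sketch assumes the standard third-degree spherical-radial cubature rule, in which the points are the current estimate shifted by ±sqrt(n) times each column of a square root of the error covariance. The function name `cubature_points` and the use of a Cholesky factor are assumptions.

```python
import numpy as np


def cubature_points(theta_hat, P):
    """Generate 2n equally weighted cubature points for an n-dim parameter.

    theta_hat: (n,) current estimate of the control-layer parameters.
    P:         (n, n) symmetric positive-definite error covariance.
    Returns an (n, 2n) array whose columns are the sampling points.
    """
    n = theta_hat.shape[0]
    S = np.linalg.cholesky(P)  # lower-triangular factor with P = S @ S.T
    # Unit directions +e_i and -e_i, scaled by sqrt(n).
    xi = np.sqrt(n) * np.hstack([np.eye(n), -np.eye(n)])
    return theta_hat[:, None] + S @ xi
```

  • By construction, the mean of the 2n points equals the estimate and their sample covariance equals P, which is why each point can be given the same weight 1/(2n).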
  • the first communication device may input first training data into a first machine learning model to obtain first data, where the first machine learning model includes sampling points of parameters of the control layer.
  • the first training data is a set of data. If the type of Kalman filter is volumetric Kalman filter, then the sampling points of the parameters of the control layer are 2n, and the first data may be 2n sets of data.
  • the first communication device may send the first data to the second communication device through a channel.
  • the second communication device receives second data through the channel, where the second data is obtained after the first data sent by the first communication device is transmitted through the channel.
  • If the first data is 2n sets of data, the second data is also 2n sets of data.
  • the second communication device inputs the second data into the second machine learning model to obtain third data.
  • the third data is 2n sets of data.
  • the third data can be represented by the following formula (7):
  • u_k is the first training data
  • θ_k is the control layer parameter at this moment
  • the distribution notation indicates a Gaussian distribution with the indicated mean and with variance P_k
  • k is the number of rounds of training or the moment of training.
  • the second communication device may estimate the error covariance P dd between the third data according to the third data, and the P dd may be expressed by the following formula (8):
  • R k is the covariance of the observation error.
  • the second communication device may use the volumetric method to calculate the Gaussian weighted integral.
  • D is the center vector, and D can be expressed by the following formula (11):
  • the second communication device determines a first loss function by using the third data as a predicted value and the first training data as a real value.
  • There are 2n first loss functions: the first training data passes through each sampling point to obtain one first data, each first data passes through the channel and the second machine learning model to obtain one third data, and one first loss function is obtained from each third data and the first training data. Because there are 2n sampling points, there are 2n first loss functions.
  • the second communication device may calculate cross entropy as a first loss function, and the first loss function L k may be expressed by the following formula (12):
  • the first training data is u k
  • The third data is h(u_k; θ_k).
  • The training goal is to make the error between the real value and the third data as small as possible, that is, to make the first loss function L_k as small as possible, so L_k can be approximated as 0, as shown in the following formula (13):
  • The second communication device can therefore observe the first loss function instead of calculating the observed third data described above.
  • The observation value of the second communication device, that is, the first loss function, is recorded as L_{i,k}, where i is an integer taken from {1, 2, ..., 2n}.
  • the second communication device updates parameters of the second machine learning model based on reverse gradient propagation according to the first loss function, to obtain an updated second machine learning model.
  • The second communication device may calculate the mean value of the first loss function and use this mean value to update the parameters of the second machine learning model based on reverse gradient propagation.
  • The mean value of the first loss function can be (1/(2n)) · Σ L_{i,k}, summed over i = 1, ..., 2n, where L_{i,k} are the 2n first loss functions.
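  • The per-sampling-point cross-entropy losses and their mean can be sketched as follows. Formula (12) is not reproduced in this text, so the sketch assumes the usual categorical cross entropy between a one-hot real value and a predicted probability vector; the function names and the `eps` clipping constant are assumptions.

```python
import numpy as np


def cross_entropy(one_hot_target, predicted_probs, eps=1e-12):
    """Categorical cross entropy between a one-hot target and a prediction."""
    p = np.clip(predicted_probs, eps, 1.0)  # avoid log(0)
    return float(-np.sum(one_hot_target * np.log(p)))


def first_losses(one_hot_target, third_data):
    """One first loss function per sampling point.

    third_data: (2n, classes) array, one predicted distribution per point.
    Returns a (2n,) array of losses L_{i,k}.
    """
    return np.array([cross_entropy(one_hot_target, p) for p in third_data])
```

  • The mean used for reverse gradient propagation at the second communication device is then simply `first_losses(target, third_data).mean()`.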
  • the second communication device sends the first loss function to the first communication device through a feedback channel, the feedback channel is determined by the second communication device according to the observation error, and the first loss function is used to update parameters of the first machine learning model.
  • the second communication device may send the first loss function L i,k to the first communication device through a feedback channel, that is, send 2n first loss functions respectively.
  • the second communication device can dynamically estimate the observation error according to the environment change, and make the error of the channel approximately the same as the observation error through power control, so as to construct the feedback channel.
  • The second communication device may define the error between the predicted value and the real value and preset the error covariance of this error.
  • The value of R_max may be an empirical value, and the empirical value may be determined by the second communication device according to the received error covariance from the first communication device.
  • The second communication device can, based on this error, estimate the error covariance between the predicted value and the real value over a period of time, which can be expressed by the following formula (15):
  • T i is the duration of the period, i ⁇ 0.
  • A corresponding table may also be established for the adjustment of R_k. The table contains the correspondence between an index and the value of R_k, with indexes from large to small corresponding to values from R_max down to R_min.
  • The second communication device can calculate the estimated error covariance to determine the selection of R_k. For example, when the adjustment condition on this value is met, the index decreases by 1 and the selected R_k decreases accordingly; otherwise, the adjustment is stopped. The observation error covariance is R_k = r_k I, where I is an identity matrix and r_k is a variance.
  • The second communication device may, through power control, model the channel as an additive white Gaussian noise (AWGN) channel with a mean value of 0 and a variance of r_k, and feed back the 2n first loss functions L_{i,k} to the first communication device, for the first communication device to update the parameters of the control layer.
  • The second communication device may also send the mean value of the first loss function to the first communication device through the channel and, at the same time, send the center vector D to the first communication device.
  • the first communication device receives the second loss function through the feedback channel, and the second loss function is obtained after the first loss function sent by the second communication device is transmitted through the channel.
  • The second communication device models the feedback channel as an AWGN channel whose channel error is the observation error r_k, so the second loss function is obtained after the first loss function is transmitted through the feedback channel, as shown in the following formula (16):
  • The second communication device sends the first loss function L_{i,k} to the first communication device through the feedback channel, and the first communication device receives the corresponding second loss function.
  • The second communication device may send the center vector D to the first communication device through the feedback channel; correspondingly, the first communication device may receive, through the feedback channel, the mean value of the second loss function and the center vector D with observation errors.
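  • The effect of the feedback channel on the loss values can be sketched as follows, assuming, as described above, an AWGN channel with mean 0 and variance r_k. The function name `feedback_channel` and the explicit random generator argument are illustrative choices, not part of the described method.

```python
import numpy as np


def feedback_channel(first_losses, r_k, rng):
    """Pass the first loss functions through a zero-mean AWGN channel.

    first_losses: array-like of loss values sent by the second device.
    r_k:          noise variance of the feedback channel (r_k >= 0).
    rng:          a numpy.random.Generator, for reproducible noise.
    Returns the second loss functions as seen by the first device.
    """
    sent = np.asarray(first_losses, dtype=float)
    noise = rng.normal(0.0, np.sqrt(r_k), size=sent.shape)
    return sent + noise
```

  • For example, `feedback_channel(losses, 0.04, np.random.default_rng(0))` returns each loss perturbed by noise with standard deviation 0.2.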
  • the first communication device obtains a Kalman gain according to the second loss function, the prior parameters of the control layer, and the error covariance of the second loss function.
  • the first communication device may estimate the error covariance of the second loss function according to the second loss function.
  • The first communication device can obtain the cross-covariance P_θd of the second loss function according to the prior parameters of the control layer, where P_θd can be expressed by the following formula (18):
  • P_θd can be expressed by the following formula (19) or (20):
  • the first communication device can obtain the Kalman gain G k according to the error covariance of the second loss function and the cross covariance of the second loss function, where G k can be expressed by the following formula (21):
  • the first communication device updates the parameters of the control layer according to the Kalman gain, and obtains the updated parameters of the control layer.
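  • The Kalman-gain update of the control layer can be sketched as follows. Formulas (17) to (22) are not reproduced in this text, so this sketch assumes the standard cubature Kalman filter measurement update with a scalar observation (the received loss) and a target observation of 0, in line with the approximation of the loss as 0 described earlier. The function name, argument layout, and equal weights 1/(2n) are assumptions.

```python
import numpy as np


def ckf_update(points, theta_prior, P_prior, received_losses, r_k):
    """One cubature-Kalman-style update of the control-layer parameters.

    points:          (n, 2n) sampling points of the control-layer parameters.
    theta_prior:     (n,) prior parameter estimate.
    P_prior:         (n, n) prior error covariance.
    received_losses: (2n,) second loss functions, one per sampling point.
    r_k:             observation-noise variance of the feedback channel.
    """
    m = points.shape[1]
    losses = np.asarray(received_losses, dtype=float)
    loss_mean = losses.mean()
    d_loss = losses - loss_mean
    d_theta = points - theta_prior[:, None]
    P_dd = d_loss @ d_loss / m + r_k      # error covariance of the loss
    P_theta_d = d_theta @ d_loss / m      # cross-covariance, shape (n,)
    G = P_theta_d / P_dd                  # Kalman gain, shape (n,)
    # Target observation is 0, so the innovation is (0 - mean loss).
    theta_post = theta_prior + G * (0.0 - loss_mean)
    P_post = P_prior - P_dd * np.outer(G, G)
    return theta_post, P_post, G
```

  • For a loss that is linear in the parameters, this update shrinks the predicted loss by the factor r_k / (P_dd), so a smaller observation-noise variance yields a more aggressive correction.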
  • The first communication device updates the parameters of the first network layer in the first machine learning model based on reverse gradient propagation according to the updated parameters of the control layer and the Kalman gain, and obtains the updated parameters of the first network layer, where the first network layer includes the network layers preceding the control layer.
  • FIG. 6 shows a schematic diagram of updating parameters of the first network layer.
  • The parameters of the control layer are recorded as θ_c
  • The parameters of the first network layer are recorded as θ_{z-c}
  • θ_{z-c} represents the weight between the layer l_{z-c-1} and the layer l_{z-c} to which it belongs in the network
  • The gradient based on the Kalman filter is formed from the Kalman gain and the second loss function, where j is the number of updates, G is the Kalman gain calculated for the j-th time, and the second loss function is the one obtained from the j-th calculation.
  • From this, the parameter update method for the network layers before the control layer can be obtained, as shown in the following formula (23):
  • z is the total number of network layers of the first machine learning model, and j is an integer taken from {1, 2, ..., z-c}.
  • the first communication device obtains an updated first machine learning model according to the updated parameters of the control layer and the updated parameters of the first network layer.
  • The model training method provided by the embodiment of the present application samples the parameters of the control layer and thereby better integrates Kalman filtering into model training. This further increases the feasibility of model training, improves the convergence speed of training the autoencoder, and improves the robustness of the autoencoder, thereby improving the quality of end-to-end communication.
  • the method 500 is also simulated to verify the effect of the method 500 .
  • The simulation is performed under an AWGN channel with time-varying perturbation, and the method 500 proposed in the embodiment of the present application is compared with the policy gradient (PG) method based on reinforcement learning.
  • The method 500 proposed in the embodiment of the present application is a training method based on a volumetric Kalman filter (cubature Kalman filter, CKF).
  • the signal-to-noise ratio of the channel changes in real time, and the value range of the signal-to-noise ratio can be set to [10,25], where the unit of the signal-to-noise ratio is decibel.
  • the modulation order is 4, and the length of the first training data is 256.
  • The first training data needs to be one-hot encoded before being input into the machine learning model based on the volumetric Kalman filter, which yields training data of length 16.
  • the above simulation iterates 4000 times for CKF and PG respectively, and observes the cross-entropy loss and bit error rate changes of the two algorithms respectively.
  • FIG. 7 shows a schematic diagram of the cross-entropy loss based on the model training method provided by the embodiment of the present application.
  • The loss of CKF descends faster than that of PG, the loss disturbance of CKF is smaller than that of PG, and the cross-entropy loss of CKF is smaller than that of PG. A smaller cross-entropy loss indicates that the channel has less impact on the communication between the first communication device and the second communication device.
  • FIG. 8 shows a schematic diagram of changes in the bit error rate based on the model training method provided by the embodiment of the present application. As shown in Figure 8, as the number of iterations increases, the rate of decline of CKF is greater than that of PG, and the bit error rate of CKF is smaller than that of PG.
  • The above-mentioned first data may include M sets of data, where M is a positive integer. The value of M is determined by the first communication device and other first communication devices according to preset rules, and the sum of M and the numbers of data sent by the other first communication devices is determined according to the type of Kalman filter and the dimension of the parameters of the control layer.
  • the data sent by the other first communication devices includes the output result of the machine learning model in each of the other first communication devices.
  • the above-mentioned first communication device may determine the quantity M of the first data with other first communication devices according to preset rules.
  • the preset rule may be that in the communication system, the number of output results of the machine learning models in each first communication device is greater than or equal to 1, and the number of output results of the machine learning models in all first communication devices The sum of the numbers is determined by the type of Kalman filter and the dimension of the parameters of the control layer.
  • Multiple first communication devices in the communication system can determine their own sampling points by communicating with each other. For example, if there are a total of a first communication devices in the communication system and they form a ring topology, then a-1 of the first communication devices may communicate with each other to determine the sequence of sampling point numbers.
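  • One way to divide the sampling points among several first communication devices can be sketched as follows. The description only requires that each device handles at least one sampling point and that the counts add up to the required total (for example 2n for the volumetric Kalman filter); the near-even split and the function name `split_sampling_points` are assumptions made for illustration.

```python
def split_sampling_points(total_points, num_devices):
    """Assign sampling-point indexes to devices, at least one per device.

    total_points: total number of sampling points (e.g. 2n).
    num_devices:  number of first communication devices sharing the work.
    Returns a list with one list of point indexes per device.
    """
    if num_devices < 1 or total_points < num_devices:
        raise ValueError("each device needs at least one sampling point")
    base, extra = divmod(total_points, num_devices)
    assignments = []
    start = 0
    for device in range(num_devices):
        count = base + (1 if device < extra else 0)
        assignments.append(list(range(start, start + count)))
        start += count
    return assignments
```

  • Any other split with a positive share for every device would satisfy the same rule; an uneven split could, for instance, give fewer points to a device with less computing capacity.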
  • the model training method provided by the embodiment of the present application divides the sampling of the control layer into multiple subtasks, which are jointly completed by multiple first communication devices, which can reduce the calculation amount of the first communication device, thereby reducing the calculation burden of the first communication device , to ensure the deployment of online training.
  • the parameters of the control layer can still be updated according to the above method 300 to obtain the updated parameters of the control layer.
  • the first communication device may send the first data to the second communication device through a channel, and the first data may include M sets of data, if the output results of the machine learning models in the multiple first communication devices in the communication system are The sum of the numbers is P, then the number of first loss functions that the first communication device can receive through the channel is P, and it should be understood that the value of P is greater than or equal to the value of M.
  • the first communication device may update parameters of the control layer according to the P first loss functions to obtain updated parameters of the control layer.
  • the first communication device may transmit the updated parameters of the control layer to other first communication devices through mutual communication.
  • Multiple first communication devices in the communication system adopt a central distributed training method, in which the sampling of the control layer is divided into multiple subtasks that are jointly completed by the multiple first communication devices.
  • The above-mentioned first communication device can serve as the central communication device: it can receive the first loss function sent by the second communication device, train to obtain the parameters of the control layer, and then send them to the other first communication devices.
  • FIG. 9 shows a schematic flowchart of another model training method 900 .
  • the communication system may include a first communication device 1, a first communication device 2, and a second communication device.
  • The first communication device 1 is deployed with a first machine learning model 1, and the first communication device 2 is deployed with a first machine learning model 2.
  • the number of first communication devices in the communication system is only an example, and that the first communication device 2 is a distributed central first communication device is only an example, which is not limited in this embodiment of the present application.
  • method 900 may include the following steps:
  • The first communication device 1 inputs the first training data into the first machine learning model 1 to obtain the first data 1.
  • The first machine learning model 1 includes sampling points 1 of the parameters of the control layer, and the sampling points 1 are obtained by the first communication device 1 sampling the parameters of the control layer in the first machine learning model 1.
  • The first communication device 2 inputs the first training data into the first machine learning model 2 to obtain the first data 2.
  • The first machine learning model 2 includes sampling points 2 of the parameters of the control layer, and the sampling points 2 are obtained by the first communication device 2 sampling the parameters of the control layer in the first machine learning model 2.
  • The initial parameters of the first machine learning model 1 and the first machine learning model 2 may be the same or different.
  • The first communication device 1 and the first communication device 2 may determine the number of sampling points 1 and the number of sampling points 2 according to preset rules. Exemplarily, if the first communication device 1 and the first communication device 2 use the volumetric Kalman filter to train the first machine learning model 1 and the first machine learning model 2, and the number of layers of the network layer of the two models is the same, both being n, then the sum of the number of sampling points 1 and the number of sampling points 2 is 2n, and the ratio of the number of sampling points 1 to the number of sampling points 2 can be any value greater than 0.
  • the first communication device 2 may obtain sampling points 2 of the parameters of the control layer by sampling the parameters of the control layer.
  • the first communication device 1 sends the first data 1 to the second communication device through a channel.
  • the second communication device receives the second data 1 through the channel, where the second data 1 is obtained after the first data 1 is transmitted through the channel.
  • the first communication device 2 sends the first data 2 to the second communication device through the channel.
  • the second communication device receives the second data 2 through the channel, where the second data 2 is obtained after the first data 2 is transmitted through the channel.
  • the second communication device determines a first loss function according to the second data 1 and the second data 2.
  • The second communication device can input the second data 1 and the second data 2 into the second machine learning model to obtain the third data 1 and the third data 2, respectively. It determines the first loss function 1 with the third data 1 as the predicted value and the first training data as the real value, and determines the first loss function 2 with the third data 2 as the predicted value and the first training data as the real value.
  • the above-mentioned first loss function includes a first loss function 1 and a first loss function 2 .
  • the specific implementation manner is the same as that of S505 and S506 above, and will not be repeated here.
  • the second communication device sends the first loss function to the first communication device 2 through the feedback channel.
  • the process of constructing the feedback channel by the second communication device is the same as that in the foregoing embodiment, and will not be repeated here.
  • the first communication device 2 receives the second loss function through the feedback channel, where the second loss function is obtained after the first loss function is transmitted through the feedback channel.
  • the first communication device 2 is the central first communication device, and the first communication device 2 can receive all the second loss functions sent by the second communication device through the feedback channel.
  • the first communication device 2 obtains updated parameters of the control layer according to the second loss function.
  • the first communication device 2 sends the updated parameters of the control layer to the first communication device 1 .
  • the first communication device 2 is the central first communication device, and can send the updated parameters of the control layer to other first communication devices, that is, the first communication device 1 .
  • The model training method provided in the embodiment of the present application adopts a central distributed training method and divides the sampling of the control layer into multiple subtasks, which are jointly completed by two first communication devices. This reduces the calculation amount of the non-central first communication device (the first communication device 1). The central first communication device sends the updated parameters of the control layer to the other first communication devices, which improves the efficiency of updating the parameters of the control layer.
  • The above-mentioned first communication device 1 may also send the first data 1 to the first communication device 2, and the first communication device 2 fuses the first data 1 with the first data 2 and sends the result to the second communication device through a channel.
  • The first communication device may also transmit the updated parameters of the first machine learning model to the other first communication devices.
  • The model training method provided by the embodiment of the present application adopts a central distributed training method. After the central first communication device completes training, it can send the updated model parameters to the other first communication devices, which saves the training cost of the other first communication devices and reduces their calculation amount.
  • the first communication device may also transmit the updated parameters of the control layer and the Kalman gain to other first communication devices through mutual communication.
  • the other first communication devices may, according to the received updated parameters of the control layer and the Kalman gain, update the parameters of the first network layer in the first machine learning model based on reverse gradient propagation, and then update the parameters of the first machine learning model.
  • the above-mentioned first communication device may transmit the prior parameters of the control layer, the second loss function, the error covariance of the second loss function, and the updated parameters of the control layer to other first communication devices through mutual communication.
  • the other first communication devices may first determine the Kalman gain according to the received prior parameters of the control layer, the second loss function, and the error covariance of the second loss function, then update the parameters of the first network layer in the first machine learning model based on reverse gradient propagation according to the updated parameters of the control layer and the Kalman gain, and then update the parameters of the first machine learning model.
  • sequence numbers of the above processes do not mean the order of execution, and the execution order of each process should be determined by its function and internal logic, and should not constitute any limitation on the implementation process of the embodiment of the present application.
  • model training method of the embodiment of the present application is described in detail above with reference to FIG. 1 to FIG. 9 , and the relevant apparatus for model training of the embodiment of the present application will be described in detail below in conjunction with FIG. 10 and FIG. 11 .
  • FIG. 10 shows a schematic block diagram of an apparatus 1000 related to model training provided by an embodiment of the present application.
  • the apparatus 1000 includes: a transceiver unit 1010 and a processing unit 1020 .
  • the device 1000 may implement various steps or processes performed by the first communication device associated with the method embodiment 300 above.
  • the transceiver unit 1010 is configured to: send the first data to the second communication device through the channel, where the first data is the output result of the first training data input to the first machine learning model, the first machine learning model includes a control layer, and the control layer is at least one layer of the first machine learning model; and receive the second loss function through the feedback channel, where the feedback channel is determined according to the observation error, and the second loss function is obtained after the first loss function sent by the second communication device is transmitted through the feedback channel.
  • the processing unit 1020 is configured to: update the parameters of the control layer based on the Kalman filter according to the second loss function to obtain the updated parameters of the control layer, and the updated parameters of the control layer are used to update the parameters of the first machine learning model.
  • the processing unit 1020 is further configured to: obtain the Kalman gain according to the prior parameters of the control layer, the second loss function, and the error covariance of the second loss function; and update the parameters of the control layer according to the Kalman gain to obtain the updated parameters of the control layer.
  • the transceiver unit 1010 is further configured to: send fourth data to the second communication device through the channel, where the fourth data is the output result of the second training data input to the first machine learning model; and receive instruction information from the second communication device, where the instruction information is used to instruct the device to stop the training of the first machine learning model.
  • the processing unit 1020 is further configured to: stop the training of the first machine learning model according to the instruction information.
  • the first data includes N sets of data, where N is a positive integer, and the value of N is determined according to the type of Kalman filter and the dimension of the parameters of the control layer.
  • the first data includes M sets of data, where M is a positive integer, the value of M is determined by the device and other first communication devices according to preset rules, and the sum of M and the number of data sets sent by the other first communication devices is determined according to the type of Kalman filter and the dimension of the parameters of the control layer.
  • the transceiver unit 1010 is further configured to: send the updated parameters of the control layer to other first communication devices.
  • the above-mentioned processing unit 1020 is further configured to: judge the degree of nonlinearity of the channel in the first time period according to the variance of multiple loss functions received in the first time period, the multiple loss functions including the second loss function; and determine the type of Kalman filter according to the degree of nonlinearity of the channel in the first time period.
  • the variance of the second loss function is greater than or equal to the first threshold, and the degree of nonlinearity of the channel in the first time period is strongly nonlinear; or, the variance of the second loss function is smaller than the first threshold, and the degree of nonlinearity of the channel in the first time period is weakly nonlinear.
  • the degree of nonlinearity of the channel in the first time period is strongly nonlinear, and the type of Kalman filter is the cubature Kalman filter; or, the degree of nonlinearity of the channel in the first time period is weakly nonlinear, and the type of Kalman filter is the extended Kalman filter.
  • the device 1000 may implement various steps or processes corresponding to the execution of the second communication device in the method embodiment 300 above.
  • the transceiver unit 1010 is configured to: receive second data through a channel, where the second data is obtained after the first data sent by the first communication device is transmitted through the channel, the first data is the output result of the first training data input to the first machine learning model, the first machine learning model includes a control layer, and the control layer is at least one layer of the first machine learning model.
  • the processing unit 1020 is configured to: input the second data into the second machine learning model to obtain the third data; and determine the first loss function according to the third data and the first training data, where the first loss function is used to update the parameters of the control layer of the first machine learning model.
  • the transceiver unit 1010 is also configured to: send the first loss function to the first communication device through a feedback channel, where the feedback channel is determined according to the observation error, and the first loss function is used to update the parameters of the control layer of the first machine learning model.
  • the processing unit 1020 is further configured to: update parameters of the second machine learning model based on reverse gradient propagation according to the first loss function, to obtain an updated second machine learning model.
  • the transceiver unit 1010 is further configured to: receive fifth data through a channel, where the fifth data is obtained after the fourth data sent by the first communication device is transmitted through the channel, and the fourth data is the output result of the second training data input to the first machine learning model; the above-mentioned processing unit 1020 is configured to: input the fifth data into the second machine learning model to obtain the sixth data, and determine the third loss function according to the sixth data and the second training data; the above-mentioned transceiver unit 1010 is further configured to: if the third loss function is lower than the preset threshold, send instruction information to the first communication device, where the instruction information is used to instruct the first communication device to stop the training of the first machine learning model.
  • the apparatus 1000 here is embodied in the form of functional units.
  • the term "unit” here may refer to an application specific integrated circuit (ASIC), an electronic circuit, a processor for executing one or more software or firmware programs (such as a shared processor, a dedicated processor, or a group processor, etc.) and memory, incorporated logic, and/or other suitable components to support the described functionality.
  • the device 1000 may specifically be the first communication device or the second communication device in the above embodiments, or the functions of the first communication device or the second communication device in the above embodiments may be integrated in the device, and the device can be used to execute the various processes and/or steps corresponding to the first communication device or the second communication device in the above method embodiments. To avoid repetition, details are not repeated here.
  • the above-mentioned device 1000 has the function of implementing the corresponding steps performed by the first communication device or the second communication device in the above-mentioned embodiment; the above-mentioned function can be realized by hardware, or can be realized by executing corresponding software by hardware.
  • the hardware or software includes one or more modules corresponding to the above functions.
  • the above-mentioned transceiver unit 1010 may include a sending unit and a receiving unit; the sending unit may be used to implement the steps and/or processes of the sending actions of the above-mentioned transceiver unit, and the receiving unit may be used to implement the steps and/or processes of the receiving actions of the above-mentioned transceiver unit.
  • the sending unit may be replaced by a transmitter, and the receiving unit may be replaced by a receiver, respectively performing transceiving operations and related processing operations in each method embodiment.
  • the transceiver unit 1010 may be replaced by a communication interface to perform transceiving operations in various method embodiments.
  • the communication interface may be a circuit, a module, a bus, a bus interface, a transceiver, and other devices capable of implementing a communication function.
  • the processing unit 1020 in the above embodiments may be implemented by a processor or a processor-related circuit, and the transceiver unit 1010 may be implemented by a transceiver, a transceiver-related circuit, or an interface circuit.
  • a storage unit may also be included; the storage unit is used to store a computer program, and the processing unit 1020 may call and run the computer program from the storage unit, so that the device 1000 executes the method of the first communication device or the second communication device in the above method embodiments, which is not limited in this embodiment of the present application.
  • the units in the above embodiments may also be referred to as modules, circuits, or components.
  • the device in FIG. 10 may also be a chip or a chip system, for example, a system on chip (SoC).
  • the transceiver unit may be a transceiver circuit of the chip, which is not limited here.
  • FIG. 11 shows a schematic block diagram of another model training related apparatus 1100 provided by an embodiment of the present application.
  • the apparatus 1100 includes a processor 1110 and a transceiver 1120 .
  • the processor 1110 and the transceiver 1120 communicate with each other through an internal connection path, and the processor 1110 is used to execute instructions to control the transceiver 1120 to send signals and/or receive signals.
  • the apparatus 1100 may further include a memory 1130, and the memory 1130 communicates with the processor 1110 and the transceiver 1120 through an internal connection path.
  • the memory 1130 is used to store instructions, and the processor 1110 can execute the instructions stored in the memory 1130 .
  • the device 1100 is configured to implement various processes and steps corresponding to the first communication device or the second communication device in the above method embodiments.
  • the device 1100 may specifically be the first communication device or the second communication device in the foregoing embodiments, or may be a chip or a chip system.
  • the transceiver 1120 may be a transceiver circuit of the chip, which is not limited here.
  • the device 1100 may be configured to execute various steps and/or processes corresponding to the first communication device or the second communication device in the foregoing method embodiments.
  • the memory 1130 may include read-only memory and random-access memory, and provides instructions and data to the processor. A portion of the memory may also include non-volatile random access memory.
  • the memory may also store device type information.
  • the processor 1110 can be used to execute instructions stored in the memory, and when the processor 1110 executes the instructions stored in the memory, the processor 1110 can be used to execute each step and/or process of the method embodiments corresponding to the first communication device or the second communication device.
  • each step of the above method can be completed by an integrated logic circuit of hardware in a processor or an instruction in the form of software.
  • the steps of the methods disclosed in connection with the embodiments of the present application may be directly implemented by a hardware processor, or implemented by a combination of hardware and software modules in the processor.
  • the software module can be located in a mature storage medium in the field such as random access memory, flash memory, read-only memory, programmable read-only memory or electrically erasable programmable memory, register.
  • the storage medium is located in the memory, and the processor reads the information in the memory, and completes the steps of the above method in combination with its hardware. To avoid repetition, no detailed description is given here.
  • the processor in the embodiment of the present application may be an integrated circuit chip, which has a signal processing capability.
  • each step of the above-mentioned method embodiments may be completed by an integrated logic circuit of hardware in a processor or instructions in the form of software.
  • the above-mentioned processor may be a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic devices, discrete gate or transistor logic devices, discrete hardware components .
  • the processor in the embodiment of the present application may realize or execute the various methods, steps and logic block diagrams disclosed in the embodiment of the present application.
  • a general-purpose processor may be a microprocessor, or the processor may be any conventional processor, or the like.
  • the steps of the method disclosed in connection with the embodiments of the present application may be directly implemented by a hardware decoding processor, or implemented by a combination of hardware and software modules in the decoding processor.
  • the software module can be located in a mature storage medium in the field such as random access memory, flash memory, read-only memory, programmable read-only memory or electrically erasable programmable memory, register.
  • the storage medium is located in the memory, and the processor reads the information in the memory, and completes the steps of the above method in combination with its hardware.
  • the memory in the embodiments of the present application may be a volatile memory or a nonvolatile memory, or may include both volatile and nonvolatile memories.
  • the non-volatile memory can be read-only memory (ROM), programmable read-only memory (PROM), erasable programmable read-only memory (EPROM), electrically erasable programmable read-only memory (EEPROM) or flash memory.
  • Volatile memory can be random access memory (RAM), which acts as external cache memory. Many forms of RAM are available, such as static random access memory (SRAM), dynamic random access memory (DRAM), synchronous dynamic random access memory (SDRAM), double data rate synchronous dynamic random access memory (DDR SDRAM), enhanced synchronous dynamic random access memory (ESDRAM), synchlink dynamic random access memory (SLDRAM) and direct rambus random access memory (DR RAM).
  • the present application also provides a computer program product, the computer program product including computer program code; when the computer program code is run on a computer, the computer is made to execute the methods shown in the above embodiments.
  • the present application also provides a computer-readable storage medium, the computer-readable storage medium stores program code, and when the program code is run on a computer, the computer is made to execute the methods shown in the above embodiments.
  • the present application also provides a chip, the chip includes a processor configured to read instructions stored in a memory, and when the processor executes the instructions, the chip implements the methods shown in the above embodiments.
  • the present application provides a computer program that, when running on a computer, enables the methods in the possible implementation manners in the foregoing method embodiments to be executed.
  • the present application also provides a communication system, including the first communication device and the second communication device in the foregoing embodiments.
  • the disclosed systems, devices and methods may be implemented in other ways.
  • the device embodiments described above are only illustrative.
  • the division of units is only a logical function division. In actual implementation, there may be other division methods.
  • multiple units or components can be combined or integrated into another system, or some features may be ignored or not implemented.
  • the mutual coupling or direct coupling or communication connection shown or discussed may be through some interfaces, and the indirect coupling or communication connection of devices or units may be in electrical, mechanical or other forms.
  • a unit described as a separate component may or may not be physically separated, and a component displayed as a unit may or may not be a physical unit, that is, it may be located in one place, or may be distributed to multiple network units. Part or all of the units can be selected according to actual needs to achieve the purpose of the solution of this embodiment.
  • each functional unit in each embodiment of the present application may be integrated into one processing unit, each unit may exist separately physically, or two or more units may be integrated into one unit.
  • if the functions are realized in the form of software functional units and sold or used as independent products, they can be stored in a computer-readable storage medium.
  • the technical solution of the present application in essence, or the part that contributes to the prior art, or a part of the technical solution, can be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that are used to make a computer device (which may be a personal computer, a server, a network device, etc.) execute all or part of the steps of the methods in the various embodiments of the present application.
  • the aforementioned storage medium includes: a USB flash drive, a removable hard disk, read-only memory (ROM), random access memory (RAM), a magnetic disk, an optical disc, and other media that can store program code.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Computing Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Software Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Molecular Biology (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • General Health & Medical Sciences (AREA)
  • Quality & Reliability (AREA)
  • Signal Processing (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Probability & Statistics with Applications (AREA)
  • Computational Mathematics (AREA)
  • Pure & Applied Mathematics (AREA)
  • Mathematical Analysis (AREA)
  • Algebra (AREA)
  • Mathematical Optimization (AREA)
  • Mobile Radio Communication Systems (AREA)

Abstract

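This application provides a model training method and a related apparatus, which help increase the convergence speed of model training and improve end-to-end communication quality. The method includes: a first communication device sends first data to a second communication device through a channel, where the first data is an output result of a first machine learning model; the second communication device receives second data through the channel and inputs the second data into a second machine learning model to obtain third data, determines a first loss function according to the third data and first training data, and sends the first loss function to the first communication device through a feedback channel; the first communication device receives a second loss function through the feedback channel and, according to the second loss function, updates parameters of a control layer based on Kalman filtering to obtain updated parameters of the control layer, where the updated parameters of the control layer are used to update parameters of the first machine learning model.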

Description

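Model training method and related apparatus
This application claims priority to Chinese Patent Application No. 202110780949.8, filed with the China National Intellectual Property Administration on July 9, 2021 and entitled "Model training method and related apparatus", which is incorporated herein by reference in its entirety.
Technical Field
This application relates to the communications field, and in particular to a model training method and a related apparatus.
Background
In a conventional end-to-end communication system, the processing of a communication signal is generally divided into a series of submodules, such as source coding, channel coding, modulation, and channel estimation. To optimize end-to-end communication, each submodule has to be optimized separately. However, this approach introduces more interference effects, such as amplifier distortion and channel impairment, and each module has its own control factors and number of parameters, which makes end-to-end optimization highly complex.
With the development of deep learning, in an end-to-end communication system both the transmitting end and the receiving end can process communication signals with machine learning models such as an autoencoder. In such a system, a well-optimized autoencoder can improve end-to-end communication quality. However, a signal sent by the transmitting end must pass through a channel to reach the receiving end, and the channel interferes with the communication signal, which makes the autoencoder harder to train.
At present, the interference that the channel imposes on a communication signal is generally difficult to characterize with a model, which makes the autoencoder harder to train and in turn degrades end-to-end communication quality.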
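Summary
This application provides a model training method and a related apparatus which, without modeling the channel, help make training of the machine learning model more feasible, increase the convergence speed of training, and improve the robustness of the machine learning model, thereby improving end-to-end communication quality.
In a first aspect, this application provides a model training method that can be applied to a communication system including a first communication device and a second communication device, where there is at least one first communication device and a first machine learning model is deployed on the first communication device. The method includes: the first communication device sends first data to the second communication device through a channel, where the first data is the output result of first training data input to the first machine learning model, the first machine learning model includes a control layer, and the control layer is at least one layer of the first machine learning model; the first communication device receives a second loss function through a feedback channel, where the feedback channel is determined according to an observation error, and the second loss function is obtained after a first loss function sent by the second communication device is transmitted through the feedback channel; and the first communication device updates the parameters of the control layer based on Kalman filtering according to the second loss function to obtain updated parameters of the control layer, where the updated parameters of the control layer are used to update the parameters of the first machine learning model.
It should be understood that, in a possible implementation, the control layer is the last layer of the first machine learning model.
It should also be understood that the control layer is at least one network layer selected in the first machine learning model in the embodiments of this application. "Control layer" is merely an example name; other names with the same characteristics all fall within the protection scope of the embodiments of this application.
The second loss function may be a cross entropy, a minimum mean square error, or the like. The type of the Kalman filtering may be cubature Kalman filtering, extended Kalman filtering, or the like, and is not limited in the embodiments of this application.
According to the model training method provided in the embodiments of this application, without modeling the channel, the first communication device can receive the second loss function through a power-controlled channel, so that the first communication device can update the parameters of the control layer of the first machine learning model based on Kalman filtering. Even in the presence of channel error, the accuracy of model training can still be ensured, the impact of channel error on model training is reduced, training of a machine learning model for end-to-end communication becomes more feasible, the convergence speed of training increases, and the robustness of the machine learning model improves, thereby improving end-to-end communication quality.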
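With reference to the first aspect, in a possible implementation, updating the parameters of the control layer based on Kalman filtering to obtain the updated parameters of the control layer includes: the first communication device obtains a Kalman gain according to the prior parameters of the control layer, the second loss function, and the error covariance of the second loss function; and the first communication device updates the parameters of the control layer according to the Kalman gain to obtain the updated parameters of the control layer.
The prior parameters of the control layer may be the initial parameters of the control layer. When the initial parameters of the control layer change, the prior parameters of the control layer may be the changed parameters of the control layer. It should be understood that the prior parameters of the control layer may change with the updated parameters of the control layer.
According to the model training method provided in the embodiments of this application, updating the parameters of the control layer by computing the Kalman gain can reduce the impact of channel error on the update and improve the accuracy of the updated parameters of the control layer.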
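The Kalman-gain step described above can be illustrated with a minimal sketch. The text here does not give the update equations, so the code below assumes a standard sample-based (cubature-style) measurement update in which the control-layer parameters are the state and the received second loss is a scalar observation. The names `kalman_update`, `points`, `R` and `target` are hypothetical, and driving the loss toward `target = 0` is an assumption rather than something stated in the source.

```python
import numpy as np

def kalman_update(theta, P, points, losses, R, target=0.0):
    """One sample-based Kalman measurement update (illustrative assumption).

    theta  : prior control-layer parameters, shape (n,)
    P      : prior parameter covariance, shape (n, n)
    points : sampled parameter vectors around theta, shape (m, n)
    losses : second loss received for each sampled vector, shape (m,)
    R      : error covariance (variance) of the received loss, scalar
    target : desired loss value used as the observation (assumed 0)
    """
    losses = np.asarray(losses, dtype=float)
    z_hat = losses.mean()                  # predicted observation
    dz = losses - z_hat                    # loss deviations
    dx = points - theta                    # parameter deviations
    P_zz = np.mean(dz ** 2) + R            # innovation variance
    P_xz = dx.T @ dz / len(losses)         # parameter/loss cross-covariance
    K = P_xz / P_zz                        # Kalman gain, shape (n,)
    theta_new = theta + K * (target - z_hat)
    P_new = P - np.outer(K, K) * P_zz      # posterior covariance
    return theta_new, P_new, K
```

In this sketch the posterior `theta_new` then serves as the prior for the next round, which matches the remark that the prior parameters change with the updated parameters.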
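With reference to the first aspect, in a possible implementation, the method further includes: the first communication device updates the parameters of a first network layer in the first machine learning model based on reverse gradient propagation according to the updated parameters of the control layer and the Kalman gain, to obtain updated parameters of the first network layer, where the first network layer includes the network layers before the control layer; and the first communication device obtains an updated first machine learning model according to the updated parameters of the control layer and the updated parameters of the first network layer.
It should be understood that, by updating the parameters of the control layer and the first network layer, the first communication device can extract features of the first training data and better perform the functions of source coding, channel coding, and modulation.
According to the model training method provided in the embodiments of this application, updating the parameters of the control layer and the first network layer helps extract the relationships among the training data, and updating the parameters of the first machine learning model based on the Kalman gain and the parameters of the control layer can reduce the computational complexity of the parameter update.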
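With reference to the first aspect, in a possible implementation, after the updated first machine learning model is obtained, the method further includes: the first communication device sends fourth data to the second communication device through the channel, where the fourth data is the output result of second training data input to the first machine learning model; the first communication device receives instruction information from the second communication device, where the instruction information is used to instruct the first communication device to stop the training of the first machine learning model; and the first communication device stops the training of the first machine learning model according to the instruction information.
Optionally, after determining the third loss function, the second communication device may instead send the third loss function to the first communication device through the channel, and the first communication device determines whether the third loss function is lower than the preset threshold. If it is, the first communication device stops the training of the first machine learning model and sends instruction information to the second communication device, where the instruction information is used to instruct the second communication device to stop the training of the second machine learning model.
Optionally, after the second communication device determines the third loss function, if the third loss function is lower than the preset threshold, the second communication device stops sending the third loss function to the first communication device; if the first communication device does not receive the third loss function within a period of time, it stops the training of the first machine learning model.
According to the model training method provided in the embodiments of this application, in the process of repeatedly updating the parameters of the first machine learning model, the update can be stopped when the third loss function is detected to satisfy the preset threshold, which helps avoid unnecessary training, saves computing resources, and reduces the power consumption of the first communication device.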
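With reference to the first aspect, in a possible implementation, the first data includes N sets of data, where N is a positive integer and the value of N is determined according to the type of the Kalman filtering and the dimension of the parameters of the control layer.
The first communication device may sample the parameters of the control layer to obtain sample points of the parameters of the control layer. The number of samples may be determined according to the type of the Kalman filtering and the dimension of the parameters of the control layer.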
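As an illustration of how the sample count can depend on the filter type and the parameter dimension, the sketch below generates sample points with the standard third-degree cubature rule, which uses 2n points for an n-dimensional parameter vector. Whether the method uses exactly this rule is an assumption; the function name `cubature_points` and the covariance argument `P` are hypothetical, and the count for other filter types is not given in this text and is therefore left out.

```python
import numpy as np

def cubature_points(theta, P):
    """Return the 2n cubature points for parameters theta with covariance P.

    Standard third-degree spherical-radial rule: theta +/- sqrt(n) * S[:, i],
    where S is the lower Cholesky factor of P (P = S @ S.T).
    """
    theta = np.asarray(theta, dtype=float)
    n = theta.size
    S = np.linalg.cholesky(P)
    offsets = np.sqrt(n) * np.hstack([S, -S]).T   # shape (2n, n)
    return theta + offsets

# With a 3-dimensional control layer, N = 2 * 3 = 6 sets of data would be sent.
points = cubature_points(np.zeros(3), np.eye(3))
```

Each row of `points` is one candidate parameter vector for the control layer; the model output under each candidate would form one of the N sets of first data.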
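With reference to the first aspect, in a possible implementation, the first data includes M sets of data, where M is a positive integer, the value of M is determined by the first communication device and other first communication devices according to a preset rule, and the sum of M and the number of data sets sent by the other first communication devices is determined according to the type of the Kalman filtering and the dimension of the parameters of the control layer.
In an end-to-end communication system, there may be multiple first communication devices and one second communication device. In such a system, the first communication device and the other first communication devices may determine the number M of the first data according to a preset rule. The preset rule may be that, in the communication system, the number of output results of the machine learning model in each first communication device is greater than or equal to 1, and the total number of output results of the machine learning models in all the first communication devices is determined by the type of the Kalman filtering and the dimension of the parameters of the control layer.
All the first communication devices in the communication system can determine their own sample points by communicating with each other.
It should be understood that the first communication device sends the first data to the second communication device through the channel, and the first data may include M sets of data. If the total number of output results of the machine learning models in all the first communication devices is P, the first communication device receives P second loss functions through the channel. The first communication device updates the parameters of the control layer according to the P second loss functions to obtain the updated parameters of the control layer, and may transmit the updated parameters of the control layer to the other first communication devices through mutual communication.
It should be understood that the multiple first communication devices in the communication system adopt a centralized distributed training method: the sampling of the control layer is divided into multiple subtasks that are jointly completed by the multiple first communication devices. The first communication device may act as the central communication device, which receives the second loss functions sent by the second communication device, trains the parameters of the control layer, and then delivers them to the other first communication devices.
According to the model training method provided in the embodiments of this application, dividing the sampling of the control layer into multiple subtasks that are jointly completed by multiple first communication devices can reduce the amount of computation of each first communication device, thereby reducing its computing burden and making online training deployable.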
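The division of the P sample points among several first communication devices can be sketched as follows. The preset rule itself is not specified beyond "each device contributes at least one output and the counts sum to P", so the near-even split used here, and the name `split_samples`, are assumptions made only for illustration.

```python
def split_samples(total_points, num_devices):
    """Split total_points sample points among num_devices devices.

    Every device gets at least one point and the counts sum to total_points.
    The near-even split is one possible preset rule, assumed for illustration.
    """
    if num_devices < 1 or total_points < num_devices:
        raise ValueError("each device needs at least one sample point")
    base, extra = divmod(total_points, num_devices)
    # The first `extra` devices take one additional point each.
    return [base + 1 if i < extra else base for i in range(num_devices)]

# Two first communication devices sharing the 6 points of a 3-dimensional
# control layer would each send M = 3 sets of data under this rule.
counts = split_samples(6, 2)
```

Under this sketch, the central first communication device would still collect all P second loss functions before running the update.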
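With reference to the first aspect, in a possible implementation, the method further includes: the first communication device sends the updated parameters of the control layer to other first communication devices.
Optionally, after updating the parameters of the first machine learning model, the first communication device may also transmit the updated parameters of the first machine learning model to the other first communication devices through mutual communication.
Optionally, after determining the Kalman gain, the first communication device may also transmit the updated parameters of the control layer and the Kalman gain to the other first communication devices through mutual communication. The other first communication devices may, according to the received updated parameters of the control layer and the Kalman gain, update the parameters of the first network layer in the first machine learning model based on reverse gradient propagation, and then update the parameters of the first machine learning model.
Optionally, the first communication device may transmit the prior parameters of the control layer, the second loss function, the error covariance of the second loss function, and the updated parameters of the control layer to the other first communication devices through mutual communication. The other first communication devices may first determine the Kalman gain according to the received prior parameters of the control layer, the second loss function, and the error covariance of the second loss function, then update the parameters of the first network layer in the first machine learning model based on reverse gradient propagation according to the updated parameters of the control layer and the Kalman gain, and then update the parameters of the first machine learning model.
The model training method provided in the embodiments of this application adopts a centralized distributed training method. After the central first communication device completes training, it can send the updated model parameters to the other first communication devices, which saves the training cost of the other first communication devices and reduces their amount of computation.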
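With reference to the first aspect, in a possible implementation, after the first communication device receives the second loss function through the channel, the method further includes: the first communication device judges the degree of nonlinearity of the channel in a first time period according to the variance of multiple loss functions received in the first time period, where the multiple loss functions include the second loss function; and the first communication device determines the type of the Kalman filtering according to the degree of nonlinearity of the channel in the first time period.
The channel differs from one environment to another. The first communication device can determine the type of the Kalman filtering by judging the degree of nonlinearity of the channel in the first time period.
According to the model training method provided in the embodiments of this application, the impact of the environment on the channel can be judged from the degree of nonlinearity in the first time period, and changing the type of the Kalman filtering reduces that impact, so that the complexity and the accuracy of updating the first machine learning model are balanced.
With reference to the first aspect, in a possible implementation, the variance of the second loss function is greater than or equal to a first threshold, and the degree of nonlinearity of the channel in the first time period is strongly nonlinear; or, the variance of the second loss function is smaller than the first threshold, and the degree of nonlinearity of the channel in the first time period is weakly nonlinear.
With reference to the first aspect, in a possible implementation, the degree of nonlinearity of the channel in the first time period is strongly nonlinear, and the type of the Kalman filtering is cubature Kalman filtering; or, the degree of nonlinearity of the channel in the first time period is weakly nonlinear, and the type of the Kalman filtering is extended Kalman filtering.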
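The threshold rule in the two implementations above can be written as a short sketch. The function name `select_kalman_filter` and the string labels are hypothetical, and using the population variance of the losses collected in the time window is an assumption, since the text does not say which variance estimator is meant.

```python
from statistics import pvariance

def select_kalman_filter(losses, first_threshold):
    """Pick the Kalman filter type from the variance of received losses.

    losses          : second loss values received in the first time period
    first_threshold : the first threshold on the variance
    Returns (degree_of_nonlinearity, filter_type).
    """
    variance = pvariance(losses)   # population variance (assumed estimator)
    if variance >= first_threshold:
        return "strong", "cubature"    # strongly nonlinear channel
    return "weak", "extended"          # weakly nonlinear channel
```

For example, a window of widely scattered losses would select the cubature Kalman filter, while a window of nearly constant losses would select the extended Kalman filter.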
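In a second aspect, this application provides a model training method that can be applied to a communication system including a first communication device and a second communication device, where there is at least one first communication device, a first machine learning model is deployed on the first communication device, and a second machine learning model is deployed on the second communication device. The method includes: the second communication device receives second data through a channel, where the second data is obtained after first data sent by the first communication device is transmitted through the channel, the first data is the output result of first training data input to the first machine learning model, the first machine learning model includes a control layer, and the control layer is at least one layer of the first machine learning model; the second communication device inputs the second data into the second machine learning model to obtain third data; the second communication device determines a first loss function according to the third data and the first training data; and the second communication device sends the first loss function to the first communication device through a feedback channel, where the feedback channel is determined according to an observation error, and the first loss function is used to update the parameters of the control layer of the first machine learning model.
According to the model training method provided in the embodiments of this application, without modeling the channel, the second communication device can determine the observation error from the error between predicted values and true values over a period of time and construct a feedback channel whose variance is the observation error, so that the first communication device can update the parameters of the first machine learning model based on Kalman filtering. This reduces the impact of channel error on model training, makes model training more feasible, increases the convergence speed of training the autoencoder, and improves the robustness of the autoencoder, thereby improving end-to-end communication quality.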
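A minimal sketch of a feedback channel whose variance equals the observation error is given below. The text only states that the observation error comes from the error between predicted and true values over a period of time; treating it as a mean squared error and modeling the feedback channel as additive zero-mean Gaussian noise are both assumptions, and the names `observation_error` and `feedback_channel` are hypothetical.

```python
import numpy as np

def observation_error(predictions, truths):
    """Mean squared error between predictions and true values (assumed form)."""
    predictions = np.asarray(predictions, dtype=float)
    truths = np.asarray(truths, dtype=float)
    return float(np.mean((predictions - truths) ** 2))

def feedback_channel(first_loss, error_variance, rng):
    """Return the second loss: the first loss after the feedback channel.

    The channel is modeled as additive zero-mean Gaussian noise whose
    variance is the observation error (an illustrative assumption).
    """
    noise = rng.normal(0.0, np.sqrt(error_variance))
    return float(first_loss + noise)

rng = np.random.default_rng(0)
variance = observation_error([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])
second_loss = feedback_channel(0.7, variance, rng)
```

In this reading, the same variance would be the error covariance of the second loss function that the first communication device uses when computing the Kalman gain.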
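With reference to the second aspect, in a possible implementation, the method further includes: the second communication device updates the parameters of the second machine learning model based on reverse gradient propagation according to the first loss function, to obtain an updated second machine learning model.
With reference to the second aspect, in a possible implementation, the method further includes: the second communication device receives fifth data through the channel, where the fifth data is obtained after fourth data sent by the first communication device is transmitted through the channel, and the fourth data is the output result of second training data input to the first machine learning model; the second communication device inputs the fifth data into the second machine learning model to obtain sixth data; the second communication device determines a third loss function according to the sixth data and the second training data; and if the third loss function is lower than a preset threshold, the second communication device sends instruction information to the first communication device, where the instruction information is used to instruct the first communication device to stop the training of the first machine learning model.
According to the model training method provided in the embodiments of this application, in the process of repeatedly updating the parameters of the first machine learning model, the update can be stopped when the third loss function is detected to satisfy the preset threshold, which helps avoid unnecessary training, saves computing resources, and reduces the power consumption of the first communication device.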
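The stop condition on the second communication device side can be sketched as below. The message type `StopIndication` and the function `check_stop` are hypothetical names introduced only for this example; the text does not define the format of the instruction information.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class StopIndication:
    """Hypothetical instruction information telling the sender to stop training."""
    reason: str

def check_stop(third_loss: float, preset_threshold: float) -> Optional[StopIndication]:
    """Return instruction information if the third loss is below the threshold.

    The comparison is strict ("lower than"), following the wording of the text.
    """
    if third_loss < preset_threshold:
        return StopIndication(reason="third loss below preset threshold")
    return None

# The second communication device would send the result to the first
# communication device only when it is not None.
indication = check_stop(third_loss=0.01, preset_threshold=0.05)
```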
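In a third aspect, this application provides an apparatus related to model training. The apparatus can be used as the first communication device of the first aspect, and may be a terminal device or a network device, an apparatus in a terminal device or a network device (for example, a chip, a chip system, or a circuit), or an apparatus that can be used together with a terminal device or a network device.
In a possible implementation, the communication apparatus may include modules or units in one-to-one correspondence with the methods, operations, steps, or actions described in the first aspect. The modules or units may be hardware circuits, software, or hardware circuits combined with software.
In a possible implementation, the apparatus includes a transceiver unit and a processing unit. The transceiver unit is configured to: send first data to the second communication device through a channel, where the first data is the output result of first training data input to the first machine learning model, the first machine learning model includes a control layer, and the control layer is at least one layer of the first machine learning model; and receive a second loss function through the channel, where the second loss function is obtained after a first loss function sent by the second communication device is transmitted through the channel. The processing unit is configured to: update the parameters of the control layer based on Kalman filtering according to the second loss function to obtain updated parameters of the control layer, where the updated parameters of the control layer are used to update the parameters of the first machine learning model.
With reference to the third aspect, in a possible implementation, the processing unit is further configured to: obtain a Kalman gain according to the prior parameters of the control layer, the second loss function, and the error covariance of the second loss function; and update the parameters of the control layer according to the Kalman gain to obtain the updated parameters of the control layer.
With reference to the third aspect, in a possible implementation, the transceiver unit is further configured to: send fourth data to the second communication device through the channel, where the fourth data is the output result of second training data input to the first machine learning model; and receive instruction information from the second communication device, where the instruction information is used to instruct the apparatus to stop the training of the first machine learning model. The processing unit is further configured to: stop the training of the first machine learning model according to the instruction information.
With reference to the third aspect, in a possible implementation, the first data includes N sets of data, where N is a positive integer and the value of N is determined according to the type of the Kalman filtering and the dimension of the parameters of the control layer.
With reference to the third aspect, in a possible implementation, the first data includes M sets of data, where M is a positive integer, the value of M is determined by the apparatus and other first communication devices according to a preset rule, and the sum of M and the number of data sets sent by the other first communication devices is determined according to the type of the Kalman filtering and the dimension of the parameters of the control layer.
With reference to the third aspect, in a possible implementation, the transceiver unit is further configured to: send the updated parameters of the control layer to other first communication devices.
With reference to the third aspect, in a possible implementation, the processing unit is further configured to: judge the degree of nonlinearity of the channel in a first time period according to the variance of multiple loss functions received in the first time period, where the multiple loss functions include the second loss function; and determine the type of the Kalman filtering according to the degree of nonlinearity of the channel in the first time period.
With reference to the third aspect, in a possible implementation, the variance of the second loss function is greater than or equal to a first threshold, and the degree of nonlinearity of the channel in the first time period is strongly nonlinear; or, the variance of the second loss function is smaller than the first threshold, and the degree of nonlinearity of the channel in the first time period is weakly nonlinear.
With reference to the third aspect, in a possible implementation, the degree of nonlinearity of the channel in the first time period is strongly nonlinear, and the type of the Kalman filtering is cubature Kalman filtering; or, the degree of nonlinearity of the channel in the first time period is weakly nonlinear, and the type of the Kalman filtering is extended Kalman filtering.
For the beneficial effects of the possible implementations of the third aspect, refer to the first aspect; details are not repeated here.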
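In a fourth aspect, this application provides an apparatus related to model training. The apparatus can be used as the second communication device of the second aspect, and may be a terminal device or a network device, an apparatus in a terminal device or a network device (for example, a chip, a chip system, or a circuit), or an apparatus that can be used together with a terminal device or a network device.
In a possible implementation, the communication apparatus may include modules or units in one-to-one correspondence with the methods, operations, steps, or actions described in the second aspect. The modules or units may be hardware circuits, software, or hardware circuits combined with software.
In a possible implementation, the apparatus includes a transceiver unit and a processing unit. The transceiver unit is configured to: receive second data through a channel, where the second data is obtained after first data sent by the first communication device is transmitted through the channel, the first data is the output result of first training data input to the first machine learning model, the first machine learning model includes a control layer, and the control layer is at least one layer of the first machine learning model. The processing unit is configured to: input the second data into the second machine learning model to obtain third data; and determine a first loss function according to the third data and the first training data. The transceiver unit is further configured to: send the first loss function to the first communication device through a feedback channel, where the feedback channel is determined according to an observation error, and the first loss function is used to update the parameters of the control layer of the first machine learning model.
With reference to the fourth aspect, in a possible implementation, the processing unit is further configured to: update the parameters of the second machine learning model based on reverse gradient propagation according to the first loss function, to obtain an updated second machine learning model.
With reference to the fourth aspect, in a possible implementation, the transceiver unit is further configured to: receive fifth data through the channel, where the fifth data is obtained after fourth data sent by the first communication device is transmitted through the channel, and the fourth data is the output result of second training data input to the first machine learning model; the processing unit is configured to: input the fifth data into the second machine learning model to obtain sixth data, and determine a third loss function according to the sixth data and the second training data; and the transceiver unit is further configured to: if the third loss function is lower than a preset threshold, send instruction information to the first communication device, where the instruction information is used to instruct the first communication device to stop the training of the first machine learning model.
For the beneficial effects of the possible implementations of the fourth aspect, refer to the second aspect; details are not repeated here.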
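In a fifth aspect, this application provides another apparatus related to model training, including a processor. The processor is coupled to a memory and can be used to execute instructions in the memory to implement the method in any possible implementation of the foregoing aspects. Optionally, the apparatus further includes the memory. Optionally, the apparatus further includes a communication interface, and the processor is coupled to the communication interface to communicate with other communication devices.
In a sixth aspect, this application provides a processing apparatus, including a processor and a memory. The processor is configured to read instructions stored in the memory, and can receive signals through a receiver and transmit signals through a transmitter, to execute the method in any possible implementation of the foregoing aspects.
Optionally, there are one or more processors and one or more memories.
Optionally, the memory may be integrated with the processor, or the memory and the processor may be provided separately.
In a specific implementation, the memory and the processor may be integrated on the same chip or provided on different chips. This application does not limit the type of the memory or the manner in which the memory and the processor are arranged.
In a related data exchange process, for example, sending the first data may be a process of outputting the first data from the processor, and receiving the second data may be a process in which the processor receives the input second data. Specifically, data output by the processor may be output to a transmitter, and input data received by the processor may come from a receiver. The transmitter and the receiver may be collectively referred to as a transceiver.
The processing apparatus in the sixth aspect may be a chip. The processor may be implemented by hardware or by software. When implemented by hardware, the processor may be a logic circuit, an integrated circuit, or the like. When implemented by software, the processor may be a general-purpose processor that is implemented by reading software code stored in a memory; the memory may be integrated in the processor or may exist independently outside the processor.
In a seventh aspect, this application provides a computer program product, including a computer program (which may also be referred to as code or instructions). When the computer program is run, a computer is caused to execute the method in any possible implementation of the foregoing aspects.
In an eighth aspect, this application provides a computer-readable storage medium that stores a computer program (which may also be referred to as code or instructions). When the computer program is run on a computer, the computer is caused to execute the method in any possible implementation of the foregoing aspects.
In a ninth aspect, this application provides a computer program. When the computer program is run on a computer, the methods in the possible implementations of the foregoing aspects are executed.
In a tenth aspect, this application provides a communication system, including the apparatus in the third aspect and its possible implementations and the apparatus in the fourth aspect and its possible implementations.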
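Brief Description of the Drawings
FIG. 1 is a schematic diagram of an end-to-end signal transmission process;
FIG. 2 is a schematic diagram of an autoencoder-based end-to-end signal transmission process;
FIG. 3 is a schematic flowchart of a model training method provided by an embodiment of this application;
FIG. 4 is a schematic diagram of an end-to-end signal transmission process provided by an embodiment of this application;
FIG. 5 is a schematic flowchart of another model training method provided by an embodiment of this application;
FIG. 6 is a schematic diagram of updating the parameters of the first network layer provided by an embodiment of this application;
FIG. 7 is a schematic diagram of the cross-entropy loss based on the model training method provided by an embodiment of this application;
FIG. 8 is a schematic diagram of the bit error rate variation based on the model training method provided by an embodiment of this application;
FIG. 9 is a schematic flowchart of another model training method provided by an embodiment of this application;
FIG. 10 is a schematic block diagram of an apparatus related to model training provided by an embodiment of this application;
FIG. 11 is a schematic block diagram of another apparatus related to model training provided by an embodiment of this application.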
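Detailed Description
The technical solutions of this application are described below with reference to the accompanying drawings.
The technical solutions of the embodiments of this application can be applied to various communication systems, for example: a narrowband internet of things (NB-IoT) system, a long term evolution (LTE) system, an LTE frequency division duplex (FDD) system, an LTE time division duplex (TDD) system, a fifth generation (5G) mobile communication system such as new radio (NR), or another evolved communication system. A 5G system typically covers three major application scenarios: enhanced mobile broadband (eMBB), ultra-reliable and low latency communications (URLLC), and massive machine type communications (mMTC).
The communication device in the embodiments of this application may be a network device or a terminal device. It should be understood that a terminal device may be replaced with an apparatus or a chip that can implement functions similar to those of a terminal device, and a network device may be replaced with an apparatus or a chip that can implement functions similar to those of a network device. The embodiments of this application do not limit these names.
The terminal device in the embodiments of this application may also be referred to as user equipment (UE), a mobile station (MS), a mobile terminal (MT), an access terminal, a subscriber unit, a subscriber station, a remote station, a remote terminal, a mobile device, a user terminal, a terminal, a wireless communication device, a user agent, or a user apparatus.
A terminal device may be a device that provides voice or data connectivity to a user, for example, a handheld device or a vehicle-mounted device with a wireless connection function. Examples of terminal devices currently include: a mobile phone, a tablet computer, a notebook computer, a palmtop computer, a mobile internet device (MID), a wearable device, a virtual reality (VR) device, an augmented reality (AR) device, a wireless terminal in industrial control, a wireless terminal in self driving, a wireless terminal in remote medical surgery, a wireless terminal in a smart grid, a wireless terminal in transportation safety, a wireless terminal in a smart city, a wireless terminal in a smart home, a cellular phone, a cordless phone, a session initiation protocol (SIP) phone, a wireless local loop (WLL) station, a personal digital assistant (PDA), a handheld device with a wireless communication function, a computing device or another processing device connected to a wireless modem, a vehicle-mounted device, a terminal device in a 5G network, or a terminal device in a future evolved public land mobile network (PLMN). This is not limited in the embodiments of this application.
In addition, in the embodiments of this application, the terminal device may also be a terminal device in an internet of things (IoT) system. IoT is an important part of the future development of information technology; its main technical feature is connecting objects to a network through communication technology, thereby realizing an intelligent network of human-machine interconnection and object-to-object interconnection.
In addition, the network device in the embodiments of this application may be a device that provides a wireless communication function for a terminal device. The network device may also be referred to as an access network device or a radio access network device. It may be a transmission reception point (TRP), an evolved NodeB (eNB or eNodeB) in an LTE system, a home base station (for example, a home evolved NodeB or home Node B, HNB), a baseband unit (BBU), or a radio controller in a cloud radio access network (CRAN) scenario. The network device may alternatively be a relay station, an access point, a vehicle-mounted device, a wearable device, a network device in a 5G network, or a network device in a future evolved PLMN. It may be an access point (AP) in a wireless local area network (WLAN), a gNB in a new radio (NR) system, a satellite base station in a satellite communication system, or a device that performs base station functions in device-to-device (D2D), vehicle-to-everything (V2X), or machine-to-machine (M2M) communication. This is not limited in the embodiments of this application.
In one network structure, the network device may include a centralized unit (CU) node, a distributed unit (DU) node, a radio access network (RAN) device including a CU node and a DU node, or a RAN device including a control plane CU node (CU-CP node), a user plane CU node (CU-UP node), and a DU node.
The network device serves terminal devices in a cell. A terminal device communicates with the network device corresponding to the cell, or with other devices, through transmission resources (for example, frequency domain resources, also called spectrum resources) allocated by the network device. The network device may be a macro base station (for example, a macro eNB or a macro gNB) or a base station corresponding to a small cell. Small cells here may include a metro cell, a micro cell, a pico cell, and a femto cell. These small cells have small coverage and low transmit power, and are suitable for providing high-rate data transmission services.
The embodiments of this application do not specifically limit the structure of the entity that executes the method provided in the embodiments of this application, as long as the entity can run a program that records the code of the method and thereby communicate according to the method. For example, the method provided in the embodiments of this application may be executed by a terminal device or a network device, or by a functional module in a terminal device or a network device that can call and execute the program.
In addition, the aspects or features of this application may be implemented as a method, an apparatus, or an article of manufacture using standard programming and/or engineering techniques. The term "article of manufacture" used in this application covers a computer program accessible from any computer-readable device, carrier, or medium. For example, computer-readable media may include, but are not limited to: magnetic storage devices (for example, a hard disk, a floppy disk, or a magnetic tape), optical discs (for example, a compact disc (CD) or a digital versatile disc (DVD)), smart cards, and flash memory devices (for example, an erasable programmable read-only memory (EPROM), a card, a stick, or a key drive). In addition, the various storage media described herein may represent one or more devices and/or other machine-readable media for storing information. The term "machine-readable medium" may include, but is not limited to, various other media capable of storing, containing, and/or carrying instructions and/or data.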
传统端到端的通信系统,通信信号的处理过程一般都会被分为一系列的子模块,例如信源编码、信道编码、调制、信道估计等。若提升端到端的通信质量,需要单独优化每个子模块。其中,每个子模块都是基于特定的信号处理算法建模,通常是近似为一些简化的线性模型。然而,用这种单独优化每个子模块的方式并不能保证整个通信系统实现端到端优化,反而会引入更多的干扰效应,如放大器失真和信道损伤等,同时每个模块均有控制因素和参数数量,使该传统方法进行端到端优化的复杂性非常高。
应理解,在传统端到端的通信系统中,通信装置可以为终端设备或网络设备。若通信系统中的发送端为终端设备,则接收端可以是网络设备或者其他终端设备。或者,若通信系统中的发送端为网络设备,则接收端可以是终端设备或者其他网络设备,即本申请实施例可以应用于网络设备和网络设备之间、网络设备和终端设备之间、终端设备和终端设备之间等多种场景的端到端的通信系统。
示例性地,图1示出了一种传统的端到端的信号传输过程的示意图。如图1所示,通信信号的传输过程可以分成信源编码、信道编码、调制、信道、解调、信道译码以及信源译码等子模块。发送端可以发送通信信号u到接收端。具体地,发送端可以将通信信号u先经过信源编码、信道编码、调制等子模块转换成通信信号x,再将该通信信号x通过信道发送到接收端,通过信道的通信信号x会带有信道误差,故接收端通过信道接收到的通信信号为y,经解调、信道译码以及信源译码等子模块得到通信信号u*。
若使通信系统实现端到端的优化,即使接收端接收的通信信号u*和发送端发送的通信信号u之间的误差达到尽可能的小,需要优化每个子模块,会使端到端优化的复杂性非常高,且不能保证整个通信系统实现端到端的优化。
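With the development of deep learning, both the transmit end and the receive end can process communication signals with an autoencoder. Specifically, both ends can be modeled as neural networks that learn the data distribution from a large number of training samples and then use it to predict results. Such end-to-end learning enables joint optimization and can achieve better results than the conventional end-to-end communication method.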
示例性地,图2示出了一种基于自编码器的端到端的信号传输过程的示意图。如图2所示,通信信号的传输过程可以分成编码的自编码器和译码的自编码器,减少了子模块的个数。发送端可以发送通信信号u到接收端。具体地,发送端可以将通信信号u经过编码的自编码器转换成通信信号x,再将该通信信号x通过信道发送到接收端,通过信道的通信信号x会带有信道误差,故接收端通过信道接收到的通信信号为y,经译码的自编码器得到通信信号u*。
在该通信系统中,当自编码器的优化程度较高时,可以提升端到端的通信质量。但该通信系统中信道一般难以用模型进行表征,会影响译码的自编码器计算损失函数,进而影响自编码器的训练,增加了自编码器的训练难度,影响端到端的通信质量。
有鉴于此,本申请实施例提供了一种模型训练方法及相关装置,在未对信道进行建模的情况下,有利于增加训练机器学习模型的可行性,提高训练机器学习模型的收敛速度,优化机器学习模型的鲁棒性,从而提高端到端的通信质量。
在介绍本申请实施例提供的模型训练方法及相关装置之前,先做出以下几点说明。
第一,在下文示出的实施例中,各术语及英文缩略语,如控制层和网络层等,均为方便描述而给出的示例性举例,不应对本申请构成任何限定。本申请并不排除在已有或未来的协议中定义其它能够实现相同或相似功能的术语的可能。
第二,在下文示出的实施例中,第一、第二以及各种数字编号仅为描述方便进行的区分,并不用来限制本申请实施例的范围。例如,区分不同的通信装置、区分不同的机器学习模型等。
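Third, in the embodiments shown below, "at least one" means one or more, and "a plurality of" means two or more. "And/or" describes an association between associated objects and indicates that three relationships may exist. For example, A and/or B may indicate: only A exists, both A and B exist, or only B exists, where A and B may each be singular or plural. The character "/" generally indicates an "or" relationship between the associated objects. "At least one of the following items" or a similar expression refers to any combination of these items, including any combination of a single item or a plurality of items. For example, at least one of a, b, and c may indicate: a, b, c, a and b, a and c, b and c, or a, b, and c, where each of a, b, and c may be single or multiple.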
下面,以第一通信装置和第二通信装置为例,详细介绍本申请的模型训练方法。第一通信装置可以为上述终端设备或网络设备,第二通信装置可以为上述终端设备或网络设备。应理解,该第一通信装置相当于上述发送端,第二通信装置相当于上述接收端。
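FIG. 3 is a schematic diagram of a model training method 300 according to an embodiment of this application. The method 300 may be applied to a communication system that includes a first communication apparatus and a second communication apparatus. There is at least one first communication apparatus, and a first machine learning model may be deployed on the first communication apparatus. As shown in FIG. 3, the method 300 may include the following steps: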
S301,第一通信装置可以通过信道向第二通信装置发送第一数据,第一数据是第一训练数据输入至第一机器学习模型的输出结果,第一机器学习模型包括控制层,控制层为第一机器学习模型的至少一层。
控制层可以为第一机器学习模型中最后的至少一层网络,也可以是第一机器学习模型中任意位置的至少一层网络,本申请实施例对控制层在第一机器学习模型中的位置不做限定。
示例性地,图4示出了本申请实施例提供的一种端到端的信号传输过程的示意图。如图4所示,控制层为第一机器学习模型中最后的一层网络。其中,图4中第一机器学习模型中网络层的个数仅仅为一个示例,本申请实施例对此不做限定。
应理解,控制层是本申请实施例在第一机器学习模型中选择的至少一层网络,控制层仅仅为一个名称的示例,其他具有相同特点的名称,均可以包含在本申请实施例的保护范围中。
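For example, the first data may be x in FIG. 4, the first training data may be u in FIG. 4, and the first machine learning model may be understood as the encoding autoencoder in FIG. 2 or as a neural network model.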
S302,第二通信装置通过信道接收第二数据,第二数据是第一通信装置发送的第一数据经过信道传输后得到的。
应理解,通过信道的数据会产生干扰,故第一数据经过信道后为第二数据。该第二数据可以为图4中的y。
S303,第二通信装置将第二数据输入至第二机器学习模型,得到第三数据。
示例性地,该第二机器学习模型可以理解为上述图2中解码的自编码器或者神经网络模型。第三数据可以为图4中的u*。
S304,第二通信装置根据第三数据和第一训练数据,确定第一损失函数。
示例性地,第二通信装置可以将第三数据作为预测值、第一训练数据作为真实值,确定第一损失函数。该第一损失函数也可以称为目标函数,本申请实施例不做限定。
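The first training data is sample data. It may be preset, or it may be sent by another communication apparatus. It should be understood that if the second communication apparatus receives the first training data from another communication apparatus, the first training data does not pass through a channel with unknown error or unknown noise.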
示例性地,第一通信装置发送的是第一训练数据u,但第二通信装置最终得到的是第三数据u*,故第二通信装置可以将第三数据u*作为预测值、第一训练数据u作为真实值,确定第一损失函数,该第一损失函数为第三数据u*和第一训练数据u的误差函数。
应理解,第三数据和第一训练数据之间的误差是信道造成的。
该第一损失函数可以为交叉熵或最小均方差等。该第一损失函数可以作为卡尔曼滤波中的观测量。
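The two loss choices named above can be sketched in a few lines. This is a minimal illustration only: the function names `cross_entropy` and `mean_squared_error`, the clipping constant `eps`, and the sample vectors are assumptions made for this example and do not come from the application.

```python
import math

def cross_entropy(true_probs, predicted_probs, eps=1e-12):
    """Cross entropy between the true value u (e.g. one-hot) and the prediction u*."""
    # eps (an assumption of this sketch) avoids log(0) for zero-probability entries.
    return -sum(t * math.log(max(p, eps)) for t, p in zip(true_probs, predicted_probs))

def mean_squared_error(true_values, predicted_values):
    """Mean squared error between the true value u and the prediction u*."""
    return sum((t - p) ** 2 for t, p in zip(true_values, predicted_values)) / len(true_values)

# First training data u (true value) and third data u* (predicted value).
u = [0.0, 1.0, 0.0, 0.0]
u_star = [0.1, 0.7, 0.1, 0.1]

ce = cross_entropy(u, u_star)
mse = mean_squared_error(u, u_star)
```

Either value can play the role of the scalar observation that is fed back to the first communication apparatus.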
S305,第二通信装置通过反馈信道向第一通信装置发送第一损失函数,反馈信道是根据观测误差确定的,第一损失函数用于更新第一机器学习模型的参数。
该反馈信道的方差可以是观测误差。示例性地,反馈信道可以是均值为0,方差为观测误差的加性高斯白噪声(additive white gaussian noise,AWGN)信道。第二通信装置在反馈第一损失函数时,可以通过控制反馈第一损失函数的信号的发射功率,改变发送信号的信噪比,构造方差为观测误差的AWGN信道。
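The idea of shaping the feedback link into a zero-mean AWGN channel whose variance equals the observation error can be sketched as follows. The relation used here (normalized noise variance = noise power / signal power) and the names `required_signal_power` and `awgn_feedback` are assumptions of this sketch, not details given in the application.

```python
import math
import random

def required_signal_power(noise_power, target_variance):
    """Transmit power for which the noise variance, after normalizing the
    received signal by sqrt(power), equals target_variance (assumed model)."""
    if target_variance <= 0:
        raise ValueError("target_variance must be positive")
    return noise_power / target_variance

def awgn_feedback(losses, observation_variance, rng):
    """Pass the first loss values through a zero-mean AWGN channel with the
    given variance and return the received (second) loss values."""
    sigma = math.sqrt(observation_variance)
    return [loss + rng.gauss(0.0, sigma) for loss in losses]

rng = random.Random(0)
first_losses = [0.30, 0.28, 0.35, 0.31]
second_losses = awgn_feedback(first_losses, observation_variance=0.04, rng=rng)
```

Under this model, raising the transmit power lowers the effective observation variance, which is how power control can be used to match a chosen observation error.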
观测误差可以是第二通信装置根据一段时间内的预测值和真实值的误差确定的。该一段时间可以为任意的一段时间,本申请实施例对该段时间的时长不做限定。
可选地,第一损失函数还可以用于更新第二机器学习模型的参数。具体地,第二通信装置可以根据第一损失函数,基于反向梯度传播更新第二机器学习模型的参数,得到更新后的第二机器学习模型。
S306,第一通信装置通过反馈信道接收第二损失函数,第二损失函数是第二通信装置发送的第一损失函数经过反馈信道传输后得到的。
示例性地,若反馈信道的信道误差为观测误差,则第二损失函数可以包括观测误差。
S307,第一通信装置根据第二损失函数,基于卡尔曼滤波更新控制层的参数,得到更新后的控制层的参数,更新后的控制层的参数用于更新第一机器学习模型的控制层的参数。
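For example, the second loss function may include the observation error. The second loss function serves as the observation in the Kalman filtering method. A larger error indicates lower confidence in the a posteriori second loss function (the observation), so the update leans toward the estimated control-layer parameters. A smaller error indicates higher confidence in the a posteriori second loss function (the observation), so the update leans toward the control-layer parameters obtained from the a posteriori second loss function.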
应理解,后验的控制层的参数为第一通信装置根据第二损失函数和卡尔曼滤波计算得到的,先验的控制层的参数为每次更新前的控制层的参数。
该卡尔曼滤波的类型可以为容积卡尔曼滤波或者扩展卡尔曼滤波等,本申请实施例对卡尔曼滤波的类型不做限定。
示例性地,若卡尔曼滤波的类型为容积卡尔曼滤波,该容积卡尔曼滤波可以通过下列公式表示:
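$$d_k = h(u_k;\theta_k) + r_k$$
where $k$ may be the training round or the training time, $u_k$ may be the first training data, $\theta_k$ may be the parameters of the control layer, and $h(u_k;\theta_k)$ may be an end-to-end nonlinear function that represents the nonlinear relationship of the first machine learning model, the channel, and the second machine learning model. $r_k$ is the observation error, and $d_k$ is the observation.
It should be understood that, in theory, after passing through the first machine learning model, the channel, and the second machine learning model, $d_k$ can still be identical to the first training data.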
本申请实施例提供的模型训练方法,在未对信道进行建模的情况下,第二通信装置可以根据一段时间内的预测值和真实值的误差确定观测误差,构建方差为观测误差的反馈信道,以使第一通信装置基于卡尔曼滤波对第一机器学习模型的参数进行更新,可以减小信道误差对模型训练的影响,增加模型训练的可行性,提高训练机器学习模型的收敛速度,卡尔曼滤波方法中基于观测量的更新方式,可以优化机器学习模型的鲁棒性,从而提高端到端的通信质量。
可选地,上述S307,基于卡尔曼滤波更新控制层的参数,得到更新后的控制层的参数,包括:第一通信装置根据控制层的先验参数、第二损失函数和第二损失函数的误差协方差,得到卡尔曼增益;第一通信装置根据卡尔曼增益,更新控制层的参数,得到更新后的控制层的参数。
该控制层的先验参数可以为该控制层的初始参数。当控制层的初始参数变化时,控制层的先验参数可以为变化后的控制层的参数。应理解,控制层的先验参数可以根据更新的控制层的参数而变化。
第一通信装置可以根据控制层的先验参数、第三数据和第一训练数据确定的第二损失函数以及第二损失函数的误差协方差,得到卡尔曼增益;并根据卡尔曼增益,更新控制层的参数,得到更新后的控制层的参数。
本申请实施例提供的模型训练方法,通过计算卡尔曼增益更新控制层的参数,可以减小信道误差对更新控制层的参数的影响,提高更新控制层的参数的准确性。
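How the Kalman gain weighs the observation against the prior can be illustrated with a scalar toy case. This is a sketch under stated assumptions: a single control-layer parameter, a scalar observation, and the textbook form gain = cross-covariance / (innovation covariance + observation noise variance). The function names are hypothetical.

```python
def kalman_gain(cross_cov, predicted_obs_var, observation_noise_var):
    """Scalar Kalman gain: cross-covariance divided by the total observation variance."""
    return cross_cov / (predicted_obs_var + observation_noise_var)

def update_parameter(prior, gain, observed, predicted):
    """Correct the prior parameter by the gain-weighted innovation."""
    return prior + gain * (observed - predicted)

# Same prior statistics, two different observation error levels.
gain_clean = kalman_gain(cross_cov=0.5, predicted_obs_var=1.0, observation_noise_var=0.0)
gain_noisy = kalman_gain(cross_cov=0.5, predicted_obs_var=1.0, observation_noise_var=1.0)

theta_clean = update_parameter(prior=1.0, gain=gain_clean, observed=0.0, predicted=0.4)
theta_noisy = update_parameter(prior=1.0, gain=gain_noisy, observed=0.0, predicted=0.4)
```

With a larger observation error the gain shrinks, so the updated parameter stays closer to the prior, which matches the confidence argument given for the second loss function.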
作为一个可选的实施例,上述方法300还包括:第一通信装置根据更新后的控制层的参数和卡尔曼增益,基于反向梯度传播更新第一机器学习模型中第一网络层的参数,得到更新后的第一网络层的参数,第一网络层包括在控制层之前的网络层;第一通信装置根据更新后的控制层的参数和更新后的第一网络层的参数,得到更新后的第一机器学习模型。
应理解,无论控制层位于第一机器学习模型的第几层,第一网络层均包括在控制层之前的网络层。例如,第一机器学习模型共8层网络层,若控制层位于第一机器学习模型的第5层,则第一网络层包括第一机器学习模型的前4层网络层。又如,第一机器学习模型共12层网络层,若控制层位于第一机器学习模型的第10层至第12层,则第一网络层包括第一机器学习模型的前9层网络层。
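The two layer-count examples above can be checked with a small helper. The function name `first_network_layers` and the 1-based layer indexing are assumptions of this sketch.

```python
def first_network_layers(total_layers, control_layers):
    """Return the 1-based indices of the layers located before the control layer.

    control_layers lists the 1-based indices of the layers chosen as the control layer.
    """
    if not control_layers:
        raise ValueError("at least one control layer is required")
    first_control = min(control_layers)
    if first_control < 1 or max(control_layers) > total_layers:
        raise ValueError("control layer index out of range")
    return list(range(1, first_control))

# 8 layers, control layer at layer 5: the first network layer covers layers 1 to 4.
example_a = first_network_layers(8, [5])
# 12 layers, control layer at layers 10 to 12: the first network layer covers layers 1 to 9.
example_b = first_network_layers(12, [10, 11, 12])
```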
示例性地,第一网络层可以是基于全连接、卷积层或者残差网络(resnet)等网络结构。
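It should be understood that, by updating the parameters of the control layer and the first network layer, the first communication apparatus can extract features of the first training data and better perform the functions of source coding, channel coding, and modulation.
In the model training method provided in this embodiment of this application, updating the parameters of the control layer and the first network layer helps extract the relationships among the training data, and updating the parameters of the first machine learning model based on the Kalman gain and the control-layer parameters can reduce the computational complexity of the parameter update.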
可选地,在得到更新后的第一机器学习模型之后,上述方法300还包括:第一通信装置通过信道向第二通信装置发送第四数据,第四数据是第二训练数据输入至第一机器学习模型的输出结果;第二通信装置通过信道接收第五数据,第五数据是第一通信装置发送的第四数据经过信道传输后得到的;第二通信装置将第五数据输入至第二机器学习模型,得到第六数据;第二通信装置根据第六数据和第二训练数据,确定第三损失函数;若第三损失函数低于预设阈值,第二通信装置向第一通信装置发送指示信息,指示信息用于指示第一通信装置停止第一机器学习模型的训练,对应地,第一通信装置接收来自第二通信装置的指示信息,指示信息用于指示第一通信装置停止第一机器学习模型的训练;第一通信装置根据指示信息,停止第一机器学习模型的训练。
在得到更新后的第一机器学习模型之后,第一通信装置又会开始新一轮的训练,即第一通信装置将第二训练数据输入至第一机器学习模型,得到输出结果为第四数据,并通过信道向第二通信装置发送该第四数据。由于该第四数据通过信道,会带有信道误差,故第二通信装置会接收到第五数据。和上一轮训练相同,第二通信装置可以得到第六数据,并将第六数据作为预测值,将第二训练数据作为真实值,确定第三损失函数,若该第三损失函数低于预设阈值,第二通信装置可以确定经过上一轮的训练得到的更新后的第一机器学习模型为满足条件的模型,可以不再进行该轮训练,故向第一通信装置发送指示信息,该指示信息用于指示第一通信装置停止第一机器学习模型的训练。
应理解,若该第三损失函数高于或等于预设阈值,第二通信装置将会重复上一轮的训练步骤继续进行训练。
可选地,上述第二通信装置可以周期性地判断第三损失函数是否低于预设阈值,若第三损失函数低于阈值,则向第一通信装置发送指示信息。例如,第二通信装置可以每隔一定的时间或者每间隔一定的训练轮数判断第三损失函数是否低于预设阈值。本申请实施例提供的模型训练方法,在反复更新第一机器学习模型的参数的过程中,当检测到第三损失函数满足预设阈值时,可以停止更新第一机器学习模型的参数,有利于减少不必要的训练,节省运算资源,降低第一通信装置的功耗。
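The stopping rule described above, including the periodic check, can be sketched as a simple loop. The names `run_until_converged` and `check_interval`, and the list of per-round loss values, are hypothetical and only serve this illustration.

```python
def run_until_converged(losses_per_round, threshold, check_interval=1):
    """Return the 1-based round in which the stop indication would be sent,
    or None if the loss never falls below the threshold at a checked round."""
    for round_index, loss in enumerate(losses_per_round, start=1):
        # The loss is compared with the threshold only every check_interval rounds.
        if round_index % check_interval == 0 and loss < threshold:
            return round_index
    return None

third_losses = [0.9, 0.5, 0.09, 0.05]
stop_every_round = run_until_converged(third_losses, threshold=0.1)
stop_every_second_round = run_until_converged(third_losses, threshold=0.1, check_interval=2)
```

Checking less often saves comparisons at the cost of possibly running a few extra training rounds.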
可选地,上述第二通信装置确定第三损失函数后,也可以将该第三损失函数通过信道发送到第一通信装置,由第一通信装置判断该第三损失函数是否低于预设阈值,若低于预设阈值,第一通信装置停止第一机器学习模型的训练,并向第二通信装置发送指示信息,该指示信息用于指示第二通信装置停止第二机器学习模型的训练。
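Optionally, after the second communication apparatus determines the third loss function, if the third loss function is lower than the preset threshold, the second communication apparatus stops sending the third loss function to the first communication apparatus. If the first communication apparatus does not receive the third loss function within a period of time, it stops training the first machine learning model.
In an optional embodiment, after S306, in which the first communication apparatus receives the second loss function through the feedback channel, the method 300 further includes: the first communication apparatus determines the degree of nonlinearity of the channel in a first time period based on the variance of a plurality of loss functions received in the first time period, where the plurality of loss functions include the second loss function; and the first communication apparatus determines the type of Kalman filtering based on the degree of nonlinearity of the channel in the first time period.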
示例性地,方差σ 2可以通过下列公式表示:
Figure PCTCN2022103985-appb-000001
其中,L k为时刻k的第二损失函数,T为第一时间段的时长,
Figure PCTCN2022103985-appb-000002
为T时刻内多个损失函数的均值。
第一通信装置可以通过σ 2的值,判断信道的非线性程度。
应理解,第一时间段为任意一段连续的时间,本申请实施例对第一时间段的时长不做限定。
不同环境下信道不同,第一通信装置可以通过判断信道在第一时间段内的非线性程度,确定卡尔曼滤波的类型。
本申请实施例提供的模型训练方法,可以通过第一时间段的非线性程度判断环境对信道的影响,通过改变卡尔曼滤波的类型减小环境对信道的影响,使第一机器学习模型更新的复杂度和精度达到平衡。
可选地,第一通信装置可以预设第一阈值,当第二损失函数的方差大于或等于该第一阈值时,信道在第一时间段内的非线性程度为强非线性;当第二损失函数的方差小于该第一阈值,信道在所述第一时间段内的非线性程度为弱非线性。
该第一阈值的值和第一阈值的个数可以是第一通信装置根据卡尔曼滤波的计算精度确定的。示例性地,第一通信装置采用容积卡尔曼滤波的3阶积分方法可以得到2阶估计精度,即设定一个第一阈值,将非线性程度分为强非线性和弱非线性。其中,第一阈值是一个大于0且小于1的值。应理解,若第一通信装置采用容积卡尔曼滤波中更高阶的积分方法,可以得到更高的计算精度,第一阈值的值可以不同,第一阈值的个数可以包括至少一个。
可选地,当信道在该第一时间段内的非线性程度为强非线性时,卡尔曼滤波的类型可以为容积卡尔曼滤波;当所述信道在第一时间段内的非线性程度为弱非线性时,卡尔曼滤波的类型可以为扩展卡尔曼滤波。
当信道的非线性程度为弱非线性时,第一通信装置可以选择复杂度较低的扩展卡尔曼滤波更新第一机器学习模型的参数;当信道的非线性程度为强非线性时,第一通信装置可以选择复杂度较高的容积卡尔曼滤波更新第一机器学习模型的参数。
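The threshold rule for choosing the filter type can be sketched as below. The population form of the variance (division by the number of samples), the string labels, and the function names are assumptions of this sketch; the application only states that the variance of the received loss values is compared with a first threshold.

```python
def loss_variance(losses):
    """Variance of the loss values received in the first time period."""
    mean = sum(losses) / len(losses)
    return sum((value - mean) ** 2 for value in losses) / len(losses)

def select_filter(losses, first_threshold):
    """Strong nonlinearity (variance >= threshold): cubature Kalman filter.
    Weak nonlinearity (variance < threshold): extended Kalman filter."""
    if loss_variance(losses) >= first_threshold:
        return "cubature"
    return "extended"

stable_channel = [0.50, 0.52, 0.49, 0.51]
volatile_channel = [0.1, 1.9, 0.2, 1.8]

choice_stable = select_filter(stable_channel, first_threshold=0.5)
choice_volatile = select_filter(volatile_channel, first_threshold=0.5)
```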
可选地,当信道的非线性程度为强非线性时,第一通信装置可以采用更高阶的积分方式更新第一机器学习模型的参数。
示例性地,第一通信装置若采用容积卡尔曼滤波的5阶积分方法,则采样点的个数可以为n 2+n+1,计算精度更高,更适合强非线性的信道估计。
可选地,当信道的非线性程度为弱非线性时,第一通信装置可以减少控制层的层数;当信道的非线性程度为强非线性时,第一通信装置可以增加控制层的层数。
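The parameters of the control layer are denoted $\Theta_c$. The first communication apparatus may adaptively change the number of layers covered by the control-layer parameters $\Theta_c$ based on the degree of nonlinearity of the channel.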
当信道的非线性程度较弱时,较少的控制层参数便可以消除信道误差的影响,减小更新控制层参数的复杂度,同时,在更新第一网络层的参数时,可以减少反向梯度传播的计算量,进而减小训练第一机器学习模型的复杂度。
当信道的非线性程度较强时,较多的控制层参数可以消除信道误差强非线性的影响,提高控制层参数更新的精度。
作为一个可选的实施例,上述第一数据可以包括N组数据,其中,N为正整数,且N的值是根据卡尔曼滤波的类型和控制层的参数的维度确定的。
第一数据的个数可以根据卡尔曼滤波的类型和控制层的参数的维度确定。
示例性地,若控制层的参数的维度为6,卡尔曼滤波的类型为容积卡尔曼滤波,则第一数据的个数可以为2*6=12个,即第一通信装置对控制层每个维度的参数增加了左右两个扰动,可以得到12个采样点。若第一训练数据为一组数据,第一通信装置将该组数据分别输入第一机器学习模型中,则可以得到12组第一数据。
示例性地,若卡尔曼滤波的类型为扩展卡尔曼滤波,则第一数据可以为1组数据,即无需对控制层的参数进行采样。上述控制层的参数的更新的方式仍然可以适用。
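The counts stated in the text (2n groups for the third-degree cubature rule, n² + n + 1 sampling points for the fifth-degree rule, and a single group for extended Kalman filtering) can be collected in one helper. The function name, the string labels, and the `degree` argument are assumptions of this sketch.

```python
def num_data_groups(filter_type, control_dim, degree=3):
    """Number of groups of first data for a given filter type and
    control-layer parameter dimension, using the counts quoted in the text."""
    if filter_type == "extended":
        return 1                      # no sampling of the control-layer parameters
    if filter_type == "cubature":
        if degree == 3:
            return 2 * control_dim    # two perturbations per parameter dimension
        if degree == 5:
            return control_dim ** 2 + control_dim + 1
    raise ValueError("unsupported filter type or degree")

groups_ckf = num_data_groups("cubature", control_dim=6)
groups_ekf = num_data_groups("extended", control_dim=6)
groups_ckf5 = num_data_groups("cubature", control_dim=6, degree=5)
```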
下面,以第一通信装置对控制层的参数进行采样后进行模型训练为例,对本申请实施例提供的模型训练方法进行详细介绍。
图5示出了本申请实施例提供的另一种模型训练方法500的示意性流程图。如图5所示,该方法可以包括下列步骤:
S501,第一通信装置对控制层的参数进行采样,得到控制层的参数的采样点。
控制层可以是第一机器学习模型的最后至少一层网络。
示例性地,第一通信装置在训练第一机器学习模型之前,可以先初始化第一机器学习模型中控制层的参数θ 0和控制层参数θ 0的误差协方差P 0|0=I。然后,第一通信装置可以对θ 0进行采样。例如,k时刻的采样点可以表示为
Figure PCTCN2022103985-appb-000003
其中,k≥1,0时刻的采样点可以记为
Figure PCTCN2022103985-appb-000004
应理解,时刻可以理解是采样的时刻或者采样的次数。
Figure PCTCN2022103985-appb-000005
可以通过下列公式(1)表示:
Figure PCTCN2022103985-appb-000006
其中,
Figure PCTCN2022103985-appb-000007
表示服从均值为
Figure PCTCN2022103985-appb-000008
方差为P k-1|k-1的高斯分布,
Figure PCTCN2022103985-appb-000009
为第k-1时刻更新后的控制层的参数,P k-1|k-1为k-1时刻的控制层的参数的误差协方差,用来度量估计的准确程度,θ k-1为k-1时刻的控制层的参数。
应理解,
Figure PCTCN2022103985-appb-000010
可以理解为基于第k-1时刻结果对第k时刻参数的预测(先验估计)值。k是指更新卡尔曼滤波的时刻,或者是训练的次数。k的取值范围是由整个训练过程决定的,即上述第一损失函数低于预设阈值后就终止训练。
P k|k-1为k-1时刻的控制层的采样参数与k时刻的控制层的采样参数之间的误差协方差,该P k|k-1可以通过下列公式(2)表示:
Figure PCTCN2022103985-appb-000011
其中,Q k-1为系统噪声,该Q k-1与P k|k-1的关系可以通过下列公式(3)表示:
Figure PCTCN2022103985-appb-000012
其中,λ为遗忘因子,表示对过去的数据施加指数衰减权重,取值范围为0<λ≤1。
故上述P k|k-1可以转换成通过下列公式(4)表示:
Figure PCTCN2022103985-appb-000013
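For orientation, one widely used way to realize a forgetting factor in Kalman-filter-based parameter estimation is sketched below. It is given here only as an assumption: the exact formulas (2) to (4) are the ones shown in the figures above and may differ in detail.

```latex
% Assumed textbook form, not taken from the figures:
P_{k|k-1} = P_{k-1|k-1} + Q_{k-1}, \qquad
Q_{k-1} = \left(\frac{1}{\lambda} - 1\right) P_{k-1|k-1}
\;\;\Longrightarrow\;\;
P_{k|k-1} = \frac{1}{\lambda}\, P_{k-1|k-1}, \qquad 0 < \lambda \le 1 .
```

In this form, a value of $\lambda$ close to 1 keeps the covariance almost unchanged, while a smaller $\lambda$ inflates it and thus gives more weight to recent observations.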
若卡尔曼滤波的类型为容积卡尔曼滤波,第一通信装置可以采用体积法来计算高斯权重积分,如下列公式(5)所示:
Figure PCTCN2022103985-appb-000014
where $P = SS^{T}$, $S$ is the orthogonal triangular decomposition of $P$, and
Figure PCTCN2022103985-appb-000015
$\gamma_i$ is an integration point, and $\gamma_i$ may be expressed by the following formula:
Figure PCTCN2022103985-appb-000016
其中,e i表示第i个元素为1的单位列向量。
因此,可通过生成2n个采样点计算得到控制层的参数的采样点
Figure PCTCN2022103985-appb-000017
其中,n为大于等于1的正整数。
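Generating the 2n sampling points can be sketched with the standard third-degree cubature rule, in which the points are the mean plus or minus √n times the columns of a square-root factor S of the covariance. This is an assumption-laden illustration: it uses a Cholesky factor for S and the helper names `cholesky` and `cubature_points`, neither of which is prescribed by the application.

```python
import math

def cholesky(matrix):
    """Lower-triangular factor S with S * S^T = matrix (symmetric positive definite)."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            partial = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                lower[i][j] = math.sqrt(matrix[i][i] - partial)
            else:
                lower[i][j] = (matrix[i][j] - partial) / lower[j][j]
    return lower

def cubature_points(mean, covariance):
    """Return 2n points: mean + sqrt(n) * S[:, i] and mean - sqrt(n) * S[:, i]."""
    n = len(mean)
    s = cholesky(covariance)
    scale = math.sqrt(n)
    points = []
    for sign in (1.0, -1.0):
        for i in range(n):
            # Column i of S is the image of the unit vector e_i.
            points.append([mean[r] + sign * scale * s[r][i] for r in range(n)])
    return points

theta_prior = [1.0, 2.0]
p_prior = [[4.0, 2.0], [2.0, 3.0]]
samples = cubature_points(theta_prior, p_prior)
```

With equal weights 1/(2n), the sample mean and sample covariance of these points reproduce the prior mean and covariance, which is what makes them usable for the Gaussian-weighted integrals.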
S502,第一通信装置可以将第一训练数据输入至第一机器学习模型,得到第一数据,其中,第一机器学习模型包括上述控制层的参数的采样点。
第一训练数据为一组数据,若卡尔曼滤波的类型为容积卡尔曼滤波,则控制层的参数的采样点为2n个,则第一数据可以为2n组数据。
S503,第一通信装置可以通过信道向第二通信装置发送第一数据。
S504,第二通信装置通过信道接收第二数据,第二数据是第一通信装置发送的第一数据经过信道传输后得到的。
应理解,第一数据为2n组数据,则第二数据也为2n组数据。
S505,第二通信装置将第二数据输入至第二机器学习模型,得到第三数据。
应理解,第三数据为2n组数据。
示例性地,假设第一机器学习模型、信道以及第二机器学习模型所表达的非线性函数为h(u;θ),其中,u为通信系统的输入,θ为控制层的参数,则上述第三数据可以通过下列公式(7)表示:
Figure PCTCN2022103985-appb-000018
其中,u k为第一训练数据,θ k为该时刻的控制层参数,
Figure PCTCN2022103985-appb-000019
表示服从均值为
Figure PCTCN2022103985-appb-000020
方差为P k|k-1的高斯分布,k为训练的轮数或者训练的时刻。
Figure PCTCN2022103985-appb-000021
可以通过上述公式(1)表示,P k|k-1可以通过上述公式(2)表示。
第二通信装置可以根据该第三数据,估计该第三数据之间的误差协方差P dd,该P dd可以通过下列公式(8)表示:
Figure PCTCN2022103985-appb-000022
其中,R k为观测误差的协方差。
第二通信装置可以采用体积法来计算高斯权重积分。
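For example, let $d_{i,k|k-1} = h(u_k;\theta_{i,k|k-1})$ denote the third data obtained by substituting the 2n different sampling points $\theta_{i,k|k-1}$ into $h(u_k;\theta_k)$. The third data may then be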
Figure PCTCN2022103985-appb-000023
则上述P dd可以通过下列公式(9)表示:
Figure PCTCN2022103985-appb-000024
或者,上述P dd可以通过下列公式(10)表示:
Figure PCTCN2022103985-appb-000025
其中,1≤i≤2n,D为中心向量,D可以通过下列公式(11)表示:
Figure PCTCN2022103985-appb-000026
S506,第二通信装置将第三数据作为预测值、第一训练数据作为真实值,确定第一损失函数。
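It should be understood that there are 2n first loss functions. Passing the first training data through each sampling point yields one piece of first data, that piece of first data yields one piece of third data after passing through the channel and the second machine learning model, and one first loss function is obtained from each piece of third data and the first training data. Because there are 2n sampling points, there are 2n first loss functions.
The second communication apparatus may compute the cross entropy as the first loss function. The first loss function $L_k$ may be expressed by the following formula (12):
$$L_k = -\sum u_k \log h(u_k;\theta_k) \qquad (12)$$
where the first training data is $u_k$ and the third data is $h(u_k;\theta_k)$.
The training objective is to make the error between the true value and the third data as small as possible, that is, to make the first loss function $L_k$ as small as possible. $L_k$ can therefore be approximated as 0, as shown in the following formula (13):
$$L_k = |h(u_k;\theta_k) - u_k| \approx 0 \qquad (13)$$
The second communication apparatus can therefore observe the first loss function instead of observing the third data, so the above $P_{dd}$ can be rewritten as the following formula (14):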
Figure PCTCN2022103985-appb-000027
且第二通信装置的观测值,即第一损失函数,可以记为L i,k,其中,第一损失函数包括2n个,i可以为取遍{1,2,…,2n}的整数。
S507,第二通信装置根据第一损失函数,基于反向梯度传播更新第二机器学习模型的参数,得到更新后的第二机器学习模型。
示例性地,第二通信装置可以计算第一损失函数的均值,采用第一损失函数的均值并基于反向梯度传播更新第二机器学习模型的参数。
该第一损失函数的均值可以为
Figure PCTCN2022103985-appb-000028
其中,L i,k为2n个第一损失函数。
S508,第二通信装置通过反馈信道向第一通信装置发送第一损失函数,反馈信道是第二通信装置根据观测误差确定的,第一损失函数用于更新第一机器学习模型的参数。
示例性地,第二通信装置可以通过反馈信道向第一通信装置发送第一损失函数L i,k,即分别发送2n个第一损失函数。
第二通信装置可以根据环境变化动态估计观测误差,并通过功率控制使信道的误差近似与观测误差相同,构造反馈信道。
示例性地,第二通信装置可以定义预测值和真实值的误差为
Figure PCTCN2022103985-appb-000029
并先预设误差协方差
Figure PCTCN2022103985-appb-000030
R max的值可以为经验值,该经验值可以是第二通信装置根据接收来自第一通信装置的误差协方差确定。第二通信装置可以根据
Figure PCTCN2022103985-appb-000031
估计一段时间内的预测值和真实值的误差协方差
Figure PCTCN2022103985-appb-000032
Figure PCTCN2022103985-appb-000033
可以通过下列公式(15)表示:
Figure PCTCN2022103985-appb-000034
其中,T i为该段时间的时长,i≥0。
随后的T i+1时间段,令
Figure PCTCN2022103985-appb-000035
其中0<λ≤1,再次计算该时间段的
Figure PCTCN2022103985-appb-000036
如果此时
Figure PCTCN2022103985-appb-000037
则停止调整,令观测误差协方差
Figure PCTCN2022103985-appb-000038
Otherwise, $R_k$ continues to be adjusted in time period $T_{i+2}$. If the error covariance jumps, that is,
Figure PCTCN2022103985-appb-000039
则意味着环境有较大变动,此时重设R k=R max,并重复上述步骤。
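In addition, a lookup table may be established for adjusting $R_k$. The table contains the correspondence between indices and values of $R_k$, for example with indices from largest to smallest corresponding to $R_{\max}$ through $R_{\min}$. The second communication apparatus may determine which $R_k$ to select by computing, for two time periods, the values of
Figure PCTCN2022103985-appb-000040
Figure PCTCN2022103985-appb-000041
For example, when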
Figure PCTCN2022103985-appb-000042
时,索引减1,对应选取的R k减小。反之则停止调整,令观测误差协方差
Figure PCTCN2022103985-appb-000043
其中,R k=r kI,其中,I为单位阵,r k为方差。
第二通信装置可以通过功率控制,将信道建模成均值为0,方差为r k的加性高斯白噪声(additive white gaussian noise,AWGN)信道,将2n个第一损失函数L i,k反馈给第一通信装置,用于第一通信装置更新控制层的参数。
可选地,第二通信装置可以通过信道向第一通信装置发送第一损失函数的均值,即
Figure PCTCN2022103985-appb-000044
同时向第一通信装置发送上述中心向量D。
S509,第一通信装置通过反馈信道接收第二损失函数,第二损失函数是第二通信装置发送的第一损失函数经过信道传输后得到的。
第二通信装置将反馈信道建模成信道误差为观测误差r k的AWGN信道,故第一损失函数经过反馈信道传输后得到第二损失函数
Figure PCTCN2022103985-appb-000045
即下列公式(16)所示:
Figure PCTCN2022103985-appb-000046
即第二通信装置通过反馈信道向第一通信装置发送第一损失函数为L i,k,则第一通信装置接收到的第二损失函数为
Figure PCTCN2022103985-appb-000047
可选地,若第二通信装置向第一通信装置发送第一损失函数的均值,则第二通信装置可以将上述中心向量D通过反馈信道发送给第一通信装置,相应的,第一通信装置可以通过反馈信道接收第二损失函数的均值和带有观测误差的中心向量D。
S510,第一通信装置根据第二损失函数、控制层的先验参数以及第二损失函数的误差协方差,得到卡尔曼增益。
示例性地,首先,第一通信装置可以根据第二损失函数估计第二损失函数的误差协方差。
由于反馈信道的误差期望为0,即
Figure PCTCN2022103985-appb-000048
Figure PCTCN2022103985-appb-000049
故第二损失函数的误差协方差可以通过公式(17)表示:
Figure PCTCN2022103985-appb-000050
应理解,该第二损失函数的误差协方差
Figure PCTCN2022103985-appb-000051
与上述公式(8)中的P dd相同。
然后,第一通信装置可以根据控制层的先验参数,得到第二损失函数的交叉协方差P θd,其中,P θd可以通过下列公式(18)表示:
Figure PCTCN2022103985-appb-000052
进一步的,P θd可以通过下列公式(19)或(20)表示:
Figure PCTCN2022103985-appb-000053
或者,
Figure PCTCN2022103985-appb-000054
其中,
Figure PCTCN2022103985-appb-000055
最后,第一通信装置可以根据第二损失函数的误差协方差和第二损失函数的交叉协方差,得到卡尔曼增益G k,其中,G k可以通过下列公式(21)表示:
Figure PCTCN2022103985-appb-000056
S511,第一通信装置根据卡尔曼增益,更新控制层的参数,得到更新后的控制层的参数。
示例性地,更新后的控制层的参数
Figure PCTCN2022103985-appb-000057
可以通过下列公式(22)表示:
Figure PCTCN2022103985-appb-000058
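Steps S510 and S511 can be sketched end to end for a scalar observation. This is a minimal sketch, assuming equal cubature weights 1/(2n), a scalar loss per sampling point, and the standard Kalman correction toward a target observation of 0 (the loss is ideally zero); the exact expressions are those of formulas (17) to (22) in the figures above, and the function name `ckf_update` is hypothetical.

```python
def ckf_update(sample_points, received_losses, target=0.0):
    """One control-layer update from 2n sampling points and the 2n received
    (second) loss values. Returns (updated_parameters, kalman_gain)."""
    count = len(sample_points)
    dim = len(sample_points[0])
    prior_mean = [sum(p[r] for p in sample_points) / count for r in range(dim)]
    loss_mean = sum(received_losses) / count
    # Error covariance of the received losses (observation noise is already
    # contained in them because they passed through the feedback channel).
    p_dd = sum((loss - loss_mean) ** 2 for loss in received_losses) / count
    # Cross-covariance between control-layer parameters and losses.
    p_theta_d = [
        sum((p[r] - prior_mean[r]) * (loss - loss_mean)
            for p, loss in zip(sample_points, received_losses)) / count
        for r in range(dim)
    ]
    if p_dd == 0.0:
        return prior_mean, [0.0] * dim   # no information in the observations
    gain = [c / p_dd for c in p_theta_d]
    updated = [m + g * (target - loss_mean) for m, g in zip(prior_mean, gain)]
    return updated, gain

# Toy case: the loss grows linearly with the first parameter only.
points = [[2.0, 1.0], [1.0, 2.0], [0.0, 1.0], [1.0, 0.0]]
losses = [2.0, 1.0, 0.0, 1.0]
theta_new, gain = ckf_update(points, losses)
```

In the toy case the update moves only the first parameter, because the second parameter has zero cross-covariance with the loss.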
S512,第一通信装置根据更新后的控制层的参数和卡尔曼增益,基于反向梯度传播更新第一机器学习模型中第一网络层的参数,得到更新后的第一网络层的参数,第一网络层包括在控制层之前的网络层。
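For example, FIG. 6 is a schematic diagram of updating the parameters of the first network layer. As shown in FIG. 6, the control-layer parameters are denoted $\Theta_c$ and the parameters of the first network layer are denoted $\Theta_{z-c}$, where $\Theta_{z-c}$ denotes the weights between layer $l_{z-c-1}$ and layer $l_{z-c}$ of the network, and $l_{z-c}$ is the parameter of the corresponding layer of the neural network, with $l_z = \Theta_c l_{z-c}$ and $l_{z-c} = \Theta_{z-c} l_{z-c-1}$. Here $c$ may denote the number of network layers in the control layer.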
基于卡尔曼滤波的梯度可以为
Figure PCTCN2022103985-appb-000059
其中,j为更新的次数,G为第j次计算得到的卡尔曼增益,
Figure PCTCN2022103985-appb-000060
为第j次计算得到的第二损失函数。假定
Figure PCTCN2022103985-appb-000061
Figure PCTCN2022103985-appb-000062
表示向量的伪逆,可以得到推算控制层前一个网络层的参数更新方式,即可以通过下列公式(23)所示:
Figure PCTCN2022103985-appb-000063
其中,z为第一机器学习模型的总网络层数,j可以为取遍{1,2,…,z-c}的整数。
依次类推,第一网络层的其他网络按该更新方式进行更新,此处不再进行赘述。
S513,第一通信装置根据更新后的控制层的参数和更新后的第一网络层的参数,得到更新后的第一机器学习模型。
本申请实施例提供的模型训练方法,对控制层的参数进行采样,更好地结合卡尔曼滤波到模型训练中,进一步增加模型训练的可行性,提高训练自编码器的收敛速度,优化自编码器的鲁棒性,从而提高端到端的通信质量。
本申请实施例还对该方法500进行了仿真,以检验该方法500的效果。示例性地,仿真是在AWGN时变扰动信道下进行的,对比本申请实施例提出的方法500和基于强化学习的策略梯度(policy gradient,PG)的效果。其中,本申请实施例提出的方法500为基于容积卡尔曼滤波(cubature kalman filter,CKF)的训练方法。
在该仿真中,信道的信噪比是实时变化的,信噪比的取值范围可以设置为[10,25],其中,信噪比的单位为分贝。另外,在该仿真中,调制阶数为4,第一训练数据的长度为256,该第一训练数据在输入基于容积卡尔曼滤波的机器学习模型之前,需要进行独热编码(one-hot)得到长度为16的训练数据。
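One plausible reading of the one-hot step is that each group of 4 bits is mapped to a length-16 one-hot vector (2⁴ = 16). That reading, the bit ordering (most significant bit first), and the function name `one_hot_from_bits` are assumptions of this sketch; the simulation description does not spell out the mapping.

```python
def one_hot_from_bits(bits):
    """Map a group of bits (most significant first) to a one-hot vector
    of length 2 ** len(bits)."""
    index = int("".join(str(bit) for bit in bits), 2)
    vector = [0] * (2 ** len(bits))
    vector[index] = 1
    return vector

encoded = one_hot_from_bits([0, 1, 0, 1])
```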
上述仿真分别对CKF和PG迭代4000次,分别观察两种算法的交叉熵损失和误码率变化。
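FIG. 7 is a schematic diagram of the cross-entropy loss under the model training method provided in this embodiment of this application. As shown in FIG. 7, as the number of iterations increases, the loss of CKF decreases faster than that of PG, the loss fluctuation of CKF is smaller than that of PG, and the cross-entropy loss of CKF is lower than that of PG. A smaller cross-entropy loss indicates that the channel has less impact on the communication between the first communication apparatus and the second communication apparatus.
FIG. 8 is a schematic diagram of the bit error rate under the model training method provided in this embodiment of this application. As shown in FIG. 8, as the number of iterations increases, the bit error rate of CKF decreases faster than that of PG, and the bit error rate of CKF is lower than that of PG.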
由图7和图8可知,基于CKF的训练方法,可以提高模型训练的收敛速度和鲁棒性。
作为一个可选的实施例,上述第一数据可以包括M组数据,其中,M为正整数,M的值是第一通信装置与其他第一通信装置根据预设规则确定的,M与其他第一通信装置所发送的数据的个数之和是根据卡尔曼滤波的类型和控制层的参数的维度确定的。
其他第一通信装置所发送的数据包括其他第一通信装置中的每个第一通信装置中的机器学习模型的输出结果。
在端到端的通信系统中,第一通信装置的个数可以为多个,第二通信装置的个数可以为1个。在该通信系统中,上述第一通信装置可以与其他第一通信装置根据预设规则确定第一数据的数量M。该预设规则可以是在该通信系统中,每个第一通信装置中的机器学习模型的输出结果的个数均大于等于1,且所有的第一通信装置中的机器学习模型的输出结果的个数之和由卡尔曼滤波的类型和控制层的参数的维度确定。在该通信系统中的多个第一通信装置可以通过互相通信,确定自身的采样点。例如,若该通信系统中共有a个第一通信装置,且该a个第一通信装置构成环拓扑结构,则a-1个第一通信装置可以通过互相通信确定采样点编号时序。
本申请实施例提供的模型训练方法,将对控制层的采样分成多个子任务,由多个第一通信装置共同完成,可以降低第一通信装置的运算量,从而降低第一通信装置的运算负担,保证在线训练的部署实现。
在该通信系统下,第一通信装置为通信系统中分布式的中心第一通信装置时,仍可以按照上述方法300更新控制层的参数,得到更新后的控制层的参数。
应理解,第一通信装置可以通过信道向第二通信装置发送第一数据,该第一数据可以包括M组数据,若通信系统中的多个第一通信装置中的机器学习模型的输出结果的个数之和为P个,则第一通信装置可以通过信道接收的第一损失函数的个数为P个,应理解,P的值大于或等于M的值。第一通信装置可以根据所述P个第一损失函数,更新控制层的参数,得到更新后的控制层的参数。第一通信装置可以将更新后的控制层的参数通过互相通信的方式传输给其他第一通信装置。
应理解,通信系统中的多个第一通信装置采用中心式的分布式训练方法,将控制层的采样分成多个子任务,由多个第一通信装置共同完成,上述第一通信装置可作为中心的通信装置,可以接收第二通信装置发送的第一损失函数,并训练得到控制层的参数,然后下发给其他第一通信装置。
示例性地,图9示出了另一种模型训练方法900的示意性流程图。如图9所示,通信系统可以包括第一通信装置1、第一通信装置2以及第二通信装置,第一通信装置1部署有第一机器学习模型1,第一通信装置2部署有第一机器学习模型2。应理解,该通信系统中第一通信装置的个数仅仅为一个示例,第一通信装置2为分布式的中心第一通信装置仅仅为一个示例,本申请实施例对此不做限定。
如图9所示,方法900可以包括下列步骤:
S901,第一通信装置1将第一训练数据输入至第一机器学习模型1,得到第一数据1,第一机器学习模型1包括控制层的参数的采样点1,该控制层的参数的采样点1是第一通信装置1对第一机器学习模型1中控制层的参数进行采样得到的。
S902,第一通信装置2将第一训练数据输入至第一机器学习模型2,得到第一数据2,第一机器学习模型2包括控制层的参数的采样点2,该控制层的参数的采样点2是第一通信装置2对第一机器学习模型2中控制层的参数进行采样得到的。
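The initial parameters of the first machine learning model 1 and the first machine learning model 2 may be the same or different.
The first communication apparatus 1 and the first communication apparatus 2 may determine the number of sampling points 1 and the number of sampling points 2 according to a preset rule. For example, if the first communication apparatus 1 or the first communication apparatus 2 uses cubature Kalman filtering to train the first machine learning model 1 or the first machine learning model 2, and the first machine learning model 1 and the first machine learning model 2 have the same number of network layers, n each, then the number of sampling points 1 and the number of sampling points 2 sum to 2n, and the ratio of the number of sampling points 1 to the number of sampling points 2 may be any value greater than 0.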
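Dividing the sampling points among several first communication apparatuses can be sketched as follows. The near-even split used here is only one possible preset rule and is an assumption of this sketch (the text allows any ratio greater than 0, as long as every apparatus gets at least one point); the function name `split_sample_points` is hypothetical.

```python
def split_sample_points(total_points, num_devices):
    """Assign contiguous sampling-point index ranges to each device so that
    every device gets at least one point and all points are covered."""
    if num_devices < 1 or total_points < num_devices:
        raise ValueError("each device needs at least one sampling point")
    base, extra = divmod(total_points, num_devices)
    ranges = []
    start = 0
    for device in range(num_devices):
        size = base + (1 if device < extra else 0)
        ranges.append(range(start, start + size))
        start += size
    return ranges

# 2n = 12 sampling points shared by two, then by five, first communication apparatuses.
two_devices = split_sample_points(12, 2)
five_devices = split_sample_points(12, 5)
```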
第一通信装置2可以通过对控制层的参数进行采样,得到的控制层的参数的采样点2。
S903,第一通信装置1将第一数据1通过信道发送给第二通信装置。
S904,第二通信装置通过信道接收第二数据1,第二数据1是第一数据1经过信道传输后得到的。
S905: The first communication apparatus 2 sends the first data 2 to the second communication apparatus through the channel.
S906,第二通信装置通过信道接收第二数据2,第二数据2是第一数据2经过信道传输后得到的。
S907,第二通信装置根据第二数据1和第二数据2,确定第一损失函数。
第二通信装置可以将第二数据1和第二数据2分别输入至第二机器学习模型,得到第三数据1和第三数据2,将第三数据1作为预测值、第一训练数据作为真实值,确定第一损失函数1,将第三数据2作为预测值、第一训练数据作为真实值,确定第一损失函数2。上述第一损失函数包括第一损失函数1和第一损失函数2。具体的实现方式与上述S505和S506相同,此处不再赘述。
S908,第二通信装置通过反馈信道向第一通信装置2发送第一损失函数。
第二通信装置构造反馈信道的过程与上述实施例相同,此处不再赘述。
S909,第一通信装置2通过反馈信道接收第二损失函数,第二损失函数是第一损失函数经过反馈信道传输后得到的。
第一通信装置2为中心第一通信装置,第一通信装置2可以接收第二通信装置通过反馈信道发送的全部的第二损失函数。
S910,第一通信装置2根据第二损失函数,得到更新后的控制层的参数。
S911,第一通信装置2向第一通信装置1发送更新后的控制层的参数。
第一通信装置2为中心第一通信装置,可以将更新后的控制层的参数发送给其他第一通信装置,即第一通信装置1。
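The model training method provided in this embodiment of this application uses a centralized distributed training approach in which the sampling of the control layer is divided into multiple subtasks that are completed jointly by two first communication apparatuses. This reduces the computation of the non-central first communication apparatus (the first communication apparatus 1). The central first communication apparatus sends the updated control-layer parameters to the other first communication apparatuses, which improves the efficiency of updating the control-layer parameters.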
可选地,上述第一通信装置1还可以将第一数据1发送给第一通信装置2,由第一通信装置2融合第一数据2后一起通过信道发送给第二通信装置。
可选地,若上述第一通信装置根据上述方法300的方法更新第一机器学习模型的参数后,第一通信装置还可以将更新后的第一机器学习模型的参数通过互相通信的方式传输给其他第一通信装置。
本申请实施例提供的模型训练方法,采用中心式的分布式训练方法,当中心的第一通信装置训练完成后,可以向其他第一通信装置发送更新后的模型参数,节省了其他第一通信装置的训练成本,减小了其他第一通信装置的计算量。
可选地,若上述第一通信装置根据上述方法300的方法确定卡尔曼增益后,第一通信装置还可以将更新后的控制层的参数和该卡尔曼增益通过互相通信的方式传输给其他第一通信装置,其他第一通信装置可以基于接收到的更新后的控制层的参数和该卡尔曼增益,基于反向梯度传播更新第一机器学习模型中第一网络层的参数,进而更新第一机器学习模型的参数。
可选地,上述第一通信装置可以将控制层的先验参数、第二损失函数、第二损失函数的误差协方差以及更新后的控制层的参数通过互相通信的方式传输给其他第一通信装置,其他第一通信装置可以先根据接收到的控制层的先验参数、第二损失函数、第二损失函数的误差协方差确定卡尔曼增益,然后再根据更新后的控制层的参数和该卡尔曼增益,基于反向梯度传播更新第一机器学习模型中第一网络层的参数,进而更新第一机器学习模型的参数。
上述各过程的序号的大小并不意味着执行顺序的先后,各过程的执行顺序应以其功能和内在逻辑确定,而不应对本申请实施例的实施过程构成任何限定。
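The model training method of the embodiments of this application is described in detail above with reference to FIG. 1 to FIG. 9. The related apparatus for model training of the embodiments of this application is described in detail below with reference to FIG. 10 and FIG. 11.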
图10示出了本申请实施例提供的一种模型训练的相关装置1000的示意性框图。该装置1000包括:收发单元1010和处理单元1020。
在一种可能的实现方式中,该装置1000可以实现关联于上文方法实施例300中的第一通信装置执行的各个步骤或流程。
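The transceiver unit 1010 is configured to: send first data to a second communication apparatus through a channel, where the first data is an output result obtained by inputting first training data into a first machine learning model, the first machine learning model includes a control layer, and the control layer is at least one layer of the first machine learning model; and receive a second loss function through a feedback channel, where the feedback channel is determined based on an observation error, and the second loss function is obtained after a first loss function sent by the second communication apparatus is transmitted through the feedback channel. The processing unit 1020 is configured to: update the parameters of the control layer based on Kalman filtering according to the second loss function, to obtain updated parameters of the control layer, where the updated parameters of the control layer are used to update the parameters of the first machine learning model.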
可选地,上述处理单元1020还用于:根据控制层的先验参数、第二损失函数和第二损失函数的误差协方差,得到卡尔曼增益;根据卡尔曼增益,更新控制层的参数,得到更新后的控制层的参数。
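Optionally, the transceiver unit 1010 is further configured to: send fourth data to the second communication apparatus through the channel, where the fourth data is an output result obtained by inputting second training data into the first machine learning model; and receive indication information from the second communication apparatus, where the indication information instructs the apparatus to stop training the first machine learning model. The processing unit 1020 is further configured to stop training the first machine learning model according to the indication information.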
可选地,第一数据包括N组数据,其中,N为正整数,且N的值是根据卡尔曼滤波的类型和控制层的参数的维度确定的。
可选地,第一数据包括M组数据,其中,M为正整数,M的值是该装置与其他第一通信装置根据预设规则确定的,M与其他第一通信装置所发送的数据的个数之和是根据卡尔曼滤波的类型和控制层的参数的维度确定的。
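Optionally, the transceiver unit 1010 is further configured to send the updated parameters of the control layer to other first communication apparatuses.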
可选地,上述处理单元1020还用于:根据第一时间段内接收到的多个损失函数的方差,判断信道在第一时间段内的非线性程度,多个损失函数包括第二损失函数;根据信道在第一时间段内的非线性程度,确定卡尔曼滤波的类型。
可选地,第二损失函数的方差大于或等于第一阈值,信道在第一时间段内的非线性程度为强非线性;或者,第二损失函数的方差小于第一阈值,信道在第一时间段内的非线性程度为弱非线性。
可选地,信道在第一时间段内的非线性程度为强非线性,卡尔曼滤波的类型为容积卡尔曼滤波;或者,信道在第一时间段内的非线性程度为弱非线性,卡尔曼滤波的类型为扩展卡尔曼滤波。
在一种可能的实现方式中,该装置1000可以实现对应于上文方法实施例300中的第二通信装置执行的各个步骤或流程。
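The transceiver unit 1010 is configured to: receive second data through a channel, where the second data is obtained after first data sent by a first communication apparatus is transmitted through the channel, the first data is an output result obtained by inputting first training data into a first machine learning model, the first machine learning model includes a control layer, and the control layer is at least one layer of the first machine learning model. The processing unit 1020 is configured to: input the second data into a second machine learning model to obtain third data; and determine a first loss function based on the third data and the first training data. The transceiver unit 1010 is further configured to: send the first loss function to the first communication apparatus through a feedback channel, where the feedback channel is determined based on an observation error, and the first loss function is used to update the parameters of the control layer of the first machine learning model.
Optionally, the processing unit 1020 is further configured to: update the parameters of the second machine learning model based on gradient backpropagation according to the first loss function, to obtain an updated second machine learning model.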
可选地,上述收发单元1010还用于:通过信道接收第五数据,第五数据是第一通信装置发送的第四数据经过信道传输后得到的,第四数据是第二训练数据输入至第一机器学习模型的输出结果;上述处理单元1020用于:将第五数据输入至第二机器学习模型,得到第六数据;根据第六数据和第二训练数据,确定第三损失函数;上述收发单元还用于:若第三损失函数低于预设阈值,向第一通信装置发送指示信息,指示信息用于指示第一通信装置停止第一机器学习模型的训练。
这里的装置1000以功能单元的形式体现。这里的术语“单元”可以指应用特有集成 电路(application specific integrated circuit,ASIC)、电子电路、用于执行一个或多个软件或固件程序的处理器(例如共享处理器、专有处理器或组处理器等)和存储器、合并逻辑电路和/或其它支持所描述的功能的合适组件。在一个可选例子中,本领域技术人员可以理解,该装置1000可以具体为上述实施例中的第一通信装置或者第二通信装置,或者,上述实施例中第一通信装置或者第二通信装置的功能可以集成在该装置中,该装置可以用于执行上述方法实施例中与第一通信装置或者第二通信装置对应的各个流程和/或步骤,为避免重复,在此不再赘述。
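The apparatus 1000 has the functions of implementing the corresponding steps performed by the first communication apparatus or the second communication apparatus in the foregoing embodiments. These functions may be implemented by hardware, or by hardware executing corresponding software. The hardware or software includes one or more modules corresponding to the functions. For example, the transceiver unit 1010 may include a sending unit and a receiving unit. The sending unit may be used to implement the steps and/or procedures of the transceiver unit that perform sending actions, and the receiving unit may be used to implement the steps and/or procedures of the transceiver unit that perform receiving actions. The sending unit may be replaced by a transmitter and the receiving unit by a receiver, which respectively perform the sending and receiving operations and the related processing operations in the method embodiments. As another example, the transceiver unit 1010 may be replaced by a communication interface that performs the sending and receiving operations in the method embodiments. In the embodiments of this application, a communication interface may be a circuit, a module, a bus, a bus interface, a transceiver, or another apparatus that can implement a communication function. It should be understood that the processing unit 1020 in the foregoing embodiments may be implemented by a processor or a processor-related circuit, and the transceiver unit 1010 may be implemented by a transceiver, a transceiver-related circuit, or an interface circuit.
Optionally, the apparatus in the foregoing possible designs may further include a storage unit configured to store a computer program. The processing unit 1020 may invoke the computer program from the storage unit and run it, so that the apparatus 1000 performs the method of the first communication apparatus or the second communication apparatus in the foregoing method embodiments. This is not limited in the embodiments of this application.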
此外,上述实施例中的单元也可以称为模块或者电路或者部件等。在本申请的实施例,图10的装置也可以是芯片或者芯片系统,例如:片上系统(system on chip,SoC)。对应地,收发单元可以是该芯片的收发电路,在此不做限定。
图11示出了本申请实施例提供的另一种模型训练的相关装置1100的示意性框图。该装置1100包括处理器1110和收发器1120。其中,处理器1110和收发器1120通过内部连接通路互相通信,该处理器1110用于执行指令,以控制该收发器1120发送信号和/或接收信号。
可选地,该装置1100还可以包括存储器1130,该存储器1130与处理器1110、收发器1120通过内部连接通路互相通信。该存储器1130用于存储指令,该处理器1110可以执行该存储器1130中存储的指令。装置1100用于实现上述方法实施例中的第一通信装置或者第二通信装置对应的各个流程和步骤。
装置1100可以具体为上述实施例中的第一通信装置或第二通信装置,也可以是芯片或者芯片系统。对应的,该收发器1120可以是该芯片的收发电路,在此不做限定。具体地,该装置1100可以用于执行上述方法实施例中与第一通信装置或第二通信装置对应的各个步骤和/或流程。可选地,该存储器1130可以包括只读存储器和随机存取存储器,并向处理器提供指令和数据。存储器的一部分还可以包括非易失性随机存取存储器。例如,存储器还可以存储设备类型的信息。该处理器1110可以用于执行存储 器中存储的指令,并且当该处理器1110执行存储器中存储的指令时,该处理器1110用于执行上述与第一通信装置或第二通信装置对应的方法实施例的各个步骤和/或流程。
在实现过程中,上述方法的各步骤可以通过处理器中的硬件的集成逻辑电路或者软件形式的指令完成。结合本申请实施例所公开的方法的步骤可以直接体现为硬件处理器执行完成,或者用处理器中的硬件及软件模块组合执行完成。软件模块可以位于随机存储器,闪存、只读存储器,可编程只读存储器或者电可擦写可编程存储器、寄存器等本领域成熟的存储介质中。该存储介质位于存储器,处理器读取存储器中的信息,结合其硬件完成上述方法的步骤。为避免重复,这里不再详细描述。
应注意,本申请实施例中的处理器可以是一种集成电路芯片,具有信号的处理能力。在实现过程中,上述方法实施例的各步骤可以通过处理器中的硬件的集成逻辑电路或者软件形式的指令完成。上述的处理器可以是通用处理器、数字信号处理器(DSP)、专用集成电路(ASIC)、现场可编程门阵列(FPGA)或者其他可编程逻辑器件、分立门或者晶体管逻辑器件、分立硬件组件。本申请实施例中的处理器可以实现或者执行本申请实施例中的公开的各方法、步骤及逻辑框图。通用处理器可以是微处理器或者该处理器也可以是任何常规的处理器等。结合本申请实施例所公开的方法的步骤可以直接体现为硬件译码处理器执行完成,或者用译码处理器中的硬件及软件模块组合执行完成。软件模块可以位于随机存储器,闪存、只读存储器,可编程只读存储器或者电可擦写可编程存储器、寄存器等本领域成熟的存储介质中。该存储介质位于存储器,处理器读取存储器中的信息,结合其硬件完成上述方法的步骤。
可以理解,本申请实施例中的存储器可以是易失性存储器或非易失性存储器,或可包括易失性和非易失性存储器两者。其中,非易失性存储器可以是只读存储器(read-only memory,ROM)、可编程只读存储器(programmable ROM,PROM)、可擦除可编程只读存储器(erasable PROM,EPROM)、电可擦除可编程只读存储器(electrically EPROM,EEPROM)或闪存。易失性存储器可以是随机存取存储器(random access memory,RAM),其用作外部高速缓存。通过示例性但不是限制性说明,许多形式的RAM可用,例如静态随机存取存储器(static RAM,SRAM)、动态随机存取存储器(dynamic RAM,DRAM)、同步动态随机存取存储器(synchronous DRAM,SDRAM)、双倍数据速率同步动态随机存取存储器(double data rate SDRAM,DDR SDRAM)、增强型同步动态随机存取存储器(enhanced SDRAM,ESDRAM)、同步连接动态随机存取存储器(synchlink DRAM,SLDRAM)和直接内存总线随机存取存储器(direct rambus RAM,DR RAM)。应注意,本文描述的系统和方法的存储器旨在包括但不限于这些和任意其它适合类型的存储器。
根据本申请实施例提供的方法,本申请还提供一种计算机程序产品,该计算机程序产品包括:计算机程序代码,当该计算机程序代码在计算机上运行时,使得该计算机执行上述实施例中所示的方法。
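According to the method provided in the embodiments of this application, this application further provides a computer-readable storage medium. The computer-readable storage medium stores program code, and when the program code is run on a computer, the computer is caused to perform the method shown in the foregoing embodiments.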
根据本申请实施例提供的方法,本申请还提供一种芯片,该芯片包括处理器,用于读取存储器中存储的指令,当该处理器执行所述指令时,使得该芯片实现上述实施例中所示的方法。
根据本申请实施例提供的方法,本申请提供一种计算机程序,当其在计算机上运行时,使得上述方法实施例中可能实现方式中的方法被执行。
本申请还提供一种通信系统,包括上述各个实施例中的第一通信装置和第二通信装置。
本领域普通技术人员可以意识到,结合本文中所公开的实施例描述的各示例的单元及算法步骤,能够以电子硬件、或者计算机软件和电子硬件的结合来实现。这些功能究竟以硬件还是软件方式来执行,取决于技术方案的特定应用和设计约束条件。专业技术人员可以对每个特定的应用来使用不同方法来实现所描述的功能,但是这种实现不应认为超出本申请的范围。
所属领域的技术人员可以清楚地了解到,为描述的方便和简洁,上述描述的系统、装置和单元的具体工作过程,可以参考前述方法实施例中的对应过程,在此不再赘述。
在本申请所提供的几个实施例中,应该理解到,所揭露的系统、装置和方法,可以通过其它的方式实现。例如,以上所描述的装置实施例仅仅是示意性的,例如,单元的划分,仅仅为一种逻辑功能划分,实际实现时可以有另外的划分方式,例如多个单元或组件可以结合或者可以集成到另一个系统,或一些特征可以忽略,或不执行。另一点,所显示或讨论的相互之间的耦合或直接耦合或通信连接可以是通过一些接口,装置或单元的间接耦合或通信连接,可以是电性,机械或其它的形式。
作为分离部件说明的单元可以是或者也可以不是物理上分开的,作为单元显示的部件可以是或者也可以不是物理单元,即可以位于一个地方,或者也可以分布到多个网络单元上。可以根据实际的需要选择其中的部分或者全部单元来实现本实施例方案的目的。
另外,在本申请各个实施例中的各功能单元可以集成在一个处理单元中,也可以是各个单元单独物理存在,也可以两个或两个以上单元集成在一个单元中。
功能如果以软件功能单元的形式实现并作为独立的产品销售或使用时,可以存储在一个计算机可读取存储介质中。基于这样的理解,本申请的技术方案本质上或者说对现有技术做出贡献的部分或者该技术方案的部分可以以软件产品的形式体现出来,该计算机软件产品存储在一个存储介质中,包括若干指令用以使得一台计算机设备(可以是个人计算机,服务器,或者网络设备等)执行本申请各个实施例方法的全部或部分步骤。而前述的存储介质包括:U盘、移动硬盘、只读存储器(read-only memory,ROM)、随机存取存储器(random access memory,RAM)、磁碟或者光盘等各种可以存储程序代码的介质。
以上,仅为本申请的具体实施方式,但本申请的保护范围并不局限于此,任何熟悉本技术领域的技术人员在本申请揭露的技术范围内,可轻易想到变化或替换,都应涵盖在本申请的保护范围之内。因此,本申请的保护范围应以权利要求的保护范围为准。

Claims (31)

  1. 一种模型训练方法,其特征在于,应用于包括第一通信装置和第二通信装置的通信系统,所述第一通信装置的个数为至少一个,所述第一通信装置部署有第一机器学习模型,所述方法包括:
    所述第一通信装置通过信道向所述第二通信装置发送第一数据,所述第一数据是第一训练数据输入至所述第一机器学习模型的输出结果,所述第一机器学习模型包括控制层,所述控制层为所述第一机器学习模型的至少一层;
    所述第一通信装置通过反馈信道接收第二损失函数,所述反馈信道是根据观测误差确定的,所述第二损失函数是所述第二通信装置发送的第一损失函数经过所述反馈信道传输后得到的;
    所述第一通信装置根据所述第二损失函数,基于卡尔曼滤波更新所述控制层的参数,得到更新后的所述控制层的参数,所述更新后的所述控制层的参数用于更新所述第一机器学习模型的参数。
  2. 根据权利要求1所述的方法,其特征在于,所述基于卡尔曼滤波更新所述控制层的参数,得到更新后的所述控制层的参数,包括:
    所述第一通信装置根据所述控制层的先验参数、所述第二损失函数和所述第二损失函数的误差协方差,得到卡尔曼增益;
    所述第一通信装置根据所述卡尔曼增益,更新所述控制层的参数,得到更新后的所述控制层的参数。
  3. 根据权利要求2所述的方法,其特征在于,所述方法还包括:
    所述第一通信装置根据所述更新后的所述控制层的参数和所述卡尔曼增益,基于反向梯度传播更新所述第一机器学习模型中第一网络层的参数,得到更新后的所述第一网络层的参数,所述第一网络层包括在所述控制层之前的网络层;
    所述第一通信装置根据所述更新后的所述控制层的参数和所述更新后的所述第一网络层的参数,得到更新后的所述第一机器学习模型。
  4. 根据权利要求3所述的方法,其特征在于,在所述得到更新后的所述第一机器学习模型之后,所述方法还包括:
    所述第一通信装置通过所述信道向所述第二通信装置发送第四数据,所述第四数据是第二训练数据输入至所述第一机器学习模型的输出结果;
    所述第一通信装置接收来自所述第二通信装置的指示信息,所述指示信息用于指示所述第一通信装置停止所述第一机器学习模型的训练;
    所述第一通信装置根据所述指示信息,停止所述第一机器学习模型的训练。
  5. 根据权利要求1至4中任一项所述的方法,其特征在于,所述第一数据包括N组数据,其中,N为正整数,且N的值是根据所述卡尔曼滤波的类型和所述控制层的参数的维度确定的。
  6. 根据权利要求1至4中任一项所述的方法,其特征在于,所述第一数据包括M组数据,其中,M为正整数,M的值是所述第一通信装置与其他第一通信装置根据预设规则确定的,M与所述其他第一通信装置所发送的数据的个数之和是根据所述卡尔曼滤波的类型和所述控制层的参数的维度确定的。
  7. 根据权利要求6所述的方法,其特征在于,所述方法还包括:
    所述第一通信装置向所述通信系统中的其他第一通信装置发送所述更新后的所述控制层的参数。
  8. 根据权利要求1至7中任一项所述的方法,其特征在于,在所述第一通信装置通过所述信道接收第二损失函数之后,所述方法还包括:
    所述第一通信装置根据第一时间段内接收到的多个损失函数的方差,判断所述信道在所述第一时间段内的非线性程度,所述多个损失函数包括所述第二损失函数;
    所述第一通信装置根据所述信道在所述第一时间段内的非线性程度,确定所述卡尔曼滤波的类型。
  9. 根据权利要求8所述的方法,其特征在于,所述第二损失函数的方差大于或等于第一阈值,所述信道在所述第一时间段内的非线性程度为强非线性;或者,
    所述第二损失函数的方差小于所述第一阈值,所述信道在所述第一时间段内的非线性程度为弱非线性。
  10. 根据权利要求8或9所述的方法,其特征在于,所述信道在所述第一时间段内的非线性程度为强非线性,所述卡尔曼滤波的类型为容积卡尔曼滤波;或者,
    所述信道在所述第一时间段内的非线性程度为弱非线性,所述卡尔曼滤波的类型为扩展卡尔曼滤波。
  11. 一种模型训练方法,其特征在于,应用于包括第一通信装置和第二通信装置的通信系统,所述第一通信装置的个数为至少一个,所述第一通信装置部署有第一机器学习模型,所述第二通信装置部署有第二机器学习模型,所述方法包括:
    所述第二通信装置通过信道接收第二数据,所述第二数据是所述第一通信装置发送的第一数据经过所述信道传输后得到的,所述第一数据是第一训练数据输入至所述第一机器学习模型的输出结果,所述第一机器学习模型包括控制层,所述控制层为所述第一机器学习模型的至少一层;
    所述第二通信装置将所述第二数据输入至所述第二机器学习模型,得到第三数据;
    所述第二通信装置根据所述第三数据和所述第一训练数据,确定第一损失函数;
    所述第二通信装置通过反馈信道向所述第一通信装置发送所述第一损失函数,所述反馈信道是根据观测误差确定的,所述第一损失函数用于更新所述第一机器学习模型的控制层的参数。
  12. 根据权利要求11所述的方法,其特征在于,所述方法还包括:
    所述第二通信装置根据所述第一损失函数,基于反向梯度传播更新所述第二机器学习模型的参数,得到更新后的所述第二机器学习模型。
  13. 根据权利要求12所述的方法,其特征在于,所述方法还包括:
    所述第二通信装置通过所述信道接收第五数据,所述第五数据是所述第一通信装置发送的第四数据经过所述信道传输后得到的,所述第四数据是第二训练数据输入至所述第一机器学习模型的输出结果;
    所述第二通信装置将所述第五数据输入至所述第二机器学习模型,得到第六数据;
    所述第二通信装置根据所述第六数据和所述第二训练数据,确定第三损失函数;
    若所述第三损失函数低于预设阈值,所述第二通信装置向所述第一通信装置发送 指示信息,所述指示信息用于指示所述第一通信装置停止所述第一机器学习模型的训练。
  14. 一种模型训练的相关装置,其特征在于,包括:
    收发单元,用于通过信道向第二通信装置发送第一数据,所述第一数据是第一训练数据输入至第一机器学习模型的输出结果,所述第一机器学习模型包括控制层,所述控制层为所述第一机器学习模型的至少一层;
    所述收发单元,还用于通过反馈信道接收第二损失函数,所述反馈信道是根据观测误差确定的,所述第二损失函数是所述第二通信装置发送的第一损失函数经过所述反馈信道传输后得到的;
    处理单元,用于根据所述第二损失函数,基于卡尔曼滤波更新所述控制层的参数,得到更新后的所述控制层的参数,所述更新后的所述控制层的参数用于更新所述第一机器学习模型的参数。
  15. 根据权利要求14所述的装置,其特征在于,所述处理单元还用于:
    根据所述更新后的所述控制层的参数和卡尔曼增益,基于反向梯度传播更新所述第一机器学习模型中第一网络层的参数,得到更新后的所述第一网络层的参数,所述第一网络层的参数包括在控制层之前的网络层的参数,所述卡尔曼增益是根据所述控制层的先验参数、所述第二损失函数和所述第二损失函数的误差协方差得到的;
    根据所述更新后的所述控制层的参数和所述更新后的所述第一网络层的参数,得到更新后的所述第一机器学习模型。
  16. 根据权利要求15所述的装置,其特征在于,所述收发单元还用于:
    通过所述信道向所述第二通信装置发送第四数据,所述第四数据是第二训练数据输入至所述第一机器学习模型的输出结果;
    接收来自所述第二通信装置的指示信息,所述指示信息用于指示停止所述第一机器学习模型的训练;
    所述处理单元还用于:
    根据所述指示信息,停止所述第一机器学习模型的训练。
  17. 根据权利要求14至16中任一项所述的装置,其特征在于,所述第一数据包括N组数据,其中,N为正整数,且N的值是根据所述卡尔曼滤波的类型和所述控制层的参数的维度确定的。
  18. 根据权利要求14至16中任一项所述的装置,其特征在于,所述第一数据包括M组数据,其中,M为正整数,M的值是所述装置与其他第一通信装置根据预设规则确定的,M与所述其他第一通信装置所发送的数据的个数之和是根据所述卡尔曼滤波的类型和所述控制层的参数的维度确定的。
  19. 根据权利要求18所述的装置,其特征在于,所述收发单元还用于:
    向所述其他第一通信装置发送所述更新后的所述控制层的参数。
  20. 根据权利要求14至19中任一项所述的装置,其特征在于,所述收发单元还用于:
    根据第一时间段内接收到的多个损失函数的方差,判断所述信道在所述第一时间段内的非线性程度,所述多个损失函数包括所述第二损失函数;
    所述处理单元还用于:
    根据所述信道在所述第一时间段内的非线性程度,确定所述卡尔曼滤波的类型。
  21. 根据权利要求20所述的装置,其特征在于,所述第二损失函数的方差大于或等于第一阈值,所述信道在所述第一时间段内的非线性程度为强非线性;或者,
    所述第二损失函数的方差小于所述第一阈值,所述信道在所述第一时间段内的非线性程度为弱非线性。
  22. 根据权利要求19或21所述的装置,其特征在于,所述信道在所述第一时间段内的非线性程度为强非线性,所述卡尔曼滤波的类型为容积卡尔曼滤波;或者,
    所述信道在所述第一时间段内的非线性程度为弱非线性,所述卡尔曼滤波的类型为扩展卡尔曼滤波。
  23. 一种模型训练的相关装置,其特征在于,包括:
    收发单元,用于通过信道接收第二数据,所述第二数据是第一通信装置发送的第一数据经过所述信道传输后得到的,所述第一数据是第一训练数据输入至第一机器学习模型的输出结果,所述第一机器学习模型包括控制层,所述控制层为所述第一机器学习模型的至少一层;
    处理单元,用于将所述第二数据输入至第二机器学习模型,得到第三数据;根据所述第三数据和所述第一训练数据,确定第一损失函数;
    所述收发单元还用于:通过反馈信道向所述第一通信装置发送所述第一损失函数,所述反馈信道是根据观测误差确定的,所述第一损失函数用于更新所述第一机器学习模型的控制层的参数。
  24. 根据权利要求23所述的装置,其特征在于,所述处理单元还用于:
    根据所述第一损失函数,基于反向梯度传播更新所述第二机器学习模型的参数,得到更新后的所述第二机器学习模型。
  25. 根据权利要求24所述的装置,其特征在于,所述收发单元还用于:
    通过所述信道接收第五数据,所述第五数据是所述第一通信装置发送的第四数据经过所述信道传输后得到的,所述第四数据是第二训练数据输入至所述第一机器学习模型的输出结果;
    所述处理单元还用于:
    将所述第五数据输入至所述第二机器学习模型,得到第六数据;
    根据所述第六数据和所述第二训练数据,确定第三损失函数;
    所述收发单元还用于:
    若所述第三损失函数低于预设阈值,向所述第一通信装置发送指示信息,所述指示信息用于指示所述第一通信装置停止所述第一机器学习模型的训练。
  26. 一种通信装置,其特征在于,包括:处理器和收发器,所述收发器用于和其它装置通信,所述处理器与存储器耦合,所述存储器用于存储计算机程序,当所述处理器调用所述计算机程序时,使得所述装置执行权利要求1至10中任一项所述的方法或者权利要求11至13中任一项所述的方法。
  27. 一种芯片系统,其特征在于,包括:处理器,用于从存储器中调用并运行计算机程序,使得安装有所述芯片系统的通信设备执行权利要求1至10中任一项所述的 方法或者权利要求11至13中任一项所述的方法。
  28. 一种计算机可读存储介质,其特征在于,所述计算机可读存储介质上存储有计算机程序,当所述计算机程序在计算机上运行时,使得权利要求1至10中任一项所述的方法或者权利要求11至13中任一项所述的方法被执行。
  29. 一种计算机程序产品,其特征在于,所述计算机程序产品包括指令,当所述指令被执行时,使得权利要求1至10中任一项所述的方法或者权利要求11至13中任一项所述的方法被执行。
  30. 一种计算机程序,其特征在于,当所述计算机程序在计算机上运行时,使得权利要求1至10中任一项所述的方法或者权利要求11至13中任一项所述的方法被执行。
  31. 一种通信系统,其特征在于,包括:权利要求14至22中任一项所述的模型训练的相关装置和权利要求23至25中任一项所述的模型训练的相关装置。
PCT/CN2022/103985 2021-07-09 2022-07-05 模型训练方法及相关装置 WO2023280176A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
EP22836925.2A EP4358446A4 (en) 2021-07-09 2022-07-05 MODEL LEARNING METHOD AND RELATED APPARATUS
US18/405,019 US20240152766A1 (en) 2021-07-09 2024-01-05 Model training method and related apparatus

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110780949.8 2021-07-09
CN202110780949.8A CN115603859A (zh) 2021-07-09 2021-07-09 模型训练方法及相关装置

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US18/405,019 Continuation US20240152766A1 (en) 2021-07-09 2024-01-05 Model training method and related apparatus

Publications (1)

Publication Number Publication Date
WO2023280176A1 true WO2023280176A1 (zh) 2023-01-12

Family

ID=84801296

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/103985 WO2023280176A1 (zh) 2021-07-09 2022-07-05 模型训练方法及相关装置

Country Status (4)

Country Link
US (1) US20240152766A1 (zh)
EP (1) EP4358446A4 (zh)
CN (1) CN115603859A (zh)
WO (1) WO2023280176A1 (zh)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180308013A1 (en) * 2017-04-24 2018-10-25 Virginia Tech Intellectual Properties, Inc. Radio signal identification, identification system learning, and identifier deployment
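CN110474716A (zh) * 2019-08-14 2019-11-19 Anhui University Method for establishing an SCMA codec model based on a denoising autoencoder
US20190392302A1 (en) * 2018-06-20 2019-12-26 Disney Enterprises, Inc. Efficient encoding and decoding sequences using variational autoencoders
CN111224677A (zh) * 2018-11-27 2020-06-02 Huawei Technologies Co., Ltd. Encoding method, decoding method, and apparatus
CN111327367A (zh) * 2018-12-14 2020-06-23 Nokia Shanghai Bell Co., Ltd. Optical transmitter, method, and storage medium in an optical network
CN111434049A (zh) * 2017-06-19 2020-07-17 Virginia Tech Intellectual Properties, Inc. Encoding and decoding of information for wireless transmission using multi-antenna transceivers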

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
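CN116704296A (zh) * 2023-08-04 2023-09-05 Inspur Electronic Information Industry Co., Ltd. Image processing method, apparatus, system, device, and computer storage medium
CN116704296B (zh) * 2023-08-04 2023-11-03 Inspur Electronic Information Industry Co., Ltd. Image processing method, apparatus, system, device, and computer storage medium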

Also Published As

Publication number Publication date
US20240152766A1 (en) 2024-05-09
CN115603859A (zh) 2023-01-13
EP4358446A1 (en) 2024-04-24
EP4358446A4 (en) 2024-10-23

Similar Documents

Publication Publication Date Title
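CN112054863B (zh) Communication method and apparatus
WO2023280176A1 (zh) Model training method and related apparatus
CN104168573A (zh) Interference cancellation method based on clustered interference alignment in Femtocell networks
WO2022001822A1 (zh) Method and apparatus for obtaining a neural network
WO2023125660A1 (zh) Communication method and apparatus
WO2022206328A1 (zh) Communication cooperation method and apparatus
CN111971915B (zh) Method for transmitting a sounding reference signal and terminal device
CN112492637A (zh) Method and apparatus for cell traffic prediction
CN112888076B (zh) Scheduling method and apparatus
CN116982300A (zh) Signal processing method and receiver
CN114513279A (zh) Data transmission method and apparatus
CN115987727B (zh) Signal transmission method and apparatus
WO2022100514A1 (zh) Decision-making method and decision-making apparatus
WO2023097645A1 (zh) Data acquisition method, apparatus, device, medium, chip, product, and program
WO2023137641A1 (zh) Channel estimation method, method for training a channel estimation model, and communication device
EP4319070A1 (en) Artificial intelligence-based channel estimation method and apparatus
Zeyde et al. Confidential communication in C-RAN systems with infrastructure sharing
WO2023060503A1 (zh) Information processing method, apparatus, device, medium, chip, product, and program
WO2024017301A1 (zh) Communication method and apparatus
CN117813801A (zh) Communication method, model training method, and device
EP4322065A1 (en) Gradient transmission method and related apparatus
Singhal et al. Joint uplink-downlink cell associations for interference networks with local connectivity
WO2024130713A1 (zh) Signal processing method and communication apparatus
WO2023155650A1 (zh) Encoding method and encoding apparatus based on systematic polar codes
WO2024183610A1 (zh) Communication method and communication apparatus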

Legal Events

Date Code Title Description
121 EP: The EPO has been informed by WIPO that EP was designated in this application (Ref document number: 22836925; Country of ref document: EP; Kind code of ref document: A1)

WWE WIPO information: entry into national phase (Ref document number: 2022836925; Country of ref document: EP)

ENP Entry into the national phase (Ref document number: 2022836925; Country of ref document: EP; Effective date: 20240116)

NENP Non-entry into the national phase (Ref country code: DE)