WO2022104799A1 - Training method, training apparatus and storage medium


Info

Publication number
WO2022104799A1
Authority
WO
WIPO (PCT)
Prior art keywords
model
training
compression
node
mode
Prior art date
Application number
PCT/CN2020/130896
Other languages
English (en)
Chinese (zh)
Inventor
牟勤
洪伟
赵中原
王屹东
熊可欣
Original Assignee
北京小米移动软件有限公司
北京邮电大学
Priority date
Filing date
Publication date
Application filed by 北京小米移动软件有限公司 (Beijing Xiaomi Mobile Software Co., Ltd.) and 北京邮电大学 (Beijing University of Posts and Telecommunications)
Priority to CN202080003605.XA (published as CN114793453A)
Priority to PCT/CN2020/130896
Publication of WO2022104799A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models

Definitions

  • the present disclosure relates to the field of wireless communication technologies, and in particular, to a training method, a training device and a storage medium.
  • the communication network has the characteristics of ultra-high speed, ultra-low latency, ultra-high reliability, and ultra-multiple connections.
  • artificial intelligence is introduced to improve the resource utilization of communication networks, the terminal service experience, and the automated and intelligent control and management of communication networks, and models obtained through artificial-intelligence deep learning can have better performance.
  • however, the high storage space and computing resource consumption of such a model makes it difficult to apply effectively on various hardware platforms; moreover, the communication overhead is large, the precision is limited, and the security is low.
  • the present disclosure provides a training method, a training device and a storage medium.
  • a training method applied to a first node, the method includes:
  • In response to receiving a model training request, a first training model is trained, wherein the model training request includes model compression parameters; based on the first training model and the model compression parameters, a first compression model of the first training model is obtained.
  • the model compression parameters include a plurality of model compression options
  • obtaining the first compression model of the first training model based on the first training model and the model compression parameters includes: determining a first model compression option from among the multiple model compression options, and compressing the first training model based on the first model compression option to obtain a second compression model; determining a first loss function according to the output of the first training model, the output of the second compression model, and the sample parameter set used to train the first training model; and updating the parameters of the second compression model based on the first loss function to obtain the first compression model.
  • determining the first loss function according to the output of the first training model, the output of the second compression model, and the sample parameter set used to train the first training model includes:
  • the first loss function is determined using the first cross entropy and the first relative entropy divergence.
  • the method further includes:
  • a second loss function for updating parameters of the first training model is determined according to the output of the first training model, the output of the second compression model, and the sample parameter set used to train the first training model.
  • determining, according to the output of the first training model, the output of the second compression model, and the sample parameter set used for training the first training model, the second loss function for updating the parameters of the first training model includes:
  • the second loss function is determined using the second cross entropy and the second relative entropy divergence.
  • the model compression parameters include a model training mode, where the model training mode includes a single training node mode for training a single first training model and a multi-training node mode for training a plurality of the first training models;
  • the number of the first training models is determined based on the model training mode.
  • the method further includes:
  • a second indication message is sent, where the second indication message includes the number of first compression models corresponding to the model training mode.
  • the method further includes:
  • a third indication message is received, where the third indication message includes an indication of determining the training model.
  • the model training mode includes a multi-training node mode
  • the method further includes:
  • a fourth indication message is received; the fourth indication message is used to indicate a third compression model, where the third compression model is a compression model obtained by performing federated averaging on the first compression models based on the number of the first compression models; based on the third compression model, the model compression parameters are re-determined, and the first compression model is updated based on the re-determined model compression parameters.
  • the method further includes:
  • a fifth indication message is received, where the fifth indication message is used to indicate the end of training the first compression model.
  • a training method applied to a second node comprising:
  • the model training request includes model compression parameters, and the model compression parameters are used to compress a first training model to obtain a first compression model, and the first training model is obtained by training based on the model training request.
  • the model compression parameters include a model training mode, where the model training mode includes a single training node mode for training a single first training model and a multi-training node mode for training a plurality of the first training models;
  • the number of the first training models is determined based on the model training mode.
  • the method further includes:
  • a second indication message is received, where the second indication message includes the number of first compression models corresponding to the model training mode.
  • the method further includes:
  • a third indication message is sent, where the third indication message includes an indication of determining the training model.
  • the model training mode includes a multi-training node mode
  • the method further includes:
  • a fourth indication message is sent; the fourth indication message is used to indicate a third compression model, where the third compression model is a compression model obtained by performing federated averaging on the first compression models based on the number of the first compression models.
  • the method further includes:
  • a fifth indication message is sent, where the fifth indication message is used to indicate the end of training the first compression model.
  • the method further includes:
  • a subscription requirement is received, and a model training request is sent based on the subscription requirement.
  • a training apparatus applied to a first node comprising:
  • a model training and compression module configured to train a first training model in response to receiving a model training request, wherein the model training request includes model compression parameters; and to obtain, based on the first training model and the model compression parameters, a first compression model of the first training model.
  • the model compression parameters include a plurality of model compression options
  • the model training and compression module is configured to determine a first model compression option from among the multiple model compression options, and compress the first training model based on the first model compression option to obtain a second compression model; determine the first loss function according to the output of the first training model, the output of the second compression model, and the sample parameter set used to train the first training model; and update the parameters of the second compression model based on the first loss function to obtain the first compression model.
  • the apparatus further includes a data processing and storage module
  • the data processing and storage module is used to determine the first cross entropy between the output of the second compression model and the sample parameter set, determine the first relative entropy divergence between the output of the second compression model and the output of the first training model, and determine the first loss function based on the first cross entropy and the first relative entropy divergence.
  • the data processing and storage module is further configured to determine, according to the output of the first training model, the output of the second compression model, and the sample parameter set used for training the first training model, a second loss function for updating the parameters of the first training model.
  • the data processing and storage module is further configured to determine a second cross entropy between the output of the first training model and the sample parameter set, and determine a second relative entropy divergence between the output of the first training model and the output of the second compression model; the second loss function is determined based on the second cross entropy and the second relative entropy divergence.
  • the model compression parameters include a model training mode, where the model training mode includes a single training node mode for training a single first training model and a multi-training node mode for training a plurality of the first training models; the number of the first training models is determined based on the model training mode.
  • the apparatus further includes a first network communication module;
  • the first network communication module is configured to send a second indication message, where the second indication message includes the number of first compression models corresponding to the model training mode.
  • the first network communication module is further configured to receive a third indication message, where the third indication message includes an indication of determining a training model.
  • the first network communication module is further configured to receive a fourth indication message; the fourth indication message is used to indicate a third compression model, where the third compression model is a compression model obtained by performing federated averaging on the first compression models based on the number of the first compression models; based on the third compression model, the model compression parameters are re-determined, and the first compression model is updated based on the re-determined model compression parameters.
  • the first network communication module is further configured to receive a fifth indication message, where the fifth indication message is used to indicate the end of training the first compression model.
  • a training apparatus applied to a second node comprising:
  • the second network communication module is used to send a model training request, wherein the model training request includes model compression parameters, the model compression parameters are used to compress the first training model to obtain a first compression model, and the first training model is obtained by training based on the model training request.
  • the model compression parameters include a model training mode, where the model training mode includes a single training node mode for training a single first training model and a multi-training node mode for training a plurality of the first training models;
  • the number of the first training models is determined based on the model training mode.
  • the second network communication module is further configured to receive a second indication message, where the second indication message includes the number of first compression models corresponding to the model training mode.
  • the second network communication module is further configured to send a third indication message, where the third indication message includes an indication for determining the training model.
  • the model training mode includes a multi-training node mode
  • the second network communication module is further configured to send a fourth indication message; the fourth indication message is used to indicate a third compression model, where the third compression model is a compression model obtained by performing federated averaging on the first compression models based on the number of the first compression models.
  • the second network communication module is further configured to send a fifth indication message, where the fifth indication message is used to indicate the end of training the first compression model.
  • the apparatus further includes a service management module
  • the service management module is configured to receive subscription requirements and send a model training request based on the subscription requirements.
  • a training device comprising:
  • a processor configured to execute the training method described in the first aspect or any one of the implementation manners of the first aspect, or to execute the training method described in the second aspect or any one of the implementation manners of the second aspect.
  • a non-transitory computer-readable storage medium, wherein when instructions in the storage medium are executed by a processor of a mobile terminal, the mobile terminal is enabled to execute the training method described in the first aspect or any one of the embodiments of the first aspect, or the training method described in the second aspect or any one of the embodiments of the second aspect.
  • the technical solutions provided by the embodiments of the present disclosure may include the following beneficial effects: in the present disclosure, the trained model is compressed and the parameters of the compressed model are updated, so that the compressed model can achieve the same effect as the training model, thereby reducing the signaling overhead of transmitting the model, ensuring the accuracy and reliability of the model, and further ensuring the security of user information.
  • FIG. 1 is a schematic diagram of a system architecture of a training method provided by the present disclosure.
  • Fig. 2 is a flowchart of a training method according to an exemplary embodiment.
  • Fig. 3 is a flowchart of another training method according to an exemplary embodiment.
  • Fig. 4 is a flowchart of yet another training method according to an exemplary embodiment.
  • Fig. 5 is a flowchart of yet another training method according to an exemplary embodiment.
  • FIG. 6 is a flowchart of an implementation manner of determining a first compression model in a single training node mode in a training method provided by the present disclosure.
  • FIG. 7 is a flowchart of an implementation manner of determining a first compression model in a multi-training node mode in a training method provided by the present disclosure.
  • FIG. 8 is a schematic diagram of the protocol and interface of the model training and compression decision part in a training method provided by the present disclosure.
  • FIG. 9 is a schematic diagram of the protocol and interface of the model training and compression part in a single training node mode in a training method provided by the present disclosure.
  • FIG. 10 is a schematic diagram of a protocol and an interface of a model training and compression part in a multi-training node mode in a training method provided by the present disclosure.
  • FIG. 11 is a schematic diagram of a protocol and interface of a wireless data transmission part in a training method provided by the present disclosure.
  • Fig. 12 is a block diagram of a training apparatus according to an exemplary embodiment.
  • Fig. 13 is a block diagram of another training apparatus according to an exemplary embodiment.
  • Fig. 14 is a block diagram of an apparatus for training according to an exemplary embodiment.
  • Fig. 15 is a block diagram of another apparatus for training according to an exemplary embodiment.
  • the communication network has the characteristics of ultra-high speed, ultra-low latency, ultra-high reliability, and ultra-multiple connections.
  • the implementation process of using the deep learning algorithm when training the model includes: the model request node determines the model structure and the model training mode according to the model/analysis subscription requirements, wherein the model training mode includes a single training node mode and a multi-training node mode.
  • the model request node sends the model structure and model training mode to the model training node, and the model training node independently conducts model training according to the model training mode or participates in the collaborative model training of multiple training nodes.
  • the model training node sends the model to the model request node, and in the multi-training node mode the model request node performs federated averaging of the models sent by the model training nodes to obtain a global model.
  • the model request node checks whether the obtained model meets the model/analysis subscription requirements, and if so, the model request node sends the obtained model to the model/analysis party. If not, repeat the above model training process until the model obtained by the model request node meets the model/analysis subscription requirements.
  • the data volume of the model is relatively large, especially in the multi-training node mode, the model needs to perform multiple transmissions between the model training node and the model requesting node, which greatly increases the communication overhead.
  • the present disclosure provides a training method to solve the problems of high communication overhead, insufficient model accuracy, and risks to the security of terminal private data.
  • the training method provided by the present disclosure determines the model structure and model training mode according to network service requirements (such as model subscription requirements), fully considers factors such as the locally available computing power, communication conditions, and training sample characteristics of the model training node, and formulates multiple model compression options, so as to reduce unnecessary communication overhead, improve wireless network resource utilization, and apply deep learning to network intelligence in a more efficient and secure way.
  • FIG. 1 is a schematic diagram of a system architecture of a training method provided by the present disclosure.
  • the system includes a core network part and a radio access network part.
  • the terminal (user) accesses the base station through a wireless channel, the base stations are connected through the Xn interface, the base station accesses the User Plane Function (UPF) network element of the core network through the N3 interface, and the UPF network element accesses the Session Management Function (SMF) network element through the N4 interface.
  • the SMF network element is connected to the bus structure of the core network and communicates with other Network Functions (NF) of the core network.
  • the communication system between the network device and the terminal shown in FIG. 1 is only a schematic illustration, and the wireless communication system may also include other network devices, for example, a wireless relay device and a wireless backhaul device, which are not shown in FIG. 1.
  • the embodiments of the present disclosure do not limit the number of network devices and the number of terminals included in the wireless communication system.
  • the wireless communication system is a network that provides a wireless communication function.
  • Wireless communication systems can use different communication technologies, such as code division multiple access (CDMA), wideband code division multiple access (WCDMA), time division multiple access (TDMA), frequency division multiple access (FDMA), orthogonal frequency division multiple access (OFDMA), single carrier frequency division multiple access (SC-FDMA), and carrier sense multiple access with collision avoidance (CSMA/CA).
  • networks can be divided into 2G (generation) networks, 3G networks, 4G networks, and future evolution networks such as 5G networks; a 5G network can also be called a New Radio (NR) network.
  • the present disclosure will sometimes refer to a wireless communication network simply as a network.
  • the wireless access network equipment may be: a base station, an evolved NodeB (eNB), a home base station, an access point (AP) in a wireless fidelity (WiFi) system, a wireless relay node, a wireless backhaul node, a transmission point (TP) or a transmission and reception point (TRP), etc.; it may also be a gNB in an NR system, or a component or part of a device that constitutes a base station.
  • the network device may also be an in-vehicle device. It should be understood that, in the embodiments of the present disclosure, the specific technology and specific device form adopted by the network device are not limited.
  • the terminal involved in the present disclosure may also be referred to as terminal equipment, user equipment (User Equipment, UE), mobile station (Mobile Station, MS), mobile terminal (Mobile Terminal, MT), etc.
  • a device that provides voice and/or data connectivity; for example, a terminal may be a handheld device with a wireless connection function, a vehicle-mounted device, or the like.
  • some examples of terminals are: a smartphone (mobile phone), a pocket personal computer (PPC), a personal digital assistant (PDA), a notebook computer, a tablet computer, a wearable device, or a vehicle-mounted device, etc.
  • the terminal device may also be an in-vehicle device. It should be understood that the embodiments of the present disclosure do not limit the specific technology and specific device form adopted by the terminal.
  • Fig. 2 is a flow chart of a training method according to an exemplary embodiment. As shown in Figure 2, the training method is used in the first node and includes the following steps.
  • step S11 in response to receiving a model training request, a first training model is trained.
  • the first node is a model training node, and the second node is a model request node; for convenience of description, the present disclosure refers to the model training node as the first node and the model request node as the second node.
  • the model training request includes model compression parameters.
  • the model compression parameters include at least one of the following:
  • a model training structure, multiple model compression options, and a model training mode.
  • the model compression option is determined based on the model subscription requirement received by the second node (eg, the model requesting node).
  • the second node determines to send the model training request according to the received model subscription requirement.
  • the first node (for example, the model training node) sends response information for the model training request; the first training model is trained based on the local sample parameter set and the model training structure, and the relevant parameters required for model compression are determined.
  • the response information for the model training request sent by the first node further includes one or more of the local computing capability of the first node, its communication conditions, and the characteristics of the training sample parameter set.
  • step S12 a first compression model of the first training model is obtained based on the first training model and the model compression parameters.
  • the first node compresses the first training model based on the model compression option in the model compression parameters and the relevant parameters required for model compression.
  • the relevant parameters required for model compression are determined by the first node based on parameters such as model compression parameters and local computing capabilities sent by the second node, and the model compression options include model accuracy and model parameter data volume.
  • the model compression parameters include multiple model compression options, and the multiple model compression options are determined by the second node based on one or more of local computing capabilities, communication conditions, and training sample parameter set characteristics reported by multiple first nodes.
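  • As a rough illustration of how a first node might later choose among such options, the following Python sketch pairs each option's model accuracy with its parameter data volume and picks the most accurate option that fits a communication budget. The field names, the budget rule, and the helper names are assumptions for illustration only; the disclosure does not specify a selection rule.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class ModelCompressionOption:
    # Each option pairs a target model accuracy with a model parameter
    # data volume, per the text; field names are illustrative.
    accuracy: float           # expected model accuracy under this option
    param_volume_kbit: float  # parameter data volume after compression

def select_option(options: List[ModelCompressionOption],
                  uplink_rate_kbps: float,
                  upload_budget_s: float) -> ModelCompressionOption:
    """Hypothetical rule: the most accurate option whose compressed model
    can be uploaded within the node's communication budget; fall back to
    the smallest option if none fits."""
    budget_kbit = uplink_rate_kbps * upload_budget_s
    feasible = [o for o in options if o.param_volume_kbit <= budget_kbit]
    if feasible:
        return max(feasible, key=lambda o: o.accuracy)
    return min(options, key=lambda o: o.param_volume_kbit)

# Usage: three options offered by the second node; a 100 kbps uplink and a
# 10 s upload budget allow only the 500 kbit option.
options = [ModelCompressionOption(0.95, 4000.0),
           ModelCompressionOption(0.92, 1500.0),
           ModelCompressionOption(0.88, 500.0)]
print(select_option(options, uplink_rate_kbps=100.0, upload_budget_s=10.0))
```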
  • Fig. 3 is a flowchart of a training method according to an exemplary embodiment. As shown in FIG. 3 , based on the first training model and the model compression parameters, a compression model of the first training model is obtained, including the following steps.
  • step S21 a first model compression option is determined among the multiple model compression options, and the first training model is compressed based on the first model compression option to obtain a second compression model.
  • the first node determines, from among the multiple model compression options, a first model compression option for model compression according to one or more of its local computing capability, communication conditions, and training samples.
  • the first training model is compressed according to the requirement on the model parameter data volume in the model compression option to obtain the second compression model, and the symbol θS is used to denote the second compression model.
  • the following implementations may be used to compress the first training model by using the matrix g and the first model compression option:
  • the first node takes the amount of model parameter data as a constraint, and designs a pruning matrix X to retain the channels that contribute more to the accuracy of the model.
  • the first node takes the sum of the elements of each column of the pruning matrix X as the unknown item, and according to the size of the elements in each column of the matrix g, retains the channels corresponding to the items with the largest column elements in the matrix g, and prunes the other channels.
  • after the pruning matrix X is obtained, X is used to prune θ to obtain the second compression model θS.
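  • The pruning step above can be sketched in a few lines of NumPy. This is a minimal sketch under assumptions the text does not pin down: g is treated as a per-channel contribution matrix whose column magnitudes rank the channels, and the pruning matrix X is realized as a diagonal 0/1 selection matrix; build_pruning_mask and prune are hypothetical names.

```python
import numpy as np

def build_pruning_mask(g: np.ndarray, keep: int) -> np.ndarray:
    """Build the pruning matrix X from the contribution matrix g,
    keeping only the `keep` channels whose column elements are largest
    (all other channels are pruned)."""
    channel_scores = np.abs(g).sum(axis=0)      # score each channel (column of g)
    kept = np.argsort(channel_scores)[-keep:]   # channels with the largest scores
    mask = np.zeros(g.shape[1])
    mask[kept] = 1.0
    return np.diag(mask)                        # X as a diagonal 0/1 selection matrix

def prune(theta: np.ndarray, X: np.ndarray) -> np.ndarray:
    """Apply X to the parameters theta of the first training model,
    zeroing pruned channels to yield the second compression model."""
    return theta @ X

# Usage: keep the 16 highest-contribution channels of a 64-channel layer.
rng = np.random.default_rng(0)
g = rng.normal(size=(128, 64))     # stand-in for the contribution matrix g
theta = rng.normal(size=(32, 64))  # stand-in for one layer of the training model
theta_S = prune(theta, build_pruning_mask(g, keep=16))
```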
  • the first node selects an appropriate model compression option, compresses the training model according to the model compression option, and then transmits it to the second node; while retaining most of the accuracy of the deep learning model, the data volume of the training model is compressed as much as possible.
  • This method realizes model compression according to the communication rate requirements of the model training node, which greatly reduces the communication overhead of the model uplink transmission.
  • step S22 a first loss function is determined according to the output of the first training model, the output of the second compression model, and the sample parameter set used for training the first training model.
  • the sample training parameter set further includes a sample verification parameter set, and at least one input-output data pair of the sample verification parameter set is determined.
  • the first node inputs the input-output data pairs of the sample verification parameter set into the first training model and the second compression model, and determines the output of the first training model, the output of the second compression model, and the corresponding true value of the sample verification parameter set; the true value is the value corresponding to the model input.
  • the first node determines the first cross entropy between the output of the second compression model and the true value, and the first relative entropy divergence between the output of the second compression model and the output of the first training model; the sum of the first cross entropy and the first relative entropy divergence is determined as the loss function of the second compression model.
  • for convenience of distinction, the present disclosure refers to the loss function of the second compression model as the first loss function.
  • a plurality of first loss functions are determined based on a plurality of input-output data pairs in the sample parameter set, the average value of the plurality of first loss functions is determined, and the parameters of the second compression model are updated by gradient descent according to the average value of the plurality of first loss functions, to obtain the first compression model.
  • the first loss function (that is, the loss function of the second compression model) is expressed by the following formula:
  • $L_{\theta_S} = L_C(p_S, y) + D_{KL}(p_S \| p_1)$, where $L_{\theta_S}$ is the loss function of the second compression model; $L_C(p_S, y)$ is the first cross entropy between the output value $p_S$ of the second compression model and the true value $y$ of the input-output data pairs of the sample verification parameter set; and $D_{KL}(p_S \| p_1)$ is the first relative entropy divergence between the output value of the second compression model and the output value $p_1$ of the first training model.
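  • A minimal NumPy sketch of this first loss function, assuming the outputs p_S and p_1 are probability distributions over classes and y is a one-hot true value from the sample verification parameter set:

```python
import numpy as np

def cross_entropy(p: np.ndarray, y: np.ndarray) -> float:
    """Cross entropy L_C(p, y) between a model output distribution p
    and the one-hot true value y of a verification data pair."""
    return float(-(y * np.log(p + 1e-12)).sum())

def kl_divergence(p: np.ndarray, q: np.ndarray) -> float:
    """Relative entropy D_KL(p || q) between two output distributions."""
    return float((p * np.log((p + 1e-12) / (q + 1e-12))).sum())

def first_loss(p_S: np.ndarray, p_1: np.ndarray, y: np.ndarray) -> float:
    """First loss function: L_C(p_S, y) + D_KL(p_S || p_1)."""
    return cross_entropy(p_S, y) + kl_divergence(p_S, p_1)

# Toy 3-class example: outputs of the second compression model (p_S),
# the first training model (p_1), and the true value y.
p_S = np.array([0.2, 0.7, 0.1])
p_1 = np.array([0.1, 0.8, 0.1])
y = np.array([0.0, 1.0, 0.0])
print(first_loss(p_S, p_1, y))
```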
  • here, the first training model refers to the first training model obtained after updating the parameters of the first training model based on its loss function; that is, the loss function of the first training model is determined first, the parameters of the first training model are updated based on it, and then the loss function of the second compression model (that is, the first loss function) is determined.
  • for convenience of distinction, the present disclosure refers to the loss function of the first training model as the second loss function.
  • the sample training parameter set further includes a sample verification parameter set, and the first node determines at least one input-output data pair of the sample verification parameter set.
  • the first node inputs the input-output data pairs of the sample verification parameter set into the first training model and the second compression model, and determines the output of the first training model, the output of the second compression model, and the corresponding true value of the sample verification parameter set; the true value is the value corresponding to the model input.
  • the first node determines the second cross entropy between the output of the first training model and the true value, and the second relative entropy divergence between the output of the first training model and the output of the second compression model; the sum of the second cross entropy and the second relative entropy divergence is determined as the second loss function.
  • a plurality of second loss functions are determined based on a plurality of input-output data pairs in the sample parameter set, the average value of the plurality of second loss functions is determined, and the parameters of the first training model are updated by gradient descent according to the average value of the plurality of second loss functions, to obtain an updated first training model.
  • the second loss function (that is, the loss function of the first training model) is expressed by the following formula:
  • $L_{\theta} = L_C(p_1, y) + D_{KL}(p_1 \| p_S)$, where $L_{\theta}$ is the loss function of the first training model; $L_C(p_1, y)$ is the second cross entropy between the output value $p_1$ of the first training model and the true value $y$ of the input-output data pairs of the sample verification parameter set; and $D_{KL}(p_1 \| p_S)$ is the second relative entropy divergence between the output value of the first training model and the output value $p_S$ of the second compression model.
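  • Taken together, the two loss functions imply an alternating schedule: the first training model is updated with the second loss, and the second compression model is then updated with the first loss. Below is a hedged PyTorch sketch of one such round, assuming both models return log-probabilities; all names are illustrative and the optimizers stand in for the gradient descent described above.

```python
import torch
import torch.nn.functional as F

def alternating_update(teacher, student, x, y, opt_teacher, opt_student):
    """One round of the alternating update: the first training model
    (teacher) is updated with the second loss, then the second compression
    model (student) with the first loss. Both models are assumed to return
    log-probabilities (e.g. via LogSoftmax)."""
    # Second loss L_theta = L_C(p_1, y) + D_KL(p_1 || p_S): update the teacher.
    log_p1 = teacher(x)
    with torch.no_grad():
        log_pS = student(x)
    loss_teacher = F.nll_loss(log_p1, y) + F.kl_div(
        log_pS, log_p1, reduction="batchmean", log_target=True)
    opt_teacher.zero_grad()
    loss_teacher.backward()
    opt_teacher.step()

    # First loss L_thetaS = L_C(p_S, y) + D_KL(p_S || p_1): update the student.
    log_pS = student(x)
    with torch.no_grad():
        log_p1 = teacher(x)
    loss_student = F.nll_loss(log_pS, y) + F.kl_div(
        log_p1, log_pS, reduction="batchmean", log_target=True)
    opt_student.zero_grad()
    loss_student.backward()
    opt_student.step()
    return loss_teacher.item(), loss_student.item()

# Usage with toy linear classifiers as stand-ins for the two models.
teacher = torch.nn.Sequential(torch.nn.Linear(8, 3), torch.nn.LogSoftmax(dim=-1))
student = torch.nn.Sequential(torch.nn.Linear(8, 3), torch.nn.LogSoftmax(dim=-1))
x, y = torch.randn(16, 8), torch.randint(0, 3, (16,))
alternating_update(teacher, student, x, y,
                   torch.optim.SGD(teacher.parameters(), lr=0.1),
                   torch.optim.SGD(student.parameters(), lr=0.1))
```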
  • the model training modes in the model compression parameters include a single training node mode for training a single first training model and a multi-training node mode for training multiple first training models.
  • the first node determines the number of first training models to train according to the model training mode included in the model training parameters. If the model training mode is the single training node mode, one first training model is trained based on a single first node, using the training method described above.
  • If the model training mode is the multi-training node mode, a plurality of first training models are trained based on a plurality of first nodes, and different sequence marks are set for the plurality of first nodes that train the plurality of first training models.
  • the following takes the mth model training node (that is, the mth first node) as an example to describe the multi-training node mode.
  • the first node determines, from among the multiple model compression options, a first model compression option for model compression according to one or more of its local computing capability, communication conditions, and training samples.
  • the first training model is compressed according to the requirement on the model parameter data volume in the model compression option to obtain the second compression model, and the symbol θS is used to denote the second compression model.
  • the following implementations may be used to compress the first training model by using the matrix g and the first model compression option:
  • the first node takes the amount of model parameter data as a constraint, and designs a pruning matrix X to retain the channel that contributes more to the accuracy in the model.
  • the first node takes the sum of the elements of each column of the pruning matrix X as the unknown item, and according to the size of the elements in each column of the matrix g, retains the channel corresponding to the item with the largest column element in the matrix g, and prunes other channels.
  • after the pruning matrix X is obtained, X is used to prune θm to obtain the mth second compression model.
  • the sample training parameter set further includes a sample verification parameter set, and at least one input-output data pair of the sample verification parameter set is determined.
  • the first node inputs the input-output data pairs of the sample verification parameter set into the mth first training model and the mth second compression model, and determines the output of the mth first training model, the output of the mth second compression model, and the corresponding true value of the sample verification parameter set; the true value is the value corresponding to the model input.
  • for convenience of distinction, the present disclosure refers to the loss function of the mth second compression model as the mth first loss function.
  • a plurality of mth first loss functions are determined based on a plurality of input-output data pairs in the sample parameter set, the average value of the plurality of mth first loss functions is determined, and the parameters of the mth second compression model are updated by gradient descent according to the average value, to obtain the mth first compression model.
  • the mth first loss function (that is, the loss function of the mth second compression model) is expressed by the following formula: $L_{\theta_S^m} = L_C(p_S^m, y) + D_{KL}(p_S^m \| p_m)$, where $p_S^m$ is the output value of the mth second compression model and $p_m$ is the output value of the mth first training model.
  • here, the mth first training model refers to the mth first training model obtained by updating the parameters of the mth first training model based on its loss function; that is, the loss function of the mth first training model is determined first, and after the parameters of the mth first training model are updated based on that loss function, the loss function of the mth second compression model (that is, the mth first loss function) is determined.
  • for convenience of distinction, the loss function of the mth first training model is referred to as the mth second loss function.
  • the sample training parameter set further includes a sample verification parameter set, and the first node determines at least one input-output data pair of the sample verification parameter set.
  • the first node inputs the input-output data pairs of the sample verification parameter set into the mth first training model and the mth second compression model, and determines the output of the mth first training model, the output of the mth second compression model, and the corresponding true value of the sample verification parameter set; the true value is the value corresponding to the model input.
  • the first node determines the mth second cross entropy between the output of the mth first training model and the true value of the sample verification parameter set, and the mth second relative entropy divergence between the output of the mth first training model and the output of the mth second compression model; the sum of the mth second cross entropy and the mth second relative entropy divergence is determined as the mth second loss function.
  • a plurality of mth second loss functions are determined based on a plurality of input-output data pairs in the sample parameter set, the average value of the plurality of mth second loss functions is determined, and the parameters of the mth first training model are updated by gradient descent according to the average value, to obtain an updated mth first training model.
  • the mth second loss function (that is, the loss function of the mth first training model) is expressed by the following formula: $L_{\theta_m} = L_C(p_m, y) + D_{KL}(p_m \| p_S^m)$, where $L_C(p_m, y)$ is the mth second cross entropy between the output value $p_m$ of the mth first training model and the true value $y$ of the input-output data pairs of the sample verification parameter set, and $D_{KL}(p_m \| p_S^m)$ is the mth second relative entropy divergence between the output value of the mth first training model and the output value $p_S^m$ of the mth second compression model.
  • model compression methods such as model sparsification and parameter quantization may also be selected to determine the first compression model, which is not specifically limited in the present disclosure.
  • Fig. 4 is a flowchart of a training method according to an exemplary embodiment. As shown in FIG. 4, updating the first compression model based on a third compression model includes the following steps.
  • step S31 a fourth indication message is received.
  • the second node determines the first compression model according to the received second indication message. If there is one first compression model, it is determined whether the first compression model meets the model subscription requirement or the analysis subscription requirement. If there are multiple first compression models, the multiple first compression models are federated averaged to obtain a third compression model (also called a global model), and it is determined whether the third compression model meets the model subscription requirement or the analysis subscription requirement. In an implementation manner, if a first compression model, or the third compression model obtained by federated averaging of multiple first compression models, does not meet the subscription requirement, a fourth indication message is sent, where the fourth indication message is used to indicate the third compression model. The first node receives the fourth indication message.
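  • The second node's decision logic in this step can be summarized in a short Python sketch. Federated averaging is shown as a plain element-wise mean of the uploaded parameters; meets_subscription stands in for the unspecified check against the model/analysis subscription requirements, and the return values merely echo the fourth and fifth indication messages of the text.

```python
from typing import Callable, Dict, List, Tuple
import numpy as np

Model = Dict[str, np.ndarray]  # parameter name -> parameter array

def federated_average(models: List[Model]) -> Model:
    """Element-wise mean of the parameters of the uploaded first
    compression models (the third compression model / global model)."""
    return {k: np.mean([m[k] for m in models], axis=0) for k in models[0]}

def handle_uploaded_models(models: List[Model],
                           meets_subscription: Callable[[Model], bool]
                           ) -> Tuple[str, Model]:
    """A single model is checked directly; multiple models are averaged
    first. Returns which indication message to send back: the fifth ends
    training, the fourth carries the third compression model for another
    round of compression-parameter re-determination."""
    result = models[0] if len(models) == 1 else federated_average(models)
    if meets_subscription(result):
        return "fifth_indication", result
    return "fourth_indication", result

# Usage: two uploaded models, each a name -> parameter-array dict.
m1, m2 = {"w": np.array([1.0, 2.0])}, {"w": np.array([3.0, 4.0])}
print(handle_uploaded_models([m1, m2], lambda m: bool(m["w"].mean() > 0)))
```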
  • step S32 based on the third compression model, model compression parameters are re-determined, and the first compression model is updated based on the re-determined model compression parameters.
  • the first node re-determines the model compression parameters according to the third compression model indicated by the fourth indication message, updates the first compression model based on the re-determined model compression parameters, determines the loss function of the first compression model, and re-updates the parameters of the first compression model until the second node determines a compression model that satisfies the model subscription requirement.
  • in another implementation, when the second node determines that the first compression model meets the model subscription requirements, it determines to send a fifth indication message, where the fifth indication message is used to indicate the end of training the first compression model. After receiving the fifth indication message, the first node determines that the training of the first compression model is over, and the second node sends the determined compression model to the model subscriber.
  • after obtaining one or more first compression models corresponding to the model training mode, the first node sends a second indication message to the second node through a wireless channel.
  • the second indication message includes the number of first compressed models corresponding to the model training mode.
  • the embodiment of the present disclosure solves the problem that the data volume of the deep learning model is too large, effectively alleviates the shortage of wireless resources, reduces data transmission errors in the case of network congestion, improves the reliability of model transmission in the wireless network, and ensures the accuracy of the model.
  • the model obtained by the first node through training with the local training parameter set is compressed and then uploaded to the second node; this method not only keeps the user's private data local, but also greatly increases the difficulty of reverse inference on the model by the network, which further ensures the security of user information.
  • the embodiments of the present disclosure also provide a training method.
  • Fig. 5 is a flowchart of a training method according to an exemplary embodiment. As shown in Figure 5, the training method is used in the second node and includes the following steps.
  • step S41 a model training request is sent.
  • the model training request includes model compression parameters, and the model compression parameters are used to compress the first training model to obtain the first compression model, and the first training model is obtained by training based on the model training request.
  • the first node is a model training node, and the second node is a model request node; for convenience of description, the present disclosure refers to the model training node as the first node and the model request node as the second node.
  • the model training request includes model compression parameters.
  • the model compression parameters include at least one of the following:
  • a model training structure, multiple model compression options, and a model training mode.
  • the model compression option is determined based on the model subscription requirement received by the second node (eg, the model requesting node).
  • the second node determines to send the model training request according to the received model subscription requirement.
  • the first node (for example, the model training node) sends response information for the model training request; the first training model is trained based on the local sample parameter set and the model training structure, and the relevant parameters required for model compression are determined.
  • the information sent by the first node to respond to the model training request further includes one or more of the local computing capability of the first node, communication conditions, and characteristics of the training sample parameter set.
  • the first node compresses the first training model based on the model compression option in the model compression parameters and the relevant parameters required for model compression.
  • the model compression options include model accuracy and model parameter data volume.
  • the model compression parameters include multiple model compression options, and the multiple model compression options are determined by the second node based on one or more of local computing capabilities reported by multiple first nodes, communication conditions, and characteristics of the training sample parameter set.
  • the model training modes in the model compression parameters include a single training node mode for training a single first training model and a multi-training node mode for training multiple first training models.
  • the first node determines the number of first training models to train according to the model training mode included in the model training parameters. If the model training mode is the single training node mode, one first training model is trained based on a single first node, using the training method described above.
  • If the model training mode is the multi-training node mode, a plurality of first training models are trained based on a plurality of first nodes, and different sequence marks are set for the plurality of first nodes that train the plurality of first training models.
  • after obtaining one or more first compression models corresponding to the model training mode, the first node sends a second indication message to the second node through a wireless channel.
  • the second indication message includes the number of first compressed models corresponding to the model training modes.
  • the second node receives the second indication message to determine the number of first compression models corresponding to the model training mode, performs federated averaging on the received one or more first compression models to obtain a third compression model, and determines whether the third compression model meets the model subscription requirements or the analysis subscription requirements.
  • the subscription requirement may be issued by Operation Administration and Maintenance (OAM), or issued by the core network.
  • The subscription requirements include: an analysis ID, used to identify the analysis type of the model training request; a notification target (model training node) address, used to associate notifications received by the requested party with this subscription; analysis report information, including parameters such as the preferred analysis accuracy level and the analysis time interval; and analysis filter information (optional), indicating the conditions to be met by the reported analysis information.
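  • For concreteness, the subscription fields just listed might be carried in a structure like the following Python sketch; every field name and value here is hypothetical, since the text only names the fields informally.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class SubscriptionRequirement:
    """Illustrative layout of the subscription fields listed above;
    the field names are hypothetical, not taken from any specification."""
    analysis_id: str                       # identifies the analysis type of the request
    notification_target_address: str       # associates notifications with this subscription
    preferred_accuracy_level: str          # analysis report information
    analysis_time_interval_s: int          # analysis report information
    analysis_filter: Optional[str] = None  # optional: conditions the report must meet

req = SubscriptionRequirement(
    analysis_id="traffic-prediction",
    notification_target_address="request-node.example",
    preferred_accuracy_level="high",
    analysis_time_interval_s=600,
)
```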
  • in an implementation manner, if the first compression model or the third compression model obtained after federated averaging does not meet the subscription requirement, a fourth indication message is sent, wherein the fourth indication message is used to indicate the third compression model.
  • the first node receives the fourth indication message.
  • in another implementation, when the second node determines that the first compression model meets the model subscription requirements, it determines to send a fifth indication message, where the fifth indication message is used to indicate the end of training the first compression model. After receiving the fifth indication message, the first node determines that the training of the first compression model is finished, and the second node sends the determined compression model to the model subscriber.
  • the second node receives the subscription requirement sent by the OAM or the core network, and determines to send the model training request based on the received subscription requirement.
  • the method involving the first node and the second node may be applied between a base station and a base station, or between a base station and a terminal, and of course may also be applied between a base station and the core network.
  • an application environment may be that the first node is a terminal and the second node is a base station.
  • it can also be applied to an application environment in which the first node is a base station and the second node is also a base station, and may also include an application environment in which the first node is a base station and the second node is a core network node.
  • after obtaining one or more first compression models corresponding to the model training mode, the first node sends a second indication message to the second node through a wireless channel.
  • the second indication message includes the number of first compressed models corresponding to the model training mode.
  • the embodiment of the present disclosure solves the problem that the data volume of the deep learning model is too large, effectively alleviates the shortage of wireless resources, reduces data transmission errors in the case of network congestion, improves the reliability of model transmission in the wireless network, and ensures the accuracy of the model.
  • the model obtained by the first node through training with the local training parameter set is compressed and then uploaded to the second node; this method not only keeps the user's private data local, but also greatly increases the difficulty of reverse inference on the model by the network, which further ensures the security of user information.
  • the first node is referred to as a model training node
  • the second node is referred to as a model request node.
  • the present disclosure is further described in terms of interaction between model training nodes and model request nodes.
  • FIG. 6 is a flowchart of an implementation manner of determining a first compression model in a single training node mode in a training method provided by the present disclosure.
  • the model request node initiates a model training request to the model training node.
  • the model training node sends the local computing power, communication conditions and training sample parameter set characteristics to the model requesting node.
  • the model request node determines the model structure and model training mode according to the model/analysis subscription requirements, and proposes a variety of model compression options based on the information reported by the model training node, including model accuracy and model parameter data volume.
  • the model request node sends the model structure, model training mode, and model compression options to the model training node, and the model training node selects an appropriate model compression option.
  • the model training node uses the local sample parameter set for model training to obtain the first training model and related parameters required for model compression.
  • the model training node compresses the first training model according to the selected model compression option and relevant parameters required for model compression to obtain a first compressed model, and transmits the first compressed model to the model requesting node through a wireless channel.
  • if the first compression model meets the model/analysis subscription requirements, the model training process ends, and the model request node reports the model to the model/analysis subscriber.
  • FIG. 7 is a flowchart of an implementation manner of determining a first compression model in a multi-training node mode in a training method provided by the present disclosure.
  • the model request node initiates a model training request to the model training node.
  • the model training node sends the local computing power, communication conditions and training sample parameter set characteristics to the model requesting node.
  • the model request node determines the model structure and model training mode according to the model/analysis subscription requirements, and proposes a variety of model compression options based on the information reported by the model training node, including model accuracy and model parameter data volume.
  • the model request node sends the model structure, model training mode, and model compression options to the model training node, and the model training node selects an appropriate model compression option.
  • the model training node uses the local sample parameter set for model training to obtain the first training model and related parameters required for model compression.
  • the model training node selects an appropriate model compression option, compresses the first training model according to the selected model compression option and the relevant parameters required for model compression to obtain a first compression model, and transmits the first compression model to the model request node through the wireless channel.
  • the model request node performs federated averaging on the first compression models sent by the model training nodes to obtain a global model.
  • if the global model meets the model/analysis subscription requirements, the model training process ends, and the model request node reports the global model to the model/analysis subscriber. If the global model does not meet the model/analysis subscription requirements, the model training node reselects an appropriate model compression option and updates the first compression model according to the re-determined model compression option.
  • FIG. 8 is a schematic diagram of the protocol and interface of the model training and compression decision part in a training method provided by the present disclosure. As shown in FIG. 8 , it includes a service management module and a network communication module in the model request node, and a network communication module, model training and compression module, data processing and storage module in the model training node device.
  • the service management module and the network communication module in the model request node, and the network communication module, model training and compression module, and data processing and storage module in the model training node device, perform the following steps for information exchange.
  • step 1 includes steps 1a-1c, wherein in step 1a, the model request node service management module sends model training request signaling to the model request node network communication module, and the content of the signaling indication is to initiate a model training request to the model training node.
  • in step 1b, the model request node network communication module sends the model training request signaling to the model training node network communication module.
  • step 1c the model training node network communication module sends the model training request response signaling to the model request node network communication module, and the content of the signaling instruction is to notify the acceptance of the model training request.
  • Step 2 includes steps 2a-2c, wherein in step 2a, the model training node model training and compression module sends computing capability information reporting signaling to the model training node network communication module, and the content of the signaling indication is to report the computing capability information of the model training node device to the receiver.
  • in step 2b, the model training node data processing and storage module sends training sample feature information reporting signaling to the model training node network communication module, and the content of the signaling indication is to report the model training node's local data training sample feature information to the receiver.
  • in step 2c, the model training node network communication module sends the computing capability and training sample feature information reporting signaling to the model request node network communication module, and the content of the signaling indication is to report the model training node's computing capability and local data training sample feature information to the receiver.
• In step 3, if the model training node is a terminal and the model request node is a base station, the model training node network communication module measures the Channel Quality Indication (CQI) and sends a CQI reporting signaling to the model request node network communication module, indicating that CQI measurement is performed and the CQI information is reported to the receiver.
• In step 4, the model request node network communication module sends the model training node's computing capability, training sample characteristics, and (optional) CQI information signaling to the model request node service management module, indicating that the received computing capability, training sample characteristics, and (optional) CQI information are aggregated and sent to the receiver.
• In step 5, the model request node service management module determines the model structure and the model training mode according to the model/analysis subscription requirements.
• In step 6, the model request node service management module formulates multiple model compression options according to the information reported by the model training node.
• Step 7 includes steps 7a-7b. In step 7a, the model request node service management module sends the model structure and model training mode signaling to the model request node network communication module, indicating that the model structure and the model training mode are sent to the receiver.
• In step 7b, the model request node service management module sends the model compression option signaling to the model request node network communication module, indicating that the multiple model compression options are sent to the receiver.
• Step 8 includes steps 8a-8b. In step 8a, the model request node network communication module sends the model structure and model training mode signaling to the model training node network communication module. In step 8b, the model request node network communication module sends the model compression option signaling to the model training node network communication module.
• Step 9 includes steps 9a-9b. In step 9a, the model training node network communication module forwards the model structure and model training mode signaling to the model training node model training and compression module.
• In step 9b, the model training node network communication module forwards the model compression option signaling to the model training node model training and compression module.
• In step 10, the model training node selects an appropriate model compression option according to its locally available computing power, real-time communication conditions, and the characteristics of the training samples (a possible selection heuristic is sketched after this list).
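The disclosure leaves the selection rule of step 10 open. The following sketch shows one plausible heuristic; the option fields ('flops', 'size_bits', 'expected_accuracy') and the thresholds are hypothetical, introduced here only for illustration:

```python
def select_compression_option(options, flops_budget, cqi, num_samples):
    """Pick a compression option that fits the node's local constraints.

    options: list of dicts with hypothetical fields 'flops', 'size_bits'
    and 'expected_accuracy'. flops_budget reflects locally available
    computing power, cqi (0-15) the real-time channel quality, and
    num_samples the size of the local training sample set.
    """
    # Rough uplink proxy: better channel quality tolerates a larger model.
    max_size_bits = 1e6 * (cqi + 1)
    feasible = [o for o in options
                if o['flops'] <= flops_budget and o['size_bits'] <= max_size_bits]
    if not feasible:
        # Nothing fits: fall back to the computationally cheapest option.
        return min(options, key=lambda o: o['flops'])
    if num_samples < 1000:
        # Few local samples: prefer the smallest model to limit overfitting.
        return min(feasible, key=lambda o: o['size_bits'])
    return max(feasible, key=lambda o: o['expected_accuracy'])
```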
• FIG. 9 is a schematic diagram of the protocol and interfaces of the model training and compression part in the single training node mode in a training method provided by the present disclosure. As shown in FIG. 9, the interaction involves the data processing and storage module, the model training and compression module, and the network communication module in the model training node, and the network communication module and the service management module in the model request node device. These modules exchange messages through the following steps.
• Step 1 includes steps 1a-1b. In step 1a, the model training node model training and compression module sends a request local training data set signaling to the model training node data processing and storage module, indicating a request to collect a training data set from local data.
• In step 1b, the model training node data processing and storage module returns the local training data set signaling to the model training node model training and compression module, indicating that data is collected locally to generate a training data set, which is sent to the receiver.
• In step 2, the model training node model training and compression module uses the local training data set for model training and obtains the training model and the relevant parameters required for model compression.
• In step 3, the model training node compresses the original training model according to the selected model compression option and the relevant parameters required for model compression to obtain a compressed model (one concrete compression option is sketched after this list).
• Step 4 includes steps 4a-4c. In step 4a, the model training node model training and compression module sends the compressed model to the model training node network communication module.
• In step 4b, the model training node network communication module sends the compressed model to the model request node network communication module.
• In step 4c, the model request node network communication module sends the compressed model to the model request node service management module.
• In step 5, the model request node service management module judges whether the obtained model satisfies the model/analysis subscription requirements. If satisfied, step 6 is performed; otherwise, steps 6a-6b are performed.
• In step 6, the model request node service management module sends a notification of model training end signaling to the model training node network communication module via the model request node network communication module. This process and the corresponding signaling are newly added in the present disclosure; the signaling indicates that the model training node is notified to end the model training process.
• In step 6a, performed when the requirements are not satisfied, the model request node service management module sends a notification of model training continuation signaling to the model training node network communication module via the model request node network communication module. This process and the corresponding signaling are newly added in the present disclosure; the signaling indicates that the model training process continues.
• In step 6b, the model training node network communication module forwards the model training continuation signaling to the model training node model training and compression module.
• In step 7, the model training node model training and compression module continues to train the compressed model using the local training data set, and steps 4a-7 are repeated until the model obtained by the model request node meets the model/analysis subscription requirements.
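The disclosure treats the compression options generically; pruning and quantization are typical candidates. As a concrete illustration of step 3 above, a minimal magnitude-pruning sketch over a weight dict follows, with the per-layer sparsity level as a hypothetical per-option parameter:

```python
import numpy as np

def prune_by_magnitude(model, sparsity=0.5):
    """Zero out the smallest-magnitude fraction of weights in each layer.

    model: dict mapping layer name -> np.ndarray.
    sparsity: fraction of weights to remove per layer (a hypothetical
    per-option parameter, not fixed by the disclosure).
    """
    pruned = {}
    for layer, w in model.items():
        k = int(w.size * sparsity)
        if k == 0:
            pruned[layer] = w.copy()
            continue
        # k-th smallest absolute value becomes the pruning threshold.
        threshold = np.partition(np.abs(w).ravel(), k - 1)[k - 1]
        pruned[layer] = np.where(np.abs(w) <= threshold, 0.0, w)
    return pruned
```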
• FIG. 10 is a schematic diagram of the protocol and interfaces of the model training and compression part in the multi-training node mode in a training method provided by the present disclosure. As shown in FIG. 10, the interaction involves the model training and compression module, the data processing and storage module, and the network communication module in the model training node, and the network communication module, the model calculation and update module, and the service management module in the model request node device. These modules exchange information through the following steps.
  • Step 1 includes steps 1a-1b, wherein in step 1a, the model training and compression module of the model training node sends a request for local training data set signaling to the model training node data processing and storage module. In step 1b, the data processing and storage module of the model training node sends the signaling of sending the local training data set to the model training and compression module of the model training node.
• In step 2, the model training node model training and compression module uses the local training data set to perform model training and obtains the first training model and the relevant parameters required for model compression.
• In step 3, the model training node compresses the first training model according to the selected model compression option and the relevant parameters required for model compression to obtain the first compression model.
  • Step 4 includes steps 4a-4c, wherein in step 4a, the model training node model training and compression module sends the first compressed model to the model training node network communication module. In step 4b, the model training node network communication module sends the first compressed model to the model request node network communication module. In step 4c, the model request node network communication module sends the first compressed model to the model request node model calculation and update module.
• In step 5, the model request node model calculation and update module collects the first compression models sent from the model training nodes and performs federated averaging to obtain a global model.
• In step 6, the model request node model calculation and update module sends the global model to the model request node service management module.
• In step 7, the model request node service management module judges whether the obtained model meets the model/analysis subscription requirements. If so, step 8 is performed; otherwise, steps 8a-8b are performed.
• In step 8, the model request node service management module sends a notification of model training end signaling to the model training node network communication module via the model request node network communication module.
• In step 8a, performed when the requirements are not met, the model request node service management module sends a notification of model training continuation signaling to the model training node network communication module via the model request node network communication module, and distributes the global model to the model training node network communication module via the model request node network communication module.
• In step 8b, the model training node network communication module forwards the model training continuation signaling and the global model to the model training node model training and compression module.
• In step 9, the model training node model training and compression module uses the local training data set to perform model training and compression on the global model sent by the model request node, and steps 4a-9 are repeated until the model obtained by the model request node satisfies the model/analysis subscription requirements (the request node's side of this loop is sketched after this list).
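A condensed sketch of the model request node's side of the loop in steps 4-9 follows, reusing the federated_average helper sketched earlier. The node objects, their train_and_compress method, and the meets_requirements callable are hypothetical stand-ins for the signaling exchanges described above:

```python
def run_multi_node_training(nodes, meets_requirements, max_rounds=100):
    """Server-side loop: collect compressed models, average, redistribute.

    nodes: objects with a hypothetical train_and_compress(global_model)
    method returning a weight dict. meets_requirements: callable that
    implements the model/analysis subscription check.
    """
    global_model = None
    for _ in range(max_rounds):
        # Steps 4a-4c: each node trains locally, compresses, and uploads.
        local_models = [n.train_and_compress(global_model) for n in nodes]
        # Step 5: federated averaging of the first compression models.
        global_model = federated_average(local_models)
        # Step 7: check the model/analysis subscription requirement.
        if meets_requirements(global_model):
            break  # Step 8: notify the nodes that training ends.
        # Steps 8a-8b: otherwise redistribute the global model and continue.
    return global_model
```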
• FIG. 11 is a schematic diagram of the protocol and interfaces of the wireless data transmission part in a training method provided by the present disclosure. As shown in FIG. 11, the interaction involves the model training and compression module, the transmission control module, and the network communication module in the model training node, and the network communication module, the transmission control module, and the service management module in the model request node. This part applies to the scenario in which the model request node is a base station and the model training node is a terminal. These modules exchange information through the following steps.
• In step 1, the model training node model training and compression module sends the compressed model to the model training node transmission control module.
• In step 2, the model training node network communication module sends the measured CQI to the model training node transmission control module.
• In step 3, the model training node transmission control module formulates a data transmission scheme according to the compression characteristics and the wireless communication conditions.
• In step 4, the model training node transmission control module sends the data transmission scheme information signaling to the model training node network communication module. This process and the corresponding signaling are newly added in the present disclosure; the signaling indicates that the data transmission scheme information, including the modulation mode, the code rate and other information, is sent to the receiver (a possible CQI-based scheme is sketched after this list).
• In step 5, the model training node model training and compression module sends the compressed model to the model training node network communication module.
• In step 6, the model training node network communication module encapsulates the compressed model according to the data transmission scheme.
• Step 7 includes steps 7a-7d. In step 7a, the model training node network communication module transmits the compressed model data packets to the model request node network communication module.
• In step 7b, the model request node network communication module sends the compressed model to the model request node transmission control module; the data transmitted at this point has been decapsulated.
• In step 7c, the model request node transmission control module sends an acknowledgement of correct data reception signaling to the model request node network communication module, indicating that correct data has been received.
• In step 7d, the model request node network communication module forwards the acknowledgement of correct data reception to the model training node network communication module.
• In step 8, the model request node transmission control module sends the compressed model to the model request node service management module. In the single training node mode, the compressed model can be sent directly to the model request node service management module; in the multi-training node mode, the global model is first obtained through the model request node model calculation and update module and then sent to the model request node service management module.
• In step 9, the model request node service management module judges whether the model meets the model/analysis subscription requirements. If so, steps 10a1-10b1 are performed.
• In step 10a1, the model request node service management module sends a notification of model training end signaling to the model request node transmission control module.
• In step 10b1, the model request node network communication module sends the notification of model training end signaling to the model training node network communication module.
• If the model does not meet the requirements, steps 10a2-10b2 are performed.
• In step 10a2, the model request node service management module sends the model training continuation signaling to the model request node transmission control module, indicating that the model training node is notified to continue the model training process.
• In step 10b2, the model request node network communication module sends the model training continuation signaling to the model training node network communication module.
• In the single training node mode, only the signaling notifying that model training continues needs to be sent; in the multi-training node mode, the global model also needs to be distributed to the model training nodes.
• The protocol and interfaces for global model distribution are similar to steps 1-7 above, except that the sending module is replaced by the model request node, the receiving module is replaced by the model training node, and the compressed model is replaced by the global model.
• In addition, in this case the model request node should initiate a CQI measurement request to the model training node, and the model training node performs the CQI measurement and feeds the result back to the model request node.
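Steps 2-4 of the transmission procedure above derive a data transmission scheme (modulation mode, code rate) from the measured CQI, but the mapping itself is not specified in the disclosure. A minimal sketch with an illustrative four-entry table follows; the actual CQI-to-MCS mapping in LTE/NR is standardized with 16 entries, so the values below are assumptions:

```python
# Illustrative CQI -> (modulation, code rate) table; the real mapping is
# standard- and implementation-specific (assumption, not from the source).
CQI_TABLE = {
    range(0, 4):   ('QPSK',   0.30),
    range(4, 8):   ('16QAM',  0.50),
    range(8, 12):  ('64QAM',  0.65),
    range(12, 16): ('256QAM', 0.85),
}

BITS_PER_SYMBOL = {'QPSK': 2, '16QAM': 4, '64QAM': 6, '256QAM': 8}

def make_transmission_scheme(cqi, payload_bits, symbol_rate=1e6):
    """Choose modulation/code rate from CQI and estimate transfer time."""
    for cqi_range, (modulation, code_rate) in CQI_TABLE.items():
        if cqi in cqi_range:
            throughput = symbol_rate * BITS_PER_SYMBOL[modulation] * code_rate
            return {'modulation': modulation,
                    'code_rate': code_rate,
                    'est_seconds': payload_bits / throughput}
    raise ValueError('CQI out of range 0-15')
```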
• An embodiment of the present disclosure further provides a training apparatus. To implement the above functions, the training apparatus includes corresponding hardware structures and/or software modules for executing each function.
  • the embodiments of the present disclosure can be implemented in hardware or a combination of hardware and computer software. Whether a function is performed by hardware or computer software driving hardware depends on the specific application and design constraints of the technical solution. Those skilled in the art may use different methods to implement the described functions for each specific application, but such implementation should not be considered beyond the scope of the technical solutions of the embodiments of the present disclosure.
  • FIG. 12 is a block diagram of a training apparatus 100 according to an exemplary embodiment. Referring to FIG. 12 , the apparatus is applied to a first node, including a model training and compression module 110 , a first network communication module 120 , a first transmission control module 130 and a data processing and storage module 140 .
  • the model training and compression module 110 is configured to train a first training model in response to receiving a model training request, wherein the model training request includes model compression parameters. Based on the first training model and the model compression parameters, a first compression model of the first training model is obtained.
  • the model compression parameter includes a plurality of model compression options.
  • the model training and compression module 110 is configured to determine a first model compression option among the multiple model compression options, and compress the first training model based on the first model compression option to obtain a second compression model.
  • the first loss function is determined according to the output of the first training model, the output of the second compression model, and the sample parameter set used to train the first training model.
  • the parameters of the second compression model are updated based on the first loss function to obtain the first compression model.
  • the apparatus further includes a data processing and storage module 140 .
• the data processing and storage module 140 is configured to determine the first cross entropy between the output of the second compression model and the sample parameter set, and to determine the first relative entropy divergence between the output of the second compression model and the output of the first training model. Based on the first cross entropy and the first relative entropy divergence, the first loss function is determined.
• the data processing and storage module 140 is further configured to determine, according to the output of the first training model, the output of the second compression model, and the sample parameter set used for training the first training model, a second loss function for updating the parameters of the first training model.
• the data processing and storage module 140 is further configured to determine the second cross entropy between the output of the first training model and the sample parameter set, and to determine the second relative entropy divergence between the output of the first training model and the output of the second compression model. Based on the second cross entropy and the second relative entropy divergence, the second loss function is determined.
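The two loss functions described above pair a cross entropy against the sample labels with a relative entropy (KL) divergence between the two models' outputs, i.e. a knowledge-distillation-style objective. A minimal PyTorch sketch of both losses follows; the weighting coefficient alpha and the KL direction are illustrative choices not fixed by the disclosure:

```python
import torch.nn.functional as F

def first_loss(compressed_logits, training_logits, labels, alpha=0.5):
    """Loss for updating the compressed model: cross entropy with the
    sample labels plus a KL divergence between the compressed model's
    output and the (detached) first training model's output."""
    ce = F.cross_entropy(compressed_logits, labels)
    kl = F.kl_div(F.log_softmax(compressed_logits, dim=-1),
                  F.softmax(training_logits.detach(), dim=-1),
                  reduction='batchmean')
    return alpha * ce + (1.0 - alpha) * kl

def second_loss(training_logits, compressed_logits, labels, alpha=0.5):
    """Loss for updating the first training model: cross entropy with
    the sample labels plus a KL divergence between the training model's
    output and the (detached) compressed model's output."""
    ce = F.cross_entropy(training_logits, labels)
    kl = F.kl_div(F.log_softmax(training_logits, dim=-1),
                  F.softmax(compressed_logits.detach(), dim=-1),
                  reduction='batchmean')
    return alpha * ce + (1.0 - alpha) * kl
```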
  • the model compression parameters include a model training mode, which includes a single training node mode for training a single first training model and a multi-training node mode for training multiple first training models.
  • the number of first training models is determined based on the model training mode.
  • the apparatus further includes a first network communication module 120 .
  • the first network communication module 120 is configured to send a second indication message, where the second indication message includes a number of the first compressed models corresponding to the model training modes.
  • the first network communication module 120 is further configured to receive a third instruction message, where the third instruction message includes an instruction to determine the training model.
  • the first network communication module 120 is further configured to receive a fourth indication message.
• the fourth indication message is used to indicate a third compression model, where the third compression model is a compression model obtained by performing federated averaging on a number of the first compression models.
• if the third compression model does not meet the model/analysis subscription requirements, the model compression parameters are re-determined, and the first compression model is updated based on the re-determined model compression parameters.
• the first network communication module 120 is further configured to receive a fifth instruction message, where the fifth instruction message is used to indicate the end of training the first compression model.
  • the first network communication module 120 is used for data transmission and control signaling interaction between the model requesting node and the model training node.
• the first transmission control module 130 is configured to formulate a data transmission scheme according to the characteristics of the data to be transmitted and the wireless communication conditions, and to encapsulate the data to be transmitted according to the data transmission scheme. The transmission control module is required only in embodiments in which the model request node is a base station and the model training node is a user terminal.
  • the data processing and storage module is used to manage local data, generate training sample characteristic information, collect data to generate a local training data set, and store the data set.
  • the model training and compression module is used for model training using the local data set, and compressing the model according to the information required for model compression obtained in the training process.
  • FIG. 13 is a block diagram of a training apparatus 200 according to an exemplary embodiment.
  • the apparatus is applied to a second node, and includes a second network communication module 210 , a second transmission control module 220 , a service management module 230 and a model calculation and update module 240 .
  • the second network communication module 210 is configured to send a model training request.
  • the model training request includes model compression parameters, and the model compression parameters are used to compress the first training model to obtain the first compression model, and the first training model is obtained by training based on the model training request.
  • the model compression parameters include a model training mode, which includes a single training node mode for training a single first training model and a multi-training node mode for training multiple first training models.
  • the number of first training models is determined based on the model training mode.
  • the second network communication module 210 is further configured to receive a second indication message, where the second indication message includes a number of the first compressed models corresponding to the model training modes.
  • the second network communication module 210 is further configured to send a third instruction message, where the third instruction message includes an instruction to determine the training model.
  • the model training mode includes a multi-training node mode
  • the second network communication module 210 is further configured to send a fourth indication message.
• the fourth indication message is used to indicate a third compression model, where the third compression model is a compression model obtained by performing federated averaging on a number of the first compression models.
  • the second network communication module 210 is further configured to send a fifth instruction message, where the fifth instruction message is used to instruct the end of training the first compression model.
  • the apparatus further includes a service management module 230 .
  • the service management module 230 is configured to receive subscription requirements and send a model training request based on the subscription requirements.
  • the second network communication module 210 is used for data transmission and control signaling interaction between the model requesting node and the model training node.
• the second transmission control module 220 is configured to formulate a data transmission scheme according to the characteristics of the data to be transmitted and the wireless communication conditions, and to encapsulate the data to be transmitted according to the data transmission scheme. The transmission control module is required only in embodiments in which the model request node is a base station and the model training node is a user terminal.
  • the service management module 230 is used to process model/analysis subscription requests, initiate model training requests to model training nodes, formulate model structures, model training modes and model compression options, and check whether the obtained models meet model/analysis subscription requirements.
  • the model calculation and update module 240 is used for performing federated averaging on the compressed models sent from multiple model training nodes in a multi-training node mode to obtain a global model, and distributing the global model to the model training nodes.
• A model training node device for wireless-network-oriented deep learning model training and compression is responsible for: responding to a model training request from a model request node, reporting local resource information, selecting an appropriate model compression option, and performing model training and compression according to the model training mode and the selected model compression option.
  • FIG. 14 is a block diagram of an apparatus 300 for training according to an exemplary embodiment.
  • apparatus 300 may be a mobile phone, computer, digital broadcast terminal, messaging device, game console, tablet device, medical device, fitness device, personal digital assistant, and the like.
• apparatus 300 may include one or more of the following components: a processing component 302, a memory 304, a power component 306, a multimedia component 308, an audio component 310, an input/output (I/O) interface 312, a sensor component 314, and a communication component 316.
  • the processing component 302 generally controls the overall operation of the device 300, such as operations associated with display, phone calls, data communications, camera operations, and recording operations.
  • the processing component 302 may include one or more processors 320 to execute instructions to perform all or some of the steps of the methods described above. Additionally, processing component 302 may include one or more modules that facilitate interaction between processing component 302 and other components. For example, processing component 302 may include a multimedia module to facilitate interaction between multimedia component 308 and processing component 302 .
  • Memory 304 is configured to store various types of data to support operations at device 300 . Examples of such data include instructions for any application or method operating on device 300, contact data, phonebook data, messages, pictures, videos, and the like. Memory 304 may be implemented by any type of volatile or nonvolatile storage device or combination thereof, such as static random access memory (SRAM), electrically erasable programmable read only memory (EEPROM), erasable Programmable Read Only Memory (EPROM), Programmable Read Only Memory (PROM), Read Only Memory (ROM), Magnetic Memory, Flash Memory, Magnetic or Optical Disk.
  • Power component 306 provides power to various components of device 300 .
  • Power components 306 may include a power management system, one or more power sources, and other components associated with generating, managing, and distributing power to device 300 .
  • Multimedia component 308 includes screens that provide an output interface between the device 300 and the user.
  • the screen may include a liquid crystal display (LCD) and a touch panel (TP). If the screen includes a touch panel, the screen may be implemented as a touch screen to receive input signals from a user.
  • the touch panel includes one or more touch sensors to sense touch, swipe, and gestures on the touch panel. The touch sensor may not only sense the boundaries of a touch or swipe action, but also detect the duration and pressure associated with the touch or swipe action.
  • the multimedia component 308 includes a front-facing camera and/or a rear-facing camera. When the apparatus 300 is in an operation mode, such as a shooting mode or a video mode, the front camera and/or the rear camera may receive external multimedia data. Each of the front and rear cameras can be a fixed optical lens system or have focal length and optical zoom capability.
  • Audio component 310 is configured to output and/or input audio signals.
  • audio component 310 includes a microphone (MIC) that is configured to receive external audio signals when device 300 is in operating modes, such as call mode, recording mode, and voice recognition mode. The received audio signal may be further stored in memory 304 or transmitted via communication component 316 .
  • audio component 310 also includes a speaker for outputting audio signals.
  • the I/O interface 312 provides an interface between the processing component 302 and a peripheral interface module, which may be a keyboard, a click wheel, a button, or the like. These buttons may include, but are not limited to: home button, volume buttons, start button, and lock button.
  • Sensor assembly 314 includes one or more sensors for providing status assessment of various aspects of device 300 .
• the sensor assembly 314 can detect the open/closed state of the device 300 and the relative positioning of components, such as the display and keypad of the device 300; the sensor assembly 314 can also detect a change in the position of the device 300 or of a component of the device 300, the presence or absence of user contact with the device 300, the orientation or acceleration/deceleration of the device 300, and a change in the temperature of the device 300.
  • Sensor assembly 314 may include a proximity sensor configured to detect the presence of nearby objects in the absence of any physical contact.
  • Sensor assembly 314 may also include a light sensor, such as a CMOS or CCD image sensor, for use in imaging applications.
  • the sensor assembly 314 may also include an acceleration sensor, a gyroscope sensor, a magnetic sensor, a pressure sensor, or a temperature sensor.
  • Communication component 316 is configured to facilitate wired or wireless communication between apparatus 300 and other devices.
  • Device 300 may access wireless networks based on communication standards, such as WiFi, 2G or 3G, or a combination thereof.
  • the communication component 316 receives broadcast signals or broadcast related information from an external broadcast management system via a broadcast channel.
  • the communication component 316 also includes a near field communication (NFC) module to facilitate short-range communication.
  • the NFC module may be implemented based on radio frequency identification (RFID) technology, infrared data association (IrDA) technology, ultra-wideband (UWB) technology, Bluetooth (BT) technology and other technologies.
• apparatus 300 may be implemented by one or more application specific integrated circuits (ASICs), digital signal processors (DSPs), digital signal processing devices (DSPDs), programmable logic devices (PLDs), field programmable gate arrays (FPGAs), controllers, microcontrollers, microprocessors, or other electronic components, for performing the above method.
• In an exemplary embodiment, a non-transitory computer-readable storage medium including instructions is also provided, such as the memory 304 including instructions, which are executable by the processor 320 of the apparatus 300 to perform the method described above.
  • the non-transitory computer-readable storage medium may be ROM, random access memory (RAM), CD-ROM, magnetic tape, floppy disk, optical data storage device, and the like.
  • FIG. 15 is a block diagram of an apparatus 400 for training according to an exemplary embodiment.
  • the apparatus 400 may be provided as a server.
  • apparatus 400 includes processing component 422, which further includes one or more processors, and a memory resource represented by memory 432 for storing instructions executable by processing component 422, such as an application program.
  • An application program stored in memory 432 may include one or more modules, each corresponding to a set of instructions.
  • the processing component 422 is configured to execute instructions to perform the training method described above.
  • Device 400 may also include a power supply assembly 426 configured to perform power management of device 400 , a wired or wireless network interface 450 configured to connect device 400 to a network, and an input output (I/O) interface 458 .
  • Device 400 may operate based on an operating system stored in memory 432, such as Windows ServerTM, Mac OS XTM, UnixTM, LinuxTM, FreeBSDTM or the like.
• The terms "first", "second", and the like are used to describe various information, but the information should not be limited by these terms. These terms are only used to distinguish information of the same type from one another and do not imply a particular order or level of importance; in fact, the expressions "first" and "second" are used interchangeably.
  • the first information may also be referred to as the second information, and similarly, the second information may also be referred to as the first information, without departing from the scope of the present disclosure.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Molecular Biology (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Mobile Radio Communication Systems (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)

Abstract

The present disclosure relates to a training method, a training apparatus, and a storage medium. The training method includes the following steps: in response to receiving a model training request, training a first training model, the model training request including model compression parameters; and obtaining, based on the first training model and the model compression parameters, a first compression model of the first training model. In this way, a compression model can achieve the same effect as a training model, so that signaling overhead during model transmission is reduced, the precision and reliability of the model can be guaranteed, and the security of user information is further ensured.
PCT/CN2020/130896 2020-11-23 2020-11-23 Procédé d'entraînement, appareil d'entraînement et support d'enregistrement WO2022104799A1 (fr)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202080003605.XA CN114793453A (zh) 2020-11-23 2020-11-23 一种训练方法、训练装置及存储介质
PCT/CN2020/130896 WO2022104799A1 (fr) 2020-11-23 2020-11-23 Procédé d'entraînement, appareil d'entraînement et support d'enregistrement

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2020/130896 WO2022104799A1 (fr) 2020-11-23 2020-11-23 Procédé d'entraînement, appareil d'entraînement et support d'enregistrement

Publications (1)

Publication Number Publication Date
WO2022104799A1 true WO2022104799A1 (fr) 2022-05-27

Family

ID=81708237

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/130896 WO2022104799A1 (fr) 2020-11-23 2020-11-23 Procédé d'entraînement, appareil d'entraînement et support d'enregistrement

Country Status (2)

Country Link
CN (1) CN114793453A (fr)
WO (1) WO2022104799A1 (fr)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116233857A (zh) * 2021-12-02 2023-06-06 华为技术有限公司 通信方法和通信装置

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109784474A (zh) * 2018-12-24 2019-05-21 宜通世纪物联网研究院(广州)有限公司 一种深度学习模型压缩方法、装置、存储介质及终端设备
CN109978144A (zh) * 2019-03-29 2019-07-05 联想(北京)有限公司 一种模型压缩方法和系统
WO2020131968A1 (fr) * 2018-12-18 2020-06-25 Movidius Ltd. Compression de réseau neuronal
CN111898484A (zh) * 2020-07-14 2020-11-06 华中科技大学 生成模型的方法、装置、可读存储介质及电子设备

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2020131968A1 (fr) * 2018-12-18 2020-06-25 Movidius Ltd. Compression de réseau neuronal
CN109784474A (zh) * 2018-12-24 2019-05-21 宜通世纪物联网研究院(广州)有限公司 一种深度学习模型压缩方法、装置、存储介质及终端设备
CN109978144A (zh) * 2019-03-29 2019-07-05 联想(北京)有限公司 一种模型压缩方法和系统
CN111898484A (zh) * 2020-07-14 2020-11-06 华中科技大学 生成模型的方法、装置、可读存储介质及电子设备

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
WEI YUE; CHEN SHICHAO; ZHU FENGHUA; XIONG GANG: "Pruning Method for Convolutional Neural Network Models Based on Sparse Regularization", Computer Engineering, vol. 47, no. 10, 14 November 2021, pages 61-66, ISSN 1000-3428, DOI 10.19678/j.issn.1000-3428.0059375 *

Also Published As

Publication number Publication date
CN114793453A (zh) 2022-07-26

Similar Documents

Publication Publication Date Title
CN111837425A (zh) 一种接入方法、接入装置及存储介质
WO2021258370A1 (fr) Procédé de traitement de communication, appareil de traitement de communication et support d'enregistrement
WO2022099512A1 (fr) Procédé et appareil de traitement de données, dispositif de communication, et support de stockage
WO2022165856A1 (fr) Procédé de création de rapport d'informations, dispositif de création de rapport d'informations et support de stockage
WO2022104799A1 (fr) Procédé d'entraînement, appareil d'entraînement et support d'enregistrement
WO2022193194A1 (fr) Procédé de configuration de partie de bande passante, appareil de configuration de partie de bande passante, et support de stockage
WO2022120535A1 (fr) Procédé de détermination de ressources, appareil de détermination de ressources et support de stockage
WO2021012232A1 (fr) Procédé et appareil de traitement d'informations de niveau de puissance d'émission, et support de stockage lisible par ordinateur
CN111466127A (zh) 增强上行覆盖的处理方法、装置及存储介质
US11387923B2 (en) Information configuration method and apparatus, method and apparatus for determining received power, and base station
WO2023000341A1 (fr) Procédé de configuration d'informations, appareil de configuration d'informations et support de stockage
WO2022126555A1 (fr) Procédé de transmission, appareil de transmission et support de stockage
WO2021179126A1 (fr) Procédé de détection de signalisation de commande, appareil de détection de signalisation de commande et support d'informations
CN114080852A (zh) 能力信息的上报方法、装置、通信设备及存储介质
WO2021081731A1 (fr) Procédé et appareil d'établissement de connexion, station de base, équipement utilisateur et dispositif de réseau central
WO2021007827A1 (fr) Procédés et appareils d'indication et de détermination d'informations, dispositif de communication et support d'informations
WO2023155111A1 (fr) Procédé et appareil de traitement d'informations, dispositif de communication et support de stockage
WO2022133689A1 (fr) Procédé de transmission de modèle, dispositif de transmission de modèle, et support de stockage
WO2022151490A1 (fr) Procédé et appareil de détermination d'informations d'état de canal et support de stockage
WO2022204973A1 (fr) Procédé et appareil de détermination de politique et support de stockage
WO2022141289A1 (fr) Procédé et appareil de détermination de paramètres de configuration ainsi que support d'enregistrement
WO2022077265A1 (fr) Procédé de configuration de transmission, appareil de configuration de transmission et support de stockage
WO2022082742A1 (fr) Procédé et dispositif d'entraînement de modèle, serveur, terminal et support d'informations
CN111566985B (zh) 传输处理方法、装置、用户设备、基站及存储介质
WO2023039722A1 (fr) Procédé de notification d'informations, dispositif notification d'informations et support de stockage

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20962090

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20962090

Country of ref document: EP

Kind code of ref document: A1