CN115174397B - Federal edge learning training method and system combining gradient quantization and bandwidth allocation

Federal edge learning training method and system combining gradient quantization and bandwidth allocation

Info

Publication number
CN115174397B
Authority
CN
China
Prior art keywords
node
quantization
representing
gradient
edge server
Prior art date
Legal status
Active
Application number
CN202210896876.3A
Other languages
Chinese (zh)
Other versions
CN115174397A (en)
Inventor
唐斌
阎昊
叶保留
Current Assignee
Hohai University HHU
Jiangsu Future Networks Innovation Institute
Original Assignee
Hohai University HHU
Jiangsu Future Networks Innovation Institute
Priority date
Filing date
Publication date
Application filed by Hohai University HHU, Jiangsu Future Networks Innovation Institute filed Critical Hohai University HHU
Priority to CN202210896876.3A priority Critical patent/CN115174397B/en
Publication of CN115174397A publication Critical patent/CN115174397A/en
Application granted granted Critical
Publication of CN115174397B publication Critical patent/CN115174397B/en


Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00: Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/08: Configuration management of networks or network elements
    • H04L41/0803: Configuration setting
    • H04L41/0823: Configuration setting characterised by the purposes of a change of settings, e.g. optimising configuration for enhancing reliability
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00: Machine learning
    • G06N20/20: Ensemble learning
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00: Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/08: Configuration management of networks or network elements
    • H04L41/0896: Bandwidth or capacity management, i.e. automatically increasing or decreasing capacities
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D30/00: Reducing energy consumption in communication networks
    • Y02D30/70: Reducing energy consumption in communication networks in wireless communication networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Computation (AREA)
  • Medical Informatics (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Artificial Intelligence (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)

Abstract

The invention discloses a federated edge learning training method and system that combine gradient quantization and bandwidth allocation. The method comprises the following steps: each node sends the parameter information of its own device to an edge server; the edge server evaluates the channel condition of each node from the parameter information sent by the nodes to obtain the channel gain of each uplink channel, solves an optimization problem based on the channel gains, transmit powers, sample counts, and computing capabilities, allocates the quantization bit numbers and bandwidth, and broadcasts the available quantization bit numbers and the global model to the nodes participating in federated learning; each node calculates a local update gradient from the global model and its local data, quantizes the local update gradient using the allocated quantization bit number, and sends the quantized local gradient to the edge server; the edge server aggregates the received gradients and updates the global model. The invention can effectively mitigate the impact of limited edge resources and heterogeneous devices on federated learning training, relieve the communication bottleneck, and improve training efficiency.

Description

Federal edge learning training method and system combining gradient quantization and bandwidth allocation
Technical Field
The invention relates to the field of distributed systems, and in particular to a federated edge learning training method and system that combine gradient quantization and bandwidth allocation.
Background
Federated edge learning (Federated Edge Learning) is becoming a popular distributed privacy-preserving machine learning framework in which multiple edge devices collaboratively train a machine learning model with the aid of an edge server. In federated edge learning, each edge device computes gradients of a global model on its local data and iteratively uploads them to the edge server for model updating. However, because the shared wireless spectrum is limited and the number of training parameters is large, federated edge learning often suffers from a severe communication bottleneck. An effective way to alleviate this bottleneck is quantization, i.e., using fewer bits to represent the gradient. How to use gradient quantization to reduce traffic and thereby lower the total training delay is therefore a key issue in federated edge learning. The total training delay is proportional to the number of training rounds required and the delay of each round. The former is closely related to the gradient quantization scheme, i.e., the quantization level of each edge device. The latter is determined jointly by the gradient quantization scheme and the bandwidth allocation scheme. The bandwidth allocation scheme specifies how the spectrum is shared among edge devices and is closely coupled with gradient quantization; for example, when an edge device uses a higher quantization level and therefore produces more traffic, it needs more bandwidth to shorten its transmission time. Meanwhile, the delay of each iteration is determined by the slowest edge device. It is therefore necessary to jointly optimize the bandwidth allocation and the gradient quantization choices of all edge devices.
Several schemes have been proposed that jointly optimize bandwidth allocation and gradient quantization in different settings. Some of them consider a stochastic quantization method based on the maximum value, in which the quantization level of each edge device depends on the dynamic range of its local gradient. Joint optimization can therefore only be carried out after all edge devices have computed their local gradients, which forces faster edge devices to wait for slower ones before transmission can start. In contrast, other works consider a quantization method based on the gradient modulus, in which the variance of the quantized gradient depends only on the quantization level, so the edge server can optimize the quantization bit allocation before each training round begins. Although these methods perform well, they share the following problems: (1) they consider only the convergence of model training and optimize the number of training rounds, ignoring the per-round delay; (2) all nodes use the same quantization level, which makes the federated learning result suboptimal.
Disclosure of Invention
Purpose of the invention: the main purpose of the invention is to relieve the communication bottleneck in edge federated learning, overcome the defects and shortcomings of the prior art, and provide a federated edge learning training method and system that jointly optimize gradient quantization and bandwidth allocation.
Technical scheme: to achieve the above purpose, the invention adopts the following technical scheme. A federated edge learning training method combining gradient quantization and bandwidth allocation comprises the following steps:
each node sends the parameter information of its own device to an edge server through an uplink channel;
the edge server estimates the channel gain and computing capability of each node from the uploaded parameter information, establishes an optimization problem that jointly minimizes the iteration time and the model convergence error based on the channel gain, transmit power, sample count, and computing capability of each node, and solves the optimization problem to obtain the quantization bit number and bandwidth allocated to each node;
the edge server broadcasts the available quantization bit numbers and the global model to the nodes participating in federated learning;
the node calculates a local update gradient according to the global model and the local data, and quantizes the local update gradient based on the allocated quantization bit number;
the node sends the quantized local update gradient to an edge server;
the edge server aggregates the received gradients and updates the global model; if the global model has converged, federated learning ends; otherwise, the above steps of federated learning are repeated from the beginning until the global model converges.
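For illustration, the sketch below runs these steps end to end as a toy Python simulation. It is a minimal sketch under stated assumptions rather than the patented implementation: the fixed q_bits stand in for the solution of the optimization problem (P1) defined below, the data are synthetic least-squares problems, and quantize_dequantize folds the node-side encoding and the server-side decoding of the uniform quantizer (detailed later) into one function.

import numpy as np

rng = np.random.default_rng(0)

def quantize_dequantize(v, q):
    """Uniform stochastic quantization to s = 2**q - 1 levels, with the
    server-side reconstruction folded in (the encoding is detailed later)."""
    s = 2 ** q - 1
    norm = np.linalg.norm(v)
    if norm == 0:
        return np.zeros_like(v)
    a = np.abs(v) / norm                          # normalized magnitudes in [0, 1]
    l = np.floor(a * s)                           # lower quantization level index
    zeta = l + (rng.random(v.shape) < a * s - l)  # stochastic rounding up or down
    return norm * np.sign(v) * zeta / s

d, M = 8, 4                                       # model size, number of nodes
w = np.zeros(d)                                   # global model
data = [(rng.normal(size=(20, d)), rng.normal(size=20)) for _ in range(M)]
n = np.array([len(y) for _, y in data], dtype=float)

for _ in range(100):                              # one pass = one federated round
    q_bits = [4] * M                              # placeholder for the (P1) solution
    grads = []
    for (X, y), q in zip(data, q_bits):
        g = 2 * X.T @ (X @ w - y) / len(y)        # local update gradient
        grads.append(quantize_dequantize(g, q))   # node quantizes, server decodes
    w -= 0.05 * np.average(grads, axis=0, weights=n / n.sum())  # aggregate + update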
Further, the parameter information sent by each node to the edge server includes: CPU frequency, sample number, transmit power, and device location, wherein CPU frequency and sample number are used to estimate the time required for node local computation, and node location and transmit power are used to estimate the transmission capability of the node.
Further, the optimization problem is as follows:

(P1): min σ²·t_round

subject to constraints requiring that the per-round time t_round cover each node's computation and transmission time, that each node's energy consumption not exceed its energy limit E_m, that the allocated bandwidths not exceed the total uplink bandwidth B, and that each quantization bit number q_m be a positive integer not exceeding q̄;

where σ² denotes the convergence error term of the model, N the total number of nodes, N_m the number of samples of node m, s_m the quantization level of node m, Z the upper bound on the gradient modulus, d the number of model parameters, t_round the time required for one iteration, t_m^cmp the time required for the computation of node m, q_m the number of quantization bits of node m (related to the quantization level by s_m = 2^q_m - 1), P_m the transmit power of node m, h_m the channel gain of node m, b_m the bandwidth allocated to node m, E_m the energy limit of node m, N_0 the Gaussian white noise of the channel, E_m^cmp the energy required by the computation phase of node m, M the number of nodes participating in training, B the total bandwidth of the uplink channel, q̄ the default number of bits used to represent an element in the system, and N⁺ the set of positive integers.
Further, the node quantizing the local update gradient based on the quantization bit number includes: the node receives the quantization bit number allocation information, looks up the quantization bit number allocated to it by node number at the front end of each quantization bit number allocation unit in the quantization bit number allocation information block, and stores the quantization bit number in the storage unit; the node quantizes the update gradient of the local model using a uniform quantization method with the quantization bit number stored in the storage unit.
Further, the uniform quantization method includes: dividing the total quantization interval uniformly to obtain sub-quantization intervals, the number of which is the quantization level, where the quantization level s and the quantization bit number q satisfy s = 2^q - 1; when the value to be quantized falls within a certain sub-quantization interval, it is quantized to the left or right end point of that interval according to a specified quantization rule.
Further, the specified quantization rule is: the update gradient is flattened into a vector, and the codebook value that each component of the vector maps to is:

Q_s(v_i) = ||v||·sgn(v_i)·ξ_i(v, s),  i = 1, 2, …, n

where Q_s(v_i) is the quantized value of the i-th component of the flattened gradient vector v, sgn(v_i) is the sign of the i-th component of v, and ξ_i(v, s) is an independent random variable whose value is chosen according to the following probabilities:

ξ_i(v, s) = l/s with probability 1 - p(a, s), and (l + 1)/s with probability p(a, s)

where the probability p(a, s) = a·s - l, a = |v_i|/||v||, and l is a non-negative integer not exceeding s such that a ∈ [l/s, (l + 1)/s].

Further, the node represents the quantized update gradient by a triple (||v||_2, τ, ζ) and transmits the information of the triple to the edge server, where the triple specifically comprises: the modulus ||v||_2 of the vector, the sign vector τ of the vector components in their original order, and the vector ζ formed by the integers of the component mapping, where ζ_i = s·ξ_i(v, s).
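To make the mapping concrete, here is a small sketch of this quantizer and its decoding, written as an assumed QSGD-style implementation of the formulas above; the function names and the closing unbiasedness check are illustrative additions, not part of the patent.

import numpy as np

rng = np.random.default_rng(0)

def quantize(v, q):
    """Encode vector v as the triple (||v||_2, tau, zeta), with s = 2**q - 1 levels."""
    s = 2 ** q - 1
    norm = float(np.linalg.norm(v))
    tau = np.sign(v).astype(int)                  # sign vector, in original order
    if norm == 0.0:
        return norm, tau, np.zeros(v.shape, dtype=int)
    a = np.abs(v) / norm                          # a_i = |v_i| / ||v||, in [0, 1]
    l = np.floor(a * s)                           # lower level index, 0 <= l <= s
    p = a * s - l                                 # p(a, s) = a*s - l, round-up probability
    zeta = (l + (rng.random(v.shape) < p)).astype(int)  # zeta_i = s * xi_i(v, s)
    return norm, tau, zeta

def dequantize(norm, tau, zeta, q):
    """Q_s(v_i) = ||v|| * sgn(v_i) * zeta_i / s."""
    s = 2 ** q - 1
    return norm * tau * zeta / s

# The quantizer is unbiased: averaging many independent quantizations recovers v.
v = np.array([0.3, -1.2, 0.45, 2.0])
est = np.mean([dequantize(*quantize(v, 3), 3) for _ in range(20000)], axis=0)
assert np.allclose(est, v, atol=0.05)

With q = 3 bits, s = 7, so each component is described by its sign plus an integer in {0, ..., 7} instead of a full 32-bit value, while the modulus ||v||_2 is sent once for the whole vector.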
The invention also provides a federated learning system comprising an edge server and a plurality of nodes, wherein each node is configured to send the parameter information of its own device to the edge server through an uplink channel, receive the quantization bit number and the global model broadcast by the edge server, calculate a local update gradient according to the global model and the local data, quantize the local update gradient based on the quantization bit number, and send the quantized local update gradient to the edge server;
the edge server is configured to estimate the channel gain and computing capability of each node from the uploaded parameter information, establish an optimization problem that jointly minimizes the iteration time and the model convergence error based on the channel gain, transmit power, sample count, and computing capability of each node and solve it to obtain the quantization bit number and bandwidth allocated to each node, broadcast the available quantization bit numbers and the global model to the nodes participating in federated learning, receive the quantized local update gradients of the nodes, aggregate the received gradients, and update the global model until the global model converges.
Compared with the prior art, the invention has the following advantages and beneficial effects: (1) the invention makes full use of the interactivity between the edge server and the nodes in federated learning, so that by obtaining the computation times of the nodes and the quality of the uplink channels, the edge server can apply uniform quantization with adaptively adjusted quantization bit numbers to the update gradients uploaded by the nodes, fully exploiting the flexibility of device-edge cooperation; (2) with the uniform quantization method adopted by the invention, instead of the whole gradient vector only a triple containing the gradient vector information needs to be transmitted, so the amount of data uploaded by the nodes after quantization is significantly reduced while the quantization variance remains small; (3) the invention has good potential for extension: an efficient coding scheme can further reduce the amount of data uploaded after quantization, and a sparsification step can be added after quantization as needed to achieve better data compression; (4) the invention requires no additional devices in the original federated learning system, which effectively reduces the system deployment cost; it is simple and effective, and the interaction cost between nodes is small.
Drawings
FIG. 1 is a schematic diagram of the federated learning system of the present invention;
FIG. 2 is a general flow chart of the federated learning training method of the present invention;
FIG. 3 shows a quantization bit number allocation information block obtained by performing quantization bit number allocation according to the present invention;
FIG. 4 is a schematic diagram of the quantization of elements in the gradient according to the present invention;
FIG. 5 is a flowchart of an embodiment of federated learning according to the present invention.
Detailed Description
The technical scheme of the invention is further described below with reference to the accompanying drawings.
Fig. 1 is a schematic diagram of the federated learning system of the present invention. The system includes an edge server and a plurality of nodes, where the edge server is deployed at a base station and comprises a computing unit, a storage unit, and a sending unit. Each node comprises a quantization unit, a storage unit, a training unit, and a sending unit. The edge server and the nodes cooperate to complete a given task; the raw data required by the task is distributed across the nodes, and the edge server cannot access the raw data. The node and the edge server exchange local model updates and global model updates: a node updates the local model using its local raw data and the global model broadcast by the edge server, and the edge server updates the global model using the local updates uploaded by the nodes.
In this embodiment, since a node is typically a portable mobile smart device such as a mobile phone or a smart watch, whose battery is limited in size and cannot provide sufficient resources to send local model updates of enormous data scale, the federated edge learning training framework provided by the invention, which jointly optimizes gradient quantization and bandwidth allocation, takes the energy limits of the devices into account and can be used to relieve the communication bottleneck that arises when the edge server and the nodes interact in a federated learning system.
FIG. 2 is a flow chart of the federated edge learning training method combining gradient quantization and bandwidth allocation according to the present invention. The method mainly comprises the following steps:
1) The nodes upload parameter information, and the edge server estimates the channel gain of each node;
the nodes participating in federal learning send device parameter information to the edge server, wherein the parameters comprise information such as CPU frequency, sample number, physical position, sending power and the like of the device. Wherein the CPU frequency and the number of samples are used to estimate the time required for local computation, and the node location and transmit power are used to estimate the transmission capability of the node.
The edge server receives the device information sent by the nodes and stores the device information of each node in its storage unit in the form of a queue. Each time, the edge server fetches the device information of one node from the queue and estimates the channel gain in the computing unit from o_m, the Rayleigh fading parameter, and d_m, the distance between node m and the base station, i.e., the edge server. The node numbers whose channel gain estimation is complete are copied, the channel gain units of all nodes are concatenated end to end to obtain the channel gain information of all nodes, and the result is stored in the storage unit.
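As an illustration, a standard way to combine these two quantities is a path-loss model with Rayleigh fading. The source text names only the symbols o_m and d_m, so the multiplicative form and the path-loss exponent alpha in the sketch below are assumptions:

def estimate_channel_gain(o_m: float, d_m: float, alpha: float = 3.5) -> float:
    """Sketch of channel-gain estimation: Rayleigh fading times distance path loss.

    o_m:   Rayleigh fading parameter of node m
    d_m:   distance between node m and the base station (edge server), in meters
    alpha: path-loss exponent (assumed value; not specified in the source text)
    """
    return o_m * d_m ** (-alpha)

# Example: a node 100 m away with fading parameter 0.8
h_m = estimate_channel_gain(0.8, 100.0)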
2) The edge server solves the optimization problem using the parameters of the nodes, allocates the quantization bit number and bandwidth that each node may use, and broadcasts the global model and the quantization bit number allocation;
according to shannon's theorem, it can be inferred that the larger the transmission power of a node is, the larger the channel gain is, and the stronger the transmission capability is, but the closed solution of the transmission capability cannot be obtained from shannon's formula, and only estimation is possible. In the invention, according to the formulaEstimating the time required by node local calculation, wherein N is in the formula m Representing the number of samples owned by a node, F m Representing node CPU frequency, gamma representing one sample per node processingThe number of CPU cycles required.
The channel gains and the node information (including transmit power) in the storage unit are used to estimate the transmission capability of each node. According to Shannon's formula, the transmission rate r_m of node m is expressed as:

r_m = b_m · log2(1 + P_m·h_m / (N_0·b_m))

where b_m denotes the bandwidth allocated to node m, P_m the transmit power of node m, and N_0 the Gaussian white noise of the channel.
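Putting the two estimates together, a node's per-round time is its computation time plus its transmission time. The sketch below assumes the rate formula above and a payload of roughly d·q_m bits for a d-dimensional gradient quantized to q_m bits per component; the exact payload accounting (norm, signs, entropy coding) is an assumption here:

import math

def compute_time(gamma: float, n_m: int, f_m: float) -> float:
    """t_cmp = gamma * N_m / F_m: cycles per sample times samples, over CPU frequency."""
    return gamma * n_m / f_m

def transmission_rate(b_m: float, p_m: float, h_m: float, n0: float) -> float:
    """Shannon rate r_m = b_m * log2(1 + P_m * h_m / (N_0 * b_m))."""
    return b_m * math.log2(1.0 + p_m * h_m / (n0 * b_m))

def round_time(gamma, n_m, f_m, b_m, p_m, h_m, n0, d, q_m):
    """Computation time plus the time to upload about d * q_m bits of gradient."""
    payload_bits = d * q_m                        # assumed payload of the triple
    rate = transmission_rate(b_m, p_m, h_m, n0)
    return compute_time(gamma, n_m, f_m) + payload_bits / rate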
Based on the estimated channel gain information and the local computation times of the nodes, the edge server substitutes them into the optimization problem, in which the parameters are unknown but the model is determined, solves for the quantization bit number and bandwidth allocated to each node to obtain the quantization bit number allocation units, concatenates the units end to end into a quantization bit number allocation information block, and stores and transmits it in queue form. The computing unit of the edge server is responsible for solving the optimization problem and aggregating the received gradients, the storage unit stores the parameter information of the nodes and the quantization bit number allocation information blocks, and the sending unit broadcasts data, such as the quantization bit number allocation information blocks and the global model, to the nodes.
Fig. 3 shows a quantization bit number allocation information block obtained after quantization bit number allocation. The node number is a serial number used by the edge server to distinguish different nodes.
According to an embodiment of the present invention, the optimization problem is specifically:
(P1): min σ²·t_round

where σ² denotes the convergence error term of the model, N the total number of nodes, N_m the number of samples of node m, s_m the quantization level of node m, Z the upper bound on the gradient modulus, and d the number of model parameters. t_round denotes the time required for one iteration, P_m the transmit power of node m, h_m the channel gain of node m, b_m the bandwidth allocated to node m, E_m the energy limit of node m, q_m the number of quantization bits of node m, t_m^cmp the time required for the computation of node m, N_0 the Gaussian white noise of the channel, and E_m^cmp the energy required by the computation phase of node m. The invention assumes that the power of a node is fixed during the computation phase, so the energy E_m^cmp required by the computation phase is proportional to the computation time t_m^cmp and can be calculated from the CPU frequency and the computation time, with γ_m denoting a coefficient related to the system architecture. M denotes the number of nodes participating in training, B the total bandwidth of the uplink channel, q̄ the default number of bits used to represent an element in the system, typically 32 bits, and N⁺ the set of positive integers.
The quantization bit numbers and the bandwidths are the unknown parameters; together, the bandwidth and the quantization bit number determine the time a node needs to transmit its gradient.
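The source specifies the structure of the problem but not a solution algorithm, so the following is only one plausible inner step, assuming the quantization bit numbers q_m (and hence the per-node payloads) are already fixed: split the bandwidth so that all nodes finish at the same time, found by nested bisection, which works because the Shannon rate above is increasing in b_m. This equal-finish-time heuristic and all numbers in the example are assumptions, not the patent's solver.

import math

def min_bandwidth_for_deadline(bits, p, h, n0, t_cmp, deadline, b_max=1e9):
    """Smallest bandwidth letting a node push `bits` within `deadline - t_cmp`.
    The rate b*log2(1 + P*h/(N0*b)) is increasing in b, so bisection applies;
    if even b_max is not enough, the sketch just returns roughly b_max."""
    t_tx = deadline - t_cmp
    if t_tx <= 0:
        return math.inf
    lo, hi = 1e-9, b_max
    for _ in range(60):
        mid = (lo + hi) / 2
        rate = mid * math.log2(1.0 + p * h / (n0 * mid))
        if rate * t_tx < bits:
            lo = mid                              # too slow, need more bandwidth
        else:
            hi = mid                              # enough, try less bandwidth
    return hi

def allocate_bandwidth(bits, p, h, n0, t_cmp, B):
    """Equal-finish-time heuristic: bisect on a common deadline t so that the
    per-node bandwidths needed to meet t sum to at most the total bandwidth B."""
    lo, hi = max(t_cmp), max(t_cmp) + 1e6
    need = []
    for _ in range(60):
        t = (lo + hi) / 2
        need = [min_bandwidth_for_deadline(k, pi, hm, n0, tc, t)
                for k, pi, hm, tc in zip(bits, p, h, t_cmp)]
        if sum(need) > B:
            lo = t                                # deadline too tight, relax it
        else:
            hi = t                                # feasible, try a tighter deadline
    return need, hi

# Illustrative numbers only: two heterogeneous nodes sharing 10 MHz.
b, t = allocate_bandwidth(bits=[1e6, 2e6], p=[0.1, 0.2], h=[1e-6, 5e-7],
                          n0=1e-12, t_cmp=[0.5, 1.0], B=1e7)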
3) The node performs local training several times using the received global model and the locally stored data to obtain updated local gradient information, and then quantizes the updated local gradient according to the allocated quantization bit number;
the node receives the quantized bit number allocation information block, and searches the quantized bit number allocated by the node according to the node serial number at the front end of each quantized bit number allocation unit in the quantized bit number allocation information block; the nodes quantize the update gradient of the local model using a uniform quantization method using the number of quantization bits stored in the storage unit. The storage unit of the node is responsible for storing training data and a global model, the training unit is responsible for training the global model by using the data to obtain gradients, the quantization unit is responsible for quantizing the trained gradients, and the sending unit is responsible for sending the quantized gradients to the edge server.
According to an embodiment of the invention, the sub-quantization intervals in the uniform quantization method are obtained by dividing the total quantization interval uniformly; when the value to be quantized falls within a certain sub-quantization interval, it is quantized to the left or right end point of that interval according to a certain quantization rule. The specific quantization rule of the uniform quantization method is as follows: the update gradient is flattened into a vector; the gradient is a multidimensional matrix, for example a gradient of shape 3 x 4 x 5 unfolds into a 60 x 1 vector, and one element of the vector is a component. The codebook value that each component of the vector maps to is:

Q_s(v_i) = ||v||·sgn(v_i)·ξ_i(v, s),  i = 1, 2, …, n

where Q_s(v_i) is the quantized value of the i-th component of the flattened gradient vector v, sgn(v_i) is the sign of the i-th component of v, and ξ_i(v, s) is an independent random variable that takes its value according to the following probabilities:

ξ_i(v, s) = l/s with probability 1 - p(a, s), and (l + 1)/s with probability p(a, s)

where the probability p(a, s) = a·s - l, l is a non-negative integer less than or equal to s, and a = |v_i|/||v|| is the normalized magnitude of the component, which always lies in [0, 1]. In the invention, the quantization level is the number of sub-quantization intervals, and the quantization level s and the quantization bit number q satisfy s = 2^q - 1, or equivalently q = log2(s + 1); either value determines the other.

After quantization by the uniform quantization method, the quantized update gradient is represented by the triple (||v||_2, τ, ζ), and only the information of this triple needs to be transmitted. Its specific content is: the modulus ||v||_2 of the vector, the sign vector τ of the vector components in their original order, and the vector ζ formed by the integers of the component mapping, where ζ_i = s·ξ_i(v, s) and ξ_i denotes the i-th component of the vector ξ.
The quantization method is illustrated schematically in fig. 4, which shows the quantization of one component of a vector. After the component is divided by the vector modulus, every component is guaranteed to fall within the interval [0, 1], so quantization reduces to choosing between the two nearest quantization levels in [0, 1]; the probability of rounding to one end point of the sub-interval is proportional to the distance from the value to the opposite end point. Fig. 4 shows the case of quantization level 4 with a component-to-modulus ratio of 0.6: by the formula above, the value is quantized to 2/4 = 0.5 with probability 0.6 and to 3/4 = 0.75 with probability 0.4.
4) The node uploads the quantized local update gradient to the edge server through an independent, mutually non-interfering uplink channel;
To ensure that the signals transmitted in the sub-channels do not interfere with each other, guard bands are set between the sub-channels. Illustratively, the uplink uses frequency-division multiplexed channels; frequency-division multiplexing requires the total frequency width to be no less than the sum of the individual sub-channel widths plus the guard bands.
5) The edge server receives the quantized update gradients, aggregates them, and uses the aggregated gradient to update the global model on the edge server.
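A minimal server-side sketch of this step, assuming each node uploads the triple (||v||_2, τ, ζ) described above together with its bit width q, and that aggregation is a sample-count-weighted average (the usual federated choice; the exact weights are an assumption here):

import numpy as np

def dequantize(norm, tau, zeta, q):
    """Reconstruct the gradient estimate from a received triple:
    Q_s(v_i) = ||v|| * sgn(v_i) * zeta_i / s, with s = 2**q - 1."""
    s = 2 ** q - 1
    return norm * tau * zeta / s

def aggregate_and_update(w, uploads, n_samples, lr=0.1):
    """uploads: list of (norm, tau, zeta, q) tuples received from the nodes;
    n_samples: per-node sample counts N_m used as aggregation weights."""
    grads = [dequantize(norm, tau, zeta, q) for norm, tau, zeta, q in uploads]
    weights = np.asarray(n_samples, dtype=float)
    g = np.average(grads, axis=0, weights=weights / weights.sum())
    return w - lr * g                             # gradient step on the global model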
The invention makes full use of the interactivity between the edge server and the nodes in federated learning: by obtaining the computation times of the nodes and the quality of the uplink channels, the edge server can apply uniform quantization with adaptively adjusted quantization bit numbers to the update gradients uploaded by the nodes, fully exploiting the flexibility of device-edge cooperation. The invention simultaneously considers the convergence of model training and the per-round computation and communication delay, assigns different quantization levels to different nodes, achieves the best trade-off between the number of training rounds and the per-round delay, and remedies the shortcomings of prior work.
FIG. 5 shows the specific process of federated learning in one embodiment, comprising the following steps: (1) the edge server selects the nodes participating in the current federated learning iteration; (2) the nodes upload their device parameter information; (3) if this is the first global iteration, the edge server initializes the model and broadcasts the initialized global model to the nodes participating in this round of training; otherwise, it broadcasts the updated global model obtained after the previous global iteration; (4) the edge server estimates the channel gains of the nodes and allocates the quantization bit number each node may use; (5) the edge server broadcasts the quantization bit number allocation and the global model; (6) each node obtains the global model and its allocated quantization bit number from the broadcast information; (7) the node performs one or more rounds of local training using the received global model and its locally stored data to obtain updated local gradient information; (8) the node quantizes the updated local gradient information with the allocated quantization bit number to obtain the quantized local update gradient for uploading; (9) the node uploads the quantized local update gradient to the edge server through an independent, mutually non-interfering uplink channel; (10) the edge server receives the actual update gradients, perturbed by the node channels, aggregates them, and uses the aggregated gradient to update the global model on the edge server; if the global model has converged, federated learning ends; otherwise, the steps above are repeated from the beginning until the global model converges.
In the invention, the edge server substitutes the parameter information uploaded by the nodes into the optimization problem model, computes the quantization bit number q and the bandwidth b of each node and sends them to the node; the node determines the quantization level s from the relation between the quantization bit number and the quantization level, determines the quantized values from s and the flattened update-gradient vector v to form the triple, and then uploads the triple to the edge server using the bandwidth b.

Claims (6)

1. A federated edge learning training method combining gradient quantization and bandwidth allocation, characterized by comprising the following steps:
each node sends the parameter information of its own device to an edge server through an uplink channel;
the edge server estimates the channel gain and computing capability of each node from the uploaded parameter information, establishes an optimization problem that jointly minimizes the iteration time and the model convergence error based on the channel gain, transmit power, sample count, and computing capability of each node, and solves the optimization problem to obtain the quantization bit number and bandwidth allocated to each node, wherein the optimization problem is as follows:

Problem P1: min σ²·t_round

Constraint conditions: the per-round time t_round covers each node's computation and transmission time, each node's energy consumption does not exceed its energy limit E_m, the allocated bandwidths do not exceed the total uplink bandwidth B, and each quantization bit number q_m is a positive integer not exceeding q̄;

where σ² denotes the convergence error term of the model, N the total number of nodes, N_m the number of samples of node m, s_m the quantization level of node m, Z the upper bound on the gradient modulus, d the number of model parameters, t_round the time required for one iteration, t_m^cmp the time required for the computation of node m, q_m the number of quantization bits of node m (related to the quantization level by s_m = 2^q_m - 1), P_m the transmit power of node m, h_m the channel gain of node m, b_m the bandwidth allocated to node m, E_m the energy limit of node m, N_0 the Gaussian white noise of the channel, E_m^cmp the energy required by the computation phase of node m, M the number of nodes participating in training, B the total bandwidth of the uplink channel, q̄ the default number of bits used to represent an element in the system, and N⁺ the set of positive integers;
the edge server broadcasts the available quantization bit numbers and the global model to the nodes participating in federated learning;
the node calculates a local update gradient according to the global model and the local data and quantizes the local update gradient based on the quantization bit number, wherein the total quantization interval is divided uniformly to obtain sub-quantization intervals, the number of which is the quantization level, where the quantization level s and the quantization bit number q satisfy s = 2^q - 1; when the value to be quantized falls within a certain sub-quantization interval, it is quantized to the left or right end point of that interval according to a specified quantization rule;
the node sends the quantized local update gradient to an edge server;
the edge server aggregates the received gradients and updates the global model; if the global model has converged, federated learning ends; otherwise, the above steps of federated learning are repeated from the beginning until the global model converges.
2. The method of claim 1, wherein the parameter information sent by each node to the edge server comprises: CPU frequency, sample number, transmit power, and device location, wherein CPU frequency and sample number are used to estimate the time required for node local computation, and node location and transmit power are used to estimate the transmission capability of the node.
3. The method of claim 1, wherein the node quantizing the local update gradient based on the quantization bit number comprises: the node receives the quantization bit number allocation information, looks up the quantization bit number allocated to it by node number at the front end of each quantization bit number allocation unit in the quantization bit number allocation information block, and stores the quantization bit number in the storage unit; the node quantizes the update gradient of the local model using a uniform quantization method with the quantization bit number stored in the storage unit.
4. The method of claim 1, wherein the specified quantization rule is: the update gradient is flattened into a vector, and the codebook value that each component of the vector maps to is:

Q_s(v_i) = ||v||·sgn(v_i)·ξ_i(v, s),  i = 1, 2, …, n

where Q_s(v_i) is the quantized value of the i-th component of the flattened gradient vector v, sgn(v_i) is the sign of the i-th component of v, and ξ_i(v, s) is an independent random variable whose value is chosen according to the following probabilities:

ξ_i(v, s) = l/s with probability 1 - p(a, s), and (l + 1)/s with probability p(a, s)

where the probability p(a, s) = a·s - l, a = |v_i|/||v||, and l is a non-negative integer less than or equal to s.
5. The method of claim 4, wherein the node represents the quantized update gradient by a triple (||v||_2, τ, ζ) and transmits the information of the triple to the edge server, the triple specifically comprising: the modulus ||v||_2 of the vector, the sign vector τ of the vector components in their original order, and the vector ζ formed by the integers of the component mapping, where ζ_i = s·ξ_i(v, s) and ξ_i denotes the i-th component of the vector ξ.
6. A federated learning system comprising an edge server and a plurality of nodes, characterized in that each node is configured to send the parameter information of its own device to the edge server through an uplink channel, receive the quantization bit number and the global model broadcast by the edge server, calculate a local update gradient according to the global model and the local data, quantize the local update gradient based on the quantization bit number, and send the quantized local update gradient to the edge server, wherein the total quantization interval is divided uniformly to obtain sub-quantization intervals, the number of which is the quantization level, where the quantization level s and the quantization bit number q satisfy s = 2^q - 1; when the value to be quantized falls within a certain sub-quantization interval, it is quantized to the left or right end point of that interval according to a specified quantization rule;
the edge server is configured to estimate the channel gain and computing capability of each node from the uploaded parameter information, establish an optimization problem that jointly minimizes the iteration time and the model convergence error based on the channel gain, transmit power, sample count, and computing capability of each node and solve it to obtain the quantization bit number and bandwidth allocated to each node, broadcast the available quantization bit numbers and the global model to the nodes participating in federated learning, receive the quantized local update gradients of the nodes, aggregate the received gradients, and update the global model until the global model converges;
the optimization problem established by the edge server is as follows:

Problem P1: min σ²·t_round

Constraint conditions: the per-round time t_round covers each node's computation and transmission time, each node's energy consumption does not exceed its energy limit E_m, the allocated bandwidths do not exceed the total uplink bandwidth B, and each quantization bit number q_m is a positive integer not exceeding q̄;

where σ² denotes the convergence error term of the model, N the total number of nodes, N_m the number of samples of node m, s_m the quantization level of node m, Z the upper bound on the gradient modulus, d the number of model parameters, t_round the time required for one iteration, t_m^cmp the time required for the computation of node m, q_m the number of quantization bits of node m, P_m the transmit power of node m, h_m the channel gain of node m, b_m the bandwidth allocated to node m, E_m the energy limit of node m, N_0 the Gaussian white noise of the channel, E_m^cmp the energy required by the computation phase of node m, M the number of nodes participating in training, B the total bandwidth of the uplink channel, q̄ the default number of bits used to represent an element in the system, and N⁺ the set of positive integers.
CN202210896876.3A 2022-07-28 2022-07-28 Federal edge learning training method and system combining gradient quantization and bandwidth allocation Active CN115174397B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210896876.3A CN115174397B (en) 2022-07-28 2022-07-28 Federal edge learning training method and system combining gradient quantization and bandwidth allocation

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210896876.3A CN115174397B (en) 2022-07-28 2022-07-28 Federal edge learning training method and system combining gradient quantization and bandwidth allocation

Publications (2)

Publication Number Publication Date
CN115174397A CN115174397A (en) 2022-10-11
CN115174397B true CN115174397B (en) 2023-10-13

Family

ID=83497319

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210896876.3A Active CN115174397B (en) 2022-07-28 2022-07-28 Federal edge learning training method and system combining gradient quantization and bandwidth allocation

Country Status (1)

Country Link
CN (1) CN115174397B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115830709A (en) * 2022-11-23 2023-03-21 深圳市大数据研究院 Action recognition method based on federal edge learning, server and electronic equipment

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2021121585A1 (en) * 2019-12-18 2021-06-24 Telefonaktiebolaget Lm Ericsson (Publ) Methods for cascade federated learning for telecommunications network performance and related apparatus
CN113723620A (en) * 2020-05-25 2021-11-30 株式会社日立制作所 Terminal scheduling method and device in wireless federal learning
CN113098806A (en) * 2021-04-16 2021-07-09 华南理工大学 Method for compressing cooperative channel adaptability gradient of lower end in federated learning
CN113315604A (en) * 2021-05-25 2021-08-27 电子科技大学 Adaptive gradient quantization method for federated learning
CN113435604A (en) * 2021-06-16 2021-09-24 清华大学 Method and device for optimizing federated learning
CN114599096A (en) * 2022-02-17 2022-06-07 南京邮电大学 Mobile edge calculation unloading time delay optimization method and device and storage medium

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
屈志昊, 叶保留, 陈贵海, 唐斌. Research progress on resource optimization technology for edge computing. 《大数据》 (Big Data). pp. 17-28. *

Also Published As

Publication number Publication date
CN115174397A (en) 2022-10-11

Similar Documents

Publication Publication Date Title
CN113139662B (en) Global and local gradient processing method, device, equipment and medium for federal learning
CN109951869B (en) Internet of vehicles resource allocation method based on cloud and mist mixed calculation
CN113098806B (en) Method for compressing cooperative channel adaptability gradient of lower end in federated learning
CN111918339B (en) AR task unloading and resource allocation method based on reinforcement learning in mobile edge network
CN111314889A (en) Task unloading and resource allocation method based on mobile edge calculation in Internet of vehicles
CN110167176B (en) Wireless network resource allocation method based on distributed machine learning
CN114142907B (en) Channel screening optimization method and system for communication terminal equipment
CN105379412A (en) System and method for controlling multiple wireless access nodes
CN115174397B (en) Federal edge learning training method and system combining gradient quantization and bandwidth allocation
CN114885426B (en) 5G Internet of vehicles resource allocation method based on federal learning and deep Q network
WO2023179010A1 (en) User packet and resource allocation method and apparatus in noma-mec system
CN115796271A (en) Federal learning method based on client selection and gradient compression
CN114051222A (en) Wireless resource allocation and communication optimization method based on federal learning in Internet of vehicles environment
CN111556576B (en) Time delay optimization method based on D2D _ MEC system
CN110418143B (en) SVC video transmission method in Internet of vehicles
CN105634658A (en) Transmission processing method, equipment and system
WO2024050659A1 (en) Federated learning lower-side cooperative channel adaptive gradient compression method
CN117580063A (en) Multi-dimensional resource collaborative management method in vehicle-to-vehicle network
CN108463953A (en) Nonlinear precoding based on statistical CSI-T
CN107249213B (en) A kind of maximized power distribution method of D2D communication Intermediate Frequency spectrum efficiency
Foukalas Federated-learning-driven radio access networks
Lin et al. Channel-adaptive quantization for wireless federated learning
CN109089245B (en) Method and device for communication between devices considering lowest transmission rate of cellular user
CN110611698A (en) Flexible cooperative transmission method and system based on random edge cache and realistic conditions
CN114422789B (en) Panoramic video wireless transmission method and system based on non-compression technology

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant