CN114584436B - Message aggregation system and method in concurrent communication network of single handshake - Google Patents


Info

Publication number: CN114584436B (granted from application CN202210483218.1A; earlier publication CN114584436A)
Authority: CN (China)
Prior art keywords: quantization, module, modulation, codebook, gradient
Legal status: Active
Other languages: Chinese (zh)
Inventors: 高镇, 乔力, 梅逸堃, 应科柯, 郑德智
Current assignee: Beijing Institute Of Technology Measurement And Control Technology Co ltd
Original assignee: Beijing Institute of Technology BIT
Application filed by Beijing Institute of Technology BIT; priority to CN202210483218.1A

Classifications

    • H04L 25/03343 — Electricity; electric communication technique; transmission of digital information; baseband systems; arrangements for removing intersymbol interference at the transmitter end
    • G06F 18/214 — Physics; computing; electric digital data processing; pattern recognition; generating training patterns; bootstrap methods, e.g. bagging or boosting
    • G06N 3/08 — Physics; computing arrangements based on specific computational models; computing arrangements based on biological models; neural networks; learning methods
    • Y02D 30/70 — Climate change mitigation technologies in information and communication technologies; reducing energy consumption in wireless communication networks


Abstract

The invention discloses a message aggregation system and method in a concurrent communication network with a single handshake, belonging to the technical field of data transmission in communication networks. The system comprises a transmitting end and a receiving end: the transmitting end comprises a downlink channel estimation and synchronization module, a local training module, a quantization module, a codebook modulation module and a pre-equalization module; the receiving end comprises a signal detection module, a gradient aggregation module, an averaging module and a model updating module. The method comprises a transmitting-end signal processing process, an uplink transmission process and a receiving-end signal processing process. The invention exploits the characteristic that federated learning only needs the result of gradient message aggregation and does not need to compute each user's message individually: the multiple users adopt a common quantization codebook and a common modulation codebook and transmit their respective gradient messages in the uplink simultaneously on the same frequency, thereby achieving efficient aggregation of the multi-user gradient messages and reducing the communication resource overhead of federated learning.

Description

Message aggregation system and method in concurrent communication network of single handshake
Technical Field
The invention relates to the field of data transmission in communication networks, and in particular to a message aggregation system and method in a concurrent communication network with a single handshake.
Background
Traditional machine learning centralizes user data at a central node and performs centralized learning with massive computing resources. However, centralized machine learning risks leaking private user data and faces the high overhead of transmitting massive data. As user terminals become increasingly capable, distributed machine learning has become feasible, overcoming these drawbacks.
Federated learning is a typical distributed machine learning framework in which a central node and multiple users jointly train a neural network through repeated message interactions. In each interaction round, the users train on their respective local data sets to obtain local gradients and send the local gradient information to the central node; the central node aggregates the gradient information of the multiple users into a global gradient, completes the model update, and feeds the updated model parameters back to all users, after which the next round of local training and message interaction begins. Because the number of users and the dimensionality of the gradient vector are both very large, the message interaction process of federated learning imposes a heavy burden on the communication network. How to implement message interaction with low communication overhead is therefore a key problem to be solved urgently.
Disclosure of Invention
In view of this, the present invention provides a message aggregation system and method in a concurrent communication network with single handshake, which can effectively reduce the communication resource overhead of federal learning.
The technical scheme for realizing the invention is as follows:
a message aggregation system in a concurrent communication network of single handshake comprises a transmitting terminal and a receiving terminal;
the transmitting terminal comprises a downlink channel estimation and synchronization module, a local training module, a quantization module, a codebook modulation module and a pre-equalization module;
the downlink channel estimation and synchronization module is used for performing downlink channel estimation and time synchronization according to downlink broadcast signals from the central node to multiple users;
the local training module is used for performing neural network training on each user according to local data to obtain respective local gradients of the multiple users;
the quantization module is used for quantizing the local gradient of each user according to the quantization codebook to obtain quantization code words and quantization indexes of the quantization code words in the quantization codebook;
the codebook modulation module is used for carrying out codebook modulation on the quantization code words output by the quantization module according to a modulation codebook to obtain a modulation code word corresponding to each quantization code word;
particularly, all users adopt the same quantization codebook and the same modulation codebook, and the modulation code words in the modulation codebook correspond one-to-one to the quantization code words in the quantization codebook;
the pre-equalization module is used for pre-equalizing the modulation code words before each user sends the modulation code words according to the downlink channel estimation values to obtain sending signals;
and the receiving end carries out multi-user signal transmission detection to obtain the number of times each modulation code word was transmitted, namely the number of times each quantization code word appears, then carries out gradient aggregation and averaging to finally obtain a global gradient, and completes the model update.
Furthermore, the quantization mode of the quantization module is scalar quantization or vector quantization; with scalar quantization the dimension of the local gradient vector is unchanged, while with vector quantization the dimension of the local gradient vector is compressed.
Further, the modulation code words are transmitted on pre-allocated time-frequency resources, and all users are allocated the same time-frequency resources; each modulation code word is a vector containing a plurality of scalar elements, and because the channels corresponding to different subcarriers (frequency-domain resources) differ, each element in a modulation code word is pre-equalized according to the channel of the subcarrier on which it is located.
Further, the system considers the time division duplex case, in which the uplink and downlink channels have reciprocity, so that the uplink transmission is pre-equalized according to the downlink channel estimate.
Further, the receiving end comprises a signal detection module, a gradient aggregation module, an averaging module and a model updating module;
the signal detection module is used for carrying out multi-user signal transmission detection according to the received signal and the modulation codebook to obtain the number of times that each modulation code word in the modulation codebook is transmitted;
a gradient aggregation module, used for processing the output of the signal detection module: since the modulation code words and the quantization code words are in one-to-one correspondence, the number of times each modulation code word in the modulation codebook was sent gives the number of times each quantization code word in the quantization codebook appears; each quantization code word is then multiplied by its number of appearances to obtain a multiplied quantization code word, and all multiplied quantization code words are summed;
the averaging module is used for calculating the number of users participating in the federal learning, and then dividing the summation result output by the gradient aggregation module by the number of the users to obtain a global gradient; specifically, the number of users is equal to the sum of the sending times of all modulation code words obtained by the signal detection module;
and the model updating module is used for updating the parameters of the neural network according to the global gradient output by the averaging module.
A message aggregation method in a concurrent communication network of single handshake comprises a transmitting end signal processing process, an uplink transmission process and a receiving end signal processing process;
the transmitting terminal signal processing process comprises the steps that each user receives a downlink broadcast signal, downlink channel estimation and synchronization are started, local training is started, local gradients obtained by the local training are quantized to obtain quantized code words and quantized indexes, codebook modulation is carried out according to the quantized indexes to obtain modulated code words, and then pre-equalization is carried out on the modulated code words to obtain a transmitting signal;
the uplink transmission process comprises that multiple users simultaneously transmit respective sending signals in the same frequency uplink, and the sending signals of the multiple users reach the central node through a channel;
the receiving end signal processing process comprises the steps that the central node carries out signal detection according to a received signal and a modulation codebook to obtain the number of times that each modulation code word in the modulation codebook is sent, namely the number of times that each quantization code word in the quantization codebook appears, then each quantization code word is multiplied by the number of times that each quantization code word appears to obtain multiplied quantization code words, then all multiplied quantization code words are summed to complete gradient aggregation, then the summation result is averaged to obtain a global gradient, and finally the global gradient is used for model updating.
Further, the method can complete the multi-user gradient aggregation with only one uplink transmission, that is, only a single handshake between the multiple users and the central node is required.
Advantageous effects:
(1) the invention realizes efficient aggregation of the multi-user gradient messages without computing each user's message individually, thereby greatly reducing the communication resource overhead of federated learning;
(2) the multiple users need only a single handshake with the central node, i.e. one uplink transmission, which reduces the signaling overhead;
(3) when the quantization module of the transmitting end adopts vector quantization, the dimensionality of the gradient vector is compressed, which reduces the transmission delay.
Drawings
Fig. 1 is a flow chart of message aggregation in a concurrent communication network with single handshake according to the present invention.
FIG. 2 is a comparison graph of performance evaluations performed in accordance with an embodiment.
Detailed Description
The invention is described in detail below by way of example with reference to the accompanying drawings.
The invention provides a message aggregation system and a message aggregation method in a concurrent communication network with single handshake, which are used for realizing efficient gradient aggregation of federal learning.
The invention considers federated learning with 1 central node and K users, where all users and the central node each have a single antenna and jointly train a neural network. Suppose the training phase of federated learning comprises T rounds in total; taking the t-th (1 ≤ t ≤ T) training round as an example, the transmitting-end process is detailed first. As shown in fig. 1, the transmitting end includes a downlink channel estimation and synchronization module, a local training module, a quantization module, a codebook modulation module, and a pre-equalization module. Specifically,
a downlink channel estimation and synchronization module: each user estimates the downlink channel from the downlink broadcast signal; the downlink channel estimate of the k-th (1 ≤ k ≤ K) user is denoted $\hat{h}_k$. This step also completes the time synchronization of the multiple users. Under a time division duplex system with perfect channel estimation, the uplink channel estimate is the same as the downlink estimate $\hat{h}_k$;
a local training module: each user performs neural network training based on its local data set; the output of the k-th user is the local gradient $\mathbf{g}_k \in \mathbb{R}^{W}$;
a quantization module: in a preferred embodiment, the k-th user applies the Lloyd algorithm (see S. P. Lloyd, "Least squares quantization in PCM," IEEE Transactions on Information Theory, vol. IT-28, March 1982, pp. 129-137) to perform non-uniform quantization of the W scalar elements of the local gradient $\mathbf{g}_k$, obtaining a scalar quantization codebook $\mathcal{C} = \{c_1, \ldots, c_Q\}$ of Q codewords, together with the quantization indexes of $\mathbf{g}_k$ in the quantization codebook, expressed as the matrix $\mathbf{S}_k \in \{0,1\}^{Q \times W}$; note that the entries of $\mathbf{S}_k$ take integer values. For the w-th (1 ≤ w ≤ W) element of the local gradient, denoted $g_{k,w}$, the w-th column of the quantization index matrix $\mathbf{S}_k$ is a one-hot vector $\mathbf{s}_{k,w}$: exactly one of its elements is 1 (at the row of the selected codeword) and all other elements are 0, so that $g_{k,w}$ is quantized to the codeword $c_q$ with $q$ the position of the 1;
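The scalar quantization step can be sketched in Python. This is a minimal illustration rather than the patent's implementation: it assumes a 1-D Lloyd (k-means-style) quantizer trained on one user's gradient samples and a Q×W one-hot index matrix, with all names illustrative.

```python
import numpy as np

def lloyd_1d(samples, Q, iters=50):
    """Lloyd's algorithm for a scalar quantizer: alternate nearest-codeword
    assignment and centroid (conditional-mean) update."""
    # Initialize codewords at quantiles of the empirical distribution.
    codebook = np.quantile(samples, (np.arange(Q) + 0.5) / Q)
    for _ in range(iters):
        idx = np.abs(samples[:, None] - codebook[None, :]).argmin(axis=1)
        for q in range(Q):
            if np.any(idx == q):          # guard against empty cells
                codebook[q] = samples[idx == q].mean()
    return np.sort(codebook)

def quantize_onehot(g, codebook):
    """Map each gradient entry to its nearest codeword; return the Q x W
    one-hot index matrix S (exactly one 1 per column)."""
    Q, W = len(codebook), len(g)
    idx = np.abs(g[:, None] - codebook[None, :]).argmin(axis=1)
    S = np.zeros((Q, W), dtype=int)
    S[idx, np.arange(W)] = 1
    return S

rng = np.random.default_rng(0)
g = rng.standard_normal(1000)             # one user's local gradient
codebook = lloyd_1d(g, Q=16)              # 4-bit scalar codebook (Q = 16)
S = quantize_onehot(g, codebook)
g_hat = codebook @ S                      # de-quantized gradient
assert (S.sum(axis=0) == 1).all()         # every column is one-hot
assert float(np.mean((g - g_hat) ** 2)) < 0.1
```

The one-hot representation is what makes the later aggregation a pure counting problem: summing the users' index matrices counts how often each codeword was chosen.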
a codebook modulation module: the modulation codebook is expressed as
Figure 30370DEST_PATH_IMAGE017
Wherein A is aqColumn modulation code word and quantization codebook
Figure 631116DEST_PATH_IMAGE018
To (1)qThe quantized code words are in one-to-one correspondence,
Figure 15829DEST_PATH_IMAGE019
and the columns of A are not linearly related; without loss of generality, if the modulation index is set to be the same as the quantization index, then
Figure 360223DEST_PATH_IMAGE010
Of 1 atk
Figure 123780DEST_PATH_IMAGE001
According to modulation index (quantization index)
Figure 770793DEST_PATH_IMAGE020
Selection of a transmission modulation code word is made, the selected transmission modulation code word being
Figure 518169DEST_PATH_IMAGE021
Note bookWA transmission code word is
Figure 666254DEST_PATH_IMAGE022
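Selecting codewords from a shared modulation codebook reduces to a single matrix product. A hedged sketch, writing the modulation codebook as an L×Q complex matrix `A` and one user's indices as a Q×W one-hot matrix `S_k` (illustrative notation and sizes, not the patent's exact values):

```python
import numpy as np

rng = np.random.default_rng(1)
L, Q, W = 16, 16, 8                      # codeword length, codebook size, gradient dim
# i.i.d. complex Gaussian codebook, columns linearly independent w.h.p.
A = (rng.standard_normal((L, Q)) + 1j * rng.standard_normal((L, Q))) / np.sqrt(2)

# One-hot index matrix S_k: column w selects the codeword for gradient entry w.
idx = rng.integers(0, Q, size=W)
S_k = np.zeros((Q, W), dtype=int)
S_k[idx, np.arange(W)] = 1

X_k = A @ S_k                            # W transmit codewords, one per column
assert np.allclose(X_k[:, 0], A[:, idx[0]])
```

Because each column of `S_k` is one-hot, `A @ S_k` simply picks out the chosen column of `A` for every gradient entry.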
a pre-equalization module: the channel experienced by all elements of each transmit modulation codeword is set to be the same, namely $h_k$; the k-th user multiplies its selected transmit modulation codewords by the reciprocal of the channel estimate to complete the pre-equalization, obtaining the transmit signal $\mathbf{X}_k / \hat{h}_k$.
Because the multiple users transmit in the uplink simultaneously, the signal $\mathbf{Y} \in \mathbb{C}^{L \times W}$ received by the central node is expressed as:

$$\mathbf{Y} = \sum_{k=1}^{K} h_k^{(t)} \, \frac{\mathbf{A}\mathbf{S}_k}{\hat{h}_k} + \mathbf{N} = \mathbf{A}\mathbf{S} + \mathbf{N} \qquad (1)$$

where $h_k^{(t)}$ represents the true uplink channel of the k-th user in the t-th training round (equal to $\hat{h}_k$ under channel reciprocity and perfect estimation), $\mathbf{S} = \sum_{k=1}^{K} \mathbf{S}_k$ is the multi-user superposed modulation index, and $\mathbf{N}$ represents thermal noise.
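Under perfect pre-equalization the channel cancels and the central node effectively receives the codebook times the sum of the users' one-hot index matrices, plus noise. A toy simulation of this same-frequency superposition (illustrative parameters; flat per-user channels assumed):

```python
import numpy as np

rng = np.random.default_rng(2)
L, Q, W, K = 16, 16, 8, 30               # codeword length, codebook, gradient dim, users
A = (rng.standard_normal((L, Q)) + 1j * rng.standard_normal((L, Q))) / np.sqrt(2)

Y = np.zeros((L, W), dtype=complex)
S = np.zeros((Q, W), dtype=int)          # multi-user superposed modulation index
for k in range(K):
    h = (rng.standard_normal() + 1j * rng.standard_normal()) / np.sqrt(2)
    idx = rng.integers(0, Q, size=W)
    S_k = np.zeros((Q, W), dtype=int)
    S_k[idx, np.arange(W)] = 1
    X_k = A @ S_k
    Y += h * (X_k / h)                   # channel times pre-equalized signal: cancels
    S += S_k
Y += 0.01 * (rng.standard_normal((L, W)) + 1j * rng.standard_normal((L, W)))

# With perfect pre-equalization, Y ≈ A S as in equation (1).
assert np.allclose(Y, A @ S, atol=0.1)
assert (S.sum(axis=0) == K).all()        # each column superposes K one-hot picks
```

Note that each column of `S` sums to K, which is exactly the property the receiver later exploits to count the participating users.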
the receiving end comprises a signal detection module, a gradient aggregation module, an averaging module and a model updating module, and the flow of the receiving end is detailed below. Wherein the content of the first and second substances,
the signal detection module: based on received signals
Figure 73172DEST_PATH_IMAGE030
And a modulation codebook A known to the transmitting and receiving ends, recovering the modulation index of the multi-user superposition in the formula (1)
Figure 665827DEST_PATH_IMAGE031
(ii) a Attention is paid to
Figure 356702DEST_PATH_IMAGE031
Each column vector of (a) has sparsity, an
Figure 34808DEST_PATH_IMAGE031
The value of the medium element is an integer, and the Bayesian compressed sensing algorithm is adopted to detect the signal to obtain the modulation index of multi-user superposition
Figure 62807DEST_PATH_IMAGE031
Is estimated value of
Figure 900182DEST_PATH_IMAGE032
Figure 621014DEST_PATH_IMAGE033
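The patent's detector uses Bayesian compressed sensing (approximate message passing in the simulations). As a simplified stand-in for illustration only: when L ≥ Q and the codebook columns are linearly independent, ordinary least squares followed by rounding to nonnegative integers already recovers the superposed index matrix at high SNR.

```python
import numpy as np

rng = np.random.default_rng(3)
L, Q, W, K = 16, 16, 8, 30
A = (rng.standard_normal((L, Q)) + 1j * rng.standard_normal((L, Q))) / np.sqrt(2)

# Ground-truth superposed index matrix: each column sums to K users.
S = np.zeros((Q, W), dtype=int)
for _ in range(K):
    idx = rng.integers(0, Q, size=W)
    S[idx, np.arange(W)] += 1

noise = 0.001 * (rng.standard_normal((L, W)) + 1j * rng.standard_normal((L, W)))
Y = A @ S + noise                        # received signal, equation (1)

# Least-squares estimate of S, then round to the nearest nonnegative integer
# (the true entries are integers in 0..K).
S_ls = np.linalg.lstsq(A, Y, rcond=None)[0]
S_hat = np.clip(np.round(S_ls.real), 0, K).astype(int)
assert (S_hat == S).all()
```

A compressed-sensing detector is what makes the underdetermined case L < Q (or stronger noise) tractable by exploiting the column sparsity and integer prior that plain least squares ignores.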
a gradient aggregation module: the purpose of this module is to compute the multi-user superposed gradient $\sum_{k=1}^{K} \tilde{\mathbf{g}}_k$ (where $\tilde{\mathbf{g}}_k$ denotes the quantized local gradient of the k-th user), completing the gradient aggregation. Write the quantization codebook as the row vector $\mathbf{c} = [c_1, \ldots, c_Q]$ and take the modulation index estimate $\hat{\mathbf{S}}$ from the previous module; since the modulation index is the same as the quantization index, $\hat{\mathbf{S}}$ is also the quantization index of the multi-user superposition, and hence the gradient of the multi-user superposition is $\mathbf{c}\hat{\mathbf{S}} \in \mathbb{R}^{1 \times W}$;
an averaging module: this module first sums $\hat{\mathbf{S}}$ by columns to obtain the length-W vector $\mathbf{n}$, then averages the W elements of $\mathbf{n}$; the resulting average is the number of users K. The gradient aggregation result of the previous module is then divided by K to obtain the global gradient $\bar{\mathbf{g}} = \mathbf{c}\hat{\mathbf{S}}/K$;
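Aggregation and averaging then amount to one matrix product and a column-sum. A sketch with an illustrative random codebook, checking the codebook-times-index product against a direct per-user sum (all names illustrative):

```python
import numpy as np

rng = np.random.default_rng(4)
Q, W, K = 16, 8, 30
codebook = np.sort(rng.standard_normal(Q))   # shared scalar codebook c (row vector)

# Build the superposed index S and, for checking, the direct sum of the
# users' quantized gradients.
S = np.zeros((Q, W), dtype=int)
per_user_sum = np.zeros(W)
for _ in range(K):
    idx = rng.integers(0, Q, size=W)
    S[idx, np.arange(W)] += 1
    per_user_sum += codebook[idx]

aggregated = codebook @ S                    # c S: multi-user superposed gradient
n = S.sum(axis=0)                            # column sums of S
K_est = int(round(n.mean()))                 # average equals the number of users
global_grad = aggregated / K_est             # global gradient

assert K_est == K
assert np.allclose(aggregated, per_user_sum)
```

This is the core trick: the receiver never separates the users — counting codeword occurrences in the superposition is enough to reproduce the sum of all quantized gradients.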
a model updating module: the update rule for the neural network parameters $\boldsymbol{\theta}_t$ at the central node is:

$$\boldsymbol{\theta}_t = \boldsymbol{\theta}_{t-1} - \beta \, \bar{\mathbf{g}} \qquad (2)$$

where $\boldsymbol{\theta}_{t-1}$ is the neural network parameter of the previous training round, the global gradient $\bar{\mathbf{g}}$ is the output of the previous module, and $\beta$ is the learning rate; equation (2) completes the model update.
For the quantization module of the transmitting end, vector quantization is detailed below. Suppose the quantization codebook for vector quantization is expressed as $\mathbf{C}_v \in \mathbb{R}^{V \times Q}$, where each codeword vector has dimension V. First, the W scalar elements of the local gradient $\mathbf{g}_k$ are grouped so that every V adjacent elements form one vector; assuming W is divisible by V, this yields W/V vector elements. These W/V vector elements of dimension V are then used as the input of a vector quantization algorithm to obtain the vector quantization codebook $\mathbf{C}_v$. Taking a clustering algorithm as an example, the W/V vector elements are clustered into Q classes, and the centroids (vectors) of the Q clusters form the vector quantization codebook $\mathbf{C}_v$. Note that when the quantization module adopts a V×Q vector quantization codebook, the transmitting ends of the multiple users must arrange their local gradients into W/V vectors of dimension V according to a preset arrangement rule before vector quantization; after the gradient aggregation module, the receiving end must, following the same arrangement rule as the transmitting end, restore the W/V aggregated vectors of dimension V to a vector of length W. The other modules at the transmitting and receiving ends are unchanged.
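The grouping-then-clustering procedure can be sketched as follows; this assumes W divisible by V and uses a plain k-means routine (illustrative only, not the patent's exact algorithm or parameters):

```python
import numpy as np

def kmeans(X, Q, iters=50, seed=0):
    """Plain k-means: X is (N, V); returns the (Q, V) centroid matrix."""
    rng = np.random.default_rng(seed)
    C = X[rng.choice(len(X), Q, replace=False)]      # init from samples
    for _ in range(iters):
        d = ((X[:, None, :] - C[None, :, :]) ** 2).sum(axis=2)
        lab = d.argmin(axis=1)
        for q in range(Q):
            if np.any(lab == q):                     # skip empty clusters
                C[q] = X[lab == q].mean(axis=0)
    return C

rng = np.random.default_rng(5)
W, V, Q = 1000, 20, 16
g = rng.standard_normal(W)                # local gradient, length W
vecs = g.reshape(W // V, V)               # W/V adjacent groups of V elements
C_v = kmeans(vecs, Q)                     # V x Q codebook (stored here as Q x V)

# Quantize: map each V-dim group to its nearest centroid; the receiver
# flattens back to length W using the same arrangement rule.
d = ((vecs[:, None, :] - C_v[None, :, :]) ** 2).sum(axis=2)
g_hat = C_v[d.argmin(axis=1)].reshape(W)
assert g_hat.shape == (W,)
```

Only one index per V elements is transmitted, which is where the V-fold reduction in the number of columns of the index matrix — and hence in communication overhead — comes from.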
The invention discloses a message aggregation system and method in a concurrent communication network with a single handshake, which can reduce the communication overhead of federated learning.
To illustrate the advantages of the method proposed by the present invention, the effect of the present invention will be described with reference to fig. 2.
In the simulation, the parameters related to federated learning are set as follows: the central node and the multiple users jointly train a convolutional neural network whose structure adopts LeNet (for the detailed structure of LeNet, see Y. Lecun, L. Bottou, Y. Bengio and P. Haffner, "Gradient-based learning applied to document recognition," in Proceedings of the IEEE, vol. 86, no. 11, pp. 2278-2324, Nov. 1998); the data set is Fashion-MNIST, whose 60000 training images are distributed independently and uniformly to 100 users so that every user holds the same number of data samples; 30 users are randomly selected to participate in federated learning in each training round; model training adopts the adaptive moment estimation (Adam) optimizer; the learning rate is 0.001; local training is run 10 times with the batch size set to 5.
the communication-related parameter settings are as follows: the signal-to-noise ratio is set to 20 dB; each modulation code word in the modulation codebook is set to lengthL=16, when 4-bit quantization is adopted, the dimension of the quantization codebook isQ=16, when 5 bit quantization is used, the quantization codebook has dimensions ofQ=32, modulation codebook
Figure 257191DEST_PATH_IMAGE053
Each element of (a) follows a complex gaussian distribution that is independently identically distributed;
scalar quantity adopts Lloyd algorithm to carry out local gradient on first user in advance
Figure 196197DEST_PATH_IMAGE054
InWNon-uniform quantization is performed on scalar elements to obtain a quantization codebook
Figure 815397DEST_PATH_IMAGE055
Then all users adopt the quantization codebook
Figure 348010DEST_PATH_IMAGE056
(ii) a The quantization codebook for vector quantization is represented as
Figure 422276DEST_PATH_IMAGE045
Setting up
Figure 622313DEST_PATH_IMAGE057
By usingK-mean clustering algorithm to derive vector quantization codebook
Figure 146835DEST_PATH_IMAGE058
(ii) a The receiver Signal detection module adopts An Approximate Message transfer algorithm (for the Approximate Message transfer algorithm, see the literature 'translation name: An Approximate Message transfer algorithm of An expected Propagation view angle', the author, English name and appearance of which are 'X, Meng, S, Wu, L, Kuang and J. Lu', 'An expression Propagation utilization permission on application Message Page', 'in IEEE Signal Processing Letters, vol.22, No. 8, pp. 1194-1197, Aug.2015, doi: 10.1109/LSP 2015.2391287'), and a modulation index for multi-user superposition to be estimated
Figure 28728DEST_PATH_IMAGE059
Restoring column by column, and setting the prior of each element to be restored as an integer less than or equal to the total number of users;
Specifically, fig. 2 shows that when the proposed scheme employs 4-bit scalar quantization, the test-set accuracy of the trained neural network model approaches the reference scheme that employs perfect gradient aggregation. With vector quantization and V = 20, the number of columns of $\mathbf{S}$ in equation (1) is reduced by a factor of 20, so the communication overhead of each training round is reduced 20-fold. For the same number of training rounds, the 4-bit vector quantization adopted by the proposed scheme lowers the test-set accuracy of the trained model, which is caused by the quantization loss; with 5-bit vector quantization, the test-set accuracy of the neural network model obtained by the proposed scheme improves markedly and approaches the reference scheme as the number of training rounds increases. Hence, the performance loss of vector quantization can be reduced by increasing the number of quantization bits.
In summary, the above description is only a preferred embodiment of the present invention, and is not intended to limit the scope of the present invention. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (7)

1. A message aggregation system in a concurrent communication network with single handshake, the system being suitable for federal learning, comprising a transmitting end and a receiving end, characterized in that:
the transmitting terminal comprises a downlink channel estimation and synchronization module, a local training module, a quantization module, a codebook modulation module and a pre-equalization module; wherein the content of the first and second substances,
the downlink channel estimation and synchronization module is used for performing downlink channel estimation and time synchronization according to downlink broadcast signals from the central node to multiple users; the central node is a receiving end;
the local training module is used for performing neural network training on each user according to local data to obtain respective local gradients of the multiple users;
the quantization module is used for quantizing the local gradient of each user according to the quantization codebook to obtain quantization code words and quantization indexes of the quantization code words in the quantization codebook;
the codebook modulation module is used for carrying out codebook modulation on the quantized code words output by the quantization module according to a modulation codebook to obtain modulation code words corresponding to each quantized code word;
the modulation code words of the multiple users are transmitted on the same time-frequency resource, all the users adopt the same quantization codebook and modulation codebook, and the modulation code words in the modulation codebook correspond to the quantization code words in the quantization codebook one by one;
the pre-equalization module is used for pre-equalizing the modulation code words before each user sends the modulation code words according to the downlink channel estimation values to obtain sending signals;
and the receiving end carries out multi-user signal transmission detection to obtain the number of times each modulation code word was transmitted, which is the number of times each quantization code word appears, then carries out gradient aggregation and averaging to finally obtain a global gradient, and completes the model update.
2. The message aggregation system in a single handshake concurrent communication network as claimed in claim 1, wherein the quantization mode of the quantization module is scalar quantization or vector quantization, when scalar quantization is used, the dimension of the local gradient vector is unchanged, and when vector quantization is used, the dimension of the local gradient vector is compressed.
3. The system of claim 1, wherein the modulation code words are transmitted on pre-allocated time-frequency resources and all users are allocated the same time-frequency resources; each modulation code word is a vector comprising a plurality of scalar elements, and because different subcarriers experience different channels, each element of the modulation code word is pre-equalized according to the channel of the subcarrier carrying it.
4. The system of claim 1, wherein under TDD (time division duplex) operation the uplink and downlink channels are reciprocal, so the uplink transmission is pre-equalized according to the estimated downlink channel.
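A minimal sketch of the per-subcarrier pre-equalization in claims 3–4, assuming a simple zero-forcing channel inversion (the patent text does not fix a particular pre-equalizer): under TDD reciprocity the downlink channel estimate stands in for the uplink channel, and each element of the modulation code word is divided by the channel coefficient of its subcarrier. All channel and symbol values are illustrative.

```python
import numpy as np

def pre_equalize(modulation_word, channel_est):
    """Zero-forcing style pre-equalization: invert the per-subcarrier
    channel so the symbols arrive aligned at the central node."""
    return modulation_word / channel_est

h = np.array([0.5 + 0.5j, 1.0 + 0.0j, 0.8 - 0.6j])   # downlink channel estimates
x = np.array([1.0 + 0.0j, -1.0 + 0.0j, 1.0 + 0.0j])  # modulation code word
tx = pre_equalize(x, h)
# Passing through the channel, the central node sees tx * h, i.e. x itself
```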
5. The message aggregation system in a single handshake concurrent communication network as claimed in claim 1, wherein the receiving end comprises a signal detection module, a gradient aggregation module, an averaging module and a model update module;
the signal detection module is used for carrying out multi-user signal transmission detection according to the received signal and the modulation codebook to obtain the number of times that each modulation code word in the modulation codebook is transmitted;
the gradient aggregation module is used for obtaining the number of occurrences of each quantization code word in the quantization codebook according to the output of the signal detection module, multiplying each quantization code word by its number of occurrences to obtain the multiplied quantization code words, and summing all the multiplied quantization code words;
the averaging module is used for calculating the number of users participating in the federated learning, and dividing the summation result output by the gradient aggregation module by the number of users to obtain the global gradient; the number of users equals the sum, over all modulation code words, of the transmission counts obtained by the signal detection module;
and the model updating module is used for updating the parameters of the neural network according to the global gradient output by the averaging module.
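The receiver chain of claim 5 reduces to counting and averaging: multiply each quantization code word by its detected number of occurrences, sum, and divide by the total number of users. A hedged sketch with illustrative counts and codebook:

```python
import numpy as np

def aggregate(counts, quant_codebook):
    """counts[k] = number of users whose gradient quantized to codeword k.

    Returns the global gradient as the count-weighted average of the
    quantization code words (claim 5: aggregation then averaging).
    """
    num_users = counts.sum()                     # sum of all send counts
    weighted = counts[:, None] * quant_codebook  # multiply by occurrences
    return weighted.sum(axis=0) / num_users      # average -> global gradient

codebook = np.array([[1.0, 1.0], [1.0, -1.0], [-1.0, 1.0], [-1.0, -1.0]])
counts = np.array([3, 1, 0, 0])   # 4 users total, detected at the receiver
g = aggregate(counts, codebook)
```

Note that the receiver never needs to know which user sent which codeword, only how many times each codeword was sent, which is what enables the single concurrent transmission.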
6. A message aggregation method in a single handshake concurrent communication network, characterized by comprising a transmitting-end signal processing process, an uplink transmission process and a receiving-end signal processing process;
the transmitting-end signal processing process comprises: each user receives the downlink broadcast signal and performs downlink channel estimation and synchronization; local training is then performed, and the local gradient obtained by local training is quantized to obtain a quantization code word and a quantization index; codebook modulation is performed according to the quantization index to obtain a modulation code word, the modulation code words in the modulation codebook corresponding one-to-one to the quantization code words in the quantization codebook; the modulation code word is then pre-equalized to obtain the transmitted signal;
the uplink transmission process comprises: the multiple users transmit their respective signals simultaneously on the same frequency in the uplink, and the transmitted signals of the multiple users reach the central node through the channel; the central node is the receiving end;
the receiving-end signal processing process comprises: the central node performs signal detection according to the received signal and the modulation codebook, to obtain the number of times each modulation code word in the modulation codebook was sent and hence the number of times each quantization code word in the quantization codebook appears; each quantization code word is then multiplied by its number of occurrences, and all multiplied quantization code words are summed to complete the gradient aggregation; the summation result is then averaged to obtain the global gradient; finally, the model is updated using the global gradient.
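The multi-user detection step of claim 6 is not detailed in the claims; one plausible noise-free illustration treats the received signal (after per-user pre-equalization) as the count-weighted superposition of modulation code words on the shared time-frequency resource, and recovers the per-codeword send counts by a least-squares fit rounded to integers. The codebook, counts, and detector choice below are assumptions for illustration only.

```python
import numpy as np

# Illustrative modulation codebook: 3 linearly independent codewords of length 3
mod_codebook = np.array([[1.0, 1.0, 1.0],
                         [1.0, -1.0, 1.0],
                         [1.0, 1.0, -1.0]])

# Ground truth: 2 users sent word 0, 1 user sent word 1, 3 users sent word 2.
true_counts = np.array([2, 1, 3])

# All users transmit on the same time-frequency resource, so the (pre-equalized,
# noise-free) received signal is the count-weighted sum of codewords.
received = true_counts @ mod_codebook

# Recover the send counts by least squares against the modulation codebook,
# then round to integers (a simple stand-in for the patent's signal detection).
est, *_ = np.linalg.lstsq(mod_codebook.T, received, rcond=None)
counts = np.rint(est).astype(int)
```

Since the modulation code words correspond one-to-one to the quantization code words, these counts feed directly into the gradient aggregation and averaging of the next step.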
7. The method as claimed in claim 6, wherein a single uplink transmission suffices to complete the gradient aggregation of the multiple users, so that the multiple users and the central node need only one handshake.
CN202210483218.1A 2022-05-06 2022-05-06 Message aggregation system and method in concurrent communication network of single handshake Active CN114584436B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210483218.1A CN114584436B (en) 2022-05-06 2022-05-06 Message aggregation system and method in concurrent communication network of single handshake


Publications (2)

Publication Number Publication Date
CN114584436A CN114584436A (en) 2022-06-03
CN114584436B true CN114584436B (en) 2022-07-01

Family

ID=81785060

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210483218.1A Active CN114584436B (en) 2022-05-06 2022-05-06 Message aggregation system and method in concurrent communication network of single handshake

Country Status (1)

Country Link
CN (1) CN114584436B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110297848A (en) * 2019-07-09 2019-10-01 Shenzhen Qianhai WeBank Co., Ltd. Recommendation model training method, terminal and storage medium based on federated learning
CN112288097A (en) * 2020-10-29 2021-01-29 Ping An Technology (Shenzhen) Co., Ltd. Federated learning data processing method and device, computer equipment and storage medium
CN112532251A (en) * 2019-09-17 2021-03-19 Huawei Technologies Co., Ltd. Data processing method and device
WO2021189225A1 (en) * 2020-03-24 2021-09-30 Guangdong OPPO Mobile Telecommunications Corp., Ltd. Machine learning model training method, electronic device and storage medium

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11017322B1 (en) * 2021-01-28 2021-05-25 Alipay Labs (Singapore) Pte. Ltd. Method and system for federated learning


Also Published As

Publication number Publication date
CN114584436A (en) 2022-06-03

Similar Documents

Publication Publication Date Title
CN109560841B (en) Large-scale MIMO system channel estimation method based on improved distributed compressed sensing algorithm
Li et al. Spatio-temporal representation with deep neural recurrent network in MIMO CSI feedback
CN108462517B (en) MIMO link self-adaptive transmission method based on machine learning
Elbir et al. Federated learning for hybrid beamforming in mm-wave massive MIMO
CN111279337A (en) Lattice reduction in orthogonal time-frequency space modulation
CN111698182A (en) Time-frequency blocking sparse channel estimation method based on compressed sensing
CN113691288B (en) Joint pilot frequency, feedback and multi-user hybrid coding method based on deep learning
CN109474388B (en) Low-complexity MIMO-NOMA system signal detection method based on improved gradient projection method
CN111555781B (en) Large-scale MIMO channel state information compression and reconstruction method based on deep learning attention mechanism
CN114641941B (en) Communication system and method using ultra-large Multiple Input Multiple Output (MIMO) antenna system with extremely large class of fast unitary transforms
Chen et al. A novel quantization method for deep learning-based massive MIMO CSI feedback
JP7146095B2 (en) Method, computer program product and device for decryption
Tseng et al. Deep-learning-aided cross-layer resource allocation of OFDMA/NOMA video communication systems
Lan et al. Communication-efficient federated learning for resource-constrained edge devices
CN110311876A (en) The implementation method of underwater sound OFDM receiver based on deep neural network
CN106911622A (en) ACO ofdm system channel estimation methods based on compressed sensing
Kong et al. Knowledge distillation-aided end-to-end learning for linear precoding in multiuser MIMO downlink systems with finite-rate feedback
Ouyang et al. Channel estimation for underwater acoustic OFDM communications: An image super-resolution approach
CN114584436B (en) Message aggregation system and method in concurrent communication network of single handshake
Qiao et al. Unsourced massive access-based digital over-the-air computation for efficient federated edge learning
Liu et al. OFDM-based digital semantic communication with importance awareness
Xu et al. Detect to learn: Structure learning with attention and decision feedback for MIMO-OFDM receive processing
CN114826832A (en) Channel estimation method, neural network training method, device and equipment
CN110138423B (en) Non-orthogonal multiplexing method
Qing et al. Transfer learning-based channel estimation in orthogonal frequency division multiplexing systems using data-nulling superimposed pilots

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20231214

Address after: Room 1401, 14th Floor, Building 6, Courtyard 8, Kegu 1st Street, Beijing Economic and Technological Development Zone, Daxing District, Beijing, 100176

Patentee after: Beijing Institute of Technology Measurement and Control Technology Co.,Ltd.

Address before: 100081 No. 5 South Main Street, Haidian District, Beijing, Zhongguancun

Patentee before: BEIJING INSTITUTE OF TECHNOLOGY
