CN115580879A - Millimeter wave network beam management method based on federal reinforcement learning - Google Patents

Millimeter wave network beam management method based on federal reinforcement learning Download PDF

Info

Publication number
CN115580879A
CN115580879A CN202211088629.7A CN202211088629A
Authority
CN
China
Prior art keywords
millimeter wave
base station
model
wave base
beam management
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211088629.7A
Other languages
Chinese (zh)
Inventor
薛青
来东
徐勇军
梁志芳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chongqing University of Post and Telecommunications
Original Assignee
Chongqing University of Post and Telecommunications
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chongqing University of Post and Telecommunications filed Critical Chongqing University of Post and Telecommunications
Priority to CN202211088629.7A priority Critical patent/CN115580879A/en
Publication of CN115580879A publication Critical patent/CN115580879A/en
Pending legal-status Critical Current

Links

Images

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04WWIRELESS COMMUNICATION NETWORKS
    • H04W16/00Network planning, e.g. coverage or traffic planning tools; Network deployment, e.g. resource partitioning or cells structures
    • H04W16/24Cell structures
    • H04W16/28Cell structures using beam steering

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Mobile Radio Communication Systems (AREA)

Abstract

The invention relates to a millimeter wave network beam management method based on federated reinforcement learning, and belongs to the field of wireless communication. The method performs beam configuration on the millimeter wave base station side by collecting user position information; that is, the beam directions on the millimeter wave base station side are dynamically controlled based on periodically sensed user distribution, so that a limited number of beams achieves the maximum user coverage and the beam utilization efficiency is improved. The invention implements the corresponding beam management strategy through a reinforcement learning algorithm, aiming to maximize the long-term network throughput while making beam management intelligent. In addition, the invention introduces a federated learning framework, which protects the privacy and security of user data while finding the optimal beam configuration strategy of the system.

Description

Millimeter wave network beam management method based on federal reinforcement learning
Technical Field
The invention belongs to the field of wireless communication, and relates to a millimeter wave network beam management method based on federated reinforcement learning.
Background
Ultra-dense networking can alleviate the limited coverage of millimeter wave base stations and the sharp increase of data traffic by increasing the number of millimeter wave base stations per unit area. Multi-association is one of the key technologies in ultra-dense networking; it allows a user to connect to several millimeter wave base stations simultaneously, guaranteeing the user's communication quality and improving the user's service rate. Research on ultra-dense networking is therefore of great significance. Millimeter wave transmission suffers from large path loss, but the short wavelength of millimeter waves allows millimeter wave devices to integrate large-scale antenna arrays in a relatively small size. With beamforming, the energy of the transmitted signal can be concentrated in a specific direction, providing additional antenna gain that compensates for the path loss and improves the received signal power. Because the number of narrow beams a millimeter wave base station can form is limited, the beams cannot cover the whole area and only cover part of the user region; how to configure the beams on the millimeter wave base station side so that the limited beams cover as many users as possible is therefore a key problem for improving the performance of a millimeter wave system. The design of beam management strategies under ultra-dense networking typically faces the following problems: 1) ultra-dense networking greatly increases the number of beams in the network, making the beam management problem more complex than in ordinary scenarios; 2) user data are private, and how to protect user privacy remains to be solved.
The prior art related to the present invention can be mainly summarized as follows.
(1) Beam management related art: in millimeter wave communication, beam management is a relatively broad concept that covers beam training, beam tracking, and so on, and various schemes have already been proposed. For example, the beam training process usually includes the following steps: beam sweeping, beam measurement, beam reporting, beam determination, beam maintenance, beam failure recovery, and the like. Different from this concept, beam management in the present invention mainly refers to the beam configuration problem on the millimeter wave base station side.
(2) Prior patent technology: for example, patent CN113055059A discloses a beam management method for massive MIMO communication. That method focuses on beam selection and beam maintenance between the millimeter wave base station and the user: beam selection over historical beam management schemes is realized with a collaborative filtering (CF) algorithm, the user's moving path is predicted, and a continuous communication link is maintained through beam refinement. Patent CN113785503A discloses a beam management method using adaptive learning, which establishes a beam management model on the user side and processes it intelligently with a reinforcement learning algorithm; that method focuses on beam selection (alignment) and the association process between the base station and the user. Compared with the schemes disclosed in these prior patents, the present invention addresses a different scenario and a different problem, and the proposed solution focuses on the optimal configuration of beams on the millimeter wave base station side in ultra-dense networking.
(3) Federated reinforcement learning related art: deep reinforcement learning (DRL) combines deep learning with reinforcement learning to handle the perception and decision problems of complex systems, and can be used to solve the beam management problem in millimeter wave ultra-dense networking scenarios. Federated learning is a promising distributed machine learning architecture: devices collect data and train models locally and then upload the trained models to a central node for model aggregation, so that raw data never leaves the device and data privacy is well protected. For beam management in ultra-dense millimeter wave networks, purely centralized decision-making incurs huge resource and time overhead, which federated learning can effectively avoid. The present invention therefore discloses a large-scale beam management method that applies DRL under a federated learning architecture.
Disclosure of Invention
In view of this, the present invention provides a millimeter wave network beam management method based on federated reinforcement learning.
To achieve the above object, the present invention provides the following technical solution:
A millimeter wave network beam management method based on federated reinforcement learning comprises the following steps:
s1: constructing a millimeter wave base station side beam management model;
s2: initializing parameters of a millimeter wave base station side beam management model;
s3: each millimeter wave base station respectively collects the position information of a local user and trains a local beam management model by using a reinforcement learning algorithm;
s4: updating local model parameters by using stochastic gradient descent (SGD);
s5: repeating S3 and S4, and entering S6 after the local model converges or iterates for N times;
s6: each millimeter wave base station uploads the local model parameters to a central control node or a central server to carry out model parameter aggregation, and a global model of beam management is obtained;
s7: each millimeter wave base station downloads global model parameters from the central control node to update the local model, and carries out beam configuration decision according to the current user information;
s8: and returning to S2, and waiting for next round of beam optimization.
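For illustration only, the following minimal Python sketch shows how steps S1 to S8 could be orchestrated end to end. The model here is reduced to a plain parameter vector and the local training to a dummy gradient step; all names (LocalBeamAgent, fed_avg) and numeric values are hypothetical stand-ins for the DDQN-based models described below, not part of the claimed method.

```python
import numpy as np

rng = np.random.default_rng(0)

class LocalBeamAgent:
    """One millimeter wave base station acting as a federated-learning client."""
    def __init__(self, global_params, num_users):
        self.params = global_params.copy()      # S2: initialize from the global model
        self.num_users = num_users              # proxy for the local training-data amount

    def local_training(self, iters, lr=0.01):
        for _ in range(iters):                  # S3/S5: collect local data and train
            grad = rng.normal(size=self.params.shape)   # stand-in for the RL gradient
            self.params -= lr * grad            # S4: SGD update of the local parameters

def fed_avg(param_list, data_sizes):
    """S6: training-data-weighted aggregation at the central control node."""
    weights = np.asarray(data_sizes, dtype=float)
    weights /= weights.sum()
    return sum(w * p for w, p in zip(weights, param_list))

global_params = rng.normal(size=8)              # S1: initial beam-management model
for beam_period in range(3):                    # repeated beam-management rounds (S8)
    agents = [LocalBeamAgent(global_params, num_users=u) for u in (12, 7, 20)]
    for agent in agents:
        agent.local_training(iters=5)
    global_params = fed_avg([a.params for a in agents],
                            [a.num_users for a in agents])
    # S7: each base station would now download global_params and decide its beams
```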
Optionally, in S1, the millimeter wave base station side beam management problem is modeled as a Markov decision process and solved by a reinforcement learning algorithm; a Markov decision process typically contains four elements S, A, P, R, where S represents the state space of the Markov decision process, A represents the action space, P represents the state transition probability, and R represents the reward value.
Optionally, in S1, the input of the Markov decision process may be determined by the set of sectors covered by the millimeter wave base station beams; the beam management strategy at time t is represented as the set of sectors covered by all millimeter wave base stations at time t, i.e., the system beam management strategy C(t) = {C_1(t), C_2(t), ..., C_M(t)}, where C_m(t) represents the set of sectors covered by millimeter wave base station m at time t; a proper strategy enables each millimeter wave base station to cover more users with its limited beams, thereby improving beam utilization and system throughput.
Optionally, in S1, the Markov decision process is solved with the reinforcement learning algorithm DDQN (double deep Q-network), and an initial model is established; the initial model consists of two four-layer fully connected neural networks, namely a training neural network and a target neural network; the training neural network is used to evaluate the value of the current action-state pair, i.e., the Q value, while the target neural network is used to determine the maximum Q value, denoted Q_max, and the difference E[(Q_max - Q)^2] between the two networks is defined as the loss function; the neural networks use the ReLU function as the activation function, and the system sum rate is fed back as the reward; compared with traditional reinforcement learning algorithms, DDQN adds an experience replay pool and model evaluation, which alleviates the model bias caused by overestimation in DQN and the excessive state-space and action-space overhead of the Q-learning algorithm.
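As an illustrative sketch only, the two networks described above could be realized as follows in PyTorch; the state dimension, number of candidate beam configurations and layer widths are assumptions, with only the four-layer fully connected structure and the ReLU activation taken from the text.

```python
import torch.nn as nn

class QNetwork(nn.Module):
    """Four-layer fully connected Q network (used for both training and target nets)."""
    def __init__(self, state_dim=16, num_actions=8):
        super().__init__()
        self.layers = nn.Sequential(
            nn.Linear(state_dim, 128), nn.ReLU(),
            nn.Linear(128, 128), nn.ReLU(),
            nn.Linear(128, 64), nn.ReLU(),
            nn.Linear(64, num_actions),         # one Q value per candidate beam configuration
        )

    def forward(self, state):
        return self.layers(state)

train_net = QNetwork()                                # evaluates Q(s, a)
target_net = QNetwork()                               # supplies Q_max for the target value
target_net.load_state_dict(train_net.state_dict())   # periodic copy of the training weights
```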
Optionally, in S1, the beam management strategy is executed periodically, and the beam configuration of each millimeter wave base station remains unchanged within the same period; each period contains three parts: 1) each millimeter wave base station configures its beams with the federated reinforcement learning algorithm according to the network performance accumulated in the previous period; 2) users select appropriate millimeter wave base stations for association; 3) millimeter wave communication links are established for data transmission.
Optionally, in S2, the local model for millimeter wave base station side beam management is initialized; at the beginning of each beam management period, every millimeter wave base station updates its local model with the global model parameters downloaded from the central control node, so that local model convergence is reached faster while local characteristics are retained; after the model is initialized, the millimeter wave base station configures its base-station-side beams for the current round according to the optimal beam management strategy obtained in the previous period; the corresponding local model is then trained with the latest user position information.
Optionally, in S3, a user data screening mechanism is applied before beam management model training to ensure the validity and diversity of the data participating in model training; the millimeter wave base station judges relevance by computing the distance between itself and each user, and selects the position information of users within its coverage area as valid data for model training; in addition, whether a user participates in training is decided according to the user's historical participation count, and users with fewer historical participations are preferentially included, so as to ensure the diversity of the training data.
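A minimal sketch of such a screening step is given below, assuming a simple distance threshold and a participation-count threshold; the radius of 200 m and the cap of 3 participations are illustrative values, not parameters disclosed by the method.

```python
import numpy as np

def screen_users(bs_position, user_positions, participation_counts,
                 coverage_radius=200.0, max_participations=3):
    """Return indices of users that are both in coverage and rarely used in training."""
    bs_position = np.asarray(bs_position, dtype=float)
    selected = []
    for idx, pos in enumerate(user_positions):
        distance = np.linalg.norm(np.asarray(pos, dtype=float) - bs_position)
        in_coverage = distance <= coverage_radius                        # validity check
        rarely_used = participation_counts[idx] <= max_participations   # diversity check
        if in_coverage and rarely_used:
            selected.append(idx)
    return selected

# Example: base station at the origin, three users
users = [(50.0, 30.0), (400.0, 10.0), (120.0, -80.0)]
counts = [1, 0, 5]
print(screen_users((0.0, 0.0), users, counts))   # -> [0]; user 1 out of range, user 2 over-used
```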
Optionally, in S6, the global model is updated with the model parameters obtained in S5; each millimeter wave base station uploads its trained local model parameters to the central control node, where they are aggregated to update the global model.
The invention has the following beneficial effects:
1. The invention provides a millimeter wave base station side beam management method based on federated reinforcement learning: when the beams of a millimeter wave base station are limited, a reinforcement learning algorithm periodically senses changes in the user positions and configures the beams on the millimeter wave base station side, realizing adaptive management of the base-station-side beams, further improving the beam utilization efficiency of the millimeter wave base station, and optimizing the long-term throughput of the system.
2. The invention provides a millimeter wave base station side beam management method based on federated reinforcement learning that shares models following the federated idea, so that user data are trained locally and need not be uploaded to a central processor, which guarantees user privacy and security and improves the convergence rate of the global model.
Additional advantages, objects, and features of the invention will be set forth in part in the description which follows and in part will become apparent to those having ordinary skill in the art upon examination of the following or may be learned from practice of the invention. The objectives and other advantages of the invention may be realized and attained by the means of the instrumentalities and combinations particularly pointed out hereinafter.
Drawings
For the purposes of promoting a better understanding of the objects, aspects and advantages of the invention, reference will now be made to the following detailed description taken in conjunction with the accompanying drawings in which:
FIG. 1 illustrates the millimeter wave network beam management method based on federated reinforcement learning;
FIG. 2 is a system model of an ultra-dense millimeter wave heterogeneous network;
FIG. 3 is a system model of an ultra-dense millimeter wave homogeneous network.
Detailed Description
The embodiments of the present invention are described below with reference to specific embodiments, and other advantages and effects of the present invention will be easily understood by those skilled in the art from the disclosure of the present specification. The invention is capable of other and different embodiments and of being practiced or of being carried out in various ways, and its several details are capable of modification in various respects, all without departing from the spirit and scope of the present invention. It should be noted that the drawings provided in the following embodiments are only for illustrating the basic idea of the present invention in a schematic way, and the features in the following embodiments and examples may be combined with each other without conflict.
The drawings are for the purpose of illustrating the invention only and are not intended to limit it; to better illustrate the embodiments of the present invention, some parts of the drawings may be omitted, enlarged or reduced, and they do not represent the size of an actual product; it will be understood by those skilled in the art that certain well-known structures in the drawings, and descriptions thereof, may be omitted.
The same or similar reference numerals in the drawings of the embodiments of the present invention correspond to the same or similar components; in the description of the present invention, it should be understood that if there is an orientation or positional relationship indicated by the terms "upper", "lower", "left", "right", "front", "rear", etc., based on the orientation or positional relationship shown in the drawings, it is only for convenience of description and simplification of description, but it is not intended to indicate or imply that the device or element referred to must have a specific orientation, be constructed and operated in a specific orientation, and therefore the terms describing the positional relationship in the drawings are only used for illustrative purposes and are not to be construed as limiting the present invention, and the specific meaning of the terms described above will be understood by those skilled in the art according to the specific circumstances.
A method for managing beams of a millimeter wave network based on federated reinforcement learning is shown in FIG. 1.
In this embodiment, the beam management method based on federated reinforcement learning is applied to a two-tier heterogeneous millimeter wave ultra-dense network, as shown in FIG. 2. A macro base station operating in the microwave band serves as the central control node and carries out global model training and the aggregation of local model parameters, while each millimeter wave base station acts as an agent that trains its local beam management model. That is, both the millimeter wave base stations and the macro base station collect user position information and perform DDQN-based model training, and the macro base station additionally performs model parameter aggregation. Communication between a millimeter wave base station and a user is realized by establishing the corresponding millimeter wave link, and the millimeter wave base stations and the macro base station can be connected through the X2 interface.
The specific implementation steps of the scheme are as follows:
Step 1: construct the millimeter wave base station side beam management (beam configuration) model. Specifically, the beam management problem on the millimeter wave base station side can be described by a Markov decision process and solved by DDQN. In the heterogeneous network shown in FIG. 2, the set of millimeter wave base stations participating in training is denoted by M = {1, 2, ..., M}, and the set of users is denoted by U = {1, 2, ..., U}. Each element of the Markov decision process is defined as follows.
(1) The state space at time t is defined as S_t = {U_m(t), C_m(t), D_k(t)}, where U_m(t) represents the set of users served by millimeter wave base station m at time t, C_m(t) represents the set of beam sectors covered by millimeter wave base station m at time t, and D_k(t) = {C_k(t)}, k = 1, 2, ..., M, k ≠ m, represents the sets of sectors covered by the millimeter wave base stations other than base station m at time t.
(2) The action space at time t is defined as A_t = {C_m(t)}.
(3) The probability of the millimeter wave base station's network state transferring from S_t to S_{t+1} at time t is defined as P = Pr{S_{t+1} | S_t}.
(4) The reward function at time t is defined as R_t = R(t), where R(t) represents the system throughput at time t and is determined by the received data rates r_u(t) of the users u at time t (the closed-form expressions for R(t) and r_u(t) are reproduced as formula images in the original document). Here W_m denotes the bandwidth allocated to the user by millimeter wave base station m, W_M denotes the bandwidth of the macro base station, N_M(t) denotes the number of users served by the macro base station, and M_c(t) denotes the set of millimeter wave base stations associated with the user at time t. A user that is associated with no millimeter wave base station communicates with the macro base station only; otherwise the user communicates in dual-association mode at time t (i.e., it is associated with the macro base station and millimeter wave base stations simultaneously). SINR_{u,m} denotes the signal-to-interference-plus-noise ratio when user u is associated with millimeter wave base station m, and SINR_{u,M} denotes the signal-to-interference-plus-noise ratio when user u is associated with the macro base station.
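Since the closed-form rate and reward expressions are only available as formula images, the sketch below illustrates one plausible Shannon-rate reading of the quantities defined above (dual association summing the millimeter wave contributions and adding the macro-cell share); it is an assumption for illustration, not the formula of the filing.

```python
import numpy as np

# ASSUMED form: r_u = (W_M / N_M) * log2(1 + SINR_{u,M})
#                   + sum over m in M_c(t) of W_m * log2(1 + SINR_{u,m})
def user_rate(W_m, sinr_mmwave, W_M, N_M, sinr_macro, dual_association=True):
    rate = (W_M / N_M) * np.log2(1.0 + sinr_macro)       # macro-base-station share
    if dual_association:
        for sinr in sinr_mmwave:                          # sum over the set M_c(t)
            rate += W_m * np.log2(1.0 + sinr)
    return rate

def system_throughput(per_user_rates):
    """Reward R(t): total throughput, taken here as the sum of the users' rates."""
    return float(np.sum(per_user_rates))

# Example: 100 MHz mmWave bandwidth per user, 20 MHz macro bandwidth shared by 10 users
r_u = user_rate(W_m=100e6, sinr_mmwave=[31.6, 10.0], W_M=20e6, N_M=10, sinr_macro=5.0)
print(r_u / 1e6, "Mbit/s")
```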
Step 2: initialize the parameters of the millimeter wave base station side beam management model. Each millimeter wave base station initializes its local model by downloading the global model parameters delivered by the macro base station. The local model parameter θ_{t+1} is updated according to a rule (reproduced as a formula image in the original document) in which G represents the global model parameter output by the system after the previous communication round, ρ is the model learning rate, L(θ_t) represents the loss function of the i-th millimeter wave base station, and n_i represents the amount of training data of the i-th millimeter wave base station. After initialization is complete, the base station begins training the local model.
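The exact local update rule is only available as a formula image; as a hedged sketch, one common reading is that the local parameters start from the downloaded global parameters G and are then refined by gradient steps on the local loss with learning rate ρ. The function names and values below are illustrative assumptions.

```python
import numpy as np

def init_local_model(global_params_G):
    """Assumed initialization: copy the downloaded global parameters (theta <- G)."""
    return np.array(global_params_G, dtype=float).copy()

def local_update(theta, grad_local_loss, rho=0.01):
    """Assumed refinement step: theta_{t+1} = theta_t - rho * grad L(theta_t)."""
    return theta - rho * np.asarray(grad_local_loss, dtype=float)

theta = init_local_model([0.2, -0.1, 0.5])
theta = local_update(theta, grad_local_loss=[0.05, 0.00, -0.02])
```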
Step 3: each millimeter wave base station collects the position information of its local users and trains its local beam management model with the DDQN algorithm. Each millimeter wave base station completes the beam configuration for the current round of communication according to the initial model parameters obtained in Step 2, trains the corresponding beam management model with the user position information, and thereby obtains the optimal beam management strategy. Before model training, each millimeter wave base station may perform one round of user data screening to ensure the validity and diversity of the training data. The DDQN evaluates the Q value with the training neural network, whose weights are θ_t, and estimates Q_max with the target neural network, whose weights are denoted θ_t^-. The Q value of taking action A_t in state S_t is represented by the Q function Q(S_t, A_t; θ_t). The target value is updated as Y_t^DDQN = R_{t+1} + γ Q(S_{t+1}, argmax_a Q(S_{t+1}, a; θ_t); θ_t^-), the standard double-DQN form, where R_{t+1} is the reward at time t+1 and γ ∈ [0, 1] is the discount factor. The purpose of DDQN is to minimize the difference between the target network and the training network, evaluated by the loss function L(θ) = E[(Y_t^DDQN - Q(S_t, A_t; θ_t))^2]. The core of the method is to determine the optimal beam management scheme by minimizing this loss function.
Step 4: update the local model parameters with stochastic gradient descent. The local model is updated with SGD as θ_{t+1} = θ_t - λ∇L(θ_t), where λ is the step size.
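The following sketch combines Steps 3 and 4 into one DDQN update, reusing the hypothetical QNetwork class from the earlier sketch; it implements the standard double-DQN target (training network selects the next action, target network evaluates it) and uses an SGD optimizer for the parameter step, with γ = 0.9 and λ = 0.01 as illustrative values.

```python
import torch
import torch.nn.functional as F

def ddqn_update(train_net, target_net, optimizer, batch, gamma=0.9):
    states, actions, rewards, next_states = batch         # tensors sampled from the replay pool
    # Q(S_t, A_t; theta_t) from the training network
    q_values = train_net(states).gather(1, actions.unsqueeze(1)).squeeze(1)
    with torch.no_grad():
        # the training network chooses the next action ...
        next_actions = train_net(next_states).argmax(dim=1, keepdim=True)
        # ... and the target network scores it (Q_max in the text)
        q_max = target_net(next_states).gather(1, next_actions).squeeze(1)
        targets = rewards + gamma * q_max                  # Y_t^DDQN
    loss = F.mse_loss(q_values, targets)                   # L(theta) = E[(Y_t^DDQN - Q)^2]
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()                                       # Step 4: SGD update, step size lambda
    return loss.item()

# optimizer = torch.optim.SGD(train_net.parameters(), lr=0.01)   # lambda = 0.01
```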
Step 5: repeat Steps 3 and 4, and go to Step 6 after the local model converges or a certain number of iterations is reached.
Step 6: after training is finished, each millimeter wave base station uploads its local model parameters to the macro base station for model parameter aggregation, yielding the global beam management model (the aggregation rule is reproduced as a formula image in the original document), where n represents the total amount of training data of all millimeter wave base stations participating in training.
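The aggregation rule itself is an image in the filing; the sketch below assumes the usual federated-averaging form, in which each base station's parameters are weighted by its share n_i / n of the total training data.

```python
import numpy as np

def aggregate_global_model(local_params, local_data_sizes):
    """Assumed rule: G = sum over i of (n_i / n) * theta_i."""
    n = float(sum(local_data_sizes))                       # total training data volume
    global_params = np.zeros_like(local_params[0], dtype=float)
    for theta_i, n_i in zip(local_params, local_data_sizes):
        global_params += (n_i / n) * np.asarray(theta_i, dtype=float)
    return global_params

# Example: three base stations with different amounts of local data
thetas = [np.array([1.0, 2.0]), np.array([3.0, 0.0]), np.array([0.0, 1.0])]
print(aggregate_global_model(thetas, [10, 30, 60]))        # -> [1.  0.8]
```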
Step 7: each millimeter wave base station downloads the global model parameters from the macro base station to update its local model, and makes its beam configuration decision according to the current user information.
Step 8: return to Step 2 and wait for the next round of beam optimization.
In this embodiment, a beam management model is established based on the DDQN algorithm under a federated learning framework, so as to maximize the long-term throughput while realizing intelligent beam management. The local model parameters trained by the millimeter wave base stations in the system are uploaded to the macro base station for aggregation to obtain the global model, and the local models are then updated by downloading the global model parameters, so that the influence of surrounding cells is taken into account while the characteristics of the local data are preserved. The scheme reduces the transfer of raw user data while finding the optimal beam allocation strategy of the system, thereby greatly protecting user privacy.
The second embodiment of the invention:
In this embodiment, the beam management method based on federated reinforcement learning is applied to a millimeter wave homogeneous network, as shown in FIG. 3. Each millimeter wave base station acts as an agent for distributed learning and cooperation, and the role of the central control node can be temporarily taken by one of the millimeter wave base stations. Information sharing is realized by establishing the corresponding communication links between the nodes and the users, and the millimeter wave base stations are connected through the X2 interface. Adjacent millimeter wave base stations can form a millimeter wave base station cluster, and base stations belonging to the same cluster share model parameters in a federated manner. FIG. 3 shows three millimeter wave base station clusters; the millimeter wave base stations in each cluster train local beam management models with DDQN, and the beam management model of the cluster is then obtained through model aggregation. Model parameters are shared among the clusters in the same way, so that the globally optimal beam management strategy is obtained. Similar to the first embodiment, the specific implementation steps are as follows:
Step 1: construct the millimeter wave base station side beam management model. Specifically, the beam management problem on the millimeter wave base station side can be described by a Markov decision process and solved by DDQN. In the homogeneous network shown in FIG. 3, the set of millimeter wave base stations participating in training within the same cluster is denoted by M = {1, 2, ..., M}, and the set of users is denoted by U = {1, 2, ..., U}. Each element of the Markov decision process is defined as follows.
(1) The state space at time t is defined as S_t = {U_m(t), C_m(t), D_k(t)}, where U_m(t) represents the set of users served by millimeter wave base station m at time t, C_m(t) represents the set of beam sectors covered by millimeter wave base station m at time t, and D_k(t) = {C_k(t)}, k = 1, 2, ..., M, k ≠ m, represents the sets of sectors covered by the millimeter wave base stations other than base station m at time t.
(2) The action space at time t is defined as A_t = {C_m(t)}.
(3) The probability of the millimeter wave base station's network state transferring from S_t to S_{t+1} at time t is defined as P = Pr{S_{t+1} | S_t}.
(4) The reward function at time t is defined as R_t = R(t), where R(t) represents the system throughput at time t and is determined by the received data rates r_u(t) of the users u at time t (the closed-form expressions are reproduced as formula images in the original document). Here W_m denotes the bandwidth allocated to the user by millimeter wave base station m, M_c(t) denotes the set of millimeter wave base stations associated with the user at time t, and SINR_{u,m} denotes the signal-to-interference-plus-noise ratio when user u is associated with millimeter wave base station m.
Step 2: initialize the parameters of the millimeter wave base station side beam management model. Each millimeter wave base station initializes its local model by downloading the global model parameters delivered by the central control node. The local model parameter θ_{t+1} is updated according to a rule (reproduced as a formula image in the original document) in which G represents the global model parameter output by the system after the previous communication round, ρ is the model learning rate, L(θ_t) represents the loss function of the i-th millimeter wave base station, and n_i represents the amount of training data of the i-th millimeter wave base station. After initialization is complete, the base station begins training the local model.
Step 3: each millimeter wave base station collects the position information of its local users and trains its local beam management model with DDQN. Each millimeter wave base station completes the beam configuration for the current round of communication according to the initial model parameters obtained in Step 2, trains the corresponding beam management model with the user position information, and thereby obtains the optimal beam management strategy. Before model training, each millimeter wave base station may perform one round of user data screening to ensure the validity and diversity of the training data. The DDQN evaluates the Q value with the training neural network, whose weights are θ_t, and estimates Q_max with the target neural network, whose weights are denoted θ_t^-. The Q value of taking action A_t in state S_t is represented by the Q function Q(S_t, A_t; θ_t). The target value is updated as Y_t^DDQN = R_{t+1} + γ Q(S_{t+1}, argmax_a Q(S_{t+1}, a; θ_t); θ_t^-), the standard double-DQN form, where R_{t+1} is the reward at time t+1 and γ ∈ [0, 1] is the discount factor. The purpose of DDQN is to minimize the difference between the target network and the training network, i.e., the loss function L(θ) = E[(Y_t^DDQN - Q(S_t, A_t; θ_t))^2]. The core of the method is to determine the optimal beam management scheme by minimizing this loss function.
Step 4: update the local model parameters with stochastic gradient descent. The local model is updated with SGD as θ_{t+1} = θ_t - λ∇L(θ_t), where λ is the step size.
Step 5: repeat Steps 3 and 4, and go to Step 6 after the local model converges or a certain number of iterations is reached.
Step 6: after training, each millimeter wave base station uploads its local model parameters to the central control node for model parameter aggregation, yielding the global beam management model (the aggregation rule is reproduced as a formula image in the original document), where n represents the total amount of training data of all millimeter wave base stations participating in training.
Step 7: each millimeter wave base station downloads the global model parameters from the central control node to update its local model, and makes its beam configuration decision according to the current user information.
Step 8: return to Step 2 and wait for the next round of beam optimization.
In this embodiment, a fully distributed federated learning framework is used and the beam management model is trained with DDQN, so as to maximize the long-term throughput of the system while realizing intelligent beam management. The training method of the beam management model within a cluster is the same as in the first embodiment. The difference from the first embodiment is that the central control node in charge of aggregation is no longer a fixed macro base station; instead, the role is taken by the millimeter wave base stations within the cluster in turn, or is determined by selection conditions (for example, the execution time of local model training, the number of served users, and the like). If the millimeter wave base station carrying the aggregation task is selected on the basis of such conditions, the base station with the shortest local model training time, or the base station serving the fewest users in the current period, may be chosen; this mainly accounts for factors such as model training efficiency and local computing resources. If the system contains multiple such clusters and model interaction is performed in each cluster under the federated learning framework, the globally optimal beam management strategy in the millimeter wave homogeneous system can be obtained by sharing models among the clusters.
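As an illustration of the rotating central-control-node role, the sketch below picks the cluster head by the selection conditions mentioned above (shortest local training time, then fewest served users); the data structure and the tie-breaking order are assumptions.

```python
from dataclasses import dataclass

@dataclass
class BaseStationStatus:
    bs_id: int
    training_time_s: float    # execution time of local model training
    served_users: int         # number of users served in the current period

def select_cluster_head(statuses):
    """Pick the base station with the shortest training time, then the fewest users."""
    return min(statuses, key=lambda s: (s.training_time_s, s.served_users)).bs_id

cluster = [BaseStationStatus(1, 2.4, 15),
           BaseStationStatus(2, 1.8, 22),
           BaseStationStatus(3, 1.8, 9)]
print(select_cluster_head(cluster))   # -> 3
```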
Finally, although the present invention has been described in detail with reference to the preferred embodiments, it should be understood by those skilled in the art that various changes and modifications may be made therein without departing from the spirit and scope of the invention as defined by the appended claims.

Claims (8)

1. A millimeter wave network beam management method based on federated reinforcement learning, characterized in that the method comprises the following steps:
s1: constructing a millimeter wave base station side beam management model;
s2: initializing parameters of a millimeter wave base station side beam management model;
s3: each millimeter wave base station respectively collects the position information of a local user and trains a local beam management model by using a reinforcement learning algorithm;
s4: updating local model parameters by using stochastic gradient descent (SGD);
s5: repeating S3 and S4, and entering S6 after the local model converges or iterates for N times;
s6: each millimeter wave base station uploads the local model parameters to a central control node or a central server for model parameter aggregation, and a global model of beam management is obtained;
s7: each millimeter wave base station downloads global model parameters from a central control node to update a local model, and carries out beam configuration decision according to current user information;
s8: and returning to S2, and waiting for next round of beam optimization.
2. The method of claim 1, wherein: in S1, the millimeter wave base station side beam management problem is modeled as a Markov decision process and solved by a reinforcement learning algorithm; a Markov decision process typically contains four elements S, A, P, R, where S represents the state space of the Markov decision process, A represents the action space, P represents the state transition probability, and R represents the reward value.
3. The method of claim 2, wherein: in S1, the input of the Markov decision process may be determined by the set of sectors covered by the millimeter wave base station beams; the beam management strategy at time t is represented as the set of sectors covered by all millimeter wave base stations at time t, i.e., the system beam management strategy C(t) = {C_1(t), C_2(t), ..., C_M(t)}, where C_m(t) represents the set of sectors covered by millimeter wave base station m at time t; a proper strategy enables each millimeter wave base station to cover more users with its limited beams, thereby improving beam utilization and system throughput.
4. The method of claim 3, wherein: in S1, the Markov decision process is solved with the reinforcement learning algorithm DDQN, and an initial model is established; the initial model consists of two four-layer fully connected neural networks, namely a training neural network and a target neural network; the training neural network is used to evaluate the value of the current action-state pair, i.e., the Q value, while the target neural network is used to determine the maximum Q value, denoted Q_max, and the difference E[(Q_max - Q)^2] between the two networks is defined as the loss function; the neural networks use the ReLU function as the activation function, and the system sum rate is fed back as the reward; compared with traditional reinforcement learning algorithms, DDQN adds an experience replay pool and model evaluation, which alleviates the model bias caused by overestimation in DQN and the excessive state-space and action-space overhead of the Q-learning algorithm.
5. The method of claim 4, wherein: in S1, the beam management strategy is executed periodically, and the beam configuration of each millimeter wave base station remains unchanged within the same period; each period contains three parts: 1) each millimeter wave base station configures its beams with the federated reinforcement learning algorithm according to the network performance accumulated in the previous period; 2) users select appropriate millimeter wave base stations for association; 3) millimeter wave communication links are established for data transmission.
6. The method of claim 5, wherein: in S2, the local model for millimeter wave base station side beam management is initialized; at the beginning of each beam management period, every millimeter wave base station updates its local model with the global model parameters downloaded from the central control node, so that local model convergence is reached faster while local characteristics are retained; after the model is initialized, the millimeter wave base station configures its base-station-side beams for the current round according to the optimal beam management strategy obtained in the previous period; the corresponding local model is then trained with the latest user position information.
7. The method of claim 6, wherein: in S3, a user data screening mechanism is applied before beam management model training to ensure the validity and diversity of the data participating in model training; the millimeter wave base station judges relevance by computing the distance between itself and each user, and selects the position information of users within its coverage area as valid data for model training; in addition, whether a user participates in training is decided according to the user's historical participation count, and users with fewer historical participations are preferentially included, so as to ensure the diversity of the training data.
8. The method of claim 7, wherein: in S6, the global model is updated with the model parameters obtained in S5; each millimeter wave base station uploads its trained local model parameters to the central control node, where they are aggregated to update the global model.
CN202211088629.7A 2022-09-07 2022-09-07 Millimeter wave network beam management method based on federal reinforcement learning Pending CN115580879A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211088629.7A CN115580879A (en) 2022-09-07 2022-09-07 Millimeter wave network beam management method based on federal reinforcement learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211088629.7A CN115580879A (en) 2022-09-07 2022-09-07 Millimeter wave network beam management method based on federal reinforcement learning

Publications (1)

Publication Number Publication Date
CN115580879A true CN115580879A (en) 2023-01-06

Family

ID=84580561

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211088629.7A Pending CN115580879A (en) 2022-09-07 2022-09-07 Millimeter wave network beam management method based on federal reinforcement learning

Country Status (1)

Country Link
CN (1) CN115580879A (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111447618A (en) * 2020-03-13 2020-07-24 重庆邮电大学 Intelligent reflector energy efficiency maximum resource allocation method based on secure communication
CN113411110A (en) * 2021-06-04 2021-09-17 东南大学 Millimeter wave communication beam training method based on deep reinforcement learning
CN113709701A (en) * 2021-08-27 2021-11-26 西安电子科技大学 Millimeter wave vehicle networking combined beam distribution and relay selection method
WO2022121985A1 (en) * 2020-12-10 2022-06-16 北京邮电大学 Static and dynamic combined millimeter wave beam resource allocation and optimization method

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111447618A (en) * 2020-03-13 2020-07-24 重庆邮电大学 Intelligent reflector energy efficiency maximum resource allocation method based on secure communication
WO2022121985A1 (en) * 2020-12-10 2022-06-16 北京邮电大学 Static and dynamic combined millimeter wave beam resource allocation and optimization method
CN113411110A (en) * 2021-06-04 2021-09-17 东南大学 Millimeter wave communication beam training method based on deep reinforcement learning
CN113709701A (en) * 2021-08-27 2021-11-26 西安电子科技大学 Millimeter wave vehicle networking combined beam distribution and relay selection method

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
JIAN WANG; QING XUE: "Beam Management in Ultra-dense Millimeter Wave Network via Federated Learning", 2021 IEEE GLOBAL COMMUNICATIONS CONFERENCE (GLOBECOM), 11 December 2021 (2021-12-11) *
LI, ZHONGJIE; WU, WANMIN; GAO, WEI: "Deep learning based relay selection for D2D millimeter wave communication", Journal of South-Central University for Nationalities (Natural Science Edition), no. 03, 15 June 2020 (2020-06-15), pages 2 - 4 *
MA, WENYAN; QI, CHENHAO: "Deep learning based millimeter wave beam selection method for uplink transmission", Journal of Hefei University of Technology (Natural Science Edition), no. 12, 28 December 2019 (2019-12-28) *


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination