CN115173926B - Communication method and communication system of star-ground fusion relay network based on auction mechanism

Publication number
CN115173926B
CN115173926B CN202210806700.4A CN202210806700A CN115173926B CN 115173926 B CN115173926 B CN 115173926B CN 202210806700 A CN202210806700 A CN 202210806700A CN 115173926 B CN115173926 B CN 115173926B
Authority
CN
China
Prior art keywords
satellite
relay node
relay
potential
bidding
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210806700.4A
Other languages
Chinese (zh)
Other versions
CN115173926A (en
Inventor
谢卓辰
杨文歆
晏睦彪
韩欣洋
刘会杰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Engineering Center for Microsatellites
Innovation Academy for Microsatellites of CAS
Original Assignee
Shanghai Engineering Center for Microsatellites
Innovation Academy for Microsatellites of CAS
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Engineering Center for Microsatellites, Innovation Academy for Microsatellites of CAS filed Critical Shanghai Engineering Center for Microsatellites
Priority to CN202210806700.4A priority Critical patent/CN115173926B/en
Publication of CN115173926A publication Critical patent/CN115173926A/en
Application granted granted Critical
Publication of CN115173926B publication Critical patent/CN115173926B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04B TRANSMISSION
    • H04B7/00 Radio transmission systems, i.e. using radiation field
    • H04B7/14 Relay systems
    • H04B7/15 Active relay systems
    • H04B7/185 Space-based or airborne stations; Stations for satellite systems
    • H04B7/1851 Systems using a satellite or space-based relay
    • H04B7/18519 Operations control, administration or maintenance
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/14 Network analysis or design
    • H04L41/145 Network analysis or design involving simulating, designing, planning or modelling of a network
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D30/00 Reducing energy consumption in communication networks
    • Y02D30/70 Reducing energy consumption in communication networks in wireless communication networks

Abstract

The invention provides a communication method and a communication system for a satellite-ground fusion relay network based on an auction mechanism. The method comprises the following steps: the satellite broadcasts cooperative signaling to a plurality of potential relay nodes, with the satellite's time slot serving as the commodity in the auction mechanism; each potential relay node predicts whether it will participate in bidding at the current moment, and all potential relay nodes that participate in bidding are taken as first relay nodes; each first relay node evaluates the value of the channels according to the channel performance and reports a bidding vector comprising the valuation to the satellite, wherein the channels comprise the satellite-relay channel, the relay-satellite user channel and the relay-ground user channel, and the channel performance comprises channel gain and sub-slot allocation length; and the satellite, based on the auction mechanism and the bidding vectors, selects from the plurality of first relay nodes the first relay node corresponding to the maximum bidding vector as the winning relay node, and the winning relay node and the satellite establish a cooperative relationship.

Description

Communication method and communication system of star-ground fusion relay network based on auction mechanism
Technical Field
The invention mainly relates to the technical field of satellite communication, in particular to a communication method and a communication system of a satellite-ground fusion relay network based on an auction mechanism.
Background
Researchers have long focused on terrestrial mobile cellular networks. Terrestrial mobile communication systems offer large communication capacity, low network delay, high spectral efficiency and mature technology, and they cover densely populated urban areas well. However, because of topographic and economic constraints, terrestrial mobile networks cannot cover most of the globe, such as oceans and remote areas. With its strong wide-area coverage capability, a satellite communication system can provide seamless Internet broadband service to users worldwide, particularly in remote areas, oceans and disaster areas with damaged links, which traditional terrestrial cellular networks do not cover. Low-orbit satellites, in turn, cannot provide effective coverage for densely populated areas because of line-of-sight transmission, shadowing and blockage effects. Researchers have therefore proposed the concept of a "satellite-ground convergence network" that merges satellite communication systems with terrestrial communication systems. However, satellite and terrestrial communication systems differ in architecture and have developed along separate paths; how to exchange resources between the two systems so that both benefit is an important research subject for satellite-ground integrated networks. To further improve the coverage of the satellite mobile communication network and provide high-data-rate service, the terrestrial mobile communication system can amplify and forward signals for the satellite, i.e. act as a relay for satellite signals. In this way, shadowing and fading effects are countered while the terrestrial mobile communication system obtains an opportunity to access the satellite-ground shared frequency band.
In this case, the terrestrial mobile communication system and the satellite communication system can be regarded as a whole, called a satellite-ground fusion relay network.
At present, satellite-ground cooperative communication usually puts all devices with a relay function into a single candidate set. However, one satellite beam can cover several kilometers to tens of kilometers, so when a large number of potential relay nodes exist the computation load easily becomes excessive. Beyond the computational scale, conventional relay selection in satellite-ground converged networks usually assumes that every node is honest. In practice this may not hold: "dishonest" potential relay nodes that report false information may be scheduled frequently, which reduces the transmission rate of the primary user. In addition, existing research is usually based on quasi-static scenarios and ignores the handover overhead incurred when the satellite selects different relays in successive time snapshots because of the dynamics of a low-orbit satellite. Relays are selected, and signals amplified and forwarded, purely by opportunistic scheduling. For a highly dynamic communication system such as a low-orbit satellite, this causes the selected relay to change repeatedly and control signaling to be exchanged frequently, which increases communication delay and computation overhead.
Disclosure of Invention
The invention aims to solve the technical problem of providing a communication method and a communication system of a star-ground fusion relay network based on an auction mechanism for realizing high-efficiency star-ground cooperative communication in a dynamic environment.
In order to solve the technical problem, the invention provides a communication method of a star-to-ground fusion relay network based on an auction mechanism, wherein the star-to-ground fusion relay network comprises at least one satellite, a plurality of potential relay nodes and a plurality of satellite users, and each potential relay node serves a plurality of ground users covered by the potential relay nodes on the ground, and the communication method is characterized by comprising the following steps: the satellite broadcasts cooperative signaling to the plurality of potential relay nodes, and the time slot of the satellite is used as commodity in an auction mechanism; each potential relay node predicts whether the potential relay node participates in bidding at the current moment, and takes all the potential relay nodes participating in bidding as a first relay node, wherein the participation bidding indicates that the potential relay node is willing to acquire the commodity; the first relay node evaluates the channel value according to the channel performance and reports bidding vectors comprising the estimated value to the satellite, wherein the channels comprise satellite-relay channels, relay-satellite user channels and relay-ground user channels, and the channel performance comprises channel gain and sub-slot allocation length; and the satellite selects a first relay node corresponding to the maximum bidding vector from a plurality of first relay nodes based on an auction mechanism and the bidding vector as a winning relay node, and the winning relay node and the satellite achieve a cooperative relationship.
In an embodiment of the present application, each potential relay node predicts whether the potential relay node participates in bidding at the current moment based on a historical success rate and scene information, wherein the historical success rate is historical data of a cooperative relationship between the potential relay node and the satellite, and the scene information includes satellite-relay channel state information, relay-ground user channel state information and relay-satellite user channel state information obtained by a ground network.
In one embodiment of the present application, each potential relay node makes the prediction based on a reinforcement learning model. The reinforcement learning model is based on the Q-learning algorithm and takes the potential relay node as the agent. The satellite transit period is divided into N time segments, and the model parameters of the reinforcement learning model comprise the state, action and reward of the potential relay node in a given time segment. The state is defined as a two-tuple (equation images not reproduced in the source), whose components are expressed in terms of the following quantities: the channel gain between satellite S and potential relay node R_k; P_s, the transmit power of the satellite; σ^2, the noise power of the satellite; the channel gain between potential relay node R_k and a ground user under the coverage of R_k; and the residual power of potential relay node R_k. The action is defined as the action set A = {Y, N}, wherein Y represents bidding and N represents abandoning the bid. The reward is represented by the discounted return, defined as the accumulated sum of each instant reward of the agent in the learning process multiplied by the corresponding power of the discount factor. The action-value function Q(s, a) is defined as the expected discounted return of taking action a ∈ A in state s within a given time segment, and the reinforcement learning model updates the action-value function to obtain the optimal action-value function Q*(s, a).
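The Q-learning embodiment above can be illustrated with a minimal tabular sketch. It assumes the standard Q-learning update rule; the learning rate `alpha`, the epsilon-greedy exploration rule, the discretized state tuples and the reward value are all hypothetical choices made for illustration and are not taken from the patent.

```python
import random
from collections import defaultdict

ACTIONS = ("Y", "N")  # Y: bid, N: abandon the bid (action set A from the text)


def choose_action(q, state, epsilon, rng):
    """Epsilon-greedy choice over the two actions (exploration rule is an assumption)."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: q[(state, a)])


def q_update(q, state, action, reward, next_state, alpha=0.1, discount=0.9):
    """Standard Q-learning step:
    Q(s, a) += alpha * (r + discount * max_a' Q(s', a') - Q(s, a))."""
    best_next = max(q[(next_state, a)] for a in ACTIONS)
    q[(state, action)] += alpha * (reward + discount * best_next - q[(state, action)])
    return q[(state, action)]


# Q-table with all entries initialized to zero.
q_table = defaultdict(float)

# Hypothetical discretized two-tuple states (link-quality levels), for illustration only.
s, s_next = (2, 1), (2, 0)
new_value = q_update(q_table, s, "Y", reward=1.0, next_state=s_next)
greedy = choose_action(q_table, s, epsilon=0.0, rng=random.Random(0))
```

In this sketch a relay node would call `choose_action` once per time segment and `q_update` after observing whether the bid led to cooperation; how the reward is computed is left open here because the source does not specify it in the surviving text.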
In an embodiment of the present application, each potential relay node makes the prediction based on a reinforcement learning model that uses the Double DQN algorithm to predict whether the potential relay node participates in bidding at the current moment. The satellite transit period is divided into P time slices of equal duration, and the model parameters of the reinforcement learning model include the state, action and reward of the potential relay node in a given time slice. The state is defined as a six-tuple (equation image not reproduced in the source), whose components are: the channel gain between satellite S and potential relay node R_k; the channel gain between potential relay node R_k and satellite user D; the satellite tilt angle of the satellite; the visibility angle of the potential relay node to the satellite; the residual power of the potential relay node; and θ, the movement angle of the satellite, with θ ∈ [0, 180°]. The action is expressed as a = π(s), wherein π(·) is a policy representing the mapping from the environment state s to the action a. The reward is represented by a numerical value. The reinforcement learning model comprises a Q learning network and a target learning network. Data from the environment are input into the Q learning network, and the system selects the action with the maximum value output by the Q learning network. The parameters of the Q learning network are copied into the target learning network at intervals, and, under a loss function computed with the target learning network, the parameters of the Q learning network are updated by backpropagation so as to gradually obtain an optimized Q value.
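To make the role of the two networks concrete, the following sketch computes the training target used in the standard Double DQN algorithm: the online (Q learning) network picks the next action and the target network evaluates it. The networks themselves are replaced by plain lists of Q-values, and the function names, discount factor and numbers are hypothetical; this is not the patent's implementation.

```python
def double_dqn_target(reward, next_q_online, next_q_target, discount=0.9, done=False):
    """Double DQN target: the online network selects the action,
    the target network supplies the value of that action."""
    if done:
        return reward
    best_action = max(range(len(next_q_online)), key=lambda a: next_q_online[a])
    return reward + discount * next_q_target[best_action]


def sync_target(online_params, target_params):
    """Periodic hard copy of the online parameters into the target network
    (parameters are modelled as a plain dict in this sketch)."""
    target_params.clear()
    target_params.update(online_params)


# Hypothetical Q-values of the next state for the actions (bid, abandon).
y = double_dqn_target(reward=1.0, next_q_online=[0.2, 0.7], next_q_target=[0.5, 0.3])

# Hypothetical parameter dictionaries standing in for network weights.
online = {"w": 0.42}
target = {"w": 0.0}
sync_target(online, target)
```

The online network prefers the second action (0.7), so the target uses the target network's value for that action (0.3) rather than the target network's own maximum (0.5); this decoupling is what distinguishes Double DQN from plain DQN.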
In an embodiment of the present application, further includes: the satellite broadcasts information of the winning relay node; upon arrival of a service period, the satellite transmits a satellite signal to the winning relay node; and the winning relay node accesses a satellite frequency band of the satellite and provides services for ground users.
In an embodiment of the present application, one transmission frame of the satellite and the winning relay node comprises (N+3) time slots: the first time slot is used for the satellite to broadcast the cooperative signaling to all potential relay nodes; the second time slot is used for the satellite to transmit one or more satellite signals to the winning relay node; the third time slot is used for the winning relay node to forward the satellite signals to the satellite user; and the remaining N time slots are used for the winning relay node to serve the ground users within its coverage. The service period comprises the (N+2) time slots other than the first time slot.
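The frame structure can be written out as a short sketch that lists the role of each slot for a relay serving N ground users. The function name and the slot labels are illustrative assumptions; only the slot order and counts come from the text.

```python
def frame_layout(n_ground_users):
    """Return the role of each slot in one transmission frame of N + 3 slots."""
    slots = [
        "satellite broadcasts cooperative signaling",       # slot 1
        "satellite -> winning relay",                        # slot 2
        "winning relay -> satellite user",                   # slot 3
    ]
    # Remaining N slots: one per ground user served by the winning relay.
    slots += [f"winning relay -> ground user {n}" for n in range(1, n_ground_users + 1)]
    return slots


frame = frame_layout(4)        # N = 4 ground users, so 7 slots in total
service_period = frame[1:]     # everything except the broadcast slot: N + 2 slots
```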
In an embodiment of the present application, the auction mechanism is a Vickrey (second-price) auction mechanism, and the winning relay node is the first relay node having the largest bid vector.
In an embodiment of the present application, after the winning relay node is determined, the second-largest bid vector is taken as the winning relay node's bid value.
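The winner-selection rule of the two preceding embodiments corresponds to a second-price rule: the highest bidder wins and is charged the second-highest bid. A minimal sketch follows, assuming each bid has already been reduced to a single scalar; the relay identifiers and the single-bidder fallback are assumptions made here.

```python
def second_price_winner(bids):
    """bids: dict mapping relay id -> scalar bid.
    Returns (winner, price), where price is the second-highest bid."""
    if not bids:
        return None, None
    ranked = sorted(bids.items(), key=lambda item: item[1], reverse=True)
    winner = ranked[0][0]
    # Assumption: with a single bidder there is no second bid, so the
    # bidder's own bid is used as the price.
    price = ranked[1][1] if len(ranked) > 1 else ranked[0][1]
    return winner, price


winner, price = second_price_winner({"R1": 3.0, "R2": 5.0, "R3": 4.0})
```

Because the price paid does not depend on the winner's own bid, overstating or understating a valuation cannot improve a bidder's outcome, which is the property that makes truthful reporting a dominant strategy in this kind of auction.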
In one embodiment of the present application, each component of the bid vector is calculated step by step using a set of formulas (equation images not reproduced in the source), wherein the set of first relay nodes is denoted as R = {R_1, R_2, …, R_k, …, R_K}, and K represents the total number of first relay nodes; the k-th first relay node R_k serves N ground users, denoted U_k = {U_k1, U_k2, …, U_kn, …, U_kN}. The formulas use the following quantities (symbols that appeared only in the equation images are described in words): the initial bid value provided by the k-th first relay node R_k; t_1, the length of the second transmission slot of first relay node R_k; t_2, the length of the third transmission slot of first relay node R_k; t_kn (n = 1 to N), the lengths of the remaining N slots of first relay node R_k; the signal-to-noise ratio of the satellite-first relay link; the signal-to-noise ratio of the first relay-ground user link; the signal-to-noise ratio of the first relay-satellite user link; the channel capacity of the link that first relay node R_k provides for satellite user D; the QoS requirement of the ground user, i.e. the minimum data transmission rate that must be met for the ground user; the channel gain of the link from first relay node R_k to satellite user D; and the channel gains of the three links satellite-first relay node, first relay node-satellite user and first relay node-ground user. N and K are positive integers greater than or equal to 1.
In an embodiment of the present application, after the first relay node reports the bid vector to the satellite, the method further includes: the satellite corrects the bid vector based on handover overhead indexes to obtain a corrected bid vector, where the handover overhead indexes include one or more of bid history information, predicted residence time and movement angle information; and the satellite selects, from the first relay nodes and according to the corrected bid vectors, the first relay node corresponding to the maximum corrected bid vector as the winning relay node. The bid history information is obtained from the set of all winning relay nodes selected by the satellite over the time sequence; the predicted residence time and the movement angle information are derived from a geometric model.
In one embodiment of the present application, the bid vector is corrected using a formula (equation image not reproduced in the source), wherein b_k represents the corrected bid vector; the normalized bid vector is obtained by normalizing the bid vector; γ represents a coefficient that balances the data transmission rate against the handover overhead, where a larger γ means the satellite node places more value on the data transmission rate and a smaller γ means it places more value on reducing the handover overhead; the decision matrix has entries x_kj, each representing one of the handover overhead indexes, where k is the sequence number of the first relay node, K is the total number of first relay nodes, and j is the sequence number of the handover overhead index; and w_j represents the weight of the j-th handover overhead index.
In an embodiment of the present application, the weights are determined using an entropy weight method.
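The entropy weight method is a standard technique and can be sketched directly. The bid correction in the second function is only an assumed form: because the patent's correction formula did not survive extraction, the sketch assumes a convex combination of the normalized bid and the weighted handover score, min-max normalization of the bids, and indicator values already scaled so that larger means lower handover overhead. All function names and numbers are hypothetical.

```python
import math


def entropy_weights(matrix):
    """Entropy weight method for a K x J matrix of non-negative indicator values.
    Columns whose values vary more across relays receive larger weights."""
    k, j = len(matrix), len(matrix[0])
    diversities = []
    for col in range(j):
        total = sum(row[col] for row in matrix)
        entropy = 0.0
        for row in matrix:
            p = row[col] / total if total > 0 else 0.0
            if p > 0:
                entropy -= p * math.log(p)
        if k > 1:
            entropy /= math.log(k)  # scale the entropy into [0, 1]
        diversities.append(1.0 - entropy)
    spread = sum(diversities)
    if spread <= 1e-12:  # no column discriminates between relays: equal weights
        return [1.0 / j] * j
    return [d / spread for d in diversities]


def corrected_bids(bids, matrix, gamma):
    """ASSUMED form, not the patent's formula:
    b_k = gamma * normalized_bid_k + (1 - gamma) * sum_j w_j * x_kj."""
    lo, hi = min(bids), max(bids)
    normalized = [(b - lo) / (hi - lo) if hi > lo else 1.0 for b in bids]
    weights = entropy_weights(matrix)
    return [
        gamma * nb + (1.0 - gamma) * sum(w * x for w, x in zip(weights, row))
        for nb, row in zip(normalized, matrix)
    ]


# Two relays, two hypothetical handover indicators scaled to [0, 1].
indicators = [[0.5, 0.9], [0.5, 0.1]]
weights = entropy_weights(indicators)
rate_only = corrected_bids([2.0, 4.0], indicators, gamma=1.0)
handover_only = corrected_bids([2.0, 4.0], indicators, gamma=0.0)
```

With gamma set to 1 the ranking follows the raw bids, and with gamma set to 0 it follows the handover indicators alone, matching the balancing role the text assigns to the coefficient.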
To solve the above technical problem, the present application further provides a communication system of a satellite-ground fusion relay network based on an auction mechanism. The satellite-ground fusion relay network comprises at least one satellite, a plurality of potential relay nodes and a plurality of satellite users, and each potential relay node serves a plurality of ground users within its coverage on the ground. The communication system is characterized in that: the satellite is used for broadcasting cooperative signaling to the plurality of potential relay nodes, with the time slot of the satellite serving as the commodity in the auction mechanism; each potential relay node is used for predicting whether it participates in bidding at the current moment, and all potential relay nodes participating in bidding are taken as first relay nodes, wherein participating in bidding indicates that the potential relay node is willing to acquire the commodity; each first relay node is used for evaluating the value of the channels according to the channel performance and reporting a bidding vector comprising the valuation to the satellite, wherein the channels comprise the satellite-relay channel, the relay-satellite user channel and the relay-ground user channel, and the channel performance comprises the channel gain and the sub-slot allocation length; and the satellite is further used for selecting, based on the auction mechanism and the bidding vectors, the first relay node corresponding to the maximum bidding vector from the plurality of first relay nodes as the winning relay node, the winning relay node and the satellite establishing a cooperative relationship.
According to the communication method of the present application, a prediction model estimates the success rate of a potential relay node's participation in bidding and switches the base station between the bidding state and the non-bidding state, which reduces the scale of the satellite-ground cooperation matrix and thereby the computation overhead. Meanwhile, an auction mechanism with incentive compatibility and effectiveness is adopted, which ensures that reporting the true bid is a dominant strategy for each potential relay node and provides a guarantee for cooperation between the satellite mobile communication system and the terrestrial cellular mobile communication system. Finally, given the high dynamics and long delay of low-orbit satellite communication systems, the method takes into account that the relay selected changes across time snapshots and uses the handover overhead as an additional term to rewrite the bids in the auction process, so that the low-orbit satellite jointly considers transmission benefit and handover frequency, achieves multi-objective optimization, and reduces delay and signaling overhead.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this application, illustrate embodiments of the application and together with the description serve to explain the principles of the invention. In the accompanying drawings:
FIG. 1 is a schematic diagram of a scene of a star-to-ground converged relay network;
FIG. 2 is an exemplary flow chart of a communication method of an auction mechanism-based star-to-ground converged relay network in accordance with an embodiment of the present application;
FIG. 3 is a schematic information flow diagram in a communication method according to an embodiment of the present application;
FIG. 4 is a basic framework diagram of a reinforcement learning model in a communication method according to an embodiment of the present application;
FIG. 5 is a basic framework diagram of a Double DQN with empirical playback in a communication method according to an embodiment of the present application;
FIGS. 6A-6C illustrate satellite time slot assignments in three cooperative states;
FIG. 7 is a schematic diagram of the relationship between satellite beams and base station locations.
Detailed Description
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are used in the description of the embodiments will be briefly described below. It is apparent that the drawings in the following description are only some examples or embodiments of the present application, and it is obvious to those skilled in the art that the present application may be applied to other similar situations according to the drawings without inventive effort. Unless otherwise apparent from the context of the language or otherwise specified, like reference numerals in the figures refer to like structures or operations.
As used in this application and in the claims, the terms "a," "an," and/or "the" are not specific to the singular, but may include the plural, unless the context clearly dictates otherwise. In general, the terms "comprises" and "comprising" merely indicate that the steps and elements are explicitly identified, and they do not constitute an exclusive list, as other steps or elements may be included in a method or apparatus.
The relative arrangement of the components and steps, numerical expressions and numerical values set forth in these embodiments do not limit the scope of the present application unless it is specifically stated otherwise. Meanwhile, it should be understood that the sizes of the respective parts shown in the drawings are not drawn in actual scale for convenience of description. Techniques, methods, and apparatus known to one of ordinary skill in the relevant art may not be discussed in detail, but should be considered part of the specification where appropriate. In all examples shown and discussed herein, any specific values should be construed as merely illustrative, and not a limitation. Thus, other examples of the exemplary embodiments may have different values. It should be noted that: like reference numerals and letters denote like items in the following figures, and thus once an item is defined in one figure, no further discussion thereof is necessary in subsequent figures.
In addition, the terms "first", "second", etc. are used to define the components, and are merely for convenience of distinguishing the corresponding components, and unless otherwise stated, the terms have no special meaning, and thus should not be construed as limiting the scope of the present application. Furthermore, although terms used in the present application are selected from publicly known and commonly used terms, some terms mentioned in the specification of the present application may be selected by the applicant at his or her discretion, the detailed meanings of which are described in relevant parts of the description herein. Furthermore, it is required that the present application be understood, not simply by the actual terms used but by the meaning of each term lying within.
Flowcharts are used in this application to describe the operations performed by systems according to embodiments of the present application. It should be understood that the preceding or following operations are not necessarily performed precisely in order. Rather, the various steps may be processed in reverse order or simultaneously. Other operations may also be added to these processes, or one or more operations may be removed from them.
According to the communication method of the satellite-ground fusion relay network based on the auction mechanism, the auction mechanism is introduced, and a cooperative communication process between a satellite mobile communication system and a ground mobile cellular system is constructed. The satellite-ground fusion relay network comprises at least one satellite, a plurality of potential relay nodes and a plurality of satellite users.
In order to facilitate the explanation of the communication method of the present application, first, scene definition is performed on the star-ground fusion relay network. The satellite-ground convergence relay network comprises two sub-networks, wherein one sub-network comprises a satellite mobile communication network, namely a main network; the other sub-network comprises N terrestrial cellular mobile communication networks, i.e. terrestrial networks. Each ground network contains one potential relay node and a plurality of ground subscribers.
Fig. 1 is a schematic diagram of a scenario of a satellite-ground fusion relay network. Fig. 1 shows one satellite 110, three potential relay nodes 121, 122, 123 and one satellite user 130, which is also denoted hereinafter by the letter D. Fig. 1 is only an example and is not intended to limit the number or positional relationship of satellite nodes, potential relay nodes and satellite users in a satellite-ground fusion relay network. Satellite 110 may be a low-orbit satellite. In Fig. 1, potential relay nodes 121, 122, 123 are all base stations. A base station node with the relay function may be a 5G base station with receiving, amplifying and forwarding functions, or may be a D2D node or an ad hoc network. The present application describes a specific embodiment in which a base station is used as the potential relay node, which does not limit the specific type of the potential relay node.
In fig. 1, circles around each potential relay node 121, 122, 123 are used to represent the cell coverage of that potential relay node, each potential relay node being used to serve the terrestrial users within its coverage. In a terrestrial wireless communication system, a terrestrial user is any user terminal, such as a mobile terminal, that may use signals transmitted by the potential relay node, and the present application is not limited thereto. As shown in fig. 1, each satellite, each potential relay node, and the satellite user themselves have computing and storage capabilities, which can act as a computing node.
In a satellite-ground fusion relay network, the main network includes one satellite node S and one or more satellite user nodes, namely satellite user 130 shown in fig. 1. Satellite users 130 include ground stations, satellite terminals, and dual mode terminals. The present application discusses an example in which a main network has one satellite user node, and the ideas of the present application may also be used in a cooperative communication scenario in a frequency division multiplexing system that includes a plurality of satellite user nodes in the main network.
Also shown in Fig. 1 is an obstacle 140, such as a building. During movement of the satellite 110, the satellite signal may be blocked by the obstacle 140 so that the satellite cannot establish a direct communication link with the satellite user 130. In this case, a potential relay node that can receive the satellite signal and has a communication connection with the satellite user 130, for example potential relay node 123, may be selected as a relay; it receives the satellite signal, amplifies it and forwards it to the satellite user 130. This communication link is shown in dashed lines in Fig. 1.
In the satellite-ground shared frequency band, if the satellite-ground shared frequency band is not guided, each base station under the coverage of the satellite wave beam can be a potential relay node, thereby providing services for ground mobile communication users in the coverage area and amplifying and forwarding satellite signals. It is assumed that all transmitters and receivers are equipped with a single antenna, in half duplex mode. Assuming that the total number of base stations K > 1, each terrestrial network can assist at most one satellite user due to power constraints. The proper relay selection can effectively improve the spectrum efficiency and the transmission capacity of the main user network. Since the relay cooperation of the same channel has a large overhead in terms of synchronization and coding and also has an unnecessary co-channel interference effect, only one relay is selected to transmit its signal considering only one channel. Assuming that the line-of-sight link between satellite node S and satellite user D in fig. 1 is blocked by obstruction 140, a communication connection may be established between satellite node S and satellite user D using the communication method of the present application.
Fig. 2 is an exemplary flow chart of a communication method of a star-to-ground converged relay network based on an auction mechanism according to an embodiment of the present application. Referring to fig. 2, the communication method of this embodiment includes the steps of:
Step S210: broadcasting cooperative signaling to a plurality of potential relay nodes by the satellite, and taking the time slot of the satellite as a commodity in an auction mechanism;
Step S220: each potential relay node predicts whether it will participate in bidding at the current moment, and all potential relay nodes that participate in bidding are taken as first relay nodes, where participating in bidding indicates that the potential relay node wishes to obtain the commodity, i.e., is willing to cooperate;
Step S230: each first relay node evaluates the channel value according to the channel performance and reports a bid vector comprising the estimated value to the satellite, where the channels comprise the satellite-relay channel, the relay-satellite user channel and the relay-ground user channel, and the channel performance comprises the channel gain and the sub-slot allocation length;
Step S240: based on the auction mechanism and the bid vectors, the satellite selects, from the plurality of first relay nodes, the first relay node corresponding to the maximum bid vector as the winning relay node, and the winning relay node and the satellite establish a cooperative relationship.
The above steps S210 to S240 are described in detail with reference to the accompanying drawings.
Fig. 3 is a schematic information flow chart of a communication method according to an embodiment of the present application. Fig. 3 shows three execution bodies in the star-to-ground fusion relay network: the satellite 310, the base station cluster 320, and the satellite user 330. Depending on whether a cooperative relationship is established between the satellite 310 and one or more potential relay nodes in the base station cluster 320, Fig. 3 illustrates the flow of information in three states: (a) non-cooperative, (b) cooperation breaking, and (c) successful cooperation. In the non-cooperative state, for example, the satellite 310 can establish a communication connection directly with the satellite user 330 without selecting any potential relay node as a relay. In both the cooperation-breaking and successful-cooperation states, the satellite cannot establish a direct communication link with the target satellite user 330 because of blocking or other reasons, so it broadcasts cooperative signaling to the base stations in the base station cluster 320 to find a base station that can act as a relay. If a base station meets the conditions of a relay node, the cooperation process is executed, and the satellite selects, from the set of first relay nodes that are willing to cooperate, the relay with the highest bid vector or the highest corrected bid vector for cooperative communication. In some scenarios, for example when the ground base stations also suffer from severe shadowing or when there is no suitable base station near the satellite user, the bid vector or the corrected bid vector yields a negative profit at the satellite end; the satellite then refuses to cooperate, which is referred to as "cooperation breaking".
Whether the satellite 310 establishes a cooperative relationship with a given base station is determined by the above steps S210-S240. In Fig. 3, steps S311-S315 show the cooperation-breaking case, and steps S321-S329 show the successful-cooperation case. The communication method of the present application is described below with reference to Fig. 2 and Fig. 3.
Step S210 corresponds to step S311 and step S321 in fig. 3, in which the satellite 310 broadcasts cooperative signaling to a plurality of base stations or base station clusters 320. The present application introduces an auction mechanism to determine the final winning relay node, with the time slot of the satellite as the commodity in the auction mechanism at step S210.
An auction is a resource allocation mechanism that originates in economics. Classical auction mechanisms include English auctions, Dutch auctions, Vickrey auctions, second-price sealed-bid auctions, and the like. In the communication method of the present application, the slot length of the satellite is the resource, the first relay nodes are the bidders, and the bid of each first relay node is a channel value estimated from the channel performance; the estimation function for the channel value is described later.
Broadcasting the cooperative signaling by the satellite 310 at step S210 corresponds to initiating an auction. If every base station in the base station cluster 320 participated in the bidding, the result would be excessive computation and wasted resources. In the communication method of the present application, each base station therefore first performs a calculation in step S220 to determine whether to participate in bidding. As previously described, each base station is a computing node with storage and computing capabilities. A prediction model can therefore be constructed in each base station in advance; the model predicts the profit function at the current moment from the historical success rate and scene information, and the base station decides whether to participate in bidding according to the sign of the profit function, which greatly reduces the overall amount of computation. The historical success rate is historical data on the cooperative relationship between the base station and the satellite, and the scene information comprises three types of channel state information (CSI): satellite-relay CSI, relay-ground user CSI, and relay-satellite user CSI.
Let the size of the original potential relay set be K', where K' is the total number of base stations under the beam that are willing to cooperate. After the filtering in step S220, the relay set is reduced to size K, where K < K', and it includes only the first relay nodes participating in bidding.
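The screening in step S220 can be pictured with a small Python sketch. It is only an illustration under stated assumptions: the `BaseStation` class, the `predicted_profit` field (standing in for the output of each station's local prediction model), the numbers, and the rule "bid if and only if the predicted profit is positive" are hypothetical simplifications, not the implementation of the application.

```python
from dataclasses import dataclass


@dataclass
class BaseStation:
    """Hypothetical record for one potential relay node."""
    name: str
    predicted_profit: float  # assumed output of the station's own prediction model


def select_first_relay_nodes(stations):
    """Keep only the stations whose predicted profit is positive, i.e. those that bid."""
    return [bs for bs in stations if bs.predicted_profit > 0]


# K' = 4 potential relay nodes under the beam (made-up numbers).
potential = [
    BaseStation("R1", 0.8),
    BaseStation("R2", -0.3),
    BaseStation("R3", 0.1),
    BaseStation("R4", -1.2),
]
bidders = select_first_relay_nodes(potential)  # the reduced set of size K
print(len(potential), len(bidders), [bs.name for bs in bidders])
```

Only the stations in `bidders` go on to report a bid to the satellite, so the on-board comparison is made over K entries instead of K'.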
In the communication method of the present application, it is assumed that the main network and all the ground networks are rational. The application considers a satellite-ground cooperative communication scenario without a central control node, so the information available to the satellite system and to the ground system is incomplete: a ground network can obtain only the three CSIs of the satellite-relay channel, the relay-ground user channel and the relay-satellite user channel, and the main network can obtain only the CSI of the satellite-satellite user channel. Both the main network and the ground networks seek to maximize their own benefit. In a satellite-ground fusion network, the wide beam of a satellite covers a large number of potential relay nodes. If each potential relay node participated in bidding directly, without assessing its likelihood of winning, the huge set of potential relays would cause a dramatic increase in computation and communication overhead. In the present application, a prediction model is built on each potential relay node to predict whether that node participates in bidding. This makes full use of the computing and storage capabilities of the ground potential relay nodes, greatly reduces the bidding scale, and lowers both the communication overhead and the on-board computation overhead.
In step S220, the present application does not limit the specific algorithm of the prediction model. Two algorithms, Q-learning and Double DQN, in the reinforcement learning model are described below as examples.
In some embodiments, the reinforcement learning model is based on the Q-learning algorithm and treats the potential relay nodes as agents. The satellite transit period is divided equally into P time slices, and the model parameters of the reinforcement learning model include the state, action and reward of a potential relay node in a given time slice.
Fig. 4 is a basic framework schematic diagram of the reinforcement learning model in the communication method according to an embodiment of the present application. Referring to Fig. 4, the framework includes an agent 410 and an environment 420. In reinforcement learning, the agent 410, in its current state $s_t$, samples an action $a_t$ from the policy $\pi$ and executes it. After the environment 420 accepts the action, the state of the agent 410 changes to the next state $s_{t+1}$, and a reward signal $r_t$ is fed back to the agent 410. The purpose of reinforcement learning is to train a policy $\pi$ or an optimal action-value function $Q^*(s,a)$ so as to maximize the reward earned by the agent 410. $A_t$, $S_t$, $R_t$, etc. in Fig. 4 denote random variables; in the actual learning process, $a_t$, $s_t$, $r_t$ denote their observed values.
The policy $\pi(a \mid s)$ is a probability density function over the actions available in a given state s; that is, the action is not deterministic but is sampled from this probability distribution. After the agent acts according to the policy, it receives a reward and moves to a new state. Training on historical data yields an optimized policy, so that the actions predicted by the model improve.
In the embodiments of the present application, each potential relay node is considered an agent, modeling each agent. The agent may observe channel state information as part of its current state. The satellite transit period is isochronously divided into P time slices, each corresponding to a different "satellite-satellite user-potential relay node" geometry, which results in a corresponding different CSI, so the entire satellite transit process can be modeled as a reinforcement learning process.
In an embodiment where the prediction is based on the Q-learning algorithm, the following equation (1) is used to define the state:
[Equation (1), original formula image BDA0003738060530000131]
The state s consists of the channel state information and the transmit-side signal-to-noise ratios. Its two components (original formula images BDA0003738060530000132 and BDA0003738060530000133) are given by equations (2) and (3), respectively:
[Equation (2), original formula image BDA0003738060530000134]
[Equation (3), original formula image BDA0003738060530000135]
The quantities in these equations are: the channel gain between satellite S and potential relay node $R_k$ (original formula image BDA0003738060530000136); the transmit power $P_s$ of the satellite; the noise power $\sigma^2$ of the satellite; the channel gain between potential relay node $R_k$ and ground user $U_{kn}$ within its coverage area (original formula image BDA0003738060530000137); and the remaining power of potential relay node $R_k$ (original formula image BDA0003738060530000138). The channel state information is discretized in a preprocessing stage to obtain quantized channel gains.
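The discretization step mentioned above can be illustrated with a short sketch. The application does not specify the quantizer, so the uniform binning, the value range, and the names `quantize` and `make_state` below are assumptions chosen purely for illustration.

```python
def quantize(value, low, high, levels):
    """Map a value in [low, high] to an integer level 0..levels-1 (uniform bins, clamped)."""
    if levels < 1 or high <= low:
        raise ValueError("need levels >= 1 and high > low")
    ratio = (value - low) / (high - low)
    index = int(ratio * levels)
    return max(0, min(levels - 1, index))


def make_state(snr_sat_relay, snr_relay_user, low=0.0, high=1.0, levels=4):
    """Build a discrete state tuple from two continuous quantities (hypothetical layout)."""
    return (
        quantize(snr_sat_relay, low, high, levels),
        quantize(snr_relay_user, low, high, levels),
    )


state = make_state(0.26, 0.9)
print(state)
```

With `levels` bins per component, a state with two components has `levels ** 2` possible values, which is why a finer quantization quickly enlarges the table that a tabular method has to learn.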
In an embodiment of ground potential relay bidding prediction based on the Q-learning algorithm, the actions of a potential relay node are defined by the action set A = {Y, N}, where Y represents bidding and N represents giving up bidding. Bidding means that the relay node consumes its own power to send signaling to the satellite; the bid then succeeds with a certain probability, in which case the node obtains the opportunity to share the frequency band with the satellite. Giving up bidding means that the relay node incurs no overhead but loses the opportunity to share the frequency band with the satellite.
In embodiments where predictions are made based on the Q-learning algorithm, the reward is expressed as a discounted return, defined as the sum of the agent's instantaneous rewards during learning, each multiplied by the corresponding power of the discount factor. The action-value function Q(s, a) is defined as:
$Q(s,a) = \mathbb{E}[U_t \mid S_t = s, A_t = a]$ (4)
where $\mathbb{E}[\cdot]$ is the expectation and $U_t$ is the discounted return. That is, the action-value function Q(s, a) is the expected discounted return $U_t$ obtained when action $a \in A$ is taken in state s at time t. Q(s, a) depends not only on the state but also on the policy. Q(s, a) is the basis for determining which action is selected in state s, i.e., the basis for prediction.
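As a small worked example of the discounted return described in words above, written in the usual form $U_t = r_t + \gamma r_{t+1} + \gamma^2 r_{t+2} + \cdots$, the following sketch accumulates a reward sequence backwards. The function name and the sample rewards are illustrative assumptions.

```python
def discounted_return(rewards, gamma):
    """Sum of rewards, where the k-th future reward is weighted by gamma ** k."""
    total = 0.0
    for r in reversed(rewards):
        # Horner-style accumulation: U_t = r_t + gamma * U_{t+1}
        total = r + gamma * total
    return total


print(discounted_return([1.0, 1.0, 1.0], 0.5))  # 1 + 0.5 + 0.25
```

Q(s, a) in equation (4) is the expectation of this quantity over the randomness of future states and actions.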
In the Q-learning algorithm, the reinforcement learning model iteratively updates the action-value function Q(s, a) to obtain the optimal action-value function $Q^*(s,a)$ under the optimal policy, which is used to evaluate the quality of actions.
In connection with Fig. 4, when the agent 410 takes action a in the current state s, the environment 420 feeds back a reward r in response. Using the temporal-difference (TD) method, and given an observed transition, i.e., a quadruple $(s_t, a_t, r_t, s_{t+1})$, the update equation (5) applied in each time slice is derived from the Bellman equation; in the standard Q-learning form it reads:
$Q(s_t,a_t) \leftarrow Q(s_t,a_t) + \alpha\left[r_t + \gamma \max_{a} Q(s_{t+1},a) - Q(s_t,a_t)\right]$ (5)
where $\alpha \in (0,1)$ is the learning rate factor and $\gamma \in [0,1]$ is the discount factor.
To keep nodes from being trapped in a local optimum, Q-learning selects actions in an ε-greedy manner, which balances exploration and exploitation. The convergence time of the algorithm depends on the sizes of the state space and the action space; the larger their dimensions, the longer the algorithm takes to converge.
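A minimal tabular sketch of ε-greedy Q-learning over the action set {Y, N} follows. It assumes the standard Q-learning temporal-difference update; the single-state toy environment, the reward values, and the names `choose_action` and `q_learning_update` are hypothetical and are not taken from the application.

```python
import random
from collections import defaultdict

ACTIONS = ("Y", "N")  # Y: bid, N: give up bidding


def choose_action(q_table, state, epsilon, rng):
    """Epsilon-greedy: explore with probability epsilon, otherwise exploit."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: q_table[(state, a)])


def q_learning_update(q_table, s, a, r, s_next, alpha=0.1, gamma=0.9):
    """One temporal-difference update of Q(s, a) from the transition (s, a, r, s_next)."""
    best_next = max(q_table[(s_next, b)] for b in ACTIONS)
    q_table[(s, a)] += alpha * (r + gamma * best_next - q_table[(s, a)])


q_table = defaultdict(float)
rng = random.Random(0)
# Toy environment (assumption): in state 0, bidding pays +1 and giving up pays 0.
for _ in range(200):
    action = choose_action(q_table, 0, epsilon=0.2, rng=rng)
    reward = 1.0 if action == "Y" else 0.0
    q_learning_update(q_table, 0, action, reward, s_next=0)

print(q_table[(0, "Y")] > q_table[(0, "N")])
```

The table has one entry per (state, action) pair, so its size grows with the number of quantized states; this is the scaling limitation that motivates a function approximator for larger state spaces.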
For the action prediction problem of a potential relay node, the state s is the result of quantizing the channel state information; its precision depends on the quantization order, and a higher order gives higher precision. Pursuing high-precision action prediction therefore tends to produce an excessively high-dimensional state. The Q-learning algorithm learns efficiently in low-dimensional state and action spaces but inefficiently in high-dimensional ones. The generalization and function approximation capabilities of deep neural networks are better suited to learning in high-dimensional state and action spaces. Combining the Q-learning algorithm with a deep neural network, i.e., approximating $Q^*(s,a)$ with a deep neural network whose weights are trained, gives the basic structure of DQN. The input of the neural network is the state s, its outputs are scores for the respective actions, and the action with the highest score is selected. The network updates its weights by gradient descent; as optimization proceeds, it gradually approaches $Q^*(s,a)$, its output actions improve, and the system obtains more reward.
Basic DQN suffers from an overestimation problem, which Double DQN can address. Fig. 5 is a basic framework diagram of Double DQN with experience replay in a communication method according to an embodiment of the present application. Referring to Fig. 5, the Double DQN agent includes a Q learning network (DQN) and a target learning network (Target Net). The Q learning network is used to select actions and is therefore also called the selection network; the target learning network is used to evaluate actions and is therefore also called the evaluation network. The algorithm acquires data from the environment and feeds it into the Q learning network, and the system selects the action with the maximum value in the Q learning network. The parameters of the Q learning network are copied into the target learning network at intervals, while the parameters of the Q learning network are updated by backpropagation (SGD) on a loss function computed with the target learning network. In addition, with experience replay, historical data is stored in an experience pool, and a mini-batch is then randomly sampled from the pool to update the network parameters; this breaks the correlation between consecutive samples and allows past experience to be reused, reducing the waste of transitions.
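The experience pool described above can be sketched as a fixed-capacity buffer with uniform random sampling. The class name `ReplayBuffer`, the capacity, and the eviction of the oldest transition when the pool is full are assumptions of this sketch; the application does not state how the pool is bounded.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-size pool of transitions (s, a, r, s_next); the oldest entries are dropped first."""

    def __init__(self, capacity):
        self.buffer = deque(maxlen=capacity)

    def push(self, s, a, r, s_next):
        self.buffer.append((s, a, r, s_next))

    def sample(self, batch_size, rng):
        """Draw a mini-batch uniformly at random, without replacement."""
        return rng.sample(list(self.buffer), batch_size)

    def __len__(self):
        return len(self.buffer)


pool = ReplayBuffer(capacity=3)
for t in range(5):
    pool.push(t, "Y", 1.0, t + 1)  # made-up transitions
batch = pool.sample(2, random.Random(0))
print(len(pool), len(batch))
```

Sampling at random rather than replaying transitions in arrival order is what breaks the correlation between consecutive samples.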
In other embodiments of the present application, the reinforcement learning model predicts whether a potential relay node is involved in bidding at the current time based on a Double DQN algorithm, and model parameters of the reinforcement learning model include the status, actions, and rewards of the potential relay node in a certain time segment. Wherein the state s is defined as a six-tuple using the following equation (6):
[Equation (6), original formula image BDA0003738060530000151]
The six components of the state are: the channel gain between satellite S and potential relay node $R_k$; the channel gain between potential relay node $R_k$ and satellite user D; the satellite tilt angle of the satellite; the angle of visibility of the potential relay node to the satellite; the residual power of the potential relay node; and the movement angle θ of the satellite, with θ ∈ [0°, 180°]. (The symbols of the first five components appear in original formula images BDA0003738060530000152 to BDA0003738060530000158.)
The action a is expressed by the following formula (7):
$a = \pi(s)$ (7)
where $\pi(\cdot)$ is the policy, i.e., the mapping from the environment state s to the action a. After the agent completes the current action, the corresponding Q value is calculated; at the same time, the agent interacts with the environment, and the transition $(s_t, a_t, r_t, s_{t+1})$ is saved to the experience pool. There are three kinds of actions: bidding and succeeding, bidding and failing, and not participating in bidding. The reward function is defined according to the following rules: a high reward is given for bidding and succeeding; a negative reward is given for bidding and failing; and the reward is 0 for not participating in bidding. Rewards are numerical values, and their specific values are chosen so that network training converges.
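The three reward rules can be written down directly. The application fixes only the signs (high, negative, zero), so the constants +10 and -1 below are placeholder assumptions, as are the function and constant names.

```python
REWARD_BID_SUCCESS = 10.0  # "high reward"; the magnitude is an assumption
REWARD_BID_FAILURE = -1.0  # "negative reward"; the magnitude is an assumption
REWARD_NO_BID = 0.0        # not participating in bidding


def reward(bid, won):
    """Reward for one time slice, given whether the node bid and whether it won."""
    if not bid:
        return REWARD_NO_BID
    return REWARD_BID_SUCCESS if won else REWARD_BID_FAILURE


print(reward(True, True), reward(True, False), reward(False, False))
```

In practice the two magnitudes would be tuned until training converges, as the text notes.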
Referring to Fig. 5, in the Double DQN algorithm the agent uses two independent BP (backpropagation) neural networks as Q-function approximators: the Q learning network $Q(s,a;w)$ and the target learning network $Q(s,a;w^-)$, where $w$ and $w^-$ denote the current parameters and the previous parameters, respectively. The Q learning network is used to select the action, expressed as:
$a^* = \arg\max_{a} Q(s_{t+1}, a; w)$ (8)
The target learning network is used for evaluation, expressed as:
$y_t = r_t + \gamma \cdot Q(s_{t+1}, a^*; w^-)$ (9)
In each time slice, the relay node $R_k$ stores its transition $(s_t, a_t, r_t, s_{t+1})$ in the experience pool and randomly samples a mini-batch from the experience pool to update the network parameters. The TD error can be expressed as:
$\delta_i = Q(s_t, a_t; w) - y_t$ (10)
The loss function is defined as:
[Equation (11), original formula image BDA0003738060530000161]
In mini-batch SGD, multiple transitions are sampled, and the network parameters are updated with the average of the resulting gradients:
[Equation (12), original formula image BDA0003738060530000162]
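The split between action selection and action evaluation in equations (8) to (10) can be sketched as follows, assuming the selection step is the usual Double DQN argmax over the Q learning network. The two "networks" are stand-in functions returning fixed scores, and all names and numbers are illustrative assumptions rather than trained models.

```python
def double_dqn_target(reward, next_state, q_online, q_target, gamma=0.9):
    """Return (y, a_star): the online network picks a_star, the target network scores it."""
    online_scores = q_online(next_state)
    a_star = max(range(len(online_scores)), key=lambda a: online_scores[a])
    y = reward + gamma * q_target(next_state)[a_star]
    return y, a_star


def td_error(state, action, y, q_online):
    """TD error: Q(s_t, a_t; w) - y_t."""
    return q_online(state)[action] - y


# Stand-in "networks": one score per action, independent of the state (assumption).
def q_online(state):
    return [1.0, 3.0, 2.0]


def q_target(state):
    return [10.0, 0.5, 20.0]


y, a_star = double_dqn_target(1.0, None, q_online, q_target)
delta = td_error(None, 0, y, q_online)
print(a_star, y, delta)
```

Here the online network selects action 1, and the target network values it at 0.5, giving y = 1 + 0.9 × 0.5. A plain DQN target would instead take the target network's own maximum (20.0 in this toy example), which illustrates how decoupling selection from evaluation limits overestimation.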
In some embodiments, Double DQN approximates the action-value function with BP networks; the two BP networks have the same structure but different parameters. The number of input nodes of the BP neural network equals the number of attributes of the state, and the number of output nodes equals the number of actions. For example, in the above embodiment, where the state s is a six-tuple, six input nodes are used, and the 3 actions correspond to 3 output nodes. A classical three-layer BP neural network is used, consisting of an input layer, a hidden layer and an output layer. The number of hidden-layer neurons is determined by an empirical formula
[original formula image BDA0003738060530000163]
and the number of hidden-layer nodes is taken as n' = 6. Such a neural network can fit any function that is a continuous mapping from one finite-dimensional space to another.
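The 6-6-3 layout described above can be illustrated with a forward pass written in plain Python. The sigmoid hidden activation, the linear output layer, the uniform weight initialization, and the function names are assumptions of this sketch; the application does not specify them, and training by backpropagation is omitted.

```python
import math
import random


def init_layer(n_in, n_out, rng):
    """Random weights in [-0.5, 0.5] and zero biases for one fully connected layer."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def forward(state, hidden, output):
    """Three-layer network: input -> sigmoid hidden layer -> linear output (one score per action)."""
    w1, b1 = hidden
    w2, b2 = output
    h = [
        1.0 / (1.0 + math.exp(-(sum(w * x for w, x in zip(row, state)) + b)))
        for row, b in zip(w1, b1)
    ]
    return [sum(w * x for w, x in zip(row, h)) + b for row, b in zip(w2, b2)]


rng = random.Random(0)
hidden = init_layer(6, 6, rng)   # 6 state attributes -> 6 hidden nodes
output = init_layer(6, 3, rng)   # 6 hidden nodes -> 3 actions
scores = forward([0.1, 0.2, 0.3, 0.4, 0.5, 0.6], hidden, output)
best_action = max(range(3), key=lambda a: scores[a])
print(len(scores), best_action)
```

The index of the largest score plays the role of the selected action; the same structure with a separate parameter set would serve as the target learning network.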
The above describes the process of predicting whether the potential relay node participates in bidding at the current moment using the prediction model in step S220. The potential relay nodes participating in bidding are in a bidding state, and the potential relay nodes not participating in bidding are in a non-bidding state. At different times, it may be possible for one potential relay node to switch between bidding and non-bidding states. In connection with fig. 3, step S220 corresponds to step S312 in the cooperative breaking state in fig. 3, and step S322 in the cooperative success state, i.e. "prediction-based potential relay node size reduction".
In some embodiments, after the prediction step, the base stations send cooperation request information to the satellite, including the bid proposed by each base station, as shown in steps S313 and S323 in Fig. 3 ("send cooperation request information to satellite"); the cooperation request information includes the bid vector of step S230.
Referring to Fig. 2, at step S230 the first relay nodes are those potential relay nodes that, after screening, are predicted to take the action of participating in bidding. Each first relay node estimates the channel value according to the CSI and reports the bid vector including the estimated value to the satellite.
Since some embodiments include modifications to the bid vector, the bid vector before modification is referred to herein as the initial bid vector $b_0$, expressed by the following formula (13):
[Equation (13), original formula image BDA0003738060530000171]
In step S240, the satellite determines the winning relay node, i.e., the successful bidder, from the plurality of first relay nodes based on the auction mechanism and the bid vectors. Step S240 corresponds to step S314 and step S324 in Fig. 3, i.e., the satellite 310 determines whether to cooperate based on the auction mechanism.
In some embodiments, the auction mechanism is a Vickrey auction, and the winning relay node is the first relay node with the largest bid vector. The Vickrey auction mechanism is incentive compatible, so each relay node will truthfully report its estimate of each channel value to the satellite, which prevents relay nodes from reporting false information.
In some embodiments, the bid vector b' of the winning relay node is represented using the following equation (14):
[Equation (14), original formula image BDA0003738060530000172]
That is, the bid vector b' of the winning relay node is the maximum among the initial bid vectors of all the first relay nodes. The initial bid vectors may be sorted, for example in descending order, so that the initial bid vector with the largest value comes first.
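Winner determination by sorting the bids in descending order can be sketched as below. The sketch makes several assumptions that are not in the application: each bid is reduced to a single scalar value, cooperation is declined when no bid is positive (a stand-in for the "negative profit at the satellite end" case), and a lone bidder is charged its own bid. The function name and relay identifiers are hypothetical.

```python
def vickrey_auction(bids):
    """Pick the highest bidder and the second-highest bid.

    bids: dict mapping relay id -> scalar bid value.
    Returns (winner, second_price), or None when no bid is positive.
    """
    positive = {relay: value for relay, value in bids.items() if value > 0}
    if not positive:
        return None  # cooperation breaking: the satellite declines to cooperate
    ranked = sorted(positive.items(), key=lambda item: item[1], reverse=True)
    winner, top_bid = ranked[0]
    second_price = ranked[1][1] if len(ranked) > 1 else top_bid
    return winner, second_price


print(vickrey_auction({"R1": 3.0, "R2": 5.0, "R3": 4.0}))
print(vickrey_auction({"R1": -1.0, "R2": -0.2}))
```

Because the winner's payment depends on the second-highest bid rather than on its own, a bidder gains nothing by misreporting its valuation, which is the incentive-compatibility property referred to above.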
In steps S210-S240, whether to participate in bidding is first predicted by a reinforcement learning model, which reduces the scale of the potential relay node set; the most suitable winning relay node is then selected from the first relay nodes based on the auction mechanism and used as the relay node between the satellite and the satellite user. The winning relay node is the relay node with the best signal transmission performance.
Referring to Fig. 3, in the non-cooperative state the potential relay nodes do not participate in bidding; in the cooperation-breaking state there is no winning relay node among the potential relay nodes. The satellite 310 then transmits its signal directly to the satellite user 330 at step S315. In the successful-cooperation state, the communication method of the present application further includes the following steps:
Step S325: the satellite 310 broadcasts the information of the winning relay node.
Step S326: waiting for the service period to arrive.
Step S327: upon arrival of the service period, satellite 310 transmits a satellite signal to the winning relay node.
Step S328: the winning relay node transmits a signal to satellite user 330.
Step S329: the winning relay node accesses the satellite frequency band of satellite 310 and provides service to ground subscribers.
Figs. 6A-6C show the satellite time slot allocation in the three cooperation states: Fig. 6A corresponds to the non-cooperative state in Fig. 3, Fig. 6B to the cooperation-breaking state, and Fig. 6C to the successful-cooperation state. The satellite time slots are represented by rectangular bars whose horizontal direction is the time direction.
In the non-cooperative state, as shown in fig. 6A, all time slots are used for signal transmission between the satellite and the satellite user, and need not be allocated to the relay node.
As shown in Fig. 6B, in the cooperation-breaking state the satellite spends a period of time exchanging control signaling with the ground, such as broadcasting cooperation information and receiving ACKs indicating whether to cooperate; this is the "signaling interaction" time slot in Fig. 6B, corresponding to step S311 in Fig. 3. The remaining time slots are used by the satellite to transmit downlink signals to the satellite user. In the cooperation-breaking state, no base station obtains a time slot of the satellite.
As shown in Fig. 6C, in the successful-cooperation state the satellite spends the time slot $t_0$, corresponding to step S321 in Fig. 3, on "signaling interaction"; the remaining time slots are divided into the "satellite-base station signal transmission" time slot $t_1$ and the "base station-satellite user signal transmission" time slot $t_2$.
In some embodiments, under certain auction mechanisms such as the Vickrey auction, the winning relay node, once determined, only needs to pay the second-highest price, i.e., the bid value of the second-highest bidder. Let the length of the relay-terrestrial user signal transmission time slot corresponding to the highest bid be τ', and that corresponding to the second-highest bid be τ''. The winning relay node, which submitted the highest bid, then allocates a time slot of length τ'' for forwarding the satellite signal to the satellite user, and the time slot of length Δτ = τ' - τ'' is left to the terrestrial mobile communication system, that is, to part of the time slots $t_{k1}$ to $t_{kN}$ that follow time slot $t_2$ in Fig. 6C. The remaining time slots are reallocated by the winning relay node according to certain criteria to serve the N terrestrial users under the coverage of the k-th relay.
In the embodiment shown in Fig. 6C, one transmission frame between the satellite and the winning relay node comprises (N+3) time slots: the first time slot $t_0$ is used by the satellite to broadcast cooperative signaling to all potential relay nodes; the second time slot $t_1$ is used by the satellite to transmit one or more satellite signals to the winning relay node; the third time slot $t_2$ is used by the winning relay node to forward satellite signals to the satellite user; and the remaining N time slots are used by the winning relay node to serve terrestrial users within its coverage area. In step S326 in Fig. 3, the service period comprises the remaining (N+2) time slots other than the first time slot $t_0$.
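The second-price slot split and the (N+3)-slot frame can be put together in a short sketch. The slot lengths are made-up numbers, the ground-user slot lengths are simply supplied by the caller (the application leaves the allocation criterion open), and the function names are hypothetical.

```python
def second_price_slots(tau_highest, tau_second):
    """Return (slot length used for the satellite user, surplus left to the ground system)."""
    if tau_second > tau_highest:
        raise ValueError("second-highest bid cannot exceed the highest bid")
    return tau_second, tau_highest - tau_second


def build_frame(t0, t1, t2, ground_slots):
    """Assemble the frame: signaling, satellite->relay, relay->satellite user, then N ground slots."""
    frame = [("t0", t0), ("t1", t1), ("t2", t2)]
    frame += [(f"t_k{n}", length) for n, length in enumerate(ground_slots, start=1)]
    return frame


def service_period(frame):
    """Total length of the (N+2) slots that follow the signaling slot t0."""
    return sum(length for _, length in frame[1:])


paid, surplus = second_price_slots(4.0, 3.0)         # tau' = 4, tau'' = 3
frame = build_frame(1.0, paid, paid, [0.5, 0.5])     # N = 2 ground users, t1 = t2 assumed
print(len(frame), surplus, service_period(frame))
```

With N = 2 ground users the frame has 5 = N + 3 slots, and the service period covers the 4 = N + 2 slots after $t_0$.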
In some embodiments, assuming that the satellite system transmission frame consists of flexibly adjustable time slots, the winning relay node receives and amplifies the satellite transmitted signal on the second time slot, transmits the amplified signal to the satellite user on the third time slot, and serves N terrestrial users in the latter N time slots to satisfy the terrestrial user transmission rate.
A successful cooperative communication process is described below in conjunction with the transmission and reception model and the channel model of the satellite communication field and fig. 6C.
As shown in Fig. 6C, in the first time slot $t_0$ the satellite and the ground perform signaling interaction. In the second time slot $t_1$, the satellite S modulates M signals onto different frequency points for broadcasting, denoted $\{x_{p1}, x_{p2}, \ldots, x_{pm}, \ldots, x_{pM}\}$. If the pilot signal shows that the CSI of a certain channel is poor, the satellite searches for a ground potential relay node within the beam coverage to perform cooperative transmission. The present application analyzes only one of the M channels of the satellite, i.e., the scenario shown in Fig. 1, and the transmitted signal is denoted $x_p$. The relay node $R_k$ serves ground users within its coverage area, and the N signals it transmits are represented as
[original formula images BDA0003738060530000191 and BDA0003738060530000192]
The signal received by the ground potential relay node is:
[original formula image BDA0003738060530000193]
where $\eta_k$ is the additive white Gaussian noise introduced by the transmission channel, $P_S$ is the transmit power of satellite S, and the channel gain of the satellite-relay link (original formula image BDA0003738060530000194) is the same as in equation (2). The data transmission rate of the first hop can thus be determined:
[original formula image BDA0003738060530000195]
Referring to Fig. 6C, in the third time slot $t_2$ the potential relay node $R_k$ amplifies the signal and forwards it to the satellite user, and the i-th ground user obtains a time slot of length $t_{ki}$ to transmit its signal. Thus, the first relay node $R_k$ provides the following transmission rate for satellite user D:
[original formula image BDA0003738060530000201]
where $t_2$ is the length of the third time slot; the transmit-side signal-to-noise ratio of the k-th potential relay node is denoted by the symbol in original formula image BDA0003738060530000202; and the channel gain of the link from the k-th potential relay node to satellite user D is denoted by the symbol in original formula image BDA0003738060530000203.
Likewise, the rate at which the k-th potential relay node transmits to the n-th ground user in its cell is:
[original formula image BDA0003738060530000204]
where $t_{kn}$ is the length of the time slot in which the potential relay node $R_k$ transmits to the n-th ground user; the transmit-side signal-to-noise ratio of the k-th potential relay node is again denoted by the symbol in original formula image BDA0003738060530000205; and the channel gain of the link from the potential relay node $R_k$ to the n-th terrestrial user is denoted by the symbol in original formula image BDA0003738060530000206.
In the present specification, $H_{X,Y}$ denotes the channel between nodes X and Y, where $X, Y \in \{S, D, R_k, U_{kn}\}$, $k \in \{1, 2, \ldots, K\}$, $n \in \{1, 2, \ldots, N\}$. Here S is the satellite node; $R_k$ is a potential relay node; D is the satellite user; and $U_k = \{U_{k1}, U_{k2}, \ldots, U_{kn}, \ldots, U_{kN}\}$ is the set of terrestrial users within the cell of potential relay node $R_k$. The propagation model is defined as follows:
[original formula image BDA0003738060530000207]
where the free-space loss (symbol in original formula image BDA0003738060530000208) is:
[original formula image BDA0003738060530000209]
$G_S$ and $G_t$ are the antenna gain of the satellite transmitter and the antenna gain of the terrestrial receiver, respectively, d is the satellite-to-ground distance (km), and f is the carrier frequency (MHz).
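For orientation, the free-space loss for a distance in km and a carrier frequency in MHz can be computed as below. Since the formula image of the application is not available in this text, the sketch uses the textbook free-space loss $(4\pi d f / c)^2$, which in these units is approximately 32.45 + 20 log10(d) + 20 log10(f) dB; whether the application uses exactly this expression is an assumption, and the function name and example values are illustrative.

```python
import math

SPEED_OF_LIGHT = 299_792_458.0  # m/s


def free_space_loss_db(distance_km, frequency_mhz):
    """Textbook free-space path loss in dB for a distance in km and a frequency in MHz."""
    if distance_km <= 0 or frequency_mhz <= 0:
        raise ValueError("distance and frequency must be positive")
    distance_m = distance_km * 1e3
    frequency_hz = frequency_mhz * 1e6
    return 20.0 * math.log10(4.0 * math.pi * distance_m * frequency_hz / SPEED_OF_LIGHT)


# Example values (assumptions): a 1000 km slant range at a 2000 MHz carrier.
print(round(free_space_loss_db(1000.0, 2000.0), 2))
```

Doubling either the distance or the frequency adds about 6 dB of loss, which is why the slant range to the satellite dominates the link budget of the satellite-relay hop.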
The shadowing and fading coefficient (symbol in original formula image BDA00037380605300002010) follows a shadowed-Rician fading distribution. The probability density function of the quantity in original formula image BDA00037380605300002011 can be expressed as:
[original formula image BDA00037380605300002012]
where ${}_1F_1(\cdot;\cdot;\cdot)$ denotes the confluent hypergeometric (Kummer) function, $\lambda = 1/(2b)$, $\alpha = \left(\frac{2bm}{2bm+\Omega}\right)^m$, $\beta = \frac{\Omega}{2b(2bm+\Omega)}$, b represents the average power of the multipath component, m is a parameter that measures the fading severity, and Ω is the average power of the line-of-sight component.
For terrestrial links, the channel gain is
Figure BDA00037380605300002013
where
Figure BDA00037380605300002014
is the distance between R_k and U_kn, η is the path-loss exponent,
Figure BDA0003738060530000211
is the small-scale fading coefficient, which obeys the Nakagami-m distribution, and
Figure BDA0003738060530000212
has the probability density function:
Figure BDA0003738060530000213
where the shape parameter μ, taken to be an integer, characterizes the severity of the fading; ω is the average power of the component, and Γ(·) denotes the gamma function.
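For reference, a minimal sketch of the standard Nakagami-m envelope density, f(x) = 2μ^μ x^(2μ−1) exp(−μx²/ω) / (Γ(μ) ω^μ), using the symbols μ, ω and Γ defined above. It is assumed, not confirmed, that the figure shows this textbook form; the function name is hypothetical.

```python
import math


def nakagami_pdf(x: float, mu: float, omega: float) -> float:
    """Standard Nakagami-m density of the fading envelope at x."""
    if x < 0.0:
        return 0.0
    coef = 2.0 * mu ** mu / (math.gamma(mu) * omega ** mu)
    return coef * x ** (2.0 * mu - 1.0) * math.exp(-mu * x * x / omega)


# Example: density at unit amplitude for mu = 2 and unit average power.
value = nakagami_pdf(1.0, 2.0, 1.0)
```

Setting μ = 1 reduces the density to the Rayleigh case, and larger μ corresponds to milder fading.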
In the star-ground fusion relay network scenario shown in Fig. 1, the potential relay nodes are far apart, so all channels can be assumed to undergo independent and identically distributed fading. Furthermore, all channels are assumed to be quasi-static: the channel gain remains unchanged within one time segment of the satellite transit and varies from one time segment to the next. Each ground relay acquires real-time CSI for its relay node-satellite user link and its relay node-ground user links. CSI is not shared between relay nodes, that is, the information is private; from a game-theoretic point of view, this is a game of incomplete information. The auction process is conducted at the beginning of each time segment.
In step S230, the first relay node evaluates the channel performance, which includes the channel gain and the sub-slot allocation length, and reflects it in the bid vector it submits. The sub-slot allocation length includes the length t_2 of the time slot in which the first relay node transmits the amplified signal to the satellite user.
The definition of the bid vector is described below in connection with the time slot shown in fig. 6C. The initial bid vector in equation (13) is
Figure BDA0003738060530000214
The bid vector of the winning relay node in equation (14) is
Figure BDA0003738060530000215
In the embodiment of the application, the first relay node adopts the amplify-and-forward (AF) protocol: the relay node simply amplifies the received signal and then forwards it to the destination. According to the optimal relay selection scheme for the AF protocol, once the QoS index of the ground users in the cell is met, the time slot allocated to the satellite-relay link and the time slot allocated to the relay-destination link should be equal in length, namely t_1 = t_2. Specifically, the following formulas are used to calculate each component of the bid vector b_0:
Figure BDA0003738060530000216
Figure BDA0003738060530000217
Figure BDA0003738060530000218
Figure BDA0003738060530000219
Figure BDA0003738060530000221
Figure BDA0003738060530000222
Relay R_k amplifies the signal according to the AF protocol, and the estimated channel capacity is:
Figure BDA0003738060530000223
The initial bid value is defined as this estimate:
Figure BDA0003738060530000224
where the set of first relay nodes is denoted R = {R_1, R_2, …, R_k, …, R_K}, and K is the total number of first relay nodes; the kth first relay node R_k serves N ground users, denoted U_k = {U_k1, U_k2, …, U_kn, …, U_kN};
Figure BDA0003738060530000225
denotes the initial bid value provided by the kth first relay node R_k; t_1 denotes the length of the second transmission slot of the first relay node R_k; t_2 denotes the length of the third transmission slot of the first relay node R_k; t_kn (n = 1 to N) denotes the lengths of the remaining N slots of the first relay node R_k;
Figure BDA0003738060530000226
denotes the signal-to-noise ratio of the satellite-first relay link;
Figure BDA0003738060530000227
denotes the signal-to-noise ratio of the first relay-ground user link;
Figure BDA0003738060530000228
denotes the signal-to-noise ratio of the first relay-satellite user link;
Figure BDA0003738060530000229
denotes the channel capacity of the link that the first relay node R_k provides for satellite user D;
Figure BDA00037380605300002210
denotes the QoS requirement of the ground user, here the minimum data transmission rate that must be met for the ground user;
Figure BDA00037380605300002211
denotes the channel gain of the link from the first relay node R_k to satellite user D;
Figure BDA00037380605300002212
Figure BDA00037380605300002213
respectively denote the channel gains of the three links satellite-first relay node, first relay node-satellite user, and first relay node-ground user; N and K are positive integers greater than or equal to 1.
The initial bid vector provided by each first relay node can be calculated by combining the formulas (23-29) above, and the initial bid vector is related to the channel gain and the sub-slot allocation length.
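The dependence of the initial bid on channel gain and sub-slot lengths can be illustrated with a rough sketch. Because the formulas are available only as figures, everything below is an assumption rather than the specification's method: the frame after the broadcast slot is normalized to length 1, each ground-user slot t_kn is sized so that t_kn·log2(1 + SNR) just meets the QoS rate, the remainder is split equally (t_1 = t_2), and the textbook AF end-to-end SNR γ1·γ2/(γ1 + γ2 + 1) stands in for the estimated capacity. The function name and argument layout are hypothetical.

```python
import math


def af_initial_bid(snr_sr, snr_rd, snr_ru, qos_rate):
    """Return (t1, t2, t_kn, bid) for one relay, or None if QoS cannot be met.

    snr_sr: received SNR on the satellite -> relay link (linear scale)
    snr_rd: received SNR on the relay -> satellite user link (linear scale)
    snr_ru: list of received SNRs on the relay -> ground user links
    qos_rate: minimum rate per ground user, in bit/s/Hz averaged over the frame
    """
    # Shortest slot that still gives each ground user its minimum rate.
    t_kn = [qos_rate / math.log2(1.0 + s) for s in snr_ru]
    remaining = 1.0 - sum(t_kn)
    if remaining <= 0.0:
        return None  # the frame is too short to satisfy every ground user
    t1 = t2 = remaining / 2.0  # equal satellite-relay and relay-user slots
    snr_e2e = snr_sr * snr_rd / (snr_sr + snr_rd + 1.0)  # textbook AF relay SNR
    bid = t2 * math.log2(1.0 + snr_e2e)
    return t1, t2, t_kn, bid


result = af_initial_bid(3.0, 3.0, [3.0, 15.0], 0.2)
```

Under these assumptions a relay with weaker ground-user links needs longer t_kn slots, which leaves less time for the satellite user and lowers its bid.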
In some embodiments, the communication method of the present application further takes into account that the auction-based relay selection process starts once in every time segment of the satellite transit. Frequently selecting different relays brings extra switching and communication overhead. For this reason, in an embodiment of the present application, after the first relay node reports the bid vector to the satellite, the satellite corrects the initial bid vector based on the handover overhead index to obtain a corrected bid vector. The handover overhead index includes one or any combination of bidding history information, predicted residence time, and movement angle, where the bidding history information is derived from the set of all winning relay nodes selected by the satellite over a time sequence.
For example, the set of all winning relay nodes selected by one satellite over one time sequence is denoted R = {R_0, R_1, …, R_{p-1}, R_p, …}; calculate
Figure BDA0003738060530000231
where
Figure BDA0003738060530000232
is the exclusive-OR sign, R_p denotes the current winning relay node, and R_{p-1} denotes the relay node selected in the previous time segment. This yields a sequence of 0s and -1s: 0 indicates that the current winning relay node is the same as that of the previous time segment, so the switching overhead is zero; -1 indicates that the current winning relay node differs from that of the previous time segment, which incurs switching overhead and has a negative effect. In addition to the bidding history information, the satellite predicts the residence time of each potential relay and estimates the movement angle based on ephemeris and geographic location information. Selecting a relay whose predicted residence time is short may be regarded as an unnecessary relay selection. The movement angle is the angle between the satellite's direction of motion and the line connecting the sub-satellite point to the potential relay; selecting a potential relay for which this angle is acute helps reduce relay switching.
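The bidding-history term can be sketched as follows: compare each winner with the winner of the previous time segment and emit 0 when they match and -1 when they differ. The function name and the use of string identifiers for relays are hypothetical.

```python
def history_indicator(winners):
    """Map a sequence of winning relay IDs to the 0 / -1 switching sequence.

    Entry p-1 of the result compares winners[p] with winners[p-1]:
    0 means no switch, -1 means the winner changed.
    """
    return [0 if cur == prev else -1 for prev, cur in zip(winners, winners[1:])]


sequence = history_indicator(["R1", "R1", "R3", "R3", "R2"])
```

The first time segment has no predecessor, so the result is one element shorter than the input.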
The predicted residence time T is based on the following movement model: the low orbit satellite keeps moving relative to the ground at a certain speed, and the satellite beam is a flat scanning beam. Then the predicted residence time is:
Figure BDA0003738060530000233
In the above formula, Radius is the beam radius of the low-orbit satellite, O is the beam center point, and v is the moving speed of the low-orbit satellite.
Fig. 7 is a schematic diagram of the relationship between the satellite beam and the base station locations, where O represents the beam center point. Because the low-orbit satellite moves and its beam sweeps across the ground, the position of a potential ground relay at the previous moment is A_0, its position at the current moment is A_1, the point at which it enters the beam is denoted A_2, and the point at which it exits the beam is denoted A_3. θ is the movement angle, which can also be obtained from the movement model:
Figure BDA0003738060530000234
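A geometric sketch of the two quantities under simplifying assumptions that are not taken from the figures: a flat local plane, a circular beam of the given radius centered at O (taken as the origin and as the sub-satellite point), and a beam that translates at constant speed along the satellite heading. In the beam frame the relay then drifts opposite to the heading, and the residence time is the distance from A_1 to the exit point A_3 divided by the speed. All names are hypothetical.

```python
import math


def residence_time_and_angle(relay_xy, heading_xy, radius, speed):
    """Return (T, theta_deg) for a relay inside the beam, or None if outside.

    relay_xy: current relay position A1 relative to the beam center O
    heading_xy: direction of satellite (beam) motion, any non-zero length
    radius, speed: beam radius and beam ground speed in consistent units
    """
    x, y = relay_xy
    norm = math.hypot(heading_xy[0], heading_xy[1])
    hx, hy = heading_xy[0] / norm, heading_xy[1] / norm
    r2 = x * x + y * y
    if r2 > radius * radius:
        return None
    # Component of the relay position along the satellite heading.
    proj = x * hx + y * hy
    # Positive root of |A1 - s*h|^2 = radius^2 gives the distance to the exit point.
    disc = max(0.0, proj * proj - r2 + radius * radius)
    dist_to_exit = proj + math.sqrt(disc)
    r = math.sqrt(r2)
    cos_theta = 1.0 if r == 0.0 else max(-1.0, min(1.0, proj / r))
    return dist_to_exit / speed, math.degrees(math.acos(cos_theta))


ahead = residence_time_and_angle((100.0, 0.0), (1.0, 0.0), 500.0, 5.0)
```

In this model a relay lying ahead of the beam center (acute θ) stays in the beam longer than one at the same distance behind it, which matches the preference for acute movement angles.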
In some embodiments, the initial bid vector is corrected using the following formula:
Figure BDA0003738060530000235
where
Figure BDA0003738060530000236
denotes the normalized bid vector obtained by normalizing the bid vector; γ is the coefficient that balances the data transmission rate against the switching overhead: a larger γ means the satellite node places more value on the data transmission rate, and a smaller γ means it places more value on reducing the switching overhead;
Figure BDA0003738060530000241
is the decision matrix, where x_kj denotes any handover overhead index, k is the sequence number of the first relay node, K is the total number of first relay nodes, j is the sequence number of the handover overhead index, and w_j is the weight of the jth handover overhead index. The weights may be determined according to the actual situation.
In some embodiments, the weights are determined using an entropy weighting method.
The steps for determining the weights of the three indexes (bidding history information, predicted residence time, and movement angle) with the entropy weight method are as follows:
Step S250: normalize the initial data. The bidding history information, predicted residence time, and movement angle are arranged into a decision matrix, expressed as:
Figure BDA0003738060530000242
where J = 3, i.e. there are three decision indexes, and K is the number of first relay nodes. The bidding history information and the predicted residence time are positive indexes and the movement angle is a negative index; all index values are normalized by min-max normalization.
The normalization formula for a positive index is:
Figure BDA0003738060530000243
The normalization formula for a negative index is:
Figure BDA0003738060530000244
Step S251: calculate the entropy of each switching overhead index. Let the entropy of the jth switching overhead index be e_j, j = 1, 2, …, J; it is calculated as follows:
Figure BDA0003738060530000245
where:
Figure BDA0003738060530000246
Step S252: calculate the weight of each index. Let the weight of the jth handover overhead index be w_j, j = 1, 2, …, J; it is calculated as follows:
Figure BDA0003738060530000247
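Steps S250 to S252 can be sketched with the standard entropy weight method: min-max normalize each column, convert it to proportions p_kj, compute e_j = −(1/ln K)·Σ_k p_kj·ln p_kj, and set w_j = (1 − e_j)/Σ_j(1 − e_j). It is assumed that the figures show this usual form. The function name, the handling of constant columns, and the sample numbers are illustrative only.

```python
import math


def entropy_weights(matrix, positive):
    """Entropy-method weights for a K x J decision matrix.

    matrix: list of K rows, one per first relay node, each with J index values
    positive: list of J booleans, True where a larger value is better
    """
    n_rows, n_cols = len(matrix), len(matrix[0])
    if n_rows < 2:
        raise ValueError("need at least two relay nodes")
    diversity = []
    for j in range(n_cols):
        col = [row[j] for row in matrix]
        lo, hi = min(col), max(col)
        if hi == lo:
            norm = [1.0] * n_rows  # constant column carries no information
        elif positive[j]:
            norm = [(v - lo) / (hi - lo) for v in col]
        else:
            norm = [(hi - v) / (hi - lo) for v in col]
        total = sum(norm)
        probs = [v / total for v in norm]
        entropy = -sum(p * math.log(p) for p in probs if p > 0.0) / math.log(n_rows)
        diversity.append(max(0.0, 1.0 - entropy))
    d_sum = sum(diversity)
    if d_sum == 0.0:
        return [1.0 / n_cols] * n_cols  # fall back to equal weights
    return [d / d_sum for d in diversity]


# Columns: bidding history (0 / -1), predicted residence time, movement angle.
weights = entropy_weights([[0, 10, 30], [-1, 20, 60], [0, 30, 90]],
                          [True, True, False])
```

An index whose values barely differ across relay nodes has entropy close to 1 and therefore receives a weight close to zero.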
In some embodiments, the following formulas are used to correct the current bid with a switching overhead term:
Figure BDA0003738060530000251
Figure BDA0003738060530000252
Figure BDA0003738060530000253
Figure BDA0003738060530000254
where b_k denotes the corrected bid vector and γ is the coefficient that balances the data transmission rate against the switching overhead: a larger γ means the satellite node places more value on the data transmission rate, and a smaller γ means it places more value on reducing the switching overhead. Because the quantities have different dimensions, the data transmission rate is first normalized using equations (37)-(40), and the switching overhead is then subtracted in equation (41). Thus, the corrected bid vector is related to the channel gain, the sub-slot allocation length, and the switching overhead.
According to the communication method, a reinforcement learning model evaluates the success rate of each potential relay node's participation in bidding and switches the base station between the bidding and non-bidding states, which reduces the scale of the satellite-ground cooperation matrix and thereby the computational cost. Meanwhile, an auction mechanism that is incentive compatible and efficient is adopted, which ensures that reporting the true bid is a dominant strategy for each potential relay node and provides a guarantee for cooperation between the satellite mobile communication system and the terrestrial cellular mobile communication system. Finally, given the highly dynamic and long-delay characteristics of the low-orbit satellite communication system, the method takes into account that the relays selected in different time snapshots change, and uses the switching overhead as an additional term to rewrite the bids in the auction process, so that the low-orbit satellite weighs both transmission benefit and switching frequency, achieves multi-objective optimization, and helps reduce delay and signaling overhead.
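The bid / no-bid decision of a potential relay node can be illustrated with a tabular Q-learning update over the action set {Y, N}. This is a minimal sketch: the class name, the hashable state labels, the epsilon-greedy exploration, and the hyperparameter values are assumptions, and the reward is whatever scalar the deployment defines.

```python
import random
from collections import defaultdict

ACTIONS = ("Y", "N")  # Y: participate in bidding, N: abandon bidding


class BidAgent:
    """Tabular Q-learning agent for one potential relay node."""

    def __init__(self, lr=0.1, discount=0.9, epsilon=0.1, seed=0):
        self.q = defaultdict(float)  # maps (state, action) to Q value
        self.lr, self.discount, self.epsilon = lr, discount, epsilon
        self.rng = random.Random(seed)

    def choose(self, state):
        """Epsilon-greedy action selection."""
        if self.rng.random() < self.epsilon:
            return self.rng.choice(ACTIONS)
        return max(ACTIONS, key=lambda a: self.q[(state, a)])

    def update(self, state, action, reward, next_state):
        """One-step update: Q += lr * (r + discount * max Q(next) - Q)."""
        best_next = max(self.q[(next_state, a)] for a in ACTIONS)
        target = reward + self.discount * best_next
        self.q[(state, action)] += self.lr * (target - self.q[(state, action)])


agent = BidAgent(lr=0.5, discount=0.9, epsilon=0.0)
agent.update("s0", "Y", 1.0, "s1")
```

In practice the continuous channel gains and remaining power would have to be discretized before they can index the table.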
The communication system of the star-ground fusion relay network based on the auction mechanism comprises at least one satellite, a plurality of potential relay nodes and a plurality of satellite users, wherein each potential relay node serves a plurality of covered ground users on the ground. The communication system may be used to perform the communication method described above, and thus, the foregoing drawings and descriptions about the communication method may be used to describe the communication system, and the same will not be repeated.
In the communication system of the application, the satellite is used for broadcasting cooperative signaling to a plurality of potential relay nodes, and the time slot of the satellite is used as a commodity in an auction mechanism; each potential relay node is used for predicting whether the potential relay node participates in bidding at the current moment, and taking all the potential relay nodes participating in bidding as first relay nodes, wherein the potential relay nodes participate in bidding to indicate that the potential relay nodes are willing to acquire the commodity, the first relay nodes are used for evaluating the channel value according to the channel performance and reporting bidding vectors comprising the estimated value to a satellite, wherein the channel comprises a satellite relay channel, a relay satellite user channel and a relay ground user channel, and the channel performance comprises the channel gain and the sub-slot allocation length; the satellite is also used for selecting a first relay node corresponding to the maximum bidding vector from a plurality of first relay nodes as a winning relay node based on the auction mechanism and the bidding vector, and the winning relay node and the satellite achieve a cooperative relationship.
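The winner-selection step can be sketched as a sealed-bid second-price auction: the first relay node with the largest bid wins, and the second-largest bid is taken as the winner's bid value, as the claims describe. The sketch assumes each bid has already been reduced to a scalar; the function name and the dictionary layout are hypothetical.

```python
def run_second_price_auction(bids):
    """Select the winner and its bid value from a mapping of relay ID to bid.

    Returns (winner_id, value), where value is the second-highest bid, or the
    winner's own bid when it is the only bidder.
    """
    if not bids:
        raise ValueError("no first relay node submitted a bid")
    ranked = sorted(bids.items(), key=lambda item: item[1], reverse=True)
    winner, top_bid = ranked[0]
    value = ranked[1][1] if len(ranked) > 1 else top_bid
    return winner, value


winner, value = run_second_price_auction({"R1": 2.0, "R2": 3.5, "R3": 3.0})
```

Because the winner's value does not depend on its own bid, overstating or understating a bid cannot improve a node's outcome, which is the usual argument for truthful reporting in this auction format.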
Some aspects of the present application may be performed entirely by hardware, entirely by software (including firmware, resident software, micro-code, etc.), or by a combination of hardware and software. The above hardware or software may be referred to as a "data block," "module," "engine," "unit," "component," or "system." The processor may be one or more application-specific integrated circuits (ASICs), digital signal processors (DSPs), digital signal processing devices (DSPDs), programmable logic devices (PLDs), field-programmable gate arrays (FPGAs), processors, controllers, microcontrollers, microprocessors, or a combination thereof. Furthermore, aspects of the present application may take the form of a computer product comprising computer-readable program code embodied in one or more computer-readable media. For example, computer-readable media can include, but are not limited to, magnetic storage devices (e.g., hard disk, floppy disk, magnetic tape), optical disks (e.g., compact disc (CD), digital versatile disc (DVD)), smart cards, and flash memory devices (e.g., card, stick, key drive).
The computer readable medium may comprise a propagated data signal with the computer program code embodied therein, for example, on a baseband or as part of a carrier wave. The propagated signal may take on a variety of forms, including electro-magnetic, optical, etc., or any suitable combination thereof. A computer readable medium can be any computer readable medium that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code located on a computer readable medium may be propagated through any suitable medium, including radio, cable, fiber optic cable, radio frequency signals, or the like, or a combination of any of the foregoing.
While the basic concepts have been described above, it will be apparent to those skilled in the art that the above disclosure is by way of example only and is not intended to be limiting. Although not explicitly described herein, various modifications, improvements, and adaptations of the present application may occur to one skilled in the art. Such modifications, improvements, and modifications are intended to be suggested within this application, and are therefore within the spirit and scope of the exemplary embodiments of this application.
Meanwhile, the present application uses specific words to describe embodiments of the present application. Reference to "one embodiment," "an embodiment," and/or "some embodiments" means that a particular feature, structure, or characteristic is associated with at least one embodiment of the present application. Thus, it should be emphasized and should be appreciated that two or more references to "an embodiment" or "one embodiment" or "an alternative embodiment" in various positions in this specification are not necessarily referring to the same embodiment. Furthermore, certain features, structures, or characteristics of one or more embodiments of the present application may be combined as suitable.
In some embodiments, numbers describing quantities of components and attributes are used; it should be understood that such numbers used in the description of embodiments are, in some examples, modified by the modifiers "about," "approximately," or "substantially." Unless otherwise indicated, "about," "approximately," or "substantially" indicates that the number allows for a 20% variation. Accordingly, in some embodiments, the numerical parameters set forth in the specification and claims are approximations that may vary depending upon the desired properties sought to be obtained by the individual embodiments. In some embodiments, the numerical parameters should take into account the specified significant digits and apply ordinary rounding. Although the numerical ranges and parameters used to confirm the breadth of the range in some embodiments are approximations, in particular embodiments such numerical values are set as precisely as practicable.

Claims (12)

1. A communication method of a star-to-ground convergence relay network based on an auction mechanism, the star-to-ground convergence relay network including at least one satellite, a plurality of potential relay nodes and a plurality of satellite users, each of the potential relay nodes serving a plurality of ground users covered by the potential relay nodes on the ground, the communication method comprising:
the satellite broadcasts cooperative signaling to the plurality of potential relay nodes, and the time slot of the satellite is used as commodity in an auction mechanism;
each potential relay node predicts whether the potential relay node participates in bidding at the current moment, and takes all the potential relay nodes participating in bidding as a first relay node, wherein the participation bidding indicates that the potential relay node is willing to acquire the commodity;
the first relay node evaluates the channel value according to the channel performance and reports bidding vectors comprising the estimated value to the satellite, wherein the channels comprise satellite-relay channels, relay-satellite user channels and relay-ground user channels, and the channel performance comprises channel gain and sub-slot allocation length; and
the satellite selects a first relay node corresponding to the maximum bidding vector from a plurality of first relay nodes as a winning relay node based on an auction mechanism and the bidding vector, and the winning relay node and the satellite achieve a cooperative relationship;
Each potential relay node predicts whether the potential relay node participates in bidding at the current moment by adopting a reinforcement learning model based on historical success rate and scene information, wherein the historical success rate is historical data of the potential relay node and the satellite in a cooperative relation, and the scene information comprises satellite-relay channel state information, relay-ground user channel state information and relay-satellite user channel state information obtained by a ground network.
2. The communication method of claim 1, wherein the reinforcement learning model is based on a Q-learning algorithm and takes the potential relay node as an agent, the satellite transit period is divided into N time segments of equal duration, the model parameters of the reinforcement learning model include the state, action, and reward of the potential relay node in a certain time segment, and the state is defined as a two-tuple:
Figure FDA0004142180280000011
where
Figure FDA0004142180280000012
where
Figure FDA0004142180280000013
denotes the channel gain between satellite S and potential relay node R_k, P_s denotes the transmit power of the satellite, σ² denotes the noise power of the satellite,
Figure FDA0004142180280000021
denotes the channel gain between potential relay node R_k and the ground user
Figure FDA0004142180280000022
covered by potential relay node R_k, and
Figure FDA0004142180280000023
denotes the remaining power of potential relay node R_k; the action is defined as an action set A = {Y, N}, where Y denotes bidding and N denotes abandoning the bid; the reward is expressed as a discounted reward, defined as the accumulated value of each immediate reward of the agent during learning multiplied by the corresponding power of the discount factor; the action-value function Q(s, a) is defined at a certain time, and the reinforcement learning model updates the action-value function to obtain the optimal action-value function Q*(s, a).
3. The communication method of claim 1, wherein the reinforcement learning model predicts whether the potential relay node participates in bidding at the current time using a Double DQN algorithm, the satellite transit time is divided into P time segments, and the model parameters of the reinforcement learning model include the state, action, and reward of the potential relay node in a certain time segment, the state being defined as a six-tuple:
Figure FDA0004142180280000024
where
Figure FDA0004142180280000025
denotes the channel gain between satellite S and potential relay node R_k,
Figure FDA0004142180280000026
denotes the channel gain between the potential relay node R_k and satellite user D,
Figure FDA0004142180280000027
denotes the satellite tilt angle of said satellite,
Figure FDA0004142180280000028
Figure FDA0004142180280000029
denotes the angle of visibility of the potential relay node to the satellite,
Figure FDA00041421802800000210
Figure FDA00041421802800000211
denotes the remaining power of the potential relay node, and θ denotes the movement angle of the satellite, θ ∈ [0°, 180°]; the action is expressed as a = π(s), where π(·) is a policy representing the mapping from the environment state s to the action a; the reward is represented by a numerical value; the reinforcement learning model comprises a Q learning network and a target learning network; data from the environment are input into the Q learning network, the system selects the action with the maximum value in the Q learning network, the parameters of the Q learning network are copied into the target learning network at intervals, and the target learning network updates the parameters of the Q learning network by back-propagation under the action of a loss function, so as to gradually obtain an optimized Q value.
4. The communication method as claimed in claim 1, further comprising:
the satellite broadcasts information of the winning relay node;
upon arrival of a service period, the satellite transmits a satellite signal to the winning relay node; and
the winning relay node accesses the satellite frequency band of the satellite and provides services for ground users.
5. The communication method of claim 4, wherein one transmission frame of the satellite and the winning relay node comprises (n+3) time slots, wherein a first time slot is used for the satellite to broadcast the cooperative signaling to all potential relay nodes, a second time slot is used for the satellite to transmit one or more satellite signals to the winning relay node, a third time slot is used for the winning relay node to forward the satellite signals to the satellite users, the remaining N time slots are used for the winning relay node to serve terrestrial users within its coverage, and the service period comprises the remaining (n+2) time slots other than the first time slot.
6. The communication method of claim 5, wherein the auction mechanism is a Vickrey auction mechanism and the winning relay node is the first relay node having the largest bid vector.
7. The communication method of claim 6, wherein after the winning relay node is determined, the second-largest bid vector is taken as the bid value of the winning relay node.
8. The communication method of claim 6, wherein each component of the bid vector
Figure FDA0004142180280000031
is calculated step by step using the following formulas:
Figure FDA0004142180280000032
Figure FDA0004142180280000033
Figure FDA0004142180280000034
Figure FDA0004142180280000035
Figure FDA0004142180280000036
Figure FDA0004142180280000037
Figure FDA0004142180280000038
Figure FDA0004142180280000039
where the set of first relay nodes is denoted R = {R_1, R_2, …, R_k, …, R_K}, and K is the total number of first relay nodes; the kth first relay node R_k serves N ground users, denoted U_k = {U_k1, U_k2, …, U_kn, …, U_kN};
Figure FDA0004142180280000041
denotes the initial bid value provided by the kth first relay node R_k; t_1 denotes the length of the second transmission slot of the first relay node R_k; t_2 denotes the length of the third transmission slot of the first relay node R_k; t_kn (n = 1 to N) denotes the lengths of the remaining N slots of the first relay node R_k;
Figure FDA0004142180280000042
denotes the signal-to-noise ratio of the satellite-first relay link;
Figure FDA0004142180280000043
denotes the signal-to-noise ratio of the first relay-ground user link;
Figure FDA0004142180280000044
denotes the signal-to-noise ratio of the first relay-satellite user link;
Figure FDA0004142180280000045
denotes the channel capacity of the link that the first relay node R_k provides for satellite user D;
Figure FDA0004142180280000046
denotes the QoS requirement of the ground user, meaning the minimum data transmission rate that must be met for the ground user;
Figure FDA0004142180280000047
denotes the channel gain of the link from the first relay node R_k to satellite user D;
Figure FDA0004142180280000048
respectively denote the channel gains of the three links satellite-first relay node, first relay node-satellite user, and first relay node-ground user; N and K are positive integers greater than or equal to 1; P_s denotes the transmit power of said satellite, σ² denotes the noise power of the satellite, and
Figure FDA0004142180280000049
denotes the remaining power of the first relay node R_k.
9. The communication method according to claim 8, further comprising, after the first relay node reports the bid vector to the satellite, the satellite correcting the bid vector based on a handover overhead index to obtain a corrected bid vector, the handover overhead index including one or any of bid history information, predicted residence time, and movement angle information, the satellite selecting, from the first relay nodes, a first relay node corresponding to a largest corrected bid vector as a winning relay node according to the corrected bid vector, wherein the bid history information is obtained from a set of all winning relay nodes selected by the satellite over a time sequence; the predicted residence time and the movement angle information are derived from a geometric model.
10. The communication method of claim 9, wherein the bid vector is modified using the following formula:
Figure FDA00041421802800000410
where b_k denotes the corrected bid vector,
Figure FDA00041421802800000411
denotes the normalized bid vector obtained by normalizing the bid vector; γ is the coefficient that balances the data transmission rate against the switching overhead: a larger γ means the satellite node places more value on the data transmission rate, and a smaller γ means it places more value on reducing the switching overhead;
Figure FDA00041421802800000412
is the decision matrix, where x_kj denotes any of the handover overhead indexes, k is the sequence number of the first relay node, K is the total number of the first relay nodes, j is the sequence number of the handover overhead index, and w_j is the weight of the jth handover overhead index.
11. The communication method of claim 10, wherein the weights are determined using an entropy weight method.
12. A communication system of a star-to-ground fusion relay network based on an auction mechanism, the star-to-ground fusion relay network comprising at least one satellite, a plurality of potential relay nodes and a plurality of satellite users, each potential relay node serving a plurality of ground users covered by the potential relay nodes on the ground, characterized in that the satellite is used for broadcasting cooperative signaling to the plurality of potential relay nodes, and the time slot of the satellite is used as a commodity in the auction mechanism; each potential relay node is used for predicting whether the potential relay node participates in bidding at the current moment, and taking all potential relay nodes participating in bidding as a first relay node, wherein the participation bidding indicates that the potential relay node is willing to acquire the commodity, the first relay node is used for evaluating the channel value according to the channel performance and reporting bidding vectors comprising the estimated value to the satellite, wherein the channels comprise a satellite-relay channel, a relay-satellite user channel and a relay-ground user channel, and the channel performance comprises the channel gain and the sub-slot allocation length; the satellite is further used for selecting a first relay node corresponding to the maximum bidding vector from a plurality of first relay nodes based on an auction mechanism and the bidding vector as a winning relay node, and the winning relay node and the satellite achieve a cooperative relationship; each potential relay node predicts whether the potential relay node participates in bidding at the current moment by adopting a reinforcement learning model based on historical success rate and scene information, wherein the historical success rate is historical data of the potential relay node and the satellite in a cooperative relation, and the scene information comprises satellite-relay channel state information, relay-ground user channel state information and relay-satellite user channel state information obtained by a ground network.
CN202210806700.4A 2022-07-08 2022-07-08 Communication method and communication system of star-ground fusion relay network based on auction mechanism Active CN115173926B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210806700.4A CN115173926B (en) 2022-07-08 2022-07-08 Communication method and communication system of star-ground fusion relay network based on auction mechanism

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210806700.4A CN115173926B (en) 2022-07-08 2022-07-08 Communication method and communication system of star-ground fusion relay network based on auction mechanism

Publications (2)

Publication Number Publication Date
CN115173926A CN115173926A (en) 2022-10-11
CN115173926B CN115173926B (en) 2023-07-07

Family

ID=83492664

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210806700.4A Active CN115173926B (en) 2022-07-08 2022-07-08 Communication method and communication system of star-ground fusion relay network based on auction mechanism

Country Status (1)

Country Link
CN (1) CN115173926B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108023637B (en) * 2017-12-06 2020-06-23 中国人民解放军国防科技大学 Isomorphic multi-satellite online collaboration method
WO2022105621A1 (en) * 2020-11-17 2022-05-27 重庆邮电大学 Evolutionary game-based multi-user switching method in software-defined satellite network system

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR101490351B1 (en) * 2008-05-29 2015-02-05 삼성전자주식회사 Apparatus and method for relay station mode selection in broadband wireless access communication system
CN108832989A (en) * 2018-05-07 2018-11-16 哈尔滨工程大学 The online Dynamic Programming terminal of the task of low rail microsatellite and planing method used in


Also Published As

Publication number Publication date
CN115173926A (en) 2022-10-11

Similar Documents

Publication Publication Date Title
CN114362810B (en) Low orbit satellite beam jump optimization method based on migration depth reinforcement learning
CN113543176B (en) Unloading decision method of mobile edge computing system based on intelligent reflecting surface assistance
CN111800828B (en) Mobile edge computing resource allocation method for ultra-dense network
US11388732B2 (en) Method for associating user equipment in a cellular network via multi-agent reinforcement learning
CN112118601A (en) Method for reducing task unloading delay of 6G digital twin edge computing network
CN114143346B (en) Joint optimization method and system for task unloading and service caching of Internet of vehicles
CN113543074A (en) Joint computing migration and resource allocation method based on vehicle-road cloud cooperation
CN105379412A (en) System and method for controlling multiple wireless access nodes
Pan et al. Artificial intelligence-based energy efficient communication system for intelligent reflecting surface-driven vanets
CN112929849B (en) Reliable vehicle-mounted edge calculation unloading method based on reinforcement learning
CN116456493A (en) D2D user resource allocation method and storage medium based on deep reinforcement learning algorithm
Xu et al. Joint task offloading and resource optimization in noma-based vehicular edge computing: A game-theoretic drl approach
CN115190489A (en) Cognitive wireless network dynamic spectrum access method based on deep reinforcement learning
CN115034390A (en) Deep learning model reasoning acceleration method based on cloud edge-side cooperation
CN116566838A (en) Internet of vehicles task unloading and content caching method with cooperative blockchain and edge calculation
Chua et al. Resource allocation for mobile metaverse with the Internet of Vehicles over 6G wireless communications: A deep reinforcement learning approach
Hazarika et al. Multi-agent DRL-based computation offloading in multiple RIS-aided IoV networks
Mafuta et al. Decentralized resource allocation-based multiagent deep learning in vehicular network
Liu et al. A novel hybrid split and federated learning architecture in wireless UAV networks
CN115173926B (en) Communication method and communication system of star-ground fusion relay network based on auction mechanism
Shaodong et al. Multi-step reinforcement learning-based offloading for vehicle edge computing
CN115022322B (en) Edge cloud cooperation task unloading method based on crowd-sourced evolution in Internet of vehicles
Wang et al. Joint offloading decision and resource allocation in vehicular edge computing networks
CN116600316A (en) Air-ground integrated Internet of things joint resource allocation method based on deep double Q networks and federal learning
Lyu et al. Service-driven resource management in vehicular networks based on deep reinforcement learning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant