CN115173926A - Communication method and communication system of satellite-ground converged relay network based on auction mechanism - Google Patents

Communication method and communication system of satellite-ground converged relay network based on auction mechanism

Info

Publication number
CN115173926A
Authority
CN
China
Prior art keywords
satellite
relay node
relay
bidding
channel
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202210806700.4A
Other languages
Chinese (zh)
Other versions
CN115173926B (en)
Inventor
谢卓辰
杨文歆
晏睦彪
韩欣洋
刘会杰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Engineering Center for Microsatellites
Innovation Academy for Microsatellites of CAS
Original Assignee
Shanghai Engineering Center for Microsatellites
Innovation Academy for Microsatellites of CAS
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Engineering Center for Microsatellites, Innovation Academy for Microsatellites of CAS
Priority to CN202210806700.4A
Publication of CN115173926A
Application granted
Publication of CN115173926B
Current legal status: Active
Anticipated expiration

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04B TRANSMISSION
    • H04B7/00 Radio transmission systems, i.e. using radiation field
    • H04B7/14 Relay systems
    • H04B7/15 Active relay systems
    • H04B7/185 Space-based or airborne stations; Stations for satellite systems
    • H04B7/1851 Systems using a satellite or space-based relay
    • H04B7/18519 Operations control, administration or maintenance
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/14 Network analysis or design
    • H04L41/145 Network analysis or design involving simulating, designing, planning or modelling of a network
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D30/00 Reducing energy consumption in communication networks
    • Y02D30/70 Reducing energy consumption in communication networks in wireless communication networks

Abstract

The invention provides a communication method and a communication system of a satellite-ground converged relay network based on an auction mechanism. The method comprises the following steps: broadcasting a cooperation signaling to a plurality of potential relay nodes by the satellite, and taking a time slot of the satellite as a commodity in an auction mechanism; each potential relay node predicts whether the potential relay node participates in bidding at the current moment, and all the potential relay nodes participating in bidding are used as first relay nodes; the first relay node evaluates the value of a channel according to the performance of the channel, and reports a bidding vector including the estimated value to a satellite, wherein the channel comprises a satellite-relay channel, a relay-satellite user channel and a relay-ground user channel, and the performance of the channel comprises channel gain and sub-time slot allocation length; and the satellite selects a first relay node corresponding to the maximum bidding vector from the plurality of first relay nodes as a winning relay node based on the auction mechanism and the bidding vectors, and the winning relay node and the satellite achieve a cooperative relationship.

Description

Communication method and communication system of satellite-ground converged relay network based on auction mechanism
Technical Field
The invention mainly relates to the technical field of satellite communication, in particular to a communication method and a communication system of a satellite-ground converged relay network based on an auction mechanism.
Background
For a considerable period of time, researchers have focused on terrestrial mobile cellular networks. The terrestrial cellular mobile communication system has large communication capacity, low network delay, high spectral efficiency, and mature technology, and it can widely cover densely populated urban areas. Due to terrain and economic constraints, however, it cannot cover most of the globe, such as oceans and remote areas. By virtue of its strong wide-area coverage capability, the satellite communication system can provide seamless broadband Internet service for users worldwide, particularly in remote areas, oceans, and disaster areas with damaged links, which the traditional terrestrial cellular network cannot cover. However, low earth orbit satellites cannot provide effective coverage for densely populated areas because of line-of-sight transmission constraints, shadowing effects, and other factors. Researchers have therefore proposed the concept of a "satellite-ground converged network", which merges a satellite communication system with a terrestrial communication system. The two systems use different technical regimes and have developed along separate paths, so how to exchange resources between them to achieve win-win cooperation is an important research subject for the satellite-ground converged network. In order to further improve the coverage of the satellite mobile communication network and provide high data transmission rate services, the terrestrial mobile communication system may help the satellite amplify and forward its signal, i.e., the terrestrial mobile communication system acts as a relay for the satellite signal.
Therefore, the ground mobile communication system can obtain the access opportunity of the satellite-ground shared frequency band while resisting the shadow effect and the fading effect. In this case, the terrestrial mobile communication system and the satellite communication system can be regarded as a whole, and are called a satellite-ground converged relay network.
At present, satellite-ground cooperative communication usually places all devices with relay functions into a candidate set. However, one satellite beam can cover several kilometers to dozens of kilometers, so the amount of computation easily becomes excessive when a large number of potential relay nodes exist. Beyond the problem of computational scale, conventional relay selection in satellite-ground converged networks often assumes that every node is honest. In reality this may not hold: dishonest potential relay nodes may provide false information and consequently be scheduled frequently, which reduces the transmission rate of the primary user. In addition, existing research is often based on a quasi-static scenario. It does not consider the handover overhead incurred when, owing to the dynamics of the low-orbit satellite, the satellite selects different relays in two consecutive time snapshots, and it selects relays to amplify and forward signals based only on opportunistic scheduling. For a highly dynamic communication system such as a low-orbit satellite system, this may cause the selected relay to change repeatedly and bring frequent exchange of control signaling, resulting in long communication delay and increased computation overhead.
Disclosure of Invention
The invention aims to provide a communication method and a communication system of a satellite-ground converged relay network based on an auction mechanism, which can realize efficient satellite-ground cooperative communication in a dynamic environment.
In order to solve the above technical problem, the present invention provides a communication method of a satellite-ground converged relay network based on an auction mechanism, where the satellite-ground converged relay network includes at least one satellite, a plurality of potential relay nodes and a plurality of satellite users, and each potential relay node serves a plurality of ground users covered by the potential relay node on the ground, and the communication method includes: broadcasting cooperation signaling to the plurality of potential relay nodes by the satellite, and taking a time slot of the satellite as a commodity in an auction mechanism; each potential relay node predicts whether the potential relay node participates in bidding at the current moment, and all potential relay nodes participating in bidding are taken as first relay nodes, wherein the participation bidding indicates that the potential relay nodes are willing to obtain the commodity; the first relay node evaluates the value of a channel according to the performance of the channel, and reports a bidding vector comprising the estimated value to the satellite, wherein the channel comprises a satellite-relay channel, a relay-satellite user channel and a relay-ground user channel, and the performance of the channel comprises channel gain and sub-time slot allocation length; and the satellite selects a first relay node corresponding to the maximum bidding vector from a plurality of first relay nodes as a winning relay node based on an auction mechanism and the bidding vectors, and the winning relay node and the satellite achieve a cooperative relationship.
In an embodiment of the application, each potential relay node predicts whether the potential relay node participates in bidding at the current time based on a historical success rate and scene information, wherein the historical success rate is historical data of cooperation relationship established between the potential relay node and the satellite, and the scene information includes satellite-relay channel state information, relay-ground user channel state information and relay-satellite user channel state information obtained by a ground network.
In an embodiment of the present application, each potential relay node performs the prediction based on a reinforcement learning model that uses a Q-learning algorithm. The transit period of the satellite is divided into N time segments, the model parameters of the reinforcement learning model include the state, action, and reward of the potential relay node in a given time segment, and the state is defined as a two-tuple:
Figure BDA0003738060530000031
wherein
Figure BDA0003738060530000032
Figure BDA0003738060530000033
wherein
Figure BDA0003738060530000034
represents the channel gain between the satellite S and the potential relay node R_k, P_s represents the transmission power of the satellite, σ² represents the noise power of the satellite,
Figure BDA0003738060530000035
represents the channel gain between the potential relay node R_k and the ground user
Figure BDA0003738060530000036
under the coverage of the potential relay node R_k, and
Figure BDA0003738060530000037
represents the remaining power of the potential relay node R_k. The action is defined as the action set A = {Y, N}, wherein Y represents bidding and N represents abandoning the bid. The reward is expressed as a discounted return, defined as the accumulated value, over the learning process of the agent, of each instant reward multiplied by the corresponding power of the discount factor. The action value function Q(s, a) is defined as the expected discounted return when action a ∈ A is taken in state s in a given time segment, and the reinforcement learning model updates the action value function to obtain the optimal action value function Q*(s, a).
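The Q-learning embodiment above can be illustrated with a minimal tabular sketch. The source does not give the update rule, learning rate, exploration strategy, or state discretization, so the standard Q-learning update, the epsilon-greedy choice, and every name below (`q_update`, `choose_action`, `discounted_return`, `alpha`, `discount`, `epsilon`) are assumptions made for illustration only. The state is taken to be a hypothetical discretized two-tuple such as (link quality level, remaining power level).

```python
import random
from collections import defaultdict

# Action set A = {Y, N} from the text: Y = bid, N = abandon the bid.
ACTIONS = ("Y", "N")


def discounted_return(rewards, discount=0.9):
    """Sum of instant rewards, each multiplied by discount**t (t = step index)."""
    return sum((discount ** t) * r for t, r in enumerate(rewards))


def q_update(q, state, action, reward, next_state, alpha=0.1, discount=0.9):
    """Standard tabular Q-learning step (assumed, not taken from the source):
    Q(s,a) <- Q(s,a) + alpha * (r + discount * max_a' Q(s',a') - Q(s,a))."""
    best_next = max(q[(next_state, a)] for a in ACTIONS)
    td_target = reward + discount * best_next
    q[(state, action)] += alpha * (td_target - q[(state, action)])
    return q[(state, action)]


def choose_action(q, state, epsilon=0.1, rng=random):
    """Epsilon-greedy choice: explore with probability epsilon, else act greedily."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: q[(state, a)])


# Hypothetical usage: one learning step for a relay in state (2, 1).
q_table = defaultdict(float)
q_update(q_table, (2, 1), "Y", reward=1.0, next_state=(2, 0))
```

After enough updates, a relay would bid in a time segment only when `choose_action` returns "Y" for its current state, which is how the prediction step shrinks the set of first relay nodes.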
In an embodiment of the application, each potential relay node performs the prediction based on a reinforcement learning model that uses a Double DQN algorithm to predict whether the potential relay node participates in bidding at the current moment. The satellite transit period is divided into P time segments of equal duration, the model parameters of the reinforcement learning model include the state, action, and reward of the potential relay node in a given time segment, and the state is defined as a six-tuple:
Figure BDA0003738060530000038
wherein
Figure BDA0003738060530000039
represents the channel gain between the satellite S and the potential relay node R_k,
Figure BDA00037380605300000310
represents the channel gain between the potential relay node R_k and the satellite user D,
Figure BDA00037380605300000311
represents the inclination of the satellite,
Figure BDA00037380605300000312
Figure BDA00037380605300000313
represents the visibility angle of the potential relay node to the satellite,
Figure BDA00037380605300000314
Figure BDA00037380605300000315
represents the remaining power of the potential relay node, and θ represents the movement angle of the satellite, θ ∈ [0°, 180°]. The action is expressed as a = π(s), where π(·) is a policy representing the mapping from environment state s to action a. The reward is expressed as a numerical value. The reinforcement learning model comprises a Q learning network and a target learning network: data from the environment are input into the Q learning network, the system selects the action with the maximum Q value output by the Q learning network, the parameters of the Q learning network are copied into the target learning network at intervals, and the parameters of the Q learning network are updated by backpropagation of a loss function computed with the target learning network, so as to gradually obtain an optimized Q value.
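The interplay between the Q learning network and the target learning network can be sketched with the standard Double DQN target, in which the online (Q learning) network picks the next action and the target network scores it. The source does not state the loss function or target formula, so this textbook form and all names below (`double_dqn_target`, `sync_target`, `discount`, `done`) are assumptions; the networks are stood in for by plain lists of Q values and dictionaries of parameters.

```python
def double_dqn_target(reward, next_q_online, next_q_target, discount=0.9, done=False):
    """Double DQN regression target (standard form, assumed here).

    next_q_online: Q values for the next state from the Q learning network.
    next_q_target: Q values for the next state from the target learning network.
    """
    if done:
        # End of the transit period: no future return to bootstrap from.
        return reward
    # Action selection uses the online network ...
    best_action = max(range(len(next_q_online)), key=next_q_online.__getitem__)
    # ... while action evaluation uses the target network.
    return reward + discount * next_q_target[best_action]


def sync_target(online_params, target_params):
    """Periodic hard copy of the online parameters into the target network."""
    target_params.clear()
    target_params.update(online_params)


# Hypothetical usage with two actions (index 0 = bid, index 1 = abandon the bid).
y = double_dqn_target(1.0, next_q_online=[0.2, 0.8], next_q_target=[0.5, 0.3], discount=0.5)
```

Decoupling selection from evaluation in this way is the usual motivation for Double DQN: it reduces the overestimation that arises when one network both chooses and scores the maximizing action.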
In an embodiment of the present application, the method further includes: the satellite broadcasts information of the winning relay node; upon arrival of a service period, the satellite transmitting a satellite signal to the winning relay node; and the winning relay node accesses the satellite frequency band of the satellite and provides service for the ground user.
In an embodiment of the present application, one transmission frame of the satellite and the winning relay node includes (N + 3) time slots, where a first time slot is used for the satellite to broadcast the cooperative signaling to all potential relay nodes, a second time slot is used for the satellite to transmit one or more satellite signals to the winning relay node, a third time slot is used for the winning relay node to forward the satellite signals to the satellite users, and the remaining N time slots are used for the winning relay node to serve the terrestrial users in its coverage area, and the service period includes the remaining (N + 2) time slots except the first time slot.
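The frame layout described above can be sketched as a list of slot roles. The function names and slot labels below are hypothetical and serve only to make the (N + 3)-slot structure and the (N + 2)-slot service period concrete; slot lengths are not modeled.

```python
def build_frame(num_ground_users):
    """Return the slot roles of one transmission frame with (N + 3) slots."""
    frame = [
        "broadcast_cooperation_signaling",  # slot 1: satellite -> all potential relay nodes
        "satellite_to_winning_relay",       # slot 2: satellite -> winning relay node
        "relay_to_satellite_user",          # slot 3: winning relay node -> satellite user
    ]
    # Remaining N slots: the winning relay node serves its own ground users.
    frame += [f"serve_ground_user_{n}" for n in range(1, num_ground_users + 1)]
    return frame


def service_period(frame):
    """The service period is every slot except the first (broadcast) slot."""
    return frame[1:]


# Hypothetical usage: a winning relay node with N = 4 ground users.
frame = build_frame(4)
```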
In an embodiment of the application, the auction mechanism is a Vickrey auction mechanism, and the winning relay node is the first relay node with the largest bidding vector.
In an embodiment of the present application, after the winning relay node is determined, the second-largest bidding vector among the first relay nodes is used as the bid value of the winning relay node.
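A Vickrey (sealed-bid, second-price) auction of this kind can be sketched as follows. For simplicity each bid is treated as a single scalar value per relay node, which is an assumption, since the source speaks of bidding vectors without giving the comparison rule. The `reserve` parameter used when only one relay node bids is also hypothetical.

```python
def vickrey_auction(bids, reserve=0.0):
    """Select the winner and the price in a second-price sealed-bid auction.

    bids: dict mapping a relay node identifier to its scalar bid value.
    Returns (winner_id, price), where price is the second-highest bid,
    or `reserve` (an assumed fallback) when there is only one bidder.
    """
    if not bids:
        raise ValueError("at least one bid is required")
    ranked = sorted(bids.items(), key=lambda item: item[1], reverse=True)
    winner = ranked[0][0]
    price = ranked[1][1] if len(ranked) > 1 else reserve
    return winner, price


# Hypothetical usage with three first relay nodes.
winner, price = vickrey_auction({"R1": 3.0, "R2": 5.0, "R3": 4.0})
```

Because the winner pays the second-highest bid rather than its own, overstating or understating a bid cannot improve a node's outcome, which is why truthful reporting is a dominant strategy under this mechanism.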
In one embodiment of the present application, each component of the bidding vector
Figure BDA0003738060530000041
is calculated step by step using the following formulas:
Figure BDA0003738060530000042
Figure BDA0003738060530000043
Figure BDA0003738060530000044
Figure BDA0003738060530000045
Figure BDA0003738060530000046
Figure BDA0003738060530000051
Figure BDA0003738060530000052
Figure BDA0003738060530000053
wherein the set of first relay nodes is denoted R = {R_1, R_2, …, R_k, …, R_K}, and K represents the total number of first relay nodes; the k-th first relay node R_k serves N ground users, denoted U_k = {U_k1, U_k2, …, U_kn, …, U_kN};
Figure BDA0003738060530000054
denotes the initial bid value provided by the k-th first relay node R_k; t_1 represents the length of the second transmission slot of the first relay node R_k; t_2 represents the length of the third transmission slot of the first relay node R_k; t_kn (n = 1 to N) represents the lengths of the remaining N time slots of the first relay node R_k;
Figure BDA0003738060530000055
represents the signal-to-noise ratio of the satellite to first relay link;
Figure BDA0003738060530000056
represents the signal-to-noise ratio of the first relay to ground user link;
Figure BDA0003738060530000057
represents the signal-to-noise ratio of the first relay to satellite user link;
Figure BDA0003738060530000058
denotes the channel capacity of the link that the first relay node R_k provides for the satellite user D;
Figure BDA0003738060530000059
represents the QoS requirement of the ground user, i.e., the minimum data transmission rate that must be met for the ground user;
Figure BDA00037380605300000510
represents the channel gain of the link from the first relay node R_k to the satellite user D;
Figure BDA00037380605300000511
respectively represent the channel gains of the three links, namely the satellite to first relay node link, the first relay node to satellite user link, and the first relay node to ground user link, wherein N and K are positive integers greater than or equal to 1.
In an embodiment of the present application, after the first relay node reports the bid vector to the satellite, the method further includes: the satellite corrects the bid vector based on a handover cost index to obtain a corrected bid vector, where the handover cost index includes one or more of bid history information, predicted residence time, and movement angle information; and the satellite selects, according to the corrected bid vectors, the first relay node corresponding to the maximum corrected bid vector from among the first relay nodes as the winning relay node. The bid history information is obtained from the set of all winning relay nodes selected by the satellite in time sequence; the predicted residence time and the movement angle information are obtained from a geometric model.
In an embodiment of the present application, the bidding vector is modified by the following formula:
Figure BDA00037380605300000512
wherein b_k represents the corrected bidding vector,
Figure BDA00037380605300000513
represents the normalized bidding vector obtained by normalizing the bidding vector, and γ represents a factor that balances the data transmission rate against the handover overhead: the larger γ is, the more importance the satellite node attaches to the data transmission rate, and the smaller γ is, the more importance it attaches to reducing the handover overhead;
Figure BDA0003738060530000061
is a decision matrix, wherein x_kj represents any of said handover overhead indicators, k represents the index of the first relay node, K is the total number of first relay nodes, j represents the index of the handover overhead indicator, and w_j represents the weight of the j-th handover overhead indicator.
In an embodiment of the application, the weights are determined using an entropy weight method.
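The entropy weight method derives each indicator weight from how much that indicator varies across the candidates. The source does not give its normalization or formula details, so the sketch below follows the common textbook formulation and assumes a decision matrix of non-negative indicator values with at least two rows; the function name `entropy_weights` is hypothetical.

```python
import math


def entropy_weights(matrix):
    """Entropy weight method (textbook form, assumed here).

    matrix[k][j] is the non-negative value of handover overhead indicator j
    for first relay node k. Returns weights w_j that sum to 1.
    """
    rows, cols = len(matrix), len(matrix[0])
    if rows < 2:
        raise ValueError("at least two relay nodes are required")
    scale = 1.0 / math.log(rows)
    divergences = []
    for j in range(cols):
        col_sum = sum(matrix[k][j] for k in range(rows))
        if col_sum <= 0:
            # An all-zero indicator carries no information.
            divergences.append(0.0)
            continue
        entropy = 0.0
        for k in range(rows):
            p = matrix[k][j] / col_sum
            if p > 0:
                entropy -= scale * p * math.log(p)
        # Indicators that differ more across relay nodes get a larger weight.
        divergences.append(max(0.0, 1.0 - entropy))
    total = sum(divergences)
    if total == 0:
        return [1.0 / cols] * cols  # fall back to equal weights
    return [d / total for d in divergences]


# Hypothetical usage: two relay nodes, two indicators.
weights = entropy_weights([[1.0, 1.0], [1.0, 3.0]])
```

In this example the first indicator is identical for both relay nodes, so nearly all the weight goes to the second indicator, the only one that discriminates between candidates.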
The present application further provides a communication system of a satellite-ground converged relay network based on an auction mechanism, where the satellite-ground converged relay network includes at least one satellite, a plurality of potential relay nodes and a plurality of satellite users, and each potential relay node serves a plurality of ground users covered by the potential relay node on the ground, and is characterized in that the satellite is configured to broadcast a cooperation signaling to the plurality of potential relay nodes, and use a time slot of the satellite as a commodity in the auction mechanism; each potential relay node is used for predicting whether the potential relay node participates in bidding at the current moment, and taking all the potential relay nodes participating in bidding as first relay nodes, wherein the participation bidding indicates that the potential relay nodes are willing to obtain the commodity, the first relay nodes are used for evaluating the channel value according to the channel performance and reporting bidding vectors including the estimated value to the satellite, wherein the channels include a satellite-relay channel, a relay-satellite user channel and a relay-ground user channel, and the channel performance includes channel gain and sub-slot allocation length; the satellite is further used for selecting a first relay node corresponding to the maximum bidding vector from a plurality of first relay nodes as a winning relay node based on an auction mechanism and the bidding vectors, and the winning relay node and the satellite achieve a cooperative relationship.
The communication method of the application uses a prediction model to evaluate the success rate of each potential relay node in bidding and switches the base station between the bidding state and the non-bidding state accordingly, thereby reducing the scale of the satellite-ground cooperation matrix and the calculation overhead. Meanwhile, an auction mechanism with incentive compatibility and effectiveness is adopted, so that reporting the true bid is the dominant strategy of each potential relay node, which guarantees cooperation between the satellite mobile communication system and the ground cellular mobile communication system. Finally, in view of the high dynamics and long delay of the low-orbit satellite communication system, the method takes into account that the relay selected in different time snapshots may change, and uses the switching overhead as an additional term to revise the bids in the auction process, so that the low-orbit satellite jointly considers the transmission benefit and the switching frequency. This realizes multi-objective optimization and helps reduce time delay and signaling transmission overhead.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the application and together with the description serve to explain the principle of the invention. In the drawings:
FIG. 1 is a schematic diagram of a scenario of a satellite-ground converged relay network;
FIG. 2 is an exemplary flowchart of a communication method of an auction-based satellite-ground converged relay network according to an embodiment of the present application;
FIG. 3 is a schematic information flow diagram in a communication method according to an embodiment of the present application;
FIG. 4 is a basic framework diagram of a reinforcement learning model in the communication method according to an embodiment of the present application;
FIG. 5 is a basic framework diagram of Double DQN with experience replay in the communication method according to an embodiment of the present application;
FIGS. 6A-6C are diagrams illustrating satellite slot allocation in three cooperation states;
FIG. 7 is a schematic diagram of the relationship between satellite beams and base station locations.
Detailed Description
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings used in the description of the embodiments will be briefly introduced below. It is obvious that the drawings in the following description are only examples or embodiments of the application, and a person skilled in the art can apply the application to other similar scenarios based on these drawings without inventive effort. Unless otherwise apparent from the context, or otherwise indicated, like reference numbers in the figures refer to the same structure or operation.
As used in this application and in the claims, the terms "a," "an," and/or "the" do not refer specifically to the singular and may also include the plural, unless the context clearly dictates otherwise. In general, the terms "comprises" and "comprising" merely indicate that the explicitly identified steps and elements are included; these steps and elements do not form an exclusive list, and a method or apparatus may include other steps or elements.
The relative arrangement of the components and steps, the numerical expressions, and numerical values set forth in these embodiments do not limit the scope of the present application unless specifically stated otherwise. Meanwhile, it should be understood that the sizes of the respective portions shown in the drawings are not drawn in an actual proportional relationship for the convenience of description. Techniques, methods, and apparatus known to those of ordinary skill in the relevant art may not be discussed in detail but are intended to be part of the specification where appropriate. In all examples shown and discussed herein, any particular value should be construed as exemplary only and not as limiting. Thus, other examples of the exemplary embodiments may have different values. It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, further discussion thereof is not required in subsequent figures.
It should be noted that the terms "first", "second", and the like are used to define the components, and are only used for convenience of distinguishing the corresponding components, and the terms have no special meanings unless otherwise stated, and therefore, the scope of protection of the present application is not to be construed as being limited. Further, although the terms used in the present application are selected from publicly known and used terms, some of the terms mentioned in the specification of the present application may be selected by the applicant at his or her discretion, the detailed meanings of which are described in relevant parts of the description herein. Further, it is required that the present application is understood not only by the actual terms used but also by the meaning of each term lying within.
Flowcharts are used herein to illustrate the operations performed by systems according to embodiments of the present application. It should be understood that the preceding or following operations are not necessarily performed in the exact order shown. Rather, various steps may be processed in reverse order or simultaneously. Other operations may also be added to these processes, or one or more operations may be removed from them.
According to the communication method of the satellite-ground integration relay network based on the auction mechanism, the auction mechanism is introduced, and a cooperative communication process between the satellite mobile communication system and the ground mobile cellular system is constructed. The satellite-ground converged relay network comprises at least one satellite, a plurality of potential relay nodes and a plurality of satellite users.
For convenience of explaining the communication method of the present application, the scenario of the satellite-ground converged relay network is first defined. The satellite-ground converged relay network comprises two sub-networks: one is a satellite mobile communication network, i.e., the main network; the other comprises N terrestrial cellular mobile communication networks, i.e., the terrestrial networks. Each terrestrial network contains one potential relay node and multiple terrestrial users.
FIG. 1 is a schematic view of a scenario of a satellite-ground converged relay network. FIG. 1 shows one satellite 110, three potential relay nodes 121, 122, 123, and one satellite user 130, the last of which is also denoted by the letter D hereinafter. The illustration in FIG. 1 is merely an example and is not intended to limit the number and positional relationships of satellite nodes, potential relay nodes, and satellite users in the satellite-ground converged relay network. The satellite 110 may be a low earth orbit satellite. In FIG. 1, the potential relay nodes 121, 122, 123 are all base stations. The base station node with the relay function can be a 5G base station with receiving, amplifying, and forwarding functions, a D2D node, or an ad hoc network. The present application describes a specific embodiment in which a base station is used as a potential relay node, and is not limited to a specific type of potential relay node.
In fig. 1, a circle around each potential relay node 121, 122, 123 is used to represent the cell coverage of the potential relay node, and each potential relay node is used to serve terrestrial users within its coverage. In the terrestrial wireless communication system, the terrestrial user is any user terminal, such as a mobile terminal, which may use the signal transmitted by the potential relay node, and the application is not limited thereto. As shown in fig. 1, each satellite, each potential relay node, and the satellite user have computing and storage capabilities themselves, and may act as a computing node.
In the satellite-ground converged relay network, the main network includes a satellite node S and one or more satellite user nodes, i.e., the satellite users 130 shown in FIG. 1. Satellite users 130 include ground stations, satellite terminals, and dual mode terminals. The present application discusses the case in which the main network contains one satellite user node; the idea of the present application can also be applied to cooperative communication scenarios in which the main network comprises a plurality of satellite user nodes in a frequency division multiplexing system.
Also shown in fig. 1 is an obstruction 140, such as a building. During the movement of the satellite 110, the satellite signal is blocked by the obstacle 140 and cannot directly establish a communication link with the satellite user 130, and in this case, a potential relay node, such as the potential relay node 123, which can receive the satellite signal and has a communication connection with the satellite user 130, may be selected as a relay, and the potential relay node 123 may receive the satellite signal, amplify the satellite signal, and forward the satellite signal to the satellite user 130. The communication links are shown as dashed lines in fig. 1.
In the satellite-ground shared frequency band, if not guided, each base station under the coverage of the satellite beam may be a potential relay node, which not only serves the terrestrial mobile communication users in its coverage but also amplifies and forwards the satellite signals. It is assumed that all transmitters and receivers are equipped with a single antenna and operate in half-duplex mode. It is further assumed that the total number of base stations is K > 1 and that, due to power constraints, each terrestrial network can assist at most one satellite user. Appropriate relay selection can effectively improve the spectrum efficiency and the transmission capacity of the primary user network. Since cooperation among multiple relays on the same channel incurs a large overhead in synchronization and coding and introduces unnecessary co-channel interference, only one channel is considered and only one relay is selected to forward the signal. Assuming that the line-of-sight link between the satellite node S and the satellite user D in fig. 1 is blocked by the obstacle 140, a communication connection can be established between the satellite node S and the satellite user D by using the communication method of the present application.
Fig. 2 is an exemplary flowchart of a communication method of an auction-mechanism-based satellite-ground convergence relay network according to an embodiment of the present application. Referring to fig. 2, the communication method of this embodiment includes the steps of:
step S210: broadcasting a cooperation signaling to a plurality of potential relay nodes by the satellite, and taking a time slot of the satellite as a commodity in an auction mechanism;
step S220: each potential relay node predicts whether the potential relay node participates in bidding at the current moment, and all the potential relay nodes participating in bidding are used as first relay nodes, wherein the participation in bidding indicates that the potential relay nodes are willing to obtain commodities, namely have a cooperative willingness;
step S230: the first relay node evaluates the value of a channel according to the performance of the channel, and reports a bidding vector including the estimated value to a satellite, wherein the channel comprises a satellite-relay channel, a relay-satellite user channel and a relay-ground user channel, and the performance of the channel comprises channel gain and sub-time slot allocation length;
step S240: the satellite selects a first relay node corresponding to the maximum bidding vector from the plurality of first relay nodes as a winning relay node based on an auction mechanism and the bidding vectors, and the winning relay node and the satellite achieve a cooperative relationship.
The above steps S210 to S240 will be described in detail with reference to the drawings.
Fig. 3 is a schematic information flow diagram in a communication method according to an embodiment of the present application. Referring to fig. 3, three execution entities are shown: a satellite 310, a base station group 320 and a satellite user 330 in the satellite-ground converged relay network. Depending on whether the satellite 310 establishes a cooperative relationship with one or more potential relay nodes in the base station group 320, fig. 3 illustrates the information flow in three states: (a) non-cooperative, (b) cooperation rupture, and (c) cooperation success. In the non-cooperative state, for example, the satellite 310 establishes a communication connection directly with the satellite user 330 without selecting any potential relay node as a relay. In the cooperation rupture and cooperation success states, the satellite signal cannot directly establish a communication link with the target satellite user 330 because of occlusion or the like, so the satellite broadcasts cooperation signaling to the base stations in the base station group 320 to find a base station that can serve as a relay. If a base station meets the conditions for serving as a relay node, the cooperation process is executed, and the satellite selects, from the set of first relay nodes willing to cooperate, the relay with the highest bidding vector or the highest corrected bidding vector for cooperative communication. In some scenarios, for example when the terrestrial base stations also suffer from severe shadow fading or there is no suitable base station near the satellite user, the bidding vector or the corrected bidding vector evaluated at the satellite end is negative, and the satellite refuses to cooperate; this is referred to as "cooperation rupture".
The result of whether the satellite 310 has a cooperative relationship with a certain base station can be obtained according to the above steps S210-S240. In fig. 3, the execution steps S311-S315 are shown in a specific cooperation-breaking state, and the execution steps S321-S329 are shown in a specific cooperation-success state. The communication method of the present application is described below with reference to fig. 2 and 3.
Step S210 corresponds to step S311 and step S321 in fig. 3, in which the satellite 310 broadcasts cooperation signaling to a plurality of base stations or base station clusters 320. The present application introduces an auction mechanism to determine the final winning relay node, and takes the time slot of the satellite as a commodity in the auction mechanism at step S210.
An auction is a resource allocation mechanism derived from economics. Classic auction mechanisms include the English auction, the Dutch auction, the Vickrey auction, the sealed-bid second-price auction, and the like. In the communication method of the present application, the time slot length of the satellite is used as the resource, the first relay nodes are the bidders, and the bid of each first relay node is a channel value estimated from the channel performance; the estimation function for the channel value will be described later.
The satellite 310 broadcasting the cooperation signaling at step S210 is equivalent to initiating an auction. If every base station in the base station group 320 participated in the bidding, the amount of calculation would be too large and resources would be wasted. Therefore, in step S220, each base station first calculates and determines whether to participate in bidding. As previously described, each base station is a computing node with storage and computing capabilities. A prediction model can therefore be built in each base station in advance; it predicts the revenue function at the current moment from the historical success rate and the scene information, and the base station decides whether to participate in bidding according to the sign of the revenue function, which greatly reduces the overall amount of calculation. The historical success rate is the historical record of cooperative relationships between the base station and satellites, and the scene information comprises three kinds of channel state information (CSI): satellite-relay CSI, relay-ground user CSI, and relay-satellite user CSI.
The size of the original potential relay set is assumed to be K', where K' is the total number of base stations under the beam with the cooperative will. After the screening in step S220, the size of the relay set is reduced to K, where K < K', and the reduced relay set includes only the first relay nodes participating in bidding.
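The screening from K' potential relays down to K bidders can be sketched as a simple filter. This is only an illustration: the `PotentialRelay` class, its `predicted_revenue` field and the numeric values are hypothetical, and in the described method each node would make this decision locally rather than in one central list.

```python
from dataclasses import dataclass

@dataclass
class PotentialRelay:
    node_id: int
    predicted_revenue: float  # hypothetical output of the node's prediction model

def screen_bidders(potential_relays):
    """Keep only nodes whose predicted revenue is positive (the first relay nodes)."""
    return [r for r in potential_relays if r.predicted_revenue > 0]

# K' = 3 potential relays; only nodes with positive predicted revenue bid
relays = [PotentialRelay(121, -0.4), PotentialRelay(122, 0.7), PotentialRelay(123, 1.3)]
first_relays = screen_bidders(relays)
print([r.node_id for r in first_relays])  # [122, 123]
```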
In the communication method of the present application, it is assumed that the main network and all the terrestrial networks are intelligent. The satellite-ground cooperative communication scenario discussed in the present application has no central control node, so the information available to the satellite and to the terrestrial system is incomplete: a terrestrial network can obtain only the CSI of the satellite-relay, relay-ground user and relay-satellite user channels, and the main network can obtain only the CSI of the satellite-satellite user channel. Both the main network and the terrestrial network seek to maximize their benefits. In a satellite-ground converged network, a wide satellite beam covers a huge number of potential relay nodes. If every potential relay node participated in bidding directly, without evaluating its probability of success, the huge potential relay set would cause a sharp increase in calculation cost and communication cost. In the present application, a prediction model is built on each potential relay node to predict whether that node participates in bidding; this makes full use of the computing and storage functions of the terrestrial potential relay nodes, greatly reduces the bidding scale, and reduces both the communication cost and the on-satellite computing overhead.
In step S220, the application does not limit the specific algorithm of the prediction model. The following description will take two algorithms of Q-learning and Double DQN in the reinforcement learning model as examples.
In some embodiments, the reinforcement learning model is based on the Q-learning algorithm, takes the potential relay node as the agent, and divides the transit period of the satellite into P time slices; the model parameters of the reinforcement learning model include the state, the action and the reward of the potential relay node in a given time slice.
Fig. 4 is a basic framework diagram of a reinforcement learning model in a communication method according to an embodiment of the present application. Referring to fig. 4, an agent 410 and an environment 420 are included. In reinforcement learning, the agent 410, in the current state $s_t$, samples from the policy $\pi$ and executes an action $a_t$; after the environment 420 accepts the action, the state of the agent 410 changes to the next state $s_{t+1}$, and the reward signal $r_t$ is fed back to the agent 410. The purpose of reinforcement learning is to train the policy $\pi$ or the optimal action-value function
Figure BDA0003738060530000121
thereby maximizing the reward r earned by the agent 410. In FIG. 4, $A_t$, $S_t$, $R_t$, etc. denote random variables, while $a_t$, $s_t$, $r_t$ denote the determined values in the actual learning process.
The policy $\pi(a \mid s)$ is the probability density function of taking action a in a given state s; that is, the action is not deterministic but is sampled from this probability distribution. After acting according to the policy, the agent receives a reward and transitions to a new state. Training on historical data yields an optimized policy, so that the action predicted by the model is closer to optimal.
In an embodiment of the application, each potential relay node is considered as an agent, and each agent is modeled. The agent may observe the channel state information as part of its current state. The satellite transit period is divided into P time slices, each corresponding to a different "satellite-satellite user-potential relay node" geometry and hence to different CSI, so the entire satellite transit can be modeled as a reinforcement learning process.
In the embodiment of prediction based on the Q-learning algorithm, the following equation (1) is used to define the states:
Figure BDA0003738060530000131
The state s is a two-tuple containing the channel state information and the signal-to-noise ratio at the transmitting end, where the expressions of
Figure BDA0003738060530000132
and
Figure BDA0003738060530000133
are, respectively, as follows:
Figure BDA0003738060530000134
Figure BDA0003738060530000135
where
Figure BDA0003738060530000136
represents the channel gain between the satellite S and the potential relay node $R_k$, $P_s$ represents the transmit power of the satellite, $\sigma^2$ represents the noise power,
Figure BDA0003738060530000137
represents the channel gain between the potential relay node $R_k$ and the terrestrial user $U_{kn}$ within its coverage area, and
Figure BDA0003738060530000138
represents the remaining power of the potential relay node $R_k$. The channel gains are discretized in a preprocessing stage to obtain quantized channel gains.
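The discretization step can be illustrated with a small quantizer. This is a minimal sketch under assumptions: the threshold values, the use of dB, and the idea of quantizing two gains into a state tuple are hypothetical choices, since the source does not specify the quantization rule.

```python
import bisect

def quantize_gain(gain_db, thresholds_db):
    """Map a continuous channel gain (dB) to a discrete level index.

    thresholds_db must be sorted ascending; level i means
    thresholds_db[i-1] <= gain_db < thresholds_db[i].
    """
    return bisect.bisect_right(thresholds_db, gain_db)

# Hypothetical thresholds giving 4 quantization levels (0..3)
THRESHOLDS_DB = [-110.0, -100.0, -90.0]

state = (quantize_gain(-104.2, THRESHOLDS_DB), quantize_gain(-85.0, THRESHOLDS_DB))
print(state)  # (1, 3)
```

More thresholds give a finer state description at the cost of a larger state space, which is the trade-off that motivates the later move from tabular Q-learning to a neural network.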
In an embodiment of terrestrial potential relay bid prediction based on the Q-learning algorithm, the actions of the potential relay node are defined as the action set A = {Y, N}, where Y represents bidding and N represents abandoning bidding. Bidding means that the relay node consumes some of its own power to send signaling to the satellite, so that with a certain probability the bid succeeds and the node obtains the opportunity to share the frequency band with the satellite. Abandoning bidding means that the relay node spends no overhead but also loses the opportunity to share the frequency band with the satellite.
In the embodiment of prediction based on the Q-learning algorithm, the reward is represented by a discounted return, defined as the sum, over the agent's learning process, of each instantaneous reward multiplied by the corresponding power of the discount factor. The action-value function Q(s, a) is defined as:
$$Q(s,a)=\mathbb{E}\left[U_t \mid S_t=s,\ A_t=a\right] \qquad (4)$$
where $\mathbb{E}[\cdot]$ is the expectation and $U_t$ is the discounted return. The action-value function Q(s, a) is the expected discounted return $U_t$ when action a ∈ A is taken in state s at a given time. Q(s, a) depends not only on the state but also on the policy. Q(s, a) is the basis for deciding which action to select in state s, i.e., the basis for prediction.
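As a small worked example of the discounted return, the sketch below accumulates a finite reward sequence from the last reward backwards. The reward values and the discount factor are made up for illustration.

```python
def discounted_return(rewards, gamma):
    """U_t = r_t + gamma*r_{t+1} + gamma^2*r_{t+2} + ... for a finite reward sequence."""
    total = 0.0
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total

print(discounted_return([1.0, 0.0, 2.0], gamma=0.5))  # 1.5
```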
In the Q-learning algorithm, the reinforcement learning model updates the action-value function Q(s, a) to obtain the optimal action-value function $Q^*(s,a)$ under the optimal policy, which is used to evaluate how good an action is.
In conjunction with fig. 4, after the agent 410 takes action a in the current state s, the environment 420 generates the reward r in response to that action. Using the temporal difference (TD) method, and assuming that one transition, i.e., the quadruple $(s_t, a_t, r_t, s_{t+1})$, is observed, the update formula for each time slice is obtained from the Bellman equation as equation (5):
Figure BDA0003738060530000141
where α ∈ (0,1) is the learning rate and γ ∈ [0,1] is the discount factor.
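Equation (5) survives only as an image placeholder in this text, so the sketch below uses the standard textbook tabular Q-learning update, Q(s,a) ← Q(s,a) + α[r + γ·max Q(s',·) − Q(s,a)], as an assumption about its form. The state tuples, the default α and γ, and the initial table entry are hypothetical.

```python
from collections import defaultdict

ACTIONS = ("Y", "N")  # bid / abandon bidding, as in the action set A

def q_update(q_table, s, a, r, s_next, alpha=0.1, gamma=0.9):
    """One temporal-difference update for the transition (s, a, r, s_next)."""
    best_next = max(q_table[(s_next, b)] for b in ACTIONS)
    td_target = r + gamma * best_next
    q_table[(s, a)] += alpha * (td_target - q_table[(s, a)])
    return q_table[(s, a)]

q = defaultdict(float)          # unseen (state, action) pairs start at 0
q[((1, 3), "Y")] = 2.0
new_value = q_update(q, s=(0, 2), a="Y", r=1.0, s_next=(1, 3))
print(round(new_value, 3))  # 0.28
```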
To keep the node from falling into a local optimum, an ε-greedy strategy is used when Q-learning selects an action, striking a balance between exploration and exploitation. The convergence time of the algorithm depends on the sizes of the state space and the action space; if their dimension is large, the convergence time becomes long.
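A minimal sketch of ε-greedy selection follows. The dictionary layout and the Q values are hypothetical; only the rule itself (explore with probability ε, otherwise exploit) comes from the text.

```python
import random

def epsilon_greedy(q_values, epsilon, rng=random):
    """Pick a random action with probability epsilon, else the highest-valued one.

    q_values: dict mapping action -> estimated Q value in the current state.
    """
    if rng.random() < epsilon:
        return rng.choice(list(q_values))
    return max(q_values, key=q_values.get)

q_values = {"Y": 0.28, "N": 0.0}
print(epsilon_greedy(q_values, epsilon=0.0))  # always exploits: Y
```

In practice ε is often decayed over training so that the agent explores early and exploits later; the source does not say whether it does so.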
For the action prediction problem of the potential relay node, the state s is the result of quantizing the channel state information, and its accuracy depends on the quantization order: the higher the order, the higher the accuracy. Highly accurate action prediction therefore leads to an excessively high state dimension. The Q-learning algorithm is efficient in low-dimensional state and action spaces but inefficient in high-dimensional ones. The generalization and function approximation capability of a deep neural network makes it better suited to learning in high-dimensional state and action spaces. Combining the Q-learning algorithm with a deep neural network, whose trained weights approximate $Q^*(s,a)$, gives the basic structure of DQN. The input of the neural network is the state s, the output is a score for each action, and the action with the highest score is selected. The network updates its weights by gradient descent; as optimization proceeds, the network gradually approaches $Q^*(s,a)$, the output actions become better, and the system receives more reward.
Basic DQN suffers from an overestimation problem, which Double DQN can solve. Fig. 5 is a basic framework diagram of Double DQN with experience replay in the communication method according to an embodiment of the present application. Referring to fig. 5, the agent of Double DQN includes a Q learning network (DQN) and a target learning network (Target Net). The Q learning network is used for selecting actions and is therefore also referred to as the selection network; the target learning network is used for evaluating actions and is therefore also referred to as the evaluation network. The algorithm obtains data from the environment, inputs the data into the Q learning network, and selects the action with the maximum Q value; the parameters of the Q learning network are copied into the target learning network at intervals, while the parameters of the Q learning network are updated by backpropagation under a loss function, using stochastic gradient descent (SGD). In addition, an experience replay algorithm is adopted: historical data are put into an experience pool, and a mini-batch is then randomly sampled from the experience pool to update the network parameters, which breaks the correlation of the sequence and allows past experience to be reused so that transitions are not wasted.
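The experience pool can be sketched with a bounded deque and uniform random sampling. The class name, the capacity, and the transition contents below are hypothetical and chosen only to show the mechanism.

```python
import random
from collections import deque

class ReplayBuffer:
    """Fixed-capacity experience pool; the oldest transitions are dropped first."""

    def __init__(self, capacity):
        self._pool = deque(maxlen=capacity)

    def store(self, s, a, r, s_next):
        self._pool.append((s, a, r, s_next))

    def sample(self, batch_size):
        # Uniform sampling without replacement breaks temporal correlation
        return random.sample(list(self._pool), min(batch_size, len(self._pool)))

    def __len__(self):
        return len(self._pool)

buffer = ReplayBuffer(capacity=3)
for t in range(5):
    buffer.store(s=t, a="Y", r=0.0, s_next=t + 1)
batch = buffer.sample(batch_size=2)
print(len(buffer), len(batch))  # 3 2
```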
In other embodiments of the present application, the reinforcement learning model predicts whether the potential relay node participates in bidding at the current time based on the Double DQN algorithm, and the model parameters of the reinforcement learning model include the state, action and reward of the potential relay node in a certain time segment. Wherein the state s is defined as a six-tuple using the following equation (6):
Figure BDA0003738060530000151
where
Figure BDA0003738060530000152
represents the channel gain between the satellite S and the potential relay node $R_k$,
Figure BDA0003738060530000153
represents the channel gain between the potential relay node $R_k$ and the satellite user D,
Figure BDA0003738060530000154
represents the inclination of the satellite,
Figure BDA0003738060530000155
Figure BDA0003738060530000156
represents the visibility angle from the potential relay node to the satellite,
Figure BDA0003738060530000157
Figure BDA0003738060530000158
represents the remaining power of the potential relay node, and θ represents the movement angle of the satellite, θ ∈ [0°, 180°].
The action a is expressed by the following equation (7):
$$a=\pi(s) \qquad (7)$$
where $\pi(\cdot)$ is the policy, representing the mapping from the environment state s to the action a. After the agent finishes the current action, the Q value corresponding to that action is calculated; meanwhile, the agent senses the environment, and the data $(s_t, a_t, r_t, s_{t+1})$ generated by the interaction are saved to the experience pool. There are three actions: bid and succeed, bid and fail, and not participate in bidding. The reward function is defined according to the following rules: if the bid succeeds, a high reward is given; if the bid fails, a negative reward is given; if the node does not bid, the reward is 0. The reward is a numerical value, and its specific value is chosen using the convergence of network training as the criterion.
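The reward rule above can be written as a lookup table. The magnitudes 10.0 and -1.0 are hypothetical placeholders; the source fixes only the signs and says the values are tuned until network training converges.

```python
from enum import Enum

class Outcome(Enum):
    BID_SUCCESS = "bid and success"
    BID_FAILURE = "bid and failure"
    NO_BID = "no bid participation"

# Hypothetical magnitudes; only the signs follow the stated rule
REWARDS = {Outcome.BID_SUCCESS: 10.0, Outcome.BID_FAILURE: -1.0, Outcome.NO_BID: 0.0}

def reward(outcome):
    """Return the instantaneous reward for one bidding outcome."""
    return REWARDS[outcome]

print(reward(Outcome.BID_FAILURE))  # -1.0
```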
Referring to FIG. 5, in the Double DQN algorithm, the agent uses two independent BP neural networks as Q-network approximators, namely a Q learning network $Q(s,a;w)$ and a target learning network $Q(s,a;w^-)$, where $w$ and $w^-$ denote the current parameters and the previous parameters, respectively. The Q learning network is used to select actions, expressed as:
Figure BDA0003738060530000159
The target learning network is used for evaluation, expressed as:
$$y_t = r_t + \gamma \cdot Q(s_{t+1}, a^*; w^-) \qquad (9)$$
In each time slice, the relay node $R_k$ stores its transition $(s_t, a_t, r_t, s_{t+1})$ in the experience pool, and a mini-batch is randomly sampled from the experience pool to update the network parameters. The TD error can be expressed as:
$$\delta_i = Q(s_t, a_t; w) - y_t \qquad (10)$$
The loss function is defined as:
Figure BDA0003738060530000161
In mini-batch SGD, several transitions are sampled, and the network parameters are updated using the average of their gradients:
Figure BDA0003738060530000162
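The selection and evaluation split of Double DQN can be sketched on plain lists of Q values. Equations (8), (11) and (12) are image placeholders in this text, so two points are assumptions here: that the selected action is the argmax of the Q learning network at the next state, and that the loss is the mini-batch average of half the squared TD error. All numeric values are made up.

```python
def double_dqn_target(r, q_online_next, q_target_next, gamma):
    """Return (a_star, y) for one transition.

    q_online_next: Q(s_{t+1}, a; w) for every action a (Q learning network)
    q_target_next: Q(s_{t+1}, a; w^-) for every action a (target learning network)
    """
    a_star = max(range(len(q_online_next)), key=q_online_next.__getitem__)  # selection
    y = r + gamma * q_target_next[a_star]                                    # evaluation
    return a_star, y

def mean_squared_td_loss(q_taken, targets):
    """Average of 0.5 * delta^2 over a mini-batch, with delta = Q(s_t, a_t; w) - y_t."""
    deltas = [q - y for q, y in zip(q_taken, targets)]
    return sum(0.5 * d * d for d in deltas) / len(deltas)

a_star, y = double_dqn_target(r=1.0, q_online_next=[0.2, 0.9, 0.1],
                              q_target_next=[0.5, 0.4, 0.8], gamma=0.9)
print(a_star, round(y, 2))  # 1 1.36
```

Note that the target uses 0.4 (the target network's value for the selected action) rather than 0.8 (the target network's own maximum), which is how the overestimation is reduced.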
In some embodiments, Double DQN approximates the action-value function with two BP neural networks that have the same structure but different parameters. The number of input nodes of the BP neural network equals the number of attributes of the state, and the number of output nodes equals the number of actions. For example, in the above embodiment, where the state s is a six-tuple, six input nodes are used, and the 3 actions correspond to 3 output nodes. A classical three-layer BP neural network is employed, consisting of an input layer, a hidden layer and an output layer. The number of hidden-layer neurons is determined according to the empirical formula
Figure BDA0003738060530000163
and the number of hidden-layer nodes is taken as n' = 6. Such a neural network can fit any function that is a continuous mapping from one finite-dimensional space to another finite-dimensional space.
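The 6-6-3 layout can be illustrated with a forward pass in plain Python. The sigmoid hidden activation, the linear output layer, the random initialization and the example state vector are all assumptions; the source specifies only the layer sizes.

```python
import math
import random

def init_layer(n_in, n_out, rng):
    """Weights (n_out x n_in) and zero biases; the small random init is an assumption."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out

def forward(state, hidden, output):
    """Input layer -> sigmoid hidden layer -> linear output layer (one Q value per action)."""
    (w1, b1), (w2, b2) = hidden, output
    h = [1.0 / (1.0 + math.exp(-(sum(w * x for w, x in zip(row, state)) + b)))
         for row, b in zip(w1, b1)]
    return [sum(w * x for w, x in zip(row, h)) + b for row, b in zip(w2, b2)]

rng = random.Random(0)
hidden = init_layer(6, 6, rng)   # six state attributes -> six hidden neurons
output = init_layer(6, 3, rng)   # six hidden neurons -> three actions
q_values = forward([0.3, 0.7, 0.5, 0.2, 0.9, 0.4], hidden, output)
print(len(q_values))  # 3
```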
The above describes the process of predicting whether the potential relay node participates in the bidding at the current time by using the prediction model in step S220. Potential relay nodes participating in bidding are in a bidding state, and potential relay nodes not participating in bidding are in a non-bidding state. At different times, it may be possible for a potential relay node to transition between bidding and non-bidding states. In conjunction with fig. 3, step S220 corresponds to step S312 in the cooperation rupture state in fig. 3, and step S322 in the cooperation success state, i.e., "potential relay node size reduction based on prediction".
In some embodiments, after the predicting step, each base station that participates in bidding sends cooperation request information, which includes its bid, to the satellite, as shown in steps S313 and S323 of fig. 3, i.e., "send cooperation request information to the satellite"; the cooperation request information includes the bidding vector of step S230.
Referring to fig. 2, at step S230 the first relay nodes are those potential relay nodes that remain after screening, i.e., whose predicted action is to participate in bidding. Each first relay node estimates the channel value according to the CSI and reports the bidding vector including the estimated value to the satellite.
Since some embodiments include a correction to the bidding vector, the bidding vector before correction is referred to herein as the initial bidding vector $b_0$, expressed by the following equation (13):
Figure BDA0003738060530000171
In step S240, the satellite determines, based on the auction mechanism and the bidding vectors, a winning relay node from the plurality of first relay nodes; the winning relay node is the node whose bid succeeds. Step S240 corresponds to steps S314 and S324 in fig. 3, i.e., the satellite 310 determines whether to cooperate based on the auction mechanism.
In some embodiments, the auction mechanism is a Vickrey auction, and the winning relay node is the first relay node with the largest bidding vector. The Vickrey auction mechanism is incentive compatible, so each relay node truthfully reports its estimate of each channel value to the satellite, which prevents relay nodes from reporting false information.
In some embodiments, the bidding vector b' of the winning relay node is expressed using the following equation (14):
Figure BDA0003738060530000172
That is, the bidding vector b' of the winning relay node is the maximum among the initial bidding vectors of all the first relay nodes. For example, the initial bidding vectors may be sorted in descending order so that the largest one is ranked first.
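Winner determination, including the rupture case described earlier for negative bids, can be sketched as follows. Treating each bid as a single number and the specific bid values are simplifying assumptions made for this illustration.

```python
def select_winner(bids):
    """Return the id of the winning relay, or None if cooperation ruptures.

    bids: dict mapping first-relay-node id -> scalar bid value
    (treating each bid as one number is a simplifying assumption).
    """
    if not bids:
        return None
    ranked = sorted(bids.items(), key=lambda item: item[1], reverse=True)
    winner_id, best_bid = ranked[0]
    return winner_id if best_bid >= 0 else None

print(select_winner({121: 0.8, 122: 1.7, 123: 1.2}))  # 122
print(select_winner({121: -0.3, 122: -0.1}))          # None
```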
According to steps S210-S240, first, whether to participate in bidding is predicted by a reinforcement learning model, which reduces the scale of the potential relay node set; second, the most appropriate winning relay node is selected from the first relay nodes based on an auction mechanism and serves as the relay node between the satellite and the satellite user, the winning relay node being the relay node with the best signal transmission performance.
Referring to fig. 3, in the non-cooperative state, the potential relay nodes do not participate in bidding; in the cooperation rupture state, there is no winning relay node among the potential relay nodes, and the satellite 310 transmits the signal directly to the satellite user 330 at step S315. In the cooperation success state, the communication method further includes the following steps:
step S325: the satellite 310 broadcasts information of the winning relay node.
Step S326: waiting for a service period to arrive.
Step S327: at the arrival of the service period, the satellite 310 transmits the satellite signal to the winning relay node.
Step S328: the winning relay node transmits a signal to the satellite user 330.
Step S329: the winning relay node accesses the satellite frequency band of the satellite 310 and provides service to the terrestrial user.
Figs. 6A-6C show schematic diagrams of satellite slot allocation in the three cooperation states. Fig. 6A corresponds to the non-cooperative state in fig. 3, fig. 6B corresponds to the cooperation rupture state in fig. 3, and fig. 6C corresponds to the cooperation success state in fig. 3. The satellite time slots are represented by rectangular bars whose horizontal direction is the time axis.
As shown in fig. 6A, in the non-cooperative state, all the time slots are used for signal transmission between the satellite and the satellite user, and need not be allocated to the relay node.
As shown in fig. 6B, in the cooperation rupture state, the satellite spends one time slot exchanging control signaling with the ground, such as broadcasting cooperation information and receiving an ACK indicating whether to cooperate; this is the "signaling interaction" time slot in fig. 6B, corresponding to step S311 in fig. 3. The remaining time slots are used by the satellite to transmit downlink signals to the satellite user. In the cooperation rupture state, no base station acquires a satellite time slot.
As shown in fig. 6C, in the cooperation success state, the satellite spends a period $t_0$ on "signalling interaction", corresponding to step S321 in fig. 3; the remaining time is divided into the "satellite-base station signalling" time slot $t_1$ and the "base station-satellite user signal transmission" time slot $t_2$.
In some embodiments, under certain auction mechanisms such as the Vickrey auction, once the winning relay node is determined it only needs to pay the second-highest price; that is, the winning relay node pays a value equal to the bid of the second-highest bidding node. Assuming that the length of the relay-terrestrial user signal transmission time slot corresponding to the highest bid is τ', and the length corresponding to the second-highest bid is τ'', the winning relay node allocates a time slot of length τ'' to the satellite signal transmission and gives a time slot of length Δτ = τ' - τ'' to the terrestrial mobile communication system, i.e., the time slots $t_{k1}$ to $t_{kN}$ that follow time slot $t_2$ in fig. 6C. The remaining time slots are reallocated by the winning relay node according to a certain criterion to serve the N terrestrial users under the coverage of the k-th relay.
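The second-price split of the slot length can be sketched numerically. This follows one reading of the paragraph above, in which each bidder offers a slot length, the winner pays the second-highest offer τ'' and keeps Δτ = τ' - τ''; the function name, node ids and slot lengths (arbitrary time units) are hypothetical.

```python
def second_price_split(offered_slots):
    """Apply the second-price rule to the offered slot lengths.

    offered_slots: dict node id -> slot length the node offered in its bid.
    Returns (winner id, tau'' paid for satellite transmission, delta_tau retained).
    """
    if len(offered_slots) < 2:
        raise ValueError("second-price rule needs at least two bidders")
    ranked = sorted(offered_slots.items(), key=lambda item: item[1], reverse=True)
    (winner_id, tau_high), (_, tau_second) = ranked[0], ranked[1]
    return winner_id, tau_second, tau_high - tau_second

winner, paid, retained = second_price_split({121: 4.0, 122: 6.0, 123: 5.0})
print(winner, paid, retained)  # 122 5.0 1.0
```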
In the embodiment shown in fig. 6C, one transmission frame between the satellite and the winning relay node includes (N + 3) time slots: the first time slot $t_0$ is used by the satellite to broadcast cooperation signaling to all potential relay nodes, the second time slot $t_1$ is used by the satellite to transmit one or more satellite signals to the winning relay node, the third time slot $t_2$ is used by the winning relay node to forward the satellite signals to the satellite user, and the remaining N time slots are used to serve the terrestrial users in the coverage area of the winning relay node. In step S326 of fig. 3, the service period comprises the (N + 2) time slots other than the first time slot $t_0$.
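The frame layout can be made concrete by listing the slots in order. The label strings and the choice of N = 4 are illustrative only.

```python
def build_frame(n_ground_users):
    """Label the (N + 3) slots of one transmission frame in order."""
    frame = ["t0: signaling interaction",
             "t1: satellite -> winning relay",
             "t2: winning relay -> satellite user"]
    frame += [f"tk{n}: winning relay -> ground user {n}" for n in range(1, n_ground_users + 1)]
    return frame

frame = build_frame(n_ground_users=4)
print(len(frame))        # 7
print(len(frame[1:]))    # 6 slots in the service period (all but t0)
```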
In some embodiments, assuming that the satellite system transmission frame consists of flexibly adjustable time slots, the winning relay node receives and amplifies the signal transmitted by the satellite in the second time slot, transmits the amplified signal to the satellite user in the third time slot, and serves the N terrestrial users in the next N time slots so as to meet the transmission rate requirements of the terrestrial users.
A successful cooperative communication process is described below with reference to fig. 6C and a transmission/reception model and a channel model in the satellite communication field.
As shown in fig. 6C, in the first time slot $t_0$, the satellite and the ground perform signaling interaction; in the second time slot $t_1$, the satellite S modulates M signals onto different frequency points for broadcasting, denoted $\{x_{p1}, x_{p2}, \dots, x_{pm}, \dots, x_{pM}\}$. If the pilot signal shows that the CSI of a certain channel is poor, the satellite seeks a terrestrial potential relay node within the beam coverage for cooperative transmission. The present application analyzes only one of the M channels of the satellite, i.e., the scenario shown in fig. 1, and the transmitted signal is denoted $x_p$. The relay node $R_k$ serves the terrestrial users in its coverage area, and the N signals it transmits are represented as
Figure BDA0003738060530000191
Figure BDA0003738060530000192
The signal received by the terrestrial potential relay node is expressed as follows:
Figure BDA0003738060530000193
where $\eta_k$ is the additive white Gaussian noise introduced by the transmission channel, and $P_S$ is the transmit power of the satellite S;
Figure BDA0003738060530000194
as in equation (2), is the channel gain of the satellite-relay link. The data transmission rate of the first hop can thus be determined:
Figure BDA0003738060530000195
Referring to FIG. 6C, in the third time slot $t_2$, the potential relay node $R_k$ amplifies the signal and forwards it to the satellite user, and the i-th terrestrial user obtains a time slot of length $t_{ki}$ to transmit its signal. Thus, the transmission rate provided by the first relay node $R_k$ for the satellite user D is:
Figure BDA0003738060530000201
where $t_2$ is the length of the third time slot; the quantity
Figure BDA0003738060530000202
characterizes the signal-to-noise ratio at the transmitting end of the k-th potential relay node; and
Figure BDA0003738060530000203
is the channel gain of the link from the k-th potential relay node to the satellite user D.
Similarly, the transmission rate at which the k-th potential relay node transmits signals to the n-th terrestrial user in its cell is:
Figure BDA0003738060530000204
where $t_{kn}$ is the length of the time slot in which the potential relay node $R_k$ transmits the signal to the n-th terrestrial user; likewise, let
Figure BDA0003738060530000205
denote the signal-to-noise ratio at the transmitting end of the k-th potential relay node;
Figure BDA0003738060530000206
denotes the channel gain of the link from the potential relay node $R_k$ to the n-th terrestrial user.
In the present specification, $H_{X,Y}$ denotes the channel between nodes X and Y, where $X, Y \in \{S, D, R_k, U_{kn}\}$, $k \in \{1, 2, \ldots, K\}$, and $n \in \{1, 2, \ldots, N\}$. Here S is the satellite node, $R_k$ is a potential relay node, D is a satellite user, and $U_k = \{U_{k1}, U_{k2}, \ldots, U_{kn}, \ldots, U_{kN}\}$ is the set of terrestrial users within the cell of the potential relay node $R_k$. The propagation model is defined as:
Figure BDA0003738060530000207
where
Figure BDA0003738060530000208
is the free-space loss:
Figure BDA0003738060530000209
$G_S$ and $G_t$ are the antenna gains of the satellite transmitter and the terrestrial receiver, respectively, d is the satellite-to-ground distance (km), and f is the carrier frequency (MHz).
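As a minimal sketch of the free-space loss term, assuming the standard free-space path loss expression for d in km and f in MHz (which matches the units stated above), the loss in dB is 32.45 + 20 log10(d) + 20 log10(f). The function names, the use of dB and dBi, and the sample values are illustrative assumptions rather than part of the application.

```python
import math


def free_space_loss_db(distance_km: float, freq_mhz: float) -> float:
    """Standard free-space path loss in dB (distance in km, frequency in MHz)."""
    if distance_km <= 0 or freq_mhz <= 0:
        raise ValueError("distance and frequency must be positive")
    return 32.45 + 20.0 * math.log10(distance_km) + 20.0 * math.log10(freq_mhz)


def link_gain_db(distance_km: float, freq_mhz: float,
                 g_s_dbi: float, g_t_dbi: float) -> float:
    """Antenna gains minus free-space loss, all in dB (fading is not included)."""
    return g_s_dbi + g_t_dbi - free_space_loss_db(distance_km, freq_mhz)


if __name__ == "__main__":
    # Hypothetical LEO link: 1000 km slant range, 2 GHz carrier.
    print(round(free_space_loss_db(1000.0, 2000.0), 2))
```

The shadowing and fading coefficient described next would multiply this deterministic term in linear scale.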
Figure BDA00037380605300002010
is the shadowing and fading coefficient, which follows a shadowed-Rician fading distribution.
Figure BDA00037380605300002011
has a probability density function that can be expressed as:
Figure BDA00037380605300002012
where ${}_1F_1(\cdot;\cdot;\cdot)$ denotes the confluent hypergeometric (Kummer) function, $\lambda = 1/(2b)$, $\alpha = \left(\frac{2bm}{2bm+\Omega}\right)^{m}$, and $\beta = \frac{\Omega}{2b(2bm+\Omega)}$; b denotes the average power of the multipath component, m is a parameter measuring the severity of fading, and $\Omega$ denotes the average power of the line-of-sight component.
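The parameters λ, α and β can be checked numerically. The sketch below assumes the density takes the commonly used shadowed-Rician form for the channel power gain, $f(x) = \lambda\,\alpha\,e^{-\lambda x}\,{}_1F_1(m; 1; \beta x)$, and evaluates the Kummer function by its power series. The parameter values (b = 0.126, m = 10, Ω = 0.835) are illustrative only and are not taken from the application.

```python
import math


def kummer_1f1(a: float, b: float, z: float, terms: int = 200) -> float:
    """Power series of the confluent hypergeometric function 1F1(a; b; z)."""
    total, term = 1.0, 1.0
    for n in range(terms):
        # Ratio of consecutive series terms: (a+n)/(b+n) * z/(n+1).
        term *= (a + n) / (b + n) * z / (n + 1)
        total += term
    return total


def shadowed_rician_pdf(x: float, b: float, m: float, omega: float) -> float:
    """Assumed shadowed-Rician density of the channel power gain at x >= 0."""
    lam = 1.0 / (2.0 * b)
    alpha = (2.0 * b * m / (2.0 * b * m + omega)) ** m
    beta = omega / (2.0 * b * (2.0 * b * m + omega))
    return lam * alpha * math.exp(-lam * x) * kummer_1f1(m, 1.0, beta * x)


if __name__ == "__main__":
    # Trapezoidal check that the density integrates to about 1 on [0, 12].
    step = 0.005
    ys = [shadowed_rician_pdf(i * step, 0.126, 10, 0.835) for i in range(2401)]
    print(step * (sum(ys) - 0.5 * (ys[0] + ys[-1])))
```

The series is adequate for the moderate arguments used here; a library routine would be preferable for large arguments.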
For the terrestrial link, the channel gain is
Figure BDA00037380605300002013
where
Figure BDA00037380605300002014
is the distance between $R_k$ and $U_{kn}$, $\eta$ is the path loss coefficient,
Figure BDA0003738060530000211
is the small-scale fading, which follows a Nakagami-m distribution, and
Figure BDA0003738060530000212
has the probability density function:
Figure BDA0003738060530000213
where the shape parameter $\mu$ characterizes the severity of the fading and is an integer, $\omega$ is the average power of each component, and $\Gamma(\cdot)$ denotes the gamma function.
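For simulation, the terrestrial fading can be sampled directly. The sketch below relies on the known fact that the squared amplitude of a Nakagami-m variable with shape μ and mean power ω is Gamma-distributed with shape μ and scale ω/μ. Combining it with the distance as fading power times $d^{-\eta}$ is an assumed form of the gain, and the function names and sample values are hypothetical.

```python
import random


def sample_fading_power(mu: int, omega: float, rng: random.Random) -> float:
    """Draw |h|^2 for Nakagami-m fading: Gamma(shape=mu, scale=omega/mu)."""
    return rng.gammavariate(mu, omega / mu)


def terrestrial_gain(distance: float, path_loss: float, fading_power: float) -> float:
    """Assumed terrestrial channel gain: fading power times distance^(-path_loss)."""
    if distance <= 0:
        raise ValueError("distance must be positive")
    return fading_power * distance ** (-path_loss)


if __name__ == "__main__":
    rng = random.Random(0)
    draws = [sample_fading_power(2, 1.0, rng) for _ in range(20000)]
    # The sample mean should be close to the mean power omega.
    print(sum(draws) / len(draws))
```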
In the satellite-ground converged relay network scenario shown in Fig. 1, the potential relay nodes are far apart, so all channels can be assumed to undergo independent and identically distributed fading. In addition, all channels are assumed to be quasi-static: the channel gain remains constant within one time segment of the satellite transit and varies between time segments. The ground relay acquires real-time CSI of the relay node-satellite user link and the relay node-ground user link. The CSI is not shared between the relay nodes, i.e. the information is private, and from a game-theoretic point of view the information is incomplete. The auction process is conducted at the beginning of each time segment.
In step S230, the first relay node evaluates the channel performance, including the channel gain and the sub-slot allocation length, and reflects it in the bid vector it submits. The slot allocation length includes the length $t_2$ of the time slot in which the first relay node transmits the amplified signal to the satellite user.
The definition of the bid vector is described below in connection with the time slot shown in fig. 6C. The initial bid vector in equation (13) is
Figure BDA0003738060530000214
The bid vector of the winning relay node in equation (14) is
Figure BDA0003738060530000215
In the embodiment of the application, the first relay node adopts the amplify-and-forward (AF) protocol: the relay node simply amplifies the received signal and then forwards it to the destination. According to the optimal relay selection scheme of the AF protocol, once the QoS requirements of the local users in the cell are met, the time slot allocated to the satellite-relay link and the time slot allocated to the relay-destination link are of equal length, i.e. $t_1 = t_2$. Specifically, the following formulas are used to calculate each component of the bid vector $b_0$:
Figure BDA0003738060530000216
Figure BDA0003738060530000217
Figure BDA0003738060530000218
Figure BDA0003738060530000219
Figure BDA0003738060530000221
Figure BDA0003738060530000222
The relay $R_k$ amplifies the signal according to the AF protocol, and the estimated channel capacity is:
Figure BDA0003738060530000223
The initial bid value is defined as this estimated value:
Figure BDA0003738060530000224
where the set of first relay nodes is denoted $R = \{R_1, R_2, \ldots, R_k, \ldots, R_K\}$, and K is the total number of first relay nodes; the k-th first relay node $R_k$ serves N terrestrial users, denoted $U_k = \{U_{k1}, U_{k2}, \ldots, U_{kn}, \ldots, U_{kN}\}$;
Figure BDA0003738060530000225
denotes the initial bid value provided by the k-th first relay node $R_k$; $t_1$ denotes the length of the second transmission time slot of the first relay node $R_k$; $t_2$ denotes the length of the third transmission time slot of the first relay node $R_k$; $t_{kn}$ (n = 1 to N) denotes the lengths of the remaining N time slots of the first relay node $R_k$;
Figure BDA0003738060530000226
denotes the signal-to-noise ratio of the satellite-first relay link;
Figure BDA0003738060530000227
denotes the signal-to-noise ratio of the first relay-terrestrial user link;
Figure BDA0003738060530000228
denotes the signal-to-noise ratio of the first relay-satellite user link;
Figure BDA0003738060530000229
denotes the channel capacity of the link that the first relay node $R_k$ provides for satellite user D;
Figure BDA00037380605300002210
denotes the QoS requirement of a terrestrial user, i.e. the minimum data transmission rate that must be provided to that user;
Figure BDA00037380605300002211
denotes the channel gain of the link from the first relay node $R_k$ to satellite user D;
Figure BDA00037380605300002212
Figure BDA00037380605300002213
respectively denote the channel gains of the three links: satellite to first relay node, first relay node to satellite user, and first relay node to terrestrial user; N and K are positive integers greater than or equal to 1.
Combining the above equations (23-29), an initial bidding vector provided by each first relay node can be calculated, and the initial bidding vector is related to the channel gain and the sub-slot allocation length.
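As an illustration of how such a bid could be assembled, the sketch below assumes a frame normalized to unit length, gives each terrestrial user just enough slot time to meet its minimum rate, splits the remainder equally between the two hops ($t_1 = t_2$), and uses the textbook amplify-and-forward end-to-end SNR $\gamma_1\gamma_2/(\gamma_1+\gamma_2+1)$. These modelling choices and all names are assumptions for illustration; they are not a reproduction of equations (23)-(29).

```python
import math
from typing import List, Tuple


def af_end_to_end_snr(snr_sr: float, snr_rd: float) -> float:
    """Textbook end-to-end SNR of a two-hop amplify-and-forward link."""
    return snr_sr * snr_rd / (snr_sr + snr_rd + 1.0)


def initial_bid(snr_sr: float, snr_rd: float, snr_users: List[float],
                qos_rates: List[float],
                frame_len: float = 1.0) -> Tuple[float, List[float], float]:
    """Return (bid, per-user slot lengths, hop slot length) for one relay."""
    if any(g <= 0 for g in snr_users):
        raise ValueError("user SNRs must be positive")
    # Slot each terrestrial user needs to reach its minimum rate.
    user_slots = [q / math.log2(1.0 + g) for q, g in zip(qos_rates, snr_users)]
    remaining = frame_len - sum(user_slots)
    if remaining <= 0:
        # Local QoS already consumes the frame: nothing left to offer.
        return 0.0, user_slots, 0.0
    t_hop = remaining / 2.0  # t1 = t2
    capacity = t_hop * math.log2(1.0 + af_end_to_end_snr(snr_sr, snr_rd))
    return capacity, user_slots, t_hop


if __name__ == "__main__":
    print(initial_bid(3.0, 3.0, [3.0, 3.0], [0.2, 0.2]))
```

Under these assumptions a relay with better channels to its own users frees more slot time for the satellite user and can therefore submit a larger bid.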
In some embodiments, the communication method of the present application further comprises performing the auction-based relay selection process once at the beginning of each time segment during the satellite transit. If different relays are selected frequently, additional switching and communication overhead may be incurred. For this reason, in an embodiment of the present application, after the first relay node reports the bid vector to the satellite, the satellite further corrects the initial bid vector based on switching overhead indicators to obtain a corrected bid vector. The switching overhead indicators include one or more of bidding history information, predicted residence time, and movement angle, where the bidding history information is obtained from the time-ordered set of all winning relay nodes selected by the satellite.
For the set of winning relay nodes $R = \{R_0, R_1, \ldots, R_{p-1}, R_p, \ldots\}$, calculate
Figure BDA0003738060530000231
where
Figure BDA0003738060530000232
is the XOR operator, $R_p$ denotes the currently winning relay node, and $R_{p-1}$ denotes the relay node selected in the previous time segment. This yields a sequence of 0s and -1s, where 0 indicates that the current winning relay node is the same as that of the previous time segment, so the switching overhead is zero, and -1 indicates that the current winning relay node differs from that of the previous time segment, which incurs switching overhead and has a negative effect. In addition to the bidding history information, the satellite predicts the residence time of each potential relay and estimates its movement angle based on ephemeris and geolocation information. Selecting a relay with a short predicted residence time may be regarded as an unnecessary relay selection. The movement angle is the angle between the satellite's direction of motion and the line connecting the sub-satellite point to the potential relay; selecting a potential relay for which this angle is acute helps reduce relay switching.
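The 0 / -1 sequence described above can be sketched as follows. The relay identifiers and function names are hypothetical, and a plain inequality test stands in for the XOR comparison of consecutive winners.

```python
from typing import Hashable, List, Sequence


def switching_history(winners: Sequence[Hashable]) -> List[int]:
    """0 where the winner equals the previous winner, -1 where it changed."""
    return [0 if cur == prev else -1 for prev, cur in zip(winners, winners[1:])]


def history_indicator(candidate: Hashable, previous_winner: Hashable) -> int:
    """Bidding-history indicator of one candidate relay for the current segment."""
    return 0 if candidate == previous_winner else -1


if __name__ == "__main__":
    print(switching_history(["R1", "R1", "R2", "R2", "R3"]))
```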
The predicted residence time T is based on the following movement model: the low-earth-orbit satellite moves relative to the ground at a certain speed, and the satellite beam sweeps across the ground. The predicted residence time is then:
Figure BDA0003738060530000233
In the above formula, Radius is the radius of the low-orbit satellite beam, O is the center point of the beam, and v is the moving speed of the low-orbit satellite.
Fig. 7 is a schematic diagram of the relationship between the satellite beam and the base station location, where O denotes the beam center point. Because the low-orbit satellite moves and its beam sweeps across the ground, the position of the potential relay on the ground at the previous moment is $A_0$, its position at the current moment is $A_1$, the point where it enters the beam is denoted $A_2$, and the point where it leaves the beam is denoted $A_3$. $\theta$ is the movement angle, which can also be obtained from the movement model:
Figure BDA0003738060530000234
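One way to realize this movement model is a flat two-dimensional sketch: the beam is a circle of the given radius centered at O, O moves at constant velocity, and the relay is fixed on the ground. Under these assumptions the remaining residence time is the time until the relay crosses the beam edge, and the movement angle is the angle between the vector from O to the relay and the velocity. The coordinate convention, function names and numbers are illustrative assumptions and may differ from the geometry of Fig. 7.

```python
import math
from typing import Tuple

Vec = Tuple[float, float]


def movement_angle_deg(relay_xy: Vec, velocity_xy: Vec) -> float:
    """Angle in degrees between the O-to-relay vector and the beam velocity."""
    d = math.hypot(*relay_xy)
    speed = math.hypot(*velocity_xy)
    if d == 0.0 or speed == 0.0:
        return 0.0
    cos_t = (relay_xy[0] * velocity_xy[0] + relay_xy[1] * velocity_xy[1]) / (d * speed)
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_t))))


def predicted_residence_time(relay_xy: Vec, velocity_xy: Vec, radius: float) -> float:
    """Time until a relay at relay_xy (relative to O) leaves the moving beam."""
    speed = math.hypot(*velocity_xy)
    if speed <= 0.0:
        raise ValueError("beam speed must be positive")
    dist_sq = relay_xy[0] ** 2 + relay_xy[1] ** 2
    if dist_sq > radius ** 2:
        return 0.0  # relay is already outside the beam
    ux, uy = velocity_xy[0] / speed, velocity_xy[1] / speed
    along = relay_xy[0] * ux + relay_xy[1] * uy  # equals d * cos(theta)
    across_sq = dist_sq - along ** 2             # equals (d * sin(theta))^2
    return (along + math.sqrt(max(radius ** 2 - across_sq, 0.0))) / speed


if __name__ == "__main__":
    # Hypothetical beam: 70 km radius moving at 7 km/s along +x.
    print(predicted_residence_time((35.0, 0.0), (7.0, 0.0), 70.0))
```

In this sketch a relay ahead of the beam center (acute angle) stays covered longer than one behind it (obtuse angle), in line with the preference for acute movement angles stated above.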
In some embodiments, the initial bid vector is modified using the following formula:
Figure BDA0003738060530000235
where
Figure BDA0003738060530000236
denotes the bid vector after normalization; $\gamma$ is a coefficient that balances the data transmission rate against the switching overhead: a larger $\gamma$ means that the satellite node places more emphasis on the data transmission rate, and a smaller $\gamma$ means that it places more emphasis on reducing the switching overhead;
Figure BDA0003738060530000241
is the decision matrix, where $x_{kj}$ denotes any one of the switching overhead indicators, k is the index of the first relay node, K is the total number of first relay nodes, j is the index of the switching overhead indicator, and $w_j$ is the weight of the j-th switching overhead indicator. The weights may be determined according to the actual situation.
In some embodiments, the weights are determined using entropy weighting.
The indicator weights of the bidding history information, predicted residence time, and movement angle are determined with the entropy weight method as follows:
Step S250: normalize the initial data. The bidding history information, predicted residence time, and movement angle are arranged as a decision matrix, expressed as:
Figure BDA0003738060530000242
where J = 3 indicates that there are 3 decision indicators and K is the number of first relay nodes. All indicator values are normalized with min-max normalization, where the bidding history information and the predicted residence time are positive indicators and the movement angle is a negative indicator.
The normalization formula for a positive indicator is:
Figure BDA0003738060530000243
The normalization formula for a negative indicator is:
Figure BDA0003738060530000244
Step S251: calculate the entropy of each switching overhead indicator. Let the entropy of the j-th switching overhead indicator be $e_j$, j = 1, 2, …, J; the calculation formula is:
Figure BDA0003738060530000245
where:
Figure BDA0003738060530000246
Step S252: calculate the weight of each indicator. Let the weight of the j-th switching overhead indicator be $w_j$, j = 1, 2, …, J; the calculation formula is:
Figure BDA0003738060530000247
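Steps S250 to S252 can be sketched with the common form of the entropy weight method: min-max normalization, column proportions $p_{kj} = x_{kj} / \sum_k x_{kj}$, entropy $e_j = -\frac{1}{\ln K}\sum_k p_{kj} \ln p_{kj}$, and weights $w_j = (1 - e_j) / \sum_j (1 - e_j)$. These expressions are the textbook ones and are assumed here. The handling of constant columns and the sample matrix are illustrative choices.

```python
import math
from typing import List, Sequence


def min_max(column: Sequence[float], positive: bool = True) -> List[float]:
    """Min-max normalization; negative indicators are flipped so larger is better."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [1.0] * len(column)  # constant column carries no information
    if positive:
        return [(x - lo) / (hi - lo) for x in column]
    return [(hi - x) / (hi - lo) for x in column]


def entropy_weights(matrix: Sequence[Sequence[float]],
                    positive_flags: Sequence[bool]) -> List[float]:
    """Entropy-weight method over a K x J decision matrix (K >= 2 rows)."""
    k = len(matrix)
    if k < 2:
        raise ValueError("at least two relays are needed")
    diversity = []
    for j, flag in enumerate(positive_flags):
        col = min_max([row[j] for row in matrix], flag)
        total = sum(col)
        probs = [x / total for x in col]
        entropy = -sum(p * math.log(p) for p in probs if p > 0) / math.log(k)
        diversity.append(max(0.0, 1.0 - entropy))
    s = sum(diversity)
    if s == 0.0:
        return [1.0 / len(diversity)] * len(diversity)
    return [d / s for d in diversity]


if __name__ == "__main__":
    # Rows: relays; columns: bidding history, residence time (s), movement angle (deg).
    x = [[0, 12.0, 30.0], [-1, 5.0, 120.0], [-1, 8.0, 80.0]]
    print(entropy_weights(x, [True, True, False]))
```

An indicator whose values barely differ between relays has entropy close to 1 and therefore receives a weight close to 0.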
In some embodiments, the following formulas are used to apply the switching overhead term to the current bid:
Figure BDA0003738060530000251
Figure BDA0003738060530000252
Figure BDA0003738060530000253
Figure BDA0003738060530000254
where $b_k$ denotes the corrected bid vector and $\gamma$ is a coefficient that balances the data transmission rate against the switching overhead: a larger $\gamma$ means that the satellite node places more emphasis on the data transmission rate, and a smaller $\gamma$ means that it places more emphasis on reducing the switching overhead. Because the quantities have different dimensions, the data transmission rate is normalized by formulas (37)-(40), and the switching overhead is subtracted in formula (41). Thus, the corrected bid vector depends on the channel gain, the sub-slot allocation length, and the switching overhead.
The communication method provided by the application uses a reinforcement learning model to evaluate the success rate of a potential relay node participating in bidding and to switch the base station between the bidding and non-bidding states, which reduces the size of the satellite-ground cooperation matrix and thus the computational overhead. Meanwhile, an auction mechanism with incentive compatibility and effectiveness is adopted, so that reporting its true bid is the dominant strategy for each potential relay node, which guarantees cooperation between the satellite mobile communication system and the terrestrial cellular mobile communication system. Finally, given the high dynamics and long delay of low-orbit satellite communication systems, the method takes into account that the relay selected in snapshots at different times may change, and rewrites the bids of the auction process with the switching overhead as an additional term, so that the low-orbit satellite jointly considers transmission benefit and switching frequency, achieves multi-objective optimization, and helps reduce delay and signaling transmission overhead.
The application also comprises a communication system of the satellite-ground converged relay network based on the auction mechanism, wherein the satellite-ground converged relay network comprises at least one satellite, a plurality of potential relay nodes and a plurality of satellite users, and each potential relay node serves a plurality of ground users covered by the potential relay node on the ground. The communication system can be used to execute the communication method described above, and therefore, the drawings and descriptions related to the communication method can be used to describe the communication system, and the same contents will not be described again.
In the communication system of the application, a satellite is used for broadcasting a cooperation signaling to a plurality of potential relay nodes, and a time slot of the satellite is used as a commodity in an auction mechanism; each potential relay node is used for predicting whether the potential relay node participates in bidding at the current moment, and all the potential relay nodes participating in bidding are used as first relay nodes, wherein the participation in bidding indicates that the potential relay nodes are willing to obtain the commodity, the first relay nodes are used for evaluating the value of a channel according to the performance of the channel and reporting a bidding vector including the estimated value to a satellite, wherein the channel comprises a satellite relay channel, a relay satellite user channel and a relay ground user channel, and the performance of the channel comprises channel gain and sub-slot allocation length; the satellite is further used for selecting a first relay node corresponding to the maximum bidding vector from the plurality of first relay nodes as a winning relay node based on the auction mechanism and the bidding vectors, and the winning relay node and the satellite achieve a cooperative relationship.
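The winner selection performed by the satellite can be sketched as a sealed-bid second-price (Vickrey) auction, the mechanism named in the claims: the relay with the largest bid wins, and the second-largest bid is taken as its winning-bid value. The sketch assumes each bid has already been reduced to a scalar. Relay identifiers, the function name, and the single-bidder fallback are hypothetical.

```python
from typing import Dict, Tuple


def vickrey_select(bids: Dict[str, float]) -> Tuple[str, float]:
    """Return (winning relay id, second-highest bid) for sealed scalar bids."""
    if not bids:
        raise ValueError("no relay submitted a bid")
    ranked = sorted(bids.items(), key=lambda item: item[1], reverse=True)
    winner, top_bid = ranked[0]
    # Assumption: with a single bidder, the winner's own bid is used.
    price = ranked[1][1] if len(ranked) > 1 else top_bid
    return winner, price


if __name__ == "__main__":
    print(vickrey_select({"R1": 0.42, "R2": 0.57, "R3": 0.31}))
```

Because the winning-bid value does not depend on the winner's own report, a relay gains nothing by misreporting its valuation, which is the incentive-compatibility property the application relies on.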
Aspects of the present application may be embodied entirely in hardware, entirely in software (including firmware, resident software, micro-code, etc.), or in a combination of hardware and software. The above hardware or software may be referred to as a "data block", "module", "engine", "unit", "component", or "system". The processor may be one or more application-specific integrated circuits (ASICs), digital signal processors (DSPs), digital signal processing devices (DSPDs), programmable logic devices (PLDs), field-programmable gate arrays (FPGAs), processors, controllers, microcontrollers, microprocessors, or a combination thereof. Furthermore, aspects of the present application may be embodied as a computer product, including computer-readable program code, located in one or more computer-readable media. For example, computer-readable media can include, but are not limited to, magnetic storage devices (e.g., hard disks, floppy disks, magnetic strips, …), optical disks (e.g., compact disks (CDs), digital versatile disks (DVDs), …), smart cards, and flash memory devices (e.g., cards, sticks, key drives, …).
The computer readable medium may comprise a propagated data signal with the computer program code embodied therein, for example, on a baseband or as part of a carrier wave. The propagated signal may take any of a variety of forms, including electromagnetic, optical, and the like, or any suitable combination. The computer readable medium can be any computer readable medium that can communicate, propagate, or transport the program for use by or in connection with an instruction execution system, apparatus, or device. Program code on a computer readable medium may be propagated over any suitable medium, including radio, electrical cable, fiber optic cable, radio frequency signals, or the like, or any combination of the preceding.
Having thus described the basic concept, it will be apparent to those skilled in the art that the foregoing disclosure is by way of example only, and is not intended to limit the present application. Various modifications, improvements and adaptations to the present application may occur to those skilled in the art, although not explicitly described herein. Such alterations, modifications, and improvements are intended to be suggested herein and are intended to be within the spirit and scope of the exemplary embodiments of this application.
Also, this application uses specific language to describe embodiments of the application. Reference throughout this specification to "one embodiment," "an embodiment," and/or "some embodiments" means that a particular feature, structure, or characteristic described in connection with at least one embodiment of the present application is included in at least one embodiment of the present application. Therefore, it is emphasized and should be appreciated that two or more references to "an embodiment" or "one embodiment" or "an alternative embodiment" in various places throughout this specification are not necessarily all referring to the same embodiment. Furthermore, some features, structures, or characteristics of one or more embodiments of the present application may be combined as appropriate.
Where numerals describing the number of components, attributes or the like are used in some embodiments, it is to be understood that such numerals used in the description of the embodiments are modified in some instances by the modifier "about", "approximately" or "substantially". Unless otherwise indicated, "about", "approximately" or "substantially" indicates that the number allows a variation of ± 20%. Accordingly, in some embodiments, the numerical parameters set forth in the specification and claims are approximations that may vary depending upon the desired properties sought to be obtained by a particular embodiment. In some embodiments, the numerical parameter should take into account the specified significant digits and employ a general digit preserving approach. Notwithstanding that the numerical ranges and parameters setting forth the broad scope of the range are approximations, in the specific examples, such numerical values are set forth as precisely as possible within the scope of the application.

Claims (13)

1. A communication method of an auction-mechanism-based satellite-ground converged relay network, wherein the satellite-ground converged relay network comprises at least one satellite, a plurality of potential relay nodes and a plurality of satellite users, and each potential relay node serves a plurality of ground users covered by the potential relay node on the ground, the method comprising:
broadcasting cooperation signaling to the plurality of potential relay nodes by the satellite, and taking a time slot of the satellite as a commodity in an auction mechanism;
each potential relay node predicts whether it participates in bidding at the current moment, and all potential relay nodes participating in bidding are taken as first relay nodes, wherein participating in bidding indicates that the potential relay node is willing to obtain the commodity;
the first relay node evaluates the value of a channel according to the performance of the channel, and reports a bidding vector comprising the estimated value to the satellite, wherein the channel comprises a satellite-relay channel, a relay-satellite user channel and a relay-ground user channel, and the performance of the channel comprises channel gain and sub-time slot allocation length; and
the satellite selects, based on an auction mechanism and the bidding vectors, the first relay node corresponding to the maximum bidding vector from the plurality of first relay nodes as a winning relay node, and the winning relay node and the satellite establish a cooperative relationship.
2. The communication method according to claim 1, wherein each of the potential relay nodes predicts whether the potential relay node participates in bidding at the current time based on a historical success rate and scenario information, wherein the historical success rate is historical data of the potential relay node establishing a cooperative relationship with the satellite, and the scenario information includes satellite-relay channel state information, relay-terrestrial user channel state information and relay-satellite user channel state information obtained by a terrestrial network.
3. The communication method according to claim 2, wherein each of the potential relay nodes is predicted based on a reinforcement learning model, the reinforcement learning model divides the transit period of the satellite into N time segments based on a Q-learning algorithm with the potential relay node as an agent, the model parameters of the reinforcement learning model include a state, an action and a reward of the potential relay node in a certain time segment, the state is defined as a duplet:
Figure FDA0003738060520000011
wherein,
Figure FDA0003738060520000021
wherein,
Figure FDA0003738060520000022
represents the channel gain between the satellite S and the potential relay node $R_k$, $P_s$ represents the transmit power and $\sigma^2$ the noise power of the satellite,
Figure FDA0003738060520000023
represents the channel gain between the potential relay node $R_k$ and the terrestrial user
Figure FDA0003738060520000024
under the coverage of the potential relay node $R_k$,
Figure FDA0003738060520000025
represents the remaining power of the potential relay node $R_k$; the action is defined as the action set A = {Y, N}, wherein Y denotes bidding and N denotes abstaining from bidding; the reward is represented by the discounted return, defined as the accumulated value of each instant reward multiplied by the corresponding power of the discount factor during the learning process of the agent; an action value function Q(s, a) is defined as the expected discounted return when action $a \in A$ is taken in state s within a given time segment, and the reinforcement learning model updates the action value function to obtain the optimal action value function $Q^*(s, a)$.
4. The communication method of claim 2, wherein each potential relay node performs prediction based on a reinforcement learning model, the reinforcement learning model employs a Double DQN algorithm to predict whether the potential relay node participates in bidding at the current time, the satellite transit period is divided into P time slices in an isochronous manner, model parameters of the reinforcement learning model include states, actions and rewards of the potential relay node in a certain time slice, and the states are defined as six-element groups:
Figure FDA0003738060520000026
wherein,
Figure FDA0003738060520000027
represents the channel gain between the satellite S and the potential relay node $R_k$,
Figure FDA0003738060520000028
represents the channel gain between the potential relay node $R_k$ and the satellite user D,
Figure FDA0003738060520000029
represents the inclination of the satellite,
Figure FDA00037380605200000210
Figure FDA00037380605200000211
represents the visibility angle of the potential relay node to the satellite,
Figure FDA00037380605200000212
Figure FDA00037380605200000213
represents the remaining power of the potential relay node, and $\theta$ represents the movement angle of the satellite, $\theta \in [0^\circ, 180^\circ]$; the action is expressed as a = π(s), where π(·) is a policy representing the mapping from environment state s to action a; the reward is represented by a numerical value; the reinforcement learning model comprises a Q learning network and a target learning network, data from the environment are input into the Q learning network, the system selects the action with the maximum value in the Q learning network, the parameters of the Q learning network are copied into the target learning network at intervals, and the target learning network updates the parameters of the Q learning network by back-propagation under the action of a loss function, so as to gradually obtain an optimized Q value.
5. The communication method of claim 1, further comprising:
the satellite broadcasts information of the winning relay node;
upon arrival of a service period, the satellite transmits a satellite signal to the winning relay node; and
the winning relay node accesses the satellite frequency band of the satellite and provides service for the ground users.
6. The communication method of claim 5, wherein one transmission frame of the satellite and the winning relay node comprises (N + 3) time slots, wherein a first time slot is used for the satellite to broadcast the cooperative signaling to all potential relay nodes, a second time slot is used for the satellite to transmit one or more satellite signals to the winning relay node, a third time slot is used for the winning relay node to forward the satellite signals to the satellite users, and the remaining N time slots are used for the winning relay node to serve the terrestrial users in its coverage area, and the service period comprises the remaining (N + 2) time slots except the first time slot.
7. The communication method of claim 6, wherein the auction mechanism is a Vickrey auction mechanism and the winning relay node is the first relay node with the largest bid vector.
8. The communication method of claim 7, wherein after the winning relay node is determined, the second-largest bidding vector, i.e. the one second only to the largest bidding vector, is used as the winning-bid value of the winning relay node.
9. The communication method of claim 6, wherein the following formulas are used to calculate, step by step, each component of the bid vector
Figure FDA0003738060520000031
Figure FDA0003738060520000032
Figure FDA0003738060520000033
Figure FDA0003738060520000034
Figure FDA0003738060520000035
Figure FDA0003738060520000036
Figure FDA0003738060520000037
Figure FDA0003738060520000038
Figure FDA0003738060520000039
wherein the set of first relay nodes is denoted $R = \{R_1, R_2, \ldots, R_k, \ldots, R_K\}$, and K is the total number of first relay nodes; the k-th first relay node $R_k$ serves N terrestrial users, denoted $U_k = \{U_{k1}, U_{k2}, \ldots, U_{kn}, \ldots, U_{kN}\}$;
Figure FDA0003738060520000041
denotes the initial bid value provided by the k-th first relay node $R_k$; $t_1$ denotes the length of the second transmission time slot of the first relay node $R_k$; $t_2$ denotes the length of the third transmission time slot of the first relay node $R_k$; $t_{kn}$ (n = 1 to N) denotes the lengths of the remaining N time slots of the first relay node $R_k$;
Figure FDA0003738060520000042
denotes the signal-to-noise ratio of the satellite-first relay link;
Figure FDA0003738060520000043
denotes the signal-to-noise ratio of the first relay-terrestrial user link;
Figure FDA0003738060520000044
denotes the signal-to-noise ratio of the first relay-satellite user link;
Figure FDA0003738060520000045
denotes the channel capacity of the link that the first relay node $R_k$ provides for satellite user D;
Figure FDA0003738060520000046
denotes the QoS requirement of a terrestrial user, i.e. the minimum data transmission rate that must be provided to that user;
Figure FDA0003738060520000047
denotes the channel gain of the link from the first relay node $R_k$ to satellite user D;
Figure FDA0003738060520000048
respectively denote the channel gains of the three links: satellite to first relay node, first relay node to satellite user, and first relay node to terrestrial user; N and K are positive integers greater than or equal to 1.
10. The communication method according to claim 9, wherein after the first relay node reports the bid vector to the satellite, the method further comprises: the satellite modifies the bid vector based on switching overhead indicators to obtain a modified bid vector, wherein the switching overhead indicators include one or more of bid history information, predicted residence time and movement angle information, and the satellite selects, according to the modified bid vectors, the first relay node corresponding to the maximum modified bid vector from the first relay nodes as the winning relay node, wherein the bid history information is obtained from the time-ordered set of all winning relay nodes selected by the satellite; the predicted residence time and the movement angle information are obtained from a geometric model.
11. The communication method of claim 10, wherein the bid vector is modified using the following equation:
Figure FDA0003738060520000049
wherein $b_k$ denotes the modified bid vector,
Figure FDA00037380605200000410
denotes the bid vector after normalization, and $\gamma$ is a coefficient that balances the data transmission rate against the switching overhead: a larger $\gamma$ means that the satellite node places more emphasis on the data transmission rate, and a smaller $\gamma$ means that it places more emphasis on reducing the switching overhead;
Figure FDA00037380605200000411
is a decision matrix, wherein $x_{kj}$ denotes any one of said switching overhead indicators, k is the index of the first relay node, K is the total number of first relay nodes, j is the index of the switching overhead indicator, and $w_j$ is the weight of the j-th switching overhead indicator.
12. The communication method of claim 11, wherein the weights are determined using an entropy weight method.
13. An auction-mechanism-based communication system of a satellite-ground converged relay network, wherein the satellite-ground converged relay network comprises at least one satellite, a plurality of potential relay nodes and a plurality of satellite users, each potential relay node serves a plurality of ground users covered by the potential relay node on the ground, and the satellite is used for broadcasting cooperation signaling to the plurality of potential relay nodes and using a time slot of the satellite as a commodity in the auction mechanism; each potential relay node is used for predicting whether the potential relay node participates in bidding at the current moment, and taking all the potential relay nodes participating in bidding as first relay nodes, wherein the participation bidding indicates that the potential relay nodes are willing to obtain the commodity, the first relay nodes are used for evaluating the channel value according to the channel performance and reporting bidding vectors including the estimated value to the satellite, wherein the channels include a satellite-relay channel, a relay-satellite user channel and a relay-ground user channel, and the channel performance includes channel gain and sub-slot allocation length; the satellite is further used for selecting a first relay node corresponding to the largest bidding vector from a plurality of first relay nodes as a winning relay node based on an auction mechanism and the bidding vectors, and the winning relay node and the satellite achieve a cooperative relationship.
CN202210806700.4A 2022-07-08 2022-07-08 Communication method and communication system of satellite-ground converged relay network based on auction mechanism Active CN115173926B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210806700.4A CN115173926B (en) 2022-07-08 2022-07-08 Communication method and communication system of satellite-ground converged relay network based on auction mechanism

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210806700.4A CN115173926B (en) 2022-07-08 2022-07-08 Communication method and communication system of satellite-ground converged relay network based on auction mechanism

Publications (2)

Publication Number Publication Date
CN115173926A (en) 2022-10-11
CN115173926B (en) 2023-07-07

Family

ID=83492664

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210806700.4A Active CN115173926B (en) Communication method and communication system of satellite-ground converged relay network based on auction mechanism

Country Status (1)

Country Link
CN (1) CN115173926B (en)

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20090296627A1 (en) * 2008-05-29 2009-12-03 Samsung Electronics Co., Ltd. Apparatus and method for selecting relay station mode in wireless communication system
CN108023637A (en) * 2017-12-06 2018-05-11 中国人民解放军国防科技大学 Isomorphic multi-satellite online collaboration method
CN108023637B (en) * 2017-12-06 2020-06-23 中国人民解放军国防科技大学 Isomorphic multi-satellite online collaboration method
CN108832989A (en) * 2018-05-07 2018-11-16 哈尔滨工程大学 The online Dynamic Programming terminal of the task of low rail microsatellite and planning method used in
WO2022105621A1 (en) * 2020-11-17 2022-05-27 重庆邮电大学 Evolutionary game-based multi-user switching method in software-defined satellite network system

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
XIAOKAI ZHANG et al.: "Vickrey Auction-Based Secondary Relay Selection in Cognitive Hybrid Satellite-Terrestrial Overlay Networks With Non-Orthogonal Multiple Access", IEEE WIRELESS COMMUNICATIONS LETTERS *
XIAOKAI ZHANG et al.: "Auction-Based Multichannel Cooperative Spectrum Sharing in Hybrid Satellite-Terrestrial IoT Networks", IEEE INTERNET OF THINGS JOURNAL *
XIAOKAI ZHANG et al.: "VCG Auction-based Multi-Relay Selection in Hybrid Satellite-Terrestrial Overlay Networks", 2020 INTERNATIONAL CONFERENCE ON WIRELESS COMMUNICATIONS AND SIGNAL PROCESSING (WCSP) *
Xu He et al.: "A relay selection algorithm for D2D cooperative communication combined with auction", Application Research of Computers *

Also Published As

Publication number Publication date
CN115173926B (en) 2023-07-07

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant