CN107995034A - Energy and service cooperation method for dense cellular networks - Google Patents
- Publication number
- CN107995034A (application CN201711236163.XA)
- Authority
- CN
- China
- Prior art keywords
- base station
- energy
- user
- terminal
- behalf
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04W—WIRELESS COMMUNICATION NETWORKS
- H04W72/00—Local resource management
- H04W72/04—Wireless resource allocation
- H04W72/044—Wireless resource allocation based on the type of the allocated resource
- H04W72/0473—Wireless resource allocation based on the type of the allocated resource the resource being transmission power
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L41/00—Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
- H04L41/08—Configuration management of networks or network elements
- H04L41/0893—Assignment of logical groups to network elements
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04W—WIRELESS COMMUNICATION NETWORKS
- H04W72/00—Local resource management
- H04W72/50—Allocation or scheduling criteria for wireless resources
- H04W72/51—Allocation or scheduling criteria for wireless resources based on terminal or device properties
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D30/00—Reducing energy consumption in communication networks
- Y02D30/70—Reducing energy consumption in communication networks in wireless communication networks
Abstract
An embodiment of the invention discloses an energy and service cooperation method for dense cellular networks that can be applied to online resource allocation across multiple base stations. Matching theory is first used to group users with their corresponding base stations, producing user-centric clusters and reducing the base station population to the scale of a cluster. A multi-agent reinforcement learning algorithm then performs base station power allocation and energy cooperation between base stations.
Description
Technical field
The present invention relates to the field of wireless communication, and more particularly to an energy and service cooperation method for dense cellular networks.
Background
Ultra-dense networking is regarded as one of the most promising 5G technologies. The small coverage radius of small cells enables lower interference, high spectrum reuse, and high data rates. At the same time, the large number of cellular base stations brings unprecedented energy costs, and base station energy saving has become a research hotspot in recent years.
At present, the prior art addresses energy-harvesting-based resource allocation only for a single cell or for two cells, and there is little research on energy cooperation among multiple base stations in dense network scenarios. How to coordinate energy and service in a dense cellular network is a technical problem that those skilled in the art urgently need to solve.
Summary of the invention
To solve the above technical problem, embodiments of the present invention provide an energy and service cooperation method for dense cellular networks.
Embodiments of the present invention provide the following technical solution:
An energy and service cooperation method for dense cellular networks, the method comprising:
generating preference lists for the user terminals and the base stations according to a utility function;
obtaining user-base-station clusters from the preference lists using a many-to-many matching algorithm;
within each user-base-station cluster, obtaining a strategy for base station power allocation and inter-base-station energy cooperation using a reinforcement learning algorithm.
Generating the preference lists for the user terminals and the base stations according to the utility function specifically comprises: defining a utility function $V_{nk,m}$ that represents the amount of data the n-th base station can send to terminal m on its k-th channel, and generating the preference lists of the base stations and the users from the transmission data rate $V_{nk,m}$ and the channel gain $g_{nk,m}$.
Obtaining, within the user-base-station cluster, the strategy for base station power allocation and inter-base-station energy cooperation using a multi-agent reinforcement learning algorithm specifically comprises:
Step 1: determine the action set, that is, all possible action values an agent can output, and extract a state representation from the environment to serve as the agent's observation of the environment;
Step 2: each agent observes the state of the current environment and enters the exploration phase;
Step 3: with maximizing the system average sum rate as the objective, each agent makes a rational action choice based on its own observation, where an action comprises the base station's transmit power and its energy cooperation; two strategies are available for this decision, a randomized experimental strategy and a deterministic baseline strategy;
Step 4: after all base stations have completed their decisions, the reward from the environment is computed and each agent updates its corresponding state-action value;
Step 5: repeat steps 3 and 4 until the exploration phase ends, compare the newly learned strategy with the baseline strategy, and take the better one as the output strategy for this state.
Compared with the prior art, the above technical solution has the following advantages:
The method of the present invention can be applied to online resource allocation across multiple base stations. Matching theory is first used to group users with their corresponding base stations, producing user-centric clusters. Unlike traditional matching algorithms, the present invention addresses the three-way matching of users, channels, and base stations: each base station together with one of its channels is represented by a single utility function, so the pairing of all three can be completed in one matching process. This avoids the two-stage matching of conventional methods and reduces computational complexity while preserving optimality. In the power allocation stage, the base station population is reduced to the scale of a cluster, and an online reinforcement learning method then performs base station power allocation and inter-base-station energy cooperation.
Brief description of the drawings
To illustrate the embodiments of the present invention or the prior-art technical solutions more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings described below are evidently only some embodiments of the present invention; those of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 is a flow diagram of the energy and service cooperation method for dense cellular networks provided by an embodiment of the present invention.
Detailed description
As stated in the background section, providing an energy and service cooperation method for dense cellular networks is a problem that those skilled in the art urgently need to solve.
In view of this, the present invention proposes an energy and service cooperation method for dense cellular networks. Its core idea is as follows. Online resource allocation algorithms in the prior art target only a single cell or two cells. In a dense cellular network with a large number of base stations, traditional reinforcement learning does not guarantee convergence in the multi-agent setting. Moreover, when the number of base stations is very large, directly applying an existing multi-agent reinforcement learning method that does guarantee convergence still converges very slowly and needs a very long learning time. Because of these drawbacks, traditional reinforcement learning is unsuitable for direct use in dense-network resource allocation. To overcome them, the inventors propose first using matching theory to match base stations, users, and resource blocks, thereby grouping users with their corresponding base stations. Matching theory yields user-centric clusters, which reduces the base station population to the scale of a cluster; a reinforcement learning method then performs base station power allocation and inter-base-station energy cooperation.
In other words, the massive number of access points in a dense network makes global optimization of power allocation and energy cooperation extremely difficult. The present invention therefore proposes a distributed solution: clustering reduces the number of base stations involved in power allocation, which effectively lowers the difficulty of the reinforcement learning algorithm and ensures convergence to the optimal strategy within the algorithm's limited learning period.
Referring to Fig. 1, an embodiment of the present invention provides an energy and service cooperation method applied to a dense cellular network, the method comprising:
Step 101: Generate preference lists for the user terminals and the base stations according to a utility function.
Specifically, define a utility function $V_{nk,m}$ representing the amount of data the n-th base station can send to terminal m on its k-th channel, and generate the preference lists of the base stations and the users from the transmission data rate $V_{nk,m}$ and the channel gain $g_{nk,m}$.
Step 102: From the preference lists, obtain user-base-station clusters using a many-to-many matching algorithm.
Specifically, the user terminals and base stations are matched according to a many-to-many matching theory model, based on the preference lists, to obtain the user-base-station clusters.
Step 103: Within each user-base-station cluster, obtain the strategy for base station power allocation and inter-base-station energy cooperation using a reinforcement learning algorithm.
Within a user-base-station cluster, the presence of multiple agents makes the environment non-stationary, so the algorithm may fail to converge. To solve this problem, the concept of an exploration phase is introduced, and the learning of the agents' actions is modeled as a stage game. Each agent is allowed a limited number of explorations in a given state, generating actions from the randomized experimental strategy and the deterministic baseline strategy with different probabilities and computing the cumulative reward. After the exploration phase, the experimental strategy and the baseline strategy are compared and the final strategy is output.
Specifically, within a user-base-station cluster, the battery level, channel state information, harvested energy, data packets, and other information of all base stations in the cluster are obtained, and the exploration phase begins. The base stations in the cluster attempt power allocation, deciding with the experimental strategy with probability p and with the baseline strategy with probability 1-p, and record the actions taken and the rewards obtained. Each agent updates its own strategy according to the reward obtained and records this information in its local knowledge base. The exploration step is repeated until the exploration phase ends; the learned strategy is then compared with the baseline strategy, and the final strategy is output.
As the above embodiment shows, the method of the present invention can be applied to online resource allocation across multiple base stations. Matching theory is first used to group users with their corresponding base stations, producing user-centric clusters and reducing the base station population to the scale of a cluster; a reinforcement learning method then performs base station power allocation and inter-base-station energy cooperation.
Moreover, the method of the present invention simplifies the matching of the three variables terminal, base station, and resource block, pairing all three in a single matching process. Resource blocks and base stations are arranged together: the k-th channel of the i-th base station is one matching entity that is matched with terminals, and a terminal's preference list ranks combinations of base station and channel. The above process clusters the base stations and, after clustering, performs power allocation and energy cooperation. However, because traffic load and energy harvesting are non-uniform, the clusters formed will cease to be optimal over time, and the matching algorithm must be restarted to re-cluster. The main criterion is the remaining energy of the base stations: when the remaining energy of a base station in a cluster is insufficient to support data transmission in the current slot, the matching algorithm is started again to form new clusters. Because the multi-agent algorithm in the present invention learns strategies, and each agent is rational, the previously learned strategies remain applicable when cluster membership changes, and learning need not be iterated again.
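The re-clustering criterion described above can be sketched as a simple check. This is a minimal illustration, not the patented procedure: the function name `needs_reclustering` and the dictionary inputs are hypothetical, and the energy each base station needs for the current slot is assumed to be known in advance.

```python
from typing import Dict, Hashable


def needs_reclustering(remaining: Dict[Hashable, float],
                       required: Dict[Hashable, float]) -> bool:
    """Return True if any base station in the cluster cannot cover the energy
    it needs for data transmission in the current slot.

    remaining: battery level per base station (hypothetical input).
    required:  energy needed this slot per base station (hypothetical input);
               a base station with no entry is assumed to need nothing.
    """
    return any(remaining[bs] < required.get(bs, 0.0) for bs in remaining)
```

When the check returns True, the matching algorithm would be run again to form new clusters.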
Obtaining, within the user-base-station cluster, the strategy for base station power allocation and inter-base-station energy cooperation using a reinforcement learning algorithm specifically comprises:
Step 1: determine the action set A, that is, all possible action values an agent can output; extract the state representation S from the environment to serve as the agent's observation of the environment.
Step 2: each agent observes the state $s_t$ of the current environment and enters the exploration phase.
Step 3: with maximizing the system average sum rate as the objective, each agent makes a rational action choice based on its own observation, where an action comprises the base station's transmit power and its energy cooperation. Two strategies are available for this decision: a randomized experimental strategy and a deterministic baseline strategy.
Step 4: after all base stations have completed their decisions, the reward from the environment is computed and each agent updates its corresponding state-action value Q(s, a).
Step 5: repeat steps 3 and 4 until the exploration phase ends. Compare the newly learned strategy with the baseline strategy, and take the better one as the output strategy for this state.
The key steps of the above method are explained in detail below.
1. Generation of preference lists
The purpose of matching is to pair users, base stations, and resource blocks. Because a given resource block can be used by only one base station and one user at a time, each user terminal ranks base stations and resource blocks jointly. This avoids matching three quantities during the matching process and makes the algorithm more concise. Considering the instability caused by a terminal frequently switching access points, a terminal gives priority to base stations with more energy and better channel quality. Combining these two factors, the utility function $V_{nk,m}$ is defined as the amount of data that the n-th base station can send to terminal m on its k-th channel.
In this definition, $B_n$ is the battery level of base station n; $g_{nk,m}$ is the channel gain of the k-th channel that the n-th base station uses to connect to terminal m; $\sigma^2$ is the additive white Gaussian noise power; and the remaining terms are the transmit power of the base station, the co-channel interference from other base stations on base station n, and the transmit power of the i-th base station on the k-th channel.
Each terminal ranks the base stations and channels accordingly, generating the terminal's preference list over base stations. A base station's preference list over terminals is determined by the channel gain from the base station to each terminal. With N base stations in the model, each having K mutually orthogonal subchannels, and M users, the preference lists are expressed in terms of $SBS_i$ and $UE_i$, where $SBS_i$ is an entry of the arrangement obtained by expanding the base stations and their channels, that is, the index of the i-th base-station channel, and $UE_i$ is the i-th user.
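The ranking step can be sketched as follows. This is a minimal illustration under stated assumptions: the utility and gain values are taken as precomputed inputs keyed by (base station, channel, terminal), because the utility formula itself is not reproduced here, and the names `terminal_preferences` and `slot_preferences` are hypothetical.

```python
from typing import Dict, List, Tuple

Slot = Tuple[int, int]            # (base station n, channel k), one matching entity
Key = Tuple[int, int, int]        # (base station n, channel k, terminal m)


def terminal_preferences(utility: Dict[Key, float], terminal: int) -> List[Slot]:
    """Rank (base station, channel) slots for one terminal, highest utility first."""
    slots = [(n, k) for (n, k, m) in utility if m == terminal]
    return sorted(slots, key=lambda s: utility[(s[0], s[1], terminal)], reverse=True)


def slot_preferences(gain: Dict[Key, float], slot: Slot) -> List[int]:
    """Rank terminals for one (base station, channel) slot by channel gain, best first."""
    n, k = slot
    terminals = [m for (bn, bk, m) in gain if (bn, bk) == (n, k)]
    return sorted(terminals, key=lambda m: gain[(n, k, m)], reverse=True)
```

Treating each (base station, channel) pair as one entity is what lets a single matching pass cover all three quantities.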
2. Matching process
Each base station has K orthogonal subchannels and can therefore serve K terminals simultaneously. Terminals are also assumed to support multi-connectivity, with each user able to connect to at most L different subchannels, so the matching of base stations, terminals, and channels is a many-to-many matching. The detailed procedure is as follows.
1) While an unmatched terminal exists, select any such terminal and perform the operations below.
2) Request: the selected terminal m sends a pairing request to base station n; the request contains the information of the k-th channel to be matched. Base station n is the highest-priority entry for terminal m among those that have not rejected terminal m.
3) Response: if the channel that terminal m requests from the base station is idle, the request is accepted. Otherwise, the base station compares terminal i, currently paired on channel k, with the requesting terminal m and, according to its preference list over terminals, accepts the pairing request of the higher-priority terminal; the other terminal is rejected and added to the list of unmatched terminals.
4) Stop when the list of unmatched terminals is empty; otherwise, return to step 1).
5) Matching ends; return the set of pairings.
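The request and response steps above resemble a terminal-proposing deferred-acceptance procedure, which can be sketched as follows. This is an illustrative reading, not a definitive implementation: the function name `many_to_many_match`, the `quota` parameter (standing in for L), and the rule that a terminal keeps proposing until it holds `quota` slots or runs out of candidates are assumptions.

```python
from collections import deque
from typing import Dict, Hashable, List, Set


def many_to_many_match(term_prefs: Dict[Hashable, List[Hashable]],
                       slot_prefs: Dict[Hashable, List[Hashable]],
                       quota: int) -> Dict[Hashable, Set[Hashable]]:
    """Each slot (base station, channel) serves one terminal; each terminal
    holds at most `quota` slots. Terminals propose in preference order."""
    rank = {s: {t: i for i, t in enumerate(p)} for s, p in slot_prefs.items()}
    next_idx = {t: 0 for t in term_prefs}   # next slot each terminal will try
    held_by = {}                            # slot -> terminal currently accepted
    matched = {t: set() for t in term_prefs}
    queue = deque(term_prefs)               # terminals that may still propose
    while queue:
        t = queue.popleft()
        while len(matched[t]) < quota and next_idx[t] < len(term_prefs[t]):
            s = term_prefs[t][next_idx[t]]
            next_idx[t] += 1                # a slot is never asked twice
            if t not in rank.get(s, {}):
                continue                    # slot does not rank this terminal
            current = held_by.get(s)
            if current is None:             # idle channel: accept
                held_by[s] = t
                matched[t].add(s)
            elif rank[s][t] < rank[s][current]:
                held_by[s] = t              # slot prefers the new terminal
                matched[t].add(s)
                matched[current].discard(s)
                queue.append(current)       # displaced terminal proposes again
    return matched
```

Because every terminal's pointer into its preference list only moves forward, the loop ends after at most one proposal per (terminal, slot) pair.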
3. Action discretization and state feature extraction
Assume a cluster contains m terminals; then m channels serve them. The battery level, energy harvesting status, data packets to be sent, and channel gain information of the base stations that own these m channels are collected to form the state information of the cluster, whose components give, for the i-th base station, the data packet size, the harvested energy, the battery level, and the channel gain information. An agent's action consists of a transmit power and the amount of energy exchanged between two cooperating base stations. To simplify the choice of action set, the present invention uses finite sets of transmit power values and cooperation energy values, quantized with step sizes $\delta_p$ and $\delta_E$, which denote the minimum unit of transmit power and of cooperation energy, respectively.
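A discretized action set of this kind can be sketched as a grid over transmit power and cooperation energy. This is a minimal sketch under assumptions: the upper bounds `p_max` and `e_max`, the inclusion of zero in both grids, and the function name `build_action_set` are hypothetical and do not come from the text.

```python
from itertools import product
from typing import List, Tuple


def build_action_set(p_max: float, e_max: float,
                     delta_p: float, delta_e: float) -> List[Tuple[float, float]]:
    """Enumerate (transmit power, cooperation energy) pairs on a uniform grid.

    delta_p and delta_e play the role of the step sizes for power and energy;
    p_max and e_max are assumed upper bounds.
    """
    n_p = int(round(p_max / delta_p))
    n_e = int(round(e_max / delta_e))
    powers = [i * delta_p for i in range(n_p + 1)]
    energies = [j * delta_e for j in range(n_e + 1)]
    return list(product(powers, energies))
```

Smaller step sizes give finer control at the cost of a larger action set for each agent to explore.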
4. Value function approximation
The action-value function is approximated by a linear function: it is expressed as the sum of products of a finite number of feature functions $\phi_{i,m}(s_t, a)$, $m = 1, \ldots, M$, and a weight vector $\theta_i$.
Here $\Phi(s_t, a) = (\phi_{i,1}(s_t, a), \ldots, \phi_{i,n}(s_t, a))$ is the set of feature functions of the state-action pair, $\phi_{i,l}(s_t, a)$ is a feature function, and $\theta$ is the weight vector. The present invention uses tile coding for the feature functions. Once the feature functions are fixed, updating the action-value function reduces to adjusting the weight vector. The weights are adjusted with minimum mean squared error as the objective, that is, to minimize the difference between $Q_i(s, a)$ and its linear approximation.
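Linear approximation with tile-coded features can be sketched as below. This is a simplified illustration, not the patent's update rule: it assumes a scalar state normalized to [0, 1), a discrete action index, and a plain least-mean-squares gradient step; the function names and the tiling parameters are hypothetical.

```python
import numpy as np


def tile_features(state: float, action: int, n_actions: int,
                  n_tilings: int = 4, n_tiles: int = 8) -> np.ndarray:
    """Binary tile-coding features for a scalar state in [0, 1) and a discrete action."""
    phi = np.zeros(n_actions * n_tilings * (n_tiles + 1))
    for t in range(n_tilings):
        offset = t / (n_tilings * n_tiles)      # each tiling is shifted slightly
        tile = int((state + offset) * n_tiles)  # index of the active tile
        phi[(action * n_tilings + t) * (n_tiles + 1) + tile] = 1.0
    return phi


def q_value(theta: np.ndarray, phi: np.ndarray) -> float:
    """Approximate Q as the inner product of weights and features."""
    return float(theta @ phi)


def lms_update(theta: np.ndarray, phi: np.ndarray,
               target: float, alpha: float) -> np.ndarray:
    """One gradient step on the squared error between target and theta . phi."""
    return theta + alpha * (target - theta @ phi) * phi
```

Exactly one tile per tiling is active, so each update touches only `n_tilings` weights, which keeps learning cheap for continuous states.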
5. System average sum rate
In a dense network scenario, each cell has non-uniform energy harvesting and a battery of limited storage capacity. The change in a base station's battery level between two adjacent slots is expressed in terms of the energy the base station consumes sending data in slot t, the energy that base station n shares with base station i in slot t, and the energy transfer efficiency $\eta$. Clearly, the energy a base station currently uses cannot exceed the energy stored in its battery.
Energy harvesting is causal: energy harvested in the current slot can only be used in the next and later slots. The energy required to send data must therefore satisfy the corresponding causality constraint.
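The battery dynamics and the causality constraint can be sketched as a single slot update. The exact form is an assumption: the update below applies the transfer efficiency `eta` to received energy, caps the result at a battery capacity `b_max`, and rejects any spending above the energy stored at the start of the slot. The function and parameter names are hypothetical.

```python
def next_battery(b_t: float, harvested: float, tx_energy: float,
                 sent: float, received: float, eta: float, b_max: float) -> float:
    """Battery level at the start of the next slot (illustrative form only).

    b_t:       energy stored at the start of slot t
    harvested: energy harvested during slot t, usable from slot t+1 onward
    tx_energy: energy spent sending data in slot t
    sent:      energy shared with other base stations in slot t
    received:  energy other base stations sent to this one in slot t
    eta:       energy transfer efficiency, between 0 and 1
    b_max:     battery capacity
    """
    spent = tx_energy + sent
    if spent > b_t:
        # Causality: energy harvested in slot t cannot be spent in slot t.
        raise ValueError("spending exceeds the energy stored at the start of the slot")
    return min(b_max, b_t - spent + harvested + eta * received)
```

Checking the spending against `b_t` alone, before the harvested energy is added, is what enforces causality in this sketch.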
The downlink signal-to-noise ratio determines the rate at which base station n sends data to terminal m over the k-th channel in slot t, and the sum rate of the system is then obtained over all base stations in the system.
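The rate computation can be sketched with the standard Shannon capacity formula. This is an assumption rather than the patent's own expression: co-channel interference is folded into the denominator of the ratio, and the `bandwidth` parameter and function names are hypothetical.

```python
import math
from typing import Iterable, Tuple


def sinr(p_tx: float, gain: float, noise_power: float, interference: float) -> float:
    """Received signal power divided by noise plus co-channel interference."""
    return p_tx * gain / (noise_power + interference)


def link_rate(p_tx: float, gain: float, noise_power: float,
              interference: float, bandwidth: float = 1.0) -> float:
    """Shannon rate of one base-station-to-terminal link on one channel."""
    return bandwidth * math.log2(1.0 + sinr(p_tx, gain, noise_power, interference))


def sum_rate(links: Iterable[Tuple[float, float, float]],
             noise_power: float, bandwidth: float = 1.0) -> float:
    """Sum of link rates; each link is (transmit power, gain, interference)."""
    return sum(link_rate(p, g, noise_power, i, bandwidth) for p, g, i in links)
```

Averaging `sum_rate` over the slots of a finite horizon gives the quantity the method aims to maximize.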
The aim of the present invention is to maximize the system average sum rate over a finite time horizon, subject to constraints (3), (4), and (5).
6. Power allocation algorithm
This part describes in detail the power allocation algorithm for the multiple base stations in a cluster. The algorithm takes the Markov game as its theoretical model and uses multi-agent reinforcement learning to allocate base station power on the downlink channels and to coordinate energy between base stations, with the aim of maximizing system throughput under energy constraints.
Traditional multi-agent reinforcement learning algorithms may fail to converge. The reason is that multiple learning agents acting simultaneously create a non-stationary external environment, and an agent cannot learn a stable decision strategy in a dynamic environment. To address this, the present invention proposes exploring multiple times in each state and models this process as a stage game. In this stage, an agent decides using the baseline strategy and, with a small probability, uses the randomized experimental strategy to explore new strategies, so that the learning agents learn best responses in a stable environment. After a limited number of explorations, the newly learned strategy is compared with the original baseline strategy, and the better one is output as the optimal strategy for the current state. The detailed procedure is as follows.
1) Set the parameters: the experiment probability $p_i$ of the i-th agent, the learning rate $\alpha$, and the inertia value $\lambda_i$.
2) Initialize the strategy $\pi_i$ of the i-th agent, which is agent i's best response to the strategies of the other agents.
3) Sense the state s of the environment and start the exploration phase.
4) Decide using the initial (baseline) strategy with probability $1 - p_i$, and explore using a random strategy with probability $p_i$.
5) The agent receives the reward $r_i(s_t, a_1, \ldots, a_i, \ldots, a_m)$ from the environment and observes the next state $s_{t+1}$.
6) Update the value function.
7) For all states, determine the best-response set. The strategy for the next exploration phase depends on whether the strategy of the current exploration phase belongs to the best-response set.
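The exploration phase can be sketched for a single agent in a single state. This is a deliberately reduced illustration, not the multi-agent algorithm itself: the other agents are hidden inside a hypothetical `reward_fn`, the inertia value is omitted, and the final comparison simply keeps the baseline action unless an explored action has a strictly higher estimated value.

```python
import random
from typing import Callable, Dict, Hashable, Sequence


def choose_action(baseline: Hashable, actions: Sequence[Hashable],
                  p: float, rng: random.Random) -> Hashable:
    """Baseline action with probability 1 - p, a uniformly random one with probability p."""
    return rng.choice(list(actions)) if rng.random() < p else baseline


def exploration_phase(baseline: Hashable, actions: Sequence[Hashable],
                      reward_fn: Callable[[Hashable], float],
                      p: float, alpha: float, n_rounds: int,
                      rng: random.Random) -> Hashable:
    """Run a fixed number of explorations in one state and return the better action."""
    q: Dict[Hashable, float] = {baseline: reward_fn(baseline)}  # evaluate baseline once
    for _ in range(n_rounds):
        a = choose_action(baseline, actions, p, rng)
        r = reward_fn(a)
        old = q.get(a, r)                 # first visit starts from the observed reward
        q[a] = old + alpha * (r - old)    # running estimate of the action value
    best = max(q, key=lambda a: q[a])
    return best if q[best] > q[baseline] else baseline
```

Keeping `p` small means most decisions follow the baseline, which is what keeps the environment close to stationary for the other learners.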
In summary, for dense network scenarios in which the energy available to base stations is limited, the present invention proposes a method for clustering base stations, terminals, and resource blocks, together with online power allocation and inter-base-station energy cooperation.
To address the problem of managing the large number of base stations in a dense network, the present invention uses a distributed matching algorithm to form user-centric clusters, which reduces the base station population to the scale of a cluster and in turn reduces the complexity of power allocation.
In the present invention, a terminal can connect to multiple base stations simultaneously and a base station can serve multiple terminals, so the clusters produced by matching overlap. From the user's perspective, the base stations serving the same terminal form a cluster, and energy cooperation between base stations within a cluster is achieved by transferring electrical energy directly. From the base station's perspective, a base station connected to multiple users belongs to multiple clusters, and energy cooperation between clusters can be achieved by the base station adjusting its transmit power toward different terminals. From the perspective of the whole system, these two mechanisms regulate the flow of energy between base stations and achieve a balance among the base stations.
The present invention simplifies the matching of the three variables terminal, base station, and resource block, pairing all three in a single matching process. Resource blocks and base stations are arranged together: the k-th channel of the i-th base station is one matching entity that is matched with terminals, and a terminal's preference list ranks combinations of base station and channel.
The present invention introduces the concept of an exploration phase into the multi-agent reinforcement learning algorithm. During the exploration phase an agent uses a fixed strategy $\pi$ and, with a small probability, experiments to explore other strategies. This creates a stable environment in which agents learn the optimal global decision strategy, which ensures convergence.
The present invention takes into account that, in the model, a base station's data rate is limited by its local energy. When generating the preference lists, the present invention uses the current battery level B of the base station as an influencing factor and combines it with the channel gain g, defining the utility function as the estimated amount of data the base station can send on each channel.
The present invention takes into account the efficiency of energy transfer between base stations. Base stations in the same cluster can cooperate in two ways, by transferring energy or by adjusting their transmit power toward the same terminal, and the online reinforcement learning algorithm of the present invention balances the two.
The state information s of a base station in the present invention is a continuous quantity. Linear value function approximation is therefore introduced: a linear function approximates, stores, and predicts the Q value of each state and, combined with the tile coding algorithm, handles the continuous state space of the model.
Moreover, traditional matching algorithms solve the matching of three quantities with two-stage matching: two of the quantities are matched first, and one of them then serves as an intermediate that is matched with the third. This process is more complex than that of the present invention.
For ultra-dense networks under energy constraints, the present invention proposes a method in which base stations use appropriate transmit power and energy cooperation strategies to maximize the long-term system average sum rate. A matching algorithm is first used to cluster the large-scale base station population. Unlike the existing literature, the present invention treats a base station together with one of its resource blocks as a single matching entity and generates one preference list, then uses a many-to-many matching algorithm to match base stations, users, and resource blocks. Because the user initiates the matching process, user satisfaction carries more weight, and the result is in essence a user-centric overlapping cluster model. Compared with traditional two-stage matching and three-quantity matching, the method of the present invention simplifies the matching process and makes it more concise and easier to understand. For the overlapping clusters produced by matching, the present invention proposes an online multi-agent reinforcement learning algorithm for power allocation. Because multiple learning agents create a dynamic environment in which the agents cannot converge, the present invention proposes exploring multiple times in each state and models this process as a stage game. In this stage, an agent decides using the baseline strategy and, with a small probability, uses the randomized experimental strategy to explore new strategies, so that the learning agents learn best responses in a stable environment. After a limited number of explorations, the newly learned strategy is compared with the original baseline strategy, and the better one is output as the optimal strategy for the current state. The multi-agent algorithm in the present invention ensures convergence. In addition, because the base station population is smaller, the environment each agent observes is simpler, and the algorithm converges to the optimal strategy faster.
In the prior art, renewable energy harvested from the environment (such as solar and wind energy) is non-uniform, fluctuating, and intermittent. For the problem of maximizing the throughput of base stations powered by such sources, a theoretical framework and a concrete implementation method are proposed.
The base station energy harvesting and multi-base-station energy cooperation method proposed by the present invention helps save energy and reduce emissions while maintaining quality of service, lowers operators' operating costs, and achieves higher economic benefit.
The present invention enables distributed, self-sustaining operation of base stations. A single base station obtains electrical energy through energy harvesting and reduces the fluctuation of the harvested energy by storing it in a battery of limited capacity. Meanwhile, energy cooperation between base stations enables networked energy sharing among base stations at large scale, further improving the stability of base station operation.
The parts of this specification are described progressively; each part focuses on its differences from the others, and identical or similar content across parts can be cross-referenced.
The foregoing description of the disclosed embodiments enables those skilled in the art to make or use the present invention. Various modifications to these embodiments will be apparent to those skilled in the art, and the general principles defined herein may be implemented in other embodiments without departing from the spirit or scope of the present invention. Therefore, the present invention is not limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.
Claims (3)
1. An energy and service cooperation method for a dense cellular network, characterized in that the method comprises:
generating preference lists for user terminals and base stations according to a utility function;
obtaining user-base station clusters from the preference lists by using a many-to-many matching algorithm;
within each user-base station cluster, obtaining the base station power allocation and the inter-base-station energy cooperation policy by using a reinforcement learning algorithm.
2. The method according to claim 1, characterized in that generating the preference lists for user terminals and
base stations according to the utility function specifically comprises:
defining a utility function $V_{nk,m}$, which represents the amount of data that the n-th base station can send to terminal m on the k-th channel, and
generating the preference lists of the base stations and the users according to the transmission data rate $V_{nk,m}$ and the channel gain $g_{nk,m}$.
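The preference-list step of claim 2 can be illustrated with a short Python sketch. The scoring rule used here (each terminal ranks base stations by its best rate over all channels, and each base station ranks terminals by their best channel gain) is an assumption made for illustration; the claim only states that the lists are generated from $V_{nk,m}$ and $g_{nk,m}$. The function name `preference_lists` and the nested-list layout `rate[n][k][m]` are likewise hypothetical.

```python
def preference_lists(rate, gain):
    """Build preference lists from rate[n][k][m] and gain[n][k][m].

    n indexes base stations, k channels, m terminals.
    Returns (user_prefs, bs_prefs) as dicts of ranked index lists.
    """
    n_bs = len(rate)
    n_ch = len(rate[0])
    n_ue = len(rate[0][0])

    # Assumed scoring: best value over all channels for a (base station, terminal) pair.
    def best(table, n, m):
        return max(table[n][k][m] for k in range(n_ch))

    # Each terminal ranks base stations by achievable rate, highest first.
    user_prefs = {
        m: sorted(range(n_bs), key=lambda n: best(rate, n, m), reverse=True)
        for m in range(n_ue)
    }
    # Each base station ranks terminals by channel gain, highest first.
    bs_prefs = {
        n: sorted(range(n_ue), key=lambda m: best(gain, n, m), reverse=True)
        for n in range(n_bs)
    }
    return user_prefs, bs_prefs
```

The resulting lists are the kind of input a many-to-many matching algorithm would consume when forming user-base station clusters.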
3. The method according to claim 1, characterized in that obtaining, within the user-base station cluster, the base station power allocation and
the inter-base-station energy cooperation policy by using a multi-agent reinforcement learning algorithm specifically comprises:
First step: determine the action set, that is, all possible action values that an agent can output; extract a state representation from the environment to serve as
the agent's observation of the environment;
Second step: each agent observes the state of the current environment and enters the exploration phase;
Third step: each agent, with the goal of maximizing the system average sum rate, selects a rational action according to its own observation,
wherein the action comprises the transmit power of the base station and the energy cooperation, and two policies are available for this decision: a randomized
experimental policy and a deterministic baseline policy;
Fourth step: after all base stations have completed all decisions, the reward information of the environment is computed, and each agent updates its corresponding
state-action value;
Fifth step: repeat the third and fourth steps until the exploration phase ends, compare the quality of the newly learned policy with that of the baseline policy,
and take the better policy as the output policy for this state.
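The five steps of claim 3 can be sketched in Python as follows. This is a deliberately simplified illustration, not the patented algorithm: it assumes a single fixed state (so the value update has no next-state term), independent per-agent value tables, and a toy shared reward `toy_sum_rate` that stands in for the system average sum rate. All function names, the `explore` probability, and the learning rate `alpha` are assumptions.

```python
import random


def toy_sum_rate(actions):
    """Toy shared reward: own power helps, other agents' power interferes."""
    total = sum(actions)
    return sum(a / (1.0 + 0.5 * (total - a)) for a in actions)


def learn_policy(n_agents, actions, reward_fn, baseline, state=0,
                 episodes=200, alpha=0.1, explore=0.5, seed=0):
    """Return the better of the learned joint policy and the baseline policy."""
    rng = random.Random(seed)
    # Step 1: `actions` is the action set; `state` stands in for the observation.
    # One value table per agent: q[i][(state, action)] -> estimated value.
    q = [{(state, a): 0.0 for a in actions} for _ in range(n_agents)]

    # Step 2: agents observe the state and enter the exploration phase.
    for _ in range(episodes):
        # Step 3: each agent uses either the randomized experimental policy
        # or the deterministic baseline policy to pick its action.
        joint = [
            rng.choice(actions) if rng.random() < explore else baseline[i]
            for i in range(n_agents)
        ]
        # Step 4: after all agents decide, compute the shared reward and
        # update each agent's state-action value.
        r = reward_fn(joint)
        for i, a in enumerate(joint):
            q[i][(state, a)] += alpha * (r - q[i][(state, a)])
        # Step 5 (loop): steps 3 and 4 repeat until exploration ends.

    # Step 5 (end): compare the learned greedy policy with the baseline.
    learned = [max(actions, key=lambda a: q[i][(state, a)])
               for i in range(n_agents)]
    if reward_fn(learned) > reward_fn(baseline):
        return learned
    return list(baseline)
```

Because the final step keeps the baseline whenever the learned policy does not beat it, the returned policy is never worse than the baseline under the given reward function, which mirrors the comparison described in the fifth step.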
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201711236163.XA CN107995034B (en) | 2017-11-30 | 2017-11-30 | Energy and service cooperation method for dense cellular network |
Publications (2)
Publication Number | Publication Date |
---|---|
CN107995034A (en) | 2018-05-04 |
CN107995034B (en) | 2020-12-08 |
Family
ID=62034639
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201711236163.XA Active CN107995034B (en) | 2017-11-30 | 2017-11-30 | Energy and service cooperation method for dense cellular network |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107995034B (en) |
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106170131A (en) * | 2016-08-22 | 2016-11-30 | 中央军委装备发展部第六十三研究所 | A kind of sane layering Game Learning resource allocation methods of channel status condition of uncertainty lower leaf heterogeneous network |
CN106507463A (en) * | 2016-09-19 | 2017-03-15 | 南京邮电大学 | A kind of isomery cellular network resource distribution method based on the heuristic sub-clustering of multichannel |
CN106454850A (en) * | 2016-10-14 | 2017-02-22 | 重庆邮电大学 | Resource distribution method for energy efficiency optimization of honeycomb heterogeneous network |
CN107302801A (en) * | 2017-05-19 | 2017-10-27 | 南京邮电大学 | To QoE double-deck matching game method below a kind of 5G mixing scene |
Non-Patent Citations (1)
Title |
---|
CHUNHONG DUO, BAOGANG LI, YONGQIAN LI, AND YABO LV: "Energy Cooperation in Ultradense Network Powered by Renewable Energy Based on Cluster and Learning Strategy", 《HINDAWI》 * |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109548073A (en) * | 2018-11-16 | 2019-03-29 | 厦门大学 | One kind is based on the matched adaptive slight differentiation cluster method of multi-to-multi |
CN109548073B (en) * | 2018-11-16 | 2020-09-25 | 厦门大学 | Self-adaptive small cell clustering method based on many-to-many matching |
CN109831819A (en) * | 2019-03-06 | 2019-05-31 | 重庆邮电大学 | One kind being based on isomery cellular network sub-clustering SMDP base station dormancy method |
CN109831819B (en) * | 2019-03-06 | 2021-10-22 | 重庆邮电大学 | Heterogeneous cellular network based cluster SMDP base station dormancy method |
Also Published As
Publication number | Publication date |
---|---|
CN107995034B (en) | 2020-12-08 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||