CN110381541A - Smart grid slice allocation method and device based on reinforcement learning - Google Patents

Smart grid slice allocation method and device based on reinforcement learning

Publication number
CN110381541A
Authority
CN
China
Prior art keywords
smart grid
slice
reinforcement learning
service
sliced
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910452242.7A
Other languages
Chinese (zh)
Other versions
CN110381541B (en)
Inventor
孟萨出拉
王智慧
丁慧霞
吴赛
杨德龙
孙丽丽
曹新智
滕玲
段钧宝
李许安
王莹
王雪
陈源彬
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
State Grid Corp of China SGCC
China Electric Power Research Institute Co Ltd CEPRI
Information and Telecommunication Branch of State Grid Shandong Electric Power Co Ltd
Original Assignee
State Grid Corp of China SGCC
China Electric Power Research Institute Co Ltd CEPRI
Information and Telecommunication Branch of State Grid Shandong Electric Power Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by State Grid Corp of China SGCC, China Electric Power Research Institute Co Ltd CEPRI, Information and Telecommunication Branch of State Grid Shandong Electric Power Co Ltd
Priority to CN201910452242.7A
Publication of CN110381541A
Application granted
Publication of CN110381541B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04W WIRELESS COMMUNICATION NETWORKS
    • H04W28/00 Network traffic management; Network resource management
    • H04W28/16 Central resource management; Negotiation of resources or communication parameters, e.g. negotiating bandwidth or QoS [Quality of Service]
    • H04W28/24 Negotiating SLA [Service Level Agreement]; Negotiating QoS [Quality of Service]

Abstract

The invention discloses a smart grid slice allocation method based on reinforcement learning, comprising: classifying the power services of a smart grid according to service type; mapping each class to a different slice; and constructing a reinforcement learning model for smart grid slicing according to the service indicators of the smart grid, completing the allocation of smart grid slices through the reinforcement learning model, and thereby realizing resource scheduling and management for the smart grid. By classifying the service types of the smart grid, mapping the classes to different slices, and allocating the slices through the constructed reinforcement learning model, the method addresses the problem of integrating reinforcement-learning-based 5G network slicing technology with the smart grid.

Description

Smart grid slice allocation method and device based on reinforcement learning
Technical field
This application relates to the field of network resource allocation for electric power wireless communication, and in particular to a smart grid slice allocation method based on reinforcement learning, and also to a smart grid slice allocation device based on reinforcement learning.
Background
With the arrival of the 5G era of ubiquitous high-speed, low-power, low-latency connectivity, communication in human society is gradually becoming unobstructed. Network slicing is regarded as one of the key technologies of 5G networks: a single physical network is divided into multiple independent logical networks that are allocated to different service scenarios, so as to support various vertical multi-service networks and to adapt to different service demands according to their characteristics. Network slicing can greatly reduce deployment cost and reduce network occupancy.
Driven by growing demand for energy and electricity, the world's power grids have moved from the traditional grid into the smart grid era. With the new round of energy revolution, the Global Internet strategic concept, and developments in the communications field, 5G network slicing can for the first time be applied to smart grid services. For carrying grid-oriented wireless service applications, 5G network slicing offers customizable slices, secure and reliable isolation between slices, and unified slice management, together with rapid networking, high efficiency, and economy, so it has broad application prospects in power systems. The integration of reinforcement-learning-based 5G network slicing technology with the smart grid is therefore a problem that urgently needs to be solved.
Summary of the invention
The application provides a smart grid slice allocation method based on reinforcement learning, which addresses the problem of integrating reinforcement-learning-based 5G network slicing technology with the smart grid.
The application provides a smart grid slice allocation method based on reinforcement learning, comprising:
classifying the power services of the smart grid according to service type;
mapping each class to a different slice;
constructing a reinforcement learning model for smart grid slicing according to the service indicators of the smart grid, completing the allocation of smart grid slices through the reinforcement learning model, and realizing resource scheduling and management for the smart grid.
Preferably, classifying the power services of the smart grid according to service type comprises:
dividing the power services of the smart grid into a control class, an information acquisition class, and a mobile application class according to service type.
Preferably, mapping each class to a different slice comprises:
mapping the control class to a uRLLC slice, the information acquisition class to an mMTC slice, and the mobile application class to an eMBB slice.
Preferably, the reinforcement learning model of the smart grid is constructed using the Q-learning algorithm.
Preferably, constructing the reinforcement learning model for smart grid slicing comprises: constructing reinforcement learning models for the radio access side and the core network side respectively.
Preferably, constructing the reinforcement learning model for smart grid slicing comprises:
defining the state space as S = {s1, s2, ..., sn};
defining the action space A as A = {a1, a2, ..., an};
defining the reward function as R(s, a), with P(s, s') denoting the transition probability from state s to state s';
at any time, the slice controller in state s can select an action a, receive an immediate reward Rt, and transition to the next state s'; in the Q-learning algorithm, each such step updates Q(s, a) toward the immediate reward plus the discounted maximum Q value of the next state,
where α is the learning rate and the return is the discounted accumulation of all immediate rewards Rt,
and, by updating the Q values over a sufficiently long period and adjusting the values of α and γ, Q(s, a) is guaranteed to converge eventually to its value under the optimal policy.
The application also provides a smart grid slice allocation device based on reinforcement learning, comprising:
a classification unit, which classifies the power services of the smart grid according to service type;
a class-to-slice mapping unit, which maps each class to a different slice;
a model construction unit, which constructs a reinforcement learning model for smart grid slicing according to the service indicators of the smart grid and, through the reinforcement learning model, completes the allocation of smart grid slices and realizes resource scheduling and management for the smart grid.
The application provides a smart grid slice allocation method based on reinforcement learning. By classifying the service types of the smart grid, mapping the classes to different slices, and allocating the slices through the constructed reinforcement learning model, the method addresses the problem of integrating reinforcement-learning-based 5G network slicing technology with the smart grid.
Brief description of the drawings
Fig. 1 is a flow diagram of a smart grid slice allocation method based on reinforcement learning provided by an embodiment of the present application;
Fig. 2 is a schematic diagram of the slicing architecture in the smart grid scenario according to an embodiment of the present application;
Fig. 3 is a schematic diagram of the relationship between slices and the three classes of smart grid services according to an embodiment of the present application;
Fig. 4 shows the QoS indicators of typical smart grid service slices according to an embodiment of the present application;
Fig. 5 shows the mapping from the smart grid slice resource management mechanism to RL according to an embodiment of the present application;
Fig. 6 is a schematic diagram of a smart grid slice allocation device based on reinforcement learning provided by an embodiment of the present application.
Detailed description
Many specific details are set forth in the following description to facilitate a full understanding of the application. However, the application can be implemented in many ways other than those described here, and those skilled in the art can make similar generalizations without departing from the spirit of the application; the application is therefore not limited by the specific implementations disclosed below.
Referring to Fig. 1, which shows a smart grid slice allocation method based on reinforcement learning provided by an embodiment of the present application, the method is described in detail below.
Step S101: classify the power services of the smart grid according to service type.
First, the slicing architecture in the smart grid scenario on which the application is based is introduced, as shown in Fig. 2.
With the help of SDN technology, network slicing decouples the control plane from the data plane of the network and defines an open interface between them, which allows the network functions within a network slice to be defined flexibly. To meet the demands of a given class of service, a network slice contains only the network functions that support that specific service. Power services can be divided into three broad classes: control (e.g. distribution automation, precise load control), information acquisition (e.g. electricity consumption information acquisition, transmission line monitoring), and mobile application (e.g. intelligent inspection, mobile operation).
Step S102: map each class to a different slice.
Fig. 3 shows the relationship between the three slice types and the three classes of smart grid services. The control class is mapped to the uRLLC slice, the information acquisition class to the mMTC slice, and the mobile application class to the eMBB slice.
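Steps S101 and S102 amount to two lookups: from a service to its class, and from a class to its slice type. The following is a minimal illustrative Python sketch of that idea. The names `ServiceClass`, `SLICE_FOR_CLASS`, `EXAMPLE_SERVICES` and `slice_for_service` are hypothetical and do not appear in the application; the example services are the ones named in the description.

```python
from enum import Enum


class ServiceClass(Enum):
    """The three power service classes of step S101."""
    CONTROL = "control"
    INFORMATION_ACQUISITION = "information acquisition"
    MOBILE_APPLICATION = "mobile application"


# Step S102: each service class corresponds to one 5G slice type.
SLICE_FOR_CLASS = {
    ServiceClass.CONTROL: "uRLLC",
    ServiceClass.INFORMATION_ACQUISITION: "mMTC",
    ServiceClass.MOBILE_APPLICATION: "eMBB",
}

# Step S101: example services from the description, grouped by class.
EXAMPLE_SERVICES = {
    "distribution automation": ServiceClass.CONTROL,
    "precise load control": ServiceClass.CONTROL,
    "electricity consumption information acquisition": ServiceClass.INFORMATION_ACQUISITION,
    "transmission line monitoring": ServiceClass.INFORMATION_ACQUISITION,
    "intelligent inspection": ServiceClass.MOBILE_APPLICATION,
    "mobile operation": ServiceClass.MOBILE_APPLICATION,
}


def slice_for_service(name: str) -> str:
    """Return the slice type that carries the named power service."""
    return SLICE_FOR_CLASS[EXAMPLE_SERVICES[name]]
```

For example, `slice_for_service("transmission line monitoring")` looks up the information acquisition class and returns the mMTC slice type.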
Step S103: construct a reinforcement learning model for smart grid slicing according to the service indicators of the smart grid, complete the allocation of smart grid slices through the reinforcement learning model, and realize resource scheduling and management for the smart grid.
Fig. 4 gives the QoS (quality of service) indicators of typical smart grid service slices. The application considers a service plane, an orchestration and control plane, and a data plane. The service plane divides services into elastic applications and real-time applications. Elastic applications can tolerate relatively large delays and have no minimum bandwidth requirement; examples include vehicle access to distributed power sources, video surveillance, and user metering. Real-time applications require the network to provide a minimum level of performance guarantee. The main representative type is the uRLLC slice service, with typical examples being distribution automation and emergency communication. The data plane stores the data generated by interaction between power equipment and the physical layer.
This application focuses on the orchestration and control plane. It introduces an access network SDN (software-defined networking) controller and a core network SDN controller, which are responsible respectively for managing and coordinating the network functions (NFs) of the access network and the core network (e.g. service migration and deployment). They act as two distinct agents that can communicate with each other and jointly complete coordination tasks. Given prior knowledge from the service plane, such as service type, channel conditions, and user demand, the slice orchestration controller of the orchestration and control plane partitions the sliced network into radio access network (RAN) side slices and core network (CN) side slices. The RAN-side and CN-side network slices are managed by their respective SDN controllers, each of which executes the algorithm for its own network side, namely the reinforcement-learning-based smart grid slice allocation method proposed in this application.
The reinforcement learning models proposed in this application for the RAN side and the CN side are described below.
(1) RAN-side radio resource slicing
Given a series of existing slices χ1, χ2, ..., χn, the set of existing slices is denoted by the vector χ = {χ1, χ2, ..., χn}; these slices share an aggregate bandwidth B. There is also a series of service flows, denoted by the vector D = {d1, d2, ..., dm}; D is in effect the set of smart grid service flows. Because the smart grid is multi-service, each slice service must satisfy different QoS requirements. However, which smart grid service a given flow belongs to is not known in advance, and the real-time demand of services in the smart grid scenario varies unpredictably. Each di (i ∈ M = {1, 2, ..., m}) is therefore taken to follow a specific traffic model.
First, the system state space, action space, and reward function of the RAN-side network must be defined. The interaction between the slice controller and the wireless environment is represented by the tuple [S, A, P(s, s'), R(s, a)], where S is the set of possible states, A is the set of possible actions, P(s, s') is the transition probability from state s to state s', and R(s, a) is the reward associated with triggering action a in state s, which is fed back to the slice controller. The mapping from radio-access-side slice resource management to RL is as follows.
a. State space:
The state space is defined as S = {s_slice}, where s_slice is a vector indicating the state of all slices currently available to carry the relevant power services; its n-th element describes the state of the n-th slice.
b. Action space:
Facing a time-varying, unknown service traffic model, the reinforcement learning agent must allocate suitable slice resources to the corresponding power services. The agent decides which action to execute at the next time step according to the current slice state and the reward function. The action space is defined as A = {a_bandwidth}, where a_bandwidth denotes the agent allocating suitable bandwidth to each logically independent slice so that it can carry its corresponding service.
Because network slicing shares network resources among virtual networks, the virtual network slices must be isolated from one another, so that congestion or failure caused by insufficient resources on a slice carrying its current service does not affect the other slices. To guarantee slice isolation and to maximize the utility of resource allocation, each slice is therefore restricted to carrying at most one type of service,
and the corresponding slice-to-service assignment variable is restricted to binary values.
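The isolation constraint can be checked mechanically. The sketch below assumes, purely for illustration, that the assignment is stored as a matrix `x` in which `x[n][i]` is 1 when slice n carries service i and 0 otherwise; this layout and the name `satisfies_isolation` are hypothetical and are not taken from the application.

```python
def satisfies_isolation(x):
    """Check the two constraints stated in the text on an assignment matrix.

    x[n][i] is assumed to be 1 if slice n carries service i, else 0.
    Returns True only if every entry is binary and every slice (row)
    carries at most one service.
    """
    for row in x:
        if any(entry not in (0, 1) for entry in row):
            return False  # assignment variable must be binary
        if sum(row) > 1:
            return False  # a slice may carry at most one service
    return True


# Three slices, three services: slice 0 -> service 1, slice 1 -> service 0,
# slice 2 idle.
valid_assignment = [[0, 1, 0], [1, 0, 0], [0, 0, 0]]
# Slice 0 would carry two services at once, which breaks isolation.
invalid_assignment = [[1, 1, 0], [0, 0, 1], [0, 0, 0]]
```

A slice that carries nothing (an all-zero row) is allowed, since the constraint is "at most one" rather than "exactly one".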
c. Reward function
After the agent assigns a specific slice to a smart grid service, it obtains a comprehensive gain, which is taken as the reward of the system. Control-class power services place very strict requirements on communication latency and bit error rate: a communication failure or error may affect the execution of grid control and cause the grid to malfunction. Some mobile-application-class services (e.g. inspection video backhaul, HD video playback) need a guaranteed transmission rate and place higher demands on communication bandwidth. Power supply reliability means a continuous, sufficient, high-quality power supply. For example, when reliability reaches 99.999% ("five nines"), the average annual outage time per customer in the region is about 5 minutes; when it reaches 99.9999% ("six nines"), the average annual outage time per customer falls to about 30 seconds. Because spectrum resources on the RAN side are limited, an optimal policy should be chosen when allocating slices so as to satisfy users' QoS requirements as far as possible.
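The outage figures quoted above follow directly from the reliability percentages. The short Python sketch below reproduces the arithmetic under the assumption of a 365-day year; the names `MINUTES_PER_YEAR` and `annual_outage_minutes` are introduced here for illustration only.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years


def annual_outage_minutes(reliability: float) -> float:
    """Expected outage time per year, in minutes, for a given reliability."""
    return (1.0 - reliability) * MINUTES_PER_YEAR


five_nines = annual_outage_minutes(0.99999)   # roughly 5.3 minutes per year
six_nines = annual_outage_minutes(0.999999)   # roughly 0.53 minutes per year
six_nines_seconds = six_nines * 60            # roughly 32 seconds per year
```

Each additional nine cuts the permitted outage time by a factor of ten, which is why control-class services are the most demanding to carry.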
The downlink is mainly considered, with spectral efficiency (SE) and delay as the evaluation metrics. The spectral efficiency of the system is defined in terms of the rates actually delivered to users.
By Shannon's formula, R = b·log2(1 + (g_BS→UE·P)/σ²), the actual rate from the base station (BS) to a user can be obtained, where g_BS→UE is the channel state information (CSI) between the base station and the device, which follows Rayleigh fading.
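As a rough numerical illustration of Shannon's formula, the following Python sketch computes the rate for a given channel gain and draws a random gain. The text does not define b, P or σ²; they are assumed here to be the allocated bandwidth, the transmit power and the noise power, as is conventional. The function names and the numeric values are hypothetical. The sampler relies on the standard fact that the power gain of a Rayleigh-fading channel is exponentially distributed.

```python
import math
import random


def shannon_rate(bandwidth_hz, channel_gain, tx_power_w, noise_power_w):
    """R = b * log2(1 + g * P / sigma^2), in bit/s."""
    snr = channel_gain * tx_power_w / noise_power_w
    return bandwidth_hz * math.log2(1.0 + snr)


def sample_rayleigh_power_gain(mean_gain, rng):
    """Draw a power gain for a Rayleigh-fading channel.

    Under Rayleigh fading the amplitude is Rayleigh distributed, so the
    power gain is exponentially distributed with the given mean.
    """
    return rng.expovariate(1.0 / mean_gain)


# Hypothetical numbers: 1 MHz of bandwidth, 1 W transmit power,
# 1e-10 W noise power, mean channel gain 1.5e-9.
rng = random.Random(0)
g = sample_rayleigh_power_gain(1.5e-9, rng)
rate = shannon_rate(1e6, g, 1.0, 1e-10)
```

With the mean gain itself, the signal-to-noise ratio is 15, so the rate is 1e6 × log2(16), i.e. about 4 Mbit/s.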
To describe users' QoS requirements, a utility function is introduced, i.e. a curve that maps the bandwidth allocated to a slice service to the performance perceived by the user. Here it is assumed that the services carried by slices can be divided into elastic applications and real-time applications.
(a) Elastic applications
Such applications have no minimum bandwidth requirement, because they can tolerate relatively large delays. The elastic traffic utility is modeled by a function of the allocated bandwidth
in which k is a tunable parameter that determines the shape of the utility function and ensures that the utility approaches 1 when the maximum requested bandwidth is received. However, even with very high bandwidth, the user satisfaction of such an application hardly reaches 1. Therefore, even when network bandwidth is plentiful, the bandwidth allocated to this type of application should not exceed the maximum bandwidth b_max.
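The elastic utility formula itself is not reproduced in the text. Purely as an assumption, the sketch below uses one exponential form that has been used in the utility-based bandwidth allocation literature, U_e(b) = 1 - exp(-k·b/b_max); it matches the properties described (shaped by k, close to but never equal to 1 at b_max), but it should not be read as the formula of the application. The name `elastic_utility` and the default `k = 4.6` are hypothetical.

```python
import math


def elastic_utility(b, b_max, k=4.6):
    """Assumed elastic utility U_e(b) = 1 - exp(-k * b / b_max).

    The allocation is clamped to [0, b_max], reflecting the rule that an
    elastic application should never be given more than b_max.
    """
    b = min(max(b, 0.0), b_max)
    return 1.0 - math.exp(-k * b / b_max)


u_half = elastic_utility(5.0, 10.0)    # utility at half the maximum bandwidth
u_full = elastic_utility(10.0, 10.0)   # close to, but below, 1
```

With k = 4.6 the utility at b_max is about 0.99, so allocating bandwidth beyond b_max would add essentially nothing.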
(b) Real-time applications
Traffic of this type requires the network to provide a minimum level of performance guarantee. If the allocated bandwidth falls below a certain threshold, QoS becomes unacceptable. Real-time applications are modeled by a utility function
in which k1 and k2 are tunable parameters that determine the shape of the utility function.
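The real-time utility formula is likewise not reproduced in the text. As an assumption only, the sketch below uses the form U_rt(b) = 1 - exp(-k1·b²/(k2 + b)), which has appeared in the utility-based bandwidth allocation literature and shows the threshold behaviour described: utility stays low for small allocations and rises steeply beyond them. The name `realtime_utility` and the default parameter values are hypothetical.

```python
import math


def realtime_utility(b, k1=1.0, k2=10.0):
    """Assumed real-time utility U_rt(b) = 1 - exp(-k1 * b^2 / (k2 + b))."""
    if b <= 0.0:
        return 0.0
    return 1.0 - math.exp(-k1 * b * b / (k2 + b))


u_starved = realtime_utility(1.0)    # far below the useful range
u_served = realtime_utility(10.0)    # comfortably above the threshold
```

With these parameters an allocation of 1 unit yields a utility below 0.1, while 10 units yield more than 0.99, which is the kind of sharp drop the text associates with unacceptable QoS.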
The reward of the learning agent is defined as:
R = λ·SE + μ·Ue + ξ·Urt
where λ, μ, and ξ are the weights of SE, Ue, and Urt, respectively.
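The reward is a plain weighted sum, as the following minimal Python sketch shows. The function name `slice_reward` and the default weights of 1.0 are assumptions; the application does not state values for λ, μ and ξ.

```python
def slice_reward(se, u_e, u_rt, lam=1.0, mu=1.0, xi=1.0):
    """R = lambda * SE + mu * U_e + xi * U_rt."""
    return lam * se + mu * u_e + xi * u_rt


# Hypothetical values: weighting real-time utility most heavily.
example_reward = slice_reward(2.0, 0.5, 0.25, lam=0.5, mu=2.0, xi=4.0)
```

Raising ξ relative to λ and μ steers the agent toward protecting real-time services at the expense of spectral efficiency, and vice versa.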
From a mathematical point of view, the problem can therefore be formulated subject to the following condition:
di (i ∈ M = {1, 2, ..., m}) follows a specific traffic model (*)
The key difficulty in solving problem (*) is that, because of the traffic model, the variation in service demand is non-stationary when nothing is known in advance; that is, the variation of real-time service demand in the smart grid scenario is unknown.
(2) Core network slicing based on priority scheduling
Similarly, if computing resources are virtualized into VNFs, the problem of allocating computing resources to each VNF can be solved in the same way as radio resource slicing. This part therefore discusses another important problem: priority-based core network slicing for generic VNFs. The mapping used here differs slightly from that for radio resource slicing, which illustrates the flexibility of RL. Likewise, the interaction between the slice controller and the core network side is represented by the four-tuple [S, A, P(s, s'), R(s, a)]; the mapping of the RL elements to this slicing problem is defined below.
a. State space
On the core network side there are related service function chains (SFCs) that have the same basic function but consume different numbers of computing processing units (CPUs) and produce different results (e.g. service queuing time). For example, based on service value or other smart-grid-related service characteristics, service flows can be divided into three classes (A, B, and C), with priority decreasing from A to C. The priority-based scheduling rule is defined as follows: SFC I processes class A flows with priority; SFC II treats class A and class B flows equally but gives class C flows the lowest priority; SFC III treats all flows equally. Priority-based scheduling gives rise to service queuing time.
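The three scheduling rules can be illustrated with a stable sort on a per-SFC rank. This Python sketch is illustrative only: the names `SFC_RANK` and `service_order` are hypothetical, and it assumes that SFC I treats classes B and C alike, which the text does not state.

```python
# Lower rank is served first; equal ranks are served in arrival order.
SFC_RANK = {
    "SFC I": {"A": 0, "B": 1, "C": 1},    # class A first (B vs C: assumed equal)
    "SFC II": {"A": 0, "B": 0, "C": 1},   # A and B equal, C last
    "SFC III": {"A": 0, "B": 0, "C": 0},  # all classes equal
}


def service_order(sfc, arrivals):
    """Return flow ids in the order the given SFC would serve them.

    arrivals is a list of (flow_id, service_class) in arrival order.
    sorted() is stable, so flows of equal rank keep their arrival order.
    """
    ranked = sorted(arrivals, key=lambda flow: SFC_RANK[sfc][flow[1]])
    return [flow_id for flow_id, _ in ranked]


arrivals = [("f1", "C"), ("f2", "B"), ("f3", "A")]
order_sfc1 = service_order("SFC I", arrivals)
order_sfc2 = service_order("SFC II", arrivals)
order_sfc3 = service_order("SFC III", arrivals)
```

The same arrivals are served in a different order by each chain, which is what makes the queuing time of a flow depend on the SFC it is scheduled to.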
The state space can be defined as T = {Tq}, where Tq is a vector characterizing the queuing state of each element of the service set D. When N CPUs are used to compute service di, the i-th element is Tqi, the queuing time of service di, where i ∈ M = {1, 2, ..., m}.
b. Action space
The number of CPUs that each SFC ultimately uses depends on the number of service flows it processes. With a limited number of CPUs, each type of service flow must be scheduled to an appropriate SFC so as to yield an acceptable queuing time. When processing service di, a suitable number of CPUs N_CPU must therefore be selected on the core network side. The action space is accordingly defined as A_CPU = {a_CPU}, where a_CPU denotes the number of CPUs selected to execute the computation for an arriving service di (i ∈ M = {1, 2, ..., m}).
c. Reward function
To define the reward function, a utility function U is first needed to characterize the delay sensitivity of the current service; a new metric, the "network request value" function W, is then defined to characterize the priority of the service.
As noted above, elastic applications and real-time applications are each described by a utility function,
which characterizes the QoS requirement of service di. Compared with the RAN side, the difference is that the independent variable becomes n, the number of CPUs required on the core network side to compute service di. This, however, reflects only the QoS requirements of the different services. Because computing resources are limited, a reasonable scheduling rule is needed after computing resources are allocated to determine which service is processed first; the "network request value" function W is therefore introduced to characterize the priority of a service. For any application service di, the network request value is defined as:
Wi = 2^p·Ui
where p is the priority level of service di and Ui is the utility of either an elastic application or a real-time application, i.e. Ui ∈ {Ue, Urt}. The weight 2^p of a service request indicates the importance of that request relative to the others. The reward function is defined as:
R = Wi
The formula above gives only the current priority level of a single service di. To obtain the priority queuing of a series of services, the rewards must be accumulated and the long-term reward maximized.
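A small numerical illustration of the network request value follows. The function name `network_request_value` and the example numbers are hypothetical, and the priority level p is assumed here to be a non-negative integer.

```python
def network_request_value(priority, utility):
    """W_i = 2^p * U_i, with p the priority level and U_i the utility."""
    return (2 ** priority) * utility


# A high-priority request with modest utility ...
w_control = network_request_value(2, 0.4)
# ... versus a low-priority request with high utility.
w_metering = network_request_value(0, 0.9)
```

Because the weight doubles with each priority level, the high-priority request has the larger value here even though its utility is lower.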
Fig. 5 shows the mapping from the smart grid slice resource management mechanism to RL.
The slice allocation method based on reinforcement learning that the application proposes under the above model is introduced next.
A Q-learning-based reinforcement learning algorithm for the RAN side and the CN side. The state set, action set, and reward function were expressed slightly differently for the RAN side and the CN side above. Given the proposed mapping from RL to the RAN and the CN, the Q-learning algorithm applies generally; for convenience, this section uses a unified notation in which the state space is S = {s1, s2, ..., sn}, the action space is A = {a1, a2, ..., an}, the reward function is R(s, a), and P(s, s') denotes the transition probability from state s to state s'.
The ultimate goal of the slice controller is to find the optimal slicing policy π*, a mapping from the state set to the action set that maximizes the expected long-term discounted reward of every state.
The long-term discounted reward of state s is the discounted sum of the rewards obtained along the state trajectory, and is given by:
R(s, π(s)) + γR(s1, π(s1)) + γ²R(s2, π(s2)) + ...
where γ is the discount factor (0 < γ < 1), which determines the present value of future rewards. The optimization objective in formula (*) is the state value function of a policy, i.e. the expected long-term discounted reward obtained by starting in state s and following that policy.
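For a finite reward trajectory, the discounted sum R0 + γR1 + γ²R2 + ... can be computed by working backwards from the last reward. The Python sketch below is illustrative; the function name `discounted_return` and the example rewards are assumptions.

```python
def discounted_return(rewards, gamma):
    """Compute R_0 + gamma * R_1 + gamma^2 * R_2 + ... for a finite list."""
    total = 0.0
    # Work backwards so each step is one multiply and one add.
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total


# Three unit rewards with gamma = 0.5: 1 + 0.5 + 0.25.
example_return = discounted_return([1.0, 1.0, 1.0], 0.5)
```

A smaller γ makes the controller short-sighted, since later rewards contribute less to the sum; a γ close to 1 weights future rewards almost as much as the immediate one.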
According to Bellman's optimality criterion, at least one optimal policy exists in a single-environment setting. The state value function of the optimal policy is therefore the largest expected long-term discounted reward attainable from each state.
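In standard reinforcement learning notation, which may differ from the notation of the original filing, the state value function and the Bellman optimality condition referred to above are commonly written as follows. The symbols V^π and V^{π*} are assumptions introduced here for illustration; γ, R, P, and π are as defined in the text, with the transition probability P(s, s') understood to be taken under the chosen action.

```latex
V^{\pi}(s)
  = \mathbb{E}\!\left[\sum_{t=0}^{\infty} \gamma^{t}\, R\big(s_t, \pi(s_t)\big) \,\middle|\, s_0 = s\right]
  = R\big(s, \pi(s)\big) + \gamma \sum_{s' \in S} P(s, s')\, V^{\pi}(s')

V^{\pi^{*}}(s)
  = \max_{a \in A} \left[ R(s, a) + \gamma \sum_{s' \in S} P(s, s')\, V^{\pi^{*}}(s') \right]
```

The second equation makes the dependence on P(s, s') explicit, which is why a method that does not require the transition probabilities is attractive when they are hard to obtain.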
The state transition probability depends on many factors, such as traffic load, service arrival and departure rates, and the decision algorithm, so it may not be easy to obtain on either the radio side or the core network side. Model-free reinforcement learning is therefore well suited to deriving the optimal policy, because it does not require the expected reward or the state transition probabilities to be known as prior knowledge. Among the various existing RL algorithms, Q-learning is selected.
Taking the RAN side as an example, the slice controller interacts with the wireless environment over very short discrete time periods. The action-value function (also called the Q value) of the state-action pair (s, π(s)) is denoted Q(s, π(s)) and is defined as the expected long-term discounted reward of state s under policy π. The goal is to find an optimal policy that maximizes the Q value of every state s.
According to the Q-learning algorithm, the slice controller can learn the optimal Q values iteratively from the available information. At any time, the slice controller in state s can select an action a, upon which it receives an immediate reward Rt and transitions to the next state s'. Each such step updates Q(s, a) toward the immediate reward plus the discounted maximum Q value of the next state,
where α is the learning rate and the return is the discounted accumulation of all immediate rewards Rt.
By updating the Q values over a sufficiently long period and adjusting the values of α and γ, Q(s, a) is guaranteed to converge eventually to its value under the optimal policy.
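In its standard tabular form, the Q-learning update is conventionally written Q(s, a) ← Q(s, a) + α[Rt + γ·max_a' Q(s', a') - Q(s, a)]. The following minimal Python sketch implements that standard update; it is illustrative and is not taken from the application. The function name `q_update`, the dictionary-based Q table, the state and action labels, and the default values of `alpha` and `gamma` are all assumptions.

```python
from collections import defaultdict


def q_update(Q, s, a, reward, s_next, actions, alpha=0.1, gamma=0.9):
    """Apply one standard tabular Q-learning update and return the new Q(s, a).

    Q maps (state, action) pairs to values; missing pairs default to 0.
    """
    best_next = max(Q[(s_next, a_next)] for a_next in actions)
    td_target = reward + gamma * best_next
    Q[(s, a)] += alpha * (td_target - Q[(s, a)])
    return Q[(s, a)]


# Q values start at 0 (defaultdict(float) returns 0.0 for unseen pairs).
Q = defaultdict(float)
actions = ["more_bandwidth", "less_bandwidth"]
first = q_update(Q, "s0", "more_bandwidth", 1.0, "s1", actions,
                 alpha=0.5, gamma=0.9)
```

In the first update all Q values are still 0, so Q(s0, more_bandwidth) moves halfway (α = 0.5) from 0 toward the immediate reward of 1.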
The complete slicing policy proceeds as follows. Initially, the Q values are set to 0. Before the Q-learning algorithm is applied, the slice controller performs an initial slice allocation to the different slices based on the estimated power service traffic demand of each slice, in order to initialize the state of the different slices. Existing radio resource slicing solutions allocate radio resources to different slices using bandwidth-based or resource-based provisioning.
Because Q-learning is an online iterative learning algorithm, it performs two different types of operation. In exploration mode, the slice controller randomly selects a possible action in order to improve its future decisions. In exploitation mode, by contrast, the slice controller prefers actions that it has tried in the past and found to be effective. The slice controller in state s is assumed to explore with probability ε and to exploit the previously stored Q values with probability 1 - ε. Not every action is feasible in every state: to maintain isolation between slices, the slice controller must ensure that the same physical resource block (PRB) is never allocated to two different slices (RAN side).
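The exploration and exploitation rule, restricted to feasible actions, can be sketched as ε-greedy selection. This Python sketch is illustrative only: the names `prbs_disjoint` and `choose_action`, the representation of an allocation as a mapping from slice name to a set of PRB indices, and the example values are assumptions.

```python
import random


def prbs_disjoint(allocation):
    """True if no PRB index is assigned to more than one slice.

    allocation maps a slice name to the set of PRB indices it receives.
    """
    seen = set()
    for prbs in allocation.values():
        if seen & set(prbs):
            return False
        seen |= set(prbs)
    return True


def choose_action(Q, s, feasible_actions, epsilon, rng):
    """Epsilon-greedy choice among feasible actions only.

    With probability epsilon explore (random feasible action); otherwise
    exploit the stored Q values (unseen pairs count as 0).
    """
    if rng.random() < epsilon:
        return rng.choice(feasible_actions)
    return max(feasible_actions, key=lambda a: Q.get((s, a), 0.0))


Q = {("s", "a1"): 0.2, ("s", "a2"): 0.7}
greedy = choose_action(Q, "s", ["a1", "a2"], 0.0, random.Random(1))
explored = choose_action(Q, "s", ["a1", "a2"], 1.0, random.Random(1))
```

Infeasible actions, such as allocations for which `prbs_disjoint` is false, are filtered out before selection, so neither exploration nor exploitation can break slice isolation.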
Corresponding to the method provided by the application, the application also provides a smart grid slice allocation device 600 based on reinforcement learning, comprising:
a classification unit 610, which classifies the power services of the smart grid according to service type;
a class-to-slice mapping unit 620, which maps each class to a different slice;
a model construction unit 630, which constructs a reinforcement learning model for smart grid slicing according to the service indicators of the smart grid and, through the reinforcement learning model, completes the allocation of smart grid slices and realizes resource scheduling and management for the smart grid.
The application provides a smart grid slice allocation method based on reinforcement learning. By classifying the service types of the smart grid, mapping the classes to different slices, and allocating the slices through the constructed reinforcement learning model, the method addresses the problem of integrating reinforcement-learning-based 5G network slicing technology with the smart grid.
The above embodiments merely illustrate the technical solution of the invention and do not limit it. Although the invention has been described in detail with reference to the above embodiments, those of ordinary skill in the art can still modify the specific embodiments of the invention or make equivalent replacements, and any such modification or equivalent replacement that does not depart from the spirit and scope of the invention falls within the scope of the pending claims of the invention.

Claims (7)

1. A smart grid slice allocation method based on reinforcement learning, characterized by comprising:
classifying the power services of the smart grid according to service type;
mapping the classes to different slices;
constructing a reinforcement learning model for smart grid slices according to the service indicators of the smart grid, and completing the allocation of smart grid slices through the reinforcement learning model, thereby realizing resource scheduling management of the smart grid.
2. The method according to claim 1, characterized in that classifying the power services of the smart grid according to service type comprises:
dividing the power services of the smart grid into a control class, an information collection class, and a mobile application class according to service type.
3. The method according to claim 1, characterized in that mapping the classes to different slices comprises:
mapping the control class to a uRLLC slice, the information collection class to an mMTC slice, and the mobile application class to an eMBB slice.
4. The method according to claim 1, characterized in that constructing the reinforcement learning model of the smart grid specifically comprises constructing the reinforcement learning model of the smart grid using the Q-learning algorithm.
5. The method according to claim 1, characterized in that constructing the reinforcement learning model for smart grid slices comprises: constructing reinforcement learning models for the radio access side and the core network side respectively.
6. The method according to claim 1 or 4, characterized in that constructing the reinforcement learning model for smart grid slices comprises:
the state space is defined as [formula missing in source];
the action space is defined as [formula missing in source];
the reward function is [formula missing in source], and [symbol missing in source] denotes the transition probability from state s to state s';
at any time, the slice controller in state s can choose an action a, receive an instant reward, and at the same time transition to the next state s'; the process of the Q-learning algorithm can be expressed by the following update formula: [formula missing in source],
where α is the learning rate and [symbol missing in source] is the discounted accumulation of all instant rewards;
by updating the Q value over a sufficiently long period and adjusting the values of α and γ, Q(s, a) can be guaranteed to converge eventually to its value under the optimal policy, that is, [formula missing in source].
7. A smart grid slice allocation device based on reinforcement learning, characterized by comprising:
a classification unit, which classifies the power services of the smart grid according to service type;
a class-to-slice mapping unit, which maps the classes to different slices;
a model construction unit, which constructs a reinforcement learning model for smart grid slices according to the service indicators of the smart grid and, through the reinforcement learning model, completes the allocation of smart grid slices, thereby realizing resource scheduling management of the smart grid.
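The Q-learning procedure referred to in claims 4 and 6 can be illustrated with a small tabular sketch. Because the claim's own formulas are not reproduced in this text, the code uses the standard textbook update $Q(s,a) \leftarrow Q(s,a) + \alpha\,[r + \gamma \max_{a'} Q(s',a') - Q(s,a)]$, which is an assumption and may differ in form from the patent's expression. The states, actions, reward, and transition model in `toy_step` are hypothetical stand-ins for a slice controller's environment, and the epsilon-greedy exploration is likewise an illustrative choice not taken from the claims.

```python
import random
from collections import defaultdict

# Hypothetical toy problem (not from the patent): the state is the observed
# demand level of a slice, the action is the index of the resource share
# the slice controller assigns to it.
STATES = [0, 1, 2]
ACTIONS = [0, 1, 2]


def q_update(q, s, a, reward, s_next, alpha, gamma):
    """Apply one standard Q-learning update and return the new Q(s, a)."""
    best_next = max(q[(s_next, a2)] for a2 in ACTIONS)
    q[(s, a)] += alpha * (reward + gamma * best_next - q[(s, a)])
    return q[(s, a)]


def choose_action(q, s, epsilon, rng):
    """Epsilon-greedy selection: explore with probability epsilon."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: q[(s, a)])


def toy_step(s, a, rng):
    """Assumed environment: reward +1 when the share matches the demand,
    -1 otherwise; the next demand level is drawn uniformly at random."""
    reward = 1.0 if a == s else -1.0
    return reward, rng.choice(STATES)


def train(steps=10000, alpha=0.1, gamma=0.9, epsilon=0.2, seed=0):
    """Run the controller loop: act, observe reward and next state, update Q."""
    rng = random.Random(seed)
    q = defaultdict(float)
    s = rng.choice(STATES)
    for _ in range(steps):
        a = choose_action(q, s, epsilon, rng)
        reward, s_next = toy_step(s, a, rng)
        q_update(q, s, a, reward, s_next, alpha, gamma)
        s = s_next
    return q


def greedy_policy(q):
    """Read off the action with the highest Q value in each state."""
    return {s: max(ACTIONS, key=lambda a: q[(s, a)]) for s in STATES}
```

Calling `greedy_policy(train())` returns the learned state-to-action mapping for this toy setting. In the setting of claim 5, one such learner would presumably be kept for the radio access side and another for the core network side, each with its own state, action, and reward definitions.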
CN201910452242.7A 2019-05-28 2019-05-28 Smart grid slice distribution method and device based on reinforcement learning Active CN110381541B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910452242.7A CN110381541B (en) 2019-05-28 2019-05-28 Smart grid slice distribution method and device based on reinforcement learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910452242.7A CN110381541B (en) 2019-05-28 2019-05-28 Smart grid slice distribution method and device based on reinforcement learning

Publications (2)

Publication Number Publication Date
CN110381541A (en) 2019-10-25
CN110381541B (en) 2023-12-26

Family

ID=68248856

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910452242.7A Active CN110381541B (en) 2019-05-28 2019-05-28 Smart grid slice distribution method and device based on reinforcement learning

Country Status (1)

Country Link
CN (1) CN110381541B (en)

Cited By (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111292570A (en) * 2020-04-01 2020-06-16 广州爱浦路网络技术有限公司 Cloud 5GC communication experiment teaching system and teaching method based on project type teaching
CN111711538A (en) * 2020-06-08 2020-09-25 中国电力科学研究院有限公司 Power network planning method and system based on machine learning classification algorithm
CN111726811A (en) * 2020-05-26 2020-09-29 国网浙江省电力有限公司嘉兴供电公司 Slice resource allocation method and system for cognitive wireless network
CN111953510A (en) * 2020-05-15 2020-11-17 中国电力科学研究院有限公司 Smart grid slice wireless resource allocation method and system based on reinforcement learning
CN112365366A (en) * 2020-11-12 2021-02-12 广东电网有限责任公司 Micro-grid management method and system based on intelligent 5G slice
CN112383427A (en) * 2020-11-12 2021-02-19 广东电网有限责任公司 5G network slice deployment method and system based on IOTIPS fault early warning
CN112737813A (en) * 2020-12-11 2021-04-30 广东电力通信科技有限公司 Power business management method and system based on 5G network slice
CN113225759A (en) * 2021-05-28 2021-08-06 广东电网有限责任公司广州供电局 Network slice safety and decision management method for 5G smart power grid
CN113255347A (en) * 2020-02-10 2021-08-13 阿里巴巴集团控股有限公司 Method and equipment for realizing data fusion and method for realizing identification of unmanned equipment
CN113316188A (en) * 2021-05-08 2021-08-27 北京科技大学 AI engine supporting access network intelligent slice control method and device
CN113329414A (en) * 2021-06-07 2021-08-31 深圳聚创致远科技有限公司 Smart power grid slice distribution method based on reinforcement learning
CN113630733A (en) * 2021-06-29 2021-11-09 广东电网有限责任公司广州供电局 Network slice distribution method and device, computer equipment and storage medium
CN113840333A (en) * 2021-08-16 2021-12-24 国网河南省电力公司信息通信公司 Power grid resource allocation method and device, electronic equipment and storage medium
CN114531403A (en) * 2021-11-15 2022-05-24 海盐南原电力工程有限责任公司 Power service network distinguishing method and system
CN115913966A (en) * 2022-12-06 2023-04-04 中国联合网络通信集团有限公司 Virtual network function deployment method and device, electronic equipment and storage medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102238631A (en) * 2011-08-17 2011-11-09 南京邮电大学 Method for managing heterogeneous network resources based on reinforcement learning
CN108965024A (en) * 2018-08-01 2018-12-07 重庆邮电大学 A kind of virtual network function dispatching method of the 5G network slice based on prediction
CN109495907A (en) * 2018-11-29 2019-03-19 北京邮电大学 A kind of the wireless access network-building method and system of intention driving
CN109600262A (en) * 2018-12-17 2019-04-09 东南大学 Resource self-configuring and self-organization method and device in URLLC transmission network slice

Cited By (22)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113255347A (en) * 2020-02-10 2021-08-13 阿里巴巴集团控股有限公司 Method and equipment for realizing data fusion and method for realizing identification of unmanned equipment
CN111292570A (en) * 2020-04-01 2020-06-16 广州爱浦路网络技术有限公司 Cloud 5GC communication experiment teaching system and teaching method based on project type teaching
CN111292570B (en) * 2020-04-01 2021-09-17 广州爱浦路网络技术有限公司 Cloud 5GC communication experiment teaching system and teaching method based on project type teaching
CN111953510A (en) * 2020-05-15 2020-11-17 中国电力科学研究院有限公司 Smart grid slice wireless resource allocation method and system based on reinforcement learning
CN111953510B (en) * 2020-05-15 2024-02-02 中国电力科学研究院有限公司 Smart grid slice wireless resource allocation method and system based on reinforcement learning
CN111726811A (en) * 2020-05-26 2020-09-29 国网浙江省电力有限公司嘉兴供电公司 Slice resource allocation method and system for cognitive wireless network
CN111726811B (en) * 2020-05-26 2023-11-14 国网浙江省电力有限公司嘉兴供电公司 Slice resource allocation method and system for cognitive wireless network
CN111711538B (en) * 2020-06-08 2021-11-23 中国电力科学研究院有限公司 Power network planning method and system based on machine learning classification algorithm
CN111711538A (en) * 2020-06-08 2020-09-25 中国电力科学研究院有限公司 Power network planning method and system based on machine learning classification algorithm
CN112365366A (en) * 2020-11-12 2021-02-12 广东电网有限责任公司 Micro-grid management method and system based on intelligent 5G slice
CN112383427A (en) * 2020-11-12 2021-02-19 广东电网有限责任公司 5G network slice deployment method and system based on IOTIPS fault early warning
CN112737813A (en) * 2020-12-11 2021-04-30 广东电力通信科技有限公司 Power business management method and system based on 5G network slice
CN113316188B (en) * 2021-05-08 2022-05-17 北京科技大学 AI engine supporting access network intelligent slice control method and device
CN113316188A (en) * 2021-05-08 2021-08-27 北京科技大学 AI engine supporting access network intelligent slice control method and device
CN113225759A (en) * 2021-05-28 2021-08-06 广东电网有限责任公司广州供电局 Network slice safety and decision management method for 5G smart power grid
CN113329414A (en) * 2021-06-07 2021-08-31 深圳聚创致远科技有限公司 Smart power grid slice distribution method based on reinforcement learning
CN113329414B (en) * 2021-06-07 2023-01-10 深圳聚创致远科技有限公司 Smart power grid slice distribution method based on reinforcement learning
CN113630733A (en) * 2021-06-29 2021-11-09 广东电网有限责任公司广州供电局 Network slice distribution method and device, computer equipment and storage medium
CN113840333A (en) * 2021-08-16 2021-12-24 国网河南省电力公司信息通信公司 Power grid resource allocation method and device, electronic equipment and storage medium
CN113840333B (en) * 2021-08-16 2023-11-10 国网河南省电力公司信息通信公司 Power grid resource allocation method and device, electronic equipment and storage medium
CN114531403A (en) * 2021-11-15 2022-05-24 海盐南原电力工程有限责任公司 Power service network distinguishing method and system
CN115913966A (en) * 2022-12-06 2023-04-04 中国联合网络通信集团有限公司 Virtual network function deployment method and device, electronic equipment and storage medium

Also Published As

Publication number Publication date
CN110381541B (en) 2023-12-26

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant