CN112615730B - Resource allocation method and device based on blockchain network slice proxy - Google Patents

Resource allocation method and device based on blockchain network slice proxy

Info

Publication number
CN112615730B
CN112615730B (application CN202011322819.1A)
Authority
CN
China
Prior art keywords
slice
tenant
target
resource request
resource
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202011322819.1A
Other languages
Chinese (zh)
Other versions
CN112615730A (en)
Inventor
魏翼飞
公雨
孙司远
张勇
郭达
宋梅
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing University of Posts and Telecommunications
Original Assignee
Beijing University of Posts and Telecommunications
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing University of Posts and Telecommunications filed Critical Beijing University of Posts and Telecommunications
Priority to CN202011322819.1A priority Critical patent/CN112615730B/en
Publication of CN112615730A publication Critical patent/CN112615730A/en
Application granted granted Critical
Publication of CN112615730B publication Critical patent/CN112615730B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/08 Configuration management of networks or network elements
    • H04L41/0893 Assignment of logical groups to network elements
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/08 Network architectures or network communication protocols for network security for authentication of entities
    • H04L63/0823 Network architectures or network communication protocols for network security for authentication of entities using certificates
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/10 Network architectures or network communication protocols for network security for controlling access to devices or network resources
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04W WIRELESS COMMUNICATION NETWORKS
    • H04W28/00 Network traffic management; Network resource management
    • H04W28/16 Central resource management; Negotiation of resources or communication parameters, e.g. negotiating bandwidth or QoS [Quality of Service]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04W WIRELESS COMMUNICATION NETWORKS
    • H04W72/00 Local resource management
    • H04W72/50 Allocation or scheduling criteria for wireless resources
    • H04W72/53 Allocation or scheduling criteria for wireless resources based on regulatory allocation policies
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L9/00 Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
    • H04L9/50 Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols using hash chains, e.g. blockchains or hash trees

Abstract

One or more embodiments of the present specification provide a resource allocation method and apparatus based on a blockchain network slice proxy. The method comprises the following steps: a network slice proxy receives a resource request sent by a target slice tenant; determines a target response slice tenant based on the resource request; deploys a blockchain-based smart contract based on the resource request, the target slice tenant, and the target response slice tenant; and allocates network resources based on the smart contract. This implementation performs resource allocation through the blockchain-based smart contract deployed from the resource request sent by the user, the target slice tenant and the target response slice tenant, together with a deep reinforcement learning algorithm, so as to minimize the delay and computation cost of the system while meeting the service quality of the network slice tenants.

Description

Resource allocation method and device based on blockchain network slice proxy
Technical Field
One or more embodiments of the present disclosure relate to the field of computer technologies and communications technologies, and in particular, to a resource allocation method and apparatus based on a blockchain network slice proxy.
Background
Network slicing has become one of the key technologies of the fifth-generation mobile network (5G). Different application scenarios have different requirements on network speed, delay and reliability; independent and flexible network slices abstract the physical network into virtual logical networks tailored to different scenarios, providing a strong guarantee of service quality for users with different requirements.
Network Slicing (NS) allows operators to run multiple virtual networks on one physical infrastructure and will play a key role in 5G deployments. 5G end-to-end network slicing refers to dividing the network on demand into a plurality of logical subnets that have different characteristics and are isolated from each other, and then flexibly allocating network resources. The combination of network slicing and blockchain is becoming an inevitable trend.
Network slicing faces a series of challenges in fifth-generation communication networks, such as real-time resource allocation for mobile networks. 5G network scenarios are diverse, complex and scalable. Due to the mobility of network slice tenants, tenant needs may change under such varying conditions, which leads to under- or over-provisioning of network slice resources and makes it difficult to minimize system latency and computation cost while meeting the quality-of-service requirements of network slice tenants.
Disclosure of Invention
In view of this, one or more embodiments of the present disclosure provide a resource allocation method, apparatus, device and storage medium based on a blockchain network slice proxy, so as to address the difficulty, in existing network resource allocation schemes, of minimizing the system delay and computation cost while meeting the service quality requirements of network slice tenants.
In view of the above, one or more embodiments of the present specification provide a resource allocation method based on a blockchain network slice proxy, including:
a network slice agent receives a resource request sent by a target slice tenant;
determining a target response slice tenant based on the resource request;
deploying a blockchain-based smart contract based on the resource request, the target slice tenant and the target response slice tenant;
and allocating the network resources based on the smart contract.
Further, before determining the target response slice tenant based on the resource request, the method further comprises:
determining whether a target slice tenant has an admission control certificate based on the resource request;
in response to determining that the target slice tenant has an admission control certificate, sending a resource request to each slice tenant having an admission control certificate.
Further, determining a target response slice tenant based on the resource request, comprising:
receiving the available resources and prices for the resource request sent by each responding slice tenant, wherein the prices are determined by each responding tenant through degree distribution and betweenness centrality calculations;
and determining a target response slice tenant based on a deep reinforcement learning algorithm, a service level agreement, each available resource and price, and a preset price threshold and a preset delay threshold.
Further, before deploying the blockchain-based smart contract based on the resource request, the target slice tenant and the target response slice tenant, the method further comprises:
in response to determining that the target slice tenant has paid the bill for the resource request, deploying the blockchain-based smart contract.
A resource allocation apparatus based on a blockchain network slice proxy, comprising:
the receiving unit is configured to receive a resource request sent by a target slice tenant by the network slice proxy;
a target response slice tenant determination unit configured to determine a target response slice tenant based on the resource request;
the smart contract deployment unit is configured to deploy the blockchain-based smart contract based on the resource request, the target slice tenant and the target response slice tenant;
and the network resource allocation unit is configured to allocate the network resources based on the smart contract.
Further, the apparatus further comprises:
an admission control certificate determination unit configured to determine whether a target slice tenant has an admission control certificate based on the resource request;
a resource request sending unit configured to send a resource request to each slice tenant having an admission control certificate in response to determining that the target slice tenant has the admission control certificate.
Further, the target response slice tenant determination unit is further configured to:
receiving the available resources and prices for the resource request sent by each responding slice tenant, wherein the prices are determined by each responding tenant through degree distribution and betweenness centrality calculations;
and determining a target response slice tenant based on a deep reinforcement learning algorithm, a service level agreement, each available resource and price, and a preset price threshold and a preset delay threshold.
Further, the smart contract deployment unit is further configured to: in response to determining that the target slice tenant has paid the bill for the resource request, deploy the blockchain-based smart contract.
An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor executes the program to implement the method for resource allocation based on a blockchain network slicing proxy as described above.
A non-transitory computer readable storage medium storing computer instructions for causing a computer to execute the resource allocation method based on blockchain network slicing proxy as described above.
As can be seen from the above, in the resource allocation method, apparatus, device and storage medium based on a blockchain network slice proxy provided in one or more embodiments of the present specification, resource allocation is performed through a blockchain-based smart contract deployed from the resource request sent by the user, the target slice tenant and the target response slice tenant, together with a deep reinforcement learning algorithm, so as to minimize system latency and computation cost while meeting the service quality of network slice tenants.
Drawings
In order to more clearly illustrate one or more embodiments or prior art solutions of the present specification, the drawings that are needed in the description of the embodiments or prior art will be briefly described below, and it is obvious that the drawings in the following description are only one or more embodiments of the present specification, and that other drawings may be obtained by those skilled in the art without inventive effort from these drawings.
Fig. 1 is a flowchart illustrating a resource allocation method based on a blockchain network slice proxy according to an embodiment of the present disclosure;
fig. 2 is a flowchart illustrating a resource allocation method based on a blockchain network slice proxy according to another embodiment of the present disclosure;
fig. 3 is a block diagram illustrating a structure of a resource allocation apparatus based on a blockchain network slice proxy according to an embodiment of the present disclosure;
fig. 4 is a schematic diagram illustrating a hardware structure of an electronic device for allocating network resources according to an embodiment of the present disclosure.
Detailed Description
For the purpose of promoting a better understanding of the objects, aspects and advantages of the present disclosure, reference is made to the following detailed description taken in conjunction with the accompanying drawings.
It is to be noted that unless otherwise defined, technical or scientific terms used in one or more embodiments of the present specification should have the ordinary meaning as understood by those of ordinary skill in the art to which this disclosure belongs. The use of "first," "second," and similar terms in one or more embodiments of the specification is not intended to indicate any order, quantity, or importance, but rather is used to distinguish one element from another. The word "comprising" or "comprises", and the like, means that the element or item listed before the word covers the element or item listed after the word and its equivalents, but does not exclude other elements or items. The terms "connected" or "coupled" and the like are not restricted to physical or mechanical connections, but may include electrical connections, whether direct or indirect. "upper", "lower", "left", "right", and the like are used merely to indicate relative positional relationships, and when the absolute position of the object being described is changed, the relative positional relationships may also be changed accordingly.
Fig. 1 shows a flow 100 of one embodiment of a resource allocation method based on blockchain network slice proxies according to the present application. The resource allocation method based on the blockchain network slice proxy of the embodiment comprises the following steps:
step 101, a network slice proxy receives a resource request sent by a target slice tenant.
In this embodiment, the executing entity of the resource allocation method based on the blockchain network slice proxy may be a Network Slice Broker (NSB). The network slice proxy may receive a resource request sent by a target slice tenant in a network slicing scenario. The NSB may provide different resources depending on the slice type, for example: in the computing resource domain, the resources comprise CPU, the number of VM instances and the like; in the radio domain, the resources relate to the function split type, the MAC scheduler algorithm, the number of Physical Resource Blocks (PRBs), etc.; in the transport domain, the resources include the type of link (bandwidth, latency), the number of VLANs, the capacity of the forwarding link, VPN links, QoS, etc. In this section, a computing resource allocation process is taken as an example, and a Deep Reinforcement Learning (DRL) framework can be used to model the optimization problem.
Specifically, the network slice proxy is an intermediate entity between the resource provider and the network slice tenant; it mediates between them and enables the target slice tenant to request and lease resources as needed. The roles of the network slice proxy include admission control, resource allocation and physical resource scheduling. The role of the NSB is to dynamically adjust the configuration of network slice resources so that tenants can acquire or release resources. Specifically, the network slice proxy of the present application is a blockchain-based network slice proxy: a resource provider, a network slice tenant, or a user with a specific service requirement can each act as a blockchain node, respond to the user's resource request forwarded by the network slice proxy, and send the available resources and a price for the resource request to the network slice proxy. Under the condition of meeting the Service Level Agreement (SLA), the network slice proxy selects the responding node with the lowest price as the target response node (i.e., the target response slice tenant) based on the received available resources and prices sent by each node and a deep reinforcement learning algorithm, so that network slice resource allocation can be supported end to end, covering the multi-domain characteristics of network slice tenants and guaranteeing the requirements of heterogeneous tenants.
For example, a network slicing scenario consists of one NSB and several tenants. Tenant i first sends a service request to the NSB specifying the required services (video stream, computing resources, etc.), and the NSB then broadcasts the requirements to other tenants that may wish to provide part of their quota. A responding tenant (e.g., tenant j) sends the NSB a message that includes the available resources and the price. After receiving all the messages, the NSB selects the responding tenant with the lowest price by using a deep reinforcement learning algorithm, under the condition of meeting the SLA (Service Level Agreement). Tenant i may then lease the resources of tenant j after an agreement is reached, as sketched below.
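To make the request, offer and selection flow above concrete, the following sketch models it in Python. This is an illustrative assumption rather than the patented implementation: the class and field names (ResourceRequest, Offer, NetworkSliceBroker and their attributes) are hypothetical, and the selection step here is a simple greedy filter standing in for the DQN-based choice described later.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

@dataclass
class ResourceRequest:              # hypothetical fields
    tenant_id: str
    service_type: str               # e.g. "compute", "video_stream"
    amount: float                   # requested resource units
    max_delay: float                # SLA: maximum tolerated delay
    max_price: float                # SLA: maximum tolerated price

@dataclass
class Offer:                        # a responding tenant's quota
    provider_id: str
    available: float
    price: float
    expected_delay: float

class NetworkSliceBroker:
    def __init__(self, partners: List[str]):
        self.partners = partners    # tenants holding admission control certificates

    def broadcast(self, req: ResourceRequest,
                  quote_fns: Dict[str, Callable[[ResourceRequest], Offer]]) -> List[Offer]:
        # ask every certificated partner except the requester for a quote
        return [quote_fns[p](req) for p in self.partners
                if p != req.tenant_id and p in quote_fns]

    def select(self, req: ResourceRequest, offers: List[Offer]) -> Optional[Offer]:
        # keep only SLA-compliant offers, then take the cheapest one
        feasible = [o for o in offers
                    if o.available >= req.amount
                    and o.expected_delay <= req.max_delay
                    and o.price <= req.max_price]
        return min(feasible, key=lambda o: o.price) if feasible else None
```

In the embodiments below, the selection step is driven by the trained DQN rather than this greedy filter; the sketch only fixes the message flow.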
Specifically, a network slice scene may be composed of three parts: infrastructure providers (InP); network Slice Tenants (NST) and Network Slice Brokers (NSB).
Specifically, the target slicing tenant may be a resource provider, a network slicing tenant, or a user with a specific service requirement, and the target slicing tenant is not specifically limited in the present application.
Specifically, network slice tenants are business entities such as Over-The-Top (OTT) providers, Mobile Virtual Network Operators (MVNOs), and third-party vertical-industry participants that rent slice resources to serve special requests from their customers; users with specific business needs may also be regarded directly as network slice tenants. The services of network slice tenants may be divided into tenant service A (eMBB), tenant service B (mMTC), and tenant service C (uRLLC).
Specifically, the infrastructure provider (InP) may be the owner of the mobile network's physical infrastructure, owning and managing all or part of the network infrastructure assets. The resources of the InP may be divided into access network, transport network and core network. The resources of the access network constitute access nodes such as base stations (LTE/5G access) and WLAN access points. The infrastructure provider InP may be seen as the resource provider in the network architecture, and the resources of the transport network and the core network may be described as transport nodes and core nodes in the blockchain, respectively, each of which may be regarded as a blockchain node.
Specifically, the resource request sent by the tenant may be a computing task resource request.
Step 102, determining a target response slice tenant based on the resource request.
Upon receiving the resource request, the executing entity can determine a target response slice tenant based on it. Specifically, after receiving the resource request, the executing entity may send the resource request to resource providers, network slice tenants, and the like that may provide the corresponding resources, and may determine the target response slice tenant according to the type of the resource request. Specifically, the executing entity may determine the classification identifier corresponding to the resource request according to the resource request and a preset correspondence between resource requests and classification identifiers, so as to determine the type of the resource request, which may be, for example, a multimedia-playing resource request, a computing resource request, a cloud-storage resource request, and the like; it may then receive the response information of each slice tenant to the resource request and, according to each piece of response information, determine a responding slice tenant of the same type as the resource request as the target response slice tenant, as illustrated by the sketch below. Of course, the executing entity needs to verify whether the user sending the resource request has admission credentials before processing the resource request.
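A minimal sketch of the type-matching step just described, assuming a small preset correspondence table; the table entries and function names are hypothetical placeholders for whatever mapping the operator configures.

```python
# Hypothetical mapping between request keywords and classification identifiers;
# in the patent the correspondence table is preset rather than hard-coded here.
REQUEST_TYPE_TABLE = {
    "video": "multimedia_playing",
    "compute": "computing",
    "storage": "cloud_storage",
}

def classify_request(service_type: str) -> str:
    # fall back to "computing" when no preset entry matches (an assumption)
    return REQUEST_TYPE_TABLE.get(service_type, "computing")

def match_responders(request_type: str, responses: dict) -> list:
    # keep only responding slice tenants whose declared type equals the request type
    return [tenant for tenant, declared_type in responses.items()
            if declared_type == request_type]
```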
Step 103, deploying a blockchain-based smart contract based on the resource request, the target slice tenant and the target response slice tenant.
Step 104, allocating the network resources based on the smart contract.
After acquiring the resource request, the target slice tenant and the target response slice tenant, the executing entity may deploy a blockchain-based smart contract based on them. The target slice tenant and the target response slice tenant can each be written into the blockchain-based smart contract as a blockchain node, and the blockchain node corresponding to the target response slice tenant responds to the selection of the network slice proxy and provides resources to the target slice tenant based on the resources and price specified in the smart contract. Similarly, the network slice proxy can, based on the smart contract, invoke the blockchain node corresponding to the target response slice tenant to provide the required resources to the target slice tenant (also referred to as the user throughout this text) at the agreed price.
Specifically, each contract has a unique identifier and some data fields, and may perform operations such as creating a new contract or updating the blockchain state. When all contracts are negotiated and finalized, an end-to-end slice can be deployed. The sub-slice infrastructure components may be converted into Technical Domain (TD) resources, and the slice components of several TDs may be denoted NS = {TD_1, TD_2, …, TD_n}. R(TD_i) = {p_1, p_2, …, p_n} can then be defined as the parameter set of the resources of TD_i. TDs may be computing resource domains (e.g., CPU, I/O), storage domains, radio domains (e.g., eNB, central unit CU, distributed unit DU), and transport domains (e.g., VLAN, VPN). An illustrative contract data structure is sketched below.
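The contract structure described above can be illustrated with a small data model. This is a sketch under assumptions: the field names and the use of a UUID as the unique identifier are illustrative choices, not the patent's on-chain representation.

```python
from dataclasses import dataclass, field
from typing import Dict
import uuid

@dataclass
class TechnicalDomain:
    name: str                         # e.g. "computing", "radio", "transport"
    parameters: Dict[str, float]      # R(TD_i) = {p_1, ..., p_n}

@dataclass
class SliceContract:
    # each contract has a unique identifier and some data fields
    contract_id: str = field(default_factory=lambda: uuid.uuid4().hex)
    requester: str = ""               # target slice tenant
    provider: str = ""                # target response slice tenant
    price: float = 0.0                # agreed price
    domains: Dict[str, TechnicalDomain] = field(default_factory=dict)

    def add_domain(self, td: TechnicalDomain) -> None:
        # an end-to-end slice becomes deployable once all TDs are negotiated
        self.domains[td.name] = td
```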
In this embodiment, a slice tenant rents computing resources, and the delay and computation cost of the system are minimized on the premise that the Quality of Service (QoS) of the network slice tenant is satisfied.
With continued reference to fig. 2, a flow 200 of another embodiment of a blockchain network slice proxy based resource allocation method according to the present application is shown. As shown in fig. 2, the resource allocation method based on the blockchain network slice proxy of the present embodiment may include the following steps:
in step 201, a network slice proxy receives a resource request sent by a target slice tenant.
The principle of step 201 is the same as that of step 101, and is not described herein again.
Step 202, determining whether the target slice tenant has an admission control certificate based on the resource request.
In this embodiment, after the executing entity obtains the resource request, it may determine, based on the resource request, whether the target slice tenant corresponding to the request holds a public key certificate, that is, an admission control certificate. To facilitate dynamic resource leasing between NSTs and support end-to-end services across multiple administrative domains, the NSB provides a blockchain-based admission control system. Blockchains can be divided into permissionless blockchains and permissioned blockchains. A permissionless blockchain allows anyone to read, write and participate in the creation of the ledger, whereas a permissioned blockchain imposes restrictions on who is allowed to participate in network activities, for example by restricting the kinds of transactions. The blockchain-based Network Slice Broker framework (BNSB) is a permissioned blockchain system that can run admission and control mechanisms. If an NST wishes to join the framework, it must obtain a public key certificate from a Trusted Third Party (TTP), such as a commercial Certificate Authority (CA); here this role is played by the NSB. This certificate is used to ensure the security of NST resource sharing. Under this admission control mechanism, all NSTs granted certificates are partners in the system, and NSTs without certificates cannot lease resources from others. Partners in the system have, for example, the right to request resources from other partners and pay the corresponding fees; after the corresponding fee is paid, the partner must lease the resource according to the agreed Service Level Agreement (SLA). A minimal admission check is sketched below.
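The following is a minimal sketch of the admission control check, assuming a simple in-memory registry of granted certificates; a real deployment would also verify the certificate's signature chain against the CA/TTP, which is only hinted at in a comment here.

```python
class AdmissionControl:
    """Simplified admission registry kept by the NSB (illustrative only)."""

    def __init__(self):
        self.certificates = {}        # tenant_id -> public key certificate bytes

    def grant(self, tenant_id: str, certificate: bytes) -> None:
        # in the text the certificate is issued by a trusted third party (here, the NSB)
        self.certificates[tenant_id] = certificate

    def is_partner(self, tenant_id: str) -> bool:
        # only certificated NSTs are partners and may lease resources from others
        return tenant_id in self.certificates

    def admit(self, tenant_id: str) -> bool:
        # reject resource requests from tenants without an admission control certificate;
        # a real system would additionally verify the certificate's signature chain
        return self.is_partner(tenant_id)
```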
Step 203, in response to determining that the target slice tenant has an admission control certificate, sending a resource request to each slice tenant having an admission control certificate.
After determining that the target slice tenant has the admission control certificate, the executing entity sends the resource request only to slice tenants that also hold admission control certificates. For example, assuming that M users with admission control certificates request computing resources, the NSB may send the computing resource requests to each slice tenant with an admission control certificate (e.g., an MEC controller with an admission control certificate). There may be N MEC servers (i.e., blockchain nodes or resource providers) responding to the computing requests of the M users, and all users granted certificates by the NSB can be considered partners in this scenario. The set of users can be represented as M = {1, 2, …, M}, and the set of MEC servers as N = {1, 2, …, N}.
Based on admission control, this embodiment can protect the privacy of each slice tenant holding an admission control certificate and protect each slice tenant from malicious attacks.
Step 204, based on the resource request, determining a target response slice tenant.
The principle of step 204 is the same as that of step 102, and is not described here again.
Specifically, step 204 can also be implemented by steps 2041 to 2042:
step 2041, receiving available resources and prices for the resource request sent by each responding slice tenant.
In this embodiment, the price is determined by each responding tenant through degree distribution and betweenness centrality calculations. Let B = {b_1, b_2, …, b_k} be the set of NSBs, and let T_k = {t_1, t_2, …, t_m} denote the set of NSTs admitted into the federation of NSB b_k. In order to achieve dynamic resource exchange between tenants, a dedicated blockchain must be established for each tenant federation. Each NSB b_k deploys a blockchain node and loads into it a resource registration table R_k = {r_1, r_2, …, r_i} of the resources originally allocated by the InP.
Specifically, in this embodiment, each blockchain node responds to the resource request of the target slice tenant sent by the NSB and sends the NSB a message consisting of the available resources and the corresponding price. Accordingly, the NSB receives the available resources and prices for the resource request sent by each responding blockchain node (i.e., each responding slice tenant). To ensure authentication, each message is signed with the sender's private key and uniquely identified by an ID number, as sketched below. In the present application, the value of a blockchain node is defined by complex network theory.
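The signed, uniquely identified offer message can be sketched as follows. The `sign` callable is a hypothetical placeholder for signing with the sender's private key; the concrete signature scheme is not specified in the text.

```python
import hashlib
import json
import uuid

def build_offer_message(provider_id: str, available: float, price: float,
                        sign) -> dict:
    """Package an offer so the NSB can authenticate and deduplicate it.

    `sign` is a placeholder callable that signs bytes with the sender's
    private key (the concrete signature scheme is an assumption here).
    """
    body = {
        "id": uuid.uuid4().hex,          # unique message identifier
        "provider": provider_id,
        "available": available,
        "price": price,
    }
    payload = json.dumps(body, sort_keys=True).encode()
    body["digest"] = hashlib.sha256(payload).hexdigest()
    body["signature"] = sign(payload)    # signed with the sender's private key
    return body
```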
Specifically, the structural features of complex network theory are used to obtain the topology information of the network and to define the value of blockchain nodes according to the importance of the nodes. A network consists of nodes and edges; in the abstraction process, these two basic elements can represent a variety of different things. In order to quantitatively measure the importance of each node in the network, it can be described using local features such as the degree distribution and global features such as the betweenness centrality.
Degree distribution: the degree distribution measures the number of node-connecting edges, which is the probability of a node-connecting edge, and is expressed as:
di=∑j∈Nζij, (1)
therein, ζijRepresents whether node i and node j are connected, if ζijWhen 1, the node i and the node j are connected with each other; if ζijIf 0, the node i and the node j have no connection relationship.
Betweenness centrality: the betweenness centrality measures the proportion of the shortest paths between any two nodes that pass through node i, and can be calculated by the following formula:
b_i = Σ_{a≠i≠b} ζ_{ab}(i) / ζ_{ab},    (2)
where ζ_{ab} represents the total number of shortest paths between node a and node b, and ζ_{ab}(i) represents the number of those shortest paths that pass through node i.
The degree d_i and the betweenness centrality b_i are normalized to d'_i and b'_i, respectively. The normalized degree d'_i is expressed as:
d'_i = d_i / (N − 1),    (3)
where N is the total number of nodes, so that N − 1 is the maximum number of edges that can connect to node i.
The normalized betweenness centrality b'_i can be expressed as:
b'_i = b_i / b_i^max,    (4)
where b_i^max is the maximum possible value of b_i.
Thus, combining the parameters d'_i and b'_i of each node, the importance IN(i) of node i is defined as a function of d'_i and b'_i (equation (5)).
The value of IN(i) lies in (0, 1) and can therefore be divided into five levels: when IN(i) lies in (0, 0.2) it is defined as the first level; in (0.2, 0.4), the second level; in (0.4, 0.6), the third level; in (0.6, 0.8), the fourth level; and in (0.8, 1), the fifth level.
Step 2042, determining target response slice tenants based on the deep reinforcement learning algorithm, the service level agreement, each available resource and price, and the preset price threshold and delay threshold.
After receiving the available resources and prices for the resource request sent by each responding slice tenant, the executing entity can, based on a deep reinforcement learning algorithm and the preset price and delay thresholds, select the responding tenant with the minimum system delay and computation cost under the condition of meeting the Service Level Agreement (SLA), and determine it as the target response slice tenant.
Specifically, Reinforcement Learning (RL) is a branch of machine learning in which the goal of the agent is to maximize the long-term cumulative return rather than the immediate return. RL can be described by a Markov Decision Process (MDP) consisting of the 5-tuple <S, A, P, R, γ>, where S and A are a finite state space and action space, respectively; P is the state transition probability, representing the probability of reaching the next state s′ ∈ S when taking action a ∈ A in state s ∈ S; R is the immediate reward after selecting action a in state s; and γ ∈ [0, 1] is the discount factor used to compute the cumulative reward. If γ is close to 0, the agent cares only about maximizing the immediate reward, while when γ is close to 1 the objective takes future cumulative rewards into account, meaning that the agent becomes more far-sighted. If the action is deterministic, the strategy is deterministic; in practice the policy in reinforcement learning is usually stochastic, since the agent may find a better policy by constantly trying other actions, which introduces a probability factor. The policy π(a|s) = P[A = a | S = s] represents the probability of selecting action a ∈ A in state s ∈ S. In general, the goal of the MDP is to find a policy π(a|s) that maximizes the value function, which is usually defined by the Bellman equation:
V^π(s) = Σ_{a∈A} π(a|s) Σ_{s′∈S} P(s′|s, a) [R(s, a) + γ V^π(s′)]    (6)
the goal of RL is a process that constantly seeks the best strategy. The Bellman optimal equation is to find the optimal strategy:
Figure BDA0002793464850000102
classical reinforcement Learning algorithms, such as Q-Learning, are suitable for situations where the state space is relatively small. The problem of dimensional explosion occurs when the state space and the action space are high-dimensional continuous. The control strategy can be directly learned from high-dimensional original data by combining deep reinforcement learning and reinforcement learning. Deep Q-Learning (dqn) is one of the Deep reinforcement Learning algorithms that combine Deep Neural Networks (DNN) with Q-Learning. Furthermore, DQN is improved in two ways:
and (3) experience playback: empirical playback requires a data set Dt={e1,…,etStore the experiences of the agent, experience of each step et=(st,at,rt,st+1) Including the current state, the action, the reward, and the value of the next state. In Deep-Q learning, the Q-learning update method is used for a batch of data(s) in a sample set of stored datat,at,rt,st+1) Random and uniform sampling is performed to update the Q-value neural network.
Network Cloning: an agent adds a target Network, which has the same structure as the Q Network, and the same initial weight, except that the Q Network is updated in each iteration, and the target Network is updated periodically.
Empirical replay and Network Cloning enable the combination of Q-learning and neural networks. DQN is the first deep reinforcement learning algorithm to combine traditional reinforcement learning with deep learning. The DQN can efficiently execute and timely make resource allocation decisions according to already mastered policies.
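The two mechanisms can be sketched independently of any specific deep learning framework; the class and function names, capacities and cloning period below are illustrative assumptions.

```python
import random
from collections import deque

class ReplayBuffer:
    """Experience replay: store e_t = (s_t, a_t, r_t, s_{t+1}) and sample mini-batches."""

    def __init__(self, capacity=10000):
        self.buffer = deque(maxlen=capacity)

    def push(self, state, action, reward, next_state):
        self.buffer.append((state, action, reward, next_state))

    def sample(self, batch_size):
        # random, uniform sampling decorrelates consecutive experiences
        return random.sample(self.buffer, batch_size)

    def __len__(self):
        return len(self.buffer)

def maybe_clone_target(q_weights, target_weights, step, period=100):
    """Network cloning: the Q network is updated every iteration, while the
    target network copies the Q-network weights only every `period` steps."""
    if step % period == 0:
        target_weights = dict(q_weights)
    return target_weights
```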
Specifically, in this embodiment a computing resource allocation process is taken as an example, and a DRL framework can be used to model the optimization problem so as to satisfy the resource requirements with an optimal policy. Assuming that M users request computing resources and N MEC servers (blockchain nodes, resource providers) respond to the computing requests, all users granted certificates by the NSB can be considered partners in the scenario. The user set is represented as M = {1, 2, …, M}, and the MEC server set as N = {1, 2, …, N}.
It is assumed that the MEC server can handle all the computing tasks due to its multitasking capability. A quasi-static scenario is considered, in which the set of mobile users remains unchanged during a computation offloading period (e.g., within a few seconds) but may change across different periods.
There are N orthogonal frequency division multiplexing (OFDM) channels without mutual interference among the N MEC servers, and K OFDM subchannels without mutual interference between a user and an MEC server; the bandwidth of each subchannel is assumed to be w. The signal-to-interference-plus-noise ratio (SINR) and the uplink data rate of user m in time slot t can be represented by equations (8) and (9), respectively:
Γ_{m,n}(t) = P_m(t) g_{m,n}(t) / (σ² + Σ_{i∈M, i≠m} P_i(t) g_{i,n}(t))    (8)
R_{m,n}(t) = w · k_{m,n}(t) · log2(1 + Γ_{m,n}(t))    (9)
where k_{m,n}(t) ∈ {0, 1, 2, …, K} represents the number of subchannels allocated to user m by MEC server n, P_m(t) represents the transmission power of the user (also referred to as the slice tenant; in this application the user is essentially the same as the slice tenant or target slice tenant, throughout the text), g_{m,n}(t) and g_{i,n}(t) are the channel gains between user m and MEC server n and between user i and MEC server n, respectively, and σ² represents the additive white Gaussian noise (AWGN) power.
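A small helper pair implementing the reconstructed equations (8) and (9); the interference term follows the standard SINR form assumed in the reconstruction above, and the argument names mirror the symbols in the text.

```python
import math

def sinr(p_m, g_mn, interferers, noise_power):
    """Equation (8): SINR of user m at MEC server n in slot t.

    `interferers` is a list of (p_i, g_in) pairs for the other users i != m.
    """
    interference = sum(p_i * g_in for p_i, g_in in interferers)
    return (p_m * g_mn) / (noise_power + interference)

def uplink_rate(w, k_mn, gamma_mn):
    """Equation (9): R_{m,n}(t) = w * k_{m,n}(t) * log2(1 + SINR)."""
    return w * k_mn * math.log2(1.0 + gamma_mn)
```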
The computing task that user m offloads to the MEC server is represented as J_m = (D_m, X_m, τ_m^max, pr_m^max). For each task J_m, D_m represents the input data size (bits) of user m, X_m represents the required MEC computation intensity (the total number of CPU cycles to complete the task), τ_m^max represents the deadline of the task, and pr_m^max (tokens per CPU cycle) represents the maximum price tolerated by user m. The total task delay consists of three parts, i.e., the transmission delay τ_{m,n}^tr, the computing delay τ_{m,n}^comp, and the result return delay Δτ_b, where f_{m,n} denotes the computing capability (CPU cycles per second) that MEC server n allocates to user m.
Since the result return delay is small, Δτ_b is not considered. The total delay between user m and MEC server n is:
τ_{m,n} = τ_{m,n}^tr + τ_{m,n}^comp = D_m / R_{m,n}(t) + X_m / f_{m,n}    (10)
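The delay model of equation (10) as a one-line helper; the variable names mirror the symbols above, and the result-return delay is omitted, as in the text.

```python
def total_delay(d_m, rate_mn, x_m, f_mn):
    """Equation (10): transmission delay D_m / R_{m,n} plus computing delay X_m / f_{m,n}."""
    return d_m / rate_mn + x_m / f_mn
```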
price Pr of each MEC server n provided for user m according to complex network theorym,nThere are 5 levels. The goal is to minimize the total cost of the user over each time period t, including the time delay between the user and the MEC server and the price paid to the MEC server. Whether user m offloads the decision to MEC server n, denoted as am,nE {0, 1 }. When a ism,nWhen 1, the user offloads the decision to MEC server n; otherwise, am,n0. Modeling the optimization problem of the problem as shown in equation (11):
Figure BDA0002793464850000131
wherein, ω istAnd ωpRepresenting the time delay and the weight of the price factor, respectively. The constraints are as follows:
an∈{0,1},
Figure BDA0002793464850000132
Figure BDA0002793464850000133
Figure BDA0002793464850000134
Figure BDA0002793464850000135
wherein, am,nE {0, 1} represents whether the user m selects the MEC server m for calculation unloading;
Figure BDA0002793464850000136
indicating that the latency for executing each computational task cannot exceed the maximum tolerated latency;
Figure BDA0002793464850000137
indicating that the price for performing each computing task cannot exceed the maximum tolerated price;
Figure BDA0002793464850000138
represents the total amount of computing resources allocated to all users that cannot exceed the computing resources of the MEC server;
Figure BDA0002793464850000139
representing that the bandwidth allocated to all users cannot exceed the total spectrum bandwidth W.
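The per-slot objective of equation (11) and the per-task constraint check can be sketched as follows; the dictionary-based interface is an illustrative assumption, and the server-capacity and bandwidth totals would be checked separately against F_n and W.

```python
def offloading_cost(assignments, delays, prices, w_t, w_p):
    """Weighted cost of equation (11) for one time slot.

    assignments[(m, n)] is a_{m,n} in {0, 1}; delays and prices hold
    tau_{m,n} and Pr_{m,n} for the same (m, n) pairs.
    """
    return sum(a * (w_t * delays[key] + w_p * prices[key])
               for key, a in assignments.items())

def satisfies_constraints(m, n, delays, prices, max_delay, max_price):
    # per-task delay and price caps; resource and bandwidth totals are
    # verified separately against the MEC server capacity and spectrum W
    return delays[(m, n)] <= max_delay[m] and prices[(m, n)] <= max_price[m]
```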
The process of optimizing this problem can be described as an MDP, and a DRL framework is used to minimize the system latency and computation cost. The basic elements of the DRL include:
State space: s_m = {D_m, Γ_m}, where D_m represents the size of the computing task of user m and Γ_m represents the SINR of user m.
Action space: a_m = {a_{m,1}, a_{m,2}, …, a_{m,n}}, where a_{m,n} = 1 means that user m selects MEC server n for computation task offloading, and a_{m,n} = 0 means that user m does not select MEC server n.
Reward function: R_t represents the return of all users in time slot t (equation (12)).
After taking action a_t, the system receives the reward R_t, and the optimal strategy is obtained according to the Bellman optimality equation. Since the system state space is very large, a linear function approximation can be used to approximate the value function in RL. The Q-function of Q-Learning can then be described as:
Q(s, a) = θ^T ψ(s, a)    (13)
where ψ(s, a) = {ψ_1(s, a), …, ψ_n(s, a)} is a linear combination of n orthogonal vectors and is an n-dimensional vector. The Q value should be close to the target Q value Q^+(s, a):
Q^+(s, a) = Σ_{s′} P(s′|s, a) [R(s, a) + γ max_{a′} Q^+(s′, a′)]    (14)
The objective function may be defined as:
L(θ) = E[(Q^+(s, a) − Q(s, a; θ))²]    (15)
To minimize L(θ), the parameter θ can be updated by gradient descent:
θ ← θ − α ∇_θ L(θ)    (16)
A linear function approximation cannot accurately model the value function, so a non-linear method such as a neural network can be substituted for Q(s, a; θ). The loss function can then be redefined as:
L(θ) = E[(r + γ max_{a′} Q(s′, a′; θ⁻) − Q(s, a; θ))²]    (17)
where θ⁻ denotes the parameters of the periodically cloned target network.
finally, the training process of the resource allocation method based on the deep reinforcement learning algorithm dqn (deep Q learning) is as follows: initializing parameters and memory pool sizes of a Q network and a Target-Q network; before each training is started, the starting state is initialized, and the maximum tolerated delay and price of the agent (which may be the executing agent, i.e., the network slice proxy) is initialized. In each training time slot, the intelligent agent generates an action according to an element-greedy strategy, and selects a random action or an action with the maximum return value according to an element probability; and receiving the reward after the action is executed, transferring to the next state, and storing the experience of the intelligent agent by adopting an experience playback mechanism. When enough experience exists in the experience pool, randomly taking out a small batch of sample data from the experience pool, calculating a predicted value of a Q network by using a current network, calculating a Q network Target value by using a Target-Q network, then calculating a loss function between the Q network Target value and the Q network Target value, updating current network parameters by using gradient descent, repeating for a plurality of times, and copying the parameters of the current network to the Target-Q network until training is completed. The execution subject may select a response tenant with the lowest price, the lowest system delay, and the lowest computation cost on the condition that an SLA (Service Level Agreement) is satisfied, and determine the response tenant as a target response slice tenant, based on a resource allocation method based on a deep reinforcement learning algorithm dqn (deep Q learning) that is completed by training.
Step 205, in response to determining that the target slice tenant has paid the bill for the resource request, deploying a blockchain-based smart contract.
Before deploying the blockchain-based smart contract, the executing entity of this embodiment needs to confirm whether the target slice user has paid the bill for the resource request; in response to determining that the target slice tenant has paid the bill for the resource request, the blockchain-based smart contract can be deployed. This avoids transaction losses for both parties to the contract and protects the interests of both parties.
Step 206, deploying a blockchain-based smart contract based on the resource request, the target slice tenant and the target response slice tenant.
Step 207, allocating the network resources based on the smart contract.
The principle of step 206 to step 207 is the same as that of step 103 to step 104, and is not described here again.
With continuing reference to fig. 3, as an implementation of the method shown in the above-mentioned figures, the present application provides an embodiment of a resource allocation apparatus based on a blockchain network slice proxy, where the embodiment of the apparatus corresponds to the embodiment of the method shown in fig. 1, and the apparatus may be applied to various electronic devices in particular.
As shown in fig. 3, the resource allocation apparatus 300 based on the blockchain network slice proxy of the present embodiment includes: a receiving unit 301, a target response slice tenant determination unit 302, an intelligent contract deployment unit 303, and a network resource allocation unit 304.
A receiving unit 301, configured to receive, by the network slice proxy, a resource request sent by a target slice tenant.
A target response slice tenant determination unit 302 configured to determine a target response slice tenant based on the resource request.
A smart contract deployment unit 303, configured to deploy a blockchain-based smart contract based on the resource request, the target slice tenant and the target response slice tenant.
A network resource allocation unit 304, configured to allocate network resources based on the smart contract.
In some optional implementations of this embodiment, the resource allocation apparatus based on a blockchain network slice proxy further includes units not shown in fig. 3: an admission control certificate determination unit configured to determine whether the target slice tenant has an admission control certificate based on the resource request; and a resource request sending unit configured to send the resource request to each slice tenant having an admission control certificate in response to determining that the target slice tenant has an admission control certificate.
In some optional implementations of this embodiment, the target response slice tenant determination unit 302 is further configured to: receive the available resources and prices for the resource request sent by each responding slice tenant, wherein the prices are determined by each responding tenant through degree distribution and betweenness centrality calculations; and determine a target response slice tenant based on a deep reinforcement learning algorithm, a service level agreement, each available resource and price, and a preset price threshold and a preset delay threshold.
In some optional implementations of this embodiment, the smart contract deployment unit 303 is further configured to: in response to determining that the target slice tenant has paid the bill for the resource request, deploy a blockchain-based smart contract.
Technical carriers involved in payment in the embodiments of the present specification may include Near Field Communication (NFC), WIFI, 3G/4G/5G, POS machine card swiping technology, two-dimensional code scanning technology, barcode scanning technology, bluetooth, infrared, Short Message Service (SMS), Multimedia Message (MMS), and the like, for example.
The biometric features involved in biometric identification in the embodiments of the present specification may include, for example, eye features, voice prints, fingerprints, palm prints, heart beats, pulse, chromosomes, DNA, human teeth bites, and the like. Wherein the eye pattern may include biological features of the iris, sclera, etc.
It should be noted that the method of one or more embodiments of the present disclosure may be performed by a single device, such as a computer or server. The method of the embodiment can also be applied to a distributed scene and completed by the mutual cooperation of a plurality of devices. In such a distributed scenario, one of the multiple devices may only perform one or more steps of the method of one or more embodiments of the present disclosure, and the multiple devices may interact with each other to complete the method.
The foregoing description has been directed to specific embodiments of this disclosure. Other embodiments are within the scope of the following claims. In some cases, the actions or steps recited in the claims may be performed in a different order than in the embodiments and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing may also be possible or may be advantageous.
For convenience of description, the above devices are described as being divided into various modules by functions, and are described separately. Of course, the functionality of the modules may be implemented in the same one or more software and/or hardware implementations in implementing one or more embodiments of the present description.
The apparatus of the foregoing embodiment is used to implement the corresponding method in the foregoing embodiment, and has the beneficial effects of the corresponding method embodiment, which are not described herein again.
Fig. 4 is a schematic diagram illustrating a more specific hardware structure of an electronic device for resource allocation based on a blockchain network slicing proxy according to this embodiment, where the device may include: a processor 1010, a memory 1020, an input/output interface 1030, a communication interface 1040, and a bus 1050. Wherein the processor 1010, memory 1020, input/output interface 1030, and communication interface 1040 are communicatively coupled to each other within the device via bus 1050.
The processor 1010 may be implemented by a general-purpose CPU (Central Processing Unit), a microprocessor, an Application Specific Integrated Circuit (ASIC), or one or more Integrated circuits, and is configured to execute related programs to implement the technical solutions provided in the embodiments of the present disclosure.
The Memory 1020 may be implemented in the form of a ROM (Read Only Memory), a RAM (Random Access Memory), a static storage device, a dynamic storage device, or the like. The memory 1020 may store an operating system and other application programs, and when the technical solution provided by the embodiments of the present specification is implemented by software or firmware, the relevant program codes are stored in the memory 1020 and called to be executed by the processor 1010.
The input/output interface 1030 is used for connecting an input/output module to input and output information. The i/o module may be configured as a component in a device (not shown) or may be external to the device to provide a corresponding function. The input devices may include a keyboard, a mouse, a touch screen, a microphone, various sensors, etc., and the output devices may include a display, a speaker, a vibrator, an indicator light, etc.
The communication interface 1040 is used for connecting a communication module (not shown in the drawings) to implement communication interaction between the present apparatus and other apparatuses. The communication module can realize communication in a wired mode (such as USB, network cable and the like) and also can realize communication in a wireless mode (such as mobile network, WIFI, Bluetooth and the like).
Bus 1050 includes a path that transfers information between various components of the device, such as processor 1010, memory 1020, input/output interface 1030, and communication interface 1040.
It should be noted that although the above-mentioned device only shows the processor 1010, the memory 1020, the input/output interface 1030, the communication interface 1040 and the bus 1050, in a specific implementation, the device may also include other components necessary for normal operation. In addition, those skilled in the art will appreciate that the above-described apparatus may also include only those components necessary to implement the embodiments of the present description, and not necessarily all of the components shown in the figures.
Computer-readable media of the present embodiments, including both permanent and non-permanent, removable and non-removable media, may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, program modules, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device.
Those of ordinary skill in the art will understand that: the discussion of any embodiment above is meant to be exemplary only, and is not intended to intimate that the scope of the disclosure, including the claims, is limited to these examples; within the spirit of the present disclosure, features from the above embodiments or from different embodiments may also be combined, steps may be implemented in any order, and there are many other variations of the different aspects of one or more embodiments of the present description as above, which are not provided in detail for the sake of brevity.
In addition, well-known power/ground connections to Integrated Circuit (IC) chips and other components may or may not be shown in the provided figures, for simplicity of illustration and discussion, and so as not to obscure one or more embodiments of the disclosure. Furthermore, devices may be shown in block diagram form in order to avoid obscuring the understanding of one or more embodiments of the present description, and this also takes into account the fact that specifics with respect to implementation of such block diagram devices are highly dependent upon the platform within which the one or more embodiments of the present description are to be implemented (i.e., specifics should be well within purview of one skilled in the art). Where specific details (e.g., circuits) are set forth in order to describe example embodiments of the disclosure, it should be apparent to one skilled in the art that one or more embodiments of the disclosure can be practiced without, or with variation of, these specific details. Accordingly, the description is to be regarded as illustrative instead of restrictive.
While the present disclosure has been described in conjunction with specific embodiments thereof, many alternatives, modifications, and variations of these embodiments will be apparent to those of ordinary skill in the art in light of the foregoing description. For example, other memory architectures (e.g., dynamic ram (dram)) may use the discussed embodiments.
It is intended that the one or more embodiments of the present specification embrace all such alternatives, modifications and variations as fall within the broad scope of the appended claims. Therefore, any omissions, modifications, substitutions, improvements, and the like that may be made without departing from the spirit and principles of one or more embodiments of the present disclosure are intended to be included within the scope of the present disclosure.

Claims (6)

1. A resource allocation method based on a blockchain network slice proxy, characterized by comprising the following steps:
a network slice agent receives a resource request sent by a target slice tenant;
determining whether the target slice tenant has an admission control certificate based on the resource request;
in response to determining that the target slice tenant has an admission control certificate, sending the resource request to each slice tenant having an admission control certificate;
receiving available resources and prices for the resource request sent by each responding slice tenant, wherein the prices are determined by each responding tenant through degree distribution and betweenness centrality calculations;
determining a target response slice tenant based on a deep reinforcement learning algorithm, a service level agreement, each available resource and price, and a preset price threshold and a preset delay threshold;
deploying a blockchain-based smart contract based on the resource request, the target slice tenant, and the target response slice tenant;
and allocating network resources based on the smart contract.
2. The method of claim 1, wherein prior to said deploying a blockchain-based smart contract based on the resource request, the target slice tenant, and the target response slice tenant, the method further comprises:
in response to determining that the target slice tenant has paid the bill for the resource request, deploying a blockchain-based smart contract.
3. A resource allocation apparatus based on a blockchain network slice proxy, comprising:
a receiving unit configured to receive, at the network slice proxy, a resource request sent by a target slice tenant;
an admission control certificate determination unit configured to determine, based on the resource request, whether the target slice tenant has an admission control certificate;
a resource request sending unit configured to send the resource request to each slice tenant having an admission control certificate in response to determining that the target slice tenant has an admission control certificate;
a target response slice tenant determination unit configured to receive, from each responding slice tenant, the available resources and prices offered for the resource request, wherein the prices are determined by the responding tenants through distribution and betweenness centrality calculations, and to determine a target response slice tenant based on a deep reinforcement learning algorithm, a service level agreement, each of the available resources and prices, a preset price threshold, and a preset delay threshold;
a smart contract deployment unit configured to deploy a blockchain-based smart contract based on the resource request, the target slice tenant, and the target response slice tenant;
and a network resource allocation unit configured to allocate network resources based on the smart contract.
4. The apparatus of claim 3, wherein the smart contract deployment unit is further configured to: in response to determining that the target slice tenant has paid a bill for the resource request, deploy a blockchain-based smart contract.
5. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the processor implements the method according to any of claims 1 to 2 when executing the program.
6. A non-transitory computer-readable storage medium storing computer instructions for causing a computer to perform the method of any one of claims 1 to 2.
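
For illustration only (not part of the claims): below is a minimal, hedged sketch of how a network slice proxy might realize the tenant-selection step of claim 1 and the payment-gated deployment of claim 2. The graph model, the linear centrality-based pricing rule, the greedy selection standing in for the deep reinforcement learning agent, and the in-memory contract record are assumptions introduced here, not details taken from the patent; the example uses Python with the networkx library.

# Illustrative sketch only; it is NOT the patented implementation.
# Assumptions (not from the patent text): tenants form a networkx graph,
# prices scale linearly with betweenness centrality, and the deep
# reinforcement learning selection of claim 1 is replaced by a simple
# threshold filter plus a greedy rule so that the flow stays readable.

from dataclasses import dataclass
from typing import Optional
import networkx as nx


@dataclass
class SliceOffer:
    tenant_id: str
    available_resource: float  # e.g. bandwidth units offered
    price: float               # price quoted by the responding tenant
    delay_ms: float            # expected provisioning delay


def quote_price(graph: nx.Graph, tenant_id: str,
                base_price: float, requested: float) -> float:
    """Tenant-side pricing: scale a base price by the tenant's
    betweenness centrality in the slice/tenant topology graph."""
    centrality = nx.betweenness_centrality(graph, normalized=True)
    # A more central tenant carries more transit load, so it quotes a
    # higher price for the same amount of requested resource.
    return base_price * requested * (1.0 + centrality.get(tenant_id, 0.0))


def select_target_tenant(offers: list, requested: float,
                         price_threshold: float,
                         delay_threshold_ms: float) -> Optional[SliceOffer]:
    """Proxy-side selection: keep offers that satisfy the demand and both
    preset thresholds, then pick the cheapest (ties broken by delay).
    This greedy rule only stands in for the DRL agent of claim 1."""
    feasible = [o for o in offers
                if o.available_resource >= requested
                and o.price <= price_threshold
                and o.delay_ms <= delay_threshold_ms]
    if not feasible:
        return None
    return min(feasible, key=lambda o: (o.price, o.delay_ms))


@dataclass
class SliceContract:
    """In-memory stand-in for the blockchain-based smart contract of
    claims 1-2; a real system would deploy on-chain instead."""
    requester: str
    provider: str
    requested: float
    paid: bool = False
    deployed: bool = False

    def deploy_if_paid(self) -> bool:
        # Claim 2: deploy only after the target slice tenant pays the bill.
        if self.paid and not self.deployed:
            self.deployed = True
        return self.deployed


if __name__ == "__main__":
    # Hypothetical topology and offers, for illustration only.
    g = nx.Graph([("tenantA", "tenantB"), ("tenantB", "tenantC"),
                  ("tenantB", "tenantD")])
    requested = 10.0
    offers = [
        SliceOffer("tenantB", 12.0, quote_price(g, "tenantB", 1.0, requested), 8.0),
        SliceOffer("tenantC", 15.0, quote_price(g, "tenantC", 1.0, requested), 5.0),
    ]
    target = select_target_tenant(offers, requested,
                                  price_threshold=25.0, delay_threshold_ms=10.0)
    if target is not None:
        contract = SliceContract("targetTenant", target.tenant_id, requested, paid=True)
        print("deployed:", contract.deploy_if_paid(), "provider:", target.tenant_id)

In a full implementation, select_target_tenant would be replaced by the trained deep reinforcement learning policy described in claim 1, and deploy_if_paid would issue an actual blockchain transaction rather than mutating an in-memory record.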
CN202011322819.1A 2020-11-23 2020-11-23 Resource allocation method and device based on block chain network slice proxy Active CN112615730B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011322819.1A CN112615730B (en) 2020-11-23 2020-11-23 Resource allocation method and device based on block chain network slice proxy

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011322819.1A CN112615730B (en) 2020-11-23 2020-11-23 Resource allocation method and device based on block chain network slice proxy

Publications (2)

Publication Number Publication Date
CN112615730A (en) 2021-04-06
CN112615730B (en) 2022-04-01

Family

ID=75225625

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011322819.1A Active CN112615730B (en) 2020-11-23 2020-11-23 Resource allocation method and device based on block chain network slice proxy

Country Status (1)

Country Link
CN (1) CN112615730B (en)

Families Citing this family (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113259879B (en) * 2021-05-12 2022-08-12 中国联合网络通信集团有限公司 Roaming payment method, system, terminal device and storage medium based on block chain
CN113222766B (en) * 2021-05-13 2023-02-07 杭州趣链科技有限公司 Information supervision method and device of financial product, computer equipment and medium
CN113381889B (en) * 2021-06-08 2023-09-29 广东电网有限责任公司清远供电局 Network slice determination method and device, electronic equipment and storage medium
CN113543054B (en) * 2021-06-25 2022-08-12 中国联合网络通信集团有限公司 Relay service method and device
CN113939029A (en) * 2021-09-22 2022-01-14 中国联合网络通信集团有限公司 Method, device and equipment for generating network slice and storage medium
CN114070775B (en) * 2021-10-15 2023-07-07 上海智能网联汽车技术中心有限公司 Block chain network slicing security intelligent optimization method for 5G intelligent networking system
CN114338433B (en) * 2021-12-06 2024-04-12 上海浦东发展银行股份有限公司 Block chain resource allocation method, device, system and computer equipment
CN114500283B (en) * 2022-01-05 2023-07-21 阿里巴巴(中国)有限公司 Method and system for processing slices of core network
CN115190021B (en) * 2022-04-24 2024-04-12 北京中电飞华通信有限公司 Deterministic time delay service oriented slice deployment method and related equipment
CN117808563A (en) * 2024-02-29 2024-04-02 中国十九冶集团有限公司 Lamp post heterogeneous service customized access device and method based on blockchain intelligent contract

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110612779A (en) * 2017-06-29 2019-12-24 诺基亚通信公司 Enhanced interface for network slice selection based on charging rules
WO2020041037A1 (en) * 2018-08-20 2020-02-27 Cisco Technology, Inc. Blockchain-based auditing, instantiation and maintenance of 5g network slices
CN111200859A (en) * 2018-11-19 2020-05-26 华为技术有限公司 Network slice selection method, network equipment and terminal


Also Published As

Publication number Publication date
CN112615730A (en) 2021-04-06

Similar Documents

Publication Publication Date Title
CN112615730B (en) Resource allocation method and device based on block chain network slice proxy
US11711267B2 (en) 5G network slicing with distributed ledger traceability and resource utilization inferencing
US11736942B2 (en) Multi-domain trust establishment in edge cloud architectures
Liu et al. Decentralized resource allocation for video transcoding and delivery in blockchain-based system with mobile edge computing
Liu et al. Device association for RAN slicing based on hybrid federated deep reinforcement learning
US20200084202A1 (en) Attestation token sharing in edge computing environments
CN114026834A (en) Multi-entity resource, security, and service management in edge computing deployments
Gramaglia et al. Flexible connectivity and QoE/QoS management for 5G Networks: The 5G NORMA view
KR20210038827A (en) Adaptive dataflow transformation in edge computing environments
CN115119331A (en) Reinforcement learning for multi-access traffic management
CN113287287A (en) Method and apparatus for day-to-day based fog networking
TW201018286A (en) Fair resource sharing in wireless communications
Boateng et al. Consortium blockchain-based spectrum trading for network slicing in 5G RAN: A multi-agent deep reinforcement learning approach
Boateng et al. Blockchain-enabled resource trading and deep reinforcement learning-based autonomous RAN slicing in 5G
US20220377614A1 (en) Apparatus, system, method and computer-implemented storage media to implement radio resource management policies using machine learning
Bak et al. Synthesis of real-time cloud applications for Internet of Things
CN106576345A (en) Propagating communication awareness over a cellular network
Kim 5G Network Communication, Caching, and Computing Algorithms Based on the Two‐Tier Game Model
Wu et al. A cloudlet-based multi-lateral resource exchange framework for mobile users
CN113938394B (en) Monitoring service bandwidth allocation method and device, electronic equipment and storage medium
AbdElhalim et al. Distributed Fog‐to‐Cloud computing system: A minority game approach
Liyanage et al. Five driving forces of multi-access edge computing
WO2022011009A1 (en) Attestation verifier role delegation
CN116157806A (en) System and method for supporting artificial intelligence services in a network
JP2018524683A (en) Techniques for dynamic zero-rate server clusters

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant