CN115904731A

CN115904731A - Edge cooperative type copy placement method

Info

Publication number: CN115904731A
Application number: CN202211696741.9A
Authority: CN
Inventors: 黄宏程; 王秋皓; 胡敏
Original assignee: Chongqing University of Post and Telecommunications
Current assignee: Chongqing University of Post and Telecommunications
Priority date: 2022-12-28
Filing date: 2022-12-28
Publication date: 2023-04-04

Abstract

The invention relates to an edge-collaborative copy placement method and belongs to the field of edge computing. The method comprises three parts: edge cooperation domains are divided based on the content similarity and user similarity of base stations; an edge resource manager predicts the popularity of copies within each domain and recommends copy placements; and a cloud center node optimizes the copy deployment rules of each edge cooperation domain through a reinforcement learning algorithm, so as to resolve local redundancy of copies on edge nodes and improve users' quality of service.

Description

Edge cooperative type copy placement method

Technical Field

The invention belongs to the field of edge computing, and relates to an edge collaborative copy placement method.

Background

Traditional cloud computing suffers from high energy consumption and insufficient real-time performance; that is, a centralized processing mode built around the cloud computing model cannot efficiently process the data generated by edge devices. The arrival of the Internet of Things era has driven the development of mobile edge computing (MEC), a new proximity computing paradigm that extends cloud computing to the edge network. MEC allows edge servers to be distributed across different geographic locations, pushing computing and storage resources close to mobile users. By deploying servers within Radio Access Networks (RANs) and Base Stations (BSs), it provides cloud computing capability while bypassing network bandwidth and latency bottlenecks, which helps reduce the computing and transmission load of cloud computing centers and offers lower-latency services than traditional cloud computing. In addition, mobile edge servers can provide computing, storage and network resources for user equipment, improve cooperation among users, and have great potential to deliver the required QoS under strict access delay requirements.

In recent years, many solutions have been proposed for copy placement in mobile edge scenarios. To improve data availability and cloud storage performance in an edge computing environment, alaer et al. proposed an adaptive replica placement method that studies the relationship among user access characteristics, the number of replicas and replica locations, and relies on continuous monitoring of replica requests from the underlying network edge nodes to dynamically create, replace or delete replicas while guaranteeing data availability requirements. However, this work does not support delay-sensitive and data-intensive workflows: frequent remote access to centrally stored data not only results in high round-trip delay, but may even offset the advantages of edge computing when user data access requests cannot be satisfied within the MEC server. Another approach, based on a hypergraph partitioning algorithm, addresses the joint optimal decision problem of virtual machine placement and data copy placement in a tree-structured network. First, the whole edge computing storage model is divided into two modules so that the communication traffic within and between the modules is as small as possible. Second, the vertices and hyperedges of a hypergraph are constructed according to how file copies are stored on the edge nodes, yielding the corresponding hypergraph model, and an initial partition is found on this model by breadth-first search. Finally, files are assigned to different modules by adjusting the relative positions of the file copies and the edge nodes. This method improves the response delay of copy deployment but ignores the regional differences in user preferences, which may cause a conflict between local and global considerations in practice.

Although MEC has great advantages in saving network bandwidth, improving service quality and relieving pressure on the cloud data center, traditional cloud-edge collaborative copy placement schemes suffer from limitations such as constrained resources and therefore still cannot effectively meet the need for reasonable copy placement in mobile edge computing. In particular, with the continuous emergence of delay-sensitive and data-intensive workflows in the Internet of Things environment, the sharp increase in multi-source heterogeneous data may seriously degrade service quality and system performance. First, when processing data access requests from local users, different edge servers typically have to cache the same copy; however, the high autonomy of edge nodes easily leads to highly redundant storage and high-frequency updating of copies, resulting in "copy flooding", low copy utilization and wasted resources. Placing copies on critical network nodes that serve requests from multiple nearby locations therefore has the potential to provide a more cost-effective solution. In addition, data access requests in mobile edge scenarios are characterized by multi-edge access and high concurrency, and their randomness and regionality significantly increase the risk of copy management getting out of control; a reasonable copy distribution can relieve heavy local performance and computing pressure and reduce unnecessary waiting time.

Disclosure of Invention

In view of the above, the present invention provides an edge-collaborative copy placement method.

In order to achieve the purpose, the invention provides the following technical scheme:

an edge-collaborative copy placement method, comprising:

S1: dividing base station cooperation domains based on similarity;

S2: predicting the popularity of copies in each cooperation domain;

S3: making copy placement decisions through cloud-edge cooperation.

Optionally, S1 specifically is:

firstly, the similarity of request contents between two base stations is defined and expressed as follows:

wherein

indicates the number of requests for content f at the i-th base station in time slot t; the similarity ψ_ij ∈ [0,1], and the closer the similarity is to 1, the greater the common interest between the two base stations; the N base stations finally form an N × N symmetric matrix, each entry of which represents the content similarity between a pair of base stations;

if two base stations are requested for service by common users, they are called user-similar base stations, and the user similarity between them is expressed as follows:

wherein |U_i(t) ∩ U_j(t)| represents the number of users visiting both base stations i and j; J_ij ∈ [0,1], and the closer the similarity is to 1, the more the user populations served by the two base stations overlap; the user similarity of all base stations can be represented by an N × N matrix J; the higher the user similarity between two base stations, the more likely they are to cooperate, and the more likely they are to be divided into the same cooperation domain;

the content similarity and the user similarity between the base stations are considered to describe the comprehensive similarity between different base stations, and the comprehensive similarity is expressed as follows:

wherein α₁ is the content similarity weight and α₂ is the user similarity weight; the base station similarity describes the similarity of users' daily behavior by combining user similarity and user preference similarity; the greater the similarity between two edge base station nodes, the more identical content requests the two nodes receive or the more users they share, and such base stations are called cooperative base stations of each other; when a user cannot obtain a file at the local base station, the cooperative base stations are searched in descending order of their similarity to the local base station to obtain the content;

the similarity of any two base stations is described by a similarity matrix, and a weighted undirected graph is constructed from this matrix, in which each vertex is a base station node and each edge weight is the similarity between two base stations; all base stations are then divided into corresponding cooperation domains by spectral clustering;

in mobile edge computing, the required equivalent bandwidth is calculated from the relationship between the end-to-end bandwidth of a multi-hop network and the routing hop count; the larger the equivalent bandwidth between two nodes, the smaller the transmission delay between them; the equivalent bandwidth is calculated as

wherein B_i represents the bandwidth between two network nodes;

and finally, the optimal placement server in each cooperation domain is found according to the equivalent bandwidth, and copies are preferentially placed on these servers during global placement, thereby reducing user access delay.

Optionally, the S2 specifically is:

predicting the popularity of the copy at the next moment by adopting an unbiased grey model according to the historical popularity characteristics of the copy; the historical access popularity obtained by the historical request in time slot t is expressed as:

wherein

is the number of times content f_i is requested in cooperation domain C_n in time slot t, and

denotes the popularity of content f_i in cooperation domain C_n in time slot t; taking the popularity over the t historical time slots

as input, a grey time-series prediction model is constructed;

the grey prediction model GM(1,1) reduces the influence of randomness and volatility in the original data by applying a first-order accumulation to it, so that the resulting data sequence follows an approximately exponential law; a differential equation is established according to this approximately exponential law, its whitening equation (also called the shadow equation) is then constructed and solved, and a prediction function is finally obtained; the modeling process comprises the following steps:

first-order accumulation is carried out on the original t data points, and a new data sequence is obtained after accumulation

wherein

The gray differential equation for the prediction model is:

wherein a is the development coefficient and b is the grey action quantity; the whitening (shadow) equation of the grey differential equation is constructed, and its solution is obtained:

and carrying out inverse transformation operation on the obtained solution to obtain a prediction model equation:

thereby predicting the popularity of the new time slot copy of each cooperation domain

Meanwhile, the file popularity of each edge collaboration domain needs to be updated regularly so as to dynamically adapt to the changing requirements of users.
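As a rough illustration of the modeling steps above, the following Python sketch implements the classical GM(1,1) procedure: first-order accumulation, least-squares estimation of the development coefficient a and the grey action quantity b, solution of the whitening equation, and inverse accumulation. It is a minimal sketch under stated assumptions and not the unbiased variant referred to in the text; the function name gm11_forecast and the sample popularity values are hypothetical.

```python
import math


def gm11_forecast(history, steps=1):
    """Forecast the next `steps` values of a positive series with classical GM(1,1).

    Assumption: this is the textbook GM(1,1), shown only to illustrate the
    accumulate / fit / restore pipeline; the unbiased variant is not reproduced.
    """
    n = len(history)
    if n < 3:
        raise ValueError("need at least 3 historical values")
    # First-order accumulation: x1[k] = history[0] + ... + history[k].
    x1 = []
    total = 0.0
    for value in history:
        total += value
        x1.append(total)
    # Background values: mean of consecutive accumulated values.
    z = [0.5 * (x1[k] + x1[k - 1]) for k in range(1, n)]
    y = list(history[1:])
    m = n - 1
    # Least-squares fit of the grey differential equation y[k] = -a * z[k] + b.
    sum_z = sum(z)
    sum_y = sum(y)
    sum_zz = sum(v * v for v in z)
    sum_zy = sum(zv * yv for zv, yv in zip(z, y))
    det = m * sum_zz - sum_z * sum_z
    if det == 0:
        raise ValueError("degenerate series")
    a = (sum_z * sum_y - m * sum_zy) / det
    b = (sum_zz * sum_y - sum_z * sum_zy) / det

    def x1_hat(k):
        # Solution of the whitening equation dx1/dt + a * x1 = b.
        if abs(a) < 1e-12:
            return history[0] + b * k
        return (history[0] - b / a) * math.exp(-a * k) + b / a

    # Inverse accumulation restores forecasts on the original scale.
    return [x1_hat(k) - x1_hat(k - 1) for k in range(n, n + steps)]


# Hypothetical popularity history of one copy in one cooperation domain.
popularity = [10.0, 11.0, 12.1, 13.31, 14.641]
next_popularity = gm11_forecast(popularity)[0]
```

Refitting the model on a sliding window of recent time slots is one simple way to realize the regular popularity update mentioned above.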

Optionally, the S3 specifically is:

in the mobile edge scenario, S = {S₁, S₂, …, S_T} is defined as the system state space, where S_t represents the system state in time slot t; A = {A₁, A₂, …, A_T} is defined as the action space, where A_t represents the set of operations of all edge cooperation domains in time slot t; L_t(S_t, A_t) and C_t(S_t, A_t) are defined as the data access delay and the copy placement cost, respectively, in a specific state;

to minimize the conflict between the objectives in the multi-objective optimization, a new reward function is proposed, which is expressed as:

LC_t(S_t, A_t) = L_t(S_t, A_t) × C_t(S_t, A_t)

the objective of the Q-Learning model is to minimize the long-term cumulative LC-based metric, expressed as:

wherein γ represents the discount rate that determines the effect of future rewards on the current cumulative reward; γ = 0 means that the model considers only short-term returns, and γ = 1 means that the model focuses more on long-term returns; as one of the classical value-based learning algorithms in reinforcement learning, Q-Learning aims to learn the value of a specific action in a specific state and finally build an optimal Q-table, which stores the action values Q_t(S_t, A_t);

the best operation for each state is determined based on the Q-table, and the minimum cumulative LC metric is obtained; meanwhile, the action value Q_t(S_t, A_t) is updated as follows:

Q_{t+1}(S_t, A_t) = α_t (LC_t(S_t, A_t) + γ · min Q_t(S_{t+1}, A_t)) + (1 - α_t) Q_t(S_t, A_t)

the action value function in Q-Learning is corrected using a deep neural network (DNN), and the selection of the target-Q action is decoupled from the calculation of the target Q value by constructing different action value functions; whereas the Q-Learning-based method creates a lookup table to represent and update the value function, here the optimal Q value is approximated through the parameter θ after the deep neural network is updated, and the value function is expressed as follows:

Q(z,d)≈Q(z,d；θ)

in any time slot t, the agent stores the corresponding experience, as a transition tuple, in an experience replay pool and uses the stored tuples to train the parameters of the neural network; in subsequent training, the agent randomly samples previous experiences from the experience replay pool for learning; in each iteration, the Q function is trained to gradually approach the target value by minimizing the loss function Loss(θ);

the Q-Learning-based update procedure is repeated until all possible state-action pairs have been visited; finally, an optimal copy placement rule is given to optimize the copy deployment of each edge collaboration domain; the model periodically generates new copy placement rules to accommodate system dynamics and changing user requirements.

The invention has the following beneficial effects: the influence of user mobility and user preferences on copy placement is fully taken into account and, with cost and latency minimized, a latency-aware placement framework is constructed to improve the balance between user quality of service and server performance.

Additional advantages, objects, and features of the invention will be set forth in part in the description which follows and in part will become apparent to those having ordinary skill in the art upon examination of the following or may be learned from practice of the invention. The objectives and other advantages of the invention will be realized and attained by the structure particularly pointed out in the written description and claims thereof.

Drawings

For the purposes of promoting a better understanding of the objects, aspects and advantages of the invention, reference will now be made to the following detailed description taken in conjunction with the accompanying drawings in which:

FIG. 1 is a system diagram of the mobile edge scenario.

Detailed Description

The embodiments of the present invention are described below with reference to specific embodiments, and other advantages and effects of the present invention will be easily understood by those skilled in the art from the disclosure of the present specification. The invention is capable of other and different embodiments and of being practiced or of being carried out in various ways, and its several details are capable of modification in various respects, all without departing from the spirit and scope of the present invention. It should be noted that the drawings provided in the following embodiments are only for illustrating the basic idea of the present invention in a schematic way, and the features in the following embodiments and examples may be combined with each other without conflict.

The drawings are for illustrative purposes only; they are schematic representations rather than physical depictions and are not to be construed as limiting the invention. To better illustrate the embodiments of the present invention, some parts of the drawings may be omitted, enlarged or reduced, and do not represent the size of an actual product. It will be understood by those skilled in the art that certain well-known structures in the drawings, and descriptions thereof, may be omitted.

The same or similar reference numerals in the drawings of the embodiments of the present invention correspond to the same or similar components; in the description of the present invention, it should be understood that if there is an orientation or positional relationship indicated by terms such as "upper", "lower", "left", "right", "front", "rear", etc., based on the orientation or positional relationship shown in the drawings, it is only for convenience of description and simplification of description, but it is not an indication or suggestion that the referred device or element must have a specific orientation, be constructed in a specific orientation, and be operated, and therefore, the terms describing the positional relationship in the drawings are only used for illustrative purposes, and are not to be construed as limiting the present invention, and the specific meaning of the terms may be understood by those skilled in the art according to specific situations.

For the mobile edge scenario system shown in FIG. 1, the present invention aims to construct a delay-aware adaptive framework. The framework consists of a central node layer, an edge node layer and a user equipment layer.

(1) A central node layer: the core component of the layer is a cloud data center and is mainly responsible for making decisions on copy updating rules of each edge node. Since the edge-side recommended copy sequence still has some local and regional restrictions, a copy placement decision module and a global resource manager are built in the cloud data center. By fully considering the influence of the copies in different areas on the overall performance of the system, the central node layer can dig out the optimal copy distribution scheme so as to ensure the satisfactory overall user service quality and the overall performance of the system.

(2) Edge node layer: the layer is the most core part of the framework and is composed of a large number of distributed edge nodes, and each edge node is assumed to be composed of a local micro base station and a transformation-exchange server. And partitioning the distributed edge nodes into cooperation domains by considering the similarity among the edge nodes. In addition, a replica recommendation engine and an edge resource manager are deployed on each edge node to improve replica hit rates and avoid waste of resources.

(3) User equipment layer: this layer includes various consumer edge devices, such as mobile phones, cars, cameras and AR/VR devices, that need to request data services from the edge cloud or the central cloud. Many user devices typically move randomly from one location to another within a period of time, so the base stations they access may change in real time, which affects the users' quality of service.

The invention mainly comprises three parts:

1. Base station cooperation domain division method based on similarity

2. Method for predicting popularity of copies in cooperative domain

3. Copy placement decision method for cloud-edge cooperation

The specific method comprises the following steps:

1. Base station cooperation domain division based on similarity

In a typical mobile edge environment, data access requests are characterized by "multiple edges and high concurrency". A mobile user device accessing data may move from area A to area B within a period of time, which causes the serving edge node to switch and requires the access request to be resubmitted to the current node, so user movement significantly increases the frequency of data interaction. Furthermore, the delay perceived by a user depends to a large extent on the distance between the user and the files the user wants to access, so the copies held within the base stations largely determine the average user access delay.

The content preferences of users served by different base stations differ, so user groups in different regions request different categories of content. Because the popularity of requested content varies by region, regional popularity can differ greatly from global popularity. Therefore, identifying base stations with similar service patterns and grouping them into cooperation domains that provide copies to similar user groups can effectively reduce user access delay and improve QoS.

First, the similarity of requested contents between two base stations is defined, which can be expressed as:

wherein

indicates the number of requests for content f at the i-th base station in time slot t. The similarity ψ_ij ∈ [0,1]; the closer it is to 1, the greater the common interest between the two base stations. The N base stations finally form an N × N symmetric matrix, each entry of which represents the content similarity between a pair of base stations, and base stations with high similarity can cooperate on placement.
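The exact content-similarity formula is not reproduced in this text. As one hedged illustration, the sketch below assumes cosine similarity between per-content request-count vectors, which yields values in [0,1] for non-negative counts and a symmetric N × N matrix as described. The function names and sample request counts are hypothetical.

```python
import math


def content_similarity(requests_i, requests_j):
    """Cosine similarity between two {content_id: request_count} mappings.

    Assumption: cosine similarity stands in for the content-similarity
    formula, which is missing from the text.
    """
    contents = set(requests_i) | set(requests_j)
    dot = sum(requests_i.get(f, 0) * requests_j.get(f, 0) for f in contents)
    norm_i = math.sqrt(sum(v * v for v in requests_i.values()))
    norm_j = math.sqrt(sum(v * v for v in requests_j.values()))
    if norm_i == 0 or norm_j == 0:
        return 0.0  # a base station with no requests shares nothing
    return dot / (norm_i * norm_j)


def content_similarity_matrix(requests):
    """Build the symmetric N x N content-similarity matrix."""
    n = len(requests)
    return [[content_similarity(requests[i], requests[j]) for j in range(n)]
            for i in range(n)]


# Hypothetical request counts at three base stations in one time slot.
slot_requests = [
    {"video_a": 10, "video_b": 5},
    {"video_a": 20, "video_b": 10},
    {"map_tiles": 7},
]
psi = content_similarity_matrix(slot_requests)
```

In this example the first two base stations request the same contents in the same proportions, so their similarity is close to 1, while the third shares no content with either and has similarity 0.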

While moving, a user may request service from multiple base stations. If two base stations are requested for service by common users, they are called user-similar base stations, and the user similarity between them can be expressed as:

wherein |U_i(t) ∩ U_j(t)| represents the number of users accessing both base stations i and j. J_ij ∈ [0,1]; the closer it is to 1, the more the user populations served by the two base stations overlap. The user similarity of all base stations can be represented by an N × N matrix J. The higher the user similarity between two base stations, the more likely they are to cooperate, and thus the more likely they are to be divided into the same cooperation domain.
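The user-similarity formula itself is missing from this text. Because it is built on the number of users common to both base stations and is bounded in [0,1], the sketch below assumes a Jaccard index (shared users divided by all users seen by either base station). This is an assumption rather than the confirmed formula, and the function name and user IDs are hypothetical.

```python
def user_similarity(users_i, users_j):
    """Jaccard index of two sets of user IDs (assumed form of J_ij)."""
    union = users_i | users_j
    if not union:
        return 0.0  # neither base station served anyone in this slot
    return len(users_i & users_j) / len(union)


def user_similarity_matrix(user_sets):
    """Build the N x N user-similarity matrix J."""
    n = len(user_sets)
    return [[user_similarity(user_sets[i], user_sets[j]) for j in range(n)]
            for i in range(n)]


# Hypothetical user IDs served by three base stations in one time slot.
served = [{"u1", "u2", "u3"}, {"u2", "u3", "u4"}, {"u9"}]
J = user_similarity_matrix(served)
```

Here the first two base stations share two of four distinct users, giving a similarity of 0.5, while the third has no users in common with the others.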

Because the content requested by the users has regional difference, the service base stations to which the users with different interest preferences belong are divided by constructing the cooperative domains, so that the base stations in the same cooperative domain can serve the users with the same or similar interest preferences. The partitioning of the cooperation domain needs to ensure that each position in the cooperation domain is relatively adjacent, so the content similarity and the user similarity between the base stations are jointly considered to describe the comprehensive similarity between different base stations, which can be expressed as:

wherein α₁ is the content similarity weight and α₂ is the user similarity weight. The base station similarity describes the similarity of users' daily behavior by combining user similarity and user preference similarity. The magnitudes of the weights can significantly affect the clustering process. The greater the similarity between two edge base station nodes, the more identical content requests the two nodes receive or the more users they share; such base stations are called cooperative base stations of each other. When a user cannot obtain a file at the local base station, the cooperative base stations are searched in descending order of their similarity to the local base station to obtain the content.
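A minimal sketch of how the two similarity matrices might be combined and then used to order the search for a missing file. It assumes the comprehensive similarity is the weighted sum α₁ψ_ij + α₂J_ij, which matches the description of the weights but is not confirmed by a formula in this text; the function names, default weights and sample matrices are hypothetical.

```python
def combined_similarity(psi, jac, alpha1=0.5, alpha2=0.5):
    """Weighted sum of content similarity (psi) and user similarity (jac).

    Assumption: the comprehensive similarity is alpha1 * psi + alpha2 * jac.
    """
    n = len(psi)
    return [[alpha1 * psi[i][j] + alpha2 * jac[i][j] for j in range(n)]
            for i in range(n)]


def cooperative_search_order(similarity, local):
    """Other base stations, most similar to the local one first."""
    others = [j for j in range(len(similarity)) if j != local]
    return sorted(others, key=lambda j: similarity[local][j], reverse=True)


# Hypothetical similarity matrices for three base stations.
psi_example = [[1.0, 0.8, 0.2], [0.8, 1.0, 0.4], [0.2, 0.4, 1.0]]
jac_example = [[1.0, 0.6, 0.0], [0.6, 1.0, 0.2], [0.0, 0.2, 1.0]]
sim = combined_similarity(psi_example, jac_example)
order_for_bs0 = cooperative_search_order(sim, local=0)
```

On a local miss at base station 0, the lookup would try base station 1 before base station 2, since base station 1 has the higher combined similarity.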

Since transmission delay is proportional to distance, a maximum threshold on the distance between base stations is set to filter out unrelated base stations and improve computational efficiency. The similarity of any two base stations is described by a similarity matrix, and a weighted undirected graph is constructed from this matrix, in which each vertex is a base station node and each edge weight is the similarity between two base stations. All base stations are then divided into corresponding cooperation domains by spectral clustering.
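To make the clustering step concrete, the sketch below performs the simplest form of spectral clustering, a two-way split by the sign of the Fiedler vector of the graph Laplacian built from the similarity matrix. It is a simplification under stated assumptions: a full implementation for more than two cooperation domains would typically embed the nodes with several eigenvectors and run k-means on them. The function name, the power-iteration solver and the sample matrix are hypothetical choices made to keep the example dependency-free.

```python
import math


def spectral_bipartition(weights, iterations=500):
    """Split graph nodes into two groups by the sign of the Fiedler vector."""
    n = len(weights)
    if n < 2:
        return [0] * n
    # Degree counts only edges to other nodes (self-similarity is ignored).
    degree = [sum(weights[i][j] for j in range(n) if j != i) for i in range(n)]
    # Unnormalized graph Laplacian L = D - W.
    laplacian = [[degree[i] if i == j else -weights[i][j] for j in range(n)]
                 for i in range(n)]
    # Power iteration on M = shift * I - L: the dominant eigenvectors of M are
    # the eigenvectors of L with the smallest eigenvalues.
    shift = 2.0 * max(degree) + 1.0
    vector = [i - (n - 1) / 2.0 for i in range(n)]
    for _ in range(iterations):
        # Project out the constant vector (eigenvalue 0 of L).
        mean = sum(vector) / n
        vector = [v - mean for v in vector]
        product = [shift * vector[i]
                   - sum(laplacian[i][j] * vector[j] for j in range(n))
                   for i in range(n)]
        norm = math.sqrt(sum(p * p for p in product))
        if norm == 0:
            break
        vector = [p / norm for p in product]
    return [0 if v < 0 else 1 for v in vector]


# Hypothetical similarity matrix: two groups of three mutually similar base stations.
similarity = [[1.0 if i == j else (0.9 if (i < 3) == (j < 3) else 0.1)
               for j in range(6)] for i in range(6)]
domains = spectral_bipartition(similarity)
```

The distance threshold described above could be applied beforehand by setting the weight between base stations that are too far apart to zero.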

In mobile edge computing, the bandwidth between any two nodes is not necessarily the same, so the required equivalent bandwidth is calculated from the relationship between the end-to-end bandwidth of a multi-hop network and the routing hop count. The larger the equivalent bandwidth between two nodes, the smaller the transmission delay between them. The equivalent bandwidth is calculated as

wherein B_i represents the bandwidth between two network nodes.

Finally, the optimal placement server in each cooperation domain is found according to the equivalent bandwidth, and copies are preferentially placed on these servers during global placement, thereby significantly reducing user access delay.
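The equivalent-bandwidth formula is missing from this text. The sketch below assumes a common store-and-forward model in which per-hop transfer times add up, so that the equivalent bandwidth of a path is the reciprocal of the sum of the reciprocals of the hop bandwidths B_i. Both this formula and the server-selection rule (choose the candidate with the largest equivalent bandwidth) are assumptions for illustration, and all names and values are hypothetical.

```python
def equivalent_bandwidth(hop_bandwidths):
    """Equivalent end-to-end bandwidth of a multi-hop path.

    Assumption: transfer times add across hops, so
    1 / B_eq = sum(1 / B_i) over the hops of the route.
    """
    if not hop_bandwidths:
        raise ValueError("a path needs at least one hop")
    if any(b <= 0 for b in hop_bandwidths):
        raise ValueError("hop bandwidths must be positive")
    return 1.0 / sum(1.0 / b for b in hop_bandwidths)


def best_placement_server(paths):
    """Pick the candidate server whose path has the largest equivalent bandwidth."""
    return max(paths, key=lambda server: equivalent_bandwidth(paths[server]))


# Hypothetical hop bandwidths (Mbit/s) from a base station to candidate servers.
candidate_paths = {
    "edge_a": [100.0, 100.0],
    "edge_b": [80.0],
    "edge_c": [200.0, 50.0, 50.0],
}
chosen = best_placement_server(candidate_paths)
```

Under this model a single 80 Mbit/s hop beats two 100 Mbit/s hops (equivalent to 50 Mbit/s), which reflects the stated dependence on routing hop count.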

2. Collaborative domain replica popularity prediction

The popularity of a copy at the next moment is predicted with an unbiased grey model according to the copy's historical popularity characteristics. The historical access popularity obtained from the historical requests in time slot t can be expressed as:

wherein

is the number of times content f_i is requested in cooperation domain C_n in time slot t, and

denotes the popularity of content f_i in cooperation domain C_n in time slot t. Taking the popularity over the t historical time slots

as input, a grey time-series prediction model is constructed.

The grey prediction model GM(1,1) reduces the influence of randomness and volatility in the original data by applying a first-order accumulation to it, so that the resulting data sequence follows an approximately exponential law. A differential equation is established according to this approximately exponential law, its whitening equation (also called the shadow equation) is then constructed and solved, and a prediction function is finally obtained. The modeling process comprises the following steps:

wherein

The gray differential equation for the prediction model is:

3. Copy placement decision for cloud-side collaboration

Although the partitioning of collaboration domains allows users under the same collaboration domain to access data with lower latency, the replica recommendation sequence based on each edge collaboration domain still has some limitations, such as locality and redundancy. The replica placement phase is therefore responsible for receiving replica recommendation sequences from the edge collaboration domains and making global replica placement decisions at the cloud-centric nodes.

In addition, considering the influence of different copy placement modes in different areas on the overall performance of the system, the copy placement in each cooperation area needs to be coordinated and balanced with each other. On the one hand, it is important to place copies that are frequently accessed by the user in the local collaboration domain in order to meet the user's requirements for acceptable data access latency. On the other hand, over-placement of the copies will significantly increase the high overhead of transmission and storage. In order to make a good trade-off between the two, the placement is decided by constructing a multi-objective constraint model that minimizes latency and cost.

For the dynamic network environment, the optimal behavior decision is learned adaptively from the specific context based on the Q-Learning algorithm. Q-Learning is a model-free reinforcement learning algorithm based on value iteration. In the mobile edge scenario, S = {S₁, S₂, …, S_T} is defined as the system state space, where S_t represents the system state in time slot t. A = {A₁, A₂, …, A_T} is defined as the action space, where A_t represents the set of operations of all edge cooperation domains in time slot t. L_t(S_t, A_t) and C_t(S_t, A_t) are defined as the data access latency and the copy placement cost, respectively, in a particular state.

On the one hand, copies frequently accessed by users need to be placed in the cooperation domain to meet users' requirements for acceptable data access delay; on the other hand, excessive placement of copies causes high transmission and storage overhead. To minimize the conflict between these objectives in the multi-objective optimization, a new reward function is first proposed, expressed as:

LC_t(S_t, A_t) = L_t(S_t, A_t) × C_t(S_t, A_t)

where γ represents the discount rate that determines the effect of future rewards on the current cumulative reward. γ = 0 means that the model considers only short-term returns, and γ = 1 means that the model focuses more on long-term returns. As one of the classical value-based learning algorithms in reinforcement learning, Q-Learning aims to learn the value of a specific action in a specific state and finally build an optimal Q-table, which stores the action values Q_t(S_t, A_t).

Based on the Q-table, the best operation for each state can be determined and the minimum cumulative LC metric obtained. Meanwhile, the action value Q_t(S_t, A_t) is updated as follows:

Q_{t+1}(S_t, A_t) = α_t (LC_t(S_t, A_t) + γ · min Q_t(S_{t+1}, A_t)) + (1 - α_t) Q_t(S_t, A_t)
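The update rule above can be sketched as a tabular routine. This is a minimal illustration rather than the patented implementation: it interprets the min term as a minimum over the actions available in the next state, treats LC as a cost to be minimized, and uses hypothetical state names, action names, learning rate and discount rate.

```python
from collections import defaultdict


def lc_cost(latency, cost):
    """LC metric: data access latency multiplied by copy placement cost."""
    return latency * cost


def q_update(q_table, state, action, next_state, lc, actions, alpha=0.5, gamma=0.9):
    """One Q-Learning step that minimizes the cumulative LC metric.

    Assumption: the min in the update rule ranges over the actions that can
    be taken in the next state.
    """
    best_next = min(q_table[(next_state, a)] for a in actions)
    old_value = q_table[(state, action)]
    q_table[(state, action)] = alpha * (lc + gamma * best_next) + (1 - alpha) * old_value
    return q_table[(state, action)]


def best_action(q_table, state, actions):
    """Greedy choice: the action with the smallest learned LC value."""
    return min(actions, key=lambda a: q_table[(state, a)])


# Hypothetical example: two actions, one observed transition.
q_table = defaultdict(float)
actions = ["keep", "place_copy"]
updated = q_update(q_table, "s0", "place_copy", "s1", lc_cost(2.0, 3.0), actions)
```

Starting from an all-zero table, the single update with latency 2 and cost 3 moves the entry halfway towards the observed LC value of 6.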

Although in the Q-Learning-based approach the agent can obtain the optimal strategy by continuously recording and updating Q values in a Q-table, the possible state-action space of the agent in a real time-varying scenario becomes very large, so the approach easily suffers from the curse of dimensionality. Moreover, searching for the corresponding Q values in such a large table takes a significant amount of time.

To avoid this bottleneck in the Q-Learning method, the invention optimizes the reinforcement learning process: the action value function in Q-Learning is corrected using a deep neural network (DNN), and the selection of the target-Q action is decoupled from the calculation of the target Q value by constructing different action value functions. Whereas the Q-Learning-based method creates a lookup table to represent and update the value function, here the optimal Q value is approximated through the parameter θ after the deep neural network is updated, and the value function is expressed as:

Q(z,d)≈Q(z,d；θ)

where θ is the weight coefficient of the main neural network. In any time slot t, the agent stores the corresponding experience, as a transition tuple, in an experience replay pool and uses the stored tuples to train the parameters of the neural network; in subsequent training, the agent randomly samples some previous experiences from the experience replay pool for learning. In each iteration, the Q function is trained to gradually approach the target value by minimizing the loss function Loss(θ).
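The replay and target-decoupling ideas can be sketched without a deep learning framework. In the illustration below, the main and target value functions are plain Python callables standing in for neural networks, the target action is selected with the main function and evaluated with the target function (the usual double-Q arrangement), and the loss is the mean squared error between target and estimate. The squared-error form of the loss, the class and function names, and the sample values are assumptions.

```python
import random
from collections import deque


class ReplayPool:
    """Fixed-capacity pool of (state, action, cost, next_state) transitions."""

    def __init__(self, capacity):
        self.buffer = deque(maxlen=capacity)

    def store(self, state, action, cost, next_state):
        self.buffer.append((state, action, cost, next_state))

    def sample(self, batch_size):
        # Uniform random draw of past experience, without replacement.
        return random.sample(list(self.buffer), min(batch_size, len(self.buffer)))


def double_q_target(cost, next_state, actions, q_main, q_target, gamma):
    """Select the next action with q_main, evaluate it with q_target."""
    chosen = min(actions, key=lambda a: q_main(next_state, a))
    return cost + gamma * q_target(next_state, chosen)


def replay_loss(batch, actions, q_main, q_target, gamma=0.9):
    """Mean squared difference between targets and current estimates."""
    errors = [
        (double_q_target(c, s2, actions, q_main, q_target, gamma) - q_main(s, a)) ** 2
        for s, a, c, s2 in batch
    ]
    return sum(errors) / len(errors)


# Hypothetical stand-ins for the main and target networks.
actions = ["keep", "place_copy"]
q_main = lambda state, action: {"keep": 1.0, "place_copy": 2.0}[action]
q_target = lambda state, action: {"keep": 5.0, "place_copy": 0.0}[action]

pool = ReplayPool(capacity=2)
pool.store("s0", "place_copy", 1.0, "s1")
loss_value = replay_loss(pool.sample(1), actions, q_main, q_target)
```

In a full implementation the loss would be minimized by gradient descent on θ, and the target function would be refreshed from the main network at intervals.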

The Q-Learning-based update procedure is repeated until all possible state-action pairs have been visited. Finally, an optimal copy placement rule is given to optimize the copy deployment of each edge collaboration domain. In addition, the model periodically generates new copy placement rules to accommodate system dynamics and changing user requirements.

Finally, the above embodiments are intended only to illustrate the technical solutions of the present invention, not to limit it. Although the present invention has been described in detail with reference to the preferred embodiments, those skilled in the art will understand that modifications or equivalent substitutions may be made to the technical solutions without departing from their spirit and scope, and all such modifications should be covered by the claims of the present invention.

Claims

1. An edge-collaborative copy placement method, characterized in that the method comprises the following steps:

S1: dividing base station cooperation domains based on similarity;

S2: predicting the popularity of copies in each cooperation domain;

S3: making copy placement decisions through cloud-edge cooperation.

2. The edge-collaborative copy placement method according to claim 1, wherein S1 specifically comprises the following steps:

firstly, the request content similarity between two base stations is defined as follows:

wherein

in mobile edge computing, the required equivalent bandwidth is calculated from the relationship between the end-to-end bandwidth of a multi-hop network and the routing hop count; the larger the equivalent bandwidth between two nodes, the smaller the transmission delay between them, and the equivalent bandwidth is calculated as

where $B_i$ represents the bandwidth between two network nodes;

finally, the optimal placement server in each cooperation domain is identified according to the equivalent bandwidth, and these servers are given priority during global placement, thereby reducing user access delay.

3. The edge-collaborative copy placement method according to claim 2, wherein S2 specifically comprises the following steps:

wherein

As input, a grey time sequence prediction model is constructed;

the grey prediction model GM(1,1) applies a first-order accumulation to the original data to reduce the influence of randomness and volatility, and the resulting new data sequence follows an approximately exponential law; a differential equation is established according to this approximately exponential law, the whitening (shadow) equation is then established using the solution of the differential equation, and finally the prediction function is obtained, wherein the modeling process comprises the following steps:

Wherein->

The grey differential equation of the prediction model is:

4. The edge-collaborative copy placement method according to claim 3, wherein S3 specifically comprises the following steps:

in the mobile edge scenario, $S = \{S_1, S_2, \ldots, S_T\}$ is defined as the system state space, where $S_t$ represents the system state in time slot t; $A = \{A_1, A_2, \ldots, A_T\}$ is defined as the action space, where $A_t$ represents the set of operations of all edge cooperation domains in time slot t; $L_t(S_t, A_t)$ and $C_t(S_t, A_t)$ are defined as the data access delay and the copy placement cost in a given state, respectively;

$$LC_t(S_t, A_t) = L_t(S_t, A_t) \times C_t(S_t, A_t)$$

the optimal operation for each state is determined based on the Q-table so as to obtain the minimum cumulative LC metric; meanwhile, the action value $Q_t(S_t, A_t)$ is updated as follows:

$$Q_{t+1}(S_t, A_t) = \alpha_t \left( LC_t(S_t, A_t) + \gamma \min Q_t(S_{t+1}, A_t) \right) + (1 - \alpha_t)\, Q_t(S_t, A_t)$$

$$Q(z,d) \approx Q(z,d;\theta)$$

in any time slot t, the agent stores the corresponding experience as a transition tuple in an experience replay pool and uses the incoming tuples to train the parameters of the neural network, and in subsequent training the agent also randomly samples previous experiences from the experience replay pool for learning; in each iteration, the Q function is trained to gradually approach the target value by minimizing the loss function Loss(θ);