WO2021253835A1 - Heterogeneous network cache decision-making method based on user preference prediction - Google Patents

Heterogeneous network cache decision-making method based on user preference prediction

Info

Publication number
WO2021253835A1
WO2021253835A1 PCT/CN2021/074167 CN2021074167W WO2021253835A1 WO 2021253835 A1 WO2021253835 A1 WO 2021253835A1 CN 2021074167 W CN2021074167 W CN 2021074167W WO 2021253835 A1 WO2021253835 A1 WO 2021253835A1
Authority
WO
WIPO (PCT)
Prior art keywords
user
file
time
users
base station
Prior art date
Application number
PCT/CN2021/074167
Other languages
French (fr)
Chinese (zh)
Inventor
朱琦
单冠捷
Original Assignee
南京邮电大学
Priority date
Filing date
Publication date
Application filed by 南京邮电大学
Publication of WO2021253835A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G06F18/232 Non-hierarchical techniques
    • G06F18/2321 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G06F18/23213 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/50 Network services
    • H04L67/56 Provisioning of proxy services
    • H04L67/568 Storing data temporarily at an intermediate stage, e.g. caching
    • H04L67/5681 Pre-fetching or pre-delivering data based on network characteristics
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/50 Network services
    • H04L67/56 Provisioning of proxy services
    • H04L67/568 Storing data temporarily at an intermediate stage, e.g. caching
    • H04L67/5682 Policies or rules for updating, deleting or replacing the stored data
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04W WIRELESS COMMUNICATION NETWORKS
    • H04W28/00 Network traffic management; Network resource management
    • H04W28/02 Traffic management, e.g. flow control or congestion control

Definitions

  • the invention belongs to wireless communication technology, and specifically relates to a heterogeneous network cache decision-making method based on user preference prediction.
  • 5G heterogeneous networks deploy small base stations to offload the traffic load of macro base stations, but the backhaul line of small base stations has become a bottleneck for system performance.
  • the caching technology caches popular files in some users and small cells in advance. When users need these files, they can be obtained through small cells or D2D communication without occupying the backhaul link of the small cell and the bandwidth of the macro cell. During the peak traffic period, network congestion is avoided, and the delay can also be reduced, thereby improving QoS.
  • the present invention provides a heterogeneous network cache decision-making method based on user preference prediction.
  • a heterogeneous network cache decision-making method based on user preference prediction includes the following steps:
  • An indicator variable represents the physical relationship between users: it equals 1 if user i and user j have a physical relationship at time t, and 0 otherwise. $\mu_{i,j}$ denotes the parameter of the exponential distribution obeyed by the connection duration between user i and user j, and $\lambda_{i,j}$ denotes the parameter of the exponential distribution obeyed by the interval duration between them. From the connection state of user i and user j at time $t_0$, calculate the probability that they are connected at time $t_c$.
  • $\mu'_{u,s}$ and $\lambda'_{u,s}$ denote the parameters of the exponential distributions obeyed by the connection duration and the interval duration between user u and small base station s, respectively. An indicator variable represents the physical relationship between user u and small base station s; from the connection state at time $t_0$, calculate the probability that user u and small base station s are connected at time $t_c$.
  • $H = \{H_1, H_2, \ldots, H_U\}$ represents the historical file requests over the $T_b$ time slots before the decision time.
  • X is the data point vector, and the distance is Euclidean distance
  • N the number of important users
  • f(x) the average system overhead
  • Repeat step (15) until $j > V'_i$.
  • Step (4) The calculation formula is:
  • $A_i$ represents the social attributes of user i
  • social attributes refer to user interest tags, groups, etc. on social networks
  • frequency(k) is the total number of users who share social attribute k; the rarer the social attributes shared by user i and user j, the closer their social relationship. The social relationship $s_{i,j}$ is determined as follows:
  • $b_{i,j}(g_u)$ is the number of shortest paths between vertex $i \in VU$ and vertex $j \in VU$ in graph $G_s$ that pass through the vertex of user u;
  • $1_A(x)$ is the indicator function: its value is 1 if the condition x is true and 0 otherwise;
  • $c_f$ is the file category to which file f belongs; the accompanying term is the probability, obtained by fitting the Zipf distribution, that user u requests a file in category $c_i$.
  • In step (13), the average system overhead f(x) is calculated as:
  • The present invention optimizes the cache placement strategy of small base stations and important users with the goal of minimizing the average system overhead: it first predicts user preferences from the request history, then takes user mobility and social relations into account and optimizes the cache decision with a pseudo-Boolean optimization method. Its notable effects include the following aspects:
  • The nonlinear integer programming problem is NP-complete, so solving it exactly is computationally very expensive.
  • A polynomial-time greedy algorithm obtains the suboptimal cache decision.
  • Figure 1 is a schematic flow diagram of the method of the present invention
  • Figure 2 is a schematic diagram of a system model of the method of the present invention.
  • Figure 3 is a comparison diagram of a caching strategy based on popularity, a random caching strategy and the proposed caching strategy
  • Figure 4 is a comparison diagram of a cache strategy that does not consider mobility, a cache strategy that does not consider social relationships, and the proposed cache strategy;
  • Fig. 5 is a comparison diagram of the suboptimal value and the optimal value in the embodiment.
  • a heterogeneous network cache decision method based on user preference prediction provided by the present invention
  • macro base stations, small base stations and D2D communications coexist, and users have mobility and are affected by social relationships.
  • user preferences the probability distribution of users requesting different files
  • machine learning methods are used to predict user preferences based on their request history.
  • the expression of the average system cost is derived.
  • The user's mobility relative to other users and relative to small base stations is described by equations (1) and (2) respectively, and the social relationship between users is described by formula (3). Under the constraint of cache capacity, the optimization problem of minimizing the average system cost is constructed with the cache strategies of small base stations and important users as variables, and the cache decision is made by solving the problem.
  • the optimization problem is solved based on the suboptimal algorithm of the greedy algorithm to reduce the complexity of cache decision-making.
  • The present invention proves that the resulting optimization problem is the minimization of a supermodular function over a partition matroid. While the performance of the suboptimal solution is guaranteed, the computational complexity of the cache decision is greatly reduced, and caching at small base stations and important users greatly reduces the system cost.
  • The overall flow chart of the method of the present invention is shown in Fig. 1, and includes the following steps:
  • Step 1: predict user preferences
  • $t \in N$ denotes the t-th time slot, its starting time is $\tau_t$, and all time slots have length T.
  • At the start of each time slot, the macro base station learns whether the distance between users meets the requirement for D2D communication, that is, the users' initial D2D connection status in the current time slot. The indicator function for user i and user j equals 1 if they can perform D2D communication at the beginning of time slot t and 0 otherwise. Each user then randomly requests a file according to its preferences, and the requests of all users form the file request vector, whose i-th entry is the file requested by user i in time slot t. To simplify the model, it is assumed that each user requests a file at the beginning of the time slot.
  • the first is obtained from the cache of important users around through D2D communication, and the system cost is ⁇ 1 ;
  • the second type is obtained from the buffer of the small base station, and the system cost is ⁇ 2 ;
  • the third type is obtained from the macro base station, the system cost is ⁇ 3 , and ⁇ 1 ⁇ 2 ⁇ 3 .
  • D2D communication supports one-to-many, that is, a user can send files to multiple users at the same time or receive files from multiple users at the same time; users who are also in the service range of multiple small base stations can also establish communication with multiple small base stations at the same time .
  • In the current time slot, the macro base station first predicts the users' initial D2D connection status in the next time slot from their initial D2D connection status in the current slot, then jointly considers user mobility, social relations and other factors to arrive at the optimal caching strategy for the next time slot, and places the files that need to be cached in advance.
  • Whether D2D communication can be established between two users depends not only on the physical relationship between them but also on their social relationship.
  • The physical relationship between users is the physical distance between them. Because users are mobile, this distance changes constantly: one user may move closer to or farther from another. D2D communication can only be established within a certain physical distance, so whether two users can establish D2D communication, i.e. whether they have a physical relationship, can be treated as a probabilistic question.
  • The time during which two users remain connected is called the connection duration; the interval between two successful connections is called the interval duration.
  • The connection duration and the interval duration each obey an exponential distribution. A user can communicate with a small base station only within that base station's coverage, and although the location of the small base station is fixed, user mobility changes the relative distance between the user and the small base station. Therefore, as with D2D communication, an exponential distribution can also be used to model the connection duration and interval duration between users and small base stations.
  • An indicator variable represents the physical relationship between users: it equals 1 if user i and user j have a physical relationship at time t, and 0 otherwise. Define $\mu_{i,j}$ as the parameter of the exponential distribution obeyed by the connection duration between user i and user j, and $\lambda_{i,j}$ as the parameter of the exponential distribution obeyed by the interval duration between them.
  • The connection duration and interval duration between user u and small base station s obey exponential distributions with parameters $\mu'_{u,s}$ and $\lambda'_{u,s}$ respectively, and an indicator variable represents the physical relationship between user u and small base station s. Given the connection state at $t_0$, the probability that user u and small base station s are connected at $t_c$ is calculated as:
  • D2D communication also involves social relationships, and only users with close social relationships are willing to establish D2D communication.
  • Define $s_{i,j}$ as the social relationship between user i and user j; using the Adamic/Adar method, the social relationship between users is calculated from their social attributes as:
  • $A_i$ represents the social attributes of user i (the user's interest tags, groups, etc. on social networks), and frequency(k) is the total number of users who share social attribute k.
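The Adamic/Adar calculation described above can be sketched as follows. This is a minimal illustration that assumes the standard Adamic/Adar form, $s_{i,j} = \sum_{k \in A_i \cap A_j} 1/\log(\mathrm{frequency}(k))$; the function name `adamic_adar` and the dictionary layout are hypothetical and not taken from the filing.

```python
import math
from collections import Counter


def adamic_adar(attributes, i, j):
    """Adamic/Adar score between users i and j.

    attributes maps each user to the set of its social attributes
    (interest tags, groups, ...). Assumed form: sum over shared
    attributes k of 1 / log(frequency(k)), where frequency(k) is the
    number of users holding attribute k.
    """
    frequency = Counter(k for attrs in attributes.values() for k in attrs)
    shared = attributes[i] & attributes[j]
    # An attribute held by a single user has log(1) = 0; skip it to avoid
    # division by zero (this can only happen when i == j).
    return sum(1.0 / math.log(frequency[k]) for k in shared if frequency[k] > 1)


if __name__ == "__main__":
    users = {
        "a": {"music", "chess"},
        "b": {"music", "chess"},
        "c": {"music"},
    }
    print(adamic_adar(users, "a", "b"))  # shares a common and a rare attribute
    print(adamic_adar(users, "a", "c"))  # shares only the common attribute
```

In this toy data the rarer attribute ("chess", held by two users) contributes more to the score than the common one ("music", held by three), which matches the statement that rarer shared attributes indicate a closer relationship.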
  • The graph $G_s(VU, E_s)$ describes the social connections between users, where VU is the set of users and $E_s$ is the set of social connections; an edge between two users represents a social connection between them.
  • Caching files on a user's terminal device occupies the device's storage space. Because users are selfish, ordinary users are unwilling to cache files, and only important users hired by the operator act as cache nodes. The concept of social importance is introduced to measure how important a user is socially, and the operator selects the users with the greatest social importance as important users. Social importance is defined as:
  • $V_u$ and $B_u$ represent the device capacity and the betweenness centrality of user u, respectively.
  • Betweenness centrality is a concept commonly used in social network analysis to express how central a node is within the whole network. It is defined as:
  • $b_{i,j}(g_u)$ is the number of shortest paths between vertex $i \in VU$ and vertex $j \in VU$ in graph $G_s$ that pass through the vertex of user u.
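The betweenness centrality described above can be sketched with breadth-first search. This is a minimal illustration that assumes an unweighted, undirected social graph and sums the ratio $b_{i,j}(g_u)/b_{i,j}$ over unordered pairs of other users, where $b_{i,j}$ is the total number of shortest paths between i and j; whether the filing uses ordered pairs or a normalization factor is not stated here, and the function names are hypothetical.

```python
from collections import deque


def bfs_counts(graph, source):
    """Return (dist, sigma): hop distances and shortest-path counts from source."""
    dist = {source: 0}
    sigma = {source: 1}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for w in graph[v]:
            if w not in dist:            # first time w is reached
                dist[w] = dist[v] + 1
                sigma[w] = 0
                queue.append(w)
            if dist[w] == dist[v] + 1:   # v lies on a shortest path to w
                sigma[w] += sigma[v]
    return dist, sigma


def betweenness(graph, u):
    """Sum over unordered pairs {i, j} (both different from u) of b_ij(u) / b_ij."""
    dist_u, sigma_u = bfs_counts(graph, u)
    others = [v for v in graph if v != u]
    total = 0.0
    for a, i in enumerate(others):
        dist_i, sigma_i = bfs_counts(graph, i)
        for j in others[a + 1:]:
            if j not in dist_i or u not in dist_i or j not in dist_u:
                continue  # i and j (or u) are not connected
            if dist_i[u] + dist_u[j] == dist_i[j]:
                # paths i -> u times paths u -> j, over all shortest paths i -> j
                total += sigma_i[u] * sigma_u[j] / sigma_i[j]
    return total


if __name__ == "__main__":
    square = {"a": ["b", "d"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c", "a"]}
    print(betweenness(square, "b"))
```

In the four-cycle example, only the pair (a, c) has shortest paths through b, and b carries one of the two such paths.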
  • the file preference of each user is unknown.
  • The empirical probability distribution of each user's requests over the file categories can be calculated from the request counts as:
  • $1_A(x)$ is the indicator function: its value is 1 if the condition x is true and 0 otherwise. The resulting quantity is the empirical probability, calculated from the number of requests, that user u requests a file in category $c_i$.
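The count-based empirical distribution described above can be sketched as follows. This is a minimal illustration: the history is assumed to be a list of requested file identifiers, `file_category` is a hypothetical mapping from file to category index, and returning a uniform distribution for an empty history is an assumption made only so the function is always defined.

```python
from collections import Counter


def empirical_category_distribution(history, file_category, num_categories):
    """Fraction of a user's past requests that fall in each file category."""
    if not history:
        # Assumption: with no request history, fall back to a uniform guess.
        return [1.0 / num_categories] * num_categories
    counts = Counter(file_category[f] for f in history)
    return [counts[c] / len(history) for c in range(num_categories)]


if __name__ == "__main__":
    categories = {0: 0, 1: 0, 5: 2}   # file id -> category index
    print(empirical_category_distribution([0, 1, 1, 5], categories, 3))
```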
  • Users are divided into different types. For example, some users like science fiction movies the most, and others like comedy shows the most. Users of the same type can be considered to have essentially the same probability distribution. If the number of user types and the users belonging to each type can be determined accurately, fewer distinct probability distributions need to be estimated; moreover, because users of the same type are equivalent to a single user, the amount of historical request data available per user is effectively increased, which makes the empirical probability distribution more accurate and helps further prediction of user file preferences.
  • The K-means method is used to classify user types, with the Gap Statistic method determining the value of K. The cluster centers obtained by K-means with this K are taken as the empirical probability distributions of each user type's requests over the file categories. These distributions are then used to further predict the users' file preferences.
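The clustering step above can be sketched with a plain K-means loop over the users' empirical distributions. This is a minimal illustration with K fixed by the caller: the Gap Statistic selection of K is not implemented here, the random initialization and the function names are assumptions, and the distance is the Euclidean distance mentioned in the source.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, max_iter=100, seed=0):
    """Cluster points into k groups; returns (centers, labels)."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]   # assumed initialization
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest center.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centers[c]))
            for p in points
        ]
        if new_labels == labels:
            break                       # assignments are stable: converged
        labels = new_labels
        # Update step: each center becomes the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return centers, labels


if __name__ == "__main__":
    # Each row is one user's empirical distribution over two file categories.
    users = [(0.9, 0.1), (0.8, 0.2), (0.85, 0.15),
             (0.1, 0.9), (0.2, 0.8), (0.15, 0.85)]
    centers, labels = kmeans(users, 2)
    print(centers, labels)
```

Each returned center is itself a probability vector when the inputs are, since a mean of distributions is a distribution, which is what lets the centers serve as per-type empirical distributions.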
  • The Zipf distribution is widely used in mobile network caching research and is considered a good description of users' file preferences or of file popularity (the probability distribution of each file being requested by users). Therefore, a Zipf distribution is fitted to the empirical probability distribution of each user type's requests over the file categories.
  • the Zipf probability distribution is:
  • $P_c$ is the probability that the user requests a file in category c, $\mathrm{rank}(c) \in \{1,\ldots,C\}$ is the popularity rank of category c, s is the Zipf distribution parameter, which describes the skewness of the user's preference, and C is the total number of categories.
  • the logarithm of the probability of each type of file being requested has a linear relationship with the logarithm of the category ranking, with a slope of -s and an intercept of
  • The top-ranked categories in a Zipf distribution account for the vast majority of requests, so only the request probabilities of the top 5 categories in each user type's empirical probability distribution are considered. A linear regression of the logarithm of these probabilities on the logarithm of their ranks gives the Zipf parameter s, and the request probability of each category is then calculated from its rank in the empirical probability distribution. Assuming that the user's preference is uniform over the files within each category, the predicted user file preferences are:
  • $c_f$ is the file category to which file f belongs; the accompanying term is the probability, obtained by fitting the Zipf distribution, that user u requests a file in category $c_i$.
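The Zipf fitting and preference prediction described above can be sketched as follows. This is a minimal illustration that assumes the standard Zipf form $P_c = \mathrm{rank}(c)^{-s} / \sum_{r=1}^{C} r^{-s}$, so that $\log P_c$ is linear in $\log \mathrm{rank}(c)$ with slope $-s$; the function names are hypothetical, and the per-file preference is the category probability divided evenly among the files of that category.

```python
import math


def fit_zipf_exponent(empirical, top=5):
    """Least-squares slope of log-probability against log-rank for the top categories."""
    ranked = sorted((p for p in empirical if p > 0), reverse=True)[:top]
    if len(ranked) < 2:
        raise ValueError("need at least two categories with nonzero probability")
    xs = [math.log(r) for r in range(1, len(ranked) + 1)]
    ys = [math.log(p) for p in ranked]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return -slope   # the Zipf parameter s is minus the slope


def zipf_category_probabilities(empirical, s):
    """Assign each category the Zipf probability of its rank in the empirical data."""
    num = len(empirical)
    order = sorted(range(num), key=lambda c: -empirical[c])   # rank 1 = most requested
    norm = sum(r ** -s for r in range(1, num + 1))
    probs = [0.0] * num
    for rank, c in enumerate(order, start=1):
        probs[c] = rank ** -s / norm
    return probs


def file_preferences(category_probs, files_per_category):
    """Spread each category's probability uniformly over its files."""
    return [p / files_per_category
            for p in category_probs
            for _ in range(files_per_category)]


if __name__ == "__main__":
    empirical = [0.10, 0.45, 0.05, 0.25, 0.15]
    s = fit_zipf_exponent(empirical)
    print(s, file_preferences(zipf_category_probabilities(empirical, s), 2))
```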
  • Step 2: formulate the optimization problem with the goal of minimizing the average system overhead:
  • ⁇ u (t) is the cost for user u to obtain the requested file at time t, and there are:
  • Case 1 indicates that user u in time slot t can obtain the file from its own cache or from important users through D2D communication;
  • Case 2 indicates that user u in time slot t can obtain the file from the small base station;
  • Case 3 indicates that user u in time slot t can obtain the file from the macro base station.
  • the cache placement strategy cannot be changed, and the user request file has been determined, so the system cost ⁇ (t) is determined.
  • the work to be done is to determine the cache placement strategy at t+1 according to the current D2D connection between users and the user’s file preferences to minimize the average system cost E( ⁇ (t+1)).
  • For convenience, the time index is omitted below; unless otherwise stated, all quantities refer to time slot t+1.
  • the average system cost is expressed as:
  • The cache placement strategy of the important users consists of one vector per important user: the n-th vector is the cache placement strategy of the n-th important user in time slot t+1. Each entry is a 0-1 variable that equals 1 when the n-th important user caches file f in time slot t+1 and 0 otherwise.
  • the probability that the user obtains the requested file through himself or D2D communication is:
  • The event $A_{u,f,n}$ indicates that user u can obtain the requested file f from the n-th important user.
  • The first equality holds because a user can establish D2D communication with multiple important users at the same time: as long as one of them can transfer file f completely, i.e. the D2D communication time between the two is not less than $t_{min}$, the user can obtain the requested file from itself or through D2D communication, and case 1 is satisfied.
  • The third equality holds because the events of obtaining the file from different important users are mutually independent. The probability of each such event is derived below.
  • The first equality holds because the average system cost is calculated during the cache placement stage of time slot t, when it is already known whether D2D communication is possible between users in time slot t. The event $A_{u,f,n}$ is equivalent to the conjunction of four conditions: user u and IU n are D2D-connected, the D2D communication duration $t_{d2d}$ between them is not less than $t_{min}$, the two have a social connection, and IU n has cached the file f requested by the user.
  • the second equal sign is established because of the event It has no effect on the probability of the previous events, and Only affect the event
  • The third equality holds because the social connection and the cache policy variable are not random variables but deterministic values.
  • equation (15) In order to simplify the concept, let Then there is Put it into equation (15) to get:
  • the optimization problem can be constructed as:
  • The first constraint is the cache capacity limit of the small base stations.
  • The second constraint is the device capacity limit of the important users.
  • The third constraint requires the cache placement variables of the small base stations and important users to be 0-1 variables.
  • Step 3: prove that the optimization problem is the minimization of a monotonically decreasing supermodular function over a partition matroid:
  • The objective function of problem (21) can be regarded as a function f(x) of x, namely:
  • f(x) is a monotonically decreasing supermodular function.
  • $V'_i$ is the cache capacity limit of an important user or a small base station, defined separately for $i \in \{1,\ldots,N\}$ and for $i \in \{N+1,\ldots,N+S\}$.
  • Each element of LF is a cache placement strategy of the important users and small cells that satisfies the constraints of problem (21); that is, LF is the set of all cache placement strategies of the important users and small cells that satisfy those constraints. Therefore, the constraints of problem (21) are equivalent to the partition matroid (EF, LF).
  • The optimization problem (21) is therefore the minimization of a monotonically decreasing supermodular function over a partition matroid.
  • Figure 3 shows a comparison of system costs obtained through three different methods. From top to bottom, the first curve corresponds to the system cost obtained through random caching. This strategy randomly places files into the caches of IUs and SBS until they are full.
  • The second curve shows the system cost of using a caching strategy based on popularity. This is a widely used caching strategy whose idea is to cache the most popular files at each cache node. To implement it, after predicting the preferences of all users, we take the average of all user preferences as the global file popularity, and every IU and SBS places the most popular files in its cache until the cache is full.
  • the bottom curve shows the system cost obtained by the proposed suboptimal caching strategy.
  • Figure 4 demonstrates the necessity of considering mobility and sociality in caching strategies.
  • The top curve shows the system cost of using the optimal caching strategy without considering mobility.
  • The curve is obtained as follows. First, mobility is removed from the scenario: if a user can communicate with another user or SBS in the current time slot, they are assumed to be able to communicate in the next time slot as well. The local greedy caching algorithm is then applied to this modified scenario to obtain a caching strategy that does not consider mobility, and that strategy is applied to the scenario with mobility to obtain the corresponding system cost.
  • the middle curve shows the system cost of using a caching strategy that does not consider sociality.
  • the curve is obtained in the following way: First, remove the sociality in the scene, that is, if two users physically meet the requirements of D2D communication, then they can establish D2D communication regardless of whether they have a social relationship. Then apply the local greedy caching algorithm to this scenario to obtain a caching strategy that does not consider sociality, and then apply this strategy to a scenario that considers sociality to obtain the corresponding system cost.
  • Figure 5 shows the comparison of the system cost between the proposed suboptimal caching strategy and the optimal caching strategy.
  • The optimal caching strategy here is obtained by variable substitution: the nonlinear integer programming problem is transformed into a linear integer programming problem, which standard linear integer programming tools can then solve for the optimal caching strategy. Because the optimization problem is NP-complete, the comparison scenario is kept small to limit the computational cost: it contains only one SBS, and the number of important users is between 1 and 4. The suboptimal value is obtained by the proposed method. It can be seen that the gap between the optimal value and the suboptimal value is very small.
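The greedy cache placement discussed above can be sketched as a locally greedy pass over the cache nodes. This is a minimal illustration under stated assumptions: each node is filled one slot at a time up to its capacity with the file that lowers the cost most, which is one reading of the "local greedy" algorithm and not necessarily the filing's exact step order. The toy cost function (`make_cost`, with hypothetical hit and miss costs) stands in for the average system cost f(x), which is not reproduced here.

```python
def local_greedy_placement(nodes, files, capacity, cost):
    """Fill each cache node slot by slot with the file that lowers cost the most.

    cost is any callable taking a set of (node, file) pairs; the capacity
    per node plays the role of the partition matroid constraint.
    """
    placement = set()
    current = cost(placement)
    for node in nodes:
        for _ in range(capacity[node]):
            candidates = [(cost(placement | {(node, f)}), f)
                          for f in files if (node, f) not in placement]
            if not candidates:
                break
            best_cost, best_file = min(candidates)
            if best_cost >= current:
                break                   # no file improves the cost any further
            placement.add((node, best_file))
            current = best_cost
    return placement, current


def make_cost(preferences, reachable, hit_cost=1.0, miss_cost=10.0):
    """Toy expected cost: a request is cheap if any reachable node caches the file."""
    def cost(placement):
        total = 0.0
        for user, prefs in preferences.items():
            for f, p in enumerate(prefs):
                hit = any((node, f) in placement for node in reachable[user])
                total += p * (hit_cost if hit else miss_cost)
        return total
    return cost


if __name__ == "__main__":
    cost = make_cost({"u1": [0.5, 0.3, 0.2]}, {"u1": ["iu", "sbs"]})
    placement, value = local_greedy_placement(
        ["iu", "sbs"], [0, 1, 2], {"iu": 1, "sbs": 1}, cost)
    print(sorted(placement), value)
```

In the example, the second node skips the file already cached by the first because caching it again yields no further cost reduction, which is the diminishing-returns behaviour that the supermodularity argument relies on.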

Landscapes

  • Engineering & Computer Science (AREA)
  • Signal Processing (AREA)
  • Data Mining & Analysis (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Probability & Statistics with Applications (AREA)
  • Mobile Radio Communication Systems (AREA)

Abstract

Disclosed in the present invention is a heterogeneous network cache decision-making method based on user preference prediction. In the method, a macro base station, small base stations and D2D communication coexist, and the mobility of users and the influence of their social relationships are considered. First, when user preferences are unknown, a machine learning method is adopted to predict the user preferences from the users' request history; then, the average system cost is calculated by considering the mobility, physical position relationship and social relationship of the users, an optimization problem of minimizing the average system cost is constructed by taking the cache strategies of the small base stations and the important users as variables under the constraint of cache capacity, and cache decision-making is performed by solving the problem. The optimization problem is solved as the minimization of a supermodular function over a partition matroid; while the performance of the suboptimal solution is guaranteed, the computational complexity of cache decision-making is greatly reduced, and the system cost is therefore greatly reduced by caching at the small base stations and the important users.

Description

A heterogeneous network cache decision-making method based on user preference prediction
Technical field
The invention belongs to wireless communication technology, and specifically relates to a heterogeneous network cache decision-making method based on user preference prediction.
Background
With the development of the mobile Internet, the rapid growth of wireless mobile devices has generated a large amount of data traffic, which poses challenges for mobile communications. Local caching of popular files is one of the solutions to these challenges. 5G heterogeneous networks deploy small base stations to offload the traffic load of macro base stations, but the backhaul link of the small base stations has become a bottleneck for system performance. Caching technology stores popular files in advance at some users and small base stations. When users need these files, they can obtain them through the small base stations or D2D communication without occupying the backhaul link of the small base station or the bandwidth of the macro base station. This avoids network congestion during traffic peaks and also reduces delay, thereby improving QoS.
However, because of cost constraints, the cache devices deployed at small base stations have limited capacity, and the storage capacity of mobile devices is smaller still, far less than the size of the Internet content library. Making correct cache decisions about which files to place in the cache is therefore very important for improving the cache hit rate.
Summary of the invention
Objective of the invention: in order to overcome the deficiencies in the prior art, the present invention provides a heterogeneous network cache decision-making method based on user preference prediction.
In order to achieve the above objective, the technical solution provided by the present invention is as follows:
A heterogeneous network cache decision-making method based on user preference prediction, in which macro base station, small base station and D2D communication modes coexist, and which includes the following steps:
(S1) First, when the probability distribution of users requesting different files is unknown, use machine learning to predict user preferences from their request history;
(S2) Derive the expression of the average system cost based on the users' mobility, physical location relationship and social relationship. Under the constraint of cache capacity, construct the optimization problem of minimizing the average system cost with the cache strategies of the small base stations and important users as variables, and make the cache decision by solving this problem;
(S3) Solve the optimization problem of minimizing the average system cost with a suboptimal algorithm based on the greedy algorithm, and determine the files to be cached according to the solution vector.
Further, the algorithm of the method proceeds as follows:
(1) Let S={1,...,S}, U={1,2,...,U}, C={1,...,C} and F={1,...,C*F_c} denote the set of small base stations, the set of users, the set of file categories and the set of files, respectively, where S, U, C and F_c are the number of small base stations, the number of users, the number of file categories and the number of files per category. Let $t_{min}$ and $t'_{min}$ denote the minimum communication time required to download a file through D2D and through a small base station, respectively. The macro base station holds all the files in the content library;
(2) Divide time into time slots of equal length T, where t ∈ N denotes the t-th time slot and τ_t is its starting time. At the start of each time slot, the users' initial D2D connection status for the current slot is
Figure PCTCN2021074167-appb-000001
where the indicator function
Figure PCTCN2021074167-appb-000002
indicates whether user i and user j can perform D2D communication at the start of time slot t (1 if they can, 0 otherwise). Each user then requests a file at random according to his or her preference, forming the file request vector
Figure PCTCN2021074167-appb-000003
where
Figure PCTCN2021074167-appb-000004
is the file requested by user i in time slot t;
(3) Let the indicator variable
Figure PCTCN2021074167-appb-000005
denote the physical relationship between users: if user i and user j have a physical relationship at time t, then
Figure PCTCN2021074167-appb-000006
otherwise
Figure PCTCN2021074167-appb-000007
Let μ_{i,j} be the parameter of the exponential distribution obeyed by the connection duration between user i and user j, and let λ_{i,j} be the parameter of the exponential distribution obeyed by the interval duration between them. From the connection status of user i and user j at time t_0,
Figure PCTCN2021074167-appb-000008
compute the probability that user i and user j are connected at time t_c:
Figure PCTCN2021074167-appb-000009
(4) Let μ′_{u,s} and λ′_{u,s} be the parameters of the exponential distributions obeyed by the connection duration and the interval duration, respectively, between user u and small base station s, and let the indicator variable
Figure PCTCN2021074167-appb-000010
denote the physical relationship between user u and small base station s. From the connection status at time t_0,
Figure PCTCN2021074167-appb-000011
compute the probability that user u and small base station s are connected at time t_c:
Figure PCTCN2021074167-appb-000012
(5) Let S_{i,j} denote the social relationship between user i and user j, and let S_T denote the social relationship threshold; from S_{i,j} and S_T, determine the social tie s_{i,j} between users. Let θ_u denote the social importance of user u, which measures how socially important the user is, and compute θ_u = α·V_u + β·B_u for every user, where V_u and B_u are the device capacity and the betweenness centrality of user u, and α and β are weight coefficients satisfying α + β = 1; select the important users that will cache files according to social importance;
(6) Let H = {H_1, H_2, ..., H_U} denote the historical file requests over the T_b time slots preceding the decision time, where
Figure PCTCN2021074167-appb-000013
is the request history of user u and
Figure PCTCN2021074167-appb-000014
is the file requested in the t_b-th of those T_b time slots. From the historical file requests H, compute each user's count-based empirical probability distribution over the file categories,
Figure PCTCN2021074167-appb-000015
and use it as the data set for the K-means algorithm;
(7) For each candidate value of K, compute the sum of the distances from all data points to their cluster centers as the performance measure of the corresponding K-means model:
Figure PCTCN2021074167-appb-000016
where X is a data point vector and the distance is the Euclidean distance;
(8) Compute Gap(K) = E(log D_K) - log D_K as the Gap Statistic, where E(log D_K) is the expectation of log D_K, and select the value optK of K that maximizes Gap(K) as the number of user classes;
(9) For each user class, take its cluster center as the empirical probability distribution with which that class of users requests each file category; sort the entries of the cluster center in descending order and obtain the corresponding index vector; take the logarithms of the top five values and of their ranks as the y and x data, and perform linear regression to obtain the Zipf distribution parameter s;
(10) Compute the probability that this class of users requests each file category:
Figure PCTCN2021074167-appb-000017
Assuming that preferences over the files within a category are uniformly distributed, obtain the user file preference (the probability distribution over all files)
Figure PCTCN2021074167-appb-000018
where
Figure PCTCN2021074167-appb-000019
is the probability that user u requests the f-th file;
(11) Repeat steps (9) and (10) until the file preferences of all optK user classes have been obtained, which yields the set of file preferences of all users
Figure PCTCN2021074167-appb-000020
(12) Let ξ_1 be the cost of obtaining a file from the user's own storage or from an important user through D2D communication, ξ_2 the cost of obtaining a file from a small base station, and ξ_3 the cost of obtaining a file from the macro base station. A user first tries to obtain the requested file from its own storage or from an important user; failing that, from a small base station; and if neither holds the file, from the macro base station;
(13) Let N denote the number of important users, and let
Figure PCTCN2021074167-appb-000021
denote the cache placement decision variables of all important users and small base stations; derive the expression for the average system cost f(x); initialize i = N+1 and x_subopt as an all-zero vector of length (N+S)F;
(14) Set j = 1 and let F_left = {1, ..., F};
(15) Let
Figure PCTCN2021074167-appb-000022
then set the ((i-1)F + f_opt)-th element of x_subopt to 1, remove f_opt from the set F_left, and finally set j = j + 1;
(16) Repeat step (15) until j > V′_i.
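Steps (13) to (16) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented procedure itself: `avg_cost` stands in for f(x), whose expression appears only as a figure; the toy cost `toy_cost` and its popularity values are hypothetical; indices are 0-based; and the order in which cache nodes are visited is an assumption.

```python
from typing import Callable, List, Sequence


def greedy_cache_placement(
    num_files: int,
    capacities: Sequence[int],
    avg_cost: Callable[[List[int]], float],
) -> List[int]:
    """Greedy placement under per-node capacity (partition matroid) limits.

    capacities[i] is how many files cache node i may hold. In the returned
    vector x, x[i * num_files + f] == 1 means node i caches file f.
    """
    x = [0] * (len(capacities) * num_files)
    for i, cap in enumerate(capacities):
        remaining = set(range(num_files))          # plays the role of F_left
        for _ in range(min(cap, num_files)):
            best_f, best_cost = -1, float("inf")
            for f in sorted(remaining):
                x[i * num_files + f] = 1           # tentatively cache f
                cost = avg_cost(x)
                x[i * num_files + f] = 0
                if cost < best_cost:
                    best_f, best_cost = f, cost
            x[i * num_files + best_f] = 1          # commit the best file
            remaining.remove(best_f)
    return x


# Hypothetical toy cost: a file costs 1.0 if any node caches it, else 3.0,
# weighted by an assumed request probability per file.
POPULARITY = [0.5, 0.3, 0.2]


def toy_cost(x: List[int]) -> float:
    nodes = len(x) // len(POPULARITY)
    return sum(
        p * (1.0 if any(x[i * len(POPULARITY) + f] for i in range(nodes)) else 3.0)
        for f, p in enumerate(POPULARITY)
    )


placement = greedy_cache_placement(3, [1, 1], toy_cost)
```

With this toy cost, the first node takes the most popular file and the second node takes the next one, because caching a file that is already cached elsewhere brings no further reduction.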
Furthermore, in step (3),
Figure PCTCN2021074167-appb-000023
is calculated as follows:
Figure PCTCN2021074167-appb-000024
In step (4),
Figure PCTCN2021074167-appb-000025
is calculated as:
Figure PCTCN2021074167-appb-000026
Furthermore, the social relationship S_{i,j} in step (5) is calculated as follows:
Figure PCTCN2021074167-appb-000027
where A_i is the set of social attributes of user i (the user's interest tags, groups, and so on in a social network) and frequency(k) is the number of users who have social attribute k. The rarer the social attributes shared by user i and user j, the closer their social relationship. The social tie s_{i,j} is determined as follows:
only when S_{i,j} >= S_T are user i and user j considered to have a social tie, in which case s_{i,j} = 1; otherwise s_{i,j} = 0.
Furthermore, the betweenness centrality B_u is calculated as follows:
Figure PCTCN2021074167-appb-000028
where b_{i,j} is the number of shortest paths between vertex i ∈ VU and vertex j ∈ VU in the graph G_s, and b_{i,j}(g_u) is the number of those shortest paths that pass through the vertex of user u;
Furthermore, the empirical probability distribution in step (6),
Figure PCTCN2021074167-appb-000029
is calculated as:
Figure PCTCN2021074167-appb-000030
where 1_A(x) is the indicator function, equal to 1 if condition x is true and 0 otherwise;
Furthermore, in step (10),
Figure PCTCN2021074167-appb-000031
is calculated as:
Figure PCTCN2021074167-appb-000032
where c_f is the category to which file f belongs, which satisfies
Figure PCTCN2021074167-appb-000033
and
Figure PCTCN2021074167-appb-000034
is the probability, obtained by fitting the Zipf distribution, that user u requests a file in category c_i.
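The Zipf fit of step (9) and the per-file preference of step (10) can be sketched as follows. Since the formulas themselves appear only as figures, the sketch assumes the standard normalized Zipf form, P = rank^(-s) / Σ_{r=1..C} r^(-s), and a uniform split of each category's probability over its F_c files. The function names are hypothetical and categories are 0-indexed.

```python
import math
from typing import List


def fit_zipf_exponent(center: List[float], top: int = 5) -> float:
    """Least-squares slope of log(probability) against log(rank), negated."""
    ranked = sorted(center, reverse=True)[:top]
    pts = [(math.log(r), math.log(p))
           for r, p in enumerate(ranked, start=1) if p > 0]
    n = len(pts)
    mean_x = sum(x for x, _ in pts) / n
    mean_y = sum(y for _, y in pts) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pts)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pts)
    return -sxy / sxx  # the regression slope is -s


def category_probabilities(center: List[float], s: float) -> List[float]:
    """Zipf probability of each category, ranked by its empirical value."""
    num_categories = len(center)
    # Index vector: category indices sorted by descending empirical probability.
    order = sorted(range(num_categories), key=lambda c: center[c], reverse=True)
    norm = sum(r ** -s for r in range(1, num_categories + 1))
    probs = [0.0] * num_categories
    for rank, c in enumerate(order, start=1):
        probs[c] = rank ** -s / norm
    return probs


def file_preferences(cat_probs: List[float], files_per_category: int) -> List[float]:
    """Spread each category probability uniformly over its files."""
    return [p / files_per_category
            for p in cat_probs for _ in range(files_per_category)]


# Hypothetical cluster center for one user class over six file categories.
center = [0.05, 0.40, 0.10, 0.25, 0.15, 0.05]
s_hat = fit_zipf_exponent(center)
prefs = file_preferences(category_probabilities(center, s_hat), files_per_category=4)
```

Restricting the regression to the top five entries follows step (9); the fitted exponent is then applied to all C categories according to their empirical rank.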
Furthermore, the average system cost f(x) in step (13) is calculated as:
Figure PCTCN2021074167-appb-000035
where
Figure PCTCN2021074167-appb-000036
Beneficial effects: Compared with the prior art, the present invention optimizes the cache placement strategies of the small base stations and the important users with the goal of minimizing the average system cost. It first predicts user preferences from the request history, and then, taking user mobility and social relationships into account, optimizes the cache decision with a pseudo-Boolean optimization method. Its notable effects include the following:
1. On the assumption that users with similar interests have essentially the same file preferences, K-means is applied to the users' historical file requests to divide the users into different types, and the empirical probability distribution with which each type requests the different files is obtained. Because this distribution is inaccurate when historical data are limited, a Zipf distribution is fitted to the data to give a more accurate prediction of user file preferences.
2. From the user file preferences, user mobility, user social relationships, and the cache contents placed at the important users and small base stations, the probability that a user obtains the requested file from an important user, a small base station, or the macro base station in the next time slot is derived, and from it the average system cost, which yields a nonlinear integer programming problem that minimizes the average system cost.
3. This nonlinear integer programming problem is NP-complete and expensive to solve. To reduce the complexity, after proving that the objective function is a monotone supermodular function and that the constraint is a partition matroid, a polynomial-time greedy algorithm is proposed to obtain a suboptimal cache decision.
Description of the drawings
Figure 1 is a schematic flow diagram of the method of the present invention;
Figure 2 is a schematic diagram of the system model of the method of the present invention;
Figure 3 compares the popularity-based caching strategy and the random caching strategy with the proposed caching strategy;
Figure 4 compares a caching strategy that ignores mobility and a caching strategy that ignores social relationships with the proposed caching strategy;
Figure 5 compares the suboptimal value with the optimal value in the embodiment.
Detailed description
To explain the technical solution disclosed in the present invention in detail, it is further described below with reference to the accompanying drawings and specific embodiments.
In the heterogeneous network cache decision-making method based on user preference prediction provided by the present invention, macro base station, small base station, and D2D communication coexist, and users are mobile and influenced by social relationships. First, with user preferences (the probability distribution of a user's requests over the files) unknown, a machine learning method predicts user preferences from the request histories. An expression for the average system cost is then derived that takes user mobility and social relationships into account: the mobility of a user relative to other users and relative to the small base stations is described by equations (1) and (2) below, and the social relationship between users by equation (3) below. Under the cache capacity constraints, and with the cache strategies of the small base stations and the important users as variables, an optimization problem minimizing the average system cost is formulated, and the cache decision is made by solving it. To address the high computational complexity when there are many important users, the objective function is proved to be supermodular, and the optimization problem is then solved by a suboptimal algorithm based on the greedy algorithm, which reduces the complexity of the cache decision. By proving that the resulting optimization problem is the minimization of a supermodular function over a partition matroid, the present invention greatly reduces the computational complexity of the cache decision while preserving the quality of the suboptimal solution, so that caching at the small base stations and the important users greatly reduces the system cost.
Specifically, the overall flow of the method of the present invention is shown in Figure 1 and includes the following steps:
Step 1: Predict user preferences
As shown in Figure 2, the coverage area of the macro base station contains multiple small base stations and multiple users. Assume that there are S small base stations within the coverage of the macro base station, and that every small base station s ∈ S = {1, ..., S} has the same cache capacity V_SBS; the coverage areas of small base stations may overlap. There are U users within the macro base station's coverage, and the device capacity of each user u ∈ U = {1, 2, ..., U} is V_u. The file library consists of C categories of files, where each category c ∈ C = {1, ..., C} contains F_c files, so the library holds F = C*F_c files in total. Every file f ∈ F = {1, ..., C*F_c} is assumed to have the same size, and the minimum communication times needed to download a file through D2D and through a small base station are t_min and t_min′, respectively. The macro base station is assumed to hold every file in the content library. Users can communicate with one another through D2D.
Time is divided into time slots of equal length T, where t ∈ N denotes the t-th time slot and τ_t is its starting time. At the start of each time slot, the macro base station can learn whether the distance between each pair of users meets the requirement for D2D communication, that is, the users' initial D2D connection status for the current slot
Figure PCTCN2021074167-appb-000037
where the indicator function
Figure PCTCN2021074167-appb-000038
indicates whether user i and user j can perform D2D communication at the start of time slot t (1 if they can, 0 otherwise). Each user then requests a file at random according to his or her preference, forming the file request vector
Figure PCTCN2021074167-appb-000039
where
Figure PCTCN2021074167-appb-000040
is the file requested by user i in time slot t. To simplify the model, every user is assumed to request a file at the start of the time slot.
A user can obtain a file in three ways.
The first is from the cache of a nearby important user through D2D communication, at a system cost of ξ_1;
The second is from the cache of a small base station, at a system cost of ξ_2;
The third is from the macro base station, at a system cost of ξ_3, where ξ_1 < ξ_2 < ξ_3. D2D communication is assumed to support one-to-many transmission, that is, a user can send files to several users or receive files from several users at the same time; likewise, a user within the service range of several small base stations can communicate with all of them at the same time.
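The three-tier cost structure above can be illustrated with a short sketch. It assumes that the event "the file can be fetched over D2D" and the event "the file can be fetched from a small base station" are independent, which the text does not state; the function name and the numeric cost values are hypothetical.

```python
def expected_fetch_cost(
    p_d2d: float,
    p_sbs: float,
    xi1: float = 1.0,
    xi2: float = 2.0,
    xi3: float = 3.0,
) -> float:
    """Expected cost of one request under the D2D -> SBS -> MBS fallback.

    p_d2d: probability the file is obtainable from an important user (or the
           user's own storage); p_sbs: probability it is obtainable from a
           small base station. Independence of the two is an assumption.
    """
    if not (xi1 < xi2 < xi3):
        raise ValueError("costs must satisfy xi1 < xi2 < xi3")
    if not (0.0 <= p_d2d <= 1.0 and 0.0 <= p_sbs <= 1.0):
        raise ValueError("probabilities must lie in [0, 1]")
    miss_d2d = 1.0 - p_d2d
    return (p_d2d * xi1
            + miss_d2d * p_sbs * xi2
            + miss_d2d * (1.0 - p_sbs) * xi3)
```

With both probabilities at 0 the cost is ξ_3 (every request falls through to the macro base station), and raising either probability can only lower the expected cost, which is why cache placement matters.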
In the current time slot, the macro base station first infers the users' initial D2D connection status for the next time slot from the current initial D2D connection status, then derives the optimal cache strategy for the next time slot by jointly considering user mobility, social relationships, and other factors, and pre-places the files to be cached accordingly.
Whether D2D communication can be established between two users depends not only on their physical relationship but also on their social relationship. The physical relationship between users is the physical distance between them. Because users are mobile, this distance changes continuously: a user may approach or move away from another user, and D2D communication can be established only within a certain physical distance. Whether two users can establish D2D communication, or in other words whether they have a physical relationship, can therefore be treated as a probabilistic question. When the physical distance between two users is less than the maximum D2D communication distance, they are connected; the length of time they stay connected is called the connection duration, and the time between two successive connections is called the interval duration. To model user mobility, both the connection duration and the interval duration are assumed to follow exponential distributions. A user can communicate with a small base station only while within its coverage, and although the small base station is fixed, the user's mobility changes the distance between them; as with D2D communication, the connection duration and interval duration between a user and a small base station can therefore also be modeled with exponential distributions.
Define the indicator variable
Figure PCTCN2021074167-appb-000041
to indicate the physical relationship between users: if user i and user j have a physical relationship at time t, then
Figure PCTCN2021074167-appb-000042
otherwise
Figure PCTCN2021074167-appb-000043
Define μ_{i,j} as the parameter of the exponential distribution obeyed by the connection duration between user i and user j, and λ_{i,j} as the parameter of the exponential distribution obeyed by the interval duration between them. Given the connection status of user i and user j at time t_0,
Figure PCTCN2021074167-appb-000044
the probability that user i and user j are connected at time t_c is:
Figure PCTCN2021074167-appb-000045
Similarly, the connection duration and interval duration between user u and small base station s are assumed to follow exponential distributions with parameters μ′_{u,s} and λ′_{u,s}, respectively, and the indicator variable
Figure PCTCN2021074167-appb-000046
denotes the physical relationship between user u and small base station s. Given the connection status at time t_0,
Figure PCTCN2021074167-appb-000047
the probability that user u and small base station s are connected at time t_c is:
Figure PCTCN2021074167-appb-000048
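Equations (1) and (2) appear here only as figures. A minimal sketch of one way to compute such a probability is given below, assuming that connected and disconnected periods alternate as a two-state continuous-time Markov chain and that μ and λ are the rate parameters of the connection and interval durations. Under those assumptions the transient probabilities have the standard closed form used in the code; whether this matches the figures exactly cannot be confirmed from the text.

```python
import math


def connection_probability(
    connected_at_t0: bool, mu: float, lam: float, dt: float
) -> float:
    """P(connected at t0 + dt | state at t0) for an on/off Markov link.

    mu:  rate of the exponential connection duration (leaving "connected").
    lam: rate of the exponential interval duration (leaving "disconnected").
    dt:  elapsed time t_c - t_0, must be non-negative.
    """
    if mu <= 0 or lam <= 0 or dt < 0:
        raise ValueError("mu and lam must be positive and dt non-negative")
    total = mu + lam
    stationary = lam / total              # long-run fraction of time connected
    decay = math.exp(-total * dt)         # how fast the initial state is forgotten
    if connected_at_t0:
        return stationary + (1.0 - stationary) * decay
    return stationary * (1.0 - decay)
```

The same function serves both the user-to-user case (μ_{i,j}, λ_{i,j}) and the user-to-small-base-station case (μ′_{u,s}, λ′_{u,s}), since only the parameters differ.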
For security reasons, establishing D2D communication also depends on social relationships: only users with a close social relationship are willing to establish D2D communication. Define S_{i,j} as the social relationship between user i and user j; using the Adamic/Adar method, it is computed from the users' social attributes as:
Figure PCTCN2021074167-appb-000049
where A_i is the set of social attributes of user i (the user's interest tags, groups, and so on in a social network) and frequency(k) is the number of users who have social attribute k. The rarer the social attributes shared by user i and user j, the closer their social relationship, because rare attributes better reflect a user's characteristics and preferences. Define S_T as the social relationship threshold: only when S_{i,j} >= S_T are user i and user j considered to have a social tie, in which case s_{i,j} = 1; otherwise s_{i,j} = 0. The graph G_s(VU, E_s) describes the social ties between users, where VU is the set of users and E_s is the set of social ties; an edge between two users means that they have a social tie.
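As an illustration of the Adamic/Adar score and the threshold rule, the sketch below assumes the usual form S_{i,j} = Σ_{k ∈ A_i ∩ A_j} 1 / log(frequency(k)), since equation (3) is available only as a figure. The user names, attribute sets, and threshold value are hypothetical.

```python
import math
from collections import Counter
from typing import Dict, Set, Tuple

Pair = Tuple[str, str]


def social_relations(attributes: Dict[str, Set[str]]) -> Dict[Pair, float]:
    """Adamic/Adar score S_ij for every unordered pair of users."""
    # frequency(k): number of users who have attribute k
    frequency = Counter(k for attrs in attributes.values() for k in attrs)
    users = sorted(attributes)
    scores: Dict[Pair, float] = {}
    for a, i in enumerate(users):
        for j in users[a + 1:]:
            shared = attributes[i] & attributes[j]
            # A shared attribute has frequency >= 2, so log(...) > 0.
            scores[(i, j)] = sum(1.0 / math.log(frequency[k]) for k in shared)
    return scores


def social_ties(scores: Dict[Pair, float], threshold: float) -> Dict[Pair, int]:
    """s_ij = 1 when S_ij >= S_T, else 0."""
    return {pair: int(score >= threshold) for pair, score in scores.items()}


attrs = {
    "alice": {"chess", "music"},
    "bob": {"chess", "music"},
    "carol": {"music"},
}
scores = social_relations(attrs)
ties = social_ties(scores, threshold=1.0)
```

Here "chess" is shared by only two users and therefore contributes more to the score of alice and bob than the more common "music" does, which matches the remark that rarer shared attributes indicate a closer relationship.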
Caching files on a user's terminal occupies the device's storage. Because users are selfish, they are unwilling to cache files of their own accord, and only the important users hired by the operator act as cache nodes. The concept of social importance is introduced to measure how socially important a user is, and the operator selects the users with the highest social importance as important users. Social importance is defined as:
θ_u = α·V_u + β·B_u, u = 1, ..., U          (4)
where V_u and B_u are the device capacity and the betweenness centrality of user u, and α and β are weight coefficients satisfying α + β = 1. Betweenness centrality is a concept commonly used in social network analysis to express how central a vertex is within the whole network. It is defined as:
Figure PCTCN2021074167-appb-000050
where b_{i,j} is the number of shortest paths between vertex i ∈ VU and vertex j ∈ VU in the graph G_s, and b_{i,j}(g_u) is the number of those shortest paths that pass through the vertex of user u.
Thus, the larger a user's device capacity and the more users it has social ties with, the greater its social importance. The operator ranks the users in descending order of social importance and generally selects the top N users as important users.
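The selection of important users can be sketched as follows. Betweenness centrality is computed with Brandes' algorithm on the unweighted social graph, counting each unordered pair (i, j) once; whether equation (5) sums over ordered or unordered pairs is not visible in the text, so that choice is an assumption, as are the function names and the example graph.

```python
from collections import deque
from typing import Dict, List, Set, Tuple


def betweenness(adj: Dict[int, Set[int]]) -> Dict[int, float]:
    """Betweenness centrality B_u of every vertex (Brandes' algorithm)."""
    bc = {v: 0.0 for v in adj}
    for s in adj:
        sigma = {v: 0 for v in adj}      # number of shortest s-v paths
        sigma[s] = 1
        dist = {s: 0}
        preds: Dict[int, List[int]] = {v: [] for v in adj}
        order: List[int] = []
        queue = deque([s])
        while queue:                      # breadth-first search from s
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):         # accumulate pair dependencies
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            if w != s:
                bc[w] += delta[w]
    # Each unordered pair was visited from both endpoints in an undirected graph.
    return {v: b / 2.0 for v, b in bc.items()}


def select_important_users(
    capacity: Dict[int, float], adj: Dict[int, Set[int]], alpha: float, n: int
) -> Tuple[List[int], Dict[int, float]]:
    """Rank users by theta_u = alpha * V_u + beta * B_u and keep the top n."""
    beta = 1.0 - alpha
    bc = betweenness(adj)
    theta = {u: alpha * capacity[u] + beta * bc[u] for u in adj}
    ranked = sorted(theta, key=lambda u: (-theta[u], u))
    return ranked[:n], theta


# Star-shaped social graph: user 0 is tied to users 1, 2, and 3.
graph = {0: {1, 2, 3}, 1: {0}, 2: {0}, 3: {0}}
important, theta = select_important_users({u: 2.0 for u in graph}, graph, 0.5, 1)
```

In the star graph every shortest path between two leaves passes through user 0, so user 0 has the highest betweenness and is selected even though all device capacities are equal.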
A user's file preference plays a crucial role in determining the cache placement strategy. Each user's file preference is unknown; the macro base station has only each user's historical file requests over the T_b time slots preceding the time under study, H = {H_1, H_2, ..., H_U}, where
Figure PCTCN2021074167-appb-000051
is the request history of user u and
Figure PCTCN2021074167-appb-000052
is the file requested in the t_b-th of those T_b time slots. From the historical file requests H, each user's count-based empirical probability distribution over the file categories can be computed as:
Figure PCTCN2021074167-appb-000053
where 1_A(x) is the indicator function, equal to 1 if condition x is true and 0 otherwise, and
Figure PCTCN2021074167-appb-000054
is the empirical probability, computed from the request counts, that user u requests a file of category c_i.
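The count-based empirical distribution can be sketched in a few lines. The sketch assumes that files are numbered 1 to C*F_c with the files of category c occupying the contiguous block (c-1)F_c + 1 to cF_c; the text does not spell out this mapping, and the function names and sample history are hypothetical.

```python
from collections import Counter
from typing import List


def file_category(f: int, files_per_category: int) -> int:
    """Category (1-indexed) of file f, assuming contiguous blocks of F_c files."""
    return (f - 1) // files_per_category + 1


def empirical_category_distribution(
    history: List[int], num_categories: int, files_per_category: int
) -> List[float]:
    """Fraction of past requests that fell in each category."""
    if not history:
        raise ValueError("request history must not be empty")
    counts = Counter(file_category(f, files_per_category) for f in history)
    return [counts[c] / len(history) for c in range(1, num_categories + 1)]


# One user's requests over T_b = 5 slots, with C = 3 categories of F_c = 3 files.
dist = empirical_category_distribution([1, 2, 3, 7, 8], 3, 3)
```

Stacking one such vector per user gives the data set that the clustering step operates on.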
Because the users' request histories are observed over only a small number of time slots, little data is available, so this empirical probability cannot accurately describe the true probability with which a user requests each file category. The true probability distribution of a user's requests over the file categories must therefore be predicted from the empirical probabilities.
In practice, users fall into different types: some like science fiction films best, for example, while others prefer comedy shows. Users of the same type can thus be regarded as having essentially the same probability distribution. If the number of user types and the users belonging to each type can be identified accurately, fewer distinct request distributions need to be estimated, and, because users of the same type are treated as a single user, the historical request data available per distribution effectively increases. This makes the empirical probability distribution more accurate and helps in predicting user file preferences.
The K-means method is used to divide the users into types, and the Gap Statistic method is used to determine the value of K. The cluster centers obtained by running K-means with this K,
Figure PCTCN2021074167-appb-000055
serve as the empirical probability distribution with which each user type requests each file category. This distribution is then used to further predict the users' file preferences.
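A compact, dependency-free sketch of this clustering step is given below. It uses Lloyd's algorithm for K-means, takes D_K as the sum of Euclidean distances to the nearest center, and estimates E(log D_K) by averaging over reference data sets drawn uniformly from the bounding box of the data. The reference distribution, the number of reference sets, and the number of restarts are assumptions, because the text does not specify how the expectation is computed; all function names are hypothetical.

```python
import math
import random
from typing import List, Sequence, Tuple

Point = Sequence[float]


def _dist(a: Point, b: Point) -> float:
    """Euclidean distance between two points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def kmeans(points: List[Point], k: int, rng: random.Random,
           iters: int = 50) -> Tuple[List[List[float]], float]:
    """Lloyd's algorithm; returns the centers and D_K."""
    centers = [list(p) for p in rng.sample(list(points), k)]
    for _ in range(iters):
        clusters: List[List[Point]] = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: _dist(p, centers[c]))
            clusters[nearest].append(p)
        updated = [
            [sum(col) / len(members) for col in zip(*members)] if members else centers[c]
            for c, members in enumerate(clusters)
        ]
        if updated == centers:
            break
        centers = updated
    d_k = sum(min(_dist(p, c) for c in centers) for p in points)
    return centers, d_k


def gap_statistic(points: List[Point], k: int, rng: random.Random,
                  n_refs: int = 10, restarts: int = 5) -> float:
    """Gap(K) = mean over reference sets of log D_K, minus log D_K of the data."""
    def log_best_dk(data: List[Point]) -> float:
        best = min(kmeans(data, k, rng)[1] for _ in range(restarts))
        return math.log(max(best, 1e-12))   # guard against log(0)

    lows = [min(col) for col in zip(*points)]
    highs = [max(col) for col in zip(*points)]
    ref_logs = []
    for _ in range(n_refs):
        ref = [[rng.uniform(lo, hi) for lo, hi in zip(lows, highs)] for _ in points]
        ref_logs.append(log_best_dk(ref))
    return sum(ref_logs) / n_refs - log_best_dk(points)


def choose_k(points: List[Point], candidates: Sequence[int],
             rng: random.Random) -> int:
    """Pick the candidate K with the largest Gap(K)."""
    return max(candidates, key=lambda k: gap_statistic(points, k, rng))


# Hypothetical empirical distributions of ten users over three categories.
offsets = (0.0, 0.01, 0.02, -0.01, -0.02)
users = [[0.7 + e, 0.2, 0.1 - e] for e in offsets]
users += [[0.1 - e, 0.2, 0.7 + e] for e in offsets]
opt_k = choose_k(users, range(1, 4), random.Random(0))
```

Taking the K with the largest gap follows the selection rule stated in the text; the original Gap Statistic literature also describes a rule based on the standard error of the reference sets, which this sketch does not implement.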
The Zipf distribution is widely used in research on mobile network caching and is considered to describe well a user's file preference or file popularity (the probability distribution with which each file is requested by users). It is therefore fitted to the empirical probability distribution of a user's requests over the file categories. The Zipf probability distribution is:
Figure PCTCN2021074167-appb-000056
where P_c is the probability that the user requests a file in the category ranked c-th in his or her preference, rank(c) ∈ {1, ..., C} is the popularity rank of category c, s is the parameter of the Zipf distribution, which describes the skewness of the user's preference, and C is the total number of categories.
The Zipf distribution is thus determined entirely by the parameter s, so fitting only requires determining the value of s. Taking the logarithm of both sides of equation (7) and rearranging gives:
$$\log P_c = -s \log \operatorname{rank}(c) - \log \sum_{i=1}^{C} i^{-s} \tag{8}$$
The logarithm of the probability that each category is requested is therefore linear in the logarithm of the category rank, with slope -s and intercept $-\log \sum_{i=1}^{C} i^{-s}$. Because the top-ranked categories of a Zipf distribution account for the vast majority of requests, only the request probabilities of the five top-ranked categories in each user type's empirical probability distribution are considered. Linear regression on the logarithms of these probabilities and of their ranks yields the parameter s of the Zipf distribution they follow, and the request probability of every category is then computed according to its rank in the empirical probability distribution. Assuming further that a user's preference is uniformly distributed over the files within each category, the predicted user file preference is:
Figure PCTCN2021074167-appb-000059
where
Figure PCTCN2021074167-appb-000060
is the file preference of user u, and
Figure PCTCN2021074167-appb-000061
is the probability that user u requests the f-th file, with:
Figure PCTCN2021074167-appb-000062
where c_f is the file category to which file f belongs and satisfies
Figure PCTCN2021074167-appb-000063
and
Figure PCTCN2021074167-appb-000064
is the probability, obtained by fitting the Zipf distribution, that user u requests a file in category c_i.
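The fitting and prediction procedure described above can be sketched as follows. This is a minimal illustration, not the patent's code. It assumes that the cluster center is given as a list of strictly positive category probabilities, that every category contains the same number of files, and that files are numbered category by category. All function names and the sample numbers are hypothetical.

```python
import math

def fit_zipf_exponent(empirical, top=5):
    """Least-squares fit of log(probability) against log(rank), top categories only."""
    ranked = sorted(empirical, reverse=True)[:top]
    xs = [math.log(r) for r in range(1, len(ranked) + 1)]
    ys = [math.log(p) for p in ranked]  # needs strictly positive probabilities
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return -slope  # the regression slope equals -s

def zipf_by_empirical_rank(empirical, s):
    """Zipf probability of each category, ranked as in the empirical distribution."""
    order = sorted(range(len(empirical)), key=lambda c: -empirical[c])
    norm = sum(r ** (-s) for r in range(1, len(empirical) + 1))
    probs = [0.0] * len(empirical)
    for rank, c in enumerate(order, start=1):
        probs[c] = rank ** (-s) / norm
    return probs

def file_preference(category_probs, files_per_category):
    """Spread each category's probability uniformly over its files."""
    return [p / files_per_category
            for p in category_probs
            for _ in range(files_per_category)]

# Hypothetical cluster center over six categories, four files per category.
center = [0.05, 0.40, 0.10, 0.25, 0.08, 0.12]
s = fit_zipf_exponent(center)
prefs = file_preference(zipf_by_empirical_rank(center, s), files_per_category=4)
```

Fitting on the logarithms keeps the regression linear, so an ordinary least-squares slope is sufficient and no iterative optimizer is needed.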
Step2: Formulate the optimization problem whose objective is to minimize the average system cost:
The system cost for all users to obtain their requested files in time slot t is defined as:
Figure PCTCN2021074167-appb-000065
where ξ_u(t) is the cost for user u to obtain the requested file in time slot t, given by:
Figure PCTCN2021074167-appb-000066
where case1 means that in time slot t user u can obtain the file from its own cache or from an important user through D2D communication; case2 means that in time slot t user u can obtain the file from a small base station; and case3 means that in time slot t user u obtains the file from the macro base station.
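The three cases can be illustrated with a short sketch. It rests on assumptions that are not confirmed by the text above: the system cost is taken to be the sum of the per-user costs, the cases are checked in the priority order case1, case2, case3, and the default cost values are invented for illustration. The names `retrieval_cost` and `system_cost` are hypothetical.

```python
def retrieval_cost(own_or_d2d_hit, small_cell_hit, xi1=1.0, xi2=5.0, xi3=20.0):
    """Cost for one user in one time slot (illustrative values, xi1 < xi2 < xi3)."""
    if own_or_d2d_hit:      # case1: own cache or an important user via D2D
        return xi1
    if small_cell_hit:      # case2: a small base station holds the file
        return xi2
    return xi3              # case3: fall back to the macro base station

def system_cost(hits, **costs):
    """Sum of per-user costs; `hits` is a list of (case1_hit, case2_hit) pairs."""
    return sum(retrieval_cost(d2d, sbs, **costs) for d2d, sbs in hits)

# Three users: one served by D2D, one by a small base station, one by the macro cell.
total = system_cost([(True, True), (False, True), (False, False)])
```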
In time slot t the cache placement strategy can no longer be changed and the files requested by the users are already determined, so the system cost ξ(t) is fixed. The task is to determine, from information such as the current D2D connections between users and the users' file preferences, the cache placement strategy for time slot t+1 that minimizes the average system cost E(ξ(t+1)). For convenience, the time index is omitted below; unless stated otherwise, all quantities refer to time slot t+1. The average system cost is expressed as:
Figure PCTCN2021074167-appb-000067
Applying the law of total probability gives:
Figure PCTCN2021074167-appb-000068
where
Figure PCTCN2021074167-appb-000069
is the probability that user u requests file f in time slot t+1. Using the user file preference prediction algorithm of Section 3, we obtain
Figure PCTCN2021074167-appb-000070
Suppose that N important users are selected in time slot t+1 and that the cache placement strategy of the important users is
Figure PCTCN2021074167-appb-000071
where
Figure PCTCN2021074167-appb-000072
is the cache placement strategy vector of the n-th important user in time slot t+1, and
Figure PCTCN2021074167-appb-000073
is a 0-1 variable that equals 1 when the n-th important user caches file f in time slot t+1 and 0 otherwise. The probability that a user obtains the requested file from its own cache or through D2D communication is:
Figure PCTCN2021074167-appb-000074
where the event A_{u,f,n} means that user u can obtain the requested file f from the n-th important user. The first equality holds because a user can establish D2D communication with several important users at the same time: as long as one of them can transmit file f to the user in full, that is, the D2D communication between the two lasts at least t_min, the user obtains the requested file from its own cache or through D2D communication and case1 is satisfied. The third equality holds because the events of obtaining the file from different important users are mutually independent. Next,
Figure PCTCN2021074167-appb-000075
is derived:
Figure PCTCN2021074167-appb-000076
The first equality holds because the average system cost is computed during the cache placement phase of time slot t, at which point the D2D connectivity between users at the start of time slot t,
Figure PCTCN2021074167-appb-000077
is known, and because the event A_{u,f,n} is equivalent to the conjunction of four conditions: user u and IU n satisfy the D2D connection condition, the D2D communication duration t_d2d between them is at least t_min, the two have a social tie, and IU n has cached the file f requested by the user. The second equality holds because the event
Figure PCTCN2021074167-appb-000078
does not affect the probabilities of the preceding events, while
Figure PCTCN2021074167-appb-000079
affects only the event
Figure PCTCN2021074167-appb-000080
The third equality holds because the social tie
Figure PCTCN2021074167-appb-000081
and the cache strategy variable
Figure PCTCN2021074167-appb-000082
are deterministic values rather than random variables. In the fourth equality,
Figure PCTCN2021074167-appb-000083
can be obtained from equation (1). To simplify the notation, let
Figure PCTCN2021074167-appb-000084
so that
Figure PCTCN2021074167-appb-000085
Substituting this into equation (15) gives:
Figure PCTCN2021074167-appb-000086
Analogously to the cache strategy of the important users, let the cache placement strategy of the small base stations be
Figure PCTCN2021074167-appb-000087
where
Figure PCTCN2021074167-appb-000088
is the cache placement strategy of small base station s in time slot t+1. Following the derivation of
Figure PCTCN2021074167-appb-000089
we obtain:
Figure PCTCN2021074167-appb-000090
where
Figure PCTCN2021074167-appb-000091
Note that because communication with a small base station does not involve social relationships,
Figure PCTCN2021074167-appb-000092
contains one factor fewer than
Figure PCTCN2021074167-appb-000093
namely the indicator variable representing the social relationship.
Because a user can always communicate with the macro base station, and the macro base station holds every file in the content library, we have:
Figure PCTCN2021074167-appb-000094
Substituting equations (17) to (19) into equation (13) gives:
Figure PCTCN2021074167-appb-000095
Based on this average system cost, the optimization problem is formulated as:
Figure PCTCN2021074167-appb-000096
The first constraint is the cache capacity limit of the small base stations, the second is the device capacity limit of the important users, and the third states that the cache placement variables of both the small base stations and the important users are 0-1 variables.
Step3: Prove that the optimization problem is an instance of minimizing a monotonically decreasing supermodular function over a partition matroid:
Figure PCTCN2021074167-appb-000097
代表所有重要用户和小基站缓存放置策略变量,则问题(21)中的目标函数可以看做是关于x的函数f(x),即:
make
Figure PCTCN2021074167-appb-000097
Representing all important users and small cell buffer placement strategy variables, the objective function in question (21) can be regarded as a function f(x) about x, namely:
Figure PCTCN2021074167-appb-000098
Figure PCTCN2021074167-appb-000098
where
Figure PCTCN2021074167-appb-000099
By its definition, its value lies in the range [0,1]. To simplify the proof,
Figure PCTCN2021074167-appb-000100
are written uniformly as
Figure PCTCN2021074167-appb-000101
When 1≤k≤N,
Figure PCTCN2021074167-appb-000102
denotes
Figure PCTCN2021074167-appb-000103
and when N+1≤k≤N+S,
Figure PCTCN2021074167-appb-000104
denotes
Figure PCTCN2021074167-appb-000105
Likewise,
Figure PCTCN2021074167-appb-000106
are written uniformly as
Figure PCTCN2021074167-appb-000107
When 1≤k≤N,
Figure PCTCN2021074167-appb-000108
denotes
Figure PCTCN2021074167-appb-000109
and when N+1≤k≤N+S,
Figure PCTCN2021074167-appb-000110
denotes
Figure PCTCN2021074167-appb-000111
Equation (22) then simplifies further to:
Figure PCTCN2021074167-appb-000112
We first prove that f(x) is a monotonically decreasing function of x.
Take any variable
Figure PCTCN2021074167-appb-000113
and compute the first-order derivative of f(x) with respect to it. (i) When 1≤k≤N, the first-order derivative is:
Figure PCTCN2021074167-appb-000114
Because ξ_1 < ξ_2 < ξ_3, we have ξ_2 - ξ_1 > 0 and ξ_3 - ξ_2 > 0. Because every
Figure PCTCN2021074167-appb-000115
satisfies
Figure PCTCN2021074167-appb-000116
it follows that
Figure PCTCN2021074167-appb-000117
and hence
Figure PCTCN2021074167-appb-000118
(ii) When N+1≤k≤N+S, the first-order derivative is:
Figure PCTCN2021074167-appb-000119
From the analysis of case (i), ξ_3 - ξ_2 > 0 and
Figure PCTCN2021074167-appb-000120
so in this case we also have
Figure PCTCN2021074167-appb-000121
Combining cases (i) and (ii), for any
Figure PCTCN2021074167-appb-000122
we have
Figure PCTCN2021074167-appb-000123
That is, f(x) is a monotonically decreasing function of x.
We next prove that f(x) is a supermodular function.
Take any two variables
Figure PCTCN2021074167-appb-000124
and compute the second-order derivative of f(x) with respect to them. (i) When f1≠f2, inspection of the expression for f(x) shows that no monomial in its polynomial expansion contains the factor
Figure PCTCN2021074167-appb-000125
so in this case the second-order derivative is
Figure PCTCN2021074167-appb-000126
(ii) When f1=f2=f and k1, k2 satisfy k1∈{1,...,N}, k2∈{1,...,N}, the second-order derivative is:
Figure PCTCN2021074167-appb-000127
From the monotonicity analysis, in equation (26) ξ_2 - ξ_1 > 0, ξ_3 - ξ_2 > 0,
Figure PCTCN2021074167-appb-000128
Figure PCTCN2021074167-appb-000129
so in this case
Figure PCTCN2021074167-appb-000130
(iii) When f1=f2=f and one of k1, k2 belongs to {N+1,...,N+S}, the second-order derivative is:
Figure PCTCN2021074167-appb-000131
where ξ_3 - ξ_2 > 0 and
Figure PCTCN2021074167-appb-000132
so in this case
Figure PCTCN2021074167-appb-000133
Combining cases (i), (ii) and (iii), the second-order derivative of f(x) with respect to any two variables
Figure PCTCN2021074167-appb-000134
satisfies
Figure PCTCN2021074167-appb-000135
in all cases. By Proposition 1, f(x) is a supermodular function.
Therefore, f(x) is a monotonically decreasing supermodular function.
Define
Figure PCTCN2021074167-appb-000136
When i∈{1,...,N}, EF_i={1,...,F} is the ground set of the i-th important user and represents the files it may choose to cache; when i∈{N+1,...,N+S}, EF_i={1,...,F} is the ground set of the (i-N)-th small base station and represents the files it may choose to cache. Clearly, each important user or small base station may choose to cache any file in F={1,...,F}. Define:
Figure PCTCN2021074167-appb-000137
where V′_i is the cache capacity limit of the important user or small base station: when i∈{1,...,N},
Figure PCTCN2021074167-appb-000138
and when i∈{N+1,...,N+S},
Figure PCTCN2021074167-appb-000139
The physical meaning of
Figure PCTCN2021074167-appb-000140
in LF is a cache placement strategy of the important users or small base stations that satisfies the constraints of problem (21). In other words, LF is the set of all possible cache placement strategies of all important users and all small base stations that satisfy the constraints of problem (21). The constraints of problem (21) are therefore equivalent to the partition matroid (EF, LF).
In summary, the optimization problem (21) is an instance of minimizing a monotonically decreasing supermodular function over a partition matroid.
Step4: Solve the optimization problem:
A local greedy caching algorithm, based on the greedy algorithm, is designed to solve for the cache placement strategy. The specific steps are as follows:
1): Let N denote the number of important users and let
Figure PCTCN2021074167-appb-000141
denote the cache placement variables of all important users and small base stations. Derive the expression for the average system cost f(x), initialize i=N+1, and initialize x_subopt as an all-zero vector of length (N+S)F;
2): Let j=1 and let the set F_left={1,...,F};
3): Let
Figure PCTCN2021074167-appb-000142
then set the ((i-1)F+f_opt)-th element of x_subopt to 1, remove the element f_opt from the set F_left, and finally let j=j+1;
4): Repeat step (3) until j>V′_i;
5): Assign i the values N+2,...,N+S,1,...,N in turn, and execute steps (2) to (4) after each assignment.
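Steps 1) to 5) can be sketched as follows. This is an illustration under stated assumptions rather than the patent's implementation. The selection rule in step 3) is assumed to pick the remaining file whose caching yields the lowest average cost. The cost function is a simplified stand-in in which `a[u][n]` and `b[u][s]` are precomputed probabilities that user u can fetch a file from important user n or small base station s. Indices are 0-based, and all names and numbers are hypothetical.

```python
def make_expected_cost(q, a, b, num_files, xi=(1.0, 5.0, 20.0)):
    """Toy average cost; q[u][f] is the predicted request probability."""
    num_ius, num_sbs = len(a[0]), len(b[0])

    def cost(x):
        total = 0.0
        for u, prefs in enumerate(q):
            for f, p in enumerate(prefs):
                miss_d2d = 1.0  # probability that no important user serves f
                for n in range(num_ius):
                    miss_d2d *= 1.0 - a[u][n] * x[n * num_files + f]
                miss_sbs = 1.0  # probability that no small base station serves f
                for s in range(num_sbs):
                    miss_sbs *= 1.0 - b[u][s] * x[(num_ius + s) * num_files + f]
                total += p * (xi[0] * (1.0 - miss_d2d)
                              + xi[1] * miss_d2d * (1.0 - miss_sbs)
                              + xi[2] * miss_d2d * miss_sbs)
        return total

    return cost

def local_greedy(cost, num_ius, num_sbs, num_files, capacity):
    """Fill one cache at a time: small base stations first, then important users."""
    x = [0] * ((num_ius + num_sbs) * num_files)
    order = list(range(num_ius, num_ius + num_sbs)) + list(range(num_ius))
    for i in order:
        left = set(range(num_files))
        for _ in range(min(capacity[i], num_files)):
            def cost_if_cached(f):
                trial = x[:]
                trial[i * num_files + f] = 1
                return cost(trial)
            f_opt = min(sorted(left), key=cost_if_cached)
            x[i * num_files + f_opt] = 1
            left.remove(f_opt)
    return x

# Two users, one important user, one small base station, three files.
q = [[0.6, 0.3, 0.1], [0.5, 0.4, 0.1]]
cost = make_expected_cost(q, a=[[0.9], [0.9]], b=[[1.0], [1.0]], num_files=3)
x_subopt = local_greedy(cost, num_ius=1, num_sbs=1, num_files=3, capacity=[1, 1])
```

In this toy instance the small base station takes the most requested file, and the important user then takes the second most requested one instead of duplicating the first, which is the joint effect a per-node popularity rule would miss.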
Figure 3 compares the system cost obtained with three different strategies. From top to bottom, the first curve corresponds to random caching, which places files into the caches of the IUs and SBSs at random until they are full. The second curve shows the system cost of the popularity-based caching strategy, a widely used strategy whose idea is to cache the most popular files at every cache node. To implement it, after the preferences of all users are predicted, the average of all user preferences is taken as the global file popularity, and every IU and SBS places the most popular files in its cache until the cache is full. The bottom curve shows the system cost of the proposed suboptimal caching strategy. Random caching performs worst because it ignores user preferences and simply caches files at random; its system cost is far greater than that of the popularity-based strategy and of the proposed strategy. The popularity-based strategy performs much better than random caching, but because it does not jointly optimize across the different IUs and SBSs, its system cost is higher than that of the proposed strategy, and the gap widens as the number of IUs increases.
Figure 4 demonstrates why mobility and social relationships must be considered in the caching strategy. The figure contains three curves. The top curve gives the system cost of the caching strategy that ignores mobility, obtained as follows. First, mobility is removed from the scenario: if a user can communicate with another user or an SBS in the current time slot, they are assumed to be able to communicate in the next time slot as well. The local greedy caching algorithm is then applied to this modified scenario to obtain a caching strategy that ignores mobility, and that strategy is applied to the scenario with mobility to obtain the corresponding system cost. Because this strategy ignores mobility and takes the connectivity of the current time slot as the connectivity of the next one, its system cost is greater than that of the proposed strategy. The middle curve gives the system cost of the caching strategy that ignores social relationships, obtained as follows. First, social relationships are removed from the scenario: if two users physically satisfy the requirements for D2D communication, they can establish it regardless of whether they have a social relationship. The local greedy caching algorithm is then applied to this scenario to obtain a caching strategy that ignores social relationships, and that strategy is applied to the scenario with social relationships to obtain the corresponding system cost. With few important users, the system cost of this strategy is essentially the same as that of the proposed strategy, but as the number of important users grows its cost becomes higher and the gap widens. This is because the strategy ignores the fact that some users cannot communicate with each other when their social ties are not close enough: nearby users who lack a social relationship with an important user may be unwilling to communicate with it, so some files placed at that user are of no use.
Figure 5 compares the system cost of the proposed suboptimal caching strategy with that of the optimal caching strategy. The optimal strategy is obtained by variable substitution: the nonlinear integer programming problem is transformed into a linear integer programming problem, which is then solved with a standard linear integer programming tool. Because the optimization problem is NP-complete, the comparison scenario is restricted to a single SBS and between 1 and 4 important users to limit computational complexity. The suboptimal value is obtained with the proposed method. The gap between the optimal and suboptimal values is very small.

Claims (8)

  1. A heterogeneous network cache decision-making method based on user preference prediction, characterized in that the macro base station, small base station and D2D communication modes coexist in the method, and the method comprises the following steps:
    (S1) first, when the probability distribution with which users request different files is unknown, predicting user preferences from the users' request history by machine learning;
    (S2) deriving an expression for the average system cost based on user mobility, physical location relationships and social relationships; under the cache capacity constraints, taking the cache strategies of the small base stations and the important users as variables, constructing an optimization problem that minimizes the average system cost, and making the cache decision by solving that problem;
    (S3) solving the average-system-cost minimization problem with a suboptimal algorithm based on the greedy algorithm, and determining the files to be cached according to the solution vector.
  2. The heterogeneous network cache decision-making method based on user preference prediction according to claim 1, characterized in that the algorithmic procedure of the method is specifically as follows:
    (1) S={1,...,S}, U={1,2,...,U}, C={1,...,C} and F={1,...,C*F_c} denote the set of small base stations, the set of users, the set of file categories and the set of files respectively, where S, U, C and F_c denote the number of small base stations, the number of users, the number of file categories and the number of files in each category; t_min and t_min′ denote the minimum communication time required to download a file through D2D and through a small base station respectively; and the macro base station holds all files in the content library;
    (2) time is divided into time slots of equal length; t∈N denotes the t-th time slot, whose start time is τ_t, and every time slot has length T; at the start of each time slot, the initial D2D connectivity of the users in the current time slot is
    Figure PCTCN2021074167-appb-100001
    where the indicator function
    Figure PCTCN2021074167-appb-100002
    indicates, with the value "1" or "0", whether user i and user j can carry out D2D communication at the start of time slot t; each user then requests a file at random according to its preferences, forming the file request vector R_t={r_i^t: i=1,...,U}, where r_i^t∈F is the file requested by user i in time slot t;
    (3) the indicator variable
    Figure PCTCN2021074167-appb-100003
    represents the physical relationship between users: if user i and user j have a physical relationship at time t, then
    Figure PCTCN2021074167-appb-100004
    and otherwise
    Figure PCTCN2021074167-appb-100005
    μ_{i,j} is defined as the parameter of the exponential distribution followed by the connection duration between user i and user j, and λ_{i,j} as the parameter of the exponential distribution followed by the interval duration between user i and user j; from the connection state of user i and user j at time t_0,
    Figure PCTCN2021074167-appb-100006
    the probability that user i and user j are connected at time t_c is calculated:
    Figure PCTCN2021074167-appb-100007
    (4) μ′_{u,s} and λ′_{u,s} are defined as the parameters of the exponential distributions followed by the connection duration and the interval duration, respectively, between user u and small base station s; the indicator variable
    Figure PCTCN2021074167-appb-100008
    represents the physical relationship between user u and small base station s; from the connection state at time t_0,
    Figure PCTCN2021074167-appb-100009
    the probability that user u and small base station s are connected at time t_c is calculated:
    Figure PCTCN2021074167-appb-100010
    (5) S_{i,j} is defined as the social relationship between user i and user j, and S_T as the social relationship threshold; the social tie s_{i,j} between users is calculated from S_{i,j} and S_T; θ_u denotes the social importance of user u and measures how socially important the user is; the social importance of each user is calculated as θ_u=α·V_u+β·B_u, where V_u and B_u are the device capacity and the betweenness centrality of user u respectively, and α and β are weight coefficients satisfying α+β=1; important users are selected according to social importance to cache files;
    (6) H={H_1,H_2,...,H_U} is constructed to represent the historical file requests in the T_b time slots preceding the decision time, where
    Figure PCTCN2021074167-appb-100011
    is the request history of user u and
    Figure PCTCN2021074167-appb-100012
    is the file requested in the t_b-th of those T_b time slots; from the historical file requests H, the count-based empirical probability distribution of each user over the file categories is calculated, with
    Figure PCTCN2021074167-appb-100013
    denoting the probability that user u requests a file of category c_i, and is used as the data set of the K-means algorithm;
    (7) for each value of K, the sum of the distances from all data points to their cluster centers is calculated as the performance measure of the current K-means model, according to:
    Figure PCTCN2021074167-appb-100014
    where X is a data point vector, M_i is the cluster center of the i-th cluster, and the distance is the Euclidean distance;
    (8) Gap(K)=E(logD_K)-logD_K is calculated as the Gap Statistic, where E(logD_K) is the expectation of logD_K, and the value optK of K that maximizes Gap(K) is selected as the number of user types;
    (9) for each user type, its cluster center is taken as the empirical probability distribution with which users of that type request each category of file; the entries of the cluster center are sorted in descending order and the corresponding index vector is obtained; the logarithms of the sorted values and of their ranks are used as the y and x data for a linear regression that yields the Zipf distribution parameter s;
    (10)计算该类用户请求每类文件的概率,其计算表达式如下所示:(10) Calculate the probability of this type of user requesting each type of file, and the calculation expression is as follows:
    Figure PCTCN2021074167-appb-100015
    Figure PCTCN2021074167-appb-100015
    其中c代表用户类别,rank(c)代表第c类文件的请求数排名,依据对每类文件中文件的偏好服从均匀分布求出用户请求所有文件的概率分布,用户请求所有文件的概率分布表达式如下所示:Where c represents the user category, rank(c) represents the ranking of the number of requests for the c-th file The formula is as follows:
    Figure PCTCN2021074167-appb-100016
    Figure PCTCN2021074167-appb-100016
    其中
    Figure PCTCN2021074167-appb-100017
    代表用户u请求第f个文件的概率;
    in
    Figure PCTCN2021074167-appb-100017
    Represents the probability of user u requesting the f-th file;
    (11) Repeat steps (9) to (10) until the file preferences of all optK user classes have been obtained, yielding the set of file preferences of all users
    Figure PCTCN2021074167-appb-100018
    (12) Let the cost of obtaining a file from the user's own storage or from an important user via D2D communication be ξ_1, the cost of obtaining a file from a small base station be ξ_2, and the cost of obtaining a file from the macro base station be ξ_3. A user first tries to obtain the requested file from its own storage or from an important user; if the file is not available there, the user tries the small base station; if it is not available there either, the user obtains it from the macro base station;
    (13) Let N denote the number of important users, and let
    Figure PCTCN2021074167-appb-100019
    denote the cache placement decision variables of all important users and small base stations, where the Boolean variable
    Figure PCTCN2021074167-appb-100020
    indicates whether important user n has cached file f, and the Boolean variable
    Figure PCTCN2021074167-appb-100021
    indicates whether small base station s has cached file f. Derive the expression for the average system overhead f(x), and initialize i = N+1 and x_subopt as an all-zero vector of length (N+S)F;
    (14) Let j = 1 and let the set F_left = {1, ..., F};
    (15) Let
    Figure PCTCN2021074167-appb-100022
    then set the ((i-1)F + f_opt)-th element of x_subopt to 1, remove the element f_opt from the set F_left, and finally let j = j+1;
    (16) Repeat step (15) until j > V'_i.
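The inner loop of steps (14) to (16) can be sketched in Python as follows. The selection expression in step (15) is only available as an image, so this sketch assumes it picks the file in F_left that yields the smallest average system overhead f(x) once cached at node i. The function name `greedy_fill_node`, its parameters, and the `cost` callback standing in for f(x) are hypothetical names introduced for illustration.

```python
def greedy_fill_node(x, node, num_files, capacity, cost):
    """Greedily cache up to `capacity` files at the 1-based `node`.

    x         -- list of 0/1 flags of length (N+S)*F, modified in place
    node      -- 1-based node index i (important users first, then small base stations)
    num_files -- F, the number of files
    capacity  -- V'_i, the number of files node i can hold
    cost      -- callable standing in for f(x); ASSUMED to be minimized
    """
    remaining = set(range(1, num_files + 1))            # F_left = {1, ..., F}
    for _ in range(min(capacity, num_files)):           # j = 1, ..., V'_i
        best_file, best_cost = None, None
        for f in sorted(remaining):
            pos = (node - 1) * num_files + (f - 1)      # 0-based form of (i-1)F + f
            x[pos] = 1                                  # tentatively cache file f
            trial_cost = cost(x)
            x[pos] = 0                                  # undo the tentative choice
            if best_cost is None or trial_cost < best_cost:
                best_file, best_cost = f, trial_cost
        x[(node - 1) * num_files + (best_file - 1)] = 1  # commit f_opt
        remaining.remove(best_file)                      # drop f_opt from F_left
    return x
```

For example, with a toy cost equal to the total popularity of the files that are not cached, a node with capacity 2 ends up holding the two most popular files. Ties are broken in favor of the lowest file index, which is a choice made here and not one stated in the claim.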
  3. The heterogeneous network caching decision-making method based on user preference prediction according to claim 2, characterized in that in step (3), the probability that user i and user j are connected at time t_c,
    Figure PCTCN2021074167-appb-100023
    is calculated as follows:
    Figure PCTCN2021074167-appb-100024
    where
    Figure PCTCN2021074167-appb-100025
    denotes the physical relationship between user i and user j;
    Figure PCTCN2021074167-appb-100026
    indicates that user i and user j have a physical relationship at time t;
    Figure PCTCN2021074167-appb-100027
    indicates that user i and user j do not have a physical relationship at time t; μ_{i,j} is the parameter of the exponential distribution obeyed by the connection duration between user i and user j; λ_{i,j} is the parameter of the exponential distribution obeyed by the interval between connections of user i and user j; and
    Figure PCTCN2021074167-appb-100028
    denotes the connection state of user i and user j at time t_0.
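The formula of this claim is only available as an image, so the following Python sketch does not reproduce it. It shows the standard result for a two-state continuous-time Markov chain, which is what alternating, exponentially distributed connection and interval durations give rise to. It assumes that μ_{i,j} and λ_{i,j} are rates (the reciprocals of the mean connection and interval durations); the function name `connection_probability` and its parameters are hypothetical.

```python
import math

def connection_probability(mu, lam, connected_at_t0, t0, tc):
    """Probability that a pair is connected at time tc, given its state at t0.

    mu  -- ASSUMED rate of leaving the connected state (1 / mean connection duration)
    lam -- ASSUMED rate of leaving the disconnected state (1 / mean interval duration)
    """
    rate_sum = mu + lam
    stationary = lam / rate_sum                  # long-run fraction of time connected
    decay = math.exp(-rate_sum * (tc - t0))      # memory of the initial state
    if connected_at_t0:
        return stationary + (1.0 - stationary) * decay
    return stationary * (1.0 - decay)
```

At t_c = t_0 the function returns 1 or 0 depending on the known initial state, and as t_c - t_0 grows it converges to λ/(λ+μ) in both cases.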
  4. The heterogeneous network caching decision-making method based on user preference prediction according to claim 2, characterized in that in step (4), the probability that user u and small base station s are connected at time t_c,
    Figure PCTCN2021074167-appb-100029
    is calculated as follows:
    Figure PCTCN2021074167-appb-100030
  5. The heterogeneous network caching decision-making method based on user preference prediction according to claim 2, characterized in that the social relationship S_{i,j} in step (5) is calculated as follows:
    Figure PCTCN2021074167-appb-100031
    where A_i denotes the social attributes of user i and frequency(k) denotes the number of users who have social attribute k; the rarer the social attributes shared by user i and user j, the closer their social relationship. The social relationship s_{i,j} is determined as follows:
    when S_{i,j} >= S_T, user i and user j are determined to have a social connection, and s_{i,j} = 1; otherwise, user i and user j are determined not to have a social connection, and s_{i,j} = 0;
    the betweenness centrality B_u is calculated as follows:
    Figure PCTCN2021074167-appb-100032
    where b_{i,j} denotes the number of shortest paths between vertex i ∈ VU and vertex j ∈ VU in graph G_s, and b_{i,j}(g_u) denotes the number of those shortest paths between vertex i ∈ VU and vertex j ∈ VU that pass through V_u.
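The betweenness centrality described in this claim can be sketched in Python as follows. The formula itself is an image, so the sketch assumes the usual definition: for every pair of other vertices, add the fraction of their shortest paths that pass through u. It further assumes an unweighted, undirected graph given as a symmetric adjacency dictionary, and counts each unordered pair once. The function names are hypothetical.

```python
from collections import deque

def shortest_path_counts(graph, source):
    """Breadth-first search returning hop distances and shortest-path counts."""
    dist = {source: 0}
    count = {source: 1}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for w in graph[v]:
            if w not in dist:                 # first time w is reached
                dist[w] = dist[v] + 1
                count[w] = 0
                queue.append(w)
            if dist[w] == dist[v] + 1:        # v precedes w on a shortest path
                count[w] += count[v]
    return dist, count

def betweenness(graph):
    """Sum over unordered pairs (i, j) of b_ij(u) / b_ij, for every vertex u."""
    info = {v: shortest_path_counts(graph, v) for v in graph}
    nodes = list(graph)
    result = {u: 0.0 for u in nodes}
    for a, i in enumerate(nodes):
        dist_i, count_i = info[i]
        for j in nodes[a + 1:]:
            if j not in dist_i:               # i and j are not connected
                continue
            for u in nodes:
                if u == i or u == j or u not in dist_i:
                    continue
                dist_u, count_u = info[u]
                # u lies on a shortest i-j path iff the two legs add up exactly
                if j in dist_u and dist_i[u] + dist_u[j] == dist_i[j]:
                    result[u] += count_i[u] * count_u[j] / count_i[j]
    return result
```

This pairwise form favors readability over speed; for large social graphs, Brandes' accumulation scheme computes the same values more efficiently.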
  6. The heterogeneous network caching decision-making method based on user preference prediction according to claim 2, characterized in that the empirical probability distribution in step (6),
    Figure PCTCN2021074167-appb-100033
    is calculated as follows:
    Figure PCTCN2021074167-appb-100034
    where 1_A(x) denotes the indicator function, whose value is 1 if condition x is true and 0 otherwise.
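A minimal Python sketch of this counting step follows. The formula is an image, so the sketch assumes the usual empirical frequency: the number of requests in the user's history that fall in a category, divided by the length of the history. The function name and the `file_category` lookup (file index to category index) are hypothetical.

```python
from collections import Counter

def empirical_category_distribution(history, file_category, num_categories):
    """Fraction of a user's past requests that fall in each file category.

    history        -- list of requested file indices, one per time slot
    file_category  -- sequence mapping file index -> category index
    num_categories -- number of file categories
    """
    if not history:
        raise ValueError("history must contain at least one request")
    # Counter plays the role of summing the indicator function over time slots.
    counts = Counter(file_category[f] for f in history)
    return [counts[c] / len(history) for c in range(num_categories)]
```

Stacking the resulting vectors for all users gives the data set that step (6) feeds to the K-means algorithm.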
  7. The heterogeneous network caching decision-making method based on user preference prediction according to claim 2, characterized in that in step (10),
    Figure PCTCN2021074167-appb-100035
    is calculated as follows:
    Figure PCTCN2021074167-appb-100036
    where c_f is the file category to which file f belongs and satisfies
    Figure PCTCN2021074167-appb-100037
    denotes the total number of files in category c_f, and
    Figure PCTCN2021074167-appb-100038
    denotes the probability, obtained by fitting the Zipf distribution, that user u requests a file in category c_i.
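The Zipf fit of step (9) and the per-file probabilities of step (10) can be sketched in Python as follows. The formulas are images, so the sketch assumes the standard normalized Zipf form, rank^(-s) divided by the sum of rank^(-s) over all categories, and a uniform split of each category's probability over its files. All function and parameter names are hypothetical.

```python
import math
from collections import Counter

def fit_zipf_exponent(cluster_center):
    """Estimate s by least squares on (log rank, log probability)."""
    # Index vector: categories sorted from most to least requested.
    order = sorted(range(len(cluster_center)), key=lambda c: -cluster_center[c])
    xs, ys = [], []
    for rank, c in enumerate(order, start=1):
        if cluster_center[c] > 0:             # log(0) is undefined, so skip it
            xs.append(math.log(rank))
            ys.append(math.log(cluster_center[c]))
    # At least two categories with non-zero probability are needed for a slope.
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return -sxy / sxx, order                  # Zipf exponent is minus the slope

def zipf_category_probs(s, order):
    """ASSUMED Zipf form: probability of category c is proportional to rank(c)^(-s)."""
    weights = [0.0] * len(order)
    for rank, c in enumerate(order, start=1):
        weights[c] = rank ** (-s)
    total = sum(weights)
    return [w / total for w in weights]

def per_file_probs(category_probs, file_category):
    """Split each category's probability uniformly over the files it contains."""
    sizes = Counter(file_category)            # number of files per category
    return [category_probs[c] / sizes[c] for c in file_category]
```

Fitting in log-log space turns the power law into a straight line whose slope is -s, which is why an ordinary linear regression suffices here.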
  8. The heterogeneous network caching decision-making method based on user preference prediction according to claim 2, characterized in that the average system overhead f(x) in step (13) is calculated as follows:
    Figure PCTCN2021074167-appb-100039
    where
    Figure PCTCN2021074167-appb-100040
PCT/CN2021/074167 2020-06-17 2021-01-28 Heterogeneous network cache decision-making method based on user preference prediction WO2021253835A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010551762.6 2020-06-17
CN202010551762.6A CN111860595A (en) 2020-06-17 2020-06-17 Heterogeneous network cache decision method based on user preference prediction

Publications (1)

Publication Number Publication Date
WO2021253835A1 true WO2021253835A1 (en) 2021-12-23

Family

ID=72986748

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/074167 WO2021253835A1 (en) 2020-06-17 2021-01-28 Heterogeneous network cache decision-making method based on user preference prediction

Country Status (2)

Country Link
CN (1) CN111860595A (en)
WO (1) WO2021253835A1 (en)

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111860595A (en) * 2020-06-17 2020-10-30 南京邮电大学 Heterogeneous network cache decision method based on user preference prediction
CN112597388B (en) * 2020-12-18 2022-10-14 南京邮电大学 Cache-enabled D2D communication joint recommendation and caching method
CN112822726B (en) * 2020-12-31 2022-06-10 杭州电子科技大学 Modeling and decision-making method for Fog-RAN network cache placement problem
CN113038496B (en) * 2021-02-26 2022-06-28 盐城吉研智能科技有限公司 D2D content caching method based on personal preference perception
CN113810933B (en) * 2021-08-31 2023-09-26 南京邮电大学 Caching method based on energy collection and user mobility
CN114595632A (en) * 2022-03-07 2022-06-07 北京工业大学 Mobile edge cache optimization method based on federal learning
CN115034690B (en) * 2022-08-10 2022-11-18 中国航天科工集团八五一一研究所 Battlefield situation analysis method based on improved fuzzy C-means clustering

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106658598A (en) * 2016-12-08 2017-05-10 南京邮电大学 Service migration method based on content caching and network state awareness
CN107277159A (en) * 2017-07-10 2017-10-20 东南大学 A kind of super-intensive network small station caching method based on machine learning
CN111860595A (en) * 2020-06-17 2020-10-30 南京邮电大学 Heterogeneous network cache decision method based on user preference prediction

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
SHAN GUANJIE; ZHU QI: "Sociality and Mobility-Based Caching Strategy for Device-to-Device Communications Underlying Heterogeneous Networks", IEEE ACCESS, vol. 7, 1 January 1900 (1900-01-01), USA , pages 53777 - 53791, XP011722005, DOI: 10.1109/ACCESS.2019.2912674 *

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115550937A (en) * 2022-07-26 2022-12-30 福州大学 Content distribution method in multi-unmanned-plane-assisted edge computing system
CN116261211A (en) * 2023-02-02 2023-06-13 北方工业大学 Low-energy-consumption dynamic caching method for unmanned aerial vehicle auxiliary data transmission
CN116261211B (en) * 2023-02-02 2024-02-09 北方工业大学 Low-energy-consumption dynamic caching method for unmanned aerial vehicle auxiliary data transmission
US11985186B1 (en) 2023-02-10 2024-05-14 Nanjing University Of Posts And Telecommunications Method of drone-assisted caching in in-vehicle network based on geographic location
CN116761152A (en) * 2023-08-14 2023-09-15 合肥工业大学 Roadside unit edge cache placement and content delivery method
CN116761152B (en) * 2023-08-14 2023-11-03 合肥工业大学 Roadside unit edge cache placement and content delivery method

Also Published As

Publication number Publication date
CN111860595A (en) 2020-10-30

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21826839

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21826839

Country of ref document: EP

Kind code of ref document: A1