CN107592656B

CN107592656B - Caching method based on base station clustering

Info

Publication number: CN107592656B
Application number: CN201710704882.3A
Authority: CN
Inventors: 刘楠; 牛岩; 潘志文; 尤肖虎
Original assignee: Southeast University
Current assignee: Southeast University
Priority date: 2017-08-17
Filing date: 2017-08-17
Publication date: 2020-12-11
Anticipated expiration: 2037-08-17
Also published as: CN107592656A

Abstract

The invention discloses a caching method based on base station clustering. In a dense base station network, the historical requests of the users served by each base station are first collected and analyzed, and the base stations are clustered based on those requests so that the users served by the base stations in each class have similar interests. The cache content of each base station is then decided using collaborative filtering from the field of recommender systems. Cluster-based collaborative filtering improves the scalability of the algorithm and mitigates data sparsity. The invention combines the local popularity of content with a Top-N collaborative filtering system, which improves the cache hit rate of the base station and eases the conflict between the limited cache capacity of the base station and continuously growing mass data, thereby improving user satisfaction and reducing the network backhaul load.

Description

Caching method based on base station clustering

Technical Field

The invention relates to the technical field of mobile communication systems, in particular to a caching method based on base station clustering.

Background

To cope with the challenge that the growth of mass data poses to system capacity, an effective scheme is to deploy a cache at the base station. If a user requests content that is in the cache, the base station transmits the content directly over the wireless link; otherwise the content must be fetched from the core network via the backhaul link. Proactive caching at the base station stores content in the base station before the request arrives, which reduces traffic on the backhaul link, relieves the traffic load in the cellular system, and improves system performance. The invention provides a caching strategy based on base station clustering, obtained by analyzing historical requests.

Disclosure of Invention

The technical problem to be solved by the present invention is to provide a caching method based on base station clustering that can greatly improve the cache hit rate, relieve the backhaul link load, and improve user satisfaction.

In order to solve the technical problem, the invention provides a caching method based on base station clustering, which comprises the following steps: (1) first, the base stations are clustered: the frequency with which the users served by each base station requested each content during a past period is counted; for each base station, these request frequencies are taken as the features of the base station, and the base stations are clustered by k-means clustering, so that the users served by the base stations in each class have similar interests, i.e. the content they request has much in common;

(2) a base-station-based Top-N collaborative filtering recommendation system uses the similarity between base stations to predict interest in content that the users covered by a base station have not yet requested;

(3) given each class obtained in step (1), targeted caching is performed at each base station according to the distribution of content popularity within the class, combined with collaborative filtering within the class, to determine the content cached by each base station.

Preferably, the specific clustering of the base stations in step (1) includes the following steps:

(11) according to historical request information from a past period, the core network analyzes the data to obtain a content popularity matrix

$$P = [p_{m,f}]_{M \times F}$$

where each element $p_{m,f}$ represents the frequency of requests for content f from the users served by base station m. The request frequencies serve as the features of a base station: each row $p_m$ of the matrix P is an F-dimensional vector, the feature vector of base station m;

(12) randomly selecting k base stations as the initial center points of the base station clusters, the feature vectors of these k base stations being the initial center points

$$c_1^{(1)}, c_2^{(1)}, \dots, c_k^{(1)}$$

Here the superscript (1) denotes the first round, i.e. the initial value, and the subscript i denotes the i-th center point;

(13) given the center point of each class, each base station is assigned to the class that minimizes the intra-class sum of squares, i.e. to its nearest center point:

$$H_i^{(t)} = \{\, p_m : \|p_m - c_i^{(t)}\|^2 \le \|p_m - c_j^{(t)}\|^2 \ \ \forall j,\ 1 \le j \le k \,\}$$

Here $H_i^{(t)}$ denotes the set of base stations belonging to the i-th class in round t;

(14) calculating a new center point for each class obtained in step (13):

$$c_i^{(t+1)} = \frac{1}{|H_i^{(t)}|} \sum_{p_m \in H_i^{(t)}} p_m$$

(15) repeating (13) and (14) until the change in $c_i$ is less than a given threshold, finally obtaining k classes $H_1, \dots, H_k$, each base station belonging to one of the classes.
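Steps (12) to (15) can be illustrated with a minimal pure-Python sketch. The function names, the `threshold`, `max_rounds` and `seed` parameters, the handling of an empty class, and the sample matrix are assumptions made for illustration and are not specified in the source.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_base_stations(P, k, threshold=1e-6, max_rounds=100, seed=0):
    """Cluster the rows of P (one feature vector per base station) into k classes.

    Returns (labels, centers); labels[m] is the class index of base station m.
    """
    rng = random.Random(seed)
    # Step (12): pick k base stations at random as the initial center points.
    centers = [list(P[m]) for m in rng.sample(range(len(P)), k)]
    labels = [0] * len(P)
    for _ in range(max_rounds):
        # Step (13): assign each base station to its nearest center.
        labels = [
            min(range(k), key=lambda i: squared_distance(p, centers[i]))
            for p in P
        ]
        # Step (14): recompute each center as the mean of its members.
        new_centers = []
        for i in range(k):
            members = [P[m] for m in range(len(P)) if labels[m] == i]
            if members:
                new_centers.append([sum(col) / len(members) for col in zip(*members)])
            else:
                new_centers.append(centers[i])  # assumption: an empty class keeps its center
        # Step (15): stop once no center moves by more than the threshold.
        shift = max(squared_distance(c, n) for c, n in zip(centers, new_centers))
        centers = new_centers
        if shift < threshold:
            break
    return labels, centers


# Hypothetical popularity matrix: 4 base stations, 3 contents.
P = [
    [9, 1, 0],
    [8, 2, 0],
    [0, 1, 9],
    [1, 0, 8],
]
labels, centers = kmeans_base_stations(P, k=2)
```

In this toy matrix the first two base stations mostly request content 0 and the last two mostly request content 2, so they end up in separate classes.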

Preferably, in the step (2), the intra-class base station-based collaborative filtering specifically includes the following steps:

(21) calculating the similarity between base stations within a class

The similarity between base stations in the same class is calculated by the following similarity formula:

where base station $m_i$ and base station $m_j$ belong to the same class, $T(m_i)$ and $T(m_j)$ denote the sets of content accessed by the users served by base stations $m_i$ and $m_j$ respectively, and $T(f)$ denotes the set of base stations that have accessed content f;

(22) from (21), the set $S(m_i, G)$ of the base stations closest to $m_i$ can be obtained; the interest level of the users served by base station $m_i$ in a content f that they have never requested during the past period is then

where $T(f)$ is the set of base stations that have made requests for content f, and the interest of base station $m_j$ in content f is taken from the elements of the content popularity matrix P.
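The following sketch illustrates steps (21) and (22) under stated assumptions. The similarity formula itself is not reproduced in this text, so the `similarity` function below is one plausible choice built only from the defined sets $T(m_i)$, $T(m_j)$ and $T(f)$ (co-requested contents, down-weighted when many base stations requested them); it should not be read as the patented formula. Treating G as the neighborhood size, weighting each neighbor's popularity entry by its similarity, and all function names are likewise assumptions.

```python
import math


def accessed_contents(P):
    """T(m): for each base station m, the contents with a nonzero request frequency."""
    return [{f for f, freq in enumerate(row) if freq > 0} for row in P]


def similarity(P, i, j):
    """Assumed similarity between base stations i and j (not taken from the source).

    Shared contents count for less when many base stations requested them.
    """
    T = accessed_contents(P)
    if not T[i] or not T[j]:
        return 0.0
    shared = T[i] & T[j]
    # |T(f)|: number of base stations (rows of P) that requested content f.
    n_stations = {f: sum(1 for t in T if f in t) for f in shared}
    score = sum(1.0 / math.log(1 + n_stations[f]) for f in shared)
    return score / math.sqrt(len(T[i]) * len(T[j]))


def predict_interest(P, cluster, i, G):
    """Predicted interest p(i, f) for each content f that base station i never requested.

    cluster: indices of the base stations in i's class; G: assumed neighborhood size.
    """
    T = accessed_contents(P)
    sims = {j: similarity(P, i, j) for j in cluster if j != i}
    # S(i, G): the G most similar base stations within the class.
    neighbors = sorted(sims, key=sims.get, reverse=True)[:G]
    scores = {}
    for f in range(len(P[i])):
        if f in T[i]:
            continue  # only contents not yet requested are predicted
        # Sum over neighbors that requested f, weighted by similarity.
        scores[f] = sum(sims[j] * P[j][f] for j in neighbors if f in T[j])
    return scores


# Hypothetical class of 3 base stations and 4 contents.
P = [
    [5, 3, 0, 0],
    [4, 0, 2, 0],
    [0, 0, 1, 6],
]
scores = predict_interest(P, cluster=[0, 1, 2], i=0, G=2)
```

Here base station 0 shares content 0 with base station 1 and nothing with base station 2, so content 2 (requested by base station 1) receives a higher predicted interest than content 3.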

Preferably, the specific caching method in step (3) includes the following steps:

(31) first, the content popularity within each class is analyzed, i.e. the content requested by the users served by all base stations in the class is counted and sorted by number of accesses from high to low;

(32) the cache capacity of each base station m is $S_m$, and η is the percentage of $S_m$ occupied by content cached according to intra-class popularity; content is first cached at the base station in order of intra-class popularity from high to low, and before each content is cached it is checked whether the total size of the cached content would exceed $\eta S_m$; if so, the content is not cached;

(33) the remaining cache capacity of the base station is filled using the intra-class base-station-based collaborative filtering of step (22): content is cached in order of p(m, f) from high to low until the cache capacity would be exceeded.
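The two-phase cache fill of steps (32) and (33) can be sketched as follows. The function name, argument layout and sample values are hypothetical. The sketch also assumes that a content exceeding the popularity share in step (32) is skipped rather than ending the phase, which is one reading of the text.

```python
def fill_cache(capacity, eta, sizes, class_ranking, cf_scores):
    """Choose the contents one base station caches.

    capacity: S_m; eta: share of S_m reserved for class-popular content;
    sizes: content id -> size l(f); class_ranking: content ids ordered from
    most to least popular in the class; cf_scores: content id -> p(m, f).
    """
    cached, used = [], 0
    # Step (32): class-popular content, limited to eta * S_m.
    for f in class_ranking:
        if used + sizes[f] > eta * capacity:
            continue  # assumption: skip a content that would exceed the share
        cached.append(f)
        used += sizes[f]
    # Step (33): remaining capacity goes to collaborative-filtering predictions.
    for f in sorted(cf_scores, key=cf_scores.get, reverse=True):
        if f in cached:
            continue
        if used + sizes[f] > capacity:
            break  # stop once the next content no longer fits
        cached.append(f)
        used += sizes[f]
    return cached


# Hypothetical values: capacity 10, 60% reserved for class-popular content.
sizes = {0: 3, 1: 4, 2: 2, 3: 3, 4: 5}
cached = fill_cache(
    capacity=10,
    eta=0.6,
    sizes=sizes,
    class_ranking=[0, 1, 2],
    cf_scores={3: 0.9, 4: 0.4},
)
```

With these values, contents 0 and 2 fit within the popularity share, content 1 does not, and content 3 fills part of the remaining capacity before content 4 is rejected.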

The invention has the following beneficial effects: by performing cluster analysis on the base stations, the proposed caching strategy can, on the one hand, identify the interests of the users served by each base station and, on the other hand, greatly reduce the complexity of base-station collaborative filtering and improve algorithm performance. The invention combines the local popularity of content with a Top-N collaborative filtering system, which improves the cache hit rate of the base station and eases the conflict between the limited cache capacity of the base station and continuously growing mass data, thereby improving user satisfaction and reducing the network backhaul load. Compared with the prior art, the base stations are clustered and a machine learning algorithm is introduced into the prediction of cache content, so that the cache hit rate is greatly improved, the load on the backhaul link is relieved, and user satisfaction is improved.

Detailed Description

A caching method based on base station clustering comprises the following steps:

(1) first, the base stations are clustered: the frequency with which the users served by each base station requested each content during a past period is counted; for each base station, these request frequencies are taken as the features of the base station, and the base stations are clustered by k-means clustering, so that the users served by the base stations in each class have similar interests, i.e. the content they request has much in common;

Example:

Consider a network deployment of M base stations. Each base station m is connected to the core network through a backhaul link and has a cache capacity of $S_m$. The set of content requests contains F contents, and the size of content f is l(f). R(m) is the set of content requested by the users served by base station m, and C(m) is the set of content cached at base station m. The cache hit rate is defined as follows:
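The hit-rate formula is not reproduced in this text. As an illustration only, the sketch below assumes one common definition built from the sets R(m) and C(m): the number of requested contents found in the serving base station's cache, divided by the number of requested contents, summed over all base stations. The function name and sample data are hypothetical.

```python
def cache_hit_rate(requested, cached):
    """Share of requested contents found in the serving base station's cache.

    requested[m] is R(m) and cached[m] is C(m), both sets of content ids.
    Assumed definition: sum of |R(m) & C(m)| over m, divided by sum of |R(m)|.
    """
    hits = sum(len(requested[m] & cached[m]) for m in requested)
    total = sum(len(requested[m]) for m in requested)
    return hits / total if total else 0.0


# Hypothetical request and cache sets for two base stations.
requested = {0: {1, 2, 3}, 1: {2, 4}}
cached = {0: {1, 2}, 1: {5}}
rate = cache_hit_rate(requested, cached)
```

In this example two of the five requested contents are cached at the base station that serves the request.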

The caching method is implemented as follows and comprises the following steps:

(1) First, the base stations are clustered. Specifically, the frequency with which the users served by each base station requested each content during a past period is counted. For each base station, these request frequencies are taken as the features of the base station, and k-means clustering is used to cluster the base stations, so that the users served by the base stations in each class have similar interests, i.e. the content they request has much in common.

(2) A base-station-based Top-N collaborative filtering recommendation system uses the similarity between base stations to predict interest in content that the users covered by a base station have not yet requested.

(3) Given each class obtained in step (1), targeted caching is performed at each base station according to the distribution of content popularity within the class, combined with collaborative filtering within the class, to determine the content cached by each base station.

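Step (1), clustering the base stations, includes:

(11) Based on historical request information from a past period, the core network analyzes the data to obtain a content popularity matrix

$$P = [p_{m,f}]_{M \times F}$$

where each element $p_{m,f}$ represents the frequency of requests for content f by the users served by base station m. The request frequencies serve as the features of a base station: each row $p_m$ of the matrix P is an F-dimensional vector, the feature vector of base station m.

(12) Randomly select k base stations as the initial center points of the base station clusters; their feature vectors are expressed as

$$c_1^{(1)}, c_2^{(1)}, \dots, c_k^{(1)}$$

where the superscript (1) denotes the first round, i.e. the initial value, and the subscript i denotes the i-th center point.

(13) Given the center point of each class, assign each base station to the class that minimizes the intra-class sum of squares, i.e. to its nearest center point:

$$H_i^{(t)} = \{\, p_m : \|p_m - c_i^{(t)}\|^2 \le \|p_m - c_j^{(t)}\|^2 \ \ \forall j,\ 1 \le j \le k \,\}$$

where $H_i^{(t)}$ denotes the set of base stations belonging to the i-th class in round t.

(14) Calculate a new center point for each class obtained in step (13):

$$c_i^{(t+1)} = \frac{1}{|H_i^{(t)}|} \sum_{p_m \in H_i^{(t)}} p_m$$

(15) Repeat (13) and (14) until the change in $c_i$ is less than a given threshold. Finally k classes $H_1, \dots, H_k$ are obtained, and each base station belongs to one of the classes.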
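The popularity matrix of step (11) could be assembled from a request log along the following lines. The log format of `(station_id, content_id)` pairs and the function name are hypothetical, and normalizing each row to relative frequencies is an assumption: the source only says "frequency of requests", so raw counts would be an equally valid reading.

```python
from collections import Counter


def popularity_matrix(request_log, num_stations, num_contents):
    """Build P, where P[m][f] is the share of base station m's requests that were for content f.

    request_log: iterable of (station_id, content_id) pairs (hypothetical format).
    """
    log = list(request_log)
    counts = Counter(log)                   # requests per (station, content) pair
    totals = Counter(m for m, _ in log)     # requests per station
    P = []
    for m in range(num_stations):
        total = totals[m]
        P.append([
            counts[(m, f)] / total if total else 0.0
            for f in range(num_contents)
        ])
    return P


# Hypothetical log: station 0 requests content 0 twice and content 1 once,
# station 1 requests content 2 once.
log = [(0, 0), (0, 0), (0, 1), (1, 2)]
P = popularity_matrix(log, num_stations=2, num_contents=3)
```

Each row of the result is the F-dimensional feature vector of one base station and can be passed directly to a k-means routine.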

Step (2), collaborative filtering based on the base stations within a class, comprises the following specific steps:

(21) calculating the similarity between base stations within a class

where base station $m_i$ and base station $m_j$ belong to the same class, and $T(m_i)$ and $T(m_j)$ denote the sets of content accessed by the users served by base stations $m_i$ and $m_j$ respectively. $T(f)$ denotes the set of base stations that have accessed, i.e. made requests for, content f.

The interest of base station $m_j$ in content f is taken from the elements of the content popularity matrix P.

The specific caching procedure in step (3) is as follows:

(31) The content popularity within each class (intra-class popularity) is analyzed first, i.e. the content requested by the users served by all base stations in the class is counted. The content is then sorted by number of accesses from high to low.

(32) The cache capacity of each base station m is $S_m$, and η is the percentage of $S_m$ occupied by content cached according to intra-class popularity. Content is first cached in order of intra-class popularity from high to low; before each content is cached, it is checked whether the total size of the cached content would exceed $\eta \times S_m$. If so, the content is not cached.
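The intra-class popularity ranking of step (31) amounts to summing request counts over the base stations of one class and sorting. A minimal sketch, with a hypothetical function name and sample data, and assuming the entries of P are request counts:

```python
def class_popularity_ranking(P, members):
    """Rank content ids from most to least requested within one class.

    P[m][f]: request count of base station m for content f;
    members: indices of the base stations that belong to the class.
    """
    totals = [sum(P[m][f] for m in members) for f in range(len(P[0]))]
    return sorted(range(len(totals)), key=lambda f: totals[f], reverse=True)


# Hypothetical counts for 3 base stations and 3 contents; the class holds stations 0 and 1.
P = [
    [9, 1, 0],
    [8, 2, 0],
    [0, 1, 9],
]
ranking = class_popularity_ranking(P, members=[0, 1])
```

The class totals here are 17, 3 and 0, so content 0 is cached first, followed by content 1.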

While the invention has been shown and described with respect to the preferred embodiments, it will be understood by those skilled in the art that various changes and modifications may be made without departing from the scope of the invention as defined in the following claims.

Claims

1. A caching method based on base station clustering is characterized by comprising the following steps:

(1) first, the base stations are clustered: the frequency with which the users served by each base station requested each content during a past period is counted; for each base station, these request frequencies are taken as the features of the base station, and the base stations are clustered by k-means clustering, so that the users served by the base stations in each class have similar interests, i.e. the content they request has much in common; the specific base station clustering comprises the following steps:

(15) repeating (13) and (14) until the change in $c_i$ is less than a given threshold, finally obtaining k classes $H_1, \dots, H_k$, each base station belonging to one of the classes;

(2) a base-station-based Top-N collaborative filtering recommendation system uses the similarity between base stations to predict interest in content that the users covered by a base station have not yet requested; the base-station-based collaborative filtering specifically comprises the following steps:

(21) calculating similarity between base stations within class

where $T(f)$ is the set of base stations that have made requests for content f, and the interest of base station $m_j$ in content f is taken from the elements of the content popularity matrix P;

(3) given each class obtained in step (1), targeted caching is performed at each base station according to the distribution of content popularity within the class, combined with collaborative filtering within the class, to determine the content cached by each base station; the specific caching procedure comprises the following steps: