CN112261668A - Content caching method and device in mobile edge network and electronic equipment - Google Patents

Content caching method and device in mobile edge network and electronic equipment Download PDF

Info

Publication number
CN112261668A
Authority
CN
China
Prior art keywords
content
user
probability
base station
contents
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202011125620.XA
Other languages
Chinese (zh)
Other versions
CN112261668B (en)
Inventor
景文鹏
陈冠鹏
赵书越
路兆铭
温向明
Current Assignee
Beijing University of Posts and Telecommunications
Original Assignee
Beijing University of Posts and Telecommunications
Priority date
Filing date
Publication date
Application filed by Beijing University of Posts and Telecommunications filed Critical Beijing University of Posts and Telecommunications
Priority to CN202011125620.XA priority Critical patent/CN112261668B/en
Publication of CN112261668A publication Critical patent/CN112261668A/en
Application granted granted Critical
Publication of CN112261668B publication Critical patent/CN112261668B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04W WIRELESS COMMUNICATION NETWORKS
    • H04W 28/00 Network traffic management; Network resource management
    • H04W 28/02 Traffic management, e.g. flow control or congestion control
    • H04W 28/10 Flow control between communication endpoints
    • H04W 28/14 Flow control between communication endpoints using intermediate storage
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 67/00 Network arrangements or protocols for supporting network services or applications
    • H04L 67/50 Network services
    • H04L 67/56 Provisioning of proxy services
    • H04L 67/568 Storing data temporarily at an intermediate stage, e.g. caching
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04W WIRELESS COMMUNICATION NETWORKS
    • H04W 16/00 Network planning, e.g. coverage or traffic planning tools; Network deployment, e.g. resource partitioning or cells structures
    • H04W 16/22 Traffic simulation tools or models

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Signal Processing (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • General Health & Medical Sciences (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Biophysics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Information Transfer Between Computers (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The embodiment of the disclosure discloses a content caching method and device in a mobile edge network, and electronic equipment, wherein the method comprises the following steps: acquiring access records of users to contents within a preset time period; training a probability prediction model with the access records, the probability prediction model being used for predicting the click probability of a user on a content; predicting, with the probability prediction model, the probability that each content in a content set will be accessed by the users; and storing content of the content set in a base station based on the probability. The technical scheme extracts user demand on the small base station side and balances personal preference against group preference: while meeting the QoE requirement of users, it selects as much of the content that most users are interested in as possible and caches it in the base station, and, on the premise of satisfying individual user preferences as far as possible, improves the hit rate of content cached at the network edge by jointly handling the content caching and recommendation problems.

Description

Content caching method and device in mobile edge network and electronic equipment
Technical Field
The present disclosure relates to the field of communications technologies, and in particular, to a content caching method and apparatus in a mobile edge network, and an electronic device.
Background
The proliferation of smart devices has led to explosive growth in mobile data traffic, resulting in delays for users and a significant increase in the load on core, backhaul, and radio access networks. Cisco's report shows that global mobile data traffic will increase 7-fold from 2016 to 2021, and that over 75% of the traffic will be video traffic by 2021. In fact, a significant proportion of mobile data traffic is triggered by repeated requests for a small amount of popular content, such as popular videos. To cope with the pressure that growing traffic puts on the network, and to serve more users with a better experience at lower cost, pre-caching popular content on base stations has received attention in recent years. Surveys show that the transmission of video traffic is divided into peak and off-peak periods. By deploying cache-capable small base stations (SBSs), the operator dynamically predicts user demand during off-peak hours and stores content accordingly in caches co-located with the SBSs near users. During peak periods of content demand, users can obtain content directly from the edge side, so that demand is satisfied locally, the transmission delay and the bandwidth usage of the backhaul link are reduced, and the quality of experience (QoE) of viewers is improved.
On the other hand, in order to solve the problem of information overload, recommendation systems have been widely deployed in online information systems, such as e-commerce platforms, social media websites, news portals, and the like. The recommendation system analyzes personal preferences of the user by analyzing historical access records of the user, so as to recommend content which is interested by the user.
The inventor of the present disclosure finds that most existing methods aim at reducing the backhaul load or the transmission delay, but rarely consider how to improve the cache hit rate and optimize the user experience.
Disclosure of Invention
The embodiment of the disclosure provides a content caching method and device in a mobile edge network and electronic equipment.
In a first aspect, an embodiment of the present disclosure provides a content caching method in a mobile edge network, including:
acquiring an access record of a user to the content within a preset time period;
training by using the access records to obtain a probability prediction model; the probability prediction model is used for predicting the click probability of the user on the content;
predicting the probability that each content in the content set will be accessed by the user by using the probability prediction model;
storing content in the set of content at a base station based on the probability.
Further, still include:
and pushing the content stored in the base station to a user.
Further, training by using the visit record to obtain a probability prediction model, comprising:
taking the contents clicked by the user in the access record as positive samples, and taking the part of the contents not clicked by the user in the access record as negative samples;
training the probabilistic predictive model using the positive and negative examples.
Further, storing the content in the content set in the base station near the user side according to the probability includes:
determining a second probability that each content in the content set is clicked by all users in the user set according to the first probability; the first probability is used for representing the probability that each content in the content set is clicked by each user in the user set;
sequencing all contents in the content set according to the second probability;
and sequentially storing the contents in the content set according to the sorting result, in descending order of the second probability.
Further, after storing the content in the content set in the base station near the user side according to the probability, the method further includes:
after a predetermined time interval has elapsed, a new preset time period is determined and the previous steps are repeatedly performed.
Further, the probability prediction model comprises a generalized matrix decomposition module and a multi-layer perceptron module; the generalized matrix decomposition module acquires potential interaction information of a user and content in a linear mode; the multi-layer perceptron module acquires potential interaction information of a user and content in a non-linear mode.
In a second aspect, an embodiment of the present disclosure provides a content caching apparatus in a mobile edge network, including:
the acquisition module is configured to acquire an access record of a user to the content within a preset time period;
a training module configured to train with the visit record to obtain a probabilistic predictive model; the probability prediction model is used for predicting the click probability of the user on the content;
a prediction module configured to predict a probability that each content in a set of contents will be accessed by a user using the probabilistic prediction model;
a storage module configured to store content in the set of content at a base station based on the probability.
These functions can be realized by hardware, or by hardware executing corresponding software. The hardware or software includes one or more modules corresponding to the above-described functions.
In one possible design, the content caching apparatus in the mobile edge network is configured to include a memory and a processor, the memory is configured to store one or more computer instructions that support the content caching apparatus in the mobile edge network to perform the method described in the first aspect, and the processor is configured to execute the computer instructions stored in the memory. The content caching apparatus in the mobile edge network may further include a communication interface for the content caching apparatus in the mobile edge network to communicate with other devices or a communication network.
In a third aspect, an embodiment of the present disclosure provides an electronic device, including a memory and a processor; wherein the memory is configured to store one or more computer instructions, wherein the one or more computer instructions are executed by the processor to implement the method of any of the above aspects.
In a fourth aspect, the disclosed embodiments provide a computer-readable storage medium for storing computer instructions for use by any of the above-mentioned apparatuses, including computer instructions for performing the method according to any of the above-mentioned aspects.
The technical scheme provided by the embodiment of the disclosure can have the following beneficial effects:
according to the method, the access records of the user to the content in the preset time period are obtained, the access records are trained to obtain the probability prediction model, the probability that each content in the content set is accessed by the user is predicted by the probability prediction model, and part or all of the content in the content set is stored in the base station based on the probability. By the method, the extraction of the user requirements of the small base station side is realized, the relationship between the personal preference and the group preference is balanced and considered, the content which most users are interested in is selected as much as possible and cached in the base station while the QoE requirement of the user is considered, and the hit rate of the content cached at the edge of the network is provided by jointly processing the content caching and recommendation problems on the premise of ensuring that the single user preference is met as much as possible.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.
Drawings
Other features, objects, and advantages of the present disclosure will become more apparent from the following detailed description of non-limiting embodiments when taken in conjunction with the accompanying drawings. In the drawings:
fig. 1 illustrates a flow chart of a content caching method in a mobile edge network according to an embodiment of the present disclosure;
FIG. 2 shows a flow chart of step S102 according to the embodiment shown in FIG. 1;
FIG. 3 shows a flowchart of step S104 according to the embodiment shown in FIG. 1;
FIG. 4 is a schematic diagram illustrating a flow of implementing a video caching policy in a mobile edge network according to an embodiment of the present disclosure;
fig. 5 is a block diagram illustrating a structure of a content caching apparatus in a mobile edge network according to an embodiment of the present disclosure;
fig. 6 is a schematic structural diagram of an electronic device suitable for implementing a content caching method in a mobile edge network according to an embodiment of the present disclosure.
Detailed Description
Hereinafter, exemplary embodiments of the present disclosure will be described in detail with reference to the accompanying drawings so that those skilled in the art can easily implement them. Also, for the sake of clarity, parts not relevant to the description of the exemplary embodiments are omitted in the drawings.
In the present disclosure, it is to be understood that terms such as "including" or "having," etc., are intended to indicate the presence of the disclosed features, numbers, steps, behaviors, components, parts, or combinations thereof, and are not intended to preclude the possibility that one or more other features, numbers, steps, behaviors, components, parts, or combinations thereof may be present or added.
It should be further noted that the embodiments and features of the embodiments in the present disclosure may be combined with each other without conflict. The present disclosure will be described in detail below with reference to the accompanying drawings in conjunction with embodiments.
The details of the embodiments of the present disclosure are described in detail below with reference to specific embodiments.
Fig. 1 shows a flowchart of a content caching method in a mobile edge network according to an embodiment of the present disclosure. As shown in fig. 1, the content caching method in the mobile edge network includes the following steps:
in step S101, an access record of a user to a content within a preset time period is obtained;
in step S102, a probability prediction model is obtained by using the access record training; the probability prediction model is used for predicting the click probability of the user on the content;
in step S103, predicting the probability that each content in the content set will be accessed by the user by using the probability prediction model;
in step S104, the content in the content set is stored in the base station based on the probability.
In this embodiment, the preset time period may be a past time period, such as a past hour, a past day, a past week, a past month, and the like. The content may be any content provided by an online information system (e.g., an e-commerce platform, an audio-video application system, a social media website, a news portal, etc.), such as textual content, video, audio, merchandise, etc.
Step 101
In some embodiments, the access record of the user within the preset time period can be obtained by analyzing the log records of the user in the online information system. Assume that the set of users is U = \{1, 2, \ldots, N\}, where N denotes the number of users, and the set of contents is I = \{1, 2, \ldots, F\}, where F denotes the number of contents. The content interaction matrix of the users is defined as Y \in \mathbb{R}^{N \times F}, with entries

y_{ui} = \begin{cases} 1, & \text{if an interaction record exists between user } u \text{ and content } i \\ 0, & \text{otherwise} \end{cases}

That is, y_{ui} = 1 indicates that there is an interaction record between the user and the content, and y_{ui} = 0 indicates that there is none.
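As a minimal illustration of the step above, the interaction matrix Y can be assembled from raw (user, content) log entries; the Python sketch below assumes 0-based integer ids and in-memory log records, which are simplifications made here and not part of the embodiment.

```python
def build_interaction_matrix(access_records, num_users, num_contents):
    """Assemble the user-content interaction matrix Y (N x F) from log entries.

    access_records: iterable of (user_id, content_id) pairs extracted from the
    online information system's logs over the preset time period (0-based ids
    are an assumption of this sketch).
    """
    Y = [[0] * num_contents for _ in range(num_users)]
    for u, i in access_records:
        Y[u][i] = 1  # y_ui = 1: an interaction record exists
    return Y

# Hypothetical log entries: user 0 clicked contents 1 and 3, user 1 clicked content 2.
records = [(0, 1), (0, 3), (1, 2)]
Y = build_interaction_matrix(records, num_users=2, num_contents=4)
```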
Step S102
In some embodiments, a probabilistic predictive model is used to estimate the probability that a user clicks on a piece of content. The probabilistic predictive model may use a machine learning model; for example, a neural collaborative filtering network (NCF) from the recommendation-system literature may be employed. The neural collaborative filtering network may be pre-trained for analyzing relationships between users and content. The NCF network includes a Generalized Matrix Factorization (GMF) module and a multi-layer perceptron (MLP) module. The GMF obtains the latent interaction information of the user and the content in a linear manner, while the MLP obtains it in a non-linear manner. The GMF and the MLP learn continuously through their respective embedding layers, and information fusion is finally achieved through a hidden layer connecting the two. This gives the model more flexibility, so that the fused model achieves better performance.
In some embodiments, the above scheme may be represented using the following mathematical model:

\phi^{GMF} = p_u^G \odot q_i^G

\phi^{MLP} = a_L\Big(W_L^T\Big(a_{L-1}\big(\cdots a_2\big(W_2^T \begin{bmatrix} p_u^M \\ q_i^M \end{bmatrix} + b_2\big) \cdots\big)\Big) + b_L\Big)

\hat{y}_{ui} = \sigma\Big(h^T \begin{bmatrix} \phi^{GMF} \\ \phi^{MLP} \end{bmatrix}\Big)

In the above formulas, \phi^{GMF} represents the output of the GMF and \phi^{MLP} the output of the MLP; p_u^G and p_u^M represent the user embedding vectors of the GMF and the MLP respectively (i.e., the vectors obtained after the user's representation passes through the input layers of the GMF and the MLP); correspondingly, q_i^G and q_i^M represent the content embedding vectors of the GMF and the MLP respectively; W, b and a represent the weight matrix, bias vector and activation function of each layer in the MLP; h denotes the edge weights of the output layer and \sigma denotes the Sigmoid activation function. The final output \hat{y}_{ui} represents the probability, predicted by the model, that the user clicks on the content. The model extracts latent relationship information between the user and the content by combining the linear features of the GMF and the non-linear features of the neural network.
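For concreteness, the fused prediction described above can be sketched in plain Python. The tiny embedding dimensions, the ReLU activation for the hidden layers, and the hand-picked weights below are illustrative assumptions, not the embodiment's actual configuration.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def matvec(W, v):
    # multiply a weight matrix (list of rows) by a vector
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def ncf_forward(p_gmf, q_gmf, p_mlp, q_mlp, mlp_layers, h):
    # GMF branch: element-wise product of user and content embeddings (linear)
    phi_gmf = [p * q for p, q in zip(p_gmf, q_gmf)]
    # MLP branch: concatenate embeddings, then pass through ReLU layers (non-linear)
    z = p_mlp + q_mlp
    for W, b in mlp_layers:
        z = [max(0.0, s + bi) for s, bi in zip(matvec(W, z), b)]
    # Fuse both branches, then map through the output-layer weights h and Sigmoid
    fused = phi_gmf + z
    return sigmoid(sum(hi * x for hi, x in zip(h, fused)))

# Toy example: 2-dimensional embeddings and a single hidden layer.
y_hat = ncf_forward(
    p_gmf=[1.0, 2.0], q_gmf=[0.5, 0.5],
    p_mlp=[1.0, 0.0], q_mlp=[0.0, 1.0],
    mlp_layers=[([[0.5, 0.0, 0.0, 0.5], [0.0, 0.5, 0.5, 0.0]], [0.0, 0.0])],
    h=[0.25, 0.25, 0.25, 0.25],
)
```

The output is a value in (0, 1), interpretable as the predicted click probability for this user-content pair.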
Taking y_{ui} as the label: when y_{ui} = 1, user u is related to content i; when y_{ui} = 0, user u is unrelated to content i. The predicted value \hat{y}_{ui} can therefore represent the probability that content i is relevant to user u. To achieve this, the predicted value \hat{y}_{ui} is restricted to the range [0, 1]. In the embodiment of the disclosure, a probability function (e.g. the Logistic or Probit function) is adopted as the activation function of the output layer to compress the value of \hat{y}_{ui}. Thus, the likelihood function may be defined as:

p(\mathcal{Y}, \mathcal{Y}^- \mid P, Q, \Theta_f) = \prod_{(u,i) \in \mathcal{Y}} \hat{y}_{ui} \prod_{(u,j) \in \mathcal{Y}^-} (1 - \hat{y}_{uj})

In the above equation, P and Q represent the latent factor matrices of users and contents respectively, \Theta_f represents the model parameters, \mathcal{Y} represents the positive samples, i.e. the set of user-content pairs with interaction behavior, and \mathcal{Y}^- represents the negative samples, i.e. the set of user-content pairs with no interaction behavior.

Taking the negative logarithm of the likelihood function yields the binary cross-entropy loss:

L = -\sum_{(u,i) \in \mathcal{Y} \cup \mathcal{Y}^-} \big[ y_{ui} \log \hat{y}_{ui} + (1 - y_{ui}) \log(1 - \hat{y}_{ui}) \big]

Based on the data set of user-content interaction records obtained in the first step, the neural network is trained by continuously reducing this loss between the predicted value \hat{y}_{ui} and its target value y_{ui}; the final output value can be regarded as the probability P_i^u that the user clicks content i at the next moment.
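The negative log-likelihood above is the standard binary cross-entropy; a minimal sketch follows (clamping the predictions to keep the logarithm finite is an implementation detail added here, not part of the derivation):

```python
import math

def bce_loss(labels, predictions, eps=1e-12):
    """Mean binary cross-entropy over the labelled (positive and negative) pairs."""
    total = 0.0
    for y, p in zip(labels, predictions):
        p = min(max(p, eps), 1.0 - eps)  # clamp so log() stays finite
        total -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return total / len(labels)

# Two confident correct predictions and one less confident one.
loss = bce_loss([1, 0, 1], [0.9, 0.1, 0.8])
```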
In some alternative implementations, as shown in fig. 2, step S102, namely the step of training to obtain the probabilistic predictive model by using the visit record, further includes the following steps:
in step S201, the content clicked by the user in the access record is taken as a positive sample, and the part of the content not clicked by the user in the access record is taken as a negative sample;
in step S202, the probabilistic predictive model is trained using the positive and negative examples.
A content access matrix is established for the user set and the content set from the user access records acquired from the system log, and each element in the access matrix represents the access of a user to a certain content within the past preset time period. The positive and negative samples for training the probability prediction model are obtained from the records in the access matrix: the contents accessed by a user within the past preset time period serve as positive samples, and the contents not accessed by the user within that period serve as negative samples. The label corresponding to a positive sample is 1, i.e. the probability that the content in the positive sample is accessed by the user is 1; the label corresponding to a negative sample is 0, i.e. the probability that the content in the negative sample is accessed by the user is 0. In some embodiments, a portion of the data in the content access matrix may be taken as positive and negative samples for training the probabilistic predictive model.
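The sampling just described can be sketched as follows. Drawing a fixed number of negatives per positive (rather than using all unclicked entries) is a common training heuristic assumed here; the embodiment does not mandate a particular ratio.

```python
import random

def sample_training_pairs(Y, neg_per_pos=4, seed=0):
    """Label clicked entries (y_ui = 1) as positives and sample a subset of the
    unclicked entries (y_ui = 0) as negatives, returning (user, content, label)."""
    rng = random.Random(seed)
    positives = [(u, i, 1) for u, row in enumerate(Y) for i, v in enumerate(row) if v == 1]
    unclicked = [(u, i) for u, row in enumerate(Y) for i, v in enumerate(row) if v == 0]
    k = min(len(unclicked), neg_per_pos * len(positives))
    negatives = [(u, i, 0) for u, i in rng.sample(unclicked, k)]
    return positives + negatives

# Toy access matrix: 2 users, 3 contents, one click each.
pairs = sample_training_pairs([[1, 0, 0], [0, 1, 0]], neg_per_pos=1)
```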
After the probability prediction model is obtained through training, the probability prediction model can be used for predicting the probability of the content set being accessed by each user, and then the probability that the content is possibly accessed by the user in the future is determined according to the probability.
Step S103
In the step, a content caching strategy of the joint recommendation system is designed by comprehensively considering the relationship between the personal preference and the group preference of the user. And selecting and caching the content which is interested by most users at the base station side as far as possible while considering the QoE requirement of the users.
Assuming that n is any user in the user set U, the probability prediction model is used to predict the probability that each content in the content set I will be clicked by user n in the future, which may be expressed as

P^n = \{P_1^n, P_2^n, \ldots, P_F^n\}

Assuming that f is a content in the content set I (e.g., any content that has not been accessed by the user), the total probability of content f being clicked by the users in the user set can be expressed as

P_f = \sum_{n \in U} P_f^n

The total probabilities of all contents in the content set I being clicked by all users are recorded as P_I = \{P_1, P_2, \ldots, P_F\}.
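The per-user predictions can be combined into the total probabilities P_I with a simple sum over users, for example:

```python
def aggregate_click_probabilities(per_user_probs):
    """per_user_probs[n][f] is the predicted probability that user n clicks
    content f; returns P_I, the total click probability of each content."""
    num_contents = len(per_user_probs[0])
    return [sum(user[f] for user in per_user_probs) for f in range(num_contents)]

# Two users, two contents: content 1 is collectively more popular.
P_I = aggregate_click_probabilities([[0.2, 0.8], [0.4, 0.9]])
```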
Step S104
After the probability of each content in the content set being accessed by the user in the future is predicted by using the probability prediction model, the content can be cached according to the probability, for example, the content with high probability can be cached preferentially, and the content with low probability can not be cached when the cache space of the base station is not enough.
In some alternative implementations, as shown in fig. 3, step S104, namely the step of storing the content in the content set in the base station based on the probability, further includes the following steps:
in step S301, determining a second probability that each content in the content set is clicked by all users in the user set according to the first probability; the first probability is used for representing the probability that each content in the content set is clicked by each user in the user set;
in step S302, sorting the contents in the content set according to the second probability;
in step S303, the contents in the content set are sequentially stored according to the sorting result, in descending order of the second probability.
In the optional implementation manner, after a first probability that each content in the content set is clicked by each user in the user set in the future is predicted by using the probability prediction model, determining a total probability that each content in the content set is clicked by all users in the user set, namely a second probability, according to the first probability; and sequencing the contents in the content set according to the second probability, and sequentially storing the contents in the content set from the large to the small according to the second probability until the storage space in the base station is exhausted.
Suppose the cache space size of the base station is C_{max}, S_f is the size of the f-th cached content, and C_f indicates whether the base station has cached content f, e.g. C_f = 1 indicates that the base station has cached content f. Then P_I is sorted by probability, and the contents at the front of the ranking are cached in turn until no cache space is available, i.e. the cached contents satisfy

\sum_{f \in I} C_f S_f \le C_{max}

The hit rate HR is the proportion, among all the contents clicked by users at the next moment, of the contents that have been cached by the base station. Denoting by N_f the number of real user clicks on content f at the next moment, it can be expressed as:

HR = \frac{\sum_{f \in I} C_f N_f}{\sum_{f \in I} N_f}
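The capacity-constrained selection and the hit-rate measure can be sketched as follows. Skipping an item that does not fit and continuing down the ranking is an assumption of this sketch; the text only specifies caching in descending probability order until space runs out.

```python
def fill_cache(total_probs, sizes, capacity):
    """Cache contents in descending order of total click probability, subject
    to the constraint that the cached sizes sum to at most capacity (C_max)."""
    order = sorted(range(len(total_probs)), key=lambda f: total_probs[f], reverse=True)
    cached, used = set(), 0
    for f in order:
        if used + sizes[f] <= capacity:
            cached.add(f)
            used += sizes[f]
    return cached

def hit_rate(cached, clicks):
    """clicks[f]: number of real user clicks on content f at the next moment."""
    total = sum(clicks)
    hits = sum(c for f, c in enumerate(clicks) if f in cached)
    return hits / total if total else 0.0

cached = fill_cache(total_probs=[0.9, 0.5, 0.8], sizes=[2, 1, 2], capacity=3)
hr = hit_rate(cached, clicks=[10, 5, 5])
```

In this toy run the most probable content (size 2) is cached first, the second-ranked one no longer fits, and the third fills the remaining unit of space.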
in some optional implementations, after the step of training the probability prediction model by using the visit record at step S104, the method further includes the following steps:
and pushing the content stored in the base station to a user.
In some embodiments, the content provider may actively push the base station cache content to the user, thereby further improving the cache hit rate, reducing traffic load, and optimizing user experience.
After active recommendation, the cache hit rate HR* can be recalculated according to the hit-rate formula given above.
In some optional implementations, after step S104, namely the step of storing the content in the content set in the base station based on the probability, the method further includes the following steps:
after a predetermined time interval has elapsed, a new preset time period is determined and the previous steps are repeatedly performed.
In this optional implementation manner, the user access record in the next time period is obtained, the process returns to step S101, and the above steps are repeatedly performed, so as to dynamically adjust the cache content of the base station, optimize the cache performance, and optimize the user experience.
According to the method, the content caching strategy of the combined recommendation system is designed, the recommendation system is used for predicting the personal preference of the user, then the personal preference and the group preference are weighed and considered, and most of the content which the user is interested in is selected as far as possible and cached at the base station side while the user requirements are considered. By jointly processing the content caching and recommendation problems, the hit rate of the content caching at the network edge is improved.
The content cached by the base station is actively recommended to the user at the base station side through the content provider, and the user is attracted to click the cached content, so that the cache hit rate is further improved, the load of a backward transmission link is reduced, the network delay is reduced, and the user experience is optimized. And by adopting an interval updating method, the cache content of the base station is continuously updated, the user experience is optimized, and the cache performance of the base station is improved.
Fig. 4 is a schematic diagram illustrating an implementation flow of a video caching policy in a mobile edge network according to an embodiment of the present disclosure. As shown in fig. 4, the process includes the following steps:
in step S401, access records of the user at a certain time interval are acquired, and data is preprocessed.
In step S402, a deep neural network is constructed and trained, and the probability of the user clicking the content is predicted.
In step S403, a content caching policy of the joint recommendation system is designed by comprehensively considering the relationship between the personal preference and the group preference of the user. And selecting cache contents which are interested by most users as far as possible while considering the QoE requirements of the users.
In step S404, the buffer of the base station is dynamically adjusted by using the selected buffer content, where the base station is a base station close to the user side, that is, a base station in the edge network.
In step S405, the content provider actively recommends the cached content of the base station, so as to attract the user to click on the cached content of the base station.
In step S406, a user access record in the next time period is obtained, and the process returns to step S401, so as to continuously optimize the cache performance and the user experience.
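The periodic refresh of steps S401-S406 can be sketched as a simple control loop. Every callable below is a hypothetical hook standing in for the corresponding component described above, and the fixed sleep interval is an assumption of this sketch.

```python
import time

def caching_loop(get_access_records, train_model, predict_probs,
                 update_cache, push_to_users, interval_seconds, rounds):
    for _ in range(rounds):
        records = get_access_records()   # S401: collect and preprocess access logs
        model = train_model(records)     # S402: train the probability prediction model
        probs = predict_probs(model)     # S403: predict content click probabilities
        update_cache(probs)              # S404: refresh the edge base station's cache
        push_to_users()                  # S405: actively recommend the cached content
        time.sleep(interval_seconds)     # S406: wait, then repeat on fresh records

# Stub hooks to demonstrate the control flow only.
calls = []
caching_loop(
    get_access_records=lambda: calls.append("collect") or [],
    train_model=lambda r: calls.append("train"),
    predict_probs=lambda m: calls.append("predict") or [],
    update_cache=lambda p: calls.append("cache"),
    push_to_users=lambda: calls.append("push"),
    interval_seconds=0, rounds=2,
)
```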
The following are embodiments of the disclosed apparatus that may be used to perform embodiments of the disclosed methods.
Fig. 5 is a block diagram illustrating a structure of a content caching apparatus in a mobile edge network according to an embodiment of the present disclosure, which may be implemented as part of or all of an electronic device by software, hardware, or a combination of the two. As shown in fig. 5, the content caching apparatus in the mobile edge network includes:
an obtaining module 501, configured to obtain a record of access to content by a user within a preset time period;
a training module 502 configured to train with the visit record to obtain a probability prediction model; the probability prediction model is used for predicting the click probability of the user on the content;
a prediction module 503 configured to predict a probability that each content in the content set will be accessed by the user using the probability prediction model;
a storage module 504 configured to store the content of the set of content at a base station based on the probability.
In this embodiment, the preset time period may be a past time period, such as a past hour, a past day, a past week, a past month, and the like. The content may be any content provided by an online information system (e.g., an e-commerce platform, an audio-video application system, a social media website, a news portal, etc.), such as textual content, video, audio, merchandise, etc.
Acquisition Module 501
In some embodiments, the access records of users within the preset time period can be obtained by analyzing the users' log records in the online information system. Assume the set of users is U and the set of contents is I, where U = {1, 2, ..., N} with N denoting the number of users, and I = {1, 2, ..., F} with F denoting the number of contents. The content interaction matrix of the users is defined as $Y \in \mathbb{R}^{N \times F}$, whose entries are

$y_{ui} = \begin{cases} 1, & \text{there is an interaction record between user } u \text{ and content } i \\ 0, & \text{otherwise} \end{cases}$
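As an illustration (not part of the patent text), the interaction matrix Y can be built from (user, content) access pairs roughly as follows:

```python
import numpy as np

def interaction_matrix(access_pairs, n_users, n_contents):
    """Y[u, i] = 1 iff user u has an interaction record with content i, else 0."""
    Y = np.zeros((n_users, n_contents), dtype=np.int8)
    for u, i in access_pairs:
        Y[u, i] = 1            # repeated accesses still yield a single binary flag
    return Y
```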
Training module 502
In some embodiments, a probabilistic predictive model is used to estimate the probability that a user clicks on a piece of content. The model may be a machine learning model; for example, a neural collaborative filtering (NCF) network from the recommendation-system field may be employed. The neural collaborative filtering network may be pre-trained to analyze relationships between users and contents. The NCF network includes a Generalized Matrix Factorization (GMF) module and a Multi-Layer Perceptron (MLP) module. The GMF captures latent user-content interaction information in a linear manner, while the MLP captures it in a nonlinear manner. GMF and MLP learn through their respective embedding layers, and information fusion is finally achieved through a hidden layer connecting the two. This design gives the model more flexibility, so that the fused model achieves better performance.
In some embodiments, the above scheme may be represented by the following mathematical model:

$\phi^{GMF} = p_u^G \odot q_i^G$

$\phi^{MLP} = a_L\big(W_L^T\, a_{L-1}(\cdots a_2(W_2^T [p_u^M ; q_i^M] + b_2)\cdots) + b_L\big)$

$\hat{y}_{ui} = \sigma\big(h^T [\phi^{GMF} ; \phi^{MLP}]\big)$

In the above formulas, $[\cdot\,;\cdot]$ denotes vector concatenation; $\phi^{GMF}$ represents the output of the GMF and $\phi^{MLP}$ the output of the MLP; $p_u^G$ and $p_u^M$ are the user embedding vectors of the GMF and MLP, respectively (i.e., the representation of the user after passing through the input layer of each module); correspondingly, $q_i^G$ and $q_i^M$ are the content embedding vectors of the GMF and MLP, respectively. $W$, $b$ and $a$ denote the weight matrix, bias vector and activation function of each layer in the MLP; $h$ denotes the edge weights of the output layer and $\sigma$ denotes the Sigmoid activation function. The final output $\hat{y}_{ui}$ represents the model's predicted probability that user $u$ clicks content $i$. The model extracts latent relationship information between users and contents by combining the linear features of the GMF with the nonlinear features of the neural network.
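The fused forward pass can be illustrated with a toy NumPy model. This is a hedged sketch with random, untrained weights and a single MLP layer with ReLU activation; the patent does not fix the depth, dimensions, or hidden-layer activation of the MLP, so those choices here are assumptions.

```python
import numpy as np

class TinyNCF:
    """Forward pass of the GMF + MLP fusion: y_hat = sigmoid(h . [phi_GMF ; phi_MLP])."""
    def __init__(self, n_users, n_items, d=8, hidden=16, seed=0):
        rng = np.random.default_rng(seed)
        self.Pg, self.Qg = rng.normal(size=(n_users, d)), rng.normal(size=(n_items, d))  # GMF embeddings
        self.Pm, self.Qm = rng.normal(size=(n_users, d)), rng.normal(size=(n_items, d))  # MLP embeddings
        self.W, self.b = rng.normal(size=(2 * d, hidden)), np.zeros(hidden)              # one MLP layer
        self.h = rng.normal(size=d + hidden)                                             # output edge weights

    def predict(self, u, i):
        phi_gmf = self.Pg[u] * self.Qg[i]                       # linear (element-wise) interaction
        z = np.concatenate([self.Pm[u], self.Qm[i]])            # concatenated MLP embeddings
        phi_mlp = np.maximum(0.0, z @ self.W + self.b)          # nonlinear interaction (ReLU assumed)
        logit = np.concatenate([phi_gmf, phi_mlp]) @ self.h     # fusion through output weights h
        return float(1.0 / (1.0 + np.exp(-logit)))              # Sigmoid keeps the output in [0, 1]
```

A trained model would fit the embeddings and weights to the interaction matrix; here the point is only the shape of the computation.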
$y_{ui}$ is used as the label: $y_{ui} = 1$ indicates that user u is related to content i, and $y_{ui} = 0$ indicates that user u is unrelated to content i. The predicted value $\hat{y}_{ui}$ can therefore represent the probability that content i is relevant to user u. To achieve this, $\hat{y}_{ui}$ is constrained to the range [0, 1]. In the embodiment of the disclosure, a probability function (e.g., the Logistic or Probit function) is adopted as the activation function of the output layer to compress the value of $\hat{y}_{ui}$ into this range. Thus, the likelihood function may be defined as:

$p(\mathcal{Y}, \mathcal{Y}^- \mid P, Q, \Theta_f) = \prod_{(u,i)\in\mathcal{Y}} \hat{y}_{ui} \prod_{(u,j)\in\mathcal{Y}^-} (1 - \hat{y}_{uj})$

In the above equation, P and Q represent the latent factor matrices of users and contents, respectively, and $\Theta_f$ represents the model parameters; $\mathcal{Y}$ denotes the positive samples, i.e., the set of user-content pairs with interaction behavior, and $\mathcal{Y}^-$ denotes the negative samples, i.e., the set of pairs without interaction behavior.

Taking the negative logarithm of the likelihood function yields:

$L = -\sum_{(u,i)\in\mathcal{Y}} \log \hat{y}_{ui} - \sum_{(u,j)\in\mathcal{Y}^-} \log (1 - \hat{y}_{uj}) = -\sum_{(u,i)\in\mathcal{Y}\cup\mathcal{Y}^-} \big[\, y_{ui}\log \hat{y}_{ui} + (1-y_{ui})\log(1-\hat{y}_{ui}) \,\big]$

Based on the data set of user-content interaction records obtained in the first step, the neural network is trained by continuously reducing this loss between the predicted value $\hat{y}_{ui}$ and its target value $y_{ui}$; the final output can be regarded as the probability $P_i^u$ that the user clicks the content at the next moment.
In some alternative implementations, the training module 502 may be further implemented to:
taking the contents clicked by the user in the access record as positive samples, and taking the part of the contents not clicked by the user in the access record as negative samples;
training the probabilistic predictive model using the positive and negative examples.
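One common way to realize this (a sketch, not the patent's prescribed procedure) is to take every observed interaction in the access matrix as a positive sample and to randomly sample unobserved (user, content) pairs as negatives; the number of negatives per positive is a free parameter assumed here.

```python
import numpy as np

def build_samples(Y, neg_per_pos=4, seed=0):
    """Return (u, i, label) triples: label 1 for observed interactions, 0 for sampled non-interactions."""
    rng = np.random.default_rng(seed)
    n_items = Y.shape[1]
    samples = []
    for u, i in zip(*np.nonzero(Y)):
        samples.append((int(u), int(i), 1))        # positive sample, label 1
        for _ in range(neg_per_pos):
            j = int(rng.integers(n_items))
            while Y[u, j]:                         # resample until an unclicked content is hit
                j = int(rng.integers(n_items))     # (assumes no user has clicked every content)
            samples.append((int(u), j, 0))         # negative sample, label 0
    return samples
```

The resulting triples can be fed directly to the training loop of the probabilistic predictive model with label 1 meaning "accessed within the past preset time period" and label 0 "not accessed".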
A content access matrix is established for the user set and the content set from the user access records obtained from the system log; each element of the matrix represents whether a user accessed a given content within the past preset time period. The positive and negative samples for training the probabilistic predictive model are obtained from the records in this matrix: contents accessed by a user within the past preset time period serve as positive samples with label 1 (i.e., the probability of being accessed is 1), and contents not accessed within that period serve as negative samples with label 0 (i.e., the probability of being accessed is 0). In some embodiments, only a portion of the data in the content access matrix may be taken as positive and negative samples for training.
After the probability prediction model is obtained through training, it can be used to predict the probability that each content in the content set will be accessed by each user, from which the contents likely to be accessed by users in the future can be determined.
Prediction module 503
In this module, a content caching policy combined with the recommendation system is designed by jointly considering users' personal preferences and group preferences: while meeting users' QoE requirements, contents of interest to as many users as possible are selected and cached at the base-station side.
Let n be any user in the user set U. The probability prediction model is used to predict the probability that user n will click each content f in the content set I in the future, denoted $P_f^n$. For any content f in I (e.g., a content not yet accessed by the user), the total probability of being clicked by the users in the user set can then be expressed as

$P_f = \sum_{n \in U} P_f^n$

The total probabilities of all contents in I being clicked by all users are recorded as $P_I = \{P_1, P_2, \ldots, P_F\}$.
Memory module 504
After the probability prediction model has predicted, for each content in the content set, the probability of being accessed by users in the future, contents can be cached according to these probabilities: for example, contents with high probability are cached preferentially, and contents with low probability may be left uncached when the cache space of the base station is insufficient.
In some alternative implementations, the storage module 504 may be further implemented as:
determining a second probability that each content in the content set is clicked by all users in the user set according to the first probability; the first probability is used for representing the probability that each content in the content set is clicked by each user in the user set;
sequencing all contents in the content set according to the second probability;
and sequentially storing the contents in the content set according to the sequencing result and the sequence of the second probability from large to small.
In this optional implementation, after the first probability that each content in the content set will be clicked by each user in the user set is predicted with the probability prediction model, the total probability that each content is clicked by all users in the user set, i.e., the second probability, is determined from the first probability; the contents in the content set are then sorted by the second probability and stored in descending order of the second probability until the storage space of the base station is exhausted.
Suppose the cache space of the base station is $C_{max}$ and $S_f$ is the size of the f-th cached content; $C_f$ indicates whether the base station has cached content f, e.g., $C_f = 1$ indicates that the base station has cached content f. Then $P_I$ is sorted by probability, and the top-ranked contents are cached in order until no cache space remains, i.e., subject to

$\sum_{f=1}^{F} C_f \cdot S_f \le C_{max}$
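The selection rule above can be sketched as a greedy fill of the cache. One detail the text leaves open is whether a content that does not fit stops the process or is skipped; this sketch skips it and tries the next-ranked content, which still satisfies the capacity constraint.

```python
import numpy as np

def select_cache(P, sizes, c_max):
    """P[n, f]: predicted click probability of content f for user n.
    Returns indices of cached contents, chosen greedily by total probability."""
    totals = P.sum(axis=0)                    # P_f = sum over users n of P_f^n
    cached, used = [], 0.0
    for f in np.argsort(-totals):             # descending by total click probability
        if used + sizes[f] <= c_max:          # capacity constraint: sum C_f * S_f <= C_max
            cached.append(int(f))
            used += sizes[f]
    return cached
```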
The hit rate HR is the proportion of the contents actually clicked by users at the next moment that are already cached by the base station. Denoting by $d_f$ the number of times content f is actually clicked at the next moment, this can be expressed as:

$HR = \dfrac{\sum_{f \in I} C_f \cdot d_f}{\sum_{f \in I} d_f}$
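Given the click counts of the next interval, the hit rate defined above can be computed as follows (a sketch; `clicks`, mapping each content id to its actual number of clicks, is a bookkeeping structure assumed here):

```python
def hit_rate(cached, clicks):
    """HR = clicks on cached contents / all clicks at the next moment."""
    cached = set(cached)
    hits = sum(n for f, n in clicks.items() if f in cached)
    total = sum(clicks.values())
    return hits / total if total else 0.0   # define HR = 0 when there were no clicks
```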
in some optional implementations, the apparatus further comprises:
a push module configured to push the content stored in the base station to a user.
In some embodiments, the content provider may actively push the base station cache content to the user, thereby further improving the cache hit rate, reducing traffic load, and optimizing user experience.
After active recommendation, the cache hit rate $HR^*$ can be recalculated according to the hit-rate formula given above.
In some optional implementations, the apparatus further comprises:
and the repeated execution module is configured to determine a new preset time period after a preset time interval, and repeatedly execute the previous steps.
In this optional implementation, the user access records of the next time period are acquired and the above modules, starting from the obtaining module 501, are executed repeatedly, so as to dynamically adjust the cached content of the base station and continuously optimize cache performance and user experience.
According to the apparatus, a content caching policy combined with a recommendation system is designed: the recommendation system predicts users' personal preferences, personal and group preferences are then weighed against each other, and, while meeting user requirements, contents of interest to most users are selected as far as possible and cached at the base-station side. By jointly handling the content caching and recommendation problems, the hit rate of content caching at the network edge is improved.
The content cached by the base station is actively recommended to users by the content provider on the base-station side, attracting users to click the cached content, which further improves the cache hit rate, reduces the load on the backhaul link, lowers network delay, and improves the user experience. In addition, the interval-update method continuously refreshes the cached content of the base station, improving both the user experience and the cache performance of the base station.
Fig. 6 is a schematic structural diagram of an electronic device suitable for implementing a content caching method in a mobile edge network according to an embodiment of the present disclosure.
As shown in FIG. 6, the electronic device 600 includes a processing unit 601, which may be implemented as a CPU, GPU, FPGA, NPU, or other processing unit. The processing unit 601 may perform the various processes of any of the above-described method embodiments of the present disclosure according to a program stored in a read-only memory (ROM) 602 or a program loaded from a storage section 608 into a random-access memory (RAM) 603. The RAM 603 also stores various programs and data necessary for the operation of the electronic device 600. The processing unit 601, the ROM 602, and the RAM 603 are connected to each other via a bus 604. An input/output (I/O) interface 605 is also connected to the bus 604.
The following components are connected to the I/O interface 605: an input section 606 including a keyboard, a mouse, and the like; an output section 607 including a display such as a cathode ray tube (CRT) or a liquid crystal display (LCD), and a speaker; a storage section 608 including a hard disk and the like; and a communication section 609 including a network interface card such as a LAN card or a modem. The communication section 609 performs communication processing via a network such as the Internet. A drive 610 is also connected to the I/O interface 605 as needed. A removable medium 611, such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory, is mounted on the drive 610 as necessary, so that a computer program read therefrom is installed into the storage section 608 as needed.
In particular, according to embodiments of the present disclosure, any of the methods described above with reference to embodiments of the present disclosure may be implemented as a computer software program. For example, embodiments of the present disclosure include a computer program product comprising a computer program tangibly embodied on a computer-readable medium, the computer program comprising program code for performing any of the methods of the embodiments of the present disclosure. In such embodiments, the computer program may be downloaded and installed from a network through the communication section 609, and/or installed from the removable medium 611.
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowcharts or block diagrams may represent a module, a program segment, or a portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The units or modules described in the embodiments of the present disclosure may be implemented by software or hardware. The units or modules described may also be provided in a processor, and the names of the units or modules do not in some cases constitute a limitation of the units or modules themselves.
As another aspect, the present disclosure also provides a computer-readable storage medium, which may be the computer-readable storage medium included in the apparatus in the above-described embodiment; or it may be a separate computer readable storage medium not incorporated into the device. The computer readable storage medium stores one or more programs for use by one or more processors in performing the methods described in the present disclosure.
The foregoing description covers only the preferred embodiments of the disclosure and illustrates the principles of the technology employed. Those skilled in the art will appreciate that the scope of the invention in the present disclosure is not limited to technical solutions formed by the specific combination of the above features, and also covers other technical solutions formed by any combination of the above features or their equivalents without departing from the inventive concept, for example, technical solutions in which the above features are replaced with (but not limited to) features having similar functions disclosed in this disclosure.

Claims (9)

1. A content caching method in a mobile edge network comprises the following steps:
acquiring an access record of a user to the content within a preset time period;
training by using the access records to obtain a probability prediction model; the probability prediction model is used for predicting the click probability of the user on the content;
predicting the probability that each content in the content set will be accessed by the user by using the probability prediction model;
storing content in the set of content at a base station based on the probability.
2. The method of claim 1, further comprising:
and pushing the content stored in the base station to a user.
3. The method of claim 1 or 2, wherein training with the visit record results in a probabilistic predictive model comprising:
taking the contents clicked by the user in the access record as positive samples, and taking the part of the contents not clicked by the user in the access record as negative samples;
training the probabilistic predictive model using the positive and negative examples.
4. The method of claim 3, wherein storing the content in the content set in the base station near the user side according to the probability comprises:
determining a second probability that each content in the content set is clicked by all users in the user set according to the first probability; the first probability is used for representing the probability that each content in the content set is clicked by each user in the user set;
sequencing all contents in the content set according to the second probability;
and sequentially storing the contents in the content set according to the sequencing result and the sequence of the second probability from large to small.
5. The method according to any of claims 1-2 and 4, wherein after storing the content in the content set in the base station near the user side according to the probability, the method further comprises:
after a predetermined time interval has elapsed, a new preset time period is determined and the previous steps are repeatedly performed.
6. The method according to any one of claims 1-2, 4, wherein the probabilistic predictive model includes a generalized matrix factorization module and a multi-layered perceptron module; the generalized matrix decomposition module acquires potential interaction information of a user and content in a linear mode; the multi-layer perceptron module acquires potential interaction information of a user and content in a non-linear mode.
7. A content caching apparatus in a mobile edge network, comprising:
the acquisition module is configured to acquire an access record of a user to the content within a preset time period;
a training module configured to train with the visit record to obtain a probabilistic predictive model; the probability prediction model is used for predicting the click probability of the user on the content;
a prediction module configured to predict a probability that each content in a set of contents will be accessed by a user using the probabilistic prediction model;
a storage module configured to store content in the set of content at a base station based on the probability.
8. An electronic device, comprising a memory and a processor; wherein the content of the first and second substances,
the memory is to store one or more computer instructions, wherein the one or more computer instructions are to be executed by the processor to implement the method of any one of claims 1-7.
9. A computer readable storage medium having computer instructions stored thereon, wherein the computer instructions, when executed by a processor, implement the method of any of claims 1-7.
CN202011125620.XA 2020-10-20 2020-10-20 Content caching method and device in mobile edge network and electronic equipment Active CN112261668B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011125620.XA CN112261668B (en) 2020-10-20 2020-10-20 Content caching method and device in mobile edge network and electronic equipment


Publications (2)

Publication Number Publication Date
CN112261668A true CN112261668A (en) 2021-01-22
CN112261668B CN112261668B (en) 2022-07-19

Family

ID=74245040

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011125620.XA Active CN112261668B (en) 2020-10-20 2020-10-20 Content caching method and device in mobile edge network and electronic equipment

Country Status (1)

Country Link
CN (1) CN112261668B (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112995979A (en) * 2021-03-04 2021-06-18 中国科学院计算技术研究所 Wireless network cache recommendation method for QoE (quality of experience) requirements of user
WO2022213871A1 (en) * 2021-04-06 2022-10-13 华为云计算技术有限公司 Caching apparatus, method and system

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20120271805A1 (en) * 2011-04-19 2012-10-25 Microsoft Corporation Predictively suggesting websites
CN105656997A (en) * 2015-12-25 2016-06-08 中国科学院信息工程研究所 Hotness cache content active pushing method based on mobile user relationship
US9881255B1 (en) * 2014-12-17 2018-01-30 Amazon Technologies, Inc. Model based selection of network resources for which to accelerate delivery
CN108833352A (en) * 2018-05-17 2018-11-16 北京邮电大学 A kind of caching method and system
US20190188594A1 (en) * 2017-12-18 2019-06-20 Microsoft Technology Licensing, Llc Predicting site visit based on intervention
CN110516164A (en) * 2019-07-25 2019-11-29 上海喜马拉雅科技有限公司 A kind of information recommendation method, device, equipment and storage medium
CN110913430A (en) * 2019-12-27 2020-03-24 华中科技大学 Active cooperative caching method and cache management device for files in wireless network
CN111159564A (en) * 2019-12-31 2020-05-15 联想(北京)有限公司 Information recommendation method and device, storage medium and computer equipment
CN111199459A (en) * 2019-12-30 2020-05-26 深圳市盟天科技有限公司 Commodity recommendation method and device, electronic equipment and storage medium
CN111277860A (en) * 2020-01-23 2020-06-12 北京邮电大学 Method, device and equipment for caching video in mobile edge network and readable medium



Also Published As

Publication number Publication date
CN112261668B (en) 2022-07-19

Similar Documents

Publication Publication Date Title
CN110781321B (en) Multimedia content recommendation method and device
CN111861569B (en) Product information recommendation method and device
US9949149B2 (en) Online and distributed optimization framework for wireless analytics
CN110268717A (en) The bit rate of more presentation codes is optimized using statistical information is played
US8560293B2 (en) Enhanced matching through explore/exploit schemes
US20190080019A1 (en) Predicting Non-Observable Parameters for Digital Components
WO2017071251A1 (en) Information pushing method and device
CN111277860B (en) Method, device and equipment for caching video in mobile edge network and readable medium
US20130030907A1 (en) Clustering offers for click-rate optimization
CN112261668B (en) Content caching method and device in mobile edge network and electronic equipment
US20220224990A1 (en) Control apparatus, control method, and program
CN104702592A (en) Method and device for downloading stream media
CN111552835B (en) File recommendation method, device and server
CN111783810A (en) Method and apparatus for determining attribute information of user
CN112182281A (en) Audio recommendation method and device and storage medium
CN113836388A (en) Information recommendation method and device, server and storage medium
CN113822734A (en) Method and apparatus for generating information
CN113220922B (en) Image searching method and device and electronic equipment
CN113761343A (en) Information pushing method and device, terminal equipment and storage medium
CN111241318B (en) Method, device, equipment and storage medium for selecting object to push cover picture
US11961119B2 (en) Archive offer personalization
CN113283115B (en) Image model generation method and device and electronic equipment
CN114647669A (en) Information recall method, device and storage medium
CN115705479A (en) Service execution system and related product
CN117725291A (en) Content pushing method, device, equipment and medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant