CN107370807B - Server based on transparent service platform data access and cache optimization method thereof - Google Patents
Server based on transparent service platform data access and cache optimization method thereof
- Publication number
- CN107370807B CN107370807B CN201710567988.3A CN201710567988A CN107370807B CN 107370807 B CN107370807 B CN 107370807B CN 201710567988 A CN201710567988 A CN 201710567988A CN 107370807 B CN107370807 B CN 107370807B
- Authority
- CN
- China
- Prior art keywords
- access
- data block
- data
- user
- cache
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Links
- 238000000034 method Methods 0.000 title claims abstract description 43
- 238000005457 optimization Methods 0.000 title claims abstract description 17
- 230000006399 behavior Effects 0.000 claims abstract description 73
- 238000009499 smoothing Methods 0.000 claims abstract description 38
- 238000012216 screening Methods 0.000 claims abstract description 4
- 238000005192 partition Methods 0.000 claims description 26
- 238000004364 calculation method Methods 0.000 claims description 16
- 239000000872 buffer Substances 0.000 claims description 11
- 238000011156 evaluation Methods 0.000 claims description 11
- 238000012545 processing Methods 0.000 claims description 3
- 238000005516 engineering process Methods 0.000 abstract description 5
- 230000007246 mechanism Effects 0.000 description 6
- 230000008569 process Effects 0.000 description 6
- 238000012360 testing method Methods 0.000 description 6
- 238000010586 diagram Methods 0.000 description 4
- 238000007726 management method Methods 0.000 description 4
- 238000004458 analytical method Methods 0.000 description 3
- 230000000694 effects Effects 0.000 description 3
- 230000006872 improvement Effects 0.000 description 2
- 238000012986 modification Methods 0.000 description 2
- 230000004048 modification Effects 0.000 description 2
- 238000011160 research Methods 0.000 description 2
- 230000009286 beneficial effect Effects 0.000 description 1
- 230000008901 benefit Effects 0.000 description 1
- 230000000052 comparative effect Effects 0.000 description 1
- 238000013500 data storage Methods 0.000 description 1
- 230000000593 degrading effect Effects 0.000 description 1
- 239000006185 dispersion Substances 0.000 description 1
- 230000003993 interaction Effects 0.000 description 1
- 230000002452 interceptive effect Effects 0.000 description 1
- 238000005259 measurement Methods 0.000 description 1
Images
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/01—Protocols
- H04L67/10—Protocols in which an application is distributed across nodes in the network
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/50—Network services
- H04L67/56—Provisioning of proxy services
- H04L67/568—Storing data temporarily at an intermediate stage, e.g. caching
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/50—Network services
- H04L67/60—Scheduling or organising the servicing of application requests, e.g. requests for application data transmissions using the analysis and optimisation of the required network resources
Landscapes
- Engineering & Computer Science (AREA)
- Computer Networks & Wireless Communication (AREA)
- Signal Processing (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
- Memory System Of A Hierarchy Structure (AREA)
Abstract
The invention relates to computer network technology, and discloses a server based on transparent service platform data access and a cache optimization method thereof, so as to raise the cache hit rate and improve the quality of the transparent computing service. The method comprises the following steps: carrying out frequency statistics, per time interval, on the access behaviors of a large number of end users to the data blocks of the transparent computing server, quantizing the users' data-block access behavior with information entropy, and judging whether the current user access behavior is concentrated; when the user access behavior is judged to be concentrated, screening out the data blocks currently accessed with high frequency, and predicting the access frequency distribution of the screened data blocks over a future period with an exponential smoothing prediction algorithm; and optimizing the cache of the server according to the predicted frequency distribution result.
Description
Technical Field
The invention relates to a computer network technology, in particular to a server based on transparent service platform data access and a cache optimization method thereof.
Background
In recent years, cloud computing has become a typical representative of the network computing mode: computing shifts from a software-and-hardware-oriented mode to a service-oriented mode, so that the storage and computing resources of the server side can be delivered to clients according to end users' needs. Transparent computing is a special case of cloud computing; it is a novel user-centered service mode that aims to provide ubiquitous transparent services. The transparent service platform consists of transparent clients carrying a lightweight microkernel operating system, a transparent network, and a server management platform providing data services. The main functions of the server are to provide transparent computing data access services and transparent resource management services. The essence of transparent service platform data access is therefore that the transparent computing user autonomously controls the process of using services on demand.
In the transparent service platform, a transparent terminal not provided with a hard disk accesses data stored at the server side by means of virtual disk technology, realizing remote loading and running of the terminal operating system. The virtual disk model adopted by the platform has the following characteristics:
(1) A three-layer chained data storage mechanism. The data resources in the virtual disk are divided into 3 types according to their degree of sharing and their properties: system resources with the highest sharing degree, application group resources with the same application attributes, and private data resources accessible only to the individual user.
(2) A redirect-on-write (ROW) mechanism. The system virtual disk image S_VDI and the group virtual disk image G_VDI, which have a high sharing degree, are stored read-only at the server and shared among many end users; an end user's rewritten blocks of S_VDI and G_VDI are saved, through the ROW write-redirection mechanism, in that user's own virtual disk image U_VDI, and a Bitmap marks the position of each rewritten block.
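For illustration, the ROW mechanism with a Bitmap might be realized along the following lines; this is a minimal sketch under assumed names (RowVirtualDisk, userVdi and the block-array layout are hypothetical, not taken from the patent):

```java
import java.util.BitSet;
import java.util.HashMap;
import java.util.Map;

// Sketch of redirect-on-write (ROW): shared images stay read-only, a user's
// rewrites land in the private U_VDI, and a Bitmap marks rewritten positions.
class RowVirtualDisk {
    private final byte[][] sharedImage;                            // read-only S_VDI/G_VDI blocks
    private final Map<Integer, byte[]> userVdi = new HashMap<>();  // U_VDI: this user's rewrites
    private final BitSet rewritten;                                // bit i set => block i rewritten

    RowVirtualDisk(byte[][] sharedImage) {
        this.sharedImage = sharedImage;
        this.rewritten = new BitSet(sharedImage.length);
    }

    byte[] read(int blockNo) {
        // Reads of rewritten blocks are redirected to the user's private image.
        return rewritten.get(blockNo) ? userVdi.get(blockNo) : sharedImage[blockNo];
    }

    void write(int blockNo, byte[] data) {
        userVdi.put(blockNo, data);  // the shared read-only image is never touched
        rewritten.set(blockNo);      // mark the rewritten block's position in the Bitmap
    }
}
```

Reads consult the Bitmap first, so the shared images remain intact while each user's rewrites accumulate in U_VDI.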
Given transparent computing's separation of storage and computation, when network bandwidth is relatively limited and large numbers of heterogeneous transparent terminals must be served, the server becomes the performance bottleneck of the transparent computing system. The cache hit rate is the most critical factor in the performance of a transparent computing server, so designing an efficient cache mechanism at the server to raise the cache hit rate is an extremely important way to improve the quality of the transparent computing service. Current cache optimization approaches for transparent computing mainly include:
(1) adjusting the policy according to the access time and frequency of individual data blocks, in combination with cache replacement strategies such as FIFO, LRU and LFU;
(2) partitioning the terminal's cached data and improving the data query rate with techniques such as indexing;
(3) establishing a local disk cache for the terminal that is updated in real time by synchronizing the local cache with the virtual disk data.
The three-layer storage model adopted by the virtual disk in the transparent service platform alleviates the problem of heavy data redundancy, but in light of the current state of research on transparent computing and its cache optimization, the following service-performance problems remain:
(1) transparent computing emphasizes a user-centered network computing service mode in which all of a user's resources are stored at the server. When serving many clients, large numbers of users accessing the transparent server place a severe load on the network and other service resources;
(2) the three-layer chained storage of the transparent service platform makes its data access distinctive, so the user data access behavior model of the transparent service platform differs from that of other virtual disk storage models, and traditional cache strategies are not very effective for it;
(3) a user's current behavior determines the next access behavior to a certain extent; as the source of data access, the user has an important influence on the cache prefetching strategy, yet research on the access behavior of transparent computing users is currently lacking.
Disclosure of Invention
The invention aims to disclose a server based on transparent service platform data access and a cache optimization method thereof, so as to raise the cache hit rate and improve the quality of the transparent computing service.
In order to achieve the above object, the present invention discloses a cache optimization method for a server based on transparent service platform data access, which comprises:
carrying out frequency statistics, per time interval, on the access behaviors of a large number of end users to the data blocks of the transparent computing server, quantizing the users' data-block access behavior with information entropy, and judging whether the current user access behavior is concentrated;
when the user access behavior is judged to be concentrated, screening out the data blocks currently accessed with high frequency, and predicting the access frequency distribution of the screened data blocks over a future period with an exponential smoothing prediction algorithm;
and optimizing the cache of the server according to the predicted frequency distribution result.
Optionally, the frequency statistics, per time interval, of the access behaviors of a large number of end users to the transparent computing server's data blocks includes:
BS denotes the set of all data blocks at the server accessed by users, and a user behavior UB is represented as a two-tuple <B, T>, where B denotes a data block with B ∈ BS and T denotes the time at which the user initiated the request; <B_i, T_m> indicates that the user accessed data block B_i at time T_m;
the server receives many user requests over a period of time; the behavior set of all users within period T_α is denoted UBS, and the number of times data block B_i is accessed by users within period T_α can be expressed by the following formula:
FB_i = Σ(B_i, T_m), (B_i, T_m) ∈ UBS, T_m ∈ T_α
For this two-tuple model, the invention quantizes the user's data-block access behavior with information entropy and judges whether the current user access behavior is concentrated, specifically as follows:
the probability P(B_i) that data block B_i is accessed within period T_α is calculated by dividing the number of times B_i is accessed by the total number of accesses to all data blocks within the period:
P(B_i) = FB_i / Σ_j FB_j, j = 1, …, n;
the set of access probabilities of the data blocks is denoted P = {P(B_1), P(B_2), …, P(B_n)};
the self-information of an access to data block B_i is expressed as -log2 P(B_i); to measure the overall information of user behavior, the mathematical expectation of the self-information is defined as the average information content of user behavior, also called the information entropy, calculated as:
H(T_α) = -Σ_i P(B_i)·log2 P(B_i), i = 1, …, n;
H(T_α) denotes the entropy value within period T_α; if H(T_α) and the entropy values of the two following consecutive periods are all smaller than a preset threshold, the current user access behavior is judged to be concentrated.
In the invention, the transparent computing server cache is designed as three partitions that respectively store the cache data blocks of the operating system, the applications, and the users' private data. Each cache partition consists of three LRU queues, Q_L, Q_H and Q-history; the Q-history queue stores, and culls according to the LRU rule, the records of cache blocks replaced out of Q_L and Q_H. The access priorities are ordered Q_H > Q_L > Q-history, and access priority is associated with access frequency.
When the server receives a data block request from an end user, it first queries Q_H of the corresponding cache partition and, on a hit, performs the read or write operation; on a miss it queries Q_L and then Q-history in the same way; if the hit occurs in the Q-history queue, the access count of the corresponding data block is set to 1 and the block is moved to the head of Q_L.
Based on this partitioning, optionally, optimizing the cache of the server according to the predicted frequency distribution result includes: prefetching the data blocks whose predicted access frequency satisfies a certain condition and placing them into the corresponding cache partition, specifically:
determining the partition to which a predicted data block belongs and placing it into the corresponding cache partition;
evaluating, from the predicted access frequency value of the data block, whether it should be placed in Q_L or Q_H of the corresponding cache partition;
if the data block is already in the corresponding queue, setting its access count to the evaluation value; if it is not in the queue and the queue is not full, placing it at the head of the corresponding queue and setting its access count to the evaluation value;
if Q_H is full, moving the block at the tail of the queue to the head of Q_L; if Q_L is full, moving the block at the tail of the queue to the head of the Q-history queue; and
after a data block enters the corresponding queue, each access increments its access count by one; when the access count of a block in Q_L reaches a given value, the block is moved to the head of Q_H.
Further, the specific method of evaluating whether a predicted data block should be stored in Q_L or Q_H is as follows:
setting a different weight for each prediction period of the data block, the weight values being obtained by the weight calculation method of the Adaboost algorithm; then weighting the access frequencies of the data block's prediction periods to obtain the final access frequency evaluation value.
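As an illustrative sketch of this evaluation: the Adaboost classifier weight w_t = 0.5·ln((1-e_t)/e_t) is one concrete choice consistent with the Adaboost weight calculation named above; the patent does not reproduce the exact formula, and the normalization step here is an assumption:

```java
// Sketch: weight each prediction period by an Adaboost-style coefficient
// derived from its error rate e_t, then combine the predicted access
// frequencies f_t into a single evaluation value.
class FrequencyEvaluator {
    // Adaboost's classifier weight; assumes 0 < errorRate < 0.5 so weights stay positive.
    static double weight(double errorRate) {
        return 0.5 * Math.log((1.0 - errorRate) / errorRate);
    }

    // predicted = {f_1, f_2, f_3}; errorRates = {e_1, e_2, e_3} per prediction period.
    static double evaluate(double[] predicted, double[] errorRates) {
        double[] w = new double[predicted.length];
        double sum = 0.0;
        for (int t = 0; t < w.length; t++) { w[t] = weight(errorRates[t]); sum += w[t]; }
        double value = 0.0;
        for (int t = 0; t < w.length; t++) value += (w[t] / sum) * predicted[t]; // normalized weights
        return value; // compared against the frequency threshold to choose Q_H or Q_L
    }
}
```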
Preferably, the invention predicts the access frequency distribution of each screened data block over a future period using a cubic (triple) exponential smoothing prediction algorithm, which specifically comprises:
(1) {y_1, y_2, …, y_n} denotes the frequencies with which a single data block was accessed in the previous n history periods. The single exponential smoothing values of all data blocks in FBS over the n periods are first calculated in turn with the formula S'_t = α·y_t + (1-α)·S'_(t-1); the double exponential smoothing values are then calculated in turn with S''_t = α·S'_t + (1-α)·S''_(t-1); and the triple exponential smoothing values with S'''_t = α·S''_t + (1-α)·S'''_(t-1);
wherein FBS denotes the screened set of data blocks, y_t is the true value of period t, α is the smoothing factor, and S'_t, S''_t, S'''_t are the single, double and triple smoothing values of period t;
(2) the cubic exponential smoothing prediction model is ŷ_(t+T) = a_t + b_t·T + c_t·T², where the parameters a_t, b_t, c_t are calculated from the single, double and triple exponential smoothing values:
a_t = 3S'_t - 3S''_t + S'''_t
b_t = (α/(2(1-α)²))·[(6-5α)·S'_t - 2(5-4α)·S''_t + (4-3α)·S'''_t]
c_t = (α²/(2(1-α)²))·(S'_t - 2S''_t + S'''_t)
(3) the model is used to predict the frequencies f_1, f_2, f_3 with which each data block in FBS will be accessed within the next 1, 2 and 3 periods, i.e. f_1 = a_t + b_t + c_t, f_2 = a_t + b_t·2 + c_t·4, f_3 = a_t + b_t·3 + c_t·9;
(4) weights for the predicted values f_1, f_2, f_3 are obtained from the error rate of each period, where w_t denotes the weight of period t and e_t the average error rate of the period-t prediction; the number of times the data block will be accessed is then calculated as the weighted sum W = w_1·f_1 + w_2·f_2 + w_3·f_3, and W, taken as the number of times the data block will be accessed, is an important basis for prefetching and replacement in the cache strategy.
Corresponding to the method, the invention also discloses a server for executing the method.
In conclusion, the invention has the following beneficial effects:
Since transparent computing is a user-centered network computing service mode, its core task is to process the data access requests sent by large numbers of users. The invention takes the analysis of users' access behavior toward server data blocks as its entry point, thus grasping the source of transparent computing demand; it analyzes and predicts access behavior with an information entropy strategy and an exponential smoothing prediction model, designs a reasonable server cache structure and model according to the data-block request characteristics of large numbers of transparent computing users, and finally keeps the data blocks that many users will access in the cache as much as possible, where they are not easily replaced. The cache hit rate is thereby effectively improved, the server's cache mechanism is optimized, and transparent computing service performance is ultimately improved.
The present invention will be described in further detail below with reference to the accompanying drawings.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this application, illustrate embodiments of the invention and, together with the description, serve to explain the invention and not to limit the invention. In the drawings:
FIG. 1 is a schematic diagram illustrating an interaction process between a service management platform and a transparent client according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of a user behavior prediction model architecture disclosed in an embodiment of the present invention;
FIG. 3 is a schematic diagram of a cache structure disclosed in an embodiment of the present invention;
FIG. 4 is a schematic diagram illustrating the effect of the accuracy of the prediction for medium and short periods according to the embodiment of the present invention;
FIG. 5 is a comparison graph of access hit rates for a user, as disclosed in an embodiment of the present invention.
Detailed Description
The embodiments of the invention will be described in detail below with reference to the drawings, but the invention can be implemented in many different ways as defined and covered by the claims.
Example 1
The embodiment discloses a cache optimization method of a server based on transparent service platform data access.
In the transparent service platform, a transparent terminal not equipped with a hard disk accesses data stored at the server side by means of virtual disk technology, realizing remote loading and running of the terminal operating system; fig. 1 shows the interaction process between the service management platform and a transparent client. The request packets a client sends to the server constitute the raw data set of user behaviors, from which the characteristic values representing user behavior are extracted: TYPE, IP, OFFSET, DATA LENGTH, TIME. TYPE is the operation code of a packet, describing requests such as establishing a session, disconnecting a session, reading and writing; there are 6 operation codes in all. IP is the IP of the client sending the packet, used to identify the client. OFFSET describes the starting position of the data block accessed by the user; it is a relative value, and when analyzing user access behavior the OFFSET stands for the data block being accessed. DATA LENGTH is the length of the data requested from the starting position. TIME denotes the time at which the request was initiated.
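For illustration only, the five extracted features could be modeled as a small Java record; the field names and the offset-to-block mapping below are assumptions rather than the patent's definitions:

```java
// The five behavior features extracted from one request packet.
// Field names and the offset-to-block mapping are illustrative assumptions.
record UserRequest(int type,       // TYPE: opcode (establish/disconnect session, read, write, ...)
                   String ip,      // IP: identifies the client that sent the packet
                   long offset,    // OFFSET: relative start position of the accessed block
                   int dataLength, // DATA LENGTH: bytes requested from the start position
                   long time) {    // TIME: when the request was initiated

    // When analyzing access behavior, the offset stands for the accessed data
    // block; one simple mapping divides it by an assumed fixed block size.
    long blockId(int blockSize) {
        return offset / blockSize;
    }
}
```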
In this embodiment, users' request behavior is predicted from a macroscopic perspective: the overall characteristics of user behavior are obtained by analyzing the set of all user behaviors, and fig. 2 shows the architecture of the user behavior prediction model. Because the transparent service platform adopts a three-layer chained storage structure, the degree of resource sharing is high, mainly embodied in the sharing of operating system images and of the applications users run. In this resource-shared storage mode, different users' accesses to data blocks are highly similar; that is, user behavior under the transparent service platform is concentrated and similar. Information entropy is used to judge the concentration or dispersion of users' access behavior. On this basis, the embodiment proposes TCSC (Transparent Computing Server Cache), a cache policy based on prediction of user access behavior.
The information statistics period T_α is determined according to the actual application scenario; within period T_α, the steps for analyzing and predicting user access behavior are as follows:
(1) BS (Block Set) denotes the set of all data blocks at the server accessed by users, and a user behavior UB (User Behavior) is represented as a two-tuple <B, T>, where B denotes a data block with B ∈ BS and T denotes the time at which the user initiated the request. Thus <B_i, T_m> indicates that the user accessed data block B_i at time T_m. The server receives many user requests over a period of time; the behavior set of all users within period T_α is denoted UBS (User Behavior Set). To analyze users' behavior characteristics, the number of times data block B_i is accessed within period T_α is expressed as:
FB_i = Σ(B_i, T_m), (B_i, T_m) ∈ UBS, T_m ∈ T_α
(2) The probability P(B_i) that data block B_i is accessed within period T_α is calculated by dividing the number of times B_i is accessed by the total number of accesses to all data blocks within the period:
P(B_i) = FB_i / Σ_j FB_j, j = 1, …, n
The set of access probabilities of the data blocks is denoted P = {P(B_1), P(B_2), …, P(B_n)}.
(3) The self-information of an access to data block B_i is expressed as -log2 P(B_i). To measure the overall information of user behavior, the mathematical expectation of the self-information is defined as the average information content of user behavior, also called the information entropy:
H(T_α) = -Σ_i P(B_i)·log2 P(B_i), i = 1, …, n
H(T_α) denotes the entropy value within period T_α; if H(T_α) and the entropy values of the two following consecutive periods are all smaller than the preset threshold, the current user access behavior is concentrated. FBS (Frequent Block Set) denotes the set of data blocks currently accessed with higher frequency; the number of times the data blocks in FBS will be accessed must then be predicted.
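Steps (2) and (3) can be sketched in Java as follows, assuming the per-block access counts FB_i of the period have already been aggregated into a map (class and method names are illustrative):

```java
import java.util.Map;

// Sketch of steps (2)-(3): compute P(B_i) from the per-period access counts
// FB_i, then the entropy H(T_alpha) of user behavior in that period.
class EntropyMonitor {
    static double entropy(Map<Long, Integer> freq) {   // freq: block id -> FB_i
        if (freq.isEmpty()) return 0.0;
        double total = freq.values().stream().mapToInt(Integer::intValue).sum();
        double h = 0.0;
        for (int f : freq.values()) {
            double p = f / total;                      // P(B_i)
            h -= p * (Math.log(p) / Math.log(2));      // -sum P(B_i) * log2 P(B_i)
        }
        return h;
    }

    // Concentrated if the entropy of this period and of the two following
    // consecutive periods all stay below the preset threshold.
    static boolean concentrated(double[] lastThreeEntropies, double threshold) {
        for (double h : lastThreeEntropies) {
            if (h >= threshold) return false;
        }
        return true;
    }
}
```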
In the transparent server's three-layer chained storage model of operating system, applications, and user private data, system resources and application resources are shared to a great extent. The clients therefore access these two layers of blocks repeatedly and intensively, and only with small probability modify the same block: the requests are essentially read-only. Since the sharing degree and access concentration of the three levels differ, their data are distinguished in the cache design: the system cache is divided into three parts Co, Ca and Cu, which respectively represent the system resource cache, the application cache and the user data cache.
Each of the Co, Ca and Cu caches consists of three LRU queues. Q_L stores data blocks with lower access priority and Q_H stores data blocks with higher access priority, access priority being related to access frequency. The Q-history queue stores the records of blocks replaced out of Q_L; such blocks are not removed outright but are kept temporarily in this queue. Each data block in the cache has the attributes count, lastTime and bData, where count is the number of times the block has been accessed, lastTime is the time it was last accessed, and bData is the actual stored content of the block. The detailed implementation is shown in fig. 3.
(1) When looking up a block in the cache, the search starts from the head of the queue with higher access priority. If the block exists in a cache queue, its count is incremented by 1 and the access time is recorded in lastTime; if the block is not found in any cache queue, it is fetched from disk and placed at the head of Q_L.
(2) On a hit in the Q_H queue, it is judged whether the interval between this access and the last access reaches the time threshold; if so, the block's count is set to 1 and the block is placed at the head of Q_L; otherwise the block is moved to the head of the queue according to the LRU rule. When Q_H is full, the data at the tail of the queue is demoted to the head of Q_L.
(3) On a hit in the Q_L queue, it is judged whether the block's count reaches the frequency threshold; if so, the block is placed at the head of Q_H, otherwise it is moved to the head of Q_L. When Q_L is full, the data at the tail of the queue is evicted to the head of the Q-history queue.
(4) On a hit in the Q-history queue, the block's count is set to 1 and the block is placed at the head of Q_L. The Q-history queue has a fixed length; when it exceeds this length, blocks are moved out from the tail according to the LRU policy, at which point they leave the cache entirely.
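A condensed Java sketch of rules (1) to (4) for a single cache partition follows; the capacities, thresholds and disk-fetch path are simplified assumptions, not the patent's implementation:

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

// Sketch of one cache partition (Co, Ca or Cu) with the three LRU queues
// described above; queue heads are at the front of each deque.
class TcscPartition {
    static final class Block { long id; int count; long lastTime; byte[] bData; }

    private final Deque<Block> qH = new ArrayDeque<>(), qL = new ArrayDeque<>(),
                               qHist = new ArrayDeque<>();
    private final Map<Long, Block> index = new HashMap<>();  // block id -> block
    private final int capH, capL, capHist, freqThreshold;
    private final long timeThreshold;

    TcscPartition(int capH, int capL, int capHist, int freqThreshold, long timeThreshold) {
        this.capH = capH; this.capL = capL; this.capHist = capHist;
        this.freqThreshold = freqThreshold; this.timeThreshold = timeThreshold;
    }

    Block access(long id, long now) {
        Block b = index.get(id);
        if (b == null) {                       // rule (1) miss: fetch from disk into Q_L head
            b = new Block(); b.id = id; b.count = 1;
            index.put(id, b); pushHead(qL, b);
        } else if (qH.remove(b)) {             // rule (2): hit in Q_H
            if (now - b.lastTime >= timeThreshold) { b.count = 1; pushHead(qL, b); }
            else { b.count++; pushHead(qH, b); }
        } else if (qL.remove(b)) {             // rule (3): hit in Q_L
            b.count++;
            if (b.count >= freqThreshold) pushHead(qH, b); else pushHead(qL, b);
        } else if (qHist.remove(b)) {          // rule (4): hit in Q-history
            b.count = 1; pushHead(qL, b);
        }
        b.lastTime = now;                      // record the access time
        return b;
    }

    private void pushHead(Deque<Block> q, Block b) {
        q.addFirst(b);                         // cascade overflow: Q_H -> Q_L -> Q-history -> out
        if (q == qH && qH.size() > capH) pushHead(qL, qH.removeLast());
        else if (q == qL && qL.size() > capL) pushHead(qHist, qL.removeLast());
        else if (q == qHist && qHist.size() > capHist) index.remove(qHist.removeLast().id);
    }
}
```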
When monitoring of H(T_α) shows that user access behavior is in a continuously concentrated state, the data blocks in FBS are taken as the objects of analysis, and exponential smoothing is used to predict how those blocks will be accessed in future periods, as the basis of the cache prefetching strategy.
The exponential smoothing prediction takes the historical data of several periods as observations and prefetches the corresponding data blocks. Assuming the historical data of the previous n periods are used to predict future access behavior, the specific steps for predicting user access behavior and prefetching data blocks are as follows:
(1) {y_1, y_2, …, y_n} denotes the frequencies with which a single data block was accessed in the previous n history periods. The single exponential smoothing values of all data blocks in FBS over the n periods are first calculated in turn with the formula S'_t = α·y_t + (1-α)·S'_(t-1); the double exponential smoothing values are then calculated in turn with S''_t = α·S'_t + (1-α)·S''_(t-1); and the triple exponential smoothing values with S'''_t = α·S''_t + (1-α)·S'''_(t-1);
wherein y_t is the true value of period t, α is the smoothing factor, and S'_t, S''_t, S'''_t are the single, double and triple smoothing values of period t.
(2) The cubic exponential smoothing prediction model is ŷ_(t+T) = a_t + b_t·T + c_t·T², where the parameters a_t, b_t, c_t are calculated from the single, double and triple exponential smoothing values:
a_t = 3S'_t - 3S''_t + S'''_t
b_t = (α/(2(1-α)²))·[(6-5α)·S'_t - 2(5-4α)·S''_t + (4-3α)·S'''_t]
c_t = (α²/(2(1-α)²))·(S'_t - 2S''_t + S'''_t)
(3) The model is used to predict the frequencies f_1, f_2, f_3 with which each data block in FBS will be accessed within the next 1, 2 and 3 periods, i.e. f_1 = a_t + b_t + c_t, f_2 = a_t + b_t·2 + c_t·4, f_3 = a_t + b_t·3 + c_t·9.
(4) Weights for the predicted values f_1, f_2, f_3 are obtained from the error rate of each period, where w_t denotes the weight of period t and e_t the average error rate of the period-t prediction; the number of times the data block will be accessed is then calculated as the weighted sum W = w_1·f_1 + w_2·f_2 + w_3·f_3, and W, taken as the number of times the data block will be accessed, is an important basis for prefetching and replacement in the cache strategy.
(5) The data blocks in FBS are divided according to the predicted access frequency and the cache's frequency threshold: a block whose predicted value reaches the threshold is placed in Q_H, and otherwise in Q_L.
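Steps (1) to (3) can be sketched compactly in Java using Brown's triple-smoothing parameter formulas, which agree with the expansions f_1 = a_t + b_t + c_t, f_2 = a_t + 2b_t + 4c_t, f_3 = a_t + 3b_t + 9c_t above; seeding the smoothed series with y_1 is an assumed initialization:

```java
// Sketch of steps (1)-(3): Brown's triple (cubic) exponential smoothing over a
// block's access counts y_1..y_n, then the forecasts f_1..f_3; requires 0 < alpha < 1.
class CubicSmoothingPredictor {
    static double[] predict(double[] y, double alpha) {
        double s1 = y[0], s2 = y[0], s3 = y[0];   // single/double/triple smoothed values
        for (double yt : y) {
            s1 = alpha * yt + (1 - alpha) * s1;   // S'_t
            s2 = alpha * s1 + (1 - alpha) * s2;   // S''_t
            s3 = alpha * s2 + (1 - alpha) * s3;   // S'''_t
        }
        double a = 3 * s1 - 3 * s2 + s3;
        double b = alpha / (2 * Math.pow(1 - alpha, 2))
                 * ((6 - 5 * alpha) * s1 - 2 * (5 - 4 * alpha) * s2 + (4 - 3 * alpha) * s3);
        double c = Math.pow(alpha, 2) / (2 * Math.pow(1 - alpha, 2)) * (s1 - 2 * s2 + s3);
        double[] f = new double[3];
        for (int t = 1; t <= 3; t++) f[t - 1] = a + b * t + c * t * t;  // y_hat(n + t)
        return f;
    }
}
```

The three forecasts would then be combined via the weighted sum of step (4), for example with FrequencyEvaluator-style weights as sketched earlier.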
To test the effectiveness of the prediction method in this embodiment, the accuracy of the prediction algorithm over short and medium terms is measured by comparing the errors between predicted and observed values. Data blocks accessed on the transparent service platform over a period of time were randomly sampled as a comparison set; the first 10 periods served as observations, and periods 11 to 18 of all data blocks were predicted. Fig. 4 shows the test results for 1 to 8 future periods, with average error as the metric. The results show that when predicting 1 to 3 periods ahead, the average errors are respectively 0.07, 0.12 and 0.19, and the error increases sharply from the fourth period. The cubic exponential smoothing prediction method can therefore predict data block accesses accurately over three periods.
To further test the effect of the cache strategy of the invention, the TCSC, LRU, LFU and LFRU algorithms were implemented in Java, and the cache hit rates under the different strategies were compared. In the test, the access logs of 35 users freely operating transparent computing terminals for 90 minutes were first collected, and the corresponding cache hit rates were then recorded by replaying the users' accesses.
The test involves 2134258 access records; fig. 5 compares the hit rates of the different cache replacement strategies with cache sizes of 4M, 8M, 16M, 32M and 64M. The graph shows that the TCSC method improves the hit rate significantly, and the advantage is more pronounced when the cache capacity is smaller.
Example 2
Corresponding to the above method embodiments, the present embodiment discloses a server for executing the above method.
Referring to embodiment 1, the cache optimization method based on transparent service platform data access executed by the server of this embodiment comprises:
carrying out frequency statistics, per time interval, on the access behaviors of a large number of end users to the data blocks of the transparent computing server, quantizing the users' data-block access behavior with information entropy, and judging whether the current user access behavior is concentrated;
when the user access behavior is judged to be concentrated, screening out the data blocks currently accessed with high frequency, and predicting the access frequency distribution of the screened data blocks over a future period with an exponential smoothing prediction algorithm;
and optimizing the cache of the server according to the predicted frequency distribution result.
Optionally, the frequency statistics, per time interval, of the access behaviors of a large number of end users to the transparent computing server's data blocks includes:
BS denotes the set of all data blocks at the server accessed by users, and a user behavior UB is represented as a two-tuple <B, T>, where B denotes a data block with B ∈ BS and T denotes the time at which the user initiated the request; <B_i, T_m> indicates that the user accessed data block B_i at time T_m;
the server receives many user requests over a period of time; the behavior set of all users within period T_α is denoted UBS, and the number of times data block B_i is accessed by users within period T_α can be expressed by the following formula:
FB_i = Σ(B_i, T_m), (B_i, T_m) ∈ UBS, T_m ∈ T_α
For this two-tuple model, the invention quantizes the user's data-block access behavior with information entropy and judges whether the current user access behavior is concentrated, specifically as follows:
the probability P(B_i) that data block B_i is accessed within period T_α is calculated by dividing the number of times B_i is accessed by the total number of accesses to all data blocks within the period:
P(B_i) = FB_i / Σ_j FB_j, j = 1, …, n;
the set of access probabilities of the data blocks is denoted P = {P(B_1), P(B_2), …, P(B_n)};
the self-information of an access to data block B_i is expressed as -log2 P(B_i); to measure the overall information of user behavior, the mathematical expectation of the self-information is defined as the average information content of user behavior, also called the information entropy, calculated as:
H(T_α) = -Σ_i P(B_i)·log2 P(B_i), i = 1, …, n;
H(T_α) denotes the entropy value within period T_α; if H(T_α) and the entropy values of the two following consecutive periods are all smaller than a preset threshold, the current user access behavior is judged to be concentrated.
In the invention, the transparent computing server cache is designed as three partitions that respectively store the cache data blocks of the operating system, the applications, and the users' private data. Each cache partition consists of three LRU queues, Q_L, Q_H and Q-history; the Q-history queue stores, and culls according to the LRU rule, the records of cache blocks replaced out of Q_L and Q_H. The access priorities are ordered Q_H > Q_L > Q-history, and access priority is associated with access frequency.
When the server receives a data block request from an end user, it first queries Q_H of the corresponding cache partition and, on a hit, performs the read or write operation; on a miss it queries Q_L and then Q-history in the same way; if the hit occurs in the Q-history queue, the access count of the corresponding data block is set to 1 and the block is moved to the head of Q_L.
Based on this partitioning, optionally, optimizing the cache of the server according to the predicted frequency distribution result includes: prefetching the data blocks whose predicted access frequency satisfies a certain condition and placing them into the corresponding cache partition, specifically:
determining the partition to which a predicted data block belongs and placing it into the corresponding cache partition;
evaluating, from the predicted access frequency value of the data block, whether it should be placed in Q_L or Q_H of the corresponding cache partition;
if the data block is already in the corresponding queue, setting its access count to the evaluation value; if it is not in the queue and the queue is not full, placing it at the head of the corresponding queue and setting its access count to the evaluation value;
if Q_H is full, moving the block at the tail of the queue to the head of Q_L; if Q_L is full, moving the block at the tail of the queue to the head of the Q-history queue; and
after a data block enters the corresponding queue, each access increments its access count by one; when the access count of a block in Q_L reaches a given value, the block is moved to the head of Q_H.
Further, the specific method of evaluating whether a predicted data block should be stored in Q_L or Q_H is as follows:
setting a different weight for each prediction period of the data block, the weight values being obtained by the weight calculation method of the Adaboost algorithm; then weighting the access frequencies of the data block's prediction periods to obtain the final access frequency evaluation value.
Preferably, the invention predicts the access frequency distribution of each screened data block over a future period using a cubic (triple) exponential smoothing prediction algorithm, which specifically comprises:
(1) {y_1, y_2, …, y_n} denotes the frequencies with which a single data block was accessed in the previous n history periods. The single exponential smoothing values of all data blocks in FBS over the n periods are first calculated in turn with the formula S'_t = α·y_t + (1-α)·S'_(t-1); the double exponential smoothing values are then calculated in turn with S''_t = α·S'_t + (1-α)·S''_(t-1); and the triple exponential smoothing values with S'''_t = α·S''_t + (1-α)·S'''_(t-1);
wherein FBS denotes the screened set of data blocks, y_t is the true value of period t, α is the smoothing factor, and S'_t, S''_t, S'''_t are the single, double and triple smoothing values of period t;
(2) the cubic exponential smoothing prediction model is ŷ_(t+T) = a_t + b_t·T + c_t·T², where the parameters a_t, b_t, c_t are calculated from the single, double and triple exponential smoothing values:
a_t = 3S'_t - 3S''_t + S'''_t
b_t = (α/(2(1-α)²))·[(6-5α)·S'_t - 2(5-4α)·S''_t + (4-3α)·S'''_t]
c_t = (α²/(2(1-α)²))·(S'_t - 2S''_t + S'''_t)
(3) the model is used to predict the frequencies f_1, f_2, f_3 with which each data block in FBS will be accessed within the next 1, 2 and 3 periods, i.e. f_1 = a_t + b_t + c_t, f_2 = a_t + b_t·2 + c_t·4, f_3 = a_t + b_t·3 + c_t·9;
(4) weights for the predicted values f_1, f_2, f_3 are obtained from the error rate of each period, where w_t denotes the weight of period t and e_t the average error rate of the period-t prediction; the number of times the data block will be accessed is then calculated as the weighted sum W = w_1·f_1 + w_2·f_2 + w_3·f_3, and W, taken as the number of times the data block will be accessed, is an important basis for prefetching and replacement in the cache strategy.
To sum up, in the server based on transparent service platform data access and its cache optimization method disclosed in the embodiments of the invention: since transparent computing is a user-centered network computing service mode, the core task is to process the data access requests sent by large numbers of users. The method takes the analysis of users' access behavior toward server data blocks as its entry point, thus grasping the source of transparent computing demand; it analyzes and predicts access behavior with an information entropy strategy and an exponential smoothing prediction model, designs a reasonable server cache structure and model according to the data-block request characteristics of large numbers of transparent computing users, and finally keeps the data blocks that many users will access in the cache as much as possible, where they are not easily replaced. The cache hit rate is thereby effectively improved, the server's cache mechanism is optimized, and transparent computing service performance is ultimately improved.
The above description is only a preferred embodiment of the present invention and is not intended to limit it; various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement or improvement made within the spirit and principles of the present invention shall fall within its scope of protection.
Claims (8)
1. A cache optimization method of a server based on transparent service platform data access is characterized by comprising the following steps:
carrying out frequency statistics, per time interval, on the access behaviors of a large number of end users to the data blocks of the transparent computing server, quantizing the users' data-block access behavior with information entropy, and judging whether the current user access behavior is concentrated;
when the user access behavior is judged to be concentrated, screening out the data blocks currently accessed with high frequency, and predicting the access frequency distribution of the screened data blocks over a future period with an exponential smoothing prediction algorithm;
optimizing the cache of the server according to the predicted frequency distribution result, comprising: prefetching the data blocks whose predicted access frequency satisfies a certain condition and placing them into the corresponding cache partitions;
wherein the transparent computing server cache is designed as three partitions that respectively store the cache data blocks of the operating system, the applications and the users' private data; each cache partition consists of three LRU queues, Q_L, Q_H and Q-history, and the Q-history queue stores, and culls according to the LRU rule, the records of cache blocks replaced out of Q_L and Q_H; the access priorities are ordered Q_H > Q_L > Q-history, and access priority is associated with access frequency;
when the server receives a data block request from an end user, it first queries Q_H of the corresponding cache partition and, on a hit, performs the read or write operation; on a miss it queries Q_L and then Q-history in the same way; if the hit occurs in the Q-history queue, the access count of the corresponding data block is set to 1 and the block is moved to the head of Q_L.
2. The cache optimization method for the server based on transparent service platform data access as claimed in claim 1, wherein the frequency statistics, per time interval, of the access behaviors of a large number of end users to the transparent computing server's data blocks comprises:
BS denotes the set of all data blocks at the server accessed by users, and a user behavior UB is represented as a two-tuple <B, T>, where B denotes a data block with B ∈ BS and T denotes the time at which the user initiated the request; <B_i, T_m> indicates that the user accessed data block B_i at time T_m;
the server receives many user requests over a period of time; the behavior set of all users within period T_α is denoted UBS, and the number of times data block B_i is accessed by users within period T_α is expressed by the following formula:
FB_i = Σ(B_i, T_m), (B_i, T_m) ∈ UBS, T_m ∈ T_α
3. The cache optimization method for the server based on transparent service platform data access according to claim 2, wherein quantizing the user's data-block access behavior with information entropy and judging whether the current user access behavior is concentrated comprises:
calculating the probability P(B_i) that data block B_i is accessed within period T_α by dividing the number of times B_i is accessed by the total number of accesses to all data blocks within the period:
P(B_i) = FB_i / Σ_j FB_j, j = 1, …, n;
the set of access probabilities of the data blocks being denoted P = {P(B_1), P(B_2), …, P(B_n)};
expressing the self-information of an access to data block B_i as -log2 P(B_i); to measure the overall information of user behavior, the mathematical expectation of the self-information is defined as the average information content of user behavior, also called the information entropy, calculated as:
H(T_α) = -Σ_i P(B_i)·log2 P(B_i), i = 1, …, n;
H(T_α) denotes the entropy value within period T_α; if H(T_α) and the entropy values of the two following consecutive periods are all smaller than a preset threshold, the current user access behavior is judged to be concentrated.
4. The cache optimization method for the server based on transparent service platform data access according to claim 1, wherein prefetching the data blocks whose predicted access frequency satisfies a certain condition and placing them into the corresponding cache partition specifically comprises:
determining the partition to which a predicted data block belongs and placing it into the corresponding cache partition;
evaluating, from the predicted access frequency value of the data block, whether it should be placed in Q_L or Q_H of the corresponding cache partition;
if the data block is already in the corresponding queue, setting its access count to the evaluation value; if it is not in the queue and the queue is not full, placing it at the head of the corresponding queue and setting its access count to the evaluation value;
if Q_H is full, moving the block at the tail of the queue to the head of Q_L; if Q_L is full, moving the block at the tail of the queue to the head of the Q-history queue; and
after a data block enters the corresponding queue, each access increments its access count by one; when the access count of a block in Q_L reaches a given value, the block is moved to the head of Q_H.
5. The cache optimization method for the server based on transparent service platform data access according to claim 4, wherein the specific method of evaluating whether a predicted data block should be stored in Q_L or Q_H is:
setting a different weight for each prediction period of the data block, the weight values being obtained by the weight calculation method of the Adaboost algorithm; then weighting the access frequencies of the data block's prediction periods to obtain the final access frequency evaluation value.
6. The method for optimizing the cache of the service end based on the data access of the transparent service platform according to any one of claims 1 to 3, wherein a cubic exponential smoothing prediction algorithm is used to predict the access frequency distribution of each screened data block in a future period.
7. The cache optimization method for the server based on the transparent service platform data access according to claim 6, wherein the cubic exponential smoothing prediction algorithm specifically comprises:
(1) {y_1, y_2, …, y_n} denotes the frequencies with which a single data block was accessed in the previous n history periods; the single exponential smoothing values of all data blocks in FBS over the n periods are first calculated in turn with the formula S'_t = α·y_t + (1-α)·S'_(t-1); the double exponential smoothing values are then calculated in turn with S''_t = α·S'_t + (1-α)·S''_(t-1); and the triple exponential smoothing values with S'''_t = α·S''_t + (1-α)·S'''_(t-1);
wherein FBS denotes the screened set of data blocks, y_t is the true value of period t, α is the smoothing factor, and S'_t, S''_t, S'''_t are the single, double and triple smoothing values of period t;
(2) the cubic exponential smoothing prediction model is ŷ_(t+T) = a_t + b_t·T + c_t·T², where the parameters a_t, b_t, c_t are calculated from the single, double and triple exponential smoothing values:
a_t = 3S'_t - 3S''_t + S'''_t
b_t = (α/(2(1-α)²))·[(6-5α)·S'_t - 2(5-4α)·S''_t + (4-3α)·S'''_t]
c_t = (α²/(2(1-α)²))·(S'_t - 2S''_t + S'''_t)
(3) the model is used to predict the frequencies f_1, f_2, f_3 with which each data block in FBS will be accessed within the next 1, 2 and 3 periods, i.e. f_1 = a_t + b_t + c_t, f_2 = a_t + b_t·2 + c_t·4, f_3 = a_t + b_t·3 + c_t·9;
(4) weights for the predicted values f_1, f_2, f_3 are obtained from the error rate of each period, where w_t denotes the weight of period t and e_t the average error rate of the period-t prediction; the number of times the data block will be accessed is then calculated as the weighted sum W = w_1·f_1 + w_2·f_2 + w_3·f_3, and W, taken as the number of times the data block will be accessed, is an important basis for prefetching and replacement in the cache strategy.
8. A server for performing the method of any one of claims 1 to 7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710567988.3A CN107370807B (en) | 2017-07-12 | 2017-07-12 | Server based on transparent service platform data access and cache optimization method thereof |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710567988.3A CN107370807B (en) | 2017-07-12 | 2017-07-12 | Server based on transparent service platform data access and cache optimization method thereof |
Publications (2)
Publication Number | Publication Date |
---|---|
CN107370807A CN107370807A (en) | 2017-11-21 |
CN107370807B true CN107370807B (en) | 2020-05-08 |
Family
ID=60306819
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710567988.3A Active CN107370807B (en) | 2017-07-12 | 2017-07-12 | Server based on transparent service platform data access and cache optimization method thereof |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107370807B (en) |
Families Citing this family (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108418877A (en) * | 2018-02-22 | 2018-08-17 | 上海思华科技股份有限公司 | Polymorphic type storage scheduling plug-flow method, system, data-updating method |
CN111966887B (en) * | 2019-05-20 | 2024-05-17 | 北京沃东天骏信息技术有限公司 | Dynamic caching method and device, electronic equipment and storage medium |
CN112187670B (en) * | 2020-08-21 | 2022-08-05 | 西安电子科技大学 | Networked software shared resource allocation method and device based on group intelligence |
CN112733060B (en) * | 2021-01-13 | 2023-12-01 | 中南大学 | Cache replacement method and device based on session cluster prediction and computer equipment |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR20130008481A (en) * | 2011-07-12 | 2013-01-22 | 경희대학교 산학협력단 | Method for inter prediction and apparatus thereof |
CN104601604A (en) * | 2014-06-12 | 2015-05-06 | 国家电网公司 | Network security situation analyzing method |
CN106453495A (en) * | 2016-08-31 | 2017-02-22 | 北京邮电大学 | Information centric networking caching method based on content popularity prediction |
- 2017-07-12: application CN201710567988.3A filed; granted as patent CN107370807B (Active)
Patent Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR20130008481A (en) * | 2011-07-12 | 2013-01-22 | 경희대학교 산학협력단 | Method for inter prediction and apparatus thereof |
CN104601604A (en) * | 2014-06-12 | 2015-05-06 | 国家电网公司 | Network security situation analyzing method |
CN106453495A (en) * | 2016-08-31 | 2017-02-22 | 北京邮电大学 | Information centric networking caching method based on content popularity prediction |
Non-Patent Citations (1)
Title |
---|
"基于特征熵的异常流识别技术";许倩;《计算机科学》;20121215;第39页左栏第9行-右栏第21行 * |
Also Published As
Publication number | Publication date |
---|---|
CN107370807A (en) | 2017-11-21 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN107370807B (en) | Server based on transparent service platform data access and cache optimization method thereof | |
Zhong et al. | A deep reinforcement learning-based framework for content caching | |
US12130811B2 (en) | Task-execution planning using machine learning | |
CN109982104B (en) | Motion-aware video prefetching and cache replacement decision method in motion edge calculation | |
CN110287010B (en) | Cache data prefetching method oriented to Spark time window data analysis | |
CN106453495A (en) | Information centric networking caching method based on content popularity prediction | |
CN109634746B (en) | Web cluster cache utilization system and optimization method | |
CN111858469B (en) | Self-adaptive hierarchical storage method based on time sliding window | |
US6907607B1 (en) | System and method for analyzing capacity in a plurality of processing systems | |
CN117194502B (en) | Database content cache replacement method based on long-term and short-term memory network | |
CN108334460A (en) | data cache method and device | |
CN107566535B (en) | Self-adaptive load balancing method based on concurrent access timing sequence rule of Web map service | |
CN115168411A (en) | Cache device, method and system | |
Einziger et al. | Lightweight robust size aware cache management | |
CN117997902A (en) | Cloud edge collaboration-based data distribution method and system | |
CN116680276A (en) | Data tag storage management method, device, equipment and storage medium | |
EP3274844B1 (en) | Hierarchical cost based caching for online media | |
CN114253458A (en) | Method, device and equipment for processing page fault exception of memory and storage medium | |
CN109189696B (en) | SSD (solid State disk) caching system and caching method | |
Baskaran et al. | Study of combined Web prefetching with Web caching based on machine learning technique | |
Cegan | Intelligent preloading of websites resources based on clustering web user sessions | |
Liu et al. | Cache replacement strategy based on user behaviour analysis for a massive small file storage system | |
CN115017435B (en) | Method and device for determining cache resources, nonvolatile storage medium and processor | |
Lee et al. | A proactive request distribution (prord) using web log mining in a cluster-based web server | |
CN110933119B (en) | Method and equipment for updating cache content |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication ||
SE01 | Entry into force of request for substantive examination ||
GR01 | Patent grant ||