CN117971503B - Data caching method and system based on edge calculation - Google Patents

Data caching method and system based on edge calculation

Info

Publication number
CN117971503B
Authority
CN
China
Prior art keywords
data
user
cached
cache
edge
Prior art date
Legal status
Active
Application number
CN202410372649.XA
Other languages
Chinese (zh)
Other versions
CN117971503A (en)
Inventor
许磊
许洁
毛骜鹏
Current Assignee
Hangzhou Yuanshi Technology Co ltd
Original Assignee
Hangzhou Yuanshi Technology Co ltd
Priority date
Filing date
Publication date
Application filed by Hangzhou Yuanshi Technology Co ltd filed Critical Hangzhou Yuanshi Technology Co ltd
Priority to CN202410372649.XA
Publication of CN117971503A
Application granted
Publication of CN117971503B
Active legal-status Current
Anticipated expiration legal-status

Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention relates to the technical field of data caching, in particular to a data caching method and system based on edge calculation, wherein the method comprises the following steps: obtaining cache data information of user edge equipment; acquiring the vocabulary in each piece of cache data; analyzing the occurrence frequency of the vocabulary in the cache data and the importance degree of the vocabulary, constructing a keyword characteristic value and acquiring keywords; constructing a user behavior retrieval depth from the degree to which the keywords of the cache data are accessed over a plurality of access times; acquiring a cold and hot attribute adjustment coefficient from the association characteristics of the data to be cached and the cached data; constructing a cold and hot coefficient; and selecting edge equipment for the data to be cached based on the cold and hot coefficients of different users, thereby completing data caching based on edge calculation. The invention aims to predict the data to be cached for a user according to the user's historical access patterns and behavior, so as to respond to user requests more quickly and reduce data access delay.

Description

Data caching method and system based on edge calculation
Technical Field
The invention relates to the technical field of data caching, in particular to a data caching method and system based on edge calculation.
Background
A data cache refers to a high-speed memory, for example in a hard disk, used to store temporarily unused data so that it can be read quickly later. Data caching based on edge calculation deploys the cache on edge equipment and stores data at a position closer to the user, reducing data transmission delay; data access can thus be accelerated and the user experience improved. However, if data that the user rarely uses is cached on the edge device, the user's read/write burden is invisibly increased, the content transmission speed is reduced and network congestion grows, which greatly degrades the user experience.
The traditional LFU algorithm implements a cache data elimination mechanism according to the access frequency of data, but it suits application scenarios in which the access frequency is relatively fixed; compared with the relatively fixed access patterns of enterprises and similar services, its cache hit rate for individual users is lower. The traditional LRU algorithm adds a temporal dimension to the eviction decision, but it requires recording an access timestamp for each data item, which increases the computational complexity.
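For reference, a minimal sketch of the two classical eviction policies mentioned above is given below, assuming a fixed-capacity in-memory cache; the class and method names are illustrative and are not part of the invention.

```python
from collections import OrderedDict, Counter

class LRUCache:
    """Least Recently Used: evict the entry whose last access is oldest."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.items = OrderedDict()

    def get(self, key):
        if key not in self.items:
            return None
        self.items.move_to_end(key)          # mark as most recently used
        return self.items[key]

    def put(self, key, value):
        if key in self.items:
            self.items.move_to_end(key)
        self.items[key] = value
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)   # evict the least recently used entry

class LFUCache:
    """Least Frequently Used: evict the entry with the lowest access count."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.items = {}
        self.freq = Counter()

    def get(self, key):
        if key not in self.items:
            return None
        self.freq[key] += 1
        return self.items[key]

    def put(self, key, value):
        if key not in self.items and len(self.items) >= self.capacity:
            victim, _ = min(self.freq.items(), key=lambda kv: kv[1])
            del self.items[victim], self.freq[victim]
        self.items[key] = value
        self.freq[key] += 1
```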
Disclosure of Invention
In order to solve the technical problems, the invention aims to provide a data caching method and system based on edge calculation, and the adopted technical scheme is as follows:
In a first aspect, an embodiment of the present invention provides a data caching method based on edge computation, where the method includes the following steps:
obtaining cache data information of user edge equipment in a historical time period, wherein the cache data information comprises a cache path vector, cache delay data and cache data;
Processing the cache data of the user edge equipment in the historical time period to obtain vocabulary in each cache data; encoding each vocabulary in the cache data to obtain word vectors of each vocabulary; clustering each vocabulary based on the word vector to obtain each cluster; acquiring key word characteristic values of each cluster according to the distribution of elements in each cluster and the importance degree of each vocabulary in the cache data; acquiring keywords of each cache data in a historical time period according to the keyword characteristic values of each cluster; according to the difference degree of the keywords of each cache data in the historical time period, combining the cache path vector of the user edge device and the corresponding path cache delay data to obtain the user behavior retrieval depth of the user in the historical time period; acquiring a cold and hot attribute adjustment coefficient of the data to be cached at the current moment according to the access frequency and the access time delay of the data to be cached at the current moment; acquiring the cold and hot coefficients of the current data to be cached according to the user behavior retrieval depth of the user in the historical time period and the cold and hot attribute adjustment coefficient of the data to be cached at the current moment;
And selecting the optimal edge equipment to finish the caching of the data based on the cold and hot coefficients of different users according to the data to be cached.
Preferably, the processing the cached data of the user edge device in the history period to obtain the vocabulary in each cached data includes:
For each cache data in the historical time period, adopting ASCII code conversion, and adopting a bidirectional maximum matching method to acquire the vocabulary in each cache data.
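For illustration, a minimal sketch of the bidirectional maximum matching method follows; the dictionary, maximum word length and tie-breaking rule are assumptions made for the sketch and are not specified by the patent. The forward and backward scans are compared and the segmentation with fewer words (or, on a tie, fewer single characters) is kept, which is the usual heuristic for this method.

```python
def max_match(text, vocab, max_len=4, reverse=False):
    """Greedy dictionary segmentation scanning forward or backward."""
    words, s = [], text[::-1] if reverse else text
    i = 0
    while i < len(s):
        for size in range(min(max_len, len(s) - i), 0, -1):
            chunk = s[i:i + size]
            cand = chunk[::-1] if reverse else chunk
            if size == 1 or cand in vocab:
                words.append(cand)
                i += size
                break
    return words[::-1] if reverse else words

def bidirectional_max_match(text, vocab):
    """Keep whichever scan yields fewer words; prefer fewer single characters on a tie."""
    fwd = max_match(text, vocab, reverse=False)
    bwd = max_match(text, vocab, reverse=True)
    if len(fwd) != len(bwd):
        return fwd if len(fwd) < len(bwd) else bwd
    singles = lambda ws: sum(len(w) == 1 for w in ws)
    return fwd if singles(fwd) <= singles(bwd) else bwd
```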
Preferably, the clustering of the vocabularies based on the word vectors is performed to obtain clustering clusters, specifically:
And taking the Hamming distance between word vectors corresponding to the words as a clustering distance, taking all words in each cache data as the input of a clustering algorithm, and outputting each clustering cluster.
Preferably, the obtaining the keyword feature value of each cluster according to the distribution of the elements in each cluster and the importance degree of each vocabulary in the cache data includes:
acquiring the lengths of the longest identical substrings of word vectors corresponding to all words in each cluster; acquiring TF-IDF values of each vocabulary, and calculating the sum value of all the TF-IDF values in each cluster; and taking the product of the sum value, the length and the vocabulary quantity of each cluster as the keyword characteristic value of each cluster.
Preferably, the obtaining the keyword of each cache data in the history period according to the keyword feature value of each cluster includes:
And for each cache data in the history time period, using the vocabulary with the maximum TF-IDF value in the cluster corresponding to the maximum keyword characteristic value as the keyword of each cache data.
Preferably, the step of obtaining the retrieval depth of the user behavior of the user in the historical time period according to the difference degree of the keywords of each cache data in the historical time period and combining the cache path vector of the user edge device and the corresponding path cache delay data specifically includes:
acquiring Euclidean distance between word vectors of the corresponding keywords of the cache data under each adjacent access times in the historical time period; obtaining dtw distances among cache path vectors of the cache data under each adjacent access times in a historical time period; calculating the product of the Euclidean distance and the dtw distances; calculating the reciprocal of the product; and taking the sum of the reciprocal values under all adjacent access times in the historical time period as the user behavior retrieval depth of the user in the historical time period.
Preferably, the obtaining the adjustment coefficient of the cold and hot attribute of the data to be cached at the current moment according to the access frequency and the access time delay of the data to be cached at the current moment includes:
for data to be cached at the current moment, acquiring the access frequency of a user to access the data to be cached in a historical time period; acquiring the time delay of accessing the data to be cached each time in the historical time period; calculating the sum of all the time delays in the historical time period; acquiring hit rate of user access data in a historical time period;
when the hit rate is smaller than or equal to a preset hit rate threshold, calculating the ratio of the access frequency to the hit rate, and taking the product of the ratio and the sum as a cold and hot attribute adjustment coefficient of data to be cached at the current moment;
And when the hit rate is larger than a preset hit rate threshold, calculating the product of the access frequency and the sum value as a cold and hot attribute adjustment coefficient of the data to be cached at the current moment.
Preferably, the obtaining the cold and hot coefficient of the current data to be cached is specifically a product of the user behavior retrieval depth of the user in the historical time period and the cold and hot attribute adjustment coefficient of the data to be cached at the current moment.
Preferably, the selecting the best edge device to complete the data caching according to the data to be cached based on the cold and hot coefficients of different users includes:
Calculating the absolute value of the difference between the cold and hot coefficients of each user and other users based on the same data to be cached, taking the average value of the absolute values of the difference between each user and all other users as the characteristic distance of each user, and taking the reciprocal of the characteristic distance of each user as the weight of each user; acquiring a mesh topology structure of edge equipment;
Taking the average value of the cold and hot coefficients of all users as a cold and hot coefficient threshold value; storing a set formed by edge equipment corresponding to a user with a cold and heat coefficient larger than a cold and heat coefficient threshold as a user edge set; storing a set formed by all edge devices as an edge set;
the cost value expression of each edge device in the edge set is:
$$cost_i = \sum_{j=1}^{n} w_j \cdot \min\left(d_{i,j}\right)$$
In the formula, $cost_i$ represents the cost value of the ith edge device in the edge set; $\min\left(d_{i,j}\right)$ represents the minimum distance in the mesh topology between the ith edge device in the edge set and the jth edge device in the user edge set; $w_j$ represents the weight corresponding to the jth edge device in the user edge set; $n$ represents the number of elements of the user edge set; $\min(\cdot)$ represents the minimization function;
and taking the edge device with the minimum cost value in the edge set as the best cache edge device for the data to be cached.
In a second aspect, an embodiment of the present invention further provides a data caching system based on edge calculation, including a memory, a processor, and a computer program stored in the memory and running on the processor, where the processor implements the steps of any one of the methods described above when executing the computer program.
The invention has at least the following beneficial effects:
The method and the device have the advantages that the key words of the cached data under different access times are analyzed, the types of the data accessed by the user in the historical time period are mined, the historical behaviors of the user are further analyzed conveniently, the preference of the accessed data of the user is learned in real time, and therefore the data caching of each user is personalized; constructing a user behavior retrieval depth for a user in a historical time period, and describing data mining for the user to deeply access the information of the same type in the time period; constructing a cold and hot attribute adjustment coefficient, and improving the cache hit rate of data to be cached, thereby reducing the cache time delay; constructing a cost value, and finishing the selection of the best cache edge equipment of the data to be cached; by the data caching based on edge calculation, the user request can be responded more quickly, the data access time delay is reduced, the burden of a central data center is lightened, and the expandability and the stability of the whole system are improved.
Drawings
In order to more clearly illustrate the embodiments of the invention or the technical solutions and advantages of the prior art, the following description will briefly explain the drawings used in the embodiments or the description of the prior art, and it is obvious that the drawings in the following description are only some embodiments of the invention, and other drawings can be obtained according to the drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flowchart illustrating a data caching method based on edge computation according to an embodiment of the present invention;
Fig. 2 is a flowchart for acquiring the cold and hot coefficients of data to be cached.
Detailed Description
In order to further describe the technical means and effects adopted by the present invention to achieve the preset purpose, the following detailed description refers to the specific implementation, structure, characteristics and effects of a data caching method and system based on edge calculation according to the present invention with reference to the accompanying drawings and preferred embodiments. In the following description, different "one embodiment" or "another embodiment" means that the embodiments are not necessarily the same. Furthermore, the particular features, structures, or characteristics of one or more embodiments may be combined in any suitable manner.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs.
The following specifically describes a specific scheme of the data caching method and system based on edge calculation provided by the invention with reference to the accompanying drawings.
Referring to fig. 1, a flowchart illustrating a data caching method based on edge computation according to an embodiment of the invention is shown, the method includes the following steps:
Step S001: and collecting cache data information of the user edge equipment in a historical time period.
The cache path vector is a vector formed by the position information of the cache path of the cache data in the edge device. The cache information of the data read over a period of time by the edge device where each user is located is obtained from the cloud; in this embodiment, all user cache data within a window of length m before the current time are obtained, m is taken as 1 hour, and an implementer can set the value according to actual conditions. For convenience of description, the window of length m before the current time is recorded as the historical time period.
So far, the cache data information of the user edge equipment is obtained.
Step S002: analyzing historical cache data information of a user, firstly obtaining keywords of the cache data, analyzing the keywords of the cache data under the continuous access times, and constructing a user behavior retrieval depth; and combining the characteristics of the data to be cached of the user at the current moment, constructing a cold and hot attribute adjustment coefficient, and obtaining the cold and hot coefficient of the current data to be cached.
According to the access mode and the behavior of the user, the data possibly needed by the user is predicted and cached on the edge device in advance. Thus, the data transmission time can be reduced, and the hit rate can be improved.
Aiming at the data condition accessed in the user history time period, the data types accessed by the user in the history time period are mined by analyzing keywords for caching the data under different access times, so that the user history behavior is further analyzed conveniently, the user access data preference is learned in real time, and therefore data caching is performed for each user individually.
In this embodiment, for each cache data of a user in the historical time period, ASCII code conversion is performed on the cache data, and the converted text data is segmented into words by the bidirectional maximum matching method. Meanwhile, in order to analyze the key information in the data, some vocabulary with no practical meaning needs to be removed; in this embodiment, the Harbin Institute of Technology (HIT) stopword list is adopted to remove vocabulary that appears in all cache data but has no practical meaning, and One-hot encoding is adopted for the remaining vocabulary to obtain the word vector of each vocabulary. The bidirectional maximum matching method, the HIT stopword list and One-hot encoding are known techniques, and this embodiment will not describe them in detail.
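A minimal sketch of this preprocessing is given below, assuming the segmented words and a stopword list are already available; the variable names are illustrative.

```python
import numpy as np

def one_hot_vectors(tokens, stopwords):
    """Drop stopwords, then map each remaining word to a one-hot word vector."""
    kept = [t for t in tokens if t not in stopwords]
    vocab = sorted(set(kept))                      # fixed ordering of the vocabulary
    index = {w: k for k, w in enumerate(vocab)}
    vectors = {}
    for w in vocab:
        v = np.zeros(len(vocab), dtype=np.uint8)
        v[index[w]] = 1
        vectors[w] = v
    return kept, vectors

# Example: tokens obtained from the word-segmentation step
tokens = ["edge", "cache", "the", "cache", "latency"]
kept, vectors = one_hot_vectors(tokens, stopwords={"the"})
```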
Meanwhile, aiming at any cache data, a clustering algorithm is adopted, and the Hamming distance between word vectors corresponding to the words is used as a clustering distance, so that each cluster comprising each word is obtained. Combining TF-IDF values of various vocabularies in the cache data to obtain a keyword characteristic value of any cache data of a user in a historical time period, wherein the expression is as follows:
$$G_i = N_i \cdot L_i \cdot \sum_{j=1}^{N_i} T_{i,j}$$
In the formula, $G_i$ represents the keyword feature value of the ith cluster; $N_i$ represents the number of words in the ith cluster; $L_i$ represents the length of the longest identical substring among the word vectors corresponding to all the vocabulary in the ith cluster; $T_{i,j}$ represents the TF-IDF value of the jth vocabulary in the ith cluster. It should be noted that the calculation of the TF-IDF (term frequency-inverse document frequency) value is prior art and will not be described in detail in this embodiment.
When the number of words in a cluster is larger and the longest identical substring among the word vectors of the words in the cluster is longer, the words in the cluster are more similar; when the sum of TF-IDF values in the cache data is larger, the probability that the cluster contains high-frequency words is greater and the cluster better matches the characteristics of a keyword. Therefore, the keyword feature value of the cluster is larger, and the words in the cluster better conform to the keywords of the cache data content.
The cluster with the largest keyword feature value is selected, and the element with the largest TF-IDF value in this cluster is recorded as the keyword of the cache data.
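A minimal sketch of this keyword-extraction step is shown below under stated assumptions: hierarchical clustering with average linkage stands in for the unspecified clustering algorithm, the number of clusters is a free parameter, the input contains at least two distinct words, and the TF-IDF values are precomputed per cache data item; none of these choices are fixed by the patent.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import pdist

def longest_common_substring_len(vectors):
    """Length of the longest identical substring shared by all vectors (treated as bit strings)."""
    strings = ["".join(map(str, v)) for v in vectors]
    base = strings[0]
    for size in range(len(base), 0, -1):
        for start in range(len(base) - size + 1):
            sub = base[start:start + size]
            if all(sub in s for s in strings[1:]):
                return size
    return 0

def keyword_of_data(word_vectors, tfidf, n_clusters=3):
    """word_vectors: {word: one-hot vector}; tfidf: {word: TF-IDF value in this cache data}."""
    words = list(word_vectors)
    mat = np.array([word_vectors[w] for w in words])
    # Hamming distance between word vectors is used as the clustering distance
    labels = fcluster(linkage(pdist(mat, metric="hamming"), method="average"),
                      t=min(n_clusters, len(words)), criterion="maxclust")
    best_g, best_cluster = float("-inf"), None
    for c in set(labels):
        members = [w for w, lab in zip(words, labels) if lab == c]
        n_i = len(members)
        l_i = longest_common_substring_len([word_vectors[w] for w in members])
        g_i = n_i * l_i * sum(tfidf[w] for w in members)   # keyword feature value G_i
        if g_i > best_g:
            best_g, best_cluster = g_i, members
    # keyword = word with the largest TF-IDF in the cluster with the largest G_i
    return max(best_cluster, key=lambda w: tfidf[w])
```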
In general, since the speed of data caching has a certain association with the data volume and the data content of the cached data, when the data volume is larger, the cache hit rate is higher, the average cache delay is reduced to a certain extent, so that the subsequent data access time is reduced. If the user performs deep access to the information of the same type in the historical time period, the cache hit rate is greatly increased, so that the cache time delay is reduced.
For the edge devices of each user, when the user needs to buffer certain data, the user can first check whether the buffer data exists in the own buffer memory, and if not, the data needs to be acquired from the buffer memories of other edge devices. When the edge device of the user accesses data from other edge devices, a cache path vector Hc corresponding to each cache data in each user history time period can be obtained, the cache path comprises a transmission path of the data obtained by the edge device of the user, meanwhile, a cache time delay Hs corresponding to each path is obtained, the cache path vector in the user history time period and the cache time delay of the corresponding path are analyzed, and a user behavior retrieval depth in the user history time period is obtained, wherein the expression is as follows:
$$S = \sum_{i=1}^{F-1} \frac{1}{d\left(w_i, w_{i+1}\right) \cdot dtw\left(Hc_i, Hc_{i+1}\right)}$$
In the formula, $S$ represents the user behavior retrieval depth in the historical time period; $F$ represents the access frequency of the user to all cached data in the historical time period; $d\left(w_i, w_{i+1}\right)$ represents the Euclidean distance between the word vectors of the keywords corresponding to the cache data accessed the ith and (i+1)th times in the historical time period; $Hc_i$ and $Hc_{i+1}$ respectively represent the cache path vectors of the cache data accessed the ith and (i+1)th times in the historical time period; $dtw(\cdot,\cdot)$ is the dtw distance function. The dtw distance is a known technology and will not be described in detail in this embodiment.
Meanwhile, when the retrieval paths used by the user to retrieve cache data between adjacent accesses in the historical time period are similar, the user's retrieval habit is deeper; that is, the cache data needed by the user in the historical time period are similar to each other, and this similarity between data can be mined to characterize the user's historical behavior. The larger $S$ is, the more the data accessed by the user in this time period need to be cached on other edge devices close to the user's edge device, so that the user can access the data conveniently.
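A minimal sketch of this computation is given below, assuming the keyword word vectors share a common dimension and the cache path vectors are sequences of numeric position values; the small epsilon guard against a zero product is an implementation assumption, not part of the patent.

```python
import numpy as np

def dtw_distance(a, b):
    """Plain dynamic-time-warping distance between two cache path vectors."""
    n, m = len(a), len(b)
    d = np.full((n + 1, m + 1), np.inf)
    d[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            d[i, j] = cost + min(d[i - 1, j], d[i, j - 1], d[i - 1, j - 1])
    return d[n, m]

def retrieval_depth(keyword_vectors, path_vectors, eps=1e-9):
    """keyword_vectors[i], path_vectors[i] belong to the ith access in the historical period.

    S = sum over adjacent accesses of 1 / (euclidean(w_i, w_{i+1}) * dtw(Hc_i, Hc_{i+1})).
    """
    s = 0.0
    for i in range(len(keyword_vectors) - 1):
        d_e = np.linalg.norm(np.asarray(keyword_vectors[i]) - np.asarray(keyword_vectors[i + 1]))
        d_p = dtw_distance(path_vectors[i], path_vectors[i + 1])
        s += 1.0 / (d_e * d_p + eps)   # eps avoids division by zero for identical keywords/paths
    return s
```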
For the cached data in the historical time period, a certain relation exists between the cached data and the data to be cached at the current moment, if the data to be cached is frequently accessed in the historical time period, but the hit rate is lower, and the caching delay of the data to be cached is larger, the fact that the data to be cached is frequently accessed in the time period is indicated, but too much caching time is wasted in the process of accessing the data, so that the data caching efficiency is reduced, and the data transmission time is prolonged. Therefore, the cold and hot attribute adjustment coefficient of the data to be cached at the current moment of the user is calculated, and the expression is as follows:
$$R = \begin{cases} \dfrac{f}{p} \cdot \sum_{i} t_i, & p \le p_0 \\ f \cdot \sum_{i} t_i, & p > p_0 \end{cases}$$
In the formula, $R$ represents the cold and hot attribute adjustment coefficient of the data to be cached at the current moment; $f$ represents the access frequency of the user to the data to be cached in the historical time period; $p$ represents the hit rate of the user's data accesses in the historical time period; $t_i$ represents the time delay of the ith access to the data to be cached in the historical time period; $p_0$ is the hit rate threshold, set to 0.1 in this embodiment. It should be noted that the calculation of the hit rate is prior art and will not be repeated in this embodiment.
It should be noted that when the data to be cached is frequently requested in the historical time period but the hit rate of the user accessing it is smaller than the hit rate threshold, this indicates that the data may be data frequently accessed by the edge device in the historical time period and needs to be cached and retained, otherwise more time would be wasted on subsequent accesses. Therefore, the cold and hot attribute adjustment coefficient of the data to be cached is corrected by combining the cache delay and the access frequency of the data to be cached in the historical time period. The larger $R$ is, the more the data to be cached at the current moment needs to be cached on edge equipment close to the user's edge equipment, so as to facilitate the next data access.
The cold and hot attribute adjustment coefficient of the data accessed by the user and the user behavior retrieval depth in the historical time period are combined to construct the cold and hot coefficient of the data to be cached by the user, which is used to represent the caching necessity of the data for the user, and the expression is as follows:
$$W = S \cdot R$$
In the formula, $W$ represents the cold and hot coefficient of the data to be cached at the current moment, $S$ represents the user behavior retrieval depth of the user in the historical time period, and $R$ represents the cold and hot attribute adjustment coefficient of the data to be cached at the current moment. The flow of obtaining the cold and hot coefficient of the data to be cached is shown in Fig. 2.
It should be noted that when, in the historical time period, the user behavior retrieval depth of the user caching the data is larger and the cold and hot attribute adjustment coefficient of the data is larger, the data has been deeply accessed by the user in the historical time period and is hotter data, namely $W$ is larger.
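As a compact illustration of the two quantities just defined, the following sketch computes R and W from the per-access statistics; the guard against a zero hit rate is an implementation assumption, not part of the patent.

```python
def hot_cold_adjustment(access_freq, hit_rate, delays, hit_threshold=0.1):
    """Cold/hot attribute adjustment coefficient R of the data to be cached.

    R = (f / p) * sum(delays) when the hit rate p <= threshold,
    R =  f      * sum(delays) otherwise.
    """
    total_delay = sum(delays)
    if hit_rate <= hit_threshold:
        p = max(hit_rate, 1e-9)          # guard against division by zero (assumption)
        return (access_freq / p) * total_delay
    return access_freq * total_delay


def hot_cold_coefficient(retrieval_depth_s, adjustment_r):
    """Cold/hot coefficient W = S * R: caching necessity of the data for this user."""
    return retrieval_depth_s * adjustment_r
```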
So far, the cold and hot coefficients of the data to be cached are obtained.
Step S003: and analyzing the cold and hot coefficients of the same data to be cached based on different users, constructing a cost value, and finishing data caching based on edge calculation.
Considering that similar data accessed by users may exist between edge devices of users which are closer to each other, if two or more user edge devices have data access requirements on the same data, the data is cached on the edge devices which are closer to the two or more users, so that the multi-user access is facilitated.
And calculating the cold and hot coefficients of other users according to the cache data in the historical time period of the edge equipment of the other users and the data to be cached based on the same data to be cached. Taking the average value of the absolute values of the difference values of the cold and hot coefficients of each user and other users as a characteristic distance, and taking the inverse of the characteristic distance of each user as the weight of each user; and taking each edge device as each node, and acquiring the position information of the edge device and the mesh topological structure. Taking the average value of the cold and hot coefficients of all users as a cold and hot coefficient threshold value, screening users needing data to be cached, namely users with cold and hot coefficients larger than the cold and hot coefficient threshold value, storing a set formed by corresponding edge devices as a user edge set, and storing a set formed by all edge devices as an edge set; obtaining cost values of edge devices in the edge set according to a mesh topological structure between the edge set of the user and the edge devices in the edge set, wherein the cost values are expressed as follows:
$$cost_i = \sum_{j=1}^{n} w_j \cdot \min\left(d_{i,j}\right)$$
In the formula, $cost_i$ represents the cost value of the ith edge device in the edge set; $\min\left(d_{i,j}\right)$ represents the minimum distance in the mesh topology between the ith edge device in the edge set and the jth edge device in the user edge set; $w_j$ represents the weight corresponding to the jth edge device in the user edge set; $n$ represents the number of elements of the user edge set; $\min(\cdot)$ represents the minimization function.
And taking the edge device with the minimum cost value in the edge set as the optimal edge device of the data to be cached.
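A minimal sketch of this selection step follows, assuming the mesh topology is given as an unweighted adjacency map (so the minimum distance is a hop count), that each user is identified by its own edge device, and that at least two users share the data to be cached; these assumptions and all names below are illustrative rather than taken from the patent.

```python
from collections import deque
import math

def hop_distance(adj, src, dst):
    """Minimum hop count between two nodes of the mesh topology (breadth-first search)."""
    if src == dst:
        return 0
    seen, queue = {src}, deque([(src, 0)])
    while queue:
        node, d = queue.popleft()
        for nxt in adj.get(node, ()):
            if nxt == dst:
                return d + 1
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, d + 1))
    return math.inf

def best_cache_device(adj, edge_set, hot_cold_by_user):
    """hot_cold_by_user maps a user's edge device to its cold/hot coefficient W for this data."""
    users = list(hot_cold_by_user)
    # weight of each user: reciprocal of its mean |W_u - W_v| distance to all other users
    weights = {}
    for u in users:
        others = [abs(hot_cold_by_user[u] - hot_cold_by_user[v]) for v in users if v != u]
        feat = sum(others) / len(others)
        weights[u] = 1.0 / feat if feat > 0 else float("inf")  # degenerate case handled naively
    # user edge set: devices of users whose W exceeds the mean W over all users
    threshold = sum(hot_cold_by_user.values()) / len(users)
    user_edge_set = [u for u in users if hot_cold_by_user[u] > threshold]
    # cost_i = sum_j w_j * min-distance(i, j); pick the device with the smallest cost
    def cost(dev):
        return sum(weights[j] * hop_distance(adj, dev, j) for j in user_edge_set)
    return min(edge_set, key=cost)
```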
By the data caching method and the data caching system based on edge calculation, user requests can be responded more quickly, data access time delay is reduced, the burden of a central data center is lightened, and the expandability and the stability of the whole system are improved.
Based on the same inventive concept as the above method, the embodiment of the present invention further provides a data caching system based on edge calculation, which includes a memory, a processor, and a computer program stored in the memory and running on the processor, where the processor implements the steps of any one of the above methods based on edge calculation when executing the computer program.
In summary, the embodiment of the invention mainly analyzes the keywords of the cached data under different access times, and mines the data types accessed by the user in the historical time period, so that the historical behavior of the user is further analyzed, the preference of the accessed data of the user is learned in real time, and the data caching is performed for each user in a personalized way; constructing a user behavior retrieval depth for a user in a historical time period, and describing data mining for the user to deeply access the information of the same type in the time period; constructing a cold and hot attribute adjustment coefficient, and improving the cache hit rate of data to be cached, thereby reducing the cache time delay; constructing a cost value, and finishing the selection of the best cache edge equipment of the data to be cached; by the data caching based on edge calculation, the user request can be responded more quickly, the data access time delay is reduced, the burden of a central data center is lightened, and the expandability and the stability of the whole system are improved.
It should be noted that: the sequence of the embodiments of the present invention is only for description, and does not represent the advantages and disadvantages of the embodiments. And the foregoing description has been directed to specific embodiments of this specification. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing are also possible or may be advantageous.
In this specification, each embodiment is described in a progressive manner, and identical and similar parts of each embodiment are all referred to each other, and each embodiment mainly describes differences from other embodiments.
The foregoing description of the preferred embodiments of the present invention is not intended to be limiting, but rather, any modifications, equivalents, improvements, etc. that fall within the principles of the present invention are intended to be included within the scope of the present invention.

Claims (6)

1. A data caching method based on edge calculation, which is characterized by comprising the following steps:
obtaining cache data information of user edge equipment in a historical time period, wherein the cache data information comprises a cache path vector, cache delay data and cache data;
Processing the cache data of the user edge equipment in the historical time period to obtain vocabulary in each cache data; encoding each vocabulary in the cache data to obtain word vectors of each vocabulary; clustering each vocabulary based on the word vector to obtain each cluster; acquiring key word characteristic values of each cluster according to the distribution of elements in each cluster and the importance degree of each vocabulary in the cache data; acquiring keywords of each cache data in a historical time period according to the keyword characteristic values of each cluster; according to the difference degree of the keywords of each cache data in the historical time period, combining the cache path vector of the user edge device and the corresponding path cache delay data to obtain the user behavior retrieval depth of the user in the historical time period; acquiring a cold and hot attribute adjustment coefficient of the data to be cached at the current moment according to the access frequency and the access time delay of the data to be cached at the current moment; acquiring the cold and hot coefficients of the current data to be cached according to the user behavior retrieval depth of the user in the historical time period and the cold and hot attribute adjustment coefficient of the data to be cached at the current moment;
selecting optimal edge equipment to finish data caching based on the cold and heat coefficients of different users according to the data to be cached;
The retrieval depth of the user behavior of the user in the historical time period is obtained by combining the cache path vector of the user edge device and the corresponding path cache delay data according to the difference degree of the keywords of each cache data in the historical time period, and is specifically as follows:
Acquiring Euclidean distance between word vectors of the corresponding keywords of the cache data under each adjacent access times in the historical time period; obtaining dtw distances among cache path vectors of the cache data under each adjacent access times in a historical time period; calculating the product of the Euclidean distance and the dtw distances; calculating the reciprocal of the product; taking the sum of the reciprocal values of all adjacent access times in the historical time period as the user behavior retrieval depth of the user in the historical time period;
The obtaining the adjustment coefficient of the cold and hot attribute of the data to be cached at the current moment according to the access frequency and the access time delay of the data to be cached at the current moment comprises the following steps:
for data to be cached at the current moment, acquiring the access frequency of a user to access the data to be cached in a historical time period; acquiring the time delay of accessing the data to be cached each time in the historical time period; calculating the sum of all the time delays in the historical time period; acquiring hit rate of user access data in a historical time period;
when the hit rate is smaller than or equal to a preset hit rate threshold, calculating the ratio of the access frequency to the hit rate, and taking the product of the ratio and the sum as a cold and hot attribute adjustment coefficient of data to be cached at the current moment;
When the hit rate is larger than a preset hit rate threshold, calculating the product of the access frequency and the sum value as a cold and hot attribute adjustment coefficient of data to be cached at the current moment;
the obtained cold and hot coefficients of the current data to be cached are specifically products of the user behavior retrieval depth of the user in the historical time period and the cold and hot attribute adjustment coefficients of the data to be cached at the current moment;
the selecting the optimal edge equipment to finish the data caching based on the cold and heat coefficients of different users according to the data to be cached comprises the following steps:
Calculating the absolute value of the difference between the cold and hot coefficients of each user and other users based on the same data to be cached, taking the average value of the absolute values of the difference between each user and all other users as the characteristic distance of each user, and taking the reciprocal of the characteristic distance of each user as the weight of each user; acquiring the mesh topology structure of all edge devices;
Taking the average value of the cold and hot coefficients of all users as a cold and hot coefficient threshold value; storing a set formed by edge equipment corresponding to a user with a cold and heat coefficient larger than a cold and heat coefficient threshold as a user edge set; storing a set formed by all edge devices as an edge set;
the cost value expression of each edge device in the edge set is:
$$cost_i = \sum_{j=1}^{n} w_j \cdot \min\left(d_{i,j}\right)$$
In the formula, $cost_i$ represents the cost value of the ith edge device in the edge set; $\min\left(d_{i,j}\right)$ represents the minimum distance in the mesh topology between the ith edge device in the edge set and the jth edge device in the user edge set; $w_j$ represents the weight corresponding to the jth edge device in the user edge set; $n$ represents the number of elements of the user edge set; $\min(\cdot)$ represents the minimization function;
and taking the edge device with the minimum cost value in the edge set as the best cache edge device for the data to be cached.
2. The method for caching data based on edge calculation as claimed in claim 1, wherein the processing the cached data of the user edge device in the history period to obtain vocabulary in each cached data includes:
For each cache data in the historical time period, adopting ASCII code conversion, and adopting a bidirectional maximum matching method to acquire vocabulary in each cache data.
3. The data caching method based on edge calculation as claimed in claim 1, wherein the clustering of words based on word vectors is performed to obtain clusters, specifically:
And taking the Hamming distance between word vectors corresponding to the words as a clustering distance, taking all words in each cache data as the input of a clustering algorithm, and outputting each clustering cluster.
4. The method for caching data based on edge calculation as claimed in claim 1, wherein the obtaining the keyword feature value of each cluster according to the distribution of elements in each cluster and the importance degree of each vocabulary in the cached data includes:
acquiring the lengths of the longest identical substrings of word vectors corresponding to all words in each cluster; acquiring TF-IDF values of each vocabulary, and calculating the sum value of all the TF-IDF values in each cluster; and taking the product of the sum value, the length and the vocabulary quantity of each cluster as the keyword characteristic value of each cluster.
5. The method for caching data based on edge calculation as claimed in claim 4, wherein said obtaining keywords of each cached data in a history period according to the keyword feature values of each cluster includes:
And for each cache data in the history time period, using the vocabulary with the maximum TF-IDF value in the cluster corresponding to the maximum keyword characteristic value as the keyword of each cache data.
6. A data caching system based on edge computation, comprising a memory, a processor and a computer program stored in the memory and running on the processor, characterized in that the processor implements the steps of the method according to any one of claims 1-5 when executing the computer program.
CN202410372649.XA 2024-03-29 2024-03-29 Data caching method and system based on edge calculation Active CN117971503B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410372649.XA CN117971503B (en) 2024-03-29 2024-03-29 Data caching method and system based on edge calculation

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202410372649.XA CN117971503B (en) 2024-03-29 2024-03-29 Data caching method and system based on edge calculation

Publications (2)

Publication Number Publication Date
CN117971503A CN117971503A (en) 2024-05-03
CN117971503B (en) 2024-06-11

Family

ID=90851750

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202410372649.XA Active CN117971503B (en) 2024-03-29 2024-03-29 Data caching method and system based on edge calculation

Country Status (1)

Country Link
CN (1) CN117971503B (en)

Citations (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108549719A (en) * 2018-04-23 2018-09-18 西安交通大学 A kind of adaptive cache method based on cluster in mobile edge calculations network
EP3648436A1 (en) * 2018-10-29 2020-05-06 Commissariat à l'énergie atomique et aux énergies alternatives Method for clustering cache servers within a mobile edge computing network
CN111680161A (en) * 2020-07-07 2020-09-18 腾讯科技(深圳)有限公司 Text processing method and device and computer readable storage medium
CN112749010A (en) * 2020-12-31 2021-05-04 中南大学 Edge calculation task allocation method for fusion recommendation system
US11218561B1 (en) * 2021-03-09 2022-01-04 Wipro Limited Method and system for managing cache data in a network through edge nodes
CN114785856A (en) * 2022-03-21 2022-07-22 鹏城实验室 Edge calculation-based collaborative caching method, device, equipment and storage medium
CN115988575A (en) * 2022-12-01 2023-04-18 郑州师范学院 Mixed type edge data caching method
CN116320000A (en) * 2023-02-21 2023-06-23 青海大学 Collaborative caching method, collaborative caching device, electronic equipment and storage medium
CN116521904A (en) * 2023-06-29 2023-08-01 湖南大学 Ship manufacturing data cloud fusion method and system based on 5G edge calculation
JP2023118632A (en) * 2022-02-15 2023-08-25 Solution Creators株式会社 Method for preventing server deterioration and fire, and data center for preventing deterioration and fire
CN116842290A (en) * 2023-06-29 2023-10-03 中国平安财产保险股份有限公司 Data caching method, device, equipment and computer readable storage medium
CN116865842A (en) * 2023-09-05 2023-10-10 武汉能钠智能装备技术股份有限公司 Resource allocation system and method for communication multiple access edge computing server


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
A prediction method for service QoS in mobile edge computing environments; Ren Lifang; Wang Wenjian; Journal of Chinese Computer Systems (小型微型计算机系统); 2020-05-29 (No. 06); 58-63 *

Also Published As

Publication number Publication date
CN117971503A (en) 2024-05-03

Similar Documents

Publication Publication Date Title
CN104253855B (en) Classification popularity buffer replacing method based on classifying content in a kind of content oriented central site network
US5305389A (en) Predictive cache system
CN108446340B (en) A kind of user's hot spot data access prediction technique towards mass small documents
CN108710639B (en) Ceph-based access optimization method for mass small files
CN110287010B (en) Cache data prefetching method oriented to Spark time window data analysis
CN111314862B (en) Caching method with recommendation under deep reinforcement learning in fog wireless access network
CN113364854B (en) Privacy protection dynamic edge cache design method based on distributed reinforcement learning in mobile edge computing network
CN110471939A (en) Data access method, device, computer equipment and storage medium
CN112667528A (en) Data prefetching method and related equipment
US20220374692A1 (en) Interleaving memory requests to accelerate memory accesses
CN111491331B (en) Network perception self-adaptive caching method based on transfer learning in fog computing network
CN110418367A (en) A kind of 5G forward pass mixture of networks edge cache low time delay method
CN115712583B (en) Method, device and medium for improving distributed cache cross-node access performance
CN113271631B (en) Novel content cache deployment scheme based on user request possibility and space-time characteristics
Chao Web cache intelligent replacement strategy combined with GDSF and SVM network re-accessed probability prediction
CN117971503B (en) Data caching method and system based on edge calculation
Einziger et al. Lightweight robust size aware cache management
Kazi et al. Web object prefetching: Approaches and a new algorithm
CN113127515A (en) Power grid-oriented regulation and control data caching method and device, computer equipment and storage medium
CN113435601A (en) Data prefetching method and device and storage device
EP3274844A1 (en) Hierarchical cost based caching for online media
CN113268458A (en) Caching method and system based on cost-sensitive classification algorithm
CN110363015A (en) A kind of construction method of the markov Prefetching Model based on user property classification
CN105530303B (en) A kind of network-caching linear re-placement method
CN110399979B (en) Click rate pre-estimation system and method based on field programmable gate array

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant