CN113157215A - Hotspot data access method and device, electronic equipment and storage medium - Google Patents

Publication number
CN113157215A
Authority
CN
China
Prior art keywords
data
hotspot
access
access request
copy
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110530391.8A
Other languages
Chinese (zh)
Inventor
夏天
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Mobile Communications Group Co Ltd
MIGU Culture Technology Co Ltd
Original Assignee
China Mobile Communications Group Co Ltd
MIGU Culture Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Mobile Communications Group Co Ltd, MIGU Culture Technology Co Ltd
Priority to CN202110530391.8A
Publication of CN113157215A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0628 Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0655 Vertical data movement, i.e. input-output transfer; data movement between one or more hosts and one or more storage devices
    • G06F3/0656 Data buffering arrangements
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2455 Query execution
    • G06F16/24552 Database cache management
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0602 Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/0608 Saving storage space on storage systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0628 Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0646 Horizontal data movement in storage systems, i.e. moving data in between storage devices or systems
    • G06F3/065 Replication mechanisms

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Human Computer Interaction (AREA)
  • Databases & Information Systems (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides a hotspot data access method and device, an electronic device, and a storage medium. A hotspot access request for predetermined hotspot data is acquired, and a substitute access request corresponding to the hotspot access request is determined. The substitute access request is used to access copy data of the hotspot data, so the hotspot access request can be answered by accessing a copy instead of the original. Because the copy data and the hotspot data are stored in different cache hosts, the substitute access request reduces the request volume for the predetermined hotspot data, relieves the pressure on the cache host where the hotspot data is located, and lowers the probability of service anomalies caused by hotspot data.

Description

Hotspot data access method and device, electronic equipment and storage medium
Technical Field
The invention relates to the field of internet technology, and in particular to a hotspot data access method and device, an electronic device, and a storage medium.
Background
As request volumes grow, database performance alone can no longer keep up, so caches were introduced to improve system availability and user experience. With a cache in place, when a business system issues a query it first checks whether the data exists in the cache. If it does, the data is returned directly; if it does not, the database is queried and the result is then returned.
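The cache lookup described above can be sketched as follows. This is a minimal illustration rather than anything specified by the patent: the function name `read_through`, the dictionary standing in for the cache, and the `query_db` callback are all hypothetical, and writing the database result back into the cache is an assumption (the text only says the data is returned).

```python
from typing import Callable, Dict, Optional


def read_through(
    key: str,
    cache: Dict[str, str],
    query_db: Callable[[str], Optional[str]],
) -> Optional[str]:
    """Return the value for key, checking the cache before the database."""
    if key in cache:
        # Cache hit: return the cached data directly.
        return cache[key]
    # Cache miss: fall back to the database.
    value = query_db(key)
    if value is not None:
        # Assumption: populate the cache so later requests hit it.
        cache[key] = value
    return value
```

With this shape, repeated requests for the same key reach the database only once, which is also why a burst of requests for one key lands entirely on whichever cache host holds that key.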
However, sudden events such as breaking news or flash sales can sharply increase the request volume for certain data, turning it into hotspot data. Access then concentrates on a single cache host, and once that host's capacity is exceeded, cache penetration or even a service avalanche may occur.
Disclosure of Invention
The invention provides a hotspot data access method and device, an electronic device, and a storage medium. They address the defect that access to hotspot data concentrates on a single cache host, which can cause cache penetration or even a service avalanche once the host's capacity is exceeded. By using substitute access requests, the invention reduces the request volume for the hotspot data, relieves the pressure on the cache host storing it, and lowers the probability of service anomalies caused by hotspot data.
The invention provides a hotspot data access method, which comprises the following steps:
acquiring a hotspot access request for predetermined hotspot data;
determining a substitute access request corresponding to the hotspot access request, wherein the substitute access request is used to access copy data of the hotspot data, and the copy data and the hotspot data are stored in different cache hosts;
and using the copy data obtained according to the substitute access request as the data responding to the hotspot access request.
According to the hotspot data access method provided by the invention, after determining the substitute access request corresponding to the hotspot access request, the method further comprises:
determining, according to the number of accesses to the copy data corresponding to the substitute access request, whether the copy data is hotspot data, and if so, performing a scatter operation on the copy data.
According to the hotspot data access method provided by the invention, performing the scatter operation on the copy data comprises:
determining a heat level corresponding to the copy data, wherein the heat level is determined according to the number of accesses to the copy data;
and determining, according to the heat level, the number of copies to make of the copy data when the scatter operation is performed.
According to the hotspot data access method provided by the invention, determining the substitute access request corresponding to the hotspot access request comprises:
acquiring a data identifier of the hotspot data, and determining, based on the data identifier and through a preset identifier generation rule, a copy data identifier for each piece of copy data of the hotspot data;
generating an access request for accessing the copy data according to the copy data identifier, and using that access request as the substitute access request;
wherein, when the copy data is stored, the cache host that caches the copy data is determined according to the copy data identifier.
According to the hotspot data access method provided by the invention, before acquiring the hotspot access request for the predetermined hotspot data, the method further comprises:
for any data, determining whether the data is hotspot data according to the number of accesses requesting that data.
According to the hotspot data access method provided by the invention, determining whether any data is hotspot data according to the number of accesses requesting that data comprises:
determining whether the data is hotspot data according to the number of accesses requesting that data within a set time window.
According to the hotspot data access method provided by the invention, determining whether any data is hotspot data according to the number of accesses requesting that data comprises:
determining whether the data is hotspot data according to a prediction model;
wherein the prediction model is determined based on the number of accesses requesting that data.
The invention also provides a hot spot data access device, comprising:
an acquisition module, configured to acquire a hotspot access request for predetermined hotspot data;
a determination module, configured to determine a substitute access request corresponding to the hotspot access request, wherein the substitute access request is used to access copy data of the hotspot data, and the copy data and the hotspot data are stored in different cache hosts;
and a response module, configured to use the copy data obtained according to the substitute access request as the data responding to the hotspot access request.
The invention further provides an electronic device, which comprises a memory, a processor and a computer program stored on the memory and capable of running on the processor, wherein the processor executes the program to realize the steps of any one of the hot spot data access methods.
The present invention also provides a non-transitory computer readable storage medium having stored thereon a computer program which, when executed by a processor, performs the steps of the hotspot data access method as described in any one of the above.
According to the hotspot data access method and device, the electronic device, and the storage medium provided by the invention, a hotspot access request for predetermined hotspot data is acquired, and a substitute access request corresponding to the hotspot access request is determined. The substitute access request is used to access copy data of the hotspot data, so the hotspot access request can be answered by accessing a copy instead of the original. Because the copy data and the hotspot data are stored in different cache hosts, the substitute access request reduces the request volume for the predetermined hotspot data, relieves the pressure on the cache host where the hotspot data is located, and lowers the probability of service anomalies caused by hotspot data.
Drawings
In order to more clearly illustrate the technical solutions of the present invention or the prior art, the drawings needed for the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and those skilled in the art can also obtain other drawings according to the drawings without creative efforts.
FIG. 1 is a schematic diagram of a network architecture in which a caching host and an application server are deployed together for comparison purposes;
FIG. 2 is a schematic diagram of a network architecture for a comparative multi-level caching scheme;
FIG. 3 is a schematic diagram of a network architecture providing a comparative read-write separation mode;
FIG. 4 is a schematic flow chart of a hot spot data access method provided by the present invention;
FIG. 5 is a schematic diagram of a network structure for hot spot data access provided by the present invention;
FIG. 6 is a schematic diagram illustrating an implementation principle of hot spot data access provided by the present invention;
FIG. 7 is a schematic diagram of the statistics of data request volumes over a time window provided by the present invention;
FIG. 8 is a block diagram of a hotspot data access device provided by the present invention;
FIG. 9 is a schematic physical structure diagram of an electronic device provided by the present invention;
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention clearer, the technical solutions of the present invention will be clearly and completely described below with reference to the accompanying drawings, and it is obvious that the described embodiments are some, but not all embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Several existing approaches to the hotspot data problem are described for comparison. Fig. 1 is a schematic diagram of a network structure in which a cache host and an application server are deployed together. Referring to fig. 1, the cache host and the application service are deployed on the same machine, and each application service machine uses its own cache host, so the load an application service generates cannot exceed the physical limit that its cache can tolerate. Fig. 2 is a schematic diagram of a network structure of a multi-level cache scheme, in which a common cache service (e.g., Redis or Memcached) serves as the first-level cache and a local cache is added as the second-level cache. When a hotspot data request occurs, the Key and data of the hotspot data are cached in the local cache, and the first-level cache is no longer requested. Fig. 3 is a schematic diagram of a network structure of a read-write separation mode. Referring to fig. 3, the cache service uses a Master-Slave mode: the Master handles write operations, multiple Slaves provide read-only service and share the requests, and data is kept synchronized between the Master and the Slaves.
However, the schemes of fig. 1 and fig. 2, which deploy the cache host and the application service on the same physical machine or use a local cache, write fixed data into the cache while hotspot data may change as it is accessed, so both schemes risk inconsistent cached data. In addition, a shared physical machine and a local cache are tightly constrained by hardware resources: the cache is stored in memory, so the requirement is hard to meet when a large amount of data needs to be cached. For the scheme of fig. 3, which deploys multiple Slave machines, Master-Slave synchronization greatly reduces the possibility of inconsistency (although in practice inconsistency can still occur while data is being synchronized), but more Slaves must be added as the number of accesses grows, which raises the cost. At present, the cache is typically scaled out for events such as major promotions and reclaimed afterwards to reduce cost. This only works when the hotspot data can be predicted in advance; it cannot handle hotspot information that appears suddenly.
In order to solve the hotspot data problem while using as few resources as possible, this embodiment provides a hotspot data access method. Fig. 4 is a schematic flow chart of the method, which is generally applied to a server. Referring to fig. 4, the method includes:
Step 401: acquiring a hotspot access request for predetermined hotspot data.
Whether a piece of data is hotspot data is determined from the number of accesses to it. If the counted number of accesses exceeds a preset number of accesses, the data is determined to be hotspot data.
Step 402: determining a substitute access request corresponding to the hotspot access request, wherein the substitute access request is used to access copy data of the hotspot data, and the copy data and the hotspot data are stored in different cache hosts.
In this embodiment, copy data is generated in real time for data determined to be hotspot data, and the copy is stored in a cache host different from the one holding the original data. The hotspot data can therefore be obtained by accessing the copy data, which relieves the pressure on the cache host storing the hotspot data.
After the copy data has been stored in a cache host, it can be deleted automatically once a preset duration has elapsed, releasing the cache space it occupied. With the scheme of this embodiment there is therefore no need to deploy a dedicated cache host for each application server or to add a large number of cache hosts. Usually it is enough to distribute the copy data of the hotspot data across the existing cache hosts in real time, which relieves the pressure on the cache host where the hotspot data is located and reduces the heat of the hotspot data. When the hotspot data changes, each corresponding piece of copy data changes accordingly.
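The automatic deletion of copy data after a preset duration can be illustrated with a toy in-memory store. This is only a sketch under assumptions: the class `ExpiringStore`, its method names, the lazy expiry-on-read strategy, and the injectable `clock` are hypothetical and are not taken from the patent, which does not say how the deletion is implemented.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class ExpiringStore:
    """Toy single-host cache whose entries may carry a time-to-live."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        # key -> (value, expiry timestamp or None for "never expires")
        self._items: Dict[str, Tuple[str, Optional[float]]] = {}

    def put(self, key: str, value: str, ttl_seconds: Optional[float] = None) -> None:
        expires_at = None if ttl_seconds is None else self._clock() + ttl_seconds
        self._items[key] = (value, expires_at)

    def get(self, key: str) -> Optional[str]:
        item = self._items.get(key)
        if item is None:
            return None
        value, expires_at = item
        if expires_at is not None and self._clock() >= expires_at:
            # Assumption: expired copies are dropped lazily on access.
            del self._items[key]
            return None
        return value
```

In such a setup the original entry would be stored without a time-to-live, while each copy would be stored with the preset duration so that its space is reclaimed once the burst has passed.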
For the hotspot access request, a substitute access request for accessing the copy data can be generated according to the cache host where the copy data of the hotspot data is located, so that the access to the hotspot data is reduced.
Step 403: using the copy data obtained according to the substitute access request as the data responding to the hotspot access request.
After the copy data is obtained according to the substitute access request, the copy data can be returned to the requester of the hot spot access request, so that the problem of the hot spot data is solved, and the normal access of the user to the hot spot data is ensured.
This embodiment provides a hotspot data access method that acquires a hotspot access request for predetermined hotspot data and determines a substitute access request corresponding to the hotspot access request. The substitute access request is used to access copy data of the hotspot data, so the hotspot access request can be answered by accessing a copy instead of the original. Because the copy data and the hotspot data are stored in different cache hosts, the substitute access request reduces the request volume for the predetermined hotspot data, relieves the pressure on the cache host where the hotspot data is located, and lowers the probability of service anomalies caused by hotspot data.
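Steps 401 to 403 can be sketched end to end as follows. The sketch makes several assumptions that are not in the patent: cache hosts are modelled as plain dictionaries, `home_host` and `copies` are hypothetical lookup tables, the copy keys used in the usage example are arbitrary, and every request for a hot key is redirected to a copy (the source only requires that some requests be redirected).

```python
import random
from typing import Dict, List, Tuple

Host = Dict[str, str]


def handle_request(
    key: str,
    home_host: Dict[str, str],
    copies: Dict[str, List[Tuple[str, str]]],
    hosts: Dict[str, Host],
    rng: random.Random,
) -> Tuple[str, str]:
    """Return (host name, value) for a request.

    copies maps a hotspot key to (host name, copy key) pairs that live on
    hosts other than the one holding the original data.
    """
    placements = copies.get(key)
    if placements:
        # Step 401: the request targets predetermined hotspot data.
        # Step 402: build the substitute request by picking one copy.
        host_name, copy_key = rng.choice(placements)
        # Step 403: the copy data answers the original request.
        return host_name, hosts[host_name][copy_key]
    # Not hotspot data: read from the host that owns the key.
    host_name = home_host[key]
    return host_name, hosts[host_name][key]
```

In this model, a hot key whose original sits on one host is always answered from one of the other hosts, so the original host sees no read traffic for it while the copies exist.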
Specifically, fig. 5 is a schematic diagram of a network structure for hotspot data access provided in this embodiment. Referring to fig. 5, an Interceptor and a Monitor may be placed between the server's Proxy and the Cache storage. The Interceptor intercepts the access requests forwarded by the Proxy and passes them to the Monitor, which judges whether the requested data is hotspot data. When the data is judged to be hotspot data, it is copied and the copies are stored in different cache hosts. Meanwhile, for a portion of the intercepted hotspot access requests, substitute access requests that access the copy data can be generated. After the copy data is obtained, the Proxy returns it to the requester of the hotspot data, which reduces the number of direct accesses to the hotspot data.
The Interceptor can obtain the Proxy request (that is, the access request for data) either through non-intrusive instrumentation or by capturing data packets directly. It extracts from the access request the data identifier of the requested data (because data is usually stored in Key-value form, the data identifier is the Key) and then notifies the Monitor, which counts accesses and identifies hotspots. The Interceptor receives the hotspot Keys and related information fed back by the Monitor and records them in memory. When hotspot data appears, the Key of the hotspot data is obtained and copied to derive the Key of each piece of copy data (the Keys of the hotspot data and of each piece of copy data all differ), and the copy data is stored in different cache hosts. Substitute access requests are then generated, from the Keys of the copy data, for a portion of the access requests to the hotspot data, which resolves the hotspot data problem.
Further, on the basis of the above embodiment, after determining the substitute access request corresponding to the hotspot access request, the method further includes:
determining, according to the number of accesses to the copy data corresponding to the substitute access request, whether the copy data is hotspot data, and if so, performing a scatter operation on the copy data.
When the hotspot data is copied, several copies can be made and stored in different cache hosts. Access to the hotspot data is then spread across the cache hosts that store the hotspot data and its copies, which resolves the hotspot data problem.
It should be noted that when copy data is accessed through substitute access requests, a single piece of copy data may itself be accessed too often. For example, when the number of accesses to a piece of copy data exceeds a preset number of accesses, that copy has a hotspot problem of its own. In this case the scheme of this embodiment performs a scatter operation on the copy data: the copy is copied again, and the resulting copies of the copy are stored across different cache hosts.
In this embodiment, the number of accesses to the copy data continues to be monitored after the copy data is created. When it reaches the preset number of accesses, the copy data is scattered further, which resolves hotspots that reappear while the copy data is being accessed.
Further, on the basis of the foregoing embodiments, performing the scatter operation on the copy data includes:
determining a heat level corresponding to the copy data, wherein the heat level is determined according to the number of accesses to the copy data;
and determining, according to the heat level, the number of copies to make of the copy data when the scatter operation is performed.
Specifically, the heat level is determined according to the number of accesses requesting the copy data per unit of time. Generally, the more accesses per unit of time, the higher the heat level and the more copies need to be made. For example, if a piece of copy data is accessed more than 100,000 times per second, its current heat level is high (for example, first-level heat) and 10 copies are made. If it is accessed more than 50,000 but fewer than 100,000 times per second, its current heat level is moderately high (for example, second-level heat) and 5 copies are made.
In this embodiment, when the scatter operation is performed, the number of copies made depends on the heat level of the copy data: the higher the heat level, the more copies. This relieves access pressure and makes the scatter operation better proportioned to the load.
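The mapping from access rate to heat level and copy count can be written as a small lookup. The thresholds and copy counts below are the example values from the text; the names `HEAT_LEVELS` and `classify` are hypothetical, and two boundary choices are assumptions, since the text does not cover them: a rate of exactly 100,000 per second is treated as second-level heat, and rates at or below 50,000 per second get no heat level and no copies.

```python
from typing import List, Optional, Tuple

# (accesses-per-second threshold, heat level, number of copies), highest first.
HEAT_LEVELS: List[Tuple[int, int, int]] = [
    (100_000, 1, 10),  # first-level heat
    (50_000, 2, 5),    # second-level heat
]


def classify(accesses_per_second: int) -> Tuple[Optional[int], int]:
    """Return (heat level, copy count) for an observed access rate."""
    for threshold, level, copy_count in HEAT_LEVELS:
        if accesses_per_second > threshold:
            return level, copy_count
    # Assumption: below every threshold the data is not scattered.
    return None, 0
```

For instance, `classify(120_000)` yields first-level heat with 10 copies, and `classify(60_000)` yields second-level heat with 5 copies.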
Further, on the basis of the foregoing embodiments, determining the substitute access request corresponding to the hotspot access request includes:
acquiring a data identifier of the hotspot data, and determining, based on the data identifier and through a preset identifier generation rule, a copy data identifier for each piece of copy data of the hotspot data;
generating an access request for accessing the copy data according to the copy data identifier, and using that access request as the substitute access request;
wherein, when the copy data is stored, the cache host that caches the copy data is determined according to the copy data identifier.
Further, a cache host for caching any copy data can be determined through a hash algorithm according to a data identifier corresponding to the copy data.
The identifier generation rule is a rule, specified in advance, for generating a data identifier for each piece of copy data. To express the relationship between the hotspot data and its copies, the rule may derive the data identifier of each copy from the data identifier of the hotspot data. For example, if the data identifier of the hotspot data is test, a number may be appended to test to form the data identifier of a copy. More specifically, the rule may generate the data identifier of each copy from the data identifier of the hotspot data and the number of copies. For example, if the number of copies is 3, the data identifiers of the copies are test@1, test@2, and test@3.
Based on the identifier generation rule, the data identifier of a copy can be determined quickly, and a substitute access request for any copy can then be generated quickly. For example, for hotspot data whose data identifier is test and whose number of copies is known to be 3, n in test@n can be assigned any of the values 1, 2, and 3 according to the rule to obtain a substitute access request.
In this embodiment, the data identifier of each copy data can be quickly determined based on the identifier generation rule, and after the cache host that caches the copy data is determined according to the data identifier, the substitute access request can be quickly generated through the identifier generation rule, so that the access efficiency of accessing the hot data is improved.
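A minimal sketch of the identifier generation rule and of hash-based host selection follows. The `key@n` format comes from the text; the function names `copy_keys` and `host_for`, the host names in the usage note, and the choice of SHA-256 with modulo placement are assumptions, since the patent only says that a hash algorithm determines the cache host.

```python
import hashlib
from typing import List


def copy_keys(key: str, copy_count: int) -> List[str]:
    """Derive the copy identifiers key@1 .. key@copy_count."""
    return [f"{key}@{n}" for n in range(1, copy_count + 1)]


def host_for(key: str, hosts: List[str]) -> str:
    """Pick the cache host for a key by hashing its identifier."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    # Assumption: simple modulo placement over a fixed host list.
    return hosts[int.from_bytes(digest[:8], "big") % len(hosts)]
```

For example, `copy_keys("test", 3)` gives `["test@1", "test@2", "test@3"]`, and each of those keys can be passed to `host_for` together with a list such as `["cache-a", "cache-b", "cache-c"]`. Plain hashing spreads keys statistically but does not by itself guarantee that a copy lands on a different host from the original, so a real deployment would likely need an additional check or a placement-aware rule.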
Fig. 6 is a schematic diagram of the implementation principle of hotspot data access provided in this embodiment. Referring to fig. 6, the Monitor is a newly added monitoring component. It collects the Keys of the access requests intercepted by the Interceptor. When it determines from the number of accesses to a Key that the corresponding data is hotspot data, it operates on the cache to create several backup Keys and sends the copy data for those backup Keys to different cache hosts. At the same time it feeds the hotspot Key back to the Interceptor.
The identifier generation rule may define several hotspot thresholds that determine the heat level of hotspot data, with a different number of copies for each heat level. A new Key is generated as the original Key plus a copy sequence number, and the rule is sent to the Interceptor in an agreed format.
For example, access statistics determine that the data whose Key is test is hotspot data and that 6 copies need to be made. According to the identifier generation rule, the data identifiers of the copies are test@1, test@2, test@3, test@4, test@5, and test@6. Because Keys are hashed, different Keys tend to be stored in different cache hosts. The Monitor then notifies all Interceptors that the identifier generation rule is test:6 (meaning that test, as hotspot data, has been copied 6 times). Each Interceptor records the rule locally. When an access request for the Key test arrives again, the Interceptor appends to test@ a random positive integer no greater than 6 to generate a substitute access request that replaces the request for test. Access requests for test are thereby hashed to different cache hosts.
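The Interceptor side of this example can be sketched as below. The rule format `test:6` and the `test@n` keys come from the text; the class and method names (`Interceptor`, `on_rule`, `rewrite`), the helper `parse_rule`, and the use of a seeded `random.Random` are hypothetical. As a simplification, every request for a hot key is rewritten.

```python
import random
from typing import Dict, Optional, Tuple


def parse_rule(rule: str) -> Tuple[str, int]:
    """Split a rule such as 'test:6' into ('test', 6)."""
    key, _, count = rule.rpartition(":")
    return key, int(count)


class Interceptor:
    """Records rules sent by the Monitor and rewrites hot keys."""

    def __init__(self, rng: Optional[random.Random] = None) -> None:
        self._rules: Dict[str, int] = {}
        self._rng = rng if rng is not None else random.Random()

    def on_rule(self, rule: str) -> None:
        # Keep the rule in local memory, as the text describes.
        key, count = parse_rule(rule)
        self._rules[key] = count

    def rewrite(self, key: str) -> str:
        count = self._rules.get(key, 0)
        if count <= 0:
            return key  # no rule recorded: leave the request unchanged
        # Append a random integer in 1..count to form the substitute key.
        return f"{key}@{self._rng.randint(1, count)}"
```

After `on_rule("test:6")`, each call to `rewrite("test")` returns one of `test@1` to `test@6`, while keys without a recorded rule pass through untouched.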
Further, the Monitor also keeps statistics on Keys such as test@1, test@2, test@3, test@4, test@5, and test@6 to judge whether they have become hotspot data. If a piece of copy data is itself determined to be hotspot data, its Key is split again into forms such as test@1@1, which replace test@1. To ensure that Keys produced by the identifier generation rule do not collide with Keys produced by the business, uncommon characters or rules may be agreed on for generating copy Keys.
Suppose the cache holds hotspot data whose Key is test. Each time an Interceptor sends a request to the cache, it reports the requested Key to the Monitor. When the Monitor judges that test is hotspot data, it copies test, determines the Key of each piece of copy data according to the identifier generation rule (for example test@1, test@2, test@3, test@4, test@5), and sends the copies to the cache hosts. Because the cache hosts store data in Key-Value form and Keys are hashed to different machines, the copied data does not suffer from the hotspot problem. At the same time, the Monitor distributes the new Keys to the Interceptors, so that the Interceptors' requests are spread across different cache machines.
It should be noted that the scheme of this embodiment copies only data that has been determined to be hotspot data, which saves cache space. The Monitor also monitors the cache: when the original Key changes, the corresponding copy Keys are modified at the same time to keep the data consistent.
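Splitting a copy Key that has itself become hot can be sketched with the same suffix scheme. The names `SEPARATOR`, `split_key`, `original_key`, and `depth` are hypothetical, and the sketch assumes that business Keys never contain the separator character, which is the kind of agreement the text mentions for avoiding collisions.

```python
from typing import List

# Assumption: "@" is reserved for copy Keys and never used by business Keys.
SEPARATOR = "@"


def split_key(key: str, copy_count: int) -> List[str]:
    """Derive the next level of copy Keys, e.g. test@1 -> test@1@1, test@1@2."""
    return [f"{key}{SEPARATOR}{n}" for n in range(1, copy_count + 1)]


def original_key(key: str) -> str:
    """Recover the business Key that a copy Key was derived from."""
    return key.split(SEPARATOR, 1)[0]


def depth(key: str) -> int:
    """Number of scatter levels applied: 0 for an original Key."""
    return key.count(SEPARATOR)
```

For example, `split_key("test@1", 2)` gives `["test@1@1", "test@1@2"]`, and `original_key("test@1@1")` returns `"test"`, which is useful when a change to the original data has to be propagated to every level of copies.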
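One simple way to keep copies consistent with the original is to update them in the same write path. The sketch below is an assumption about how this could look, not the patent's mechanism: `write_with_copies` is a hypothetical helper, a single dictionary stands in for the whole cache cluster, and `copy_counts` is assumed to hold the number of copies currently recorded for each hotspot Key.

```python
from typing import Dict


def write_with_copies(
    key: str,
    value: str,
    copy_counts: Dict[str, int],
    cache: Dict[str, str],
) -> None:
    """Write the original Key and mirror the value to its copy Keys."""
    cache[key] = value
    # Mirror the change to key@1 .. key@n so that reads served from
    # copies see the same value as reads of the original.
    for n in range(1, copy_counts.get(key, 0) + 1):
        cache[f"{key}@{n}"] = value
```

In a real cluster the writes go to different hosts and are not atomic, so readers may briefly see an old value on some copies while the update is in flight.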
Further, on the basis of the foregoing embodiments, before acquiring a hotspot access request for predetermined hotspot data, the method includes:
and for any data, determining whether the data is hot data or not according to the access times of requesting to access the data.
Specifically, whether the data is the hotspot data is determined according to the access times of accessing a certain data, and if so, the access request for accessing the hotspot data is taken as the hotspot access request.
In the embodiment, the monitoring on the hot data is realized, so that the hot data can be copied in time to obtain the copy data, and the access to the hot data is converted into the access to the copy data.
Further, on the basis of the foregoing embodiments, the determining whether any data is hot data according to the number of times of requesting access to any data includes:
and determining whether any data is hot data or not according to the access times of requesting to access any data in a set time window.
The set time window may be understood as a configured time period, for example, 3 seconds.
Fig. 7 is a schematic diagram of the principle of counting data requests through a time window according to this embodiment. The analysis uses counts kept in a linked list of sliding windows (i.e. the set time window): the sliding window maintains a number of windows, each covering one time unit. Every time a request comes in, it is judged whether the total access amount in the previous N time units reaches a threshold value, and the count of the current window is incremented by 1. Referring to Fig. 7, a sliding window covering the past 3 seconds is maintained and each window is 1 second long. Each request increments the count of the corresponding Key in the current time window, and the number of requests in the past three seconds is the sum over windows -3 to -1. It can be seen that in the past 3 seconds A was requested 13+24+43 = 80 times, B 17+15+3 = 35 times, and C 3+31 = 34 times.
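The sliding-window counting described with reference to Fig. 7 can be sketched as follows. The class name, the threshold of 50 and the assignment of the counts of C to particular seconds are assumptions for illustration; timestamps are integer seconds and are assumed to be non-decreasing.

```python
from collections import Counter, deque


class SlidingWindowCounter:
    """Counts requests per key in 1-second windows over the last N seconds."""

    def __init__(self, windows=3, threshold=50):
        self.windows = windows
        self.threshold = threshold
        self.buckets = deque()  # (second, Counter of key -> requests), oldest first

    def _evict(self, now):
        # Drop windows that are older than the last `windows` seconds.
        while self.buckets and self.buckets[0][0] < now - self.windows:
            self.buckets.popleft()

    def record(self, key, now, count=1):
        """Add `count` requests for `key` to the window of second `now`."""
        self._evict(now)
        if not self.buckets or self.buckets[-1][0] != now:
            self.buckets.append((now, Counter()))
        self.buckets[-1][1][key] += count

    def total(self, key, now):
        """Sum the requests for `key` in the `windows` seconds before `now`."""
        self._evict(now)
        return sum(c[key] for sec, c in self.buckets
                   if now - self.windows <= sec < now)

    def is_hot(self, key, now):
        return self.total(key, now) >= self.threshold


counter = SlidingWindowCounter(windows=3, threshold=50)
# Seconds 7, 8 and 9 play the role of windows -3, -2 and -1 when now = 10.
per_second = [(7, {"A": 13, "B": 17, "C": 3}),
              (8, {"A": 24, "B": 15, "C": 31}),
              (9, {"A": 43, "B": 3})]
for second, counts in per_second:
    for key, n in counts.items():
        counter.record(key, second, n)
for key in "ABC":
    print(key, counter.total(key, 10), counter.is_hot(key, 10))
```

With the assumed threshold of 50, only A (80 requests) is judged to be hotspot data, while B (35) and C (34) are not.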
In this embodiment, setting the time window makes it possible to count changes in the request amount for any data in units of the set time window. Compared with counting the request amount over a long period, this better reflects whether the data is currently hotspot data.
In addition, whether data is a hotspot can be judged simply by a set count and threshold; for greater accuracy, a trend-based judgment or similar methods can also be added.
Further, on the basis of the foregoing embodiments, the determining whether any data is hot data according to the number of times of requesting access to any data includes:
determining whether any data is hot data according to a prediction model;
wherein the predictive model is determined based on the number of accesses to request access to the any of the data.
Historical data is typically collected for the amount of requests to access any of the data, and a predictive model is then generated based on the historical data.
For example, the request amounts for any data at different times of each year over 5 years are collected, and data fitting is performed on how the access count of the data changes at different times of each year over those 5 years, yielding a prediction model that predicts the change in the access count at different times of each year. Based on the prediction model, the times of year at which the data is likely to become hotspot data can be judged, so that the data can be copied in advance and the hotspot problem prevented. Of course, the prediction model can also be obtained by training on the collected historical data with machine learning.
In this embodiment, hot spot data can be predicted in advance through the prediction model, and the predicted hot spot data is copied before the hot spot problem occurs through the method, so that the hot spot data problem is prevented.
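One possible form of the data fitting is sketched below: for each period of the year, a least-squares line is fitted to the access counts of the past years and extrapolated to the next year. The choice of a linear fit, the period labels, the sample counts and the threshold are assumptions for illustration; this embodiment does not prescribe a particular model.

```python
def fit_line(ys):
    """Least-squares line y = a + b*x over x = 0 .. n-1; returns (a, b)."""
    n = len(ys)
    x_mean = (n - 1) / 2
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in range(n))
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(ys))
    b = sxy / sxx if sxx else 0.0
    return y_mean - b * x_mean, b


def predict_next_year(history):
    """history maps a period label to its access counts in consecutive years."""
    predictions = {}
    for period, counts in history.items():
        a, b = fit_line(counts)
        predictions[period] = a + b * len(counts)
    return predictions


def periods_to_copy_in_advance(history, threshold):
    """Periods in which the data is predicted to reach the hotspot threshold."""
    return [p for p, v in predict_next_year(history).items() if v >= threshold]


# Hypothetical access counts of one data item over 5 years.
history = {"new_year": [100, 200, 300, 400, 500],
           "march": [10, 10, 10, 10, 10]}
print(predict_next_year(history))
print(periods_to_copy_in_advance(history, threshold=550))
```

The data would then be copied shortly before each predicted period rather than after the hotspot has already formed.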
Fig. 8 is a block diagram of a hotspot data access device provided in this embodiment. Referring to Fig. 8, the hotspot data access device includes an obtaining module 801, a determining module 802, and a response module 803, wherein,
an obtaining module 801, configured to obtain a hotspot access request for predetermined hotspot data;
a determining module 802, configured to determine a substitute access request corresponding to the hotspot access request; wherein the substitute access request is used for accessing copy data of the hotspot data; the copy data and the hotspot data are stored in different cache hosts;
a response module 803, configured to use the copy data obtained according to the substitute access request as data responding to the hotspot access request.
The hot spot data access device provided in this embodiment is suitable for the hot spot data access method provided in each of the above embodiments, and details are not described here.
The hotspot data access device provided by this embodiment acquires a hotspot access request for predetermined hotspot data, and determines a substitute access request corresponding to the hotspot access request. The substitute access request is used to access replica data of the hotspot data. The mode of responding the hotspot access request by accessing the copy data of the hotspot data is provided through the substitute access request, and the copy data of the hotspot data and the hotspot data are stored in different cache hosts, so that the request amount of the predetermined hotspot data can be reduced through the substitute access request, the pressure of the cache host where the hotspot data is located is relieved, and the probability of service abnormity caused by the hotspot data is reduced.
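The cooperation of the three modules can be sketched as one class with a method per module. The class and method names are hypothetical, and a dictionary stands in for the cache hosts.

```python
import random


class HotspotAccessDevice:
    """Sketch of the obtaining, determining and response modules."""

    def __init__(self, hot_rules, cache, rng=None):
        self.hot_rules = hot_rules  # hot key -> number of copies
        self.cache = cache          # key -> value, stand-in for the cache hosts
        self.rng = rng or random.Random()

    def obtain(self, key):
        """Obtaining module: return the key if the request targets hotspot data."""
        return key if key in self.hot_rules else None

    def determine(self, hot_key):
        """Determining module: build the substitute request (a copy key)."""
        return f"{hot_key}@{self.rng.randint(1, self.hot_rules[hot_key])}"

    def respond(self, key):
        """Response module: answer a hotspot request with the copy data."""
        hot_key = self.obtain(key)
        lookup = self.determine(hot_key) if hot_key is not None else key
        return self.cache.get(lookup)


cache = {"test": "v", "test@1": "v", "test@2": "v", "plain": "p"}
device = HotspotAccessDevice({"test": 2}, cache)
print(device.respond("test"), device.respond("plain"))
```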
According to the hotspot data access device provided by the present invention, after the substitute access request corresponding to the hotspot access request is determined, the following is further included:
and determining whether the copy data is hotspot data according to the access times of the copy data corresponding to the substitute access request, and if so, performing a dispersion operation on the copy data.
According to the hotspot data access device provided by the invention, on the basis, the performing of the dispersion operation on the copy data comprises:
determining a heat level corresponding to the copy data, wherein the heat level is determined according to the access times of the copy data;
and determining the copy number for copying the copy data when the dispersion operation is performed according to the heat level.
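The mapping from access times to a heat level, and from the heat level to the copy number, can be sketched as a lookup table. All thresholds, levels and copy numbers below are hypothetical values chosen for illustration.

```python
# (minimum access times, heat level, copy number), highest level first.
# All values are hypothetical.
HEAT_LEVELS = [(1000, 3, 8), (500, 2, 4), (100, 1, 2)]


def heat_level(access_count):
    """Return the heat level for an access count; 0 means not hot."""
    for min_count, level, _ in HEAT_LEVELS:
        if access_count >= min_count:
            return level
    return 0


def copies_for_level(level):
    """Return the copy number used by the dispersion operation for a level."""
    for _, lvl, copies in HEAT_LEVELS:
        if lvl == level:
            return copies
    return 0


for count in (50, 120, 700, 1500):
    level = heat_level(count)
    print(count, level, copies_for_level(level))
```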
According to the hotspot data access device provided by the present invention, on the basis of the above, the determining of the substitute access request corresponding to the hotspot access request includes:
acquiring a data identifier of the hot spot data, and determining a copy data identifier corresponding to each copy data of the hot spot data through a preset identifier generation rule based on the data identifier;
generating an access request for accessing the copy data according to the copy data identifier, and taking the access request as the substitute access request;
and when the copy data is stored, determining a cache host for caching the copy data according to the copy data identifier.
According to the hotspot data access device provided by the present invention, on the basis of the above, before acquiring a hotspot access request for predetermined hotspot data, the following is included:
and for any data, determining whether the data is hot data or not according to the access times of requesting to access the data.
According to the hotspot data access device provided by the present invention, on the basis of the above, the determining whether any data is hotspot data according to the access times of requesting to access the data includes:
and determining whether any data is hot data or not according to the access times of requesting to access any data in a set time window.
According to the hotspot data access device provided by the present invention, on the basis of the above, the determining whether any data is hotspot data according to the access times of requesting to access the data includes:
determining whether any data is hot data according to a prediction model;
wherein the predictive model is determined based on the number of accesses to request access to the any of the data.
Fig. 9 illustrates a physical structure diagram of an electronic device, and as shown in fig. 9, the electronic device may include: a processor (processor)910, a communication Interface (Communications Interface)920, a memory (memory)930, and a communication bus 940, wherein the processor 910, the communication Interface 920, and the memory 930 communicate with each other via the communication bus 940. Processor 910 may invoke logic instructions in memory 930 to perform a hotspot data access method comprising:
acquiring a hotspot access request aiming at predetermined hotspot data;
determining a substitute access request corresponding to the hotspot access request; wherein the substitute access request is used for accessing copy data of the hotspot data; the copy data and the hotspot data are stored in different cache hosts;
and taking the copy data obtained according to the substitute access request as data responding to the hotspot access request.
Furthermore, the logic instructions in the memory 930 may be implemented in the form of software functional units and, when sold or used as independent products, stored in a computer readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program codes.
In another aspect, the present invention also provides a computer program product comprising a computer program stored on a non-transitory computer readable storage medium, the computer program comprising program instructions which, when executed by a computer, enable the computer to perform a hotspot data access method, the method comprising:
acquiring a hotspot access request aiming at predetermined hotspot data;
determining a substitute access request corresponding to the hotspot access request; wherein the substitute access request is used for accessing copy data of the hotspot data; the copy data and the hotspot data are stored in different cache hosts;
and taking the copy data obtained according to the substitute access request as data responding to the hotspot access request.
In yet another aspect, the present invention also provides a non-transitory computer readable storage medium having stored thereon a computer program which, when executed by a processor, performs a hotspot data access method, the method comprising:
acquiring a hotspot access request aiming at predetermined hotspot data;
determining a substitute access request corresponding to the hotspot access request; wherein the substitute access request is used for accessing copy data of the hotspot data; the copy data and the hotspot data are stored in different cache hosts;
and taking the copy data obtained according to the substitute access request as data responding to the hotspot access request.
The above-described embodiments of the apparatus are merely illustrative, and the units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.
Through the above description of the embodiments, those skilled in the art will clearly understand that each embodiment can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware. With this understanding in mind, the above-described technical solutions may be embodied in the form of a software product, which can be stored in a computer-readable storage medium such as ROM/RAM, magnetic disk, optical disk, etc., and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the methods described in the embodiments or some parts of the embodiments.
Finally, it should be noted that: the above examples are only intended to illustrate the technical solution of the present invention, but not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims (10)

1. A hotspot data access method is characterized by comprising the following steps:
acquiring a hotspot access request aiming at predetermined hotspot data;
determining a substitute access request corresponding to the hotspot access request; wherein the substitute access request is used for accessing copy data of the hotspot data; the copy data and the hotspot data are stored in different cache hosts;
and taking the copy data obtained according to the substitute access request as data responding to the hotspot access request.
2. The hotspot data access method of claim 1, further comprising, after determining the substitute access request corresponding to the hotspot access request:
and determining whether the copy data is hotspot data according to the access times of the copy data corresponding to the substitute access request, and if so, performing a dispersion operation on the copy data.
3. The hotspot data access method of claim 2, wherein the performing a dispersion operation on the copy data comprises:
determining a heat level corresponding to the copy data, wherein the heat level is determined according to the access times of the copy data;
and determining the copy number for copying the copy data when the dispersion operation is performed according to the heat level.
4. The hotspot data access method of claim 1, wherein the determining a substitute access request corresponding to the hotspot access request comprises:
acquiring a data identifier of the hot spot data, and determining a copy data identifier corresponding to each copy data of the hot spot data through a preset identifier generation rule based on the data identifier;
generating an access request for accessing the copy data according to the copy data identifier, and taking the access request as the substitute access request;
and when the copy data is stored, determining a cache host for caching the copy data according to the copy data identifier.
5. The hotspot data access method of claim 1, wherein before obtaining the hotspot access request for the predetermined hotspot data, the method comprises:
and for any data, determining whether the data is hot data or not according to the access times of requesting to access the data.
6. The method according to claim 5, wherein the determining whether the any data is the hotspot data according to the number of access times of the request to access the any data comprises:
and determining whether any data is hot data or not according to the access times of requesting to access any data in a set time window.
7. The method according to claim 5, wherein the determining whether the any data is the hotspot data according to the number of access times of the request to access the any data comprises:
determining whether any data is hot data according to a prediction model;
wherein the predictive model is determined based on the number of accesses to request access to the any of the data.
8. A hotspot data access device, comprising:
an obtaining module, configured to obtain a hotspot access request for predetermined hotspot data;
a determining module, configured to determine a substitute access request corresponding to the hotspot access request; wherein the substitute access request is used for accessing copy data of the hotspot data; the copy data and the hotspot data are stored in different cache hosts;
and a response module, configured to use the copy data obtained according to the substitute access request as data responding to the hotspot access request.
9. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor implements the steps of the hotspot data access method of any one of claims 1 to 7 when executing the program.
10. A non-transitory readable storage medium having stored thereon a computer program, wherein the computer program when executed by a processor implements the steps of the hotspot data access method of any one of claims 1 to 7.
CN202110530391.8A 2021-05-14 2021-05-14 Hotspot data access method and device, electronic equipment and storage medium Pending CN113157215A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110530391.8A CN113157215A (en) 2021-05-14 2021-05-14 Hotspot data access method and device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110530391.8A CN113157215A (en) 2021-05-14 2021-05-14 Hotspot data access method and device, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN113157215A true CN113157215A (en) 2021-07-23

Family

ID=76876080

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110530391.8A Pending CN113157215A (en) 2021-05-14 2021-05-14 Hotspot data access method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN113157215A (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20020103975A1 (en) * 2001-01-26 2002-08-01 Dawkins William Price System and method for time weighted access frequency based caching for memory controllers
CN103294167A (en) * 2013-05-21 2013-09-11 暨南大学 Data behavior based low-energy consumption cluster storage replication device and method
CN110149394A (en) * 2019-05-20 2019-08-20 典基网络科技(上海)有限公司 Dispatching method, device and the storage medium of system resource

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115033187A (en) * 2022-08-10 2022-09-09 蓝深远望科技股份有限公司 Big data based analysis management method
CN115033187B (en) * 2022-08-10 2022-11-08 蓝深远望科技股份有限公司 Big data based analysis management method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination